
feat: cli arg to specify max parquet fanout #25714

Merged
merged 1 commit into main from hiltontj/parquet-fan-out on Dec 27, 2024

Conversation

hiltontj
Contributor

This is needed for https://github.com/influxdata/influxdb_pro/issues/308

This allows the `max_parquet_fanout` to be specified in the CLI for the `influxdb3 serve` command. Previously, this could be done via the `--datafusion-config` CLI argument, but that had two drawbacks:

  1. it is a fairly advanced option, given that the available key/value pairs are not well documented
  2. if `iox.max_parquet_fanout` was not provided to that argument, the default would be set to 40

This PR maintains the existing `--datafusion-config` CLI argument (with one caveat, see below), which allows users to provide a set of key/value pairs that will be used to build the internal DataFusion config, but in addition provides the `--datafusion-max-parquet-fanout` argument:

    --datafusion-max-parquet-fanout <MAX_PARQUET_FANOUT>
          When multiple parquet files are required in a sorted way (e.g. for de-duplication), we have two options:

          1. **In-mem sorting:** Put them into `datafusion.target_partitions` DataFusion partitions. This limits the fan-out, but requires that we potentially chain multiple parquet files into a single DataFusion partition. Since chaining sorted data does NOT automatically result in sorted data (e.g. AB-AB is not sorted), we need to perform an in-memory sort using `SortExec` afterwards. This is expensive.
          2. **Fan-out:** Instead of chaining files within DataFusion partitions, we can accept a fan-out beyond `target_partitions`. This prevents in-memory sorting but may result in OOMs (out-of-memory) if the fan-out is too large.

          We try to pick option 2 up to a certain number of files, which is configured by this setting.

          [env: INFLUXDB3_DATAFUSION_MAX_PARQUET_FANOUT=]
          [default: 1000]

with a default value of 1000, which overrides the core `iox_query` default of 40.
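For example, either of the following would set the fanout at startup (the flag and environment variable are taken from the help text above; other `serve` arguments are elided):

    influxdb3 serve --datafusion-max-parquet-fanout 800 ...
    INFLUXDB3_DATAFUSION_MAX_PARQUET_FANOUT=800 influxdb3 serve ...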

A test was added to check that this is propagated down to the `IOxSessionContext` that is used during queries.

The only change to the `--datafusion-config` CLI argument was to rename the `INFLUXDB_IOX` prefix in the environment variable to `INFLUXDB3`:

    --datafusion-config <DATAFUSION_CONFIG>
          Provide custom configuration to DataFusion as a comma-separated list of key:value pairs.

          # Example
          ```text
          --datafusion-config "datafusion.key1:value1, datafusion.key2:value2"
          ```

          [env: INFLUXDB3_DATAFUSION_CONFIG=]
          [default: ]
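For comparison, the previous route for setting the fanout through this argument would have looked like the following (using the `iox.max_parquet_fanout` key mentioned above; other arguments elided):

    influxdb3 serve --datafusion-config "iox.max_parquet_fanout:1000" ...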

@hiltontj hiltontj added the v3 label Dec 27, 2024
@hiltontj hiltontj self-assigned this Dec 27, 2024
@hiltontj hiltontj merged commit 03ea565 into main Dec 27, 2024
13 checks passed
@hiltontj hiltontj deleted the hiltontj/parquet-fan-out branch December 27, 2024 17:42
    #[clap(
        long = "datafusion-max-parquet-fanout",
        env = "INFLUXDB3_DATAFUSION_MAX_PARQUET_FANOUT",
        default_value = "1000",
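For context, a minimal self-contained sketch of how an argument declared with that attribute might look, assuming clap with the `derive` and `env` features; the surrounding struct, field name, and doc comment are assumptions for illustration, since the excerpt above is truncated and only the attribute contents come from the diff:

```rust
use clap::Parser;

/// Hypothetical config struct; only the clap attribute contents below
/// are taken from the PR diff.
#[derive(Debug, Parser)]
struct ServeConfig {
    /// Maximum number of parquet files allowed in a query's fan-out
    #[clap(
        long = "datafusion-max-parquet-fanout",
        env = "INFLUXDB3_DATAFUSION_MAX_PARQUET_FANOUT",
        default_value = "1000"
    )]
    datafusion_max_parquet_fanout: usize,
}

fn main() {
    let config = ServeConfig::parse();
    println!("max parquet fanout: {}", config.datafusion_max_parquet_fanout);
}
```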
Contributor

Just checking whether 1000 is a good default value. I understand this depends on the size of the files, but given that it can result in OOM, I wanted to double-check that 1000 is still good.

Contributor Author

Good to call this out. I copied the comment from IOx/core to preserve the context it provided. I think we may need to tune this a bit, or it could be possible to base the default on the system memory and on how we allocate memory in different modes in pro.

As it stands, with the low default of 40, we are getting OOMs with the fallback, i.e., non-fanout, query plan, so we should know soon whether increasing this much makes the problem worse or not. Based on https://github.com/influxdata/influxdb_pro/issues/308#issuecomment-2562955195, this default may be a bit low/outdated (perhaps the way the DataFusion plan handles fanout differs from when the default was decided). Some distributed clusters in IOx set this to 800, as per https://github.com/influxdata/influxdb_pro/issues/308#issuecomment-2563245404.

We'll see how this goes - at the minimum, I got the env vars switched from `INFLUXDB_IOX_` to `INFLUXDB3_` 😄

Contributor

I might have misunderstood the docs for this setting. I interpreted it as: the higher this number, the more files it tries to fan out, which leads to OOMs. If we don't fan out, it instead does expensive in-memory sorting (guessing without running into OOMs?).

Contributor Author
@hiltontj hiltontj Dec 30, 2024

> (guessing without running into OOMs?)

Unfortunately, though, it is OOM'ing without the fanout, while not OOM'ing with the fanout, so we may need to update this doc comment (see https://github.com/influxdata/influxdb_pro/issues/205#issuecomment-2565377397).

Member

I think the memory sort is going to OOM, unless you set a memory limit on DF, but in that case it just means that the query will get killed and return a resource exhaustion error. The only way around that I can think of is if spill to disk is enabled, but that's not really much better either.
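As a hedged illustration of setting such a limit with the upstream datafusion crate (API names vary somewhat across DataFusion versions; this is generic DataFusion usage, not influxdb3 code):

```rust
use std::sync::Arc;
use datafusion::execution::runtime_env::{RuntimeConfig, RuntimeEnv};
use datafusion::prelude::{SessionConfig, SessionContext};

fn main() -> datafusion::error::Result<()> {
    // Cap DataFusion's memory pool at 1 GiB (all of it usable by queries);
    // queries exceeding the pool fail with a resources-exhausted error
    // instead of taking down the whole process.
    let runtime = RuntimeEnv::new(RuntimeConfig::new().with_memory_limit(1 << 30, 1.0))?;
    let ctx = SessionContext::new_with_config_rt(SessionConfig::new(), Arc::new(runtime));
    let _ = ctx; // the context would then be used to plan and run queries
    Ok(())
}
```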

I think the fanout setting should effectively be ignored (i.e. set to whatever the max of the type is). Resorting the data is always going to be more expensive and completely unnecessary in our case.

If DF allocates an arrow buffer for each input file, then you'd have that buffer size times the number of files. The Arrow buffer could be quite large if there are very wide tables. I think one way to counter this would be to make sure that the pre-allocated buffer is limited in size, or scaled down depending on the number of input files, as in the sketch below.
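To make that concrete, here is a back-of-the-envelope sketch of the scaling idea (all names and numbers are hypothetical illustrations, not influxdb3 or DataFusion code):

```rust
/// Hypothetical helper: divide a fixed total read-buffer budget across the
/// input files, capped at a per-file maximum, so the per-file allocation
/// shrinks as the fan-out grows instead of multiplying with it.
fn per_file_buffer_size(total_budget: usize, num_files: usize, max_per_file: usize) -> usize {
    if num_files == 0 {
        return max_per_file; // nothing to scale; fall back to the cap
    }
    (total_budget / num_files).min(max_per_file)
}

fn main() {
    // With a 1 GiB budget, a fan-out of 1000 files yields ~1 MiB per file,
    // while 40 files would each get the full 8 MiB cap.
    assert_eq!(per_file_buffer_size(1 << 30, 1000, 8 << 20), (1 << 30) / 1000);
    assert_eq!(per_file_buffer_size(1 << 30, 40, 8 << 20), 8 << 20);
}
```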
