feat: cli arg to specify max parquet fanout #25714
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
use std::collections::HashMap; | ||
|
||
use datafusion::config::ConfigExtension; | ||
use iox_query::config::IoxConfigExt; | ||
|
||
/// Extends the standard [`HashMap`] based DataFusion config option in the CLI with specific | ||
/// options (along with defaults) for InfluxDB 3 OSS/Pro. This is intended for customization of | ||
/// options that are defined in the `iox_query` crate, e.g., those defined in [`IoxConfigExt`] | ||
/// that are relevant to the monolithic versions of InfluxDB 3. | ||
#[derive(Debug, clap::Parser, Clone)] | ||
pub struct IoxQueryDatafusionConfig { | ||
/// When multiple parquet files are required in a sorted way (e.g. for de-duplication), we have | ||
/// two options: | ||
/// | ||
/// 1. **In-mem sorting:** Put them into `datafusion.target_partitions` DataFusion partitions. | ||
/// This limits the fan-out, but requires that we potentially chain multiple parquet files into | ||
/// a single DataFusion partition. Since chaining sorted data does NOT automatically result in | ||
/// sorted data (e.g. AB-AB is not sorted), we need to perform an in-memory sort using | ||
/// `SortExec` afterwards. This is expensive. | ||
/// 2. **Fan-out:** Instead of chaining files within DataFusion partitions, we can accept a | ||
/// fan-out beyond `target_partitions`. This prevents in-memory sorting but may result in OOMs | ||
/// (out-of-memory) if the fan-out is too large. | ||
/// | ||
/// We try to pick option 2 up to a certain number of files, which is configured by this | ||
/// setting. | ||
#[clap( | ||
long = "datafusion-max-parquet-fanout", | ||
env = "INFLUXDB3_DATAFUSION_MAX_PARQUET_FANOUT", | ||
default_value = "1000", | ||
action | ||
)] | ||
pub max_parquet_fanout: usize, | ||
|
||
/// Provide custom configuration to DataFusion as a comma-separated list of key:value pairs. | ||
/// | ||
/// # Example | ||
/// ```text | ||
/// --datafusion-config "datafusion.key1:value1, datafusion.key2:value2" | ||
/// ``` | ||
#[clap( | ||
long = "datafusion-config", | ||
env = "INFLUXDB3_DATAFUSION_CONFIG", | ||
default_value = "", | ||
value_parser = parse_datafusion_config, | ||
action | ||
)] | ||
pub datafusion_config: HashMap<String, String>, | ||
} | ||
|
||
impl IoxQueryDatafusionConfig { | ||
/// Build a [`HashMap`] to be used as the DataFusion config for the query executor | ||
/// | ||
/// This takes the provided `--datafusion-config` and extends it with options available on this | ||
/// [`IoxQueryDatafusionConfig`] struct. Note, any IOx extension parameters that are defined | ||
/// in the `datafusion_config` will be overridden by the provided values or their default. For | ||
/// example, if the user provides: | ||
/// ```text | ||
/// --datafusion-config "iox.max_parquet_fanout:50" | ||
/// ``` | ||
/// This will be overridden with the default value for `max_parquet_fanout` of `1000`, or | ||
/// with the value provided for the `--datafusion-max-parquet-fanout` argument. | ||
pub fn build(mut self) -> HashMap<String, String> { | ||
self.datafusion_config.insert( | ||
format!("{prefix}.max_parquet_fanout", prefix = IoxConfigExt::PREFIX), | ||
self.max_parquet_fanout.to_string(), | ||
); | ||
self.datafusion_config | ||
} | ||
} | ||
|
||
fn parse_datafusion_config( | ||
s: &str, | ||
) -> Result<HashMap<String, String>, Box<dyn std::error::Error + Send + Sync + 'static>> { | ||
let s = s.trim(); | ||
if s.is_empty() { | ||
return Ok(HashMap::with_capacity(0)); | ||
} | ||
|
||
let mut out = HashMap::new(); | ||
for part in s.split(',') { | ||
let kv = part.trim().splitn(2, ':').collect::<Vec<_>>(); | ||
match kv.as_slice() { | ||
[key, value] => { | ||
let key_owned = key.trim().to_owned(); | ||
let value_owned = value.trim().to_owned(); | ||
let existed = out.insert(key_owned, value_owned).is_some(); | ||
if existed { | ||
return Err(format!("key '{key}' passed multiple times").into()); | ||
} | ||
} | ||
_ => { | ||
return Err( | ||
format!("Invalid key value pair - expected 'KEY:VALUE' got '{s}'").into(), | ||
); | ||
} | ||
} | ||
} | ||
|
||
Ok(out) | ||
} | ||
|
||
#[cfg(test)] | ||
mod tests { | ||
use clap::Parser; | ||
use iox_query::{config::IoxConfigExt, exec::Executor}; | ||
|
||
use super::IoxQueryDatafusionConfig; | ||
|
||
#[test_log::test] | ||
fn max_parquet_fanout() { | ||
let datafusion_config = | ||
IoxQueryDatafusionConfig::parse_from(["", "--datafusion-max-parquet-fanout", "5"]) | ||
.build(); | ||
let exec = Executor::new_testing(); | ||
let mut session_config = exec.new_session_config(); | ||
for (k, v) in &datafusion_config { | ||
session_config = session_config.with_config_option(k, v); | ||
} | ||
let ctx = session_config.build(); | ||
let inner_ctx = ctx.inner().state(); | ||
let config = inner_ctx.config(); | ||
let iox_config_ext = config.options().extensions.get::<IoxConfigExt>().unwrap(); | ||
assert_eq!(5, iox_config_ext.max_parquet_fanout); | ||
} | ||
} |
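The parsing and override behaviour in this file can be illustrated with a std-only sketch. It is not the code from the PR: `parse_pairs`, `build_config`, and `IOX_PREFIX` are hypothetical names, and the value `"iox"` for the prefix is an assumption inferred from the `iox.max_parquet_fanout` example in the doc comment. The sketch shows that the dedicated flag is applied last, so it replaces any `iox.max_parquet_fanout` entry passed through `--datafusion-config`.

```rust
use std::collections::HashMap;

/// Assumed value of `IoxConfigExt::PREFIX` (hypothetical constant for this sketch).
const IOX_PREFIX: &str = "iox";

/// Parse a comma-separated list of `KEY:VALUE` pairs into a map.
/// Simplified mirror of the parser in the diff, using `String` as the error type.
fn parse_pairs(s: &str) -> Result<HashMap<String, String>, String> {
    let mut out = HashMap::new();
    let s = s.trim();
    if s.is_empty() {
        // An empty argument yields an empty config map.
        return Ok(out);
    }
    for part in s.split(',') {
        // Split on the first ':' only, so a value may itself contain ':'.
        match part.trim().split_once(':') {
            Some((key, value)) => {
                let key = key.trim();
                let previous = out.insert(key.to_owned(), value.trim().to_owned());
                if previous.is_some() {
                    return Err(format!("key '{}' passed multiple times", key));
                }
            }
            None => return Err(format!("expected 'KEY:VALUE', got '{}'", part)),
        }
    }
    Ok(out)
}

/// Insert the dedicated flag's value last, so it always wins over a
/// user-supplied entry with the same key.
fn build_config(
    mut config: HashMap<String, String>,
    max_parquet_fanout: usize,
) -> HashMap<String, String> {
    config.insert(
        format!("{}.max_parquet_fanout", IOX_PREFIX),
        max_parquet_fanout.to_string(),
    );
    config
}
```

Under these assumptions, `build_config(parse_pairs("iox.max_parquet_fanout:50")?, 1000)` ends up with `iox.max_parquet_fanout` set to `1000`, which matches the override described in the doc comment on `build`.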
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
//! Configuration options for the `influxdb3` CLI which uses the `clap` crate | ||
|
||
pub mod datafusion; | ||
pub mod tokio; |
Just checking whether 1000 is a good default value. I understand this depends on the size of the files, but given that it can result in OOMs, I wanted to double-check that 1000 is still good.
Good to call this out. I copied the comment from IOx/core to preserve the context it provided. I think we may need to tune this a bit, or it could be possible to base the default on the system memory, and how we allocate memory in different modes in pro.
As it stands, with the low default of 40, we are getting OOMs with the fallback, i.e., non-fanout, query plan, so we should know soon if increasing this much makes the problem worse or not. Based on https://github.com/influxdata/influxdb_pro/issues/308#issuecomment-2562955195, this default may be a bit low/out-dated (perhaps the way the DataFusion plan handles fanout is different than when the default was decided). There are some distributed clusters in IOx setting this to 800 as per https://github.com/influxdata/influxdb_pro/issues/308#issuecomment-2563245404.
We'll see how this goes - at the minimum, I got the env vars switched from `INFLUXDB_IOX_` to `INFLUXDB3_` 😄
I might have misunderstood the docs for this setting. I interpreted it as: the higher this number, the more files it tries to fan out, which leads to OOMs. If we don't fan out, then it does expensive in-memory sorting instead (presumably without running into OOMs?).
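For reference, the trade-off described in the doc comment on `max_parquet_fanout` can be modelled roughly as below. This is a simplified sketch, not the planner logic in `iox_query`: `ScanStrategy`, `choose_strategy`, and `is_sorted` are hypothetical names, and treating the threshold as inclusive is an assumption. The `is_sorted` helper illustrates the "AB-AB" point, namely that chaining individually sorted files does not produce sorted output.

```rust
/// Which strategy a planner might pick for a set of sorted parquet files
/// (hypothetical model for illustration only).
#[derive(Debug, PartialEq, Eq)]
enum ScanStrategy {
    /// One DataFusion partition per file; sort order is preserved, no re-sort.
    FanOut { partitions: usize },
    /// Files chained into `target_partitions` partitions, followed by an
    /// in-memory re-sort (the `SortExec` case from the doc comment).
    ChainAndSort { partitions: usize },
}

/// Pick fan-out while the file count stays within `max_parquet_fanout`.
/// Assumption: the threshold is inclusive.
fn choose_strategy(
    num_files: usize,
    target_partitions: usize,
    max_parquet_fanout: usize,
) -> ScanStrategy {
    if num_files <= max_parquet_fanout {
        ScanStrategy::FanOut { partitions: num_files }
    } else {
        ScanStrategy::ChainAndSort { partitions: target_partitions }
    }
}

/// True if the values are in non-decreasing order.
fn is_sorted(values: &[char]) -> bool {
    values.windows(2).all(|pair| pair[0] <= pair[1])
}
```

In this model, raising the setting means more queries take the fan-out path and fewer fall back to chaining plus a re-sort. For example, `is_sorted(&['A', 'B'])` holds, while the chained sequence `['A', 'B', 'A', 'B']` does not pass the check.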
Unfortunately, though, it is OOM'ing without the fanout, while not OOM'ing with the fanout, so we may need to update this doc comment (see https://github.com/influxdata/influxdb_pro/issues/205#issuecomment-2565377397)
I think the memory sort is going to OOM, unless you set a memory limit on DF, but in that case it just means that the query will get killed and return a resource exhaustion error. The only way around that I can think of is if spill to disk is enabled, but that's not really much better either.
I think the fanout setting should effectively be ignored (i.e. set to whatever the max of the type is). Resorting the data is always going to be more expensive and completely unnecessary in our case.
If DF allocates an Arrow buffer for each input file, then you'd have that buffer size * number of files. Each buffer could be quite large for very wide tables. I think one way to counter this would be to make sure that the pre-allocated buffer is limited in size or scaled down depending on the number of input files.
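The "size * number of files" estimate and the scale-down idea can be put into numbers with a small sketch. Nothing here reflects how DataFusion actually allocates buffers: `fanout_buffer_bytes` and `scaled_buffer_bytes` are hypothetical helpers, and the buffer sizes and budget in the example are made-up values chosen only to show the arithmetic.

```rust
/// Rough estimate of pre-allocated buffer memory under fan-out, assuming one
/// buffer of `buffer_bytes_per_file` per input file. Returns `None` on overflow.
fn fanout_buffer_bytes(buffer_bytes_per_file: u64, num_files: u64) -> Option<u64> {
    buffer_bytes_per_file.checked_mul(num_files)
}

/// One possible mitigation: shrink the per-file buffer so that the total stays
/// within `total_budget_bytes`, never exceeding `max_buffer_bytes` per file.
fn scaled_buffer_bytes(total_budget_bytes: u64, num_files: u64, max_buffer_bytes: u64) -> u64 {
    if num_files == 0 {
        // No files to scan, so there is nothing to divide the budget between.
        return max_buffer_bytes;
    }
    (total_budget_bytes / num_files).min(max_buffer_bytes)
}
```

With a hypothetical 8 MiB buffer per file and 1000 files, `fanout_buffer_bytes(8 * 1024 * 1024, 1000)` gives `Some(8_388_608_000)`, roughly 7.8 GiB. Capping the total at a hypothetical 1 GiB budget, `scaled_buffer_bytes(1 << 30, 1000, 8 * 1024 * 1024)` gives about 1 MiB per file (`1_073_741` bytes).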
So, there are two things here that I can create issues for: