Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regresssion with (&'static str, &'static str) label set. #248

Open
MrCroxx opened this issue Nov 26, 2024 · 6 comments
Open

Comments

@MrCroxx
Copy link

MrCroxx commented Nov 26, 2024

Hi, thanks for building the Prometheus client in Rust. Really love it.

When doing the benchmark, I added another test case with (&'static str, &'static str) label set as follows:

    c.bench_function(
        "counter family with (&'static str, &'static str) label set",
        |b| {
            let family = Family::<(&'static str, &'static str), Counter>::default();

            b.iter(|| {
                family.get_or_create(&("method", "GET")).inc();
            })
        },
    );

Considering there should be no dynamic allocation when accessing the metric with (&'static str, &'static str) label set, I suppose the performance would be the same as the custom type label set.

But the benchmark turned out oppositely. (27ns vs 15ns on avg)

     Running benches/family.rs (target/release/deps/family-f5815167dcf70f4a)
Gnuplot not found, using plotters backend
counter family with [(&'static str, &'static str)] label set
                        time:   [41.655 ns 42.005 ns 42.483 ns]
                        change: [-1.0231% +1.7906% +7.1726%] (p = 0.51 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe

counter family with Vec<(&'static str, &'static str)> label set
                        time:   [62.579 ns 63.896 ns 65.827 ns]
                        change: [-1.8609% +0.9648% +4.0991%] (p = 0.58 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe

counter family with (&'static str, &'static str) label set
                        time:   [21.716 ns 21.771 ns 21.833 ns]
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

counter family with Vec<(String, String)> label set
                        time:   [204.69 ns 208.56 ns 214.56 ns]
                        change: [-5.1373% -3.8646% -2.4135%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

counter family with custom type label set
                        time:   [14.648 ns 15.786 ns 17.877 ns]
                        change: [-8.7872% -2.4706% +6.7788%] (p = 0.61 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

I'm not sure where the regression comes from. Is it as expected or did I miss something?

Thanks again for your help. 🙏

@MrCroxx
Copy link
Author

MrCroxx commented Nov 26, 2024

Does the overhead come from constructing the tuple? (e.g. &("method", "GET"))

IIUC, "method" and "GET" are &'static str, which does not need allocation, but the tuple is not?

@mxinden
Copy link
Member

mxinden commented Nov 28, 2024

Appreciate your curiosity!

On a high level, I don't think it matters for a user whether an operation takes 40 or 20 nano seconds here.

That said, indeed, this is counterintuitive. But I am not able to reproduce it:

     Running benches/family.rs (/home/mxinden/code/github.com/prometheus/client_rust/target/release/deps/family-95f4365191b2b975)
Gnuplot not found, using plotters backend
counter family with [(&'static str, &'static str)] label set
                        time:   [57.968 ns 57.987 ns 58.008 ns]
Found 19 outliers among 100 measurements (19.00%)
  3 (3.00%) low severe
  3 (3.00%) low mild
  5 (5.00%) high mild
  8 (8.00%) high severe

counter family with (&'static str, &'static str) label set
                        time:   [34.901 ns 34.903 ns 34.905 ns]
Found 15 outliers among 100 measurements (15.00%)
  13 (13.00%) high mild
  2 (2.00%) high severe

Are you able to reproduce (&'static str, &'static str) being slower on consecutive runs?

@MrCroxx
Copy link
Author

MrCroxx commented Dec 3, 2024

Are you able to reproduce (&'static str, &'static str) being slower on consecutive runs?

Yep. Unfortunately, it is a stable regression with my machine. I'm running the benchmark with cargo bench on my MacBook Air with M2 chip. Let me try it on my linux server to see if the regression still exists.

@MrCroxx
Copy link
Author

MrCroxx commented Dec 3, 2024

On a high level, I don't think it matters for a user whether an operation takes 40 or 20 nano seconds here.

I agree that 20+ ns vs 15+ ns looks close digitally, but the overhead is 33% more in ratio. 🥹

@MrCroxx
Copy link
Author

MrCroxx commented Dec 3, 2024

BTW, can I open a PR to add the test case?

@mxinden
Copy link
Member

mxinden commented Dec 3, 2024

Maybe use https://play.rust-lang.org/ to take a look at the ASM and LLVM IR code. The reason might be obvious from there?

BTW, can I open a PR to add the test case?

I don't think the test itself, without an explanation for the regression would be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants