[TRY] perf: spawn (parallelize) io tasks related to tarball #190

kdy1 · 2023-11-11T21:19:15Z

This PR uses the threadpool of rayon to perform actual blocking IO operations. It's done by using rayon::scope API.

Blocked by:

kdy1 · 2023-11-11T21:19:59Z

Oops my commits are gone. I'll use git reflog to find it..

KSXGitHub

Thank you for your works.

I have a few change requests.

crates/tarball/Cargo.toml

KSXGitHub · 2023-11-21T16:30:58Z

crates/tarball/src/lib.rs

+                    let (file_path, file_hash) = store_dir
+                        .write_cas_file(scope, buffer, file_is_executable)
+                        .map_err(TarballError::WriteCasFile)?;


This seems to be the only IO operation in the entire scope. And none of the operations that follow depend on this IO operation being done either. Meaning, we could potentially defer this operation after building the cas_paths (i.e. do the IO on the created cas_paths). See if this suggestion can improve the code quality and performance.

KSXGitHub · 2023-11-21T16:47:57Z

A new PR has been merged into main. It allows pacquet to work with a big lockfile. You may pull from main (or rebase) and benchmark with this lockfile: https://gist.github.com/KSXGitHub/ca8076a18b384b6cb6eea86d4ec7c675 (The --fixture-dir/-D flag would allow you to change lockfile without copy pasting)

codecov · 2023-11-22T04:15:01Z

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (c62a631) 92.65% compared to head (4569c7f) 92.50%.

❗ Current head 4569c7f differs from pull request most recent head 339b83a. Consider uploading reports for the commit 339b83a to get more accurate results

Files	Patch %	Lines
crates/tarball/src/lib.rs	86.15%	9 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #190      +/-   ##
==========================================
- Coverage   92.65%   92.50%   -0.16%     
==========================================
  Files          57       57              
  Lines        2847     2881      +34     
==========================================
+ Hits         2638     2665      +27     
- Misses        209      216       +7

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

KSXGitHub · 2023-11-22T04:20:47Z

@kdy1 main is faster when run just integrated-benchmark -s frozen-lockfile main HEAD -V -D ./crates/testing-utils/src/fixtures/big/ on this branch:

kdy1 · 2023-11-22T04:35:46Z

It's expected because you are using a beast machine and the lockfile is for a very small project.

The PR is about limiting the number of threads used for syscalls.

KSXGitHub · 2023-11-22T04:43:47Z

the lockfile is for a very small project

I added -D ./crates/testing-utils/src/fixtures/big/ which contains a big lockfile.

The PR is about limiting the number of threads used for syscalls.

Is there a mechanism to scale the number of syscalls limit based on number of CPUs? Yesterday, there was a merged PR that scale the concurrent network requests limit based on number of CPUs.

Also, is the perf: in the title misleading?

kdy1 · 2023-11-22T05:28:25Z

rayon has an issue with a beast machine. rayon-rs/rayon#394

Can you try limiting the number of rayon threads by configuring RAYON_NUM_THREADS? It's documented at https://docs.rs/rayon/latest/rayon/struct.ThreadPoolBuilder.html#method.num_threads

And the perf is intentional, as I expected it to improve the performance. (rayon is not a good fit for syscalls, but we should not mix rayon and spawn_blocking for syscalls anyway)

Co-authored-by: Khải <[email protected]>

crates/tarball/Cargo.toml

KSXGitHub · 2023-11-22T05:54:50Z

@kdy1 Should the RAYON_NUM_THREADS be low or high? Since I/O are blocking operations, which won't be benefited from having many parallel threads, it would be counterproductive to hog all CPUs for themselves. So counterintuitively, the more CPUs the machine has, the slower this code will be? Am I correct? If this is the case, then would it be more appropriate to have a dedicated thread reserved for only I/O operations?

kdy1 · 2023-11-22T06:14:42Z

It should be lower.

So counterintuitively, the more CPUs the machine has, the slower this code will be

This is partially correct, but this is not related to IO but rather a bug of rayon.
But as it's about communication cost, it goes like

         .
      .    .  
   .          .
.                .

The performance graph will look like this because n is very small here. The communication cost is O(n^2), which is still small if the CPU core is in small range

KSXGitHub · 2023-11-22T06:23:59Z

@kdy1 Is RAYON_NUM_THREADS=16 just integrated-benchmark -s frozen-lockfile main HEAD -V the correct way to run it? main was always 3 times faster than HEAD regardless of the number.

kdy1 · 2023-11-22T06:29:22Z

Hmmm... Yeah, it's correct. So we should move syscalls to spawn_blocking, not rayon

KSXGitHub · 2023-11-23T12:01:27Z

@kdy1 Actually, my hypothesis regarding I/O operations is proven incorrect.

I have created a test repo which benchmark 4 cases. The result was parallel I/O + serial CPU is the fastest.

kdy1 added 14 commits November 12, 2023 03:14

malloc

a195be2

swc_malloc

46b4ade

Dep on rayon

1839714

cargo lockfile

8aba776

Dep on num_cpus

af4045d

cargo lockfile

3297bdb

Enlarge IO thread pool

3c3166c

doc

8662d94

Dep on rayon

8b41c42

cargo lockfile

28678b5

Dep on rayon

80254ec

cargo lockfile

18512c6

Make writing of cas file happen on other thread

7afc31d

Make writing of lazy file happen on other thread

6515ebd

KSXGitHub requested review from anonrig and zkochan November 21, 2023 16:25

KSXGitHub reviewed Nov 21, 2023

View reviewed changes

KSXGitHub changed the title ~~perf: Spawn (parallelize) io tasks related to tarball~~ perf: spawn (parallelize) io tasks related to tarball Nov 22, 2023

KSXGitHub added 2 commits November 22, 2023 11:08

chore(git): merge from main

23a69f3

fix: clippy

4569c7f

Update crates/tarball/Cargo.toml

e93df61

Co-authored-by: Khải <[email protected]>

KSXGitHub reviewed Nov 22, 2023

View reviewed changes

crates/tarball/Cargo.toml Outdated Show resolved Hide resolved

kdy1 changed the title ~~perf: spawn (parallelize) io tasks related to tarball~~ [TRY] perf: spawn (parallelize) io tasks related to tarball Nov 22, 2023

fmt

339b83a

kdy1 closed this Nov 22, 2023

kdy1 deleted the perf-3-spawn-io-tasks branch November 22, 2023 06:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[TRY] perf: spawn (parallelize) io tasks related to tarball #190

[TRY] perf: spawn (parallelize) io tasks related to tarball #190

kdy1 commented Nov 11, 2023 •

edited

Loading

kdy1 commented Nov 11, 2023

KSXGitHub left a comment

KSXGitHub Nov 21, 2023

KSXGitHub commented Nov 21, 2023 •

edited

Loading

codecov bot commented Nov 22, 2023 •

edited

Loading

KSXGitHub commented Nov 22, 2023

kdy1 commented Nov 22, 2023

KSXGitHub commented Nov 22, 2023

kdy1 commented Nov 22, 2023 •

edited

Loading

KSXGitHub commented Nov 22, 2023

kdy1 commented Nov 22, 2023

KSXGitHub commented Nov 22, 2023

kdy1 commented Nov 22, 2023

KSXGitHub commented Nov 23, 2023

[TRY] perf: spawn (parallelize) io tasks related to tarball #190

[TRY] perf: spawn (parallelize) io tasks related to tarball #190

Conversation

kdy1 commented Nov 11, 2023 • edited Loading

kdy1 commented Nov 11, 2023

KSXGitHub left a comment

Choose a reason for hiding this comment

KSXGitHub Nov 21, 2023

Choose a reason for hiding this comment

KSXGitHub commented Nov 21, 2023 • edited Loading

codecov bot commented Nov 22, 2023 • edited Loading

Codecov Report

KSXGitHub commented Nov 22, 2023

kdy1 commented Nov 22, 2023

KSXGitHub commented Nov 22, 2023

kdy1 commented Nov 22, 2023 • edited Loading

KSXGitHub commented Nov 22, 2023

kdy1 commented Nov 22, 2023

KSXGitHub commented Nov 22, 2023

kdy1 commented Nov 22, 2023

KSXGitHub commented Nov 23, 2023

kdy1 commented Nov 11, 2023 •

edited

Loading

KSXGitHub commented Nov 21, 2023 •

edited

Loading

codecov bot commented Nov 22, 2023 •

edited

Loading

kdy1 commented Nov 22, 2023 •

edited

Loading