fix: lower num_workers to 4 #4535
Conversation
For multi-task training in PyTorch, each data source has its own dataloader. If the number of workers per dataloader is large, there will be many worker processes (number of tasks * num_workers) stressing the CPU. Signed-off-by: Chun Cai <[email protected]>
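For illustration, a minimal sketch of the scaling described above (the dataset and loader construction here are placeholders, not deepmd-kit code; only the NUM_WORKERS environment variable name and its new default of 4 come from this PR, and the way it is read is simplified):

```python
import os

from torch.utils.data import DataLoader, Dataset

# Simplified read of the variable; the new default in this PR is 4 workers
# per dataloader unless NUM_WORKERS is set in the environment.
NUM_WORKERS = int(os.environ.get("NUM_WORKERS", "4"))


class DummyTaskData(Dataset):
    """Placeholder dataset standing in for one multi-task data source."""

    def __len__(self):
        return 128

    def __getitem__(self, idx):
        return idx


# One dataloader per task: with all loaders active during multi-task training,
# the total number of worker processes approaches n_tasks * NUM_WORKERS,
# which is what stresses the CPU when NUM_WORKERS is large.
n_tasks = 6
loaders = [
    DataLoader(DummyTaskData(), batch_size=8, num_workers=NUM_WORKERS)
    for _ in range(n_tasks)
]
print(f"total dataloader worker processes: {n_tasks * NUM_WORKERS}")  # 24 with the default
```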
📝 Walkthrough: The pull request modifies the default value of the NUM_WORKERS environment variable for data loading in the PyTorch backend from 8 to 4.
This change reduces the maximum number of workers that can be configured for data loading in the PyTorch backend.
Please update the documentation: https://docs.deepmodeling.com/projects/deepmd/en/stable/env.html#envvar-NUM_WORKERS
Actionable comments posted: 0
🧹 Nitpick comments (1)
doc/env.md (1)
Line range hint 75-80: Consider adding a note about performance implications. To help users make informed decisions, consider adding a note explaining:
- The trade-off between CPU usage and data loading performance
- Guidelines for adjusting this value based on specific workload requirements (e.g., single-task vs multi-task training)
Example addition:
 {{ pytorch_icon }} Number of subprocesses to use for data loading in the PyTorch backend. See [PyTorch documentation](https://pytorch.org/docs/stable/data.html) for details.
+
+Note: The default value is optimized for multi-task training scenarios to prevent excessive CPU usage. For single-task training or if you have sufficient CPU resources, you may increase this value to potentially improve data loading performance.
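As a usage sketch of the note above (hedged: exactly when the backend reads the variable is an assumption, not shown in this diff), a user with spare CPU cores could restore a higher worker count by setting the environment variable before training starts:

```python
import os

# Assumption: the PyTorch backend reads NUM_WORKERS from the environment at
# startup, so setting it before training (or exporting NUM_WORKERS=8 in the
# shell) raises the per-dataloader worker count above the new default of 4.
os.environ["NUM_WORKERS"] = "8"  # e.g. for single-task training on a large machine
```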
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
doc/env.md (1 hunks)
🔇 Additional comments (1)
doc/env.md (1)
75-75: Documentation accurately reflects the implementation change. The updated default value aligns with the PR objective to reduce CPU usage in multi-task training scenarios.
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##            devel    #4535      +/-   ##
==========================================
- Coverage   84.57%   84.57%   -0.01%
==========================================
  Files         675      675
  Lines       63695    63695
  Branches     3488     3488
==========================================
- Hits        53872    53871       -1
  Misses       8698     8698
- Partials     1125     1126       +1

☔ View full report in Codecov by Sentry.
Summary by CodeRabbit
- The default value of NUM_WORKERS was lowered from 8 to 4.