Add batch size tuning docs #341

docs/user_guide/batch_size_tuning.md (23 additions, 0 deletions)

In Ludwig, users can set `batch_size` to a fixed value in the `trainer` section of the training config:
```
trainer:
    batch_size: 128
```
If the batch size is left unspecified, Ludwig defaults to `batch_size: auto`:

```
trainer:
    batch_size: auto
```
`auto` lets Ludwig select an efficient batch size automatically. The selected value is reported in the training logs and in the model output directory.

Batch size tuning is supported in single-node and multi-node CPU and GPU settings.

## ECD Models

Batch size tuning for ECD models follows this procedure, starting from a batch size of 1 (a sketch of the loop is shown after the list):
1. Perform a small number of forward passes through the model using a sample from the dataset.
2. If the model does not hit a memory error, increase the batch size and repeat from step 1. Otherwise, use the last valid batch size.
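
The following is a minimal sketch of this loop, assuming a PyTorch model, a hypothetical `sample_batch` helper, and a recent PyTorch version that exposes `torch.cuda.OutOfMemoryError`; it is not Ludwig's actual implementation, and the doubling growth schedule is only one possible way to increase the batch size.
```
# Minimal sketch of the tuning loop described above, not Ludwig's actual code.
import torch

def tune_batch_size(model, dataset, sample_batch, num_passes=5, max_batch_size=2**16):
    best_batch_size = None
    batch_size = 1  # start from batch size 1
    while batch_size <= max_batch_size:
        try:
            # Step 1: a small number of forward passes at the candidate batch size.
            for _ in range(num_passes):
                inputs = sample_batch(dataset, batch_size)  # hypothetical helper
                with torch.no_grad():
                    model(inputs)
            # Step 2: no memory error, so remember this size and try a larger one
            # (doubling is just one possible growth schedule).
            best_batch_size = batch_size
            batch_size *= 2
        except torch.cuda.OutOfMemoryError:
            # Memory error: stop and keep the last batch size that fit.
            break
    return best_batch_size
```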

## LLMs
The main element that separates LLM batch size tuning from its ECD counterpart is the sequence length. LLMs therefore go through the same batch size tuning process as ECD models, except that, instead of using a random sample from the dataset, the forward passes use a synthetic data sample whose sequence length equals the longest sequence length in the provided dataset.
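
As a rough illustration, such a synthetic sample can be built as a batch of random token ids whose length matches the longest tokenized sequence in the dataset. The snippet below is a hypothetical sketch, not Ludwig's code; `tokenizer`, `texts`, and `vocab_size` are placeholders.
```
# Hypothetical sketch of building a synthetic sample for LLM batch size tuning.
import torch

def synthetic_llm_batch(batch_size, max_sequence_length, vocab_size, device="cuda"):
    # Random token ids shaped (batch_size, max_sequence_length), so every row
    # is as long as the longest sequence observed in the dataset.
    return torch.randint(
        low=0,
        high=vocab_size,
        size=(batch_size, max_sequence_length),
        dtype=torch.long,
        device=device,
    )

# Usage (placeholders): the longest tokenized sequence sets the length.
# max_sequence_length = max(len(tokenizer.encode(text)) for text in texts)
# input_ids = synthetic_llm_batch(batch_size=4,
#                                 max_sequence_length=max_sequence_length,
#                                 vocab_size=32_000)
```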