Commit

Update create_pretraining_data.py
sushreebarsa authored Oct 20, 2023
1 parent 57659eb commit 0587142
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion official/nlp/data/create_pretraining_data.py
@@ -432,7 +432,7 @@ def _contiguous(sorted_grams):
def _masking_ngrams(grams, max_ngram_size, max_masked_tokens, rng):
"""Create a list of masking {1, ..., n}-grams from a list of one-grams.
- This is an extention of 'whole word masking' to mask multiple, contiguous
+ This is an extension of 'whole word masking' to mask multiple, contiguous
words such as (e.g., "the red boat").
Each input gram represents the token indices of a single word,
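
The docstring in this hunk describes building masking {1, ..., n}-grams out of one-grams, where each one-gram holds the token indices of a single word. The following is a minimal sketch of that idea only, not the code in create_pretraining_data.py. The function name `sketch_masking_ngrams`, the `(begin, end)` tuple representation of a gram, and the uniform shuffle used for selection are all assumptions made for illustration; the real `_masking_ngrams` may represent grams and weight n-gram sizes differently.

```python
import random
from typing import List, Tuple

# Hypothetical representation: (begin, end) token indices of one word,
# with `end` exclusive. Not taken from the repository.
Gram = Tuple[int, int]


def sketch_masking_ngrams(grams: List[Gram], max_ngram_size: int,
                          max_masked_tokens: int,
                          rng: random.Random) -> List[Gram]:
    """Pick non-overlapping spans of 1..max_ngram_size contiguous words."""
    # Enumerate every candidate span made of n adjacent words.
    candidates = []
    for n in range(1, max_ngram_size + 1):
        for start in range(len(grams) - n + 1):
            window = grams[start:start + n]
            # Keep the window only if each word ends where the next begins.
            if all(a[1] == b[0] for a, b in zip(window, window[1:])):
                candidates.append((window[0][0], window[-1][1]))

    # Assumption: uniform random order; the real code may weight by size.
    rng.shuffle(candidates)

    masked = []
    covered = set()
    for begin, end in candidates:
        span = set(range(begin, end))
        # Skip spans that would exceed the token budget.
        if len(covered) + len(span) > max_masked_tokens:
            continue
        # Skip spans that overlap an already selected span.
        if covered & span:
            continue
        covered |= span
        masked.append((begin, end))
    return sorted(masked)


# Four words; the gap between token 4 and token 5 breaks contiguity.
example_grams = [(0, 1), (1, 3), (3, 4), (5, 6)]
print(sketch_masking_ngrams(example_grams, 3, 4, random.Random(0)))
```

Because every selected span starts and ends on a word boundary, a multi-token word is always masked as a whole, which is the 'whole word masking' property the docstring extends to multi-word spans.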