Integrate ModularTokenizerOp with Hugging Face remote 🤗 #368
@@ -58,7 +58,7 @@ exclude =
 [mypy]
-python_version = 3.7
+python_version = 3.9
Changed due to mypy pre-commit errors.
def save_pretrained(self, save_directory: Union[str, Path]) -> None:
    print(f"Saving @ {save_directory=}")
    self._tokenizer.save(path=str(save_directory))
As far as I understand, all of the information is stored in the tokenizer, which is why saving it is enough.
Correct
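For reference, the point confirmed above (the op keeps its whole state in the wrapped tokenizer, so saving the tokenizer is sufficient) can be illustrated with a small round trip. This is only a sketch under stated assumptions: `ToyTokenizer`, `ToyTokenizerOp`, `from_local`, and `round_trip` are hypothetical stand-ins written for this example and are not the classes in this PR.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Union


class ToyTokenizer:
    """Hypothetical stand-in for the wrapped tokenizer; not the real class."""

    def __init__(self, vocab: Dict[str, int]) -> None:
        self.vocab = vocab

    def save(self, path: str) -> None:
        # Persist the full tokenizer state into the target directory.
        out_dir = Path(path)
        out_dir.mkdir(parents=True, exist_ok=True)
        (out_dir / "vocab.json").write_text(json.dumps(self.vocab))

    @classmethod
    def load(cls, path: str) -> "ToyTokenizer":
        return cls(json.loads((Path(path) / "vocab.json").read_text()))


class ToyTokenizerOp:
    """Hypothetical op that keeps all of its state in `_tokenizer`."""

    def __init__(self, tokenizer: ToyTokenizer) -> None:
        self._tokenizer = tokenizer

    def save_pretrained(self, save_directory: Union[str, Path]) -> None:
        # Delegating is enough because the op holds no state of its own.
        self._tokenizer.save(path=str(save_directory))

    @classmethod
    def from_local(cls, directory: Union[str, Path]) -> "ToyTokenizerOp":
        return cls(ToyTokenizer.load(str(directory)))


def round_trip(vocab: Dict[str, int]) -> Dict[str, int]:
    """Save an op to a temporary directory, reload it, and return its vocab."""
    with tempfile.TemporaryDirectory() as tmp:
        ToyTokenizerOp(ToyTokenizer(vocab)).save_pretrained(tmp)
        return ToyTokenizerOp.from_local(tmp)._tokenizer.vocab
```

If the op ever gains state outside `_tokenizer`, a round trip like this would stop being lossless, which would be the signal to extend `save_pretrained`.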
    cache_dir: Optional[Union[str, Path]] = None,
    local_files_only: bool = False,
    revision: Optional[str] = None,
) -> "ModularTokenizerOp":
Add a comment that directs readers to snapshot_download for details on the arguments.
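One possible shape for that comment, shown as a minimal sketch rather than the code in this PR: the docstring points to `huggingface_hub.snapshot_download` for the forwarded arguments. The helper name `resolve_tokenizer_dir` and the first parameter `identifier` are hypothetical; only `cache_dir`, `local_files_only`, and `revision` come from the signature under review.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_tokenizer_dir(
    identifier: str,
    cache_dir: Optional[Union[str, Path]] = None,
    local_files_only: bool = False,
    revision: Optional[str] = None,
) -> Path:
    """Return a local directory that holds the tokenizer files.

    `cache_dir`, `local_files_only`, and `revision` are forwarded unchanged to
    `huggingface_hub.snapshot_download`; see its documentation for details.
    """
    local = Path(identifier)
    if local.is_dir():
        # Already a local directory: nothing to download.
        return local

    # Imported lazily so the local-path case works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return Path(
        snapshot_download(
            repo_id=identifier,
            cache_dir=cache_dir,
            local_files_only=local_files_only,
            revision=revision,
        )
    )
```

Keeping the argument names identical to those of `snapshot_download` means the docstring can defer to the upstream documentation instead of duplicating it.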
LGTM
Added inline suggestion to consider.
Support an HF-like API to integrate the modular_tokenizer_op with the HF remote.
API: