Commit

update readme and help
hleft committed Mar 15, 2023
1 parent 4e42860 commit cc85d6b
Showing 2 changed files with 15 additions and 7 deletions.
15 changes: 9 additions & 6 deletions README.md
@@ -16,33 +16,36 @@ The bilingual_book_maker is an AI translation tool that uses ChatGPT to assist u
## Use

- Install the dependencies with `pip install -r requirements.txt`, or install the package with `pip install -U bbook_maker`.
- Use `--openai_key` option to specify OpenAI API key. If you have multiple keys, separate them by commas (xxx,xxx,xxx) to reduce errors caused by API call limits.
Or, just set environment variable `BMM_OPENAI_API_KEY` instead.
- A sample book, `test_books/animal_farm.epub`, is provided for testing purposes.
- The default underlying model is [GPT-3.5-turbo](https://openai.com/blog/introducing-chatgpt-and-whisper-apis), which is currently used by ChatGPT. Use `--model gpt3` to change the underlying model to `GPT3`.
- DeepL is also supported through [DeepL Translator](https://rapidapi.com/splintPRO/api/deepl-translator). You need to pay to get a token. Use `--model deepl --deepl_key ${deepl_key}`.
- Use `--test` option to preview the result if you haven't paid for the service. Note that there is a limit and it may take some time.
- Set the target language like `--language "Simplified Chinese"`. Default target language is `"Simplified Chinese"`.
See the available languages in the help message: `python make_book.py --help`
- Use `--proxy` option to specify proxy server for internet access. Enter a string such as `http://127.0.0.1:7890`.
- Use `--resume` option to manually resume the process after an interruption.
- An epub is made of HTML files. By default, only the contents of `<p>` tags are translated.
Use `--translate-tags` to specify which tags to translate. Use commas to separate multiple tags. For example:
`--translate-tags h1,h2,h3,p,div`
- Use the `--book_from` option to specify the e-reader type (currently only `kobo` is available), and use `--device_path` to specify the mount point.
- To use a different api_base, for example one served through Cloudflare Workers, pass `--api_base <URL>`.
**Note: the API URL should be '`https://xxxx/v1`'. Quotation marks are required.**
- To tweak the prompt, use the `--prompt` parameter. Valid placeholders for the `user` role template include `{text}` and `{language}`. It supports a few ways to configure the prompt:
If you don't need to set the `system` role content, you can simply set it up like this: `--prompt "Translate {text} to {language}."` or `--prompt prompt_template_sample.txt` (example of a text file can be found at [./prompt_template_sample.txt](./prompt_template_sample.txt)).
If you need to set the `system` role content, you can use the following format: `--prompt '{"user":"Translate {text} to {language}", "system": "You are a professional translator."}'` or `--prompt prompt_template_sample.json` (example of a JSON file can be found at [./prompt_template_sample.json](./prompt_template_sample.json)).
You can also set the `user` and `system` role prompt by setting environment variables: `BBM_CHATGPTAPI_USER_MSG_TEMPLATE` and `BBM_CHATGPTAPI_SYS_MSG`.
- Once the translation is complete, a bilingual book named `${book_name}_bilingual.epub` is generated.
- If an error occurs, or if you interrupt the translation by pressing `CTRL+C`, a book named `${book_name}_bilingual_temp.epub` is generated. You can simply rename it to any desired name.
- If you want to translate strings in an e-book that aren't labeled with any tags, you can use the `--allow_navigable_strings` parameter. This will add the strings to the translation queue. **Note that it's best to look for e-books that are more standardized if possible.**
- Use the `--batch_size` parameter to specify the number of lines for batch translation (default is 10, currently only effective for txt files).
- Use `--accumulated_num` to set how many tokens are accumulated before the translation starts. gpt3.5 limits the total_token to 4090. For example, with `--accumulated_num 1600`, OpenAI might output about 2200 tokens, and the system and user messages might take about 200 more: 1600+2200+200=4000, which is close to the limit. You have to choose the value yourself, because there is no way to know whether the limit will be reached before sending.
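
The token arithmetic in the `--accumulated_num` note can be checked with a small sketch. The helper names below are hypothetical and are not part of bilingual_book_maker; the 4090 limit and the 1600/2200/200 figures are taken from the note itself, and the output and overhead counts are only estimates that you would supply.

```python
# Hypothetical helpers for budgeting tokens; not part of bilingual_book_maker.
# The limit below is the figure quoted in the README note.
TOKEN_LIMIT = 4090


def estimated_total(accumulated_num: int, expected_output: int, overhead: int) -> int:
    """Add up the input, expected output, and message-overhead token counts."""
    return accumulated_num + expected_output + overhead


def remaining_headroom(accumulated_num: int, expected_output: int, overhead: int) -> int:
    """Return how many tokens are left under TOKEN_LIMIT (negative if over)."""
    return TOKEN_LIMIT - estimated_total(accumulated_num, expected_output, overhead)


# The worked example from the note: 1600 + 2200 + 200.
print(estimated_total(1600, 2200, 200))     # 4000
print(remaining_headroom(1600, 2200, 200))  # 90
```

Because the real output length is unknown until the response arrives, a sketch like this only helps pick a conservative starting value.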

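The two inline `--prompt` forms described in the list above (a plain template string, or a JSON object with `user` and `system` keys) can be illustrated as follows. This is a minimal sketch under assumptions: `parse_prompt` and `render_user_message` are hypothetical names, the code is not the tool's actual implementation, and it does not handle the file-path or environment-variable forms.

```python
import json
from typing import Optional


def parse_prompt(raw: str) -> dict:
    """Split a prompt argument into 'user' and 'system' parts.

    A JSON object with a "user" key is treated as the JSON form;
    anything else is treated as a plain user template.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, dict) and "user" in data:
        system: Optional[str] = data.get("system")
        return {"user": data["user"], "system": system}
    return {"user": raw, "system": None}


def render_user_message(template: str, text: str, language: str) -> str:
    """Fill the {text} and {language} placeholders of a user template."""
    return template.format(text=text, language=language)


plain = parse_prompt("Translate {text} to {language}.")
print(render_user_message(plain["user"], "Hello", "Simplified Chinese"))
# Translate Hello to Simplified Chinese.

with_system = parse_prompt(
    '{"user":"Translate {text} to {language}", "system": "You are a professional translator."}'
)
print(with_system["system"])  # You are a professional translator.
```

Note that `str.format` raises an error if the template contains braces other than the two placeholders, so a real implementation might need a more forgiving substitution.
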
### Examples

7 changes: 6 additions & 1 deletion book_maker/cli.py
@@ -175,7 +175,12 @@ def main():
dest="accumulated_num",
type=int,
default=1,
help="Wait for how many tokens have been accumulated before starting the translation",
help="""Number of tokens to accumulate before starting the translation.
gpt3.5 limits the total_token to 4090.
For example, with --accumulated_num 1600, OpenAI might output about 2200 tokens,
and the system and user messages might take about 200 more: 1600+2200+200=4000,
which is close to the limit. You have to choose the value yourself, because there is no way to know whether the limit will be reached before sending.
""",
)
parser.add_argument(
"--batch_size",