From cc85d6bf161bf3161b653da7dbc9dd92a9f74630 Mon Sep 17 00:00:00 2001
From: h
Date: Wed, 15 Mar 2023 21:09:29 +0800
Subject: [PATCH] update readme and help

---
 README.md         | 15 +++++++++------
 book_maker/cli.py |  7 ++++++-
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 26f01c6b..71647232 100644
--- a/README.md
+++ b/README.md
@@ -16,13 +16,13 @@ The bilingual_book_maker is an AI translation tool that uses ChatGPT to assist u
 ## Use

 - `pip install -r requirements.txt` or `pip install -U bbook_maker`(you can use)
-- Use `--openai_key` option to specify OpenAI API key. If you have multiple keys, separate them by commas (xxx,xxx,xxx) to reduce errors caused by API call limits.
+- Use `--openai_key` option to specify OpenAI API key. If you have multiple keys, separate them by commas (xxx,xxx,xxx) to reduce errors caused by API call limits. Alternatively, set the environment variable `BBM_OPENAI_API_KEY` instead.
 - A sample book, `test_books/animal_farm.epub`, is provided for testing purposes.
 - The default underlying model is [GPT-3.5-turbo](https://openai.com/blog/introducing-chatgpt-and-whisper-apis), which is used by ChatGPT currently. Use `--model gpt3` to change the underlying model to `GPT3`
 5. support DeepL model [DeepL Translator](https://rapidapi.com/splintPRO/api/deepl-translator) need pay to get the token use `--model deepl --deepl_key ${deepl_key}`
 - Use `--test` option to preview the result if you haven't paid for the service. Note that there is a limit and it may take some time.
-- Set the target language like `--language "Simplified Chinese"`. Default target language is `"Simplified Chinese"`.
+- Set the target language like `--language "Simplified Chinese"`. Default target language is `"Simplified Chinese"`. To see all available languages, run `python make_book.py --help`.
 - Use `--proxy` option to specify proxy server for internet access. Enter a string such as `http://127.0.0.1:7890`.
 - Use `--resume` option to manually resume the process after an interruption.
@@ -30,19 +30,22 @@ The bilingual_book_maker is an AI translation tool that uses ChatGPT to assist u
   Use `--translate-tags` to specify tags need for translation. Use comma to seperate multiple tags. For example: `--translate-tags h1,h2,h3,p,div`
 - Use `--book_from` option to specify e-reader type (Now only `kobo` is available), and use `--device_path` to specify the mounting point.
-- If you want to change api_base like using Cloudflare Workers, use `--api_base ` to support it.
+- If you want to change api_base like using Cloudflare Workers, use `--api_base ` to support it. **Note: the API URL should be of the form `'https://xxxx/v1'`; the quotation marks are required.**
 - Once the translation is complete, a bilingual book named `${book_name}_bilingual.epub` would be generated.
 - If there are any errors or you wish to interrupt the translation by pressing `CTRL+C`. A book named `${book_name}_bilingual_temp.epub` would be generated. You can simply rename it to any desired name.
 - If you want to translate strings in an e-book that aren't labeled with any tags, you can use the `--allow_navigable_strings` parameter. This will add the strings to the translation queue. **Note that it's best to look for e-books that are more standardized if possible.**
-- To tweak the prompt, use the `--prompt` parameter. Valid placeholders for the `user` role template include `{text}` and `{language}`. It supports a few ways to configure the prompt:
-  If you don't need to set the `system` role content, you can simply set it up like this: `--prompt "Translate {text} to {language}."` or `--prompt prompt_template_sample.txt` (example of a text file can be found at [./prompt_template_sample.txt](./prompt_template_sample.txt)).
-  If you need to set the `system` role content, you can use the following format: `--prompt '{"user":"Translate {text} to {language}", "system": "You are a professional translator."}'` or `--prompt prompt_template_sample.json` (example of a JSON file can be found at [./prompt_template_sample.json](./prompt_template_sample.json)).
+- To tweak the prompt, use the `--prompt` parameter. Valid placeholders for the `user` role template include `{text}` and `{language}`. It supports a few ways to configure the prompt:
+  If you don't need to set the `system` role content, you can simply set it up like this: `--prompt "Translate {text} to {language}."` or `--prompt prompt_template_sample.txt` (example of a text file can be found at [./prompt_template_sample.txt](./prompt_template_sample.txt)).
+  If you need to set the `system` role content, you can use the following format: `--prompt '{"user":"Translate {text} to {language}", "system": "You are a professional translator."}'` or `--prompt prompt_template_sample.json` (example of a JSON file can be found at [./prompt_template_sample.json](./prompt_template_sample.json)). You can also set the `user` and `system` role prompts via the environment variables `BBM_CHATGPTAPI_USER_MSG_TEMPLATE` and `BBM_CHATGPTAPI_SYS_MSG`.
 - Once the translation is complete, a bilingual book named `${book_name}_bilingual.epub` would be generated.
 - If there are any errors or you wish to interrupt the translation by pressing `CTRL+C`. A book named `${book_name}_bilingual_temp.epub` would be generated. You can simply rename it to any desired name.
 - If you want to translate strings in an e-book that aren't labeled with any tags, you can use the `--allow_navigable_strings` parameter. This will add the strings to the translation queue.
   **Note that it's best to look for e-books that are more standardized if possible.**
 - Use the `--batch_size` parameter to specify the number of lines for batch translation (default is 10, currently only effective for txt files).
+- Use the `--accumulated_num` parameter to set how many tokens to accumulate before starting the translation. gpt-3.5 limits the total_token to 4090. For example, if you use `--accumulated_num 1600`,
+OpenAI may output about 2200 tokens, and the system and user messages may take another 200 tokens: 1600 + 2200 + 200 = 4000, so you are close to the limit. You have to choose your own
+value; there is no way to know whether the limit will be reached before sending.

 ### Examples

diff --git a/book_maker/cli.py b/book_maker/cli.py
index 8b0358f5..66c1f5c2 100644
--- a/book_maker/cli.py
+++ b/book_maker/cli.py
@@ -175,7 +175,12 @@ def main():
         dest="accumulated_num",
         type=int,
         default=1,
-        help="Wait for how many tokens have been accumulated before starting the translation",
+        help="""Set how many tokens to accumulate before starting the translation.
+gpt-3.5 limits the total_token to 4090.
+For example, if you use --accumulated_num 1600, OpenAI may output about 2200 tokens,
+and the system and user messages may take another 200 tokens: 1600 + 2200 + 200 = 4000,
+so you are close to the limit. You have to choose your own value; there is no way to know whether the limit will be reached before sending.
+""",
     )
     parser.add_argument(
         "--batch_size",
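The `--prompt` behaviour this patch documents (a raw template string, a `.txt` file, or a JSON file with `user`/`system` roles, plus the `BBM_CHATGPTAPI_*` environment variables) can be sketched in a few lines. The helper below is a hypothetical illustration of that resolution order, not the project's actual loader; only the flag names, placeholder names, and environment variable names come from the README.

```python
import json
import os

# Hypothetical sketch of the --prompt resolution order described above:
# an explicit value (raw string, .txt file, or .json file with "user"/"system"
# keys) wins, otherwise the BBM_* environment variables, otherwise a default.
DEFAULT_USER_TEMPLATE = "Translate {text} to {language}"

def resolve_prompt(prompt_arg=None):
    user_tpl, sys_msg = DEFAULT_USER_TEMPLATE, None
    if prompt_arg:
        if prompt_arg.endswith(".json"):
            with open(prompt_arg) as f:
                cfg = json.load(f)
            user_tpl = cfg["user"]          # required key
            sys_msg = cfg.get("system")     # optional key
        elif prompt_arg.endswith(".txt"):
            with open(prompt_arg) as f:
                user_tpl = f.read().strip()
        else:
            user_tpl = prompt_arg
    else:
        user_tpl = os.environ.get("BBM_CHATGPTAPI_USER_MSG_TEMPLATE", user_tpl)
        sys_msg = os.environ.get("BBM_CHATGPTAPI_SYS_MSG", sys_msg)
    return user_tpl, sys_msg

# The {text}/{language} placeholders are filled per translated chunk:
tpl, _ = resolve_prompt("Translate {text} to {language}.")
print(tpl.format(text="Hello", language="Simplified Chinese"))
# prints: Translate Hello to Simplified Chinese.
```

A `.txt` template carries only the `user` message; the JSON form is needed whenever a `system` message should be set as well.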
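The arithmetic behind `--accumulated_num` can be made concrete. The sketch below only restates the budget from the help text (1600 accumulated + an estimated 2200 completion tokens + an estimated 200 tokens of message overhead against the 4090-token limit); the estimate numbers are assumptions for illustration, not something the tool can compute before sending.

```python
# Illustration of the --accumulated_num token budget described above.
# The 4090 limit comes from the help text; the completion and overhead
# figures are assumed estimates, since the real completion size is
# unknown until the request returns.
TOTAL_TOKEN_LIMIT = 4090

def fits_in_budget(accumulated_num, expected_completion, message_overhead):
    """Return the total token estimate and whether it stays within the limit."""
    total = accumulated_num + expected_completion + message_overhead
    return total, total <= TOTAL_TOKEN_LIMIT

total, ok = fits_in_budget(1600, 2200, 200)
print(total, ok)  # prints: 4000 True (close to the 4090 limit)
```

This is why the flag has to be chosen conservatively: a larger `--accumulated_num` leaves less headroom for the completion, and the limit can only be detected after the fact.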