fix (#90) split text if token larger than 4096 #106

Open
wants to merge 6 commits into
base: main

jeffery9
Contributor

@jeffery9 jeffery9 commented Mar 8, 2023

Split the text when its token count exceeds the quota.

@jeffery9
Contributor Author

jeffery9 commented Mar 8, 2023

should fix #90

@jeffery9 jeffery9 closed this Mar 8, 2023
@jeffery9 jeffery9 reopened this Mar 8, 2023
@jeffery9 jeffery9 marked this pull request as ready for review March 8, 2023 06:40
@jeffery9 jeffery9 changed the title from "fix #90" to "split if token larger than 4096, try to fix #90" Mar 8, 2023
@jeffery9 jeffery9 changed the title from "split if token larger than 4096, try to fix #90" to "[fix #90] split text if token larger than 4096" Mar 10, 2023
@jeffery9 jeffery9 changed the title from "[fix #90] split text if token larger than 4096" to "fix (#90) split text if token larger than 4096" Mar 10, 2023
Comment on lines 22 to 106

message_log = [
    {
        "role": "user",
        # english prompt here to save tokens
        "content": f"Please help me to translate,`{text}` to {self.language}, please return only translated content not include the origin text",
    }
]
count_tokens = num_tokens_from_messages(message_log)
consumed_tokens = 0
t_text = ""
if count_tokens > 4000:
    print("too long!")

    splits = count_tokens // 4000 + 1

    text_list = text.split(".")
    sub_text = ""
    t_sub_text = ""
    for n in range(splits):
        text_segment = text_list[n * splits : (n + 1) * splits]
        sub_text = ".".join(text_segment)
        print(sub_text)

        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    # english prompt here to save tokens
                    "content": f"Please help me to translate,`{sub_text}` to {self.language}, please return only translated content not include the origin text",
                }
            ],
        )
        t_sub_text = (
            completion["choices"][0]
            .get("message")
            .get("content")
            .encode("utf8")
            .decode()
        )
        print(t_sub_text)
        consumed_tokens += completion["usage"]["prompt_tokens"]

        t_text = t_text + t_sub_text

else:
    try:
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    # english prompt here to save tokens
                    "content": f"Please help me to translate,`{text}` to {self.language}, please return only translated content not include the origin text",
                }
            ],
        )
        t_text = (
            completion["choices"][0]
            .get("message")
            .get("content")
            .encode("utf8")
            .decode()
        )
        consumed_tokens += completion["usage"]["prompt_tokens"]

    except Exception as e:
        # TIME LIMIT for open api please pay
        key_len = self.key.count(",") + 1
        sleep_time = int(60 / key_len)
        time.sleep(sleep_time)
        print(e, f"will sleep {sleep_time} seconds")
        self.rotate_key()
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    "content": f"Please help me to translate,`{text}` to {self.language}, please return only translated content not include the origin text",
                }
            ],
        )
        t_text = (
            completion["choices"][0]
            .get("message")
            .get("content")
            .encode("utf8")
            .decode()
        )
        consumed_tokens += completion["usage"]["prompt_tokens"]

print(t_text)
print(f"{consumed_tokens} prompt tokens used.")
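A note on the chunking loop in this diff: the slice `text_list[n * splits : (n + 1) * splits]` uses `splits` as both the number of chunks and the number of sentences per chunk, so only the first `splits * splits` sentences are sent, and the "." between chunks is lost when the results are concatenated. One possible alternative, offered only as a sketch and not as the code that was eventually merged, is to pack sentences greedily against a token budget. The names `split_by_token_budget` and `word_count` are hypothetical, and `word_count` is a stand-in for a real counter such as the `num_tokens_from_messages` helper used in the diff.

```python
import re
from typing import Callable, List


def split_by_token_budget(
    text: str,
    max_tokens: int,
    count_tokens: Callable[[str], int],
) -> List[str]:
    """Greedily pack sentences into chunks that stay within max_tokens."""
    # Each piece keeps its trailing period, so "".join(chunks) == text.
    sentences = re.findall(r"[^.]*\.|[^.]+$", text)
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = current + sentence
        if current and count_tokens(candidate) > max_tokens:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def word_count(s: str) -> int:
    # Illustration only: whitespace words, not real model tokens.
    return len(s.split())


text = "One two three. Four five. Six seven eight nine."
chunks = split_by_token_budget(text, 5, word_count)
for chunk in chunks:
    print(repr(chunk))
```

A single sentence that is longer than the budget still ends up as one oversized chunk in this sketch, so a real implementation would need a fallback for that case.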
Owner

This function is too long; let's split it.
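One way such a split could look, as a rough sketch and not the refactor that was actually pushed: the diff repeats the same message payload three times and retries once after sleeping and rotating the key, so those two pieces can be pulled into helpers. `build_messages` and `call_with_retry` are hypothetical names; the `request` callable stands in for the `openai.ChatCompletion.create(...)` call, and `on_failure` stands in for `self.rotate_key()`.

```python
import time
from typing import Callable, Dict, List


def build_messages(text: str, language: str) -> List[Dict[str, str]]:
    """Build the single-message chat payload used for every request."""
    return [
        {
            "role": "user",
            "content": (
                f"Please help me to translate,`{text}` to {language}, "
                "please return only translated content not include the origin text"
            ),
        }
    ]


def call_with_retry(
    request: Callable[[], str],
    on_failure: Callable[[], None],
    sleep_seconds: float,
    attempts: int = 2,
) -> str:
    """Run request(); on an exception, sleep, call on_failure(), then retry."""
    for attempt in range(attempts):
        try:
            return request()
        except Exception as exc:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            print(exc, f"will sleep {sleep_seconds} seconds")
            time.sleep(sleep_seconds)
            on_failure()
    raise RuntimeError("attempts must be at least 1")


# Usage with a fake request that fails once, then succeeds.
calls = {"requests": 0, "rotations": 0}


def flaky_request() -> str:
    calls["requests"] += 1
    if calls["requests"] == 1:
        raise RuntimeError("rate limited")
    return "translated"


def rotate() -> None:
    calls["rotations"] += 1


result = call_with_retry(flaky_request, rotate, sleep_seconds=0)
print(result, calls)
```

Passing the request as a callable keeps the helper independent of the OpenAI client, so both the long-text path and the short-text path could share it.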

book_maker/utils.py (outdated review thread, resolved)
Contributor Author

@jeffery9 jeffery9 left a comment

refactored.

Contributor Author

@jeffery9 jeffery9 left a comment

verified

@jeffery9
Contributor Author

'black . --check' passes locally, but CI does not pass.

@jeffery9 jeffery9 requested review from yihong0618 March 13, 2023 05:17
@yihong0618
Owner

pip install -U black

@jeffery9
Contributor Author

pip install -U black

Already formatted.


jeffery@jeffery-MBP ~/repos/bilingual_book_maker (split_p) $ python3.9 -m pip install -U black
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://mirrors.163.com/pypi/simple/
Requirement already satisfied: black in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (21.6b0)
Collecting black
  Using cached https://mirrors.163.com/pypi/packages/9b/27/b2f98b627738b02dcac06ae9e2ab13f14ab906fe6dd6366050c76883d4b5/black-21.12b0-py3-none-any.whl (156 kB)
Requirement already satisfied: click>=7.1.2 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (8.0.3)
Requirement already satisfied: mypy-extensions>=0.4.3 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (0.4.3)
  Using cached https://mirrors.163.com/pypi/packages/c7/24/0de05480822e5f0f2cc539fce9029bc2507b44b7f85ec1a9e23d89dea6c3/black-21.11b1-py3-none-any.whl (155 kB)
  Using cached https://mirrors.163.com/pypi/packages/3d/ad/1cf514e7f9ee4c3d8df7c839d7977f7605ad76557f3fca741ec67f76dba6/black-21.11b0-py3-none-any.whl (155 kB)
  Using cached https://mirrors.163.com/pypi/packages/12/df/0e55791b9c6ca07b4a3404eef6cee1ca42503bf16e9fc9df0247b4803cf1/black-21.10b0-py3-none-any.whl (150 kB)
  Using cached https://mirrors.163.com/pypi/packages/d2/16/a92c999103bee1236dd93f703f3522217fe00bd97bd50ae3699c2d91e320/black-21.9b0-py3-none-any.whl (148 kB)
  Using cached https://mirrors.163.com/pypi/packages/9d/11/cee7b695f95178025c428168dd75094f0e00fdcfe0fd004a0f8bc9bea3ee/black-21.8b0-py3-none-any.whl (148 kB)
  Using cached https://mirrors.163.com/pypi/packages/b6/6e/b706ab6440ebac6e0f7fb4615232216dd3bba09fa9fba6815df90601411c/black-21.7b0-py3-none-any.whl (141 kB)
Requirement already satisfied: appdirs in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (1.4.4)
Requirement already satisfied: toml>=0.10.1 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (0.10.2)
Requirement already satisfied: regex>=2020.1.8 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (2021.10.21)
Requirement already satisfied: pathspec<1,>=0.8.1 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (0.8.1)
jeffery@jeffery-MBP ~/repos/bilingual_book_maker (split_p) $ black .
All done! ✨ 🍰 ✨
17 files left unchanged.
jeffery@jeffery-MBP ~/repos/bilingual_book_maker (split_p) $ git status
On branch split_p
Your branch is up to date with 'origin/split_p'.

nothing to commit, working tree clean
jeffery@jeffery-MBP ~/repos/bilingual_book_maker (split_p) $ 

@yihong0618
Owner

No worries, I will take a look tonight or tomorrow.

wayhome pushed a commit to wayhome/bilingual_book_maker that referenced this pull request Aug 29, 2024
2 participants