Running python simple_ui.py fails immediately with the following error.
(mindspore) root@autodl-container-bff2469f3e-a4796232:~/autodl-tmp/ChatPDF# python simple_ui.py
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.559 seconds.
Prefix dict has been built successfully.
2024-08-09 09:59:49.937 | INFO | __main__:<module>:32 - Namespace(gen_model_type='auto', gen_model_name='./.mindnlp/model/01ai/Yi-6B-Chat', lora_model=None, rerank_model_name=None, corpus_files='sample.pdf', int4=False, int8=False, chunk_size=220, chunk_overlap=0, num_expand_context_chunk=1, server_name='0.0.0.0', server_port=8082, share=False)
The following parameters in checkpoint files are not loaded:
['embeddings.position_ids']
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]MindSpore do not support bfloat16 dtype, we will automaticlly convert to float16
Loading checkpoint shards: 100%|████████████████████████████████████████████████████| 3/3 [00:18<00:00, 6.14s/it]
2024-08-09 10:00:12.427 | INFO | msimilarities.bert_similarity:add_corpus:105 - Start computing corpus embeddings, new docs: 212
Batches: 100%|██████████████████████████████████████████████████████████████████████| 7/7 [00:16<00:00, 2.39s/it]
2024-08-09 10:00:29.184 | INFO | msimilarities.bert_similarity:add_corpus:117 - Add 212 docs, total: 212, emb len: 212
2024-08-09 10:00:29.185 | INFO | msimilarities.literal_similarity:add_corpus:395 - Add corpus done, new docs: 212, all corpus size: 212
2024-08-09 10:00:29.336 | INFO | msimilarities.literal_similarity:build_bm25:405 - Total corpus: 212
2024-08-09 10:00:29.336 | DEBUG | chatpdf:add_corpus:281 - files: ['sample.pdf'], corpus size: 212, top3: ['Style Transfer from Non-Parallel Text byCross-AlignmentTianxiao Shen1Tao Lei2Regina Barzilay1Tommi Jaakkola11MIT CSAIL2ASAPP Inc.', '1{tianxiao, regina, tommi}@[email protected] paper focuses on style transfer on the basis of non-parallel text.', 'This is aninstance of a broad family of problems including machine translation, decipherment,and sentiment modification. The key challenge is to separate the content fromother aspects such as style.']
Traceback (most recent call last):
  File "/root/autodl-tmp/ChatPDF/simple_ui.py", line 34, in <module>
    model = ChatPDF(
  File "/root/autodl-tmp/ChatPDF/chatpdf.py", line 184, in __init__
    self.rerank_tokenizer = AutoTokenizer.from_pretrained(rerank_model_name_or_path, mirror='modelscope')
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/models/auto/tokenization_auto.py", line 775, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/tokenization_utils_base.py", line 1723, in from_pretrained
    return cls._from_pretrained(
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/tokenization_utils_base.py", line 1942, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/models/xlm_roberta/tokenization_xlm_roberta_fast.py", line 154, in __init__
    super().__init__(
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/tokenization_utils_fast.py", line 106, in __init__
    fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/convert_slow_tokenizer.py", line 1388, in convert_slow_tokenizer
    return converter_class(transformer_tokenizer).converted()
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/convert_slow_tokenizer.py", line 533, in converted
    pre_tokenizer = self.pre_tokenizer(replacement, add_prefix_space)
  File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindnlp/transformers/convert_slow_tokenizer.py", line 515, in pre_tokenizer
    return pre_tokenizers.Metaspace(replacement=replacement, add_prefix_space=add_prefix_space)
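The traceback cuts off before the exception message, but the failing call points at a likely culprit: newer releases of the tokenizers package (0.19 and later) removed the add_prefix_space keyword from pre_tokenizers.Metaspace, which mindnlp's convert_slow_tokenizer.py still passes. Below is a minimal sketch for checking this hypothesis; the version boundary and the prepend_scheme replacement are assumptions about the tokenizers API, not facts taken from the log above.

# probe_metaspace.py - check whether the installed tokenizers release still
# accepts the old Metaspace signature that mindnlp's converter relies on
from tokenizers import pre_tokenizers

try:
    # Old signature, as called in convert_slow_tokenizer.py line 515
    pre_tokenizers.Metaspace(replacement="▁", add_prefix_space=True)
    print("old Metaspace signature accepted; the crash lies elsewhere")
except TypeError as e:
    print(f"old signature rejected: {e}")
    # Assumption: tokenizers >= 0.19 replaced the flag with prepend_scheme
    pre_tokenizers.Metaspace(replacement="▁", prepend_scheme="always")
    print("new-style signature works; pinning tokenizers<0.19 may help")

If the old signature is rejected, downgrading with pip install "tokenizers<0.19" should let AutoTokenizer.from_pretrained build the fast tokenizer again until the converter is updated for the new API.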