Problems with the ChatPDF feature #71

Open
kevinwei1975 opened this issue Aug 9, 2024 · 0 comments

When I ran the command below, it did produce the correct result, but inference did not use the GPU; it took about 5 minutes to get the output, which is far too slow.
(mindspore) root@autodl-container-bff2469f3e-a4796232:~/autodl-tmp/ChatPDF# python chatpdf.py
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.960 seconds.
Prefix dict has been built successfully.
Namespace(sim_model_name='shibing624/text2vec-base-multilingual', gen_model_type='auto', gen_model_name='./.mindnlp/model/01ai/Yi-6B-Chat', lora_model=None, rerank_model_name='', corpus_files='sample.pdf', chunk_size=220, chunk_overlap=0, num_expand_context_chunk=1)
The following parameters in checkpoint files are not loaded:
['embeddings.position_ids']
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]MindSpore do not support bfloat16 dtype, we will automaticlly convert to float16
Loading checkpoint shards: 100%|████████████████████████████████████████████████████| 3/3 [00:24<00:00, 8.05s/it]
2024-08-09 09:44:01.138 | INFO | msimilarities.bert_similarity:add_corpus:105 - Start computing corpus embeddings, new docs: 212
Batches: 100%|██████████████████████████████████████████████████████████████████████| 7/7 [00:16<00:00, 2.34s/it]
2024-08-09 09:44:17.592 | INFO | msimilarities.bert_similarity:add_corpus:117 - Add 212 docs, total: 212, emb len: 212
2024-08-09 09:44:17.594 | DEBUG | main:add_corpus:281 - files: ['sample.pdf'], corpus size: 212, top3: ['Style Transfer from Non-Parallel Text byCross-AlignmentTianxiao Shen1Tao Lei2Regina Barzilay1Tommi Jaakkola11MIT CSAIL2ASAPP Inc.', '1{tianxiao, regina, tommi}@[email protected] paper focuses on style transfer on the basis of non-parallel text.', 'This is aninstance of a broad family of problems including machine translation, decipherment,and sentiment modification. The key challenge is to separate the content fromother aspects such as style.']
2024-08-09 09:44:17.997 | DEBUG | main:predict:475 - prompt: Based on the following known information, answer the user's question concisely and professionally.
If the answer cannot be derived from it, reply "the question cannot be answered from the known information" or "not enough relevant information was provided"; do not add fabricated content to the answer, and answer in Chinese.

Known content:
[1] "ReferencesPeter F Brown, John Cocke, Stephen A Della Pietra, Vincent J Della Pietra, Fredrick Jelinek, John DLafferty, Robert L Mercer, and Paul S Roossin. A statistical approach to machine translation.Computational linguistics , 16(2):79–85, 1990. Tong Che, Yanran Li, Ruixiang Zhang, R Devon Hjelm, Wenjie Li, Yangqiu Song, and YoshuaBengio. Maximum-likelihood augmented discrete generative adversarial networks.arXiv preprintarXiv:1702.07983 , 2017. Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel."
[2] "Method sentiment fluency overall transferHu et al. (2017) 70.8 3.2 41.0Cross-align 62.6 2.8 41.5Table 2: Human evaluations on sentiment, fluency and overall transfer quality.Fluency rating is from1 (unreadable) to 4 (perfect).Overall transfer quality is evaluated in a comparative manner, where thejudge is shown a source sentence and two transferred sentences, and decides whether they are bothgood, both bad, or one is better."
[3] "We demonstrate the effectiveness of this cross-alignment method on three tasks: sentiment modification, decipherment of word substitution ciphers, and recoveryof word order.11 IntroductionUsing massive amounts of parallel data has been essential for recent advances in text generation tasks,such as machine translation and summarization.However, in many text generation problems, we canonly assume access to non-parallel or mono-lingual data. Problems such as decipherment or styletransfer are all instances of this family of tasks.In all of these problems, we must preserve the contentof the source sentence but render the sentence consistent with desired presentation constraints (e.g.,style, plaintext/ciphertext)."
[4] "On the other hand, it lowers the entropy in p(xjy;z),which helps to produce meaningful style transfer in practice as we flip between y1andy2.Withoutexplicitly modeling p(z), it is still possible to force dist

Question:
What does "non-parallel transfer" in natural language refer to?

According to the above information, non-parallel transfer refers to achieving text style transfer or content without parallel corpora, using other data or methods Foster et al. (2018)'s method, which uses a language model as part of a conditional random field model, trains the language model to learn text representations, and then uses these representations to generate stylized text. The advantage of this approach is that it can be trained on large amounts of unlabeled data, which improves the model's generalization ability.
['[1]\t "ReferencesPeter F Brown, John Cocke, Stephen A Della Pietra, Vincent J Della Pietra, Fredrick Jelinek, John DLafferty, Robert L Mercer, and Paul S Roossin. A statistical approach to machine translation.Computational linguistics , 16(2):79–85, 1990. Tong Che, Yanran Li, Ruixiang Zhang, R Devon Hjelm, Wenjie Li, Yangqiu Song, and YoshuaBengio. Maximum-likelihood augmented discrete generative adversarial networks.arXiv preprintarXiv:1702.07983 , 2017. Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel."', '[2]\t "Method sentiment fluency overall transferHu et al. (2017) 70.8 3.2 41.0Cross-align 62.6 2.8 41.5Table 2: Human evaluations on sentiment, fluency and overall transfer quality.Fluency rating is from1 (unreadable) to 4 (perfect).Overall transfer quality is evaluated in a comparative manner, where thejudge is shown a source sentence and two transferred sentences, and decides whether they are bothgood, both bad, or one is better."', '[3]\t "We demonstrate the effectiveness of this cross-alignment method on three tasks: sentiment modification, decipherment of word substitution ciphers, and recoveryof word order.11 IntroductionUsing massive amounts of parallel data has been essential for recent advances in text generation tasks,such as machine translation and summarization.However, in many text generation problems, we canonly assume access to non-parallel or mono-lingual data. Problems such as decipherment or styletransfer are all instances of this family of tasks.In all of these problems, we must preserve the contentof the source sentence but render the sentence consistent with desired presentation constraints (e.g.,style, plaintext/ciphertext)."', '[4]\t "On the other hand, it lowers the entropy in p(xjy;z),which helps to produce meaningful style transfer in practice as we flip between y1andy2.Withoutexplicitly modeling p(z), it is still possible to force distributional alignment of p(zjy1)andp(zjy2). To this end, we introduce two constrained variants of auto-encoder.4.1 Aligned auto-encoderDispense with V AEs that make an explicit assumption about p(z)and align both posteriors to it, wealignpE(zjy1)andpE(zjy2)with each other, which leads to the following constrained optimizationproblem: \x12\x03= arg min\x12Lrec(\x12E;\x12G) s.t.E(x1;y1)d=E(x2;y2)x1\x18X1;x2\x18X2(5) In practice, a Lagrangian relaxation of the primal problem is instead optimized."', '[5]\t "There-fore, we report model performance with respect to the percentage of the substituted vocabulary. Notethat the transfer models do not know that fis a word substitution function.They learn it entirelyfrom the data distribution. In addition to having different transfer models, we introduce a simple decipherment baseline basedon word frequency.Specifically, we assume that words shared between X1andX2do not requiretranslation. The rest of the words are mapped based on their frequency, and ties are broken arbitrarily."', '[6]\t "They can be both satisfactory, A/B is better,or both unsatisfactory. We collect two labels for each question. 
The label agreement and conflictresolution strategy can be found in the supplementary material.Note that the two evaluations are notredundant.For instance, a system that always generates the same grammatically correct sentencewith the right sentiment independently of the source sentence will score high in the first evaluationsetup, but low in the second one."', '[7]\t "Our work mostclosely relates to approaches that do not utilize parallel data, but instead guide sentence generationfrom an indirect training signal (Mueller et al., 2017; Hu et al., 2017). For instance, Mueller et al.(2017) manipulate the hidden representation to generate sentences that satisfy a desired property (e.g.,sentiment) as measured by a corresponding classifier.However, their model does not necessarilyenforce content preservation. More similar to our work, Hu et al."', '[8]\t "MethodSubstitution decipherOrder recover20% 40% 60% 80% 100%No transfer (copy) 56.4 21.4 6.3 4.5 0 5.1Unigram matching 74.3 48.1 17.8 10.7 1.2 -Variational auto-encoder 79.8 59.6 44.6 34.4 0.9 5.3Aligned auto-encoder 81.0 68.9 50.7 45.6 7.2 5.2Cross-aligned auto-encoder 83.8 79.1 74.7 66.1 57.4 26.1Parallel translation 99.0 98.9 98.2 98.5 97.2 64.6Table 4: Bleu scores of word substitution decipherment and word order recovery.7 ConclusionTransferring languages from one style to another has been previously trained using parallel data. Inthis work, we formulate the task as a decipherment problem with access only to non-parallel data.The two data collections are assumed to be generated by a latent variable generative model."', '[9]\t "arXiv preprintarXiv:1702.07983 , 2017. Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel.Infogan: Interpretable representation learning by information maximizing generative adversarial nets. InAdvances in neural information processing systems , 2016. Qing Dou and Kevin Knight.Large scale decipherment for out-of-domain machine translation. InProceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processingand Computational Natural Language Learning , pages 266–275."', '[10]\t "Simple statistical gradient-following algorithms for connectionist reinforcementlearning. Machine learning , 8(3-4):229–256, 1992. Zili Yi, Hao Zhang, Ping Tan Gong, et al.Dualgan: Unsupervised dual learning for image-to-imagetranslation. arXiv preprint arXiv:1704.02510 , 2017. Lantao Yu, Weinan Zhang, Jun Wang, and Yong Yu. Seqgan: sequence generative adversarial netswith policy gradient.arXiv preprint arXiv:1609.05473 , 2016. Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translationusing cycle-consistent adversarial networks. arXiv preprint arXiv:1703.10593 , 2017."']
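
A quick way to narrow this down is to check which backend MindSpore is actually targeting before the model is loaded. The sketch below is only a minimal check, assuming a CUDA-enabled MindSpore build is installed and that chatpdf.py does not already set the context elsewhere; device_id=0 is an arbitrary choice for illustration.

import mindspore as ms

# Report which backend MindSpore is currently targeting.
# "CPU" here would explain why inference takes ~5 minutes.
print(ms.get_context("device_target"))

# If it prints "CPU", explicitly request the GPU backend before loading the model.
# device_id=0 is just an example; pick the card you intend to use.
ms.set_context(device_target="GPU", device_id=0)

If set_context fails with an unsupported device_target error, the installed wheel is most likely the CPU-only build of MindSpore, which would match the behaviour reported above.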
