Language identification accuracy is very low. Am I using it wrong?
issues/3317 mentions using
System
Toolkit
Feature:
Code:
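import paddleclas

# Path to the image under test
file_name = r"E:\ocr\data\hand\3.jpg"
lang_model = paddleclas.PaddleClas(model_name="language_classification")
# predict() returns a generator, so materialize it before indexing
result = list(lang_model.predict(input_data=file_name))
# Top-1 label of the first image in the first batch
lang_type = result[0][0]['label_names'][0]
print('Language type:', lang_type)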
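I tested 5-6 random images, and language identification accuracy was very poor: only 1 was identified correctly.
A Chinese screenshot was identified as cyrillic and Japanese:
[[{'class_ids': [2, 4], 'scores': [0.20747, 0.1695], 'label_names': ['cyrillic', 'japan'], 'filename': 'E:\\ocr\\data\\a.png'}]]
A screenshot of a photocopied English document was identified as Traditional Chinese and Korean:
result=[[{'class_ids': [1, 6], 'scores': [0.91424, 0.01366], 'label_names': ['chinese_cht', 'korean'], 'filename': 'E:\\ocr\\data\\OCR_e2e_img\\scan.png'}]]
Chinese handwriting was identified as Latin and Arabic:
result=[[{'class_ids': [9, 0], 'scores': [0.3635, 0.1387], 'label_names': ['latin', 'arabic'], 'filename': 'E:\\ocr\\data\\hand\\1.jpg'}]]
A multilingual image (Chinese, English, Japanese, French, German) was identified as Korean and Latin:
[{'class_ids': [6, 9], 'scores': [0.55042, 0.14912], 'label_names': ['korean', 'latin'], 'filename': 'E:\\ocr\\data\\all.jpg'}]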
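For reading results like the ones pasted above, here is a minimal sketch in plain Python (PaddleClas is not needed to run it). It pulls out the top-1 label and flags weak predictions. The helper name `top_prediction` and the 0.5 threshold are illustrative assumptions, not part of PaddleClas. The nested list-of-batches shape is copied from the pasted output and may differ between versions.

```python
def top_prediction(result, min_score=0.5):
    """Return (label, score, is_confident) for the first image in a result.

    `result` is assumed to have the nested shape pasted above: a list of
    batches, each batch a list of dicts whose 'label_names' and 'scores'
    are ordered from highest to lowest score.
    `min_score` is an arbitrary, illustrative threshold.
    """
    first = result[0][0]
    label = first["label_names"][0]
    score = first["scores"][0]
    return label, score, score >= min_score


# Two of the results pasted above, copied verbatim.
chinese_screenshot = [[{"class_ids": [2, 4], "scores": [0.20747, 0.1695],
                        "label_names": ["cyrillic", "japan"],
                        "filename": "E:\\ocr\\data\\a.png"}]]
english_scan = [[{"class_ids": [1, 6], "scores": [0.91424, 0.01366],
                  "label_names": ["chinese_cht", "korean"],
                  "filename": "E:\\ocr\\data\\OCR_e2e_img\\scan.png"}]]

print(top_prediction(chinese_screenshot))  # ('cyrillic', 0.20747, False)
print(top_prediction(english_scan))        # ('chinese_cht', 0.91424, True)
```

The second case shows why a score threshold alone is not enough: the English scan is labeled chinese_cht with a score above 0.9, so a high score does not mean the label is correct.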
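My guess is that the problem is in how you are testing. The input should be individual text-line images, not a whole image. Something like this: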
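There are restrictions on the input image? That limits its practicality quite a bit.
Not really. The input to this model is the output of a text detection model. Given an image, you first run it through a text detection model to get individual text-line images, and then feed those into this model to determine the specific language.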
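The flow described above (detect text lines, crop each one, classify each crop) can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual pipeline: `detect_text_lines` and `classify_line` are hypothetical stand-ins for a text detection model and the `language_classification` model, the image is a plain nested list of pixel values, and boxes are assumed to be axis-aligned `(x0, y0, x1, y1)` tuples (a real detector may return quadrilaterals instead).

```python
from collections import Counter


def crop(image, box):
    """Crop an axis-aligned box (x0, y0, x1, y1) from an image stored as rows of pixels."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def identify_languages(image, detect_text_lines, classify_line):
    """Detect text lines, classify each cropped line, and tally the labels.

    `detect_text_lines(image)` must return a list of boxes and
    `classify_line(line_image)` must return one label string.
    Both are injected so that any detector or classifier can be plugged in.
    """
    labels = [classify_line(crop(image, box)) for box in detect_text_lines(image)]
    return Counter(labels)


# --- Toy stand-ins so the sketch runs without any model installed ---
# A 6x4 "image" with three horizontal bands acting as three text lines.
toy_image = [[10] * 4, [10] * 4, [200] * 4, [200] * 4, [20] * 4, [20] * 4]


def fake_detector(image):
    # Hypothetical detector: pretends to find three text-line boxes.
    return [(0, 0, 4, 2), (0, 2, 4, 4), (0, 4, 4, 6)]


def fake_classifier(line_image):
    # Hypothetical classifier: labels a line by its mean pixel value.
    pixels = [p for row in line_image for p in row]
    return "latin" if sum(pixels) / len(pixels) < 128 else "korean"


counts = identify_languages(toy_image, fake_detector, fake_classifier)
print(counts)                 # Counter({'latin': 2, 'korean': 1})
print(counts.most_common(1))  # [('latin', 2)]
```

Classifying per line and tallying the labels also gives a rough picture of a mixed-language image, since each line gets its own label rather than one label for the whole picture.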
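Oh, that's how it is meant to be used? Then language detection is not very useful. A more practical scenario would be predicting the language directly from the whole image.
What about an image that contains multiple languages?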
Bobholamovic