[Feature]: Designate a vision model to provide OCR capability for text models #612

Open
mkraku opened this issue Jan 1, 2025 · 0 comments

mkraku commented Jan 1, 2025

Is your feature request related to a problem?

Please describe the solution you'd like

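For reference, see the DeepSeek website: they appear to have a dedicated OCR model that extracts the text from images and scanned documents and passes it along as context for the question.
I'd like a "vision model" entry added to the default models, so that an AI without vision capability can still hold conversations about images and documents.
SiliconFlow (硅基流动) offers cheap vision models, and Zhipu (智谱) has free ones; either is good enough for OCR.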
image
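
To make the request concrete, here is a rough sketch of the flow I have in mind. It is not taken from this project's code: `ask_with_ocr`, `build_prompt`, and the two callable types are hypothetical names, and the real vision and text models would be API calls to whichever providers are configured.

```python
from typing import Callable, List

# Hypothetical interfaces (assumptions, not this project's API):
# a vision model turns image bytes into extracted text,
# a text model turns a prompt string into a reply.
VisionModel = Callable[[bytes], str]
TextModel = Callable[[str], str]


def build_prompt(question: str, extracted: List[str]) -> str:
    """Put the OCR text of each attachment in front of the user's question."""
    if not extracted:
        return question
    blocks = [
        f"[Attachment {i} text]\n{text}"
        for i, text in enumerate(extracted, start=1)
    ]
    return "\n\n".join(blocks + [question])


def ask_with_ocr(
    text_model: TextModel,
    vision_model: VisionModel,
    question: str,
    images: List[bytes],
) -> str:
    """Run each image through the vision model, then ask the text-only model."""
    extracted = [vision_model(image) for image in images]
    return text_model(build_prompt(question, extracted))
```

With this shape, the text-only model never receives the image itself, only the text the configured vision model extracted from it, so any chat model could be paired with any cheap or free vision model.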

Please describe alternatives you've considered

No response

Additional context

No response
