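I'd like to ask how this model was fine-tuned. The paper doesn't describe it in detail; it only says that different data were used to fine-tune the modules for different tasks. How exactly was the fine-tuning done: were the modules fine-tuned separately, or jointly on the combined data? If separately, which module was tuned first?
The paper also doesn't say what the compression module actually uses. Could the authors give a detailed explanation? Does it compress with self-attention, similar to Q-Former?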
myownskyW7