Relax continuity constraints on Annotation #362
Comments
"Attention is not not Explanation" https://aclanthology.org/D19-1002.pdf
One difficulty I noticed is that HTML is not just text with tags added in between. Some characters, like |
Good enough for HTML replacement, good enough for the visualization. Attention is all we need 🤗. Besides, we can build UI etc with the existing
We don't need to relax the continuity constraints for this, but such op support via the No hurries though, we can slowly incubate this idea. |
The alignments come from guided alignment trained from fastalign, not from attention. The alignments are what drive the HTML alignment.
jerinphilip#88 (This is early experimental code; it will take a while to merge to main).
Related: #355 (comment), #298
I have proposed jelmervdl/translatelocally-web-ext#5 for the experimental extension. A next feature on the wishlist would be an explanation like the one below. It is a little far-fetched, but someday I'd like the extension to show the visualization usually used to depict attention, as an explanation of the translation.
(Screenshot taken from https://distill.pub/2016/augmented-rnns/, so we already have JS available under a permissive license, hopefully).
#298 indicates that we are editing the annotation to get HTML in, but the subword tokens now include tag information. This is not ideal when we want to build things like the above. A solution is to relax the continuity constraints, which were imposed to tie the annotation closely to SentencePiece, to just a constraint of monotonic byte ranges.
We may look at adding methods on Annotation to insert markup in between, rather than doing it externally, keeping the whole data structure consistent. This would also make it simple to support other markups when we get to building those.

Opening this issue to discuss.
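To make the proposal concrete, here is a minimal sketch of the difference between the current continuity constraint and the relaxed monotonic one, along with a markup-inserting helper that keeps byte ranges consistent. This is not the actual Annotation API: the names `is_contiguous`, `is_monotonic`, and `insert_markup` are hypothetical, and the sketch assumes token ranges are half-open `(begin, end)` byte offsets into the text.

```python
from typing import List, Tuple

# Hypothetical representation: a half-open [begin, end) byte range per token.
ByteRange = Tuple[int, int]


def is_contiguous(ranges: List[ByteRange]) -> bool:
    """Strict constraint: every token starts exactly where the previous one ends."""
    return all(prev[1] == cur[0] for prev, cur in zip(ranges, ranges[1:]))


def is_monotonic(ranges: List[ByteRange]) -> bool:
    """Relaxed constraint: tokens are ordered and non-overlapping, gaps allowed."""
    well_formed = all(begin <= end for begin, end in ranges)
    ordered = all(prev[1] <= cur[0] for prev, cur in zip(ranges, ranges[1:]))
    return well_formed and ordered


def insert_markup(
    text: bytes, ranges: List[ByteRange], position: int, markup: bytes
) -> Tuple[bytes, List[ByteRange]]:
    """Insert markup at a byte offset and shift every token that starts at or after it.

    The markup lands in a gap between tokens, so no token contains tag bytes.
    """
    if not 0 <= position <= len(text):
        raise ValueError("position is outside the text")
    shifted: List[ByteRange] = []
    for begin, end in ranges:
        if begin < position < end:
            raise ValueError("markup would split a token")
        if begin >= position:
            shifted.append((begin + len(markup), end + len(markup)))
        else:
            shifted.append((begin, end))
    return text[:position] + markup + text[position:], shifted


text = b"Hello world"
tokens = [(0, 5), (5, 11)]  # b"Hello", b" world"
marked_text, marked_tokens = insert_markup(text, tokens, 5, b"<b>")
```

After the insertion, the token ranges still slice out the original token bytes, but they are no longer contiguous, only monotonic. That is the property the relaxed constraint would need to permit.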