How does characterboundsupdate interact with multi-codeunit characters? #96
Comments
Hey, this seemed like a reasonable thing to ask clarification on, but the response has been absolute silence for 5 months. Is anyone steering this ship?
Sorry for the delay here, this is a good question.
Thanks for the response. There may even be a case to be made for making the granularity here grapheme clusters, though those are still awkward to determine in JS. An interface that provides the client code with the ranges of the specific grapheme(s) it is querying seems preferable, but I'm guessing you wouldn't want to break backwards compatibility at this point.
The minutes from today's call:
In summary, it's still undecided which way we should go here, and we're going to ask for more developer feedback on which way is preferable.
I haven't been keeping up on all the details of edit context, but I am an editor developer. While having a range implementation based on grapheme clusters sounds great, that would make it different from every other DOM range, which seems like a recipe for confusion and bugs. The little work I've done with clusters is mostly in UI, not editing, but we were recently able to switch that to Intl.Segmenter, so I can say my concept of a written "character" has evolved to mean a grapheme cluster. Looking at the method documentation for This would imply that the browser-provided range request has offsets between clusters, which I think is a reasonable assumption to make. As more developers become familiar with grapheme clusters I would hope that they, like me, will start to read any mention of "character bounds" as implying "grapheme bounds".
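For anyone following along, here is a rough sketch of what "offsets between clusters" means in terms of Intl.Segmenter. The helper names `graphemeRanges` and `isOnClusterBoundary` are made up for illustration and are not part of any spec; the offsets are UTF-16 code unit offsets, the same unit JS string indices use.

```javascript
// Hypothetical helper: list grapheme clusters with their UTF-16 code unit ranges.
function graphemeRanges(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  const ranges = [];
  for (const { segment, index } of segmenter.segment(text)) {
    // `index` is the code unit offset where this cluster starts.
    ranges.push({ start: index, end: index + segment.length, segment });
  }
  return ranges;
}

// Hypothetical helper: true if `offset` sits between two clusters
// (or at either end of the string), false if it splits a cluster.
function isOnClusterBoundary(text, offset) {
  if (offset === 0 || offset === text.length) return true;
  return graphemeRanges(text).some((r) => r.start === offset);
}
```

With a string like "a", an astral emoji, and "e" followed by a combining accent, the emoji and the accented letter each span two code units, so offsets inside them are not cluster boundaries.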
From TPAC 2024 minutes:
The spec doesn't seem to explicitly say that the number of rectangles passed to `updateCharacterBounds` should equal `event.rangeEnd - event.rangeStart`, but the example implementation does it that way, and it kind of seems implied by the fact that the browser needs to be able to find the appropriate rectangle for a given character by offset, and the rectangles don't get explicitly associated with a specific position, except for their array position.

Since a given 'character' can take up multiple string positions, how should astral characters be handled here? Repeat their rectangle multiple times in the array? If so, that seems non-obvious enough to mention explicitly. (But it also seems like a somewhat awkward solution, and defining this in such a way that the number of rectangles should match the number of actual Unicode characters (code points), not code units, between the given offsets, would also be reasonable, assuming the API can guarantee that the queried offsets never fall in the middle of a surrogate pair.)
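To make the "repeat the rectangle" reading concrete, here is a minimal sketch of it. This is only one possible interpretation of the question above, not something the spec confirms. `boundsPerCodeUnit` and the `measure` callback are hypothetical names; `measure(start, end)` is assumed to return the rectangle for the text between two code unit offsets.

```javascript
// Hypothetical helper: build one rectangle per UTF-16 code unit in
// [rangeStart, rangeEnd). A code point that occupies two code units
// (a surrogate pair) gets the same rectangle at both array positions.
function boundsPerCodeUnit(text, rangeStart, rangeEnd, measure) {
  const rects = [];
  const stop = Math.min(rangeEnd, text.length);
  let i = rangeStart;
  while (i < stop) {
    // codePointAt reads a full surrogate pair when `i` is at its first unit.
    const width = text.codePointAt(i) > 0xffff ? 2 : 1;
    const rect = measure(i, i + width);
    // Clamp so a range that ends inside a pair still yields one entry per unit.
    const end = Math.min(i + width, stop);
    for (let j = i; j < end; j++) rects.push(rect);
    i = end;
  }
  return rects;
}

// Possible wiring, assuming an existing `editContext` and a hypothetical
// `measureDomRect(start, end)` that returns a DOMRect:
//
// editContext.addEventListener("characterboundsupdate", (e) => {
//   const rects = boundsPerCodeUnit(
//     editContext.text, e.rangeStart, e.rangeEnd, measureDomRect);
//   editContext.updateCharacterBounds(e.rangeStart, rects);
// });
```

Under this reading the array length always equals the number of code units in the range, so the browser can index it by `offset - rangeStart` without knowing anything about surrogate pairs.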