-
-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve repl output window performance #2011
base: dev
Are you sure you want to change the base?
Improve repl output window performance #2011
Conversation
When an excessive amount of output is produced by the repl (for example as a result of a rogue loop that is printing to stdout) Calva can sometimes hang while trying to write all the output to the output window/file. The only way to resolve is to restart/reload the VSCode window. This commit introduces a new config entry `replOutputThrottleRate` which when set to a non-0 number will throttle output from the repl connection. If more output items are received than the throttle rate in a 500ms window then they will just be dropped. Addresses BetterThanTomorrow#942 Fixes BetterThanTomorrow#2010
Calvas performance can be drastically affected as the size of the repl output window grows. This commit adds a config entry `replOutputMaxLines` which if set to a non-0 number will cause the output window to be truncated if it grows beyond this threshold. Fixes BetterThanTomorrow#804
6f95b38
to
99b6e9d
Compare
"calva.replOutputMaxLines": { | ||
"markdownDescription": "The maximum number of lines to retain in the repl output window. Having the repl output window grow too large can significantly affect performance. Setting this to 0 will disable truncating", | ||
"type": "number", | ||
"default": 1000 | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generally not the number of lines that is the problem, rather the number of tokens on the top level, or in a sexpr. A way to count tokens on the top level is to count words when printing output. That will introduce state that can be tricky to manage, though...
1000 lines is very low. It could be some 50 evaluations of not too big structures. This would be completely non-problematic for Calva to handle. While printing 1000 lines of output with 100 words each might start to be problematic (I actually don't know where the problems start.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generally not the number of lines that is the problem, rather the number of tokens on the top level, or in a sexpr
While this makes sense, I don't think the additional complexity needed to implement this is really worth it. My assumption is that counting lines is a close-enough analogy in the average case for this to probably be sufficient. Especially if we make sure the threshold is high enough for the user to not care about the truncation.
1000 lines is very low. It could be some 50 evaluations of not too big structures
This value was chosen pretty arbitrary, I'd be happy to increase it to a degree. While working on this I started to see noticeable performance degradation after about 10k lines of stack-trace content (I was printing Exceptions to stdout) so I don't think it should be increased too much. Maybe 5k lines?
I also think it's unlikely that a user would really be scrolling that far back in the repl history too often for us to need to worry about truncating after a few thousand lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlikely, maybe, but I know of a user who does this all the time. 😄 I search the file rather than scroll, but anyway. (I keep tons of browser tabs open as well.) To me it makes sense to disable this limit feature (by setting limit to 0, e.g.) and to have that as default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @julienvincent, thanks for taking time to look into this.
I think deleting at an arbitrary line length will cause unbalanced forms and break syntax highlighting and maybe also formatting, unless I'm missing something. If I'm not though, I don't think that's really an option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it makes sense to disable this limit feature (by setting limit to 0, e.g.) and to have that as default.
I'm happy with that too.
I think deleting at an arbitrary line length will cause unbalanced forms and break syntax highlighting ...
It does break highlighting but in a localised way (only the broken form will be colored incorrectly). The rest of the file maintains the correct highlighting. I think this is an probably an acceptable behaviour.
and maybe also formatting
Which formatting do you mean? I haven't noticed anything obviously wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way the structural editor works, when there is broken balance in the document, all bets are off. It will sometimes seem to work (or even work), but in many cases the structural editor is broken and with that a lot of Calva functionality that depends on this.
If we limit on lines, it will have to do it such that the balance is never broken. There are some ways to do this, I can think of two right now:
- Add the missing brackets. There is code for that in Calva, even if it might be a bit tricky to re-use (or, messy code, tbh).
- Always print full results. The limit then need to happen before the buffering, I think. (Not that I know how it is done now, but anyway.)
That said, I think we could also consider popping up a warning when the line limit is reached, instead of enforcing the limit. The warning can have a button for clearing the window.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does break highlighting but in a localised way (only the broken form will be colored incorrectly). The rest of the file maintains the correct highlighting. I think this is an probably an acceptable behaviour.
Yeah, I'd say that, in particular, is acceptable too.
By formatting I meant if someone is typing code in the repl window, the auto-formatting could/would be broken if there's an unbalanced form (which Peter mentioned above - structural editing). And as he said other functionality may be broken too in that case.
Thanks for caring about this issue! I'm not sure about the solution: Generally we try to challenge the need to for adding configuration. There might be some other ways to tackle this that don't introduce configuration, or that introduces configuration that is easier to understand for the user. I think I posted an issue at some point, suggesting that we move the stdout and stderr output to another place, like an output channel, where we don't need to support structural editing (which doesn't make super much sense for print output anyway). That would leave mostly evaluation results in the REPL/output window and there is already configuration that can limit the length of these. Seems reasonable throttling the pace of printing. And we might need/want to do that even if we print stdout/stderr somewhere else. I have no clue about what would be a good default though. Throttling too eagerly will disrupt things too as missing output can be very confusing. Printing together with the evaluation makes sense for many use cases, so we might want to have config for this if we go the print-somewhere-else route. We might want to collect more information about the problem. What if we added counters to the documents about how many tokens they have and how long it takes to scan them and to iterate through them? We could use the token counter for limitation settings instead of line count. And we could store this information in an edn file or something under the |
Co-authored-by: Peter Strömberg <[email protected]>
Personally I really like that all my output lives in one place and I also much prefer the user-experience of it just being a document/editor window over the vscode native output channels. If we wanted to add support for outputting to a vscode channel I would ask for making that configurable so a user can choose to keep it in the repl/output window.
My guess is that while outputting to a vscode channel might be more performant than the output window, we would probably still want to throttle yes.
Setting this to something quite high, like 2k, as a default starting point is probably quite safe. I think it would be quite rare that people actually run into this - it's mostly intended as a safety guard against rogue code, which I think people would struggle to trigger under normal development.
It's worth highlighting that throttled/dropped items are reported in the output window which should help to reduce confusion. Perhaps when reporting we should also show a help message for how to increase/disable the throttling?
Do you mean as a way to drive the decision for setting a default or are you proposing some kind of mechanism for Calva to automatically decide on a value? |
I was thinking that if the document knows how many top level forms it has, it will be as easy to limit the output on this as on line counts. Then we could consider measuring two things on each document:
If we store this info on the document as well, it could also be used for limiting the output window output. Then the mechanism could be that we inform the user when the output window is getting slow, with the instruction to clear the window, rather than having an automatic limit. |
I believe this was suggested / the idea, were that to be implemented.
I doubt throttling is necessary in output channels at all, but of course we lose the benefits of it going to an editor, and this is a bit of a tangent - I just wanted to comment. I wonder if spending much more time and effort on the current output window is worth it. I've been thinking lately that Calva might be better off without the current output window. In its place we could use Portal or (to not add such a dependency) some read-only display provided by VS Code that supports syntax highlighting. @PEZ I know the term "webview" might stir things inside you from past Calva days 😄, but I think a read-only webview would be drastically simpler than when we had another paredit implementation in a webview. In a webview we could use an existing JS solution for syntax highlighting. Maybe we could also use markdown instead of or within a webview? I don't know if that would be a good idea or not. Anyway, I've just been thinking we can remove a lot of complexity and get rid of some of these issues by doing something like the above. |
This PR introduces two new config entries -
replOutputThrottleRate
andreplOutputMaxLines
which are both used to improve the performance of the repl output window/file.replOutputThrottleRate
If set to a non-0 number this will throttle output from the repl connection. If more output items are received than the throttle rate in a 500ms window then they will just be dropped.
replOutputMaxLines
If set to a non-0 number this will cause the output window to be truncated if it grows
beyond this threshold.
This should fix #2010 and #804 and should help address #942 (or maybe fix it?)