Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize AsyncTimeout operations. #1488

Open
yyaroshevich opened this issue Jun 4, 2024 · 1 comment
Open

Optimize AsyncTimeout operations. #1488

yyaroshevich opened this issue Jun 4, 2024 · 1 comment

Comments

@yyaroshevich
Copy link

This task is an invitation to discussion initially started here.

As a software engineer I have an access to monitoring of serveral backend services. Those services heavily rely on OkHttp to perform API calls to other services. I've noticed that on those services operations on AsyncTimeout like scheduling can take significant amount of time - it is top 1 CPU intensive function.
выява
According to DataDog AsyncTimeout also sometimes become a bottleneck due to application wide lock held to perform long CPU intensive operation.
выява
выява

This is presumably due to implementation detail of AsyncTimeout which is implemented as sorted linked list. An insertion to such list is a linear scan which takes more and more time when service has many requests in parallel and probably many timeouts scheduled.

I'll try to propose several options to overcome this issue. They are all different in complexity of implementation, but worth discussion.
As on option it is possibly to use ScheduledThreadPoolExecutor to delegate efficient timeout scheduling to JDK. Under the hood ScheduledThreadPoolExecutor implemented as binary heap which will provide O(log(n)) operations on list. The drawback is that each task scheduled to ScheduledThreadPoolExecutor is wrapped into internal object and stored into array - which means additional allocations. I personally don't think it will be a big problem, but it worth noting this drawback.

Alternatively it is possible to adjust current implementation of AsyncTimeout which seems to be intrusive singly linked list to be intrusive liked binary heap which will also provide O(log(n)) operations without any extra allocations. The drawback of this approach is that it is much more complex to implement (those manipulations over binary tree are tricky), but this approach to timeouts is known and used by one of the most widespread native library for IO.

As suggested by @swankjesse it also might worth just to change direction of lookup which might reduce amount of elements of list traversed. Presumably due to fact that most of timeouts will be the same. Here in my opinion it will need to be validated that hypothesis about same timeouts is actually true. There might be something like timeout groups coming from different requests with different timeouts. Also different timeout can come from other timers like timers assumed by H2 protocol.

@yyaroshevich
Copy link
Author

yyaroshevich commented Dec 21, 2024

caught new "record": Image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant