Reduce in memory queue limit by 16x #2455

jackkleeman · 2024-12-23T10:50:26Z

This limit previously applied across all partitions, but since 1.1 it's per partition, and it is 350M per partition. Those entries are not initially used, but as you scale to 1 million invocations per partition, all the memory pages in the queue's ring buffer are dirtied and contribute to RSS. This leads to 9G of usage on a 24-partition node.

This PR reduces the limit by 16x to 21M per partition, or 562M on a 24-partition node, which it will reach after 1.5 million invocations. That is a more manageable figure, even if it still appears as a 'leak' until that amount is reached.
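For a rough sanity check of the scaling, here is a minimal sketch of the arithmetic. The function and variable names are hypothetical and are not Restate configuration keys. It takes the 350M per-partition figure at face value, so the results land near the node-level totals quoted above rather than exactly on them, presumably because the quoted figures are rounded.

```rust
/// Total queue memory (in MB) on a node once every partition's ring buffer
/// has been fully dirtied. Both parameters are illustrative, not real
/// Restate option names.
fn node_total_mb(per_partition_mb: f64, partitions: u32) -> f64 {
    per_partition_mb * f64::from(partitions)
}

fn main() {
    let partitions = 24;
    let old_mb = 350.0; // per-partition figure quoted in this PR
    let new_mb = old_mb / 16.0; // the 16x reduction

    println!(
        "old: {:.0}M per partition, {:.1}G per node",
        old_mb,
        node_total_mb(old_mb, partitions) / 1000.0
    );
    println!(
        "new: {:.1}M per partition, {:.0}M per node",
        new_mb,
        node_total_mb(new_mb, partitions)
    );
}
```

The point of the sketch is only that the worst-case RSS contribution grows linearly with the partition count, which is why a per-partition limit became so much more expensive than the earlier node-wide one.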

AhmedSoliman

Great work investigating and proposing an improvement to this @jackkleeman. Changes look good to me.

jackkleeman requested a review from AhmedSoliman December 23, 2024 10:50

AhmedSoliman approved these changes Dec 23, 2024

jackkleeman merged commit 1ac1f70 into restatedev:main Dec 23, 2024
11 checks passed

jackkleeman deleted the in-memory-queue-limit branch December 23, 2024 14:34
