Optimize for topology term #1310

huww98 · 2024-12-11T19:14:46Z

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Greatly reduce the CPU and memory usage, especially for drivers assign different topology to each host. Benchmark shows 30x CPU reduction and 21x memory allocation size reduction.

We were performing a stress test recently, and found the bottleneck to be the CPU usage of external-provisioner. From profiling, topologyTerm.hash takes almost all the CPU time (apart from logGRPC).

Our setup is a 3k node cluster, each node has a distinct topology. It took about 50s for provisioning 3k immediate binding volumes, despite that we have raised the CPU limit to 10. external-provisioner also took ~3GiB memory, which is far higher than expected.

This is my initial attempt (not included in this PR) to optimize this. I re-implemented the compare function for sort to avoid hash() and added some benchmarks. The benchmark result is:

goos: darwin
goarch: arm64
pkg: github.com/kubernetes-csi/external-provisioner/v5/pkg/controller
cpu: Apple M2 Pro
BenchmarkDedupAndSortZone/baseline-12               1641            643224 ns/op          697858 B/op      15032 allocs/op
BenchmarkDedupAndSortZone/compare-12                1812            642864 ns/op          697004 B/op      15012 allocs/op
BenchmarkDedupAndSortZone/slices.Sort-12            1795            644037 ns/op          696948 B/op      15010 allocs/op
BenchmarkDedupAndSortZone/slices.Compact-12          906           1309465 ns/op          540267 B/op       8060 allocs/op
BenchmarkDedupAndSortHost/baseline-12                 50          22663181 ns/op        27017105 B/op     530151 allocs/op
BenchmarkDedupAndSortHost/compare-12                 156           7625107 ns/op         5039063 B/op      60242 allocs/op
BenchmarkDedupAndSortHost/slices.Sort-12             156           7646612 ns/op         5040749 B/op      60260 allocs/op
BenchmarkDedupAndSortHost/slices.Compact-12          193           6147562 ns/op         3457850 B/op      37009 allocs/op

By comparing compare and baseline for BenchmarkDedupAndSortHost, re-implementing compare already gives 3x speedup and 5x memory allocation size reduction. But when I try to replace the hash() func completely by slices.Compact, the host case is faster, but the zone case becomes slower.

While I'm trying to find more opportunity to optimize, I do another profile for Host/slices.Sort case:

Making slice, accessing map, and sorting keys can all be avoid if we define topologyTerm as a sorted slice instead of map (the first commit of this PR). The benchmark result from the first commit is:

BenchmarkDedupAndSortZone/slices.Sort-12                    1975            527407 ns/op          747079 B/op      15014 allocs/op
BenchmarkDedupAndSortZone/slices.Compact-12                10000            114418 ns/op           74904 B/op         11 allocs/op
BenchmarkDedupAndSortHost/slices.Sort-12                     652           1847723 ns/op         2948386 B/op      30078 allocs/op
BenchmarkDedupAndSortHost/slices.Compact-12                 1597            739441 ns/op         1250314 B/op       9002 allocs/op

Host/slices.Sort case is speedup several times again, the Zone/slices.Sort case is also improved because of the more efficient hash(). With the very cheap compare(), using slices.Compact to deduplicate becomes beneficial for both case!
So, with the second commit, the hash() is removed completely.

Note that for fair comparation, the benchmark also includes toCSITopology invocation, because if topologyTerm is a slice, we need to convert it back to map for GRPC. Actually, after this refactor, most of the time is spent on toCSITopology(). If I remove it, the result is:

BenchmarkDedupAndSortZone-12               10725            111114 ns/op           73753 B/op          1 allocs/op
BenchmarkDedupAndSortHost-12                4029            287086 ns/op           73830 B/op          3 allocs/op

This is ~78x speedup compared to the baseline, with zero allocation in deduplication and sort step!

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

k8s-ci-robot · 2024-12-11T19:14:56Z

Hi @huww98. Thanks for your PR.

I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

pkg/controller/topology.go

jsafrane · 2024-12-12T14:35:09Z

pkg/controller/topology.go

@@ -39,8 +39,14 @@ import (
 	"k8s.io/klog/v2"
 )

+type topologyKey struct {


I find it strange to call something a Key if it has a Key field inside. topologyKeyValue? topologyItem?

Renamed to topologySegment, according to CSI spec.

jsafrane · 2024-12-12T14:57:12Z

/ok-to-test

Compared with map, slice is a lot faster. And we can avoid allocation in the compare() func, which is a very hot path.

This allow us to get rid of the hash() func, which is slow and do too many allocations.

jsafrane · 2024-12-13T10:01:00Z

/lgtm
/approve
I am really glad we have a very good unit test coverage around.

k8s-ci-robot · 2024-12-13T10:01:07Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: huww98, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jsafrane]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels Dec 11, 2024

k8s-ci-robot requested review from humblec and sunnylovestiramisu December 11, 2024 19:14

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 11, 2024

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 11, 2024

huww98 mentioned this pull request Dec 12, 2024

Pass in hostname to create volume request #1285

Open

jsafrane reviewed Dec 12, 2024

View reviewed changes

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 12, 2024

huww98 added 2 commits December 13, 2024 01:35

make topologyTerm a slice

5e854d9

Compared with map, slice is a lot faster. And we can avoid allocation in the compare() func, which is a very hot path.

use slices.Compact to deduplicate topologyTerm

bd3b4a1

This allow us to get rid of the hash() func, which is slow and do too many allocations.

huww98 force-pushed the optimize-topologyTerm branch from 266c667 to bd3b4a1 Compare December 12, 2024 17:35

k8s-ci-robot assigned jsafrane Dec 13, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 13, 2024

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 13, 2024

k8s-ci-robot merged commit c79d156 into kubernetes-csi:master Dec 13, 2024
7 checks passed

huww98 mentioned this pull request Dec 14, 2024

cleanups and refactor of protosanitizer kubernetes-csi/csi-lib-utils#184

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize for topology term #1310

Optimize for topology term #1310

huww98 commented Dec 11, 2024 •

edited

Loading

k8s-ci-robot commented Dec 11, 2024

jsafrane Dec 12, 2024

huww98 Dec 12, 2024

jsafrane commented Dec 12, 2024

jsafrane commented Dec 13, 2024

k8s-ci-robot commented Dec 13, 2024

Optimize for topology term #1310

Optimize for topology term #1310

Conversation

huww98 commented Dec 11, 2024 • edited Loading

k8s-ci-robot commented Dec 11, 2024

jsafrane Dec 12, 2024

Choose a reason for hiding this comment

huww98 Dec 12, 2024

Choose a reason for hiding this comment

jsafrane commented Dec 12, 2024

jsafrane commented Dec 13, 2024

k8s-ci-robot commented Dec 13, 2024

huww98 commented Dec 11, 2024 •

edited

Loading