Add performance tuning section for RDMA #190

Merged
merged 3 commits into from
Nov 27, 2024
Changes from 1 commit
41 changes: 41 additions & 0 deletions topics/RDMA.md
@@ -50,6 +50,47 @@ Or:

ibv_devices (ibverbs-utils package of Debian/Ubuntu)

### Performance tuning
An RDMA completion queue signals completion events through its completion vector, and each
event is delivered as a hardware interrupt. A large number of hardware interrupts can hurt
CPU performance. The `rdma-comp-vector` option selects the completion vector the server
uses, which makes it possible to tune where these interrupts are handled.

#### Example 1

- Pin hardware interrupt vectors [0, 3] to CPU [0, 3].
- Set CPU affinity for valkey to CPU [4, X].
- Each valkey server uses a random RDMA completion vector.

The valkey servers do not affect each other, and they are isolated from kernel interrupts.

```
  SYS     SYS     SYS     SYS    VALKEY  VALKEY          VALKEY
   |       |       |       |       |       |               |
  CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    ...     CPUX
   |       |       |       |
 INTR0   INTR1   INTR2   INTR3
```
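
The layout above can be sketched as a small planning helper. This is a minimal sketch and not part of valkey: `example1_plan` and its parameters are hypothetical names, and the IRQ numbers of the RDMA completion vectors are assumptions that must be looked up on the actual host (for example in `/proc/interrupts`). The helper only computes the mapping. Applying it means writing each CPU number to `/proc/irq/<irq>/smp_affinity_list` as root and restricting the valkey processes to the remaining CPUs.

```python
def example1_plan(irqs, num_cpus):
    """Map each IRQ to its own CPU and leave the remaining CPUs for valkey.

    irqs: IRQ numbers of the RDMA completion vectors (hypothetical,
          host-specific values).
    num_cpus: total number of CPUs on the host.
    """
    if len(irqs) >= num_cpus:
        raise ValueError("no CPUs left for valkey")
    # Completion vector i (IRQ irqs[i]) is handled by CPU i.
    irq_to_cpu = {irq: cpu for cpu, irq in enumerate(irqs)}
    # Every CPU that handles no interrupt is available to valkey.
    valkey_cpus = list(range(len(irqs), num_cpus))
    return irq_to_cpu, valkey_cpus


# Hypothetical host: 8 CPUs, completion vectors 0..3 use IRQs 120..123.
irq_to_cpu, valkey_cpus = example1_plan([120, 121, 122, 123], num_cpus=8)
for irq, cpu in irq_to_cpu.items():
    # Applying the plan would write `cpu` to this file (requires root).
    print(f"/proc/irq/{irq}/smp_affinity_list <- {cpu}")
print("valkey CPUs:", valkey_cpus)
```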

#### Example 2

- Pin hardware interrupt vectors [0, X] to CPU [0, X].
- Set CPU affinity for valkey to CPU [0, X].
- Valkey server [M] uses RDMA completion vector [M].

Each CPU handles the hardware interrupts, the RDMA completion queue, and the valkey server
for one instance. This avoids the overhead of function calls across multiple CPUs.

```
 VALKEY  VALKEY  VALKEY  VALKEY  VALKEY  VALKEY          VALKEY
   |       |       |       |       |       |               |
  CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    ...     CPUX
   |       |       |       |       |       |               |
 INTR0   INTR1   INTR2   INTR3   INTR4   INTR5           INTRX
```
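
A per-instance launch plan for this layout might look like the following sketch. The helper name `example2_commands`, the port numbering, and the use of `taskset` to pin each process are assumptions for illustration. Only the `rdma-comp-vector` name comes from this document, and passing it as `--rdma-comp-vector` assumes the usual convention of giving config directives as command-line options. The RDMA listener options are omitted, and pinning interrupt vector M to CPU M still has to be done separately.

```python
def example2_commands(num_servers, base_port=6379):
    """Build one command line per valkey server for the Example 2 layout.

    Server M is pinned to CPU M and uses RDMA completion vector M.
    base_port is a hypothetical starting port for illustration.
    """
    commands = []
    for m in range(num_servers):
        commands.append([
            "taskset", "-c", str(m),       # run server M on CPU M only
            "valkey-server",
            "--port", str(base_port + m),  # one port per instance
            "--rdma-comp-vector", str(m),  # completion vector M
        ])
    return commands


for cmd in example2_commands(3):
    print(" ".join(cmd))
```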

Set `rdma-comp-vector` to 0 or a positive number to select a specific RDMA completion vector,
or to -1 to let the server use a random vector for each new connection. The default is -1.
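
The selection rule can be illustrated with a short sketch. It is not the server's implementation: `pick_comp_vector` and `num_comp_vectors` (the number of completion vectors the device exposes) are hypothetical names, and rejecting out-of-range values is an assumption made here for clarity.

```python
import random


def pick_comp_vector(configured, num_comp_vectors, rng=random):
    """Return the completion vector for a new connection.

    configured: value of rdma-comp-vector (-1 means "pick at random").
    num_comp_vectors: number of vectors the device offers (assumed known).
    """
    if configured == -1:
        # Default: spread new connections across all vectors at random.
        return rng.randrange(num_comp_vectors)
    if not 0 <= configured < num_comp_vectors:
        # Assumption: treat an out-of-range vector as a configuration error.
        raise ValueError(f"invalid completion vector {configured}")
    return configured


print(pick_comp_vector(2, num_comp_vectors=8))   # fixed vector
print(pick_comp_vector(-1, num_comp_vectors=8))  # random vector in [0, 8)
```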


## Protocol
