From c8fc5c45d7825668bc3e88d46911c2ac787c8767 Mon Sep 17 00:00:00 2001
From: yaron2 <schneider.yaron@live.com>
Date: Wed, 11 Oct 2023 17:20:39 -0700
Subject: [PATCH] update performance benchmarks

Signed-off-by: yaron2 <schneider.yaron@live.com>
---
 .../perf-actors-activation.md                 | 17 +++----
 .../perf-pubsub.md                            | 50 +++++++++++++++++++
 .../perf-service-invocation.md                | 24 +++------
 .../performance-and-scalability/perf-state.md | 50 +++++++++++++++++++
 4 files changed, 115 insertions(+), 26 deletions(-)
 create mode 100644 daprdocs/content/en/operations/performance-and-scalability/perf-pubsub.md
 create mode 100644 daprdocs/content/en/operations/performance-and-scalability/perf-state.md

diff --git a/daprdocs/content/en/operations/performance-and-scalability/perf-actors-activation.md b/daprdocs/content/en/operations/performance-and-scalability/perf-actors-activation.md
index ddc604142fa..5b93b169cf2 100644
--- a/daprdocs/content/en/operations/performance-and-scalability/perf-actors-activation.md
+++ b/daprdocs/content/en/operations/performance-and-scalability/perf-actors-activation.md
@@ -19,7 +19,7 @@ For applications using actors in Dapr there are two aspects to be considered. Fi
 * Sidecar Injector (control plane)
 * Sentry (optional, control plane)
 
-## Performance summary for Dapr v1.0
+## Performance summary for Dapr v1.12
 
 The actors API in Dapr sidecar will identify which hosts are registered for a given actor type and route the request to the appropriate host for a given actor ID. The host runs an instance of the application and uses the Dapr SDK (.Net, Java, Python or PHP) to handle actors requests via HTTP.
 
@@ -40,17 +40,14 @@ Test parameters:
 * Sidecar limited to 0.5 vCPU
 * mTLS enabled
 * Sidecar telemetry enabled (tracing with a sampling rate of 0.1)
-* Payload of an empty JSON object: `{}`
 
 ### Results
 
-* The actual throughput was ~500 qps.
-* The tp90 latency was ~3ms.
-* The tp99 latency was ~6.2ms.
-* Dapr app consumed ~523m CPU and ~304.7Mb of Memory
-* Dapr sidecar consumed 2m CPU and ~18.2Mb of Memory
+* The requested throughput was 500 qps.
+* The actual throughput was 500 qps.
+* The tp90 latency was ~3.2ms.
+* The tp99 latency was ~7ms.
+* Dapr app consumed ~339m CPU and ~336Mb of Memory
+* Dapr sidecar consumed 93m CPU and ~60Mb of Memory
 * No app restarts
 * No sidecar restarts
-
-## Related links
-* For more information see [overview of Dapr on Kubernetes]({{< ref kubernetes-overview.md >}})
\ No newline at end of file
diff --git a/daprdocs/content/en/operations/performance-and-scalability/perf-pubsub.md b/daprdocs/content/en/operations/performance-and-scalability/perf-pubsub.md
new file mode 100644
index 00000000000..a6e596219d9
--- /dev/null
+++ b/daprdocs/content/en/operations/performance-and-scalability/perf-pubsub.md
@@ -0,0 +1,50 @@
+---
+type: docs
+title: "Pub/sub performance"
+linkTitle: "Pub/sub performance"
+weight: 20000
+description: ""
+---
+This article provides pub/sub API performance benchmarks and resource utilization in Dapr on Kubernetes.
+
+## System overview
+
+Dapr consists of a data plane, the sidecar that runs next to your app, and a control plane that configures the sidecars and provides capabilities such as certificate and identity management.
+
+### Kubernetes components
+
+* Sidecar (data plane)
+* Placement (required for actors, control plane mapping actor types to hosts)
+* Operator (control plane)
+* Sidecar Injector (control plane)
+* Sentry (optional, control plane)
+* Kafka cluster with 3 replicas
+
+## Performance summary for Dapr v1.12
+
+The Pub/Sub API is used to publish messages to a message broker. Dapr accepts requests from the app via HTTP or gRPC, wraps them in a cloud event if needed, and sends the request to the message broker.
+
+Performance varies based on the underlying message broker. The Pub/Sub performance test measures the added latency when publishing a message with Dapr compared with the baseline latency when publishing directly to the message broker.
+
+### Kubernetes performance test setup
+
+The test was conducted on a 3-node Kubernetes cluster, using commodity hardware running 4 cores and 8GB of RAM, without any network acceleration.
+
+Test parameters:
+
+* 1000 requests per second
+* 1 replica
+* 1 minute duration
+* Sidecar limited to 0.5 vCPU
+* Sidecar telemetry enabled (tracing with a sampling rate of 0.1)
+* Payload size of 1 KB
+
+### Results
+
+* The requested throughput was 1000 qps
+* The actual throughput was 1000 qps
+* Added latency for 90th percentile was 0.64ms for gRPC and 0.49ms for HTTP
+* Added latency for 99th percentile was 1.91ms for gRPC and 1.21ms for HTTP
+* Dapr app consumed ~0.2 vCPU and ~30Mb of Memory for both gRPC and HTTP
+* No app restarts
+* No sidecar restarts
diff --git a/daprdocs/content/en/operations/performance-and-scalability/perf-service-invocation.md b/daprdocs/content/en/operations/performance-and-scalability/perf-service-invocation.md
index 6246f346037..1808ea8969c 100644
--- a/daprdocs/content/en/operations/performance-and-scalability/perf-service-invocation.md
+++ b/daprdocs/content/en/operations/performance-and-scalability/perf-service-invocation.md
@@ -29,7 +29,7 @@ For more information see [overview of Dapr in self-hosted mode]({{< ref self-hos
 
 For more information see [overview of Dapr on Kubernetes]({{< ref kubernetes-overview.md >}}).
 
-## Performance summary for Dapr v1.0
+## Performance summary for Dapr v1.12
 
 The service invocation API is a reverse proxy with built-in service discovery to connect to other services. This includes tracing, metrics, mTLS for in-transit encryption of traffic, together with resiliency in the form of retries for network partitions and connection errors.
 
@@ -59,10 +59,10 @@ When running in a highly available production setup, the Dapr control plane cons
 
 | Component  | vCPU | Memory
 | ------------- | ------------- | -------------
-| Operator  | 0.001  | 12.5 Mb
-| Sentry  | 0.005  | 13.6 Mb
-| Sidecar Injector  | 0.002  | 14.6 Mb
-| Placement | 0.001  | 20.9 Mb
+| Operator  | 0.003  | 18 Mb
+| Sentry  | 0.01  | 33 Mb
+| Sidecar Injector  | 0.008  | 17 Mb
+| Placement | 0.005  | 25 Mb
 
 There are a number of variants that affect the CPU and memory consumption for each of the system components. These variants are shown in the table below.
 
@@ -75,18 +75,10 @@ There are a number of variants that affect the CPU and memory consumption for ea
 
 ### Data plane performance
 
-The Dapr sidecar uses 0.48 vCPU and 23Mb per 1000 requests per second.
-End-to-end, the Dapr sidecars (client and server) add ~1.40 ms to the 90th percentile latency, and ~2.10 ms to the 99th percentile latency. End-to-end here is a call from one app to another app receiving a response. This is shown by steps 1-7 in [this diagram]({{< ref service-invocation-overview.md >}}).
-
-This performance is on par or better than commonly used service meshes.
-
-### Latency
-
 In the test setup, requests went through the Dapr sidecar both on the client side (serving requests from the load tester tool) and the server side (the target app).
 mTLS and telemetry (tracing with a sampling rate of 0.1) and metrics were enabled on the Dapr test, and disabled for the baseline test.
 
-<img src="/images/perf_invocation_p90.png" alt="Latency for 90th percentile">
-
-<br>
+The Dapr sidecar uses 0.45 vCPU and 38Mb per 1000 requests per second.
+End-to-end, the Dapr sidecars (client and server) add ~1.20 ms to the 90th percentile latency, and ~2.50 ms to the 99th percentile latency. End-to-end here is a call from one app to another app receiving a response. This is shown by steps 1-7 in [this diagram]({{< ref service-invocation-overview.md >}}).
 
-<img src="/images/perf_invocation_p99.png" alt="Latency for 99th percentile">
+This performance is on par with or better than commonly used service meshes.
diff --git a/daprdocs/content/en/operations/performance-and-scalability/perf-state.md b/daprdocs/content/en/operations/performance-and-scalability/perf-state.md
new file mode 100644
index 00000000000..82d585c26b7
--- /dev/null
+++ b/daprdocs/content/en/operations/performance-and-scalability/perf-state.md
@@ -0,0 +1,50 @@
+---
+type: docs
+title: "State performance"
+linkTitle: "State performance"
+weight: 20000
+description: ""
+---
+This article provides state API performance benchmarks and resource utilization in Dapr on Kubernetes.
+
+## System overview
+
+Dapr consists of a data plane, the sidecar that runs next to your app, and a control plane that configures the sidecars and provides capabilities such as certificate and identity management.
+
+### Kubernetes components
+
+* Sidecar (data plane)
+* Placement (required for actors, control plane mapping actor types to hosts)
+* Operator (control plane)
+* Sidecar Injector (control plane)
+* Sentry (optional, control plane)
+* PostgreSQL database (single node)
+
+## Performance summary for Dapr v1.12
+
+The state API is used to persist state to a database, commonly called a state store in Dapr.
+
+Performance varies based on the underlying state store. The state API performance test measures the added latency when using Dapr to get state compared with the baseline latency when getting state directly from the state store.
+
+### Kubernetes performance test setup
+
+The test was conducted on a 3-node Kubernetes cluster, using commodity hardware running 4 cores and 8GB of RAM, without any network acceleration.
+
+Test parameters:
+
+* 1000 requests per second
+* 1 replica
+* 1 minute duration
+* Sidecar limited to 0.5 vCPU
+* Sidecar telemetry enabled (tracing with a sampling rate of 0.1)
+* Payload size of 1 KB
+
+### Results
+
+* The requested throughput was 1000 qps
+* The actual throughput was 1000 qps
+* Added latency for 90th percentile was 0.75ms for gRPC
+* Added latency for 99th percentile was 1.52ms for gRPC
+* Dapr app consumed ~0.3 vCPU and ~48Mb of Memory for gRPC
+* No app restarts
+* No sidecar restarts