New Time Slice SLO #20888
…documentation into esther/docs-6808-time-slice-slo
Resolved (outdated) review threads:
content/en/service_management/service_level_objectives/time_slice.md
content/en/service_management/service_level_objectives/guide/slo_types_comparison.md
content/en/service_management/service_level_objectives/_index.md
@roxanne-moslehi added the examples you provided, thank you!
## Overview

When creating SLOs, you can choose from the following types:
- **Metric-based SLOs**: can be used for count-based data streams, the SLI is based on the sum of good events divided by the sum of total events.
Here we say "data streams" but below we say "data sets". I think I would personally just say "data". It's true that the data has to be some kind of count. It is also fine to say "the SLI is calculated as the sum of the good events divided by the sum of total events" rather than saying "based on". It literally is that calculation.
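A minimal sketch of that calculation, for illustration only. The function name and the per-interval counts are hypothetical and not part of any Datadog API; returning `None` when there are no events at all is an assumption, not documented behavior.

```python
def metric_based_sli(good_events, total_events):
    """Return the SLI as a percentage: sum(good) / sum(total) * 100.

    Both arguments are sequences of per-interval event counts
    (hypothetical inputs). Returns None when the total is zero,
    because the ratio is undefined in that case (an assumption).
    """
    total = sum(total_events)
    if total == 0:
        return None
    return 100.0 * sum(good_events) / total


# Hypothetical per-hour counts, for illustration only.
good = [980, 995, 1000]
total = [1000, 1000, 1000]
sli = metric_based_sli(good, total)  # 2975 good out of 3000 total
```

The point of the sketch is that the SLI is the ratio itself, which is why "calculated as" reads more accurately than "based on".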
- **Monitor-based SLOs**: can be be used for time-based data sets, the SLI is based on the amount of time your system exhibits good behavior divided by the total time. Monitor-based SLOs must be based on a new or existing Datadog monitor, any adjustments must be made to the underlying monitor (cannot be done through SLO creation).
It's not really true that the data set needs to be "time-based". That describes the calculation but not the data. They can use any data at all. It would be more correct to say that the SLI is based on the monitor's uptime (times when the monitor is not in the "Alert" state).
- **Time Slice SLOs**: can be be used for time-based data sets, the SLI is based on the amount of time your system exhibits good behavior divided by the total time. Time Slice SLOs do not require a Datadog monitor, you can try out different metric filters and thresholds and instantly explore downtime during SLO creation.
Same comment here. "time-based" describes the calculation, but not the data. The way we describe this is that it can be used with any kind of data, and "good behavior" is defined using the condition specified by the user.
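To illustrate the distinction, here is a rough sketch in which the data is arbitrary and "good behavior" is whatever condition the user supplies. The function name, the latency values, and the 0.5 threshold are all hypothetical; this is not how Datadog implements it, only the shape of the good-time-over-total-time calculation.

```python
def time_slice_sli(values, is_good):
    """Percentage of equal-length time slices whose value satisfies `is_good`.

    `values` holds one measurement per slice (any kind of data);
    `is_good` is the user-specified condition defining good behavior.
    Returns None for an empty series (an assumption).
    """
    if not values:
        return None
    good_slices = sum(1 for v in values if is_good(v))
    return 100.0 * good_slices / len(values)


# Hypothetical p95 latency in seconds, one value per slice.
latency = [0.21, 0.35, 0.62, 0.18, 0.44]
sli = time_slice_sli(latency, lambda v: v < 0.5)  # 4 of 5 slices are good
```

Nothing about `latency` is time-based in itself. Only the ratio of good slices to total slices is.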
| **Handling missing data in the SLO calculation** | Missing data is ignored in SLO status and error budget calculations | Missing data is handled based on the [underlying Monitor's configuration][6] | Missing data is treated as uptime in SLO status and error budget calculations |
| **Uptime Calculations** | N/A | Uptime calculations are based on the underlying Monitor <br><br>If groups are present, overall uptime requires *all* groups to have uptime | [Uptime][7] is calculated by looking at discrete time chunks, not rolling time windows<br><br>If groups are present, overall uptime requires *all* groups to have uptime |
| **Calendar View on SLO Manage Page** | Available | Not available | Available |
| **Public [APIs][8] and Terraform Support** | Available | Available | Not available |
For time-slices can we say "Not yet available" or "Coming soon"?
Same as above
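As a side note on the missing-data and uptime rows quoted in this thread, the last column (read here as the Time Slice column, which is an inference from the column order) can be sketched roughly as follows. All names and values are hypothetical; `None` stands in for missing data, and the sketch only mirrors what the table states: discrete chunks, missing data counted as uptime, and all groups required to be up.

```python
def slice_is_uptime(group_values, is_good):
    """A discrete time chunk counts as uptime only if every group is good.

    `group_values` maps a group name to its value for the chunk, or to
    None when data is missing. Missing data is treated as uptime.
    """
    return all(v is None or is_good(v) for v in group_values.values())


def overall_uptime(slices, is_good):
    """Percentage of chunks in which all groups had uptime."""
    if not slices:
        return None  # assumption: undefined when there are no chunks
    up = sum(1 for s in slices if slice_is_uptime(s, is_good))
    return 100.0 * up / len(slices)


# Hypothetical latency per group, one dict per discrete chunk.
slices = [
    {"us": 0.2, "eu": 0.3},    # both groups good: uptime
    {"us": 0.2, "eu": 0.9},    # eu violates the condition: downtime
    {"us": None, "eu": 0.3},   # us missing, treated as uptime
    {"us": 0.7, "eu": None},   # us violates the condition: downtime
]
uptime = overall_uptime(slices, lambda v: v < 0.5)  # 2 of 4 chunks are up
```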
| **SLO alerting ([Error Budget][1] or [Burn Rate][2] Alerts)** | Available | Available for SLOs based on Metric Monitor types only (not available for Synthetic Monitors or Service Checks) | Not available |
| [**SLO Status Corrections**][3] | Correction periods are ignored from SLO status calculation | Correction periods are ignored from SLO status calculation | Correction periods are counted as uptime in SLO status calculation |
| **[SLO Widgets][4] (up to 90 days of historical data)** | Available | Available | Available |
| [**SLO Data Source**][5] | Available (with up to 15 months of historical data) | Not available | Not available |
For time slices, can we say "Coming soon"?
We try to stay away from future promises in docs. We can update the docs as soon as features are available!
* Add time slice to left nav
* Add time slice instructions and images
* Add uptime calculations page
* Add uptime calculations to left nav
* Standardize use of Time Slice SLO
* Remove duplicate file
* Merge uptime with time slice
* Add SLO comparison chart
* Apply code review suggestions
* Update content/en/service_management/service_level_objectives/_index.md
* Apply suggestions from code review
  Co-authored-by: jhgilbert <[email protected]>
* Apply suggestions from code review, removed commented examples
* Add time slice to left nav
* Add time slice instructions and images
* Add uptime calculations page
* Add uptime calculations to left nav
* Standardize use of Time Slice SLO
* Remove duplicate file
* Merge uptime with time slice
* Add SLO comparison chart
* Apply code review suggestions
* Update content/en/service_management/service_level_objectives/_index.md
* Apply suggestions from code review
  Co-authored-by: jhgilbert <[email protected]>
* Apply suggestions from code review, removed commented examples
* Add examples with images
* minor changes
* API info comparison chart
* update comparison chart
* update comparison chart again
* fix status correction info
* update SLO definitions
* calendar view info
---------
Co-authored-by: jhgilbert <[email protected]>
Co-authored-by: Roxanne Moslehi <[email protected]>
What does this PR do? What is the motivation?
https://datadoghq.atlassian.net/browse/DOCS-6808
Merge instructions
Do not merge, pending PM approval