Bug 1683042 - Document Perfherder's data retention policy
1 parent bd74f86, commit df7fe18
Showing 3 changed files with 56 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# Data retention policies | ||
|
||
## On Perfherder | ||
|
||
On a daily basis, Perfherder expires data for several reasons: | ||
|
||
* data provides less value as it grows older | ||
* data accumulates very fast (>1 million new data points are ingested daily) | ||
* query latency degrades over time | ||
* the database is rather limited (in terms of storage capacity & scalability) | ||
|
||
To ensure persistence of the most relevant performance data, Perfherder's cycling algorithm takes a more aggressive approach towards less relevant data. It employs multiple expiration strategies, each one specialized in deleting a specific set of data. | ||
|
||
In other words, not all data is deleted in the same way: some data sets are kept longer than others. | ||
|
||
Data targeted for removal includes: | ||
|
||
* data points | ||
* series (AKA performance signatures; they collect data points sharing the same characteristics) | ||
* alerts | ||
* alert summaries | ||
|
||
The daily cycling starts by removing data points, applying all of its defined strategies. It then removes series, alerts & alert summaries using a garbage collection approach. | ||
|
||
### Cycling strategies | ||
|
||
All of the following strategies target the `performance_datum` table, which stores the performance data points. | ||
|
||
#### Generic | ||
|
||
Removes data points older than 1 year. | ||
|
||
#### Try data | ||
|
||
Removes data points originating from try pushes that are older than 6 weeks. | ||
|
||
#### Not actively sheriffed | ||
|
||
Removes data points older than 6 months from repositories other than autoland, mozilla-central, mozilla-beta, fenix & reference-browser. | ||
|
||
#### Stalled data | ||
|
||
Removes data points from series that haven't ingested any new ones in the last 4 months. | ||
|
||
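The four strategies above can be sketched as a single predicate over a data point. This is only an illustrative sketch, not Perfherder's actual implementation: the `DataPoint` class, its field names, the `expired_by` helper, the day-based approximations of "1 year", "6 months" and "4 months", and the assumption that try pushes live in a repository named `try` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed day-based approximations of the documented retention periods.
GENERIC_MAX_AGE = timedelta(days=365)      # "1 year"
TRY_MAX_AGE = timedelta(weeks=6)           # "6 weeks"
UNSHERIFFED_MAX_AGE = timedelta(days=180)  # "6 months"
STALLED_AFTER = timedelta(days=120)        # "4 months"

# Repositories the document lists as exempt from the "not actively sheriffed" strategy.
SHERIFFED_REPOS = {
    "autoland", "mozilla-central", "mozilla-beta", "fenix", "reference-browser",
}


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical stand-in for one row of the `performance_datum` table."""
    signature_id: int
    repository: str
    push_timestamp: datetime


def expired_by(point, now, last_ingested):
    """Return the names of the strategies that would remove `point`.

    `last_ingested` is the timestamp of the newest data point in the
    same series (performance signature) as `point`.
    """
    age = now - point.push_timestamp
    reasons = []
    if age > GENERIC_MAX_AGE:
        reasons.append("generic")
    if point.repository == "try" and age > TRY_MAX_AGE:
        reasons.append("try data")
    if point.repository not in SHERIFFED_REPOS and age > UNSHERIFFED_MAX_AGE:
        reasons.append("not actively sheriffed")
    if now - last_ingested > STALLED_AFTER:
        reasons.append("stalled data")
    return reasons
```

A data point is removed as soon as any one strategy matches, so an empty list means the point survives the daily cycle. Note that the stalled-data check depends on the series as a whole rather than on the age of the individual point.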
### Garbage collection | ||
|
||
Removes performance signatures that no longer have any data points linked to them. This cascades to the linked alerts, as they don't make sense without a parent series. | ||
|
||
Removes alert summaries that no longer have any alerts linked to them. | ||
|
||
These kinds of data pertain to the `performance_signature`, `performance_alert` & `performance_alert_summary` tables, respectively. |
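
The garbage collection order described above can be illustrated with plain in-memory collections. This is a rough sketch under stated assumptions, not how Perfherder queries its database: the `garbage_collect` function and the shapes of its arguments are hypothetical and only model the links between the tables.

```python
def garbage_collect(datum_signature_ids, signatures, alerts, summaries):
    """Return the signatures, alerts and alert summaries that survive.

    datum_signature_ids: set of signature ids still referenced by data points
    signatures:          set of all signature ids
    alerts:              dict of alert id -> (signature id, summary id)
    summaries:           set of all alert summary ids
    """
    # Step 1: drop signatures that no longer have any data points.
    live_signatures = signatures & datum_signature_ids

    # Step 2: the removal cascades to alerts whose parent signature is gone.
    live_alerts = {
        alert_id: (signature_id, summary_id)
        for alert_id, (signature_id, summary_id) in alerts.items()
        if signature_id in live_signatures
    }

    # Step 3: drop alert summaries that no longer have any alerts.
    referenced_summaries = {summary_id for _, summary_id in live_alerts.values()}
    live_summaries = summaries & referenced_summaries

    return live_signatures, live_alerts, live_summaries
```

The order matters: summaries can only be judged empty after the alert cascade has run, which is why the sketch processes signatures, then alerts, then summaries.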