Commit

[DOCS-6680] Add keyword dictionary and priority level (#20991)
* add keyword dictionary and priority level

* move location

* move regex

* small edit

* move priority level

* Apply suggestions from code review

* Update sensitive_data_scanner.md

* apply suggestion on the other section

* Update sensitive_data_scanner.md

* add indent

* Apply suggestions from code review

Co-authored-by: Bryce Eadie <[email protected]>

---------

Co-authored-by: Victoria Teng <[email protected]>
Co-authored-by: Bryce Eadie <[email protected]>
3 people authored and MaelNamNam committed Jan 17, 2024
1 parent 4395b6c commit a45abdd
Showing 1 changed file with 34 additions and 29 deletions.
63 changes: 34 additions & 29 deletions content/en/sensitive_data_scanner.md
@@ -28,49 +28,59 @@ Sensitive data, such as credit card numbers, bank routing numbers, and API keys

Often, businesses are required to identify, remediate, and prevent the exposure of such sensitive data within their logs due to organizational policies, compliance requirements, industry regulations, and privacy concerns. This is especially true within industries such as banking, financial services, healthcare, and insurance.

## Sensitive Data Scanner

Sensitive Data Scanner is a stream-based, pattern matching service that you can use to identify, tag, and optionally redact or hash sensitive data. Security and compliance teams can implement Sensitive Data Scanner as a new line of defense, helping prevent sensitive data leaks and limit non-compliance risks.

Sensitive Data Scanner can be found under [Organization Settings][1].

{{< img src="sensitive_data_scanner/sds_main_28_03_23.png" alt="Sensitive Data Scanner in Organization Settings" style="width:90%;">}}

### Setup

- **Define Scanning Groups:** A scanning group determines what data to scan. It consists of a query filter and a set of toggles to enable scanning for Logs, APM, RUM, and/or Events. See the [Log Search Syntax][2] documentation to learn more about query filters.
- For Terraform, see the [datadog_sensitive_data_scanner_group][3] resource.
- **Define Scanning Rules:** A scanning rule determines what sensitive information to match within the data. Within a scanning group, add predefined scanning rules from Datadog's Scanning Rule Library or create your own rules from scratch to scan using custom regex patterns.
- For Terraform, see the [datadog_sensitive_data_scanner_rule][4] resource.

Sensitive Data Scanner supports Perl Compatible RegEx (PCRE), but the following patterns are not supported:
- Backreferences and capturing sub-expressions (lookarounds)
- Arbitrary zero-width assertions
- Subroutine references and recursive patterns
- Conditional patterns
- Backtracking control verbs
- The \C "single-byte" directive (which breaks UTF-8 sequences)
- The \R newline match
- The \K start of match reset directive
- Callouts and embedded code
- Atomic grouping and possessive quantifiers
## Setup

1. **Define Scanning Groups:** A scanning group determines what data to scan. It consists of a query filter and a set of toggles to enable scanning for Logs, APM, RUM, and/or Events. See the [Log Search Syntax][2] documentation to learn more about query filters.
- For Terraform, see the [datadog_sensitive_data_scanner_group][3] resource.
2. **Define Scanning Rules:** A scanning rule determines what sensitive information to match within the data. Within a scanning group, add predefined scanning rules from Datadog's Scanning Rule Library or create your own rules from scratch to scan using custom regex patterns.
- For Terraform, see the [datadog_sensitive_data_scanner_rule][4] resource.

**Note:**
- Any rules that you add or update only affect data coming into Datadog after the rule was defined.
- Sensitive Data Scanner does not affect any rules you define on the Datadog Agent directly.
- To turn off Sensitive Data Scanner entirely, set the toggle to **off** for each Scanning Group and Scanning Rule so that they are disabled.

### Custom Scanning Rules
### Define Scanning Rules

#### Out-of-the-box Scanning Rules

The Scanning Rule Library contains an ever-growing collection of predefined rules maintained by Datadog for detecting common patterns such as email addresses, credit card numbers, API keys, authorization tokens, and more.
{{< img src="sensitive_data_scanner/sds-library-28-03-23.png" alt="Scanning Rule Library" style="width:90%;">}}

- **Define pattern:** Specify the regex pattern to be used for matching against events. Test with sample data to verify that your regex pattern is valid.
#### Custom Scanning Rules

- Define custom scanning rules to scan for sensitive data specific to your business.
- **Define match conditions:** Specify the regex pattern to be used for matching against events. Test with sample data to verify that your regex pattern is valid.
- Sensitive Data Scanner supports Perl Compatible RegEx (PCRE), but the following patterns are not supported:
- Backreferences and capturing sub-expressions (lookarounds)
- Arbitrary zero-width assertions
- Subroutine references and recursive patterns
- Conditional patterns
- Backtracking control verbs
- The \C "single-byte" directive (which breaks UTF-8 sequences)
- The \R newline match
- The \K start of match reset directive
- Callouts and embedded code
- Atomic grouping and possessive quantifiers
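
Before saving a custom rule, it can help to try the pattern on sample data locally. The following is a minimal sketch, not Datadog's implementation: it uses Python's `re` module, which is not PCRE, so it only approximates the scanner's behavior. The pattern `VISA_PATTERN` and the helper `find_matches` are hypothetical names introduced for illustration. The pattern relies only on character classes and quantifiers, so it avoids every construct in the unsupported list above.

```python
import re

# Hypothetical pattern for a 16-digit Visa-style number with optional
# space or hyphen separators. It uses only character classes and
# quantifiers: no backreferences, lookarounds, or atomic groups.
VISA_PATTERN = re.compile(r"4[0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}")


def find_matches(pattern, samples):
    """Map each sample string to the list of substrings the pattern matched."""
    return {s: [m.group(0) for m in pattern.finditer(s)] for s in samples}


samples = [
    "card=4111 1111 1111 1111 status=ok",
    "card=4111-1111-1111-1111",
    "order id 12345",
]
results = find_matches(VISA_PATTERN, samples)
for sample, found in results.items():
    print(sample, "->", found)
```

A pattern that behaves as expected here should still be verified with the sample-data test in the rule editor, since the two regex engines can differ.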

#### Define rule target and action

- **Create keyword dictionary:** Add keywords to tune detection accuracy when matching regex conditions. For example, if you are scanning for a sixteen-digit Visa credit card number, you can add keywords like `visa`, `credit`, and `card` and require that these keywords be within a specified number of characters of a match. By default, keywords must be within 30 characters before a matched value.
- **Define scope:** Specify whether you want to scan the entire event or just specific attributes. You can also choose to exclude specific attributes from the scan.
- **Add tags:** Specify the tags you want to associate with events where the values match the specified regex pattern. Datadog recommends using `sensitive_data` and `sensitive_data_category` tags. These tags can then be used in searches, dashboards, and monitors.
- **Process matching values:** Optionally, specify whether you want to redact, partially redact, or hash matching values. When redacting, specify placeholder text to replace the matching values with. When partially redacting, specify the position (start/end) and length (# of characters) to redact within matching values. Redaction, partial redaction, and hashing are all irreversible actions.
- **Add tags:** Specify the tags you want to associate with events where the values match the specified regex pattern. Datadog recommends using `sensitive_data` and `sensitive_data_category` tags. These tags can then be used in searches, dashboards, and monitors.
- **Set priority level:** Set the priority level for a rule based on your business needs.
- **Name the rule:** Provide a human-readable name for the rule.
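
The keyword dictionary described above can be pictured as a proximity check on each regex match. The sketch below is an illustration under stated assumptions, not Datadog's algorithm: the function name `match_with_keywords`, the case-insensitive comparison, and the requirement that the whole keyword fall inside the window are all assumptions. Only the default window of 30 characters before the match comes from the text.

```python
import re


def match_with_keywords(text, pattern, keywords, window=30):
    """Return regex matches that have any keyword in the `window` characters before them."""
    hits = []
    for m in re.finditer(pattern, text):
        # Slice out the characters immediately preceding the match.
        prefix = text[max(0, m.start() - window):m.start()].lower()
        # Assumption: a keyword counts only if it lies entirely in the window.
        if any(k.lower() in prefix for k in keywords):
            hits.append(m.group(0))
    return hits


CARD = r"4[0-9]{15}"          # hypothetical sixteen-digit pattern
KEYWORDS = ["visa", "credit", "card"]

near = match_with_keywords("visa card: 4111111111111111", CARD, KEYWORDS)
none = match_with_keywords("tracking number 4111111111111111", CARD, KEYWORDS)
far = match_with_keywords("visa " + "x" * 40 + " 4111111111111111", CARD, KEYWORDS)
print(near, none, far)
```

In this sketch, a sixteen-digit value is reported only when a keyword appears shortly before it, which is how a keyword dictionary reduces false positives on unrelated numbers.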

{{< img src="sensitive_data_scanner/sds_rules_28_03_23.png" alt="A Sensitive Data Scanner custom rule" style="width:90%;">}}
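
The three actions available under **Process matching values** can be sketched as follows. This is a hypothetical illustration rather than the scanner's implementation: the function names, the `*` mask character, and the choice of SHA-256 are assumptions, since the text does not say which hash or mask the service uses.

```python
import hashlib
import re


def redact(text, pattern, placeholder="[REDACTED]"):
    """Replace every match with the placeholder text."""
    return re.sub(pattern, lambda _m: placeholder, text)


def partially_redact(text, pattern, position="start", length=12, mask="*"):
    """Mask `length` characters at the start or end of every match."""
    def _mask(m):
        value = m.group(0)
        n = min(length, len(value))
        if position == "start":
            return mask * n + value[n:]
        return value[:len(value) - n] + mask * n
    return re.sub(pattern, _mask, text)


def hash_matches(text, pattern):
    """Replace every match with a hex digest (SHA-256 is an assumption)."""
    return re.sub(
        pattern,
        lambda m: hashlib.sha256(m.group(0).encode("utf-8")).hexdigest(),
        text,
    )


CARD = r"4[0-9]{15}"  # hypothetical sixteen-digit pattern
event = "card 4111111111111111"

redacted = redact(event, CARD)
masked_start = partially_redact(event, CARD, position="start", length=12)
masked_end = partially_redact(event, CARD, position="end", length=4)
hashed = hash_matches(event, CARD)
print(redacted, masked_start, masked_end, hashed, sep="\n")
```

None of the three outputs allows the original value to be recovered from the event itself, which matches the note that these actions are irreversible.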

### Redact sensitive data in tags
#### Redact sensitive data in tags

To redact sensitive data contained in tags, you must [remap][5] the tag to an attribute and then redact the attribute. Uncheck `Preserve source attribute` in the remapper processor so that the tag is not preserved during the remapping.

@@ -97,11 +107,6 @@ To redact the attribute:
7. Optionally, add tags.
8. Click **Add Rules**.

### Out-of-the-box Scanning Rules

The Scanning Rule Library contains an ever-growing collection of predefined rules maintained by Datadog for detecting common patterns such as email addresses, credit card numbers, API keys, authorization tokens, and more.
{{< img src="sensitive_data_scanner/sds-library-28-03-23.png" alt="Scanning Rule Library" style="width:90%;">}}

### Permissions

By default, users with the Datadog Admin role have access to view and define the scanning rules. To give other users access, grant read or write permissions for Data Scanner under **Compliance**. See the [Custom RBAC documentation][7] for details on Roles and Permissions.