Set up Static Code Analysis (SAST)

Overview

To set up Datadog SAST in-app, navigate to Security > Code Security.

Select where to run Static Code Analysis scans

Scan with Datadog-hosted scanning

You can run Datadog Static Code Analysis (SAST) scans directly on Datadog infrastructure. Supported repository types include:

GitHub (excluding repositories that use Git Large File Storage)
GitLab.com and GitLab Self-Managed
Azure DevOps

To get started, navigate to the Code Security page.

Scan in CI pipelines

Datadog Static Code Analysis runs in your CI pipelines using the datadog-ci CLI.

First, configure your Datadog API and application keys. Add DD_APP_KEY and DD_API_KEY as secrets. Make sure your Datadog application key has the code_analysis_read scope.

Next, run Static Code Analysis by following the instructions for your CI provider:

GitHub Actions

Generic CI Providers

Select your source code management provider

Datadog Static Code Analysis supports all source code management providers, with native support for GitHub, GitLab, and Azure DevOps.

Configure a GitHub App with the GitHub integration tile and set up the source code integration to enable inline code snippets and pull request comments.

When installing a GitHub App, the following permissions are required to enable certain features:

Content: Read, which allows you to see code snippets displayed in Datadog
Pull Request: Read & Write, which allows Datadog to add feedback for violations directly in your pull requests using pull request comments, as well as open pull requests to fix vulnerabilities
Checks: Read & Write, which allows you to create checks on SAST violations to block pull requests

See the GitLab source code setup instructions to connect GitLab repositories to Datadog. Both GitLab.com and Self-Managed instances are supported.

Note: Your Azure DevOps integrations must be connected to a Microsoft Entra tenant. Azure DevOps Server is not supported.

See the Azure source code setup instructions to connect Azure DevOps repositories to Datadog.

If you are using another source code management provider, configure Static Code Analysis to run in your CI pipelines using the datadog-ci CLI tool and upload the results to Datadog. You must run an analysis of your repository on the default branch before results can begin appearing on the Code Security page.

Customize your configuration

By default, Datadog Static Code Analysis (SAST) scans your repositories with Datadog's default rulesets for each programming language. You can customize which rulesets or rules run, along with other parameters, in Datadog or in a code-security.datadog.yaml file. For the full configuration reference, see Static Code Analysis (SAST) Configuration.

Link findings to Datadog services and teams

Datadog associates code and library scan results with Datadog services and teams to automatically route findings to the appropriate owners. This enables service-level visibility, ownership-based workflows, and faster remediation.

To determine the service where a vulnerability belongs, Datadog evaluates several mapping mechanisms in the order listed in this section.

Each vulnerability is mapped with one method only: if a mapping mechanism succeeds for a particular finding, Datadog does not attempt the remaining mechanisms for that finding.

Using service definitions that include code locations in the Software Catalog is the only way to explicitly control how static findings are mapped to services. The additional mechanisms described below, such as Error Tracking usage patterns and naming-based inference, are not user-configurable and depend on existing data from other Datadog products. Consequently, these mechanisms might not provide consistent mappings for organizations not using these products.

Mapping using the Software Catalog (recommended)

Services in the Software Catalog identify their codebase content using the codeLocations field. This field is available in the Software Catalog schema version v3 and allows a service to specify:

a repository URL

apiVersion: v3
kind: service
metadata:
  name: billing-service
  owner: billing-team
datadog:
  codeLocations:
    - repositoryURL: https://github.com/org/myrepo.git

one or more code paths inside that repository

apiVersion: v3
kind: service
metadata:
  name: billing-service
  owner: billing-team
datadog:
  codeLocations:
    - repositoryURL: https://github.com/org/myrepo.git
      paths:
        - path/to/service/code/**

If you want all the files in a repository to be associated with a service, you can use the glob ** as follows:

apiVersion: v3
kind: service
metadata:
  name: billing-service
  owner: billing-team
datadog:
  codeLocations:
    - repositoryURL: https://github.com/org/myrepo.git
      paths:
        - path/to/service/code/**
    - repositoryURL: https://github.com/org/billing-service.git
      paths:
        - "**"

The schema for this field is described in the Software Catalog entity model.

Datadog goes through all Software Catalog definitions and checks whether the finding’s file path matches. For a finding to be mapped to a service through codeLocations, it must contain a file path.

Some findings might not contain a file path. In those cases, Datadog cannot evaluate codeLocations for that finding, and this mechanism is skipped.

Services defined with a Software Catalog schema v2.x do not support codeLocations. Existing definitions can be upgraded to the v3 schema in the Software Catalog. After migration is completed, changes might take up to 24 hours to apply to findings. If you are unable to upgrade to v3, Datadog falls back to alternative linking techniques (described below). These rely on less precise heuristics, so accuracy might vary depending on the Code Security product and your use of other Datadog features.

Example (v3 schema)

apiVersion: v3
kind: service
metadata:
  name: billing-service
  owner: billing-team
datadog:
  codeLocations:
    - repositoryURL: https://github.com/org/myrepo.git
      paths:
        - path/to/service/code/**
    - repositoryURL: https://github.com/org/billing-service.git
      paths:
        - "**"

SAST finding

If a vulnerability appeared in github.com/org/myrepo at /path/to/service/code/models/payment.py, then, using the codeLocations for billing-service, Datadog would add billing-service as an owning service. If your service defines an owner (see above), then Datadog links that team to the finding too. In this case, the finding would be linked to the billing-team.

SCA finding

If a library was declared in github.com/org/myrepo at /go.mod, then Datadog would not match it to billing-service.

Instead, if it was declared in github.com/org/billing-service at /go.mod, then Datadog would match it to billing-service due to the “**” catch-all glob. Consequently, Datadog would link the finding to the billing-team.

Datadog attempts to map a single finding to as many services as possible. If no matches are found, Datadog continues onto the next linking method.
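
As a rough illustration of the matching described above, the following Python sketch checks a finding's repository and file path against the example codeLocations entries. The SERVICES dictionary, the helper names, and the use of fnmatch-style globbing (where * also crosses directory separators) are assumptions made for this example. Datadog's actual matching logic and glob semantics are not specified on this page.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory copy of the codeLocations from the example definition.
SERVICES = {
    "billing-service": [
        {"repositoryURL": "https://github.com/org/myrepo.git",
         "paths": ["path/to/service/code/**"]},
        {"repositoryURL": "https://github.com/org/billing-service.git",
         "paths": ["**"]},
    ],
}


def normalize_repo(url):
    """Reduce a repository URL to host/org/name so both spellings compare equal."""
    url = url.split("://", 1)[-1].rstrip("/")
    if url.endswith(".git"):
        url = url[:-4]
    return url


def matching_services(repo, file_path):
    """Return every service whose codeLocations cover the finding's file."""
    if not file_path:
        return []  # findings without a file path skip this mechanism
    relative = file_path.lstrip("/")
    matches = []
    for service, locations in SERVICES.items():
        for location in locations:
            if normalize_repo(location["repositoryURL"]) != normalize_repo(repo):
                continue
            # Assumption: an entry without paths covers the whole repository.
            patterns = location.get("paths") or ["**"]
            if any(fnmatchcase(relative, pattern) for pattern in patterns):
                matches.append(service)
                break
    return matches
```

With these assumptions, a file under path/to/service/code/ in github.com/org/myrepo maps to billing-service, /go.mod in the same repository maps to nothing, and /go.mod in github.com/org/billing-service maps to billing-service through the "**" catch-all.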

When the Software Catalog cannot determine the service

If the Software Catalog does not provide a match, either because the finding’s file path does not match any codeLocations, or because the service uses the v2.x schema, Datadog evaluates whether Error Tracking can identify the service associated with the code. Datadog uses only the last 30 days of Error Tracking data due to product data-retention limits.

When Error Tracking processes stack traces, the traces often include file paths. For example, if an error occurs in /foo/bar/baz.py, Datadog inspects the directory /foo/bar and checks whether the finding’s file path resides under that directory.

If the finding file is under the same directory:

Datadog treats this as a strong indication that the vulnerability belongs to the same service.
The finding inherits the service and team associated with that error in Error Tracking.

If this mapping succeeds, Datadog stops here.
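
The directory check can be pictured with a short Python sketch. The record shape (file, service, team) and the function name are hypothetical, and the 30-day window is not modeled. This only illustrates the "same directory" test described above, not Datadog's implementation.

```python
from pathlib import PurePosixPath


def find_error_owner(finding_path, error_records):
    """Return (service, team) from the first error whose directory contains the finding."""
    finding = PurePosixPath(finding_path)
    for record in error_records:
        # Directory of the file seen in the stack trace, e.g. /foo/bar for /foo/bar/baz.py.
        error_dir = PurePosixPath(record["file"]).parent
        # finding.parents lists every ancestor directory of the finding's file.
        if error_dir in finding.parents:
            return record["service"], record["team"]
    return None


# Hypothetical Error Tracking data for illustration only.
records = [{"file": "/foo/bar/baz.py", "service": "checkout", "team": "payments"}]
owner = find_error_owner("/foo/bar/qux.py", records)
```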

Service inference from file paths or repository names

When neither of the above strategies can determine the service, Datadog inspects naming patterns in the repository and file paths.

Datadog evaluates whether:

The file path contains identifiers matching a known service.
The repository name corresponds to a service name.

When using the finding’s file path, Datadog performs a reverse search on each path segment until it finds a matching service or exhausts all options.

For example, if a finding occurs in github.com/org/checkout-service at /foo/bar/baz/main.go, Datadog takes the last path segment, main, and sees if any Software Catalog service uses that name. If there is a match, the finding is attributed to that service. If not, the process continues with baz, then bar, and so on.

When all options have been tried, Datadog checks whether the repository name, checkout-service, matches a Software Catalog service name. If no match is found, Datadog is unsuccessful at linking your finding using Software Catalog.
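
The reverse search can be sketched in Python as follows. The function name and the known_services set are hypothetical, and exact name equality is an assumption. The text above only says that a segment "matches" a service name.

```python
from pathlib import PurePosixPath


def infer_service(repo, file_path, known_services):
    """Walk the path segments from last to first, then fall back to the repository name."""
    path = PurePosixPath(file_path)
    # Last segment first, with the file extension dropped ("main.go" -> "main").
    candidates = [path.stem]
    candidates += [part for part in reversed(path.parent.parts) if part != "/"]
    for candidate in candidates:
        if candidate in known_services:
            return candidate
    # All segments exhausted: try the repository name itself.
    repo_name = repo.rstrip("/").rsplit("/", 1)[-1]
    return repo_name if repo_name in known_services else None


# Segments are tried in the order main, baz, bar, foo, then checkout-service.
service = infer_service(
    "github.com/org/checkout-service",
    "/foo/bar/baz/main.go",
    {"bar", "checkout-service"},
)
```

In this call, bar is found before the repository name is considered, so the finding would be attributed to bar rather than checkout-service.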

This mechanism ensures that findings receive meaningful service attribution when no explicit metadata exists.

Link findings to teams through Code Owners

If Datadog is able to link your finding to a service using the above strategies, then the team that owns that service (if defined) is associated with that finding automatically.

Regardless of whether Datadog successfully links a finding to a service (and a Datadog team), Datadog uses the CODEOWNERS information from your finding’s repository to link Datadog and GitHub teams to your findings.

You must accurately map your Git provider teams to your Datadog Teams for team attribution to function properly.

Diff-aware scanning

Diff-aware scanning enables Datadog's static analyzer to scan only the files modified by a commit in a feature branch. It significantly reduces scan time by not running the analysis on every file in the repository for every scan. To enable diff-aware scanning in your CI pipeline, follow these steps:

Make sure your DD_APP_KEY, DD_SITE, and DD_API_KEY variables are set in your CI pipeline.
Add a call to datadog-ci git-metadata upload before invoking the static analyzer. This command ensures that Git metadata is available to the Datadog backend. Git metadata is required to calculate the number of files to analyze.
Ensure that datadog-static-analyzer is invoked with the --diff-aware flag.

Example of a command sequence (these commands must be invoked in your Git repository):

datadog-ci git-metadata upload

datadog-static-analyzer -i /path/to/directory -g -o sarif.json -f sarif --diff-aware <...other-options...>

Note: When a diff-aware scan cannot be completed, the entire directory is scanned.

Upload third-party static analysis results to Datadog

SARIF importing has been tested for Snyk, CodeQL, Semgrep, Gitleaks, and Sysdig. Reach out to Datadog Support if you experience any issues with other SARIF-compliant tools.

You can send results from third-party static analysis tools to Datadog, provided they are in the interoperable Static Analysis Results Interchange Format (SARIF). Node.js version 14 or later is required.

To upload a SARIF report:

Ensure that the DD_API_KEY and DD_APP_KEY variables are defined.
Optionally, set a DD_SITE variable (this defaults to datadoghq.com).
Install the datadog-ci utility:
```
npm install -g @datadog/datadog-ci
```
Run the third-party static analysis tool on your code and output the results in the SARIF format.

Upload the results to Datadog:

datadog-ci sarif upload $OUTPUT_LOCATION

SARIF Support Guidelines

Datadog supports ingestion of third-party SARIF files that are compliant with the SARIF 2.1.0 schema. Static analysis tools use the SARIF schema in different ways. If you want to send third-party SARIF files to Datadog, make sure they comply with the following details:

The violation location is specified through the physicalLocation object of a result.
- The artifactLocation and its uri must be relative to the repository root.
- The region object is the part of the code highlighted in the Datadog UI.
The partialFingerprints is used to uniquely identify a finding across a repository.
properties and tags add more information:
- The tag DATADOG_CATEGORY specifies the category of the finding. Acceptable values are SECURITY, PERFORMANCE, CODE_STYLE, BEST_PRACTICES, ERROR_PRONE.
- Violations annotated with the category SECURITY are surfaced in the Vulnerabilities explorer and the Security tab of the repository view.
The tool section must have a valid driver section with name and version attributes.

For example, here is a SARIF file processed by Datadog:


{
    "runs": [
        {
            "results": [
                {
                    "level": "error",
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {
                                    "uri": "missing_timeout.py"
                                },
                                "region": {
                                    "endColumn": 76,
                                    "endLine": 6,
                                    "startColumn": 25,
                                    "startLine": 6
                                }
                            }
                        }
                    ],
                    "message": {
                        "text": "timeout not defined"
                    },
                    "partialFingerprints": {
                        "DATADOG_FINGERPRINT": "b45eb11285f5e2ae08598cb8e5903c0ad2b3d68eaa864f3a6f17eb4a3b4a25da"
                    },
                    "properties": {
                        "tags": [
                            "DATADOG_CATEGORY:SECURITY",
                            "CWE:1088"
                        ]
                    },
                    "ruleId": "python-security/requests-timeout",
                    "ruleIndex": 0
                }
            ],
            "tool": {
                "driver": {
                    "informationUri": "https://www.datadoghq.com",
                    "name": "<tool-name>",
                    "rules": [
                        {
                            "fullDescription": {
                                "text": "Access to remote resources should always use a timeout and appropriately handle the timeout and recovery. When using `requests.get`, `requests.put`, `requests.patch`, etc. - we should always use a `timeout` as an argument.\n\n#### Learn More\n\n - [CWE-1088 - Synchronous Access of Remote Resource without Timeout](https://cwe.mitre.org/data/definitions/1088.html)\n - [Python Best Practices: always use a timeout with the requests library](https://www.codiga.io/blog/python-requests-timeout/)"
                            },
                            "helpUri": "https://link/to/documentation",
                            "id": "python-security/requests-timeout",
                            "properties": {
                                "tags": [
                                    "CWE:1088"
                                ]
                            },
                            "shortDescription": {
                                "text": "no timeout was given on call to external resource"
                            }
                        }
                    ],
                    "version": "<tool-version>"
                }
            }
        }
    ],
    "version": "2.1.0"
}
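
Before uploading, it can help to check a report against the guidelines above. The following Python sketch is one possible pre-upload check. The function name and the exact rules it applies are assumptions derived from the list above, not a validator provided by Datadog.

```python
ALLOWED_CATEGORIES = {"SECURITY", "PERFORMANCE", "CODE_STYLE", "BEST_PRACTICES", "ERROR_PRONE"}


def check_sarif(report):
    """Return a list of human-readable problems found in a parsed SARIF report."""
    problems = []
    if report.get("version") != "2.1.0":
        problems.append("version should be 2.1.0")
    for run_index, run in enumerate(report.get("runs", [])):
        # The tool section needs a driver with name and version.
        driver = run.get("tool", {}).get("driver", {})
        for attr in ("name", "version"):
            if not driver.get(attr):
                problems.append(f"runs[{run_index}].tool.driver.{attr} is missing")
        for result_index, result in enumerate(run.get("results", [])):
            where = f"runs[{run_index}].results[{result_index}]"
            for location in result.get("locations", []):
                artifact = location.get("physicalLocation", {}).get("artifactLocation", {})
                uri = artifact.get("uri", "")
                # Absolute paths and full URLs are not relative to the repository root.
                if not uri or uri.startswith("/") or "://" in uri:
                    problems.append(f"{where}: uri {uri!r} is not repository-relative")
            for tag in result.get("properties", {}).get("tags", []):
                if tag.startswith("DATADOG_CATEGORY:"):
                    category = tag.split(":", 1)[1]
                    if category not in ALLOWED_CATEGORIES:
                        problems.append(f"{where}: unknown category {category!r}")
    return problems
```

Load the file with json.load and pass the resulting dictionary to check_sarif. An empty list means none of the checks in this sketch failed, which does not guarantee that Datadog accepts the file.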

SARIF to CVSS severity mapping

The SARIF format defines four severities: none, note, warning, and error. However, Datadog reports the severity of violations and vulnerabilities using the Common Vulnerability Scoring System (CVSS), which defines five severities: critical, high, medium, low, and none.

When ingesting SARIF files, Datadog maps SARIF severities to CVSS severities using the mapping rules below.

SARIF severity	CVSS severity
Error	Critical
Warning	High
Note	Medium
None	Low
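
To preview how a report's findings would be classified, the table can be written as a small Python lookup. The names below are hypothetical, and the sketch covers only the four levels listed in the table.

```python
# Mapping taken directly from the table above; keys are SARIF "level" values.
SARIF_TO_CVSS = {
    "error": "critical",
    "warning": "high",
    "note": "medium",
    "none": "low",
}


def cvss_severity(sarif_level):
    """Translate a SARIF level string into the CVSS severity from the table."""
    return SARIF_TO_CVSS[sarif_level.lower()]
```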

Data retention

Datadog stores findings in accordance with our Data Retention Periods. Datadog does not store or retain customer source code.
