Configuration Reference

Complete reference for Socket Registry Firewall configuration options. All options can be set in socket.yml or overridden via environment variables.

Configuration File (`socket.yml`)

The firewall reads configuration from /app/socket.yml inside the container. Mount your config file:

volumes:
  - ./socket.yml:/app/socket.yml:ro

Core Socket Settings

socket:
  # Socket.dev API endpoint (required)
  api_url: https://api.socket.dev
  
  # Behavior when Socket API is unreachable
  fail_open: true            # true = allow packages (default), false = block all
  
  # Behavior for unscanned/unknown packages (Socket returns purlError)
  fail_open_unscanned: true  # true = allow unscanned with warning (default), false = block unscanned
  expose_unscanned_header: false  # true = add X-Socket-Unscanned response header (default: false)
  
  # Console log level (controls which messages appear in logs)
  log_level: info            # error, warn, info (default), debug
  
  # Corporate egress proxy for all upstream connections
  outbound_proxy: http://proxy.company.com:3128
  no_proxy: localhost,127.0.0.1,internal.company.com
  
  # SSL verification for Socket API calls
  api_ssl_verify: false      # Verify SSL for Socket API (default: false)
  api_ssl_ca_cert: /path/to/corporate-ca.crt  # Custom CA cert
  
  # SSL verification for upstream registry connections
  upstream_ssl_verify: false # Verify SSL for upstream registries (default: false, inherits api_ssl_verify)
  upstream_ssl_ca_cert: /path/to/upstream-ca.crt  # Custom CA for upstreams
  
  # Request behavior
  request_id_header: X-Socket-Request-ID  # Custom request ID header name
  
  # Client auth gate (require clients to present credentials)
  bearer_token: SOCKET_AUTH_TOKEN  # Env var name containing the token
  bearer_token_type: env           # Resolve bearer_token from the env var
  # basic_auth_username: ${SOCKET_BASIC_AUTH_USERNAME}  # HTTP Basic auth username
  # basic_auth_password: ${SOCKET_BASIC_AUTH_PASSWORD}  # HTTP Basic auth password

Environment variables:

SOCKET_SECURITY_API_TOKEN=${SOCKET_SECURITY_API_TOKEN}  # Required (scopes: packages:list, entitlements:list)
SOCKET_API_URL=https://api.socket.dev
SOCKET_FAIL_OPEN=true
SOCKET_FAIL_OPEN_UNSCANNED=true
SOCKET_EXPOSE_UNSCANNED_HEADER=false
SOCKET_LOG_LEVEL=info
SOCKET_LOG_MAX_BODY_SIZE=3900
SOCKET_OUTBOUND_PROXY=http://proxy:3128
SOCKET_NO_PROXY=localhost,127.0.0.1
SOCKET_BEARER_TOKEN=${SOCKET_AUTH_TOKEN} # Client auth gate (alternative to socket.bearer_token in YAML)
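
The override order (environment variable first, then socket.yml) can be pictured with a small sketch. The helper name `effective_setting`, the mapping table, and the boolean parsing rules are assumptions made for illustration; the firewall's own config tool may parse values differently.

```python
import os

# Hypothetical mapping from environment variable to socket: key (names taken from this page).
ENV_TO_KEY = {
    "SOCKET_API_URL": "api_url",
    "SOCKET_FAIL_OPEN": "fail_open",
    "SOCKET_FAIL_OPEN_UNSCANNED": "fail_open_unscanned",
    "SOCKET_LOG_LEVEL": "log_level",
}


def effective_setting(env_name, yaml_socket, environ=None):
    """Return the env var value if set, else the value from the socket: section."""
    environ = os.environ if environ is None else environ
    raw = environ.get(env_name)
    if raw is None or raw == "":
        # Assumption: an empty variable is treated the same as an unset one.
        return yaml_socket.get(ENV_TO_KEY[env_name])
    # Assumption: "true"/"false" strings map to booleans; other values stay strings.
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    return raw
```

For example, `effective_setting("SOCKET_FAIL_OPEN", {"fail_open": True}, {"SOCKET_FAIL_OPEN": "false"})` yields `False`, because the environment value wins over the YAML value.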

Remote Configuration from Socket Dashboard

Set socket.use_remote_config: true and the firewall fetches its deployment config from the Socket API on startup, then polls for changes every socket.config_refresh_interval. The remote config becomes the source of truth: each refresh replaces the local configuration wholesale rather than deep-merging into it. The only exceptions are the bootstrap fields listed below, which are always read from socket.yml.

socket:
  use_remote_config: true        # Default: false
  deployment: prod               # Deployment name in the Socket dashboard (required)
  config_refresh_interval: 60s   # How often to poll for changes (default: 60s; supports 30s, 5m, 1h, 1d)

  # Bootstrap fields — always read from socket.yml even when remote config is applied:
  api_url: https://api.socket.dev
  api_ssl_verify: true
  outbound_proxy: http://proxy.company.com:3128

Environment variables:

SOCKET_USE_REMOTE_CONFIG=true
SOCKET_DEPLOYMENT_NAME=prod
SOCKET_CONFIG_REFRESH_INTERVAL=60s
SOCKET_SECURITY_API_TOKEN=${SOCKET_SECURITY_API_TOKEN} # Required
SOCKET_ORG_SLUG=your-org                 # Recommended (resolved from API if unset)
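
The interval strings accepted by config_refresh_interval (30s, 5m, 1h, 1d) can be converted to seconds roughly as follows. This is a minimal sketch; the function name `parse_interval` is hypothetical, and it assumes only a whole number followed by one of the four unit suffixes shown above.

```python
import re

# Seconds per unit suffix listed for config_refresh_interval.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_interval(text):
    """Convert strings such as '30s', '5m', '1h', '1d' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]
```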

How it works

On generate, if socket.use_remote_config is true, the tool calls GET /v0/orgs/{org}/settings/socket-firewall-deployment/{deployment_uuid} and uses the deployment's value as the effective config.
The refresh daemon repeats the fetch every config_refresh_interval. On each cycle it writes a candidate config, validates it with nginx -t, and applies it with nginx -s reload. The reload is hot, so there is no downtime.
If a fetch returns a malformed config, the daemon reverts to the previous known-good config and keeps retrying.
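
The refresh cycle described above can be sketched as a single function that keeps the last known-good config whenever a step fails. This is an illustrative model only: `refresh_once` and its callable parameters are hypothetical, and skipping the reload when nothing changed is an assumption rather than documented behavior.

```python
def refresh_once(fetch, validate, apply, last_good):
    """Run one refresh cycle and return the config that should stay active.

    fetch()     -> candidate config, or raises on network/API failure
    validate(c) -> True if the candidate passes the syntax check (nginx -t in the real system)
    apply(c)    -> activates the config (nginx -s reload in the real system)
    """
    try:
        candidate = fetch()
    except Exception:
        return last_good      # API unreachable: keep the current config, retry next cycle
    if candidate == last_good:
        return last_good      # assumption: unchanged config needs no reload
    if not validate(candidate):
        return last_good      # malformed config: stay on the known-good one
    apply(candidate)
    return candidate
```

A daemon would call this in a loop, sleeping for config_refresh_interval between calls and passing the previous return value as `last_good`.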

Bootstrap fields (always local)

A small set of fields is always read from socket.yml, even when remote config is applied — these are required to reach the API and to pin the feature flag to the operator's choice:

socket.use_remote_config
socket.deployment
socket.api_url
socket.api_ssl_verify, socket.api_ssl_ca_cert
socket.outbound_proxy, socket.no_proxy
socket.config_refresh_interval

Everything else — registries, path_routing, cache, redis, splunk, metadata_filtering, ports, nginx, etc. — comes from the Socket dashboard when the flag is on.

Failure behavior

If the API is unreachable on startup, the firewall falls back to the local socket.yml and logs a warning. The refresh daemon keeps retrying in the background, so the firewall self-heals as soon as the API is reachable again.

Fail-Open Behavior

The firewall has two independent fail-open settings that control behavior in different error scenarios:

`fail_open` — API Errors

Controls behavior when the Socket API is unreachable or returns an HTTP error (timeout, 500, network failure).

Value	Behavior
`true` (default)	Allow the package with a warning. Error message appears in `warn_reason` field of decision logs, Splunk events, and webhooks.
`false`	Block the package. Error message appears in `block_reason` field. Returns 403.

`fail_open_unscanned` — Unscanned Packages

Controls behavior when the Socket API doesn't recognize a package/version (returns a purlError response). This happens when the package hasn't been scanned yet or doesn't exist in Socket's database.

Value	Behavior
`true` (default)	Allow the package with a warning. The purlError message appears in `warn_reason`. Decision logs include `unscanned: true`.
`false`	Block the package. The purlError message appears in `block_reason`. Returns 403. Decision logs include `unscanned: true`.

Why two settings? fail_open covers infrastructure issues (API down, network errors) while fail_open_unscanned covers data coverage gaps (new packages, private packages Socket hasn't indexed). Organizations may want different policies for each — for example, allowing packages during API outages but blocking unscanned packages in strict environments.
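
The two settings can be read as one small decision function. The sketch below is a model of the tables above, not the firewall's code; the outcome labels ("ok", "api_error", "purl_error") and the function name `decide` are hypothetical.

```python
def decide(api_outcome, fail_open=True, fail_open_unscanned=True):
    """Map a Socket API outcome to (allowed, reason_field, unscanned).

    api_outcome is one of these hypothetical labels:
      "ok"         - the API returned a verdict with no blocking alerts
      "api_error"  - timeout, HTTP error, or network failure
      "purl_error" - the API answered but does not know the package/version
    """
    if api_outcome == "api_error":
        # Infrastructure problem: governed by fail_open.
        return (True, "warn_reason", False) if fail_open else (False, "block_reason", False)
    if api_outcome == "purl_error":
        # Coverage gap: governed by fail_open_unscanned; logs carry unscanned: true.
        return (True, "warn_reason", True) if fail_open_unscanned else (False, "block_reason", True)
    return (True, None, False)
```

A strict environment that tolerates API outages but rejects unknown packages would call `decide(outcome, fail_open=True, fail_open_unscanned=False)`.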

`expose_unscanned_header` — Response Header Visibility

When true, adds an X-Socket-Unscanned: true response header on requests for unscanned packages. Default: false (header not sent).

socket:
  fail_open: true               # API errors: allow with warning (default)
  fail_open_unscanned: true     # Unscanned packages: allow with warning (default)
  expose_unscanned_header: false # X-Socket-Unscanned header: suppressed (default)

Client Auth Gate (`bearer_token` / `basic_auth`)

Optional feature to require all inbound requests to present valid credentials. Supports two methods:

Bearer token: Authorization: Bearer <token>
Basic auth: Authorization: Basic <base64(username:password)>

If both are configured, either method is accepted. When configured, requests without valid credentials receive a 401 Unauthorized response.
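
On the client side, the two header formats can be built as shown in this short sketch. The helper names are hypothetical; the encoding itself follows the standard HTTP Bearer and Basic schemes.

```python
import base64


def bearer_header(token):
    """Authorization header for bearer token auth."""
    return {"Authorization": f"Bearer {token}"}


def basic_header(username, password):
    """Authorization header for HTTP Basic auth (base64 of 'username:password')."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```

Either dictionary can be passed as request headers by an HTTP client that talks to the firewall.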

Configuration

socket:
  # Bearer token auth:
  # Read token from an environment variable
  bearer_token: SOCKET_AUTH_TOKEN       # Name of the env var (NOT the value)
  bearer_token_type: env                # Tells config tool to resolve the env var

  # Basic auth:
  basic_auth_username: ${SOCKET_BASIC_AUTH_USERNAME}
  basic_auth_password: ${SOCKET_BASIC_AUTH_PASSWORD}

Setting	Values	Default	Description
`bearer_token`	string	(empty)	The token value, or env var name when `bearer_token_type` is `env`
`bearer_token_type`	`string`, `env`	`string`	How to interpret `bearer_token`
`basic_auth_username`	string	(empty)	Username for HTTP Basic authentication
`basic_auth_password`	string	(empty)	Password for HTTP Basic authentication

Or via environment variables (works without any YAML config):

SOCKET_BEARER_TOKEN=${SOCKET_AUTH_TOKEN}
# or
SOCKET_BASIC_AUTH_USERNAME=${SOCKET_BASIC_AUTH_USERNAME}
SOCKET_BASIC_AUTH_PASSWORD=${SOCKET_BASIC_AUTH_PASSWORD}

Behavior

All requests (except /health) must include a valid Authorization header matching one of the configured methods
Unauthenticated or mismatched requests receive 401 with appropriate WWW-Authenticate header and JSON error body
The /health endpoint is exempt from auth — always accessible without credentials
When auth succeeds, the client's Authorization header is stripped and NOT forwarded upstream. Upstream auth is handled separately via upstream_token if configured
Auth failures are logged (credentials are never logged)
If both bearer token and basic auth are configured, the WWW-Authenticate header advertises both methods

Interaction with `upstream_token`

`bearer_token` / `basic_auth`	`upstream_token`	Behavior
—	—	Client auth passes through to upstream (default)
set	—	Client must match; no auth forwarded upstream
—	set	No inbound gate; upstream gets token from env var
set	set	Client must match; upstream gets token from env var

Upstream Auth Token (`upstream_token`)

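Inject an Authorization credential on all upstream (firewall → registry) requests for a specific route. The credential is sent as a Bearer token or as Basic auth, depending on its format (see Auth Scheme Auto-Detection). Its value comes from an environment variable: the YAML config specifies only the env var name, so secrets never appear in config files.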

Configuration

Supported on both path-based routes and domain-based registries:

# Path-based routing
path_routing:
  routes:
    - path: /pypi
      upstream: https://private-pypi.company.com
      registry: pypi
      upstream_token: PYPI_AUTH_TOKEN        # env var name → value used as Bearer token

    - path: /npm
      upstream: https://private-npm.company.com
      registry: npm
      upstream_token: NPM_AUTH_TOKEN

# Domain-based routing
registries:
  pypi:
    domains:
      - pypi.company.com
    upstream: https://private-pypi.company.com
    upstream_token: PYPI_AUTH_TOKEN           # same behavior for domain routes

Set the actual token value as an environment variable on the container:

# Bearer token (no colon in value)
PYPI_AUTH_TOKEN=${PYPI_AUTH_TOKEN}

# Basic auth (user:password format — auto-detected by the colon)
NPM_AUTH_TOKEN=${NPM_REGISTRY_CREDS}

Setting	Values	Default	Description
`upstream_token`	string	(empty)	Name of an environment variable containing the auth credential for upstream requests

Auth Scheme Auto-Detection

The firewall inspects the value of the env var at startup to choose the HTTP auth scheme:

Env var value	Detected scheme	Authorization header sent
`<token-value>` (no `:`)	Bearer	`Authorization: Bearer <token-value>`
`<username>:<password>` (contains `:`)	Basic	`Authorization: Basic <base64(username:password)>`

This is fully automatic — no additional configuration needed.
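
The detection rule in the table above amounts to a check for a colon. A minimal sketch, with the hypothetical function name `authorization_for`:

```python
import base64


def authorization_for(credential):
    """Pick Bearer or Basic from the credential's shape (a colon means user:password)."""
    if ":" in credential:
        encoded = base64.b64encode(credential.encode("utf-8")).decode("ascii")
        return f"Basic {encoded}"
    return f"Bearer {credential}"
```

One consequence of this rule is that a bearer token containing a colon would be sent as Basic auth, so token values used with upstream_token should not contain `:`.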

Behavior

When upstream_token is set for a route, every upstream request on that route includes the auto-detected Authorization header — replacing any client-sent Authorization header
Routes without upstream_token pass the client's Authorization header through unchanged (default behavior)
The env var name must match [A-Za-z_][A-Za-z0-9_]* (standard env var naming)
Token values are pre-resolved at worker startup for performance (no per-request os.getenv() overhead)
Token values are never logged. Init logs show only the env var name, with the value replaced by <redacted>
If the env var is empty or not set, a warning is logged and no Authorization header is injected
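
The name check and the empty-value rule from the list above can be sketched as a single lookup performed once at startup. The function name `resolve_upstream_token` is hypothetical, and raising an error for an invalid name is an assumption; the source only states the naming pattern.

```python
import os
import re

# Naming rule quoted in the behavior list above.
ENV_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def resolve_upstream_token(env_name, environ=None):
    """Return the credential for a route, or None if no header should be injected."""
    environ = os.environ if environ is None else environ
    if not ENV_NAME.fullmatch(env_name):
        # Assumption: an invalid name is rejected outright.
        raise ValueError(f"invalid env var name: {env_name!r}")
    value = environ.get(env_name, "")
    if not value:
        # Only the variable name is printed; the value never reaches the logs.
        print(f"warning: {env_name} is empty or unset; no Authorization header injected")
        return None
    return value
```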

Security Notes

Token values exist only in environment variables and process memory — never in config files or logs
Each route can have a different token, enabling per-registry credential isolation
Works independently from bearer_token (inbound client auth gate) — see interaction table above

Response Tracking Headers

The firewall adds tracking headers to responses for downstream observability and end-to-end request correlation.

Request ID Header

Every response includes a request ID header for tracking. The header name is configurable via socket.request_id_header in socket.yml (default: X-Socket-Request-ID). The same header is also sent to upstream registries and the Socket API for end-to-end correlation.

socket:
  request_id_header: X-Socket-Request-ID  # Default value; customize as needed

Header	Present On	Description
`X-Socket-Request-ID`	All responses	Unique request identifier (nginx `$request_id`). Header name configurable via `socket.request_id_header`.

Decision Headers

Security-checked requests (package downloads) include additional headers indicating the firewall's decision. Passthrough requests (metadata, checksums, default routes) do not include decision headers.

Header	Present On	Values / Description
`X-Socket-Decision`	Security-checked responses	`allowed` — Package passed security checks. `blocked` — Package blocked by security policy. `fail_open` — Socket API unavailable, package allowed due to fail-open policy.
`X-Socket-Block-Reason`	Blocked responses (403)	Comma-separated alert titles that caused the block (e.g., `malware,typosquat`).
`X-Socket-Warn-Reason`	Allowed responses with warnings	Comma-separated alert titles with warn-level severity.
`X-Socket-Monitor-Reason`	Allowed responses with monitors	Comma-separated alert titles with monitor-level severity.
`X-Socket-Unscanned`	Unscanned package responses	`true` when the package/version was not found or not yet scanned by Socket. Only present when `socket.expose_unscanned_header: true` (default: false).

Examples:

Blocked package:

HTTP/1.1 403 Forbidden
X-Socket-Request-ID: a1b2c3d4e5f6...
X-Socket-Decision: blocked
X-Socket-Block-Reason: malware,typosquat

Allowed package with warnings:

HTTP/1.1 200 OK
X-Socket-Request-ID: a1b2c3d4e5f6...
X-Socket-Decision: allowed
X-Socket-Warn-Reason: protestware

Allowed package (no alerts):

HTTP/1.1 200 OK
X-Socket-Request-ID: a1b2c3d4e5f6...
X-Socket-Decision: allowed

Passthrough request (metadata/checksums):

HTTP/1.1 200 OK
X-Socket-Request-ID: a1b2c3d4e5f6...
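
A client or log pipeline could summarize these headers with a helper like the one below. This is an illustrative sketch: the function name `summarize` and the "passthrough" label for responses without a decision header are assumptions, not values the firewall emits.

```python
def summarize(headers, request_id_header="X-Socket-Request-ID"):
    """Summarize firewall response headers from a dict of header name -> value."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive

    def titles(name):
        # Reason headers carry comma-separated alert titles.
        value = h.get(name.lower(), "")
        return [t.strip() for t in value.split(",") if t.strip()]

    return {
        "request_id": h.get(request_id_header.lower()),
        # "passthrough" is a label chosen for this sketch (no decision header present).
        "decision": h.get("x-socket-decision", "passthrough"),
        "block": titles("X-Socket-Block-Reason"),
        "warn": titles("X-Socket-Warn-Reason"),
        "monitor": titles("X-Socket-Monitor-Reason"),
        "unscanned": h.get("x-socket-unscanned") == "true",
    }
```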

Ports

ports:
  http: 8080     # HTTP port (redirects to HTTPS)
  https: 8443    # HTTPS port

Environment variables:

HTTP_PORT=8080
HTTPS_PORT=8443

Deployment Mode

Controls path generation for different deployment topologies:

# Default (downstream) - Client → Firewall → Registry
# Generates API paths for package manager clients
# No config_mode needed

# Upstream mode - Registry → Firewall → Public
# Generates direct paths for registry-to-registry communication
config_mode: upstream

# Middle mode - Registry → Firewall → Registry
# Generates both API and direct paths for multi-tier registries
config_mode: middle

Mode	Use When	Paths Generated	URL Rewriting
(default)	Client → FW → Registry	API paths	Yes
`upstream`	Private Registry → FW → Public	Direct paths	Yes
`middle`	Private Registry → FW → Private	Both API+Direct	No (proxy)

Environment variable:

CONFIG_MODE=upstream  # or 'middle'

Path-Based Routing

All registries behind a single domain with path prefixes. Recommended for most deployments.

path_routing:
  enabled: true
  domain: firewall.company.com
  
  routes:
    - path: /npm
      upstream: https://registry.npmjs.org
      registry: npm
      mode: rewrite  # 'rewrite' (default) or 'proxy'
      
    - path: /pypi
      upstream: https://pypi.org
      registry: pypi
      mode: rewrite
      
    - path: /maven
      upstream: https://repo1.maven.org/maven2
      registry: maven
      
    - path: /cargo
      upstream: https://index.crates.io
      registry: cargo
      
    - path: /rubygems
      upstream: https://rubygems.org
      registry: rubygems
      
    - path: /openvsx
      upstream: https://open-vsx.org
      registry: openvsx
      
    - path: /nuget
      upstream: https://api.nuget.org
      registry: nuget
      
    - path: /go
      upstream: https://proxy.golang.org
      registry: go
      
    - path: /conda
      upstream: https://repo.anaconda.com/pkgs/main
      registry: conda

URL Rewrite Scheme

Control the URL scheme used when rewriting metadata URLs.

path_routing:
  enabled: true
  domain: firewall.company.com
  rewrite_scheme: https            # Scheme for upstream connections (default: https)
  client_rewrite_scheme: http      # Scheme for client-facing URLs (optional)

Field	Default	Description
`rewrite_scheme`	`https`	Scheme used for upstream connections and URL rewriting
`client_rewrite_scheme`	(same as `rewrite_scheme`)	Scheme used in rewritten URLs returned to clients

Use case: When the firewall terminates SSL but clients connect via HTTP:

path_routing:
  rewrite_scheme: https             # Upstream connections use HTTPS
  client_rewrite_scheme: http       # Rewritten URLs use HTTP for clients

The firewall also respects X-Forwarded-Proto and X-Forwarded-Scheme headers as fallbacks when rewrite_scheme is not set.

Environment variables:

PATH_ROUTING_REWRITE_SCHEME=https
PATH_ROUTING_CLIENT_REWRITE_SCHEME=http

Forward for Domain (Unmatched Path Handling)

Controls what happens to requests that don't match any configured route.

path_routing:
  enabled: true
  domain: firewall.company.com
  upstream_fqdn: artifactory.company.com
  forward_for_domain: true    # Forward unmatched paths to upstream_fqdn (default: false)

`forward_for_domain`	`upstream_fqdn` set	Unmatched path behavior
`true`	Yes	Forwarded to `upstream_fqdn` uninspected (passthrough)
`true`	No	Returns 404
`false` (default)	Any	Returns 404
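
The table reduces to a single condition: forwarding happens only when both settings are present. A minimal sketch, with hypothetical function name and return labels:

```python
def unmatched_path_action(forward_for_domain=False, upstream_fqdn=None):
    """Return what happens to a request that matches no configured route."""
    if forward_for_domain and upstream_fqdn:
        return f"forward:{upstream_fqdn}"   # passthrough, not inspected by Socket
    return "404"
```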

When auto-discovery is active (private_registry configured), forward_for_domain is automatically enabled. This means any repository type not explicitly routed (unsupported ecosystems, repos filtered by include_pattern/exclude_pattern, etc.) will still be forwarded to the upstream without Socket inspection.

This setting can also be used without auto-discovery — for example, with manual routes where you want unmatched paths forwarded to an Artifactory or Nexus instance:

path_routing:
  enabled: true
  domain: firewall.company.com
  upstream_fqdn: artifactory.company.com
  forward_for_domain: true     # Forward docker, helm, raw, etc. without inspection
  routes:
    - path: /artifactory/api/npm/npm-remote
      upstream: https://artifactory.company.com/artifactory/api/npm/npm-remote
      registry: npm
    # Only npm is inspected; everything else passes through

Environment variable:

PATH_ROUTING_FORWARD_FOR_DOMAIN=true

Route Options

Field	Required	Description	Values
`path`	Yes	URL path prefix (must start with `/`)	`/npm`, `/pypi`, etc.
`upstream`	Yes	Upstream registry URL	HTTPS URL
`registry`	Yes	Registry type/ecosystem	npm, pypi, maven, etc.
`mode`	No	URL rewriting mode	`rewrite` (default), `proxy`

Route Mode: rewrite vs proxy

mode: rewrite (default) - Rewrites package URLs to point back through the firewall:

Use for: Downstream, Upstream deployments
URL in metadata → https://firewall.company.com/npm/package.tgz
Clients fetch packages through firewall

mode: proxy - Passes URLs through unchanged:

Use for: Middle deployments (Registry → FW → Registry)
URL in metadata → ../../packages/package.tgz (relative) or original upstream URL
Downstream registry resolves relative URLs
Required when using config_mode: middle
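
The difference between the two modes can be illustrated with a sketch of how a package URL found in registry metadata might be handled. This is a simplified model, assuming rewriting is a prefix swap from the route's upstream to the firewall domain and path; the function name `rewrite_url` is hypothetical and the real rewriting logic is ecosystem-specific.

```python
def rewrite_url(url, upstream, domain, path, scheme="https", mode="rewrite"):
    """Return the URL a client should see for a package link in metadata."""
    if mode == "proxy":
        return url                        # proxy mode: URLs pass through unchanged
    base = upstream.rstrip("/")
    if url != base and not url.startswith(base + "/"):
        return url                        # not under this route's upstream: leave as-is
    # Swap the upstream prefix for the firewall's scheme, domain, and route path.
    return f"{scheme}://{domain}{path.rstrip('/')}{url[len(base):]}"
```

With a route of path /npm and upstream https://registry.npmjs.org, a tarball URL under the upstream comes back as a URL under https://firewall.company.com/npm, so the client's download also goes through the firewall.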

External Routes File

For 50+ routes or dynamic route management, use an external file:

path_routing:
  enabled: true
  domain: firewall.company.com
  routes_file: /config/routes.yml

routes.yml format:

routes:
  - path: /npm-public
    upstream: https://registry.npmjs.org
    registry: npm
  - path: /npm-internal
    upstream: https://nexus.company.com/repository/npm-internal
    registry: npm
  # ... many more routes

Mount the routes file:

volumes:
  - ./routes.yml:/config/routes.yml:ro

Auto-Discovery (Artifactory/Nexus)

Automatically sync routes from your artifact manager. Routes refresh on the configured interval without restarting the firewall.

path_routing:
  enabled: true
  domain: firewall.company.com
  mode: artifactory  # or 'nexus'
  
  private_registry:
    api_url: https://artifactory.company.com/artifactory
    api_key: ${PRIVATE_REGISTRY_KEY}      # Token auth (takes precedence)
    # OR use basic auth with separate fields:
    username: ${PRIVATE_REGISTRY_USERNAME} # Basic auth username
    password: ${PRIVATE_REGISTRY_PASSWORD} # Basic auth password
    interval: 5m                         # Auto-sync interval (e.g., 30s, 5m, 1h)
    ignore_ssl_errors: false             # Disable verification of SSL when connecting to the Private Registry
    include_pattern: ".*"                 # Include all repos (default)
    exclude_pattern: "(tmp|test)-.*"      # Exclude temp/test repos
    supported_ecosystems_only: true      # Skip unsupported package types (default: true)

Artifactory Auto-Discovery

Discovers REMOTE, LOCAL, and FEDERATED repositories in Artifactory and creates firewall routes automatically. VIRTUAL repositories are excluded by default (see include_virtual). Only REMOTE repos pointing to known public registries get native Socket scanning routes.

Supported repository types:

npm
pypi
maven
cargo
rubygems
nuget
go
conda (experimental support)

Example discovered routes:

/npm-public       → https://registry.npmjs.org
/pypi-public      → https://pypi.org
/maven-central    → https://repo1.maven.org/maven2
/cargo-crates     → https://index.crates.io

Nexus Auto-Discovery

Discovers proxy and hosted repositories in Nexus and creates firewall routes automatically. Group repositories are excluded by default (see include_virtual).

path_routing:
  mode: nexus
  private_registry:
    api_url: https://nexus.company.com
    api_key: ${PRIVATE_REGISTRY_KEY}
    interval: 5m

Supported repository formats:

npm
pypi
maven2
cargo
rubygems
nuget
go
conda

Route naming: Routes are named after the repository name in Nexus (e.g., /npm-proxy, /pypi-proxy)

Auto-Discovery Configuration Options

All auto-discovery settings are configured in socket.yml under path_routing.private_registry:

private_registry:
  api_url: https://artifactory.company.com    # Repository manager base URL (required)
  api_key: ${PRIVATE_REGISTRY_KEY}            # API key/token (takes precedence)
  # OR use basic auth with separate fields:
  username: ${PRIVATE_REGISTRY_USERNAME}       # Basic auth username
  password: ${PRIVATE_REGISTRY_PASSWORD}       # Basic auth password
  interval: 5m                                # Sync interval (default: 5m)
  ignore_ssl_errors: false                    # Disable SSL cert verification (default: false)
  include_pattern: ".*"                        # Regex to include repos (default: all)
  exclude_pattern: "(tmp|test)-.*"             # Regex to exclude repos (default: none)
  supported_ecosystems_only: true             # Only route supported ecosystems (default: true)
  include_virtual: false                      # Include VIRTUAL/group repos (default: false)

The api_key can also be provided via the PRIVATE_REGISTRY_KEY environment variable.

Authentication priority: api_key (or PRIVATE_REGISTRY_KEY env var) takes precedence. If api_key is not set, username and password are combined as username:password for basic auth.
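
The effect of include_pattern and exclude_pattern on discovered repository names can be sketched as follows. This assumes the patterns must match the whole repository name (`re.fullmatch`); the source does not say whether matching is anchored, and the function name `select_repos` is hypothetical.

```python
import re


def select_repos(names, include_pattern=".*", exclude_pattern=None):
    """Keep names matching include_pattern and not matching exclude_pattern."""
    include = re.compile(include_pattern)
    exclude = re.compile(exclude_pattern) if exclude_pattern else None
    return [
        name for name in names
        # Assumption: both patterns are applied to the full repository name.
        if include.fullmatch(name) and not (exclude and exclude.fullmatch(name))
    ]
```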

`supported_ecosystems_only` (default: `true`)

Controls which discovered repositories get explicit firewall routes:

Value	Supported ecosystems (npm, pypi, maven, etc.)	Unsupported ecosystems (docker, helm, etc.)
`true`	Native route with Socket PURL scanning	No explicit route — forwarded uninspected via `forward_for_domain` catchall
`true` + `external_registry_cooldown.enabled: true`	Native route with Socket PURL scanning	Cooldown route only if repo matches an `external_registry_cooldown.registries` entry; otherwise skipped
`false`	Native route with Socket PURL scanning	Explicit passthrough route (forwarded without inspection)

REMOTE repository host validation: For REMOTE repos with supported package types, auto-discovery also verifies the remote URL points to a known public registry (e.g., registry.npmjs.org for npm). If the remote URL points elsewhere (e.g., Google AOSS, a private Artifactory, or any non-public host), the repo is:

Downgraded to a cooldown route when external_registry_cooldown.enabled: true
Skipped (no route) otherwise — traffic still reaches the upstream via the forward_for_domain catchall

This prevents repos that proxy non-public registries (like Google AOSS themes) from being treated as native ecosystems that Socket can scan.

`include_virtual` (default: `false`)

Controls whether VIRTUAL (Artifactory) or group (Nexus) repositories are included in auto-discovery.

Virtual/group repos aggregate multiple member repositories behind a single endpoint. They are excluded by default because:

Routing through a virtual aggregator bypasses per-member allowlist and cooldown gating
The concrete member repos (REMOTE, LOCAL, etc.) are already discovered individually

Set to true only when clients need to route through the virtual/group endpoint directly (e.g., downstream topologies).

Domain-Based Routing

Each registry gets its own subdomain. Requires multiple DNS records (or wildcard DNS) and certificates (or wildcard cert).

registries:
  npm:
    domains: [npm.company.com]
    upstream: https://registry.npmjs.org  # Optional - defaults to public registry
    
  pypi:
    domains: [pypi.company.com, python.company.com]  # Multiple domains supported
    upstream: https://pypi.org
    
  maven:
    domains: [maven.company.com]
    
  cargo:
    domains: [cargo.company.com]
    
  rubygems:
    domains: [rubygems.company.com]
    
  openvsx:
    domains: [vsx.company.com]
    
  nuget:
    domains: [nuget.company.com]
    
  go:
    domains: [go.company.com]
    
  conda:
    domains: [conda.company.com]

Client usage:

npm config set registry https://npm.company.com/
pip config set global.index-url https://pypi.company.com/simple

DNS requirements:
Each domain needs an A or CNAME record pointing to the firewall host.

SSL requirements:
Either provide individual certs for each domain, or use a wildcard cert (*.company.com).

Caching

Local In-Memory Cache (Default)

cache:
  ttl: 600  # Freshness window in seconds (10 minutes default)

Cached results are stored in nginx shared memory. Each entry is fresh for ttl seconds; after that it is stale but retained for revalidation.

Environment variable:

SOCKET_CACHE_TTL=600

Redis Cache (Distributed)

For multi-instance deployments or persistent caching across restarts:

redis:
  enabled: true
  host: redis.company.com
  port: 6379
  password: ${REDIS_PASSWORD}  # Optional
  db: 0  # Redis database number (default: 0)
  ttl: 86400   # Stale window in seconds (24 hours default)
  
  # SSL/TLS settings
  ssl: true
  ssl_verify: true
  ssl_ca_cert: /path/to/redis-ca.pem
  ssl_client_cert: /path/to/client-cert.pem  # For mTLS
  ssl_client_key: /path/to/client-key.pem   # For mTLS
  ssl_server_name: redis.company.com  # SNI hostname

Stale-while-revalidate behavior:

Fresh zone (0 to cache.ttl seconds): Return cached value immediately
Stale zone (cache.ttl to redis.ttl seconds): Revalidate with Socket API, fallback to stale on error
Expired (after redis.ttl): Key removed by Redis, fetch fresh from Socket API
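
The three zones can be expressed as a function of an entry's age. This is a minimal sketch; the function name `cache_zone` is hypothetical, and treating the exact boundary values as belonging to the later zone is an assumption.

```python
def cache_zone(age_seconds, cache_ttl=600, redis_ttl=86400):
    """Classify a cached result by age using cache.ttl and redis.ttl."""
    if age_seconds < cache_ttl:
        return "fresh"     # served straight from cache
    if age_seconds < redis_ttl:
        return "stale"     # revalidated with the Socket API, stale value used on error
    return "expired"       # key is gone from Redis, fetched fresh
```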

Environment variables:

REDIS_ENABLED=true
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD=${REDIS_PASSWORD}
REDIS_DB=0
REDIS_TTL=86400
REDIS_SSL=true
REDIS_SSL_VERIFY=true

Nginx Performance

nginx:
  worker_processes: 2        # Number of worker processes (match CPU cores)
  worker_connections: 4096   # Max concurrent connections per worker

Resource-Based Recommendations

Resources	worker_processes	worker_connections	Est. Throughput
1 CPU / 1 GB RAM	1	512	~30 req/s
2 CPU / 2 GB RAM	2	1024	~60 req/s
4 CPU / 4 GB RAM	4	4096	~100 req/s
8 CPU / 8 GB RAM	8	8192	~170 req/s
16 CPU / 16 GB RAM	16	16384	~300 req/s

Environment variables:

WORKER_PROCESSES=2
WORKER_CONNECTIONS=4096

Proxy Timeouts

Configure timeouts for upstream registry connections:

proxy:
  connect_timeout: 60  # Seconds to establish connection
  send_timeout: 60     # Seconds to send request
  read_timeout: 60     # Seconds to read response
  
  # Buffer sizes (advanced)
  buffer_size: 4k      # Initial buffer for response headers
  buffers_count: 8     # Number of buffers for response body
  buffers_size: 4k     # Size of each buffer
  busy_buffers_size: 8k  # Buffers that can be sent to client while reading

For large packages (e.g., Maven artifacts > 100MB):

proxy:
  connect_timeout: 120
  send_timeout: 300
  read_timeout: 300

Environment variables:

PROXY_CONNECT_TIMEOUT=60
PROXY_SEND_TIMEOUT=60
PROXY_READ_TIMEOUT=60

Client IP Detection (`client_ip`)

When the firewall sits behind a load balancer or reverse proxy, $remote_addr will be the proxy's IP address — not the real client. The client_ip section configures nginx's ngx_http_realip_module to extract the true client IP from a trusted header.

Once configured, all logging, telemetry, webhook events, Splunk HEC events, and the SOCKET_DECISION log field client_ip automatically reflect the real client IP. No Lua code changes are needed — the module transparently replaces $remote_addr.

Configuration

client_ip:
  header: X-Forwarded-For       # Header containing the real client IP
  trusted_proxies:               # CIDR ranges of trusted proxies
    - 10.0.0.0/8
    - 172.16.0.0/12
    - 192.168.0.0/16
  recursive: true                # Walk the header chain to find first untrusted IP (default: true)

Options

Setting	Type	Default	Description
`header`	string	(none)	HTTP header to read client IP from. Common values: `X-Forwarded-For`, `X-Real-IP`, `CF-Connecting-IP` (Cloudflare), `True-Client-IP` (Akamai)
`trusted_proxies`	string[]	`[]`	List of CIDR ranges whose IPs are trusted to set the client IP header. Required when `header` is set.
`recursive`	bool	`true`	When `true` and `header` is `X-Forwarded-For`, nginx walks the comma-separated IP chain from right to left, skipping IPs that match `trusted_proxies`, and uses the first untrusted IP.

Environment Variables

CLIENT_IP_HEADER=X-Forwarded-For
CLIENT_IP_TRUSTED_PROXIES=10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
CLIENT_IP_RECURSIVE=true

Examples

Behind an AWS ALB:

client_ip:
  header: X-Forwarded-For
  trusted_proxies:
    - 10.0.0.0/8       # VPC CIDR
  recursive: true

Behind Cloudflare:

client_ip:
  header: CF-Connecting-IP
  trusted_proxies:
    - 173.245.48.0/20
    - 103.21.244.0/22
    - 103.22.200.0/22
    - 103.31.4.0/22
    - 141.101.64.0/18
    - 108.162.192.0/18
    - 190.93.240.0/20
    - 188.114.96.0/20
    - 197.234.240.0/22
    - 198.41.128.0/17
    - 162.158.0.0/15
    - 104.16.0.0/13
    - 104.24.0.0/14
    - 172.64.0.0/13
    - 131.0.72.0/22

Behind a single known proxy (e.g., nginx ingress controller):

client_ip:
  header: X-Real-IP
  trusted_proxies:
    - 10.0.1.5/32       # Proxy IP

How It Works

No client_ip configured — $remote_addr is used as-is (direct client connection).

With client_ip.header and trusted_proxies set, the generated nginx configuration contains these ngx_http_realip_module directives:

set_real_ip_from 10.0.0.0/8;
set_real_ip_from 172.16.0.0/12;
real_ip_header X-Forwarded-For;
real_ip_recursive on;

When a request arrives from a trusted proxy IP, nginx replaces $remote_addr with the value from the configured header. Lua's ngx.var.remote_addr reflects this automatically.
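
The address selection performed by the realip directives can be modeled in a few lines. This is a simplified sketch of the behavior described above, not nginx's implementation; the function name `real_client_ip` is hypothetical and the sketch assumes every header entry is a valid IP address.

```python
import ipaddress


def real_client_ip(remote_addr, header_value, trusted_proxies, recursive=True):
    """Model which address replaces $remote_addr for a given request."""
    nets = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def trusted(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in nets)

    chain = [p.strip() for p in (header_value or "").split(",") if p.strip()]
    if not chain or not trusted(remote_addr):
        return remote_addr        # header is ignored for untrusted peers
    if not recursive:
        return chain[-1]          # last address in the header
    for ip in reversed(chain):
        if not trusted(ip):
            return ip             # first untrusted address, walking right to left
    return chain[0]               # every hop is trusted: leftmost address
```

For a request arriving from 10.0.0.5 with `X-Forwarded-For: 203.0.113.7, 10.0.0.9` and 10.0.0.0/8 trusted, the recursive walk skips 10.0.0.9 and selects 203.0.113.7.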

Security Notes

Always restrict trusted_proxies to your actual proxy/LB CIDR ranges. If set too broadly (e.g., 0.0.0.0/0), any client can spoof their IP via the header.
The header value is only trusted when the request comes from an IP in trusted_proxies.
Without trusted_proxies, the header setting is ignored (fail-safe).

Metadata Filtering

Requires version 1.1.108 or higher.

Remove blocked or warned package versions from registry metadata responses before clients see them, preventing installation attempts of unsafe packages entirely. Supports all 9 ecosystems with ecosystem-appropriate filtering granularity.

metadata_filtering:
  enabled: true
  filter_blocked: true              # Remove blocked/error packages from metadata
  filter_warn: false                # Keep warned packages visible (show warnings only)
  include_unchecked_versions: true  # Include versions not yet checked by Socket (default: true)
  max_versions: 100                 # Max versions to check per package (default: 100, newest first)
  cache_ttl: 3600                   # Cache TTL for metadata PURL lookups (default: 3600s = 1 hour)
  batch_size: 4000                  # Max PURLs per batch (Socket API limit ~4K)
  max_body_size: 500m               # Max metadata response body size for filtering (default: 500m)
  prefetch_enabled: true            # Enable/disable background prefetch for conda metadata (default: true)
  prefetch_ttl: 600                 # Prefetch refresh interval in seconds (0 = always check upstream, >0 = check every N seconds)
  prefetch_max_concurrent: 2        # Max concurrent prefetch operations across all workers (default: 2)
  prefetch_batch_concurrency: 4     # Max concurrent PURL batch API calls during metadata filtering (default: 4)

Options

Field	Default	Description
`enabled`	`false`	Enable metadata filtering
`filter_blocked`	`true`	Remove packages with `block` or `error` actions from metadata
`filter_warn`	`false`	Remove packages with `warn` actions from metadata
`include_unchecked_versions`	`true`	Keep versions not yet scanned by Socket (`false` = strict security posture)
`max_versions`	`100`	Max versions to check per package (newest first; older versions kept as-is)
`cache_ttl`	`3600`	Cache TTL in seconds for metadata PURL lookups (separate from download TTL)
`batch_size`	`4000`	Max PURLs per Socket API batch call (~4K API limit)
`max_body_size`	`500m`	Max metadata response body size for filtering (supports `k`, `m`, `g` suffixes)
`prefetch_enabled`	`true`	Enable/disable background prefetch for conda metadata. Set to `false` to disable all prefetch timers.
`prefetch_ttl`	`600`	Prefetch refresh interval in seconds. `0` = always check upstream for changes (does NOT disable prefetch). `>0` = check every N seconds.
`prefetch_max_concurrent`	`2`	Max concurrent prefetch operations across all nginx workers. Prevents worker exhaustion on startup.
`prefetch_batch_concurrency`	`4`	Max concurrent PURL batch API calls during metadata filtering. Higher values speed up large metadata like lodash or conda repodata.

How It Works

Client requests package metadata (e.g., npm install lodash, pip install requests)
Firewall fetches the full metadata response from the upstream registry
Extracts all package versions/artifacts and builds Package URLs (PURLs)
Calls the Socket API in batches to check security status of each version
Removes blocked (and optionally warned) versions from the response
Returns sanitized metadata to the client

When filtering is disabled, responses stream through unchanged with no buffering.
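For npm, the filtering steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the firewall's implementation: `build_purl_batches` and `filter_npm_packument` are hypothetical names, the version list is assumed to be sorted newest first already, only unscoped package names are handled, and dropping dist-tags that point at a removed version is an assumed policy.

```python
def build_purl_batches(name, versions, batch_size=4000, max_versions=100):
    """Build PURLs for the newest versions and group them into API-sized batches."""
    purls = [f"pkg:npm/{name}@{version}" for version in versions[:max_versions]]
    return [purls[i:i + batch_size] for i in range(0, len(purls), batch_size)]


def filter_npm_packument(packument, blocked_versions):
    """Return a copy of an npm packument with the blocked versions removed."""
    blocked = set(blocked_versions)
    filtered = dict(packument)
    if "versions" in packument:
        filtered["versions"] = {
            v: meta for v, meta in packument["versions"].items() if v not in blocked
        }
    if "time" in packument:
        # "time" also holds "created"/"modified" keys, which are kept.
        filtered["time"] = {
            k: stamp for k, stamp in packument["time"].items() if k not in blocked
        }
    if "dist-tags" in packument:
        # Assumption: a tag pointing at a removed version is dropped.
        filtered["dist-tags"] = {
            tag: v for tag, v in packument["dist-tags"].items() if v not in blocked
        }
    return filtered


packument = {
    "name": "demo",
    "dist-tags": {"latest": "2.0.0", "stable": "1.0.0"},
    "versions": {"1.0.0": {}, "2.0.0": {}},
    "time": {"created": "2024-01-01", "1.0.0": "2024-01-01", "2.0.0": "2024-02-01"},
}
print(filter_npm_packument(packument, ["2.0.0"]))
```

The input document is left untouched, so the unfiltered upstream response can still be cached separately if needed.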

Supported Ecosystems

Per-artifact filtering — individual artifacts within a version can be selectively removed:

Ecosystem	Metadata Format	Notes
PyPI	HTML (PEP 503) and JSON (PEP 691) `/simple/{package}/`	Filters tar.gz, wheels, eggs, zips independently
Maven	HTML directory listings and `maven-metadata.xml`	Filters by classifier and type (e.g., `?classifier=sources&type=jar`)

Per-version filtering — if any artifact is blocked, the entire version is removed:

Ecosystem	Metadata Format	Notes
npm	Package JSON (`/{package}`)	Filters `versions`, `dist-tags`, and `time` objects
NuGet	Registration catalog JSON (`/v3/registration5-gz-semver2/`)	Removes entries from catalog pages
RubyGems	CompactIndex (`/info/{gem}`) and JSON API (`/api/v1/versions/`)	Line-based and JSON array formats
Cargo	Sparse index NDJSON (`/{2-chars}/{2-chars}/{crate}`)	One version per NDJSON line
Go	`/@v/list` (newline-separated) and `/@latest` (JSON)	Source-only, no binary artifacts
Conda	`repodata.json` (`packages` and `packages.conda` objects)	Uses PyPI PURLs (Socket API fallback)
OpenVSX	Extension detail JSON (`/api/{namespace}/{extension}`)	Single .vsix per version

Use Cases

Prevent accidental installation of malicious or vulnerable packages
Remove flagged packages from search results and dependency resolution
Enforce strict security posture by excluding unchecked versions (include_unchecked_versions: false)
Pre-warm the PURL cache from metadata lookups, speeding up subsequent download checks

Environment Variables

METADATA_FILTERING_ENABLED=true
METADATA_FILTER_BLOCKED=true
METADATA_FILTER_WARN=false
METADATA_INCLUDE_UNCHECKED_VERSIONS=true
METADATA_MAX_VERSIONS=100
METADATA_CACHE_TTL=3600
METADATA_FILTER_BATCH_SIZE=4000
METADATA_MAX_BODY_SIZE=524288000       # 500MB in bytes (default)
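The byte value used for METADATA_MAX_BODY_SIZE can be derived from the suffixed form used in socket.yml. The helper below is a hypothetical sketch; it assumes binary multiples (1k = 1024 bytes), which is what the 524288000 figure for 500m implies.

```python
def parse_size(value):
    """Convert a size such as '500m' or '64k' into bytes (binary multiples assumed)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = str(value).strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


print(parse_size("500m"))  # 524288000
```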

Per-Ecosystem `recentlyPublished` Downgrade

Requires version 1.1.134 or higher.

Control which ecosystems have the recentlyPublished alert downgraded from its API-assigned severity to warn, allowing the package through instead of blocking it.

By default (empty or omitted), the API action is respected as-is for all ecosystems. When one or more ecosystems are listed, recentlyPublished alerts are downgraded to warn only for those ecosystems; all other ecosystems continue to use the API-assigned action.

socket:
  # Downgrade recentlyPublished to warn only for npm and pypi;
  # all other ecosystems use the API-assigned action
  recently_published_enabled_ecosystems:
    - npm
    - pypi

Options

Field	Default	Description
`recently_published_enabled_ecosystems`	`[]` (empty)	List of ecosystems where `recentlyPublished` alerts are downgraded to `warn`. Ecosystems not listed respect the API-assigned action. When empty or omitted, all ecosystems respect the API action.

Valid Ecosystem Values

npm, pypi, maven, cargo, rubygems, nuget, go, conda, openvsx

Behavior

Configuration	Effect
Empty or omitted	All ecosystems use the API-assigned action (default — no downgrade)
One or more ecosystems listed	`recentlyPublished` is downgraded to `warn` for listed ecosystems; all others use the API-assigned action

When a recentlyPublished alert is downgraded:

The package is allowed through (not blocked)
A warning is logged with the original and downgraded action
The X-Socket-Warn-Reason response header includes recentlyPublished
Splunk, webhook, and telemetry events reflect the downgraded warn action
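The downgrade rule can be summarized in a short sketch. The function names are hypothetical, and the code assumes that only `error` and `block` actions are downgraded (an action that is already `warn` or lower is left alone); the firewall's real logic may differ.

```python
VALID_ECOSYSTEMS = {
    "npm", "pypi", "maven", "cargo", "rubygems", "nuget", "go", "conda", "openvsx",
}


def parse_ecosystems(value):
    """Parse a comma-separated list such as 'npm,pypi' and reject unknown names."""
    names = [item.strip().lower() for item in value.split(",") if item.strip()]
    unknown = [name for name in names if name not in VALID_ECOSYSTEMS]
    if unknown:
        raise ValueError(f"unknown ecosystems: {unknown}")
    return names


def effective_action(alert_type, api_action, ecosystem, enabled_ecosystems):
    """Return the action to apply after the optional recentlyPublished downgrade."""
    if (
        alert_type == "recentlyPublished"
        and ecosystem in enabled_ecosystems
        and api_action in ("error", "block")  # assumption: only blocking actions change
    ):
        return "warn"
    return api_action


enabled = parse_ecosystems("npm,pypi")
print(effective_action("recentlyPublished", "error", "npm", enabled))    # warn
print(effective_action("recentlyPublished", "error", "maven", enabled))  # error
```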

Examples

Downgrade recentlyPublished to warn only for npm:

socket:
  recently_published_enabled_ecosystems:
    - npm

Downgrade recentlyPublished to warn for npm, pypi, and maven:

socket:
  recently_published_enabled_ecosystems:
    - npm
    - pypi
    - maven

Environment Variable

RECENTLY_PUBLISHED_ENABLED_ECOSYSTEMS=npm,pypi   # Comma-separated list

Per-Ecosystem Parameters

Fine-tune behavior for specific ecosystems using the ecosystem_params section under socket:. Currently supports conda-specific settings for managing packages that lack native Socket API coverage.

socket:
  ecosystem_params:
    conda:
      use_private_created_at: true  # Use private registry timestamps for cooldown enforcement (default: true)
      allow_unknown: true            # Allow unscanned packages with a warn alert (default: true)

Options

Field	Default	Description
`use_private_created_at`	`true`	When enabled, the conda parser queries the private registry (Nexus/Artifactory) for package publish timestamps and applies cooldown enforcement. Requires `external_registry_cooldown.enabled: true`. Set to `false` to skip private registry cooldown checks for conda.
`allow_unknown`	`true`	Overrides the global `fail_open_unscanned` setting for this ecosystem. When `true`, unscanned packages (Socket API returns `purlError`) are allowed with a `warn` action and Splunk/telemetry events. When `false`, unscanned packages are blocked regardless of the global setting. When not set, falls back to `fail_open_unscanned`.
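The precedence between `allow_unknown` and the global `fail_open_unscanned` setting can be sketched as below. `unscanned_allowed` is a hypothetical helper written for illustration; it reads the same JSON shape as SOCKET_ECOSYSTEM_PARAMS.

```python
import json


def unscanned_allowed(ecosystem, fail_open_unscanned=True, ecosystem_params=None):
    """Decide whether an unscanned package is allowed for the given ecosystem."""
    params = (ecosystem_params or {}).get(ecosystem, {})
    allow_unknown = params.get("allow_unknown")
    if allow_unknown is None:
        # No per-ecosystem override: use the global setting.
        return fail_open_unscanned
    return bool(allow_unknown)


params = json.loads('{"conda":{"use_private_created_at":true,"allow_unknown":false}}')
print(unscanned_allowed("conda", fail_open_unscanned=True, ecosystem_params=params))  # False
print(unscanned_allowed("npm", fail_open_unscanned=True, ecosystem_params=params))    # True
```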

Environment Variable

SOCKET_ECOSYSTEM_PARAMS='{"conda":{"use_private_created_at":true,"allow_unknown":true}}'  # JSON

External Registry Cooldown

For registries that Socket does not natively support, or private registries whose packages Socket has not scanned, the cooldown system blocks any package published within a configurable cooldown window. This protects against supply-chain attacks that rely on recently published malicious packages.

For ecosystems Socket doesn't natively support, the cooldown system replaces the Socket PURL API and acts as the security check.

Modes

The cooldown system supports two modes:

API mode (default when socket.deployment is configured):

Sends PURLs to the centralized Socket cooldown API endpoint
Same request format as the PURL API ({components: [{purl}]})
Falls back to the local daemon if the API is unreachable
Does not require Redis (uses Redis for caching when available)

Local mode (legacy, used when deployment is not configured):

Runs a local Python daemon that queries external registries directly for publish dates
Communicates with nginx via Redis queue
Requires Redis

How It Works

Auto-discovery (Artifactory/Nexus) detects repos with unsupported package types
Routes for those repos are tagged as cooldown-checked instead of passthrough
When a package is requested, the firewall checks Redis cache for the cooldown result
On cache miss:
- API mode: HTTP POST to /v0/orgs/{org}/firewall/{deployment}/cooldown
- Local mode: Push to Redis queue → daemon queries registry → response via Redis
If recentlyPublished alert with action: error → blocked (403)
If no recentlyPublished alert → allowed (proxied upstream)

Configuration

external_registry_cooldown:
  enabled: true                    # Enable cooldown checks (default: false)
  mode: api                        # "api" (default when deployment set) or "local" (legacy daemon)
  cooldown_period: 7d              # Block packages published within this window (default: 7d)
  check_interval: 60               # Queue polling interval in seconds (default: 60, local mode only)
  redis_key_prefix: "cooldown:"    # Redis key prefix (default: "cooldown:")
  cache_ttl: 86400                 # Cache TTL for cooldown results (default: 86400 = 24h)

Options

Field	Default	Description
`enabled`	`false`	Enable cooldown checks
`mode`	auto	`api` (Socket API), `local` (daemon). Auto-detected from `socket.deployment` if not set
`cooldown_period`	`7d`	Block packages published within this window (e.g., `30s`, `5m`, `1h`, `3d`, `1w`)
`check_interval`	`60`	Seconds between queue polling cycles (local mode only)
`redis_key_prefix`	`"cooldown:"`	Prefix for Redis keys storing cooldown data
`cache_ttl`	`86400`	Cache TTL in seconds for cooldown results (24 hours)
`fallback`	`""`	Use private registry as publish-date source (see Publish-Date Fallback)
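To make the `cooldown_period` format concrete, the sketch below parses the documented suffixes (`s`, `m`, `h`, `d`, `w`) and checks whether a publish date still falls inside the window. The helper names are hypothetical, and treating a package published exactly one period ago as outside the window is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def parse_period(text):
    """Convert a period such as '7d' or '30s' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhdw])", text.strip())
    if not match:
        raise ValueError(f"unsupported period: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def in_cooldown(published_at, period, now=None):
    """Return True when the package was published within the cooldown window."""
    now = now or datetime.now(timezone.utc)
    return now - published_at < parse_period(period)


now = datetime(2024, 3, 8, tzinfo=timezone.utc)
published = datetime(2024, 3, 5, tzinfo=timezone.utc)
print(in_cooldown(published, "7d", now=now))  # True: published 3 days ago
print(in_cooldown(published, "1d", now=now))  # False: older than the window
```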

Explicit External Registries

Query external registries directly for package publish dates:

external_registry_cooldown:
  enabled: true
  cooldown_period: 7d
  
  registries:
    - name: internal-pypi
      url: https://pypi.internal.company.com
      ecosystem: pypi
      auth_type: bearer_token       # none (default), bearer_token, basic
      auth_credential: INTERNAL_PYPI_TOKEN  # Env var name
      cooldown_period: 3d           # Per-registry override (optional)

    - name: private-npm
      url: https://npm.internal.company.com
      ecosystem: npm
      auth_type: basic
      auth_credential: NPM_REGISTRY_CREDS  # Env var (user:pass format)

Field	Required	Description
`name`	Yes	Unique name for this registry (used in logs and cache keys)
`url`	Yes	Base URL of the registry API
`ecosystem`	Yes	Package ecosystem: `npm`, `pypi`, `maven`, `cargo`, `rubygems`, `nuget`, `go`, `conda`
`auth_type`	No	Authentication type: `none` (default), `bearer_token`, `basic`
`auth_credential`	No	Name of environment variable containing the credential
`cooldown_period`	No	Override the global cooldown period for this registry

Private Registry Auto-Discovery

Reuses the existing Artifactory/Nexus connection from path_routing.private_registry to discover unsupported repos and check their import timestamps:

external_registry_cooldown:
  enabled: true
  
  private_registry:
    enabled: true                     # Enable auto-discovery for cooldown
    source: auto                      # auto | artifactory | nexus
    include_unsupported_only: true    # Only repos skipped by supported_ecosystems_only
    include_pattern: ".*"             # Regex for repo names to include
    exclude_pattern: "^$"             # Regex for repo names to exclude

Field	Default	Description
`enabled`	`false`	Enable cooldown via private registry auto-discovery
`source`	`auto`	Registry type: `auto` (detect from `path_routing.mode`), `artifactory`, `nexus`
`include_unsupported_only`	`true`	`true` = only repos with unsupported ecosystems; `false` = all repos
`include_pattern`	`".*"`	Regex filter for repo names to include
`exclude_pattern`	`"^$"`	Regex filter for repo names to exclude

Explicit registries and private registry auto-discovery can be active simultaneously. Explicit entries take precedence by name.

Supported Ecosystem Plugins

Each ecosystem has a dedicated plugin that knows how to query the registry API for publish dates:

Ecosystem	API Used
npm	Packument `.time.{version}` field
PyPI	JSON API `/pypi/{name}/json` → `upload_time_iso_8601`
Maven	`maven-metadata.xml` `<lastUpdated>` + POM Last-Modified
Cargo	crates.io API `.version.created_at`
RubyGems	Versions API `.created_at`
NuGet	Registration index `.catalogEntry.published`
Go	Proxy `.info` endpoint `.Time`
Conda	`repodata.json` `.timestamp` field
Artifactory	File Info API + AQL search (import timestamps)
Nexus	Components API + Search Assets API

Redis Communication

The firewall uses a cache-first strategy for low latency:

Cache hit (fast path, ~1ms): Direct Redis lookup cooldown:{ecosystem}:{name}:{version}
Cache miss: Push request to COOLDOWN_QUEUE, wait for daemon response (configurable timeout, default 2s)
Timeout: Allow through (fail-open) — no blocking on transient daemon failures
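The cache-first flow can be sketched with a plain dictionary standing in for Redis. Everything here is illustrative: `check_cooldown` and the `ask_daemon` callable are hypothetical, and the real firewall talks to Redis and the daemon rather than to in-process objects.

```python
def cache_key(ecosystem, name, version, prefix="cooldown:"):
    """Build the cache key in the documented {prefix}{ecosystem}:{name}:{version} form."""
    return f"{prefix}{ecosystem}:{name}:{version}"


def check_cooldown(cache, ask_daemon, ecosystem, name, version, timeout=2.0):
    """Cache-first lookup that fails open when the daemon does not answer in time."""
    key = cache_key(ecosystem, name, version)
    if key in cache:
        return cache[key]  # fast path: cached decision
    try:
        # Hypothetical callable standing in for the queue round trip to the daemon.
        result = ask_daemon(ecosystem, name, version, timeout)
    except TimeoutError:
        return "allow"  # fail-open; the timeout itself is not cached
    cache[key] = result
    return result


def slow_daemon(ecosystem, name, version, timeout):
    raise TimeoutError("daemon did not respond")


cache = {}
print(check_cooldown(cache, lambda *args: "block", "pypi", "demo", "1.0.0"))  # block
print(check_cooldown(cache, slow_daemon, "pypi", "demo", "1.0.0"))            # block (cached)
print(check_cooldown(cache, slow_daemon, "pypi", "other", "2.0.0"))           # allow
```

Not caching the fail-open result means the next request for the same package retries the daemon instead of inheriting a decision made during an outage.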

Decision Logging

Cooldown decisions are logged with full parity to Socket API decisions:

[SOCKET_DECISION] log entries include block_source: cooldown
Splunk HEC events include a synthesized recentlyPublished alert
Webhook payloads include cooldown metadata
Socket telemetry events include cooldown status

Publish-Date Fallback

When the primary ecosystem plugin can't determine a package's publish date, the cooldown system can fall back to querying the configured private registry (Artifactory or Nexus) to determine when the artifact was first imported.

This is useful when:

The ecosystem plugin's public registry is unreachable
The package exists only in the private registry (no public registry entry)
The plugin can't parse the upstream response

The fallback uses path_routing.private_registry connection config, so no extra auth is needed.

external_registry_cooldown:
  enabled: true
  fallback: external    # Use private registry as primary date source

Fallback Modes

Value	First lookup	Fallback (if first returns None)
`""`	Ecosystem plugin only	— (no fallback)
`"external"`	Private registry (auto-detect type)	Ecosystem plugin
`"artifactory"`	Ecosystem plugin	Artifactory AQL
`"nexus"`	Ecosystem plugin	Nexus REST Search API

external (recommended): Auto-detects whether to use Artifactory AQL or Nexus REST from path_routing.mode. Queries the private registry first — if it has the artifact's import date, that's used. Falls back to the ecosystem plugin if the private registry returns nothing.

artifactory / nexus: Queries the ecosystem plugin first (e.g., Maven Central metadata), and only tries the private registry if the plugin returns no date.
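The lookup order for each `fallback` value can be expressed as a small table-driven sketch. The source names (`plugin`, `private_registry`) and both helper functions are hypothetical labels chosen for this example.

```python
def lookup_order(fallback):
    """Return the publish-date sources to try, in order, for a fallback setting."""
    if fallback == "":
        return ["plugin"]
    if fallback == "external":
        return ["private_registry", "plugin"]
    if fallback in ("artifactory", "nexus"):
        return ["plugin", fallback]
    raise ValueError(f"unsupported fallback: {fallback!r}")


def resolve_publish_date(fallback, sources):
    """Try each source in order and return the first publish date found, else None."""
    for name in lookup_order(fallback):
        date = sources[name]()
        if date is not None:
            return date
    return None


sources = {
    "plugin": lambda: None,                          # plugin could not find a date
    "private_registry": lambda: "2024-03-01T00:00:00Z",
    "artifactory": lambda: "2024-03-02T00:00:00Z",
}
print(resolve_publish_date("external", sources))     # private registry answers first
print(resolve_publish_date("artifactory", sources))  # plugin fails, Artifactory is used
print(resolve_publish_date("", sources))             # None: no fallback configured
```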

How Each Registry Is Queried

Artifactory AQL: Queries items.find() with a name/version pattern. Searches both the repo and its -cache variant (remote repos store cached artifacts in <repo>-cache). Returns the earliest created timestamp. Compatible with Artifactory Pro and OSS.

Nexus REST: Queries /service/rest/v1/search?repository={repo}&name={name}&version={version}. Returns the earliest blobCreated timestamp (falls back to lastModified if blobCreated is unavailable).

Environment Variables

COOLDOWN_ENABLED=true
COOLDOWN_PERIOD=7d
COOLDOWN_CHECK_INTERVAL=60
COOLDOWN_REDIS_KEY_PREFIX=cooldown:
COOLDOWN_CACHE_TTL=86400
COOLDOWN_FALLBACK=external
COOLDOWN_REGISTRIES='[{"name":"internal-pypi","url":"https://pypi.internal.company.com","ecosystem":"pypi"}]'
COOLDOWN_PRIVATE_REGISTRY_ENABLED=true
COOLDOWN_PRIVATE_REGISTRY_SOURCE=auto
COOLDOWN_PRIVATE_REGISTRY_UNSUPPORTED_ONLY=true

Splunk Integration

Forward security events to Splunk HTTP Event Collector (HEC):

splunk:
  enabled: true
  hec_url: https://splunk.company.com:8088/services/collector/event
  hec_token: ${SPLUNK_HEC_TOKEN}
  index: security           # Splunk index name (optional, no default)
  source: socket-firewall   # Splunk source name
  sourcetype: socket:firewall:event  # Splunk sourcetype (default: socket:firewall:event)
  
  # SSL settings
  ssl_verify: true
  ssl_ca_cert: /path/to/splunk-ca.pem
  
  # Event batching
  batch_size: 1            # Events per batch (default: 1)

Event types logged:

Package blocks (malicious/supply-chain attacks)
Package warnings
API errors
Cache hits/misses
Request/response metadata

Example Splunk event:

{
  "time": 1709078400,
  "event": {
    "event_type": "package_check",
    "purl": "pkg:npm/malicious-package@1.0.0",
    "decision": "blocked",
    "action": "block",
    "response_code": 403,
    "upstream_status": null,
    "block_source": "download",
    "block_reason": "Known Malware",
    "warn_reason": "",
    "repo": "npm-remote",
    "client_ip": "203.0.113.10",
    "user_agent": "npm/8.19.2",
    "request_id": "abc123xyz",
    "upstream_host": "registry.npmjs.org",
    "source_path": "/repository/npm/malicious-package/-/malicious-package-1.0.0.tgz",
    "cached": false,
    "stale": false,
    "socket_api_response_code": 403,
    "purl_check_latency_ms": 142,
    "private_registry_request_id": "ecb06b92c7f89c93:ecb06b92c7f89c93:0000000000000000:0",
    "reason": "security_policy",
    "alerts": [
      {"type": "knownMalware", "severity": "critical", "category": "security", "action": "error"}
    ],
    "alert_count": 1,
    "blocked_alerts": [
      {"type": "knownMalware", "severity": "critical", "category": "security", "action": "error"}
    ],
    "blocked_alert_count": 1,
    "score": 0.1,
    "versions": {}
  },
  "source": "socket-firewall",
  "sourcetype": "socket:firewall:event"
}

Environment variables:

SPLUNK_ENABLED=true
SPLUNK_HEC_URL=https://splunk.company.com:8088/services/collector/event
SPLUNK_HEC_TOKEN=${SPLUNK_HEC_TOKEN}
SPLUNK_SOURCE=socket-firewall

Unified Event Fields

All three event systems — console logging ([SOCKET_DECISION]), Splunk HEC, and Socket telemetry — share the same core event fields built by a single function. This guarantees consistent observability regardless of which system is consuming the events.

Core Fields (all systems)

Field	Type	Description
`request_id`	string	Unique request identifier for correlation
`purl`	string	Package URL (e.g., `pkg:npm/malicious-package@1.0.0`)
`decision`	string	`"blocked"` or `"allowed"`
`action`	string	Overall severity: `"block"`, `"warn"`, `"monitor"`, `"ignore"`
`response_code`	number	HTTP status code sent to client (e.g., 200 or 403)
`upstream_status`	number or null	HTTP status from upstream registry (null for blocked packages)
`source_path`	string	Request URI path
`upstream_host`	string or null	Upstream registry hostname
`upstream_path`	string or null	Upstream registry request path
`repo`	string or null	Route/repository name (from `socket_route_name`)
`client_ip`	string	Client IP address
`user_agent`	string	Client User-Agent header
`socket_api_response_code`	number	Socket API HTTP status (200 allowed, 403 blocked)
`cached`	boolean	Whether result was served from cache
`stale`	boolean	Whether cached value was stale (revalidation attempted)
`block_source`	string or null	`"download"` (artifact check), `"metadata"` (metadata filtering), or `"cooldown"` (external registry cooldown)
`block_reason`	string	Comma-separated alert titles for block/error alerts
`warn_reason`	string	Comma-separated alert titles for warn alerts
`api_error`	string or null	Error message if Socket API call failed
`unscanned`	boolean or null	`true` when the package/version was not found or not yet scanned by Socket (purlError response)
`purl_check_latency_ms`	number or null	Milliseconds to check package via Socket API
`private_registry_request_id`	string or null	Trace/request ID from private registry (Jaeger `uber-trace-id` or `X-Request-Id` from Artifactory/Nexus)

Platform-Specific Fields

Splunk HEC adds these fields on top of the core:

Field	Type	Description
`event_type`	string	Always `"package_check"`
`reason`	string	Result reason string from Socket API
`alerts`	array	Structured array of alert objects (`type`, `severity`, `category`, `action`)
`alert_count`	number	Total number of alerts
`blocked_alerts`	array	Structured array of blocked/error alert objects
`blocked_alert_count`	number	Number of blocked/error alerts
`score`	number	Package security score
`versions`	object	Component versions from `.versions` file

Socket telemetry adds these fields on top of the core:

Field	Type	Description
`input_purl`	string	Decoded PURL sent for readability
`event_sender_created_at`	string	HTTP date timestamp
`socket_client_version`	string	Socket client library version
`event_type`	string	Always `"firewall_package_encountered"`
`event_category`	string	Always `"proactive"`
`registryFqdn`	string	Registry hostname from request
`machine_id`	string	SHA256-based machine identifier
`parser_name`	string	Ecosystem parser name
`parser_version`	string	Ecosystem parser version
`artifact_purl`	string	Decoded PURL for the artifact
`alert_action`	string	Alias for `action`
`client_action`	string	Alias for `action`
`purlCheckLatencyMs`	number	Alias for `purl_check_latency_ms` (camelCase)
`versions`	object	Component versions from `.versions` file

Console logging (the [SOCKET_DECISION] JSON log line) adds this field on top of the core:

Field	Type	Description
`monitor_reason`	string	Comma-separated alert titles for monitor alerts

Webhook Events

Send package decision events to any HTTP endpoint. Useful for custom dashboards, alerting systems, or SIEM integrations beyond Splunk.

webhook:
  enabled: true
  url: https://siem.company.com/api/events
  auth_header: "Bearer ${WEBHOOK_AUTH_TOKEN}" # Authorization header (optional)
  ssl_verify: false                     # Verify TLS certificate (default: false)
  timeout: 5000                         # Request timeout in ms (default: 5000)
  on_block: true                        # Fire on block decisions (default: true)
  on_warn: true                         # Fire on warn decisions (default: true)
  on_monitor: true                      # Fire on monitor decisions (default: true)
  on_ignore: true                       # Fire on ignore decisions (default: true)

Field	Default	Description
`enabled`	`false`	Enable webhook event delivery
`url`	(none)	Webhook endpoint URL (required when enabled)
`auth_header`	(none)	Value for the `Authorization` header (optional)
`ssl_verify`	`false`	Verify TLS certificate of webhook endpoint
`timeout`	`5000`	HTTP request timeout in milliseconds
`on_block`	`true`	Send events for blocked packages
`on_warn`	`true`	Send events for warned packages
`on_monitor`	`true`	Send events for monitored packages
`on_ignore`	`true`	Send events for ignored packages
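The `on_*` switches act as a per-action filter on delivery. The sketch below shows one way to evaluate them using the defaults from the table; `should_send` is a hypothetical helper and only the four listed actions are considered.

```python
WEBHOOK_DEFAULTS = {
    "enabled": False,
    "url": None,
    "on_block": True,
    "on_warn": True,
    "on_monitor": True,
    "on_ignore": True,
}


def should_send(action, config):
    """Return True when a decision with this action should be delivered."""
    merged = {**WEBHOOK_DEFAULTS, **config}
    if not merged["enabled"] or not merged["url"]:
        return False  # a URL is required when the webhook is enabled
    return bool(merged.get(f"on_{action}", False))


config = {
    "enabled": True,
    "url": "https://siem.company.com/api/events",
    "on_ignore": False,
}
print(should_send("block", config))   # True
print(should_send("ignore", config))  # False
```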

Events are delivered asynchronously (non-blocking) and include all core event fields:

{
  "event_type": "package_decision",
  "timestamp": 1709078400.123,
  "request_id": "abc123xyz",
  "purl": "pkg:npm/malicious-package@1.0.0",
  "decision": "blocked",
  "action": "block",
  "response_code": 403,
  "upstream_status": null,
  "block_source": "download",
  "block_reason": "Known Malware",
  "warn_reason": "",
  "client_ip": "203.0.113.10",
  "user_agent": "npm/8.19.2",
  "repo": "npm-remote",
  "source_path": "/repository/npm/malicious-package/-/malicious-package-1.0.0.tgz",
  "upstream_host": "registry.npmjs.org",
  "cached": false,
  "stale": false,
  "socket_api_response_code": 403,
  "purl_check_latency_ms": 142,
  "private_registry_request_id": "ecb06b92c7f89c93:ecb06b92c7f89c93:0000000000000000:0"
}

Environment variables:

WEBHOOK_ENABLED=true
WEBHOOK_URL=https://siem.company.com/api/events
WEBHOOK_AUTH_HEADER="Bearer ${WEBHOOK_AUTH_TOKEN}"
WEBHOOK_SSL_VERIFY=false
WEBHOOK_TIMEOUT=5000
WEBHOOK_ON_BLOCK=true
WEBHOOK_ON_WARN=true
WEBHOOK_ON_MONITOR=true
WEBHOOK_ON_IGNORE=true

Log Level

Controls which messages appear in console output. The default level is info, which shows all security decisions. Splunk HEC events, Socket telemetry events, and webhook deliveries are always sent regardless of log level — this setting only affects console (stderr) output.

socket:
  log_level: info  # error, warn, info (default), debug

Level	Console Output
`error`	Only block/error decisions (`[SOCKET_DECISION]` at ERR level)
`warn`	Block/error + warn decisions
`info`	All decisions including monitor/ignore (default)
`debug`	All decisions + verbose debug traces (automatically enables `debug_logging_enabled`)

SOCKET_DECISION Log Level Mapping

Each security decision action maps to a specific log level:

Action	Log Level	When Visible
`block`/`error`	ERR	Always (all log levels)
`warn`	WARN	`log_level: warn`, `info`, or `debug`
`monitor`	INFO	`log_level: info` (default) or `debug`
`ignore`	INFO	`log_level: info` (default) or `debug`
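The mapping above amounts to a threshold comparison. The following sketch encodes it directly from the table; `is_visible` is a hypothetical helper, not a firewall function.

```python
# Ordered from least to most verbose.
LEVELS = ["error", "warn", "info", "debug"]

# Log level at which each decision action is emitted, per the table above.
ACTION_LEVEL = {
    "block": "error",
    "error": "error",
    "warn": "warn",
    "monitor": "info",
    "ignore": "info",
}


def is_visible(action, log_level="info"):
    """Return True when a decision with this action appears at the given log_level."""
    return LEVELS.index(ACTION_LEVEL[action]) <= LEVELS.index(log_level)


for level in LEVELS:
    shown = [action for action in ACTION_LEVEL if is_visible(action, level)]
    print(level, shown)
```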

Integration with Debug Logging

Setting log_level: debug automatically enables debug_logging_enabled, which provides verbose HTTP request/response header logging. You can also enable debug logging independently via socket.debug_logging_enabled: true without changing the log level.

Environment variable:

SOCKET_LOG_LEVEL=info  # error, warn, info (default), debug

Log Max Body Size

Controls the maximum byte length of a [SOCKET_DECISION] JSON body in a single ngx.log() call. OpenResty has a hard 4096-byte buffer (NGX_MAX_ERROR_STR) — messages exceeding this limit are silently truncated. When a decision body exceeds the configured limit, it is automatically split across multiple continuation log lines.

socket:
  log_max_body_size: 3900  # bytes per log line (default: 3900, 0 = disable splitting)

Value	Behavior
`3900` (default)	Split long bodies into `[SOCKET_DECISION 1/N]`, `[SOCKET_DECISION 2/N]`, ... continuation lines
`0`	No splitting — output is truncated by nginx at 4096 bytes
Custom (≥100)	Use the specified chunk size (must leave room for ~80 bytes of prefix overhead)

Example output (split message)

[SOCKET_DECISION 1/2] {"request_id":"abc123","purl":"pkg:npm/malicious-package@1.0.0","decision":"blocked",...
[SOCKET_DECISION 2/2] ...,"blocked_alerts":["Malware"],"score":0.1}
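The splitting behavior can be illustrated with a short sketch. This is not the firewall's code: `split_decision` is a hypothetical function, it relies on `json.dumps` producing ASCII-only output so that characters and bytes coincide, and it does not add the ellipsis markers shown in the example above.

```python
import json


def split_decision(event, max_body_size=3900):
    """Serialize an event and split it into numbered continuation lines if needed."""
    body = json.dumps(event, separators=(",", ":"))  # ASCII-only by default
    if max_body_size <= 0 or len(body) <= max_body_size:
        return [f"[SOCKET_DECISION] {body}"]
    chunks = [body[i:i + max_body_size] for i in range(0, len(body), max_body_size)]
    total = len(chunks)
    return [f"[SOCKET_DECISION {n}/{total}] {chunk}" for n, chunk in enumerate(chunks, 1)]


event = {"request_id": "abc123", "block_reason": "x" * 250}
for line in split_decision(event, max_body_size=100):
    print(line[:60])
```

A log consumer can rebuild the original JSON by concatenating the text after each `[SOCKET_DECISION n/N] ` prefix in order.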

Environment variable:

SOCKET_LOG_MAX_BODY_SIZE=3900  # default; set to 0 to disable splitting

Debug Logging

Enable verbose request/response header logging for troubleshooting. Disabled by default.

socket:
  debug_logging_enabled: false              # Enable debug logging (default: false)
  debug_user_agent_filter: "*artifactory*"  # Glob pattern to match user-agents (optional)

Field	Default	Description
`debug_logging_enabled`	`false`	Enable verbose HTTP header logging
`debug_user_agent_filter`	(none)	Glob pattern to limit debug logging to matching user-agents only

When debug_user_agent_filter is set, only requests whose User-Agent header matches the glob pattern will produce debug log output. The match is case-insensitive. Standard glob syntax is supported (* matches any characters, ? matches a single character).

Examples:

# Log all requests
socket:
  debug_logging_enabled: true

# Log only Artifactory traffic
socket:
  debug_logging_enabled: true
  debug_user_agent_filter: "*artifactory*"

# Log only npm client traffic  
socket:
  debug_logging_enabled: true
  debug_user_agent_filter: "npm/*"
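To check a pattern before deploying it, the matching rule can be approximated with Python's `fnmatch` module. `matches_user_agent` is a hypothetical helper; note that `fnmatch` also understands `[...]` character sets, which the firewall's glob support may or may not share.

```python
import fnmatch


def matches_user_agent(user_agent, pattern=None):
    """Case-insensitive glob match; no pattern means every request matches."""
    if not pattern:
        return True
    return fnmatch.fnmatchcase(user_agent.lower(), pattern.lower())


print(matches_user_agent("Artifactory/7.55.10", "*artifactory*"))  # True
print(matches_user_agent("npm/8.19.2 node/v18.0.0", "npm/*"))      # True
print(matches_user_agent("pip/23.0", "npm/*"))                     # False
```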

Environment variables:

SOCKET_DEBUG_LOGGING_ENABLED=true
SOCKET_DEBUG_USER_AGENT_FILTER="*artifactory*"

Health Check Logging

The firewall exposes a /health endpoint on every server block (default, per-registry, and path-routing). Health check requests are automatically excluded from console output to prevent log noise from load balancers and Kubernetes probes.

What is suppressed

Log source	Suppressed?	How
nginx access log	Yes	`access_log off;` on every `/health` location block
Debug logging (`[DEBUG]`)	Yes	`should_debug_log()` returns `false` for `/health` requests
Splunk HEC / Socket telemetry	N/A	Health checks do not trigger security decisions

Health check response

GET /health HTTP/1.1

HTTP/1.1 200 OK
Content-Type: text/plain
Server: SocketFirewall/1.2.3

SocketFirewall/1.2.3 - Health OK

Per-registry and path-routing health endpoints include additional context:

SocketFirewall/1.2.3 - Health OK - npm (npm.company.com)
SocketFirewall/1.2.3 - Health OK - path-routing (firewall.company.com)

No configuration is required — health check log suppression is always active.

Decision Log (SOCKET_DECISION)

Every package security check emits a [SOCKET_DECISION] JSON log entry for audit and observability. These entries appear in the firewall's standard error log.

Example log entry:

[error] [REQUEST_ID: abc123] [SOCKET_DECISION] {"request_id":"abc123","purl":"pkg:npm/malicious-package@1.0.0","decision":"blocked","action":"block","response_code":403,"upstream_status":null,"source_path":"/npm/malicious-package/-/malicious-package-1.0.0.tgz","upstream_host":"registry.npmjs.org","repo":"npm","client_ip":"10.0.0.5","socket_api_response_code":200,"cached":false,"stale":false,"block_source":"download","block_reason":"malware,typosquat","warn_reason":"","api_error":null}

Decision Log Fields

Field	Type	Description
`request_id`	string	Unique request identifier
`purl`	string	Package URL (decoded, e.g., `pkg:npm/malicious-package@1.0.0`)
`decision`	string	`"allowed"` or `"blocked"`
`action`	string	Overall severity: `block`, `warn`, `monitor`, `ignore`, `error`
`response_code`	number	HTTP status returned to client (`200` or `403`)
`upstream_status`	number/null	HTTP status from upstream registry (null for blocked requests)
`source_path`	string	Request URI path
`upstream_host`	string/null	Upstream registry hostname
`repo`	string/null	Route name (e.g., `npm`, `pypi-remote`)
`client_ip`	string/null	Client IP address
`socket_api_response_code`	number	HTTP status from Socket API response
`cached`	boolean	Whether the result was served from cache
`stale`	boolean	Whether the cached result was stale (revalidation attempted)
`block_source`	string/null	`"download"` (artifact check), `"metadata"` (filtering), or `"cooldown"` (external registry cooldown)
`block_reason`	string	Comma-separated alert titles that caused a block
`warn_reason`	string	Comma-separated alert titles at warn level
`api_error`	string/null	Error message if Socket API call failed
`private_registry_request_id`	string/null	Trace/request ID from private registry (`uber-trace-id` or `X-Request-Id`)

Log Level by Action

Action	Log Level	When
`block` / `error`	ERROR	Package blocked or API error in fail-closed
`warn`	WARN	Package has warn-level alerts (still allowed)
`monitor` / `ignore`	NOTICE	Package allowed with monitor alerts or clean

Filtering Logs

# All security decisions
docker compose logs socket-firewall | grep SOCKET_DECISION

# Only blocked packages
docker compose logs socket-firewall | grep SOCKET_DECISION | grep '"decision":"blocked"'

# Decisions for a specific package
docker compose logs socket-firewall | grep SOCKET_DECISION | grep 'lodash'

# Metadata filtering decisions
docker compose logs socket-firewall | grep SOCKET_DECISION | grep '"block_source":"metadata"'
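When grep is not enough, the JSON payload can be parsed for structured filtering. The sketch below is a hypothetical helper that handles single-line entries only; split entries of the form `[SOCKET_DECISION 1/N]` would need to be reassembled first.

```python
import json

MARKER = "[SOCKET_DECISION] "


def parse_decision(line):
    """Extract the JSON payload from a single-line decision log entry, or None."""
    _, found, payload = line.partition(MARKER)
    if not found:
        return None
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None


sample = (
    '[error] [REQUEST_ID: abc123] [SOCKET_DECISION] '
    '{"request_id":"abc123","decision":"blocked","block_source":"download"}'
)
event = parse_decision(sample)
print(event["decision"], event["block_source"])  # blocked download
```

Piping `docker compose logs socket-firewall` into a script that calls `parse_decision` on each line of standard input allows filtering on any field, for example all events where `cached` is false.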

Access Log Format

The firewall uses a custom access log format that includes timing, upstream, and authentication fields for operational monitoring.

Log format:

$remote_addr - $remote_user [$time_local] "$request_method $request_uri $server_protocol"
  $status $body_bytes_sent "$http_referer"
  "$http_user_agent" "$http_x_forwarded_for"
  rt=$request_time
  upstream=$upstream_addr us=$upstream_status ut=$upstream_response_time
  auth=$sanitized_authorization
  req=$request_id trace=$sent_http_x_trace_id

Access Log Fields

Field	Description
`rt=`	Total request time in seconds (includes upstream + processing)
`upstream=`	Upstream server address (IP:port)
`us=`	Upstream HTTP status code
`ut=`	Upstream response time in seconds
`auth=`	Authorization header (redacted to `[REDACTED]` for security)
`req=`	NGINX-generated unique request ID (32-char hex, correlates with `[REQUEST_ID: ...]` in Lua logs)
`trace=`	Upstream registry trace ID (`uber-trace-id` or `X-Request-Id` from upstream response), also sent as `X-Trace-Id` response header

Query parameters are stripped from logged URIs to prevent sensitive data leakage.
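
Because the timing fields are plain `key=value` tokens, they can be pulled out with awk. The sketch below prints the request time, upstream status, and request ID for requests slower than a threshold. It assumes each request is logged on a single space-separated line (the format above is wrapped for readability); `slow_requests` is a hypothetical helper name.

```shell
# slow_requests THRESHOLD: read access log lines on stdin and print
# "<rt> <us> <req>" for every request whose rt= value is >= THRESHOLD seconds.
slow_requests() {
  awk -v min="$1" '{
    rt = ""; us = "-"; req = "-"
    # Scan every token; the last match wins, so a stray "rt=" inside the
    # user agent is overridden by the real field later in the line.
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^rt=/)  rt  = substr($i, 4)
      if ($i ~ /^us=/)  us  = substr($i, 4)
      if ($i ~ /^req=/) req = substr($i, 5)
    }
    if (rt != "" && rt + 0 >= min + 0) print rt, us, req
  }'
}
```

Usage: docker compose logs socket-firewall | slow_requests 1.0. The printed request ID can then be matched against `[REQUEST_ID: ...]` entries in the Lua logs.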

Access Log Buffering

Control log output buffering with access_log_buffer:

nginx:
  access_log_buffer: 64k      # Default — buffer 64k before flushing
  # access_log_buffer: off    # Disable buffering (flush every line)
  # access_log_buffer: 256k   # Larger buffer for high-throughput

Value	Behavior
`64k`	Default. Buffers up to 64k before flushing to stdout
`off`	Disables buffering — each log line is written immediately
`256k`	Larger buffer for high-throughput deployments

Set access_log_buffer: off when you need real-time log output (e.g., debugging, streaming to log aggregators).

SSL/TLS Certificates

Certificates are stored in /etc/nginx/ssl inside the container. Mount from host:

volumes:
  - ./ssl:/etc/nginx/ssl

Configuration

ssl:
  cert: /etc/nginx/ssl/fullchain.pem   # Server certificate (default: auto-generated)
  key: /etc/nginx/ssl/privkey.pem      # Server private key (default: auto-generated)
  ca_cert: /etc/nginx/ssl/ca-cert.pem  # Custom CA certificate (optional)

Setting	Purpose	Default
`cert`	Server certificate for HTTPS listener	Auto-generated self-signed
`key`	Server private key	Auto-generated self-signed
`ca_cert`	Custom CA certificate — trusted in addition to system root CAs	(not set)

Custom CA Certificate (`ca_cert`)

When set, the firewall creates a combined CA bundle at startup that includes:

System root CAs (/etc/ssl/certs/ca-certificates.crt)
Your custom CA certificate
Redis CA certificate (if configured)

This bundle is used for all outbound SSL connections — upstream registries, Socket API, and Redis.

Use case: Upstream registries (Nexus, Artifactory, etc.) use internal or self-signed certificates.

ssl:
  ca_cert: /etc/nginx/ssl/internal-ca.pem

Note: The per-connection overrides socket.api_ssl_ca_cert and socket.upstream_ssl_ca_cert still work for advanced use cases where different connections need different trust stores.
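
To debug trust problems outside the container, it can help to reproduce a bundle like the one described above. This is only an illustration of concatenating PEM files, not the firewall's actual startup logic; `build_ca_bundle` and the output path are hypothetical.

```shell
# build_ca_bundle OUT FILE...: concatenate the PEM files that exist and are
# non-empty into OUT, then print how many certificates the bundle contains.
build_ca_bundle() {
  local out="$1"
  local pem
  shift
  : > "$out"
  for pem in "$@"; do
    if [ -s "$pem" ]; then
      cat "$pem" >> "$out"
      # Add a newline if the file lacks one, so PEM boundaries stay on separate lines.
      if [ -n "$(tail -c 1 "$pem")" ]; then
        echo >> "$out"
      fi
    fi
  done
  grep -c 'BEGIN CERTIFICATE' "$out"
}
```

Usage: build_ca_bundle /tmp/bundle.pem /etc/ssl/certs/ca-certificates.crt ssl/internal-ca.pem. The resulting file can be passed to `openssl s_client -CAfile` to check whether an upstream registry's chain verifies.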

Required Files

File	Purpose	Permissions
`ssl/fullchain.pem`	Certificate chain (cert + intermediates)	644
`ssl/privkey.pem`	Private key	644

Auto-Generated Certificates

The firewall generates self-signed certificates on first run if none exist and writes them to /etc/nginx/ssl/.

Custom Certificates (Production)

Place your organization's certificates in the ssl/ directory on the host:

mkdir -p ssl
cp /path/to/cert.pem ssl/fullchain.pem
cp /path/to/key.pem ssl/privkey.pem
chmod 644 ssl/fullchain.pem ssl/privkey.pem

Generate Self-Signed Certificates

Single domain:

mkdir -p ssl
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout ssl/privkey.pem \
  -out ssl/fullchain.pem \
  -subj "/CN=firewall.company.com" \
  -addext "subjectAltName=DNS:firewall.company.com,DNS:localhost"
chmod 644 ssl/fullchain.pem ssl/privkey.pem

Wildcard (multiple subdomains):

mkdir -p ssl
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout ssl/privkey.pem \
  -out ssl/fullchain.pem \
  -subj "/CN=*.company.com" \
  -addext "subjectAltName=DNS:*.company.com,DNS:company.com,DNS:localhost"
chmod 644 ssl/fullchain.pem ssl/privkey.pem
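
After generating a certificate, it is worth confirming the subject, SANs, and expiry before mounting it. A small sketch using standard `openssl x509` options follows; the `-ext` option needs OpenSSL 1.1.1 or later (the same requirement as `-addext` above), and the helper names are hypothetical.

```shell
# cert_summary FILE: print the subject, expiry date, and SAN list of a PEM certificate.
cert_summary() {
  openssl x509 -in "$1" -noout -subject -enddate -ext subjectAltName
}

# cert_valid_for FILE DAYS: succeed only if the certificate is still valid DAYS days from now.
cert_valid_for() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" > /dev/null
}
```

Usage: cert_summary ssl/fullchain.pem, then cert_valid_for ssl/fullchain.pem 30 || echo "renew soon". Certificates created with -days 365 as above stop passing the 30-day check about 11 months after creation.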

Trust Self-Signed Certificates

macOS:

sudo security add-trusted-cert -d -r trustRoot \
  -k /Library/Keychains/System.keychain ssl/fullchain.pem

Linux:

sudo cp ssl/fullchain.pem /usr/local/share/ca-certificates/socket-firewall.crt
sudo update-ca-certificates

Windows:

Import-Certificate -FilePath ssl\fullchain.pem -CertStoreLocation Cert:\LocalMachine\Root

Environment Variables Reference

All configuration can be overridden via environment variables, which is useful for Docker and Kubernetes deployments.

Core Settings

# Required
SOCKET_SECURITY_API_TOKEN=${SOCKET_SECURITY_API_TOKEN} # Socket.dev API key

# Socket API
SOCKET_API_URL=https://api.socket.dev         # Default
SOCKET_FAIL_OPEN=true                         # Allow on API error (default: true)
SOCKET_FAIL_OPEN_UNSCANNED=true               # Allow unscanned packages (default: true)
SOCKET_CACHE_TTL=600                          # Freshness window (seconds)

# Ports
HTTP_PORT=8080                                # HTTP port
HTTPS_PORT=8443                               # HTTPS port

# Deployment mode
CONFIG_MODE=upstream                          # 'upstream' or 'middle'

# SSL verification
SOCKET_API_SSL_VERIFY=false                   # Verify Socket API SSL (default: false)
SOCKET_API_SSL_CA_CERT=/path/to/ca.crt       # Custom Socket API CA
SOCKET_UPSTREAM_SSL_VERIFY=false             # Verify upstream registry SSL (default: false, inherits api_ssl_verify)
SOCKET_UPSTREAM_SSL_CA_CERT=/path/to/ca.crt  # Custom upstream CA

# Corporate proxy
SOCKET_OUTBOUND_PROXY=http://proxy:3128      # Egress proxy
SOCKET_NO_PROXY=localhost,127.0.0.1          # No-proxy exceptions

# Request tracking
SOCKET_REQUEST_ID_HEADER=X-Socket-Request-ID # Request ID header name (default)

# Log level
SOCKET_LOG_LEVEL=info                        # Console log level (error/warn/info/debug)

# Debug logging
SOCKET_DEBUG_LOGGING_ENABLED=false           # Enable debug logging
SOCKET_DEBUG_USER_AGENT_FILTER="*pattern*"   # Glob filter for user-agent

Redis

REDIS_ENABLED=true                            # Enable Redis
REDIS_HOST=redis.company.com                  # Redis hostname
REDIS_PORT=6379                               # Redis port
REDIS_PASSWORD=${REDIS_PASSWORD}              # Redis password
REDIS_DB=0                                    # Redis database number
REDIS_TTL=86400                               # Stale window (seconds)

# Redis SSL
REDIS_SSL=true                                # Enable SSL
REDIS_SSL_VERIFY=true                         # Verify Redis SSL
REDIS_SSL_CA_CERT=/path/to/redis-ca.pem      # Redis CA cert
REDIS_SSL_SERVER_NAME=redis.company.com       # SNI hostname
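
SOCKET_CACHE_TTL (freshness window) and REDIS_TTL (stale window) together determine how a cached result is treated as it ages. The sketch below is one reading of that relationship, not the firewall's implementation: it assumes both windows are measured from the time the result was stored, and falls back to the example values shown above. `cache_state` is a hypothetical helper.

```shell
# cache_state AGE_SECONDS: classify a cached result by its age.
# Assumption: both windows start when the result was stored.
cache_state() {
  local age="$1"
  local fresh="${SOCKET_CACHE_TTL:-600}"   # freshness window (seconds)
  local stale="${REDIS_TTL:-86400}"        # stale window (seconds)
  if [ "$age" -lt "$fresh" ]; then
    echo fresh      # inside the freshness window
  elif [ "$age" -lt "$stale" ]; then
    echo stale      # past freshness but still retained; revalidation attempted
  else
    echo expired    # past the stale window; assumed to behave as a cache miss
  fi
}
```

Under these assumptions, cache_state 30 prints fresh, cache_state 3600 prints stale, and cache_state 90000 prints expired.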

Nginx Performance

WORKER_PROCESSES=2                            # nginx worker processes
WORKER_CONNECTIONS=4096                       # Connections per worker

Proxy Timeouts

PROXY_CONNECT_TIMEOUT=60                      # Connection timeout (seconds)
PROXY_SEND_TIMEOUT=60                         # Send timeout
PROXY_READ_TIMEOUT=60                         # Read timeout

Auto-Discovery

Auto-discovery is configured via socket.yml under path_routing.private_registry (see above).
The api_key can also be provided via the PRIVATE_REGISTRY_KEY environment variable.

Metadata Filtering

METADATA_FILTERING_ENABLED=true               # Enable filtering (v1.1.108+)
METADATA_FILTER_BLOCKED=true                  # Filter blocked packages
METADATA_FILTER_WARN=false                    # Filter warned packages
METADATA_INCLUDE_UNCHECKED_VERSIONS=true      # Keep unchecked versions
METADATA_MAX_VERSIONS=100                     # Max versions to check per package
METADATA_CACHE_TTL=3600                       # Cache TTL for metadata lookups (seconds)
METADATA_FILTER_BATCH_SIZE=4000               # Max PURLs per batch
METADATA_MAX_BODY_SIZE=524288000              # Max body size for filtering (500MB default)
METADATA_PREFETCH_ENABLED=true                # Enable/disable background prefetch (true/false)
METADATA_PREFETCH_TTL=600                     # Prefetch refresh interval in seconds
PREFETCH_MAX_CONCURRENT=2                     # Max concurrent prefetch operations across workers
PREFETCH_BATCH_CONCURRENCY=4                  # Max concurrent PURL batch API calls per filter

Per-Ecosystem Alert Override

RECENTLY_PUBLISHED_ENABLED_ECOSYSTEMS=npm,pypi  # Enforce recentlyPublished blocking for these ecosystems (v1.1.134+)

Splunk

SPLUNK_ENABLED=true                           # Enable Splunk
SPLUNK_HEC_URL=https://splunk.company.com:8088/services/collector/event
SPLUNK_HEC_TOKEN=${SPLUNK_HEC_TOKEN}         # Splunk HEC token
SPLUNK_INDEX=security                         # Splunk index (optional)
SPLUNK_SOURCE=socket-firewall                 # Splunk source
SPLUNK_SOURCETYPE=socket:firewall:event       # Splunk sourcetype (default: socket:firewall:event)
SPLUNK_SSL_VERIFY=true                        # Verify Splunk SSL
SPLUNK_BATCH_SIZE=1                           # Events per batch (default: 1)

Webhook

WEBHOOK_ENABLED=true                          # Enable webhook
WEBHOOK_URL=https://siem.company.com/api/events  # Webhook endpoint URL
WEBHOOK_AUTH_HEADER="Bearer ${WEBHOOK_AUTH_TOKEN}" # Authorization header (optional)
WEBHOOK_SSL_VERIFY=false                      # Verify TLS (default: false)
WEBHOOK_TIMEOUT=5000                          # Timeout in ms (default: 5000)
WEBHOOK_ON_BLOCK=true                         # Fire on block (default: true)
WEBHOOK_ON_WARN=true                          # Fire on warn (default: true)
WEBHOOK_ON_MONITOR=true                       # Fire on monitor (default: true)
WEBHOOK_ON_IGNORE=true                        # Fire on ignore (default: true)

Docker Compose Examples

Minimal Configuration

services:
  socket-firewall:
    image: socketdev/socket-registry-firewall:latest
    ports:
      - "8080:8080"
      - "8443:8443"
    environment:
      - SOCKET_SECURITY_API_TOKEN=${SOCKET_SECURITY_API_TOKEN}
    volumes:
      - ./socket.yml:/app/socket.yml:ro
      - ./ssl:/etc/nginx/ssl
    restart: unless-stopped

Full Configuration with Redis

services:
  socket-firewall:
    image: socketdev/socket-registry-firewall:latest
    ports:
      - "8080:8080"
      - "8443:8443"
    environment:
      # Core
      - SOCKET_SECURITY_API_TOKEN=${SOCKET_SECURITY_API_TOKEN}
      - SOCKET_FAIL_OPEN=true
      - SOCKET_FAIL_OPEN_UNSCANNED=true
      - SOCKET_CACHE_TTL=600
      
      # Redis
      - REDIS_ENABLED=true
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - REDIS_PASSWORD=${REDIS_PASSWORD}
      - REDIS_TTL=86400
      
      # Performance
      - WORKER_PROCESSES=4
      - WORKER_CONNECTIONS=8192
      
      # Corporate proxy
      - SOCKET_OUTBOUND_PROXY=http://proxy.company.com:3128
      - SOCKET_NO_PROXY=localhost,127.0.0.1
      
    volumes:
      - ./socket.yml:/app/socket.yml:ro
      - ./ssl:/etc/nginx/ssl
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-fk", "https://localhost:8443/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    depends_on:
      - redis

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data
    restart: unless-stopped

volumes:
  redis-data:

With Splunk Integration

services:
  socket-firewall:
    image: socketdev/socket-registry-firewall:latest
    ports:
      - "8080:8080"
      - "8443:8443"
    environment:
      - SOCKET_SECURITY_API_TOKEN=${SOCKET_SECURITY_API_TOKEN}
      
      # Splunk
      - SPLUNK_ENABLED=true
      - SPLUNK_HEC_URL=https://splunk.company.com:8088/services/collector/event
      - SPLUNK_HEC_TOKEN=${SPLUNK_HEC_TOKEN}
      - SPLUNK_INDEX=security
      - SPLUNK_SOURCE=socket-firewall
      
    volumes:
      - ./socket.yml:/app/socket.yml:ro
      - ./ssl:/etc/nginx/ssl
    restart: unless-stopped

Health Checks

The firewall exposes a health endpoint at /health:

curl -k https://localhost:8443/health

Response:

SocketFirewall/1.1.94 - Health OK - npm (registry.npmjs.org)

The response is plain text (Content-Type: text/plain) and includes the firewall version, registry name, and domain.

HTTP status codes:

200 OK - Firewall is healthy
503 Service Unavailable - Firewall is unhealthy (configuration error)

Docker healthcheck:

healthcheck:
  test: ["CMD", "curl", "-fk", "https://localhost:8443/health"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 10s
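
Deployment scripts sometimes need to wait for the firewall and record which version came up. The sketch below parses the plain-text response shown above and polls the endpoint with curl. It assumes the response keeps the `SocketFirewall/<version> - Health OK` shape of the sample; the helper names, retry count, and sleep interval are illustrative.

```shell
# health_version: read a /health response body on stdin and print the firewall version.
# Prints nothing if the body does not match the healthy format.
health_version() {
  sed -n 's|^SocketFirewall/\([0-9][0-9.]*\) - Health OK.*|\1|p'
}

# wait_healthy [URL] [TRIES]: poll the health endpoint until it returns 200,
# then print the reported version. Returns 1 if it never becomes healthy.
wait_healthy() {
  local url="${1:-https://localhost:8443/health}"
  local tries="${2:-10}"
  local i body
  for i in $(seq 1 "$tries"); do
    # -f makes curl fail on 503, -k accepts the self-signed certificate.
    if body=$(curl -fsk --max-time 5 "$url"); then
      printf '%s\n' "$body" | health_version
      return 0
    fi
    sleep 3
  done
  return 1
}
```

Usage: wait_healthy https://localhost:8443/health 20 || exit 1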

Complete Configuration Example

socket.yml:

# Core Socket settings
socket:
  api_url: https://api.socket.dev
  fail_open: true
  fail_open_unscanned: true
  outbound_proxy: http://proxy.company.com:3128
  no_proxy: localhost,127.0.0.1,internal.company.com
  api_ssl_verify: false
  api_ssl_ca_cert: /etc/ssl/certs/corporate-ca.crt
  upstream_ssl_verify: false

# Ports
ports:
  http: 8080
  https: 8443

# Deployment mode
config_mode: upstream

# Path-based routing with auto-discovery
path_routing:
  enabled: true
  domain: socket-firewall.company.com
  mode: artifactory
  
  private_registry:
    api_url: https://artifactory.company.com/artifactory
    api_key: ${ARTIFACTORY_API_KEY}
    interval: 5m
    exclude_pattern: "(tmp|test|snapshot)-.*"

# Caching
cache:
  ttl: 600

redis:
  enabled: true
  host: redis.company.com
  port: 6380
  password: ${REDIS_PASSWORD}
  ttl: 86400
  ssl: true
  ssl_verify: true
  ssl_ca_cert: /etc/redis/ssl/ca-cert.pem

# Performance
nginx:
  worker_processes: 8
  worker_connections: 16384

proxy:
  connect_timeout: 120
  send_timeout: 300
  read_timeout: 300

# Advanced features (v1.1.108+)
metadata_filtering:
  enabled: true
  filter_blocked: true
  filter_warn: false
  include_unchecked_versions: true
  max_versions: 100
  cache_ttl: 3600
  batch_size: 4000
  max_body_size: 500m

# Per-ecosystem recentlyPublished override (v1.1.134+)
# recently_published_enabled_ecosystems:
#   - npm
#   - pypi

splunk:
  enabled: true
  hec_url: https://splunk.company.com:8088/services/collector/event
  hec_token: ${SPLUNK_HEC_TOKEN}
  index: security
  source: socket-firewall
  sourcetype: socket:firewall:event
  ssl_verify: true
