After creating your experiment, define the metrics, feature flag, and randomization settings.
Set decision metrics
To define the metrics that measure the outcome of your experiment:
Use the Calculate metrics by dropdown to select the subject type.
To define a custom subject type, select Create subject type from the dropdown.
Click the Primary metric button to open the picker:
Select a primary metric for the outcome you want to measure.
(Optional) Click the Certified or Non-certified tab to filter the list.
(Optional) Click Create Metric to define a new metric. For setup instructions, see Create Experiment Metrics.
(Optional) Click the Secondary metrics button to add guardrail metrics, which monitor unintended effects of the experiment on other areas such as performance, engagement, or revenue.
Run a sample size calculation
The sample size calculator estimates the number of users and the duration needed to detect a meaningful effect. You choose an entry point (the event that assigns users to the experiment), and Datadog uses the volume of traffic to that event to produce the estimate.
To run the calculation:
In the Run a sample size calculation (optional) section, click the sample size calculator link to open the side panel.
Expand Calculation details. Your primary and secondary metrics appear under Metrics.
Use the Entry point dropdown to select the event that assigns users to the experiment, such as viewing a checkout page or clicking an add-to-cart button. Datadog uses this event to estimate traffic volume.
(Optional) Under Filter entry point, narrow the entry point’s audience:
Click + Filter and select a property from the picker. If you do not see the property you need, type the property name in the Custom property field and click Add.
In the filter row that appears, modify the operator as needed and select a value from the dropdown.
(Optional) Click + Filter to add more rows. Between rows, use the dropdown to select "or" or "and" to set how the filters combine.
Set the Number of variants and Traffic exposure.
Expand Additional inputs, then choose the statistical Power and enter a Target experiment duration in weeks.
The Target experiment duration value must be 1 or an even number because the calculator estimates MDE values and expected user counts at 1-, 2-, 4-, 6-, and 8-week intervals.
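Datadog performs this calculation for you, but the relationship between minimum detectable effect (MDE), variance, power, and sample size follows the standard power formula for a two-sided test. A simplified sketch (function name and defaults are illustrative, not Datadog's implementation):

```python
import math
from statistics import NormalDist

def per_variant_sample_size(mde, sigma, alpha=0.05, power=0.8):
    """Approximate users needed per variant for a two-sided z-test.

    Halving the MDE quadruples the required sample size, which is why
    small lifts need long experiments.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha=0.05
    z_power = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    return math.ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / mde ** 2)

# Example: detect a 1-point lift on a 10% conversion rate at 80% power.
p = 0.10
n = per_variant_sample_size(mde=0.01, sigma=math.sqrt(p * (1 - p)))
```

Comparing such an estimate for each of your metrics against your actual traffic volume is what lets the calculator report expected MDEs at the 1-, 2-, 4-, 6-, and 8-week marks.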
Configure randomization
Randomize your users and split traffic across your experiment variants.
After you select a feature flag, Datadog pre-populates the randomization settings based on the flag’s configuration.
The randomization settings you configure here have the following effect after you launch your experiment:
Datadog adds a targeting rule to the selected feature flag.
If multiple experiments share the same flag, Datadog evaluates traffic based on the order of the flag's targeting rules. You can reorder targeting rules in the confirmation dialog before launching your experiment.
To configure randomization:
Select the Environment for your experiment from the dropdown.
Under Targeting rules, configure a filter to target users based on custom attributes (for example, user role or subscription tier) that you set in your evaluation context:
Click Add Filter. For the IF row, enter an attribute and value, and select an operator from the dropdown.
(Optional) Refine your targeting rule:
To add an AND row within the same filter, click Add Condition.
To add another filter joined by OR, click Add Filter.
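The resulting rule is a set of OR-joined filters, each made of AND-joined conditions, evaluated against the attributes in a user's evaluation context. A generic sketch of that evaluation logic (the attribute names and operator set here are illustrative, not Datadog's exact schema):

```python
def matches(context, filters):
    """filters: OR-joined groups; each group is a list of AND conditions."""
    def holds(cond):
        value = context.get(cond["attribute"])
        if cond["op"] == "equals":
            return value == cond["value"]
        if cond["op"] == "one_of":
            return value in cond["value"]
        return False
    return any(all(holds(c) for c in group) for group in filters)

rule = [
    # Filter 1: role equals "admin" AND tier equals "pro"
    [{"attribute": "role", "op": "equals", "value": "admin"},
     {"attribute": "tier", "op": "equals", "value": "pro"}],
    # OR Filter 2: tier is one of ["enterprise"]
    [{"attribute": "tier", "op": "one_of", "value": ["enterprise"]}],
]
```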
Under Variants, use the Randomize users and split traffic dropdown to choose Equally (recommended) or Custom. This sets how Datadog splits traffic between your variants. Each user sees only their assigned variant throughout the experiment.
If you select Custom, enter a percentage for each variant. Percentages must sum to 100%.
Under Traffic exposure, set the percentage of users matching your targeting rules to include in the experiment.
To gradually ramp up experiment traffic instead of launching to all users at once:
In the Randomization section, click Add Rollout Steps and select a preset step configuration from the dropdown (for example, 3 steps from 5% to 100%).
Adjust the Traffic exposure percentage for each step as needed.
Next to Scheduled rollout by holding between steps for, use the two dropdowns to select a number and a time unit (for example, 1 and days). This sets how long each step runs before advancing.
At each rollout step, Datadog samples a percentage of eligible users to include in the experiment. Users outside the sample still see the default (control) experience, but Datadog does not include them in experiment results.
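The behavior described above — stable per-user assignment, weighted variant splits, and exposure sampling — is typically implemented with deterministic hashing, so a user's variant never changes between sessions or rollout steps. A generic sketch of the idea (not Datadog's actual algorithm):

```python
import hashlib

def assign_variant(user_id, experiment_key, variants, exposure=1.0):
    """Deterministically bucket a user: same user + experiment -> same result.

    variants: list of (name, weight) pairs whose weights sum to 1.0.
    exposure: fraction of eligible users to include in the experiment.
    """
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform in [0, 1]
    if bucket >= exposure:
        return None  # outside exposure: default experience, excluded from results
    point = bucket / exposure  # rescale so variant weights still apply
    cumulative = 0.0
    for name, weight in variants:
        cumulative += weight
        if point < cumulative:
            return name
    return variants[-1][0]

splits = [("control", 0.5), ("treatment", 0.5)]
```

Because the hash input includes the experiment key, raising the exposure percentage at each rollout step only adds new users to the experiment; it never reshuffles users who were already assigned below the previous threshold.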
Set notifications
Route notifications to the right people as the experiment progresses.
In the Notifications section, use the Recipients dropdown to select who receives notifications about experiment life cycle events, such as results reaching statistical significance or Datadog detecting an issue.
Choose a statistical analysis plan
Configure how Datadog calculates statistical significance for your experiment.
If your organization has configured default settings, a COMPANY DEFAULT badge appears and Datadog pre-populates the settings.
To modify the statistical analysis plan:
Expand the Statistical analysis plan section.
Select a method from the Confidence interval method dropdown.
If you select Bayesian, choose a Standard Deviation of Prior from the dropdown.
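The prior's standard deviation controls how strongly the estimated lift is shrunk toward no effect: a narrow prior pulls noisy results toward zero, a wide prior leaves the observed data nearly untouched. A textbook conjugate normal-normal update illustrates the mechanics (this is a generic sketch, not necessarily Datadog's exact model):

```python
def posterior_lift(observed_lift, std_error, prior_sd, prior_mean=0.0):
    """Combine an observed lift with a normal prior, weighting by precision."""
    w_obs = 1.0 / std_error ** 2     # precision of the measurement
    w_prior = 1.0 / prior_sd ** 2    # precision of the prior
    mean = (observed_lift * w_obs + prior_mean * w_prior) / (w_obs + w_prior)
    sd = (w_obs + w_prior) ** -0.5
    return mean, sd
```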
Select a percentage from the Confidence level dropdown.
To disable CUPED, toggle off CUPED calculation. CUPED is enabled by default and uses pre-experiment data from each subject to reduce the variance of the metrics and improve experiment sensitivity.
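Conceptually, CUPED regresses each subject's post-experiment value on their pre-experiment value and subtracts the predictable component, leaving less noise for the treatment effect to stand out against. A minimal sketch of the adjustment (Datadog's implementation details are internal):

```python
def cuped_adjust(post, pre):
    """Remove the component of each post-experiment value predicted by the
    subject's pre-experiment value, using theta = cov(pre, post) / var(pre).
    The mean is preserved; only the variance shrinks."""
    mx = sum(pre) / len(pre)
    my = sum(post) / len(post)
    cov = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    var = sum((x - mx) ** 2 for x in pre)
    theta = cov / var
    return [y - theta * (x - mx) for x, y in zip(pre, post)]
```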
To reduce the risk of false positives, toggle on Multiple testing correction. This setting adjusts for the increased risk across multiple metric comparisons, producing more conservative results.
This setting is not available when you use the Bayesian method.
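The source does not specify which correction Datadog applies; the Bonferroni correction is the simplest example of the idea — with more metric comparisons, each one must clear a stricter significance bar:

```python
def bonferroni(p_values, alpha=0.05):
    """Each of m comparisons must clear alpha / m to remain significant."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```

For example, a p-value of 0.04 is significant at alpha = 0.05 on its own, but not once three metrics share the same budget (threshold 0.0167).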
Click Reset to Default to restore the default settings. If your organization has configured a company default, Datadog restores those settings instead.
Add split-by exploration dimensions
Segment your experiment results by properties (also called attributes) from your evaluation context.
To configure split-by dimensions:
Expand the Split-by exploration dimensions section.
Select properties from the Properties to compute for dimensional analysis dropdown. Available properties have the context. prefix.
If you do not see the property you need:
Type the property name in the dropdown field, prefixed with context. (for example, context.team). Then, click Add custom property to open the Split-by exploration dimensions dialog.
Verify the Column Name matches the property name you entered.
Select the property Type from the dropdown.
Click Save. The custom property appears in the Properties to compute for dimensional analysis dropdown.
Launch your experiment
To launch your experiment:
Click Start Experiment to open the Confirm starting the experiment dialog.
In the dialog, review the environment, feature flag, and the flag’s targeting rules for accuracy.
If multiple experiments share the same flag, use the up and down arrows on each targeting rule to reorder them.
Click Start Experiment & Enable Flag to launch the experiment.
Launching the experiment opens the Flag & Exposures page. Verify your configuration is live:
Review the Exposure balance check to confirm your variants are split at the percentages you configured.
Click View Exposures Log to monitor real-time user enrollment.