Sensitivity analysis

Sensitivity analysis is a technique used to determine how variations in the input parameters of a model or system affect its output, thereby quantifying the uncertainty and robustness of results under different assumptions.^[1] By systematically altering one or more independent variables while holding others constant, it identifies which factors most significantly influence outcomes, aiding in decision-making across diverse fields.^[2] This method is essential for evaluating model reliability, prioritizing key inputs, and mitigating risks associated with uncertain data or parameters.^[3]

Key approaches to sensitivity analysis include local sensitivity analysis, which examines the impact of small changes in inputs around a nominal or base-case point, and global sensitivity analysis, which explores the effects across the full range of input variations to account for interactions and non-linearities.^[1] Common techniques within global methods encompass variance-based approaches like Sobol indices, which decompose output variance attributable to individual inputs or their combinations, and screening methods such as the Morris method for efficient factor prioritization in high-dimensional models.^[4] These methods often rely on sampling strategies, including Monte Carlo simulations, to propagate input uncertainties through the model.^[5]

Applications of sensitivity analysis span multiple disciplines, including financial modeling, where it assesses how changes in variables like interest rates or sales volumes impact metrics such as net present value or return on investment.^[2] In engineering and environmental sciences, it optimizes designs for systems like pipelines or water networks by identifying critical parameters affecting performance and reliability.^[1] Additionally, in observational health research, sensitivity analysis tests the robustness of findings against biases, such as unmeasured confounding, by varying assumptions in comparative effectiveness studies.^[6] Overall, it enhances model credibility, supports policy decisions, and informs resource allocation by highlighting influential factors in complex systems.^[3]

Introduction

Definition and Purpose

Sensitivity analysis (SA) is a methodological framework used to quantify how variations or uncertainties in the input parameters of a computational model influence the variability in its output.^[7] It systematically evaluates the relative importance of different inputs by apportioning the uncertainty in the model output to sources of uncertainty in the inputs, thereby revealing the robustness and reliability of the model under varying conditions.^[8] The primary purposes of SA include identifying the most influential parameters that drive output changes, which aids in prioritizing research efforts and resource allocation; reducing model complexity by fixing non-influential inputs; supporting risk assessment by highlighting potential vulnerabilities in system predictions; and facilitating model validation through verification of key assumptions and behaviors.^[3] These applications make SA essential in fields such as engineering, environmental science, and policy analysis, where models inform critical decisions.^[9] For example, consider a simple linear model $ Y = aX + b $, where varying the input $ X $ while holding $ a $ and $ b $ constant demonstrates direct proportionality: the output $ Y $ changes linearly with $ X $, with the sensitivity coefficient $ a $ indicating the magnitude of influence.^[7] This illustrates how SA can straightforwardly assess input-output relationships in basic systems. SA is distinct from uncertainty analysis (UA), which focuses on propagating input uncertainties to characterize the overall distribution of output uncertainty, whereas SA specifically examines the contributions of individual or groups of inputs to that output variability.^[8]

Historical Development

The origins of sensitivity analysis can be traced to the mid-20th century, with foundational work in statistics and engineering through factorial designs and response surface methodology introduced in the 1950s by George E. P. Box and others, which allowed for systematic examination of how variations in inputs affect model outputs.^[10] By the 1960s, the concept began appearing in decision theory contexts, where it supported robust decision-making under uncertainty, including early explorations of expected value of perfect information in finite decision problems.^[11] In the 1970s, engineering applications advanced significantly, particularly in chemistry and nonlinear modeling, with the development of the Fourier Amplitude Sensitivity Test (FAST) by Robert I. Cukier and colleagues, enabling the identification of influential parameters through spectral analysis of model responses. Concurrently, C.S. Holling and C.J. Walters pioneered global perspectives in 1978 by advocating simultaneous variation of multiple parameters to capture interactions, a departure from isolated perturbations.^[10] The 1980s marked a pivotal expansion with the maturation of local sensitivity methods, driven by rising computational capabilities that facilitated derivative-based techniques and random sampling for exploring model behavior around nominal points.^[10] This era emphasized one-at-a-time analyses and regression approaches in engineering and systems modeling, as reviewed in subsequent works like Helton et al. (2006).^[10] Influential contributions included Bradley Efron and Charles Stein's 1981 advancements in variance decomposition, providing theoretical underpinnings for partitioning output uncertainty.^[10] The 1990s ushered in the rise of global variance-based techniques, transforming sensitivity analysis into a tool for comprehensive uncertainty attribution. Ilya M. 
Sobol' introduced his seminal indices in 1993, formalizing the decomposition of model output variance into contributions from individual inputs and their interactions, applicable to nonlinear and high-dimensional systems. Andrey Saltelli emerged as a key figure in the late 1990s, developing efficient Monte Carlo-based estimators and ANOVA-like decompositions that made global methods computationally feasible, as detailed in his collaborative works around 2000.^[10] In the 2000s, sensitivity analysis integrated closely with uncertainty quantification frameworks, particularly in environmental science and risk assessment, where Saltelli's 2004 handbook standardized variance-based practices and highlighted their role in model validation. This period saw widespread adoption in fields like climate modeling, emphasizing robust propagation of input uncertainties.^[10] By the 2010s, the field advanced with moment-independent measures and open-source software, broadening accessibility.^[10] Entering the 2020s, evolution accelerated toward AI-enhanced sensitivity analysis for high-dimensional data, leveraging machine learning to optimize sampling and interpret complex interactions, as evidenced in 2024 reviews of AI-driven methodologies that improve efficiency in black-box models.^[12]^[10] In 2025, the field continued to progress, highlighted by the Eleventh International Conference on Sensitivity Analysis of Model Output (SAMO 2025) held in April in Grenoble, France, focusing on advancements from theory to applications in uncertainty quantification.^[13]

Theoretical Foundations

Mathematical Formulation

Sensitivity analysis is grounded in the study of input-output relationships within mathematical models. Consider a general model where the output $ Y $ is a function of a vector of input factors $ \mathbf{X} = (X_1, X_2, \dots, X_k) $, expressed as $ Y = f(\mathbf{X}) $. Here, $ \mathbf{X} $ represents the inputs, which may be deterministic or stochastic, and $ f $ denotes either an analytical function or a black-box model evaluated computationally.^[14]^[15] For local sensitivity analysis, the impact of individual inputs is assessed near a nominal point $ \mathbf{x}^* $. The local sensitivity of the output with respect to the $ i $-th input is given by the partial derivative:

S_i = \left. \frac{\partial Y}{\partial X_i} \right|_{\mathbf{X} = \mathbf{x}^*}

This measure quantifies the rate of change in $ Y $ for infinitesimal variations in $ X_i $, holding other inputs fixed, and is particularly useful for models where $ f $ is differentiable.^[16]^[17] In preliminary screening approaches, such as the elementary effects method, a finite perturbation is used to approximate local effects. The elementary effect for the $ i $-th factor is defined as:

EE_i(\mathbf{x}) = \frac{f(\mathbf{x} + \Delta \mathbf{e}_i) - f(\mathbf{x})}{\Delta}

where $ \Delta $ is a small finite increment within the input domain, and $ \mathbf{e}_i $ is the $ i $-th unit vector. Multiple such effects are computed along randomized trajectories to estimate mean and variance, identifying influential factors.^[18] Global sensitivity analysis extends this framework by considering the full range of input variability, assuming $ \mathbf{X} $ follows a joint probability density function $ p(\mathbf{x}) $ over a probability space. The inputs are typically modeled as independent random variables, and the output $ Y $ is analyzed through integrals over the input domain. A foundational approach decomposes the unconditional variance of $ Y $ as:

\mathrm{Var}(Y) = \sum_{i=1}^k V_i + \sum_{i < j} V_{ij} + \cdots + V_{12 \dots k}, \qquad V_i = \mathrm{Var}\left( E(Y \mid X_i) \right), \quad V_{ij} = \mathrm{Var}\left( E(Y \mid X_i, X_j) \right) - V_i - V_j

This decomposition isolates main effects and interactions, with the first-order term $ \mathrm{Var}(E(Y \mid X_i)) $ representing the expected variance attributable to $ X_i $ alone, averaged over other inputs. Higher-order terms capture interactions, providing a basis for variance-based indices.^[19]^[15] Such indices, including Sobol' indices, quantify the fractional contributions to total variance and are explored in subsequent method-specific sections.
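The first-order term can be illustrated numerically. The sketch below is not taken from the cited sources: it assumes a toy additive model $ Y = X_1 + 2 X_2 $ with independent uniform inputs on $ [0, 1] $, for which $ \mathrm{Var}(E(Y \mid X_1)) / \mathrm{Var}(Y) = 1/5 $ and $ \mathrm{Var}(E(Y \mid X_2)) / \mathrm{Var}(Y) = 4/5 $. The helper name `first_order_by_binning` and the bin count are illustrative choices; the conditional mean is approximated by averaging the output within equal-count bins of each input.

```python
import numpy as np

def first_order_by_binning(x, y, n_bins=50):
    """Approximate Var(E[Y | X_i]) / Var(Y) by averaging Y within bins of X_i."""
    order = np.argsort(x)                       # sort samples by the input value
    groups = np.array_split(y[order], n_bins)   # equal-count bins along X_i
    bin_means = np.array([g.mean() for g in groups])
    bin_weights = np.array([g.size for g in groups]) / y.size
    # Weighted variance of the bin means approximates Var(E[Y | X_i])
    var_cond_mean = np.sum(bin_weights * (bin_means - y.mean()) ** 2)
    return var_cond_mean / y.var()

rng = np.random.default_rng(0)
n = 200_000
x1 = rng.uniform(size=n)
x2 = rng.uniform(size=n)
y = x1 + 2.0 * x2            # toy additive model (assumption for illustration)

s1 = first_order_by_binning(x1, y)
s2 = first_order_by_binning(x2, y)
print(f"S1 ~ {s1:.3f}, S2 ~ {s2:.3f}")
```

Because the toy model is additive, the two first-order fractions sum to approximately one; for a model with interactions the remainder would be carried by the higher-order terms.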

Key Concepts and Terminology

In sensitivity analysis, input factors, also referred to as parameters or variables, represent the uncertain components of a computational model that drive its outputs. These factors, such as physical constants, environmental variables, or economic parameters, are characterized by probability distributions or specified ranges to account for their inherent uncertainty.^[14] The output quantities of interest (QoI) are the specific scalar or functional results from the model that analysts seek to evaluate, such as risk levels, system performance metrics, or predicted doses in environmental simulations; the goal of sensitivity analysis is to apportion the uncertainty in these QoI to the input factors.^[14] Sensitivity indices provide normalized quantitative measures, typically ranging from 0 to 1, of an input factor's contribution to the variance or uncertainty in the QoI, enabling the ranking of factors by their relative importance.^[14]

A key distinction exists between first-order effects and total-order effects. First-order effects quantify the influence of a single input factor acting on its own, averaged over the variation of all other factors, thereby capturing only the main effect without interactions.^[14] In contrast, total-order effects measure the comprehensive impact of an input factor, incorporating its main effect plus all higher-order interactions with other factors, which is essential for understanding the full propagation of uncertainty in complex systems.^[14]

The concepts of linearity and nonlinearity are fundamental to selecting appropriate sensitivity methods. 
In linear models, the relationship between inputs and outputs is additive and proportional, resulting in constant sensitivities that can be assessed locally without considering interactions.^[14] Nonlinear models, however, feature varying sensitivities across the input space, often with significant interactions, requiring global methods to fully characterize the effects and avoid underestimation of factor importance.^[14] Efficient exploration of the input space relies on sampling techniques like Latin Hypercube Sampling (LHS), a stratified method that divides each input factor's probability distribution into equal intervals and selects one sample from each to ensure even coverage, thereby reducing the number of model evaluations needed compared to random sampling while maintaining representativeness. Assessing convergence ensures the reliability of sensitivity results, typically through criteria such as the stabilization of sensitivity indices or the narrowing of their confidence intervals (e.g., at 95% level) as sample size increases, confirming that further sampling yields negligible changes.^[20]
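The stratification idea behind Latin Hypercube Sampling can be sketched in a few lines. This is a minimal illustration on the unit hypercube, not a reference implementation: the function name `latin_hypercube` is hypothetical, and non-uniform input distributions would additionally require mapping each column through the inverse cumulative distribution function of the corresponding input.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng):
    """Return an (n_samples, n_dims) Latin hypercube design on [0, 1)^n_dims."""
    # Random position inside each stratum
    jitter = rng.uniform(size=(n_samples, n_dims))
    # Independent random ordering of the strata 0..n_samples-1 for every dimension
    strata = np.column_stack(
        [rng.permutation(n_samples) for _ in range(n_dims)]
    )
    # Each column places exactly one point in each of the n_samples intervals
    return (strata + jitter) / n_samples

rng = np.random.default_rng(42)
design = latin_hypercube(8, 3, rng)
print(design.round(3))
```

In practice a library routine such as SciPy's `scipy.stats.qmc.LatinHypercube` may be preferable; the sketch only makes the "one sample per interval" property explicit.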

Classification of Sensitivity Analysis Methods

Local versus Global Approaches

Local sensitivity analysis (LSA) examines the impact of input parameters on model outputs by perturbing them individually or in small combinations around a specific nominal or baseline point in the parameter space.^[21] This approach assumes that the model behaves linearly or nearly so near the reference point and that parameters act independently without significant interactions.^[22] It is particularly suited for scenarios where the system operates close to known operating conditions, such as in preliminary model calibration or when computational resources are limited.^[23] In contrast, global sensitivity analysis (GSA) evaluates the effects of inputs across their entire feasible range or probability distributions, capturing nonlinearities, parameter interactions, and the propagation of uncertainties throughout the model. Unlike LSA, GSA does not rely on a single reference point but instead explores the full input domain, making it ideal for complex systems with high uncertainty or where inputs may vary widely.^[22] Seminal works emphasize that GSA provides a more robust assessment of parameter importance by accounting for the joint effects of all inputs, which LSA often overlooks. The primary distinction lies in their scope and assumptions: LSA is computationally inexpensive and offers quick insights into local behavior but can miss global interactions and yield misleading results for nonlinear models, while GSA is more comprehensive yet demands substantially higher computational effort due to extensive sampling.^[23] For instance, LSA might suffice for low-uncertainty engineering designs near nominal values, whereas GSA is preferred for risk assessment in environmental or biological models with distributed inputs.^[22] Decision criteria for selection include model linearity, parameter variability, and analysis goals—opt for LSA in resource-constrained, near-nominal cases; use GSA for thorough uncertainty quantification in nonlinear or interactive systems. 
Hybrid approaches bridge these methods by leveraging LSA for initial screening around multiple points in the parameter space, followed by GSA to confirm and explore interactions, thereby balancing efficiency and completeness. Examples include the Distributed Evaluation of Local Sensitivity Analysis (DELSA), which distributes LSA computations to approximate global effects,^[24] and variation-based hybrids that integrate local derivatives with global variance measures.^[25] These are particularly useful in high-dimensional models where pure GSA is prohibitive, allowing practitioners to refine focus before full global exploration.

Deterministic versus Probabilistic Frameworks

In deterministic sensitivity analysis, input parameters are treated as fixed values that are systematically varied, often through methods like grid searches or one-at-a-time perturbations, to evaluate the impact on model outputs without incorporating probability distributions.^[14] This framework focuses on nominal or worst-case scenarios, such as assessing how outputs change when inputs are set to upper and lower bounds, thereby identifying critical parameters for scenario testing in deterministic models.^[26] It assumes parameters have true or representative values and ignores input correlations or variability, making it suitable for exploring local effects around a baseline point but potentially overlooking broader interactions.^[14] In contrast, probabilistic sensitivity analysis models inputs as random variables following specified probability distributions, such as uniform or normal, to propagate uncertainty through the model and quantify its effects on outputs via statistical measures like expectations or variances.^[14] Techniques like Monte Carlo sampling are commonly used to generate ensembles of input combinations, enabling the assessment of risk under input variability and providing probabilistic outputs, such as confidence intervals for model predictions.^[26] This approach requires knowledge of input distributions and accounts for correlations if specified, offering a more comprehensive view of uncertainty compared to deterministic methods.^[14] The key differences between these frameworks lie in their treatment of uncertainty: deterministic analysis excels in targeted scenario exploration and is computationally simpler, ideal for preliminary assessments or when distributions are unknown, while probabilistic analysis better captures overall risk and variability, though it demands more data and resources for distribution specification.^[26] A natural transition occurs from deterministic one-at-a-time analysis to probabilistic extensions, such as 
randomized versions that sample from distributions to incorporate variability while retaining the focus on individual parameter effects.^[14] Deterministic frameworks often assume independence and linearity, potentially underestimating complex behaviors, whereas probabilistic ones necessitate accurate distributional assumptions to avoid biased uncertainty propagation.^[26]
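The contrast between the two frameworks can be made concrete with a small sketch. Everything here is assumed for illustration: the first-order decay model, the nominal values, the bounds, and the choice of uniform distributions over those bounds. The deterministic part sets each input to its lower and upper bound in turn; the probabilistic part samples both inputs and summarizes the resulting output distribution.

```python
import numpy as np

def model(k, t):
    """Hypothetical model: fraction remaining after first-order decay."""
    return np.exp(-k * t)

nominal = {"k": 0.5, "t": 2.0}
bounds = {"k": (0.4, 0.6), "t": (1.5, 2.5)}

# Deterministic: move one input to its bounds, keep the other at nominal
deterministic = {}
for name, (lo, hi) in bounds.items():
    deterministic[name] = (
        model(**{**nominal, name: lo}),
        model(**{**nominal, name: hi}),
    )

# Probabilistic: treat inputs as independent uniform random variables
# (a distributional assumption made only for this example)
rng = np.random.default_rng(1)
n = 100_000
k = rng.uniform(*bounds["k"], size=n)
t = rng.uniform(*bounds["t"], size=n)
y = model(k, t)
summary = {
    "mean": y.mean(),
    "p2.5": np.percentile(y, 2.5),
    "p97.5": np.percentile(y, 97.5),
}

print(deterministic)
print(summary)
```

The deterministic result is a set of scenario values per input, while the probabilistic result is an interval for the output that reflects joint variation of all inputs.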

Local Sensitivity Methods

One-at-a-Time (OAT) Analysis

One-at-a-time (OAT) analysis is a fundamental local sensitivity method that assesses the impact of individual input parameters on model outputs by varying one parameter while holding all others fixed at their nominal values.^[22] This approach operates within the framework of local sensitivity analysis, focusing on small perturbations around a chosen reference point to approximate partial effects.^[27] It is particularly suited for initial screening in deterministic models where interactions between parameters are assumed negligible or secondary.^[28]

The procedure begins with selecting a nominal or baseline point for all input parameters, representing the expected or central values of the system. One input is then perturbed by fixed increments, such as ±5% or ±10% of its nominal value, or across a discrete set of levels within its range, while keeping the remaining inputs constant. For each perturbation, the model is re-evaluated to compute the corresponding change in output. This process is repeated sequentially for each input parameter in turn.^[22] Sensitivity is quantified using measures like the elementary effect, often expressed as the slope of the output change relative to the input perturbation, $ \Delta Y / \Delta X_i $, or the percentage change in output per percentage change in input, $ (\Delta Y / Y) / (\Delta X_i / X_i) $, where $ Y $ is the output and $ X_i $ is the $ i $-th input.^[29] These metrics provide a direct indicator of how responsive the output is to variations in each input near the nominal point.^[22]

A practical example of OAT analysis appears in financial budgeting models, such as capital investment projects, where the net present value (NPV) serves as the output. To evaluate sensitivity to costs, the nominal cost estimate is perturbed by 5% (e.g., increased or decreased) while holding revenue projections, discount rates, and other expenses fixed; the resulting $ \Delta \mathrm{NPV} $ reveals the cost's influence on project viability. Similar applications occur in engineering models, like varying a single material property in a structural simulation to observe effects on load-bearing capacity.^[30]

OAT analysis offers key advantages, including its intuitive simplicity and low computational demand, as it requires only $ k+1 $ model evaluations for $ k $ inputs in the basic form, making it accessible for quick assessments in resource-constrained settings.^[27] However, it has notable limitations: by fixing other inputs, it overlooks interactions and nonlinear effects across the parameter space, potentially underestimating overall uncertainty and providing misleading results in complex, high-dimensional systems.^[22] Its strictly local nature also restricts validity to the vicinity of the nominal point, rendering it unsuitable for exploring global behaviors.^[27]

Extensions of OAT incorporate limited multi-way interactions through factorial designs, where small subsets of inputs (e.g., two or three) are varied simultaneously in a structured grid, such as a $ 2^k $ full factorial for $ k=2 $, to capture pairwise effects without full global exploration.^[28] This hybrid approach balances computational efficiency with improved detection of dependencies, often used as a bridge to more advanced methods.^[29]
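The OAT procedure can be sketched for the NPV example. The cash-flow model, the nominal values, and the helper names `npv` and `oat` below are hypothetical and chosen only to make the steps concrete: one baseline run, then one perturbed run per input, giving $ k+1 $ evaluations in total.

```python
def npv(cost, annual_revenue, discount_rate, years=5):
    """Hypothetical project: one upfront cost, constant yearly revenue."""
    inflows = sum(
        annual_revenue / (1 + discount_rate) ** t for t in range(1, years + 1)
    )
    return inflows - cost

# Nominal (baseline) point; the values are illustrative assumptions
nominal = {"cost": 1000.0, "annual_revenue": 300.0, "discount_rate": 0.08}

def oat(model, nominal, rel_step=0.05):
    """Perturb each input by rel_step (here +5%) with all others at nominal."""
    base = model(**nominal)
    results = {}
    for name, value in nominal.items():
        perturbed = dict(nominal, **{name: value * (1 + rel_step)})
        delta_y = model(**perturbed) - base
        results[name] = {
            "slope": delta_y / (value * rel_step),          # dY / dX_i
            "relative": (delta_y / base) / rel_step,        # (dY/Y) / (dX_i/X_i)
        }
    return base, results

base, sens = oat(npv, nominal)
for name, measures in sens.items():
    print(name, measures)
```

Ranking inputs by the absolute value of the relative measure gives a unit-free ordering, but only in the neighbourhood of the chosen nominal point.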

Derivative-Based Techniques

Derivative-based techniques in local sensitivity analysis compute the partial derivatives of the model output $ Y $ with respect to each input parameter $ X_i $ at a specific nominal point in the parameter space, quantifying how small changes in $ X_i $ affect $ Y $. These methods assume the model is locally differentiable and focus on the gradient of the response surface near the base case.^[16]

Analytical derivatives involve direct computation from the model's mathematical equations, providing exact sensitivities without additional model evaluations when the model is available in closed form. For instance, consider a simple model $ Y = X_1^2 X_2 $; the partial derivative with respect to $ X_1 $ is $ \frac{\partial Y}{\partial X_1} = 2 X_1 X_2 $, evaluated at nominal values to yield the local sensitivity. This approach is efficient for low-dimensional, explicit models in fields like engineering design.^[16]^[31]

When analytical forms are unavailable or complex, numerical approximations estimate derivatives using finite differences, which perturb the input slightly and compute the difference quotient. The forward difference formula is $ \frac{\partial Y}{\partial X_i} \approx \frac{f(\mathbf{X} + h \mathbf{e}_i) - f(\mathbf{X})}{h} $, where $ h $ is a small step size and $ \mathbf{e}_i $ is the unit vector in the $ i $-th direction; backward differences use a step of $ -h $ instead, and central differences, which combine the two at the cost of one extra evaluation per parameter, reduce the truncation error from $ O(h) $ to $ O(h^2) $. These methods require at least one additional model evaluation per parameter but are versatile for black-box simulations. Step size selection is critical to balance truncation and round-off errors; a common choice for forward differences is $ h = \sqrt{\epsilon} X_i $, where $ \epsilon $ is machine precision.^[16]^[32]

Elasticity normalizes the derivative to a dimensionless measure, defined as $ \eta_i = \frac{\partial Y}{\partial X_i} \cdot \frac{X_i}{Y} $, representing the percentage change in $ Y $ per percentage change in $ X_i $ at the nominal point. This scaling facilitates comparison across parameters with different units or scales, commonly applied in economic and ecological models.^[33]

These techniques find applications in optimization, where gradients guide parameter tuning via methods like steepest descent, and in stability analysis of dynamical systems, such as assessing eigenvalue sensitivities in control theory.^[16]^[34] Limitations include the requirement for model differentiability, which fails for discontinuous or non-smooth functions, and sensitivity to the choice of nominal point, potentially missing nonlinear behaviors away from it. Numerical methods also become costly in high dimensions, because each finite-difference gradient requires at least one additional model evaluation per input.^[16]^[35]
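As a check on the formulas above, the following sketch applies forward differences to the example model $ Y = X_1^2 X_2 $ and compares the result with the analytical gradient. The nominal point $ (3, 2) $ is an arbitrary choice, and the guard `max(abs(x_i), 1.0)` in the step size is an added assumption to avoid a zero step when an input is zero; it is not part of the rule quoted in the text.

```python
import numpy as np

def f(x):
    """Example model from the text: Y = X1^2 * X2."""
    return x[0] ** 2 * x[1]

def forward_diff_gradient(f, x, eps=np.finfo(float).eps):
    """Forward-difference estimate of dY/dX_i at x (k + 1 evaluations)."""
    x = np.asarray(x, dtype=float)
    y0 = f(x)
    grad = np.empty_like(x)
    for i in range(x.size):
        # Step proportional to sqrt(machine epsilon); the max() guard is an assumption
        h = np.sqrt(eps) * max(abs(x[i]), 1.0)
        x_step = x.copy()
        x_step[i] += h
        grad[i] = (f(x_step) - y0) / h
    return grad

x_nom = np.array([3.0, 2.0])                 # illustrative nominal point
grad = forward_diff_gradient(f, x_nom)
analytic = np.array([2 * x_nom[0] * x_nom[1], x_nom[0] ** 2])
elasticity = grad * x_nom / f(x_nom)         # eta_i = dY/dX_i * X_i / Y

print("numerical :", grad)
print("analytical:", analytic)
print("elasticity:", elasticity)
```

For this power-law model the elasticities equal the exponents (2 for $ X_1 $, 1 for $ X_2 $) regardless of the nominal point, which is why elasticity is a convenient scale-free summary.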

Regression Analysis

Regression analysis serves as a local sensitivity method that infers the influence of input variables on model outputs by fitting a statistical regression model to input-output data generated from simulations or observations. In this approach, a multiple linear regression model is constructed in the form $ Y = \beta_0 + \sum \beta_i X_i + \epsilon $, where $ Y $ represents the model output, $ X_i $ are the input variables, $ \beta_i $ are the regression coefficients quantifying the sensitivity of $ Y $ to each $ X_i $, $ \beta_0 $ is the intercept, and $ \epsilon $ is the error term.^[3] The absolute values of the $ \beta_i $ coefficients indicate the relative importance of each input, with larger magnitudes signifying greater sensitivity.

To enable comparability across inputs with different scales or units, standardized regression coefficients (SRCs) are employed by scaling the inputs and output to zero mean and unit variance before fitting the model. These SRCs, which range from -1 to 1, measure the change in output standard deviation per standard deviation change in an input, while holding other inputs constant, and their squared values approximate the fraction of output variance attributable to each input. For instance, in an ecological predator-prey model based on the Lotka-Volterra equations, regression analysis was applied to assess species abundances as outputs regressed against environmental parameters such as growth rates ($ r $), predation rates ($ \alpha $), and mortality rates ($ m $, $ \delta $); SRCs revealed that the predator mortality rate $ \delta $ explained up to 82.4% of variance in prey abundance at equilibrium.

Stepwise regression selection enhances this method by iteratively adding or removing inputs based on statistical criteria, such as improvements in the coefficient of determination $ R^2 $ or t-tests for coefficient significance, to identify the subset of inputs most influential on the output.^[3] This process prioritizes inputs that significantly contribute to explaining output variance, often achieving high $ R^2 $ values (e.g., over 0.7) indicative of adequate model fit.

The validity of regression-based sensitivity analysis relies on key assumptions, including linearity between inputs and output, independence among inputs, normally distributed residuals, and no severe multicollinearity. Violations of these, such as nonlinear relationships or correlated inputs, can bias coefficient estimates and suggest the limitations of local methods, necessitating a shift to global sensitivity approaches that explore the full input space.^[3]
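Standardized regression coefficients can be computed with an ordinary least-squares fit on standardized data. The sketch below uses a synthetic linear model with made-up coefficients and input scales (it is unrelated to the Lotka-Volterra study mentioned above) so that the expected SRCs are known: with independent inputs, $ \mathrm{SRC}_i = \beta_i \sigma_{X_i} / \sigma_Y $.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Three independent inputs with different scales (illustrative assumption)
scales = np.array([1.0, 2.0, 0.5])
X = rng.normal(size=(n, 3)) * scales

# Synthetic linear model with noise; the third input has no effect
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Standardize inputs and output to zero mean and unit variance
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
ys = (y - y.mean()) / y.std()

# Least-squares fit; no intercept is needed after centering
src, *_ = np.linalg.lstsq(Xs, ys, rcond=None)

# Coefficient of determination of the standardized fit
residuals = ys - Xs @ src
r2 = 1.0 - np.sum(residuals ** 2) / np.sum(ys ** 2)

print("SRC    :", src.round(3))
print("SRC^2  :", (src ** 2).round(3))
print("R^2    :", round(r2, 3))
```

With independent inputs the squared SRCs sum approximately to $ R^2 $, so a low $ R^2 $ signals that the linear ranking leaves much of the output variance unexplained.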

Global Sensitivity Methods

Screening Methods (e.g., Morris)

Screening methods in sensitivity analysis aim to efficiently identify the most influential input factors among many in complex models, particularly when computational resources are limited. These techniques prioritize qualitative or semi-quantitative assessment over precise quantification, making them suitable for initial factor prioritization in global sensitivity contexts. The Morris method, introduced in 1991, is a prominent screening approach that evaluates elementary effects through randomized one-factor-at-a-time perturbations along multiple trajectories in the input space. For a model $ y = f(\mathbf{x}) $ with $ p $ input factors, the elementary effect for the $ i $-th factor at a point $ \mathbf{x} $ in the input space is defined as

EE_i(\mathbf{x}) = \frac{f(\mathbf{x} + \Delta \cdot \mathbf{e}_i) - f(\mathbf{x})}{\Delta},

where $ \Delta $ is a finite increment chosen as a multiple of the grid spacing on the standardized unit hypercube (for a grid with $ \ell $ levels, Morris's recommended choice is $ \Delta = \ell / (2(\ell - 1)) $), and $ \mathbf{e}_i $ is the $ i $-th unit vector. This measures the local sensitivity at various points, capturing both main effects and potential nonlinearities or interactions by varying the base point $ \mathbf{x} $. The mean absolute elementary effect, $ \mu_i^* = \frac{1}{r} \sum_{j=1}^r |EE_{i,j}| $, quantifies the overall influence of factor $ i $ across $ r $ replicates, while the standard deviation, $ \sigma_i = \sqrt{\frac{1}{r} \sum_{j=1}^r \left( EE_{i,j} - \frac{1}{r} \sum_{k=1}^r EE_{i,k} \right)^2} $, indicates nonlinearity or interactions when it is large relative to $ \mu_i^* $; a ratio $ \sigma_i / \mu_i^* $ above roughly 0.1 is sometimes used as a rule-of-thumb threshold.

The procedure involves generating $ r $ random trajectories, each consisting of $ p $ steps (for a total of $ r(p+1) $ model evaluations), where each step perturbs one randomly selected factor while keeping others fixed. The input space is discretized into a grid to ensure perturbations align with grid points, and trajectories are designed to cover the space efficiently without overlap. Factors with high $ \mu_i^* $ and low $ \sigma_i $ are deemed influential with largely linear, additive effects, whereas high $ \sigma_i $ suggests nonlinear or higher-order effects. This distribution of effects allows ranking factors for further analysis.

One key advantage of the Morris method is its low computational demand, requiring only 10 to 40 evaluations per factor for $ p $ up to 20, compared to thousands for more exhaustive global methods, enabling its use in high-dimensional problems. It effectively detects both main effects and interactions, providing a profile for each factor that guides model simplification.

As an alternative for deterministic screening, sequential bifurcation iteratively splits groups of candidate factors in half, evaluating the aggregated effect of each group and discarding groups with negligible impact, assuming additivity and no interactions.^[36] This method is particularly efficient for models with many factors of which only a few matter, requiring on the order of $ \log_2 p $ evaluations per important factor.

Variance-Based Decomposition (e.g., Sobol')

Variance-based decomposition methods quantify the contribution of input variables to the output uncertainty of a model by partitioning the total variance of the output into components attributable to individual inputs and their interactions. These approaches, rooted in the functional ANOVA (analysis of variance) framework, assume that the model's output $ Y = f(\mathbf{X}) $ can be expressed as a sum of functions depending on subsets of the input vector $ \mathbf{X} = (X_1, \dots, X_p) $, where the inputs are independent and uniformly distributed over the unit hypercube. This decomposition enables a complete attribution of variance, making it suitable for global sensitivity analysis in complex systems such as engineering simulations and environmental modeling.^[37] The Sobol' decomposition, named after Ilya M. Sobol', expands the output as:

Y = f_0 + \sum_i f_i(X_i) + \sum_{i<j} f_{ij}(X_i, X_j) + \cdots + f_{12\dots p}(X_1, \dots, X_p),

where $ f_0 = E(Y) $ is the mean, and each $ f_\mathbf{u} $ for a subset $ \mathbf{u} \subseteq \{1, \dots, p\} $ has zero conditional expectation given any proper subset of its variables. The total variance $ \text{Var}(Y) $ is then additively decomposed as:

\text{Var}(Y) = \sum_i V_i + \sum_{i<j} V_{ij} + \cdots + V_{12\dots p},

with $ V_\mathbf{u} = \text{Var}(E(Y \mid \mathbf{X}_\mathbf{u})) - \sum_{\mathbf{v} \subsetneq \mathbf{u}} V_\mathbf{v} $ representing the variance due to interactions among the variables in $ \mathbf{u} $. This formulation ensures orthogonality and completeness, allowing all variance to be accounted for without overlap.^[37]^[38] From this decomposition, sensitivity indices are derived to measure the relative importance of inputs. The first-order index for input $ X_i $, denoted $ S_i $, captures the main effect:

S_i = \frac{\text{Var}(E(Y \mid X_i))}{\text{Var}(Y)},

indicating the fraction of total output variance explained solely by $ X_i $. The total-order index $ S_{T_i} $, which includes all interactions involving $ X_i $, is:

S_{T_i} = 1 - \frac{\text{Var}(E(Y \mid \mathbf{X}_{\sim i}))}{\text{Var}(Y)},

where $ \mathbf{X}_{\sim i} $ denotes all inputs except $ X_i $. Higher-order interaction indices, such as the second-order $ S_{ij} = V_{ij} / \text{Var}(Y) $ with $ V_{ij} = \text{Var}(E(Y \mid X_i, X_j)) - V_i - V_j $, quantify pairwise effects. These indices provide a hierarchical view of sensitivity, with $ \sum_i S_i + \sum_{i<j} S_{ij} + \cdots = 1 $, facilitating identification of dominant factors and interaction strengths in nonlinear models.^[37]

Estimation of Sobol' indices typically relies on Monte Carlo integration due to the intractability of analytical solutions for most models. A widely adopted scheme is the Saltelli sampler, which uses $ N(p+2) $ model evaluations built from two independent base matrices $ \mathbf{A} $ and $ \mathbf{B} $ of size $ N \times p $, along with auxiliary matrices $ \mathbf{A}_{\mathbf{B}}^{(i)} $ (all columns from $ \mathbf{A} $ except the $ i $-th, which is taken from $ \mathbf{B} $) and, in some estimators, the analogous matrices $ \mathbf{B}_{\mathbf{A}}^{(i)} $. Because $ \mathbf{A}_{\mathbf{B}}^{(i)} $ and $ \mathbf{B} $ share only column $ i $, the product of their outputs estimates the numerator of $ S_i $: $ V_i + f_0^2 \approx \frac{1}{N} \sum_{j=1}^N f(\mathbf{A}_{\mathbf{B}}^{(i)})_j \, f(\mathbf{B})_j $, with $ f_0 $ and $ \text{Var}(Y) $ estimated via sample means. This design is far cheaper than a brute-force double-loop Monte Carlo estimate of the conditional expectations, requiring on the order of $ 10^3 $ to $ 10^4 $ model evaluations for reliable indices in moderate dimensions. Extensions like the Jansen or Sobol' estimators further refine accuracy by alternative pairings of the matrices.

For higher-order terms, interactions are estimated similarly by conditioning on pairs or groups; for example, $ \text{Var}(E(Y \mid X_i, X_j)) \approx \frac{1}{N} \sum_{k=1}^N f(\mathbf{A}_{\mathbf{B}}^{(\sim ij)})_k \, [f(\mathbf{A})_k - f_0] $, where $ \mathbf{A}_{\mathbf{B}}^{(\sim ij)} $ keeps columns $ i $ and $ j $ from $ \mathbf{A} $ and takes all other columns from $ \mathbf{B} $, after which $ V_{ij} $ follows by subtracting $ V_i $ and $ V_j $. These computations scale poorly with dimension $ p $, often limiting full decomposition to low-order terms, but total indices remain feasible as they avoid explicit higher-order evaluations. 
In practice, variance-based methods excel in revealing nonlinear and interaction effects, outperforming local measures in capturing full input ranges.^[38] Classical Sobol' indices assume input independence; extensions for correlated inputs, such as those using copula representations or conditional variance adjustments, have been developed to preserve additivity while handling dependence.^[39]
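The sampling scheme described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than a reference implementation: it assumes independent inputs uniform on [0, 1], uses one common pairing of the matrices (the Saltelli 2010 form for first-order indices and the Jansen form for total indices), centers the outputs to reduce estimator variance, and relies on a made-up additive test function `toy_model`.

```python
import numpy as np

def sobol_indices(model, p, n, rng):
    """Estimate first-order (S) and total (ST) Sobol' indices.

    Assumes p independent inputs, each uniform on [0, 1].
    Cost: n * (p + 2) model evaluations.
    """
    A = rng.random((n, p))            # base sample matrix A
    B = rng.random((n, p))            # independent base sample matrix B
    fA, fB = model(A), model(B)
    f0 = np.mean(np.concatenate([fA, fB]))
    var_y = np.var(np.concatenate([fA, fB]))
    S, ST = np.empty(p), np.empty(p)
    for i in range(p):
        AB = A.copy()
        AB[:, i] = B[:, i]            # A with column i taken from B
        fAB = model(AB)
        # First-order: fB and fAB share only X_i (outputs centered by f0)
        S[i] = np.mean((fB - f0) * (fAB - fA)) / var_y
        # Total effect (Jansen): fA and fAB differ only in X_i
        ST[i] = 0.5 * np.mean((fA - fAB) ** 2) / var_y
    return S, ST

def toy_model(X):
    # Hypothetical additive test function: Y = X1 + 2*X2, X3 is inert.
    # Analytical indices: S1 = 0.2, S2 = 0.8, S3 = 0.
    return X[:, 0] + 2.0 * X[:, 1]

S, ST = sobol_indices(toy_model, p=3, n=20000, rng=np.random.default_rng(0))
print("first-order:", np.round(S, 3))
print("total:      ", np.round(ST, 3))
```

Because the test function is additive, the first-order and total indices should agree up to sampling noise; a gap between them in a real model signals interactions.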

Moment-Independent and Correlation Measures

Moment-independent measures provide a global sensitivity assessment by evaluating the influence of input uncertainty on the full probability distribution of the model output, without relying on assumptions about moments such as mean or variance. These methods are particularly valuable for models with non-normal or heavy-tailed outputs, where traditional variance-based approaches may fail or provide misleading results. A prominent example is the delta measure, introduced by Borgonovo, which quantifies the shift in the output distribution induced by varying a single input factor while integrating over the others. The delta sensitivity index for an input $ X_i $ is formally defined as

\delta_i = \frac{1}{2} \mathbb{E}_{X_i} \left[ \int_{-\infty}^{\infty} \left| f_Y(y) - f_{Y \mid X_i}(y) \right| \, dy \right],

where $ f_Y(y) $ denotes the unconditional probability density function of the output $ Y = f(\mathbf{X}) $, and $ f_{Y \mid X_i}(y) $ is the conditional density given a fixed value of $ X_i $. This index is half the average L1 distance between the unconditional output density and the conditional density across all possible values of $ X_i $. It lies between 0 and 1, with $ \delta_i = 0 $ indicating independence between $ X_i $ and $ Y $, and the delta of all inputs taken jointly equals 1. Computationally, the integral and expectations are estimated using Monte Carlo sampling, often employing kernel density estimation (KDE) for non-parametric reconstruction of the densities from simulated data points; KDE uses a kernel function (e.g., Gaussian) to smooth the empirical distribution, enabling flexible estimation without distributional assumptions.

Correlation-based measures offer a complementary, computationally lighter approach that quantifies sensitivity through the strength and direction of associations between inputs and outputs, focusing on monotonic relationships. Spearman's rank correlation coefficient $ \rho_i $ measures the monotonic dependence between $ X_i $ and $ Y $ by correlating their ranks; in the absence of ties it is

\rho_i = 1 - \frac{6 \sum_{j=1}^n d_j^2}{n(n^2 - 1)},

where $ d_j $ is the difference between the ranks of the $ j $-th paired observations of $ X_i $ and $ Y $, and $ n $ is the sample size; values range from -1 to 1, from perfectly decreasing to perfectly increasing monotonic dependence. To account for confounding effects from other inputs, the partial rank correlation coefficient (PRCC) extends this by partialling out the influence of the remaining factors: the ranks of $ X_i $ and of $ Y $ are each regressed linearly on the ranks of the other inputs, and the correlation is computed between the two sets of residuals. This provides a more isolated assessment of individual sensitivities in multi-input models. These coefficients are estimated from Latin hypercube or quasi-random samples and are robust to outliers and to nonlinear but monotonic relationships.

Unlike variance-based decompositions, which apportion output variance and assume finite second moments, the delta measure handles broader output types, including non-monotonic responses and distributions with infinite variance, whereas rank-based coefficients such as PRCC are reliable only when the input-output relationship is monotonic. Delta captures distributional shifts holistically, while correlations emphasize directional influences, which makes them suitable for screening influential factors in preliminary analyses. The delta measure, however, is more expensive because of the density estimation via KDE, which requires larger sample sizes (often $ n > 10^4 $) than the $ n \approx 500 $ to $ 1000 $ typically sufficient for correlation coefficients, potentially limiting its use in very high-dimensional problems.
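The PRCC procedure (rank-transform, regress out the other inputs, correlate the residuals) can be illustrated with a short NumPy sketch. The helper names `ranks` and `prcc` and the monotonic test function are assumptions made for this example, and the rank helper assumes continuous samples without ties.

```python
import numpy as np

def ranks(v):
    # Ranks 0..n-1; assumes no ties (continuous samples).
    return np.argsort(np.argsort(v)).astype(float)

def prcc(X, y):
    """Partial rank correlation coefficient of each column of X with y."""
    n, p = X.shape
    R = np.column_stack([ranks(X[:, j]) for j in range(p)])
    ry = ranks(y)
    out = np.empty(p)
    for i in range(p):
        # Linear regression on the ranks of all other inputs (plus intercept)
        Z = np.column_stack([np.ones(n), np.delete(R, i, axis=1)])
        res_x = R[:, i] - Z @ np.linalg.lstsq(Z, R[:, i], rcond=None)[0]
        res_y = ry - Z @ np.linalg.lstsq(Z, ry, rcond=None)[0]
        # Correlation between the two residual vectors
        out[i] = np.corrcoef(res_x, res_y)[0, 1]
    return out

# Hypothetical monotonic model: increasing in X1, decreasing in X2, X3 inert.
rng = np.random.default_rng(1)
X = rng.random((1000, 3))
y = np.exp(3.0 * X[:, 0]) - 2.0 * X[:, 1]
coeffs = prcc(X, y)
print(np.round(coeffs, 2))
```

The sign of each coefficient gives the direction of the monotonic effect, and the inert input should produce a value close to zero.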

Fourier Amplitude Sensitivity Test (FAST)

The Fourier Amplitude Sensitivity Test (FAST) is a variance-based global sensitivity analysis method that quantifies the contribution of input factors to the output variance by exploring the model's parameter space through periodic sampling and spectral decomposition. Developed originally for assessing sensitivities in coupled reaction systems, FAST transforms each input variable $ x_i $ into a periodic function of a scalar $ s $ using a distinct integer frequency $ \omega_i $, generating a one-dimensional search curve that sweeps the input space. The model output $ Y = f(\mathbf{x}) $ is then expressed as a Fourier series along this curve:

Y(s) = A_0 + \sum_{k=1}^\infty [A_k \cos(k s) + B_k \sin(k s)],

where $ s $ parameterizes the curve. The first-order sensitivity index for input $ x_i $ is the share of spectral power located at $ \omega_i $ and its harmonics,

S_i = \frac{\sum_{m=1}^{\infty} \left( A_{m \omega_i}^2 + B_{m \omega_i}^2 \right)}{\sum_{k=1}^{\infty} \left( A_k^2 + B_k^2 \right)},

where the denominator is proportional to $ \mathrm{Var}(Y) $. The index captures the portion of output variance attributable solely to $ x_i $, excluding interactions.^[40]

In classical FAST, a single search curve is used, with the frequencies $ \omega_i $ chosen as integers that are free of interference up to a chosen order $ M $, so that the first $ M $ harmonics of different inputs do not overlap in the spectrum. For $ p $ inputs, sampling points are generated via

x_i(s) = G_i^{-1} \left( \frac{1}{2} + \frac{1}{\pi} \arcsin(\sin(\omega_i s + \phi_i)) \right)

for $ s \in [-\pi, \pi] $, where $ G_i $ is the cumulative distribution function of $ x_i $ and $ \phi_i $ are phase shifts. This scheme explores the input space with $ N $ model evaluations, where $ N $ must satisfy the Nyquist criterion ($ N \geq 2 M \max_i(\omega_i) + 1 $) to resolve harmonics up to order $ M $. An analytical form for the first-order index, assuming a rescaled output $ g(t) $ with zero mean and unit variance, is

S_i = \sum_{m=1}^\infty \left[ \frac{2}{\pi} \int_0^\pi \sin(m \omega_i t) g(t) \, dt \right]^2 / \mathrm{Var}(g),

derived from the squared amplitudes of the Fourier coefficients. While classical FAST excels at main effects, it overlooks interactions because of the single-curve limitation.^[40]^[41]

The extended FAST (eFAST) addresses this by employing multiple search curves, each assigning a high base frequency to one input and low frequencies to the others, which separates main effects from total effects (including interactions). For total sensitivity, $ \omega_i $ is set large (e.g., $ \omega_i = 2M \max(\omega_{\sim i}) + 1 $, with $ M $ the interference order), and the index is $ S_{T_i} = 1 - S_{\sim i} $, where $ S_{\sim i} $ is the share of variance attributable to all inputs other than $ x_i $, read from the low-frequency part of the spectrum. Sampling follows a similar integer-frequency scheme but iterates over curves (typically 5–10 per input), with total evaluations scaling as $ p \times r \times N $, where $ p $ is the number of inputs and $ r $ the number of curves. eFAST aligns with Sobol' variance decomposition by estimating the same indices via spectral analysis.^[42]

FAST and eFAST are particularly efficient for models with up to 20 input factors, requiring fewer evaluations than Monte Carlo-based methods while providing robust variance apportionment, and have been applied in fields like chemical kinetics, environmental modeling, and engineering systems.^[42]^[43]
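A minimal sketch of classical FAST follows, assuming independent inputs uniform on [0, 1] (so the inverse CDF is the identity), zero phase shifts, and hand-picked frequencies (11 and 35) whose first four harmonics do not coincide. The function name `fast_first_order`, the frequency set, and the test function are illustrative choices, not values taken from the cited literature.

```python
import numpy as np

def fast_first_order(model, omegas, n_samples, M=4):
    """Classical FAST first-order indices for independent U(0, 1) inputs.

    omegas    : one integer frequency per input (assumed interference-free up to order M)
    n_samples : number of points on the search curve, at least 2*M*max(omegas) + 1
    """
    omegas = np.asarray(omegas)
    # Equally spaced points over one period of the search curve
    s = -np.pi + 2.0 * np.pi * (np.arange(n_samples) + 0.5) / n_samples
    # x_i(s) = 1/2 + arcsin(sin(w_i * s)) / pi, with zero phase shift
    X = 0.5 + np.arcsin(np.sin(np.outer(s, omegas))) / np.pi
    y = model(X)
    var_y = np.var(y)
    S = np.empty(len(omegas))
    for i, w in enumerate(omegas):
        harmonics = w * np.arange(1, M + 1)
        A = np.array([np.mean(y * np.cos(k * s)) for k in harmonics])
        B = np.array([np.mean(y * np.sin(k * s)) for k in harmonics])
        # Variance carried by w_i and its first M harmonics
        S[i] = 2.0 * np.sum(A ** 2 + B ** 2) / var_y
    return S

def toy_model(X):
    # Hypothetical additive test function with S1 = 0.2 and S2 = 0.8
    return X[:, 0] + 2.0 * X[:, 1]

# 1009 is prime, so neither frequency divides the number of samples
S_fast = fast_first_order(toy_model, omegas=[11, 35], n_samples=1009)
print(np.round(S_fast, 3))
```

Truncating the harmonic sum at order M slightly underestimates each index, since a small share of each input's spectral power sits in higher harmonics.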

Polynomial Chaos Expansion

Polynomial chaos expansion (PCE) provides a spectral representation of a model's output as a series of orthogonal polynomials in the input random variables, enabling efficient quantification of uncertainty and derivation of sensitivity measures. Originally developed by Wiener in 1938 for Gaussian processes using Hermite polynomials, the framework was generalized by Xiu and Karniadakis in 2002 to incorporate a wider class of orthogonal bases from the Askey scheme, matched to the input distributions (e.g., Hermite for Gaussian, Legendre for uniform).^[44] This generalized PCE approximates the model output $ Y = \mathcal{M}(\mathbf{X}) $, where $ \mathbf{X} $ is the vector of $ M $ random inputs, as

Y \approx \sum_{\boldsymbol{\alpha} \in \mathbb{N}^M} c_{\boldsymbol{\alpha}} \Psi_{\boldsymbol{\alpha}}(\mathbf{X}),

with $ \Psi_{\boldsymbol{\alpha}} $ denoting multivariate orthogonal polynomials of multi-index $ \boldsymbol{\alpha} $ and $ c_{\boldsymbol{\alpha}} $ the corresponding coefficients.^[44] Because the basis is orthogonal, statistical moments of $ Y $ can be computed directly from the coefficients, facilitating variance-based sensitivity analysis without extensive resampling.

Sensitivity indices are derived analytically from the PCE coefficients, leveraging the decomposition of the output variance. The total variance is

\mathrm{Var}(Y) = \sum_{\boldsymbol{\alpha} \neq \mathbf{0}} c_{\boldsymbol{\alpha}}^2 \langle \Psi_{\boldsymbol{\alpha}}^2 \rangle,

where $ \langle \cdot \rangle $ denotes the expectation (equal to 1 when the basis is normalized). The first-order Sobol' index for input $ X_i $, measuring its individual contribution, is

S_i = \frac{1}{\mathrm{Var}(Y)} \sum_{\boldsymbol{\alpha}: \, \alpha_i > 0, \ \alpha_j = 0 \ \forall j \neq i} c_{\boldsymbol{\alpha}}^2 \langle \Psi_{\boldsymbol{\alpha}}^2 \rangle,

while the total-order index, capturing all effects including interactions, is $ S_{T_i} = 1 - S_{\sim i} $, where $ S_{\sim i} $ sums over the terms that do not involve $ X_i $ (those with $ \alpha_i = 0 $); equivalently, $ S_{T_i} $ collects every term with $ \alpha_i > 0 $. This approach, formalized by Sudret in 2008, allows computation of Sobol' indices and higher-order interaction terms directly from the expansion, offering computational efficiency over traditional Monte Carlo estimation for variance-based methods.

PCE constructions are categorized as intrusive or non-intrusive. Intrusive methods, such as Galerkin projection, integrate the expansion into the model's governing equations and solve a system of deterministic equations for the coefficients; this is particularly suited to differential equation models but requires code modifications.^[44] Non-intrusive approaches treat the model as a black box, estimating coefficients via regression (e.g., least squares) or orthogonal projection from a set of model evaluations at sampled inputs, making them applicable to legacy codes. PCE handles smooth nonlinear responses well and also yields a surrogate model for rapid subsequent evaluations and sensitivity assessments. Standard constructions assume independent inputs; correlated inputs are usually handled by first transforming them to independent variables.

For high-dimensional problems, where the number of terms in the full expansion grows rapidly with the number of inputs, sparse PCE variants have emerged since the 2010s to mitigate the curse of dimensionality. These methods adaptively select a subset of basis terms using techniques like least angle regression or Bayesian inference, retaining only those with significant coefficients and discarding the rest. Developed by Blatman and Sudret in 2011, sparse PCE reduces computational demands and improves accuracy in dimensions exceeding 10–20 variables, as demonstrated in structural reliability applications.
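As an illustration of the non-intrusive regression route, the sketch below fits a Legendre PCE by ordinary least squares for inputs uniform on [-1, 1] and reads the Sobol' indices off the coefficients. The function name `legendre_pce_sobol`, the total-degree truncation, and the quadratic test model are assumptions made for this example.

```python
import itertools
import numpy as np
from numpy.polynomial.legendre import legval

def legendre_pce_sobol(X, y, degree):
    """Least-squares Legendre PCE for independent U(-1, 1) inputs.

    Returns first-order (S) and total (ST) Sobol' indices.
    """
    n, p = X.shape
    # Multi-indices with total degree <= degree
    alphas = [a for a in itertools.product(range(degree + 1), repeat=p)
              if sum(a) <= degree]

    def basis(a):
        col = np.ones(n)
        for j, d in enumerate(a):
            # legval with coefficients [0, ..., 0, 1] evaluates P_d
            col = col * legval(X[:, j], [0.0] * d + [1.0])
        return col

    Psi = np.column_stack([basis(a) for a in alphas])
    c = np.linalg.lstsq(Psi, y, rcond=None)[0]
    # E[P_d(X)^2] = 1 / (2d + 1) for X ~ U(-1, 1)
    norms = np.array([np.prod([1.0 / (2 * d + 1) for d in a]) for a in alphas])
    contrib = c ** 2 * norms
    A = np.array(alphas)
    var_y = contrib[A.sum(axis=1) > 0].sum()
    S, ST = np.empty(p), np.empty(p)
    for i in range(p):
        only_i = (A[:, i] > 0) & (np.delete(A, i, axis=1).sum(axis=1) == 0)
        S[i] = contrib[only_i].sum() / var_y        # terms in X_i alone
        ST[i] = contrib[A[:, i] > 0].sum() / var_y  # every term involving X_i
    return S, ST

# Hypothetical test model: Y = X1 + X2^2 + 0.5*X1*X2
rng = np.random.default_rng(4)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = X[:, 0] + X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 1]
S_pce, ST_pce = legendre_pce_sobol(X, y, degree=2)
print(np.round(S_pce, 4), np.round(ST_pce, 4))
```

Since the test model is itself a degree-2 polynomial, the expansion is exact and the indices match the analytical values (about 0.741 and 0.198 for the first-order indices); for a general model the accuracy depends on the truncation degree and the sample size.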

Game-Theoretic Approaches (e.g., Shapley Effects)

Game-theoretic approaches to sensitivity analysis draw from cooperative game theory to fairly attribute the output variability of a model to its input factors, treating inputs as "players" in a game whose "payout" is the model's output variance. The Shapley effect, a prominent method in this framework, assigns to each input $ i $ a value that represents its average marginal contribution to the explained variance across all possible coalitions of the other inputs. This ensures an equitable distribution of importance, accounting for interactions without bias toward any particular order of inclusion. Introduced in the context of global sensitivity analysis by Song, Nelson, and Staum (2016), the Shapley effect $ \phi_i $ for input $ i $ in a model with $ p $ inputs is given by

\phi_i = \frac{1}{p!} \sum_{u \subseteq \{1,\dots,p\} \setminus \{i\}} |u|! \, (p - |u| - 1)! \, \left[ V(u \cup \{i\}) - V(u) \right],

where $ V(u) $ denotes the variance of the model output explained by the subset of inputs $ u $, typically $ V(u) = \mathrm{Var}(\mathbb{E}[Y \mid X_u]) $ under the assumption of finite variance.^[45]

The Shapley effects satisfy key axiomatic properties inherited from the original Shapley value in game theory: efficiency (the sum of effects equals the total output variance), symmetry (inputs with identical marginal contributions receive equal effects), dummy (an input with no contribution gets zero effect), and additivity (effects for a sum of games add up). These properties make the method robust and interpretable, as they guarantee a complete and fair decomposition of variance even when inputs interact or are dependent. Unlike Sobol' indices, which partition variance into separate main and interaction effects, Shapley effects provide a single attribution per input by averaging over all coalition sizes and compositions.^[45]

Computing exact Shapley effects is combinatorially expensive, requiring evaluation over $ 2^p $ subsets, so Monte Carlo estimation is essential in practice, particularly Owen sampling schemes that efficiently approximate the multilinear extension of the value function. For high-dimensional problems ($ p > 20 $), approximations such as truncated sampling or metamodel-assisted methods reduce the burden while maintaining accuracy, as detailed in algorithms by Plischke, Rabitti, and Borgonovo (2021). These computational strategies allocate model evaluations to estimate marginal contributions across random permutations of inputs.

Shapley effects treat all input interactions equally and provide a variance-based measure of importance that requires only finite output variance, so they remain applicable to non-normal outputs. This fairness in attribution contrasts with hierarchical decompositions in other global methods, making it suitable for complex systems. The underlying Shapley value has also gained traction in machine learning interpretability, where it underpins tools like SHAP for feature attribution in black-box models.^[45]
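For a small number of inputs the Shapley formula can be evaluated exactly by enumerating subsets. The sketch below assumes the value function V(u) is available in closed form; the model Y = X0 + X1·X2 with independent standard normal inputs (indexed from zero) is a hypothetical example for which V can be written down by hand.

```python
import itertools
from math import factorial

def shapley_effects(value, p):
    """Exact Shapley effects by enumerating all subsets (feasible only for small p).

    value : callable mapping a frozenset of input indices u to V(u)
    """
    phi = [0.0] * p
    for i in range(p):
        others = [j for j in range(p) if j != i]
        for size in range(p):
            # Weight |u|! (p - |u| - 1)! / p! shared by all subsets of this size
            w = factorial(size) * factorial(p - size - 1) / factorial(p)
            for u in itertools.combinations(others, size):
                u = frozenset(u)
                phi[i] += w * (value(u | {i}) - value(u))
    return phi

def value(u):
    # V(u) = Var(E[Y | X_u]) for the hypothetical model Y = X0 + X1*X2 with
    # independent standard normal inputs: X0 explains 1 unit of variance on
    # its own, and X1*X2 explains 1 unit only when both X1 and X2 are known.
    return (1.0 if 0 in u else 0.0) + (1.0 if {1, 2} <= u else 0.0)

phi = shapley_effects(value, 3)
print(phi)  # effects sum to Var(Y) = 2
```

The interaction variance carried by X1·X2 is split evenly between X1 and X2, which illustrates the symmetry and efficiency properties: the three effects sum to the total variance.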

Advanced and Complementary Techniques

Surrogate Modeling and Metamodels

Surrogate models, also known as metamodels, serve as computationally efficient approximations of expensive simulation models, enabling sensitivity analysis by minimizing the number of direct evaluations required from the original system.^[46] These approximations capture the input-output relationship of the underlying model, allowing global sensitivity measures, such as variance-based indices, to be computed rapidly on the surrogate instead of the full model. By leveraging a limited set of training data, surrogates facilitate exploration of high-dimensional parameter spaces that would otherwise be infeasible due to computational constraints. Common types of surrogate models include polynomial response surfaces, kriging (or Gaussian processes), and radial basis functions. Polynomial response surfaces approximate the model response using low-order polynomials, often quadratic forms fitted via least-squares regression, making them suitable for smooth functions in low to moderate dimensions. Kriging models the output as a Gaussian process, providing an interpolating predictor with associated uncertainty estimates derived from the covariance structure. Radial basis function models express the response as a linear combination of basis functions centered at training points, offering flexibility for irregular data and effective performance in higher dimensions. To construct a surrogate, a design of experiments (DOE) such as Latin hypercube sampling (LHS) is used to select input points across the parameter space, the expensive model is evaluated at these points to generate training data, and the surrogate is then fitted to this dataset.^[47] Sensitivity analysis is subsequently performed on the surrogate, which supports efficient Monte Carlo sampling or analytical derivations for indices like Sobol' sensitivities. 
For example, in a Gaussian process surrogate for the output Y(X), the posterior mean provides the predicted response, while the posterior variance quantifies uncertainty, aiding in the computation of sensitivity metrics that account for prediction error.^[48] The primary benefit of surrogate modeling is its ability to enable global sensitivity analysis for computationally intensive models, such as those in computational fluid dynamics (CFD), where each evaluation may require hours or days; surrogates can reduce overall computational time by orders of magnitude while maintaining acceptable accuracy.^[46] Surrogate accuracy is validated through cross-validation methods, including leave-one-out or k-fold procedures, which estimate out-of-sample prediction error and confirm the surrogate's reliability for sensitivity inference.^[47] Polynomial chaos expansions may also serve as surrogates in this context, linking to variance-based decomposition techniques in global sensitivity methods.^[49]
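The workflow described above (design of experiments, model evaluation, surrogate fitting, cross-validation) can be sketched with a quadratic response surface. Everything named here is illustrative: `expensive_model` is a cheap stand-in for a costly simulator, the Latin hypercube is a basic implementation without any space-filling optimization, and the surrogate is an ordinary least-squares polynomial rather than kriging.

```python
import numpy as np

def latin_hypercube(n, p, rng):
    # One point per stratum in each dimension; strata shuffled independently
    strata = np.column_stack([rng.permutation(n) for _ in range(p)])
    return (strata + rng.random((n, p))) / n

def quadratic_features(X):
    # Columns: 1, x_j, and all products x_j * x_k with j <= k
    n, p = X.shape
    cols = [np.ones(n)] + [X[:, j] for j in range(p)]
    cols += [X[:, j] * X[:, k] for j in range(p) for k in range(j, p)]
    return np.column_stack(cols)

def loo_rmse(X, y):
    """Leave-one-out root-mean-square prediction error of the quadratic surrogate."""
    errors = []
    for k in range(len(y)):
        keep = np.arange(len(y)) != k
        beta = np.linalg.lstsq(quadratic_features(X[keep]), y[keep], rcond=None)[0]
        pred = (quadratic_features(X[k:k + 1]) @ beta)[0]
        errors.append(pred - y[k])
    return float(np.sqrt(np.mean(np.square(errors))))

def expensive_model(X):
    # Hypothetical stand-in for a costly simulator
    return (1.0 + 2.0 * X[:, 0] - X[:, 1] ** 2
            + 0.5 * X[:, 0] * X[:, 1] + 0.02 * np.sin(5.0 * X[:, 0]))

rng = np.random.default_rng(2)
X_train = latin_hypercube(40, 2, rng)
y_train = expensive_model(X_train)
rmse = loo_rmse(X_train, y_train)
print("LOO RMSE relative to output spread:", rmse / np.std(y_train))
```

If the relative leave-one-out error is small, sensitivity indices can then be estimated by sampling the fitted surrogate instead of the original model; a large error suggests that more training points or a more flexible surrogate are needed.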

High-Dimensional Model Representations (HDMR)

High-dimensional model representations (HDMR) provide a functional decomposition technique to approximate multivariate functions in sensitivity analysis, particularly addressing the curse of dimensionality in models with many input factors. The core idea is to expand the model output $ f(\mathbf{X}) $, where $ \mathbf{X} = (X_1, X_2, \dots, X_d) $ represents $ d $ input variables, into a sum of component functions that capture individual and interactive effects of the inputs. This expansion is typically truncated at a low order to focus on dominant low-dimensional contributions, enabling efficient analysis even for $ d > 100 $.^[50] The general HDMR expansion takes the form:

f(\mathbf{X}) = f_0 + \sum_i f_i(X_i) + \sum_{i<j} f_{ij}(X_i, X_j) + \sum_{i<j<k} f_{ijk}(X_i, X_j, X_k) + \cdots + f_{12\cdots d}(X_1, \dots, X_d),

where $ f_0 $ is the constant term (the mean output in ANOVA-HDMR, or the output at the reference point in cut-HDMR), $ f_i(X_i) $ are univariate components, $ f_{ij}(X_i, X_j) $ are bivariate interactions, and higher-order terms represent multi-way interactions up to the full $ d $-way term. In practice, the series is cut off at order $ k \ll d $, assuming higher-order interactions are negligible, which holds for many physical and engineering systems. This decomposition reveals the effective dimensionality of the model by identifying which low-order terms contribute most to output variability.^[50]^[51]

A common variant is the cut-HDMR, which anchors the expansion at a reference point $ \mathbf{x}^0 $ (often the mean or a nominal value of the inputs), so that $ f_0 = f(\mathbf{x}^0) $. The component functions are defined such that each one vanishes whenever any of its arguments equals its reference value, ensuring a unique hierarchical decomposition. For instance, the first-order term is $ f_i(X_i) = f(X_i, \mathbf{x}_{-i}^0) - f_0 $, where $ \mathbf{x}_{-i}^0 $ denotes the reference values of all inputs other than $ X_i $, so that only $ X_i $ varies. Cut-HDMR is particularly useful for deterministic models where sampling can be controlled along "cuts" through the reference point.^[51]^[52]

In contrast, ANOVA-HDMR employs an orthogonal decomposition over the input probability space, assuming independent inputs, which directly links to variance-based sensitivity measures. The component functions are mutually orthogonal, and their variances, divided by the total variance, are the Sobol' indices, quantifying the contribution of each input subset to the total output variance. This makes ANOVA-HDMR suitable for global sensitivity analysis: the zeroth-order term is the unconditional mean, and higher-order terms are conditional expectations from which all lower-order terms have been subtracted.^[52]

Component functions in cut-HDMR are computed using the inclusion-exclusion principle, evaluating the model at specific points along the cuts and subtracting contributions from lower-order subsets. For a second-order term, $ f_{ij}(X_i, X_j) = f(X_i, X_j, \mathbf{x}_{-ij}^0) - f_i(X_i) - f_j(X_j) - f_0 $, with higher orders following recursively. With $ s $ sample points per input (including the reference value), the terms of order $ k $ require $ O((s-1)^k \binom{d}{k}) $ model evaluations, but truncation at low $ k $ (e.g., 2 or 3) keeps costs manageable for high $ d $. Random sampling variants like RS-HDMR further reduce evaluations by projecting onto a basis over the input domain.^[51]

HDMR has been applied to reduce effective dimensionality in sensitivity analysis of complex models with over 100 factors, such as chemical kinetics mechanisms and atmospheric simulations, where low-order terms often explain over 90% of output variance. For example, in combustion modeling, cut-HDMR identifies key reaction pathways influencing ignition delay, enabling targeted parameter studies without full exploration of the input space.^[50]^[53]
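The inclusion-exclusion construction of cut-HDMR can be illustrated with a second-order sketch. The helper `cut_hdmr2` and both test functions are hypothetical; the code evaluates the component functions on demand at the query point rather than tabulating them on a grid, which a practical implementation would do.

```python
import itertools
import numpy as np

def cut_hdmr2(f, x_ref):
    """Second-order cut-HDMR approximation of f anchored at x_ref."""
    x_ref = np.asarray(x_ref, dtype=float)
    d = len(x_ref)
    f0 = f(x_ref)

    def on_cut(x, idx):
        # Evaluate f with only the inputs in idx moved away from the reference
        z = x_ref.copy()
        z[list(idx)] = x[list(idx)]
        return f(z)

    def approx(x):
        x = np.asarray(x, dtype=float)
        fi = [on_cut(x, (i,)) - f0 for i in range(d)]            # f_i(x_i)
        total = f0 + sum(fi)
        for i, j in itertools.combinations(range(d), 2):
            total += on_cut(x, (i, j)) - fi[i] - fi[j] - f0      # f_ij(x_i, x_j)
        return total

    return approx

def pairwise(x):
    # Hypothetical model containing only one- and two-way interactions
    return x[0] + 2.0 * x[1] + x[0] * x[1] + x[2] ** 2 + np.sin(x[1]) * x[2]

def three_way(x):
    # Hypothetical model with a genuine three-way interaction
    return x[0] * x[1] * x[2]

x_test = np.array([0.9, 0.1, 0.7])
ref = [0.5, 0.5, 0.5]
err_pairwise = abs(cut_hdmr2(pairwise, ref)(x_test) - pairwise(x_test))
err_three_way = abs(cut_hdmr2(three_way, ref)(np.ones(3)) - three_way(np.ones(3)))
print(err_pairwise, err_three_way)
```

The truncated expansion reproduces the first function up to rounding error because it contains no interactions above second order, while the three-way product leaves a residual that only a third-order term could remove.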

Monte Carlo and Filtering Methods

Monte Carlo filtering methods represent a class of sampling-based techniques in sensitivity analysis that focus on identifying influential input parameters by conditioning on specific output behaviors, particularly in systems with uncertainty and nonlinear responses. These methods generate a large number of input samples from their probability distributions, propagate them through the model to obtain corresponding outputs, and then filter the samples into subsets based on predefined criteria for the outputs, such as exceeding a threshold indicative of high-risk or desirable states. By comparing the conditional distributions of inputs within these filtered subsets against the unconditional distributions, analysts can quantify how certain parameters contribute to specific output regimes, providing insights into conditional sensitivities that reveal parameter importance in targeted scenarios.^[54]

The foundational procedure for Monte Carlo filtering, often integrated with regional sensitivity analysis (RSA), begins with drawing $ N $ random samples from the joint input distribution, where $ N $ is typically large (e.g., thousands to millions) to ensure statistical reliability. Each sample is evaluated by the model to produce output values $ Y $, which are then divided into behavioral (e.g., $ Y $ meeting a success or failure criterion) and non-behavioral sets, often corresponding to the tails of the output distribution for extreme events. Input parameter distributions are then analyzed within these sets, using statistical tests such as the Kolmogorov-Smirnov distance to assess differences. This identifies parameters whose values cluster differently in behavioral versus non-behavioral regions, highlighting their regional influence on the output. The approach excels in partitioning the input space into regions of interest and comparing conditional variances or densities, offering a non-parametric way to explore how inputs drive outputs toward specific thresholds without assuming model linearity.^[54]

A key advantage of Monte Carlo filtering and RSA lies in their ability to handle nonlinear thresholds and complex, non-monotonic relationships in the model, making them particularly valuable for reliability and risk assessment applications where standard global measures might overlook conditional effects. For instance, in environmental modeling, these methods have been used to pinpoint parameters controlling pollutant exceedances by focusing on output tails, thereby aiding in model calibration and uncertainty reduction under probabilistic frameworks. Unlike variance-based decompositions that average over the entire input space, filtering emphasizes extremes, providing interpretable maps of parameter influence in high-stakes regions.^[55]

Variants of Monte Carlo filtering incorporate importance sampling to enhance efficiency, especially for rare events where standard sampling yields few relevant outcomes. In this adaptation, samples are drawn from a biased distribution that oversamples high-risk input regions, with weights adjusted via the likelihood ratio to maintain unbiased estimates of conditional sensitivities. This reduces the required $ N $ for accurate tail analysis in reliability contexts, such as failure probability estimation, by concentrating computational effort on informative samples.
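The filtering procedure can be sketched as follows: samples are split at an output threshold, and for each input the two-sample Kolmogorov-Smirnov distance between the behavioral and non-behavioral subsets is computed. The helper names, the threshold (the 90th percentile of the output), and the test model are illustrative assumptions.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov distance between samples a and b."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    # Empirical CDFs of both samples evaluated at every observed value
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def regional_sensitivity(X, y, threshold):
    """KS distance per input between behavioral (y > threshold) and other samples."""
    behavioral = y > threshold
    return np.array([ks_statistic(X[behavioral, j], X[~behavioral, j])
                     for j in range(X.shape[1])])

# Hypothetical model: X1 dominates the upper tail, X2 matters less, X3 is inert
rng = np.random.default_rng(3)
X = rng.random((5000, 3))
y = X[:, 0] ** 2 + 0.2 * X[:, 1]
ks = regional_sensitivity(X, y, threshold=np.quantile(y, 0.9))
print(np.round(ks, 3))
```

A large distance indicates that an input's distribution differs markedly between the two regions; the distance for the inert input stays near the level expected from sampling noise alone.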

Emerging AI-Driven and Machine Learning Integrations

Recent advancements in sensitivity analysis (SA) have increasingly incorporated machine learning (ML) techniques to construct high-fidelity surrogate models, particularly using neural networks and random forests, which enable efficient approximation of complex, computationally expensive simulations. Neural networks, such as graph neural networks combined with transformers, serve as surrogates for agent-based models in domains like transport simulations, allowing for rapid evaluation of input-output relationships to compute sensitivity indices.^[56] Random forests, valued for their non-parametric flexibility, have been adapted for global SA by building metamodels that quantify variable importance through permutation-based metrics, demonstrating superior performance in high-dimensional settings compared to traditional methods.^[57] In these surrogates, SHAP (SHapley Additive exPlanations) values provide a unified measure of feature contributions to individual predictions, facilitating interpretable sensitivity assessments within a game-theoretic framework.^[58] Tools like ML-AMPSIT exemplify this integration by automating multi-method parameter sensitivity identification in weather models, using ML predictions to estimate impacts without exhaustive simulations.^[59] Active learning strategies enhance SA by employing adaptive sampling to concentrate evaluations in influential input regions, reducing computational demands while improving accuracy. 
These approaches iteratively select samples based on uncertainty or expected information gain, often integrated with emulation methods to refine sensitivity estimates for expensive models.^[60] For derivative-based global SA, novel acquisition functions target key sensitivity quantities, enabling efficient exploration of parameter spaces in neural network-driven analyses.^[61] In finite population estimation, ML-assisted active sampling adapts importance sampling dynamically, achieving robust sensitivity insights with fewer iterations.^[62] AI-driven dashboards are emerging as platforms for real-time SA visualization, supporting interactive exploration of model behaviors through what-if scenarios and dynamic sensitivity metrics. These tools leverage ML to generate on-the-fly surrogates and visualizations, such as heatmaps of parameter influences, streamlining decision-making in financial modeling like leveraged buyouts (LBOs).^[63] For instance, Sparkco's 2025 AI suite enables users to simulate input perturbations and observe propagated effects in real time, integrating SHAP-based explanations for immediate interpretability.^[63] Broader AI analytics platforms, including those from ThoughtSpot, extend this to enterprise-scale dashboards that automate sensitivity computations and predictive visualizations.^[64] Innovative methods like COMSAM (COMprehensive Sensitivity Analysis Method), introduced in 2024, advance multi-criteria SA by systematically modifying multiple parameters to assess decision robustness, outperforming traditional one-at-a-time approaches in composite material selection.^[65] COMSAM's structured framework reveals how matrix alterations impact rankings, providing comprehensive insights for decision-making under uncertainty.^[66] Complementing this, a 2025 Springer study compares global SA methods on deep learning models for digit classification, evaluating techniques like Sobol' indices and SHAP against neural network outputs to identify the most 
effective for high-dimensional feature attribution.^[67] This comparison highlights variance-based methods' strengths in capturing nonlinear interactions within deep architectures.^[67] Despite this progress, interpretability remains a key challenge for ML-based sensitivity indices, as black-box models like deep neural networks often obscure the causal links between inputs and derived measures.^[68] Sensitivity analyses applied to ML predictions can reveal model vulnerabilities, but ensuring the faithfulness of explanations (for example, through robustness checks on SHAP values) requires additional validation to avoid misleading attributions.^[69] In medical imaging segmentation, for example, post-hoc sensitivity methods expose biases in ML outputs, underscoring the need for hybrid approaches that balance accuracy with transparency.^[70]

Challenges and Limitations

Computational and Scalability Issues

Sensitivity analysis encounters substantial computational hurdles, especially in high-dimensional settings where the curse of dimensionality leads to an exponential increase in the number of required model evaluations. For Monte Carlo methods, reliable estimation of sensitivity indices in models with 20 or more input factors often necessitates sample sizes exceeding $ 10^6 $ to achieve adequate convergence, as the volume of the input space grows rapidly, diluting sampling efficiency.^[71] This challenge is exacerbated in variance-based approaches, where the effective dimensionality determines the sample requirements, making exhaustive exploration impractical without specialized techniques.^[72] Computational demands are commonly quantified by the total number of model function evaluations required. The Sobol' method, for computing first-order and total-order indices with the Saltelli design, typically demands $ N(p + 2) $ evaluations (or $ N(2p + 2) $ when second-order indices are also estimated), where $ p $ denotes the number of input factors and $ N $ is the base sample size (often 1,000 to 10,000 for statistical reliability).^[14] By comparison, the Morris screening method is considerably less demanding, requiring only $ r(p + 1) $ evaluations, with $ r $ representing the number of trajectories (usually 10 to 20), enabling rapid identification of influential factors at a fraction of the cost.^[14] Several strategies mitigate these scalability issues. 
Parallel computing architectures allow simultaneous execution of independent model evaluations, drastically cutting runtime for embarrassingly parallel tasks like Monte Carlo sampling.^[73] Quasi-Monte Carlo methods further enhance efficiency by employing low-discrepancy sequences that converge faster than pseudorandom sampling, often reducing the necessary N by an order of magnitude while preserving accuracy in global sensitivity estimates.^[74] In contemporary large-scale simulations, such as those in climate modeling as of 2025, sensitivity analyses must contend with enormous datasets; for example, ensemble runs exploring parameter uncertainties can require substantial computational resources on exascale systems to perform such analyses within reasonable timeframes.^[75] These demands push the boundaries of high-performance computing, where exascale systems are increasingly essential. A core trade-off in sensitivity analysis lies between achieving high-fidelity results and practical feasibility, particularly for computationally expensive models. Analysts often prioritize methods that sacrifice some precision for speed, such as using screening to focus subsequent detailed computations on a reduced parameter set, ensuring scalability without wholly compromising insights.^[76]

Interpretation and Uncertainty Challenges

Interpreting results from sensitivity analysis (SA) presents significant challenges due to inherent uncertainties in both the model and its inputs, which can undermine confidence in the identified influential factors. Global sensitivity indices, such as Sobol' indices, quantify how input variability propagates to output uncertainty, but their reliability depends on accurately capturing these uncertainties.^[77] Failure to account for such uncertainties may lead to misleading rankings of input importance, particularly in complex systems where epistemic (knowledge-related) and aleatory (inherent randomness) uncertainties coexist.^[78] Uncertainty in sensitivity indices themselves arises from finite sample sizes in Monte Carlo-based methods, requiring techniques like bootstrapping to estimate confidence intervals. Bootstrap resampling repeatedly draws, with replacement, from the model evaluations already computed and re-estimates indices such as the first-order Sobol' index S_i on each resample, providing empirical error estimates that indicate how close the results are to convergence.^[79] Convergence plots visualize this by plotting index values against increasing sample sizes, helping assess whether results have stabilized or if further evaluations are needed to reduce estimation error.^[20] For instance, in environmental models, such plots reveal that indices may fluctuate significantly at low sample sizes before converging, highlighting the need for robust error quantification to trust interpretations.^[80] Model form uncertainty further complicates interpretation, as SA outcomes are highly sensitive to the choice of the underlying model structure or function f. Different model formulations can yield divergent sensitivity rankings even for the same inputs, stemming from assumptions about physical processes or simplifications that introduce epistemic biases.^[81] In engineering applications, such as reentry flow simulations, model form choices like turbulence modeling directly alter propagated uncertainties, emphasizing that SA must be contextualized within potential structural alternatives to avoid overconfidence in results.^[82] Correlations among input variables pose another interpretive hurdle, as standard variance-based indices assume independence and thus produce biased estimates when correlations exist. Ignoring input dependencies redistributes variance attribution, potentially inflating or deflating individual indices and leading to erroneous conclusions about factor importance.^[83] For linear models with correlated inputs, modified indices that incorporate covariance matrices are essential to correct these biases, ensuring that interpretations reflect realistic input interactions.^[84] Visualization techniques aid in navigating these uncertainties by illustrating multi-way interactions and sensitivities in an intuitive manner. Tornado plots rank inputs by the size of the output change each one produces, displaying horizontal bars around a baseline value, sorted from widest to narrowest, to highlight the range of outcomes; this is particularly useful for one-at-a-time assessments in decision-making contexts.^[85] Heatmaps, conversely, represent sensitivity indices across input pairs or dimensions as color gradients, revealing correlation-driven patterns and higher-order effects that might be obscured in tabular data.^[86] These tools, when combined with uncertainty bounds, facilitate clearer communication of SA results, though their effectiveness depends on selecting representations that align with the dimensionality of the problem.^[87] Reviews published in 2025 underscore ongoing gaps in addressing epistemic uncertainties within global SA, particularly regarding robustness to incomplete knowledge of input distributions or model structures; methods for robust propagation remain underdeveloped, prompting calls for hybrid approaches that separate aleatory and epistemic effects.^[88] In integrated assessment models, such reviews note that epistemic components often dominate in high-stakes applications like climate modeling, and while variance-based methods provide interpretable insights, their sensitivity to epistemic assumptions necessitates validation against scenario ensembles to enhance trustworthiness.^[89]
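
As a minimal sketch of the bootstrap idea described above, the following example estimates a first-order Sobol' index with a pick-freeze (Saltelli-type) estimator and attaches a percentile interval by resampling the stored model evaluations. The additive test function, the sample size, the number of replicates, and all function names are assumptions chosen for illustration; for this function the analytical value is S_1 = 0.2.

```python
import random


def model(x1: float, x2: float) -> float:
    # Hypothetical additive test function; analytically S_1 = 0.2 and S_2 = 0.8
    # for independent inputs uniform on [0, 1].
    return x1 + 2.0 * x2


def first_order_index(f_a, f_b, f_ab):
    """Pick-freeze (Saltelli-type) estimate of a first-order Sobol' index."""
    n = len(f_a)
    pooled = f_a + f_b
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)
    partial = sum(f_b[j] * (f_ab[j] - f_a[j]) for j in range(n)) / n
    return partial / variance


def bootstrap_interval(f_a, f_b, f_ab, rng, replicates=100, level=0.95):
    """Percentile interval from resampling the stored evaluations with replacement."""
    n = len(f_a)
    estimates = []
    for _ in range(replicates):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(first_order_index(
            [f_a[j] for j in idx], [f_b[j] for j in idx], [f_ab[j] for j in idx]))
    estimates.sort()
    lo = estimates[int((1 - level) / 2 * replicates)]
    hi = estimates[int((1 + level) / 2 * replicates) - 1]
    return lo, hi


rng = random.Random(42)
n = 4000
a = [(rng.random(), rng.random()) for _ in range(n)]
b = [(rng.random(), rng.random()) for _ in range(n)]
f_a = [model(*row) for row in a]
f_b = [model(*row) for row in b]
# AB_1: sample matrix A with its x1 column replaced by the x1 column of B.
f_ab1 = [model(b[j][0], a[j][1]) for j in range(n)]

s1 = first_order_index(f_a, f_b, f_ab1)
lo, hi = bootstrap_interval(f_a, f_b, f_ab1, rng)
print(f"S_1 = {s1:.3f}, 95% bootstrap interval [{lo:.3f}, {hi:.3f}]")
```

No new model runs are needed for the interval, since only the already stored outputs are resampled. Repeating the calculation for increasing n and plotting the interval width would give a simple convergence plot.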

Common Pitfalls and Best Practices

One common pitfall in sensitivity analysis (SA) is overlooking interactions between input variables, which can lead to incomplete assessments of model behavior, particularly in complex systems where variables do not act independently.^[90] Poor sampling strategies, such as using uniform distributions for inputs that exhibit skewness or heavy tails in reality, further exacerbate errors by failing to capture the true variability of the input space.^[91] Additionally, ignoring non-influential factors (those with minimal impact on outputs) can distort prioritization efforts, as these variables may still contribute subtly to overall uncertainty when combined with others.^[90] A notable case of misinterpretation arises when local SA methods, such as one-at-a-time (OAT) approaches, are erroneously applied to nonlinear models as proxies for global SA; in such scenarios, local derivatives provide point-specific insights that do not reflect the full range of input variations and can underestimate sensitivities, and up to 65% of published studies are reported to be affected by such flawed practices.^[91] This error is prevalent because OAT explores only a tiny fraction of the input space (e.g., less than 0.25% in high-dimensional cases), rendering it unsuitable for nonlinear dynamics where sensitivities vary across the parameter domain.^[90] To mitigate these issues, best practices emphasize validating SA results through multiple methods, such as combining screening techniques like the Morris method with variance-based global approaches, to ensure robustness across different assumptions.^[90] Reporting confidence intervals or uncertainty bounds around sensitivity indices is essential for transparency, allowing users to gauge the reliability of findings in the face of sampling variability.^[91] Method selection should be context-specific; for instance, the Morris method is well suited to initial screening in high-dimensional problems due to its efficiency in identifying influential factors without exhaustive computation.^[90] International standards provide further guidance, with IEC 31010:2019 outlining sensitivity analysis as a technique to evaluate the impact of changes in individual input parameters on risk magnitude, recommending its integration into broader risk assessment processes to identify critical uncertainties systematically. This standard, revised in 2019, stresses the importance of tailoring SA to the model's structure and input distributions to avoid procedural oversights.^[92]
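
One geometric argument often used to illustrate how little of the input space OAT designs reach is that points moved one factor at a time away from a central baseline stay inside the hypersphere inscribed in the unit hypercube. The sketch below computes that volume fraction for k inputs. Treating this as the basis of the 0.25% figure quoted above is an assumption, and the function name is illustrative.

```python
import math


def inscribed_sphere_fraction(k: int) -> float:
    """Volume of the ball inscribed in the unit k-cube, as a fraction of the cube."""
    radius = 0.5
    return math.pi ** (k / 2) * radius ** k / math.gamma(k / 2 + 1)


for k in (2, 3, 5, 10, 20):
    print(f"k = {k:2d}: {100 * inscribed_sphere_fraction(k):.6f}% of the input space")
```

For k = 10 the fraction is roughly 0.0025, i.e. about 0.25% of the hypercube, and it keeps shrinking rapidly as k grows.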

Applications and Auditing

Engineering and Scientific Applications

In engineering and physical sciences, sensitivity analysis (SA) is extensively applied to optimize designs, validate models, and quantify uncertainties in complex systems where parameters like material properties, environmental loads, and operational conditions introduce variability. By decomposing model outputs into contributions from input factors, SA enables engineers to prioritize critical variables, thereby enhancing reliability and efficiency in fields ranging from civil infrastructure to high-energy physics simulations. Global methods, such as Sobol' indices, are particularly valued for their ability to capture nonlinear interactions without assuming model linearity.^[90] A prominent application is in structural reliability, where SA assesses how uncertainties in load factors, material strengths, and geometric dimensions affect failure probabilities in bridges and other infrastructure. For instance, Sobol' indices have been used to evaluate the sensitivity of failure probabilities in steel bridge members under bending and fatigue loads, revealing that variables like yield strength and traffic load magnitudes dominate reliability outcomes. This approach helps identify key design tolerances, reducing the risk of over-conservative specifications that inflate costs.^[93]^[94] In chemical kinetics, SA elucidates the influence of reaction rate constants and activation energies on overall process dynamics, aiding in the optimization of industrial reactors and combustion systems. Techniques like local sensitivity analysis compute partial derivatives of species concentrations with respect to rate parameters, highlighting dominant pathways in multi-step mechanisms, such as those in hydrocarbon oxidation. 
For example, in gas-phase combustion models, SA identifies reactions that most strongly affect ignition delay times, guiding parameter refinement in detailed kinetic schemes.^[95]^[96] Aerospace engineering leverages SA for aerodynamic design, particularly in analyzing how wing geometry parameters—such as aspect ratio, sweep angle, and airfoil thickness—impact lift-to-drag ratios and stability under varying flow conditions. Adjoint-based sensitivity methods, integrated with computational fluid dynamics, quantify gradients of performance metrics like drag coefficient with respect to shape variables, facilitating gradient-based optimization in transonic wing designs. This has been applied to reduce fuel consumption in aircraft by pinpointing aerodynamically sensitive features, such as leading-edge contours.^[97]^[98] In physics, particularly particle simulations, SA evaluates parameter influences in Monte Carlo methods used to model high-energy collisions and detector responses. For LHC data analysis pipelines, sensitivity measures assess how variations in simulation inputs—like particle interaction cross-sections or detector efficiencies—affect reconstructed event yields and background rejection. This is crucial for validating models against experimental data, as seen in transport simulations where SA identifies dominant uncertainties in radiation shielding or beam dynamics.^[99]^[100] The benefits of SA in these domains include substantial reductions in physical prototyping needs by simulating uncertainty propagation early in design. It also pinpoints critical tolerances, allowing tighter control over influential parameters to improve system robustness without excessive safety margins.^[90] A notable case study involves NASA's application of polynomial chaos expansion (PCE) for uncertainty quantification in atmospheric entry vehicle models. 
In simulations of Mars entry trajectories, PCE-based SA propagates uncertainties in aerothermal properties and vehicle geometry to predict peak heating and deceleration variances, with Sobol' indices indicating that entry angle and ballistic coefficient are significant contributors to heat flux uncertainty. This non-intrusive PCE approach, implemented in NASA's UQPCE toolkit, enables efficient global sensitivity assessment, informing robust design choices for missions like the Mars Science Laboratory.^[101]
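
The local, derivative-based sensitivities mentioned above for chemical kinetics can be sketched on a deliberately simple case. The example assumes a hypothetical first-order reaction A -> B with concentration [A](t) = A0 exp(-k t), for which the normalized sensitivity d ln[A] / d ln k equals -k t exactly; the rate constant and time points are illustrative values, not data from any cited mechanism.

```python
import math


def concentration(k: float, t: float, a0: float = 1.0) -> float:
    """[A](t) for a hypothetical first-order reaction A -> B with rate constant k."""
    return a0 * math.exp(-k * t)


def normalized_sensitivity(k: float, t: float, rel_step: float = 1e-4) -> float:
    """Central-difference estimate of d ln[A] / d ln k at time t."""
    h = k * rel_step
    derivative = (concentration(k + h, t) - concentration(k - h, t)) / (2.0 * h)
    return derivative * k / concentration(k, t)


k = 0.8  # illustrative rate constant in 1/s
for t in (0.5, 1.0, 2.0):
    numeric = normalized_sensitivity(k, t)
    print(f"t = {t:.1f} s: numeric {numeric:+.4f}, analytic {-k * t:+.4f}")
```

In a detailed mechanism the same finite-difference loop would be applied to each rate parameter in turn, with the closed-form concentration replaced by a numerical solution of the kinetic equations.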

Economic and Financial Applications

In economics and finance, sensitivity analysis serves as a critical tool for scenario planning and risk management by evaluating how variations in key parameters influence model outcomes, enabling decision-makers to anticipate disruptions and optimize strategies under uncertainty. For instance, in portfolio optimization, it assesses the robustness of asset allocations to changes in expected returns, volatilities, and correlations, revealing potential vulnerabilities in long-term expected utility. A study on incomplete markets using factor models demonstrates that small perturbations in initial volatility or drift parameters can significantly alter optimal portfolio performance over extended horizons, with sensitivities derived from ergodic Hamilton-Jacobi-Bellman equations quantifying these effects for models like Kim-Omberg and Heston stochastic volatility.^[102] Sensitivity analysis is particularly valuable for examining interest rate impacts on fixed-income securities, such as French Obligations Assimilables du Trésor (OAT) bonds, where yield spreads relative to benchmarks like German Bunds highlight fiscal and political risks. Analysis of OAT yields shows that macroeconomic fundamentals, including debt levels and growth prospects, drive sensitivity to rate changes.^[103]^[104] This approach aids central banks and investors in stress-testing bond portfolios against monetary policy shifts. In cost-benefit analysis for macroeconomic models, sensitivity analysis tests the effects of varying policy parameters on GDP projections, informing fiscal and monetary decisions. Using the NiGEM model, scenarios recycling carbon tax revenues through public investment versus tax cuts illustrate divergent GDP paths: public investment boosts long-term GDP by up to 7% in regions like Japan by 2050, while tax cuts lead to a 3-4% decline due to crowding-out effects.^[105] Such evaluations underscore how policy choices amplify or mitigate economic shocks. 
In financial derivatives pricing, the Black-Scholes model employs local sensitivity measures known as the Greeks (delta, gamma, vega, theta, and rho) to quantify option value changes with respect to the underlying price, volatility, time, and interest rates. For a European call option on an asset paying a continuous dividend yield q, delta (∂V/∂S = e^{-q(T-t)} N(d₁)) measures price sensitivity, while vega (∂V/∂σ = S√(T-t) N'(d₁) e^{-q(T-t)}) captures volatility effects; setting q = 0 recovers the familiar delta of N(d₁). These measures enable hedgers to maintain delta-neutral positions amid market fluctuations.^[106] As of 2025, tools like Rockstep integrate sensitivity analysis into investment platforms for real estate and retail portfolios, allowing users to simulate scenarios such as 2% rental growth or 1% interest rate shifts on a $10 million property, thereby enhancing risk assessment and lender negotiations.^[107] Overall, these applications reveal leverage points in market volatility, where high-uncertainty environments magnify the impact of asymmetric shocks on returns, as shown in volatility models incorporating option-implied information.^[108] By identifying parameters like leverage ratios or volatility persistence that drive outsized responses, sensitivity analysis guides targeted interventions to stabilize financial systems.^[109]
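
The Greeks discussed above are ordinary partial derivatives, so they can be checked numerically. The following sketch evaluates the closed-form delta and vega of a European call under Black-Scholes with a continuous dividend yield q and compares them with central finite differences of the price. The parameter values (spot 100, strike 100, 5% rate, no dividends, 20% volatility, one year to expiry) are illustrative assumptions.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def d1(s, k, r, q, sigma, tau):
    return (math.log(s / k) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))


def call_price(s, k, r, q, sigma, tau):
    """Black-Scholes price of a European call; tau is time to expiry in years."""
    x1 = d1(s, k, r, q, sigma, tau)
    x2 = x1 - sigma * math.sqrt(tau)
    return s * math.exp(-q * tau) * norm_cdf(x1) - k * math.exp(-r * tau) * norm_cdf(x2)


def delta(s, k, r, q, sigma, tau):
    return math.exp(-q * tau) * norm_cdf(d1(s, k, r, q, sigma, tau))


def vega(s, k, r, q, sigma, tau):
    return s * math.exp(-q * tau) * math.sqrt(tau) * norm_pdf(d1(s, k, r, q, sigma, tau))


params = dict(s=100.0, k=100.0, r=0.05, q=0.0, sigma=0.2, tau=1.0)
price = call_price(**params)
analytic_delta = delta(**params)
analytic_vega = vega(**params)

# Cross-check the closed forms against central finite differences of the price.
h = 1e-4
fd_delta = (call_price(**{**params, "s": params["s"] + h})
            - call_price(**{**params, "s": params["s"] - h})) / (2 * h)
fd_vega = (call_price(**{**params, "sigma": params["sigma"] + h})
           - call_price(**{**params, "sigma": params["sigma"] - h})) / (2 * h)

print(f"price = {price:.4f}")
print(f"delta: analytic {analytic_delta:.6f}, finite difference {fd_delta:.6f}")
print(f"vega:  analytic {analytic_vega:.6f}, finite difference {fd_vega:.6f}")
```

The finite-difference check is itself a local, one-at-a-time sensitivity calculation, which is why the Greeks are usually classified as local measures.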

Environmental and Risk Assessment Applications

Sensitivity analysis plays a crucial role in environmental modeling by quantifying how variations in input parameters, such as emission rates or pollutant levels, influence model outputs like climate projections or ecosystem responses, thereby supporting evidence-based policy decisions. In climate modeling, global sensitivity analysis (GSA) has been extensively applied to assess the impacts of greenhouse gas emissions on future climate scenarios, as detailed in IPCC assessments. For instance, GSA of chemistry-climate models reveals that tropospheric hydroxyl radical (OH) budgets are highly sensitive to methane emissions and temperature changes, with methane being a key contributor to uncertainty in global OH concentrations. Similarly, integrated assessment models for emissions pathways demonstrate that socioeconomic drivers, like population growth and technological change, strongly influence projected warming levels under various shared socioeconomic pathways. These analyses help prioritize mitigation strategies by identifying parameters that most affect equilibrium climate sensitivity, estimated between 2.5°C and 4.0°C for doubled CO₂ concentrations in recent IPCC evaluations.^[110] In hydrological applications, sensitivity analysis is essential for flood risk assessment, where models simulate inundation based on varying rainfall parameters to evaluate hazard zones and inform infrastructure planning. Global sensitivity methods, such as Sobol' indices, have been used in hydrodynamic models to identify key inputs like rainfall intensity and Manning's roughness coefficient as significant factors in flood extent predictions. For example, in urban pluvial flood simulations, sensitivity to the temporal resolution of rainfall data shows that finer intervals improve peak flow predictions, which in turn yields more reliable risk maps for vulnerable areas. 
These insights guide adaptive measures, such as improved drainage systems, by highlighting parameters most critical to extreme event simulations.^[111]^[112] Epidemiological modeling benefits from sensitivity analysis to understand disease spread dynamics, particularly in pandemic scenarios like COVID-19, where parameter uncertainties can significantly alter outbreak forecasts. In SEIR-type models extended for COVID-19, global sensitivity analysis identifies quarantine rates and intervention timing as primary drivers of total infections. For instance, studies on U.S. COVID-19 transmission models reveal that contact rates and incubation periods dominate uncertainty in peak infection timing, informing public health responses like lockdown durations. Such analyses underscore the need for robust data on behavioral parameters to refine predictions for future epidemics. In risk assessment, probabilistic sensitivity analysis is employed to evaluate tail events (rare, high-impact occurrences) in insurance contexts, focusing on extreme losses from natural disasters or pandemics. Techniques like Monte Carlo-based sensitivity testing assess how variations in claim severity distributions affect tail risk metrics, such as Value-at-Risk (VaR) at 99.5% confidence and Expected Shortfall. For heavy-tailed claim models, sensitivity to tail dependence parameters shows that underestimating extreme event correlations can lead to significantly understated ruin probabilities, as seen in analyses of catastrophe reinsurance portfolios. These methods enable insurers to stress-test portfolios against tail risks, optimizing reserves for events like floods or outbreaks. The application of sensitivity analysis in these domains directly informs environmental regulations by pinpointing influential factors for compliance and management. 
Under the EU Water Framework Directive (WFD), global sensitivity analysis of water quality models identifies key uncertainties in pollutant transport parameters, such as flow rates and degradation coefficients, as important drivers of variance in ecological status assessments. This supports the directive's objectives by prioritizing monitoring of sensitive inputs, facilitating cost-effective measures to achieve good water status by 2027 and guiding basin management plans across member states.^[113]
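
The sensitivity of tail-risk metrics to the claim severity distribution, noted above for insurance applications, can be illustrated with a closed-form case. The sketch assumes Pareto-distributed claim severities with tail index alpha and unit scale, for which the 99.5% Value-at-Risk and Expected Shortfall have simple expressions; the distribution choice and the parameter values are illustrative assumptions rather than figures from the cited studies.

```python
def pareto_var(p: float, alpha: float, scale: float = 1.0) -> float:
    """Value-at-Risk (p-quantile) of a Pareto(alpha, scale) claim severity."""
    return scale * (1.0 - p) ** (-1.0 / alpha)


def pareto_expected_shortfall(p: float, alpha: float, scale: float = 1.0) -> float:
    """Expected Shortfall beyond VaR_p; finite only when alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("expected shortfall is infinite for alpha <= 1")
    return alpha / (alpha - 1.0) * pareto_var(p, alpha, scale)


p = 0.995
for alpha in (1.5, 2.0, 3.0):
    var = pareto_var(p, alpha)
    es = pareto_expected_shortfall(p, alpha)
    print(f"alpha = {alpha:.1f}: VaR = {var:8.2f}, ES = {es:8.2f}")
```

Moving the tail index from 3 to 1.5 multiplies the 99.5% VaR several times over, which shows why the tail parameter tends to dominate such analyses.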

Sensitivity Auditing and Validation

Sensitivity auditing and validation represent systematic processes to evaluate the reliability, transparency, and robustness of sensitivity analysis (SA) outcomes in modeling applications. Sensitivity auditing extends traditional SA by scrutinizing not only input-output relationships but also underlying assumptions, structural choices, and contextual influences that may affect results, ensuring that model-based inferences support informed decision-making without undue certainty.^[114] This approach is particularly vital in policy-relevant models, where unexamined sensitivities can lead to overconfidence in projections, such as varying estimates of the social cost of carbon depending on discount rates (e.g., $171 per ton at 2% versus $56 per ton at 3%).^[114] Key auditing steps include replicating the SA to verify computational accuracy, rigorously checking model assumptions for plausibility and completeness, and conducting peer reviews of sensitivity indices to assess their interpretability and potential biases. Replication involves re-running analyses with identical inputs to confirm consistency, while assumption checking adopts an "assumption-hunting" mindset to identify overlooked structural elements, such as subjective parameter selections or data limitations.^[115] Peer review of indices, like Sobol' or Morris measures, ensures these quantify true variability rather than artifacts of model formulation.^[116] These steps foster transparency by mandating full disclosure of inputs, code, and uncertainty ranges, aligning with principles from the Sensitivity Analysis of Model Output (SAMO) community's resources.^[117] Validation complements auditing by testing SA results against independent benchmarks, such as experimental or observational data, to confirm predictive fidelity. 
This entails comparing model sensitivities (e.g., how output variance apportions to inputs) with real-world perturbations, like varying inclusion criteria in propensity score matching to evaluate impacts on outcomes such as mortality rates.^[115] Sensitivity to perturbations is assessed through techniques like cross-validation, where data subsets train and test the model, or external validation on unseen cohorts, revealing whether SA conclusions hold under modified conditions.^[115] Such comparisons prevent reliance on unverified models, particularly in high-stakes scenarios. Established frameworks guide these practices, including the seven-rule checklist proposed by Saltelli et al. for extending SA into auditing, which emphasizes rhetorical scrutiny, uncertainty propagation, and participatory dialogue to address values and interests embedded in models.^[116] SAMO-endorsed guidelines, drawn from seminal works like Global Sensitivity Analysis: The Primer, advocate global SA methods (e.g., variance-based indices) as a baseline for auditing, promoting their use before publication to apportion uncertainties comprehensively.^[117] Institutional guidance, such as the European Commission's Better Regulation Guidelines (updated 2021), integrates sensitivity auditing into impact assessments, requiring explicit communication of model limitations to enhance trustworthiness.^[113] The role of sensitivity auditing and validation is to mitigate overconfidence in policy models by highlighting irreducible uncertainties, thereby supporting robust decision-making; for instance, it counters pitfalls like manipulated uncertainty ranges that could mislead stakeholders.^[114] As of 2025, integration with machine learning (ML) auditing tools has advanced robustness checks, enabling dynamic scenario modeling and stress-testing in complex systems, with AI-driven global SA improving predictive accuracy by up to 30% over traditional local methods in financial applications.^[63] These tools facilitate automated perturbation analysis and assumption validation, enhancing scalability while maintaining transparency.^[118]

