# Theory
## Overview
soerp implements second-order error propagation (SOERP), a method for estimating the
statistical moments of a function of random variables. Given a function
\(z = h(x_1, x_2, \ldots, x_n)\) where each \(x_i\) is an independent random variable with
known moments, SOERP computes the first four moments of \(z\) using a second-order Taylor
series approximation. These moments characterize the output distribution — its mean,
variance, skewness, and kurtosis — without requiring Monte Carlo simulation.
The method is particularly useful for:
- Tolerance analysis: predicting how part-to-part variation propagates to system output
- Uncertainty analysis: quantifying output uncertainty from imprecisely known input parameters
- Sensitivity analysis: identifying which input variables contribute most to output variance
## Second-Order Taylor Series Approximation

### System Output Polynomial
Let \(h(x_1, x_2, \ldots, x_n)\) be the function of interest and let \(\nu_{i1}\) denote the expected value (mean) of the \(i\)-th random variable \(x_i\). The function is expanded in a second-order Taylor series about the point \(\boldsymbol{\nu} = (\nu_{11}, \nu_{21}, \ldots, \nu_{n1})\):

\[
z = h(\boldsymbol{\nu}) + \sum_{i=1}^{n} \left.\frac{\partial h}{\partial x_i}\right|_{\boldsymbol{\nu}} (x_i - \nu_{i1}) + \frac{1}{2} \sum_{i=1}^{n} \left.\frac{\partial^2 h}{\partial x_i^2}\right|_{\boldsymbol{\nu}} (x_i - \nu_{i1})^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \left.\frac{\partial^2 h}{\partial x_i \partial x_j}\right|_{\boldsymbol{\nu}} (x_i - \nu_{i1})(x_j - \nu_{j1})
\]

where all partial derivatives are evaluated at the mean values \(\nu_{i1}\).
This expansion is written compactly as a second-order polynomial \(z\) in the centered variables \((x_i - \nu_{i1})\):

\[
z = b_0 + \sum_{i=1}^{n} b_i (x_i - \nu_{i1}) + \sum_{i=1}^{n} b_{ii} (x_i - \nu_{i1})^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} b_{ij} (x_i - \nu_{i1})(x_j - \nu_{j1})
\]
### Polynomial Coefficients
The coefficients are determined directly from the partial derivatives of \(h\) evaluated at the mean values:
| Coefficient | Expression | Description |
|---|---|---|
| \(b_0\) | \(h(\nu_{11}, \nu_{21}, \ldots, \nu_{n1})\) | Intercept (nominal output) |
| \(b_i\) | \(\displaystyle\left.\frac{\partial h}{\partial x_i}\right\rvert_{\boldsymbol{\nu}}\) | Linear (first-order) coefficient |
| \(b_{ii}\) | \(\displaystyle\frac{1}{2}\left.\frac{\partial^2 h}{\partial x_i^2}\right\rvert_{\boldsymbol{\nu}}\) | Quadratic (second-order) coefficient |
| \(b_{ij}\) | \(\displaystyle\left.\frac{\partial^2 h}{\partial x_i \partial x_j}\right\rvert_{\boldsymbol{\nu}}\) | Cross-product (interaction) coefficient, \(i < j\) |
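When analytic derivatives are awkward, the coefficients in the table above can be estimated numerically. A minimal central-finite-difference sketch (the function `h`, the expansion point, and the step size are illustrative choices, not part of soerp's API):

```python
import numpy as np

def taylor_coefficients(h, mu, step=1e-4):
    """Estimate b0, b_i, b_ii, b_ij of the second-order polynomial
    by central finite differences about the mean vector `mu`."""
    mu = np.asarray(mu, dtype=float)
    n = mu.size
    b0 = h(mu)
    bi = np.zeros(n)        # linear (first-order) coefficients
    bii = np.zeros(n)       # quadratic coefficients (the 1/2 is included)
    bij = np.zeros((n, n))  # cross-product coefficients, stored for i < j
    for i in range(n):
        e = np.zeros(n); e[i] = step
        bi[i] = (h(mu + e) - h(mu - e)) / (2 * step)
        bii[i] = 0.5 * (h(mu + e) - 2 * b0 + h(mu - e)) / step**2
    for i in range(n):
        for j in range(i + 1, n):
            ei = np.zeros(n); ei[i] = step
            ej = np.zeros(n); ej[j] = step
            bij[i, j] = (h(mu + ei + ej) - h(mu + ei - ej)
                         - h(mu - ei + ej) + h(mu - ei - ej)) / (4 * step**2)
    return b0, bi, bii, bij

# Example: h(x1, x2) = x1 * x2 + x1**2, expanded about (2, 3).
# Analytically: b0 = 10, b = (7, 2), b11 = 1, b22 = 0, b12 = 1.
h = lambda x: x[0] * x[1] + x[0] ** 2
b0, bi, bii, bij = taylor_coefficients(h, [2.0, 3.0])
```

Because this example `h` is itself a quadratic, the finite-difference estimates agree with the exact coefficients up to rounding error.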
### Centered Output Variable
For the purpose of moment calculations, a centered output variable \(y\) is defined by subtracting the nominal output from \(z\):

\[
y = z - b_0
\]

The central moments of \(y\) (order 2 and above) are identical to those of \(z\). Only the mean differs; the mean of \(z\) is recovered as:

\[
E[z] = b_0 + E[y]
\]
## Input Variable Moments
Each input variable \(x_i\) is independently distributed with central moments \(\mu_{ij}\) for \(2 \le j \le 8\). The central moments are related to moments about the origin by:

\[
\mu_{ij} = E\big[(x_i - \nu_{i1})^j\big] = \sum_{k=0}^{j} \binom{j}{k} (-\nu_{i1})^{j-k}\, \nu_{ik}
\]

where \(\nu_{ik} = E[x_i^k]\) is the \(k\)-th moment of \(x_i\) about the origin (with \(\nu_{i0} = 1\)).
A rigorous second-order propagation requires moments up through eighth order (\(\mu_{i2}\) through \(\mu_{i8}\)). Truncating at fourth-order central moments produces an approximation equivalent to the simplified formulas found in the earlier literature.
### Standardized Input Variables
It is convenient to work with standardized (transformed) variables:

\[
w_i = \frac{x_i - \nu_{i1}}{\sigma_i}
\]

where \(\sigma_i = \sqrt{\mu_{i2}}\) is the standard deviation of \(x_i\). Each \(w_i\) then has zero mean and unit variance, and its central moments depend only on the shape of the distribution, not its scale. For a normally distributed variable, the standardized central moments are:

\[
E[w_i^2] = 1, \quad E[w_i^3] = 0, \quad E[w_i^4] = 3, \quad E[w_i^5] = 0, \quad E[w_i^6] = 15, \quad E[w_i^7] = 0, \quad E[w_i^8] = 105
\]

When using standardized variables, the polynomial coefficients are also rescaled. The linear coefficient for \(w_i\) becomes \(b_i \sigma_i\), the quadratic coefficient for \(w_i^2\) becomes

\[
b_{ii}\,\sigma_i^2,
\]

and the cross-product coefficient for \(w_i w_j\) becomes

\[
b_{ij}\,\sigma_i \sigma_j.
\]
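The standardized central moments of the normal distribution follow the double-factorial pattern \(E[w^{2k}] = (2k-1)!!\), with all odd moments zero, and are easy to tabulate (a standalone sketch, not soerp's internal routine):

```python
import numpy as np

def normal_standardized_moments(max_order=8):
    """Central moments E[w^j] of a standard normal variable:
    odd moments vanish, even moments are the double factorials
    E[w^(2k)] = (2k-1)!! = 1, 3, 15, 105, ..."""
    moments = {}
    for j in range(2, max_order + 1):
        if j % 2 == 1:
            moments[j] = 0.0
        else:
            # double factorial (j-1)!! = (j-1)(j-3)...(3)(1)
            moments[j] = float(np.prod(np.arange(j - 1, 0, -2)))
    return moments

m = normal_standardized_moments()
# m == {2: 1.0, 3: 0.0, 4: 3.0, 5: 0.0, 6: 15.0, 7: 0.0, 8: 105.0}
```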
## Moment Equations
The \(k\)-th moment of \(y\) about the origin is:

\[
E[y^k] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} y^k \, f_1(x_1)\, f_2(x_2) \cdots f_n(x_n) \; dx_1\, dx_2 \cdots dx_n
\]

where \(f_i\) is the probability density function of \(x_i\). Substituting the second-order polynomial approximation and exploiting the independence of the \(x_i\) reduces this integral to an algebraic function of the polynomial coefficients and the central moments of the input variables.
The first four moments are given below. These are the equations solved by SOERP.
### First Moment (Mean of \(y\))

\[
E[y] = \sum_{i=1}^{n} b_{ii}\,\mu_{i2}
\]

Note that the linear terms \(b_i\) do not contribute to the mean because the \(x_i\) are centered at their expected values. Only the quadratic terms \(b_{ii}\) shift the mean.
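For instance (an illustrative hand computation, not soerp's API): with \(z = e^x\) and \(x \sim N(\mu, \sigma^2)\), the coefficients are \(b_0 = e^\mu\) and \(b_{11} = \tfrac{1}{2}e^\mu\), so the second-order mean is \(e^\mu(1 + \sigma^2/2)\), close to the exact lognormal mean \(e^{\mu + \sigma^2/2}\):

```python
import numpy as np

mu, sigma = 0.5, 0.1        # illustrative input mean and standard deviation
b0 = np.exp(mu)             # nominal output h(mu)
b11 = 0.5 * np.exp(mu)      # (1/2) * h''(mu)
mu2 = sigma ** 2            # second central moment of the input

mean_soerp = b0 + b11 * mu2               # E[z] = b0 + sum_i b_ii * mu_i2
mean_exact = np.exp(mu + sigma ** 2 / 2)  # exact lognormal mean
```

The quadratic term shifts the predicted mean above the nominal output \(b_0\), and for this small \(\sigma\) it captures nearly all of the exact mean shift.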
### Second Moment

\[
E[y^2] = \sum_{i=1}^{n} b_i^2\,\mu_{i2} + 2\sum_{i=1}^{n} b_i b_{ii}\,\mu_{i3} + \sum_{i=1}^{n} b_{ii}^2\,\mu_{i4} + 2\sum_{i<j} b_{ii} b_{jj}\,\mu_{i2}\mu_{j2} + \sum_{i<j} b_{ij}^2\,\mu_{i2}\mu_{j2}
\]
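A Monte Carlo cross-check of the algebraic second moment for a two-variable second-order polynomial with independent normal inputs (all coefficients and scales below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
s1, s2 = 0.5, 0.8                                   # input standard deviations
b1, b2, b11, b22, b12 = 1.0, -2.0, 0.3, 0.7, 0.4    # illustrative coefficients

# Central moments of the normal inputs: mu2 = s^2, mu3 = 0, mu4 = 3 s^4
m12, m13, m14 = s1**2, 0.0, 3 * s1**4
m22, m23, m24 = s2**2, 0.0, 3 * s2**4

# E[y^2] from the algebraic moment equation
ey2 = (b1**2 * m12 + b2**2 * m22
       + 2 * (b1 * b11 * m13 + b2 * b22 * m23)
       + b11**2 * m14 + b22**2 * m24
       + 2 * b11 * b22 * m12 * m22
       + b12**2 * m12 * m22)

# Monte Carlo estimate of the same quantity
u1 = rng.normal(0.0, s1, 1_000_000)
u2 = rng.normal(0.0, s2, 1_000_000)
y = b1*u1 + b2*u2 + b11*u1**2 + b22*u2**2 + b12*u1*u2
ey2_mc = np.mean(y**2)
```

With a million samples the Monte Carlo estimate agrees with the closed-form value to well under one percent, which is the point of SOERP: the algebraic equation replaces the simulation.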
### Third Moment
### Fourth Moment
The fourth moment involves single-variable sums, double sums over pairs \((i,j)\), triple sums over triplets \((i,j,k)\), and quadruple sums over quadruplets \((i,j,k,m)\).
Single-variable terms:
Pair terms (\(i < j\)):
Cross-diagonal pair terms (\(j \ne i\)):
Triplet terms (\(i < j < k\)):
Quadruplet terms (\(i < j < k < m\)):
## Output Distribution Characterization

### Central Moments of the Output
The central moments of \(y\) (order 2 and above) are computed from the moments about the origin using the standard relations:

\[
\mu_2 = E[y^2] - E[y]^2
\]
\[
\mu_3 = E[y^3] - 3E[y]E[y^2] + 2E[y]^3
\]
\[
\mu_4 = E[y^4] - 4E[y]E[y^3] + 6E[y]^2E[y^2] - 3E[y]^4
\]
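These standard relations are simple to implement and to sanity-check against computing the central moments directly on sample data (array values chosen arbitrarily):

```python
import numpy as np

def central_from_raw(m1, m2, m3, m4):
    """Convert moments about the origin E[y^k] to central moments."""
    mu2 = m2 - m1**2
    mu3 = m3 - 3*m1*m2 + 2*m1**3
    mu4 = m4 - 4*m1*m3 + 6*m1**2*m2 - 3*m1**4
    return mu2, mu3, mu4

# Check against direct central-moment computation on a small data set
y = np.array([1.0, 2.0, 2.0, 3.0, 7.0])
raw = [np.mean(y**k) for k in range(1, 5)]   # E[y], E[y^2], E[y^3], E[y^4]
mu2, mu3, mu4 = central_from_raw(*raw)
```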
### Skewness and Kurtosis
The shape of the output distribution is characterized by two dimensionless coefficients.
Skewness coefficient:

\[
\sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}}
\]

A symmetric distribution has \(\sqrt{\beta_1} = 0\). Negative values indicate left skew; positive values indicate right skew.
Kurtosis coefficient:

\[
\beta_2 = \frac{\mu_4}{\mu_2^2}
\]

For the normal distribution, \(\beta_2 = 3\). Values greater than 3 indicate heavier tails (leptokurtic); values less than 3 indicate lighter tails (platykurtic). For reference, the uniform distribution has \(\beta_2 = 1.8\).
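As a quick numeric check of the uniform reference value, the population moments of a uniform distribution on \([-1, 1]\) can be approximated on a dense, equally weighted grid (pure NumPy, independent of soerp):

```python
import numpy as np

# Dense grid of equally weighted points standing in for Uniform(-1, 1)
y = np.linspace(-1.0, 1.0, 200_001)
mu2 = np.mean(y**2)      # ~ 1/3 (variance of the uniform)
mu3 = np.mean(y**3)      # ~ 0 (symmetric distribution)
mu4 = np.mean(y**4)      # ~ 1/5

skew = mu3 / mu2**1.5    # sqrt(beta_1)
beta2 = mu4 / mu2**2     # ~ (1/5) / (1/3)^2 = 1.8
```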
### Selecting a Probability Distribution
Given the four computed moments, an appropriate parametric distribution can be fitted to the output. Two common families are:
- Pearson family: covers a wide range of shapes indexed by \(\beta_1\) and \(\beta_2\), including Types I through VII. The Pearson Type IV distribution is appropriate for asymmetric, heavy-tailed outputs.
- Johnson family: provides an alternative set of transformations to normality.
If the computed \(\sqrt{\beta_1}\) is close to 0 and \(\beta_2\) is close to 3, the output is approximately normal and standard Gaussian confidence intervals apply directly.
## Variance Sensitivity
By running SOERP with only a single input variable active at a time (zeroing the coefficients \(b_i\), \(b_{ii}\), and \(b_{ij}\) associated with every other variable), the contribution of each variable to the total output variance can be isolated. This identifies which inputs drive the most uncertainty and shows where precision improvements are most cost-effective.
For the linear approximation, the variance contribution of variable \(i\) is simply \(b_i^2\,\mu_{i2}\). The second-order terms add corrections from \(b_{ii}\) and the cross-products \(b_{ij}\), which can be significant when the input distributions are strongly skewed or the function is strongly nonlinear.
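The first-order decomposition can be sketched directly; the example function \(h = x_1 x_2 + 3x_3\) and the input moments below are illustrative choices, not soerp's API:

```python
import numpy as np

# h(x1, x2, x3) = x1 * x2 + 3 * x3, expanded about the input means
mu = np.array([10.0, 2.0, 5.0])        # input means
sigma = np.array([0.1, 0.05, 0.2])     # input standard deviations
b = np.array([mu[1], mu[0], 3.0])      # dh/dx_i evaluated at the means

contrib = b**2 * sigma**2              # per-variable variance contribution
total = contrib.sum()                  # first-order output variance
ranking = np.argsort(contrib)[::-1]    # most influential variable first
```

Here the contributions are 0.04, 0.25, and 0.36, so \(x_3\) dominates the output variance even though its nominal sensitivity is modest, because its input spread is largest.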