Uncertainty and covariance?
As in many other scientific fields, nuclear physics measurements come with a degree of uncertainty. The concept of uncertainty expresses the level of confidence in the measurement results. It can be characterized by a range of values, a probability distribution, or a standard deviation. The sources of uncertainty can be numerous: the precision of the instrument, the limited statistics, or the unknown values of some parameters. Whatever the source, the proper characterization of uncertainty is essential for the correct interpretation of experimental results and for the comparison of these results with other measurements and/or theoretical predictions.
The discussion in this section takes a practical point of view, and the quoted formulas might differ from the purely mathematical ones found in statistics textbooks.
Writing convention
In the following, the uncertainty on a quantity \(x\) will be denoted \(u_x\) in the case of a Gaussian probability density function and \(\delta_x\) for a uniform one [1]. The central value of \(x\) is denoted \(\bar x\).
Uncertainties: definitions, usage, methods
The intrinsic uncertainties in nuclear physics experiments (and in particular in the determination of inelastic neutron scattering cross-sections) come from many different sources: time interval between clock-ticks, limited precision on distances, unknown contaminants in target materials, trigger rate limitation and signal processing in the acquisition (a.k.a. pile-up and deadtime), …
All the original uncertainties in a physics analysis have to be evaluated (it is the nature of uncertainty that it cannot be measured [2]) to correctly take them into account. There are basically two ways to perform the uncertainty estimation, corresponding to the two uncertainty types [3].
- Type A
The first type of uncertainty is determined by statistical methods, mostly via a repeated set of measurements. The standard deviation of the obtained distribution of results is kept as the uncertainty.
- Type B
The other type of uncertainty is … any other method. For a type B uncertainty, the estimated variance (i.e. \({u_x}^2\)) is evaluated using knowledge about the setup used for the determination of the value, the use of a model or prescription, …
Important
The fact that one uncertainty is evaluated via a type A or type B method does not imply that the source of the uncertainty is systematic or statistical. Indeed, this categorization is often misleading, as the statistical uncertainty in one case might be systematic in another (see later).
In addition to a central value and its associated standard deviation, the uncertainty is more generally defined by a probability density function (pdf, not PDF) that will generally follow a generic mathematical shape (Gaussian, uniform over an interval, triangular, log-normal, Poisson, Maxwell-Boltzmann, … see examples in Figure 8). The standard deviation is usually defined for these shapes, but does not fully characterize them.
Therefore, one should consider all values for which an uncertainty is associated (whether it is the result of a measurement, or a parameter used in the analysis) as a random variable. That means that a measured value of \(x\) is one realization of the random variable \(X\), defined by a probability density function for which a mean value \(\bar x\) and a standard deviation \(u_x\) can be computed [4].
Combination of uncertainties
The combination of uncertainties is usually done with the square summation formula. Considering the quantity \(a\), obtained from two independent variables \(x\) and \(y\) with the function \(f(x,~y)\), the variance of \(a\) is given by:
\[{u_a}^2 = \left(\frac{\partial f}{\partial x} (\bar x,~\bar y) \right)^2 \times {u_x}^2 + \left(\frac{\partial f}{\partial y} (\bar x,~\bar y) \right)^2 \times {u_y}^2\]
The formula can be generalized to the case of many variables. One can easily obtain it by first order series expansion on the function \(f\) around \((\bar x,~\bar y)\), using the definition that \(\left<(\bar x - x)^2\right> = {u_x}^2\) (i.e. the variance of \(x\) is the average of the squares of the spread of the realizations of \(x\) around the random variable mean value) [5].
In cases where the variables are not independent, an extra term appears: \(+ 2 \times \frac{\partial f}{\partial x} (\bar x,~\bar y) \times \frac{\partial f}{\partial y} (\bar x,~\bar y) \times \left< (x - \bar x)(y - \bar y) \right>\) (note that this term may be negative), where \(cov(x, y)=\left< (x - \bar x)(y - \bar y) \right>\) defines the covariance between the two random variables \(x\) and \(y\) (the variance \({u_x}^2\) is by definition equal to \(cov(x,x)\)). Additionally, the correlation between the two variables is \(corr(x, y)=\frac{cov(x,y)}{u_x u_y}\).
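As a minimal sketch, the first-order propagation formula with its covariance cross term can be written in a few lines. The choice \(f(x, y) = x \times y\) and all numerical values are illustrative assumptions, not taken from the text.

```python
import numpy as np

# First-order ("square summation") propagation for a = f(x, y),
# including the covariance cross term. The choice f(x, y) = x * y
# and all numbers below are illustrative.
x_bar, u_x = 10.0, 0.3
y_bar, u_y = 5.0, 0.2
cov_xy = 0.04  # hypothetical covariance between x and y

# Partial derivatives of f evaluated at the central values
df_dx = y_bar  # d(xy)/dx = y
df_dy = x_bar  # d(xy)/dy = x

# Variance of a: independent terms plus the 2 * df/dx * df/dy * cov term
u_a2 = (df_dx * u_x) ** 2 + (df_dy * u_y) ** 2 + 2 * df_dx * df_dy * cov_xy
u_a = np.sqrt(u_a2)  # standard uncertainty on a
```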
Note
Two variables can be strongly dependent and still have a covariance equal to 0.
For example, if \(x\) is randomly distributed along the real axis symmetrically around 0, and \(y = x^2\), then \(cov(x,y)=0\) despite \(y\) being completely determined by \(x\).
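This note is easy to check numerically; the sketch below (with an arbitrary Gaussian choice for \(x\)) estimates the sample covariance of \(x\) and \(y = x^2\):

```python
import numpy as np

# Numerical check: x symmetric around 0 (here Gaussian, an arbitrary
# choice), y = x**2 fully determined by x, yet cov(x, y) ~ 0.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 1_000_000)
y = x ** 2
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
# cov_xy is compatible with 0 within the sampling noise
```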
But [6], the previous formulas for uncertainty combination apply only in the case of small, symmetrical deviations around the mean values. In practice, this is not always the case (normalization, for example, may prevent symmetric variations around a central value). The exact combination of uncertainties should be done by convolution of the probability density functions (a random sampling approach can be an efficient way to perform this combination). This ensures that any shape and type of uncertainty is correctly combined. In general, if one considers two random variables \(x\) and \(y\), with probability density functions \(P_x\) and \(P_y\), the distribution of the possible values of \(a = f(x, y)\) will be \(P_a(a) = \iint \delta(a-f(x,y))\, P_x(x)\, P_y(y)\, dx\, dy\).
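The random sampling approach mentioned above can be sketched as follows: draw many realizations of \(x\) and \(y\) from their pdfs, apply \(f\) to each pair, and characterize the resulting distribution of \(a\). The pdf shapes, numerical values, and the choice \(f(x, y) = x \times y\) are illustrative assumptions.

```python
import numpy as np

# Monte Carlo combination of uncertainties: sample the input pdfs,
# push every sample through f, and study the resulting distribution.
rng = np.random.default_rng(42)
n = 1_000_000
x = rng.normal(10.0, 0.3, n)      # Gaussian pdf for x (illustrative)
y = rng.uniform(4.8, 5.2, n)      # uniform pdf for y (illustrative)
a = x * y                         # a = f(x, y), here f = product

a_bar = a.mean()                  # central value of a
u_a = a.std(ddof=1)               # standard deviation, kept as u_a
# For asymmetric shapes, quantiles describe P_a better than u_a alone:
lo, hi = np.quantile(a, [0.16, 0.84])
```

Unlike the first-order formula, this approach remains valid for large or asymmetric deviations, at the cost of sampling statistics.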
Statistical vs. Systematic uncertainties
It is very common in physics results to separate the so-called statistical and systematic uncertainties. However, it is not always clear what is meant by these words. Let’s clarify in the following paragraphs.
- True statistical uncertainty
Literally, the statistical uncertainty could be understood as uncertainty determined from the statistical distribution of results (for example, read “Introduction to Statistical vs. Systematic Uncertainty”, one of the top results in a Google search for statistical uncertainty). In other words, it is a Type A uncertainty.
This uncertainty will follow the Statistics-101 formula \(\bar x = \frac{1}{N} \sum_i {x_i}\), with the uncertainty \(u_x\) being equated to the standard deviation \(\sigma_x\): \({\sigma_x}^2 = \frac{1}{N-1} \sum_i {(x_i - \bar x)^2}\).
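As an illustration, a Type A evaluation with these Statistics-101 formulas might look like the sketch below (the readings are invented):

```python
import numpy as np

# Type A evaluation: repeated measurements of the same quantity.
readings = np.array([4.98, 5.02, 5.01, 4.97, 5.03, 5.00])  # made-up data

x_bar = readings.mean()               # central value
sigma_x = readings.std(ddof=1)        # the 1/(N-1) formula from the text
# If the quantity of interest is the mean itself, its standard
# uncertainty is further reduced by sqrt(N):
u_mean = sigma_x / np.sqrt(len(readings))
```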
- Uncertainty from limited Statistics
In (nuclear) physics, statistical uncertainty is more generally understood as uncertainty arising from the limited statistics, in other words “we are not exactly sure of the value of the quantity of interest because we could not run the experiment forever“. This statistical uncertainty typically arises from counting something (events, atoms, …) and will propagate through the usual uncertainty combination to the final value.
The default estimation of the statistical uncertainty on a quantity \(N\) is usually taken as \(\sqrt{N}\). It can be (and usually is) true, but it is not a universal truth. The \(\sqrt{N}\) is actually a limit case that traces back to the binomial law: if \(N_0\) independent events happen, each with a probability \(p\) to be observed, the (discrete) probability density function of the observed number of events \(N\) is \(P(N) = \frac{N_0 !}{N ! (N_0 - N) !} p^N (1-p)^{N_0 -N}\). The average of \(N\) according to this distribution is \(\bar N = N_0 \times p\), with a standard deviation \(\sigma_N = \sqrt{N_0 \times p \times (1-p)}\). When \(p \longrightarrow 0\), the distribution \(P(N)\) can first be approximated by a Poisson distribution (if \(N\) is not too big) and ultimately (if \(N_0 \longrightarrow \infty\)) by a Normal (i.e. Gaussian) distribution centered on \(\bar N = N_0 \times p\): \(P(N) = \frac{1}{\sigma \sqrt{2 \pi}} \exp(- \frac{(N-\bar N)^2}{2 \sigma^2})\) with \(\sigma = \sqrt{\bar N}\).
Therefore, when one declares that the statistical uncertainty on a measured quantity \(N_m\) is \(u_{N_m} = \sqrt{N_m}\), the quoted uncertainty is of type B. It is assumed that the observed number is the mean value of a Gaussian distribution and that the associated standard deviation (taken as the uncertainty) is \(\sqrt{N_m}\). (If you prefer to use \(2 \sigma\) or \(3 \sigma\) as the uncertainty on a quantity, then your statistical uncertainty should be \(2~\text{or}~3 \sqrt{N_m}\)).
It is important to remember where the \(\sqrt{N}\) uncertainty comes from. It does not hold for very low values of \(N\) (use the Poisson distribution for these cases), and is even less valid if the observation probability \(p\) is high (going back to the binomial law, if \(p = 0.5\) for example, \(\sigma = 0.5 \sqrt{N_0}\)).
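The limits of the \(\sqrt{N}\) shortcut can be made concrete by comparing it with the exact binomial standard deviation; the values of \(N_0\) and \(p\) below are arbitrary:

```python
import numpy as np

def binomial_sigma(N0, p):
    """Exact standard deviation of the observed count for a binomial law."""
    return np.sqrt(N0 * p * (1 - p))

N0 = 10_000

# Small p: sqrt(N_bar) is an excellent approximation
sigma_small_p = binomial_sigma(N0, 0.001)   # exact value, ~3.16
approx_small_p = np.sqrt(N0 * 0.001)        # sqrt(N_bar) shortcut

# p = 0.5: the shortcut overestimates sigma by a factor sqrt(2)
sigma_half = binomial_sigma(N0, 0.5)        # exact: 0.5 * sqrt(N0) = 50
approx_half = np.sqrt(N0 * 0.5)             # sqrt(5000), ~70.7
```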
Finally, not all measured numbers follow an event-observation logic. If a number of atoms is determined by weighing, there is absolutely no justification to use \(\sqrt{N}\) to estimate the uncertainty.
- Systematic uncertainty
This covers any source of uncertainty that is not related to limited statistics.
Generally, it comes from the limited precision of an instrument; the value is always the same over repeated measurements and cannot be reduced further by recording more data in the same experimental conditions. As such, the systematic uncertainty is irreducible, while the so-called statistical one, by nature, can be reduced.
The systematic uncertainty may still be determined by statistical methods (type A).
Warning
Reducing the overall uncertainty is not always just a matter of increasing the statistics. A more intense beam, longer recording times, … may come with increased systematic uncertainties (from increased background or parasitic reactions, …). In other words, a gain in statistics may come at a price in systematics.
Note
The statistical uncertainty on a measurement can become a systematic source of uncertainty on another one. For example, when determining the activity of a source, the limited statistics will be a statistical uncertainty on the recorded value of the activity. Later, when using the source to determine an efficiency, the uncertainty on the activity will be a systematic one (but multiple measurements might be used to determine the systematic Type A uncertainty on the efficiency via statistical methods).
Note
For a value obtained at the end of a data analysis process, one may prefer the labels intrinsic and parametric uncertainties (which we will use later in this manuscript). The latter is the uncertainty associated with the uncertainties on analysis parameters (cuts and selections, scaling and normalizations, …), while the former is the uncertainty that comes directly from the analysis process (e.g. results of fits, …) and is “intrinsic” to the dataset (including its limited statistics).
Whatever the scale of the different sources of uncertainty in the final result, the total combined uncertainty on the measurement is the quantity that matters. It is not particularly important whether most of the uncertainty is due to systematic effects or to the limited statistics. Given the same final combined uncertainty, one measurement will not be “better” because it boasts a smaller systematic uncertainty.
However, splitting between the two kinds may be justified for data evaluation, where they can be processed differently.
Covariance, why it matters?
We hinted at covariance just above. Covariances are, to put it simply, the links between values. For illustration, let’s imagine we measure a quantity (let’s say a cross-section) at two different energies.
Our experiment and data analysis will yield two values \(\sigma_1\) and \(\sigma_2\), with their respective uncertainties \(u_{\sigma_1}\) and \(u_{\sigma_2}\). The quantities \(\sigma_i\) are derived from the numbers of counts \(N_i\) (and the associated uncertainties \(u_{N_i}\)) by \(\sigma_i = N_i / \epsilon\), with \(\epsilon\) a parameter common to both values. Since the two values have been obtained with the same setup, using the same data analysis procedure, their uncertainties are linked together by the common parameter \(\epsilon\).
The efficiency itself has an uncertainty \(u_\epsilon\), and we can express the uncertainty on our cross section (here we use the simplified square summation formula) by \({u_{\sigma_i}}^2 = \left(\frac{u_{N_i}}{\epsilon}\right)^2 + \left(\frac{u_\epsilon N_i}{\epsilon^2}\right)^2\). If the dominant source of uncertainty in our experiment is the detection efficiency, then we can neglect the uncertainty on \(N_i\).
The relative uncertainty \(u_{\sigma_i} / \sigma_i\) is therefore approximately given by \(u_\epsilon / \epsilon\). It is easy then to understand that if we underestimate the central value of \(\epsilon\), both \(\sigma_1\) and \(\sigma_2\) will be overestimated by the same relative amount.
Or, to summarize: the common source of uncertainty creates a leverage between uncertainties of different variables.
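This leverage can be illustrated with a small random sampling experiment; all the numbers below (counts, efficiency, spreads) are invented for the illustration:

```python
import numpy as np

# Two cross sections sigma_i = N_i / epsilon sharing the same efficiency.
rng = np.random.default_rng(1)
n = 200_000
N1 = rng.normal(1000.0, np.sqrt(1000.0), n)  # counts with sqrt(N) spread
N2 = rng.normal(4000.0, np.sqrt(4000.0), n)
eps = rng.normal(0.20, 0.02, n)              # common efficiency, 10% relative

sigma1 = N1 / eps
sigma2 = N2 / eps
corr = np.corrcoef(sigma1, sigma2)[0, 1]
# corr is close to 1: the shared epsilon dominates both uncertainties
```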
Covariances to interpret experimental data
The knowledge of covariance is a powerful tool for interpretation of experimental data. Let’s take the example in Figure 9 with two experimental points (the \(\sigma_i\) from before), that are compared to two theoretical model predictions.
If one tries to distinguish which of the two models better fits the experimental data without considering the covariance, one can for example compute the \(\chi^2\) value for both models and compare. In the presented example, the values are \(0.782\) for model 1 and \(0.784\) for model 2. In other words, without covariance information, both models are compatible with the data at the same level of compatibility.
But we know that the uncertainties are linked, just as if uncertainty bars had a direction (in other words, it’s more like uncertainty arrows). The graph in Figure 9 is therefore better when represented as in Figure 10.
Generalized \(\chi^2\) formula using covariance
The generalized formula for computing the \(\chi^2\) using covariances uses a matrix notation: \(\chi^2 = (X - \bar X)^T {Cov}^{-1} (X - \bar X)\) with \(X\) the vector of values, \(\bar X\) the vector of model values to compare to and \(Cov\) the covariance matrix.
Tip
For two variables only, \(x\) and \(y\) with uncertainties \(u_x\) and \(u_y\) respectively, and \(r_{x,y}\) the correlation between the two variables (i.e. \(cov(x,y)=r_{x,y} \times u_x \times u_y\)), the generalized \(\chi^2\) is computed by the formula: \(\chi^2 = \frac{{u_y}^2}{{u_x}^2 {u_y}^2 - (r_{x,y} u_x u_y)^2} (x - \bar x)^2 + \frac{{u_x}^2}{{u_x}^2 {u_y}^2 - (r_{x,y} u_x u_y)^2} (y - \bar y)^2 - \frac{2 r_{x,y} u_x u_y}{{u_x}^2 {u_y}^2 - (r_{x,y} u_x u_y)^2} (x - \bar x)(y - \bar y)\). It’s a bit of a mouthful, but if \(r_{x,y}=0\), we fall back on the classic uncorrelated \(\chi^2\).
When we take the covariance into account (arbitrarily assuming a 0.95 correlation between the two variables), the obtained \(\chi^2\) values are \(0.0002\) and \(1.5704\) for model 1 and model 2 respectively [7]. Just by considering the extra information of covariance, we can now say that the first model has a high compatibility with the data, while the second one is much less compatible.
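In matrix form, the generalized \(\chi^2\) is a one-liner. The sketch below uses placeholder numbers (not those of Figure 9) but reproduces the mechanism discussed here: two models that are equally compatible without covariance become very different once a strong correlation is included.

```python
import numpy as np

def chi2_with_cov(X, X_model, cov):
    """Generalized chi2: (X - X_model)^T Cov^-1 (X - X_model)."""
    d = np.asarray(X, float) - np.asarray(X_model, float)
    return float(d @ np.linalg.inv(cov) @ d)

# Placeholder numbers, not the ones of Figure 9
u1, u2, r = 0.10, 0.10, 0.95
cov = np.array([[u1 ** 2, r * u1 * u2],
                [r * u1 * u2, u2 ** 2]])

X = [1.05, 1.07]                                 # "measured" points
chi2_m1 = chi2_with_cov(X, [1.00, 1.02], cov)    # both points shifted alike
chi2_m2 = chi2_with_cov(X, [1.10, 1.02], cov)    # points shifted oppositely
# Without the off-diagonal terms, both models would give chi2 = 0.5;
# with a 0.95 correlation, chi2_m1 ~ 0.26 while chi2_m2 = 10.0
```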
The important thing about uncertainties
By definition, the uncertainties cannot be known exactly [8]; they are estimated or evaluated. In any case, to correctly use the information from the uncertainties, as well as the covariance and correlation matrices, it is important to know how they have been constructed. To put it another way: the way the uncertainties are obtained is as important as the uncertainty values themselves.
That is why it is important to fully document the sources of uncertainty in a result, and how they are combined to produce the final, published, uncertainty.
The principles of Open Science, whether it is Open Data or Open Source, are a great way to ensure the maximum documentation on how the uncertainties were produced.
Following this idea, the Implementation of the Monte Carlo analysis chapter will describe in detail how a full Monte Carlo analysis code is designed and how it produces central values, uncertainties, and covariance and correlation matrices. That code is available in Open access and can be completely examined to understand the way the computations are performed.
Such a fully open code, with the ability to be re-executed, answers the need for detailed documentation of how the data is analyzed and the uncertainties produced, which is required for an accurate and meaningful evaluation of experimental values.
Note
The amount of time during which the data, code, … should be archived depends on the importance of the data and on the ease of repeating the measurement. Still, a properly archived dataset and code should be easy to retrieve and reuse.
Dissemination via Open channels is also a way to maximize the future availability of the archives, as they are more likely to be copied in different places.
Footnotes