The Monte Carlo method
Principle
The name Monte Carlo was coined by Nicholas Metropolis, John von Neumann and Stanislaw Ulam during their work on the Manhattan Project, as a reference to the Monte Carlo Casino in Monaco [1].
The principle of a Monte Carlo calculation is to generate random numbers, use these numbers to perform a simulation, repeat the simulation many times to build a large sample of results, and finally use statistical methods to estimate the desired quantities (central values, standard deviations, covariances, …).
This type of simulation, when done properly, provides an efficient way to study complex problems while taking into account the uncertainties of, and the possible correlations between, input parameters and variables.
There is no single strict definition of what a Monte Carlo calculation should look like, but it generally follows these steps:
1. Draw random numbers from a given probability density function to create a new set of parameters.
2. Compute one or several values using the parameters obtained in the previous step, as well as other data. Each computed value is stored in a stack.
3. Repeat steps 1 and 2 many times (until convergence).
4. From the values stored in the stack, the average value, standard deviation and (when applicable) covariance matrix are calculated.
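As an illustration, these four steps could be sketched in Python as follows; the Gaussian parameter distributions, the function f and all numerical values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng()

def f(a, b):
    # Hypothetical computation using the sampled parameters.
    return a * np.exp(-b)

results = []                      # the stack of computed values
for _ in range(10_000):           # step 3: repeat many times
    a = rng.normal(1.0, 0.1)      # step 1: draw parameters from their PDFs
    b = rng.normal(0.5, 0.05)
    results.append(f(a, b))       # step 2: compute and store the value

results = np.asarray(results)     # step 4: statistical estimates
print(f"x = {results.mean():.4f} +/- {results.std(ddof=1):.4f}")
```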
The use of Monte Carlo methods is very widespread in nuclear physics. One can note in particular that transport codes like MCNP [2] or Geant4 [3] use Monte Carlo methods. The same method can be used for some evaluation processes [4] or to recreate a nucleus’ level scheme [5].
Convergence
At the end of a Monte Carlo calculation, one typically ends up with a series of values \(\{x_1, x_2, x_3, ..., x_N\}\), with \(N\) a large number. The next step is to compute the mean value \(\bar x = \frac{1}{N} \sum_{i=1}^{N} x_i\) and the standard deviation \(\sigma_x = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i-\bar x)^2}\) (the uncertainty associated with \(x\), \(u_x\), is taken to be equal to the standard deviation \(\sigma_x\)).
It is obvious that for these values to be relevant, one needs \(N\) to be larger than 3… But how large should it be?
Unfortunately, there is no universal answer to that question, and particular attention should be paid to this issue of convergence.
Convergence accomplished?
We say that a Monte Carlo calculation has converged when a sufficient number of iterations has been performed (i.e. \(N\) is large enough) for the \(\{x_i\}\) set to satisfy a certain convergence criterion. The criterion to test for, however, is not unique, and is left to the appreciation of the person performing the calculation. Depending on the situation, one or more criteria will be appropriate.
Warning
An important thing to keep in mind when choosing a criterion is that the stricter it is, the harder it will be to achieve. Calculations may then run for a very long time and require increasingly large amounts of CPU time and memory (for the intermediate results stack). A poorly chosen criterion might even make your Monte Carlo code run forever. A failsafe should always be implemented.
We will now describe some of the possible criteria.
Number of iterations
The simplest convergence criterion is setting a fixed number of iterations to run. It offers the safety of always stopping at the same point, but does not guarantee that the chosen number is large enough to yield a statistically relevant \(\{x_i\}\) set to work with.
Using a (very) large maximum number of iterations in combination with another criterion is a good failsafe: it ensures the code will eventually stop.
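A minimal sketch of such a failsafe, assuming a placeholder iteration and an arbitrary example criterion (here the standard error of the mean, discussed in the next section):

```python
import numpy as np

rng = np.random.default_rng()
N_MAX = 1_000_000                    # failsafe: the loop can never exceed this

results = []
for n in range(1, N_MAX + 1):
    results.append(rng.normal())     # placeholder for one MC iteration
    # Example criterion, checked every 1000 steps for efficiency.
    if n % 1000 == 0 and np.std(results, ddof=1) / np.sqrt(n) < 1e-2:
        break
else:
    print("warning: stopped by the failsafe before reaching convergence")
```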
Target uncertainty on the mean value
The uncertainty on the mean value \(\bar x\) (we are not talking about the standard deviation here) is \(\sqrt{\frac{1}{N^2} \sum_{i=1}^{N} \mathrm{var}(x_i)}\), where \(\mathrm{var}(x_i)\) is the square of the uncertainty associated with the obtained value \(x_i\) [6]. If all \(x_i\) have the same uncertainty \(\varsigma_x\), then the uncertainty on the mean value is \(\varsigma_x / \sqrt{N}\).
It is a well-known result that the more data points one has, the easier it is to determine their mean value.
This criterion leads to reasonably quick convergence (for example, \(N=100\) gives an uncertainty on \(\bar x\) of about \(\varsigma_x / 10\)). However, it tells us nothing about the standard deviation or the covariances of the result.
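In code, this criterion amounts to stopping once the standard error of the mean falls below a chosen target; a minimal sketch (the function name and signature are illustrative):

```python
import numpy as np

def mean_converged(results, target):
    """True once the standard error of the mean is below `target`."""
    x = np.asarray(results)
    # Standard error of the mean ~ varsigma_x / sqrt(N).
    return len(x) > 1 and x.std(ddof=1) / np.sqrt(len(x)) <= target
```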
Target standard deviation
The Monte Carlo calculation may converge quickly on the mean value while the standard deviation of the \(\{x_i\}\) set is still very large (for example, if the \(x_i\) values are evenly distributed around \(\bar x\) but with a very broad distribution). Remember that it is the standard deviation of the final \(\{x_i\}\) set that counts, much more than the ease of finding the central value.
So, it is a good idea to target the standard deviation and continue the Monte Carlo iterations until we reach \(\sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i-\bar x)^2} \leq \sigma_\text{target}\).
One then has to choose the target standard deviation \(\sigma_\text{target}\) wisely. If it is too small, it will never be reached and the code may run forever. More importantly, it has to be meaningful: is the target standard deviation the value needed for comparison with another result? To discriminate between theories?
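This criterion can be sketched in the same way (the function name and the n_min guard value are, again, illustrative choices):

```python
import numpy as np

def std_converged(results, sigma_target, n_min=30):
    """True once the sample standard deviation is below `sigma_target`.

    `n_min` avoids triggering on the first few, still noisy, iterations.
    """
    x = np.asarray(results)
    return len(x) >= n_min and x.std(ddof=1) <= sigma_target
```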
Step to step change in mean value
What if we have no way to set a meaningful target for the uncertainty on the mean or for the standard deviation? This may happen when there is no other value to compare to.
In that case, one can observe the step-by-step (possibly averaged over some number of steps) change in the result. Convergence can be considered achieved once the calculated mean value is stable, i.e. \(\left| \frac{1}{P}\sum_{i=1}^{P} x_i - \frac{1}{P'}\sum_{i=1}^{P'} x_i \right| \leq \epsilon\), where \(P\) and \(P'\) are two successive numbers of iterations, with a value of \(\epsilon\) chosen accordingly.
With this convergence test, one may add the condition of a minimum number of steps, as the random process might lead the first few steps to mean values close enough to trigger the convergence test prematurely.
Step to step change in standard deviation
The same method can be applied to the standard deviation, targeting stability in \(\sigma_x\). The step-to-step changes in \(\bar x\) and \(\sigma_x\) are complementary and can be applied together.
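The two step-to-step criteria can be sketched together, comparing the running mean and standard deviation before and after the last block of iterations (block size and epsilon values are arbitrary):

```python
import numpy as np

def stable(results, block=1000, eps_mean=1e-3, eps_std=1e-3):
    """True when mean and std barely change over the last block of steps."""
    x = np.asarray(results)
    if len(x) < 2 * block:              # minimum number of steps guard
        return False
    prev, curr = x[:-block], x          # estimates before / after the block
    return (abs(prev.mean() - curr.mean()) <= eps_mean
            and abs(prev.std(ddof=1) - curr.std(ddof=1)) <= eps_std)
```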
When to test for convergence
The convergence test may be applied “dynamically” (at the end of each step, to see if it is necessary to continue), or a posteriori, to check if the number of performed iterations was enough.
When applying the same Monte Carlo calculation to several quantities, it is important that the convergence criteria and the number of iterations remain consistent between them. Ideally, one will process all quantities with the maximum number of iterations needed to ensure that convergence is reached for each of them.
Full Monte Carlo data analysis to produce uncertainties and covariances
By full Monte Carlo, we mean that the data analysis is done entirely within a Monte Carlo loop. This is opposed to a partial Monte Carlo analysis, where only a subset of the analysis steps is performed using random sampling. For example, for the DNR group, I developed with F. Claeys during his PhD (2019-2023) an intermediate, semi-Monte Carlo data analysis code [7], [8] and [9].
The workflow of a full Monte Carlo analysis is given in Figure 11.
First, the distributions associated with each parameter, as well as the raw data, are prepared (for performance's sake, all the preparatory steps, and anything else that does not require sampling the parameters, are done before entering the Monte Carlo loop).
At the start of each iteration, the relevant parameters are sampled according to their given distributions. The analysis is then performed automatically (if it requires human input, the iteration process will be much slower, but it is still possible), starting from the raw spectra up to the final value. If uncertainties are generated at some analysis steps (typically, the uncertainty from a peak fit in a spectrum), this intrinsic uncertainty is carried over and stored with the final value, to be combined with the Monte Carlo uncertainty (i.e. the parametric uncertainty, coming from the dispersion of the results).
Once all the iterations are done (according to the chosen convergence criterion), a final stage gathers all the iteration results and computes their central values, standard deviations (which will be labeled the parametric uncertainty) and covariances (the covariance is only relevant to the parametric uncertainty, as the intrinsic uncertainties are normally independent).
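Schematically, such a loop could look like the following sketch; the analyse function, the sampled parameters and all numerical values are placeholders, not the actual code of [7]:

```python
import numpy as np

rng = np.random.default_rng()

def analyse(raw_data, efficiency, background):
    """Placeholder for the full, automated analysis chain.

    Returns the final value and its intrinsic uncertainty (e.g. from a fit).
    """
    value = raw_data.sum() * efficiency - background
    return value, 0.01 * abs(value)       # hypothetical intrinsic uncertainty

raw_data = np.ones(100)                   # prepared once, outside the loop
values, intrinsic = [], []
for _ in range(10_000):                   # or until a convergence criterion
    eff = rng.normal(0.95, 0.02)          # sample each parameter from its PDF
    bkg = rng.normal(5.0, 0.5)
    v, u = analyse(raw_data, eff, bkg)
    values.append(v)
    intrinsic.append(u)

values = np.asarray(values)
parametric = values.std(ddof=1)           # the "parametric" uncertainty
intrinsic_u = np.mean(intrinsic)          # one simple way to summarize it
print(f"x = {values.mean():.3f} "
      f"+/- {parametric:.3f} (param.) +/- {intrinsic_u:.3f} (intr.)")
```

When several quantities are computed per iteration, their parametric covariance matrix can be obtained by applying np.cov to the stacked result arrays.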
We note that with this method, the correlations between parameters are automatically taken into account; that is one of its major advantages.
Footnotes