Fourier transformation tools are used to obtain information about spectra for a given data set. As any data has an uncertainty, Fourier transformation techniques can be supported by probabilistic theory captured by Bayes’ theorem  to improve scientific results and conclusions. The works   demonstrate some advantages of the probabilistic ansatz over a conventional approach.
A common and misleading assumption in the field is that the Nyquist theorem determines the spectral band limitation. However, as the Nyquist frequency follows from finite data sampling, it can only be an upper limit but cannot be an estimator for the limit caused actually by a source and/or diagnostic throughput. A similar reasoning applies to the lower spectral limit. After a basic formalism has been derived, it could be shown in Ref.  that the band limitation can be well inferred from experimental data. Thereby, a linear Gaussian inversion technique was used to infer spectral amplitudes. Furthermore, a settings posterior was introduced which estimates non-linear parameters and hyperparameters of a problem. For instance, the spectral band limits and the uncertainty on the derived quasi-continuous spectrum, originating in non-probed Fourier coefficients, have been inferred jointly.
For many applied analysis schemes, certain model assumptions and their implications are not tested but assumed to be valid with infinite trust. From scientific point of view, any analysis scheme should be tested given simulated noisy data for which all model assumptions are clear. Then, analysis results and model assumptions can be investigated which is achieved objectively by a probabilistic ansatz. If this can be carried out analytically, valuable information is available when only actual measured data are given for a scientific problem.
The problem of a monochromatic source is a good example to show the powerfulness of Bayesian inference and to test model assumptions. In this work, the formalism derived in Ref.  is adapted to a monochromatic problem in Section 2 and applied to simulated data with different noise levels in Section 3. In addition, analytical and numerical investigations are carried out, so that the probabilistic findings can be understood in more detail. For example, a better signal-to-noise ratio improves the estimation of the frequency, while more data points compensate a worsened signal-to-noise ratio. The conclusion section can be found at the end.
2. Adapted Probabilistic Formalism
The formalism derived in Ref.  is adapted below to a monochromatic problem, involving one frequency parameter and one amplitude parameter . In addition, a hyperparameter is at hand, entering in the normal prior for . Abstractly, Bayes’ theorem reads then
with the joint posterior on the left-hand, the likelihood devided by the evidence, the amplitude prior the prior for the hyperparameter, and the prior for the frequency .
The amplitude maps linearly to the data domain via the vector dependent on the frequency in a non-linear manner. The data with entries are assumed to be acquired independently, and the measurement uncertainty of each data point follows a normal distribution with standard deviation . With these assumptions, the Gaussian likelihood becomes
with the covariance matrix . The amplitude prior takes the form
for vanishing prior mean and variance . After some analytical operations, one obtaines the Gaussian amplitude posterior
conditional on and . In the above equation, the posterior variance is given by , and the posterior mean reads .
From the remaining terms, the so-called settings posterior
is derived which can be interpreted as Ockham’s razor with respect to . This settings posterior has no general analytical solution, and, thus, the normalisation constant cannot be stated further. For the remainder of the paper, and are taken as uniform distributions. By this choice, can be used to investigate analytically and numerically the quantity
Finally, the evidence can be identified as
The constant varies with the chosen model H, including likelihood, priors, and the prior domains for and . Hence, this constant is linked to the model posterior which is of importance, when the model is further abstracted, or a comparison with an alternative model is carried out.
3. Monochromatic Problem
For a monochromatic even source, data sets with different noise levels are simulated, and the formalism derived in the previous section is applied. To explain the results found, analytical and numerical investigations are carried out. While this problem is treated abstractly in the following, the main results will be presented for two examples with low and high noise contribution to ease the presentation.
(A) Simulated Data: Two Examples
Data sets are modelled for the real-world interferometer found in Ref.  which achieves an optical path difference to obtain an interferogram. Formally, the interferogram is simulated with the amplitude and the frequency f = 50 GHz. The optical path difference grid is chosen as with with constant increment . Accordingly, the total length L = 31.56 mm and the length of the single-sided region follow. Two noisy data sets are generated with the noise levels at 0.05 and 1.0 (see Figure 1).
(B) Application of Formalism
The linear mapping of the amplitude parameter is given by , yielding . For the two simulated data sets, Figure 2 shows , varying in and for three values of . Even for the high noise level case, peaks close to 50 GHz, and the uncertainty remains reasonably small. The peak has the full width at half maximum (FWHM) of about 8 GHz which is roughly twice as large as the classical spectral resolution stated by . High and low values of seem to have little effect on the results.
Figure 1. Simulated interferograms (without noise, black) and (with noise, red). Formally, was used with amplitude , frequency f = 50 GHz, and optical path difference grid x with increment (number of data points ). The data-like quantity resembles a sample of , assuming a Gaussian noise component with variance . Furthermore, the noise is chosen to be independent of x. (a) (low noise); (b) (high noise).
Figure 2. Conditional Gaussian amplitude posterior with mean and standard deviation . This posterior depends on the frequency , prior standard deviation for , and noisy interferogram data . The conditional posterior is obtained for two sets (see Figure 1) modelled by a monochromatic source with the frequency f = 50 GHz, the amplitude S = 0.5 and the noise levels (left column) and 1 (right column). The posterior mean (cyan) peaks at values of about 0.5 close to 50 GHz. The spectral FWHM of this peak accounts for about 8 GHz. The posterior standard deviation (dashed-white for its multiples) increases with the noise level, as one would expect. Only small values of affect when the noise level is high.
The settings posterior is proportional to the quantity which takes very large values even on logarithmic scale and, thus, is used in the following. For the two cases, is shown in Figure 3(a) and Figure 3(b) versus the spectral parameter and for three prior values of . All three cross-sections look similar, but close to 50 GHz the peak is highest for values near . Identifying this global maximum by , the ratio takes reasonable numbers and is presented for the
Figure 3. , being proportional to joint posterior , dependent on frequency and prior standard deviation given interferogram data set (see Figure 1) with low and high noise levels (left column) and 1 (right column), respectively. Since takes large values, is presented in (a) and (b). Deviding by the global maximum, the ratio becomes manageable from numerical point of view as shown in (c) and (d). This ratio peaks in the vicinity of 50 GHz and ( ). The spectral width of increases with the noise level from about 10 to 200 MHz but remains well below the classical spectral resolution of about 4.75 GHz. The width with respect to remains constant and covers one order of magnitude. (a) (low noise case) for broad spectral domain at three values of ; (b) (high noise case) for broad spectral domain at three values of ; (c) (low noise case) for spectral domain 49.9 - 50.1 GHz; (d) (high noise case) for spectral domain 49 - 51 GHz.
relevant domains of and by Figure 3(c) and Figure 3(d) for both data sets. This ratio is narrow in the vicinity of 50 GHz and in the domain . Furthermore, the peaking has a spectral width of about 10 MHz and 200 MHz for the low and high noise case, respectively. When interpreted as spectral resolution, this width is smaller by several orders of magnitude with respect to its classical counterpart of about 4.75 GHz.
(C) Analytical and Numerical Investigations
To have further inside in the results, the conditional amplitude posterior and are analytically investigated. In order to do so, the noise contribution in the data is neglected, meaning that is replaced with the noise-less interferogram . This allows the tracking of the dependencies of the posteriors on the original amplitude S and frequency f, noise level , and number of data points . Furthermore, a second factorisation can be justified. Parallel to this investigation, numerics and conditional posteriors are presented.
1) Conditional Amplitude Posterior
Starting point is the variance of the conditional posterior (see Equation (4)). Formally, one finds for the inverse
As long as the spatial sampling is well enough, one can use the approximation
as shown by Equation (A1), to get the posterior variance
This variance increases quadratically with the noise level and reduces with the number of data samples. Thus, more data points per spatial unit can compensate the noise contribution, at least partly. However, this assumes that the noise is independent of the data sampling.
The modulated sinc function in the denominator of Equation (10) depends on , L and for which the interferogram is available. The sinc function becomes unity and vanishes for small and large frequencies, respectively. For instance, since both data examples share the same spatial domain, the sinc function is close to 0 above about 20 GHz. The amplitude prior influences , when the ratio becomes larger than 1 which is obtained for a large noise level, a small value of and/or few data points.
For the posterior mean,
follows, leaving aside the noise contribution but the uncertainty on the mean is still captured by . With the approximation (A1) and the trigonometric identity (B1), one can resolve
Indeed, one finds the original amplitude for and prior values with the condition . Furthermore, due to there is a peaking of in the vicinity of with the FWHM determined by both trigonometric arguments. Away from f, the oscillating amplitude drops like due to the sinc function which can be seen in Figure 2.
2) Conditional Posterior for f1
Now the information is available to investigate . Neglecting the noise, the triple product in the exponent of the exponential in reads
The global maximum is close to , and, assuming a sufficiently large frequency f, one gets the approximation
when the Taylor series expansions and are used. Then,
remains, and the term which is independent on takes very large numerical values and is treated in more detail in the next subsection. The remaining term can be rewritten by a quadratic exponent of a Gaussian with posterior mean and square root of the variance
using Equation (10) for large frequencies and . This uncertainty increases with the noise but decreases with the square root of the number of data points per spatial unit, the signal level and the spatial domain covered. For the data set examples, one inserts mm, mm, , and 2 to find the astonishing numbers and 171.9 MHz, respectively. This explains the narrowness of with respect to as shown in Figure 3(c) and Figure 3(d).
In fact, changes with on a GHz, but the peaking in is of the order of MHz. Hence, the factorisation with the conditional Gaussian posterior
is justified. Thereby, needs to be taken at . The remaining probability is investigated in the next subsection.
Because the variance changes in on GHz scale but
on MHz scale, only the exponential needs to be taken into account for the determination of and . Furthermore, one faces large numbers, and, hence, the exponent is dealt with directly.
At a given , the frequency at which the maximum of the exponent occurs is identified as the posterior mean . The posterior standard deviation is obtained when the ratio of the exponent to its maximum at reads . For both data sets, the ratios are shown in Figure 4(a) and Figure 4(b) in the vicinity of the peak. While the uncertainty of and, hence, the width of the Gaussian augment with the noise level (see also Figure 5(a) and Figure 5(b)) as described by Equation (16), f remains included in the 2- band centred at . Furthermore, the choice of affects and only for small values (see Figure 5(a) and Figure 5(b)).
3) Posterior for
After the factorisation, one combines Equations (16) and (18) like
with the important term
Figure 4. Ratio of exponent to its maximum dependent on frequency and keeping constant prior standard deviation . For the data set with low noise level ( , left), the peaking is more narrow than the one for the high noise level ( , right). The maximum location and the width of the ratio are used to estimate posterior mean and standard deviation . (a) Ratio for low noise data set at three values of . For , GHz and MHz are extracted which are consistent with 50 GHz used to simulate the data; (b) Ratio for high noise data set at three values of . For , GHz and MHz are estimated. and the deviation of from 50 GHz increase with the noise level. Furthermore, small prior values for affect .
Figure 5. Posterior mean and standard deviation for frequency dependent on prior standard deviation for noise levels (left) and 1 (right). The original frequency f = 50 GHz (red line) is included usually in the band . However, increases from about 10 MHz to 200 MHz given both noise levels. The posterior (bottom row) changes little for the different noise levels and has a global maximum close to S = 0.5 (dashed-black). (a) for ; (b) for ; (c) for ; (d) for .
The exponent of becomes
which is monotonically rising in from 0 to about . In case and , the number of data points dominates. For the two examples with , large numbers and are at hand as could be seen in Figure 3(a) and Figure 3(b).
As the exponent vanishes for , and, thus, P and diverge. Hence, this global maximum should be excluded by a proper choice for the prior of . Doing so, P has the global maximum at
supposed a large . Interestingly, reduces the noise impact drastically, so that even for elevated noise levels , the maximum is still given by S with minor corrections.
The uniform prior may be finite for the domain , and the maximum of P in this domain is denoted by . Then, the peaking can be made more obvious by
after some algebra for a large . With the normalisation of the above equation
the posterior becomes
In case, the prior is finite for the domain , so that , one can set . This reveals the simple forms
of Equation (23). With the above expression and Equation (19), the constant
follows with proper unit Hz. K depends in a complicated way on the noise level, the signal, the number of data points, the spatial domain for which the data is acquired, and the limits and . For the two examples, one gets ( ) and ( ).
Starting point is Equation (19) which is rewritten like
to handle large numerical values. Thereby, is evaluated at . The far right-hand side of the above equation is numerically available, and the maximum at and the normalisation can be determined for the domain from to .
Choosing for the moment and , so that the global maximum is included, the above formalism yields (213) and (6.36) for the data set simulated with (1). For both examples, the order of magnitude of is well described by Equation (26).
For the data sets, both are located at and 0.501 which is in the vicinity of S (see Figure 5(c) and Figure 5(d)) and is in agreement with expression (22). For the different noise levels, has a similar shape; the probability reduces drastically below but remains finite above up to , at least.
4) Model Posterior
The posterior of the model with uniform priors for and is given by
This posterior weighs the constants obtained with respect to the spectral and amplitude domains. Hence, increases with the constants in the numerator and smaller domains in the denominator. When one investigates Figure 3(a) and Figure 3(b), it is obvious that increases, when and are chosen to include and are separated by few GHz or even some 100s MHz. Similarly, choosing and to include the global maximum at (see Figure 5(c) and Figure 5(d)) elevates . Furthermore, has a global maximum when and approach from either side. This is numerically confirmed by scanning and over several orders of magnitude and evaluating the model posterior for the two data sets (see Figure 6(a) and Figure 6(b)).
(D) Joint Posterior and Marginal Posteriors for S1 and f1
Because the model posterior peaks at , the marginal posterior for is simply given by the Gaussian with posterior mean and standard deviation (see Section III C2 and Figure 5(a) and Figure 5(b)). Furthermore, the joint posterior
can be evaluated and is shown for both data sets in Figure 7(a) and Figure 7(b). This joint posterior peaks close to S and f used to simulate the data, and its width increases with the noise level. However, S and f are usually located inside the peak, and, hence, one can trust the results within the posterior uncertainties.
The marginal posterior for is obtained by the marginalisation
The above integration is performed numerically, and the results are shown in Figure 7(c) and Figure 7(d) for both data sets. Another way to achieve the marginalisation is to draw samples for from , and for each of those samples draw samples for from . Histograms of these posterior samples are given in Figure 7(c) and Figure 7(d).
Figure 6. Model posterior . The stated model uses a uniform prior for the prior standard deviation in the domain from to . As these limits approach for and 0.501 for (dashed-white), the posterior increases. Thereby, is closely linked with the original amplitude S = 0.5. (a) Low noise case ; (b) High noise case .
Figure 7. Joint posterior and marginal posterior of frequency and amplitude for low noise ( , left) and high noise data set ( , right). The joint posterior peaks close to the original values S = 0.5 and f = 50 GHz used to simulate the data sets. For the higher noise level, the posterior width increases, and the location of the global maximum separates further from S and f. By numerical integration, the marginal is evaluated, and its shape is close to a normal distribution. This is confirmed by a second approach which uses 105 samples from the joint posterior to estimate the marginal posterior mean and standard deviation for . The histogram of the samples is given as well. (a) for low noise level; (b) for high noise level; (c) for low noise level. The marginal mean and standard deviation read 0.4998 and 0.0013, respectively; (d) for high noise level. One finds that the marginal mean and standard deviation account for 0.505 and 0.026, respectively.
Evaluating the mean and standard deviation from these samples approximates by a Gaussian. In doing so for the data set with (1), the marginal posterior mean (0.505) and standard deviation 0.0013 (0.026) are obtained. Both normal distributions are good approximations for (see Figure 7(c) and Figure 7(d)).
The presented work investigates a monochromatic problem with an adapted version of Bayes’ theorem to infer an amplitude parameter, a frequency parameter and a hyperparameter which acts on the amplitude prior only. The amplitude at a given frequency is a parameter which has a linear mapping to the data domain. If the measurements have a normal noise contribution (Gaussian likelihood) and the amplitude prior is chosen to be Gaussian, an analytical linear inversion technique is enabled. The amplitude posterior, being conditional on the frequency and prior information, has similiarities with the result of a conventional Fourier transformation analysis approach. However, the estimation of the frequency reveals a well localised domain which is in accordance with the measured data. The corresponding posterior for the frequency is orders of magnitude narrower than the width of the convolution function used by conventional approaches to estimate the spectral resolution. The prior information captured by one hyperparameter is tested as well, and its posterior peaks over a one order of magnitude and is very robust for significantly different noise levels on the data.
The findings for the monochromatic problem are investigated analytically. This reveals, for instance, the influence of the signal-to-noise ratio, the amount of data points and the spatial domains covered by the measurements on the results. The presented approach can be followed to examine more complex problems which involve more than one frequency and one amplitude and other diagnostic imperfections like a variable offset. Since this offset has a linear correspondence to the data domain, it would enter in the conditional amplitude posterior. Any non-linear parameter and additional hyperparameter would be estimated by the joint settings posterior specific for a certain problem. In doing so, a profound understanding of model implications could be established in analytical terms for applications relying on Fourier transformations. This is essential for designing a diagnostic for a given problem with somewhat known signal-to-noise ratio and hardware limitations, or for comparing results of different models in an objective way when a data set is given and, for example, the number of contributing frequencies is unknown.
This work has been carried out within the framework of the EURO fusion Consortium and has received funding from the Euratom research and training programme 2014-2018 and 2019-2020 under grant agreement No. 633053. The views and opinions expressed herein do not necessarily reflect those of the European Commission.
Appendix A: Integral Approximation
The spatial grid with may be given by the set of uniformly distributed locations separated by the constant increment , so that and . Then, the sum
is approximated by an integral over the spatial domain using the trigonometric identities (B1) and (B2). The approximation reveals two modulated sinc functions with a spatial and a spectral dependence. The spectral dependence is due to the sum and difference frequencies and , respectively. Spatially, the modulation depends on the length of single-sided domain which is present on side only. The spatial dependence of the sinc function is given by the length L of the total domain covered.
Appendix B: Trigonometric Identities
The trigonometric identities are