The Monte Carlo Simulation and Non-Parametric Tests Application on Chemical Data


1. Introduction

Monte Carlo simulation (MC) is named after the gambling city of Monte Carlo in Monaco. The name refers to simulation steps that generate random variables and random distributions. MC is a very powerful tool in radiation physics because it can resolve very complex physical models [1].

The difference between MC and a real experiment is that MC carries out random sampling and performs a large number of computed experiments. The statistical measures of the computed model are then observed and conclusions are drawn from them. Every computed run is generated in accordance with its distribution [2].

The steps of MC are summarized in Figure 1. In the first step, MC generates random variables that are uniformly distributed between 0 and 1. The significance of this distribution is that its values can be transformed into actual values that follow the distribution of interest. The second step is to estimate the performance. The last step is carried out to characterize the output values.

Figure 1. Flowchart of MC file.
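The three MC steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exponential target distribution, the `scale` parameter, and the function name `monte_carlo_sketch` are hypothetical choices and are not taken from the study.

```python
import math
import random
import statistics


def monte_carlo_sketch(n_runs=100_000, scale=2.0, seed=1):
    """Illustrate the three MC steps for a hypothetical exponential quantity."""
    rng = random.Random(seed)

    # Step 1: draw uniform numbers u in [0, 1) and map them onto the target
    # distribution with the inverse CDF, x = -scale * ln(1 - u).
    draws = [-scale * math.log(1.0 - rng.random()) for _ in range(n_runs)]

    # Step 2: estimate the performance, here simply the sample mean.
    mean = statistics.fmean(draws)

    # Step 3: characterize the output values with summary statistics.
    return {
        "mean": mean,
        "stdev": statistics.stdev(draws),
        "median": statistics.median(draws),
    }


summary = monte_carlo_sketch()
print(summary)
```

For an exponential distribution with scale 2, the mean should approach 2 and the median should approach 2 ln 2 as the number of runs grows, which gives a quick sanity check on the sampler.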

Statistical evaluation is a vital method for determining the validity of measurements. It also gives meaning to reported numbers and allows scientists to draw discussion and conclusions from their obtained numbers and variables. Most articles dealing with applied sciences now pay more attention to statistical methods, using statistical validity as evidence for their theory. Advanced statistical software has encouraged wider use of statistical techniques. Nevertheless, an inadequate understanding of these statistical packages can lead to misinterpretation of the reported data [5].

Statistical methods developed for statistical analysis fall into two categories: parametric and non-parametric. Parametric methods rest on the assumption that the reported data are normally distributed (homogeneous and independent). However, most scientific data violate this assumption [3].

Mood’s test is rarely used in the literature for chemical data and appears mostly in clinical studies. Many non-parametric tests depend on Mood’s test [4]. The median test is an important means of studying a distribution when the data are skewed. For instance, if variables share a common median, then their medians are comparable.

Applying Mood’s median test to the results listed in Table 1 (Fe and Mn were not included in the test), one can arrive at a precise conclusion. The chemical data calculations were therefore performed to determine whether MC is consistent with other non-parametric tests, e.g. Kruskal-Wallis. The MC results are discussed with emphasis on the agreement between these tests.

2. Results

Table 2 lists the Mood’s median test results for the chemical data. Almost half of the observations were above the median and the other half were below it. All the median values of the elements lay within their upper and lower confidence limits. For instance, for chromium the upper limit of the median was 28.11 ppm, the lower limit was 18.4 ppm, and the median itself was 22.11 ppm. As another example, for a major element such as iron (not reported here), the median lay outside the upper and lower confidence limits; it was therefore removed from the list, because only trace levels were part of this investigation.
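The counting of observations above and below the median can be made concrete with a minimal sketch of Mood’s median test. The three groups are made-up numbers rather than the study data, the function name `moods_median_test` is hypothetical, and values equal to the grand median are counted as "not above" (one common convention, assumed here). The sketch also assumes that at least one observation lies on each side of the grand median.

```python
import statistics


def moods_median_test(groups):
    """Return (chi-square statistic, grand median, 2 x k table of counts)."""
    pooled = [x for g in groups for x in g]
    grand_median = statistics.median(pooled)

    # Count, per group, how many values lie above the grand median.
    above = [sum(x > grand_median for x in g) for g in groups]
    below = [len(g) - a for g, a in zip(groups, above)]  # ties count as "not above"

    n = len(pooled)
    total_above, total_below = sum(above), sum(below)

    # Pearson chi-square on the 2 x k table (k - 1 degrees of freedom).
    chi2 = 0.0
    for g, a, b in zip(groups, above, below):
        expected_above = len(g) * total_above / n
        expected_below = len(g) * total_below / n
        chi2 += (a - expected_above) ** 2 / expected_above
        chi2 += (b - expected_below) ** 2 / expected_below
    return chi2, grand_median, [above, below]


# Hypothetical concentrations (ppm) for three groups, for illustration only.
hypothetical_groups = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
]
chi2, grand_median, table = moods_median_test(hypothetical_groups)
print(chi2, grand_median, table)
```

With three groups the statistic has 2 degrees of freedom, so it would be compared with the tabulated chi-square value of 5.991 at the 0.05 level.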

In the Kruskal-Wallis test, each group of elements was treated as an independent unit. It should be noted that the Kruskal-Wallis test merely indicates that the groups differ in some way. The test was carried out at the 0.05 significance level. Because the degrees of freedom were above 5, the critical values of the Kruskal-Wallis table cannot be used; only chi-square tables can. We inspect the group medians to decide precisely how they differ, giving two examples and later visualizing them in one figure. Table 3 shows the Kruskal-Wallis test results for the study materials. The lowest score was for cadmium, with a z-value of −3.5, while the highest was for vanadium, with a z-value of 4.8. Thus, the data clearly differ, and the population medians of the chemical data were not all equal. From the median values of the study materials, it is easy to see that the median of each element lies between its lower and upper limits. For example, for zinc the lower limit of the median was 19 ppm and the upper limit was 23.7 ppm. All the two-tailed values were near zero, which indicated that the distributions were confident. The test statistic for the Kruskal-Wallis test is denoted H. The calculated H (as chi-square) was 162, and the medians for all reported elements were more than H, indicating that the original data can be tested non-parametrically. We can therefore conclude that the original data differ.
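The way H is computed and then referred to a chi-square table can be sketched as follows. The data are made-up numbers rather than the study data, and the helper names `average_ranks`, `kruskal_wallis_h`, and `CHI2_CRITICAL_05` are hypothetical. The sketch uses the standard tie-corrected form of H and assumes that the pooled values are not all identical.

```python
from collections import Counter

# Tabulated chi-square critical values at the 0.05 level, keyed by degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis statistic H."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over the groups.
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - tie_term / (n ** 3 - n))


# Hypothetical concentrations (ppm) for three groups, for illustration only.
hypothetical_groups = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
]
h = kruskal_wallis_h(hypothetical_groups)
df = len(hypothetical_groups) - 1
reject_equal_medians = h > CHI2_CRITICAL_05[df]
print(h, df, reject_equal_medians)
```

A value of H above the tabulated chi-square value leads to rejecting the hypothesis that all groups share the same distribution, but it does not say which groups differ.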

The nonparametric runs test was performed on the chemical data, as listed in Table 4, in support of the MC results. The runs test can be helpful in testing the null hypothesis that the distributions are equal. Consider one way in which the distribution functions could be unequal: one of the distribution functions is at least as great as the other at all data points. The nonparametric runs test can do the same job as the Kruskal-Wallis test. Therefore, the test was carried out to support our hypothesis of a difference between the variables.

Table 1. Elemental analysis and statistical evaluation for chemical data of the study materials.

Table 2. Mood’s median test for the study materials.

Table 3. Kruskal-Wallis test for the study materials.

Table 4. Nonparametric runs test for the study materials.
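A runs test can be sketched as below. The source does not state which variant was used, so this assumes the Wald-Wolfowitz form about the sample median with a normal approximation; the function name `runs_test` and the sequence are hypothetical and are not study data. The sketch assumes that values fall on both sides of the median.

```python
import math
import statistics


def runs_test(values):
    """Return (number of runs, z statistic, two-sided p-value)."""
    median = statistics.median(values)

    # Classify each value as above (True) or below (False) the median;
    # values equal to the median are dropped.
    signs = [x > median for x in values if x != median]
    n1 = sum(signs)
    n2 = len(signs) - n1
    n = n1 + n2

    # A new run starts whenever the sign changes.
    runs = 1 + sum(s != t for s, t in zip(signs, signs[1:]))

    # Mean and variance of the run count under randomness.
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))

    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return runs, z, p_value


# Hypothetical, strongly alternating sequence used only for illustration.
hypothetical_sequence = [1, 9, 2, 8, 3, 7, 4, 6]
runs, z, p_value = runs_test(hypothetical_sequence)
print(runs, z, p_value)
```

Too many runs (as in this alternating example) or too few runs both count as evidence against randomness, which is why the p-value is two-sided.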

Correlation matrices for the study chemical data were computed using the Pearson method, as listed in Table 5. Arsenic was correlated with almost all elements.
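A Pearson correlation matrix of the kind listed in Table 5 can be sketched as follows. The element columns hold made-up values rather than the study data, and the function names `pearson_r` and `correlation_matrix` are hypothetical. The sketch assumes that no column is constant.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict with r for every pair of named columns."""
    names = list(columns)
    return {
        a: {b: pearson_r(columns[a], columns[b]) for b in names}
        for a in names
    }


# Hypothetical concentrations (ppm), for illustration only.
hypothetical_columns = {
    "As": [1, 2, 3, 4, 5],
    "Cr": [2, 4, 6, 8, 10],
    "Zn": [5, 4, 3, 2, 1],
}
matrix = correlation_matrix(hypothetical_columns)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

Because the Pearson coefficient measures linear association and assumes roughly normal data, a rank-based coefficient may be a more natural companion to the non-parametric tests used here.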

3. Conclusion

As seen in the Results section, Monte Carlo simulation showed a clear difference among the study data at the 95% confidence level. The difference in the reported data makes the non-parametric test valid. Figure 2 (log scale) illustrates the significant difference in the reported data. In support of the Monte Carlo simulation, the Kruskal-Wallis test in Figure 3 showed a significant difference among the medians of the variables. At the 95% confidence level, we conclude that the study data came from non-identical populations.

Figure 2. Medians (log scale) of the study materials for Mood’s median test.

Figure 3. Kruskal-Wallis test for the study materials.

Table 5. Correlation calculations between chemical and radiation measurements using the Pearson method for adhesive materials.

References

[1] Alshammari, H., et al. (2017) The Experimental and Simulation Risk Assessment of Radioactivity in Marble Building Materials Used in Saudi Arabia. Journal of Fundamental and Applied Sciences, 9, 1341-1348.

[2] Landau, D.P. and Binder, K. (2014) A Guide to Monte Carlo Simulations in Statistical Physics. Cambridge University Press, Cambridge.

[3] Derrac, J., et al. (2011) A Practical Tutorial on the Use of Nonparametric Statistical Tests as a Methodology for Comparing Evolutionary and Swarm Intelligence Algorithms. Swarm and Evolutionary Computation, 1, 3-18.

https://doi.org/10.1016/j.swevo.2011.02.002

[4] Chen, Z. and Zhang, G. (2016) Comparing Survival Curves Based on Medians. BMC Medical Research Methodology, 16, 33.

https://doi.org/10.1186/s12874-016-0133-3

[5] Gelman, A., et al. (2014) Bayesian Data Analysis. CRC Press, Boca Raton.