OJS Vol.5 No.5, August 2015
On the Estimation of a Univariate Gaussian Distribution: A Comparative Approach
Abstract: We consider estimation of the unknown mean, μ, and variance, σ², of a univariate Gaussian distribution given a single study variable x. We propose an approach that does not require initial values for the unknown distribution parameters. The approach is motivated by linearizing the Gaussian distribution through differential techniques and estimating μ and σ² as regression coefficients using the ordinary least squares method. Two simulated datasets, on hereditary traits and on morphometric analysis of housefly strains, are used to evaluate the proposed method (PM), maximum likelihood estimation (MLE), and the method of moments (MM). The methods are evaluated by re-estimating the required Gaussian parameters on both large and small samples. The root mean squared error (RMSE), mean error (ME), and standard deviation (SD) are used to assess the accuracy of the PM and MLE; confidence intervals (CIs) are also constructed for the ME estimate. The PM compares well with both the MLE and MM approaches, as all three produce estimates whose errors have good asymptotic properties; small CIs are also observed for the ME using the PM and MLE. The PM can be used alongside the MLE to provide initial approximations at the expectation maximization step.
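The abstract does not spell out the linearization, so the following is only a minimal sketch of one way the idea could work, not necessarily the authors' derivation. It assumes the identity d/dx ln f(x) = μ/σ² − x/σ², which is linear in x, so that an ordinary least squares line through the numerical derivative of ln f gives slope −1/σ² and intercept μ/σ². The function name `fit_gaussian_linearized`, the histogram-based density estimate, and the bin count are all illustrative assumptions.

```python
import numpy as np


def fit_gaussian_linearized(x, f):
    """Estimate (mu, sigma2) from positive density values f at points x.

    Assumed linearization (not taken from the paper):
        d/dx ln f(x) = mu / sigma2 - x / sigma2
    so an OLS line through the derivative of ln f has
    slope = -1 / sigma2 and intercept = mu / sigma2.
    """
    x = np.asarray(x, dtype=float)
    log_f = np.log(np.asarray(f, dtype=float))
    # Second-order finite differences; exact when ln f is quadratic in x.
    dlog_f = np.gradient(log_f, x, edge_order=2)
    # Degree-1 least squares fit; polyfit returns [slope, intercept].
    slope, intercept = np.polyfit(x, dlog_f, 1)
    sigma2 = -1.0 / slope
    mu = intercept * sigma2
    return mu, sigma2


if __name__ == "__main__":
    # Hypothetical simulated sample; true parameters chosen for illustration.
    rng = np.random.default_rng(0)
    sample = rng.normal(loc=5.0, scale=2.0, size=5000)

    # Crude density estimate from a histogram (an assumption of this sketch).
    density, edges = np.histogram(sample, bins=30, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = density > 0  # ln f is undefined for empty bins

    mu_pm, var_pm = fit_gaussian_linearized(centers[keep], density[keep])

    # Closed-form MLE for comparison: sample mean and 1/n variance.
    mu_mle, var_mle = sample.mean(), sample.var()

    print("linearized OLS:", mu_pm, var_pm)
    print("MLE:           ", mu_mle, var_mle)
```

No starting values for μ or σ² are needed, which matches the abstract's stated motivation. On sampled data the histogram tails are noisy, so the unweighted fit can be noticeably less accurate than the MLE; the paper's own method may treat this differently.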
Cite this paper: Kikawa, C., Shatalov, M., Kloppers, P. and Mkolesia, A. (2015) On the Estimation of a Univariate Gaussian Distribution: A Comparative Approach. Open Journal of Statistics, 5, 445-454. doi: 10.4236/ojs.2015.55046.
