Research on Initialization on EM Algorithm Based on Gaussian Mixture Model


The EM algorithm is a widely used iterative method for maximum likelihood estimation when the observed data are incomplete, and it is also an effective way to estimate the parameters of finite mixture models. However, the EM algorithm is not guaranteed to find the global optimum and easily becomes trapped in a local optimum, so its result is sensitive to the initial values used for the iteration. The traditional EM algorithm selects its initial values at random; we propose an improved method for selecting them. First, we use the k-nearest-neighbor method to remove outliers. Second, we use k-means to initialize the EM algorithm. Numerical experiments comparing this method with the original random initialization show that the parameter estimates obtained with the proposed initialization are significantly better than those of the original EM algorithm.
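The two-stage initialization described in the abstract can be sketched as follows. This is a minimal one-dimensional illustration in plain Python, not the authors' implementation: the abstract does not state the outlier rule, so the cutoff used here (mean plus three standard deviations of the k-nearest-neighbor distance score) is an assumption, and the function names `knn_filter`, `kmeans_1d`, `init_from_kmeans`, and `em_gmm_1d` are hypothetical.

```python
import math
import random
import statistics

def knn_filter(xs, k=3, n_std=3.0):
    """Drop points whose mean distance to their k nearest neighbours is unusually large.
    The cutoff (mean + n_std * std of the scores) is an assumed rule, not the paper's."""
    scores = []
    for i, x in enumerate(xs):
        dists = sorted(abs(x - y) for j, y in enumerate(xs) if j != i)
        scores.append(sum(dists[:k]) / k)
    cutoff = statistics.mean(scores) + n_std * statistics.pstdev(scores)
    return [x for x, s in zip(xs, scores) if s <= cutoff]

def kmeans_1d(xs, k, iters=50, seed=0):
    """Plain Lloyd iterations in one dimension; returns centres and their clusters."""
    centers = random.Random(seed).sample(xs, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            clusters[nearest].append(x)
        # An empty cluster keeps its previous centre.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

def init_from_kmeans(xs, k):
    """Turn k-means clusters into starting weights, means and variances for EM."""
    centers, clusters = kmeans_1d(xs, k)
    sizes = [max(len(c), 1) for c in clusters]
    weights = [s / sum(sizes) for s in sizes]
    variances = [max(sum((x - m) ** 2 for x in c) / s, 1e-6)
                 for m, c, s in zip(centers, clusters, sizes)]
    return weights, centers, variances

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, weights, means, variances, iters=100):
    """Standard EM updates for a one-dimensional Gaussian mixture."""
    weights, means, variances = list(weights), list(means), list(variances)
    n, k = len(xs), len(means)
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate parameters from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj, 1e-6)
    return weights, means, variances

# Synthetic demo: two Gaussian clusters plus two artificial outliers.
rng = random.Random(42)
data = [rng.gauss(0, 1) for _ in range(200)] + [rng.gauss(10, 1) for _ in range(200)]
data += [60.0, -50.0]
cleaned = knn_filter(data)
weights, means, variances = em_gmm_1d(cleaned, *init_from_kmeans(cleaned, k=2))
print(weights, means, variances)
```

Starting EM from the k-means clusters gives each component a mean, variance, and weight that already reflect the data, whereas a random start may place several components in the same region; removing outliers first keeps isolated points from pulling a k-means centre away from the bulk of the data.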
