A New Weight Initialization Method Using Cauchy’s Inequality Based on Sensitivity Analysis

ABSTRACT

In this paper, an efficient weight initialization method using Cauchy’s inequality, based on sensitivity analysis, is proposed to improve the convergence speed of single hidden layer feedforward neural networks. The proposed method ensures that the outputs of the hidden neurons lie in the active region, which increases the rate of convergence. The output weights are learned by minimizing the sum of squared errors and are obtained by solving a linear system of equations. The proposed method is simulated on various problems. In all the problems, the number of epochs and the time required by the proposed method are found to be minimal compared with other weight initialization methods.
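The two ideas in the abstract can be sketched in code. The following is a minimal illustration, not the paper's exact derivation: all names and the scaling rule are hypothetical stand-ins. It uses the Cauchy-Schwarz bound |w·x| ≤ ||w||·||x|| to rescale random input-to-hidden weights so every hidden neuron's net input stays in an assumed sigmoid active region [-4, 4], then solves for the output weights by linear least squares.

```python
import numpy as np

def init_and_solve(X, T, n_hidden, active_limit=4.0, seed=None):
    """Illustrative sketch: Cauchy-bounded hidden-weight initialization
    followed by a least-squares solve for the output weights."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]

    # Random direction for each hidden neuron.
    W = rng.uniform(-1.0, 1.0, size=(n_hidden, n_in))

    # By Cauchy's inequality, |w . x| <= ||w|| * ||x||, so rescaling each
    # row of W to norm active_limit / max_x ||x|| keeps every net input
    # inside [-active_limit, active_limit], the sigmoid's active region.
    max_x_norm = np.linalg.norm(X, axis=1).max()
    W *= active_limit / (np.linalg.norm(W, axis=1, keepdims=True) * max_x_norm)

    H = 1.0 / (1.0 + np.exp(-X @ W.T))         # hidden outputs, all active
    V, *_ = np.linalg.lstsq(H, T, rcond=None)  # output weights: min ||H V - T||^2
    return W, V

# Usage on a toy regression problem.
X = np.random.default_rng(0).normal(size=(50, 3))
T = np.sin(X.sum(axis=1, keepdims=True))
W, V = init_and_solve(X, T, n_hidden=10, seed=0)
```

After the rescaling, every net input is guaranteed to satisfy the bound, so no hidden neuron starts out saturated; the least-squares step then gives the output layer its error-minimizing weights in one solve rather than by iteration.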

KEYWORDS

Weight Initialization, Backpropagation, Feedforward Neural Network, Cauchy’s Inequality, Linear System of Equations

Cite this paper

T. Kathirvalavakumar and S. Subavathi, "A New Weight Initialization Method Using Cauchy’s Inequality Based on Sensitivity Analysis," *Journal of Intelligent Learning Systems and Applications*, Vol. 3, No. 4, 2011, pp. 242-248. doi: 10.4236/jilsa.2011.34027.

References

[1] R. Battiti, “First and Second Order Methods for Learning: Between Steepest Descent and Newton’s Method,” Neural Computation, Vol. 4, No. 2, 1992, pp. 141-166. doi:10.1162/neco.1992.4.2.141

[2] W. L. Buntine and A. S. Weigend, “Computing Second Derivatives in Feedforward Networks: A Review,” IEEE Transactions on Neural Networks, Vol. 5, No. 3, 1994, pp. 480-488. doi:10.1109/72.286919

[3] G. B. Orr and T. K. Leen, “Using Curvature Information for Fast Stochastic Search,” Neural Information Processing Systems, Vol. 9, 1996, pp. 606-612.

[4] N. N. Schraudolph, “Fast Curvature Matrix-Vector Products for Second Order Gradient Descent,” Neural Computation, Vol. 14, No. 7, 2002, pp. 1723-1738. doi:10.1162/08997660260028683

[5] F. Biegler-König and F. Bärmann, “A Learning Algorithm for Multilayered Neural Networks Based on Linear Least Squares Problems,” Neural Networks, Vol. 6, No. 1, 1993, pp. 127-131. doi:10.1016/S0893-6080(05)80077-2

[6] Y. F. Yam and T. W. S. Chow, “Determining Initial Weights of Feedforward Neural Networks Based on Least Squares Method,” Neural Processing Letters, Vol. 2, No. 2, 1995, pp. 13-17. doi:10.1007/BF02312350

[7] Y. F. Yam, T. W. S. Chow and C. T. Leung, “A New Method in Determining the Initial Weights of Feedforward Neural Networks for Training Enhancement,” Neurocomputing, Vol. 16, No. 1, 1997, pp. 23-32. doi:10.1016/S0925-2312(96)00058-6

[8] G. P. Drago and S. Ridella, “Statistically Controlled Activation Weight Initialization (SCAWI),” IEEE Transactions on Neural Networks, Vol. 3, No. 4, 1992, pp. 899-905. doi:10.1109/72.143378

[9] D. Nguyen and B. Widrow, “Improving the Learning Speed of 2-Layer Neural Networks by Choosing Initial Values of the Adaptive Weights,” Proceedings of the International Joint Conference on Neural Networks, San Diego, Vol. 3, 17-21 June 1990, pp. 21-26. doi:10.1109/IJCNN.1990.137819

[10] H. Shimodaira, “A Weight Value Initialization Method for Improved Learning Performance of the Back Propagation Algorithm in Neural Networks,” Proceedings of the Sixth International Conference on Tools with Artificial Intelligence, New Orleans, 6-9 November 1994, pp. 672-675. doi:10.1109/TAI.1994.346429

[11] M. Lehtokangas, J. Saarinen, K. Kaski and P. Huuhtanen, “Initializing Weights of a Multilayer Perceptron Network by Using the Orthogonal Least Squares Problem,” Neural Computation, Vol. 7, No. 5, 1995, pp. 982-999. doi:10.1162/neco.1995.7.5.982

[12] Y. Liu, C. F. Zhou and Y. W. Chen, “Weight Initialization of Feedforward Neural Networks by Means of Partial Least Squares,” International Conference on Machine Learning and Cybernetics, Dalian, 13-16 August 2006, pp. 3119-3122.

[13] X. M. Zhang, Y. Q. Chen, N. Ansari and Y. Q. Shi, “Mini-Max Initialization for Function Approximation,” Neurocomputing, Vol. 57, 2004, pp. 389-409. doi:10.1016/j.neucom.2003.10.014

[14] M. Fernandez-Redondo and C. Hernandez-Espinosa, “A Comparison among Weight Initialization Methods for Multilayer Feedforward Networks,” Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, Como, Vol. 4, 24-27 July 2000, pp. 543-548.

[15] T.-C. Hsiao, C.-W. Lin and H. K. Chiang, “Partial Least Squares Algorithm for Weight Initialization of Backpropagation Network,” Neurocomputing, Vol. 50, 2003, pp. 237-247. doi:10.1016/S0925-2312(01)00708-1

[16] M. Hüsken and C. Goerick, “Fast Learning for Problem Classes Using Knowledge Based Network Initialization,” Proceedings of International Conference on Neural Networks, Como, 24-27 July 2000, pp. 619-624.

[17] D. Erdogmus, O. Fontenla-Romero, J. C. Principe, A. Alonso-Betanzos and E. Castillo, “Linear-Least-Squares Initialization of Multilayer Perceptrons through Backpropagation of the Desired Response,” IEEE Transactions on Neural Networks, Vol. 16, No. 2, 2005, pp. 325-337. doi:10.1109/TNN.2004.841777

[18] Y. F. Yam and T. W. S. Chow, “A Weight Initialization Method for Improving Training Speed in Feedforward Neural Network,” Neurocomputing, Vol. 30, No. 1-4, 2000, pp. 219-232. doi:10.1016/S0925-2312(99)00127-7

[19] Y. F. Yam and T. W. S. Chow, “Feedforward Networks Training Speed Enhancement by Optimal Initialization of the Synaptic Coefficients,” IEEE Transactions on Neural Networks, Vol. 12, No. 2, 2001, pp. 430-434. doi:10.1109/72.914538

[20] E. Castillo, O. Fontenla-Romero, A. A. Betanzos and B. Guijarro-Berdinas, “A Global Optimum Approach for One Layer Neural Networks,” Neural Computation, Vol. 14, No. 6, 2002, pp. 1429-1449. doi:10.1162/089976602753713007

[21] E. Castillo, B. Guijarro-Berdinas, O. Fontenla-Romero and A. A. Betanzos, “A Very Fast Learning Method for Neural Networks Based on Sensitivity Analysis,” Journal of Machine Learning Research, Vol. 7, 2006, pp. 1159-1182.

[22] R. A. Fisher, “The Use of Multiple Measurements in Taxonomic Problems,” Annals of Eugenics, Vol. 7, No. 2, 1936, pp. 179-188. doi:10.1111/j.1469-1809.1936.tb02137.x

[23] A. Frank and A. Asuncion, “UCI Machine Learning Repository,” School of Information and Computer Science, University of California, Irvine, 2010. http://archive.ics.uci.edu/ml
