ABSTRACT A robust method for speech endpoint detection in an airplane cockpit voice background is presented. Based on an analysis of the background noise characteristics, a complex Laplacian distribution model aimed directly at noisy speech is established. A likelihood ratio test based on a binary hypothesis test is then carried out. Incorporating inter-frame correlation into the conventional maximum a posteriori decision criterion leads to two separate thresholds. The endpoint detection decision is finally made depending on the previous frame and the observed spectrum, and the speech endpoints are searched for based on that decision. Compared with typical algorithms, the proposed method operates robustly in the airplane cockpit voice background.
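The two-threshold decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the per-frame log-likelihood ratios have already been computed from some statistical model (the complex Laplacian model itself is not reproduced here), and the names `detect_endpoints`, `eta_enter`, and `eta_stay` are hypothetical. Which threshold applies after a speech frame and which after a noise frame is also an assumption made for illustration.

```python
from typing import List, Sequence, Tuple


def detect_endpoints(
    llr: Sequence[float], eta_enter: float, eta_stay: float
) -> Tuple[List[bool], List[Tuple[int, int]]]:
    """Label frames as speech/non-speech and return speech segments.

    llr       -- per-frame log-likelihood ratios (assumed precomputed)
    eta_enter -- threshold used when the previous frame was non-speech
    eta_stay  -- threshold used when the previous frame was speech
    Both thresholds are hypothetical parameters for this sketch.
    """
    decisions: List[bool] = []
    prev_speech = False  # assume the recording starts in background noise
    for value in llr:
        # The threshold depends on the previous frame's decision.
        threshold = eta_stay if prev_speech else eta_enter
        prev_speech = value > threshold
        decisions.append(prev_speech)

    # Search for endpoints: (first, last) frame index of each speech run.
    segments: List[Tuple[int, int]] = []
    start = None
    for i, is_speech in enumerate(decisions):
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(decisions) - 1))
    return decisions, segments


if __name__ == "__main__":
    ratios = [0.1, 0.6, 1.2, 0.7, 0.6, 0.2, 0.1]
    _, found = detect_endpoints(ratios, eta_enter=1.0, eta_stay=0.5)
    print(found)
```

In this sketch a lower `eta_stay` makes it easier to remain in the speech state once speech has been detected, which is one plausible way that inter-frame correlation can show up as two thresholds.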
Cite this paper
H. CHENG, M. LEI, G. HUANG and Y. XIA, "Robust Speech Endpoint Detection in Airplane Cockpit Voice Background," Wireless Sensor Network, Vol. 1 No. 5, 2009, pp. 489-495. doi: 10.4236/wsn.2009.15059.