ENG Vol. 5 No. 10B, October 2013
A Restricted, Adaptive Threshold Segmentation Approach for Processing High-Speed Image Sequences of the Glottis
Abstract: In this paper, we propose a restricted, adaptive threshold approach for segmenting images of the glottis acquired by high-speed videoendoscopy (HSV). The approach first identifies a region of interest (ROI) that encloses the extent of vocal-fold motion in each image frame, as estimated from the difference image sequences. Threshold segmentation is then restricted to the identified ROI in each frame of the original image sequences; the resulting cropped frames are referred to as sub-image sequences. The threshold value is adapted for each sub-image frame and is determined by that frame's minimum gray-scale value, which typically corresponds to a spatial location within the glottis. Because it relies on a simple threshold method, the proposed approach is practical and highly efficient for segmenting large numbers of image frames. Results from the segmentation of representative clinical image sequences are presented to verify the proposed method.
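The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made for illustration only: the function names motion_roi and segment_frame are hypothetical, a single bounding-box ROI is derived for the whole sequence from absolute differences between consecutive frames, and the threshold is taken as the ROI minimum plus a fixed margin, because the abstract does not state the exact rule that links the threshold to the minimum gray-scale value.

```python
def motion_roi(frames, diff_threshold=20):
    """Return (top, bottom, left, right), end-exclusive, bounding every pixel
    whose gray value changes by more than diff_threshold between consecutive
    frames. diff_threshold is an assumed, illustrative parameter."""
    rows, cols = len(frames[0]), len(frames[0][0])
    moving = [(r, c)
              for prev, curr in zip(frames, frames[1:])
              for r in range(rows)
              for c in range(cols)
              if abs(curr[r][c] - prev[r][c]) > diff_threshold]
    if not moving:
        return 0, rows, 0, cols  # no motion found: fall back to the full frame
    row_ids = [r for r, _ in moving]
    col_ids = [c for _, c in moving]
    return min(row_ids), max(row_ids) + 1, min(col_ids), max(col_ids) + 1


def segment_frame(frame, roi, margin=10):
    """Threshold one frame inside the ROI only. The threshold adapts to the
    frame: ROI minimum gray value plus an assumed margin."""
    top, bottom, left, right = roi
    sub_image = [row[left:right] for row in frame[top:bottom]]
    threshold = min(min(row) for row in sub_image) + margin
    mask = [[0] * len(frame[0]) for _ in frame]
    for r in range(top, bottom):
        for c in range(left, right):
            if frame[r][c] <= threshold:
                mask[r][c] = 1  # dark pixel inside the ROI: labeled glottis
    return mask, threshold


# Two toy 4x5 frames: a static dark corner pixel (value 5) and a central
# dark region that changes between the frames.
frame_a = [[5, 200, 200, 200, 200],
           [200, 200, 180, 200, 200],
           [200, 200, 60, 200, 200],
           [200, 200, 200, 200, 200]]
frame_b = [[5, 200, 200, 200, 200],
           [200, 200, 40, 200, 200],
           [200, 200, 30, 200, 200],
           [200, 200, 200, 200, 200]]

roi = motion_roi([frame_a, frame_b])
mask_a, threshold_a = segment_frame(frame_a, roi)
mask_b, threshold_b = segment_frame(frame_b, roi)
```

In this toy example the static dark corner pixel is never labeled, because it lies outside the motion-derived ROI, and each frame receives its own threshold. A global threshold applied to the full frame would have picked up the corner pixel as well, which is the kind of error the restriction to the ROI is meant to avoid.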
Cite this paper: Blanco, M., Chen, X. and Yan, Y. (2013) A Restricted, Adaptive Threshold Segmentation Approach for Processing High-Speed Image Sequences of the Glottis. Engineering, 5, 357-362. doi: 10.4236/eng.2013.510B072.
