JSEA, Vol. 2 No. 2, July 2009
Transliterated Word Identification and Application to Query Translation Mining
Abstract: Query translation mining is a key technique in cross-language information retrieval and in knowledge acquisition for machine translation. To improve performance, queries are classified as transliterated or non-transliterated words by a transliterated word identification model and are then routed to different mining processes. This paper is a pilot study of query classification for improving translation mining performance; the approach combines supervised classification with linguistic heuristics. Person name identification achieves a precision of over 97%, and translation mining for transliterated words shows satisfactory performance.
Cite this paper: J. Zhang, L. Guo, M. Zhou and J. Yao, "Transliterated Word Identification and Application to Query Translation Mining," Journal of Software Engineering and Applications, Vol. 2 No. 2, 2009, pp. 122-126. doi: 10.4236/jsea.2009.22018.
 
 