Back
 JSEA  Vol.1 No.1 , December 2008
A Bioinformatics-Inspired Adaptation to Ukkonen’s Edit Distance Calculating Algorithm and Its Applicability Towards Distributed Data Mining
Show more
Abstract: Edit distance measures the similarity between two strings (as the minimum number of change, insert or delete operations that transform one string to the other). An edit sequence s is a sequence of such operations and can be used to represent the string resulting from applying s to a reference string. We present a modification to Ukkonen’s edit distance calculating algorithm based upon representing strings by edit sequences. We conclude with a demonstration of how using this representation can improve mitochondrial DNA query throughput performance in a distributed computing environment.
Cite this paper: nullJ. Bruce, "A Bioinformatics-Inspired Adaptation to Ukkonen’s Edit Distance Calculating Algorithm and Its Applicability Towards Distributed Data Mining," Journal of Software Engineering and Applications, Vol. 1 No. 1, 2008, pp. 8-12. doi: 10.4236/jsea.2008.11002.
References

[1]   M. D. Vose, “A formal analysis of edit distance,” UT CS Technical Report ut-cs-04-517, February 2004.

[2]   R. O. Duda and P. E. Hart, Pattern Classification (2nd ed.). Wiley Interscience, 2000.

[3]   A. Wagner and M. I. Fischer, “The string-to-string correction problem,” Journal of the ACM, 21(1) (Jan. 1974), pp. 168-173, 1974.

[4]   E. Ukkonen, “Algorithms for approximate string matching,” International Control 64, pp. 100-118, 1985.

[5]   E. Ukkonen, “On approximate string matching,” International Conference Fundamentals of Computation Theory, Lecture Notes in Computer Science, pp. 158:487-495, 1983.

[6]   N. Campbell and J. Reese, Biology (6th ed.), Addison Wesley, 1997.

[7]   S. Anderson, et al., “Sequence and organization of the human mitochondrial genome,” Nature, 290(5806) (April 9, 1981), pp. 457-265, 1981.

[8]   K. L. Monson, et al., “The mtDNA population database: An integrated software and database resource for forensic comparison,” Forensic Science Communications, 4(2), April 2002. DOI=http://www.fbi.gov/hq/lab/fsc/ backissu/april2002/miller1.htm.

[9]   http://iperf.sourceforge.net.

 
 
Top