On the Metaheuristics Approach to the Problem of Genetic Sequence Comparison and Its Parallel Implementation

Affiliation(s)

Department of Mathematics and Information Science, Togliatti State University, Togliatti, Russia.

Togliatti Branch of Samara State University, Togliatti, Russia.

Yandex Company, Moscow, Russia.

Department of Mathematics and Information Science, Togliatti State University, Togliatti, Russia.

Togliatti Branch of Samara State University, Togliatti, Russia.

Yandex Company, Moscow, Russia.

ABSTRACT

We describe parallel implementation of the metaheuristic approach to the problem of comparing strings representing DNA sequence. By this approach, one can define a whole new class of metrics on a set of strings; some of this metrics can lead to interesting results when used for string comparison. We propose several heuristics; compare results achieved when using those heuristics and compare parallel and sequential implementation of proposed approach.

We describe parallel implementation of the metaheuristic approach to the problem of comparing strings representing DNA sequence. By this approach, one can define a whole new class of metrics on a set of strings; some of this metrics can lead to interesting results when used for string comparison. We propose several heuristics; compare results achieved when using those heuristics and compare parallel and sequential implementation of proposed approach.

Cite this paper

S. Makarkin, B. Melnikov and A. Panin, "On the Metaheuristics Approach to the Problem of Genetic Sequence Comparison and Its Parallel Implementation,"*Applied Mathematics*, Vol. 4 No. 10, 2013, pp. 35-39. doi: 10.4236/am.2013.410A1006.

S. Makarkin, B. Melnikov and A. Panin, "On the Metaheuristics Approach to the Problem of Genetic Sequence Comparison and Its Parallel Implementation,"

References

[1] D. Gusfield, “Algorithms on Strings, Trees and Sequences,” Informatics and Computers, BHV-Peterburg, Saint-Petersburg, 2003.

[2] I. Torshin, “Bioinformatics in the Post-Genomic Era: The Role of Biophysics,” Nova Publishers, 2006.

[3] D. S. Hirschberg, “A Linear Space Algorithm for Computing Maximal Common Subsequences,” Communications of the ACM, Vol. 18, No. 6, 1975, pp. 341-343. http:dx.doi.org/10.1145/360825.

360861

[4] J. W. Hunt and T. G. Szymanski, “A Fast Algorithm for Computing Longest Common Subsequences,” Communications of the ACM, Vol. 20, No. 5, 1977, pp. 350-353. http:dx.doi.org/10.1145/359581.

359603

[5] E. W. Myers, “An Overview of Sequence Comparison Algorithms in Molecular Biology,” 1991.

[6] R. A. Wagner and M. J. Fischer, “The String-to-String Correction Problem,” Journal of the ACM, Vol. 21, No. 1, 1974, pp. 168-173. http:dx.doi.org/10.1145/321796.321811

[7] G. Wieds, “Bioinformatics Explained: BLAST versus Smith-Waterman,” CLCBio, 2007.

[8] A. Panin, “Using Metaheuristic Approach for DNA Comparison,” Vector of Science TSU, Vol. 3, No. 17, 2011, pp. 27-29.

[9] B. Melnikov, “Discrete Optimization Problems. Some New Heuristic Approaches,” 8th International Conference on High Performance Computing and Grid in Asia Pacific Region, IEEE Computer Society Press, 2005, pp. 73-80. http:dx.doi.org/10.1109/HPCASIA.2005.34

[10] B. Melnikov, “Multiheuristic Approach to Discrete Optimization Problems,” Cybernetics and Systems Analysis, Vol. 42, No. 3, 2006, pp. 335-341. http://dx.doi.org/10.1007/s10559-006-0070-y

[11] S. Henikoff and J. G. Henikoff, “Amino Acid Substitution Matrices from Protein Blocks,” PNAS, Vol. 89, No. 22, 1992, pp. 10915-10919. http:dx.doi.org/10.1073/pnas.89.22.10915

[12] J. Hromkovic, “Algorithms for Hard Problems. Introduction to Combinatorial Optimazation, Randomization, Approximation, and Heuristics,” Springer, Berlin, 2003.

[13] M. Bellmore and G. Nemhauser, “The Traveling Salesman Problem: A Survey,” Operation Research, Vol. 16, No. 3, 1968, pp. 538-558. http:dx.doi.org/10.1287/opre.16.3.538

[14] B. Mel’nikov and A. Radionov, “A Choice of Strategy in Nondeterministic Antagonistic Games,” Programming and Computer Software, Vol. 24, No. 5, 1998, pp. 247-252.

[15] B. Melnikov, “Heuristics in Programming of Nondeterministic Games,” Programming and Computer Software, Vol. 27, No. 5, 2001, pp. 277-288. http://dx.doi.org/10.1023/A:1012345111076

[16] A. C. Wilson, R. L. Cann, S. M. Carr, M. George Jr., U. B. Gyllensten, K. Helm-Bychowski, R. G. Higuchi, S. R. Palumbi, E. M. Prager, R. D. Sage and M. Stoneking, “Mitochondrial DNA and Two Perspectives on Evolutionary Genetics,” Biological Journal of the Linnean Society, Vol. 26, No. 4, 1985, pp. 375-400. http:dx.doi.org/10.1111/j.1095-8312.1985.tb02048.x

[17] NCBI Nucleotide Database. http:www.ncbi.nlm.nih.gov/nuccore

[18] A. Shipunov, “Systema Naturae or the Outline of Living World Classification,” Protistology, Vol. 6, No. 1, 2009, pp. 3-13.

[1] D. Gusfield, “Algorithms on Strings, Trees and Sequences,” Informatics and Computers, BHV-Peterburg, Saint-Petersburg, 2003.

[2] I. Torshin, “Bioinformatics in the Post-Genomic Era: The Role of Biophysics,” Nova Publishers, 2006.

[3] D. S. Hirschberg, “A Linear Space Algorithm for Computing Maximal Common Subsequences,” Communications of the ACM, Vol. 18, No. 6, 1975, pp. 341-343. http:dx.doi.org/10.1145/360825.

360861

[4] J. W. Hunt and T. G. Szymanski, “A Fast Algorithm for Computing Longest Common Subsequences,” Communications of the ACM, Vol. 20, No. 5, 1977, pp. 350-353. http:dx.doi.org/10.1145/359581.

359603

[5] E. W. Myers, “An Overview of Sequence Comparison Algorithms in Molecular Biology,” 1991.

[6] R. A. Wagner and M. J. Fischer, “The String-to-String Correction Problem,” Journal of the ACM, Vol. 21, No. 1, 1974, pp. 168-173. http:dx.doi.org/10.1145/321796.321811

[7] G. Wieds, “Bioinformatics Explained: BLAST versus Smith-Waterman,” CLCBio, 2007.

[8] A. Panin, “Using Metaheuristic Approach for DNA Comparison,” Vector of Science TSU, Vol. 3, No. 17, 2011, pp. 27-29.

[9] B. Melnikov, “Discrete Optimization Problems. Some New Heuristic Approaches,” 8th International Conference on High Performance Computing and Grid in Asia Pacific Region, IEEE Computer Society Press, 2005, pp. 73-80. http:dx.doi.org/10.1109/HPCASIA.2005.34

[10] B. Melnikov, “Multiheuristic Approach to Discrete Optimization Problems,” Cybernetics and Systems Analysis, Vol. 42, No. 3, 2006, pp. 335-341. http://dx.doi.org/10.1007/s10559-006-0070-y

[11] S. Henikoff and J. G. Henikoff, “Amino Acid Substitution Matrices from Protein Blocks,” PNAS, Vol. 89, No. 22, 1992, pp. 10915-10919. http:dx.doi.org/10.1073/pnas.89.22.10915

[12] J. Hromkovic, “Algorithms for Hard Problems. Introduction to Combinatorial Optimazation, Randomization, Approximation, and Heuristics,” Springer, Berlin, 2003.

[13] M. Bellmore and G. Nemhauser, “The Traveling Salesman Problem: A Survey,” Operation Research, Vol. 16, No. 3, 1968, pp. 538-558. http:dx.doi.org/10.1287/opre.16.3.538

[14] B. Mel’nikov and A. Radionov, “A Choice of Strategy in Nondeterministic Antagonistic Games,” Programming and Computer Software, Vol. 24, No. 5, 1998, pp. 247-252.

[15] B. Melnikov, “Heuristics in Programming of Nondeterministic Games,” Programming and Computer Software, Vol. 27, No. 5, 2001, pp. 277-288. http://dx.doi.org/10.1023/A:1012345111076

[16] A. C. Wilson, R. L. Cann, S. M. Carr, M. George Jr., U. B. Gyllensten, K. Helm-Bychowski, R. G. Higuchi, S. R. Palumbi, E. M. Prager, R. D. Sage and M. Stoneking, “Mitochondrial DNA and Two Perspectives on Evolutionary Genetics,” Biological Journal of the Linnean Society, Vol. 26, No. 4, 1985, pp. 375-400. http:dx.doi.org/10.1111/j.1095-8312.1985.tb02048.x

[17] NCBI Nucleotide Database. http:www.ncbi.nlm.nih.gov/nuccore

[18] A. Shipunov, “Systema Naturae or the Outline of Living World Classification,” Protistology, Vol. 6, No. 1, 2009, pp. 3-13.