G. Grillo, M. Attimonelli, S. Liuni, G. Pesole. (1996) CLEANUP: a fast computer program for removing redundancies from nucleotide sequence databases. CABIOS, 12, 1–8.
L. Holm and C. Sander. (1998) Removing near-neighbour redundancy from large protein sequence collections. Bioinformatics, 14, 423–429.
W. Li and A. Godzik. (2006) Cd-Hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics, 22, 1658–1659.
W. Li, L. Jaroszewski, A. Godzik. (2001) Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics, 17, 282–283.
U. Hobohm, M. Scharf, R. Schneider, C. Sander. (1992) Selection of representative protein data sets. Protein Sci, 1, 409–417.
U. Hobohm and C. Sander. (1994) Enlarged representative set of protein structures. Protein Sci, 3, 522–524.
G. Wang and R. L. Dunbrack Jr. (2003) PISCES: a protein sequence culling server. Bioinformatics, 19, 1589–1591.
S. F. Altschul, T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, D. J. Lipman. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 25, 3389–3402.
S. F. Altschul, W. Gish, W. Miller, E. W. Myers, D. J. Lipman. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
S. Niskanen and P. R. J. Östergård. (2003) Cliquer user’s guide, Version 1.0, Communications Laboratory, Helsinki University of Technology, Espoo, Tech. Rep. T48. http://users.tkk.fi/~pat/cliquer.html.