JBiSE Vol.9 No.10 B, September 2016
The Research on Identification of Gene Splice Sites by Support Vector Machine
The recognition of splice sites is an important step in eukaryotic DNA sequence analysis, and many researchers are working to improve the accuracy of identification. Our team studied this problem using the support vector machine (SVM), a well-known data mining algorithm. The training and testing data come from the HS3D dataset. High accuracy is achieved with orthogonal coding of the nucleic acid sequences and an RBF kernel function, and the cross-validation experiments suggest that base pattern information is mainly located within 20 nucleotides upstream and downstream of the splice sites.
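
To make the encoding and kernel concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes that orthogonal coding maps each base to a 4-dimensional unit vector (the A, C, G, T ordering is an arbitrary choice), that a window of 20 nucleotides is taken on each side of a candidate site, and that the RBF kernel has the standard form exp(-gamma * ||x - y||^2). The function names, the `flank` parameter, and the `gamma` value are hypothetical and chosen only for illustration.

```python
import math

# Assumed orthogonal (one-hot) code: each base becomes a 4-dimensional unit vector.
# The A, C, G, T ordering is an illustrative choice, not taken from the paper.
ORTHOGONAL_CODE = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def extract_window(seq, site, flank=20):
    """Return `flank` bases upstream and `flank` bases downstream of index `site`."""
    if site - flank < 0 or site + flank > len(seq):
        raise ValueError("window extends past the ends of the sequence")
    return seq[site - flank:site + flank]


def encode_sequence(seq):
    """Concatenate the orthogonal codes of all bases into one feature vector."""
    vector = []
    for base in seq.upper():
        vector.extend(ORTHOGONAL_CODE[base])  # raises KeyError on non-ACGT symbols
    return vector


def rbf_kernel(x, y, gamma=0.1):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Usage: encode two hypothetical 40-nucleotide windows and compare them.
seq_a = "ACGT" * 15
seq_b = "ACGT" * 14 + "ACGA"
x = encode_sequence(extract_window(seq_a, site=30))
y = encode_sequence(extract_window(seq_b, site=30))
similarity = rbf_kernel(x, y)
```

Under this coding, two windows that differ in k positions have a squared distance of 2k, so the kernel value depends only on the number of mismatched bases. In practice the encoded vectors would be passed to an SVM library that provides an RBF kernel, with `gamma` and the penalty parameter selected by cross-validation.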
