AM Vol.6 No.6, June 2015
A Novel Method for Transforming XML Documents to Time Series and Clustering Them Based on Delaunay Triangulation
Abstract: Exchanging data in XML format has become increasingly popular and widely applied because XML documents are simple to maintain and transfer. Accelerating search within such documents therefore improves search engine efficiency. In this paper, we propose a technique for detecting similarity in the structure of XML documents, and we then cluster the documents using the Delaunay Triangulation method. The technique is based on the idea of representing the structure of an XML document as a time series in which each occurrence of a tag corresponds to a given impulse. This lets us use the Discrete Fourier Transform as a simple method to analyze these signals in the frequency domain and to build similarity matrices from a distance measure, in order to group the documents into clusters. We use Delaunay Triangulation as the clustering method for the d-dimensional points that represent the XML documents. The results show significantly better efficiency and accuracy than common methods.
Cite this paper: Shafieian, N. (2015) A Novel Method for Transforming XML Documents to Time Series and Clustering Them Based on Delaunay Triangulation. Applied Mathematics, 6, 1076-1085. doi: 10.4236/am.2015.66098.
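
As a rough illustration of the pipeline described in the abstract, the sketch below encodes the start tags of an XML document as a time series, takes the Discrete Fourier Transform, and compares two documents by the Euclidean distance between their magnitude spectra. The tag encoding (consecutive integers in order of first appearance), the zero-padding to a common length, and the function names `tag_series`, `dft_magnitudes` and `structural_distance` are assumptions made for this example, not the encoding or distance defined in the paper. The Delaunay Triangulation clustering step is not covered.

```python
import cmath
import math
import xml.etree.ElementTree as ET


def tag_series(xml_text, codes):
    """Encode each start tag, in document order, as an integer impulse."""
    root = ET.fromstring(xml_text)
    series = []
    for elem in root.iter():  # pre-order traversal follows start-tag order
        # Assumed encoding: tags get consecutive integers 1, 2, ... as first seen.
        series.append(codes.setdefault(elem.tag, len(codes) + 1))
    return series


def dft_magnitudes(series, n):
    """Magnitudes of the n-point DFT of the series, zero-padded to length n."""
    padded = list(series) + [0] * (n - len(series))
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(padded)))
        for k in range(n)
    ]


def structural_distance(xml_a, xml_b):
    """Euclidean distance between the DFT magnitude spectra of two documents."""
    codes = {}  # shared, so the same tag maps to the same code in both documents
    a = tag_series(xml_a, codes)
    b = tag_series(xml_b, codes)
    n = max(len(a), len(b))
    fa = dft_magnitudes(a, n)
    fb = dft_magnitudes(b, n)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(fa, fb)))
```

Because only tags are encoded, two documents with the same structure but different text content are at distance zero under this sketch. Pairwise distances computed this way could fill a similarity matrix for a later clustering step.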

