ACS Vol. 4 No. 3, July 2014
Analysis of the Homogeneity of Wind Roses' Groups Employing Andrews’ Curves
Abstract: The homogeneity of groups of 16-dimensional wind direction roses (obtained by hierarchical clustering in a previous report) is assessed using Andrews’ Curves. Principal Component Analysis (PCA) is employed to reduce dimensionality and to provide an ordering of the variables used to compute the Andrews’ Curves. Our results suggest that Andrews’ Curves greatly facilitate the visualization of homogeneity and reveal information that allows the cluster arrangement to be improved. A combined analysis employing Andrews’ Curves and the Calinski and Harabasz approach (a method for determining the optimal number of groups) helps to assess the strength of the group structure of the data and to detect anomalies such as misclassified objects or atypical values. Furthermore, it shows that the 24 original seasonal hourly roses (representing the “day”) are better represented by 6 groups than by the 5 proposed in the previous report. The new group arrangement is consistent with the dendrogram cut at a different distance. As a result, the wind occurrences are now represented by a more detailed and smoother pattern: northern winds decrease between midday and twilight, while eastern winds become more important towards the evening. The proposed methodology could be considered for inclusion in an automated system.
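The workflow summarized in the abstract (PCA ordering, Andrews’ Curves, and the Calinski and Harabasz criterion) can be illustrated with a minimal Python sketch. The abstract gives no formulae, so the code assumes the standard definition of Andrews’ function, f_x(t) = x1/√2 + x2 sin t + x3 cos t + x4 sin 2t + ..., and the usual between/within dispersion ratio for the Calinski and Harabasz index. The function names, the number of retained components, the synthetic "roses" and the group labels are all hypothetical and are not the authors' data or implementation.

```python
import numpy as np


def pca_scores(X):
    """Project rows of X onto principal components (decreasing variance)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt.T


def andrews_curve(x, t):
    """Evaluate Andrews' function for one observation x at the points t."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    f = np.full_like(t, x[0] / np.sqrt(2.0))
    for i in range(1, x.size):
        k = (i + 1) // 2  # harmonic: 1, 1, 2, 2, 3, 3, ...
        f += x[i] * (np.sin(k * t) if i % 2 == 1 else np.cos(k * t))
    return f


def calinski_harabasz(X, labels):
    """Ratio of between-group to within-group dispersion (higher is better)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, groups = X.shape[0], np.unique(labels)
    k = groups.size
    grand_mean = X.mean(axis=0)
    between = within = 0.0
    for g in groups:
        members = X[labels == g]
        centroid = members.mean(axis=0)
        between += members.shape[0] * np.sum((centroid - grand_mean) ** 2)
        within += np.sum((members - centroid) ** 2)
    return (between / (k - 1)) / (within / (n - k))


# Synthetic stand-in for 24 hourly roses with 16 direction frequencies each
# (assumption: random data, used only to show the call sequence).
rng = np.random.default_rng(0)
roses = rng.dirichlet(np.ones(16), size=24)
labels = np.repeat(np.arange(6), 4)  # hypothetical 6-group assignment

scores = pca_scores(roses)[:, :3]  # keep the first 3 components (assumption)
t = np.linspace(-np.pi, np.pi, 200)
curves = np.array([andrews_curve(row, t) for row in scores])

print(curves.shape)
print(calinski_harabasz(scores, labels))
```

Feeding the PCA scores rather than the raw frequencies into `andrews_curve` places the highest-variance components on the lowest-frequency terms, which is the role the abstract assigns to PCA. Plotting each row of `curves` against `t`, colored by group, gives the visual homogeneity check: curves from a homogeneous group should stay close together over the whole interval.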
Cite this paper: Gustavo, R., Fabián, V. and Jorge, R. (2014) Analysis of the Homogeneity of Wind Roses' Groups Employing Andrews’ Curves. Atmospheric and Climate Sciences, 4, 447-456. doi: 10.4236/acs.2014.43043.

[1]   Ratto, G., Maronna, R. and Berri, G. (2010) Analysis of Wind Roses Using Hierarchical Cluster and Multidimensional Scaling Analysis at La Plata, Argentina. Boundary-Layer Meteorology, 137, 477-492.

[2]   Andrews, D.F. (1972) Plots of High-Dimensional Data. Biometrics, 28, 125-136.

[3]   Unwin, A. (2008) Good Graphics? In: Chen, C., Härdle, W. and Unwin, A., Eds., Handbook of Data Visualization, Springer, Heidelberg, 57.

[4]   Moustafa, R.E. (2011) Andrews’ Curves. Computational Statistics, 3, 373-382.

[5]   Fayyad, U., Grinstein, G. and Wierse, A. (2002) Information Visualization in Data Mining and Knowledge Discovery. Elsevier, London.

[6]   Uddin, M., Hussain, M. and Fatmi, A.I. (2011) Visualizing Multivariate Data with Andrews’ Curves. Proceedings of the 8th International Conference on Recent Advances in Statistics: Statistics, Biostatistics and Econometrics, Lahore, 8-9 February 2011, 213-222.

[7]   Cluff, E., Burton, R. and Barrett, W. (1991) A Survey and Characterization of Multidimensional Presentation Techniques. Journal of Imaging Technology, 47, 142-153.

[8]   Embrechts, P., Herzberg, A.M., Kalbfleisch, H.K., Traves, W.N. and Whitla, R. (1995) An Introduction to Wavelets with Applications to Andrews’ Plots. Journal of Computational and Applied Mathematics, 64, 41-56.

[9]   Garcia-Osorio, C. and Fyfe, C. (2005) The Combined Use of Self-Organizing Maps and Andrews’ Curves. International Journal of Neural Systems, 15, 197-206.

[10]   Carr, D.B. (1998) Multivariate Graphics. In: Armitage, P. and Colton, T., Eds., Encyclopedia of Biostatistics, Wiley, Chichester, 2864-2886.

[11]   Seber, G.A.F. (2004) Multivariate Observations. John Wiley and Sons, New Jersey.

[12]   Gnanadesikan, R. (1997) Methods for Statistical Data Analysis of Multivariate Observations. John Wiley and Sons, New York.

[13]   Spencer, N.H. (2003) Investigating Data with Andrews Plots. Social Science Computer Review, 21, 244-249.

[14]   Wilks, D.S. (2006) Statistical Methods in the Atmospheric Sciences, 2nd Edition, Elsevier, New York.

[15]   Chan, W.W.-Y. (2006) A Survey on Multivariate Data Visualization. Report of the Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong.

[16]   Schmid, C. and Hinterberger, H. (1994) Comparative Multivariate Visualization across Conceptually Different Graphic Displays. Proceedings of SSDBM’94, Charlottesville, 28-30 September 1994, 42-51.

[17]   Hinterberger, H. (2010) The VisuLab®: An Instrument for Interactive, Comparative Visualization. Technical Report Nr. 682, Department of Computer Science Information Technology and Education, Zurich.

[18]   Martinez, W. and Martinez, A. (2002) Computational Statistics Handbook with MATLAB®. Chapman & Hall/CRC, Washington.

[19]   Garcia, J.R.M., Monteiro, A.M.V. and dos Santos, R.D.C. (2012) Visual Data Mining for Identification of Patterns and Outliers in Weather Stations’ Data. Proceedings of the 13th International Conference on Intelligent Data Engineering and Automated Learning—IDEAL 2012, Natal, 29-31 August 2012, 7435, 245-252.

[20]   Calinski, T. and Harabasz, J. (1974) A Dendrite Method for Cluster Analysis. Communications in Statistics, 3, 1-27.

[21]   Milligan, G.W. and Cooper, M.C. (1985) An Examination of Procedures for Determining the Number of Clusters in a Data Set. Psychometrika, 50, 159-179.

[22]   Tibshirani, R., Walther, G. and Hastie, T. (2001) Estimating the Number of Clusters in a Dataset via the Gap Statistic. Journal of the Royal Statistical Society Series B, 63, 411-423.

[23]   Jolliffe, I.T., Jones, B. and Morgan, B.J.T. (1986) Comparison of Cluster Analyses of the English Personal Social Services Authorities. Journal of the Royal Statistical Society Series A, 149, 253-270.