Using Data Mining with Time Series Data in Short-Term Stocks Prediction: A Literature Review

Affiliation(s)

Department of Mathematics, Instituto Politécnico do Porto, Porto, Portugal.

Department of Mathematics, Faculdade de Ciências, Universidade da Beira Interior, Covilha, Portugal.

Department of Informatics, Faculdade de Engenharia, Universidade da Beira Interior, Covilha, Portugal.

ABSTRACT

Data Mining (DM) methods are increasingly used for prediction with time series data, alongside traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction, an area that has attracted a great deal of attention from researchers. The main contribution of this paper is an outline of the use of DM with time series data, drawing mainly on examples related to short-term stock prediction, which is important for a better understanding of the field. Some of the main trends and open issues are also introduced.

Cite this paper

J. Azevedo, R. Almeida and P. Almeida, "Using Data Mining with Time Series Data in Short-Term Stocks Prediction: A Literature Review," *International Journal of Intelligence Science*, Vol. 2 No. 4, 2012, pp. 176-180. doi: 10.4236/ijis.2012.224023.
