According to the World Tourism Organization, tourism is considered a sector of hope. Tourism accounts for 10% of world GDP. It contributes to world societies by promoting cultures, as well as by adding more jobs to the economy. In 2016, 1.2 billion tourists were recorded worldwide and this number is expected to grow to 1.8 billion (50% growth) by 2030  . This research paper focuses on one aspect of the tourism and hospitality sector, which is accommodation and, more specifically, hotels. Practitioners as well as researchers have been motivated to study the dynamic prices of hotel rooms and understand the determinants of these changes and, accordingly, be able to forecast prices more effectively. Akm  , Drew et al.  , Hassani et al.  , Padhi and Aggarwal  , Yang et al.  , Youn and Gu  , Jovanovic et al.  , Uysal  , and Magnini et al.  have all forecasted hotel or tourism demand using neural networks jointly with support vector analysis, the autoregressive integrated moving average (ARIMA), logistic regression, fuzzy goal programming, and decision trees.
This study introduces a novel approach to hospitality forecasting in the Middle East and North Africa (MENA) region, and more specifically to forecasting hotel rooms’ average daily rates (ADR) in eight major cities of that region. The forecasting methods include linear models (simple moving average and ARIMA) and nonlinear models (the radial basis function (RBF) network and the support vector machine (SVM)). Research on hotels and advanced machine learning forecasting techniques is very limited in this region. This study adds to the literature in this field.
The rest of the paper is organized as follows: the second section reviews the literature on tourism and hospitality studies and the different models used for forecasting; the third section covers the problem definition and research objective of this study; the fourth section presents the conceptual framework, which gives a visual representation of the study objective and hypotheses; the fifth is the methodology section, which covers the data collection, a detailed data analysis, a list of the key variables and the models used in the research; the sixth section covers the statistical performance evaluation and the measures employed to select the best model. Finally, a conclusion section summarizes the study outcomes and provides recommendations for future studies.
2. Literature Review
In the tourism and hospitality industry, it is very important to understand the variables affecting demand and eventually the performance of the industry (or of a particular hotel or restaurant). In the past, managers and decision-makers relied on simple forms of data analysis (simple linear regression or multiple regression) to investigate the influence of those variables. More recently, big data analytics has unlocked the potential for studying complex business situations to understand the correlation between variables or causes and eventually forecast performance. Machine learning techniques have made it possible to analyze tens, hundreds, or even thousands of data variables that are either stored or live-streamed to help shape a time-bound decision-making process.
Hospitality and tourism studies using machine learning to predict demand have gained momentum lately. Simple neural networks (NN) (or artificial neural networks (ANN)) have been the most-used machine learning technique. Philips et al.  , Pattie and Snyder  , Govers et al.  , and Law  have all used NN or ANN to forecast tourism or hotel demand using historical tourists’ arrival data or room occupancy data. Some researchers have combined neural networks with other ML techniques, either to pick the best model or to improve the performance of the model.
Others have explored different machine learning techniques as well. Vu et al.  , Tkaczynski et al.  , Toral et al.  , Dolnicar and Leisch  , Brochado et al.  , and Geetha et al.  have used clustering while dealing with market or customer segmentations and consumer behaviors. Hadavandi et al.  , Yu and Schwartz  and Sohrabi et al.  have used fuzzy systems to predict hotel demand using arrival data. Fuzzy systems are also widely used in planning and decision-making in retail and banking. Li and Sun  used support vectors to predict firm failure using financial and non-financial data, while Chen and Wang  used the support vector technique to forecast the demand. Pantano et al.  used tourist attraction characteristics and the random forest method to predict tourist response, while Shapoval et al.  used inbound visitors numbers and the decision tree technique to develop effective destination marketing.
Rong et al.  , Xiang et al.  , Li et al.  , Yang et al.  , Guo et al.  , Versichele et al.  , Sun et al.  , Athanasopoulos et al.,  , Schwartz and Hiemstra  , Yang et al.  , Tussyadiah and Wang  , Pereira  , Liu et al.  , Lim et al.  , Wu et al.  , Zhang and Zhang  , and Chiu et al.  have all introduced new “unconventional” variables while studying hospitality performance. The results are evidence that machine learning techniques are extremely powerful in enhancing the accuracy of analysis and prediction. With a very globalized industry such as the hospitality and tourism industry, machine learning has indeed proven its potential.
3. Problem Definition and Research Objective
Prior studies have focused mostly on tourism demand in countries, that is, at the macro level. Tourism arrivals have been extensively used as a forecasting component and most of the studies are country-specific. Macroeconomic factors take longer to respond to changes, a phenomenon known as the lag effect. In contrast, firm-specific microeconomic factors, which represent the fundamental factor model, respond to changes much faster and are under the control of hotel managers and owners. They often lead to significant characterization of the dependent variable under investigation. The objective of this study is, first, to combine macro and micro elements while studying dynamic hotel prices. Several cities from the MENA region, which are assumed to go through the same economic effects, are included in the study. In addition, this study explores the benefits of using machine learning techniques in forecasting.
4. Research Framework and Proposed Hypotheses
Hotel practitioners have used inventory planning and pricing as major inputs in their revenue management systems, which has led to successful hotel performance  . Lee  found while measuring hotel room rates that prices were affected by both internal (hotel-specific) and external (economic) factors.
Driven by literature findings, hotel attributes have a direct effect on hotel performance. This leads to the first hypothesis:
H1: Big data on hotels leads to better price prediction.
Moreover, various economic factors were investigated in hotel performance studies. This has led to the inclusion of economic factors as a moderating effect to be tested in this study:
H2: Economic factors moderate the relation between hotels’ attributes and hotel room pricing.
The following diagram (Figure 1) provides a visual representation of the study direction.
Data collection and analysis
The daily hotel data used in this research paper came from STR and covered eight cities in the MENA region (at our request): Dubai, Jeddah, Manama, Muscat, Kuwait, Beirut, Amman, and Sharm El Sheikh. The hotels were split into three categories (Luxury and Upper Upscale Class, Upscale and Upper Midscale Class, Midscale and Economy Class). However, due to some missing data and the need for uniform analysis across all eight cities, it was decided to deal only with the Luxury and Upper Upscale class since data were available for this class for all cities. The sample contained 2800 observations of daily room sales covering the period between January 2010 and August 2017. The data set was split in the following way (Table 1).
Figure 1. Hotel performance determinants in selected MENA cities.
The data are split in order to test the models’ performance in predicting unseen observations (the test data are withheld while the models are trained) after all network/model parameters have been determined using the training data.
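A minimal sketch of such a split is given below. It assumes a chronological (unshuffled) split, which is the usual choice for time series; the 80/20 proportion and the function name `chronological_split` are illustrative assumptions and are not the proportions of Table 1.

```python
def chronological_split(observations, train_fraction=0.8):
    """Split a time-ordered sequence into training and test parts without shuffling.

    train_fraction is a hypothetical value chosen for illustration only.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(observations) * train_fraction)
    # Earlier observations train the model; later ones are held out for testing.
    return observations[:cut], observations[cut:]


# Synthetic stand-in for 2800 daily ADR observations (not the study's data).
adr = [100 + (day % 7) for day in range(2800)]
train, test = chronological_split(adr)
print(len(train), len(test))
```

Keeping the test observations strictly later than the training observations avoids leaking future information into the fitted parameters.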
Economic variables were obtained from other credible sources such as the World Bank, the World Tourism Organization, the World Economic Forum and the US Energy Information Administration. The only challenge was that most of these data had different frequencies (monthly, yearly, and once every two years). To deal with different frequencies, international tourist arrival data were converted to daily rates by dividing the annual rate by 365, while country-level annual GDP growth percentage rate, inflation rate (average consumer prices), oil price (WTI and Brent), and index data on each country’s business environment, safety and security, health and hygiene, human resources, and labor market were all converted to daily rates by maintaining the same rate throughout the year. The aim was to use these variables and measures to gain insight into the determinants of hotel room prices that would help us, and eventually decision-makers, to predict these prices with more accuracy. Innovative ways of handling mixed data frequencies could be an opportunity for future research. This study would also be a good piece of research to validate the hotel performance determinants (HPD) model suggested by Assaf et al.  . The table below (Table 2) represents a list of the variables found in the literature that we utilized in our study based on data availability/accessibility.
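The frequency conversion described above can be sketched as follows. The helper `annual_to_daily` and the sample figures are hypothetical; the sketch only mirrors the two rules stated in the text: totals such as tourist arrivals are divided by 365, while rates and indices keep the same value on every day of the year.

```python
from datetime import date, timedelta


def days_in_year(year):
    """Number of calendar days in the given year (365 or 366)."""
    return (date(year + 1, 1, 1) - date(year, 1, 1)).days


def annual_to_daily(annual_values, spread):
    """Expand {year: value} into {date: value}.

    spread=True  divides the annual total by 365 (used for arrivals).
    spread=False repeats the annual value on each day (used for rates/indices).
    """
    daily = {}
    for year, value in sorted(annual_values.items()):
        day = date(year, 1, 1)
        for _ in range(days_in_year(year)):
            daily[day] = value / 365 if spread else value
            day += timedelta(days=1)
    return daily


# Hypothetical annual figures, for illustration only.
arrivals = {2015: 3_650_000, 2016: 7_300_000}
gdp_growth = {2015: 3.1, 2016: 2.9}

daily_arrivals = annual_to_daily(arrivals, spread=True)
daily_gdp_growth = annual_to_daily(gdp_growth, spread=False)
print(daily_arrivals[date(2015, 6, 1)], daily_gdp_growth[date(2016, 2, 29)])
```

Both rules produce a step function that changes only at year boundaries, which is one reason the text points to mixed-frequency handling as an opportunity for future research.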
6. Models in the Research
This research is based on predicting dynamic hotel room prices based on the selection of the best forecasting model. These models are linear (simple moving average and ARIMA) and non-linear (RBF and SVM) in form. The goal is to compare the above-mentioned models using the model performance measures to determine the best model or a combination of them.
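As an illustration of how such a comparison can be carried out, the sketch below computes three error measures that are common in the forecasting literature. The choice of MAE, RMSE, and MAPE is an assumption made for this example; the measures actually reported are those of Table 4.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical ADR values and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mae(actual, forecast), rmse(actual, forecast), mape(actual, forecast))
```

Lower values indicate a better fit on all three measures, so the model with the smallest errors on the held-out data would be preferred.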
6.1. Time Series Forecast
Using ADR values from previous years, a time series forecast helps predict the future ADR of hotels based on historical data. The main goal is to find a model with a better fit for the data, hence reducing the noise or error. The models that we used are the simple moving average and the Box-Jenkins ARIMA model. These models are widely used in tourism and hospitality research.
6.2. Simple Moving Average
The simple moving average method uses the average of previous n-periods as a forecast value  .
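In other words, the forecast for the next period is (y_t + y_(t-1) + ... + y_(t-n+1)) / n. A minimal sketch follows; the function names and the sample series are illustrative assumptions, and the window length n used in the study is not stated here.

```python
def sma_forecast(series, n):
    """Forecast the next value as the mean of the last n observations."""
    if n <= 0 or n > len(series):
        raise ValueError("n must be between 1 and len(series)")
    return sum(series[-n:]) / n


def rolling_sma_forecasts(series, n):
    """One-step-ahead forecasts for positions n..len(series)-1,
    each computed from the n observations that precede it."""
    return [sum(series[t - n:t]) / n for t in range(n, len(series))]


# Hypothetical ADR series, for illustration only.
series = [10.0, 20.0, 30.0, 40.0, 50.0]
print(sma_forecast(series, 3))
print(rolling_sma_forecasts(series, 3))
```

Because every observation in the window carries the same weight, the forecast lags behind trends and seasonal swings.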
Table 1. Data sample.
Table 2. Summary table of variables used in tourism and hospitality literature.
ARIMA (p, d, q) consists of the autoregressive AR(p), the moving average MA(q) and (d), which represents the order of differencing used to achieve stationarity. ARIMA is one of the models most widely used to forecast with time series data  .
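For reference, the general ARIMA(p, d, q) model can be written in backshift notation as below. The symbols are assumptions introduced here and do not appear in the text above: B is the backshift operator (B y_t = y_{t-1}), \phi_i are the autoregressive coefficients, \theta_j are the moving average coefficients, c is a constant, and \varepsilon_t is white noise.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
```

With d = 1, the factor (1 - B) y_t = y_t - y_{t-1} is the first difference of the series.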
6.3. Machine Learning Models
RBF is one of the most widely used neural network techniques. Used for classification as well as regression, the RBF model is a feed-forward neural network that is based on three layers: input, hidden and output  . RBF models gained interest due to their advantage in achieving faster convergence with fewer errors while also being reliable (Moradkhani et al., 2004)  .
where C represents the center and represents the width of the neuron or the radius (Wei, 2012)  .
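As an illustration, the sketch below assumes the commonly used Gaussian basis function exp(-||x - C||^2 / (2 * sigma^2)). Both the Gaussian form and the name `sigma` for the width are assumptions made for this example. The code shows only the forward pass through the three layers; the weights and centers are hypothetical rather than trained.

```python
import math


def gaussian_rbf(x, center, sigma):
    """Activation of one hidden neuron: exp(-||x - center||^2 / (2 * sigma^2))."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbf_network_output(x, centers, sigmas, weights, bias=0.0):
    """Input layer -> Gaussian hidden layer -> linear output layer."""
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, sigmas)]
    return bias + sum(w * h for w, h in zip(weights, hidden))


# Hypothetical network with two hidden neurons, for illustration only.
centers = [[0.0, 0.0], [10.0, 10.0]]
sigmas = [1.0, 1.0]
weights = [2.0, 3.0]
print(rbf_network_output([0.0, 0.0], centers, sigmas, weights))
```

Each hidden neuron responds most strongly to inputs near its center, and the width controls how quickly that response decays with distance.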
SVM was introduced by Vapnik in the early 1990s. It is a statistical learning technique widely used for classification and, more recently, for regression (support vector regression (SVR)). Unlike other models, SVM aims to minimize the generalization error (structural risk minimization). When visualized, SVM seeks the hyperplane that separates two (or more) classes with the maximum margin. SVR is a version of SVM for regression that was proposed by Drucker, Burges, Kaufman, Smola, and Vapnik. SVM can also work in higher dimensions if a kernel function is applied, which allows it to solve non-linear problems.
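Two building blocks of SVR can be sketched briefly: the epsilon-insensitive loss, which ignores errors inside a tube of half-width epsilon, and the kernel expansion used for prediction. The Gaussian kernel, the parameter names (`epsilon`, `gamma`), and the sample values are assumptions for illustration; the kernel and hyperparameters used in the study are not stated here, and fitting the dual coefficients requires a quadratic programming solver that is omitted.

```python
import math


def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Errors within the epsilon tube cost nothing; larger errors cost linearly."""
    return max(0.0, abs(actual - predicted) - epsilon)


def rbf_kernel(u, v, gamma):
    """Gaussian kernel k(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, dual_coefs, intercept, gamma):
    """Kernel expansion f(x) = sum_i coef_i * k(sv_i, x) + intercept."""
    return intercept + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


# Hypothetical fitted model with a single support vector, for illustration only.
print(epsilon_insensitive_loss(100.0, 110.0, 5.0))
print(svr_predict([0.0, 0.0], [[0.0, 0.0]], [2.0], 1.0, gamma=0.5))
```

The kernel replaces an explicit mapping to a higher-dimensional space, which is how a linear method can fit a non-linear relationship.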
7. Data Analysis
As a first step, we ran a descriptive statistics analysis of the data, which highlights the mean, maximum, minimum and standard deviation of the variables used in the study. The data used in this study are daily observations obtained from STR, which provided insights into the internal or microeconomic factors used by the industry to study the hospitality sector. Other macroeconomic factors, which also appeared in several studies within the tourism and hospitality field, were obtained from different sources such as the World Bank, the World Tourism Organization, the World Economic Forum and the US Energy Information Administration. However, those factors were country-based, since capturing them at city level and on a daily basis was impossible and out of the scope of this study. The table below (Table 3) provides a summary of the descriptive statistics for significant variables generated from the Dubai Luxury and Upper Upscale data, which was produced using IBM SPSS:
Skewness is close to zero and below 1 for most variables, which indicates that, although we are dealing with a very dynamic environment, the data are approximately normally distributed with means around zero.
Following the steps documented in the literature for each proposed model, the data were then fed to each model to produce forecasts and to carry out further analysis. The aim is to compare the models and choose the best model for ADR prediction (or perhaps a combination of models) based on the model performance measures.
As a first step in the time series analysis, the data were plotted to check visually for any seasonal patterns or trends throughout the years (refer to Figure 2 for ADR). The aim was to control for this seasonality and make the data stationary so that they could be modeled with ARIMA. Many of these cities showed acceptable stationarity, while some (e.g., Jeddah) showed increasing trends over a number of years, which necessitated some treatment of the data to make the mean constant. As a result, and to maintain uniformity, first-order differencing was applied to all cities’ data to make them stationary, following the 1970 Box-Jenkins method (see Figure 3).
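First-order differencing replaces each observation with its change from the previous day, which removes a linear trend from the mean. A minimal sketch follows; the function names and the sample values are illustrative assumptions.

```python
def difference(series, order=1):
    """Apply differencing `order` times: each pass maps y_t to y_t - y_(t-1)."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def undifference(last_level, diffs):
    """Rebuild level forecasts from forecast differences,
    starting from the last observed level."""
    levels = []
    level = last_level
    for d in diffs:
        level += d
        levels.append(level)
    return levels


# Hypothetical upward-trending ADR values, for illustration only.
adr = [100, 102, 105, 109]
print(difference(adr))
print(undifference(adr[-1], [5, 6]))
```

Forecasts made on the differenced series have to be accumulated back in this way before they can be compared with observed ADR levels.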
After dealing with stationarity, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) were employed. Different combinations of ARIMA orders were tested for each city, and the best model was selected for data analysis based on these criteria.
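Both criteria trade goodness of fit against model size: AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where L is the maximized likelihood, k the number of estimated parameters and n the number of observations. The sketch below assumes Gaussian residuals; the candidate orders and their log-likelihoods are hypothetical, and the helper names are illustrative.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood given model residuals."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi) + math.log(sigma2) + 1.0)


def aic(log_likelihood, k):
    """Akaike information criterion for k estimated parameters."""
    return 2.0 * k - 2.0 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * log_likelihood


def select_order(candidates):
    """candidates maps (p, d, q) to (log_likelihood, k, n); return the order with the lowest AIC."""
    return min(candidates, key=lambda order: aic(candidates[order][0], candidates[order][1]))


# Hypothetical fitted candidates, for illustration only.
candidates = {(1, 1, 0): (-100.0, 2, 50), (2, 1, 1): (-99.5, 4, 50)}
print(select_order(candidates))
```

BIC penalizes additional parameters more heavily than AIC once n exceeds about 7, so the two criteria can favor different orders.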
8. Statistical Performance
The following table (Table 4) represents the result of the models’ performance measures for each city as a measure of forecasting accuracy.
8.1. Conventional Techniques
By employing the ARIMA and simple moving average techniques to forecast future room rates, the study found that the simple moving average performed poorly according to the performance measures, while ARIMA was a significantly better predictor. As Table 4 shows, Amman, Dubai, and Sharm El Sheikh have the lowest model errors compared to other cities, while Manama produced the highest errors. Overall, ARIMA performed better than the simple moving average method in terms of forecasting accuracy using conventional techniques.
Figure 2. Average daily rates for Luxury and Upper Upscale hotels in selected cities in the MENA region (Stata/SE 13).
Figure 3. D1 of average daily rates for Luxury and Upper Upscale hotels in selected cities in the MENA region (Stata/SE 13).
Table 3. Descriptive statistics for Dubai Luxury and Upper Upscale sample data.
Table 4. Performance for all cities.
8.2. Innovative Techniques
One of the contributions of this paper is the use of innovative machine learning tools to forecast room prices. Both RBF and SVM were utilized for prediction, which resulted in significant improvements in performance. The inclusion of external economic factors could also be one reason why these models outperformed the conventional models.
When the forecasting accuracy of the different models was compared across the eight cities, SVM and RBF were found to perform better than ARIMA or the simple moving average. The results show the superiority of machine learning techniques in prediction compared to conventional forecasting models.
The use of innovative tools in hotel performance forecasting would help researchers as well as practitioners in planning effectively. Hotel internal attributes positively affect hotel performance and more specifically prices. External economic factors moderate the relationship between the hotel attributes and hotel performance.
The main objective of the study was to predict hotel room prices using new tools. The study shows that SVM is the leading model in “luxury and upscale” hotel room price forecasting, followed by RBF and then ARIMA, while the simple moving average is found in this study to be the inferior model.
Machine learning is insufficiently studied in the hotel and tourism sector in the MENA region, and this study adds to the academic literature. Given the abundance and rapid development of machine learning and artificial intelligence models in recent years, future studies could explore other such models and compare their performance against traditional models. Other performance measures such as precision and speed of returning results could be used for model evaluation as well. Given the dynamic environment within the tourism and hospitality sector, policy makers and hotel operators could use these tools to maintain their strategic lead. The SVM model, or a combination of forecasting models, can be utilized to forecast short-term and long-term market direction based on each model’s strength.
 Drew, J.H., Mani, D., Betz, A.L. and Datta, P. (2001) Targeting Customers with Statistical and Data-Mining Techniques. Journal of Service Research, 3, 205-219.
 Padhi, S.S. and Aggarwal, V. (2011) Competitive Revenue Management for Fixing Quota and Price of Hotel Commodities under Uncertainty. International Journal of Hospitality Management, 30, 725-734.
 Yang, Y., Tang, J., Luo, H. and Law, R. (2015) Hotel Location Evaluation: A Combination of Machine Learning Tools and Web GIS. International Journal of Hospitality Management, 47, 14-24.
 Youn, H. and Gu, Z. (2010) Predicting Korean Lodging Firm Failures: An Artificial Neural Network Model along with a Logistic Regression Model. International Journal of Hospitality Management, 29, 120-127.
 Magnini, V.P., Honeycutt Jr., E.D. and Hodge, S.K. (2003) Data Mining for Hotel Firms: Use and Limitations. Cornell Hotel and Restaurant Administration Quarterly.
 Philips, P., Zigan, K., Silva, M.M.S. and Schegg, R. (2015) The Interactive Effects of Online Reviews on the Determinants of Swiss Hotel Performance: A Neural Network Analysis. Tourism Management, 50, 130-141.
 Vu, H.Q., Li, G., Law, R. and Ye, B.H. (2015) Exploring the Travel Behaviors of Inbound Tourists to Hong Kong Using Geotagged Photos. Tourism Management, 46, 222-232.
 Tkaczynski, A., Rundle-Thiele, S. and Beaumont, N. (2010) Destination Segmentation: A Recommended Two-Step Approach. Journal of Travel Research, 49, 139-152.
 Geetha, M., Singha, P. and Sinha, S. (2017) Relationship between Customer Sentiment and Online Customer Ratings for Hotel—An Empirical Analysis. Tourism Management, 61, 43-54.
 Hadavandi, E., Ghanbari, A., Shahanaghi, K. and Abbasian-Naghneh, S. (2011) Tourist Arrival Forecasting by Evolutionary Fuzzy Systems. Tourism Management, 32, 1196-1203.
 Sohrabi, B., Vanani, I.R., Tahmasebipur, K. and Fazli, S. (2012) An Exploratory Analysis of Hotel Selection Factors: A Comprehensive Survey of Tehran Hotels. International Journal of Hospitality Management, 31, 96-106.
 Li, H. and Sun, J. (2012) Forecasting Business Failure: The Use of Nearest-Neighbour Support Vectors and Correcting Imbalanced Samples—Evidence from the Chinese Hotel Industry. Tourism Management, 33, 622-634.
 Pantano, E., Priporas, C.-V. and Stylos, N. (2017) “You Will Like It!” Using Open Data to Predict Tourists’ Response to a Tourist Attraction. Tourism Management, 60, 430-438.
 Rong, J., Vu, H.Q., Law, R. and Li, G. (2012) A Behavioral Analysis of Web Shares and Browsers in Hong Kong Using Targeted Association Rule Mining. Tourism Management, 33, 731-740.
 Xiang, Z., Du, Q., Ma, Y. and Fan, W. (2017) A Comparative Analysis of Major Online Review Platforms: Implications of Social Media Analytics in Hospitality and Tourism. Tourism Management, 58, 51-65.
 Li, G., Law, R., Vu, H.Q. and Rong, J. (2013) Discovering the Hotel Selection Preferences of Hong Kong Inbound Travelers Using the Choquet Integral. Tourism Management, 36, 321-330.
 Guo, Y., Barnes, S.J. and Jia, Q. (2017) Mining Meaning from Online Ratings and Reviews: Tourist Satisfaction Analysis Using Latent Dirichlet Allocation. Tourism Management, 59, 467-483.
 Versichele, M., et al. (2014) Pattern Mining in Tourist Attraction Visits through Association Rule Learning on Bluetooth Tracking Data: A Case Study of Ghent, Belgium. Tourism Management, 44, 67-81.
 Sun, X., et al. (2016) Using a Grey-Markov Model Optimized by Cuckoo Search Algorithm to Forecast the Annual Foreign Tourist Arrival to China. Tourism Management, 52, 369-379.
 Yang, Y., Pan, B. and Song, H. (2014) Predicting Hotel Demand Using Destination Marketing Organization’s Web Traffic Data. Journal of Travel Research, 53, 433-447.
 Pereira, L.N. (2016) An Introduction to Help Forecasting Methods for Hotel Revenue Management. International Journal of Hospitality Management, 58, 12-23.
 Morosan, C. and DeFranco, A. (2015) Disclosing Personal Information via Hotel Apps: A Privacy Calculus Perspective. International Journal of Hospitality Management, 47, 120-130.
 Wu, D.C., Song, H. and Shen, S. (2017) New Developments in Tourism and Hotel Demand Modeling and Forecasting. International Journal of Contemporary Hospitality Management, 29, 507-529.
 Zhang, C. and Zhang, J. (2014) Analysing Chinese Citizens’ Intentions of Outbound Travel: A Machine Learning Approach. Current Issues in Tourism, 17, 592-609.
 Lee, C.G. (2011) The Determinants of Hotel Room Rates: Another Visit with Singapore’s Data. International Journal of Hospitality Management, 30, 756-758.
 Assaf, A.G., Josiassen, A., Woo, L., Agbola, F.W. and Tsionas, M. (2017) Destination Characteristics That Drive Hotel Performance: A State-of-the-Art Global Analysis. Tourism Management, 60, 270-279.
 Fernández, J.I.P., Cala, A.S. and Domecq, C.F. (2011) Critical External Factors Behind Hotels’ Investments in Innovation and Technology in Emerging Urban Destinations. Tourism Economics, 17, 339-357.
 Martins, L.F., Gan, Y. and Ferreira-Lopes, A. (2017) An Empirical Analysis of the Influence of Macroeconomic Determinants on World Tourism Demand. Tourism Management, 61, 248-260.
 Dogru, T. and Sirakaya-Turk, E. (2017) Engines of Tourism’s Growth: An Examination of Efficacy of Shift-Share Regression Analysis in South Carolina. Tourism Management, 58, 205-214.
 Lado-Sestayo, R., Otero-Gonzalez, L., Vivel-Bua, M. and Martorell-Cunill, O. (2016) Impact of Location on Profitability in the Spanish Hotel Sector. Tourism Management, 52, 405-415.
 Hanly, P.A. (2012) Measuring the Economic Contribution of the International Association Conference Market: An Irish Case Study. Tourism Management, 33, 1574-1582.
 Chatziantoniou, I., Filis, G., Eeckels, B. and Apostolakis, A. (2013) Oil Prices, Tourism Income and Economic Growth: A Structural VAR Approach for European Mediterranean Countries. Tourism Management, 36, 331-341.
 Tang, C.-H. and Jang, S. (2009) The Tourism-Economy Causality in the United States: A Sub-Industry Level Examination. Tourism Management, 30, 553-558.
 Ahlert, G. (2008) Estimating the Economic Impact of an Increase in Inbound Tourism on the German Economy Using TSA Results. Journal of Travel Research, 47, 225-234.
 Burger, C., Dohnal, M., Kathrada, M. and Law, R. (2001) A Practitioner’s Guide to Time-Series Methods for Tourism Demand—A Case Study of Durban, South Africa. Tourism Management, 22, 403-409.
 Galavi, H., Mirzaei, M., Shui, L.T. and Valizadeh, N. (2013) Klang River-Level Forecasting Using ARIMA and ANFIS Models. Journal—American Water Works Association, 105, E496-E506.
 Wei, C.-C. (2012) RBF Neural Networks Combined with Principal Component Analysis Applied to Quantitative Precipitation Forecast for a Reservoir Watershed during Typhoon Periods. Journal of Hydrometeorology, 13, 722-734.
 Moradkhani, H., Hsu, K.-L., Gupta, H.V. and Sorooshian, S. (2004) Improved Streamflow Forecasting Using Self-Organizing Radial Basis Function Artificial Neural Networks. Journal of Hydrology, 295, 246-262.
 Pan, Y., Xiao, Z., Wang, X. and Yang, D. (2017) A Multiple Support Vector Machine Approach to Stock Index Forecasting with Mixed Frequency Sampling. Knowledge-Based Systems, 122, 90-102.