Weather forecasting has become essential to many human activities, including the operation of renewable energy systems. However, some traditional forecasting techniques are becoming ineffective under the impact of climate change. In countries affected by flash floods, conventional forecasting systems cannot predict such conditions because they are designed for large regions. Forecasting solar radiation is an important part of meteorology. Many methods have been developed to estimate solar radiation from correlations between solar radiation and other measured meteorological variables. In many studies, however, solar radiation data are not available because the location of interest has very few meteorological stations. Combining numerical weather prediction (NWP) with a time series method is a promising approach for modeling the variation of solar radiation.
In recent years, solar radiation has been modeled in many countries with different climates using machine learning based on Artificial Neural Networks (ANN). Countries such as Japan, the UK, the USA, China, India, and Spain apply ANN to model solar radiation according to their location and climate. One challenge for machine learning methods is that they require a large amount of input data related to the target variable, and important data may be unavailable because of the cost of installing and maintaining measurement sites. The successful planning of a renewable energy project depends on accurate solar radiation prediction. In addition, many architects and agriculturists require accurate solar radiation data, for example for farming purposes.
Some studies show that accurate solar radiation prediction for several weather stations requires a combination of input parameters such as the coordinates of the location (latitude and longitude), sunshine hours, maximum and minimum ambient temperature, albedo, aerosol optical depth, cloudiness, evaporation, precipitation, and relative humidity. As mentioned above, however, long-term solar radiation records are costly, so they are limited to specific locations. Another important issue is the duration of the study. The test period must be longer than one day, particularly for cloudy and rainy days, to stabilize the uncertainty errors of the weather forecast. Forecast accuracy is typically assessed with the mean absolute error (MAE), mean bias error (MBE), root mean square error (RMSE), and mean absolute percentage error (MAPE), or with the corresponding normalized errors nMAE, nMBE, nRMSE, and nMAPE.
ANN has been applied successfully in a variety of areas, as presented in several studies. Vakili et al. used an ANN model to predict daily global solar radiation. Their study used input parameters such as relative humidity, wind speed, and daily temperature over one year for Tehran, Iran. They used three types of ANN models to predict daily global solar radiation: Multilayer Perceptron (MLP), Generalized Regression Neural Network (GRNN), and Radial Basis Neural Network (RBNN). Error metrics such as root mean square error (RMSE), mean bias error (MBE), and the absolute fraction of variance (R2) were used to evaluate the accuracy and efficiency of the models. Their results showed that the MLP and RBNN models were more accurate than the GRNN model. Yadav and Chandel reported that the accuracy of ANN models depends mostly on the input parameters. The study focused on estimating solar radiation for the Eastern Mediterranean Region of Turkey with an ANN model, varying the learning algorithm and the number of hidden neurons to optimize prediction performance. The results showed that ANN predicts solar radiation more accurately than conventional methods and that the ANN model depends on the network architecture, the training algorithm, and the combination of input parameters.
The objective of this study is to combine the Weather Research and Forecasting (hereinafter WRF) model with a deep learning method, Long Short-Term Memory (hereinafter LSTM), for future prediction. The results of a WRF simulation for one calendar year, from January to December 2014, were used as input data for the LSTM to predict future solar radiation for a location in Dili, Timor Leste. This dataset was divided into a training set (81%) and a testing set (19%). Three months of observation data from a weather station in Dili were used for comparison.
This work is organized as follows. Section 2 describes the study domain, the evaluation of observation data, and the data sources. Section 3 presents the methodology, including the WRF model, the machine learning method based on Long Short-Term Memory, and the four error metrics. Section 4 presents the simulation results, in particular three months of daily solar radiation forecasting and the analysis of the four error metrics. Section 5 concludes the paper.
2. Study Domain, Evaluation Observation, and Sources
Weather data covering 2160 hours from January to March 2015 were collected at one station in Hera (lat: 8˚33'03.9''S, long: 125˚39'33.7''E), located about 12.4 km east of Dili, Timor Leste, as shown in Figure 1. The weather station, a Vaisala WXT530 located at the center of the Faculty of Engineering, Science, and Technology on the Hera campus, provides hourly solar radiation, which is compared with the results of the combined WRF model and LSTM network for further analysis. The station also provides wind speed, wind direction, and temperature. However, because the data generated by the weather station are very limited, only three months of solar radiation data, from 1st January to 31st March 2015, were used in this study for local forecasting.
Figure 1. Plotting of land cover study area (See Ref. ).
The objective of getting external information from weather forecasting services is to obtain solar radiation for energy management applications. The Global Forecast System (GFS) is the most widely used data source for weather forecasts. It provides weather data and has proven useful for various weather variables, including solar radiation, and for solar farm operations. Six-hourly NCEP FNL analysis data at 1˚ × 1˚ resolution, obtained via a web server (https://rda.ucar.edu/datasets/ds083.2/), were used as the Global Forecast System (GFS) initial data for the simulation of one calendar year from 1st January to 31st December 2014.
3.1. WRF Model
The Weather Research and Forecasting (WRF) model with the Advanced Research WRF (ARW) core, version 3.9.1, was used to simulate solar radiation. WRF-ARW is an open-source mesoscale numerical weather prediction model developed with contributions from a large community that includes the National Oceanic and Atmospheric Administration (NOAA), the National Centers for Environmental Prediction (NCEP), and the National Center for Atmospheric Research (NCAR). WRF solves the dynamic and thermodynamic equations of the atmosphere. In addition, WRF runs physical schemes that simulate phenomena the dynamical solver cannot resolve. One major advantage of WRF is that it offers a large choice of implementations for each physical scheme, which allows users to configure the model according to their needs.
In this study, the performance of WRF was evaluated on three two-way nested domains with horizontal resolutions of 9, 3, and 1 km, as illustrated in Figure 2. Domain 1 is composed of 86 × 68 cells, domain 2 of 88 × 88 cells, and domain 3 of 100 × 100 cells. The WRF Single-Moment 5-Class scheme was used for the microphysics. RRTMG, a newer version of the Rapid Radiative Transfer Model, was applied for longwave (LW) and shortwave (SW) radiation. The MM5 scheme based on Monin-Obukhov theory was used for the surface layer. Noah LSM was used for the land surface. The Yonsei University scheme was used for the planetary boundary layer. Mercator was used as the map projection. Only data from domain 3 (d03) were used as input to the LSTM network and compared with the observed data for further analysis of the forecast. NCL (NCAR Command Language) version 6.5.0 was used to plot the grid points and variables.
Figure 2. (a) The three two-way nested domains; (b) Plot of the 1 km horizontal resolution domain d03 (See Ref. ).
Three weather forecast variables (solar direct, solar diffuse, and cos zenith) were used to calculate the ground surface solar radiation, defined by the formula below,
where: Srad = solar radiation in W/m2.
Sdir = direct solar radiation in W/m2.
Sdif = diffuse solar radiation in W/m2.
CZ = solar zenith angle in degrees.
Phi = 3.1415926/180, the factor that converts degrees to radians.
0˚ = ground surface solar radiation.
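The formula itself is not reproduced in this text, so the following is only a minimal sketch of one standard relation that is consistent with the variables listed above: the direct beam projected onto the horizontal plane plus the diffuse component. It assumes that Sdir is the beam irradiance measured normal to the sun; if Sdir is already given on the horizontal plane, the cosine projection would be dropped. The function name and the clipping of the cosine at zero are illustrative assumptions, not taken from the study.

```python
import math

# Degree-to-radian factor, using the constant quoted in the text.
PHI = 3.1415926 / 180.0


def surface_solar_radiation(s_dir, s_dif, zenith_deg):
    """Assumed relation: Srad = Sdir * cos(CZ * Phi) + Sdif, all in W/m2."""
    cos_zenith = math.cos(zenith_deg * PHI)
    # Assumption: when the sun is below the horizon the direct beam contributes nothing.
    cos_zenith = max(cos_zenith, 0.0)
    return s_dir * cos_zenith + s_dif


if __name__ == "__main__":
    # Hypothetical sample values, not data from the study.
    print(surface_solar_radiation(800.0, 100.0, 30.0))
```

Clipping the cosine keeps night-time values at the diffuse level instead of producing negative radiation.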
3.2. LSTM Network
The LSTM network, a branch of the recurrent neural network (RNN) family, is suitable for various forecasting problems, particularly solar radiation prediction. The ability of the LSTM architecture to handle several time series parameters is a great benefit in time series forecasting, because the same inputs can be applied to multivariate data for future prediction. As part of the RNN family, the LSTM structure consists of three layers: an input layer, a hidden layer, and an output layer, as shown in Figure 3. The LSTM network is commonly implemented with the Keras package for training and testing. This work uses a moving-forward window technique to predict the next time step. The number of hidden layers, the number of neurons, the number of epochs, and the batch size play an important role in the implementation of Long Short-Term Memory. In this study, these parameters were selected by trial and error: 1 - 512 neurons, a batch size of 1 - 300, and 1 - 325 epochs were evaluated until the results converged close to the observed data. The input data were normalized to the range (−1, 1) with a min-max scaler before running the algorithm. Table 1 gives the configuration of the LSTM network: a two-layer LSTM architecture with 512 hidden neurons, coupled with a dense output layer with linear activation, using 50 time steps and 1 feature. The maximum number of epochs was set to 325 with a batch size of 300 and a validation split of 0.09.
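The preprocessing described above (min-max scaling to (−1, 1), windows of 50 time steps with 1 feature, and a moving-forward window for the next step) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the content of the study's LSTM.py: all function names are hypothetical, and `predict_next` stands in for whatever trained model produces the next value from a window.

```python
def minmax_scale(values, lo=-1.0, hi=1.0):
    """Linearly map values onto [lo, hi]; return the scaled list and (vmin, vmax)."""
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0  # avoid division by zero for a constant series
    scaled = [lo + (v - vmin) * (hi - lo) / span for v in values]
    return scaled, (vmin, vmax)


def make_windows(series, time_steps=50):
    """Build (window, next value) pairs for one-step-ahead training."""
    windows, targets = [], []
    for i in range(len(series) - time_steps):
        windows.append(series[i:i + time_steps])
        targets.append(series[i + time_steps])
    return windows, targets


def roll_forward(history, predict_next, horizon, time_steps=50):
    """Predict `horizon` steps, feeding each prediction back into the window."""
    window = list(history[-time_steps:])
    forecasts = []
    for _ in range(horizon):
        nxt = predict_next(window)
        forecasts.append(nxt)
        # Slide the window forward by one step.
        window = window[1:] + [nxt]
    return forecasts


if __name__ == "__main__":
    # Hypothetical hourly series, not data from the study.
    series = [float(i % 24) for i in range(200)]
    scaled, bounds = minmax_scale(series)
    windows, targets = make_windows(scaled)
    # Persistence is used here only as a stand-in for the trained LSTM.
    print(len(windows), roll_forward(scaled, lambda w: w[-1], horizon=3))
```

Each window would be reshaped to (50, 1) before being passed to a recurrent layer, and forecasts are mapped back to W/m2 by inverting the scaling with the stored minimum and maximum.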
Figure 3. Recurrent neural network model.
Table 1. LSTM configuration parameter used in this study.
Figure 3 shows the RNN model that is commonly used for time series forecasting. The input layer is the first layer and has its own weights; each subsequent layer receives weighted inputs from the previous layer, with an activation function in the hidden layer and a linear function in the output layer. A delay between the input layer and the hidden layer carries information from the previous time (t − 1) so that it can be used at the current time (t). The parameters x(t) and y(t) are the input and output of the time series. The RNN can be described by the equations below,
where W0H, W1H, and WHH are the three connection weights, and h(t) is a set of values summarizing all the past information that is necessary to describe the future.
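The equations are not reproduced in this text. As a hedged illustration of the recurrence described above, the sketch below implements a textbook single-unit Elman RNN: the hidden state h(t) combines the current input x(t) with the delayed state h(t − 1), and the output y(t) is a linear function of h(t). The weight names `w_in`, `w_rec`, and `w_out` and the tanh activation are assumptions made for the example; how they map onto the three connection weights named in the text is not specified here.

```python
import math


def rnn_forward(xs, w_in, w_rec, w_out, h0=0.0):
    """Run a single-unit Elman RNN over the sequence xs and return the outputs y(t)."""
    h = h0
    ys = []
    for x in xs:
        # Hidden state: current input plus the delayed state h(t-1), through tanh.
        h = math.tanh(w_in * x + w_rec * h)
        # Output layer: linear function of the hidden state.
        ys.append(w_out * h)
    return ys


if __name__ == "__main__":
    # Hypothetical weights and inputs, chosen only to show the recurrence.
    print(rnn_forward([1.0, 0.0, 0.0], w_in=1.0, w_rec=1.0, w_out=1.0))
```

The second and third outputs are non-zero even though their inputs are zero, which is the memory effect that h(t) provides.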
3.3. Evaluation of Solar Radiation
Two error metrics, root mean square error (RMSE) and mean bias error (MBE), taken from David et al. and expressed in W/m2, were applied in this paper to evaluate the accuracy of the simulated data against the observed data. In addition, normalized MBE (nMBE) and normalized RMSE (nRMSE), expressed in %, were used to normalize the errors over the considered period. These four error metrics are defined below,
where n is the number of time steps, pred is the output of the combined WRF and LSTM algorithm, and obs is the data observed at the Dili weather station. Rmax and Rmin are the maximum and minimum values of solar radiation in the simulated and observed data. All error metrics were computed from hourly data for the considered period. MBE indicates whether the model underestimates (MBE < 0) or overestimates (MBE > 0), while RMSE, nMBE, and nRMSE account for the distribution and percentage of the error.
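The metric definitions are not reproduced in this text, so the sketch below shows one way to compute them from the symbols described above. The sign convention (pred minus obs) follows the statement that MBE > 0 means overestimation. Normalizing by the range Rmax − Rmin, taken over both series, is an assumption inferred from the description of Rmax and Rmin; other studies normalize by the mean of the observations instead. The function name is hypothetical.

```python
import math


def error_metrics(pred, obs):
    """Return MBE and RMSE (same unit as the inputs) and nMBE, nRMSE in percent."""
    n = len(obs)
    diffs = [p - o for p, o in zip(pred, obs)]  # positive means overestimation
    mbe = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Assumption: normalize by the range of values across both series.
    r_max = max(max(pred), max(obs))
    r_min = min(min(pred), min(obs))
    span = r_max - r_min
    return {
        "MBE": mbe,
        "RMSE": rmse,
        "nMBE": 100.0 * mbe / span,
        "nRMSE": 100.0 * rmse / span,
    }


if __name__ == "__main__":
    # Hypothetical values in W/m2, not data from the study.
    print(error_metrics([110.0, 210.0, 290.0, 400.0], [100.0, 200.0, 300.0, 400.0]))
```

With range normalization, nMBE and nRMSE are simply MBE and RMSE expressed as a percentage of the span of values in the period.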
This section presents the results of the three-month comparison between simulated and observed data, including the analysis of the four error metrics for solar radiation forecasting. Since the local weather station provides no information on cloud cover, aerosol optical depth (AOD), water vapor, cloud water path, or cloud effective radius, only 2160 hours of solar radiation were used to ensure the comparability and accuracy of the experiment. Figures 4(a)-(c) compare the values predicted by the combination of WRF and LSTM with the data observed at the local weather station. The blue curves represent the weather station observations and the red curves represent the LSTM network. The 2160 hours from 1st January to 31st March 2015 show interesting values for prediction purposes.
Figure 4(a) shows 744 hours of solar radiation forecasting in January from the combination of the WRF model and the LSTM network, compared with the weather station observations. Some hours of observation data were zero, and on the 1st, 2nd, 3rd, 4th, 12th, and 14th days the observed solar radiation ranged from a minimum of about 1.8 W/m2 to a maximum of 428 W/m2. On other days at the beginning and in the middle of the month, the observed solar radiation was also lower than the LSTM data, because an unstable electricity supply limited the data recorded by the weather station. However, some days in January showed good forecast accuracy, particularly in the middle and at the end of the month when the electricity supply was stable. Figure 4(b) illustrates 672 hours of solar radiation comparison between LSTM and observed data in February. The values for this month were close to each other, but some hours of observed data were zero because of power outages. Some other hours also showed a mismatch, particularly the 6th at 11 AM, the 9th at 1 and 2 PM, and the 10th at 11 AM, where the simulated data are slightly higher than the observed data. Figure 4(c) shows the comparison for March 2015, where the observed solar radiation was mostly lower than the simulated data. Some hours of observation data were zero, particularly on the 2nd and 28th days, which lowered the observed solar radiation. Nevertheless, March shows better results than January and February: the difference between forecast and observation ranged from about 4.4 W/m2 on several days to a maximum of 485 W/m2.
Figure 4. (a) Plotting of solar radiation analysis in January; (b) Plotting of solar radiation analysis in February; (c) Plotting of solar radiation analysis in March.
When sunlight passes through the atmosphere, solar irradiance is reduced by damping processes such as absorption by water vapor, cloud cover, and aerosols. In addition, humidity, which varies with time and location, may also decrease solar irradiance. In this study, the maximum solar radiation predicted by the LSTM method was around 1002 W/m2, 991 W/m2, and 992 W/m2 in January, February, and March, respectively. Figures 4(a)-(c) show that the hourly solar radiation from the LSTM mostly reaches above 600 W/m2. Hours with solar radiation below 600 W/m2 are presumed to correspond to rainy days.
Table 2 shows the values of root mean square error (RMSE), mean bias error (MBE), normalized MBE (nMBE), and normalized RMSE (nRMSE). The RMSE reached 203 W/m2 in January, 177 W/m2 in February, and 161 W/m2 in March. The MBE was above 0 in all three months, which indicates that the combination of both methods overestimates solar radiation. The nMBE showed a small percentage error, decreasing from 7.38% to 0.65%. The nRMSE also showed a small percentage error, decreasing from 20.09% to 16.18%. The percentage and distribution of the error are essential for assessing forecasting skill, hence MBE, RMSE, nMBE, and nRMSE are used to evaluate the performance of the model. All four error metrics decrease from January to February and then to March, indicating good performance of the LSTM model.
Based on the decision surface, every month of the year can have a strong effect on solar radiation forecasting, which may be caused by the top-of-atmosphere solar insolation, the maximum and minimum ambient temperature, and the ambient pressure. Moreover, the latitude and longitude of the location may also influence surface solar radiation. The values of the four error metrics demonstrate that the LSTM algorithm can successfully improve the performance of solar radiation forecasting. Good forecast accuracy with LSTM can be obtained by adjusting the number of epochs, the batch size, the number of neurons, and the validation split. In addition, the performance of LSTM can be influenced by the frequency of the input variables, such as hourly or daily data. All these parameters were tuned through a large number of trial-and-error runs to obtain the results closest to the observation data. Overall, the LSTM forecasts of solar radiation showed good accuracy and agreement with the observations. The main limitation of the present study is the lack of weather station data in 2015 available for comparison with the LSTM method.
Table 2. The four error metrics analysis in January, February, and March.
In this study, the reliability of three months of solar radiation forecasts from a combination of the WRF model and the LSTM method was evaluated against observation data in Dili, Timor Leste. WRF estimates at 1 km horizontal resolution and hourly time resolution were used as input data for the LSTM method to predict three months of solar radiation at the beginning of 2015. The 1˚ × 1˚ FNL analysis data, freely available from the NCEP website, were used to run the WRF model for solar radiation simulation. Since the three months of observed data in Dili lack information on other variables, the solar radiation variable alone was used to analyze the performance of the combined methods for future prediction. The three months of observed data at the beginning of 2015 are valuable for understanding long-term solar radiation forecasting. However, some weather station values were zero at the beginning of the year, which increased the four error metrics. Meanwhile, an understanding of numerical weather prediction, input data, and deep learning helps to analyze forecasting performance.
Three important variables (solar direct, solar diffuse, and cos zenith) were used to evaluate solar radiation on the ground surface. The first analysis showed that the LSTM method overestimated solar radiation, with MBE values of about 0.07, 0.04, and 0.006 in January, February, and March, respectively. The RMSE, nMBE, and nRMSE also decreased over the three months of prediction. A lower solar radiation forecasting error in two metrics does not always imply a lower forecasting error for the solar PV system; however, a lower forecasting error generally means higher accuracy for the solar radiation forecast itself.
The main contribution of this paper is the evaluation of a combination of two well-developed and powerful models, the WRF model and the LSTM network, for local solar radiation forecasting. Even though only a single location and a limited amount of forecasting data are presented, the results give useful insight for PV installation as an initial assessment in solar energy modeling. This study proposes that combining a physical method with a learning model is a best-of-breed approach to better appraise the accuracy of the corresponding forecasts. The study concludes that combining WRF and LSTM for solar radiation prediction in Dili is one solution that can be extended to other variables for future prediction.
I am grateful to Dr. Ruben Jeronimo Freitas, lecturer at the Hera campus, for providing the weather station data. I would like to thank the providers of the WRF-based NWP model and the LSTM method for making their resources freely available. My special thanks go to the anonymous reviewers for their useful comments on this paper.
This paper, entitled “Combination of WRF model and LSTM network for solar radiation forecasting – Timor Leste case study”, contains solar radiation forecasting data for the Timor Leste region that are useful for comparison purposes. Simulation data were compared with a local weather station for a three-month period at the beginning of 2015. By running the simulation for Hera, which has high solar radiation and is located east of Dili, this paper highlights future solar radiation forecasting. It also shows the results of four error metrics: mean bias error (MBE), root mean square error (RMSE), normalized RMSE (nRMSE), and normalized MBE (nMBE). This study was run on a computer with an Intel(R) Core(TM) i7-7700 CPU @ 3.60 GHz and 8 GB of RAM, under the 64-bit Ubuntu 16.04 long term support Linux distribution.
This study first runs a one-year simulation of solar radiation for 2014 with the WRF model. The WRF simulation results were used as input data for the LSTM method to predict solar radiation in January, February, and March 2015. The LSTM results were compared with a local weather station to better understand the LSTM code application. This study provides the supplementary materials “Appendix.xlsx” and “LSTM.py”. The file “Appendix.xlsx” contains hourly solar radiation data from the simulation (LSTM) and the observations (local weather station) for the three-month period of January, February, and March 2015. The file “LSTM.py” is a Python file that contains the code to run the future prediction of solar radiation with the LSTM method.
Value of the data
1) The data in “Appendix.xlsx” may be used by other researchers for comparison with their own forecasting data.
2) The “LSTM.py” file may be used by other researchers to forecast other variables.
 Sharma, N., Gummeson, J., Irwin, D., Zhu, T. and Shenoy, P. (2014) Leveraging Weather Forecasts in Renewable Energy Systems. Sustainable Computing: Informatics and Systems, 4, 160-171.
 Yousif, J.H., Al-Balushi, H.A., Kazem, H.A. and Chaichan, M.T. (2019) Analysis and Forecasting of Weather Conditions in Oman for Renewable Energy Applications. Case Studies in Thermal Engineering, 13, Article ID: 100355.
 Fowdur, T.P., Beeharry, Y., Hurbungs, V., Bassoo, V., Ramnarain-Seetohul, V. and Chan Moo Lun, E. (2018) Performance Analysis and Implementation of an Adaptive Real-Time Weather Forecasting System. Internet of Things, 3-4, 12-33.
 Sun, H.W., Zhao, N., Zeng, X.F. and Yan, D. (2015) Study of Solar Radiation Prediction and Modeling of Relationships between Solar Radiation and Meteorological Variables. Energy Conversion and Management, 105, 880-890.
 Kashyap, Y., Bansal, A. and Sao, A.K. (2015) Solar Radiation Forecasting with Multiple Parameters Neural Networks. Renewable and Sustainable Energy Reviews, 49, 825-835.
 Khosravi, A., Nunes, R.O., Assad, M.E.H. and Machado, L. (2018) Comparison of Artificial Intelligence Methods in Estimation of Daily Global Solar Radiation. Journal of Cleaner Production, 194, 342-358.
 Qin, W.M., Wang, L.C., Lin, A.W., Zhang, M., Xia, X.A., Hu, B. and Niu, Z.G. (2018) Comparison of Deterministic and Data-Driven Models for Solar Radiation Estimation in China. Renewable and Sustainable Energy Reviews, 81, 579-594.
 Agüera-Pérez, A., Palomares-Salas, J.C., González de la Rosa, J.J. and Florencias-Oliveros, O. (2018) Weather Forecasts for Microgrid Energy Management: Review, Discussion and Recommendations. Applied Energy, 228, 265-278.
 Vakili, M., Sabbagh-Yazdi, S.R., Khosrojerdi, S. and Kalhor, K. (2017) Evaluating the Effect of Particulate Matter Pollution on Estimation of Daily Global Solar Radiation Using Artificial Neural Network Modeling Based on Meteorological Data. Journal of Cleaner Production, 141, 1275-1285.
 Yadav, A.K. and Chandel, S.S. (2014) Solar Radiation Prediction Using Artificial Neural Network Techniques: A Review. Renewable and Sustainable Energy Reviews, 33, 772-781.
 Iacono, M.J., Mlawer, E.J., Clough, S.A. and Morcrette, J.-J. (2000) Impact of an Improved Longwave Radiation Model, RRTM, on the Energy Budget and Thermodynamic Properties of the NCAR Community Climate Model, CCM3. Journal of Geophysical Research: Atmospheres, 105, 14873-14890.
 Mlawer, E.J., Taubman, J., Brown, P.D., Iacono, M.J. and Clough, S.A. (1997) Radiative Transfer for Inhomogeneous Atmospheres: RRTM, a Validated Correlated-k Model for the Longwave. Journal of Geophysical Research, 102, 16663-16682.
 Ettehad, L.B. (2008) Surface Layer Parameterization in WRF.
 Hong, S.-Y., Noh, Y. and Dudhia, J. (2006) A New Vertical Diffusion Package with an Explicit Treatment of Entrainment Processes. Monthly Weather Review, 134, 2318-2341.
 Mentayani, T.F. and Krauss, C. (2018) Deep Learning with Long Short-Term Memory Networks for Financial Market Predictions. European Journal of Operational Research, 270, 654-669.
 Kumar, J., Goomer, R. and Singh, A.K. (2018) Long Short Term Memory Recurrent Neural Network (LSTM-RNN) Based Workload Forecasting Model for Cloud Datacenters. Procedia Computer Science, 125, 676-682.
 Mathieu, D., Diagne, M., Boland, J., Schmutz, N. and Lauret, P. (2014) Post-Processing of Solar Irradiance Forecasts from WRF Model at Reunion Island. Solar Energy, 105, 99-108.
 Susetyarto, Kim, S. and Kim, H. (2016) A New Metric of Absolute Percentage Error for Intermittent Demand Forecasts. International Journal of Forecasting, 32, 669-679.
 Olatomiwa, L., Mekhilef, S., Shamshirband, S. and Petkovic, D. (2015) Adaptive Neuro-Fuzzy Approach for Solar Radiation Prediction in Nigeria. Renewable and Sustainable Energy Reviews, 51, 1784-1792.
 Khosravi, A., Koury, R.N.N., Machado, L. and Pabon, J.J.G. (2018) Prediction of Hourly Solar Radiation in Abu Musa Island Using Machine Learning Algorithms. Journal of Cleaner Production, 176, 63-75.