The 11-year solar cycle drives events such as sunspots, coronal mass ejections, and the solar wind. The mechanism by which the Sun-Earth magnetosphere connection might relate to earthquakes remains a mystery. Several studies have proposed that solar activity (SA) might be linked to earthquakes, and statistical methods are usually used to test this hypothesis. One study suggests a correlation between SA and large earthquakes worldwide, and another investigates the correlation between long-range clustering of global seismicity and SA. The sunspot number has also been considered as an SA variable for predicting earthquakes. Meanwhile, several mechanisms have been proposed to explain the correlation between SA and earthquakes: for example, induced currents may increase fault stress through piezoelectricity, and eddy electric currents in faults may reduce the shear strength.
Previous studies mainly focused on establishing a significant correlation between SA and earthquakes using non-parametric statistical methods. However, parametric statistical models and machine learning models are also necessary for earthquake forecasting, although such models are still far from operational use. In our previous work, we attempted to predict Global Earthquake Numbers (GEN) using variables associated with SA as inputs. Those results show that the GEN of earthquakes with magnitude 4 - 4.9 is the most predictable.
With the development of sensing technologies, including GPS and InSAR, a massive amount of data on SA has been accumulated. Furthermore, the solar-earth coupling can be characterized as a non-linear dynamical system. For these two reasons, we decided to construct deep learning (DL) models to predict GEN with SA as the input for earthquakes of magnitude 4 - 4.9. In particular, we considered daily time series of GEN and SA in sequential format. The recurrent neural network (RNN) and the long short-term memory (LSTM) network are two benchmark DL models for sequential data. However, feedback in the recurrent architecture can lead to higher computational complexity. Recent studies indicate that certain convolutional neural network (CNN) architectures can reach state-of-the-art accuracy on sequential data, and a CNN can ensure the causality of sequential data of any length without feedback.
Considering the proven effectiveness of CNNs for sequential data, we took all the observations in time-series format and implemented the temporal convolutional network (TCN). We constructed TCNs using GEN data and SA data as inputs to predict GEN for earthquakes of magnitude 4 - 4.9.
Daily data of GEN were downloaded from ComCat (https://earthquake.usgs.gov/earthquakes/search/). The data range from 01/01/1996 to 12/31/2019, covering the 23rd and 24th solar cycles, and are partly depicted in Table 1. EQi denotes earthquakes with magnitude i - i.9. Note that earthquakes with M ≥ 8 rarely occurred, so we combined them into one column, EQ89. The data contain the M = 7.2 earthquake (04/05/2010) that occurred in Estado de Baja California, Mexico, and the M = 9.0 Tohoku earthquake (03/11/2011) that occurred in north-east Japan. Because large earthquakes always cause aftershocks, the GEN itself was also used as an input of the TCN. (Figure 1)
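As an illustration of how daily GEN columns per magnitude class could be tabulated from a catalog, here is a minimal pandas sketch. The catalog rows, column names, and events below are hypothetical, not the actual ComCat download:

```python
import pandas as pd

# Hypothetical ComCat-style catalog rows: event time and magnitude.
catalog = pd.DataFrame({
    "time": pd.to_datetime(["2010-04-05", "2010-04-05", "2010-04-06", "2010-04-06"]),
    "mag":  [7.2, 4.3, 8.1, 9.0],
})

# Bin magnitudes into classes EQ4..EQ7, merging M >= 8 into EQ89.
bins = [4, 5, 6, 7, 8, 10]
labels = ["EQ4", "EQ5", "EQ6", "EQ7", "EQ89"]
catalog["cls"] = pd.cut(catalog["mag"], bins=bins, right=False, labels=labels)

# Daily GEN table: one row per day, one column per magnitude class.
gen = (catalog.groupby(["time", "cls"], observed=False).size()
              .unstack(fill_value=0))
print(gen)
```

With `right=False`, each bin is half-open, so an M = 7.2 event falls in EQ7 and events with M ≥ 8 fall in EQ89.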
The daily data of SA were downloaded from OMNIWeb (https://omniweb.gsfc.nasa.gov/). The SA variables used in this research are listed in Table 2. Part of the SA data is illustrated in Figure 2. Missing values in the original SA data were filled by linear interpolation. (Table 3)
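The gap filling can be reproduced with pandas; the series below is a hypothetical plasma-speed fragment, not actual OMNIWeb data:

```python
import pandas as pd
import numpy as np

# Hypothetical daily SA series with missing values (NaN).
v = pd.Series([420.0, np.nan, np.nan, 450.0, 460.0],
              index=pd.date_range("1996-01-01", periods=5))

# Fill the gaps by linear interpolation, as done for the OMNIWeb data.
v_filled = v.interpolate(method="linear")
print(v_filled.tolist())
```

The two missing days are filled with equally spaced values between the neighbouring observations (here 430 and 440).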
Table 1. Daily data of GEN.
Table 2. SA variables list.
Table 3. Daily data of SA variables.
Figure 1. Time series of earthquake numbers (1996/01/01-2019/12/31).
Figure 2. Time series of SA variables (1996/01/01-2019/12/31).
3. TCN Architecture
Following our previous works, the GEN of EQ4 is taken as the output of the TCN, denoted y_{k+d}. Two types of inputs for the TCN are then included: the first, denoted X_k, takes the GEN data up to time k; the second, denoted S_k, includes the SA variables up to time k. Here, k means that the latest observations in X_k and S_k are obtained at time k. With respect to our previous results, the maximum time lag of each variable was set to 14 days in both X_k and S_k. That is,

X_k = (y_k, y_{k-1}, ..., y_{k-13}), S_k = (s_k, s_{k-1}, ..., s_{k-13}).
In this way, we construct a non-linear model

y_{k+d} = f(X_k, S_k) + ε_{k+d}

to couple the relation between GEN and SA, with ε_{k+d} being an independently identically distributed Gaussian noise term. Here, d is the number of days later than day k and indicates the prediction step. In this research, d = 1, 2, 3 is considered.
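The 14-day lagged input windows and the d-day-ahead targets described above can be assembled with a NumPy sliding window. The series below are hypothetical stand-ins for one GEN series and one SA variable:

```python
import numpy as np

lag = 14  # maximum time lag used for every variable
d = 1     # prediction step (days ahead)

# Hypothetical daily series: y = GEN of EQ4, s = one SA variable.
y = np.arange(100, dtype=float)
s = 0.1 * np.arange(100, dtype=float)

# Row k of X holds (y_{k-13}, ..., y_k); likewise S for the SA variable.
X = np.lib.stride_tricks.sliding_window_view(y, lag)[:-d]
S = np.lib.stride_tricks.sliding_window_view(s, lag)[:-d]

# The target paired with window ending at time k is y_{k+d}.
target = y[lag - 1 + d:]
print(X.shape, S.shape, target.shape)
```

Dropping the last d windows keeps every input window paired with an observed future value, so no target lies beyond the end of the series.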
This research uses a TCN as the non-linear prediction model; its architecture is shown in Figure 3. The TCN is composed mainly of convolutional blocks with 16, 32, 32, and 64 channels. In each block, a dilated convolution is performed on a sequence input x:

F(t) = Σ_{i=0}^{q-1} w(i) · x_{t - r·i},

with w being a filter of size q and r the dilation factor.
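A minimal NumPy sketch of this causal dilated convolution (the input and filter values are illustrative, and out-of-range past taps are treated as zero):

```python
import numpy as np

def causal_dilated_conv(x, w, r):
    """Causal dilated convolution: F(t) = sum_i w[i] * x[t - r*i].
    Only past samples are used, so causality of the sequence is preserved."""
    q = len(w)
    out = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        for i in range(q):
            j = t - r * i
            if j >= 0:  # taps before the start of the series contribute zero
                out[t] += w[i] * x[j]
    return out

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([1.0, 1.0])  # kernel of size 2, as in each TCN block
print(causal_dilated_conv(x, w, r=2))  # [1. 2. 4. 6. 8.]
```

Because each output depends only on inputs at or before time t, stacking such layers with growing dilation enlarges the receptive field without any feedback connections.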
Because the maximum time lag is relatively short, convolutional kernels of size 1 × 2 are implemented in each block. To obtain robust estimates, the Huber loss function is used as follows:

Σ_i ℓ_δ(y_i − ŷ_i), with ℓ_δ(a) = a²/2 for |a| ≤ δ and ℓ_δ(a) = δ(|a| − δ/2) otherwise,

where y_i is the observed GEN at time i and ŷ_i is the corresponding output of the TCN. The Adam optimizer was used to train the TCN. The SA and GEN data in the 23rd solar cycle (01/01/1996-12/31/2007) were used as the training data, and those in the 24th solar cycle (01/01/2008-12/31/2019) were used as the test data.

Figure 3. Architecture of TCN.
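For reference, the Huber loss can be evaluated directly; δ = 1 below is an illustrative choice, not necessarily the value used in training:

```python
import numpy as np

def huber(residual, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones,
    which keeps days with extreme aftershock counts from dominating the fit."""
    a = np.abs(residual)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))

r = np.array([0.5, 2.0])
print(huber(r))  # [0.125 1.5]
```

The linear tail for |a| > δ is what makes the estimate robust compared with a plain squared-error loss.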
4. Prediction Results
The whole dataset was divided as described above: the 23rd solar cycle (01/01/1996-12/31/2007) for training and the 24th solar cycle (01/01/2008-12/31/2019) for verifying the trained TCN. Pearson’s correlation coefficient R was used to evaluate the fitting and prediction performance of the TCN.
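The evaluation criterion is a plain Pearson correlation between observations and model outputs; the counts below are hypothetical, for illustration only:

```python
import numpy as np

# Hypothetical observed EQ4 counts and TCN outputs on the test period.
obs = np.array([120.0, 135.0, 128.0, 150.0, 142.0])
pred = np.array([118.0, 130.0, 131.0, 147.0, 140.0])

# Pearson's correlation coefficient R between observations and outputs.
R = np.corrcoef(obs, pred)[0, 1]
print(round(R, 3))
```

`np.corrcoef` returns the 2 × 2 correlation matrix, so the off-diagonal entry is R.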
4.1. TCN without/with SA Variables
First, we constructed a TCN without SA variables. Figure 4 illustrates the training and test losses versus epoch number; the curves indicate that 100 epochs are enough to ensure convergence of the TCN training.
We also constructed a TCN with all of the SA variables. Figure 5 illustrates the training and test losses versus epoch number; again, 100 epochs are enough to ensure convergence of the TCN training.
Table 4 lists the fitting and prediction performance of TCNs without SA variables for 1- to 3-day-ahead predictions, and Table 5 lists that of TCNs with SA variables. Let Rf and Rp be the correlations between the real observations of EQ4 and the output of the TCN obtained from the training data and test data, respectively. As a reasonable result, Rf is larger than Rp for all days ahead in Table 4 and Table 5; the “decrease” in Table 4 and Table 5 denotes the difference between Rf and Rp.

Figure 4. Dynamical curves of training and test losses versus epoch number for TCN without SA variables.

Figure 5. Dynamical curves of training and test losses versus epoch number for TCN with SA variables.

Table 4. Fitting and prediction performance of TCNs without SA variables.

Table 5. Fitting and prediction performance of TCNs with SA variables.
The two tables indicate that the TCNs achieve better fitting and prediction performance than the support vector regression in our previous work. Comparing Table 4 and Table 5 shows that the SA variables improve both the fitting and the prediction performance of the TCNs. The gap between Rf and Rp is trivial in the 1-day-ahead prediction, which suggests a good balance between the fitting and prediction performance of TCNs with/without SA variables. However, Rp decreases significantly for the 2- and 3-day-ahead predictions.
4.2. Impact of SA Variables on Prediction of Earthquakes
To evaluate how the SA variables improve the prediction of earthquakes, we adopted the following forward stepwise procedure:
Table 6. Variables selected by forward stepwise procedure.
1) Let A denote the set of inputs of the TCN, which is initially the empty set, and let C be the set of 15 input candidates of the TCN.
2) For each variable in C, add it to A, construct a TCN, and compute the corresponding evaluation criterion Ra. Move to A the variable from C that gives the biggest improvement in Ra.
3) Repeat (2) until C becomes empty and a total of 15 TCNs are obtained.
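The procedure above can be sketched as follows. The score function is a stand-in for training a TCN and computing Ra on held-out data, and the variable names and gains are hypothetical:

```python
def forward_stepwise(candidates, evaluate):
    """Greedy forward selection: at each step, move to the selected set the
    candidate whose addition maximizes the evaluation criterion (Ra)."""
    selected, history = [], []
    remaining = list(candidates)
    while remaining:
        best = max(remaining, key=lambda v: evaluate(selected + [v]))
        selected.append(best)
        remaining.remove(best)
        history.append((best, evaluate(selected)))
    return history

# Toy criterion: each variable contributes a fixed, additive gain
# (a stand-in for the actual Ra obtained from a trained TCN).
gains = {"EQ4": 0.5, "EQ3": 0.1, "V": 0.02, "IMF": 0.02}
hist = forward_stepwise(gains, lambda s: sum(gains[v] for v in s))
print([v for v, _ in hist])  # ['EQ4', 'EQ3', 'V', 'IMF']
```

Each pass trains one model per remaining candidate, so selecting all 15 candidates costs 15 + 14 + ... + 1 = 120 TCN trainings in the worst case, though only the 15 winning models are reported.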
Table 6 shows the variables selected sequentially according to Ra for the 1-day-ahead prediction. We can see that the plasma speed V improves Ra by almost 0.02 when added to EQ3 and EQ4 at step 3. The IMF magnitude improves Ra by almost 0.02 at the last step, jointly with the other variables. These results suggest that all the SA variables should be used as inputs of the TCNs.
In this research, we investigate the relation between SA and GEN. We construct the deep learning model TCN to predict EQ4 for 1- to 3-day-ahead predictions. The numerical results show that:
1) Compared with the support vector regression in our previous works, the TCN significantly enhances the fitting and prediction performance. This result confirms that there exists a strong non-linear relation between GEN and SA.
2) Because the fitting performance Rf is similar to the prediction performance Rp, we suppose that the TCN has potential capacity for the 1-day-ahead prediction of EQ4.
3) EQ4 in the past is the crucial input of the TCN. Thus, the TCN is essentially a non-linear autoregressive model. However, the SA variables can still improve the fitting and prediction performance of the TCN.
From the aforementioned results, we suppose that SA has the potential to affect GEN.
The TCNs in this research are still far from being operationally predictive. Table 6 shows that the TCN improves continuously until all the SA variables are included. This result suggests that the prediction performance can be further improved by considering variables beyond the candidates selected in this research. Over the decades, a wealth of novel geophysical and space data has become available thanks to improvements in sensing and measurement technologies. Although earthquakes remain unpredictable for now, we will continue to investigate the relations among earthquakes, the Earth’s environment, and SA on the basis of various statistical methods and machine/deep learning models.
 Sukma, I. and Abidin, Z.Z. (2017) Study of Seismic Activity during the Ascending and Descending Phases of Solar Activity. Indian Journal of Physics, 91, 595-606.
 Odintsov, S., Boyarchuk, K., Georgieva, K., Kirov, B. and Atanasov, D. (2006) Long Period Trends in Global Seismic and Geomagnetic Activity and Their Relation to Solar Activity. Physics and Chemistry of the Earth, Parts A/B/C, 31, 88-93.
 Huzaimy, J.M. and Yumoto, K. (2011) Possible Correlation between Solar Activity and Global Seismicity. Proceeding of the 2011 IEEE International Conference on Space Science and Communication, Penang, 12-13 July 2011, 138-141.
 Marchitelli, V., Harabaglia, P., Troise, C. and De Natale, G. (2020) On the Correlation between Solar Activity and Large Earthquakes Worldwide. Scientific Reports, 10, Article No. 11495.
 Marchitelli, V., Troise, C., Harabaglia, P., Valenzano, B. and De Natale, G. (2020) On the Long Range Clustering of Global Seismicity and Its Correlation with Solar Activity: A New Perspective for Earthquake Forecasting. Frontiers in Earth Science, 8, 470.
 Khain, V.E. and Khalilov, E.N. (2007) About Possible Influence of Solar Activity upon Seismic and Volcanic Activities: Long-Term Forecast. Transactions of the International Academy of Science H&E, 3, 217-240.
 Han, Y., Guo, Z., Wu, J. and Ma, L. (2004) Possible Triggering of Solar Activity to Big Earthquakes (Ms ≥ 8) in Faults with near West-East Strike in China. Science in China Series G: Physics and Astronomy, 47, 173-181.
 Nishii, R., Qin, P. and Kikuyama, R. (2020) Solar Activity is One of Triggers of Earthquakes with Magnitudes Less than 6. IGARSS 2020 IEEE International Geoscience and Remote Sensing Symposium, Hawaii, 26 September-2 October 2020, 377-380.
 Qiao, X., Wang, Q., Yang, S., Li, J., Zou, R. and Ding, K. (2015) The 2008 Nura Mw6.7 Earthquake: A Shallow Rupture on the Main Pamir Thrust Revealed by GPS and InSAR. Geodesy and Geodynamics, 6, 91-100.
 Dauphin, Y.N., Fan, A., Auli, M. and Grangier, D. (2017). Language Modeling with Gated Convolutional Networks. Proceedings of the 34th International Conference on Machine Learning, Sydney, 6-11 August 2017, 933-941.
 Oord, A.V.D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A. and Kavukcuoglu, K. (2016) WaveNet: A Generative Model for Raw Audio. arXiv:1609.03499.