Mobile payment can be divided by payment method into remote payment and near-field payment. Mobile remote payment exchanges information with a back-end server over the Internet, a telecommunications network, or similar channels, without using a physical acceptance terminal. Mobile near-field payment is completed through a physical acceptance terminal that accesses the acquiring network online or offline (Zhan & Qiao, 2016). Third-party institutions such as Alipay Wallet and WeChat Pay have used the advantages of Internet technology, e-commerce platforms, and social networks to enlarge their customer base and business scale, and the business model of the remote payment industry is now relatively mature. Through the continuous integration of online and offline services, innovative mobile near-field payment methods (mobile phone flash payment, barcode payment) are increasingly accepted by users, which promotes customer loyalty and expands usage scenarios. The offline retail market based on mobile near-field payment has become the focus of competition among major payment companies and financial institutions.
Merchants are important participants in the mobile payment industry, so their attitude towards adopting mobile payment is particularly important. Exploring the factors that influence merchants’ adoption of mobile payment, formulating efficient strategies to identify and predict merchants’ willingness and behavior to adopt it, and expanding the acquiring market have become the main means by which financial institutions compete with payment companies. However, the existing literature contains few studies on merchants’ adoption of mobile payment systems. This article treats the mobile payment system as an innovative product and uses objective behavior data from the acquiring merchants of a commercial bank. Analysis models based on three ensemble learning algorithms (Random Forest, XGBoost, and AdaBoost) are built to study the influencing factors and decision-making process of merchants’ adoption of mobile payment, and the results are compared with those of a Lasso-logistic regression model.
The paper is arranged as follows: Section 2 reviews the relevant literature; Section 3 introduces the research method and process; Section 4 analyzes and discusses the research results; Section 5 summarizes the paper and outlines future work.
2. Literature Review
2.1. Research on Mobile Payment Adoption Behavior
Mobile payment has been studied extensively. Most studies are empirical and use TAM, UTAUT, TTF, and similar models as the theoretical basis for discussing the willingness of consumers or merchants (treated as individuals) to adopt mobile payment (Li, Sun, & Yan, 2013). Kim et al. (Kim, Mirsobit, & In, 2010), building on TAM and combining individual user characteristics (individual innovativeness, mobile payment knowledge) with mobile payment product characteristics (mobility, accessibility, compatibility, and convenience), studied the factors that influence users’ willingness to use mobile payment. Liébana-Cabanillas et al. (Liébana-Cabanillas, Sánchez-Fernández, & Muñoz-Leiva, 2014) explored the effects of external environment variables, perceived ease of use, perceived usefulness, attitude, risk, and trust on the willingness to adopt mobile payment, with user age as a moderating variable. Schierz et al. (Schierz, Oliver, & Wirtz, 2010) proposed a probabilistic model of the factors affecting mobile payment adoption and verified it empirically, finding that product compatibility, individual mobility, and subjective norms have a significant impact on mobile payment adoption. Liu (Liu, 2011) proposed in his doctoral dissertation that, compared with other influencing factors, the security of mobile payment technology has the strongest impact on user willingness. Zhou (Zhou, 2011) argued that establishing initial trust in mobile payment is important for improving consumer acceptance and use. His empirical research found that perceived security, perceived universality, and perceived ease of use have a significant impact on consumers’ initial trust in mobile payment.
Dan’s research indicated that perceived usefulness and perceived ease of use have a significant impact on consumers’ acceptance intentions (Dan & Jing, 2011). Network externalities are also a significant factor, whereas consumers show relatively low perceived risk for mobile payment. Oliveira et al. combined UTAUT and innovation diffusion theory to explore the factors influencing users’ adoption of mobile payment and their intention to recommend it to others (Oliveira, Thomas, Baptista et al., 2016). Liu et al. studied the willingness to adopt mobile payment from the dual perspectives of hygiene and motivation factors, based on exchange theory and two-factor theory (Liu, Xia, Li, & Liang, 2017). Guo’s research shows that enterprises need to understand consumers’ payment habits and methods in depth, and can innovate by improving operational convenience and enriching mobile functions (Guo & Li, 2018). User adoption theory also applies to research on merchants’ willingness to adopt: Cao studied enterprises’ willingness to adopt mobile payment using TTF theory and case analysis (Cao, 2008).
2.2. Application of Machine Learning in Adoption Behavior Research
In recent years, some scholars have applied weak classification algorithms from machine learning, such as logistic regression and decision trees, to research on product diffusion and adoption behavior. Laurell et al. collected data about the virtual reality (VR) products Oculus Rift and HTC Vive from blogs, Facebook, Twitter, and other social media, analyzed it with machine learning, and identified the main barriers to product diffusion (Laurell, Sandström, Berthold, & Larsson, 2019). Li et al. studied the adoption behavior of mobile commerce consumers (Li, Xu, & Feng, 2018): taking adoption intention and attitude as dependent variables and other factors as independent variables, they used multiple regression to compare the two behavior theories TPB and TAM. Yang et al. used logistic regression and the C5.0 decision tree to conduct comparative research on enterprises’ SaaS cloud service adoption decisions (Yang, Hu, & Zhou, 2016). Sorournejad et al. (Sorournejad et al., 2010) used the CART decision tree to build a classification model of mobile banking adoption decisions and evaluated consumers’ decisions from four perspectives: availability, ease of use, speed, and security. Regression and decision tree algorithms are highly interpretable and explain the relationships between feature variables well. However, weak classification models struggle to capture the uncertainty and complexity of adoption decisions, and their prediction accuracy for adoption behavior is limited.
Sun et al. applied the support vector machine (SVM) to predicting the information adoption level of WeChat public accounts and Weibo and achieved high prediction accuracy (Sun & Wang, 2019). Li et al. applied SVM and the BP neural network separately to model organizations’ adoption decisions, extracting feature variables with meta-analysis (MA) and rough sets, which improved prediction accuracy (Li & Chen, 2014; Li, Peng, & Zhen, 2014). SVM and BP neural networks are widely used machine learning algorithms that can achieve high prediction accuracy, but both have an obvious disadvantage: poor interpretability. In the study of mobile payment adoption, Chong extended the TAM theoretical model, using the output of a structural equation model as the input of a neural network and exploiting the path analysis of the structural equation model for feature selection (Chong, 2013a, 2013b). Chen et al. adopted Chong’s model and compared the prediction accuracy of neural networks and multiple regression models during model testing (Chen, Jing, & He, 2015). The samples in these studies are subjective data collected through questionnaires, which is a limitation. Jiao et al. used machine learning models such as Lasso regression, Ridge regression, random forest, and decision tree to predict the demand for shared bicycles and concluded that the random forest model predicts more accurately than regression analysis (Jiao, Jing, Liu, & Zhang, 2018). Compared with weak classification models, the random forest model, an ensemble of decision trees, substantially improves prediction accuracy.
However, because there is no fixed channel for collecting objective data on merchants’ adoption of mobile payment, research data are difficult to obtain, and few studies have applied machine learning to merchants’ adoption of mobile payment.
Therefore, to keep the prediction model interpretable enough to guide practice while also improving its prediction accuracy, this study chose three ensemble learning algorithms, Random Forest (Breiman, 2001), XGBoost (Chen & He, 2015), and AdaBoost (Freund & Schapire, 1996; Freund & Schapire, 1997), and compared them with the Lasso-logistic regression model (Tibshirani, 1996). The advantage of the Lasso-logistic regression model is that it yields the significance and weight of each feature variable, which supports targeted management recommendations, but its prediction accuracy is lower than that of ensemble learning.
3. Research Methods
The paper uses the Lasso-logistic regression model and three ensemble learning algorithms in a supervised learning setting. First, data collection, variable selection, and data preprocessing are performed. To address collinearity among variables, this study adopted stepwise regression for attribute selection. The selected variables form the input of the models in this paper. After training on the data samples, the Lasso-logistic regression model, AdaBoost, random forest, and XGBoost can each be used for prediction.
3.1. Variable Selection and Data Collection
3.1.1. Variable Selection
1) Dependent variable
This article used the acquiring merchants of a commercial bank as the experimental sample. A merchant that made no mobile payment transactions within one month is defined as a non-adopting merchant, and a merchant that made mobile payment transactions within one month is defined as an adopting merchant.
2) Independent variables
a) Variables of social-economic attributes of merchants
Drawing on the study of Liébana-Cabanillas, this study added the social-economic attributes of merchants to the analysis of mobile payment adoption. Using a web crawler to collect data from an online enterprise database, the following four static attributes were added: registered capital, business category, operation duration, and staff size.
b) Merchant dynamic transaction variables
Bellotti and Crook showed that consumers’ dynamic consumption data can affect their credit card default behavior (Bellotti & Crook, 2013); Lin et al. added variables such as the number of purchases, the amount of consumption, and the length of time without transactions to a customer churn model (Lin, Tzeng, & Chin, 2011). This study added merchants’ dynamic transaction records to the independent variables. These records were collected from the acquiring system of a commercial bank for July 2019, and variables such as the average daily number of transactions, the average consumption amount, the maximum consumption amount, and the minimum consumption amount were extracted.
c) Merchant cluster effect variable
According to innovation diffusion theory, the clustering effect promotes the adoption of new technologies. Previous studies often analyzed the impact on adoption behavior from technical perspectives such as technical specifications and security control but lacked quantitative research. This article first obtained the latitude and longitude of each merchant through the Baidu map application interface and then, taking each merchant’s coordinates as the origin, calculated the number of surrounding merchants and the proportion that had adopted mobile payment within 100 meters, 200 meters, 500 meters, and 1 km. The number of surrounding merchants and the adoption proportion are added to the model to analyze the influence of the clustering effect on adoption behavior.
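The cluster effect variables can be illustrated with a minimal sketch. The paper does not state how distances between merchants were computed, so the haversine great-circle formula used here is an assumption, as are the function names `haversine_m` and `cluster_features`, the feature names, and the coordinates.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cluster_features(target, others, radii=(100, 200, 500, 1000)):
    """Count neighbours and compute the adoption ratio within each radius.

    `target` is (lat, lon); `others` is a list of (lat, lon, adopted) tuples
    that excludes the target merchant itself.
    """
    features = {}
    for r in radii:
        inside = [adopted for lat, lon, adopted in others
                  if haversine_m(target[0], target[1], lat, lon) <= r]
        total = len(inside)
        features[f"n_{r}m"] = total
        # Assumption: the ratio is set to 0.0 when there are no neighbours.
        features[f"ratio_{r}m"] = sum(inside) / total if total else 0.0
    return features


# Hypothetical coordinates: neighbours roughly 55 m, 165 m and 780 m to the north.
merchant = (31.2304, 121.4737)
neighbours = [
    (31.2309, 121.4737, 1),
    (31.2319, 121.4737, 0),
    (31.2374, 121.4737, 1),
]
feats = cluster_features(merchant, neighbours)
```

Each radius contributes two features (a count and a ratio), giving eight cluster effect variables per merchant.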
3.1.2. Data Preprocessing
The model in this paper has 67 explanatory variables in three major categories: merchants’ social-economic variables, dynamic transaction variables, and cluster effect variables. The social-economic variables comprise registered capital, business category, operation duration, and staff size. Business category and staff size are categorical variables with 46 and 7 categories, respectively; this study converts them into binary indicator variables. The dynamic transaction variables comprise the average daily number of transactions, the average daily consumption amount, and the maximum and minimum consumption amounts. The clustering effect is described by the adoption of merchants within 100 meters, 200 meters, 500 meters, and 1000 meters of the merchant; each range contributes the total number of merchants and the adoption ratio. The variables may differ between models in this study. For example, a nonlinear SVM model would add products and higher-order terms of the original variables. Table 1 lists the main variables and their descriptions.
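The conversion of categorical variables into binary indicators can be sketched with pandas. The column names and category labels below are hypothetical, and `pandas.get_dummies` is one possible tool rather than the one the authors necessarily used. One reading that reproduces the reported total is 46 + 7 indicator columns, 2 remaining social-economic variables, 4 transaction variables, and 8 cluster variables.

```python
import pandas as pd

# Hypothetical merchants; column names and category labels are illustrative only.
df = pd.DataFrame({
    "business_category": ["retail", "accommodation", "retail", "hospital"],
    "staff_size": ["1-10", "11-50", "1-10", "51-100"],
    "registered_capital": [50.0, 300.0, 80.0, 1200.0],
})

# Each categorical column becomes one 0/1 indicator column per category.
encoded = pd.get_dummies(df, columns=["business_category", "staff_size"], dtype=int)

# Variable count under the reading described above.
n_variables = 46 + 7 + 2 + 4 + 8
```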
The original sample set contains missing values, while ensemble learning places relatively high demands on the training samples. For continuous variables, missing values were filled with the mean of the merchants in the same address category; for discrete variables, they were filled with the mode of the variable in the same area. Because the features of the training samples have different value ranges, the continuous variables were standardized to make the model parameters comparable.
3.2. Experimental Design and Model Introduction
3.2.1. Experimental Procedure
The experiment in this study consists of 6 steps:
Step 1: Data acquisition. Merchants’ static attributes, dynamic transaction data, and clustering effect data are collected in three different ways.
Step 2: Data preprocessing. The collected data contain missing values and outliers, so only preprocessed data can serve as the data samples for the models in this paper.
Step 3: Attribute selection. Stepwise regression is used to filter the input variables.
Step 4: Lasso-logistic regression. The filtered attributes are used as input variables to train the Lasso-logistic regression model, which outputs the weight and significance of each feature variable. The independent variables are then filtered further.
Step 5: Ensemble learning. The selected independent variables are input into the ensemble learning algorithms to obtain three different prediction models.
Step 6: Evaluation. Using the classification matrix, the models are evaluated by hit rate, coverage rate, accuracy rate, and ROC curve.
Table 1. Main variables and descriptions.
3.2.2. Model Introduction
The four models in the experiments are implemented in Python with the scikit-learn machine learning library. scikit-learn is built on NumPy, SciPy, and matplotlib and provides efficient tools for data mining and analysis. A randomized logistic regression model is used to select variables, and the selected feature variables are fed to the four prediction models. To prevent variables with large values from overwhelming variables with small values, and to reduce computational complexity, the filtered data are normalized with the linear range method.
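The normalization step can be sketched as follows, assuming that the linear range method refers to min-max scaling, which maps each variable linearly onto [0, 1]. The helper name `min_max_scale` and the sample values are hypothetical.

```python
import numpy as np


def min_max_scale(X):
    """Rescale each column of X linearly to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    # Constant columns would divide by zero; they are mapped to 0 instead.
    col_range[col_range == 0] = 1.0
    return (X - col_min) / col_range


# Two features on very different scales end up on the same [0, 1] range.
X = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
X_scaled = min_max_scale(X)
```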
1) Lasso-logistic regression model
The logistic regression model is a classification method widely used and studied in industry and academia. It is simple to compute and its variables are easy to interpret. However, when the sample is large and there are many covariates, multicollinearity makes the statistical significance of some variables unreliable, which reduces the interpretability and prediction accuracy of the model. Because the Lasso algorithm has good variable selection properties in models with many variables, this article introduces it into the logistic regression model. In linear regression, the Lasso performs model selection by adding a penalty function to the sum of squared residuals; in logistic regression, the same penalty is added to the negative log-likelihood.
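Define the data as $(x_i, y_i)$, $i = 1, \dots, n$, where $x_i = (x_{i1}, \dots, x_{ip})^{T}$ and $y_i \in \{0, 1\}$ are the explanatory variables and the explained variable of the model, respectively. The conditional probability of the logistic regression model, in its standard form, is shown in Equation (1):
$$P(y_i = 1 \mid x_i) = \frac{\exp(\beta_0 + x_i^{T}\beta)}{1 + \exp(\beta_0 + x_i^{T}\beta)} \qquad (1)$$
The coefficient estimates in the Lasso-logistic regression model can be written as shown in Equation (2):
$$(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0, \beta} \left\{ -\sum_{i=1}^{n} \left[ y_i (\beta_0 + x_i^{T}\beta) - \log\left(1 + \exp(\beta_0 + x_i^{T}\beta)\right) \right] + \lambda \sum_{j=1}^{p} |\beta_j| \right\} \qquad (2)$$
In Equation (2), λ is the tuning parameter. The key to variable selection in the Lasso-logistic regression model lies in the choice of λ. The smaller the value of λ, the more parameters the model retains; the larger the value, the fewer it retains.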
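The effect of λ on the number of retained variables can be sketched with scikit-learn. This is an illustration on synthetic data, not the authors' configuration: `LogisticRegression` with an L1 penalty is one way to fit a Lasso-logistic model, and its parameter `C` is the inverse of the penalty strength, so a small `C` corresponds to a large λ. The helper `n_retained` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the merchant data: 20 features, 5 of them informative.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           n_redundant=0, random_state=0)
X = StandardScaler().fit_transform(X)


def n_retained(C):
    """Fit an L1-penalized logistic model and count the non-zero coefficients."""
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    model.fit(X, y)
    return int(np.count_nonzero(model.coef_))


retained_strong = n_retained(C=0.01)  # strong penalty (large lambda)
retained_weak = n_retained(C=10.0)    # weak penalty (small lambda)
```

Comparing `retained_strong` with `retained_weak` shows how a stronger penalty drives more coefficients to exactly zero, which is what makes the Lasso usable for variable selection.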
2) Ensemble Learning Model
Ensemble learning refers to methods that combine multiple learners; it can be divided into several types, such as multi-classifier systems, mixture of experts, and committee-based learning. Current research still focuses on homogeneous ensemble learning. The main idea of ensemble learning is to first generate multiple learners according to certain rules, then combine them with an integration strategy, and finally output a comprehensive judgment as the result. In homogeneous ensemble learning, the multiple learners are “weak learners” of the same type. Starting from this weak learner, multiple learners are generated by perturbing the sample set, the input features, the output representation, the algorithm parameters, and so on. After combination, a “strong learner” with better accuracy is obtained.
Breiman proposed the Random Forest (RF) algorithm in 2001. Unpruned classification trees built with the CART decision tree algorithm serve as the base classifiers. The idea of the random forest algorithm is shown in Figure 1. First, the Bootstrap method is used to draw training sets from the original sample set. Then a decision tree is trained on each training set; as each tree grows, a subset of variables is randomly selected from all feature variables, and the best attribute among them is chosen by the smallest Gini index. Finally, the predictions of all base classifiers are gathered and the final category is decided by vote.
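The three ingredients described above (bootstrap samples, random feature subsets, Gini-based splits) map onto parameters of scikit-learn's `RandomForestClassifier`. The following is a sketch on synthetic data; the parameter values are illustrative assumptions, not the settings used in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the merchant data.
X, y = make_classification(n_samples=600, n_features=20, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

forest = RandomForestClassifier(
    n_estimators=200,     # number of trees, each trained on its own sample
    criterion="gini",     # choose splits by the smallest Gini impurity
    max_features="sqrt",  # random subset of features considered at each split
    bootstrap=True,       # draw each tree's training set with replacement
    random_state=0,
)
forest.fit(X_train, y_train)
rf_accuracy = forest.score(X_test, y_test)
```

Note that scikit-learn combines the trees by averaging their predicted class probabilities rather than by a hard majority vote, so it differs slightly from the voting scheme described in the text.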
XGBoost stands for eXtreme Gradient Boosting and is an extension of the Gradient Boosting Machine. When generating each tree, the Gradient Boosting Machine adopts the idea of gradient descent: building on all trees generated so far, it moves in the direction that minimizes the given objective function. With reasonable parameter settings, a certain number of trees must be generated to reach the expected accuracy, so when the data set is large and complex the computation is heavy. XGBoost is an implementation of the Gradient Boosting Machine that supports parallel computation and improves accuracy.
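The gradient descent idea behind the Gradient Boosting Machine can be sketched by hand. This is plain first-order gradient boosting with a logistic loss, written for illustration on synthetic data; it is not XGBoost, which additionally uses second-order information, regularization, and optimized split finding. The learning rate, tree depth, and number of rounds are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=400, n_features=10, n_informative=4, random_state=0)


def log_loss(y, F):
    """Mean logistic loss for labels y in {0, 1} and raw scores F."""
    return float(np.mean(np.log1p(np.exp(-(2 * y - 1) * F))))


F = np.zeros(len(y))  # current ensemble score for every sample
learning_rate = 0.1
trees, losses = [], [log_loss(y, F)]

for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-F))  # current predicted probability
    residual = y - p              # negative gradient of the logistic loss
    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residual)
    F += learning_rate * tree.predict(X)  # small step along the fitted gradient
    trees.append(tree)
    losses.append(log_loss(y, F))
```

Each new tree is fitted to the negative gradient of the loss with respect to the current scores, so the recorded training loss falls as trees are added.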
Freund and Schapire proposed the AdaBoost algorithm based on Boosting. The algorithm is shown in Figure 2. Each sample in the training data is first assigned a weight, with all weights initialized to equal values, and a first base classifier is trained. After the weight of the first base classifier is determined from its error rate, the sample weights are readjusted: the weights of misclassified samples are increased so that these samples are more likely to be classified correctly in the next round. These steps are repeated until a sufficiently good classifier is obtained.
Figure 1. Aspect of Random forest classifier.
Figure 2. Aspect of Adaboost classifier.
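The reweighting loop of AdaBoost can be sketched by hand with decision stumps as base classifiers. The data are synthetic and the number of rounds is an arbitrary assumption; the update follows the standard discrete AdaBoost formulation with labels coded as -1 and +1.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=400, n_features=10, n_informative=4, random_state=1)
y = 2 * y01 - 1  # relabel the classes as -1 / +1

n = len(y)
w = np.full(n, 1.0 / n)  # equal initial sample weights
stumps, alphas = [], []

for _ in range(30):
    stump = DecisionTreeClassifier(max_depth=1, random_state=0)
    stump.fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = np.sum(w[pred != y])  # weighted error rate of this base classifier
    if err <= 0 or err >= 0.5:  # perfect, or no better than chance: stop
        break
    alpha = 0.5 * np.log((1 - err) / err)  # weight of this base classifier
    w *= np.exp(-alpha * y * pred)         # raise weights of misclassified samples
    w /= w.sum()                           # renormalize to a distribution
    stumps.append(stump)
    alphas.append(alpha)


def predict(X_new):
    """Sign of the alpha-weighted vote of all stumps."""
    votes = sum(a * s.predict(X_new) for a, s in zip(alphas, stumps))
    return np.sign(votes)


train_accuracy = float(np.mean(predict(X) == y))
```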
4. Results and Discussion
4.1. Analysis of Influencing Factors Based on Lasso-Logistic Regression
The sample set in this study covers three aspects: merchant static attribute variables, merchant dynamic transaction variables, and merchant clustering effect variables. The Lasso-logistic regression model was used to analyze the experimental data in order to identify the factors that influence merchants’ adoption of mobile payment and to derive managerial implications.
1) Social-economic variables of merchants
Merchants in the accommodation, hospital, water/electricity/coal supply, and retail categories are more willing to adopt mobile payment systems. Staff size is significantly negatively correlated with mobile payment adoption: merchants with more staff are less willing to adopt mobile payment. Registered capital and operation duration show no significant relationship with the adoption of mobile payment.
2) Dynamic merchant transaction variables
The average daily transaction volume of a merchant and its average amount per transaction significantly affect the merchant’s mobile payment adoption behavior. Transaction volume is significantly positively correlated with adoption, whereas the average amount per transaction is significantly negatively correlated with it. That is, merchants with large transaction volumes and low amounts per transaction are more willing to adopt mobile payment. Because mobile payment shortens payment time, the larger the transaction volume, the more obvious the time saving; and the lower the amount of each transaction, the less consumers and merchants worry about payment security, and the more willing they are to accept mobile payment.
3) Merchant cluster effect variable
The analysis of the surrounding merchants’ adoption behavior shows that the proportion of surrounding merchants that have adopted mobile payment significantly affects a merchant’s own adoption behavior. This indicates that merchants’ willingness to adopt is influenced by neighboring merchants.
Table 2. Analysis of main variables.
4.2. Predictive Analysis of Adoption Behavior
Three indicators, hit rate, coverage rate, and accuracy rate (Equations (4)-(6)), are used to evaluate the effectiveness of the models. TP is the number of merchants predicted to adopt that actually adopted; FP is the number predicted to adopt that actually did not adopt; FN is the number predicted not to adopt that actually adopted; and TN is the number predicted not to adopt that actually did not adopt.
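The three indicators can be computed from the four counts as sketched below. The equations themselves are not reproduced in this text, so the sketch assumes the usual definitions: hit rate as precision, TP / (TP + FP); coverage rate as recall, TP / (TP + FN); and accuracy rate as (TP + TN) / (TP + FP + FN + TN). The function names and labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 means 'adopted'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Hit rate, coverage rate and accuracy rate under the assumed definitions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "hit_rate": tp / (tp + fp) if tp + fp else 0.0,       # precision
        "coverage_rate": tp / (tp + fn) if tp + fn else 0.0,  # recall
        "accuracy_rate": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical labels for eight merchants.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
```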
The performance indicators of the different algorithms are compared in Table 3. To measure the accuracy of each model more precisely, the ROC curves of the four models are drawn, as shown in Figure 3.
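How an ROC curve and its area are obtained from predicted scores can be sketched in plain Python. The sketch assumes that all scores are distinct; the labels, scores, and function names are hypothetical.

```python
def roc_curve_points(y_true, scores):
    """Return ROC points (FPR, TPR), assuming all scores are distinct."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    # Lower the threshold one sample at a time, from highest score to lowest.
    for _, label in sorted(zip(scores, y_true), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and predicted adoption scores for six merchants.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_curve_points(y_true, scores)
area = auc(points)
```

The area equals the share of (adopting, non-adopting) merchant pairs in which the adopting merchant receives the higher score, here 8 of 9 pairs.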
Table 3 and Figure 3 show that the accuracy, precision, and coverage of the three ensemble learning models (random forest, XGBoost, and AdaBoost) are significantly higher than those of the Lasso-logistic model, which reflects the advantage of ensemble learning in prediction accuracy. The prediction accuracy of the AdaBoost model is lower than that of the random forest and XGBoost models, which suggests that an ensemble optimized by reweighting samples is less well suited to the data set of this study.
5. Conclusion and Further Work
This study builds analysis models based on three ensemble learning algorithms (AdaBoost, Random Forest, and XGBoost), using the social-economic attribute variables, dynamic transaction variables, and clustering effect variables of merchants as independent variables, to investigate the factors influencing merchants’ adoption of mobile payment services. The prediction models produced by the different algorithms were compared. The results show that the ensemble learning algorithms significantly improve prediction accuracy.
Table 3. Accuracy of different models.
Figure 3. ROC curve comparison of ensemble models.
This research helps commercial banks and other financial institutions understand the factors that influence merchants’ adoption of mobile payment systems and promote the diffusion of these systems in the market. The main managerial implications are as follows. Commercial banks and other financial institutions should develop dedicated promotion strategies for mobile payment systems based on the characteristics of industries with small or medium staff size, such as accommodation, hospitals, and water/electricity/coal supply. Because merchants with large transaction volumes and low amounts per transaction are more willing to adopt mobile payment, financial institutions should focus on merchants with these characteristics, such as supermarkets and retailers, and develop promotion strategies accordingly. Given the clustering effect on adoption behavior, financial institutions can market by region when promoting mobile payment systems.
Follow-up research can proceed in the following directions: 1) This study does not consider the impact of uneven regional development in China on merchants’ adoption of mobile payment services. Subsequent research can add regional macro variables from across China to the independent variables to explore differences in merchants’ adoption behavior and its influencing factors between regions. 2) The rapid progress in recent years of GPU computing power and of deep learning algorithms such as convolutional neural networks provides the computational and algorithmic foundation for predicting merchants’ adoption behavior, and subsequent work can use them to improve the accuracy and predictive ability of the model in multiple ways.
References
Bellotti, T., & Crook, J. (2013). Forecasting and Stress Testing Credit Card Default Using Dynamic Models. International Journal of Forecasting, 29, 563-574.
Chen, Y., Jing, S., & He, D. (2015). Research on the Intention to Adopt Mobile Commerce from the Perspective of Self-Efficacy. Modern Finance and Economics-Journal of Tianjin University of Finance and Economics, 4, 106-115.
Chong, Y. L. (2013a). A Two-Staged SEM-Neural Network Approach for Understanding and Predicting the Determinants of M-Commerce Adoption. Expert Systems with Applications, 40, 1240-1247.
Freund, Y., & Schapire, R. E. (1997). A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 55, 119-139.
Guo, Y. Z., & Li, X. M. (2018). An Empirical Study on Consumers’ Intention of Buying Tourism Products with Mobile Payments: An Integration Model of TAM and TPB. Journal of Sichuan University (Social Science Edition), 219, 161-172.
Jiao, Z. L., Jing, H., Liu, B. L., & Zhang, Z. H. (2018). Short Term Bike-Sharing Ridership Prediction under the Big-Data Condition: Comparison of Machine Learning Models. Journal of Business Economics, 8, 16-25+35.
Kim, C., Mirsobit, M., & In, L. (2010). An Empirical Examination of Factors Influencing the Intention to Use Mobile Payment. Computers in Human Behavior, 26, 310-322.
Laurell, C., Sandström, C., Berthold, A., & Larsson, D. (2019). Exploring Barriers to Adoption of Virtual Reality through Social Media Analytics and Machine Learning: An Assessment of Technology, Network, Price and Trialability. Journal of Business Research, 100, 469-474.
Li, S. Q., & Chen, L. (2014). A Study on the Construction Companies’ Adoption Decision of Lean Construction Techniques Based on MA-SVM. Science and Technology Management Research, 21, 206-211+234.
Li, S. Q., Peng, Y. F., & Zhen, Y. M. (2014). Research on Influence Factors and Adoption Intention for Adopting Lean Construction Technology. Science and Technology Management Research, 22, 172-177.
Li, X. Y., Xu, K. Y., & Feng, Y. (2018). The Analysis of Consumers’ Adoption Behavior under O2O Mobile Commerce: Comparison for Two Theoretical Models. Information Studies: Theory & Application, 41, 112-116+122.
Liébana-Cabanillas, F., Sánchez-Fernández, J., & Muñoz-Leiva, F. (2014). Antecedents of the Adoption of the New Mobile Payment Systems: The Moderating Effect of Age. Computers in Human Behavior, 35, 464-478.
Lin, C. S., Tzeng, G. H., & Chin, Y. C. (2011). Combined Rough Set Theory and Flow Network Graph to Predict Customer Churn in Credit Card Accounts. Expert Systems with Applications, 38, 8-15.
Liu, B. L., Xia, H. M., Li, Y. H., & Liang, L. T. (2017). An Empirical Study on User’s Mobile Payment Willingness from the Double Perspectives of Both Hygiene and Motivation. Chinese Journal of Management, 14, 600-608.
Oliveira, T., Thomas, M., Baptista, G. et al. (2016). Mobile Payment: Understanding the Determinants of Customer Adoption and Intention to Recommend the Technology. Computers in Human Behavior, 61, 404-414.
Schierz, P. G., Oliver, S., & Wirtz, B. W. (2010). Understanding Consumer Acceptance of Mobile Payment Services: An Empirical Analysis. Electronic Commerce Research & Applications, 9, 209-216.
Sorournejad, S., Monadjemi, A., & Zojaji, Z. (2010). A Model for Adoption of Mobile Banking Services Using Classification and Regression Trees. Journal of US-China Public Administration, No. 11, 66-73.
Sun, Z. M., & Wang, Z. B. (2019). Research on the Forecast of Health Information Adoption on Sina Micro-Blog Basing on Information Characteristics. Information Studies: Theory & Application, 42, 150-156.
Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society Series B (Methodological), 58, 267-288.
Zhan, X., & Qiao, H. (2016). Research on Game Theory Strategy of Trusted Service Management Platform in Mobile Near Field Payment Industry: Based on Two-Sided Markets Theory. Systems Engineering-Theory & Practice, 36, 2259-2267.