JDAIP Vol. 9 No. 3, August 2021
Classification of Acupuncture Points Based on the Bert Model*
Abstract: In this paper, we explore the multi-class classification of acupuncture points (acupoints) based on the Bert model. By classifying and predicting the main acupoint for a disease, we try to recommend the best main acupoint for treating it, and we further explore acupoint grouping in order to provide medical practitioners with an optimal treatment plan and to improve clinical decision-making. The Bert-Chinese-Acupoint model was constructed by retraining on the basis of the Bert model; an acupoint corpus was added in the fine-tuning process to strengthen the semantic features related to acupoints, and the model was compared with machine learning methods. The results show that the Bert-Chinese-Acupoint model proposed in this paper improves accuracy by 3% over the best-performing model among the machine learning approaches.

1. Introduction

Making full use of modern information technology to explore the great value contained in acupuncture and moxibustion medical documents has always been a goal pursued by researchers [1]. Traditional acupuncture has immeasurable cultural value and rich medical knowledge resources, but its development lacks theoretical innovation, and its research methods lack originality and a forward-looking perspective [2] [3] [4] [5]. The human body has many acupoints, each acupoint has multiple functions, and the relationship between acupoints and diseases is a complex many-to-many one. Acupuncture research therefore needs to explore thoroughly how modern scientific and technological methods can be used to extract relevant data from the large body of medical literature, select the best-matching acupoints for treatment, find the best acupoint formula, recommend the best treatment plan so that practitioners can quickly select an acupoint formula in the clinic, and improve the reliability and stability of the clinical efficacy of acupuncture.

At present, research on acupuncture and moxibustion points is mostly based on data mining methods and traditional medical analysis methods. In data mining applications, association rule algorithms are mainly used to explore the dominant symptoms treated by acupoints or the compatibility characteristics of different acupoints [6] - [13]. In traditional medical methods, researchers often use control group observation [14] [15], acupoint massage application [16], and other methods to observe and analyze the treatment of specific diseases with drugs or acupuncture over a large amount of experimental data and thereby reach a positive conclusion. As diseases become more and more complex, accelerating the development of medical intelligence and sharing cross-disciplinary advantages is the current trend. Support vector machine models [17] [18] and naive Bayes models [19] [20] have frequently been used for classification studies of Chinese medicine agents and electronic medical records, and convolutional neural networks [21] [22] and long short-term memory networks [23] have also been used in related applications.

Bert (Bidirectional Encoder Representations from Transformers) [24] is a deep-learning-based language model for Natural Language Processing released by Google in 2018. It performs well in classification tasks, and its rich training corpus allows a more accurate representation of language features. At present, most applications of the Bert model to classification problems are based on fine-tuning. For example, Wang et al. [25] built a domain ontology classification and recognition model based on the Bert model; Zhang et al. [26] used the Bert model to classify text content from online platforms and obtained more accurate classifications of psychological characteristics; Yao et al. [27] used BERT and a clinical corpus to classify clinical records of traditional Chinese medicine; Kang et al. [28] combined syntactic features with Bert word embeddings to propose a BILSTM-CRF model with an attention mechanism for fine-grained classification of product evaluations. There are also studies on improving the Bert model. Wang et al. [29] proposed TFN + BERT-Pair-ATT, an algorithm based on text filtering and an improved BERT, for aspect-level sentiment analysis of long text; Wang et al. [30] constructed a BERT-BILSTM-CNN-CRF hybrid model that takes medical-domain text as the classification object and incorporates entity keyword features. These researchers exploited the strengths of the Bert model in text processing, in particular its stronger semantic representation, to refine the target task and improve the accuracy of the classification results.

The corpus used in the pre-training phase of the Bert language model is a general corpus with no domain specificity. Therefore, in this paper, the Bert-Chinese-Acupoint model is obtained by re-training on the basis of the Bert-Base-Chinese language model. We then conduct a classification prediction experiment for 10 acupoints to observe the accuracy of the results and whether they can provide better data support for the next step, the recommendation of the best acupoint grouping.

2. Classification of Acupoints Based on Bert Model

In this paper, we construct a Bert-Chinese-Acupoint model based on the Bert model to classify and predict 10 acupoints. The acupoint category is obtained by inputting the “name of the disease” or text content related to the disease. For example, if “toothache” is entered, “Neiting” is output. The model is constructed by collecting an acupuncture point corpus, pre-training, and fine-tuning.

2.1. Data Preparation and Pre-Training

First, a crawler program was written to collect human acupoint data, with Baidu as the crawling target. Then acupuncture-point literature published after 1949 was searched for use as the training corpus. The main search sources are CNKI, the Wanfang database, the Chinese Medicine biomedical literature database, and Web of Science. Searches are by subject terms, for example “行间”, “Xingjian”, and “LR2”. The acupoint classification training sample set was compiled by manual screening.

The corpus of acupuncture and moxibustion points was obtained mainly in two ways. First, based on the names of human acupoints in the book “Acupuncture and moxibustion” [31] (a planned textbook for general higher education in Chinese medicine), a crawler program was written to crawl data from the Internet, and a corpus covering 434 acupoints was obtained. Second, a corpus was collected from the literature retrieval platforms of various libraries, consisting of text content such as the titles and abstracts of papers found through the subject-term search. Because the Bert pre-training process is unsupervised, the obtained text can be used directly to learn sentence-level vector representations, without the need to filter and label it. The acupoint text data obtained in this way were used for pre-training to construct the Bert-Chinese-Acupoint model, so as to sharpen its handling of the specialist terminology of Chinese medicine and to strengthen the linguistic feature representation in this domain.
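As a rough illustration of why unlabeled text suffices, the sketch below applies the masked-language-model corruption described in the original BERT paper [24]: about 15% of the tokens are selected, and of those roughly 80% are replaced by [MASK], 10% by a random token, and 10% are left unchanged. The function name, the placeholder sentence, the toy vocabulary, and the fixed seed are assumptions made for illustration; this is not the authors' pre-training code.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (corrupted_tokens, labels) for masked-language-model training.

    labels[i] holds the original token at the selected positions and None
    elsewhere, so the training signal comes from the text itself.
    """
    rng = random.Random(seed)
    n_select = max(1, round(mask_prob * len(tokens)))
    selected = set(rng.sample(range(len(tokens)), n_select))
    corrupted, labels = [], []
    for i, tok in enumerate(tokens):
        if i not in selected:
            corrupted.append(tok)
            labels.append(None)
            continue
        labels.append(tok)
        r = rng.random()
        if r < 0.8:
            corrupted.append(MASK)               # about 80%: replace with [MASK]
        elif r < 0.9:
            corrupted.append(rng.choice(vocab))  # about 10%: random token
        else:
            corrupted.append(tok)                # about 10%: keep unchanged
    return corrupted, labels

# Placeholder sentence (13 tokens), not taken from the paper's corpus.
tokens = ("needling the main acupoint relieved the reported symptom "
          "within one week of treatment").split()
corrupted, labels = mask_tokens(tokens, vocab=tokens)
```

Only the selected positions carry a label, which is why the crawled titles and abstracts can be used without manual annotation.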

The data used in the fine-tuning process are extracted manually from various documents and ancient texts on diseases, methods of operation, and acupuncture points, and are organized and labeled in the following style, including serial number, main acupuncture point, and text content, as shown in Table 1.

Table 1. Pre-training data samples.

The data samples used in the fine-tuning process were selected from the post-1949 acupuncture literature, and the main acupoints, main medical conditions, matching acupoints, and acupuncture manipulation methods were extracted and compiled from the papers. This article chooses the Specific Points, which are widely used in the clinic and effective, for the classification research. They comprise the Five infusion points, the Yuan points, the Luo points, the Qie points, the Ba Mai Jiao Hui points, and the Xia He points below the elbows and knees of the extremities; the Bei Yu points and the Mu points in the chest, abdomen, back, and waist; and the Ba Hui points and the Jing Mai Jiao Hui points in the trunk of the extremities. 10 points out of the Five Points [32] are selected: “Yongquan”, “Xingjian”, “Yuji”, “Yangxi”, “Dadun”, “Zhiyin”, “Guanchong”, “Neiting”, “Yemen”, and “Zhigou”.

2.2. Bert-Chinese-Acupoint Model

For Chinese text classification, the data set needs to be organized into a usable form, and different formats correspond to different Data Processor classes. The data format is (category id\t sentence): the data are stored in .tsv files in which one line represents one text and consists of a label, a tab, and the body. The data are split into three files, train.tsv (training sample set), dev.tsv (validation sample set), and test.tsv (test sample set), which are placed in the same data_dir directory.
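A minimal sketch of this file layout, using only the Python standard library. The helper names, the temporary directory standing in for data_dir, the integer category ids, and the sample rows are all invented for illustration and are not taken from the paper's data set.

```python
import os
import tempfile

def write_tsv(path, rows):
    """Write one text per line as: category id, tab, sentence."""
    with open(path, "w", encoding="utf-8") as f:
        for label, text in rows:
            # Collapse tabs and newlines so each text stays on a single line.
            clean = " ".join(text.split())
            f.write(f"{label}\t{clean}\n")

def read_tsv(path):
    """Read the file back as a list of (category_id, sentence) string pairs."""
    with open(path, encoding="utf-8") as f:
        return [tuple(line.rstrip("\n").split("\t", 1)) for line in f if line.strip()]

# Invented toy rows; real rows would hold the labeled acupoint texts.
splits = {
    "train.tsv": [(0, "toothache with swollen gums"), (1, "headache and dizziness")],
    "dev.tsv": [(0, "toothache")],
    "test.tsv": [(1, "dizziness")],
}
data_dir = tempfile.mkdtemp(prefix="data_dir_")
for name, rows in splits.items():
    write_tsv(os.path.join(data_dir, name), rows)

train_rows = read_tsv(os.path.join(data_dir, "train.tsv"))
```

Splitting each line on the first tab only keeps the body intact even if it contains further tab-like separators after cleaning upstream.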

Based on the acupuncture point database built in this paper, the sample lengths of the acupoint corpus are measured and a suitable sequence length is set; the corpus sentence pairs are de-duplicated to obtain individual sentences, the frequency of each word is calculated, and a corpus dictionary is built. Google provides a variety of pre-trained Bert models for different languages and model scales (Base and Large), including Bert-Base-Chinese. The Bert-Base-Chinese model consists of a 12-layer Transformer block with a word vector dimension of 768, 12 attention heads, and 110 M total parameters. Because this model is large and the amount of data in this paper is relatively small, it has an excess of parameters for the task and is prone to overfitting and poor generalization. Therefore, in this paper the Bert-Base-Chinese architecture is modified in the pre-training stage, and the Bert-Chinese-Acupoint model is configured as follows: a 6-layer Transformer block, a word vector dimension of 384, 12 attention heads, and 23 M total parameters. The pre-training process was completed on Google Cloud Platform in about two weeks.
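The two configurations stated above can be written down side by side. The sketch below is only a bookkeeping aid: the class name EncoderConfig is hypothetical, and the per-head dimension is a derived quantity that assumes the standard multi-head attention layout, in which the hidden size is split evenly across heads.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EncoderConfig:
    """Hypothetical container for the sizes quoted in the text."""
    name: str
    num_layers: int
    hidden_size: int
    num_heads: int

    @property
    def head_dim(self) -> int:
        # Standard multi-head attention splits the hidden size evenly across heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads

BERT_BASE_CHINESE = EncoderConfig("Bert-Base-Chinese", 12, 768, 12)
BERT_CHINESE_ACUPOINT = EncoderConfig("Bert-Chinese-Acupoint", 6, 384, 12)

for cfg in (BERT_BASE_CHINESE, BERT_CHINESE_ACUPOINT):
    print(cfg.name, "layers:", cfg.num_layers, "head_dim:", cfg.head_dim)
```

Under that assumption, halving both the depth and the width while keeping 12 heads leaves each head with 32 dimensions instead of 64.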

The fine-tuning approach adds an Acupoint Processor module for the acupuncture corpus in the pre-processing stage of the model, so that the model can fully extract the intrinsic meanings of tokens, increase the vector distance between similar words in the domain, and improve its performance. During fine-tuning, the training set and validation set are partitioned at a ratio of 9:1. In the experiments, however, the data volumes of the 10 acupoint categories differed greatly, which led to unsatisfactory initial training results and large differences in the evaluation indexes between categories. Therefore, while ensuring the quality of the text, the data for each acupoint were manually selected and reduced according to text content and sentence length, so that the difference in quantity between the categories was reduced and the text content was relatively balanced. This serves two purposes: first, it reduces the excessive disparity in data volume between categories; second, it removes useless and repetitive text, which reduces the problem of inflated accuracy caused by excessive data repetition.
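The reduction described above was done manually, by text content and sentence length. The sketch below substitutes a simple automated stand-in to show the idea: exact duplicates are dropped, each category is capped at a fixed size by random downsampling, and the remainder is split 9:1 per category. The function name, the cap of 20, the seed, and the toy data are assumptions, not the authors' procedure.

```python
import random
from collections import Counter, defaultdict

def balance_and_split(samples, cap, train_ratio=0.9, seed=0):
    """samples: list of (label, text) pairs. Returns (train, dev) lists."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    # dict.fromkeys drops exact duplicate pairs while keeping first-seen order.
    for label, text in dict.fromkeys(samples):
        by_label[label].append(text)
    train, dev = [], []
    for label, texts in by_label.items():
        rng.shuffle(texts)
        texts = texts[:cap]                      # cap over-represented categories
        cut = int(len(texts) * train_ratio)      # 9:1 split within each category
        train += [(label, t) for t in texts[:cut]]
        dev += [(label, t) for t in texts[cut:]]
    return train, dev

# Invented, deliberately imbalanced toy data: 50 texts versus 20 texts.
samples = [("Neiting", f"text {i}") for i in range(50)]
samples += [("Xingjian", f"text {i}") for i in range(20)]
samples += samples[:5]  # simulate repeated texts
train, dev = balance_and_split(samples, cap=20)
train_counts = Counter(label for label, _ in train)
```

Splitting within each category keeps the 9:1 ratio per acupoint, so a small category is not left without validation samples.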

2.3. Baseline Models for Comparison

The comparison models should be representative and generally applicable. Therefore, in order to better analyze whether the Bert model is effective for the classification of acupuncture points, this paper chooses SVM, Naive Bayes, and Long Short-Term Memory models as the baseline models. SVM and Naive Bayes are traditional, representative classifiers among machine learning methods and perform well in text classification; Long Short-Term Memory is a special Recurrent Neural Network that is also widely applied to text classification.

Naive Bayes (N-Bayes) is a classification algorithm based on probability theory that makes its decision by choosing the category with the highest probability. It can still classify effectively when the amount of data is small. Support Vector Machines are highly representative in classification problems: SVM can be applied to almost any classification task and has good generalization ability and a low error rate in experiments. The influencing factors are kept to a minimum through repeated training and tuning, and the best experimental results are selected as the benchmark data for the machine learning methods.

The introduction of Long Short-Term Memory addresses the vanishing gradient problem of RNNs. The repeating module is extended to 4 interacting layers, and information is filtered and updated through three “control gates”, giving the network the ability to learn long-term dependencies. In practice, it solves the problem that information cannot be connected when the time span is too long, and it provides more effective information when processing the input.
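The gate mechanism can be sketched with a single scalar LSTM step in the standard textbook formulation: three sigmoid gates (forget, input, output) plus one tanh candidate layer make up the four interacting layers. The function and parameter names are illustrative assumptions, and real implementations use vectors and weight matrices rather than scalars.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One scalar LSTM step. params maps a layer name to (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate values to write
    c = f * c_prev + i * g            # updated cell state (long-term memory)
    h = o * math.tanh(c)              # updated hidden state (output)
    return h, c

# With all weights at zero, every gate is 0.5 and the candidate is 0.
zero = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero)
```

Because the cell state is updated additively and scaled by the forget gate, information can be carried across many steps, which is what mitigates the vanishing gradient.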

2.4. Evaluation Index

Evaluation metrics are measures for comparing different models. They provide different information about the data and allow the classification effect to be judged. Accuracy: the percentage of tuples that are correctly classified by the classifier; it reflects how well the classifier identifies tuples of each type. Precision: of all samples predicted to be positive, the proportion that are actually positive. Recall: of all samples that are actually positive, the proportion that are correctly predicted to be positive; the higher the recall, the stronger the model’s ability to identify positive samples. F1-score: the harmonic mean of precision and recall, which considers both together; the higher the F1-score, the better the model. This paper compares the precision, recall, and F1-score of the classification models to analyze their classification effect.
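For a multi-class task, these metrics are usually computed per category by treating that category as the positive class. The following is a minimal sketch of that computation; the function name and the toy labels are assumptions, and the predictions are invented rather than taken from the experiments.

```python
def per_class_metrics(y_true, y_pred):
    """Return (accuracy, {label: {"precision", "recall", "f1"}})."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return accuracy, report

# Invented toy labels using three of the ten acupoint names.
y_true = ["Neiting", "Neiting", "Xingjian", "Xingjian", "Yuji"]
y_pred = ["Neiting", "Xingjian", "Xingjian", "Xingjian", "Yuji"]
accuracy, report = per_class_metrics(y_true, y_pred)
```

In this toy case one “Neiting” sample is predicted as “Xingjian”, which lowers the recall of “Neiting” and the precision of “Xingjian” while the overall accuracy stays at 4 out of 5.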

3. Classification Comparison

Following the research approach of this article, the N-Bayes, SVM, Long Short-Term Memory, and Bert-Chinese-Acupoint models are built, the data sets are input into the four models respectively, and classification experiments are performed on the 10 acupoints.

3.1. Naïve Bayes Experiment

Naive Bayes classification is based on Bayes’ theorem and selects the category with the highest probability as the classification result. The implementation is based on MultinomialNB: the labeled text is segmented into words and stop words are removed, TfidfVectorizer is used to calculate the weights, the fit_transform() method extracts the training set features, the transform() method extracts the test set features, and the fit() method fits the model. Finally, the result of the model is obtained, as shown in Table 2.
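The steps above can be sketched with scikit-learn as follows. The sketch assumes the texts have already been segmented into space-separated tokens with stop words removed (the paper does not name the segmentation tool), and the toy English texts and their pairing with acupoint labels are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Invented, pre-segmented toy texts; real inputs would be segmented Chinese text.
train_texts = [
    "toothache gum swelling",
    "toothache sore throat",
    "headache dizziness red eyes",
    "dizziness hypertension headache",
]
train_labels = ["Neiting", "Neiting", "Xingjian", "Xingjian"]
test_texts = ["toothache swelling"]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)  # learn vocabulary and IDF, extract training features
X_test = vectorizer.transform(test_texts)        # reuse the same vocabulary for the test set

clf = MultinomialNB()
clf.fit(X_train, train_labels)
predictions = clf.predict(X_test)
```

Calling transform() rather than fit_transform() on the test set matters: the test features must be expressed in the vocabulary and IDF weights learned from the training set.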

3.2. Support Vector Machine Experiment

In the support vector machine experiment, the “content” text of each category in the training sample is first converted into a word frequency matrix, and the TF-IDF values are then calculated. The SVC classifier from the Sklearn package is used with the “linear” kernel function. In general, a kernel function lets an SVM handle a problem that is nonlinear in the original low-dimensional space as a linear problem in a higher-dimensional feature space; the linear kernel operates directly on the TF-IDF features, which are already high-dimensional. The penalty parameter C, which controls how strongly the slack variables are penalized, is set to adjust the model’s tolerance of misclassified samples. With C = 0.99, the model performs well. The classification result is shown in Table 3.
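A minimal sketch of this pipeline with scikit-learn, using the kernel and C value stated above. The word frequency matrix and the TF-IDF step are kept as separate stages to mirror the description; the pre-segmented toy texts and their labels are invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Invented, pre-segmented toy texts; real inputs would be segmented Chinese text.
train_texts = [
    "toothache gum swelling",
    "toothache sore throat",
    "headache dizziness red eyes",
    "dizziness hypertension headache",
]
train_labels = ["Neiting", "Neiting", "Xingjian", "Xingjian"]

model = make_pipeline(
    CountVectorizer(),                # word frequency matrix
    TfidfTransformer(),               # TF-IDF weights from the counts
    SVC(kernel="linear", C=0.99),     # linear-kernel SVM with the reported C
)
model.fit(train_texts, train_labels)
predictions = model.predict(["toothache swelling"])
```

Wrapping the steps in one pipeline ensures that the test text passes through the same vocabulary and weighting as the training text.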

3.3. Long Short-Term Memory Network Experiment

Long Short-Term Memory is a type of Recurrent Neural Network that mitigates the vanishing gradient problem of RNNs through its “gate” mechanism. In the Long Short-Term Memory classification experiment, the manually labeled data are combined, regular expressions are used to delete letters, numbers, and all other characters except Chinese characters, the text is segmented into words and stop words are filtered out, and the Embedding layer dimension is set to 100. The data set is split into a training set and a validation set at a ratio of 9:1, the activation function is Softmax, and the loss function is categorical_crossentropy. The model result is shown in Table 4.

Table 2. N-Bayes Experiment.

Table 3. SVM Experiment.

Table 4. LSTM Experiment.
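The text cleaning described for the Long Short-Term Memory experiment can be sketched with Python's re module. The sketch assumes that “Chinese characters” means the basic CJK Unified Ideographs range (U+4E00 to U+9FFF); the function names and the sample string are illustrative, and word segmentation itself is left to whichever tool is used.

```python
import re

# Everything outside the basic CJK Unified Ideographs block is removed.
NON_CHINESE = re.compile(r"[^\u4e00-\u9fff]")

def keep_chinese(text):
    """Delete letters, digits, punctuation, and whitespace, keeping Chinese characters."""
    return NON_CHINESE.sub("", text)

def remove_stop_words(tokens, stop_words):
    """Filter an already segmented token list against a stop-word set."""
    return [tok for tok in tokens if tok not in stop_words]

# Illustrative string mixing Chinese text, Latin letters, digits, and punctuation.
cleaned = keep_chinese("穴位：行间 (Xingjian, LR2) 第2版")
filtered = remove_stop_words(["穴位", "的", "行间"], {"的"})
```

Full-width punctuation such as “：” also falls outside the kept range, so it is removed together with ASCII symbols.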

3.4. Bert-Chinese-Acupoint Experiment

The number of training epochs, batch size, learning rate, and other parameters are set during fine-tuning on the training samples, and the data are split into a training set and a validation set at a ratio of 9:1. After repeated experiments, the model achieves the best accuracy, and the curve flattens, with batch_size = 8, optimizer = Adam (1e−5), and epochs = 3. The result is shown in Table 5.
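To make the reported settings concrete, the sketch below collects them in one place and derives the number of optimizer steps they imply for a given corpus size. The class name FineTuneConfig is hypothetical, and the sample count of 1000 is a placeholder, since the size of the fine-tuning set is not stated here.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the reported fine-tuning settings."""
    batch_size: int = 8
    learning_rate: float = 1e-5   # used with the Adam optimizer
    epochs: int = 3
    train_ratio: float = 0.9      # 9:1 training/validation split

    def total_steps(self, num_samples: int) -> int:
        """Optimizer steps over all epochs for a corpus of num_samples texts."""
        n_train = int(num_samples * self.train_ratio)
        return math.ceil(n_train / self.batch_size) * self.epochs

cfg = FineTuneConfig()
steps = cfg.total_steps(num_samples=1000)  # 1000 is a placeholder, not the paper's data size
```

With a small batch size and only three epochs, the total number of updates stays modest, which suits a small labeled data set.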

4. Result Analysis

The evaluation indexes of the classification results of each model are shown in Figure 1. The above experiments show that, on the 10-class problem, the Bert-based model performs better than the other classifiers.

Compared with the other models, the experimental results of N-Bayes are not ideal. The features of the experimental data in this article are not well suited to Naive Bayes, because the model is sensitive to the form of the input data and the input data were not processed appropriately in this article. The next step is to try to extract the text entities of each category, count their distribution, and form a feature vocabulary according to the distribution characteristics.

The classification results of the SVM model and the Long Short-Term Memory model do not differ much, which may be related to the performance of the Long Short-Term Memory model. SVM generalizes well, whereas the characteristics of the experimental data in this article do not play to the strengths of the Long Short-Term Memory model, and the amount of data is limited. Because the information obtained from the input data is insufficient and incomplete, some differences in the classification results remain.

Table 5. Bert-Chinese-Acupoint Experiment.

Figure 1. Classification and evaluation indicators of each model.

The Bert-Chinese-Acupoint model performs better overall. Part of this is due to the performance and theoretical advantages of the Bert language model itself, which combines both the left and right context of a text to obtain semantic information. Adding the acupuncture and acupoint corpus module to the pre-training process also plays a role: it adds domain-specific feature information, and fine-tuning then makes the model more suitable for the research task of this article.

5. Conclusions

During the experiment, because current technology has limited ability to recognize and represent classical Chinese, the contents of many ancient Chinese medicine books could not be used. Therefore, this article only sorts out and analyzes the literature published after 1949 and classifies the data of ten acupoints among the Five Points. The amount of data is not large enough and the acupuncture corpus is not rich enough, so there are certain limitations, and further research based on practical clinical applications is needed. In addition, because the numbers of samples for the ten acupoints are uneven, there are differences between the categories during training.

Based on the Bert model, this article classifies 10 acupoints and achieves good experimental results. In future work, we will further improve the Chinese acupuncture and moxibustion acupoint corpus and try to study ancient books written in classical Chinese. We will also extend this experiment to improve the practical applicability of the model, and combine Gephi and MongoDB to mine the compatibility of acupoints and solve practical clinical problems.


*This work is supported by the National Natural Science Foundation of China under Grant No. 81973695, Soft Scientific Research Project of Shandong Province under Grant No. 2018RKB01080.

Cite this paper: Zhong, X. , Jia, Y. , Li, D. and Zhang, X. (2021) Classification of Acupuncture Points Based on the Bert Model*. Journal of Data Analysis and Information Processing, 9, 123-135. doi: 10.4236/jdaip.2021.93008.

[1]   Guo, Y., Zhang, K., Xu, Y., Guo, Y., He, L.Y. and Liu, B.Y. (2020) Computational Acupuncture. World Traditional Chinese Medicine, 15, 953-960.

[2]   Chen, S.Z. (2015) The Study of the Law of Acupoint Action and the Law of Prescription Are the Two Cores of Modern Acupuncture Research. Proceedings of the 7th Annual Conference of Shandong Acupuncture and Moxibustion Society, Rizhao, 25 July 2015, 11-15.

[3]   Chen, S.Z. (2019) Inclusivity and Innovation in Modern Acupuncture and Moxibustion: Breakthroughs of Introduction to Acupuncture and Moxibustion Medicine. Chinese Acupuncture & Moxibustion, 39, 331-334.

[4]   Wu, B.J. (2016) Ten Development Tendencies and Strategies of Acupuncture in the 21st Century. World Journal of Acupuncture-Moxibustion, 26, 15-19, 32.

[5]   Yin, T., He, Z.X., Sun, R.R., Li, Z.J., et al. (2020) Progress and Prospect of Machine Learning in Research of Acupuncture and Moxibustion. Chinese Acupuncture & Moxibustion, 40, 1383-1386.

[6]   Zhang, J., Zhang, Z.L., Zhu, Y.Z., Jia, H.L. and Zhang, Y.C. (2020) Data Mining to Explore the Main Treatment of YuJi and the Rules of Compounding. Shandong Journal of Traditional Chinese Medicine, No. 9, 906-913.

[7]   Zhang, J., Jia, H.L., Zhang, Z.L. and Zhang, Y.C. (2020) Based on Data Mining to Analyze the Indications and Compatibility of Xingjian Points. Shandong Journal of Traditional Chinese Medicine, 39, 773-781.

[8]   Wang, J.Y., Wang, T., Zhu, Y.Z., Jia, Y.L., Jia, H.L. and Zhang, Y.C. (2020) Based on Data Mining, Analysis of the Indications and Compatibility of Yangxi Point. Shandong Journal of Traditional Chinese Medicine, 39, 1039-1046.

[9]   Song, Q.M., Zhu, Y.Z., Jia, Y.L., Wang, Q., Jia, H.L. and Zhang, Y.C. (2020) Analysis of the Acupuncture Indications and Compatibility Laws of Houxi (SI3) Based on Data Mining. Shandong Journal of Traditional Chinese Medicine, 39, 1153-1160+1165.

[10]   Wang, T., Jia, Y.L., Zhu, Y.Z., Jia, H.L. and Zhang, Y.C. (2020) Based on Data Mining to Analyze the Dominant Symptoms and Compatibility of the Yinmen Point. Shandong Journal of Traditional Chinese Medicine, 39, 1274-1281.

[11]   Pu, L., Lin, J.H., Chen, W.H., Cao, L., Lin, S.J., Chen, S.L., Li, C.L., Fu, Q., Zhang, Y.M. and Zhu, M.M. (2019) Investigation on the Law of Acupuncture in the Acute Phase of Peripheral Facial Paralysis Based on Data Mining. Lishizhen Medicine and Materia Medica Research, 30, 2270-2273.

[12]   Fan, M.L., Wei, Y.T., Wei, J.N., Zhang, Y.Z., Chen, J., Ge, H., He, X.F. and Hao, H.W. (2019) Law of Meridian and Acupoint Selection in Acupuncture Treatment for Anxiety Based on Data-mining Method. Chinese Journal of Basic Medicine in Traditional Chinese Medicine, 25, 357-360.

[13]   Zhang, P.M., Zhang, W., Tan, Z.G., Tang, Y.N., Liu, X.J., Pan, T., Xiao, D., Xie, Z.R. and Chen, D.Z. (2020) Application of Special Acupoints for Chronic Gastritis in Ancient Literature of Acupuncture and Moxibustion. Chinese Acupuncture & Moxibustion, 40, 1018-1023.

[14]   Li, X.R., Hou, X., Ye, Y.J., Bu, X.M., An, C.L. and Yan, X.K. (2020) A Comparative Observation of the Curative Effect of Thread-Embedding and Acupuncture in the Treatment of Liver-Stagnation and Qi Stagnation Type Insomnia Based on the “Shugan Tiaoshen” Prescription. Chinese Acupuncture & Moxibustion, 40, 1277-1280+1285.

[15]   Sun, Y., Pang, Y.H., Mao, N.Q., Luo, J.N., Cai, D.L. and Chen, F.F. (2020) Effect of Transcutaneous Electrical Acupoint Stimulation on Venous Thrombosis after Lung Cancer Surgery: A Randomized Controlled Trial. Chinese Acupuncture & Moxibustion, 40, 1304-1308.

[16]   Sheng, J., Xia, H.O., Ding, Y., Wang, J. and Zhang, J.P. (2020) Effect of Breast Massage Combined with Acupoint Stimulation on Milk Volume of Mothers Separated from Their Preterm Neonates. Journal of Nursing Science, 35, 48-52.

[17]   Yang, Y., Xiao, J.M., Zhou, J., He, F.Y., Zeng, H.J. and Yang, Y.T. (2020) Research on Classification Algorithm Based on Support Vector Machine in Chinese Materia Medica. Chinese Traditional and Herbal Drugs, 51, 2258-2266.

[18]   Jin, Z.L., Hu, J.X., Jin, H.W., Zhang, L.R. and Liu, Z.M. (2018) Analysis of Traditional Chinese Medicine Prescriptions Based on Support Vector Machine and Analytic Hierarchy Process. China Journal of Chinese Materia Medica, 43, 2817-2823.

[19]   Wang, R.X. and Sun, J. (2019) Research on the Classification of Traditional Chinese Medicine Physicians Based on Medication Pattern Recognition. Lishizhen Medicine and Materia Medica Research, 30, 2802-2803.

[20]   Liu, Y.B., Ye, H., Yi, J. and Cao, D. (2020) Text Information Extraction of Traditional Chinese Medicine Electronic Medical Records Based on Native Bias and Word2vec. World Science and Technology, 22, 3563-3568.

[21]   Huang, S.Q., Wang, F., Wang, X.S., Zhou, Q. and Zhao, X. (2020) A Review of Convolutional Neural Networks in Chinese Medicine Tongue Diagnosis. Computer Knowledge and Technology, 16, 20-22.

[22]   Wang, H.Y. (2019) Research on the Identification Method of Chinese Herbal Medicine Based on Deep Learning. Master’s Thesis, Guilin University of Electronic Science and Technology, Guangxi Zhuang Autonomous Region.

[23]   Du, L., Cao, D., Lin, S.Y., Qu, Y.Q. and Ye, H. (2020) Extraction and Automatic Classification of Chinese Medical Records Text Based on BERT and Bi-LSTM Fusion Attention Mechanism. Computer Science, 47, 416-420.

[24]   Devlin, J., Chang, M.W., Lee, K. and Toutanova, K. (2018) BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding.

[25]   Wang, S.L., Yang, H., Zhu, Z.M. and Liu, W. (2020) Automatic Identification of Domain Ontology Classification Relations Based on BERT. Intelligence Science, 1-8.

[26]   Zhang, H., Jia, T.Y., Luo, F., Zhang, S. and Wu, X. (2020) A Study on the Prediction of Psychological Traits of BERT for Online Texts. Journal of Frontiers of Computer Science and Technology, 1-13.

[27]   Yao, L., Jin, Z., Mao, C.S., Zhang, Y. and Luo, Y. (2019) Traditional Chinese Medicine Clinical Records Classification with BERT and Domain Specific Corpora. Journal of the American Medical Informatics Association, 26, 1632-1636.

[28]   Kang, Y., Xue, H.Z. and Hua, B. (2021) Fine-Grained Commodity Evaluation Analysis for Deep Learning Networks. Computer Engineering and Applications, 1-10.

[29]   Wang, K., Zheng, Y., Fang, S.Y. and Liu, S.Y. (2020) Long Text Aspect-Level Sentiment Analysis Based on Text Filtering and Improved BERT. Computer Applications, 40, 2838-2844.

[30]   Wang, X.Y., Lu, X.Q. and You, X.D. (2021) KBLCC: An Entity Classification Method for Medical Domains Incorporating Entity Keyword Features. Journal of Chinese Computer Systems, 1-9.

[31]   Sun, G.J. (2011) Acupuncture. 2nd Edition, People’s Medical Publishing House (PMPH), Beijing.

[32]   Chen, S., Wu, S., Wang, H. and Liang, F.X. (2016) A Brief Description of the Clinical Use of the Five Influence Points. Lishizhen Medicine and Materia Medica Research, 27, 2468-2469.