Chalcone derivatives have been synthesized for a range of physiological activities, including anti-inflammatory, antiviral, antitumor, and anticancer activity. These derivatives differ in molecular structure and function and bind to different receptors. Cancer is now a serious disease that endangers human health. According to the latest estimate of the global cancer burden released by the International Agency for Research on Cancer (IARC), the burden rose to 18.1 million new cases and 9.6 million cancer deaths in 2018. In recent years, a number of studies have shown that chalcone derivatives can inhibit tumor growth. For example, Sapavat Madhavi synthesized an anticancer drug based on the chalcone structure. Bruna Lannuce Silva Cabral and others synthesized a new chalcone derivative, LQFM064, which can induce the death of breast cancer cells.
There are many chalcone derivatives, and their research and development cycle is long. To shorten this cycle, quantitative structure-activity relationship (QSAR) analysis was used to study the relationship between the structure of chalcones and their antitumor activity. Many studies have shown that 3D-QSAR has been applied widely to compounds and their derivatives and has provided new ideas for the design of new derivatives. For example, Hui Zhi studied triazole derivatives as SGLT inhibitors via CoMFA and, on the basis of the model, designed 8 new SGLT inhibitors and predicted their activity. Xin Zhang and Hui Zhang generated a predictive 3D-QSAR model via CoMFA for the modified design of dopamine D3 receptor antagonists.
In this study, we used SYBYL 2.0 to build a CoMFA model for the 3D-QSAR of 24 chalcone derivatives. Based on the results of the model, we then designed two new chalcone derivatives. We expect that our work can provide complementary and useful information for designing novel chalcone derivatives and predicting their antitumor activity.
2. Research Methods
2.1. Data Resources
The data used in this study were taken from the literature. The structures of the chalcone derivatives are listed in Table 1, together with their biological activities, expressed as IC50 (half maximal inhibitory concentration) values in µM for inhibition of human colon cancer. The IC50 served as the dependent variable in the QSAR study, and all original IC50 values were converted to the negative logarithm of IC50 (pIC50). These data were used to build a 3D-QSAR model for examining the structure-activity relationship of the chalcone derivatives.
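The IC50-to-pIC50 conversion described above can be sketched as follows. This is a minimal illustration, assuming that pIC50 is defined on a molar basis, so that an IC50 given in µM yields pIC50 = 6 - log10(IC50 in µM). The function name and the sample IC50 values are hypothetical and are not taken from Table 1.

```python
import math


def ic50_um_to_pic50(ic50_um: float) -> float:
    """Convert an IC50 in micromolar to pIC50 = -log10(IC50 in mol/L).

    Assumption: pIC50 is referenced to molar concentration, so
    -log10(ic50_um * 1e-6) simplifies to 6 - log10(ic50_um).
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 6.0 - math.log10(ic50_um)


# Hypothetical IC50 values in uM, for illustration only.
for ic50 in (0.5, 1.0, 10.0):
    print(f"IC50 = {ic50:5.1f} uM -> pIC50 = {ic50_um_to_pic50(ic50):.3f}")
```

With this convention a lower IC50 gives a higher pIC50, so more potent compounds receive larger values of the dependent variable.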
2.2. Comparative Molecular Field Analysis (CoMFA)
Molecular similarity searching is fast becoming a key tool in drug design. CoMFA is a method that reflects the non-bonded interactions between a receptor and a ligand. Because it analyzes the relationship between the structure and the activity of compounds, CoMFA is widely used in drug design. The results of the statistical analysis can be displayed graphically on the molecular surface to indicate steric and electrostatic properties, which helps to clarify the details of the interaction between the ligand and the active site of the receptor.
2.3. Model Establishment
Molecular alignment is an important step in CoMFA and directly affects the predictive ability of the model. In this study, 20 chalcone derivatives were used as the training set and 4 chalcone derivatives were used as the test set. These data sets were used to construct the 3D-QSAR model and to analyze the physicochemical properties of the compounds. Compound 21, the most active chalcone, was chosen as the template, and its common skeleton was used to align the training molecules. Figure 1 shows compound 21 with the selected common skeleton, and Figure 2 shows the aligned compounds of the training set.
Table 1. Chalcone derivatives and their experimental and predicted IC50.
*Test set compounds.
Figure 1. The common substructure of compound 21 used for alignment (shown as bold black atoms).
Figure 2. Alignment of training set molecules.
3. Results and Discussion
The statistical results of the CoMFA model for the chalcone derivatives are shown in Table 2. The cross-validated correlation coefficient Q2 of the model is 0.6, which indicates that the model has predictive ability. In addition, the R2pred value for the test set, calculated with Matlab, is 0.978, which also indicates good predictive ability. The non-cross-validated correlation coefficient R2 is 0.997, which indicates good statistical stability. The standard deviation is 0.017 and the variance ratio (F) is 388. The steric and electrostatic fields contribute 65.7% and 34.3%, respectively, so both steric and electrostatic effects contribute to the activity, with the steric field contributing more.
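The three correlation statistics quoted above share one functional form, which the sketch below makes explicit. It assumes the commonly used definitions: R2 compares fitted values with the training activities, Q2 replaces the fitted values with leave-one-out predictions, and R2pred compares test set predictions with test set activities using the training set mean as the reference. The paper does not state the exact formulas used in SYBYL or Matlab, so these definitions, the function names, and the sample pIC50 values are assumptions for illustration only.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def explained_fraction(y_obs, y_hat, reference_mean):
    """Return 1 - sum((obs - hat)^2) / sum((obs - reference_mean)^2)."""
    ss_err = sum((o - h) ** 2 for o, h in zip(y_obs, y_hat))
    ss_ref = sum((o - reference_mean) ** 2 for o in y_obs)
    if ss_ref == 0:
        raise ValueError("observed values have no spread around the reference")
    return 1.0 - ss_err / ss_ref


def r_squared(y_train, y_fitted):
    """Non-cross-validated R2: fitted values against training activities."""
    return explained_fraction(y_train, y_fitted, mean(y_train))


def q_squared(y_train, y_loo_pred):
    """Cross-validated Q2: leave-one-out predictions against training activities."""
    return explained_fraction(y_train, y_loo_pred, mean(y_train))


def r_squared_pred(y_test, y_test_pred, y_train):
    """Predictive R2: test predictions, referenced to the training set mean."""
    return explained_fraction(y_test, y_test_pred, mean(y_train))


# Hypothetical pIC50 values, not taken from Table 1.
y_train = [5.0, 5.5, 6.0]
y_fitted = [5.02, 5.47, 6.01]   # model fitted on all training compounds
y_loo = [5.2, 5.4, 5.7]         # each value predicted with that compound left out
y_test = [5.2, 5.8]
y_test_pred = [5.25, 5.7]

print(f"R2     = {r_squared(y_train, y_fitted):.3f}")
print(f"Q2     = {q_squared(y_train, y_loo):.3f}")
print(f"R2pred = {r_squared_pred(y_test, y_test_pred, y_train):.3f}")
```

Because leave-one-out predictions never see the compound being predicted, Q2 is normally lower than R2, which matches the gap between 0.6 and 0.997 reported for this model.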
3.1. Model Validation of CoMFA
Figure 3 shows the correlation between the experimental values from the literature and the values predicted by the chalcone CoMFA model. The abscissa represents the experimental value and the ordinate represents the predicted value; black points represent the training set and red points represent the test set. All samples lie close to the line y = x, which indicates that the model has good fitting and predictive ability and that the predicted values are close to the experimental values.
3.2. Analysis of the Contour Map
The contour map of chalcone represents the steric and electrostatic fields of the 24 compounds in different colors, as shown in Figure 4. Figure 4(a) shows steric interactions as green and yellow contours. Green regions indicate that bulky substituents are favored, and yellow regions indicate that small substituents increase activity. Figure 4(b) shows electrostatic interactions as red and blue contours. Blue regions indicate that positive charge is favored, and red regions indicate that negative charge increases activity.
There is a major green region around the R2 position in the steric contour map, suggesting that a bulky group is preferred at this position to produce higher antitumor activity. For example, 9 compounds (3 - 11) have higher antitumor activity than compound 2, because mono-halogen and mono-methyl groups are bulkier than hydrogen. This analysis suggests that a relatively bulky substituent is beneficial to antitumor activity. The electrostatic contour map shows a blue region surrounding the R2 position, indicating that substituents that increase the positive charge on the ring system would give better activity. An electron-withdrawing group such as a mono-halogen is found to increase antitumor activity, as in compounds 3 - 8 and 17 - 19. This analysis suggests that a relatively electron-withdrawing substituent is beneficial to antitumor activity.
Table 2. Statistical results of the CoMFA model.
Q2: Cross-validated correlation coefficient. ONC: Optimum number of components. R2: Non-cross-validated correlation coefficient. Scv: Standard error of the estimate. F: F-test value.
Figure 3. Plot of experimental activity against predicted activity by the CoMFA model.
Figure 4. Contour maps of the CoMFA model. Compound 21 was used as the template. (a) Green and yellow represent favorable and unfavorable regions for the steric field, respectively. (b) Blue and red represent favorable and unfavorable regions for the electrostatic field, respectively.
3.3. Design and Activity Verification of Novel Chalcone Derivatives
In the CoMFA model, the steric field contributes more than the electrostatic field. According to the contour map analysis, a relatively bulky, electron-withdrawing substituent around the R2 position could increase the antitumor activity of chalcones. On this basis, several new chalcone molecules were designed, as shown in Table 3.
Table 3. Structures and predicted pIC50 values of the newly designed molecules with antitumor activity.
The larger the pIC50 value, the better the antitumor activity of a compound. Finally, the CoMFA model was used to predict the antitumor activity of the new chalcones. The predicted pIC50 values of the designed structures N1, N2, N4, and N6 were 6.090, 6.101, 6.050, and 6.064, respectively, which are larger than the pIC50 values of some compounds in Table 1. These results suggest that the model can provide useful guidance for the further design of new drugs.
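To put the predicted pIC50 values on a concentration scale, the sketch below ranks the four designed structures and converts each prediction back to an IC50. It assumes the molar-based convention pIC50 = -log10(IC50 in mol/L), which the paper does not state explicitly; the function name is hypothetical, while the four pIC50 values are those reported above.

```python
def pic50_to_ic50_um(pic50: float) -> float:
    """Convert pIC50 back to IC50 in micromolar.

    Assumption: pIC50 = -log10(IC50 in mol/L), so IC50 in uM = 10 ** (6 - pIC50).
    """
    return 10.0 ** (6.0 - pic50)


# Predicted pIC50 values reported in the text for the designed structures.
predicted_pic50 = {"N1": 6.090, "N2": 6.101, "N4": 6.050, "N6": 6.064}

# Rank from most to least active (largest pIC50 first).
ranked = sorted(predicted_pic50.items(), key=lambda item: item[1], reverse=True)

for name, pic50 in ranked:
    print(f"{name}: pIC50 = {pic50:.3f}, IC50 = {pic50_to_ic50_um(pic50):.3f} uM")
```

Under this assumption, all four predictions correspond to IC50 values below 1 µM, since each pIC50 exceeds 6.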
In this paper, we divided the data for 24 chalcone derivatives into a training set and a test set, used the training set to build the CoMFA model, and used the test set to evaluate its predictions of antitumor activity. The model gave Q2 and R2 values of 0.6 and 0.997, respectively, which indicates good predictive ability and stability. Analysis of the CoMFA contour maps showed how the steric and electrostatic fields around the aligned molecules affect antitumor activity. The predicted pIC50 values of the newly designed chalcones were 6.090, 6.101, 6.050, and 6.064, respectively, which are higher than those of some of the chalcone derivatives in Table 1. Therefore, this model can be used as a tool for designing new chalcones and can improve the efficiency of molecular design.