ABCR  Vol.10 No.3 , July 2021
In-Silico Identification of Anticancer Compounds; Ligand-Based Pharmacophore Approach against EGFR Involved in Breast Cancer
Abstract: Objective: Breast cancer is a global public health challenge caused by environmental or genetic factors. It affects both males and females, yet effective drugs with improved potency and tolerability are still lacking, as many breast cancer drugs have severe side effects. Methods: A docking approach was used to search for new compounds against breast cancer with greater efficacy and tolerance and fewer side effects. A ligand-based pharmacophore model was generated from 39 anticancer compounds relevant to the development of new drugs. Result: Docking identified new lead compounds for breast cancer. The proposed pharmacophore model contains two HBAs and one HYD, one hydrophobic domain, and two aromatic rings, and the distance between the derived pharmacophore features is estimated as a range from minimum to maximum. Conclusion: Based on this research, it is proposed that the two lead compounds may be usable against EGFR in breast cancer. New compounds can be identified based on the common features of the pharmacophore model. The 3D pharmacophore triangle could be used in further studies because this pharmacophore shows better merging, and future studies may adopt the same distance ranges between pharmacophore features.

1. Introduction

Breast cancer is one of the most widespread diseases worldwide; each year, 1.3 million cases are reported [1]. According to the World Health Organization (WHO), 16% of patients die of breast cancer globally [2]. Breast cancer is the uncontrolled division or multiplication of breast cells. Records from Egypt dating to 1600 BC describe the disease, and the first type of breast cancer is associated with the inner lining of the milk ducts, which supply milk [3]. The disease is most common in females, with a higher rate in developed countries, but men can develop it too [4] [5]. Breast cancer is the second most common cancer overall and is a major issue worldwide. In 2011, 508,000 females died of breast cancer, and 1.7 million new cases were reported in 2012 [6]. For 2016, the American Cancer Society estimated that 246,660 new cases of invasive breast cancer would be diagnosed in females in the United States [7]. In Pakistan, 83,000 cases of breast cancer are reported every year, and the disease kills nearly 40,000 women annually [8]. The main risk factors are age, inheritance, height, radiation, smoking, alcohol use, overweight and obesity, and not breastfeeding. Many proteins are involved in breast cancer, but the epidermal growth factor receptor (EGFR) was chosen for this study because it is also involved in many other cancers, such as lung cancer and head and neck cancer, and in sepsis [9]. EGFR, also known as HER1, has a major role in breast cancer. It is a member of the ErbB family of receptor tyrosine kinases and transmits signals into the cell [10]. Its normal functions include cellular processes such as proliferation, differentiation, and development, but EGFR dysregulation leads to cancers of different types [11].
Overexpression of EGFR was observed in 20.6% of 165 cases, whereas EGFR gene amplification was observed in 7.9% of cases [12]. In 1992, Klijn et al. reported EGFR expression in 2500 of 5232 breast carcinomas from different patient histories, indicating that EGFR has a key role in the development of breast cancer. EGFR gene amplification was rare, but according to Buchholz et al., 17% of 82 patients with breast cancer showed EGFR expression in breast carcinomas, and 14.5% of 278 invasive breast carcinomas were EGFR positive [13]. Mutations that cause breast cancer occur in exons 17 - 21 of EGFR. According to previous studies, some drugs are not useful as therapeutic agents on their own but are beneficial against cancerous cells in combination with other drugs. Initial candidate chemicals, or “leads”, are often the only recognized test agents for prolonging survival, and they are used to identify the most clinically active breast cancer drugs. Docking and pharmacophore modeling are considered very important parts of drug design. Both approaches help in understanding the interaction between a protein and a ligand and can predict receptor-ligand binding by specifying the arrangement of atoms of the functional groups. The pharmacophore model is very convenient for understanding the common properties of the binding groups and for determining the type of inhibitor that binds to a target. Interactions between receptors and ligands occur at the cell surface; signaling starts there and then proceeds along the intracellular pathway. Abnormalities in EGFR were first identified in signaling pathways, and these abnormalities lead to tumor formation.

In this research, a docking approach, pharmacophore modeling, and a pharmacophore triangle were developed to promote the discovery of more effective EGFR inhibitors for the treatment of breast cancer. The compounds used for docking and subsequently for pharmacophore modeling have been reported in the reference papers. Previous studies showed that several compounds have toxic effects and are not beneficial against breast cancer. Because of these toxic side effects, there is an urgent need to develop a natural inhibitor for breast cancer.

2. Materials and Methods

2.1. Structure Retrieval

The work was initiated by retrieving the 3D structure of the EGFR protein from the Protein Data Bank (PDB) under PDB ID 4WD5. The PDB contains information on macromolecules such as proteins and nucleic acids and on small molecules such as ligands, together with details of crystallographic structures, structural descriptors, data collection, and structure refinement [14]. The sequence and three-dimensional structure of this protein were already available. To obtain properties of the protein such as the number of amino acids, molecular weight, and atomic composition, a protein analysis program was used; this program reports all the required properties of the protein [15]. UCSF Chimera 1.10 was used for protein visualization and for cleaning the structure, that is, removing water molecules and bound ligands. UCSF Chimera is a highly extensible program for interactive visualization and analysis of molecular structures and related data, including density maps, supramolecular assemblies, sequence alignments, docking results, trajectories, and conformational ensembles, for both macromolecules and small molecules. High-quality images and animations can also be generated with Chimera. Chimera includes complete documentation and several tutorials and can be downloaded free of charge for academic, public, non-profit, and personal use. Chimera is developed by the Resource for Biocomputing, Visualization, and Informatics [16].
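The preparation step described above (keeping one chain and removing water molecules and bound ligands) can be sketched in plain Python. This is a minimal illustration that assumes a standard fixed-column PDB file; the function names `keep_chain_atoms` and `count_residues` are hypothetical, and the study itself performed this step in UCSF Chimera rather than with a script.

```python
def keep_chain_atoms(pdb_text, chain_id="A"):
    """Return only the ATOM records of one chain.

    HETATM records (water molecules, bound ligands, ions) are dropped,
    as are ATOM records that belong to other chains.
    """
    kept = [
        line
        for line in pdb_text.splitlines()
        # Column 22 (index 21) of a PDB ATOM record holds the chain identifier.
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id
    ]
    return "\n".join(kept + ["END"]) + "\n"


def count_residues(pdb_text, chain_id="A"):
    """Count distinct residues (residue number plus insertion code) in a chain."""
    residues = set()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) > 26 and line[21] == chain_id:
            residues.add(line[22:27])  # columns 23-27: resSeq and iCode
    return len(residues)
```

Applied to a downloaded structure file, `count_residues` gives a quick check of the chain length before docking.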

2.2. Selection of Compounds and Docking Approach

The docking process involves the prediction of ligand orientation and conformation within the targeted binding site of EGFR. Two essential components of protein-ligand docking are sampling and scoring [17]. Sampling generates candidate ligand orientations and conformations near the binding site of the protein, and scoring estimates the binding tightness of each one; the top orientation is the one with the lowest energy score. For docking, the selection of data set compounds is the most important step. From a literature survey, a set of 39 anti-breast cancer ligands was collected; the chemical structures of the compounds are shown in Table 1. There is an urgent need to control the overexpression of EGFR at the initial stage of the disease. The 2D structures of the compounds were retrieved from PubChem and saved in SDF format. PubChem is a database of the biological properties of small molecules. Its main purpose is to make information easily accessible to biomedical researchers, and it has become a leading public data repository by expanding its search, structure retrieval, data analysis, and download tools [18]. The ligands were imported into AutoDock Vina through PyRx and docked with the EGFR protein; the docking scores are shown in Table 2. PyRx is open-source virtual screening software for computational drug discovery that is used to screen libraries of compounds against drug targets; its interface runs on all main operating systems, such as Windows, Mac OS, and Linux, and on supercomputers [19].
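The scoring idea described above, in which the top pose is the one with the lowest energy score, can be illustrated with a short sketch. It assumes the docked poses are stored in an AutoDock Vina output file whose per-pose header lines have the form `REMARK VINA RESULT: <affinity> <rmsd l.b.> <rmsd u.b.>`; the function names and any sample values are illustrative and not taken from the study.

```python
def vina_scores(pdbqt_text):
    """Collect the binding affinity (kcal/mol) of every pose in a Vina output."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # The first number after the tag is the affinity of that pose.
            scores.append(float(line.split(":", 1)[1].split()[0]))
    return scores


def best_vina_score(pdbqt_text):
    """Return the lowest (most favourable) affinity, or None if no pose is found."""
    scores = vina_scores(pdbqt_text)
    return min(scores) if scores else None
```

Running `best_vina_score` over the output of each of the 39 ligands would give one value per compound, which is the kind of figure a docking score table summarizes.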

AutoDock Vina is a program for molecular docking and virtual screening. It speeds up molecular docking compared with previous tools and also increases the accuracy of binding mode predictions. The AutoDock tool was used to convert the compounds into PDB format, and a grid box was used to define the binding site. The objective of the docking studies was to identify the binding pattern and to determine the binding energies [20].
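As a rough illustration of how a grid box defines the binding site, the sketch below computes a box that encloses a set of binding-site atom coordinates and formats it with the option names used in an AutoDock Vina configuration file. The coordinates, padding, and file names are hypothetical placeholders; the box dimensions actually used in this study are not reported here.

```python
def grid_box(coords, padding=4.0):
    """Return (center, size) of an axis-aligned box enclosing coords plus padding."""
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2.0 * padding for lo, hi in zip(mins, maxs))
    return center, size


def vina_config(receptor, ligand, center, size, exhaustiveness=8):
    """Format a Vina configuration file from a box center and size."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, c, s in zip("xyz", center, size):
        lines.append(f"center_{axis} = {c:.3f}")
        lines.append(f"size_{axis} = {s:.3f}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines) + "\n"
```

A larger padding makes the search space more permissive at the cost of longer sampling, so the value chosen here is only a starting point.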

Figure 1. Binding affinities of the docked compounds. In graph (a), BMS-599626 has the lowest binding affinity; in graph (b), BMS-599626 hydrochloride has the lowest binding affinity.

Table 1. Chemical structures of the compounds used as inhibitors of the EGFR protein involved in breast cancer. These inhibitors were selected from a literature review. The two lead compounds have the same chemical structure but different names.

2.3. Protein-Ligand Interactions

PLIP (protein-ligand interaction profiler) is the first web server that provides visualization and analysis of protein-ligand interactions from loaded PDB structures [21]. The PLIP algorithm does not require any structure preparation. PLIP presents interactions in 2D and 3D, and results can be downloaded in XML and text formats and saved as images [21]. The key objective of the interaction analysis was to check how the ligands interact with their surrounding residues. After docking and interaction analysis, a lead compound with low binding affinity and the maximum number of interactions was identified.
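The selection rule stated above (low binding affinity together with many interactions) can be expressed as a simple ranking. The sketch below is an assumption about how such a rule might be coded, not the authors' procedure; the function name `rank_leads` is hypothetical, and any values passed to it would come from the docking and interaction results.

```python
def rank_leads(results):
    """Sort compounds by affinity (ascending), then by interaction count (descending).

    results maps a compound name to (affinity_kcal_per_mol, n_interactions).
    """
    return sorted(results, key=lambda name: (results[name][0], -results[name][1]))
```

Sorting on affinity first treats the interaction count only as a tie-breaker; a weighted score would be an alternative if both criteria should trade off against each other.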

2.4. Toxicity Analysis

After identification of the lead compound, toxicity analysis was an important step, and the admetSAR tool was used for it. admetSAR was used to check properties such as absorption, distribution, metabolism, excretion, and toxicity (ADMET). These are key properties for drug discovery and are shown in Table 3. admetSAR is a free, open-source tool [22] [23] [24].

2.5. Pharmacophore Modeling

Pharmacophore modeling was carried out with the LigandScout software, which runs on all major operating systems. LigandScout is an automated and fast tool. It is important not only for binding site analysis but also for designing shared-feature pharmacophores [25]. The pharmacophore features are shown in Table 4.
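The idea of a shared-feature pharmacophore, that is, keeping only the features that every ligand has in common, can be sketched with a multiset intersection. This is a simplification: LigandScout also aligns features in 3D, which is not attempted here, and the function name and feature labels ("HBA", "HBD", "AR", "HYD") are illustrative assumptions.

```python
from collections import Counter
from functools import reduce


def shared_features(ligand_features):
    """Return the feature counts common to every ligand.

    ligand_features maps a ligand name to a list of feature labels such as
    "HBA", "HBD", "AR" (aromatic ring) or "HYD" (hydrophobic).
    """
    counters = [Counter(features) for features in ligand_features.values()]
    if not counters:
        return Counter()
    # Counter & Counter keeps the minimum count of each label.
    return reduce(lambda a, b: a & b, counters)
```

A feature that is absent from even one ligand drops out of the result, which is why the choice of data set compounds strongly affects the final model.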

2.6. 3D Pharmacophore

Table 2. Calculation of binding affinities (kcal/mol) of selected anti-cancer compounds against the breast cancer receptor.

Table 3. Toxicity analyses of the top two compounds using the admetSAR analysis tool.

Table 4. A pharmacophore is a precise representation of the molecular features that are important for the molecular recognition of a ligand by a protein. Usually, the pharmacophore features are hydrophobic groups, hydrogen bond acceptors, aromatic rings, and hydrogen bond donors.

The key objective of pharmacophore design was to create the 3D pharmacophore. The purpose of pharmacophore triangle generation is to find the distance range, from minimum to maximum, between pharmacophore features such as aromatic ring, hydrogen bond donor, hydrogen bond acceptor, and hydrophobic group [26]. The last step is to develop the triangle shown in Table 4.
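The distance-range calculation behind the pharmacophore triangle can be sketched as follows. The code assumes that each compound provides one 3D point per feature type; the function name `distance_ranges` is hypothetical, and any coordinates supplied to it would have to be exported from the modeling software.

```python
from itertools import combinations
from math import dist


def distance_ranges(compounds):
    """Return {(feature_a, feature_b): (min_distance, max_distance)}.

    compounds maps a compound name to {feature_label: (x, y, z)}.
    Only feature pairs present in a compound contribute to the range.
    """
    ranges = {}
    for features in compounds.values():
        # Sorting the labels gives each pair a single, stable key order.
        for a, b in combinations(sorted(features), 2):
            d = dist(features[a], features[b])
            lo, hi = ranges.get((a, b), (d, d))
            ranges[(a, b)] = (min(lo, d), max(hi, d))
    return ranges
```

Each key of the result corresponds to one side of the triangle, and the pair of values is the minimum and maximum length of that side across the compounds.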

3. Results

The 3D structure of EGFR was already available, so it was retrieved from the PDB under PDB ID 4WD5. The EGFR structure comprises two chains, A and B, and 660 amino acid residues. To stabilize the protein, water molecules and hydrogen atoms were removed from the structure, and only chain A, which consists of 330 amino acid residues, was selected for this research. Docking is an essential part of drug design. Docking of the protein with the selected ligands was performed with AutoDock Vina (PyRx). The docking process involves the prediction of ligand orientation and conformation inside the targeted binding sites of the protein. The main purpose of docking is to find the protein-ligand complexes with the lowest binding affinity scores, which correspond to the strongest predicted binding. The binding affinities (kcal/mol) from the docking analyses of the selected inhibitors are shown in Figure 1.

Pharmacophore analysis is also a vital part of drug design. The pharmacophore generated by LigandScout for the selected inhibitors of EGFR involved in breast cancer showed four main features: hydrophobic group, hydrogen bond donor, hydrogen bond acceptor, and aromatic ring. In the pharmacophore models of the lead compounds shown in Figure 2, the red arrow represents HBA, the blue arrow represents the aromatic ring, the yellow arrow represents the hydrophobic feature, and the green arrow represents HBD. Minimum and maximum distances were calculated between three pharmacophore features. BMS-599626 and BMS-599626 hydrochloride were found to be the best compounds, with the lowest binding affinity and the maximum number of molecular interactions in their binding pockets (Figure 3).

The distances between the aromatic ring and HBD range from 4.247 to 5.527, those between the aromatic ring and HBA from 6.557 to 7.557, and those between HBA and HBD from 3.069 to 4.069, as shown in Figure 4.
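The reported ranges can serve as a simple filter for candidate compounds. The sketch below hard-codes the three ranges given above and tests whether a candidate's feature-to-feature distances fall inside all of them; the function name `fits_triangle` is hypothetical, and the distances are assumed to be in the same units as the reported values.

```python
# Distance ranges reported in the text (AR = aromatic ring).
TRIANGLE = {
    ("AR", "HBD"): (4.247, 5.527),
    ("AR", "HBA"): (6.557, 7.557),
    ("HBA", "HBD"): (3.069, 4.069),
}


def fits_triangle(distances, triangle=TRIANGLE):
    """True if every required feature pair is present and within its range."""
    for pair, (lo, hi) in triangle.items():
        d = distances.get(pair)
        if d is None or not (lo <= d <= hi):
            return False
    return True
```

A candidate that fails this check is not necessarily inactive; the filter only tests agreement with the geometry of the compounds used to build the triangle.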


Figure 2. Pharmacophore models of the compounds that demonstrated ideal docking results with the EGFR protein.


Figure 3. The molecular interactions formed by the best hit compounds with EGFR Protein. (a) bms599626; (b) bms599626 hydrochloride.

Figure 4. Representation of Pharmacophore triangle: Pharmacophore Triangle based on the minimum and maximum distance ranges between Pharmacophore features like hydrogen bond acceptor, hydrogen bond donor, and aromatic ring.

4. Discussion

In the present work, docking was performed to determine the binding affinity of 39 ligands for the protein and to identify how the ligands interact with their surrounding residues after docking, and toxicity analyses were performed to check the ADMET properties of all compounds. Selected standard compounds were used to identify the pharmacophore and to verify the results for the selected 39 anticancer compounds, and the 3D pharmacophore was generated to find, for the first time, the minimum and maximum distance ranges between pharmacophoric features. In this research, we followed the techniques and tools for docking, pharmacophore identification and generation, and 3D pharmacophore generation that were reported previously in different research articles. The selection of compounds for the data set is the most important step in pharmacophore modeling. A set of 39 compounds was selected from the literature to determine their binding affinity for the protein, and the arrangement of the chemical features of the protein indicates its drug activity towards the selected compounds.

The pharmacophore model of the selected compounds involves a hydrogen bond donor, hydrogen bond acceptor, aromatic ring, and hydrophobic group; the main focus is on the chemical features of the ligands that enhance their binding affinity for the target protein EGFR. Docking of the selected compounds was performed with the PyRx tool, and the pharmacophore model for the selected compounds was generated with LigandScout. Essentially, common features such as hydrogen bond acceptors, hydrogen bond donors, hydrophobic regions, and aromatic rings were identified. The selected ligands were docked before pharmacophore generation. The docking approach helps to determine the binding affinity between protein and ligands, and the pharmacophore features help to identify better anticancer agents.

The summarized pharmacophore of the respective compounds shows how many pharmacophore features are present. Based on this information, a distance triangle was constructed from features such as the hydrophobic feature, aromatic ring, hydrogen bond acceptors, and hydrogen bond donor, and the distance range for each side of the pharmacophore triangle is given. The distances were calculated with the Chimera software. The distances between the aromatic ring and HBD range from 4.247 to 5.527, those between the aromatic ring and HBA from 6.557 to 7.557, and those between HBA and HBD from 3.069 to 4.069.

5. Conclusion

This study aimed to find an effective inhibitor of EGFR through docking and pharmacophore modeling. The findings indicate that the selected compounds stabilize the EGFR protein. These lead compounds can be suggested for further drug design and experimentation on breast cancer. This research might provide a new basis for controlling the irregularities and cancer caused by EGFR. These lead compounds may be helpful to the scientific community and may also aid the identification of new compounds against breast cancer.


The authors are thankful to Ahsanullah Unar for valuable efforts on this manuscript.


Epidermal growth factor receptor (EGFR),

Hydrogen bond acceptor (HBA),

Hydrogen bond donor (HBD),

Virtual Screening (VS),

Absorption, Distribution, Metabolism, Excretion, Toxicity (ADMET).

Cite this paper: Khalid, I. , Jafar, T. , Unar, A. , Rasool, R. , Sahar, A. and Rashid, H. (2021) In-Silico Identification of Anticancer Compounds; Ligand-Based Pharmacophore Approach against EGFR Involved in Breast Cancer. Advances in Breast Cancer Research, 10, 120-132. doi: 10.4236/abcr.2021.103010.

[1]   Munir, A., et al. (2016) Structure-Based Pharmacophore Modeling, Virtual Screening and Molecular Docking for the Treatment of ESR1 Mutations in Breast Cancer. Drug Designing, 5, Article ID: 1000137.

[2]   Chanihoon, G.Q., et al. (2021) An AAS Dependent Method for Quantitative Essential Elements Analysis of Pakistani Female Breast Cancer Blood and Serum Samples. Advances in Breast Cancer Research, 10, 44-59.

[3]   Azim, H.A. and Ibrahim, A.S. (2014) Breast Cancer in Egypt, China and Chinese: Statistics and Beyond. Journal of Thoracic Disease, 6, 864.

[4]   Mustafa, M., et al. (2016) Breast Cancer: Detection Markers, Prognosis, and Prevention. IOSR Journal of Dental and Medical Sciences (IOSR-JDMS), 15, 73-80.

[5]   Aftab, A., et al. (2021) Computational Analysis of Cyclin D1 Gene SNPs and Association with Breast Cancer. Bioscience Reports, 41, BSR20202269.

[6]   Mathers, C., Fat, D.M. and Boerma, J.T. (2008) The Global Burden of Disease: 2004 Update. World Health Organization, Geneva.

[7]   Asif, H.M., et al. (2014) Prevalence, Risk Factors and Disease Knowledge of Breast Cancer in Pakistan. Asian Pacific Journal of Cancer Prevention, 15, 4411-4416.

[8]   Menhas, R. and Umer, S. (2015) Breast Cancer among Pakistani Women. Iranian Journal of Public Health, 44, 586-587.

[9]   Nicholson, R.I., Gee, J.M.W. and Harper, M.E. (2001) EGFR and Cancer Prognosis. European Journal of Cancer, 37, 9-15.

[10]   Yano, S., et al. (2002) Distribution and Function of EGFR in Human Tissue and the Effect of EGFR Tyrosine Kinase Inhibition. Anticancer Research, 23, 3639-3650.

[11]   Herbst, R.S. (2004) Review of Epidermal Growth Factor Receptor Biology. International Journal of Radiation Oncology Biology Physics, 59, S21-S26.

[12]   Park, K., et al. (2007) EGFR Gene and Protein Expression in Breast Cancers. European Journal of Surgical Oncology (EJSO), 33, 956-960.

[13]   Klijn, J., et al. (1992) The Clinical Significance of Epidermal Growth Factor Receptor (EGF-R) in Human Breast Cancer: A Review on 5232 Patients. Endocrine Reviews, 13, 3-17.

[14]   Berman, H.M. (2008) The Protein Data Bank: A Historical Perspective. Acta Crystallographica Section A: Foundations of Crystallography, 64, 88-95.

[15]   Gasteiger, E., et al. (2005) Protein Identification and Analysis Tools on the ExPASy Server. Springer, Berlin.

[16]   Pettersen, E.F., et al. (2004) UCSF Chimera—A Visualization System for Exploratory Research and Analysis. Journal of Computational Chemistry, 25, 1605-1612.

[17]   Huang, S.-Y. and Zou, X. (2010) Advances and Challenges in Protein-Ligand Docking. International Journal of Molecular Sciences, 11, 3016-3034.

[18]   Wang, Y., et al. (2013) PubChem Bioassay: 2014 Update. Nucleic Acids Research, 42, D1075-D1082.

[19]   Dallakyan, S. and Olson, A.J. (2015) Small-Molecule Library Screening by Docking with PyRx. Chemical Biology: Methods and Protocols, 1263, 243-250.

[20]   Trott, O. and Olson, A.J. (2010) AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function, Efficient Optimization, and Multithreading. Journal of Computational Chemistry, 31, 455-461.

[21]   Salentin, S., et al. (2015) PLIP: Fully Automated Protein-Ligand Interaction Profiler. Nucleic Acids Research, 43, W443-W447.

[22]   Cheng, F., et al. (2012) admetSAR: A Comprehensive Source and Free Tool for Assessment of Chemical ADMET Properties. Journal of Chemical Information and Modeling, 52, 3099-3105.

[23]   Dali, Y., Abbasi, S.M., Khan, S.A.F., Larra, S.A., Rasool, R., Ain, Q.T. and Jafar, T.H. (2019) Computational Drug Design and Exploration of Potent Phytochemicals against Cancer through in Silico Approaches. Biomedical Letters, 5, 21-26.

[24]   Malik, A., et al. (2018) In Silico and in Vivo Characterization of Cabralealactone, Solasodin and Salvadorin in a Rat Model: Potential Anti-Inflammatory Agents. Drug Design, Development and Therapy, 12, 1431-1443.

[25]   Wolber, G. and Langer, T. (2005) LigandScout: 3-D Pharmacophores Derived from Protein-Bound Ligands and Their Use as Virtual Screening Filters. Journal of Chemical Information and Modeling, 45, 160-169.

[26]   Haseeb, M., et al. (2014) Ligand Based Pharmacophore Development for Colorectal Cancer Drugs. Professional Medical Journal, 21, 856-863.