1
Guan R, Liu W, Li N, Cui Z, Cai R, Wang Y, Zhao C. Machine learning models based on residue interaction network for ABCG2 transportable compounds recognition. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2023; 337:122620. [PMID: 37769706 DOI: 10.1016/j.envpol.2023.122620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 09/03/2023] [Accepted: 09/25/2023] [Indexed: 10/02/2023]
Abstract
ABCG2 is one of the most important proteins in the placental transport of environmental substances, so identifying the molecules it transports is a key step in assessing the risk of placental exposure to environmental chemicals. Here, residue interaction networks (RINs) were used to explore the differences in ABCG2 binding conformations between transportable and non-transportable compounds. The RINs were treated as a special kind of quantitative data on protein conformation, reflecting not only changes in the conformation of single amino acids but also changes in the distance and interaction type between amino acids. Based on the quantitative RINs, four machine learning algorithms were applied to build classification and recognition models for 1100 compounds with the potential to be transported by ABCG2. The random forest (RF) models constructed with RINs showed the best predictive ability, with an accuracy of 0.97 on the training set and 0.96 on the test set. In conclusion, constructing residue interaction networks provides a new perspective on the quantitative characterization of protein conformation and on building prediction models for transporter molecular recognition. The ABCG2 transport recognition model based on residue interaction networks offers a possible way to screen environmental chemicals transported across the placenta.
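The abstract does not say how a RIN is turned into model input. As a minimal sketch of one plausible encoding, the Python below maps each network onto a fixed-length vector of interaction distances that a classifier such as a random forest could consume. The edge format, the `edge_key` and `rin_to_vector` helpers, the residue labels, and the distances are all hypothetical illustrations, not the authors' method or data.

```python
from typing import Dict, List, Tuple

# One interaction: (residue_a, residue_b, interaction_type, distance in angstroms).
# This tuple layout is an assumption made for illustration only.
Edge = Tuple[str, str, str, float]
Key = Tuple[str, str, str]


def edge_key(a: str, b: str, kind: str) -> Key:
    """Order the residue pair so (a, b) and (b, a) map to the same key."""
    lo, hi = sorted((a, b))
    return (lo, hi, kind)


def rin_to_vector(rin: List[Edge], vocabulary: List[Key]) -> List[float]:
    """Return one distance per vocabulary entry; 0.0 marks an absent interaction."""
    present: Dict[Key, float] = {
        edge_key(a, b, kind): dist for a, b, kind, dist in rin
    }
    return [present.get(key, 0.0) for key in vocabulary]


# Toy, made-up networks for two hypothetical ligand-bound conformations.
rin_x: List[Edge] = [("GLY25", "ALA10", "hbond", 2.9), ("SER40", "GLY25", "vdw", 3.8)]
rin_y: List[Edge] = [("ALA10", "GLY25", "hbond", 3.1)]

# The shared vocabulary fixes the column order across all compounds.
vocabulary = sorted(
    {edge_key(a, b, kind) for rin in (rin_x, rin_y) for a, b, kind, _ in rin}
)
vec_x = rin_to_vector(rin_x, vocabulary)
vec_y = rin_to_vector(rin_y, vocabulary)
```

Each compound's vector then has the same columns, so a missing interaction and a changed distance both show up as feature differences.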
Affiliation(s)
- Ruining Guan
- School of Pharmacy, Lanzhou University, Lanzhou, 730000, China
- Wencheng Liu
- School of Pharmacy, Lanzhou University, Lanzhou, 730000, China
- Ningqi Li
- School of Pharmacy, Lanzhou University, Lanzhou, 730000, China
- Zeyang Cui
- School of Information Science & Engineering, Lanzhou University, Lanzhou, 730000, China
- Ruitong Cai
- School of Pharmacy, Lanzhou University, Lanzhou, 730000, China
- Yawei Wang
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, 100085, China
- Chunyan Zhao
- School of Pharmacy, Lanzhou University, Lanzhou, 730000, China.
2
Shih ESC, Hwang MJ. NPPD: A Protein-Protein Docking Scoring Function Based on Dyadic Differences in Networks of Hydrophobic and Hydrophilic Amino Acid Residues. BIOLOGY 2015; 4:282-97. [PMID: 25811640 PMCID: PMC4498300 DOI: 10.3390/biology4020282] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/27/2014] [Accepted: 03/16/2015] [Indexed: 11/16/2022]
Abstract
Protein-protein docking (PPD) predictions usually rely on the use of a scoring function to rank docking models generated by exhaustive sampling. To rank good models higher than bad ones, a large number of scoring functions have been developed and evaluated, but the methods used for the computation of PPD predictions remain largely unsatisfactory. Here, we report a network-based PPD scoring function, the NPPD, in which the network consists of two types of network nodes, one for hydrophobic and the other for hydrophilic amino acid residues, and the nodes are connected when the residues they represent are within a certain contact distance. We showed that network parameters that compute dyadic interactions and those that compute heterophilic interactions of the amino acid networks thus constructed allowed NPPD to perform well in a benchmark evaluation of 115 PPD scoring functions, most of which, unlike NPPD, are based on some sort of protein-protein interaction energy. We also showed that NPPD was highly complementary to these energy-based scoring functions, suggesting that the combined use of conventional scoring functions and NPPD might significantly improve the accuracy of current PPD predictions.
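To make the network construction concrete, here is a minimal Python sketch of a two-type residue contact network: residues become nodes labelled hydrophobic or hydrophilic, pairs within a cutoff distance are connected, and edges are counted by the types they join. The hydrophobic residue set, the 6.0 Å cutoff, the use of a single representative atom per residue, and the toy coordinates are assumptions for illustration. The abstract does not give these details, and this is not the NPPD scoring function itself.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

# Assumed hydrophobic set (one-letter codes); the paper's own partition may differ.
HYDROPHOBIC = set("AVLIMFWC")

# (one-letter residue code, xyz of one representative atom)
Residue = Tuple[str, Tuple[float, float, float]]


def contact_edges(residues: List[Residue], cutoff: float) -> List[Tuple[int, int]]:
    """Connect every residue pair whose representative atoms lie within `cutoff`."""
    return [
        (i, j)
        for (i, (_, p)), (j, (_, q)) in combinations(enumerate(residues), 2)
        if math.dist(p, q) <= cutoff
    ]


def edge_type_counts(
    residues: List[Residue], edges: List[Tuple[int, int]]
) -> Dict[str, int]:
    """Count edges joining like types and edges joining unlike (heterophilic) types."""
    counts = {
        "hydrophobic-hydrophobic": 0,
        "hydrophilic-hydrophilic": 0,
        "heterophilic": 0,
    }
    for i, j in edges:
        a = residues[i][0] in HYDROPHOBIC
        b = residues[j][0] in HYDROPHOBIC
        if a and b:
            counts["hydrophobic-hydrophobic"] += 1
        elif not a and not b:
            counts["hydrophilic-hydrophilic"] += 1
        else:
            counts["heterophilic"] += 1
    return counts


# Toy coordinates, not taken from any real structure.
toy: List[Residue] = [
    ("L", (0.0, 0.0, 0.0)),
    ("V", (3.0, 0.0, 0.0)),
    ("K", (0.0, 4.0, 0.0)),
    ("D", (10.0, 10.0, 10.0)),
]
edges = contact_edges(toy, cutoff=6.0)
counts = edge_type_counts(toy, edges)
```

Counts like these are the kind of raw quantity from which type-aware network parameters could be derived; how NPPD combines them into a score is not specified in the abstract.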
Affiliation(s)
- Edward S C Shih
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei 115, Taiwan.
- Ming-Jing Hwang
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei 115, Taiwan.
3
Yan W, Sun M, Hu G, Zhou J, Zhang W, Chen J, Chen B, Shen B. Amino acid contact energy networks impact protein structure and evolution. J Theor Biol 2014; 355:95-104. [PMID: 24703984 DOI: 10.1016/j.jtbi.2014.03.032] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 02/08/2014] [Accepted: 03/21/2014] [Indexed: 01/13/2023]
Abstract
One of the most challenging tasks in structural proteomics is to understand the relationship between protein structure, biological function, and evolution. An understanding of amino acid networks based on protein topology has an important role in the study of this relationship; however, how the network parameters underlying protein topology relate to structural properties or evolutionary rate is still unknown. To investigate this further, we modeled the three-dimensional structure of proteins as amino acid contact energy networks (AACENs), with nodes representing amino acid residues and edges established according to environment-dependent residue-residue contact energies. Five other types of networks were also constructed to investigate their topological parameters and compare their effect on protein structure and evolution: (1) a random contact network (RCN), (2) a rewiring network with the same degree distribution as AACEN (RNDD), (3) long-range contact energy networks with and without the backbone connectivity (LCEN_BBs and LCENs), and (4) short-range contact energy networks (SCENs). The results indicated that the long-range link percentage and the network clustering coefficient showed a significantly positive and negative correlation, respectively, with protein secondary structure density. In addition, the long-range link percentage and network diameter had a significantly positive and negative correlation, respectively, with evolutionary rate. To our knowledge, this is the first study to identify the potential role of long-range links and network diameter in protein evolution.
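The three topological parameters named in the results can be computed on any residue network. The following Python sketch shows long-range link percentage, average clustering coefficient, and diameter on a small toy graph. The sequence-separation threshold of 4 used to call a link long-range and the toy edge list are assumptions for illustration, since the abstract does not give the paper's definitions or data.

```python
from collections import deque
from itertools import combinations
from typing import Dict, List, Set, Tuple


def build_adjacency(n: int, edges: List[Tuple[int, int]]) -> Dict[int, Set[int]]:
    """Undirected adjacency sets for residues indexed 0..n-1 in sequence order."""
    adj: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def long_range_percentage(edges: List[Tuple[int, int]], min_separation: int) -> float:
    """Percentage of links whose residues are at least `min_separation` apart in sequence."""
    long_range = sum(1 for i, j in edges if abs(i - j) >= min_separation)
    return 100.0 * long_range / len(edges)


def average_clustering(adj: Dict[int, Set[int]]) -> float:
    """Mean over nodes of the fraction of neighbour pairs that are themselves linked."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def diameter(adj: Dict[int, Set[int]]) -> int:
    """Longest shortest path, found by BFS from every node; assumes a connected graph."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


# Toy six-residue network: a backbone chain plus two extra contacts.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 2), (0, 5)]
adj = build_adjacency(6, toy_edges)
lr_pct = long_range_percentage(toy_edges, min_separation=4)
cc = average_clustering(adj)
diam = diameter(adj)
```

In this toy graph the single long-range contact (0, 5) closes a loop, which is what keeps the diameter short relative to the bare backbone chain.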
Affiliation(s)
- Wenying Yan
- Center for Systems Biology, Soochow University, No. 1, Shizi Street, Suzhou, Jiangsu 215006, China
- Maomin Sun
- Center for Systems Biology, Soochow University, No. 1, Shizi Street, Suzhou, Jiangsu 215006, China; Laboratory Animal Research Center, School of Medical, Soochow University, China
- Guang Hu
- Center for Systems Biology, Soochow University, No. 1, Shizi Street, Suzhou, Jiangsu 215006, China
- Jianhong Zhou
- Center for Systems Biology, Soochow University, No. 1, Shizi Street, Suzhou, Jiangsu 215006, China
- Wenyu Zhang
- Center for Systems Biology, Soochow University, No. 1, Shizi Street, Suzhou, Jiangsu 215006, China
- Jiajia Chen
- Center for Systems Biology, Soochow University, No. 1, Shizi Street, Suzhou, Jiangsu 215006, China; Department of Chemistry and Biological Engineering, Suzhou University of Science and Technology, Jiangsu, Suzhou 215011, China
- Biao Chen
- Center for Systems Biology, Soochow University, No. 1, Shizi Street, Suzhou, Jiangsu 215006, China
- Bairong Shen
- Center for Systems Biology, Soochow University, No. 1, Shizi Street, Suzhou, Jiangsu 215006, China.
4
Yan W, Zhou J, Sun M, Chen J, Hu G, Shen B. The construction of an amino acid network for understanding protein structure and function. Amino Acids 2014; 46:1419-39. [PMID: 24623120 DOI: 10.1007/s00726-014-1710-6] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 04/05/2013] [Accepted: 02/21/2014] [Indexed: 01/08/2023]
Abstract
Amino acid networks (AANs) are undirected networks consisting of amino acid residues and their interactions in three-dimensional protein structures. The analysis of AANs provides novel insight into protein science, and several common amino acid network properties have revealed diverse classes of proteins. In this review, we first summarize methods for the construction and characterization of AANs. We then compare software tools for the construction and analysis of AANs. Finally, we review the application of AANs for understanding protein structure and function, including the identification of functional residues, the prediction of protein folding, the analysis of protein stability and protein-protein interactions, and the understanding of communication within and between proteins.
Affiliation(s)
- Wenying Yan
- Center for Systems Biology, Soochow University, Suzhou, 215006, Jiangsu, China