1
Pang W, Chen M, Qin Y. Prediction of anticancer drug sensitivity using an interpretable model guided by deep learning. BMC Bioinformatics 2024; 25:182. [PMID: 38724920 PMCID: PMC11080240 DOI: 10.1186/s12859-024-05669-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 01/22/2024] [Indexed: 05/13/2024] Open
Abstract
BACKGROUND The prediction of drug sensitivity plays a crucial role in improving the therapeutic effect of drugs. However, testing the effectiveness of drugs is challenging due to the complex mechanism of drug reactions and the lack of interpretability in most machine learning and deep learning methods. Therefore, it is imperative to establish an interpretable model that receives various cell line and drug feature data to learn drug response mechanisms and achieve stable predictions between available datasets. RESULTS This study proposes a new and interpretable deep learning model, DrugGene, which integrates gene expression, gene mutation, gene copy number variation of cancer cells, and chemical characteristics of anticancer drugs to predict their sensitivity. This model comprises two different branches of neural networks, where the first involves a hierarchical structure of biological subsystems that uses the biological processes of human cells to form a visual neural network (VNN) and an interpretable deep neural network for human cancer cells. DrugGene receives genotype input from the cell line and detects changes in the subsystem states. We also employ a traditional artificial neural network (ANN) to capture the chemical structural features of drugs. DrugGene generates final drug response predictions by combining VNN and ANN and integrating their outputs into a fully connected layer. The experimental results using drug sensitivity data extracted from the Cancer Drug Sensitivity Genome Database and the Cancer Treatment Response Portal v2 reveal that the proposed model is better than existing prediction methods. Therefore, our model achieves higher accuracy, learns the reaction mechanisms between anticancer drugs and cell lines from various features, and interprets the model's predicted results. 
CONCLUSIONS Our method utilizes biological pathways to construct neural networks, which can use genotypes to monitor changes in the state of network subsystems, thereby interpreting the prediction results in the model and achieving satisfactory prediction accuracy. This will help explore new directions in cancer treatment. More available code resources can be downloaded for free from GitHub ( https://github.com/pangweixiong/DrugGene ).
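The two-branch design described in this abstract can be illustrated with a minimal sketch. It is not the DrugGene implementation (that is in the GitHub repository cited above); the two-subsystem hierarchy, the binary genotype encoding, the three-bit drug feature vector, and all weights below are hypothetical placeholders chosen only to show how subsystem states stay inspectable while a drug branch is merged in at the end.

```python
import math

# Hypothetical toy hierarchy: each subsystem reads a fixed set of gene indices.
SUBSYSTEM_GENES = {"dna_repair": [0, 1], "cell_cycle": [2, 3]}

# Hypothetical fixed (weights, biases) per layer; a real model would learn these.
PARAMS = {
    "dna_repair": ([[0.5, -0.3]], [0.0]),
    "cell_cycle": ([[0.4, 0.7]], [0.0]),
    "root": ([[0.8, 0.4]], [0.1]),
    "drug": ([[0.2, -0.1, 0.3]], [0.0]),
    "output": ([[0.6, -0.5]], [0.05]),
}


def dense(inputs, weights, biases, activation=math.tanh):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def vnn_branch(genotype):
    """Pass gene-level input up the hierarchy, keeping every subsystem state."""
    states = {name: dense([genotype[g] for g in genes], *PARAMS[name])
              for name, genes in SUBSYSTEM_GENES.items()}
    child_states = [v for name in SUBSYSTEM_GENES for v in states[name]]
    states["root"] = dense(child_states, *PARAMS["root"])
    return states


def ann_branch(drug_features):
    """Plain dense embedding of the drug's chemical features."""
    return dense(drug_features, *PARAMS["drug"])


def predict(genotype, drug_features):
    """Concatenate both branches and apply a final linear layer."""
    states = vnn_branch(genotype)
    combined = states["root"] + ann_branch(drug_features)
    (response,) = dense(combined, *PARAMS["output"], activation=lambda z: z)
    return response, states
```

Calling `predict([1, 0, 0, 1], [1, 0, 1])` returns both the predicted response and the per-subsystem states. Flipping gene 0 changes the `dna_repair` state but leaves `cell_cycle` untouched, which is the kind of subsystem-level readout the abstract refers to as interpretability.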
Affiliation(s)
- Weixiong Pang
  - College of Information Technology, Shanghai Ocean University, Hucheng Ring Road, Shanghai, China
  - Key Laboratory of Fisheries Information Ministry of Agriculture, Shanghai, China
- Ming Chen
  - College of Information Technology, Shanghai Ocean University, Hucheng Ring Road, Shanghai, China
  - Key Laboratory of Fisheries Information Ministry of Agriculture, Shanghai, China
- Yufang Qin
  - College of Information Technology, Shanghai Ocean University, Hucheng Ring Road, Shanghai, China
  - Key Laboratory of Fisheries Information Ministry of Agriculture, Shanghai, China
2
Yu CC, Raj N, Chu JW. Statistical Learning of Protein Elastic Network from Positional Covariance Matrix. Comput Struct Biotechnol J 2023; 21:2524-2535. [PMID: 37095762 PMCID: PMC10121796 DOI: 10.1016/j.csbj.2023.03.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 03/20/2023] [Accepted: 03/20/2023] [Indexed: 03/30/2023] Open
Abstract
Positional fluctuation and covariance during protein dynamics are key observables for understanding the molecular origin of biological functions. A frequently employed potential energy function for describing protein structural variation at the coarse-grained level is the elastic network model (ENM). A long-standing issue in biomolecular simulation is thus the parametrization of ENM spring constants from the components of the positional covariance matrix (PCM). Based on sensitivity analysis of the PCM, the direct-coupling statistics of each spring, which is a specific combination of position fluctuation and covariance, is found to exhibit a prominent signal of parameter dependence. This finding provides the basis for devising the objective function and the scheme of running through the effective one-dimensional optimization of every spring by self-consistent iteration. Formal derivation of the positional covariance statistical learning (PCSL) method also motivates the necessary data regularization for stable calculations. Robust convergence of PCSL is achieved when taking an all-atom molecular dynamics trajectory or an ensemble of homologous structures as input data. The PCSL framework can also be generalized with mixed objective functions to capture specific properties such as the residue flexibility profile. Such physical chemistry-based statistical learning thus provides a useful platform for integrating the mechanical information encoded in various experimental or computational data.
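To make the input to such a method concrete, the sketch below computes a positional covariance matrix from an ensemble of frames. It is not the PCSL algorithm. Two simplifications are assumed: each site has a single scalar coordinate per frame, and the pairwise statistic shown (the mean-squared relative displacement, var_i + var_j - 2 cov_ij) is only one natural combination of fluctuation and covariance, which may differ from the paper's direct-coupling statistic. The function names are hypothetical.

```python
def positional_covariance(frames):
    """Covariance of site positions over an ensemble.

    frames: list of frames, each a list with one scalar coordinate per site.
    Returns an n_sites x n_sites matrix (population covariance).
    """
    n_frames = len(frames)
    n_sites = len(frames[0])
    mean = [sum(f[i] for f in frames) / n_frames for i in range(n_sites)]
    # Deviation of every site from its mean position, per frame.
    dev = [[f[i] - mean[i] for i in range(n_sites)] for f in frames]
    return [[sum(d[i] * d[j] for d in dev) / n_frames for j in range(n_sites)]
            for i in range(n_sites)]


def relative_fluctuation(cov, i, j):
    """Mean-squared fluctuation of the displacement between sites i and j."""
    return cov[i][i] + cov[j][j] - 2.0 * cov[i][j]
```

In a harmonic picture, a stiff spring between two sites keeps their relative fluctuation small, which is why a per-pair statistic of this kind carries information about the spring constant.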
3
Chen L, Gong W, Han Z, Zhou W, Yang S, Li C. Key Residues in δ Opioid Receptor Allostery Explored by the Elastic Network Model and the Complex Network Model Combined with the Perturbation Method. J Chem Inf Model 2022; 62:6727-6738. [PMID: 36073904 DOI: 10.1021/acs.jcim.2c00513] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/07/2023]
Abstract
Opioid receptors, a class of G protein-coupled receptors (GPCRs), mainly mediate an analgesic response via allosterically transducing the signal of endogenous ligand binding in the extracellular domain to couple to effector proteins in the intracellular domain. The δ opioid receptor (DOP) is associated with emotional control in addition to pain control, which makes it an attractive therapeutic target. However, its allosteric mechanism and the key residues responsible for structural stability and signal communication are not completely clear. Here we utilize the Gaussian network model (GNM) and amino acid network (AAN) combined with perturbation methods to explore these issues. The constructed fcfGNMMD, where the force constants are optimized with the inverse covariance estimation based on the correlated fluctuations from the available DOP molecular dynamics (MD) ensemble, shows a better performance than traditional GNM in reproducing residue fluctuations and cross-correlations and in capturing functional low-frequency modes. Additionally, fcfGNMMD can implicitly account for environmental effects to some extent. The lowest mode clearly divides DOP segments and identifies the two sodium ion (important allosteric regulator) binding coordination shells, and from the fastest modes, the key residues important for structure stabilization are identified. Using fcfGNMMD combined with a dynamic perturbation-response method, we explore the key residues related to the sodium ion binding. Interestingly, we identify not only the key residues in sodium ion binding shells but also the ones far away from the perturbation sites, which are involved in binding with DOP ligands, suggesting the possible long-range allosteric modulation of sodium binding for the ligand binding to DOP. Furthermore, utilizing the weighted AAN combined with attack perturbations, we identify the key residues for allosteric communication. This work helps strengthen the understanding of the allosteric communication mechanism in δ opioid receptor and can provide valuable information for drug design.
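For readers unfamiliar with the baseline this abstract compares against, here is a minimal sketch of a traditional GNM with uniform springs. It is not the fcfGNMMD model, whose force constants are fitted to MD data. The 7.0 Å cutoff and the function names are assumptions for illustration, and the pseudo-inverse uses the identity Γ⁺ = (Γ + J/n)⁻¹ − J/n, which holds only for a connected network (J is the all-ones matrix).

```python
import math


def kirchhoff(coords, cutoff=7.0):
    """Connectivity (Kirchhoff) matrix: -1 for pairs within cutoff, degree on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (fine for small dense matrices)."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; is the network connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def gnm_covariance(gamma):
    """Pseudo-inverse of the Kirchhoff matrix, proportional to residue covariances."""
    n = len(gamma)
    shifted = [[gamma[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inv = invert(shifted)
    return [[inv[i][j] - 1.0 / n for j in range(n)] for i in range(n)]
```

For three nodes placed 5 Å apart on a line, the diagonal of `gnm_covariance(kirchhoff(coords))` is larger for the two end nodes than for the middle one, reflecting that less-connected sites fluctuate more. Off-diagonal entries give the cross-correlations mentioned in the abstract, up to a constant factor.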
Affiliation(s)
- Lei Chen
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Zhongjie Han
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Wenxue Zhou
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Shuang Yang
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
4
Gong W, Wee J, Wu MC, Sun X, Li C, Xia K. Persistent spectral simplicial complex-based machine learning for chromosomal structural analysis in cellular differentiation. Brief Bioinform 2022; 23:6583209. [PMID: 35536545 DOI: 10.1093/bib/bbac168] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/21/2022] [Revised: 04/12/2022] [Accepted: 03/13/2022] [Indexed: 11/13/2022] Open
Abstract
The three-dimensional (3D) chromosomal structure plays an essential role in all DNA-templated processes, including gene transcription, DNA replication and other cellular processes. Although chromosome conformation capture (3C) methods such as Hi-C can generate chromosomal contact data that characterize genome-wide chromosomal structural properties, an understanding of the 3D nature of the genome based on Hi-C data is still lacking. Here, we propose a persistent spectral simplicial complex (PerSpectSC) model to describe Hi-C data for the first time. Specifically, a filtration process is introduced to generate a series of nested simplicial complexes at different scales. For each of these simplicial complexes, its spectral information can be calculated from the corresponding Hodge Laplacian matrix. The PerSpectSC model describes the persistence and variation of the spectral information of the nested simplicial complexes during the filtration process. Different from all previous models, our PerSpectSC-based features provide a quantitative global-scale characterization of chromosome structures and topology. Our descriptors can successfully classify cell types and also cellular differentiation stages for all the 24 types of chromosomes simultaneously. In particular, persistent minimum best characterizes cell types and Dim (1) persistent multiplicity best characterizes cellular differentiation. These results demonstrate the great potential of our PerSpectSC-based models in polymeric data analysis.
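The filtration-plus-Laplacian idea can be sketched in its simplest case. The code below is not the PerSpectSC pipeline: it handles only dimension 0, uses a toy distance matrix rather than Hi-C contacts, and the function names are hypothetical. At each filtration value it builds the 0-th Hodge Laplacian L0 = B1 B1ᵀ from the vertex-edge boundary matrix and counts its zero eigenvalues, which equals the number of connected components.

```python
from fractions import Fraction
from itertools import combinations


def hodge_laplacian_0(dist, eps):
    """L0 = B1 * B1^T for the complex whose edges have length <= eps."""
    n = len(dist)
    edges = [(i, j) for i, j in combinations(range(n), 2) if dist[i][j] <= eps]
    # Boundary matrix B1: rows are vertices, columns are oriented edges i -> j.
    b1 = [[0] * len(edges) for _ in range(n)]
    for k, (i, j) in enumerate(edges):
        b1[i][k] = -1
        b1[j][k] = 1
    return [[sum(b1[r][k] * b1[c][k] for k in range(len(edges)))
             for c in range(n)] for r in range(n)]


def rank(matrix):
    """Exact matrix rank by Gaussian elimination over rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(m), (len(m[0]) if m else 0)
    rk = 0
    for col in range(cols):
        pivot = next((r for r in range(rk, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, rows):
            f = m[r][col] / m[rk][col]
            if f != 0:
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def zero_multiplicity(dist, eps):
    """Multiplicity of eigenvalue 0 of L0 (number of connected components)."""
    l0 = hodge_laplacian_0(dist, eps)
    return len(l0) - rank(l0)
```

Evaluating `zero_multiplicity` over an increasing sequence of `eps` values traces how components merge along the filtration. Higher-dimensional Laplacians, and the nonzero part of the spectrum such as the smallest positive eigenvalue, would need triangle and higher simplices plus an eigenvalue solver, which this sketch leaves out.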
Affiliation(s)
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- JunJie Wee
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Min-Chun Wu
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Xiaohan Sun
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
5
Deng X, Wang S, Han Z, Gong W, Liu Y, Li C. Dynamics of binding interactions of TDP-43 and RNA: An equally weighted multiscale elastic network model study. Proteins 2021; 90:589-600. [PMID: 34599611 DOI: 10.1002/prot.26255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/03/2021] [Revised: 09/15/2021] [Accepted: 09/21/2021] [Indexed: 01/03/2023]
Abstract
Transactive response DNA binding protein 43 (TDP-43), an alternative-splicing regulator, can specifically bind long UG-rich RNAs, associated with a range of neurodegenerative diseases. Upon binding RNA, TDP-43 undergoes a large conformational change in which two RNA recognition motifs (RRMs) connected by a long linker are rearranged, strengthening the binding affinity of TDP-43 for RNA. We extend the equally weighted multiscale elastic network model (ewmENM), including its Gaussian network model (ewmGNM) and anisotropic network model (ewmANM), with the multiscale effect of interactions considered, to the characterization of the dynamics of binding interactions of TDP-43 and RNA. The results reveal that, upon RNA binding, a loss of flexibility occurs in TDP-43's loop3 segments, which are rich in positively charged residues, and in the highly flexible C-terminal, suggesting their roles in anchoring RNA, induced fit and conformational adjustment in recognizing RNA. Additionally, based on movement coupling analyses, it is found that RNA binding strengthens the interactions among intra-RRM β-sheets and between RRMs partially through the linker's mediating role, which stabilizes the RNA binding interface, facilitating RNA binding efficiency. In addition, utilizing our proposed thermodynamic cycle method combined with ewmGNM, we identify the key residues for RNA binding whose perturbations induce a large change in binding free energy. We identify not only the residues important for specific binding, but also the ones critical for the conformational rearrangement between RRMs. Furthermore, molecular dynamics simulations are also performed to validate and further interpret the ENM-based results. The study demonstrates a useful avenue to utilize ewmENM to investigate the dynamic characteristics of protein-RNA interactions.
Affiliation(s)
- Xueqing Deng
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Shihao Wang
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Zhongjie Han
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Yang Liu
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China
6
Zhang S, Gong W, Han Z, Liu Y, Li C. Insight into Shared Properties and Differential Dynamics and Specificity of Secretory Phospholipase A2 Family Members. J Phys Chem B 2021; 125:3353-3363. [PMID: 33780247 DOI: 10.1021/acs.jpcb.1c01315] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
Understanding generic mechanisms of functions shared by the secretory phospholipase A2 (sPLA2) family involved in lipid metabolism and cell signaling and the molecular basis of function specificity for family members is an intriguing but challenging problem for biologists. Here, we explore the issue through extensive analyses using a combination of structure-based methods and bioinformatics tools on 130 sPLA2 family members. The principal component analysis of the structure ensemble reveals that the enzyme has an open-close motion which helps widen the substrate binding channel, facilitating its binding to phospholipid. Elastic network model and sequence analyses show that the residues critical for family functions, such as cysteine and catalytic residues, are highly conserved and undergo minimal movements, which is evolutionarily essential as their perturbation would impact the function, while the four residue regions involved in the association with the calcium ion/membrane are poorly conserved and show high mobility and large variations in low-to-intermediate frequency modes, which reflects the specificity of members. The analyses from perturbation response scanning also reveal that the above four regions with high sensitivity to an external perturbation are member-specific, suggesting their different roles in allosteric modulation, while the minimally sensitive residues are shared across family members and play an important role in maintaining structural stability as the folding core. This study is helpful for understanding how sequences, structures, and dynamics of sPLA2 family members evolve to ensure their common and specific functions and can provide a guide for accurate design of proteins with finely tuned activities.
Affiliation(s)
- Shan Zhang
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Zhongjie Han
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Yang Liu
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China