1
Yuan W, Li Y, Han Z, Chen Y, Xie J, Chen J, Bi Z, Xi J. Evolutionary Mechanism Based Conserved Gene Expression Biclustering Module Analysis for Breast Cancer Genomics. Biomedicines 2024; 12:2086. [PMID: 39335599 PMCID: PMC11428256 DOI: 10.3390/biomedicines12092086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Revised: 08/23/2024] [Accepted: 09/02/2024] [Indexed: 09/30/2024] Open
Abstract
The identification of significant gene biclusters with particular expression patterns and the elucidation of functionally related genes within gene expression data has become a critical concern due to the vast amount of gene expression data generated by RNA sequencing technology. In this paper, a Conserved Gene Expression Module based on Genetic Algorithm (CGEMGA) is proposed. Breast cancer data from the TCGA database is used as the subject of this study. The p-values from Fisher's exact test are used as evaluation metrics to demonstrate the significance of different algorithms, including the Cheng and Church algorithm, CGEM algorithm, etc. In addition, the F-test is used to investigate the difference between our method and the CGEM algorithm. The computational cost of the different algorithms is further investigated by calculating the running time of each algorithm. Finally, the established driver genes and cancer-related pathways are used to validate the process. The results of 10 independent runs demonstrate that CGEMGA has a superior average p-value of 1.54 × 10⁻⁴ ± 3.06 × 10⁻⁵ compared to all other algorithms. Furthermore, our approach exhibits consistent performance across all methods. The F-test yields a p-value of 0.039, indicating a significant difference between our approach and the CGEM. Computational cost statistics also demonstrate that our approach has a significantly shorter average runtime of 5.22 × 10⁰ ± 1.65 × 10⁻¹ s compared to the other algorithms. Enrichment analysis indicates that the genes in our approach are significantly enriched for driver genes. Our algorithm is fast and robust, efficiently extracting co-expressed genes and associated co-expression condition biclusters from RNA-seq data.
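As a rough illustration of how a Fisher's exact test p-value can score a bicluster, the sketch below computes a one-sided enrichment p-value from the hypergeometric distribution. The abstract does not specify the contingency table used, so the setup here (overlap between a bicluster's genes and a reference gene set such as known driver genes), the function name `fisher_enrichment_p`, its parameters, and the example numbers are all assumptions for illustration, not the authors' implementation.

```python
from math import comb


def fisher_enrichment_p(overlap, bicluster_size, reference_size, total_genes):
    """One-sided Fisher's exact test p-value for enrichment.

    Returns P(X >= overlap), where X is the number of reference genes
    drawn when bicluster_size genes are sampled without replacement
    from total_genes genes, reference_size of which belong to the
    reference set (hypergeometric distribution).
    """
    max_overlap = min(bicluster_size, reference_size)
    denominator = comb(total_genes, bicluster_size)
    numerator = sum(
        comb(reference_size, i)
        * comb(total_genes - reference_size, bicluster_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return numerator / denominator


# Hypothetical numbers: 20,000 genes, 500 reference genes, and a
# 100-gene bicluster that contains 10 of the reference genes.
p_value = fisher_enrichment_p(10, 100, 500, 20000)
print(f"p = {p_value:.3e}")
```

A smaller p-value means the overlap is less likely to arise by chance, which is the sense in which a lower average p-value would indicate a more significant set of biclusters.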
Affiliation(s)
- Wei Yuan
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou 511436, China
- Yaming Li
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou 511436, China
- Zhengpan Han
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou 511436, China
- Yu Chen
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou 511436, China
- Jinnan Xie
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou 511436, China
- Jianguo Chen
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou 511436, China
- Zhisheng Bi
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou 511436, China
- Jianing Xi
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou 511436, China
2
Gorlov IP, Gorlova OY, Tsavachidis S, Amos CI. Strength of selection in lung tumors correlates with clinical features better than tumor mutation burden. Sci Rep 2024; 14:12732. [PMID: 38831004 PMCID: PMC11148192 DOI: 10.1038/s41598-024-63468-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 05/29/2024] [Indexed: 06/05/2024] Open
Abstract
Single nucleotide substitutions are the most common type of somatic mutation in cancer genomes. The goal of this study was to use publicly available somatic mutation data to quantify negative and positive selection in individual lung tumors and to test how the strength of directional and absolute selection is associated with clinical features. The analysis found significant variation in the strength of selection (both negative and positive) among tumors, with median selection tending to be negative even though tumors with strong positive selection also exist. Strength of selection, estimated as the density of missense mutations relative to the density of silent mutations, showed only a weak correlation with tumor mutation burden. In the "all histology together" analysis we found that absolute strength of selection was strongly correlated with all clinically relevant features analyzed. In the histology-stratified analysis, selection was strongest in small cell lung cancer. Selection in adenocarcinoma was somewhat higher than in squamous cell carcinoma. The study suggests that somatic mutation-based quantification of directional and absolute selection in individual tumors can be a useful biomarker of tumor aggressiveness.
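The selection estimate described in this abstract, missense mutation density relative to silent mutation density, can be sketched as a simple ratio. The function name, the per-tumor counts, and the numbers of missense and silent sites below are hypothetical, and the abstract does not give the normalization the authors used. Reading a ratio above 1 as positive selection and below 1 as negative selection follows the usual convention for such ratios and is assumed here rather than taken from the paper.

```python
def selection_ratio(n_missense, n_silent, missense_sites, silent_sites):
    """Density of missense mutations divided by density of silent mutations.

    Returns None when there are no silent mutations, because the ratio
    is undefined in that case.
    """
    if n_silent == 0:
        return None
    missense_density = n_missense / missense_sites
    silent_density = n_silent / silent_sites
    return missense_density / silent_density


# Hypothetical tumor: 45 missense and 30 silent mutations, with an
# assumed 3 missense sites for every silent site in the coding sequence.
ratio = selection_ratio(45, 30, missense_sites=3_000_000, silent_sites=1_000_000)
print(ratio)
```

Normalizing by the number of available sites matters because missense sites outnumber silent sites, so raw mutation counts alone would overstate the missense contribution.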
Affiliation(s)
- Ivan P Gorlov
  - Institute for Clinical and Translational Research, Baylor College of Medicine, One Baylor Plaza, Mailstop: BCM451, Houston, TX, 77030, USA
- Olga Y Gorlova
  - Institute for Clinical and Translational Research, Baylor College of Medicine, One Baylor Plaza, Mailstop: BCM451, Houston, TX, 77030, USA
- Spyridon Tsavachidis
  - Institute for Clinical and Translational Research, Baylor College of Medicine, One Baylor Plaza, Mailstop: BCM451, Houston, TX, 77030, USA
- Christopher I Amos
  - Institute for Clinical and Translational Research, Baylor College of Medicine, One Baylor Plaza, Mailstop: BCM451, Houston, TX, 77030, USA
3
Samir S. Human DNA Mutations and their Impact on Genetic Disorders. Recent Pat Biotechnol 2024; 18:288-315. [PMID: 37936448 DOI: 10.2174/0118722083255081231020055309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 07/25/2023] [Accepted: 09/18/2023] [Indexed: 11/09/2023]
Abstract
DNA is a remarkably precise medium for copying and storing biological information. It serves as a blueprint for the cellular machinery that allows cells, organs, and even whole organisms to function. The fidelity of DNA replication results from the action of hundreds of genes involved in proofreading and damage repair. All human cells can acquire genetic changes in their DNA throughout life. Genetic mutations are changes to the DNA sequence that happen during cell division, when cells make copies of themselves. Mutations in DNA can cause genetic illnesses such as cancer, or they can help humans adapt better to their environment over time. Endogenous reactive metabolites, therapeutic drugs, and an excess of environmental mutagens such as UV rays all continuously damage DNA, compromising its integrity. Such DNA mutations are illustrated by one or more chromosomal alterations and by point mutations at a single site (monogenic mutation), including deletions, duplications, and inversions. Genetic conditions can occur when an altered gene is inherited from the parents, which increases the risk of developing that particular condition, or when gene alterations arise randomly. Moreover, the symptoms of a genetic condition depend on which gene carries the mutation. Many different diseases and conditions are caused by mutations. Some of the most common genetic conditions are Alzheimer's disease, some cancers, cystic fibrosis, Down syndrome, and sickle cell disease. Interestingly, scientists are finding that DNA mutations are more common than formerly thought. This review outlines the main DNA mutations that occur along the human genome and their influence on human health. Patents pertaining to DNA mutations and genetic disorders are also discussed.
Affiliation(s)
- Safia Samir
  - Department of Biochemistry and Molecular Biology, Theodor Bilharz Research Institute, Giza, Egypt
4
Annan A, Raiss N, Lemrabet S, Elomari N, Elmir EH, Filali-Maltouf A, Medraoui L, Oumzil H. Proposal of pharmacophore model for HIV reverse transcriptase inhibitors: Combined mutational effect analysis, molecular dynamics, molecular docking and pharmacophore modeling study. Int J Immunopathol Pharmacol 2024; 38:3946320241231465. [PMID: 38296818 PMCID: PMC10832406 DOI: 10.1177/03946320241231465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 01/13/2024] [Indexed: 02/02/2024] Open
Abstract
OBJECTIVES: Antiretroviral therapy (ART) efficacy is jeopardized by the emergence of drug resistance mutations in HIV, compromising treatment effectiveness. This study aims to propose novel analogs of Efavirenz (EFV) as potential direct inhibitors of HIV reverse transcriptase, employing computer-aided drug design methodologies.
METHODS: Three key approaches were applied: a mutational profile study, molecular dynamics simulations, and pharmacophore development. The impact of mutations on the stability, flexibility, function, and affinity of target proteins, especially those associated with NRTI, was assessed. Molecular dynamics analysis identified G190E as a mutation significantly altering protein properties, potentially leading to therapeutic failure. Comparative analysis revealed that among six first-line antiretroviral drugs, EFV exhibited notably low affinity with viral reverse transcriptase, further reduced by the G190E mutation. Subsequently, a search for EFV-similar inhibitors yielded 12 promising molecules based on their affinity, forming the basis for generating a pharmacophore model.
RESULTS: Mutational analysis pinpointed G190E as a crucial mutation impacting protein properties, potentially undermining therapeutic efficacy. EFV demonstrated diminished affinity with viral reverse transcriptase, exacerbated by the G190E mutation. The search for EFV analogs identified 12 high-affinity molecules, culminating in a pharmacophore model elucidating key structural features crucial for potent inhibition.
CONCLUSION: This study underscores the significance of EFV analogs as potential inhibitors of HIV reverse transcriptase. The findings highlight the impact of mutations on drug efficacy, particularly the detrimental effect of G190E. The generated pharmacophore model serves as a pivotal reference for future drug development efforts targeting HIV, providing essential structural insights for the design of potent inhibitors based on EFV analogs identified in vitro.
Affiliation(s)
- Azzeddine Annan
  - Research Center of Plant and Microbial Biotechnologies, Biodiversity and Environment, Faculty of Sciences, Mohammed V University, Rabat, Morocco
  - Virology Department, National Reference Laboratory for HIV, Institute National of Hygiene, Rabat, Morocco
- Noureddine Raiss
  - Research Center of Plant and Microbial Biotechnologies, Biodiversity and Environment, Faculty of Sciences, Mohammed V University, Rabat, Morocco
  - Virology Department, National Reference Laboratory for HIV, Institute National of Hygiene, Rabat, Morocco
- Sanae Lemrabet
  - Virology Department, National Reference Laboratory for HIV, Institute National of Hygiene, Rabat, Morocco
- Nezha Elomari
  - Virology Department, National Reference Laboratory for HIV, Institute National of Hygiene, Rabat, Morocco
- El Harti Elmir
  - Virology Department, National Reference Laboratory for HIV, Institute National of Hygiene, Rabat, Morocco
- Abdelkarim Filali-Maltouf
  - Research Center of Plant and Microbial Biotechnologies, Biodiversity and Environment, Faculty of Sciences, Mohammed V University, Rabat, Morocco
- Leila Medraoui
  - Research Center of Plant and Microbial Biotechnologies, Biodiversity and Environment, Faculty of Sciences, Mohammed V University, Rabat, Morocco
- Hicham Oumzil
  - Virology Department, National Reference Laboratory for HIV, Institute National of Hygiene, Rabat, Morocco
  - Pedagogy and Research Unit of Microbiology, and Genomic Center of Human Pathologies, School of Medicine and Pharmacy, Mohamed V University, Rabat, Morocco
5
Li J, Hu ZQ, Yu SY, Mao L, Zhou ZJ, Wang PC, Gong Y, Su S, Zhou J, Fan J, Zhou SL, Huang XW. CircRPN2 inhibits aerobic glycolysis and metastasis in hepatocellular carcinoma. Cancer Res 2022; 82:1055-1069. [PMID: 35045986 DOI: 10.1158/0008-5472.can-21-1259] [Citation(s) in RCA: 68] [Impact Index Per Article: 34.0] [Received: 04/22/2021] [Revised: 07/05/2021] [Accepted: 01/10/2022] [Indexed: 11/16/2022]
Affiliation(s)
- Jia Li
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Zhi-Qiang Hu
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Song-Yang Yu
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Li Mao
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Zheng-Jun Zhou
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Peng-Cheng Wang
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Yu Gong
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Sheng Su
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Jian Zhou
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
  - Shanghai Key Laboratory of Organ Transplantation, Zhongshan Hospital, Fudan University, Shanghai, China
- Jia Fan
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
  - Shanghai Key Laboratory of Organ Transplantation, Zhongshan Hospital, Fudan University, Shanghai, China
- Shao-Lai Zhou
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
- Xiao-Wu Huang
  - Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University
  - Key Laboratory of Carcinogenesis and Cancer Invasion (Fudan University), Ministry of Education, Shanghai, China
  - Shanghai Key Laboratory of Organ Transplantation, Zhongshan Hospital, Fudan University, Shanghai, China
6
Gaudelet T, Day B, Jamasb AR, Soman J, Regep C, Liu G, Hayter JBR, Vickers R, Roberts C, Tang J, Roblin D, Blundell TL, Bronstein MM, Taylor-King JP. Utilizing graph machine learning within drug discovery and development. Brief Bioinform 2021; 22:bbab159. [PMID: 34013350 PMCID: PMC8574649 DOI: 10.1093/bib/bbab159] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 12/09/2020] [Revised: 04/01/2021] [Accepted: 04/05/2021] [Indexed: 12/15/2022] Open
Abstract
Graph machine learning (GML) is receiving growing interest within the pharmaceutical and biotechnology industries for its ability to model biomolecular structures and the functional relationships between them, and to integrate multi-omic datasets, amongst other data types. Herein, we present a multidisciplinary academic-industrial review of the topic within the context of drug discovery and development. After introducing key terms and modelling approaches, we move chronologically through the drug development pipeline to identify and summarize work incorporating: target identification, design of small molecules and biologics, and drug repurposing. Whilst the field is still emerging, key milestones, including repurposed drugs entering in vivo studies, suggest GML will become a modelling framework of choice within biomedical machine learning.
Affiliation(s)
- Ben Day
  - Relation Therapeutics, London, UK
  - The Computer Laboratory, University of Cambridge, UK
- Arian R Jamasb
  - Relation Therapeutics, London, UK
  - The Computer Laboratory, University of Cambridge, UK
  - Department of Biochemistry, University of Cambridge, UK
- Jian Tang
  - Mila, the Quebec AI Institute, Canada
  - HEC Montreal, Canada
- David Roblin
  - Relation Therapeutics, London, UK
  - Juvenescence, London, UK
  - The Francis Crick Institute, London, UK
- Michael M Bronstein
  - Relation Therapeutics, London, UK
  - Department of Computing, Imperial College London, UK
  - Twitter, UK
7
Chen J, Guo JT. Structural and functional analysis of somatic coding and UTR indels in breast and lung cancer genomes. Sci Rep 2021; 11:21178. [PMID: 34707120 PMCID: PMC8551294 DOI: 10.1038/s41598-021-00583-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Accepted: 10/14/2021] [Indexed: 11/24/2022] Open
Abstract
Insertions and deletions (indels) represent one of the major variation types in the human genome and have been implicated in diseases including cancer. To study the features of somatic indels in different cancer genomes, we investigated the indels from two large samples of cancer types: invasive breast carcinoma (BRCA) and lung adenocarcinoma (LUAD). Besides mapping somatic indels in both coding and untranslated regions (UTRs) from the cancer whole exome sequences, we investigated the overlap between these indels and transcription factor binding sites (TFBSs), the key elements for regulation of gene expression that have been found in both coding and non-coding sequences. Compared to germline indels in healthy genomes, somatic indels in cancer genomes include more coding indels, with more frame-shift (FS) indels than expected. LUAD has a higher ratio of deletions and higher coding and FS indel rates than BRCA. More importantly, these somatic indels in cancer genomes tend to be located in sequences with important functions, which can affect the core secondary structures of proteins, and they have a larger overlap with predicted TFBSs in coding regions than the germline indels. The somatic CDS indels are also enriched in highly conserved nucleotides when compared with germline CDS indels.
Affiliation(s)
- Jing Chen
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Jun-Tao Guo
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
8
Petrosino M, Novak L, Pasquo A, Chiaraluce R, Turina P, Capriotti E, Consalvi V. Analysis and Interpretation of the Impact of Missense Variants in Cancer. Int J Mol Sci 2021; 22:ijms22115416. [PMID: 34063805 PMCID: PMC8196604 DOI: 10.3390/ijms22115416] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 03/30/2021] [Revised: 05/03/2021] [Accepted: 05/17/2021] [Indexed: 01/10/2023] Open
Abstract
Large-scale genome sequencing has allowed the identification of a massive number of genetic variations, whose impact on human health is still unknown. In this review we analyze, by an in silico-based strategy, the impact of missense variants on cancer-related genes, whose effect on protein stability and function was experimentally determined. We collected a set of 164 variants from 11 proteins to analyze the impact of missense mutations at structural and functional levels, and to assess the performance of state-of-the-art methods (FoldX and Meta-SNP) for predicting protein stability change and pathogenicity. The result of our analysis shows that a combination of experimental data on protein stability and in silico pathogenicity predictions allowed the identification of a subset of variants with a high probability of having a deleterious phenotypic effect, as confirmed by the significant enrichment of the subset in variants annotated in the COSMIC database as putative cancer-driving variants. Our analysis suggests that the integration of experimental and computational approaches may help to evaluate the risk for complex disorders and to develop more effective treatment strategies.
Affiliation(s)
- Maria Petrosino
  - Dipartimento Scienze Biochimiche “A. Rossi Fanelli”, Sapienza University of Rome, 00185 Roma, Italy
- Leonore Novak
  - Dipartimento Scienze Biochimiche “A. Rossi Fanelli”, Sapienza University of Rome, 00185 Roma, Italy
- Alessandra Pasquo
  - ENEA CR Frascati, Diagnostics and Metrology Laboratory FSN-TECFIS-DIM, 00044 Frascati, Italy
- Roberta Chiaraluce
  - Dipartimento Scienze Biochimiche “A. Rossi Fanelli”, Sapienza University of Rome, 00185 Roma, Italy
- Paola Turina
  - Dipartimento di Farmacia e Biotecnologie (FaBiT), University of Bologna, 40126 Bologna, Italy
- Emidio Capriotti
  - Dipartimento di Farmacia e Biotecnologie (FaBiT), University of Bologna, 40126 Bologna, Italy
  - Corresponding author
- Valerio Consalvi
  - Dipartimento Scienze Biochimiche “A. Rossi Fanelli”, Sapienza University of Rome, 00185 Roma, Italy
  - Corresponding author
9
Sanavia T, Birolo G, Montanucci L, Turina P, Capriotti E, Fariselli P. Limitations and challenges in protein stability prediction upon genome variations: towards future applications in precision medicine. Comput Struct Biotechnol J 2020; 18:1968-1979. [PMID: 32774791 PMCID: PMC7397395 DOI: 10.1016/j.csbj.2020.07.011] [Citation(s) in RCA: 74] [Impact Index Per Article: 18.5] [Received: 03/15/2020] [Revised: 07/10/2020] [Accepted: 07/14/2020] [Indexed: 12/13/2022] Open
Abstract
Protein stability predictions are becoming essential in medicine to develop novel immunotherapeutic agents and for drug discovery. Despite the large number of computational approaches for predicting the protein stability upon mutation, there are still critical unsolved problems: 1) the limited number of thermodynamic measurements for proteins provided by current databases; 2) the large intrinsic variability of ΔΔG values due to different experimental conditions; 3) biases in the development of predictive methods caused by ignoring the anti-symmetry of ΔΔG values between mutant and native protein forms; 4) over-optimistic prediction performance, due to sequence similarity between proteins used in training and test datasets. Here, we review these issues, highlighting new challenges required to improve current tools and to achieve more reliable predictions. In addition, we provide a perspective of how these methods will be beneficial for designing novel precision medicine approaches for several genetic disorders caused by mutations, such as cancer and neurodegenerative diseases.
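Point 3 in this abstract refers to the thermodynamic requirement that the stability change of a mutation and of its reverse mutation cancel, ΔΔG(A→B) = −ΔΔG(B→A). One way to check a predictor against this, sketched below, is to average the sum of the predicted direct and inverse values, which is zero for a perfectly anti-symmetric method. The function name and the sample values are hypothetical, and this particular bias measure is an assumption for illustration rather than something specified in the abstract.

```python
def antisymmetry_bias(ddg_direct, ddg_inverse):
    """Mean of (direct + inverse) / 2 over paired predictions.

    ddg_direct[i] is the predicted ddG for a mutation A->B and
    ddg_inverse[i] is the prediction for the reverse mutation B->A.
    A perfectly anti-symmetric predictor gives 0.
    """
    if len(ddg_direct) != len(ddg_inverse) or not ddg_direct:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(d + i for d, i in zip(ddg_direct, ddg_inverse))
    return total / (2 * len(ddg_direct))


# Hypothetical predictions in kcal/mol for three mutation pairs.
direct = [1.2, -0.5, 2.0]
inverse = [-0.8, 0.1, -1.0]
print(antisymmetry_bias(direct, inverse))
```

A value far from zero would suggest the method treats the mutant and native forms asymmetrically, which is the kind of bias the abstract attributes to ignoring anti-symmetry during method development.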
Affiliation(s)
- Tiziana Sanavia
  - Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Giovanni Birolo
  - Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Ludovica Montanucci
  - Department of Comparative Biomedicine and Food Science (BCA), University of Padova, Viale dell'Università 16, 35020 Legnaro, Italy
- Paola Turina
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126 Bologna, Italy
- Emidio Capriotti
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126 Bologna, Italy
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
10
Pandurangan AP, Blundell TL. Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning. Protein Sci 2020; 29:247-257. [PMID: 31693276 PMCID: PMC6933854 DOI: 10.1002/pro.3774] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 08/30/2019] [Revised: 10/31/2019] [Accepted: 10/31/2019] [Indexed: 02/02/2023]
Abstract
Next-generation sequencing methods have not only allowed an understanding of genome sequence variation during the evolution of organisms but have also provided invaluable information about genetic variants in inherited disease and the emergence of resistance to drugs in cancers and infectious disease. A challenge is to distinguish mutations that are drivers of disease or drug resistance from passengers that are neutral or even selectively advantageous to the organism. This requires an understanding of the impacts of missense mutations on gene expression and regulation, and on the disruption of protein function by modulating protein stability or disturbing interactions with proteins, nucleic acids, small molecule ligands, and other biological molecules. Experimental approaches to understanding differences between wild-type and mutant proteins are most accurate but are also time-consuming and costly. Computational tools used to predict the impacts of mutations can provide useful information more quickly. Here, we focus on two widely used structure-based approaches, originally developed in the Blundell lab: site-directed mutator (SDM), a statistical approach to analyze amino acid substitutions, and mutation cutoff scanning matrix (mCSM), which uses graph-based signatures to represent the wild-type structural environment and machine learning to predict the effect of mutations on protein stability. We also describe DUET, which uses machine learning to combine the two approaches. We discuss briefly the development of mCSM for understanding the impacts of mutations on interfaces with other proteins, nucleic acids, and ligands, and we exemplify the wide application of these approaches to understand human genetic disorders and drug resistance mutations relevant to cancer and mycobacterial infections.
STATEMENT FOR A BROADER AUDIENCE: Genetic or somatic changes in genes can lead to mutations in human proteins, which give rise to genetic disorders or cancer, or to genes of pathogens leading to drug resistance. Computer software described here, using statistical approaches or machine learning, uses the information from genome sequencing of humans and pathogens, together with experimental or modeled 3D structures of gene products, the proteins, to predict impacts of mutations in genetic disease, cancer and drug resistance.
Affiliation(s)
- Arun Prasad Pandurangan
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
  - MRC Laboratory of Molecular Biology, Cambridge, UK
- Tom L. Blundell
  - Department of Biochemistry, University of Cambridge, Cambridge, UK