1
Akhavan-Safar M, Teimourpour B, Nowzari-Dalini A. A network-based method for detecting cancer driver gene in transcriptional regulatory networks using the structure analysis of weighted regulatory interactions. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220127094224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background:
The identification of genes that instigate cell anomalies and cause cancer in humans is an important field in oncology research. Abnormalities in these genes are transferred to other genes in the cell, disrupting its normal functionality. Such genes are known as cancer driver genes (CDGs). Various methods have been proposed for predicting CDGs, most of which are based on genomic data and computational methods. Some novel bioinformatic approaches have been developed.
Objective:
In this article, we propose a network-based algorithm, SalsaDriver (Stochastic approach for link-structure analysis to driver detection), which can calculate the receiving and influencing power of each gene using the stochastic analysis of regulatory interaction structures in gene regulatory networks.
Method:
First, regulatory networks for breast, colon, and lung cancers were constructed from gene expression data and a list of regulatory interactions, and the interaction weights were calculated from biological and topological features of the network. The weighted regulatory interactions were then analyzed with the stochastic approach for link-structure analysis, which runs two separate Markov chains on the bipartite graph derived from the main graph of the gene network. The proposed algorithm categorizes higher-ranked genes as driver genes.
Results:
The proposed algorithm was compared with 24 other computational and network tools based on the F-measure value and the number of detected CDGs. The results were validated using four valid databases. The findings of this study show that SalsaDriver outperforms other methods and can identify a significant number of driver genes not identified using other methods.
Conclusion:
The SalsaDriver network-based approach is suitable for predicting CDGs and can be used as a complementary method along with other computational tools.
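The link-structure analysis described in the Method paragraph can be sketched as follows. This is a minimal illustration of weighted SALSA on a directed regulatory network, not the SalsaDriver implementation: the function name `salsa_scores`, the toy gene names, and the edge weights are assumptions, and the abstract does not state how the weights are derived or how the two scores are combined into a final ranking. Reading the hub score as influencing power and the authority score as receiving power is likewise an interpretation.

```python
from collections import defaultdict


def salsa_scores(edges, iterations=100):
    """Weighted SALSA by power iteration.

    edges: iterable of (regulator, target, weight) with weight > 0.
    Returns (hub, authority) dicts; each set of scores sums to 1.
    """
    edges = list(edges)
    out_w = defaultdict(float)  # total outgoing weight per regulator
    in_w = defaultdict(float)   # total incoming weight per target
    for u, v, w in edges:
        out_w[u] += w
        in_w[v] += w

    hub = {u: 1.0 / len(out_w) for u in out_w}
    auth = {v: 1.0 / len(in_w) for v in in_w}

    for _ in range(iterations):
        # Authority chain: step backward along an in-edge, then forward.
        on_hub = defaultdict(float)
        for u, v, w in edges:
            on_hub[u] += auth[v] * w / in_w[v]
        new_auth = defaultdict(float)
        for u, v, w in edges:
            new_auth[v] += on_hub[u] * w / out_w[u]

        # Hub chain: step forward along an out-edge, then backward.
        on_auth = defaultdict(float)
        for u, v, w in edges:
            on_auth[v] += hub[u] * w / out_w[u]
        new_hub = defaultdict(float)
        for u, v, w in edges:
            new_hub[u] += on_auth[v] * w / in_w[v]

        auth = {v: new_auth[v] for v in in_w}
        hub = {u: new_hub[u] for u in out_w}
    return hub, auth


# Hypothetical toy network: two regulators, two targets.
toy_edges = [("TF1", "A", 1.0), ("TF1", "B", 1.0), ("TF2", "B", 2.0)]
hub, auth = salsa_scores(toy_edges)
ranked_targets = sorted(auth, key=auth.get, reverse=True)
```

On this toy network the authority scores settle at 0.25 for A and 0.75 for B, in proportion to total incoming weight, and both regulators receive a hub score of 0.5. Genes would then be ranked by score, with the higher-ranked ones treated as driver candidates.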
Affiliation(s)
- Mostafa Akhavan-Safar
- Department of Computer and Information Technology Engineering, Payame Noor University (PNU), P.O. Box, 19395-4697, Tehran, Iran
- Department of Information Technology Engineering, School of Systems and Industrial Engineering, Tarbiat Modares University (TMU), Tehran, Iran
- Babak Teimourpour
- Department of Information Technology Engineering, School of Systems and Industrial Engineering, Tarbiat Modares University (TMU), Tehran, Iran
- Abbas Nowzari-Dalini
- Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran
2
Pham VVH, Liu L, Bracken C, Goodall G, Li J, Le TD. Computational methods for cancer driver discovery: A survey. Theranostics 2021; 11:5553-5568. [PMID: 33859763 PMCID: PMC8039954 DOI: 10.7150/thno.52670] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/01/2020] [Accepted: 01/20/2021] [Indexed: 12/21/2022]
Abstract
Identifying the genes responsible for driving cancer is of critical importance for directing treatment. Accordingly, multiple computational tools have been developed to facilitate this task. Due to the different methods employed by these tools, different data considered by the tools, and the rapidly evolving nature of the field, the selection of an appropriate tool for cancer driver discovery is not straightforward. This survey seeks to provide a comprehensive review of the different computational methods for discovering cancer drivers. We categorise the methods into three groups: methods for single driver identification, methods for driver module identification, and methods for identifying personalised cancer drivers. In addition to providing a “one-stop” reference of these methods, by evaluating and comparing their performance, we also provide readers with information about the different capabilities of the methods in identifying biologically significant cancer drivers. The biologically relevant information identified by these tools can be seen through the enrichment of discovered cancer drivers in GO biological processes and KEGG pathways and through our identification of a small cancer-driver cohort that is capable of stratifying patient survival.
3
Seo H, Cho DH. Feature selection algorithm based on dual correlation filters for cancer-associated somatic variants. BMC Bioinformatics 2020; 21:486. [PMID: 33121438 PMCID: PMC7596964 DOI: 10.1186/s12859-020-03767-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/23/2019] [Accepted: 09/18/2020] [Indexed: 12/30/2022]
Abstract
Background:
Since the development of sequencing technology, an enormous amount of genetic information has been generated, and human cancer analysis using this information is drawing attention. As the effects of variants on human cancer become known, it is important to find cancer-associated variants among countless variants.
Results:
We propose a new filter-based feature selection method for extracting cancer-associated somatic variants that takes correlations in the data into account. Variants associated with both the activation and the deactivation of cancer’s characteristics are analyzed using dual correlation filters. Multiobjective optimization is used to consider the two types of variants simultaneously without redundancy. To overcome the problem of high computational complexity, we calculate a correlation-based weight to select significant variants instead of directly searching for the optimal subset of variants. The proposed algorithm is applied to the identification of melanoma metastasis and breast cancer stage, and its classification results are compared with those of a conventional single correlation filter-based method.
Conclusions:
We verified that the proposed dual correlation filter-based method can extract cancer-associated variants related to the characteristics of human cancer.
Affiliation(s)
- Hyein Seo
- School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- Dong-Ho Cho
- School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea.
4
Abstract
Cellular DNA is constantly chemically altered by exogenous and endogenous agents. As all processes of life depend on the transmission of the genetic information, multiple biological processes exist to ensure genome integrity. Chemically damaged DNA has been linked to cancer and aging; it is therefore of great interest to map DNA damage formation and repair to elucidate the distribution of damage on a genome-wide scale. While the low abundance of DNA damage and the inability to enzymatically amplify it are obstacles to genome-wide sequencing, new developments in the last few years have enabled high-resolution mapping of damaged bases. Recently, a number of DNA damage sequencing library construction strategies coupled to new data analysis pipelines allowed the mapping of specific DNA damage formation and repair at high and single nucleotide resolution. Strikingly, these advancements revealed that the distribution of DNA damage is heavily influenced by chromatin states and the binding of transcription factors. In the last seven years, these novel approaches have revealed new genomic maps of DNA damage distribution in a variety of organisms as generated by diverse chemical and physical DNA insults: oxidative stress, chemotherapeutic drugs, environmental pollutants, and sun exposure. Preferred sequences for damage formation and repair have been elucidated, thus making it possible to identify persistent weak spots in the genome as locations predicted to be vulnerable to mutation. As such, sequencing DNA damage will have an immense impact on our ability to elucidate mechanisms of disease initiation, and to evaluate and predict the efficacy of chemotherapeutic drugs.
Affiliation(s)
- Cécile Mingard
- Department of Health Sciences and Technology, ETH Zürich, Schmelzbergstrasse 9, 8092 Zürich, Switzerland.
5
Pham VVH, Liu L, Bracken CP, Goodall GJ, Long Q, Li J, Le TD. CBNA: A control theory based method for identifying coding and non-coding cancer drivers. PLoS Comput Biol 2019; 15:e1007538. [PMID: 31790386 PMCID: PMC6907873 DOI: 10.1371/journal.pcbi.1007538] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/25/2019] [Revised: 12/12/2019] [Accepted: 11/12/2019] [Indexed: 02/06/2023]
Abstract
A key task in cancer genomics research is to identify cancer driver genes. As these genes initialise and progress cancer, understanding them is critical in designing effective cancer interventions. Although there are several methods developed to discover cancer drivers, most of them only identify coding drivers. However, non-coding RNAs can regulate driver mutations to develop cancer. Hence, novel methods are required to reveal both coding and non-coding cancer drivers. In this paper, we develop a novel framework named Controllability based Biological Network Analysis (CBNA) to uncover coding and non-coding cancer drivers (i.e. miRNA cancer drivers). CBNA integrates different genomic data types, including gene expression, gene network, mutation data, and contains a two-stage process: (1) Building a network for a condition (e.g. cancer condition) and (2) Identifying drivers. The application of CBNA to the BRCA dataset demonstrates that it is more effective than the existing methods in detecting coding cancer drivers. In addition, CBNA also predicts 17 miRNA drivers for breast cancer. Some of these predicted miRNA drivers have been validated by literature and the rest can be good candidates for wet-lab validation. We further use CBNA to detect subtype-specific cancer drivers and several predicted drivers have been confirmed to be related to breast cancer subtypes. Another application of CBNA is to discover epithelial-mesenchymal transition (EMT) drivers. Of the predicted EMT drivers, 7 coding and 6 miRNA drivers are in the known EMT gene lists.
Cancer is a disease of cells in human body and it causes a high rate of deaths worldwide. There has been evidence that coding and non-coding RNAs are key players in the initialisation and progression of cancer. These coding and non-coding RNAs are considered as cancer drivers. To design better diagnostic and therapeutic plans for cancer patients, we need to know the roles of cancer drivers in cancer development as well as their regulatory mechanisms in the human body. In this study, we propose a novel framework to identify coding and non-coding cancer drivers (i.e. miRNA cancer drivers). The proposed framework is applied to the breast cancer dataset for identifying drivers of breast cancer. Comparing our method with existing methods in predicting coding cancer drivers, our method shows a better performance. Several miRNA cancer drivers predicted by our method have already been validated by literature. The predicted cancer drivers by our method could be a potential source for further wet-lab experiments to discover the causes of cancer. In addition, the proposed method can be used to detect drivers of cancer subtypes and drivers of the epithelial-mesenchymal transition in cancer.
Affiliation(s)
- Vu V. H. Pham
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
- Lin Liu
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
- Cameron P. Bracken
- Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, Australia
- Department of Medicine, The University of Adelaide, Adelaide, Australia
- Gregory J. Goodall
- Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, Australia
- Department of Medicine, The University of Adelaide, Adelaide, Australia
- Qi Long
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Jiuyong Li
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
- * E-mail: (JL); (TL)
- Thuc D. Le
- School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
- * E-mail: (JL); (TL)
6
Wang Y, Song B, Zhu L, Zhang X. Long non-coding RNA, LINC01614 as a potential biomarker for prognostic prediction in breast cancer. PeerJ 2019; 7:e7976. [PMID: 31741788 PMCID: PMC6858983 DOI: 10.7717/peerj.7976] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/30/2019] [Accepted: 10/02/2019] [Indexed: 12/16/2022]
Abstract
Background:
Dysregulated long non-coding RNAs (lncRNAs) may serve as potential biomarkers of cancers including breast cancer (BRCA). This study aimed to identify lncRNAs with strong prognostic value for BRCA.
Methods:
LncRNA expression profiles of 929 tissue samples were downloaded from the TANRIC database. We performed differential expression analysis between paired BRCA and adjacent normal tissues. Survival analysis was used to identify lncRNAs with prognostic value. Univariate and multivariate Cox regression analyses were performed to confirm the independent prognostic value of potential lncRNAs. Dysregulated signaling pathways associated with lncRNA expression were evaluated using gene set enrichment analysis.
Results:
We found that a total of 398 lncRNAs were significantly differentially expressed between BRCA and adjacent normal tissues (adjusted P value <= 0.0001 and |logFC| >= 1). Additionally, 381 potential lncRNAs were correlated with overall survival (OS) (P value < 0.05). A total of 48 lncRNAs remained when differentially expressed lncRNAs were overlapped with lncRNAs that had prognostic value. Among the 48 lncRNAs, one lncRNA (LINC01614) had stronger prognostic value and was highly expressed in BRCA tissues. LINC01614 expression was validated as an independent prognostic factor using univariate and multivariate analyses. Higher LINC01614 expression was observed in several molecular subgroups, including the estrogen receptor-positive, progesterone receptor-positive, and human epidermal growth factor receptor 2 (HER2)-positive subgroups. Also, BRCA tumors carrying a mutation in one of four genes (AOAH, CIT, HER2, and ODZ1) had higher expression of LINC01614. Higher expression of LINC01614 was positively correlated with several gene sets including TGF-β1 response, CDH1 signals and cell adhesion pathways.
Conclusions:
A novel lncRNA, LINC01614, was identified as a potential biomarker for prognosis prediction of BRCA. This study emphasized the importance of LINC01614, and further research should be focused on it.
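The two-step selection reported in the Results (a differential-expression filter, then an overlap with survival-associated lncRNAs) can be sketched as below. The thresholds are the ones quoted in the abstract; the function name `select_candidates`, the input layout, and the four lncRNA records are hypothetical placeholders, not data from the study.

```python
def select_candidates(de_results, survival_pvalues,
                      padj_max=1e-4, abs_logfc_min=1.0, surv_p_max=0.05):
    """Return lncRNAs that pass both filters, sorted by name.

    de_results: {name: (adjusted_p, logFC)} from differential expression.
    survival_pvalues: {name: p} from the survival analysis.
    """
    differentially_expressed = {
        name
        for name, (padj, logfc) in de_results.items()
        if padj <= padj_max and abs(logfc) >= abs_logfc_min
    }
    prognostic = {
        name for name, p in survival_pvalues.items() if p < surv_p_max
    }
    return sorted(differentially_expressed & prognostic)


# Hypothetical records for illustration only.
de_results = {
    "LNC_A": (1e-6, 2.3),   # passes both DE thresholds
    "LNC_B": (1e-6, 0.4),   # fold change too small
    "LNC_C": (0.01, -1.8),  # adjusted P too large
    "LNC_D": (5e-5, -1.2),  # passes both DE thresholds
}
survival_pvalues = {"LNC_A": 0.01, "LNC_B": 0.001, "LNC_C": 0.02, "LNC_D": 0.30}

candidates = select_candidates(de_results, survival_pvalues)
```

With these placeholder values only `LNC_A` survives both filters, mirroring how the study narrowed 398 differentially expressed and 381 survival-associated lncRNAs down to 48.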
Affiliation(s)
- Yaozong Wang
- Department of General Surgery, Hwa Mei Hospital (Ningbo No.2 Hospital), University of Chinese Academy of Sciences, Ningbo, China
- Baorong Song
- Department of General Surgery, Hwa Mei Hospital (Ningbo No.2 Hospital), University of Chinese Academy of Sciences, Ningbo, China
- Leilei Zhu
- Department of Radiotherapy, Shanghai East Hospital, Tongji University, Shanghai, China
- Xia Zhang
- Breast Cancer Center, Shanghai East Hospital, Tongji University, Shanghai, China
7
Nussinov R, Tsai CJ, Jang H. Why Are Some Driver Mutations Rare? Trends Pharmacol Sci 2019; 40:919-929. [PMID: 31699406 DOI: 10.1016/j.tips.2019.10.003] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 07/16/2019] [Revised: 10/09/2019] [Accepted: 10/10/2019] [Indexed: 12/13/2022]
Abstract
Understanding why driver mutations that promote cancer are sometimes rare is important for precision medicine since it would help in their identification. Driver mutations are largely discovered through their frequencies. Thus, rare mutations often escape detection. Unlike high-frequency drivers, low-frequency drivers can be tissue specific; rare drivers have extremely low frequencies. Here, we discuss rare drivers and strategies to discover them. We suggest that allosteric driver mutations shift the protein ensemble from the inactive to the active state. Rare allosteric drivers are statistically rare since, to switch the protein functional state, they cooperate with additional mutations, and these are not considered in the patient cancer-specific protein sequence analysis. A complete landscape of mutations that drive cancer will reveal tumor-specific therapeutic vulnerabilities.
Affiliation(s)
- Ruth Nussinov
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA; Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel.
- Chung-Jung Tsai
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
- Hyunbum Jang
- Computational Structural Biology Section, Basic Science Program, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
8
Collier O, Stoven V, Vert JP. LOTUS: A single- and multitask machine learning algorithm for the prediction of cancer driver genes. PLoS Comput Biol 2019; 15:e1007381. [PMID: 31568528 PMCID: PMC6786659 DOI: 10.1371/journal.pcbi.1007381] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 08/28/2018] [Revised: 10/10/2019] [Accepted: 09/04/2019] [Indexed: 12/16/2022]
Abstract
Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help find new therapeutic targets or biomarkers. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly for driver genes specific to some cancer types. In this paper, we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning based approach that integrates various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms five other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types.
Cancer development is driven by mutations and dysfunction of important, so-called cancer driver genes, that could be targeted by specific therapies. While a number of such cancer genes have already been identified, it is believed that many more remain to be discovered. To help prioritize experimental investigations of candidate genes, several computational methods have been proposed to rank promising candidates based on their mutations in large cohorts of cancer cases, or on their interactions with known driver genes in biological networks. We propose LOTUS, a new computational approach to identify genes with high oncogenic potential. LOTUS implements a machine learning approach to learn an oncogenic potential score from known driver genes, and brings two novelties compared to existing methods. First, it makes it easy to combine heterogeneous sources of information into the scoring function, which we illustrate by learning a scoring function from both known mutations in large cancer cohorts and interactions in biological networks. Second, using a multitask learning strategy, it can predict different driver genes for different cancer types, while sharing information between them to improve the prediction for every type. We provide experimental results showing that LOTUS significantly outperforms several state-of-the-art cancer gene prediction software.
Affiliation(s)
- Olivier Collier
- Modal’X, UPL, Univ Paris Nanterre, F-92000 Nanterre, France
- * E-mail: (OC); (J-PV)
- Véronique Stoven
- MINES ParisTech, PSL University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Institut Curie, F-75248 Paris Cedex 5, France
- INSERM U900, F-75248 Paris Cedex 5, France
- Jean-Philippe Vert
- MINES ParisTech, PSL University, CBIO-Centre for Computational Biology, F-75006 Paris, France
- Google Research, Brain team, F-75009 Paris, France
- * E-mail: (OC); (J-PV)
9
Review: Precision medicine and driver mutations: Computational methods, functional assays and conformational principles for interpreting cancer drivers. PLoS Comput Biol 2019; 15:e1006658. [PMID: 30921324 PMCID: PMC6438456 DOI: 10.1371/journal.pcbi.1006658] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Indexed: 12/19/2022]
Abstract
At the root of the so-called precision medicine or precision oncology, which is our focus here, is the hypothesis that cancer treatment would be considerably better if therapies were guided by a tumor’s genomic alterations. This hypothesis has sparked major initiatives focusing on whole-genome and/or exome sequencing, creation of large databases, and developing tools for their statistical analyses—all aspiring to identify actionable alterations, and thus molecular targets, in a patient. At the center of the massive amount of collected sequence data is their interpretations that largely rest on statistical analysis and phenotypic observations. Statistics is vital, because it guides identification of cancer-driving alterations. However, statistics of mutations do not identify a change in protein conformation; therefore, it may not define sufficiently accurate actionable mutations, neglecting those that are rare. Among the many thematic overviews of precision oncology, this review innovates by further comprehensively including precision pharmacology, and within this framework, articulating its protein structural landscape and consequences to cellular signaling pathways. It provides the underlying physicochemical basis, thereby also opening the door to a broader community.
10
Zhang W, Wang SL. An Integrated Framework for Identifying Mutated Driver Pathway and Cancer Progression. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:455-464. [PMID: 29990286 DOI: 10.1109/tcbb.2017.2788016] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 06/08/2023]
Abstract
Next-generation sequencing (NGS) technologies provide large amounts of somatic mutation data from a large number of patients. Identifying mutated driver pathways and cancer progression from these data is a challenging task because of interpatient heterogeneity. In addition, modeling cancer progression at the pathway level has been shown to be more reasonable than at the gene level. In this paper, we introduce an integrated framework to identify mutated driver pathways and cancer progression (iMDPCP) at the pathway level from somatic mutation data. First, we use the uncertainty coefficient to quantify mutual exclusivity on gene driver pathways and develop a computational framework to identify mutated driver pathways based on the adaptive discrete differential evolution algorithm. Then, we construct a cancer progression model for driver pathways based on a Bayesian network. Finally, we evaluate the performance of iMDPCP on real cancer somatic mutation datasets. The experimental results indicate that iMDPCP is more accurate than state-of-the-art methods according to the enrichment of KEGG pathways, and it also provides new insights on identifying cancer progression at the pathway level.
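As a rough illustration of the uncertainty coefficient mentioned above, the sketch below computes U(X|Y) = I(X;Y) / H(X) for two binary mutation vectors, one value per patient. The abstract does not say how iMDPCP turns this quantity into a mutual-exclusivity score, so the function names and the toy vectors are assumptions. Note that U measures the strength of dependence, not its direction: perfectly co-occurring mutations would also score 1, so a real exclusivity measure needs an additional sign or pattern check.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def uncertainty_coefficient(x, y):
    """U(X|Y) = I(X;Y) / H(X): share of the uncertainty in x explained by y."""
    h_x = entropy(x)
    if h_x == 0.0:
        return 0.0  # x is constant, so there is nothing to explain
    mutual_information = h_x + entropy(y) - entropy(list(zip(x, y)))
    return mutual_information / h_x


# Hypothetical mutation status (1 = mutated) of two genes in four patients.
gene_a = [1, 1, 0, 0]
exclusive_partner = [0, 0, 1, 1]    # mutated exactly where gene_a is not
independent_partner = [1, 0, 1, 0]  # unrelated to gene_a

u_exclusive = uncertainty_coefficient(gene_a, exclusive_partner)
u_independent = uncertainty_coefficient(gene_a, independent_partner)
```

For these toy vectors the coefficient is 1 for the perfectly exclusive pair and 0 for the independent pair.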
11
Zapata L, Susak H, Drechsel O, Friedländer MR, Estivill X, Ossowski S. Signatures of positive selection reveal a universal role of chromatin modifiers as cancer driver genes. Sci Rep 2017; 7:13124. [PMID: 29030609 PMCID: PMC5640613 DOI: 10.1038/s41598-017-12888-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 06/09/2017] [Accepted: 09/15/2017] [Indexed: 12/24/2022]
Abstract
Tumors are composed of an evolving population of cells subjected to tissue-specific selection, which fuels tumor heterogeneity and ultimately complicates cancer driver gene identification. Here, we integrate cancer cell fraction, population recurrence, and functional impact of somatic mutations as signatures of selection into a Bayesian model for driver prediction. We demonstrate that our model, cDriver, outperforms competing methods when analyzing solid tumors, hematological malignancies, and pan-cancer datasets. Applying cDriver to exome sequencing data of 21 cancer types from 6,870 individuals revealed 98 unreported tumor type-driver gene connections. These novel connections are highly enriched for chromatin-modifying proteins, hinting at a universal role of chromatin regulation in cancer etiology. Although infrequently mutated as single genes, we show that chromatin modifiers are altered in a large fraction of cancer patients. In summary, we demonstrate that integration of evolutionary signatures is key for identifying mutational driver genes, thereby facilitating the discovery of novel therapeutic targets for cancer treatment.
Affiliation(s)
- Luis Zapata
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
- Hana Susak
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
- Oliver Drechsel
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
- Institute of Molecular Biology gGmbH (IMB), Ackermannweg 4, 55128, Mainz, Germany
- Marc R Friedländer
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
- Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, S-10691, Stockholm, Sweden
- Xavier Estivill
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain
- Experimental Genetics Division, Sidra Medical and Research Center, 26999, Doha, Qatar
- Stephan Ossowski
- Bioinformatics and Genomics Program, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain.
- Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain.
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany.
13
Hess GT, Tycko J, Yao D, Bassik MC. Methods and Applications of CRISPR-Mediated Base Editing in Eukaryotic Genomes. Mol Cell 2017; 68:26-43. [PMID: 28985508 PMCID: PMC5997582 DOI: 10.1016/j.molcel.2017.09.029] [Citation(s) in RCA: 161] [Impact Index Per Article: 23.0] [Received: 08/24/2017] [Revised: 09/20/2017] [Accepted: 09/21/2017] [Indexed: 12/26/2022]
Abstract
The past several years have seen an explosion in development of applications for the CRISPR-Cas9 system, from efficient genome editing, to high-throughput screening, to recruitment of a range of DNA and chromatin-modifying enzymes. While homology-directed repair (HDR) coupled with Cas9 nuclease cleavage has been used with great success to repair and re-write genomes, recently developed base-editing systems present a useful orthogonal strategy to engineer nucleotide substitutions. Base editing relies on recruitment of cytidine deaminases to introduce changes (rather than double-stranded breaks and donor templates) and offers potential improvements in efficiency while limiting damage and simplifying the delivery of editing machinery. At the same time, these systems enable novel mutagenesis strategies to introduce sequence diversity for engineering and discovery. Here, we review the different base-editing platforms, including their deaminase recruitment strategies and editing outcomes, and compare them to other CRISPR genome-editing technologies. Additionally, we discuss how these systems have been applied in therapeutic, engineering, and research settings. Lastly, we explore future directions of this emerging technology.
Affiliation(s)
- Gaelen T Hess
  - Department of Genetics and Stanford University Chemistry, Engineering, and Medicine for Human Health (ChEM-H), Stanford, CA, USA
- Josh Tycko
  - Department of Genetics and Stanford University Chemistry, Engineering, and Medicine for Human Health (ChEM-H), Stanford, CA, USA
- David Yao
  - Department of Genetics and Stanford University Chemistry, Engineering, and Medicine for Human Health (ChEM-H), Stanford, CA, USA
- Michael C Bassik
  - Department of Genetics and Stanford University Chemistry, Engineering, and Medicine for Human Health (ChEM-H), Stanford, CA, USA
14
Deihimi S, Lev A, Slifker M, Shagisultanova E, Xu Q, Jung K, Vijayvergia N, Ross EA, Xiu J, Swensen J, Gatalica Z, Andrake M, Dunbrack RL, El-Deiry WS. BRCA2, EGFR, and NTRK mutations in mismatch repair-deficient colorectal cancers with MSH2 or MLH1 mutations. Oncotarget 2017; 8:39945-39962. [PMID: 28591715 PMCID: PMC5522275 DOI: 10.18632/oncotarget.18098] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 04/14/2017] [Accepted: 04/26/2017] [Indexed: 02/07/2023] Open
Abstract
Deficient mismatch repair (MMR) and microsatellite instability (MSI) contribute to ~15% of colorectal cancers (CRCs). We hypothesized that MSI leads to mutations in DNA repair proteins, including BRCA2, and in cancer drivers, including EGFR. We analyzed mutations among a discovery cohort of 26 MSI-High (MSI-H) and 558 non-MSI-H CRCs profiled at Caris Life Sciences. Caris-profiled MSI-H CRCs had high mutation rates in BRCA2 (50% vs 14% in non-MSI-H, P < 0.0001). Of 1104 profiled CRCs from a second cohort (COSMIC), MSH2/MLH1-mutant CRCs showed higher mutation rates in BRCA2 than non-MSH2/MLH1-mutant tumors (38% vs 6%, P < 0.0000001). BRCA2 mutations in MSH2/MLH1-mutant CRCs included 75 unique mutations not known to occur in breast or pancreatic cancer per COSMIC v73. Only 5 deleterious BRCA2 mutations in CRC were previously reported in the BIC database as germ-line mutations in breast cancer. Some BRCA2 mutations were predicted to disrupt interactions with partner proteins DSS1 and RAD51. Some CRCs harbored multiple BRCA2 mutations. EGFR was mutated in 45.5% of MSH2/MLH1-mutant and 6.5% of non-MSH2/MLH1-mutant tumors (P < 0.0000001). Approximately 15% of the EGFR mutations found may be actionable through TKI therapy, including N700D, G719D, T725M, T790M, and E884K. NTRK gene mutations were identified in MSH2/MLH1-mutant CRC, including NTRK1 I699V, NTRK2 P716S, and NTRK3 R745L. Our findings have clinical relevance for therapeutic targeting of BRCA2 vulnerabilities, EGFR mutations, or other identified oncogenic drivers such as NTRK in MSH2/MLH1-mutant CRCs or other tumors with mismatch repair deficiency.
Affiliation(s)
- Safoora Deihimi
  - Laboratory of Translational Oncology and Experimental Cancer Therapeutics, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Molecular Therapeutics Program, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Department of Hematology/Oncology, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Department of Biochemistry and Molecular Biology, Drexel University College of Medicine, Philadelphia, PA, USA
- Avital Lev
  - Laboratory of Translational Oncology and Experimental Cancer Therapeutics, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Molecular Therapeutics Program, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Department of Hematology/Oncology, Fox Chase Cancer Center, Philadelphia, PA, USA
- Michael Slifker
  - Biostatistics and Bioinformatics Department, Fox Chase Cancer Center, Philadelphia, PA, USA
- Elena Shagisultanova
  - Department of Hematology/Oncology, Fox Chase Cancer Center, Philadelphia, PA, USA
  - University of Colorado Denver Cancer Center, Denver, CO, USA
- Qifang Xu
  - Molecular Therapeutics Program, Fox Chase Cancer Center, Philadelphia, PA, USA
- Kyungsuk Jung
  - Department of Medicine, Fox Chase Cancer Center, Philadelphia, PA, USA
- Namrata Vijayvergia
  - Department of Hematology/Oncology, Fox Chase Cancer Center, Philadelphia, PA, USA
- Eric A. Ross
  - Biostatistics and Bioinformatics Department, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Cancer Prevention and Control Program, Fox Chase Cancer Center, Philadelphia, PA, USA
- Mark Andrake
  - Molecular Therapeutics Program, Fox Chase Cancer Center, Philadelphia, PA, USA
- Roland L. Dunbrack
  - Molecular Therapeutics Program, Fox Chase Cancer Center, Philadelphia, PA, USA
- Wafik S. El-Deiry
  - Laboratory of Translational Oncology and Experimental Cancer Therapeutics, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Molecular Therapeutics Program, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Department of Hematology/Oncology, Fox Chase Cancer Center, Philadelphia, PA, USA
  - Department of Biochemistry and Molecular Biology, Drexel University College of Medicine, Philadelphia, PA, USA
15
Benstead-Hume G, Wooller SK, Pearl FMG. 'Big data' approaches for novel anti-cancer drug discovery. Expert Opin Drug Discov 2017; 12:599-609. [PMID: 28462602 DOI: 10.1080/17460441.2017.1319356] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/16/2022]
Abstract
Introduction:
The development of improved cancer therapies is frequently cited as an urgent unmet medical need. Recent advances in platform technologies and the increasing availability of biological 'big data' are providing an unparalleled opportunity to systematically identify the key genes and pathways involved in tumorigenesis. The discoveries made using these new technologies may lead to novel therapeutic interventions.
Areas covered:
The authors discuss the current approaches that use 'big data' to identify cancer drivers. These approaches include the analysis of genomic sequencing data, pathway data, multi-platform data, identifying genetic interactions such as synthetic lethality and using cell line data. They review how big data is being used to identify novel drug targets. The authors then provide an overview of the available data repositories and tools being used at the forefront of cancer drug discovery.
Expert opinion:
Targeted therapies based on the genomic events driving the tumour will eventually inform treatment protocols. However, using a tailored approach to treat all tumour patients may require developing a large repertoire of targeted drugs.
Affiliation(s)
- Graeme Benstead-Hume
  - Bioinformatics Group, School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Sarah K Wooller
  - Bioinformatics Group, School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Frances M G Pearl
  - Bioinformatics Group, School of Life Sciences, University of Sussex, Brighton, United Kingdom
16
Dimitrakopoulos CM, Beerenwinkel N. Computational approaches for the identification of cancer genes and pathways. Wiley Interdiscip Rev Syst Biol Med 2016; 9. [PMID: 27863091 PMCID: PMC5215607 DOI: 10.1002/wsbm.1364] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 02/28/2016] [Revised: 07/26/2016] [Accepted: 08/23/2016] [Indexed: 12/27/2022]
Abstract
High‐throughput DNA sequencing techniques enable large‐scale measurement of somatic mutations in tumors. Cancer genomics research aims at identifying all cancer‐related genes and solid interpretation of their contribution to cancer initiation and development. However, this venture is characterized by various challenges, such as the high number of neutral passenger mutations and the complexity of the biological networks affected by driver mutations. Based on biological pathway and network information, sophisticated computational methods have been developed to facilitate the detection of cancer driver mutations and pathways. They can be categorized into (1) methods using known pathways from public databases, (2) network‐based methods, and (3) methods learning cancer pathways de novo. Methods in the first two categories use and integrate different types of data, such as biological pathways, protein interaction networks, and gene expression measurements. The third category consists of de novo methods that detect combinatorial patterns of somatic mutations across tumor samples, such as mutual exclusivity and co‐occurrence. In this review, we discuss recent advances, current limitations, and future challenges of these approaches for detecting cancer genes and pathways. We also discuss the most important current resources of cancer‐related genes. WIREs Syst Biol Med 2017, 9:e1364. doi: 10.1002/wsbm.1364
Affiliation(s)
- Christos M Dimitrakopoulos
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
17
Wu H, Gao L, Kasabov NK. Network-Based Method for Inferring Cancer Progression at the Pathway Level from Cross-Sectional Mutation Data. IEEE/ACM Trans Comput Biol Bioinform 2016; 13:1036-1044. [PMID: 26915128 DOI: 10.1109/tcbb.2016.2520934] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Large-scale cancer genomics projects are providing a wealth of somatic mutation data from a large number of cancer patients. However, it is difficult to obtain several temporally ordered samples from a single patient when evaluating cancer progression. Therefore, one of the most challenging problems arising from the data is to infer the temporal order of mutations across many patients. To solve the problem efficiently, we present a Network-based method (NetInf) to Infer cancer progression at the pathway level from cross-sectional data across many patients, leveraging the exclusive property of driver mutations within a pathway and the property of linear progression between pathways. To assess the robustness of NetInf, we apply it to simulated data with different levels of added noise. To verify the performance of NetInf, we apply it to somatic mutation data from three real cancer studies with large numbers of samples. Experimental results reveal that the pathways detected by NetInf show significant enrichment. Our method reduces computational complexity by constructing gene networks without requiring the number of pathways to be specified, and it provides new insights into the temporal order of somatic mutations at the pathway level rather than at the gene level.
18
Beerenwinkel N, Greenman CD, Lagergren J. Computational Cancer Biology: An Evolutionary Perspective. PLoS Comput Biol 2016; 12:e1004717. [PMID: 26845763 PMCID: PMC4742235 DOI: 10.1371/journal.pcbi.1004717] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Indexed: 01/19/2023] Open
Affiliation(s)
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Chris D. Greenman
  - School of Computing Sciences, University of East Anglia, Norwich, United Kingdom
- Jens Lagergren
  - Science for Life Laboratory, School of Computer Science and Communication, Swedish E-Science Research Center, KTH Royal Institute of Technology, Solna, Sweden