1
Xu Z, Lee DS, Chandran S, Le VT, Bump R, Yasis J, Dallarda S, Marcotte S, Clock B, Haghani N, Cho CY, Akdemir K, Tyndale S, Futreal PA, McVicker G, Wahl GM, Dixon JR. Structural variants drive context-dependent oncogene activation in cancer. Nature 2022; 612:564-572. [PMID: 36477537 PMCID: PMC9810360 DOI: 10.1038/s41586-022-05504-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 09/21/2020] [Accepted: 11/01/2022] [Indexed: 12/12/2022]
Abstract
Higher-order chromatin structure is important for the regulation of genes by distal regulatory sequences [1,2]. Structural variants (SVs) that alter three-dimensional (3D) genome organization can lead to enhancer-promoter rewiring and human disease, particularly in the context of cancer [3]. However, only a small minority of SVs are associated with altered gene expression [4,5], and it remains unclear why certain SVs lead to changes in distal gene expression and others do not. To address these questions, we used a combination of genomic profiling and genome engineering to identify sites of recurrent changes in 3D genome structure in cancer and determine the effects of specific rearrangements on oncogene activation. By analysing Hi-C data from 92 cancer cell lines and patient samples, we identified loci affected by recurrent alterations to 3D genome structure, including oncogenes such as MYC, TERT and CCND1. By using CRISPR-Cas9 genome engineering to generate de novo SVs, we show that oncogene activity can be predicted by using 'activity-by-contact' models that consider partner region chromatin contacts and enhancer activity. However, activity-by-contact models are only predictive of specific subsets of genes in the genome, suggesting that different classes of genes engage in distinct modes of regulation by distal regulatory elements. These results indicate that SVs that alter 3D genome organization are widespread in cancer genomes and begin to illustrate predictive rules for the consequences of SVs on oncogene activation.
Affiliation(s)
- Zhichao Xu
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA. These authors contributed equally
- Dong-Sung Lee
  - Department of Life Sciences, University of Seoul, Seoul, South Korea. These authors contributed equally
- Sahaana Chandran
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Victoria T. Le
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Rosalind Bump
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Jean Yasis
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Sofia Dallarda
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Samantha Marcotte
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Benjamin Clock
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Nicholas Haghani
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Chae Yun Cho
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Kadir Akdemir
  - Department of Genomic Medicine; UT MD Anderson Cancer Center; Houston, TX, 77030; USA
- Selene Tyndale
  - Integrative Biology Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- P. Andrew Futreal
  - Department of Genomic Medicine; UT MD Anderson Cancer Center; Houston, TX, 77030; USA
- Graham McVicker
  - Integrative Biology Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Geoffrey M. Wahl
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA
- Jesse R. Dixon
  - Gene Expression Laboratory; Salk Institute for Biological Studies; La Jolla, CA, 92037; USA; corresponding author
2
Castro-Mondragon JA, Aure M, Lingjærde O, Langerød A, Martens JWM, Børresen-Dale AL, Kristensen V, Mathelier A. Cis-regulatory mutations associate with transcriptional and post-transcriptional deregulation of gene regulatory programs in cancers. Nucleic Acids Res 2022; 50:12131-12148. [PMID: 36477895 PMCID: PMC9757053 DOI: 10.1093/nar/gkac1143] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/18/2022] [Revised: 11/03/2022] [Accepted: 11/17/2022] [Indexed: 12/13/2022] Open
Abstract
Most cancer alterations occur in the noncoding portion of the human genome, where regulatory regions control gene expression. The discovery of noncoding mutations altering the cells' regulatory programs has been limited to few examples with high recurrence or high functional impact. Here, we show that transcription factor binding sites (TFBSs) have similar mutation loads to those in protein-coding exons. By combining cancer somatic mutations in TFBSs and expression data for protein-coding and miRNA genes, we evaluate the combined effects of transcriptional and post-transcriptional alterations on the regulatory programs in cancers. The analysis of seven TCGA cohorts culminates with the identification of protein-coding and miRNA genes linked to mutations at TFBSs that are associated with a cascading trans-effect deregulation on the cells' regulatory programs. Our analyses of cis-regulatory mutations associated with miRNAs recurrently predict 12 mature miRNAs (derived from 7 precursors) associated with the deregulation of their target gene networks. The predictions are enriched for cancer-associated protein-coding and miRNA genes and highlight cis-regulatory mutations associated with the dysregulation of key pathways associated with carcinogenesis. By combining transcriptional and post-transcriptional regulation of gene expression, our method predicts cis-regulatory mutations related to the dysregulation of key gene regulatory networks in cancer patients.
Affiliation(s)
- Jaime A Castro-Mondragon
  - Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
- Miriam Ragle Aure
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway
  - Department of Medical Genetics, Institute of Clinical Medicine, University of Oslo and Oslo University Hospital, Oslo, Norway
- Ole Christian Lingjærde
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway
  - Centre for Bioinformatics, Department of Informatics, University of Oslo, Gaustadalléen 23 B, N-0373 Oslo, Norway
  - KG Jebsen Centre for B-cell malignancies, Institute for Clinical Medicine, University of Oslo, Ullernchausseen 70, N-0372 Oslo, Norway
- Anita Langerød
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway
- John W M Martens
  - Erasmus MC Cancer Institute and Cancer Genomics Netherlands, University Medical Center Rotterdam, Department of Medical Oncology, 3015GD Rotterdam, The Netherlands
- Anne-Lise Børresen-Dale
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway
- Vessela N Kristensen
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway
  - Department of Medical Genetics, Institute of Clinical Medicine, University of Oslo and Oslo University Hospital, Oslo, Norway
- Anthony Mathelier
  - Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
  - Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0310 Oslo, Norway
  - Department of Medical Genetics, Institute of Clinical Medicine, University of Oslo and Oslo University Hospital, Oslo, Norway
3
Kelly MR, Wisniewska K, Regner MJ, Lewis MW, Perreault AA, Davis ES, Phanstiel DH, Parker JS, Franco HL. A multi-omic dissection of super-enhancer driven oncogenic gene expression programs in ovarian cancer. Nat Commun 2022; 13:4247. [PMID: 35869079 PMCID: PMC9307778 DOI: 10.1038/s41467-022-31919-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/18/2021] [Accepted: 07/08/2022] [Indexed: 01/14/2023] Open
Abstract
The human genome contains regulatory elements, such as enhancers, that are often rewired by cancer cells for the activation of genes that promote tumorigenesis and resistance to therapy. This is especially true for cancers that have little or no known driver mutations within protein coding genes, such as ovarian cancer. Herein, we utilize an integrated set of genomic and epigenomic datasets to identify clinically relevant super-enhancers that are preferentially amplified in ovarian cancer patients. We systematically probe the top 86 super-enhancers, using CRISPR-interference and CRISPR-deletion assays coupled to RNA-sequencing, to nominate two salient super-enhancers that drive proliferation and migration of cancer cells. Utilizing Hi-C, we construct chromatin interaction maps that enable the annotation of direct target genes for these super-enhancers and confirm their activity specifically within the cancer cell compartment of human tumors using single-cell genomics data. Together, our multi-omic approach examines a number of fundamental questions about how regulatory information encoded into super-enhancers drives gene expression networks that underlie the biology of ovarian cancer.
Affiliation(s)
- Michael R Kelly
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Bioinformatics and Computational Biology Graduate Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Kamila Wisniewska
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Matthew J Regner
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Bioinformatics and Computational Biology Graduate Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Michael W Lewis
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Andrea A Perreault
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Eric S Davis
  - Bioinformatics and Computational Biology Graduate Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Douglas H Phanstiel
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Cell Biology & Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Joel S Parker
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Bioinformatics and Computational Biology Graduate Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Hector L Franco
  - Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Bioinformatics and Computational Biology Graduate Program, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
4
Lewis MW, Wisniewska K, King CM, Li S, Coffey A, Kelly MR, Regner MJ, Franco HL. Enhancer RNA Transcription Is Essential for a Novel CSF1 Enhancer in Triple-Negative Breast Cancer. Cancers (Basel) 2022; 14:1852. [PMID: 35406623 PMCID: PMC8997997 DOI: 10.3390/cancers14071852] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/02/2021] [Revised: 03/24/2022] [Accepted: 03/29/2022] [Indexed: 12/11/2022] Open
Abstract
Enhancers are critical regulatory elements in the genome that help orchestrate spatiotemporal patterns of gene expression during development and normal physiology. In cancer, enhancers are often rewired by various genetic and epigenetic mechanisms for the activation of oncogenes that lead to initiation and progression. A key feature of active enhancers is the production of non-coding RNA molecules called enhancer RNAs, whose functions remain unknown but can be used to specify active enhancers de novo. Using a combination of eRNA transcription and chromatin modifications, we have identified a novel enhancer located 30 kb upstream of Colony Stimulating Factor 1 (CSF1). Notably, CSF1 is implicated in the progression of breast cancer, is overexpressed in triple-negative breast cancer (TNBC) cell lines, and its enhancer is primarily active in TNBC patient tumors. Genomic deletion of the enhancer (via CRISPR/Cas9) enabled us to validate this regulatory element as a bona fide enhancer of CSF1 and subsequent cell-based assays revealed profound effects on cancer cell proliferation, colony formation, and migration. Epigenetic silencing of the enhancer via CRISPR-interference assays (dCas9-KRAB) coupled to RNA-sequencing, enabled unbiased identification of additional target genes, such as RSAD2, that are predictive of clinical outcome. Additionally, we repurposed the RNA-guided RNA-targeting CRISPR-Cas13 machinery to specifically degrade the eRNAs transcripts produced at this enhancer to determine the consequences on CSF1 mRNA expression, suggesting a post-transcriptional role for these non-coding transcripts. Finally, we test our eRNA-dependent model of CSF1 enhancer function and demonstrate that our results are extensible to other forms of cancer. Collectively, this work describes a novel enhancer that is active in the TNBC subtype, which is associated with cellular growth, and requires eRNA transcripts for proper enhancer function. 
These results demonstrate the significant impact of enhancers in cancer biology and highlight their potential as tractable targets for therapeutic intervention.
Affiliation(s)
- Michael W. Lewis
  - The Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Kamila Wisniewska
  - The Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Caitlin M. King
  - The Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Shen Li
  - The Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Alisha Coffey
  - The Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Michael R. Kelly
  - The Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Bioinformatics and Computational Biology Graduate Program, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Matthew J. Regner
  - The Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Bioinformatics and Computational Biology Graduate Program, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Hector L. Franco
  - The Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Bioinformatics and Computational Biology Graduate Program, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - The Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
5
Lange M, Begolli R, Giakountis A. Non-Coding Variants in Cancer: Mechanistic Insights and Clinical Potential for Personalized Medicine. Noncoding RNA 2021; 7:47. [PMID: 34449663 PMCID: PMC8395730 DOI: 10.3390/ncrna7030047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/28/2021] [Revised: 07/26/2021] [Accepted: 08/01/2021] [Indexed: 12/11/2022] Open
Abstract
The cancer genome is characterized by extensive variability, in the form of Single Nucleotide Polymorphisms (SNPs) or structural variations such as Copy Number Alterations (CNAs) across wider genomic areas. At the molecular level, most SNPs and/or CNAs reside in non-coding sequences, ultimately affecting the regulation of oncogenes and/or tumor-suppressors in a cancer-specific manner. Notably, inherited non-coding variants can predispose for cancer decades prior to disease onset. Furthermore, accumulation of additional non-coding driver mutations during progression of the disease, gives rise to genomic instability, acting as the driving force of neoplastic development and malignant evolution. Therefore, detection and characterization of such mutations can improve risk assessment for healthy carriers and expand the diagnostic and therapeutic toolbox for the patient. This review focuses on functional variants that reside in transcribed or not transcribed non-coding regions of the cancer genome and presents a collection of appropriate state-of-the-art methodologies to study them.
Affiliation(s)
- Marios Lange
  - Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece
- Rodiola Begolli
  - Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece
- Antonis Giakountis
  - Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece
  - Institute for Fundamental Biomedical Research, B.S.R.C “Alexander Fleming”, 34 Fleming Str., 16672 Vari, Greece
6
Cruz MAD, Lund D, Szekeres F, Karlsson S, Faresjö M, Larsson D. Cis-regulatory elements in conserved non-coding sequences of nuclear receptor genes indicate for crosstalk between endocrine systems. Open Med (Wars) 2021; 16:640-650. [PMID: 33954257 PMCID: PMC8051167 DOI: 10.1515/med-2021-0264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2020] [Revised: 02/01/2021] [Accepted: 03/09/2021] [Indexed: 11/16/2022] Open
Abstract
Nuclear receptors (NRs) are ligand-activated transcription factors that regulate gene expression when bound to specific DNA sequences. Crosstalk between steroid NR systems has been studied for understanding the development of hormone-driven cancers but not to an extent at a genetic level. This study aimed to investigate crosstalk between steroid NRs in conserved intron and exon sequences, with a focus on steroid NRs involved in prostate cancer etiology. For this purpose, we evaluated conserved intron and exon sequences among all 49 members of the NR Superfamily (NRS) and their relevance as regulatory sequences and NR-binding sequences. Sequence conservation was found to be higher in the first intron (35%), when compared with downstream introns. Seventy-nine percent of the conserved regions in the NRS contained putative transcription factor binding sites (TFBS) and a large fraction of these sequences contained splicing sites (SS). Analysis of transcription factors binding to putative intronic and exonic TFBS revealed that 5 and 16%, respectively, were NRs. The present study suggests crosstalk between steroid NRs, e.g., vitamin D, estrogen, progesterone, and retinoic acid endocrine systems, through cis-regulatory elements in conserved sequences of introns and exons. This investigation gives evidence for crosstalk between steroid hormones and contributes to novel targets for steroid NR regulation.
Affiliation(s)
- Maria Araceli Diaz Cruz
  - Research School of Health and Welfare, School of Health and Welfare, Jönköping University, Jönköping, Sweden
- Dan Lund
  - Department of Natural Science and Biomedicine, School of Health and Welfare, Jönköping University, Jönköping, Sweden
- Ferenc Szekeres
  - Department of Biomedicine, School of Health Sciences, University of Skövde, Skövde, Sweden
- Sandra Karlsson
  - Department of Natural Science and Biomedicine, School of Health and Welfare, Jönköping University, Jönköping, Sweden
- Maria Faresjö
  - Department of Natural Science and Biomedicine, School of Health and Welfare, Jönköping University, Jönköping, Sweden
- Dennis Larsson
  - Sahlgrenska University Hospital, Gothia Forum for Clinical Research, Gothenburg, Sweden
7
Cheng Z, Vermeulen M, Rollins-Green M, DeVeale B, Babak T. Cis-regulatory mutations with driver hallmarks in major cancers. iScience 2021; 24:102144. [PMID: 33665563 PMCID: PMC7903341 DOI: 10.1016/j.isci.2021.102144] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/11/2020] [Revised: 09/02/2020] [Accepted: 01/25/2021] [Indexed: 12/05/2022] Open
Abstract
Despite the recent availability of complete genome sequences of tumors from thousands of patients, isolating disease-causing (driver) non-coding mutations from the plethora of somatic variants remains challenging, and only a handful of validated examples exist. By integrating whole-genome sequencing, genetic data, and allele-specific gene expression from TCGA, we identified 320 somatic non-coding mutations that affect gene expression in cis (FDR<0.25). These mutations cluster into 47 cis-regulatory elements that modulate expression of their subject genes through diverse molecular mechanisms. We further show that these mutations have hallmark features of non-coding drivers; namely, that they preferentially disrupt transcription factor binding motifs, are associated with a selective advantage, increased oncogene expression and decreased tumor suppressor expression.
- Enrichment of functional non-coding somatic mutations predicts drivers
- Elevated variant allele frequencies are consistent with roles in tumorigenesis
- Putative non-coding drivers disrupt transcription factor binding motifs
- Predicted drivers associate with increased oncogene and decreased TSG expression
Affiliation(s)
- Zhongshan Cheng
  - Department of Biology, Queen's University, Kingston, ON K7L 3N6, Canada
- Michael Vermeulen
  - Department of Biology, Queen's University, Kingston, ON K7L 3N6, Canada
- Brian DeVeale
  - The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, Center for Reproductive Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Tomas Babak
  - Department of Biology, Queen's University, Kingston, ON K7L 3N6, Canada
8
Penzar DD, Zinkevich AO, Vorontsov IE, Sitnik VV, Favorov AV, Makeev VJ, Kulakovskiy IV. What Do Neighbors Tell About You: The Local Context of Cis-Regulatory Modules Complicates Prediction of Regulatory Variants. Front Genet 2019; 10:1078. [PMID: 31737053 PMCID: PMC6834773 DOI: 10.3389/fgene.2019.01078] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/15/2019] [Accepted: 10/09/2019] [Indexed: 02/05/2023] Open
Abstract
Many problems of modern genetics and functional genomics require the assessment of functional effects of sequence variants, including gene expression changes. Machine learning is considered to be a promising approach for solving this task, but its practical applications remain a challenge due to the insufficient volume and diversity of training data. A promising source of valuable data is a saturation mutagenesis massively parallel reporter assay, which quantitatively measures changes in transcription activity caused by sequence variants. Here, we explore the computational predictions of the effects of individual single-nucleotide variants on gene transcription measured in the massively parallel reporter assays, based on the data from the recent "Regulation Saturation" Critical Assessment of Genome Interpretation challenge. We show that the estimated prediction quality strongly depends on the structure of the training and validation data. Particularly, training on the sequence segments located next to the validation data results in the "information leakage" caused by the local context. This information leakage allows reproducing the prediction quality of the best CAGI challenge submissions with a fairly simple machine learning approach, and even obtaining notably better-than-random predictions using irrelevant genomic regions. Validation scenarios preventing such information leakage dramatically reduce the measured prediction quality. The performance at independent regulatory regions entirely excluded from the training set appears to be much lower than needed for practical applications, and even the performance estimation will become reliable only in the future with richer data from multiple reporters. The source code and data are available at https://bitbucket.org/autosomeru_cagi2018/cagi2018_regsat and https://genomeinterpretation.org/content/expression-variants.
Affiliation(s)
- Dmitry D. Penzar
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Department of Medical and Biological Physics, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
- Arsenii O. Zinkevich
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Ilya E. Vorontsov
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Vasily V. Sitnik
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Alexander V. Favorov
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, The Johns Hopkins University School of Medicine, Baltimore, MD, United States
- Vsevolod J. Makeev
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Department of Medical and Biological Physics, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Ivan V. Kulakovskiy
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
  - Institute of Mathematical Problems of Biology RAS - the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Pushchino, Russia
9
Rojano E, Seoane P, Ranea JAG, Perkins JR. Regulatory variants: from detection to predicting impact. Brief Bioinform 2019; 20:1639-1654. [PMID: 29893792 PMCID: PMC6917219 DOI: 10.1093/bib/bby039] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 03/13/2018] [Revised: 04/18/2018] [Indexed: 02/01/2023] Open
Abstract
Variants within non-coding genomic regions can greatly affect disease. In recent years, increasing focus has been given to these variants, and how they can alter regulatory elements, such as enhancers, transcription factor binding sites and DNA methylation regions. Such variants can be considered regulatory variants. Concurrently, much effort has been put into establishing international consortia to undertake large projects aimed at discovering regulatory elements in different tissues, cell lines and organisms, and probing the effects of genetic variants on regulation by measuring gene expression. Here, we describe methods and techniques for discovering disease-associated non-coding variants using sequencing technologies. We then explain the computational procedures that can be used for annotating these variants using the information from the aforementioned projects, and prediction of their putative effects, including potential pathogenicity, based on rule-based and machine learning approaches. We provide the details of techniques to validate these predictions, by mapping chromatin-chromatin and chromatin-protein interactions, and introduce Clustered Regularly Interspaced Short Palindromic Repeats-Associated Protein 9 (CRISPR-Cas9) technology, which has already been used in this field and is likely to have a big impact on its future evolution. We also give examples of regulatory variants associated with multiple complex diseases. This review is aimed at bioinformaticians interested in the characterization of regulatory variants, molecular biologists and geneticists interested in understanding more about the nature and potential role of such variants from a functional point of view, and clinicians who may wish to learn about variants in non-coding genomic regions associated with a given disease and find out what to do next to uncover how they impact on the underlying mechanisms.
Affiliation(s)
- Elena Rojano
  - Department of Molecular Biology and Biochemistry, University of Malaga (UMA), 29010 Malaga, Spain
- Pedro Seoane
  - Department of Molecular Biology and Biochemistry, University of Malaga (UMA), 29010 Malaga, Spain
- Juan A G Ranea
  - CIBER de Enfermedades Raras, ISCIII, Madrid, Spain and Department of Molecular Biology and Biochemistry, University of Malaga (UMA), 29010 Malaga, Spain
- James R Perkins
  - Research laboratory, IBIMA-Regional University Hospital of Malaga, UMA, Malaga 29009, Spain
10
Liu Z, Chen JY, Zhong Y, Xie L, Li JS. lncRNA MEG3 inhibits the growth of hepatocellular carcinoma cells by sponging miR-9-5p to upregulate SOX11. Braz J Med Biol Res 2019; 52:e8631. [PMID: 31531526 PMCID: PMC6753855 DOI: 10.1590/1414-431x20198631] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 03/10/2019] [Accepted: 08/14/2019] [Indexed: 02/08/2023]
Abstract
The long non-coding RNA (lncRNA) maternally expressed gene 3 (MEG3), a tumor suppressor, is critical in the carcinogenesis and progression of several cancers, including hepatocellular carcinoma (HCC). To date, the roles of lncRNA MEG3 in HCC are not well understood. Therefore, this study used western blot and qRT-PCR to evaluate the expression of MEG3, miR-9-5p, and sex determining region Y-related HMG-box 11 (SOX11) in HCC tissues and cell lines. RNA pull-down and luciferase reporter assays were used to evaluate the interactions among these molecules. The 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay and flow cytometry were used to assess the viability and apoptosis of HCC cells, respectively. The results showed that MEG3 and SOX11 were poorly expressed but miR-9-5p was highly expressed in HCC. The expression levels suggested that MEG3 correlates negatively with miR-9-5p and positively with SOX11, which was confirmed by Pearson's correlation analysis and biological experiments. Furthermore, MEG3 bound miR-9-5p, and SOX11 was a direct target of miR-9-5p. Moreover, MEG3 over-expression promoted apoptosis and inhibited growth in HCC cells by sponging miR-9-5p to up-regulate SOX11. Therefore, the interactions among MEG3, miR-9-5p, and SOX11 might offer novel insight into HCC pathogenesis and provide potential diagnostic markers and therapeutic targets for HCC.
Affiliation(s)
- Zhi Liu
  - Department of Hepatobiliary Surgery, Affiliated Hospital of North Sichuan Medical College, Nanchong, China
  - Institute of Hepatobiliary, Pancreatic and Intestinal Disease, North Sichuan Medical College, Nanchong, China
- Jian Yu Chen
  - Department of Hepatobiliary Surgery, Affiliated Hospital of North Sichuan Medical College, Nanchong, China
  - Institute of Hepatobiliary, Pancreatic and Intestinal Disease, North Sichuan Medical College, Nanchong, China
- Yang Zhong
  - Department of Hepatobiliary Surgery, Affiliated Hospital of North Sichuan Medical College, Nanchong, China
  - Institute of Hepatobiliary, Pancreatic and Intestinal Disease, North Sichuan Medical College, Nanchong, China
- Liang Xie
  - Department of Hepatobiliary Surgery, Affiliated Hospital of North Sichuan Medical College, Nanchong, China
  - Institute of Hepatobiliary, Pancreatic and Intestinal Disease, North Sichuan Medical College, Nanchong, China
- Jian Shui Li
  - Department of Hepatobiliary Surgery, Affiliated Hospital of North Sichuan Medical College, Nanchong, China
|
11
|
Gan KA, Carrasco Pro S, Sewell JA, Fuxman Bass JI. Identification of Single Nucleotide Non-coding Driver Mutations in Cancer. Front Genet 2018; 9:16. [PMID: 29456552 PMCID: PMC5801294 DOI: 10.3389/fgene.2018.00016] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 12/05/2017] [Accepted: 01/12/2018] [Indexed: 12/14/2022]
Abstract
Recent whole-genome sequencing studies have identified millions of somatic variants present in tumor samples. Most of these variants reside in non-coding regions of the genome potentially affecting transcriptional and post-transcriptional gene regulation. Although a few hallmark examples of driver mutations in non-coding regions have been reported, the functional role of the vast majority of somatic non-coding variants remains to be determined. This is because the few driver variants in each sample must be distinguished from the thousands of passenger variants and because the logic of regulatory element function has not yet been fully elucidated. Thus, variants prioritized based on mutational burden and location within regulatory elements need to be validated experimentally. This is generally achieved by combining assays that measure physical binding, such as chromatin immunoprecipitation, with those that determine regulatory activity, such as luciferase reporter assays. Here, we present an overview of in silico approaches used to prioritize somatic non-coding variants and the experimental methods used for functional validation and characterization.
Affiliation(s)
- Kok A Gan
  - Department of Biology, Boston University, Boston, MA, United States
- Jared A Sewell
  - Department of Biology, Boston University, Boston, MA, United States
|