1
Hong W, Zhao Y, Weng YL, Cheng C. Random Forest model reveals the interaction between N6-methyladenosine modifications and RNA-binding proteins. iScience 2023; 26:106250. [PMID: 36922995 PMCID: PMC10009289 DOI: 10.1016/j.isci.2023.106250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Revised: 12/16/2022] [Accepted: 02/15/2023] [Indexed: 02/22/2023]
Abstract
RNA-binding proteins (RBPs) have critical roles in N6-methyladenosine (m6A) modification process. We designed a Random Forest (RF) model to systematically analyze the interaction among RBPs and m6A modifications by integrating the binding signals from hundreds of RBPs. Accurate prediction of m6A sites demonstrated significant connections between RBP bindings and m6A modifications. The relative importance of different RBPs from the model provided a quantitative metric to evaluate their interactions with m6A modifications. Redundancy analysis showed that several RBPs may have similar binding patterns with m6A sites. The RF model exhibited fairly high prediction accuracy across cell lines, suggesting a conservative RBP interaction network regulates m6A occupancy. Specific RBPs can engage to the corresponding regional m6A sites and deploy distinct regulatory processes, such as cleavage site selection of the alternative polyadenylation (APA). We also integrated histone modifications into our RF model, which demonstrated H3K36me3 and H3K27me3 as determining features for m6A distribution.
Affiliation(s)
- Wei Hong
- Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
| | - Yanding Zhao
- Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
| | - Yi-Lan Weng
- Center for Neuroregeneration, Department of Neurosurgery, Houston Methodist Research Institute, Houston, TX 77030, USA
| | - Chao Cheng
- Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
2
Chen K, Zhao H, Yang Y. Capturing large genomic contexts for accurately predicting enhancer-promoter interactions. Brief Bioinform 2022; 23:6513727. [DOI: 10.1093/bib/bbab577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Revised: 12/13/2021] [Accepted: 12/15/2021] [Indexed: 11/14/2022]
Abstract
Enhancer-promoter interaction (EPI) is a key mechanism underlying gene regulation. EPI prediction has always been a challenging task because enhancers could regulate promoters of distant target genes. Although many machine learning models have been developed, they leverage only the features in enhancers and promoters, or simply add the average genomic signals in the regions between enhancers and promoters, without utilizing detailed features between or outside enhancers and promoters. Due to a lack of large-scale features, existing methods could achieve only moderate performance, especially for predicting EPIs in different cell types. Here, we present a Transformer-based model, TransEPI, for EPI prediction by capturing large genomic contexts. TransEPI was developed based on EPI datasets derived from Hi-C or ChIA-PET data in six cell lines. To avoid over-fitting, we evaluated the TransEPI model by testing it on independent test datasets where the cell line and chromosome are different from the training data. TransEPI not only achieved consistent performance across the cross-validation and test datasets from different cell types but also outperformed the state-of-the-art machine learning and deep learning models. In addition, we found that the improved performance of TransEPI was attributed to the integration of large genomic contexts. Lastly, TransEPI was extended to study the non-coding mutations associated with brain disorders or neural diseases, and we found that TransEPI was also useful for predicting the target genes of non-coding mutations.
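The evaluation scheme described above, testing on samples whose cell line and chromosome both differ from the training data, can be sketched as a simple partition. This is a minimal illustration and not the authors' code: the function name `split_by_cell_and_chrom`, the record fields `cell` and `chrom`, and the example cell lines and chromosomes are all assumptions made here for clarity.

```python
def split_by_cell_and_chrom(samples, test_cells, test_chroms):
    """Partition samples so that train and test share neither cell line nor chromosome.

    samples: list of dicts with hypothetical keys "cell" and "chrom".
    A sample goes to the test set only if BOTH its cell line and its
    chromosome are held out; it goes to the training set only if NEITHER is.
    Samples that match just one of the two criteria are dropped to avoid leakage.
    """
    train, test = [], []
    for sample in samples:
        held_out_cell = sample["cell"] in test_cells
        held_out_chrom = sample["chrom"] in test_chroms
        if held_out_cell and held_out_chrom:
            test.append(sample)
        elif not held_out_cell and not held_out_chrom:
            train.append(sample)
    return train, test


# Toy example with made-up records (hypothetical values).
samples = [
    {"cell": "GM12878", "chrom": "chr1"},
    {"cell": "GM12878", "chrom": "chr2"},
    {"cell": "K562", "chrom": "chr1"},
    {"cell": "K562", "chrom": "chr2"},
]
train, test = split_by_cell_and_chrom(samples, {"K562"}, {"chr2"})
```

In this toy case only the GM12878/chr1 record is used for training and only the K562/chr2 record for testing; the two mixed records are discarded, which is the cost of keeping the test set fully independent.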
3
Sheng T, Ho SWT, Ooi WF, Xu C, Xing M, Padmanabhan N, Huang KK, Ma L, Ray M, Guo YA, Sim NL, Anene-Nzelu CG, Chang MM, Razavi-Mohseni M, Beer MA, Foo RSY, Sundar R, Chan YH, Tan ALK, Ong X, Skanderup AJ, White KP, Jha S, Tan P. Integrative epigenomic and high-throughput functional enhancer profiling reveals determinants of enhancer heterogeneity in gastric cancer. Genome Med 2021; 13:158. [PMID: 34635154 PMCID: PMC8504099 DOI: 10.1186/s13073-021-00970-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/18/2021] [Accepted: 09/15/2021] [Indexed: 12/21/2022]
Abstract
BACKGROUND Enhancers are distal cis-regulatory elements required for cell-specific gene expression and cell fate determination. In cancer, enhancer variation has been proposed as a major cause of inter-patient heterogeneity; however, most predicted enhancer regions remain to be functionally tested. METHODS We analyzed 132 epigenomic histone modification profiles of 18 primary gastric cancer (GC) samples, 18 normal gastric tissues, and 28 GC cell lines using Nano-ChIP-seq technology. We applied Capture-based Self-Transcribing Active Regulatory Region sequencing (CapSTARR-seq) to assess functional enhancer activity. An Activity-by-contact (ABC) model was employed to explore the effects of histone acetylation and CapSTARR-seq levels on enhancer-promoter interactions. RESULTS We report a comprehensive catalog of 75,730 recurrent predicted enhancers, the majority of which are GC-associated in vivo (> 50,000) and associated with lower somatic mutation rates inferred by whole-genome sequencing. Applying CapSTARR-seq to the enhancer catalog, we observed significant correlations between CapSTARR-seq functional activity and H3K27ac/H3K4me1 levels. Super-enhancer regions exhibited increased CapSTARR-seq signals compared to regular enhancers, even when decoupled from native chromatin contexture. We show that combining histone modification and CapSTARR-seq functional enhancer data improves the prediction of enhancer-promoter interactions and pinpointing of germline single nucleotide polymorphisms (SNPs), somatic copy number alterations (SCNAs), and trans-acting TFs involved in GC expression. We identified cancer-relevant genes (ING1, ARL4C) whose expression between patients is influenced by enhancer differences in genomic copy number and germline SNPs, and HNF4α as a master trans-acting factor associated with GC enhancer heterogeneity.
CONCLUSIONS Our results indicate that combining histone modification and functional assay data may provide a more accurate metric to assess enhancer activity than either platform individually, providing insights into the relative contribution of genetic (cis) and regulatory (trans) mechanisms to GC enhancer functional heterogeneity.
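For readers unfamiliar with the Activity-by-contact idea mentioned above, the general form of the score is each element's activity multiplied by its contact frequency with the gene's promoter, divided by the sum of that product over all candidate elements near the gene. The sketch below illustrates that general form only. Taking activity as the geometric mean of an H3K27ac signal and a CapSTARR-seq signal is an assumption made here for illustration, as are the function name `abc_scores`, the field names, and the numbers; the abstract does not specify how the study combined these signals.

```python
from math import sqrt


def abc_scores(elements):
    """Return an ABC-style score per candidate element for a single gene.

    elements: list of dicts with hypothetical keys
      "name"    - element identifier
      "h3k27ac" - histone acetylation signal (assumed activity input)
      "starr"   - CapSTARR-seq signal (assumed activity input)
      "contact" - contact frequency with the gene's promoter
    Activity is taken as the geometric mean of the two signals (an assumption).
    """
    weighted = {
        e["name"]: sqrt(e["h3k27ac"] * e["starr"]) * e["contact"]
        for e in elements
    }
    total = sum(weighted.values())
    if total == 0:
        # No element has any activity-by-contact signal for this gene.
        return {name: 0.0 for name in weighted}
    # Normalize so the scores of all candidate elements sum to 1.
    return {name: value / total for name, value in weighted.items()}


# Made-up inputs for two candidate enhancers of one gene.
example = [
    {"name": "enh_A", "h3k27ac": 4.0, "starr": 9.0, "contact": 0.5},
    {"name": "enh_B", "h3k27ac": 1.0, "starr": 1.0, "contact": 1.0},
]
scores = abc_scores(example)
```

Here `enh_A` has activity 6 and contact 0.5 (product 3), `enh_B` has activity 1 and contact 1 (product 1), so the normalized scores are 0.75 and 0.25.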
Affiliation(s)
- Taotao Sheng
- Department of Biochemistry, National University of Singapore, Singapore, 117596, Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
| | - Shamaine Wei Ting Ho
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599, Singapore
| | - Wen Fong Ooi
- Epigenetic and Epitranscriptomic Regulation, Genome Institute of Singapore, Singapore, 138672, Singapore
| | - Chang Xu
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599, Singapore
| | - Manjie Xing
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
- Epigenetic and Epitranscriptomic Regulation, Genome Institute of Singapore, Singapore, 138672, Singapore
| | - Nisha Padmanabhan
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
| | - Kie Kyon Huang
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
| | - Lijia Ma
- The Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, USA
| | - Mohana Ray
- The Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, USA
| | - Yu Amanda Guo
- Precision Medicine and Population Genomics (Somatic), Genome Institute of Singapore, Singapore, 138672, Singapore
| | - Ngak Leng Sim
- Precision Medicine and Population Genomics (Somatic), Genome Institute of Singapore, Singapore, 138672, Singapore
| | - Chukwuemeka George Anene-Nzelu
- Cardiovascular Research Institute, National University Health System, Singapore, 119074, Singapore
- Precision Medicine and Population Genomics (Germline), Genome Institute of Singapore, Singapore, Singapore
- Montreal Heart Institute, Montreal, Canada
- Department of Medicine, University of Montreal, Montreal, Canada
| | - Mei Mei Chang
- Precision Medicine and Population Genomics (Somatic), Genome Institute of Singapore, Singapore, 138672, Singapore
| | - Milad Razavi-Mohseni
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, USA
| | - Michael A Beer
- Department of Biomedical Engineering and McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University, Baltimore, USA
| | - Roger Sik Yin Foo
- Cardiovascular Research Institute, National University Health System, Singapore, 119074, Singapore
- Precision Medicine and Population Genomics (Germline), Genome Institute of Singapore, Singapore, Singapore
| | - Raghav Sundar
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
- Department of Haematology-Oncology, National University Cancer Institute Singapore, National University Hospital, Singapore, 119074, Singapore
| | - Yiong Huak Chan
- Biostatistics Unit, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119228, Singapore
| | - Angie Lay Keng Tan
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
| | - Xuewen Ong
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore
| | - Anders Jacobsen Skanderup
- Precision Medicine and Population Genomics (Somatic), Genome Institute of Singapore, Singapore, 138672, Singapore
| | - Kevin P White
- The Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, USA.
- Tempus Labs, Chicago, USA.
| | - Sudhakar Jha
- Department of Biochemistry, National University of Singapore, Singapore, 117596, Singapore.
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599, Singapore.
- NUS Center for Cancer Research, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117599, Singapore.
- Department of Physiological Sciences, College of Veterinary Medicine, Oklahoma State University, Stillwater, OK, USA.
| | - Patrick Tan
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, 169857, Singapore.
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599, Singapore.
- Epigenetic and Epitranscriptomic Regulation, Genome Institute of Singapore, Singapore, 138672, Singapore.
- SingHealth/Duke-NUS Institute of Precision Medicine, National Heart Centre Singapore, Singapore, 168752, Singapore.
- Department of Physiology, National University of Singapore, Singapore, 117593, Singapore.
- Singapore Gastric Cancer Consortium, Singapore, 119228, Singapore.
4
Mychaleckyj JC, Valo E, Ichimura T, Ahluwalia TS, Dina C, Miller RG, Shabalin IG, Gyorgy B, Cao J, Onengut-Gumuscu S, Satake E, Smiles AM, Haukka JK, Tregouet DA, Costacou T, O’Neil K, Paterson AD, Forsblom C, Keenan HA, Pezzolesi MG, Pragnell M, Galecki A, Rich SS, Sandholm N, Klein R, Klein BE, Susztak K, Orchard TJ, Korstanje R, King GL, Hadjadj S, Rossing P, Bonventre JV, Groop PH, Warram JH, Krolewski AS. Association of Coding Variants in Hydroxysteroid 17-beta Dehydrogenase 14 (HSD17B14) with Reduced Progression to End Stage Kidney Disease in Type 1 Diabetes. J Am Soc Nephrol 2021; 32:2634-2651. [PMID: 34261756 PMCID: PMC8722802 DOI: 10.1681/asn.2020101457] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/14/2020] [Accepted: 05/27/2021] [Indexed: 02/04/2023]
Abstract
BACKGROUND Rare variants in gene coding regions likely have a greater impact on disease-related phenotypes than common variants through disruption of their encoded protein. We searched for rare variants associated with onset of ESKD in individuals with type 1 diabetes at advanced kidney disease stage. METHODS Gene-based exome array analyses of 15,449 genes in five large incidence cohorts of individuals with type 1 diabetes and proteinuria were analyzed for survival time to ESKD, testing the top gene in a sixth cohort (n=2372/1115 events all cohorts) and replicating in two retrospective case-control studies (n=1072 cases, 752 controls). Deep resequencing of the top associated gene in five cohorts confirmed the findings. We performed immunohistochemistry and gene expression experiments in human control and diseased cells, and in mouse ischemia reperfusion and aristolochic acid nephropathy models. RESULTS Protein coding variants in the hydroxysteroid 17-β dehydrogenase 14 gene (HSD17B14), predicted to affect protein structure, had a net protective effect against development of ESKD at exome-wide significance (n=4196; P value=3.3 × 10⁻⁷). The HSD17B14 gene and encoded enzyme were robustly expressed in healthy human kidney, maximally in proximal tubular cells. Paradoxically, gene and protein expression were attenuated in human diabetic proximal tubules and in mouse kidney injury models. Expressed HSD17B14 gene and protein levels remained low without recovery after 21 days in a murine ischemic reperfusion injury model. Decreased gene expression was found in other CKD-associated renal pathologies. CONCLUSIONS HSD17B14 gene is mechanistically involved in diabetic kidney disease. The encoded sex steroid enzyme is a druggable target, potentially opening a new avenue for therapeutic development.
Affiliation(s)
- Josyf C. Mychaleckyj
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia
| | - Erkka Valo
- Folkhälsan Institute of Genetics, Folkhälsan Research Center, Helsinki, Finland
- Department of Nephrology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Finland
| | - Takaharu Ichimura
- Renal Division, Brigham and Women’s Hospital, Department of Medicine, Harvard Medical School, Boston, Massachusetts
| | | | - Christian Dina
- Université de Nantes, CNRS INSERM, L’institut du thorax, Nantes, France
| | - Rachel G. Miller
- Department of Epidemiology, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Ivan G. Shabalin
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia
| | - Beata Gyorgy
- INSERM UMRS1166, Institute of CardioMetabolism and Nutrition, Sorbonne Université, Paris, France
| | - JingJing Cao
- Genetics & Genome Biology Research Institute, SickKids Hospital, Toronto, Ontario, Canada
| | - Suna Onengut-Gumuscu
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia
| | - Eiichiro Satake
- Research Division, Joslin Diabetes Center, Boston, Massachusetts
- Department of Medicine, Harvard Medical School, Boston, Massachusetts
| | - Adam M. Smiles
- Research Division, Joslin Diabetes Center, Boston, Massachusetts
| | - Jani K. Haukka
- Folkhälsan Institute of Genetics, Folkhälsan Research Center, Helsinki, Finland
- Department of Nephrology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Finland
| | - David-Alexandre Tregouet
- INSERM UMRS1166, Institute of CardioMetabolism and Nutrition, Sorbonne Université, Paris, France
- Université de Bordeaux, INSERM, Bordeaux Population Health, Bordeaux U1219, France
| | - Tina Costacou
- Department of Epidemiology, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Kristina O’Neil
- Research Division, Joslin Diabetes Center, Boston, Massachusetts
| | - Andrew D. Paterson
- Genetics & Genome Biology Research Institute, SickKids Hospital, Toronto, Ontario, Canada
| | - Carol Forsblom
- Folkhälsan Institute of Genetics, Folkhälsan Research Center, Helsinki, Finland
- Department of Nephrology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Finland
| | - Hillary A. Keenan
- Research Division, Joslin Diabetes Center, Boston, Massachusetts
- Department of Medicine, Harvard Medical School, Boston, Massachusetts
| | - Marcus G. Pezzolesi
- Research Division, Joslin Diabetes Center, Boston, Massachusetts
- Department of Medicine, Harvard Medical School, Boston, Massachusetts
- Division of Nephrology and Hypertension, University of Utah, Salt Lake City, Utah
| | | | - Andrzej Galecki
- Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan
| | - Stephen S. Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia
| | - Niina Sandholm
- Folkhälsan Institute of Genetics, Folkhälsan Research Center, Helsinki, Finland
- Department of Nephrology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Finland
| | - Ronald Klein
- Department of Ophthalmology and Visual Sciences, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
| | - Barbara E. Klein
- Department of Ophthalmology and Visual Sciences, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin
| | - Katalin Susztak
- Department of Medicine and Genetics, University of Pennsylvania, Philadelphia, Pennsylvania
| | - Trevor J. Orchard
- Department of Epidemiology, University of Pittsburgh, Pittsburgh, Pennsylvania
| | | | - George L. King
- Research Division, Joslin Diabetes Center, Boston, Massachusetts
- Department of Medicine, Harvard Medical School, Boston, Massachusetts
| | - Samy Hadjadj
- INSERM CIC 1402 and U 1082, Poitiers, France
- Department of Endocrinology, L’institut du thorax, INSERM, CNRS, Centre Hospitalier Universitaire de Nantes, Nantes, France
| | - Peter Rossing
- Steno Diabetes Center Copenhagen, Copenhagen, Denmark
- University of Copenhagen, Copenhagen, Denmark
| | - Joseph V. Bonventre
- Renal Division, Brigham and Women’s Hospital, Department of Medicine, Harvard Medical School, Boston, Massachusetts
| | - Per-Henrik Groop
- Folkhälsan Institute of Genetics, Folkhälsan Research Center, Helsinki, Finland
- Department of Nephrology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Finland
- Department of Diabetes, Central Clinical School, Monash University, Melbourne, Victoria, Australia
| | - James H. Warram
- Research Division, Joslin Diabetes Center, Boston, Massachusetts
| | - Andrzej S. Krolewski
- Research Division, Joslin Diabetes Center, Boston, Massachusetts
- Department of Medicine, Harvard Medical School, Boston, Massachusetts
5
Lange M, Begolli R, Giakountis A. Non-Coding Variants in Cancer: Mechanistic Insights and Clinical Potential for Personalized Medicine. Noncoding RNA 2021; 7:47. [PMID: 34449663 PMCID: PMC8395730 DOI: 10.3390/ncrna7030047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/28/2021] [Revised: 07/26/2021] [Accepted: 08/01/2021] [Indexed: 12/11/2022]
Abstract
The cancer genome is characterized by extensive variability, in the form of Single Nucleotide Polymorphisms (SNPs) or structural variations such as Copy Number Alterations (CNAs) across wider genomic areas. At the molecular level, most SNPs and/or CNAs reside in non-coding sequences, ultimately affecting the regulation of oncogenes and/or tumor-suppressors in a cancer-specific manner. Notably, inherited non-coding variants can predispose for cancer decades prior to disease onset. Furthermore, accumulation of additional non-coding driver mutations during progression of the disease, gives rise to genomic instability, acting as the driving force of neoplastic development and malignant evolution. Therefore, detection and characterization of such mutations can improve risk assessment for healthy carriers and expand the diagnostic and therapeutic toolbox for the patient. This review focuses on functional variants that reside in transcribed or not transcribed non-coding regions of the cancer genome and presents a collection of appropriate state-of-the-art methodologies to study them.
Affiliation(s)
- Marios Lange
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece; (M.L.); (R.B.)
| | - Rodiola Begolli
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece; (M.L.); (R.B.)
| | - Antonis Giakountis
- Department of Biochemistry and Biotechnology, University of Thessaly, Biopolis, 41500 Larissa, Greece; (M.L.); (R.B.)
- Institute for Fundamental Biomedical Research, B.S.R.C “Alexander Fleming”, 34 Fleming Str., 16672 Vari, Greece
6
Pachera E, Assassi S, Salazar GA, Stellato M, Renoux F, Wunderlin A, Blyszczuk P, Lafyatis R, Kurreeman F, de Vries-Bouwstra J, Messemaker T, Feghali-Bostwick CA, Rogler G, van Haaften WT, Dijkstra G, Oakley F, Calcagni M, Schniering J, Maurer B, Distler JH, Kania G, Frank-Bertoncelj M, Distler O. Long noncoding RNA H19X is a key mediator of TGF-β-driven fibrosis. J Clin Invest 2020; 130:4888-4905. [PMID: 32603313 DOI: 10.1172/jci135439] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 12/09/2019] [Accepted: 06/17/2020] [Indexed: 12/22/2022]
Abstract
TGF-β is a master regulator of fibrosis, driving the differentiation of fibroblasts into apoptosis-resistant myofibroblasts and sustaining the production of extracellular matrix (ECM) components. Here, we identified the nuclear long noncoding RNA (lncRNA) H19X as a master regulator of TGF-β-driven tissue fibrosis. H19X was consistently upregulated in a wide variety of human fibrotic tissues and diseases and was strongly induced by TGF-β, particularly in fibroblasts and fibroblast-related cells. Functional experiments following H19X silencing revealed that H19X was an obligatory factor for TGF-β-induced ECM synthesis as well as differentiation and survival of ECM-producing myofibroblasts. We showed that H19X regulates DDIT4L gene expression, specifically interacting with a region upstream of the DDIT4L gene and changing the chromatin accessibility of a DDIT4L enhancer. These events resulted in transcriptional repression of DDIT4L and, in turn, in increased collagen expression and fibrosis. Our results shed light on key effectors of TGF-β-induced ECM remodeling and fibrosis.
Affiliation(s)
- Elena Pachera
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Shervin Assassi
- Division of Rheumatology, Department of Internal Medicine, University of Texas Health Science Center at Houston, McGovern Medical School, Houston, Texas, USA
| | - Gloria A Salazar
- Division of Rheumatology, Department of Internal Medicine, University of Texas Health Science Center at Houston, McGovern Medical School, Houston, Texas, USA
| | - Mara Stellato
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Florian Renoux
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Adam Wunderlin
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Przemyslaw Blyszczuk
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Robert Lafyatis
- Division of Rheumatology and Clinical Immunology, Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Fina Kurreeman
- Department of Rheumatology, Leiden University Medical Center, Leiden, Netherlands
| | | | - Tobias Messemaker
- Department of Rheumatology, Leiden University Medical Center, Leiden, Netherlands
| | | | - Gerhard Rogler
- Department of Gastroenterology and Hepatology, University Hospital Zurich, Zurich, Switzerland
| | - Wouter T van Haaften
- Department of Gastroenterology and Hepatology, University Medical Center Groningen, Groningen, Netherlands
| | - Gerard Dijkstra
- Department of Gastroenterology and Hepatology, University Medical Center Groningen, Groningen, Netherlands
| | - Fiona Oakley
- Newcastle Fibrosis Research Group, Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Maurizio Calcagni
- Department of Plastic Surgery and Hand Surgery, University Hospital Zurich, Zurich, Switzerland
| | - Janine Schniering
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Britta Maurer
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Jörg Hw Distler
- Department of Internal Medicine 3, University of Erlangen, Erlangen, Germany
| | - Gabriela Kania
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Mojca Frank-Bertoncelj
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
| | - Oliver Distler
- Center of Experimental Rheumatology, Department of Rheumatology, University Hospital Zurich, Zurich, Switzerland
7
Osato N. Characteristics of functional enrichment and gene expression level of human putative transcriptional target genes. BMC Genomics 2018; 19:957. [PMID: 29363429 PMCID: PMC5780744 DOI: 10.1186/s12864-017-4339-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 02/02/2023]
Abstract
BACKGROUND Transcriptional target genes show functional enrichment of genes. However, how many and how significantly transcriptional target genes include functional enrichments are still unclear. To address these issues, I predicted human transcriptional target genes using open chromatin regions, ChIP-seq data and DNA binding sequences of transcription factors in databases, and examined functional enrichment and gene expression level of putative transcriptional target genes. RESULTS Gene Ontology annotations showed four times larger numbers of functional enrichments in putative transcriptional target genes than gene expression information alone, independent of transcriptional target genes. To compare the number of functional enrichments of putative transcriptional target genes between cells or search conditions, I normalized the number of functional enrichment by calculating its ratios in the total number of transcriptional target genes. With this analysis, native putative transcriptional target genes showed the largest normalized number of functional enrichments, compared with target genes including 5-60% of randomly selected genes. The normalized number of functional enrichments was changed according to the criteria of enhancer-promoter interactions such as distance from transcriptional start sites and orientation of CTCF-binding sites. Forward-reverse orientation of CTCF-binding sites showed significantly higher normalized number of functional enrichments than the other orientations. Journal papers showed that the top five frequent functional enrichments were related to the cellular functions in the three cell types. The median expression level of transcriptional target genes changed according to the criteria of enhancer-promoter assignments (i.e. interactions) and was correlated with the changes of the normalized number of functional enrichments of transcriptional target genes. 
CONCLUSIONS Human putative transcriptional target genes showed significant functional enrichments. Functional enrichments were related to the cellular functions. The normalized number of functional enrichments of human putative transcriptional target genes changed according to the criteria of enhancer-promoter assignments and correlated with the median expression level of the target genes. These analyses and characters of human putative transcriptional target genes would be useful to examine the criteria of enhancer-promoter assignments and to predict the novel mechanisms and factors such as DNA binding proteins and DNA sequences of enhancer-promoter interactions.
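The normalization described in the abstract, the number of functional enrichments expressed as a ratio to the total number of transcriptional target genes, amounts to a single division. The snippet below is only a worked illustration of that ratio: the function name `normalized_enrichment`, the condition labels, and every count are hypothetical and are not taken from the paper.

```python
def normalized_enrichment(n_enrichments, n_target_genes):
    """Ratio of functional enrichments to the total number of target genes."""
    if n_target_genes <= 0:
        raise ValueError("n_target_genes must be positive")
    return n_enrichments / n_target_genes


# Hypothetical counts for two enhancer-promoter assignment criteria.
conditions = {
    "criterion_1": {"enrichments": 120, "targets": 4000},
    "criterion_2": {"enrichments": 150, "targets": 6000},
}
ratios = {
    name: normalized_enrichment(c["enrichments"], c["targets"])
    for name, c in conditions.items()
}
# The criterion with the highest ratio, not the highest raw count, ranks first.
best = max(ratios, key=ratios.get)
```

With these made-up numbers the second criterion yields more enrichments in absolute terms (150 versus 120) but a lower ratio (0.025 versus 0.03), which is why comparing conditions on the normalized value can change the ranking.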
Affiliation(s)
- Naoki Osato
- Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Osaka, 565-0871, Japan.
8
Fishilevich S, Nudel R, Rappaport N, Hadar R, Plaschkes I, Iny Stein T, Rosen N, Kohn A, Twik M, Safran M, Lancet D, Cohen D. GeneHancer: genome-wide integration of enhancers and target genes in GeneCards. Database: The Journal of Biological Databases and Curation 2017; 2017:3737828. [PMID: 28605766 PMCID: PMC5467550 DOI: 10.1093/database/bax028] [Citation(s) in RCA: 688] [Impact Index Per Article: 98.3] [Received: 11/15/2016] [Accepted: 03/10/2017] [Indexed: 12/14/2022]
Abstract
A major challenge in understanding gene regulation is the unequivocal identification of enhancer elements and uncovering their connections to genes. We present GeneHancer, a novel database of human enhancers and their inferred target genes, in the framework of GeneCards. First, we integrated a total of 434 000 reported enhancers from four different genome-wide databases: the Encyclopedia of DNA Elements (ENCODE), the Ensembl regulatory build, the functional annotation of the mammalian genome (FANTOM) project and the VISTA Enhancer Browser. Employing an integration algorithm that aims to remove redundancy, GeneHancer portrays 285 000 integrated candidate enhancers (covering 12.4% of the genome), 94 000 of which are derived from more than one source, and each assigned an annotation-derived confidence score. GeneHancer subsequently links enhancers to genes, using: tissue co-expression correlation between genes and enhancer RNAs, as well as enhancer-targeted transcription factor genes; expression quantitative trait loci for variants within enhancers; and capture Hi-C, a promoter-specific genome conformation assay. The individual scores based on each of these four methods, along with gene–enhancer genomic distances, form the basis for GeneHancer’s combinatorial likelihood-based scores for enhancer–gene pairing. Finally, we define ‘elite’ enhancer–gene relations reflecting both a high-likelihood enhancer definition and a strong enhancer–gene association. GeneHancer predictions are fully integrated in the widely used GeneCards Suite, whereby candidate enhancers and their annotations are displayed on every relevant GeneCard. This assists in the mapping of non-coding variants to enhancers, and via the linked genes, forms a basis for variant–phenotype interpretation of whole-genome sequences in health and disease. Database URL:http://www.genecards.org/
Affiliation(s)
- Simon Fishilevich: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Ron Nudel: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Noa Rappaport: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Rotem Hadar: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Inbar Plaschkes: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Tsippi Iny Stein: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Naomi Rosen: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Asher Kohn: LifeMap Sciences Inc, Marshfield, MA 02050, USA
- Michal Twik: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Marilyn Safran: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Doron Lancet: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
- Dana Cohen: Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610001, Israel
9
Hafez D, Karabacak A, Krueger S, Hwang YC, Wang LS, Zinzen RP, Ohler U. McEnhancer: predicting gene expression via semi-supervised assignment of enhancers to target genes. Genome Biol 2017; 18:199. [PMID: 29070071 PMCID: PMC5657048 DOI: 10.1186/s13059-017-1316-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 05/19/2017] [Accepted: 09/08/2017] [Indexed: 12/24/2022]
Abstract
Transcriptional enhancers regulate spatio-temporal gene expression. While genomic assays can identify putative enhancers en masse, assigning target genes is a complex challenge. We devised a machine learning approach, McEnhancer, which links target genes to putative enhancers via a semi-supervised learning algorithm that predicts gene expression patterns based on enriched sequence features. Predicted expression patterns were 73–98% accurate, predicted assignments showed strong Hi-C interaction enrichment, enhancer-associated histone modifications were evident, and known functional motifs were recovered. Our model provides a general framework to link globally identified enhancers to targets and contributes to deciphering the regulatory genome.
Affiliation(s)
- Dina Hafez: Department of Computer Science, Duke University, Durham, 27708, NC, USA; Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, 13125, Germany
- Aslihan Karabacak: Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, 13125, Germany
- Sabrina Krueger: Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, 13125, Germany
- Yih-Chii Hwang: Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, 19104, PA, USA
- Li-San Wang: Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, 19104, PA, USA
- Robert P Zinzen: Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, 13125, Germany
- Uwe Ohler: Department of Computer Science, Duke University, Durham, 27708, NC, USA; Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Berlin, 13125, Germany; Departments of Biology and Computer Science, Humboldt University, Berlin, 10099, Germany
10
Battle A, Brown CD, Engelhardt BE, Montgomery SB. Genetic effects on gene expression across human tissues. Nature 2017; 550:204-213. [PMID: 29022597 PMCID: PMC5776756 DOI: 10.1038/nature24277] [Citation(s) in RCA: 2534] [Impact Index Per Article: 362.0] [Received: 09/08/2016] [Accepted: 09/15/2017] [Indexed: 12/12/2022]
Abstract
Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.
Affiliation(s)
- Alexis Battle: Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Christopher D Brown: Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Barbara E Engelhardt: Department of Computer Science and Center for Statistics and Machine Learning, Princeton University, Princeton, New Jersey 08540, USA
- Stephen B Montgomery: Department of Genetics, Stanford University, Stanford, California 94305, USA; Department of Pathology, Stanford University, Stanford, California 94305, USA
12
Choudhury M, Ramsey SA. Identifying Cell Type-Specific Transcription Factors by Integrating ChIP-seq and eQTL Data-Application to Monocyte Gene Regulation. Gene Regulation and Systems Biology 2016; 10:105-110. [PMID: 28008225 PMCID: PMC5156548 DOI: 10.4137/grsb.s40768] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/22/2016] [Revised: 11/03/2016] [Accepted: 11/06/2016] [Indexed: 01/22/2023]
Abstract
We describe a novel computational approach to identify transcription factors (TFs) that are candidate regulators in a human cell type of interest. Our approach involves integrating cell type-specific expression quantitative trait locus (eQTL) data and TF data from chromatin immunoprecipitation-to-tag-sequencing (ChIP-seq) experiments in cell lines. To test the method, we used eQTL data from human monocytes in order to screen for TFs. Using a list of known monocyte-regulating TFs, we tested the hypothesis that the binding sites of cell type-specific TF regulators would be concentrated in the vicinity of monocyte eQTLs. For each of 397 ChIP-seq data sets, we obtained an enrichment ratio for the number of ChIP-seq peaks that are located within monocyte eQTLs. We ranked ChIP-seq data sets according to their statistical significances for eQTL overlap, and from this ranking, we observed that monocyte-regulating TFs are more highly ranked than would be expected by chance. We identified 27 TFs that had significant monocyte enrichment scores and mapped them into a protein interaction network. Our analysis uncovered two novel candidate monocyte-regulating TFs, BCLAF1 and SIN3A. Our approach is an efficient method to identify candidate TFs that can be used for any cell/tissue type for which eQTL data are available.
Affiliation(s)
- Mudra Choudhury: Department of Biomedical Sciences, Oregon State University, Corvallis, OR, USA
- Stephen A Ramsey: Department of Biomedical Sciences, Oregon State University, Corvallis, OR, USA
13
Huang F, Shen J, Guo Q, Shi Y. eRFSVM: a hybrid classifier to predict enhancers-integrating random forests with support vector machines. Hereditas 2016; 153:6. [PMID: 28096768 PMCID: PMC5226099 DOI: 10.1186/s41065-016-0012-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/15/2016] [Accepted: 06/16/2016] [Indexed: 01/03/2023]
Abstract
BACKGROUND Enhancers are tissue-specific distal regulatory elements that play vital roles in gene regulation and expression. Predicting and identifying enhancers is an important but challenging problem in bioinformatics. Existing computational methods, mostly single classifiers, can only predict enhancers defined by the transcriptional coactivator EP300 and generalize poorly. RESULTS We built a hybrid classifier called eRFSVM, using random forests as base classifiers and a support vector machine as the main classifier. eRFSVM integrates two components, eRFSVM-ENCODE and eRFSVM-FANTOM5, with different features and labels. Each base classifier is trained with random forests on datasets from a single tissue or cell type. The main classifier makes the final decision with a support vector machine, taking the predictions of the base classifiers as inputs. For eRFSVM-ENCODE, we trained on datasets from cell lines including GM12878, Hep, H1-hESC and HUVEC, using ChIP-Seq datasets as features and EP300-based enhancers as labels. Tested on the K562 dataset, eRFSVM-ENCODE reached a precision of 83.69%, much better than existing classifiers. For eRFSVM-FANTOM5, with enhancers identified by RNA in the FANTOM5 project as labels, the precision, recall, F-score and accuracy of eRFSVM were 86.17%, 36.06%, 50.84% and 93.38%, relative increases of 23.24%, 97.05%, 76.90% and 4.69% over the existing algorithm (69.92%, 18.30%, 28.74% and 89.20%), respectively. CONCLUSIONS These results demonstrate that eRFSVM is a better classifier for predicting both EP300-based and FANTOM5 RNA-based enhancers.
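The percentages reported for eRFSVM-FANTOM5 in the abstract above can be cross-checked with a little arithmetic. The sketch below assumes the F-score is the usual harmonic mean of precision and recall, and that the reported gains are relative increases over the baseline values given in parentheses. The function and variable names are illustrative and do not come from the paper.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * precision * recall / (precision + recall)


def relative_increase(new, old):
    """Relative gain of `new` over `old`, in percent."""
    return (new - old) / old * 100


# (eRFSVM value, existing-algorithm value), all in percent, as quoted in the abstract.
reported = {
    "precision": (86.17, 69.92),
    "recall": (36.06, 18.30),
    "f_score": (50.84, 28.74),
    "accuracy": (93.38, 89.20),
}

# F-score recomputed from the quoted precision and recall.
f1 = f_score(*[reported[k][0] for k in ("precision", "recall")])

# Relative gains recomputed from the quoted pairs.
gains = {name: relative_increase(new, old) for name, (new, old) in reported.items()}

print(f"F-score from precision/recall: {f1:.2f}%")
for name, gain in gains.items():
    print(f"{name}: +{gain:.2f}%")
```

Under these assumptions the recomputed F-score and gains agree with the figures quoted in the abstract to two decimal places, which suggests the "increase" values are relative rather than absolute differences.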
Affiliation(s)
- Fang Huang: Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education) and the Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030 People’s Republic of China
- Jiawei Shen: Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education) and the Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030 People’s Republic of China
- Qingli Guo: Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education) and the Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030 People’s Republic of China
- Yongyong Shi: Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education) and the Collaborative Innovation Center for Brain Science, Shanghai Jiao Tong University, Shanghai, 200030 People’s Republic of China; Shanghai Changning Mental Health Center, Shanghai, 200042 People’s Republic of China; Department of Psychiatry, The First Teaching Hospital of Xinjiang Medical University, Urumqi, 830054 People’s Republic of China; The Bio-X Little White Building, Shanghai Jiao Tong University, No.55 Guang Yuan Xi Road, Shanghai, 200030 China
14
Enhancer-promoter interactions are encoded by complex genomic signatures on looping chromatin. Nat Genet 2016; 48:488-96. [PMID: 27064255 DOI: 10.1038/ng.3539] [Citation(s) in RCA: 263] [Impact Index Per Article: 32.9] [Received: 12/17/2015] [Accepted: 03/07/2016] [Indexed: 12/15/2022]
Abstract
Discriminating the gene target of a distal regulatory element from other nearby transcribed genes is a challenging problem with the potential to illuminate the causal underpinnings of complex diseases. We present TargetFinder, a computational method that reconstructs regulatory landscapes from diverse features along the genome. The resulting models accurately predict individual enhancer-promoter interactions across multiple cell lines with a false discovery rate up to 15 times smaller than that obtained using the closest gene. By evaluating the genomic features driving this accuracy, we uncover interactions between structural proteins, transcription factors, epigenetic modifications, and transcription that together distinguish interacting from non-interacting enhancer-promoter pairs. Most of this signature is not proximal to the enhancers and promoters but instead decorates the looping DNA. We conclude that complex but consistent combinations of marks on the one-dimensional genome encode the three-dimensional structure of fine-scale regulatory interactions.
15
Simpfendorfer KR, Armstead BE, Shih A, Li W, Curran M, Manjarrez-Orduño N, Lee AT, Diamond B, Gregersen PK. Autoimmune disease-associated haplotypes of BLK exhibit lowered thresholds for B cell activation and expansion of Ig class-switched B cells. Arthritis Rheumatol 2016; 67:2866-76. [PMID: 26246128 DOI: 10.1002/art.39301] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 04/03/2015] [Accepted: 07/21/2015] [Indexed: 12/26/2022]
Abstract
OBJECTIVE B lymphoid kinase (BLK) is associated with rheumatoid arthritis (RA) and several other B cell-associated autoimmune disorders. BLK risk variants are consistently associated with reduced BLK expression, but the mechanisms by which reduced expression alters human B cell function to confer autoimmune disease susceptibility are unknown. This study was undertaken to characterize the BLK risk haplotype and to determine associated B cell functional phenotypes involved in autoimmunity. METHODS The BLK risk haplotype association with RA (determined using whole-genome sequencing data) was confirmed in 2,526 RA cases and 2,134 controls. Peripheral blood mononuclear cells (PBMCs) from RA patients, healthy adults, and umbilical cord blood were used to study B cell functional phenotypes associated with the BLK risk genotype. Association of the BLK haplotype with B cell phenotypes was analyzed using cell culture and flow cytometry. RESULTS Two insertion/deletions were found on the RA risk haplotype in BLK, and the reduction in BLK expression associated with the risk haplotype was confirmed in primary B lymphocytes. Carriers of the RA-associated haplotype had evidence of lower basal B cell receptor (BCR) signaling activity, yet their B cells were hyperactivatable, with enhanced up-regulation of CD86 after BCR crosslinking and greater T cell stimulatory capacity. The number of isotype-switched memory B cells was also significantly increased in subjects carrying the risk haplotype. CONCLUSION A major mechanism underlying the BLK association with autoimmune disease involves lowered thresholds for BCR signaling, enhanced B cell-T cell interactions, and altered patterns of isotype switching.
Affiliation(s)
- Andrew Shih: Feinstein Institute for Medical Research, Manhasset, New York
- Wentian Li: Feinstein Institute for Medical Research, Manhasset, New York
- Mark Curran: Janssen Pharmaceuticals, Springhouse, Pennsylvania
- Annette T Lee: Feinstein Institute for Medical Research, Manhasset, New York
- Betty Diamond: Feinstein Institute for Medical Research, Manhasset, New York
16
Yao L, Berman BP, Farnham PJ. Demystifying the secret mission of enhancers: linking distal regulatory elements to target genes. Crit Rev Biochem Mol Biol 2015; 50:550-73. [PMID: 26446758 PMCID: PMC4666684 DOI: 10.3109/10409238.2015.1087961] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Indexed: 11/13/2022]
Abstract
Enhancers are short regulatory sequences bound by sequence-specific transcription factors and play a major role in the spatiotemporal specificity of gene expression patterns in development and disease. While it is now possible to identify enhancer regions genomewide in both cultured cells and primary tissues using epigenomic approaches, it has been more challenging to develop methods to understand the function of individual enhancers because enhancers are located far from the gene(s) that they regulate. However, it is essential to identify target genes of enhancers not only so that we can understand the role of enhancers in disease but also because this information will assist in the development of future therapeutic options. After reviewing models of enhancer function, we discuss recent methods for identifying target genes of enhancers. First, we describe chromatin structure-based approaches for directly mapping interactions between enhancers and promoters. Second, we describe the use of correlation-based approaches to link enhancer state with the activity of nearby promoters and/or gene expression. Third, we describe how to test the function of specific enhancers experimentally by perturbing enhancer–target relationships using high-throughput reporter assays and genome editing. Finally, we conclude by discussing as yet unanswered questions concerning how enhancers function, how target genes can be identified, and how to distinguish direct from indirect changes in gene expression mediated by individual enhancers.
Affiliation(s)
- Lijing Yao: Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Benjamin P Berman: Department of Biomedical Sciences, Bioinformatics and Computational Biology Research Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Peggy J Farnham: Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
17
Mechanisms of Evolutionary Innovation Point to Genetic Control Logic as the Key Difference Between Prokaryotes and Eukaryotes. J Mol Evol 2015. [PMID: 26208881 DOI: 10.1007/s00239-015-9688-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 12/19/2022]
Abstract
The evolution of life from the simplest, original form to complex, intelligent animal life occurred through a number of key innovations. Here we present a new tool to analyze these key innovations by proposing that the process of evolutionary innovation may follow one of three underlying processes, namely a Random Walk, a Critical Path, or a Many Paths process, and in some instances may also constitute a "Pull-up the Ladder" event. Our analysis is based on the occurrence of function in modern biology, rather than specific structure or mechanism. A function in modern biology may be classified in this way either on the basis of its evolution or the basis of its modern mechanism. Characterizing key innovations in this way helps identify the likelihood that an innovation could arise. In this paper, we describe the classification, and methods to classify functional features of modern organisms into these three classes based on the analysis of how a function is implemented in modern biology. We present the application of our categorization to the evolution of eukaryotic gene control. We use this approach to support the argument that there are few, and possibly no basic chemical differences between the functional constituents of the machinery of gene control between eukaryotes, bacteria and archaea. This suggests that the difference between eukaryotes and prokaryotes that allows the former to develop the complex genetic architecture seen in animals and plants is something other than their chemistry. We tentatively identify the difference as a difference in control logic, that prokaryotic genes are by default 'on' and eukaryotic genes are by default 'off.' The Many Paths evolutionary process suggests that, from a 'default off' starting point, the evolution of the genetic complexity of higher eukaryotes is a high probability event.
18
Kirsten H, Al-Hasani H, Holdt L, Gross A, Beutner F, Krohn K, Horn K, Ahnert P, Burkhardt R, Reiche K, Hackermüller J, Löffler M, Teupser D, Thiery J, Scholz M. Dissecting the genetics of the human transcriptome identifies novel trait-related trans-eQTLs and corroborates the regulatory relevance of non-protein coding loci. Hum Mol Genet 2015; 24:4746-63. [PMID: 26019233 PMCID: PMC4512630 DOI: 10.1093/hmg/ddv194] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 11/07/2014] [Accepted: 05/21/2015] [Indexed: 12/24/2022]
Abstract
Genetics of gene expression (eQTLs or expression QTLs) has proved an indispensable tool for understanding biological pathways and pathomechanisms of trait-associated SNPs. However, power of most genome-wide eQTL studies is still limited. We performed a large eQTL study in peripheral blood mononuclear cells of 2112 individuals increasing the power to detect trans-effects genome-wide. Going beyond univariate SNP-transcript associations, we analyse relations of eQTLs to biological pathways, polygenetic effects of expression regulation, trans-clusters and enrichment of co-localized functional elements. We found eQTLs for about 85% of analysed genes, and 18% of genes were trans-regulated. Local eSNPs were enriched up to a distance of 5 Mb to the transcript challenging typically implemented ranges of cis-regulations. Pathway enrichment within regulated genes of GWAS-related eSNPs supported functional relevance of identified eQTLs. We demonstrate that nearest genes of GWAS-SNPs might frequently be misleading functional candidates. We identified novel trans-clusters of potential functional relevance for GWAS-SNPs of several phenotypes including obesity-related traits, HDL-cholesterol levels and haematological phenotypes. We used chromatin immunoprecipitation data for demonstrating biological effects. Yet, we show for strongly heritable transcripts that still little trans-chromosomal heritability is explained by all identified trans-eSNPs; however, our data suggest that most cis-heritability of these transcripts seems explained. Dissection of co-localized functional elements indicated a prominent role of SNPs in loci of pseudogenes and non-coding RNAs for the regulation of coding genes. In summary, our study substantially increases the catalogue of human eQTLs and improves our understanding of the complex genetic regulation of gene expression, pathways and disease-related processes.
Affiliation(s)
- Holger Kirsten: Institute for Medical Informatics, Statistics and Epidemiology, LIFE - Leipzig Research Center for Civilization Diseases, Cognitive Genetics, Department of Cell Therapy
- Hoor Al-Hasani: Department for Computer Science, Analysis Strategies Group, Department of Diagnostics, Young Investigators Group Bioinformatics and Transcriptomics, Department Proteomics, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Lesca Holdt: Institute of Laboratory Medicine, Ludwig-Maximilians-University, Munich, Germany
- Arnd Gross: Institute for Medical Informatics, Statistics and Epidemiology, LIFE - Leipzig Research Center for Civilization Diseases
- Frank Beutner: LIFE - Leipzig Research Center for Civilization Diseases, Department of Internal Medicine/Cardiology, Heart Center
- Knut Krohn: Interdisciplinary Center for Clinical Research, Faculty of Medicine
- Katrin Horn: Institute for Medical Informatics, Statistics and Epidemiology, LIFE - Leipzig Research Center for Civilization Diseases
- Peter Ahnert: Institute for Medical Informatics, Statistics and Epidemiology, LIFE - Leipzig Research Center for Civilization Diseases
- Ralph Burkhardt: LIFE - Leipzig Research Center for Civilization Diseases, Institute of Laboratory Medicine, University of Leipzig, Leipzig, Germany
- Kristin Reiche: Department for Computer Science, RNomics Group, Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology - IZI, Leipzig, Germany; Young Investigators Group Bioinformatics and Transcriptomics, Department Proteomics, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Jörg Hackermüller: Department for Computer Science, RNomics Group, Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology - IZI, Leipzig, Germany; Young Investigators Group Bioinformatics and Transcriptomics, Department Proteomics, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany
- Markus Löffler: Institute for Medical Informatics, Statistics and Epidemiology, LIFE - Leipzig Research Center for Civilization Diseases
- Daniel Teupser: Institute of Laboratory Medicine, Ludwig-Maximilians-University, Munich, Germany
- Joachim Thiery: LIFE - Leipzig Research Center for Civilization Diseases, Institute of Laboratory Medicine, University of Leipzig, Leipzig, Germany
- Markus Scholz: Institute for Medical Informatics, Statistics and Epidemiology, LIFE - Leipzig Research Center for Civilization Diseases
19
Whitaker JW, Nguyen TT, Zhu Y, Wildberg A, Wang W. Computational schemes for the prediction and annotation of enhancers from epigenomic assays. Methods 2015; 72:86-94. [PMID: 25461775 PMCID: PMC4778972 DOI: 10.1016/j.ymeth.2014.10.008] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 04/09/2014] [Revised: 09/14/2014] [Accepted: 10/08/2014] [Indexed: 12/31/2022]
Abstract
Identifying and annotating distal regulatory enhancers is critical to understand the mechanisms that control gene expression and cell-type-specific activities. Next-generation sequencing techniques have provided us an exciting toolkit of genome-wide assays that can be used to predict and annotate enhancers. However, each assay comes with its own specific set of analytical needs if enhancer prediction is to be optimal. Furthermore, integration of multiple genome-wide assays allows for different genomic features to be combined, and can improve predictive performance. Herein, we review the genome-wide assays and analysis schemes that are used to predict and annotate enhancers. In particular, we focus on three key computational topics: predicting enhancer locations, determining the cell-type-specific activity of enhancers, and linking enhancers to their target genes.
Affiliation(s)
- John W Whitaker
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, United States; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359, United States
- Tung T Nguyen
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, United States; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359, United States
- Yun Zhu
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, United States; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359, United States
- Andre Wildberg
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, United States; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359, United States
- Wei Wang
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, United States; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359, United States
20
Duggal G, Wang H, Kingsford C. Higher-order chromatin domains link eQTLs with the expression of far-away genes. Nucleic Acids Res 2014; 42:87-96. [PMID: 24089144 PMCID: PMC3874174 DOI: 10.1093/nar/gkt857] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 06/13/2013] [Revised: 08/28/2013] [Accepted: 09/03/2013] [Indexed: 12/21/2022]
Abstract
Distal expression quantitative trait loci (distal eQTLs) are genetic mutations that affect the expression of genes genomically far away. However, the mechanisms that cause a distal eQTL to modulate gene expression are not yet clear. Recent high-resolution chromosome conformation capture experiments along with a growing database of eQTLs provide an opportunity to understand the spatial mechanisms influencing distal eQTL associations on a genome-wide scale. We test the hypothesis that spatial proximity contributes to eQTL-gene regulation in the context of the higher-order domain structure of chromatin as determined from recent Hi-C chromosome conformation experiments. This analysis suggests that the large-scale topology of chromatin is coupled with eQTL associations by providing evidence that eQTLs are in general spatially close to their target genes, occur often around topological domain boundaries and preferentially associate with genes across domains. We also find that within-domain eQTLs that overlap with regulatory elements such as promoters and enhancers are spatially more close than the overall set of within-domain eQTLs, suggesting that spatial proximity derived from the domain structure in chromatin plays an important role in the regulation of gene expression.
Affiliation(s)
- Geet Duggal
- Lane Center for Computational Biology, Carnegie Mellon University, 5000 Forbes Avenue Pittsburgh, PA, USA
- Hao Wang
- Lane Center for Computational Biology, Carnegie Mellon University, 5000 Forbes Avenue Pittsburgh, PA, USA
- Carl Kingsford
- Lane Center for Computational Biology, Carnegie Mellon University, 5000 Forbes Avenue Pittsburgh, PA, USA
21
Dozmorov MG, Wren JD, Alarcón-Riquelme ME. Epigenomic elements enriched in the promoters of autoimmunity susceptibility genes. Epigenetics 2013; 9:276-85. [PMID: 24213554 DOI: 10.4161/epi.27021] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 12/15/2022]
Abstract
Genome-wide association studies have identified a number of autoimmune disease-susceptibility genes. Whether or not these loci share any regulatory or functional elements, however, is an open question. Finding such common regulators is of considerable research interest in order to define systemic therapeutic targets. The growing amount of experimental genomic annotations, particularly those from the ENCODE project, provide a wealth of opportunities to search for such commonalities. We hypothesized that regulatory commonalities might not only delineate a regulatory landscape predisposing to autoimmune diseases, but also define functional elements distinguishing specific diseases. We further investigated if, and how, disease-specific epigenomic elements can identify novel genes yet to be associated with the diseases. We evaluated transcription factors, histone modifications, and chromatin state data obtained from the ENCODE project for statistically significant over- or under-representation in the promoters of genes associated with Systemic Lupus Erythematosus (SLE), Rheumatoid Arthritis (RA), and Systemic Sclerosis (SSc). We identified BATF, BCL11A, IRF4, NFkB, PAX5, and PU.1 as transcription factors over-represented in SLE- and RA-susceptibility gene promoters. H3K4me1 and H3K4me2 epigenomic marks were associated with SLE susceptibility genes, and H3K9me3 was common to both SLE and RA. In contrast to a transcriptionally active signature in SLE and RA, SSc-susceptibility genes were depleted in activating epigenomic elements. Using epigenomic elements enriched in SLE and RA, we identified additional immune and B cell signaling-related genes with the same elements in their promoters. Our analysis suggests common and disease-specific epigenomic elements that may define novel therapeutic targets for controlling aberrant activation of autoimmune susceptibility genes.
Affiliation(s)
- Mikhail G Dozmorov
- Oklahoma Medical Research Foundation; Arthritis and Clinical Immunology Research Program; Oklahoma City, OK USA
- Jonathan D Wren
- Oklahoma Medical Research Foundation; Arthritis and Clinical Immunology Research Program; Oklahoma City, OK USA; University of Oklahoma Health Sciences Center; Department of Biochemistry and Molecular Biology; Oklahoma City, OK USA
- Marta E Alarcón-Riquelme
- Oklahoma Medical Research Foundation; Arthritis and Clinical Immunology Research Program; Oklahoma City, OK USA; GENYO; Centre for Genomics and Oncological Research; Pfizer; University of Granada; Andalusian Regional Government; Granada, Spain