151
Levings D, Shaw KE, Lacher SE. Genomic resources for dissecting the role of non-protein coding variation in gene-environment interactions. Toxicology 2020; 441:152505. [PMID: 32450112 DOI: 10.1016/j.tox.2020.152505] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/18/2020] [Revised: 05/18/2020] [Accepted: 05/18/2020] [Indexed: 12/27/2022]
Abstract
The majority of single nucleotide variants (SNVs) identified in Genome Wide Association Studies (GWAS) fall within non-protein coding DNA and have the potential to alter gene expression. Non-protein coding DNA can control gene expression by acting as transcription factor (TF) binding sites or by regulating the organization of DNA into chromatin. SNVs in non-coding DNA sequences can disrupt TF binding and chromatin structure and this can result in pathology. Further, environmental health studies have shown that exposure to xenobiotics can disrupt the ability of TFs to regulate entire gene networks and result in pathology. However, there is a large amount of interindividual variability in exposure-linked health outcomes. One explanation for this heterogeneity is that genetic variation and exposure combine to disrupt gene regulation, and this eventually manifests in disease. Many resources exist that annotate common variants from GWAS and combine them with conservation, functional genomics, and TF binding data. These annotation tools provide clues regarding the biological implications of an SNV, as well as lead to the generation of hypotheses regarding potentially disrupted target genes, epigenetic markers, pathways, and cell types. Collectively this information can be used to predict how SNVs can alter an individual's response to exposure and disease risk. A basic understanding of the regulatory information contained within non-protein coding DNA is needed to predict the biological consequences of SNVs, and to determine how these SNVs impact exposure-related disease. We hope that this review will aid in the characterization of disease-associated genetic variation in the non-protein coding genome.
Affiliation(s)
- Daniel Levings
  - Department of Biomedical Sciences, University of Minnesota Medical School, Duluth Campus, 1035 University Drive, Duluth, MN, 55812, USA
- Kirsten E Shaw
  - Department of Biomedical Sciences, University of Minnesota Medical School, Duluth Campus, 1035 University Drive, Duluth, MN, 55812, USA
- Sarah E Lacher
  - Department of Biomedical Sciences, University of Minnesota Medical School, Duluth Campus, 1035 University Drive, Duluth, MN, 55812, USA.
152
Amano T. Gene regulatory landscape of the sonic hedgehog locus in embryonic development. Dev Growth Differ 2020; 62:334-342. [PMID: 32343848 DOI: 10.1111/dgd.12668] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/02/2020] [Revised: 03/31/2020] [Accepted: 04/15/2020] [Indexed: 12/22/2022]
Abstract
The organs of vertebrate species display a wide variety of morphology. A remaining challenge in evolutionary developmental biology is to elucidate how vertebrate lineages acquire distinct morphological features. Developmental programs are driven by spatiotemporal regulation of gene expression controlled by hundreds of thousands of cis-regulatory elements. Changes in the regulatory elements caused by the introduction of genetic variants can confer regulatory innovation that may underlie morphological novelties. Recent advances in sequencing technology have revealed a number of potential regulatory variants that can alter gene expression patterns. However, only a limited number of studies demonstrate causal dependence between genetic and morphological changes. Regulation of Shh expression is a good model to understand how multiple regulatory elements organize tissue-specific gene expression patterns. This model also provides insights into how the evolution of molecular traits, such as gene regulatory networks, leads to phenotypic novelty.
Affiliation(s)
- Takanori Amano
  - Next Generation Human Disease Model Team, RIKEN BioResource Research Center, Tsukuba, Japan
153
Chen CH, Zheng R, Tokheim C, Dong X, Fan J, Wan C, Tang Q, Brown M, Liu JS, Meyer CA, Liu XS. Determinants of transcription factor regulatory range. Nat Commun 2020; 11:2472. [PMID: 32424124 PMCID: PMC7235260 DOI: 10.1038/s41467-020-16106-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 09/11/2019] [Accepted: 04/14/2020] [Indexed: 01/03/2023] Open
Abstract
Characterization of the genomic distances over which transcription factor (TF) binding influences gene expression is important for inferring target genes from TF chromatin immunoprecipitation followed by sequencing (ChIP-seq) data. Here we systematically examine the relationship between thousands of TF and histone modification ChIP-seq data sets and thousands of gene expression profiles. We develop a model for integrating these data, which reveals two classes of TFs with distinct ranges of regulatory influence, chromatin-binding preferences, and auto-regulatory properties. We find that the regulatory range of the same TF bound within different topologically associating domains (TADs) depends on intrinsic TAD properties such as local gene density and G/C content, but also on the TAD chromatin states. Our results suggest that considering TF type, binding distance to gene locus, as well as chromatin context is important in identifying implicated TFs from GWAS SNPs.
Affiliation(s)
- Chen-Hao Chen
  - Department of Data Sciences, Dana-Farber Cancer Institute. Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Biological and Biomedical Science Program, Harvard Medical School, Boston, MA, USA
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Rongbin Zheng
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Collin Tokheim
  - Department of Data Sciences, Dana-Farber Cancer Institute. Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Xin Dong
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jingyu Fan
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Changxin Wan
  - Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Qin Tang
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Myles Brown
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Jun S Liu
  - Department of Statistics, Harvard University, Cambridge, MA, USA
- Clifford A Meyer
  - Department of Data Sciences, Dana-Farber Cancer Institute. Harvard T.H. Chan School of Public Health, Boston, MA, USA.
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA.
- X Shirley Liu
  - Department of Data Sciences, Dana-Farber Cancer Institute. Harvard T.H. Chan School of Public Health, Boston, MA, USA.
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA.
  - Department of Statistics, Harvard University, Cambridge, MA, USA.
154
Han F, Liu S, Jing J, Li H, Yuan Y, Sun LP. Identification of High-Frequency Methylation Sites in RNF180 Promoter Region Affecting Expression and Their Relationship with Prognosis of Gastric Cancer. Cancer Manag Res 2020; 12:3389-3399. [PMID: 32494203 PMCID: PMC7231750 DOI: 10.2147/cmar.s246995] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/22/2020] [Accepted: 04/14/2020] [Indexed: 12/24/2022] Open
Abstract
Background: Ring finger protein 180 (RNF180) is a tumor suppressor gene regulated by promoter methylation. We previously demonstrated that RNF180 promoter methylation could be a risk factor for gastric cancer (GC) and that eight high-frequency hypermethylated CpG sites were associated with GC. However, it is not clear whether these key sites affect gene expression or are involved in prognosis. The aim of this study was to investigate the effects of these CpG sites on gene expression and prognosis in GC. Patients and Methods: A total of 164 GC tissues were enrolled and followed up. Tissue samples were used for DNA and RNA isolation. Methylation status of RNF180 was detected using bisulfite sequencing PCR (BSP). Expression levels of RNF180 were detected using quantitative real-time reverse transcription-polymerase chain reaction (qRT-PCR). The JASPAR and PROMO databases were used to predict the transcription factors (TFs) binding to the CpG sites. Results: Methylation in the RNF180 promoter region increased and mRNA expression decreased in GC tissue. Correlation analysis revealed that the average methylation rate (AMR) and the methylation rates of four CpG sites were negatively related to RNF180 expression: M3(−165)(Chr5:64,165,942), M5(−148)(Chr5:64,165,959), M7(−133)(Chr5:64,165,974) and M8(−130)(Chr5:64,165,977). Furthermore, methylation rates above 0.3 at M5(−148)(Chr5:64,165,959) and M27(−26)(Chr5:64,166,081) indicated poor prognosis (P for M5 = 0.008, P for M27 = 0.003; HR for M5(−148) = 2.000 (1.201, 3.332), HR for M27(−26) = 2.389 (1.336, 4.271)), and both could be independent prognostic factors. Conclusion: By focusing on the methylation sites in the RNF180 promoter region, we identified two high-frequency methylation sites, M5(−148)(Chr5:64,165,959) and M27(−26)(Chr5:64,166,081), which could affect gene expression and predict the prognosis of GC. The possible molecular mechanism involved needs further study.
Affiliation(s)
- Fang Han
  - Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, and Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, Shenyang 110001, People's Republic of China
  - Hepatobiliary and Pancreatic Surgery, Minimal Invasive Surgery, Zhejiang Provincial People's Hospital, Hangzhou Medical College, Hangzhou 310014, People's Republic of China
- Shuang Liu
  - Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, and Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, Shenyang 110001, People's Republic of China
  - Department of Oncology, Shanxi Provincial Tumor Hospital, Xi'an 710076, People's Republic of China
- Jingjing Jing
  - Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, and Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, Shenyang 110001, People's Republic of China
- Hao Li
  - Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, and Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, Shenyang 110001, People's Republic of China
- Yuan Yuan
  - Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, and Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, Shenyang 110001, People's Republic of China
- Li-Ping Sun
  - Tumor Etiology and Screening Department of Cancer Institute and General Surgery, The First Hospital of China Medical University, and Key Laboratory of Cancer Etiology and Prevention in Liaoning Education Department, Shenyang 110001, People's Republic of China
155
Moradifard S, Saghiri R, Ehsani P, Mirkhani F, Ebrahimi-Rad M. A preliminary computational outputs versus experimental results: Application of sTRAP, a biophysical tool for the analysis of SNPs of transcription factor-binding sites. Mol Genet Genomic Med 2020; 8:e1219. [PMID: 32155318 PMCID: PMC7216802 DOI: 10.1002/mgg3.1219] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/10/2019] [Accepted: 02/25/2020] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND: In the human genome, the network of transcription factors (TFs) and transcription factor-binding sites (TFBSs) has a major regulatory function in biological pathways. Such crosstalk might be affected by single-nucleotide polymorphisms (SNPs), which could create or disrupt a TFBS, leading to either a disease or a phenotypic defect. Many computational resources have been introduced to predict variation in TF binding due to SNPs inside TFBSs, sTRAP being one of them. METHODS: A literature review was performed and experimental data for 18 TFBSs located in 12 genes were provided. The sequences of TFBS motifs were extracted using two different strategies: at a size similar to the synthetic target sites used in the experimental techniques, and with 60 bp upstream and downstream of the SNPs. sTRAP (http://trap.molgen.mpg.de/cgi-bin/trap_two_seq_form.cgi) was applied to compute the binding affinity scores of the cognate TFs in the context of reference and mutant TFBS sequences. The alternative bioinformatics model used in this study was regulatory analysis of variation in enhancers (RAVEN; http://www.cisreg.ca/cgi-bin/RAVEN/a). The bioinformatics outputs of our study were compared with experimental data from electrophoretic mobility shift assays (EMSA). RESULTS: For 6 of the 18 TFBSs, in the genes COL1A1, Hb ḉᴪ, TF, FIX, MBL2, and NOS2A, the outputs of sTRAP were inconsistent with the EMSA results. Furthermore, no p value was presented for the difference between the binding affinity scores of the wild-type and mutant TFBSs, nor were any criteria given for preferring or selecting among the measurements from the different matrices used for the same analysis. CONCLUSION: Our preliminary study indicated some paradoxical results between sTRAP and experimental data. However, to link sTRAP data to biological functions, its optimization via experimental procedures, with the integration of expanded data and the application of several other bioinformatics tools, might be required.
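The second extraction strategy in the Methods (a window of 60 bp upstream and downstream of each SNP, in reference and mutant form) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flanking_window`, its parameters, and the 0-based indexing are hypothetical and are not taken from the study or from sTRAP.

```python
def flanking_window(sequence, snp_index, ref, alt, flank=60):
    """Return (reference, mutant) windows around a SNP.

    Hypothetical helper: `snp_index` is the 0-based position of the SNP in
    `sequence`; up to `flank` bases are kept on each side (fewer near the ends).
    """
    if sequence[snp_index] != ref:
        raise ValueError("reference allele does not match the sequence")
    start = max(0, snp_index - flank)
    end = min(len(sequence), snp_index + flank + 1)
    reference = sequence[start:end]
    offset = snp_index - start  # position of the SNP inside the window
    mutant = reference[:offset] + alt + reference[offset + 1:]
    return reference, mutant


# Toy example: a C>T SNP at position 100 of a synthetic sequence.
toy = "A" * 100 + "C" + "G" * 100
ref_window, mut_window = flanking_window(toy, 100, "C", "T")
```

The two returned strings are the kind of paired input that a two-sequence affinity comparison would take; how the study actually prepared its sequences is not described beyond the abstract.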
Affiliation(s)
- Reza Saghiri
  - Biochemistry Department, Pasteur Institute of Iran, Tehran, Iran
- Parastoo Ehsani
  - Molecular Biology Department, Pasteur Institute of Iran, Tehran, Iran
156
Ohnmacht J, May P, Sinkkonen L, Krüger R. Missing heritability in Parkinson's disease: the emerging role of non-coding genetic variation. J Neural Transm (Vienna) 2020; 127:729-748. [PMID: 32248367 PMCID: PMC7242266 DOI: 10.1007/s00702-020-02184-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 02/27/2020] [Accepted: 03/24/2020] [Indexed: 02/01/2023]
Abstract
Parkinson's disease (PD) is a neurodegenerative disorder caused by a complex interplay of genetic and environmental factors. For the stratification of PD patients and the development of advanced clinical trials, including causative treatments, a better understanding of the underlying genetic architecture of PD is required. Despite substantial efforts, genome-wide association studies have not been able to explain most of the observed heritability. The majority of PD-associated genetic variants are located in non-coding regions of the genome. A systematic assessment of their functional role is hampered by our incomplete understanding of genotype-phenotype correlations, for example through differential regulation of gene expression. Here, the recent progress and remaining challenges for the elucidation of the role of non-coding genetic variants is reviewed with a focus on PD as a complex disease with multifactorial origins. The function of gene regulatory elements and the impact of non-coding variants on them, and the means to map these elements on a genome-wide level, will be delineated. Moreover, examples of how the integration of functional genomic annotations can serve to identify disease-associated pathways and to prioritize disease- and cell type-specific regulatory variants will be given. Finally, strategies for functional validation and considerations for suitable model systems are outlined. Together this emphasizes the contribution of rare and common genetic variants to the complex pathogenesis of PD and points to remaining challenges for the dissection of genetic complexity that may allow for better stratification, improved diagnostics and more targeted treatments for PD in the future.
Affiliation(s)
- Jochen Ohnmacht
  - LCSB, University of Luxembourg, Belvaux, Luxembourg
  - Department of Life Sciences and Medicine (DLSM), University of Luxembourg, Belvaux, Luxembourg
- Patrick May
  - LCSB, University of Luxembourg, Belvaux, Luxembourg
- Lasse Sinkkonen
  - Department of Life Sciences and Medicine (DLSM), University of Luxembourg, Belvaux, Luxembourg
- Rejko Krüger
  - LCSB, University of Luxembourg, Belvaux, Luxembourg.
  - Luxembourg Institute of Health (LIH), Transversal Translational Medicine, Strassen, Luxembourg.
  - Parkinson Research Clinic, Centre Hospitalier de Luxembourg (CHL), Luxembourg, Luxembourg.
157
Shaban HA, Seeber A. Monitoring the spatio-temporal organization and dynamics of the genome. Nucleic Acids Res 2020; 48:3423-3434. [PMID: 32123910 PMCID: PMC7144944 DOI: 10.1093/nar/gkaa135] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 11/30/2019] [Revised: 02/17/2020] [Accepted: 02/23/2020] [Indexed: 12/22/2022] Open
Abstract
The spatio-temporal organization of chromatin in the eukaryotic cell nucleus is of vital importance for transcription, DNA replication and genome maintenance. Each of these activities is tightly regulated in both time and space. While we have a good understanding of chromatin organization in space, for example in fixed snapshots as a result of techniques like FISH and Hi-C, little is known about chromatin dynamics in living cells. The rapid development of flexible genomic loci imaging approaches can address fundamental questions on chromatin dynamics in a range of model organisms. Moreover, it is now possible to visualize not only single genomic loci but the whole genome simultaneously. These advances have opened many doors leading to insight into several nuclear processes including transcription and DNA repair. In this review, we discuss new chromatin imaging methods and how they have been applied to study transcription.
Affiliation(s)
- Haitham A Shaban
  - Center for Advanced Imaging, Harvard University, Cambridge, MA 02138, USA
  - Spectroscopy Department, Physics Division, National Research Centre, Dokki, 12622 Cairo, Egypt
- Andrew Seeber
  - Center for Advanced Imaging, Harvard University, Cambridge, MA 02138, USA
158
Baumgarten N, Schmidt F, Schulz MH. Improved linking of motifs to their TFs using domain information. Bioinformatics 2020; 36:1655-1662. [PMID: 31742324 PMCID: PMC7703792 DOI: 10.1093/bioinformatics/btz855] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/01/2019] [Revised: 11/08/2019] [Accepted: 11/16/2019] [Indexed: 11/23/2022] Open
Abstract
Motivation: A central aim of molecular biology is to identify mechanisms of transcriptional regulation. Transcription factors (TFs), which are DNA-binding proteins, are highly involved in these processes; it is therefore crucial to know where TFs interact with DNA and to know the TFs' DNA-binding motifs. For that reason, computational tools exist that link DNA-binding motifs to TFs either without sequence information or based on TF-associated sequences, e.g. identified via a chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiment. In this paper, we present MASSIF, a novel method to improve the performance of existing tools that link motifs to TFs relying on TF-associated sequences. MASSIF is based on the idea that a DNA-binding motif, which is correctly linked to a TF, should be assigned to a DNA-binding domain (DBD) similar to that of the mapped TF. Because DNA-binding motifs are in general not linked to DBDs, it is not possible to compare the DBD of a TF and the motif directly. Instead we created a DBD collection, which consists of TFs with a known DBD and an associated motif. This collection enables us to evaluate how likely it is that a linked motif and a TF of interest are associated with the same DBD. We named this similarity measure the domain score and represent it as a P-value. We developed two different ways to improve the performance of existing tools that link motifs to TFs based on TF-associated sequences: (i) using meta-analysis to combine P-values from one or several of these tools with the P-value of the domain score and (ii) filtering unlikely motifs based on the domain score. Results: We demonstrate the functionality of MASSIF on several human ChIP-seq datasets, using either motifs from the HOCOMOCO database or de novo identified ones as input motifs. In addition, we show that both variants of our method significantly improve the performance of tools that link motifs to TFs based on TF-associated sequences, independent of the considered DBD type. Availability and implementation: MASSIF is freely available online at https://github.com/SchulzLab/MASSIF. Supplementary information: Supplementary data are available at Bioinformatics online.
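The abstract says that variant (i) combines P-values by meta-analysis but does not name the combination rule. As a rough illustration only, the sketch below uses Fisher's method, a standard way to combine independent P-values; this choice is an assumption and may differ from what MASSIF implements. The function name `fisher_combine` and the example values are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent P-values with Fisher's method (an assumed choice).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the upper
    tail has a closed form, so no external library is needed.
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    term = 1.0
    total = 1.0
    for i in range(1, len(p_values)):
        term *= half_stat / i  # (X/2)^i / i!
        total += term
    return math.exp(-half_stat) * total


# Hypothetical inputs: a P-value from a motif-linking tool and a domain-score P-value.
combined = fisher_combine([0.05, 0.05])
```

Variant (ii), filtering, would instead drop any motif whose domain-score P-value exceeds a chosen cutoff before ranking; the abstract does not give that cutoff.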
Affiliation(s)
- Nina Baumgarten
  - Institute for Cardiovascular Regeneration, Goethe University, Frankfurt am Main 60590, Germany
  - German Center for Cardiovascular Regeneration, Partner Site Rhein-Main, Frankfurt am Main 60590, Germany
- Florian Schmidt
  - High-throughput Genomics & Systems Biology, Cluster of Excellence MMCI, Saarland University
  - Research Group Computational Biology, Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken 66123, Germany
- Marcel H Schulz
  - Institute for Cardiovascular Regeneration, Goethe University, Frankfurt am Main 60590, Germany
  - German Center for Cardiovascular Regeneration, Partner Site Rhein-Main, Frankfurt am Main 60590, Germany
  - High-throughput Genomics & Systems Biology, Cluster of Excellence MMCI, Saarland University
  - Research Group Computational Biology, Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken 66123, Germany
159
Wang X, Goldstein DB. Enhancer Domains Predict Gene Pathogenicity and Inform Gene Discovery in Complex Disease. Am J Hum Genet 2020; 106:215-233. [PMID: 32032514 PMCID: PMC7010980 DOI: 10.1016/j.ajhg.2020.01.012] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 08/10/2019] [Accepted: 01/13/2020] [Indexed: 02/07/2023] Open
Abstract
Non-coding transcriptional regulatory elements are critical for controlling the spatiotemporal expression of genes. Here, we demonstrate that the sizes and number of enhancers linked to a gene reflect its disease pathogenicity. Moreover, genes with redundant enhancer domains are depleted of cis-acting genetic variants that disrupt gene expression, and they are buffered against the effects of disruptive non-coding mutations. Our results demonstrate that dosage-sensitive genes have evolved a robustness to the disruptive effects of genetic variation by expanding their regulatory domains. This solves a puzzle about why genes associated with human disease are depleted of cis-eQTLs (cis-expression quantitative trait loci), suggesting that this relationship might complicate gene identification in causal genome-wide association studies (GWASs) using eQTL information, and establishes a framework for identifying non-coding regulatory variation with phenotypic consequences.
Affiliation(s)
- Xinchen Wang
  - Institute for Genomic Medicine, Columbia University Medical Center, Hammer Health Sciences, 701 West 168th Street, New York, New York 10032, USA.
- David B Goldstein
  - Institute for Genomic Medicine, Columbia University Medical Center, Hammer Health Sciences, 701 West 168th Street, New York, New York 10032, USA
  - Department of Genetics and Development, Columbia University Medical Center, Hammer Health Sciences, 701 West 168th Street, New York, New York 10032, USA.
160
Candidate SNP Markers of Atherogenesis Significantly Shifting the Affinity of TATA-Binding Protein for Human Gene Promoters show stabilizing Natural Selection as a Sum of Neutral Drift Accelerating Atherogenesis and Directional Natural Selection Slowing It. Int J Mol Sci 2020; 21:ijms21031045. [PMID: 32033288 PMCID: PMC7037642 DOI: 10.3390/ijms21031045] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/30/2019] [Revised: 02/03/2020] [Accepted: 02/04/2020] [Indexed: 12/15/2022] Open
Abstract
(1) Background: The World Health Organization (WHO) regards atherosclerosis-related myocardial infarction and stroke as the main causes of death in humans. Susceptibility to atherogenesis-associated diseases is caused by single-nucleotide polymorphisms (SNPs). (2) Methods: Using our previously developed public web-service SNP_TATA_Comparator, we estimated statistical significance of the SNP-caused alterations in TATA-binding protein (TBP) binding affinity for 70 bp proximal promoter regions of the human genes clinically associated with diseases syntonic or dystonic with atherogenesis. Additionally, we did the same for several genes related to the maintenance of mitochondrial genome integrity, according to present-day active research aimed at retarding atherogenesis. (3) Results: In dbSNP, we found 1186 SNPs altering such affinity to the same extent as clinical SNP markers do (as estimated). Particularly, clinical SNP marker rs2276109 can prevent autoimmune diseases via reduced TBP affinity for the human MMP12 gene promoter and therefore macrophage elastase deficiency, which is a well-known physiological marker of accelerated atherogenesis that could be retarded nutritionally using dairy fermented by lactobacilli. (4) Conclusions: Our results uncovered SNPs near clinical SNP markers as the basis of neutral drift accelerating atherogenesis and SNPs of genes encoding proteins related to mitochondrial genome integrity and microRNA genes associated with instability of the atherosclerotic plaque as a basis of directional natural selection slowing atherogenesis. Their sum may be stabilizing the natural selection that sets the normal level of atherogenesis.
161
Cebola I. Liver gene regulatory networks: Contributing factors to nonalcoholic fatty liver disease. Wiley Interdisciplinary Reviews-Systems Biology and Medicine 2020; 12:e1480. [PMID: 32020788 DOI: 10.1002/wsbm.1480] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/29/2019] [Revised: 01/02/2020] [Accepted: 01/03/2020] [Indexed: 12/17/2022]
Abstract
Metabolic diseases such as nonalcoholic fatty liver disease (NAFLD) result from complex interactions between intrinsic and extrinsic factors, including genetics and exposure to obesogenic environments. These risk factors converge in aberrant gene expression patterns in the liver, which are underlined by altered cis-regulatory networks. In homeostasis and in disease states, liver cis-regulatory networks are established by coordinated action of liver-enriched transcription factors (TFs), which define enhancer landscapes, activating broad gene programs with spatiotemporal resolution. Recent advances in DNA sequencing have dramatically expanded our ability to map active transcripts, enhancers and TF cistromes, and to define the 3D chromatin topology that contains these elements. Deployment of these technologies has allowed investigation of the molecular processes that regulate liver development and metabolic homeostasis. Moreover, genomic studies of NAFLD patients and NAFLD models have demonstrated that the liver undergoes pervasive regulatory rewiring in NAFLD, which is reflected by aberrant gene expression profiles. We have therefore achieved an unprecedented level of detail in the understanding of liver cis-regulatory networks, particularly in physiological conditions. Future studies should aim to map active regulatory elements with added levels of resolution, addressing how the chromatin landscapes of different cell lineages contribute to and are altered in NAFLD and NAFLD-associated metabolic states. Such efforts would provide additional clues into the molecular factors that trigger this disease. This article is categorized under: Biological Mechanisms > Metabolism Biological Mechanisms > Regulatory Biology Laboratory Methods and Technologies > Genetic/Genomic Methods.
Affiliation(s)
- Inês Cebola
  - Department of Metabolism, Digestion and Reproduction, Section of Genetics and Genomics, Imperial College London, London, UK
162
Abstract
Regulatory landscapes have been defined in vertebrates as large DNA segments containing diverse enhancer sequences that produce coherent gene transcription. These genomic platforms integrate multiple cellular signals and hence can trigger pleiotropic expression of developmental genes. Identifying and evaluating how these chromatin regions operate may be difficult as the underlying regulatory mechanisms can be as unique as the genes they control. In this brief article and accompanying poster, we discuss some of the ways in which regulatory landscapes operate, illustrating these mechanisms using genes important for vertebrate development as examples. We also highlight some of the techniques available to researchers for analysing regulatory landscapes.
Collapse
Affiliation(s)
- Christopher Chase Bolt
- Swiss Institute for Cancer Research (ISREC), School of Life Sciences, Federal Institute of Technology, Lausanne, 1015 Lausanne, Switzerland
| | - Denis Duboule
- Swiss Institute for Cancer Research (ISREC), School of Life Sciences, Federal Institute of Technology, Lausanne, 1015 Lausanne, Switzerland
- Department of Genetics and Evolution, University of Geneva, 1211 Geneva 4, Switzerland
- Collège de France, 75005 Paris, France
| |
|
163
|
Zhou W, Dorrity MW, Bubb KL, Queitsch C, Fields S. Binding and Regulation of Transcription by Yeast Ste12 Variants To Drive Mating and Invasion Phenotypes. Genetics 2020; 214:397-407. [PMID: 31810988 PMCID: PMC7017024 DOI: 10.1534/genetics.119.302929] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/27/2019] [Accepted: 11/25/2019] [Indexed: 12/31/2022] Open
Abstract
Amino acid substitutions are commonly found in human transcription factors, yet the functional consequences of much of this variation remain unknown, even in well-characterized DNA-binding domains. Here, we examine how six single-amino acid variants in the DNA-binding domain of Ste12 (a yeast transcription factor regulating mating and invasion) alter Ste12 genome binding, motif recognition, and gene expression to yield markedly different phenotypes. Using a combination of the "calling-card" method, RNA sequencing, and HT-SELEX (high throughput systematic evolution of ligands by exponential enrichment), we find that variants with dissimilar binding and expression profiles can converge onto similar cellular behaviors. Mating-defective variants led to decreased expression of distinct subsets of genes necessary for mating. Hyper-invasive variants also decreased expression of subsets of genes involved in mating, but increased the expression of other subsets of genes associated with the cellular response to osmotic stress. While single-amino acid changes in the coding region of this transcription factor result in complex regulatory reconfiguration, the major phenotypic consequences for the cell appear to depend on changes in the expression of a small number of genes with related functions.
Affiliation(s)
- Wei Zhou
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195
- Molecular and Cellular Biology Program, University of Washington, Seattle, Washington 98195
| | - Michael W Dorrity
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195
| | - Kerry L Bubb
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195
| | - Christine Queitsch
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195
| | - Stanley Fields
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195
- Department of Medicine, University of Washington, Seattle, Washington 98195
| |
|
164
|
Frochaux MV, Bou Sleiman M, Gardeux V, Dainese R, Hollis B, Litovchenko M, Braman VS, Andreani T, Osman D, Deplancke B. cis-regulatory variation modulates susceptibility to enteric infection in the Drosophila genetic reference panel. Genome Biol 2020; 21:6. [PMID: 31948474 PMCID: PMC6966807 DOI: 10.1186/s13059-019-1912-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/15/2019] [Accepted: 12/05/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Resistance to enteric pathogens is a complex trait at the crossroads of multiple biological processes. We have previously shown in the Drosophila Genetic Reference Panel (DGRP) that resistance to infection is highly heritable, but our understanding of how the effects of genetic variants affect different molecular mechanisms to determine gut immunocompetence is still limited. RESULTS: To address this, we perform a systems genetics analysis of the gut transcriptomes from 38 DGRP lines that were orally infected with Pseudomonas entomophila. We identify a large number of condition-specific, expression quantitative trait loci (local-eQTLs) with infection-specific ones located in regions enriched for FOX transcription factor motifs. By assessing the allelic imbalance in the transcriptomes of 19 F1 hybrid lines from a large round robin design, we independently attribute a robust cis-regulatory effect to only 10% of these detected local-eQTLs. However, additional analyses indicate that many local-eQTLs may act in trans instead. Comparison of the transcriptomes of DGRP lines that were either susceptible or resistant to Pseudomonas entomophila infection reveals nutcracker as the only differentially expressed gene. Interestingly, we find that nutcracker is linked to infection-specific eQTLs that correlate with its expression level and to enteric infection susceptibility. Further regulatory analysis reveals one particular eQTL that significantly decreases the binding affinity for the repressor Broad, driving differential allele-specific nutcracker expression. CONCLUSIONS: Our collective findings point to a large number of infection-specific cis- and trans-acting eQTLs in the DGRP, including one common non-coding variant that lowers enteric infection susceptibility.
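As a rough illustration of the allelic imbalance idea mentioned in this abstract: in an F1 hybrid, a cis-acting variant is expected to skew RNA-seq reads toward one parental allele, whereas a purely trans-acting effect should leave the two alleles balanced. The sketch below is a minimal exact two-sided binomial test against a 50:50 expectation. The function name, the read counts and the choice of test are assumptions made for illustration, not the authors' pipeline.

```python
from math import comb


def allelic_imbalance_pvalue(ref_reads: int, alt_reads: int, expected_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allele-specific read counts (illustrative only)."""
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    # Probability of every possible reference-read count under the null hypothesis
    probs = [
        comb(n, k) * expected_ref**k * (1 - expected_ref) ** (n - k)
        for k in range(n + 1)
    ]
    observed = probs[ref_reads]
    # Sum the outcomes that are no more likely than the observed one
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


# Hypothetical read counts at one heterozygous site in one F1 hybrid
balanced = allelic_imbalance_pvalue(52, 48)
skewed = allelic_imbalance_pvalue(80, 20)
print(f"balanced site p = {balanced:.3f}, skewed site p = {skewed:.2e}")
```

A real analysis would also need to handle mapping bias, overdispersion and multiple testing, none of which this sketch addresses.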
Affiliation(s)
- Michael V. Frochaux
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Maroun Bou Sleiman
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Current Address: Laboratory of Integrative Systems Physiology, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Vincent Gardeux
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Riccardo Dainese
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Brian Hollis
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Current Address: Department of Biological Sciences, University of South Carolina, Columbia, South Carolina USA
| | - Maria Litovchenko
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL) and Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Virginie S. Braman
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Tommaso Andreani
- Computational Biology and Data Mining Group, Institute of Molecular Biology, Johannes Gutenberg-Universität Mainz, Mainz, Germany
| | - Dani Osman
- Faculty of Sciences III and Azm Center for Research in Biotechnology and its Applications, LBA3B, EDST, Lebanese University, Tripoli, 1300 Lebanon
| | - Bart Deplancke
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| |
|
165
|
Wragg D, Liu Q, Lin Z, Riggio V, Pugh CA, Beveridge AJ, Brown H, Hume DA, Harris SE, Deary IJ, Tenesa A, Prendergast JGD. Using regulatory variants to detect gene-gene interactions identifies networks of genes linked to cell immortalisation. Nat Commun 2020; 11:343. [PMID: 31953380 PMCID: PMC6969137 DOI: 10.1038/s41467-019-13762-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/28/2018] [Accepted: 11/19/2019] [Indexed: 12/30/2022] Open
Abstract
The extent to which the impact of regulatory genetic variants may depend on other factors, such as the expression levels of upstream transcription factors, remains poorly understood. Here we report a framework in which regulatory variants are first aggregated into sets, and using these as estimates of the total cis-genetic effects on a gene we model their non-additive interactions with the expression of other genes in the genome. Using 1220 lymphoblastoid cell lines across platforms and independent datasets we identify 74 genes where the impact of their regulatory variant-set is linked to the expression levels of networks of distal genes. We show that these networks are predominantly associated with tumourigenesis pathways, through which immortalised cells are able to rapidly proliferate. We consequently present an approach to define gene interaction networks underlying important cellular pathways such as cell immortalisation.
Affiliation(s)
- D. Wragg
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - Q. Liu
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - Z. Lin
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - V. Riggio
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - C. A. Pugh
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - A. J. Beveridge
- Glasgow Polyomics, College of Medical, Veterinary and Life Science, University of Glasgow, Glasgow, UK (ISNI 0000 0001 2193 314X; GRID grid.8756.c)
| | - H. Brown
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - D. A. Hume
- Mater Research Institute-University of Queensland, Translational Research Institute, Woolloongabba, QLD 4102 Australia (ISNI 0000 0004 0618 0938; GRID grid.489335.0)
| | - S. E. Harris
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, EH8 9JZ UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - I. J. Deary
- Centre for Cognitive Ageing and Cognitive Epidemiology, University of Edinburgh, Edinburgh, EH8 9JZ UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - A. Tenesa
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| | - J. G. D. Prendergast
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, EH25 9RG UK (ISNI 0000 0004 1936 7988; GRID grid.4305.2)
| |
|
166
|
Chadaeva IV, Rasskazov DA, Sharypova EB, Drachkova IA, Oshchepkova EA, Savinkova LK, Ponomarenko PM, Ponomarenko MP, Kolchanov NA, Kozlov VA. Candidate SNP-markers of rheumatoid arthritis that can significantly alter the affinity of the TATA-binding protein for human gene promoters. Vavilovskii Zhurnal Genet Selektsii 2020. [DOI: 10.18699/vj19.586] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/07/2023] Open
Abstract
Rheumatoid arthritis (RA) is an autoimmune disease with autoantibodies, including antibodies to citrullinated antigens, and proinflammatory cytokines, such as TNF-α and IL-6, which are involved in the induction of chronic synovitis and bone erosion, followed by deformity. Its immunopathogenesis rests on a breakdown of immune tolerance to self-antigens, characterized by increased activity of effector T cells, which causes RA symptoms. Against the background of this increased effector lymphocyte activity, a decrease is recorded in the activity of a number of regulatory cells, including regulatory T cells (Treg) and myeloid suppressor cells. There is reason to say that the change in suppressor cell activity is the leading element in RA pathogenesis. That is why only periods of weakening (remission) of RA are spoken of. Because the female immune system is more powerful than the male one, the risk of developing RA is three times as high in women; this risk decreases during breastfeeding and grows during pregnancy as well as after menopause, in proportion to the level of sex hormones. It is believed that 50% of the risk of developing RA depends on conditions and lifestyle, while the remaining 50% depends on genetic predisposition. RA therefore fits the main idea of postgenomic predictive-preventive personalized medicine, which is to give those who wish to reduce their disease risk a chance to do so by bringing their conditions and lifestyle in line with their sequenced genome. This is very important, since doctors consider RA one of the most frequent causes of disability. Using the Web service SNP_TATA_Z-tester (http://beehive.bionet.nsc.ru/cgi-bin/mgs/tatascan_fox/start.pl), 227 single nucleotide polymorphism (SNP) variants in human gene promoters were studied. As a result, 43 candidate SNP markers for RA that can alter the affinity of the TATA-binding protein (TBP) for the promoters of these genes were predicted.
Affiliation(s)
- I. V. Chadaeva
- Institute of Cytology and Genetics, SB RAS; Novosibirsk State University
| | - M. P. Ponomarenko
- Institute of Cytology and Genetics, SB RAS; Novosibirsk State University
| | - N. A. Kolchanov
- Institute of Cytology and Genetics, SB RAS; Novosibirsk State University
| | - V. A. Kozlov
- Research Institute of Fundamental and Clinical Immunology
| |
|
167
|
Ibarra IL, Hollmann NM, Klaus B, Augsten S, Velten B, Hennig J, Zaugg JB. Mechanistic insights into transcription factor cooperativity and its impact on protein-phenotype interactions. Nat Commun 2020; 11:124. [PMID: 31913281 PMCID: PMC6949242 DOI: 10.1038/s41467-019-13888-7] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 05/16/2019] [Accepted: 11/28/2019] [Indexed: 11/25/2022] Open
Abstract
Recent high-throughput transcription factor (TF) binding assays revealed that TF cooperativity is a widespread phenomenon. However, a global mechanistic and functional understanding of TF cooperativity is still lacking. To address this, here we introduce a statistical learning framework that provides structural insight into TF cooperativity and its functional consequences based on next generation sequencing data. We identify DNA shape as driver for cooperativity, with a particularly strong effect for Forkhead-Ets pairs. Follow-up experiments reveal a local shape preference at the Ets-DNA-Forkhead interface and decreased cooperativity upon loss of the interaction. Additionally, we discover many functional associations for cooperatively bound TFs. Examination of the link between FOXO1:ETV6 and lymphomas reveals that their joint expression levels improve patient clinical outcome stratification. Altogether, our results demonstrate that inter-family cooperative TF binding is driven by position-specific DNA readout mechanisms, which provides an additional regulatory layer for downstream biological functions. Although transcription factor (TF) cooperativity is widespread, a global mechanistic understanding of the role of TF cooperativity is still lacking. Here the authors introduce a statistical learning framework that provides structural insight into TF cooperativity and its functional consequences based on next generation sequencing data and provide mechanistic insights into TF cooperativity and its impact on protein-phenotype interactions.
Affiliation(s)
- Ignacio L Ibarra
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; Faculty of Biosciences, Collaboration for Joint PhD Degree between EMBL and Heidelberg University, Heidelberg, Germany
| | - Nele M Hollmann
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; Faculty of Biosciences, Collaboration for Joint PhD Degree between EMBL and Heidelberg University, Heidelberg, Germany
| | - Bernd Klaus
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Sandra Augsten
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Britta Velten
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Janosch Hennig
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Judith B Zaugg
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.
| |
|
168
|
Broekema RV, Bakker OB, Jonkers IH. A practical view of fine-mapping and gene prioritization in the post-genome-wide association era. Open Biol 2020; 10:190221. [PMID: 31937202 PMCID: PMC7014684 DOI: 10.1098/rsob.190221] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 09/13/2019] [Accepted: 12/05/2019] [Indexed: 12/17/2022] Open
Abstract
Over the past 15 years, genome-wide association studies (GWASs) have enabled the systematic identification of genetic loci associated with traits and diseases. However, due to resolution issues and methodological limitations, the true causal variants and genes associated with traits remain difficult to identify. In this post-GWAS era, many biological and computational fine-mapping approaches now aim to solve these issues. Here, we review fine-mapping and gene prioritization approaches that, when combined, will improve the understanding of the underlying mechanisms of complex traits and diseases. Fine-mapping of genetic variants has become increasingly sophisticated: initially, variants were simply overlapped with functional elements, but now the impact of variants on regulatory activity and direct variant-gene 3D interactions can be identified. Moreover, gene manipulation by CRISPR/Cas9, the identification of expression quantitative trait loci and the use of co-expression networks have all increased our understanding of the genes and pathways affected by GWAS loci. However, despite this progress, limitations including the lack of cell-type- and disease-specific data and the ever-increasing complexity of polygenic models of traits pose serious challenges. Indeed, the combination of fine-mapping and gene prioritization by statistical, functional and population-based strategies will be necessary to truly understand how GWAS loci contribute to complex traits and diseases.
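The simplest fine-mapping step described above, overlapping variants with functional elements, can be sketched as follows. The coordinates, element labels and variant names are made up for illustration, and the lookup assumes sorted, non-overlapping elements on a single chromosome; real analyses would typically rely on dedicated interval tools.

```python
from bisect import bisect_right

# Hypothetical functional elements on one chromosome: (start, end, label),
# half-open intervals, sorted by start and non-overlapping
elements = [
    (1_000, 1_500, "enhancer_A"),
    (4_000, 4_200, "promoter_B"),
    (9_000, 9_800, "enhancer_C"),
]
starts = [start for start, _, _ in elements]


def annotate(position: int):
    """Return the label of the element containing `position`, or None."""
    i = bisect_right(starts, position) - 1  # last element starting at or before position
    if i >= 0:
        start, end, label = elements[i]
        if start <= position < end:
            return label
    return None


# Hypothetical variants in one GWAS locus: name -> position
variants = {"var1": 1_250, "var2": 3_999, "var3": 4_100, "var4": 9_800}
overlaps = {name: annotate(pos) for name, pos in variants.items()}
print(overlaps)
```

Variants that fall in no element (here `var2` and `var4`) are kept with `None` so that they can still be carried into later prioritization steps.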
Affiliation(s)
- I. H. Jonkers
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| |
|
169
|
The Temporal Neurogenesis Patterning of Spinal p3-V3 Interneurons into Divergent Subpopulation Assemblies. J Neurosci 2019; 40:1440-1452. [PMID: 31826942 DOI: 10.1523/jneurosci.1518-19.2019] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 07/01/2019] [Revised: 12/01/2019] [Accepted: 12/06/2019] [Indexed: 11/21/2022] Open
Abstract
Neuronal diversity provides the spinal cord with the functional flexibility required to perform complex motor tasks. Spinal neurons arise during early embryonic development with the establishment of spatially and molecularly discrete progenitor domains that give rise to distinct, but highly heterogeneous, postmitotic interneuron (IN) populations. Our previous studies have shown that Sim1-expressing V3 INs, originating from the p3 progenitor domain, are anatomically and physiologically divergent. However, the developmental logic guiding V3 subpopulation diversity remains elusive. In specific cases of other IN classes, neurogenesis timing can play a role in determining the ultimate fates and unique characteristics of distinctive subpopulations. To examine whether neurogenesis timing contributes to V3 diversity, we systematically investigated the temporal neurogenesis profiles of V3 INs in the mouse spinal cord. Our work uncovered that V3 INs were organized into either early-born [embryonic day 9.5 (E9.5) to E10.5] or late-born (E11.5-E12.5) neurogenic waves. Early-born V3 INs displayed both ascending and descending commissural projections and clustered into subgroups across dorsoventral spinal laminae. In contrast, late-born V3 INs became fate-restricted to ventral laminae and displayed mostly descending and local commissural projections and uniform membrane properties. Furthermore, we found that the postmitotic transcription factor, Sim1, although expressed in all V3 INs, exclusively regulated the dorsal clustering and electrophysiological diversification of early-born, but not late-born, V3 INs, which indicates that neurogenesis timing may enable newborn V3 INs to interact with different postmitotic differentiation pathways. 
Thus, our work demonstrates neurogenesis timing as a developmental mechanism underlying the postmitotic differentiation of V3 INs into distinct subpopulation assemblies. SIGNIFICANCE STATEMENT: Interneuron (IN) diversity empowers the spinal cord with the computation flexibility required to perform appropriate sensorimotor control. As such, uncovering the developmental logic guiding spinal IN diversity is fundamental to understanding the development of movement. In our current work, through a focus on the cardinal spinal V3 IN population, we investigated the role of neurogenesis timing on IN diversity. We uncovered that V3 INs are organized into early-born [embryonic day 9.5 (E9.5) to E10.5] or late-born (E11.5-E12.5) neurogenic waves, where late-born V3 INs display increasingly restricted subpopulation fates. Next, to better understand the consequences of V3 neurogenesis timing, we investigated the time-dependent functions of the Sim1 transcription factor, which is expressed in postmitotic V3 INs. Interestingly, Sim1 exclusively regulated the diversification of early-born, but not late-born, V3 INs. Thus, our current work indicates neurogenesis timing can modulate the functions of early postmitotic transcription factors and, thus, subpopulation fate specifications.
|
170
|
Campbell MC, Ashong B, Teng S, Harvey J, Cross CN. Multiple selective sweeps of ancient polymorphisms in and around LTα located in the MHC class III region on chromosome 6. BMC Evol Biol 2019; 19:218. [PMID: 31791241 PMCID: PMC6889576 DOI: 10.1186/s12862-019-1516-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/28/2019] [Accepted: 09/20/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Lymphotoxin-α (LTα), located in the Major Histocompatibility Complex (MHC) class III region on chromosome 6, encodes a cytotoxic protein that mediates a variety of antiviral responses among other biological functions. Furthermore, several genotypes at this gene have been implicated in the onset of a number of complex diseases, including myocardial infarction, autoimmunity, and various types of cancer. However, little is known about levels of nucleotide variation and linkage disequilibrium (LD) in and near LTα, which could also influence phenotypic variance. To address this gap in knowledge, we examined sequence variation across ~10 kilobases (kb), encompassing LTα and the upstream region, in 2039 individuals from the 1000 Genomes Project originating from 21 global populations. RESULTS: Here, we observed striking patterns of diversity, including an excess of intermediate-frequency alleles, the maintenance of multiple common haplotypes and a deep coalescence time for variation (dating > 1.0 million years ago), in global populations. While these results are generally consistent with a model of balancing selection, we also uncovered a signature of positive selection in the form of long-range LD on chromosomes with derived alleles primarily in Eurasian populations. To reconcile these findings, which appear to support different models of selection, we argue that selective sweeps (particularly, soft sweeps) of multiple derived alleles in and/or near LTα occurred in non-Africans after their ancestors left Africa. Furthermore, these targets of selection were predicted to alter transcription factor binding site affinity and protein stability, suggesting they play a role in gene function. Additionally, our data also showed that a subset of these functional adaptive variants are present in archaic hominin genomes. CONCLUSIONS: Overall, this study identified candidate functional alleles in a biologically-relevant genomic region, and offers new insights into the evolutionary origins of these loci in modern human populations.
Affiliation(s)
- Michael C. Campbell
- Department of Biology, College of Arts and Sciences, Howard University, Washington, DC 20059 USA
| | - Bryan Ashong
- Department of Biology, College of Arts and Sciences, Howard University, Washington, DC 20059 USA
| | - Shaolei Teng
- Department of Biology, College of Arts and Sciences, Howard University, Washington, DC 20059 USA
| | - Jayla Harvey
- Department of Biology, College of Arts and Sciences, Howard University, Washington, DC 20059 USA
| | - Christopher N. Cross
- Department of Anatomy, College of Medicine, Howard University, Washington, DC 20059 USA
| |
|
171
|
Liu D, Davila-Velderrain J, Zhang Z, Kellis M. Integrative construction of regulatory region networks in 127 human reference epigenomes by matrix factorization. Nucleic Acids Res 2019; 47:7235-7246. [PMID: 31265076 PMCID: PMC6698807 DOI: 10.1093/nar/gkz538] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/27/2018] [Revised: 04/19/2019] [Accepted: 06/09/2019] [Indexed: 01/14/2023] Open
Abstract
Despite large experimental and computational efforts aiming to dissect the mechanisms underlying disease risk, mapping cis-regulatory elements to target genes remains a challenge. Here, we introduce a matrix factorization framework to integrate physical and functional interaction data of genomic segments. The framework was used to predict a regulatory network of chromatin interaction edges linking more than 20 000 promoters and 1.8 million enhancers across 127 human reference epigenomes, including edges that are present in any of the input datasets. Our network integrates functional evidence of correlated activity patterns from epigenomic data and physical evidence of chromatin interactions. An important contribution of this work is the representation of heterogeneous data with different qualities as networks. We show that the unbiased integration of independent data sources suggestive of regulatory interactions produces meaningful associations supported by existing functional and physical evidence, correlating with expected independent biological features.
Affiliation(s)
- Dianbo Liu
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, DD1 5HL, Scotland, UK
| | - Jose Davila-Velderrain
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Zhizhuo Zhang
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Manolis Kellis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| |
|
172
|
Korneev KV, Sviriaeva EN, Mitkin NA, Gorbacheva AM, Uvarova AN, Ustiugova AS, Polanovsky OL, Kulakovskiy IV, Afanasyeva MA, Schwartz AM, Kuprash DV. Minor C allele of the SNP rs7873784 associated with rheumatoid arthritis and type-2 diabetes mellitus binds PU.1 and enhances TLR4 expression. Biochim Biophys Acta Mol Basis Dis 2019; 1866:165626. [PMID: 31785408 DOI: 10.1016/j.bbadis.2019.165626] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/23/2019] [Revised: 11/08/2019] [Accepted: 11/26/2019] [Indexed: 12/19/2022]
Abstract
Toll-like receptor 4 (TLR4) is an innate immunity receptor predominantly expressed on myeloid cells and involved in the development of various diseases, many of them with complex genetics. Here we present data on the functionality of the single nucleotide polymorphism rs7873784, which is located in the 3'-untranslated region (3'-UTR) of the TLR4 gene and is associated with various pathologies involving chronic inflammation. We demonstrate that the TLR4 3'-UTR strongly enhanced the activity of the TLR4 promoter in the U937 human monocytic cell line, while the minor rs7873784(C) allele created a binding site for the transcription factor PU.1 (encoded by the SPI1 gene), a known regulator of TLR4 expression. Increased binding of PU.1 further augmented TLR4 transcription, while PU.1 knockdown or complete disruption of the PU.1 binding site abrogated the effect. We hypothesize that an additional functional PU.1 site may increase TLR4 expression in individuals carrying the minor C variant of rs7873784 and modulate the development of certain pathologies, such as rheumatoid arthritis and type-2 diabetes mellitus.
Affiliation(s)
- Kirill V Korneev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Ekaterina N Sviriaeva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Nikita A Mitkin
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Alisa M Gorbacheva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia; Biological Faculty, Lomonosov Moscow State University, 119234 Moscow, Russia
| | - Aksinya N Uvarova
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia; Biological Faculty, Lomonosov Moscow State University, 119234 Moscow, Russia
| | - Alina S Ustiugova
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia; Biological Faculty, Lomonosov Moscow State University, 119234 Moscow, Russia
| | - Oleg L Polanovsky
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991 Moscow, Russia; Institute of Mathematical Problems of Biology, Keldysh Institute of Applied Mathematics, Russian Academy of Sciences, 142290 Pushchino, Russia
| | - Marina A Afanasyeva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Anton M Schwartz
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
| | - Dmitry V Kuprash
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia; Biological Faculty, Lomonosov Moscow State University, 119234 Moscow, Russia.
|
173
|
Hoffman GE, Bendl J, Girdhar K, Schadt EE, Roussos P. Functional interpretation of genetic variants using deep learning predicts impact on chromatin accessibility and histone modification. Nucleic Acids Res 2019; 47:10597-10611. [PMID: 31544924 PMCID: PMC6847046 DOI: 10.1093/nar/gkz808] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 11/20/2018] [Revised: 08/28/2019] [Accepted: 09/12/2019] [Indexed: 12/19/2022]
Abstract
Identifying functional variants underlying disease risk and adoption of personalized medicine are currently limited by the challenge of interpreting the functional consequences of genetic variants. Predicting the functional effects of disease-associated protein-coding variants is increasingly routine. Yet, the vast majority of risk variants are non-coding, and predicting the functional consequence and prioritizing variants for functional validation remains a major challenge. Here, we develop a deep learning model to accurately predict locus-specific signals from four epigenetic assays using only DNA sequence as input. Given the predicted epigenetic signal from DNA sequence for the reference and alternative alleles at a given locus, we generate a score of the predicted epigenetic consequences for 438 million variants observed in previous sequencing projects. These impact scores are assay-specific, are predictive of allele-specific transcription factor binding and are enriched for variants associated with gene expression and disease risk. Nucleotide-level functional consequence scores for non-coding variants can refine the mechanism of known functional variants, identify novel risk variants and prioritize downstream experiments.
Affiliation(s)
- Gabriel E Hoffman
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Jaroslav Bendl
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kiran Girdhar
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Eric E Schadt
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Sema4, Stamford, CT, USA
| | - Panos Roussos
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mental Illness Research, Education, and Clinical Center (VISN 2 South), James J. Peters VA Medical Center, Bronx, NY, USA
|
174
|
Santana-Garcia W, Rocha-Acevedo M, Ramirez-Navarro L, Mbouamboua Y, Thieffry D, Thomas-Chollier M, Contreras-Moreira B, van Helden J, Medina-Rivera A. RSAT variation-tools: An accessible and flexible framework to predict the impact of regulatory variants on transcription factor binding. Comput Struct Biotechnol J 2019; 17:1415-1428. [PMID: 31871587 PMCID: PMC6906655 DOI: 10.1016/j.csbj.2019.09.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 04/27/2019] [Revised: 09/22/2019] [Accepted: 09/25/2019] [Indexed: 02/06/2023]
Abstract
Gene regulatory regions contain short and degenerate DNA binding sites recognized by transcription factors (TFBS). When TFBS harbor SNPs, the DNA binding site may be affected, thereby altering the transcriptional regulation of the target genes. Such regulatory SNPs have been implicated as causal variants in Genome-Wide Association Studies (GWAS). In this study, we describe improved versions of the Variation-tools programs, designed to predict regulatory variants, and present four case studies to illustrate their usage and applications. In brief, Variation-tools facilitate i) obtaining variation information, ii) interconversion of variation file formats, iii) retrieval of sequences surrounding variants, and iv) calculating the change in predicted transcription factor affinity scores between alleles, using motif scanning approaches. Notably, the tools support the analysis of haplotypes. The tools are included within the well-maintained suite Regulatory Sequence Analysis Tools (RSAT, http://rsat.eu), and are accessible through a web interface that currently enables analysis of five metazoan and ten plant genomes. Variation-tools can also be used from the command line with any locally installed Ensembl genome. Users can input personal collections of variants and motifs, providing flexibility in the analysis.
Key Words
- Binding motifs
- CEU, Northern Europeans from Utah
- CRM, Cis-Regulatory Module
- GWAS, Genome Wide Association Studies
- LD, Linkage Disequilibrium
- MPRA, Massively Parallel Reporter Assays
- PSSM, Position Specific Scoring Matrix
- Position specific scoring matrix
- ROC, Receiver Operating Characteristic
- RSAT, Regulatory Sequence Analysis Tools
- Regulatory variants
- SNP, Single Nucleotide Polymorphism
- SNPs
- SOIs, SNPs of Interest
- TF, Transcription Factor
- TFBS, Transcription Factor Binding Site
- Transcription factors
- eQTL, Expression Quantitative Trait Loci
- rsID, Reference SNP Identifier
Affiliation(s)
- Walter Santana-Garcia
- Institut de Biologie de l’ENS (IBENS), Département de biologie, École normale supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Blvd Juriquilla 3001, Santiago de Querétaro 76230, Mexico
| | - Maria Rocha-Acevedo
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Blvd Juriquilla 3001, Santiago de Querétaro 76230, Mexico
| | - Lucia Ramirez-Navarro
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Blvd Juriquilla 3001, Santiago de Querétaro 76230, Mexico
| | - Yvon Mbouamboua
- Fondation Congolaise pour la Recherche Médicale, Brazzaville, People’s Republic of Congo
- Aix-Marseille Univ, INSERM UMR S 1090, Theory and Approaches of Genome Complexity (TAGC), F-13288 Marseille, France
| | - Denis Thieffry
- Institut de Biologie de l’ENS (IBENS), Département de biologie, École normale supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
| | - Morgane Thomas-Chollier
- Institut de Biologie de l’ENS (IBENS), Département de biologie, École normale supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
| | | | - Jacques van Helden
- Aix-Marseille Univ, INSERM UMR S 1090, Theory and Approaches of Genome Complexity (TAGC), F-13288 Marseille, France
- CNRS, Institut Français de Bioinformatique, IFB-core, UMS 3601, Evry, France
- Corresponding authors at: Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Blvd Juriquilla 3001, Santiago de Querétaro 76230, México (Medina-Rivera). Aix-Marseille Univ, INSERM UMR S 1090, Theory and Approaches of Genome Complexity (TAGC), F-13288 Marseille, France (J. van Helden).
| | - Alejandra Medina-Rivera
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Blvd Juriquilla 3001, Santiago de Querétaro 76230, Mexico
- Corresponding authors at: Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Campus Juriquilla, Blvd Juriquilla 3001, Santiago de Querétaro 76230, México (Medina-Rivera). Aix-Marseille Univ, INSERM UMR S 1090, Theory and Approaches of Genome Complexity (TAGC), F-13288 Marseille, France (J. van Helden).
|
175
|
Li M, Jiang L, Mak TSH, Kwan JSH, Xue C, Chen P, Leung HCM, Cui L, Li T, Sham PC. A powerful conditional gene-based association approach implicated functionally important genes for schizophrenia. Bioinformatics 2019; 35:628-635. [PMID: 30101339 DOI: 10.1093/bioinformatics/bty682] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/04/2018] [Revised: 06/27/2018] [Accepted: 08/06/2018] [Indexed: 02/05/2023]
Abstract
MOTIVATION: It remains challenging to unravel new susceptibility genes of complex diseases and the mechanisms in genome-wide association studies. There are at least two difficulties: isolation of the genuine susceptibility genes from many indirectly associated genes, and functional validation of these genes. RESULTS: We first proposed a novel conditional gene-based association test which can use only summary statistics to isolate independently associated genes of a disease. Applying this method, we detected 185 genes of independent association with schizophrenia. We then designed an in-silico experiment based on expression/co-expression to systematically validate the pathogenic potential of these genes. We found that genes of independent association with schizophrenia formed more co-expression pairs in normal post-natal but not pre-natal human brain regions than expected. Interestingly, no co-expression enrichment was found in the brain regions of schizophrenia patients. The genes with independent association also had more significant P-values for differential expression between schizophrenia patients and controls in the brain regions. In contrast, indirectly associated genes or genes associated by other widely used gene-based tests had no such differential expression and co-expression patterns. In summary, this conditional gene-based association test is effective for isolating directly associated genes from indirectly associated genes, and the results suggest that common variants might contribute to schizophrenia largely by distorting expression and co-expression in post-natal brains. AVAILABILITY AND IMPLEMENTATION: The conditional gene-based association test has been implemented in a platform 'KGG' in Java and is publicly available at http://grass.cgs.hku.hk/limx/kgg/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Miaoxin Li
- Zhongshan School of Medicine, First Affiliated Hospital, Center for Genome Research, Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China.,The Centre for Genomic Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.,Department of Psychiatry, The University of Hong Kong, Pokfulam, Hong Kong, China.,State Key Laboratory for Cognitive and Brain Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.,Key Laboratory of Tropical Disease Control (SYSU), Ministry of Education, Guangzhou, Hong Kong, China
| | - Lin Jiang
- Zhongshan School of Medicine, First Affiliated Hospital, Center for Genome Research, Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China.,The Centre for Genomic Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Timothy Shin Heng Mak
- The Centre for Genomic Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China
| | | | - Chao Xue
- Zhongshan School of Medicine, First Affiliated Hospital, Center for Genome Research, Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China
| | - Peikai Chen
- The Centre for Genomic Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.,School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Henry Chi-Ming Leung
- Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong, China
| | - Liqian Cui
- The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Tao Li
- The Mental Health Center and the Psychiatric Laboratory, West China Hospital, Sichuan University, Chengdu, China
| | - Pak Chung Sham
- The Centre for Genomic Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China.,Department of Psychiatry, The University of Hong Kong, Pokfulam, Hong Kong, China.,State Key Laboratory for Cognitive and Brain Sciences, The University of Hong Kong, Pokfulam, Hong Kong, China
|
176
|
Penzar DD, Zinkevich AO, Vorontsov IE, Sitnik VV, Favorov AV, Makeev VJ, Kulakovskiy IV. What Do Neighbors Tell About You: The Local Context of Cis-Regulatory Modules Complicates Prediction of Regulatory Variants. Front Genet 2019; 10:1078. [PMID: 31737053 PMCID: PMC6834773 DOI: 10.3389/fgene.2019.01078] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/15/2019] [Accepted: 10/09/2019] [Indexed: 02/05/2023]
Abstract
Many problems of modern genetics and functional genomics require the assessment of functional effects of sequence variants, including gene expression changes. Machine learning is considered to be a promising approach for solving this task, but its practical applications remain a challenge due to the insufficient volume and diversity of training data. A promising source of valuable data is a saturation mutagenesis massively parallel reporter assay, which quantitatively measures changes in transcription activity caused by sequence variants. Here, we explore the computational predictions of the effects of individual single-nucleotide variants on gene transcription measured in the massively parallel reporter assays, based on the data from the recent "Regulation Saturation" Critical Assessment of Genome Interpretation challenge. We show that the estimated prediction quality strongly depends on the structure of the training and validation data. Particularly, training on the sequence segments located next to the validation data results in the "information leakage" caused by the local context. This information leakage allows reproducing the prediction quality of the best CAGI challenge submissions with a fairly simple machine learning approach, and even obtaining notably better-than-random predictions using irrelevant genomic regions. Validation scenarios preventing such information leakage dramatically reduce the measured prediction quality. The performance at independent regulatory regions entirely excluded from the training set appears to be much lower than needed for practical applications, and even the performance estimation will become reliable only in the future with richer data from multiple reporters. The source code and data are available at https://bitbucket.org/autosomeru_cagi2018/cagi2018_regsat and https://genomeinterpretation.org/content/expression-variants.
Affiliation(s)
- Dmitry D. Penzar
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Department of Medical and Biological Physics, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
| | - Arsenii O. Zinkevich
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
| | - Ilya E. Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Vasily V. Sitnik
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
| | - Alexander V. Favorov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, The Johns Hopkins University School of Medicine, Baltimore, MD, United States
| | - Vsevolod J. Makeev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Department of Medical and Biological Physics, Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
| | - Ivan V. Kulakovskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Institute of Mathematical Problems of Biology RAS - the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Pushchino, Russia
|
177
|
Wnuk K, Sudol J, Givechian KB, Soon-Shiong P, Rabizadeh S, Szeto C, Vaske C. Deep Learning Implicitly Handles Tissue Specific Phenomena to Predict Tumor DNA Accessibility and Immune Activity. iScience 2019; 20:119-136. [PMID: 31563852 PMCID: PMC6823659 DOI: 10.1016/j.isci.2019.09.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/29/2019] [Revised: 08/23/2019] [Accepted: 09/11/2019] [Indexed: 01/22/2023]
Abstract
DNA accessibility is a key dynamic feature of chromatin regulation that can potentiate transcriptional events and tumor progression. To gain insight into chromatin state across existing tumor data, we improved neural network models for predicting accessibility from DNA sequence and extended them to incorporate a global set of RNA sequencing gene expression inputs. Our expression-informed model expanded the application domain beyond specific tissue types to tissues not present in training and achieved consistently high accuracy in predicting DNA accessibility at promoter and promoter flank regions. We then leveraged our new tool by analyzing the DNA accessibility landscape of promoters across The Cancer Genome Atlas. We show that in lung adenocarcinoma the accessibility perspective uniquely highlights immune pathways inversely correlated with a more open chromatin state and that accessibility patterns learned from even a single tumor type can discriminate immune inflammation across many cancers, often with direct relation to patient prognosis.
Affiliation(s)
- Kamil Wnuk
- ImmunityBio Inc., Culver City, CA 90232, USA.
|
178
|
Kaul T, Eswaran M, Ahmad S, Thangaraj A, Jain R, Kaul R, Raman NM, Bharti J. Probing the effect of a plus 1bp frameshift mutation in protein-DNA interface of domestication gene, NAMB1, in wheat. J Biomol Struct Dyn 2019; 38:3633-3647. [PMID: 31621500 DOI: 10.1080/07391102.2019.1680435] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 10/25/2022]
Abstract
Transcription factor NAM-B1 has a major role in the process of senescence, which results in higher Fe and Zn concentrations in grains of wild wheat (T. durum; Td). The absence of the wild type NAMB1 in T. aestivum (Ta), one of the cardinal crops essential for more than one third of the global population, affects Fe and Zn remobilisation from the flag leaf to the maturing grain, resulting in lower micronutrient bioavailability. The cardinal difference in the NAMB1 gene between the two species is the absence of the +1 bp allele in Ta. In silico studies using NAMB1 from Td and Ta were performed to explore the variation in the interaction with the conserved cis-element DNA motif (CATGTG), as both proteins share the same domain but no in silico studies of these proteins have been reported. The secondary structure, 3D modelling of the proteins, DNA-protein docking and dynamics were computed with the Schrodinger Prime Suite. Predicted secondary structures were energy minimised using Macromodel, and docking was performed based on binding energy and hydrogen bonds. Molecular dynamics simulation of NAMB1-Ta and NAMB1-Td individually and with the cis-element motif, performed for 100 ns, revealed significant variations in the protein-DNA interaction in Ta. This work provides the modelled 3D-interaction profile resulting from a single-bp frameshift mutation, which helps in understanding the difference in function between NAMB1 orthologs due to lack of the NAC domain. The overall computational analysis reveals that NAMB1-Ta and NAMB1-Td proteins display a good amount of dissimilarity in their structure, dynamics and DNA-binding characteristics. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Tanushri Kaul
- Nutritional Improvement of Crops Group, Plant Molecular Biology Division, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
| | - Murugesh Eswaran
- Nutritional Improvement of Crops Group, Plant Molecular Biology Division, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
| | - Shaban Ahmad
- Nutritional Improvement of Crops Group, Plant Molecular Biology Division, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
| | - Arulprakash Thangaraj
- Nutritional Improvement of Crops Group, Plant Molecular Biology Division, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
| | - Rashmi Jain
- Nutritional Improvement of Crops Group, Plant Molecular Biology Division, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
| | - Rashmi Kaul
- Nutritional Improvement of Crops Group, Plant Molecular Biology Division, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
| | - Nitya Meenakshi Raman
- Nutritional Improvement of Crops Group, Plant Molecular Biology Division, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
| | - Jyotsna Bharti
- Nutritional Improvement of Crops Group, Plant Molecular Biology Division, International Centre for Genetic Engineering and Biotechnology (ICGEB), New Delhi, India
|
179
|
Lenzini L, Di Patti F, Livi R, Fondi M, Fani R, Mengoni A. A Method for the Structure-Based, Genome-Wide Analysis of Bacterial Intergenic Sequences Identifies Shared Compositional and Functional Features. Genes (Basel) 2019; 10:834. [PMID: 31652625 PMCID: PMC6826451 DOI: 10.3390/genes10100834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2019] [Revised: 10/07/2019] [Accepted: 10/16/2019] [Indexed: 11/16/2022]
Abstract
In this paper, we propose a computational strategy for performing genome-wide analyses of intergenic sequences in bacterial genomes. Following the approach of a previous paper, in which a method for genome-wide analysis of eukaryotic intergenic sequences was proposed, we developed a tool that implements similar concepts for bacterial genomes. This allows us to (i) classify intergenic sequences into clusters characterized by specific global structural features and (ii) draw possible relations with their functional features.
Affiliation(s)
- Leonardo Lenzini
- Dipartimento di Fisica e Astronomia, Università degli Studi di Firenze, Sesto Fiorentino, 50019, Italy.
- Istituto Nazionale di Fisica Nucleare, Sesto Fiorentino, 50019, Italy.
| | - Francesca Di Patti
- Dipartimento di Fisica e Astronomia, Università degli Studi di Firenze, Sesto Fiorentino, 50019, Italy.
- Centro Interdipartimentale per lo Studio delle Dinamiche Complesse, Sesto Fiorentino, 50019, Italy.
| | - Roberto Livi
- Dipartimento di Fisica e Astronomia, Università degli Studi di Firenze, Sesto Fiorentino, 50019, Italy.
- Istituto Nazionale di Fisica Nucleare, Sesto Fiorentino, 50019, Italy.
- Centro Interdipartimentale per lo Studio delle Dinamiche Complesse, Sesto Fiorentino, 50019, Italy.
- Istituto dei Sistemi Complessi, Consiglio Nazionale delle Ricerche, Sesto Fiorentino, 50019, Italy.
| | - Marco Fondi
- Dipartimento di Biologia, Università degli Studi di Firenze, Sesto Fiorentino, 50019, Italy.
| | - Renato Fani
- Istituto dei Sistemi Complessi, Consiglio Nazionale delle Ricerche, Sesto Fiorentino, 50019, Italy.
- Dipartimento di Biologia, Università degli Studi di Firenze, Sesto Fiorentino, 50019, Italy.
| | - Alessio Mengoni
- Dipartimento di Biologia, Università degli Studi di Firenze, Sesto Fiorentino, 50019, Italy.
|
180
|
van Ouwerkerk AF, Bosada FM, van Duijvenboden K, Hill MC, Montefiori LE, Scholman KT, Liu J, de Vries AAF, Boukens BJ, Ellinor PT, Goumans MJTH, Efimov IR, Nobrega MA, Barnett P, Martin JF, Christoffels VM. Identification of atrial fibrillation associated genes and functional non-coding variants. Nat Commun 2019; 10:4755. [PMID: 31628324 PMCID: PMC6802215 DOI: 10.1038/s41467-019-12721-5] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 02/23/2019] [Accepted: 09/19/2019] [Indexed: 12/31/2022]
Abstract
Disease-associated genetic variants that lie in non-coding regions found by genome-wide association studies are thought to alter the functionality of transcription regulatory elements and target gene expression. To uncover causal genetic variants, variant regulatory elements and their target genes, here we cross-reference human transcriptomic, epigenomic and chromatin conformation datasets. For 104 genetic variant regions associated with atrial fibrillation, candidate target genes are prioritized. We optimize EMERGE enhancer prediction and use accessible chromatin profiles of human atrial cardiomyocytes to more accurately predict cardiac regulatory elements and identify hundreds of sub-threshold variants that co-localize with regulatory elements. Removal of mouse homologues of atrial fibrillation-associated regions in vivo uncovers a distal regulatory region involved in Gja1 (Cx43) expression. Our analyses provide a shortlist of genes likely affected by atrial fibrillation-associated variants and provide variant regulatory elements in each region that link genetic variation and target gene regulation, helping to focus future investigations.
Affiliation(s)
- Antoinette F van Ouwerkerk
- Department of Medical Biology, Amsterdam University Medical Centers, Academic Medical Center, 1105 AZ, Amsterdam, The Netherlands
| | - Fernanda M Bosada
- Department of Medical Biology, Amsterdam University Medical Centers, Academic Medical Center, 1105 AZ, Amsterdam, The Netherlands
| | - Karel van Duijvenboden
- Department of Medical Biology, Amsterdam University Medical Centers, Academic Medical Center, 1105 AZ, Amsterdam, The Netherlands
| | - Matthew C Hill
- Program in Developmental Biology, Baylor College of Medicine, Houston, TX, 77030, USA
| | | | - Koen T Scholman
- Department of Medical Biology, Amsterdam University Medical Centers, Academic Medical Center, 1105 AZ, Amsterdam, The Netherlands
| | - Jia Liu
- Department of Cardiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Department of Cell Biology and Genetics, Center for Anti-ageing and Regenerative Medicine, Shenzhen Key Laboratory for Anti-ageing and Regenerative Medicine, Shenzhen University Medical School, Shenzhen University, Nanhai Ave, 3688, Shenzhen, China
- Netherlands Heart Institute, Holland Heart House, Moreelsepark 1, 3511 EP, Utrecht, The Netherlands
| | - Antoine A F de Vries
- Department of Cardiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Netherlands Heart Institute, Holland Heart House, Moreelsepark 1, 3511 EP, Utrecht, The Netherlands
| | - Bastiaan J Boukens
- Department of Medical Biology, Amsterdam University Medical Centers, Academic Medical Center, 1105 AZ, Amsterdam, The Netherlands
| | - Patrick T Ellinor
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cardiovascular Research Center, Massachusetts General Hospital, Charlestown, MA, USA
- Cardiac Arrhythmia Service, Massachusetts General Hospital, Boston, MA, USA
| | - Marie José T H Goumans
- Department of Cell and Chemical Biology, Leiden University Medical Center, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands
| | - Igor R Efimov
- Department of Biomedical Engineering, George Washington University, Washington, DC, USA
| | - Marcelo A Nobrega
- Department of Human Genetics, The University of Chicago, Chicago, USA
| | - Phil Barnett
- Department of Medical Biology, Amsterdam University Medical Centers, Academic Medical Center, 1105 AZ, Amsterdam, The Netherlands
| | - James F Martin
- Program in Developmental Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
- Texas Heart Institute, Houston, TX, 77030, USA
- Cardiovascular Research Institute, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Vincent M Christoffels
- Department of Medical Biology, Amsterdam University Medical Centers, Academic Medical Center, 1105 AZ, Amsterdam, The Netherlands.
|
181
|
Benaglio P, D'Antonio-Chronowska A, Ma W, Yang F, Young Greenwald WW, Donovan MKR, DeBoever C, Li H, Drees F, Singhal S, Matsui H, van Setten J, Sotoodehnia N, Gaulton KJ, Smith EN, D'Antonio M, Rosenfeld MG, Frazer KA. Allele-specific NKX2-5 binding underlies multiple genetic associations with human electrocardiographic traits. Nat Genet 2019; 51:1506-1517. [PMID: 31570892 PMCID: PMC6858543 DOI: 10.1038/s41588-019-0499-3] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 02/21/2018] [Accepted: 08/15/2019] [Indexed: 12/15/2022]
Abstract
The cardiac transcription factor (TF) gene NKX2-5 has been associated with electrocardiographic (EKG) traits through genome-wide association studies (GWASs), but the extent to which differential binding of NKX2-5 at common regulatory variants contributes to these traits has not yet been studied. We analyzed transcriptomic and epigenomic data from induced pluripotent stem cell-derived cardiomyocytes from seven related individuals, and identified ~2,000 single-nucleotide variants associated with allele-specific effects (ASE-SNVs) on NKX2-5 binding. NKX2-5 ASE-SNVs were enriched for altered TF motifs, for heart-specific expression quantitative trait loci and for EKG GWAS signals. Using fine-mapping combined with epigenomic data from induced pluripotent stem cell-derived cardiomyocytes, we prioritized candidate causal variants for EKG traits, many of which were NKX2-5 ASE-SNVs. Experimentally characterizing two NKX2-5 ASE-SNVs (rs3807989 and rs590041) showed that they modulate the expression of target genes via differential protein binding in cardiac cells, indicating that they are functional variants underlying EKG GWAS signals. Our results show that differential NKX2-5 binding at numerous regulatory variants across the genome contributes to EKG phenotypes.
Affiliation(s)
- Paola Benaglio
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Wubin Ma
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Feng Yang
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Margaret K R Donovan
- Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA; Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA
- Christopher DeBoever
- Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA
- He Li
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Frauke Drees
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Sanghamitra Singhal
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Hiroko Matsui
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Jessica van Setten
- Department of Cardiology, University Medical Center Utrecht, University of Utrecht, Utrecht, the Netherlands
- Nona Sotoodehnia
- Department of Medicine, Cardiovascular Health Research Unit, Division of Cardiology, University of Washington, Seattle, WA, USA; Department of Epidemiology, Cardiovascular Health Research Unit, Division of Cardiology, University of Washington, Seattle, WA, USA
- Kyle J Gaulton
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Erin N Smith
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA
- Matteo D'Antonio
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Michael G Rosenfeld
- Howard Hughes Medical Institute, Department of Medicine, University of California, San Diego, La Jolla, CA, USA
- Kelly A Frazer
- Department of Pediatrics, Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
|
182
|
Lian S, Li L, Zhou Y, Liu Z, Wang L. The co-expression networks of differentially expressed RBPs with TFs and LncRNAs related to clinical TNM stages of cancers. PeerJ 2019; 7:e7696. [PMID: 31576243 PMCID: PMC6753928 DOI: 10.7717/peerj.7696] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/22/2019] [Accepted: 08/19/2019] [Indexed: 12/17/2022]
Abstract
BACKGROUND: RNA-binding proteins (RBPs) play important roles in cellular homeostasis by regulating the expression of thousands of transcripts, which have been reported to be involved in human tumorigenesis. Despite previous reports of the dysregulation of RBPs in cancers, the degree of dysregulation of RBPs in cancers and the intrinsic relevance between dysregulated RBPs and clinical TNM information remains unknown. Furthermore, the co-expressed networks of dysregulated RBPs with transcriptional factors and lncRNAs also require further investigation.
RESULTS: Here, we firstly analyzed the deviations of expression levels of 1,542 RBPs from 20 cancer types and found that (1) RBPs are dysregulated in almost all 20 cancer types, especially in BLCA, COAD, READ, STAD, LUAD, LUSC and GBM with proportion of deviation larger than 300% compared with non-RBPs in normal tissues. (2) Up- and down-regulated RBPs also show opposed patterns of differential expression in cancers and normal tissues. In addition, down-regulated RBPs show a greater degree of dysregulated expression than up-regulated RBPs do. Secondly, we analyzed the intrinsic relevance between dysregulated RBPs and clinical TNM information and found that (3) Clinical TNM information for two cancer types, CHOL and KICH, is shown to be closely related to patterns of differentially expressed RBPs (DE RBPs) by co-expression cluster analysis. Thirdly, we identified ten key RBPs (seven down-regulated and three up-regulated) in CHOL and seven key RBPs (five down-regulated and two up-regulated) in KICH by analyzing co-expression correlation networks. Fourthly, we constructed the co-expression networks of key RBPs between 1,570 TFs and 4,147 lncRNAs for CHOL and KICH, respectively.
CONCLUSIONS: These results may provide an insight into the understanding of the functions of RBPs in human carcinogenesis. Furthermore, key RBPs and the co-expressed networks offer useful information for potential prognostic biomarkers and therapeutic targets for patients with cancers at the N and M stages in two cancer types, CHOL and KICH.
Affiliation(s)
- Shuaibin Lian
- College of Physics and Electronic Engineering, XinYang Normal University, Xinyang, HeNan, China
- Liansheng Li
- College of Life Sciences, XinYang Normal University, Xinyang, HeNan, China
- Yongjie Zhou
- College of Physics and Electronic Engineering, XinYang Normal University, Xinyang, HeNan, China
- Zixiao Liu
- College of Physics and Electronic Engineering, XinYang Normal University, Xinyang, HeNan, China
- Lei Wang
- College of Life Sciences, XinYang Normal University, Xinyang, HeNan, China
|
183
|
Li S, Zheng EB, Zhao L, Liu S. Nonreciprocal and Conditional Cooperativity Directs the Pioneer Activity of Pluripotency Transcription Factors. Cell Rep 2019; 28:2689-2703.e4. [PMID: 31484078 PMCID: PMC6750763 DOI: 10.1016/j.celrep.2019.07.103] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 02/08/2019] [Revised: 05/24/2019] [Accepted: 07/26/2019] [Indexed: 01/02/2023]
Abstract
Cooperative binding of transcription factors (TFs) to chromatin orchestrates gene expression programming and cell fate specification. However, the biophysical principles of TF cooperativity remain incompletely understood. Here we use single-molecule fluorescence microscopy to study the partnership between Sox2 and Oct4, two core members of the pluripotency gene regulatory network. We find that the ability of Sox2 to target DNA inside nucleosomes is strongly affected by the translational and rotational positioning of its binding motif. In contrast, Oct4 can access nucleosomal sites with equal capacities. Furthermore, the Sox2-Oct4 pair displays nonreciprocal cooperativity, with Oct4 modulating interaction of Sox2 with the nucleosome but not vice versa. Such cooperativity is conditional upon the composite motif's residing at specific nucleosomal locations. These results reveal that pioneer factors possess distinct chromatin-binding properties and suggest that the same set of TFs can differentially regulate gene activities on the basis of their motif positions in the nucleosomal context.
Affiliation(s)
- Sai Li
- Laboratory of Nanoscale Biophysics and Biochemistry, The Rockefeller University, New York, NY 10065, USA
- Eric Bo Zheng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Shixin Liu
- Laboratory of Nanoscale Biophysics and Biochemistry, The Rockefeller University, New York, NY 10065, USA
|
184
|
Ponomarenko MP, Rasskazov DA, Chadaeva IV, Sharypova EB, Drachkova IA, Ponomarenko PM, Oshchepkova EA, Savinkova LK, Kolchanov NA. Candidate SNP Markers of Atherosclerosis That May Significantly Change the Affinity of the TATA-Binding Protein for the Human Gene Promoters. Russ J Genet 2019. [DOI: 10.1134/s1022795419090114] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/05/2023]
|
185
|
Wong KC, Lin J, Li X, Lin Q, Liang C, Song YQ. Heterodimeric DNA motif synthesis and validations. Nucleic Acids Res 2019; 47:1628-1636. [PMID: 30590725 PMCID: PMC6393289 DOI: 10.1093/nar/gky1297] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/14/2018] [Revised: 12/04/2018] [Accepted: 12/19/2018] [Indexed: 02/06/2023]
Abstract
Bound by transcription factors, DNA motifs (i.e. transcription factor binding sites) are prevalent and important for gene regulation in different tissues at different developmental stages of eukaryotes. Although considerable effort has gone into elucidating monomeric DNA motif patterns, our knowledge of heterodimeric DNA motifs is still far from complete. Therefore, we propose to develop a computational approach to synthesize a heterodimeric DNA motif from two monomeric DNA motifs. The approach is sequentially divided into two components (Phases A and B). In Phase A, we propose to develop inference models of how two monomeric DNA motifs can be oriented and overlapped with each other at the nucleotide level. In Phase B, given the two oriented monomeric DNA motifs, we further propose to develop DNA-binding family-specific input-output hidden Markov models (IOHMMs) to synthesize a heterodimeric DNA motif. To validate the approach, we execute and cross-validate it with 618 experimentally verified heterodimeric DNA motifs across 49 DNA-binding family combinations. We observe that our approach can even "rescue" an existing heterodimeric DNA motif pattern (i.e. HOXB2_EOMES) previously published in Nature. Lastly, we apply the proposed approach to infer previously uncharacterized heterodimeric motifs. Their motif instances are supported by DNase accessibility, gene ontology, protein-protein interactions, in vivo ChIP-seq peaks, and even structural data from PDB. A public web server is available at http://motif.cs.cityu.edu.hk/custom/MotifKirin.
Affiliation(s)
- Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Jiecong Lin
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Xiangtao Li
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
- Qiuzhen Lin
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
- Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- You-Qiang Song
- School of Biomedical Sciences, University of Hong Kong, Pokfulam, Hong Kong SAR
|
186
|
Johnston AD, Simões-Pires CA, Thompson TV, Suzuki M, Greally JM. Functional genetic variants can mediate their regulatory effects through alteration of transcription factor binding. Nat Commun 2019; 10:3472. [PMID: 31375681 PMCID: PMC6677801 DOI: 10.1038/s41467-019-11412-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 07/29/2018] [Accepted: 07/10/2019] [Indexed: 12/23/2022]
Abstract
Functional variants in the genome are usually identified by their association with local gene expression, DNA methylation or chromatin states. DNA sequence motif analysis and chromatin immunoprecipitation studies have provided indirect support for the hypothesis that functional variants alter transcription factor binding to exert their effects. In this study, we provide direct evidence that functional variants can alter transcription factor binding. We identify a multifunctional variant within the TBC1D4 gene encoding a canonical NFκB binding site, and edited it using CRISPR-Cas9 to remove this site. We show that this editing reduces TBC1D4 expression, local chromatin accessibility and binding of the p65 component of NFκB. We then used CRISPR without genomic editing to guide p65 back to the edited locus, demonstrating that this re-targeting, occurring ~182 kb from the gene promoter, is enough to restore the function of the locus, supporting the central role of transcription factors mediating the effects of functional variants.
Affiliation(s)
- Andrew D Johnston
- Center for Epigenomics and Department of Genetics (Division of Genomics), Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- Claudia A Simões-Pires
- Center for Epigenomics and Department of Genetics (Division of Genomics), Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- Taylor V Thompson
- Center for Epigenomics and Department of Genetics (Division of Genomics), Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- Masako Suzuki
- Center for Epigenomics and Department of Genetics (Division of Genomics), Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
- John M Greally
- Center for Epigenomics and Department of Genetics (Division of Genomics), Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, NY, 10461, USA
|
187
|
Kulakovskiy IV, Vorontsov IE, Yevshin IS, Sharipov RN, Fedorova AD, Rumynskiy EI, Medvedeva YA, Magana-Mora A, Bajic VB, Papatsenko DA, Kolpakov FA, Makeev VJ. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis. Nucleic Acids Res 2018; 46:D252-D259. [PMID: 29140464 PMCID: PMC5753240 DOI: 10.1093/nar/gkx1106] [Citation(s) in RCA: 531] [Impact Index Per Article: 88.5] [Received: 09/14/2017] [Accepted: 10/31/2017] [Indexed: 12/15/2022]
Abstract
We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences.
Affiliation(s)
- Ivan V Kulakovskiy
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia
- Ilya E Vorontsov
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia
- Ivan S Yevshin
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia
- Ruslan N Sharipov
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia; Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, 630090, Akad. Rzhanova 6, Novosibirsk, Russia; Novosibirsk State University, 630090, Pirogova 2, Novosibirsk, Russia
- Alla D Fedorova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 119234, Leninskiye Gory 1-73, Moscow, Russia
- Eugene I Rumynskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia
- Yulia A Medvedeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia; Institute of Bioengineering, Research Center of Biotechnology of the Russian Academy of Sciences, 119071, 2 Leninsky Ave. 33, Moscow, Russia
- Arturo Magana-Mora
- National Institute of Advanced Industrial Science and Technology (AIST), Com. Bio Big-Data Open Innovation Lab. (CBBD-OIL), AIST Tokyo Waterfront Main Bldg. #323, 2-3-26 Aomi, Tokyo 135-0064, Japan; King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
- Vladimir B Bajic
- King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal 23955-6900, Saudi Arabia
- Dmitry A Papatsenko
- Center for Data-Intensive Biomedicine and Biotechnology, Skolkovo Institute of Science and Technology, 143026 Moscow, Russia
- Fedor A Kolpakov
- BIOSOFT.RU Ltd, 630058, Russkaya 41/1, Novosibirsk, Russia; Institute of Computational Technologies, Siberian Branch of the Russian Academy of Sciences, 630090, Akad. Rzhanova 6, Novosibirsk, Russia
- Vsevolod J Makeev
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991, GSP-1, Vavilova 32, Moscow, Russia; Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, GSP-1, Gubkina 3, Moscow, Russia; Moscow Institute of Physics and Technology (State University), 141700, 9 Institutskiy per, Dolgoprudny, Russia
|
188
|
Cook PR, Marenduzzo D. Transcription-driven genome organization: a model for chromosome structure and the regulation of gene expression tested through simulations. Nucleic Acids Res 2018; 46:9895-9906. [PMID: 30239812 PMCID: PMC6212781 DOI: 10.1093/nar/gky763] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Received: 04/13/2018] [Accepted: 09/14/2018] [Indexed: 12/29/2022]
Abstract
Current models for the folding of the human genome see a hierarchy stretching down from chromosome territories, through A/B compartments and topologically-associating domains (TADs), to contact domains stabilized by cohesin and CTCF. However, molecular mechanisms underlying this folding, and the way folding affects transcriptional activity, remain obscure. Here we review physical principles driving proteins bound to long polymers into clusters surrounded by loops, and present a parsimonious yet comprehensive model for the way the organization determines function. We argue that clusters of active RNA polymerases and their transcription factors are major architectural features; then, contact domains, TADs and compartments just reflect one or more loops and clusters. We suggest tethering a gene close to a cluster containing appropriate factors—a transcription factory—increases the firing frequency, and offer solutions to many current puzzles concerning the actions of enhancers, super-enhancers, boundaries and eQTLs (expression quantitative trait loci). As a result, the activity of any gene is directly influenced by the activity of other transcription units around it in 3D space, and this is supported by Brownian-dynamics simulations of transcription factors binding to cognate sites on long polymers.
Affiliation(s)
- Peter R Cook
- Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK
- Davide Marenduzzo
- SUPA, School of Physics, University of Edinburgh, Peter Guthrie Tait Road, Edinburgh, EH9 3FD, UK
|
189
|
Xie X, Hanson C, Sinha S. Mechanistic interpretation of non-coding variants for discovering transcriptional regulators of drug response. BMC Biol 2019; 17:62. [PMID: 31362726 PMCID: PMC6664756 DOI: 10.1186/s12915-019-0679-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/14/2019] [Accepted: 07/09/2019] [Indexed: 12/21/2022]
Abstract
BACKGROUND: Identification of functional non-coding variants and their mechanistic interpretation is a major challenge of modern genomics, especially for precision medicine. Transcription factor (TF) binding profiles and epigenomic landscapes in reference samples allow functional annotation of the genome, but do not provide ready answers regarding the effects of non-coding variants on phenotypes. A promising computational approach is to build models that predict TF-DNA binding from sequence, and use such models to score a variant's impact on TF binding strength. Here, we asked if this mechanistic approach to variant interpretation can be combined with information on genotype-phenotype associations to discover transcription factors regulating phenotypic variation among individuals.
RESULTS: We developed a statistical approach that integrates phenotype, genotype, gene expression, TF ChIP-seq, and Hi-C chromatin interaction data to answer this question. Using drug sensitivity of lymphoblastoid cell lines as the phenotype of interest, we tested if non-coding variants statistically linked to the phenotype are enriched for strong predicted impact on DNA binding strength of a TF and thus identified TFs regulating individual differences in the phenotype. Our approach relies on a new method for predicting variant impact on TF-DNA binding that uses a combination of biophysical modeling and machine learning. We report statistical and literature-based support for many of the TFs discovered here as regulators of drug response variation. We show that the use of mechanistically driven variant impact predictors can identify TF-drug associations that would otherwise be missed. We examined in depth one reported association, that of the transcription factor ELF1 with the drug doxorubicin, and identified several genes that may mediate this regulatory relationship.
CONCLUSION: Our work represents initial steps in utilizing predictions of variant impact on TF binding sites for discovery of regulatory mechanisms underlying phenotypic variation. Future advances on this topic will be greatly beneficial to the reconstruction of phenotype-associated gene regulatory networks.
Affiliation(s)
- Xiaoman Xie
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Casey Hanson
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Saurabh Sinha
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA; Institute of Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
|
190
|
Advances in epigenetics link genetics to the environment and disease. Nature 2019; 571:489-499. [DOI: 10.1038/s41586-019-1411-0] [Citation(s) in RCA: 566] [Impact Index Per Article: 94.3] [Received: 10/29/2018] [Accepted: 06/14/2019] [Indexed: 12/16/2022]
|
191
|
Pellacani D, Tan S, Lefort S, Eaves CJ. Transcriptional regulation of normal human mammary cell heterogeneity and its perturbation in breast cancer. EMBO J 2019; 38:e100330. [PMID: 31304632 PMCID: PMC6627240 DOI: 10.15252/embj.2018100330] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 07/23/2018] [Revised: 10/22/2018] [Accepted: 11/08/2018] [Indexed: 12/18/2022]
Abstract
The mammary gland in adult women consists of biologically distinct cell types that differ in their surface phenotypes. Isolation and molecular characterization of these subpopulations of mammary cells have provided extensive insights into their different transcriptional programs and regulation. This information is now serving as a baseline for interpreting the heterogeneous features of human breast cancers. Examination of breast cancer mutational profiles further indicates that most have undergone a complex evolutionary process even before being detected. The consequent intra-tumoral as well as inter-tumoral heterogeneity of these cancers thus poses major challenges to deriving information from early and hence likely pervasive changes in potential therapeutic interest. Recently described reproducible and efficient methods for generating human breast cancers de novo in immunodeficient mice transplanted with genetically altered primary cells now offer a promising alternative to investigate initial stages of human breast cancer development. In this review, we summarize current knowledge about key transcriptional regulatory processes operative in these partially characterized subpopulations of normal human mammary cells and effects of disrupting these processes in experimentally produced human breast cancers.
Affiliation(s)
- Davide Pellacani
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Susanna Tan
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Sylvain Lefort
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
- Connie J Eaves
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, BC, Canada
|
192
|
Li S, Kvon EZ, Visel A, Pennacchio LA, Ovcharenko I. Stable enhancers are active in development, and fragile enhancers are associated with evolutionary adaptation. Genome Biol 2019; 20:140. [PMID: 31307522 PMCID: PMC6631995 DOI: 10.1186/s13059-019-1750-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/25/2019] [Accepted: 06/28/2019] [Indexed: 12/13/2022]
Abstract
BACKGROUND: Despite continual progress in the identification and characterization of trait- and disease-associated variants that disrupt transcription factor (TF)-DNA binding, little is known about the distribution of TF binding deactivating mutations (deMs) in enhancer sequences. Here, we focus on elucidating the mechanism underlying the different densities of deMs in human enhancers.
RESULTS: We identify two classes of enhancers based on the density of nucleotides prone to deMs. Firstly, fragile enhancers with abundant deM nucleotides are associated with the immune system and regular cellular maintenance. Secondly, stable enhancers with only a few deM nucleotides are associated with the development and regulation of TFs and are evolutionarily conserved. These two classes of enhancers feature different regulatory programs: the binding sites of pioneer TFs of FOX family are specifically enriched in stable enhancers, while tissue-specific TFs are enriched in fragile enhancers. Moreover, stable enhancers are more tolerant of deMs due to their dominant employment of homotypic TF binding site (TFBS) clusters, as opposed to the larger-extent usage of heterotypic TFBS clusters in fragile enhancers. Notably, the sequence environment and chromatin context of the cognate motif, other than the motif itself, contribute more to the susceptibility to deMs of TF binding.
CONCLUSIONS: This dichotomy of enhancer activity is conserved across different tissues, has a specific footprint in epigenetic profiles, and argues for a bimodal evolution of gene regulatory programs in vertebrates. Specifically encoded stable enhancers are evolutionarily conserved and associated with development, while differently encoded fragile enhancers are associated with the adaptation of species.
Affiliation(s)
- Shan Li
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
- Evgeny Z Kvon
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; United States Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA; School of Natural Sciences, University of California, Merced, CA, 95343, USA
- Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA; United States Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA; Comparative Biochemistry Program, University of California, Berkeley, CA, 94720, USA
- Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
|
193
|
Cohen DM, Lim HW, Won KJ, Steger DJ. Shared nucleotide flanks confer transcriptional competency to bZip core motifs. Nucleic Acids Res 2018; 46:8371-8384. [PMID: 30085281 PMCID: PMC6144830 DOI: 10.1093/nar/gky681] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 05/23/2018] [Accepted: 07/17/2018] [Indexed: 12/31/2022]
Abstract
Sequence-specific DNA binding recruits transcription factors (TFs) to the genome to regulate gene expression. Here, we perform high resolution mapping of CEBP proteins to determine how sequence dictates genomic occupancy. We demonstrate a fundamental difference between the sequence repertoire utilized by CEBPs in vivo versus the palindromic sequence preference reported by classical in vitro models, by identifying a palindromic motif at <1% of the genomic binding sites. On the native genome, CEBPs bind a diversity of related 10 bp sequences resulting from the fusion of degenerate and canonical half-sites. Altered DNA specificity of CEBPs in cells occurs through heterodimerization with other bZip TFs, and approximately 40% of CEBP-binding sites in primary human cells harbor motifs characteristic of CEBP heterodimers. In addition, we uncover an important role for sequence bias at core-motif-flanking bases for CEBPs and demonstrate that flanking bases regulate motif function across mammalian bZip TFs. Favorable flanking bases confer efficient TF occupancy and transcriptional activity, and DNA shape may explain how the flanks alter TF binding. Importantly, motif optimization within the 10-mer is strongly correlated with cell-type-independent recruitment of CEBPβ, providing key insight into how sequence sub-optimization affects genomic occupancy of widely expressed CEBPs across cell types.
Affiliation(s)
- Daniel M Cohen
- Division of Endocrinology, Diabetes, and Metabolism, Department of Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA.,The Institute for Diabetes, Obesity, and Metabolism, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Hee-Woong Lim
- The Institute for Diabetes, Obesity, and Metabolism, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA.,Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Kyoung-Jae Won
- The Institute for Diabetes, Obesity, and Metabolism, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA.,Department of Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA.,Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Copenhagen, Denmark
| | - David J Steger
- Division of Endocrinology, Diabetes, and Metabolism, Department of Medicine, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA.,The Institute for Diabetes, Obesity, and Metabolism, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
|
194
|
Shi L, Lv X, Liu L, Yang Y, Ma Z, Han B, Sun D. A post-GWAS confirming effects of PRKG1 gene on milk fatty acids in a Chinese Holstein dairy population. BMC Genet 2019; 20:53. [PMID: 31269900 PMCID: PMC6610796 DOI: 10.1186/s12863-019-0755-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/26/2019] [Accepted: 06/20/2019] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND: We previously conducted a genome-wide association study (GWAS) for milk fatty acids in Chinese Holstein, and identified 83 genome-wide significant single nucleotide polymorphisms (SNPs) and 314 suggestive significant SNPs. Among them, two SNPs, BTB-01077939 and BTA-11275-no-rs, associated with C10:0, C12:0, and C14 index (P = 0.000014 ~ 0.000024), were within and close to (0.85 Mb) the protein kinase, cGMP-dependent, type I (PRKG1) gene on BTA26, respectively. The PRKG1 gene plays a key role in lipolysis, releasing fatty acids and glycerol through the hydrolysis of triacylglycerol in adipocytes. We therefore considered it a promising candidate for milk fatty acids. The purpose of this study was to investigate whether PRKG1 had effects on milk fatty acids. RESULTS: By directly sequencing the PCR products of pooled DNA, we identified a total of six SNPs, including one in the 5' flanking region, four in the 3' untranslated region (UTR), and one in the 3' flanking region. Single-locus association analysis showed that the six SNPs mainly had significant associations with C6:0, C8:0 and C17:1 (P < 0.0001 ~ 0.0035). In addition, we observed a haplotype block formed by g.6903810G > A and g.6904047G > T with Haploview 4.1, and it was strongly associated with C8:0, C10:0, C16:1, C17:1, C20:0 and C16 index (P < 0.0001 ~ 0.0123). The SNP g.8344262A > T was predicted with Genomatix software to alter the binding site (BS) of the transcription factor (TF) GAGA box, and the subsequent luciferase assay verified that it changed the transcriptional activity of the PRKG1 gene (P = 0.0009). CONCLUSION: To the best of our knowledge, we are the first to identify significant effects of PRKG1 on milk fatty acids in dairy cattle.
Affiliation(s)
- Lijun Shi
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory for Animal Breeding, China Agricultural University, No. 2 Yuanmingyuan West Road, Haidian District, Beijing, 100193 China
| | - Xiaoqing Lv
- Beijing Dairy Cattle Center, Beijing, 100192 China
| | - Lin Liu
- Beijing Dairy Cattle Center, Beijing, 100192 China
| | - Yuze Yang
- Beijing Municipal Bureau of Agriculture, Beijing, 100101 China
| | - Zhu Ma
- Beijing Dairy Cattle Center, Beijing, 100192 China
| | - Bo Han
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory for Animal Breeding, China Agricultural University, No. 2 Yuanmingyuan West Road, Haidian District, Beijing, 100193 China
| | - Dongxiao Sun
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science and Technology, Key Laboratory of Animal Genetics, Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory for Animal Breeding, China Agricultural University, No. 2 Yuanmingyuan West Road, Haidian District, Beijing, 100193 China
|
195
|
Rudnizky S, Khamis H, Malik O, Squires AH, Meller A, Melamed P, Kaplan A. Single-molecule DNA unzipping reveals asymmetric modulation of a transcription factor by its binding site sequence and context. Nucleic Acids Res 2019; 46:1513-1524. [PMID: 29253225 PMCID: PMC5815098 DOI: 10.1093/nar/gkx1252] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 10/23/2017] [Accepted: 12/11/2017] [Indexed: 12/31/2022] Open
Abstract
Most functional transcription factor (TF) binding sites deviate from their ‘consensus’ recognition motif, although their sites and flanking sequences are often conserved across species. Here, we used single-molecule DNA unzipping with optical tweezers to study how Egr-1, a TF harboring three zinc fingers (ZF1, ZF2 and ZF3), is modulated by the sequence and context of its functional sites in the Lhb gene promoter. We find that both the core 9 bp bound to Egr-1 in each of the sites, and the base pairs flanking them, modulate the affinity and structure of the protein–DNA complex. The effect of the flanking sequences is asymmetric, with a stronger effect for the sequence flanking ZF3. Characterization of the dissociation time of Egr-1 revealed that a local, mechanical perturbation of the interactions of ZF3 destabilizes the complex more effectively than a perturbation of the ZF1 interactions. Our results reveal a novel role for ZF3 in the interaction of Egr-1 with other proteins and the DNA, providing insight on the regulation of Lhb and other genes by Egr-1. Moreover, our findings reveal the potential of small changes in DNA sequence to alter transcriptional regulation, and may shed light on the organization of regulatory elements at promoters.
Affiliation(s)
- Sergei Rudnizky
- Faculty of Biology, Technion-Israel Institute of Technology, Haifa 32000, Israel
| | - Hadeel Khamis
- Faculty of Biology, Technion-Israel Institute of Technology, Haifa 32000, Israel.,Faculty of Physics, Technion-Israel Institute of Technology, Haifa 32000, Israel
| | - Omri Malik
- Faculty of Biology, Technion-Israel Institute of Technology, Haifa 32000, Israel.,Russell Berrie Nanotechnology Institute, Technion-Israel Institute of Technology, Haifa 32000, Israel
| | - Allison H Squires
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
| | - Amit Meller
- Russell Berrie Nanotechnology Institute, Technion-Israel Institute of Technology, Haifa 32000, Israel.,Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA.,Faculty of Biomedical Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel
| | - Philippa Melamed
- Faculty of Biology, Technion-Israel Institute of Technology, Haifa 32000, Israel.,Russell Berrie Nanotechnology Institute, Technion-Israel Institute of Technology, Haifa 32000, Israel
| | - Ariel Kaplan
- Faculty of Biology, Technion-Israel Institute of Technology, Haifa 32000, Israel.,Russell Berrie Nanotechnology Institute, Technion-Israel Institute of Technology, Haifa 32000, Israel
|
196
|
Yu Z, Pandian GN, Hidaka T, Sugiyama H. Therapeutic gene regulation using pyrrole-imidazole polyamides. Adv Drug Deliv Rev 2019; 147:66-85. [PMID: 30742856 DOI: 10.1016/j.addr.2019.02.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 09/27/2018] [Revised: 11/22/2018] [Accepted: 02/04/2019] [Indexed: 12/13/2022]
Abstract
Recent innovations in cutting-edge sequencing platforms have allowed the rapid identification of genes associated with communicable, noncommunicable and rare diseases. Exploitation of this collected biological information has facilitated the development of nonviral gene therapy strategies and the design of several proteins capable of editing specific DNA sequences for disease control. Small molecule-based targeted therapeutic approaches have gained increasing attention because of their suggested clinical benefits, ease of control and lower costs. Pyrrole-imidazole polyamides (PIPs) are a major class of DNA minor groove-binding small molecules that can be predesigned to recognize specific DNA sequences. This programmability of PIPs allows the on-demand design of artificial genetic switches and fluorescent probes. In this review, we detail the progress in the development of PIP-based designer ligands and their prospects as advanced DNA-based small-molecule drugs for therapeutic gene modulation.
|
197
|
van Arensbergen J, Pagie L, FitzPatrick VD, de Haas M, Baltissen MP, Comoglio F, van der Weide RH, Teunissen H, Võsa U, Franke L, de Wit E, Vermeulen M, Bussemaker HJ, van Steensel B. High-throughput identification of human SNPs affecting regulatory element activity. Nat Genet 2019; 51:1160-1169. [PMID: 31253979 PMCID: PMC6609452 DOI: 10.1038/s41588-019-0455-2] [Citation(s) in RCA: 120] [Impact Index Per Article: 20.0] [Received: 10/30/2018] [Accepted: 05/24/2019] [Indexed: 01/08/2023]
Abstract
Most of the millions of SNPs in the human genome are non-coding, and many overlap with putative regulatory elements. Genome-wide association studies (GWAS) have linked many of these SNPs to human traits or to gene expression levels, but rarely with sufficient resolution to identify the causal SNPs. Functional screens based on reporter assays have previously been of insufficient throughput to test the vast space of SNPs for possible effects on regulatory element activity. Here we leveraged the throughput and resolution of the survey of regulatory elements (SuRE) reporter technology to survey the effect of 5.9 million SNPs, including 57% of the known common SNPs, on enhancer and promoter activity. We identified more than 30,000 SNPs that alter the activity of putative regulatory elements, partially in a cell-type-specific manner. Integration of this dataset with GWAS results may help to pinpoint SNPs that underlie human traits.
Affiliation(s)
- Joris van Arensbergen
- Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, the Netherlands.
| | - Ludo Pagie
- Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, the Netherlands
| | - Vincent D FitzPatrick
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University Medical Center, New York, NY, USA
| | - Marcel de Haas
- Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, the Netherlands
| | - Marijke P Baltissen
- Department of Molecular Biology, Oncode Institute, Radboud Institute for Molecular Life Sciences, Radboud University Nijmegen, Nijmegen, the Netherlands
| | - Federico Comoglio
- Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, the Netherlands
- Department of Haematology, University of Cambridge, Cambridge, UK
| | - Robin H van der Weide
- Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, the Netherlands
| | - Hans Teunissen
- Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, the Netherlands
| | - Urmo Võsa
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Lude Franke
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
| | - Elzo de Wit
- Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, the Netherlands
| | - Michiel Vermeulen
- Department of Molecular Biology, Oncode Institute, Radboud Institute for Molecular Life Sciences, Radboud University Nijmegen, Nijmegen, the Netherlands
| | - Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University Medical Center, New York, NY, USA
| | - Bas van Steensel
- Division of Gene Regulation, Oncode Institute, Netherlands Cancer Institute, Amsterdam, the Netherlands.
|
198
|
Assad N, Tillo D, Ray S, Dzienny A, FitzGerald PC, Vinson C. GABPα and CREB1 Binding to Double Nucleotide Polymorphisms of Their Consensus Motifs and Cooperative Binding to the Composite ETS ⇔ CRE Motif (ACCGGAAGTGACGTCA). ACS Omega 2019; 4:9904-9910. [PMID: 34151054 PMCID: PMC8208074 DOI: 10.1021/acsomega.9b00540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2019] [Accepted: 05/24/2019] [Indexed: 06/13/2023]
Abstract
Previously, cooperative binding of the bZIP domain of CREB1 and the ETS domain of GABPα was observed for the composite DNA ETS ⇔ CRE motif (A₀C₁C₂G₃G₄A₅A₆G₇T₈G₉A₁₀C₁₁G₁₂T₁₃C₁₄A₁₅). Single nucleotide polymorphisms (SNPs) at the beginning and end of the ETS motif (ACCGGAAGT) increased cooperative binding. Here, we use an Agilent microarray of 60-mers containing all double nucleotide polymorphisms (DNPs) of the ETS ⇔ CRE motif to explore GABPα and CREB1 binding to their individual motifs and their cooperative binding. For GABPα, all DNPs were bound as if each SNP acted independently. In contrast, CREB1 binding to some DNPs was stronger or weaker than expected, depending on the locations of each SNP. CREB1 binding to DNPs where both SNPs were in the same half site, T₈G₉A₁₀ or T₁₃C₁₄A₁₅, was greater than expected, indicating that an additional SNP cannot destroy binding as much as expected, suggesting that an individual SNP is enough to abolish sequence-specific DNA binding of a single bZIP monomer. If a DNP contains SNPs in each half site, binding is weaker than expected. Similar results were observed for additional ETS and bZIP family members. Cooperative binding between GABPα and CREB1 to the ETS ⇔ CRE motif was weaker than expected except for DNPs containing A₇ and SNPs at the beginning of the ETS motif.
Affiliation(s)
- Nima Assad
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Desiree Tillo
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Sreejana Ray
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Alexa Dzienny
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
| | - Peter C. FitzGerald
- Genome Analysis Unit, Genetics Branch, National Cancer Institute, National Institutes of Health, Building 37, Bethesda, Maryland 20892, United States
| | - Charles Vinson
- Laboratory of Metabolism, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
|
199
|
Mitkin NA, Korneev K, Gorbacheva AM, Kuprash DV. Relative Efficiency of Transcription Factor Binding to Allelic Variants of Regulatory Regions of Human Genes in Immunoprecipitation and Real-Time PCR. Mol Biol 2019. [DOI: 10.1134/s0026893319030117] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
|
200
|
Dekkers KF, Neele AE, Jukema JW, Heijmans BT, de Winther MPJ. Human monocyte-to-macrophage differentiation involves highly localized gain and loss of DNA methylation at transcription factor binding sites. Epigenetics Chromatin 2019; 12:34. [PMID: 31171035 PMCID: PMC6551876 DOI: 10.1186/s13072-019-0279-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 12/19/2018] [Accepted: 05/17/2019] [Indexed: 12/21/2022] Open
Abstract
Background: Macrophages and their precursors, monocytes, play a key role in inflammation and chronic inflammatory disorders. Monocyte-to-macrophage differentiation and activation programs are accompanied by significant epigenetic remodeling where DNA methylation associates with cell identity. Here we show that DNA methylation changes characteristic of monocyte-to-macrophage differentiation occur at transcription factor binding sites, and, in contrast to what was previously described, are generally highly localized and encompass both losses and gains of DNA methylation. Results: We compared genome-wide DNA methylation across 440,292 CpG sites between human monocytes, naïve macrophages and macrophages further activated toward a pro-inflammatory state (using LPS/IFNγ), an anti-inflammatory state (IL-4) or foam cells (oxLDL and acLDL). Moreover, we integrated these data with public whole-genome sequencing data on monocytes and macrophages to demarcate differentially methylated regions. Our analysis showed that differential DNA methylation was most pronounced during monocyte-to-macrophage differentiation, was typically restricted to single CpGs or very short regions, and co-localized with lineage-specific enhancers irrespective of whether it concerns gain or loss of methylation. Furthermore, differentially methylated CpGs were located at sites characterized by increased binding of transcription factors known to be involved in monocyte-to-macrophage differentiation, including C/EBP and ETS for gain and AP-1 for loss of methylation. Conclusion: Our study highlights the involvement of subtle, yet highly localized remodeling of DNA methylation at regulatory regions in cell differentiation.
Affiliation(s)
- Koen F Dekkers
- Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Annette E Neele
- Department of Medical Biochemistry, Amsterdam Cardiovascular Sciences, Meibergdreef 9, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - J Wouter Jukema
- Department of Cardiology, Leiden University Medical Center, Leiden, The Netherlands
| | - Bastiaan T Heijmans
- Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands.
| | - Menno P J de Winther
- Department of Medical Biochemistry, Amsterdam Cardiovascular Sciences, Meibergdreef 9, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands.,Institute for Cardiovascular Prevention (IPEK), Munich, Germany.
|