51. Paredes O, Romo-Vázquez R, Román-Godínez I, Vélez-Pérez H, Salido-Ruiz RA, Morales JA. Frequency spectra characterization of noncoding human genomic sequences. Genes Genomics 2020; 42:1215-1226. [DOI: 10.1007/s13258-020-00980-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2019] [Accepted: 04/27/2020] [Indexed: 11/28/2022]
52. Bansal R, Hussain S, Chanana UB, Bisht D, Goel I, Muthuswami R. SMARCAL1, the annealing helicase and the transcriptional co-regulator. IUBMB Life 2020; 72:2080-2096. [PMID: 32754981 DOI: 10.1002/iub.2354] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/03/2020] [Revised: 06/26/2020] [Accepted: 07/07/2020] [Indexed: 12/15/2022]
Abstract
The ATP-dependent chromatin remodeling proteins play an important role in DNA repair. The energy released by ATP hydrolysis is used for myriad functions ranging from nucleosome repositioning and nucleosome eviction to histone variant exchange. In addition, the distant member of the family, SMARCAL1, uses the energy to reanneal stalled replication forks in response to DNA damage. Biophysical studies have shown that this protein has the unique ability to recognize and bind specifically to DNA structures possessing double-strand to single-strand transition regions. Mutations in SMARCAL1 have been linked to Schimke immuno-osseous dysplasia, an autosomal recessive disorder that exhibits variable penetrance and expressivity. It has long been hypothesized that the variable expressivity and pleiotropic phenotypes observed in the patients might be due to the ability of SMARCAL1 to co-regulate the expression of a subset of genes within the genome. Recently, the role of SMARCAL1 in regulating transcription has been delineated. In this review, we discuss the biophysical and functional properties of the protein that help it to transcriptionally co-regulate DNA damage response as well as to bind to the stalled replication fork and stabilize it, thus ensuring genomic stability. We also discuss the role of SMARCAL1 in cancer and the possibility of using this protein as a chemotherapeutic target.
Affiliation(s)
- Ritu Bansal
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Saddam Hussain
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Upasana Bedi Chanana
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Deepa Bisht
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Isha Goel
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
- Rohini Muthuswami
  - Chromatin Remodeling Laboratory, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
53. Sandra US, Shukla A, Kolthur-Seetharam U. Search and Capture: Disorder Rules Gene Promoter Selection. Trends Genet 2020; 36:721-722. [PMID: 32739029 DOI: 10.1016/j.tig.2020.07.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2020] [Accepted: 07/09/2020] [Indexed: 10/23/2022]
Abstract
Intrinsically disordered regions (IDRs) are preponderant in transcription factors (TFs) and are evolutionarily less conserved vis-à-vis DNA-binding domains (DBDs). Unexpected findings from Barkai and colleagues, which demonstrate that promoter selectivity is determined by IDRs, should significantly enhance our understanding of gene expression regulation.
Affiliation(s)
- U S Sandra
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Colaba, Mumbai 400 005, Maharashtra, India
- Arushi Shukla
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Colaba, Mumbai 400 005, Maharashtra, India
- Ullas Kolthur-Seetharam
  - Department of Biological Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Colaba, Mumbai 400 005, Maharashtra, India
54. Brodsky S, Jana T, Mittelman K, Chapal M, Kumar DK, Carmi M, Barkai N. Intrinsically Disordered Regions Direct Transcription Factor In Vivo Binding Specificity. Mol Cell 2020; 79:459-471.e4. [DOI: 10.1016/j.molcel.2020.05.032] [Citation(s) in RCA: 99] [Impact Index Per Article: 24.8] [Received: 10/29/2019] [Revised: 03/10/2020] [Accepted: 05/21/2020] [Indexed: 11/25/2022]
55. Belcher MS, Vuu KM, Zhou A, Mansoori N, Agosto Ramos A, Thompson MG, Scheller HV, Loqué D, Shih PM. Design of orthogonal regulatory systems for modulating gene expression in plants. Nat Chem Biol 2020; 16:857-865. [PMID: 32424304 DOI: 10.1038/s41589-020-0547-4] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 09/18/2019] [Accepted: 04/09/2020] [Indexed: 11/08/2022]
Abstract
Agricultural biotechnology strategies often require the precise regulation of multiple genes to effectively modify complex plant traits. However, most efforts are hindered by a lack of characterized tools that allow for reliable and targeted expression of transgenes. We have successfully engineered a library of synthetic transcriptional regulators that modulate expression strength in planta. By leveraging orthogonal regulatory systems from Saccharomyces spp., we have developed a strategy for the design of synthetic activators, synthetic repressors, and synthetic promoters and have validated their use in Nicotiana benthamiana and Arabidopsis thaliana. This characterization of contributing genetic elements that dictate gene expression represents a foundation for the rational design of refined synthetic regulators. Our findings demonstrate that these tools provide variation in transcriptional output while enabling the concerted expression of multiple genes in a tissue-specific and environmentally responsive manner, providing a basis for generating complex genetic circuits that process endogenous and environmental stimuli.
Affiliation(s)
- Michael S Belcher
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA, USA
- Khanh M Vuu
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Andy Zhou
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Plant Biology, University of California, Davis, Davis, CA, USA
- Nasim Mansoori
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Amanda Agosto Ramos
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Mitchell G Thompson
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Plant Biology, University of California, Davis, Davis, CA, USA
- Henrik V Scheller
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Dominique Loqué
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Patrick M Shih
  - Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Plant Biology, University of California, Davis, Davis, CA, USA
56. Zhao Y, Wu D, Jiang D, Zhang X, Wu T, Cui J, Qian M, Zhao J, Oesterreich S, Sun W, Finkel T, Li G. A sequential methodology for the rapid identification and characterization of breast cancer-associated functional SNPs. Nat Commun 2020; 11:3340. [PMID: 32620845 PMCID: PMC7334201 DOI: 10.1038/s41467-020-17159-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/12/2019] [Accepted: 06/11/2020] [Indexed: 12/24/2022] [Open Access]
Abstract
GWAS cannot identify functional SNPs (fSNP) from disease-associated SNPs in linkage disequilibrium (LD). Here, we report developing three sequential methodologies including Reel-seq (Regulatory element-sequencing) to identify fSNPs in a high-throughput fashion, SDCP-MS (SNP-specific DNA competition pulldown-mass spectrometry) to identify fSNP-bound proteins and AIDP-Wb (allele-imbalanced DNA pulldown-Western blot) to detect allele-specific protein:fSNP binding. We first apply Reel-seq to screen a library containing 4316 breast cancer-associated SNPs and identify 521 candidate fSNPs. As proof of principle, we verify candidate fSNPs on three well-characterized loci: FGFR2, MAP3K1 and BABAM1. Next, using SDCP-MS and AIDP-Wb, we rapidly identify multiple regulatory factors that specifically bind in an allele-imbalanced manner to the fSNPs on the FGFR2 locus. We finally demonstrate that the factors identified by SDCP-MS can regulate risk gene expression. These data suggest that the sequential application of Reel-seq, SDCP-MS, and AIDP-Wb can greatly help to translate large sets of GWAS data into biologically relevant information.
Affiliation(s)
- Yihan Zhao
  - Aging Institute, University of Pittsburgh, Pittsburgh, PA, 15219, USA
  - School of Life Sciences, East China Normal University, Shanghai, China
- Di Wu
  - Adams School of Dentistry, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Danli Jiang
  - Aging Institute, University of Pittsburgh, Pittsburgh, PA, 15219, USA
- Xiaoyu Zhang
  - Aging Institute, University of Pittsburgh, Pittsburgh, PA, 15219, USA
- Ting Wu
  - Aging Institute, University of Pittsburgh, Pittsburgh, PA, 15219, USA
  - Department of Medicine, Xiangya School of Medicine, Central South University, Changsha, China
- Jing Cui
  - Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Min Qian
  - School of Life Sciences, East China Normal University, Shanghai, China
- Jean Zhao
  - Department of Chemical Biology, DFCI, Boston, MA, 02115, USA
- Steffi Oesterreich
  - Department of Pharmacology & Chemical Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15261, USA
  - Women's Cancer Research Center, Magee-Women's Research Institute, University of Pittsburgh Cancer Institute, 204 Craft Avenue, Pittsburgh, PA, 15213, USA
- Wei Sun
  - Department of Medicine, Division of Cardiology, University of Pittsburgh Medical Center, Pittsburgh, PA, 15219, USA
- Toren Finkel
  - Aging Institute, University of Pittsburgh, Pittsburgh, PA, 15219, USA
  - Department of Medicine, Division of Cardiology, University of Pittsburgh Medical Center, Pittsburgh, PA, 15219, USA
- Gang Li
  - Aging Institute, University of Pittsburgh, Pittsburgh, PA, 15219, USA
  - Department of Medicine, Division of Cardiology, University of Pittsburgh Medical Center, Pittsburgh, PA, 15219, USA
57. Gatto V, Binati RL, Lemos Junior WJF, Basile A, Treu L, de Almeida OGG, Innocente G, Campanaro S, Torriani S. New insights into the variability of lactic acid production in Lachancea thermotolerans at the phenotypic and genomic level. Microbiol Res 2020; 238:126525. [PMID: 32593090 DOI: 10.1016/j.micres.2020.126525] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/15/2020] [Revised: 06/03/2020] [Accepted: 06/05/2020] [Indexed: 01/13/2023]
Abstract
Non-conventional yeasts are increasingly applied in the fermented beverage industry to obtain distinctive products with improved quality. Among these yeasts, Lachancea thermotolerans has multiple features of industrial relevance, especially the production of L(+)-lactic acid (LA), useful for the biological acidification of wine and beer. Since little information is available on this peculiar activity, the current study aimed to explore the physiological and genetic variability among L. thermotolerans strains. In a strain collection, mostly isolated from wine, a huge phenotypic diversity was observed, which allowed the selection of a high (SOL13) and a low (COLC27) LA producer. Comparative whole-genome sequencing of these two selected strains and the type strain CBS 6340T showed a high similarity in terms of gene content and functional annotation. Nevertheless, target gene-based analysis revealed variations between high and low producers in the key gene sequences related to LA accumulation. More in-depth investigation of the core promoters and expression analysis of the ldh genes, encoding lactate dehydrogenase, indicated that transcriptional regulation may be the principal cause behind the phenotypic differences. These findings highlight the usefulness of whole-genome sequencing coupled with expression analysis and provide crucial genetic insights for a deeper investigation of the intraspecific variability in the LA production pathway.
Affiliation(s)
- Veronica Gatto
  - Department of Biotechnology, University of Verona, 37134, Verona, Italy
- Renato L Binati
  - Department of Biotechnology, University of Verona, 37134, Verona, Italy
- Arianna Basile
  - Department of Biology, University of Padua, 35121, Padua, Italy
- Laura Treu
  - Department of Biology, University of Padua, 35121, Padua, Italy
- Otávio G G de Almeida
  - Faculty of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, 14040-900, Ribeirão Preto, Brazil
- Giada Innocente
  - Department of Biotechnology, University of Verona, 37134, Verona, Italy
- Sandra Torriani
  - Department of Biotechnology, University of Verona, 37134, Verona, Italy
58. In-silico analysis of eukaryotic translation initiation factors (eIFs) in response to environmental stresses in rice (Oryza sativa). Biologia (Bratisl) 2020. [DOI: 10.2478/s11756-020-00467-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/24/2023]
59. Yevshin I, Sharipov R, Kolmykov S, Kondrakhin Y, Kolpakov F. GTRD: a database on gene transcription regulation-2019 update. Nucleic Acids Res 2019; 47:D100-D105. [PMID: 30445619 PMCID: PMC6323985 DOI: 10.1093/nar/gky1128] [Citation(s) in RCA: 145] [Impact Index Per Article: 36.3] [Received: 09/15/2018] [Accepted: 10/26/2018] [Indexed: 01/16/2023] [Open Access]
Abstract
The current version of the Gene Transcription Regulation Database (GTRD; http://gtrd.biouml.org) contains information about: (i) transcription factor binding sites (TFBSs) and transcription coactivators identified by ChIP-seq experiments for Homo sapiens, Mus musculus, Rattus norvegicus, Danio rerio, Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae, Schizosaccharomyces pombe and Arabidopsis thaliana; (ii) regions of open chromatin and TFBSs (DNase footprints) identified by DNase-seq; (iii) unmappable regions where TFBSs cannot be identified due to repeats; (iv) potential TFBSs for both human and mouse using position weight matrices from the HOCOMOCO database. Raw ChIP-seq and DNase-seq data were obtained from ENCODE and SRA, and uniformly processed. ChIP-seq peaks were called using four different methods: MACS, SISSRs, GEM and PICS. Peaks for the same factor and peak calling method, albeit obtained under different experimental conditions (cell line, treatment, etc.), were merged into clusters. To reduce noise, such clusters for different peak calling methods were merged into meta-clusters; these were considered to be non-redundant TFBS sets. In addition, extended quality control was applied to all ChIP-seq data. A web interface to access GTRD was developed using the BioUML platform. It provides browsing and display of information, advanced search possibilities and an integrated genome browser.
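The idea of merging peaks into clusters can be illustrated with a minimal sketch. It assumes that peaks are half-open `(start, end)` intervals on a single chromosome and that "merging" means taking the union of overlapping intervals; the abstract does not state GTRD's actual merging criteria, and the name `merge_peaks` is hypothetical.

```python
from typing import List, Tuple

# A peak is a half-open interval (start, end) on one chromosome (assumption).
Peak = Tuple[int, int]


def merge_peaks(peaks: List[Peak]) -> List[Peak]:
    """Merge overlapping peaks into clusters by interval union.

    This is a generic interval-merging routine, not GTRD's own procedure.
    """
    clusters: List[Peak] = []
    for start, end in sorted(peaks):
        if clusters and start < clusters[-1][1]:
            # The peak overlaps the current cluster: extend its right edge.
            prev_start, prev_end = clusters[-1]
            clusters[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap: the peak starts a new cluster.
            clusters.append((start, end))
    return clusters


if __name__ == "__main__":
    # Illustrative coordinates only; they are not taken from GTRD.
    peaks = [(100, 200), (150, 260), (400, 450), (445, 500), (900, 950)]
    print(merge_peaks(peaks))  # prints [(100, 260), (400, 500), (900, 950)]
```

Applying the same routine to the pooled clusters from several peak callers would give one simple reading of "meta-clusters", though a real pipeline would likely also track which callers and experiments support each cluster.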
Affiliation(s)
- Ivan Yevshin
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
- Ruslan Sharipov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Institute of Computational Technologies SB RAS, Novosibirsk 630090, Russian Federation
  - Novosibirsk State University, Novosibirsk 630090, Russian Federation
- Semyon Kolmykov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Institute of Cytology and Genetics SB RAS, Novosibirsk 630090, Russian Federation
- Yury Kondrakhin
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Institute of Computational Technologies SB RAS, Novosibirsk 630090, Russian Federation
- Fedor Kolpakov
  - BIOSOFT.RU, LLC, Novosibirsk 630090, Russian Federation
  - Institute of Computational Technologies SB RAS, Novosibirsk 630090, Russian Federation
60. de Jongh RP, van Dijk AD, Julsing MK, Schaap PJ, de Ridder D. Designing Eukaryotic Gene Expression Regulation Using Machine Learning. Trends Biotechnol 2020; 38:191-201. [DOI: 10.1016/j.tibtech.2019.07.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/08/2019] [Revised: 07/12/2019] [Accepted: 07/19/2019] [Indexed: 12/11/2022]
61. Cruz Díaz LA, Gutiérrez Ortega A, Chávez Álvarez RDC, Velarde Félix JS, Prado Montes de Oca E. Regulatory SNP rs5743417 impairs constitutive expression of human β-defensin 1 and has high frequency in Africans and Afro-Americans. Int J Immunogenet 2020; 47:332-341. [PMID: 31994826 DOI: 10.1111/iji.12475] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/30/2019] [Revised: 11/26/2019] [Accepted: 01/02/2020] [Indexed: 01/01/2023]
Abstract
The prediction of regulatory single nucleotide polymorphisms (rSNPs) in proximal promoters of disease-related genes could be a useful tool for personalized medicine in both patient stratification and customized therapy. Using our previously reported method of rSNP prediction (currently a software called SNPClinic v.1.0) as well as the PredictSNP tool, we performed in silico prediction of regulatory SNPs in the antimicrobial peptide human β-defensin 1 gene in three human cell lines from the 1000 Genomes Project (1kGP), namely A549 (epithelial cell line), HL-60 (neutrophils) and TH1 (lymphocytes). These predictions were run in a proximal pseudo-promoter comprising all common alleles at each polymorphic site according to the 1000 Genomes Project data (1kGP: ALL). Plasmid vectors containing either the major or the minor allele of a putative rSNP, rs5743417 (categorized as regulatory by SNPClinic and confirmed by PredictSNP), and a non-rSNP negative control were transfected into the human lung epithelial cell line A549. We assessed the functionality of rSNPs by qPCR using the Pfaffl method. In A549 cells, the minor allele of SNP rs5743417 (G→A) showed a significant reduction in gene expression, diminishing DEFB1 transcription by 33% when compared with the G major allele (p-value = .03). The SNP rs5743417 minor allele has high frequency in Gambians (8%, 1kGP population: GWD) and Afro-Americans (3.3%, 1kGP population: ASW). This SNP alters three transcription factor binding sites (TFBSs) comprising SREBP2 (sterols and haematopoietic pathways), CREB1 (cAMP, insulin and TNF pathways) and JUND (apoptosis, senescence and stress pathways) in the proximal promoter of DEFB1. Further in silico analysis reveals that this SNP also overlaps with GS1-24F4.2, a lincRNA gene complementary to the X Kell blood group related 5 (XKR5) mRNA. The potential clinical impact of the altered constitutive expression of DEFB1 caused by rSNP rs5743417 in DEFB1-associated diseases such as tuberculosis, COPD, asthma, cystic fibrosis and cancer in African and Afro-American populations deserves further research.
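For readers unfamiliar with the Pfaffl method named in this abstract, the following is a minimal sketch of the efficiency-corrected relative expression ratio it is based on. The function name `pfaffl_ratio` and all Ct values are hypothetical illustrations, not data from the study; mapping "control" to the major-allele construct and "sample" to the minor-allele construct is likewise an assumption.

```python
def pfaffl_ratio(
    e_target: float,
    e_ref: float,
    ct_target_control: float,
    ct_target_sample: float,
    ct_ref_control: float,
    ct_ref_sample: float,
) -> float:
    """Efficiency-corrected relative expression ratio (Pfaffl model).

    e_target and e_ref are per-cycle amplification factors
    (2.0 corresponds to perfect doubling in every cycle).
    """
    # Ct differences are taken as control minus sample for both genes.
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)


if __name__ == "__main__":
    # Made-up Ct values: the target gene crosses threshold one cycle later
    # in the sample, while the reference gene is unchanged.
    ratio = pfaffl_ratio(2.0, 2.0, 24.0, 25.0, 18.0, 18.0)
    print(ratio)  # prints 0.5
```

A ratio below 1 indicates lower target expression in the sample than in the control; under these assumptions, a 33% reduction like the one reported above would correspond to a ratio of about 0.67.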
Affiliation(s)
- Luis Antonio Cruz Díaz
  - Interinstitutional Posgrade in Science and Technology (PICYT), Research Center of Technology and Design Assistance of Jalisco State (CIATEJ A.C.), Guadalajara, Mexico
  - Laboratory of Regulatory SNPs, Personalized Medicine National Laboratory (LAMPER), Pharmaceutical and Medical Biotechnology, Central Unit, CIATEJ A.C., National Council of Science and Technology (CONACYT), Guadalajara, Mexico
- Abel Gutiérrez Ortega
  - Laboratory of Regulatory SNPs, Personalized Medicine National Laboratory (LAMPER), Pharmaceutical and Medical Biotechnology, Central Unit, CIATEJ A.C., National Council of Science and Technology (CONACYT), Guadalajara, Mexico
- Rocío Del Carmen Chávez Álvarez
  - Laboratory of Regulatory SNPs, Personalized Medicine National Laboratory (LAMPER), Pharmaceutical and Medical Biotechnology, Central Unit, CIATEJ A.C., National Council of Science and Technology (CONACYT), Guadalajara, Mexico
- Jesús Salvador Velarde Félix
  - Faculty of Chemical and Biological Sciences, Autonomous University of Sinaloa, Culiacan, Mexico
  - Faculty of Biology, Autonomous University of Sinaloa, Culiacan, Mexico
  - Genomic Medicine Center, Dr. Bernardo J. Gastélum Primary Care Hospital, Sinaloa Health Ministry, Culiacan, Mexico
- Ernesto Prado Montes de Oca
  - Laboratory of Regulatory SNPs, Personalized Medicine National Laboratory (LAMPER), Pharmaceutical and Medical Biotechnology, Central Unit, CIATEJ A.C., National Council of Science and Technology (CONACYT), Guadalajara, Mexico
  - Laboratory of Pharmacogenomics and Preventive Medicine, Personalized Medicine National Laboratory (LAMPER), Pharmaceutical and Medical Biotechnology, Central Unit, CIATEJ A.C., CONACYT, Guadalajara, Mexico
  - Scripps Research Translational Institute, La Jolla, CA, USA
  - Integrative Structural and Computational Biology, Scripps Research Institute, La Jolla, CA, USA
62. Shrinivas K, Sabari BR, Coffey EL, Klein IA, Boija A, Zamudio AV, Schuijers J, Hannett NM, Sharp PA, Young RA, Chakraborty AK. Enhancer Features that Drive Formation of Transcriptional Condensates. Mol Cell 2019; 75:549-561.e7. [PMID: 31398323 DOI: 10.1016/j.molcel.2019.07.009] [Citation(s) in RCA: 246] [Impact Index Per Article: 61.5] [Received: 12/15/2018] [Revised: 03/31/2019] [Accepted: 07/08/2019] [Indexed: 12/12/2022]
Abstract
Enhancers are DNA elements that are bound by transcription factors (TFs), which recruit coactivators and the transcriptional machinery to genes. Phase-separated condensates of TFs and coactivators have been implicated in assembling the transcription machinery at particular enhancers, yet the role of DNA sequence in this process has not been explored. We show that DNA sequences encoding TF binding site number, density, and affinity above sharply defined thresholds drive condensation of TFs and coactivators. A combination of specific structured (TF-DNA) and weak multivalent (TF-coactivator) interactions allows for condensates to form at particular genomic loci determined by the DNA sequence and the complement of expressed TFs. DNA features found to drive condensation promote enhancer activity and transcription in cells. Our study provides a framework to understand how the genome can scaffold transcriptional condensates at specific loci and how the universal phenomenon of phase separation might regulate this process.
Affiliation(s)
- Krishna Shrinivas
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Benjamin R Sabari
  - Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
- Eliot L Coffey
  - Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Isaac A Klein
  - Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA
- Ann Boija
  - Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
- Alicia V Zamudio
  - Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Jurian Schuijers
  - Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
- Nancy M Hannett
  - Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
- Phillip A Sharp
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Richard A Young
  - Whitehead Institute for Biomedical Research, 455 Main Street, Cambridge, MA 02142, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Arup K Chakraborty
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
  - Ragon Institute of Massachusetts General Hospital, Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02139, USA
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
63. Blanco E, González-Ramírez M, Alcaine-Colet A, Aranda S, Di Croce L. The Bivalent Genome: Characterization, Structure, and Regulation. Trends Genet 2019; 36:118-131. [PMID: 31818514 DOI: 10.1016/j.tig.2019.11.004] [Citation(s) in RCA: 104] [Impact Index Per Article: 20.8] [Received: 08/09/2019] [Revised: 10/28/2019] [Accepted: 11/08/2019] [Indexed: 02/05/2023]
Abstract
An intricate molecular machinery is at the core of gene expression regulation in every cell. During the initial stages of organismal development, the coordinated activation of diverse transcriptional programs is crucial and must be carefully executed to shape every organ and tissue. Bivalent promoters and poised enhancers are regulatory regions decorated with histone marks that are associated with both positive and negative transcriptional outcomes. These apparently contradictory signals are important for setting bivalent genes in a poised state, which is subsequently resolved during differentiation into either active or repressive states. We discuss the origins of bivalent promoters and the mechanisms implicated in their acquisition and maintenance. We further review how the presence of bivalent marks influences genome architecture. Finally, we highlight the potential link between bivalency and cancer which could drive biomedical research in disease etiology and treatment.
Affiliation(s)
- Enrique Blanco
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Carrer del Doctor Aiguader 88, 08003 Barcelona, Spain
- Mar González-Ramírez
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Carrer del Doctor Aiguader 88, 08003 Barcelona, Spain
- Anna Alcaine-Colet
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Carrer del Doctor Aiguader 88, 08003 Barcelona, Spain
- Sergi Aranda
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Carrer del Doctor Aiguader 88, 08003 Barcelona, Spain
- Luciano Di Croce
  - Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology, Carrer del Doctor Aiguader 88, 08003 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Plaça de la Mercè 10, 08002 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
64. Schikora-Tamarit MÀ, Lopez-Grado I Salinas G, Gonzalez-Navasa C, Calderón I, Marcos-Fa X, Sas M, Carey LB. Promoter Activity Buffering Reduces the Fitness Cost of Misregulation. Cell Rep 2018; 24:755-765. [PMID: 30021171 DOI: 10.1016/j.celrep.2018.06.059] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/06/2017] [Revised: 05/04/2018] [Accepted: 06/14/2018] [Indexed: 01/21/2023] [Open Access]
Abstract
Organisms regulate gene expression through changes in the activity of transcription factors (TFs). In yeast, the response of genes to changes in TF activity is generally assumed to be encoded in the promoter. To directly test this assumption, we chose 42 genes and, for each, replaced the promoter with a synthetic inducible promoter and measured how protein expression changes as a function of TF activity. Most genes exhibited gene-specific TF dose-response curves not due to differences in mRNA stability, translation, or protein stability. Instead, most genes have an intrinsic ability to buffer the effects of promoter activity. This can be encoded in the open reading frame and the 3' end of genes and can be implemented by both autoregulatory feedback and by titration of limiting trans regulators. We show experimentally and computationally that, when misexpression of a gene is deleterious, this buffering insulates cells from fitness defects due to misregulation.
Collapse
Affiliation(s)
- Miquel Àngel Schikora-Tamarit
- Systems Bioengineering Program, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Carrer Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Guillem Lopez-Grado I Salinas
- Systems Bioengineering Program, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Carrer Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Carolina Gonzalez-Navasa
- Systems Bioengineering Program, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Carrer Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Irene Calderón
- Systems Bioengineering Program, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Carrer Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Xavi Marcos-Fa
- Systems Bioengineering Program, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Carrer Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Miquel Sas
- Systems Bioengineering Program, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Carrer Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Lucas B Carey
- Systems Bioengineering Program, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Carrer Dr. Aiguader 88, 08003 Barcelona, Spain.
|
65
|
Forstnerič V, Oven I, Ogorevc J, Lainšček D, Praznik A, Lebar T, Jerala R, Horvat S. CRISPRa-mediated FOXP3 gene upregulation in mammalian cells. Cell Biosci 2019; 9:93. [PMID: 31832140 PMCID: PMC6873431 DOI: 10.1186/s13578-019-0357-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 08/27/2019] [Accepted: 11/15/2019] [Indexed: 12/18/2022]
Abstract
BACKGROUND Forkhead box P3+ (FOXP3+) regulatory T cells (Tregs) are a subset of lymphocytes critical for the maintenance of immune homeostasis. Loss-of-function mutations of the FOXP3 gene in animal models and humans result in loss of differentiation potential into Treg cells and are responsible for several immune-mediated inflammatory diseases. Strategies for increasing FOXP3 expression represent a potential approach to increase the pool of Tregs within the lymphocyte population and may be employed in therapies of diverse autoimmune conditions. In the present study, a dCas9 CRISPR-based method was systematically employed to achieve upregulation and sustained high expression of endogenous FOXP3 in HEK293 and human Jurkat T cell lines through targeting of the core promoter, three known regulatory regions of the FOXP3 gene (CNS1-3), and two additional regions selected through extensive bioinformatics analysis (Cage1 and Cage2). RESULTS Using an activator-domain fusion based dCas9 transcription activator, robust upregulation of FOXP3 was achieved, and an optimal combination of single guide RNAs was selected, which exerted an additive effect on FOXP3 gene upregulation. Simultaneous targeting of FOXP3 and EOS, a transcription factor known to act in concert with FOXP3 in initiating a Treg phenotype, resulted in upregulation of the FOXP3 downstream genes CD25 and TNFR2. When compared to ectopic expression of FOXP3 via plasmid electroporation, upregulation of endogenous FOXP3 via the Cas9-based method resulted in prolonged expression of FOXP3 in Jurkat cells. CONCLUSIONS Transfection of both HEK293 and Jurkat cells with dCas9-activators showed that regulatory regions downstream and upstream of the FOXP3 promoter can be very potent transcription inducers in comparison to targeting the core promoter.
While introduction of genes by conventional methods of gene therapy may involve a risk of insertional mutagenesis due to viral integration into the genome, transient up- or down-regulation of transcription by a CRISPR-dCas9 approach may resolve this safety concern. dCas9-based systems provide great promise in DNA footprint-free phenotype perturbations (perturbation without the risk of DNA damage) to drive development of transcription modulation-based therapies.
Affiliation(s)
- Vida Forstnerič
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Irena Oven
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Groblje 3, 1230 Domžale, Slovenia
| | - Jernej Ogorevc
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Groblje 3, 1230 Domžale, Slovenia
| | - Duško Lainšček
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Arne Praznik
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Tina Lebar
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana, Hajdrihova 19, 1000 Ljubljana, Slovenia
| | - Roman Jerala
- Department of Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana, Hajdrihova 19, 1000 Ljubljana, Slovenia
- EN-FIST Centre of Excellence, Trg Osvobodilne fronte 13, 1000 Ljubljana, Slovenia
| | - Simon Horvat
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Groblje 3, 1230 Domžale, Slovenia
|
66
|
Decoene T, De Maeseneire SL, De Mey M. Modulating transcription through development of semi-synthetic yeast core promoters. PLoS One 2019; 14:e0224476. [PMID: 31689317 PMCID: PMC6830820 DOI: 10.1371/journal.pone.0224476] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/10/2019] [Accepted: 10/15/2019] [Indexed: 01/07/2023]
Abstract
Altering gene expression regulation by promoter engineering is a very effective way to fine-tune heterologous pathways in eukaryotic hosts. Typically, pathway building approaches in yeast still use a limited set of long, native promoters. With today's introduction of longer and more complex pathways, an expansion of this synthetic biology toolbox is necessary. In this study we elucidated the core promoter structure of the well-characterized yeast TEF1 promoter and determined the minimal length needed for sufficient protein expression. Furthermore, this minimal core promoter sequence was used for the creation of a promoter library covering different expression strengths. This resulted in a group of short, 69 bp promoters with an 8.0-fold expression range. One exemplar had two and four times higher expression than the native CYC1 and ADH1 promoters, respectively. Additionally, as it has been described that the protein expression range can be broadened by upstream activating sequences (UASs), we integrated previously described single and multiple short, synthetic UASs in front of the strongest yeast core promoter. This approach resulted in further variation in protein expression and an overall promoter library spanning a 20-fold activity range and covering lengths from 69 bp to a maximum of 129 bp. Furthermore, the robustness of this library was assessed on three alternative carbon sources besides glucose. As such, the suitability of short yeast core promoters for metabolic engineering applications on different media, either in an individual context or combined with UAS elements, was demonstrated.
Affiliation(s)
- Thomas Decoene
- Centre for Synthetic Biology (CSB), Ghent University, Ghent, Belgium
| | - Sofie L. De Maeseneire
- Centre for Industrial Biotechnology and Biocatalysis (InBio.be), Ghent University, Coupure links, Ghent, Belgium
| | - Marjan De Mey
- Centre for Synthetic Biology (CSB), Ghent University, Ghent, Belgium
|
67
|
Beytebiere JR, Greenwell BJ, Sahasrabudhe A, Menet JS. Clock-controlled rhythmic transcription: is the clock enough and how does it work? Transcription 2019; 10:212-221. [PMID: 31595813 DOI: 10.1080/21541264.2019.1673636] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/06/2023]
Abstract
Circadian clocks regulate the rhythmic expression of thousands of genes underlying the daily oscillations of biological functions. Here, we discuss recent findings showing that the rhythmic transcriptional outputs of the circadian clock rely on mechanisms beyond clock gene DNA binding, which may ultimately contribute to the plasticity of circadian transcriptional programs.
Affiliation(s)
- Joshua R Beytebiere
- Department of Biology, Center for Biological Clock Research, Texas A&M University, TX, USA
| | - Ben J Greenwell
- Department of Biology, Center for Biological Clock Research, Texas A&M University, TX, USA.,Program of Genetics, Texas A&M University, College Station, TX, USA
| | - Aishwarya Sahasrabudhe
- Department of Biology, Center for Biological Clock Research, Texas A&M University, TX, USA
| | - Jerome S Menet
- Department of Biology, Center for Biological Clock Research, Texas A&M University, TX, USA.,Program of Genetics, Texas A&M University, College Station, TX, USA
|
68
|
Kemble H, Nghe P, Tenaillon O. Recent insights into the genotype-phenotype relationship from massively parallel genetic assays. Evol Appl 2019; 12:1721-1742. [PMID: 31548853 PMCID: PMC6752143 DOI: 10.1111/eva.12846] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 03/27/2019] [Revised: 06/21/2019] [Accepted: 07/02/2019] [Indexed: 12/20/2022]
Abstract
With the molecular revolution in biology, a mechanistic understanding of the genotype-phenotype relationship became possible. Recently, advances in DNA synthesis and sequencing have enabled the development of deep mutational scanning assays, capable of scoring comprehensive libraries of genotypes for fitness and a variety of phenotypes in massively parallel fashion. The resulting empirical genotype-fitness maps pave the way to predictive models, potentially accelerating our ability to anticipate the behaviour of pathogen and cancerous cell populations from sequencing data. Besides cellular fitness, phenotypes of direct application in industry (e.g. enzyme activity) and medicine (e.g. antibody binding) can be quantified and even selected directly by these assays. This review discusses the technological basis of and recent developments in massively parallel genetics, along with the trends it is uncovering in the genotype-phenotype relationship (distribution of mutation effects, epistasis), their possible mechanistic bases and future directions for advancing towards the goal of predictive genetics.
Affiliation(s)
- Harry Kemble
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS-ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
| | - Philippe Nghe
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS-ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
| | - Olivier Tenaillon
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
|
69
|
Toenhake CG, Bártfai R. What functional genomics has taught us about transcriptional regulation in malaria parasites. Brief Funct Genomics 2019; 18:290-301. [PMID: 31220867 PMCID: PMC6859821 DOI: 10.1093/bfgp/elz004] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/15/2018] [Revised: 02/08/2019] [Accepted: 03/14/2019] [Indexed: 12/16/2022]
Abstract
Malaria parasites are characterized by a complex life cycle that is accompanied by dynamic gene expression patterns. The factors and mechanisms that regulate gene expression in these parasites were being sought even before the advent of next-generation sequencing technologies. Functional genomics approaches have substantially boosted this area of research and have yielded significant insights into the interplay between epigenetic, transcriptional and post-transcriptional mechanisms. Recently, considerable progress has been made in identifying sequence-specific transcription factors and DNA-encoded regulatory elements. Here, we review the insights obtained from these efforts, including the characterization of core promoters, the involvement of sequence-specific transcription factors in life cycle progression and the mapping of gene regulatory elements. Furthermore, we discuss recent developments in the field of functional genomics and how they might contribute to further characterization of this complex gene regulatory network.
Affiliation(s)
- Christa G Toenhake
- Radboud University, Faculty of Science, Department of Molecular Biology, Nijmegen, the Netherlands
| | - Richárd Bártfai
- Radboud University, Faculty of Science, Department of Molecular Biology, Nijmegen, the Netherlands
|
70
|
Adam AC, Lie KK, Whatmore P, Jakt LM, Moren M, Skjærven KH. Profiling DNA methylation patterns of zebrafish liver associated with parental high dietary arachidonic acid. PLoS One 2019; 14:e0220934. [PMID: 31398226 PMCID: PMC6688801 DOI: 10.1371/journal.pone.0220934] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/20/2019] [Accepted: 07/26/2019] [Indexed: 12/18/2022]
Abstract
Diet has been shown to influence epigenetic key players, such as DNA methylation, which can regulate the gene expression potential in both parents and offspring. Diets enriched in omega-6 and deficient in omega-3 PUFAs (low dietary omega-3/omega-6 PUFA ratio) have been associated with the promotion of pathogenesis of diseases in humans and other mammals. In this study, we investigated the impact of increased dietary intake of arachidonic acid (ARA), a physiologically important omega-6 PUFA, on two generations of zebrafish. Parental fish were fed either a low or a high ARA diet, while the progeny of both groups were fed the low ARA diet. We screened for DNA methylation at single base-pair resolution using reduced representation bisulfite sequencing (RRBS). The DNA methylation profiling revealed significant differences between the dietary groups in both parents and offspring. The majority of differentially methylated loci associated with high dietary ARA were found in introns and intergenic regions for both generations. Common loci between the identified differentially methylated loci in F0 and F1 livers were reported. We described overlapping gene annotations of identified methylation changes with differential expression, but based on a small number of overlaps. The present study describes the diet-associated methylation profiles across genomic regions, and it demonstrates that parental high dietary ARA modulates DNA methylation patterns in zebrafish liver.
Affiliation(s)
| | | | | | - Lars Martin Jakt
- Faculty of Biosciences and Aquaculture, Nord University, Bodø, Norway
| | - Mari Moren
- Institute of Marine Research, Bergen, Norway
|
71
|
Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat Commun 2019; 10:3583. [PMID: 31395865 PMCID: PMC6687891 DOI: 10.1038/s41467-019-11526-w] [Citation(s) in RCA: 133] [Impact Index Per Article: 26.6] [Received: 06/12/2019] [Accepted: 07/15/2019] [Indexed: 02/06/2023]
Abstract
The majority of common variants associated with common diseases, as well as an unknown proportion of causal mutations for rare diseases, fall in noncoding regions of the genome. Although catalogs of noncoding regulatory elements are steadily improving, we have a limited understanding of the functional effects of mutations within them. Here, we perform saturation mutagenesis in conjunction with massively parallel reporter assays on 20 disease-associated gene promoters and enhancers, generating functional measurements for over 30,000 single nucleotide substitutions and deletions. We find that the density of putative transcription factor binding sites varies widely between regulatory elements, as does the extent to which evolutionary conservation or integrative scores predict functional effects. These data provide a powerful resource for interpreting the pathogenicity of clinically observed mutations in these disease-associated regulatory elements, and comprise a rich dataset for the further development of algorithms that aim to predict the regulatory effects of noncoding mutations.
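To make the scale of such a design concrete, the following is a minimal sketch of how a saturation mutagenesis variant set could be enumerated for one regulatory element. It is not the authors' pipeline: the function name `single_base_variants` is hypothetical, and it assumes that "deletions" means single-base deletions and that the element contains only A, C, G and T.

```python
def single_base_variants(seq):
    """Enumerate every single-nucleotide substitution and 1-bp deletion of seq.

    Returns a list of (position, ref, alt, mutated_sequence) tuples, where
    alt is "-" for a deletion. Hypothetical helper for illustration only.
    """
    bases = "ACGT"
    seq = seq.upper()
    variants = []
    for i, ref in enumerate(seq):
        # Three substitutions per position (every base except the reference).
        for alt in bases:
            if alt != ref:
                variants.append((i, ref, alt, seq[:i] + alt + seq[i + 1:]))
        # One single-base deletion per position.
        variants.append((i, ref, "-", seq[:i] + seq[i + 1:]))
    return variants


if __name__ == "__main__":
    # A made-up 4-bp element yields 4 * (3 substitutions + 1 deletion) variants.
    for variant in single_base_variants("ACGT")[:4]:
        print(variant)
```

Under these assumptions an element of length n contributes 4n variants, so the variant count grows linearly with element length.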
|
72
|
Caporale AL, Gonda CM, Franchini LF. Transcriptional Enhancers in the FOXP2 Locus Underwent Accelerated Evolution in the Human Lineage. Mol Biol Evol 2019; 36:2432-2450. [PMID: 31359064 DOI: 10.1093/molbev/msz173] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/28/2018] [Revised: 04/26/2019] [Accepted: 07/16/2019] [Indexed: 12/11/2022]
Abstract
Unique human features such as complex language are the result of molecular evolutionary changes that modified the developmental programs of our brain. The human-specific evolution of the forkhead box P2 (FOXP2) gene coding region has been linked to the emergence of speech and language in humankind. However, little is known about how the expression of FOXP2 is regulated and whether its regulatory machinery evolved in a lineage-specific manner in humans. In order to identify FOXP2 regulatory regions containing human-specific changes, we used databases of human accelerated non-coding sequences, or HARs. We found that the topologically associating domain (TAD) containing the FOXP2 locus, as determined in developing human cerebral cortex, includes two clusters of 12 HARs, placing the locus occupied by FOXP2 among the top regions showing fast acceleration rates in non-coding regions in the human genome. Using in vivo enhancer assays in zebrafish, we found that at least five FOXP2-HARs behave as transcriptional enhancers throughout different developmental stages. In addition, we found that at least two FOXP2-HARs direct the expression of the reporter gene EGFP to foxP2-expressing regions and cells. Moreover, we uncovered two FOXP2-HARs showing reporter expression gain of function in the nervous system when compared with the chimpanzee ortholog sequences. Our results indicate that regulatory sequences in the FOXP2 locus underwent a human-specific evolutionary process, suggesting that the transcriptional machinery controlling this gene could have also evolved differentially in the human lineage.
Affiliation(s)
- Alfredo Leandro Caporale
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular (INGEBI), Consejo de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
| | - Catalina M Gonda
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular (INGEBI), Consejo de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
| | - Lucía Florencia Franchini
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular (INGEBI), Consejo de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
|
73
|
Khatri BS, Goldstein RA. Biophysics and population size constrains speciation in an evolutionary model of developmental system drift. PLoS Comput Biol 2019; 15:e1007177. [PMID: 31335870 PMCID: PMC6677325 DOI: 10.1371/journal.pcbi.1007177] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 11/30/2018] [Revised: 08/02/2019] [Accepted: 06/13/2019] [Indexed: 02/06/2023]
Abstract
Developmental system drift is a likely mechanism for the origin of hybrid incompatibilities between closely related species. We examine here the detailed mechanistic basis of hybrid incompatibilities between two allopatric lineages, for a genotype-phenotype map of developmental system drift under stabilising selection, where an organismal phenotype is conserved, but the underlying molecular phenotypes and genotype can drift. This leads to a number of emergent phenomena not obtainable by modelling genotype or phenotype alone. Our results show that: 1) speciation is more rapid at smaller population sizes, with a characteristic, Orr-like, power law, but slow at large population sizes, characterised by a sub-diffusive growth law; 2) the molecular phenotypes under weakest selection contribute to the earliest incompatibilities; and 3) pair-wise incompatibilities dominate over higher-order ones, contrary to previous predictions that the latter should dominate. The population size effect we find is consistent with previous results on allopatric divergence of transcription factor-DNA binding, where smaller populations have common ancestors with a larger drift load because genetic drift favours phenotypes which have a larger number of genotypes (higher sequence entropy) over more fit phenotypes which have far fewer genotypes; this means fewer substitutions are required in either lineage before incompatibilities arise. Overall, our results indicate that biophysics and population size provide a much stronger constraint to speciation than suggested by previous models, and point to a general mechanistic principle of how incompatibilities arise under stabilising selection for an organismal phenotype. The process of speciation is of fundamental importance to the field of evolution as it is intimately connected to understanding the immense biodiversity of life.
There is still relatively little understanding of the underlying genetic mechanisms that give rise to hybrid incompatibilities, with results suggesting that divergence in transcription factor DNA binding and gene expression plays an important role. A key finding from the field of evo-devo is that organismal phenotypes show developmental system drift, where species maintain the same phenotype, but diverge in developmental pathways; this is an important potential source of hybrid incompatibilities. Here, we explore a theoretical framework to understand how incompatibilities arise due to developmental system drift, using a tractable, biophysically inspired genotype-phenotype map for spatial gene expression. Modelling the evolution of phenotypes in this way has the key advantage that it mirrors how selection works in nature, i.e. that selection acts on phenotypes, but variation (mutation) arises at the level of genotypes. This results, as we demonstrate, in a number of non-trivial and testable predictions concerning speciation due to developmental system drift, which would not be obtainable by modelling evolution of genotypes or phenotypes alone.
Affiliation(s)
| | - Richard A. Goldstein
- Division of Infection & Immunity, University College London, London, United Kingdom
|
74
|
Li S, Kvon EZ, Visel A, Pennacchio LA, Ovcharenko I. Stable enhancers are active in development, and fragile enhancers are associated with evolutionary adaptation. Genome Biol 2019; 20:140. [PMID: 31307522 PMCID: PMC6631995 DOI: 10.1186/s13059-019-1750-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/25/2019] [Accepted: 06/28/2019] [Indexed: 12/13/2022]
Abstract
Background Despite continual progress in the identification and characterization of trait- and disease-associated variants that disrupt transcription factor (TF)-DNA binding, little is known about the distribution of TF binding deactivating mutations (deMs) in enhancer sequences. Here, we focus on elucidating the mechanism underlying the different densities of deMs in human enhancers. Results We identify two classes of enhancers based on the density of nucleotides prone to deMs. Firstly, fragile enhancers with abundant deM nucleotides are associated with the immune system and regular cellular maintenance. Secondly, stable enhancers with only a few deM nucleotides are associated with the development and regulation of TFs and are evolutionarily conserved. These two classes of enhancers feature different regulatory programs: the binding sites of pioneer TFs of FOX family are specifically enriched in stable enhancers, while tissue-specific TFs are enriched in fragile enhancers. Moreover, stable enhancers are more tolerant of deMs due to their dominant employment of homotypic TF binding site (TFBS) clusters, as opposed to the larger-extent usage of heterotypic TFBS clusters in fragile enhancers. Notably, the sequence environment and chromatin context of the cognate motif, other than the motif itself, contribute more to the susceptibility to deMs of TF binding. Conclusions This dichotomy of enhancer activity is conserved across different tissues, has a specific footprint in epigenetic profiles, and argues for a bimodal evolution of gene regulatory programs in vertebrates. Specifically encoded stable enhancers are evolutionarily conserved and associated with development, while differently encoded fragile enhancers are associated with the adaptation of species.
Affiliation(s)
- Shan Li
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Evgeny Z Kvon
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Axel Visel
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.,United States Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA.,School of Natural Sciences, University of California, Merced, CA, 95343, USA
| | - Len A Pennacchio
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.,United States Department of Energy Joint Genome Institute, Walnut Creek, CA, 94598, USA.,Comparative Biochemistry Program, University of California, Berkeley, CA, 94720, USA
| | - Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA.
|
75
|
Omelina ES, Ivankin AV, Letiagina AE, Pindyurin AV. Optimized PCR conditions minimizing the formation of chimeric DNA molecules from MPRA plasmid libraries. BMC Genomics 2019; 20:536. [PMID: 31291895 PMCID: PMC6620194 DOI: 10.1186/s12864-019-5847-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 01/28/2023]
Abstract
Background Massively parallel reporter assays (MPRAs) enable high-throughput functional evaluation of various DNA regulatory elements and their mutant variants. The assays are based on construction of highly diverse plasmid libraries containing two variable fragments, a region of interest (a sequence under study; ROI) and a barcode (BC) used to uniquely tag each ROI, which are separated by a constant spacer sequence. The sequences of BC–ROI combinations present in the libraries may be either known a priori or not. In the latter case, it is necessary to identify these combinations before performing functional experiments. Typically, this is done by PCR amplification of the BC–ROI regions with flanking primers, followed by next-generation sequencing (NGS) of the products. However, chimeric DNA molecules formed on templates with identical spacer fragment during the amplification process may substantially hamper the identification of genuine BC–ROI combinations, and as a result lower the performance of the assays. Results To identify settings that minimize formation of chimeric products we tested a number of PCR amplification parameters, such as conventional and emulsion types of PCR, one- or two-round amplification strategies, amount of DNA template, number of PCR cycles, and the duration of the extension step. Using specific MPRA libraries as templates, we found that the two-round amplification of the BC–ROI regions with a very low initial template amount, an elongated extension step, and a specific number of PCR cycles result in as low as 0.30 and 0.32% of chimeric products for emulsion and conventional PCR approaches, respectively. Conclusions We have identified PCR parameters that ensure synthesis of specific (non-chimeric) products from highly diverse MPRA plasmid libraries. In addition, we found that there is a negligible difference in performance of emulsion and conventional PCR approaches performed with the identified settings. 
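As a rough illustration of what a chimeric-product percentage measures, the sketch below estimates it from sequenced barcode (BC) and region-of-interest (ROI) pairs. This is an assumed heuristic, not the method used in the study: for each barcode the most frequent ROI is treated as the genuine partner, and every read pairing that barcode with a different ROI is counted as chimeric. The function name `chimeric_fraction` and the input format are hypothetical.

```python
from collections import Counter, defaultdict


def chimeric_fraction(read_pairs):
    """Estimate the fraction of reads carrying a chimeric BC-ROI combination.

    read_pairs: iterable of (barcode, roi) tuples, one per sequencing read.
    Assumption: the most frequent ROI seen with a barcode is its true partner.
    """
    roi_counts = defaultdict(Counter)
    for barcode, roi in read_pairs:
        roi_counts[barcode][roi] += 1

    total = sum(sum(counts.values()) for counts in roi_counts.values())
    if total == 0:
        return 0.0
    # Reads that agree with the majority ROI of their barcode.
    genuine = sum(counts.most_common(1)[0][1] for counts in roi_counts.values())
    return (total - genuine) / total


if __name__ == "__main__":
    # Toy data: one of 20 reads pairs barcode "AAA" with the wrong ROI.
    reads = [("AAA", "roi1")] * 9 + [("AAA", "roi2")] + [("CCC", "roi2")] * 10
    print(f"{100 * chimeric_fraction(reads):.2f}% chimeric")
```

The heuristic undercounts chimeras when a chimeric pairing happens to dominate a barcode, so it should be read as a lower-bound style estimate under the stated assumption.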
Affiliation(s)
| | - Anton V Ivankin
- Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, Russia
| | - Anna E Letiagina
- Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, Russia.,Novosibirsk State University, Novosibirsk, Russia
| | - Alexey V Pindyurin
- Institute of Molecular and Cellular Biology SB RAS, Novosibirsk, Russia.,Novosibirsk State University, Novosibirsk, Russia
|
76
|
Wu MR, Nissim L, Stupp D, Pery E, Binder-Nissim A, Weisinger K, Enghuus C, Palacios SR, Humphrey M, Zhang Z, Maria Novoa E, Kellis M, Weiss R, Rabkin SD, Tabach Y, Lu TK. A high-throughput screening and computation platform for identifying synthetic promoters with enhanced cell-state specificity (SPECS). Nat Commun 2019; 10:2880. [PMID: 31253799 PMCID: PMC6599391 DOI: 10.1038/s41467-019-10912-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/17/2019] [Accepted: 05/28/2019] [Indexed: 01/26/2023]
Abstract
Cell state-specific promoters constitute essential tools for basic research and biotechnology because they activate gene expression only under certain biological conditions. Synthetic Promoters with Enhanced Cell-State Specificity (SPECS) can be superior to native ones, but the design of such promoters is challenging and frequently requires gene regulation or transcriptome knowledge that is not readily available. Here, to overcome this challenge, we use a next-generation sequencing approach combined with machine learning to screen a synthetic promoter library with 6107 designs for high-performance SPECS for potentially any cell state. We demonstrate the identification of multiple SPECS that exhibit distinct spatiotemporal activity during the programmed differentiation of induced pluripotent stem cells (iPSCs), as well as SPECS for breast cancer and glioblastoma stem-like cells. We anticipate that this approach could be used to create SPECS for gene therapies that are activated in specific cell states, as well as to study natural transcriptional regulatory networks.
Affiliation(s)
- Ming-Ru Wu
- Synthetic Biology Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Lior Nissim
- Department of Biochemistry and Molecular Biology, The Institute for Medical Research Israel-Canada, Hadassah Medical School, The Hebrew University of Jerusalem, 91120, Jerusalem, Israel
| | - Doron Stupp
- Department of Developmental Biology and Cancer Research, The Institute for Medical Research Israel-Canada, Hadassah Medical School, The Hebrew University of Jerusalem, 91120, Jerusalem, Israel
| | - Erez Pery
- Synthetic Biology Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Adina Binder-Nissim
- Synthetic Biology Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Karen Weisinger
- Synthetic Biology Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Casper Enghuus
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Sebastian R Palacios
- Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Melissa Humphrey
- Brain Tumor Research Center, Department of Neurosurgery, Massachusetts General Hospital, Boston, MA, 02144, USA
| | - Zhizhuo Zhang
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Eva Maria Novoa
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.,Center for Genomic Regulation (CRG), 08003, Barcelona, Spain
| | - Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Ron Weiss
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Samuel D Rabkin
- Brain Tumor Research Center, Department of Neurosurgery, Massachusetts General Hospital, Boston, MA, 02144, USA.,Department of Neurosurgery (Microbiology & Immunobiology), Harvard Medical School, Boston, MA, 02115, USA
| | - Yuval Tabach
- Department of Developmental Biology and Cancer Research, The Institute for Medical Research Israel-Canada, Hadassah Medical School, The Hebrew University of Jerusalem, 91120, Jerusalem, Israel.
| | - Timothy K Lu
- Synthetic Biology Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.,Biophysics Program, Harvard University, Boston, MA, 02115, USA.,Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
|
77
|
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
| | - David M McCandlish
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
|
78
|
Zeeshan S, Xiong R, Liang BT, Ahmed Z. 100 Years of evolving gene-disease complexities and scientific debutants. Brief Bioinform 2019; 21:885-905. [PMID: 30972412 DOI: 10.1093/bib/bbz038] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 01/09/2019] [Revised: 03/06/2019] [Accepted: 03/08/2019] [Indexed: 12/22/2022] Open
Abstract
It has been over 100 years since the word 'gene' came into use, and the concept has progressively evolved in several scientific directions. Periodic technological advancements have revolutionized the field of genomics, for example through triple code development, gene number proposition, genetic mapping, data banks, gene-disease maps, catalogs of human genes and genetic disorders, CRISPR/Cas9, big data and next generation sequencing. In this manuscript, we present the progress of genomics from pea plant genetics to the human genome project and highlight the molecular, technical and computational developments. Studying the genome and epigenome has revealed the fundamentals of the development and progression of human diseases, including chromosomal, monogenic, multifactorial and mitochondrial diseases. The World Health Organization has classified and standardized all human diseases and maintains this classification, while many academic and commercial online systems share information about genes and link them to associated diseases. To efficiently fathom the wealth of this biological data, there is a crucial need to generate appropriate gene annotation repositories and resources. Our focus is on how many gene-disease databases are available worldwide and which sources are authentic, regularly updated and recommended for research and clinical purposes. In this manuscript, we discuss and compare 43 such databases and bioinformatics applications, which enable users to connect, explore and, if possible, download gene-disease data.
Affiliation(s)
- Saman Zeeshan
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
| | - Ruoyun Xiong
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
| | - Bruce T Liang
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA.,Pat and Jim Calhoun Cardiology Center, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
| | - Zeeshan Ahmed
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, Farmington Ave, Farmington, CT, USA
|
79
|
Yildirim N, Aktas ME, Ozcan SN, Akbas E, Ay A. Differential transcriptional regulation by alternatively designed mechanisms: A mathematical modeling approach. In Silico Biol 2019; 12:95-127. [PMID: 27497472 DOI: 10.3233/isb-160467] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
Cells maintain cellular homeostasis by employing different regulatory mechanisms to respond to external stimuli. We study two groups of signal-dependent transcriptional regulatory mechanisms. In the first group, we assume that repressor and activator proteins compete for binding to the same regulatory site on DNA (competitive mechanisms). In the second group, they can bind to different regulatory regions in a noncompetitive fashion (noncompetitive mechanisms). For both competitive and noncompetitive mechanisms, we studied the gene expression dynamics by increasing the repressor or decreasing the activator abundance (inhibition mechanisms), or by decreasing the repressor or increasing the activator abundance (activation mechanisms). We employed delay differential equation models. Our simulation results show that the competitive and noncompetitive inhibition mechanisms exhibit comparable repression effectiveness. However, response time is fastest in the noncompetitive inhibition mechanism due to increased repressor abundance, and slowest in the competitive inhibition mechanism by increased repressor level. The competitive and noncompetitive inhibition mechanisms through decreased activator abundance show comparable and moderate response times, while the competitive and noncompetitive activation mechanisms by increased activator protein level display more effective and faster response. Our study exemplifies the importance of mathematical modeling and computer simulation in the analysis of gene expression dynamics.
Affiliation(s)
- Necmettin Yildirim
- Division of Natural Sciences, New College of Florida, Bayshore Road, Sarasota, FL, USA
| | - Mehmet Emin Aktas
- Department of Mathematics, Florida State University, W College Ave, Tallahassee, FL, USA
| | - Seyma Nur Ozcan
- Department of Mathematics, North Carolina State University, Raleigh, NC, USA
| | - Esra Akbas
- Department of Computer Science, Florida State University, W College Ave, Tallahassee, FL, USA
| | - Ahmet Ay
- Departments of Biology and Mathematics, Colgate University, Oak Drive, Hamilton, NY, USA
|
80
|
Osman NM, Kitapci TH, Vlaho S, Wunderlich Z, Nuzhdin SV. Inference of Transcription Factor Regulation Patterns Using Gene Expression Covariation in Natural Populations of Drosophila melanogaster. Biophysics (Nagoya-shi) 2019; 63:43-51. [PMID: 30739944 DOI: 10.1134/s0006350918010128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
Gene regulatory networks control the complex programs that drive development. Deciphering the connections between transcription factors (TFs) and target genes is challenging, in part because TFs bind to thousands of places in the genome but control expression through a subset of these binding events. We hypothesize that we can combine natural variation of expression levels and predictions of TF binding sites to identify TF targets. We gather RNA-seq data from 71 genetically distinct F1 Drosophila melanogaster embryos and calculate the correlations between TF and potential target genes' expression levels, which we call "regulatory strength." To separate direct and indirect TF targets, we hypothesize that direct TF targets will have a preponderance of binding sites in their upstream regions. Using 14 TFs active during embryogenesis, we find that 12 TFs showed a significant correlation between their binding strength and regulatory strength on downstream targets, and 10 TFs showed a significant correlation between the number of binding sites and the regulatory effect on target genes. The general roles, e.g. bicoid's role as an activator, and the particular interactions we observed between our TFs, e.g. twist's role as a repressor of sloppy paired and odd paired, generally coincide with the literature.
Affiliation(s)
- Noha M Osman
- University of Southern California, Los Angeles, CA.,National Research Centre, Dokki, Giza, Egypt
| | - Srna Vlaho
- University of Southern California, Los Angeles, CA
| | - Sergey V Nuzhdin
- University of Southern California, Los Angeles, CA.,Saint Petersburg Polytechnical University, St Petersburg, Russia
|
81
|
Forcier TL, Ayaz A, Gill MS, Jones D, Phillips R, Kinney JB. Measuring cis-regulatory energetics in living cells using allelic manifolds. eLife 2018; 7:40618. [PMID: 30570483 PMCID: PMC6301791 DOI: 10.7554/elife.40618] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 07/31/2018] [Accepted: 11/27/2018] [Indexed: 12/04/2022] Open
Abstract
Gene expression in all organisms is controlled by cooperative interactions between DNA-bound transcription factors (TFs), but quantitatively measuring TF-DNA and TF-TF interactions remains difficult. Here we introduce a strategy for precisely measuring the Gibbs free energy of such interactions in living cells. This strategy centers on the measurement and modeling of ‘allelic manifolds’, a multidimensional generalization of the classical genetics concept of allelic series. Allelic manifolds are measured using reporter assays performed on strategically designed cis-regulatory sequences. Quantitative biophysical models are then fit to the resulting data. We used this strategy to study regulation by two Escherichia coli TFs, CRP and σ70 RNA polymerase. Doing so, we consistently obtained energetic measurements precise to ∼0.1 kcal/mol. We also obtained multiple results that deviate from the prior literature. Our strategy is compatible with massively parallel reporter assays in both prokaryotes and eukaryotes, and should therefore be highly scalable and broadly applicable. Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that minor issues remain unresolved (see decision letter).
Affiliation(s)
- Talitha L Forcier
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
| | - Andalus Ayaz
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
| | - Manraj S Gill
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
| | - Daniel Jones
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States.,Department of Applied Physics, California Institute of Technology, Pasadena, United States
| | - Rob Phillips
- Department of Applied Physics, California Institute of Technology, Pasadena, United States
| | - Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
|
82
|
Single-cell and single-molecule epigenomics to uncover genome regulation at unprecedented resolution. Nat Genet 2018; 51:19-25. [DOI: 10.1038/s41588-018-0290-x] [Citation(s) in RCA: 115] [Impact Index Per Article: 19.2] [Received: 10/29/2017] [Accepted: 08/30/2018] [Indexed: 12/19/2022]
|
83
|
Yella VR, Bhimsaria D, Ghoshdastidar D, Rodríguez-Martínez J, Ansari AZ, Bansal M. Flexibility and structure of flanking DNA impact transcription factor affinity for its core motif. Nucleic Acids Res 2018; 46:11883-11897. [PMID: 30395339 PMCID: PMC6294565 DOI: 10.1093/nar/gky1057] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 04/12/2018] [Revised: 10/11/2018] [Accepted: 10/17/2018] [Indexed: 01/13/2023] Open
Abstract
Spatial and temporal expression of genes is essential for maintaining phenotype integrity. Transcription factors (TFs) modulate expression patterns by binding to specific DNA sequences in the genome. Along with the core binding motif, the flanking sequence context can play a role in DNA-TF recognition. Here, we employ high-throughput in vitro and in silico analyses to understand the influence of sequences flanking the cognate sites on the binding of the three most prevalent eukaryotic TF families (zinc finger, homeodomain and bZIP). In vitro binding preferences of each TF toward the entire DNA sequence space were correlated with a wide range of DNA structural parameters, including DNA flexibility. Results demonstrate that conformational plasticity of flanking regions modulates binding affinity of certain TF families. DNA duplex stability and minor groove width also play an important role in DNA-TF recognition but differ in how exactly they influence the binding in each specific case. Our analyses further reveal that the structural features of preferred flanking sequences are not universal, as similar DNA-binding folds can employ distinct DNA recognition modes.
Affiliation(s)
- Venkata Rajesh Yella
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh 522502, India
| | - Devesh Bhimsaria
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - José A Rodríguez-Martínez
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Department of Biology, University of Puerto Rico-Rio Piedras, San Juan, PR 00925, USA
| | - Aseem Z Ansari
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- The Genome Center of Wisconsin, Madison, WI 53706, USA
| | - Manju Bansal
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
|
84
|
Fu S, Wang Q, Moore JE, Purcaro MJ, Pratt HE, Fan K, Gu C, Jiang C, Zhu R, Kundaje A, Lu A, Weng Z. Differential analysis of chromatin accessibility and histone modifications for predicting mouse developmental enhancers. Nucleic Acids Res 2018; 46:11184-11201. [PMID: 30137428 PMCID: PMC6265487 DOI: 10.1093/nar/gky753] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 02/02/2018] [Revised: 07/15/2018] [Accepted: 08/08/2018] [Indexed: 12/11/2022] Open
Abstract
Enhancers are distal cis-regulatory elements that modulate gene expression. They are depleted of nucleosomes and enriched in specific histone modifications; thus, calling DNase-seq and histone mark ChIP-seq peaks can predict enhancers. We evaluated nine peak-calling algorithms for predicting enhancers validated by transgenic mouse assays. DNase and H3K27ac peaks were consistently more predictive than H3K4me1/2/3 and H3K9ac peaks. DFilter and Hotspot2 were the best DNase peak callers, while HOMER, MUSIC, MACS2, DFilter and F-seq were the best H3K27ac peak callers. We observed that the differential DNase or H3K27ac signals between two distant tissues increased the area under the precision-recall curve (PR-AUC) of DNase peaks by 17.5-166.7% and that of H3K27ac peaks by 7.1-22.2%. We further improved this differential signal method using multiple contrast tissues. Evaluated using a blind test, the differential H3K27ac signal method substantially improved PR-AUC from 0.48 to 0.75 for predicting heart enhancers. We further validated our approach using postnatal retina and cerebral cortex enhancers identified by massively parallel reporter assays, and observed improvements for both tissues. In summary, we compared nine peak callers and devised a superior method for predicting tissue-specific mouse developmental enhancers by reranking the called peaks.
Affiliation(s)
- Shaliu Fu
- Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Qin Wang
- Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Michael J Purcaro
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Henry E Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Kaili Fan
- Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Cuihua Gu
- Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Cizhong Jiang
- Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Ruixin Zhu
- Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Anshul Kundaje
- Department of Genetics, School of Medicine, Department of Computer Science, Stanford University, Stanford, CA 94305, USA
| | - Aiping Lu
- Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Zhiping Weng
- Clinical Translational Research Center, Shanghai Pulmonary Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
|
85
|
Rahane CS, Kutzner A, Heese K. A cancer tissue-specific FAM72 expression profile defines a novel glioblastoma multiform (GBM) gene-mutation signature. J Neurooncol 2018; 141:57-70. [PMID: 30414097 DOI: 10.1007/s11060-018-03029-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 07/23/2018] [Accepted: 10/09/2018] [Indexed: 11/26/2022]
Abstract
INTRODUCTION: Glioblastoma multiform (GBM) is a neural stem cell (NSC)-derived malignant brain tumor with complex genetic alterations challenging clinical treatments. FAM72 is a NSC-specific protein comprised of four paralogous genes (FAM72 A-D) in the human genome, but its functional tumorigenic significance is unclear. METHODS: We conducted an in-depth expression and somatic mutation data analysis of FAM72 (A-D) in GBM using the comprehensive human clinical cancer study database cBioPortal [including The Cancer Genome Atlas (TCGA)]. RESULTS: We established a FAM72 transcription profile across TCGA correlated with the expression of the proliferative marker MKI67 and a tissue-specific gene-mutation signature represented by pivotal genes involved in driving the cell cycle. FAM72 paralogs are overexpressed in cancer cells, specifically correlating with the mitotic cell cycle genes ASPM, KIF14, KIF23, CENPE, CEP55, SGO1, and BUB1, thereby contributing to centrosome and mitotic spindle formation. FAM72 expression correlation identifies a novel GBM-specific gene set (SCN9A, MXRA5, ADAM29, KDR, LRP1B, and PIK3C2G) in the de novo pathway of primary GBM predestined as viable targets for therapeutics. CONCLUSION: Our newly identified primary GBM-specific gene-mutation signature, along with FAM72, could thus provide a new basis for prognostic biomarkers for diagnostics of GBM and could serve as potential therapeutic targets.
Affiliation(s)
- Chinmay Satish Rahane
- Graduate School of Biomedical Science and Engineering, Hanyang University, 222 Wangsimni-ro, Seongdong-gu, Seoul, 133-791, Republic of Korea
| | - Arne Kutzner
- Department of Information Systems, College of Engineering, Hanyang University, 222, Wangsimni-ro, Seongdong-gu, Seoul, 133-791, Republic of Korea
| | - Klaus Heese
- Graduate School of Biomedical Science and Engineering, Hanyang University, 222 Wangsimni-ro, Seongdong-gu, Seoul, 133-791, Republic of Korea.
|
86
|
Fischer V, Schumacher K, Tora L, Devys D. Global role for coactivator complexes in RNA polymerase II transcription. Transcription 2018; 10:29-36. [PMID: 30299209 PMCID: PMC6351120 DOI: 10.1080/21541264.2018.1521214] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/19/2022] Open
Abstract
SAGA and TFIID are related transcription complexes, which were proposed to alternatively deliver TBP at different promoter classes. Recent genome-wide studies in yeast revealed that both complexes are required for the transcription of a vast majority of genes by RNA polymerase II raising new questions about the role of coactivators.
Affiliation(s)
- Veronique Fischer
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France.,Centre National de la Recherche Scientifique, UMR7104, Illkirch, France.,Institut National de la Santé et de la Recherche Médicale, Illkirch, France.,Université de Strasbourg, Illkirch, France
| | - Kenny Schumacher
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France.,Centre National de la Recherche Scientifique, UMR7104, Illkirch, France.,Institut National de la Santé et de la Recherche Médicale, Illkirch, France.,Université de Strasbourg, Illkirch, France
| | - Laszlo Tora
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France.,Centre National de la Recherche Scientifique, UMR7104, Illkirch, France.,Institut National de la Santé et de la Recherche Médicale, Illkirch, France.,Université de Strasbourg, Illkirch, France
| | - Didier Devys
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France.,Centre National de la Recherche Scientifique, UMR7104, Illkirch, France.,Institut National de la Santé et de la Recherche Médicale, Illkirch, France.,Université de Strasbourg, Illkirch, France
|
87
|
Ross J, Kuzin A, Brody T, Odenwald WF. Mutational analysis of a Drosophila neuroblast enhancer governing nubbin expression during CNS development. Genesis 2018; 56:e23237. [PMID: 30005136 PMCID: PMC6175444 DOI: 10.1002/dvg.23237] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/23/2018] [Revised: 06/07/2018] [Accepted: 06/22/2018] [Indexed: 11/17/2022]
Abstract
While developmental studies of Drosophila neural stem cell lineages have identified transcription factors (TFs) important to cell identity decisions, currently only an incomplete understanding exists of the cis‐regulatory elements that control the dynamic expression of these TFs. Our previous studies have identified multiple enhancers that regulate the POU‐domain TF paralogs nubbin and pdm‐2 genes. Evolutionary comparative analysis of these enhancers reveals that they each contain multiple conserved sequence blocks (CSBs) that span TF DNA‐binding sites for known regulators of neuroblast (NB) gene expression in addition to novel sequences. This study functionally analyzes the conserved DNA sequence elements within a NB enhancer located within the nubbin gene and highlights a high level of complexity underlying enhancer structure. Mutational analysis has revealed CSBs that are important for enhancer activation and silencing in the developing CNS. We have also observed that adjusting the number and relative positions of the TF binding sites within these CSBs alters enhancer function.
Affiliation(s)
- Jermaine Ross
- Neural Cell-Fate Determinants Section, NINDS, NIH, Bethesda, Maryland
| | - Alexander Kuzin
- Neural Cell-Fate Determinants Section, NINDS, NIH, Bethesda, Maryland
| | - Thomas Brody
- Neural Cell-Fate Determinants Section, NINDS, NIH, Bethesda, Maryland
| | - Ward F Odenwald
- Neural Cell-Fate Determinants Section, NINDS, NIH, Bethesda, Maryland
|
88
|
Mishra A, Siwach P, Misra P, Jayaram B, Bansal M, Olson WK, Thayer KM, Beveridge DL. Toward a Universal Structural and Energetic Model for Prokaryotic Promoters. Biophys J 2018; 115:1180-1189. [PMID: 30172386 DOI: 10.1016/j.bpj.2018.08.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/15/2018] [Revised: 07/28/2018] [Accepted: 08/02/2018] [Indexed: 01/04/2023] Open
Abstract
With almost no consensus promoter sequence in prokaryotes, recruitment of RNA polymerase (RNAP) to precise transcriptional start sites (TSSs) has remained an unsolved puzzle. Uncovering the underlying mechanism is critical for understanding the principle of gene regulation. We searched for a hidden code in the structure and energetics of ∼16,500 promoters from 12 prokaryotes representing two kingdoms. Twenty-eight fundamental parameters of DNA structure including backbone angles, basepair axis, and interbasepair and intrabasepair parameters were used, and information was extracted from x-ray crystallography data. Three parameters (solvation energy, hydrogen-bond energy, and stacking energy) were selected for creating energetics profiles using in-house programs. Promoter-region DNA was found to undergo a change in every parameter examined in the study, in all prokaryotes. The change starts some distance upstream of the TSS and continues some distance past it, giving promoter regions a signature state. These signature states might be the universal hidden codes recognized by RNAP. This observation was reiterated when randomly selected promoter sequences (with little sequence conservation) were subjected to structure generation; all developed into very similar three-dimensional structures quite distinct from those of conventional B-DNA and coding sequences. Fine structural details at important promoter motifs (viz., the -11, -35, and -75 positions relative to the TSS) reveal insights, novel to our knowledge, into RNAP interaction at these locations; for example, particular structural changes at the -11 region may allow insertion of RNAP amino acids into the interbasepair space and facilitate the flipping out of bases from the DNA duplex.
Affiliation(s)
- Akhilesh Mishra
  - Supercomputing Facility for Bioinformatics & Computational Biology; Kusuma School of Biological Sciences, Indian Institute of Technology, Delhi, India
- Priyanka Siwach
  - Supercomputing Facility for Bioinformatics & Computational Biology; Department of Biotechnology, Chaudhary Devi Lal University, Sirsa, Haryana, India
- Pallavi Misra
  - Supercomputing Facility for Bioinformatics & Computational Biology
- Bhyravabhotla Jayaram
  - Supercomputing Facility for Bioinformatics & Computational Biology; Kusuma School of Biological Sciences, Indian Institute of Technology, Delhi, India; Department of Chemistry, Indian Institute of Technology, Delhi, India
- Manju Bansal
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
- Wilma K Olson
  - Department of Chemistry & Chemical Biology and BioMaPS Institute for Quantitative Biology, Rutgers, Piscataway, New Jersey
- Kelly M Thayer
  - Department of Chemistry, Vassar College, Poughkeepsie, New York
- David L Beveridge
  - Departments of Chemistry, Molecular Biology, and Biochemistry and Molecular Biophysics Program, Wesleyan University, Middletown, Connecticut
|
89
|
Li G, Martínez-Bonet M, Wu D, Yang Y, Cui J, Nguyen HN, Cunin P, Levescot A, Bai M, Westra HJ, Okada Y, Brenner MB, Raychaudhuri S, Hendrickson EA, Maas RL, Nigrovic PA. High-throughput identification of noncoding functional SNPs via type IIS enzyme restriction. Nat Genet 2018; 50:1180-1188. [PMID: 30013183 PMCID: PMC6072570 DOI: 10.1038/s41588-018-0159-z] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 02/04/2017] [Accepted: 05/04/2018] [Indexed: 02/06/2023]
Abstract
Genome-wide association studies (GWAS) have identified many disease-associated noncoding variants, but cannot distinguish functional single-nucleotide polymorphisms (fSNPs) from others that reside incidentally within risk loci. To address this challenge, we developed an unbiased high-throughput screen that employs type IIS enzymatic restriction to identify fSNPs that allelically modulate the binding of regulatory proteins. We coupled this approach, termed SNP-seq, with flanking restriction enhanced pulldown (FREP) to identify regulation of CD40 by three disease-associated fSNPs via four regulatory proteins, RBPJ, RSRC2 and FUBP-1/TRAP150. Applying this approach across 27 loci associated with juvenile idiopathic arthritis, we identified 148 candidate fSNPs, including two that regulate STAT4 via the regulatory proteins SATB2 and H1.2. Together, these findings establish the utility of tandem SNP-seq/FREP to bridge the gap between GWAS and disease mechanism.
Affiliation(s)
- Gang Li
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Division of Cardiology and The Aging Institute, University of Pittsburgh, Pittsburgh, PA, USA
- Marta Martínez-Bonet
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Di Wu
  - Department of Periodontology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Yu Yang
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Division of Cardiology and The Aging Institute, University of Pittsburgh, Pittsburgh, PA, USA
- Jing Cui
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Hung N Nguyen
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Pierre Cunin
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Anaïs Levescot
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Ming Bai
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Harm-Jan Westra
  - Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Yukinori Okada
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Osaka, Japan
  - Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita, Japan
- Michael B Brenner
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Soumya Raychaudhuri
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - School of Biological Sciences, University of Manchester, Manchester, UK
- Eric A Hendrickson
  - Biochemistry, Molecular Biology and Biophysics Department, University of Minnesota Medical School, Minneapolis, MN, USA
- Richard L Maas
  - Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Peter A Nigrovic
  - Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Division of Immunology, Boston Children's Hospital, Boston, MA, USA
|
90
|
Khamis AM, Motwalli O, Oliva R, Jankovic BR, Medvedeva YA, Ashoor H, Essack M, Gao X, Bajic VB. A novel method for improved accuracy of transcription factor binding site prediction. Nucleic Acids Res 2018; 46:e72. [PMID: 29617876 PMCID: PMC6037060 DOI: 10.1093/nar/gky237] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 06/19/2017] [Revised: 03/01/2018] [Accepted: 03/20/2018] [Indexed: 12/12/2022] Open
Abstract
Identifying transcription factor (TF) binding sites (TFBSs) is important in the computational inference of gene regulation. Widely used computational methods of TFBS prediction based on position weight matrices (PWMs) usually have high false positive rates. Moreover, computational studies of transcription regulation in eukaryotes frequently require numerous PWM models of TFBSs due to a large number of TFs involved. To overcome these problems we developed DRAF, a novel method for TFBS prediction that requires only 14 prediction models for 232 human TFs, while at the same time significantly improves prediction accuracy. DRAF models use more features than PWM models, as they combine information from TFBS sequences and physicochemical properties of TF DNA-binding domains into machine learning models. Evaluation of DRAF on 98 human ChIP-seq datasets shows on average 1.54-, 1.96- and 5.19-fold reduction of false positives at the same sensitivities compared to models from HOCOMOCO, TRANSFAC and DeepBind, respectively. This observation suggests that one can efficiently replace the PWM models for TFBS prediction by a small number of DRAF models that significantly improve prediction accuracy. The DRAF method is implemented in a web tool and in a stand-alone software freely available at http://cbrc.kaust.edu.sa/DRAF.
Affiliation(s)
- Abdullah M Khamis
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Olaa Motwalli
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Romina Oliva
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
  - Department of Sciences and Technologies, University ‘Parthenope’ of Naples, Centro Direzionale Isola C4 80143, Naples, Italy
- Boris R Jankovic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Yulia A Medvedeva
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
  - Institute of Bioengineering, Research Centre of Biotechnology, Russian Academy of Science, 117312 Moscow, Russia
  - Department of Computational Biology, Vavilov Institute of General Genetics, Russian Academy of Science, 119991 Moscow, Russia
  - Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, 141701, Dolgoprudny, Moscow Region, Russia
- Haitham Ashoor
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Magbubah Essack
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Xin Gao
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
- Vladimir B Bajic
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal 23955–6900, Saudi Arabia
|
91
|
Wang M, Tai C, E W, Wei L. DeFine: deep convolutional neural networks accurately quantify intensities of transcription factor-DNA binding and facilitate evaluation of functional non-coding variants. Nucleic Acids Res 2018; 46:e69. [PMID: 29617928 PMCID: PMC6009584 DOI: 10.1093/nar/gky215] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 07/03/2017] [Revised: 03/12/2018] [Accepted: 03/14/2018] [Indexed: 01/19/2023] Open
Abstract
The complex system of gene expression is regulated by the cell type-specific binding of transcription factors (TFs) to regulatory elements. Identifying variants that disrupt TF binding and lead to human diseases remains a great challenge. To address this, we implement sequence-based deep learning models that accurately predict the TF binding intensities to given DNA sequences. In addition to accurately classifying TF-DNA binding or unbinding, our models are capable of accurately predicting real-valued TF binding intensities by leveraging large-scale TF ChIP-seq data. The changes in the TF binding intensities between the altered sequence and the reference sequence reflect the degree of functional impact for the variant. This enables us to develop the tool DeFine (Deep learning based Functional impact of non-coding variants evaluator, http://define.cbi.pku.edu.cn) with improved performance for assessing the functional impact of non-coding variants including SNPs and indels. DeFine accurately identifies the causal functional non-coding variants from disease-associated variants in GWAS. DeFine is an effective and easy-to-use tool that facilitates systematic prioritization of functional non-coding variants.
Affiliation(s)
- Meng Wang
  - Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, 100871, P.R. China
- Cheng Tai
  - Center for Data Science, Peking University, Beijing, 100871, P.R. China
  - Beijing Institute of Big Data Research, Beijing, 100871, P.R. China
- Weinan E
  - Center for Data Science, Peking University, Beijing, 100871, P.R. China
  - Beijing Institute of Big Data Research, Beijing, 100871, P.R. China
  - Department of Mathematics and PACM, Princeton University, Princeton, NJ, 08544, USA
- Liping Wei
  - Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, 100871, P.R. China
|
92
|
Guo Y, Tian K, Zeng H, Guo X, Gifford DK. A novel k-mer set memory (KSM) motif representation improves regulatory variant prediction. Genome Res 2018; 28:891-900. [PMID: 29654070 PMCID: PMC5991515 DOI: 10.1101/gr.226852.117] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 06/29/2017] [Accepted: 04/04/2018] [Indexed: 12/15/2022]
Abstract
The representation and discovery of transcription factor (TF) sequence binding specificities is critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k-mer set memory (KSM), which consists of a set of aligned k-mers that are overrepresented at TF binding sites, and a new method called KMAC for de novo discovery of KSMs. We find that KSMs more accurately predict in vivo binding sites than position weight matrix (PWM) models and other more complex motif models across a large set of ChIP-seq experiments. Furthermore, KSMs outperform PWMs and more complex motif models in predicting in vitro binding sites. KMAC also identifies correct motifs in more experiments than five state-of-the-art motif discovery methods. In addition, KSM-derived features outperform both PWM and deep learning model derived sequence features in predicting differential regulatory activities of expression quantitative trait loci (eQTL) alleles. Finally, we have applied KMAC to 1600 ENCODE TF ChIP-seq data sets and created a public resource of KSM and PWM motifs. We expect that the KSM representation and KMAC method will be valuable in characterizing TF binding specificities and in interpreting the effects of noncoding genetic variations.
Affiliation(s)
- Yuchun Guo
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Kevin Tian
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Haoyang Zeng
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Xiaoyun Guo
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- David Kenneth Gifford
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
|
93
|
Aymoz D, Solé C, Pierre JJ, Schmitt M, de Nadal E, Posas F, Pelet S. Timing of gene expression in a cell-fate decision system. Mol Syst Biol 2018; 14:e8024. [PMID: 29695607 PMCID: PMC5916086 DOI: 10.15252/msb.20178024] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 11/18/2022] Open
Abstract
During development, morphogens provide extracellular cues allowing cells to select a specific fate by inducing complex transcriptional programs. The mating pathway in budding yeast offers simplified settings to understand this process. Pheromone secreted by the mating partner triggers the activity of a MAPK pathway, which results in the expression of hundreds of genes. Using a dynamic expression reporter, we quantified the kinetics of gene expression in single cells upon exogenous pheromone stimulation and in the physiological context of mating. In both conditions, we observed striking differences in the timing of induction of mating‐responsive promoters. Biochemical analyses and generation of synthetic promoter variants demonstrated how the interplay between transcription factor binding and nucleosomes contributes to determine the kinetics of transcription in a simplified cell‐fate decision system.
Affiliation(s)
- Delphine Aymoz
  - Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
- Carme Solé
  - Cell Signaling Research Group, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
- Jean-Jerrold Pierre
  - Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
- Marta Schmitt
  - Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
- Eulàlia de Nadal
  - Cell Signaling Research Group, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
- Francesc Posas
  - Cell Signaling Research Group, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
- Serge Pelet
  - Department of Fundamental Microbiology, University of Lausanne, Lausanne, Switzerland
|
94
|
Toenhake CG, Fraschka SAK, Vijayabaskar MS, Westhead DR, van Heeringen SJ, Bártfai R. Chromatin Accessibility-Based Characterization of the Gene Regulatory Network Underlying Plasmodium falciparum Blood-Stage Development. Cell Host Microbe 2018; 23:557-569.e9. [PMID: 29649445 PMCID: PMC5899830 DOI: 10.1016/j.chom.2018.03.007] [Citation(s) in RCA: 108] [Impact Index Per Article: 18.0] [Received: 09/22/2017] [Revised: 02/05/2018] [Accepted: 03/05/2018] [Indexed: 02/07/2023]
Abstract
Underlying the development of malaria parasites within erythrocytes and the resulting pathogenicity is a hardwired program that secures proper timing of gene transcription and production of functionally relevant proteins. How stage-specific gene expression is orchestrated in vivo remains unclear. Here, using the assay for transposase accessible chromatin sequencing (ATAC-seq), we identified ∼4,000 regulatory regions in P. falciparum intraerythrocytic stages. The vast majority of these sites are located within 2 kb upstream of transcribed genes and their chromatin accessibility pattern correlates positively with abundance of the respective mRNA transcript. Importantly, these regions are sufficient to drive stage-specific reporter gene expression and DNA motifs enriched in stage-specific sets of regulatory regions interact with members of the P. falciparum AP2 transcription factor family. Collectively, this study provides initial insights into the in vivo gene regulatory network of P. falciparum intraerythrocytic stages and should serve as a valuable resource for future studies.
Affiliation(s)
- Christa Geeke Toenhake
  - Radboud University, Faculty of Science, Department of Molecular Biology, Nijmegen, 6525 GA, the Netherlands
- David Robert Westhead
  - School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, UK
- Simon Jan van Heeringen
  - Radboud University, Faculty of Science, Department of Molecular Developmental Biology, Nijmegen, 6525 GA, the Netherlands
- Richárd Bártfai
  - Radboud University, Faculty of Science, Department of Molecular Biology, Nijmegen, 6525 GA, the Netherlands
|
95
|
Xin B, Rohs R. Relationship between histone modifications and transcription factor binding is protein family specific. Genome Res 2018; 28:321-333. [PMID: 29326300 PMCID: PMC5848611 DOI: 10.1101/gr.220079.116] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 12/23/2016] [Accepted: 01/10/2018] [Indexed: 12/20/2022]
Abstract
The very small fraction of putative binding sites (BSs) that are occupied by transcription factors (TFs) in vivo can be highly variable across different cell types. This observation has been partly attributed to changes in chromatin accessibility and histone modification (HM) patterns surrounding BSs. Previous studies focusing on BSs within DNA regulatory regions found correlations between HM patterns and TF binding specificities. However, a mechanistic understanding of TF-DNA binding specificity determinants is still not available. The ability to predict in vivo TF binding on a genome-wide scale requires the identification of features that determine TF binding based on evolutionary relationships of DNA binding proteins. To reveal protein family-dependent mechanisms of TF binding, we conducted comprehensive comparisons of HM patterns surrounding BSs and non-BSs with exactly matched core motifs for TFs in three cell lines: 33 TFs in GM12878, 37 TFs in K562, and 18 TFs in H1-hESC. These TFs displayed protein family-specific preferences for HM patterns surrounding BSs, with high agreement among cell lines. Moreover, compared to models based on DNA sequence and shape at flanking regions of BSs, HM-augmented quantitative machine-learning methods resulted in increased performance in a TF family-specific manner. Analysis of the relative importance of features in these models indicated that TFs, displaying larger HM pattern differences between BSs and non-BSs, bound DNA in an HM-specific manner on a protein family-specific basis. We propose that TF family-specific HM preferences reveal distinct mechanisms that assist in guiding TFs to their cognate BSs by altering chromatin structure and accessibility.
Affiliation(s)
- Beibei Xin
  - Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, California 90089, USA
- Remo Rohs
  - Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, California 90089, USA
|
96
|
Comoglio F, Park HJ, Schoenfelder S, Barozzi I, Bode D, Fraser P, Green AR. Thrombopoietin signaling to chromatin elicits rapid and pervasive epigenome remodeling within poised chromatin architectures. Genome Res 2018; 28:295-309. [PMID: 29429976 PMCID: PMC5848609 DOI: 10.1101/gr.227272.117] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 07/06/2017] [Accepted: 01/26/2018] [Indexed: 12/13/2022]
Abstract
Thrombopoietin (TPO) is a critical cytokine regulating hematopoietic stem cell maintenance and differentiation into the megakaryocytic lineage. However, the transcriptional and chromatin dynamics elicited by TPO signaling are poorly understood. Here, we study the immediate early transcriptional and cis-regulatory responses to TPO in hematopoietic stem/progenitor cells (HSPCs) and use this paradigm of cytokine signaling to chromatin to dissect the relationship between cis-regulatory activity and chromatin architecture. We show that TPO profoundly alters the transcriptome of HSPCs, with key hematopoietic regulators being transcriptionally repressed within 30 min of TPO. By examining cis-regulatory dynamics and chromatin architectures, we demonstrate that these changes are accompanied by rapid and extensive epigenome remodeling of cis-regulatory landscapes that is spatially coordinated within topologically associating domains (TADs). Moreover, TPO-responsive enhancers are spatially clustered and engage in preferential homotypic intra- and inter-TAD interactions that are largely refractory to TPO signaling. By further examining the link between cis-regulatory dynamics and chromatin looping, we show that rapid modulation of cis-regulatory activity is largely independent of chromatin looping dynamics. Finally, we show that, although activated and repressed cis-regulatory elements share remarkably similar DNA sequence compositions, transcription factor binding patterns accurately predict rapid cis-regulatory responses to TPO.
Affiliation(s)
- Federico Comoglio
  - Cambridge Institute for Medical Research, Medical Research Council/Wellcome Trust Stem Cell Institute, and Department of Haematology, University of Cambridge, Cambridge CB2 0XY, United Kingdom
- Hyun Jung Park
  - Cambridge Institute for Medical Research, Medical Research Council/Wellcome Trust Stem Cell Institute, and Department of Haematology, University of Cambridge, Cambridge CB2 0XY, United Kingdom
- Stefan Schoenfelder
  - Nuclear Dynamics Programme, The Babraham Institute, Cambridge CB22 3AT, United Kingdom
- Iros Barozzi
  - Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Daniel Bode
  - Cambridge Institute for Medical Research, Medical Research Council/Wellcome Trust Stem Cell Institute, and Department of Haematology, University of Cambridge, Cambridge CB2 0XY, United Kingdom
- Peter Fraser
  - Nuclear Dynamics Programme, The Babraham Institute, Cambridge CB22 3AT, United Kingdom
  - Department of Biological Science, Florida State University, Tallahassee, Florida 32301, USA
- Anthony R Green
  - Cambridge Institute for Medical Research, Medical Research Council/Wellcome Trust Stem Cell Institute, and Department of Haematology, University of Cambridge, Cambridge CB2 0XY, United Kingdom
  - Department of Haematology, Addenbrooke's Hospital, Cambridge CB2 0XY, United Kingdom
|
97
|
Harakalova M, Asselbergs FW. Systems analysis of dilated cardiomyopathy in the next generation sequencing era. Wiley Interdiscip Rev Syst Biol Med 2018; 10:e1419. [PMID: 29485202 DOI: 10.1002/wsbm.1419] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/12/2017] [Revised: 12/31/2017] [Accepted: 01/17/2018] [Indexed: 12/17/2022]
Abstract
Dilated cardiomyopathy (DCM) is a form of severe failure of cardiac muscle caused by a long list of etiologies ranging from myocardial infarction and DNA mutations in cardiac genes to toxins. Systems analysis integrating next-generation sequencing (NGS)-based omics approaches, such as the sequencing of DNA, RNA, and chromatin, provides valuable insights into DCM mechanisms. The outcome and interpretation of NGS methods can be affected by the localization of the cardiac biopsy, the level of tissue degradation, and variable ratios of different cell populations, especially in the presence of fibrosis. Heart tissue composition may even differ between sexes, or between siblings carrying the same disease-causing mutation. Therefore, before planning any experiments, it is important to fully appreciate the complexities of DCM, and the selection of samples suitable for a given research question should be an interdisciplinary effort involving clinicians and biologists. The list of NGS omics datasets in DCM to date is short. More studies have to be performed to contribute to public data repositories and facilitate systems analysis. In addition, proper data integration is a difficult task requiring complex computational approaches. Despite these complications, there are multiple promising implications of systems analysis in DCM. By combining various types of datasets, for example, RNA-seq, ChIP-seq, or 4C, deep insights into cardiac biology, and possible biomarkers and treatment targets, can be gained. Systems analysis can also facilitate the annotation of noncoding mutations in cardiac-specific DNA regulatory regions that play a substantial role in maintaining the tissue- and cell-specific transcriptional programs in the heart. This article is categorized under: Physiology > Mammalian Physiology in Health and Disease; Laboratory Methods and Technologies > Genetic/Genomic Methods; Laboratory Methods and Technologies > RNA Methods.
Affiliation(s)
- Magdalena Harakalova
  - Department of Cardiology, Division Heart and Lungs, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Folkert W Asselbergs
  - Department of Cardiology, Division Heart and Lungs, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
  - Durrer Center for Cardiovascular Research, Netherlands Heart Institute, Utrecht, Netherlands
  - Institute of Cardiovascular Science, University College London, London, UK
|
98
|
Li J, Sagendorf JM, Chiu TP, Pasi M, Perez A, Rohs R. Expanding the repertoire of DNA shape features for genome-scale studies of transcription factor binding. Nucleic Acids Res 2018; 45:12877-12887. [PMID: 29165643 PMCID: PMC5728407 DOI: 10.1093/nar/gkx1145] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 03/21/2017] [Accepted: 10/30/2017] [Indexed: 12/18/2022] Open
Abstract
Uncovering the mechanisms that affect the binding specificity of transcription factors (TFs) is critical for understanding the principles of gene regulation. Although sequence-based models have been used successfully to predict TF binding specificities, we found that including DNA shape information in these models improved their accuracy and interpretability. Previously, we developed a method for modeling DNA binding specificities based on DNA shape features extracted from Monte Carlo (MC) simulations. Prediction accuracies of our models, however, have not yet been compared to accuracies of models incorporating DNA shape information extracted from X-ray crystallography (XRC) data or Molecular Dynamics (MD) simulations. Here, we integrated DNA shape information extracted from MC or MD simulations and XRC data into predictive models of TF binding and compared their performance. Models that incorporated structural information consistently showed improved performance over sequence-based models regardless of data source. Furthermore, we derived and validated nine additional DNA shape features beyond our original set of four features. The expanded repertoire of 13 distinct DNA shape features, including six intra-base pair and six inter-base pair parameters and minor groove width, is available in our R/Bioconductor package DNAshapeR and enables a comprehensive structural description of the double helix on a genome-wide scale.
Affiliation(s)
- Jinsen Li
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Jared M Sagendorf
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Tsu-Pei Chiu
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Marco Pasi
- Centre for Biomolecular Sciences and School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK
- Alberto Perez
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Remo Rohs
- Computational Biology and Bioinformatics Program, Departments of Biological Sciences, Chemistry, Physics & Astronomy, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
|
99
|
Botta S, de Prisco N, Marrocco E, Renda M, Sofia M, Curion F, Bacci ML, Ventrella D, Wilson C, Gesualdo C, Rossi S, Simonelli F, Surace EM. Targeting and silencing of rhodopsin by ectopic expression of the transcription factor KLF15. JCI Insight 2017; 2:96560. [PMID: 29263295 DOI: 10.1172/jci.insight.96560] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 07/27/2017] [Accepted: 11/15/2017] [Indexed: 12/20/2022] Open
Abstract
The genome-wide activity of transcription factors (TFs) on multiple regulatory elements precludes their use as gene-specific regulators. Here we show that ectopic expression of a TF in a cell-specific context can be used to silence the expression of a specific gene as a therapeutic approach to regulate gene expression in human disease. We selected the TF Krüppel-like factor 15 (KLF15) based on its putative ability to recognize a specific DNA sequence motif present in the rhodopsin (RHO) promoter and its lack of expression in terminally differentiated rod photoreceptors (the RHO-expressing cells). Adeno-associated virus (AAV) vector-mediated ectopic expression of KLF15 in rod photoreceptors of pigs enables Rho silencing with limited genome-wide transcriptional perturbations. Suppression of a RHO mutant allele by KLF15 corrects the phenotype of a mouse model of retinitis pigmentosa with no observed toxicity. Cell-specific-context conditioning of TF activity may prove a novel mode for somatic gene-targeted manipulation.
Affiliation(s)
- Elena Marrocco
- Telethon Institute of Genetics and Medicine, Napoli, Italy
- Mario Renda
- Telethon Institute of Genetics and Medicine, Napoli, Italy
- Martina Sofia
- Telethon Institute of Genetics and Medicine, Napoli, Italy
- Fabiola Curion
- Telethon Institute of Genetics and Medicine, Napoli, Italy
- Maria Laura Bacci
- Department of Veterinary Medical Sciences, University of Bologna, Bologna, Italy
- Domenico Ventrella
- Department of Veterinary Medical Sciences, University of Bologna, Bologna, Italy
- Cathal Wilson
- Telethon Institute of Genetics and Medicine, Napoli, Italy
- Carlo Gesualdo
- Multidisciplinary Department of Medical, Surgical and Dental Sciences, Eye Clinic, Second University of Naples, Naples, Italy
- Settimio Rossi
- Multidisciplinary Department of Medical, Surgical and Dental Sciences, Eye Clinic, Second University of Naples, Naples, Italy
- Francesca Simonelli
- Multidisciplinary Department of Medical, Surgical and Dental Sciences, Eye Clinic, Second University of Naples, Naples, Italy
- Enrico Maria Surace
- Telethon Institute of Genetics and Medicine, Napoli, Italy; Department of Translational Medicine, University of Naples Federico II, Naples, Italy
|
100
|
Crocker J, Ilsley GR. Using synthetic biology to study gene regulatory evolution. Curr Opin Genet Dev 2017; 47:91-101. [DOI: 10.1016/j.gde.2017.09.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/05/2017] [Revised: 09/06/2017] [Accepted: 09/11/2017] [Indexed: 12/21/2022]
|