1. Nithun RV, Yao YM, Harel O, Habiballah S, Afek A, Jbara M. Site-Specific Acetylation of the Transcription Factor Protein Max Modulates Its DNA Binding Activity. ACS Central Science 2024; 10:1295-1303. [PMID: 38947213 PMCID: PMC11212134 DOI: 10.1021/acscentsci.4c00686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Revised: 06/03/2024] [Accepted: 06/04/2024] [Indexed: 07/02/2024]
Abstract
Chemical protein synthesis provides a powerful means to prepare novel modified proteins with precision down to the atomic level, enabling an unprecedented opportunity to understand fundamental biological processes. Of particular interest is the process of gene expression, orchestrated through the interactions between transcription factors (TFs) and DNA. Here, we combined chemical protein synthesis and high-throughput screening technology to decipher the role of post-translational modifications (PTMs), e.g., Lys-acetylation on the DNA binding activity of Max TF. We synthesized a focused library of singly, doubly, and triply modified Max variants including site-specifically acetylated and fluorescently tagged analogs. The resulting synthetic analogs were employed to decipher the molecular role of Lys-acetylation on the DNA binding activity and sequence specificity of Max. We provide evidence that the acetylation sites at Lys-31 and Lys-57 significantly inhibit the DNA binding activity of Max. Furthermore, by utilizing high-throughput binding measurements, we assessed the binding activities of the modified Max variants across diverse DNA sequences. Our results indicate that acetylation marks can alter the binding specificities of Max toward certain sequences flanking its consensus binding sites. Our work provides insight into the hidden molecular code of PTM-TFs and DNA interactions, paving the way to interpret gene expression regulation programs.
Affiliation(s)
- Raj V. Nithun
- School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, 69978 Israel
- Yumi Minyi Yao
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Omer Harel
- School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, 69978 Israel
- Shaimaa Habiballah
- School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, 69978 Israel
- Ariel Afek
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Muhammad Jbara
- School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, 69978 Israel
2. Khetan S, Bulyk ML. Overlapping binding sites underlie TF genomic occupancy. bioRxiv: The Preprint Server for Biology 2024:2024.03.05.583629. [PMID: 38496549 PMCID: PMC10942454 DOI: 10.1101/2024.03.05.583629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Sequence-specific DNA binding by transcription factors (TFs) is a crucial step in gene regulation. However, current high-throughput in vitro approaches cannot reliably detect lower affinity TF-DNA interactions, which play key roles in gene regulation. Here, we developed PADIT-seq (protein affinity to DNA by in vitro transcription and RNA sequencing) to assay TF binding preferences to all 10-bp DNA sequences at far greater sensitivity than prior approaches. The expanded catalogs of low affinity DNA binding sites for the human TFs HOXD13 and EGR1 revealed that nucleotides flanking high affinity DNA binding sites create overlapping lower affinity sites that together modulate TF genomic occupancy in vivo. Formation of such extended recognition sequences stems from an inherent property of TF binding sites to interweave each other and expands the genomic sequence space for identifying noncoding variants that directly alter TF binding. One-Sentence Summary: Overlapping DNA binding sites underlie TF genomic occupancy through their inherent propensity to interweave each other.
3. Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS Omega 2024; 9:9921-9945. [PMID: 38463314 PMCID: PMC10918679 DOI: 10.1021/acsomega.3c05913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/19/2024] [Accepted: 01/30/2024] [Indexed: 03/12/2024]
Abstract
Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being explored with increasing frequency. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL are synergistic with synthetic biology: synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or advising which experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements such as the design of novel biological components, optimal experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of artificial neural networks (ANNs). I have divided this review into three sections. In the first section, I describe the predictive potential and basics of ML along with its many applications in synthetic biology, especially in engineering cells, protein activity, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe the challenges hindering progress in ML/DL and synthetic biology, along with their solutions.
Affiliation(s)
- Manoj Kumar Goshisht
- Department of Chemistry, Natural and Applied Sciences, University of Wisconsin-Green Bay, Green Bay, Wisconsin 54311-7001, United States
4. He H, Yang M, Li S, Zhang G, Ding Z, Zhang L, Shi G, Li Y. Mechanisms and biotechnological applications of transcription factors. Synth Syst Biotechnol 2023; 8:565-577. [PMID: 37691767 PMCID: PMC10482752 DOI: 10.1016/j.synbio.2023.08.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/17/2023] [Revised: 08/15/2023] [Accepted: 08/27/2023] [Indexed: 09/12/2023] Open
Abstract
Transcription factors play an indispensable role in maintaining cellular viability and finely regulating complex internal metabolic networks. These crucial bioactive functions rely on their ability to respond to effectors and concurrently interact with binding sites. Recent advancements have brought innovative insights into the understanding of transcription factors. In this review, we comprehensively summarize the mechanisms by which transcription factors carry out their functions, along with the computational and experimental methods employed in their identification. Additionally, we highlight recent achievements in the application of transcription factors in various biotechnological fields, including cell engineering, human health, and biomanufacturing. Finally, we discuss the current limitations of research and provide prospects for future investigations. This review will provide theoretical guidance for transcription factor engineering.
Affiliation(s)
- Hehe He
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu Province 214122, PR China
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Jiangsu Provisional Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Mingfei Yang
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu Province 214122, PR China
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Jiangsu Provisional Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Siyu Li
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu Province 214122, PR China
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Jiangsu Provisional Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Gaoyang Zhang
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu Province 214122, PR China
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Jiangsu Provisional Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Zhongyang Ding
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu Province 214122, PR China
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Jiangsu Provisional Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Liang Zhang
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu Province 214122, PR China
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Jiangsu Provisional Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Guiyang Shi
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu Province 214122, PR China
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Jiangsu Provisional Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Youran Li
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu Province 214122, PR China
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
- Jiangsu Provisional Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province 214122, PR China
5. Nithun RV, Yao YM, Lin X, Habiballah S, Afek A, Jbara M. Deciphering the Role of the Ser-Phosphorylation Pattern on the DNA-Binding Activity of Max Transcription Factor Using Chemical Protein Synthesis. Angew Chem Int Ed Engl 2023; 62:e202310913. [PMID: 37642402 DOI: 10.1002/anie.202310913] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/29/2023] [Revised: 08/25/2023] [Accepted: 08/29/2023] [Indexed: 08/31/2023]
Abstract
The chemical synthesis of site-specifically modified transcription factors (TFs) is a powerful method to investigate how post-translational modifications (PTMs) influence TF-DNA interactions and impact gene expression. Among these TFs, Max plays a pivotal role in controlling the expression of 15 % of the genome. The activity of Max is regulated by PTMs; Ser-phosphorylation at the N-terminus is considered one of the key regulatory mechanisms. In this study, we developed a practical synthetic strategy to prepare homogeneous full-length Max for the first time, to explore the impact of Max phosphorylation. We prepared a focused library of eight Max variants, with distinct modification patterns, including mono-phosphorylated, and doubly phosphorylated analogues at Ser2/Ser11 as well as fluorescently labeled variants through native chemical ligation. Through comprehensive DNA binding analyses, we discovered that the phosphorylation position plays a crucial role in the DNA-binding activity of Max. Furthermore, in vitro high-throughput analysis using DNA microarrays revealed that the N-terminus phosphorylation pattern does not interfere with the DNA sequence specificity of Max. Our work provides insights into the regulatory role of Max's phosphorylation on the DNA interactions and sequence specificity, shedding light on how PTMs influence TF function.
Affiliation(s)
- Raj V Nithun
- School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, 69978, Israel
- Yumi Minyi Yao
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Xiaoxi Lin
- School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, 69978, Israel
- Shaimaa Habiballah
- School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, 69978, Israel
- Ariel Afek
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Muhammad Jbara
- School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, 69978, Israel
6. Chen H, Xu Y, Jin J, Su XD. KaScape: a sequencing-based method for global characterization of protein‒DNA binding affinity. Sci Rep 2023; 13:16595. [PMID: 37789131 PMCID: PMC10547764 DOI: 10.1038/s41598-023-43426-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 09/23/2023] [Indexed: 10/05/2023] Open
Abstract
It is difficult to exhaustively screen all possible DNA binding sequences for a given transcription factor (TF). Here, we developed the KaScape method, in which TFs bind to all possible DNA sequences in the same DNA pool where DNA sequences are prepared by randomized oligo synthesis and the random length can be adjusted to a length such as 4, 5, 6, or 7. After separating bound from unbound double-stranded DNAs (dsDNAs), their sequences are determined by next-generation sequencing. To demonstrate the relative binding affinities of all possible DNA sequences determined by KaScape, we developed three-dimensional KaScape viewing software based on a K-mer graph. We applied KaScape to 12 plant TF family AtWRKY proteins and found that all AtWRKY proteins bound to the core sequence GAC with similar profiles. KaScape can detect not only binding sequences consistent with the consensus W-box "TTGAC(C/T)" but also other sequences with weak affinity. KaScape provides a high-throughput, easy-to-operate, sensitive, and exhaustive method for quantitatively characterizing the relative binding strength of a TF with all possible binding sequences, allowing us to comprehensively characterize the specificity and affinity landscape of transcription factors, particularly for moderate- and low-affinity binding sites.
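As a rough illustration of the exhaustive k-mer idea described in this abstract, the sketch below enumerates every DNA sequence of a given random length and scores each one by its frequency among bound reads relative to an input pool. It is not the KaScape pipeline: the function names, the pseudocount, and the bound-over-input ratio are assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import product


def all_kmers(k):
    """Return every DNA sequence of length k (4**k sequences in total)."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def relative_enrichment(bound_reads, input_reads, k, pseudocount=1.0):
    """Score each k-mer by its bound frequency divided by its input frequency.

    Hypothetical scoring scheme: reads whose length is not k are ignored,
    and a pseudocount keeps unobserved k-mers from producing a zero division.
    """
    bound = Counter(bound_reads)
    pool = Counter(input_reads)
    kmers = all_kmers(k)
    bound_total = sum(bound[s] + pseudocount for s in kmers)
    pool_total = sum(pool[s] + pseudocount for s in kmers)
    return {
        s: ((bound[s] + pseudocount) / bound_total)
        / ((pool[s] + pseudocount) / pool_total)
        for s in kmers
    }
```

With a random length of 4 this yields 256 scores, one per possible sequence; a score above 1 means the sequence is over-represented among bound reads under these assumptions.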
Affiliation(s)
- Hong Chen
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, 100871, China
- Yongping Xu
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, 100871, China
- Jianshi Jin
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Chaoyang District, Beijing, 100101, People's Republic of China
- Xiao-Dong Su
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, 100871, China.
7. Recio PS, Mitra NJ, Shively CA, Song D, Jaramillo G, Lewis KS, Chen X, Mitra R. Zinc cluster transcription factors frequently activate target genes using a non-canonical half-site binding mode. Nucleic Acids Res 2023; 51:5006-5021. [PMID: 37125648 PMCID: PMC10250231 DOI: 10.1093/nar/gkad320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 04/11/2023] [Accepted: 04/14/2023] [Indexed: 05/02/2023] Open
Abstract
Gene expression changes are orchestrated by transcription factors (TFs), which bind to DNA to regulate gene expression. It remains surprisingly difficult to predict basic features of the transcriptional process, including in vivo TF occupancy. Existing thermodynamic models of TF function are often not concordant with experimental measurements, suggesting undiscovered biology. Here, we analyzed one of the most well-studied TFs, the yeast zinc cluster Gal4, constructed a Shea-Ackers thermodynamic model to describe its binding, and compared the results of this model to experimentally measured Gal4p binding in vivo. We found that at many promoters, the model predicted no Gal4p binding, yet substantial binding was observed. These outlier promoters lacked canonical binding motifs, and subsequent investigation revealed Gal4p binds unexpectedly to DNA sequences with high densities of its half site (CGG). We confirmed this novel mode of binding through multiple experimental and computational paradigms; we also found most other zinc cluster TFs we tested frequently utilize this binding mode, at 27% of their targets on average. Together, these results demonstrate a novel mode of binding where zinc clusters, the largest class of TFs in yeast, bind DNA sequences with high densities of half sites.
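The notion of a "high density of half sites" in this abstract can be made concrete with a small sketch that counts CGG half sites on either strand and reports the largest count found in any window of a sequence. The window size and the function names are hypothetical, and this is not the scoring used in the paper, only one simple way such a density could be measured.

```python
# Translation table for complementing A/C/G/T bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_half_sites(seq, half_site="CGG"):
    """Count half-site occurrences on either strand (overlaps allowed)."""
    seq = seq.upper()
    targets = {half_site, reverse_complement(half_site)}
    n = len(half_site)
    return sum(seq[i:i + n] in targets for i in range(len(seq) - n + 1))


def max_half_site_count(seq, window=50, half_site="CGG"):
    """Largest half-site count in any window of the given length.

    The default window of 50 bp is an arbitrary illustrative choice.
    """
    if len(seq) <= window:
        return count_half_sites(seq, half_site)
    return max(
        count_half_sites(seq[i:i + window], half_site)
        for i in range(len(seq) - window + 1)
    )
```

Matching the reverse complement (CCG) as well as CGG means a site is counted regardless of which strand carries it.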
Affiliation(s)
- Pamela S Recio
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Nikhil J Mitra
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Christian A Shively
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- David Song
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Grace Jaramillo
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Kristine Shady Lewis
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Xuhua Chen
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Robi D Mitra
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- McDonnell Genome Institute, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
8. Luan Y, Tang Z, He Y, Xie Z. Intra-Domain Residue Coevolution in Transcription Factors Contributes to DNA Binding Specificity. Microbiol Spectr 2023; 11:e0365122. [PMID: 36943132 PMCID: PMC10100741 DOI: 10.1128/spectrum.03651-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/09/2022] [Accepted: 02/22/2023] [Indexed: 03/23/2023] Open
Abstract
Understanding the basis of the DNA-binding specificity of transcription factors (TFs) has been of long-standing interest. Despite extensive efforts to map millions of putative TF binding sequences, identifying the critical determinants for DNA binding specificity remains a major challenge. The coevolution of residues in proteins occurs due to a shared evolutionary history. However, it is unclear how coevolving residues in TFs contribute to DNA binding specificity. Here, we systematically collected publicly available data sets from multiple large-scale high-throughput TF-DNA interaction screening experiments for the major TF families with large numbers of TF members. These families included the Homeobox, HLH, bZIP_1, Ets, HMG_box, ZF-C4, and Zn_clus TFs. We detected TF subclass-determining sites (TSDSs) and showed that the TSDSs were more likely to coevolve with other TSDSs than with non-TSDSs, particularly for the Homeobox, HLH, Ets, bZIP_1, and HMG_box TF families. By in silico modeling, we showed that mutation of the highly coevolving residues could significantly reduce the stability of the TF-DNA complex. The distant residues from the DNA interface also contributed to TF-DNA binding activity. Overall, our study gave evidence that coevolved residues relate to transcriptional regulation and provided insights into the potential application of engineered DNA-binding domains and proteins. IMPORTANCE While unraveling DNA-binding specificity of TFs is the key to understanding the basis and molecular mechanism of gene expression regulation, identifying the critical determinants that contribute to DNA binding specificity remains a major challenge. In this study, we provided evidence showing that coevolving residues in TF domains contributed to DNA binding specificity. We demonstrated that the TSDSs were more likely to coevolve with other TSDSs than with non-TSDSs. 
Mutation of the coevolving residue pairs (CRPs) could significantly reduce the stability of the TF-DNA complex, and even residues distant from the DNA interface contribute to TF-DNA binding activity. Collectively, our study expands our knowledge of the interactions among coevolved residues in TFs, their tertiary contacts, and their functional importance in refined transcriptional regulation. Understanding the impact of coevolving residues in TFs will help clarify the details of transcriptional regulation and advance the application of engineered DNA-binding domains and proteins.
Affiliation(s)
- Yizhao Luan
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zehua Tang
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yao He
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zhi Xie
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
9. King SB, Singh M. Primate protein-ligand interfaces exhibit significant conservation and unveil human-specific evolutionary drivers. PLoS Comput Biol 2023; 19:e1010966. [PMID: 36952575 PMCID: PMC10035887 DOI: 10.1371/journal.pcbi.1010966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 02/22/2023] [Indexed: 03/25/2023] Open
Abstract
Despite the vast phenotypic differences observed across primates, their protein products are largely similar to each other at the sequence level. We hypothesized that, since proteins accomplish all their functions via interactions with other molecules, alterations in the sites that participate in these interactions may be of critical importance. To uncover the extent to which these sites evolve across primates, we built a structurally-derived dataset of ~4,200 one-to-one orthologous sequence groups across 18 primate species, consisting of ~68,000 ligand-binding sites that interact with DNA, RNA, small molecules, ions, or peptides. Using this dataset, we identify functionally important patterns of conservation and variation within the amino acid residues that facilitate protein-ligand interactions across the primate phylogeny. We uncover that interaction sites are significantly more conserved than other sites, and that sites binding DNA and RNA further exhibit the lowest levels of variation. We also show that the subset of ligand-binding sites that do vary are enriched in components of gene regulatory pathways and uncover several instances of human-specific ligand-binding site changes within transcription factors. Altogether, our results suggest that ligand-binding sites have experienced selective pressure in primates and propose that variation in these sites may have an outsized effect on phenotypic variation in primates through pleiotropic effects on gene regulation.
Affiliation(s)
- Sean B. King
- Department of Molecular Biology, Princeton University, Princeton, New Jersey, United States of America
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Mona Singh
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America
10. Myres GJ, Harris JM. Stable Immobilization of DNA to Silica Surfaces by Sequential Michael Addition Reactions Developed with Insights from Confocal Raman Microscopy. Anal Chem 2023; 95:3499-3506. [PMID: 36718639 DOI: 10.1021/acs.analchem.2c05594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
The immobilization of DNA to surfaces is required for numerous biosensing applications related to the capture of target DNA sequences, proteins, or small-molecule analytes from solution. For these applications to be successful, the chemistry of DNA immobilization should be efficient, reproducible, and stable and should allow the immobilized DNA to adopt a secondary structure required for association with its respective target molecule. To develop and characterize surface immobilization chemistry to meet this challenge, it is invaluable to have a quantitative, surface-sensitive method that can report the interfacial chemistry at each step, while also being capable of determining the structure, stability, and activity of the tethered DNA product. In this work, we develop a method to immobilize DNA to silica, glass, or other oxide surfaces by carrying out the reactions in porous silica particles. Due to the high specific surface area of porous silica, the local concentrations of surface-immobilized molecules within the particle are sufficiently high that interfacial chemistry can be monitored at each step of the process with confocal Raman microscopy, providing a unique capability to assess the molecular composition, structure, yield, and surface coverage of these reactions. We employ this methodology to investigate the steps for immobilizing thiolated-DNA to thiol-modified silica surfaces through sequential Michael addition reactions with the cross-linker 1,4-phenylene-bismaleimide. A key advantage of employing a phenyl-bismaleimide over a comparable alkyl coupling reagent is the efficient conversion of the initial phenyl-thiosuccinimide to a more stable succinamic acid thioether linkage. This transformation was confirmed by in situ Raman spectroscopy measurements, and the resulting succinamic acid thioether product exhibited greater than 95% retention of surface-immobilized DNA after 12 days at room temperature in aqueous buffer. 
Confocal Raman microscopy was also used to assess the conformational freedom of surface-immobilized DNA by comparing the structure of a 23-mer DNA hairpin sequence under duplex-forming and unfolding conditions. We find that the immobilized DNA hairpin can undergo reversible intramolecular duplex formation based on the changes in frequencies and intensities of the phosphate backbone and base-specific vibrational modes that are informative of the hybridization state of DNA.
Affiliation(s)
- Grant J Myres
- Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112-0850 United States
- Joel M Harris
- Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112-0850 United States
11. Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. Cell Reports Methods 2023; 3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 05/22/2023]
Abstract
Gene regulation is a central topic in cell biology. Advances in omics technologies and the accumulation of omics data have provided better opportunities for gene regulation studies than ever before. For this reason deep learning, as a data-driven predictive modeling approach, has been successfully applied to this field during the past decade. In this article, we aim to give a brief yet comprehensive overview of representative deep-learning methods for gene regulation. Specifically, we discuss and compare the design principles and datasets used by each method, creating a reference for researchers who wish to replicate or improve existing methods. We also discuss the common problems of existing approaches and prospectively introduce the emerging deep-learning paradigms that will potentially alleviate them. We hope that this article will provide a rich and up-to-date resource and shed light on future research directions in this area.
Affiliation(s)
- Zhongxiao Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Elva Gao
- The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Wenkai Han
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Xiaopeng Xu
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
12
Chow CN, Yang CW, Chang WC. Databases and prospects of dynamic gene regulation in eukaryotes: A mini review. Comput Struct Biotechnol J 2023; 21:2147-2159. [PMID: 37013004 PMCID: PMC10066511 DOI: 10.1016/j.csbj.2023.03.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 03/18/2023] [Accepted: 03/19/2023] [Indexed: 04/05/2023]
Abstract
In eukaryotes, dynamic regulation enables RNA polymerases to produce a variety of RNA products in spatial and temporal patterns. Dynamic gene expression is regulated by transcription factors (TFs) and epigenetics (DNA methylation and histone modification). The application of biochemical technology and high-throughput sequencing enhances the understanding of the mechanisms of these regulations and the affected genomic regions. To provide a searchable platform for retrieving such metadata, numerous databases have been developed based on the integration of genome-wide maps (e.g., ChIP-seq, whole-genome bisulfite sequencing, RNA-seq, ATAC-seq, DNase-seq, and MNase-seq data) and functional genomic annotation. In this mini review, we summarize the main functions of TF-related databases and outline the prevalent approaches used in inferring epigenetic regulations, their associated genes, and functions. We review the literature on crosstalk between TF and epigenetic regulation and the properties of non-coding RNA regulation, which are challenging topics that promise to pave the way for advances in database development.
13
Steinhaus R, Robinson PN, Seelow D. FABIAN-variant: predicting the effects of DNA variants on transcription factor binding. Nucleic Acids Res 2022; 50:W322-W329. [PMID: 35639768 PMCID: PMC9252790 DOI: 10.1093/nar/gkac393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 04/22/2022] [Accepted: 05/06/2022] [Indexed: 12/03/2022]
Abstract
While great advances in predicting the effects of coding variants have been made, the assessment of non-coding variants remains challenging. This is especially problematic for variants within promoter regions, which can lead to over-expression of a gene or reduce or even abolish its expression. The binding of transcription factors to the DNA can be predicted using position weight matrices (PWMs). More recently, transcription factor flexible models (TFFMs) have been introduced and shown to be more accurate than PWMs. TFFMs are based on hidden Markov models and can account for complex positional dependencies. Our new web-based application FABIAN-variant uses 1224 TFFMs and 3790 PWMs to predict whether and to what degree DNA variants affect the binding of 1387 different human transcription factors. For each variant and transcription factor, the software combines the results of different models for a final prediction of the resulting binding-affinity change. The software is written in C++ for speed, but variants can be entered through a web interface. Alternatively, a VCF file can be uploaded to assess variants identified by high-throughput sequencing. The search can be restricted to variants in the vicinity of candidate genes. FABIAN-variant is available freely at https://www.genecascade.org/fabian/.
Affiliation(s)
- Robin Steinhaus
- Exploratory Diagnostic Sciences, Berlin Institute of Health, 10117 Berlin, Germany
- Institute of Medical Genetics and Human Genetics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, 13353 Berlin, Germany
- Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06030, USA
- Institute for Systems Genomics, University of Connecticut, Farmington, CT 06030, USA
- Dominik Seelow
- Exploratory Diagnostic Sciences, Berlin Institute of Health, 10117 Berlin, Germany
- Institute of Medical Genetics and Human Genetics, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, 13353 Berlin, Germany
14
Schmitz RJ, Grotewold E, Stam M. Cis-regulatory sequences in plants: Their importance, discovery, and future challenges. THE PLANT CELL 2022; 34:718-741. [PMID: 34918159 PMCID: PMC8824567 DOI: 10.1093/plcell/koab281] [Citation(s) in RCA: 110] [Impact Index Per Article: 55.0] [Received: 06/04/2021] [Accepted: 10/20/2021] [Indexed: 05/19/2023]
Abstract
The identification and characterization of cis-regulatory DNA sequences and how they function to coordinate responses to developmental and environmental cues is of paramount importance to plant biology. Key to these regulatory processes are cis-regulatory modules (CRMs), which include enhancers and silencers. Despite the extraordinary advances in high-quality sequence assemblies and genome annotations, the identification and understanding of CRMs, and how they regulate gene expression, lag significantly behind. This is especially true for their distinguishing characteristics and activity states. Here, we review the current knowledge on CRMs and breakthrough technologies enabling identification, characterization, and validation of CRMs; we compare the genomic distributions of CRMs with respect to their target genes between different plant species, and discuss the role of transposable elements harboring CRMs in the evolution of gene expression. This is an exciting time to study cis-regulomes in plants; however, significant existing challenges need to be overcome to fully understand and appreciate the role of CRMs in plant biology and in crop improvement.
Affiliation(s)
- Robert J Schmitz
- Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
- Erich Grotewold
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
15
Roth S, Ideses D, Juven-Gershon T, Danielli A. Rapid Biosensing Method for Detecting Protein-DNA Interactions. ACS Sens 2022; 7:60-70. [PMID: 34979074 DOI: 10.1021/acssensors.1c01579] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
Abstract
Identifying and investigating protein-DNA interactions, which play significant roles in many biological processes, is essential for basic and clinical research. Current techniques for identification of protein-DNA interactions are laborious, time-consuming, and suffer from nonspecific binding and limited sensitivity. To overcome these challenges and assess protein-DNA interactions, we use a magnetic modulation biosensing (MMB) system. In MMB, one of the interacting elements (protein or DNA) is immobilized to magnetic beads, and the other is coupled to a fluorescent molecule. Thus, the link between the magnetic bead and the fluorescent molecule is established only when binding occurs, enabling detection of the protein-DNA interaction. Using magnetic forces, the beads are concentrated and manipulated in a periodic motion in and out of a laser beam, producing a detectable oscillating signal. Using MMB, we detected protein-DNA interactions between short GC-rich DNA sequences and both a purified specificity protein 1 (Sp1) and an overexpressed Buttonhead (BTD) protein in a cell lysate. The specificity of the interactions was assessed using mutated DNA sequences and competition experiments. The assays were experimentally compared with the commonly used electrophoretic mobility shift assay, which takes approximately 4-72 h. In comparison, the MMB-based assay's turnaround time is ∼2 h, and it provides unambiguous results and quantitative measures of performance. The MMB system uses simple and cheap components, making it an attractive alternative to current costly and time-consuming techniques for analyzing protein-DNA interactions. Therefore, we anticipate that the MMB-based technique will significantly advance the detection of protein-DNA interactions in biomedical research.
Affiliation(s)
- Shira Roth
- Faculty of Engineering, The Institute of Nanotechnology and Advanced Materials, Bar-Ilan University, Max and Anna Webb Street, Ramat Gan 5290002, Israel
- Diana Ideses
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Max and Anna Webb Street, Ramat Gan 5290002, Israel
- Tamar Juven-Gershon
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Max and Anna Webb Street, Ramat Gan 5290002, Israel
- Amos Danielli
- Faculty of Engineering, The Institute of Nanotechnology and Advanced Materials, Bar-Ilan University, Max and Anna Webb Street, Ramat Gan 5290002, Israel
16
Lai WKM, Mariani L, Rothschild G, Smith ER, Venters BJ, Blanda TR, Kuntala PK, Bocklund K, Mairose J, Dweikat SN, Mistretta K, Rossi MJ, James D, Anderson JT, Phanor SK, Zhang W, Zhao Z, Shah AP, Novitzky K, McAnarney E, Keogh MC, Shilatifard A, Basu U, Bulyk ML, Pugh BF. A ChIP-exo screen of 887 Protein Capture Reagents Program transcription factor antibodies in human cells. Genome Res 2021; 31:1663-1679. [PMID: 34426512 PMCID: PMC8415381 DOI: 10.1101/gr.275472.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/03/2021] [Accepted: 07/07/2021] [Indexed: 12/22/2022]
Abstract
Antibodies offer a powerful means to interrogate specific proteins in a complex milieu. However, antibody availability and reliability can be problematic, whereas epitope tagging can be impractical in many cases. To address these limitations, the Protein Capture Reagents Program (PCRP) generated over a thousand renewable monoclonal antibodies (mAbs) against human presumptive chromatin proteins. However, these reagents have not been widely field-tested. We therefore performed a screen to test their ability to enrich genomic regions via chromatin immunoprecipitation (ChIP) and a variety of orthogonal assays. Eight hundred eighty-seven unique antibodies against 681 unique human transcription factors (TFs) were assayed by ultra-high-resolution ChIP-exo/seq, generating approximately 1200 ChIP-exo data sets, primarily in a single pass in one cell type (K562). Subsets of PCRP mAbs were further tested in ChIP-seq, CUT&RUN, STORM super-resolution microscopy, immunoblots, and protein binding microarray (PBM) experiments. About 5% of the tested antibodies displayed high-confidence target (i.e., cognate antigen) enrichment across at least one assay and are strong candidates for additional validation. An additional 34% produced ChIP-exo data that were distinct from background and thus warrant further testing. The remaining 61% were not substantially different from background, and likely require consideration of a much broader survey of cell types and/or assay optimizations. We show and discuss the metrics and challenges to antibody validation in chromatin-based assays.
Affiliation(s)
- William K M Lai
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
- Luca Mariani
- Division of Genetics, Department of Medicine; Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Gerson Rothschild
- Department of Microbiology and Immunology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York 10032, USA
- Edwin R Smith
- Simpson Querrey Institute for Epigenetics and the Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
- Thomas R Blanda
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Prashant K Kuntala
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Kylie Bocklund
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Joshua Mairose
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Sarah N Dweikat
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Katelyn Mistretta
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Matthew J Rossi
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Daniela James
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- James T Anderson
- Division of Genetics, Department of Medicine; Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Sabrina K Phanor
- Division of Genetics, Department of Medicine; Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Wanwei Zhang
- Department of Microbiology and Immunology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York 10032, USA
- Zibo Zhao
- Simpson Querrey Institute for Epigenetics and the Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
- Avani P Shah
- Simpson Querrey Institute for Epigenetics and the Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
- Ali Shilatifard
- Simpson Querrey Institute for Epigenetics and the Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
- Uttiya Basu
- Department of Microbiology and Immunology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York 10032, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine; Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- B Franklin Pugh
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA
17
Quan L, Mei J, He R, Sun X, Nie L, Li K, Lyu Q. Quantifying Intensities of Transcription Factor-DNA Binding by Learning From an Ensemble of Protein Binding Microarrays. IEEE J Biomed Health Inform 2021; 25:2811-2819. [PMID: 33571101 DOI: 10.1109/jbhi.2021.3058518] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
Abstract
The control of the coordinated expression of genes is primarily regulated by the interactions between transcription factors (TFs) and their DNA binding sites, which are an integral part of transcriptional regulatory networks. There are many computational tools focused on determining TF binding or unbinding to a DNA sequence. However, other tools focused on further determining the relative preference of such binding are needed. Here, we propose a regression model with deep learning, called SemanticBI, to predict intensities of TF-DNA binding. SemanticBI is a convolutional neural network (CNN)-recurrent neural network (RNN) architecture model that was trained on an ensemble of protein binding microarray data sets that covered multiple TFs. Using this approach, SemanticBI exhibited superior accuracy in predicting binding intensities compared to other popular methods. Moreover, SemanticBI uncovered vectorized sequence-oriented features using its CNN-RNN architecture, which is an abstract representation of the original DNA sequences. Additionally, the use of SemanticBI raises the question of whether motifs are necessary for computational models of TF binding. The online SemanticBI service can be accessed at http://qianglab.scst.suda.edu.cn/semantic/.
18
Xu J, Kudron MM, Victorsen A, Gao J, Ammouri HN, Navarro FCP, Gevirtzman L, Waterston RH, White KP, Reinke V, Gerstein M. To mock or not: a comprehensive comparison of mock IP and DNA input for ChIP-seq. Nucleic Acids Res 2021; 49:e17. [PMID: 33347581 PMCID: PMC7897498 DOI: 10.1093/nar/gkaa1155] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/01/2020] [Revised: 10/26/2020] [Accepted: 12/17/2020] [Indexed: 12/14/2022]
Abstract
Chromatin immunoprecipitation (IP) followed by sequencing (ChIP-seq) is the gold standard to detect transcription-factor (TF) binding sites in the genome. Its success depends on appropriate controls removing systematic biases. The predominantly used controls, i.e. DNA input, correct for uneven sonication, but not for nonspecific interactions of the IP antibody. Another type of control, 'mock' IP, corrects for both issues, but is not widely used because it is considered susceptible to technical noise. The tradeoff between the two control types has not been investigated systematically. Therefore, we generated comparable DNA input and mock IP experiments. Because mock IPs contain only nonspecific interactions, the sites predicted from them using DNA input indicate the spurious-site abundance. This abundance is highly correlated with the 'genomic activity' (e.g. chromatin openness). In particular, compared to cell lines, complex samples such as whole organisms have more spurious sites, probably because they contain multiple cell types, resulting in more expressed genes and more open chromatin. Consequently, DNA input and mock IP controls performed similarly for cell lines, whereas for complex samples, mock IP substantially reduced the number of spurious sites. However, DNA input is still informative; thus, we developed a simple framework integrating both controls, improving binding site detection.
Affiliation(s)
- Jinrui Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Alec Victorsen
- Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago, IL 60637, USA
- Jiahao Gao
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Haneen N Ammouri
- Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago, IL 60637, USA
- Fabio C P Navarro
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Louis Gevirtzman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Robert H Waterston
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Kevin P White
- Institute for Genomics and Systems Biology, Department of Human Genetics, University of Chicago, IL 60637, USA
- Valerie Reinke
- Department of Genetics, Yale University, New Haven, CT 06520, USA
- Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Department of Computer Science, Yale University, New Haven, CT 06520, USA
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
19
Xie Y, Jiang S, Li L, Yu X, Wang Y, Luo C, Cai Q, He W, Xie H, Zheng Y, Xie H, Zhang J. Single-Cell RNA Sequencing Efficiently Predicts Transcription Factor Targets in Plants. FRONTIERS IN PLANT SCIENCE 2020; 11:603302. [PMID: 33424903 PMCID: PMC7793804 DOI: 10.3389/fpls.2020.603302] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/06/2020] [Accepted: 11/16/2020] [Indexed: 05/31/2023]
Abstract
Discovering transcription factor (TF) targets is necessary for the study of regulatory pathways, but it is hampered in plants by the lack of highly efficient predictive technology. This study is the first to establish a simple system for predicting TF targets in rice (Oryza sativa) leaf cells based on the 10x Genomics single-cell RNA sequencing method. We effectively utilized the transient expression system to create the differential expression of a TF (OsNAC78) in each cell and sequenced all single cell transcriptomes. In total, 35 candidate targets having strong correlations with OsNAC78 expression were captured using expression profiles. Likewise, 78 potential differentially expressed genes were identified between clusters having the lowest and highest expression levels of OsNAC78. A gene overlapping analysis identified 19 genes as final candidate targets, and various assays indicated that Os01g0934800 and Os01g0949900 were OsNAC78 targets. Additionally, the cell profiles showed extremely similar expression trajectories between OsNAC78 and the two targets. The data presented here provide a high-resolution insight into predicting TF targets and offer a new application for single-cell RNA sequencing in plants.
Affiliation(s)
- Yunjie Xie
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- Shenfei Jiang
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- Lele Li
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- College of Agronomy, Fujian Agriculture and Forestry University, Fuzhou, China
- Xiangzhen Yu
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- Yupeng Wang
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- Cuiqin Luo
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- Qiuhua Cai
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- Wei He
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- Hongguang Xie
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
| | - Yanmei Zheng
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
| | - Huaan Xie
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- College of Agronomy, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Jianfu Zhang
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, China
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fuzhou, China
- Key Laboratory of Germplasm Innovation and Molecular Breeding of Hybrid Rice for South China, Ministry of Agriculture and Rural Affairs, Fuzhou, China
- Incubator of National Key Laboratory of Germplasm Innovation and Molecular Breeding Between Fujian and Ministry of Sciences and Technology, Fuzhou, China
- Fuzhou Branch, National Rice Improvement Center of China, Fuzhou, China
- Fujian Engineering Laboratory of Crop Molecular Breeding, Fuzhou, China
- Base of South China, State Key Laboratory of Hybrid Rice, Fuzhou, China
- College of Agronomy, Fujian Agriculture and Forestry University, Fuzhou, China
|
20
|
Leporcq C, Spill Y, Balaramane D, Toussaint C, Weber M, Bardet AF. TFmotifView: a webserver for the visualization of transcription factor motifs in genomic regions. Nucleic Acids Res 2020; 48:W208-W217. [PMID: 32324215 PMCID: PMC7319436 DOI: 10.1093/nar/gkaa252] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/20/2020] [Revised: 03/24/2020] [Accepted: 04/08/2020] [Indexed: 12/31/2022] Open
Abstract
Transcription factors (TFs) regulate gene expression. The binding specificities of many TFs have been deciphered and summarized as position-weight matrices, also called TF motifs. Despite the availability of hundreds of known TF motifs in databases, it remains non-trivial to quickly query and visualize the enrichment of known TF motifs in genomic regions of interest. Towards this goal, we developed TFmotifView, a web server that allows users to study the distribution of known TF motifs in genomic regions. Based on input genomic regions and selected TF motifs, TFmotifView performs an overlap of the genomic regions with TF motif occurrences identified using a dynamic P-value threshold. TFmotifView generates three different outputs: (i) an enrichment table and scatterplot calculating the significance of TF motif occurrences in genomic regions compared to control regions, (ii) a genomic view of the organisation of TF motifs in each genomic region and (iii) a metaplot summarizing the position of TF motifs relative to the center of the regions. TFmotifView will contribute to the integration of TF motif information with a wide range of genomic datasets, with the goal of better understanding the regulation of gene expression by transcription factors. TFmotifView is freely available at http://bardet.u-strasbg.fr/tfmotifview/.
Affiliation(s)
- Clémentine Leporcq
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Yannick Spill
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Delphine Balaramane
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Christophe Toussaint
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Michaël Weber
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
| | - Anaïs Flore Bardet
- CNRS, University of Strasbourg, UMR7242 Biotechnology and Cell Signaling, Illkirch 67412, France
|
21
|
Yang J, Ma A, Hoppe AD, Wang C, Li Y, Zhang C, Wang Y, Liu B, Ma Q. Prediction of regulatory motifs from human Chip-sequencing data using a deep learning framework. Nucleic Acids Res 2019; 47:7809-7824. [PMID: 31372637 PMCID: PMC6735894 DOI: 10.1093/nar/gkz672] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 07/12/2019] [Accepted: 07/23/2019] [Indexed: 11/24/2022] Open
Abstract
The identification of transcription factor binding sites and cis-regulatory motifs is a frontier whereupon the rules governing protein–DNA binding are being revealed. Here, we developed a new method (DEep Sequence and Shape mOtif or DESSO) for cis-regulatory motif prediction using deep neural networks and the binomial distribution model. DESSO outperformed existing tools, including DeepBind, in predicting motifs in 690 human ENCODE ChIP-sequencing datasets. Furthermore, the deep-learning framework of DESSO expanded motif discovery beyond the state-of-the-art by allowing the identification of known and new protein–protein–DNA tethering interactions in human transcription factors (TFs). Specifically, 61 putative tethering interactions were identified among the 100 TFs expressed in the K562 cell line. In this work, the power of DESSO was further expanded by integrating the detection of DNA shape features. We found that shape information has strong predictive power for TF–DNA binding and provides new putative shape motif information for human TFs. Thus, DESSO improves the identification and structural analysis of TF binding sites by integrating the complexities of DNA binding into a deep-learning framework.
Affiliation(s)
- Jinyu Yang
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Department of Computer Science and Engineering, The University of Texas at Arlington, Arlington, TX 76010, USA
| | - Anjun Ma
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
| | - Adam D Hoppe
- Department of Chemistry and Biochemistry, South Dakota State University, Brookings, SD 57007, USA
- BioSNTR, Brookings, SD 57007, USA
| | - Cankun Wang
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
| | - Yang Li
- School of Mathematics, Shandong University, Jinan 250100, China
| | - Chi Zhang
- Department of Medical and Molecular Genetics, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
| | - Yan Wang
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
| | - Bingqiang Liu
- School of Mathematics, Shandong University, Jinan 250100, China
| | - Qin Ma
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
|
22
|
Kribelbauer JF, Rastogi C, Bussemaker HJ, Mann RS. Low-Affinity Binding Sites and the Transcription Factor Specificity Paradox in Eukaryotes. Annu Rev Cell Dev Biol 2019; 35:357-379. [PMID: 31283382 DOI: 10.1146/annurev-cellbio-100617-062719] [Citation(s) in RCA: 106] [Impact Index Per Article: 21.2] [Indexed: 12/20/2022]
Abstract
Eukaryotic transcription factors (TFs) from the same structural family tend to bind similar DNA sequences, despite the ability of these TFs to execute distinct functions in vivo. The cell partly resolves this specificity paradox through combinatorial strategies and the use of low-affinity binding sites, which are better able to distinguish between similar TFs. However, because these sites have low affinity, it is challenging to understand how TFs recognize them in vivo. Here, we summarize recent findings and technological advancements that allow for the quantification and mechanistic interpretation of TF recognition across a wide range of affinities. We propose a model that integrates insights from the fields of genetics and cell biology to provide further conceptual understanding of TF binding specificity. We argue that in eukaryotes, target specificity is driven by an inhomogeneous 3D nuclear distribution of TFs and by variation in DNA binding affinity such that locally elevated TF concentration allows low-affinity binding sites to be functional.
Affiliation(s)
- Judith F Kribelbauer
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10031, USA
| | - Chaitanya Rastogi
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10031, USA
| | - Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10031, USA
| | - Richard S Mann
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10031, USA
- Department of Biochemistry and Molecular Biophysics, Columbia University Irving Medical Center, New York, NY 10031, USA
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
|
23
|
Anderson AP, Jones AG. erefinder: Genome-wide detection of oestrogen response elements. Mol Ecol Resour 2019; 19:1366-1373. [PMID: 31177626 DOI: 10.1111/1755-0998.13046] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/25/2018] [Revised: 05/31/2019] [Accepted: 05/31/2019] [Indexed: 11/28/2022]
Abstract
Oestrogen response elements (EREs) are specific DNA sequences to which ligand-bound oestrogen receptors (ERs) physically bind, allowing them to act as transcription factors for target genes. Locating EREs and ER responsive regions is therefore a potentially important component of the study of oestrogen-regulated pathways. Here, we report the development of a novel software tool, erefinder, which conducts a genome-wide, sliding-window analysis of oestrogen receptor binding affinity. We demonstrate the effects of adjusting window size and highlight the program's general agreement with ChIP studies. We further provide two examples of how erefinder can be used for comparative approaches. erefinder can handle large input files, has settings to allow for broad and narrow searches, and provides the full output to allow for greater data manipulation. These features facilitate a wide range of hypothesis testing for researchers and make erefinder an excellent tool to aid in oestrogen-related research.
Affiliation(s)
- Andrew P Anderson
- Department of Biology, Texas A&M University, College Station, TX, USA
| | - Adam G Jones
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
|
24
|
Mitkin NA, Korneev K, Gorbacheva AM, Kuprash DV. Relative Efficiency of Transcription Factor Binding to Allelic Variants of Regulatory Regions of Human Genes in Immunoprecipitation and Real-Time PCR. Mol Biol 2019. [DOI: 10.1134/s0026893319030117] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
|
25
|
Domingo J, Baeza-Centurion P, Lehner B. The Causes and Consequences of Genetic Interactions (Epistasis). Annu Rev Genomics Hum Genet 2019; 20:433-460. [PMID: 31082279 DOI: 10.1146/annurev-genom-083118-014857] [Citation(s) in RCA: 124] [Impact Index Per Article: 24.8] [Indexed: 11/09/2022]
Abstract
The same mutation can have different effects in different individuals. One important reason for this is that the outcome of a mutation can depend on the genetic context in which it occurs. This dependency is known as epistasis. In recent years, there has been a concerted effort to quantify the extent of pairwise and higher-order genetic interactions between mutations through deep mutagenesis of proteins and RNAs. This research has revealed two major components of epistasis: nonspecific genetic interactions caused by nonlinearities in genotype-to-phenotype maps, and specific interactions between particular mutations. Here, we provide an overview of our current understanding of the mechanisms causing epistasis at the molecular level, the consequences of genetic interactions for evolution and genetic prediction, and the applications of epistasis for understanding biology and determining macromolecular structures.
Affiliation(s)
- Júlia Domingo
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
| | - Pablo Baeza-Centurion
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
| | - Ben Lehner
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
- Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Spain
|
26
|
Hu J, Wang J, Lin J, Liu T, Zhong Y, Liu J, Zheng Y, Gao Y, He J, Shang X. MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites. BMC Bioinformatics 2019; 20:200. [PMID: 31074373 PMCID: PMC6509868 DOI: 10.1186/s12859-019-2735-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, motif discovery of these binding preference patterns is of central significance in the understanding of molecular regulation mechanisms. Many algorithms have been proposed for the identification of transcription factor binding sites. However, it remains a challenging problem. RESULTS: Here, we proposed a novel motif discovery algorithm based on support vector machine (MD-SVM) to learn a discriminative model for TF binding sites. MD-SVM first obtains a position weight matrix (PWM) from a set of training datasets. Then it translates the motif discovery (MD) problem into a computational framework of multiple instance learning (MIL). It was applied to several real biological datasets. Results show that our algorithm outperforms MI-SVM in terms of both accuracy and specificity. CONCLUSIONS: In this paper, we modeled the TF motif discovery problem as a MIL optimization problem. The SVM algorithm was adapted to discriminate positive and negative bags of instances. Compared to other SVM-based algorithms, MD-SVM shows its superiority over its competitors in terms of ROC AUC. Hopefully, it could be of benefit to the research community in the understanding of molecular functions of DNA functional elements and transcription factors.
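The first step named in this abstract, deriving a position weight matrix from training sequences, can be illustrated with a minimal sketch. It is not the MD-SVM implementation: the function names, the toy aligned sites, the pseudocount of 1 and the uniform background frequency of 0.25 are all assumptions chosen for illustration.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per position) from equal-length aligned sites.

    The pseudocount and the uniform background are illustrative assumptions.
    """
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # log2 ratio of observed frequency to background frequency
        pwm.append({b: math.log2((c / total) / background) for b, c in counts.items()})
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds for a sequence as long as the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq))

def best_window(pwm, seq):
    """Slide the PWM along seq and return (best score, start index)."""
    width = len(pwm)
    return max((score(pwm, seq[i:i + width]), i) for i in range(len(seq) - width + 1))

# Hypothetical aligned sites resembling an E-box; not data from the paper.
toy_sites = ["CACGTG", "CACGTG", "CACGTT", "CATGTG"]
toy_pwm = build_pwm(toy_sites)
```

Calling `best_window(toy_pwm, "AAACACGTGAAA")` returns the score and start position of the best-matching window. A discriminative method such as the one described would go on to refine or replace this simple scoring, which the sketch does not attempt.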
Affiliation(s)
- Jialu Hu
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
- Centre of Multidisciplinary Convergence Computing, School of Computer Science, Northwestern Polytechnical University, 1 Dong Xiang Road, Xi’an, 710129 China
| | - Jingru Wang
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
| | - Jianan Lin
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
| | - Tianwei Liu
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
| | - Yuanke Zhong
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
| | - Jie Liu
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
| | - Yan Zheng
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
| | - Yiqun Gao
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
| | - Junhao He
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
| | - Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi’an, 710072 China
|
27
|
Rosa M, Di Felice R, Corni S. Adsorption Mechanisms of Nucleobases on the Hydrated Au(111) Surface. LANGMUIR : THE ACS JOURNAL OF SURFACES AND COLLOIDS 2018; 34:14749-14756. [PMID: 29723478 DOI: 10.1021/acs.langmuir.8b00065] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 06/08/2023]
Abstract
The solution environment is of fundamental importance in the adsorption of molecules on surfaces, a process that is strongly affected by the capability of the adsorbate to disrupt the hydration layer above the surface. Here we disclose how the presence of interface water influences the adsorption mechanism of DNA nucleobases on a gold surface. By means of metadynamics simulations, we describe the distinctive features of a complex free-energy landscape for each base, which manifests activation barriers for the adsorption process. We characterize the different pathways that allow each nucleobase to overcome the barriers and be adsorbed on the surface, discussing how they influence the kinetics of adsorption of single-stranded DNA oligomers with homogeneous sequences. Our findings offer a rationale as to why experimental data on the adsorption of single-stranded homo-oligonucleotides do not straightforwardly follow the thermodynamic affinity rank.
Affiliation(s)
| | - Rosa Di Felice
- Center S3, CNR Institute of Nanoscience, 41125 Modena, Italy
- Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90089, United States
| | - Stefano Corni
- Center S3, CNR Institute of Nanoscience, 41125 Modena, Italy
|
28
|
Schneider A, Niemeyer CM. DNA Surface Technology: From Gene Sensors to Integrated Systems for Life and Materials Sciences. Angew Chem Int Ed Engl 2018. [DOI: 10.1002/ange.201811713] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/01/2023]
Affiliation(s)
- Ann‐Kathrin Schneider
- Institute for Biological Interfaces (IBG 1), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz, 76344 Eggenstein-Leopoldshafen, Germany
| | - Christof M. Niemeyer
- Institute for Biological Interfaces (IBG 1), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz, 76344 Eggenstein-Leopoldshafen, Germany
|
29
|
Schneider A, Niemeyer CM. DNA Surface Technology: From Gene Sensors to Integrated Systems for Life and Materials Sciences. Angew Chem Int Ed Engl 2018; 57:16959-16967. [DOI: 10.1002/anie.201811713] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 10/11/2018] [Revised: 11/15/2018] [Indexed: 01/21/2023]
Affiliation(s)
- Ann‐Kathrin Schneider
- Institute for Biological Interfaces (IBG 1), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz, 76344 Eggenstein-Leopoldshafen, Germany
| | - Christof M. Niemeyer
- Institute for Biological Interfaces (IBG 1), Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz, 76344 Eggenstein-Leopoldshafen, Germany
|
30
|
Abstract
Designing expression cassettes with desired properties remains the most important consideration in gene engineering technology. One of the challenges for predictive gene expression is the modeling of synthetic gene switches that regulate one or more target genes in direct response to specific chemical, environmental, and physiological stimuli. Assessment of natural promoters, high-throughput sequencing, and the modern biotech inventory have aided in deciphering the structure of cis elements and in molding native cis elements into desired synthetic promoters. Synthetic promoters that are molded by rearrangement of cis motifs can greatly benefit plant biotechnology applications. This review gives a glimpse of manual in vivo gene regulation through synthetic promoters. It summarizes the integrative design strategy of synthetic promoters and enumerates five approaches for constructing synthetic promoters. Insights to date into the pattern of cis regulatory elements in the pursuit of desirable "gene switches" have also been reevaluated. Joint strategies of bioinformatics modeling and randomized biochemical synthesis are addressed in an effort to construct synthetic promoters for intricate gene regulation.
|
31
|
Gao Z, Ruan J. Computational modeling of in vivo and in vitro protein-DNA interactions by multiple instance learning. Bioinformatics 2018; 33:2097-2105. [PMID: 28334224 DOI: 10.1093/bioinformatics/btx115] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/30/2016] [Accepted: 02/27/2017] [Indexed: 12/25/2022] Open
Abstract
Motivation: The study of transcriptional regulation is still difficult yet fundamental in molecular biology research. While the development of both in vivo and in vitro profiling techniques has significantly enhanced our knowledge of transcription factor (TF)-DNA interactions, computational models of TF-DNA interactions are relatively simple and may not reveal sufficient biological insight. In particular, supervised learning based models for TF-DNA interactions attempt to map sequence-level features (k-mers) to binding events but usually ignore the location of k-mers, which can cause data fragmentation and consequently inferior model performance. Results: Here, we propose a novel algorithm based on the so-called multiple-instance learning (MIL) paradigm. MIL breaks each DNA sequence into multiple overlapping subsequences and models each subsequence separately, and therefore implicitly takes binding site locations into consideration, resulting in both higher accuracy and better interpretability of the models. The results from both in vivo and in vitro TF-DNA interaction data show that our approach significantly outperforms conventional single-instance learning based algorithms. Importantly, the models learned from in vitro data using our approach can predict in vivo binding with very good accuracy. In addition, the location information obtained by our method provides additional insight for motif finding results from ChIP-Seq data. Finally, our approach can be easily combined with other state-of-the-art TF-DNA interaction modeling methods. Availability and Implementation: http://www.cs.utsa.edu/~jruan/MIL/. Contact: jianhua.ruan@utsa.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
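The bag-of-instances idea in this abstract can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: the function names, the instance length, the stride, the toy k-mer weights and the max-pooling rule that turns instance scores into a bag score are all assumptions.

```python
def make_bag(sequence, instance_len, stride=1):
    """Split a sequence into overlapping subsequences (the instances of one bag)."""
    if instance_len >= len(sequence):
        return [(0, sequence)]
    return [
        (start, sequence[start:start + instance_len])
        for start in range(0, len(sequence) - instance_len + 1, stride)
    ]

def instance_score(instance, kmer_weights, k):
    """Score one instance as the sum of the weights of the k-mers it contains."""
    return sum(
        kmer_weights.get(instance[j:j + k], 0.0)
        for j in range(len(instance) - k + 1)
    )

def bag_score(sequence, kmer_weights, k, instance_len, stride=1):
    """Return (score, start) of the best instance; max-pooling is an assumed rule."""
    bag = make_bag(sequence, instance_len, stride)
    return max((instance_score(inst, kmer_weights, k), start) for start, inst in bag)

# Hypothetical 4-mer weights standing in for a trained model.
toy_weights = {"CACG": 1.0, "ACGT": 1.0, "CGTG": 1.0}
toy_result = bag_score("TTTTCACGTGTTTT", toy_weights, k=4, instance_len=6)
```

Because the bag score carries the start index of the winning instance, the sketch also shows how a per-instance model yields a location estimate as a by-product, which is the interpretability benefit the abstract refers to.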
Affiliation(s)
- Zhen Gao
- Department of Computer Science, University of Texas at San Antonio, San Antonio, TX, USA
| | - Jianhua Ruan
- Department of Computer Science, University of Texas at San Antonio, San Antonio, TX, USA
|
32
|
Park J, Wang HH. Systematic and synthetic approaches to rewire regulatory networks. CURRENT OPINION IN SYSTEMS BIOLOGY 2018; 8:90-96. [PMID: 30637352 PMCID: PMC6329604 DOI: 10.1016/j.coisb.2017.12.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022]
Abstract
Microbial gene regulatory networks are composed of cis- and trans-components that in concert act to control essential and adaptive cellular functions. Regulatory components and interactions evolve to adopt new configurations through mutations and network rewiring events, resulting in novel phenotypes that may benefit the cell. Advances in high-throughput DNA synthesis and sequencing have enabled the development of new tools and approaches to better characterize and perturb various elements of regulatory networks. Here, we highlight key recent approaches to systematically dissect the sequence space of cis-regulatory elements and trans-regulators as well as their inter-connections. These efforts yield fundamental insights into the architecture, robustness, and dynamics of gene regulation and provide models and design principles for building synthetic regulatory networks for a variety of practical applications.
Affiliation(s)
- Jimin Park
- Department of Systems Biology, Columbia University Medical Center, New York, USA
- Integrated Program in Cellular, Molecular and Biomedical Studies, Columbia University Medical Center, New York, USA
| | - Harris H Wang
- Department of Systems Biology, Columbia University Medical Center, New York, USA
- Department of Pathology and Cell Biology, Columbia University Medical Center, New York, USA
|
33
|
Comprehensive, high-resolution binding energy landscapes reveal context dependencies of transcription factor binding. Proc Natl Acad Sci U S A 2018; 115:E3702-E3711. [PMID: 29588420 PMCID: PMC5910820 DOI: 10.1073/pnas.1715888115] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Indexed: 11/18/2022] Open
Abstract
Transcription factors (TFs) are primary regulators of gene expression in cells, where they bind specific genomic target sites to control transcription. Quantitative measurements of TF-DNA binding energies can improve the accuracy of predictions of TF occupancy and downstream gene expression in vivo and shed light on how transcriptional networks are rewired throughout evolution. Here, we present a sequencing-based TF binding assay and analysis pipeline (BET-seq, for Binding Energy Topography by sequencing) capable of providing quantitative estimates of binding energies for more than one million DNA sequences in parallel at high energetic resolution. Using this platform, we measured the binding energies associated with all possible combinations of 10 nucleotides flanking the known consensus DNA target interacting with two model yeast TFs, Pho4 and Cbf1. A large fraction of these flanking mutations change overall binding energies by an amount equal to or greater than consensus site mutations, suggesting that current definitions of TF binding sites may be too restrictive. By systematically comparing estimates of binding energies output by deep neural networks (NNs) and biophysical models trained on these data, we establish that dinucleotide (DN) specificities are sufficient to explain essentially all variance in observed binding behavior, with Cbf1 binding exhibiting significantly more nonadditivity than Pho4. NN-derived binding energies agree with orthogonal biochemical measurements and reveal that dynamically occupied sites in vivo are both energetically and mutationally distant from the highest affinity sites.
|
34
|
Rossi MJ, Lai WKM, Pugh BF. Genome-wide determinants of sequence-specific DNA binding of general regulatory factors. Genome Res 2018; 28:497-508. [PMID: 29563167 PMCID: PMC5880240 DOI: 10.1101/gr.229518.117] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 08/24/2017] [Accepted: 03/05/2018] [Indexed: 01/01/2023]
Abstract
General regulatory factors (GRFs), such as Reb1, Abf1, Rap1, Mcm1, and Cbf1, positionally organize yeast chromatin through interactions with a core consensus DNA sequence. It is assumed that sequence recognition via direct base readout suffices for specificity and that spurious nonfunctional sites are rendered inaccessible by chromatin. We tested these assumptions through genome-wide mapping of GRFs in vivo and in purified biochemical systems at near–base pair (bp) resolution using several ChIP-exo–based assays. We find that computationally predicted DNA shape features (e.g., minor groove width, helix twist, base roll, and propeller twist) that are not defined by a unique consensus sequence are embedded in the nonunique portions of GRF motifs and contribute critically to sequence-specific binding. This dual source specificity occurs at GRF sites in promoter regions where chromatin organization starts. Outside of promoter regions, strong consensus sites lack the shape component and consequently lack an intrinsic ability to bind cognate GRFs, without regard to influences from chromatin. However, sites having a weak consensus and low intrinsic affinity do exist in these regions but are rendered inaccessible in a chromatin environment. Thus, GRF site-specificity is achieved through integration of favorable DNA sequence and shape readouts in promoter regions and by chromatin-based exclusion from fortuitous weak sites within gene bodies. This study further revealed a severe G/C nucleotide cross-linking selectivity inherent in all formaldehyde-based ChIP assays, which includes ChIP-seq. However, for most tested proteins, G/C selectivity did not appreciably affect binding site detection, although it does place limits on the quantitativeness of occupancy levels.
Affiliation(s)
- Matthew J Rossi
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - William K M Lai
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - B Franklin Pugh
- Center for Eukaryotic Gene Regulation, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| |
Collapse
|
35
|
Rube HT, Rastogi C, Kribelbauer JF, Bussemaker HJ. A unified approach for quantifying and interpreting DNA shape readout by transcription factors. Mol Syst Biol 2018; 14:e7902. [PMID: 29472273 PMCID: PMC5822049 DOI: 10.15252/msb.20177902] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 07/28/2017] [Revised: 01/26/2018] [Accepted: 01/31/2018] [Indexed: 01/07/2023]
Abstract
Transcription factors (TFs) interpret DNA sequence by probing the chemical and structural properties of the nucleotide polymer. DNA shape is thought to enable a parsimonious representation of dependencies between nucleotide positions. Here, we propose a unified mathematical representation of the DNA sequence dependence of shape and TF binding, respectively, which simplifies and enhances analysis of shape readout. First, we demonstrate that linear models based on mononucleotide features alone account for 60-70% of the variance in minor groove width, roll, helix twist, and propeller twist. This explains why simple scoring matrices that ignore all dependencies between nucleotide positions can partially account for DNA shape readout by a TF. Adding dinucleotide features as sequence-to-shape predictors to our model, we can almost perfectly explain the shape parameters. Building on this observation, we developed a post hoc analysis method that can be used to analyze any mechanism-agnostic protein-DNA binding model in terms of shape readout. Our insights provide an alternative strategy for using DNA shape information to enhance our understanding of how cis-regulatory codes are interpreted by the cellular machinery.
Affiliation(s)
- H Tomas Rube
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Chaitanya Rastogi
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Program in Applied Physics and Applied Mathematics, Columbia University, New York, NY, USA
- Judith F Kribelbauer
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University Medical Center, New York, NY, USA
- Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University Medical Center, New York, NY, USA
|
36
|
Aditham AK, Shimko TC, Fordyce PM. BET-seq: Binding energy topographies revealed by microfluidics and high-throughput sequencing. Methods Cell Biol 2018; 148:229-250. [PMID: 30473071 PMCID: PMC7531582 DOI: 10.1016/bs.mcb.2018.09.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/25/2023]
Abstract
Biophysical models of transcriptional regulation rely on energetic measurements of the binding affinities between transcription factors (TFs) and target DNA binding sites. Historically, assays capable of measuring TF-DNA binding affinities have been relatively low-throughput (measuring ~10³ sequences in parallel) and have required significant specialized equipment, limiting their use to a handful of laboratories. Recently, we developed an experimental assay and analysis pipeline that allows measurement of binding energies between a single TF and up to 10⁶ DNA species in a single experiment (Binding Energy Topography by sequencing, or BET-seq) (Le et al., 2018). BET-seq employs the Mechanically Induced Trapping of Molecular Interactions (MITOMI) platform to purify DNA bound to a TF at equilibrium followed by high coverage sequencing to reveal relative differences in binding energy for each sequence. While we have previously used BET-seq to refine the binding affinity landscapes surrounding high-affinity DNA consensus target sites, we anticipate this technique will be applied in future work toward measuring a wide variety of TF-DNA landscapes. Here, we provide detailed instructions and general considerations for DNA library design, performing BET-seq assays, and analyzing the resulting data.
Affiliation(s)
- Arjun K. Aditham
- Department of Bioengineering, Stanford University, Stanford, CA, United States; Stanford ChEM-H, Stanford University, Stanford, CA, United States
- Tyler C. Shimko
- Department of Genetics, Stanford University, Stanford, CA, United States
- Polly M. Fordyce
- Department of Bioengineering, Stanford University, Stanford, CA, United States; Stanford ChEM-H, Stanford University, Stanford, CA, United States; Department of Genetics, Stanford University, Stanford, CA, United States; Chan Zuckerberg Biohub, San Francisco, CA, United States
|
37
|
Hartonen T, Sahu B, Dave K, Kivioja T, Taipale J. PeakXus: comprehensive transcription factor binding site discovery from ChIP-Nexus and ChIP-Exo experiments. Bioinformatics 2017; 32:i629-i638. [PMID: 27587683 DOI: 10.1093/bioinformatics/btw448] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Transcription factor (TF) binding can be studied accurately in vivo with ChIP-exo and ChIP-Nexus experiments. Only a fraction of TF binding mechanisms are yet fully understood, and accurate knowledge of binding locations and patterns of TFs is key to understanding binding that is not explained by simple positional weight matrix models. ChIP-exo/Nexus experiments can also offer insight on the effect of single nucleotide polymorphism (SNP) at TF binding sites on expression of the target genes. This is an important mechanism of action for disease-causing SNPs at non-coding genomic regions. RESULTS: We describe a peak caller PeakXus that is specifically designed to leverage the increased resolution of ChIP-exo/Nexus and developed with the aim of making as few assumptions of the data as possible to allow discoveries of novel binding patterns. We apply PeakXus to ChIP-Nexus and ChIP-exo experiments performed both in Homo sapiens and in Drosophila melanogaster cell lines. We show that PeakXus consistently finds more peaks overlapping with a TF-specific recognition sequence than published methods. As an application example we demonstrate how PeakXus can be coupled with unique molecular identifiers (UMIs) to measure the effect of a SNP overlapping with a TF binding site on the in vivo binding of the TF. AVAILABILITY AND IMPLEMENTATION: Source code of PeakXus is available at https://github.com/hartonen/PeakXus. CONTACT: tuomo.hartonen@helsinki.fi or jussi.taipale@ki.se.
Affiliation(s)
- Tuomo Hartonen
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
- Biswajyoti Sahu
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
- Kashyap Dave
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
- Teemu Kivioja
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland
- Jussi Taipale
- Genome-Scale Biology Research Program, Research Programs Unit, University of Helsinki, Helsinki, Finland; Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
|
38
|
Omidi S, Zavolan M, Pachkov M, Breda J, Berger S, van Nimwegen E. Automated incorporation of pairwise dependency in transcription factor binding site prediction using dinucleotide weight tensors. PLoS Comput Biol 2017; 13:e1005176. [PMID: 28753602 PMCID: PMC5550003 DOI: 10.1371/journal.pcbi.1005176] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/04/2016] [Revised: 08/09/2017] [Accepted: 06/02/2017] [Indexed: 11/17/2022]
Abstract
Gene regulatory networks are ultimately encoded by the sequence-specific binding of transcription factors (TFs) to short DNA segments. Although it is customary to represent the binding specificity of a TF by a position-specific weight matrix (PSWM), which assumes each position within a site contributes independently to the overall binding affinity, evidence has been accumulating that there can be significant dependencies between positions. Unfortunately, methodological challenges have so far hindered the development of a practical and generally-accepted extension of the PSWM model. On the one hand, simple models that only consider dependencies between nearest-neighbor positions are easy to use in practice, but fail to account for the distal dependencies that are observed in the data. On the other hand, models that allow for arbitrary dependencies are prone to overfitting, requiring regularization schemes that are difficult to use in practice for non-experts. Here we present a new regulatory motif model, called dinucleotide weight tensor (DWT), that incorporates arbitrary pairwise dependencies between positions in binding sites, rigorously from first principles, and free from tunable parameters. We demonstrate the power of the method on a large set of ChIP-seq data-sets, showing that DWTs outperform both PSWMs and motif models that only incorporate nearest-neighbor dependencies. We also demonstrate that DWTs outperform two previously proposed methods. Finally, we show that DWTs inferred from ChIP-seq data also outperform PSWMs on HT-SELEX data for the same TF, suggesting that DWTs capture inherent biophysical properties of the interactions between the DNA binding domains of TFs and their binding sites. We make a suite of DWT tools available at dwt.unibas.ch, that allow users to automatically perform ‘motif finding’, i.e. the inference of DWT motifs from a set of sequences, binding site prediction with DWTs, and visualization of DWT ‘dilogo’ motifs.
Gene regulatory networks are ultimately encoded in constellations of short binding sites in the DNA and RNA that are recognized by regulatory factors such as transcription factors (TFs). For several decades, computational analysis of regulatory networks has relied on a model of TF sequence-specificity, the position-specific weight-matrix (PSWM), that assumes different positions in a binding site contribute independently to the total binding energy of the TF. However, in recent years evidence has been accumulating that, at least for some TFs, this assumption does not hold. Here we present a new model for the sequence-specificity of TFs, the dinucleotide weight tensor (DWT), that takes arbitrary dependencies between positions in binding sites into account and show that it consistently outperforms PSWMs on high-throughput datasets on TF binding. Moreover, in contrast to previous approaches, DWTs are directly derived from first principles within a Bayesian framework, and contain no tunable parameters. This allows them to be easily applied in practice and we make a suite of tools available for computational analysis with DWTs.
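The independence assumption behind the PSWM described in this abstract can be made concrete with a small sketch. This is only an illustration, not the authors' DWT software: the count matrix, the uniform background, the pseudocount, and the function names (`pswm_from_counts`, `score_site`, `best_site`) are all hypothetical choices made for the example. It shows that under a PSWM the score of a site is a plain sum of per-position terms, so changing one base shifts the score by the same amount regardless of the other bases. That additivity is exactly what a model with pairwise dependencies relaxes.

```python
import math

# Hypothetical counts for a 4-position motif (one dict per position).
# These numbers are invented for illustration; they are not from the paper.
COUNTS = [
    {"A": 3, "C": 90, "G": 4, "T": 3},
    {"A": 90, "C": 3, "G": 4, "T": 3},
    {"A": 3, "C": 90, "G": 3, "T": 4},
    {"A": 4, "C": 3, "G": 90, "T": 3},
]


def pswm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    pswm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pswm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pswm


def score_site(pswm, site):
    """PSWM score: each position contributes independently, so just add."""
    if len(site) != len(pswm):
        raise ValueError("site length must match motif width")
    return sum(pswm[i][base] for i, base in enumerate(site))


def best_site(pswm, sequence):
    """Return (score, start) of the highest-scoring window in a sequence."""
    width = len(pswm)
    windows = [
        (score_site(pswm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(windows)


PSWM = pswm_from_counts(COUNTS)
# Swapping C for A at position 0 costs the same in any sequence context.
delta_in_consensus = score_site(PSWM, "CACG") - score_site(PSWM, "AACG")
delta_elsewhere = score_site(PSWM, "CTTT") - score_site(PSWM, "ATTT")
```

Here `delta_in_consensus` and `delta_elsewhere` agree up to floating-point rounding, which is the additivity that the abstract says does not hold for some TFs.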
Affiliation(s)
- Saeed Omidi
- Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Basel, Switzerland
- Mihaela Zavolan
- Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Basel, Switzerland
- Mikhail Pachkov
- Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Basel, Switzerland
- Jeremie Breda
- Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Basel, Switzerland
- Severin Berger
- Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Basel, Switzerland
- Erik van Nimwegen
- Biozentrum, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Basel, Switzerland
|
39
|
Hassanzadeh HR, Kolhe P, Isbell CL, Wang MD. MotifMark: Finding regulatory motifs in DNA sequences. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2017; 2017:3890-3893. [PMID: 29060747 PMCID: PMC7324295 DOI: 10.1109/embc.2017.8037706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.
|
40
|
Abstract
Here we provide an introduction to sequence motifs, including prediction of transcription factor-binding sites and general approaches to finding motifs enriched in a set of regulatory sequences.
|
41
|
Zhou S, Treloar AE, Lupien M. Emergence of the Noncoding Cancer Genome: A Target of Genetic and Epigenetic Alterations. Cancer Discov 2016; 6:1215-1229. [PMID: 27807102 DOI: 10.1158/2159-8290.cd-16-0745] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 07/13/2016] [Accepted: 08/17/2016] [Indexed: 12/14/2022]
Abstract
The emergence of whole-genome annotation approaches is paving the way for the comprehensive annotation of the human genome across diverse cell and tissue types exposed to various environmental conditions. This has already unmasked the positions of thousands of functional cis-regulatory elements integral to transcriptional regulation, such as enhancers, promoters, and anchors of chromatin interactions that populate the noncoding genome. Recent studies have shown that cis-regulatory elements are commonly the targets of genetic and epigenetic alterations associated with aberrant gene expression in cancer. Here, we review these findings to showcase the contribution of the noncoding genome and its alteration in the development and progression of cancer. We also highlight the opportunities to translate the biological characterization of genetic and epigenetic alterations in the noncoding cancer genome into novel approaches to treat or monitor disease. SIGNIFICANCE The majority of genetic and epigenetic alterations accumulate in the noncoding genome throughout oncogenesis. Discriminating driver from passenger events is a challenge that holds great promise to improve our understanding of the etiology of different cancer types. Advancing our understanding of the noncoding cancer genome may thus identify new therapeutic opportunities and accelerate our capacity to find improved biomarkers to monitor various stages of cancer development. Cancer Discov; 6(11); 1215-29. ©2016 AACR.
Affiliation(s)
- Stanley Zhou
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Aislinn E Treloar
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Mathieu Lupien
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada; Ontario Institute for Cancer Research, Toronto, Ontario, Canada
|
42
|
Kotoku T, Kosaka K, Nishio M, Ishida Y, Kawaichi M, Matsuda E. CIBZ Regulates Mesodermal and Cardiac Differentiation of by Suppressing T and Mesp1 Expression in Mouse Embryonic Stem Cells. Sci Rep 2016; 6:34188. [PMID: 27659197 PMCID: PMC5034229 DOI: 10.1038/srep34188] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 07/06/2016] [Accepted: 09/08/2016] [Indexed: 11/24/2022]
Abstract
The molecular mechanisms underlying mesodermal and cardiac specification from embryonic stem cells (ESCs) are not fully understood. Here, we showed that the BTB domain-containing zinc finger protein CIBZ is expressed in mouse ESCs but is dramatically downregulated during ESC differentiation. CIBZ deletion in ESCs induced specification toward mesoderm phenotypes and their differentiation into cardiomyocytes, whereas overexpression of CIBZ delayed these processes. During ESC differentiation, CIBZ loss-and-gain-of-function data indicate that CIBZ negatively regulates the expressions of Brachyury (T) and Mesp1, the key transcriptional factors responsible for the specification of mammalian mesoderm and cardiac progenitors, respectively. Chromatin immunoprecipitation assays showed that CIBZ binds to T and Mesp1 promoters in undifferentiated ESCs, and luciferase assays indicate that CIBZ suppresses T and Mesp1 promoters. These findings demonstrate that CIBZ is a novel regulator of mesodermal and cardiac differentiation of ESCs, and suggest that CIBZ-mediated cardiac differentiation depends on the regulation of these two genes.
Affiliation(s)
- Koji Kosaka
- Division of Gene Function in Animals, Nara Institute of Science and Technology, Ikoma, 630-0192, Japan
- Miki Nishio
- Functional Genomics and Medicine, Nara Institute of Science and Technology, Ikoma, 630-0192, Japan
- Yasumasa Ishida
- Functional Genomics and Medicine, Nara Institute of Science and Technology, Ikoma, 630-0192, Japan
- Masashi Kawaichi
- Division of Gene Function in Animals, Nara Institute of Science and Technology, Ikoma, 630-0192, Japan
- Eishou Matsuda
- Division of Gene Function in Animals, Nara Institute of Science and Technology, Ikoma, 630-0192, Japan
|
43
|
Zhang J, Gao B, Chai H, Ma Z, Yang G. Identification of DNA-binding proteins using multi-features fusion and binary firefly optimization algorithm. BMC Bioinformatics 2016; 17:323. [PMID: 27565741 PMCID: PMC5002159 DOI: 10.1186/s12859-016-1201-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 01/21/2016] [Accepted: 08/24/2016] [Indexed: 11/13/2022]
Abstract
Background: DNA-binding proteins (DBPs) play fundamental roles in many biological processes. Therefore, the development of effective computational tools for identifying DBPs is becoming highly desirable. Results: In this study, we proposed an accurate method for the prediction of DBPs. Firstly, we focused on the challenge of improving DBP prediction accuracy with information solely from the sequence. Secondly, we used multiple informative features to encode the protein. These features included evolutionary conservation profile, secondary structure motifs, and physicochemical properties. Thirdly, we introduced a novel improved Binary Firefly Algorithm (BFA) to remove redundant or noisy features as well as select optimal parameters for the classifier. The experimental results of our predictor on two benchmark datasets outperformed many state-of-the-art predictors, which revealed the effectiveness of our method. The promising prediction performance on a newly compiled independent testing dataset from PDB and a large-scale dataset from UniProt proved the good generalization ability of our method. In addition, the BFA forged in this research would be of great potential in practical applications in optimization fields, especially in feature selection problems. Conclusions: A highly accurate method was proposed for the identification of DBPs. A user-friendly web-server named iDbP (identification of DNA-binding Proteins) was constructed and provided for academic use.
Affiliation(s)
- Jian Zhang
- School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, People's Republic of China
- Bo Gao
- School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, People's Republic of China
- Haiting Chai
- School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, People's Republic of China
- Zhiqiang Ma
- School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, People's Republic of China
- Guifu Yang
- School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, People's Republic of China; Office of Informatization Management and Planning, Northeast Normal University, Changchun, 130117, People's Republic of China
|
44
|
Levati E, Sartini S, Ottonello S, Montanini B. Dry and wet approaches for genome-wide functional annotation of conventional and unconventional transcriptional activators. Comput Struct Biotechnol J 2016; 14:262-70. [PMID: 27453771 PMCID: PMC4941109 DOI: 10.1016/j.csbj.2016.06.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/31/2016] [Revised: 06/21/2016] [Accepted: 06/23/2016] [Indexed: 02/06/2023]
Abstract
Transcription factors (TFs) are master gene products that regulate gene expression in response to a variety of stimuli. They interact with DNA in a sequence-specific manner using a variety of DNA-binding domain (DBD) modules. This allows them to properly position their second domain, called the "effector domain", to directly or indirectly recruit positively or negatively acting co-regulators including chromatin modifiers, thus modulating preinitiation complex formation as well as transcription elongation. At variance with the DBDs, which are comprised of well-defined and easily recognizable DNA binding motifs, effector domains are usually much less conserved and thus considerably more difficult to predict. Also not so easy to identify are the DNA-binding sites of TFs, especially on a genome-wide basis and in the case of overlapping binding regions. Another emerging issue, with many potential regulatory implications, is that of so-called "moonlighting" transcription factors, i.e., proteins with an annotated function unrelated to transcription and lacking any recognizable DBD or effector domain, that play a role in gene regulation as their second job. Starting from bioinformatic and experimental high-throughput tools for an unbiased, genome-wide identification and functional characterization of TFs (especially transcriptional activators), we describe both established (and usually well affordable) as well as newly developed platforms for DNA-binding site identification. Selected combinations of these search tools, some of which rely on next-generation sequencing approaches, allow delineating the entire repertoire of TFs and unconventional regulators encoded by any sequenced genome.
Affiliation(s)
- Simone Ottonello
- Corresponding author at: Department of Life Sciences, University of Parma, Parco Area delle Scienze 23/A, 43124 Parma, Italy
|
45
|
Zhang X, Daaboul GG, Spuhler PS, Dröge P, Ünlü MS. Quantitative characterization of conformational-specific protein-DNA binding using a dual-spectral interferometric imaging biosensor. NANOSCALE 2016; 8:5587-5598. [PMID: 26890964 DOI: 10.1039/c5nr06785e] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/05/2023]
Abstract
DNA-binding proteins play crucial roles in the maintenance and functions of the genome and yet, their specific binding mechanisms are not fully understood. Recently, it was discovered that DNA-binding proteins recognize specific binding sites to carry out their functions through an indirect readout mechanism by recognizing and capturing DNA conformational flexibility and deformation. High-throughput DNA microarray-based methods that provide large-scale protein-DNA binding information have shown effective and comprehensive analysis of protein-DNA binding affinities, but do not provide information of DNA conformational changes in specific protein-DNA complexes. Building on the high-throughput capability of DNA microarrays, we demonstrate a quantitative approach that simultaneously measures the amount of protein binding to DNA and nanometer-scale DNA conformational change induced by protein binding in a microarray format. Both measurements rely on spectral interferometry on a layered substrate using a single optical instrument in two distinct modalities. In the first modality, we quantitate the amount of binding of protein to surface-immobilized DNA in each DNA spot using a label-free spectral reflectivity technique that accurately measures the surface densities of protein and DNA accumulated on the substrate. In the second modality, for each DNA spot, we simultaneously measure DNA conformational change using a fluorescence vertical sectioning technique that determines average axial height of fluorophores tagged to specific nucleotides of the surface-immobilized DNA. The approach presented in this paper, when combined with current high-throughput DNA microarray-based technologies, has the potential to serve as a rapid and simple method for quantitative and large-scale characterization of conformational specific protein-DNA interactions.
Affiliation(s)
- Xirui Zhang
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, USA
- George G Daaboul
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, USA; Department of Electrical and Computer Engineering, Boston University, Boston, Massachusetts 02215, USA
- Philipp S Spuhler
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, USA
- Peter Dröge
- School of Biological Sciences, Nanyang Technological University, Singapore 637551
- M Selim Ünlü
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts 02215, USA; Department of Electrical and Computer Engineering, Boston University, Boston, Massachusetts 02215, USA
|
46
|
Megraw M, Cumbie JS, Ivanchenko MG, Filichkin SA. Small Genetic Circuits and MicroRNAs: Big Players in Polymerase II Transcriptional Control in Plants. THE PLANT CELL 2016; 28:286-303. [PMID: 26869700 PMCID: PMC4790873 DOI: 10.1105/tpc.15.00852] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 10/06/2015] [Accepted: 02/10/2016] [Indexed: 05/11/2023]
Abstract
RNA Polymerase II (Pol II) regulatory cascades involving transcription factors (TFs) and their targets orchestrate the genetic circuitry of every eukaryotic organism. In order to understand how these cascades function, they can be dissected into small genetic networks, each containing just a few Pol II transcribed genes, that generate specific signal-processing outcomes. Small RNA regulatory circuits involve direct regulation of a small RNA by a TF and/or direct regulation of a TF by a small RNA and have been shown to play unique roles in many organisms. Here, we will focus on small RNA regulatory circuits containing Pol II transcribed microRNAs (miRNAs). While the role of miRNA-containing regulatory circuits as modular building blocks for the function of complex networks has long been on the forefront of studies in the animal kingdom, plant studies are poised to take a lead role in this area because of their advantages in probing transcriptional and posttranscriptional control of Pol II genes. The relative simplicity of tissue- and cell-type organization, miRNA targeting, and genomic structure make the Arabidopsis thaliana plant model uniquely amenable for small RNA regulatory circuit studies in a multicellular organism. In this Review, we cover analysis, tools, and validation methods for probing the component interactions in miRNA-containing regulatory circuits. We then review the important roles that plant miRNAs are playing in these circuits and summarize methods for the identification of small genetic circuits that strongly influence plant function. We conclude by noting areas of opportunity where new plant studies are imminently needed.
Affiliation(s)
- Molly Megraw
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331; Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331; Center for Genome Research and Biocomputing, Oregon State University, Corvallis, Oregon 97331
- Jason S Cumbie
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331
- Maria G Ivanchenko
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331
- Sergei A Filichkin
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331; Center for Genome Research and Biocomputing, Oregon State University, Corvallis, Oregon 97331
|
47
|
An ultrasensitive scanning electrochemical microscopy (SECM)-based DNA biosensing platform amplified with the long self-assembled DNA concatemers. Electrochim Acta 2016. [DOI: 10.1016/j.electacta.2015.12.102] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 01/23/2023]
|
48
|
|
49
|
Glick Y, Orenstein Y, Chen D, Avrahami D, Zor T, Shamir R, Gerber D. Integrated microfluidic approach for quantitative high-throughput measurements of transcription factor binding affinities. Nucleic Acids Res 2015; 44:e51. [PMID: 26635393 PMCID: PMC4824076 DOI: 10.1093/nar/gkv1327] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 03/12/2015] [Accepted: 11/14/2015] [Indexed: 01/16/2023]
Abstract
Protein binding to DNA is a fundamental process in gene regulation. Methodologies such as ChIP-Seq and mapping of DNase I hypersensitive sites provide global information on this regulation in vivo. In vitro methodologies provide valuable complementary information on protein-DNA specificities. However, current methods still do not measure absolute binding affinities. There is a real need for large-scale quantitative protein-DNA affinity measurements. We developed QPID, a microfluidic application for measuring protein-DNA affinities. A single run is equivalent to 4096 gel-shift experiments. Using QPID, we characterized the different affinities of ATF1, c-Jun, c-Fos and AP-1 to the CRE consensus motif and CRE half-site in two different genomic sequences on a single device. We discovered that binding of ATF1, but not of AP-1, to the CRE half-site is highly affected by its genomic context. This effect was highly correlated with ATF1 ChIP-seq and PBM experiments. Next, we characterized the affinities of ATF1 and ATF3 to 128 genomic CRE and CRE half-site sequences. Our affinity measurements explained that in vivo binding differences between ATF1 and ATF3 to CRE and CRE half-sites are partially mediated by differences in the minor groove width. We believe that QPID would become a central tool for quantitative characterization of biophysical aspects affecting protein-DNA binding.
Affiliation(s)
- Yair Glick
- Mina and Evrard Goodman life science faculty, Bar Ilan University, Ramat-Gan, 5290002, Israel
- Yaron Orenstein
- Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, 69978, Israel
- Dana Chen
- Mina and Evrard Goodman life science faculty, Bar Ilan University, Ramat-Gan, 5290002, Israel
- Dorit Avrahami
- Mina and Evrard Goodman life science faculty, Bar Ilan University, Ramat-Gan, 5290002, Israel
- Tsaffrir Zor
- Department of Biochemistry & Molecular Biology, Life Sciences Institute, Tel-Aviv University, Tel-Aviv, 69978, Israel
- Ron Shamir
- Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, 69978, Israel
- Doron Gerber
- Mina and Evrard Goodman life science faculty, Bar Ilan University, Ramat-Gan, 5290002, Israel
|
50
|
Dillon MBC, Schulten V, Oseroff C, Paul S, Dullanty LM, Frazier A, Belles X, Piulachs MD, Visness C, Bacharier L, Bloomberg GR, Busse P, Sidney J, Peters B, Sette A. Different Bla-g T cell antigens dominate responses in asthma versus rhinitis subjects. Clin Exp Allergy 2015; 45:1856-67. [PMID: 26414909 PMCID: PMC4654660 DOI: 10.1111/cea.12643] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 05/28/2015] [Revised: 07/29/2015] [Accepted: 08/19/2015] [Indexed: 02/05/2023]
Abstract
BACKGROUND AND OBJECTIVE: The allergenicity of several German cockroach (Bla-g) antigens at the level of IgE responses is well established. However, less is known about the specificity of CD4+ TH responses, and whether differences exist in associated magnitude or cytokine profiles as a function of disease severity. METHODS: Proteomic and transcriptomic techniques were used to identify novel antigens recognized by allergen-specific T cells. To characterize different TH functionalities of allergen-specific T cells, ELISPOT assays with sets of overlapping peptides covering the sequences of known allergens and novel antigens were employed to measure release of IL-5, IFNγ, IL-10, IL-17 and IL-21. RESULTS: Using these techniques, we characterized TH responses in a cohort of adult Bla-g-sensitized subjects, either with (n = 55) or without (n = 17) asthma, and nonsensitized controls (n = 20). T cell responses were detected for ten known Bla-g allergens and an additional ten novel Bla-g antigens, representing in total a 5-fold increase in the number of antigens demonstrated to be targeted by allergen-specific T cells. Responses of sensitized individuals regardless of asthma status were predominantly TH2, but higher in patients with diagnosed asthma. In asthmatic subjects, Bla-g 5, 9 and 11 were immunodominant, while, in contrast, nonasthmatic-sensitized subjects responded mostly to Bla-g 5 and 4 and the novel antigen NBGA5. CONCLUSIONS: Asthmatic and nonasthmatic cockroach-sensitized individuals exhibit similar TH2-polarized responses. Compared with nonasthmatics, however, asthmatic individuals have responses of higher magnitude and different allergen specificity.
Affiliation(s)
- M B C Dillon
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- V Schulten
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- C Oseroff
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- S Paul
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- L M Dullanty
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- A Frazier
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- X Belles
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
- M D Piulachs
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
- C Visness
- Federal Systems Division, Rho Inc., Chapel Hill, NC, USA
- L Bacharier
- Department of Pediatrics, Washington University School of Medicine, St. Louis, MO, USA
- G R Bloomberg
- Department of Pediatrics, Washington University School of Medicine, St. Louis, MO, USA
- P Busse
- Division of Clinical Immunology, Icahn School of Medicine at Mount Sinai School of Medicine, New York, NY, USA
- J Sidney
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- B Peters
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
- A Sette
- La Jolla Institute for Allergy and Immunology, La Jolla, CA, USA
|