1
Biddie SC, Weykopf G, Hird EF, Friman ET, Bickmore WA. DNA-binding factor footprints and enhancer RNAs identify functional non-coding genetic variants. Genome Biol 2024; 25:208. [PMID: 39107801 PMCID: PMC11304670 DOI: 10.1186/s13059-024-03352-1] [Received: 03/06/2024] [Accepted: 07/25/2024] [Indexed: 08/10/2024]
Abstract
BACKGROUND: Genome-wide association studies (GWAS) have revealed a multitude of candidate genetic variants affecting the risk of developing complex traits and diseases. However, the highlighted regions are typically in the non-coding genome, and uncovering the functional causative single nucleotide variants (SNVs) is challenging. Prioritization of variants is commonly based on genomic annotation with markers of active regulatory elements, but current approaches still predict functional variants poorly. To address this, we systematically analyze six markers of active regulatory elements for their ability to identify functional variants.
RESULTS: We benchmark against molecular quantitative trait loci (molQTL) from assays of regulatory element activity that identify allelic effects on DNA-binding factor occupancy, reporter assay expression, and chromatin accessibility. We identify the combination of DNase footprints and divergent enhancer RNA (eRNA) as markers for functional variants. This signature provides high precision at the cost of low recall, substantially reducing candidate variant sets so that variants can be prioritized for functional validation. We present this as a framework called FINDER (Functional SNV IdeNtification using DNase footprints and eRNA).
CONCLUSIONS: We demonstrate the framework's utility for prioritizing variants using a leukocyte count trait, and we analyze variants in linkage disequilibrium with a lead variant to predict a functional variant in asthma. Our findings have implications for prioritizing variants from GWAS, for the development of predictive scoring algorithms, and for functionally informed fine-mapping approaches.
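The precision/recall trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' FINDER implementation: the variant IDs, the three sets, and the helper `precision_recall` are all hypothetical, and the molQTL benchmark is reduced to a simple set of "functional" labels.

```python
from typing import Dict, Set, Tuple


def precision_recall(predicted: Set[str], functional: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of a predicted variant set against functional labels."""
    true_pos = len(predicted & functional)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(functional) if functional else 0.0
    return precision, recall


# Toy, made-up data (assumption): variants overlapping each marker,
# and variants labelled functional by a molQTL-style benchmark.
in_footprint = {"rs1", "rs2", "rs3", "rs4"}
in_erna = {"rs2", "rs3", "rs5", "rs6"}
functional = {"rs2", "rs3", "rs4", "rs5", "rs7"}

candidates: Dict[str, Set[str]] = {
    "footprint": in_footprint,
    "eRNA": in_erna,
    "footprint AND eRNA": in_footprint & in_erna,  # combined signature
}

results = {name: precision_recall(s, functional) for name, s in candidates.items()}
for name, (p, r) in results.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy example, requiring both markers shrinks the candidate set, which raises precision and lowers recall relative to either marker alone; real values depend entirely on the data.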
Affiliation(s)
- Simon C Biddie
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
  - NHS Lothian, Edinburgh, UK.
- Giovanna Weykopf
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
- Elias T Friman
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
- Wendy A Bickmore
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
2
Jones T, Sigauke RF, Sanford L, Taatjes DJ, Allen MA, Dowell RD. A transcription factor (TF) inference method that broadly measures TF activity and identifies mechanistically distinct TF networks. bioRxiv 2024:2024.03.15.585303. [PMID: 38559193 PMCID: PMC10980006 DOI: 10.1101/2024.03.15.585303] [Indexed: 04/04/2024]
Abstract
TF profiler is a method for inferring transcription factor (TF) regulatory activity, i.e. when a TF is present and actively regulating transcription, directly from nascent sequencing assays such as PRO-seq and GRO-seq. Transcription factors orchestrate transcription and play a critical role in cellular maintenance, identity, and response to external stimuli. Although ChIP assays measure where TFs localize on DNA, they fall short of identifying when and where TFs are actively regulating transcription. Our method instead uses RNA polymerase activity to infer TF activity across hundreds of data sets and transcription factors. Based on these inferred activities, we identify three distinct classes of transcription factors: ubiquitous factors that play roles in cellular homeostasis, driving basal gene programs across tissues and cell types; tissue-specific factors that act almost exclusively at enhancers and are themselves regulated at transcription; and stimulus-responsive TFs that are regulated post-transcriptionally but act predominantly at enhancers. TF profiler is broadly applicable, providing regulatory insights on any PRO-seq sample for any transcription factor with a known binding motif.
3
Sigauke RF, Sanford L, Maas ZL, Jones T, Stanley JT, Townsend HA, Allen MA, Dowell RD. Atlas of nascent RNA transcripts reveals enhancer to gene linkages. bioRxiv 2023:2023.12.07.570626. [PMID: 38105978 PMCID: PMC10723487 DOI: 10.1101/2023.12.07.570626] [Indexed: 12/19/2023]
Abstract
Gene transcription is controlled and modulated by regulatory regions, including enhancers and promoters. These regions are abundant in unstable, non-coding bidirectional transcription. Using nascent RNA transcription data across hundreds of human samples, we identified over 800,000 regions containing bidirectional transcription. We then identified highly correlated transcription between bidirectional and gene regions. The identified correlated pairs, each consisting of a bidirectional region and a gene, are enriched for disease-associated SNPs and often supported by independent 3D data. We present these data as an SQL database that serves as a resource for future studies into gene regulation, enhancer-associated RNAs, and transcription factors.
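The idea of linking a bidirectional region to a gene through correlated transcription across samples can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the correlation measure or cutoff, so Pearson correlation and the threshold `min_r=0.9` are assumptions, and the region names, gene names, and values are made up.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_pairs(
    bidirs: Dict[str, List[float]],
    genes: Dict[str, List[float]],
    min_r: float = 0.9,  # assumed cutoff, not taken from the source
) -> List[Tuple[str, str, float]]:
    """Return (bidirectional region, gene, r) for every pair with r >= min_r."""
    pairs = []
    for b_name, b_vals in bidirs.items():
        for g_name, g_vals in genes.items():
            r = pearson(b_vals, g_vals)
            if r >= min_r:
                pairs.append((b_name, g_name, r))
    return pairs


# Toy transcription levels across five samples (hypothetical values).
bidirs = {
    "bidir_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bidir_B": [5.0, 1.0, 4.0, 2.0, 3.0],
}
genes = {
    "GENE1": [2.0, 4.0, 6.0, 8.0, 10.0],
    "GENE2": [3.0, 3.5, 1.0, 4.0, 2.0],
}

pairs = correlated_pairs(bidirs, genes)
print(pairs)
```

A genome-scale analysis would typically also restrict which region-gene pairs are tested and correct for multiple comparisons; neither step is shown here.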
Affiliation(s)
- Rutendo F. Sigauke
  - BioFrontiers Institute, University of Colorado Boulder, 3415 Colorado Ave., UCB 596, Boulder, 80309, CO, USA
- Lynn Sanford
  - BioFrontiers Institute, University of Colorado Boulder, 3415 Colorado Ave., UCB 596, Boulder, 80309, CO, USA
- Zachary L. Maas
  - BioFrontiers Institute, University of Colorado Boulder, 3415 Colorado Ave., UCB 596, Boulder, 80309, CO, USA
  - Computer Science, University of Colorado Boulder, 1111 Engineering Drive, UCB 430, Boulder, 80309, CO, USA
- Taylor Jones
  - BioFrontiers Institute, University of Colorado Boulder, 3415 Colorado Ave., UCB 596, Boulder, 80309, CO, USA
- Jacob T. Stanley
  - BioFrontiers Institute, University of Colorado Boulder, 3415 Colorado Ave., UCB 596, Boulder, 80309, CO, USA
- Hope A. Townsend
  - BioFrontiers Institute, University of Colorado Boulder, 3415 Colorado Ave., UCB 596, Boulder, 80309, CO, USA
  - Molecular, Cellular and Developmental Biology, University of Colorado Boulder, 1945 Colorado Ave, UCB 347, Boulder, 80309, CO, USA
- Mary A. Allen
  - BioFrontiers Institute, University of Colorado Boulder, 3415 Colorado Ave., UCB 596, Boulder, 80309, CO, USA
- Robin D. Dowell
  - BioFrontiers Institute, University of Colorado Boulder, 3415 Colorado Ave., UCB 596, Boulder, 80309, CO, USA
  - Computer Science, University of Colorado Boulder, 1111 Engineering Drive, UCB 430, Boulder, 80309, CO, USA
  - Molecular, Cellular and Developmental Biology, University of Colorado Boulder, 1945 Colorado Ave, UCB 347, Boulder, 80309, CO, USA