1
Dong L, Wang J, Wang G. ASAS-EGB: A statistical framework for estimating allele-specific alternative splicing events using transcriptome data. Comput Biol Med 2023; 160:106981. [PMID: 37163967 DOI: 10.1016/j.compbiomed.2023.106981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2023] [Revised: 04/15/2023] [Accepted: 04/27/2023] [Indexed: 05/12/2023]
Abstract
Alternative splicing (AS) is ubiquitous in eukaryotes, greatly increases the diversity of proteins, and is involved in disease and cancer. Allele-specific AS (ASAS) studies can facilitate the identification of cis-acting elements because both alleles share the same cellular environment. Because the exons defined by AS events provide only limited information, we propose a statistical framework and algorithm, ASAS-EGB, for ASAS analysis using the gene transcriptome. The framework obtains exclusively compatible sets of gene isoforms supporting each event isoform, and uses both phased and non-phased SNPs within the exons of the gene isoforms for inference. Using this strategy, we demonstrate that ASAS-EGB yields better ASAS inferential performance than using event isoforms alone. ASAS-EGB supports both single-end and paired-end RNA-seq data, and we have demonstrated its robustness using RNA-seq replicates of individual NA12878. ASAS-EGB builds Bayesian models for ASAS analysis and fits them with the Markov chain Monte Carlo (MCMC) method. As more detailed annotations for individual genomes and transcriptomes become available, the proposed algorithm can better support such data to reveal the regulatory mechanisms of individual genomes.
Affiliation(s)
- Lili Dong
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China
| | - Jianan Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China
| | - Guohua Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China.
2
Martínez-Ruiz C, Black JRM, Puttick C, Hill MS, Demeulemeester J, Larose Cadieux E, Thol K, Jones TP, Veeriah S, Naceur-Lombardelli C, Toncheva A, Prymas P, Rowan A, Ward S, Cubitt L, Athanasopoulou F, Pich O, Karasaki T, Moore DA, Salgado R, Colliver E, Castignani C, Dietzen M, Huebner A, Al Bakir M, Tanić M, Watkins TBK, Lim EL, Al-Rashed AM, Lang D, Clements J, Cook DE, Rosenthal R, Wilson GA, Frankell AM, de Carné Trécesson S, East P, Kanu N, Litchfield K, Birkbak NJ, Hackshaw A, Beck S, Van Loo P, Jamal-Hanjani M, Swanton C, McGranahan N. Genomic-transcriptomic evolution in lung cancer and metastasis. Nature 2023; 616:543-552. [PMID: 37046093 PMCID: PMC10115639 DOI: 10.1038/s41586-023-05706-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 45.0] [Received: 10/21/2021] [Accepted: 01/04/2023] [Indexed: 04/14/2023]
Abstract
Intratumour heterogeneity (ITH) fuels lung cancer evolution, which leads to immune evasion and resistance to therapy [1]. Here, using paired whole-exome and RNA sequencing data, we investigate intratumour transcriptomic diversity in 354 non-small cell lung cancer tumours from 347 out of the first 421 patients prospectively recruited into the TRACERx study [2,3]. Analyses of 947 tumour regions, representing both primary and metastatic disease, alongside 96 tumour-adjacent normal tissue samples implicate the transcriptome as a major source of phenotypic variation. Gene expression levels and ITH relate to patterns of positive and negative selection during tumour evolution. We observe frequent copy number-independent allele-specific expression that is linked to epigenomic dysfunction. Allele-specific expression can also result in genomic-transcriptomic parallel evolution, which converges on cancer gene disruption. We extract signatures of RNA single-base substitutions and link their aetiology to the activity of the RNA-editing enzymes ADAR and APOBEC3A, thereby revealing otherwise undetected ongoing APOBEC activity in tumours. Characterizing the transcriptomes of primary-metastatic tumour pairs, we combine multiple machine-learning approaches that leverage genomic and transcriptomic variables to link metastasis-seeding potential to the evolutionary context of mutations and increased proliferation within primary tumour regions. These results highlight the interplay between the genome and transcriptome in influencing ITH, lung cancer evolution and metastasis.
Affiliation(s)
- Carlos Martínez-Ruiz
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | - James R M Black
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | - Clare Puttick
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Mark S Hill
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Jonas Demeulemeester
- Cancer Genomics Laboratory, The Francis Crick Institute, London, UK
- Integrative Cancer Genomics Laboratory, Department of Oncology, KU Leuven, Leuven, Belgium
- VIB-KU Leuven Center for Cancer Biology, Leuven, Belgium
| | - Elizabeth Larose Cadieux
- Cancer Genomics Laboratory, The Francis Crick Institute, London, UK
- Medical Genomics, University College London Cancer Institute, London, UK
| | - Kerstin Thol
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | - Thomas P Jones
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | - Selvaraju Veeriah
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | | | - Antonia Toncheva
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | - Paulina Prymas
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | - Andrew Rowan
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Sophia Ward
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
- Advanced Sequencing Facility, The Francis Crick Institute, London, UK
| | - Laura Cubitt
- Advanced Sequencing Facility, The Francis Crick Institute, London, UK
| | - Foteini Athanasopoulou
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
- Advanced Sequencing Facility, The Francis Crick Institute, London, UK
| | - Oriol Pich
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Takahiro Karasaki
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
- Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK
| | - David A Moore
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
- Department of Cellular Pathology, University College London Hospitals, London, UK
| | - Roberto Salgado
- Department of Pathology, ZAS Hospitals, Antwerp, Belgium
- Division of Research, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
| | - Emma Colliver
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Carla Castignani
- Cancer Genomics Laboratory, The Francis Crick Institute, London, UK
- Medical Genomics, University College London Cancer Institute, London, UK
| | - Michelle Dietzen
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Ariana Huebner
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Maise Al Bakir
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Miljana Tanić
- Medical Genomics, University College London Cancer Institute, London, UK
- Experimental Oncology, Institute for Oncology and Radiology of Serbia, Belgrade, Serbia
| | - Thomas B K Watkins
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Emilia L Lim
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Ali M Al-Rashed
- Centre for Nephrology, Division of Medicine, University College London, London, UK
| | - Danny Lang
- Scientific Computing STP, Francis Crick Institute, London, UK
| | - James Clements
- Scientific Computing STP, Francis Crick Institute, London, UK
| | - Daniel E Cook
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Rachel Rosenthal
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Gareth A Wilson
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | - Alexander M Frankell
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
| | | | - Philip East
- Bioinformatics and Biostatistics, The Francis Crick Institute, London, UK
| | - Nnennaya Kanu
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | - Kevin Litchfield
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Tumour Immunogenomics and Immunosurveillance Laboratory, University College London Cancer Institute, London, UK
| | - Nicolai J Birkbak
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK
- Department of Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Allan Hackshaw
- Cancer Research UK & UCL Cancer Trials Centre, London, UK
| | - Stephan Beck
- Medical Genomics, University College London Cancer Institute, London, UK
| | - Peter Van Loo
- Cancer Genomics Laboratory, The Francis Crick Institute, London, UK
- Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Mariam Jamal-Hanjani
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
- Cancer Metastasis Laboratory, University College London Cancer Institute, London, UK
- Department of Medical Oncology, University College London Hospitals, London, UK
| | - Charles Swanton
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK.
- Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute and University College London Cancer Institute, London, UK.
- Department of Medical Oncology, University College London Hospitals, London, UK.
| | - Nicholas McGranahan
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK.
- Cancer Genome Evolution Research Group, Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK.
3
Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, Łabaj PP, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023; 14:997383. [PMID: 36999049 PMCID: PMC10043755 DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] Open
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Affiliation(s)
- Dhrithi Deshpande
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Karishma Chhugani
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Yutong Chang
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Aaron Karlsberg
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Caitlin Loeffler
- Department of Computer Science, University of California, Los Angeles, CA, United States
| | - Jinyang Zhang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Agata Muszyńska
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
| | - Viorel Munteanu
- Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
| | - Harry Yang
- Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
| | - Jeremy Rotman
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Laura Tao
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | - Brunilda Balliu
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | | | - Eleazar Eskin
- Department of Computer Science, University of California, Los Angeles, CA, United States
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
| | - Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
| | - Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
| | - Paweł P. Łabaj
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Department of Biotechnology, Boku University Vienna, Vienna, Austria
| | - Serghei Mangul
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States
- *Correspondence: Serghei Mangul,
4
Mack KL, Square TA, Zhao B, Miller CT, Fraser HB. Evolution of Spatial and Temporal cis-Regulatory Divergence in Sticklebacks. Mol Biol Evol 2023; 40:7048494. [PMID: 36805962 PMCID: PMC10015619 DOI: 10.1093/molbev/msad034] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/02/2022] [Revised: 02/02/2023] [Accepted: 02/08/2023] [Indexed: 02/22/2023] Open
Abstract
Cis-regulatory changes are thought to play a major role in adaptation. Threespine sticklebacks have repeatedly colonized freshwater habitats in the Northern Hemisphere, where they have evolved a suite of phenotypes that distinguish them from marine populations, including changes in physiology, behavior, and morphology. To understand the role of gene regulatory evolution in adaptive divergence, here we investigate cis-regulatory changes in gene expression between marine and freshwater ecotypes through allele-specific expression (ASE) in F1 hybrids. Surveying seven ecologically relevant tissues, including three sampled across two developmental stages, we identified cis-regulatory divergence affecting a third of genes, nearly half of which were tissue-specific. Next, we compared allele-specific expression in dental tissues at two timepoints to characterize cis-regulatory changes during development between marine and freshwater fish. Applying a genome-wide test for selection on cis-regulatory changes, we find evidence for lineage-specific selection on several processes between ecotypes, including the Wnt signaling pathway in dental tissues. Finally, we show that genes with ASE, particularly those that are tissue-specific, are strongly enriched in genomic regions of repeated marine-freshwater divergence, supporting an important role for these cis-regulatory differences in parallel adaptive evolution of sticklebacks to freshwater habitats. Altogether, our results provide insight into the cis-regulatory landscape of divergence between stickleback ecotypes across tissues and during development, and support a fundamental role for tissue-specific cis-regulatory changes in rapid adaptation to new environments.
Affiliation(s)
- Katya L Mack
- Department of Biology, Stanford University, Stanford, CA
| | - Tyler A Square
- Department of Molecular and Cell Biology, University of California, Berkeley, CA
| | - Bin Zhao
- Department of Biology, Stanford University, Stanford, CA
| | - Craig T Miller
- Department of Molecular and Cell Biology, University of California, Berkeley, CA
5
Boatwright JL. A Robust Methodology for Assessing Homoeolog-Specific Expression. Methods Mol Biol 2023; 2545:251-258. [PMID: 36720817 DOI: 10.1007/978-1-0716-2561-3_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
Angiosperm evolution is marked by numerous, recurring polyploidization events. While hybridization and polyploidization have greatly increased the degree of genetic and phenotypic diversity in plants, the mechanisms underlying changes in the genotype-to-phenotype relationships remain unclear. As the field of natural sciences continues to expand during the post-genomic era, large datasets are becoming increasingly common. However, the development of tools and workflows to robustly assess these changes has lagged behind data production. A robust homoeolog-specific expression analysis strongly depends upon proper homoeolog calling, the ability to account for reference sequence biases, flexible and accurate methods for dealing with residual bias, and a reproducible workflow. To that end, this chapter aims to provide a detailed description of the potential pitfalls encountered while estimating homoeolog-specific expression and to provide a workflow that allows for robust inferences based on precise estimates of expression changes.
Affiliation(s)
- J Lucas Boatwright
- Advanced Plant Technology, Clemson University, Clemson, SC, USA.
- Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, USA.
6
Liang D, Aygün N, Matoba N, Ideraabdullah FY, Love MI, Stein JL. Inference of putative cell-type-specific imprinted regulatory elements and genes during human neuronal differentiation. Hum Mol Genet 2023; 32:402-416. [PMID: 35994039 PMCID: PMC9851749 DOI: 10.1093/hmg/ddac207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Revised: 08/02/2022] [Accepted: 08/17/2022] [Indexed: 01/24/2023] Open
Abstract
Genomic imprinting results in gene expression bias caused by parental chromosome of origin and occurs in genes with important roles during human brain development. However, the cell-type and temporal specificity of imprinting during human neurogenesis is generally unknown. By detecting within-donor allelic biases in chromatin accessibility and gene expression that are unrelated to cross-donor genotype, we inferred imprinting in both primary human neural progenitor cells and their differentiated neuronal progeny from up to 85 donors. We identified 43/20 putatively imprinted regulatory elements (IREs) in neurons/progenitors, and 133/79 putatively imprinted genes in neurons/progenitors. Although 10 IREs and 42 genes were shared between neurons and progenitors, most putative imprinting was only detected within specific cell types. In addition to well-known imprinted genes and their promoters, we inferred novel putative IREs and imprinted genes. Consistent with both DNA methylation-based and H3K27me3-based regulation of imprinted expression, some putative IREs also overlapped with differentially methylated or histone-marked regions. Finally, we identified a progenitor-specific putatively imprinted gene overlapping with copy number variation that is associated with uniparental disomy-like phenotypes. Our results can therefore be useful in interpreting the function of variants identified in future parent-of-origin association studies.
Affiliation(s)
- Dan Liang
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Folami Y Ideraabdullah
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Michael I Love
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
7
Cagirici HB, Andorf CM, Sen TZ. Co-expression pan-network reveals genes involved in complex traits within maize pan-genome. BMC Plant Biol 2022; 22:595. [PMID: 36529716 PMCID: PMC9762053 DOI: 10.1186/s12870-022-03985-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 12/07/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND With advances in high-throughput next-generation sequencing technologies, genome-wide association studies (GWAS) have identified a large set of variants associated with complex phenotypic traits at a very fine scale. Despite the progress in GWAS, identifying genotype-phenotype relationships remains challenging in maize because dozens of variants can control the same trait. Because causal variations result in changes in expression, gene expression analyses play a pivotal role in unraveling the transcriptional regulatory mechanisms behind the phenotypes. RESULTS To address these challenges, we incorporated gene expression and GWAS-driven traits to extend the knowledge of genotype-phenotype relationships and of the transcriptional regulatory mechanisms behind the phenotypes. We constructed a large collection of gene co-expression networks and identified more than 2 million co-expressing gene pairs in the GWAS-driven pan-network, which contains all the gene pairs in individual genomes of the nested association mapping (NAM) population. We defined four sub-categories for the pan-network: (1) the core-network contains the highest represented ~1% of the gene pairs, (2) the near-core network contains the next highest represented 1-5% of the gene pairs, (3) the private-network contains ~50% of the gene pairs that are unique to individual genomes, and (4) the dispensable-network contains the remaining 50-95% of the gene pairs in the maize pan-genome. Strikingly, the private-network contained almost all the genes in the pan-network but lacked half of the interactions. We performed gene ontology (GO) enrichment analysis for the pan-, core-, and private-networks and compared the contributions of variants overlapping with genes and promoters to the GWAS-driven pan-network. CONCLUSIONS Gene co-expression networks revealed meaningful information about groups of co-regulated genes that play a central role in regulatory processes. The pan-network approach enabled us to visualize a global view of the gene regulatory network for the studied system that could not be well inferred from the core-network alone.
Affiliation(s)
- H Busra Cagirici
- US Department of Agriculture - Agricultural Research Service, Crop Improvement Genetics Research Unit, Western Regional Research Center, 800 Buchanan St, Albany, CA, 94710, USA
| | - Carson M Andorf
- US Department of Agriculture - Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Iowa State University, Ames, IA, 50011, USA.
- Department of Computer Science, Iowa State University, Ames, IA, 50011, USA.
| | - Taner Z Sen
- US Department of Agriculture - Agricultural Research Service, Crop Improvement Genetics Research Unit, Western Regional Research Center, 800 Buchanan St, Albany, CA, 94710, USA.
- Department of Bioengineering, University of California, Berkeley, CA, 94720, USA.
8
Shi X, Li W, Guo Z, Wu M, Zhang X, Yuan L, Qiu X, Xing Y, Sun X, Xie H, Tang J. Comparative transcriptomic analysis of maize ear heterosis during the inflorescence meristem differentiation stage. BMC Plant Biol 2022; 22:348. [PMID: 35843937 PMCID: PMC9290290 DOI: 10.1186/s12870-022-03695-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2021] [Accepted: 06/08/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Heterosis is widely used in many crops and is important for global food safety, and maize is one of the most successful crops to take advantage of heterosis. Gene expression patterns control the development of the maize ear, but the mechanisms by which heterosis affects transcriptional-level control are not fully understood. RESULTS In this study, we sampled ear inflorescence meristems (IMs) from the single-segment substitution maize (Zea mays) line lx9801hlEW2b, which contains the heterotic locus hlEW2b associated with ear width, as well as the receptor parent lx9801, the test parent Zheng58, and their corresponding hybrids Zheng58 × lx9801hlEW2b (HY) and Zheng58 × lx9801 (CK). After RNA sequencing and transcriptomic analysis, 2531 unique differentially expressed genes (DEGs) were identified between the two hybrids (HY vs. CK). Our results showed that approximately 64% and 48% of DEGs exhibited additive expression in HY and CK, whereas the other genes displayed a non-additive expression pattern. The DEGs were significantly enriched in GO functional categories of multiple metabolic processes, plant organ morphogenesis, and hormone regulation. These essential processes are potentially associated with heterosis performance during the maize ear developmental stage. In particular, 125 and 100 DEGs from hybrids with allele-specific expression (ASE) were specifically identified in HY and CK, respectively. Comparison between the two hybrids suggested that ASE genes were involved in different development-related processes that may lead to the hybrid vigor phenotype during maize ear development. In addition, several critical genes involved in auxin metabolism and IM development were differentially expressed between the hybrids and showed various expression patterns (additive, non-additive, and ASE). Changes in the expression levels of these genes may lead to differences in auxin homeostasis in the IM, affecting the transcription of core genes such as WUS that control IM development. CONCLUSIONS Our research suggests that additive, non-additive, and allele-specific expression patterns may fine-tune the expression of crucial DEGs that modulate carbohydrate and protein metabolic processes, nitrogen assimilation, and auxin metabolism to optimal levels, and these transcriptional changes may play important roles in maize ear heterosis. The results provide new information that increases our understanding of the relationship between transcriptional variation and heterosis during maize ear development, which may be helpful for clarifying the genetic and molecular mechanisms of heterosis.
Affiliation(s)
- Xia Shi
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China
- Henan Institute of Crop Molecular Breeding, Henan Academy of Agricultural Sciences, Zhengzhou, 450002, China
| | - Weihua Li
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China.
| | - Zhanyong Guo
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China
| | - Mingbo Wu
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China
| | - Xiangge Zhang
- Henan Institute of Crop Molecular Breeding, Henan Academy of Agricultural Sciences, Zhengzhou, 450002, China
| | - Liang Yuan
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China
| | - Xiaoqian Qiu
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China
| | - Ye Xing
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China
| | - Xiaojing Sun
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China
| | - Huiling Xie
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China
| | - Jihua Tang
- National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou, 450002, China.
- The Shennong Laboratory, Zhengzhou, Henan, 450002, China.
| |
|
9
|
Kalita CA, Gusev A. DeCAF: a novel method to identify cell-type specific regulatory variants and their role in cancer risk. Genome Biol 2022; 23:152. [PMID: 35804456 PMCID: PMC9264694 DOI: 10.1186/s13059-022-02708-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Accepted: 06/15/2022] [Indexed: 01/09/2023] Open
Abstract
Here, we propose DeCAF (DEconvoluted cell type Allele specific Function), a new method to identify cell-fraction (cf) QTLs in tumors by leveraging both allelic and total expression information. Applying DeCAF to RNA-seq data from TCGA, we identify 3664 genes with cfQTLs (at 10% FDR) in 14 cell types, a 5.63× increase in discovery over conventional interaction-eQTL mapping. cfQTLs replicated in external cell-type-specific eQTL data are more enriched for cancer risk than conventional eQTLs. Our new method, DeCAF, empowers the discovery of biologically meaningful cfQTLs from bulk RNA-seq data in moderately sized studies.
Affiliation(s)
- Cynthia A. Kalita
- Division of Population Sciences, Dana–Farber Cancer Institute & Harvard Medical School, Boston, USA
| | - Alexander Gusev
- Division of Population Sciences, Dana–Farber Cancer Institute & Harvard Medical School, Boston, USA; The Broad Institute, Boston, USA; Division of Genetics, Brigham & Women’s Hospital, Boston, USA
| |
|
10
|
Mu W, Sarkar H, Srivastava A, Choi K, Patro R, Love MI. Airpart: interpretable statistical models for analyzing allelic imbalance in single-cell datasets. Bioinformatics 2022; 38:2773-2780. [PMID: 35561168 PMCID: PMC9113279 DOI: 10.1093/bioinformatics/btac212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Revised: 03/05/2022] [Accepted: 04/05/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Allelic expression analysis aids in detection of cis-regulatory mechanisms of genetic variation, which produce allelic imbalance (AI) in heterozygotes. Measuring AI in bulk data lacking time or spatial resolution has the limitation that cell-type-specific (CTS), spatial- or time-dependent AI signals may be dampened or not detected. RESULTS We introduce a statistical method airpart for identifying differential CTS AI from single-cell RNA-sequencing data, or dynamics AI from other spatially or time-resolved datasets. airpart outputs discrete partitions of data, pointing to groups of genes and cells under common mechanisms of cis-genetic regulation. In order to account for low counts in single-cell data, our method uses a Generalized Fused Lasso with Binomial likelihood for partitioning groups of cells by AI signal, and a hierarchical Bayesian model for AI statistical inference. In simulation, airpart accurately detected partitions of cell types by their AI and had lower Root Mean Square Error (RMSE) of allelic ratio estimates than existing methods. In real data, airpart identified differential allelic imbalance patterns across cell states and could be used to define trends of AI signal over spatial or time axes. AVAILABILITY AND IMPLEMENTATION The airpart package is available as an R/Bioconductor package at https://bioconductor.org/packages/airpart. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wancen Mu
- To whom correspondence should be addressed.
| | - Hirak Sarkar
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Rob Patro
- Department of Computer Science, University of Maryland, College Park, MD 20742, USA
| |
|
11
|
Sen A, Huo Y, Elster J, Zage PE, McVicker G. Allele-specific expression reveals genes with recurrent cis-regulatory alterations in high-risk neuroblastoma. Genome Biol 2022; 23:71. [PMID: 35246212 PMCID: PMC8896304 DOI: 10.1186/s13059-022-02640-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/16/2021] [Accepted: 02/23/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Neuroblastoma is a pediatric malignancy with a high frequency of metastatic disease at initial diagnosis. Neuroblastoma tumors have few recurrent protein-coding mutations but contain extensive somatic copy number alterations (SCNAs) suggesting that mutations that alter gene dosage are important drivers of tumorigenesis. Here, we analyze allele-specific expression in 96 high-risk neuroblastoma tumors to discover genes impacted by cis-acting mutations that alter dosage. RESULTS We identify 1043 genes with recurrent, neuroblastoma-specific allele-specific expression. While most of these genes lie within common SCNA regions, many of them exhibit allele-specific expression in copy neutral samples and these samples are enriched for mutations that are predicted to cause nonsense-mediated decay. Thus, both SCNA and non-SCNA mutations frequently alter gene expression in neuroblastoma. We focus on genes with neuroblastoma-specific allele-specific expression in the absence of SCNAs and find 26 such genes that have reduced expression in stage 4 disease. At least two of these genes have evidence for tumor suppressor activity including the transcription factor TFAP2B and the protein tyrosine phosphatase PTPRH. CONCLUSIONS In summary, our allele-specific expression analysis discovers genes that are recurrently dysregulated by both large SCNAs and other cis-acting mutations in high-risk neuroblastoma.
Affiliation(s)
- Arko Sen
- Integrative Biology Laboratory, Salk Institute for Biological Studies, La Jolla, California, USA
| | - Yuchen Huo
- Department of Pediatrics, Division of Hematology-Oncology, University of California San Diego, La Jolla, California, USA
| | - Jennifer Elster
- Department of Pediatrics, Division of Hematology-Oncology, University of California San Diego, La Jolla, California, USA.,Peckham Center for Cancer and Blood Disorders, Rady Children's Hospital-San Diego, San Diego, California, USA
| | - Peter E Zage
- Department of Pediatrics, Division of Hematology-Oncology, University of California San Diego, La Jolla, California, USA.,Peckham Center for Cancer and Blood Disorders, Rady Children's Hospital-San Diego, San Diego, California, USA
| | - Graham McVicker
- Integrative Biology Laboratory, Salk Institute for Biological Studies, La Jolla, California, USA.
| |
|
12
|
Wang X, Ke L, Wang S, Fu J, Xu J, Hao Y, Kang C, Guo W, Deng X, Xu Q. Variation burst during dedifferentiation and increased CHH-type DNA methylation after 30 years of in vitro culture of sweet orange. Horticulture Research 2022; 9:uhab036. [PMID: 35039837 PMCID: PMC8824543 DOI: 10.1093/hr/uhab036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2021] [Revised: 01/18/2022] [Accepted: 10/15/2021] [Indexed: 06/14/2023]
Abstract
Somaclonal variation arising from tissue culture may provide a valuable resource for the selection of new germplasm, but may not preserve true-to-type characteristics, which is a major concern for germplasm conservation or genome editing. The genomic changes associated with dedifferentiation and somaclonal variation during long-term in vitro culture are largely unknown. Sweet orange was one of the earliest plant species to be cultured in vitro and induced via somatic embryogenesis. We compared four sweet orange callus lines after 30 years of constant tissue culture with newly induced calli by comprehensively determining the single-nucleotide polymorphisms, copy number variations, transposable element insertions, and methylomic and transcriptomic changes. We identified a burst of variation during early dedifferentiation, including a retrotransposon outbreak, followed by a variation purge during long-term in vitro culture. Notably, CHH methylation showed a dynamic pattern, initially disappearing during dedifferentiation and then more than recovering after 30 years of in vitro culture. We also analyzed the effects of somaclonal variation on transcriptional reprogramming and found that subgenome dominance was evident in the tetraploid callus. We identified a retrotransposon insertion and DNA modification alterations in the potential regeneration-related gene CLAVATA3/EMBRYO SURROUNDING REGION-RELATED 16. This study provides the foundation to harness in vitro variation and offers a deeper understanding of the variation introduced by tissue culture during germplasm conservation, somatic embryogenesis, gene editing, and breeding programs.
Affiliation(s)
- Xia Wang
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Lili Ke
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Shuting Wang
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Jialing Fu
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Jidi Xu
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Yujin Hao
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Chunying Kang
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Wenwu Guo
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Xiuxin Deng
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| | - Qiang Xu
- Key Laboratory of Horticultural Plant Biology (Ministry of Education), Huazhong Agricultural University, No. 1, Shizishan Street, Wuhan 430070, China
| |
|
13
|
Chen B, Scurrah CR, McKinley ET, Simmons AJ, Ramirez-Solano MA, Zhu X, Markham NO, Heiser CN, Vega PN, Rolong A, Kim H, Sheng Q, Drewes JL, Zhou Y, Southard-Smith AN, Xu Y, Ro J, Jones AL, Revetta F, Berry LD, Niitsu H, Islam M, Pelka K, Hofree M, Chen JH, Sarkizova S, Ng K, Giannakis M, Boland GM, Aguirre AJ, Anderson AC, Rozenblatt-Rosen O, Regev A, Hacohen N, Kawasaki K, Sato T, Goettel JA, Grady WM, Zheng W, Washington MK, Cai Q, Sears CL, Goldenring JR, Franklin JL, Su T, Huh WJ, Vandekar S, Roland JT, Liu Q, Coffey RJ, Shrubsole MJ, Lau KS. Differential pre-malignant programs and microenvironment chart distinct paths to malignancy in human colorectal polyps. Cell 2021; 184:6262-6280.e26. [PMID: 34910928 PMCID: PMC8941949 DOI: 10.1016/j.cell.2021.11.031] [Citation(s) in RCA: 115] [Impact Index Per Article: 38.3] [Received: 12/16/2020] [Revised: 07/22/2021] [Accepted: 11/17/2021] [Indexed: 12/15/2022]
Abstract
Colorectal cancers (CRCs) arise from precursor polyps whose cellular origins, molecular heterogeneity, and immunogenic potential may reveal diagnostic and therapeutic insights when analyzed at high resolution. We present a single-cell transcriptomic and imaging atlas of the two most common human colorectal polyps, conventional adenomas and serrated polyps, and their resulting CRC counterparts. Integrative analysis of 128 datasets from 62 participants reveals adenomas arise from WNT-driven expansion of stem cells, while serrated polyps derive from differentiated cells through gastric metaplasia. Metaplasia-associated damage is coupled to a cytotoxic immune microenvironment preceding hypermutation, driven partly by antigen-presentation differences associated with tumor cell-differentiation status. Microsatellite unstable CRCs contain distinct non-metaplastic regions where tumor cells acquire stem cell properties and cytotoxic immune cells are depleted. Our multi-omic atlas provides insights into malignant progression of colorectal polyps and their microenvironment, serving as a framework for precision surveillance and prevention of CRC.
Affiliation(s)
- Bob Chen
- Program in Chemical and Physical Biology, Vanderbilt University School of Medicine, Nashville, TN, USA; Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Cherie' R Scurrah
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Eliot T McKinley
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Alan J Simmons
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Marisol A Ramirez-Solano
- Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Xiangzhu Zhu
- Vanderbilt-Ingram Cancer Center, Nashville, TN, USA; Department of Medicine, Division of Epidemiology, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Nicholas O Markham
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Medicine, Division of Gastroenterology, Hepatology and Nutrition, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Veterans Affairs, Tennessee Valley Healthcare System, Nashville, TN, USA; Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Cody N Heiser
- Program in Chemical and Physical Biology, Vanderbilt University School of Medicine, Nashville, TN, USA; Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Paige N Vega
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Andrea Rolong
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Hyeyon Kim
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Quanhu Sheng
- Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Julia L Drewes
- Department of Medicine, Division of Infectious Diseases, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Yuan Zhou
- Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Austin N Southard-Smith
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Yanwen Xu
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - James Ro
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Angela L Jones
- Vanderbilt Technologies for Advanced Genomics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Frank Revetta
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Lynne D Berry
- Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Hiroaki Niitsu
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Medicine, Division of Gastroenterology, Hepatology and Nutrition, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Mirazul Islam
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Karin Pelka
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA; Massachusetts General Hospital Cancer Center, Harvard Medical School, Boston, MA, USA
| | - Matan Hofree
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jonathan H Chen
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA; Massachusetts General Hospital Cancer Center, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
| | - Siranush Sarkizova
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
| | - Kimmie Ng
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Marios Giannakis
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Genevieve M Boland
- Massachusetts General Hospital Cancer Center, Harvard Medical School, Boston, MA, USA; Department of Surgery, Massachusetts General Hospital, Boston, MA, USA
| | - Andrew J Aguirre
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ana C Anderson
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA
| | - Aviv Regev
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA; Howard Hughes Medical Institute and Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Nir Hacohen
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA; Massachusetts General Hospital Cancer Center, Harvard Medical School, Boston, MA, USA; Department of Immunology, Harvard Medical School, Boston, MA, USA
| | - Kenta Kawasaki
- Department of Organoid Medicine, Keio University School of Medicine, Tokyo, Japan
| | - Toshiro Sato
- Department of Organoid Medicine, Keio University School of Medicine, Tokyo, Japan
| | - Jeremy A Goettel
- Department of Medicine, Division of Gastroenterology, Hepatology and Nutrition, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - William M Grady
- Clinical Research Division, Fred Hutchinson Cancer Research Center, and Gastroenterology Division, University of Washington School of Medicine, Seattle, WA, USA
| | - Wei Zheng
- Vanderbilt-Ingram Cancer Center, Nashville, TN, USA; Department of Medicine, Division of Epidemiology, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - M Kay Washington
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Qiuyin Cai
- Vanderbilt-Ingram Cancer Center, Nashville, TN, USA; Department of Medicine, Division of Epidemiology, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Cynthia L Sears
- Department of Medicine, Division of Infectious Diseases, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - James R Goldenring
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Vanderbilt-Ingram Cancer Center, Nashville, TN, USA; Department of Surgery, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jeffrey L Franklin
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA; Vanderbilt-Ingram Cancer Center, Nashville, TN, USA; Department of Medicine, Division of Gastroenterology, Hepatology and Nutrition, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Timothy Su
- Department of Medicine, Division of Epidemiology, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Won Jae Huh
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Simon Vandekar
- Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Joseph T Roland
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Surgery, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Qi Liu
- Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Robert J Coffey
- Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Vanderbilt-Ingram Cancer Center, Nashville, TN, USA; Department of Medicine, Division of Gastroenterology, Hepatology and Nutrition, Vanderbilt University Medical Center, Nashville, TN, USA.
| | - Martha J Shrubsole
- Vanderbilt-Ingram Cancer Center, Nashville, TN, USA; Department of Medicine, Division of Epidemiology, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA.
| | - Ken S Lau
- Program in Chemical and Physical Biology, Vanderbilt University School of Medicine, Nashville, TN, USA; Epithelial Biology Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA; Vanderbilt-Ingram Cancer Center, Nashville, TN, USA; Department of Surgery, Vanderbilt University Medical Center, Nashville, TN, USA.
| |
|
14
|
Valentini S, Marchioretti C, Bisio A, Rossi A, Zaccara S, Romanel A, Inga A. TranSNPs: A class of functional SNPs affecting mRNA translation potential revealed by fraction-based allelic imbalance. iScience 2021; 24:103531. [PMID: 34917903 PMCID: PMC8666669 DOI: 10.1016/j.isci.2021.103531] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/08/2021] [Revised: 10/27/2021] [Accepted: 11/23/2021] [Indexed: 12/23/2022] Open
Abstract
Few studies have explored the association between SNPs and alterations in mRNA translation potential. We developed an approach to identify SNPs that can mark allele-specific protein expression levels and could represent sources of inter-individual variation in disease risk. Using MCF7 cells under different treatments, we performed polysomal profiling followed by RNA sequencing of total or polysome-associated mRNA fractions and designed a computational approach to identify SNPs showing a significant change in the allelic balance between total and polysomal mRNA fractions. We identified 147 SNPs, 39 of which are located in UTRs. Allele-specific differences at the translation level were confirmed in transfected MCF7 cells by reporter assays. Exploiting breast cancer data from TCGA, we identified UTR SNPs demonstrating distinct prognosis features and altering binding sites of RNA-binding proteins. Our approach produced a catalog of tranSNPs, a class of functional SNPs associated with allele-specific translation and potentially endowed with prognostic value for disease risk.
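To illustrate the kind of comparison this abstract describes, the sketch below tests whether the allelic balance at a single SNP differs between the total and polysomal mRNA fractions. It uses a two-sided Fisher's exact test on a 2x2 table of allele read counts. The choice of test, the function name `fisher_exact_two_sided`, and the read counts are assumptions made for illustration; the abstract does not state which statistical test the authors used.

```python
from math import comb


def fisher_exact_two_sided(ref_total, alt_total, ref_poly, alt_poly):
    """Two-sided Fisher's exact test on a 2x2 table of allele read counts.

    Rows are mRNA fractions (total, polysomal); columns are alleles (ref, alt).
    This is an illustrative stand-in, not the method from the paper.
    """
    row1 = ref_total + alt_total   # reads in the total fraction
    row2 = ref_poly + alt_poly     # reads in the polysomal fraction
    col1 = ref_total + ref_poly    # reference-allele reads overall
    denom = comb(row1 + row2, col1)

    # Hypergeometric probability of every table with the same margins,
    # indexed by the top-left cell x.
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / denom
        for x in range(lo, hi + 1)
    }

    p_obs = probs[ref_total]
    # Sum the probabilities of all tables at least as unlikely as the observed
    # one; the small tolerance guards against floating-point ties.
    p_value = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical read counts at one heterozygous SNP.
p = fisher_exact_two_sided(ref_total=60, alt_total=40, ref_poly=35, alt_poly=65)
print(f"p-value for a shift in allelic balance: {p:.4g}")
```

In a genome-wide screen the resulting p-values would also need multiple-testing correction, which this sketch omits.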
Affiliation(s)
- Samuel Valentini
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
| | - Caterina Marchioretti
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
- Department of Biomedical Sciences (DBS), University of Padova, 35131 Padova, Italy
| | - Alessandra Bisio
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
| | - Annalisa Rossi
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
| | - Sara Zaccara
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
- Weill Medical College, Cornell University, New York 10065, NY, USA
| | - Alessandro Romanel
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
| | - Alberto Inga
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, 38123 Trento, Italy
| |
|
15
|
Sherbina K, León-Novelo LG, Nuzhdin SV, McIntyre LM, Marroni F. Power calculator for detecting allelic imbalance using hierarchical Bayesian model. BMC Res Notes 2021; 14:436. [PMID: 34838135 PMCID: PMC8626927 DOI: 10.1186/s13104-021-05851-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2021] [Accepted: 11/15/2021] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVE Allelic imbalance (AI) is the differential expression of the two alleles in a diploid. AI can vary between tissues, treatments, and environments. Methods for testing AI exist, but methods are needed to estimate type I error and power for detecting AI and differences in AI between conditions. As the costs of the technology plummet, what is more important: reads or replicates? RESULTS We find that a minimum of 2400, 480, and 240 allele-specific reads divided equally among 12, 5, and 3 replicates is needed to detect a 10, 20, and 30% deviation from allelic balance, respectively, in a condition with power > 80%. A minimum of 960 and 240 allele-specific reads divided equally among 8 replicates is needed to detect a 20 or 30% difference in AI between conditions with comparable power. Higher numbers of replicates increase power more than adding coverage without affecting type I error. We provide a Python package that enables simulation of AI scenarios and enables individuals to estimate type I error and power in detecting AI and differences in AI between conditions.
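The reads-versus-replicates question in this abstract can be explored with a small Monte Carlo simulation. The sketch below is not the authors' package and does not use their hierarchical Bayesian model. It assumes a simpler setup: each replicate's true allelic ratio varies around the target with beta-distributed noise, reads are sampled binomially, and the rejection threshold is calibrated empirically under the null. The function names, the `concentration` parameter, and the example settings are all hypothetical.

```python
import random


def mean_abs_deviation(rng, reads_per_rep, n_reps, ratio, concentration):
    """Simulate one experiment and return |mean allelic ratio - 0.5|."""
    total = 0.0
    for _ in range(n_reps):
        # Replicate-level ratio varies around `ratio` (assumed beta noise).
        p = rng.betavariate(ratio * concentration, (1.0 - ratio) * concentration)
        # Reads supporting allele 1 in this replicate (binomial sampling).
        k = sum(rng.random() < p for _ in range(reads_per_rep))
        total += k / reads_per_rep
    return abs(total / n_reps - 0.5)


def estimate_power(total_reads, n_reps, ratio, concentration=50.0,
                   alpha=0.05, n_sim=500, seed=0):
    """Estimate power to detect `ratio` != 0.5 with a fixed read budget."""
    rng = random.Random(seed)
    reads_per_rep = total_reads // n_reps

    # Calibrate the critical value from simulations under allelic balance,
    # so the type I error is approximately `alpha` by construction.
    null_stats = sorted(
        mean_abs_deviation(rng, reads_per_rep, n_reps, 0.5, concentration)
        for _ in range(n_sim)
    )
    critical = null_stats[min(n_sim - 1, int((1.0 - alpha) * n_sim))]

    # Power is the fraction of simulations under the alternative that exceed it.
    alt_stats = [
        mean_abs_deviation(rng, reads_per_rep, n_reps, ratio, concentration)
        for _ in range(n_sim)
    ]
    return sum(s > critical for s in alt_stats) / n_sim


# Same read budget split across different numbers of replicates.
for n_reps in (3, 6, 12):
    print(n_reps, estimate_power(total_reads=480, n_reps=n_reps, ratio=0.6))
```

Because replicate-level noise does not shrink with deeper sequencing, this toy model tends to reward additional replicates over additional reads, which is the qualitative pattern the abstract reports. The numbers it prints are not comparable to the paper's thresholds.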
Affiliation(s)
- Katrina Sherbina
- Quantitative and Computational Biology Section, University of Southern California, Los Angeles, CA, 90046, USA
| | - Luis G León-Novelo
- Department of Biostatistics and Data Science, The University of Texas Health Science Center at Houston-School of Public Health, Houston, TX, 77030, USA
| | - Sergey V Nuzhdin
- Molecular and Computational Biology Section, University of Southern California, Los Angeles, CA, 90046, USA
| | - Lauren M McIntyre
- Genetics Institute and Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, 32603, USA
| | - Fabio Marroni
- Dipartimento di Scienze Agroalimentari, Ambientali e Animali, Università di Udine, 33100, Udine, Italy.
| |
|
16
|
Abstract
Diploidy has profound implications for population genetics and susceptibility to genetic diseases. Although two copies are present for most genes in the human genome, they are not necessarily both active or active at the same level in a given individual. Genomic imprinting, resulting in exclusive or biased expression in favor of the allele of paternal or maternal origin, is now believed to affect hundreds of human genes. A far greater number of genes display unequal expression of gene copies due to cis-acting genetic variants that perturb gene expression. The availability of data generated by RNA sequencing applied to large numbers of individuals and tissue types has generated unprecedented opportunities to assess the contribution of genetic variation to allelic imbalance in gene expression. Here we review the insights gained through the analysis of these data about the extent of the genetic contribution to allelic expression imbalance, the tools and statistical models for gene expression imbalance, and what the results obtained reveal about the contribution of genetic variants that alter gene expression to complex human diseases and phenotypes.
Affiliation(s)
- Siobhan Cleary
- School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway H91 H3CY, Ireland
| | - Cathal Seoighe
- School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway H91 H3CY, Ireland
|
17
|
Mendelevich A, Vinogradova S, Gupta S, Mironov AA, Sunyaev SR, Gimelbrant AA. Replicate sequencing libraries are important for quantification of allelic imbalance. Nat Commun 2021; 12:3370. [PMID: 34099647 PMCID: PMC8184992 DOI: 10.1038/s41467-021-23544-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/19/2020] [Accepted: 04/30/2021] [Indexed: 12/13/2022] Open
Abstract
A sensitive approach to quantitative analysis of transcriptional regulation in diploid organisms is analysis of allelic imbalance (AI) in RNA sequencing (RNA-seq) data. A near-universal practice in such studies is to prepare and sequence only one library per RNA sample. We present theoretical and experimental evidence that data from a single RNA-seq library is insufficient for reliable quantification of the contribution of technical noise to the observed AI signal; consequently, reliance on one-replicate experimental design can lead to unaccounted-for variation in error rates in allele-specific analysis. We develop a computational approach, Qllelic, that accurately accounts for technical noise by making use of replicate RNA-seq libraries. Testing on new and existing datasets shows that application of Qllelic greatly decreases false positive rate in allele-specific analysis while conserving appropriate signal, and thus greatly improves reproducibility of AI estimates. We explore sources of technical overdispersion in observed AI signal and conclude by discussing design of RNA-seq studies addressing two biologically important questions: quantification of transcriptome-wide AI in one sample, and differential analysis of allele-specific expression between samples.
Affiliation(s)
- Asia Mendelevich
- Skolkovo Institute of Science and Technology, Moscow, Russia.
- Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA.
| | - Svetlana Vinogradova
- Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
| | - Saumya Gupta
- Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA
- Broad Institute of Harvard and MIT, Cambridge, USA
| | - Andrey A Mironov
- Lomonosov Moscow State University, Faculty of Bioengineering and Bioinformatics, Moscow, Russia
- Institute of Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
| | - Shamil R Sunyaev
- Department of Biomedical Informatics, Harvard Medical School, Boston, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, USA
| | - Alexander A Gimelbrant
- Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, USA.
- Broad Institute of Harvard and MIT, Cambridge, USA.
|
18
|
Miller BR, Morse AM, Borgert JE, Liu Z, Sinclair K, Gamble G, Zou F, Newman JRB, León-Novelo LG, Marroni F, McIntyre LM. Testcrosses are an efficient strategy for identifying cis-regulatory variation: Bayesian analysis of allele-specific expression (BayesASE). G3 (Bethesda) 2021; 11:jkab096. [PMID: 33772539 PMCID: PMC8104932 DOI: 10.1093/g3journal/jkab096] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/19/2021] [Accepted: 03/10/2021] [Indexed: 12/30/2022]
Abstract
Allelic imbalance (AI) occurs when alleles in a diploid individual are differentially expressed and indicates cis-acting regulatory variation. What is the distribution of allelic effects in a natural population? Are all alleles the same? Are all alleles distinct? The approach described applies to any technology generating allele-specific sequence counts, for example for chromatin accessibility, and can be applied generally, including to comparisons between tissues or environments for the same genotype. Tests of allelic effect are generally performed by crossing individuals and comparing expression between alleles directly in the F1. However, a crossing scheme that compares alleles pairwise is a prohibitive cost for more than a handful of alleles as the number of crosses is at least (n² - n)/2, where n is the number of alleles. We show here that a testcross design followed by a hypothesis test of AI between testcrosses can be used to infer differences between nontester alleles, allowing n alleles to be compared with n crosses. Using a mouse data set where both testcrosses and direct comparisons have been performed, we show that the predicted differences between nontester alleles are validated at levels of over 90% when a parent-of-origin effect is present and of 60%-80% overall. Power considerations for a testcross are similar to those in a reciprocal cross. In all applications, the testing for AI involves several complex bioinformatics steps. BayesASE is a complete bioinformatics pipeline that incorporates state-of-the-art error reduction techniques and a flexible Bayesian approach to estimating AI and formally comparing levels of AI between conditions. The modular structure of BayesASE has been packaged in Galaxy, made available in Nextflow and as a collection of scripts for the SLURM workload manager on GitHub (https://github.com/McIntyre-Lab/BayesASE).
Affiliation(s)
- Brecca R Miller
- Genetics Institute, University of Florida, Gainesville, FL 32608, USA
- NYU Langone Health, New York University, New York, NY 10013, USA
| | - Alison M Morse
- Genetics Institute, University of Florida, Gainesville, FL 32608, USA
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32608, USA
| | - Jacqueline E Borgert
- Genetics Institute, University of Florida, Gainesville, FL 32608, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27515, USA
| | - Zihao Liu
- Genetics Institute, University of Florida, Gainesville, FL 32608, USA
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32608, USA
| | - Kelsey Sinclair
- Genetics Institute, University of Florida, Gainesville, FL 32608, USA
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32608, USA
| | - Gavin Gamble
- Genetics Institute, University of Florida, Gainesville, FL 32608, USA
| | - Fei Zou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27515, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27515, USA
| | - Jeremy R B Newman
- Genetics Institute, University of Florida, Gainesville, FL 32608, USA
- Department of Pathology, University of Florida, Gainesville, FL 32608 USA
| | - Luis G León-Novelo
- Department of Biostatistics and Data Science, University of Texas Health Science Center at Houston-University of Texas School of Public Health, Houston, TX 77030, USA
| | - Fabio Marroni
- Department of Agricultural, Food, Environmental and Animal Sciences, University of Udine, Udine, 33100, Italy
| | - Lauren M McIntyre
- Genetics Institute, University of Florida, Gainesville, FL 32608, USA
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL 32608, USA
|
19
|
Pespeni MH, Moczek AP. Signals of selection beyond bottlenecks between exotic populations of the bull-headed dung beetle, Onthophagus taurus. Evol Dev 2021; 23:86-99. [PMID: 33522675 DOI: 10.1111/ede.12367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2019] [Revised: 12/02/2020] [Accepted: 12/15/2020] [Indexed: 11/27/2022]
Abstract
Colonization of new environments can lead to population bottlenecks and rapid phenotypic evolution that could be due to neutral and selective processes. Exotic populations of the bull-headed dung beetle (Onthophagus taurus) have differentiated in opposite directions from native beetles in male horn-to-body size allometry and female fecundity. Here we test for genetic and transcriptional differences among two exotic and one native O. taurus populations after three generations in common garden conditions. We sequenced RNA from 24 individuals for each of the three populations including both sexes, and spanning four developmental stages for the two exotic, differentiated populations. Identifying 270,400 high-quality single nucleotide polymorphisms, we revealed a strong signal of genetic differentiation between the three populations, and evidence of recent bottlenecks within and an excess of outlier loci between exotic populations. Differences in gene expression between populations were greatest in prepupae and early adult life stages, stages during which differences in male horn development and female fecundity manifest. Finally, genes differentially expressed between exotic populations also had greater genetic differentiation and performed functions related to chitin biosynthesis and nutrient sensing, possibly underlying allometry and fecundity trait divergences. Our results suggest that beyond bottlenecks, recent introductions have led to genetic and transcriptional differences in genes correlated with observed phenotypic differences.
Affiliation(s)
- Melissa H Pespeni
- Department of Biology, University of Vermont, Burlington, Vermont, USA; Department of Biology, Indiana University, Bloomington, Indiana, USA
| | - Armin P Moczek
- Department of Biology, Indiana University, Bloomington, Indiana, USA
|
20
|
aScan: A Novel Method for the Study of Allele Specific Expression in Single Individuals. J Mol Biol 2021; 433:166829. [PMID: 33508309 DOI: 10.1016/j.jmb.2021.166829] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/11/2020] [Revised: 01/08/2021] [Accepted: 01/09/2021] [Indexed: 02/06/2023]
Abstract
In diploid organisms, two copies of each allele are normally inherited from parents. Paternal and maternal alleles can be regulated and expressed unequally, which is referred to as allele-specific expression (ASE). In this work, we present aScan, a novel method for the identification of ASE from the analysis of matched individual genomic and RNA sequencing data. By performing extensive analyses of both real and simulated data, we demonstrate that aScan can correctly identify ASE with high accuracy and sensitivity in different experimental settings. Additionally, by applying our method to a small cohort of individuals that are not included in publicly available databases of human genetic variation, we outline the value of possible applications of ASE analysis in single individuals for deriving a more accurate annotation of "private" low-frequency genetic variants associated with regulatory effects on transcription. All in all, we believe that aScan will represent a beneficial addition to the set of bioinformatics tools for the analysis of ASE. Finally, while our method was initially conceived for the analysis of RNA-seq data, it can in principle be applied to any quantitative NGS assay for which matched genotypic and expression data are available. AVAILABILITY: aScan is currently available in the form of an open source standalone software package at: https://github.com/Federico77z/aScan/. aScan version 1.0.3, available at https://github.com/Federico77z/aScan/releases/tag/1.0.3, has been used for all the analyses included in this manuscript. A Docker image of the tool has also been made available at https://github.com/pmandreoli/aScanDocker.
|
21
|
Tangwancharoen S, Semmens BX, Burton RS. Allele-Specific Expression and Evolution of Gene Regulation Underlying Acute Heat Stress Response and Local Adaptation in the Copepod Tigriopus californicus. J Hered 2020; 111:539-547. [PMID: 33141173 DOI: 10.1093/jhered/esaa044] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/21/2020] [Accepted: 10/26/2020] [Indexed: 01/02/2023] Open
Abstract
Geographic variation in environmental temperature can select for local adaptation among conspecific populations. Divergence in gene expression across the transcriptome is a key mechanism for evolution of local thermal adaptation in many systems, yet the genetic mechanisms underlying this regulatory evolution remain poorly understood. Here we examine gene expression in 2 locally adapted Tigriopus californicus populations (heat tolerant San Diego, SD, and less tolerant Santa Cruz, SC) and their F1 hybrids during acute heat stress response. Allele-specific expression (ASE) in F1 hybrids was used to determine cis-regulatory divergence. We found that the number of genes showing significant allelic imbalance increased under heat stress compared to unstressed controls. This suggests that there is significant population divergence in cis-regulatory elements underlying heat stress response. Specifically, the number of genes showing an excess of transcripts from the more thermal tolerant (SD) population increased with heat stress while that number of genes with an SC excess was similar in both treatments. Inheritance patterns of gene expression also revealed that genes displaying SD-dominant expression phenotypes increase in number in response to heat stress; that is, across loci, gene expression in F1's following heat stress showed more similarity to SD than SC, a pattern that was absent in the control treatment. The observed patterns of ASE and inheritance of gene expression provide insight into the complex processes underlying local adaptation and thermal stress response.
Affiliation(s)
- Sumaetee Tangwancharoen
- Marine Biology Research Division, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA; Department of Biology, University of Vermont, Burlington, VT
| | - Brice X Semmens
- Marine Biology Research Division, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA
| | - Ronald S Burton
- Marine Biology Research Division, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA
|
22
|
Demirdjian L, Xu Y, Bahrami-Samani E, Pan Y, Stein S, Xie Z, Park E, Wu YN, Xing Y. Detecting Allele-Specific Alternative Splicing from Population-Scale RNA-Seq Data. Am J Hum Genet 2020; 107:461-472. [PMID: 32781045 PMCID: PMC7477012 DOI: 10.1016/j.ajhg.2020.07.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/22/2019] [Accepted: 07/10/2020] [Indexed: 12/20/2022] Open
Abstract
RNA sequencing (RNA-seq) is a powerful technology for studying human transcriptome variation. We introduce PAIRADISE (Paired Replicate Analysis of Allelic Differential Splicing Events), a method for detecting allele-specific alternative splicing (ASAS) from RNA-seq data. Unlike conventional approaches that detect ASAS events one sample at a time, PAIRADISE aggregates ASAS signals across multiple individuals in a population. By treating the two alleles of an individual as paired, and multiple individuals sharing a heterozygous SNP as replicates, we formulate ASAS detection using PAIRADISE as a statistical problem for identifying differential alternative splicing from RNA-seq data with paired replicates. PAIRADISE outperforms alternative statistical models in simulation studies. Applying PAIRADISE to replicate RNA-seq data of a single individual and to population-scale RNA-seq data across many individuals, we detect ASAS events associated with genome-wide association study (GWAS) signals of complex traits or diseases. Additionally, PAIRADISE ASAS analysis detects the effects of rare variants on alternative splicing. PAIRADISE provides a useful computational tool for elucidating the genetic variation and phenotypic association of alternative splicing in populations.
|
23
|
Fan KH, Devos KM, Schliekelman P. Strategies for eQTL mapping in allopolyploid organisms. Theor Appl Genet 2020; 133:2477-2497. [PMID: 32462429 DOI: 10.1007/s00122-020-03612-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2019] [Accepted: 05/15/2020] [Indexed: 06/11/2023]
Abstract
KEY MESSAGE This study uses simulations to explore statistical power and false-positive rates for eQTL mapping in allopolyploid organisms and provides guidelines to apply eQTL mapping in these organisms. In recent years, RNA-seq has become the dominant technology for eQTL studies. However, most work has been in diploid organisms. Many species of economic and environmental importance are polyploid, and approaches for eQTL mapping in polyploids are not well developed. High similarity between duplicated genes in polyploids will cause misassignment of sequence reads and may cause false-positive results and/or lack of power to detect eQTL. In this paper, we first explore the similarity of homoeologous transcripts in polyploid organisms. We find that 5-20% of genes (varying with organism) in important agricultural plants such as wheat, soybean, and switchgrass are not sufficiently diverged between duplicated genomes to allow unambiguous assignment of reads. Second, we examine the impact of misassigned reads on eQTL mapping and show that both false-positive and false-negative rates can be greatly inflated. Third, we compare four strategies for dealing with ambiguous reads: (1) dividing ambiguous reads evenly between homoeologous transcripts, (2) assigning them proportionally, (3) using all reads for all genes, and (4) discarding ambiguous reads. We find that the strategy of discarding ambiguous reads gives the best balance of false-positive and false-negative rates for most genes. However, for genes that are very similar between genomes, using all reads is the only choice. This leads to reduced power, but false-positive rates will be maintained. We also discuss QTL mapping in polyploids using allele-specific expression (ASE) and show how the proportion of ASE-informative reads varies according to the divergence between homoeologous genes.
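The four strategies for ambiguous reads compared in this abstract can be written down for a single pair of homoeologous transcripts. The sketch below is only an illustration and is not the authors' simulation code. In particular, "proportionally" is assumed here to mean in proportion to the uniquely assigned reads, which the abstract does not specify, and the function and strategy names are hypothetical.

```python
def assign_ambiguous(unique_a, unique_b, ambiguous, strategy):
    """Return (count_a, count_b) for two homoeologs under one strategy.

    unique_a, unique_b: reads assigned unambiguously to each homoeolog.
    ambiguous: reads that map equally well to both.
    """
    if strategy == "even":
        # (1) Divide ambiguous reads evenly between the two transcripts.
        return unique_a + ambiguous / 2, unique_b + ambiguous / 2
    if strategy == "proportional":
        # (2) Assumed: split in proportion to the unique read counts.
        total = unique_a + unique_b
        frac_a = unique_a / total if total else 0.5
        return unique_a + ambiguous * frac_a, unique_b + ambiguous * (1 - frac_a)
    if strategy == "all":
        # (3) Use all reads for all genes: ambiguous reads count for both.
        return unique_a + ambiguous, unique_b + ambiguous
    if strategy == "discard":
        # (4) Discard ambiguous reads entirely.
        return unique_a, unique_b
    raise ValueError(f"unknown strategy: {strategy}")


for name in ("even", "proportional", "all", "discard"):
    print(name, assign_ambiguous(30, 10, 20, name))
```

When the two homoeologs are nearly identical, `unique_a` and `unique_b` approach zero and the "discard" strategy leaves almost no data, which matches the abstract's remark that using all reads is then the only choice.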
Affiliation(s)
- Kang-Hsien Fan
- Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Katrien M Devos
- Department of Crop and Soil Sciences, Institute of Plant Breeding, Genetics and Genomics, University of Georgia, Athens, GA, USA
|
24
|
Dong L, Wang J, Wang G. BYASE: a Python library for estimating gene and isoform level allele-specific expression. Bioinformatics 2020; 36:4955-4956. [DOI: 10.1093/bioinformatics/btaa636] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/10/2020] [Revised: 05/12/2020] [Accepted: 07/10/2020] [Indexed: 12/15/2022] Open
Abstract
Summary
Allele-specific expression (ASE) is involved in many important biological mechanisms. We present a python package BYASE and its graphical user interface (GUI) tool BYASE-GUI for the identification of ASE from single-end and paired-end RNA-seq data based on Bayesian inference, which can simultaneously report differences in gene-level and isoform-level expression. BYASE uses both phased SNPs and non-phased SNPs, and supports polyploid organisms.
Availability and implementation
The source codes of BYASE and BYASE-GUI are freely available at https://github.com/ncjllld/byase and https://github.com/ncjllld/byase_gui.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lili Dong
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Jianan Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
| | - Guohua Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, Heilongjiang 150001, China
|
25
|
Zhou Z, Xu B, Minn A, Zhang NR. DENDRO: genetic heterogeneity profiling and subclone detection by single-cell RNA sequencing. Genome Biol 2020; 21:10. [PMID: 31937348 PMCID: PMC6961311 DOI: 10.1186/s13059-019-1922-x] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 09/16/2019] [Accepted: 12/16/2019] [Indexed: 12/18/2022] Open
Abstract
Although scRNA-seq is now ubiquitously adopted in studies of intratumor heterogeneity, detection of somatic mutations and inference of clonal membership from scRNA-seq is currently unreliable. We propose DENDRO, an analysis method for scRNA-seq data that clusters single cells into genetically distinct subclones and reconstructs the phylogenetic tree relating the subclones. DENDRO utilizes transcribed point mutations and accounts for technical noise and expression stochasticity. We benchmark DENDRO and demonstrate its application on simulation data and real data from three cancer types. In particular, on a mouse melanoma model in response to immunotherapy, DENDRO delineates the role of neoantigens in treatment response.
Affiliation(s)
- Zilu Zhou
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA USA
- Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA USA
| | - Bihui Xu
- Department of Radiation Oncology, Parker Institute for Cancer Immunotherapy, Abramson Family Cancer Research Institute, Graduate Group in Cell and Molecular Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA
| | - Andy Minn
- Department of Radiation Oncology, Parker Institute for Cancer Immunotherapy, Abramson Family Cancer Research Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA USA
| | - Nancy R. Zhang
- Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA USA
|
26
|
Xie J, Ji T, Ferreira MAR, Li Y, Patel BN, Rivera RM. Modeling allele-specific expression at the gene and SNP levels simultaneously by a Bayesian logistic mixed regression model. BMC Bioinformatics 2019; 20:530. [PMID: 31660858 PMCID: PMC6819473 DOI: 10.1186/s12859-019-3141-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/03/2019] [Accepted: 10/09/2019] [Indexed: 12/29/2022] Open
Abstract
Background High-throughput sequencing experiments, which can determine allele origins, have been used to assess genome-wide allele-specific expression. Despite the amount of data generated from high-throughput experiments, statistical methods are often too simplistic to understand the complexity of gene expression. Specifically, existing methods do not test allele-specific expression (ASE) of a gene as a whole and variation in ASE within a gene across exons separately and simultaneously. Results We propose a generalized linear mixed model to close these gaps, incorporating variations due to genes, single nucleotide polymorphisms (SNPs), and biological replicates. To improve reliability of statistical inferences, we assign priors on each effect in the model so that information is shared across genes in the entire genome. We utilize Bayesian model selection to test the hypothesis of ASE for each gene and variations across SNPs within a gene. We apply our method to four tissue types in a bovine study to de novo detect ASE genes in the bovine genome, and uncover intriguing predictions of regulatory ASEs across gene exons and across tissue types. We compared our method to competing approaches through simulation studies that mimicked the real datasets. The R package, BLMRM, that implements our proposed algorithm, is publicly available for download at https://github.com/JingXieMIZZOU/BLMRM. Conclusions We will show that the proposed method exhibits improved control of the false discovery rate and improved power over existing methods when SNP variation and biological variation are present. Besides, our method also maintains low computational requirements that allows for whole genome analysis.
|
27
|
Abstract
Allelic imbalance occurs when the two alleles of a gene are differentially expressed within a diploid organism, and can indicate important differences in cis-regulation and epigenetic state across the two chromosomes. Because of this, the ability to accurately quantify the proportion at which each allele of a gene is expressed is of great interest to researchers. This becomes challenging in the presence of small read counts and/or sample sizes, which can cause estimates for allelic expression proportions to have high variance. Investigators have traditionally dealt with this problem by filtering out genes with small counts and samples. However, this may inadvertently remove important genes that have truly large allelic imbalances. Another option is to use Bayesian estimators to reduce the variance. To this end, we evaluated the accuracy of three different estimators, the latter two of which are Bayesian shrinkage estimators: maximum likelihood, approximate posterior estimation of GLM coefficients (apeglm) and adaptive shrinkage (ash). We also wrote C++ code to quickly calculate ML and apeglm estimates, and integrated it into the apeglm package. The three methods were evaluated on both simulated and real data. Apeglm consistently performed better than ML according to a variety of criteria, including mean absolute error and concordance at the top. While ash had lower error and greater concordance than ML on the simulations, it also had a tendency to over-shrink large effects, and performed worse on the real data according to error and concordance. Furthermore, when compared to five other packages that also fit beta-binomial models, the apeglm package was substantially faster, making our package useful for quick and reliable analyses of allelic imbalance. Apeglm is available as an R/Bioconductor package at http://bioconductor.org/packages/apeglm.
Affiliation(s)
- Joshua P Zitovsky
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, USA
| | - Michael I Love
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27514, USA
|
28
|
Abstract
Allelic imbalance occurs when the two alleles of a gene are differentially expressed within a diploid organism and can indicate important differences in cis-regulation and epigenetic state across the two chromosomes. Because of this, the ability to accurately quantify the proportion at which each allele of a gene is expressed is of great interest to researchers. This becomes challenging in the presence of small read counts and/or sample sizes, which can cause estimators for allelic expression proportions to have high variance. Investigators have traditionally dealt with this problem by filtering out genes with small counts and samples. However, this may inadvertently remove important genes that have truly large allelic imbalances. Another option is to use pseudocounts or Bayesian estimators to reduce the variance. To this end, we evaluated the accuracy of four different estimators, the latter two of which are Bayesian shrinkage estimators: maximum likelihood, adding a pseudocount to each allele, approximate posterior estimation of GLM coefficients (apeglm) and adaptive shrinkage (ash). We also wrote C++ code to quickly calculate ML and apeglm estimates and integrated it into the apeglm package. The four methods were evaluated on two simulations and one real data set. Apeglm consistently performed better than ML according to a variety of criteria, and generally outperformed use of pseudocounts as well. Ash also performed better than ML in one of the simulations, but in the other performance was more mixed. Finally, when compared to five other packages that also fit beta-binomial models, the apeglm package was substantially faster and more numerically reliable, making our package useful for quick and reliable analyses of allelic imbalance. Apeglm is available as an R/Bioconductor package at http://bioconductor.org/packages/apeglm.
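The two simplest estimators named in this abstract, maximum likelihood and adding a pseudocount to each allele, can be sketched in a few lines. This is only an illustration of why small counts give high-variance estimates. It is not the apeglm or ash implementation, and the function names and the default pseudocount of 1 are assumptions.

```python
def ml_proportion(alt_reads, ref_reads):
    """Maximum likelihood estimate of the allelic proportion from raw counts."""
    total = alt_reads + ref_reads
    if total == 0:
        return float("nan")  # no reads, so the proportion is undefined
    return alt_reads / total


def pseudocount_proportion(alt_reads, ref_reads, pseudo=1.0):
    """Estimate after adding the same pseudocount to each allele.

    The pseudocount pulls the estimate toward 0.5, strongly when counts
    are small and negligibly when counts are large.
    """
    return (alt_reads + pseudo) / (alt_reads + ref_reads + 2 * pseudo)


for alt, ref in ((3, 0), (30, 0), (300, 0), (6, 4)):
    print(f"alt={alt:3d} ref={ref:3d}  "
          f"ML={ml_proportion(alt, ref):.3f}  "
          f"pseudocount={pseudocount_proportion(alt, ref):.3f}")
```

With three reads all from one allele, the ML estimate is already at the extreme, while the pseudocount estimate stays well inside the unit interval. The shrinkage estimators evaluated in the abstract address the same problem with a prior learned from the data rather than a fixed constant.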
Affiliation(s)
- Joshua P. Zitovsky
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, USA
| | - Michael I. Love
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27514, USA
|
29
|
Jiang X, Zhang H, Zhang Z, Quan X. Flexible Non-Negative Matrix Factorization to Unravel Disease-Related Genes. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:1948-1957. [PMID: 29993985 DOI: 10.1109/tcbb.2018.2823746] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/08/2023]
Abstract
Recently, non-negative matrix factorization (NMF) has been shown to perform well in the analysis of omics data. NMF assumes that the expression level of one gene is a linear additive composition of metagenes. The elements in metagene matrix represent the regulation effects and are restricted to non-negativity. However, according to the real biological meaning, there are two kinds of regulation effects, i.e., up-regulation and down-regulation. Few methods based on NMF have considered this biological meaning. Therefore, we designed a flexible non-negative matrix factorization (FNMF) algorithm by further considering the biological meaning of gene expression data. It allows negative numbers in the metagene matrix, and negative numbers represent down-regulation effects. We separated gene expression data into disease-driven gene expression and background gene expression. Subsequently, we computed disease-driven gene relative expression, and a ranked list of genes was obtained. The top ranked genes are considered to be involved in some disease-related biological processes. Experimental results on two real-world gene expression data demonstrate the feasibility and effectiveness of FNMF. Compared with conventional disease-related gene identification algorithms, FNMF has superior performance in analyzing gene expression data of diseases with complex pathology.
|
30
|
Majoros WH, Kim YS, Barrera A, Li F, Wang X, Cunningham SJ, Johnson GD, Guo C, Lowe WL, Scholtens DM, Hayes MG, Reddy TE, Allen AS. Bayesian estimation of genetic regulatory effects in high-throughput reporter assays. Bioinformatics 2019; 36:331-338. [PMID: 31368479 PMCID: PMC7999138 DOI: 10.1093/bioinformatics/btz545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2018] [Revised: 06/12/2019] [Accepted: 07/24/2019] [Indexed: 01/31/2023] Open
Abstract
MOTIVATION: High-throughput reporter assays dramatically improve our ability to assign function to noncoding genetic variants, by measuring allelic effects on gene expression in the controlled setting of a reporter gene. Unlike genetic association tests, such assays are not confounded by linkage disequilibrium when loci are independently assayed. These methods can thus improve the identification of causal disease mutations. While work continues on improving experimental aspects of these assays, less effort has gone into developing methods for assessing the statistical significance of assay results, particularly in the case of rare variants captured from patient DNA. RESULTS: We describe a Bayesian hierarchical model, called Bayesian Inference of Regulatory Differences, which integrates prior information and explicitly accounts for variability between experimental replicates. The model produces substantially more accurate predictions than existing methods when allele frequencies are low, which is of clear advantage in the search for disease-causing variants in DNA captured from patient cohorts. Using the model, we demonstrate a clear tradeoff between variant sequencing coverage and numbers of biological replicates, and we show that the use of additional biological replicates decreases variance in estimates of effect size, due to the properties of the Poisson-binomial distribution. We also provide a power and sample size calculator, which facilitates decision making in experimental design parameters. AVAILABILITY AND IMPLEMENTATION: The software is freely available from www.geneprediction.org/bird. The experimental design web tool can be accessed at http://67.159.92.22:8080. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- William H Majoros
- Duke Center for Statistical Genetics and Genomics, Duke University; Division of Integrative Genomics, Department of Biostatistics and Bioinformatics, Duke University Medical School; Center for Genomic and Computational Biology, Duke University Medical School
| | - Young-Sook Kim
- Center for Genomic and Computational Biology, Duke University Medical School; Program in Computational Biology & Bioinformatics, Duke University, Durham, NC 27710
| | - Alejandro Barrera
- Center for Genomic and Computational Biology, Duke University Medical School
| | - Fan Li
- Department of Biostatistics, Yale University, New Haven, CT 06520
| | - Xingyan Wang
- Present address: PhD Program in Biostatistics, Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA 17033, USA
| | | | - Graham D Johnson
- Center for Genomic and Computational Biology, Duke University Medical School; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710
| | - Cong Guo
- Present address: Human Genetics, GlaxoSmithKline, Collegeville, PA 19426, USA
| | - William L Lowe
- Division of Endocrinology Metabolism and Molecular Medicine, Northwestern University Feinberg School of Medicine, Chicago
| | - Denise M Scholtens
- Division of Biostatistics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - M Geoffrey Hayes
- Division of Endocrinology Metabolism and Molecular Medicine, Northwestern University Feinberg School of Medicine, Chicago
|
31
|
Verta JP, Jones FC. Predominance of cis-regulatory changes in parallel expression divergence of sticklebacks. eLife 2019; 8:43785. [PMID: 31090544 PMCID: PMC6550882 DOI: 10.7554/elife.43785] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 11/27/2018] [Accepted: 05/01/2019] [Indexed: 12/15/2022] Open
Abstract
Regulation of gene expression is thought to play a major role in adaptation, but the relative importance of cis- and trans- regulatory mechanisms in the early stages of adaptive divergence is unclear. Using RNAseq of threespine stickleback fish gill tissue from four independent marine-freshwater ecotype pairs and their F1 hybrids, we show that cis-acting (allele-specific) regulation consistently predominates gene expression divergence. Genes showing parallel marine-freshwater expression divergence are found near to adaptive genomic regions, show signatures of natural selection around their transcription start sites and are enriched for cis-regulatory control. For genes with parallel increased expression among freshwater fish, the quantitative degree of cis- and trans-regulation is also highly correlated across populations, suggesting a shared genetic basis. Compared to other forms of regulation, cis-regulation tends to show greater additivity and stability across different genetic and environmental contexts, making it a fertile substrate for the early stages of adaptive evolution.
Affiliation(s)
- Jukka-Pekka Verta
- Friedrich Miescher Laboratory of the Max Planck Society, Max-Planck-Ring, Tübingen, Germany; Organismal and Evolutionary Biology Research Programme, University of Helsinki, Helsinki, Finland
| | - Felicity C Jones
- Friedrich Miescher Laboratory of the Max Planck Society, Max-Planck-Ring, Tübingen, Germany
|
32
|
Kryvokhyzha D, Milesi P, Duan T, Orsucci M, Wright SI, Glémin S, Lascoux M. Towards the new normal: Transcriptomic convergence and genomic legacy of the two subgenomes of an allopolyploid weed (Capsella bursa-pastoris). PLoS Genet 2019; 15:e1008131. [PMID: 31083657 PMCID: PMC6532933 DOI: 10.1371/journal.pgen.1008131] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 12/14/2018] [Revised: 05/23/2019] [Accepted: 04/11/2019] [Indexed: 02/07/2023] Open
Abstract
Allopolyploidy has played a major role in plant evolution but its impact on genome diversity and expression patterns remains to be understood. Some studies found important genomic and transcriptomic changes in allopolyploids, whereas others detected a strong parental legacy and more subtle changes. The allotetraploid C. bursa-pastoris originated around 100,000 years ago and one could expect the genetic polymorphism of the two subgenomes to follow similar trajectories and their transcriptomes to start functioning together. To test this hypothesis, we sequenced the genomes and the transcriptomes (three tissues) of allotetraploid C. bursa-pastoris and its parental species, the outcrossing C. grandiflora and the self-fertilizing C. orientalis. Comparison of the divergence in expression between subgenomes, on the one hand, and divergence in expression between the parental species, on the other hand, indicated a strong parental legacy with a majority of genes exhibiting a conserved pattern and cis-regulation. However, a large proportion of the genes that were differentially expressed between the two subgenomes were also under trans-regulation, reflecting the establishment of a new regulatory pattern. Parental dominance varied among tissues: expression in flowers was closer to that of C. orientalis and expression in root and leaf to that of C. grandiflora. Since deleterious mutations accumulated preferentially on the C. orientalis subgenome, the bias in expression towards C. orientalis observed in flowers indicates that expression changes could be adaptive and related to the selfing syndrome, while biases in the roots and leaves towards the C. grandiflora subgenome may be reflective of the differential genetic load.
Most plant species have a polyploid at some stage of their ancestry. Polyploidy, genome doubling through either multiple copies of a single species or through genomes of different species coming into the same nucleus, is therefore a crucial step in plant evolution. Understanding its impact on basic biological functions is thus a matter of interest. Shepherd’s purse (Capsella bursa-pastoris) is a major weed that appeared about 100,000 years ago through hybridization of two diploid species of the same genus. In the present project, we measured genetic diversity and analyzed gene expression patterns in flowers, roots, and leaves of C. bursa-pastoris individuals as well as in its two parental species, the outcrossing C. grandiflora and the self-fertilizing C. orientalis. Our data show that, after 100,000 generations of evolution, the origin of the two subgenomes can still be seen: the genome inherited from C. grandiflora still differs from the one inherited from self-fertilizing C. orientalis. However, there are also signs that the two genomes have started to work together and are jointly regulated, and the way expression patterns varied across the three tissues indicates that the evolution of gene expression was adaptive.
Affiliation(s)
- Dmytro Kryvokhyzha
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Pascal Milesi
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Tianlin Duan
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Marion Orsucci
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Stephen I. Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Canada
| | - Sylvain Glémin
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- CNRS, Univ. Rennes, ECOBIO [(Ecosystèmes, biodiversité, évolution)] - UMR 6553, Rennes, France
| | - Martin Lascoux
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
|
33
|
van der Veeken J, Zhong Y, Sharma R, Mazutis L, Dao P, Pe'er D, Leslie CS, Rudensky AY. Natural Genetic Variation Reveals Key Features of Epigenetic and Transcriptional Memory in Virus-Specific CD8 T Cells. Immunity 2019; 50:1202-1217.e7. [PMID: 31027997 DOI: 10.1016/j.immuni.2019.03.031] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 10/12/2018] [Revised: 01/15/2019] [Accepted: 03/27/2019] [Indexed: 12/29/2022]
Abstract
Stable changes in chromatin states and gene expression in cells of the immune system form the basis for memory of infections and other challenges. Here, we used naturally occurring cis-regulatory variation in wild-derived inbred mouse strains to explore the mechanisms underlying long-lasting versus transient gene regulation in CD8 T cells responding to acute viral infection. Stably responsive DNA elements were characterized by dramatic and congruent chromatin remodeling events affecting multiple neighboring sites and required distinct transcription factor (TF) binding motifs for their accessibility. Specifically, we found that cooperative recruitment of T-box and Runx family transcription factors to shared targets mediated stable chromatin remodeling upon T cell activation. Our observations provide insights into the molecular mechanisms driving virus-specific CD8 T cell responses and suggest a general mechanism for the formation of transcriptional and epigenetic memory applicable to other immune and non-immune cells.
Affiliation(s)
- Joris van der Veeken
- Howard Hughes Medical Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Ludwig Center at Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Yi Zhong
- Howard Hughes Medical Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Ludwig Center at Memorial Sloan Kettering Cancer Center, New York, NY, USA; Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Roshan Sharma
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY, USA
| | - Linas Mazutis
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Phuong Dao
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Dana Pe'er
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Christina S Leslie
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Alexander Y Rudensky
- Howard Hughes Medical Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Ludwig Center at Memorial Sloan Kettering Cancer Center, New York, NY, USA.
|
34
|
Zhao C, Xie S, Wu H, Luan Y, Hu S, Ni J, Lin R, Zhao S, Zhang D, Li X. Quantification of allelic differential expression using a simple Fluorescence primer PCR-RFLP-based method. Sci Rep 2019; 9:6334. [PMID: 31004110 PMCID: PMC6474871 DOI: 10.1038/s41598-019-42815-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2018] [Accepted: 03/29/2019] [Indexed: 12/04/2022] Open
Abstract
Allelic differential expression (ADE) is common in diploid organisms, and is often the key reason for specific phenotype variations. Thus, ADE detection is important for identification of major genes and causal mutations. To date, sensitive and simple methods to detect ADE are still lacking. In this study, we have developed an accurate, simple, and sensitive method, named fluorescence primer PCR-RFLP quantitative method (fPCR-RFLP), for ADE analysis. This method involves two rounds of PCR amplification using a pair of primers, one of which is double-labeled with an overhang 6-FAM. The two alleles are then separated by RFLP and quantified by fluorescence density. fPCR-RFLP could precisely distinguish ADE across a range of 1- to 32-fold differences. Using this method, we verified that PLAG1 and KIT, two candidate genes related to growth rate and immune response traits of pigs, show ADE both at different developmental stages and in different tissues. Our data demonstrate that fPCR-RFLP is an accurate and sensitive method for detecting ADE at both the DNA and RNA levels. Therefore, this powerful tool provides a way to analyze mutations that cause ADE.
Affiliation(s)
- Changzhi Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Shengsong Xie
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China; The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Hui Wu
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Yu Luan
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Suqin Hu
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Juan Ni
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Ruiyi Lin
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China; The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Dingxiao Zhang
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China; The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, P.R. China
| | - Xinyun Li
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Huazhong Agricultural University, Wuhan, 430070, P.R. China; The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, P.R. China
|
35
|
Patterns of genome-wide allele-specific expression in hybrid rice and the implications on the genetic basis of heterosis. Proc Natl Acad Sci U S A 2019; 116:5653-5658. [PMID: 30833384 PMCID: PMC6431163 DOI: 10.1073/pnas.1820513116] [Citation(s) in RCA: 90] [Impact Index Per Article: 18.0] [Indexed: 01/01/2023] Open
Abstract
Utilization of heterosis has greatly increased productivity of many crops globally. Allele-specific expression (ASE) has been suggested as a mechanism for causing heterosis. We performed a genome-wide analysis of ASE in three tissues of an elite rice hybrid grown under four conditions. The analysis identified 3,270 genes showing various patterns of ASE in response to developmental and environmental cues, which provides a glimpse of the ASE landscape in the hybrid genome. We showed that the ASE patterns may have distinct implications in the genetic basis of heterosis, especially in light of the classical dominance and overdominance hypotheses. The genes showing ASE provide the candidates for future studies of the genetic and molecular mechanism of heterosis.
Utilization of heterosis has greatly increased the productivity of many crops worldwide. Although tremendous progress has been made in characterizing the genetic basis of heterosis using genomic technologies, molecular mechanisms underlying the genetic components are much less understood. Allele-specific expression (ASE), or imbalance between the expression levels of two parental alleles in the hybrid, has been suggested as a mechanism of heterosis. Here, we performed a genome-wide analysis of ASE by comparing the read ratios of the parental alleles in RNA-sequencing data of an elite rice hybrid and its parents using three tissues from plants grown under four conditions. The analysis identified a total of 3,270 genes showing ASE (ASEGs) in various ways, which can be classified into two patterns: consistent ASEGs such that the ASE was biased toward one parental allele in all tissues/conditions, and inconsistent ASEGs such that ASE was found in some but not all tissues/conditions, including direction-shifting ASEGs in which the ASE was biased toward one parental allele in some tissues/conditions while toward the other parental allele in other tissues/conditions. The results suggested that these patterns may have distinct implications in the genetic basis of heterosis: The consistent ASEGs may cause partial to full dominance effects on the traits that they regulate, and direction-shifting ASEGs may cause overdominance. We also showed that ASEGs were significantly enriched in genomic regions that were differentially selected during rice breeding. These ASEGs provide an index of the genes for future pursuit of the genetic and molecular mechanism of heterosis.
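The ASEG categories described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_aseg`, the encoding of each tissue/condition call as +1 (biased toward parent A), -1 (biased toward parent B), or 0 (no significant ASE), and the separate "no ASE" label are all assumptions made for illustration.

```python
from typing import Sequence


def classify_aseg(bias_calls: Sequence[int]) -> str:
    """Classify one gene from its per-tissue/condition ASE calls.

    Hypothetical encoding: +1 = biased toward parent A, -1 = biased toward
    parent B, 0 = no significant ASE in that tissue/condition.
    """
    toward_a = any(call > 0 for call in bias_calls)
    toward_b = any(call < 0 for call in bias_calls)
    balanced = any(call == 0 for call in bias_calls)

    if not (toward_a or toward_b):
        return "no ASE"
    if toward_a and toward_b:
        # Bias flips between parental alleles across tissues/conditions.
        return "direction-shifting"
    if balanced:
        # ASE in some but not all tissues/conditions, always the same direction.
        return "inconsistent"
    # Same parental allele favoured in every tissue/condition.
    return "consistent"


if __name__ == "__main__":
    print(classify_aseg([1, 1, 1, 1]))
    print(classify_aseg([1, 0, 1, 0]))
    print(classify_aseg([1, -1, 0, 1]))
```

In the abstract, direction-shifting ASEGs are a subset of the inconsistent pattern; the sketch reports them under their own label only so that the two cases are easy to tell apart.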
|
36
|
Adams CIM, Knapp M, Gemmell NJ, Jeunen GJ, Bunce M, Lamare MD, Taylor HR. Beyond Biodiversity: Can Environmental DNA (eDNA) Cut It as a Population Genetics Tool? Genes (Basel) 2019; 10:E192. [PMID: 30832286 PMCID: PMC6470983 DOI: 10.3390/genes10030192] [Citation(s) in RCA: 71] [Impact Index Per Article: 14.2] [Received: 01/29/2019] [Revised: 02/19/2019] [Accepted: 02/26/2019] [Indexed: 01/23/2023] Open
Abstract
Population genetic data underpin many studies of behavioral, ecological, and evolutionary processes in wild populations and contribute to effective conservation management. However, collecting genetic samples can be challenging when working with endangered, invasive, or cryptic species. Environmental DNA (eDNA) offers a way to sample genetic material non-invasively without requiring visual observation. While eDNA has been trialed extensively as a biodiversity and biosecurity monitoring tool with a strong taxonomic focus, it has yet to be fully explored as a means for obtaining population genetic information. Here, we review current research that employs eDNA approaches for the study of populations. We outline challenges facing eDNA-based population genetic methodologies, and suggest avenues of research for future developments. We advocate that with further optimizations, this emergent field holds great potential as part of the population genetics toolkit.
Affiliation(s)
- Clare I M Adams
- Department of Anatomy, University of Otago, 270 Great King Street, Dunedin, Otago 9016, New Zealand.
| | - Michael Knapp
- Department of Anatomy, University of Otago, 270 Great King Street, Dunedin, Otago 9016, New Zealand.
| | - Neil J Gemmell
- Department of Anatomy, University of Otago, 270 Great King Street, Dunedin, Otago 9016, New Zealand.
| | - Gert-Jan Jeunen
- Department of Anatomy, University of Otago, 270 Great King Street, Dunedin, Otago 9016, New Zealand.
| | - Michael Bunce
- Trace and Environmental DNA (TrEnD) Laboratory, School of Molecular and Life Sciences, Curtin University, Bentley, Perth, WA 6102, Australia.
| | - Miles D Lamare
- Department of Marine Science, University of Otago, 310 Castle Street, Dunedin, Otago 9016, New Zealand.
| | - Helen R Taylor
- Department of Anatomy, University of Otago, 270 Great King Street, Dunedin, Otago 9016, New Zealand.
|
37
|
Kryvokhyzha D, Salcedo A, Eriksson MC, Duan T, Tawari N, Chen J, Guerrina M, Kreiner JM, Kent TV, Lagercrantz U, Stinchcombe JR, Glémin S, Wright SI, Lascoux M. Parental legacy, demography, and admixture influenced the evolution of the two subgenomes of the tetraploid Capsella bursa-pastoris (Brassicaceae). PLoS Genet 2019; 15:e1007949. [PMID: 30768594 PMCID: PMC6395008 DOI: 10.1371/journal.pgen.1007949] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 10/01/2018] [Revised: 02/28/2019] [Accepted: 01/09/2019] [Indexed: 11/18/2022] Open
Abstract
Allopolyploidy is generally perceived as a major source of evolutionary novelties and as an instantaneous way to create isolation barriers. However, we do not have a clear understanding of how two subgenomes evolve and interact once they have fused in an allopolyploid species nor how isolated they are from their relatives. Here, we address these questions by analyzing genomic and transcriptomic data of allotetraploid Capsella bursa-pastoris in three differentiated populations, Asia, Europe, and the Middle East. We phased the two subgenomes, one descended from the outcrossing and highly diverse Capsella grandiflora (CbpCg) and the other one from the selfing and genetically depauperate Capsella orientalis (CbpCo). For each subgenome, we assessed its relationship with the diploid relatives, temporal changes of effective population size (Ne), signatures of positive and negative selection, and gene expression patterns. In all three regions, Ne of the two subgenomes decreased gradually over time and the CbpCo subgenome accumulated more deleterious changes than CbpCg. There were signs of widespread admixture between C. bursa-pastoris and its diploid relatives. The two subgenomes were impacted differentially depending on geographic region suggesting either strong interploidy gene flow or multiple origins of C. bursa-pastoris. Selective sweeps were more common on the CbpCg subgenome in Europe and the Middle East, and on the CbpCo subgenome in Asia. In contrast, differences in expression were limited with the CbpCg subgenome slightly more expressed than CbpCo in Europe and the Middle-East. In summary, after more than 100,000 generations of co-existence, the two subgenomes of C. bursa-pastoris still retained a strong signature of parental legacy but their evolutionary trajectory strongly varied across geographic regions.
Affiliation(s)
- Dmytro Kryvokhyzha
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Adriana Salcedo
- Department of Ecology and Evolution, University of Toronto, Toronto, Canada
| | - Mimmi C. Eriksson
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
| | - Tianlin Duan
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Nilesh Tawari
- Computational and Systems Biology Group, Genome Institute of Singapore, Agency for Science, Technology and Research (A*Star), Singapore
| | - Jun Chen
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Maria Guerrina
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Julia M. Kreiner
- Department of Ecology and Evolution, University of Toronto, Toronto, Canada
| | - Tyler V. Kent
- Department of Ecology and Evolution, University of Toronto, Toronto, Canada
| | - Ulf Lagercrantz
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | | | - Sylvain Glémin
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- CNRS, Université de Rennes 1, ECOBIO (Ecosystémes, biodiversité, évolution) - UMR 6553, F-35000 Rennes, France
| | - Stephen I. Wright
- Department of Ecology and Evolution, University of Toronto, Toronto, Canada
| | - Martin Lascoux
- Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
|
38
|
Abstract
Allele-specific expression arises when transcriptional activity at the different alleles of a gene differs considerably. Although extensive research has been carried out to detect and characterize this phenomenon, the landscape of allele-specific expression in cancer is still poorly understood. In this chapter, we describe a fast and reliable analysis pipeline to study allele-specific expression in cancer using next-generation sequencing data. The pipeline provides a gene-level analysis approach that exploits paired germline DNA and tumor RNA sequencing data and benefits from parallel computation resources when available.
Affiliation(s)
- Alessandro Romanel
- Centre for Integrative Biology (CIBIO), University of Trento, Trento, Italy.
|
39
|
Abstract
Allele-specific expression is traditionally studied by bulk RNA sequencing, which measures average gene expression across cells. Single-cell RNA sequencing (scRNA-seq) allows the comparison of expression distribution between the two alleles of a diploid organism, and characterization of allele-specific bursting. Here we describe SCALE, a bioinformatic and statistical framework for allele-specific gene expression analysis by scRNA-seq. SCALE estimates genome-wide bursting kinetics at the allelic level while accounting for technical bias and other complicating factors such as cell size. SCALE detects genes with significantly different bursting kinetics between the two alleles, as well as genes where the two alleles exhibit non-independent bursting processes. Here, we illustrate SCALE on a mouse blastocyst single-cell dataset with step-by-step demonstration from the upstream bioinformatic processing to the downstream biological interpretation of SCALE's output.
Affiliation(s)
- Meichen Dong
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
| | - Yuchao Jiang
- Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA.
- Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC, USA.
- Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA.
|
40
|
Liu Z, Dong X, Li Y. A Genome-Wide Study of Allele-Specific Expression in Colorectal Cancer. Front Genet 2018; 9:570. [PMID: 30538721 PMCID: PMC6277598 DOI: 10.3389/fgene.2018.00570] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/19/2018] [Accepted: 11/06/2018] [Indexed: 12/30/2022] Open
Abstract
Accumulating evidence from small-scale studies has suggested that allele-specific expression (ASE) plays an important role in tumor initiation and progression. However, little is known about genome-wide ASE in tumors. In this study, we conducted a comprehensive analysis of ASE in individuals with colorectal cancer (CRC) on a genome-wide scale. We identified 5.4 thousand genome-wide ASEs of single nucleotide variations (SNVs) from tumor and normal tissues of 59 individuals with CRC. We observed an increased ASE level in tumor samples, and the ASEs were enriched as hotspots on the genome. Around 63% of the genes located there were previously reported to contain complex regulatory elements, e.g., human leukocyte antigen (HLA), or were implicated in tumor progression. Focussing on the allelic expression of somatic mutations, we found that 37.5% of them exhibited ASE, and genes harboring such somatic mutations were enriched in important pathways implicated in cancers. In addition, by comparing the expected and observed ASE events in tumor samples, we identified 50 tumor specific ASEs which possibly contributed to the somatic events in the regulatory regions of the genes and significantly enriched known cancer driver genes. By analyzing CRC ASEs from several perspectives, we provided a systematic understanding of how ASE is implicated in both tumor and normal tissues, which will be of critical value in guiding ASE studies in cancer.
Affiliation(s)
- Zhi Liu
- Department of Epidemiology and Biostatistics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Xiao Dong
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, United States
| | - Yixue Li
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China.,Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, China.,Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, China
| |
|
41
|
Zhang Q, Keleş S. An empirical Bayes test for allelic-imbalance detection in ChIP-seq. Biostatistics 2018; 19:546-561. [PMID: 29126153 PMCID: PMC6454553 DOI: 10.1093/biostatistics/kxx060] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/10/2017] [Accepted: 10/01/2017] [Indexed: 11/12/2022] Open
Abstract
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) has enabled discovery of genomic regions enriched with biological signals such as transcription factor binding and histone modifications. Allelic-imbalance (ALI) detection is a complementary analysis of ChIP-seq data for associating biological signals with single nucleotide polymorphisms (SNPs). It has been successfully used in elucidating functional roles of non-coding SNPs. Commonly used statistical approaches for ALI detection are often based on binomial testing and mixture models, both of which rely on strong assumptions about the distribution of the unobserved allelic probability and have significant practical shortcomings. We propose the Non-Parametric Binomial (NPBin) test for ALI detection and for modeling binomial data in general. NPBin models the density of the unobserved allelic probability non-parametrically, and estimates its empirical null distribution via curve fitting. We demonstrate the advantages of NPBin in terms of interpretability of the estimated density and the accuracy in ALI detection using simulations and analysis of several ChIP-seq data sets. We also illustrate the generality of our modeling framework beyond ALI detection by an application to a baseball batting average prediction problem. This article has supplementary material available at Biostatistics online. The code and the sample input data have also been deposited on GitHub at https://github.com/QiZhangStat/ALIdetection.
Affiliation(s)
- Qi Zhang
- Department of Statistics, University of Nebraska-Lincoln, 340 Hardin Hall North Wing, Lincoln, NE, USA
| | - Sündüz Keleş
- Department of Biostatistics and Medical Informatics and Department of Statistics, University of Wisconsin, Madison, 1300 University Ave., Madison, WI, USA
| |
|
42
|
Wang M, Uebbing S, Pawitan Y, Scofield DG. RPASE: Individual-based allele-specific expression detection without prior knowledge of haplotype phase. Mol Ecol Resour 2018; 18:1247-1262. [PMID: 29858523 DOI: 10.1111/1755-0998.12909] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/21/2017] [Revised: 05/09/2018] [Accepted: 05/21/2018] [Indexed: 01/04/2023]
Abstract
Variation in gene expression is believed to make a significant contribution to phenotypic diversity and divergence. The analysis of allele-specific expression (ASE) can reveal important insights into gene expression regulation. We developed a novel method called RPASE (Read-backed Phasing-based ASE detection) to test for genes that show ASE. With mapped RNA-seq data from a single individual and a list of SNPs from the same individual as the only input, RPASE is capable of aggregating information across multiple dependent SNPs and producing individual-based gene-level tests for ASE. RPASE performs well in simulations and comparisons. We applied RPASE to multiple bird species and found a potentially rich landscape of ASE.
Affiliation(s)
- Mi Wang
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| | - Severin Uebbing
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| | - Yudi Pawitan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Douglas G Scofield
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| |
|
43
|
Yeo J, Morales DA, Chen T, Crawford EL, Zhang X, Blomquist TM, Levin AM, Massion PP, Arenberg DA, Midthun DE, Mazzone PJ, Nathan SD, Wainz RJ, Nana-Sinkam P, Willey PFS, Arend TJ, Padda K, Qiu S, Federov A, Hernandez DAR, Hammersley JR, Yoon Y, Safi F, Khuder SA, Willey JC. RNAseq analysis of bronchial epithelial cells to identify COPD-associated genes and SNPs. BMC Pulm Med 2018; 18:42. [PMID: 29506519 PMCID: PMC5838965 DOI: 10.1186/s12890-018-0603-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/08/2017] [Accepted: 02/23/2018] [Indexed: 01/09/2023] Open
Abstract
Background: There is a need for more powerful methods to identify low-effect SNPs that contribute to hereditary COPD pathogenesis. We hypothesized that SNPs contributing to COPD risk through cis-regulatory effects are enriched in genes comprised by bronchial epithelial cell (BEC) expression patterns associated with COPD. Methods: To test this hypothesis, normal BEC specimens were obtained by bronchoscopy from 60 subjects: 30 subjects with COPD defined by spirometry (FEV1/FVC < 0.7, FEV1% < 80%), and 30 non-COPD controls. Targeted next generation sequencing was used to measure total and allele-specific expression of 35 genes in genome maintenance (GM) pathways linked to COPD pathogenesis, including seven TP53 and CEBP transcription factor family members. Shrinkage linear discriminant analysis (SLDA) was used to identify COPD-classification models. COPD GWAS were queried for putative cis-regulatory SNPs in the targeted genes. Results: On a network basis, TP53 and CEBP transcription factor pathway gene pair network connections, including key DNA repair gene ERCC5, were significantly different in COPD subjects (e.g., Wilcoxon rank sum test for closeness, p-value = 5.0E-11). ERCC5 SNP rs4150275 association with chronic bronchitis was identified in a set of Lung Health Study (LHS) COPD GWAS SNPs restricted to those in putative regulatory regions within the targeted genes, and this association was validated in the COPDgene non-Hispanic white (NHW) GWAS. ERCC5 SNP rs4150275 is linked (D’ = 1) to ERCC5 SNP rs17655, which displayed differential allelic expression (DAE) in BEC and is an expression quantitative trait locus (eQTL) in lung tissue (p = 3.2E-7). SNPs in linkage (D’ = 1) with rs17655 were predicted to alter miRNA binding (rs873601). A classifier model that comprised gene features CAT, CEBPG, GPX1, KEAP1, TP73, and XPA had pooled 10-fold cross-validation receiver operator characteristic area under the curve of 75.4% (95% CI: 66.3%–89.3%). The prevalence of DAE was higher than expected (p = 0.0023) in the classifier genes. Conclusions: GM genes comprised by COPD-associated BEC expression patterns were enriched for SNPs with cis-regulatory function, including a putative cis-rSNP in ERCC5 that was associated with COPD risk. These findings support additional total and allele-specific expression analysis of gene pathways with high prior likelihood for involvement in COPD pathogenesis. Electronic supplementary material: The online version of this article (10.1186/s12890-018-0603-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jiyoun Yeo
- Department of Pathology, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
| | - Diego A Morales
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
| | - Tian Chen
- Department of Mathematics and Statistics, The University of Toledo, 2801 W. Bancroft Street, Toledo, OH, 43606, USA
| | - Erin L Crawford
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
| | - Xiaolu Zhang
- Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
| | - Thomas M Blomquist
- Department of Pathology, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
| | - Albert M Levin
- Department of Biostatistics, Henry Ford Health System, 1 Ford Place, Detroit, MI, 48202, USA
| | - Pierre P Massion
- Thoracic Program, Vanderbilt Ingram Cancer Center, Nashville, TN, 37232, USA
| | | | - David E Midthun
- Department of Pulmonary and Critical Care Medicine, Mayo Clinic, 200 1st St SW, Rochester, MN, 55905, USA
| | - Peter J Mazzone
- Department of Pulmonary Medicine, Cleveland Clinic, 9500 Euclid Ave, Cleveland, OH, 44195, USA
| | - Steven D Nathan
- Department of Pulmonary Medicine, Inova Fairfax Hospital, 3300 Gallows Road, Falls Church, VA, 22042-3300, USA
| | - Ronald J Wainz
- The Toledo Hospital, 2142 N Cove Blvd, Toledo, OH, 43606, USA
| | - Patrick Nana-Sinkam
- Division of Pulmonary Diseases and Critical Care Medicine, Virginia Commonwealth University, USA, Richmond, VA, 23284-2512, USA.,Ohio State University James Comprehensive Cancer Center and Solove Research Institute, Columbus, OH, USA
| | - Paige F S Willey
- American Enterprise Institute, 1789 Massachusetts Ave NW, Washington, DC, 20036, USA
| | - Taylor J Arend
- The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
| | - Karanbir Padda
- Emory University School of Medicine, 1648 Pierce Dr NE, Atlanta, GA, 30307, USA
| | - Shuhao Qiu
- Department of Medicine, The University of Toledo Medical Center, 3000 Arlington Avenue, Toledo, OH, 43614, USA
| | - Alexei Federov
- Department of Mathematics and Statistics, The University of Toledo, 2801 W. Bancroft Street, Toledo, OH, 43606, USA.,Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
| | - Dawn-Alita R Hernandez
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - Jeffrey R Hammersley
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - Youngsook Yoon
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - Fadi Safi
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - Sadik A Khuder
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - James C Willey
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA.
| |
|
44
|
Kalita CA, Moyerbrailean GA, Brown C, Wen X, Luca F, Pique-Regi R. QuASAR-MPRA: accurate allele-specific analysis for massively parallel reporter assays. Bioinformatics 2018; 34:787-794. [PMID: 29028988 PMCID: PMC6049023 DOI: 10.1093/bioinformatics/btx598] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/18/2017] [Revised: 07/11/2017] [Accepted: 09/19/2017] [Indexed: 12/18/2022] Open
Abstract
Motivation: The majority of the human genome is composed of non-coding regions containing regulatory elements such as enhancers, which are crucial for controlling gene expression. Many variants associated with complex traits are in these regions, and may disrupt gene regulatory sequences. Consequently, it is important to not only identify true enhancers but also to test if a variant within an enhancer affects gene regulation. Recently, allele-specific analysis in high-throughput reporter assays, such as massively parallel reporter assays (MPRAs), has been used to functionally validate non-coding variants. However, we are still missing high-quality and robust data analysis tools for these datasets. Results: We have further developed our method for allele-specific analysis QuASAR (quantitative allele-specific analysis of reads) to analyze allele-specific signals in barcoded read counts data from MPRA. Using this approach, we can take into account the uncertainty on the original plasmid proportions, over-dispersion, and sequencing errors. The provided allelic skew estimate and its standard error also simplify meta-analysis of replicate experiments. Additionally, we show that a beta-binomial distribution better models the variability present in the allelic imbalance of these synthetic reporters and results in a test that is statistically well calibrated under the null. Applying this approach to the MPRA data, we found 602 SNPs with significant (false discovery rate 10%) allele-specific regulatory function in LCLs. We also show that we can combine MPRA with QuASAR estimates to validate existing experimental and computational annotations of regulatory variants. Our study shows that with appropriate data analysis tools, we can improve the power to detect allelic effects in high-throughput reporter assays. Availability and implementation: http://github.com/piquelab/QuASAR/tree/master/mpra. Contact: fluca@wayne.edu or rpique@wayne.edu. Supplementary information: Supplementary data are available online at Bioinformatics.
Affiliation(s)
- Cynthia A Kalita
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, MI, USA
| | - Gregory A Moyerbrailean
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, MI, USA
| | - Christopher Brown
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
| | - Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Francesca Luca
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, MI, USA
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, USA
| | - Roger Pique-Regi
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, MI, USA
- Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, USA
| |
|
45
|
Li W, Li L, Zhang S, Zhang C, Huang H, Li Y, Hu E, Deng G, Guo S, Wang Y, Li W, Chen L. Identification of potential genes for human ischemic cardiomyopathy based on RNA-Seq data. Oncotarget 2018; 7:82063-82073. [PMID: 27852050 PMCID: PMC5347674 DOI: 10.18632/oncotarget.13331] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/02/2016] [Accepted: 10/07/2016] [Indexed: 12/30/2022] Open
Abstract
Ischemic cardiomyopathy (ICM) is an important cause of heart failure, yet no ICM disease genes were stored in any public databases. Mutations of genes provided by RNA-Seq data could set a foundation for a variety of biological processes. This also made it possible to elucidate the mechanism and identify potential genes for ICM. In this paper, an integrated co-expression network was constructed using univariate and bivariate canonical correlation analysis for RNA-Seq data of human ICM samples. Three ICM-related modules were recognized after comparing Pearson correlation coefficients of ICM samples and normal controls. Furthermore, 32 potential ICM genes were identified from ICM-related modules considering protein-protein interactions. Most of these genes were verified, using OMIM and the literature, to be involved in ICM and the diseases it causes. Our study could provide a novel perspective for potential gene identification and the pathogenesis of ICM and other complex diseases.
Affiliation(s)
- Wan Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Liansheng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Shiying Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Ce Zhang
- Department of internal medicine, Heilongjiang Commercial Hospital, Harbin, Heilongjiang, China
| | - Hao Huang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Yiran Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Erqiang Hu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Gui Deng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Shanshan Guo
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Yahui Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| | - Weimin Li
- Department of Cardiology, the First Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China
| | - Lina Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China
| |
|
46
|
Direct Testing for Allele-Specific Expression Differences Between Conditions. G3-GENES GENOMES GENETICS 2018; 8:447-460. [PMID: 29167272 PMCID: PMC5919738 DOI: 10.1534/g3.117.300139] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Abstract
Allelic imbalance (AI) indicates the presence of functional variation in cis regulatory regions. Detecting cis regulatory differences using AI is widespread, yet there is no formal statistical methodology that tests whether AI differs between conditions. Here, we present a novel model and formally test differences in AI across conditions using Bayesian credible intervals. The approach tests AI by environment (G×E) interactions, and can be used to test AI between environments, genotypes, sex, and any other condition. We incorporate bias into the modeling process. Bias is allowed to vary between conditions, making the formulation of the model general. As gene expression affects power for detection of AI, and, as expression may vary between conditions, the model explicitly takes coverage into account. The proposed model has low type I and II error under several scenarios, and is robust to large differences in coverage between conditions. We reanalyze RNA-seq data from a Drosophila melanogaster population panel, with F1 genotypes, to compare levels of AI between mated and virgin female flies, and we show that AI × genotype interactions can also be tested. To demonstrate the use of the model to test genetic differences and interactions, a formal test between two F1s was performed, showing the expected 20% difference in AI. The proposed model allows a formal test of G×E and G×G, and reaffirms a previous finding that cis regulation is robust between environments.
|
47
|
Rhoné B, Mariac C, Couderc M, Berthouly-Salazar C, Ousseini IS, Vigouroux Y. No Excess of Cis-Regulatory Variation Associated with Intraspecific Selection in Wild Pearl Millet (Cenchrus americanus). Genome Biol Evol 2017; 9:388-397. [PMID: 28137746 PMCID: PMC5381623 DOI: 10.1093/gbe/evx004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Accepted: 01/25/2017] [Indexed: 12/15/2022] Open
Abstract
Several studies suggest that cis-regulatory mutations are the favorite target of evolutionary changes, one reason being that cis-regulatory mutations might have fewer deleterious pleiotropic effects than protein-coding mutations. A review of the process also suggests that this bias towards adaptive cis-regulatory variation might be less pronounced at the intraspecific level compared with the interspecific level. In this study, we assessed the contribution of cis-regulatory variation to adaptation at the intraspecific level using populations of wild pearl millet (Cenchrus americanus ssp. monodii) sampled along an environmental gradient in Niger. From RNA sequencing of hybrids to assess allele-specific expression, we identified genes with cis-regulatory divergence between two parental accessions collected in contrasting environmental conditions. This revealed that ∼15% of transcribed genes showed cis-regulatory variation. Intersecting the gene set exhibiting cis-regulatory variation with the gene set identified as targets of selection revealed no excess of cis-acting mutations among the selected genes. We additionally found no excess of cis-regulatory variation among genes associated with adaptive traits. As our approach relied on methods identifying mainly genes subjected to strong selection pressure or with high phenotypic effect, the contribution of cis-regulatory changes to soft selection or polygenic adaptive traits remains to be tested. However, our results favor the hypothesis that enrichment of adaptive cis-regulatory divergence builds up over time. For short evolutionary time-scales, cis-acting mutations are not predominantly involved in adaptive evolution associated with strong selective signal.
Affiliation(s)
- Bénédicte Rhoné
- Unité Mixte de Recherche Diversité Adaptation et Développement des Plantes (UMR DIADE), Institut de Recherche pour le Développement, Montpellier, France.,Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, Lyon, France
| | - Cédric Mariac
- Unité Mixte de Recherche Diversité Adaptation et Développement des Plantes (UMR DIADE), Institut de Recherche pour le Développement, Montpellier, France
| | - Marie Couderc
- Unité Mixte de Recherche Diversité Adaptation et Développement des Plantes (UMR DIADE), Institut de Recherche pour le Développement, Montpellier, France
| | - Cécile Berthouly-Salazar
- Unité Mixte de Recherche Diversité Adaptation et Développement des Plantes (UMR DIADE), Institut de Recherche pour le Développement, Montpellier, France.,Laboratoire Mixte International Adaptation des Plantes et Microorganismes Associés aux Stress Environnementaux (LMI LAPSE), Centre de Recherche de Bel Air, Dakar, Sénégal
| | - Issaka Salia Ousseini
- Unité Mixte de Recherche Diversité Adaptation et Développement des Plantes (UMR DIADE), Institut de Recherche pour le Développement, Montpellier, France.,Laboratoire Mixte International Adaptation des Plantes et Microorganismes Associés aux Stress Environnementaux (LMI LAPSE), Centre de Recherche de Bel Air, Dakar, Sénégal.,Biology Department, Unité Mixte de Recherche Diversité Adaptation et Développement des plantes (UMR DIADE), Université Montpellier, France.,Université Abdou Moumouni de Niamey, Niger
| | - Yves Vigouroux
- Unité Mixte de Recherche Diversité Adaptation et Développement des Plantes (UMR DIADE), Institut de Recherche pour le Développement, Montpellier, France.,Laboratoire Mixte International Adaptation des Plantes et Microorganismes Associés aux Stress Environnementaux (LMI LAPSE), Centre de Recherche de Bel Air, Dakar, Sénégal.,Biology Department, Unité Mixte de Recherche Diversité Adaptation et Développement des plantes (UMR DIADE), Université Montpellier, France
| |
|
48
|
Elkon R, Agami R. Characterization of noncoding regulatory DNA in the human genome. Nat Biotechnol 2017; 35:732-746. [DOI: 10.1038/nbt.3863] [Citation(s) in RCA: 63] [Impact Index Per Article: 9.0] [Received: 06/01/2016] [Accepted: 03/31/2017] [Indexed: 12/22/2022]
|
49
|
Ghazanfar S, Vuocolo T, Morrison JL, Nicholas LM, McMillen IC, Yang JYH, Buckley MJ, Tellam RL. Gene expression allelic imbalance in ovine brown adipose tissue impacts energy homeostasis. PLoS One 2017; 12:e0180378. [PMID: 28665992 PMCID: PMC5493397 DOI: 10.1371/journal.pone.0180378] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/12/2016] [Accepted: 06/14/2017] [Indexed: 12/22/2022] Open
Abstract
Heritable trait variation within a population of organisms is largely governed by DNA variations that impact gene transcription and protein function. Identifying genetic variants that affect complex functional traits is a primary aim of population genetics studies, especially in the context of human disease and agricultural production traits. The identification of alleles directly altering mRNA expression and thereby biological function is challenging due to difficulty in isolating direct effects of cis-acting genetic variations from indirect trans-acting genetic effects. Allele specific gene expression or allelic imbalance in gene expression (AI) occurring at heterozygous loci provides an opportunity to identify genes directly impacted by cis-acting genetic variants as indirect trans-acting effects equally impact the expression of both alleles. However, the identification of genes showing AI in the context of the expression of all genes remains a challenge due to a variety of technical and statistical issues. The current study focuses on the discovery of genes showing AI using single nucleotide polymorphisms as allelic reporters. By developing a computational and statistical process that addressed multiple analytical challenges, we ranked 5,809 genes for evidence of AI using RNA-Seq data derived from brown adipose tissue samples from a cohort of late gestation fetal lambs and then identified a conservative subgroup of 1,293 genes. Thus, AI was extensive, representing approximately 25% of the tested genes. Genes associated with AI were enriched for multiple Gene Ontology (GO) terms relating to lipid metabolism, mitochondrial function and the extracellular matrix. These functions suggest that cis-acting genetic variations causing AI in the population are preferentially impacting genes involved in energy homeostasis and tissue remodelling. These functions may contribute to production traits likely to be under genetic selection in the population.
Affiliation(s)
- Shila Ghazanfar
- Data61, CSIRO, North Ryde, NSW, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
| | - Tony Vuocolo
- CSIRO Agriculture, Queensland Biosciences Precinct, St Lucia, QLD, Australia
| | - Janna L. Morrison
- Early Origins of Adult Health Research Group, School of Pharmacy and Medical Sciences, Sansom Institute for Health Research, The University of South Australia, Adelaide, SA, Australia
| | - Lisa M. Nicholas
- Early Origins of Adult Health Research Group, School of Pharmacy and Medical Sciences, Sansom Institute for Health Research, The University of South Australia, Adelaide, SA, Australia
| | - Isabella C. McMillen
- Early Origins of Adult Health Research Group, School of Pharmacy and Medical Sciences, Sansom Institute for Health Research, The University of South Australia, Adelaide, SA, Australia
| | - Jean Y. H. Yang
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
| | | | - Ross L. Tellam
- CSIRO Agriculture, Queensland Biosciences Precinct, St Lucia, QLD, Australia
| |
|
50
|
Jiang Y, Zhang NR, Li M. SCALE: modeling allele-specific gene expression by single-cell RNA sequencing. Genome Biol 2017; 18:74. [PMID: 28446220 PMCID: PMC5407026 DOI: 10.1186/s13059-017-1200-8] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 02/17/2017] [Accepted: 03/24/2017] [Indexed: 12/13/2022] Open
Abstract
Allele-specific expression is traditionally studied by bulk RNA sequencing, which measures average expression across cells. Single-cell RNA sequencing allows the comparison of expression distribution between the two alleles of a diploid organism and the characterization of allele-specific bursting. Here, we propose SCALE to analyze genome-wide allele-specific bursting, with adjustment of technical variability. SCALE detects genes exhibiting allelic differences in bursting parameters and genes whose alleles burst non-independently. We apply SCALE to mouse blastocyst and human fibroblast cells and find that cis control in gene expression overwhelmingly manifests as differences in burst frequency.
Affiliation(s)
- Yuchao Jiang
- Genomics and Computational Biology Graduate Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Nancy R Zhang
- Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA, 19104, USA.
| | - Mingyao Li
- Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
| |
|