1
|
Wang F, Shao J, He S, Guo Y, Pan X, Wang Y, Nanaei HA, Chen L, Li R, Xu H, Yang Z, Liu M, Jiang Y. Allele-specific expression and splicing provides insight into the phenotypic differences between thin- and fat-tailed sheep breeds. J Genet Genomics 2022; 49:583-586. [PMID: 34998977 DOI: 10.1016/j.jgg.2021.12.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/10/2021] [Revised: 11/20/2021] [Accepted: 12/11/2021] [Indexed: 11/19/2022]
Affiliation(s)
- Fei Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Junjie Shao
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Sangang He
- Key Laboratory of Ruminant Genetics, Breeding & Reproduction, Ministry of Agriculture, China; Key Laboratory of Animal Biotechnology of Xinjiang, Institute of Biotechnology, Xinjiang Academy of Animal Science, Urumqi, Xinjiang 830026, China
| | - Yingwei Guo
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Xiangyu Pan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Yu Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Hojjat Asadollahpour Nanaei
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Lei Chen
- Key Laboratory of Ruminant Genetics, Breeding & Reproduction, Ministry of Agriculture, China; Key Laboratory of Animal Biotechnology of Xinjiang, Institute of Biotechnology, Xinjiang Academy of Animal Science, Urumqi, Xinjiang 830026, China
| | - Ran Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Han Xu
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Zhirui Yang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Mingjun Liu
- Key Laboratory of Ruminant Genetics, Breeding & Reproduction, Ministry of Agriculture, China; Key Laboratory of Animal Biotechnology of Xinjiang, Institute of Biotechnology, Xinjiang Academy of Animal Science, Urumqi, Xinjiang 830026, China.
| | - Yu Jiang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, China.
|
2
|
Quantitative neurogenetics: applications in understanding disease. Biochem Soc Trans 2021; 49:1621-1631. [PMID: 34282824 DOI: 10.1042/bst20200732] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/20/2021] [Revised: 06/11/2021] [Accepted: 06/21/2021] [Indexed: 12/31/2022]
Abstract
Neurodevelopmental and neurodegenerative disorders (NNDs) are a group of conditions with a broad range of core and co-morbidities, associated with dysfunction of the central nervous system. Improvements in high-throughput sequencing have led to the detection of putative genetic risk loci for NNDs; however, quantitative neurogenetic approaches need to be further developed in order to establish causality and the underlying molecular genetic mechanisms of pathogenesis. Here, we discuss an approach for prioritizing the contribution of genetic risk loci to complex-NND pathogenesis by estimating the possible impacts of these loci on gene regulation. Furthermore, we highlight the use of a tissue-specificity gene expression index and the application of artificial intelligence (AI) to improve the interpretation of the role of genetic risk elements in NND pathogenesis. Given that NND symptoms are associated with brain dysfunction, risk loci with direct, causative actions would comprise genes with essential functions in neural cells that are highly expressed in the brain. Indeed, NND risk genes implicated in brain dysfunction are disproportionately enriched in the brain compared with other tissues, which we refer to as brain-specific expressed genes. In addition, the tissue-specificity gene expression index can be used as a handle to identify non-brain contexts that are involved in NND pathogenesis. Lastly, we discuss how using an AI approach provides the opportunity to integrate the biological impacts of risk loci to identify those putative combinations of causative relationships through which genetic factors contribute to NND pathogenesis.
|
3
|
Degtyareva AO, Antontseva EV, Merkulova TI. Regulatory SNPs: Altered Transcription Factor Binding Sites Implicated in Complex Traits and Diseases. Int J Mol Sci 2021; 22:6454. [PMID: 34208629 PMCID: PMC8235176 DOI: 10.3390/ijms22126454] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 05/05/2021] [Revised: 06/15/2021] [Accepted: 06/15/2021] [Indexed: 12/19/2022]
Abstract
The vast majority of the genetic variants (mainly SNPs) associated with various human traits and diseases map to the noncoding part of the genome and are enriched in its regulatory compartment, suggesting that many causal variants may affect gene expression. The leading mechanism of action of these SNPs is the alteration of transcription factor binding, either through creation or disruption of transcription factor binding sites (TFBSs) or through a change in the affinity of these regulatory proteins for their cognate sites. In this review, we first focus on the history of the discovery of regulatory SNPs (rSNPs) and give a systematic description of the existing methodological approaches to their study. Then, we briefly review recent comprehensive examples of rSNPs that have been studied from the discovery of a change in the TFBS sequence caused by a nucleotide substitution, through identification of its effect on target gene expression, and eventually to phenotype. We also describe state-of-the-art genome-wide approaches to the identification of regulatory variants, including both those that make molecular sense of genome-wide association studies (GWAS) and alternative approaches whose primary goal is to determine the functionality of genetic variants. Among these approaches, special attention is paid to expression quantitative trait loci (eQTL) analysis and to the search for allele-specific events in RNA-seq data (ASE events) as well as in ChIP-seq, DNase-seq, and ATAC-seq data (ASB events).
Affiliation(s)
- Arina O. Degtyareva
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia; (A.O.D.); (E.V.A.)
| | - Elena V. Antontseva
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia; (A.O.D.); (E.V.A.)
| | - Tatiana I. Merkulova
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia; (A.O.D.); (E.V.A.)
- Department of Natural Sciences, Novosibirsk State University, 630090 Novosibirsk, Russia
|
4
|
Evans JM, Parker HG, Rutteman GR, Plassais J, Grinwis GCM, Harris AC, Lana SE, Ostrander EA. Multi-omics approach identifies germline regulatory variants associated with hematopoietic malignancies in retriever dog breeds. PLoS Genet 2021; 17:e1009543. [PMID: 33983928 PMCID: PMC8118335 DOI: 10.1371/journal.pgen.1009543] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/13/2021] [Accepted: 04/12/2021] [Indexed: 12/18/2022]
Abstract
Histiocytic sarcoma is an aggressive hematopoietic malignancy of mature tissue histiocytes with a poorly understood etiology in humans. A histologically and clinically similar counterpart affects flat-coated retrievers (FCRs) at unusually high frequency, with 20% developing the lethal disease. The similar clinical presentation combined with the closed population structure of dogs, leading to high genetic homogeneity, makes dogs an excellent model for genetic studies of cancer susceptibility. To determine the genetic risk factors underlying histiocytic sarcoma in FCRs, we conducted multiple genome-wide association studies (GWASs), identifying two loci that confer significant risk on canine chromosomes (CFA) 5 (P_wald = 4.83 × 10^-9) and 19 (P_wald = 2.25 × 10^-7). We subsequently undertook a multi-omics approach that has been largely unexplored in the canine model to interrogate these regions, generating whole genome, transcriptome, and chromatin immunoprecipitation sequencing. These data highlight the PI3K pathway gene PIK3R6 on CFA5, and proximal candidate regulatory variants that are strongly associated with histiocytic sarcoma and predicted to impact transcription factor binding. The CFA5 association colocalizes with susceptibility loci for two hematopoietic malignancies, hemangiosarcoma and B-cell lymphoma, in the closely related golden retriever breed, revealing the risk contribution this single locus makes to multiple hematological cancers. By comparison, the CFA19 locus is unique to the FCR and harbors risk alleles associated with upregulation of TNFAIP6, which itself affects cell migration and metastasis. Together, these loci explain ~35% of disease risk, an exceptionally high value that demonstrates the advantages of domestic dogs for complex trait mapping and genetic studies of cancer susceptibility.
We have identified two regions of the canine genome that explain a striking 35% of risk for developing histiocytic sarcoma in FCRs. The disease is uniformly lethal, affects 20% of FCRs, and parallels a cancer of the same name in humans. Both regions harbor genes involved in cell migration and cancer-related pathways. The first includes variants in regulatory regions at the tumor suppressor PIK3R6 locus that are strongly associated with histiocytic sarcoma and likely confer risk for other hematopoietic cancers. FCRs with risk alleles at the second locus demonstrate increased expression of TNFAIP6, which correlates with poor prognosis in multiple human cancers. In identifying genomic differences between affected and unaffected dogs, we advance our understanding of both canine and human health biology and set the stage for the development of diagnostic and therapeutic strategies.
Affiliation(s)
- Jacquelyn M. Evans
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Heidi G. Parker
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Gerard R. Rutteman
- Department of Clinical Sciences, division Internal Medicine of Companion Animals, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
| | - Jocelyn Plassais
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Guy C. M. Grinwis
- Department Biomedical Health Sciences, division Pathology, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
| | - Alexander C. Harris
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Susan E. Lana
- College of Veterinary Medicine and Biomedical Sciences, Colorado State University, Fort Collins, Colorado, United States of America
| | - Elaine A. Ostrander
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America
|
5
|
Cooper RD, Shaffer HB. Allele-specific expression and gene regulation help explain transgressive thermal tolerance in non-native hybrids of the endangered California tiger salamander (Ambystoma californiense). Mol Ecol 2021; 30:987-1004. [PMID: 33338297 DOI: 10.1111/mec.15779] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/20/2019] [Revised: 11/30/2020] [Accepted: 12/11/2020] [Indexed: 01/26/2023]
Abstract
Hybridization between native and non-native species is an ongoing global conservation threat. Hybrids that exhibit traits and tolerances that surpass parental values are of particular concern, given their potential to outperform native species. Effective management of hybrid populations requires an understanding of both physiological performance and the underlying mechanisms that drive transgressive hybrid traits. Here, we explore several aspects of the hybridization between the endangered California tiger salamander (Ambystoma californiense; CTS) and the introduced barred tiger salamander (Ambystoma mavortium; BTS). We assayed critical thermal maximum (CTMax) to compare the ability of CTS, BTS and F1 hybrids to tolerate acute thermal stress, and found that hybrids exhibit a wide range of CTMax values, with 33% (4/12) able to tolerate temperatures greater than either parent. We then quantified the genomic response, measured at the RNA transcript level, of each salamander, to explore the mechanisms underlying thermal tolerance strategies. We found that CTS and BTS have strikingly different values and tissue-specific patterns of overall gene expression, with hybrids expressing intermediate values. F1 hybrids display abundant and variable degrees of allele-specific expression (ASE), likely arising from extensive compensatory evolution in gene regulatory mechanisms between CTS and BTS. We found evidence that the proportion of genes with allelic imbalance in individual hybrids correlates with their CTMax, suggesting a link between ASE and expanded thermal tolerance that may contribute to the success of hybrid salamanders in California. Future climate change may further complicate management of CTS if hybrid salamanders are better equipped to deal with rising temperatures.
Affiliation(s)
- Robert D Cooper
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA; La Kretz Center for California Conservation Science, Institute of the Environment and Sustainability, University of California, Los Angeles, CA, USA
| | - H Bradley Shaffer
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA; La Kretz Center for California Conservation Science, Institute of the Environment and Sustainability, University of California, Los Angeles, CA, USA
|
6
|
Liu Y, Liu X, Zheng Z, Ma T, Liu Y, Long H, Cheng H, Fang M, Gong J, Li X, Zhao S, Xu X. Genome-wide analysis of expression QTL (eQTL) and allele-specific expression (ASE) in pig muscle identifies candidate genes for meat quality traits. Genet Sel Evol 2020; 52:59. [PMID: 33036552 PMCID: PMC7547458 DOI: 10.1186/s12711-020-00579-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 10/16/2019] [Accepted: 09/28/2020] [Indexed: 12/23/2022]
Abstract
BACKGROUND Genetic analysis of gene expression level is a promising approach for characterizing candidate genes that are involved in complex economic traits such as meat quality. In the present study, we conducted expression quantitative trait loci (eQTL) and allele-specific expression (ASE) analyses based on RNA-sequencing (RNAseq) data from the longissimus muscle of 189 Duroc × Luchuan crossed pigs in order to identify some candidate genes for meat quality traits. RESULTS Using a genome-wide association study based on a mixed linear model, we identified 7192 cis-eQTL corresponding to 2098 cis-genes (p ≤ 1.33e-3, FDR ≤ 0.05) and 6400 trans-eQTL corresponding to 863 trans-genes (p ≤ 1.13e-6, FDR ≤ 0.05). ASE analysis using RNAseq SNPs identified 9815 significant ASE-SNPs in 2253 unique genes. Integrative analysis between the cis-eQTL and ASE target genes identified 540 common genes, including 33 genes with expression levels that were correlated with at least one meat quality trait. Among these 540 common genes, 63 have been reported previously as candidate genes for meat quality traits, such as PHKG1 (q-value = 1.67e-6 for the leading SNP in the cis-eQTL analysis), NUDT7 (q-value = 5.67e-13), FADS2 (q-value = 8.44e-5), and DGAT2 (q-value = 1.24e-3). CONCLUSIONS The present study confirmed several previously published candidate genes and identified some novel candidate genes for meat quality traits via eQTL and ASE analyses, which will be useful to prioritize candidate genes in further studies.
Affiliation(s)
- Yan Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Xiaolei Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Zhiwei Zheng
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Tingting Ma
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Ying Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Huan Long
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Huijun Cheng
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Ming Fang
- Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture and Rural Affairs, Fisheries College, Jimei University, Xiamen, 361021 China
| | - Jing Gong
- Colleges of Informatics, Huazhong Agricultural University, Wuhan, 430070 China
| | - Xinyun Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
| | - Xuewen Xu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education & College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, 430070 China
- The Cooperative Innovation Center for Sustainable Pig Production, Wuhan, 430070 China
- Key Lab of Swine Genetics and Breeding of Ministry of Agriculture and Rural Affairs, Wuhan, 430070 China
|
7
|
Fan KH, Devos KM, Schliekelman P. Strategies for eQTL mapping in allopolyploid organisms. Theor Appl Genet 2020; 133:2477-2497. [PMID: 32462429 DOI: 10.1007/s00122-020-03612-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2019] [Accepted: 05/15/2020] [Indexed: 06/11/2023]
Abstract
KEY MESSAGE This study uses simulations to explore statistical power and false-positive rates for eQTL mapping in allopolyploid organisms and provides guidelines to apply eQTL mapping in these organisms. In recent years, RNA-seq has become the dominant technology for eQTL studies. However, most work has been in diploid organisms. Many species of economic and environmental importance are polyploid, and approaches for eQTL mapping in polyploids are not well developed. High similarity between duplicated genes in polyploids will cause misassignment of sequence reads and may cause false-positive results and/or lack of power to detect eQTL. In this paper, we first explore the similarity of homoeologous transcripts in polyploid organisms. We find that 5-20% of genes (varying with organism) in important agricultural plants such as wheat, soybean, and switchgrass are not sufficiently diverged between duplicated genomes to allow unambiguous assignment of reads. Second, we examine the impact of misassigned reads on eQTL mapping and show that both false-positive and false-negative rates can be greatly inflated. Third, we compare four strategies for dealing with ambiguous reads: (1) dividing ambiguous reads evenly between homoeologous transcripts, (2) assigning them proportionally, (3) using all reads for all genes, and (4) discarding ambiguous reads. We find that the strategy of discarding ambiguous reads gives the best balance of false-positive and false-negative rates for most genes. However, for genes that are very similar between genomes, using all reads is the only choice. This leads to reduced power, but false-positive rates will be maintained. We also discuss QTL mapping in polyploids using allele-specific expression (ASE) and show how the proportion of ASE-informative reads varies according to the divergence between homoeologous genes.
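The four strategies for ambiguous reads listed in this abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `allocate_ambiguous`, the restriction to a pair of homoeologs, and the reading of "proportionally" as "in proportion to each homoeolog's uniquely mapped reads" are all assumptions made for illustration.

```python
def allocate_ambiguous(unique_a, unique_b, ambiguous, strategy):
    """Return (count_a, count_b) for two homoeologous transcripts.

    unique_a, unique_b: reads mapping unambiguously to each homoeolog.
    ambiguous: reads that cannot be assigned to one homoeolog.
    strategy: 'even', 'proportional', 'all', or 'discard'
              (hypothetical labels for the four strategies in the abstract).
    """
    if strategy == "even":
        # (1) Split ambiguous reads equally between the two transcripts.
        half = ambiguous / 2
        return unique_a + half, unique_b + half
    if strategy == "proportional":
        # (2) Assumption: weight by the share of uniquely mapped reads;
        # fall back to an even split when there are no unique reads.
        total = unique_a + unique_b
        frac_a = unique_a / total if total > 0 else 0.5
        return unique_a + ambiguous * frac_a, unique_b + ambiguous * (1 - frac_a)
    if strategy == "all":
        # (3) Count every ambiguous read toward both transcripts.
        return unique_a + ambiguous, unique_b + ambiguous
    if strategy == "discard":
        # (4) Ignore ambiguous reads entirely.
        return unique_a, unique_b
    raise ValueError(f"unknown strategy: {strategy}")
```

With 30 and 10 unique reads and 20 ambiguous reads, the even split inflates the weaker homoeolog's share, whereas discarding keeps the 3:1 ratio at the cost of read depth, which is the power trade-off the abstract describes.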
Affiliation(s)
- Kang-Hsien Fan
- Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Katrien M Devos
- Department of Crop and Soil Sciences, Institute of Plant Breeding, Genetics and Genomics, University of Georgia, Athens, GA, USA
|
8
|
Lee C, Kang EY, Gandal MJ, Eskin E, Geschwind DH. Profiling allele-specific gene expression in brains from individuals with autism spectrum disorder reveals preferential minor allele usage. Nat Neurosci 2019; 22:1521-1532. [PMID: 31455884 PMCID: PMC6750256 DOI: 10.1038/s41593-019-0461-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 12/12/2017] [Accepted: 07/09/2019] [Indexed: 12/21/2022]
Abstract
One fundamental but understudied mechanism of gene regulation in disease is allele-specific expression (ASE), the preferential expression of one allele. We leveraged RNA-sequencing data from human brain to assess ASE in autism spectrum disorder (ASD). When ASE is observed in ASD, the allele with lower population frequency (minor allele) is preferentially more highly expressed than the major allele, opposite to the canonical pattern. Importantly, genes showing ASE in ASD are enriched in those downregulated in ASD postmortem brains and in genes harboring de novo mutations in ASD. Two regions, 14q32 and 15q11, containing all known orphan C/D box small nucleolar RNAs (snoRNAs), are particularly enriched in shifts to higher minor allele expression. We demonstrate that this allele shifting enhances snoRNA-targeted splicing changes in ASD-related target genes in idiopathic ASD and 15q11-q13 duplication syndrome. Together, these results implicate allelic imbalance and dysregulation of orphan C/D box snoRNAs in ASD pathogenesis.
Affiliation(s)
- Changhoon Lee
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Neuroscience, Peter O'Donnell Jr. Brain Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Eun Yong Kang
- Department of Computer Science, Henry Samueli School of Engineering, University of California, Los Angeles, Los Angeles, CA, USA
| | - Michael J Gandal
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Center for Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Eleazar Eskin
- Department of Computer Science, Henry Samueli School of Engineering, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Daniel H Geschwind
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
- Center for Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
- Center for Autism Research and Treatment, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
|
9
|
Metzger BPH, Wittkopp PJ. Compensatory trans-regulatory alleles minimizing variation in TDH3 expression are common within Saccharomyces cerevisiae. Evol Lett 2019; 3:448-461. [PMID: 31636938 PMCID: PMC6791293 DOI: 10.1002/evl3.137] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 06/06/2019] [Revised: 08/07/2019] [Accepted: 08/09/2019] [Indexed: 11/06/2022]
Abstract
Heritable variation in gene expression is common within species. Much of this variation is due to genetic differences outside of the gene with altered expression and is trans-acting. This trans-regulatory variation is often polygenic, with individual variants typically having small effects, making the genetic architecture and evolution of trans-regulatory variation challenging to study. Consequently, key questions about trans-regulatory variation remain, including the variability of trans-regulatory variation within a species, how selection affects trans-regulatory variation, and how trans-regulatory variants are distributed throughout the genome and within a species. To address these questions, we isolated and measured trans-regulatory differences affecting TDH3 promoter activity among 56 strains of Saccharomyces cerevisiae, finding that trans-regulatory backgrounds varied approximately twofold in their effects on TDH3 promoter activity. Comparing this variation to neutral models of trans-regulatory evolution based on empirical measures of mutational effects revealed that despite this variability in the effects of trans-regulatory backgrounds, stabilizing selection has constrained trans-regulatory differences within this species. Using a powerful quantitative trait locus mapping method, we identified ∼100 trans-acting expression quantitative trait loci in each of three crosses to a common reference strain, indicating that regulatory variation is more polygenic than previous studies have suggested. Loci altering expression were located throughout the genome, and many loci were strain specific. This distribution and prevalence of alleles is consistent with recent theories about the genetic architecture of complex traits. In all mapping experiments, the nonreference strain alleles increased and decreased TDH3 promoter activity with similar frequencies, suggesting that stabilizing selection maintained many trans-acting variants with opposing effects. This variation may provide the raw material for compensatory evolution and larger scale regulatory rewiring observed in developmental systems drift among species.
Affiliation(s)
- Brian P H Metzger
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109; Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637
| | - Patricia J Wittkopp
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109; Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, Michigan 48109
|
10
|
Albert E, Duboscq R, Latreille M, Santoni S, Beukers M, Bouchet JP, Bitton F, Gricourt J, Poncet C, Gautier V, Jiménez-Gómez JM, Rigaill G, Causse M. Allele-specific expression and genetic determinants of transcriptomic variations in response to mild water deficit in tomato. Plant J 2018; 96:635-650. [PMID: 30079488 DOI: 10.1111/tpj.14057] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 06/01/2018] [Revised: 07/31/2018] [Accepted: 08/02/2018] [Indexed: 06/08/2023]
Abstract
Characterizing the natural diversity of gene expression across environments is an important step in understanding how genotype-by-environment interactions shape phenotypes. Here, we analyzed the impact of water deficit on gene expression levels in tomato at the genome-wide scale. We sequenced the transcriptome of growing leaves and fruit pericarps at cell expansion stage in a cherry and a large-fruited accession and their F1 hybrid grown under two watering regimes. Gene expression levels were steadily affected by the genotype and the watering regime. Whereas phenotypes showed mostly additive inheritance, ~80% of the genes displayed non-additive inheritance. By comparing allele-specific expression (ASE) in the F1 hybrid to the allelic expression in both parental lines, 3005 genes in leaf and 2857 genes in fruit deviated from a 1:1 ratio independently of the watering regime. Among these genes, ~55% were controlled by cis factors, ~25% by trans factors and ~20% by a combination of both types of factors. A total of 328 genes in leaf and 113 in fruit exhibited significant ASE-by-watering regime interaction, among which ~80% presented trans-by-watering regime interaction, suggesting a response to water deficit mediated through a majority of trans-acting loci in tomato. We cross-validated the expression levels of 274 transcripts in fruit and leaves of 124 recombinant inbred lines (RILs) and identified 163 expression quantitative trait loci (eQTLs) mostly confirming the divergences identified by ASE. Combining phenotypic and expression data, we observed a complex network of variation between genes encoding enzymes involved in the sugar metabolism.
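The cis/trans partition described in this abstract rests on comparing the allelic ratio between the two parents with the allelic ratio inside the F1 hybrid, where both alleles share the same trans environment. The sketch below illustrates that general logic only; it is not the authors' pipeline. The function name `classify_regulation`, the use of raw read counts, and the fixed `threshold` on the absolute log2 ratio are all assumptions made for illustration (a real analysis would use a statistical test and normalized counts).

```python
import math


def classify_regulation(parent_a, parent_b, f1_a, f1_b, threshold=0.5):
    """Classify regulatory divergence for one gene from positive read counts.

    parent_a, parent_b: expression of the gene in each parental line.
    f1_a, f1_b: allele-specific counts for the same gene in the F1 hybrid.
    threshold: hypothetical cutoff on |log2 ratio| standing in for a test.
    """
    # Total expression divergence between the two parents.
    parental = math.log2(parent_a / parent_b)
    # In the hybrid both alleles see the same trans factors,
    # so any remaining imbalance is attributed to cis effects.
    cis = math.log2(f1_a / f1_b)
    # Whatever parental divergence is not explained by cis is assigned to trans.
    trans = parental - cis

    has_cis = abs(cis) >= threshold
    has_trans = abs(trans) >= threshold
    if has_cis and has_trans:
        return "cis+trans"
    if has_cis:
        return "cis"
    if has_trans:
        return "trans"
    return "conserved"


if __name__ == "__main__":
    # Parents differ twofold and the hybrid keeps that imbalance.
    print(classify_regulation(200, 100, 200, 100))
    # Parents differ twofold but the hybrid alleles are balanced.
    print(classify_regulation(200, 100, 100, 100))
```

With these assumed inputs, an imbalance that persists in the hybrid is labeled cis, while one that disappears in the hybrid is labeled trans.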
Affiliation(s)
- Elise Albert
- INRA, UR1052, Centre de Recherche PACA, Génétique et Amélioration des Fruits et Légumes, 67 Allée des Chênes, Domaine Saint Maurice, CS60094, Montfavet, 84143, France
| | - Renaud Duboscq
- INRA, UR1052, Centre de Recherche PACA, Génétique et Amélioration des Fruits et Légumes, 67 Allée des Chênes, Domaine Saint Maurice, CS60094, Montfavet, 84143, France
| | - Muriel Latreille
- INRA, UMR1334, Amélioration génétique et Adaptation des Plantes, Montpellier SupAgro-INRA-IRD-UMII, 2 Place Pierre Viala, Montpellier, 34060, France
| | - Sylvain Santoni
- INRA, UMR1334, Amélioration génétique et Adaptation des Plantes, Montpellier SupAgro-INRA-IRD-UMII, 2 Place Pierre Viala, Montpellier, 34060, France
| | - Matthieu Beukers
- INRA, UR1052, Centre de Recherche PACA, Génétique et Amélioration des Fruits et Légumes, 67 Allée des Chênes, Domaine Saint Maurice, CS60094, Montfavet, 84143, France
| | - Jean-Paul Bouchet
- INRA, UR1052, Centre de Recherche PACA, Génétique et Amélioration des Fruits et Légumes, 67 Allée des Chênes, Domaine Saint Maurice, CS60094, Montfavet, 84143, France
| | - Fréderique Bitton
- INRA, UR1052, Centre de Recherche PACA, Génétique et Amélioration des Fruits et Légumes, 67 Allée des Chênes, Domaine Saint Maurice, CS60094, Montfavet, 84143, France
| | - Justine Gricourt
- INRA, UR1052, Centre de Recherche PACA, Génétique et Amélioration des Fruits et Légumes, 67 Allée des Chênes, Domaine Saint Maurice, CS60094, Montfavet, 84143, France
| | - Charles Poncet
- INRA, UMR1095, Génétique Diversité et Ecophysiologie des Céréales, 5 Chemin de Beaulieu, Clermont-Ferrand, 63039, France
| | - Véronique Gautier
- INRA, UMR1095, Génétique Diversité et Ecophysiologie des Céréales, 5 Chemin de Beaulieu, Clermont-Ferrand, 63039, France
| | - José M Jiménez-Gómez
- INRA, UMR1318, Institut Jean-Pierre Bourgin, AgroParisTech-INRA-CNRS, Route de Saint Cyr, Versailles, 78026, France
| | - Guillem Rigaill
- INRA, UMR8071, Laboratoire de Mathématiques et Modélisation d'Evry, Université d'Evry Val d'Essonne, ENSIIE-INRA-CNRS, Évry, 91037, France
| | - Mathilde Causse
- INRA, UR1052, Centre de Recherche PACA, Génétique et Amélioration des Fruits et Légumes, 67 Allée des Chênes, Domaine Saint Maurice, CS60094, Montfavet, 84143, France
| |
|
11
|
Event Analysis: Using Transcript Events To Improve Estimates of Abundance in RNA-seq Data. G3: Genes, Genomes, Genetics 2018; 8:2923-2940. [PMID: 30021829 PMCID: PMC6118309 DOI: 10.1534/g3.118.200373] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023]
Abstract
Alternative splicing leverages genomic content by allowing the synthesis of multiple transcripts and, by implication, protein isoforms, from a single gene. However, estimating the abundance of transcripts produced in a given tissue from short sequencing reads is difficult and can result in both the construction of transcripts that do not exist and the failure to identify true transcripts. An alternative approach is to catalog the events that make up isoforms (splice junctions and exons). We present here the Event Analysis (EA) approach, where we project transcripts onto the genome and identify overlapping/unique regions and junctions. In addition, all possible logical junctions are assembled into a catalog. Transcripts are filtered before quantitation based on simple measures: the proportion of the events detected, and the coverage. We find that mapping to a junction catalog is more efficient at detecting novel junctions than mapping in a splice-aware manner. We identify 99.8% of true transcripts while iReckon identifies 82% of the true transcripts and creates more transcripts not included in the simulation than were initially used in the simulation. Using PacBio Iso-seq data from a mouse neural progenitor cell model, EA detects 60% of the novel junctions that are combinations of existing exons while only 43% are detected by STAR. EA further detects ∼5,000 annotated junctions missed by STAR. Filtering transcripts based on the proportion of the transcript detected and the number of reads on average supporting that transcript captures 95% of the PacBio transcriptome. Filtering the reference transcriptome before quantitation results in a more stable estimate of isoform abundance, with improved correlation between replicates. This was particularly evident when EA was applied to an RNA-seq study of type 1 diabetes (T1D), where the coefficient of variation among subjects (n = 81) in the transcript abundance estimates was substantially reduced compared to the estimation using the full reference. EA focuses on individual transcriptional events. These events can be quantitated and analyzed directly or used to identify the probable set of expressed transcripts. Simple rules based on detected events and coverage used in filtering result in a dramatic improvement in isoform estimation without the use of ancillary data (e.g., ChIP, long reads) that may not be available for many studies.
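The filtering rule summarized in this abstract (keep a transcript only if enough of its events are detected and it has enough read support) can be sketched in a few lines. This is an illustrative sketch, not the EA software: the function name `keep_transcript`, the representation of a transcript as a list of per-event read counts, and both cutoff values are assumptions, since the abstract does not state the thresholds used.

```python
def keep_transcript(event_reads, min_detected=0.75, min_mean_reads=5.0):
    """Decide whether a transcript passes a simple event-based filter.

    event_reads: read counts for each event (exon fragment or junction)
        belonging to the transcript.
    min_detected: hypothetical minimum fraction of events with at least one read.
    min_mean_reads: hypothetical minimum average number of reads per event.
    """
    if not event_reads:
        # A transcript with no catalogued events cannot be supported.
        return False
    # Proportion of the transcript's events that were observed at all.
    detected = sum(1 for reads in event_reads if reads > 0) / len(event_reads)
    # Average read support across all events, detected or not.
    mean_reads = sum(event_reads) / len(event_reads)
    return detected >= min_detected and mean_reads >= min_mean_reads


if __name__ == "__main__":
    transcripts = {
        "tx_well_supported": [10, 8, 12, 0],
        "tx_single_event": [40, 0, 0, 0],
        "tx_low_coverage": [1, 1, 1, 1],
    }
    # Retain only the transcripts that pass both criteria before quantitation.
    kept = [name for name, reads in transcripts.items() if keep_transcript(reads)]
    print(kept)
```

Requiring both criteria means a transcript supported by one deeply covered event, or by many events with a single read each, is excluded from the reference before quantitation.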
|
12
|
Yeo J, Morales DA, Chen T, Crawford EL, Zhang X, Blomquist TM, Levin AM, Massion PP, Arenberg DA, Midthun DE, Mazzone PJ, Nathan SD, Wainz RJ, Nana-Sinkam P, Willey PFS, Arend TJ, Padda K, Qiu S, Federov A, Hernandez DAR, Hammersley JR, Yoon Y, Safi F, Khuder SA, Willey JC. RNAseq analysis of bronchial epithelial cells to identify COPD-associated genes and SNPs. BMC Pulm Med 2018; 18:42. [PMID: 29506519 PMCID: PMC5838965 DOI: 10.1186/s12890-018-0603-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/08/2017] [Accepted: 02/23/2018] [Indexed: 01/09/2023]
Abstract
Background: There is a need for more powerful methods to identify low-effect SNPs that contribute to hereditary COPD pathogenesis. We hypothesized that SNPs contributing to COPD risk through cis-regulatory effects are enriched in genes comprised by bronchial epithelial cell (BEC) expression patterns associated with COPD. Methods: To test this hypothesis, normal BEC specimens were obtained by bronchoscopy from 60 subjects: 30 subjects with COPD defined by spirometry (FEV1/FVC < 0.7, FEV1% < 80%), and 30 non-COPD controls. Targeted next generation sequencing was used to measure total and allele-specific expression of 35 genes in genome maintenance (GM) gene pathways linked to COPD pathogenesis, including seven TP53 and CEBP transcription factor family members. Shrinkage linear discriminant analysis (SLDA) was used to identify COPD-classification models. COPD GWAS were queried for putative cis-regulatory SNPs in the targeted genes. Results: On a network basis, TP53 and CEBP transcription factor pathway gene pair network connections, including key DNA repair gene ERCC5, were significantly different in COPD subjects (e.g., Wilcoxon rank sum test for closeness, p-value = 5.0E-11). ERCC5 SNP rs4150275 association with chronic bronchitis was identified in a set of Lung Health Study (LHS) COPD GWAS SNPs restricted to those in putative regulatory regions within the targeted genes, and this association was validated in the COPDGene non-Hispanic white (NHW) GWAS. ERCC5 SNP rs4150275 is linked (D’ = 1) to ERCC5 SNP rs17655, which displayed differential allelic expression (DAE) in BEC and is an expression quantitative trait locus (eQTL) in lung tissue (p = 3.2E-7). SNPs in linkage (D’ = 1) with rs17655 were predicted to alter miRNA binding (rs873601). A classifier model that comprised gene features CAT, CEBPG, GPX1, KEAP1, TP73, and XPA had a pooled 10-fold cross-validation receiver operator characteristic area under the curve of 75.4% (95% CI: 66.3%–89.3%). The prevalence of DAE was higher than expected (p = 0.0023) in the classifier genes. Conclusions: GM genes comprised by COPD-associated BEC expression patterns were enriched for SNPs with cis-regulatory function, including a putative cis-rSNP in ERCC5 that was associated with COPD risk. These findings support additional total and allele-specific expression analysis of gene pathways with high prior likelihood for involvement in COPD pathogenesis.
Affiliation(s)
- Jiyoun Yeo
- Department of Pathology, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
| | - Diego A Morales
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
| | - Tian Chen
- Department of Mathematics and Statistics, The University of Toledo, 2801 W. Bancroft Street, Toledo, OH, 43606, USA
| | - Erin L Crawford
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
| | - Xiaolu Zhang
- Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
| | - Thomas M Blomquist
- Department of Pathology, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
| | - Albert M Levin
- Department of Biostatistics, Henry Ford Health System, 1 Ford Place, Detroit, MI, 48202, USA
| | - Pierre P Massion
- Thoracic Program, Vanderbilt Ingram Cancer Center, Nashville, TN, 37232, USA
| | - David E Midthun
- Department of Pulmonary and Critical Care Medicine, Mayo Clinic, 200 1st St SW, Rochester, MN, 55905, USA
| | - Peter J Mazzone
- Department of Pulmonary Medicine, Cleveland Clinic, 9500 Euclid Ave, Cleveland, OH, 44195, USA
| | - Steven D Nathan
- Department of Pulmonary Medicine, Inova Fairfax Hospital, 3300 Gallows Road, Falls Church, VA, 22042-3300, USA
| | - Ronald J Wainz
- The Toledo Hospital, 2142 N Cove Blvd, Toledo, OH, 43606, USA
| | - Patrick Nana-Sinkam
- Division of Pulmonary Diseases and Critical Care Medicine, Virginia Commonwealth University, Richmond, VA, 23284-2512, USA; Ohio State University James Comprehensive Cancer Center and Solove Research Institute, Columbus, OH, USA
| | - Paige F S Willey
- American Enterprise Institute, 1789 Massachusetts Ave NW, Washington, DC, 20036, USA
| | - Taylor J Arend
- The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
| | - Karanbir Padda
- Emory University School of Medicine, 1648 Pierce Dr NE, Atlanta, GA, 30307, USA
| | - Shuhao Qiu
- Department of Medicine, The University of Toledo Medical Center, 3000 Arlington Avenue, Toledo, OH, 43614, USA
| | - Alexei Federov
- Department of Mathematics and Statistics, The University of Toledo, 2801 W. Bancroft Street, Toledo, OH, 43606, USA; Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
| | - Dawn-Alita R Hernandez
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - Jeffrey R Hammersley
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - Youngsook Yoon
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - Fadi Safi
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - Sadik A Khuder
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
| | - James C Willey
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA.
| |
|
13
|
Tian L, Khan A, Ning Z, Yuan K, Zhang C, Lou H, Yuan Y, Xu S. Genome-wide comparison of allele-specific gene expression between African and European populations. Hum Mol Genet 2018; 27:1067-1077. [DOI: 10.1093/hmg/ddy027] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/14/2017] [Accepted: 01/05/2018] [Indexed: 11/12/2022]
Affiliation(s)
- Lei Tian
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Asifullah Khan
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan-23200 KP, Pakistan
| | - Zhilin Ning
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Kai Yuan
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Chao Zhang
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Haiyi Lou
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
| | - Yuan Yuan
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
| | - Shuhua Xu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Collaborative Innovation Center of Genetics and Development, Shanghai 200438, China
| |
|
14
|
Schuierer S, Carbone W, Knehr J, Petitjean V, Fernandez A, Sultan M, Roma G. A comprehensive assessment of RNA-seq protocols for degraded and low-quantity samples. BMC Genomics 2017; 18:442. [PMID: 28583074 PMCID: PMC5460543 DOI: 10.1186/s12864-017-3827-y] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 01/17/2017] [Accepted: 05/29/2017] [Indexed: 12/27/2022]
Abstract
BACKGROUND: RNA-sequencing (RNA-seq) has emerged as one of the most sensitive tools for gene expression analysis. Among the library preparation methods available, the standard poly(A)+ enrichment provides a comprehensive, detailed, and accurate view of polyadenylated RNAs. However, on samples of suboptimal quality, ribosomal RNA depletion and exon capture methods have recently been reported as better alternatives. METHODS: We compared for the first time three commercial Illumina library preparation kits (TruSeq Stranded mRNA, TruSeq Ribo-Zero rRNA Removal, and TruSeq RNA Access) as representatives of these three different approaches using well-established human reference RNA samples from the MAQC/SEQC consortium on a wide range of input amounts (from 100 ng down to 1 ng) and degradation levels (intact, degraded, and highly degraded). RESULTS: We assessed the accuracy of the generated expression values by comparison to gold standard TaqMan qPCR measurements and gained unprecedented insight into the limits of applicability in terms of input quantity and sample quality of each protocol. We found that each protocol generates highly reproducible results (R² > 0.92) on intact RNA samples down to input amounts of 10 ng. For degraded RNA samples, Ribo-Zero showed clear performance advantages over the other two protocols as it generated more accurate and more reproducible gene expression results even at very low input amounts such as 1 ng and 2 ng. For highly degraded RNA samples, RNA Access performed best, generating reliable data down to 5 ng input. CONCLUSIONS: We found that the ribosomal RNA depletion protocol from Illumina works very well at amounts far below recommendation and over a good range of intact and degraded material. We also infer that the exome-capture protocol (RNA Access, Illumina) performs better than other methods on highly degraded and low-amount samples.
Affiliation(s)
- Sven Schuierer
- Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland.
| | - Walter Carbone
- Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
| | - Judith Knehr
- Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
| | - Virginie Petitjean
- Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
| | - Anita Fernandez
- Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland
| | - Marc Sultan
- Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland.
| | - Guglielmo Roma
- Novartis Institutes for Biomedical Research, Novartis Pharma AG, Basel, Switzerland.
| |
|