1
|
Xavier JM, Magno R, Russell R, de Almeida BP, Jacinta-Fernandes A, Besouro-Duarte A, Dunning M, Samarajiwa S, O'Reilly M, Maia AM, Rocha CL, Rosli N, Ponder BAJ, Maia AT. Identification of candidate causal variants and target genes at 41 breast cancer risk loci through differential allelic expression analysis. Sci Rep 2024; 14:22526. [PMID: 39341862 PMCID: PMC11438911 DOI: 10.1038/s41598-024-72163-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 09/04/2024] [Indexed: 10/01/2024] Open
Abstract
Understanding breast cancer genetic risk relies on identifying causal variants and candidate target genes in risk loci identified by genome-wide association studies (GWAS), which remains challenging. Since most loci fall in active gene regulatory regions, we developed a novel approach facilitated by pinpointing the variants with greater regulatory potential in the disease's tissue of origin. Through genome-wide differential allelic expression (DAE) analysis, using microarray data from 64 normal breast tissue samples, we mapped the variants associated with DAE (daeQTLs). Then, we intersected these with GWAS data to reveal candidate risk regulatory variants and analysed their cis-acting regulatory potential. Finally, we validated our approach by extensive functional analysis of the 5q14.1 breast cancer risk locus. We observed widespread gene expression regulation by cis-acting variants in breast tissue, with 65% of coding and noncoding expressed genes displaying DAE (daeGenes). We identified over 54 K daeQTLs for 6761 (26%) daeGenes, including 385 daeGenes harbouring variants previously associated with BC risk. We found 1431 daeQTLs mapped to 93 different loci in strong linkage disequilibrium with risk-associated variants (risk-daeQTLs), suggesting a link between risk-causing variants and cis-regulation. There were 122 risk-daeQTL with stronger cis-acting potential in active regulatory regions with protein binding evidence. These variants mapped to 41 risk loci, of which 29 had no previous report of target genes and were candidates for regulating the expression levels of 65 genes. As validation, we identified and functionally characterised five candidate causal variants at the 5q14.1 risk locus targeting the ATG10 and ATP6AP1L genes, likely acting via modulation of alternative transcription and transcription factor binding. 
Our study demonstrates the power of DAE analysis and daeQTL mapping to identify causal regulatory variants and target genes at breast cancer risk loci, including those with complex regulatory landscapes. It additionally provides a genome-wide resource of variants associated with DAE for future functional studies.
Affiliation(s)
- Joana M Xavier
- Cintesis@Rise, Universidade do Algarve, Faro, Portugal.
- Centro de Ciências do Mar (CCMAR), Universidade do Algarve, Faro, Portugal.
| | - Ramiro Magno
- Cintesis@Rise, Universidade do Algarve, Faro, Portugal
- Pattern Institute PT, Faro, Portugal
| | - Roslin Russell
- Cambridge Institute - CRUK, University of Cambridge, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Bernardo P de Almeida
- Faculdade de Medicina e Ciências Biomédicas (FMCB), Universidade do Algarve, Faro, Portugal
- Faculdade de Medicina, Instituto de Medicina Molecular, Universidade de Lisboa, Lisbon, Portugal
- InstaDeep, Paris, France
| | - Ana Jacinta-Fernandes
- Faculdade de Medicina e Ciências Biomédicas (FMCB), Universidade do Algarve, Faro, Portugal
| | | | - Mark Dunning
- Cambridge Institute - CRUK, University of Cambridge, Cambridge, UK
- Sheffield Bioinformatics Core, The School of Medicine and Population Health, The University of Sheffield, Sheffield, UK
| | - Shamith Samarajiwa
- Medical Research Council (MRC) Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, UK
- Genetics and Genomics Section, Imperial College London, London, UK
| | - Martin O'Reilly
- Cambridge Institute - CRUK, University of Cambridge, Cambridge, UK
| | | | - Cátia L Rocha
- Faculdade de Medicina e Ciências Biomédicas (FMCB), Universidade do Algarve, Faro, Portugal
- Faculty of Medicine, Instituto de Saúde Ambiental (ISAMB), University of Lisbon, Lisbon, Portugal
| | - Nordiana Rosli
- Faculdade de Medicina e Ciências Biomédicas (FMCB), Universidade do Algarve, Faro, Portugal
- Training Division, Ministry of Health Malaysia, Putrajaya, Malaysia
- Biometrology Group, Division of Chemical and Biological Metrology, Korea Research Institute of Standards and Science, Daejeon, South Korea
| | - Bruce A J Ponder
- Cambridge Institute - CRUK, University of Cambridge, Cambridge, UK
| | - Ana-Teresa Maia
- Cintesis@Rise, Universidade do Algarve, Faro, Portugal.
- Centro de Ciências do Mar (CCMAR), Universidade do Algarve, Faro, Portugal.
- Faculdade de Medicina e Ciências Biomédicas (FMCB), Universidade do Algarve, Faro, Portugal.
|
2
|
Tan MH. Identification of Bona Fide RNA Editing Sites: History, Challenges, and Opportunities. Acc Chem Res 2023; 56:3033-3044. [PMID: 37827987 DOI: 10.1021/acs.accounts.3c00462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2023]
Abstract
Adenosine-to-inosine (A-to-I) RNA editing, catalyzed by the adenosine deaminase acting on RNA (ADAR) family of enzymes, of which there are three members (ADAR1, ADAR2, and ADAR3), is a major gene regulatory mechanism that diversifies the transcriptome. It is widespread in many metazoans, including humans. As inosine is interpreted by cellular machineries mainly as guanosine, A-to-I editing effectively gives A-to-G nucleotide changes. Depending on its location, an editing event can generate new protein isoforms or influence other RNA processing pathways. Researchers have found that ADAR-mediated editing performs diverse functions. For example, it enables living organisms such as cephalopods to adapt rapidly to fluctuating environmental conditions such as water temperature. In development, the loss of ADAR1 is embryonically lethal partly because endogenous double-stranded RNAs (dsRNAs) are no longer marked by inosines, which signal "self", and thus cause the melanoma differentiation-associated protein 5 (MDA5) sensor to trigger a deleterious interferon response. Hence, ADAR1 plays a key role in preventing aberrant activation of the innate immune system. Furthermore, ADAR enzymes have been implicated in myriad human diseases. Intriguingly, some cancer cells are known to exploit ADAR1 activity to dodge immune responses. However, the exact identities of immunogenic RNAs in different biological contexts have remained elusive. Consequently, there is tremendous interest in identifying inosine-containing RNAs in the cell.
The identification of A-to-I RNA editing sites is dependent on the sequencing of nucleic acids. Technological and algorithmic advancements over the past decades have revolutionized the way editing events are detected. Initially, the discovery of editing sites relied on Sanger sequencing, a first-generation technology. Both RNA, which is reverse transcribed into complementary DNA (cDNA), and genomic DNA (gDNA) from the same source are analyzed.
After sequence alignment, one would require an adenosine to be present in the genome but a guanosine to be detected in the RNA sample for a position to be declared as an editing site. However, an issue with Sanger sequencing is its low throughput. Subsequently, Illumina sequencing, a second-generation technology, was invented. By permitting the simultaneous interrogation of millions of molecules, it enables many editing sites to be identified rapidly. However, a key challenge is that the Illumina platform produces short sequencing reads that can be difficult to map accurately. To tackle the challenge, we and others developed computational workflows with a series of filters to discard sites that are likely to be false positives. When Illumina sequencing data sets are properly analyzed, A-to-G variants should emerge as the most dominant mismatch type. Moreover, the quantitative nature of the data allows us to build a comprehensive atlas of editing-level measurements across different biological contexts, providing deep insights into the spatiotemporal dynamics of RNA editing. However, difficulties remain in identifying true A-to-I editing sites in short protein-coding exons or in organisms and diseases where DNA mutations and genomic polymorphisms are prevalent and mostly unknown. Nanopore sequencing, a third-generation technology, promises to address the difficulties, as it allows native RNAs to be sequenced without conversion to cDNA, preserving base modifications that can be directly detected through machine learning. We recently demonstrated that nanopore sequencing could be used to identify A-to-I editing sites in native RNA directly. Although further work is needed to enhance the detection accuracy in single molecules from fewer cells, the nanopore technology holds the potential to revolutionize epitranscriptomic studies.
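The basic site-calling rule described in this abstract (an adenosine in the genome, a guanosine in the RNA) can be illustrated with a minimal Python sketch. The function names, the `min_g_reads` threshold, and the dictionary of per-base read counts are hypothetical choices made for illustration; they are not taken from the cited work, and real workflows apply many additional filters (mapping quality, known polymorphisms, strand handling).

```python
def is_candidate_editing_site(genomic_base: str,
                              rna_base_counts: dict,
                              min_g_reads: int = 2) -> bool:
    """Flag a position as a candidate A-to-I site.

    The genome must show A while the RNA (as cDNA) shows G in at least
    `min_g_reads` reads. The threshold is an illustrative assumption.
    """
    return (genomic_base.upper() == "A"
            and rna_base_counts.get("G", 0) >= min_g_reads)


def editing_level(rna_base_counts: dict) -> float:
    """Fraction of A/G-covering reads that carry G (a common convention)."""
    a = rna_base_counts.get("A", 0)
    g = rna_base_counts.get("G", 0)
    total = a + g
    return g / total if total else 0.0


# Example: genome has A; 8 RNA reads show A and 2 show G.
counts = {"A": 8, "G": 2}
print(is_candidate_editing_site("A", counts), editing_level(counts))
```

A position whose genomic base is already G is rejected by this rule, which is why matched gDNA from the same source is needed to separate editing from genomic variation.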
Affiliation(s)
- Meng How Tan
- School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, Singapore 637459, Singapore
- HP-NTU Digital Manufacturing Corporate Laboratory, Nanyang Technological University, Singapore 637460, Singapore
|
3
|
Gao Z, Yang X, Chen J, Rausher MD, Shi T. Expression inheritance and constraints on cis- and trans-regulatory mutations underlying lotus color variation. Plant Physiology 2023; 191:1662-1683. [PMID: 36417237 PMCID: PMC10022630 DOI: 10.1093/plphys/kiac522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 11/17/2022] [Indexed: 06/16/2023]
Abstract
Both cis- and trans-regulatory mutations drive changes in gene expression that underpin plant phenotypic evolution. However, how and why these two major types of regulatory mutations arise in different genes and how gene expression is inherited and associated with these regulatory changes are unclear. Here, by studying allele-specific expression in F1 hybrids of pink-flowered sacred lotus (Nelumbo nucifera) and yellow-flowered American lotus (N. lutea), we reveal the relative contributions of cis- and trans-regulatory changes to interspecific expression rewiring underlying petal color change and how the expression is inherited in hybrids. Although cis-only variants influenced slightly more genes, trans-only variants had a stronger impact on expression differences between species. In F1 hybrids, genes under cis-only and trans-only regulatory effects showed a propensity toward additive and dominant inheritance, respectively, whereas transgressive inheritance was observed in genes carrying both cis- and trans-variants acting in opposite directions. By investigating anthocyanin and carotenoid coexpression networks in petals, we found that the same category of regulatory mutations, particularly trans-variants, tend to rewire hub genes in coexpression modules underpinning flower color differentiation between species; we identified 45 known genes with cis- and trans-regulatory variants significantly correlated with flower coloration, such as ANTHOCYANIN 5-AROMATIC ACYLTRANSFERASE (ACT), GLUTATHIONE S-TRANSFERASE F11 (GSTF11), and LYCOPENE Ε-CYCLASE (LCYE). Notably, the relative abundance of genes in different categories of regulatory divergence was associated with the inferred magnitude of constraints like expression level and breadth. Overall, our study suggests distinct selective constraints and modes of gene expression inheritance among different regulatory mutations underlying lotus petal color divergence.
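A widely used way to separate cis and trans effects from F1 hybrid data can be sketched as follows. This is not necessarily the procedure used in the study above: the function name and the plain log2-ratio decomposition are assumptions for illustration, and a real analysis would add statistical tests and read-count normalization. The idea is that both alleles in an F1 hybrid share the same trans environment, so the allelic ratio in the hybrid reflects cis divergence, and whatever remains of the parental difference is attributed to trans.

```python
from math import log2


def cis_trans_effects(parent1_expr: float, parent2_expr: float,
                      f1_allele1_expr: float, f1_allele2_expr: float):
    """Return (cis, trans) effects as log2 ratios.

    total = log2(parent1 / parent2)          expression divergence between species
    cis   = log2(F1 allele1 / F1 allele2)    allelic ratio in the shared trans environment
    trans = total - cis                      remainder attributed to trans factors
    All inputs must be positive (normalized) expression values.
    """
    total = log2(parent1_expr / parent2_expr)
    cis = log2(f1_allele1_expr / f1_allele2_expr)
    return cis, total - cis


# Example: parents differ 4-fold, F1 alleles differ 2-fold.
print(cis_trans_effects(400, 100, 200, 100))
```

Under this decomposition, a gene with a nonzero cis term and a trans term of opposite sign corresponds to the "cis and trans acting in opposite directions" category mentioned in the abstract.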
Affiliation(s)
- Zhiyan Gao
- Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan 430074, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xingyu Yang
- Wuhan Institute of Landscape Architecture, Wuhan 430081, China
| | - Jinming Chen
- Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan 430074, China
| | - Mark D Rausher
- Department of Biology, Duke University, Durham, North Carolina 27708, USA
| | - Tao Shi
- Key Laboratory of Aquatic Botany and Watershed Ecology, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan 430074, China
|
4
|
Woerner AE, Crysup B, Hewitt FC, Gardner MW, Freitas MA, Budowle B. Techniques for estimating genetically variable peptides and semi-continuous likelihoods from massively parallel sequencing data. Forensic Sci Int Genet 2022; 59:102719. [DOI: 10.1016/j.fsigen.2022.102719] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/20/2022] [Revised: 04/25/2022] [Accepted: 05/01/2022] [Indexed: 11/25/2022]
|
5
|
Allelic imbalance of HLA-B expression in human lung cells infected with coronavirus and other respiratory viruses. Eur J Hum Genet 2022; 30:922-929. [PMID: 35322240 PMCID: PMC8940983 DOI: 10.1038/s41431-022-01070-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/11/2021] [Revised: 01/09/2022] [Accepted: 02/09/2022] [Indexed: 01/01/2023] Open
Abstract
The human leucocyte antigen (HLA) loci have been widely characterized as associated with viral infectious diseases using either HLA allele frequency-based association or in silico prediction studies. However, there is less experimental evidence linking HLA alleles with COVID-19 and other respiratory infectious diseases, particularly in lung cells. To examine the role of HLA alleles in response to coronavirus and other respiratory viral infections in disease-relevant cells, we designed a two-stage study integrating publicly accessible RNA-seq data sets, and performed allelic expression (AE) analysis on heterozygous HLA genotypes. We discovered an increased AE pattern accompanied by overexpression of the HLA-B gene in SARS-CoV-2-infected human lung epithelial cells. Analysis of independent data sets verified the respiratory virus-induced AE of the HLA-B gene in lung cells and tissues. The results were further experimentally validated in cultured lung cells infected with SARS-CoV-2. We further uncovered that the antiviral cytokine IFNβ contributes to AE of the HLA-B gene in lung cells. Our analyses provide new insight into allelic influence on HLA expression in association with SARS-CoV-2 and other common viral infectious diseases.
|
6
|
Gürsoy G, Lu N, Wagner S, Gerstein M. Recovering genotypes and phenotypes using allele-specific genes. Genome Biol 2021; 22:263. [PMID: 34493313 PMCID: PMC8425091 DOI: 10.1186/s13059-021-02477-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/14/2021] [Accepted: 08/23/2021] [Indexed: 11/10/2022] Open
Abstract
With the recent increase in RNA sequencing efforts using large cohorts of individuals, surveying allele-specific gene expression is becoming increasingly frequent. Here, we report that, despite not containing explicit variant information, a list of genes known to be allele-specific in an individual is enough to recover key variants and link the individuals back to their genotypes and phenotypes. This creates a privacy conundrum.
Affiliation(s)
- Gamze Gürsoy
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, USA
- Molecular Biophysics and Biochemistry, Yale University, New Haven, USA
| | - Nancy Lu
- Molecular, Cellular, and Developmental Biology, Yale University, New Haven, USA
- Statistics and Data Science, Yale University, New Haven, USA
| | - Sarah Wagner
- Computer Science, Yale University, New Haven, USA
| | - Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, USA.
- Molecular Biophysics and Biochemistry, Yale University, New Haven, USA.
- Statistics and Data Science, Yale University, New Haven, USA.
- Computer Science, Yale University, New Haven, USA.
|
7
|
Tomlinson MJ, Polson SW, Qiu J, Lake JA, Lee W, Abasht B. Investigation of allele specific expression in various tissues of broiler chickens using the detection tool VADT. Sci Rep 2021; 11:3968. [PMID: 33597613 PMCID: PMC7889858 DOI: 10.1038/s41598-021-83459-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/14/2020] [Accepted: 02/01/2021] [Indexed: 12/30/2022] Open
Abstract
Differential abundance of allelic transcripts in a diploid organism, commonly referred to as allele specific expression (ASE), is a biologically significant phenomenon and can be examined using single nucleotide polymorphisms (SNPs) from RNA-seq. Quantifying ASE aids in our ability to identify and understand cis-regulatory mechanisms that influence gene expression, and thereby assist in identifying causal mutations. This study examines ASE in breast muscle, abdominal fat, and liver of commercial broiler chickens using variants called from a large sub-set of the samples (n = 68). ASE analysis was performed using a custom software called VCF ASE Detection Tool (VADT), which detects ASE of biallelic SNPs using a binomial test. On average ~ 174,000 SNPs in each tissue passed our filtering criteria and were considered informative, of which ~ 24,000 (~ 14%) showed ASE. Of all ASE SNPs, only 3.7% exhibited ASE in all three tissues, with ~ 83% showing ASE specific to a single tissue. When ASE genes (genes containing ASE SNPs) were compared between tissues, the overlap among all three tissues increased to 20.1%. Our results indicate that ASE genes show tissue-specific enrichment patterns, but all three tissues showed enrichment for pathways involved in translation.
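The abstract states that ASE at biallelic SNPs is detected with a binomial test. A minimal sketch of such a test, using only the Python standard library, is shown below. It is not VADT's code: the function names, the null allelic ratio of 0.5, and the `alpha` cutoff are assumptions for illustration, and a real pipeline would also filter on coverage and correct for multiple testing.

```python
from math import comb


def binomial_two_sided_p(ref_count: int, alt_count: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for the reference-allele read count.

    Null hypothesis: each read comes from the reference allele with
    probability `p` (0.5 means balanced expression of both alleles).
    """
    n = ref_count + alt_count
    if n == 0:
        return 1.0
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Sum the probability of every outcome no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-7)))


def shows_ase(ref_count: int, alt_count: int, alpha: float = 0.05) -> bool:
    """Call ASE when the allelic read counts deviate from the null ratio."""
    return binomial_two_sided_p(ref_count, alt_count) < alpha


# Example: 10 reference reads vs 0 alternate reads, and a balanced 5 vs 5 site.
print(binomial_two_sided_p(10, 0), shows_ase(5, 5))
```

With many SNPs tested per tissue, as in this study, the per-SNP p-values would normally be adjusted before declaring ASE.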
Affiliation(s)
- M Joseph Tomlinson
- Department of Animal and Food Sciences, University of Delaware, 531 South College Ave, Newark, DE, 19716, USA
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, USA
| | - Shawn W Polson
- Department of Computer and Information Sciences, University of Delaware, Newark, USA
- Department of Biological Sciences, University of Delaware, Newark, USA
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, USA
| | - Jing Qiu
- Department of Applied Economics and Statistics, University of Delaware, Newark, USA
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, USA
| | - Juniper A Lake
- Department of Animal and Food Sciences, University of Delaware, 531 South College Ave, Newark, DE, 19716, USA
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, USA
| | - William Lee
- Maple Leaf Farms, Inc., Leesburg, IN, 46538, USA
| | - Behnam Abasht
- Department of Animal and Food Sciences, University of Delaware, 531 South College Ave, Newark, DE, 19716, USA
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, USA
|
8
|
Li J, Zhang C, Si H, Gu S, Liu X, Li D, Meng S, Yang X, Li S. Brain-specific monoallelic expression of bovine UBE3A is associated with genomic position. Anim Genet 2020; 52:47-54. [PMID: 33200847 DOI: 10.1111/age.13023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/17/2020] [Indexed: 11/30/2022]
Abstract
Genomic imprinting is a rare epigenetic process in mammalian cells that leads to monoallelic expression of a gene with a parent-specific pattern. The UBE3A (ubiquitin protein ligase E3A) gene is imprinted with maternal allelic expression in the brain but biallelically expressed in all other tissues in humans. The silencing of the paternal UBE3A allele is thought to be caused by the paternally expressed antisense RNA transcript of UBE3A-ATS. The aberrant imprinted expression of UBE3A is associated with several neurodevelopmental syndromes and psychological disorders. Cattle are a valuable model species for determining the genetic etiology of sporadic human disorders, and maternal expression of UBE3A has been revealed by a next-generation sequencing study in the bovine conceptus. In this study, we investigated the allelic expression of UBE3A and UBE3A-ATS in adult bovine somatic tissues. To confirm the splicing pattern of bovine UBE3A, five 5' alternative transcripts (MT210534-MT210538) were first obtained from bovine brain tissue by RT-PCR. Based on 10 SNP genotypes, we found that the brain-specific monoallelic expression of bovine UBE3A did not occur along the entire locus, and there was a shift from biallelic expression to monoallelic expression in exon 14 of the UBE3A gene. However, the brain-specific monoallelic expression of bovine UBE3A-ATS occurred in the entire gene. These observations demonstrated that the monoallelic expression did not occur along the entire bovine UBE3A locus and was associated with the genomic position.
Affiliation(s)
- J Li
- College of Life Science, Agricultural University of Hebei, Baoding, Hebei, China
| | - C Zhang
- College of Life Science, Agricultural University of Hebei, Baoding, Hebei, China
| | - H Si
- College of Life Science, Agricultural University of Hebei, Baoding, Hebei, China
| | - S Gu
- College of Life Science, Agricultural University of Hebei, Baoding, Hebei, China
| | - X Liu
- College of Life Science, Agricultural University of Hebei, Baoding, Hebei, China
| | - D Li
- College of Bioscience and Bioengineering, Hebei University of Science and Technology, Shijiazhuang, Hebei, China
| | - S Meng
- College of Life Science, Agricultural University of Hebei, Baoding, Hebei, China
| | - X Yang
- College of Life Science, Agricultural University of Hebei, Baoding, Hebei, China
| | - S Li
- College of Life Science, Agricultural University of Hebei, Baoding, Hebei, China
|
9
|
Nguyen HQ, Chattoraj S, Castillo D, Nguyen SC, Nir G, Lioutas A, Hershberg EA, Martins NMC, Reginato PL, Hannan M, Beliveau BJ, Church GM, Daugharthy ER, Marti-Renom MA, Wu CT. 3D mapping and accelerated super-resolution imaging of the human genome using in situ sequencing. Nat Methods 2020; 17:822-832. [PMID: 32719531 PMCID: PMC7537785 DOI: 10.1038/s41592-020-0890-0] [Citation(s) in RCA: 75] [Impact Index Per Article: 18.8] [Received: 12/15/2019] [Accepted: 06/08/2020] [Indexed: 12/31/2022]
Abstract
There is a need for methods that can image chromosomes with genome-wide coverage, as well as greater genomic and optical resolution. We introduce OligoFISSEQ, a suite of three methods that leverage fluorescence in situ sequencing (FISSEQ) of barcoded Oligopaint probes to enable the rapid visualization of many targeted genomic regions. Applying OligoFISSEQ to human diploid fibroblast cells, we show how four rounds of sequencing are sufficient to produce 3D maps of 36 genomic targets across six chromosomes in hundreds to thousands of cells, implying a potential to image thousands of targets in only five to eight rounds of sequencing. We also use OligoFISSEQ to trace chromosomes at finer resolution, following the path of the X chromosome through 46 regions, with separate studies showing compatibility of OligoFISSEQ with immunocytochemistry. Finally, we combined OligoFISSEQ with OligoSTORM, laying the foundation for accelerated single-molecule super-resolution imaging of large swaths of, if not entire, human genomes.
Affiliation(s)
- Huy Q Nguyen
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | | | - David Castillo
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Son C Nguyen
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
| | - Guy Nir
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Wyss Institute, Harvard Medical School, Boston, MA, USA
| | | | - Elliot A Hershberg
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | | | - Paul L Reginato
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Wyss Institute, Harvard Medical School, Boston, MA, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Mohammed Hannan
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Brian J Beliveau
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
| | - George M Church
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Wyss Institute, Harvard Medical School, Boston, MA, USA
| | - Evan R Daugharthy
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Wyss Institute, Harvard Medical School, Boston, MA, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- ReadCoor, Cambridge, MA, USA
| | - Marc A Marti-Renom
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain.
- CRG, BIST, Barcelona, Spain.
- Pompeu Fabra University, Barcelona, Spain.
- ICREA, Barcelona, Spain.
| | - C-Ting Wu
- Department of Genetics, Harvard Medical School, Boston, MA, USA.
- Wyss Institute, Harvard Medical School, Boston, MA, USA.
|
10
|
Woerner AE, Hewitt FC, Gardner MW, Freitas MA, Schulte KQ, LeSassier DS, Baniasad M, Reed AJ, Powals ME, Smith AR, Albright NC, Ludolph BC, Zhang L, Allen LW, Weber K, Budowle B. An algorithm for random match probability calculation from peptide sequences. Forensic Sci Int Genet 2020; 47:102295. [DOI: 10.1016/j.fsigen.2020.102295] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/30/2019] [Revised: 02/23/2020] [Accepted: 03/25/2020] [Indexed: 02/01/2023]
|
11
|
Zou J, Hormozdiari F, Jew B, Castel SE, Lappalainen T, Ernst J, Sul JH, Eskin E. Leveraging allelic imbalance to refine fine-mapping for eQTL studies. PLoS Genet 2019; 15:e1008481. [PMID: 31834882 PMCID: PMC6952111 DOI: 10.1371/journal.pgen.1008481] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 02/20/2019] [Revised: 01/09/2020] [Accepted: 10/15/2019] [Indexed: 11/18/2022] Open
Abstract
Many disease risk loci identified in genome-wide association studies are present in non-coding regions of the genome. Previous studies have found enrichment of expression quantitative trait loci (eQTLs) in disease risk loci, indicating that identifying causal variants for gene expression is important for elucidating the genetic basis of not only gene expression but also complex traits. However, detecting causal variants is challenging due to complex genetic correlation among variants known as linkage disequilibrium (LD) and the presence of multiple causal variants within a locus. Although several fine-mapping approaches have been developed to overcome these challenges, they may produce large sets of putative causal variants when true causal variants are in high LD with many non-causal variants. In eQTL studies, there is an additional source of information that can be used to improve fine-mapping called allelic imbalance (AIM) that measures imbalance in gene expression on two chromosomes of a diploid organism. In this work, we develop a novel statistical method that leverages both AIM and total expression data to detect causal variants that regulate gene expression. We illustrate through simulations and application to 10 tissues of the Genotype-Tissue Expression (GTEx) dataset that our method identifies the true causal variants with higher specificity than an approach that uses only eQTL information. Across all tissues and genes, our method achieves a median reduction rate of 11% in the number of putative causal variants. We use chromatin state data from the Roadmap Epigenomics Consortium to show that the putative causal variants identified by our method are enriched for active regions of the genome, providing orthogonal support that our method identifies causal variants with increased specificity. In recent years, many studies have identified genetic variants that are associated with the expression of genes (eQTLs). 
While thousands of eQTLs have been identified, not all associated variants cause changes in gene expression. This is in part due to the complex patterns of genetic correlation in the human genome. If a region of the genome contains many genetic variants that are highly correlated with each other, non-causal genetic variants close to a causal variant are also correlated with gene expression. Statistical fine-mapping is the process of identifying true causal variants from a set of candidate variants. In regions with high genetic correlation, previous fine-mapping methods may not be able to differentiate causal variants from nearby variants. We propose a method that utilizes a complementary source of information called allelic imbalance (AIM). We show that by combining eQTL and AIM data, we can identify the true causal variants more efficiently and substantially decrease the number of putative causal variants for downstream analysis.
Affiliation(s)
- Jennifer Zou
- Computer Science Department, University of California Los Angeles, Los Angeles, California, United States of America
| | - Farhad Hormozdiari
- Genetic Epidemiology and Statistical Genetics Program, Harvard University, Cambridge, Massachusetts, United States of America
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
| | - Brandon Jew
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California, United States of America
| | - Stephane E. Castel
- New York Genome Center, New York, New York, United States of America
- Department of Systems Biology, Columbia University, New York, New York, United States of America
| | - Tuuli Lappalainen
- New York Genome Center, New York, New York, United States of America
- Department of Systems Biology, Columbia University, New York, New York, United States of America
| | - Jason Ernst
- Computer Science Department, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Biological Chemistry, University of California Los Angeles, Los Angeles, California, United States of America
| | - Jae Hoon Sul
- Department of Psychiatry and Biobehavioral Sciences, University of California Los Angeles, Los Angeles, California, United States of America
- * E-mail: (JHS); (EE)
| | - Eleazar Eskin
- Computer Science Department, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Human Genetics, University of California Los Angeles, Los Angeles, California, United States of America
- * E-mail: (JHS); (EE)
|
12
|
Liu Z, Dong X, Li Y. A Genome-Wide Study of Allele-Specific Expression in Colorectal Cancer. Front Genet 2018; 9:570. [PMID: 30538721 PMCID: PMC6277598 DOI: 10.3389/fgene.2018.00570] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/19/2018] [Accepted: 11/06/2018] [Indexed: 12/30/2022] Open
Abstract
Accumulating evidence from small-scale studies has suggested that allele-specific expression (ASE) plays an important role in tumor initiation and progression. However, little is known about genome-wide ASE in tumors. In this study, we conducted a comprehensive analysis of ASE in individuals with colorectal cancer (CRC) on a genome-wide scale. We identified 5.4 thousand genome-wide ASEs of single nucleotide variations (SNVs) from tumor and normal tissues of 59 individuals with CRC. We observed an increased ASE level in tumor samples, and the ASEs were enriched in hotspots on the genome. Around 63% of the genes located there were previously reported to contain complex regulatory elements, e.g., human leukocyte antigen (HLA), or were implicated in tumor progression. Focusing on the allelic expression of somatic mutations, we found that 37.5% of them exhibited ASE, and genes harboring such somatic mutations were enriched in important pathways implicated in cancers. In addition, by comparing the expected and observed ASE events in tumor samples, we identified 50 tumor-specific ASEs which possibly contributed to the somatic events in the regulatory regions of the genes and were significantly enriched in known cancer driver genes. By analyzing CRC ASEs from several perspectives, we provide a systematic understanding of how ASE is implicated in both tumor and normal tissues, which will be of critical value in guiding ASE studies in cancer.
Affiliation(s)
- Zhi Liu
- Department of Epidemiology and Biostatistics, Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, China
| | - Xiao Dong
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, United States
| | - Yixue Li
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China.,Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, China.,Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, China
|
13
|
Li W, Hong R, Lai LT, Dong Q, Ni P, Chelliah R, Huq M, Ismail SNB, Chandola U, Ang Z, Lin B, Chen X, Chen L, Zhang LF. Genome-Wide RNAi Screen Identify Melanoma-Associated Antigen Mageb3 Involved in X Chromosome Inactivation. J Mol Biol 2018; 430:2734-2746. [DOI: 10.1016/j.jmb.2018.05.031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/11/2018] [Revised: 05/17/2018] [Accepted: 05/17/2018] [Indexed: 10/16/2022]
|
14
|
Analysis of public RNA-sequencing data reveals biological consequences of genetic heterogeneity in cell line populations. Sci Rep 2018; 8:11226. [PMID: 30046134 PMCID: PMC6060100 DOI: 10.1038/s41598-018-29506-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/14/2018] [Accepted: 07/13/2018] [Indexed: 01/19/2023] Open
Abstract
Meta-analyses of datasets available in public repositories are used to gather and summarise experiments performed across laboratories, as well as to explore the consistency of scientific findings. As data quality and biological equivalency across samples may obscure such analyses and consequently their conclusions, we investigated the comparability of 85 public RNA-seq cell line datasets. Thousands of pairwise comparisons of single nucleotide variants in 139 samples revealed variable genetic heterogeneity of the eight cell line populations analysed as well as variable data quality. The H9 and HCT116 cell lines were found to be remarkably stable across laboratories (with median concordances of 99.2% and 98.5%, respectively), in contrast to the highly variable HeLa cells (89.3%). We show that the genetic heterogeneity encountered greatly affects gene expression between same-cell comparisons, highlighting the importance of interrogating the biological equivalency of samples when comparing experimental datasets. Both the number of differentially expressed genes and the expression levels negatively correlate with the genetic heterogeneity. Finally, we demonstrate how comparing genetically heterogeneous datasets affects gene expression analyses and that high dissimilarity between same-cell datasets alters the expression of more than 300 cancer-related genes, which are often the focus of studies using cell lines.
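The concordance figures quoted in this abstract (for example 99.2% for H9) come from pairwise comparisons of single nucleotide variant calls between samples. The following is a minimal sketch of one way such a value could be computed. It assumes that concordance means the share of jointly genotyped sites at which two samples carry the same genotype; the abstract does not state the exact definition, and the function name, data layout, and toy genotypes are hypothetical.

```python
from itertools import combinations
from statistics import median


def pairwise_concordance(calls_a, calls_b):
    """Fraction of sites genotyped in both samples with identical genotypes.

    Each argument maps a site key, e.g. ("chr1", 12345), to a genotype string.
    Returns None when the two samples share no genotyped sites.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None
    matches = sum(1 for site in shared if calls_a[site] == calls_b[site])
    return matches / len(shared)


# Toy data (invented for illustration): three samples of the same cell line.
samples = {
    "lab1": {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr2", 50): "T/T", ("chr2", 90): "G/G"},
    "lab2": {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr2", 50): "T/T", ("chr2", 90): "G/A"},
    "lab3": {("chr1", 100): "A/G", ("chr1", 200): "C/T", ("chr2", 50): "T/T"},
}

# Score every unordered pair of samples, then summarise with the median.
scores = {
    (a, b): pairwise_concordance(samples[a], samples[b])
    for a, b in combinations(sorted(samples), 2)
}
median_concordance = median(scores.values())
```

Restricting the comparison to sites called in both samples keeps differences in coverage from being counted as genetic differences, which is one plausible way to separate data quality from true heterogeneity.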
|
15
|
Comprehensive comparative analysis of 5'-end RNA-sequencing methods. Nat Methods 2018; 15:505-511. [PMID: 29867192 PMCID: PMC6075671 DOI: 10.1038/s41592-018-0014-2] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Received: 09/18/2017] [Accepted: 04/10/2018] [Indexed: 12/20/2022]
Abstract
Specialized RNA-seq methods are required to identify the 5' ends of transcripts, which are critical for studies of gene regulation, but these methods have not been systematically benchmarked. We directly compared six such methods, including the performance of five methods on a single human cellular RNA sample and a new spike-in RNA assay that helps circumvent challenges resulting from uncertainties in annotation and RNA processing. We found that the 'cap analysis of gene expression' (CAGE) method performed best for mRNA and that most of its unannotated peaks were supported by evidence from other genomic methods. We applied CAGE to eight brain-related samples and determined sample-specific transcription start site (TSS) usage, as well as a transcriptome-wide shift in TSS usage between fetal and adult brain.
|
16
|
Tian L, Khan A, Ning Z, Yuan K, Zhang C, Lou H, Yuan Y, Xu S. Genome-wide comparison of allele-specific gene expression between African and European populations. Hum Mol Genet 2018; 27:1067-1077. [DOI: 10.1093/hmg/ddy027] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/14/2017] [Accepted: 01/05/2018] [Indexed: 11/12/2022] Open
Affiliation(s)
- Lei Tian
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Asifullah Khan
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan-23200 KP, Pakistan
| | - Zhilin Ning
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Kai Yuan
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Chao Zhang
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Haiyi Lou
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
| | - Yuan Yuan
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
| | - Shuhua Xu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, CAS, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Collaborative Innovation Center of Genetics and Development, Shanghai 200438, China
|
17
|
Abstract
The past decade has witnessed a revolution in our appreciation of transcriptome complexity and regulation. This remarkable expansion in our knowledge largely originates from the advent of high-throughput methodologies, and the subsequent discovery that up to 90% of eukaryotic genomes are transcribed, thus generating an unanticipated large range of noncoding RNAs (Hangauer et al., 15(4):112, 2014). Besides leading to the identification of new noncoding RNA species, transcriptome-wide studies have uncovered novel layers of posttranscriptional regulatory mechanisms controlling RNA processing, maturation or translation, each contributing to the precise and dynamic regulation of gene expression. Remarkably, the development of systems-level studies has been accompanied by tremendous progress in the visualization of individual RNA molecules in single cells, such that it is now possible to image RNA species with single-molecule resolution from birth to translation or decay. Quantitatively monitoring the fate of individual molecules with unprecedented spatiotemporal resolution has been key to understanding the molecular mechanisms underlying the different steps of RNA regulation. This has also revealed biologically relevant intracellular and intercellular heterogeneities in RNA distribution or regulation. More recently, the convergence of imaging and high-throughput technologies has led to the emergence of spatially resolved transcriptomic techniques that provide a means to perform large-scale analyses while preserving spatial information.
By generating transcriptome-wide data on single-cell RNA content, or even subcellular RNA distribution, these methodologies are opening avenues to a wide range of network-level studies at the cell and organ level, and promise to strongly improve disease diagnosis and treatment. In this introductory chapter, we highlight how recently developed technologies aiming at detecting and visualizing RNA molecules have contributed to the emergence of entirely new research fields, and to dramatic progress in our understanding of gene expression regulation.
Affiliation(s)
- Caroline Medioni
- Université Côte d'Azur, CNRS, Inserm, iBV, Parc Valrose, 06100, Nice, France
| | - Florence Besse
- Université Côte d'Azur, CNRS, Inserm, iBV, Parc Valrose, 06100, Nice, France.
|
18
|
Bartonicek N, Clark MB, Quek XC, Torpy JR, Pritchard AL, Maag JLV, Gloss BS, Crawford J, Taft RJ, Hayward NK, Montgomery GW, Mattick JS, Mercer TR, Dinger ME. Intergenic disease-associated regions are abundant in novel transcripts. Genome Biol 2017; 18:241. [PMID: 29284497 PMCID: PMC5747244 DOI: 10.1186/s13059-017-1363-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 05/31/2017] [Accepted: 11/21/2017] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. RESULTS To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. CONCLUSIONS This resource of previously unreported transcripts in disease-associated regions ( http://gwas-captureseq.dingerlab.org ) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.
Affiliation(s)
- N Bartonicek
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - M B Clark
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford, UK
| | - X C Quek
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - J R Torpy
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - A L Pritchard
- QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - J L V Maag
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - B S Gloss
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - J Crawford
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
| | - R J Taft
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Illumina, Inc., San Diego, CA, USA
| | - N K Hayward
- QIMR Berghofer Medical Research Institute, Brisbane, QLD, Australia
| | - G W Montgomery
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
| | - J S Mattick
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
| | - T R Mercer
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia
- Altius Institute for Biomedical Sciences, Seattle, USA
| | - M E Dinger
- Garvan Institute of Medical Research, Sydney, NSW, Australia.
- Faculty of Medicine, St Vincent's Clinical School, University of New South Wales, Sydney, NSW, Australia.
|
19
|
Hong R, Chandola U, Zhang LF. Cat-D: a targeted sequencing method for the simultaneous detection of small DNA mutations and large DNA deletions with flexible boundaries. Sci Rep 2017; 7:15701. [PMID: 29146914 PMCID: PMC5691158 DOI: 10.1038/s41598-017-15764-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/27/2017] [Accepted: 11/01/2017] [Indexed: 11/23/2022] Open
Abstract
We developed a targeted DNA sequencing method that is capable of detecting a comprehensive panel of DNA mutations including small DNA mutations and large DNA deletions with unknown/flexible boundaries. The method directly identifies the large DNA deletions (Cat-D) without relying on sequencing coverage to make the genotype calls. We applied the method to simultaneously detect 10 small DNA mutations in β-thalassemia and 2 large genomic deletions in α-thalassemia from 10 genomic DNA samples. Cat-D was performed on 8 genomic DNA samples in duplicate. The 18 Cat-D samples were combined in one sequencing run. In total, 216 genotype calls were made, and 215 of the genotype calls were accurate. No false negative genotype calls were made. One false positive genotype call was made on one target mutation in one experimental duplicate from a genomic DNA sample. In summary, Cat-D can be developed into a robust, high-throughput and cost-effective method suitable for population-based carrier screens.
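The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only figures from the abstract, plus one labeled assumption: that the 2 samples not run in duplicate were each run once, which is inferred from "10 genomic DNA samples" and "18 Cat-D samples" rather than stated outright.

```python
# Figures taken from the abstract.
small_mutations = 10      # beta-thalassemia targets
large_deletions = 2       # alpha-thalassemia targets
dna_samples = 10
duplicated_samples = 8
libraries = 18            # "18 Cat-D samples" combined in one sequencing run
total_calls = 216
accurate_calls = 215

# Assumption (inferred, not stated): the remaining samples were run once each.
singles = dna_samples - duplicated_samples
assert 2 * duplicated_samples + singles == libraries

# One genotype call per target per library: 12 targets x 18 libraries.
targets = small_mutations + large_deletions
assert targets * libraries == total_calls

errors = total_calls - accurate_calls   # the single false positive call
accuracy = accurate_calls / total_calls
print(f"{errors} discordant call, accuracy = {accuracy:.2%}")
```

Under that assumption the numbers are internally consistent, and 215 correct calls out of 216 corresponds to an accuracy of about 99.5%.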
Affiliation(s)
- Ru Hong
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
| | - Udita Chandola
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
| | - Li-Feng Zhang
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore.
|
20
|
Montag J, Syring M, Rose J, Weber AL, Ernstberger P, Mayer AK, Becker E, Keyser B, Dos Remedios C, Perrot A, van der Velden J, Francino A, Navarro-Lopez F, Ho CY, Brenner B, Kraft T. Intrinsic MYH7 expression regulation contributes to tissue level allelic imbalance in hypertrophic cardiomyopathy. J Muscle Res Cell Motil 2017; 38:291-302. [PMID: 29101517 PMCID: PMC5742120 DOI: 10.1007/s10974-017-9486-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/22/2017] [Accepted: 10/28/2017] [Indexed: 11/29/2022]
Abstract
HCM, the most common inherited cardiac disease, is mainly caused by mutations in sarcomeric genes. More than a third of the patients are heterozygous for mutations in the MYH7 gene encoding the β-myosin heavy chain. In HCM-patients, expression of the mutant and the wildtype allele can be unequal, thus leading to fractions of mutant and wildtype mRNA and protein which deviate from 1:1. This so-called allelic imbalance was detected in whole tissue samples but also in individual cells. There is evidence that the severity of HCM not only depends on the functional effect of the mutation itself, but also on the fraction of mutant protein in the myocardial tissue. Allelic imbalance has been shown to occur in a broad range of genes. Therefore, we aimed to examine whether imbalanced expression of the MYH7-alleles is intrinsic or whether the allelic imbalance is solely associated with the disease. We compared the expression of MYH7-alleles in non-HCM donors and in HCM-patients with different MYH7-missense mutations. In the HCM-patients, we identified imbalanced as well as equal expression of both alleles. Allelic imbalance was also detected at the protein level. Most interestingly, we also discovered allelic imbalance and balance in non-HCM donors. Our findings therefore strongly indicate that, apart from mutation-specific mechanisms, allelic mRNA expression regulation that is not associated with HCM may also account for the allelic imbalance of the MYH7 gene in HCM-patients. Since the relative amount of mutant mRNA and protein or the extent of allelic imbalance has been associated with the severity of HCM, individual analysis of the MYH7-allelic expression may provide valuable information for the prognosis of each patient.
Affiliation(s)
- Judith Montag
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany.
| | - Mandy Syring
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany
| | - Julia Rose
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany
| | - Anna-Lena Weber
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany
| | - Pia Ernstberger
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany
| | - Anne-Kathrin Mayer
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany
| | - Edgar Becker
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany
| | - Britta Keyser
- Institute of Human Genetics, Hannover Medical School, Hanover, Germany
| | | | - Andreas Perrot
- Experimental and Clinical Research Center, Charité-University Clinic Berlin, Berlin, Germany
| | - Jolanda van der Velden
- Department of Physiology, Institute for Cardiovascular Research, VU University, Amsterdam, The Netherlands
| | - Antonio Francino
- Hospital Clinic/IDIBAPS, University of Barcelona, Barcelona, Spain
| | | | | | - Bernhard Brenner
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany
| | - Theresia Kraft
- Institute of Molecular and Cell Physiology, Hannover Medical School, Hanover, Germany
|
21
|
Wang M, Uebbing S, Ellegren H. Bayesian Inference of Allele-Specific Gene Expression Indicates Abundant Cis-Regulatory Variation in Natural Flycatcher Populations. Genome Biol Evol 2017; 9:1266-1279. [PMID: 28453623 PMCID: PMC5434935 DOI: 10.1093/gbe/evx080] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Accepted: 04/25/2017] [Indexed: 12/13/2022] Open
Abstract
Polymorphism in cis-regulatory sequences can lead to different levels of expression for the two alleles of a gene, providing a starting point for the evolution of gene expression. Little is known about the genome-wide abundance of genetic variation in gene regulation in natural populations but analysis of allele-specific expression (ASE) provides a means for investigating such variation. We performed RNA-seq of multiple tissues from population samples of two closely related flycatcher species and developed a Bayesian algorithm that maximizes data usage by borrowing information from the whole data set and combines several SNPs per transcript to detect ASE. Of 2,576 transcripts analyzed in collared flycatcher, ASE was detected in 185 (7.2%) and a similar frequency was seen in the pied flycatcher. Transcripts with statistically significant ASE commonly showed the major allele in >90% of the reads, reflecting that power was highest when expression was heavily biased toward one of the alleles. This would suggest that the observed frequencies of ASE likely are underestimates. The proportion of ASE transcripts varied among tissues, being lowest in testis and highest in muscle. Individuals often showed ASE of particular transcripts in more than one tissue (73.4%), consistent with a genetic basis for regulation of gene expression. The results suggest that genetic variation in regulatory sequences commonly affects gene expression in natural populations and that it provides a seedbed for phenotypic evolution via divergence in gene expression.
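The remark that significant transcripts commonly showed the major allele in more than 90% of reads can be illustrated with a much simpler test than the one the authors describe. The sketch below is not their Bayesian algorithm: it swaps in a two-sided exact binomial test at a single SNP, assuming reads are sampled independently and that balanced expression corresponds to a 0.5 reference fraction. The function name and read counts are hypothetical.

```python
from math import comb


def binomial_ase_pvalue(ref_reads, alt_reads, p_null=0.5):
    """Two-sided exact binomial p-value for allelic imbalance at one SNP.

    Sums the probability of every read split that is no more likely,
    under balanced expression, than the split actually observed.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# With only 10 reads, a 7:3 split is unremarkable but a 9:1 split is not.
p_moderate = binomial_ase_pvalue(7, 3)   # about 0.34
p_extreme = binomial_ase_pvalue(9, 1)    # about 0.021
```

At this read depth only strongly skewed SNPs reach conventional significance, which is consistent with the abstract's point that power was highest when expression was heavily biased toward one allele and that observed ASE frequencies are likely underestimates.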
Affiliation(s)
- Mi Wang
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Sweden
| | - Severin Uebbing
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Sweden
| | - Hans Ellegren
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Sweden
|
22
|
Zhu F, Schlupp I, Tiedemann R. Allele-specific expression at the androgen receptor alpha gene in a hybrid unisexual fish, the Amazon molly (Poecilia formosa). PLoS One 2017; 12:e0186411. [PMID: 29023530 PMCID: PMC5638567 DOI: 10.1371/journal.pone.0186411] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 05/04/2017] [Accepted: 09/29/2017] [Indexed: 12/25/2022] Open
Abstract
The all-female Amazon molly (Poecilia formosa) is the result of a hybridization of the Atlantic molly (P. mexicana) and the sailfin molly (P. latipinna) approximately 120,000 years ago. As a gynogenetic species, P. formosa needs to copulate with heterospecific males including males from one of its bisexual ancestral species. However, the sperm only triggers embryogenesis of the diploid eggs. The genetic information of the sperm donor typically will not contribute to the next generation of P. formosa. Hence, P. formosa possesses generally one allele from each of its ancestral species at any genetic locus. This raises the question whether both ancestral alleles are equally expressed in P. formosa. Allele-specific expression (ASE) has been previously assessed in various organisms, e.g., human and fish, and ASE was found to be important in the context of phenotypic variability and disease. In this study, we utilized Real-Time PCR techniques to estimate ASE of the androgen receptor alpha (arα) gene in several distinct tissues of Amazon mollies. We found an allelic bias favoring the maternal ancestor (P. mexicana) allele in ovarian tissue. This allelic bias was not observed in the gill or the brain tissue. Sequencing of the promoter regions of both alleles revealed an association between an Indel in a known CpG island and differential expression. Future studies may reveal whether our observed cis-regulatory divergence is caused by an ovary-specific trans-regulatory element, preferentially activating the allele of the maternal ancestor.
Affiliation(s)
- Fangjun Zhu
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
| | - Ingo Schlupp
- Department of Biology, University of Oklahoma, Norman, Oklahoma, United States of America
| | - Ralph Tiedemann
- Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
|
23
|
RNA-Seq Analyses Identify Frequent Allele Specific Expression and No Evidence of Genomic Imprinting in Specific Embryonic Tissues of Chicken. Sci Rep 2017; 7:11944. [PMID: 28931927 PMCID: PMC5607270 DOI: 10.1038/s41598-017-12179-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 03/06/2017] [Accepted: 09/05/2017] [Indexed: 12/30/2022] Open
Abstract
Epigenetic and genetic cis-regulatory elements in diploid organisms may cause allele specific expression (ASE) – unequal expression of the two chromosomal gene copies. Genomic imprinting is an intriguing type of ASE in which some genes are expressed monoallelically from either the paternal allele or maternal allele as a result of epigenetic modifications. Imprinted genes have been identified in several animal species and are frequently associated with embryonic development and growth. Whether genomic imprinting exists in chickens remains debatable, as previous studies have reported conflicting evidence. Although no genomic imprinting has been reported in the chicken embryo as a whole, we interrogated the existence or absence of genomic imprinting in the 12-day-old chicken embryonic brain and liver by examining ASE in F1 reciprocal crosses of two highly inbred chicken lines (Fayoumi and Leghorn). We identified 5197 and 4638 ASE SNPs, corresponding to 18.3% and 17.3% of the genes with detectable expression in the embryonic brain and liver, respectively. We detected no evidence of genomic imprinting in the 12-day-old embryonic brain and liver. While ruling out the possibility of imprinted Z-chromosome inactivation, our results indicated that Z-linked gene expression is partially compensated between sexes in chickens.
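The logic of using F1 reciprocal crosses to separate imprinting from sequence-driven ASE can be sketched as a small decision rule. This is an illustration of the general design only, not the pipeline used in the study. The function name, the `margin` threshold, and the allelic fractions are hypothetical, and a real analysis would rely on a statistical test rather than a fixed cutoff.

```python
def classify_reciprocal_cross(frac_a_cross1, frac_a_cross2, margin=0.15):
    """Classify allelic bias at one SNP from two reciprocal F1 crosses.

    frac_a_cross1: share of reads from the strain-A allele when A is the mother.
    frac_a_cross2: share of reads from the strain-A allele when A is the father.
    margin: hypothetical minimum deviation from 0.5 needed to call a bias.
    """
    def direction(frac):
        if frac >= 0.5 + margin:
            return "A"
        if frac <= 0.5 - margin:
            return "B"
        return None

    d1, d2 = direction(frac_a_cross1), direction(frac_a_cross2)
    if d1 is None or d2 is None:
        return "balanced or inconclusive"
    if d1 == d2:
        # The same strain's allele is favoured whichever parent supplied it.
        return "strain-dependent ASE (cis-genetic)"
    # The favoured strain flips with the direction of the cross.
    return "maternal expression" if d1 == "A" else "paternal expression"
```

For example, `classify_reciprocal_cross(0.8, 0.8)` returns the strain-dependent label, whereas `classify_reciprocal_cross(0.9, 0.1)` returns "maternal expression", the parent-of-origin pattern expected of an imprinted gene.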
|
24
|
Chuang TJ, Tseng YH, Chen CY, Wang YD. Assessment of imprinting- and genetic variation-dependent monoallelic expression using reciprocal allele descendants between human family trios. Sci Rep 2017; 7:7038. [PMID: 28765567 PMCID: PMC5539102 DOI: 10.1038/s41598-017-07514-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/21/2017] [Accepted: 06/23/2017] [Indexed: 11/23/2022] Open
Abstract
Genomic imprinting is an important epigenetic process that silences one of the parentally-inherited alleles of a gene and thereby exhibits allelic-specific expression (ASE). Detection of human imprinting events is hampered by the infeasibility of the reciprocal mating system in humans and the removal of ASE events arising from non-imprinting factors. Here, we describe a pipeline with the pattern of reciprocal allele descendants (RADs) through genotyping and transcriptome sequencing data across independent parent-offspring trios to discriminate between varied types of ASE (e.g., imprinting, genetic variation-dependent ASE, and random monoallelic expression (RME)). We show that the vast majority of ASE events are due to sequence-dependent genetic variant, which are evolutionarily conserved and may themselves play a cis-regulatory role. Particularly, 74% of non-RAD ASE events, even though they exhibit ASE biases toward the same parentally-inherited allele across different individuals, are derived from genetic variation but not imprinting. We further show that the RME effect may affect the effectiveness of the population-based method for detecting imprinting events and our pipeline can help to distinguish between these two ASE types. Taken together, this study provides a good indicator for categorization of different types of ASE, opening up this widespread and complex mechanism for comprehensive characterization.
Affiliation(s)
- Chia-Ying Chen
  - Genomics Research Center, Academia Sinica, Taipei, Taiwan
- Yi-Da Wang
  - Genomics Research Center, Academia Sinica, Taipei, Taiwan
|
25
|
Urbanek MO, Krzyzosiak WJ. Discriminating RNA variants with single-molecule allele-specific FISH. MUTATION RESEARCH-REVIEWS IN MUTATION RESEARCH 2017; 773:230-241. [DOI: 10.1016/j.mrrev.2016.09.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/22/2016] [Revised: 09/12/2016] [Accepted: 09/13/2016] [Indexed: 10/21/2022]
|
26
|
Arts P, van der Raadt J, van Gestel SH, Steehouwer M, Shendure J, Hoischen A, Albers CA. Quantification of differential gene expression by multiplexed targeted resequencing of cDNA. Nat Commun 2017; 8:15190. [PMID: 28474677 PMCID: PMC5424154 DOI: 10.1038/ncomms15190] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/17/2016] [Accepted: 03/08/2017] [Indexed: 12/19/2022]
Abstract
Whole-transcriptome or RNA sequencing (RNA-Seq) is a powerful and versatile tool for functional analysis of different types of RNA molecules, but sample reagent and sequencing cost can be prohibitive for hypothesis-driven studies where the aim is to quantify differential expression of a limited number of genes. Here we present an approach for quantification of differential mRNA expression by targeted resequencing of complementary DNA using single-molecule molecular inversion probes (cDNA-smMIPs) that enable highly multiplexed resequencing of cDNA target regions of ∼100 nucleotides and counting of individual molecules. We show that accurate estimates of differential expression can be obtained from molecule counts for hundreds of smMIPs per reaction and that smMIPs are also suitable for quantification of relative gene expression and allele-specific expression. Compared with low-coverage RNA-Seq and a hybridization-based targeted RNA-Seq method, cDNA-smMIPs are a cost-effective high-throughput tool for hypothesis-driven expression analysis in large numbers of genes (10 to 500) and samples (hundreds to thousands).
Affiliation(s)
- Peer Arts
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Jori van der Raadt
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
  - Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, Radboud University, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Sebastianus H.C. van Gestel
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
  - Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, Radboud University, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Marloes Steehouwer
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Jay Shendure
  - Department of Genome Sciences, University of Washington, Foege Building S-250, Box 355065, 3720 15th Ave NE, Seattle, Washington 98195-5065, USA
  - Howard Hughes Medical Institute, Seattle, Washington 98195, USA
- Alexander Hoischen
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
  - Department of Internal Medicine and Radboud Center for Infectious Diseases (RCI), Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
- Cornelis A. Albers
  - Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
  - Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, Radboud University, PO Box 9101, 6500 HB, Nijmegen, The Netherlands
|
27
|
Hong R, Lin B, Lu X, Lai LT, Chen X, Sanyal A, Ng HH, Zhang K, Zhang LF. High-resolution RNA allelotyping along the inactive X chromosome: evidence of RNA polymerase III in regulating chromatin configuration. Sci Rep 2017; 7:45460. [PMID: 28368037 PMCID: PMC5377358 DOI: 10.1038/srep45460] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/26/2016] [Accepted: 03/02/2017] [Indexed: 01/02/2023]
Abstract
We carried out padlock capture, a high-resolution RNA allelotyping method, to study X chromosome inactivation (XCI). We examined the gene reactivation pattern along the inactive X (Xi) after Xist (X-inactive specific transcript), a prototype long non-coding RNA essential for establishing XCI in early embryos, is conditionally deleted from Xi in somatic cells (Xi∆Xist). We also monitored the behaviors of X-linked non-coding transcripts before and after XCI. In each mutant cell line, gene reactivation occurs in ~6% of genes along Xi∆Xist in a recognizable pattern. Genes with upstream regions enriched for SINEs are prone to be reactivated. SINEs are a class of retrotransposons transcribed by RNA polymerase III (Pol III). Intriguingly, a significant fraction of Pol III transcription from non-coding regions is not subjected to Xist-mediated transcriptional silencing. Pol III inhibition affects gene reactivation status along Xi∆Xist, alters chromatin configuration and interferes with the establishment of XCI during in vitro differentiation of ES cells. These results suggest that Pol III transcription is involved in chromatin structure re-organization during the onset of XCI and functions as a general mechanism regulating chromatin configuration in mammalian cells.
Affiliation(s)
- Ru Hong
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive 637551, Singapore
- Bingqing Lin
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive 637551, Singapore
- Xinyi Lu
  - Genome Institute of Singapore, 138672, Singapore
- Lan-Tian Lai
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive 637551, Singapore
- Xin Chen
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, 21 Nanyang Link 637371, Singapore
- Amartya Sanyal
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive 637551, Singapore
- Huck-Hui Ng
  - Genome Institute of Singapore, 138672, Singapore
- Kun Zhang
  - Department of Bioengineering, University of California at San Diego, La Jolla, CA 92093, USA
- Li-Feng Zhang
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive 637551, Singapore
|
28
|
Deonovic B, Wang Y, Weirather J, Wang XJ, Au KF. IDP-ASE: haplotyping and quantifying allele-specific expression at the gene and gene isoform level by hybrid sequencing. Nucleic Acids Res 2017; 45:e32. [PMID: 27899656 PMCID: PMC5952581 DOI: 10.1093/nar/gkw1076] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 06/15/2016] [Revised: 10/20/2016] [Accepted: 10/26/2016] [Indexed: 12/14/2022]
Abstract
Allele-specific expression (ASE) is a fundamental problem in studying gene regulation and diploid transcriptome profiles, with two key challenges: (i) haplotyping and (ii) estimation of ASE at the gene isoform level. Existing ASE analysis methods are limited by a dependence on haplotyping from laborious experiments or extra genome/family trio data. In addition, there is a lack of methods for gene isoform level ASE analysis. We developed a tool, IDP-ASE, for full ASE analysis. By innovative integration of Third Generation Sequencing (TGS) long reads with Second Generation Sequencing (SGS) short reads, the accuracy of haplotyping and ASE quantification at the gene and gene isoform level was greatly improved, as demonstrated by the gold-standard GM12878 data and semi-simulation data. In addition to methodology development, applications of IDP-ASE to human embryonic stem cells and breast cancer cells indicate that the imbalance of ASE and non-uniformity of gene isoform ASE are widespread, including in tumorigenesis-relevant genes and pluripotency markers. These results show that gene isoform expression and allele-specific expression cooperate to provide high diversity and complexity of gene regulation and expression, highlighting the importance of studying ASE at the gene isoform level. Our study provides a robust bioinformatics solution to understand ASE using RNA sequencing data only.
Affiliation(s)
- Benjamin Deonovic
  - Department of Biostatistics, University of Iowa, Iowa City, IA 52242, USA
- Yunhao Wang
  - Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
  - Key laboratory of Genetics Network Biology, Collaborative Innovation Center of Genetics and Development, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Jason Weirather
  - Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
- Xiu-Jie Wang
  - Key laboratory of Genetics Network Biology, Collaborative Innovation Center of Genetics and Development, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Kin Fai Au
  - Department of Biostatistics, University of Iowa, Iowa City, IA 52242, USA
  - Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
|
29
|
Lee JH. Quantitative approaches for investigating the spatial context of gene expression. WILEY INTERDISCIPLINARY REVIEWS-SYSTEMS BIOLOGY AND MEDICINE 2016; 9. [PMID: 28001340 PMCID: PMC5315614 DOI: 10.1002/wsbm.1369] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 08/09/2016] [Revised: 10/19/2016] [Accepted: 10/25/2016] [Indexed: 01/01/2023]
Abstract
The spatial information associated with gene expression is important for elucidating the context-dependent transcriptional regulation during development. Recently, high-resolution sampling approaches, such as RNA tomography or single-cell RNA-seq combined with fluorescence in situ hybridization (FISH), have provided indirect ways to view global gene expression patterns in three dimensions. Now in situ sequencing technologies, such as fluorescent in situ sequencing (FISSEQ), are attempting to visualize the genetic signature directly in microscope images. This article will examine the basic principle of modern in situ and single-cell genetic methods, hurdles in quantifying intrinsic and extrinsic forces that influence cell decision-making, and technological requirements for making a visual map of gene regulation, form, and function. Successfully addressing these challenges will be essential for investigating the functional evolution of regulatory sequences during growth, development, and cancer progression. WIREs Syst Biol Med 2017, 9:e1369. doi: 10.1002/wsbm.1369
Affiliation(s)
- Je H Lee
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
|
30
|
Juneja P, Quinn A, Jiggins FM. Latitudinal clines in gene expression and cis-regulatory element variation in Drosophila melanogaster. BMC Genomics 2016; 17:981. [PMID: 27894253 PMCID: PMC5126864 DOI: 10.1186/s12864-016-3333-7] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 08/04/2016] [Accepted: 11/23/2016] [Indexed: 02/07/2023]
Abstract
BACKGROUND Organisms can rapidly adapt to their environment when colonizing a new habitat, and this could occur by changing protein sequences or by altering patterns of gene expression. The importance of gene expression in driving local adaptation is increasingly being appreciated, and cis-regulatory elements (CREs), which control and modify the expression of nearby genes, are predicted to play an important role. Here we investigate genetic variation in gene expression in immune-challenged Drosophila melanogaster from temperate and tropical or sub-tropical populations in Australia and the United States. RESULTS We find parallel latitudinal changes in gene expression, with genes involved in immunity, insecticide resistance, reproduction, and the response to the environment being especially likely to differ between latitudes. By measuring allele-specific gene expression (ASE), we show that cis-regulatory variation also shows parallel latitudinal differences between the two continents and contributes to the latitudinal differences in gene expression. CONCLUSIONS Both Australia and the United States were relatively recently colonized by D. melanogaster, and it was recently shown that introductions of both African and European flies occurred, with African genotypes contributing disproportionately to tropical populations. Therefore, both the demographic history of the populations and local adaptation may be causing the patterns that we see.
Affiliation(s)
- Punita Juneja
  - Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
- Andrew Quinn
  - Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
- Francis M Jiggins
  - Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
|
31
|
Movassagh M, Alomran N, Mudvari P, Dede M, Dede C, Kowsari K, Restrepo P, Cauley E, Bahl S, Li M, Waterhouse W, Tsaneva-Atanasova K, Edwards N, Horvath A. RNA2DNAlign: nucleotide resolution allele asymmetries through quantitative assessment of RNA and DNA paired sequencing data. Nucleic Acids Res 2016; 44:e161. [PMID: 27576531 PMCID: PMC5159535 DOI: 10.1093/nar/gkw757] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/18/2016] [Revised: 08/15/2016] [Accepted: 08/19/2016] [Indexed: 12/14/2022]
Abstract
We introduce RNA2DNAlign, a computational framework for quantitative assessment of allele counts across paired RNA and DNA sequencing datasets. RNA2DNAlign is based on quantitation of the relative abundance of variant and reference read counts, followed by binomial tests for genotype and allelic status at SNV positions between compatible sequences. RNA2DNAlign detects positions with differential allele distribution, suggesting asymmetries due to regulatory/structural events. Based on the type of asymmetry, RNA2DNAlign outlines positions likely to be implicated in RNA editing, allele-specific expression or loss, somatic mutagenesis or loss-of-heterozygosity (the first three also in a tumor-specific setting). We applied RNA2DNAlign to 360 matching normal and tumor exomes and transcriptomes from 90 breast cancer patients from TCGA. Under high-confidence settings, RNA2DNAlign identified 2038 distinct SNV sites associated with one of the aforementioned asymmetries, the majority of which have not been linked to functionality before. The performance assessment shows very high specificity and sensitivity, due to the corroboration of signals across multiple matching datasets. RNA2DNAlign is freely available from http://github.com/HorvathLab/NGS as a self-contained binary package for 64-bit Linux systems.
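The idea of applying a binomial test to variant and reference read counts at a single SNV position can be illustrated with a minimal sketch. This is not RNA2DNAlign's implementation: the function names, the expected allele fraction of 0.5 for a heterozygous site, and the 0.05 significance cutoff are all assumptions made for illustration, and the exact test is written with the Python standard library only.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n trials.
    """
    if n == 0:
        return 1.0  # no reads, no evidence either way
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so floating-point ties are counted.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-7))
    return min(1.0, total)


def allelic_imbalance(ref_reads: int, alt_reads: int, alpha: float = 0.05) -> bool:
    """Flag a site whose read counts deviate from a 50:50 allele ratio.

    alpha is a hypothetical cutoff; a real analysis would also correct
    for multiple testing and for reference-mapping bias.
    """
    n = ref_reads + alt_reads
    return binomial_two_sided_p(alt_reads, n) < alpha


if __name__ == "__main__":
    # A balanced site and a strongly skewed site.
    print(allelic_imbalance(50, 50))
    print(allelic_imbalance(90, 10))
```

Comparing the call made from RNA counts with the call made from matched DNA counts at the same position is what separates expression asymmetry from a genotype effect, which is the pairing the abstract describes.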
Affiliation(s)
- Mercedeh Movassagh
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
  - University of Massachusetts Medical School, Graduate School of Biomedical Sciences, Program in Bioinformatics and Integrative Biology, Worcester, MA 01605, USA
- Nawaf Alomran
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
  - Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, Washington, DC 20057, USA
- Prakriti Mudvari
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
- Merve Dede
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
- Cem Dede
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
- Kamran Kowsari
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
  - Department of Computer Science, School of Engineering and applied Science, The George Washington University, Washington, DC 20037, USA
- Paula Restrepo
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
- Edmund Cauley
  - Department of Pharmacology and Physiology, The George Washington University, Washington, DC 20037, USA
- Sonali Bahl
  - Department of Pharmacology and Physiology, The George Washington University, Washington, DC 20037, USA
- Muzi Li
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
  - Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, Washington, DC 20057, USA
- Wesley Waterhouse
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
- Krasimira Tsaneva-Atanasova
  - Department of Mathematics, College of Engineering, Mathematics and Physical Sciences & EPSRC Centre for Predictive Modelling in Healthcare, University of Exeter, Exeter, EX4 4QJ, UK
- Nathan Edwards
  - Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, Washington, DC 20057, USA
- Anelia Horvath
  - McCormick Genomics and Proteomics Center, Department of Biochemistry and Molecular Medicine, The George Washington University, Washington, DC 20037, USA
  - Department of Pharmacology and Physiology, The George Washington University, Washington, DC 20037, USA
|
32
|
King LB, Walum H, Inoue K, Eyrich NW, Young LJ. Variation in the Oxytocin Receptor Gene Predicts Brain Region-Specific Expression and Social Attachment. Biol Psychiatry 2016; 80:160-169. [PMID: 26893121 PMCID: PMC4909578 DOI: 10.1016/j.biopsych.2015.12.008] [Citation(s) in RCA: 119] [Impact Index Per Article: 14.9] [Received: 10/02/2015] [Revised: 11/09/2015] [Accepted: 12/05/2015] [Indexed: 12/14/2022]
Abstract
BACKGROUND Oxytocin (OXT) modulates several aspects of social behavior. Intranasal OXT is a leading candidate for treating social deficits in patients with autism spectrum disorder, and common genetic variants in the human OXTR gene are associated with emotion recognition, relationship quality, and autism spectrum disorder. Animal models have revealed that individual differences in Oxtr expression in the brain drive social behavior variation. Our understanding of how genetic variation contributes to brain OXTR expression is very limited. METHODS We investigated Oxtr expression in monogamous prairie voles, which have a well-characterized OXT system. We quantified brain region-specific levels of Oxtr messenger RNA and oxytocin receptor protein with established neuroanatomic methods. We used pyrosequencing to investigate allelic imbalance of Oxtr mRNA, a molecular signature of polymorphic genetic regulatory elements. We performed next-generation sequencing to discover variants in and near the Oxtr gene. We investigated social attachment using the partner preference test. RESULTS Our allelic imbalance data demonstrate that genetic variants contribute to individual differences in Oxtr expression, but only in particular brain regions, including the nucleus accumbens, where oxytocin receptor signaling facilitates social attachment. Next-generation sequencing identified one polymorphism in the Oxtr intron, near a putative cis-regulatory element, explaining 74% of the variance in striatal Oxtr expression specifically. Males homozygous for the high expressing allele display enhanced social attachment. CONCLUSIONS Taken together, these findings provide convincing evidence for robust genetic influence on Oxtr expression and provide novel insights into how noncoding polymorphisms in OXTR might influence individual differences in human social cognition and behavior.
Affiliation(s)
- Larry J. Young
  - Address Correspondence to: Larry J. Young, 954 Gatewood Rd., Yerkes National Primate Research Center, Emory University, Atlanta, GA 30329, USA, Phone: 404 727-8272, Fax: 404 727-8070
|
33
|
Liu Z, Gui T, Wang Z, Li H, Fu Y, Dong X, Li Y. cisASE: a likelihood-based method for detecting putative cis-regulated allele-specific expression in RNA sequencing data. Bioinformatics 2016; 32:3291-3297. [PMID: 27412088 DOI: 10.1093/bioinformatics/btw416] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/27/2015] [Accepted: 06/24/2016] [Indexed: 12/31/2022]
Abstract
MOTIVATION Allele-specific expression (ASE) is a useful way to identify cis-acting regulatory variation, which provides opportunities to develop new therapeutic strategies that activate beneficial alleles or silence mutated alleles at specific loci. However, multiple problems hinder the identification of ASE in next-generation sequencing (NGS) data. RESULTS We developed cisASE, a likelihood-based method for detecting ASE on single nucleotide variant (SNV), exon and gene levels from sequencing data without requiring phasing or parental information. cisASE uses matched DNA-seq data to control technical bias and copy number variation (CNV) in putative cis-regulated ASE identification. Compared with state-of-the-art methods, cisASE exhibits significantly increased accuracy and speed. cisASE works moderately well for datasets without DNA-seq and thus is widely applicable. By applying cisASE to real datasets, we identified specific ASE characteristics in normal and cancer tissues, thus indicating that cisASE has potential for wide applications in cancer genomics. AVAILABILITY AND IMPLEMENTATION cisASE is freely available at http://lifecenter.sgst.cn/cisASE. CONTACT biosinodx@gmail.com or yxli@sibs.ac.cn. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhi Liu
  - Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Tuantuan Gui
  - Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Zhen Wang
  - Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Hong Li
  - Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yunhe Fu
  - Key Laboratory of Systems Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Xiao Dong
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Yixue Li
  - Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
  - School of Life Science and Technology, Shanghai Jiaotong University, Shanghai 200240, China
  - Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai 201203, China
  - Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai 200438, China
|
34
|
Magne F, Ge B, Larrivée-Vanier S, Van Vliet G, Samuels ME, Pastinen T, Deladoëy J. Demonstration of Autosomal Monoallelic Expression in Thyroid Tissue Assessed by Whole-Exome and Bulk RNA Sequencing. Thyroid 2016; 26:852-9. [PMID: 27125219 DOI: 10.1089/thy.2016.0009] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 12/25/2022]
Abstract
BACKGROUND Congenital hypothyroidism due to thyroid dysgenesis (CHTD) is a disorder with a prevalence of 1/4000 live births, the cause of which remains unknown. The most common diagnostic category is thyroid ectopy, which occurs in up to 80% of CHTD cases. CHTD is predominantly not inherited and has a high discordance rate (>92%) between monozygotic (MZ) twins. The sporadic nature of CHTD might be explained by somatic events such as autosomal monoallelic expression (AME), given that genes expressed in a monoallelic way are more vulnerable to otherwise benign monoallelic genetic or epigenetic mutations. OBJECTIVE The aim of this study was to search for complete (90%) AME in normal and dysgenetic thyroid tissues. METHODS Aggregated analysis of whole-exome and bulk RNA sequencing was performed on two ectopic thyroids, four normal thyroids, and the human thyroid cell line Nthy-ori. RESULTS A median of 5062 (range 2081-5270) genes per sample showed sufficient numbers of heterozygous single nucleotide polymorphisms to be informative. The median monoallelic expression represented 22 (range 16-32) of the informative genes for each thyroid sample. Examples of genes displaying AME are FCGBP, ZNF331, USP10, BCLAF1, and some HLA genes; these genes are involved in epithelial-mesenchymal transition, cell migration, cancer, and immunity. CONCLUSIONS AME may account for the high discordance rate observed between MZ twins and for the sporadic nature of CHTD. These findings also have implications for other pathologies, including cancers and autoimmune disorders of the thyroid.
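Scoring a gene as monoallelically expressed from RNA read counts at a heterozygous SNP can be sketched as follows. This is only an illustration, not the study's pipeline: the 0.90 cutoff interprets the abstract's "(90%)" as a minimum major-allele fraction, and the minimum depth of 10 reads and all function names are hypothetical.

```python
def major_allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carried by the more abundant allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return max(ref_reads, alt_reads) / total


def is_monoallelic(
    ref_reads: int,
    alt_reads: int,
    threshold: float = 0.90,  # assumed reading of "complete (90%)"
    min_depth: int = 10,      # hypothetical coverage requirement
) -> bool:
    """Call monoallelic expression at a heterozygous SNP.

    Sites with fewer than min_depth reads are treated as
    uninformative and never called monoallelic.
    """
    if ref_reads + alt_reads < min_depth:
        return False
    return major_allele_fraction(ref_reads, alt_reads) >= threshold


if __name__ == "__main__":
    # (ref, alt) RNA read counts at three heterozygous sites.
    for counts in [(95, 5), (60, 40), (9, 0)]:
        print(counts, is_monoallelic(*counts))
```

The heterozygous genotype itself would come from the exome data, so that a skewed RNA ratio is not mistaken for a homozygous site.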
Affiliation(s)
- Fabien Magne
  - Endocrinology Service and Research Center, Sainte-Justine University Hospital Center, Department of Pediatrics, Université de Montréal, Montreal, Canada
  - Department of Biomedical Sciences, Université de Montréal, Montreal, Canada
- Bing Ge
  - Department of Human Genetics, McGill University, Montreal, Canada
- Stéphanie Larrivée-Vanier
  - Endocrinology Service and Research Center, Sainte-Justine University Hospital Center, Department of Pediatrics, Université de Montréal, Montreal, Canada
- Guy Van Vliet
  - Endocrinology Service and Research Center, Sainte-Justine University Hospital Center, Department of Pediatrics, Université de Montréal, Montreal, Canada
- Mark E Samuels
  - Endocrinology Service and Research Center, Sainte-Justine University Hospital Center, Department of Pediatrics, Université de Montréal, Montreal, Canada
  - Department of Medicine, Université de Montréal, Montreal, Canada
- Tomi Pastinen
  - Department of Human Genetics, McGill University, Montreal, Canada
- Johnny Deladoëy
  - Endocrinology Service and Research Center, Sainte-Justine University Hospital Center, Department of Pediatrics, Université de Montréal, Montreal, Canada
  - Department of Biomedical Sciences, Université de Montréal, Montreal, Canada
  - Department of Biochemistry, Université de Montréal, Montreal, Canada
|
35
|
Edsgärd D, Iglesias MJ, Reilly SJ, Hamsten A, Tornvall P, Odeberg J, Emanuelsson O. GeneiASE: Detection of condition-dependent and static allele-specific expression from RNA-seq data without haplotype information. Sci Rep 2016; 6:21134. [PMID: 26887787 PMCID: PMC4758070 DOI: 10.1038/srep21134] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 10/08/2015] [Accepted: 01/18/2016] [Indexed: 12/20/2022]
Abstract
Allele-specific expression (ASE) is the imbalance in transcription between maternal and paternal alleles at a locus and can be probed in single individuals using massively parallel DNA sequencing technology. Assessing ASE within a single sample provides a static picture of the ASE, but the magnitude of ASE for a given transcript may vary between different biological conditions in an individual. Such condition-dependent ASE could indicate a genetic variation with a functional role in the phenotypic difference. We investigated ASE through RNA-sequencing of primary white blood cells from eight human individuals before and after the controlled induction of an inflammatory response, and detected condition-dependent and static ASE at 211 and 13021 variants, respectively. We developed a method, GeneiASE, to detect genes exhibiting static or condition-dependent ASE in single individuals. GeneiASE performed consistently over a range of read depths and ASE effect sizes, and did not require phasing of variants to estimate haplotypes. We observed condition-dependent ASE related to the inflammatory response in 19 genes, and static ASE in 1389 genes. Allele-specific expression was confirmed by validation of variants through real-time quantitative RT-PCR, with RNA-seq and RT-PCR ASE effect-size correlations r = 0.67 and r = 0.94 for static and condition-dependent ASE, respectively.
Affiliation(s)
- Daniel Edsgärd
  - KTH Royal Institute of Technology, Science for Life Laboratory, School of Biotechnology, Division of Gene Technology, SE-171 65, Solna, Sweden
- Maria Jesus Iglesias
  - Atherosclerosis Research Unit, Department of Medicine Solna, Karolinska Institutet, Center for Molecular Medicine, and Department of Cardiology, Karolinska University Hospital, Stockholm, Sweden
  - KTH Royal Institute of Technology, Science for Life Laboratory, School of Biotechnology, Division of Proteomics, SE-171 65, Solna, Sweden
- Sarah-Jayne Reilly
  - Atherosclerosis Research Unit, Department of Medicine Solna, Karolinska Institutet, Center for Molecular Medicine, and Department of Cardiology, Karolinska University Hospital, Stockholm, Sweden
- Anders Hamsten
  - Atherosclerosis Research Unit, Department of Medicine Solna, Karolinska Institutet, Center for Molecular Medicine, and Department of Cardiology, Karolinska University Hospital, Stockholm, Sweden
- Per Tornvall
  - Department of Clinical Science and Education, Södersjukhuset, Karolinska Institutet, Stockholm, Sweden
- Jacob Odeberg
  - Atherosclerosis Research Unit, Department of Medicine Solna, Karolinska Institutet, Center for Molecular Medicine, and Department of Cardiology, Karolinska University Hospital, Stockholm, Sweden
  - KTH Royal Institute of Technology, Science for Life Laboratory, School of Biotechnology, Division of Proteomics, SE-171 65, Solna, Sweden
  - Department of Medicine, Centre for Hematology, Karolinska University Hospital and Karolinska Institutet, Solna, Sweden
- Olof Emanuelsson
  - KTH Royal Institute of Technology, Science for Life Laboratory, School of Biotechnology, Division of Gene Technology, SE-171 65, Solna, Sweden
|
36
|
Erdem-Eraslan L, van den Bent MJ, Hoogstrate Y, Naz-Khan H, Stubbs A, van der Spek P, Böttcher R, Gao Y, de Wit M, Taal W, Oosterkamp HM, Walenkamp A, Beerepoot LV, Hanse MCJ, Buter J, Honkoop AH, van der Holt B, Vernhout RM, Sillevis Smitt PAE, Kros JM, French PJ. Identification of Patients with Recurrent Glioblastoma Who May Benefit from Combined Bevacizumab and CCNU Therapy: A Report from the BELOB Trial. Cancer Res 2016; 76:525-34. [PMID: 26762204 DOI: 10.1158/0008-5472.can-15-0776] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Received: 03/27/2015] [Accepted: 10/08/2015] [Indexed: 11/16/2022]
Abstract
The results from the randomized phase II BELOB trial provided evidence for a potential benefit of bevacizumab (beva), a humanized monoclonal antibody against circulating VEGF-A, when added to CCNU chemotherapy in patients with recurrent glioblastoma (GBM). In this study, we performed gene expression profiling (DASL and RNA-seq) of formalin-fixed, paraffin-embedded tumor material from participants of the BELOB trial to identify patients with recurrent GBM who benefitted most from beva+CCNU treatment. We demonstrate that tumors assigned to the IGS-18 or "classical" subtype and treated with beva+CCNU showed a significant benefit in progression-free survival and a trend toward benefit in overall survival, whereas other subtypes did not exhibit such benefit. In particular, expression of FMO4 and OSBPL3 was associated with treatment response. Importantly, the improved outcome in the beva+CCNU treatment arm was not explained by an uneven distribution of prognostically favorable subtypes as all molecular glioma subtypes were evenly distributed along the different study arms. The RNA-seq analysis also highlighted genetic alterations, including mutations, gene fusions, and copy number changes, within this well-defined cohort of tumors that may serve as useful predictive or prognostic biomarkers of patient outcome. Further validation of the identified molecular markers may enable the future stratification of recurrent GBM patients into appropriate treatment regimens.
Affiliation(s)
- Lale Erdem-Eraslan
  - Department of Neurology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Youri Hoogstrate
  - Department of Urology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
  - Bioinformatics, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Hina Naz-Khan
  - Bioinformatics, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Andrew Stubbs
  - Bioinformatics, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- René Böttcher
  - Department of Urology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Ya Gao
  - Department of Neurology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Maurice de Wit
  - Department of Neurology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Walter Taal
  - Department of Neurology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Hendrika M Oosterkamp
  - Department of Medical Oncology, Medical Center Haaglanden, The Hague, the Netherlands
- Annemiek Walenkamp
  - Department of Medical Oncology, University Medical Center Groningen, Groningen, the Netherlands
- Monique C J Hanse
  - Department of Neurology, Catharina Hospital Eindhoven, the Netherlands
- Jan Buter
  - Department of Oncology, VU University Medical Center, Amsterdam, the Netherlands
- Aafke H Honkoop
  - Department of Internal Medicine, Isala Kliniek, Zwolle, the Netherlands
- Bronno van der Holt
  - Clinical Trial Center, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- René M Vernhout
  - Clinical Trial Center, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Johan M Kros
  - Pathology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Pim J French
  - Department of Neurology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
|
37
|
Maurano MT, Haugen E, Sandstrom R, Vierstra J, Shafer A, Kaul R, Stamatoyannopoulos JA. Large-scale identification of sequence variants influencing human transcription factor occupancy in vivo. Nat Genet 2015; 47:1393-401. [PMID: 26502339 PMCID: PMC4666772 DOI: 10.1038/ng.3432] [Citation(s) in RCA: 152] [Impact Index Per Article: 16.9] [Received: 07/17/2015] [Accepted: 10/02/2015] [Indexed: 12/18/2022]
Abstract
The function of human regulatory regions depends exquisitely on their local genomic environment and on cellular context, complicating experimental analysis of common disease- and trait-associated variants that localize within regulatory DNA. We use allelically resolved genomic DNase I footprinting data encompassing 166 individuals and 114 cell types to identify >60,000 common variants that directly influence transcription factor occupancy and regulatory DNA accessibility in vivo. The unprecedented scale of these data enables systematic analysis of the impact of sequence variation on transcription factor occupancy in vivo. We leverage this analysis to develop accurate models of variation affecting the recognition sites for diverse transcription factors and apply these models to discriminate nearly 500,000 common regulatory variants likely to affect transcription factor occupancy across the human genome. The approach and results provide a new foundation for the analysis and interpretation of noncoding variation in complete human genomes and for systems-level investigation of disease-associated variants.
Affiliation(s)
- Matthew T Maurano
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Eric Haugen
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Richard Sandstrom
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Jeff Vierstra
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Anthony Shafer
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Rajinder Kaul
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
  - Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington, USA
- John A Stamatoyannopoulos
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
  - Division of Oncology, Department of Medicine, University of Washington, Seattle, Washington, USA
  - Altius Institute for Biomedical Sciences, Seattle, Washington, USA
|
38
|
The conservation and signatures of lincRNAs in Marek's disease of chicken. Sci Rep 2015; 5:15184. [PMID: 26471251 PMCID: PMC4608010 DOI: 10.1038/srep15184] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 03/24/2015] [Accepted: 09/15/2015] [Indexed: 01/10/2023] Open
Abstract
Long intergenic non-coding RNAs (lincRNAs) associated with a number of cancers and other diseases have been identified in mammals, but identifying and characterizing them comprehensively remains a formidable challenge. Marek's disease (MD) is a T cell lymphoma of chickens induced by Marek's disease virus (MDV). Here, we used an MD chicken model to develop a precise pipeline for identifying lincRNAs and to determine the roles of lincRNAs in T cell tumorigenesis. More than 1,000 lincRNA loci were identified in chicken bursa. Computational analyses demonstrated that lincRNAs are conserved among different species such as human, mouse and chicken. The putative lincRNAs were found to be associated with a wide range of biological functions including immune responses. Interestingly, we observed distinct lincRNA expression signatures in bursa between MD-resistant and MD-susceptible lines of chickens. One of the candidate lincRNAs, termed linc-satb1, was found to play a crucial role in the MD immune response by regulating a nearby protein-coding gene, SATB1. Thus, our results indicate that lincRNAs may exert considerable influence on MDV-induced T cell tumorigenesis and provide a rich resource for hypothesis-driven functional studies to reveal genetic mechanisms underlying susceptibility to tumorigenesis.
|
39
|
Pirinen M, Lappalainen T, Zaitlen NA, Dermitzakis ET, Donnelly P, McCarthy MI, Rivas MA. Assessing allele-specific expression across multiple tissues from RNA-seq read data. Bioinformatics 2015; 31:2497-504. [PMID: 25819081 PMCID: PMC4514921 DOI: 10.1093/bioinformatics/btv074] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 09/28/2014] [Revised: 01/09/2015] [Accepted: 01/29/2015] [Indexed: 11/14/2022] Open
Abstract
Motivation: RNA sequencing enables allele-specific expression (ASE) studies that complement standard genotype expression studies for common variants and, importantly, also allow measuring the regulatory impact of rare variants. The Genotype-Tissue Expression (GTEx) project is collecting RNA-seq data on multiple tissues of the same set of individuals, and novel methods are required for the analysis of these data. Results: We present a statistical method to compare different patterns of ASE across tissues and to classify genetic variants according to their impact on the tissue-wide expression profile. We focus on the strong ASE effects that we expect to see for protein-truncating variants, but our method can also be adjusted for other types of ASE effects. We illustrate the method with a real data example on a tissue-wide expression profile of a variant causal for lipoid proteinosis, and with a simulation study to assess our method more generally.
Affiliation(s)
- Matti Pirinen
  - Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Tuuli Lappalainen
  - Department of Genetic Medicine and Development and Institute for Genetics and Genomics in Geneva (iG3), University of Geneva, Geneva, Switzerland
  - Swiss Institute of Bioinformatics, Geneva, Switzerland
  - Department of Genetics, Stanford University, Palo Alto, CA, USA
  - New York Genome Center, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Noah A Zaitlen
  - Department of Medicine, University of California, San Francisco, CA, USA
- Emmanouil T Dermitzakis
  - Department of Genetic Medicine and Development and Institute for Genetics and Genomics in Geneva (iG3), University of Geneva, Geneva, Switzerland
  - Swiss Institute of Bioinformatics, Geneva, Switzerland
- Peter Donnelly
  - Wellcome Trust Centre for Human Genetics and Department of Statistics, University of Oxford, Oxford, UK
- Mark I McCarthy
  - Wellcome Trust Centre for Human Genetics and Oxford Centre for Diabetes, Endocrinology and Metabolism, Oxford, UK
|
40
|
Myers MJ, Martinez M, Li H, Qiu J, Troutman L, Sharkey M, Yancy HF. Influence of ABCB1 Genotype in Collies on the Pharmacokinetics and Pharmacodynamics of Loperamide in a Dose-Escalation Study. Drug Metab Dispos 2015; 43:1392-407. [DOI: 10.1124/dmd.115.063735] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 02/06/2015] [Accepted: 07/07/2015] [Indexed: 11/22/2022] Open
|
41
|
Wang X, Cairns MJ. Understanding complex transcriptome dynamics in schizophrenia and other neurological diseases using RNA sequencing. Int Rev Neurobiol 2015; 116:127-52. [PMID: 25172474 DOI: 10.1016/b978-0-12-801105-8.00006-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/27/2023]
Abstract
How the human brain develops and adapts with its trillions of functionally integrated synapses remains one of the greatest mysteries of life. With tremendous advances in neuroscience, genetics, and molecular biology, we are beginning to appreciate the scope of this complexity and define some of the parameters of the systems that make it possible. These same tools are also leading to advances in our understanding of the pathophysiology of neurocognitive and neuropsychiatric disorders. Like the substrate for these problems, the etiology is usually complex, involving an array of genetic and environmental influences. To resolve these influences and derive better interventions, we need to reveal every aspect of this complexity, model their interactions, and define the systems and their regulatory structure. This is particularly important at the tissue-specific molecular interface between the underlying genetic and environmental influences defined by the transcriptome. Recent advances in transcriptome analysis facilitated by RNA sequencing (RNA-Seq) can provide unprecedented insight into the functional genomics of neurological disorders. In this review, we outline the advantages of this approach and highlight some early applications of this technology in the investigation of the neuropathology of schizophrenia. Recent progress in RNA-Seq studies of schizophrenia has shown extraordinary transcriptome dynamics with significant levels of alternative splicing. These studies only scratch the surface of this complexity, and future studies with greater depth and sample size will therefore be vital to fully explore transcriptional diversity and its underlying influences in schizophrenia and to provide the basis for new biomarkers and improved treatments.
Affiliation(s)
- Xi Wang
  - School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, The University of Newcastle, Callaghan, New South Wales, Australia
- Murray J Cairns
  - School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, The University of Newcastle, Callaghan, New South Wales, Australia
  - The Schizophrenia Research Institute, Sydney, Australia
|
42
|
Daniel G, Schmidt-Edelkraut U, Spengler D, Hoffmann A. Imprinted Zac1 in neural stem cells. World J Stem Cells 2015; 7:300-314. [PMID: 25815116 PMCID: PMC4369488 DOI: 10.4252/wjsc.v7.i2.300] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 07/25/2014] [Revised: 09/24/2014] [Accepted: 11/19/2014] [Indexed: 02/06/2023] Open
Abstract
Neural stem cells (NSCs) and imprinted genes play an important role in brain development. On historical grounds, these two determinants have been largely studied independently of each other. Recent evidence suggests, however, that NSCs can reset select genomic imprints to prevent precocious depletion of the stem cell reservoir. Moreover, imprinted genes like the transcriptional regulator Zac1 can fine tune neuronal vs astroglial differentiation of NSCs. Zac1 binds in a sequence-specific manner to pro-neuronal and imprinted genes to confer transcriptional regulation and furthermore coregulates members of the p53-family in NSCs. At the genome scale, Zac1 is a central hub of an imprinted gene network comprising genes with an important role for NSC quiescence, proliferation and differentiation. Overall, transcriptional, epigenomic, and genomic mechanisms seem to coordinate the functional relationships of NSCs and imprinted genes from development to maturation, and possibly aging.
|
43
|
Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues. Nat Protoc 2015; 10:442-58. [PMID: 25675209 DOI: 10.1038/nprot.2014.191] [Citation(s) in RCA: 335] [Impact Index Per Article: 37.2] [Indexed: 12/22/2022]
Abstract
RNA-sequencing (RNA-seq) measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. In contrast, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq, our method enriches for context-specific transcripts over housekeeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d.
|
44
|
Guo Y, Yang TL, Dong SS, Yan H, Hao RH, Chen XF, Chen JB, Tian Q, Li J, Shen H, Deng HW. Genetic analysis identifies DDR2 as a novel gene affecting bone mineral density and osteoporotic fractures in Chinese population. PLoS One 2015; 10:e0117102. [PMID: 25658585 PMCID: PMC4319719 DOI: 10.1371/journal.pone.0117102] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 10/08/2014] [Accepted: 12/17/2014] [Indexed: 11/19/2022] Open
Abstract
The DDR2 gene, which plays an essential role in regulating osteoblast differentiation and chondrocyte maturation, may influence bone mineral density (BMD) and osteoporosis, but the genetic variations actually leading to the association remain to be elucidated. Therefore, the aim of this study was to investigate whether the genetic variants in DDR2 are associated with BMD and fracture risk. This study was performed in three samples from two ethnicities, including 1,300 Chinese Han subjects, 700 Chinese Han subjects (350 with osteoporotic hip fractures and 350 healthy controls) and 2,286 US white subjects. Twenty-eight SNPs in DDR2 were genotyped and tested for associations with hip BMD and fractures. We identified 3 SNPs in DDR2 significantly associated with hip BMD in the Chinese population after multiple testing adjustments: rs7521233 (P = 1.06×10⁻⁴, β: -0.018 for allele C), rs7553831 (P = 1.30×10⁻⁴, β: -0.018 for allele T), and rs6697469 (P = 1.59×10⁻³, β: -0.015 for allele C). These three SNPs were in high linkage disequilibrium. Haplotype analyses detected two significantly associated haplotypes, including one haplotype in block 2 (P = 9.54×10⁻⁴, β: -0.016), where these three SNPs are located. SNP rs6697469 was also associated with hip fractures (P = 0.043, OR: 1.42) in the Chinese population. The effect on fracture risk was consistent with its association with lower BMD. However, in the white population, we did not observe significant associations with hip BMD. eQTL analyses revealed that SNPs associated with BMD also affected DDR2 mRNA expression levels in Chinese subjects. Our findings, together with the prior biological evidence, suggest that DDR2 could be a new candidate gene for osteoporosis in the Chinese population. Our results also reveal an ethnic difference, which highlights the need for further genetic studies in each ethnic group.
Affiliation(s)
- Yan Guo
  - Key Laboratory of Biomedical Information Engineering of Ministry of Education, and Institute of Molecular Genetics, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, P. R. China
- Tie-Lin Yang
  - Key Laboratory of Biomedical Information Engineering of Ministry of Education, and Institute of Molecular Genetics, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, P. R. China
- Shan-Shan Dong
  - Key Laboratory of Biomedical Information Engineering of Ministry of Education, and Institute of Molecular Genetics, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, P. R. China
- Han Yan
  - Key Laboratory of Biomedical Information Engineering of Ministry of Education, and Institute of Molecular Genetics, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, P. R. China
- Ruo-Han Hao
  - Key Laboratory of Biomedical Information Engineering of Ministry of Education, and Institute of Molecular Genetics, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, P. R. China
- Xiao-Feng Chen
  - Key Laboratory of Biomedical Information Engineering of Ministry of Education, and Institute of Molecular Genetics, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, P. R. China
- Jia-Bin Chen
  - Key Laboratory of Biomedical Information Engineering of Ministry of Education, and Institute of Molecular Genetics, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi, P. R. China
- Qing Tian
  - School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Jian Li
  - School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Hui Shen
  - School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana, United States of America
- Hong-Wen Deng
  - School of Public Health and Tropical Medicine, Tulane University, New Orleans, Louisiana, United States of America
|
45
|
Khera AV, Mehta NN. Single-cell transcriptomics: an emerging tool in the study of cardiometabolic disease. J Transl Med 2014; 12:312. [PMID: 25377125 PMCID: PMC4228185 DOI: 10.1186/s12967-014-0312-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/27/2014] [Accepted: 10/27/2014] [Indexed: 11/30/2022] Open
Affiliation(s)
- Amit V Khera
  - Cardiology Division, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Nehal N Mehta
  - Section of Inflammation and Cardiometabolic Diseases, National Heart, Lung and Blood Institute, Bethesda, MD, 20892, USA
|
46
|
León-Novelo LG, McIntyre LM, Fear JM, Graze RM. A flexible Bayesian method for detecting allelic imbalance in RNA-seq data. BMC Genomics 2014; 15:920. [PMID: 25339465 PMCID: PMC4230747 DOI: 10.1186/1471-2164-15-920] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 05/30/2014] [Accepted: 10/09/2014] [Indexed: 01/01/2023] Open
Abstract
Background: One method of identifying cis regulatory differences is to analyze allele-specific expression (ASE) and identify cases of allelic imbalance (AI). RNA-seq is the most common way to measure ASE and a binomial test is often applied to determine statistical significance of AI. This implicitly assumes that there is no bias in estimation of AI. However, bias has been found to result from multiple factors including: genome ambiguity, reference quality, the mapping algorithm, and biases in the sequencing process. Two alternative approaches have been developed to handle bias: adjusting for bias using a statistical model and filtering regions of the genome suspected of harboring bias. Existing statistical models which account for bias rely on information from DNA controls, which can be cost prohibitive for large intraspecific studies. In contrast, data filtering is inexpensive and straightforward, but necessarily involves sacrificing a portion of the data. Results: Here we propose a flexible Bayesian model for analysis of AI, which accounts for bias and can be implemented without DNA controls. In lieu of DNA controls, this Poisson-Gamma (PG) model uses an estimate of bias from simulations. The proposed model always has a lower type I error rate compared to the binomial test. Consistent with prior studies, bias dramatically affects the type I error rate. All of the tested models are sensitive to misspecification of bias. The closer the estimate of bias is to the true underlying bias, the lower the type I error rate. Correct estimates of bias result in a level alpha test. Conclusions: To improve the assessment of AI, some forms of systematic error (e.g., map bias) can be identified using simulation. The resulting estimates of bias can be used to correct for bias in the PG model, without data filtering. Other sources of bias (e.g., unidentified variant calls) can be easily captured by DNA controls, but are missed by common filtering approaches.
Consequently, as variant identification improves, the need for DNA controls will be reduced. Filtering does not significantly improve performance and is not recommended, as information is sacrificed without a measurable gain. The PG model developed here performs well when bias is known, or slightly misspecified. The model is flexible and can accommodate differences in experimental design and bias estimation.
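The binomial test mentioned in this abstract, and its sensitivity to bias, can be illustrated with a minimal Python sketch. This is not the authors' Poisson-Gamma model: it is an exact two-sided binomial test whose null reference-allele fraction can be moved away from 0.5 to reflect an assumed bias estimate. The function name, the parameter `null_ref_fraction`, and the read counts are hypothetical and chosen only for illustration.

```python
from math import comb


def allelic_imbalance_p_value(ref_reads: int, alt_reads: int,
                              null_ref_fraction: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    null_ref_fraction is the reference-allele fraction expected when there
    is no true imbalance; 0.5 assumes no bias (hypothetical parameter).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance

    def pmf(k: int) -> float:
        # Probability of observing k reference reads out of n under the null.
        return comb(n, k) * null_ref_fraction ** k * (1.0 - null_ref_fraction) ** (n - k)

    observed = pmf(ref_reads)
    # Sum the probabilities of every outcome that is no more likely than the
    # observed one; the small tolerance guards against floating-point ties.
    tail = sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1.0 + 1e-9))
    return min(1.0, tail)


# Illustrative counts: 60 reference reads and 40 alternative reads.
naive_p = allelic_imbalance_p_value(60, 40)
# Same counts, but assuming mapping favours the reference allele (0.55).
adjusted_p = allelic_imbalance_p_value(60, 40, null_ref_fraction=0.55)
print(f"naive p = {naive_p:.3f}, bias-adjusted p = {adjusted_p:.3f}")
```

With these counts the bias-adjusted p-value is larger than the naive one, which is one way to see why an unmodelled reference bias can inflate the type I error rate of the plain binomial test.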
Affiliation(s)
- Rita M Graze
  - Department of Biological Sciences, Auburn University, 101 Rouse Life Science Building, 36849 Auburn, AL, USA
|
47
|
Hu Y, Jia H, Wang Y, Cheng Y, Li Z. Sensitive quantification of messenger RNA with a real-time ligase chain reaction by using a ribonucleotide-modified DNA probe. Chem Commun (Camb) 2014; 50:13093-5. [DOI: 10.1039/c4cc05102e] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
|
48
|
Liu Z, Yang J, Xu H, Li C, Wang Z, Li Y, Dong X, Li Y. Comparing computational methods for identification of allele-specific expression based on next generation sequencing data. Genet Epidemiol 2014; 38:591-8. [PMID: 25183311 DOI: 10.1002/gepi.21846] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 12/16/2013] [Revised: 05/15/2014] [Accepted: 06/16/2014] [Indexed: 11/07/2022]
Abstract
Allele-specific expression (ASE) studies have wide-ranging implications for genome biology and medicine. Whole-transcriptome RNA sequencing (RNA-Seq) has emerged as a genome-wide tool for identifying ASE, but it suffers from mapping bias favoring reference alleles. Two categories of methods are currently used to reduce the effect of mapping bias on ASE identification: normalizing the RNA allelic ratio with the parallel genomic allelic ratio (pDNAar), and modifying the reference genome so that reads carrying either allele have the same chance of being mapped (mREF). We compared the sensitivity and specificity of both methods with simulated data and demonstrated that pDNAar, though ideally practical, was lower in sensitivity because of its lower mapping rate for reads carrying nonreference (alternative) alleles, whereas mREF achieved higher sensitivity and specificity owing to its efficiency in mapping reads carrying both alleles. Application of these two methods to real sequencing data showed that mREF was able to identify more ASE loci because of its higher mapping efficiency, and was able to correct some seemingly incorrect ASE loci identified by pDNAar that arose from its inefficiency in mapping reads carrying alternative alleles. Our study provides useful information for RNA sequencing data processing in the identification of ASE.
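The first strategy in this abstract, normalizing the RNA allelic ratio by the genomic allelic ratio at the same site, can be sketched in Python as follows. The abstract does not give the exact formula, so this sketch assumes the simplest form: the RNA reference-to-alternative read ratio divided by the DNA reference-to-alternative read ratio. The function name and the counts are hypothetical.

```python
def dna_normalized_allelic_ratio(rna_ref: int, rna_alt: int,
                                 dna_ref: int, dna_alt: int) -> float:
    """RNA ref/alt read ratio divided by the genomic DNA ref/alt read ratio.

    A value near 1 means the RNA imbalance is no larger than the imbalance
    already present in the DNA reads, which at a heterozygous site is
    expected to reflect mapping bias rather than allele-specific expression.
    (Assumed form of the normalization, not taken from the paper.)
    """
    if rna_ref < 0 or min(rna_alt, dna_ref, dna_alt) <= 0:
        raise ValueError("need rna_ref >= 0 and positive rna_alt, dna_ref, dna_alt")
    rna_ratio = rna_ref / rna_alt
    dna_ratio = dna_ref / dna_alt
    return rna_ratio / dna_ratio


# Illustrative counts: the RNA shows 60:40, and the DNA control shows the
# same 60:40 skew, so the normalized ratio gives no sign of ASE.
print(dna_normalized_allelic_ratio(60, 40, 60, 40))
# With an unbiased 50:50 DNA control the RNA skew is kept as is.
print(dna_normalized_allelic_ratio(60, 40, 50, 50))
```

The sketch also shows the limitation the abstract points to: the correction only rescales reads that mapped, so alternative-allele reads lost at the mapping step are not recovered.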
Affiliation(s)
- Zhi Liu
  - Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
  - University of Chinese Academy of Sciences, Beijing, P. R. China
|
49
|
Mayba O, Gilbert HN, Liu J, Haverty PM, Jhunjhunwala S, Jiang Z, Watanabe C, Zhang Z. MBASED: allele-specific expression detection in cancer tissues and cell lines. Genome Biol 2014; 15:405. [PMID: 25315065 PMCID: PMC4165366 DOI: 10.1186/s13059-014-0405-3] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Received: 03/31/2014] [Accepted: 08/07/2014] [Indexed: 12/15/2022] Open
Abstract
Allele-specific gene expression (ASE) is an important aspect of gene regulation. We developed a novel method, MBASED (meta-analysis based allele-specific expression detection), for ASE detection using RNA-seq data. It aggregates information across multiple single nucleotide variation loci to obtain a gene-level measure of ASE, even when prior phasing information is unavailable. MBASED is capable of one-sample and two-sample analyses and performs well in simulations. We applied MBASED to a panel of cancer cell lines and paired tumor-normal tissue samples, and observed extensive ASE in cancer, but not normal, samples, mainly driven by genomic copy number alterations.
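The idea of a gene-level measure that pools several variant sites without phase information can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the MBASED meta-analysis itself: it assumes that, at each site, the allele with more reads belongs to the same (major) haplotype, and then pools the counts. The function name and counts are hypothetical.

```python
from typing import Iterable, Tuple


def gene_level_major_allele_fraction(snv_counts: Iterable[Tuple[int, int]]) -> float:
    """Pool unphased per-site (ref_reads, alt_reads) counts into one
    gene-level major-allele fraction.

    Assumption for this sketch: the allele with more reads at each site is
    assigned to the major haplotype (a majority-vote pseudo-phasing).
    """
    major_reads = 0
    total_reads = 0
    for ref_reads, alt_reads in snv_counts:
        major_reads += max(ref_reads, alt_reads)
        total_reads += ref_reads + alt_reads
    if total_reads == 0:
        raise ValueError("no reads at any site")
    return major_reads / total_reads


# Two heterozygous sites in one gene: 30:10 and 5:15 reads.
print(gene_level_major_allele_fraction([(30, 10), (5, 15)]))  # 45 / 60 = 0.75
```

Because the larger count is always taken as the major allele, this pooled fraction can never fall below 0.5 and is biased upward when there is no true ASE, so a real analysis needs a null distribution that accounts for the selection step.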
Affiliation(s)
- Oleg Mayba
  - Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA 94080 USA
- Houston N Gilbert
  - Department of Biostatistics, Genentech Inc, South San Francisco, CA 94080 USA
- Jinfeng Liu
  - Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA 94080 USA
- Peter M Haverty
  - Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA 94080 USA
- Suchit Jhunjhunwala
  - Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA 94080 USA
- Zhaoshi Jiang
  - Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA 94080 USA
- Colin Watanabe
  - Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA 94080 USA
- Zemin Zhang
  - Department of Bioinformatics and Computational Biology, Genentech Inc, South San Francisco, CA 94080 USA
|
50
|
Kosuri S, Church GM. Large-scale de novo DNA synthesis: technologies and applications. Nat Methods 2014; 11:499-507. [PMID: 24781323 PMCID: PMC7098426 DOI: 10.1038/nmeth.2918] [Citation(s) in RCA: 477] [Impact Index Per Article: 47.7] [Received: 12/10/2013] [Accepted: 03/10/2014] [Indexed: 12/23/2022]
Abstract
For over 60 years, the synthetic production of new DNA sequences has helped researchers understand and engineer biology. Here we summarize methods and caveats for the de novo synthesis of DNA, with particular emphasis on recent technologies that allow for large-scale and low-cost production. In addition, we discuss emerging applications enabled by large-scale de novo DNA constructs, as well as the challenges and opportunities that lie ahead.
Affiliation(s)
- Sriram Kosuri
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California, USA
- George M Church
  - Wyss Institute for Biologically Inspired Engineering, Boston, Massachusetts, USA
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
|