1
Williams ZH, Imedio AD, Gaucherand L, Lee DC, Mostafa SM, Phelan JP, Coffin JM, Johnson WE. Recombinant origin and interspecies transmission of a HERV-K(HML-2)-related primate retrovirus with a novel RNA transport element. eLife 2024; 13:e80216. [PMID: 39037763 PMCID: PMC11379458 DOI: 10.7554/elife.80216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 07/20/2024] [Indexed: 07/23/2024] Open
Abstract
HERV-K(HML-2), the youngest clade of human endogenous retroviruses (HERVs), includes many intact or nearly intact proviruses, but no replication-competent HML-2 proviruses have been identified in humans. HML-2-related proviruses are present in other primates, including rhesus macaques, but the extent and timing of HML-2 activity in macaques remain unclear. We have identified 145 HML-2-like proviruses in rhesus macaques, including a clade of young, rhesus-specific insertions. Age estimates, intact open reading frames, and insertional polymorphism of these insertions are consistent with recent or ongoing infectious activity in macaques. 106 of the proviruses form a clade characterized by an ~750 bp sequence between env and the 3' long terminal repeat (LTR), derived from an ancient recombination with a HERV-K(HML-8)-related virus. This clade is found in Old World monkeys (OWM), but not great apes, suggesting it originated after the ape/OWM split. We identified similar proviruses in white-cheeked gibbons; the gibbon insertions cluster within the OWM recombinant clade, suggesting interspecies transmission from OWM to gibbons. The LTRs of the youngest proviruses have deletions in U3, which disrupt the Rec Response Element (RcRE), required for nuclear export of unspliced viral RNA. We show that the HML-8-derived region functions as a Rec-independent constitutive transport element (CTE), indicating the ancestral Rec-RcRE export system was replaced by a CTE mechanism.
Affiliation(s)
- Lea Gaucherand
  - Molecular Microbiology Program, Tufts University Graduate School of Biomedical Sciences, Boston, United States
- Derek C Lee
  - Department of Biology, Boston College, Boston, United States
- Salwa Mohd Mostafa
  - Department of Developmental, Molecular and Chemical Biology, Tufts University School of Medicine, Boston, United States
- James P Phelan
  - Molecular Microbiology Program, Tufts University Graduate School of Biomedical Sciences, Boston, United States
- John M Coffin
  - Molecular Microbiology Program, Tufts University Graduate School of Biomedical Sciences, Boston, United States
  - Department of Developmental, Molecular and Chemical Biology, Tufts University School of Medicine, Boston, United States
2
Kishi JY, Liu N, West ER, Sheng K, Jordanides JJ, Serrata M, Cepko CL, Saka SK, Yin P. Light-Seq: light-directed in situ barcoding of biomolecules in fixed cells and tissues for spatially indexed sequencing. Nat Methods 2022; 19:1393-1402. [PMID: 36216958 PMCID: PMC9636025 DOI: 10.1038/s41592-022-01604-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 12/21/2021] [Accepted: 08/10/2022] [Indexed: 11/21/2022]
Abstract
We present Light-Seq, an approach for multiplexed spatial indexing of intact biological samples using light-directed DNA barcoding in fixed cells and tissues followed by ex situ sequencing. Light-Seq combines spatially targeted, rapid photocrosslinking of DNA barcodes onto complementary DNAs in situ with a one-step DNA stitching reaction to create pooled, spatially indexed sequencing libraries. This light-directed barcoding enables in situ selection of multiple cell populations in intact fixed tissue samples for full-transcriptome sequencing based on location, morphology or protein stains, without cellular dissociation. Applying Light-Seq to mouse retinal sections, we recovered thousands of differentially enriched transcripts from three cellular layers and discovered biomarkers for a very rare neuronal subtype, dopaminergic amacrine cells, from only four to eight individual cells per section. Light-Seq provides an accessible workflow to combine in situ imaging and protein staining with next-generation sequencing of the same cells, leaving the sample intact for further analysis post-sequencing.
Affiliation(s)
- Jocelyn Y Kishi
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Ninning Liu
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Emma R West
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Kuanwei Sheng
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Jack J Jordanides
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Matthew Serrata
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Constance L Cepko
  - Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
  - Department of Ophthalmology, Harvard Medical School, Boston, MA, USA
- Sinem K Saka
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Peng Yin
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
3
Capturing Genetic Diversity and Selection Signatures of the Endangered Kosovar Balusha Sheep Breed. Genes (Basel) 2022; 13:genes13050866. [PMID: 35627251 PMCID: PMC9140571 DOI: 10.3390/genes13050866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 04/23/2022] [Accepted: 05/09/2022] [Indexed: 11/17/2022] Open
Abstract
There is a growing concern about the loss of animal genetic resources. The aim of this study was to analyze the genetic diversity and potential peculiarity of the endangered Kosovar sheep breed Balusha. For this purpose, a dataset consisting of medium-density SNP chip genotypes (39,879 SNPs) from 45 Balusha sheep was generated and compared with SNP chip genotypes from 29 individuals of a second Kosovar breed, Bardhoka. Publicly available SNP genotypes from 39 individuals of the relatively closely located sheep breeds Istrian Pramenka and Ruda were additionally included in the analyses. Analysis of heterozygosity, allelic richness and effective population size was used to assess the genetic diversity. Inbreeding was evaluated using two different methods (FIS, FROH). The standardized FST (di) and cross-population extended haplotype homozygosity (XPEHH) methods were used to detect signatures of selection. We observed the lowest heterozygosity (HO = 0.351) and effective population size (Ne5 = 25, Ne50 = 228) for the Balusha breed. The mean allelic richness levels (1.780–1.876) across all analyzed breeds were similar and also comparable with those in worldwide breeds. FROH estimates (0.023–0.077) were highest for the Balusha population, although evidence of decreased inbreeding was observed in FIS results for the Balusha breed. Two Gene Ontology (GO) terms were strongly enriched for Balusha, and involved genes belonging to the melanogenesis and T cell receptor signaling pathways, respectively. This could result from selection for the special coat color pattern of Balusha (black head) and resistance to certain infectious diseases. The analyzed diversity parameters highlight the urgency to preserve the local Kosovar Balusha sheep as it is clearly distinguished from other sheep of Southeastern Europe, has the lowest diversity level and may harbor valuable genetic variants, e.g., for resistance to infectious diseases.
4
Schmiedel D, Hezroni H, Hamburg A, Shulman Z. Brg1 Supports B Cell Proliferation and Germinal Center Formation Through Enhancer Activation. Front Immunol 2021; 12:705848. [PMID: 34539636 PMCID: PMC8440861 DOI: 10.3389/fimmu.2021.705848] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/06/2021] [Accepted: 08/11/2021] [Indexed: 12/31/2022] Open
Abstract
Activation and differentiation of B cells depend on extensive rewiring of gene expression networks through changes in chromatin structure and accessibility. The chromatin remodeling complex BAF with its catalytic subunit Brg1 was previously identified as an essential regulator of early B cell development; however, how Brg1 orchestrates gene expression during mature B cell activation is less clear. Here, we find that Brg1 is required for B cell proliferation and germinal center formation through selective interactions with enhancers. Brg1 recruitment to enhancers following B cell activation was associated with increased chromatin accessibility and transcriptional activation of their coupled promoters, thereby regulating the expression of cell cycle-associated genes. Accordingly, Brg1-deficient B cells were unable to mount germinal center reactions and support the formation of class-switched plasma cells. Our findings show that changes in B cell transcriptomes that support B cell proliferation and GC formation depend on enhancer activation by Brg1. Thus, the BAF complex plays a critical role during the onset of the humoral immune response.
Affiliation(s)
- Dominik Schmiedel
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Hadas Hezroni
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Amit Hamburg
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
- Ziv Shulman
  - Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
5
Sergeant MJ, Hughes JR, Hentges L, Lunter G, Downes DJ, Taylor S. Multi Locus View: an extensible web-based tool for the analysis of genomic data. Commun Biol 2021; 4:623. [PMID: 34035422 PMCID: PMC8149710 DOI: 10.1038/s42003-021-02097-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/14/2020] [Accepted: 03/31/2021] [Indexed: 01/10/2023] Open
Abstract
Tracking and understanding data quality, analysis and reproducibility are critical concerns in the biological sciences. This is especially true in genomics where next-generation sequencing (NGS)-based technologies such as ChIP-seq, RNA-seq and ATAC-seq are generating a flood of genome-scale data. However, such data are usually processed with automated tools and pipelines, generating tabular outputs and static visualisations. Interpretation is normally made at a high level without the ability to visualise the underlying data in detail. Conventional genome browsers are limited to browsing single locations and do not allow for interactions with the dataset as a whole. Multi Locus View (MLV), a web-based tool, has been developed to allow users to fluidly interact with genomics datasets at multiple scales. The user is able to browse the raw data, cluster, and combine the data with other analysis and annotate the data. User datasets can then be shared with other users or made public for quick assessment from the academic community. MLV is publicly available at https://mlv.molbiol.ox.ac.uk.
Affiliation(s)
- Martin J Sergeant
  - MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
- Jim R Hughes
  - MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
  - MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
- Lance Hentges
  - MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
- Gerton Lunter
  - MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
  - University Medical Centre Groningen, Department of Epidemiology, University of Groningen, Groningen, The Netherlands
- Damien J Downes
  - MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
- Stephen Taylor
  - MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
6
de Sousa MAP, de Athayde FRF, Maldonado MBC, de Lima AO, Fortes MRS, Lopes FL. Single nucleotide polymorphisms affect miRNA target prediction in bovine. PLoS One 2021; 16:e0249406. [PMID: 33882076 PMCID: PMC8059806 DOI: 10.1371/journal.pone.0249406] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/14/2020] [Accepted: 03/17/2021] [Indexed: 02/06/2023] Open
Abstract
Single nucleotide polymorphisms (SNPs) can have significant effects on phenotypic characteristics in cattle. MicroRNAs (miRNAs) are small, non-coding RNAs that act as post-transcriptional regulators by binding to target mRNAs. In the present study, we scanned ~56 million SNPs against 1,064 bovine miRNA sequences and analyzed, in silico, their possible effects on target binding prediction, primary miRNA formation, association with QTL regions and the evolutionary conservation for each SNP locus. Following target prediction, we show that 71.6% of miRNA predicted targets were altered as a consequence of SNPs located within the seed region of the mature miRNAs. Next, we identified variations in the Minimum Free Energy (MFE), which represents the capacity to alter molecule stability and, consequently, miRNA maturation. A total of 48.6% of the sequences analyzed showed values within those previously reported as sufficient to alter miRNA maturation. We have also found 131 SNPs in 46 miRNAs, with altered target prediction, occurring in QTL regions. Lastly, analysis of evolutionary conservation scores for each SNP locus suggested that they have a conserved biological function through the evolutionary process. Our results suggest that SNPs in microRNAs have the potential to affect bovine phenotypes and could be of great value for genetic improvement studies, as well as production.
Affiliation(s)
- Marco Antônio Perpétuo de Sousa
  - Department of Production and Animal Health, São Paulo State University (Unesp), School of Veterinary Medicine, Araçatuba, São Paulo, Brazil
- Flavia Regina Florêncio de Athayde
  - Department of Production and Animal Health, São Paulo State University (Unesp), School of Veterinary Medicine, Araçatuba, São Paulo, Brazil
- Andressa Oliveira de Lima
  - Department of Production and Animal Health, São Paulo State University (Unesp), School of Veterinary Medicine, Araçatuba, São Paulo, Brazil
- Marina Rufino S. Fortes
  - School of Chemistry and Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Flavia Lombardi Lopes
  - Department of Production and Animal Health, São Paulo State University (Unesp), School of Veterinary Medicine, Araçatuba, São Paulo, Brazil
7
Technological advances and computational approaches for alternative splicing analysis in single cells. Comput Struct Biotechnol J 2020; 18:332-343. [PMID: 32099593 PMCID: PMC7033300 DOI: 10.1016/j.csbj.2020.01.009] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/31/2019] [Accepted: 01/26/2020] [Indexed: 12/15/2022] Open
Abstract
Alternative splicing of RNAs generates isoform diversity, resulting in different proteins that are necessary for maintaining cellular function and identity. The discovery of alternative splicing has been revolutionized by next-generation transcriptomic sequencing mainly using bulk RNA-sequencing, which has unravelled RNA splicing and mis-splicing of normal cells under steady-state and stress conditions. Single-cell RNA-sequencing studies have focused on gene-level expression analysis and revealed gene expression signatures distinguishable between different cellular types. Single-cell alternative splicing is an emerging area of research with the promise to reveal transcriptomic dynamics invisible to bulk- and gene-level analysis. In this review, we will discuss the technological advances for single-cell alternative splicing analysis, computational strategies for isoform detection and quantitation in single cells, and current applications of single-cell alternative splicing analysis and its potential future contributions to personalized medicine.
8
Zhang Z, Lee JH, Ruan H, Ye Y, Krakowiak J, Hu Q, Xiang Y, Gong J, Zhou B, Wang L, Lin C, Diao L, Mills GB, Li W, Han L. Transcriptional landscape and clinical utility of enhancer RNAs for eRNA-targeted therapy in cancer. Nat Commun 2019; 10:4562. [PMID: 31594934 PMCID: PMC6783481 DOI: 10.1038/s41467-019-12543-5] [Citation(s) in RCA: 152] [Impact Index Per Article: 25.3] [Received: 11/28/2018] [Accepted: 09/16/2019] [Indexed: 12/19/2022] Open
Abstract
Enhancer RNA (eRNA) is a type of noncoding RNA transcribed from the enhancer. Although critical roles of eRNA in gene transcription control have been increasingly realized, the systemic landscape and potential function of eRNAs in cancer remain largely unexplored. Here, we report the integration of multi-omics and pharmacogenomics data across large-scale patient samples and cancer cell lines. We observe a cancer-/lineage-specificity of eRNAs, which may be largely driven by tissue-specific TFs. eRNAs are involved in multiple cancer signaling pathways through putatively regulating their target genes, including clinically actionable genes and immune checkpoints. They may also affect drug response by within-pathway or cross-pathway means. We characterize the oncogenic potential and therapeutic liability of one eRNA, NET1e, supporting the clinical feasibility of eRNA-targeted therapy. We identify a panel of clinically relevant eRNAs and develop a user-friendly data portal. Our study reveals the transcriptional landscape and clinical utility of eRNAs in cancer.
Affiliation(s)
- Zhao Zhang
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Joo-Hyung Lee
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Hang Ruan
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Youqiong Ye
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Joanna Krakowiak
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Qingsong Hu
  - Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Yu Xiang
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Jing Gong
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Bingying Zhou
  - State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100037, PR China
- Li Wang
  - State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100037, PR China
- Chunru Lin
  - Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Lixia Diao
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
- Gordon B Mills
  - Knight Cancer Institute, Oregon Health and Science University, Portland, OR, 97239, USA
- Wenbo Li
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
  - Center for Precision Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Leng Han
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
  - Center for Precision Health, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
9
Gorillas have been infected with the HERV-K (HML-2) endogenous retrovirus much more recently than humans and chimpanzees. Proc Natl Acad Sci U S A 2019; 116:1337-1346. [PMID: 30610173 DOI: 10.1073/pnas.1814203116] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 12/20/2022] Open
Abstract
Human endogenous retrovirus-K (HERV-K) human mouse mammary tumor virus-like 2 (HML-2) is the most recently active endogenous retrovirus group in humans, and the only group with human-specific proviruses. HML-2 expression is associated with cancer and other diseases, but extensive searches have failed to reveal any replication-competent proviruses in humans. However, HML-2 proviruses are found throughout the catarrhine primates, and it is possible that they continue to infect some species today. To investigate this possibility, we searched for gorilla-specific HML-2 elements using both in silico data mining and targeted deep-sequencing approaches. We identified 150 gorilla-specific integrations, including 31 2-LTR proviruses. Many of these proviruses have identical LTRs, and are insertionally polymorphic, consistent with very recent integration. One identified provirus has full-length ORFs for all genes, and thus could potentially be replication-competent. We suggest that gorillas may still harbor infectious HML-2 virus and could serve as a model for understanding retrovirus evolution and pathogenesis in humans.
10
Verma SS, Josyula N, Verma A, Zhang X, Veturi Y, Dewey FE, Hartzel DN, Lavage DR, Leader J, Ritchie MD, Pendergrass SA. Rare variants in drug target genes contributing to complex diseases, phenome-wide. Sci Rep 2018; 8:4624. [PMID: 29545597 PMCID: PMC5854600 DOI: 10.1038/s41598-018-22834-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 11/22/2017] [Accepted: 03/01/2018] [Indexed: 12/30/2022] Open
Abstract
The DrugBank database consists of ~800 genes that are well-characterized drug targets. This list of genes is a useful resource for association testing. For example, loss of function (LOF) genetic variation has the potential to mimic the effect of drugs, and high impact variation in these genes can impact downstream traits. Identifying novel associations between genetic variation in these genes and a range of diseases can also uncover new uses for the drugs that target these genes. Phenome Wide Association Studies (PheWAS) have been successful in identifying genetic associations across hundreds to thousands of diseases. We have conducted a novel gene-based PheWAS to test the effect of rare variants in DrugBank genes, evaluating associations between these genes and more than 500 quantitative and dichotomous phenotypes. We used whole exome sequencing data from 38,568 samples in the Geisinger MyCode Community Health Initiative. We evaluated the results of this study when binning rare variants using various filters based on potential functional impact. We identified multiple novel associations, and the majority of the significant associations were driven by functionally annotated variation. Overall, this study provides a sweeping exploration of rare variant associations within functionally relevant genes across a wide range of diagnoses.
Affiliation(s)
- Shefali Setia Verma
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Navya Josyula
  - Biomedical and Translational Informatics Institute, Geisinger, Danville, PA, 17221, USA
- Anurag Verma
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Xinyuan Zhang
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Yogasudha Veturi
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Dustin N Hartzel
  - Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
- Daniel R Lavage
  - Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
- Joe Leader
  - Biomedical and Translational Informatics Institute, Geisinger, Danville, PA, 17221, USA
  - Phenomic Analytics and Clinical Data Core, Geisinger, Danville, PA, USA
- Marylyn D Ritchie
  - Perelman School of Medicine, Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Sarah A Pendergrass
  - Biomedical and Translational Informatics Institute, Geisinger, Danville, PA, 17221, USA
11
Localization of Cdc7 Protein Kinase During DNA Replication in Saccharomyces cerevisiae. G3-GENES GENOMES GENETICS 2017; 7:3757-3774. [PMID: 28924058 PMCID: PMC5677158 DOI: 10.1534/g3.117.300223] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Abstract
DDK, a conserved serine-threonine protein kinase composed of a regulatory subunit, Dbf4, and a catalytic subunit, Cdc7, is essential for DNA replication initiation during S phase of the cell cycle through MCM2-7 helicase phosphorylation. The biological significance of DDK is well characterized, but the full mechanism of how DDK associates with substrates remains unclear. Cdc7 is bound to chromatin in the Saccharomyces cerevisiae genome throughout the cell cycle, but there is little empirical evidence as to specific Cdc7 binding locations. Using biochemical and genetic techniques, this study investigated the specific localization of Cdc7 on chromatin. The Calling Cards method, using Ty5 retrotransposons as a marker for DNA–protein binding, suggests Cdc7 kinase is preferentially bound to genomic DNA known to replicate early in S phase, including centromeres and origins of replication. We also discovered Cdc7 binding throughout the genome, which may be necessary to initiate other cellular processes, including meiotic recombination and translesion synthesis. A kinase-dead Cdc7 point mutation increases the Ty5 retrotransposon integration efficiency, and a 55-amino acid C-terminal truncation of Cdc7, unable to bind Dbf4, reduces Cdc7 binding, suggesting a requirement for Dbf4 to stabilize Cdc7 on chromatin during S phase. Chromatin immunoprecipitation demonstrates that Cdc7 binding near specific origins changes during S phase. Our results suggest a model where Cdc7 is loosely bound to chromatin during G1. At the G1/S transition, Cdc7 binding to chromatin is increased and stabilized, preferentially at sites that may become origins, in order to carry out a variety of cellular processes.
12
Abstract
The introduction of new standard formats, proBAM and proBed, improves the integration of genomics and proteomics information, thus aiding proteogenomics applications. These novel formats enable peptide spectrum matches (PSM) to be stored, inspected, and analyzed within the context of the genome. However, an easy-to-use and transparent tool to convert mass spectrometry identification files to these new formats is indispensable. proBAMconvert enables the conversion of common identification file formats (mzIdentML, mzTab, and pepXML) to proBAM/proBed using an intuitive interface. Furthermore, proBAMconvert enables information to be output both at the PSM and peptide levels and has a command line interface next to the graphical user interface. Detailed documentation and a completely worked-out tutorial are available at http://probam.biobix.be.
Affiliation(s)
- Volodimir Olexiouk
  - Lab of Bioinformatics and Computational Genomics (BioBix), Department of Mathematical Modelling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, 9000 Ghent, Belgium
- Gerben Menschaert
  - Lab of Bioinformatics and Computational Genomics (BioBix), Department of Mathematical Modelling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, 9000 Ghent, Belgium
13
Carrere S, Gouzy J. myGenomeBrowser: building and sharing your own genome browser. Bioinformatics 2017; 33:1255-1257. [PMID: 28011789 PMCID: PMC5408841 DOI: 10.1093/bioinformatics/btw800] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/05/2016] [Accepted: 12/13/2016] [Indexed: 01/08/2023] Open
Abstract
myGenomeBrowser is a web-based environment that provides biologists with a way to build, query and share their genome browsers. This tool, which builds on JBrowse, is designed to give users more autonomy while simplifying and minimizing intervention from system administrators. We have extended genome browser basic features to allow users to query, analyze and share their data. Availability and implementation: myGenomeBrowser is freely available at https://bbric-pipelines.toulouse.inra.fr/myGenomeBrowser and includes tutorial screencasts. Source code and installation instructions can be found at https://framagit.org/BBRIC/myGenomeBrowser. myGenomeBrowser is open-source and mainly implemented in Perl, JavaScript, Apache and Docker.
Affiliation(s)
- Sébastien Carrere
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
- To whom correspondence should be addressed.
- Jérôme Gouzy
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
|
14
|
Strategies to identify natural antisense transcripts. Biochimie 2017; 132:131-151. [DOI: 10.1016/j.biochi.2016.11.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 06/21/2016] [Accepted: 11/24/2016] [Indexed: 12/15/2022]
|
15
|
Taher L, Narlikar L, Ovcharenko I. Identification and computational analysis of gene regulatory elements. Cold Spring Harb Protoc 2015; 2015:pdb.top083642. [PMID: 25561628 PMCID: PMC5885252 DOI: 10.1101/pdb.top083642] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 06/04/2023]
Abstract
Over the last two decades, advances in experimental and computational technologies have greatly facilitated genomic research. Next-generation sequencing technologies have made de novo sequencing of large genomes affordable, and powerful computational approaches have enabled accurate annotations of genomic DNA sequences. Charting functional regions in genomes must account for not only the coding sequences, but also noncoding RNAs, repetitive elements, chromatin states, epigenetic modifications, and gene regulatory elements. A mix of comparative genomics, high-throughput biological experiments, and machine learning approaches has played a major role in this truly global effort. Here we describe some of these approaches and provide an account of our current understanding of the complex landscape of the human genome. We also present overviews of different publicly available, large-scale experimental data sets and computational tools, which we hope will prove beneficial for researchers working with large and complex genomes.
Affiliation(s)
- Leila Taher
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, 18051 Rostock, Germany
- Leelavati Narlikar
- Chemical Engineering and Process Development Division, National Chemical Laboratory, CSIR, Pune 411008, India
- Ivan Ovcharenko
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894
|
16
|
Verma SS, de Andrade M, Tromp G, Kuivaniemi H, Pugh E, Namjou-Khales B, Mukherjee S, Jarvik GP, Kottyan LC, Burt A, Bradford Y, Armstrong GD, Derr K, Crawford DC, Haines JL, Li R, Crosslin D, Ritchie MD. Imputation and quality control steps for combining multiple genome-wide datasets. Front Genet 2014; 5:370. [PMID: 25566314 PMCID: PMC4263197 DOI: 10.3389/fgene.2014.00370] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 04/02/2014] [Accepted: 10/03/2014] [Indexed: 12/16/2022] Open
Abstract
The electronic MEdical Records and GEnomics (eMERGE) network brings together DNA biobanks linked to electronic health records (EHRs) from multiple institutions. Approximately 51,000 DNA samples from distinct individuals have been genotyped using genome-wide SNP arrays across the nine sites of the network. The eMERGE Coordinating Center and the Genomics Workgroup developed a pipeline to impute and merge genomic data across the different SNP arrays to maximize sample size and power to detect associations with a variety of clinical endpoints. The 1000 Genomes cosmopolitan reference panel was used for imputation. Imputation results were evaluated using the following metrics: accuracy of imputation, allelic R² (estimated correlation between the imputed and true genotypes), and the relationship between allelic R² and minor allele frequency. Computation time and memory resources required by two different software packages (BEAGLE and IMPUTE2) were also evaluated. A number of challenges were encountered due to the complexity of using two different imputation software packages, multiple ancestral populations, and many different genotyping platforms. We present lessons learned and describe the pipeline implemented here to impute and merge genomic data sets. The eMERGE imputed dataset will serve as a valuable resource for discovery, leveraging the clinical data that can be mined from the EHR.
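To make the evaluation metrics in the abstract above concrete, the following is a minimal sketch of an empirical allelic R² (squared Pearson correlation between imputed dosages and known genotypes) alongside minor allele frequency. It is an illustration only, not the eMERGE pipeline: BEAGLE estimates allelic R² without access to the true genotypes, whereas this sketch assumes the truth is available (for example, from masked genotypes). The function names `allelic_r2` and `minor_allele_frequency` and the 0/1/2 genotype coding are assumptions introduced here.

```python
from statistics import mean


def allelic_r2(imputed, truth):
    """Empirical squared Pearson correlation between imputed allele dosages
    and true genotypes (coded 0, 1, 2). Hypothetical helper, not BEAGLE's estimator."""
    if len(imputed) != len(truth) or len(truth) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_i, mean_t = mean(imputed), mean(truth)
    cov = sum((a - mean_i) * (b - mean_t) for a, b in zip(imputed, truth))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_t = sum((b - mean_t) ** 2 for b in truth)
    if var_i == 0 or var_t == 0:
        # Correlation is undefined for a monomorphic variant.
        return float("nan")
    return (cov * cov) / (var_i * var_t)


def minor_allele_frequency(genotypes):
    """Minor allele frequency from diploid genotypes coded as alternate-allele counts."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)
```

Computing both quantities per variant and plotting one against the other is one way to inspect the relationship between imputation quality and allele frequency that the abstract mentions.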
Affiliation(s)
- Shefali S Verma
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, The Pennsylvania State University Pennsylvania, PA, USA
- Mariza de Andrade
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic Rochester, MN, USA
- Gerard Tromp
- The Sigfried and Janet Weis Center for Research, Geisinger Health System Danville, PA, USA
- Helena Kuivaniemi
- The Sigfried and Janet Weis Center for Research, Geisinger Health System Danville, PA, USA
- Elizabeth Pugh
- Center for Inherited Disease Research, Johns Hopkins University Baltimore, MD, USA
- Gail P Jarvik
- Department of Medicine, University of Washington Seattle, WA, USA
- Leah C Kottyan
- Cincinnati Children's Hospital Medical Center Cincinnati, OH, USA
- Amber Burt
- Department of Medicine, University of Washington Seattle, WA, USA
- Yuki Bradford
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, The Pennsylvania State University Pennsylvania, PA, USA
- Gretta D Armstrong
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, The Pennsylvania State University Pennsylvania, PA, USA
- Kimberly Derr
- The Sigfried and Janet Weis Center for Research, Geisinger Health System Danville, PA, USA
- Dana C Crawford
- Center for Human Genetics Research, Vanderbilt University Nashville, TN, USA; Department of Epidemiology and Biostatistics, Case Western University Cleveland, OH, USA
- Jonathan L Haines
- Department of Epidemiology and Biostatistics, Case Western University Cleveland, OH, USA
- Rongling Li
- Division of Genomic Medicine, National Human Genome Research Institute Bethesda, MD, USA
- David Crosslin
- Department of Medicine, University of Washington Seattle, WA, USA
- Marylyn D Ritchie
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, The Pennsylvania State University Pennsylvania, PA, USA
|
17
|
Brewer MH, Ma KH, Beecham GW, Gopinath C, Baas F, Choi BO, Reilly MM, Shy ME, Züchner S, Svaren J, Antonellis A. Haplotype-specific modulation of a SOX10/CREB response element at the Charcot-Marie-Tooth disease type 4C locus SH3TC2. Hum Mol Genet 2014; 23:5171-87. [PMID: 24833716 PMCID: PMC4168306 DOI: 10.1093/hmg/ddu240] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 03/12/2014] [Revised: 05/01/2014] [Accepted: 05/12/2014] [Indexed: 12/22/2022] Open
Abstract
Loss-of-function mutations in the Src homology 3 (SH3) domain and tetratricopeptide repeats 2 (SH3TC2) gene cause autosomal recessive demyelinating Charcot-Marie-Tooth neuropathy. The SH3TC2 protein has been implicated in promyelination signaling through axonal neuregulin-1 and the ERBB2 Schwann cell receptor. However, little is known about the transcriptional regulation of the SH3TC2 gene. We performed computational and functional analyses that revealed two cis-acting regulatory elements at SH3TC2: one at the promoter and one ∼150 kb downstream of the transcription start site. Both elements direct reporter gene expression in Schwann cells and are responsive to the transcription factor SOX10, which is essential for peripheral nervous system myelination. The downstream enhancer harbors a single-nucleotide polymorphism (SNP) that causes an ∼80% reduction in enhancer activity. The SNP resides directly within a predicted binding site for the transcription factor cAMP response element binding protein (CREB), and we demonstrate that this regulatory element binds to CREB and is activated by CREB expression. Finally, forskolin induces Sh3tc2 expression in rat primary Schwann cells, indicating that SH3TC2 is a CREB target gene. These findings prompted us to determine if SNP genotypes at SH3TC2 are associated with differential phenotypes in the most common demyelinating peripheral neuropathy, CMT1A. Interestingly, this revealed several associations between SNP alleles and disease severity. In summary, our data indicate that SH3TC2 is regulated by the transcription factors CREB and SOX10, define a regulatory SNP at this disease-associated locus and reveal SH3TC2 as a candidate modifier locus of CMT disease phenotypes.
Affiliation(s)
- Ki Hwan Ma
- Cellular and Molecular Pathology (CMP) Program
- Gary W Beecham
- Dr John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- Chetna Gopinath
- Cellular and Molecular Biology Program, University of Michigan Medical School, Ann Arbor, MI, USA
- Frank Baas
- Department of Genome Analysis, Academic Medical Centre, Amsterdam, The Netherlands
- Byung-Ok Choi
- Department of Neurology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Gangnam-Gu, Seoul, Korea
- Mary M Reilly
- MRC Centre for Neuromuscular Diseases, UCL Institute of Neurology, Queen Square, London, UK
- Michael E Shy
- Department of Neurology, Department of Pediatrics and Department of Physiology, Carver College of Medicine, University of Iowa, Iowa City, IA, USA
- Stephan Züchner
- Dr John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA
- John Svaren
- Waisman Center and Department of Comparative Biosciences, University of Wisconsin-Madison, Madison, WI, USA
- Anthony Antonellis
- Department of Human Genetics, Department of Neurology and Cellular and Molecular Biology Program, University of Michigan Medical School, Ann Arbor, MI, USA
|
18
|
Li Y, Zheng M, Lau YFC. The sex-determining factors SRY and SOX9 regulate similar target genes and promote testis cord formation during testicular differentiation. Cell Rep 2014; 8:723-33. [PMID: 25088423 DOI: 10.1016/j.celrep.2014.06.055] [Citation(s) in RCA: 93] [Impact Index Per Article: 8.5] [Received: 01/20/2014] [Revised: 05/12/2014] [Accepted: 06/25/2014] [Indexed: 01/07/2023] Open
Abstract
Male sex determination is mediated sequentially by sex-determining region Y (SRY) and related SRY-box 9 (SOX9) transcription factors. To understand the gene regulatory hierarchy for SRY and SOX9, a series of chromatin immunoprecipitation and whole-genome promoter tiling microarray (ChIP-Chip) experiments were conducted with mouse gonadal cells at the time of sex determination. SRY and SOX9 bind to the promoters of many common targets involved in testis differentiation and regulate their expression in Sertoli cells. SRY binds to various ovarian differentiation genes and represses their activation through WNT/β-catenin signaling. Sertoli cell-Sertoli cell junction signaling, important for testis cord formation, is the top canonical pathway among the SRY and SOX9 targets. Hence, SRY determines Sertoli cell fate by repressing ovarian and activating testicular differentiation genes, promotes early Sertoli cells to form testis cord, and then passes on its functions to SOX9, which regulates common targets and activates its own gene regulatory program, beyond SRY actions, in sex determination.
Affiliation(s)
- Yunmin Li
- Laboratory of Cell and Developmental Genetics, Department of Medicine, VA Medical Center, University of California, San Francisco, San Francisco, CA 94121, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
- Ming Zheng
- Department of Anesthesia, Stanford University School of Medicine, Palo Alto, CA 94305, USA
- Yun-Fai Chris Lau
- Laboratory of Cell and Developmental Genetics, Department of Medicine, VA Medical Center, University of California, San Francisco, San Francisco, CA 94121, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
|
19
|
Sokol M, Wabl M, Ruiz IR, Pedersen FS. Novel principles of gamma-retroviral insertional transcription activation in murine leukemia virus-induced end-stage tumors. Retrovirology 2014; 11:36. [PMID: 24886479 PMCID: PMC4098794 DOI: 10.1186/1742-4690-11-36] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 09/13/2013] [Accepted: 04/28/2014] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Insertional mutagenesis screens of retrovirus-induced mouse tumors have proven valuable in human cancer research and for understanding adverse effects of retroviral-based gene therapies. In previous studies, the assignment of mouse genes to individual retroviral integration sites has been based on close proximity and expression patterns of annotated genes at target positions in the genome. We here employed next-generation RNA sequencing to map retroviral-mouse chimeric junctions genome-wide, and to identify local patterns of transcription activation in T-lymphomas induced by the murine leukemia gamma-retrovirus SL3-3. Moreover, to determine epigenetic integration preferences underlying long-range gene activation by retroviruses, the colocalization propensity with common epigenetic enhancer markers (H3K4Me1 and H3K27Ac) of 6,117 integrations derived from end-stage tumors of more than 2,000 mice was examined. RESULTS We detected several novel mechanisms of retroviral insertional mutagenesis: bidirectional activation of mouse transcripts on opposite sides of a provirus, including transcription of unannotated mouse sequence; sense/antisense-type activation of genes located on opposite DNA strands; tandem-type activation of distal genes that are positioned adjacently on the same DNA strand; activation of genes that are not the direct integration targets; and combination-type insertional mutagenesis, in which enhancer activation, alternative chimeric splicing and retroviral promoter insertion are induced by a single retrovirus. We also show that, irrespective of the distance to transcription start sites, the vast majority of retroviruses in end-stage tumors colocalize with H3K4Me1 and H3K27Ac-enriched regions in murine lymphoid tissues. CONCLUSIONS We expose novel retrovirus-induced host transcription activation patterns that reach beyond a single and nearest annotated gene target. Awareness of this previously undescribed layer of complexity may prove important for elucidation of adverse effects in retroviral-based gene therapies. We also show that wild-type gamma-retroviruses are frequently positioned at enhancers, suggesting that integration into regulatory regions is specific and also subject to positive selection for sustaining long-range gene activation in end-stage tumors. Altogether, this study should prove useful for extrapolating adverse outcomes of retroviral vector therapies, and for understanding fundamental cellular regulatory principles and retroviral biology.
Affiliation(s)
- Martin Sokol
- Department of Molecular Biology and Genetics, Aarhus University, DK-8000 Aarhus, Denmark
- Matthias Wabl
- Department of Microbiology and Immunology, University of California, San Francisco, CA 94143, USA
- Irene Rius Ruiz
- Department of Molecular Biology and Genetics, Aarhus University, DK-8000 Aarhus, Denmark
- Finn Skou Pedersen
- Department of Molecular Biology and Genetics, Aarhus University, DK-8000 Aarhus, Denmark
|
20
|
Integrative ChIP-seq/microarray analysis identifies a CTNNB1 target signature enriched in intestinal stem cells and colon cancer. PLoS One 2014; 9:e92317. [PMID: 24651522 PMCID: PMC3961325 DOI: 10.1371/journal.pone.0092317] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 10/30/2013] [Accepted: 02/20/2014] [Indexed: 11/23/2022] Open
Abstract
Background Deregulation of canonical Wnt/CTNNB1 (beta-catenin) pathway is one of the earliest events in the pathogenesis of colon cancer. Mutations in APC or CTNNB1 are highly frequent in colon cancer and cause aberrant stabilization of CTNNB1, which activates the transcription of Wnt target genes by binding to chromatin via the TCF/LEF transcription factors. Here we report an integrative analysis of genome-wide chromatin occupancy of CTNNB1 by chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-seq) and gene expression profiling by microarray analysis upon RNAi-mediated knockdown of CTNNB1 in colon cancer cells. Results We observed 3629 CTNNB1 binding peaks across the genome and a significant correlation between CTNNB1 binding and knockdown-induced gene expression change. Our integrative analysis led to the discovery of a direct Wnt target signature composed of 162 genes. Gene ontology analysis of this signature revealed a significant enrichment of Wnt pathway genes, suggesting multiple feedback regulations of the pathway. We provide evidence that this gene signature partially overlaps with the Lgr5+ intestinal stem cell signature, and is significantly enriched in normal intestinal stem cells as well as in clinical colorectal cancer samples. Interestingly, while the expression of the CTNNB1 target gene set does not correlate with survival, elevated expression of negative feedback regulators within the signature predicts better prognosis. Conclusion Our data provide a genome-wide view of chromatin occupancy and gene regulation of Wnt/CTNNB1 signaling in colon cancer cells.
|
21
|
Luo H, Sun S, Li P, Bu D, Cao H, Zhao Y. Comprehensive characterization of 10,571 mouse large intergenic noncoding RNAs from whole transcriptome sequencing. PLoS One 2013; 8:e70835. [PMID: 23951020 PMCID: PMC3741367 DOI: 10.1371/journal.pone.0070835] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 03/21/2013] [Accepted: 06/22/2013] [Indexed: 11/18/2022] Open
Abstract
Large intergenic noncoding RNAs (lincRNAs) have been recognized in recent years to constitute a significant portion of the mammalian transcriptome, yet their biological functions remain largely elusive. This is partly due to an incomplete annotation of tissue-specific lincRNAs in essential model organisms, particularly in mice, which has hindered the genetic annotation and functional characterization of these novel transcripts. In this report, we performed ab initio assembly of 1.9 billion tissue-specific RNA-sequencing reads across six tissue types, and identified 3,965 novel expressed lincRNAs in mice. Combining these with 6,606 documented lincRNAs, we established a comprehensive catalog of 10,571 transcribed lincRNAs. We then systemically analyzed all mouse lincRNAs to reveal that some of them are evolutionally conserved and that they exhibit striking tissue-specific expression patterns. We also discovered that mouse lincRNAs carry unique genomic signatures, and that their expression level is correlated with that of neighboring protein-coding transcripts. Finally, we predicted that a large portion of tissue-specific lincRNAs are functionally associated with essential biological processes including the cell cycle and cell development, and that they could play a key role in regulating tissue development and functionality. Our analyses provide a framework for continued discovery and annotation of tissue-specific lincRNAs in model organisms, and our transcribed mouse lincRNA catalog will serve as a roadmap for functional analyses of lincRNAs in genetic mouse models.
Affiliation(s)
- Haitao Luo
- Bioinformatics Research Group, Advanced Computing Research Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Silong Sun
- Bioinformatics Research Group, Advanced Computing Research Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Ping Li
- National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Dechao Bu
- Bioinformatics Research Group, Advanced Computing Research Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- Haiming Cao
- National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- * E-mail: (YZ); (HC)
- Yi Zhao
- Bioinformatics Research Group, Advanced Computing Research Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- * E-mail: (YZ); (HC)
|
22
|
Cellular source and mechanisms of high transcriptome complexity in the mammalian testis. Cell Rep 2013; 3:2179-90. [PMID: 23791531 DOI: 10.1016/j.celrep.2013.05.031] [Citation(s) in RCA: 431] [Impact Index Per Article: 35.9] [Received: 12/19/2012] [Revised: 04/17/2013] [Accepted: 05/21/2013] [Indexed: 01/01/2023] Open
Abstract
Understanding the extent of genomic transcription and its functional relevance is a central goal in genomics research. However, detailed genome-wide investigations of transcriptome complexity in major mammalian organs have been scarce. Here, using extensive RNA-seq data, we show that transcription of the genome is substantially more widespread in the testis than in other organs across representative mammals. Furthermore, we reveal that meiotic spermatocytes and especially postmeiotic round spermatids have remarkably diverse transcriptomes, which explains the high transcriptome complexity of the testis as a whole. The widespread transcriptional activity in spermatocytes and spermatids encompasses protein-coding and long noncoding RNA genes but also poorly conserved intergenic sequences, suggesting that it may not be of immediate functional relevance. Rather, our analyses of genome-wide epigenetic data suggest that this prevalent transcription, which most likely promoted the birth of new genes during evolution, is facilitated by an overall permissive chromatin in these germ cells that results from extensive chromatin remodeling.
|
23
|
Transcriptome characterization by RNA-Seq reveals the involvement of the complement components in noise-traumatized rat cochleae. Neuroscience 2013; 248:1-16. [PMID: 23727008 DOI: 10.1016/j.neuroscience.2013.05.038] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 12/04/2012] [Revised: 05/16/2013] [Accepted: 05/21/2013] [Indexed: 12/12/2022]
Abstract
Acoustic trauma, a leading cause of sensorineural hearing loss in adults, induces a complex degenerative process in the cochlea. Although previous investigations have identified multiple stress pathways, a comprehensive analysis of cochlear responses to acoustic injury is still lacking. In the current study, we used the next-generation RNA-sequencing (RNA-Seq) technique to sequence the whole transcriptome of the normal and noise-traumatized cochlear sensory epithelia (CSE). CSE tissues were collected from rat inner ears 1 d after the rats were exposed to a 120-dB (sound pressure level) noise for 2 h. The RNA-Seq generated over 176 million sequence reads for the normal CSE and over 164 million reads for the noise-traumatized CSE. Alignment of these sequences with the rat Rn4 genome revealed the expression of over 17,000 gene transcripts in the CSE, over 2,000 of which were exclusively expressed in either the normal or noise-traumatized CSE. Seventy-eight gene transcripts were differentially expressed (70 upregulated and 8 downregulated) after acoustic trauma. Many of the differentially expressed genes are related to the innate immune system. Further expression analyses using quantitative real-time PCR confirmed the constitutive expression of multiple complement genes in the normal organ of Corti and the changes in the expression levels of the complement factor I (Cfi) and complement component 1, s subcomponent (C1s) after acoustic trauma. Moreover, protein expression analysis revealed strong expression of Cfi and C1s proteins in the organ of Corti. Importantly, these proteins exhibited expression changes following acoustic trauma. Collectively, the results of the current investigation suggest the involvement of the complement components in cochlear responses to acoustic trauma.
|
24
|
Feldmann R, Fischer C, Kodelja V, Behrens S, Haas S, Vingron M, Timmermann B, Geikowski A, Sauer S. Genome-wide analysis of LXRα activation reveals new transcriptional networks in human atherosclerotic foam cells. Nucleic Acids Res 2013; 41:3518-31. [PMID: 23393188 PMCID: PMC3616743 DOI: 10.1093/nar/gkt034] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 01/10/2023] Open
Abstract
Increased physiological levels of oxysterols are major risk factors for developing atherosclerosis and cardiovascular disease. Lipid-loaded macrophages, termed foam cells, are important during the early development of atherosclerotic plaques. To pursue the hypothesis that ligand-based modulation of the nuclear receptor LXRα is crucial for cell homeostasis during atherosclerotic processes, we analysed genome-wide the action of LXRα in foam cells and macrophages. By integrating chromatin immunoprecipitation-sequencing (ChIP-seq) and gene expression profile analyses, we generated a highly stringent set of 186 LXRα target genes. Treatment with the nanomolar-binding ligand T0901317 and subsequent auto-regulatory LXRα activation resulted in sequence-dependent sharpening of the genome-binding patterns of LXRα. LXRα-binding loci that correlated with differential gene expression revealed 32 novel target genes with potential beneficial effects, which in part explained the implications of disease-associated genetic variation data. These observations identified highly integrated LXRα ligand-dependent transcriptional networks, including the APOE/C1/C4/C2-gene cluster, which contribute to the reversal of cholesterol efflux and the dampening of inflammation processes in foam cells to prevent atherogenesis.
Affiliation(s)
- Radmila Feldmann
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
|
25
|
Abstract
Mis-regulation of gene expression due to epigenetic abnormalities has been linked with complex genetic disorders, psychiatric illness, and cancer. In addition, the dynamic epigenetic changes that occur in pluripotent stem cells are believed to impact regulatory networks essential for proper lineage development. Chromatin immunoprecipitation (ChIP) is a technique used to enrich genomic fragments using antibodies against specific chromatin modifications, such as DNA-binding proteins or modified histones. Until recently, many ChIP protocols required large numbers of cells for each immunoprecipitation. This severely limited analysis of rare cell populations or post-mitotic, differentiated cell lines. Here, we describe a low cell number ChIP protocol with next generation sequencing and analysis that has the potential to uncover novel epigenetic regulatory pathways that were previously difficult or impossible to obtain.
|
26
|
Jalali S, Jayaraj GG, Scaria V. Integrative transcriptome analysis suggest processing of a subset of long non-coding RNAs to small RNAs. Biol Direct 2012; 7:25. [PMID: 22871084 PMCID: PMC3477000 DOI: 10.1186/1745-6150-7-25] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 12/15/2011] [Accepted: 07/13/2012] [Indexed: 12/22/2022] Open
Abstract
Background The availability of sequencing technology has enabled understanding of transcriptomes through genome-wide approaches including RNA-sequencing. Contrary to the previous assumption that large tracts of the eukaryotic genomes are not transcriptionally active, recent evidence from transcriptome sequencing approaches has revealed pervasive transcription in many genomes of higher eukaryotes. Many of these loci encode transcripts that have no obvious protein-coding potential and are designated as non-coding RNA (ncRNA). Non-coding RNAs are classified empirically as small and long non-coding RNAs based on the size of the functional RNAs. Each of these classes is further classified into functional subclasses. Although microRNAs (miRNA), one of the major subclasses of ncRNAs, have been extensively studied for their roles in regulation of gene expression and involvement in a large number of patho-physiological processes, the functions of a large proportion of long non-coding RNAs (lncRNA) still remain elusive. We hypothesized that some lncRNAs could potentially be processed to small RNA and thus could have a dual regulatory output. Results Integration of large-scale independent experimental datasets in the public domain revealed that certain well-studied lncRNAs harbor small RNA clusters. Expression analysis of the small RNA clusters in different tissue and cell types reveals that they are differentially regulated, suggesting a regulated biogenesis mechanism. Conclusions Our analysis suggests the existence of a potentially novel pathway for lncRNA processing into small RNAs. Expression analysis further suggests that this pathway is regulated. We argue that this evidence supports our hypothesis, though limitations of the datasets and analysis cannot completely rule out alternate possibilities. Further in-depth experimental verification of the observation could potentially reveal a novel pathway for biogenesis.
Reviewers This article was reviewed by Dr Rory Johnson (nominated by Fyodor Kondrashov), Dr Raya Khanin (nominated by Dr Yuriy Gusev) and Prof Neil Smalheiser. For full reviews, please go to the Reviewer’s comment section.
Affiliation(s)
- Saakshi Jalali
- GN Ramachandran Knowledge Center for Genome Informatics, CSIR Institute of Genomics and Integrative Biology, Mall Road, Delhi 110007, India
|
27
|
Stropp T, McPhillips T, Ludäscher B, Bieda M. Workflows for microarray data processing in the Kepler environment. BMC Bioinformatics 2012; 13:102. [PMID: 22594911 PMCID: PMC3431220 DOI: 10.1186/1471-2105-13-102] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 12/06/2011] [Accepted: 03/08/2012] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. RESULTS We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. 
These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R/BioConductor scripting approaches to pipeline design. Finally, we suggest that microarray data processing task workflows may provide a basis for future example-based comparison of different workflow systems. CONCLUSIONS We provide a set of tools and complete workflows for microarray data analysis in the Kepler environment, which has the advantages of offering graphical, clear display of conceptual steps and parameters and the ability to easily integrate other resources such as remote data and web services.
Affiliation(s)
- Thomas Stropp
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, AB, Canada
|
28
|
Abstract
Myostatin (Mstn) is a secreted growth factor that negatively regulates muscle mass and is therefore a potential pharmacological target for the treatment of muscle wasting disorders such as Duchenne muscular dystrophy. Here we describe a novel Mstn blockade approach in which small interfering RNAs (siRNAs) complementary to a promoter-associated transcript induce transcriptional gene silencing (TGS) in two differentiated mouse muscle cell lines. Silencing is sensitive to treatment with the histone deacetylase inhibitor trichostatin A, and the silent state chromatin mark H3K9me2 is enriched at the Mstn promoter following siRNA transfection, suggesting epigenetic remodeling underlies the silencing effect. These observations suggest that long-term epigenetic silencing may be feasible for Mstn and that TGS is a promising novel therapeutic strategy for the treatment of muscle wasting disorders.
|
29
|
Young MA, Larson DE, Sun CW, George DR, Ding L, Miller CA, Lin L, Pawlik KM, Chen K, Fan X, Schmidt H, Kalicki-Veizer J, Cook LL, Swift GW, Demeter RT, Wendl MC, Sands MS, Mardis ER, Wilson RK, Townes TM, Ley TJ. Background mutations in parental cells account for most of the genetic heterogeneity of induced pluripotent stem cells. Cell Stem Cell 2012; 10:570-82. [PMID: 22542160 DOI: 10.1016/j.stem.2012.03.002] [Citation(s) in RCA: 160] [Impact Index Per Article: 12.3] [Received: 11/14/2011] [Revised: 01/31/2012] [Accepted: 03/04/2012] [Indexed: 01/19/2023]
Abstract
To assess the genetic consequences of induced pluripotent stem cell (iPSC) reprogramming, we sequenced the genomes of ten murine iPSC clones derived from three independent reprogramming experiments, and compared them to their parental cell genomes. We detected hundreds of single nucleotide variants (SNVs) in every clone, with an average of 11 in coding regions. In two experiments, all SNVs were unique for each clone and did not cluster in pathways, but in the third, all four iPSC clones contained 157 shared genetic variants, which could also be detected in rare cells (<1 in 500) within the parental MEF pool. These data suggest that most of the genetic variation in iPSC clones is not caused by reprogramming per se, but is rather a consequence of cloning individual cells, which "captures" their mutational history. These findings have implications for the development and therapeutic use of cells that are reprogrammed by any method.
Affiliation(s)
- Margaret A Young
- Department of Internal Medicine, Division of Oncology, Section of Stem Cell Biology, Washington University, St Louis, MO 63110, USA
|
30
|
Kotorashvili A, Ramnauth A, Liu C, Lin J, Ye K, Kim R, Hazan R, Rohan T, Fineberg S, Loudig O. Effective DNA/RNA co-extraction for analysis of microRNAs, mRNAs, and genomic DNA from formalin-fixed paraffin-embedded specimens. PLoS One 2012; 7:e34683. [PMID: 22514653 PMCID: PMC3326040 DOI: 10.1371/journal.pone.0034683] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Received: 12/22/2011] [Accepted: 03/08/2012] [Indexed: 01/03/2023] Open
Abstract
Background Retrospective studies of archived human specimens, with known clinical follow-up, are used to identify predictive and prognostic molecular markers of disease. Due to biochemical differences, however, formalin-fixed paraffin-embedded (FFPE) DNA and RNA have generally been extracted separately from either different tissue sections or from the same section by dividing the digested tissue. The former limits accurate correlation whilst the latter is impractical when utilizing rare or limited archived specimens. Principal Findings For effective recovery of genomic DNA and total RNA from a single FFPE specimen, without splitting the proteinase-K digested tissue solution, we optimized a co-extraction method by using TRIzol and purifying RNA from the upper aqueous and DNA from the lower organic phases. Using a series of seven different archived specimens, we evaluated the total amounts of genomic DNA and total RNA recovered by our TRIzol-based co-extraction method and compared our results with those from two commercial kits, the Qiagen AllPrep DNA/RNA FFPE kit, for co-extraction, and the Ambion RecoverAll™ Total Nucleic Acid Isolation kit, for separate extraction of FFPE-DNA and -RNA. Then, to accurately assess the quality of DNA and RNA co-extracted from a single FFPE specimen, we used qRT-PCR, gene expression profiling and methylation assays to analyze microRNAs, mRNAs, and genomic DNA recovered from matched fresh and FFPE MCF10A cells. These experiments show that the TRIzol-based co-extraction method provides larger amounts of FFPE-DNA and -RNA than the two other methods, and particularly provides higher quality microRNAs and genomic DNA for subsequent molecular analyses. Significance We determined that co-extraction of genomic DNA and total RNA from a single FFPE specimen is an effective recovery approach to obtain high-quality material for parallel molecular and high-throughput analyses.
Our optimized approach provides the option of collecting DNA, which would otherwise be discarded or degraded, for additional or subsequent studies.
Affiliation(s)
- Adam Kotorashvili
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Andrew Ramnauth
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Christina Liu
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Juan Lin
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Kenny Ye
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Ryung Kim
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Rachel Hazan
- Department of Pathology, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Thomas Rohan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Susan Fineberg
- Department of Pathology, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Olivier Loudig
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- Department of Pathology, Albert Einstein College of Medicine of Yeshiva University, Bronx, New York, United States of America
- * E-mail:
|
31
|
Fradin M, Stoetzel C, Muller J, Koob M, Christmann D, Debry C, Kohler M, Isnard M, Astruc D, Desprez P, Zorres C, Flori E, Dollfus H, Doray B. Osteosclerotic bone dysplasia in siblings with a Fam20C mutation. Clin Genet 2010; 80:177-83. [PMID: 20825432 DOI: 10.1111/j.1399-0004.2010.01516.x] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Indexed: 01/07/2023]
Abstract
Raine syndrome is an autosomal recessive disorder caused by mutations in the FAM20C gene. FAM20C codes for the human homolog of DMP4, a dentin matrix protein highly expressed in odontoblasts and moderately in bone. DMP4 probably plays a role in the mineralization process. Since the first case reported in 1989 by Raine et al., 21 cases have been published, delineating a phenotype that associates dysmorphic features, cerebral calcifications, choanal atresia or stenosis and thoracic/pulmonary hypoplasia. Kan and Kozlowski suggested the name Raine syndrome to describe this new lethal osteosclerotic bone dysplasia. All the cases described were lethal during the neonatal period except for the last two reported patients, aged 8 and 11 years, who presented severe mental retardation. Here we describe two sisters with an attenuated phenotype of Raine syndrome who present an unexpectedly normal psychomotor development at ages 4 and 1, respectively. Identification of a homozygous mutation in the FAM20C gene confirmed the Raine syndrome diagnosis, thus contributing to the expansion of the Raine syndrome phenotype. This case report also prompted us to revisit the FAM20 gene classification and allowed us to highlight the ancestral status of Fam20C.
Affiliation(s)
- Melanie Fradin
- Service de Génétique Médicale, CHU Strasbourg, Hôpital de Hautepierre, Avenue Molière, Strasbourg, France.
|