651
Schaper E, Gascuel O, Anisimova M. Deep conservation of human protein tandem repeats within the eukaryotes. Mol Biol Evol 2014; 31:1132-48. [PMID: 24497029 PMCID: PMC3995336 DOI: 10.1093/molbev/msu062] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 01/01/2023] Open
Abstract
Tandem repeats (TRs) are a major element of protein sequences in all domains of life. They are particularly abundant in mammals, where by conservative estimates one in three proteins contain a TR. High generation-scale duplication and deletion rates were reported for nucleic TR units. However, it is not known whether protein TR units can also be frequently lost or gained providing a source of variation for rapid adaptation of protein function, or alternatively, tend to have conserved TR unit configurations over long evolutionary times. To obtain a systematic picture, we performed a proteome-wide analysis of the mode of evolution for human protein TRs. For this purpose, we propose a novel method for the detection of orthologous TRs based on circular profile hidden Markov models. For all detected TRs, we reconstructed bispecies TR unit phylogenies across 61 eukaryotes ranging from human to yeast. Moreover, we performed additional analyses to correlate functional and structural annotations of human TRs with their mode of evolution. Surprisingly, we find that the vast majority of human TRs are ancient, with TR unit number and order preserved intact since distant speciation events. For example, ≥61% of all human TRs have been strongly conserved at least since the root of all mammals, approximately 300 Ma. Further, we find no human protein TR that shows evidence for strong recent duplications and deletions. The results are in contrast to the high generation-scale mutability of nucleic TRs. Presumably, most protein TRs fold into stable and conserved structures that are indispensable for the function of the TR-containing protein. All of our data and results are available for download from http://www.atgc-montpellier.fr/TRE.
Affiliation(s)
- Elke Schaper
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
652
Abstract
BigWig files are a compressed, indexed, binary format for genome-wide signal data for calculations (e.g. GC percent) or experiments (e.g. ChIP-seq/RNA-seq read depth). bwtool is a tool designed to read bigWig files rapidly and efficiently, providing functionality for extracting data and summarizing it in several ways, globally or at specific regions. Additionally, the tool enables the conversion of the positions of signal data from one genome assembly to another, also known as ‘lifting’. We believe bwtool can be useful for the analyst frequently working with bigWig data, which is becoming a standard format to represent functional signals along genomes. The article includes supplementary examples of running the software. Availability and implementation: The C source code is freely available under the GNU public license v3 at http://cromatina.crg.eu/bwtool. Contact: andrew.pohl@crg.eu, andypohl@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
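To make "summarizing signal at specific regions" concrete, here is a minimal Python sketch. It does not read bigWig files and is not bwtool's implementation; it assumes the signal has already been extracted for one chromosome as half-open, 0-based (start, end, value) intervals, and the function name `summarize_region` is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One signal interval on a single chromosome: (start, end, value),
# 0-based and half-open. This layout is an assumption for illustration.
Interval = Tuple[int, int, float]


def summarize_region(
    intervals: List[Interval], start: int, end: int
) -> Optional[Dict[str, float]]:
    """Summarize signal over the bases of [start, end) that carry data.

    Returns the number of covered bases plus the base-weighted mean,
    minimum and maximum, or None when no interval overlaps the region.
    """
    covered = 0
    total = 0.0
    lowest: Optional[float] = None
    highest: Optional[float] = None
    for s, e, value in intervals:
        # Length of the overlap between this interval and the region.
        overlap = min(e, end) - max(s, start)
        if overlap <= 0:
            continue
        covered += overlap
        total += value * overlap
        lowest = value if lowest is None else min(lowest, value)
        highest = value if highest is None else max(highest, value)
    if covered == 0:
        return None
    return {"covered": covered, "mean": total / covered,
            "min": lowest, "max": highest}
```

Calling `summarize_region([(0, 10, 1.0), (10, 20, 3.0)], 5, 15)` weights each value by the number of overlapping bases, so bases without data do not pull the mean toward zero.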
Affiliation(s)
- Andy Pohl
- Department of Gene Regulation, Stem Cells, and Cancer, Centre for Genomic Regulation (CRG) and Department of Experimental and Health Sciences (CEXS), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Miguel Beato
- Department of Gene Regulation, Stem Cells, and Cancer, Centre for Genomic Regulation (CRG) and Department of Experimental and Health Sciences (CEXS), Universitat Pompeu Fabra, 08003 Barcelona, Spain
653
Cho V, Mei Y, Sanny A, Chan S, Enders A, Bertram EM, Tan A, Goodnow CC, Andrews TD. The RNA-binding protein hnRNPLL induces a T cell alternative splicing program delineated by differential intron retention in polyadenylated RNA. Genome Biol 2014; 15:R26. [PMID: 24476532 PMCID: PMC4053824 DOI: 10.1186/gb-2014-15-1-r26] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 10/17/2013] [Accepted: 01/29/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Retention of a subset of introns in spliced polyadenylated mRNA is emerging as a frequent, unexplained finding from RNA deep sequencing in mammalian cells. RESULTS Here we analyze intron retention in T lymphocytes by deep sequencing polyadenylated RNA. We show a developmentally regulated RNA-binding protein, hnRNPLL, induces retention of specific introns by sequencing RNA from T cells with an inactivating Hnrpll mutation and from B lymphocytes that physiologically downregulate Hnrpll during their differentiation. In Ptprc mRNA encoding the tyrosine phosphatase CD45, hnRNPLL induces selective retention of introns flanking exons 4 to 6; these correspond to the cassette exons containing hnRNPLL binding sites that are skipped in cells with normal, but not mutant or low, hnRNPLL. We identify similar patterns of hnRNPLL-induced differential intron retention flanking alternative exons in 14 other genes, representing novel elements of the hnRNPLL-induced splicing program in T cells. Retroviral expression of a normally spliced cDNA for one of these targets, Senp2, partially corrects the survival defect of Hnrpll-mutant T cells. We find that integrating a number of computational methods to detect genes with differentially retained introns provides a strategy to enrich for alternatively spliced exons in mammalian RNA-seq data, when complemented by RNA-seq analysis of purified cells with experimentally perturbed RNA-binding proteins. CONCLUSIONS Our findings demonstrate that intron retention in mRNA is induced by specific RNA-binding proteins and suggest a biological significance for this process in marking exons that are poised for alternative splicing.
654
David FPA, Delafontaine J, Carat S, Ross FJ, Lefebvre G, Jarosz Y, Sinclair L, Noordermeer D, Rougemont J, Leleu M. HTSstation: a web application and open-access libraries for high-throughput sequencing data analysis. PLoS One 2014; 9:e85879. [PMID: 24475057 PMCID: PMC3903476 DOI: 10.1371/journal.pone.0085879] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Received: 09/03/2013] [Accepted: 12/03/2013] [Indexed: 01/25/2023] Open
Abstract
The HTSstation analysis portal is a suite of simple web forms coupled to modular analysis pipelines for various applications of High-Throughput Sequencing including ChIP-seq, RNA-seq, 4C-seq and re-sequencing. HTSstation offers biologists the possibility to rapidly investigate their HTS data using an intuitive web application with heuristically pre-defined parameters. A number of open-source software components have been implemented and can be used to build, configure and run HTS analysis pipelines reactively. Besides, our programming framework empowers developers with the possibility to design their own workflows and integrate additional third-party software. The HTSstation web application is accessible at http://htsstation.epfl.ch.
Affiliation(s)
- Fabrice P. A. David
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Julien Delafontaine
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Solenne Carat
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Frederick J. Ross
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Gregory Lefebvre
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Yohan Jarosz
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Lucas Sinclair
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Daan Noordermeer
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute for Experimental Cancer Research (ISREC), Lausanne, Switzerland
- Jacques Rougemont
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- * E-mail: (JR); (ML)
- Marion Leleu
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Swiss Institute for Experimental Cancer Research (ISREC), Lausanne, Switzerland
- * E-mail: (JR); (ML)
655
Baldacchino S, Saliba C, Petroni V, Fenech AG, Borg N, Grech G. Deregulation of the phosphatase, PP2A is a common event in breast cancer, predicting sensitivity to FTY720. EPMA J 2014; 5:3. [PMID: 24460909 PMCID: PMC3913630 DOI: 10.1186/1878-5085-5-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 12/13/2013] [Accepted: 01/09/2014] [Indexed: 01/01/2023]
Abstract
Background The most commonly used biomarkers to predict the response of breast cancer patients to therapy are the oestrogen receptor (ER), progesterone receptor (PgR), and human epidermal growth factor receptor 2 (HER2). Patients positive for these biomarkers are eligible for specific therapies such as endocrine treatment in the event of ER and PgR positivity, and the monoclonal antibody, trastuzumab, in the case of HER2-positive patients. Patients who are negative for these three biomarkers, the so-called triple negatives, however, derive little benefit from such therapies and are associated with a worse prognosis. Deregulation of the protein serine/threonine phosphatase type 2A (PP2A) and its regulatory subunits is a common event in breast cancer, providing a possible target for therapy. Methods The data portal, cBioPortal for Cancer Genomics was used to investigate the incidence of conditions that are associated with low phosphatase activity. Four (4) adherent human breast cancer cell lines, MDA-MB-468, MDA-MB-436, Hs578T and BT-20 were cultured to assess their viability when exposed to various dosages of rapamycin or FTY720. In addition, RNA was extracted and cDNA was synthesised to amplify the coding sequence of PPP2CA. Amplification was followed by high-resolution melting to identify variations. Results and conclusion The sequence of PPP2CA was found to be conserved across a diverse panel of solid tumour and haematological cell lines, suggesting that low expression of PPP2CA and differential binding of inhibitory PPP2CA regulators are the main mechanisms of PP2A deregulation. Interestingly, the cBioPortal for Cancer Genomics shows that PP2A is deregulated in 59.6% of basal breast tumours. Viability assays performed to determine the sensitivity of a panel of breast cancer cell lines to FTY720, a PP2A activator, indicated that cell lines associated with ER loss are sensitive to lower doses of FTY720. 
The subset of patients with suppressed PP2A activity is potentially eligible for treatment using therapies which target the PI3K/AKT/mTOR pathway, such as phosphatase activators.
Affiliation(s)
- Shawn Baldacchino
- Department of Pathology, Medical School, University of Malta, Msida MSD2090, Malta
- Christian Saliba
- Department of Pathology, Medical School, University of Malta, Msida MSD2090, Malta
- Vanessa Petroni
- Department of Clinical Pharmacology and Therapeutics, University of Malta, Msida MSD2090, Malta
- Anthony G Fenech
- Department of Clinical Pharmacology and Therapeutics, University of Malta, Msida MSD2090, Malta
- Nigel Borg
- Department of Pathology, Medical School, University of Malta, Msida MSD2090, Malta
- Godfrey Grech
- Department of Pathology, Medical School, University of Malta, Msida MSD2090, Malta
656
Coverage and efficiency in current SNP chips. Eur J Hum Genet 2014; 22:1124-30. [PMID: 24448550 DOI: 10.1038/ejhg.2013.304] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 04/18/2013] [Revised: 12/03/2013] [Accepted: 12/05/2013] [Indexed: 01/24/2023] Open
Abstract
To answer the question as to which commercial high-density SNP chip covers most of the human genome given a fixed budget, we compared the performance of 12 chips of different sizes released by Affymetrix and Illumina for the European, Asian, and African populations. These include Affymetrix' relatively new population-optimized arrays, whose SNP sets are each tailored toward a specific ethnicity. Our evaluation of the chips included the use of two measures, efficiency and cost-benefit ratio, which we developed as supplements to genetic coverage. Unlike coverage, these measures factor in the price of a chip or its substitute size (number of SNPs on chip), allowing comparisons to be drawn between differently priced chips. In this fashion, we identified the Affymetrix population-optimized arrays as offering the most cost-effective coverage for the Asian and African populations. For the European population, we established the Illumina Human Omni 2.5-8 as the preferred choice. Interestingly, the Affymetrix chip tailored toward an Eastern Asian subpopulation performed well for all three populations investigated. However, our coverage estimates calculated for all chips proved much lower than those advertised by the producers. All our analyses were based on the 1000 Genomes Project as the reference population.
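The abstract names efficiency and cost-benefit ratio but does not give their formulas. As a rough illustration only, the Python sketch below assumes the common definition of genetic coverage (the share of reference SNPs whose best r² with any SNP on the chip reaches a threshold, here 0.8) and two simple normalisations, coverage per million chip SNPs and price per percentage point of coverage. Both normalisations are assumptions standing in for the paper's measures, and all function names and inputs are hypothetical.

```python
from typing import Dict


def coverage(max_r2_by_snp: Dict[str, float], threshold: float = 0.8) -> float:
    """Fraction of reference SNPs tagged by the chip.

    max_r2_by_snp maps each reference SNP to its highest r^2 with any
    SNP on the chip. The 0.8 threshold is a conventional choice and an
    assumption here, not a value taken from the abstract.
    """
    if not max_r2_by_snp:
        raise ValueError("no reference SNPs given")
    tagged = sum(1 for r2 in max_r2_by_snp.values() if r2 >= threshold)
    return tagged / len(max_r2_by_snp)


def coverage_per_million_snps(cov: float, n_chip_snps: int) -> float:
    """Assumed size-normalised measure: coverage per million chip SNPs."""
    if n_chip_snps <= 0:
        raise ValueError("chip must contain at least one SNP")
    return cov / (n_chip_snps / 1_000_000)


def price_per_coverage_point(price: float, cov: float) -> float:
    """Assumed price-normalised measure: price per percentage point of coverage."""
    if cov <= 0:
        raise ValueError("coverage must be positive")
    return price / (cov * 100)
```

Normalising by chip size or price is what lets a small, cheap array be compared with a large, expensive one; raw coverage alone would always favour the larger chip.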
657
Littlejohn MD, Tiplady K, Lopdell T, Law TA, Scott A, Harland C, Sherlock R, Henty K, Obolonkin V, Lehnert K, MacGibbon A, Spelman RJ, Davis SR, Snell RG. Expression variants of the lipogenic AGPAT6 gene affect diverse milk composition phenotypes in Bos taurus. PLoS One 2014; 9:e85757. [PMID: 24465687 PMCID: PMC3897493 DOI: 10.1371/journal.pone.0085757] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Received: 10/07/2013] [Accepted: 12/01/2013] [Indexed: 12/22/2022] Open
Abstract
Milk is composed of a complex mixture of lipids, proteins, carbohydrates and various vitamins and minerals as a source of nutrition for young mammals. The composition of milk varies between individuals, with lipid composition in particular being highly heritable. Recent reports have highlighted a region of bovine chromosome 27 harbouring variants affecting milk fat percentage and fatty acid content. We aimed to further investigate this locus in two independent cattle populations, consisting of a Holstein-Friesian x Jersey crossbreed pedigree of 711 F2 cows, and a collection of 32,530 mixed ancestry Bos taurus cows. Bayesian genome-wide association mapping using markers imputed from the Illumina BovineHD chip revealed a large quantitative trait locus (QTL) for milk fat percentage on chromosome 27, present in both populations. We also investigated a range of other milk composition phenotypes, and report additional associations at this locus for fat yield, protein percentage and yield, lactose percentage and yield, milk volume, and the proportions of numerous milk fatty acids. We then used mammary RNA sequence data from 212 lactating cows to assess the transcript abundance of genes located in the milk fat percentage QTL interval. This analysis revealed a strong eQTL for AGPAT6, demonstrating that high milk fat percentage genotype is also additively associated with increased expression of the AGPAT6 gene. Finally, we used whole genome sequence data from six F1 sires to target a panel of novel AGPAT6 locus variants for genotyping in the F2 crossbreed population. Association analysis of 58 of these variants revealed highly significant association for polymorphisms mapping to the 5′UTR exons and intron 1 of AGPAT6. Taken together, these data suggest that variants affecting the expression of AGPAT6 are causally involved in differential milk fat synthesis, with pleiotropic consequences for a diverse range of other milk components.
Affiliation(s)
- Mathew D. Littlejohn
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- * E-mail:
- Kathryn Tiplady
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- Thomas Lopdell
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- Tania A. Law
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Andrew Scott
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- Chad Harland
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- Ric Sherlock
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- Kristen Henty
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Vlad Obolonkin
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- Klaus Lehnert
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Alistair MacGibbon
- Nutrition and Bioactives, Fonterra Research Centre, Palmerston North, New Zealand
- Richard J. Spelman
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- Stephen R. Davis
- Research & Development, Livestock Improvement Corporation, Hamilton, New Zealand
- Russell G. Snell
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
658
Rosikiewicz M, Robinson-Rechavi M. IQRray, a new method for Affymetrix microarray quality control, and the homologous organ conservation score, a new benchmark method for quality control metrics. Bioinformatics 2014; 30:1392-9. [PMID: 24451627 PMCID: PMC4016700 DOI: 10.1093/bioinformatics/btu027] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
Abstract
MOTIVATION Microarray results accumulated in public repositories are widely reused in meta-analytical studies and secondary databases. The quality of the data obtained with this technology varies from experiment to experiment, and an efficient method for quality assessment is necessary to ensure their reliability. RESULTS The lack of a good benchmark has hampered evaluation of existing methods for quality control. In this study, we propose a new independent quality metric that is based on evolutionary conservation of expression profiles. We show, using 11 large organ-specific datasets, that IQRray, a new quality metric that we developed, exhibits the highest correlation with this reference metric among the 14 metrics tested. IQRray outperforms other methods in identification of poor quality arrays in datasets composed of arrays from many independent experiments. In contrast, the performance of methods designed for detecting outliers in a single experiment, like Normalized Unscaled Standard Error and Relative Log Expression, was low because of the inability of these methods to detect datasets containing only low-quality arrays and because the scores cannot be directly compared between experiments. AVAILABILITY AND IMPLEMENTATION The R implementation of IQRray is available at: ftp://lausanne.isb-sib.ch/pub/databases/Bgee/general/IQRray.R. CONTACT Marta.Rosikiewicz@unil.ch SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
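The abstract does not spell out how the IQRray score is computed. The sketch below assumes one reading of the method: rank all probe intensities on an array, average the ranks within each probe set, and take the interquartile range of those averages as the quality score. It is written in Python for illustration and is not the authors' R implementation; the function names and the toy input layout (a mapping from probe-set name to intensities) are hypothetical.

```python
from statistics import mean, quantiles
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def iqr_of_probeset_ranks(probesets: Dict[str, List[float]]) -> float:
    """Rank all probes on one array, average ranks per probe set, return the IQR.

    Needs at least two non-empty probe sets; the statistics module raises
    StatisticsError otherwise.
    """
    names = list(probesets)
    flat = [v for name in names for v in probesets[name]]
    ranks = average_ranks(flat)
    set_means = []
    pos = 0
    for name in names:
        size = len(probesets[name])
        set_means.append(mean(ranks[pos:pos + size]))
        pos += size
    # method="inclusive" uses linear interpolation between data points.
    q1, _, q3 = quantiles(set_means, n=4, method="inclusive")
    return q3 - q1
```

Under this reading, an array whose probe sets all have similar average ranks gets a low score, so a low value would flag weak separation between expressed and unexpressed probe sets.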
Affiliation(s)
- Marta Rosikiewicz
- Department of Ecology and Evolution, University of Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
659
Hooper JE. A survey of software for genome-wide discovery of differential splicing in RNA-Seq data. Hum Genomics 2014; 8:3. [PMID: 24447644 PMCID: PMC3903050 DOI: 10.1186/1479-7364-8-3] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Received: 09/02/2013] [Accepted: 12/26/2013] [Indexed: 01/10/2023] Open
Abstract
Alternative splicing is a major contributor to cellular diversity. Therefore the identification and quantification of differentially spliced transcripts in genome-wide transcript analysis is an important consideration. Here, I review the software available for analysis of RNA-Seq data for differential splicing and discuss intrinsic challenges for differential splicing analyses. Three approaches to differential splicing analysis are described, along with their associated software implementations, their strengths, limitations, and caveats. Suggestions for future work include more extensive experimental validation to assess accuracy of the software predictions and consensus formats for outputs that would facilitate visualizations, data exchange, and downstream analyses.
Affiliation(s)
- Joan E Hooper
- Department of Cell and Developmental Biology, University of Colorado Anschutz Medical Campus, 12801 17th Ave, rm 12103, MS 8108, PO Box 6511, Aurora, CO 80045, USA.
660
Serrano-Candelas E, Farré D, Aranguren-Ibáñez Á, Martínez-Høyer S, Pérez-Riba M. The vertebrate RCAN gene family: novel insights into evolution, structure and regulation. PLoS One 2014; 9:e85539. [PMID: 24465593 PMCID: PMC3896409 DOI: 10.1371/journal.pone.0085539] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 07/09/2013] [Accepted: 12/04/2013] [Indexed: 12/30/2022] Open
Abstract
Recently there has been much interest in the Regulators of Calcineurin (RCAN) proteins which are important endogenous modulators of the calcineurin-NFATc signalling pathway. They have been shown to have a crucial role in cellular programmes such as the immune response, muscle fibre remodelling and memory, but also in pathological processes such as cardiac hypertrophy and neurodegenerative diseases. In vertebrates, the RCAN family form a functional subfamily of three members RCAN1, RCAN2 and RCAN3 whereas only one RCAN is present in the rest of Eukarya. In addition, RCAN genes have been shown to collocate with RUNX and CLIC genes in ACD clusters (ACD21, ACD6 and ACD1). How the RCAN genes and their clustering in ACDs evolved is still unknown. After analysing RCAN gene family evolution using bioinformatic tools, we propose that the three RCAN vertebrate genes within the ACD clusters, which evolved from single copy genes present in invertebrates and lower eukaryotes, are the result of two rounds of whole genome duplication, followed by a segmental duplication. This evolutionary scenario involves the loss or gain of some RCAN genes during evolution. In addition, we have analysed RCAN gene structure and identified the existence of several characteristic features that can be involved in RCAN evolution and gene expression regulation. These included: several transposable elements, CpG islands in the 5′ region of the genes, the existence of antisense transcripts (NAT) associated with the three human genes, and considerable evidence for bidirectional promoters that regulate RCAN gene expression. Furthermore, we show that the CpG island associated with the RCAN3 gene promoter is unmethylated and transcriptionally active. All these results provide timely new insights into the molecular mechanisms underlying RCAN function and a more in depth knowledge of this gene family whose members are obvious candidates for the development of future therapies.
Affiliation(s)
- Eva Serrano-Candelas
- Cancer and Human Molecular Genetics Department, Bellvitge Biomedical Research Institute – IDIBELL, L’Hospitalet de Llobregat, Barcelona, Spain
- Domènec Farré
- Biological Aggression and Response Mechanisms Unit, Institut d'Investigacions Biomèdiques August Pi i Sunyer – IDIBAPS, Barcelona, Spain
- Álvaro Aranguren-Ibáñez
- Cancer and Human Molecular Genetics Department, Bellvitge Biomedical Research Institute – IDIBELL, L’Hospitalet de Llobregat, Barcelona, Spain
- Sergio Martínez-Høyer
- Cancer and Human Molecular Genetics Department, Bellvitge Biomedical Research Institute – IDIBELL, L’Hospitalet de Llobregat, Barcelona, Spain
- Mercè Pérez-Riba
- Cancer and Human Molecular Genetics Department, Bellvitge Biomedical Research Institute – IDIBELL, L’Hospitalet de Llobregat, Barcelona, Spain
- * E-mail:
661
Pimentel H, Parra M, Gee S, Ghanem D, An X, Li J, Mohandas N, Pachter L, Conboy JG. A dynamic alternative splicing program regulates gene expression during terminal erythropoiesis. Nucleic Acids Res 2014; 42:4031-42. [PMID: 24442673 PMCID: PMC3973340 DOI: 10.1093/nar/gkt1388] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.9] [Indexed: 12/13/2022] Open
Abstract
Alternative pre-messenger RNA splicing remodels the human transcriptome in a spatiotemporal manner during normal development and differentiation. Here we explored the landscape of transcript diversity in the erythroid lineage by RNA-seq analysis of five highly purified populations of morphologically distinct human erythroblasts, representing the last four cell divisions before enucleation. In this unique differentiation system, we found evidence of an extensive and dynamic alternative splicing program encompassing genes with many diverse functions. Alternative splicing was particularly enriched in genes controlling cell cycle, organelle organization, chromatin function and RNA processing. Many alternative exons exhibited differentiation-associated switches in splicing efficiency, mostly in late-stage polychromatophilic and orthochromatophilic erythroblasts, in concert with extensive cellular remodeling that precedes enucleation. A subset of alternative splicing switches introduces premature translation termination codons into selected transcripts in a differentiation stage-specific manner, supporting the hypothesis that alternative splicing-coupled nonsense-mediated decay contributes to regulation of erythroid-expressed genes as a novel part of the overall differentiation program. We conclude that a highly dynamic alternative splicing program in terminally differentiating erythroblasts plays a major role in regulating gene expression to ensure synthesis of appropriate proteome at each stage as the cells remodel in preparation for production of mature red cells.
Affiliation(s)
- Harold Pimentel
- Department of Computer Science, University of California, Berkeley, CA 94720, USA, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Red Cell Physiology Laboratory, New York Blood Center, New York, NY 10065, USA, Department of Mathematics, University of California, Berkeley, CA 94720, USA and Department of Molecular & Cell Biology, University of California, Berkeley, CA 94720, USA
662
Arcas A, Fernández-Capetillo O, Cases I, Rojas AM. Emergence and evolutionary analysis of the human DDR network: implications in comparative genomics and downstream analyses. Mol Biol Evol 2014; 31:940-61. [PMID: 24441036 PMCID: PMC3969565 DOI: 10.1093/molbev/msu046] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/07/2023] Open
Abstract
The DNA damage response (DDR) is a crucial signaling network that preserves the integrity of the genome. This network is an ensemble of distinct but often overlapping subnetworks, where different components fulfill distinct functions in precise spatial and temporal scenarios. To understand how these elements have been assembled together in humans, we performed comparative genomic analyses in 47 selected species to trace back their emergence using systematic phylogenetic analyses and estimated gene ages. The emergence of the contribution of posttranslational modifications to the complex regulation of DDR was also investigated. This is the first time a systematic analysis has focused on the evolution of DDR subnetworks as a whole. Our results indicate that a DDR core, mostly constructed around metabolic activities, appeared soon after the emergence of eukaryotes, and that additional regulatory capacities appeared later through complex evolutionary process. Potential key posttranslational modifications were also in place then, with interacting pairs preferentially appearing at the same evolutionary time, although modifications often led to the subsequent acquisition of new targets afterwards. We also found extensive gene loss in essential modules of the regulatory network in fungi, plants, and arthropods, important for their validation as model organisms for DDR studies.
Affiliation(s)
- Aida Arcas
- Computational Cell Biology Group, Institute for Predictive and Personalized Medicine of Cancer, Badalona, Spain
663
Washietl S, Kellis M, Garber M. Evolutionary dynamics and tissue specificity of human long noncoding RNAs in six mammals. Genome Res 2014; 24:616-28. [PMID: 24429298 PMCID: PMC3975061 DOI: 10.1101/gr.165035.113] [Citation(s) in RCA: 287] [Impact Index Per Article: 28.7] [Indexed: 12/21/2022]
Abstract
Long intergenic noncoding RNAs (lincRNAs) play diverse regulatory roles in human development and disease, but little is known about their evolutionary history and constraint. Here, we characterize human lincRNA expression patterns in nine tissues across six mammalian species and multiple individuals. Of the 1898 human lincRNAs expressed in these tissues, we find orthologous transcripts for 80% in chimpanzee, 63% in rhesus, 39% in cow, 38% in mouse, and 35% in rat. Mammalian-expressed lincRNAs show remarkably strong conservation of tissue specificity, suggesting that it is selectively maintained. In contrast, abundant splice-site turnover suggests that exact splice sites are not critical. Relative to evolutionarily young lincRNAs, mammalian-expressed lincRNAs show higher primary sequence conservation in their promoters and exons, increased proximity to protein-coding genes enriched for tissue-specific functions, fewer repeat elements, and more frequent single-exon transcripts. Remarkably, we find that ∼20% of human lincRNAs are not expressed beyond chimpanzee and are undetectable even in rhesus. These hominid-specific lincRNAs are more tissue specific, enriched for testis, and faster evolving within the human lineage.
Affiliation(s)
- Stefan Washietl
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02140, USA
|
664
|
Hoffman NE, Chandramoorthy HC, Shanmughapriya S, Zhang XQ, Vallem S, Doonan PJ, Malliankaraman K, Guo S, Rajan S, Elrod JW, Koch WJ, Cheung JY, Madesh M. SLC25A23 augments mitochondrial Ca²⁺ uptake, interacts with MCU, and induces oxidative stress-mediated cell death. Mol Biol Cell 2014; 25:936-47. [PMID: 24430870 PMCID: PMC3952861 DOI: 10.1091/mbc.e13-08-0502] [Citation(s) in RCA: 115] [Impact Index Per Article: 11.5] [Indexed: 01/08/2023] Open
Abstract
Emerging findings suggest that two lineages of mitochondrial Ca²⁺ uptake participate during active and resting states: 1) the major eukaryotic membrane potential-dependent mitochondrial Ca²⁺ uniporter and 2) the evolutionarily conserved exchangers and solute carriers, which are also involved in ion transport. Although the influx of Ca²⁺ across the inner mitochondrial membrane maintains metabolic functions and cell death signal transduction, the mechanisms that regulate mitochondrial Ca²⁺ accumulation are unclear. The solute carriers SLC25A23 (solute carrier 25A23), SLC25A24, and SLC25A25 represent a family of EF-hand-containing mitochondrial proteins that transport Mg-ATP/Pi across the inner membrane. RNA interference-mediated knockdown of SLC25A23, but not of SLC25A24 or SLC25A25, decreases mitochondrial Ca²⁺ uptake and reduces cytosolic Ca²⁺ clearance after histamine stimulation. Ectopic expression of SLC25A23 EF-hand-domain mutants exhibits a dominant-negative phenotype of reduced mitochondrial Ca²⁺ uptake. In addition, SLC25A23 interacts with mitochondrial Ca²⁺ uniporter (MCU; CCDC109A) and MICU1 (CBARA1) while also increasing IMCU. Moreover, SLC25A23 knockdown lowers basal mROS accumulation, attenuates oxidant-induced ATP decline, and reduces cell death. Further, reconstitution with short hairpin RNA-insensitive SLC25A23 cDNA restores mitochondrial Ca²⁺ uptake and superoxide production. These findings indicate that SLC25A23 plays an important role in mitochondrial matrix Ca²⁺ influx.
Affiliation(s)
- Nicholas E Hoffman
- Department of Biochemistry, Temple University, Philadelphia, PA 19140; Center for Translational Medicine, Temple University, Philadelphia, PA 19140
|
665
|
Seidel MG, Duerr C, Woutsas S, Schwerin-Nagel A, Sadeghi K, Neesen J, Uhrig S, Santos-Valente E, Pickl WF, Schwinger W, Urban C, Boztug K, Förster-Waldl E. A novel immunodeficiency syndrome associated with partial trisomy 19p13. J Med Genet 2014; 51:254-63. [PMID: 24431329 PMCID: PMC3963557 DOI: 10.1136/jmedgenet-2013-102122] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/08/2022]
Abstract
Background: Subtelomeric deletions and duplications may cause syndromic disorders that include features of immunodeficiency. To date, no phenotype of immunological pathology has been linked to partial trisomy 19. We report here on two unrelated male patients showing clinical and laboratory signs of immunodeficiency and exhibiting a duplication involving chromosome 19p13. Methods: Both patients underwent a detailed clinical examination. Extended laboratory investigations for immune function, FISH and array comparative genome hybridization (CGH) analyses were performed. Results: The reported patients were born prematurely with intrauterine growth retardation and share clinical features including neurological impairment, facial dysmorphy and urogenital malformations. Array CGH analyses of both patients showed a largely overlapping terminal duplication affecting chromosome 19p13. In both affected individuals, the clinical course was marked by recurrent severe infections. Signs of humoral immunodeficiency were detected, including selective antibody deficiency against polysaccharide antigens in patient 1 and reduced IgG1 and IgG3 subclass levels and IgM deficiency in patient 2. Class-switched B memory cells were almost absent in both patients. Normal numbers of T cells, B cells and natural killer cells were observed in both boys. Lymphocytic proliferation showed no consistent functional pathology; however, the function of granulocytes and monocytes as assessed by oxidative burst test was moderately reduced. Moreover, natural killer cytotoxicity was reduced in both patients. Immunoglobulin substitution resulted in a decreased number and severity of infections and improved thriving in both patients. Conclusions: Partial trisomy 19p13 represents a syndromic disorder associating organ malformation and hitherto unrecognised immunodeficiency.
Affiliation(s)
- Markus G Seidel
- Division of Pediatric Hematology-Oncology, Department of Pediatrics and Adolescent Medicine, Medical University Graz, Graz, Austria
|
666
|
Sex-biased chromatin and regulatory cross-talk between sex chromosomes, autosomes, and mitochondria. Biol Sex Differ 2014; 5:2. [PMID: 24422881 PMCID: PMC3907150 DOI: 10.1186/2042-6410-5-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 11/04/2013] [Accepted: 12/29/2013] [Indexed: 02/07/2023] Open
Abstract
Several autoimmune and neurological diseases exhibit a sex bias, but discerning the causes and mechanisms of these biases has been challenging. Sex differences begin to manifest themselves in early embryonic development, and gonadal differentiation further bifurcates the male and female phenotypes. Even at this early stage, however, there is evidence that males and females respond to environmental stimuli differently, and the divergent phenotypic responses may have consequences later in life. The effect of prenatal nutrient restriction illustrates this point, as adult women exposed to prenatal restrictions exhibited increased risk factors of cardiovascular disease, while men exposed to the same condition did not. Recent research has examined the roles of sex-specific genes, hormones, chromosomes, and the interactions among them in mediating sex-biased phenotypes. Such research has identified testosterone, for example, as a possible protective agent against autoimmune disorders and an XX chromosome complement as a susceptibility factor in murine models of lupus and multiple sclerosis. Sex-biased chromatin is an additional and likely important component. Research suggesting a role for X and Y chromosome heterochromatin in regulating epigenetic states of autosomes has highlighted unorthodox mechanisms of gene regulation. The crosstalk between the Y chromosomes and autosomes may be further mediated by the mitochondria. The organelles have solely maternal transmission and exert differential effects on males and females. Altogether, research supports the notion that the interaction between sex-biased elements might exert novel regulatory functions in the genome and contribute to sex-specific susceptibilities to autoimmune and neurological diseases.
|
667
|
Phillips CJ, Phillips CD, Goecks J, Lessa EP, Sotero-Caio CG, Tandler B, Gannon MR, Baker RJ. Dietary and flight energetic adaptations in a salivary gland transcriptome of an insectivorous bat. PLoS One 2014; 9:e83512. [PMID: 24454705 PMCID: PMC3891661 DOI: 10.1371/journal.pone.0083512] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 08/08/2013] [Accepted: 11/04/2013] [Indexed: 12/12/2022] Open
Abstract
We hypothesized that evolution of the salivary gland secretory proteome has been important in adaptation to insectivory, the most common dietary strategy among Chiroptera. A submandibular salivary gland (SMG) transcriptome was sequenced for the little brown bat, Myotis lucifugus. The likely secretory proteome of 23 genes included seven (RETNLB, PSAP, CLU, APOE, LCN2, C3, CEL) related to M. lucifugus insectivorous diet and metabolism. Six of the secretory proteins probably are endocrine, whereas one (CEL) most likely is exocrine. The encoded proteins are associated with lipid hydrolysis, regulation of lipid metabolism, lipid transport, and insulin resistance. They are capable of processing exogenous lipids for flight metabolism while foraging. Salivary carboxyl ester lipase (CEL) is thought to hydrolyze insect lipophorins, which probably are absorbed across the gastric mucosa during feeding. The other six proteins are predicted either to maintain these lipids at high blood concentrations or to facilitate transport and uptake by flight muscles. Expression of these seven genes and coordinated secretion from a single organ is novel to this insectivorous bat, and apparently has evolved through instances of gene duplication, gene recruitment, and nucleotide selection. Four of the recruited genes are single-copy in the Myotis genome, whereas three have undergone duplication(s) with two of these genes exhibiting evolutionary 'bursts' of duplication resulting in multiple paralogs. Evidence for episodic directional selection was found for six of seven genes, reinforcing the conclusion that the recruited genes have important roles in adaptation to insectivory and the metabolic demands of flight. Intragenic frequencies of mobile-element-like sequences differed from frequencies in the whole M. lucifugus genome. Differences among recruited genes imply separate evolutionary trajectories and that adaptation was not a single, coordinated event.
Affiliation(s)
- Carleton J. Phillips
- Department of Biological Sciences, Texas Tech University, Lubbock, Texas, United States of America
| | - Caleb D. Phillips
- Department of Biological Sciences, Texas Tech University, Lubbock, Texas, United States of America
| | - Jeremy Goecks
- Department of Biology, Emory University, Atlanta, Georgia, United States of America
- Department of Math and Computer Science, Emory University, Atlanta, Georgia, United States of America
| | - Enrique P. Lessa
- Departamento de Ecología y Evolución, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
| | - Cibele G. Sotero-Caio
- Department of Biological Sciences, Texas Tech University, Lubbock, Texas, United States of America
| | - Bernard Tandler
- Department of Biological Sciences, School of Dental Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Michael R. Gannon
- Department of Biology, Pennsylvania State University, Altoona College, Altoona, Pennsylvania, United States of America
| | - Robert J. Baker
- Department of Biological Sciences, Texas Tech University, Lubbock, Texas, United States of America
|
668
|
Taliun D, Gamper J, Pattaro C. Efficient haplotype block recognition of very long and dense genetic sequences. BMC Bioinformatics 2014; 15:10. [PMID: 24423111 PMCID: PMC3898000 DOI: 10.1186/1471-2105-15-10] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 07/09/2013] [Accepted: 12/18/2013] [Indexed: 11/10/2022] Open
Abstract
Background: New sequencing technologies make it possible to scan very long and dense genetic sequences, obtaining datasets of genetic markers that are an order of magnitude larger than previously available. Such genetic sequences are characterized by common alleles interspersed with multiple rarer alleles. This situation has renewed interest in the identification of haplotypes carrying the rare risk alleles. However, large-scale explorations of the linkage-disequilibrium (LD) pattern to identify haplotype blocks are not easy to perform, because traditional algorithms have at least Θ(n²) time and memory complexity. Results: We derived three incremental optimizations of the widely used haplotype block recognition algorithm proposed by Gabriel et al. in 2002. Our most efficient solution, called MIG++, has only Θ(n) memory complexity and, on a genome-wide scale, it omits >80% of the calculations, which makes it an order of magnitude faster than the original algorithm. Unlike existing software, MIG++ analyzes the LD between SNPs at any distance, avoiding restrictions on the maximal block length. The haplotype block partition of the entire HapMap II CEPH dataset was obtained in 457 hours. By replacing the standard likelihood-based D′ variance estimator with an approximated estimator, the runtime was further improved. While producing a coarser partition, the approximate method made it possible to obtain the full-genome haplotype block partition of the entire 1000 Genomes Project CEPH dataset in 44 hours, with no restrictions on allele frequency or long-range correlations. These experiments showed that LD-based haplotype blocks can span more than one million base-pairs in both HapMap II and 1000 Genomes datasets. An application to the North American Rheumatoid Arthritis Consortium (NARAC) dataset shows how MIG++ can support genome-wide haplotype association studies.
Conclusions: MIG++ enables LD-based haplotype block recognition on genetic sequences of any length and density. In the new generation sequencing era, this can help identify haplotypes that carry rare variants of interest. The low computational requirements open the possibility of including the haplotype block structure in genome-wide association scans, downstream analyses, and visual interfaces for online genome browsers.
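To make the Θ(n²) cost mentioned in the abstract concrete, the following is a minimal sketch of the textbook pairwise D′ statistic and of the number of SNP pairs a naive scan must visit. It is not the MIG++ algorithm and does not reproduce the confidence-interval test of Gabriel et al.; the function names `d_prime` and `pair_count` are hypothetical and chosen only for illustration.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Signed Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    (Illustrative helper, not taken from the paper.)
    """
    d = p_ab - p_a * p_b  # raw linkage disequilibrium coefficient D
    if d == 0:
        return 0.0
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # monomorphic locus: D' is undefined, report 0 here
    return d / d_max


def pair_count(n_snps: int) -> int:
    """Number of unordered SNP pairs a naive all-pairs LD scan evaluates."""
    return n_snps * (n_snps - 1) // 2


if __name__ == "__main__":
    print(d_prime(0.5, 0.5, 0.5))   # alleles always co-occur
    print(d_prime(0.25, 0.5, 0.5))  # independent loci
    print(pair_count(1_000_000))    # pairs for one million SNPs
```

Because the pair count grows quadratically, storing one value per pair quickly becomes infeasible for dense data, which is the motivation the abstract gives for a linear-memory method.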
Affiliation(s)
- Daniel Taliun
- Center for Biomedicine, European Academy of Bolzano/Bozen (EURAC), Bozen-Bolzano, Italy.
|
669
|
Cajuso T, Hänninen UA, Kondelin J, Gylfe AE, Tanskanen T, Katainen R, Pitkänen E, Ristolainen H, Kaasinen E, Taipale M, Taipale J, Böhm J, Renkonen-Sinisalo L, Mecklin JP, Järvinen H, Tuupanen S, Kilpivaara O, Vahteristo P. Exome sequencing reveals frequent inactivating mutations in ARID1A, ARID1B, ARID2 and ARID4A in microsatellite unstable colorectal cancer. Int J Cancer 2014; 135:611-23. [PMID: 24382590 DOI: 10.1002/ijc.28705] [Citation(s) in RCA: 93] [Impact Index Per Article: 9.3] [Received: 09/16/2013] [Revised: 12/05/2013] [Accepted: 12/12/2013] [Indexed: 12/13/2022]
Abstract
ARID1A has been identified as a novel tumor suppressor gene in ovarian cancer and subsequently in various other tumor types. ARID1A belongs to the ARID domain-containing gene family, which comprises 15 genes involved, for example, in transcriptional regulation, proliferation and chromatin remodeling. In this study, we used exome sequencing data to analyze the mutation frequency of all the ARID domain-containing genes in 25 microsatellite unstable (MSI) colorectal cancers (CRCs) as a first systematic effort to characterize the mutation pattern of the whole ARID gene family. Genes that fulfilled the selection criteria in this discovery set (mutations in at least 4/25 [16%] samples, including at least one nonsense or splice site mutation) were chosen for further analysis in an independent validation set of 21 MSI CRCs. We found that in addition to ARID1A, which was mutated in 39% of the tumors (18/46), ARID1B (13%, 6/46), ARID2 (13%, 6/46) and ARID4A (20%, 9/46) were also frequently mutated. In all these genes, the mutations were distributed along the entire length of the gene, thus distinguishing them from typical MSI target genes previously described. Our results indicate that in addition to ARID1A, other members of the ARID gene family may play a role in MSI CRC.
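The reported frequencies can be checked against the stated counts. The short sketch below only recomputes the rounded percentages from the numbers given in the abstract (25 discovery plus 21 validation tumors, 46 in total); the variable names are illustrative and no data beyond the abstract are used.

```python
# Mutated-tumor counts per gene as stated in the abstract (out of 46 tumors).
mutated = {"ARID1A": 18, "ARID1B": 6, "ARID2": 6, "ARID4A": 9}
total_tumors = 25 + 21  # discovery set + validation set

# Percentage of tumors carrying a mutation, rounded to the nearest integer.
percent = {gene: round(100 * n / total_tumors) for gene, n in mutated.items()}

# Discovery-set selection threshold: at least 4 of 25 samples.
threshold_percent = round(100 * 4 / 25)

if __name__ == "__main__":
    for gene, pct in percent.items():
        print(f"{gene}: {mutated[gene]}/{total_tumors} = {pct}%")
    print(f"selection threshold: 4/25 = {threshold_percent}%")
```

The rounded values agree with the 39%, 13%, 13% and 20% quoted in the text.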
Affiliation(s)
- Tatiana Cajuso
- Department of Medical Genetics, Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
|
670
|
Ekblom R, Wennekes P, Horsburgh GJ, Burke T. Characterization of the house sparrow (Passer domesticus) transcriptome: a resource for molecular ecology and immunogenetics. Mol Ecol Resour 2014; 14:636-46. [PMID: 24345231 DOI: 10.1111/1755-0998.12213] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 10/04/2013] [Revised: 12/04/2013] [Accepted: 12/11/2013] [Indexed: 11/30/2022]
Abstract
The house sparrow (Passer domesticus) is an important model species in ecology and evolution. However, until recently, genomic resources for molecular ecological projects have been lacking in this species. Here, we present transcriptome sequencing data (RNA-Seq) from three different house sparrow tissues (spleen, blood and bursa). These tissues were specifically chosen to obtain a diverse representation of expressed genes and to maximize the yield of immune-related gene functions. After de novo assembly, 15,250 contigs were identified, representing sequence data from a total of 8756 known avian genes (as inferred from the closely related zebra finch). The transcriptome assembly contains sequence data from nine manually annotated MHC genes, including an almost complete MHC class I coding sequence. There were 407, 303 and 68 genes overexpressed in spleen, blood and bursa, respectively. Gene ontology terms related to ribosomal function were associated with overexpression in spleen and oxygen transport functions with overexpression in blood. In addition to the transcript sequences, we provide 327 gene-linked microsatellites (SSRs) with sufficient flanking sequences for primer design, and 3177 single-nucleotide polymorphisms (SNPs) within genes, which can be used in follow-up molecular ecology studies of this ecologically well-studied species.
Affiliation(s)
- Robert Ekblom
- Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18 D, Uppsala, SE-75236, Sweden; Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK
|
671
|
Brinkmeyer-Langford C, Kornegay JN. Comparative Genomics of X-linked Muscular Dystrophies: The Golden Retriever Model. Curr Genomics 2014; 14:330-42. [PMID: 24403852 PMCID: PMC3763684 DOI: 10.2174/13892029113149990004] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 05/17/2013] [Revised: 07/16/2013] [Accepted: 07/19/2013] [Indexed: 12/30/2022] Open
Abstract
Duchenne muscular dystrophy (DMD) is a devastating disease that dramatically decreases the lifespan and abilities of affected young people. The primary molecular cause of the disease is the absence of functional dystrophin protein, which is critical to proper muscle function. Those with DMD vary in disease presentation and dystrophin mutation; the same causal mutation may be associated with drastically different levels of disease severity. Also contributing to this variation are the influences of additional modifying genes and/or changes in functional elements governing such modifiers. This genetic heterogeneity complicates the efficacy of treatment methods and to date medical interventions are limited to treating symptoms. Animal models of DMD have been instrumental in teasing out the intricacies of DMD disease and hold great promise for advancing knowledge of its variable presentation and treatment. This review addresses the utility of comparative genomics in elucidating the complex background behind phenotypic variation in a canine model of DMD, Golden Retriever muscular dystrophy (GRMD). This knowledge can be exploited in the development of improved, more personalized treatments for DMD patients, such as therapies that can be tailor-matched to the disease course and genomic background of individual patients.
Affiliation(s)
- Candice Brinkmeyer-Langford
- Texas A&M University College of Veterinary Medicine, Dept. of Veterinary Integrative Biosciences - Mailstop 4458, College Station, Texas, U.S.A. 77843-4458
| | - Joe N Kornegay
- Texas A&M University College of Veterinary Medicine, Dept. of Veterinary Integrative Biosciences - Mailstop 4458, College Station, Texas, U.S.A. 77843-4458
|
672
|
Williams TD, Mirbahai L, Chipman JK. The toxicological application of transcriptomics and epigenomics in zebrafish and other teleosts. Brief Funct Genomics 2014; 13:157-71. [DOI: 10.1093/bfgp/elt053] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Indexed: 02/01/2023] Open
|
673
|
Klein HU, Schäfer M, Porse BT, Hasemann MS, Ickstadt K, Dugas M. Integrative analysis of histone ChIP-seq and transcription data using Bayesian mixture models. Bioinformatics 2014; 30:1154-1162. [PMID: 24403540 DOI: 10.1093/bioinformatics/btu003] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 08/20/2013] [Accepted: 12/30/2013] [Indexed: 01/08/2023]
Abstract
MOTIVATION: Histone modifications are a key epigenetic mechanism to activate or repress the transcription of genes. Datasets of matched transcription data and histone modification data obtained by ChIP-seq exist, but methods for integrative analysis of both data types are still rare. Here, we present a novel bioinformatics approach to detect genes that show different transcript abundances between two conditions putatively caused by alterations in histone modification. RESULTS: We introduce a correlation measure for integrative analysis of ChIP-seq and gene transcription data measured by RNA sequencing or microarrays and demonstrate that a proper normalization of ChIP-seq data is crucial. We suggest applying Bayesian mixture models of different types of distributions to further study the distribution of the correlation measure. The implicit classification of the mixture models is used to detect genes with differences between two conditions in both gene transcription and histone modification. The method is applied to different datasets, and its superiority to a naive separate analysis of both data types is demonstrated. AVAILABILITY AND IMPLEMENTATION: R/Bioconductor package epigenomix. CONTACT: h.klein@uni-muenster.de Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hans-Ulrich Klein
- Institute of Medical Informatics, University of Münster, D-48149 Münster, Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany, The Finsen Laboratory, Rigshospitalet, Faculty of Health Sciences, Biotech Research and Innovation Center (BRIC), Danish Stem Cell Centre (DanStem), Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark and Faculty of Statistics, TU Dortmund University, D-44221 Dortmund, Germany
| | - Martin Schäfer
- Institute of Medical Informatics, University of Münster, D-48149 Münster, Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany, The Finsen Laboratory, Rigshospitalet, Faculty of Health Sciences, Biotech Research and Innovation Center (BRIC), Danish Stem Cell Centre (DanStem), Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark and Faculty of Statistics, TU Dortmund University, D-44221 Dortmund, Germany
| | - Bo T Porse
- Institute of Medical Informatics, University of Münster, D-48149 Münster, Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany, The Finsen Laboratory, Rigshospitalet, Faculty of Health Sciences, Biotech Research and Innovation Center (BRIC), Danish Stem Cell Centre (DanStem), Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark and Faculty of Statistics, TU Dortmund University, D-44221 Dortmund, Germany
| | - Marie S Hasemann
- Institute of Medical Informatics, University of Münster, D-48149 Münster, Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany, The Finsen Laboratory, Rigshospitalet, Faculty of Health Sciences, Biotech Research and Innovation Center (BRIC), Danish Stem Cell Centre (DanStem), Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark and Faculty of Statistics, TU Dortmund University, D-44221 Dortmund, Germany
| | - Katja Ickstadt
- Institute of Medical Informatics, University of Münster, D-48149 Münster, Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany, The Finsen Laboratory, Rigshospitalet, Faculty of Health Sciences, Biotech Research and Innovation Center (BRIC), Danish Stem Cell Centre (DanStem), Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark and Faculty of Statistics, TU Dortmund University, D-44221 Dortmund, Germany
| | - Martin Dugas
- Institute of Medical Informatics, University of Münster, D-48149 Münster, Mathematical Institute, Heinrich Heine University, D-40225 Düsseldorf, Germany, The Finsen Laboratory, Rigshospitalet, Faculty of Health Sciences, Biotech Research and Innovation Center (BRIC), Danish Stem Cell Centre (DanStem), Faculty of Health Sciences, University of Copenhagen, DK-2200 Copenhagen, Denmark and Faculty of Statistics, TU Dortmund University, D-44221 Dortmund, Germany
|
674
|
Hanikenne M, Baurain D. Origin and evolution of metal P-type ATPases in Plantae (Archaeplastida). Front Plant Sci 2014; 4:544. [PMID: 24575101 PMCID: PMC3922081 DOI: 10.3389/fpls.2013.00544] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 07/15/2013] [Accepted: 12/12/2013] [Indexed: 05/22/2023]
Abstract
Metal ATPases are a subfamily of P-type ATPases involved in the transport of metal cations across biological membranes. They all share an architecture featuring eight transmembrane domains in pairs of two and are found in prokaryotes as well as in a variety of Eukaryotes. In Arabidopsis thaliana, eight metal P-type ATPases have been described, four being specific to copper transport and four displaying a broader metal specificity, including zinc, cadmium, and possibly copper and calcium. So far, few efforts have been devoted to elucidating the origin and evolution of these proteins in Eukaryotes. In this work, we use large-scale phylogenetics to show that metal P-type ATPases form a homogenous group among P-type ATPases and that their specialization into either monovalent (Cu) or divalent (Zn, Cd…) metal transport stems from a gene duplication that took place early in the evolution of Life. Then, we demonstrate that the four subgroups of plant metal ATPases all have a different evolutionary origin and a specific taxonomic distribution, only one tracing back to the cyanobacterial progenitor of the chloroplast. Finally, we examine the subsequent evolution of these proteins in green plants and conclude that the genes thoroughly characterized in model organisms are often the result of lineage-specific gene duplications, which calls for caution when attempting to infer function from sequence similarity alone in non-model organisms.
Affiliation(s)
- Marc Hanikenne
- Functional Genomics and Plant Molecular Imaging, Department of Life Sciences, Center for Protein Engineering (CIP), University of Liège, Liège, Belgium
- PhytoSYSTEMS, University of Liège, Liège, Belgium
| | - Denis Baurain
- PhytoSYSTEMS, University of Liège, Liège, Belgium
- Eukaryotic Phylogenomics, Department of Life Sciences, University of Liège, Liège, Belgium
|
675
|
Abstract
Genetical genomics has been suggested as a powerful approach to study the genotype-phenotype gap. However, the relatively low power of these experiments (usually related to the high cost) has hindered fulfillment of its promise, especially for quantitative trait loci (QTL) of moderate effects. One strategy with which to overcome the issue is to use a targeted approach. It has two clear advantages: (i) it reduces the problem to a simple comparison between different genotypic groups at the QTL and (ii) it is a good starting point from which to investigate downstream effects of the QTL. In this study, from 698 F2 birds used for QTL mapping, gene expression profiles of 24 birds with divergent homozygous QTL genotypes were investigated. The targeted QTL was on chromosome 1 and affected initial pH of breast muscle. The biological mechanisms controlling this trait can be similar to those affecting malignant hyperthermia or muscle fatigue in humans. The gene expression study identified 10 strong local signals that were markedly more significant than any genes in the rest of the genome. The differentially expressed genes all mapped to a region <1 Mb, suggesting a remarkable reduction of the QTL interval. These results, combined with analysis of downstream effects of the QTL using gene network analysis, suggest that the QTL is controlling pH by governing oxidative stress. The results were reproducible with use of as few as four microarrays on pooled samples (with lower significance level). The results demonstrate that this cost-effective approach is promising for characterization of QTL.
|
676
|
Woo S, Cha SW, Merrihew G, He Y, Castellana N, Guest C, MacCoss M, Bafna V. Proteogenomic database construction driven from large scale RNA-seq data. J Proteome Res 2014; 13:21-8. [PMID: 23802565 PMCID: PMC4034692 DOI: 10.1021/pr400294c] [Citation(s) in RCA: 96] [Impact Index Per Article: 9.6] [Indexed: 12/28/2022]
Abstract
The advent of inexpensive RNA-seq technologies and other deep sequencing technologies for RNA has the promise to radically improve genomic annotation, providing information on transcribed regions and splicing events in a variety of cellular conditions. Using MS-based proteogenomics, many of these events can be confirmed directly at the protein level. However, the integration of large amounts of redundant RNA-seq data and mass spectrometry data poses a challenging problem. Our paper addresses this by construction of a compact database that contains all useful information expressed in RNA-seq reads. Applying our method to cumulative C. elegans data reduced 496.2 GB of aligned RNA-seq SAM files to 410 MB of splice graph database written in FASTA format. This corresponds to 1000× compression of data size, without loss of sensitivity. We performed a proteogenomics study using the custom data set, using a completely automated pipeline, and identified a total of 4044 novel events, including 215 novel genes, 808 novel exons, 12 alternative splicings, 618 gene-boundary corrections, 245 exon-boundary changes, 938 frame shifts, 1166 reverse strands, and 42 translated UTRs. Our results highlight the usefulness of transcript + proteomic integration for improved genome annotations.
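The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes), which the abstract does not state; the variable names are illustrative only.

```python
# Sizes as quoted in the abstract; decimal units are an assumption.
raw_bytes = 496.2e9   # aligned RNA-seq SAM files, 496.2 GB
db_bytes = 410e6      # splice graph database in FASTA format, 410 MB

# Compression ratio implied by the two sizes.
ratio = raw_bytes / db_bytes

# Novel event counts by category, as listed in the abstract.
novel_events = {
    "novel genes": 215,
    "novel exons": 808,
    "alternative splicings": 12,
    "gene-boundary corrections": 618,
    "exon-boundary changes": 245,
    "frame shifts": 938,
    "reverse strands": 1166,
    "translated UTRs": 42,
}
total_events = sum(novel_events.values())

print(f"compression ratio: about {ratio:.0f}x")
print(f"total novel events: {total_events}")
```

Under that assumption the ratio comes out slightly above 1200, which is consistent with the abstract's rounded 1000× figure, and the eight event categories sum to the stated total of 4044.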
Affiliation(s)
- Sunghee Woo
  - Department of Electrical and Computing Engineering, University of California, San Diego
- Seong Won Cha
  - Department of Electrical and Computing Engineering, University of California, San Diego
- Gennifer Merrihew
  - University of Washington, Department of Genome Sciences, Seattle, WA, USA
- Yupeng He
  - Department of Bioinformatics and Systems Biology, University of California, San Diego
- Clark Guest
  - Department of Electrical and Computing Engineering, University of California, San Diego
- Michael MacCoss
  - University of Washington, Department of Genome Sciences, Seattle, WA, USA
- Vineet Bafna
  - Department of Computer Science, University of California, San Diego
677
Mootha VV, Gong X, Ku HC, Xing C. Association and familial segregation of CTG18.1 trinucleotide repeat expansion of TCF4 gene in Fuchs' endothelial corneal dystrophy. Invest Ophthalmol Vis Sci 2014; 55:33-42. [PMID: 24255041 DOI: 10.1167/iovs.13-12611] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Indexed: 12/30/2022] Open
Abstract
PURPOSE: We tested the association between two intronic polymorphisms (CTG18.1 and rs613872) in TCF4 and Fuchs' endothelial corneal dystrophy (FECD), and analyzed their segregation patterns in families. METHODS: We recruited 120 unrelated Caucasian subjects with FECD and 100 controls. Available family members of probands were recruited. Genotyping of the single nucleotide polymorphism (SNP) rs613872 was performed using Sanger sequencing or real-time allelic discrimination assay. The trinucleotide repeat polymorphism, CTG18.1, was genotyped using a combination of short tandem repeat assay and triplet repeat primed PCR assay. The cytosine-thymine-guanine (CTG) repeat length of ≥40 was classified as an expanded CTG18.1 allele. Association of the two loci with FECD was evaluated. Segregation in 29 families was examined. RESULTS: The two polymorphisms are in linkage disequilibrium (r² = 0.65 in cases and 0.31 in controls). Significant associations were found between FECD and rs613872 (P = 3.1 × 10⁻¹⁷), expanded CTG18.1 allele (P = 6.5 × 10⁻²⁵), and their haplotypes (P = 5.9 × 10⁻¹⁹). The odds ratio (OR) of each copy of the rs613872 G allele for FECD was estimated to be 9.5 (95% confidence interval [CI], 5.1-17.5). The OR of each copy of the CTG18.1 expanded allele was estimated to be 32.3 (95% CI, 13.4-77.6). The expanded CTG18.1 allele cosegregated with the trait in 52% (15/29) of families with complete penetrance and 10% (3/29) with incomplete penetrance. CONCLUSIONS: We report, to our knowledge, the first independent replication of the expanded CTG18.1 allele conferring significant risk for FECD (>30-fold increase). The expanded allele cosegregates with the trait with complete penetrance in a majority of families, but we also document cases of incomplete penetrance.
Affiliation(s)
- V Vinod Mootha
  - University of Texas Southwestern Medical Center, Department of Ophthalmology, Dallas, Texas
678
Cui J, Zhao W, Huang Z, Jarvis ED, Gilbert MTP, Walker PJ, Holmes EC, Zhang G. Low frequency of paleoviral infiltration across the avian phylogeny. Genome Biol 2014; 15:539. [PMID: 25496498 PMCID: PMC4272516 DOI: 10.1186/s13059-014-0539-3] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 02/11/2014] [Accepted: 11/10/2014] [Indexed: 01/30/2023] Open
Abstract
BACKGROUND: Mammalian genomes commonly harbor endogenous viral elements. Due to a lack of comparable genome-scale sequence data, far less is known about endogenous viral elements in avian species, even though their small genomes may enable important insights into the patterns and processes of endogenous viral element evolution. RESULTS: Through a systematic screening of the genomes of 48 species sampled across the avian phylogeny we reveal that birds harbor a limited number of endogenous viral elements compared to mammals, with only five viral families observed: Retroviridae, Hepadnaviridae, Bornaviridae, Circoviridae, and Parvoviridae. All nonretroviral endogenous viral elements are present at low copy numbers and in few species, with only endogenous hepadnaviruses widely distributed, although these have been purged in some cases. We also provide the first evidence for endogenous bornaviruses and circoviruses in avian genomes, although at very low copy numbers. A comparative analysis of vertebrate genomes revealed a simple linear relationship between endogenous viral element abundance and host genome size, such that the occurrence of endogenous viral elements in bird genomes is 6- to 13-fold less frequent than in mammals. CONCLUSIONS: These results reveal that avian genomes harbor relatively small numbers of endogenous viruses, particularly those derived from RNA viruses, and hence are either less susceptible to viral invasions or purge them more effectively.
Affiliation(s)
- Jie Cui
  - Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Biological Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW 2006 Australia
  - Program in Emerging Infectious Diseases, Duke-NUS Graduate Medical School, Singapore, 169857 Singapore
- Wei Zhao
  - China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Zhiyong Huang
  - China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
- Erich D Jarvis
  - Howard Hughes Medical Institute, Duke University Medical Center, Department of Neurobiology, Box 3209, Durham, North Carolina 27710 USA
- M Thomas P Gilbert
  - Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, DK-1350 Copenhagen, Denmark
  - Trace and Environmental DNA Laboratory, Department of Environment and Agriculture, Curtin University, Perth, Western Australia 6102 Australia
- Peter J Walker
  - CSIRO Animal, Food and Health Sciences, Australian Animal Health Laboratory, Geelong, Victoria 3220 Australia
- Edward C Holmes
  - Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Biological Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW 2006 Australia
- Guojie Zhang
  - China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China
  - Centre for Social Evolution, Department of Biology, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen, Denmark
679
Chen YL, Chen CM, Pai TW, Leong HW, Chong KF. Homologous synteny block detection based on suffix tree algorithms. J Bioinform Comput Biol 2014; 11:1343004. [PMID: 24372033 DOI: 10.1142/s021972001343004x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
A synteny block represents a set of contiguous genes located within the same chromosome and well conserved among various species. Through long evolutionary processes and genome rearrangement events, large numbers of synteny blocks remain highly conserved across multiple species. Understanding the distribution of conserved gene blocks helps evolutionary biologists trace the diversity of life, and it also plays an important role in orthologous gene detection and gene annotation in the genomic era. In this work, we focus on collinear synteny detection, in which gene order is required to be well conserved among multiple species. To achieve this goal, suffix tree based algorithms for efficiently identifying homologous synteny blocks are proposed. The traditional suffix tree algorithm was modified by treating a chromosome as a string in which each gene is encoded as a symbol character. Hence, a suffix tree can be built for different query chromosomes from various species. We can then efficiently search for conserved synteny blocks, which are modeled as overlapped contiguous edges in our suffix tree. In addition, we defined a novel Synteny Block Conserved Index (SBCI) to evaluate the relationship of synteny block distribution between two species, which could be applied as an evolutionary indicator for constructing a phylogenetic tree from multiple species without the large computational requirements of whole genome sequence alignment.
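The encoding described in this abstract (a chromosome as a string whose symbols are genes) can be illustrated with a small sketch. This is not the authors' suffix tree method: it substitutes a simple quadratic dynamic-programming scan for maximal shared runs, which finds the same kind of collinear block on small inputs but has none of the suffix tree's efficiency. The function name, gene labels, and `min_len` threshold are hypothetical.

```python
from typing import List, Tuple


def collinear_blocks(
    chrom_a: List[str], chrom_b: List[str], min_len: int = 2
) -> List[Tuple[int, int, int]]:
    """Return maximal runs of genes shared in the same order.

    Each result is (start_in_a, start_in_b, length), using 0-based indices.
    """
    n, m = len(chrom_a), len(chrom_b)
    # run[i][j] = length of the shared run ending at chrom_a[i-1], chrom_b[j-1]
    run = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if chrom_a[i - 1] == chrom_b[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1

    blocks = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            length = run[i][j]
            # A run is maximal when the next gene pair does not extend it.
            extendable = i < n and j < m and run[i + 1][j + 1] > 0
            if length >= min_len and not extendable:
                blocks.append((i - length, j - length, length))
    return blocks


# Toy chromosomes: each list element is one gene symbol.
species_a = ["g1", "g2", "g3", "g4", "g5"]
species_b = ["g9", "g2", "g3", "g4", "g7", "g5"]
blocks = collinear_blocks(species_a, species_b)
print(blocks)  # [(1, 1, 3)] -> genes g2, g3, g4 in the same order
```

The single shared gene `g5` is ignored because it falls below `min_len`; only the three-gene run is reported as a block.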
Affiliation(s)
- Yu-Lun Chen
  - Department of Computer Science and Engineering and Center of Excellence for the Oceans, National Taiwan Ocean University, No. 2 Peining Road, Keelung, Taiwan 20224, Republic of China
680
Heinzel A, Mühlberger I, Fechete R, Mayer B, Perco P. Functional molecular units for guiding biomarker panel design. Methods Mol Biol 2014; 1159:109-133. [PMID: 24788264 DOI: 10.1007/978-1-4939-0709-0_7] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 06/03/2023]
Abstract
The field of biomarker research has experienced a major boost in recent years, and the number of publications on biomarker studies evaluating given biomarker candidates, but also proposing novel ones, is increasing rapidly for numerous clinically relevant disease areas. However, individual markers often lack sensitivity and specificity in the clinical context, resting essentially on the intra-individual phenotype variability hampering sensitivity, or on assessing more general processes downstream of the causative molecular events characterizing a disease term, in consequence impairing disease specificity. The trend to circumvent these shortcomings goes towards utilizing multimarker panels, thus combining the strength of individual markers to further enhance performance regarding both sensitivity and specificity. A way of identifying the optimal composition of individual markers in a panel approach is to pick each marker as representative for a specific pathophysiological (mechanistic) process relevant for the disease under investigation, hence resulting in a multimarker panel for covering the set of pathophysiological processes underlying the frequently multifactorial composition of a clinical phenotype. Here we outline a procedure of identifying such sets of disease-specific pathophysiological processes (units) delineated on the basis of disease-associated molecular feature lists derived from literature mining as well as aggregated, publicly available Omics profiling experiments. With such molecular units in hand, providing an improved reflection of a specific clinical phenotype, biomarker candidates can then be assigned to these units, or novel candidates selected from them, subsequently resulting in a multimarker panel promising improved accuracy in disease diagnosis as well as prognosis.
Affiliation(s)
- Andreas Heinzel
  - emergentec biodevelopment GmbH, Gersthofer Strasse 29-31, 1180, Vienna, Austria
681
Koscielny G, Yaikhom G, Iyer V, Meehan TF, Morgan H, Atienza-Herrero J, Blake A, Chen CK, Easty R, Di Fenza A, Fiegel T, Grifiths M, Horne A, Karp NA, Kurbatova N, Mason JC, Matthews P, Oakley DJ, Qazi A, Regnart J, Retha A, Santos LA, Sneddon DJ, Warren J, Westerberg H, Wilson RJ, Melvin DG, Smedley D, Brown SDM, Flicek P, Skarnes WC, Mallon AM, Parkinson H. The International Mouse Phenotyping Consortium Web Portal, a unified point of access for knockout mice and related phenotyping data. Nucleic Acids Res 2014; 42:D802-9. [PMID: 24194600 PMCID: PMC3964955 DOI: 10.1093/nar/gkt977] [Citation(s) in RCA: 207] [Impact Index Per Article: 20.7] [Received: 08/15/2013] [Revised: 09/20/2013] [Accepted: 10/01/2013] [Indexed: 12/21/2022] Open
Abstract
The International Mouse Phenotyping Consortium (IMPC) web portal (http://www.mousephenotype.org) provides the biomedical community with a unified point of access to mutant mice and a rich collection of related emerging and existing mouse phenotype data. IMPC mouse clinics worldwide follow rigorous, highly structured and standardized protocols for the experimentation, collection and dissemination of data. Dedicated 'data wranglers' work with each phenotyping center to collate data and perform quality control of data. An automated statistical analysis pipeline has been developed to identify knockout strains with a significant change in the phenotype parameters. Annotation with biomedical ontologies allows biologists and clinicians to easily find mouse strains with phenotypic traits relevant to their research. Data integration with other resources will provide insights into mammalian gene function and human disease. As phenotype data become available for every gene in the mouse, the IMPC web portal will become an invaluable tool for researchers studying the genetic contributions of genes to human diseases.
Affiliation(s)
- Gautier Koscielny, Gagarine Yaikhom, Vivek Iyer, Terrence F. Meehan, Hugh Morgan, Julian Atienza-Herrero, Andrew Blake, Chao-Kung Chen, Richard Easty, Armida Di Fenza, Tanja Fiegel, Mark Grifiths, Alan Horne, Natasha A. Karp, Natalja Kurbatova, Jeremy C. Mason, Peter Matthews, Darren J. Oakley, Asfand Qazi, Jack Regnart, Ahmad Retha, Luis A. Santos, Duncan J. Sneddon, Jonathan Warren, Henrik Westerberg, Robert J. Wilson, David G. Melvin, Damian Smedley, Steve D. M. Brown, Paul Flicek, William C. Skarnes, Ann-Marie Mallon, Helen Parkinson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK and Mouse Informatics Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
682
Norman JD, Ferguson MM, Danzmann RG. An integrated transcriptomic and comparative genomic analysis of differential gene expression in Arctic charr (Salvelinus alpinus) following seawater exposure. J Exp Biol 2014; 217:4029-42. [DOI: 10.1242/jeb.107441] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 01/23/2023]
Abstract
High-throughput RNA sequencing was employed to compare expression profiles in two Arctic charr (Salvelinus alpinus) families post seawater exposure to identify genes and biological processes involved in hypo-osmoregulation and regulation of salinity tolerance. To further understand the genetic architecture of hypo-osmoregulation, the genomic organization of differentially expressed (DE) genes was also analysed. Using a de novo gill transcriptome assembly we found over 2300 contigs to be DE. Major transporters from the seawater mitochondrion-rich cell (MRC) complex were up-regulated in seawater. Expression ratios for 257 differentially expressed contigs were highly correlated between families, suggesting they are strictly regulated. Based on expression profiles and known molecular pathways we inferred that seawater exposure induced changes in methylation states and elevated peroxynitrite formation in gill. We hypothesized that concomitance between DE immune genes and the transition to a hypo-osmoregulatory state could be related to Cl- sequestration by antimicrobial defence mechanisms. Gene Ontology analysis revealed that cell division genes were up-regulated, which could reflect the proliferation of ATP1α1b-type seawater MRCs. Comparative genomics analyses suggest that hypo-osmoregulation is influenced by the relative proximities among a contingent of genes on Arctic charr linkage groups AC-4 and AC-12 that exhibit homologous affinities with a region on stickleback chromosome Ga-I. This supports the hypothesis that relative gene location along a chromosome is a property of the genetic architecture of hypo-osmoregulation. Evidence of non-random structure between hypo-osmoregulation candidate genes was found on AC-1/11 and AC-28, suggesting that interchromosomal rearrangements played a role in the evolution of hypo-osmoregulation in Arctic charr.
683
Abstract
The mission of the Universal Protein Resource (UniProt) (http://www.uniprot.org) is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequences and functional annotation. It integrates, interprets and standardizes data from literature and numerous resources to achieve the most comprehensive catalog possible of protein information. The central activities are the biocuration of the UniProt Knowledgebase and the dissemination of these data through our Web site and web services. UniProt is produced by the UniProt Consortium, which consists of groups from the European Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR). UniProt is updated and distributed every 4 weeks and can be accessed online for searches or downloads.
Collapse
Affiliation(s)
- The UniProt Consortium
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland, Protein Information Resource, Georgetown University Medical Center, 3300 Whitehaven Street North West, Suite 1200, Washington, DC 20007, USA and Protein Information Resource, University of Delaware, 15 Innovation Way, Suite 205, Newark, DE 19711, USA
684
Rehfeld A, Plass M, Døssing K, Knigge U, Kjær A, Krogh A, Friis-Hansen L. Alternative polyadenylation of tumor suppressor genes in small intestinal neuroendocrine tumors. Front Endocrinol (Lausanne) 2014; 5:46. [PMID: 24782827 PMCID: PMC3995063 DOI: 10.3389/fendo.2014.00046] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 02/09/2014] [Accepted: 03/22/2014] [Indexed: 12/20/2022]
Abstract
The tumorigenesis of small intestinal neuroendocrine tumors (SI-NETs) is poorly understood. Recent studies have associated alternative polyadenylation (APA) with proliferation, cell transformation, and cancer. Polyadenylation is the process in which the pre-messenger RNA is cleaved at a polyA site and a polyA tail is added. Genes with two or more polyA sites can undergo APA. This produces two or more distinct mRNA isoforms with different 3' untranslated regions. Additionally, APA can also produce mRNAs containing different 3'-terminal coding regions. Therefore, APA alters both the repertoire and the expression level of proteins. Here, we used high-throughput sequencing data to map polyA sites and characterize polyadenylation genome-wide in three SI-NETs and a reference sample. In the tumors, 16 genes showed significant changes of APA pattern, which lead to the 3' truncation of either mRNA coding regions or 3' untranslated regions. Among these, 11 genes had been previously associated with cancer, with 4 genes being known tumor suppressors: DCC, PDZD2, MAGI1, and DACT2. We validated the APA in three out of three cases with quantitative real-time PCR. Our findings suggest that changes of APA pattern in these 16 genes could be involved in the tumorigenesis of SI-NETs. Furthermore, they also point to APA as a new target for both the diagnosis and treatment of SI-NETs. The identified genes with APA specific to the SI-NETs could be further tested as diagnostic markers and drug targets for disease prevention and treatment.
Affiliation(s)
- Anders Rehfeld
- Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Mireya Plass
- Department of Biology, The Bioinformatics Centre, University of Copenhagen, Copenhagen, Denmark
- Kristina Døssing
- Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Ulrich Knigge
- Department of Surgical Gastroenterology and Endocrinology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Andreas Kjær
- Department of Clinical Physiology, Nuclear Medicine and PET, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Anders Krogh
- Department of Biology, The Bioinformatics Centre, University of Copenhagen, Copenhagen, Denmark
- Lennart Friis-Hansen
- Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- *Correspondence: Lennart Friis-Hansen, Center for Genomic Medicine, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, Copenhagen DK 2100, Denmark e-mail:
685
Karolchik D, Barber GP, Casper J, Clawson H, Cline MS, Diekhans M, Dreszer TR, Fujita PA, Guruvadoo L, Haeussler M, Harte RA, Heitner S, Hinrichs AS, Learned K, Lee BT, Li CH, Raney BJ, Rhead B, Rosenbloom KR, Sloan CA, Speir ML, Zweig AS, Haussler D, Kuhn RM, Kent WJ. The UCSC Genome Browser database: 2014 update. Nucleic Acids Res 2014; 42:D764-70. [PMID: 24270787 PMCID: PMC3964947 DOI: 10.1093/nar/gkt1168] [Citation(s) in RCA: 550] [Impact Index Per Article: 55.0] [Received: 09/18/2013] [Revised: 10/30/2013] [Accepted: 10/30/2013] [Indexed: 12/17/2022]
Abstract
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a large collection of organisms, primarily vertebrates, with an emphasis on the human and mouse genomes. The Browser's web-based tools provide an integrated environment for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic data sets. As of September 2013, the database contained genomic sequence and a basic set of annotation 'tracks' for ∼90 organisms. Significant new annotations include a 60-species multiple alignment conservation track on the mouse, updated UCSC Genes tracks for human and mouse, and several new sets of variation and ENCODE data. New software tools include a Variant Annotation Integrator that returns predicted functional effects of a set of variants uploaded as a custom track, an extension to UCSC Genes that displays haplotype alleles for protein-coding genes and an expansion of data hubs that includes the capability to display remotely hosted user-provided assembly sequence in addition to annotation data. To improve European access, we have added a Genome Browser mirror (http://genome-euro.ucsc.edu) hosted at Bielefeld University in Germany.
Affiliation(s)
- Donna Karolchik, Galt P. Barber, Jonathan Casper, Hiram Clawson, Melissa S. Cline, Mark Diekhans, Timothy R. Dreszer, Pauline A. Fujita, Luvina Guruvadoo, Maximilian Haeussler, Rachel A. Harte, Steve Heitner, Angie S. Hinrichs, Katrina Learned, Brian T. Lee, Chin H. Li, Brian J. Raney, Brooke Rhead, Kate R. Rosenbloom, Cricket A. Sloan, Matthew L. Speir, Ann S. Zweig, David Haussler, Robert M. Kuhn, W. James Kent
- Center for Biomolecular Science and Engineering, School of Engineering, University of California Santa Cruz (UCSC), 1156 High Street, Santa Cruz, CA 95064, USA, Computational Biology Graduate Group, University of California Berkeley, Berkeley, CA 94720, USA, Department of Genetics, Stanford University School of Medicine, 3165 Porter Drive, Stanford, CA 94305, USA and Howard Hughes Medical Institute, Center for Biomolecular Science and Engineering, UCSC, 1156 High Street, Santa Cruz, CA 95064, USA
686
Stenson PD, Mort M, Ball EV, Shaw K, Phillips AD, Cooper DN. The Human Gene Mutation Database: building a comprehensive mutation repository for clinical and molecular genetics, diagnostic testing and personalized genomic medicine. Hum Genet 2014; 133:1-9. [PMID: 24077912 PMCID: PMC3898141 DOI: 10.1007/s00439-013-1358-4] [Citation(s) in RCA: 1005] [Impact Index Per Article: 100.5] [Received: 07/29/2013] [Accepted: 09/03/2013] [Indexed: 12/12/2022]
Abstract
The Human Gene Mutation Database (HGMD®) is a comprehensive collection of germline mutations in nuclear genes that underlie, or are associated with, human inherited disease. By June 2013, the database contained over 141,000 different lesions detected in over 5,700 different genes, with new mutation entries currently accumulating at a rate exceeding 10,000 per annum. HGMD was originally established in 1996 for the scientific study of mutational mechanisms in human genes. However, it has since acquired a much broader utility as a central unified disease-oriented mutation repository utilized by human molecular geneticists, genome scientists, molecular biologists, clinicians and genetic counsellors as well as by those specializing in biopharmaceuticals, bioinformatics and personalized genomics. The public version of HGMD (http://www.hgmd.org) is freely available to registered users from academic institutions/non-profit organizations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via BIOBASE GmbH.
Affiliation(s)
- Peter D. Stenson, Matthew Mort, Edward V. Ball, Katy Shaw, Andrew D. Phillips, David N. Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, CF14 4XN UK
687
Guo L, Du Y, Chang S, Zhang K, Wang J. rSNPBase: a database for curated regulatory SNPs. Nucleic Acids Res 2014; 42:D1033-9. [PMID: 24285297 PMCID: PMC3964952 DOI: 10.1093/nar/gkt1167] [Citation(s) in RCA: 94] [Impact Index Per Article: 9.4] [Received: 08/16/2013] [Accepted: 10/30/2013] [Indexed: 01/20/2023]
Abstract
In recent years, human regulatory SNPs (rSNPs) have been widely studied. Here, we present the database rSNPBase, freely available at http://rsnp.psych.ac.cn/, which provides curated rSNPs by analysing the regulatory features of all SNPs in the human genome with reference to experimentally supported regulatory elements. In contrast with previous SNP functional annotation databases, rSNPBase is characterized by several unique features. (i) To improve reliability, all SNPs in rSNPBase are annotated with reference to experimentally supported regulatory elements. (ii) rSNPBase focuses on rSNPs involved in a wide range of regulation types, including proximal and distal transcriptional regulation and post-transcriptional regulation, and identifies their potentially regulated genes. (iii) Linkage disequilibrium (LD) correlations between SNPs were analysed so that the regulatory feature is annotated to an SNP set rather than a single SNP. (iv) rSNPBase provides the spatio-temporal labels and experimental eQTL labels for SNPs. In summary, rSNPBase provides more reliable, comprehensive and user-friendly regulatory annotations on rSNPs and will assist researchers in selecting candidate SNPs for further genetic studies and in exploring causal SNPs for in-depth molecular mechanisms of complex phenotypes.
Affiliation(s)
- Liyuan Guo, Yang Du, Suhua Chang, Kunlin Zhang, Jing Wang
- Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing 100101, China and University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing, 100049, China
688
Alves-Cruzeiro JMDC, Nogales-Cadenas R, Pascual-Montano AD. CentrosomeDB: a new generation of the centrosomal proteins database for Human and Drosophila melanogaster. Nucleic Acids Res 2014; 42:D430-6. [PMID: 24270791 PMCID: PMC3964966 DOI: 10.1093/nar/gkt1126] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 09/13/2013] [Revised: 10/22/2013] [Accepted: 10/24/2013] [Indexed: 01/01/2023]
Abstract
We present the second generation of CentrosomeDB, available online at http://centrosome.cnb.csic.es, with a significant expansion of 1357 human and Drosophila centrosomal genes and their corresponding information. The centrosome of animal cells takes part in important biological processes such as the organization of the interphase microtubule cytoskeleton and the assembly of the mitotic spindle. The active research done during the past decades has produced a large amount of data related to centrosomal proteins. Unfortunately, the accumulated data are dispersed among diverse and heterogeneous sources of information. We believe that the availability of a repository collecting curated evidence of centrosomal proteins would constitute a key resource for the scientific community. This was our first motivation to introduce CentrosomeDB in the NAR database issue in 2009, collecting a set of human centrosomal proteins that were reported in the literature and other sources. The intensive use of this resource during these years has encouraged us to present this new expanded version. Using our database, the researcher is offered the possibility to study the evolution, function and structure of the centrosome. We have compiled information from many sources, including Gene Ontology, disease-association, single nucleotide polymorphisms and associated gene expression experiments. Special attention has been paid to protein-protein interaction.
Affiliation(s)
- Rubén Nogales-Cadenas
- Functional Bioinformatics Group, National Center for Biotechnology-CSIC, Madrid 28049, Spain
689
Liu F, Wei XL, Li H, Wei JF, Wang YQ, Gong XJ. Molecular evolution of the vertebrate FK506 binding protein 25. Int J Genomics 2014; 2014:402603. [PMID: 24724077 PMCID: PMC3958658 DOI: 10.1155/2014/402603] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/25/2013] [Accepted: 01/16/2014] [Indexed: 02/05/2023]
Abstract
FK506 binding proteins (FKBPs) belong to the immunophilins with peptidyl-prolyl isomerase (PPIase) activity. FKBP25 (also known as FKBP3) is one of the nuclear DNA-binding proteins in the FKBP family, which plays an important role in regulating transcription and chromatin structure. The calculation of nonsynonymous and synonymous substitution rates suggested that FKBP25 has undergone purifying selection throughout vertebrate evolution. Moreover, site-specific tests detected no sites under positive selection. Only one PPIase domain was detected by searching FKBP25 sequences against the Pfam and SMART domain databases. Mammalian FKBP25 possesses exon-intron conservation, although conservation in the whole vertebrate lineage is incomplete. The results of this study suggest that purifying selection has driven the evolutionary history of FKBP25, which allows us to discover the complete role of the PPIase domain in the interaction between FKBP25 and nuclear proteins. Moreover, intron alterations during FKBP25 evolution that regulate gene splicing may be involved in the purifying selection.
Affiliation(s)
- Fei Liu
- Department of Pharmacology, China Pharmaceutical University, Nanjing 210009, China
- Research Division of Clinical Pharmacology, The First Affiliated Hospital, Nanjing Medical University, 300 Guangzhou Road, Nanjing 210029, China
- Xiao-Long Wei
- Department of Pathology, Cancer Hospital of Shantou University Medical College, Shantou, China
- Hao Li
- Department of Pharmacology, China Pharmaceutical University, Nanjing 210009, China
- Research Division of Clinical Pharmacology, The First Affiliated Hospital, Nanjing Medical University, 300 Guangzhou Road, Nanjing 210029, China
- Ji-Fu Wei
- Research Division of Clinical Pharmacology, The First Affiliated Hospital, Nanjing Medical University, 300 Guangzhou Road, Nanjing 210029, China
- Yong-Qing Wang
- Research Division of Clinical Pharmacology, The First Affiliated Hospital, Nanjing Medical University, 300 Guangzhou Road, Nanjing 210029, China
- *Yong-Qing Wang: and
- Xiao-Jian Gong
- Department of Pharmacology, China Pharmaceutical University, Nanjing 210009, China
- *Xiao-Jian Gong:
690
Valdes C, Capobianco E. Methods to detect transcribed pseudogenes: RNA-Seq discovery allows learning through features. Methods Mol Biol 2014; 1167:157-83. [PMID: 24823777 DOI: 10.1007/978-1-4939-0835-6_11] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/04/2022]
Abstract
The detection of transcripts and the measurement of their associated activity at the pseudogene scale have recently become important topics of research. Being an integral part of many recent studies aimed at establishing a role for a variety of noncoding RNA structures, pseudogenes have substantially increased in popularity due to the discovery of regulatory properties and complex mechanisms of action that, while requiring further investigation, analysis, and validation, promise as well to have a broad impact on human disease. Currently, there are relatively few methodologies specifically designed to accomplish the detection of pseudogene transcripts, and tools that either replace or integrate manual annotation procedures are very much needed. In particular, it seems to us justified that we engage in advancing the computational treatment of pseudogenes at the whole transcriptome level. Catalogs of human pseudogenes have started to be delivered through RNA-Seq technologies. However, only a limited number of transcriptomes have been covered. Furthermore, while most proposals have led to the production of a targeted algorithm, especially used for detection, few computational pipelines were designed following a comprehensive approach addressing identification and quantification of transcriptional activity within a unifying methodological frame. Given the currently incomplete evidence, the limitations of the impacts due to the lack of extensive testing, and the presence of unsolved uncertainties affecting the reproducibility of results, our motivation for the proposal of a new computational approach is high and timely. We have considered a hybrid approach, based on the assembly of a variety of computational tools, including RNA-Seq methods and machine learning applications, all applied to transcriptome data of various complexities.
Our initial strategy is to provide lists of pseudogenes to be validated against the currently known examples, in order to extend our knowledge further. An ultimate goal that is naturally linked to this work is to provide an automatic approach that analyzes transcriptomes with the goal of detecting candidate pseudogenes through characteristic features and that allows efficient and reproducible pseudogene classification models.
Affiliation(s)
- Camilo Valdes
- Center for Computational Science, University of Miami, Miami, FL, 33146, USA
691
Howard A, Rogers AN. Role of translation initiation factor 4G in lifespan regulation and age-related health. Ageing Res Rev 2014; 13:115-24. [PMID: 24394551 DOI: 10.1016/j.arr.2013.12.008] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 10/11/2013] [Revised: 12/18/2013] [Accepted: 12/23/2013] [Indexed: 01/04/2023]
Abstract
Inhibiting expression of eukaryotic translation initiation factor 4G (eIF4G) arrests normal development but extends lifespan when suppressed during adulthood. In addition to reducing overall translation, inhibition alters the stoichiometry of mRNA translation in favor of genes important for responding to stress and against those associated with growth and reproduction in C. elegans. In humans, aberrant expression of eIF4G is associated with certain forms of cancer and neurodegeneration. Here we review what is known about the roles of eIF4G in molecular, cellular, and organismal contexts. Also discussed are the gaps in understanding of this factor, particularly with regard to the roles of specific forms of expression in individual tissues and the importance of understanding eIF4G for development of potential therapeutic applications.
|
692
|
Brooksbank C, Bergman MT, Apweiler R, Birney E, Thornton J. The European Bioinformatics Institute's data resources 2014. Nucleic Acids Res 2014; 42:D18-25. [PMID: 24271396 PMCID: PMC3964968 DOI: 10.1093/nar/gkt1206] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 09/17/2013] [Revised: 11/01/2013] [Accepted: 11/04/2013] [Indexed: 12/18/2022] Open
Abstract
Molecular Biology has been at the heart of the 'big data' revolution from its very beginning, and the need for access to biological data is a common thread running from the 1965 publication of Dayhoff's 'Atlas of Protein Sequence and Structure' through the Human Genome Project in the late 1990s and early 2000s to today's population-scale sequencing initiatives. The European Bioinformatics Institute (EMBL-EBI; http://www.ebi.ac.uk) is one of three organizations worldwide that provides free access to comprehensive, integrated molecular data sets. Here, we summarize the principles underpinning the development of these public resources and provide an overview of EMBL-EBI's database collection to complement the reviews of individual databases provided elsewhere in this issue.
Affiliation(s)
- Catherine Brooksbank, Mary Todd Bergman, Rolf Apweiler, Ewan Birney, Janet Thornton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
693
|
Zaghlool A, Ameur A, Cavelier L, Feuk L. Splicing in the human brain. International Review of Neurobiology 2014; 116:95-125. [PMID: 25172473 DOI: 10.1016/b978-0-12-801105-8.00005-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 12/13/2022]
Abstract
It has become increasingly clear over the past decade that RNA has important functions in human cells beyond its role as an intermediate translator of DNA to protein. It is now known that RNA plays highly specific roles in pathways involved in regulatory, structural, and catalytic functions. The complexity of RNA production and regulation has become evident with the advent of high-throughput methods to study the transcriptome. Deep sequencing has revealed an enormous diversity of RNA types and transcript isoforms in human cells. The transcriptome of the human brain is particularly interesting as it contains more expressed genes than other tissues and also displays an extreme diversity of transcript isoforms, indicating that highly complex regulatory pathways are present in the brain. Several of these regulatory proteins have now been identified, including RNA-binding proteins that are neuron specific. RNA-binding proteins also play important roles in regulating the splicing process and the temporal and spatial isoform production. While significant progress has been made in understanding the human transcriptome, many questions still remain regarding the basic mechanisms of splicing and subcellular localization of RNA. A long-standing question is to what extent the splicing of pre-mRNA is cotranscriptional rather than posttranscriptional. Recent data, including studies of the human brain, indicate that splicing is primarily cotranscriptional in human cells. This chapter describes the current understanding of splicing and splicing regulation in the human brain and discusses the recent global sequence-based analyses of transcription and splicing.
Affiliation(s)
- Ammar Zaghlool
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden; Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Adam Ameur
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden; Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Lucia Cavelier
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden; Uppsala University Hospital, Uppsala, Sweden
- Lars Feuk
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden; Science for Life Laboratory, Uppsala University, Uppsala, Sweden
|
694
|
Kersey PJ, Allen JE, Christensen M, Davis P, Falin LJ, Grabmueller C, Hughes DST, Humphrey J, Kerhornou A, Khobova J, Langridge N, McDowall MD, Maheswari U, Maslen G, Nuhn M, Ong CK, Paulini M, Pedro H, Toneva I, Tuli MA, Walts B, Williams G, Wilson D, Youens-Clark K, Monaco MK, Stein J, Wei X, Ware D, Bolser DM, Howe KL, Kulesha E, Lawson D, Staines DM. Ensembl Genomes 2013: scaling up access to genome-wide data. Nucleic Acids Res 2014; 42:D546-52. [PMID: 24163254 PMCID: PMC3965094 DOI: 10.1093/nar/gkt979] [Citation(s) in RCA: 180] [Impact Index Per Article: 18.0] [Received: 09/13/2013] [Accepted: 10/01/2013] [Indexed: 12/20/2022] Open
Abstract
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools that allow targeted selection of data for visualization and download are likely to become increasingly important as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes.
Affiliation(s)
- Paul Julian Kersey, James E. Allen, Mikkel Christensen, Paul Davis, Lee J. Falin, Christoph Grabmueller, Daniel Seth Toney Hughes, Jay Humphrey, Arnaud Kerhornou, Julia Khobova, Nicholas Langridge, Mark D. McDowall, Uma Maheswari, Gareth Maslen, Michael Nuhn, Chuang Kee Ong, Michael Paulini, Helder Pedro, Iliana Toneva, Mary Ann Tuli, Brandon Walts, Gareth Williams, Derek Wilson, Ken Youens-Clark, Marcela K. Monaco, Joshua Stein, Xuehong Wei, Doreen Ware, Daniel M. Bolser, Kevin Lee Howe, Eugene Kulesha, Daniel Lawson, Daniel Michael Staines
- The European Molecular Biology Laboratory, The European Bioinformatics Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK, Wellcome Trust Sanger Centre, The Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK, Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA and USDA-ARS, Cornell University, Ithaca, NY, 14853, USA
|
695
|
Webb B, Eswar N, Fan H, Khuri N, Pieper U, Dong G, Sali A. Comparative Modeling of Drug Target Proteins. Reference Module in Chemistry, Molecular Sciences and Chemical Engineering 2014. [PMCID: PMC7157477 DOI: 10.1016/b978-0-12-409547-2.11133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state-of-the-art by a number of specific examples.
|
696
|
Bedoya-Reina OC, Ratan A, Burhans R, Kim HL, Giardine B, Riemer C, Li Q, Olson TL, Loughran TP, Vonholdt BM, Perry GH, Schuster SC, Miller W. Galaxy tools to study genome diversity. Gigascience 2013; 2:17. [PMID: 24377391 PMCID: PMC3877877 DOI: 10.1186/2047-217x-2-17] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 07/24/2013] [Accepted: 12/12/2013] [Indexed: 12/02/2022] Open
Abstract
Background: Intra-species genetic variation can be used to investigate population structure, selection, and gene flow in non-model vertebrates; and due to the plummeting costs for genome sequencing, it is now possible for small labs to obtain full-genome variation data from their species of interest. However, those labs may not have easy access to, and familiarity with, computational tools to analyze those data.
Results: We have created a suite of tools for the Galaxy web server aimed at handling nucleotide and amino-acid polymorphisms discovered by full-genome sequencing of several individuals of the same species, or using a SNP genotyping microarray. In addition to providing user-friendly tools, a main goal is to make published analyses reproducible. While most of the examples discussed in this paper deal with nuclear-genome diversity in non-human vertebrates, we also illustrate the application of the tools to fungal genomes, human biomedical data, and mitochondrial sequences.
Conclusions: This project illustrates that a small group can design, implement, test, document, and distribute a Galaxy tool collection to meet the needs of a particular community of biologists.
Affiliation(s)
- Webb Miller
- Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, PA 16802, USA
|
697
|
Gomes AV. Genetics of proteasome diseases. Scientifica 2013; 2013:637629. [PMID: 24490108 PMCID: PMC3892944 DOI: 10.1155/2013/637629] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 10/20/2013] [Accepted: 11/18/2013] [Indexed: 05/28/2023]
Abstract
The proteasome is a large, multiple subunit complex that is capable of degrading most intracellular proteins. Polymorphisms in proteasome subunits are associated with cardiovascular diseases, diabetes, neurological diseases, and cancer. One polymorphism in the proteasome gene PSMA6 (-8C/G) is associated with three different diseases: type 2 diabetes, myocardial infarction, and coronary artery disease. One type of proteasome, the immunoproteasome, which contains inducible catalytic subunits, is adapted to generate peptides for antigen presentation. It has recently been shown that mutations and polymorphisms in the immunoproteasome catalytic subunit PSMB8 are associated with several inflammatory and autoinflammatory diseases including Nakajo-Nishimura syndrome, CANDLE syndrome, and intestinal M. tuberculosis infection. This comprehensive review describes the disease-related polymorphisms in proteasome genes associated with human diseases and the physiological modulation of proteasome function by these polymorphisms. Given the large number of subunits and the central importance of the proteasome in human physiology as well as the fast pace of detection of proteasome polymorphisms associated with human diseases, it is likely that other polymorphisms in proteasome genes associated with diseases will be detected in the near future. While disease-associated polymorphisms are now readily discovered, the challenge will be to use this genetic information for clinical benefit.
Affiliation(s)
- Aldrin V. Gomes
- Department of Neurobiology, Physiology, and Behavior, University of California, Davis, CA 95616, USA
- Department of Physiology and Membrane Biology, University of California, Davis, CA 95616, USA
|
698
|
Lin J, Kreisberg R, Kallio A, Dudley AM, Nykter M, Shmulevich I, May P, Autio R. POMO--Plotting Omics analysis results for Multiple Organisms. BMC Genomics 2013; 14:918. [PMID: 24365393 PMCID: PMC3880012 DOI: 10.1186/1471-2164-14-918] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/19/2013] [Accepted: 12/18/2013] [Indexed: 12/15/2022] Open
Abstract
Background: Systems biology experiments studying different topics and organisms produce thousands of data values across different types of genomic data. Further, data mining analyses are yielding ranked and heterogeneous results and association networks distributed over the entire genome. The visualization of these results is often difficult and standalone web tools allowing for custom inputs and dynamic filtering are limited.
Results: We have developed POMO (http://pomo.cs.tut.fi), an interactive web-based application to visually explore omics data analysis results and associations in circular, network and grid views. The circular graph represents the chromosome lengths as perimeter segments, as a reference outer ring, such as cytoband for human. The inner arcs between nodes represent the uploaded network. Further, multiple annotation rings, for example depiction of gene copy number changes, can be uploaded as text files and represented as bar, histogram or heatmap rings. POMO has built-in references for human, mouse, nematode, fly, yeast, zebrafish, rice, tomato, Arabidopsis, and Escherichia coli. In addition, POMO provides custom options that allow integrated plotting of unsupported strains or closely related species associations, such as human and mouse orthologs or two yeast wild types, studied together within a single analysis. The web application also supports interactive label and weight filtering. Every iterative filtered result in POMO can be exported as image file and text file for sharing or direct future input.
Conclusions: The POMO web application is a unique tool for omics data analysis, which can be used to visualize and filter the genome-wide networks in the context of chromosomal locations as well as multiple network layouts. With the several illustration and filtering options the tool supports the analysis and visualization of any heterogeneous omics data analysis association results for many organisms. POMO is freely available and does not require any installation or registration.
Affiliation(s)
- Jake Lin
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg, Luxembourg
|
699
|
Luu W, Zerenturk EJ, Kristiana I, Bucknall MP, Sharpe LJ, Brown AJ. Signaling regulates activity of DHCR24, the final enzyme in cholesterol synthesis. J Lipid Res 2013; 55:410-20. [PMID: 24363437 DOI: 10.1194/jlr.m043257] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Indexed: 01/29/2023] Open
Abstract
The role of signaling in regulating cholesterol homeostasis is gradually becoming more widely recognized. Here, we explored how kinases and phosphorylation sites regulate the activity of the enzyme involved in the final step of cholesterol synthesis, 3β-hydroxysterol Δ24-reductase (DHCR24). Many factors are known to regulate DHCR24 transcriptionally, but little is known about its posttranslational regulation. We developed a system to specifically test human ectopic DHCR24 activity in a model cell-line (Chinese hamster ovary-7) using siRNA targeted only to hamster DHCR24, thus ensuring that all activity could be attributed to the human enzyme. We determined the effect of known phosphorylation sites and found that mutating certain residues (T110, Y299, and Y507) inhibited DHCR24 activity. In addition, inhibitors of protein kinase C ablated DHCR24 activity, although not through a known phosphorylation site. Our data indicate a novel mechanism whereby DHCR24 activity is regulated by signaling.
Affiliation(s)
- Winnie Luu
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
|
700
|
Zerbino DR, Johnson N, Juettemann T, Wilder SP, Flicek P. WiggleTools: parallel processing of large collections of genome-wide datasets for visualization and statistical analysis. Bioinformatics 2013; 30:1008-9. [PMID: 24363377 PMCID: PMC3967112 DOI: 10.1093/bioinformatics/btt737] [Citation(s) in RCA: 82] [Impact Index Per Article: 7.5] [Indexed: 11/25/2022] Open
Abstract
Motivation: Using high-throughput sequencing, researchers are now generating hundreds of whole-genome assays to measure various features such as transcription factor binding, histone marks, DNA methylation or RNA transcription. Displaying so much data generally leads to a confusing accumulation of plots. We describe here a multithreaded library that computes statistics on large numbers of datasets (Wiggle, BigWig, Bed, BigBed and BAM), generating statistical summaries within minutes with limited memory requirements, whether on the whole genome or on selected regions.
Availability and Implementation: The code is freely available under Apache 2.0 license at www.github.com/Ensembl/Wiggletools
Contact: zerbino@ebi.ac.uk or flicek@ebi.ac.uk
Affiliation(s)
- Daniel R Zerbino
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|