1
|
Rahman N, O'Cathail C, Zyoud A, Sokolov A, Oude Munnink B, Grüning B, Cummins C, Amid C, Nieuwenhuijse DF, Visontai D, Yuan DY, Gupta D, Prasad DK, Gulyás GM, Rinck G, McKinnon J, Rajan J, Knaggs J, Skiby JE, Stéger J, Szarvas J, Gueye K, Papp K, Hoek M, Kumar M, Ventouratou MA, Bouquieaux MC, Koliba M, Mansurova M, Haseeb M, Worp N, Harrison PW, Leinonen R, Thorne R, Selvakumar S, Hunt S, Venkataraman S, Jayathilaka S, Cezard T, Maier W, Waheed Z, Iqbal Z, Aarestrup FM, Csabai I, Koopmans M, Burdett T, Cochrane G. Mobilisation and analyses of publicly available SARS-CoV-2 data for pandemic responses. Microb Genom 2024; 10:001188. [PMID: 38358325 PMCID: PMC10926692 DOI: 10.1099/mgen.0.001188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 01/14/2024] [Indexed: 02/16/2024]
Abstract
The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts, becoming part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development and drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets, at scale via the SARS-CoV-2 Data Hubs as well as (4) lessons learnt. This paper describes a component of the Platform, the SARS-CoV-2 Data Hubs, which enable the extension and set up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.
Affiliation(s)
- Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Colman O'Cathail
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Ahmad Zyoud
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Alexey Sokolov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Bas Oude Munnink
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Björn Grüning
- University of Freiburg, Friedrichstr. 39, 79098 Freiburg, Germany
| | - Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Clara Amid
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | | | - Dávid Visontai
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
| | - David Yu Yuan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Divyae K. Prasad
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Gábor Máté Gulyás
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - Gabriele Rinck
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Jasmine McKinnon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Jeff Knaggs
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Jeffrey Edward Skiby
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - József Stéger
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
| | - Judit Szarvas
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - Khadim Gueye
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Krisztián Papp
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
| | - Maarten Hoek
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Manish Kumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Marianna A. Ventouratou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | | | - Martin Koliba
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - Milena Mansurova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Nathalie Worp
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Peter W. Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Ross Thorne
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Sandeep Selvakumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Sarah Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Sundar Venkataraman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Timothée Cezard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Wolfgang Maier
- University of Freiburg, Friedrichstr. 39, 79098 Freiburg, Germany
| | - Zahra Waheed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Zamin Iqbal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | | | - Istvan Csabai
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
| | - Marion Koopmans
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| |
|
2
|
Mariani N, Borsini A, Cecil CAM, Felix JF, Sebert S, Cattaneo A, Walton E, Milaneschi Y, Cochrane G, Amid C, Rajan J, Giacobbe J, Sanz Y, Agustí A, Sorg T, Herault Y, Miettunen J, Parmar P, Cattane N, Jaddoe V, Lötjönen J, Buisan C, González Ballester MA, Piella G, Gelpi JL, Lamers F, Penninx BWJH, Tiemeier H, von Tottleben M, Thiel R, Heil KF, Järvelin MR, Pariante C, Mansuy IM, Lekadir K. Identifying causative mechanisms linking early-life stress to psycho-cardio-metabolic multi-morbidity: The EarlyCause project. PLoS One 2021; 16:e0245475. [PMID: 33476328 PMCID: PMC7819604 DOI: 10.1371/journal.pone.0245475] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/25/2020] [Accepted: 11/27/2020] [Indexed: 12/24/2022]
Abstract
Introduction: Depression, cardiovascular diseases and diabetes are among the major non-communicable diseases, leading to significant disability and mortality worldwide. These diseases may share environmental and genetic determinants associated with multimorbid patterns. Stressful early-life events are among the primary factors associated with the development of mental and physical diseases. However, possible causative mechanisms linking early life stress (ELS) with psycho-cardio-metabolic (PCM) multi-morbidity are not well understood. This prevents a full understanding of causal pathways towards the shared risk of these diseases and the development of coordinated preventive and therapeutic interventions. Methods and analysis: This paper describes the study protocol for EarlyCause, a large-scale and inter-disciplinary research project funded by the European Union’s Horizon 2020 research and innovation programme. The project takes advantage of human longitudinal birth cohort data, animal studies and cellular models to test the hypothesis of shared mechanisms and molecular pathways by which ELS shapes an individual’s physical and mental health in adulthood. The study will research in detail how ELS converts into biological signals embedded simultaneously or sequentially in the brain, the cardiovascular and metabolic systems. The research will mainly focus on four biological processes including possible alterations of the epigenome, neuroendocrine system, inflammatome, and the gut microbiome. Life-course models will integrate the role of modifying factors such as sex, socioeconomics, and lifestyle with the goal to better identify groups at risk as well as inform promising strategies to reverse the possible mechanisms and/or reduce the impact of ELS on multi-morbidity development in high-risk individuals. These strategies will help better manage the impact of multi-morbidity on human health and the associated risk.
Affiliation(s)
- Nicole Mariani
- Department of Psychological Medicine, Stress, Psychiatry and Immunology Laboratory, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
| | - Alessandra Borsini
- Department of Psychological Medicine, Stress, Psychiatry and Immunology Laboratory, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
| | - Charlotte A. M. Cecil
- Department of Epidemiology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Department of Child and Adolescent Psychiatry, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Janine F. Felix
- Generation R Study Group, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Sylvain Sebert
- Faculty of Medicine, Center for Life Course Health Research, University of Oulu, Oulu, Finland
- Medical Research Council Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Faculty of Medicine, Department of Metabolism, Digestion and Reproduction, Genomic Medicine, Imperial College London, London, United Kingdom
| | - Annamaria Cattaneo
- IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Biological Psychiatry Laboratory, Brescia, Italy
- Department of Pharmacological and Biomolecular Sciences, University of Milan, Milan, Italy
| | - Esther Walton
- Department of Psychology, University of Bath, Bath, United Kingdom
| | - Yuri Milaneschi
- Department of Psychiatry, Amsterdam UMC/Vrije Universiteit & GGZinGeest, Amsterdam Public Health and Amsterdam Neuroscience Research Institutes, Amsterdam, The Netherlands
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Department of Viroscience, Erasmus Medical Center, Rotterdam, Netherlands
| | - Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Juliette Giacobbe
- Department of Psychological Medicine, Stress, Psychiatry and Immunology Laboratory, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
| | - Yolanda Sanz
- Microbial Ecology, Nutrition and Health Research Group, Institute of Agrochemistry and Food Technology, National Research Council (IATA-CSIC), Valencia, Spain
| | - Ana Agustí
- Microbial Ecology, Nutrition and Health Research Group, Institute of Agrochemistry and Food Technology, National Research Council (IATA-CSIC), Valencia, Spain
| | - Tania Sorg
- Centre Européen de Recherche en Biologie et Médicine, Institut de Génétique et de Biologie Moléculaire et Cellulaire, PHENOMIN-ICS, Université de Strasbourg, CNRS, INSERM, Strasbourg, France
| | - Yann Herault
- Centre Européen de Recherche en Biologie et Médicine, Institut de Génétique et de Biologie Moléculaire et Cellulaire, PHENOMIN-ICS, Université de Strasbourg, CNRS, INSERM, Strasbourg, France
| | - Jouko Miettunen
- Faculty of Medicine, Center for Life Course Health Research, University of Oulu, Oulu, Finland
- Medical Research Center Oulu, Oulu University Hospital and University of Oulu, Oulu, Finland
| | - Priyanka Parmar
- Faculty of Medicine, Center for Life Course Health Research, University of Oulu, Oulu, Finland
| | - Nadia Cattane
- IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Biological Psychiatry Laboratory, Brescia, Italy
| | - Vincent Jaddoe
- Generation R Study Group, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Jyrki Lötjönen
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
| | - Carme Buisan
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
| | - Miguel A. González Ballester
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Gemma Piella
- Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain
| | - Josep L. Gelpi
- Department of Biochemistry and Molecular Biomedicine, Universitat de Barcelona, Barcelona, Spain
| | - Femke Lamers
- Department of Psychology, University of Bath, Bath, United Kingdom
| | | | - Henning Tiemeier
- Department of Social and Behavioral Science, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | | | - Rainer Thiel
- Empirica Communication and Technology Research, Bonn, Germany
| | - Katharina F. Heil
- Departament de Matemàtiques i Informàtica, Universitat de Barcelona, Barcelona, Spain
| | - Marjo-Riitta Järvelin
- Faculty of Medicine, Center for Life Course Health Research, University of Oulu, Oulu, Finland
- Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, United Kingdom
- Unit of Primary Health Care, Oulu University Hospital, OYS, Oulu, Finland
- Department of Life Sciences, College of Health and Life Sciences, Brunel University London, London, United Kingdom
| | - Carmine Pariante
- Department of Psychological Medicine, Stress, Psychiatry and Immunology Laboratory, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom
| | - Isabelle M. Mansuy
- Medical Faculty of the University of Zürich and Department of Health Science and Technology of the ETH Zürich, Laboratory of Neuroepigenetics, Brain Research Institute, Zürich Neuroscience Center, Zürich, Switzerland
| | - Karim Lekadir
- Departament de Matemàtiques i Informàtica, Universitat de Barcelona, Barcelona, Spain
| |
|
3
|
Sala C, Mordhorst H, Grützke J, Brinkmann A, Petersen TN, Poulsen C, Cotter PD, Crispie F, Ellis RJ, Castellani G, Amid C, Hakhverdyan M, Guyader SL, Manfreda G, Mossong J, Nitsche A, Ragimbeau C, Schaeffer J, Schlundt J, Tay MYF, Aarestrup FM, Hendriksen RS, Pamp SJ, De Cesare A. Metagenomics-Based Proficiency Test of Smoked Salmon Spiked with a Mock Community. Microorganisms 2020; 8:1861. [PMID: 33255715 PMCID: PMC7760972 DOI: 10.3390/microorganisms8121861] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/28/2020] [Revised: 11/17/2020] [Accepted: 11/23/2020] [Indexed: 12/13/2022]
Abstract
An inter-laboratory proficiency test was organized to assess the ability of participants to perform shotgun metagenomic sequencing of cold smoked salmon, experimentally spiked with a mock community composed of six bacteria, one parasite, one yeast, one DNA, and two RNA viruses. Each participant applied its in-house wet-lab workflow(s) to obtain the metagenomic dataset(s), which were then collected and analyzed using MG-RAST. A total of 27 datasets were analyzed. Sample pre-processing, DNA extraction protocol, library preparation kit, and sequencing platform, influenced the abundance of specific microorganisms of the mock community. Our results highlight that despite differences in wet-lab protocols, the reads corresponding to the mock community members spiked in the cold smoked salmon, were both detected and quantified in terms of relative abundance, in the metagenomic datasets, proving the suitability of shotgun metagenomic sequencing as a genomic tool to detect microorganisms belonging to different domains in the same food matrix. The implementation of standardized wet-lab protocols would highly facilitate the comparability of shotgun metagenomic sequencing dataset across laboratories and sectors. Moreover, there is a need for clearly defining a sequencing reads threshold, to consider pathogens as detected or undetected in a food sample.
Affiliation(s)
- Claudia Sala
- Department of Physics and Astronomy, University of Bologna, 40127 Bologna, Italy;
| | - Hanne Mordhorst
- Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kemitorvet, DK-2800 Kgs, 2800 Lyngby, Denmark; (H.M.); (T.N.P.); (C.P.); (F.M.A.); (R.S.H.); (S.J.P.)
| | - Josephine Grützke
- German Federal Institute for Risk Assessment, Department of Biological Safety, 12277 Berlin, Germany;
| | - Annika Brinkmann
- Highly Pathogenic Viruses, ZBS 1, Centre for Biological Threats and Special Pathogens, Robert Koch Institute, 13353 Berlin, Germany; (A.B.); (A.N.)
| | - Thomas N. Petersen
- Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kemitorvet, DK-2800 Kgs, 2800 Lyngby, Denmark; (H.M.); (T.N.P.); (C.P.); (F.M.A.); (R.S.H.); (S.J.P.)
| | - Casper Poulsen
- Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kemitorvet, DK-2800 Kgs, 2800 Lyngby, Denmark; (H.M.); (T.N.P.); (C.P.); (F.M.A.); (R.S.H.); (S.J.P.)
| | - Paul D. Cotter
- Teagasc Food Research Centre, Moorepark, APC Microbiome Ireland and Vistamilk, T12 YN60 Co. Cork, Ireland; (P.D.C.); (F.C.)
| | - Fiona Crispie
- Teagasc Food Research Centre, Moorepark, APC Microbiome Ireland and Vistamilk, T12 YN60 Co. Cork, Ireland; (P.D.C.); (F.C.)
| | - Richard J. Ellis
- Surveillance and Laboratory Services Department, Animal and Plant Health Agency, APHA Weybridge, Addlestone, Surrey, KT15 3NB, UK;
| | - Gastone Castellani
- Department of Experimental, Diagnostic and Specialty Medicine, University of Bologna, 40127 Bologna, Italy;
| | - Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK;
| | | | - Soizick Le Guyader
- Laboratoire de Microbiologie, CEDEX 03, 44311 Nantes, France; (S.L.G.); (J.S.)
| | - Gerardo Manfreda
- Department of Agricultural and Food Sciences, University of Bologna, 40064 Ozzano dell’Emilia, Italy;
| | - Joël Mossong
- Epidemiology and Microbial Genomics, Laboratoire National de Santé, L-3555 Dudelange, Luxembourg; (J.M.); (C.R.)
| | - Andreas Nitsche
- Highly Pathogenic Viruses, ZBS 1, Centre for Biological Threats and Special Pathogens, Robert Koch Institute, 13353 Berlin, Germany; (A.B.); (A.N.)
| | - Catherine Ragimbeau
- Epidemiology and Microbial Genomics, Laboratoire National de Santé, L-3555 Dudelange, Luxembourg; (J.M.); (C.R.)
| | - Julien Schaeffer
- Laboratoire de Microbiologie, CEDEX 03, 44311 Nantes, France; (S.L.G.); (J.S.)
| | - Joergen Schlundt
- Nanyang Technological University Food Technology Centre (NAFTEC), Nanyang Technological University (NTU), 62 Nanyang Dr, Singapore 637459, Singapore; (J.S.); (M.Y.F.T.)
| | - Moon Y. F. Tay
- Nanyang Technological University Food Technology Centre (NAFTEC), Nanyang Technological University (NTU), 62 Nanyang Dr, Singapore 637459, Singapore; (J.S.); (M.Y.F.T.)
| | - Frank M. Aarestrup
- Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kemitorvet, DK-2800 Kgs, 2800 Lyngby, Denmark; (H.M.); (T.N.P.); (C.P.); (F.M.A.); (R.S.H.); (S.J.P.)
| | - Rene S. Hendriksen
- Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kemitorvet, DK-2800 Kgs, 2800 Lyngby, Denmark; (H.M.); (T.N.P.); (C.P.); (F.M.A.); (R.S.H.); (S.J.P.)
| | - Sünje Johanna Pamp
- Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kemitorvet, DK-2800 Kgs, 2800 Lyngby, Denmark; (H.M.); (T.N.P.); (C.P.); (F.M.A.); (R.S.H.); (S.J.P.)
| | - Alessandra De Cesare
- Department of Veterinary Medical Sciences, University of Bologna, Via Tolara di Sopra 50, 40064 Ozzano dell’Emilia, Italy
| |
|
4
|
Amid C, Alako BTF, Balavenkataraman Kadhirvelu V, Burdett T, Burgin J, Fan J, Harrison PW, Holt S, Hussein A, Ivanov E, Jayathilaka S, Kay S, Keane T, Leinonen R, Liu X, Martinez-Villacorta J, Milano A, Pakseresht A, Rahman N, Rajan J, Reddy K, Richards E, Smirnov D, Sokolov A, Vijayaraja S, Cochrane G. The European Nucleotide Archive in 2019. Nucleic Acids Res 2020; 48:D70-D76. [PMID: 31722421 PMCID: PMC7145635 DOI: 10.1093/nar/gkz1063] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 09/04/2019] [Revised: 10/25/2019] [Accepted: 11/07/2019] [Indexed: 11/12/2022]
Abstract
The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.
Affiliation(s)
- Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Blaise T F Alako
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | | | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Josephine Burgin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jun Fan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sam Holt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Abdulrahman Hussein
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Eugene Ivanov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Simon Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Thomas Keane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Xin Liu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Josue Martinez-Villacorta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Annalisa Milano
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Amir Pakseresht
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Kethi Reddy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Edward Richards
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dmitriy Smirnov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alexey Sokolov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Senthilnathan Vijayaraja
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
5
|
Harrison PW, Alako B, Amid C, Cerdeño-Tárraga A, Cleland I, Holt S, Hussein A, Jayathilaka S, Kay S, Keane T, Leinonen R, Liu X, Martínez-Villacorta J, Milano A, Pakseresht N, Rajan J, Reddy K, Richards E, Rosello M, Silvester N, Smirnov D, Toribio AL, Vijayaraja S, Cochrane G. The European Nucleotide Archive in 2018. Nucleic Acids Res 2019; 47:D84-D88. [PMID: 30395270 PMCID: PMC6323982 DOI: 10.1093/nar/gky1078] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 09/18/2018] [Accepted: 10/22/2018] [Indexed: 11/25/2022]
Abstract
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided from EMBL-EBI, has for more than three decades been responsible for archiving the world's public sequencing data and presenting this important resource to the scientific community to support and accelerate the global research effort. Here, we outline ENA services and content in 2018 and provide an overview of a selection of focus areas of development work: extending data coordination services around ENA, sequence submissions through template expansion, early pre-submission validation tools and our move towards a new browser and retrieval infrastructure.
Affiliation(s)
- Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Blaise Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ana Cerdeño-Tárraga
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Iain Cleland
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sam Holt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Abdulrahman Hussein
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thomas Keane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josué Martínez-Villacorta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Annalisa Milano
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nima Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeena Rajan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kethi Reddy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Edward Richards
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Marc Rosello
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicole Silvester
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dmitriy Smirnov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ana-Luisa Toribio
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Senthilnathan Vijayaraja
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
6
|
Matamoros S, Hendriksen RS, Pataki BÁ, Pakseresht N, Rossello M, Silvester N, Amid C, Aarestrup FM, Koopmans M, Cochrane G, Csabai I, Lund O, Schultsz C. Accelerating surveillance and research of antimicrobial resistance - an online repository for sharing of antimicrobial susceptibility data associated with whole-genome sequences. Microb Genom 2020; 6:e000342. [PMID: 32255760 PMCID: PMC7371118 DOI: 10.1099/mgen.0.000342] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/10/2019] [Accepted: 01/31/2020] [Indexed: 11/29/2022] Open
Abstract
Antimicrobial resistance (AMR) is an emerging threat to modern medicine. Improved diagnostics and surveillance of resistant bacteria require the development of next-generation analysis tools and collaboration between international partners. Here, we present the 'AMR Data Hub', an online infrastructure for storage and sharing of structured phenotypic AMR data linked to bacterial whole-genome sequences. Leveraging infrastructure built by the European COMPARE Consortium and structured around the European Nucleotide Archive (ENA), the AMR Data Hub already provides an extensive data collection of more than 2500 isolates with linked genome and AMR data. Representing these data in standardized formats, we provide tools for the validation and submission of new data and services supporting search, browse and retrieval. The current collection was created through a collaboration by several partners from the European COMPARE Consortium, demonstrating the capacities and utility of the AMR Data Hub and its associated tools. We anticipate growth of content and offer the hub as a basis for future research into methods to explore and predict AMR.
Affiliation(s)
- Sébastien Matamoros
  - Amsterdam UMC, University of Amsterdam, Department of Medical Microbiology, Amsterdam, The Netherlands
- Rene S. Hendriksen
  - National Food Institute, Technical University of Denmark, Lyngby, Denmark
- Bálint Ármin Pataki
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest, Hungary
- Nima Pakseresht
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Marc Rossello
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Nicole Silvester
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Clara Amid
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Frank M. Aarestrup
  - National Food Institute, Technical University of Denmark, Lyngby, Denmark
- Marion Koopmans
  - Department of Viroscience, Erasmus University Medical Center, Rotterdam, The Netherlands
- Guy Cochrane
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
- Istvan Csabai
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest, Hungary
- Ole Lund
  - National Food Institute, Technical University of Denmark, Lyngby, Denmark
- Constance Schultsz
  - Amsterdam UMC, University of Amsterdam, Department of Medical Microbiology, Amsterdam, The Netherlands
  - Amsterdam UMC, University of Amsterdam, Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam, The Netherlands
|
7
|
Poen MJ, Pohlmann A, Amid C, Bestebroer TM, Brookes SM, Brown IH, Everett H, Schapendonk CME, Scheuer RD, Smits SL, Beer M, Fouchier RAM, Ellis RJ. Comparison of sequencing methods and data processing pipelines for whole genome sequencing and minority single nucleotide variant (mSNV) analysis during an influenza A/H5N8 outbreak. PLoS One 2020; 15:e0229326. [PMID: 32078666 PMCID: PMC7032710 DOI: 10.1371/journal.pone.0229326] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/25/2019] [Accepted: 12/30/2019] [Indexed: 12/12/2022] Open
Abstract
As high-throughput sequencing technologies are becoming more widely adopted for analysing pathogens in disease outbreaks there needs to be assurance that the different sequencing technologies and approaches to data analysis will yield reliable and comparable results. Conversely, understanding where agreement cannot be achieved provides insight into the limitations of these approaches and also allows efforts to be focused on areas of the process that need improvement. This manuscript describes the next-generation sequencing of three closely related viruses, each analysed using different sequencing strategies, sequencing instruments and data processing pipelines. In order to determine the comparability of consensus sequences and minority (sub-consensus) single nucleotide variant (mSNV) identification, the biological samples, the sequence data from 3 sequencing platforms and the *.bam quality-trimmed alignment files of raw data of 3 influenza A/H5N8 viruses were shared. This analysis demonstrated that variation in the final result could be attributed to all stages in the process, but the most critical were the well-known homopolymer errors introduced by 454 sequencing, and the alignment processes in the different data processing pipelines which affected the consistency of mSNV detection. However, homopolymer errors aside, there was generally a good agreement between consensus sequences that were obtained for all combinations of sequencing platforms and data processing pipelines. Nevertheless, minority variant analysis will need a different level of careful standardization and awareness about the possible limitations, as shown in this study.
Affiliation(s)
- Anne Pohlmann
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institute, Insel Riems, Germany
- Clara Amid
  - European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EBI), Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Sharon M. Brookes
  - Animal and Plant Health Agency (APHA) - Weybridge, Addlestone, Surrey, United Kingdom
- Ian H. Brown
  - Animal and Plant Health Agency (APHA) - Weybridge, Addlestone, Surrey, United Kingdom
- Helen Everett
  - Animal and Plant Health Agency (APHA) - Weybridge, Addlestone, Surrey, United Kingdom
- Saskia L. Smits
  - Erasmus MC, Department of Viroscience, Rotterdam, the Netherlands
- Martin Beer
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institute, Insel Riems, Germany
- Richard J. Ellis
  - Animal and Plant Health Agency (APHA) - Weybridge, Addlestone, Surrey, United Kingdom
|
8
|
Mitchell AL, Scheremetjew M, Denise H, Potter S, Tarkowska A, Qureshi M, Salazar GA, Pesseat S, Boland MA, Hunter F, ten Hoopen P, Alako B, Amid C, Wilkinson DJ, Curtis TP, Cochrane G, Finn RD. EBI Metagenomics in 2017: enriching the analysis of microbial communities, from sequence reads to assemblies. Nucleic Acids Res 2018; 46:D726-D735. [PMID: 29069476 PMCID: PMC5753268 DOI: 10.1093/nar/gkx967] [Citation(s) in RCA: 121] [Impact Index Per Article: 24.2] [Received: 09/19/2017] [Accepted: 10/12/2017] [Indexed: 01/16/2023] Open
Abstract
EBI metagenomics (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the analysis and archiving of sequence data derived from the microbial populations found in a particular environment. Over the past two years, EBI metagenomics has increased the number of datasets analysed 10-fold. In addition to increased throughput, the underlying analysis pipeline has been overhauled to include both new or updated tools and reference databases. Of particular note is a new workflow for taxonomic assignments that has been extended to include assignments based on both the large and small subunit RNA marker genes and to encompass all cellular micro-organisms. We also describe the addition of metagenomic assembly as a new analysis service. Our pilot studies have produced over 2400 assemblies from datasets in the public domain. From these assemblies, we have produced a searchable, non-redundant protein database of over 50 million sequences. To provide improved access to the data stored within the resource, we have developed a programmatic interface that provides access to the analysis results and associated sample metadata. Finally, we have integrated the results of a series of statistical analyses that provide estimations of diversity and sample comparisons.
Affiliation(s)
- Alex L Mitchell
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - To whom correspondence should be addressed. Tel: +44 1223 494652. Correspondence may also be addressed to Robert D. Finn. Tel: +44 1223 492678.
- Maxim Scheremetjew
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Hubert Denise
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Potter
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Aleksandra Tarkowska
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matloob Qureshi
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Gustavo A Salazar
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sebastien Pesseat
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Miguel A Boland
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Fiona M I Hunter
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Petra ten Hoopen
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Blaise Alako
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Clara Amid
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Darren J Wilkinson
  - School of Mathematics & Statistics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Thomas P Curtis
  - School of Civil Engineering and Geosciences, Newcastle University, Newcastle upon Tyne NE1 7RU, UK
- Guy Cochrane
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Robert D Finn
  - EMBL-EBI European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - To whom correspondence should be addressed. Tel: +44 1223 494652. Correspondence may also be addressed to Robert D. Finn. Tel: +44 1223 492678.
|
9
|
Silvester N, Alako B, Amid C, Cerdeño-Tarrága A, Clarke L, Cleland I, Harrison PW, Jayathilaka S, Kay S, Keane T, Leinonen R, Liu X, Martínez-Villacorta J, Menchi M, Reddy K, Pakseresht N, Rajan J, Rossello M, Smirnov D, Toribio AL, Vaughan D, Zalunin V, Cochrane G. The European Nucleotide Archive in 2017. Nucleic Acids Res 2018; 46:D36-D40. [PMID: 29140475 PMCID: PMC5753375 DOI: 10.1093/nar/gkx1125] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 10/06/2017] [Accepted: 10/25/2017] [Indexed: 12/03/2022] Open
Abstract
For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world’s public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.
Affiliation(s)
- Nicole Silvester
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Blaise Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Ana Cerdeño-Tarrága
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Laura Clarke
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Iain Cleland
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Simon Kay
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Thomas Keane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Josué Martínez-Villacorta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Manuela Menchi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Kethi Reddy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Nima Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Jeena Rajan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Marc Rossello
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Dmitriy Smirnov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Ana L Toribio
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Daniel Vaughan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Vadim Zalunin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
10
|
Hendriksen RS, Munk P, Njage P, van Bunnik B, McNally L, Lukjancenko O, Röder T, Nieuwenhuijse D, Pedersen SK, Kjeldgaard J, Kaas RS, Clausen PTLC, Vogt JK, Leekitcharoenphon P, van de Schans MGM, Zuidema T, de Roda Husman AM, Rasmussen S, Petersen B, Amid C, Cochrane G, Sicheritz-Ponten T, Schmitt H, Alvarez JRM, Aidara-Kane A, Pamp SJ, Lund O, Hald T, Woolhouse M, Koopmans MP, Vigre H, Petersen TN, Aarestrup FM. Global monitoring of antimicrobial resistance based on metagenomics analyses of urban sewage. Nat Commun 2019; 10:1124. [PMID: 30850636 PMCID: PMC6408512 DOI: 10.1038/s41467-019-08853-3] [Citation(s) in RCA: 461] [Impact Index Per Article: 92.2] [Received: 11/16/2018] [Accepted: 01/31/2019] [Indexed: 12/11/2022] Open
Abstract
Antimicrobial resistance (AMR) is a serious threat to global public health, but obtaining representative data on AMR for healthy human populations is difficult. Here, we use metagenomic analysis of untreated sewage to characterize the bacterial resistome from 79 sites in 60 countries. We find systematic differences in abundance and diversity of AMR genes between Europe/North-America/Oceania and Africa/Asia/South-America. Antimicrobial use data and bacterial taxonomy only explains a minor part of the AMR variation that we observe. We find no evidence for cross-selection between antimicrobial classes, or for effect of air travel between sites. However, AMR gene abundance strongly correlates with socio-economic, health and environmental factors, which we use to predict AMR gene abundances in all countries in the world. Our findings suggest that global AMR gene diversity and abundance vary by region, and that improving sanitation and health could potentially limit the global burden of AMR. We propose metagenomic analysis of sewage as an ethically acceptable and economically feasible approach for continuous global surveillance and prediction of AMR.
Affiliation(s)
- Rene S Hendriksen
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Patrick Munk
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Patrick Njage
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Bram van Bunnik
  - Usher Institute, University of Edinburgh, Edinburgh, EH8 9AG, UK
- Luke McNally
  - Centre for Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3JD, UK
- Oksana Lukjancenko
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Timo Röder
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Jette Kjeldgaard
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Rolf S Kaas
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Josef Korbinian Vogt
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Tina Zuidema
  - RIKILT Wageningen University and Research, Wageningen, 6708, The Netherlands
- Ana Maria de Roda Husman
  - National Institute for Public Health and the Environment (RIVM), Bilthoven, 3721, The Netherlands
- Simon Rasmussen
  - Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Bent Petersen
  - Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, CB10 1SD, UK
- Thomas Sicheritz-Ponten
  - Centre of Excellence for Omics-Driven Computational Biodiscovery, AIMST University, Kedah, 08100, Malaysia
- Heike Schmitt
  - National Institute for Public Health and the Environment (RIVM), Bilthoven, 3721, The Netherlands
- Sünje J Pamp
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Ole Lund
  - Department of Bio and Health Informatics, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Tine Hald
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Mark Woolhouse
  - Usher Institute, University of Edinburgh, Edinburgh, EH8 9AG, UK
- Marion P Koopmans
  - Viroscience, Erasmus Medical Center, Rotterdam, 3015, The Netherlands
- Håkan Vigre
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
- Frank M Aarestrup
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, 2800, Denmark
|
11
|
Amid C, Pakseresht N, Silvester N, Jayathilaka S, Lund O, Dynovski LD, Pataki BÁ, Visontai D, Xavier BB, Alako BTF, Belka A, Cisneros JLB, Cotten M, Haringhuizen GB, Harrison PW, Höper D, Holt S, Hundahl C, Hussein A, Kaas RS, Liu X, Leinonen R, Malhotra-Kumar S, Nieuwenhuijse DF, Rahman N, dos S Ribeiro C, Skiby JE, Schmitz D, Stéger J, Szalai-Gindl JM, Thomsen MCF, Cacciò SM, Csabai I, Kroneman A, Koopmans M, Aarestrup F, Cochrane G. The COMPARE Data Hubs. Database (Oxford) 2019; 2019:baz136. [PMID: 31868882 PMCID: PMC6927095 DOI: 10.1093/database/baz136] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 02/21/2019] [Revised: 11/06/2019] [Accepted: 11/07/2019] [Indexed: 11/12/2022]
Abstract
Data sharing enables research communities to exchange findings and build upon the knowledge that arises from their discoveries. Areas of public and animal health as well as food safety would benefit from rapid data sharing when it comes to emergencies. However, ethical, regulatory and institutional challenges, as well as lack of suitable platforms which provide an infrastructure for data sharing in structured formats, often lead to data not being shared or at most shared in form of supplementary materials in journal publications. Here, we describe an informatics platform that includes workflows for structured data storage, managing and pre-publication sharing of pathogen sequencing data and its analysis interpretations with relevant stakeholders.
Affiliation(s)
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nima Pakseresht
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicole Silvester
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ole Lund
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Lukasz D Dynovski
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Bálint Á Pataki
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- Dávid Visontai
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- Basil Britto Xavier
  - Laboratory of Medical Microbiology, Vaccine and Infectious Disease Institute, University of Antwerp, Wilrijk 2610, Belgium
- Blaise T F Alako
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ariane Belka
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institut, Greifswald 17493, Germany
- Jose L B Cisneros
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Matthew Cotten
  - Department of Viroscience, Erasmus Medical Center, Rotterdam 3015, Netherlands
- George B Haringhuizen
  - National Institute for Public Health and the Environment (RIVM), Bilthoven 3720, The Netherlands
- Peter W Harrison
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dirk Höper
  - Institute of Diagnostic Virology, Friedrich-Loeffler-Institut, Greifswald 17493, Germany
- Sam Holt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Camilla Hundahl
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Abdulrahman Hussein
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rolf S Kaas
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Xin Liu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Surbhi Malhotra-Kumar
  - Laboratory of Medical Microbiology, Vaccine and Infectious Disease Institute, University of Antwerp, Wilrijk 2610, Belgium
- Nadim Rahman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carolina dos S Ribeiro
  - National Institute for Public Health and the Environment (RIVM), Bilthoven 3720, The Netherlands
- Jeffrey E Skiby
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Dennis Schmitz
  - Department of Viroscience, Erasmus Medical Center, Rotterdam 3015, Netherlands
  - National Institute for Public Health and the Environment (RIVM), Bilthoven 3720, The Netherlands
- József Stéger
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- János M Szalai-Gindl
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- Martin C F Thomsen
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Simone M Cacciò
  - European Union Reference Laboratory for Parasites, Istituto Superiore di Sanità (ISS), Rome 00161, Italy
- István Csabai
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest 1117, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest 1121, Hungary
- Annelies Kroneman
  - National Institute for Public Health and the Environment (RIVM), Bilthoven 3720, The Netherlands
- Marion Koopmans
  - Department of Viroscience, Erasmus Medical Center, Rotterdam 3015, Netherlands
- Frank Aarestrup
  - Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, Kongens Lyngby 2800, Denmark
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
12
|
Amid C, Olstedt M, Gunnarsson JS, Le Lan H, Tran Thi Minh H, Van den Brink PJ, Hellström M, Tedengren M. Additive effects of the herbicide glyphosate and elevated temperature on the branched coral Acropora formosa in Nha Trang, Vietnam. Environ Sci Pollut Res Int 2018; 25:13360-13372. [PMID: 28111719 PMCID: PMC5978828 DOI: 10.1007/s11356-016-8320-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/09/2016] [Accepted: 12/21/2016] [Indexed: 05/16/2023]
Abstract
The combined effects of the herbicide glyphosate and elevated temperature were studied on the tropical staghorn coral Acropora formosa, in Nha Trang bay, Vietnam. The corals were collected from two different reefs, one close to a polluted fish farm and one in a marine-protected area (MPA). In the laboratory, branches of the corals were exposed to the herbicide glyphosate at ambient (28 °C) and at 3 °C elevated water temperatures (31 °C). Effects of herbicide and elevated temperature were studied on coral bleaching using photography and digital image analysis (new colorimetric method developed here based on grayscale), chlorophyll a analysis, and symbiotic dinoflagellate (Symbiodinium, referred to as zooxanthellae) counts. All corals from the MPA started to bleach in the laboratory before they were exposed to the treatments, indicating that they were very sensitive, as opposed to the corals collected from the more polluted site, which were more tolerant and showed no bleaching response to temperature increase or herbicide alone. However, the combined exposure to the stressors resulted in significant loss of color, proportional to loss in chlorophyll a and zooxanthellae. The difference in sensitivity of the corals collected from the polluted site versus the MPA site could be explained by different symbiont types: the resilient type C3u and the stress-sensitive types C21 and C23, respectively. The additive effect of elevated temperatures and herbicides adds further weight to the notion that the bleaching of coral reefs is accelerated in the presence of multiple stressors. These results suggest that the corals in Nha Trang bay have adapted to the ongoing pollution to become more tolerant to anthropogenic stressors, and that multiple stressors hamper this resilience. The loss of color and decrease of chlorophyll a suggest that bleaching is related to concentration of chloro-pigments. The colorimetric method could be further fine-tuned and used as a precise, non-intrusive tool for monitoring coral bleaching in situ.
Affiliation(s)
- C Amid
- Department of Ecology, Environment and Plant Sciences (DEEP), Stockholm University, 106 91, Stockholm, Sweden
| | - M Olstedt
- Department of Ecology, Environment and Plant Sciences (DEEP), Stockholm University, 106 91, Stockholm, Sweden
| | - J S Gunnarsson
- Department of Ecology, Environment and Plant Sciences (DEEP), Stockholm University, 106 91, Stockholm, Sweden
| | - H Le Lan
- Institute of Oceanography (IO), Nha Trang, Vietnam
| | | | - P J Van den Brink
- Department of Aquatic Ecology and Water Quality Management, Wageningen University, P.O. Box 47, 6700 AA, Wageningen, The Netherlands
- Alterra, Wageningen University and Research Centre, P.O. Box 47, 6700 AA, Wageningen, The Netherlands
| | - M Hellström
- Department of Ecology, Environment and Plant Sciences (DEEP), Stockholm University, 106 91, Stockholm, Sweden
| | - M Tedengren
- Department of Ecology, Environment and Plant Sciences (DEEP), Stockholm University, 106 91, Stockholm, Sweden.
| |
|
13
|
Alberti A, Poulain J, Engelen S, Labadie K, Romac S, Ferrera I, Albini G, Aury JM, Belser C, Bertrand A, Cruaud C, Da Silva C, Dossat C, Gavory F, Gas S, Guy J, Haquelle M, Jacoby E, Jaillon O, Lemainque A, Pelletier E, Samson G, Wessner M, Acinas SG, Royo-Llonch M, Cornejo-Castillo FM, Logares R, Fernández-Gómez B, Bowler C, Cochrane G, Amid C, Hoopen PT, De Vargas C, Grimsley N, Desgranges E, Kandels-Lewis S, Ogata H, Poulton N, Sieracki ME, Stepanauskas R, Sullivan MB, Brum JR, Duhaime MB, Poulos BT, Hurwitz BL, Pesant S, Karsenti E, Wincker P. Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition. Sci Data 2017; 4:170093. [PMID: 28763055 PMCID: PMC5538240 DOI: 10.1038/sdata.2017.93] [Citation(s) in RCA: 82] [Impact Index Per Article: 11.7] [Received: 03/08/2017] [Accepted: 06/05/2017] [Indexed: 02/01/2023] Open
Abstract
A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009–2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acids extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata to the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world’s planktonic ecosystems.
Affiliation(s)
- Adriana Alberti
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Julie Poulain
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Stefan Engelen
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Karine Labadie
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Sarah Romac
- CNRS, UMR 7144, Station Biologique de Roscoff, Place Georges Teissier, Roscoff 29680, France.,Sorbonne Universités, UPMC Univ Paris 06, UMR 7144, Station Biologique de Roscoff, Place Georges Teissier, Roscoff 29680, France
| | - Isabel Ferrera
- Departament de Biologia Marina i Oceanografia, Institute of Marine Sciences (ICM), CSIC, Barcelona E08003, Spain
| | - Guillaume Albini
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Jean-Marc Aury
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Caroline Belser
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Alexis Bertrand
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Corinne Cruaud
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Corinne Da Silva
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Carole Dossat
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Frédérick Gavory
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Shahinaz Gas
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Julie Guy
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Maud Haquelle
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - E'krame Jacoby
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Olivier Jaillon
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France.,CNRS, UMR 8030, Evry CP5706, France.,Université d'Evry, UMR 8030, Evry CP5706, France
| | - Arnaud Lemainque
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Eric Pelletier
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France.,CNRS, UMR 8030, Evry CP5706, France.,Université d'Evry, UMR 8030, Evry CP5706, France
| | - Gaëlle Samson
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | - Mark Wessner
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France
| | | | - Silvia G Acinas
- Departament de Biologia Marina i Oceanografia, Institute of Marine Sciences (ICM), CSIC, Barcelona E08003, Spain
| | - Marta Royo-Llonch
- Departament de Biologia Marina i Oceanografia, Institute of Marine Sciences (ICM), CSIC, Barcelona E08003, Spain
| | - Francisco M Cornejo-Castillo
- Departament de Biologia Marina i Oceanografia, Institute of Marine Sciences (ICM), CSIC, Barcelona E08003, Spain
| | - Ramiro Logares
- Departament de Biologia Marina i Oceanografia, Institute of Marine Sciences (ICM), CSIC, Barcelona E08003, Spain
| | - Beatriz Fernández-Gómez
- Departament de Biologia Marina i Oceanografia, Institute of Marine Sciences (ICM), CSIC, Barcelona E08003, Spain.,FONDAP Center for Genome Regulation, Moneda 1375, Santiago 8320000, Chile.,Laboratorio de Bioinformática y Expresión Génica, Instituto de Nutrición y Tecnología de los Alimentos (INTA), Universidad de Chile, El Libano Macul, Santiago 5524, Chile
| | - Chris Bowler
- Ecole Normale Supérieure, PSL Research University, Institut de Biologie de l'Ecole Normale Supérieure (IBENS), CNRS UMR 8197, INSERM U1024, 46 rue d'Ulm, Paris F-75005, France
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Petra Ten Hoopen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Colomban De Vargas
- CNRS, UMR 7144, Station Biologique de Roscoff, Place Georges Teissier, Roscoff 29680, France.,Sorbonne Universités, UPMC Univ Paris 06, UMR 7144, Station Biologique de Roscoff, Place Georges Teissier, Roscoff 29680, France
| | - Nigel Grimsley
- CNRS UMR 7232, BIOM, Avenue Pierre Fabre, Banyuls-sur-Mer 66650, France.,Sorbonne Universités Paris 06, OOB UPMC, Avenue Pierre Fabre, Banyuls-sur-Mer 66650, France
| | - Elodie Desgranges
- CNRS UMR 7232, BIOM, Avenue Pierre Fabre, Banyuls-sur-Mer 66650, France.,Sorbonne Universités Paris 06, OOB UPMC, Avenue Pierre Fabre, Banyuls-sur-Mer 66650, France
| | - Stefanie Kandels-Lewis
- Directors' Research European Molecular Biology Laboratory, Meyerhofstr. 1, Heidelberg 69117, Germany.,Structural and Computational Biology, European Molecular Biology Laboratory, Meyerhofstr. 1, Heidelberg 69117, Germany
| | - Hiroyuki Ogata
- Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
| | - Nicole Poulton
- Bigelow Laboratory for Ocean Sciences, East Boothbay, Maine 04544, USA
| | - Michael E Sieracki
- Bigelow Laboratory for Ocean Sciences, East Boothbay, Maine 04544, USA.,National Science Foundation, Arlington, Virginia 22230, USA
| | | | - Matthew B Sullivan
- Departments of Microbiology and Civil, Environmental and Geodetic Engineering, Ohio State University, Columbus, Ohio 43210, USA.,Department of Microbiology, The Ohio State University, Columbus, Ohio 43210, USA
| | - Jennifer R Brum
- Department of Microbiology, The Ohio State University, Columbus, Ohio 43210, USA
| | - Melissa B Duhaime
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
| | | | - Bonnie L Hurwitz
- Department of Agricultural and Biosystems Engineering, University of Arizona, Tucson, Arizona 85719, USA
| | | | - Stéphane Pesant
- MARUM, Center for Marine Environmental Sciences, University of Bremen, Leobener Str. 8, Bremen 28359, Germany.,PANGAEA, Data Publisher for Earth and Environmental Science, University of Bremen, Leobener Str. 8, Bremen 28359, Germany
| | - Eric Karsenti
- Ecole Normale Supérieure, PSL Research University, Institut de Biologie de l'Ecole Normale Supérieure (IBENS), CNRS UMR 8197, INSERM U1024, 46 rue d'Ulm, Paris F-75005, France.,Directors' Research European Molecular Biology Laboratory, Meyerhofstr. 1, Heidelberg 69117, Germany.,Sorbonne Universités, UPMC Université Paris 06, CNRS, Laboratoire d'oceanographie de Villefranche (LOV), Observatoire Océanologique, 181 Chemin du Lazaret, Villefranche-sur-mer F-06230, France
| | - Patrick Wincker
- CEA-Institut de Biologie François Jacob, Genoscope, 2 rue Gaston Crémieux, Evry 91057, France.,CNRS, UMR 8030, Evry CP5706, France.,Université d'Evry, UMR 8030, Evry CP5706, France
| |
|
14
|
Toribio AL, Alako B, Amid C, Cerdeño-Tarrága A, Clarke L, Cleland I, Fairley S, Gibson R, Goodgame N, Ten Hoopen P, Jayathilaka S, Kay S, Leinonen R, Liu X, Martínez-Villacorta J, Pakseresht N, Rajan J, Reddy K, Rosello M, Silvester N, Smirnov D, Vaughan D, Zalunin V, Cochrane G. European Nucleotide Archive in 2016. Nucleic Acids Res 2016; 45:D32-D36. [PMID: 27899630 PMCID: PMC5210577 DOI: 10.1093/nar/gkw1106] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 10/22/2016] [Revised: 10/25/2016] [Accepted: 10/31/2016] [Indexed: 02/07/2023] Open
Abstract
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) offers a rich platform for data sharing, publishing and archiving and a globally comprehensive data set for onward use by the scientific community. With a broad scope spanning raw sequencing reads, genome assemblies and functional annotation, the resource provides extensive data submission, search and download facilities across web and programmatic interfaces. Here, we outline ENA content and major access modalities, highlight major developments in 2016 and outline a number of examples of data reuse from ENA.
Affiliation(s)
- Ana Luisa Toribio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Blaise Alako
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ana Cerdeño-Tarrága
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Laura Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Iain Cleland
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Susan Fairley
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Richard Gibson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Neil Goodgame
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Petra Ten Hoopen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Simon Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Xin Liu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Josué Martínez-Villacorta
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nima Pakseresht
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Kethi Reddy
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Marc Rosello
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nicole Silvester
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dmitriy Smirnov
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Daniel Vaughan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Vadim Zalunin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
15
|
ten Hoopen P, Amid C, Buttigieg PL, Pafilis E, Bravakos P, Cerdeño-Tárraga AM, Gibson R, Kahlke T, Legaki A, Narayana Murthy K, Papastefanou G, Pereira E, Rossello M, Luisa Toribio A, Cochrane G. Value, but high costs in post-deposition data curation. Database (Oxford) 2016; 2016:bav126. [PMID: 26861660 PMCID: PMC4747322 DOI: 10.1093/database/bav126] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/16/2015] [Accepted: 12/14/2015] [Indexed: 12/26/2022]
Abstract
Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena
Affiliation(s)
- Petra ten Hoopen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Pier Luigi Buttigieg
- Alfred-Wegener-Institut Helmholtz-Zentrum für Polar-und Meeresforschung, Am Handelshafen 12, Bremerhaven 27570, Germany
| | - Evangelos Pafilis
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, P.O. Box 2214 71003, Greece
| | - Panos Bravakos
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, P.O. Box 2214 71003, Greece
| | - Ana M Cerdeño-Tárraga
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Richard Gibson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tim Kahlke
- CSIRO Marine and Atmospheric Research, Castray Esplanade, Hobart TAS 7001, Australia
| | - Aglaia Legaki
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, P.O. Box 2214 71003, Greece
| | - Kada Narayana Murthy
- Pondicherry University, Brookshabad Campus, Andaman and Nicobar Islands, Port Blair 744112, India
| | - Gabriella Papastefanou
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, P.O. Box 2214 71003, Greece
| | - Emiliano Pereira
- Max Planck Institute for Marine Microbial Ecology, Microbial Genomics and Bioinformatics Group, Celsiusstr. 1, Bremen 28359, Germany
| | - Marc Rossello
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ana Luisa Toribio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
16
|
Gibson R, Alako B, Amid C, Cerdeño-Tárraga A, Cleland I, Goodgame N, Ten Hoopen P, Jayathilaka S, Kay S, Leinonen R, Liu X, Pallreddy S, Pakseresht N, Rajan J, Rosselló M, Silvester N, Smirnov D, Toribio AL, Vaughan D, Zalunin V, Cochrane G. Biocuration of functional annotation at the European nucleotide archive. Nucleic Acids Res 2016; 44:D58-66. [PMID: 26615190 PMCID: PMC4702917 DOI: 10.1093/nar/gkv1311] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 10/09/2015] [Revised: 11/06/2015] [Accepted: 11/09/2015] [Indexed: 01/03/2023] Open
Abstract
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the submission, maintenance and presentation of nucleotide sequence data and related sample and experimental information. In this article we report on ENA in 2015 regarding general activity, notable published data sets and major achievements. This is followed by a focus on sustainable biocuration of functional annotation, an area which has particularly felt the pressure of sequencing growth. The importance of functional annotation, how it can be submitted and the shifting role of the biocurator in the context of increasing volumes of data are all discussed.
Affiliation(s)
- Richard Gibson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Blaise Alako
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Ana Cerdeño-Tárraga
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Iain Cleland
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Neil Goodgame
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Petra Ten Hoopen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Simon Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Xin Liu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Swapna Pallreddy
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Nima Pakseresht
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Marc Rosselló
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Nicole Silvester
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Dmitriy Smirnov
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Ana Luisa Toribio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Daniel Vaughan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Vadim Zalunin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
|
17
|
Silvester N, Alako B, Amid C, Cerdeño-Tárraga A, Cleland I, Gibson R, Goodgame N, Ten Hoopen P, Kay S, Leinonen R, Li W, Liu X, Lopez R, Pakseresht N, Pallreddy S, Plaister S, Radhakrishnan R, Rossello M, Senf A, Smirnov D, Toribio AL, Vaughan D, Zalunin V, Cochrane G. Content discovery and retrieval services at the European Nucleotide Archive. Nucleic Acids Res 2014; 43:D23-9. [PMID: 25404130 PMCID: PMC4383942 DOI: 10.1093/nar/gku1129] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 11/26/2022] Open
Abstract
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its discoverability and usability. In response to this, ENA has been introducing and improving checklists for use during submission and expanding its search facilities to provide targeted search results. Here, we give a brief update on ENA content and some major developments undertaken in data submission services during 2014. We then describe in more detail the services we offer for data discovery and retrieval.
Affiliation(s)
- Nicole Silvester
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Blaise Alako
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Clara Amid
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ana Cerdeño-Tárraga
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Iain Cleland
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Richard Gibson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Neil Goodgame
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Petra Ten Hoopen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Simon Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Weizhong Li
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Xin Liu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rodrigo Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nima Pakseresht
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Swapna Pallreddy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sheila Plaister
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rajesh Radhakrishnan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Marc Rossello
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alexander Senf
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dmitriy Smirnov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ana Luisa Toribio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Daniel Vaughan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Vadim Zalunin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
18
|
Pakseresht N, Alako B, Amid C, Cerdeño-Tárraga A, Cleland I, Gibson R, Goodgame N, Gur T, Jang M, Kay S, Leinonen R, Li W, Liu X, Lopez R, McWilliam H, Oisel A, Pallreddy S, Plaister S, Radhakrishnan R, Rivière S, Rossello M, Senf A, Silvester N, Smirnov D, Squizzato S, ten Hoopen P, Toribio AL, Vaughan D, Zalunin V, Cochrane G. Assembly information services in the European Nucleotide Archive. Nucleic Acids Res 2013; 42:D38-43. [PMID: 24214989 PMCID: PMC3965037 DOI: 10.1093/nar/gkt1082] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 12/02/2022] Open
Abstract
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced a dramatic growth in genome assembly submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of assembly submission services, for which we now reach the end of a major programme of work and many enhancements have already been made available over the year to components of the submission service. In this article, we briefly review ENA content and growth over 2013, describe our rapidly developing services for genome assembly information and outline further major developments over the last year.
Affiliation(s)
- Nima Pakseresht
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
19
|
Dawson HD, Loveland JE, Pascal G, Gilbert JGR, Uenishi H, Mann KM, Sang Y, Zhang J, Carvalho-Silva D, Hunt T, Hardy M, Hu Z, Zhao SH, Anselmo A, Shinkai H, Chen C, Badaoui B, Berman D, Amid C, Kay M, Lloyd D, Snow C, Morozumi T, Cheng RPY, Bystrom M, Kapetanovic R, Schwartz JC, Kataria R, Astley M, Fritz E, Steward C, Thomas M, Wilming L, Toki D, Archibald AL, Bed’Hom B, Beraldi D, Huang TH, Ait-Ali T, Blecha F, Botti S, Freeman TC, Giuffra E, Hume DA, Lunney JK, Murtaugh MP, Reecy JM, Harrow JL, Rogel-Gaillard C, Tuggle CK. Structural and functional annotation of the porcine immunome. BMC Genomics 2013; 14:332. [PMID: 23676093 PMCID: PMC3658956 DOI: 10.1186/1471-2164-14-332] [Citation(s) in RCA: 130] [Impact Index Per Article: 11.8] [Received: 03/02/2013] [Accepted: 05/03/2013] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems. RESULTS The Immune Response Annotation Group (IRAG) used computational curation and manual annotation of the swine genome assembly 10.2 (Sscrofa10.2) to refine the currently available automated annotation of 1,369 immunity-related genes through sequence-based comparison to genes in other species. Within these genes, we annotated 3,472 transcripts. Annotation provided evidence for gene expansions in several immune response families, and identified artiodactyl-specific expansions in the cathelicidin and type 1 Interferon families. We found gene duplications for 18 genes, including 13 immune response genes and five non-immune response genes discovered in the annotation process. Manual annotation provided evidence for many new alternative splice variants and 8 gene duplications. Over 1,100 transcripts without porcine sequence evidence were detected using cross-species annotation. We used a functional approach to discover and accurately annotate porcine immune response genes. A co-expression clustering analysis of transcriptomic data from selected experimental infections or immune stimulations of blood, macrophages or lymph nodes identified a large cluster of genes that exhibited a correlated positive response upon infection across multiple pathogens or immune stimuli. Interestingly, this gene cluster (cluster 4) is enriched for known general human immune response genes, yet contains many un-annotated porcine genes. 
A phylogenetic analysis of the encoded proteins of cluster 4 genes showed that 15% exhibited an accelerated evolution as compared to 4.1% across the entire genome. CONCLUSIONS This extensive annotation dramatically extends the genome-based knowledge of the molecular genetics and structure of a major portion of the porcine immunome. Our complementary functional approach using co-expression during immune response has provided new putative immune response annotation for over 500 porcine genes. Our phylogenetic analysis of this core immunome cluster confirms rapid evolutionary change in this set of genes, and that, as in other species, such genes are important components of the pig's adaptation to pathogen challenge over evolutionary time. These comprehensive and integrated analyses increase the value of the porcine genome sequence and provide important tools for global analyses and data-mining of the porcine immune response.
Affiliation(s)
- Harry D Dawson
- USDA-ARS, Beltsville Human Nutrition Research Center, Diet, Genomics, and Immunology Laboratory, Beltsville, MD 20705, USA
| | - Jane E Loveland
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Géraldine Pascal
- INRA, UMR85 Physiologie de la Reproduction et des Comportements, F-37380, Nouzilly, France
| | - James GR Gilbert
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Hirohide Uenishi
- National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Katherine M Mann
- USDA ARS BA Animal Parasitic Diseases Laboratory, Beltsville, MD 20705, USA
| | - Yongming Sang
- Department of Anatomy and Physiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS 66506, USA
| | - Jie Zhang
- Laboratory of Animal Genetics, Breeding, and Reproduction, Huazhong Agricultural University, Wuhan 430070, PR China
| | - Denise Carvalho-Silva
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK. Current affiliation: EMBL Outstation-Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambs CB10 1SD, UK
| | - Toby Hunt
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Matthew Hardy
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Zhiliang Hu
- Department of Animal Science, Iowa State University, Ames, IA 50011, USA
| | - Shu-Hong Zhao
- Laboratory of Animal Genetics, Breeding, and Reproduction, Huazhong Agricultural University, Wuhan 430070, PR China
| | - Anna Anselmo
- Parco Tecnologico Padano, Integrative Biology Unit, via A. Einstein, 26900, Lodi, Italy
| | - Hiroki Shinkai
- National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
| | - Celine Chen
- USDA-ARS, Beltsville Human Nutrition Research Center, Diet, Genomics, and Immunology Laboratory, Beltsville, MD 20705, USA
| | - Bouabid Badaoui
- Parco Tecnologico Padano, Integrative Biology Unit, via A. Einstein, 26900, Lodi, Italy
| | - Daniel Berman
- USDA ARS BA Animal Parasitic Diseases Laboratory, Beltsville, MD 20705, USA
| | - Clara Amid
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK. Current affiliation: EMBL Outstation-Hinxton, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambs CB10 1SD, UK
| | - Mike Kay
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - David Lloyd
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Catherine Snow
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Takeya Morozumi
- Institute of Japan Association for Technology in Agriculture, Forestry and Fisheries, 446-1 Ippaizuka, Kamiyokoba, Tsukuba, Ibaraki 305-0854, Japan
| | - Ryan Pei-Yen Cheng
- Department of Animal Science, Iowa State University, Ames, IA 50011, USA
| | - Megan Bystrom
- Department of Animal Science, Iowa State University, Ames, IA 50011, USA
| | - Ronan Kapetanovic
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK
| | - John C Schwartz
- Department of Veterinary and Biomedical Sciences, University of Minnesota, 1971 Commonwealth Avenue, St. Paul, MN 55108, USA
| | - Ranjit Kataria
- National Bureau of Animal Genetic Resources, P.B. 129, GT Road By-Pass, Karnal 132001, (Haryana), India
| | - Matthew Astley
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Eric Fritz
- Department of Animal Science, Iowa State University, Ames, IA 50011, USA
| | - Charles Steward
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Mark Thomas
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Laurens Wilming
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Daisuke Toki
- Institute of Japan Association for Technology in Agriculture, Forestry and Fisheries, 446-1 Ippaizuka, Kamiyokoba, Tsukuba, Ibaraki 305-0854, Japan
| | - Alan L Archibald
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK
| | - Bertrand Bed’Hom
- INRA, UMR1313 Génétique Animale et Biologie Intégrative, F-78350, Jouy-en-Josas, France
| | - Dario Beraldi
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK
| | - Ting-Hua Huang
- Department of Animal Science, Iowa State University, Ames, IA 50011, USA
| | - Tahar Ait-Ali
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK
| | - Frank Blecha
- Department of Anatomy and Physiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS 66506, USA
| | - Sara Botti
- Parco Tecnologico Padano, Integrative Biology Unit, via A. Einstein, 26900, Lodi, Italy
| | - Tom C Freeman
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK
| | - Elisabetta Giuffra
- Parco Tecnologico Padano, Integrative Biology Unit, via A. Einstein, 26900, Lodi, Italy; INRA, UMR1313 Génétique Animale et Biologie Intégrative, F-78350, Jouy-en-Josas, France
| | - David A Hume
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK
| | - Joan K Lunney
- USDA ARS BA Animal Parasitic Diseases Laboratory, Beltsville, MD 20705, USA
| | - Michael P Murtaugh
- Department of Veterinary and Biomedical Sciences, University of Minnesota, 1971 Commonwealth Avenue, St. Paul, MN 55108, USA
| | - James M Reecy
- Department of Animal Science, Iowa State University, Ames, IA 50011, USA
| | - Jennifer L Harrow
- Informatics Department, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK
| | - Claire Rogel-Gaillard
- INRA, UMR1313 Génétique Animale et Biologie Intégrative, F-78350, Jouy-en-Josas, France
|
20
|
Cochrane G, Alako B, Amid C, Bower L, Cerdeño-Tárraga A, Cleland I, Gibson R, Goodgame N, Jang M, Kay S, Leinonen R, Lin X, Lopez R, McWilliam H, Oisel A, Pakseresht N, Pallreddy S, Park Y, Plaister S, Radhakrishnan R, Rivière S, Rossello M, Senf A, Silvester N, Smirnov D, Ten Hoopen P, Toribio A, Vaughan D, Zalunin V. Facing growth in the European Nucleotide Archive. Nucleic Acids Res 2012. [PMID: 23203883 PMCID: PMC3531187 DOI: 10.1093/nar/gks1175] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Indexed: 12/14/2022] Open
Abstract
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) collects, maintains and presents comprehensive nucleic acid sequence and related information as part of the permanent public scientific record. Here, we provide brief updates on ENA content developments and major service enhancements in 2012 and describe in more detail two important areas of development and policy that are driven by ongoing growth in sequencing technologies. First, we describe the ENA data warehouse, a resource for which we provide a programmatic entry point to integrated content across the breadth of ENA. Second, we detail our plans for the deployment of CRAM data compression technology in ENA.
Affiliation(s)
- Guy Cochrane
- EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
|
21
|
MacArthur DG, Balasubramanian S, Frankish A, Huang N, Morris J, Walter K, Jostins L, Habegger L, Pickrell JK, Montgomery SB, Albers CA, Zhang ZD, Conrad DF, Lunter G, Zheng H, Ayub Q, DePristo MA, Banks E, Hu M, Handsaker RE, Rosenfeld JA, Fromer M, Jin M, Mu XJ, Khurana E, Ye K, Kay M, Saunders GI, Suner MM, Hunt T, Barnes IHA, Amid C, Carvalho-Silva DR, Bignell AH, Snow C, Yngvadottir B, Bumpstead S, Cooper DN, Xue Y, Romero IG, Wang J, Li Y, Gibbs RA, McCarroll SA, Dermitzakis ET, Pritchard JK, Barrett JC, Harrow J, Hurles ME, Gerstein MB, Tyler-Smith C. A systematic survey of loss-of-function variants in human protein-coding genes. Science 2012; 335:823-8. [PMID: 22344438 PMCID: PMC3299548 DOI: 10.1126/science.1215040] [Citation(s) in RCA: 869] [Impact Index Per Article: 72.4] [Indexed: 01/17/2023]
Abstract
Genome-sequencing studies indicate that all humans carry many genetic variants predicted to cause loss of function (LoF) of protein-coding genes, suggesting unexpected redundancy in the human genome. Here we apply stringent filters to 2951 putative LoF variants obtained from 185 human genomes to determine their true prevalence and properties. We estimate that human genomes typically contain ~100 genuine LoF variants with ~20 genes completely inactivated. We identify rare and likely deleterious LoF alleles, including 26 known and 21 predicted severe disease-causing variants, as well as common LoF variants in nonessential genes. We describe functional and evolutionary differences between LoF-tolerant and recessive disease genes and a method for using these differences to prioritize candidate genes found in clinical sequencing studies.
|
22
|
Amid C, Birney E, Bower L, Cerdeño-Tárraga A, Cheng Y, Cleland I, Faruque N, Gibson R, Goodgame N, Hunter C, Jang M, Leinonen R, Liu X, Oisel A, Pakseresht N, Plaister S, Radhakrishnan R, Reddy K, Rivière S, Rossello M, Senf A, Smirnov D, Ten Hoopen P, Vaughan D, Vaughan R, Zalunin V, Cochrane G. Major submissions tool developments at the European Nucleotide Archive. Nucleic Acids Res 2011; 40:D43-7. [PMID: 22080548 PMCID: PMC3245037 DOI: 10.1093/nar/gkr946] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 12/26/2022] Open
Abstract
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena), Europe's primary nucleotide sequence resource, captures and presents globally comprehensive nucleic acid sequence and associated information. Covering the spectrum from raw data to assembled and functionally annotated genomes, the ENA has witnessed a dramatic growth resulting from advances in sequencing technology and ever broadening application of the methodology. During 2011, we have continued to operate and extend the broad range of ENA services. In particular, we have released major new functionality in our interactive web submission system, Webin, through developments in template-based submissions for annotated sequences and support for raw next-generation sequence read submissions.
Affiliation(s)
- Clara Amid
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
|
23
|
Abstract
A report on the meeting 'Beyond the Genome', Boston, USA, 11-13 October 2010.
Affiliation(s)
- Amanda M Casto
- Department of Genetics, Stanford University, Stanford, CA 94305, USA.
|
24
|
Amid C, Frankish A, Aken B, Ezkurdia I, Kokocinsk F, Gilbert J, White S, Carninci P, Gingeras T, Guigo R, Searle S, Tress ML, Harrow J, Hubbard T. From identification to validation to gene count. Genome Biol 2010. [PMCID: PMC3026224 DOI: 10.1186/gb-2010-11-s1-o1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
|
25
|
Pruitt KD, Harrow J, Harte RA, Wallin C, Diekhans M, Maglott DR, Searle S, Farrell CM, Loveland JE, Ruef BJ, Hart E, Suner MM, Landrum MJ, Aken B, Ayling S, Baertsch R, Fernandez-Banet J, Cherry JL, Curwen V, Dicuccio M, Kellis M, Lee J, Lin MF, Schuster M, Shkeda A, Amid C, Brown G, Dukhanina O, Frankish A, Hart J, Maidak BL, Mudge J, Murphy MR, Murphy T, Rajan J, Rajput B, Riddick LD, Snow C, Steward C, Webb D, Weber JA, Wilming L, Wu W, Birney E, Haussler D, Hubbard T, Ostell J, Durbin R, Lipman D. The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes. Genome Res 2009; 19:1316-23. [PMID: 19498102 PMCID: PMC2704439 DOI: 10.1101/gr.080531.108] [Citation(s) in RCA: 401] [Impact Index Per Article: 26.7] [Received: 12/05/2008] [Accepted: 04/20/2009] [Indexed: 11/25/2022]
Abstract
Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.
Affiliation(s)
- Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894, USA.
|
26
|
Yamasaki C, Murakami K, Fujii Y, Sato Y, Harada E, Takeda JI, Taniya T, Sakate R, Kikugawa S, Shimada M, Tanino M, Koyanagi KO, Barrero RA, Gough C, Chun HW, Habara T, Hanaoka H, Hayakawa Y, Hilton PB, Kaneko Y, Kanno M, Kawahara Y, Kawamura T, Matsuya A, Nagata N, Nishikata K, Noda AO, Nurimoto S, Saichi N, Sakai H, Sanbonmatsu R, Shiba R, Suzuki M, Takabayashi K, Takahashi A, Tamura T, Tanaka M, Tanaka S, Todokoro F, Yamaguchi K, Yamamoto N, Okido T, Mashima J, Hashizume A, Jin L, Lee KB, Lin YC, Nozaki A, Sakai K, Tada M, Miyazaki S, Makino T, Ohyanagi H, Osato N, Tanaka N, Suzuki Y, Ikeo K, Saitou N, Sugawara H, O'Donovan C, Kulikova T, Whitfield E, Halligan B, Shimoyama M, Twigger S, Yura K, Kimura K, Yasuda T, Nishikawa T, Akiyama Y, Motono C, Mukai Y, Nagasaki H, Suwa M, Horton P, Kikuno R, Ohara O, Lancet D, Eveno E, Graudens E, Imbeaud S, Debily MA, Hayashizaki Y, Amid C, Han M, Osanger A, Endo T, Thomas MA, Hirakawa M, Makalowski W, Nakao M, Kim NS, Yoo HS, De Souza SJ, Bonaldo MDF, Niimura Y, Kuryshev V, Schupp I, Wiemann S, Bellgard M, Shionyu M, Jia L, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Zhang Q, Go M, Minoshima S, Ohtsubo M, Hanada K, Tonellato P, Isogai T, Zhang J, Lenhard B, Kim S, Chen Z, Hinz U, Estreicher A, Nakai K, Makalowska I, Hide W, Tiffin N, Wilming L, Chakraborty R, Soares MB, Chiusano ML, Suzuki Y, Auffray C, Yamaguchi-Kabata Y, Itoh T, Hishiki T, Fukuchi S, Nishikawa K, Sugano S, Nomura N, Tateno Y, Imanishi T, Gojobori T. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucleic Acids Res 2007; 36:D793-9. [PMID: 18089548 PMCID: PMC2238988 DOI: 10.1093/nar/gkm999] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Indexed: 11/13/2022] Open
Abstract
Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted sub cellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein-protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views: Transcript view and Locus view and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.
Affiliation(s)
-
- Japan Biological Information Research Center, Japan Biological Informatics Consortium, Japan
|
27
|
Imanishi T, Itoh T, Suzuki Y, O'Donovan C, Fukuchi S, Koyanagi KO, Barrero RA, Tamura T, Yamaguchi-Kabata Y, Tanino M, Yura K, Miyazaki S, Ikeo K, Homma K, Kasprzyk A, Nishikawa T, Hirakawa M, Thierry-Mieg J, Thierry-Mieg D, Ashurst J, Jia L, Nakao M, Thomas MA, Mulder N, Karavidopoulou Y, Jin L, Kim S, Yasuda T, Lenhard B, Eveno E, Suzuki Y, Yamasaki C, Takeda JI, Gough C, Hilton P, Fujii Y, Sakai H, Tanaka S, Amid C, Bellgard M, Bonaldo MDF, Bono H, Bromberg SK, Brookes AJ, Bruford E, Carninci P, Chelala C, Couillault C, de Souza SJ, Debily MA, Devignes MD, Dubchak I, Endo T, Estreicher A, Eyras E, Fukami-Kobayashi K, R. Gopinath G, Graudens E, Hahn Y, Han M, Han ZG, Hanada K, Hanaoka H, Harada E, Hashimoto K, Hinz U, Hirai M, Hishiki T, Hopkinson I, Imbeaud S, Inoko H, Kanapin A, Kaneko Y, Kasukawa T, Kelso J, Kersey P, Kikuno R, Kimura K, Korn B, Kuryshev V, Makalowska I, Makino T, Mano S, Mariage-Samson R, Mashima J, Matsuda H, Mewes HW, Minoshima S, Nagai K, Nagasaki H, Nagata N, Nigam R, Ogasawara O, Ohara O, Ohtsubo M, Okada N, Okido T, Oota S, Ota M, Ota T, Otsuki T, Piatier-Tonneau D, Poustka A, Ren SX, Saitou N, Sakai K, Sakamoto S, Sakate R, Schupp I, Servant F, Sherry S, Shiba R, Shimizu N, Shimoyama M, Simpson AJ, Soares B, Steward C, Suwa M, Suzuki M, Takahashi A, Tamiya G, Tanaka H, Taylor T, Terwilliger JD, Unneberg P, Veeramachaneni V, Watanabe S, Wilming L, Yasuda N, Yoo HS, Stodolsky M, Makalowski W, Go M, Nakai K, Takagi T, Kanehisa M, Sakaki Y, Quackenbush J, Okazaki Y, Hayashizaki Y, Hide W, Chakraborty R, Nishikawa K, Sugawara H, Tateno Y, Chen Z, Oishi M, Tonellato P, Apweiler R, Okubo K, Wagner L, Wiemann S, Strausberg RL, Isogai T, Auffray C, Nomura N, Gojobori T, Sugano S. Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS Biol 2004; 2:e162. 
[PMID: 15103394 PMCID: PMC393292 DOI: 10.1371/journal.pbio.0020162] [Citation(s) in RCA: 267] [Impact Index Per Article: 13.4] [Received: 12/19/2003] [Accepted: 04/01/2004] [Indexed: 01/08/2023] Open
Abstract
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
Affiliation(s)
- Tadashi Imanishi
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
| | - Takeshi Itoh
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- 2 Bioinformatics Laboratory, Genome Research Department, National Institute of Agrobiological Sciences, Ibaraki, Japan
| | - Yutaka Suzuki
- 3 Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- 68 Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
| | - Claire O'Donovan
- 4 EMBL Outstation—European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Satoshi Fukuchi
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | | | - Roberto A Barrero
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Takuro Tamura
- 7 Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
- 8 BITS Company, Shizuoka, Japan
| | - Yumi Yamaguchi-Kabata
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
| | - Motohiko Tanino
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- 7 Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
| | - Kei Yura
- 9 Quantum Bioinformatics Group, Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Kyoto, Japan
| | - Satoru Miyazaki
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Kazuho Ikeo
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Keiichi Homma
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Arek Kasprzyk
- 4 EMBL Outstation—European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Tetsuo Nishikawa
- 10 Reverse Proteomics Research Institute, Chiba, Japan
- 11 Central Research Laboratory, Hitachi, Tokyo, Japan
| | - Mika Hirakawa
- 12 Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
| | - Jean Thierry-Mieg
- 13 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- 14 Centre National de la Recherche Scientifique (CNRS), Laboratoire de Physique Mathematique, Montpellier, France
| | - Danielle Thierry-Mieg
- 13 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- 14 Centre National de la Recherche Scientifique (CNRS), Laboratoire de Physique Mathematique, Montpellier, France
| | - Jennifer Ashurst
- 15 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Libin Jia
- 16 National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Mitsuteru Nakao
- 3 Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Michael A Thomas
- 17 Department of Biological Sciences, Idaho State University, Pocatello, Idaho, United States of America
| | - Nicola Mulder
- 4 EMBL Outstation—European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Youla Karavidopoulou
- 4 EMBL Outstation—European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Lihua Jin
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Sangsoo Kim
- 18 Korea Research Institute of Bioscience and Biotechnology, Taejeon, Korea
| | | | - Boris Lenhard
- 19 Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden
| | - Eric Eveno
- 20 Genexpress—CNRS—Functional Genomics and Systemic Biology for Health, Villejuif Cedex, France
- 21 Sino-French Laboratory in Life Sciences and Genomics, Shanghai, China
| | - Yoshiyuki Suzuki
- 5 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Shizuoka, Japan
| | - Chisato Yamasaki
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
| | - Jun-ichi Takeda
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
| | - Craig Gough
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- 7 Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
| | - Phillip Hilton
- 1 Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- 7 Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, Tokyo, Japan
| | - Yasuyuki Fujii
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
| | - Hiroaki Sakai
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
- 22Tokyo Research Laboratories, Kyowa Hakko Kogyo CompanyTokyoJapan
| | - Susumu Tanaka
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
| | - Clara Amid
- 23MIPS—Institute for Bioinformatics, GSF—National Research Center for Environment and HealthNeuherbergGermany
| | - Matthew Bellgard
- 24Centre for Bioinformatics and Biological Computing, School of Information Technology, Murdoch UniversityMurdoch, Western AustraliaAustralia
| | - Maria de Fatima Bonaldo
- 25Medical Education and Biomedical Research Facility, University of IowaIowa City, IowaUnited States of America
| | - Hidemasa Bono
- 26Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama InstituteKanagawaJapan
| | - Susan K Bromberg
- 27Medical College of Wisconsin, MilwaukeeWisconsinUnited States of America
| | - Anthony J Brookes
- 19Center for Genomics and Bioinformatics, Karolinska InstitutetStockholmSweden
| | - Elspeth Bruford
- 28HUGO Gene Nomenclature Committee, University College LondonLondonUnited Kingdom
| | | | - Claude Chelala
- 20Genexpress—CNRS—Functional Genomics and Systemic Biology for HealthVillejuif CedexFrance
| | - Christine Couillault
- 20Genexpress—CNRS—Functional Genomics and Systemic Biology for HealthVillejuif CedexFrance
- 21Sino-French Laboratory in Life Sciences and GenomicsShanghaiChina
| | | | - Marie-Anne Debily
- 20Genexpress—CNRS—Functional Genomics and Systemic Biology for HealthVillejuif CedexFrance
| | | | - Inna Dubchak
- 32Lawrence Berkeley National Laboratory, BerkeleyCaliforniaUnited States of America
| | - Toshinori Endo
- 33Department of Bioinformatics, Medical Research Institute, Tokyo Medical and Dental UniversityTokyoJapan
| | | | - Eduardo Eyras
- 15The Wellcome Trust Sanger Institute, Wellcome Trust Genome CampusCambridgeUnited Kingdom
| | - Kaoru Fukami-Kobayashi
- 35Bioresource Information Division, RIKEN BioResource Center, RIKEN Tsukuba InstituteIbarakiJapan
| | - Gopal R. Gopinath
- 36Genome Knowledgebase, Cold Spring Harbor LaboratoryCold Spring Harbor, New YorkUnited States of America
| | - Esther Graudens
- 20Genexpress—CNRS—Functional Genomics and Systemic Biology for HealthVillejuif CedexFrance
- 21Sino-French Laboratory in Life Sciences and GenomicsShanghaiChina
| | - Yoonsoo Hahn
- 18Korea Research Institute of Bioscience and BiotechnologyTaejeonKorea
| | - Michael Han
- 23MIPS—Institute for Bioinformatics, GSF—National Research Center for Environment and HealthNeuherbergGermany
| | - Ze-Guang Han
- 21Sino-French Laboratory in Life Sciences and GenomicsShanghaiChina
- 37Chinese National Human Genome Center at ShanghaiShanghaiChina
| | - Kousuke Hanada
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Hideki Hanaoka
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
| | - Erimi Harada
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
| | - Katsuyuki Hashimoto
- 38Division of Genetic Resources, National Institute of Infectious DiseasesTokyoJapan
| | - Ursula Hinz
- 34Swiss Institute of BioinformaticsGenevaSwitzerland
| | - Momoki Hirai
- 39Graduate School of Frontier Sciences, Department of Integrated Biosciences, University of TokyoChibaJapan
| | - Teruyoshi Hishiki
- 40Functional Genomics Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
| | - Ian Hopkinson
- 41Department of Primary Care and Population Sciences, Royal Free University College Medical School, University College LondonLondonUnited Kingdom
- 42Clinical and Molecular Genetics Unit, The Institute of Child HealthLondonUnited Kingdom
| | - Sandrine Imbeaud
- 20Genexpress—CNRS—Functional Genomics and Systemic Biology for HealthVillejuif CedexFrance
- 21Sino-French Laboratory in Life Sciences and GenomicsShanghaiChina
| | - Hidetoshi Inoko
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
- 43Department of Genetic Information, Division of Molecular Life Science, School of Medicine, Tokai UniversityKanagawaJapan
| | - Alexander Kanapin
- 4EMBL Outstation—European Bioinformatics Institute, Wellcome Trust Genome CampusCambridgeUnited Kingdom
| | - Yayoi Kaneko
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
| | - Takeya Kasukawa
- 26Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama InstituteKanagawaJapan
| | - Janet Kelso
- 44South African National Bioinformatics Institute, University of the Western CapeBellvilleSouth Africa
| | - Paul Kersey
- 4EMBL Outstation—European Bioinformatics Institute, Wellcome Trust Genome CampusCambridgeUnited Kingdom
| | | | | | - Bernhard Korn
- 46RZPD Resource Center for Genome ResearchHeidelbergGermany
| | - Vladimir Kuryshev
- 47Molecular Genome Analysis, German Cancer Research Center-DKFZHeidelbergGermany
| | - Izabela Makalowska
- 48Pennsylvania State UniversityUniversity Park, PennsylvaniaUnited States of America
| | - Takashi Makino
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Shuhei Mano
- 43Department of Genetic Information, Division of Molecular Life Science, School of Medicine, Tokai UniversityKanagawaJapan
| | - Regine Mariage-Samson
- 20Genexpress—CNRS—Functional Genomics and Systemic Biology for HealthVillejuif CedexFrance
| | - Jun Mashima
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Hideo Matsuda
- 49Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka UniversityOsakaJapan
| | - Hans-Werner Mewes
- 23MIPS—Institute for Bioinformatics, GSF—National Research Center for Environment and HealthNeuherbergGermany
| | - Shinsei Minoshima
- 50Medical Photobiology Department, Photon Medical Research Center, Hamamatsu University School of MedicineShizuokaJapan
- 52Department of Molecular Biology, Keio University School of MedicineTokyoJapan
| | | | - Hideki Nagasaki
- 51Computational Biology Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
| | - Naoki Nagata
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
| | - Rajni Nigam
- 27Medical College of Wisconsin, MilwaukeeWisconsinUnited States of America
| | - Osamu Ogasawara
- 3Human Genome Center, The Institute of Medical Science, The University of TokyoTokyoJapan
| | | | - Masafumi Ohtsubo
- 52Department of Molecular Biology, Keio University School of MedicineTokyoJapan
| | - Norihiro Okada
- 53Department of Biological Sciences, Graduate School of Bioscience and Biotechnology, Tokyo Institute of TechnologyKanagawaJapan
| | - Toshihisa Okido
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Satoshi Oota
- 35Bioresource Information Division, RIKEN BioResource Center, RIKEN Tsukuba InstituteIbarakiJapan
| | - Motonori Ota
- 54Global Scientific Information and Computing Center, Tokyo Institute of TechnologyTokyoJapan
| | - Toshio Ota
- 22Tokyo Research Laboratories, Kyowa Hakko Kogyo CompanyTokyoJapan
| | - Tetsuji Otsuki
- 55Molecular Biology Laboratory, Medicinal Research Laboratories, Taisho Pharmaceutical CompanySaitamaJapan
| | | | - Annemarie Poustka
- 47Molecular Genome Analysis, German Cancer Research Center-DKFZHeidelbergGermany
| | - Shuang-Xi Ren
- 21Sino-French Laboratory in Life Sciences and GenomicsShanghaiChina
- 37Chinese National Human Genome Center at ShanghaiShanghaiChina
| | - Naruya Saitou
- 56Department of Population Genetics, National Institute of GeneticsShizuokaJapan
| | - Katsunaga Sakai
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Shigetaka Sakamoto
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Ryuichi Sakate
- 39Graduate School of Frontier Sciences, Department of Integrated Biosciences, University of TokyoChibaJapan
| | - Ingo Schupp
- 47Molecular Genome Analysis, German Cancer Research Center-DKFZHeidelbergGermany
| | - Florence Servant
- 4EMBL Outstation—European Bioinformatics Institute, Wellcome Trust Genome CampusCambridgeUnited Kingdom
| | - Stephen Sherry
- 13National Center for Biotechnology Information, National Library of Medicine, National Institutes of HealthBethesda, MarylandUnited States of America
| | - Rie Shiba
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
| | - Nobuyoshi Shimizu
- 52Department of Molecular Biology, Keio University School of MedicineTokyoJapan
| | - Mary Shimoyama
- 27Medical College of Wisconsin, MilwaukeeWisconsinUnited States of America
| | | | - Bento Soares
- 25Medical Education and Biomedical Research Facility, University of IowaIowa City, IowaUnited States of America
| | - Charles Steward
- 15The Wellcome Trust Sanger Institute, Wellcome Trust Genome CampusCambridgeUnited Kingdom
| | - Makiko Suwa
- 51Computational Biology Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
| | - Mami Suzuki
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Aiko Takahashi
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
| | - Gen Tamiya
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
- 43Department of Genetic Information, Division of Molecular Life Science, School of Medicine, Tokai UniversityKanagawaJapan
| | - Hiroshi Tanaka
- 33Department of Bioinformatics, Medical Research Institute, Tokyo Medical and Dental UniversityTokyoJapan
| | - Todd Taylor
- 57Human Genome Research Group, Genomic Sciences Center, RIKEN Yokohama InstituteKanagawaJapan
| | - Joseph D Terwilliger
- 58Columbia University and Columbia Genome CenterNew York, New YorkUnited States of America
| | - Per Unneberg
- 59Department of Biotechnology, Royal Institute of TechnologyStockholmSweden
| | - Vamsi Veeramachaneni
- 48Pennsylvania State UniversityUniversity Park, PennsylvaniaUnited States of America
| | - Shinya Watanabe
- 3Human Genome Center, The Institute of Medical Science, The University of TokyoTokyoJapan
| | - Laurens Wilming
- 15The Wellcome Trust Sanger Institute, Wellcome Trust Genome CampusCambridgeUnited Kingdom
| | - Norikazu Yasuda
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 7Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics ConsortiumTokyoJapan
| | - Hyang-Sook Yoo
- 18Korea Research Institute of Bioscience and BiotechnologyTaejeonKorea
| | - Marvin Stodolsky
- 60Biology Division and Genome Task Group, Office of Biological and Environmental Research, United States Department of EnergyWashington, D.CUnited States of America
| | - Wojciech Makalowski
- 48Pennsylvania State UniversityUniversity Park, PennsylvaniaUnited States of America
| | - Mitiko Go
- 61Faculty of Bio-Science, Nagahama Institute of Bio-Science and TechnologyShigaJapan
| | - Kenta Nakai
- 3Human Genome Center, The Institute of Medical Science, The University of TokyoTokyoJapan
| | - Toshihisa Takagi
- 3Human Genome Center, The Institute of Medical Science, The University of TokyoTokyoJapan
| | - Minoru Kanehisa
- 12Bioinformatics Center, Institute for Chemical Research, Kyoto UniversityKyotoJapan
| | - Yoshiyuki Sakaki
- 3Human Genome Center, The Institute of Medical Science, The University of TokyoTokyoJapan
- 57Human Genome Research Group, Genomic Sciences Center, RIKEN Yokohama InstituteKanagawaJapan
| | - John Quackenbush
- 62Institute for Genomic ResearchRockville, MarylandUnited States of America
| | - Yasushi Okazaki
- 26Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama InstituteKanagawaJapan
| | - Yoshihide Hayashizaki
- 26Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama InstituteKanagawaJapan
| | - Winston Hide
- 44South African National Bioinformatics Institute, University of the Western CapeBellvilleSouth Africa
| | - Ranajit Chakraborty
- 63Center for Genome Information, Department of Environmental Health, University of CincinnatiCincinnati, OhioUnited States of America
| | - Ken Nishikawa
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Hideaki Sugawara
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Yoshio Tateno
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
| | - Zhu Chen
- 21Sino-French Laboratory in Life Sciences and GenomicsShanghaiChina
- 37Chinese National Human Genome Center at ShanghaiShanghaiChina
- 64State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, Rui-Jin Hospital, Shanghai Second Medical UniversityShanghaiChina
| | | | - Peter Tonellato
- 65PointOne SystemsWauwatosa, WisconsinUnited States of America
| | - Rolf Apweiler
- 4EMBL Outstation—European Bioinformatics Institute, Wellcome Trust Genome CampusCambridgeUnited Kingdom
| | - Kousaku Okubo
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
- 40Functional Genomics Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
| | - Lukas Wagner
- 13National Center for Biotechnology Information, National Library of Medicine, National Institutes of HealthBethesda, MarylandUnited States of America
| | - Stefan Wiemann
- 47Molecular Genome Analysis, German Cancer Research Center-DKFZHeidelbergGermany
| | - Robert L Strausberg
- 16National Cancer Institute, National Institutes of HealthBethesda, MarylandUnited States of America
| | - Takao Isogai
- 10Reverse Proteomics Research InstituteChibaJapan
- 66Graduate School of Life and Environmental Sciences, University of TsukubaIbarakiJapan
| | - Charles Auffray
- 20Genexpress—CNRS—Functional Genomics and Systemic Biology for HealthVillejuif CedexFrance
- 21Sino-French Laboratory in Life Sciences and GenomicsShanghaiChina
| | - Nobuo Nomura
- 40Functional Genomics Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
| | - Takashi Gojobori
- 1Integrated Database Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 5Center for Information Biology and DNA Data Bank of Japan, National Institute of GeneticsShizuokaJapan
- 67Department of Genetics, Graduate University for Advanced StudiesShizuokaJapan
| | - Sumio Sugano
- 3Human Genome Center, The Institute of Medical Science, The University of TokyoTokyoJapan
- 40Functional Genomics Group, Biological Information Research Center, National Institute of Advanced Industrial Science and TechnologyTokyoJapan
- 68Department of Medical Genome Sciences, Graduate School of Frontier Sciences, University of TokyoTokyoJapan
|
28
|
Mewes HW, Amid C, Arnold R, Frishman D, Güldener U, Mannhaupt G, Münsterkötter M, Pagel P, Strack N, Stümpflen V, Warfsmann J, Ruepp A. MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res 2004; 32:D41-4. [PMID: 14681354 PMCID: PMC308826 DOI: 10.1093/nar/gkh092]
Abstract
The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).
Affiliation(s)
- H W Mewes
- Institute for Bioinformatics (MIPS), GSF National Research Center for Environment and Health, Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany.
|
29
|
Amid C, Bahr A, Mujica A, Sampson N, Bikar SE, Winterpacht A, Zabel B, Hankeln T, Schmidt ER. Comparative genomic sequencing reveals a strikingly similar architecture of a conserved syntenic region on human chromosome 11p15.3 (including gene ST5) and mouse chromosome 7. Cytogenet Cell Genet 2001; 93:284-90. [PMID: 11528127 DOI: 10.1159/000056999]
Abstract
Comparative genomics is a superior way to identify phylogenetically conserved features such as genes or regions involved in gene regulation. The comparison of extended orthologous chromosomal regions should also reveal other characteristic traits essential for chromosome or gene function. In the present study we have sequenced and compared a region of conserved synteny from human chromosome 11p15.3 and mouse chromosome 7. In human, this region is known to contain several genes involved in the development of various disorders like Beckwith-Wiedemann overgrowth syndrome and other tumor diseases. Furthermore, in the neighboring chromosome region 11p15.5 extensive imprinting of genes has been reported which might extend to region 11p15.3. The analysis of approximately 730 kb in human and 620 kb in mouse led to the identification of eleven genes. All putative genes found in the mouse DNA were also present in the same order and orientation in the human chromosome. However, in the human DNA one putative gene of unknown function could be identified which is not present in the orthologous position of the mouse chromosome. The sequence similarity between human and mouse is higher in transcribed and exon regions than in non-transcribed segments. Dot plot analysis, however, reveals a surprisingly well-conserved sequence similarity over the entire analyzed region. In particular, the positions of CpG islands, short regions of very high GC content in the 5' region of putative genes, are similar in human and mouse. With respect to base composition, two distinct segments of significantly different GC content exist in human as well as in mouse. With a GC content of 45%, one segment would correspond to "isochore H1" and the other segment (39% GC in human, 40% GC in mouse) to "isochore L1/L2". The gene density (one gene per 66 kb) is slightly higher than the average calculated for the complete human genome (one gene per 90 kb).
The comparison of the number and distribution of repetitive elements shows that the proportion of human DNA made up by interspersed repeats (43.8%) is significantly higher than in the corresponding mouse DNA (30.1%). This partly explains why the human DNA is longer between the landmark genes used to define the orthologous positions in human and mouse.
Affiliation(s)
- C Amid
- Institute of Molecular Genetics, Biosafety Research and Consulting, Johannes Gutenberg University, Mainz, Germany
|
30
|
Hankeln T, Amid C, Weich B, Niessing J, Schmidt ER. Molecular evolution of the globin gene cluster E in two distantly related midges, Chironomus pallidivittatus and C. thummi thummi. J Mol Evol 1998; 46:589-601. [PMID: 9545469 DOI: 10.1007/pl00006339]
Abstract
We have studied the evolutionary dynamics of a cluster of insect globin genes by comparing the organization and sequence of the gene group in two distantly related species, Chironomus pallidivittatus and C. t. thummi. Although the general architecture of the globin gene cluster has been conserved, we have found an additional, previously undescribed gene (named Cpa F) in C. pallidivittatus which shows signs of accelerated sequence evolution at nonsynonymous codon positions. This new gene is clearly functional, as demonstrated by Northern analysis. Comparison of paralogous and orthologous genes reveals patterns of intraspecific sequence homogenization. The head-to-head-oriented globin 3 and 4 gene pairs in C. t. thummi and the gb 4 gene pair in C. pallidivittatus have been efficiently homogenized, probably by gene conversion, in their promoter and coding regions. Inverted transcriptional orientation seems to favor efficient conversion. The orthologous genes from C. t. thummi and C. pallidivittatus reveal different levels of sequence conservation, ranging from 85.3 to 94.7% amino acid identity. Surprisingly, globin gene E, for which up to now no corresponding protein has been detected in the larval hemolymph of C. t. thummi, shows the highest degree of interspecies sequence conservation. This points to an essential, as yet unknown function of this globin. The usefulness of globin gene comparisons for dating speciation events in Chironomus is discussed.
Affiliation(s)
- T Hankeln
- Institut fuer Molekulargenetik, gentechnologische Sicherheitsforschung und Beratung, Johannes Gutenberg Univ. Mainz, Becherweg 32, D-55099 Mainz, Germany (FRG).
|