1
Kai N, Qingsong C, Kejia M, Weiwei L, Xing W, Xuejie C, Lixia C, Minzi D, Yuanyuan Y, Xiaoyan W. An Inflammatory Bowel Diseases Integrated Resources Portal (IBDIRP). Database (Oxford) 2024; 2024:baad097. [PMID: 38227799 DOI: 10.1093/database/baad097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 12/02/2023] [Accepted: 12/22/2023] [Indexed: 01/18/2024]
Abstract
IBD, including ulcerative colitis and Crohn's disease, is a chronic and debilitating gastrointestinal disorder that affects millions of people worldwide. Research on IBD has generated massive amounts of data, including literature, metagenomics, metabolomics, bioresources and databases. We aim to create an IBD Integrated Resources Portal (IBDIRP) that provides the most comprehensive resources for IBD. An integrated platform was developed that provides information on different aspects of IBD research resources, such as single-nucleotide polymorphisms (SNPs), genes, transcriptome, microbiota, metabolomics, single cells and other resources. Valuable and comprehensive IBD-related data were collected from PubMed, Google, GMrepo, gutMega, gutMDisorder, Single Cell Portal and other sources. Then, the data were systematically sorted, and these resources were manually curated. We systematically sorted and cataloged more than 320 unique risk SNPs associated with IBD in the SNP section. We presented over 289 IBD-related genes based on the database collection in the gene section. We also obtained 153 manually curated IBD transcriptomics data, including 12 388 samples, on the Gene Expression Omnibus database. The sorted IBD-related microbiota data from three primary microbiome databases (GMrepo, gutMega and gutMDisorder) were available for download. We selected 23 149 IBD-related taxonomic records from these databases. Additionally, we collected 24 IBD metabolomics studies with 2896 participants in the metabolomics section. We introduced two interactive single-cell data plug-in units that provided data visualization based on cells and genes. Finally, we listed 18 significant IBD web resources, such as the official European Crohn's and Colitis Organisation and International Organization for the Study of IBD websites, IBD scoring tools, IBD genetic and multi-omics resources, IBD biobanks and other useful research resources. 
The IBDIRP website is the first integrated resource for global IBD researchers. This portal will help researchers by providing comprehensive knowledge and enabling them to reinforce the multidimensional impression of IBD. The IBDIRP website is accessible via www.ibdirp.com. Database URL: www.ibdirp.com.
Affiliation(s)
- Nie Kai
  - Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
- Ma Kejia
  - Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
- Luo Weiwei
  - Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
- Wu Xing
  - Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
- Chen Xuejie
  - Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
- Cai Lixia
  - Changsha Hospital for Maternal and Child Health Care Affiliated to Hunan Normal University, Changsha Hunan 410000, China
- Deng Minzi
  - Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
- Yang Yuanyuan
  - Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
- Wang Xiaoyan
  - Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha Hunan 410000, China
  - Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha Hunan 410000, China
  - The College of Computer Science in Sichuan University, Chengdu Sichuan 610000, China
2
Timme RE, Karsch-Mizrachi I, Waheed Z, Arita M, MacCannell D, Maguire F, Petit III R, Page AJ, Mendes CI, Nasar MI, Oluniyi P, Tyler AD, Raphenya AR, Guthrie JL, Olawoye I, Rinck G, O’Cathail C, Lees J, Cochrane G, Cummins C, Brister JR, Klimke W, Feldgarden M, Griffiths E. Putting everything in its place: using the INSDC compliant Pathogen Data Object Model to better structure genomic data submitted for public health applications. Microb Genom 2023; 9:001145. [PMID: 38085797 PMCID: PMC10763499 DOI: 10.1099/mgen.0.001145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 11/13/2023] [Indexed: 12/18/2023] Open
Abstract
Fast, efficient public health actions require well-organized and coordinated systems that can supply timely and accurate knowledge. Public databases of pathogen genomic data, such as the International Nucleotide Sequence Database Collaboration (INSDC), have become essential tools for efficient public health decisions. However, these international resources began primarily for academic purposes, rather than for surveillance or interventions. Now, queries need to access not only the whole genomes of multiple pathogens but also make connections using robust contextual metadata to identify issues of public health relevance. Databases that over time developed a patchwork of submission formats and requirements need to be consistently organized and coordinated internationally to allow effective searches. To help resolve these issues, we propose a common pathogen data structure called the Pathogen Data Object Model (DOM) that will formalize the minimum pieces of sequence data and contextual data necessary for general public health uses, while recognizing that submitters will likely withhold a wide range of non-public contextual data. Further, we propose contributors use the Pathogen DOM for all pathogen submissions (bacterial, viral, fungal, and parasites), which will simplify data submissions and provide a consistent and transparent data structure for downstream data analyses. We also highlight how improved submission tools can support the Pathogen DOM, offering users additional easy-to-use methods to ensure this structure is followed.
Affiliation(s)
- Ruth E. Timme
  - Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, MD, USA
- Ilene Karsch-Mizrachi
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Zahra Waheed
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Masanori Arita
  - DNA Data Bank of Japan, National Institute of Genetics, Mishima, Japan
- Duncan MacCannell
  - National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Finlay Maguire
  - Department of Community Health & Epidemiology, Faculty of Medicine, Dalhousie University, Halifax, Canada
  - Faculty of Computer Science, Dalhousie University, Halifax, Canada
- Andrew J. Page
  - Quadram Institute Bioscience, Norwich, Norfolk, UK
  - Theiagen Genomics LLC, Highlands Ranch, CO, USA
- Muhammad Ibtisam Nasar
  - Department of Biology, College of Science, United Arab Emirates University, Al Ain, Abu Dhabi, UAE
- Paul Oluniyi
  - Chan Zuckerberg Biohub Network, San Francisco, CA, USA
- Andrea D. Tyler
  - Science Technology Cores and Services, National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, Canada
- Amogelang R. Raphenya
  - Department of Biochemistry and Biomedical Sciences and the Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
- Jennifer L. Guthrie
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, Ontario, Canada
- Idowu Olawoye
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, Ontario, Canada
- Gabriele Rinck
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Colman O’Cathail
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- John Lees
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Carla Cummins
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- J. Rodney Brister
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- William Klimke
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Michael Feldgarden
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Emma Griffiths
  - Faculty of Health Sciences, Simon Fraser University, Burnaby, British Columbia, Canada
3
Lee B, Hwang S, Kim PG, Ko G, Jang K, Kim S, Kim JH, Jeon J, Kim H, Jung J, Yoon BH, Byeon I, Jang I, Song W, Choi J, Kim SY. Introduction of the Korea BioData Station (K-BDS) for sharing biological data. Genomics Inform 2023; 21:e12. [PMID: 37037470 PMCID: PMC10085736 DOI: 10.5808/gi.22073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 03/06/2023] [Indexed: 04/03/2023] Open
Abstract
A wave of new technologies has created opportunities for the cost-effective generation of high-throughput profiles of biological systems, foreshadowing a "data-driven science" era. The large variety of data available from biological research is also a rich resource that can be used for innovative endeavors. However, we are facing considerable challenges in big data deposition, integration, and translation due to the complexity of biological data and its production at unprecedented exponential rates. To address these problems, in 2020, the Korean government officially announced a national strategy to collect and manage the biological data produced through national R&D fund allocations and provide the collected data to researchers. To this end, the Korea Bioinformation Center (KOBIC) developed a new biological data repository, the Korea BioData Station (K-BDS), for sharing data from individual researchers and research programs to create a data-driven biological study environment. The K-BDS is dedicated to providing free open access to a suite of featured data resources in support of worldwide activities in both academia and industry.
4
Wolfsberger W, Chhugani K, Shchubelka K, Frolova A, Salyha Y, Zlenko O, Arych M, Dziuba D, Parkhomenko A, Smolanka V, Gümüş ZH, Sezgin E, Diaz-Lameiro A, Toth VR, Maci M, Bortz E, Kondrashov F, Morton PM, Łabaj PP, Romero V, Hlávka J, Mangul S, Oleksyk TK. Scientists without borders: lessons from Ukraine. Gigascience 2022; 12:giad045. [PMID: 37496156 PMCID: PMC10372202 DOI: 10.1093/gigascience/giad045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 05/31/2023] [Accepted: 06/01/2023] [Indexed: 07/28/2023] Open
Abstract
Conflicts and natural disasters affect entire populations of the countries involved and, in addition to the thousands of lives destroyed, have a substantial negative impact on the scientific advances these countries provide. The unprovoked invasion of Ukraine by Russia, the devastating earthquake in Turkey and Syria, and the ongoing conflicts in the Middle East are just a few examples. Millions of people have been killed or displaced, their futures uncertain. These events have resulted in extensive infrastructure collapse, with loss of electricity, transportation, and access to services. Schools, universities, and research centers have been destroyed along with decades' worth of data, samples, and findings. Scholars in disaster areas face short- and long-term problems in terms of what they can accomplish now for obtaining grants and for employment in the long run. In our interconnected world, conflicts and disasters are no longer a local problem but have wide-ranging impacts on the entire world, both now and in the future. Here, we focus on the current and ongoing impact of war on the scientific community within Ukraine and from this draw lessons that can be applied to all affected countries where scientists at risk are facing hardship. We present and classify examples of effective and feasible mechanisms used to support researchers in countries facing hardship and discuss how these can be implemented with help from the international scientific community and what more is desperately needed. Reaching out, providing accessible training opportunities, and developing collaborations should increase inclusion and connectivity, support scientific advancements within affected communities, and expedite postwar and disaster recovery.
Affiliation(s)
- Walter Wolfsberger
  - Department of Biological Sciences, Oakland University, Rochester, MI 48309-4479, USA
- Karishma Chhugani
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA 90033, USA
- Khrystyna Shchubelka
  - Department of Biological Sciences, Oakland University, Rochester, MI 48309-4479, USA
- Alina Frolova
  - Institute of Molecular Biology and Genetics of National Academy of Sciences of Ukraine, Kyiv Academic University, Kyiv 03143, Ukraine
- Yuriy Salyha
  - Institute of Animal Biology, National Academy of Agrarian Sciences (NAAS) of Ukraine, Lviv 79034, Ukraine
- Oksana Zlenko
  - National Scientific Center “Institute of Experimental and Clinical Veterinary Medicine,” Kharkiv 61023, Ukraine
- Mykhailo Arych
  - Institute of Economics and Management, National University of Food Technologies (NUFT) of Ukraine, Kyiv 01601, Ukraine
- Dmytro Dziuba
  - Department of Anesthesiology and Intensive Care, P.L. Shpyk NUHC Ukraine, Kyiv 04112, Ukraine
- Andrii Parkhomenko
  - Department of Finance and Business Economics, Marshall School of Business, University of Southern California, Los Angeles, CA 90089, USA
- Volodymyr Smolanka
  - Department of Medicine, Uzhhorod National University, Uzhhorod 88000, Ukraine
- Zeynep H Gümüş
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Efe Sezgin
  - Department of Food Engineering, Izmir Institute of Technology, Urla, Izmir 35430, Turkey
- Alondra Diaz-Lameiro
  - Department of Biology, University of Puerto Rico at Mayagüez, Mayagüez 00681, Puerto Rico
- Viktor R Toth
  - Aquatic Botany and Microbial Ecology Research Group, Balaton Limnological Research Institute, Tihany 8237, Hungary
- Megi Maci
  - Stritch School of Medicine, Loyola University Chicago, Maywood, IL 60153, USA
- Eric Bortz
  - Department of Biological Sciences, University of Alaska, Anchorage, AK 99508, USA
- Fyodor Kondrashov
  - Institute of Science and Technology Austria, Klosterneuburg 3400, Austria
- Patricia M Morton
  - Department of Sociology, Department of Public Health, Wayne State University, Detroit, MI 48202, USA
- Paweł P Łabaj
  - Małopolska Centre of Biotechnology, Jagiellonian University, Kraków 30-348, Poland
- Veronika Romero
  - Department of Neurobiology, University of Utah, Salt Lake City, UT 84112, USA
- Jakub Hlávka
  - Price School of Public Policy, University of Southern California, Los Angeles, CA 90089-3333, USA
  - Masaryk University, Brno 6017, Czech Republic
- Serghei Mangul
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA 90033, USA
  - Department of Computational Biology, University of Southern California, Los Angeles, CA 90033, USA
- Taras K Oleksyk
  - Department of Biological Sciences, Oakland University, Rochester, MI 48309-4479, USA
  - Department of Biology, Uzhhorod National University, Uzhhorod 88000, Ukraine
5
Munk P, Brinch C, Møller FD, Petersen TN, Hendriksen RS, Seyfarth AM, Kjeldgaard JS, Svendsen CA, van Bunnik B, Berglund F, Larsson DGJ, Koopmans M, Woolhouse M, Aarestrup FM. Genomic analysis of sewage from 101 countries reveals global landscape of antimicrobial resistance. Nat Commun 2022; 13:7251. [PMID: 36456547 PMCID: PMC9715550 DOI: 10.1038/s41467-022-34312-7] [Citation(s) in RCA: 80] [Impact Index Per Article: 40.0] [Received: 07/25/2022] [Accepted: 10/20/2022] [Indexed: 12/03/2022] Open
Abstract
Antimicrobial resistance (AMR) is a major threat to global health. Understanding the emergence, evolution, and transmission of individual antibiotic resistance genes (ARGs) is essential to develop sustainable strategies combatting this threat. Here, we use metagenomic sequencing to analyse ARGs in 757 sewage samples from 243 cities in 101 countries, collected from 2016 to 2019. We find regional patterns in resistomes, and these differ between subsets corresponding to drug classes and are partly driven by taxonomic variation. The genetic environments of 49 common ARGs are highly diverse, with most common ARGs carried by multiple distinct genomic contexts globally and sometimes on plasmids. Analysis of flanking sequence revealed ARG-specific patterns of dispersal limitation and global transmission. Our data furthermore suggest certain geographies are more prone to transmission events and should receive additional attention.
Affiliation(s)
- Patrick Munk
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Christian Brinch
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Frederik Duus Møller
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Thomas N Petersen
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Rene S Hendriksen
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Anne Mette Seyfarth
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Jette S Kjeldgaard
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Christina Aaby Svendsen
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
- Bram van Bunnik
  - Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh, UK
- Fanny Berglund
  - Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- D G Joakim Larsson
  - Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- Marion Koopmans
  - Department of Viroscience, Erasmus MC, Rotterdam, The Netherlands
- Mark Woolhouse
  - Centre for Immunity, Infection and Evolution, University of Edinburgh, Edinburgh, UK
- Frank M Aarestrup
  - Research Group for Genomic Epidemiology, Technical University of Denmark, Kgs. Lyngby, Denmark
6
Premzl M. Revised eutherian gene collections. BMC Genom Data 2022; 23:56. [PMID: 35870891 PMCID: PMC9308196 DOI: 10.1186/s12863-022-01071-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2021] [Accepted: 07/13/2022] [Indexed: 11/24/2022] Open
Abstract
Objectives: The most recent research projects in the scientific field of eutherian comparative genomics included intentions to sequence every extant eutherian species genome in the foreseeable future, so that future revisions and updates of eutherian gene data sets were expected. Data description: Using 35 public eutherian reference genomic sequence assemblies and freely available software, the eutherian comparative genomic analysis protocol RRID:SCR_014401 was published as guidance against potential genomic sequence errors. The protocol curated 14 eutherian third-party data gene data sets, including, in aggregate, 2615 complete coding sequences that were deposited in the European Nucleotide Archive. The published eutherian gene collections were used in revisions and updates of eutherian gene data set classifications and nomenclatures that included gene annotations, phylogenetic analyses and protein molecular evolution analyses.
7
Identification of mutations in SARS-CoV-2 PCR primer regions. Sci Rep 2022; 12:18651. [PMID: 36333366 PMCID: PMC9636223 DOI: 10.1038/s41598-022-21953-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/08/2022] [Accepted: 10/06/2022] [Indexed: 11/06/2022] Open
Abstract
Due to the constantly increasing number of mutations in the SARS-CoV-2 genome, concerns have emerged over the possibility of decreased diagnostic accuracy of reverse transcription-polymerase chain reaction (RT-PCR), the gold standard diagnostic test for SARS-CoV-2. We propose an analysis pipeline to discover genomic variations overlapping the target regions of commonly used PCR primer sets. We provide the list of these mutations in a publicly available format based on a dataset of more than 1.2 million SARS-CoV-2 samples. Our approach distinguishes among mutations possibly having a damaging impact on PCR efficiency and ones anticipated to be neutral in this sense. Samples are categorized as "prone to misclassification" vs. "likely to be correctly detected" by a given PCR primer set based on the estimated effect of mutations present. Samples susceptible to misclassification are generally present at a daily rate of 2% or lower, although particular primer sets seem to have compromised performance when detecting Omicron samples. As different variant strains may temporarily gain dominance in the worldwide SARS-CoV-2 viral population, the efficiency of a particular PCR primer set may change over time, therefore constant monitoring of variations in primer target regions is highly recommended.
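The core step of such a pipeline, finding mutations that fall inside primer target regions, can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the `mutations_in_primers` function, and all coordinates and sample IDs below are hypothetical, and the sketch only detects overlap without estimating whether a mutation would impair PCR efficiency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerRegion:
    name: str
    start: int  # 1-based, inclusive genome coordinate
    end: int    # 1-based, inclusive genome coordinate


@dataclass(frozen=True)
class Mutation:
    sample_id: str
    position: int  # 1-based genome coordinate
    ref: str
    alt: str


def mutations_in_primers(mutations, primers):
    """Map each primer name to the mutations whose position lies inside it."""
    hits = {p.name: [] for p in primers}
    for m in mutations:
        for p in primers:
            if p.start <= m.position <= p.end:
                hits[p.name].append(m)
    return hits


# Made-up coordinates and samples, for illustration only.
primers = [PrimerRegion("primer_A", 100, 120), PrimerRegion("primer_B", 300, 325)]
mutations = [
    Mutation("s1", 110, "C", "T"),  # inside primer_A
    Mutation("s1", 200, "G", "A"),  # outside both regions
    Mutation("s2", 325, "A", "G"),  # on the last base of primer_B
]

hits = mutations_in_primers(mutations, primers)
# Samples carrying at least one mutation in any primer region.
affected_samples = {m.sample_id for ms in hits.values() for m in ms}
```

A real analysis would additionally need a rule for separating potentially damaging mutations from neutral ones, which the abstract mentions but does not specify.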
8
Diversity and distribution of Type VI Secretion System gene clusters in bacterial plasmids. Sci Rep 2022; 12:8249. [PMID: 35581398 PMCID: PMC9113992 DOI: 10.1038/s41598-022-12382-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/02/2022] [Accepted: 05/10/2022] [Indexed: 11/16/2022] Open
Abstract
Type VI Secretion System (T6SS) is a nanomolecular apparatus that allows the delivery of effector molecules through the cell envelope of a donor bacterium to prokaryotic and/or eukaryotic cells, playing a role in the bacterial competition, virulence, and host interaction. T6SS is patchily distributed in bacterial genomes, suggesting an association with horizontal gene transfer (HGT). In fact, T6SS gene loci are eventually found within genomic islands (GIs), and there are some reports in plasmids and integrative and conjugative elements (ICEs). The impact that T6SS may have on bacteria fitness and the lack of evidence on its spread mechanism led us to question whether plasmids could represent a key mechanism in the spread of T6SS in bacteria. Therefore, we performed an in-silico analysis to reveal the association between T6SS and plasmids. T6SS was mined on 30,660 plasmids from NCBI based on the presence of at least six T6SS core proteins. T6SS was identified in 330 plasmids, all belonging to the same type (T6SSi), mainly in Proteobacteria (328/330), particularly in Rhizobium and Ralstonia. Interestingly, most genomes carrying T6SS-harboring plasmids did not encode T6SS in their chromosomes, and, in general, chromosomal and plasmid T6SSs did not form separate clades.
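The selection criterion described here (a plasmid counts as T6SS-positive when at least six core proteins are present) can be sketched as a simple filter. The function name, the plasmid IDs and the per-plasmid hit lists are hypothetical, and the sketch assumes the threshold applies to distinct core proteins, which the abstract does not state explicitly.

```python
MIN_CORE_PROTEINS = 6  # threshold stated in the abstract


def plasmids_with_t6ss(core_hits, min_core=MIN_CORE_PROTEINS):
    """Return sorted IDs of plasmids with at least `min_core` distinct core proteins.

    `core_hits` maps a plasmid ID to the list of T6SS core proteins detected on it
    (duplicates allowed; they are counted once).
    """
    return sorted(
        plasmid_id
        for plasmid_id, proteins in core_hits.items()
        if len(set(proteins)) >= min_core
    )


# Hypothetical detection results, for illustration only.
core_hits = {
    "plasmid_1": ["TssA", "TssB", "TssC", "TssE", "TssF", "TssG", "TssK"],
    "plasmid_2": ["TssB", "TssC", "TssB"],  # only two distinct proteins
    "plasmid_3": ["TssA", "TssB", "TssC", "TssD", "TssE", "TssF"],
}

selected = plasmids_with_t6ss(core_hits)
```

Counting distinct proteins rather than raw hits keeps a plasmid with repeated copies of one gene from passing the threshold.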
9
Madeira F, Pearce M, Tivey ARN, Basutkar P, Lee J, Edbali O, Madhusoodanan N, Kolesnikov A, Lopez R. Search and sequence analysis tools services from EMBL-EBI in 2022. Nucleic Acids Res 2022; 50:W276-W279. [PMID: 35412617 PMCID: PMC9252731 DOI: 10.1093/nar/gkac240] [Citation(s) in RCA: 1156] [Impact Index Per Article: 578.0] [Received: 02/04/2022] [Accepted: 03/28/2022] [Indexed: 12/11/2022] Open
Abstract
The EMBL-EBI search and sequence analysis tools frameworks provide integrated access to EMBL-EBI’s data resources and core bioinformatics analytical tools. EBI Search (https://www.ebi.ac.uk/ebisearch) provides a full-text search engine across nearly 5 billion entries, while the Job Dispatcher tools framework (https://www.ebi.ac.uk/services) enables the scientific community to perform a diverse range of sequence analysis using popular bioinformatics applications. Both allow users to interact through user-friendly web applications, as well as via RESTful and SOAP-based APIs. Here, we describe recent improvements to these services and updates made to accommodate the increasing data requirements during the COVID-19 pandemic.
Affiliation(s)
- Fábio Madeira
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matt Pearce
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Adrian R N Tivey
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Prasad Basutkar
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Joon Lee
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ossama Edbali
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nandana Madhusoodanan
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anton Kolesnikov
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rodrigo Lopez
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
10
Cantelli G, Bateman A, Brooksbank C, Petrov AI, Malik-Sheriff R, Ide-Smith M, Hermjakob H, Flicek P, Apweiler R, Birney E, McEntyre J. The European Bioinformatics Institute (EMBL-EBI) in 2021. Nucleic Acids Res 2022; 50:D11-D19. [PMID: 34850134 PMCID: PMC8690175 DOI: 10.1093/nar/gkab1127] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 09/15/2021] [Revised: 10/14/2021] [Accepted: 11/23/2021] [Indexed: 11/28/2022] Open
Abstract
The European Bioinformatics Institute (EMBL-EBI) maintains a comprehensive range of freely available and up-to-date molecular data resources, which includes over 40 resources covering every major data type in the life sciences. This year's service update for EMBL-EBI includes new resources, PGS Catalog and AlphaFold DB, and updates on existing resources, including the COVID-19 Data Platform, trRosetta and RoseTTAfold models introduced in Pfam and InterPro, and the launch of Genome Integrations with Function and Sequence by UniProt and Ensembl. Furthermore, we highlight projects through which EMBL-EBI has contributed to the development of community-driven data standards and guidelines, including the Recommended Metadata for Biological Images (REMBI), and the BioModels Reproducibility Scorecard. Training is one of EMBL-EBI's core missions and a key component of the provision of bioinformatics services to users: this year's update includes many of the improvements that have been developed to EMBL-EBI's online training offering.
Affiliation(s)
- Gaia Cantelli
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alex Bateman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Cath Brooksbank
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anton I Petrov
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rahuman S Malik-Sheriff
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Michele Ide-Smith
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Henning Hermjakob
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rolf Apweiler
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ewan Birney
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Johanna McEntyre
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
11
|
Zadissa A, Apweiler R. Data Mining, Quality and Management in the Life Sciences. Methods Mol Biol 2022; 2449:3-25. [PMID: 35507257 DOI: 10.1007/978-1-0716-2095-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
With the evermore emphasis put on open science and its invaluable benefits to the scientific community, it is no longer the case where a research project simply ends with a scientific publication. The benefits of data sharing and reproducibility of results have taken the centerpiece within the life science research supported by FAIR principles that firmly underline the importance of open data. The current data-intensive multidisciplinary research has also highlighted the significance of how data is mined and managed. Here we describe some of the features adopted by EMBL-EBI data resources to support data mining, data quality, and data management. We also highlight how EMBL-EBI has responded to the current pandemic through its data resources.
Affiliation(s)
- Amonida Zadissa
  - EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
- Rolf Apweiler
  - EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
|
12
|
Birney E. The International Human Genome Project. Hum Mol Genet 2021; 30:R161-R163. [PMID: 34264324 PMCID: PMC8490009 DOI: 10.1093/hmg/ddab198] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/07/2021] [Revised: 07/08/2021] [Accepted: 07/09/2021] [Indexed: 12/01/2022] Open
Abstract
The human genome project was conceived and executed as an international project, due to both pragmatic and principled reasons. This internationality has served the project well, with the resulting human genome being freely available for all researchers in all countries. Over time the reference human genome will likely have to evolve to a graph genome, and tap into more diverse sequences worldwide. A similar international mindset underpins data analysis for the interpretation of the human genome from basic to clinical research.
Affiliation(s)
- Ewan Birney
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
|
13
|
Parks DH, Chuvochina M, Rinke C, Mussig AJ, Chaumeil PA, Hugenholtz P. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res 2021; 50:D785-D794. [PMID: 34520557 PMCID: PMC8728215 DOI: 10.1093/nar/gkab776] [Citation(s) in RCA: 632] [Impact Index Per Article: 210.7] [Received: 07/27/2021] [Revised: 08/18/2021] [Accepted: 08/28/2021] [Indexed: 11/13/2022] Open
Abstract
The Genome Taxonomy Database (GTDB; https://gtdb.ecogenomic.org) provides a phylogenetically consistent and rank normalized genome-based taxonomy for prokaryotic genomes sourced from the NCBI Assembly database. GTDB R06-RS202 spans 254 090 bacterial and 4316 archaeal genomes, a 270% increase since the introduction of the GTDB in November, 2017. These genomes are organized into 45 555 bacterial and 2339 archaeal species clusters which is a 200% increase since the integration of species clusters into the GTDB in June, 2019. Here, we explore prokaryotic diversity from the perspective of the GTDB and highlight the importance of metagenome-assembled genomes in expanding available genomic representation. We also discuss improvements to the GTDB website which allow tracking of taxonomic changes, easy assessment of genome assembly quality, and identification of genomes assembled from type material or used as species representatives. Methodological updates and policy changes made since the inception of the GTDB are then described along with the procedure used to update species clusters in the GTDB. We conclude with a discussion on the use of average nucleotide identities as a pragmatic approach for delineating prokaryotic species.
Affiliation(s)
- Donovan H Parks
  - The University of Queensland, School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, QLD 4072, Australia
- Maria Chuvochina
  - The University of Queensland, School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, QLD 4072, Australia
- Christian Rinke
  - The University of Queensland, School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, QLD 4072, Australia
- Aaron J Mussig
  - The University of Queensland, School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, QLD 4072, Australia
- Pierre-Alain Chaumeil
  - The University of Queensland, School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, QLD 4072, Australia
- Philip Hugenholtz
  - The University of Queensland, School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, QLD 4072, Australia
|
14
|
Djekidel MN, Rosikiewicz W, Peng JC, Kanneganti TD, Hui Y, Jin H, Hedges D, Schreiner P, Fan Y, Wu G, Xu B. CovidExpress: an interactive portal for intuitive investigation on SARS-CoV-2 related transcriptomes. bioRxiv: the preprint server for biology 2021:2021.05.14.444026. [PMID: 34075382 PMCID: PMC8168395 DOI: 10.1101/2021.05.14.444026] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/22/2022]
Abstract
Infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in humans could cause coronavirus disease 2019 (COVID-19). Since its first discovery in Dec 2019, SARS-CoV-2 has become a global pandemic and caused 3.3 million direct/indirect deaths (2021 May). Amongst the scientific community's response to COVID-19, data sharing has emerged as an essential aspect of the combat against SARS-CoV-2. Despite the ever-growing studies about SARS-CoV-2 and COVID-19, to date, only a few databases were curated to enable access to gene expression data. Furthermore, these databases curated only a small set of data and do not provide easy access for investigators without computational skills to perform analyses. To fill this gap and advance open-access to the growing gene expression data on this deadly virus, we collected about 1,500 human bulk RNA-seq datasets from publicly available resources, developed a database and visualization tool, named CovidExpress (https://stjudecab.github.io/covidexpress). This open access database will allow research investigators to examine the gene expression in various tissues, cell lines, and their response to SARS-CoV-2 under different experimental conditions, accelerating the understanding of the etiology of this disease to inform the drug and vaccine development. Our integrative analysis of this big dataset highlights a set of commonly regulated genes in SARS-CoV-2 infected lung and Rhinovirus infected nasal tissues, including OASL that were under-studied in COVID-19 related reports. Our results also suggested a potential FURIN positive feedback loop that might explain the evolutional advantage of SARS-CoV-2.
Affiliation(s)
- Mohamed Nadhir Djekidel
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
  - These authors contributed equally to this study
- Wojciech Rosikiewicz
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
  - These authors contributed equally to this study
- Jamy C. Peng
  - Department of Developmental Neurobiology, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Yawei Hui
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Hongjian Jin
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Dale Hedges
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Patrick Schreiner
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Yiping Fan
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Gang Wu
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
- Beisi Xu
  - Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, Tennessee, 38105, USA
|
15
|
Chen T, Chen X, Zhang S, Zhu J, Tang B, Wang A, Dong L, Zhang Z, Yu C, Sun Y, Chi L, Chen H, Zhai S, Sun Y, Lan L, Zhang X, Xiao J, Bao Y, Wang Y, Zhang Z, Zhao W. The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types. Genomics, Proteomics & Bioinformatics 2021. [PMID: 34400360 DOI: 10.1016/j.gpb] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 04/16/2023]
Abstract
The Genome Sequence Archive (GSA) is a data repository for archiving raw sequence data, which provides data storage and sharing services for worldwide scientific communities. Considering explosive data growth with diverse data types, here we present the GSA family by expanding into a set of resources for raw data archive with different purposes, namely, GSA (https://ngdc.cncb.ac.cn/gsa/), GSA for Human (GSA-Human, https://ngdc.cncb.ac.cn/gsa-human/), and Open Archive for Miscellaneous Data (OMIX, https://ngdc.cncb.ac.cn/omix/). Compared with the 2017 version, GSA has been significantly updated in data model, online functionalities, and web interfaces. GSA-Human, as a new partner of GSA, is a data repository specialized in human genetics-related data with controlled access and security. OMIX, as a critical complement to the two resources mentioned above, is an open archive for miscellaneous data. Together, all these resources form a family of resources dedicated to archiving explosive data with diverse types, accepting data submissions from all over the world, and providing free open access to all publicly available data in support of worldwide research activities.
Affiliation(s)
- Tingting Chen
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Xu Chen
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Sisi Zhang
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Junwei Zhu
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Bixia Tang
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Anke Wang
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Lili Dong
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Zhewen Zhang
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Caixia Yu
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Yanling Sun
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Lianjiang Chi
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Huanxin Chen
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Shuang Zhai
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Yubin Sun
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Li Lan
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Xin Zhang
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Jingfa Xiao
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yiming Bao
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yanqing Wang
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Zhang Zhang
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Wenming Zhao
  - China National Center for Bioinformation, Beijing 100101, China; National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
|
16
|
Chen T, Chen X, Zhang S, Zhu J, Tang B, Wang A, Dong L, Zhang Z, Yu C, Sun Y, Chi L, Chen H, Zhai S, Sun Y, Lan L, Zhang X, Xiao J, Bao Y, Wang Y, Zhang Z, Zhao W. The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types. Genomics, Proteomics & Bioinformatics 2021; 19:578-583. [PMID: 34400360 PMCID: PMC9039563 DOI: 10.1016/j.gpb.2021.08.001] [Citation(s) in RCA: 572] [Impact Index Per Article: 190.7] [Received: 01/28/2021] [Revised: 08/05/2021] [Accepted: 08/06/2021] [Indexed: 12/31/2022]
|
17
|
Jain S, Saxena A, Hesarur S, Bhadhadhara K, Bharti N, Kasibhatla SM, Sonavane U, Joshi R. GenoVault: a cloud based genomics repository. BioData Min 2021; 14:36. [PMID: 34325724 PMCID: PMC8319889 DOI: 10.1186/s13040-021-00268-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2020] [Accepted: 07/02/2021] [Indexed: 11/15/2022] Open
Abstract
GenoVault is a cloud-based repository for handling Next Generation Sequencing (NGS) data. It is developed using OpenStack-based private cloud with various services like keystone for authentication, cinder for block storage, neutron for networking and nova for managing compute instances for the Cloud. GenoVault uses object-based storage, which enables data to be stored as objects instead of files or blocks for faster retrieval from different distributed object nodes. Along with a web-based interface, a JavaFX-based desktop client has also been developed to meet the requirements of large file uploads that are usually seen in NGS datasets. Users can store files in their respective object-based storage areas and the metadata provided by the user during file uploads is used for querying the database. GenoVault repository is designed taking into account future needs and hence can scale both vertically and horizontally using OpenStack-based cloud features. Users have an option to make the data shareable to the public or restrict the access as private. Data security is ensured as every container is a separate entity in object-based storage architecture which is also supported by Secure File Transfer Protocol (SFTP) for data upload and download. The data is uploaded by the user in individual containers that include raw read files (fastq), processed alignment files (bam, sam, bed) and the output of variation detection (vcf). GenoVault architecture allows verification of the data in terms of integrity and authentication before making it available to collaborators as per the user’s permissions. GenoVault is useful for maintaining the organization-wide NGS data generated in various labs which is not yet published and submitted to public repositories like NCBI. GenoVault also provides support to share NGS data among the collaborating institutions. GenoVault can thus manage vast volumes of NGS data on any OpenStack-based private cloud.
Affiliation(s)
- Sankalp Jain
  - HPC-M&BA Group, Centre for Development of Advanced Computing (C-DAC), Pune, MH, 411008, India
- Amit Saxena
  - HPC-M&BA Group, Centre for Development of Advanced Computing (C-DAC), Pune, MH, 411008, India
- Suprit Hesarur
  - HPC-M&BA Group, Centre for Development of Advanced Computing (C-DAC), Pune, MH, 411008, India
- Kirti Bhadhadhara
  - HPC-M&BA Group, Centre for Development of Advanced Computing (C-DAC), Pune, MH, 411008, India
- Neeraj Bharti
  - HPC-M&BA Group, Centre for Development of Advanced Computing (C-DAC), Pune, MH, 411008, India
- Uddhavesh Sonavane
  - HPC-M&BA Group, Centre for Development of Advanced Computing (C-DAC), Pune, MH, 411008, India
- Rajendra Joshi
  - HPC-M&BA Group, Centre for Development of Advanced Computing (C-DAC), Pune, MH, 411008, India
|
18
|
Harrison PW, Lopez R, Rahman N, Allen SG, Aslam R, Buso N, Cummins C, Fathy Y, Felix E, Glont M, Jayathilaka S, Kadam S, Kumar M, Lauer KB, Malhotra G, Mosaku A, Edbali O, Park YM, Parton A, Pearce M, Estrada Pena JF, Rossetto J, Russell C, Selvakumar S, Sitjà XP, Sokolov A, Thorne R, Ventouratou M, Walter P, Yordanova G, Zadissa A, Cochrane G, Blomberg N, Apweiler R. The COVID-19 Data Portal: accelerating SARS-CoV-2 and COVID-19 research through rapid open access data sharing. Nucleic Acids Res 2021; 49:W619-W623. [PMID: 34048576 PMCID: PMC8218199 DOI: 10.1093/nar/gkab417] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 02/23/2021] [Revised: 04/20/2021] [Accepted: 05/01/2021] [Indexed: 01/07/2023] Open
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic will be remembered as one of the defining events of the 21st century. The rapid global outbreak has had significant impacts on human society and is already responsible for millions of deaths. Understanding and tackling the impact of the virus has required a worldwide mobilisation and coordination of scientific research. The COVID-19 Data Portal (https://www.covid19dataportal.org/) was first released as part of the European COVID-19 Data Platform, on April 20th 2020 to facilitate rapid and open data sharing and analysis, to accelerate global SARS-CoV-2 and COVID-19 research. The COVID-19 Data Portal has fortnightly feature releases to continue to add new data types, search options, visualisations and improvements based on user feedback and research. The open datasets and intuitive suite of search, identification and download services, represent a truly FAIR (Findable, Accessible, Interoperable and Reusable) resource that enables researchers to easily identify and quickly obtain the key datasets needed for their COVID-19 research.
Affiliation(s)
- Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rodrigo Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stefan Gutnick Allen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Raheela Aslam
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicola Buso
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Yasmin Fathy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eloy Felix
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mihai Glont
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sandeep Kadam
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Manish Kumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Geetika Malhotra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Abayomi Mosaku
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ossama Edbali
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Young Mi Park
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Andrew Parton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matt Pearce
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jose Francisco Estrada Pena
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Joseph Rossetto
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Craig Russell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sandeep Selvakumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexey Sokolov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ross Thorne
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Marianna Ventouratou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Peter Walter
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Galabina Yordanova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amonida Zadissa
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Niklas Blomberg
- ELIXIR, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rolf Apweiler
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
19
Praissman JL, Wells L. Proteomics-Based Insights Into the SARS-CoV-2-Mediated COVID-19 Pandemic: A Review of the First Year of Research. Mol Cell Proteomics 2021; 20:100103. [PMID: 34089862 PMCID: PMC8176883 DOI: 10.1016/j.mcpro.2021.100103] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/04/2021] [Accepted: 05/24/2021] [Indexed: 02/08/2023]
Abstract
In late 2019, a virus subsequently named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in China and led to a worldwide pandemic of the disease termed coronavirus disease 2019. The global health threat posed by this pandemic led to an extremely rapid and robust mobilization of the scientific and medical communities as evidenced by the publication of more than 10,000 peer-reviewed articles and thousands of preprints in the first year of the pandemic alone. With the publication of the initial genome sequence of SARS-CoV-2, the proteomics community immediately joined this effort publishing, to date, more than 100 peer-reviewed proteomics studies and submitting many more preprints to preprint servers. In this review, we focus on peer-reviewed articles published on the proteome, glycoproteome, and glycome of SARS-CoV-2. At a basic level, proteomic studies provide valuable information on quantitative aspects of viral infection course; information on the identities, sites, and microheterogeneity of post-translational modifications; and, information on protein-protein interactions. At a biological systems level, these studies elucidate host cell and tissue responses, characterize antibodies and other immune system factors in infection, suggest biomarkers that may be useful for diagnosis and disease-course monitoring, and help in the development or repurposing of potential therapeutics. Here, we summarize results from selected early studies to provide a perspective on the current rapidly evolving literature.
Affiliation(s)
- Jeremy L Praissman
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia, USA
- Lance Wells
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia, USA
20
Harrison PW, Ahamed A, Aslam R, Alako BTF, Burgin J, Buso N, Courtot M, Fan J, Gupta D, Haseeb M, Holt S, Ibrahim T, Ivanov E, Jayathilaka S, Balavenkataraman Kadhirvelu V, Kumar M, Lopez R, Kay S, Leinonen R, Liu X, O'Cathail C, Pakseresht A, Park Y, Pesant S, Rahman N, Rajan J, Sokolov A, Vijayaraja S, Waheed Z, Zyoud A, Burdett T, Cochrane G. The European Nucleotide Archive in 2020. Nucleic Acids Res 2021; 49:D82-D85. [PMID: 33175160 PMCID: PMC7778925 DOI: 10.1093/nar/gkaa1028] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 09/18/2020] [Accepted: 10/20/2020] [Indexed: 11/12/2022]
Abstract
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), has for almost forty years continued in its mission to freely archive and present the world's public sequencing data for the benefit of the entire scientific community and for the acceleration of the global research effort. Here we highlight the major developments to ENA services and content in 2020, focussing in particular on the recently released updated ENA browser, modernisation of our release process and our data coordination collaborations with specific research communities.
Affiliation(s)
- Peter W Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alisha Ahamed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Raheela Aslam
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Blaise T F Alako
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josephine Burgin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicola Buso
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jun Fan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sam Holt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Talal Ibrahim
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Eugene Ivanov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Manish Kumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rodrigo Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Simon Kay
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Xin Liu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Colman O'Cathail
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amir Pakseresht
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Youngmi Park
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephane Pesant
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alexey Sokolov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Senthilnathan Vijayaraja
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Zahra Waheed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ahmad Zyoud
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
21
Rigden DJ, Fernández XM. The 2021 Nucleic Acids Research database issue and the online molecular biology database collection. Nucleic Acids Res 2021; 49:D1-D9. [PMID: 33396976 PMCID: PMC7778882 DOI: 10.1093/nar/gkaa1216] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Indexed: 12/13/2022]
Abstract
The 2021 Nucleic Acids Research database Issue contains 189 papers spanning a wide range of biological fields and investigation. It includes 89 papers reporting on new databases and 90 covering recent changes to resources previously published in the Issue. A further ten are updates on databases most recently published elsewhere. Seven new databases focus on COVID-19 and SARS-CoV-2 and many others offer resources for studying the virus. Major returning nucleic acid databases include NONCODE, Rfam and RNAcentral. Protein family and domain databases include COG, Pfam, SMART and Panther. Protein structures are covered by RCSB PDB and dispersed proteins by PED and MobiDB. In metabolism and signalling, STRING, KEGG and WikiPathways are featured, along with returning KLIFS and new DKK and KinaseMD, all focused on kinases. IMG/M and IMG/VR update in the microbial and viral genome resources section, while human and model organism genomics resources include Flybase, Ensembl and UCSC Genome Browser. Cancer studies are covered by updates from canSAR and PINA, as well as newcomers CNCdatabase and Oncovar for cancer drivers. Plant comparative genomics is catered for by updates from Gramene and GreenPhylDB. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been substantially updated, revisiting nearly 1000 entries, adding 90 new resources and eliminating 86 obsolete databases, bringing the current total to 1641 databases. It is available at https://www.oxfordjournals.org/nar/database/c/.
Affiliation(s)
- Daniel J Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK