1
Zhang X, Wu J, Luo Y, Wang Y, Wu Y, Xu X, Zhang Y, Kong R, Chi Y, Sun Y, Chen S, He Q, Zhu F, Zhou Z. CovEpiAb: a comprehensive database and analysis resource for immune epitopes and antibodies of human coronaviruses. Brief Bioinform 2024; 25:bbae183. [PMID: 38653491 PMCID: PMC11036340 DOI: 10.1093/bib/bbae183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 02/24/2024] [Accepted: 04/08/2024] [Indexed: 04/25/2024]
Abstract
Coronaviruses have repeatedly threatened humans; COVID-19, caused by SARS-CoV-2, in particular has posed a substantial threat to global public health. SARS-CoV-2 continuously evolves through random mutation, resulting in a significant decrease in the efficacy of existing vaccines and neutralizing antibody drugs. It is critical to assess immune escape caused by viral mutations and develop broad-spectrum vaccines and neutralizing antibodies targeting conserved epitopes. Thus, we constructed CovEpiAb, a comprehensive database and analysis resource of human coronavirus (HCoV) immune epitopes and antibodies. CovEpiAb contains information on over 60 000 experimentally validated epitopes and over 12 000 antibodies for HCoVs and SARS-CoV-2 variants. The database is unique in (1) classifying and annotating cross-reactive epitopes from different viruses and variants; (2) providing molecular and experimental interaction profiles of antibodies, including structure-based binding sites and around 70 000 data points on binding affinity and neutralizing activity; (3) providing virological characteristics of current and past circulating SARS-CoV-2 variants and in vitro activity of various therapeutics; and (4) offering site-level annotations of key functional features, including antibody binding, immunological epitopes, SARS-CoV-2 mutations and conservation across HCoVs. In addition, we developed an integrated pipeline for epitope prediction named COVEP, which is available from the webpage of CovEpiAb. CovEpiAb is freely accessible at https://pgx.zju.edu.cn/covepiab/.
Affiliation(s)
- Xue Zhang
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- JingCheng Wu
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yuanyuan Luo
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yilin Wang
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yujie Wu
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Xiaobin Xu
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yufang Zhang
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ruiying Kong
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ying Chi
- Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 310058, China
- ZJU-UoE Institute, Zhejiang University, Haining 314400, China
- Yisheng Sun
- Key Lab of Vaccine, Prevention and Control of Infectious Disease of Zhejiang Province, Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou 310015, China
- Shuqing Chen
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Qiaojun He
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Zhejiang University Innovation Institute for Artificial Intelligence in Medicine, Engineering Research Center of Innovative Anticancer Drugs, Ministry of Education, Hangzhou 310018, China
- Feng Zhu
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Zhejiang University Innovation Institute for Artificial Intelligence in Medicine, Engineering Research Center of Innovative Anticancer Drugs, Ministry of Education, Hangzhou 310018, China
- Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 310058, China
- Zhan Zhou
- National Key Laboratory of Advanced Drug Delivery and Release Systems & Zhejiang Provincial Key Laboratory of Anti-Cancer Drug Research, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Zhejiang University Innovation Institute for Artificial Intelligence in Medicine, Engineering Research Center of Innovative Anticancer Drugs, Ministry of Education, Hangzhou 310018, China
- Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 310058, China
- The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Yiwu 322000, China
2
Amahong K, Zhang W, Liu Y, Li T, Huang S, Han L, Tao L, Zhu F. RVvictor: Virus RNA-directed molecular interactions for RNA virus infection. Comput Biol Med 2024; 169:107886. [PMID: 38157777 DOI: 10.1016/j.compbiomed.2023.107886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 12/14/2023] [Accepted: 12/18/2023] [Indexed: 01/03/2024]
Abstract
RNA viruses are major human pathogens that cause seasonal epidemics and occasional pandemic outbreaks. Due to the nature of their RNA genomes, virus RNA is anticipated to interact with host proteins (INTPRO), messenger RNAs (INTmRNA), and non-coding RNAs (INTncRNA) to perform its functions during transcription and replication. Data on virus RNA-directed molecular interactions (especially INTPROs) are therefore urgently needed and are expected to attract broad research interest in the fields of RNA virus translation and replication. In this study, a new database, RVvictor, was constructed to describe the virus RNA-directed interactions (INTPRO, INTmRNA, INTncRNA) of RNA viruses. This database is unique in (a) unambiguously characterizing the interactions between virus RNAs and host proteins, (b) providing, for the first time, the most systematic RNA-directed interaction data resource, offering clues to the molecular mechanisms of RNA virus translation and replication, and (c) conducting a comprehensive enrichment analysis for each virus RNA based on its associated target genes/proteins, with the enrichment results explicitly illustrated using various graphs. We found significant enrichment of a suite of pathways related to infection, translation, and replication, e.g., HIV infection, coronavirus disease, and regulation of viral genome replication. Given the devastating and persistent threat posed by RNA viruses, RVvictor constructed, for the first time, a possible network of cross-talk in RNA-directed interactions, which may ultimately explain the pathogenicity of RNA virus infection. The knowledge base might help develop new anti-viral therapeutic targets in the future. It is free and publicly accessible at: https://idrblab.org/rvvictor/.
Affiliation(s)
- Kuerbannisha Amahong
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Wei Zhang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Yuhong Liu
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Teng Li
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Shijie Huang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Lianyi Han
- Greater Bay Area Institute of Precision Medicine (Guangzhou), School of Life Sciences, Fudan University, Shanghai, 315211, China.
- Lin Tao
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
- Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China.
3
Li C, Ma L, Zou D, Zhang R, Bai X, Li L, Wu G, Huang T, Zhao W, Jin E, Bao Y, Song S. RCoV19: A One-stop Hub for SARS-CoV-2 Genome Data Integration, Variant Monitoring, and Risk Pre-warning. Genomics Proteomics Bioinformatics 2023; 21:1066-1079. [PMID: 37898309 PMCID: PMC10928372 DOI: 10.1016/j.gpb.2023.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 10/17/2023] [Accepted: 10/19/2023] [Indexed: 10/30/2023]
Abstract
The Resource for Coronavirus 2019 (RCoV19) is an open-access information resource dedicated to providing valuable data on the genomes, mutations, and variants of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In this updated implementation of RCoV19, we have made significant improvements and advancements over the previous version. Firstly, we have implemented a highly refined genome data curation model. This model now features an automated integration pipeline and optimized curation rules, enabling efficient daily updates of data in RCoV19. Secondly, we have developed a global and regional lineage evolution monitoring platform, alongside an outbreak risk pre-warning system. These additions provide a comprehensive understanding of SARS-CoV-2 evolution and transmission patterns, enabling better preparedness and response strategies. Thirdly, we have developed a powerful interactive mutation spectrum comparison module. This module allows users to compare and analyze mutation patterns, assisting in the detection of potential new lineages. Furthermore, we have incorporated a comprehensive knowledgebase on mutation effects. This knowledgebase serves as a valuable resource for retrieving information on the functional implications of specific mutations. In summary, RCoV19 serves as a vital scientific resource, providing access to valuable data, relevant information, and technical support in the global fight against COVID-19. The complete contents of RCoV19 are available to the public at https://ngdc.cncb.ac.cn/ncov/.
Affiliation(s)
- Cuiping Li
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Lina Ma
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Dong Zou
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Rongqin Zhang
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, China
- Xue Bai
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Lun Li
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
- Gangao Wu
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Tianhao Huang
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Wei Zhao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Enhui Jin
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yiming Bao
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
- Shuhui Song
- National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, China.
4
Tan M, Xia J, Luo H, Meng G, Zhu Z. Applying the digital data and the bioinformatics tools in SARS-CoV-2 research. Comput Struct Biotechnol J 2023; 21:4697-4705. [PMID: 37841328 PMCID: PMC10568291 DOI: 10.1016/j.csbj.2023.09.044] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/12/2023] [Revised: 09/29/2023] [Accepted: 09/29/2023] [Indexed: 10/17/2023]
Abstract
Bioinformatics has been playing a crucial role in the scientific progress to fight against the pandemic of the coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The advances in novel algorithms, mega data technology, artificial intelligence and deep learning assisted the development of novel bioinformatics tools to analyze daily increasing SARS-CoV-2 data in the past years. These tools were applied in genomic analyses, evolutionary tracking, epidemiological analyses, protein structure interpretation, studies in virus-host interaction and clinical performance. To promote the in-silico analysis in the future, we conducted a review which summarized the databases, web services and software applied in SARS-CoV-2 research. Those digital resources applied in SARS-CoV-2 research may also potentially contribute to the research in other coronavirus and non-coronavirus viruses.
Affiliation(s)
- Meng Tan
- School of Life Sciences, Chongqing University, Chongqing, China
- Jiaxin Xia
- School of Life Sciences, Chongqing University, Chongqing, China
- Haitao Luo
- School of Life Sciences, Chongqing University, Chongqing, China
- Geng Meng
- College of Veterinary Medicine, China Agricultural University, Beijing, China
- Zhenglin Zhu
- School of Life Sciences, Chongqing University, Chongqing, China
5
Sathyaseelan C, Magateshvaren Saras MA, Prasad Patro LP, Uttamrao PP, Rathinavelan T. CoVe-Tracker: An Interactive SARS-CoV-2 Pan Proteome Evolution Tracker. J Proteome Res 2023; 22:1984-1996. [PMID: 37036263 PMCID: PMC10108739 DOI: 10.1021/acs.jproteome.3c00068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Indexed: 04/11/2023]
Abstract
SARS-CoV-2 has significantly mutated its genome during the past 3 years, leading to the periodic emergence of several variants. Some of the variants possess enhanced fitness advantage, transmissibility, and pathogenicity and can also reduce vaccine efficacy. Thus, it is important to track the viral evolution to prevent SARS-CoV-2 infection and protect mankind from it. To this end, an interactive web-GUI platform, namely, CoVe-tracker (SARS-CoV-2 evolution tracker), is developed to track its pan proteome evolutionary dynamics (https://project.iith.ac.in/cove-tracker/). CoVe-tracker provides an opportunity for the user to fetch the country-wise and protein-wise amino acid mutations (currently, 44139) of SARS-CoV-2 and their month-wise distribution. It also provides position-wise evolution observed in the SARS-CoV-2 proteome. Importantly, CoVe-tracker provides month- and country-wise distributions of 2065 phylogenetic assignment of named global outbreak (PANGO) lineages and their 177564 variants. It further provides periodic updates on SARS-CoV-2 variant(s) evolution. CoVe-tracker provides the results in a user-friendly interactive fashion by projecting the results onto the world map (for country-wise distribution) and protein 3D structure (for protein-wise mutation). The application of CoVe-tracker in tracking the closest cousin(s) of a variant is demonstrated by considering BA.4 and BA.5 PANGO lineages as test cases. Thus, CoVe-tracker would be useful in the quick surveillance of newly emerging mutations/variants/lineages to facilitate the understanding of viral evolution, transmission, and disease epidemiology.
Affiliation(s)
- Chakkarai Sathyaseelan
- Department of Biotechnology, Indian Institute of Technology Hyderabad, Kandi, Telangana State 502285, India
- L. Ponoop Prasad Patro
- Department of Biotechnology, Indian Institute of Technology Hyderabad, Kandi, Telangana State 502285, India
- Patil Pranita Uttamrao
- Department of Biotechnology, Indian Institute of Technology Hyderabad, Kandi, Telangana State 502285, India
6
Rojas-Cruz AF, Gallego-Gómez JC, Bermúdez-Santana CI. RNA structure-altering mutations underlying positive selection on Spike protein reveal novel putative signatures to trace crossing host-species barriers in Betacoronavirus. RNA Biol 2022; 19:1019-1044. [PMID: 36102368 PMCID: PMC9481089 DOI: 10.1080/15476286.2022.2115750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
Similar to other RNA viruses, the emergence of Betacoronavirus relies on cross-species viral transmission, which requires careful health surveillance monitoring of protein-coding information as well as genome-wide analysis. Although the evolutionary jump from natural reservoirs to humans may be mainly traced back by studying the effect that hotspot mutations have on viral proteins, it is largely unexplored whether other impacts might emerge on the structured RNA genome of Betacoronavirus. In this survey, the protein-coding and viral genome architecture were simultaneously studied to uncover novel insights into cross-species horizontal transmission events. We analysed 1,252,952 viral genomes of SARS-CoV, MERS-CoV, and SARS-CoV-2 distributed across the world in bats, intermediate animals, and humans to build a new landscape of changes in the RNA viral genome. Phylogenetic analyses suggest that bat viruses are the most closely related to the time of most recent common ancestor of Betacoronavirus, and missense mutations in viral proteins, mainly in the S protein S1 subunit: SARS-CoV (G > T; A577S); MERS-CoV (C > T; S746R and C > T; N762A); and SARS-CoV-2 (A > G; D614G) appear to have driven viral diversification. We also found that codon sites under positive selection on S protein overlap with non-compensatory mutations that disrupt secondary RNA structures in the RNA genome complement. These findings provide pivotal factors that might underlie the eventual jump across the species barrier from bats to intermediate hosts. Lastly, we discovered that nearly half of the Betacoronavirus genomes carry highly conserved RNA structures, and more than 90% of these RNA structures show negative selection signals, suggesting essential functions in the biology of Betacoronavirus that have not been investigated to date. Further research is needed on negatively selected RNA structures to scan for emerging functions like the potential of coding virus-derived small RNAs and to develop new candidate antiviral therapeutic strategies.
Affiliation(s)
- Alexis Felipe Rojas-Cruz
- Theoretical and Computational RNomics Group, Department of Biology, Faculty of Sciences, National University of Colombia, Bogota Colombia
- Juan Carlos Gallego-Gómez
- Molecular and Translational Medicine Group, Faculty of Medicine, University of Antioquia, Medellin Colombia
- Clara Isabel Bermúdez-Santana
- Theoretical and Computational RNomics Group, Department of Biology, Faculty of Sciences, National University of Colombia, Bogota Colombia
- Center of Excellence in Scientific Computing, National University of Colombia, Bogota Colombia
7
Bernasconi A, Guizzardi G, Pastor O, Storey VC. Semantic interoperability: ontological unpacking of a viral conceptual model. BMC Bioinformatics 2022; 23:491. [PMID: 36396980 PMCID: PMC9672571 DOI: 10.1186/s12859-022-05022-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/17/2022] [Accepted: 10/29/2022] [Indexed: 11/18/2022]
Abstract
BACKGROUND Genomics and virology are unquestionably important, but complex, domains being investigated by a large number of scientists. The need to facilitate and support work within these domains requires sharing of databases, although it is often difficult to do so because of the different ways in which data is represented across the databases. To foster semantic interoperability, models are needed that provide a deep understanding and interpretation of the concepts in a domain, so that the data can be consistently interpreted among researchers. RESULTS In this research, we propose the use of conceptual models to support semantic interoperability among databases and assess their ontological clarity to support their effective use. This modeling effort is illustrated by its application to the Viral Conceptual Model (VCM) that captures and represents the sequencing of viruses, inspired by the need to understand the genomic aspects of the virus responsible for COVID-19. For achieving semantic clarity on the VCM, we leverage the "ontological unpacking" method, a process of ontological analysis that reveals the ontological foundation of the information that is represented in a conceptual model. This is accomplished by applying the stereotypes of the OntoUML ontology-driven conceptual modeling language. As a result, we propose a new OntoVCM, an ontologically grounded model, based on the initial VCM, but with guaranteed interoperability among the data sources that employ it. CONCLUSIONS We propose and illustrate how the unpacking of the Viral Conceptual Model resolves several issues related to semantic interoperability, the importance of which is recognized by the "I" in FAIR principles. The research addresses conceptual uncertainty within the domain of SARS-CoV-2 data and knowledge. The method employed provides the basis for further analyses of complex models currently used in life science applications, but lacking ontological grounding, subsequently hindering the interoperability needed for scientists to progress their research.
Affiliation(s)
- Anna Bernasconi
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy.
- PROS Research Center, VRAIN Research Institute, Universitat Politècnica de València, Valencia, Spain.
- Giancarlo Guizzardi
- Conceptual and Cognitive Modeling Research Group, Free University of Bozen-Bolzano, Bolzano, Italy
- Services and Cybersecurity Group, University of Twente, Enschede, The Netherlands
- Oscar Pastor
- PROS Research Center, VRAIN Research Institute, Universitat Politècnica de València, Valencia, Spain
- Veda C Storey
- J. Mack Robinson College of Business, Georgia State University, Atlanta, Georgia, USA
8
Kumar S, Kumar GS, Maitra SS, Malý P, Bharadwaj S, Sharma P, Dwivedi VD. Viral informatics: bioinformatics-based solution for managing viral infections. Brief Bioinform 2022; 23:6659740. [PMID: 35947964 DOI: 10.1093/bib/bbac326] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/27/2022] [Revised: 06/26/2022] [Accepted: 07/18/2022] [Indexed: 11/13/2022]
Abstract
Several new viral infections have emerged in the human population and are establishing themselves as global pandemics. With advancements in translational research, the scientific community has developed potential therapeutics to eradicate or control certain viral infections, such as smallpox and polio, responsible for billions of disabilities and deaths in the past. Unfortunately, some viral infections, such as dengue virus (DENV) and human immunodeficiency virus-1 (HIV-1), are still prevailing due to a lack of specific therapeutics, while new pathogenic viral strains or variants are emerging because of high genetic recombination or cross-species transmission. Consequently, to combat the emerging viral infections, bioinformatics-based strategies have been developed for viral characterization and for developing new effective therapeutics for their eradication or management. This review attempts to provide a single platform for the available wide range of bioinformatics-based approaches, including bioinformatics methods for the identification and management of emerging or evolved viral strains, genome analysis concerning the pathogenicity and epidemiological analysis, computational methods for designing the viral therapeutics, and consolidated information in the form of databases against the known pathogenic viruses. This enriched review of the generally applicable viral informatics approaches aims to provide an overview of available resources capable of carrying out the desired task and may be utilized to expand additional strategies to improve the quality of translational viral informatics research.
Affiliation(s)
- Sanjay Kumar
- School of Biotechnology, Jawaharlal Nehru University, New Delhi, India; Center for Bioinformatics, Computational and Systems Biology, Pathfinder Research and Training Foundation, Greater Noida, India
- Geethu S Kumar
- Department of Life Science, School of Basic Science and Research, Sharda University, Greater Noida, Uttar Pradesh, India; Center for Bioinformatics, Computational and Systems Biology, Pathfinder Research and Training Foundation, Greater Noida, India
- Petr Malý
- Laboratory of Ligand Engineering, Institute of Biotechnology of the Czech Academy of Sciences v.v.i., BIOCEV Research Center, Vestec, Czech Republic
- Shiv Bharadwaj
- Laboratory of Ligand Engineering, Institute of Biotechnology of the Czech Academy of Sciences v.v.i., BIOCEV Research Center, Vestec, Czech Republic
- Pradeep Sharma
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, India
- Vivek Dhar Dwivedi
- Center for Bioinformatics, Computational and Systems Biology, Pathfinder Research and Training Foundation, Greater Noida, India; Institute of Advanced Materials, IAAM, 59053 Ulrika, Sweden
9
Harari S, Tahor M, Rutsinsky N, Meijer S, Miller D, Henig O, Halutz O, Levytskyi K, Ben-Ami R, Adler A, Paran Y, Stern A. Drivers of adaptive evolution during chronic SARS-CoV-2 infections. Nat Med 2022; 28:1501-1508. [PMID: 35725921 PMCID: PMC9307477 DOI: 10.1038/s41591-022-01882-4] [Citation(s) in RCA: 62] [Impact Index Per Article: 31.0] [Received: 02/10/2022] [Accepted: 05/23/2022] [Indexed: 11/17/2022]
Abstract
In some immunocompromised patients with chronic severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, considerable adaptive evolution occurs. Some substitutions found in chronic infections are lineage-defining mutations in variants of concern (VOCs), which has led to the hypothesis that VOCs emerged from chronic infections. In this study, we searched for drivers of VOC-like emergence by consolidating sequencing results from a set of 27 chronic infections. Most substitutions in this set reflected lineage-defining VOC mutations; however, a subset of mutations associated with successful global transmission was absent from chronic infections. We further tested the ability to associate antibody evasion mutations with patient-specific and virus-specific features and found that viral rebound is strongly correlated with the emergence of antibody evasion. We found evidence for dynamic polymorphic viral populations in most patients, suggesting that a compromised immune system selects for antibody evasion in particular niches in a patient’s body. We suggest that a tradeoff exists between antibody evasion and transmissibility and that extensive monitoring of chronic infections is necessary to further understanding of VOC emergence. Analysis of mutations that arise in chronic SARS-CoV-2 infections shows both overlap and differences with mutations present in pandemic viral variants of concern, highlighting their distinct drivers of evolution.
Affiliation(s)
- Sheri Harari
- The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv, Israel.,Edmond J. Safra Center for Bioinformatics at Tel Aviv University, Tel Aviv, Israel
| | - Maayan Tahor
- The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv, Israel
| | - Natalie Rutsinsky
- The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv, Israel
| | - Suzy Meijer
- Department of Infectious Diseases and Epidemiology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel.,Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Danielle Miller
- The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv, Israel.,Edmond J. Safra Center for Bioinformatics at Tel Aviv University, Tel Aviv, Israel
| | - Oryan Henig
- Department of Infectious Diseases and Epidemiology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel.,Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Ora Halutz
- Clinical Microbiology Laboratory, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
| | - Katia Levytskyi
- Department of Infectious Diseases and Epidemiology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel.,Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Ronen Ben-Ami
- Department of Infectious Diseases and Epidemiology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel.,Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Amos Adler
- Department of Infectious Diseases and Epidemiology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel.,Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Yael Paran
- Department of Infectious Diseases and Epidemiology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel.,Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Adi Stern
- The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv, Israel.,Edmond J. Safra Center for Bioinformatics at Tel Aviv University, Tel Aviv, Israel
|
10
|
CoV2K model, a comprehensive representation of SARS-CoV-2 knowledge and data interplay. Sci Data 2022; 9:260. [PMID: 35650205 PMCID: PMC9160032 DOI: 10.1038/s41597-022-01348-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2021] [Accepted: 04/26/2022] [Indexed: 11/08/2022] Open
Abstract
Since the outbreak of the COVID-19 pandemic, many research organizations have studied the genome of the SARS-CoV-2 virus; a body of public resources has been published for monitoring its evolution. While we experience an unprecedented richness of information in this domain, we also ascertained the presence of several information quality issues. We hereby propose CoV2K, an abstract model for explaining SARS-CoV-2-related concepts and interactions, focusing on viral mutations, their co-occurrence within variants, and their effects. CoV2K provides a clear and concise route map for understanding different connected types of information related to the virus; it thus drives a process of data and knowledge integration that aggregates information from several current resources, harmonizing their content and overcoming incompleteness and inconsistency issues. CoV2K is available for exploration as a graph that can be queried through a RESTful API addressing single entities or paths through their relationships. Practical use cases demonstrate its application to current knowledge inquiries.
|
11
|
Cilibrasi L, Pinoli P, Bernasconi A, Canakoglu A, Chiara M, Ceri S. ViruClust: direct comparison of SARS-CoV-2 genomes and genetic variants in space and time. Bioinformatics 2022; 38:1988-1994. [PMID: 35040923 DOI: 10.1093/bioinformatics/btac030] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2021] [Revised: 12/24/2021] [Accepted: 01/13/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION The ongoing evolution of SARS-CoV-2 and the rapid emergence of variants of concern at distinct geographic locations have relevant implications for the implementation of strategies for controlling the COVID-19 pandemic. Combining the growing body of data and the evidence on potential functional implications of SARS-CoV-2 mutations can suggest highly effective methods for the prioritization of novel variants of potential concern, e.g. increasing in frequency locally and/or globally. However, these analyses may be complex, requiring the integration of different data and resources. We claim the need for a streamlined access to up-to-date and high-quality genome sequencing data from different geographic regions/countries, and the current lack of a robust and consistent framework for the evaluation/comparison of the results. RESULTS To overcome these limitations, we developed ViruClust, a novel tool for the comparison of SARS-CoV-2 genomic sequences and lineages in space and time. ViruClust is made available through a powerful and intuitive web-based user interface. Sophisticated large-scale analyses can be executed with a few clicks, even by users without any computational background. To demonstrate potential applications of our method, we applied ViruClust to conduct a thorough study of the evolution of the most prevalent lineage of the Delta SARS-CoV-2 variant, and derived relevant observations. By allowing the seamless integration of different types of functional annotations and the direct comparison of viral genomes and genetic variants in space and time, ViruClust represents a highly valuable resource for monitoring the evolution of SARS-CoV-2, facilitating the identification of variants and/or mutations of potential concern. AVAILABILITY AND IMPLEMENTATION ViruClust is openly available at http://gmql.eu/viruclust/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Luca Cilibrasi
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milano, Italy
| | - Pietro Pinoli
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milano, Italy
| | - Anna Bernasconi
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milano, Italy
| | - Arif Canakoglu
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milano, Italy
| | - Matteo Chiara
- Department of BioSciences, University of Milano, 20133 Milano, Italy
| | - Stefano Ceri
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milano, Italy
|
12
|
Alfonsi T, Pinoli P, Canakoglu A. High Performance Integration Pipeline for Viral and Epitope Sequences. BIOTECH 2022; 11:biotech11010007. [PMID: 35822815 PMCID: PMC9245902 DOI: 10.3390/biotech11010007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2022] [Revised: 03/08/2022] [Accepted: 03/15/2022] [Indexed: 11/28/2022] Open
Abstract
With the spread of COVID-19, sequencing laboratories started to share hundreds of sequences daily. However, the lack of a commonly agreed standard across deposition databases hindered the exploration and study of all the viral sequences collected worldwide in a practical and homogeneous way. During the first months of the pandemic, we developed an automatic procedure to collect, transform, and integrate viral sequences of SARS-CoV-2, MERS, SARS-CoV, Ebola, and Dengue from four major database institutions (NCBI, COG-UK, GISAID, and NMDC). This data pipeline allowed the creation of the data exploration interfaces VirusViz and EpiSurf, as well as ViruSurf, one of the largest databases of integrated viral sequences. Almost two years after the first release of the repository, the original pipeline underwent a thorough refinement process and became more efficient, scalable, and general (currently, it also includes epitopes from the IEDB). Thanks to these improvements, we constantly update and expand our integrated repository, encompassing about 9.1 million SARS-CoV-2 sequences at present (March 2022). This pipeline made it possible to design and develop fundamental resources for any researcher interested in understanding the biological mechanisms behind the viral infection. In addition, it plays a crucial role in many analytic and visualization tools, such as ViruSurf, EpiSurf, VirusViz, and VirusLab.
Affiliation(s)
- Tommaso Alfonsi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
- Correspondence:
| | - Pietro Pinoli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
| | - Arif Canakoglu
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
- Policlinico di Milano Ospedale Maggiore, Fondazione IRCCS Ca’ Granda, Via Francesco Sforza, 35, 20122 Milano, Italy
|
13
|
Bernasconi A, Cascianelli S. Scenarios for the Integration of Microarray Gene Expression Profiles in COVID-19-Related Studies. Methods Mol Biol 2022; 2401:195-215. [PMID: 34902130 DOI: 10.1007/978-1-0716-1839-4_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
The COVID-19 pandemic has heavily affected many aspects of our lives. At this time, genomic research is concerned with exploiting available datasets and knowledge to fuel discovery on this novel disease. Studies that can precisely characterize the gene expression profiles of human hosts infected by SARS-CoV-2 are of significant relevance. However, few such experiments have been produced to date or made publicly available online. Thus, it is of paramount importance that data analysts explore all possibilities to integrate information coming from similar viruses and related diseases; interestingly, microarray gene profile experiments become extremely valuable for this purpose. This chapter reviews the aspects that should be considered when integrating transcriptomics data, considering mainly samples infected by different viruses and combining various data types as well as the extracted knowledge. It describes a series of scenarios from studies reported in the literature and suggests other possible directions for noteworthy integration.
Affiliation(s)
- Anna Bernasconi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy.
| | - Silvia Cascianelli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
|
14
|
Databases, Knowledgebases, and Software Tools for Virus Informatics. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1368:1-19. [DOI: 10.1007/978-981-16-8969-7_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
15
|
Essabbar A, Kartti S, Alouane T, Hakmi M, Belyamani L, Ibrahimi A. IDbSV: An Open-Access Repository for Monitoring SARS-CoV-2 Variations and Evolution. Front Med (Lausanne) 2021; 8:765249. [PMID: 34966754 PMCID: PMC8710592 DOI: 10.3389/fmed.2021.765249] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2021] [Accepted: 11/05/2021] [Indexed: 11/13/2022] Open
Abstract
Ending the COVID-19 pandemic requires a collaborative understanding of SARS-CoV-2 and COVID-19 mechanisms. Yet, the evolving nature of coronaviruses results in a continuous emergence of new variants of the virus. Central to this is the need for a continuous monitoring system able to detect potentially harmful variants of the virus in real time. In this manuscript, we present the International Database of SARS-CoV-2 Variations (IDbSV), the result of ongoing efforts in curating, analyzing, and sharing comprehensive interpretation of SARS-CoV-2's genetic variations and variants. Through user-friendly interactive data visualizations, we aim to provide a novel surveillance tool to the scientific and public health communities. The database is regularly updated with new records through a 4-step workflow: (1) quality control of curated sequences, (2) variation calling, (3) functional annotation, and (4) metadata association. To the best of our knowledge, IDbSV provides access to the largest repository of SARS-CoV-2 variations and the largest analysis of SARS-CoV-2 genomes, with over 60 thousand annotated variations curated from 1,808,613 genomes alongside their functional annotations, first known appearance, and associated genetic lineages, enabling a robust interpretation tool for SARS-CoV-2 variations that helps in understanding SARS-CoV-2 dynamics across the world.
Affiliation(s)
- Abdelmounim Essabbar
- Medical Biotechnology Laboratory (MedBiotech), Bioinova Research Center, Rabat Medical and Pharmacy School, Mohammed Vth University, Rabat, Morocco
| | - Souad Kartti
- Medical Biotechnology Laboratory (MedBiotech), Bioinova Research Center, Rabat Medical and Pharmacy School, Mohammed Vth University, Rabat, Morocco
| | - Tarek Alouane
- Medical Biotechnology Laboratory (MedBiotech), Bioinova Research Center, Rabat Medical and Pharmacy School, Mohammed Vth University, Rabat, Morocco
| | - Mohammed Hakmi
- Medical Biotechnology Laboratory (MedBiotech), Bioinova Research Center, Rabat Medical and Pharmacy School, Mohammed Vth University, Rabat, Morocco
| | - Lahcen Belyamani
- Emergency Department, Military Hospital Mohammed V, Rabat Medical & Pharmacy School, Mohammed Vth University, Rabat, Morocco
| | - Azeddine Ibrahimi
- Medical Biotechnology Laboratory (MedBiotech), Bioinova Research Center, Rabat Medical and Pharmacy School, Mohammed Vth University, Rabat, Morocco
|
16
|
Torrens-Fontanals M, Peralta-García A, Talarico C, Guixà-González R, Giorgino T, Selent J. SCoV2-MD: a database for the dynamics of the SARS-CoV-2 proteome and variant impact predictions. Nucleic Acids Res 2021; 50:D858-D866. [PMID: 34761257 PMCID: PMC8689960 DOI: 10.1093/nar/gkab977] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2021] [Revised: 09/21/2021] [Accepted: 11/08/2021] [Indexed: 11/23/2022] Open
Abstract
SCoV2-MD (www.scov2-md.org) is a new online resource that systematically organizes atomistic simulations of the SARS-CoV-2 proteome. The database includes simulations produced by leading groups using molecular dynamics (MD) methods to investigate the structure-dynamics-function relationships of viral proteins. SCoV2-MD cross-references the molecular data with the pandemic evolution by tracking all available variants sequenced during the pandemic and deposited in the GISAID resource. SCoV2-MD enables the interactive analysis of the deposited trajectories through a web interface, which enables users to search by viral protein, isolate, phylogenetic attributes, or specific point mutation. Each mutation can then be analyzed interactively combining static (e.g. a variety of amino acid substitution penalties) and dynamic (time-dependent data derived from the dynamics of the local geometry) scores. Dynamic scores can be computed on the basis of nine non-covalent interaction types, including steric properties, solvent accessibility, hydrogen bonding, and other types of chemical interactions. Where available, experimental data such as antibody escape and change in binding affinities from deep mutational scanning experiments are also made available. All metrics can be combined to build predefined or custom scores to interrogate the impact of evolving variants on protein structure and function.
Affiliation(s)
- Mariona Torrens-Fontanals
- Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute-Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona 08003, Spain
| | - Alejandro Peralta-García
- Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute-Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona 08003, Spain
| | - Carmine Talarico
- EXSCALATE, Dompé Farmaceutici S.p.A., Via Tommaso De Amicis, 95, Napoli, 80131, Italy
| | - Ramon Guixà-González
- Laboratory of Biomolecular Research, Paul Scherrer Institute, CH-5232 Villigen PSI, Switzerland.,Condensed Matter Theory Group, Paul Scherrer Institute, CH-5232 Villigen PSI, Switzerland
| | - Toni Giorgino
- Biophysics Institute (CNR-IBF), National Research Council of Italy, Milan 20133, Italy.,Department of Biosciences, University of Milan, Milan 20133, Italy
| | - Jana Selent
- Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute-Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona 08003, Spain
|
17
|
Pinoli P, Bernasconi A, Sandionigi A, Ceri S. VirusLab: A Tool for Customized SARS-CoV-2 Data Analysis. BIOTECH 2021; 10:biotech10040027. [PMID: 35822801 PMCID: PMC9245481 DOI: 10.3390/biotech10040027] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Revised: 10/26/2021] [Accepted: 11/02/2021] [Indexed: 12/14/2022] Open
Abstract
Since the beginning of 2020, the COVID-19 pandemic has posed unprecedented challenges to viral data analysis and connected host disease diagnostic methods. We propose VirusLab, a flexible system for analysing SARS-CoV-2 viral sequences and relating them to metadata or clinical information about the host. VirusLab capitalizes on two existing resources: ViruSurf, a database of public SARS-CoV-2 sequences supporting metadata-driven search, and VirusViz, a tool for visual analysis of search results. VirusLab is designed for taking advantage of these resources within a server-side architecture that: (i) covers pipelines based on approaches already in use (ARTIC, Galaxy) but entirely customizable upon user request; (ii) predigests analysis of raw sequencing data from different platforms (Oxford Nanopore and Illumina); (iii) gives access to public archive datasets; and (iv) supplies user-friendly reporting, making it a tool that can also be integrated into a business environment. VirusLab can be installed and hosted within the premises of any organization where information about SARS-CoV-2 sequences can be safely integrated with information about hosts (e.g., clinical metadata). A system such as VirusLab is not currently available in the landscape of similar providers: our results show that VirusLab is a powerful tool to generate tabular/graphical and machine-readable reports that can be integrated in more complex pipelines. We foresee that the proposed system can support many research-oriented and therapeutic scenarios within hospitals or the tracing of viral sequences and their mutational processes within organizations for viral surveillance.
Affiliation(s)
- Pietro Pinoli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy
| | - Anna Bernasconi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy
- Correspondence: Tel.: +39-02-2399-3655
| | - Anna Sandionigi
- Quantia Consulting S.r.l., Mariano Comense, 22066 Como, Italy
| | - Stefano Ceri
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy
|
18
|
Bernasconi A, Mari L, Casagrandi R, Ceri S. Data-driven analysis of amino acid change dynamics timely reveals SARS-CoV-2 variant emergence. Sci Rep 2021; 11:21068. [PMID: 34702903 PMCID: PMC8548498 DOI: 10.1038/s41598-021-00496-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2021] [Accepted: 10/12/2021] [Indexed: 02/07/2023] Open
Abstract
Since its emergence in late 2019, the diffusion of SARS-CoV-2 has been associated with the evolution of its viral genome. The co-occurrence of specific amino acid changes, collectively named a ‘virus variant’, requires scrutiny (as variants may hugely impact the agent’s transmission, pathogenesis, or antigenicity); variant evolution is studied using phylogenetics. Yet, never has this problem been tackled by digging into data with ad hoc analysis techniques. Here we show that the emergence of variants can in fact be traced through data-driven methods, further capitalizing on the value of large collections of SARS-CoV-2 sequences. For all countries with sufficient data, we compute weekly counts of amino acid changes, unveil time-varying clusters of changes with similar, rapidly growing dynamics, and then follow their evolution. Our method succeeds in timely associating clusters to variants of interest/concern, provided their change composition is well characterized. This allows us to detect variants’ emergence, rise, peak, and eventual decline under competitive pressure of another variant. Our early warning system, exclusively relying on deposited sequences, shows the power of big data in this context and adds to the calls for widespread public SARS-CoV-2 genome sequencing for improved surveillance and control of the COVID-19 pandemic.
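The weekly counting step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the growth rule (`min_ratio`), the function names, and the sample records are all hypothetical, and the paper's actual clustering of changes is not reproduced here.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_counts(records):
    """Count occurrences of each amino acid change per ISO (year, week)."""
    counts = defaultdict(Counter)
    for collected, change in records:
        iso = collected.isocalendar()
        counts[change][(iso[0], iso[1])] += 1
    return counts


def growing_changes(counts, weeks, min_ratio=2.0):
    """Flag changes whose count grows by at least min_ratio between every
    pair of consecutive weeks (a simplified, assumed growth criterion)."""
    flagged = []
    for change, per_week in counts.items():
        series = [per_week.get(w, 0) for w in weeks]
        if all(a > 0 and b / a >= min_ratio for a, b in zip(series, series[1:])):
            flagged.append(change)
    return sorted(flagged)


# Made-up records: (collection date, amino acid change)
records = (
    [(date(2021, 1, 4), "S:N501Y")] * 1
    + [(date(2021, 1, 11), "S:N501Y")] * 2
    + [(date(2021, 1, 18), "S:N501Y")] * 4
    + [(date(2021, 1, 5), "S:D614G")] * 3
    + [(date(2021, 1, 12), "S:D614G")] * 3
    + [(date(2021, 1, 19), "S:D614G")] * 3
)
counts = weekly_counts(records)
weeks = [(2021, 1), (2021, 2), (2021, 3)]
flagged = growing_changes(counts, weeks)
print(flagged)
```

In this toy data only the change whose weekly count doubles each week is flagged; a change with a flat count is ignored.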
Affiliation(s)
- Anna Bernasconi
- Department of Electronics, Information, and Bioengineering, Politecnico di Milano, 20133, Milan, Italy.
| | - Lorenzo Mari
- Department of Electronics, Information, and Bioengineering, Politecnico di Milano, 20133, Milan, Italy
| | - Renato Casagrandi
- Department of Electronics, Information, and Bioengineering, Politecnico di Milano, 20133, Milan, Italy
| | - Stefano Ceri
- Department of Electronics, Information, and Bioengineering, Politecnico di Milano, 20133, Milan, Italy
|
19
|
Bernasconi A, Cilibrasi L, Al Khalaf R, Alfonsi T, Ceri S, Pinoli P, Canakoglu A. EpiSurf: metadata-driven search server for analyzing amino acid changes within epitopes of SARS-CoV-2 and other viral species. Database (Oxford) 2021; 2021:baab059. [PMID: 34585726 PMCID: PMC8500151 DOI: 10.1093/database/baab059] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2021] [Revised: 07/27/2021] [Accepted: 09/16/2021] [Indexed: 11/21/2022]
Abstract
EpiSurf is a Web application for selecting viral populations of interest and then analyzing how their amino acid changes are distributed along epitopes. Viral sequences are searched within ViruSurf, which stores curated metadata and amino acid changes imported from the most widely used deposition sources for viral databases (GenBank, COVID-19 Genomics UK (COG-UK) and Global initiative on sharing all influenza data (GISAID)). Epitopes are searched within the open source Immune Epitope Database or directly proposed by users by indicating their start and stop positions in the context of a given viral protein. Amino acid changes of selected populations are joined with epitopes of interest; a result table summarizes, for each epitope, statistics about the overlapping amino acid changes and about the sequences carrying such alterations. The results may also be inspected by the VirusViz Web application; epitope regions are highlighted within the given viral protein, and changes can be comparatively inspected. For sequences mutated within the epitope, we also offer a complete view of the distribution of amino acid changes, optionally grouped by the location, collection date or lineage. Thanks to these functionalities, EpiSurf supports the user-friendly testing of epitope conservancy within selected populations of interest, which can be of utmost relevance for designing vaccines, drugs or serological assays. EpiSurf is available at two endpoints. Database URL: http://gmql.eu/episurf/ (for searching GenBank and COG-UK sequences) and http://gmql.eu/episurf_gisaid/ (for GISAID sequences).
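The join described in this abstract, between amino acid changes and epitope ranges given by start and stop positions, can be sketched in a few lines of Python. The record layouts, field names, and sample values below are hypothetical; they are not EpiSurf's actual schema or API.

```python
from typing import NamedTuple


class Epitope(NamedTuple):
    name: str
    protein: str
    start: int  # 1-based, inclusive
    stop: int   # 1-based, inclusive


class Change(NamedTuple):
    sequence_id: str
    protein: str
    position: int  # 1-based position within the protein
    ref: str
    alt: str


def summarize_epitopes(epitopes, changes):
    """For each epitope, count distinct overlapping changes and the
    number of sequences that carry at least one of them."""
    summary = {}
    for ep in epitopes:
        hits = [
            c for c in changes
            if c.protein == ep.protein and ep.start <= c.position <= ep.stop
        ]
        summary[ep.name] = {
            "n_changes": len({(c.position, c.ref, c.alt) for c in hits}),
            "n_sequences": len({c.sequence_id for c in hits}),
        }
    return summary


# Made-up epitopes and changes, for illustration only
epitopes = [Epitope("ep1", "S", 480, 490), Epitope("ep2", "S", 600, 620)]
changes = [
    Change("seq1", "S", 484, "E", "K"),
    Change("seq2", "S", 484, "E", "K"),
    Change("seq2", "S", 614, "D", "G"),
    Change("seq3", "N", 485, "A", "T"),
]
result = summarize_epitopes(epitopes, changes)
print(result)
```

A linear scan per epitope keeps the sketch short; for large sequence collections an interval index or a database join would be the more natural choice.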
Affiliation(s)
- Anna Bernasconi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, Milano 20133, Italy
| | - Luca Cilibrasi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, Milano 20133, Italy
| | - Ruba Al Khalaf
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, Milano 20133, Italy
| | - Tommaso Alfonsi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, Milano 20133, Italy
| | - Stefano Ceri
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, Milano 20133, Italy
| | - Pietro Pinoli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, Milano 20133, Italy
| | - Arif Canakoglu
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, Milano 20133, Italy
|
20
|
Bernasconi A, Gulino A, Alfonsi T, Canakoglu A, Pinoli P, Sandionigi A, Ceri S. VirusViz: comparative analysis and effective visualization of viral nucleotide and amino acid variants. Nucleic Acids Res 2021; 49:e90. [PMID: 34107016 PMCID: PMC8344854 DOI: 10.1093/nar/gkab478] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2021] [Revised: 05/11/2021] [Accepted: 05/24/2021] [Indexed: 12/27/2022] Open
Abstract
Variant visualization plays an important role in supporting the viral evolution analysis, extremely valuable during the COVID-19 pandemic. VirusViz is a web-based application for comparing variants of selected viral populations and their sub-populations; it is primarily focused on SARS-CoV-2 variants, although the tool also supports other viral species (SARS-CoV, MERS-CoV, Dengue, Ebola). As input, VirusViz imports results of queries extracting variants and metadata from the large database ViruSurf, which integrates information about most SARS-CoV-2 sequences publicly deposited worldwide. Moreover, VirusViz accepts sequences of new viral populations as multi-FASTA files plus corresponding metadata in CSV format; a bioinformatic pipeline builds a suitable input for VirusViz by extracting the nucleotide and amino acid variants. Pages of VirusViz provide metadata summarization, variant descriptions, and variant visualization with rich options for zooming, highlighting variants or regions of interest, and switching from nucleotides to amino acids; sequences can be grouped, groups can be comparatively analyzed. For SARS-CoV-2, we manually collect mutations with known or predicted levels of severity/virulence, as indicated in linked research articles; such critical mutations are reported when observed in sequences. The system includes light-weight project management for downloading, resuming, and merging data analysis sessions. VirusViz is freely available at http://gmql.eu/virusviz/.
Affiliation(s)
- Anna Bernasconi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
| | - Andrea Gulino
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
| | - Tommaso Alfonsi
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
| | - Arif Canakoglu
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
| | - Pietro Pinoli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
| | - Anna Sandionigi
- Quantia Consulting S.r.l., Via Petrarca 20, 22066, Mariano Comense, Como, Italy
| | - Stefano Ceri
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy
|
21
|
Gozashti L, Corbett-Detig R. Shortcomings of SARS-CoV-2 genomic metadata. BMC Res Notes 2021; 14:189. [PMID: 34001211 PMCID: PMC8128092 DOI: 10.1186/s13104-021-05605-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2021] [Accepted: 05/06/2021] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVE: The SARS-CoV-2 pandemic has prompted one of the most extensive and expeditious genomic sequencing efforts in history. Each viral genome is accompanied by a set of metadata which supplies important information such as the geographic origin of the sample, age of the host, and the lab at which the sample was sequenced, and is integral to epidemiological efforts and public health direction. Here, we interrogate some shortcomings of metadata within the GISAID database to raise awareness of common errors and inconsistencies that may affect data-driven analyses and provide possible avenues for resolutions. RESULTS: Our analysis reveals a startling prevalence of spelling errors and inconsistent naming conventions, which together occur in an estimated ~9.8% and ~11.6% of "originating lab" and "submitting lab" GISAID metadata entries respectively. We also find numerous ambiguous entries which provide very little information about the actual source of a sample and could easily associate with multiple sources worldwide. Importantly, all of these issues can impair the ability and accuracy of association studies by deceptively causing a group of samples to identify with multiple sources when they truly all identify with one source, or vice versa.
Affiliation(s)
- Landen Gozashti
- Department of Organismic and Evolutionary Biology and Museum of Comparative Zoology, Harvard University, Cambridge, MA, 02138, USA; Department of Biomolecular Engineering and Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
- Russell Corbett-Detig
- Department of Biomolecular Engineering and Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, 95064, USA
|
22
|
Bernasconi A, Canakoglu A, Masseroli M, Pinoli P, Ceri S. A review on viral data sources and search systems for perspective mitigation of COVID-19. Brief Bioinform 2021; 22:664-675. [PMID: 33348368 PMCID: PMC7799334 DOI: 10.1093/bib/bbaa359] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/01/2020] [Revised: 10/09/2020] [Accepted: 11/09/2020] [Indexed: 12/26/2022]
Abstract
With the outbreak of the COVID-19 disease, the research community is producing unprecedented efforts dedicated to better understanding and mitigating the effects of the pandemic. In this context, we review the data integration efforts required for accessing and searching genome sequences and metadata of SARS-CoV2, the virus responsible for the COVID-19 disease, which have been deposited into the most important repositories of viral sequences. Organizations that were already present in the virus domain are now dedicating special interest to the emergence of the COVID-19 pandemic, by emphasizing specific SARS-CoV2 data and services. At the same time, novel organizations and resources were born in this critical period to serve specifically the purposes of COVID-19 mitigation while setting the research ground for countering possible future pandemics. Accessibility and integration of viral sequence data, possibly in conjunction with the human host genotype and clinical data, are paramount to better understand the COVID-19 disease and mitigate its effects. Few examples of host-pathogen integrated datasets exist so far, but we expect them to grow together with the knowledge of COVID-19 disease; once such datasets are available, useful integrative surveillance mechanisms can be put in place by observing how common variants distribute in time and space, relating them to the phenotypic impact evidenced in the literature.
|
23
|
Chiara M, D’Erchia AM, Gissi C, Manzari C, Parisi A, Resta N, Zambelli F, Picardi E, Pavesi G, Horner DS, Pesole G. Next generation sequencing of SARS-CoV-2 genomes: challenges, applications and opportunities. Brief Bioinform 2021; 22:616-630. [PMID: 33279989 PMCID: PMC7799330 DOI: 10.1093/bib/bbaa297] [Citation(s) in RCA: 117] [Impact Index Per Article: 39.0] [Received: 08/08/2020] [Revised: 09/27/2020] [Accepted: 10/07/2020] [Indexed: 12/31/2022]
Abstract
Various next generation sequencing (NGS) based strategies have been successfully used in the recent past for tracing origins and understanding the evolution of infectious agents, investigating the spread and transmission chains of outbreaks, as well as facilitating the development of effective and rapid molecular diagnostic tests and contributing to the hunt for treatments and vaccines. The ongoing COVID-19 pandemic poses one of the greatest global threats in modern history and has already caused severe social and economic costs. The development of efficient and rapid sequencing methods to reconstruct the genomic sequence of SARS-CoV-2, the etiological agent of COVID-19, has been fundamental for the design of diagnostic molecular tests and to devise effective measures and strategies to mitigate the diffusion of the pandemic. Diverse approaches and sequencing methods can, as testified by the number of available sequences, be applied to SARS-CoV-2 genomes. However, each technology and sequencing approach has its own advantages and limitations. In the current review, we will provide a brief, but hopefully comprehensive, account of currently available platforms and methodological approaches for the sequencing of SARS-CoV-2 genomes. We also present an outline of current repositories and databases that provide access to SARS-CoV-2 genomic data and associated metadata. Finally, we offer general advice and guidelines for the appropriate sharing and deposition of SARS-CoV-2 data and metadata, and suggest that more efficient and standardized integration of current and future SARS-CoV-2-related data would greatly facilitate the struggle against this new pathogen. We hope that our 'vademecum' for the production and handling of SARS-CoV-2-related sequencing data will contribute to this objective.
Affiliation(s)
- Matteo Chiara
- molecular biology and bioinformatics at the University of Milan
- Anna Maria D’Erchia
- molecular biology at the University of Bari and research associate at the Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies of the National Research Council in Bari
- Carmela Gissi
- molecular biology at the University of Bari and research associate at the Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies of the National Research Council in Bari
- Caterina Manzari
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies of the National Research Council in Bari
- Antonio Parisi
- Genetic and Molecular Epidemiology Laboratory at the Experimental Zooprophylactic Institute of Apulia and Basilicata
- Nicoletta Resta
- Medical Genetics at the University of Bari. She heads the Laboratory Unit of Medical Genetics and the School of Specialization in Medical Genetics
- Ernesto Picardi
- molecular biology and bioinformatics at the University of Bari and research associate at the Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies of the National Research Council in Bari
- Giulio Pavesi
- Associate Professor of bioinformatics at the University of Milan (Italy)
- David S Horner
- molecular biology and bioinformatics at the University of Milan
- Graziano Pesole
- molecular biology at the University of Bari and Research Associate at the Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies of the National Research Council in Bari
|
24
|
Biswas N, Kumar K, Mallick P, Das S, Kamal IM, Bose S, Choudhury A, Chakrabarti S. Structural and Drug Screening Analysis of the Non-structural Proteins of Severe Acute Respiratory Syndrome Coronavirus 2 Virus Extracted From Indian Coronavirus Disease 2019 Patients. Front Genet 2021; 12:626642. [PMID: 33767730 PMCID: PMC7985531 DOI: 10.3389/fgene.2021.626642] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 11/06/2020] [Accepted: 01/26/2021] [Indexed: 12/14/2022]
Abstract
The novel coronavirus 2 (nCoV2) outbreak took place in December 2019 in Wuhan City, Hubei Province, China. It continued to spread worldwide in an unprecedented manner, bringing the whole world to a lockdown and causing severe loss of life and economic stability. The coronavirus disease 2019 (COVID-19) pandemic has also affected India, infecting more than 10 million people by 31 December 2020 and resulting in more than a hundred thousand deaths. In the absence of an effective vaccine, it is imperative to understand the phenotypic outcome of the genetic variants and subsequently the mode of action of its proteins with respect to human proteins and other bio-molecules. Availability of a large amount of genomic and mutational data extracted from the nCoV2 virus infecting Indian patients in a public repository provided an opportunity to understand and analyze the specific variations of the virus in India and their impact in broader perspectives. Non-structural proteins (NSPs) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV2) virus play a major role in its survival as well as virulence power. Here, we provide a detailed overview of the SARS-CoV2 NSPs including primary and secondary structural information, mutational frequency of the Indian and Wuhan variants, phylogenetic profiles, three-dimensional (3D) structural perspectives using homology modeling and molecular dynamics analyses for wild-type and selected variants, host-interactome analysis and viral-host protein complexes, and in silico drug screening with known antivirals and other drugs against the SARS-CoV2 NSPs isolated from the variants found within Indian patients across various regions of the country. All this information is categorized in the form of a database named Database of NSPs of India specific Novel Coronavirus (DbNSP InC), which is freely available at http://www.hpppi.iicb.res.in/covid19/index.php.
Affiliation(s)
- Nupur Biswas
- Structural Biology and Bioinformatics Division, Council for Scientific and Industrial Research (CSIR)–Indian Institute of Chemical Biology (IICB), Kolkata, India
- Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, Council for Scientific and Industrial Research (CSIR)–Indian Institute of Chemical Biology (IICB), Kolkata, India
|
25
|
A Conceptual Model for Geo-Online Exploratory Data Visualization: The Case of the COVID-19 Pandemic. INFORMATION 2021. [DOI: 10.3390/info12020069] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/30/2022]
Abstract
Responding to the recent COVID-19 outbreak, several organizations and private citizens considered the opportunity to design and publish online explanatory data visualization tools for the communication of disease data supported by a spatial dimension. They responded to the need of receiving instant information arising from the broad research community, the public health authorities, and the general public. In addition, the growing maturity of information and mapping technologies, as well as of social networks, has greatly supported the diffusion of web-based dashboards and infographics, blending geographical, graphical, and statistical representation approaches. We propose a broad conceptualization of Web visualization tools for geo-spatial information, exceptionally employed to communicate the current pandemic; to this end, we study a significant number of publicly available platforms that track, visualize, and communicate indicators related to COVID-19. Our methodology is based on (i) a preliminary systematization of actors, data types, providers, and visualization tools, and on (ii) the creation of a rich collection of relevant sites clustered according to significant parameters. Ultimately, the contribution of this work includes a critical analysis of collected evidence and an extensive modeling effort of Geo-Online Exploratory Data Visualization (Geo-OEDV) tools, synthesized in terms of an Entity-Relationship schema. The COVID-19 pandemic outbreak has offered a significant case to study how and how much modern public communication needs spatially related data and effective implementation of tools whose inspection can impact decision-making at different levels. Our resulting model will allow several stakeholders (general users, policy-makers, and researchers/analysts) to gain awareness on the assets of structured online communication and resource owners to direct future development of these important tools.
|
26
|
Rigden DJ, Fernández XM. The 2021 Nucleic Acids Research database issue and the online molecular biology database collection. Nucleic Acids Res 2021; 49:D1-D9. [PMID: 33396976 PMCID: PMC7778882 DOI: 10.1093/nar/gkaa1216] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Indexed: 12/13/2022]
Abstract
The 2021 Nucleic Acids Research database Issue contains 189 papers spanning a wide range of biological fields and investigation. It includes 89 papers reporting on new databases and 90 covering recent changes to resources previously published in the Issue. A further ten are updates on databases most recently published elsewhere. Seven new databases focus on COVID-19 and SARS-CoV-2 and many others offer resources for studying the virus. Major returning nucleic acid databases include NONCODE, Rfam and RNAcentral. Protein family and domain databases include COG, Pfam, SMART and Panther. Protein structures are covered by RCSB PDB and dispersed proteins by PED and MobiDB. In metabolism and signalling, STRING, KEGG and WikiPathways are featured, along with returning KLIFS and new DKK and KinaseMD, all focused on kinases. IMG/M and IMG/VR update in the microbial and viral genome resources section, while human and model organism genomics resources include Flybase, Ensembl and UCSC Genome Browser. Cancer studies are covered by updates from canSAR and PINA, as well as newcomers CNCdatabase and Oncovar for cancer drivers. Plant comparative genomics is catered for by updates from Gramene and GreenPhylDB. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been substantially updated, revisiting nearly 1000 entries, adding 90 new resources and eliminating 86 obsolete databases, bringing the current total to 1641 databases. It is available at https://www.oxfordjournals.org/nar/database/c/.
Affiliation(s)
- Daniel J Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK
|