1
Ji C, Shao J. Shine: A novel strategy to extract specific, sensitive and well-conserved biomarkers from massive microbial genomic datasets. BMC Bioinformatics 2023; 24:128. [PMID: 37016282 PMCID: PMC10071469 DOI: 10.1186/s12859-023-05195-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 02/17/2023] [Indexed: 04/06/2023] Open
Abstract
BACKGROUND Concentrations of pathogenic microorganisms' DNA in biological samples are typically low. DNA diagnostics of common infections are therefore costly, challenging, and rarely accurate. Because they fail to cover updated epidemic testing samples, computational services are difficult to implement in clinical applications without complex customized settings. Furthermore, the combined biomarkers used to maintain high conservation may not be cost effective and could cause experimental errors in many clinical settings. Among recently developed technologies, 16S rRNA is too conserved to distinguish closely related species, and mosaic plasmids are also ineffective because of their uneven distribution across prokaryotic taxa.
RESULTS Here, we provide a computational strategy, Shine, that allows extraction of specific, sensitive and well-conserved biomarkers from massive microbial genomic datasets. In contrast to simple concatenation with BLAST-based filtering, our method involves a de novo genome alignment-based pipeline to explore the original and specific repetitive biomarkers in the defined population. It can cover all members to detect newly discovered multicopy conserved species-specific or even subspecies-specific target probes and primer sets. The method has been successfully applied to a number of clinical projects and offers automated detection of all pathogenic microorganisms without the limitations of genome annotation and incompletely assembled motifs. Using our pipeline, users may select different configuration parameters depending on the purpose of the project for routine clinical detection practices on the website https://bioinfo.liferiver.com.cn with easy registration.
CONCLUSIONS The proposed strategy is suitable for identifying shared phylogenetic markers with low false-positive and false-negative rates. This technology is suitable for the automatic design of minimal and efficient PCR primers and other types of detection probes.
Affiliation(s)
- Cong Ji
- Liferiver Science and Technology Institute, Shanghai ZJ Bio-Tech Co., Ltd., Shanghai, China.
- Junbin Shao
- Liferiver Science and Technology Institute, Shanghai ZJ Bio-Tech Co., Ltd., Shanghai, China.
2
Determinants of Virus Variation, Evolution, and Host Adaptation. Pathogens 2022; 11:1039. [PMID: 36145471 PMCID: PMC9501407 DOI: 10.3390/pathogens11091039] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/08/2022] [Revised: 09/06/2022] [Accepted: 09/09/2022] [Indexed: 11/17/2022] Open
Abstract
Virus evolution is the change in the genetic structure of a viral population over time and results in the emergence of new viral variants, strains, and species with novel biological properties, including adaptation to new hosts. There are host, vector, environmental, and viral factors that contribute to virus evolution. To achieve or fine tune compatibility and successfully establish infection, viruses adapt to a particular host species or to a group of species. However, some viruses are better able to adapt to diverse hosts, vectors, and environments. Viruses generate genetic diversity through mutation, reassortment, and recombination. Plant viruses are exposed to genetic drift and selection pressures by host and vector factors, and random variants or those with a competitive advantage are fixed in the population and mediate the emergence of new viral strains or species with novel biological properties. This process creates a footprint in the virus genome evident as the preferential accumulation of substitutions, insertions, or deletions in areas of the genome that function as determinants of host adaptation. Here, with respect to plant viruses, we review the current understanding of the sources of variation, the effect of selection, and its role in virus evolution and host adaptation.
3
Miao M, De Clercq E, Li G. Towards Efficient and Accurate SARS-CoV-2 Genome Sequence Typing Based on Supervised Learning Approaches. Microorganisms 2022; 10:1785. [PMID: 36144387 PMCID: PMC9505117 DOI: 10.3390/microorganisms10091785] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/11/2022] [Revised: 08/24/2022] [Accepted: 09/01/2022] [Indexed: 11/16/2022] Open
Abstract
Despite the active development of SARS-CoV-2 surveillance methods (e.g., Nextstrain, GISAID, Pangolin), the global emergence of various SARS-CoV-2 viral lineages that potentially cause antiviral and vaccine failure has driven the need for accurate and efficient SARS-CoV-2 genome sequence classifiers. This study presents an optimized method that accurately identifies the viral lineages of SARS-CoV-2 genome sequences using existing schemes. For Nextstrain and GISAID clades, a template matching-based method is proposed to quantify the differences between viral clades and to play an important role in classification evaluation. Furthermore, to improve the typing accuracy of SARS-CoV-2 genome sequences, an ensemble model that integrates a combination of machine learning-based methods (such as Random Forest and CatBoost) with optimized weights is proposed for Nextstrain, Pangolin, and GISAID clades. Cross-validation is applied to optimize the parameters of the machine learning-based method and the weight settings of the ensemble model. To improve the efficiency of the model, in addition to the one-hot encoding method, we have proposed a nucleotide site mutation-based data structure that requires fewer computational resources and performs better in SARS-CoV-2 genome sequence typing. Based on an accumulated database of >1 million SARS-CoV-2 genome sequences, performance evaluations show that the proposed system has a typing accuracy of 99.879%, 97.732%, and 96.291% for Nextstrain, Pangolin, and GISAID clades, respectively. A single prediction takes an average of <20 ms on a portable laptop. Overall, this study provides an efficient and accurate SARS-CoV-2 genome sequence typing system that benefits current and future surveillance of SARS-CoV-2 variants.
Affiliation(s)
- Miao Miao
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Xiangya School of Public Health, Central South University, Changsha 410078, China
- Erik De Clercq
- Department of Microbiology, Immunology and Transplantation, Rega Institute for Medical Research, KU Leuven, 3000 Leuven, Belgium
- Guangdi Li
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Xiangya School of Public Health, Central South University, Changsha 410078, China
- Hunan Children’s Hospital, Changsha 410007, China
- Correspondence: Tel.: +86-731-8480-5414
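The abstract of entry 3 contrasts one-hot encoding with a "nucleotide site mutation-based data structure" but does not describe its layout. The following is a minimal sketch of the general idea only, assuming sequences that are already aligned to a reference of equal length; the function names `one_hot` and `site_mutations` and the toy sequences are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

NUCLEOTIDES = "ACGT"


def one_hot(seq: str) -> List[List[int]]:
    """Dense encoding: one 4-element row per site (all zeros for N or gaps)."""
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq.upper()]


def site_mutations(reference: str, seq: str) -> Dict[int, str]:
    """Sparse encoding: keep only the 1-based sites that differ from the reference."""
    if len(reference) != len(seq):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), seq.upper())
    return {i + 1: b for i, (a, b) in enumerate(pairs) if a != b}


# Toy, made-up sequences (assumption: already aligned, no indels).
reference = "ATGGCTAGCT"
sample = "ATGACTAGTT"

dense = one_hot(sample)                     # 10 sites x 4 values
sparse = site_mutations(reference, sample)  # only the differing sites
```

For closely related genomes the sparse form stores a handful of entries instead of four values per site, which is one plausible reason such a structure would need fewer resources.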
4
Cacciabue M, Aguilera P, Gismondi MI, Taboga O. Covidex: An ultrafast and accurate tool for SARS-CoV-2 subtyping. Infection, Genetics and Evolution: Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases 2022; 99:105261. [PMID: 35231666 PMCID: PMC8881885 DOI: 10.1016/j.meegid.2022.105261] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/17/2021] [Revised: 12/20/2021] [Accepted: 02/23/2022] [Indexed: 11/29/2022]
Abstract
The epidemiological surveillance of SARS-CoV-2 by means of whole-genome sequencing has revealed the emergence and co-existence of multiple viral lineages or subtypes throughout the world. Moreover, it has been shown that several subtypes of this virus display particular phenotypes, such as increased transmissibility or reduced susceptibility to neutralizing antibodies, leading to the denomination of Variants of Interest (VOI) or Variants of Concern (VOC). Thus, subtyping of SARS-CoV-2 is a crucial step for the surveillance of this pathogen. Here, we present Covidex, an open-source, alignment-free machine learning subtyping tool. It is a Shiny web app that allows ultra-fast and accurate classification of SARS-CoV-2 genome sequences into the three most used nomenclature systems (GISAID, Nextstrain, Pango lineages). It also categorizes input sequences as VOI or VOC, according to current definitions. The program is cross-platform compatible and is available via SourceForge at https://sourceforge.net/projects/covidex or via the web application at http://covidex.unlu.edu.ar.
Affiliation(s)
- Marco Cacciabue
- Instituto de Agrobiotecnología y Biología Molecular (IABIMO), Instituto Nacional de Tecnología Agropecuaria (INTA), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), De los Reseros y N. Repetto s/n, Hurlingham B1686IGC, Buenos Aires, Argentina; Universidad Nacional de Luján, Departamento de Ciencias Básicas, Av. Constitución y RN 5, 6700 Luján, Buenos Aires, Argentina.
- Pablo Aguilera
- Instituto de Agrobiotecnología y Biología Molecular (IABIMO), Instituto Nacional de Tecnología Agropecuaria (INTA), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), De los Reseros y N. Repetto s/n, Hurlingham B1686IGC, Buenos Aires, Argentina; Universidad Nacional de Luján, Departamento de Ciencias Básicas, Av. Constitución y RN 5, 6700 Luján, Buenos Aires, Argentina
- María Inés Gismondi
- Instituto de Agrobiotecnología y Biología Molecular (IABIMO), Instituto Nacional de Tecnología Agropecuaria (INTA), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), De los Reseros y N. Repetto s/n, Hurlingham B1686IGC, Buenos Aires, Argentina; Universidad Nacional de Luján, Departamento de Ciencias Básicas, Av. Constitución y RN 5, 6700 Luján, Buenos Aires, Argentina
- Oscar Taboga
- Instituto de Agrobiotecnología y Biología Molecular (IABIMO), Instituto Nacional de Tecnología Agropecuaria (INTA), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), De los Reseros y N. Repetto s/n, Hurlingham B1686IGC, Buenos Aires, Argentina
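The abstract of entry 4 describes Covidex as alignment-free but does not say which features it uses. As an illustration of what an alignment-free representation can look like, the sketch below turns a sequence into normalized k-mer frequencies that a classifier could consume without any alignment step. Treating k-mer profiles as the feature type is an assumption here, and `kmer_profile` is a hypothetical helper, not part of Covidex.

```python
from collections import Counter
from itertools import product
from typing import List


def kmer_profile(seq: str, k: int = 3) -> List[float]:
    """Return normalized k-mer frequencies in a fixed A/C/G/T vocabulary order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed vocabulary keeps the vector length at 4**k for every input;
    # k-mers containing ambiguous bases (e.g. N) are simply ignored.
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in vocab)
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]


# Toy, made-up sequence; real genomes would be tens of thousands of bases.
profile = kmer_profile("ACGTACGT", k=2)
```

Because every sequence maps to a vector of the same length regardless of its own length, such profiles can be passed directly to standard classifiers.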
5
Gorbalenya AE, Lauber C. Bioinformatics of virus taxonomy: foundations and tools for developing sequence-based hierarchical classification. Curr Opin Virol 2021; 52:48-56. [PMID: 34883443 DOI: 10.1016/j.coviro.2021.11.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/22/2021] [Revised: 10/22/2021] [Accepted: 11/04/2021] [Indexed: 11/03/2022]
Abstract
The genome sequence is the only characteristic readily obtainable for all known viruses, underlying the growing role of comparative genomics in organizing knowledge about viruses in a systematic, evolution-aware way, known as virus taxonomy. Overseen by the International Committee on Taxonomy of Viruses (ICTV), development of virus taxonomy involves taxa demarcation at 15 ranks of a hierarchical classification, often in a host-specific manner. Outside the ICTV remit, researchers assess how to fit numerous unclassified viruses into the established taxa. They employ different metrics of virus clustering, based on conserved domain(s), separation of viruses in rooted phylogenetic trees, and pair-wise distance space. Computational approaches differ further with respect to methodology, number of ranks considered, sensitivity to uneven virus sampling, and visualization of results. Advancing and using computational tools will be critical for improving taxa demarcation across the virosphere and resolving rank origins in research that may also inform experimental virology.
Affiliation(s)
- Alexander E Gorbalenya
- Department of Medical Microbiology, Leiden University Medical Center, Leiden, The Netherlands; Faculty of Bioengineering and Bioinformatics and Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, 119899, Moscow, Russia.
- Chris Lauber
- Institute for Experimental Virology, TWINCORE Centre for Experimental and Clinical Infection Research, A Joint Venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany
6
Hebbani AV, Pulakuntla S, Pannuru P, Aramgam S, Badri KR, Reddy VD. COVID-19: comprehensive review on mutations and current vaccines. Arch Microbiol 2021; 204:8. [PMID: 34873656 PMCID: PMC8647783 DOI: 10.1007/s00203-021-02606-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/03/2021] [Revised: 11/09/2021] [Accepted: 11/22/2021] [Indexed: 12/15/2022]
Abstract
Viral outbreaks have long been a threat to the human race. Several epidemics and pandemics have been reported in the past, with serious consequences for human health and subsequent social and economic effects. According to the WHO, viral infections continue to be a major health concern globally. The novel coronavirus SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) causes the most recent pandemic infectious disease, COVID-19 (coronavirus disease 2019). At the time of writing, 249 million COVID-19 infections and more than 5 million deaths had been reported worldwide, and the number of new cases was increasing drastically. Development of therapies to treat infected cases and of prophylactic agents, including vaccines effective against different variants, is crucial to curtail the COVID-19 pandemic. Given the high mortality and morbidity and the risk of the virus causing further epidemic outbreaks, the development of additional effective therapeutic and preventive strategies is highly warranted. Prevention, early detection and treatment will reduce the spread of the COVID-19 pandemic. The present review highlights the novel mutations and therapeutic updates associated with coronaviruses, along with the clinical manifestations, diagnosis, clinical management, and prophylactic and therapeutic strategies of COVID-19 infection.
Affiliation(s)
- Swetha Pulakuntla
- Department of Biochemistry, REVA University, Bengaluru, 560064, India
- Padmavathi Pannuru
- DR Biosciences, Research and Development Institute, Bettahalasur, Bengaluru, 562157, India
- Sreelatha Aramgam
- Department of Biochemistry, REVA University, Bengaluru, 560064, India
- Department of Radiation Oncology, Emory University School of Medicine, Atlanta, GA, 30322, USA
- Kameswara Rao Badri
- Department of Pharmacology and Toxicology, Cardiovascular Research Institute, Morehouse School of Medicine, Atlanta, GA, 30310, USA.
- Clinical Analytical Chemistry Laboratory, Clinical Research Center, Morehouse School of Medicine, Atlanta, GA, 30310, USA.
7
Kormelink R, Verchot J, Tao X, Desbiez C. The Bunyavirales: The Plant-Infecting Counterparts. Viruses 2021; 13:842. [PMID: 34066457 PMCID: PMC8148189 DOI: 10.3390/v13050842] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 04/07/2021] [Revised: 04/26/2021] [Accepted: 04/29/2021] [Indexed: 12/18/2022] Open
Abstract
Negative-strand (-) RNA viruses (NSVs) comprise a large and diverse group of viruses that are generally divided into those with non-segmented and those with segmented genomes. Whereas most NSVs infect animals and humans, the smaller group of plant-infecting counterparts is expanding, with many causing devastating diseases worldwide and affecting a large number of major bulk and high-value food crops. In 2018, the taxonomy of segmented NSVs underwent a major reorganization with the establishment of the order Bunyavirales. This article gives an overview of the major plant viruses that are part of the order, i.e., orthotospoviruses (Tospoviridae), tenuiviruses (Phenuiviridae), and emaraviruses (Fimoviridae), and provides updates on the more recent ongoing research. Features shared with the animal-infecting counterparts are mentioned; however, special attention is given to their adaptation to plant hosts and vector transmission, including intra/intercellular trafficking and viral counter defense to antiviral RNAi.
Affiliation(s)
- Richard Kormelink
- Laboratory of Virology, Department of Plant Sciences, Wageningen University, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
- Jeanmarie Verchot
- Department of Plant Pathology and Microbiology, Texas A&M University, College Station, TX 77843, USA
- Xiaorong Tao
- Department of Plant Pathology, Nanjing Agricultural University, Nanjing 210095, China