Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: He L, Sun S, Zhang Q, Bao X, Li PK. Alignment-free sequence comparison for virus genomes based on location correlation coefficient. Infect Genet Evol 2021;96:105106. [PMID: 34626822 PMCID: PMC8493760 DOI: 10.1016/j.meegid.2021.105106] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/20/2021] [Revised: 09/08/2021] [Accepted: 10/03/2021] [Indexed: 12/18/2022]

Cited by Other Article(s)

Chong LC, Khan AM. A Systematic Bioinformatics Approach for Mapping the Minimal Set of a Viral Peptidome. Curr Protoc 2024;4:e1056. [PMID: 38856995 DOI: 10.1002/cpz1.1056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]

Abstract

Sequence changes in viral genomes generate protein sequence diversity that enables viruses to evade the host immune system, hindering the development of effective preventive and therapeutic interventions. The massive proliferation of sequence data provides unprecedented opportunities to study viral adaptation and evolution. An alignment-free approach removes various restrictions posed by an alignment-dependent approach for studying sequence diversity. The publicly available tool, UNIQmin, offers an alignment-free approach for studying viral sequence diversity at any given rank of taxonomy lineage and is big data ready. The tool performs an exhaustive search to determine the minimal set of sequences required to capture the peptidome diversity within a given dataset. This compression is possible through the removal of identical sequences and unique sequences that do not contribute effectively to the peptidome diversity pool. Herein, we describe a detailed four-part protocol utilizing UNIQmin to generate the minimal set for the purpose of viral diversity analyses, alignment-free at any rank of the taxonomy lineage, using the recent global public health threat Monkeypox virus (MPX) sequence data as a case study. The protocol enables a systematic bioinformatics approach to study sequence diversity across taxonomic lineages, which is crucial for our future preparedness against viral epidemics. This is particularly important when data are abundant, freely available, and alignment is not an option. © 2024 Wiley Periodicals LLC.

Basic Protocol 1: Tool installation and input file preparation
Basic Protocol 2: Generation of a minimal set of sequences for a given dataset
Basic Protocol 3: Comparative minimal set analysis across taxonomic lineage ranks
Basic Protocol 4: Factors affecting the minimal set of sequences
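The minimal-set idea in this abstract can be illustrated with a short sketch. It is not UNIQmin's own procedure: the abstract describes an exhaustive search, whereas the code below uses a greedy set-cover approximation. The function names and the k-mer length `k=9` are assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_minimal_set(sequences, k=9):
    """Pick a small subset of sequences that still covers every k-mer.

    Greedy set cover: an approximation for illustration only,
    not the exhaustive search the abstract attributes to UNIQmin.
    """
    # Drop identical sequences while keeping first-seen order.
    unique = list(dict.fromkeys(sequences))
    kmer_sets = {s: kmers(s, k) for s in unique}
    # Every distinct k-mer in the dataset must end up covered.
    uncovered = set().union(*kmer_sets.values())
    chosen = []
    while uncovered:
        # Take the sequence that adds the most not-yet-covered k-mers.
        best = max(unique, key=lambda s: len(kmer_sets[s] & uncovered))
        chosen.append(best)
        uncovered -= kmer_sets[best]
    return chosen


if __name__ == "__main__":
    toy = ["ABCDE", "ABCDE", "BCD", "CDEFG"]  # hypothetical toy peptides
    print(greedy_minimal_set(toy, k=3))
```

In the toy input, the duplicate `ABCDE` is removed first, and `BCD` is never selected because its only 3-mer is already covered by `ABCDE`.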

Wang T, Yu ZG, Li J. CGRWDL: alignment-free phylogeny reconstruction method for viruses based on chaos game representation weighted by dynamical language model. Front Microbiol 2024;15:1339156. [PMID: 38572227 PMCID: PMC10987876 DOI: 10.3389/fmicb.2024.1339156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 02/23/2024] [Indexed: 04/05/2024]

Qiu X, Liu Y, Sha A. SARS-CoV-2 and natural infection in animals. J Med Virol 2023;95:e28147. [PMID: 36121159 PMCID: PMC9538246 DOI: 10.1002/jmv.28147] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 06/20/2022] [Revised: 09/02/2022] [Accepted: 09/12/2022] [Indexed: 01/11/2023]

Jain M, Patil N, Gor D, Sharma MK, Goel N, Kaushik P. Proteomic Approach for Comparative Analysis of the Spike Protein of SARS-CoV-2 Omicron (B.1.1.529) Variant and Other Pango Lineages. Proteomes 2022;10:34. [PMID: 36278694 PMCID: PMC9624331 DOI: 10.3390/proteomes10040034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Revised: 10/11/2022] [Accepted: 10/13/2022] [Indexed: 11/24/2022]

Silva JM, Pratas D, Caetano T, Matos S. The complexity landscape of viral genomes. Gigascience 2022;11:6661051. [PMID: 35950839 PMCID: PMC9366995 DOI: 10.1093/gigascience/giac079] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Revised: 05/25/2022] [Accepted: 07/26/2022] [Indexed: 12/11/2022]

Abstract

BACKGROUND

Viruses are among the shortest yet highly abundant species that harbor minimal instructions to infect cells, adapt, multiply, and exist. However, despite the current substantial availability of viral genome sequences, the scientific repertory lacks a complexity landscape that automatically illuminates viral genomes' organization, relation, and fundamental characteristics.

RESULTS

This work provides a comprehensive landscape of the viral genome's complexity (or quantity of information), identifying the most redundant and complex groups regarding their genome sequence while providing their distribution and characteristics at a large and local scale. Moreover, we identify and quantify the abundance of inverted repeats in viral genomes. For this purpose, we measure the sequence complexity of each available viral genome using data compression, demonstrating that adequate data compressors can efficiently quantify the complexity of viral genome sequences, including subsequences better represented by algorithmic sources (e.g., inverted repeats). Using a state-of-the-art genomic compressor on an extensive viral genomes database, we show that double-stranded DNA viruses are, on average, the most redundant viruses while single-stranded DNA viruses are the least redundant. In contrast, double-stranded RNA viruses show a lower redundancy relative to single-stranded RNA viruses. Furthermore, we extend the ability of data compressors to quantify local complexity (or information content) in viral genomes using complexity profiles, providing for the first time a direct complexity analysis of human herpesviruses. We also conceive a features-based classification methodology that can accurately distinguish viral genomes at different taxonomic levels without direct comparisons between sequences. This methodology combines data compression with simple measures such as GC-content percentage and sequence length, followed by machine learning classifiers.

CONCLUSIONS

This article presents methodologies and findings that are highly relevant for understanding the patterns of similarity and singularity between viral groups, opening new frontiers for studying viral genomes' organization while depicting the complexity trends and classification components of these genomes at different taxonomic levels. The whole study is supported by an extensive website (https://asilab.github.io/canvas/) for comprehending the viral genome characterization using dynamic and interactive approaches.
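The compression-based complexity measure and the simple features named in this abstract (GC-content percentage and sequence length) can be sketched as follows. This is a minimal illustration under stated assumptions: the general-purpose `zlib` compressor stands in for the specialized genomic compressor the authors used, the normalization by 2 bits per base assumes a four-letter DNA alphabet, and all function names are hypothetical.

```python
import zlib


def normalized_compression(seq):
    """Compressed size in bits divided by the 2 bits/base of a uniform 4-letter sequence.

    Lower values suggest more redundancy. zlib is a stand-in compressor
    (an assumption), not the genomic compressor used in the study.
    """
    if not seq:
        raise ValueError("empty sequence")
    compressed_bits = 8 * len(zlib.compress(seq.encode("ascii"), 9))
    return compressed_bits / (2 * len(seq))


def gc_content(seq):
    """Percentage of G and C bases in the sequence."""
    if not seq:
        raise ValueError("empty sequence")
    upper = seq.upper()
    return 100.0 * (upper.count("G") + upper.count("C")) / len(upper)


def features(seq):
    """Feature vector (complexity, GC %, length) that could feed a classifier."""
    return [normalized_compression(seq), gc_content(seq), len(seq)]


if __name__ == "__main__":
    repetitive = "ACGT" * 500  # hypothetical, highly redundant sequence
    print(features(repetitive))
```

A general-purpose compressor carries fixed header overhead and models DNA poorly, so the absolute values are only meaningful for comparing sequences of similar length.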

Ahmad SU, Hafeez Kiani B, Abrar M, Jan Z, Zafar I, Ali Y, Alanazi AM, Malik A, Rather MA, Ahmad A, Khan AA. A comprehensive genomic study, mutation screening, phylogenetic and statistical analysis of SARS-CoV-2 and its variant omicron among different countries. J Infect Public Health 2022;15:878-891. [PMID: 35839568 PMCID: PMC9262654 DOI: 10.1016/j.jiph.2022.07.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 02/21/2022] [Revised: 06/16/2022] [Accepted: 07/03/2022] [Indexed: 01/09/2023]

Abstract

BACKGROUND

With the rapid growth of genomic sequence data for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and its variants Delta (B.1.617.2) and Omicron (B.1.1.529), it is vital to successfully identify mutations within the genome.

OBJECTIVE

The main objective of the study is to analyze mutations across the full-length genomes of 157 isolates of SARS-CoV-2 and its Delta and Omicron variants. This study also examines possible effects at the structural level to understand the role of mutations, offers new insights into the evolution of COVID-19, and evaluates differences in viral genome sequences among different nations. We have also tried to offer a mutation snapshot for these differences that could help in vaccine formulation. This study utilizes a unique and efficient method of targeting the stable genes for the drug discovery approach.

METHODS

Complete genome sequence information for SARS-CoV-2, Delta, and Omicron from online resources was used for structural domain identification, data mining, and screening, employing different bioinformatics tools. BioEdit software was used to align the genomes across countries, and a phylogenetic tree was constructed with 500 bootstrap replicates. Heterozygosity ratios were determined in silico. A minimum spanning network (MSN) of selected populations was determined by Bruvo's distance role-based framework.

RESULTS

Across the complete genome sequences of all 157 strains of SARS-CoV-2 and its variants from different countries, Corona nucleoca and DUF5515 were observed to be the most conserved domains. All genomes showed changes in comparison to the Wuhan-Hu-1 strain, mainly in the TRS region (CUAAAC or ACGAAC). We discovered 596 mutations in all genes, with the highest number (321) found in ORF1ab (QHD43415.1), while TRS site mutations were found only in ORF7a (1) and ORF10 (2). The Omicron variant has 30 mutations in the Spike protein and has a higher alpha-helix content (23.46%) than the Delta variant (22.03%). T478 was also discovered to be a prevalent polymorphism in the Delta and Omicron variants, as well as genomic gaps ranging from 45 to 65 aa. All 157 sequences contained variations and conformed to Nei's genetic distance. We discovered heterozygosity (Hs) 0.01, mean anticipated Hs 0.32, the genetic diversity index (GDI) 0.01943989, and GD within population 0.01266951. The Hedrick value was 0.52324978, the GD coefficient was 0.52324978, and the average Hs was 0.01371452. Among the countries, Brazil has the highest standard error (SE) rate (1.398), whereas Japan has the highest ratio of Nei's gene diversity (0.01).

CONCLUSIONS

The study's findings will assist in comprehending the shape and kind of the complete genome, its streaming genomic sequences, and mutations in various additions of SARS-CoV-2, as well as its variant strains such as Omicron. These results will provide a scientific basis for designing vaccines and for genomic studies of these viruses.
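The abstract reports Nei's gene diversity but does not state which estimator or software was used. As a hedged illustration only, the textbook definition at a single site, H = 1 - sum of p_i^2 over allele frequencies p_i, with an optional n/(n-1) small-sample correction, can be sketched as follows. The function name and the example alignment column are hypothetical.

```python
from collections import Counter


def nei_gene_diversity(alleles, unbiased=True):
    """Nei's gene diversity at one site: 1 - sum(p_i^2).

    `alleles` is a list of observed states (e.g. one alignment column).
    With unbiased=True the n/(n-1) small-sample correction is applied.
    Whether the study used this estimator is not stated (assumption).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two observations")
    freqs = [count / n for count in Counter(alleles).values()]
    h = 1.0 - sum(p * p for p in freqs)
    return h * n / (n - 1) if unbiased else h


if __name__ == "__main__":
    column = list("AATT")  # hypothetical alignment column from four genomes
    print(nei_gene_diversity(column, unbiased=False))
    print(nei_gene_diversity(column))
```

A site where every genome carries the same base gives a diversity of zero, and the value rises as alleles become more evenly distributed.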
