Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Qiu P, Cai XY, Ding W, Zhang Q, Norris ED, Greene JR. HCV genotyping using statistical classification approach. J Biomed Sci 2009;16:62. [PMID: 19586537 PMCID: PMC2720937 DOI: 10.1186/1423-0127-16-62] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/01/2009] [Accepted: 07/08/2009] [Indexed: 01/24/2023] Open

Cited by Other Article(s)

Fahmy AM, Hammad MS, Mabrouk MS, Al-Atabany WI. On leveraging self-supervised learning for accurate HCV genotyping. Sci Rep 2024;14:15463. [PMID: 38965254 PMCID: PMC11224313 DOI: 10.1038/s41598-024-64209-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 06/06/2024] [Indexed: 07/06/2024] Open

PWM2Vec: An Efficient Embedding Approach for Viral Host Specification from Coronavirus Spike Sequences. Biology 2022;11:biology11030418. [PMID: 35336792 PMCID: PMC8945605 DOI: 10.3390/biology11030418] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/04/2022] [Revised: 02/24/2022] [Accepted: 03/07/2022] [Indexed: 01/14/2023]

Abstract

Simple Summary

The family of coronaviruses comprises a diverse set of strains and variants which cause diseases from the common cold to COVID-19. Moreover, they infect a wide array of hosts, from bats, camels, and birds to humans. Studying coronaviruses through the lens of host specificity provides a unique perspective for understanding the evolution, diversity and dynamics of this family. In particular, this can reveal groups of different hosts infected by similar strains, giving clues on strains which were more likely to have evolved to jump from one host to another. In this work, we frame host specificity as a classification task, designing a very compact numerical representation of the spike sequences of different coronaviruses. Based on this numerical representation, classification methods are able to detect the target host with high accuracy. Such an approach can be used to efficiently scale to large volumes of sequences, in order to unveil trends in the host specificity of different coronavirus strains.

Abstract

The study of host specificity has important connections to the question about the origin of SARS-CoV-2 in humans which led to the COVID-19 pandemic, an important open question. There are speculations that bats are a possible origin. Likewise, there are many closely related (corona)viruses, such as SARS, which was found to be transmitted through civets. The study of the different hosts which can be potential carriers and transmitters of deadly viruses to humans is crucial to understanding, mitigating, and preventing current and future pandemics. In coronaviruses, the surface (S) protein, or spike protein, is important in determining host specificity, since it is the point of contact between the virus and the host cell membrane. In this paper, we classify the hosts of over five thousand coronaviruses from their spike protein sequences, segregating them into clusters of distinct hosts among birds, bats, camels, swine, humans, and weasels, to name a few. We propose a feature embedding based on the well-known position weight matrix (PWM), which we call PWM2Vec, and we use it to generate feature vectors from the spike protein sequences of these coronaviruses. While our embedding is inspired by the success of PWMs in biological applications, such as determining protein function and identifying transcription factor binding sites, we are the first (to the best of our knowledge) to use PWMs from viral sequences to generate fixed-length feature vector representations, and use them in the context of host classification. The results on real-world data show that when using PWM2Vec, machine learning classifiers are able to perform comparably to the baseline models in terms of predictive performance and runtime; in some cases, the performance is better. We also measure the importance of different amino acids using information gain to show the amino acids which are important for predicting the host of a given coronavirus. Finally, we perform some statistical analyses on these results to show that our embedding is more compact than the embeddings of the baseline models.
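
As a rough illustration of the idea described in this abstract, a PWM-based embedding might be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names, the k-mer length, the pseudocount and the uniform background frequency are all assumptions made for illustration. It scores each k-mer of a sequence against a position weight matrix built from that sequence's own k-mers; equal-length (for example, aligned) sequences then yield vectors of equal length.

```python
import math
from collections import Counter

# The 20 standard amino acids; other symbols (e.g. X) get a neutral score of 0.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def position_weight_matrix(kmer_list, k, pseudocount=1.0):
    """Build a PWM as a list of k dicts: residue -> log2 odds vs. a uniform background.

    The pseudocount and uniform background are illustrative assumptions.
    """
    background = 1.0 / len(AMINO_ACIDS)
    total = len(kmer_list) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(k):
        counts = Counter(km[pos] for km in kmer_list)
        pwm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def pwm_embedding(seq, k=3):
    """Score every k-mer of seq against the PWM built from seq's own k-mers."""
    kms = kmers(seq, k)
    pwm = position_weight_matrix(kms, k)
    return [
        sum(pwm[pos].get(aa, 0.0) for pos, aa in enumerate(km))
        for km in kms
    ]
```

Calling `pwm_embedding` on a sequence of length n returns n - k + 1 scores, which could then be passed to any standard classifier as a feature vector.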

Singh A, Mankotia DS, Irshad M. A Single-step Multiplex Quantitative Real Time Polymerase Chain Reaction Assay for Hepatitis C Virus Genotypes. J Transl Int Med 2017;5:34-42. [PMID: 28680837 DOI: 10.1515/jtim-2017-0010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/15/2023] Open

Abstract

BACKGROUND AND OBJECTIVES

The variable response of hepatitis C virus (HCV) genotypes to antiviral treatment requires prior information on genotype status before planning a therapeutic strategy. Although assays for typing or subtyping of HCV are available, a fast and reliable assay system is still needed. The present study was planned to develop a single-step multiplex quantitative real time polymerase chain reaction (qPCR) assay to determine HCV genotypes in patients' sera.

METHODS

The conserved sequences from the 5' UTR, core and NS5b regions of the HCV genome were used to design primers and hydrolysis probes labeled with fluorophores. Starting with the standardization of singleplex qPCR for each individual HCV genotype, the experimental conditions were finally optimized for the development of the multiplex assay. The sensitivity and specificity were assessed for both singleplex and multiplex assays. Using a template concentration of 10² copies per microliter, the quantification cycle (Cq) value and the limit of detection (LOD) were also compared for both singleplex and multiplex assays. Similarly, the merit of the multiplex assay was compared with the sequence analysis and restriction fragment length polymorphism (RFLP) techniques used for HCV genotyping. To demonstrate its application, the multiplex qPCR assay was used for genotyping in a panel of 98 sera positive for HCV RNA, identified after screening a total of 239 patients with various liver diseases.

RESULTS

The results demonstrated the presence of genotype 1 in 26 of 98 (26.53%) sera, genotype 3 in 65 (66.32%) and genotype 4 in 2 (2.04%) serum samples. One sample showed mixed infection with genotypes 1 and 3. No genotype could be identified in five samples. Genotypes 2, 5 and 6 were not detected in these serum samples. The analysis of sera by singleplex and RFLP indicated that the multiplex results were comparable with singleplex, with a clear merit of multiplex over RFLP. In addition, the results of the multiplex assay were found to be comparable with those from sequence analysis. The sensitivity, specificity, Cq values and LOD values were compared and found to be closely associated for both singleplex and multiplex assays.

CONCLUSION

The multiplex qPCR assay was found to be a fast, specific and sensitive method that can be used as a technique of choice for HCV genotyping in all routine laboratories.

Qiu P, Stevens R, Wei B, Lahser F, Howe AYM, Klappenbach JA, Marton MJ. HCV genotyping from NGS short reads and its application in genotype detection from HCV mixed infected plasma. PLoS One 2015;10:e0122082. [PMID: 25830316 PMCID: PMC4382110 DOI: 10.1371/journal.pone.0122082] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/15/2014] [Accepted: 02/10/2015] [Indexed: 12/12/2022] Open

Abstract

Genotyping of hepatitis C virus (HCV) plays an important role in the treatment of HCV. As new genotype-specific treatment options become available, it has become increasingly important to have accurate HCV genotype and subtype information to ensure that the most appropriate treatment regimen is selected. Most current genotyping methods are unable to detect mixed genotypes from two or more HCV infections. Next generation sequencing (NGS) allows for rapid and low cost mass sequencing of viral genomes and provides an opportunity to probe the viral population from a single host. In this paper, the possibility of using short NGS reads for direct HCV genotyping without genome assembly was evaluated. We surveyed the publicly-available genetic content of three HCV drug target regions (NS3, NS5A, NS5B) in terms of whether these genes contained genotype-specific regions that could predict genotype. Six genotypes and 38 subtypes were included in this study. An automated phylogenetic analysis based HCV genotyping method was implemented and used to assess different HCV target gene regions. Candidate regions of 250-bp each were found for all three genes that have enough genetic information to predict HCV genotypes/subtypes. Validation using public datasets shows 100% genotyping accuracy. To test whether these 250-bp regions were sufficient to identify mixed genotypes, we developed a random primer-based method to sequence HCV plasma samples containing mixtures of two HCV genotypes in different ratios. We were able to determine the genotypes without ambiguity and to quantify the ratio of the abundances of the mixed genotypes in the samples. These data provide a proof-of-concept that this random primed, NGS-based short-read genotyping approach does not need prior information about the viral population and is capable of detecting mixed viral infection.
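
The idea of assigning short reads to genotypes and then quantifying a mixture can be illustrated with a toy sketch. The abstract describes a phylogenetic-analysis-based method; the code below deliberately swaps in a much simpler nearest-reference rule based on Hamming distance, so it should be read only as an illustration of the counting step. The function names, the toy reference sequences and the assumption that reads are already aligned to the same region as the references are all hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_genotype(read, references):
    """Return the label of the closest reference, or None if the best match is tied."""
    scored = sorted((hamming(read, ref), label) for label, ref in references.items())
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None  # ambiguous read: do not count it
    return scored[0][1]


def genotype_ratios(reads, references):
    """Fraction of unambiguously assigned reads per genotype label."""
    counts = Counter(assign_genotype(r, references) for r in reads)
    counts.pop(None, None)  # drop ambiguous reads
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Toy, made-up 8-base "references"; real candidate regions would be about 250 bp.
toy_references = {"1a": "ACGTACGT", "3a": "TTGTACCA"}
toy_reads = ["ACGTACGT", "ACGTACGA", "TTGTACCA", "ACGAACGT"]
ratios = genotype_ratios(toy_reads, toy_references)
```

With these toy inputs, three of the four reads are closest to the hypothetical `1a` reference and one to `3a`, so `ratios` estimates a 3:1 mixture.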

Identification of novel small molecules as inhibitors of hepatitis C virus by structure-based virtual screening. Int J Mol Sci 2013;14:22845-56. [PMID: 24264035 PMCID: PMC3856094 DOI: 10.3390/ijms141122845] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 09/12/2013] [Revised: 11/06/2013] [Accepted: 11/07/2013] [Indexed: 12/30/2022] Open

Rotella DP. The discovery and development of boceprevir. Expert Opin Drug Discov 2013;8:1439-47. [PMID: 24079543 DOI: 10.1517/17460441.2013.843525] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]

Den Boer JW, Euser SM, Nagelkerke NJ, Schuren F, Jarraud S, Etienne J. Prediction of the origin of French Legionella pneumophila strains using a mixed-genome microarray. BMC Genomics 2013;14:435. [PMID: 23815549 PMCID: PMC3701591 DOI: 10.1186/1471-2164-14-435] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/22/2013] [Accepted: 06/19/2013] [Indexed: 11/17/2022] Open

Abstract

Background

Legionella is a water and soil bacterium that can infect humans, causing a pneumonia known as Legionnaires’ disease. The pneumonia is almost exclusively caused by the species L. pneumophila, of which serogroup 1 is responsible for 90% of patients. Within serogroup 1, large differences in prevalence in clinical isolates have been described. A recent study, using a Dutch Legionella strain collection, identified five virulence associated markers. In our study, we verify whether these five Dutch markers can predict the patient or environmental origin of a French Legionella strain collection. In addition, we identify new potential virulence markers and verify whether these can predict better. A total of 219 French patient isolates and environmental strains were compared using a mixed-genome micro-array. The micro-array data were analysed to identify predictive markers, using a Random Forest algorithm combined with a logistic regression model. The sequences of the identified markers were compared with eleven known Legionella genomes, using BlastN and BlastX; the functionality for each of the predictive markers was checked in the literature.

Results

The five Dutch markers insufficiently predicted the patient or environmental origin of the French Legionella strains. Subsequent analyses identified four predictive markers for the French collection that were used for the logistic regression model. This model showed a negative predictive value of 91%. Three of the French markers differed from the Dutch markers; one showed considerable overlap and was found in one of the Legionella genomes (Lorraine strain). This marker encodes a structural toxin protein, RtxA, described for L. pneumophila as a factor involved in virulence and entry in both human cells and amoebae.

Conclusions

The combination of a mixed-genome micro-array and statistical analysis using a Random Forest algorithm has identified virulence markers in a consistent way. The Lorraine strain and related Dutch and French Legionella strains contain a marker that encodes an RtxA protein which is probably involved in the increased prevalence in clinical isolates. The current set of predictive markers is insufficient to justify its use as a reliable test in the public health field in France. Our results suggest that genetic differences in Legionella strains exist between geographically distinct entities. It may be necessary to develop region-specific mixed-genome microarrays that are constantly adapted and updated.
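
The negative predictive value quoted in the Results of this abstract is, by its standard definition, the fraction of negative predictions that are truly negative: TN / (TN + FN). A minimal sketch of that calculation follows; the function name is illustrative, and the counts in the example are hypothetical numbers chosen to give 91%, not data from the study.

```python
def negative_predictive_value(true_negatives, false_negatives):
    """Standard NPV: share of negative predictions that are correct."""
    denominator = true_negatives + false_negatives
    if denominator == 0:
        raise ValueError("no negative predictions were made")
    return true_negatives / denominator


# Hypothetical counts for illustration only (not taken from the study).
example_npv = negative_predictive_value(true_negatives=91, false_negatives=9)
```

Which class is treated as "negative" (for example, environmental origin) depends on how the classifier is set up and is not stated in the abstract.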

Prevalence of hepatitis C virus genotypes in Mashhad, Northeast Iran. Iranian Journal of Public Health 2012;41. [PMID: 23193507 PMCID: PMC3494216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]

Sarwar MT, Kausar H, Ijaz B, Ahmad W, Ansar M, Sumrin A, Ashfaq UA, Asad S, Gull S, Shahid I, Hassan S. NS4A protein as a marker of HCV history suggests that different HCV genotypes originally evolved from genotype 1b. Virol J 2011;8:317. [PMID: 21696641 PMCID: PMC3145594 DOI: 10.1186/1743-422x-8-317] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 05/18/2011] [Accepted: 06/23/2011] [Indexed: 01/06/2023] Open

Abstract

BACKGROUND

The 9.6 kb long RNA genome of Hepatitis C virus (HCV) is under the control of RNA-dependent RNA polymerase, an error-prone enzyme, for its transcription and replication. A high rate of mutation has been found to be associated with RNA viruses like HCV. Based on genetic variability, HCV has been classified into 6 different major genotypes and 11 different subtypes. However, this classification system does not provide significant information about the origin of the virus, primarily due to the high mutation rate at the nucleotide level. The HCV genome codes for a single polyprotein of about 3011 amino acids, which is processed into structural and non-structural proteins inside the host cell by viral and cellular proteases.

RESULTS

We have identified a conserved NS4A protein sequence for HCV genotype 3a reported from four continents: Europe, America, Australia and Asia. We investigated 346 sequences, compared the amino acid composition of the NS4A protein of different HCV genotypes through multiple sequence alignment, and observed amino acid substitutions C22, V29, V30, V38, Q46 and Q47 in the NS4A protein of genotype 1b. Furthermore, we observed C22 and V30 as more consistent members of the NS4A protein of genotype 1a. Similarly, we observed Q46 and Q47 in genotype 5; V29, V30, Q46 and Q47 in genotype 4; C22, Q46 and Q47 in genotype 6; C22, V38, Q46 and Q47 in genotype 3; and C22 in genotype 2 as more consistent members of the NS4A protein of these genotypes. Thus, the amino acids that were introduced as substitutions in the NS4A protein of genotype 1 subtype 1b have been retained as consistent members of the NS4A protein of the other known genotypes.

CONCLUSION

These observations indicate that the NS4A protein of different HCV genotypes originally evolved from the NS4A protein of genotype 1 subtype 1b, which in turn indicates that HCV genotype 1 subtype 1b established itself earlier in the human population and all other known genotypes evolved later as a result of mutations in HCV genotype 1b. These results were further confirmed through phylogenetic analysis by constructing a phylogenetic tree using the NS4A protein as a phylogenetic marker.
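
The per-position comparison described in this abstract (finding which residue is "consistent" at a given position across aligned sequences) can be illustrated with a small sketch. This is only an assumed, simplified reading of such an analysis: the function name and the toy alignment are hypothetical, and the authors' actual alignment tool and criteria are not given in the abstract.

```python
from collections import Counter


def column_consensus(alignment):
    """For each alignment column, return (1-based position, most common residue, its frequency)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    summary = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        residue, n = counts.most_common(1)[0]
        summary.append((i + 1, residue, n / len(alignment)))
    return summary


# Toy, made-up aligned fragments; not real NS4A sequences.
toy_alignment = ["MCVV", "MCVI", "MSVV"]
toy_summary = column_consensus(toy_alignment)
```

In the toy alignment, columns 1 and 3 are fully conserved, while the most common residue in columns 2 and 4 appears in two of the three sequences.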

Chevaliez S, Bouvier-Alias M, Brillet R, Pawlotsky JM. Hepatitis C virus (HCV) genotype 1 subtype identification in new HCV drug development and future clinical practice. PLoS One 2009;4:e8209. [PMID: 19997618 PMCID: PMC2785465 DOI: 10.1371/journal.pone.0008209] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.5] [Received: 10/09/2009] [Accepted: 11/09/2009] [Indexed: 12/12/2022] Open