Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Sergeev RS, Kavaliou IS, Sataneuski UV, Gabrielian A, Rosenthal A, Tartakovsky M, Tuzikov AV. Genome-Wide Analysis of MDR and XDR Tuberculosis from Belarus: Machine-Learning Approach. IEEE/ACM Trans Comput Biol Bioinform 2019;16:1398-1408. [PMID: 28678713 DOI: 10.1109/tcbb.2017.2720669] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/07/2023]

Cited by Other Article(s)

Durge AR, Shrimankar DD. DHFS-ECM: Design of a Dual Heuristic Feature Selection-based Ensemble Classification Model for the Identification of Bamboo Species from Genomic Sequences. Curr Genomics 2024;25:185-201. [PMID: 39087000 PMCID: PMC11288165 DOI: 10.2174/0113892029268176240125055419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 01/16/2024] [Accepted: 01/16/2024] [Indexed: 08/02/2024] Open

Abstract

Background

Analyzing genomic sequences plays a crucial role in understanding biological diversity and classifying Bamboo species. Existing methods for genomic sequence analysis suffer from limitations such as complexity, low accuracy, and the need for constant reconfiguration in response to evolving genomic datasets.

Aim

This study addresses these limitations by introducing a novel Dual Heuristic Feature Selection-based Ensemble Classification Model (DHFS-ECM) for the precise identification of Bamboo species from genomic sequences.

Methods

The proposed DHFS-ECM method employs a Genetic Algorithm to perform dual heuristic feature selection. This process maximizes inter-class variance, leading to the selection of informative N-gram feature sets. Subsequently, intra-class variance levels are used to create optimal training and validation sets, ensuring comprehensive coverage of class-specific features. The selected features are then processed through an ensemble classification layer, combining multiple stratification models for species-specific categorization.

Results

Comparative analysis with state-of-the-art methods demonstrates that DHFS-ECM achieves remarkable improvements in accuracy (9.5%), precision (5.9%), recall (8.5%), and AUC performance (4.5%). Importantly, the model maintains its performance even with an increased number of species classes due to the continuous learning facilitated by the Dual Heuristic Genetic Algorithm Model.

Conclusion

DHFS-ECM offers several key advantages, including efficient feature extraction, reduced model complexity, enhanced interpretability, and increased robustness and accuracy through the ensemble classification layer. These attributes make DHFS-ECM a promising tool for real-time clinical applications and a valuable contribution to the field of genomic sequence analysis.
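The Methods paragraph of this abstract describes selecting N-gram features that maximize inter-class variance. The sketch below illustrates only that scoring idea, under stated assumptions: it ranks N-grams by the variance of their per-class mean frequency and does not implement the authors' Genetic Algorithm or ensemble layer. The function names (ngram_counts, inter_class_variance, top_ngrams) and the toy sequences are hypothetical.

```python
from collections import Counter


def ngram_counts(seq, n):
    """Relative frequency of each overlapping n-gram in one sequence."""
    total = len(seq) - n + 1
    counts = Counter(seq[i:i + n] for i in range(total))
    return {gram: c / total for gram, c in counts.items()}


def inter_class_variance(samples, n):
    """Score each n-gram by the variance of its per-class mean frequency.

    samples is a list of (sequence, class_label) pairs.
    """
    by_class = {}
    for seq, label in samples:
        by_class.setdefault(label, []).append(ngram_counts(seq, n))

    grams = {g for profiles in by_class.values() for p in profiles for g in p}
    scores = {}
    for g in grams:
        # Mean frequency of this n-gram within each class.
        means = [
            sum(p.get(g, 0.0) for p in profiles) / len(profiles)
            for profiles in by_class.values()
        ]
        grand = sum(means) / len(means)
        scores[g] = sum((m - grand) ** 2 for m in means) / len(means)
    return scores


def top_ngrams(samples, n, k):
    """Return the k n-grams with the highest inter-class variance."""
    scores = inter_class_variance(samples, n)
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


# Toy data (illustrative only, not from the cited study).
samples = [
    ("AAAAAA", "species_a"),
    ("AAAAAT", "species_a"),
    ("CCCCCC", "species_b"),
    ("CCCCCG", "species_b"),
]
selected = top_ngrams(samples, n=2, k=2)
```

A search heuristic such as a genetic algorithm would explore subsets of features rather than rank them one at a time; the ranking here is a simpler stand-in that shows what "inter-class variance" measures.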

Perea-Jacobo R, Paredes-Gutiérrez GR, Guerrero-Chevannier MÁ, Flores DL, Muñiz-Salazar R. Machine Learning of the Whole Genome Sequence of Mycobacterium tuberculosis: A Scoping PRISMA-Based Review. Microorganisms 2023;11:1872. [PMID: 37630431 PMCID: PMC10456961 DOI: 10.3390/microorganisms11081872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 07/13/2023] [Accepted: 07/14/2023] [Indexed: 08/27/2023] Open

Durge AR, Shrimankar DD, Sawarkar AD. Heuristic Analysis of Genomic Sequence Processing Models for High Efficiency Prediction: A Statistical Perspective. Curr Genomics 2022;23:299-317. [PMID: 36778194 PMCID: PMC9878859 DOI: 10.2174/1389202923666220927105311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/25/2022] [Revised: 08/29/2022] [Accepted: 09/01/2022] [Indexed: 11/22/2022] Open

Cui ZJ, Zhang WT, Zhu Q, Zhang QY, Zhang HY. Using a Heat Diffusion Model to Detect Potential Drug Resistance Genes of Mycobacterium tuberculosis. Protein Pept Lett 2020;27:711-717. [PMID: 32167422 DOI: 10.2174/0929866527666200313113157] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/15/2019] [Revised: 12/01/2019] [Accepted: 12/21/2019] [Indexed: 01/01/2023]

Abstract

BACKGROUND

Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb), is one of the oldest known and most dangerous diseases. Although the spread of TB was controlled in the early 20th century using antibiotics and vaccines, TB has again become a threat because of increased drug resistance. There is still a lack of effective treatment regimens for a person who is already infected with multidrug-resistant Mtb (MDR-Mtb) or extensively drug-resistant Mtb (XDR-Mtb). In the past decades, many research groups have explored the drug resistance profiles of Mtb based on sequence data by GWAS, which identified some mutations that were significantly linked with drug resistance, and attempted to explain the resistance mechanisms. However, they mainly focused on several significant mutations in drug targets (e.g. rpoB, katG). Some genes that are potentially associated with drug resistance may be overlooked by the GWAS analysis.

OBJECTIVE

In this article, our motivation is to detect potential drug resistance genes of Mtb using a heat diffusion model.

METHODS

The sequencing data comprised 127 Mtb samples, including 34 ethambutol-, 65 isoniazid-, 53 rifampicin- and 45 streptomycin-resistant strains. The raw sequence data were preprocessed using Trimmomatic software and aligned to the Mtb H37Rv reference genome using Bowtie2. From the resulting alignments, SAMtools and VarScan were used to filter sequences and call SNPs. The GWAS was performed with the PLINK package to obtain the significant SNPs, which were mapped to genes. The P-values of genes calculated by GWAS were converted into a heat vector. The heat vector and the Mtb protein-protein interactions (PPI) derived from the STRING database were input into the heat diffusion model to obtain significant subnetworks by HotNet2. Finally, the most significant (P < 0.05) subnetworks associated with different phenotypes were obtained. To verify the change of binding energy between the drug and target before and after mutation, molecular dynamics simulation was performed using the AMBER software.

RESULTS

We identified significant subnetworks in rifampicin-resistant samples. Excitingly, we found rpoB and rpoC, which are drug targets of rifampicin. From the protein structure of rpoB, the mutation location was extremely close to the drug binding site, with a distance of only 3.97 Å. Molecular dynamics simulation revealed that the binding energy of rpoB and rifampicin decreased after D435V mutation. To a large extent, this mutation can influence the affinity of drug-target binding. In addition, topA and pyrG were reported to be linked with drug resistance, and might be new TB drug targets. Other genes that have not yet been reported are worth further study.

CONCLUSION

Using a heat diffusion model in combination with GWAS results and protein-protein interactions, the significantly mutated subnetworks in rifampicin-resistant samples were found. The subnetwork not only contained the known targets of rifampicin (rpoB, rpoC), but also included topA and pyrG, which are potentially associated with drug resistance. Together, these results offer deeper insights into drug resistance of Mtb, and provide potential drug targets for finding new antituberculosis drugs.
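The Methods of this abstract describe converting gene-level P-values into a heat vector and diffusing that heat over a protein-protein interaction network. The sketch below illustrates the general idea only and is not the HotNet2 implementation or the authors' configuration. It makes three assumptions: heat is taken as -log10(P), the diffusion is an insulated, random-walk-with-restart style process, and the retention parameter beta = 0.4 is arbitrary. The function names and the three-gene toy network are hypothetical.

```python
import math


def heat_from_pvalues(pvalues):
    """Map gene -> P-value to gene -> heat, assuming heat = -log10(P)."""
    return {gene: -math.log10(p) for gene, p in pvalues.items()}


def diffuse(edges, heat, beta=0.4, tol=1e-10, max_iter=1000):
    """Insulated heat diffusion solved by fixed-point iteration.

    Iterates f = beta * h + (1 - beta) * W f, where W[v][u] = 1 / deg(u)
    for every undirected edge (u, v). beta is the fraction of heat a node
    retains at each step (an assumed value, not taken from the abstract).
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    nodes = set(neighbors) | set(heat)
    h = {v: heat.get(v, 0.0) for v in nodes}
    f = dict(h)
    for _ in range(max_iter):
        new = {}
        for v in nodes:
            # Each neighbor u shares its heat equally among its own neighbors.
            inflow = sum(f[u] / len(neighbors[u]) for u in neighbors.get(v, ()))
            new[v] = beta * h[v] + (1 - beta) * inflow
        delta = max(abs(new[v] - f[v]) for v in nodes)
        f = new
        if delta < tol:
            break
    return f


# Toy path network geneA - geneB - geneC with heat placed on geneA only.
toy_edges = [("geneA", "geneB"), ("geneB", "geneC")]
diffused = diffuse(toy_edges, {"geneA": 1.0})
```

In this toy case the heat placed on geneA spreads to its neighbors while the total stays constant, which is why genes with unremarkable P-values can still surface when they sit next to strongly associated genes in the network.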

Gabrielian A, Engle E, Harris M, Wollenberg K, Juarez-Espinosa O, Glogowski A, Long A, Patti L, Hurt DE, Rosenthal A, Tartakovsky M. TB DEPOT (Data Exploration Portal): A multi-domain tuberculosis data analysis resource. PLoS One 2019;14:e0217410. [PMID: 31120982 PMCID: PMC6532897 DOI: 10.1371/journal.pone.0217410] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/28/2018] [Accepted: 05/10/2019] [Indexed: 02/06/2023] Open

Affiliation(s)

Andrei Gabrielian Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Eric Engle Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Michael Harris Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Kurt Wollenberg Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Octavio Juarez-Espinosa Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Alexander Glogowski Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Alyssa Long Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Lisa Patti Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Darrell E. Hurt Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Alex Rosenthal Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America
Mike Tartakovsky Office of Cyber Infrastructure & Computational Biology, National Institute of Allergy and Infectious Disease, National Institutes of Health, Bethesda, MD, United States of America

Disease Diagnosis in Smart Healthcare: Innovation, Technologies and Applications. SUSTAINABILITY 2017. [DOI: 10.3390/su9122309] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Indexed: 11/16/2022]