Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wang X, Zhang J, Li GZ. Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble. BMC Bioinformatics 2015;16 Suppl 12:S1. [PMID: 26329681 PMCID: PMC4705491 DOI: 10.1186/1471-2105-16-s12-s1] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023] Open

For:	Wang X, Zhang J, Li GZ. Multi-location gram-positive and gram-negative bacterial protein subcellular localization using gene ontology and multi-label classifier ensemble. BMC Bioinformatics 2015;16 Suppl 12:S1. [PMID: 26329681 PMCID: PMC4705491 DOI: 10.1186/1471-2105-16-s12-s1] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023] Open

Number

Cited by Other Article(s)

Xiao H, Zou Y, Wang J, Wan S. A Review for Artificial Intelligence Based Protein Subcellular Localization. Biomolecules 2024;14:409. [PMID: 38672426 PMCID: PMC11048326 DOI: 10.3390/biom14040409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/29/2024] [Revised: 03/21/2024] [Accepted: 03/25/2024] [Indexed: 04/28/2024] Open

Wang C, Wang Y, Ding P, Li S, Yu X, Yu B. ML-FGAT: Identification of multi-label protein subcellular localization by interpretable graph attention networks and feature-generative adversarial networks. Comput Biol Med 2024;170:107944. [PMID: 38215617 DOI: 10.1016/j.compbiomed.2024.107944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2023] [Revised: 12/08/2023] [Accepted: 01/01/2024] [Indexed: 01/14/2024]

Ntakiyisumba E, Lee S, Won G. Identification of risk profiles for Salmonella prevalence in pig supply chains in South Korea using meta-analysis and a quantitative microbial risk assessment model. Food Res Int 2023;170:112999. [PMID: 37316069 DOI: 10.1016/j.foodres.2023.112999] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2023] [Revised: 05/13/2023] [Accepted: 05/16/2023] [Indexed: 06/16/2023]

Abstract

International travel and the globalization of food supplies have increased the risk of epidemic foodborne infections. Salmonella strains, particularly non-typhoidal Salmonella (NTS), are major zoonotic pathogens responsible for gastrointestinal diseases worldwide. In this study, the prevalence and Salmonella contamination in pigs/carcasses throughout the South Korean pig supply chain and the associated risk factors were evaluated using Systematic reviews and meta-analyses (SRMA), and quantitative microbial risk assessment (QMRA). The prevalence of Salmonella in finishing pigs, which is one of the major starting inputs of the QMRA model was calculated through SRMA of studies conducted in south Korea in order to complement and enhance the robustness of the model. Our findings revealed that the pooled Salmonella prevalence in pigs was 4.15% with a 95% confidence interval (CI) of 2.56 to 6.66%. Considering the pig supply chain, the highest prevalence was detected in slaughterhouses (6.27% [95% CI: 3.36; 11.37]), followed by farms (4.16% [95% CI: 2.32; 7.35]) and meat stores (1.21% [95% CI: 0.42; 3.46]). The QMRA model predicted a 3.9% likelihood of Salmonella-free carcasses and a 96.1% probability of Salmonella-positive carcasses at the end of slaughter, with an average Salmonella concentration of 6.38 log CFU/carcass (95% CI: 5.17; 7.28). This corresponds to an average contamination of 1.23 log CFU/g (95% CI: 0.37; 2.48) of pork meat. Across the pig supply chain, the highest Salmonella contamination was predicted after transport and lairage, with an average concentration of 8 log CFU/pig (95% CI: 7.15; 8.42). Sensitivity analysis indicated that Salmonella fecal shedding (r = 0.68) and Salmonella prevalence in finishing pigs (r = 0.39) at pre-harvest were the most significant factors associated with Salmonella contamination in pork carcasses. Although disinfection and sanitation interventions along the slaughter line can reduce contamination levels to some extent, effective measures should be taken to reduce Salmonella prevalence at the farm level to improve the safety of pork consumption.

Collapse

Ozger ZB. A robust protein language model for SARS-CoV-2 protein-protein interaction network prediction. Artif Intell Med 2023;142:102574. [PMID: 37316102 DOI: 10.1016/j.artmed.2023.102574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2022] [Revised: 04/17/2023] [Accepted: 04/27/2023] [Indexed: 06/16/2023]

Alves CL, Toutain TGLDO, de Carvalho Aguiar P, Pineda AM, Roster K, Thielemann C, Porto JAM, Rodrigues FA. Diagnosis of autism spectrum disorder based on functional brain networks and machine learning. Sci Rep 2023;13:8072. [PMID: 37202411 DOI: 10.1038/s41598-023-34650-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2022] [Accepted: 05/04/2023] [Indexed: 05/20/2023] Open

Ebler J, Ebert P, Clarke WE, Rausch T, Audano PA, Houwaart T, Mao Y, Korbel JO, Eichler EE, Zody MC, Dilthey AT, Marschall T. Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes. Nat Genet 2022;54:518-525. [PMID: 35410384 PMCID: PMC9005351 DOI: 10.1038/s41588-022-01043-w] [Citation(s) in RCA: 73] [Impact Index Per Article: 36.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2021] [Accepted: 03/03/2022] [Indexed: 12/30/2022]

Jacob Machado D, Portella de Luna Marques F, Jiménez-Ferbans L, Grant T. An empirical test of the relationship between the bootstrap and likelihood ratio support in maximum likelihood phylogenetic analysis. Cladistics 2021;38:392-401. [PMID: 34932221 DOI: 10.1111/cla.12496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/15/2021] [Indexed: 11/27/2022] Open

Abstract

In maximum likelihood (ML), the support for a clade can be calculated directly as the likelihood ratio (LR) or log-likelihood difference (S, LLD) of the best trees with and without the clade of interest. However, bootstrap (BS) clade frequencies are more pervasive in ML phylogenetics and are almost universally interpreted as measuring support. In addition to theoretical arguments against that interpretation, BS has several undesirable attributes for a support measure. For example, it does not vary in proportion to optimality or identify clades that are rejected by the evidence and can be overestimated due to missing data. Nevertheless, if BS is a reliable predictor of S, then it might be an efficient indirect method of measuring support-an attractive possibility, given the speed of many BS implementations. To assess the relationship between S and BS, we analyzed 106 empirical datasets retrieved from TreeBASE. Also, to evaluate the degree to which S and BS are affected by the number of replicates during suboptimal tree searches for S and pseudoreplicates during BS estimation, we randomly selected 5 of the 106 datasets and analyzed them using variable numbers of replicates and pseudoreplicates, respectively. The correlation between S and BS was extremely weak in the datasets we analyzed. Increasing the number of replicates during tree search decreased the estimated values of S for most clades, but the magnitude of change was small. In contrast, although increasing pseudoreplicates affected BS values for only approximately 40% of clades, values both increased and decreased, and they did so at much greater magnitudes. Increasing replicates/pseudoreplicates affected the rank order of clades in each tree for both S and BS. Our findings show decisively that BS is not an efficient indirect method of measuring support and suggest that even quite superficial searches to calculate S provide better estimates of support.

Collapse

Liu Y, Jin S, Gao H, Wang X, Wang C, Zhou W, Yu B. Predicting the multi-label protein subcellular localization through multi-information fusion and MLSI dimensionality reduction based on MLFE classifier. Bioinformatics 2021;38:1223-1230. [PMID: 34864897 PMCID: PMC8690230 DOI: 10.1093/bioinformatics/btab811] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2021] [Revised: 11/17/2021] [Accepted: 11/30/2021] [Indexed: 01/05/2023] Open

Abstract

MOTIVATION

Multi-label (ML) protein subcellular localization (SCL) is an indispensable way to study protein function. It can locate a certain protein (such as the human transmembrane protein that promotes the invasion of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)) or expression product at a specific location in a cell, which can provide a reference for clinical treatment of diseases such as coronavirus disease 2019 (COVID-19).

RESULTS

The article proposes a novel method named ML-locMLFE. First of all, six feature extraction methods are adopted to obtain protein effective information. These methods include pseudo amino acid composition, encoding based on grouped weight, gene ontology, multi-scale continuous and discontinuous, residue probing transformation and evolutionary distance transformation. In the next part, we utilize the ML information latent semantic index method to avoid the interference of redundant information. In the end, ML learning with feature-induced labeling information enrichment is adopted to predict the ML protein SCL. The Gram-positive bacteria dataset is chosen as a training set, while the Gram-negative bacteria dataset, virus dataset, newPlant dataset and SARS-CoV-2 dataset as the test sets. The overall actual accuracy of the first four datasets are 99.23%, 93.82%, 93.24% and 96.72% by the leave-one-out cross validation. It is worth mentioning that the overall actual accuracy prediction result of our predictor on the SARS-CoV-2 dataset is 72.73%. The results indicate that the ML-locMLFE method has obvious advantages in predicting the SCL of ML protein, which provides new ideas for further research on the SCL of ML protein.

AVAILABILITY AND IMPLEMENTATION

The source codes and datasets are publicly available at https://github.com/QUST-AIBBDRC/ML-locMLFE/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse

Three Distinct Proteases Are Responsible for Overall Cell Surface Proteolysis in Streptococcus thermophilus. Appl Environ Microbiol 2021;87:e0129221. [PMID: 34550764 DOI: 10.1128/aem.01292-21] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022] Open

Zhang Q, Zhang Y, Li S, Han Y, Jin S, Gu H, Yu B. Accurate prediction of multi-label protein subcellular localization through multi-view feature learning with RBRL classifier. Brief Bioinform 2021;22:6127451. [PMID: 33537726 DOI: 10.1093/bib/bbab012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2020] [Revised: 12/12/2020] [Accepted: 01/06/2021] [Indexed: 01/27/2023] Open

Li J, Zhang L, He S, Guo F, Zou Q. SubLocEP: a novel ensemble predictor of subcellular localization of eukaryotic mRNA based on machine learning. Brief Bioinform 2021;22:6059770. [PMID: 33388743 DOI: 10.1093/bib/bbaa401] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2020] [Revised: 11/28/2020] [Accepted: 12/08/2020] [Indexed: 01/23/2023] Open

Imai K, Nakai K. Tools for the Recognition of Sorting Signals and the Prediction of Subcellular Localization of Proteins From Their Amino Acid Sequences. Front Genet 2020;11:607812. [PMID: 33324450 PMCID: PMC7723863 DOI: 10.3389/fgene.2020.607812] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2020] [Accepted: 11/03/2020] [Indexed: 12/13/2022] Open

Peabody MA, Lau WYV, Hoad GR, Jia B, Maguire F, Gray KL, Beiko RG, Brinkman FSL. PSORTm: a bacterial and archaeal protein subcellular localization prediction tool for metagenomics data. Bioinformatics 2020;36:3043-3048. [PMID: 32108861 PMCID: PMC7214030 DOI: 10.1093/bioinformatics/btaa136] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2019] [Revised: 01/23/2020] [Accepted: 02/25/2020] [Indexed: 11/14/2022] Open

Abstract

Motivation

Many methods for microbial protein subcellular localization (SCL) prediction exist; however, none is readily available for analysis of metagenomic sequence data, despite growing interest from researchers studying microbial communities in humans, agri-food relevant organisms and in other environments (e.g. for identification of cell-surface biomarkers for rapid protein-based diagnostic tests). We wished to also identify new markers of water quality from freshwater samples collected from pristine versus pollution-impacted watersheds.

Results

We report PSORTm, the first bioinformatics tool designed for prediction of diverse bacterial and archaeal protein SCL from metagenomics data. PSORTm incorporates components of PSORTb, one of the most precise and widely used protein SCL predictors, with an automated classification by cell envelope. An evaluation using 5-fold cross-validation with in silico-fragmented sequences with known localization showed that PSORTm maintains PSORTb’s high precision, while sensitivity increases proportionately with metagenomic sequence fragment length. PSORTm’s read-based analysis was similar to PSORTb-based analysis of metagenome-assembled genomes (MAGs); however, the latter requires non-trivial manual classification of each MAG by cell envelope, and cannot make use of unassembled sequences. Analysis of the watershed samples revealed the importance of normalization and identified potential biomarkers of water quality. This method should be useful for examining a wide range of microbial communities, including human microbiomes, and other microbiomes of medical, environmental or industrial importance.

Availability and implementation

Documentation, source code and docker containers are available for running PSORTm locally at https://www.psort.org/psortm/ (freely available, open-source software under GNU General Public License Version 3).

Supplementary information

Supplementary data are available at Bioinformatics online.

Collapse

Bouziane H, Chouarfia A. Use of Chou's 5-steps rule to predict the subcellular localization of gram-negative and gram-positive bacterial proteins by multi-label learning based on gene ontology annotation and profile alignment. J Integr Bioinform 2020;18:51-79. [PMID: 32598314 PMCID: PMC8035964 DOI: 10.1515/jib-2019-0091] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2019] [Accepted: 04/08/2020] [Indexed: 12/31/2022] Open

Abstract

To date, many proteins generated by large-scale genome sequencing projects are still uncharacterized and subject to intensive investigations by both experimental and computational means. Knowledge of protein subcellular localization (SCL) is of key importance for protein function elucidation. However, it remains a challenging task, especially for multiple sites proteins known to shuttle between cell compartments to perform their proper biological functions and proteins which do not have significant homology to proteins of known subcellular locations. Due to their low-cost and reasonable accuracy, machine learning-based methods have gained much attention in this context with the availability of a plethora of biological databases and annotated proteins for analysis and benchmarking. Various predictive models have been proposed to tackle the SCL problem, using different protein sequence features pertaining to the subcellular localization, however, the overwhelming majority of them focuses on single localization and cover very limited cellular locations. The prediction was basically established on sorting signals, amino acids compositions, and homology. To improve the prediction quality, focus is actually on knowledge information extracted from annotation databases, such as protein-protein interactions and Gene Ontology (GO) functional domains annotation which has been recently a widely adopted and essential information for learning systems. To deal with such problem, in the present study, we considered SCL prediction task as a multi-label learning problem and tried to label both single site and multiple sites unannotated bacterial protein sequences by mining proteins homology relationships using both GO terms of protein homologs and PSI-BLAST profiles. The experiments using 5-fold cross-validation tests on the benchmark datasets showed a significant improvement on the results obtained by the proposed consensus multi-label prediction model which discriminates six compartments for Gram-negative and five compartments for Gram-positive bacterial proteins.

Collapse

Du L, Meng Q, Chen Y, Wu P. Subcellular location prediction of apoptosis proteins using two novel feature extraction methods based on evolutionary information and LDA. BMC Bioinformatics 2020;21:212. [PMID: 32448129 PMCID: PMC7245797 DOI: 10.1186/s12859-020-3539-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/29/2019] [Accepted: 05/06/2020] [Indexed: 11/13/2022] Open

Buchowiecka AK. Modified cysteine S-phosphopeptide standards for mass spectrometry-based proteomics. Amino Acids 2019;51:1365-1375. [DOI: 10.1007/s00726-019-02773-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2018] [Accepted: 08/18/2019] [Indexed: 02/06/2023]

Shao W, Liu M, Xu YY, Shen HB, Zhang D. An Organelle Correlation-Guided Feature Selection Approach for Classifying Multi-Label Subcellular Bio-Images. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018;15:828-838. [PMID: 28278481 DOI: 10.1109/tcbb.2017.2677907] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

Hasan MAM, Ahmad S, Molla MKI. Protein subcellular localization prediction using multiple kernel learning based support vector machine. MOLECULAR BIOSYSTEMS 2017;13:785-795. [DOI: 10.1039/c6mb00860g] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]

Sharma R, Dehzangi A, Lyons J, Paliwal K, Tsunoda T, Sharma A. Predict Gram-Positive and Gram-Negative Subcellular Localization via Incorporating Evolutionary Information and Physicochemical Features Into Chou's General PseAAC. IEEE Trans Nanobioscience 2015;14:915-26. [DOI: 10.1109/tnb.2015.2500186] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]