Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Höglund A, Dönnes P, Blum T, Adolph HW, Kohlbacher O. MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition. Bioinformatics 2006;22:1158-65. [PMID: 16428265 DOI: 10.1093/bioinformatics/btl002] [Citation(s) in RCA: 213] [Impact Index Per Article: 11.8] [Indexed: 11/13/2022]
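
The "Impact Index Per Article" values in this listing appear to equal the citation count divided by the number of years since publication, rounded half-up to one decimal (for example, 213 citations over 2006 to 2024 gives 11.8). The sketch below reproduces that arithmetic. Both the formula and the 2024 reference year are assumptions inferred from the numbers shown here, not a documented definition, and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def impact_index(citations: int, pub_year: int, ref_year: int = 2024) -> Decimal:
    """Citations per year since publication, rounded half-up to one decimal.

    Assumption: the formula and the default reference year are inferred
    from the listing itself; the source does not state them.
    """
    years = ref_year - pub_year
    if years <= 0:
        # Entries published in the reference year are not covered by this sketch.
        raise ValueError("ref_year must be later than pub_year")
    return (Decimal(citations) / Decimal(years)).quantize(
        Decimal("0.1"), rounding=ROUND_HALF_UP
    )


print(impact_index(213, 2006))  # the MultiLoc entry
print(impact_index(97, 2021))   # Ofer et al. 2021
```

Decimal arithmetic is used because the listed values look rounded half-up (5 citations over 4 years is shown as 1.3), which binary floating-point rounding does not reliably reproduce.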

Cited by Other Article(s)

Veszelyi K, Czegle I, Varga V, Németh CE, Besztercei B, Margittai É. Subcellular Localization of Thioredoxin/Thioredoxin Reductase System-A Missing Link in Endoplasmic Reticulum Redox Balance. Int J Mol Sci 2024;25:6647. [PMID: 38928353 PMCID: PMC11204020 DOI: 10.3390/ijms25126647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 06/12/2024] [Accepted: 06/14/2024] [Indexed: 06/28/2024]

Li X, Qian Y, Hu Y, Chen J, Yue H, Deng L. MSF-PFP: A Novel Multisource Feature Fusion Model for Protein Function Prediction. J Chem Inf Model 2024;64:1502-1511. [PMID: 38413369 DOI: 10.1021/acs.jcim.3c01794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]

Nielsen H. Protein Sorting Prediction. Methods Mol Biol 2024;2715:27-63. [PMID: 37930519 DOI: 10.1007/978-1-0716-3445-5_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2023]

Kushwah AS, Dixit H, Upadhyay V, Yadav S, Verma SK, Prasad R. Elucidating the zinc-binding proteome of Fusarium oxysporum f. sp. lycopersici with particular emphasis on zinc-binding effector proteins. Arch Microbiol 2023;205:298. [PMID: 37516670 DOI: 10.1007/s00203-023-03638-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 06/29/2023] [Accepted: 07/14/2023] [Indexed: 07/31/2023]

Yang B, Zhang L, Xiang S, Chen H, Qu C, Lu K, Li J. Identification of Trehalose-6-Phosphate Synthase (TPS) Genes Associated with Both Source-/Sink-Related Yield Traits and Drought Response in Rapeseed (Brassica napus L.). Plants (Basel) 2023;12:981. [PMID: 36903842 PMCID: PMC10005558 DOI: 10.3390/plants12050981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 02/15/2023] [Accepted: 02/16/2023] [Indexed: 06/18/2023]

Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. Cell Rep Methods 2023;3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 05/22/2023]

Affiliation(s)

Zhongxiao Li Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Elva Gao The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Juexiao Zhou Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Wenkai Han Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Xiaopeng Xu Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Xin Gao Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia

Mitra N, Dey S. Understanding the catalytic abilities of class IV sirtuin OsSRT1 and its linkage to the DNA repair system under stress conditions. Plant Sci 2022;323:111398. [PMID: 35917976 DOI: 10.1016/j.plantsci.2022.111398] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/04/2022] [Revised: 07/04/2022] [Accepted: 07/24/2022] [Indexed: 06/15/2023]

Kha QH, Ho QT, Le NQK. Identifying SNARE Proteins Using an Alignment-Free Method Based on Multiscan Convolutional Neural Network and PSSM Profiles. J Chem Inf Model 2022;62:4820-4826. [PMID: 36166351 PMCID: PMC9554904 DOI: 10.1021/acs.jcim.2c01034] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Indexed: 12/03/2022]

Tu Y, Lei H, Shen HB, Yang Y. SIFLoc: a self-supervised pre-training method for enhancing the recognition of protein subcellular localization in immunofluorescence microscopic images. Brief Bioinform 2022;23:6527276. [DOI: 10.1093/bib/bbab605] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/19/2021] [Revised: 12/15/2021] [Accepted: 12/27/2021] [Indexed: 12/19/2022]

Yadav NS, Kumar P, Singh I. Structural and functional analysis of protein. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00026-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

Ofer D, Brandes N, Linial M. The language of proteins: NLP, machine learning & protein sequences. Comput Struct Biotechnol J 2021;19:1750-1758. [PMID: 33897979 PMCID: PMC8050421 DOI: 10.1016/j.csbj.2021.03.022] [Citation(s) in RCA: 97] [Impact Index Per Article: 32.3] [Received: 01/28/2021] [Revised: 03/19/2021] [Accepted: 03/19/2021] [Indexed: 12/12/2022]

Pyrih J, Žárský V, Fellows JD, Grosche C, Wloga D, Striepen B, Maier UG, Tachezy J. The iron-sulfur scaffold protein HCF101 unveils the complexity of organellar evolution in SAR, Haptista and Cryptista. BMC Ecol Evol 2021;21:46. [PMID: 33740894 PMCID: PMC7980591 DOI: 10.1186/s12862-021-01777-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/01/2020] [Accepted: 03/08/2021] [Indexed: 11/22/2022]

Abstract

Background

Nbp35-like proteins (Nbp35, Cfd1, HCF101, Ind1, and AbpC) are P-loop NTPases that serve as components of iron-sulfur cluster (FeS) assembly machineries. In eukaryotes, Ind1 is present in mitochondria, and its function is associated with the assembly of FeS clusters in subunits of respiratory Complex I, Nbp35 and Cfd1 are the components of the cytosolic FeS assembly (CIA) pathway, and HCF101 is involved in FeS assembly of photosystem I in plastids of plants (chHCF101). The AbpC protein operates in Bacteria and Archaea. To date, the cellular distribution of these proteins is considered to be highly conserved with only a few exceptions.

Results

We searched for the genes of all members of the Nbp35-like protein family and analyzed their targeting sequences. Nbp35 and Cfd1 were predicted to reside in the cytoplasm, with some exceptions of Nbp35 localization to the mitochondria; Ind1 was found in the mitochondria, and HCF101 was predicted to reside in plastids (chHCF101) of all photosynthetically active eukaryotes. Surprisingly, we found a second HCF101 paralog in all members of Cryptista, Haptista, and SAR that was predicted to predominantly target mitochondria (mHCF101), whereas Ind1 appeared to be absent in these organisms. We also identified a few exceptions, as apicomplexans possess mHCF101 predicted to localize in the cytosol and Nbp35 in the mitochondria. Our predictions were experimentally confirmed in selected representatives of Apicomplexa (Toxoplasma gondii), Stramenopila (Phaeodactylum tricornutum, Thalassiosira pseudonana), and Ciliophora (Tetrahymena thermophila) by tagging proteins with a transgenic reporter. Phylogenetic analysis suggested that chHCF101 and mHCF101 evolved from a common ancestral HCF101 independently of the Nbp35/Cfd1 and Ind1 proteins. Interestingly, phylogenetic analysis supports a lateral gene transfer of ancestral HCF101 from bacteria rather than its acquisition being associated with either α-proteobacterial or cyanobacterial endosymbionts.

Conclusion

Our searches for Nbp35-like proteins across eukaryotic lineages revealed that SAR, Haptista, and Cryptista possess mitochondrial HCF101. Because only the plastid localization of HCF101 was known thus far, the discovery of its mitochondrial paralog explains the confusion regarding the presence of HCF101 in organisms that possibly lost secondary plastids (e.g., ciliates, Cryptosporidium) or possess reduced nonphotosynthetic plastids (apicomplexans).

Supplementary Information

The online version contains supplementary material available at 10.1186/s12862-021-01777-x.

Imai K, Nakai K. Tools for the Recognition of Sorting Signals and the Prediction of Subcellular Localization of Proteins From Their Amino Acid Sequences. Front Genet 2020;11:607812. [PMID: 33324450 PMCID: PMC7723863 DOI: 10.3389/fgene.2020.607812] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/18/2020] [Accepted: 11/03/2020] [Indexed: 12/13/2022]

Semwal R, Varadwaj PK. HumDLoc: Human Protein Subcellular Localization Prediction Using Deep Neural Network. Curr Genomics 2020;21:546-557. [PMID: 33214771 PMCID: PMC7604748 DOI: 10.2174/1389202921999200528160534] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/10/2020] [Revised: 03/27/2020] [Accepted: 03/30/2020] [Indexed: 11/24/2022]

Abstract

Aims

To develop a tool that can annotate subcellular localization of human proteins.

Background

With the progression of high throughput human proteomics projects, an enormous amount of protein sequence data has been discovered in the recent past. All these raw sequence data require precise mapping and annotation for their respective biological role and functional attributes. The functional characteristics of protein molecules are highly dependent on the subcellular localization/compartment. Therefore, a fully automated and reliable protein subcellular localization prediction system would be very useful for current proteomic research.

Objective

To develop a machine learning-based predictive model that can annotate the subcellular localization of human proteins with high accuracy and precision.

Methods

In this study, we used the PSI-CD-HIT homology criterion and utilized the sequence-based features of protein sequences to develop a powerful subcellular localization predictive model. The dataset used to train the HumDLoc model was extracted from a reliable data source, Uniprot knowledge base, which helps the model to generalize on the unseen dataset.

Results

The proposed model, HumDLoc, was compared with two of the most widely used techniques, CELLO and DeepLoc, and with other machine learning-based tools. The results demonstrated promising predictive performance of the HumDLoc model based on various machine learning parameters such as accuracy (≥97.00%), precision (≥0.86), recall (≥0.89), MCC score (≥0.86), ROC curve (0.98 square unit), and precision-recall curve (0.93 square unit).

Conclusion

In conclusion, HumDLoc was able to outperform several alternative tools for correctly predicting subcellular localization of human proteins. The HumDLoc has been hosted as a web-based tool at https://bioserver.iiita.ac.in/HumDLoc/.
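
The evaluation metrics named in the Results above (accuracy, precision, recall, MCC) can all be derived from a confusion matrix. The sketch below shows the standard binary definitions. It is only an illustration: the abstract does not say how the per-class scores were computed or averaged, so the one-vs-rest framing and the function name `binary_metrics` are assumptions.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts.

    Assumption: a single one-vs-rest class; the abstract does not describe
    how HumDLoc aggregated scores across localization classes.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "mcc": mcc,
    }


# Illustrative counts only, not data from the paper.
print(binary_metrics(tp=90, fp=10, tn=85, fn=15))
```

MCC is often reported alongside accuracy because it stays informative when classes are imbalanced, which is common for subcellular localization datasets.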

Plasma Proteome Profiling of Coronary Artery Disease Patients: Downregulation of Transthyretin-An Important Event. Mediators Inflamm 2020;2020:3429541. [PMID: 33299376 PMCID: PMC7707994 DOI: 10.1155/2020/3429541] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/02/2020] [Accepted: 10/24/2020] [Indexed: 02/07/2023]

Abstract

Coronary artery disease (CAD) is a prevalent chronic inflammatory cardiac disorder. An early diagnosis is likely to help in the prevention and proper management of this disease. As the study of proteomics provides potential markers for detection of a disease, in the present investigation, an attempt was made to identify disease-associated differential proteins involved in CAD pathogenesis. For this study, a total of 200 selected CAD patients were considered, who were recruited for percutaneous coronary intervention (PCI) treatment. The proteomic analysis was performed using two-dimensional gel electrophoresis (2-DE) and MALDI-TOF MS/MS. Samples were also subjected to Western blot analysis, enzyme-linked immunosorbent assay (ELISA), peripheral blood mononuclear cells isolation immunofluorescence (IF) analysis, analytical screening by fluorescence-activated cell sorting (FACS), and in silico analysis. The representative data were shown as mean ± SD of at least three experiments. A total of 19 proteins were identified. Among them, the five most abundant proteins (serotransferrin, talin-1, alpha-2HS glycoprotein, transthyretin (TTR), fibrinogen-α chain) were found to have altered levels in CAD. Serotransferrin, talin-1, alpha-2HS glycoprotein, and transthyretin (TTR) were found to have lower levels, whereas fibrinogen-α chain was found to have a higher level in CAD plasma compared to healthy plasma, as confirmed by Western blot analysis. TTR, an important acute phase transport protein, was validated at a low level in 200 CAD patients confirmed to undergo PCI treatment. Further, in silico and in vitro studies of TTR indicated a downexpression of CAD in plasma as compared to the plasma of healthy individuals. A lower level of plasma TTR was determined to be an important risk marker in the atherosclerotic-approved CAD patients. We suggest that the lower TTR level predicts disease severity and hence may serve as an important marker tool for CAD screening. However, further large-scale studies are required to determine the clinical significance of TTR.

Rotenberg D, Baumann AA, Ben-Mahmoud S, Christiaens O, Dermauw W, Ioannidis P, Jacobs CGC, Vargas Jentzsch IM, Oliver JE, Poelchau MF, Rajarapu SP, Schneweis DJ, Snoeck S, Taning CNT, Wei D, Widana Gamage SMK, Hughes DST, Murali SC, Bailey ST, Bejerman NE, Holmes CJ, Jennings EC, Rosendale AJ, Rosselot A, Hervey K, Schneweis BA, Cheng S, Childers C, Simão FA, Dietzgen RG, Chao H, Dinh H, Doddapaneni HV, Dugan S, Han Y, Lee SL, Muzny DM, Qu J, Worley KC, Benoit JB, Friedrich M, Jones JW, Panfilio KA, Park Y, Robertson HM, Smagghe G, Ullman DE, van der Zee M, Van Leeuwen T, Veenstra JA, Waterhouse RM, Weirauch MT, Werren JH, Whitfield AE, Zdobnov EM, Gibbs RA, Richards S. Genome-enabled insights into the biology of thrips as crop pests. BMC Biol 2020;18:142. [PMID: 33070780 PMCID: PMC7570057 DOI: 10.1186/s12915-020-00862-9] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 02/18/2020] [Accepted: 09/02/2020] [Indexed: 12/22/2022]

Abstract

BACKGROUND

The western flower thrips, Frankliniella occidentalis (Pergande), is a globally invasive pest and plant virus vector on a wide array of food, fiber, and ornamental crops. The underlying genetic mechanisms of the processes governing thrips pest and vector biology, feeding behaviors, ecology, and insecticide resistance are largely unknown. To address this gap, we present the F. occidentalis draft genome assembly and official gene set.

RESULTS

We report on the first genome sequence for any member of the insect order Thysanoptera. Benchmarking Universal Single-Copy Ortholog (BUSCO) assessments of the genome assembly (size = 415.8 Mb, scaffold N50 = 948.9 kb) revealed a relatively complete and well-annotated assembly in comparison to other insect genomes. The genome is unusually GC-rich (50%) compared to other insect genomes to date. The official gene set (OGS v1.0) contains 16,859 genes, of which ~ 10% were manually verified and corrected by our consortium. We focused on manual annotation, phylogenetic, and expression evidence analyses for gene sets centered on primary themes in the life histories and activities of plant-colonizing insects. Highlights include the following: (1) divergent clades and large expansions in genes associated with environmental sensing (chemosensory receptors) and detoxification (CYP4, CYP6, and CCE enzymes) of substances encountered in agricultural environments; (2) a comprehensive set of salivary gland genes supported by enriched expression; (3) apparent absence of members of the IMD innate immune defense pathway; and (4) developmental- and sex-specific expression analyses of genes associated with progression from larvae to adulthood through neometaboly, a distinct form of maturation differing from either incomplete or complete metamorphosis in the Insecta.

CONCLUSIONS

Analysis of the F. occidentalis genome offers insights into the polyphagous behavior of this insect pest that finds, colonizes, and survives on a widely diverse array of plants. The genomic resources presented here enable a more complete analysis of insect evolution and biology, providing a missing taxon for contemporary insect genomics-based analyses. Our study also offers a genomic benchmark for molecular and evolutionary investigations of other Thysanoptera species.
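
The scaffold N50 quoted in the Results above is a standard assembly contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation follows. The function name and the toy lengths are illustrative and are not taken from the thrips assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are sorted from longest to shortest and accumulated until the
    running total reaches half of the summed length; the length of the
    scaffold that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example: total length 54, so the threshold is 27.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A larger N50 means more of the assembly sits in long scaffolds, but it says nothing about correctness, which is why the abstract pairs it with BUSCO completeness assessments.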

Affiliation(s)

Dorith Rotenberg Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, 27695, USA.
Aaron A Baumann Virology Section, College of Veterinary Medicine, University of Tennessee, A239 VTH, 2407 River Drive, Knoxville, TN, 37996, USA
Sulley Ben-Mahmoud Department of Entomology and Nematology, University of California Davis, Davis, CA, 95616, USA
Olivier Christiaens Laboratory of Agrozoology, Department of Plants and Crops, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
Wannes Dermauw Laboratory of Agrozoology, Department of Plants and Crops, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
Panagiotis Ioannidis Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Vassilika Vouton, 70013, Heraklion, Greece Department of Genetic Medicine and Development, University of Geneva Medical School, and Swiss Institute of Bioinformatics, Geneva, Switzerland
Chris G C Jacobs Institute of Biology, Leiden University, 2333 BE, Leiden, The Netherlands
Iris M Vargas Jentzsch Institute for Zoology: Developmental Biology, University of Cologne, 50674, Cologne, Germany
Jonathan E Oliver Department of Plant Pathology, University of Georgia - Tifton Campus, Tifton, GA, 31793-5737, USA
Monica F Poelchau National Agricultural Library, USDA-ARS, Beltsville, MD, 20705, USA
Swapna Priya Rajarapu Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, 27695, USA
Derek J Schneweis Department of Plant Pathology, Kansas State University, Manhattan, KS, 66506, USA
Simon Snoeck Laboratory of Agrozoology, Department of Plants and Crops, Ghent University, Coupure Links 653, 9000, Ghent, Belgium Department of Biology, University of Washington, Seattle, WA, 98105, USA
Clauvis N T Taning Laboratory of Agrozoology, Department of Plants and Crops, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
Dong Wei Laboratory of Agrozoology, Department of Plants and Crops, Ghent University, Coupure Links 653, 9000, Ghent, Belgium Chongqing Key Laboratory of Entomology and Pest Control Engineering, College of Plant Protection, Southwest University, Chongqing, China International Joint Laboratory of China-Belgium on Sustainable Crop Pest Control, Academy of Agricultural Sciences, Southwest University, Chongqing, China and Ghent University, Ghent, Belgium
Shirani M K Widana Gamage Department of Botany, University of Ruhuna, Matara, Sri Lanka
Daniel S T Hughes Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Shwetha C Murali Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Samuel T Bailey Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, 45221, USA
Nicolas E Bejerman IPAVE-CIAP-INTA, 5020, Cordoba, Argentina
Christopher J Holmes Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, 45221, USA
Emily C Jennings Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, 45221, USA
Andrew J Rosendale Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, 45221, USA Department of Biology, Mount St. Joseph University, Cincinnati, OH, 45233, USA
Andrew Rosselot Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, 45221, USA
Kaylee Hervey Department of Plant Pathology, Kansas State University, Manhattan, KS, 66506, USA
Brandi A Schneweis Department of Plant Pathology, Kansas State University, Manhattan, KS, 66506, USA
Sammy Cheng Department of Biology, University of Rochester, Rochester, NY, 14627, USA
Christopher Childers National Agricultural Library, USDA-ARS, Beltsville, MD, 20705, USA
Felipe A Simão Department of Genetic Medicine and Development, University of Geneva Medical School, and Swiss Institute of Bioinformatics, Geneva, Switzerland
Ralf G Dietzgen Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, 4072, Australia
Hsu Chao Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Huyen Dinh Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Harsha Vardhan Doddapaneni Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Shannon Dugan Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Yi Han Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Sandra L Lee Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Donna M Muzny Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Jiaxin Qu Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Kim C Worley Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Joshua B Benoit Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, 45221, USA
Markus Friedrich Department of Biological Sciences, Wayne State University, Detroit, MI, 48202, USA
Jeffery W Jones Department of Biological Sciences, Wayne State University, Detroit, MI, 48202, USA
Kristen A Panfilio Institute for Zoology: Developmental Biology, University of Cologne, 50674, Cologne, Germany School of Life Sciences, University of Warwick, Gibbet Hill Campus, Coventry, CV4 7AL, UK
Yoonseong Park Department of Entomology, Kansas State University, Manhattan, KS, 66506, USA
Hugh M Robertson Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
Guy Smagghe Laboratory of Agrozoology, Department of Plants and Crops, Ghent University, Coupure Links 653, 9000, Ghent, Belgium Chongqing Key Laboratory of Entomology and Pest Control Engineering, College of Plant Protection, Southwest University, Chongqing, China International Joint Laboratory of China-Belgium on Sustainable Crop Pest Control, Academy of Agricultural Sciences, Southwest University, Chongqing, China and Ghent University, Ghent, Belgium
Diane E Ullman Department of Entomology and Nematology, University of California Davis, Davis, CA, 95616, USA
Maurijn van der Zee Institute of Biology, Leiden University, 2333 BE, Leiden, The Netherlands
Thomas Van Leeuwen Laboratory of Agrozoology, Department of Plants and Crops, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
Jan A Veenstra INCIA UMR 5287 CNRS, University of Bordeaux, Pessac, France
Robert M Waterhouse Department of Ecology and Evolution, Swiss Institute of Bioinformatics, University of Lausanne, 1015, Lausanne, Switzerland
Matthew T Weirauch Center for Autoimmune Genomics and Etiology, Divisions of Biomedical Informatics and Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, 45229, USA Department of Pediatrics, University of Cincinnati, College of Medicine, Cincinnati, OH, 45229, USA
John H Werren Department of Biology, University of Rochester, Rochester, NY, 14627, USA
Anna E Whitfield Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, 27695, USA
Evgeny M Zdobnov Department of Genetic Medicine and Development, University of Geneva Medical School, and Swiss Institute of Bioinformatics, Geneva, Switzerland
Richard A Gibbs Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
Stephen Richards Human Genome Sequencing Center, Department of Human and Molecular Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA

Xu YY, Zhou H, Murphy RF, Shen HB. Consistency and variation of protein subcellular location annotations. Proteins 2020;89:242-250. [PMID: 32935893 DOI: 10.1002/prot.26010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/16/2020] [Revised: 07/09/2020] [Accepted: 09/13/2020] [Indexed: 11/09/2022]

Deveshwar P, Sharma S, Prusty A, Sinha N, Zargar SM, Karwal D, Parashar V, Singh S, Tyagi AK. Analysis of rice nuclear-localized seed-expressed proteins and their database (RSNP-DB). Sci Rep 2020;10:15116. [PMID: 32934280 PMCID: PMC7492263 DOI: 10.1038/s41598-020-70713-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/27/2020] [Accepted: 08/03/2020] [Indexed: 01/16/2023]

Genome-Wide Identification and Expression Profiling of Monosaccharide Transporter Genes Associated with High Harvest Index Values in Rapeseed (Brassica napus L.). Genes (Basel) 2020;11:genes11060653. [PMID: 32549312 PMCID: PMC7349323 DOI: 10.3390/genes11060653] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/15/2020] [Revised: 06/10/2020] [Accepted: 06/12/2020] [Indexed: 01/15/2023]

Sahu SS, Loaiza CD, Kaundal R. Plant-mSubP: a computational framework for the prediction of single- and multi-target protein subcellular localization using integrated machine-learning approaches. AoB Plants 2020;12:plz068. [PMID: 32528639 PMCID: PMC7274489 DOI: 10.1093/aobpla/plz068] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 04/28/2019] [Accepted: 10/11/2019] [Indexed: 05/18/2023]

Abstract

The subcellular localization of proteins is very important for characterizing their function in a cell. Accurate computational prediction of subcellular locations has been an active area of interest. Most work has focused on single-localization prediction. Only a few studies have addressed multi-target localization, and they have not achieved good accuracy so far; in plant sciences, very limited work has been done. Here we report the development of a novel tool, Plant-mSubP, which is based on integrated machine learning approaches to efficiently predict subcellular localizations in plant proteomes. The proposed approach predicts with high accuracy 11 single localizations and three dual locations of the plant cell. Several features based on the composition and physicochemical properties of a protein, such as amino acid composition, pseudo amino acid composition, auto-correlation descriptors, quasi-sequence-order descriptors and hybrid features, are used to represent the protein. The performance of the proposed method has been assessed on a training set as well as an independent test set. Using the hybrid feature of the pseudo amino acid composition, N-Center-C terminal amino acid composition and the dipeptide composition (PseAAC-NCC-DIPEP), overall accuracies of 81.97 %, 84.75 % and 87.88 % are achieved on the training data sets containing single-label, combined single- and dual-label, and dual-label proteins, respectively. When tested on the independent data, accuracies of 64.36 %, 64.84 % and 81.08 % are achieved on the single-label, combined single- and dual-label, and dual-label proteins, respectively. The prediction models have been implemented on a web server available at http://bioinfo.usu.edu/Plant-mSubP/. The results indicate that the proposed approach is comparable to existing methods in single-localization prediction and outperforms all other existing tools when compared on dual-label proteins. The prediction tool will be a useful resource for better annotation of various plant proteomes.
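
The composition features named in this abstract can be illustrated with a minimal Python sketch. It is not the Plant-mSubP implementation: the function names and the 25-residue terminal window (`terminal_len`) are assumptions made for illustration, and the pseudo amino acid composition (PseAAC) part of the hybrid feature is left out because it requires physicochemical property tables that the abstract does not specify.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aa_composition(seq):
    """Fraction of each standard residue in seq (20 values)."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)  # non-standard symbols such as X are ignored
    return [c / total if total else 0.0 for c in counts]


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair in seq (400 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def ncc_composition(seq, terminal_len=25):
    """Amino acid composition of the N-terminal, center and C-terminal parts.

    terminal_len is a hypothetical window size (must be positive); for
    sequences shorter than 2 * terminal_len the terminal windows overlap
    and the center is empty, which yields zeros for the center part.
    """
    n_part = seq[:terminal_len]
    center = seq[terminal_len:-terminal_len]
    c_part = seq[-terminal_len:]
    return aa_composition(n_part) + aa_composition(center) + aa_composition(c_part)


def ncc_dipep_features(seq, terminal_len=25):
    """Concatenate NCC (60 values) and dipeptide (400 values) features."""
    return ncc_composition(seq, terminal_len) + dipeptide_composition(seq)
```

A vector built this way has 460 entries per protein and could be passed to any standard classifier; the choice of classifier is outside the scope of this sketch.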

Barman RK, Mukhopadhyay A, Maulik U, Das S. Identification of infectious disease-associated host genes using machine learning techniques. BMC Bioinformatics 2019;20:736. [PMID: 31881961 PMCID: PMC6935192 DOI: 10.1186/s12859-019-3317-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2019] [Accepted: 12/16/2019] [Indexed: 02/06/2023] Open

Abstract

Background

With the global spread of multidrug resistance in pathogenic microbes, infectious diseases have emerged as a key public health concern in recent times. Identification of host genes associated with infectious diseases will improve our understanding of the mechanisms behind their development and help to identify novel therapeutic targets.

Results

We developed a machine learning-based classification approach to identify infectious disease-associated host genes by integrating sequence and protein interaction network features. Among the different methods, a Deep Neural Network (DNN) model with 16 selected features for pseudo-amino acid composition (PAAC) and network properties achieved the highest accuracy of 86.33% with a sensitivity of 85.61% and a specificity of 86.57%. The DNN classifier also attained an accuracy of 83.33% on a blind dataset and a sensitivity of 83.1% on an independent dataset. Furthermore, to predict unknown infectious disease-associated host genes, we applied the proposed DNN model to all reviewed proteins from the database. Seventy-six out of 100 highly-predicted infectious disease-associated genes from our study were also found in experimentally-verified human-pathogen protein-protein interactions (PPIs). Finally, we validated the highly-predicted infectious disease-associated genes by disease and gene ontology enrichment analysis and found that many of them are shared by one or more other diseases, such as cancer, metabolic and immune-related diseases.

Conclusions

To the best of our knowledge, this is the first computational method to identify infectious disease-associated host genes. The proposed method will help large-scale prediction of host genes associated with infectious diseases. However, our results indicated that for small datasets, an advanced DNN-based method does not offer a significant advantage over simpler supervised machine learning techniques, such as Support Vector Machine (SVM) or Random Forest (RF), for the prediction of infectious disease-associated host genes. Significant overlap of infectious disease with cancer and metabolic disease in disease and gene ontology enrichment analysis suggests that these diseases perturb the functions of the same cellular signaling pathways and may be treated by drugs that tend to reverse these perturbations. Moreover, identification of novel candidate genes associated with infectious diseases would help us to explain disease pathogenesis further and develop novel therapeutics.
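
For readers comparing the accuracy, sensitivity and specificity figures reported in this abstract, the standard binary-classification definitions can be sketched in Python as follows. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from the four cells of a binary confusion matrix."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # all correct / all cases
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=80, fp=10, tn=90, fn=20)
```

Accuracy depends on the class balance of the test set, which is one reason results on training, blind and independent datasets are reported separately.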

Bonetta R, Valentino G. Machine learning techniques for protein function prediction. Proteins 2019;88:397-413. [PMID: 31603244 DOI: 10.1002/prot.25832] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2019] [Revised: 07/05/2019] [Accepted: 09/17/2019] [Indexed: 12/17/2022]

Han GS, Yu ZG. ML-rRBF-ECOC: A Multi-Label Learning Classifier for Predicting Protein Subcellular Localization with Both Single and Multiple Sites. Curr Proteomics 2019. [DOI: 10.2174/1570164616666190103143945] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Abstract Background: The subcellular localization of a protein is closely related to its functions and interactions. A growing body of evidence shows that proteins may simultaneously exist at, or move between, two or more different subcellular localizations. Therefore, predicting protein subcellular localization is an important but challenging problem. Observation: Most existing methods for predicting protein subcellular localization assume that a protein is located at a single site. Although a few methods have been proposed to deal with proteins with multiple sites, correlations between subcellular localizations are not efficiently taken into account. In this paper, we propose an integrated method for predicting protein subcellular localizations with both single and multiple sites. Methods: First, we extend the Multi-Label Radial Basis Function (ML-RBF) method to a regularized version and augment the first layer of ML-RBF to take local correlations between subcellular localizations into account. Second, we embed the modified ML-RBF into a multi-label Error-Correcting Output Codes (ECOC) method in order to further consider subcellular localization dependency. We name our method ML-rRBF-ECOC. Finally, the performance of ML-rRBF-ECOC is evaluated on three benchmark datasets. Results: The results demonstrate that ML-rRBF-ECOC is highly competitive with the related multi-label learning method and some state-of-the-art methods for predicting protein subcellular localizations with multiple sites. Considering dependency between subcellular localizations can contribute to the improvement of prediction performance. Conclusion: This also indicates that correlations between different subcellular localizations really exist. Our method at least plays a complementary role to existing methods for predicting protein subcellular localizations with multiple sites.

Chou KC, Cheng X, Xiao X. pLoc_bal-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by General PseAAC and Quasi-balancing Training Dataset. Med Chem 2019;15:472-485. [DOI: 10.2174/1573406415666181218102517] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2018] [Revised: 10/23/2018] [Accepted: 12/12/2018] [Indexed: 12/24/2022]

Abstract Background/Objective: Information on protein subcellular localization is crucially important for both basic research and drug development. With the explosive growth of protein sequences discovered in the post-genomic age, there is strong demand for powerful bioinformatics tools that identify subcellular localization in a timely and effective manner based on sequence information alone. Recently, a predictor called “pLoc-mEuk” was developed for identifying the subcellular localization of eukaryotic proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems where many proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mEuk was trained on an extremely skewed dataset in which some subsets were about 200 times the size of others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. Methods: To alleviate such bias, we have developed a new predictor called pLoc_bal-mEuk by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mEuk, the existing state-of-the-art predictor for identifying the subcellular localization of eukaryotic proteins. It has not escaped our notice that the quasi-balancing treatment can also be used to deal with many other biological systems. Results: To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mEuk/. Conclusion: It is anticipated that the pLoc_bal-mEuk predictor holds very high potential to become a useful high-throughput tool for identifying the subcellular localization of eukaryotic proteins, particularly for finding multi-target drugs, which is currently a very hot trend in drug development.

Gao F, Peng C, Li J, Zhuang R, Guo Z, Xu D, Su X, Zhang X. Radioiodinated progesterone derivative for progesterone receptor targeting with enhanced nucleus uptake via phenylboronic acid conjugation. J Labelled Comp Radiopharm 2019;62:301-309. [PMID: 31032992 DOI: 10.1002/jlcr.3741] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2019] [Revised: 03/13/2019] [Accepted: 03/26/2019] [Indexed: 11/06/2022]

Nielsen H, Tsirigos KD, Brunak S, von Heijne G. A Brief History of Protein Sorting Prediction. Protein J 2019;38:200-216. [PMID: 31119599 PMCID: PMC6589146 DOI: 10.1007/s10930-019-09838-3] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Bentley SJ, Jamabo M, Boshoff A. The Hsp70/J-protein machinery of the African trypanosome, Trypanosoma brucei. Cell Stress Chaperones 2019;24:125-148. [PMID: 30506377 PMCID: PMC6363631 DOI: 10.1007/s12192-018-0950-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2018] [Revised: 11/06/2018] [Accepted: 11/12/2018] [Indexed: 12/28/2022] Open

Overexpression of ScMYBAS1 alternative splicing transcripts differentially impacts biomass accumulation and drought tolerance in rice transgenic plants. PLoS One 2018;13:e0207534. [PMID: 30517137 PMCID: PMC6281192 DOI: 10.1371/journal.pone.0207534] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2018] [Accepted: 11/01/2018] [Indexed: 02/05/2023] Open

Cheng X, Xiao X, Chou KC. pLoc_bal-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC. J Theor Biol 2018;458:92-102. [DOI: 10.1016/j.jtbi.2018.09.005] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2018] [Revised: 09/05/2018] [Accepted: 09/07/2018] [Indexed: 01/03/2023]

Sharma M, Bennewitz B, Klösgen RB. Rather rule than exception? How to evaluate the relevance of dual protein targeting to mitochondria and chloroplasts. Photosynth Res 2018;138:335-343. [PMID: 29946965 DOI: 10.1007/s11120-018-0543-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/27/2017] [Accepted: 06/20/2018] [Indexed: 05/11/2023]

Gudenas BL, Wang L. Prediction of LncRNA Subcellular Localization with Deep Learning from Sequence Features. Sci Rep 2018;8:16385. [PMID: 30401954 PMCID: PMC6219567 DOI: 10.1038/s41598-018-34708-w] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2018] [Accepted: 10/19/2018] [Indexed: 12/20/2022] Open

Dayan FE, Barker A, Tranel PJ. Origins and structure of chloroplastic and mitochondrial plant protoporphyrinogen oxidases: implications for the evolution of herbicide resistance. Pest Manag Sci 2018;74:2226-2234. [PMID: 28967179 DOI: 10.1002/ps.4744] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/21/2017] [Revised: 09/05/2017] [Accepted: 09/23/2017] [Indexed: 05/25/2023]

Kang MK, Tullman-Ercek D. Engineering expression and function of membrane proteins. Methods 2018;147:66-72. [DOI: 10.1016/j.ymeth.2018.04.014] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/01/2018] [Revised: 04/03/2018] [Accepted: 04/16/2018] [Indexed: 01/18/2023] Open

Mirzaei Mehrabad E, Hassanzadeh R, Eslahchi C. PMLPR: A novel method for predicting subcellular localization based on recommender systems. Sci Rep 2018;8:12006. [PMID: 30104743 PMCID: PMC6089892 DOI: 10.1038/s41598-018-30394-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2016] [Accepted: 07/30/2018] [Indexed: 12/16/2022] Open

Jurtz VI, Johansen AR, Nielsen M, Almagro Armenteros JJ, Nielsen H, Sønderby CK, Winther O, Sønderby SK. An introduction to deep learning on biological sequence data: examples and solutions. Bioinformatics 2018;33:3685-3690. [PMID: 28961695 DOI: 10.1093/bioinformatics/btx531] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2017] [Accepted: 08/22/2017] [Indexed: 11/14/2022] Open

Almagro Armenteros JJ, Sønderby CK, Sønderby SK, Nielsen H, Winther O. DeepLoc: prediction of protein subcellular localization using deep learning. Bioinformatics 2018;33:3387-3395. [PMID: 29036616 DOI: 10.1093/bioinformatics/btx431] [Citation(s) in RCA: 607] [Impact Index Per Article: 101.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2017] [Accepted: 07/03/2017] [Indexed: 01/12/2023] Open

Predicting protein submitochondrial locations by incorporating the pseudo-position specific scoring matrix into the general Chou's pseudo-amino acid composition. J Theor Biol 2018;450:86-103. [DOI: 10.1016/j.jtbi.2018.04.026] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2018] [Revised: 04/10/2018] [Accepted: 04/16/2018] [Indexed: 01/16/2023]

Cheng X, Lin WZ, Xiao X, Chou KC. pLoc_bal-mAnimal: predict subcellular localization of animal proteins by balancing training dataset and PseAAC. Bioinformatics 2018;35:398-406. [DOI: 10.1093/bioinformatics/bty628] [Citation(s) in RCA: 79] [Impact Index Per Article: 13.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2018] [Accepted: 07/11/2018] [Indexed: 12/25/2022] Open

Kikegawa T, Yamaguchi T, Nambu R, Etchuya K, Ikeda M, Mukai Y. Signal-anchor sequences are an essential factor for the Golgi-plasma membrane localization of type II membrane proteins. Biosci Biotechnol Biochem 2018;82:1708-1714. [PMID: 29912671 DOI: 10.1080/09168451.2018.1484272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]

Bakhtiarizadeh MR, Rahimi M, Mohammadi-Sangcheshmeh A, Shariati J V, Salami SA. PrESOgenesis: A two-layer multi-label predictor for identifying fertility-related proteins using support vector machine and pseudo amino acid composition approach. Sci Rep 2018;8:9025. [PMID: 29899414 PMCID: PMC5998058 DOI: 10.1038/s41598-018-27338-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2018] [Accepted: 05/25/2018] [Indexed: 11/08/2022] Open

Nielsen H. Protein Sorting Prediction. Methods Mol Biol 2018;1615:23-57. [PMID: 28667600 DOI: 10.1007/978-1-4939-7033-9_2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/17/2023]

A Novel Modeling in Mathematical Biology for Classification of Signal Peptides. Sci Rep 2018;8:1039. [PMID: 29348418 PMCID: PMC5773712 DOI: 10.1038/s41598-018-19491-y] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2017] [Accepted: 01/02/2018] [Indexed: 11/17/2022] Open

Olmedo P, Moreno AA, Sanhueza D, Balic I, Silva-Sanzana C, Zepeda B, Verdonk JC, Arriagada C, Meneses C, Campos-Vargas R. A catechol oxidase AcPPO from cherimoya (Annona cherimola Mill.) is localized to the Golgi apparatus. Plant Sci 2018;266:46-54. [PMID: 29241566 DOI: 10.1016/j.plantsci.2017.10.012] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/15/2017] [Revised: 10/18/2017] [Accepted: 10/20/2017] [Indexed: 06/07/2023]

Kunze M. Predicting Peroxisomal Targeting Signals to Elucidate the Peroxisomal Proteome of Mammals. Subcell Biochem 2018;89:157-199. [PMID: 30378023 DOI: 10.1007/978-981-13-2233-4_7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]

pLoc-mEuk: Predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC. Genomics 2018;110:50-58. [DOI: 10.1016/j.ygeno.2017.08.005] [Citation(s) in RCA: 180] [Impact Index Per Article: 30.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2017] [Revised: 08/10/2017] [Accepted: 08/11/2017] [Indexed: 11/22/2022]

Zhou H, Yang Y, Shen HB. Hum-mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features. Bioinformatics 2017;33:843-853. [PMID: 27993784 DOI: 10.1093/bioinformatics/btw723] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2016] [Accepted: 11/17/2016] [Indexed: 11/13/2022] Open

Abstract

Motivation

Protein subcellular localization prediction has been an important research topic in computational biology over the last decade. Various automatic methods have been proposed to predict locations for large-scale protein datasets, where statistical machine learning algorithms are widely used for model construction. A key step in these predictors is encoding the amino acid sequences into feature vectors. Many studies have shown that features extracted from biological domains, such as gene ontology and functional domains, can be very useful for improving the prediction accuracy. However, domain knowledge usually results in redundant features and high-dimensional feature spaces, which may degrade the performance of machine learning models.

Results

In this paper, we propose a new amino acid sequence-based human protein subcellular location prediction approach, Hum-mPLoc 3.0, which covers 12 human subcellular localizations. The sequences are represented by multi-view complementary features, i.e. context vocabulary annotation-based gene ontology (GO) terms, peptide-based functional domains, and residue-based statistical features. To systematically reflect the structural hierarchy of the domain knowledge bases, we propose a novel feature representation protocol denoted as HCM (Hidden Correlation Modeling), which creates more compact and discriminative feature vectors by modeling the hidden correlations between annotation terms. Experimental results on four benchmark datasets show that HCM improves prediction accuracy by 5-11% and F1 by 8-19% compared with conventional GO-based methods. A large-scale application of Hum-mPLoc 3.0 on the whole human proteome reveals protein co-localization preferences in the cell.

Availability and Implementation

www.csbio.sjtu.edu.cn/bioinf/Hum-mPLoc3/.

Contacts

hbshen@sjtu.edu.cn.

Supplementary information

Supplementary data are available at Bioinformatics online.

Bentley SJ, Boshoff A. Hsp70/J-protein machinery from Glossina morsitans morsitans, vector of African trypanosomiasis. PLoS One 2017;12:e0183858. [PMID: 28902917 PMCID: PMC5597180 DOI: 10.1371/journal.pone.0183858] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2017] [Accepted: 08/11/2017] [Indexed: 11/18/2022] Open

pLoc-mVirus: Predict subcellular localization of multi-location virus proteins via incorporating the optimal GO information into general PseAAC. Gene 2017;628:315-321. [DOI: 10.1016/j.gene.2017.07.036] [Citation(s) in RCA: 135] [Impact Index Per Article: 19.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2017] [Revised: 07/08/2017] [Accepted: 07/11/2017] [Indexed: 12/25/2022]

GLUT10-Lacking in Arterial Tortuosity Syndrome-Is Localized to the Endoplasmic Reticulum of Human Fibroblasts. Int J Mol Sci 2017;18:ijms18081820. [PMID: 28829359 PMCID: PMC5578206 DOI: 10.3390/ijms18081820] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2017] [Revised: 08/13/2017] [Accepted: 08/13/2017] [Indexed: 01/02/2023] Open

Nielsen H. Predicting Subcellular Localization of Proteins by Bioinformatic Algorithms. Curr Top Microbiol Immunol 2017;404:129-158. [PMID: 26728066 DOI: 10.1007/82_2015_5006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]