1
Prusty JS, Kumar A. LC-MS/MS profiling and analysis of Bacillus licheniformis extracellular proteins for antifungal potential against Candida albicans. J Proteomics 2024; 303:105228. [PMID: 38878881 DOI: 10.1016/j.jprot.2024.105228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 06/18/2024]
Abstract
Candida albicans, a significant human pathogenic fungus, employs hydrolytic proteases for host invasion. Resistance to conventional antifungal agents has been reported worldwide. This study investigates the role of Bacillus licheniformis extracellular proteins (ECP) as effective antifungal peptides (AFPs). The aim was to identify and characterize the ECP of B. licheniformis through LC-MS/MS and bioinformatics analysis. LC-MS/MS analysis identified 326 proteins, including 69 putative ECP that were further analyzed in silico. Of these, 21 peptides were predicted by the ClassAMP tool to have antifungal properties, and they are predominantly anionic. Peptide-protein docking revealed interactions of AFPs such as Peptide chain release factor 1 (Q65DV1_Seq1: SASEQLSDAK) and Putative carboxypeptidase (Q65IF0_Seq7: SDSSLEDQDFILESK) with the C. albicans virulence protein SAP5 (PDB ID 2QZX), forming hydrogen bonds and significant Pi-Pi interactions. The novelty of the study is the identification of B. licheniformis ECP, which sheds light on their antifungal potential. The identified AFPs, particularly those interacting with the bona fide pharmaceutical target SAP5 of C. albicans, represent promising avenues for the development of AFP-based antifungal treatments and a possible novel therapeutic strategy against C. albicans. SIGNIFICANCE OF STUDY: The purpose of this work was to carry out proteomic profiling of the secretome of B. licheniformis. The efficacy of B. licheniformis extracellular proteins against C. albicans was previously investigated and documented in a recently communicated manuscript showing the antifungal activity of these proteins. To achieve high-throughput identification of excretory-secretory (ES) proteins, liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used. However, comprehensive research on AFPs in B. licheniformis was lacking. The proteins secreted by B. licheniformis in liquid medium were first identified by LC-MS/MS in order to directly characterize the unidentified active metabolites in the fermentation broth.
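The statement that the predicted AFPs are predominantly anionic can be checked roughly for the two peptide sequences quoted in the abstract. The sketch below is a simplification and not the authors' method: it assumes net charge at neutral pH is approximated by counting Lys and Arg as +1 and Asp and Glu as -1, ignoring histidine and the free termini. The function name `net_charge` is hypothetical.

```python
def net_charge(peptide: str) -> int:
    """Approximate net charge at neutral pH: (K + R) minus (D + E).

    Simplification (assumption): histidine and the free termini are ignored.
    """
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


# The two peptide sequences quoted in the abstract.
peptides = {
    "Q65DV1_Seq1": "SASEQLSDAK",
    "Q65IF0_Seq7": "SDSSLEDQDFILESK",
}

charges = {name: net_charge(seq) for name, seq in peptides.items()}
for name, charge in charges.items():
    label = "anionic" if charge < 0 else "cationic or neutral"
    print(f"{name}: net charge {charge:+d} ({label})")
```

Under this crude count both peptides carry a negative net charge, which is consistent with the abstract's description.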
Affiliation(s)
- Jyoti Sankar Prusty
- Department of Biotechnology, National Institute of Technology, Raipur 492010, CG, India
- Awanish Kumar
- Department of Biotechnology, National Institute of Technology, Raipur 492010, CG, India.
2
Devi SB, Kumar S. Designing a multi-epitope chimeric protein from different potential targets: A potential vaccine candidate against Plasmodium. Mol Biochem Parasitol 2023; 255:111560. [PMID: 37084957 DOI: 10.1016/j.molbiopara.2023.111560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 03/30/2023] [Accepted: 04/03/2023] [Indexed: 04/23/2023]
Abstract
Malaria is an infectious disease that has been a continuous threat to mankind since time immemorial. Owing to the complex multi-stage life cycle of the Plasmodium parasite, an effective malaria vaccine that is fully protective against parasite infection is urgently needed. In the present study, essential parasite proteins were identified and a chimeric protein with multivalent epitopes was generated. The designed chimeric protein consists of the best potential B and T cell epitopes from five different essential parasite proteins. Physicochemical studies of the chimeric protein showed that the modeled vaccine construct was thermostable, hydrophilic and antigenic in nature. The binding of the vaccine construct to Toll-like receptor-4 (TLR-4), as revealed by molecular docking, suggests a possible role of the vaccine construct in activating the innate immune response. The constructed vaccine, being a chimeric protein containing epitopes from different potential candidates, could target different stages or pathways of the parasite. Moreover, the approach used in this study is time- and cost-effective, and can be applied to the discovery of new potential vaccine targets for other pathogens.
Affiliation(s)
- Sanasam Bijara Devi
- Department of Life science & Bioinformatics, Assam University, Silchar 788011 India.
- Sanjeev Kumar
- Department of Life science & Bioinformatics, Assam University, Silchar 788011 India
3
Heinzinger M, Littmann M, Sillitoe I, Bordin N, Orengo C, Rost B. Contrastive learning on protein embeddings enlightens midnight zone. NAR Genom Bioinform 2022; 4:lqac043. [PMID: 35702380 PMCID: PMC9188115 DOI: 10.1093/nargab/lqac043] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 11/22/2021] [Revised: 03/25/2022] [Accepted: 05/17/2022] [Indexed: 12/23/2022]
Abstract
Experimental structures are leveraged through multiple sequence alignments, or more generally through homology-based inference (HBI), facilitating the transfer of information from a protein with known annotation to a query without any annotation. A recent alternative expands the concept of HBI from sequence-distance lookup to embedding-based annotation transfer (EAT). These embeddings are derived from protein Language Models (pLMs). Here, we introduce using single protein representations from pLMs for contrastive learning. This learning procedure creates a new set of embeddings that optimizes constraints captured by hierarchical classifications of protein 3D structures defined by the CATH resource. The approach, dubbed ProtTucker, recognizes distant homologous relationships better than more traditional techniques such as threading or fold recognition. Thus, these embeddings have allowed sequence comparison to step into the 'midnight zone' of protein similarity, i.e. the region in which distantly related sequences have a seemingly random pairwise sequence similarity. The novelty of this work is in the particular combination of tools and sampling techniques that achieved performance comparable to or better than existing state-of-the-art sequence comparison methods. Additionally, since this method does not need to generate alignments, it is also orders of magnitude faster. The code is available at https://github.com/Rostlab/EAT.
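The abstract does not state which contrastive objective is used, so the following is only a minimal sketch under the assumption of a plain triplet margin loss on Euclidean distances: an anchor embedding is pulled towards a protein sharing its CATH class and pushed away from one that does not. The 2-d vectors, the margin of 1.0 and all function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-d embeddings; real pLM embeddings have many more dimensions.
anchor = [0.0, 0.0]
same_class = [0.0, 1.0]   # assumed to share the anchor's CATH class
other_class = [3.0, 4.0]  # assumed to belong to a different class

well_separated = triplet_margin_loss(anchor, same_class, other_class)
violating = triplet_margin_loss(anchor, other_class, same_class)
print(well_separated, violating)
```

A triplet that already respects the hierarchy contributes no loss, while a violating triplet contributes a loss that grows with the size of the violation; training would adjust the embedding network to reduce it.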
Affiliation(s)
- Michael Heinzinger
- TUM (Technical University of Munich) Dept Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748 Garching, Germany
- Maria Littmann
- TUM (Technical University of Munich) Dept Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany
- Ian Sillitoe
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Nicola Bordin
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Christine Orengo
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Burkhard Rost
- TUM (Technical University of Munich) Dept Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748 Garching, Germany & TUM School of Life Sciences Weihenstephan (WZW), Alte Akademie 8, Freising, Germany
4
Jin Y, Yang Y. ProtPlat: an efficient pre-training platform for protein classification based on FastText. BMC Bioinformatics 2022; 23:66. [PMID: 35148686 PMCID: PMC8832758 DOI: 10.1186/s12859-022-04604-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/13/2021] [Accepted: 02/02/2022] [Indexed: 11/24/2022]
Abstract
Background Over the past decades, benefitting from the rapid growth of protein sequence data in public databases, many machine learning methods have been developed to predict physicochemical properties or functions of proteins using amino acid sequence features. However, the prediction performance often suffers from the lack of labeled data. In recent years, pre-training methods have been widely studied to address the small-sample issue in computer vision and natural language processing, while specific pre-training techniques for protein sequences are few. Results In this paper, we propose a pre-training platform for representing protein sequences, called ProtPlat, which uses the Pfam database to train a three-layer neural network, and then uses specific training data from downstream tasks to fine-tune the model. ProtPlat can learn good representations for amino acids, and at the same time achieve efficient classification. We conduct experiments on three protein classification tasks, including the identification of type III secreted effectors, the prediction of subcellular localization, and the recognition of signal peptides. The experimental results show that the pre-training can enhance model performance effectively and that ProtPlat is competitive with the state-of-the-art predictors, especially for small datasets. We implement the ProtPlat platform as a web service (https://compbio.sjtu.edu.cn/protplat) that is accessible to the public. Conclusions To enhance the feature representation of protein amino acid sequences and improve the performance of sequence-based classification tasks, we develop ProtPlat, a general platform for the pre-training of protein sequences, which features large-scale supervised training based on the Pfam database and an efficient learning model, FastText. The experimental results of three downstream classification tasks demonstrate the efficacy of ProtPlat.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04604-2.
Affiliation(s)
- Yuan Jin
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, and Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, 200240, China
- Yang Yang
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, and Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, 200240, China.
5
Abeywickrama TD, Perera IC. In Silico Characterization and Virtual Screening of GntR/HutC Family Transcriptional Regulator MoyR: A Potential Monooxygenase Regulator in Mycobacterium tuberculosis. Biology (Basel) 2021; 10:1241. [PMID: 34943156 PMCID: PMC8698889 DOI: 10.3390/biology10121241] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/28/2021] [Revised: 07/01/2021] [Accepted: 07/02/2021] [Indexed: 12/31/2022]
Abstract
Simple Summary In an era where the world faces new diseases and pathogens, another emerging challenge is neglected pathogens becoming more notorious. Transcriptional regulators play a vital role in the pathogenesis and survival of these pathogens. Hence, characterizing transcriptional regulators, either in vitro or in silico, is of great importance. Here, we present the first structural characterization of a GntR/HutC regulator in Mycobacterium tuberculosis via in silico methods. We have suggested its possible role and potential as a drug target as well as identified possible drug leads that can be used for further improvements. Abstract Mycobacterium tuberculosis is a well-known pathogen due to the emergence of drug resistance associated with it, where transcriptional regulators play a key role in infection, colonization and persistence. The genome of M. tuberculosis encodes many transcriptional regulators, and here we report an in-depth in silico characterization of a GntR regulator: MoyR, a possible monooxygenase regulator. Homology modelling provided a reliable structure for MoyR, showing homology with a HutC regulator DasR from Streptomyces coelicolor. In silico physicochemical analysis revealed that MoyR is a cytoplasmic protein with higher thermal stability and higher pI. Four highly probable binding pockets were determined in MoyR and the druggability was higher in the orthosteric binding site consisting of three conserved critical residues: TYR179, ARG223 and GLU234. Two highly conserved leucine residues were identified in the effector-binding region of MoyR and other HutC homologues, suggesting that these two residues can be crucial for structure stability and oligomerization. Virtual screening of drug leads resulted in four drug-like compounds with greater affinity to MoyR with potential inhibitory effects for MoyR. Our findings support that this regulator protein can be valuable as a therapeutic target that can be used for developing drug leads.
6
Stärk H, Dallago C, Heinzinger M, Rost B. Light attention predicts protein location from the language of life. Bioinform Adv 2021; 1:vbab035. [PMID: 36700108 PMCID: PMC9710637 DOI: 10.1093/bioadv/vbab035] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 06/28/2021] [Revised: 09/27/2021] [Accepted: 11/15/2021] [Indexed: 01/28/2023]
Abstract
Summary Although knowing where a protein functions in a cell is important to characterize biological processes, this information remains unavailable for most known proteins. Machine learning narrows the gap through predictions from expert-designed input features leveraging information from multiple sequence alignments (MSAs) that is resource expensive to generate. Here, we showcased using embeddings from protein language models for competitive localization prediction without MSAs. Our lightweight deep neural network architecture used a softmax weighted aggregation mechanism with linear complexity in sequence length referred to as light attention. The method significantly outperformed the state-of-the-art (SOTA) for 10 localization classes by about 8 percentage points (Q10). So far, this might be the highest improvement of just embeddings over MSAs. Our new test set highlighted the limits of standard static datasets: while inviting new models, they might not suffice to claim improvements over the SOTA. Availability and implementation The novel models are available as a web-service at http://embed.protein.properties. Code needed to reproduce results is provided at https://github.com/HannesStark/protein-localization. Predictions for the human proteome are available at https://zenodo.org/record/5047020. Supplementary information Supplementary data are available at Bioinformatics Advances online.
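The softmax-weighted aggregation mentioned in the abstract can be illustrated with a deliberately reduced sketch. It assumes that each residue already has an embedding vector and a scalar score; in a trained model the scores would come from learned parameters, whereas here they are passed in directly. The function names and toy numbers are hypothetical and this is not the published architecture.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(residue_embeddings, residue_scores):
    """Collapse L per-residue vectors into one per-protein vector.

    One pass over the residues, so the cost is linear in sequence length L.
    """
    weights = softmax(residue_scores)
    dim = len(residue_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, residue_embeddings))
        for d in range(dim)
    ]


# Hypothetical protein of three residues with 2-d embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
uniform = attention_pool(embeddings, [0.0, 0.0, 0.0])   # plain average
focused = attention_pool(embeddings, [0.0, 0.0, 50.0])  # dominated by residue 3
print(uniform, focused)
```

Equal scores reduce the pooling to a mean over residues, while a single large score lets one residue dominate the per-protein representation; the fixed-size output can then be fed to a classifier regardless of protein length.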
Affiliation(s)
- Hannes Stärk
- Department of Informatics, Bioinformatics & Computational Biology—i12, TUM (Technical University of Munich), 85748 Munich, Germany
- Christian Dallago
- Department of Informatics, Bioinformatics & Computational Biology—i12, TUM (Technical University of Munich), 85748 Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), 85748 Munich, Germany
- Michael Heinzinger
- Department of Informatics, Bioinformatics & Computational Biology—i12, TUM (Technical University of Munich), 85748 Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), 85748 Munich, Germany
- Burkhard Rost
- Department of Informatics, Bioinformatics & Computational Biology—i12, TUM (Technical University of Munich), 85748 Munich, Germany
- Institute for Advanced Study (TUM-IAS), 85748 Munich, Germany
- TUM School of Life Sciences Weihenstephan (WZW), Freising, Germany
7
Kumar R, Dhanda SK. Bird Eye View of Protein Subcellular Localization Prediction. Life (Basel) 2020; 10:E347. [PMID: 33327400 PMCID: PMC7764902 DOI: 10.3390/life10120347] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/24/2020] [Revised: 12/11/2020] [Accepted: 12/11/2020] [Indexed: 12/12/2022]
Abstract
Proteins are made up of long chains of amino acids and perform a variety of functions in different organisms. The activity of a protein is determined by the nucleotide sequence of its gene and by its 3D structure. In addition, proteins must be directed to their specific locations or compartments to perform their functions. The challenge of computationally predicting the subcellular localization of proteins is addressed by various in silico methods. In this review, we survey the progress in this field and offer a bird's-eye view consisting of a comprehensive listing of tools, types of input features explored, machine learning approaches employed, and evaluation metrics applied. We hope the review will be useful for researchers working in the field of protein localization prediction.
Affiliation(s)
- Ravindra Kumar
- Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, NIH, 9609 Medical Center Drive, Rockville, MD 20850, USA
- Sandeep Kumar Dhanda
- Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
8
Deveshwar P, Sharma S, Prusty A, Sinha N, Zargar SM, Karwal D, Parashar V, Singh S, Tyagi AK. Analysis of rice nuclear-localized seed-expressed proteins and their database (RSNP-DB). Sci Rep 2020; 10:15116. [PMID: 32934280 PMCID: PMC7492263 DOI: 10.1038/s41598-020-70713-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/27/2020] [Accepted: 08/03/2020] [Indexed: 01/16/2023]
Abstract
Nuclear proteins are primarily regulatory factors governing gene expression. Multiple factors determine the localization of a protein in the nucleus. Reliable identification of nuclear proteins is still far from accurate. We have attempted to combine information from subcellular prediction tools, experimental evidence, and nuclear proteome data to identify a reliable list of seed-expressed nuclear proteins in rice. Depending upon the number of prediction tools calling a protein nuclear, we could sort 19,441 seed-expressed proteins into five categories. Of these, half were called nuclear by at least one out of four prediction tools. Further, gene ontology (GO) enrichment and transcription factor composition analysis showed that 6116 seed-expressed proteins could be called nuclear with greater confidence. Localization evidence from experimental data was available for 1360 proteins. Their analysis showed that a nuclear call is 92.04% accurate for proteins predicted nuclear by at least three tools. The distribution of nuclear localization signals and nuclear export signals showed that the majority of category four members were nuclear resident proteins, whereas the other categories have a low fraction of nuclear resident proteins and a significantly higher proportion of shuttling proteins. We compiled all the above information for the seed-expressed genes in the form of a searchable database named Rice Seed Nuclear Protein DataBase (RSNP-DB), available at https://pmb.du.ac.in/rsnpdb. This information will be useful for understanding the role of the seed nuclear proteome in rice.
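The sorting step described in the abstract, in which proteins are grouped by how many of four prediction tools call them nuclear, can be sketched as a simple vote count. The tool names, protein identifiers and calls below are invented for illustration, and equating the category number with the number of votes is an assumption; the abstract names neither the tools nor the category labels.

```python
from collections import Counter


def nuclear_votes(calls):
    """Count how many prediction tools call the protein nuclear."""
    return sum(1 for is_nuclear in calls.values() if is_nuclear)


# Hypothetical tool names and calls; the abstract does not name the four tools.
predictions = {
    "protA": {"tool1": True, "tool2": True, "tool3": True, "tool4": True},
    "protB": {"tool1": True, "tool2": False, "tool3": True, "tool4": True},
    "protC": {"tool1": False, "tool2": True, "tool3": False, "tool4": False},
    "protD": {"tool1": False, "tool2": False, "tool3": False, "tool4": False},
}

# Four tools give vote counts 0..4, i.e. five possible categories.
categories = {protein: nuclear_votes(calls) for protein, calls in predictions.items()}
category_sizes = Counter(categories.values())

# Proteins called nuclear by at least three tools, the threshold quoted in the abstract.
confident = sorted(p for p, votes in categories.items() if votes >= 3)
print(categories, confident)
```

Requiring agreement between several tools trades coverage for reliability, which matches the abstract's report that nuclear calls supported by at least three tools were the most accurate.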
Affiliation(s)
- Priyanka Deveshwar
- Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India
- Shivam Sharma
- Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India
- Ankita Prusty
- Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India
- Neha Sinha
- Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India
- Sajad Majeed Zargar
- Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India
- Proteomics Laboratory, Division of Plant Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Shalimar, Srinagar, Jammu & Kashmir, India
- Divya Karwal
- Institute of Informatics and Communications, University of Delhi, South Campus, New Delhi, India
- Vishal Parashar
- Institute of Informatics and Communications, University of Delhi, South Campus, New Delhi, India
- Sanjeev Singh
- Institute of Informatics and Communications, University of Delhi, South Campus, New Delhi, India
- Akhilesh Kumar Tyagi
- Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India.
9
Arif M, Ahmad S, Ali F, Fang G, Li M, Yu DJ. TargetCPP: accurate prediction of cell-penetrating peptides from optimized multi-scale features using gradient boost decision tree. J Comput Aided Mol Des 2020; 34:841-856. [PMID: 32180124 DOI: 10.1007/s10822-020-00307-z] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 11/12/2019] [Accepted: 03/09/2020] [Indexed: 02/08/2023]
Abstract
Cell-penetrating peptides (CPPs) are short, permeable peptides that have emerged as a tool for delivering therapeutic agents, including genetic material and macromolecules, into cells. Recently, CPPs have become a hotspot of life science research and have paved a new way of treating disease without harmful impact on cell viability, owing to their nontoxic character. Therefore, the correct identification of CPPs will provide hints for medical applications. Considering the shortcomings of traditional experimental identification of CPPs, an intelligent predictor is urgently needed to accurately identify CPPs among large numbers of uncharacterized sequences. We develop a novel computational method, called TargetCPP, to discriminate CPPs from non-CPPs with improved accuracy. In TargetCPP, the peptide sequences are first formulated with four distinct encoding methods, i.e., composite protein sequence representation, composition transition and distribution, split amino acid composition, and information theory features. These feature vectors were fused, and an intelligent minimum redundancy and maximum relevancy feature selection method was applied to choose an optimal subset of features. Finally, the predictive model is learned through different classification algorithms on the optimized features. Among these classifiers, the gradient boost decision tree algorithm achieved excellent performance throughout the experiments. Notably, the TargetCPP tool attained high prediction accuracy of 93.54% and 88.28% using jackknife and independent tests, respectively. Empirical outcomes demonstrate the superiority and potency of the proposed bioinformatics method over state-of-the-art methods. It is anticipated that the outcomes of this study will provide a strong background for large-scale prediction of CPPs and instructive guidance for clinical therapy and medical applications.
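Of the four encodings listed in the abstract, split amino acid composition is the simplest to illustrate. The sketch below assumes the common formulation in which a peptide is cut into an N-terminal, a middle and a C-terminal segment and the fraction of each of the 20 standard amino acids is computed per segment. The segment lengths of five residues and the function names are assumptions, not the settings of the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Fraction of each of the 20 standard amino acids in a segment."""
    if not segment:
        return [0.0] * len(AMINO_ACIDS)
    return [segment.count(aa) / len(segment) for aa in AMINO_ACIDS]


def split_composition(peptide, n_term=5, c_term=5):
    """Concatenate compositions of the N-terminal, middle and C-terminal parts.

    The segment lengths are illustrative assumptions, not the paper's settings.
    """
    head = peptide[:n_term]
    tail = peptide[-c_term:]
    middle = peptide[n_term:len(peptide) - c_term]
    return composition(head) + composition(middle) + composition(tail)


# TAT (48-60), a widely cited cell-penetrating peptide, used only as sample input.
features = split_composition("GRKKRRQRRRPPQ")
print(len(features))
```

Each peptide becomes a fixed-length vector of 60 numbers regardless of its length, which is what allows such encodings to be fused with other features and passed to a feature selector and a classifier.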
Affiliation(s)
- Muhammad Arif
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Saeed Ahmad
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Farman Ali
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Ge Fang
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Min Li
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
- Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
10
Heinzinger M, Elnaggar A, Wang Y, Dallago C, Nechaev D, Matthes F, Rost B. Modeling aspects of the language of life through transfer-learning protein sequences. BMC Bioinformatics 2019; 20:723. [PMID: 31847804 PMCID: PMC6918593 DOI: 10.1186/s12859-019-3220-8] [Citation(s) in RCA: 241] [Impact Index Per Article: 48.2] [Received: 05/03/2019] [Accepted: 11/13/2019] [Indexed: 12/15/2022]
Abstract
BACKGROUND Predicting protein function and structure from sequence is one important challenge for computational biology. For 26 years, most state-of-the-art approaches combined machine learning and evolutionary information. However, for some applications retrieving related proteins is becoming too time-consuming. Additionally, evolutionary information is less powerful for small families, e.g. for proteins from the Dark Proteome. Both these problems are addressed by the new methodology introduced here. RESULTS We introduced a novel way to represent protein sequences as continuous vectors (embeddings) by using the language model ELMo taken from natural language processing. By modeling protein sequences, ELMo effectively captured the biophysical properties of the language of life from unlabeled big data (UniRef50). We refer to these new embeddings as SeqVec (Sequence-to-Vector) and demonstrate their effectiveness by training simple neural networks for two different tasks. At the per-residue level, secondary structure (Q3 = 79% ± 1, Q8 = 68% ± 1) and regions with intrinsic disorder (MCC = 0.59 ± 0.03) were predicted significantly better than through one-hot encoding or through Word2vec-like approaches. At the per-protein level, subcellular localization was predicted in ten classes (Q10 = 68% ± 1) and membrane-bound proteins were distinguished from water-soluble proteins (Q2 = 87% ± 1). Although SeqVec embeddings generated the best predictions from single sequences, no solution improved over the best existing method using evolutionary information. Nevertheless, our approach improved over some popular methods using evolutionary information and for some proteins even did beat the best. Thus, they prove to condense the underlying principles of protein sequences. Overall, the important novelty is speed: where the lightning-fast HHblits needed on average about two minutes to generate the evolutionary information for a target protein, SeqVec created embeddings on average in 0.03 s. As this speed-up is independent of the size of growing sequence databases, SeqVec provides a highly scalable approach for the analysis of big data in proteomics, i.e. microbiome or metaproteome analysis. CONCLUSION Transfer-learning succeeded in extracting information from unlabeled sequence databases relevant for various protein prediction tasks. SeqVec modeled the language of life, namely the principles underlying protein sequences, better than any features suggested by textbooks and prediction methods. The exception is evolutionary information; however, that information is not available on the level of a single sequence.
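The speed comparison in the abstract can be turned into a short worked calculation. The two timings are taken from the abstract ("about two minutes" and 0.03 s per protein); the workload of 20,000 proteins processed one at a time is a hypothetical figure chosen only to make the difference tangible.

```python
hhblits_seconds = 2 * 60   # "about two minutes" per protein, from the abstract
seqvec_seconds = 0.03      # average embedding time per protein, from the abstract

# Ratio of the two per-protein timings.
speedup = hhblits_seconds / seqvec_seconds

# Hypothetical workload: 20,000 proteins processed sequentially.
n_proteins = 20_000
hhblits_hours = n_proteins * hhblits_seconds / 3600
seqvec_minutes = n_proteins * seqvec_seconds / 60

print(f"speed-up ~{speedup:.0f}x; {hhblits_hours:.0f} h vs {seqvec_minutes:.0f} min")
```

Under these assumptions the per-protein ratio is on the order of a few thousand, so a job that would occupy weeks of sequential alignment time shrinks to minutes of embedding time.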
Affiliation(s)
- Michael Heinzinger
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany.
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany.
- Ahmed Elnaggar
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
- Yu Wang
- Leibniz Supercomputing Centre, Boltzmannstr. 1, 85748, Garching/Munich, Germany
- Christian Dallago
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
- Dmitrii Nechaev
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
- Florian Matthes
- TUM Department of Informatics, Software Engineering and Business Information Systems, Boltzmannstr. 1, 85748, Garching/Munich, Germany
- Burkhard Rost
- Department of Informatics, Bioinformatics & Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748, Garching/Munich, Germany
- TUM School of Life Sciences Weihenstephan (WZW), Alte Akademie 8, Freising, Germany
- Department of Biochemistry and Molecular Biophysics & New York Consortium on Membrane Protein Structure (NYCOMPS), Columbia University, 701 West, 168th Street, New York, NY, 10032, USA
11
Bernhofer M, Goldberg T, Wolf S, Ahmed M, Zaugg J, Boden M, Rost B. NLSdb-major update for database of nuclear localization signals and nuclear export signals. Nucleic Acids Res 2018; 46:D503-D508. [PMID: 29106588 PMCID: PMC5753228 DOI: 10.1093/nar/gkx1021] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 09/15/2017] [Accepted: 10/18/2017] [Indexed: 11/13/2022]
Abstract
NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES. The potential sets of novel NES and NLS have been generated by a simple 'in silico mutagenesis' protocol. We started with motifs annotated by experiments. In step 1, we increased specificity such that no known non-nuclear protein matched the refined motif. In step 2, we increased the sensitivity trying to match several different families with a motif. We then iterated over steps 1 and 2. The final set of 2253 NLS motifs matched 35% of 8421 experimentally verified nuclear proteins (up from 21% for the previous version) and none of 18 278 non-nuclear proteins. We updated the web interface providing multiple options to search protein sequences for NES and NLS motifs, and to evaluate your own signal sequences. NLSdb can be accessed via Rostlab services at: https://rostlab.org/services/nlsdb/.
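The two-step refinement described in the abstract can be sketched with regular expressions: a motif is generalized to match more nuclear proteins (sensitivity) and is accepted only if it matches no known non-nuclear protein (specificity). The sequences below are toy examples invented for illustration and are not NLSdb entries; the starting motif PKKKRKV is the classic SV40 large T antigen NLS, and the helper `motif_stats` is hypothetical.

```python
import re


def motif_stats(pattern, nuclear, non_nuclear):
    """Return (fraction of nuclear proteins matched, number of non-nuclear hits)."""
    regex = re.compile(pattern)
    hits = sum(1 for seq in nuclear if regex.search(seq))
    false_hits = sum(1 for seq in non_nuclear if regex.search(seq))
    return hits / len(nuclear), false_hits


# Toy sequences invented for illustration; they are not taken from NLSdb.
nuclear = ["MAPKKKRKVED", "MSPKKRRKVLG", "MTTLLSAGGE"]
non_nuclear = ["MKKLLPTAAG", "MAPKKGGKVS"]

# Step 2 (sensitivity): generalize the SV40-like motif PKKKRKV to allow K or R.
loose = "PKK[KR][KR]KV"
coverage, false_hits = motif_stats(loose, nuclear, non_nuclear)

# Step 1 (specificity): a motif is kept only if no non-nuclear protein matches.
accepted = false_hits == 0
print(f"coverage {coverage:.2f}, non-nuclear hits {false_hits}, accepted {accepted}")
```

Iterating between the two checks, as the abstract describes, widens a motif only as far as the non-nuclear set allows, which is why the final motif set can cover 35% of nuclear proteins while matching none of the non-nuclear ones.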
Affiliation(s)
- Michael Bernhofer
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
- Tatyana Goldberg
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
- Silvana Wolf
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
- Mohamed Ahmed
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
- Julian Zaugg
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia
- Mikael Boden
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane 4072, Australia
- Burkhard Rost
- Department of Informatics, I12-Chair of Bioinformatics and Computational Biology, Technical University of Munich (TUM), Boltzmannstrasse 3, 85748 Garching/Munich, Germany
- Institute of Advanced Study (TUM-IAS), Lichtenbergstrasse 2a, 85748 Garching/Munich, Germany
- Institute for Food and Plant Sciences WZW-Weihenstephan, Alte Akademie 8, 85354 Freising, Germany
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
|
12
|
Abstract
Ever since the signal hypothesis was proposed in 1971, the exact nature of signal peptides has been a focus point of research. The prediction of signal peptides and protein subcellular location from amino acid sequences has been an important problem in bioinformatics since the dawn of this research field, involving many statistical and machine learning technologies. In this review, we provide a historical account of how position-weight matrices, artificial neural networks, hidden Markov models, support vector machines and, lately, deep learning techniques have been used in the attempts to predict where proteins go. Because the secretory pathway was the first one to be studied both experimentally and through bioinformatics, our main focus is on the historical development of prediction methods for signal peptides that target proteins for secretion; prediction methods to identify targeting signals for other cellular compartments are treated in less detail.
Affiliation(s)
- Henrik Nielsen
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kgs. Lyngby, Denmark
- Konstantinos D Tsirigos
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kgs. Lyngby, Denmark
- Søren Brunak
- Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kgs. Lyngby, Denmark
- Faculty of Health and Medical Sciences, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Gunnar von Heijne
- Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Science for Life Laboratory, Stockholm University, Solna, Sweden
|
13
|
Sanasam BD, Kumar S. PRE-binding protein of Plasmodium falciparum is a potential candidate for vaccine design and development: An in silico evaluation of the hypothesis. Med Hypotheses 2019; 125:119-123. [DOI: 10.1016/j.mehy.2019.01.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/12/2018] [Revised: 12/14/2018] [Accepted: 01/10/2019] [Indexed: 11/29/2022]
|
14
|
Savojardo C, Martelli PL, Fariselli P, Casadio R. SChloro: directing Viridiplantae proteins to six chloroplastic sub-compartments. Bioinformatics 2018; 33:347-353. [PMID: 28172591 PMCID: PMC5408801 DOI: 10.1093/bioinformatics/btw656] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/20/2016] [Revised: 06/21/2016] [Accepted: 10/12/2016] [Indexed: 12/12/2022] Open
Abstract
Motivation: Chloroplasts are organelles found in plants and involved in several important cell processes. Similarly to other compartments in the cell, chloroplasts have an internal structure comprising several sub-compartments, where different proteins are targeted to perform their functions. Given the relation between protein function and localization, the availability of effective computational tools to predict protein sub-organelle localizations is crucial for large-scale functional studies. Results: In this paper we present SChloro, a novel machine-learning approach to predict protein sub-chloroplastic localization, based on targeting signal detection and membrane protein information. The proposed approach performs multi-label predictions discriminating six chloroplastic sub-compartments that include inner membrane, outer membrane, stroma, thylakoid lumen, plastoglobule and thylakoid membrane. In comparative benchmarks, the proposed method outperforms current state-of-the-art methods in both single- and multi-compartment predictions, with an overall multi-label accuracy of 74%. The results demonstrate the relevance of the approach that is eligible as a good candidate for integration into more general large-scale annotation pipelines of protein subcellular localization. Availability and Implementation: The method is available as web server at http://schloro.biocomp.unibo.it. Contact: gigi@biocomp.unibo.it.
Affiliation(s)
- Castrense Savojardo
- Biocomputing Group, BiGeA - CIG, Interdepartmental Center «Luigi Galvani» for Integrated Studies of Bioinformatics, Biophysics and Biocomplexity, University of Bologna, Bologna, Italy
- Pier Luigi Martelli
- Biocomputing Group, BiGeA - CIG, Interdepartmental Center «Luigi Galvani» for Integrated Studies of Bioinformatics, Biophysics and Biocomplexity, University of Bologna, Bologna, Italy
- Piero Fariselli
- Department of Comparative Biomedicine and Food Science (BCA), University of Padova, Padova, Italy
- Rita Casadio
- Biocomputing Group, BiGeA - CIG, Interdepartmental Center «Luigi Galvani» for Integrated Studies of Bioinformatics, Biophysics and Biocomplexity, University of Bologna, Bologna, Italy
- Interdepartmental Center «Giorgio Prodi» for Cancer Research, University of Bologna, Bologna, Italy
|
15
|
Kunze M. Predicting Peroxisomal Targeting Signals to Elucidate the Peroxisomal Proteome of Mammals. Subcell Biochem 2018; 89:157-199. [PMID: 30378023 DOI: 10.1007/978-981-13-2233-4_7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/15/2022]
Abstract
Peroxisomes harbor a plethora of proteins, but the peroxisomal proteome as the entirety of all peroxisomal proteins is still unknown for mammalian species. Computational algorithms can be used to predict the subcellular localization of proteins based on their amino acid sequence and this method has been amply used to forecast the intracellular fate of individual proteins. However, when applying such algorithms systematically to all proteins of an organism the prediction of its peroxisomal proteome in silico should be possible. Therefore, a reliable detection of peroxisomal targeting signals (PTS) acting as postal codes for the intracellular distribution of the encoding protein is crucial. Peroxisomal proteins can utilize different routes to reach their destination depending on the type of PTS. Accordingly, independent prediction algorithms have been developed for each type of PTS, but only those for type-1 motifs (PTS1) have so far reached a satisfying predictive performance. This is partially due to the low number of peroxisomal proteins limiting the power of statistical analyses and partially due to specific properties of peroxisomal protein import, which render functional PTS motifs inactive in specific contexts. Moreover, the prediction of the peroxisomal proteome is limited by the high number of proteins encoded in mammalian genomes, which causes numerous false positive predictions even when using reliable algorithms and buries the few yet unidentified peroxisomal proteins. Thus, the application of prediction algorithms to identify all peroxisomal proteins is currently ineffective as a stand-alone method, but can display its full potential when combined with other methods.
Affiliation(s)
- Markus Kunze
- Department of Pathobiology of the Nervous System, Center for Brain Research, Medical University of Vienna, Vienna, Austria
|
16
|
Random mutagenesis analysis and identification of a novel C2H2-type transcription factor from the nematode-trapping fungus Arthrobotrys oligospora. Sci Rep 2017; 7:5640. [PMID: 28717216 PMCID: PMC5514059 DOI: 10.1038/s41598-017-06075-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/31/2016] [Accepted: 06/12/2017] [Indexed: 01/04/2023] Open
Abstract
Arthrobotrys oligospora is a typical nematode-trapping fungus. In this study, 37 transformants of A. oligospora were obtained by REMI (restriction enzyme mediated integration) method and phenotypic properties of nine transformants were analyzed. The nine transformants showed differences in growth, conidiation, trap formation, stress tolerance, and/or pathogenicity among each other and with those of the parental wild-type strain (WT). The insertional sites of the hph cassette were identified in transformants X5 and X13. In X5, the cassette was inserted in the non-coding region between AOL_s00076g273 (76g273) and AOL_s00076g274 (76g274) and the transcription of 76g274, but not 76g273, was enhanced in X5. 76g274p had two conserved domains and was predicted as a nucleoprotein, which we confirmed by its nuclear localization in Saccharomyces cerevisiae using the green fluorescent protein-fused 76g274p. The transcription of 76g274 was stimulated or inhibited by several environmental factors. The sporulation yields of 76g274-deficient mutants were decreased by 70%, and transcription of several sporulation-related genes was severely diminished compared to the WT during the conidiation. In summary, a method for screening mutants was established in A. oligospora and using the method, we identified a novel C2H2-type transcription factor that positively regulates the conidiation of A. oligospora.
|
17
|
Tung CH, Chen CW, Sun HH, Chu YW. Predicting human protein subcellular localization by heterogeneous and comprehensive approaches. PLoS One 2017; 12:e0178832. [PMID: 28658305 PMCID: PMC5489166 DOI: 10.1371/journal.pone.0178832] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/08/2017] [Accepted: 05/19/2017] [Indexed: 11/19/2022] Open
Abstract
Drug development and investigation of protein function both require an understanding of protein subcellular localization. We developed a system, REALoc, that can predict the subcellular localization of singleplex and multiplex proteins in humans. This system, based on a comprehensive strategy, consists of two heterogeneous systematic frameworks that integrate one-to-one and many-to-many machine learning methods and use sequence-based features, including amino acid composition, surface accessibility, weighted sign aa index, and sequence similarity profile, as well as gene ontology function-based features. REALoc can be used to predict localization to six subcellular compartments (cell membrane, cytoplasm, endoplasmic reticulum/Golgi, mitochondrion, nucleus, and extracellular). REALoc yielded a 75.3% absolute true success rate during five-fold cross-validation and a 57.1% absolute true success rate in an independent database test, which was >10% higher than six other prediction systems. Lastly, we analyzed the effects of Vote and GANN models on singleplex and multiplex localization prediction efficacy. REALoc is freely available at http://predictor.nchu.edu.tw/REALoc.
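The abstract reports an "absolute true success rate" without defining it. A minimal sketch follows, assuming the definition commonly used for multi-label localization predictors: a protein counts as correct only when its predicted set of compartments equals the annotated set exactly. The function name and the example annotations are hypothetical and do not come from REALoc.

```python
def absolute_true_rate(true_sets, predicted_sets):
    """Fraction of proteins whose predicted label set equals the true set exactly."""
    if len(true_sets) != len(predicted_sets):
        raise ValueError("true_sets and predicted_sets must have the same length")
    if not true_sets:
        raise ValueError("at least one protein is required")
    exact = sum(
        1 for true, pred in zip(true_sets, predicted_sets) if set(true) == set(pred)
    )
    return exact / len(true_sets)


# Hypothetical annotations: the second protein is multiplex (two compartments).
true_sets = [
    {"nucleus"},
    {"cytoplasm", "nucleus"},
    {"extracellular"},
    {"mitochondrion"},
]
predicted_sets = [
    {"nucleus"},                     # exact match
    {"nucleus"},                     # partial match: not counted
    {"extracellular"},               # exact match
    {"mitochondrion", "cytoplasm"},  # over-prediction: not counted
]

rate = absolute_true_rate(true_sets, predicted_sets)
print(f"{rate:.2f}")
```

Under this assumed definition, partial credit is never given, which makes the measure stricter for multiplex proteins than per-label accuracy.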
Affiliation(s)
- Chi-Hua Tung
- Department of Bioinformatics, Chung-Hua University, Hsinchu, Taiwan
- Chi-Wei Chen
- Institute of Genomics and Bioinformatics, National Chung Hsing University 250, Taichung 402, Taiwan
- Han-Hao Sun
- Institute of Genomics and Bioinformatics, National Chung Hsing University 250, Taichung 402, Taiwan
- Yen-Wei Chu
- Institute of Genomics and Bioinformatics, National Chung Hsing University 250, Taichung 402, Taiwan
- Biotechnology Center, Agricultural Biotechnology Center, Institute of Molecular Biology, Graduate Institute of Biotechnology, National Chung Hsing University 250, Taichung 402, Taiwan
|
18
|
Kim YH, Han ME, Oh SO. The molecular mechanism for nuclear transport and its application. Anat Cell Biol 2017; 50:77-85. [PMID: 28713609 PMCID: PMC5509903 DOI: 10.5115/acb.2017.50.2.77] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 03/15/2017] [Accepted: 03/31/2017] [Indexed: 12/30/2022] Open
Abstract
Transportation between the cytoplasm and the nucleoplasm is critical for many physiological and pathophysiological processes including gene expression, signal transduction, and oncogenesis. So, the molecular mechanism for the transportation needs to be studied not only to understand cell physiological processes but also to develop new diagnostic and therapeutic targets. Recent progress in the research of the nuclear transportation (import and export) via nuclear pore complex and four important factors affecting nuclear transport (nucleoporins, Ran, karyopherins, and nuclear localization signals/nuclear export signals) will be discussed. Moreover, the clinical significance of nuclear transport and its application will be reviewed. This review will provide some critical insight for the molecular design of therapeutics which need to be targeted inside the nucleus.
Affiliation(s)
- Yun Hak Kim
- Department of Anatomy, Pusan National University School of Medicine, Yangsan, Korea
- BEER, Busan Society of Evidence-Based mEdicine and Research, Busan, Korea
- Gene and Cell Therapy Research Center for Vessel-associated Diseases, Pusan National University, Yangsan, Korea
- Myoung-Eun Han
- Department of Anatomy, Pusan National University School of Medicine, Yangsan, Korea
- Gene and Cell Therapy Research Center for Vessel-associated Diseases, Pusan National University, Yangsan, Korea
- Sae-Ock Oh
- Department of Anatomy, Pusan National University School of Medicine, Yangsan, Korea
- Gene and Cell Therapy Research Center for Vessel-associated Diseases, Pusan National University, Yangsan, Korea
|
19
|
Phosphorylated and Nonphosphorylated PfMAP2 Are Localized in the Nucleus, Dependent on the Stage of Plasmodium falciparum Asexual Maturation. BIOMED RESEARCH INTERNATIONAL 2016; 2016:1645097. [PMID: 27525262 PMCID: PMC4976173 DOI: 10.1155/2016/1645097] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/17/2016] [Accepted: 06/16/2016] [Indexed: 11/30/2022]
Abstract
Plasmodium falciparum mitogen-activated protein (MAP) kinases, a family of enzymes central to signal transduction processes including inflammatory responses, are a promising target for antimalarial drug development. Our study shows for the first time that the P. falciparum specific MAP kinase 2 (PfMAP2) is colocalized in the nucleus of all of the asexual erythrocytic stages of P. falciparum and is particularly elevated in its phosphorylated form. It was also discovered that PfMAP2 is expressed in its highest quantity during the early trophozoite (ring form) stage and significantly reduced in the mature trophozoite and schizont stages. Although the phosphorylated form of the kinase is always more prevalent, its ratio relative to the nonphosphorylated form remained constant irrespective of the parasites' developmental stage. We have also shown that the TSH motif specifically renders PfMAP2 genetically divergent from the other plasmodial MAP kinase activation sites using Neighbour Joining analysis. Furthermore, TSH motif-specific designed antibody is crucial in determining the location of the expression of the PfMAP2 protein. However, by using immunoelectron microscopy, PPfMAP2 were detected ubiquitously in the parasitized erythrocytes. In summary, PfMAP2 may play a far more important role than previously thought and is a worthy candidate for research as an antimalarial.
|
20
|
The Pseudomonas aeruginosa PAO1 Two-Component Regulator CarSR Regulates Calcium Homeostasis and Calcium-Induced Virulence Factor Production through Its Regulatory Targets CarO and CarP. J Bacteriol 2016; 198:951-63. [PMID: 26755627 DOI: 10.1128/jb.00963-15] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 12/09/2015] [Accepted: 12/31/2015] [Indexed: 02/02/2023] Open
Abstract
Pseudomonas aeruginosa is an opportunistic human pathogen that causes severe, life-threatening infections in patients with cystic fibrosis (CF), endocarditis, wounds, or artificial implants. During CF pulmonary infections, P. aeruginosa often encounters environments where the levels of calcium (Ca(2+)) are elevated. Previously, we showed that P. aeruginosa responds to externally added Ca(2+) through enhanced biofilm formation, increased production of several secreted virulence factors, and by developing a transient increase in the intracellular Ca(2+) level, followed by its removal to the basal submicromolar level. However, the molecular mechanisms responsible for regulating Ca(2+)-induced virulence factor production and Ca(2+) homeostasis are not known. Here, we characterized the genome-wide transcriptional response of P. aeruginosa to elevated [Ca(2+)] in both planktonic cultures and biofilms. Among the genes induced by CaCl2 in strain PAO1 was an operon containing the two-component regulator PA2656-PA2657 (here called carS and carR), while the closely related two-component regulators phoPQ and pmrAB were repressed by CaCl2 addition. To identify the regulatory targets of CarSR, we constructed a deletion mutant of carR and performed transcriptome analysis of the mutant strain at low and high [Ca(2+)]. Among the genes regulated by CarSR in response to CaCl2 are the predicted periplasmic OB-fold protein, PA0320 (here called carO), and the inner membrane-anchored five-bladed β-propeller protein, PA0327 (here called carP). Mutations in both carO and carP affected Ca(2+) homeostasis, reducing the ability of P. aeruginosa to export excess Ca(2+). In addition, a mutation in carP had a pleiotropic effect in a Ca(2+)-dependent manner, altering swarming motility, pyocyanin production, and tobramycin sensitivity.
Overall, the results indicate that the two-component system CarSR is responsible for sensing high levels of external Ca(2+) and responding through its regulatory targets that modulate Ca(2+) homeostasis, surface-associated motility, and the production of the virulence factor pyocyanin. Importance: During infectious disease, Pseudomonas aeruginosa encounters environments with high calcium (Ca(2+)) concentrations, yet the cells maintain intracellular Ca(2+) at levels that are orders of magnitude less than that of the external environment. In addition, Ca(2+) signals P. aeruginosa to induce the production of several virulence factors. Compared to eukaryotes, little is known about how bacteria maintain Ca(2+) homeostasis or how Ca(2+) acts as a signal. In this study, we identified a two-component regulatory system in P. aeruginosa PAO1, termed CarSR, that is induced at elevated Ca(2+) levels. CarSR modulates Ca(2+) signaling and Ca(2+) homeostasis through its regulatory targets, CarO and CarP. The results demonstrate that P. aeruginosa uses a two-component regulatory system to sense external Ca(2+) and relays that information for Ca(2+)-dependent cellular processes.
|
21
|
Nucleic acid import into mitochondria: New insights into the translocation pathways. BIOCHIMICA ET BIOPHYSICA ACTA-MOLECULAR CELL RESEARCH 2015; 1853:3165-81. [DOI: 10.1016/j.bbamcr.2015.09.011] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 05/29/2015] [Revised: 08/16/2015] [Accepted: 09/10/2015] [Indexed: 11/18/2022]
|
22
|
Chaudhari P, Ahmed B, Joly DL, Germain H. Effector biology during biotrophic invasion of plant cells. Virulence 2015; 5:703-9. [PMID: 25513771 PMCID: PMC4189876 DOI: 10.4161/viru.29652] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 11/19/2022] Open
Abstract
Several obligate biotrophic phytopathogens, namely oomycetes and fungi, invade and feed on living plant cells through specialized structures known as haustoria. Deploying an arsenal of secreted proteins called effectors, these pathogens balance their parasitic propagation by subverting plant immunity without sacrificing host cells. Such secreted proteins, which are thought to be delivered by haustoria, conceivably reprogram host cells and instigate structural modifications, in addition to the modulation of various cellular processes. As effectors represent tools to assist disease resistance breeding, this short review provides a bird’s eye view on the relationship between the virulence function of effectors and their subcellular localization in host cells.
Affiliation(s)
- Prateek Chaudhari
- Groupe de Recherche en Biologie Végétale; Département de Chimie, Biochimie et Physique; Université du Québec à Trois-Rivières; Trois-Rivières, QC, Canada
|
23
|
Arango-Argoty GA, Jaramillo-Garzón JA, Castellanos-Domínguez G. Feature extraction by statistical contact potentials and wavelet transform for predicting subcellular localizations in gram negative bacterial proteins. J Theor Biol 2015; 364:121-30. [PMID: 25219623 DOI: 10.1016/j.jtbi.2014.08.051] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/08/2013] [Revised: 08/27/2014] [Accepted: 08/28/2014] [Indexed: 11/16/2022]
Abstract
Predicting the localization of a protein has become a useful practice for inferring its function. Most of the reported methods to predict subcellular localizations in Gram-negative bacterial proteins make use of standard protein representations that generally do not take into account the distribution of the amino acids and the structural information of the proteins. Here, we propose a protein representation based on the structural information contained in the pairwise statistical contact potentials. The wavelet transform decodes the information contained in the primary structure of the proteins, allowing the identification of patterns along the proteins, which are used to characterize the subcellular localizations. Then, a support vector machine classifier is trained to categorize them. Cellular compartments like periplasm and extracellular medium are difficult to predict, having a high false negative rate. The wavelet-based method achieves an overall high performance while maintaining a low false negative rate, particularly, on "periplasm" and "extracellular medium". Our results suggest the proposed protein characterization is a useful alternative to representing and predicting protein sequences over the classical and cutting edge protein depictions.
Affiliation(s)
- G A Arango-Argoty
- Signal Processing and Recognition Group, Universidad Nacional de Colombia, s. Manizales, Campus La Nubia, km 7 via al Magdalena, Manizales, Colombia
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, 3501 Fifth Ave, Pittsburgh, PA 15260, USA
- J A Jaramillo-Garzón
- Signal Processing and Recognition Group, Universidad Nacional de Colombia, s. Manizales, Campus La Nubia, km 7 via al Magdalena, Manizales, Colombia
- Research Center of the Instituto Tecnologico Metropolitano, Calle 73 No 76A-354, Medellín, Colombia
- G Castellanos-Domínguez
- Signal Processing and Recognition Group, Universidad Nacional de Colombia, s. Manizales, Campus La Nubia, km 7 via al Magdalena, Manizales, Colombia
|
24
|
Secondary and Tertiary Structure Prediction of Proteins: A Bioinformatic Approach. COMPLEX SYSTEM MODELLING AND CONTROL THROUGH INTELLIGENT SOFT COMPUTATIONS 2015. [DOI: 10.1007/978-3-319-12883-2_19] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022]
|
25
|
Chen J, Tang YY, Chen CLP, Fang B, Lin Y, Shang Z. Multi-Label Learning With Fuzzy Hypergraph Regularization for Protein Subcellular Location Prediction. IEEE Trans Nanobioscience 2014; 13:438-47. [DOI: 10.1109/tnb.2014.2341111] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/10/2022]
|
26
|
Foong PM, Abedi Karjiban R, Normi YM, Salleh AB, Abdul Rahman MB. Bioinformatics survey of the metal usage by psychrophilic yeast Glaciozyma antarctica PI12. Metallomics 2014; 7:156-64. [PMID: 25412156 DOI: 10.1039/c4mt00163j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Metal ions are one of the essential elements which are extensively involved in many cellular activities. With rapid advancements in genome sequencing techniques, bioinformatics approaches have provided a promising way to extract functional information of a protein directly from its primary structure. Recent findings have suggested that the metal content of an organism can be predicted from its complete genome sequences. Characterizing the biological metal usage of cold-adapted organisms may help to outline a comprehensive understanding of the metal-partnerships between the psychrophile and its adjacent environment. The focus of this study is targeted towards the analysis of the metal composition of a psychrophilic yeast Glaciozyma antarctica PI12 isolated from sea ice of Antarctica. Since the cellular metal content of an organism is usually reflected in the expressed metal-binding proteins, the putative metal-binding sequences from G. antarctica PI12 were identified with respect to their sequence homologies, domain compositions, protein families and cellular distribution. Most of the analyses revealed that the proteome was enriched with zinc, and the content of metal decreased in the order of Zn > Fe > Mg > Mn, Ca > Cu. Upon comparison, it was found that the metal compositions among yeasts were almost identical. These observations suggested that G. antarctica PI12 could have inherited a conserved trend of metal usage similar to modern eukaryotes, despite its geographically isolated habitat.
Affiliation(s)
- Pik Mun Foong
- Enzyme and Microbial Technology Research Center (EMTech), Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor Darul Ehsan, Malaysia
|
27
|
Peng PH, Lin CH, Tsai HW, Lin TY. Cold response in Phalaenopsis aphrodite and characterization of PaCBF1 and PaICE1. PLANT & CELL PHYSIOLOGY 2014; 55:1623-35. [PMID: 24974386 DOI: 10.1093/pcp/pcu093] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 05/11/2023]
Abstract
Phalaenopsis is a winter-blooming orchid genus commonly cultivated in tropical Asian countries. Because orchids are one of the most economically important flower crops in Taiwan, it is crucial to understand their response to cold and other abiotic stresses. The present study focused on gene regulation of P. aphrodite in response to abiotic stress, mainly cold. Our results demonstrate that P. aphrodite is sensitive to low temperatures, especially in its reproductive stage. We found that after exposure to 4°C, plants in the vegetative stage maintained better membrane integrity and photosynthetic capacity than in the flowering stage. At the molecular level, C-repeat binding factor1 (PaCBF1) and its putative target gene dehydrin1 (PaDHN1) mRNAs were induced by cold, whereas inducer of CBF expression1 (PaICE1) mRNA was constitutively expressed. PaICE1 transactivated MYC motifs in the PaCBF1 promoter, indicating that up-regulation of PaCBF1 may be mediated by the binding of PaICE1 to MYC motifs. Overexpression of PaCBF1 in transgenic Arabidopsis induced AtCOR6.6 and RD29a without cold stimulus and maintained better membrane integrity after cold stress. Herein, we present evidence that cold induction of PaCBF1 transcripts in P. aphrodite may be transactivated by PaICE1 and consequently protect plants from cold damage through up-regulation of cold-regulated (COR) genes, such as DHN. To our knowledge, this study is the first report of the isolation and characterization of CBF, DHN and ICE genes in the Orchidaceae family.
Affiliation(s)
- Po-Hsin Peng
- Institute of Bioinformatics and Structural Biology and Department of Life Science, National Tsing Hua University, No. 101, Sec. 2, Kuang Fu Road, Hsinchu 30013, Taiwan, Republic of China
- Chia-Hui Lin
- Institute of Bioinformatics and Structural Biology and Department of Life Science, National Tsing Hua University, No. 101, Sec. 2, Kuang Fu Road, Hsinchu 30013, Taiwan, Republic of China
- Hui-Wen Tsai
- Institute of Bioinformatics and Structural Biology and Department of Life Science, National Tsing Hua University, No. 101, Sec. 2, Kuang Fu Road, Hsinchu 30013, Taiwan, Republic of China
- Tsai-Yun Lin
- Institute of Bioinformatics and Structural Biology and Department of Life Science, National Tsing Hua University, No. 101, Sec. 2, Kuang Fu Road, Hsinchu 30013, Taiwan, Republic of China
|
28
|
Jaiswal DK, Ray D, Choudhary MK, Subba P, Kumar A, Verma J, Kumar R, Datta A, Chakraborty S, Chakraborty N. Comparative proteomics of dehydration response in the rice nucleus: new insights into the molecular basis of genotype-specific adaptation. Proteomics 2014; 13:3478-97. [PMID: 24133045 DOI: 10.1002/pmic.201300284] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 07/10/2013] [Revised: 09/10/2013] [Accepted: 09/23/2013] [Indexed: 01/04/2023]
Abstract
Dehydration is the most crucial environmental factor that considerably reduces the crop harvest index, and thus has become a concern for global agriculture. To better understand the role of nuclear proteins in water-deficit condition, a nuclear proteome was developed from a dehydration-sensitive rice cultivar IR-64 followed by its comparison with that of a dehydration-tolerant c.v. Rasi. The 2DE protein profiling of c.v. IR-64 coupled with MS/MS analysis led to the identification of 93 dehydration-responsive proteins (DRPs). Among those identified proteins, 78 were predicted to be destined to the nucleus, accounting for more than 80% of the dataset. While the detected number of protein spots in c.v. IR-64 was higher when compared with that of Rasi, the number of DRPs was found to be less. Fifty-seven percent of the DRPs were found to be common to both sensitive and tolerant cultivars, indicating significant differences between the two nuclear proteomes. Further, we constructed a functional association network of the DRPs of c.v. IR-64, which suggests that a significant number of the proteins are capable of interacting with each other. The combination of nuclear proteome and interactome analyses would elucidate stress-responsive signaling and the molecular basis of dehydration tolerance in plants.
|
29
|
Computational and experimental approaches to reveal the effects of single nucleotide polymorphisms with respect to disease diagnostics. Int J Mol Sci 2014; 15:9670-717. [PMID: 24886813 PMCID: PMC4100115 DOI: 10.3390/ijms15069670] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 04/08/2014] [Revised: 05/15/2014] [Accepted: 05/16/2014] [Indexed: 12/25/2022] Open
Abstract
DNA mutations are the cause of many human diseases and they are the reason for natural differences among individuals by affecting the structure, function, interactions, and other properties of DNA and expressed proteins. The ability to predict whether a given mutation is disease-causing or harmless is of great importance for the early detection of patients with a high risk of developing a particular disease and would pave the way for personalized medicine and diagnostics. Here we review existing methods and techniques to study and predict the effects of DNA mutations from three different perspectives: in silico, in vitro and in vivo. It is emphasized that the problem is complicated and successful detection of a pathogenic mutation frequently requires a combination of several methods and a knowledge of the biological phenomena associated with the corresponding macromolecules.
|
30
|
Amanzadeh E, Mohabatkar H, Biria D. Classification of DNA Minor and Major Grooves Binding Proteins According to the NLSs by Data Analysis Methods. Appl Biochem Biotechnol 2014; 174:437-51. [DOI: 10.1007/s12010-014-0926-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/01/2013] [Accepted: 04/17/2014] [Indexed: 11/30/2022]
|
31
|
Song T, Gu H. Discriminative motif discovery via simulated evolution and random under-sampling. PLoS One 2014; 9:e87670. [PMID: 24551063 PMCID: PMC3923751 DOI: 10.1371/journal.pone.0087670] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/07/2013] [Accepted: 12/29/2013] [Indexed: 11/22/2022] Open
Abstract
Conserved motifs in biological sequences are closely related to their structure and functions. Recently, discriminative motif discovery methods have attracted more and more attention. However, little attention has been devoted to the data imbalance problem, which is one of the main reasons affecting the performance of the discriminative models. In this article, a simulated evolution method is applied to solve the multi-class imbalance problem at the stage of data preprocessing, and at the stage of Hidden Markov Models (HMMs) training, a random under-sampling method is introduced for the imbalance between the positive and negative datasets. It is shown that, in the task of discovering targeting motifs of nine subcellular compartments, the motifs found by our method are more conserved than those found by methods that do not consider the data imbalance problem, and they recover the most known targeting motifs from Minimotif Miner and InterPro. Meanwhile, we use the found motifs to predict protein subcellular localization and achieve higher prediction precision and recall for the minority classes.
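As an illustration of the random under-sampling step mentioned in this abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function name `random_under_sample`, the toy sequences, and the choice to shrink every class to the size of the smallest one are assumptions made for this example only.

```python
import random
from collections import Counter


def random_under_sample(samples, labels, seed=0):
    """Randomly discard samples so that every class has as many
    members as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    # The smallest class sets the target size for all classes.
    target = min(len(members) for members in by_class.values())

    balanced = []
    for label, members in by_class.items():
        # rng.sample draws `target` members without replacement.
        for sample in rng.sample(members, target):
            balanced.append((sample, label))

    rng.shuffle(balanced)
    return balanced


# Toy data: five negative sequences and one positive sequence.
sequences = ["MKT", "MLS", "MAA", "MRR", "MKK", "GGS"]
labels = ["neg", "neg", "neg", "neg", "neg", "pos"]

balanced = random_under_sample(sequences, labels)
class_counts = Counter(label for _, label in balanced)
print(class_counts)
```

After the call, both classes have the same number of members, so a classifier trained on `balanced` is no longer dominated by the majority class. The cost is that majority-class examples are thrown away.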
Affiliation(s)
- Tao Song
- Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, Liaoning, China
- Hong Gu
- Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, Liaoning, China
|
32
|
Combination of site directed mutagenesis and secondary structure analysis predicts the amino acids essential for stability of M. leprae MurE. Interdiscip Sci 2014; 6:40-7. [PMID: 24464703 DOI: 10.1007/s12539-014-0185-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/24/2012] [Revised: 03/22/2013] [Accepted: 04/23/2013] [Indexed: 10/25/2022]
Abstract
The life-threatening infections caused by Mycobacterium leprae (Mle) remain a major challenge in developing countries as well as globally, and there is a need to design potent anti-leprosy drugs. In our previous studies, ATP-dependent Mle-MurE ligase involved in biosynthesis of peptidoglycan was identified as one of the common drug targets, homology modeled and reported. In this work, an in silico site-directed mutagenesis study was carried out on the homology-modeled Mle-MurE ligase. This predicted the amino acids essential for stability. In addition, the distribution of these residues in different secondary structures and in active sites was analyzed. Finally, the role of the conserved residues in stability and function was analyzed. The availability of the Mle-MurE ligase model, together with insights gained from stability studies and docking studies, will promote the rational design of potent and selective Mle-MurE ligase inhibitors as anti-leprosy therapeutics.
|
33
|
Feiglin A, Ashkenazi S, Schlessinger A, Rost B, Ofran Y. Co-expression and co-localization of hub proteins and their partners are encoded in protein sequence. Mol Biosyst 2014; 10:787-94. [PMID: 24457447 DOI: 10.1039/c3mb70411d] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/23/2023]
Abstract
Spatiotemporal coordination is a critical factor in biological processes. Some hubs in protein-protein interaction networks tend to be co-expressed and co-localized with their partners more strongly than others, a difference which is arguably related to functional differences between the hubs. Based on numerous analyses of yeast hubs, it has been suggested that differences in co-expression and co-localization are reflected in the structural and molecular characteristics of the hubs. We hypothesized that if indeed differences in co-expression and co-localization are encoded in the molecular characteristics of the protein, it may be possible to predict the tendency for co-expression and co-localization of human hubs based on features learned from systematically characterized yeast hubs. Thus, we trained a prediction algorithm on hubs from yeast that were classified as either strongly or weakly co-expressed and co-localized with their partners, and applied the trained model to 800 human hub proteins. We found that the algorithm significantly distinguishes between human hubs that are co-expressed and co-localized with their partners and hubs that are not. The prediction is based on sequence derived features such as "stickiness", i.e. the existence of multiple putative binding sites that enable multiple simultaneous interactions, "plasticity", i.e. the existence of predicted structural disorder which conjecturally allows for multiple consecutive interactions with the same binding site and predicted subcellular localization. These results suggest that spatiotemporal dynamics is encoded, at least in part, in the amino acid sequence of the protein and that this encoding is similar in yeast and in human.
Affiliation(s)
- Ariel Feiglin
- The Goodman faculty of life sciences, Bar Ilan University, Ramat Gan 52900, Israel.
|
34
|
Teso S, Passerini A. Joint probabilistic-logical refinement of multiple protein feature predictors. BMC Bioinformatics 2014; 15:16. [PMID: 24428894 PMCID: PMC3929554 DOI: 10.1186/1471-2105-15-16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/03/2012] [Accepted: 11/06/2013] [Indexed: 11/24/2022] Open
Abstract
Background Computational methods for the prediction of protein features from sequence are a long-standing focus of bioinformatics. A key observation is that several protein features are closely inter-related, that is, they are conditioned on each other. Researchers invested a lot of effort into designing predictors that exploit this fact. Most existing methods leverage inter-feature constraints by including known (or predicted) correlated features as inputs to the predictor, thus conditioning the result. Results By including correlated features as inputs, existing methods only rely on one side of the relation: the output feature is conditioned on the known input features. Here we show how to jointly improve the outputs of multiple correlated predictors by means of a probabilistic-logical consistency layer. The logical layer enforces a set of weighted first-order rules encoding biological constraints between the features, and improves the raw predictions so that they least violate the constraints. In particular, we show how to integrate three stand-alone predictors of correlated features: subcellular localization (Loctree [J Mol Biol 348:85–100, 2005]), disulfide bonding state (Disulfind [Nucleic Acids Res 34:W177–W181, 2006]), and metal bonding state (MetalDetector [Bioinformatics 24:2094–2095, 2008]), in a way that takes into account the respective strengths and weaknesses, and does not require any change to the predictors themselves. We also compare our methodology against two alternative refinement pipelines based on state-of-the-art sequential prediction methods. Conclusions The proposed framework is able to improve the performance of the underlying predictors by removing rule violations. We show that different predictors offer complementary advantages, and our method is able to integrate them using non-trivial constraints, generating more consistent predictions. 
In addition, our framework is fully general, and could in principle be applied to a vast array of heterogeneous predictions without requiring any change to the underlying software. On the other hand, the alternative strategies are more specific and tend to favor one task at the expense of the others, as shown by our experimental evaluation. The ultimate goal of our framework is to seamlessly integrate full prediction suites, such as Distill [BMC Bioinformatics 7:402, 2006] and PredictProtein [Nucleic Acids Res 32:W321–W326, 2004].
Affiliation(s)
- Stefano Teso
- Department of Information Engineering and Computer Science, Università degli Studi di Trento, Trento, Italy.
|
35
|
Pan S, Carter CJ, Raikhel NV. Understanding protein trafficking in plant cells through proteomics. Expert Rev Proteomics 2005; 2:781-92. [PMID: 16209656 DOI: 10.1586/14789450.2.5.781] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
The functions of approximately one-third of the proteins encoded by the Arabidopsis thaliana genome are completely unknown. Moreover, many annotations of the remainder of the genome supply tentative functions, at best. Knowing the ultimate localization of these proteins, as well as the pathways used for getting there, may provide clues as to their functions. The putative localization of most proteins currently relies on in silico-based bioinformatics approaches, which, unfortunately, often result in erroneous predictions. Emerging proteomics techniques coupled with other systems biology approaches now provide researchers with a plethora of methods for elucidating the final location of these proteins on a large scale, as well as the ability to dissect protein-sorting pathways in plants.
Affiliation(s)
- Songqin Pan
- WM Keck Proteomics Laboratory, Center for Plant Cell Biology, Botany & Plant Sciences, University of California, Riverside, CA 92521, USA.
|
36
|
Govindan G, Nair AS. Bagging with CTD--a novel signature for the hierarchical prediction of secreted protein trafficking in eukaryotes. Genomics Proteomics Bioinformatics 2013; 11:385-90. [PMID: 24316328 PMCID: PMC4357838 DOI: 10.1016/j.gpb.2013.07.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/02/2013] [Revised: 07/01/2013] [Accepted: 07/17/2013] [Indexed: 11/19/2022]
Abstract
Protein trafficking or protein sorting in eukaryotes is a complicated process and is carried out based on the information contained in the protein. Many methods have been reported for predicting the subcellular location of proteins from sequence information. However, most of these prediction methods use a flat structure or parallel architecture to perform prediction. In this work, we introduce ensemble classifiers with features that are extracted directly from full-length protein sequences to predict locations in the protein-sorting pathway hierarchically. Sequence-driven features, sequence-mapped features and sequence autocorrelation features were tested with ensemble learners and their performances were compared. When evaluated by independent data testing, ensemble-based bagging algorithms with the sequence feature composition, transition and distribution (CTD) successfully classified two datasets with accuracies greater than 90%. We compared our results with similar published methods, and our method performed on par with the others at two levels in the secreted pathway. This study shows that the feature CTD extracted from protein sequences is effective in capturing biological features among compartments in secreted pathways.
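To make the composition, transition and distribution (CTD) descriptor named in this abstract more concrete, here is a minimal Python sketch. It covers only the feature extraction, not the bagging ensemble. The three-class residue grouping, the function name `ctd_features`, and the exact way the distribution positions are taken are assumptions for illustration; the paper may define these details differently.

```python
import math
from itertools import combinations

# Assumed three-class grouping of residues (illustrative only).
GROUPS = {"1": "RKEDQN", "2": "GASTPHY", "3": "CLVIMFW"}
CLASS_OF = {aa: cls for cls, members in GROUPS.items() for aa in members}


def ctd_features(sequence):
    """Return composition (3), transition (3) and distribution (15)
    values for one sequence under the grouping above."""
    encoded = [CLASS_OF[aa] for aa in sequence if aa in CLASS_OF]
    n = len(encoded)
    if n < 2:
        raise ValueError("need at least two recognised residues")
    classes = sorted(GROUPS)

    # Composition: fraction of residues that fall in each class.
    composition = [encoded.count(c) / n for c in classes]

    # Transition: fraction of adjacent pairs that switch between
    # two given classes, in either direction.
    transition = []
    for a, b in combinations(classes, 2):
        switches = sum(
            1 for x, y in zip(encoded, encoded[1:]) if {x, y} == {a, b}
        )
        transition.append(switches / (n - 1))

    # Distribution: relative position of the first, 25%, 50%, 75%
    # and last occurrence of each class along the sequence.
    distribution = []
    for c in classes:
        positions = [i + 1 for i, x in enumerate(encoded) if x == c]
        for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
            if not positions:
                distribution.append(0.0)
                continue
            k = max(1, math.ceil(fraction * len(positions)))
            distribution.append(positions[k - 1] / n)

    return composition + transition + distribution


features = ctd_features("KKAALL")
print(len(features))
```

Each sequence maps to a fixed-length vector (21 values under this grouping) regardless of its length, which is what lets such descriptors be fed to ensemble learners such as bagged classifiers.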
Affiliation(s)
- Geetha Govindan
- Department of Computational Biology and Bioinformatics, University of Kerala, Thiruvananthapuram 695581, India.
- Achuthsankar S Nair
- Department of Computational Biology and Bioinformatics, University of Kerala, Thiruvananthapuram 695581, India
|
37
|
Kaundal R, Sahu SS, Verma R, Weirick T. Identification and characterization of plastid-type proteins from sequence-attributed features using machine learning. BMC Bioinformatics 2013; 14 Suppl 14:S7. [PMID: 24266945 PMCID: PMC3851450 DOI: 10.1186/1471-2105-14-s14-s7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND Plastids are an important component of plant cells, being the site of manufacture and storage of chemical compounds used by the cell, and contain pigments such as those used in photosynthesis, starch synthesis/storage, cell color etc. They are essential organelles of the plant cell, also present in algae. Recent advances in genomic technology and sequencing efforts are generating a huge amount of DNA sequence data every day. The predicted proteome of these genomes needs annotation at a faster pace. In view of this, one such annotation need is to develop an automated system that can distinguish between plastid and non-plastid proteins accurately, and further classify plastid-types based on their functionality. We compared the amino acid compositions of plastid proteins with those of non-plastid ones and found significant differences, which were used as a basis to develop various feature-based prediction models using similarity-search and machine learning. RESULTS In this study, we developed separate Support Vector Machine (SVM) trained classifiers for characterizing the plastids in two steps: first distinguishing the plastid vs. non-plastid proteins, and then classifying the identified plastids into their various types based on their function (chloroplast, chromoplast, etioplast, and amyloplast). Five diverse protein features: amino acid composition, dipeptide composition, the pseudo amino acid composition, N(terminal)-Center-C(terminal) composition and the protein physicochemical properties are used to develop SVM models. Overall, the dipeptide composition-based module shows the best performance with an accuracy of 86.80% and Matthews Correlation Coefficient (MCC) of 0.74 in phase-I and 78.60% with a MCC of 0.44 in phase-II. On independent test data, this model also performs better with an overall accuracy of 76.58% and 74.97% in phase-I and phase-II, respectively. 
The similarity-based PSI-BLAST module shows very low performance with about 50% prediction accuracy for distinguishing plastid vs. non-plastids and only 20% in classifying various plastid-types, indicating the need and importance of machine learning algorithms. CONCLUSION The current work is a first attempt to develop a methodology for classifying various plastid-type proteins. The prediction modules have also been made available as a web tool, PLpred available at http://bioinfo.okstate.edu/PLpred/ for real time identification/characterization. We believe this tool will be very useful in the functional annotation of various genomes.
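The dipeptide composition feature and the Matthews Correlation Coefficient quoted in this abstract can be sketched in a few lines of Python. This is a hedged illustration rather than the PLpred code: the function names `dipeptide_composition` and `mcc`, the decision to skip dipeptides that contain non-standard residues, and returning 0.0 when the MCC denominator is zero are all assumptions made here.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention assumed here for the degenerate case
    return (tp * tn - fp * fn) / denominator


vector = dipeptide_composition("ACAC")
print(len(vector), vector[DIPEPTIDES.index("AC")])
print(mcc(tp=50, tn=50, fp=0, fn=0))
```

A vector like `vector` would be the input to a classifier such as an SVM, and `mcc` summarises the resulting confusion matrix in a single value between -1 and 1, which is more informative than accuracy when the classes are imbalanced.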
|
38
|
Adelfio A, Volpato V, Pollastri G. SCLpredT: Ab initio and homology-based prediction of subcellular localization by N-to-1 neural networks. Springerplus 2013; 2:502. [PMID: 24133649 PMCID: PMC3795874 DOI: 10.1186/2193-1801-2-502] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/13/2013] [Accepted: 09/25/2013] [Indexed: 01/20/2023]
Abstract
The prediction of protein subcellular localization is an important step towards the prediction of protein function, and considerable effort has gone over the last decade into the development of computational predictors of protein localization. In this article we design a new predictor of protein subcellular localization, based on a Machine Learning model (N-to-1 Neural Networks) which we have recently developed. This system, in three versions specialised, respectively, on Plants, Fungi and Animals, has a rich output which incorporates the class “organelle” alongside cytoplasm, nucleus, mitochondria and extracellular, and, additionally, chloroplast in the case of Plants. We investigate the information gain of introducing additional inputs, including predicted secondary structure, and localization information from homologous sequences. To accommodate the latter we design a new algorithm which we present here for the first time. While we do not observe any improvement when including predicted secondary structure, we measure significant overall gains when adding homology information. The final predictor including homology information correctly predicts 74%, 79% and 60% of all proteins in the case of Fungi, Animals and Plants, respectively, and outperforms our previous, state-of-the-art predictor SCLpred, and the popular predictor BaCelLo. We also observe that the contribution of homology information becomes dominant over sequence information for sequence identity values exceeding 50% for Animals and Fungi, and 60% for Plants, confirming that subcellular localization is less conserved than structure. SCLpredT is publicly available at http://distillf.ucd.ie/sclpredt/. Sequence- or template-based predictions can be obtained, and up to 32 kbytes of input can be processed in a single submission.
Affiliation(s)
- Alessandro Adelfio
- School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4 Ireland ; Complex and Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4 Ireland
|
39
|
Colcombet J, Lopez-Obando M, Heurtevin L, Bernard C, Martin K, Berthomé R, Lurin C. Systematic study of subcellular localization of Arabidopsis PPR proteins confirms a massive targeting to organelles. RNA Biol 2013; 10:1557-75. [PMID: 24037373 PMCID: PMC3858439 DOI: 10.4161/rna.26128] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Indexed: 11/19/2022] Open
Abstract
Four hundred and fifty-eight genes coding for PentatricoPeptide Repeat (PPR) proteins are annotated in the Arabidopsis thaliana genome. Over the past 10 years, numerous reports have shown that many of these proteins function in organelles to target specific transcripts and are involved in post-transcriptional regulation. Therefore, they are thought to be important players in the coordination between nuclear and organelle genome expression. Only four of these proteins have been described to be addressed outside organelles, indicating that some PPRs could function in post-transcriptional regulations of nuclear genes. In this work, we updated and improved our current knowledge on the localization of PPR proteins of Arabidopsis within the plant cell. We particularly investigated the subcellular localization of 166 PPR proteins whose targeting predictions were ambiguous, using a combination of high-throughput cloning and microscopy. Through systematic localization experiments and data integration, we confirmed that PPR proteins are largely targeted to organelles and showed that dual targeting to both the mitochondria and plastid occurs more frequently than expected. These results allow us to speculate that dual-targeted PPR proteins could be important for the fine coordination of gene expressions in both organelles.
Affiliation(s)
- Jean Colcombet
- Unité de Recherche en Génomique Végétale (URGV); UMR INRA/UEVE - ERL CNRS 91057; CP 5708; 91057 EVRY CEDEX, France
- Mauricio Lopez-Obando
- Unité de Recherche en Génomique Végétale (URGV); UMR INRA/UEVE - ERL CNRS 91057; CP 5708; 91057 EVRY CEDEX, France
- Laure Heurtevin
- Unité de Recherche en Génomique Végétale (URGV); UMR INRA/UEVE - ERL CNRS 91057; CP 5708; 91057 EVRY CEDEX, France
- Clément Bernard
- Unité de Recherche en Génomique Végétale (URGV); UMR INRA/UEVE - ERL CNRS 91057; CP 5708; 91057 EVRY CEDEX, France
- Karine Martin
- Unité de Recherche en Génomique Végétale (URGV); UMR INRA/UEVE - ERL CNRS 91057; CP 5708; 91057 EVRY CEDEX, France
- Richard Berthomé
- Unité de Recherche en Génomique Végétale (URGV); UMR INRA/UEVE - ERL CNRS 91057; CP 5708; 91057 EVRY CEDEX, France
- Claire Lurin
- Unité de Recherche en Génomique Végétale (URGV); UMR INRA/UEVE - ERL CNRS 91057; CP 5708; 91057 EVRY CEDEX, France
|
40
|
Kumar S, Puniya BL, Parween S, Nahar P, Ramachandran S. Identification of novel adhesins of M. tuberculosis H37Rv using integrated approach of multiple computational algorithms and experimental analysis. PLoS One 2013; 8:e69790. [PMID: 23922800 PMCID: PMC3726780 DOI: 10.1371/journal.pone.0069790] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 04/16/2013] [Accepted: 06/18/2013] [Indexed: 01/24/2023] Open
Abstract
Pathogenic bacteria interacting with eukaryotic host express adhesins on their surface. These adhesins aid in bacterial attachment to the host cell receptors during colonization. A few adhesins such as Heparin binding hemagglutinin adhesin (HBHA), Apa, Malate Synthase of M. tuberculosis have been identified using specific experimental interaction models based on the biological knowledge of the pathogen. In the present work, we carried out computational screening for adhesins of M. tuberculosis. We used an integrated computational approach using SPAAN for predicting adhesins, PSORTb, SubLoc and LocTree for extracellular localization, and BLAST for verifying non-similarity to human proteins. These steps are among the first of reverse vaccinology. Multiple claims and attacks from different algorithms were processed through argumentative approach. Additional filtration criteria included selection for proteins with low molecular weights and absence of literature reports. We examined binding potential of the selected proteins using an image based ELISA. The protein Rv2599 (membrane protein) binds to human fibronectin, laminin and collagen. Rv3717 (N-acetylmuramoyl-L-alanine amidase) and Rv0309 (L,D-transpeptidase) bind to fibronectin and laminin. We report Rv2599 (membrane protein), Rv0309 and Rv3717 as novel adhesins of M. tuberculosis H37Rv. Our results expand the number of known adhesins of M. tuberculosis and suggest their regulated expression in different stages.
Affiliation(s)
- Sanjiv Kumar
- Functional Genomics Unit, Council of Scientific and Industrial Research -Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, India
- Bhanwar Lal Puniya
- Functional Genomics Unit, Council of Scientific and Industrial Research -Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, India
- Shahila Parween
- Functional Genomics Unit, Council of Scientific and Industrial Research -Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, India
- Pradip Nahar
- Functional Genomics Unit, Council of Scientific and Industrial Research -Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, India
- Srinivasan Ramachandran
- Functional Genomics Unit, Council of Scientific and Industrial Research -Institute of Genomics and Integrative Biology (CSIR-IGIB), New Delhi, India
|
41
|
Kim YB, Li X, Kim SJ, Kim HH, Lee J, Kim H, Park SU. MYB transcription factors regulate glucosinolate biosynthesis in different organs of Chinese cabbage (Brassica rapa ssp. pekinensis). Molecules 2013; 18:8682-95. [PMID: 23881053 PMCID: PMC6269701 DOI: 10.3390/molecules18078682] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Received: 05/23/2013] [Revised: 07/11/2013] [Accepted: 07/18/2013] [Indexed: 11/16/2022] Open
Abstract
In this study, we investigated the expression of seven MYB transcription factors (a total of 17 genes that included Dof1.1, IQD1-1, MYB28, MYB29, MYB34, MYB51, and MYB122 and their isoforms) involved in aliphatic and indolic glucosinolate (GSL) biosynthesis and analyzed the aliphatic and indolic GSL content in different organs of Chinese cabbage (Brassica rapa ssp. pekinensis). MYB28 and MYB29 expression in the stem was dramatically different when compared with the levels in the other organs. MYB34, MYB122, MYB51, Dof1.1, and IQD1-1 showed very low transcript levels among different organs. HPLC analysis showed that the glucosinolates (GSLs) consisted of five aliphatic GSLs (progoitrin, sinigrin, glucoalyssin, gluconapin, and glucobrassicanapin) and four indolic GSLs (4-hydroxyglucobrassicin, glucobrassicin, 4-methoxyglucobrassicin, and neoglucobrassicin). Aliphatic GSLs accounted for 63.3% of the total GSL content, followed by aromatic GSL (19.0%), indolic GSLs (10%), and unknown GSLs (7.7%) in different organs of Chinese cabbage. The total GSL content of different parts (ranked in descending order) was as follows: seed > flower > young leaves > stem > root > old leaves. The relationship between GSL accumulation and expression of GSL biosynthesis MYB TF genes in different organs may be helpful to understand the mechanism of MYB TFs regulating GSL biosynthesis in Chinese cabbage.
Affiliation(s)
- Yeon Bok Kim
- Department of Crop Science, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 305-764, Korea
- Xiaohua Li
- Department of Crop Science, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 305-764, Korea
- Sun-Ju Kim
- Department of Bio-Environmental Chemistry, Chungnam National University, 99 Daehak-Ro, Yuseong-Gu, Daejeon 305-764, Korea
- Haeng Hoon Kim
- Department of Well-being Resources, Sunchon National University, 413 Jungangno, Suncheon, Jeollanam-do, 540-742, Korea
- Jeongyeo Lee
- Green Bio Research Center, Cabbage Genomics Assisted Breeding Supporting Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Gwahangno 111, Daejeon 305-806, Korea
- HyeRan Kim
- Green Bio Research Center, Cabbage Genomics Assisted Breeding Supporting Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Gwahangno 111, Daejeon 305-806, Korea
- Sang Un Park
- Department of Crop Science, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 305-764, Korea
|
42
|
Mirabello C, Pollastri G. Porter, PaleAle 4.0: high-accuracy prediction of protein secondary structure and relative solvent accessibility. Bioinformatics 2013; 29:2056-8. [DOI: 10.1093/bioinformatics/btt344] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Indexed: 11/14/2022] Open
|
43
|
Abstract
Motivation: Subcellular localization is one aspect of protein function. Despite advances in high-throughput imaging, localization maps remain incomplete. Several methods accurately predict localization, but many challenges remain to be tackled. Results: In this study, we introduced a framework to predict localization in life's three domains, including globular and membrane proteins (3 classes for archaea; 6 for bacteria and 18 for eukaryota). The resulting method, LocTree2, works well even for protein fragments. It uses a hierarchical system of support vector machines that imitates the cascading mechanism of cellular sorting. The method reaches high levels of sustained performance (eukaryota: Q18=65%, bacteria: Q6=84%). LocTree2 also accurately distinguishes membrane and non-membrane proteins. In our hands, it compared favorably with top methods when tested on new data. Availability: Online through PredictProtein (predictprotein.org); as standalone version at http://www.rostlab.org/services/loctree2. Contact: localization@rostlab.org Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tatyana Goldberg
- TUM, Bioinformatik-I12, Informatik, Boltzmannstrasse 3, Garching 85748, Germany.
|
44
|
Arango-Argoty GA, Jaramillo-Garzón JA, Castellanos-Domínguez CG. Contact potentials via wavelet transform for prediction of subcellular localizations in gram negative bacterial proteins. Annu Int Conf IEEE Eng Med Biol Soc 2013; 2013:643-646. [PMID: 24109769 DOI: 10.1109/embc.2013.6609582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Predicting the localization of a protein has become a useful practice for inferring its function. Most of the reported methods to predict subcellular localizations in Gram-negative bacterial proteins have shown a low false positive rate. However, some subcellular compartments like "periplasm" and "extracellular medium" are difficult to predict and retain high false negative rates. In this paper, a method based on representation from statistical contact potentials and wavelet transform is presented. The wavelet-based method achieves high overall performance while keeping false positive and false negative rates low, particularly on periplasm and extracellular medium. Results suggest the contact potentials as a useful alternative to characterize protein sequences.
|
45
|
Transcription factor-dependent nuclear localization of a transcriptional repressor in jasmonate hormone signaling. Proc Natl Acad Sci U S A 2012; 109:20148-53. [PMID: 23169619 DOI: 10.1073/pnas.1210054109] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.7] [Indexed: 01/04/2023] Open
Abstract
The plant hormone jasmonate (JA) plays an important role in regulating growth, development and immunity. A key step in JA signaling is ligand-dependent assembly of a coreceptor complex consisting of the F-box protein COI1 and JAZ transcriptional repressors. Assembly of this receptor complex results in proteasome-mediated degradation of JAZ repressors, which at resting state bind to and repress the MYC transcription factors. Although the JA receptor complex is believed to function within the nucleus, how this receptor complex enters the nucleus and, more generally, the cell biology of jasmonate signaling are not well understood. In this study, we conducted mutational analysis of the C termini (containing the conserved Jas motif) of two JAZ repressors, JAZ1 and JAZ9. These analyses unexpectedly revealed different subcellular localization patterns of JAZ1ΔJas and JAZ9ΔJas, which were associated with differential interaction of JAZ1ΔJas and JAZ9ΔJas with MYC2 and differential repressor activity in vivo. Importantly, physical interaction with MYC2 appears to play an active role in the nuclear targeting of JAZ1 and JAZ9, and the nuclear localization of JAZ9 was compromised in myc2 mutant plants. We identified a highly conserved arginine residue in the Jas motif that is critical for coupling MYC2 interaction with nuclear localization of JAZ9 and JAZ9 repressor function in vivo. Our results suggest a model for explaining why some JAZΔJas proteins, but not others, confer constitutive JA-insensitivity when overexpressed in plants. Results also provide evidence for a transcription factor-dependent mechanism for nuclear import of a cognate transcriptional repressor JAZ9 in plants.
46
Renier S, Micheau P, Talon R, Hébraud M, Desvaux M. Subcellular localization of extracytoplasmic proteins in monoderm bacteria: rational secretomics-based strategy for genomic and proteomic analyses. PLoS One 2012; 7:e42982. [PMID: 22912771 PMCID: PMC3415414 DOI: 10.1371/journal.pone.0042982] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 02/10/2012] [Accepted: 07/13/2012] [Indexed: 11/20/2022] Open
Abstract
Genome-scale prediction of subcellular localization (SCL) is not only useful for inferring protein function but also for supporting proteomic data. In line with the secretome concept, a rational and original analytical strategy mimicking the secretion steps that determine ultimate SCL was developed for Gram-positive (monoderm) bacteria. Based on the biology of protein secretion, a flowchart and decision trees were designed considering (i) membrane targeting, (ii) protein secretion systems, (iii) membrane retention, and (iv) cell-wall retention by domains or post-translocational modifications, as well as (v) incorporation into cell-surface supramolecular structures. Using Listeria monocytogenes as a case study, results were compared with known data sets from SCL predictors and experimental proteomics. While in good agreement with experimental extracytoplasmic fractions, the secretomics-based method outperforms other genomic analyses, which were simply not intended to be as inclusive. Compared to all other localization predictors, this method not only supplies a static snapshot of protein SCL but also offers the full picture of the secretion process dynamics: (i) the protein routing is detailed, (ii) the number of distinct SCLs and protein categories is comprehensive, (iii) the description of protein type and topology is provided, (iv) the SCL is unambiguously differentiated from the protein category, and (v) multiple SCLs and protein categories are fully considered. In that sense, the secretomics-based method is much more than an SCL predictor. Besides being a major step forward in genomics and proteomics of protein secretion, the secretomics-based method appears as a strategy of choice to generate in silico hypotheses for experimental testing.
Affiliation(s)
- Sandra Renier
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
- Pierre Micheau
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
- Régine Talon
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
- Michel Hébraud
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
- Mickaël Desvaux
- INRA, UR454 Microbiology, Saint-Genès Champanelle, France
47
Predicted protein subcellular localization in dominant surface ocean bacterioplankton. Appl Environ Microbiol 2012; 78:6550-7. [PMID: 22773648 DOI: 10.1128/aem.01406-12] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022] Open
Abstract
Bacteria consume dissolved organic matter (DOM) through hydrolysis, transport and intracellular metabolism, and these activities occur in distinct subcellular localizations. Bacterial protein subcellular localizations for several major marine bacterial groups were predicted using genomic, metagenomic and metatranscriptomic data sets following modification of MetaP software for use with partial gene sequences. The most distinct pattern of subcellular localization was found for Bacteroidetes, whose genomes were substantially enriched with outer membrane and extracellular proteins but depleted of inner membrane proteins compared with five other taxa (SAR11, Roseobacter, Synechococcus, Prochlorococcus, oligotrophic marine Gammaproteobacteria). When subcellular localization patterns were compared between genes and transcripts, three taxa had expression biased toward proteins localized to cell locations outside of the cytosol (SAR11, Roseobacter, and Synechococcus), as expected based on the importance of carbon and nutrient acquisition in an oligotrophic ocean, but two taxa did not (oligotrophic marine Gammaproteobacteria and Bacteroidetes). Diel variations in the fraction and putative gene functions of transcripts encoding inner membrane and periplasmic proteins compared to cytoplasmic proteins suggest a close coupling of photosynthetic extracellular release and bacterial consumption, providing insights into interactions between phytoplankton, bacteria, and DOM.
48
Mourad GS, Tippmann-Crosby J, Hunt KA, Gicheru Y, Bade K, Mansfield TA, Schultes NP. Genetic and molecular characterization reveals a unique nucleobase cation symporter 1 in Arabidopsis. FEBS Lett 2012; 586:1370-8. [PMID: 22616996 DOI: 10.1016/j.febslet.2012.03.058] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 02/15/2012] [Revised: 03/25/2012] [Accepted: 03/26/2012] [Indexed: 11/13/2022]
Abstract
Locus At5g03555 encodes a nucleobase cation symporter 1 (AtNCS1) in the Arabidopsis genome. Arabidopsis insertion mutants, AtNcs1-1 and AtNcs1-3, were used for in planta toxic nucleobase analog growth studies and radio-labeled nucleobase uptake assays to characterize solute transport specificities. These results correlate with similar growth and uptake studies of AtNCS1 expressed in Saccharomyces cerevisiae. Both in planta and heterologous expression studies in yeast revealed a unique solute transport profile for AtNCS1 in moving adenine, guanine and uracil. This is in stark contrast to the canonical transport profiles determined for the well-characterized S. cerevisiae NCS1 proteins FUR4 (uracil transport) or FCY2 (adenine, guanine, and cytosine transport).
Affiliation(s)
- George S Mourad
- Department of Biology, Indiana University-Purdue University Fort Wayne, Fort Wayne, IN 46805, USA.
49
Law SR, Narsai R, Taylor NL, Delannoy E, Carrie C, Giraud E, Millar AH, Small I, Whelan J. Nucleotide and RNA metabolism prime translational initiation in the earliest events of mitochondrial biogenesis during Arabidopsis germination. Plant Physiol 2012; 158:1610-27. [PMID: 22345507 PMCID: PMC3320173 DOI: 10.1104/pp.111.192351] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 12/18/2011] [Accepted: 02/13/2012] [Indexed: 05/18/2023]
Abstract
Mitochondria play a crucial role in germination and early seedling growth in Arabidopsis (Arabidopsis thaliana). Morphological observations of mitochondria revealed that mitochondrial numbers, typical size, and oval morphology were evident after 12 h of imbibition in continuous light (following 48 h of stratification). The transition from a dormant to an active metabolic state was punctuated by an early molecular switch, characterized by a transient burst in the expression of genes encoding mitochondrial proteins. Factors involved in mitochondrial transcription and RNA processing were overrepresented among these early-expressed genes. This was closely followed by an increase in the transcript abundance of genes encoding proteins involved in mitochondrial DNA replication and translation. This burst in the expression of factors implicated in mitochondrial RNA and DNA metabolism was accompanied by an increase in transcripts encoding components required for nucleotide biosynthesis in the cytosol and increases in transcript abundance of specific members of the mitochondrial carrier protein family that have previously been associated with nucleotide transport into mitochondria. Only after these genes peaked in expression and largely declined were typical mitochondrial numbers and morphology observed. Subsequently, there was an increase in transcript abundance for various bioenergetic and metabolic functions of mitochondria. The coordination of nucleus- and organelle-encoded gene expression was also examined by quantitative reverse transcription-polymerase chain reaction, specifically for components of the mitochondrial electron transport chain and the chloroplastic photosynthetic machinery. Analysis of protein abundance using western-blot analysis and mass spectrometry revealed that for many proteins, patterns of protein and transcript abundance changes displayed significant positive correlations. 
A model for mitochondrial biogenesis during germination is proposed, in which an early increase in the abundance of transcripts encoding biogenesis functions (RNA metabolism and import components) precedes a later cascade of gene expression encoding the bioenergetic and metabolic functions of mitochondria.
Affiliation(s)
- James Whelan
- Australian Research Council Centre of Excellence in Plant Energy Biology (S.R.L., R.N., N.L.T., E.D., C.C., E.G., A.H.M., I.S., J.W.), Centre for Computational Systems Biology (R.N., I.S.), and Centre for Comparative Analysis of Biomolecular Networks (N.L.T., A.H.M.), University of Western Australia, Crawley 6009, Western Australia, Australia
50
Domingos RF, Vieira ML, Romero EC, Gonçales AP, de Morais ZM, Vasconcellos SA, Nascimento ALTO. Features of two proteins of Leptospira interrogans with potential role in host-pathogen interactions. BMC Microbiol 2012; 12:50. [PMID: 22463075 PMCID: PMC3444417 DOI: 10.1186/1471-2180-12-50] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 11/18/2011] [Accepted: 03/30/2012] [Indexed: 11/10/2022] Open
Abstract
Background: Leptospirosis is considered a re-emerging infectious disease caused by pathogenic spirochaetes of the genus Leptospira. Pathogenic leptospires have the ability to survive and disseminate to multiple organs after penetrating the host. Leptospires were shown to express surface proteins that interact with the extracellular matrix (ECM) and with plasminogen (PLG). This study examined the interaction of two putative leptospiral proteins with laminin, collagen Type I, collagen Type IV, cellular fibronectin, plasma fibronectin, PLG, factor H and C4bp. Results: We show that two leptospiral proteins encoded by the LIC11834 and LIC12253 genes interact with laminin in a dose-dependent and saturable mode, with dissociation equilibrium constants (KD) of 367.5 and 415.4 nM, respectively. These proteins were named Lsa33 and Lsa25 (Leptospiral surface adhesin) for LIC11834 and LIC12253, respectively. Metaperiodate-treated laminin reduced the Lsa25-laminin interaction, suggesting that sugar moieties of this ligand participate in this interaction. Lsa33 is also a PLG-binding receptor, with a KD of 23.53 nM, capable of generating plasmin in the presence of an activator. Although in a weak manner, both proteins interact with C4bp, a regulator of the complement classical route. In silico analysis together with proteinase K and immunofluorescence data suggest that these proteins might be surface exposed. Moreover, the recombinant proteins partially inhibited leptospiral adherence to immobilized laminin and PLG. Conclusions: We believe that these multifunctional proteins have the potential to participate in the interaction of leptospires with hosts by mediating adhesion and by helping the bacteria to escape the immune system and to overcome tissue barriers. To our knowledge, Lsa33 is the first leptospiral protein described to date with the capability of binding laminin, PLG and C4bp in vitro.
Affiliation(s)
- Renan F Domingos
- Centro de Biotecnologia, Instituto Butantan, Avenida Vital Brazil, 1500, 05503-900, São Paulo, SP, Brazil