1
Li J, Zou Q, Yuan L. A review from biological mapping to computation-based subcellular localization. Mol Ther Nucleic Acids 2023; 32:507-521. [PMID: 37215152 PMCID: PMC10192651 DOI: 10.1016/j.omtn.2023.04.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Subcellular localization is crucial to the study of viruses and diseases. Specifically, research on protein subcellular localization can help identify links between viruses and host cells that can aid in the design of targeted drugs. Research on RNA subcellular localization is significant for human diseases such as Alzheimer's disease and colon cancer. To date, only reviews addressing subcellular localization of proteins have been published, and these are now outdated; reviews of RNA subcellular localization are not comprehensive. Therefore, we collated the most up-to-date literature on protein and RNA subcellular localization to help researchers understand changes in the field. Extensive and complete methods for constructing subcellular localization models have also been summarized, which can help readers understand how the application of biotechnology and computer science in subcellular localization research has changed and explore how to use biological data to construct improved subcellular localization models. This paper is the first review to cover both protein and RNA subcellular localization. We urge researchers from biology and computational biology to jointly pay attention to the transformation patterns, interrelationships, differences, and causality of protein and RNA subcellular localization.
Affiliation(s)
- Jing Li
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
  - School of Biomedical Sciences, University of Hong Kong, Hong Kong, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
- Lei Yuan
  - Department of Hepatobiliary Surgery, Quzhou People's Hospital, 100 Minjiang Main Road, Quzhou, Zhejiang 324000, China
2
Azad I, Khan T, Ahmad N, Khan AR, Akhter Y. Updates on drug designing approach through computational strategies: a review. Future Sci OA 2023; 9:FSO862. [PMID: 37180609 PMCID: PMC10167725 DOI: 10.2144/fsoa-2022-0085] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/14/2022] [Accepted: 04/12/2023] [Indexed: 05/16/2023]
Abstract
The drug discovery and development (DDD) process in pursuit of novel drug candidates is a challenging procedure requiring considerable time and resources. Therefore, computer-aided drug design (CADD) methodologies are used extensively to make drug development more systematic and time-effective. A case in point is SARS-CoV-2, which has emerged as a global pandemic. In the absence of any confirmed drug moiety to treat the infection, the scientific community adopted trial-and-error methods to come up with a lead drug compound. This article is an overview of the virtual methodologies that assist in finding novel hits and help drug development progress toward a specific medicinal solution in a short period.
Affiliation(s)
- Iqbal Azad
  - Department of Chemistry, Integral University, Dasauli, P.O. Bas-ha, Kursi Road, Lucknow, 226026, UP, India
- Tahmeena Khan
  - Department of Chemistry, Integral University, Dasauli, P.O. Bas-ha, Kursi Road, Lucknow, 226026, UP, India
- Naseem Ahmad
  - Department of Chemistry, Integral University, Dasauli, P.O. Bas-ha, Kursi Road, Lucknow, 226026, UP, India
- Abdul Rahman Khan
  - Department of Chemistry, Integral University, Dasauli, P.O. Bas-ha, Kursi Road, Lucknow, 226026, UP, India
- Yusuf Akhter
  - Department of Biotechnology, Babasaheb Bhimrao Ambedkar University, Vidya Vihar, Raebareli Road, Lucknow, UP, 2260025, India
3
Wang RH, Luo T, Guo YP, Yang ZX, Zhang HY, Hao HY, Du PF. dbMisLoc: A Manually Curated Database of Conditional Protein Mis-localization Events. Interdiscip Sci 2023:10.1007/s12539-023-00564-0. [PMID: 37000408 DOI: 10.1007/s12539-023-00564-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 03/19/2023] [Accepted: 03/21/2023] [Indexed: 04/01/2023]
Abstract
Over the last few years, an increasing number of protein mis-localization events have been reported under various conditions. It is important to understand these events and their relationship with complex disorders. Although many efforts have been made to establish models with statistical or machine learning algorithms, a comprehensive database resource is still missing. Since the records of experimentally validated protein mis-localization events are spread across the literature, a collection of all these reports on a single website is needed. In this paper, we created the dbMisLoc database by manually curating conditional protein mis-localization events from the literature. The dbMisLoc database records the protein localizations, mis-localizations, conditions for mis-localization, and the original reports. It allows users to intuitively view, search, visualize and download protein mis-localization records. It also integrates a BLAST search engine, which can search for mis-localized proteins that are similar to user queries. The dbMisLoc database can be accessed directly at https://dbml.pufengdu.org. The source code of the dbMisLoc database is freely available from the GitHub repository (https://github.com/quinlanW/dbMisLoc). Users can host their own mirrors of the dbMisLoc database on their own servers. dbMisLoc is a database of manually curated protein mis-localization events. It contains mis-localization events in 14 categories of conditions such as diseases, drug treatments and environmental stresses.
Affiliation(s)
- Ren-Hua Wang
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
- Tao Luo
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
- Yu-Peng Guo
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
- Zi-Xin Yang
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
- He-Yi Zhang
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
- Hong-Yu Hao
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
- Pu-Feng Du
  - College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
4
Rashid M, Omar M, Mohanta TK. FungiProteomeDB: a database for the molecular weight and isoelectric points of the fungal proteomes. Database (Oxford) 2023; 2023:7078806. [PMID: 36929177 PMCID: PMC10019025 DOI: 10.1093/database/baad004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2022] [Revised: 02/01/2023] [Accepted: 03/11/2023] [Indexed: 03/18/2023]
Abstract
Proteins' molecular weight (MW) and isoelectric point (pI) are crucial for their subcellular localization and subsequent function. These parameters are also useful in 2D gel electrophoresis, liquid chromatography-mass spectrometry and X-ray protein crystallography. Moreover, visualizations such as a virtual 2D proteome map of pI vs. MW are valuable for discussing proteome diversity among different species. Although the genome sequence data of the fungal kingdom have improved enormously, the proteomic details have been poorly elaborated. Therefore, we have calculated the MW and pI of the fungal proteins and reported them in FungiProteomeDB, an online database (DB). We analyzed the proteomes of 685 fungal species, which contain 7 127 141 protein sequences. The DB provides an easy-to-use and efficient interface for various search options, summary statistics and virtual 2D proteome map visualizations. The MW and pI of a protein can be obtained by searching for the name of a protein, a keyword or a list of accession numbers. It also allows querying protein sequences. The DB will be helpful in hypothesis formulation and in various biotechnological applications. Database URL: https://vision4research.com/fungidb/.
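As a rough illustration of the kind of calculation such a database rests on, the sketch below estimates a protein's molecular weight from its sequence by summing average residue masses and adding one water molecule for the free termini. The abstract does not state how FungiProteomeDB computes MW, so the function name `molecular_weight`, the residue-mass table, and the restriction to the 20 standard amino acids are assumptions made here for illustration. pI estimation is left out because the result depends on the pKa set chosen.

```python
# Average residue masses in daltons for the 20 standard amino acids
# (mass of the amino acid minus one water, as it appears inside a chain).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}

# One water molecule is added back for the free N- and C-termini.
WATER_MASS = 18.01528


def molecular_weight(sequence: str) -> float:
    """Return the average molecular weight (Da) of an unmodified protein.

    Hypothetical helper: only the 20 standard one-letter codes are accepted,
    and post-translational modifications are ignored.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in seq) + WATER_MASS
```

For a single glycine, `molecular_weight("G")` gives roughly 75.07 Da, the mass of the free amino acid. Ambiguous codes such as `X` or `B` are rejected rather than guessed at.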
5
Mou M, Pan Z, Lu M, Sun H, Wang Y, Luo Y, Zhu F. Application of Machine Learning in Spatial Proteomics. J Chem Inf Model 2022; 62:5875-5895. [PMID: 36378082 DOI: 10.1021/acs.jcim.2c01161] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/16/2022]
Abstract
Spatial proteomics is an interdisciplinary field that investigates the localization and dynamics of proteins, and it has gained extensive attention in recent years, especially subcellular proteomics. Considerable evidence indicates that the subcellular localization of proteins is associated with various cellular processes and disease progression. Mass spectrometry (MS)-based and imaging-based experimental approaches have been developed to acquire large-scale spatial proteomic data. To allow the reliable analysis of increasingly complex spatial proteomics data, machine learning (ML) methods have been widely used in both MS-based and imaging-based spatial proteomic data analysis pipelines. Here, we comprehensively survey the applications of ML in spatial proteomics from the following aspects: (1) data resources for the spatial proteome are comprehensively introduced; (2) the roles of different ML algorithms in data analysis pipelines are elaborated; (3) successful applications of spatial proteomics and several analytical tools integrating ML methods are presented; (4) challenges existing in modern ML-based spatial proteomics studies are discussed. This review provides guidelines for researchers seeking to apply ML methods to analyze spatial proteomic data and can facilitate insightful understanding of cell biology as well as future research in the medical and drug discovery communities.
Affiliation(s)
- Minjie Mou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Ziqi Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Mingkun Lu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Huaicheng Sun
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yunxia Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yongchao Luo
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Feng Zhu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
6
Mohanta TK, Kamran MS, Omar M, Anwar W, Choi GS. PlantMWpIDB: a database for the molecular weight and isoelectric points of the plant proteomes. Sci Rep 2022; 12:7421. [PMID: 35523906 PMCID: PMC9076895 DOI: 10.1038/s41598-022-11077-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/22/2021] [Accepted: 04/11/2022] [Indexed: 01/14/2023]
Abstract
The molecular weight and isoelectric point of proteins are very important parameters that control their subcellular localization and subsequent function. Although the genome sequence data of the plant kingdom have improved enormously, the proteomic details have been poorly elaborated. Therefore, we have calculated the molecular weight and isoelectric point of the plant proteins and reported them in this database. A database, PlantMWpIDB, containing protein data from 342 plant proteomes was created to provide information on plant proteomes for hypothesis formulation in basic research and for biotechnological applications. Molecular weight and isoelectric point (pI) are important molecular parameters of proteins that are useful when conducting protein studies involving 2D gel electrophoresis, liquid chromatography-mass spectrometry, and X-ray protein crystallography. PlantMWpIDB provides an easy-to-use and efficient interface for search options and generates a summary of basic protein parameters. The database represents a virtual 2D proteome map of plants, and the molecular weight and pI of a protein can be obtained by searching for the name of a protein, a keyword, or a list of accession numbers. The PlantMWpIDB database also allows one to query protein sequences. The database can be found at https://plantmwpidb.com/. The individual 2D virtual proteome map of the plant kingdom will enable us to understand the proteome diversity between different species. Further, the molecular weight and isoelectric point of individual proteins can enable us to understand their functional significance in different species.
Affiliation(s)
- Tapan Kumar Mohanta
  - Natural and Medical Sciences Research Center, University of Nizwa, Nizwa, 616, Oman
- Muhammad Shahzad Kamran
  - Department of Computer Science and IT, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
- Muhammad Omar
  - Department of Data Science, Faculty of Computing, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
  - Department of Information and Communication Engineering, Yeungnam University, 214-1, Gyeongsan-si, 712-749, South Korea
- Waheed Anwar
  - Department of Computer Science and IT, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
- Gyu Sang Choi
  - Department of Information and Communication Engineering, Yeungnam University, 214-1, Gyeongsan-si, 712-749, South Korea
7
Abrams MB, Chuong JN, AlZaben F, Dubin CA, Skerker JM, Brem RB. Barcoded reciprocal hemizygosity analysis via sequencing illuminates the complex genetic basis of yeast thermotolerance. G3 Genes|Genomes|Genetics 2022; 12:6456302. [PMID: 34878132 PMCID: PMC9210320 DOI: 10.1093/g3journal/jkab412] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/24/2021] [Accepted: 11/04/2021] [Indexed: 11/12/2022]
Abstract
Decades of successes in statistical genetics have revealed the molecular underpinnings of traits as they vary across individuals of a given species. But standard methods in the field cannot be applied to divergences between reproductively isolated taxa. Genome-wide reciprocal hemizygosity mapping (RH-seq), a mutagenesis screen in an interspecies hybrid background, holds promise as a method to accelerate the progress of interspecies genetics research. Here, we describe an improvement to RH-seq in which mutants harbor barcodes for cheap and straightforward sequencing after selection in a condition of interest. As a proof of concept for the new tool, we carried out genetic dissection of the difference in thermotolerance between two reproductively isolated budding yeast species. Experimental screening identified dozens of candidate loci at which variation between the species contributed to the thermotolerance trait. Hits were enriched for mitosis genes and other housekeeping factors, and among them were multiple loci with robust sequence signatures of positive selection. Together, these results shed new light on the mechanisms by which evolution solved the problems of cell survival and division at high temperature in the yeast clade, and they illustrate the power of the barcoded RH-seq approach.
Affiliation(s)
- Melanie B Abrams
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Julie N Chuong
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - PhD Program in Biology, New York University, New York, NY 10003, USA
- Faisal AlZaben
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Claire A Dubin
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Jeffrey M Skerker
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Rachel B Brem
  - Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - Buck Institute for Research on Aging, Novato, CA 94945, USA
8
Sudhakar P, Machiels K, Verstockt B, Korcsmaros T, Vermeire S. Computational Biology and Machine Learning Approaches to Understand Mechanistic Microbiome-Host Interactions. Front Microbiol 2021; 12:618856. [PMID: 34046017 PMCID: PMC8148342 DOI: 10.3389/fmicb.2021.618856] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/18/2020] [Accepted: 03/19/2021] [Indexed: 12/11/2022]
Abstract
The microbiome, by virtue of its interactions with the host, is implicated in various host functions including its influence on nutrition and homeostasis. Many chronic diseases such as diabetes, cancer, and inflammatory bowel diseases are characterized by a disruption of microbial communities in at least one biological niche/organ system. Various molecular mechanisms between microbial and host components such as proteins, RNAs, and metabolites have recently been identified, thus filling many gaps in our understanding of how the microbiome modulates host processes. Concurrently, high-throughput technologies have enabled the profiling of heterogeneous datasets capturing community-level changes in the microbiome as well as the host responses. However, due to limitations in parallel sampling and analytical procedures, big gaps still exist in terms of how the microbiome mechanistically influences host functions at a system and community level. In the past decade, computational biology and machine learning methodologies have been developed with the aim of filling the existing gaps. Due to the agnostic nature of the tools, they have been applied in diverse disease contexts to analyze and infer the interactions between the microbiome and host molecular components. Some of these approaches allow the identification and analysis of affected downstream host processes. Most of the tools statistically or mechanistically integrate different types of -omic and meta-omic datasets, followed by functional/biological interpretation. In this review, we provide an overview of the landscape of computational approaches for investigating mechanistic interactions between individual microbes/microbiome and the host and the opportunities for basic and clinical research. These could include but are not limited to the development of activity- and mechanism-based biomarkers, uncovering mechanisms for therapeutic interventions and generating integrated signatures to stratify patients.
Affiliation(s)
- Padhmanand Sudhakar
  - Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
  - Earlham Institute, Norwich, United Kingdom
  - Quadram Institute Bioscience, Norwich, United Kingdom
- Kathleen Machiels
  - Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
- Bram Verstockt
  - Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
  - Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
- Tamas Korcsmaros
  - Earlham Institute, Norwich, United Kingdom
  - Quadram Institute Bioscience, Norwich, United Kingdom
- Séverine Vermeire
  - Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), KU Leuven, Leuven, Belgium
  - Department of Gastroenterology and Hepatology, University Hospitals Leuven, KU Leuven, Leuven, Belgium
9
Roth YD, Lian Z, Pochiraju S, Shaikh B, Karr JR. Datanator: an integrated database of molecular data for quantitatively modeling cellular behavior. Nucleic Acids Res 2021; 49:D516-D522. [PMID: 33174603 PMCID: PMC7779073 DOI: 10.1093/nar/gkaa1008] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/06/2020] [Revised: 10/12/2020] [Accepted: 10/21/2020] [Indexed: 12/23/2022]
Abstract
Integrative research about multiple biochemical subsystems has significant potential to help advance biology, bioengineering and medicine. However, it is difficult to obtain the diverse data needed for integrative research. To facilitate biochemical research, we developed Datanator (https://datanator.info), an integrated database and set of tools for finding clouds of multiple types of molecular data about specific molecules and reactions in specific organisms and environments, as well as data about chemically-similar molecules and reactions in phylogenetically-similar organisms in similar environments. Currently, Datanator includes metabolite concentrations, RNA modifications and half-lives, protein abundances and modifications, and reaction rate constants about a broad range of organisms. Going forward, we aim to launch a community initiative to curate additional data. Datanator also provides tools for filtering, visualizing and exporting these data clouds. We believe that Datanator can facilitate a wide range of research from integrative mechanistic models, such as whole-cell models, to comparative data-driven analyses of multiple organisms.
Affiliation(s)
- Yosef D Roth
  - Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
- Zhouyang Lian
  - Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
- Saahith Pochiraju
  - Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
- Bilal Shaikh
  - Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
- Jonathan R Karr
  - Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
10
Dai W, Chen B, Peng W, Li X, Zhong J, Wang J. A Novel Multi-Ensemble Method for Identifying Essential Proteins. J Comput Biol 2021; 28:637-649. [PMID: 33439753 DOI: 10.1089/cmb.2020.0527] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/28/2022]
Abstract
Essential proteins possess critical functions for cell survival. Identifying essential proteins improves our understanding of how a cell works and also plays a vital role in the research fields of disease treatment and drug development. Recently, some machine-learning methods and ensemble learning methods have been proposed to identify essential proteins by introducing effective protein features. However, existing ensemble learning methods have focused only on the choice of base classifiers. In this article, we propose a novel ensemble learning framework called multi-ensemble to integrate different base classifiers. The multi-ensemble method adopts the idea of multi-view learning: it selects multiple base classifiers and trains them by continually adding the samples that are predicted correctly by the other base classifiers. We applied multi-ensemble to yeast data and Escherichia coli data. The results show that our approach achieved better performance than both individual classifiers and the other ensemble learning methods.
Affiliation(s)
- Wei Dai
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, China
- Bingxi Chen
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
- Wei Peng
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, China
- Xia Li
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
- Jiancheng Zhong
  - School of Information Science and Engineering, Hunan Normal University, Changsha, China
- Jianxin Wang
  - School of Computer Science and Engineering, Central South University, Changsha, China
11
Christian RW, Hewitt SL, Roalson EH, Dhingra A. Genome-Scale Characterization of Predicted Plastid-Targeted Proteomes in Higher Plants. Sci Rep 2020; 10:8281. [PMID: 32427841 PMCID: PMC7237471 DOI: 10.1038/s41598-020-64670-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/21/2019] [Accepted: 04/20/2020] [Indexed: 12/20/2022]
Abstract
Plastids are morphologically and functionally diverse organelles that are dependent on nuclear-encoded, plastid-targeted proteins for all biochemical and regulatory functions. However, how plastid proteomes vary temporally, spatially, and taxonomically has been historically difficult to analyze at a genome-wide scale using experimental methods. A bioinformatics workflow was developed and evaluated using a combination of fast and user-friendly subcellular prediction programs to maximize performance and accuracy in predicting chloroplast transit peptides, and the technique was demonstrated on the predicted proteomes of 15 sequenced plant genomes. Gene family grouping was then performed in parallel using modified approaches of reciprocal best BLAST hits (RBH) and UCLUST. A total of 628 protein families were found to have conserved plastid targeting across angiosperm species using RBH, and 828 using UCLUST. However, thousands of clusters were also detected where only one species had predicted plastid targeting, most notably in Panicum virgatum, which had 1,458 proteins with species-unique targeting. An average of 45% overlap was found in plastid-targeted protein-coding gene families compared with Arabidopsis, but an additional 20% of proteins matched against the full Arabidopsis proteome, indicating a unique evolution of plastid targeting. Neofunctionalization through subcellular relocalization is known to impart novel biological functions but has not been described before on a genome-wide scale for the plastid proteome. Further work to correlate these predicted novel plastid-targeted proteins to transcript abundance and high-throughput proteomics will uncover unique aspects of plastid biology and shed light on how the plastid proteome has evolved to influence plastid morphology and biochemistry.
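The reciprocal best hits idea mentioned in this abstract can be sketched in a few lines. The abstract says the authors used modified RBH approaches without giving details, so the code below shows only the plain textbook version: a pair counts when each protein is the other's top-scoring hit. The tuple layout `(query_id, subject_id, bit_score)` and the function names `best_hits` and `reciprocal_best_hits` are assumptions chosen for illustration, not taken from the paper or from BLAST itself.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed record layout: (query_id, subject_id, bit_score), for example as
# parsed from a tabular similarity-search report.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject.

    Ties are broken in favor of the hit seen first.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)
```

Each input list holds the hits from searching one proteome against the other. Pairs that are best hits in only one direction are dropped, which is what makes the criterion conservative for grouping gene families.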
Affiliation(s)
- Ryan W Christian
  - Department of Horticulture, Washington State University, Pullman, WA, USA
  - Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
- Seanna L Hewitt
  - Department of Horticulture, Washington State University, Pullman, WA, USA
  - Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
- Eric H Roalson
  - Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
  - School of Biological Sciences, Washington State University, Pullman, WA, USA
- Amit Dhingra
  - Department of Horticulture, Washington State University, Pullman, WA, USA
  - Molecular Plant Sciences Program, Washington State University, Pullman, WA, USA
12
Macalino SJY, Billones JB, Organo VG, Carrillo MCO. In Silico Strategies in Tuberculosis Drug Discovery. Molecules 2020; 25:E665. [PMID: 32033144 PMCID: PMC7037728 DOI: 10.3390/molecules25030665] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 11/28/2019] [Revised: 12/15/2019] [Accepted: 12/17/2019] [Indexed: 12/16/2022]
Abstract
Tuberculosis (TB) remains a serious threat to global public health, responsible for an estimated 1.5 million mortalities in 2018. While there are available therapeutics for this infection, slow-acting drugs, poor patient compliance, drug toxicity, and drug resistance require the discovery of novel TB drugs. Discovering new and more potent antibiotics that target novel TB protein targets is an attractive strategy towards controlling the global TB epidemic. In silico strategies can be applied at multiple stages of the drug discovery paradigm to expedite the identification of novel anti-TB therapeutics. In this paper, we discuss the current TB treatment, emergence of drug resistance, and the effective application of computational tools to the different stages of TB drug discovery when combined with traditional biochemical methods. We will also highlight the strengths and points of improvement in in silico TB drug discovery research, as well as possible future perspectives in this field.
Affiliation(s)
- Stephani Joy Y. Macalino
  - Chemistry Department, De La Salle University, 2401 Taft Avenue, Manila 0992, Philippines
  - OVPAA-EIDR Program, “Computer-Aided Discovery of Compounds for the Treatment of Tuberculosis in the Philippines”, Department of Physical Sciences and Mathematics, College of Arts and Sciences, University of the Philippines Manila, Manila 1000, Philippines
- Junie B. Billones
  - OVPAA-EIDR Program, “Computer-Aided Discovery of Compounds for the Treatment of Tuberculosis in the Philippines”, Department of Physical Sciences and Mathematics, College of Arts and Sciences, University of the Philippines Manila, Manila 1000, Philippines
- Voltaire G. Organo
  - OVPAA-EIDR Program, “Computer-Aided Discovery of Compounds for the Treatment of Tuberculosis in the Philippines”, Department of Physical Sciences and Mathematics, College of Arts and Sciences, University of the Philippines Manila, Manila 1000, Philippines
- Maria Constancia O. Carrillo
  - OVPAA-EIDR Program, “Computer-Aided Discovery of Compounds for the Treatment of Tuberculosis in the Philippines”, Department of Physical Sciences and Mathematics, College of Arts and Sciences, University of the Philippines Manila, Manila 1000, Philippines
13
|
Chen H, Zhang Z, Jiang S, Li R, Li W, Zhao C, Hong H, Huang X, Li H, Bo X. New insights on human essential genes based on integrated analysis and the construction of the HEGIAP web-based platform. Brief Bioinform 2019; 21:1397-1410. [PMID: 31504171 PMCID: PMC7373178 DOI: 10.1093/bib/bbz072] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 02/25/2019] [Revised: 05/13/2019] [Accepted: 05/24/2019] [Indexed: 12/13/2022]
Abstract
Essential genes are those whose loss of function compromises organism viability or results in profound loss of fitness. Recent gene-editing technologies have provided new opportunities to characterize essential genes. Here, we present an integrated analysis that comprehensively and systematically elucidates the genetic and regulatory characteristics of human essential genes. First, we found that essential genes act as ‘hubs’ in protein–protein interaction networks, chromatin structure and epigenetic modification. Second, essential genes represent conserved biological processes across species, although gene essentiality changes differently among species. Third, essential genes are important for cell development due to their discriminate transcription activity in embryo development and oncogenesis. In addition, we developed an interactive web server, the Human Essential Genes Interactive Analysis Platform (http://sysomics.com/HEGIAP/), which integrates abundant analytical tools to enable global, multidimensional interpretation of gene essentiality. Our study provides new insights that improve the understanding of human essential genes.
Affiliation(s)
- Hebing Chen
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Zhuo Zhang
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Shuai Jiang
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Ruijiang Li
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Wanying Li
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Chenghui Zhao
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Hao Hong
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xin Huang
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Hao Li
- Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xiaochen Bo
- Beijing Institute of Radiation Medicine, Beijing 100850, China
|
14
|
Forsythe ES, Sharbrough J, Havird JC, Warren JM, Sloan DB. CyMIRA: The Cytonuclear Molecular Interactions Reference for Arabidopsis. Genome Biol Evol 2019; 11:2194-2202. [PMID: 31282937 PMCID: PMC6685490 DOI: 10.1093/gbe/evz144] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Accepted: 07/01/2019] [Indexed: 12/11/2022]
Abstract
The function and evolution of eukaryotic cells depend upon direct molecular interactions between gene products encoded in nuclear and cytoplasmic genomes. Understanding how these cytonuclear interactions drive molecular evolution and generate genetic incompatibilities between isolated populations and species is of central importance to eukaryotic biology. Plants are an outstanding system to investigate such effects because of their two different genomic compartments present in the cytoplasm (mitochondria and plastids) and the extensive resources detailing subcellular targeting of nuclear-encoded proteins. However, the field lacks a consistent classification scheme for mitochondrial- and plastid-targeted proteins based on their molecular interactions with cytoplasmic genomes and gene products, which hinders efforts to standardize and compare results across studies. Here, we take advantage of detailed knowledge about the model angiosperm Arabidopsis thaliana to provide a curated database of plant cytonuclear interactions at the molecular level. CyMIRA (Cytonuclear Molecular Interactions Reference for Arabidopsis) is available at http://cymira.colostate.edu/ and https://github.com/dbsloan/cymira and will serve as a resource to aid researchers in partitioning evolutionary genomic data into functional gene classes based on organelle targeting and direct molecular interaction with cytoplasmic genomes and gene products. It includes 11 categories (and 27 subcategories) of different cytonuclear complexes and types of molecular interactions, and it reports residue-level information for cytonuclear contact sites. We hope that this framework will make it easier to standardize, interpret, and compare studies testing the functional and evolutionary consequences of cytonuclear interactions.
Affiliation(s)
- Justin C Havird
- Department of Integrative Biology, University of Texas, Austin
|
15
|
Fischbach A, Krüger A, Hampp S, Assmann G, Rank L, Hufnagel M, Stöckl MT, Fischer JMF, Veith S, Rossatti P, Ganz M, Ferrando-May E, Hartwig A, Hauser K, Wiesmüller L, Bürkle A, Mangerich A. The C-terminal domain of p53 orchestrates the interplay between non-covalent and covalent poly(ADP-ribosyl)ation of p53 by PARP1. Nucleic Acids Res 2019; 46:804-822. [PMID: 29216372 PMCID: PMC5778597 DOI: 10.1093/nar/gkx1205] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 07/20/2017] [Accepted: 11/22/2017] [Indexed: 01/25/2023]
Abstract
The post-translational modification poly(ADP-ribosyl)ation (PARylation) plays key roles in genome maintenance and transcription. Both non-covalent poly(ADP-ribose) binding and covalent PARylation control protein functions; however, it is unknown how the two modes of modification crosstalk mechanistically. Employing the tumor suppressor p53 as a model substrate, this study provides detailed insights into the interplay between non-covalent and covalent PARylation and unravels its functional significance in the regulation of p53. We reveal that the multifunctional C-terminal domain (CTD) of p53 acts as the central hub in the PARylation-dependent regulation of p53. Specifically, p53 bound to auto-PARylated PARP1 via a highly specific non-covalent PAR-CTD interaction, which conveyed target specificity for its covalent PARylation by PARP1. Strikingly, fusing the p53-CTD to a protein that is normally not PARylated renders this protein a target for covalent PARylation as well. Functional studies revealed that the p53–PAR interaction had substantial implications at the molecular and cellular levels. Thus, PAR significantly influenced the complex p53–DNA binding properties and controlled p53 functions, with major implications for the p53-dependent interactome, transcription, and replication-associated recombination. Remarkably, this mechanism potentially also applies to other PARylation targets, since a bioinformatics analysis revealed that CTD-like regions are highly enriched in the PARylated proteome.
Affiliation(s)
- Arthur Fischbach
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany; Konstanz Research School Chemical Biology, University of Konstanz, 78457 Konstanz, Germany
- Annika Krüger
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany; Konstanz Research School Chemical Biology, University of Konstanz, 78457 Konstanz, Germany; Department of Chemistry, University of Konstanz, 78457 Konstanz, Germany
- Stephanie Hampp
- Department of Obstetrics and Gynaecology, University of Ulm, 89075 Ulm, Germany
- Greta Assmann
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany; Konstanz Research School Chemical Biology, University of Konstanz, 78457 Konstanz, Germany
- Lisa Rank
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Matthias Hufnagel
- Department of Food Chemistry and Toxicology, Institute for Applied Biosciences, Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany
- Martin T Stöckl
- Bioimaging Center, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Jan M F Fischer
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany; Konstanz Research School Chemical Biology, University of Konstanz, 78457 Konstanz, Germany
- Sebastian Veith
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany; Research Training Group 1331, University of Konstanz, 78457 Konstanz, Germany
- Pascal Rossatti
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Magdalena Ganz
- Bioimaging Center, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Elisa Ferrando-May
- Bioimaging Center, Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Andrea Hartwig
- Department of Food Chemistry and Toxicology, Institute for Applied Biosciences, Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany
- Karin Hauser
- Department of Chemistry, University of Konstanz, 78457 Konstanz, Germany
- Lisa Wiesmüller
- Department of Obstetrics and Gynaecology, University of Ulm, 89075 Ulm, Germany
- Alexander Bürkle
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Aswin Mangerich
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany
|
16
|
Subba P, Narayana Kotimoole C, Prasad TSK. Plant Proteome Databases and Bioinformatic Tools: An Expert Review and Comparative Insights. OMICS 2019; 23:190-206. [DOI: 10.1089/omi.2019.0024] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 02/06/2023]
Affiliation(s)
- Pratigya Subba
- Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Chinmaya Narayana Kotimoole
- Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Thottethodi Subrahmanya Keshava Prasad
- Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Institute of Bioinformatics, International Technology Park, Bangalore, India
|
17
|
Zhong J, Sun Y, Peng W, Xie M, Yang J, Tang X. XGBFEMF: An XGBoost-Based Framework for Essential Protein Prediction. IEEE Trans Nanobioscience 2018; 17:243-250. [DOI: 10.1109/tnb.2018.2842219] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Indexed: 11/08/2022]
|
18
|
Shekari F, Baharvand H, Salekdeh GH. Organellar proteomics of embryonic stem cells. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2018; 95:215-30. [PMID: 24985774 DOI: 10.1016/b978-0-12-800453-1.00007-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/07/2023]
Abstract
Embryonic stem cells (ESCs) are undifferentiated cells with two remarkable features in common: self-renewal and differentiation. Proteomics plays an increasingly important role in understanding the molecular mechanisms underlying self-renewal and pluripotency of ESCs and their applications in cell therapy and developmental biology studies. As the function of a protein is strongly associated with its localization in the cell, a complete and accurate picture of the proteome of ESCs cannot be achieved without knowing the subcellular locations of proteins. Subcellular fractionation allows enrichment of low-abundance proteins and signaling complexes and reduces the complexity of the sample. It also provides insight into proteins that shuttle between different compartments. Despite the substantial interest and efforts in the area of ESC subcellular proteomics, progress has been relatively limited. In this review, we present an overview of the current status of ESC organelle proteomics research and discuss challenges in subcellular proteomics.
Affiliation(s)
- Faezeh Shekari
- Department of Molecular Systems Biology at Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran; Department of Developmental Biology, University of Science and Culture, ACECR, Tehran, Iran
- Hossein Baharvand
- Department of Developmental Biology, University of Science and Culture, ACECR, Tehran, Iran; Department of Stem Cells and Developmental Biology at Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran
- Ghasem Hosseini Salekdeh
- Department of Molecular Systems Biology at Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran; Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran, Karaj, Iran
|
19
|
Zhang T, Tan P, Wang L, Jin N, Li Y, Zhang L, Yang H, Hu Z, Zhang L, Hu C, Li C, Qian K, Zhang C, Huang Y, Li K, Lin H, Wang D. RNALocate: a resource for RNA subcellular localizations. Nucleic Acids Res 2017; 45:D135-D138. [PMID: 27543076 PMCID: PMC5210605 DOI: 10.1093/nar/gkw728] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Received: 07/12/2016] [Accepted: 08/08/2016] [Indexed: 02/05/2023]
Abstract
Increasing evidence has revealed that RNA subcellular localization is a very important feature for understanding the biological functions of RNAs after they are transported into intra- or extra-cellular regions. RNALocate is a web-accessible database that aims to provide a high-quality RNA subcellular localization resource and facilitate future research on RNA function or structure. The current version of RNALocate documents more than 37 700 manually curated RNA subcellular localization entries with experimental evidence, involving more than 21 800 RNAs with 42 subcellular localizations in 65 species, mainly Homo sapiens, Mus musculus and Saccharomyces cerevisiae. In addition, RNA homology, sequence and interaction data have been integrated into RNALocate. Users can access these data through online search, browse, blast and visualization tools. In conclusion, RNALocate will be of help in elucidating the entirety of RNA subcellular localization and in developing new prediction methods. The database is available at http://www.rna-society.org/rnalocate/.
Affiliation(s)
- Ting Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Puwen Tan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Liqiang Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Nana Jin
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yana Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Lin Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Huan Yang
- Key Laboratory for NeuroInformation of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Zhenyu Hu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Lining Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Chunyu Hu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Chunhua Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Kun Qian
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Changjian Zhang
- Key Laboratory for NeuroInformation of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yan Huang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Kongning Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hao Lin
- Key Laboratory for NeuroInformation of Ministry of Education, Center of Bioinformatics, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Dong Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Department of Biochemistry and Molecular Biology, Shantou University Medical College, Shantou 515041, China
|
20
|
Zámbó V, Tóth M, Schlachter K, Szelényi P, Sarnyai F, Lotz G, Csala M, Kereszturi É. Cytosolic localization of NADH cytochrome b₅ oxidoreductase (Ncb5or). FEBS Lett 2016; 590:661-71. [PMID: 26878259 DOI: 10.1002/1873-3468.12097] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/19/2015] [Revised: 01/29/2016] [Accepted: 02/04/2016] [Indexed: 11/10/2022]
Abstract
Acyl-CoA desaturation in the endoplasmic reticulum (ER) membrane depends on cytosolic NADH or NADPH, whereas NADPH in the ER lumen is utilized by prereceptor glucocorticoid production. It was assumed that NADH cytochrome b5 oxidoreductase (Ncb5or) might connect Acyl-CoA desaturation to ER luminal redox. We aimed to clarify the ambiguous compartmentalization of Ncb5or and test the possible effect of stearoyl-CoA on microsomal NADPH level. Amino acid sequence analysis, fluorescence microscopy of GFP-tagged protein, immunocytochemistry, and western blot analysis of subcellular fractions unequivocally demonstrated that Ncb5or, either endogenous or exogenous, is localized in the cytoplasm and not in the ER lumen in cultured cells and liver tissue. Moreover, the involvement of ER-luminal reducing equivalents in stearoyl-CoA desaturation was excluded.
Affiliation(s)
- Veronika Zámbó
- Department of Medical Chemistry, Molecular Biology and Pathobiochemistry, Semmelweis University, Budapest, Hungary
- Mónika Tóth
- Department of Medical Chemistry, Molecular Biology and Pathobiochemistry, Semmelweis University, Budapest, Hungary
- Péter Szelényi
- Department of Medical Chemistry, Molecular Biology and Pathobiochemistry, Semmelweis University, Budapest, Hungary
- Farkas Sarnyai
- Department of Medical Chemistry, Molecular Biology and Pathobiochemistry, Semmelweis University, Budapest, Hungary
- Gábor Lotz
- 2nd Department of Pathology, Semmelweis University, Budapest, Hungary
- Miklós Csala
- Department of Medical Chemistry, Molecular Biology and Pathobiochemistry, Semmelweis University, Budapest, Hungary
- Éva Kereszturi
- Department of Medical Chemistry, Molecular Biology and Pathobiochemistry, Semmelweis University, Budapest, Hungary
|
21
|
Negi S, Pandey S, Srinivasan SM, Mohammed A, Guda C. LocSigDB: a database of protein localization signals. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bav003. [PMID: 25725059 PMCID: PMC4343182 DOI: 10.1093/database/bav003] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 01/09/2023]
Abstract
LocSigDB (http://genome.unmc.edu/LocSigDB/) is a manually curated database of experimental protein localization signals for eight distinct subcellular locations, primarily in eukaryotic cells with brief coverage of bacterial proteins. Proteins must be localized at their appropriate subcellular compartment to perform their desired function. Mislocalization of proteins to unintended locations is a causative factor for many human diseases; therefore, a collection of known sorting signals will help support many important areas of biomedical research. By performing an extensive literature study, we compiled a collection of 533 experimentally determined localization signals, along with the proteins that harbor such signals. Each signal in LocSigDB is annotated with its localization, source and PubMed references, and is linked to the proteins in the UniProt database that contain the same amino acid pattern as the given signal, along with their organism information. From the LocSigDB web server, users can download the whole database or browse/search for data using an intuitive query interface. To date, LocSigDB is the most comprehensive compendium of protein localization signals for eight distinct subcellular locations. Database URL: http://genome.unmc.edu/LocSigDB/
Affiliation(s)
- Simarjeet Negi
- Department of Genetics, Cell Biology and Anatomy, Bioinformatics and Systems Biology Core, Department of Biochemistry and Molecular Biology, Fred and Pamela Buffet Cancer Center and Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Sanjit Pandey
- Department of Genetics, Cell Biology and Anatomy, Bioinformatics and Systems Biology Core, Department of Biochemistry and Molecular Biology, Fred and Pamela Buffet Cancer Center and Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Satish M Srinivasan
- Department of Genetics, Cell Biology and Anatomy, Bioinformatics and Systems Biology Core, Department of Biochemistry and Molecular Biology, Fred and Pamela Buffet Cancer Center and Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Akram Mohammed
- Department of Genetics, Cell Biology and Anatomy, Bioinformatics and Systems Biology Core, Department of Biochemistry and Molecular Biology, Fred and Pamela Buffet Cancer Center and Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Chittibabu Guda
- Department of Genetics, Cell Biology and Anatomy, Bioinformatics and Systems Biology Core, Department of Biochemistry and Molecular Biology, Fred and Pamela Buffet Cancer Center and Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE 68198, USA
|
22
|
Veres DV, Gyurkó DM, Thaler B, Szalay KZ, Fazekas D, Korcsmáros T, Csermely P. ComPPI: a cellular compartment-specific database for protein-protein interaction network analysis. Nucleic Acids Res 2014; 43:D485-93. [PMID: 25348397 PMCID: PMC4383876 DOI: 10.1093/nar/gku1007] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Indexed: 02/06/2023]
Abstract
Here we present ComPPI, a cellular compartment-specific database of proteins and their interactions enabling an extensive, compartmentalized protein–protein interaction network analysis (URL: http://ComPPI.LinkGroup.hu). ComPPI enables the user to filter biologically unlikely interactions, where the two interacting proteins have no common subcellular localizations and to predict novel properties, such as compartment-specific biological functions. ComPPI is an integrated database covering four species (S. cerevisiae, C. elegans, D. melanogaster and H. sapiens). The compilation of nine protein–protein interaction and eight subcellular localization data sets had four curation steps including a manually built, comprehensive hierarchical structure of >1600 subcellular localizations. ComPPI provides confidence scores for protein subcellular localizations and protein–protein interactions. ComPPI has user-friendly search options for individual proteins giving their subcellular localization, their interactions and the likelihood of their interactions considering the subcellular localization of their interacting partners. Download options of search results, whole-proteomes, organelle-specific interactomes and subcellular localization data are available on its website. Due to its novel features, ComPPI is useful for the analysis of experimental results in biochemistry and molecular biology, as well as for proteome-wide studies in bioinformatics and network science helping cellular biology, medicine and drug design.
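The compartment-based filtering idea described in this abstract can be illustrated with a minimal Python sketch. This is not ComPPI's actual code or data model: the function name `filter_colocalized`, the protein identifiers and the compartment labels are hypothetical, and the sketch ignores the confidence scores that ComPPI attaches to localizations and interactions.

```python
from typing import Dict, List, Set, Tuple


def filter_colocalized(
    interactions: List[Tuple[str, str]],
    localizations: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Keep only interactions whose two partners share at least one compartment."""
    kept = []
    for a, b in interactions:
        # Proteins without any annotation are treated as having no compartments
        # (an assumption of this sketch), so their interactions are dropped.
        shared = localizations.get(a, set()) & localizations.get(b, set())
        if shared:
            kept.append((a, b))
    return kept


# Toy, made-up annotations (not taken from ComPPI).
localizations = {
    "P1": {"nucleus", "cytosol"},
    "P2": {"cytosol"},
    "P3": {"mitochondrion"},
}
interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P4")]

colocalized = filter_colocalized(interactions, localizations)
print(colocalized)
```

Only the P1/P2 pair survives here, because it is the only pair with a compartment in common. Dropping interactions of unannotated proteins is a conservative choice; a real analysis might prefer to keep them and flag them as unknown.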
Affiliation(s)
- Daniel V Veres
- Department of Medical Chemistry, Semmelweis University, Budapest, Hungary
- Dávid M Gyurkó
- Department of Medical Chemistry, Semmelweis University, Budapest, Hungary
- Benedek Thaler
- Department of Medical Chemistry, Semmelweis University, Budapest, Hungary; Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Kristóf Z Szalay
- Department of Medical Chemistry, Semmelweis University, Budapest, Hungary
- Dávid Fazekas
- Department of Genetics, Eötvös Loránd University, Budapest, Hungary
- Tamás Korcsmáros
- Department of Genetics, Eötvös Loránd University, Budapest, Hungary; TGAC, The Genome Analysis Centre, Norwich, UK; Gut Health and Food Safety Programme, Institute of Food Research, Norwich, UK
- Peter Csermely
- Department of Medical Chemistry, Semmelweis University, Budapest, Hungary
|
23
|
Binder JX, Pletscher-Frankild S, Tsafou K, Stolte C, O'Donoghue SI, Schneider R, Jensen LJ. COMPARTMENTS: unification and visualization of protein subcellular localization evidence. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014; 2014:bau012. [PMID: 24573882 PMCID: PMC3935310 DOI: 10.1093/database/bau012] [Citation(s) in RCA: 374] [Impact Index Per Article: 37.4] [Indexed: 12/21/2022]
Abstract
Information on protein subcellular localization is important to understand the cellular functions of proteins. Currently, such information is manually curated from the literature, obtained from high-throughput microscopy-based screens and predicted from primary sequence. To get a comprehensive view of the localization of a protein, it is thus necessary to consult multiple databases and prediction tools. To address this, we present the COMPARTMENTS resource, which integrates all sources listed above as well as the results of automatic text mining. The resource is automatically kept up to date with source databases, and all localization evidence is mapped onto common protein identifiers and Gene Ontology terms. To facilitate comparison of different types and sources of evidence, we assign confidence scores to the localization evidence based on its type and source. Finally, we visualize the unified localization evidence for a protein on a schematic cell to provide a simple overview. Database URL: http://compartments.jensenlab.org
Affiliation(s)
- Janos X Binder
- Structural and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany, Bioinformatics Core Facility, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4362 Esch-sur-Alzette, Luxembourg, Department of Disease Systems Biology, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark, CSIRO Computational Informatics, Sydney, NSW 2113 Australia and Garvan Institute of Medical Research, Sydney, NSW 2100, Australia
|
24
|
Tribl F, Meyer HE, Marcus K. Analysis of organelles within the nervous system: impact on brain and organelle functions. Expert Rev Proteomics 2014; 5:333-51. [DOI: 10.1586/14789450.5.2.333] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 01/19/2023]
|
25
|
Govindan G, Nair AS. Bagging with CTD--a novel signature for the hierarchical prediction of secreted protein trafficking in eukaryotes. GENOMICS PROTEOMICS & BIOINFORMATICS 2013; 11:385-90. [PMID: 24316328 PMCID: PMC4357838 DOI: 10.1016/j.gpb.2013.07.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/02/2013] [Revised: 07/01/2013] [Accepted: 07/17/2013] [Indexed: 11/19/2022]
Abstract
Protein trafficking or protein sorting in eukaryotes is a complicated process and is carried out based on the information contained in the protein. Many methods have been reported for predicting the subcellular location of proteins from sequence information. However, most of these prediction methods use a flat structure or parallel architecture to perform prediction. In this work, we introduce ensemble classifiers with features that are extracted directly from full-length protein sequences to predict locations in the protein-sorting pathway hierarchically. Sequence-driven features, sequence-mapped features and sequence autocorrelation features were tested with ensemble learners and their performances were compared. When evaluated by independent data testing, ensemble-based bagging algorithms with the sequence feature composition, transition and distribution (CTD) successfully classified two datasets with accuracies greater than 90%. We compared our results with similar published methods, and our method performed on par with the others at two levels in the secreted pathway. This study shows that the CTD feature extracted from protein sequences is effective in capturing biological features among compartments in secreted pathways.
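To make the composition, transition and distribution (CTD) feature named in this abstract more concrete, the following Python sketch computes a 21-value CTD descriptor for a single residue property. It is an illustration under stated assumptions rather than the authors' implementation: the three-class hydrophobicity grouping, the feature names, the function `ctd_features` and the rounding rule used for the distribution quantiles are all choices made for this sketch, and published CTD variants differ in these details.

```python
import math
from itertools import combinations
from typing import Dict

# Assumed three-class grouping of amino acids by hydrophobicity.
GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}
AA_TO_CLASS = {aa: cls for cls, aas in GROUPS.items() for aa in aas}


def ctd_features(sequence: str) -> Dict[str, float]:
    """Return composition (3), transition (3) and distribution (15) features."""
    # Residues outside the 20 standard amino acids are skipped.
    classes = [AA_TO_CLASS[aa] for aa in sequence.upper() if aa in AA_TO_CLASS]
    n = len(classes)
    if n == 0:
        raise ValueError("sequence contains no standard amino acids")
    features: Dict[str, float] = {}

    # Composition: fraction of residues that fall in each class.
    for cls in GROUPS:
        features[f"C_{cls}"] = classes.count(cls) / n

    # Transition: fraction of adjacent pairs that switch between two classes,
    # counted in either direction.
    pairs = list(zip(classes, classes[1:]))
    for a, b in combinations(GROUPS, 2):
        changes = sum(1 for x, y in pairs if {x, y} == {a, b})
        features[f"T_{a}{b}"] = changes / (n - 1) if n > 1 else 0.0

    # Distribution: relative position of the first, 25%, 50%, 75% and last
    # occurrence of each class (0.0 when the class is absent).
    for cls in GROUPS:
        positions = [i + 1 for i, c in enumerate(classes) if c == cls]
        for pct in (0, 25, 50, 75, 100):
            if positions:
                k = max(1, math.ceil(len(positions) * pct / 100))
                features[f"D_{cls}_{pct}"] = positions[k - 1] / n
            else:
                features[f"D_{cls}_{pct}"] = 0.0
    return features


features = ctd_features("RGCRGC")
print(len(features))  # 21 features for one property
```

A feature vector like this one could then be passed to a bagging ensemble, which is the kind of learner the abstract reports. In practice, CTD is usually computed over several residue properties and the resulting vectors are concatenated.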
Affiliation(s)
- Geetha Govindan
- Department of Computational Biology and Bioinformatics, University of Kerala, Thiruvananthapuram 695581, India.
- Achuthsankar S Nair
- Department of Computational Biology and Bioinformatics, University of Kerala, Thiruvananthapuram 695581, India
|
26
|
Abstract
Background Membrane proteins perform essential roles in diverse cellular functions and are regarded as major pharmaceutical targets. The significance of membrane proteins has led to the development of dozens of resources related to membrane proteins. However, most of these resources are built for specific well-known membrane protein groups, making it difficult to find common and specific features of various membrane protein groups. Methods We collected human membrane proteins from the dispersed resources and predicted novel membrane protein candidates by using ortholog information and our membrane protein classifiers. The membrane proteins were classified according to the type of interaction with the membrane, subcellular localization, and molecular function. We also built a new feature dataset to characterize the membrane proteins in various aspects including membrane protein topology, domain, biological process, disease, and drug. Moreover, protein structure and ICD-10-CM based integrated disease and drug information was newly included. To analyze the comprehensive information of membrane proteins, we implemented analysis tools to identify novel sequence and functional features of the classified membrane protein groups and to extract features from protein sequences. Results We constructed HMPAS with 28,509 collected known membrane proteins and 8,076 newly predicted candidates. This system provides integrated information on human membrane proteins individually and in groups organized by 45 subcellular locations and 1,401 molecular functions. As a case study, we identified associations between the membrane proteins and diseases and show that membrane proteins are promising targets for diseases related to the nervous system and circulatory system. A web-based interface to this system was constructed to allow researchers not only to retrieve organized information on individual proteins but also to use the tools to analyze the membrane proteins.
Conclusions HMPAS provides comprehensive information about human membrane proteins including specific features of certain membrane protein groups. In this system, users can acquire information on individual proteins and specified groups focused on their conserved sequence features, involved cellular processes, and diseases. HMPAS may serve as a valuable resource for the inference of novel cellular mechanisms and pharmaceutical targets associated with the human membrane proteins. HMPAS is freely available at http://fcode.kaist.ac.kr/hmpas.
|
27
|
Zhong J, Wang J, Peng W, Zhang Z, Pan Y. Prediction of essential proteins based on gene expression programming. BMC Genomics 2013; 14 Suppl 4:S7. [PMID: 24267033 PMCID: PMC3856491 DOI: 10.1186/1471-2164-14-s4-s7] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Background Essential proteins are indispensable for cell survival. Identifying essential proteins is very important for improving our understanding of how a cell works. There are various types of features related to the essentiality of proteins. Many methods have been proposed to combine some of them to predict essential proteins. However, it is still a big challenge to design an effective method that predicts them by integrating different features and explains how the selected features determine the essentiality of a protein. Gene expression programming (GEP) is a learning algorithm that learns relationships between variables in sets of data and then builds models to explain these relationships. Results In this work, we propose a GEP-based method to predict essential proteins by combining some biological features and topological features. We carry out experiments on S. cerevisiae data. The experimental results show that our method achieves better prediction performance than methods using individual features. Moreover, our method outperforms some machine learning methods and performs as well as a method obtained by combining the outputs of eight machine learning methods. Conclusions The accuracy of predicting essential proteins can be improved by using the GEP method to combine some topological features and biological features.
|
28
|
Xu D. Protein databases on the internet. CURRENT PROTOCOLS IN PROTEIN SCIENCE 2012; Chapter 2:2.6.1-2.6.17. [PMID: 23151744 DOI: 10.1002/0471140864.ps0206s70] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
Abstract
Protein databases have become a crucial part of modern biology. Huge amounts of data for protein structures, functions, and particularly sequences are being generated. Searching databases is often the first step in the study of a new protein. Comparison between proteins or between protein families provides information about the relationship between proteins within a genome or across different species, and hence offers much more information than can be obtained by studying only an isolated protein. In addition, secondary databases derived from experimental databases are also widely available. These databases reorganize and annotate the data or provide predictions. The use of multiple databases often helps researchers understand the structure and function of a protein. Although some protein databases are widely known, they are far from being fully utilized in the protein science community. This unit provides a starting point for readers to explore the potential of protein databases on the Internet.
Affiliation(s)
- Dong Xu
- Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri
|
29
|
Barsan C, Zouine M, Maza E, Bian W, Egea I, Rossignol M, Bouyssie D, Pichereaux C, Purgatto E, Bouzayen M, Latché A, Pech JC. Proteomic analysis of chloroplast-to-chromoplast transition in tomato reveals metabolic shifts coupled with disrupted thylakoid biogenesis machinery and elevated energy-production components. PLANT PHYSIOLOGY 2012; 160:708-25. [PMID: 22908117 PMCID: PMC3461550 DOI: 10.1104/pp.112.203679] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/26/2012] [Accepted: 08/16/2012] [Indexed: 05/18/2023]
Abstract
A comparative proteomic approach was performed to identify differentially expressed proteins in plastids at three stages of tomato (Solanum lycopersicum) fruit ripening (mature-green, breaker, red). Stringent curation and processing of the data from three independent replicates identified 1,932 proteins among which 1,529 were quantified by spectral counting. The quantification procedures have been subsequently validated by immunoblot analysis of six proteins representative of distinct metabolic or regulatory pathways. Among the main features of the chloroplast-to-chromoplast transition revealed by the study, chromoplastogenesis appears to be associated with major metabolic shifts: (1) strong decrease in abundance of proteins of light reactions (photosynthesis, Calvin cycle, photorespiration) and carbohydrate metabolism (starch synthesis/degradation), mostly between breaker and red stages and (2) increase in terpenoid biosynthesis (including carotenoids) and stress-response proteins (ascorbate-glutathione cycle, abiotic stress, redox, heat shock). These metabolic shifts are preceded by the accumulation of plastid-encoded acetyl Coenzyme A carboxylase D proteins accounting for the generation of a storage matrix that will accumulate carotenoids. Of particular note is the high abundance of proteins involved in providing energy and in metabolites import. Structural differentiation of the chromoplast is characterized by a sharp and continuous decrease of thylakoid proteins whereas envelope and stroma proteins remain remarkably stable. This is coincident with the disruption of the machinery for thylakoids and photosystem biogenesis (vesicular trafficking, provision of material for thylakoid biosynthesis, photosystems assembly) and the loss of the plastid division machinery. Altogether, the data provide new insights on the chromoplast differentiation process while enriching our knowledge of the plant plastid proteome.
Affiliation(s)
- Isabel Egea
- Michel Rossignol
- David Bouyssie
- Carole Pichereaux
- Eduardo Purgatto
- Mondher Bouzayen
- Alain Latché
- Jean-Claude Pech
- Université de Toulouse, Institut National Polytechnique-Ecole Nationale Supérieure Agronomique de Toulouse, Génomique et Biotechnologie des Fruits, Castanet-Tolosan F–31326, France (C.B., M.Z., E.M., W.B., I.E., M.B., A.L., J.-C.P.); Institut National de la Recherche Agronomique, Génomique et Biotechnologie des Fruits, Chemin de Borde Rouge, Castanet-Tolosan F–31326, France (C.B., M.Z., E.M., W.B., I.E., M.B., A.L., J.-C.P.); Fédération de Recherche 3450, Agrobiosciences, Interactions et Biodiversités, Plateforme Protéomique Génopole Toulouse Midi-Pyrénées, Institut de Pharmacologie et de Biologie Structurale, Centre National de la Recherche Scientifique, F–31077 Toulouse, France (M.R., C.P.); Université de Toulouse, Université Paul Sabatier, Institut de Pharmacologie et de Biologie Structurale, Toulouse F–31077, France (M.R., D.B., C.P.); and Universidade de São Paulo, Faculdade de Ciências Farmacêuticas, Depto. de Alimentos e Nutrição Experimental, 05508–000 São Paulo, Brazil (E.P.)
|
30
|
Zeng T, Chen L. Tracing dynamic biological processes during phase transition. BMC SYSTEMS BIOLOGY 2012; 6 Suppl 1:S12. [PMID: 23046764 PMCID: PMC3403121 DOI: 10.1186/1752-0509-6-s1-s12] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Background Phase transition widely exists in the biological world, such as transformation of cell cycle phases, cell differentiation stages, disease development, and so on. Such a nonlinear phenomenon is considered as the conversion of a biological system from one phenotype/state to another. Studies on the molecular mechanisms of biological phase transition have attracted much attention, in particular on different genotypes (or expression variations) in a specific phase, but with less focus on cascade changes of genes' functions (or system state) during the phase shift or transition process. However, it is a fundamental and important task to trace the temporal characteristics of a biological system during a specific phase transition process, which can offer clues for understanding dynamic behaviors of living organisms. Results By overcoming the hurdles of traditional time segmentation and temporal biclustering methods, a causal process model (CPM) is proposed in the present work to study biological phase transition in a systematic manner: first, we make a gene-specific segmentation of time-course expression data by developing a new boundary gene estimation scheme, and then we infer functional cascade dynamics by constructing a temporal block network. After computational validation on synthetic data, CPM was used to analyze the well-known yeast cell cycle data. It was found that the dynamics of the boundary genes are periodic and consistent with the phases of the cell cycle, and the temporal block network indeed demonstrates a meaningful cascade structure of the enriched biological functions. In addition, we further studied protein modules based on the temporal block network, which reflect temporal features in different cycles.
Conclusions All of these results demonstrate that CPM is effective and efficient compared with traditional methods, and is able to elucidate the essential regulatory mechanisms of a biological system even with complicated nonlinear phase transitions.
Affiliation(s)
- Tao Zeng
- Key Laboratory of Systems Biology, SIBS-Novo Nordisk Translational Research Centre for PreDiabetes, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
|
31
|
Azad MA, Wright GD. Determining the mode of action of bioactive compounds. Bioorg Med Chem 2012; 20:1929-39. [DOI: 10.1016/j.bmc.2011.10.088] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2011] [Revised: 10/14/2011] [Accepted: 10/30/2011] [Indexed: 10/14/2022]
|
32
|
Affiliation(s)
- Dong Xu
- Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri
|
33
|
Kinjo AR, Kumagai Y, Dinh H, Takeuchi O, Standley DM. Functional characterization of protein domains common to animal viruses and mouse. BMC Genomics 2011; 12 Suppl 3:S21. [PMID: 22369715 PMCID: PMC3333181 DOI: 10.1186/1471-2164-12-s3-s21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Background Many viruses contain genes that originate from their hosts. Some of these acquired genes give viruses the ability to interfere with host immune responses by various mechanisms. Genes of host origin that appear commonly in viruses code for proteins that span a wide range of functions, from kinases and phosphatases, to cytokines and their receptors, to ubiquitin ligases and proteases. While many important cases of such lateral gene transfer in viruses have been documented, there has yet to be a genome-wide survey of viral-encoded genes acquired from animal hosts. Results Here we carry out such a survey in order to gain insight into the host immune system. We made the results available in the form of a web-based tool that allows viral-centered or host-centered queries to be performed (http://imm.ifrec.osaka-u.ac.jp/musvirus/). We examine the relationship between acquired genes and immune function, and compare host-virus homology with gene expression data in stimulated dendritic cells and T-cells. We found that genes whose expression changes significantly during the innate antiviral immune response had more homologs in animal viruses than genes whose expression did not change or genes involved in the adaptive immune response. Conclusions Statistics gathered from the MusVirus database support earlier reports of gene transfer from host to virus and indicate that viruses are more likely to acquire genes involved in innate antiviral immune responses than those involved in acquired immune responses.
Affiliation(s)
- Akira R Kinjo
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan.
|
34
|
Polyakov NB, Slizhikova DK, Izmalkova MY, Cherepanova NI, Kazakov VS, Rogova MA, Zhukova NA, Alexeev DG, Bazaleev NA, Skripnikov AY, Govorun VM. Proteome analysis of chloroplasts from the moss Physcomitrella patens (Hedw.) B.S.G. BIOCHEMISTRY (MOSCOW) 2011; 75:1470-83. [PMID: 21314618 DOI: 10.1134/s0006297910120084] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
Intact chloroplasts were prepared from protoplasts of the moss Physcomitrella patens according to a specially developed method. They were additionally separated into stroma and thylakoid fractions. The proteomes of intact plastids, stroma, and thylakoids were analyzed by 1D-electrophoresis under denaturing conditions followed by protein digestion and nano-LC-ESI-MS/MS of tryptic peptides from gel bands. A total of 624 unique proteins were identified, 434 of which were annotated as chloroplast resident proteins. The largest groups were photosynthetic proteins (21.3%) and proteins implicated in protein degradation, posttranslational modification, folding, and import (20.6%). Among proteins assigned to chloroplasts, the following groups are also prominent: proteins implicated in the metabolism of amino acids (6.9%), nucleotides (2.5%), lipids (2.2%), carbohydrates (2.4%), hormones (1.5%), isoprenoids (1.25%), vitamins and cofactors (1%), sulfur (1.25%), and nitrogen (1%); as well as proteins involved in the pentose-phosphate cycle (1.75%), tetrapyrrole synthesis (3.7%), and redox processes (3.6%). The data can be used in physiological and photobiological studies as well as in further studies of the P. patens chloroplast proteome, including structural and functional specifics of plant protein localization in organelles.
Affiliation(s)
- N B Polyakov
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow 117997, Russia.
|
35
|
Bukurova YA, Nikitina IG, Khankin SL, Krasnov GS, Lisitsyn NA, Karpov VL, Beresten SF. Search for protein markers for serum diagnostics of tumors by analysis of microRNA expression profiles. Mol Biol 2011. [DOI: 10.1134/s0026893311020038] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
|
36
|
Hadi-Alijanvand H, Rouhani M, Proctor EA, Dokholyan NV, Moosavi-Movahedi AA. A folding pathway-dependent score to recognize membrane proteins. PLoS One 2011; 6:e16778. [PMID: 21390303 PMCID: PMC3046963 DOI: 10.1371/journal.pone.0016778] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2010] [Accepted: 12/29/2010] [Indexed: 12/11/2022] Open
Abstract
While various approaches exist to study protein localization, it is still a challenge to predict where proteins localize. Here, we consider a mechanistic viewpoint for membrane localization. Taking into account the steps of the folding pathway of α-helical membrane proteins and relating biophysical parameters to each of these steps, we create a score capable of predicting the propensity for membrane localization and call it FP(3)mem. This score is derived from the principal component analysis (PCA) of the biophysical parameters related to membrane localization. FP(3)mem allows us to rationalize the colocalization of a number of channel proteins with the Cav1.2 channel by their lower propensities for membrane localization.
Affiliation(s)
- Maryam Rouhani
- Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Elizabeth A. Proctor
- Genetics Medicine, Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Nikolay V. Dokholyan
- Genetics Medicine, Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
|
37
|
Wang J, Li C, Wang E, Wang X. An FPT approach for predicting protein localization from yeast genomic data. PLoS One 2011; 6:e14449. [PMID: 21283516 PMCID: PMC3023707 DOI: 10.1371/journal.pone.0014449] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2010] [Accepted: 12/01/2010] [Indexed: 11/18/2022] Open
Abstract
Accurately predicting the localization of proteins is of paramount importance in the quest to determine their respective functions within the cellular compartment. Because of the continuous and rapid progress in the fields of genomics and proteomics, more data are available now than ever before. Coincidentally, data mining methods have been developed and refined in order to handle this experimental windfall, thus allowing the scientific community to quantitatively address long-standing questions such as that of protein localization. Here, we develop a frequent pattern tree (FPT) approach to generate a minimum set of rules (mFPT) for predicting protein localization. We acquire a series of rules according to the features of yeast genomic data. The mFPT prediction accuracy is benchmarked against other commonly used methods such as Bayesian networks and logistic regression under various statistical measures. Our results show that mFPT gave better performance than other approaches in predicting protein localization. Meanwhile, setting 0.65 as the minimum hit-rate, we obtained 138 proteins that mFPT predicted differently than the simple naive Bayesian method (SNB). In our analysis of these 138 proteins, we present novel predictions of the location of 17 proteins, which currently do not have any defined localization. These predictions can serve as putative annotations and should provide preliminary clues for experimentalists. We also compared our predictions against the eukaryotic subcellular localization database and related predictions by others on protein localization. Our method is quite general and can thus be applied to discover the underlying rules for protein-protein interactions, genomic interactions, and structure-function relationships, as well as those of other fields of research.
Affiliation(s)
- Jin Wang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
- Departments of Chemistry, Physics and Astronomy, State University of New York at Stony Brook, Stony Brook, New York, United States of America
- Chunhe Li
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
- Graduate School of the Chinese Academy of Sciences, Beijing, China
- Erkang Wang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, China
|
38
|
Rastogi S, Rost B. LocDB: experimental annotations of localization for Homo sapiens and Arabidopsis thaliana. Nucleic Acids Res 2010; 39:D230-4. [PMID: 21071420 PMCID: PMC3013784 DOI: 10.1093/nar/gkq927] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023] Open
Abstract
LocDB is a manually curated database with experimental annotations for the subcellular localizations of proteins in Homo sapiens (HS, human) and Arabidopsis thaliana (AT, thale cress). Currently, it contains entries for 19 604 UniProt proteins (HS: 13 342; AT: 6262). Each database entry contains the experimentally derived localization in Gene Ontology (GO) terminology, the experimental annotation of localization, localization predictions by state-of-the-art methods and, where available, the type of experimental information. LocDB is searchable by keyword, protein name and subcellular compartment, as well as by identifiers from UniProt, Ensembl and TAIR resources. In comparison to other public databases, LocDB as a resource adds about 10 000 experimental localization annotations for HS proteins and ∼900 for AT proteins. Over 40% of the proteins in LocDB have multiple localization annotations, providing a better platform for the development of new multiple localization prediction methods with higher coverage and accuracy. Links to all referenced databases are provided. LocDB will be updated regularly by our group (available at: http://www.rostlab.org/services/locDB).
Affiliation(s)
- Shruti Rastogi
- Department of Biochemistry and Molecular Biophysics, Columbia University, 701 West, 168th Street, New York, NY 10032, USA.
|
39
|
Kim IS, Lim KJ, Han BG, Chung MG, Kim KW. AKAPDB: A-Kinase Anchoring Proteins Database. Genomics Inform 2010. [DOI: 10.5808/gi.2010.8.2.090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
|
40
|
Protein interactions and ligand binding: from protein subfamilies to functional specificity. Proc Natl Acad Sci U S A 2010; 107:1995-2000. [PMID: 20133844 DOI: 10.1073/pnas.0908044107] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
The divergence accumulated during the evolution of protein families translates into their internal organization as subfamilies, and it is directly reflected in the characteristic patterns of differentially conserved residues. These specifically conserved positions in protein subfamilies are known as "specificity determining positions" (SDPs). Previous studies have limited their analysis to the study of the relationship between these positions and ligand-binding specificity, demonstrating significant yet limited predictive capacity. We have systematically extended this observation to include the role of differential protein interactions in the segregation of protein subfamilies and explored in detail the structural distribution of SDPs at protein interfaces. Our results show the extensive influence of protein interactions in the evolution of protein families and the widespread association of SDPs with protein interfaces. The combined analysis of SDPs in interfaces and ligand-binding sites provides a more complete picture of the organization of protein families, constituting the necessary framework for a large scale analysis of the evolution of protein function.
|
41
|
Terentiev AA, Moldogazieva NT, Shaitan KV. Dynamic proteomics in modeling of the living cell. Protein-protein interactions. BIOCHEMISTRY (MOSCOW) 2010; 74:1586-607. [DOI: 10.1134/s0006297909130112] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 11/23/2022]
|
42
|
Mano S, Miwa T, Nishikawa SI, Mimura T, Nishimura M. Seeing is believing: on the use of image databases for visually exploring plant organelle dynamics. PLANT & CELL PHYSIOLOGY 2009; 50:2000-2014. [PMID: 19755394 DOI: 10.1093/pcp/pcp128] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 05/26/2023]
Abstract
Organelle dynamics vary dramatically depending on cell type, developmental stage and environmental stimuli, so that various parameters, such as size, number and behavior, are required for the description of the dynamics of each organelle. Imaging techniques are superior to other techniques for describing organelle dynamics because these parameters are visually exhibited. Therefore, as the results can be seen immediately, investigators can more easily grasp organelle dynamics. At present, imaging techniques are emerging as fundamental tools in plant organelle research, and the development of new methodologies to visualize organelles and the improvement of analytical tools and equipment have allowed the large-scale generation of image and movie data. Accordingly, image databases that accumulate information on organelle dynamics are an increasingly indispensable part of modern plant organelle research. In addition, image databases are potentially rich data sources for computational analyses, as image and movie data reposited in the databases contain valuable and significant information, such as size, number, length and velocity. Computational analytical tools support image-based data mining, such as segmentation, quantification and statistical analyses, to extract biologically meaningful information from each database and combine them to construct models. In this review, we outline the image databases that are dedicated to plant organelle research and present their potential as resources for image-based computational analyses.
Affiliation(s)
- Shoji Mano
- Department of Cell Biology, National Institute for Basic Biology, Okazaki, 444-8585, Japan
|
43
|
Baginsky S. Plant proteomics: concepts, applications, and novel strategies for data interpretation. MASS SPECTROMETRY REVIEWS 2009; 28:93-120. [PMID: 18618656 DOI: 10.1002/mas.20183] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 05/08/2023]
Abstract
Proteomics is an essential source of information about biological systems because it generates knowledge about the concentrations, interactions, functions, and catalytic activities of proteins, which are the major structural and functional determinants of cells. In the last few years significant technology development has taken place both at the level of data analysis software and mass spectrometry hardware. Conceptual progress in proteomics has made possible the analysis of entire proteomes at previously unprecedented density and accuracy. New concepts have emerged that comprise quantitative analyses of full proteomes, database-independent protein identification strategies, targeted quantitative proteomics approaches with proteotypic peptides and the systematic analysis of an increasing number of posttranslational modifications at high temporal and spatial resolution. Although plant proteomics is making progress, there are still several analytical challenges that await experimental and conceptual solutions. With this review I will highlight the current status of plant proteomics and put it into the context of the aforementioned conceptual progress in the field, illustrate some of the plant-specific challenges and present my view on the great opportunities for plant systems biology offered by proteomics.
Affiliation(s)
- Sacha Baginsky
- Institute of Plant Sciences, Swiss Federal Institute of Technology, Universitätsstrasse 2, 8092 Zurich, Switzerland.
|
44
|
Wong P, Althammer S, Hildebrand A, Kirschner A, Pagel P, Geissler B, Smialowski P, Blöchl F, Oesterheld M, Schmidt T, Strack N, Theis FJ, Ruepp A, Frishman D. An evolutionary and structural characterization of mammalian protein complex organization. BMC Genomics 2008; 9:629. [PMID: 19108706 PMCID: PMC2645396 DOI: 10.1186/1471-2164-9-629] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 07/17/2008] [Accepted: 12/23/2008] [Indexed: 12/25/2022]
Abstract
Background: We have recently released a comprehensive, manually curated database of mammalian protein complexes called CORUM. Combining CORUM with other resources, we assembled a dataset of over 2700 mammalian complexes. The availability of a rich information resource allows us to search for organizational properties concerning these complexes. Results: As the complexity of a protein complex in terms of the number of unique subunits increases, we observed that the number of such complexes and the mean non-synonymous to synonymous substitution ratio of associated genes tend to decrease. Similarly, as the number of different complexes a given protein participates in increases, the number of such proteins and the substitution ratio of the associated gene also tend to decrease. These observations provide evidence relating natural selection and the organization of mammalian complexes. We also observed greater homogeneity in terms of predicted protein isoelectric points, secondary structure and substitution ratio in annotated versus randomly generated complexes. A large proportion of the protein content and interactions in the complexes could be predicted from known binary protein-protein and domain-domain interactions. In particular, we found that large proteins interact preferentially with much smaller proteins. Conclusion: We observed similar trends in yeast and other data. Our results support the existence of conserved relations associated with the mammalian protein complexes.
Affiliation(s)
- Philip Wong
- Helmholtz Center Munich-German Research Center for Environmental Health (GmbH), Institute of Bioinformatics and Systems Biology, Ingolstädter Landstrasse 1, Neuherberg, Germany.
|
45
|
Yellaboina S, Dudekula DB, Ko MS. Prediction of evolutionarily conserved interologs in Mus musculus. BMC Genomics 2008; 9:465. [PMID: 18842131 PMCID: PMC2571111 DOI: 10.1186/1471-2164-9-465] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 05/05/2008] [Accepted: 10/08/2008] [Indexed: 12/03/2022]
Abstract
Background: Identification of protein-protein interactions is an important first step toward understanding living systems. High-throughput experimental approaches have accumulated a large amount of information on protein-protein interactions in human and other model organisms. Such interaction information has been successfully transferred to other species in which the experimental data are limited. However, the annotation transfer method can yield false-positive interologs, owing to the lack of conservation of interactions, when applied to phylogenetically distant organisms. Results: To address this issue, we used the phylogenetic profile method to filter false positives in interologs, based on the notion that evolutionarily conserved interactions show similar patterns of occurrence across genomes. The approach was applied to Mus musculus, for which the experimentally identified interactions are limited. We first inferred the protein-protein interactions in Mus musculus by using two approaches: i) identifying mouse orthologs of interacting proteins (interologs) based on the experimental protein-protein interaction data from other organisms; and ii) analyzing the frequency of mouse ortholog co-occurrence in predicted operons of bacteria. We then filtered possible false positives in the predicted interactions using the phylogenetic profiles. We found that this filtering method significantly increased the frequency of interacting protein pairs coexpressed in the same cells/tissues in the Gene Expression Omnibus (GEO) database, as well as the frequency of interacting protein pairs sharing similar Gene Ontology (GO) terms for biological processes and cellular localizations. The data support the notion that phylogenetic profiles help to reduce the number of false positives in interologs. Conclusion: We have developed a protein-protein interaction database for mouse, which contains 41109 interologs. We have also developed a web interface to facilitate the use of the database.
Affiliation(s)
- Sailu Yellaboina
- Developmental Genomics and Aging Section, Laboratory of Genetics, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA.
|
46
|
Zhang S, Xia X, Shen J, Zhou Y, Sun Z. DBMLoc: a Database of proteins with multiple subcellular localizations. BMC Bioinformatics 2008; 9:127. [PMID: 18304364 PMCID: PMC2292141 DOI: 10.1186/1471-2105-9-127] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Received: 07/22/2007] [Accepted: 02/28/2008] [Indexed: 11/19/2022]
Abstract
Background: Subcellular localization information is a key feature for protein function research. Localization to a specific subcellular compartment is essential for a protein to function efficiently. Proteins with multiple localizations provide more clues. Such proteins may make up a high proportion of all proteins, possibly more than 35%. Description: We have developed a database of proteins with multiple subcellular localizations, designated DBMLoc. The initial release contains 10470 multiple subcellular localization-annotated entries. Annotations are collected from primary protein databases, specific subcellular localization databases and literature texts. All the protein entries are cross-referenced to GO annotations and SwissProt. Protein-protein interactions are also annotated. The proteins are classified into 12 large subcellular localization categories based on the GO hierarchical architecture and the original annotations. Download, search and sequence BLAST tools are also available on the website. Conclusion: DBMLoc is a protein database that collects proteins with more than one subcellular localization annotation. It is freely accessed at .
Affiliation(s)
- Song Zhang
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Biomembrane and Membrane Biotechnology, Department of Biological Sciences and Biotechnology, Tsinghua University, Beijing 100084, China.
|
47
|
Abstract
Protein databases have become a crucial part of modern biology. Huge amounts of data for protein structures, functions, and particularly sequences are being generated. Searching databases is often the first step in the study of a new protein. Comparison between proteins and between protein families in databases provides information about the relationship between proteins within a genome or across different species, and hence offers much more information than can be obtained by studying only an isolated protein. In addition, secondary databases derived from experimental databases are also widely available. These databases reorganize and annotate the data or provide predictions. The use of multiple databases often helps researchers understand the structure and function of proteins. Although some protein databases are widely known, they are far from being fully utilized in the protein science community. This unit provides a starting point for readers to explore the potential of protein databases on the Internet.
Affiliation(s)
- Dong Xu
- Digital Biology Laboratory, University of Missouri-Columbia, Columbia, Missouri, USA
|
48
|
Mano S, Miwa T, Nishikawa SI, Mimura T, Nishimura M. The plant organelles database (PODB): a collection of visualized plant organelles and protocols for plant organelle research. Nucleic Acids Res 2008; 36:D929-37. [PMID: 17932059 PMCID: PMC2238956 DOI: 10.1093/nar/gkm789] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 08/13/2007] [Revised: 09/14/2007] [Accepted: 09/17/2007] [Indexed: 11/12/2022]
Abstract
The plant organelles database (PODB; http://podb.nibb.ac.jp/Organellome) was built to promote a comprehensive understanding of organelle dynamics, including organelle function, biogenesis, differentiation, movement and interactions with other organelles. This database consists of three individual parts, the organellome database, the functional analysis database and external links to other databases and homepages. The organellome database provides images of various plant organelles that were visualized with fluorescent and nonfluorescent probes in various tissues of several plant species at different developmental stages. The functional analysis database is a collection of protocols for plant organelle research. External links give access primarily to other databases and Web pages with information on transcriptomes and proteomes. All the data and protocols in the organellome database and the functional analysis database are populated by direct submission of experimentally determined data from plant researchers and can be freely downloaded. Our database promotes the exchange of information between plant organelle researchers for the comprehensive study of the organelle dynamics that support integrated functions in higher plants. We would also appreciate contributions of data and protocols from all plant researchers to maximize the usefulness of the database.
Affiliation(s)
- Shoji Mano, Tomoki Miwa, Shuh-ichi Nishikawa, Tetsuro Mimura, Mikio Nishimura
- Department of Cell Biology, National Institute for Basic Biology, Department of Basic Biology, School of Life Science, The Graduate University for Advanced Studies, Computer Laboratory, National Institute for Basic Biology, Okazaki 444-8585, Graduate School of Science, Nagoya University, Nagoya 464-8602 and Department of Biology, Faculty of Science, Kobe University, Kobe 657-8501, Japan
|