1
Ullah M, Hadi F, Song J, Yu DJ. PScL-2LSAESM: bioimage-based prediction of protein subcellular localization by integrating heterogeneous features with the two-level SAE-SM and mean ensemble method. Bioinformatics 2023; 39:6839969. [PMID: 36413068 PMCID: PMC9947927 DOI: 10.1093/bioinformatics/btac727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 11/02/2022] [Accepted: 11/21/2022] [Indexed: 11/23/2022] Open
Abstract
MOTIVATION Over the past decades, a variety of in silico methods have been developed to predict protein subcellular localization within cells. However, a common and major challenge in the design and development of such methods is how to effectively utilize the heterogeneous feature sets extracted from bioimages. In this regard, limited efforts have been undertaken. RESULTS We propose a new two-level stacked autoencoder network (termed 2L-SAE-SM) to improve prediction performance by integrating the heterogeneous feature sets. In particular, in the first level of 2L-SAE-SM, each optimal heterogeneous feature set is fed to train our designed stacked autoencoder network (SAE-SM). All the trained SAE-SMs in the first level can output the decision sets based on their respective optimal heterogeneous feature sets, known as 'intermediate decision' sets. Such intermediate decision sets are then ensembled using the mean ensemble method to generate the 'intermediate feature' set for the second-level SAE-SM. Using the proposed framework, we further develop a novel predictor, referred to as PScL-2LSAESM, to characterize image-based protein subcellular localization. Extensive benchmarking experiments on the latest benchmark training and independent test datasets collected from the Human Protein Atlas databank demonstrate the effectiveness of the proposed 2L-SAE-SM framework for the integration of heterogeneous feature sets. Moreover, performance comparison of the proposed PScL-2LSAESM with current state-of-the-art methods further illustrates that PScL-2LSAESM clearly outperforms the existing state-of-the-art methods for the task of protein subcellular localization. AVAILABILITY AND IMPLEMENTATION https://github.com/csbio-njust-edu/PScL-2LSAESM. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
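The mean ensemble step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each first-level model outputs one per-class score vector for every sample; the function name `mean_ensemble` and the toy numbers are hypothetical and are not taken from the PScL-2LSAESM code base.

```python
from typing import List


def mean_ensemble(decision_sets: List[List[List[float]]]) -> List[List[float]]:
    """Average per-class scores across first-level models.

    decision_sets[m][i][c] is the score that model m assigns to class c
    for sample i. The result holds one averaged score vector per sample,
    which plays the role of the 'intermediate feature' set.
    """
    if not decision_sets:
        raise ValueError("need at least one decision set")
    n_models = len(decision_sets)
    n_samples = len(decision_sets[0])
    n_classes = len(decision_sets[0][0]) if n_samples else 0
    # Every model must score the same samples and the same classes.
    for ds in decision_sets:
        if len(ds) != n_samples or any(len(row) != n_classes for row in ds):
            raise ValueError("all decision sets must share one shape")
    return [
        [sum(ds[i][c] for ds in decision_sets) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]


# Toy example (made-up scores): two models, two samples, two classes.
model_a = [[0.75, 0.25], [0.5, 0.5]]
model_b = [[0.25, 0.75], [1.0, 0.0]]
intermediate_features = mean_ensemble([model_a, model_b])
```

In this sketch the averaged vectors would then serve as input to a second-level classifier; how the actual predictor trains that level is not specified here.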
Affiliation(s)
- Matee Ullah
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Fazal Hadi
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Dong-Jun Yu
- To whom correspondence should be addressed.
2
Wang G, Xue MQ, Shen HB, Xu YY. Learning protein subcellular localization multi-view patterns from heterogeneous data of imaging, sequence and networks. Brief Bioinform 2022; 23:6499983. [PMID: 35018423 DOI: 10.1093/bib/bbab539] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/08/2021] [Revised: 11/03/2021] [Accepted: 11/20/2021] [Indexed: 11/13/2022] Open
Abstract
Location proteomics seeks to provide automated high-resolution descriptions of protein location patterns within cells. Many efforts have been undertaken in location proteomics over the past decades, thereby producing plenty of automated predictors for protein subcellular localization. However, most of these predictors are trained solely on high-throughput microscopic images or on protein amino acid sequences. The unification of heterogeneous protein data sources has yet to be exploited. In this paper, we present a pipeline called sequence, image, network-based protein subcellular locator (SIN-Locator) that constructs a multi-view description of proteins by integrating multiple data types, including images of protein expression in cells or tissues, amino acid sequences and protein-protein interaction networks, to classify the patterns of protein subcellular locations. Proteins were encoded by both handcrafted features and deep learning features, and multiple combining methods were implemented. Our experimental results indicated that optimal integrations can considerably enhance the classification accuracy, and the utility of SIN-Locator has been demonstrated by applying it to newly released proteins in the Human Protein Atlas. Furthermore, we also investigate the contribution of different data sources and the influence of the partial absence of data. This work is anticipated to provide clues for reconciliation and combination of multi-source data for protein location analysis.
Affiliation(s)
- Ge Wang
- School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China
- Min-Qi Xue
- School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China; School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Ying-Ying Xu
- School of Biomedical Engineering and Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China
3
Donovan-Maiye RM, Brown JM, Chan CK, Ding L, Yan C, Gaudreault N, Theriot JA, Maleckar MM, Knijnenburg TA, Johnson GR. A deep generative model of 3D single-cell organization. PLoS Comput Biol 2022; 18:e1009155. [PMID: 35041651 PMCID: PMC8797242 DOI: 10.1371/journal.pcbi.1009155] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/19/2021] [Revised: 01/28/2022] [Accepted: 11/29/2021] [Indexed: 11/18/2022] Open
Abstract
We introduce a framework for end-to-end integrative modeling of 3D single-cell multi-channel fluorescent image data of diverse subcellular structures. We employ stacked conditional β-variational autoencoders to first learn a latent representation of cell morphology, and then learn a latent representation of subcellular structure localization which is conditioned on the learned cell morphology. Our model is flexible and can be trained on images of arbitrary subcellular structures and at varying degrees of sparsity and reconstruction fidelity. We train our full model on 3D cell image data and explore design trade-offs in the 2D setting. Once trained, our model can be used to predict plausible locations of structures in cells where these structures were not imaged. The trained model can also be used to quantify the variation in the location of subcellular structures by generating plausible instantiations of each structure in arbitrary cell geometries. We apply our trained model to a small drug perturbation screen to demonstrate its applicability to new data. We show how the latent representations of drugged cells differ from unperturbed cells as expected by on-target effects of the drugs.
Affiliation(s)
- Jackson M. Brown
- Allen Institute for Cell Science, Seattle, Washington, United States of America
- Caleb K. Chan
- Allen Institute for Cell Science, Seattle, Washington, United States of America
- Liya Ding
- Allen Institute for Cell Science, Seattle, Washington, United States of America
- Calysta Yan
- Allen Institute for Cell Science, Seattle, Washington, United States of America
- Nathalie Gaudreault
- Allen Institute for Cell Science, Seattle, Washington, United States of America
- Julie A. Theriot
- Allen Institute for Cell Science, Seattle, Washington, United States of America
- Department of Biology and Howard Hughes Medical Institute, University of Washington, Seattle, Washington, United States of America
- Mary M. Maleckar
- Allen Institute for Cell Science, Seattle, Washington, United States of America
- Theo A. Knijnenburg
- Allen Institute for Cell Science, Seattle, Washington, United States of America
- Gregory R. Johnson
- Allen Institute for Cell Science, Seattle, Washington, United States of America
4
Mirzaei Mehrabad E, Hassanzadeh R, Eslahchi C. PMLPR: A novel method for predicting subcellular localization based on recommender systems. Sci Rep 2018; 8:12006. [PMID: 30104743 PMCID: PMC6089892 DOI: 10.1038/s41598-018-30394-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/25/2016] [Accepted: 07/30/2018] [Indexed: 12/16/2022] Open
Abstract
The importance of the protein subcellular localization problem stems from the importance of proteins' functions in different parts of the cell. Moreover, prediction of subcellular locations helps to identify potential molecular targets for drugs and has an important role in genome annotation. Most of the existing prediction methods assign only one location to each protein. However, since some proteins move between different subcellular locations, they can have multiple locations. In recent years, some multiple-location predictors have been introduced. However, they are not accurate enough and there is much room for improvement. In this paper, we introduce a method, PMLPR, to predict locations for a protein. PMLPR predicts a list of locations for each protein based on recommender systems, and it can properly overcome the multiple-location prediction problem. To evaluate the performance of PMLPR, we considered six datasets: RAT, FLY, HUMAN, Du et al., DBMLoc and Höglund. The performance of this algorithm is compared with six state-of-the-art algorithms: YLoc, WOLF-PSORT, prediction channel, MDLoc, Du et al. and MultiLoc2-HighRes. The results indicate that our proposed method is significantly superior on RAT and FLY proteins, and decent on HUMAN proteins. Moreover, on the datasets introduced by Du et al., DBMLoc and Höglund, PMLPR has comparable results. For the case study, we applied the algorithms to 8 proteins which are important in cancer research. The results of the comparison with other methods indicate the efficiency of PMLPR.
Affiliation(s)
- Elnaz Mirzaei Mehrabad
- Department of Computer Science, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
- Reza Hassanzadeh
- Department of Engineering Sciences, Faculty of Advanced Technologies, University of Mohaghegh Ardabili, Namin, Iran
- Department of Bioinformatics, Faculty of Computer Engineering and Information Technology, Sabalan University of Advanced Technologies (SUAT), Namin, Iran
- Changiz Eslahchi
- Department of Computer Science, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran.
- School of Biological Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.
5
Zhang SW, Liu YF, Yu Y, Zhang TH, Fan XN. MSLoc-DT: A new method for predicting the protein subcellular location of multispecies based on decision templates. Anal Biochem 2014; 449:164-71. [DOI: 10.1016/j.ab.2013.12.013] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 07/21/2013] [Revised: 11/08/2013] [Accepted: 12/12/2013] [Indexed: 12/12/2022]
6
Liu S, Mundra PA, Rajapakse JC. Features for cells and nuclei classification. Annu Int Conf IEEE Eng Med Biol Soc 2012; 2011:6601-4. [PMID: 22255852 DOI: 10.1109/iembs.2011.6091628] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
The performance of automated analysis of cellular images is heavily influenced by the features that characterize cells or cell nuclei. In this paper, an exhaustive set of features, including morphological, topological, and texture features, is explored to determine the optimal features for classification of cells and cell nuclei. The optimal subset of features is obtained using popular feature selection methods. The results of feature selection indicate that Zernike moments, Daubechies wavelets, and Gabor wavelets give the most important features for the classification of cells or cell nuclei in fluorescent microscopy images.
Affiliation(s)
- Song Liu
- BioInformatics Research Centre, School of Computer Engineering, Nanyang Technological University, Singapore.
7
Bockhorst JP, Conroy JM, Agarwal S, O’Leary DP, Yu H. Beyond captions: linking figures with abstract sentences in biomedical articles. PLoS One 2012; 7:e39618. [PMID: 22815711 PMCID: PMC3399876 DOI: 10.1371/journal.pone.0039618] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 08/25/2011] [Accepted: 05/23/2012] [Indexed: 11/18/2022] Open
Abstract
Although figures in scientific articles have high information content and concisely communicate many key research findings, they are currently underutilized by literature search and retrieval systems. Many systems ignore figures, and those that do not ignore them typically consider only caption text. This study describes and evaluates a fully automated approach for associating figures in the body of a biomedical article with sentences in its abstract. We use supervised methods to learn probabilistic language models, hidden Markov models, and conditional random fields for predicting associations between abstract sentences and figures. Three kinds of evidence are used: text in abstract sentences and figures, relative positions of sentences and figures, and the patterns of sentence/figure associations across an article. Each information source is shown to have predictive value, and models that use all kinds of evidence are more accurate than models that do not. Our most accurate method has an F1-score of 69% on a cross-validation experiment, is competitive with the accuracy of human experts, has significantly better predictive accuracy than state-of-the-art methods and enables users to access figures associated with an abstract sentence with an average of 1.82 fewer mouse clicks. A user evaluation shows that human users find our system beneficial. The system is available at http://FigureItOut.askHERMES.org.
Affiliation(s)
- Joseph P. Bockhorst
- Department of Computer Science, University of Wisconsin–Milwaukee, Milwaukee, Wisconsin, United States of America
- John M. Conroy
- IDA/Center for Computing Sciences, Bowie, Maryland, United States of America
- Shashank Agarwal
- Department of Health Sciences, University of Wisconsin–Milwaukee, Milwaukee, Wisconsin, United States of America
- Dianne P. O’Leary
- Computer Science Department and UMIACS, University of Maryland, College Park, Maryland, United States of America
- Hong Yu
- Department of Computer Science, University of Wisconsin–Milwaukee, Milwaukee, Wisconsin, United States of America
- Department of Health Sciences, University of Wisconsin–Milwaukee, Milwaukee, Wisconsin, United States of America
8
Jackson C, Glory-Afshar E, Murphy RF, Kovacevic J. Model building and intelligent acquisition with application to protein subcellular location classification. Bioinformatics 2011; 27:1854-9. [PMID: 21558154 DOI: 10.1093/bioinformatics/btr286] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION We present a framework and algorithms to intelligently acquire movies of protein subcellular location patterns by learning their models as they are being acquired, and simultaneously determining how many cells to acquire as well as how many frames to acquire per cell. This is motivated by the desire to minimize acquisition time and photobleaching, given the need to build such models for all proteins, in all cell types, under all conditions. Our key innovation is to build models during acquisition rather than as a post-processing step, thus allowing us to intelligently and automatically adapt the acquisition process given the model acquired. RESULTS We validate our framework on protein subcellular location classification, and show that the combination of model building and intelligent acquisition results in time and storage savings without loss of classification accuracy, or alternatively, higher classification accuracy for the same total acquisition time. AVAILABILITY AND IMPLEMENTATION The data and software used for this study will be made available upon publication at http://murphylab.web.cmu.edu/software and http://www.andrew.cmu.edu/user/jelenak/Software. CONTACT jelenak@cmu.edu.
Affiliation(s)
- C Jackson
- Center for Bioimage Informatics, Department of Biomedical Engineering, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, USA
9
Abstract
Chemical address tags can be defined as specific structural features shared by a set of bioimaging probes having a predictable influence on cell-associated visual signals obtained from these probes. Here, using a large image dataset acquired with a high content screening instrument, machine vision and cheminformatics analysis have been applied to reveal chemical address tags. With a combinatorial library of fluorescent molecules, fluorescence signal intensity, spectral, and spatial features characterizing each one of the probes' visual signals were extracted from images acquired with the three different excitation and emission channels of the imaging instrument. With multivariate regression, the additive contribution from each one of the different building blocks of the bioimaging probes toward each measured, cell-associated image-based feature was calculated. In this manner, variations in the chemical features of the molecules were associated with the resulting staining patterns, facilitating quantitative, objective analysis of chemical address tags. Hierarchical clustering and paired image-cheminformatics analysis revealed key structure-property relationships amongst many building blocks of the fluorescent molecules. The results point to different chemical modifications of the bioimaging probes that can exert similar (or different) effects on the probes' visual signals. Inspection of the clustered structures suggests intramolecular charge migration or partial charge distribution as potential mechanistic determinants of chemical address tag behavior.
Affiliation(s)
- Kerby Shedden
- Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109, USA
10
Shariff A, Kangas J, Coelho LP, Quinn S, Murphy RF. Automated image analysis for high-content screening and analysis. J Biomol Screen 2010; 15:726-34. [PMID: 20488979 DOI: 10.1177/1087057110370894] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.9] [Indexed: 12/19/2022]
Abstract
The field of high-content screening and analysis consists of a set of methodologies for automated discovery in cell biology and drug development using large amounts of image data. In most cases, imaging is carried out by automated microscopes, often assisted by automated liquid handling and cell culture. Image processing, computer vision, and machine learning are used to automatically process high-dimensional image data into meaningful cell biological results. The key is creating automated analysis pipelines typically consisting of 4 basic steps: (1) image processing (normalization, segmentation, tracing, tracking), (2) spatial transformation to bring images to a common reference frame (registration), (3) computation of image features, and (4) machine learning for modeling and interpretation of data. An overview of these image analysis tools is presented here, along with brief descriptions of a few applications.
Affiliation(s)
- Aabid Shariff
- Lane Center for Computational Biology and Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, PA, USA
11
Kvilekval K, Fedorov D, Obara B, Singh A, Manjunath BS. Bisque: a platform for bioimage analysis and management. Bioinformatics 2009; 26:544-52. [PMID: 20031971 DOI: 10.1093/bioinformatics/btp699] [Citation(s) in RCA: 147] [Impact Index Per Article: 9.8] [Indexed: 11/14/2022]
Abstract
MOTIVATION Advances in the field of microscopy have brought about the need for better image management and analysis solutions. Novel imaging techniques have created vast stores of images and metadata that are difficult to organize, search, process and analyze. These tasks are further complicated by conflicting and proprietary image and metadata formats that impede the analysis and sharing of images and any associated data. These obstacles have resulted in research resources being locked away in digital media and file cabinets. Current image management systems do not address the pressing needs of researchers who must quantify image data on a regular basis. RESULTS We present Bisque, a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend Bisque with both data model and analysis extensions in order to adapt the system to local needs. Bisque's extensibility stems from two core concepts: a flexible metadata facility and an open web-based architecture. Together these empower researchers to create, develop and share novel bioimage analyses. Several case studies using Bisque with specific applications are presented as an indication of how users can expect to extend Bisque for their own purposes.
Affiliation(s)
- Kristian Kvilekval
- Center for Bio-Image Informatics, Electrical and Computer Engineering Department, University of California Santa Barbara, CA, USA.
12
Shedden K, Li Q, Liu F, Chang YT, Rosania GR. Machine vision-assisted analysis of structure-localization relationships in a combinatorial library of prospective bioimaging probes. Cytometry A 2009; 75:482-93. [PMID: 19243023 DOI: 10.1002/cyto.a.20713] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
Abstract
With a combinatorial library of bioimaging probes, it is now possible to use machine vision to analyze the contribution of different building blocks of the molecules to their cell-associated visual signals. For this purpose, cell-permeant, fluorescent styryl molecules were synthesized by condensation of 168 aldehyde with 8 pyridinium/quinolinium building blocks. Images of cells incubated with fluorescent molecules were acquired with a high content screening instrument. Chemical and image feature analysis revealed how variation in one or the other building block of the styryl molecules led to variations in the molecules' visual signals. Across each pair of probes in the library, chemical similarity was significantly associated with spectral and total signal intensity similarity. However, chemical similarity was much less associated with similarity in subcellular probe fluorescence patterns. Quantitative analysis and visual inspection of pairs of images acquired from pairs of styryl isomers confirm that many closely-related probes exhibit different subcellular localization patterns. Therefore, idiosyncratic interactions between styryl molecules and specific cellular components greatly contribute to the subcellular distribution of the styryl probes' fluorescence signal. These results demonstrate how machine vision and cheminformatics can be combined to analyze the targeting properties of bioimaging probes, using large image data sets acquired with automated screening systems.
Affiliation(s)
- Kerby Shedden
- Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109, USA
13
Milosevic J, Bulau P, Mortz E, Eickelberg O. Subcellular fractionation of TGF-β1-stimulated lung epithelial cells: A novel proteomic approach for identifying signaling intermediates. Proteomics 2009; 9:1230-40. [DOI: 10.1002/pmic.200700604] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
14
Mizuno Y, Kurochkin IV, Herberth M, Okazaki Y, Schönbach C. Predicted mouse peroxisome-targeted proteins and their actual subcellular locations. BMC Bioinformatics 2008; 9 Suppl 12:S16. [PMID: 19091015 PMCID: PMC2638156 DOI: 10.1186/1471-2105-9-s12-s16] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The import of most intraperoxisomal proteins is mediated by peroxisome targeting signals at their C-termini (PTS1) or N-terminal regions (PTS2). Both signals have been integrated in subcellular location prediction programs. However, their present performance, particularly for PTS2 targeting, did not seem suitable for large-scale screening of sequences. RESULTS We modified an earlier reported PTS1 screening method to identify PTS2-containing mouse candidates using a combination of computational and manual annotation. For rapid confirmation of five new PTS2- and two previously identified PTS1-containing candidates, we developed the new cell line CHO-perRed, which stably expresses the peroxisomal marker dsRed-PTS1. Using CHO-perRed, we confirmed the peroxisomal localization of PTS1-targeted candidate Zadh2. Preliminary characterization of Zadh2 expression suggested non-PPARalpha mediated activation. Notably, none of the PTS2 candidates located to peroxisomes. CONCLUSION In a few cases, the PTS may oscillate from "silent" to "functional" depending on its surface accessibility, indicating the potential for context-dependent conditional subcellular sorting. Overall, PTS2-targeting predictions are unlikely to improve without generation and integration of new experimental data from location proteomics, protein structures and quantitative Pex7 PTS2 peptide binding assays.
Affiliation(s)
- Yumi Mizuno
- Division of Functional Genomics and Systems Medicine, Research Center for Genomic Medicine, Saitama Medical University, Hidaka, Saitama 350-1241, Japan.
15
Goodin MM, Chakrabarty R, Banerjee R, Yelton S, Debolt S. New gateways to discovery. Plant Physiol 2007; 145:1100-9. [PMID: 18056860 PMCID: PMC2151732 DOI: 10.1104/pp.107.106641] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Received: 07/31/2007] [Accepted: 08/28/2007] [Indexed: 05/19/2023]
Affiliation(s)
- Michael M Goodin
- Department of Plant Pathology, University of Kentucky, Lexington, Kentucky 40546, USA.
16
Sprenger J, Lynn Fink J, Karunaratne S, Hanson K, Hamilton NA, Teasdale RD. LOCATE: a mammalian protein subcellular localization database. Nucleic Acids Res 2007; 36:D230-3. [PMID: 17986452 PMCID: PMC2238969 DOI: 10.1093/nar/gkm950] [Citation(s) in RCA: 108] [Impact Index Per Article: 6.4] [Indexed: 11/24/2022] Open
Abstract
LOCATE is a curated, web-accessible database that houses data describing the membrane organization and subcellular localization of mouse and human proteins. Over the past 2 years, the data in LOCATE have grown substantially. The database now contains high-quality localization data for 20% of the mouse proteome and general localization annotation for nearly 36% of the mouse proteome. The proteome annotated in LOCATE is from the RIKEN FANTOM Consortium Isoform Protein Sequence sets, which contain 58 128 mouse and 64 637 human protein isoforms. Other additions include computational subcellular localization predictions, automated computational classification of experimental localization image data, prediction of protein sorting signals and third-party submission of literature data. Collectively, this database provides localization proteomes for individual subcellular compartments that will underpin future systematic investigations of these regions. It is available at http://locate.imb.uq.edu.au/
Affiliation(s)
- Josefine Sprenger
- ARC Centre of Excellence in Bioinformatics, Institute for Molecular Bioscience, The University of Queensland, St Lucia, Queensland 4072, Australia
17
Martone ME, Sargis J, Tran J, Wong WW, Jiles H, Mangir C. Database resources for cellular electron microscopy. Methods Cell Biol 2007; 79:799-822. [PMID: 17327184 DOI: 10.1016/s0091-679x(06)79031-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/14/2023]
Affiliation(s)
- Maryann E Martone
- National Center for Microscopy and Imaging Research, Center for Research in Biological Systems, University of California, San Diego, La Jolla, California 92093, USA
18
Giuliano KA, Johnston PA, Gough A, Taylor DL. Systems cell biology based on high-content screening. Methods Enzymol 2006; 414:601-19. [PMID: 17110213 DOI: 10.1016/s0076-6879(06)14031-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Indexed: 02/12/2023]
Abstract
A new discipline of biology has emerged since 2004, which we call "systems cell biology" (SCB). Systems cell biology is the study of the living cell, the basic unit of life, an integrated and interacting network of genes, proteins, and myriad metabolic reactions that give rise to function. SCB takes advantage of high-content screening platforms, but delivers more detailed profiles of cellular systemic function, including the application of advanced reagents and informatics tools to sophisticated cellular models. Therefore, an SCB profile is a cellular systemic response as measured by a panel of reagents that quantify a specific set of biomarkers.
19
|
Abstract
To date, proteomics approaches have aimed to either identify novel proteins or change in protein expression/modification in various organisms under normal or disease conditions. One major aspect of functional proteomics is to identify protein biological properties in a given context, however, forward proteomics approaches alone cannot complete this goal. Indeed, with the increasing successes of such proteomics-based research strategies and the subsequent increasing amounts of proteins identified with unknown molecular functions, approaches allowing for systematic analyses of protein functions are desired. In this review, we propose to depict the complementarities of forward and reverse proteomics approaches in the definite understanding of protein functions. This dual strategy requires a data integration loop which allows for systematic characterization of protein function(s). The details of the integrative process combining both in silico and experimental resources and tools are presented. Altogether, we believe that the integration of forward and reverse proteomics approaches supported by bioinformatics will provide an efficient path towards systems biology.
Affiliation(s)
- Sandrine Palcy
- Organelle Signaling laboratory, Department of Surgery, McGill University, Montreal, Quebec, Canada.
|
20
|
Chen X, Velliste M, Murphy RF. Automated interpretation of subcellular patterns in fluorescence microscope images for location proteomics. Cytometry A 2006; 69:631-40. [PMID: 16752421 PMCID: PMC2901544 DOI: 10.1002/cyto.a.20280] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Indexed: 11/08/2022]
Abstract
Proteomics, the large scale identification and characterization of many or all proteins expressed in a given cell type, has become a major area of biological research. In addition to information on protein sequence, structure and expression levels, knowledge of a protein's subcellular location is essential to a complete understanding of its functions. Currently, subcellular location patterns are routinely determined by visual inspection of fluorescence microscope images. We review here research aimed at creating systems for automated, systematic determination of location. These employ numerical feature extraction from images, feature reduction to identify the most useful features, and various supervised learning (classification) and unsupervised learning (clustering) methods. These methods have been shown to perform significantly better than human interpretation of the same images. When coupled with technologies for tagging large numbers of proteins and high-throughput microscope systems, the computational methods reviewed here enable the new subfield of location proteomics. This subfield will make critical contributions in two related areas. First, it will provide structured, high-resolution information on location to enable Systems Biology efforts to simulate cell behavior from the gene level on up. Second, it will provide tools for Cytomics projects aimed at characterizing the behaviors of all cell types before, during, and after the onset of various diseases.
Affiliation(s)
- Xiang Chen
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213
- Center for Automated Learning and Discovery, Carnegie Mellon University, Pittsburgh, PA 15213
- Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, PA 15213, FAX: 1.412.268.9580
- Meel Velliste
- Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213
- Robert F. Murphy
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213
- Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213
- Center for Automated Learning and Discovery, Carnegie Mellon University, Pittsburgh, PA 15213
- Center for Bioimage Informatics, Carnegie Mellon University, Pittsburgh, PA 15213, FAX: 1.412.268.9580
|
21
|
Bayraktar B, Banada PP, Hirleman ED, Bhunia AK, Robinson JP, Rajwa B. Feature extraction from light-scatter patterns of Listeria colonies for identification and classification. J Biomed Opt 2006; 11:34006. [PMID: 16822056 DOI: 10.1117/1.2203987] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 05/10/2023]
Abstract
Bacterial contamination by Listeria monocytogenes not only puts the public at risk, but also is costly for the food-processing industry. Traditional biochemical methods for pathogen identification require complicated sample preparation for reliable results. Optical scattering technology has been used for identification of bacterial cells in suspension, but with only limited success. Therefore, to improve the efficacy of the identification process using our novel imaging approach, we analyze bacterial colonies grown on solid surfaces. The work presented here demonstrates an application of computer-vision and pattern-recognition techniques to classify scatter patterns formed by Listeria colonies. Bacterial colonies are analyzed with a laser scatterometer. Features of circular scatter patterns formed by bacterial colonies illuminated by laser light are characterized using Zernike moment invariants. Principal component analysis and hierarchical clustering are performed on the results of feature extraction. Classification using linear discriminant analysis, partial least squares, and neural networks is capable of separating different strains of Listeria with a low error rate. The demonstrated system is also able to determine automatically the pathogenicity of bacteria on the basis of colony scatter patterns. We conclude that the obtained results are encouraging, and strongly suggest the feasibility of image-based biodetection systems.
Affiliation(s)
- Bulent Bayraktar
- Purdue University, Bindley Bioscience Center, Purdue University Cytometry Laboratories, Department of Electrical and Computer Engineering, West Lafayette, Indiana 47907, USA
|
22
|
Thalmann I. Inner ear proteomics: a fad or hear to stay. Brain Res 2006; 1091:103-12. [PMID: 16540098 DOI: 10.1016/j.brainres.2006.01.099] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 11/03/2005] [Revised: 01/26/2006] [Accepted: 01/26/2006] [Indexed: 11/17/2022]
Abstract
Proteomics, the large-scale analysis of the structure and function of proteins, as well as of protein-protein interactions, has evolved into a major component of 'systems analysis'. This requires the integration of information from different sources and at multiple levels, and involves two distinct parameters, (1) high-throughput protein separation, identification, and characterization, and (2) the extension of the obtained analytical data for the determination of the physiological function. The inner ear poses exceptional challenges to the study of proteomics because of its minute size, poor accessibility, association with complex fluid spaces, and diversity of cell types. Various approaches to the study of proteomics of the inner ear are presented, and success stories, noteworthy failures and what lies ahead, will be discussed.
Affiliation(s)
- Isolde Thalmann
- Department of Otolaryngology, Washington University School of Medicine, 660 S. Euclid Avenue, Box 8115, St. Louis, MO 63110, USA.
|