1
Xiao H, Zou Y, Wang J, Wan S. A Review for Artificial Intelligence Based Protein Subcellular Localization. Biomolecules 2024; 14:409. [PMID: 38672426 PMCID: PMC11048326 DOI: 10.3390/biom14040409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 03/21/2024] [Accepted: 03/25/2024] [Indexed: 04/28/2024]
Abstract
Proteins need to be located in appropriate spatiotemporal contexts to carry out their diverse biological functions. Mislocalized proteins may lead to a broad range of diseases, such as cancer and Alzheimer's disease. Knowing where a target protein resides within a cell gives insights into tailored drug design for a disease. As the gold standard for validation, conventional wet-lab experiments use fluorescence microscopy imaging, immunoelectron microscopy, and fluorescent biomarker tags to identify protein subcellular locations. However, the booming era of proteomics and high-throughput sequencing generates vast numbers of newly discovered proteins, making it impractical to localize them all by wet-lab experiments. To address this problem, artificial intelligence (AI) and machine learning (ML) methods, especially deep learning methods, have made significant progress in this research area over the past decades. In this article, we review the latest advances in AI-based method development for three typical types of approaches: sequence-based, knowledge-based, and image-based methods. We also discuss in detail the existing challenges and future directions of AI-based method development in this research field.
Affiliation(s)
- Hanyu Xiao: Department of Genetics, Cell Biology and Anatomy, College of Medicine, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Yijin Zou: College of Veterinary Medicine, China Agricultural University, Beijing 100193, China
- Jieqiong Wang: Department of Neurological Sciences, College of Medicine, University of Nebraska Medical Center, Omaha, NE 68198, USA
- Shibiao Wan: Department of Genetics, Cell Biology and Anatomy, College of Medicine, University of Nebraska Medical Center, Omaha, NE 68198, USA
2
Wan S, Mak MW, Kung SY. FUEL-mLoc: feature-unified prediction and explanation of multi-localization of cellular proteins in multiple organisms. Bioinformatics 2017; 33:749-750. [PMID: 28011780 DOI: 10.1093/bioinformatics/btw717] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 09/23/2016] [Accepted: 11/08/2016] [Indexed: 01/08/2023]
Abstract
Although many web-servers for predicting protein subcellular localization have been developed, they often have the following drawbacks: (i) lack of interpretability, or interpretation of results with heterogeneous information that may confuse users; (ii) ignoring multi-location proteins; and (iii) focusing only on a specific organism. To tackle these problems, we present an interpretable and efficient web-server, namely FUEL-mLoc, using Feature-Unified prediction and Explanation of multi-Localization of cellular proteins in multiple organisms. Compared to conventional localization predictors, FUEL-mLoc has the following advantages: (i) using unified features (i.e. essential GO terms) to interpret why a prediction is made; (ii) being capable of predicting both single- and multi-location proteins; and (iii) being able to handle proteins of multiple organisms, including Eukaryota, Homo sapiens, Viridiplantae, Gram-positive Bacteria, Gram-negative Bacteria and Virus. Experimental results demonstrate that FUEL-mLoc outperforms state-of-the-art subcellular-localization predictors. Availability and Implementation: http://bioinfo.eie.polyu.edu.hk/FUEL-mLoc/. Contacts: shibiao.wan@princeton.edu or enmwmak@polyu.edu.hk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shibiao Wan: Department of Electrical Engineering, Princeton University, NJ, USA
- Man-Wai Mak: Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong, SAR, China
- Sun-Yuan Kung: Department of Electrical Engineering, Princeton University, NJ, USA
3
Affiliation(s)
- Marc Thilo Figge: Applied Systems Biology, HKI-Center for Systems Biology of Infection, Leibniz-Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute (HKI), Jena, Germany; Faculty of Biology and Pharmacy, Friedrich Schiller University, Jena, Germany
- Robert F Murphy: Departments of Computational Biology, Biological Sciences, Biomedical Engineering and Machine Learning, Carnegie Mellon University, Pittsburgh, Pennsylvania; Freiburg Institute for Advanced Studies and Faculty of Biology, Albert Ludwig University of Freiburg, Germany
4
Wan S, Mak MW, Kung SY. Sparse regressions for predicting and interpreting subcellular localization of multi-label proteins. BMC Bioinformatics 2016; 17:97. [PMID: 26911432 PMCID: PMC4765148 DOI: 10.1186/s12859-016-0940-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 04/17/2015] [Accepted: 01/27/2016] [Indexed: 11/10/2022]
Abstract
Background: Predicting protein subcellular localization is indispensable for inferring protein functions. Recent studies have focused on predicting not only single-location proteins but also multi-location proteins. Almost all of the high-performing predictors proposed recently use gene ontology (GO) terms to construct feature vectors for classification. Despite their high performance, their prediction decisions are difficult to interpret because of the large number of GO terms involved. Results: This paper proposes using sparse regressions to exploit GO information for both predicting and interpreting subcellular localization of single- and multi-location proteins. Specifically, we compared two multi-label sparse regression algorithms, namely multi-label LASSO (mLASSO) and multi-label elastic net (mEN), for large-scale predictions of protein subcellular localization. Both algorithms can yield sparse and interpretable solutions. By using the one-vs-rest strategy, mLASSO and mEN identified 87 and 429 out of more than 8,000 GO terms, respectively, which play essential roles in determining subcellular localization. More interestingly, many of the GO terms selected by mEN are from the biological process and molecular function categories, suggesting that the GO terms of these categories also play vital roles in the prediction. With these essential GO terms, it is possible not only to decide where a protein is located but also to reveal why it resides there. Conclusions: Experimental results show that the outputs of both mEN and mLASSO are interpretable and that they perform significantly better than existing state-of-the-art predictors. Moreover, mEN selects more features and performs better than mLASSO on a stringent human benchmark dataset. For readers' convenience, an online server called SpaPredictor for both mLASSO and mEN is available at http://bioinfo.eie.polyu.edu.hk/SpaPredictorServer/.
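The one-vs-rest sparse regression idea in this abstract can be illustrated with a minimal sketch. It is not the authors' mLASSO implementation: the coordinate-descent solver, the 0/1 targets, the 0.5 decision threshold, the penalty value, and the four-term toy data are all assumptions made for illustration. Each column of `X` stands for a hypothetical GO term and each column of `Y` for a hypothetical location.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (proximal operator of the L1 penalty)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO minimising (1/2n)*||y - Xw||^2 + lam*||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    resid = list(y)  # residual y - Xw; equals y because w starts at zero
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual after adding back w[j]'s contribution.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_wj
    return w

def fit_one_vs_rest(X, Y, lam):
    """Train one sparse regressor per location; Y[i][k] is 1 if protein i is in location k."""
    return [lasso_cd(X, [float(row[k]) for row in Y], lam) for k in range(len(Y[0]))]

def predict(weights, x, threshold=0.5):
    """Return every location whose score exceeds the threshold (possibly more than one)."""
    scores = [sum(wj * xj for wj, xj in zip(w, x)) for w in weights]
    return [k for k, s in enumerate(scores) if s > threshold]

# Hypothetical toy data: 6 proteins, 4 GO-term indicator features, 2 locations.
# Terms 0 and 1 track locations 0 and 1; terms 2 and 3 are uninformative.
X = [[1, 0, 1, 0], [1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [1, 1, 1, 1]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [1, 1]]

weights = fit_one_vs_rest(X, Y, lam=0.05)
# Indices of the terms kept (non-zero weight) by each per-location regressor.
selected = [[j for j, wj in enumerate(w) if abs(wj) > 1e-9] for w in weights]
```

The non-zero weights in `selected` play the role of the "essential GO terms": they both drive the prediction and indicate which terms explain it. Replacing the L1 penalty with a mix of L1 and L2 penalties would give an elastic-net variant in the same one-vs-rest loop.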
Affiliation(s)
- Shibiao Wan: Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong, SAR, China
- Man-Wai Mak: Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong, SAR, China
- Sun-Yuan Kung: Department of Electrical Engineering, Princeton University, New Jersey, USA
5
Yang Q, Zou HY, Zhang Y, Tang LJ, Shen GL, Jiang JH, Yu RQ. Multiplex protein pattern unmixing using a non-linear variable-weighted support vector machine as optimized by a particle swarm optimization algorithm. Talanta 2016; 147:609-14. [DOI: 10.1016/j.talanta.2015.10.047] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/06/2015] [Revised: 10/14/2015] [Accepted: 10/18/2015] [Indexed: 11/30/2022]
6
Predicting subcellular localization of multi-location proteins by improving support vector machines with an adaptive-decision scheme. INT J MACH LEARN CYB 2015. [DOI: 10.1007/s13042-015-0460-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 01/11/2023]
7
Simha R, Briesemeister S, Kohlbacher O, Shatkay H. Protein (multi-)location prediction: utilizing interdependencies via a generative model. Bioinformatics 2015; 31:i365-74. [PMID: 26072505 PMCID: PMC4765880 DOI: 10.1093/bioinformatics/btv264] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Proteins are responsible for a multitude of vital tasks in all living organisms. Given that a protein's function and role are strongly related to its subcellular location, protein location prediction is an important research area. While proteins move from one location to another and can localize to multiple locations, most existing location prediction systems assign only a single location per protein. A few recent systems attempt to predict multiple locations for proteins; however, their performance leaves much room for improvement. Moreover, such systems do not capture dependencies among locations and usually consider locations as independent. We hypothesize that a multi-location predictor that captures location inter-dependencies can improve location predictions for proteins. RESULTS: We introduce a probabilistic generative model for protein localization, and develop a system based on it, which we call MDLoc, that utilizes inter-dependencies among locations to predict multiple locations for proteins. The model captures location inter-dependencies using Bayesian networks and represents dependency between features and locations using a mixture model. We use iterative processes for learning model parameters and for estimating protein locations. We evaluate our classifier, MDLoc, on a dataset of single- and multi-localized proteins derived from the DBMLoc dataset, which is the most comprehensive protein multi-localization dataset currently available. Our results, obtained by using MDLoc, significantly improve upon results obtained by an initial simpler classifier, as well as on results reported by other top systems. AVAILABILITY AND IMPLEMENTATION: MDLoc is available at: http://www.eecis.udel.edu/~compbio/mdloc.
Affiliation(s)
- Ramanuja Simha: Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA, Applied Bioinformatics, Center for Bioinformatics, University of Tuebingen, Germany, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA and School of Computing, Queen's University, Kingston, ON, Canada
- Sebastian Briesemeister: Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA, Applied Bioinformatics, Center for Bioinformatics, University of Tuebingen, Germany, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA and School of Computing, Queen's University, Kingston, ON, Canada
- Oliver Kohlbacher: Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA, Applied Bioinformatics, Center for Bioinformatics, University of Tuebingen, Germany, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA and School of Computing, Queen's University, Kingston, ON, Canada
- Hagit Shatkay: Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA, Applied Bioinformatics, Center for Bioinformatics, University of Tuebingen, Germany, Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA and School of Computing, Queen's University, Kingston, ON, Canada
8
|
|
9
Wan S, Mak MW, Kung SY. R3P-Loc: a compact multi-label predictor using ridge regression and random projection for protein subcellular localization. J Theor Biol 2014; 360:34-45. [PMID: 24997236 DOI: 10.1016/j.jtbi.2014.06.031] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 04/12/2014] [Revised: 06/24/2014] [Accepted: 06/25/2014] [Indexed: 12/21/2022]
Abstract
Locating proteins within cellular contexts is of paramount significance in elucidating their biological functions. Computational methods based on knowledge databases (such as the gene ontology annotation (GOA) database) are known to be more efficient than sequence-based methods. However, knowledge-based methods typically face three conditions: (1) knowledge databases are enormous and growing exponentially, (2) knowledge databases contain redundant information, and (3) the number of features extracted from knowledge databases is much larger than the number of data samples with ground-truth labels. These properties make the extracted features liable to contain redundant or irrelevant information, causing the prediction systems to suffer from overfitting. To address these problems, this paper proposes an efficient multi-label predictor, namely R3P-Loc, which uses two compact databases for feature extraction and applies random projection (RP) to reduce the feature dimensions of an ensemble ridge regression (RR) classifier. Two new compact databases are created from the Swiss-Prot and GOA databases. These databases possess almost the same amount of information as their full-size counterparts but are much smaller. Experimental results on two recent datasets (eukaryote and plant) suggest that R3P-Loc can reduce the dimensions sevenfold and significantly outperforms state-of-the-art predictors. This paper also demonstrates that the compact databases reduce the memory consumption by 39 times without causing degradation in prediction accuracy. For readers' convenience, the R3P-Loc server is available online at http://bioinfo.eie.polyu.edu.hk/R3PLocServer/.
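The combination of random projection and ensemble ridge regression described above can be sketched as follows. This is an illustration under stated assumptions, not the R3P-Loc code: the Gaussian projection matrix, the plain Gaussian-elimination solver, the absence of an intercept, and the function names are choices made here, and the abstract does not specify any of them.

```python
import random

def make_projection(d_in, d_out, seed):
    """Dense Gaussian random projection matrix (d_out rows, d_in columns); an assumed choice."""
    rng = random.Random(seed)
    scale = 1.0 / d_out ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)] for _ in range(d_out)]

def project(R, x):
    """Map a d_in-dimensional feature vector x to d_out dimensions."""
    return [sum(r_j * x_j for r_j, x_j in zip(row, x)) for row in R]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w

def ridge_fit(Z, y, lam):
    """Closed-form ridge regression: solve (Z^T Z + lam*I) w = Z^T y, with lam > 0."""
    d = len(Z[0])
    A = [[sum(z[a] * z[b] for z in Z) + (lam if a == b else 0.0) for b in range(d)]
         for a in range(d)]
    rhs = [sum(z[a] * y_i for z, y_i in zip(Z, y)) for a in range(d)]
    return solve(A, rhs)

def fit_ensemble(X, Y, d_out, lam, n_members=5):
    """Each member has its own random projection and one ridge regressor per location."""
    members = []
    for seed in range(n_members):
        R = make_projection(len(X[0]), d_out, seed)
        Z = [project(R, x) for x in X]
        W = [ridge_fit(Z, [float(row[k]) for row in Y], lam) for k in range(len(Y[0]))]
        members.append((R, W))
    return members

def ensemble_scores(members, x):
    """Average the per-location scores over all ensemble members."""
    totals = [0.0] * len(members[0][1])
    for R, W in members:
        z = project(R, x)
        for k, w in enumerate(W):
            totals[k] += sum(w_j * z_j for w_j, z_j in zip(w, z))
    return [t / len(members) for t in totals]
```

As a usage example, `fit_ensemble(X, Y, d_out=len(X[0]) // 7, lam=0.1)` would mimic a sevenfold dimension reduction on a hypothetical feature matrix `X` and 0/1 label matrix `Y`. Averaging over several projections is one simple way to reduce the variance that a single random projection introduces.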
Affiliation(s)
- Shibiao Wan: Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Man-Wai Mak: Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Sun-Yuan Kung: Department of Electrical Engineering, Princeton University, NJ, USA
10
Simha R, Shatkay H. Protein (multi-)location prediction: using location inter-dependencies in a probabilistic framework. Algorithms Mol Biol 2014; 9:8. [PMID: 24646119 PMCID: PMC3994749 DOI: 10.1186/1748-7188-9-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/03/2013] [Accepted: 03/02/2014] [Indexed: 12/23/2022]
Abstract
Motivation: Knowing the location of a protein within the cell is important for understanding its function, role in biological processes, and potential use as a drug target. Much progress has been made in developing computational methods that predict single locations for proteins. Most such methods are based on the over-simplifying assumption that proteins localize to a single location. However, it has been shown that proteins localize to multiple locations. While a few recent systems attempt to predict multiple locations of proteins, their performance leaves much room for improvement. Moreover, they typically treat locations as independent and do not attempt to utilize possible inter-dependencies among locations. Our hypothesis is that directly incorporating inter-dependencies among locations into both the classifier-learning and the prediction process can improve location prediction performance. Results: We present a new method and a preliminary system we have developed that directly incorporates inter-dependencies among locations into the location-prediction process of multiply-localized proteins. Our method is based on a collection of Bayesian network classifiers, where each classifier is used to predict a single location. Learning the structure of each Bayesian network classifier takes into account inter-dependencies among locations, and the prediction process uses estimates involving multiple locations. We evaluate our system on a dataset of single- and multi-localized proteins (the most comprehensive protein multi-localization dataset currently available, derived from the DBMLoc dataset). Our results, obtained by incorporating inter-dependencies, are significantly higher than those obtained by classifiers that do not use inter-dependencies. The performance of our system on multi-localized proteins is comparable to a top performing system (YLoc+), without being restricted only to location-combinations present in the training set.
11
Gohar AV, Cao R, Jenkins P, Li W, Houston JP, Houston KD. Subcellular localization-dependent changes in EGFP fluorescence lifetime measured by time-resolved flow cytometry. BIOMEDICAL OPTICS EXPRESS 2013; 4:1390-400. [PMID: 24010001 PMCID: PMC3756581 DOI: 10.1364/boe.4.001390] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 06/10/2013] [Revised: 07/11/2013] [Accepted: 07/15/2013] [Indexed: 05/23/2023]
Abstract
Intracellular protein transport and localization to subcellular regions are processes necessary for normal protein function. Fluorescent proteins can be fused to proteins of interest to track movement and determine localization within a cell. Currently, fluorescence microscopy combined with image processing is most often used to study protein movement and subcellular localization. In this contribution we evaluate a high-throughput time-resolved flow cytometry approach to correlate intracellular localization of human LC3 protein with the fluorescence lifetime of enhanced green fluorescent protein (EGFP). Subcellular LC3 localization to autophagosomes is a marker of the cellular process called autophagy. In breast cancer cells expressing native EGFP and EGFP-LC3 fusion proteins, we measured the fluorescence intensity and lifetime of (i) diffuse EGFP, (ii) punctate EGFP-LC3, and (iii) diffuse EGFP-ΔLC3 after amino acid starvation to induce autophagy-dependent LC3 localization. We verify EGFP-LC3 localization with low-throughput confocal microscopy and compare it to fluorescence intensity measured by standard flow cytometry. Our results demonstrate that time-resolved flow cytometry can be correlated to subcellular localization of EGFP fusion proteins by measuring changes in fluorescence lifetime.
Affiliation(s)
- Ali Vaziri Gohar: Molecular Biology Program, New Mexico State University, Las Cruces, NM 88003, USA
- Ruofan Cao: Department of Chemical Engineering, New Mexico State University, Las Cruces, NM 88003, USA
- Patrick Jenkins: Department of Chemical Engineering, New Mexico State University, Las Cruces, NM 88003, USA
- Wenyan Li: Department of Chemical Engineering, New Mexico State University, Las Cruces, NM 88003, USA
- Jessica P. Houston: Molecular Biology Program, New Mexico State University, Las Cruces, NM 88003, USA; Department of Chemical Engineering, New Mexico State University, Las Cruces, NM 88003, USA
- Kevin D. Houston: Molecular Biology Program, New Mexico State University, Las Cruces, NM 88003, USA; Department of Chemistry and Biochemistry, New Mexico State University, Las Cruces, NM 88003, USA
12
Satori CP, Henderson MM, Krautkramer EA, Kostal V, Distefano MM, Arriaga EA. Bioanalysis of eukaryotic organelles. Chem Rev 2013; 113:2733-811. [PMID: 23570618 PMCID: PMC3676536 DOI: 10.1021/cr300354g] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Indexed: 12/18/2022]
Affiliation(s)
- Chad P. Satori: Department of Chemistry, University of Minnesota, Twin Cities, Minneapolis, MN, USA, 55455
- Michelle M. Henderson: Department of Chemistry, University of Minnesota, Twin Cities, Minneapolis, MN, USA, 55455
- Elyse A. Krautkramer: Department of Chemistry, University of Minnesota, Twin Cities, Minneapolis, MN, USA, 55455
- Vratislav Kostal: Tescan, Libusina trida 21, Brno, 623 00, Czech Republic; Institute of Analytical Chemistry ASCR, Veveri 97, Brno, 602 00, Czech Republic
- Mark M. Distefano: Department of Chemistry, University of Minnesota, Twin Cities, Minneapolis, MN, USA, 55455
- Edgar A. Arriaga: Department of Chemistry, University of Minnesota, Twin Cities, Minneapolis, MN, USA, 55455
13
Niu H, Jiang H, Cheng B, Li X, Dong Q, Shao L, Liu S, Wang X. Stromal proteome expression profile and muscle-invasive bladder cancer research. Cancer Cell Int 2012; 12:39. [PMID: 22920603 PMCID: PMC3489783 DOI: 10.1186/1475-2867-12-39] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/25/2012] [Accepted: 08/17/2012] [Indexed: 12/03/2022]
Abstract
BACKGROUND: To globally characterize the cancer stroma expression profile of muscle-invasive transitional cell carcinoma and to discuss cancer biology as well as biomarker discovery from stroma. Laser capture microdissection was used to harvest purified muscle-invasive bladder cancer stromal cells and normal urothelial stromal cells from 4 paired samples. Two-dimensional liquid chromatography tandem mass spectrometry was used to identify the proteome expression profile. The differential proteins were further analyzed using bioinformatics tools and compared with the published literature. RESULTS: We identified 868/872 commonly expressed proteins and 978 differential proteins from 4 paired cancer and normal stromal samples using laser capture microdissection coupled with two-dimensional liquid chromatography tandem mass spectrometry. 487/491 proteins were uniquely expressed in cancer/normal stroma. Differential proteins were compared with the entire list of the International Protein Index (IPI); 42/42 gene ontology (GO) terms were enriched and 8/5 were depleted in the cellular component category, respectively. Significantly altered pathways between cancer and normal stroma mainly include metabolic pathways, ribosome, focal adhesion, etc. Finally, descriptive statistics show that stromal proteins with extremes of pI and MW have the same probability of being a biomarker. CONCLUSIONS: Based on our results, stromal cells are an essential component of the cancer; biomarker discovery and network-based multi-target therapy should consider the neoplastic cells and the corresponding stroma as a whole.
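Calling a GO term "enriched" or "depleted" relative to a background list such as the IPI is commonly done with a hypergeometric test. The abstract does not say which test the authors used, so the sketch below is an assumption, and the function names and any numbers passed to them are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k annotated proteins in a sample of n, drawn from N of which K are annotated."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def enrichment_p(k, N, K, n):
    """Upper tail P(X >= k): small values suggest the GO term is enriched in the sample."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, min(K, n) + 1))

def depletion_p(k, N, K, n):
    """Lower tail P(X <= k): small values suggest the GO term is depleted in the sample."""
    lowest = max(0, n - (N - K))  # fewest annotated proteins a sample of size n can contain
    return sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))
```

Here `N` would be the size of the background list, `K` the number of background proteins annotated with the GO term, `n` the number of differential proteins, and `k` how many of those carry the term. In practice the resulting p-values would also need a multiple-testing correction across all GO terms tested.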
Affiliation(s)
- Haitao Niu: Department of Urology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Haiping Jiang: Department of Oncology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Bo Cheng: Department of Urology, The Central Hospital of Shengli Oil Field, Dongying, China
- Xinhui Li: Department of Urology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Qian Dong: Department of Pediatric Surgery, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Leping Shao: Department of Nephrology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Shiguo Liu: Gout Laboratory, The Affiliated Hospital of Medical College, Qingdao University, Qingdao, China
- Xinsheng Wang: Department of Urology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
14
Buck TE, Li J, Rohde GK, Murphy RF. Toward the virtual cell: automated approaches to building models of subcellular organization "learned" from microscopy images. Bioessays 2012; 34:791-9. [PMID: 22777818 DOI: 10.1002/bies.201200032] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 01/09/2023]
Abstract
We review state-of-the-art computational methods for constructing, from image data, generative statistical models of cellular and nuclear shapes and the arrangement of subcellular structures and proteins within them. These automated approaches allow consistent analysis of images of cells for the purposes of learning the range of possible phenotypes, discriminating between them, and informing further investigation. Such models can also provide realistic geometry and initial protein locations to simulations in order to better understand cellular and subcellular processes. To determine the structures of cellular components and how proteins and other molecules are distributed among them, the generative modeling approach described here can be coupled with high throughput imaging technology to infer and represent subcellular organization from data with few a priori assumptions. We also discuss potential improvements to these methods and future directions for research.
Affiliation(s)
- Taráz E Buck: Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA
15
Murphy RF. CellOrganizer: Image-derived models of subcellular organization and protein distribution. Methods Cell Biol 2012; 110:179-93. [PMID: 22482949 DOI: 10.1016/b978-0-12-388403-9.00007-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 12/12/2022]
Abstract
This chapter describes approaches for learning models of subcellular organization from images. The primary utility of these models is expected to be from incorporation into complex simulations of cell behaviors. Most current cell simulations do not consider spatial organization of proteins at all, or treat each organelle type as a single, idealized compartment. The ability to build generative models for all proteins in a proteome and use them for spatially accurate simulations is expected to improve the accuracy of models of cell behaviors. A second use, of potentially equal importance, is expected to be in testing and comparing software for analyzing cell images. The complexity and sophistication of algorithms used in cell-image-based screens and assays (variously referred to as high-content screening, high-content analysis, or high-throughput microscopy) is continuously increasing, and generative models can be used to produce images for testing these algorithms in which the expected answer is known.
Affiliation(s)
- Robert F Murphy: Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
16
Mohr SE, Perrimon N. RNAi screening: new approaches, understandings, and organisms. WILEY INTERDISCIPLINARY REVIEWS-RNA 2011; 3:145-58. [PMID: 21953743 DOI: 10.1002/wrna.110] [Citation(s) in RCA: 97] [Impact Index Per Article: 7.5] [Indexed: 12/12/2022]
Abstract
RNA interference (RNAi) leads to sequence-specific knockdown of gene function. The approach can be used in large-scale screens to interrogate function in various model organisms and an increasing number of other species. Genome-scale RNAi screens are routinely performed in cultured or primary cells or in vivo in organisms such as C. elegans. High-throughput RNAi screening is benefitting from the development of sophisticated new instrumentation and software tools for collecting and analyzing data, including high-content image data. The results of large-scale RNAi screens have already proved useful, leading to new understandings of gene function relevant to topics such as infection, cancer, obesity, and aging. Nevertheless, important caveats apply and should be taken into consideration when developing or interpreting RNAi screens. Some level of false discovery is inherent to high-throughput approaches and, specific to RNAi screens, false discovery due to off-target effects (OTEs) of RNAi reagents remains a problem. The need to improve our ability to use RNAi to elucidate gene function at large scale and in additional systems continues to be addressed through improved RNAi library design, development of innovative computational and analysis tools, and other approaches.
Affiliation(s)
- Stephanie E Mohr: Drosophila RNAi Screening Center, Department of Genetics, Harvard Medical School, Boston, MA, USA
17
Smith PJ, Radbruch A. Cyto 2011 - overview of the XXVI ISAC Congress Proceedings Issue of Cytometry. Cytometry A 2011; 79:325-7. [PMID: 21520398 DOI: 10.1002/cyto.a.21069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
18
Tkaczyk ER, Tkaczyk AH. Multiphoton flow cytometry strategies and applications. Cytometry A 2011; 79:775-88. [DOI: 10.1002/cyto.a.21110] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 01/31/2011] [Revised: 06/15/2011] [Accepted: 06/27/2011] [Indexed: 12/20/2022]
19
Niu HT, Dong Z, Jiang G, Xu T, Liu YQ, Cao YW, Zhao J, Wang XS. Proteomics research on muscle-invasive bladder transitional cell carcinoma. Cancer Cell Int 2011; 11:17. [PMID: 21645413 PMCID: PMC3118115 DOI: 10.1186/1475-2867-11-17] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/25/2011] [Accepted: 06/07/2011] [Indexed: 02/06/2023]
Abstract
Background: To facilitate candidate biomarker selection and improve network-based multi-target therapy, we performed comparative proteomics research on muscle-invasive bladder transitional cell carcinoma. Laser capture microdissection was used to harvest purified muscle-invasive bladder cancer cells and normal urothelial cells from 4 paired samples. Two-dimensional liquid chromatography tandem mass spectrometry was used to identify the proteome expression profile. The differential proteins were further analyzed using bioinformatics tools and compared with the published literature. Results: A total of 885/890 proteins commonly appeared in 4 paired samples. Of the 488/493 proteins specifically expressed in tumor/normal cells, 295/337 have gene ontology (GO) cellular component annotation. Compared with the entire list of the International Protein Index (IPI), 42/45 GO terms were enriched and 9/5 were depleted, respectively. Several pathways changed significantly between cancer and normal cells, mainly including spliceosome, endocytosis, oxidative phosphorylation, etc. Finally, descriptive statistics show that the pI distribution of candidate biomarkers has a certain regularity. Conclusions: The present study identified the proteome expression profile of muscle-invasive bladder cancer cells and normal urothelial cells, providing information for subcellular pattern research of cancer and offering candidate proteins for biomarker panels and network-based multi-target therapy.
Affiliation(s)
- Hai Tao Niu: Department of Urology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Zhen Dong: Department of Urology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Gang Jiang: Department of Radiology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Ting Xu: Department of Geratology, The 401th Hospital of PLA, Qingdao, China
- Yan Qun Liu: Department of Hematology, Qingdao University Medical College, Qingdao, China
- Yan Wei Cao: Department of Urology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Jun Zhao: Department of Urology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China
- Xin Sheng Wang: Department of Urology, The Affiliated Hospital of Medical College Qingdao University, Qingdao, China