1
Wei Q, Li J, He QY, Chen Y, Zhang G. Identifying PE2 and PE5 Proteins from Existing Mass Spectrometry Data Using pFind. J Proteome Res 2024; 23:2323-2331. [PMID: 38865581 DOI: 10.1021/acs.jproteome.3c00674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2024]
Abstract
The Chromosome-Centric Human Proteome Project (C-HPP) aims to identify all proteins encoded by the human genome. Currently, the human proteome still contains approximately 2000 PE2-PE5 proteins, referring to annotated coding genes that lack sufficient protein-level evidence. During the past 10 years, it has been increasingly difficult to identify PE2-PE5 proteins in C-HPP approaches due to their limited occurrence. Therefore, we proposed that reanalyzing massive MS data sets in repositories with newly developed algorithms may increase the occurrence of the peptides of these proteins. In this study, we downloaded 1000 MS data sets via the ProteomeXchange database. Using pFind software, we identified peptides referring to 1788 PE2-PE5 proteins. Among them, 11 PE2 and 16 PE5 proteins were identified with at least 2 peptides, and 12 of them were identified using 2 peptides in a single data set, following the criteria of the HPP guidelines. We found translation evidence for 16 of these 27 proteins (11 PE2 and 16 PE5) in our RNC-seq data, supporting their existence. The properties of the PE2 and PE5 proteins were similar to those of the PE1 proteins. Our approach demonstrated that mining PE2 and PE5 proteins in massive data repositories is still worthwhile, and multidata set peptide identifications may support the presence of PE2 and PE5 proteins or at least prompt additional studies for validation. Extremely high throughput could be a solution to finding more PE2 and PE5 proteins.
Affiliation(s)
- Qianzhou Wei
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Jiamin Li
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Qing-Yu He
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Yang Chen
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
- Gong Zhang
- Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes and MOE Key Laboratory of Tumor Molecular Biology, Institute of Life and Health Engineering, Jinan University, Guangzhou 510632, China
2
Richardson R, Tejedor Navarro H, Amaral LAN, Stoeger T. Meta-Research: Understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results. eLife 2024; 12:RP93429. [PMID: 38546716 PMCID: PMC10977968 DOI: 10.7554/elife.93429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/01/2024] Open
Abstract
Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.
Affiliation(s)
- Reese Richardson
- Interdisciplinary Biological Sciences, Northwestern University, Evanston, United States
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- Heliodoro Tejedor Navarro
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, United States
- Luis A Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- Northwestern Institute on Complex Systems, Northwestern University, Evanston, United States
- Department of Molecular Biosciences, Northwestern University, Evanston, United States
- Department of Physics and Astronomy, Northwestern University, Evanston, United States
- Thomas Stoeger
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, United States
- The Potocsnak Longevity Institute, Northwestern University, Chicago, United States
- Simpson Querrey Lung Institute for Translational Science, Northwestern University, Chicago, United States
3
Richardson RAK, Tejedor Navarro H, Amaral LAN, Stoeger T. Meta-Research: understudied genes are lost in a leaky pipeline between genome-wide assays and reporting of results. bioRxiv 2024:2023.02.28.530483. [PMID: 36909550 PMCID: PMC10002660 DOI: 10.1101/2023.02.28.530483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
Abstract
Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.
Affiliation(s)
- Reese AK Richardson
- Interdisciplinary Biological Sciences, Northwestern University
- Department of Chemical and Biological Engineering, Northwestern University
- Heliodoro Tejedor Navarro
- Department of Chemical and Biological Engineering, Northwestern University
- Northwestern Institute on Complex Systems, Northwestern University
- Luis A Nunes Amaral
- Department of Chemical and Biological Engineering, Northwestern University
- Northwestern Institute on Complex Systems, Northwestern University
- Department of Physics and Astronomy, Northwestern University
- Department of Molecular Biosciences, Northwestern University
- Thomas Stoeger
- Department of Chemical and Biological Engineering, Northwestern University
- The Potocsnak Longevity Institute, Northwestern University
- Simpson Querrey Lung Institute for Translational Science, Northwestern University
4
Allou L, Mundlos S. Disruption of regulatory domains and novel transcripts as disease-causing mechanisms. Bioessays 2023; 45:e2300010. [PMID: 37381881 DOI: 10.1002/bies.202300010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 05/24/2023] [Accepted: 06/06/2023] [Indexed: 06/30/2023]
Abstract
Deletions, duplications, insertions, inversions, and translocations, collectively called structural variations (SVs), affect more base pairs of the genome than any other sequence variant. The recent technological advancements in genome sequencing have enabled the discovery of tens of thousands of SVs per human genome. These SVs primarily affect non-coding DNA sequences, but the difficulties in interpreting their impact limit our understanding of human disease etiology. The functional annotation of non-coding DNA sequences and methodologies to characterize their three-dimensional (3D) organization in the nucleus have greatly expanded our understanding of the basic mechanisms underlying gene regulation, thereby improving the interpretation of SVs for their pathogenic impact. Here, we discuss the various mechanisms by which SVs can result in altered gene regulation and how these mechanisms can result in rare genetic disorders. Beyond changing gene expression, SVs can produce novel gene-intergenic fusion transcripts at the SV breakpoints.
Affiliation(s)
- Lila Allou
- RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Stefan Mundlos
- RG Development & Disease, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute for Medical and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Berlin-Brandenburg Center for Regenerative Therapies, Charité-Universitätsmedizin Berlin, Berlin, Germany
5
Rocha JJ, Jayaram SA, Stevens TJ, Muschalik N, Shah RD, Emran S, Robles C, Freeman M, Munro S. Functional unknomics: Systematic screening of conserved genes of unknown function. PLoS Biol 2023; 21:e3002222. [PMID: 37552676 PMCID: PMC10409296 DOI: 10.1371/journal.pbio.3002222] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 01/12/2023] [Accepted: 06/27/2023] [Indexed: 08/10/2023] Open
Abstract
The human genome encodes approximately 20,000 proteins, many still uncharacterised. It has become clear that scientific research tends to focus on well-studied proteins, leading to a concern that poorly understood genes are unjustifiably neglected. To address this, we have developed a publicly available and customisable "Unknome database" that ranks proteins based on how little is known about them. We applied RNA interference (RNAi) in Drosophila to 260 unknown genes that are conserved between flies and humans. Knockdown of some genes resulted in loss of viability, and functional screening of the rest revealed hits for fertility, development, locomotion, protein quality control, and resilience to stress. CRISPR/Cas9 gene disruption validated a component of Notch signalling and 2 genes contributing to male fertility. Our work illustrates the importance of poorly understood genes, provides a resource to accelerate future research, and highlights a need to support database curation to ensure that misannotation does not erode our awareness of our own ignorance.
Affiliation(s)
- João J. Rocha
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Tim J. Stevens
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Rajen D. Shah
- Centre for Mathematical Sciences, University of Cambridge, Cambridge, United Kingdom
- Sahar Emran
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Cristina Robles
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Matthew Freeman
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Sir William Dunn School of Pathology, University of Oxford, Oxford, United Kingdom
- Sean Munro
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
6
Huang S, Li J, Wu S, Zheng Z, Wang C, Li H, Zhao L, Zhang X, Huang H, Huang C, Xie Q. C4orf19 inhibits colorectal cancer cell proliferation by competitively binding to Keap1 with TRIM25 via the USP17/Elk-1/CDK6 axis. Oncogene 2023; 42:1333-1346. [PMID: 36882524 DOI: 10.1038/s41388-023-02656-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 11/03/2022] [Revised: 02/18/2023] [Accepted: 02/27/2023] [Indexed: 03/09/2023]
Abstract
Colorectal cancer (CRC) is one of the most common malignant tumors of the gastrointestinal tract and has attracted a great deal of attention and extensive investigation due to its high morbidity and mortality rates. The C4orf19 gene encodes a protein of uncharacterized function. Our preliminary exploration of the TCGA database indicated that C4orf19 is markedly downregulated in CRC tissues compared with normal colonic tissues, suggesting a potential association with CRC behaviors. Further studies showed a significant positive correlation between C4orf19 expression levels and CRC patient prognosis. Ectopic expression of C4orf19 inhibited the growth of CRC cells in vitro and their tumorigenic ability in vivo. Mechanistic studies showed that C4orf19 binds to Keap1 near Lys615, which prevents the ubiquitination of Keap1 by TRIM25, thus protecting the Keap1 protein from degradation. The accumulated Keap1 results in USP17 degradation, which in turn leads to the degradation of Elk-1, further attenuating Elk-1-regulated CDK6 mRNA transcription and protein expression, as well as the CRC cell proliferation it mediates. Collectively, the present studies characterize the function of C4orf19 as a tumor suppressor of CRC cell proliferation that targets the Keap1/USP17/Elk-1/CDK6 axis.
Affiliation(s)
- Shirui Huang
- Department of Clinical Laboratory, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, 325027, Zhejiang, China
- Jizhen Li
- Department of Clinical Laboratory, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, 325027, Zhejiang, China
- Shuang Wu
- Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, 325035, Zhejiang, China
- Zhijian Zheng
- Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, 325035, Zhejiang, China
- Cong Wang
- Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, 325035, Zhejiang, China
- Hongyan Li
- Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, 325035, Zhejiang, China
- Lingling Zhao
- Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, 325035, Zhejiang, China
- Xiaodong Zhang
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325035, Zhejiang, China
- Haishan Huang
- Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, 325035, Zhejiang, China
- Chuanshu Huang
- Zhejiang Provincial Key Laboratory of Medical Genetics, Key Laboratory of Laboratory Medicine, Ministry of Education, School of Laboratory Medicine and Life Sciences, Wenzhou Medical University, Wenzhou, 325035, Zhejiang, China
- Qipeng Xie
- Department of Clinical Laboratory, The Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, 325027, Zhejiang, China.
7
Bairoch A. Meet the Editorial Board Member. Curr Proteomics 2022. [DOI: 10.2174/157016461904220907111423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Amos Bairoch
- Department of Human Protein Sciences, Swiss-Prot Group, Swiss Institute of Bioinformatics, Geneva, Switzerland
8
Ilgisonis EV, Pogodin PV, Kiseleva OI, Tarbeeva SN, Ponomarenko EA. Evolution of Protein Functional Annotation: Text Mining Study. J Pers Med 2022; 12:jpm12030479. [PMID: 35330478 PMCID: PMC8952229 DOI: 10.3390/jpm12030479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 03/07/2022] [Accepted: 03/08/2022] [Indexed: 11/23/2022] Open
Abstract
Within the Human Proteome Project initiative framework for creating functional annotations of uPE1 proteins, the neXt-CP50 Challenge was launched in 2018. In analogy with the missing-protein challenge, each team deciphers the functional features of the proteins in the chromosome-centric mode. However, the neXt-CP50 Challenge is more complicated than the missing-protein challenge: the approaches and methods for solving the problem are clear, but neither the concept of protein function nor specific experimental and/or bioinformatics protocols have been standardized to address it. We proposed using a retrospective analysis of the key HPP repository, the neXtProt database, to identify the most frequently used experimental and bioinformatic methods for analyzing protein functions, and the dynamics of accumulation of functional annotations. It has been shown that the dynamics of the increase in the number of proteins with known functions are greater than the progress made in the experimental confirmation of the existence of questionable proteins in the framework of the missing-protein challenge. At the same time, the functional annotation is based on the guilt-by-association postulate, according to which, based on large-scale experiments on API-MS and Y2H, proteins with unknown functions are most likely mapped through “handshakes” to biochemical processes.
9
Stupp D, Sharon E, Bloch I, Zitnik M, Zuk O, Tabach Y. Co-evolution based machine-learning for predicting functional interactions between human genes. Nat Commun 2021; 12:6454. [PMID: 34753957 PMCID: PMC8578642 DOI: 10.1038/s41467-021-26792-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/26/2020] [Accepted: 10/09/2021] [Indexed: 12/20/2022] Open
Abstract
Over the next decade, more than a million eukaryotic species are expected to be fully sequenced. This has the potential to improve our understanding of genotype and phenotype crosstalk, gene function and interactions, and answer evolutionary questions. Here, we develop a machine-learning approach for utilizing phylogenetic profiles across 1154 eukaryotic species. This method integrates co-evolution across eukaryotic clades to predict functional interactions between human genes and the context for these interactions. We benchmark our approach showing a 14% performance increase (auROC) compared to previous methods. Using this approach, we predict functional annotations for less studied genes. We focus on DNA repair and verify that 9 of the top 50 predicted genes have been identified elsewhere, with others previously prioritized by high-throughput screens. Overall, our approach enables better annotation of function and functional interactions and facilitates the understanding of evolutionary processes underlying co-evolution. The manuscript is accompanied by a webserver available at: https://mlpp.cs.huji.ac.il. With the rise in the number of eukaryotic species being fully sequenced, large-scale phylogenetic profiling can give insights on gene function. Here, the authors describe a machine-learning approach that integrates co-evolution across eukaryotic clades to predict gene function and functional interactions among human genes.
Affiliation(s)
- Doron Stupp
- Department of Developmental Biology and Cancer Research, The Institute for Medical Research Israel-Canada, The Hebrew University of Jerusalem, 9112001, Jerusalem, Israel
- Elad Sharon
- Department of Developmental Biology and Cancer Research, The Institute for Medical Research Israel-Canada, The Hebrew University of Jerusalem, 9112001, Jerusalem, Israel
- Idit Bloch
- Department of Developmental Biology and Cancer Research, The Institute for Medical Research Israel-Canada, The Hebrew University of Jerusalem, 9112001, Jerusalem, Israel
- Marinka Zitnik
- Department of Biomedical Informatics, Harvard University, Boston, MA, 02115, USA
- Or Zuk
- Department of Statistics and Data Science, The Hebrew University of Jerusalem, Jerusalem, 9190501, Israel.
- Yuval Tabach
- Department of Developmental Biology and Cancer Research, The Institute for Medical Research Israel-Canada, The Hebrew University of Jerusalem, 9112001, Jerusalem, Israel.
10
Duek P, Mary C, Zahn-Zabal M, Bairoch A, Lane L. Functionathon: a manual data mining workflow to generate functional hypotheses for uncharacterized human proteins and its application by undergraduate students. Database (Oxford) 2021; 2021:baab046. [PMID: 34318869 PMCID: PMC8317215 DOI: 10.1093/database/baab046] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/07/2020] [Revised: 07/06/2021] [Accepted: 07/12/2021] [Indexed: 12/11/2022]
Abstract
About 10% of human proteins have no annotated function in protein knowledge bases. A workflow to generate hypotheses for the function of these uncharacterized proteins has been developed, based on predicted and experimental information on protein properties, interactions, tissular expression, subcellular localization, conservation in other organisms, as well as phenotypic data in mutant model organisms. This workflow has been applied to seven uncharacterized human proteins (C6orf118, C7orf25, CXorf58, RSRP1, SMLR1, TMEM53 and TMEM232) in the frame of a course-based undergraduate research experience named Functionathon organized at the University of Geneva to teach undergraduate students how to use biological databases and bioinformatics tools and interpret the results. C6orf118, CXorf58 and TMEM232 were proposed to be involved in cilia-related functions; TMEM53 and SMLR1 were proposed to be involved in lipid metabolism and C7orf25 and RSRP1 were proposed to be involved in RNA metabolism and gene expression. Experimental strategies to test these hypotheses were also discussed. The results of this manual data mining study may contribute to the project recently launched by the Human Proteome Organization (HUPO) Human Proteome Project aiming to fill gaps in the functional annotation of human proteins. Database URL: http://www.nextprot.org.
Affiliation(s)
- Paula Duek
- CALIPHO group, SIB Swiss Institute of Bioinformatics
- Department of microbiology and molecular medicine, Faculty of medicine, University of Geneva, Geneva, Switzerland
- Camille Mary
- Department of microbiology and molecular medicine, Faculty of medicine, University of Geneva, Geneva, Switzerland
- Amos Bairoch
- CALIPHO group, SIB Swiss Institute of Bioinformatics
- Department of microbiology and molecular medicine, Faculty of medicine, University of Geneva, Geneva, Switzerland
- Lydie Lane
- CALIPHO group, SIB Swiss Institute of Bioinformatics
- Department of microbiology and molecular medicine, Faculty of medicine, University of Geneva, Geneva, Switzerland
11
Tran Q, Sudasinghe A, Jones B, Xiong K, Cohen RE, Sharlin DS, Hartert KT, Goellner GM. FAM171B is a novel polyglutamine protein widely expressed in the mammalian brain. Brain Res 2021; 1766:147540. [PMID: 34052262 DOI: 10.1016/j.brainres.2021.147540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2020] [Revised: 05/14/2021] [Accepted: 05/24/2021] [Indexed: 10/21/2022]
Abstract
Mutation in proteins containing polyglutamine (polyQ) tracts has been shown to underlie a number of severe human neurodegenerative disorders such as Huntington's Disease and Spinocerebellar Ataxia. In this study, we identify and describe FAM171B as a novel polyQ protein containing fourteen consecutive glutamine residues in its National Center for Biotechnology Information (NCBI) referenced sequence. Utilizing western blotting, in situ hybridization, and immunohistochemistry, we demonstrate that FAM171B is widely expressed in mouse brain with pronounced localization in the hippocampus, cerebellum, and cerebral cortex. Furthermore, immunofluorescence experiments reveal that FAM171B predominantly localizes to vesicle-like structures in the cytoplasm of neurons. Finally, bioinformatic analysis suggests that FAM171B is robustly expressed in human brain, and (similar to other polyQ disease genes) its polyQ tract is polymorphic within the general human population. Thus, as a polyQ protein that is expressed in brain, FAM171B should be considered a candidate gene for an as yet molecularly uncharacterized neurodegenerative disease.
Affiliation(s)
- Quan Tran
- Department of Biological Sciences, Trafton South 242, Minnesota State University, Mankato, MN 56001, United States
- Ashani Sudasinghe
- Department of Biological Sciences, Trafton South 242, Minnesota State University, Mankato, MN 56001, United States
- Brooke Jones
- Department of Biological Sciences, Trafton South 242, Minnesota State University, Mankato, MN 56001, United States
- Ka Xiong
- Department of Biological Sciences, Trafton South 242, Minnesota State University, Mankato, MN 56001, United States
- Rachel E Cohen
- Department of Biological Sciences, Trafton South 242, Minnesota State University, Mankato, MN 56001, United States
- David S Sharlin
- Department of Biological Sciences, Trafton South 242, Minnesota State University, Mankato, MN 56001, United States
- Keenan T Hartert
- Department of Biological Sciences, Trafton South 242, Minnesota State University, Mankato, MN 56001, United States
- Geoffrey M Goellner
- Department of Biological Sciences, Trafton South 242, Minnesota State University, Mankato, MN 56001, United States.
12
De Luca E, Perrelli A, Swamy H, Nitti M, Passalacqua M, Furfaro AL, Salzano AM, Scaloni A, Glading AJ, Retta SF. Protein kinase Cα regulates the nucleocytoplasmic shuttling of KRIT1. J Cell Sci 2021; 134:jcs250217. [PMID: 33443102 PMCID: PMC7875496 DOI: 10.1242/jcs.250217] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/16/2020] [Accepted: 12/15/2020] [Indexed: 12/16/2022] Open
Abstract
KRIT1 is a scaffolding protein that regulates multiple molecular mechanisms, including cell-cell and cell-matrix adhesion, and redox homeostasis and signaling. However, rather little is known about how KRIT1 is itself regulated. KRIT1 is found in both the cytoplasm and the nucleus, yet the upstream signaling proteins and mechanisms that regulate KRIT1 nucleocytoplasmic shuttling are not well understood. Here, we identify a key role for protein kinase C (PKC) in this process. In particular, we found that PKC activation promotes the redox-dependent cytoplasmic localization of KRIT1, whereas inhibition of PKC or treatment with the antioxidant N-acetylcysteine leads to KRIT1 nuclear accumulation. Moreover, we demonstrated that the N-terminal region of KRIT1 is crucial for the ability of PKC to regulate KRIT1 nucleocytoplasmic shuttling, and may be a target for PKC-dependent regulatory phosphorylation events. Finally, we found that silencing of PKCα, but not PKCδ, inhibits phorbol 12-myristate 13-acetate (PMA)-induced cytoplasmic enrichment of KRIT1, suggesting a major role for PKCα in regulating KRIT1 nucleocytoplasmic shuttling. Overall, our findings identify PKCα as a novel regulator of KRIT1 subcellular compartmentalization, thus shedding new light on the physiopathological functions of this protein.
Affiliation(s)
- Elisa De Luca
- Department of Clinical and Biological Sciences, University of Torino, 10043 Orbassano, Torino, Italy
- CCM Italia Research Network, National Coordination Center at the Department of Clinical and Biological Sciences, University of Torino, 10043 Orbassano, Torino, Italy
- Center for Biomolecular Nanotechnologies, Istituto Italiano di Tecnologia, 73010 Arnesano, Lecce, Italy
- Andrea Perrelli
- Department of Clinical and Biological Sciences, University of Torino, 10043 Orbassano, Torino, Italy
- CCM Italia Research Network, National Coordination Center at the Department of Clinical and Biological Sciences, University of Torino, 10043 Orbassano, Torino, Italy
- Harsha Swamy
- Department of Pharmacology and Physiology, University of Rochester, Rochester, NY 14642, USA
- Mariapaola Nitti
- Department of Experimental Medicine, University of Genoa, 16132 Genova, Italy
- Mario Passalacqua
- Department of Experimental Medicine, University of Genoa, 16132 Genova, Italy
- Anna Lisa Furfaro
- Department of Experimental Medicine, University of Genoa, 16132 Genova, Italy
| | - Anna Maria Salzano
- Proteomics & Mass Spectrometry Laboratory, ISPAAM, National Research Council, 80147 Napoli, Italy
| | - Andrea Scaloni
- Proteomics & Mass Spectrometry Laboratory, ISPAAM, National Research Council, 80147 Napoli, Italy
| | - Angela J Glading
- Department of Pharmacology and Physiology, University of Rochester, Rochester, NY 14642, USA
| | - Saverio Francesco Retta
- Department of Clinical and Biological Sciences, University of Torino, 10043 Orbassano, Torino, Italy
- CCM Italia Research Network, National Coordination Center at the Department of Clinical and Biological Sciences, University of Torino, 10043 Orbassano, Torino, Italy
| |
Collapse
|
13
|
Jamin SP, Hikmet F, Mathieu R, Jégou B, Lindskog C, Chalmel F, Primig M. Combined RNA/tissue profiling identifies novel Cancer/testis genes. Mol Oncol 2021; 15:3003-3023. [PMID: 33426787 PMCID: PMC8564638 DOI: 10.1002/1878-0261.12900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2020] [Revised: 11/19/2020] [Accepted: 12/24/2020] [Indexed: 11/14/2022] Open
Abstract
Cancer/Testis (CT) genes are induced in germ cells, repressed in somatic cells, and derepressed in somatic tumors, where these genes can contribute to cancer progression. CT gene identification requires data obtained using standardized protocols and technologies. This is a challenge because data for germ cells, gonads, normal somatic tissues, and a wide range of cancer samples stem from multiple sources and were generated over substantial periods of time. We carried out a GeneChip‐based RNA profiling analysis using our own data for testis and enriched germ cells, data for somatic cancers from the Expression Project for Oncology, and data for normal somatic tissues from the Gene Omnibus Repository. We identified 478 candidate loci that include known CT genes, numerous genes associated with oncogenic processes, and novel candidates that are not referenced in the Cancer/Testis Database (www.cta.lncc.br). We complemented RNA expression data at the protein level for SPESP1, GALNTL5, PDCL2, and C11orf42 using cancer tissue microarrays covering malignant tumors of breast, uterus, thyroid, and kidney, as well as published RNA profiling and immunohistochemical data provided by the Human Protein Atlas (www.proteinatlas.org). We report that combined RNA/tissue profiling identifies novel CT genes that may be of clinical interest as therapeutical targets or biomarkers. Our findings also highlight the challenges of detecting truly germ cell‐specific mRNAs and the proteins they encode in highly heterogenous testicular, somatic, and tumor tissues.
Affiliation(s)
- Soazik P Jamin
  - Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S, Univ Rennes, France
- Feria Hikmet
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University, Sweden
- Romain Mathieu
  - Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S, Univ Rennes, France
  - Department of Urology, University Hospital, Rennes, France
- Bernard Jégou
  - Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S, Univ Rennes, France
- Cecilia Lindskog
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University, Sweden
- Frédéric Chalmel
  - Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S, Univ Rennes, France
- Michael Primig
  - Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S, Univ Rennes, France
14
Affiliation(s)
- Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology, Faculty of Dentistry, The University of British Columbia, Vancouver, British Columbia V6T 1Z3, Canada
15
Hwang H, Im JE, Yang Y, Kim H, Kwon KH, Kim YH, Kim JY, Yoo JS. Bioinformatic Prediction of Gene Ontology Terms of Uncharacterized Proteins from Chromosome 11. J Proteome Res 2020; 19:4907-4912. [PMID: 33089979 DOI: 10.1021/acs.jproteome.0c00482] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/18/2023]
Abstract
In chromosome 11, 71 out of its 1254 proteins remain functionally uncharacterized on the basis of their existence evidence (uPE1s) following the latest version of neXtProt (release 2020-01-17). Because in vivo and in vitro experimental strategies are often time-consuming and labor-intensive, there is a need for a bioinformatics tool to predict the function annotation. Here, we used I-TASSER/COFACTOR provided on the neXtProt web site, which predicts gene ontology (GO) terms based on the 3D structure of the protein. I-TASSER/COFACTOR predicted 2413 GO terms with a benchmark dataset of the 22 proteins belonging to PE1 of chromosome 11. In this study, we developed a filtering algorithm in order to select specific GO terms using the GO map generated by I-TASSER/COFACTOR. As a result, 187 specific GO terms showed a higher average precision-recall score at the least cellular component term compared to 2413 predicted GO terms. Next, we applied 65 proteins belonging to uPE1s of chromosome 11, and then 409 out of 6684 GO terms survived, where 103 and 142 GO terms of molecular function and biological process, respectively, were included. Representatively, the cellular component GO terms of CCDC90B, C11orf52, and the SMAP were predicted and validated using the overexpression system into 293T cells and immunofluorescence staining. We will further study their biological and molecular functions toward the goal of the neXt-CP50 project as a part of C-HPP. We shared all results and programs in Github (https://github.com/heeyounh/I-TASSER-COFACTOR-filtering.git).
Affiliation(s)
- Heeyoun Hwang
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Ji Eun Im
  - Division of Convergence Technology, Research Institute of National Cancer Center, Goyang 10408, Republic of Korea
- Yeji Yang
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Hyejin Kim
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
  - Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
- Kyung-Hoon Kwon
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Yun-Hee Kim
  - Division of Convergence Technology, Research Institute of National Cancer Center, Goyang 10408, Republic of Korea
  - Department of Cancer Biomedical Science, The National Cancer Center Graduate School of Cancer Science and Policy, Goyang 10408, Republic of Korea
- Jin Young Kim
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
- Jong Shin Yoo
  - Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju 28119, Republic of Korea
  - Graduate School of Analytical Science and Technology, Chungnam National University, Daejeon 34134, Republic of Korea
16
17
Lachén-Montes M, Mendizuri N, Ausín K, Pérez-Mediavilla A, Azkargorta M, Iloro I, Elortza F, Kondo H, Ohigashi I, Ferrer I, de la Torre R, Robledo P, Fernández-Irigoyen J, Santamaría E. Smelling the Dark Proteome: Functional Characterization of PITH Domain-Containing Protein 1 (C1orf128) in Olfactory Metabolism. J Proteome Res 2020; 19:4826-4843. [PMID: 33185454 DOI: 10.1021/acs.jproteome.0c00452] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/22/2022]
Abstract
The Human Proteome Project (HPP) consortium aims to functionally characterize the dark proteome. On the basis of the relevance of olfaction in early neurodegeneration, we have analyzed the dark proteome using data mining in public resources and omics data sets derived from the human olfactory system. Multiple dark proteins localize at synaptic terminals and may be involved in amyloidopathies such as Alzheimer's disease (AD). We have characterized the dark PITH domain-containing protein 1 (PITHD1) in olfactory metabolism using bioinformatics, proteomics, in vitro and in vivo studies, and neuropathology. PITHD1-/- mice exhibit olfactory bulb (OB) proteome changes related to synaptic transmission, cognition, and memory. OB PITHD1 expression increases with age in wild-type (WT) mice and decreases in Tg2576 AD mice at late stages. The analysis across 6 neurological disorders reveals that olfactory tract (OT) PITHD1 is specifically upregulated in human AD. Stimulation of olfactory neuroepithelial (ON) cells with PITHD1 alters the ON phosphoproteome, modifies the proliferation rate, and induces a pro-inflammatory phenotype. This workflow applied by the Spanish C-HPP and Human Brain Proteome Project (HBPP) teams across the ON-OB-OT axis can be adapted as a guidance to decipher functional features of dark proteins. Data are available via ProteomeXchange with identifiers PXD018784 and PXD021634.
Affiliation(s)
- Mercedes Lachén-Montes
  - Clinical Neuroproteomics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - Proteored-ISCIII, Proteomics Platform, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - IdiSNA, Navarra Institute for Health Research, Irunlarrea 3, 31008 Pamplona, Spain
- Naroa Mendizuri
  - Clinical Neuroproteomics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - Proteored-ISCIII, Proteomics Platform, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - IdiSNA, Navarra Institute for Health Research, Irunlarrea 3, 31008 Pamplona, Spain
- Karina Ausín
  - Clinical Neuroproteomics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - Proteored-ISCIII, Proteomics Platform, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - IdiSNA, Navarra Institute for Health Research, Irunlarrea 3, 31008 Pamplona, Spain
- Alberto Pérez-Mediavilla
  - IdiSNA, Navarra Institute for Health Research, Irunlarrea 3, 31008 Pamplona, Spain
  - Neurobiology of Alzheimer's Disease, Department of Biochemistry, Center for Applied Medical Research (CIMA), Neurosciences Division, University of Navarra, 31008 Pamplona, Spain
- Mikel Azkargorta
  - Proteomics Platform, CIC bioGUNE, CIBERehd, ProteoRed-ISCIII, Bizkaia Science and Technology Park, 48160 Derio, Spain
- Ibon Iloro
  - Proteomics Platform, CIC bioGUNE, CIBERehd, ProteoRed-ISCIII, Bizkaia Science and Technology Park, 48160 Derio, Spain
- Felix Elortza
  - Proteomics Platform, CIC bioGUNE, CIBERehd, ProteoRed-ISCIII, Bizkaia Science and Technology Park, 48160 Derio, Spain
- Hiroyuki Kondo
  - Division of Experimental Immunology, Institute of Advanced Medical Sciences, Tokushima University, Tokushima 770-8503, Japan
- Izumi Ohigashi
  - Division of Experimental Immunology, Institute of Advanced Medical Sciences, Tokushima University, Tokushima 770-8503, Japan
- Isidre Ferrer
  - Bellvitge Biomedical Research Institute (IDIBELL), 08908 Hospitalet de Llobregat, Spain
  - CIBERNED (Network Centre of Biomedical Research of Neurodegenerative Diseases), Institute of Health Carlos III, 28029 Madrid, Spain
  - Department of Pathology and Experimental Therapeutics, University of Barcelona, 08908 Hospitalet de Llobregat, Spain
  - Institute of Neurosciences, University of Barcelona, 08007 Barcelona, Spain
- Rafael de la Torre
  - Integrative Pharmacology and Systems Neuroscience Research Group, Neurosciences Research Program, IMIM (Hospital del Mar Medical Research Institute), 08003 Barcelona, Spain
  - Department of Experimental and Health Sciences, Pompeu Fabra University (CEXS-UPF), 08002 Barcelona, Spain
  - School of Medicine, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Spain
  - CIBER de Fisiopatología de la Obesidad y Nutrición (CB06/03), CIBEROBN, 28029 Madrid, Spain
- Patricia Robledo
  - Integrative Pharmacology and Systems Neuroscience Research Group, Neurosciences Research Program, IMIM (Hospital del Mar Medical Research Institute), 08003 Barcelona, Spain
  - Department of Experimental and Health Sciences, Pompeu Fabra University (CEXS-UPF), 08002 Barcelona, Spain
- Joaquín Fernández-Irigoyen
  - Clinical Neuroproteomics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - Proteored-ISCIII, Proteomics Platform, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - IdiSNA, Navarra Institute for Health Research, Irunlarrea 3, 31008 Pamplona, Spain
- Enrique Santamaría
  - Clinical Neuroproteomics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - Proteored-ISCIII, Proteomics Platform, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), Irunlarrea 3, 31008 Pamplona, Spain
  - IdiSNA, Navarra Institute for Health Research, Irunlarrea 3, 31008 Pamplona, Spain
18
Vandenbrouck Y, Pineau C, Lane L. The Functionally Unannotated Proteome of Human Male Tissues: A Shared Resource to Uncover New Protein Functions Associated with Reproductive Biology. J Proteome Res 2020; 19:4782-4794. [PMID: 33064489 DOI: 10.1021/acs.jproteome.0c00516] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/15/2022]
Abstract
In the context of the Human Proteome Project, we built an inventory of 412 functionally unannotated human proteins for which experimental evidence at the protein level exists (uPE1) and which are highly expressed in tissues involved in human male reproduction. We implemented a strategy combining literature mining, bioinformatics tools to collate annotation and experimental information from specific molecular public resources, and efficient visualization tools to put these unknown proteins into their biological context (protein complexes, tissue and subcellular location, expression pattern). The gathered knowledge allowed pinpointing five uPE1 for which a function has recently been proposed and which should be updated in protein knowledge bases. Furthermore, this bioinformatics strategy allowed to build new functional hypotheses for five other uPE1s in link with phenotypic traits that are specific to male reproductive function such as ciliogenesis/flagellum formation in germ cells (CCDC112 and TEX9), chromatin remodeling (C3orf62) and spermatozoon maturation (CCDC183). We also discussed the enigmatic case of MAGEB proteins, a poorly documented cancer/testis antigen subtype. Tools used and computational outputs produced during this study are freely accessible via ProteoRE (http://www.proteore.org), a Galaxy-based instance, for reuse purposes. We propose these five uPE1s should be investigated in priority by expert laboratories and hope that this inventory and shared resources will stimulate the interest of the community of reproductive biology.
Affiliation(s)
- Yves Vandenbrouck
  - Univ. Grenoble Alpes, INSERM, CEA, IRIG-BGE, U1038, F-38000 Grenoble, France
- Charles Pineau
  - Univ. Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35042 Rennes cedex, France
- Lydie Lane
  - SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, Michel Servet 1, 1211 Geneva 4, Switzerland
19
Poverennaya E, Kiseleva O, Romanova A, Pyatnitskiy M. Predicting Functions of Uncharacterized Human Proteins: From Canonical to Proteoforms. Genes (Basel) 2020; 11:E677. [PMID: 32575886 PMCID: PMC7350264 DOI: 10.3390/genes11060677] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/30/2020] [Revised: 06/09/2020] [Accepted: 06/19/2020] [Indexed: 01/22/2023] Open
Abstract
Despite tremendous efforts in genomics, transcriptomics, and proteomics communities, there is still no comprehensive data about the exact number of protein-coding genes, translated proteoforms, and their function. In addition, by now, we lack functional annotation for 1193 genes, where expression was confirmed at the proteomic level (uPE1 proteins). We re-analyzed results of AP-MS experiments from the BioPlex 2.0 database to predict functions of uPE1 proteins and their splice forms. By building a protein-protein interaction network for 12 thousand identified proteins encoded by 11 thousand genes, we were able to predict Gene Ontology categories for a total of 387 uPE1 genes. We predicted different functions for canonical and alternatively spliced forms for four uPE1 genes. In total, functional differences were revealed for 62 proteoforms encoded by 31 genes. Based on these results, it can be cautiously concluded that the dynamics and versatility of the interactome are ensured by changing the dominant splice form. Overall, we propose that analysis of large-scale AP-MS experiments performed for various cell lines and under various conditions is a key to understanding the full potential of the role of genes in cellular processes.
Affiliation(s)
- Ekaterina Poverennaya
  - Department of Bioinformatics, Institute of Biomedical Chemistry, 119121 Moscow, Russia
  - Institute of Environmental and Agricultural Biology (X-BIO), Tyumen State University, 625003 Tyumen, Russia
- Olga Kiseleva
  - Department of Bioinformatics, Institute of Biomedical Chemistry, 119121 Moscow, Russia
- Anastasia Romanova
  - Department of Bioinformatics, Institute of Biomedical Chemistry, 119121 Moscow, Russia
  - Faculty of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, 141701 Moscow, Russia
- Mikhail Pyatnitskiy
  - Department of Bioinformatics, Institute of Biomedical Chemistry, 119121 Moscow, Russia
  - Department of Molecular Biology and Genetics, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, 119435 Moscow, Russia
20
Affiliation(s)
- Monique Zahn-Zabal
  - CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, Geneva, Switzerland
- Lydie Lane
  - CALIPHO Group, SIB Swiss Institute of Bioinformatics and University of Geneva, Geneva, Switzerland
21
Zhang C, Lane L, Omenn GS, Zhang Y. Blinded Testing of Function Annotation for uPE1 Proteins by I-TASSER/COFACTOR Pipeline Using the 2018-2019 Additions to neXtProt and the CAFA3 Challenge. J Proteome Res 2019; 18:4154-4166. [PMID: 31581775 PMCID: PMC6900986 DOI: 10.1021/acs.jproteome.9b00537] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/19/2022]
Abstract
In 2018, we reported a hybrid pipeline that predicts protein structures with I-TASSER and function with COFACTOR. I-TASSER/COFACTOR achieved Gene Ontology (GO) high prediction accuracies of Fmax = 0.69 and 0.57 for molecular function (MF) and biological process (BP), respectively, on 100 comprehensively annotated proteins. Now we report blinded analyses of newly annotated proteins in the critical assessment of function annotation (CAFA) three function prediction challenge and in neXtProt. For CAFA3 results released in May 2019, our predictions on 267 and 912 human proteins with newly annotated MF and BP terms achieved Fmax = 0.50 and 0.42, respectively, on "No Knowledge" proteins, and 0.51 and 0.74, respectively, on "Limited Knowledge" proteins. While COFACTOR consistently outperforms simple homology-based analysis, its accuracy still depends on template availability. Meanwhile, in neXtProt 2019-01, 25 proteins acquired new function annotation through literature curation at UniProt/Swiss-Prot. Before the release of these curated results, we submitted to neXtProt blinded predictions of free-text function annotation based on predicted GO terms. For 10 of the 25, a good match of free-text or GO term annotation was obtained. These blind tests represent rigorous assessments of I-TASSER/COFACTOR. neXtProt now provides links to precomputed I-TASSER/COFACTOR predictions for proteins without function annotation to facilitate experimental planning on "dark proteins".
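The Fmax values quoted in the abstract above follow the protein-centric metric used in CAFA: at each score threshold, precision is averaged over proteins with at least one prediction, recall is averaged over all benchmark proteins, and Fmax is the best resulting F-measure over all thresholds. The sketch below illustrates that calculation only. The function name `fmax`, the toy GO identifiers, and the scores are hypothetical, and the propagation of terms to their GO ancestors that a real evaluation performs is omitted.

```python
from typing import Dict, Iterable, Optional, Set


def fmax(predictions: Dict[str, Dict[str, float]],
         truth: Dict[str, Set[str]],
         thresholds: Optional[Iterable[float]] = None) -> float:
    """Protein-centric Fmax (simplified: no GO ancestor propagation).

    predictions maps protein -> {term: confidence score in [0, 1]}.
    truth maps protein -> non-empty set of experimentally annotated terms.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]

    best = 0.0
    for t in thresholds:
        precisions = []   # only proteins with >= 1 prediction at this threshold
        recall_sum = 0.0  # accumulated over every benchmark protein
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, score in scored.items() if score >= t}
            true_positives = len(predicted & true_terms)
            if predicted:
                precisions.append(true_positives / len(predicted))
            recall_sum += true_positives / len(true_terms)
        if not precisions:
            continue  # nothing predicted at this threshold
        pr = sum(precisions) / len(precisions)
        rc = recall_sum / len(truth)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Hypothetical toy benchmark: two proteins, made-up term identifiers.
toy_truth = {"P1": {"GO:A", "GO:B"}, "P2": {"GO:C"}}
toy_predictions = {
    "P1": {"GO:A": 0.9, "GO:B": 0.4, "GO:X": 0.3},
    "P2": {"GO:C": 0.8, "GO:Y": 0.6},
}
print(round(fmax(toy_predictions, toy_truth), 3))  # prints 0.857
```

In this toy case the maximum is reached both at a permissive threshold (perfect recall, precision 0.75) and at a strict one (perfect precision, recall 0.75), which shows why a single Fmax value does not reveal the underlying precision and recall trade-off.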
Affiliation(s)
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109-2218, United States
- Lydie Lane
  - CALIPHO Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
  - Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Gilbert S. Omenn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109-2218, United States
  - Departments of Internal Medicine and Human Genetics and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109-2218, United States
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109-2218, United States
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109-2218, United States
22
Omenn GS, Lane L, Overall CM, Corrales FJ, Schwenk JM, Paik YK, Van Eyk JE, Liu S, Pennington S, Snyder MP, Baker MS, Deutsch EW. Progress on Identifying and Characterizing the Human Proteome: 2019 Metrics from the HUPO Human Proteome Project. J Proteome Res 2019; 18:4098-4107. [PMID: 31430157 PMCID: PMC6898754 DOI: 10.1021/acs.jproteome.9b00434] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 01/17/2023]
Abstract
The Human Proteome Project (HPP) annually reports on progress made throughout the field in credibly identifying and characterizing the complete human protein parts list and making proteomics an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2019-01-11 contains 17 694 proteins with strong protein-level evidence (PE1), compliant with HPP Guidelines for Interpretation of MS Data v2.1; these represent 89% of all 19 823 neXtProt predicted coding genes (all PE1,2,3,4 proteins), up from 17 470 one year earlier. Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), has been reduced from 2949 to 2129 since 2016 through efforts throughout the community, including the chromosome-centric HPP. PeptideAtlas is the source of uniformly reanalyzed raw mass spectrometry data for neXtProt; PeptideAtlas added 495 canonical proteins between 2018 and 2019, especially from studies designed to detect hard-to-identify proteins. Meanwhile, the Human Protein Atlas has released version 18.1 with immunohistochemical evidence of expression of 17 000 proteins and survival plots as part of the Pathology Atlas. Many investigators apply multiplexed SRM-targeted proteomics for quantitation of organ-specific popular proteins in studies of various human diseases. The 19 teams of the Biology and Disease-driven B/D-HPP published a total of 160 publications in 2018, bringing proteomics to a broad array of biomedical research.
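The headline figures in the abstract above are mutually consistent, which a short calculation can confirm. The sketch below uses only numbers quoted in the abstract; the variable names are illustrative, and it assumes, as the abstract states, that the predicted coding genes are the PE1 to PE4 proteins and that the missing proteins are the PE2 to PE4 subset.

```python
# Figures quoted in the abstract (neXtProt release 2019-01-11).
pe1_2019 = 17_694        # proteins with strong protein-level evidence (PE1)
pe1_2018 = 17_470        # PE1 count one year earlier
coding_genes = 19_823    # all PE1-PE4 predicted coding genes
missing_2016 = 2_949     # PE2-PE4 "missing proteins" reported for 2016

# Missing proteins are the coding genes that are not yet PE1.
missing_2019 = coding_genes - pe1_2019
pe1_share = 100 * pe1_2019 / coding_genes
pe1_gain = pe1_2019 - pe1_2018
missing_drop = missing_2016 - missing_2019

print(f"Missing proteins in 2019: {missing_2019}")
print(f"PE1 share of coding genes: {pe1_share:.1f}%")
print(f"PE1 proteins gained in one year: {pe1_gain}")
print(f"Missing proteins resolved since 2016: {missing_drop}")
```

The derived count of missing proteins matches the 2129 stated in the abstract, and the PE1 share rounds to the reported 89%.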
Affiliation(s)
- Gilbert S. Omenn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States
- Lydie Lane
  - CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, Michel-Servet 1, 1211 Geneva 4, Switzerland
- Christopher M. Overall
  - Life Sciences Institute, Faculty of Dentistry, University of British Columbia, 2350 Health Sciences Mall, Room 4.401, Vancouver, British Columbia V6T 1Z3, Canada
- Jochen M. Schwenk
  - Science for Life Laboratory, KTH Royal Institute of Technology, Tomtebodavägen 23A, 17165 Solna, Sweden
- Young-Ki Paik
  - Yonsei Proteome Research Center, Yonsei University, Room 425, Building #114, 50 Yonsei-ro, Seodaemoon-ku, Seoul 120-749, South Korea
- Jennifer E. Van Eyk
  - Advanced Clinical BioSystems Research Institute, Cedars Sinai Precision Biomarker Laboratories, Barbra Streisand Women’s Heart Center, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Siqi Liu
  - BGI Group-Shenzhen, Yantian District, Shenzhen 518083, China
- Stephen Pennington
  - School of Medicine, University College Dublin, Conway Institute Belfield, Dublin 4, Ireland
- Michael P. Snyder
  - Department of Genetics, Stanford University, Alway Building, 300 Pasteur Drive and 3165 Porter Drive, Palo Alto, California 94304, United States
- Mark S. Baker
  - Department of Biomedical Sciences, Faculty of Medicine & Health Sciences, Macquarie University, 75 Talavera Road, North Ryde, NSW 2109, Australia
- Eric W. Deutsch
  - Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109-5263, United States
23
Sima AC, Dessimoz C, Stockinger K, Zahn-Zabal M, Mendes de Farias T. A hands-on introduction to querying evolutionary relationships across multiple data sources using SPARQL. F1000Res 2019; 8:1822. [PMID: 32612807 PMCID: PMC7324951 DOI: 10.12688/f1000research.21027.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Accepted: 07/09/2020] [Indexed: 11/20/2022] Open
Abstract
The increasing use of Semantic Web technologies in the life sciences, in particular the use of the Resource Description Framework (RDF) and the RDF query language SPARQL, opens the path for novel integrative analyses, combining information from multiple data sources. However, analyzing evolutionary data in RDF is not trivial, due to the steep learning curve required to understand both the data models adopted by different RDF data sources, as well as the equivalent SPARQL constructs required to benefit from this data - in particular, recursive property paths. In this article, we provide a hands-on introduction to querying evolutionary data across several data sources that publish orthology information in RDF, namely: The Orthologous MAtrix (OMA), the European Bioinformatics Institute (EBI) RDF platform, the Database of Orthologous Groups (OrthoDB) and the Microbial Genome Database (MBGD). We present four protocols in increasing order of complexity. In these protocols, we demonstrate through SPARQL queries how to retrieve pairwise orthologs, homologous groups, and hierarchical orthologous groups. Finally, we show how orthology information in different data sources can be compared, through the use of federated SPARQL queries.
Affiliation(s)
- Ana Claudia Sima
  - ZHAW Zurich University of Applied Sciences, Winterthur, Zurich, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Vaud, Switzerland
  - Department of Computer Science, University College London, London, UK
  - Department of Genetics, Evolution, and Environment, University College London, London, UK
- Kurt Stockinger
  - ZHAW Zurich University of Applied Sciences, Winterthur, Zurich, Switzerland
- Monique Zahn-Zabal
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
- Tarcisio Mendes de Farias
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Vaud, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Vaud, Switzerland
24
Sima AC, Dessimoz C, Stockinger K, Zahn-Zabal M, Mendes de Farias T. A hands-on introduction to querying evolutionary relationships across multiple data sources using SPARQL. F1000Res 2019; 8:1822. [PMID: 32612807 PMCID: PMC7324951 DOI: 10.12688/f1000research.21027.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Accepted: 10/22/2019] [Indexed: 08/01/2024] Open
Abstract
The increasing use of Semantic Web technologies in the life sciences, in particular the use of the Resource Description Framework (RDF) and the RDF query language SPARQL, opens the path for novel integrative analyses, combining information from multiple sources. However, analyzing evolutionary data in RDF is not trivial, due to the steep learning curve required to understand both the data models adopted by different RDF data sources, as well as the SPARQL query language. In this article, we provide a hands-on introduction to querying evolutionary data across multiple sources that publish orthology information in RDF, namely: The Orthologous MAtrix (OMA), the European Bioinformatics Institute (EBI) RDF platform, the Database of Orthologous Groups (OrthoDB) and the Microbial Genome Database (MBGD). We present four protocols in increasing order of complexity. In these protocols, we demonstrate through SPARQL queries how to retrieve pairwise orthologs, homologous groups, and hierarchical orthologous groups. Finally, we show how orthology information in different sources can be compared, through the use of federated SPARQL queries.
Affiliation(s)
- Ana Claudia Sima
  - ZHAW Zurich University of Applied Sciences, Winterthur, Zurich, Switzerland
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
- Christophe Dessimoz
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Vaud, Switzerland
  - Department of Computer Science, University College London, London, UK
  - Department of Genetics, Evolution, and Environment, University College London, London, UK
- Kurt Stockinger
  - ZHAW Zurich University of Applied Sciences, Winterthur, Zurich, Switzerland
- Monique Zahn-Zabal
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
- Tarcisio Mendes de Farias
  - Department of Computational Biology, University of Lausanne, Lausanne, Vaud, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Vaud, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Lausanne, Vaud, Switzerland
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Vaud, Switzerland
25
Abstract
Using neXtProt release 2019-01-11, we manually curated a list of 1837 functionally uncharacterized human proteins. Using OrthoList 2, we found that 270 of them have homologues in Caenorhabditis elegans, including 60 with a one-to-one orthology relationship. According to annotations extracted from WormBase, the vast majority of these 60 worm genes have RNAi experimental data or mutant alleles, but manual inspection shows that only 15% have phenotypes that could be interpreted in terms of a specific function. One third of the worm orthologs have protein-protein interaction data, and two of these interactions are conserved in humans. The combination of phenotypic, protein-protein interaction, and gene expression data provides functional hypotheses for 8 uncharacterized human proteins. Experimental validation in human or orthologs is necessary before they can be considered for annotation.
Affiliation(s)
- Paula Duek
- CALIPHO Group, SIB-Swiss Institute of Bioinformatics, CMU, Michel-Servet 1, 1211 Geneva 4, Switzerland
- Lydie Lane
- CALIPHO Group, SIB-Swiss Institute of Bioinformatics, CMU, Michel-Servet 1, 1211 Geneva 4, Switzerland; Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, Michel-Servet 1, 1211 Geneva 4, Switzerland
|
26
|
Bauer TJ, Gombocz E, Krüger M, Sahana J, Corydon TJ, Bauer J, Infanger M, Grimm D. Augmenting cancer cell proteomics with cellular images - A semantic approach to understand focal adhesion. J Biomed Inform 2019; 100:103320. [PMID: 31669288 DOI: 10.1016/j.jbi.2019.103320] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/17/2019] [Revised: 09/23/2019] [Accepted: 10/23/2019] [Indexed: 01/13/2023]
Abstract
If monolayers of cancer cells are exposed to microgravity, some of the cells cease adhering to the bottom of a culture flask and join three-dimensional aggregates floating in the culture medium. Searching for reasons for this change in phenotype, we performed proteome analyses and learnt that accumulation and posttranslational modification of proteins involved in cell-matrix and cell-cell adhesion are affected. To further investigate these proteins, we developed a methodology to find histological images of focal adhesion complex (FA) proteins. Selecting proteins expressed by human FTC-133 and MCF-7 cancer cells and known to be incorporated in FA, we transformed the experimental data to RDF to establish a core semantic knowledgebase. Applying iterative SPARQL queries to Linked Open Databases, we augmented these data with additional functional, transformation- and aggregation-related relationships. Using reasoning, we retrieved publications with images of the spatial arrangement of proteins incorporated in FA. Contextualizing those images enabled us to gain insights about FA of cells changing their site of growth, and to independently validate our experimental results. This new way to link experimental proteome data to biomedical knowledge from various sources via searching images may generally be applied in science when images are a tool of knowledge dissemination.
Affiliation(s)
- Thomas J Bauer
- Clinic for Plastic, Aesthetic and Hand Surgery, Otto-von-Guericke-University Magdeburg, D-39120 Magdeburg, Germany.
- Erich Gombocz
- Melissa Informatics, 2550 Ninth Street, Suite 114, Berkeley, CA, USA.
- Marcus Krüger
- Clinic for Plastic, Aesthetic and Hand Surgery, Otto-von-Guericke-University Magdeburg, D-39120 Magdeburg, Germany.
- Jayashree Sahana
- Department of Biomedicine, Aarhus University, Hoeg-Guldbergsgade 10, DK-8000 Aarhus C, Denmark.
- Thomas J Corydon
- Department of Biomedicine, Aarhus University, Hoeg-Guldbergsgade 10, DK-8000 Aarhus C, Denmark; Department of Ophthalmology, Aarhus University Hospital, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark.
- Johann Bauer
- Max-Planck Institute of Biochemistry, D-82152 Martinsried, Germany.
- Manfred Infanger
- Clinic for Plastic, Aesthetic and Hand Surgery, Otto-von-Guericke-University Magdeburg, D-39120 Magdeburg, Germany.
- Daniela Grimm
- Clinic for Plastic, Aesthetic and Hand Surgery, Otto-von-Guericke-University Magdeburg, D-39120 Magdeburg, Germany; Department of Biomedicine, Aarhus University, Hoeg-Guldbergsgade 10, DK-8000 Aarhus C, Denmark; Gravitational Biology and Translational Regenerative Medicine, Faculty of Medicine and Mechanical Engineering, Otto-von-Guericke-University-Magdeburg, D-39120 Magdeburg, Germany.
|
27
|
Paik YK, Overall CM, Corrales F, Deutsch EW, Lane L, Omenn GS. Toward Completion of the Human Proteome Parts List: Progress Uncovering Proteins That Are Missing or Have Unknown Function and Developing Analytical Methods. J Proteome Res 2019; 17:4023-4030. [PMID: 30985145 PMCID: PMC6288998 DOI: 10.1021/acs.jproteome.8b00885] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 02/07/2023]
Affiliation(s)
- Young-Ki Paik
- Yonsei Proteome Research Center, College of Life Science and Technology, Yonsei University
- Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences and Biochemistry & Molecular Biology, Faculty of Dentistry, University of British Columbia
- Fernando Corrales
- Functional Proteomics Laboratory National Center of Biotechnology, CSIC
- Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, CMU, University of Geneva
- Gilbert S Omenn
- Institute for Systems Biology, Departments of Computational Medicine & Bioinformatics, Internal Medicine, and Human Genetics & School of Public Health, University of Michigan
|
28
|
Pineau C, Hikmet F, Zhang C, Oksvold P, Chen S, Fagerberg L, Uhlén M, Lindskog C. Cell Type-Specific Expression of Testis Elevated Genes Based on Transcriptomics and Antibody-Based Proteomics. J Proteome Res 2019; 18:4215-4230. [PMID: 31429579 DOI: 10.1021/acs.jproteome.9b00351] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 12/12/2022]
Abstract
One of the most complex organs in the human body is the testis, where spermatogenesis takes place. This physiological process involves thousands of genes and proteins that are activated and repressed, making testis the organ with the highest number of tissue-specific genes. However, the function of a large proportion of the corresponding proteins remains unknown and testis harbors many missing proteins (MPs), defined as products of protein-coding genes that lack experimental mass spectrometry evidence. Here, an integrated omics approach was used for exploring the cell type-specific protein expression of genes with an elevated expression in testis. By combining genome-wide transcriptomics analysis with immunohistochemistry, more than 500 proteins with distinct testicular protein expression patterns were identified, and these were selected for in-depth characterization of their in situ expression in eight different testicular cell types. The cell type-specific protein expression patterns allowed us to identify six distinct clusters of expression at different stages of spermatogenesis. The analysis highlighted numerous poorly characterized proteins in each of these clusters whose expression overlapped with that of known proteins involved in spermatogenesis, including 85 proteins with an unknown function and 60 proteins that previously have been classified as MPs. Furthermore, we were able to characterize the in situ distribution of several proteins that previously lacked spatial information and cell type-specific expression within the testis. The testis elevated expression levels both at the RNA and protein levels suggest that these proteins are related to testis-specific functions. In summary, the study demonstrates the power of combining genome-wide transcriptomics analysis with antibody-based protein profiling to explore the cell type-specific expression of both well-known proteins and MPs. The analyzed proteins constitute important targets for further testis-specific research in male reproductive disorders.
Affiliation(s)
- Charles Pineau
- Univ Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail), UMR_S 1085, 35042 Rennes Cedex, France; Protim, Univ Rennes, 35042 Rennes Cedex, France
- Feria Hikmet
- Uppsala University, Department of Immunology, Genetics and Pathology, Rudbeck Laboratory, 75185 Uppsala, Sweden
- Cheng Zhang
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, 17121 Stockholm, Sweden
- Per Oksvold
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, 17121 Stockholm, Sweden
- Shuqi Chen
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, 17121 Stockholm, Sweden
- Linn Fagerberg
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, 17121 Stockholm, Sweden
- Mathias Uhlén
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH-Royal Institute of Technology, 17121 Stockholm, Sweden
- Cecilia Lindskog
- Uppsala University, Department of Immunology, Genetics and Pathology, Rudbeck Laboratory, 75185 Uppsala, Sweden
|
29
|
A Bioinformatics View of Glycan–Virus Interactions. Viruses 2019; 11:v11040374. [PMID: 31018588 PMCID: PMC6521074 DOI: 10.3390/v11040374] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/05/2019] [Revised: 04/05/2019] [Accepted: 04/15/2019] [Indexed: 02/06/2023] Open
Abstract
Evidence of the mediation of glycan molecules in the interaction between viruses and their hosts is accumulating and is now partially reflected in several online databases. Bioinformatics provides convenient and efficient means of searching, visualizing, comparing, and sometimes predicting, interactions in numerous and diverse molecular biology applications related to the -omics fields. As viromics is gaining momentum, bioinformatics support is increasingly needed. We propose a survey of the current resources for searching, visualizing, comparing, and possibly predicting host–virus interactions that integrate the presence and role of glycans. To the best of our knowledge, we have mapped the specialized and general-purpose databases with the appropriate focus. With an illustration of their potential usage, we also discuss the strong and weak points of the current bioinformatics landscape in the context of understanding viral infection and the immune response to it.
|
30
|
Mendoza L, Deutsch EW, Sun Z, Campbell DS, Shteynberg DD, Moritz RL. Flexible and Fast Mapping of Peptides to a Proteome with ProteoMapper. J Proteome Res 2018; 17:4337-4344. [PMID: 30230343 DOI: 10.1021/acs.jproteome.8b00544] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Bottom-up proteomics relies on the proteolytic or chemical cleavage of proteins into peptides, the identification of those peptides via mass spectrometry, and the mapping of the identified peptides back to the reference proteome to infer which possible proteins are identified. Reliable mapping of peptides to proteins still poses substantial challenges when considering similar proteins, protein families, splice isoforms, sequence variation, and possible residue mass modifications, combined with an imperfect and incomplete understanding of the proteome. The ProteoMapper tool enables a comprehensive and rapid mapping of peptides to a reference proteome. The indexer component creates a segmented index for an input proteome from a FASTA or PEFF file. The ProMaST component provides ultrafast mapping of one or more input peptides against the index. ProteoMapper allows searches that take into account known sequence variation encoded in PEFF files. It also enables fuzzy searches to find highly similar peptides with residue order changes or other isobaric or near-isobaric substitutions within a specified mass tolerance. We demonstrate an example of a one-hit-wonder identification in PeptideAtlas that may be better explained by a combination of catalogued and uncatalogued sequence variation in another highly observed protein. ProteoMapper is free and open source, available for local use after downloading, embedding in other applications, as an online web tool at http://www.peptideatlas.org/map, and as a web service.
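The index-then-map idea in the abstract above can be illustrated with a minimal Python sketch. This is not ProteoMapper's implementation or API: the function names (`normalize`, `build_index`, `map_peptide`), the seed length of 4, and the toy sequences are all assumptions made for illustration, and the only "fuzzy" behaviour shown is treating the isobaric residues isoleucine and leucine as identical.

```python
from collections import defaultdict

SEED_LEN = 4  # hypothetical seed length, chosen only for illustration


def normalize(seq):
    """Upper-case a sequence and merge isobaric I/L into a single letter."""
    return seq.upper().replace("I", "L")


def build_index(proteome, k=SEED_LEN):
    """Index every k-mer of every protein as (protein_id, 0-based offset)."""
    index = defaultdict(list)
    normalized = {pid: normalize(seq) for pid, seq in proteome.items()}
    for pid, seq in normalized.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((pid, i))
    return index, normalized


def map_peptide(peptide, index, normalized, k=SEED_LEN):
    """Return sorted (protein_id, start) pairs where the peptide occurs."""
    query = normalize(peptide)
    if len(query) < k:
        raise ValueError(f"peptide must have at least {k} residues")
    hits = []
    # Look up the first k residues, then verify the full peptide at each hit.
    for pid, start in index.get(query[:k], []):
        if normalized[pid].startswith(query, start):
            hits.append((pid, start))
    return sorted(hits)


# Toy proteome (made-up sequences): P1 has I where P2 has L.
proteome = {"P1": "MKTAYIAKQR", "P2": "GGMKTAYLAKQW"}
index, normalized = build_index(proteome)
print(map_peptide("TAYIAK", index, normalized))
```

Because I and L are merged before indexing, the peptide TAYIAK maps to both toy proteins, which is the kind of ambiguity the abstract describes for similar proteins and sequence variants.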
Affiliation(s)
- Luis Mendoza
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- Eric W Deutsch
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- Zhi Sun
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- David S Campbell
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- David D Shteynberg
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- Robert L Moritz
- Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
|