1
|
Altenhoff A, Bairoch A, Bansal P, Baratin D, Bastian F, Bolleman* J, Bridge A, Burdet F, Crameri K, Dauvillier J, Dessimoz C, Gehant S, Glover N, Gnodtke K, Hayes C, Ibberson M, Kriventseva E, Kuznetsov D, Frédérique L, Mehl F, Mendes de Farias* T, Michel PA, Moretti S, Morgat A, Österle S, Pagni M, Redaschi N, Robinson-Rechavi M, Samarasinghe K, Sima AC, Szklarczyk D, Topalov O, Touré V, Unni D, von Mering C, Wollbrett J, Zahn-Zabal* M, Zdobnov E. The SIB Swiss Institute of Bioinformatics Semantic Web of data. Nucleic Acids Res 2024; 52:D44-D51. [PMID: 37878411 PMCID: PMC10767860 DOI: 10.1093/nar/gkad902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 10/02/2023] [Accepted: 10/05/2023] [Indexed: 10/27/2023] Open
Abstract
The SIB Swiss Institute of Bioinformatics (https://www.sib.swiss/) is a federation of bioinformatics research and service groups. The international life science community in academia and industry has been accessing the freely available databases provided by SIB since its inception in 1998. In this paper we present the 11 databases which currently offer semantically enriched data in accordance with the FAIR principles (Findable, Accessible, Interoperable, Reusable), as well as the Swiss Personalized Health Network initiative (SPHN) which also employs this enrichment. The semantic enrichment facilitates the manipulation of large data sets from public databases and private data sets. Examples are provided to illustrate that the data from the SIB databases can not only be queried using precise criteria individually, but also across multiple databases, including a variety of non-SIB databases. Data manipulation, be it exploration, extraction, annotation, combination, and publication, is possible using the SPARQL query language. Providing documentation, tutorials and sample queries makes it easier to navigate this web of semantic data. Through this paper, the reader will discover how the existing SIB knowledge graphs can be leveraged to tackle the complex biological or clinical questions that are being addressed today.
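The cross-database querying described in this abstract relies on the SPARQL `SERVICE` clause. The following is a minimal sketch of how such a federated query could be assembled, not a query taken from the paper: the endpoint URL, the helper name `federated_query`, and the example pattern are assumptions chosen for illustration, and nothing is sent over the network.

```python
from textwrap import dedent, indent
from typing import Sequence

# Assumed public endpoint URL; the abstract itself does not list endpoint addresses.
UNIPROT_ENDPOINT = "https://sparql.uniprot.org/sparql"

# Illustrative graph pattern: reviewed human protein entries.
HUMAN_REVIEWED_PATTERN = """
?protein a up:Protein ;
         up:organism taxon:9606 ;
         up:reviewed true .
"""


def federated_query(service_url: str, remote_pattern: str,
                    variables: Sequence[str], limit: int = 10) -> str:
    """Wrap a graph pattern in a SERVICE clause so that it is evaluated remotely.

    Hypothetical helper: it only builds the query text and performs no request.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    projection = " ".join(f"?{name}" for name in variables)
    body = indent(dedent(remote_pattern).strip(), "    ")
    return (
        "PREFIX up: <http://purl.uniprot.org/core/>\n"
        "PREFIX taxon: <http://purl.uniprot.org/taxonomy/>\n"
        f"SELECT {projection} WHERE {{\n"
        f"  SERVICE <{service_url}> {{\n"
        f"{body}\n"
        "  }\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(federated_query(UNIPROT_ENDPOINT, HUMAN_REVIEWED_PATTERN, ["protein"], limit=5))
```

The resulting text can be submitted to any SPARQL 1.1 endpoint that supports federation; a second `SERVICE` block pointing at another endpoint would join data from two resources in one query.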
|
2
|
Hernández ÁP, Micaelo A, Piñol R, García-Vaquero ML, Aramayona JJ, Criado JJ, Rodriguez E, Sánchez-Gallego JI, Landeira-Viñuela A, Juanes-Velasco P, Díez P, Góngora R, Jara-Acevedo R, Orfao A, Miana-Mena J, Muñoz MJ, Villanueva S, Millán Á, Fuentes M. Comprehensive and systematic characterization of multi-functionalized cisplatin nano-conjugate: from the chemistry and proteomic biocompatibility to the animal model. J Nanobiotechnology 2022; 20:341. [PMID: 35858906 PMCID: PMC9301860 DOI: 10.1186/s12951-022-01546-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Accepted: 07/05/2022] [Indexed: 11/16/2022] Open
Abstract
Background Nowadays, nanoparticles (NPs) have evolved into multifunctional systems combining different custom anchorages, which opens a wide range of applications in biomedical research. Thus, their pharmacological involvement requires more comprehensive analysis, and novel nanodrugs should be characterized from both the chemical and the biological point of view. Within the wide variety of biocompatible nanosystems, iron oxide nanoparticles (IONPs) present most of the required features, which makes them suitable as multifunctional NPs with many biopharmaceutical applications. Results Cisplatin-IONPs and different functionalization stages have been broadly evaluated. The potential application of these nanodrugs in onco-therapies has been assessed by studying in vitro biocompatibility (interactions with the environment) through proteomic characterization and determination of the protein corona in different proximal fluids (human plasma, rabbit plasma and fetal bovine serum). Moreover, protein labeling and LC–MS/MS analysis provided more than 4000 proteins synthesized de novo as a consequence of the nanodrugs' presence, defending cell signaling in different tumor cell types (data available via ProteomeXchange with identifier PXD026615). Further in vivo studies have provided a more integrative view of the biopharmaceutical perspectives of IONPs. Conclusions The pharmacological proteomic profile shows different behavior between species and different affinity of the protein coating layers (soft and hard corona). Also, intracellular signaling exposed differences between the tumor cell lines studied. First approaches in an animal model reveal the potential of these NPs as drug delivery vehicles and confirm cisplatin compounds as strengthened antitumoral agents.
Supplementary Information The online version contains supplementary material available at 10.1186/s12951-022-01546-y.
Affiliation(s)
- Ángela-Patricia Hernández
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,Department of Pharmaceutical Sciences. Organic Chemistry Section. Faculty of Pharmacy, University of Salamanca, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Ania Micaelo
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Rafael Piñol
- INMA, Institute of Nanoscience and Materials of Aragon, CSIC-University of Zaragoza, 50018, Saragossa, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Marina L García-Vaquero
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - José J Aramayona
- Department of Pharmacology and Physiology, University of Zaragoza, Zaragoza, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Julio J Criado
- Department of Inorganic Chemistry, Faculty of Chemical Sciences, Plaza de los Caídos S/N, 37008, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Emilio Rodriguez
- Department of Inorganic Chemistry, Faculty of Chemical Sciences, Plaza de los Caídos S/N, 37008, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - José Ignacio Sánchez-Gallego
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Alicia Landeira-Viñuela
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Pablo Juanes-Velasco
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Paula Díez
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Rafael Góngora
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Ricardo Jara-Acevedo
- ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Alberto Orfao
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Javier Miana-Mena
- Department of Pharmacology and Physiology, University of Zaragoza, Zaragoza, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - María Jesús Muñoz
- Department of Pharmacology and Physiology, University of Zaragoza, Zaragoza, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Sergio Villanueva
- Department of Pharmacology and Physiology, University of Zaragoza, Zaragoza, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain
| | - Ángel Millán
- INMA, Institute of Nanoscience and Materials of Aragon, CSIC-University of Zaragoza, 50018, Saragossa, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain.
| | - Manuel Fuentes
- Department of Medicine and General Cytometry Service-Nucleus, CIBERONC CB16/12/00400, Cancer Research Centre, (IBMCC/CSIC/USAL/IBSAL), University of Salamanca-CSIC, IBSAL, Campus Miguel de Unamuno s/n, 37007, Salamanca, Spain.,ImmunoStep, SL, Edificio Centro de Investigación del Cáncer, University of Salamanca, Avda. Coimbra s/n, Campus Miguel de Unamuno, 37007, Salamanca, Spain.,Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007, Salamanca, Spain.
| |
|
3
|
Deciphering Biomarkers for Leptomeningeal Metastasis in Malignant Hemopathies (Lymphoma/Leukemia) Patients by Comprehensive Multipronged Proteomics Characterization of Cerebrospinal Fluid. Cancers (Basel) 2022; 14:cancers14020449. [PMID: 35053611 PMCID: PMC8773653 DOI: 10.3390/cancers14020449] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/24/2021] [Revised: 01/04/2022] [Accepted: 01/06/2022] [Indexed: 11/16/2022] Open
Abstract
Simple Summary The early diagnosis of leptomeningeal disease is a challenge because it is asymptomatic in the early stages. Consequently, it is important to identify a panel of biomarkers to help in its diagnosis and/or prognosis. For this purpose, we explored a multipronged proteomics approach in cerebrospinal fluid (CSF) to determine a potential panel of biomarkers. Thus, a systematic and exhaustive characterization of more than 300 CSF samples was performed by an integrated approach combining Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) and functional proteomics analysis to establish protein profiles, which were useful for developing a panel of biomarkers validated by in silico approaches. Abstract In the present work, leptomeningeal disease, a very destructive form of systemic cancer, was characterized from several proteomics points of view. This pathology involves the invasion of the leptomeninges by malignant tumor cells. The tumor spreads to the central nervous system through the cerebrospinal fluid (CSF) and has a very grim prognosis; the average life expectancy of patients who suffer from it does not exceed 3 months. The early diagnosis of leptomeningeal disease is a challenge because, in most cases, it is an asymptomatic pathology. When the symptoms are clear, the disease is already in a very advanced stage and life expectancy is low. Consequently, there is a pressing need to determine useful CSF proteins to help in the diagnosis and/or prognosis of this disease. For this purpose, a systematic and exhaustive proteomics characterization of CSF by multipronged proteomics approaches was performed to determine different protein profiles as potential biomarkers. Functional analysis of proteins such as PTPRC, SERPINC1, sCD44, sCD14, ANPEP, SPP1, FCGR1A, C9, sCD19, and sCD34, among others, reveals that most of them are linked to the pathology and are not detected in normal CSF. Finally, a panel of biomarkers was verified by a prediction model for leptomeningeal disease, showing new insights into the research for potential biomarkers that are easy to translate into the clinic for the diagnosis of this devastating disease.
|
4
|
Landeira-Viñuela A, Díez P, Juanes-Velasco P, Lécrevisse Q, Orfao A, De Las Rivas J, Fuentes M. Deepening into Intracellular Signaling Landscape through Integrative Spatial Proteomics and Transcriptomics in a Lymphoma Model. Biomolecules 2021; 11:1776. [PMID: 34944421 PMCID: PMC8699084 DOI: 10.3390/biom11121776] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/21/2021] [Revised: 11/11/2021] [Accepted: 11/23/2021] [Indexed: 12/12/2022] Open
Abstract
The Human Proteome Project (HPP) presents a systematic characterization of the protein landscape under different conditions using several complementary -omic techniques (LC-MS/MS proteomics, affinity proteomics, transcriptomics, etc.). In the present study, using a B-cell lymphoma cell line as a model, comprehensive integration of RNA-Seq transcriptomics, MS/MS, and antibody-based affinity proteomics combined with size-exclusion chromatography (SEC-MAP) was performed to uncover correlations that could provide insights into protein dynamics at the intracellular level. Here, 5672 unique proteins were systematically identified by MS/MS analysis and subcellular protein extraction strategies (neXtProt release 2020-21; MS/MS data are available via ProteomeXchange with identifier PXD003939). Moreover, RNA deep sequencing analysis of this lymphoma B-cell line identified 19,518 expressed genes and 5707 protein coding genes (mapped to neXtProt). Among these data sets, 162 relevant proteins (targeted by 206 antibodies) were systematically analyzed by the SEC-MAP approach, providing information about PTMs, isoforms, protein complexes, and subcellular localization. Finally, a bioinformatic pipeline has been designed and developed for orthogonal integration of these high-content proteomics and transcriptomics datasets, which might be useful for comprehensive and global characterization of intracellular protein profiles.
Affiliation(s)
- Alicia Landeira-Viñuela
- Department of Medicine and General Cytometry Service-Nucleus, USAL/IBSAL, 37000 Salamanca, Spain; (A.L.-V.); (P.D.); (P.J.-V.); (Q.L.); (A.O.)
| | - Paula Díez
- Department of Medicine and General Cytometry Service-Nucleus, USAL/IBSAL, 37000 Salamanca, Spain; (A.L.-V.); (P.D.); (P.J.-V.); (Q.L.); (A.O.)
- Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
| | - Pablo Juanes-Velasco
- Department of Medicine and General Cytometry Service-Nucleus, USAL/IBSAL, 37000 Salamanca, Spain; (A.L.-V.); (P.D.); (P.J.-V.); (Q.L.); (A.O.)
| | - Quentin Lécrevisse
- Department of Medicine and General Cytometry Service-Nucleus, USAL/IBSAL, 37000 Salamanca, Spain; (A.L.-V.); (P.D.); (P.J.-V.); (Q.L.); (A.O.)
| | - Alberto Orfao
- Department of Medicine and General Cytometry Service-Nucleus, USAL/IBSAL, 37000 Salamanca, Spain; (A.L.-V.); (P.D.); (P.J.-V.); (Q.L.); (A.O.)
| | - Javier De Las Rivas
- Bioinformatics and Functional Genomics, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain;
| | - Manuel Fuentes
- Department of Medicine and General Cytometry Service-Nucleus, USAL/IBSAL, 37000 Salamanca, Spain; (A.L.-V.); (P.D.); (P.J.-V.); (Q.L.); (A.O.)
- Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
| |
|
5
|
Cui Y, Wang Z, Köster J, Liao X, Peng S, Tang T, Huang C, Yang C. VISPR-online: a web-based interactive tool to visualize CRISPR screening experiments. BMC Bioinformatics 2021; 22:344. [PMID: 34167459 PMCID: PMC8223366 DOI: 10.1186/s12859-021-04275-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/09/2021] [Accepted: 06/15/2021] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND VISPR is an interactive visualization and analysis framework for CRISPR screening experiments. However, it only supports the output of MAGeCK, and requires installation and manual configuration. Furthermore, VISPR is designed to run on a single computer, and data sharing between collaborators is challenging. RESULTS To make the tool easily accessible to the community, we present VISPR-online, a web-based general application allowing users to visualize, explore, and share CRISPR screening data online with a few simple steps. VISPR-online provides an exploration of screening results and visualization of read count changes. Apart from MAGeCK, VISPR-online supports two more popular CRISPR screening analysis tools: BAGEL and JACKS. It provides an interactive environment for exploring gene essentiality, viewing guide RNA (gRNA) locations, and allowing users to resume and share screening results. CONCLUSIONS VISPR-online allows users to visualize, explore and share CRISPR screening data online. It is freely available at http://vispr-online.weililab.org , while the source code is available at https://github.com/lemoncyb/VISPR-online .
Affiliation(s)
- Yingbo Cui
- School of Computer, National University of Defense Technology, Changsha, 410073, China.
| | - Zihang Wang
- College of Information Science and Engineering, Hunan University, Changsha, 410006, China
| | - Johannes Köster
- Algorithms for Reproducible Bioinformatics, Institute of Human Genetics, University of Duisburg-Essen, 45147, Essen, Germany
| | - Xiangke Liao
- School of Computer, National University of Defense Technology, Changsha, 410073, China
| | - Shaoliang Peng
- College of Information Science and Engineering, Hunan University, Changsha, 410006, China
- National Supercomputing Center in Changsha, Changsha, 410082, China
| | - Tao Tang
- School of Computer, National University of Defense Technology, Changsha, 410073, China
| | - Chun Huang
- School of Computer, National University of Defense Technology, Changsha, 410073, China
| | - Canqun Yang
- School of Computer, National University of Defense Technology, Changsha, 410073, China
| |
|
6
|
Li H, Funk CC, McFarland K, Dammer EB, Allen M, Carrasquillo MM, Levites Y, Chakrabarty P, Burgess JD, Wang X, Dickson D, Seyfried NT, Duong DM, Lah JJ, Younkin SG, Levey AI, Omenn GS, Ertekin‐Taner N, Golde TE, Price ND. Integrative functional genomic analysis of intron retention in human and mouse brain with Alzheimer's disease. Alzheimers Dement 2021; 17:984-1004. [PMID: 33480174 PMCID: PMC8248162 DOI: 10.1002/alz.12254] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 11/14/2019] [Revised: 10/08/2020] [Accepted: 10/17/2020] [Indexed: 12/21/2022]
Abstract
Intron retention (IR) has been implicated in the pathogenesis of complex diseases such as cancers; its association with Alzheimer's disease (AD) remains unexplored. We performed genome-wide analysis of IR through integrating genetic, transcriptomic, and proteomic data of AD subjects and mouse models from the Accelerating Medicines Partnership-Alzheimer's Disease project. We identified 4535 and 4086 IR events in 2173 human and 1736 mouse genes, respectively. Quantitation of IR enabled the identification of differentially expressed genes that conventional exon-level approaches did not reveal. There were significant correlations of intron expression within innate immune genes, like HMBOX1, with AD in humans. Peptides with a high probability of translation from intron-retained mRNAs were identified using mass spectrometry. Further, we established AD-specific intron expression Quantitative Trait Loci, and identified splicing-related genes that may regulate IR. Our analysis provides a novel resource for the search for new AD biomarkers and pathological mechanisms.
Affiliation(s)
- Hong‐Dong Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, P.R. China
- Institute for Systems Biology, Seattle, Washington, USA
| | - Cory C. Funk
- Institute for Systems Biology, Seattle, Washington, USA
| | - Karen McFarland
- Department of Neuroscience and Neurology, Center for Translational Research in Neurodegenerative disease, and McKnight Brain Institute, University of Florida, Gainesville, Florida, USA
| | - Eric B. Dammer
- Department of Biochemistry, Emory University, Atlanta, Georgia, USA
| | - Mariet Allen
- Mayo Clinic, Department of Neuroscience, Jacksonville, Florida, USA
| | | | - Yona Levites
- Department of Neuroscience and Neurology, Center for Translational Research in Neurodegenerative disease, and McKnight Brain Institute, University of Florida, Gainesville, Florida, USA
| | - Paramita Chakrabarty
- Department of Neuroscience and Neurology, Center for Translational Research in Neurodegenerative disease, and McKnight Brain Institute, University of Florida, Gainesville, Florida, USA
| | | | - Xue Wang
- Mayo Clinic, Department of Health Sciences Research, Jacksonville, Florida, USA
| | - Dennis Dickson
- Mayo Clinic, Department of Neuroscience, Jacksonville, Florida, USA
| | - Nicholas T. Seyfried
- Department of Biochemistry, Emory University, Atlanta, Georgia, USA
- Department of Neurology, Emory University, Atlanta, Georgia, USA
| | - Duc M. Duong
- Department of Biochemistry, Emory University, Atlanta, Georgia, USA
| | - James J. Lah
- Department of Neurology, Emory University, Atlanta, Georgia, USA
| | | | - Allan I. Levey
- Department of Neurology, Emory University, Atlanta, Georgia, USA
| | - Gilbert S. Omenn
- Institute for Systems Biology, Seattle, Washington, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Nilüfer Ertekin‐Taner
- Mayo Clinic, Department of Neuroscience, Jacksonville, Florida, USA
- Mayo Clinic, Department of Neurology, Jacksonville, Florida, USA
| | - Todd E. Golde
- Department of Neuroscience and Neurology, Center for Translational Research in Neurodegenerative disease, and McKnight Brain Institute, University of Florida, Gainesville, Florida, USA
| | | |
|
7
|
Galgonek J, Vondrášek J. IDSM ChemWebRDF: SPARQLing small-molecule datasets. J Cheminform 2021; 13:38. [PMID: 33980298 PMCID: PMC8117646 DOI: 10.1186/s13321-021-00515-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/06/2021] [Accepted: 04/23/2021] [Indexed: 11/12/2022] Open
Abstract
The Resource Description Framework (RDF), together with well-defined ontologies, significantly increases data interoperability and usability. The SPARQL query language was introduced to retrieve requested RDF data and to explore links between them. Among other useful features, SPARQL supports federated queries that combine multiple independent data source endpoints. This allows users to obtain insights that are not possible using only a single data source. Owing to all of these useful features, many biological and chemical databases present their data in RDF, and support SPARQL querying. In our project, we primarily focused on PubChem, ChEMBL and ChEBI small-molecule datasets. These datasets are already being exported to RDF by their creators. However, none of them has an official and currently supported SPARQL endpoint. This omission makes it difficult to construct complex or federated queries that could access all of the datasets, thus underutilising the main advantage of the availability of RDF data. Our goal is to address this gap by integrating the datasets into one database called the Integrated Database of Small Molecules (IDSM) that will be accessible through a SPARQL endpoint. Beyond that, we will also focus on increasing mutual interoperability of the datasets. To realise the endpoint, we decided to implement an in-house developed SPARQL engine based on the PostgreSQL relational database for data storage. In our approach, data are stored in the traditional relational form, and the SPARQL engine translates incoming SPARQL queries into equivalent SQL queries. An important feature of the engine is that it optimises the resulting SQL queries. Together with optimisations performed by PostgreSQL, this allows efficient evaluations of SPARQL queries. The endpoint provides not only querying in the dataset, but also the compound substructure and similarity search supported by our Sachem project. 
Although the endpoint is accessible from an internet browser, it is mainly intended to be used for programmatic access by other services, for example as a part of federated queries. For regular users, we offer a rich web application called ChemWebRDF using the endpoint. The application is publicly available at https://idsm.elixir-czech.cz/chemweb/.
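The idea of translating SPARQL into SQL mentioned in this abstract can be illustrated with a toy example. The sketch below is not the IDSM engine and makes several assumptions: all triples sit in a single hypothetical table `triples(s, p, o)`, only basic graph patterns are handled, constants are inlined rather than passed as parameters, and SQLite stands in for PostgreSQL so that the example runs with the standard library alone.

```python
import sqlite3
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object); variables start with "?"


def bgp_to_sql(patterns: List[Triple], table: str = "triples") -> str:
    """Translate a basic graph pattern into one SQL self-join (toy translation)."""
    select: List[str] = []
    where: List[str] = []
    first_use: Dict[str, str] = {}  # variable -> column where it was first bound
    for i, pattern in enumerate(patterns):
        for column, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{column}"
            if term.startswith("?"):
                if term in first_use:
                    # A repeated variable becomes a join condition.
                    where.append(f"{ref} = {first_use[term]}")
                else:
                    first_use[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:
                # A constant becomes an equality filter (single quotes escaped).
                escaped = term.replace("'", "''")
                where.append(f"{ref} = '{escaped}'")
    sources = ", ".join(f"{table} AS t{i}" for i in range(len(patterns)))
    sql = f"SELECT {', '.join(select or ['1'])} FROM {sources}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql


def run_demo() -> List[Tuple[str, str]]:
    """Evaluate a two-pattern query against a small in-memory table."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    connection.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [
            ("ex:aspirin", "rdf:type", "ex:Compound"),
            ("ex:aspirin", "ex:name", "aspirin"),
            ("ex:p53", "rdf:type", "ex:Protein"),
            ("ex:p53", "ex:name", "p53"),
        ],
    )
    # Corresponds to: SELECT ?c ?n WHERE { ?c rdf:type ex:Compound . ?c ex:name ?n }
    sql = bgp_to_sql([("?c", "rdf:type", "ex:Compound"), ("?c", "ex:name", "?n")])
    rows = connection.execute(sql).fetchall()
    connection.close()
    return rows


if __name__ == "__main__":
    print(run_demo())
```

A production engine would map different predicates onto dedicated relational tables and optimise the generated SQL, as the abstract describes; the single-table layout here is chosen only to keep the translation visible.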
Affiliation(s)
- Jakub Galgonek
- Institute of Organic Chemistry and Biochemistry of the CAS, Flemingovo náměstí 2, 166 10, Prague 6, Czech Republic.
| | - Jiří Vondrášek
- Institute of Organic Chemistry and Biochemistry of the CAS, Flemingovo náměstí 2, 166 10, Prague 6, Czech Republic
| |
|
8
|
Erady C, Boxall A, Puntambekar S, Suhas Jagannathan N, Chauhan R, Chong D, Meena N, Kulkarni A, Kasabe B, Prathivadi Bhayankaram K, Umrania Y, Andreani A, Nel J, Wayland MT, Pina C, Lilley KS, Prabakaran S. Pan-cancer analysis of transcripts encoding novel open-reading frames (nORFs) and their potential biological functions. NPJ Genom Med 2021; 6:4. [PMID: 33495453 PMCID: PMC7835362 DOI: 10.1038/s41525-020-00167-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/06/2020] [Accepted: 11/18/2020] [Indexed: 12/13/2022] Open
Abstract
Uncharacterized and unannotated open-reading frames, which we refer to as novel open reading frames (nORFs), may sometimes encode peptides that remain unexplored for novel therapeutic opportunities. To our knowledge, no systematic identification and characterization of transcripts encoding nORFs or their translation products in cancer, or in any other physiological process has been performed. We use our curated nORFs database (nORFs.org), together with RNA-Seq data from The Cancer Genome Atlas (TCGA) and Genotype-Expression (GTEx) consortiums, to identify transcripts containing nORFs that are expressed frequently in cancer or matched normal tissue across 22 cancer types. We show nORFs are subject to extensive dysregulation at the transcript level in cancer tissue and that a small subset of nORFs are associated with overall patient survival, suggesting that nORFs may have prognostic value. We also show that nORF products can form protein-like structures with post-translational modifications. Finally, we perform in silico screening for inhibitors against nORF-encoded proteins that are disrupted in stomach and esophageal cancer, showing that they can potentially be targeted by inhibitors. We hope this work will guide and motivate future studies that perform in-depth characterization of nORF functions in cancer and other diseases.
Affiliation(s)
- Chaitanya Erady
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Adam Boxall
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Shraddha Puntambekar
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
| | - N Suhas Jagannathan
- Cancer and Stem Cell Biology Programme, and Centre for Computational Biology, Duke-NUS Medical School, Singapore, 169857, Singapore
| | - Ruchi Chauhan
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - David Chong
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Narendra Meena
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Apurv Kulkarni
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
| | - Bhagyashri Kasabe
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
| | | | - Yagnesh Umrania
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
| | - Adam Andreani
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Jean Nel
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Matthew T Wayland
- Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
| | - Cristina Pina
- Department of Haematology, Cambridge Biomedical Campus, Cambridge, CB2 0PT, UK
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
| | - Sudhakaran Prabakaran
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK.
| |
|
9
|
Paik YK, Overall CM, Corrales F, Deutsch EW, Lane L, Omenn GS. Advances in Identifying and Characterizing the Human Proteome. J Proteome Res 2020; 18:4079-4084. [PMID: 31805768 DOI: 10.1021/acs.jproteome.9b00745] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
Affiliation(s)
- Young-Ki Paik
- Yonsei Proteome Research Center, College of Life Science and Technology, Yonsei University
| | - Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences and Biochemistry & Molecular Biology, Faculty of Dentistry, University of British Columbia
| | - Fernando Corrales
- Functional Proteomics Laboratory, National Center of Biotechnology, CSIC
| | - Lydie Lane
- CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, CMU, University of Geneva
| | - Gilbert S Omenn
- Institute for Systems Biology, Departments of Computational Medicine & Bioinformatics, Internal Medicine, and Human Genetics & School of Public Health, University of Michigan
|
10
|
Sanchez A, Kuras M, Murillo JR, Pla I, Pawlowski K, Szasz AM, Gil J, Nogueira FCS, Perez-Riverol Y, Eriksson J, Appelqvist R, Miliotis T, Kim Y, Baldetorp B, Ingvar C, Olsson H, Lundgren L, Ekedahl H, Horvatovich P, Sugihara Y, Welinder C, Wieslander E, Kwon HJ, Domont GB, Malm J, Rezeli M, Betancourt LH, Marko-Varga G. Novel functional proteins coded by the human genome discovered in metastases of melanoma patients. Cell Biol Toxicol 2020; 36:261-272. [PMID: 31599373 PMCID: PMC7320927 DOI: 10.1007/s10565-019-09494-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/21/2019] [Accepted: 09/02/2019] [Indexed: 12/18/2022]
Abstract
In the advanced stages, malignant melanoma (MM) has a very poor prognosis. Due to tremendous efforts in cancer research over the last 10 years, and the introduction of novel therapies such as targeted therapies and immunomodulators, the rather dark horizon of the median survival has dramatically changed from under 1 year to several years. With the advent of proteomics, deep-mining studies can reach low-abundant expression levels. The complexity of the proteome, however, still surpasses the dynamic range capabilities of current analytical techniques. Consequently, many predicted protein products with potential biological functions have not yet been verified in experimental proteomic data. This category of 'missing proteins' (MP) comprises all proteins that have been predicted but are currently unverified. As part of the initiative launched in 2016 in the USA, the European Cancer Moonshot Center has performed numerous deep proteomics analyses on samples from MM patients. In this study, nine MPs were clearly identified by mass spectrometry in MM metastases. Some MPs significantly correlated with proteins that possess identical PFAM structural domains, and other MPs were significantly associated with cancer-related proteins. To our knowledge, this is the first study in which unknown and novel proteins have been annotated in metastatic melanoma tumour tissue.
Affiliation(s)
- Aniel Sanchez
- Section for Clinical Chemistry, Department of Translational Medicine, Skåne University Hospital Malmö, Lund University, 205 02, Malmö, Sweden.
| | - Magdalena Kuras
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden
| | - Jimmy Rodriguez Murillo
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden
| | - Indira Pla
- Section for Clinical Chemistry, Department of Translational Medicine, Skåne University Hospital Malmö, Lund University, 205 02, Malmö, Sweden
| | - Krzysztof Pawlowski
- Section for Clinical Chemistry, Department of Translational Medicine, Skåne University Hospital Malmö, Lund University, 205 02, Malmö, Sweden
- Biology, Warsaw University of Life Sciences, Warsaw, Poland
| | - A Marcell Szasz
- Cancer Center, Semmelweis University, Budapest, 1083, Hungary
| | - Jeovanis Gil
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden
| | - Fábio C S Nogueira
- Proteomics Unit, Department of Biochemistry, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Laboratory of Proteomics, LADETEC, Institute of Chemistry, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD Hinxton, Cambridge, UK
| | - Jonatan Eriksson
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden
| | - Roger Appelqvist
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden
| | - Yonghyo Kim
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden
| | - Bo Baldetorp
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, 221 85, Lund, Sweden
| | - Christian Ingvar
- Department of Surgery, Clinical Sciences, Skåne University Hospital, Lund University, Lund, Sweden
| | - Håkan Olsson
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, 221 85, Lund, Sweden
| | - Lotta Lundgren
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, 221 85, Lund, Sweden
- Department of Hematology, Oncology and Radiation Physics, Skåne University Hospital, Lund, Sweden
| | - Henrik Ekedahl
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, 221 85, Lund, Sweden
| | - Peter Horvatovich
- Department of Analytical Biochemistry, Faculty of Science and Engineering, University of Groningen, Groningen, The Netherlands
| | - Yutaka Sugihara
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, 221 85, Lund, Sweden
| | - Charlotte Welinder
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, 221 85, Lund, Sweden
| | - Elisabet Wieslander
- Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, 221 85, Lund, Sweden
| | - Ho Jeong Kwon
- Department of Biotechnology, Yonsei University, Seoul, South Korea
| | - Gilberto B Domont
- Proteomics Unit, Department of Biochemistry, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Johan Malm
- Section for Clinical Chemistry, Department of Translational Medicine, Skåne University Hospital Malmö, Lund University, 205 02, Malmö, Sweden
| | - Melinda Rezeli
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden
| | - Lazaro Hiram Betancourt
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden.
| | - György Marko-Varga
- Clinical Protein Science & Imaging, Biomedical Centre, Department of Biomedical Engineering, Lund University, BMC D13, 221 84, Lund, Sweden
|
11
|
Is It Possible to Find Needles in a Haystack? Meta-Analysis of 1000+ MS/MS Files Provided by the Russian Proteomic Consortium for Mining Missing Proteins. Proteomes 2020; 8:proteomes8020012. [PMID: 32456206 PMCID: PMC7356824 DOI: 10.3390/proteomes8020012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/24/2020] [Revised: 05/18/2020] [Accepted: 05/19/2020] [Indexed: 12/04/2022] Open
Abstract
Despite direct or indirect efforts of the proteomic community, the fraction of blind spots on the protein map is still significant. Almost 11% of human genes encode missing proteins, the existence of which is still in doubt. Apparently, proteomics has reached a stage when more attention and curiosity need to be exerted in the identification of every novel protein in order to expand the unusual types of biomaterials and/or conditions. It seems that we have exhausted the current conventional approaches to the discovery of missing proteins and may need to investigate alternatives. Here, we present an approach to deciphering missing proteins based on the use of non-standard methodological solutions and encompassing diverse MS/MS data, obtained for rare types of biological samples by members of the Russian Proteomic community in the last five years. These data were re-analyzed in a uniform manner by three search engines, which are part of the SearchGUI package. The study resulted in the identification of two missing and five uncertain proteins detected with two peptides. Moreover, 149 proteins were detected with a single proteotypic peptide. Finally, we analyzed the gene expression levels to suggest feasible targets for further validation of missing and uncertain protein observations, which will fully meet the requirements of the international consortium. The MS data are available on the ProteomeXchange platform (PXD014300).
|
12
|
Moriya Y, Kawano S, Okuda S, Watanabe Y, Matsumoto M, Takami T, Kobayashi D, Yamanouchi Y, Araki N, Yoshizawa AC, Tabata T, Iwasaki M, Sugiyama N, Tanaka S, Goto S, Ishihama Y. The jPOST environment: an integrated proteomics data repository and database. Nucleic Acids Res 2020; 47:D1218-D1224. [PMID: 30295851 PMCID: PMC6324006 DOI: 10.1093/nar/gky899] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 08/13/2018] [Accepted: 09/24/2018] [Indexed: 01/13/2023] Open
Abstract
Rapid progress is being made in mass spectrometry (MS)-based proteomics, yielding an increasing number of larger datasets with higher quality and higher throughput. To integrate proteomics datasets generated from various projects and institutions, we launched a project named jPOST (Japan ProteOme STandard Repository/Database, https://jpostdb.org/) in 2015. Its proteomics data repository, jPOSTrepo, began operations in 2016 and has accepted more than 10 TB of MS-based proteomics datasets in the past two years. In addition, we have developed a new proteomics database named jPOSTdb in which the published raw datasets in jPOSTrepo are reanalyzed using a standardized protocol. jPOSTdb provides viewers showing the frequency of detected post-translational modifications, the co-occurrence of phosphorylation sites on a peptide and peptide sharing among proteoforms. jPOSTdb also provides basic statistical analysis tools to compare proteomics datasets.
Affiliation(s)
- Yuki Moriya
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa 277-0871, Japan
| | - Shin Kawano
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa 277-0871, Japan
| | - Shujiro Okuda
- Niigata University Graduate School of Medical and Dental Sciences, Niigata 951-8510, Japan
| | - Yu Watanabe
- Niigata University Graduate School of Medical and Dental Sciences, Niigata 951-8510, Japan
| | - Masaki Matsumoto
- Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
| | - Tomoyo Takami
- Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
| | - Daiki Kobayashi
- Graduate School of Medical Sciences, Faculty of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Yoshinori Yamanouchi
- Graduate School of Medical Sciences, Faculty of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
- Kumamoto University Hospital, Kumamoto 860-8556, Japan
| | - Norie Araki
- Graduate School of Medical Sciences, Faculty of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Akiyasu C Yoshizawa
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
| | - Tsuyoshi Tabata
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
- Center for iPS Cell Research and Application, Kyoto University, Kyoto 606-8507, Japan
| | - Mio Iwasaki
- Center for iPS Cell Research and Application, Kyoto University, Kyoto 606-8507, Japan
| | - Naoyuki Sugiyama
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
| | - Susumu Goto
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa 277-0871, Japan
| | - Yasushi Ishihama
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
|
13
|
Verheggen K, Raeder H, Berven FS, Martens L, Barsnes H, Vaudel M. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows. Mass Spectrometry Reviews 2020; 39:292-306. [PMID: 28902424 DOI: 10.1002/mas.21543] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/06/2016] [Accepted: 07/05/2017] [Indexed: 06/07/2023]
Abstract
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines.
Affiliation(s)
- Kenneth Verheggen
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
| | - Helge Raeder
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Department of Pediatrics, Haukeland University Hospital, Bergen, Norway
| | - Frode S Berven
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Department of Biochemistry, Ghent University, Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
| | - Harald Barsnes
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Computational Biology Unit, Department of Informatics, University of Bergen, Norway
| | - Marc Vaudel
- KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Norway
- Proteomics Unit, Department of Biomedicine, University of Bergen, Norway
- Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
|
14
|
Chen G, Chen J, Liu H, Chen S, Zhang Y, Li P, Thierry-Mieg D, Thierry-Mieg J, Mattes W, Ning B, Shi T. Comprehensive Identification and Characterization of Human Secretome Based on Integrative Proteomic and Transcriptomic Data. Front Cell Dev Biol 2019; 7:299. [PMID: 31824949 PMCID: PMC6881247 DOI: 10.3389/fcell.2019.00299] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 08/27/2019] [Accepted: 11/07/2019] [Indexed: 12/25/2022] Open
Abstract
Secreted proteins (SPs) play important roles in diverse biological processes; however, a comprehensive and high-quality list of human SPs is still lacking. Here we identified 6,943 high-confidence human SPs (3,522 of them novel) based on 330,427 human proteins derived from the UniProt, Ensembl, AceView, and RefSeq databases. Notably, 6,267 of 6,943 (90.3%) SPs have supporting evidence from a large amount of mass spectrometry (MS) and RNA-seq data. We found that the SPs were broadly expressed in diverse tissues as well as human body fluid, and a significant portion of them exhibited tissue-specific expression. Moreover, we identified 14 cancer-specific SPs whose expression levels were significantly associated with patient survival in eight different tumors, which could be potential prognostic biomarkers. Strikingly, 89.21% of 6,943 SPs (2,927 novel SPs) contain known protein domains. The novel SPs were mainly enriched in known immunity-related domains, such as the Immunoglobulin V-set and C1-set domains. We also constructed a user-friendly and freely accessible database, SPRomeDB (www.unimd.org/SPRomeDB), to catalog those SPs. Our comprehensive SP identification and characterization provide insights into the human secretome and a valuable resource for future research.
Affiliation(s)
- Geng Chen
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Jiwei Chen
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Huanlong Liu
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Shuangguan Chen
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Yang Zhang
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Peng Li
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
| | - Danielle Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
| | - Jean Thierry-Mieg
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States
| | - William Mattes
- National Center for Toxicological Research, Food and Drug Administration, Jefferson City, AR, United States
| | - Baitang Ning
- National Center for Toxicological Research, Food and Drug Administration, Jefferson City, AR, United States
| | - Tieliu Shi
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China
|
15
|
Deutsch EW, Lane L, Overall CM, Bandeira N, Baker MS, Pineau C, Moritz RL, Corrales F, Orchard S, Van Eyk JE, Paik YK, Weintraub ST, Vandenbrouck Y, Omenn GS. Human Proteome Project Mass Spectrometry Data Interpretation Guidelines 3.0. J Proteome Res 2019; 18:4108-4116. [PMID: 31599596 DOI: 10.1021/acs.jproteome.9b00542] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Indexed: 11/28/2022]
Abstract
The Human Proteome Organization's (HUPO) Human Proteome Project (HPP) developed Mass Spectrometry (MS) Data Interpretation Guidelines that have been applied since 2016. These guidelines have helped ensure that the emerging draft of the complete human proteome is highly accurate and with low numbers of false-positive protein identifications. Here, we describe an update to these guidelines based on consensus-reaching discussions with the wider HPP community over the past year. The revised 3.0 guidelines address several major and minor identified gaps. We have added guidelines for emerging data independent acquisition (DIA) MS workflows and for use of the new Universal Spectrum Identifier (USI) system being developed by the HUPO Proteomics Standards Initiative (PSI). In addition, we discuss updates to the standard HPP pipeline for collecting MS evidence for all proteins in the HPP, including refinements to minimum evidence. We present a new plan for incorporating MassIVE-KB into the HPP pipeline for the next (HPP 2020) cycle in order to obtain more comprehensive coverage of public MS data sets. The main checklist has been reorganized under headings and subitems, and related guidelines have been grouped. In sum, Version 2.1 of the HPP MS Data Interpretation Guidelines has served well, and this timely update to version 3.0 will aid the HPP as it approaches its goal of collecting and curating MS evidence of translation and expression for all predicted ∼20 000 human proteins encoded by the human genome.
Affiliation(s)
- Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Lydie Lane
- SIB Swiss Institute of Bioinformatics and Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, Michel Servet 1, 1211 Geneva 4, Switzerland
| | - Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology, Faculty of Dentistry, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Nuno Bandeira
- Center for Computational Mass Spectrometry and Department of Computer Science and Engineering, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
| | - Mark S Baker
- Department of Biomedical Sciences, Faculty of Medicine and Health Science, Macquarie University, Macquarie Park, NSW 2109, Australia
| | - Charles Pineau
- Univ. Rennes, Inserm, EHESP, Irset (Institut de Recherche en Santé, Environnement et Travail) - UMR_S 1085, F-35042 Rennes cedex, France
| | - Robert L Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Fernando Corrales
- Functional Proteomics Laboratory, Centro Nacional de Biotecnología, Spanish Research Council, ProteoRed-ISCIII, Madrid 117, Spain
| | - Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
| | - Jennifer E Van Eyk
- Advanced Clinical Biosystems Research Institute, The Smidt Heart Institute, Department of Medicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
| | - Young-Ki Paik
- Yonsei Proteome Research Center, Yonsei University, 50 Yonsei-ro, Sudaemoon-ku, Seoul 03720, Korea
| | - Susan T Weintraub
- The University of Texas Health Science Center at San Antonio, San Antonio, Texas 78229, United States
| | - Yves Vandenbrouck
- Univ. Grenoble Alpes, CEA, INSERM, IRIG-BGE, U1038, F-38000 Grenoble, France
| | - Gilbert S Omenn
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Departments of Computational Medicine & Bioinformatics, Internal Medicine, and Human Genetics and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109-2218, United States
|
16
|
Pan J, Liu S, Zhu H, Qian J. AAgMarker 1.0: a resource of serological autoantigen biomarkers for clinical diagnosis and prognosis of various human diseases. Nucleic Acids Res 2019; 46:D886-D893. [PMID: 28977551 PMCID: PMC5753245 DOI: 10.1093/nar/gkx770] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/10/2017] [Accepted: 08/29/2017] [Indexed: 01/02/2023] Open
Abstract
Autoantibodies are produced to target an individual's own antigens (e.g. proteins). They can trigger autoimmune responses and inflammation, and thus cause many types of diseases. Many high-throughput autoantibody profiling projects have been reported for unbiased identification of serological autoantigen-based biomarkers. However, the lack of a centralized data portal for these published assays has been a major obstacle to further data mining and to cross-evaluating the quality of these datasets generated from different diseases. Here, we introduce a user-friendly database, AAgMarker 1.0, which collects many published raw datasets obtained from serum profiling assays on the proteome microarrays, and provides a toolbox for mining these data. The current version of AAgMarker 1.0 contains 854 serum samples, involving 136 092 proteins. A total of 7803 (4470 non-redundant) candidate autoantigen biomarkers were identified and collected for 12 diseases, such as Alzheimer's disease, Behçet's disease and Parkinson's disease. Seven statistical parameters are introduced to quantitatively assess these biomarkers. Users can retrieve, analyse and compare the datasets through basic search, advanced search and browse. These biomarkers are also downloadable by disease terms. AAgMarker 1.0 is now freely accessible at http://bioinfo.wilmer.jhu.edu/AAgMarker/. We believe this database will be a valuable resource for the community of both biomedical and clinical research.
Affiliation(s)
- Jianbo Pan
- Department of Ophthalmology, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
| | - Sheng Liu
- Department of Ophthalmology, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
| | - Heng Zhu
- Department of Pharmacology and Molecular Sciences, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
| | - Jiang Qian
- Department of Ophthalmology, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
|
17
|
Hsu YY, Clyne M, Wei CH, Khoury MJ, Lu Z. Using deep learning to identify translational research in genomic medicine beyond bench to bedside. Database: The Journal of Biological Databases and Curation 2019; 2019:5309020. [PMID: 30753477 PMCID: PMC6367517 DOI: 10.1093/database/baz010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/18/2018] [Accepted: 01/15/2019] [Indexed: 12/18/2022]
Abstract
Tracking scientific research publications on the evaluation, utility and implementation of genomic applications is critical for the translation of basic research to impact clinical and population health. In this work, we utilize state-of-the-art machine learning approaches to identify translational research in genomics beyond bench to bedside from the biomedical literature. We apply convolutional neural networks (CNNs) and support vector machines (SVMs) to the bench/bedside article classification on the weekly manual annotation data of the Public Health Genomics Knowledge Base database. Both classifiers employ salient features to determine the probability of curation-eligible publications, which can effectively reduce the workload of the manual triage and curation process. We applied the CNNs and SVMs to an independent test set (n = 400), and the models achieved F-measures of 0.80 and 0.74, respectively. We further tested the CNNs, which performed better, on the routine annotation pipeline for 2 weeks; this significantly reduced the effort and retrieved more appropriate research articles. Our approaches provide direct insight into the automated curation of genomic translational research beyond bench to bedside. The machine learning classifiers are found to be helpful for annotators to enhance the efficiency of manual curation.
Affiliation(s)
- Yi-Yu Hsu
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, USA
| | - Mindy Clyne
- Implementation Science Team, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD, USA
| | - Chih-Hsuan Wei
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, USA
| | - Muin J Khoury
- Office of Public Health Genomics, Centers for Disease Control and Prevention, Atlanta, GA, USA
| | - Zhiyong Lu
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, USA
|
18
|
Kopylov AT, Ponomarenko EA, Ilgisonis EV, Pyatnitskiy MA, Lisitsa AV, Poverennaya EV, Kiseleva OI, Farafonova TE, Tikhonova OV, Zavialova MG, Novikova SE, Moshkovskii SA, Radko SP, Morukov BV, Grigoriev AI, Paik YK, Salekdeh GH, Urbani A, Zgoda VG, Archakov AI. 200+ Protein Concentrations in Healthy Human Blood Plasma: Targeted Quantitative SRM SIS Screening of Chromosomes 18, 13, Y, and the Mitochondrial Chromosome Encoded Proteome. J Proteome Res 2018; 18:120-129. [PMID: 30480452 DOI: 10.1021/acs.jproteome.8b00391] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/11/2022]
Abstract
This work continues the series of the quantitative measurements of the proteins encoded by different chromosomes in the blood plasma of a healthy person. Selected Reaction Monitoring with Stable Isotope-labeled peptide Standards (SRM SIS) and a gene-centric approach, which is the basis for the implementation of the international Chromosome-centric Human Proteome Project (C-HPP), were applied for the quantitative measurement of proteins in human blood plasma. Analyses were carried out in the frame of C-HPP for each protein-coding gene of the four human chromosomes: 18, 13, Y, and mitochondrial. Concentrations of proteins encoded by 667 genes were measured in 54 blood plasma samples of the volunteers, whose health conditions were consistent with requirements for astronauts. The gene list included 276, 329, 47, and 15 genes of chromosomes 18, 13, Y, and the mitochondrial chromosome, respectively. This paper does not make claims about the detection of missing proteins. Only 205 proteins (30.7%) were detected in the samples. Of them, 84, 106, 10, and 5 belonged to chromosomes 18, 13, and Y and the mitochondrial chromosome, respectively. Each detected protein was found in at least one of the samples analyzed. The SRM SIS raw data are available in the ProteomeXchange repository (PXD004374, PASS01192).
Affiliation(s)
- Sergey A Moshkovskii
- Institute of Biomedical Chemistry, Moscow 119435, Russia
- Pirogov Russian National Research Medical University, Moscow 117997, Russia
| | - Sergey P Radko
- Institute of Biomedical Chemistry, Moscow 119435, Russia
| | - Boris V Morukov
- Institute of Medico-Biological Problems, Moscow 123007, Russia
| | - Young-Ki Paik
- Yonsei Proteome Research Center, Yonsei University, Seoul 03722, Korea
| | - Ghasem Hosseini Salekdeh
- Department of Molecular Systems Biology, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran
- Department of Molecular Sciences, Macquarie University, Sydney, New South Wales 2109, Australia
- Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran, Karaj, Iran
| | - Andrea Urbani
- Area of Diagnostic Laboratories, Fondazione Policlinico Gemelli-IRCCS, Rome 00168, Italy
- Institute of Biochemistry and Clinical Biochemistry, Catholic University of the Sacred Heart, Rome 00168, Italy
| | - Victor G Zgoda
- Institute of Biomedical Chemistry, Moscow 119435, Russia
| | | |
Collapse
|
19
|
Siddiqui O, Zhang H, Guan Y, Omenn GS. Chromosome 17 Missing Proteins: Recent Progress and Future Directions as Part of the neXt-MP50 Challenge. J Proteome Res 2018; 17:4061-4071. [PMID: 30280577 DOI: 10.1021/acs.jproteome.8b00442] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
The Chromosome-centric Human Proteome Project (C-HPP), announced in September 2016, is an initiative to accelerate progress on the detection and characterization of neXtProt PE2,3,4 "missing proteins" (MPs) with a mandate to each chromosome team to find about 50 MPs over 2 years. Here we report major progress toward the neXt-MP50 challenge with 43 newly validated Chr 17 PE1 proteins, of which 25 were based on mass spectrometry, 12 on protein-protein interactions, 3 on a combination of MS and PPI, and 3 with other types of data. Notable among these new PE1 proteins were five keratin-associated proteins, a single olfactory receptor, and five additional membrane-embedded proteins. We evaluate the prospects of finding the remaining 105 MPs coded for on Chr 17, focusing on mass spectrometry and protein-protein interaction approaches. We present a list of 35 prioritized MPs with specific approaches that may be used in further MS and PPI experimental studies. Additionally, we demonstrate how in silico studies can be used to capture individual peptides from major data repositories, documenting one MP that appears to be a strong candidate for PE1. We are close to our goal of finding 50 MPs for Chr 17.
Affiliation(s)
- Omer Siddiqui
  - Department of Electronic Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Hongjiu Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Yuanfang Guan
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan 48109, United States
- Gilbert S Omenn
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan 48109, United States
  - Department of Human Genetics and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109, United States
|
20
|
Markosian C, Di Costanzo L, Sekharan M, Shao C, Burley SK, Zardecki C. Analysis of impact metrics for the Protein Data Bank. Sci Data 2018; 5:180212. [PMID: 30325351 PMCID: PMC6190746 DOI: 10.1038/sdata.2018.212] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/11/2018] [Accepted: 08/29/2018] [Indexed: 01/13/2023] Open
Abstract
Since 1971, the Protein Data Bank (PDB) archive has served as the single, global repository for open access to atomic-level data for biological macromolecules. The archive currently holds >140,000 structures (>1 billion atoms). These structures are the molecules of life found in all organisms. Knowing the 3D structure of a biological macromolecule is essential for understanding the molecule's function, providing insights in health and disease, food and energy production, and other topics of concern to prosperity and sustainability. PDB data are freely and publicly available, without restrictions on usage. Through bibliometric and usage studies, we sought to determine the impact of the PDB across disciplines and demographics. Our analysis shows that even though research areas such as molecular biology and biochemistry account for the most usage, other fields are increasingly using PDB resources. PDB usage is seen across 150 disciplines in applied sciences, humanities, and social sciences. Data are also re-used and integrated with >400 resources. Our study identifies trends in PDB usage and documents its utility across research disciplines.
Affiliation(s)
- Christopher Markosian
  - Department of Molecular Biology and Biochemistry, School of Arts and Sciences, Rutgers, The State University of New Jersey, Piscataway, NJ USA
- Luigi Di Costanzo
  - RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
- Monica Sekharan
  - RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
- Chenghua Shao
  - RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
- Stephen K Burley
  - RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
  - RCSB Protein Data Bank, Skaggs School of Pharmacy and Pharmaceutical Sciences and San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA USA
  - Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ USA
- Christine Zardecki
  - RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
|
21
|
Mendoza L, Deutsch EW, Sun Z, Campbell DS, Shteynberg DD, Moritz RL. Flexible and Fast Mapping of Peptides to a Proteome with ProteoMapper. J Proteome Res 2018; 17:4337-4344. [PMID: 30230343 DOI: 10.1021/acs.jproteome.8b00544] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Bottom-up proteomics relies on the proteolytic or chemical cleavage of proteins into peptides, the identification of those peptides via mass spectrometry, and the mapping of the identified peptides back to the reference proteome to infer which possible proteins are identified. Reliable mapping of peptides to proteins still poses substantial challenges when considering similar proteins, protein families, splice isoforms, sequence variation, and possible residue mass modifications, combined with an imperfect and incomplete understanding of the proteome. The ProteoMapper tool enables a comprehensive and rapid mapping of peptides to a reference proteome. The indexer component creates a segmented index for an input proteome from a FASTA or PEFF file. The ProMaST component provides ultrafast mapping of one or more input peptides against the index. ProteoMapper allows searches that take into account known sequence variation encoded in PEFF files. It also enables fuzzy searches to find highly similar peptides with residue order changes or other isobaric or near-isobaric substitutions within a specified mass tolerance. We demonstrate an example of a one-hit-wonder identification in PeptideAtlas that may be better explained by a combination of catalogued and uncatalogued sequence variation in another highly observed protein. ProteoMapper is a free and open source, available for local use after downloading, embedding in other applications, as an online web tool at http://www.peptideatlas.org/map , and as a web service.
Affiliation(s)
- Luis Mendoza
  - Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- Eric W Deutsch
  - Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- Zhi Sun
  - Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- David S Campbell
  - Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- David D Shteynberg
  - Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
- Robert L Moritz
  - Institute for Systems Biology, 401 Terry Ave North, Seattle, Washington 98109, United States
|
22
|
Macron C, Lane L, Núñez Galindo A, Dayon L. Deep Dive on the Proteome of Human Cerebrospinal Fluid: A Valuable Data Resource for Biomarker Discovery and Missing Protein Identification. J Proteome Res 2018; 17:4113-4126. [PMID: 30124047 DOI: 10.1021/acs.jproteome.8b00300] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Indexed: 02/04/2023]
Abstract
Cerebrospinal fluid (CSF) is a body fluid of choice for biomarker studies of brain disorders but remains relatively under-studied compared with other biological fluids such as plasma, partly due to the more invasive means of its sample collection. The present study establishes an in-depth CSF proteome through the analysis of a unique CSF sample from a pool of donors. After immunoaffinity depletion, the CSF sample was fractionated using off-gel electrophoresis and analyzed with liquid chromatography tandem mass spectrometry (MS) using the latest generation of hybrid Orbitrap mass spectrometers. The shotgun proteomic analysis allowed the identification of 20 689 peptides mapping on 3379 proteins. To the best of our knowledge, the obtained data set constitutes the largest CSF proteome published so far. Among the CSF proteins identified, 34% correspond to genes whose transcripts are highly expressed in brain according to the Human Protein Atlas. The principal Alzheimer's disease biomarkers (e.g., tau protein, amyloid-β, apolipoprotein E, and neurogranin) were detected. Importantly, our data set significantly contributes to the Chromosome-centric Human Proteome Project (C-HPP), and 12 proteins considered as missing are proposed for validation in accordance with the HPP guidelines. Of these 12 proteins, 8 proteins are based on 2 to 6 uniquely mapping peptides from this CSF analysis, and 4 match a new peptide with a "stranded" single peptide in PeptideAtlas from previous CSF studies. The MS proteomic data are available to the ProteomeXchange Consortium ( http://www.proteomexchange.org/ ) with the data set identifier PXD009646.
Affiliation(s)
- Charlotte Macron
  - Proteomics, Nestlé Institute of Health Sciences, 1015 Lausanne, Switzerland
- Lydie Lane
  - CALIPHO Group, SIB-Swiss Institute of Bioinformatics, CMU, rue Michel-Servet 1, 1211 Geneva 4, Switzerland
  - Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, rue Michel-Servet 1, 1211 Geneva 4, Switzerland
- Loïc Dayon
  - Proteomics, Nestlé Institute of Health Sciences, 1015 Lausanne, Switzerland
|
23
|
Assembling the Community-Scale Discoverable Human Proteome. Cell Syst 2018; 7:412-421.e5. [PMID: 30172843 PMCID: PMC6279426 DOI: 10.1016/j.cels.2018.08.004] [Citation(s) in RCA: 107] [Impact Index Per Article: 17.8] [Received: 09/14/2017] [Revised: 12/22/2017] [Accepted: 08/03/2018] [Indexed: 01/15/2023]
Abstract
The increasing throughput and sharing of proteomics mass spectrometry data have now yielded over one-third of a million public mass spectrometry runs. However, these discoveries are not continuously aggregated in an open and error-controlled manner, which limits their utility. To facilitate the reusability of these data, we built the MassIVE Knowledge Base (MassIVE-KB), a community-wide, continuously updating knowledge base that aggregates proteomics mass spectrometry discoveries into an open reusable format with full provenance information for community scrutiny. Reusing >31 TB of public human data stored in a mass spectrometry interactive virtual environment (MassIVE), the MassIVE-KB contains >2.1 million precursors from 19,610 proteins (48% larger than before; 97% of the total) and doubles proteome coverage to 6 million amino acids (54% of the proteome) with strict library-scale false discovery controls, thereby providing evidence for 430 proteins for which sufficient protein-level evidence was previously missing. Furthermore, MassIVE-KB can inform experimental design, helps identify and quantify new data, and provides tools for community construction of specialized spectral libraries. Wang et al. introduce MassIVE-KB, a program designed to distill the entire community’s mass spectrometry data into reusable spectral library resources. As a result, the statistically-significant discovery of a peptide or protein in a single researcher’s data will thus be made available to the whole community to support its identification (in shotgun experiments) or quantitative detection (in targeted experiments) in all future analyses.
|
24
|
Macron C, Lane L, Núñez Galindo A, Dayon L. Identification of Missing Proteins in Normal Human Cerebrospinal Fluid. J Proteome Res 2018; 17:4315-4319. [DOI: 10.1021/acs.jproteome.8b00194] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022]
Affiliation(s)
- Charlotte Macron
  - Proteomics, Nestlé Institute of Health Sciences, 1015 Lausanne, Switzerland
- Lydie Lane
  - CALIPHO Group, SIB-Swiss Institute of Bioinformatics, CMU, rue Michel-Servet 1, 1211 Geneva 4, Switzerland
  - Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, rue Michel-Servet 1, 1211 Geneva 4, Switzerland
- Loïc Dayon
  - Proteomics, Nestlé Institute of Health Sciences, 1015 Lausanne, Switzerland
|
25
|
Zheng L, Chen Y, Elhanan G, Perl Y, Geller J, Ochs C. Complex overlapping concepts: An effective auditing methodology for families of similarly structured BioPortal ontologies. J Biomed Inform 2018; 83:135-149. [PMID: 29852316 DOI: 10.1016/j.jbi.2018.05.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 10/10/2017] [Revised: 05/25/2018] [Accepted: 05/26/2018] [Indexed: 11/30/2022]
Abstract
In previous research, we have demonstrated for a number of ontologies that structurally complex concepts (for different definitions of "complex") in an ontology are more likely to exhibit errors than other concepts. Thus, such complex concepts often become fertile ground for quality assurance (QA) in ontologies. They should be audited first. One example of complex concepts is given by "overlapping concepts" (to be defined below.) Historically, a different auditing methodology had to be developed for every single ontology. For better scalability and efficiency, it is desirable to identify family-wide QA methodologies. Each such methodology would be applicable to a whole family of similar ontologies. In past research, we had divided the 685 ontologies of BioPortal into families of structurally similar ontologies. We showed for four ontologies of the same large family in BioPortal that "overlapping concepts" are indeed statistically significantly more likely to exhibit errors. In order to make an authoritative statement concerning the success of "overlapping concepts" as a methodology for a whole family of similar ontologies (or of large subhierarchies of ontologies), it is necessary to show that "overlapping concepts" have a higher likelihood of errors for six out of six ontologies of the family. In this paper, we are demonstrating for two more ontologies that "overlapping concepts" can successfully predict groups of concepts with a higher error rate than concepts from a control group. The fifth ontology is the Neoplasm subhierarchy of the National Cancer Institute thesaurus (NCIt). The sixth ontology is the Infectious Disease subhierarchy of SNOMED CT. We demonstrate quality assurance results for both of them. Furthermore, in this paper we observe two novel, important, and useful phenomena during quality assurance of "overlapping concepts." First, an erroneous "overlapping concept" can help with discovering other erroneous "non-overlapping concepts" in its vicinity. Secondly, correcting erroneous "overlapping concepts" may turn them into "non-overlapping concepts." We demonstrate that this may reduce the complexity of parts of the ontology, which in turn makes the ontology more comprehensible, simplifying maintenance and use of the ontology.
Affiliation(s)
- Ling Zheng
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, United States
- Yan Chen
  - CIS Department, Borough of Manhattan Community College, CUNY, NY 10007, United States
- Gai Elhanan
  - Applied Innovation Center, Desert Research Institute, Reno, NV 89512, United States
- Yehoshua Perl
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, United States
- James Geller
  - Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, United States
|
26
|
Abstract
The Cellosaurus is a knowledge resource on cell lines. It aims to describe all cell lines used in biomedical research. Its scope encompasses both vertebrates and invertebrates. Currently, information for >100,000 cell lines is provided. For each cell line, it provides a wealth of information, cross-references, and literature citations. The Cellosaurus is available on the ExPASy server (https://web.expasy.org/cellosaurus/) and can be downloaded in a variety of formats. Among its many uses, the Cellosaurus is a key resource to help researchers identify potentially contaminated/misidentified cell lines, thus contributing to improving the quality of research in the life sciences.
Affiliation(s)
- Amos Bairoch
  - Computer and Laboratory Investigation of Proteins of Human Origin Group, Faculty of Medicine, Swiss Institute of Bioinformatics, University of Geneva, Geneva 4, Switzerland
|
27
|
Haller C, Chaskar P, Piccand J, Cominetti O, Macron C, Dayon L, Kraus MRC. Insights into Islet Differentiation and Maturation through Proteomic Characterization of a Human iPSC-Derived Pancreatic Endocrine Model. Proteomics Clin Appl 2018; 12:e1600173. [PMID: 29578310 DOI: 10.1002/prca.201600173] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/24/2017] [Revised: 02/09/2018] [Indexed: 12/16/2022]
Abstract
PURPOSE: Great progress has been made in generating pluripotent stem cell-derived pancreatic β-like cells in vitro. However, the maturation stage of the cells still requires in vivo maturation to recreate the environmental niche. A deeper understanding of the factors promoting maturation of the cells is of great interest for clinical applications. EXPERIMENTAL DESIGN: Label-free mass spectrometry-based proteomic analysis is performed on samples from a longitudinal study of differentiation of human induced pluripotent stem cells toward glucose-responsive insulin-producing cells. RESULTS: Proteome patterns correlate with specific transcription factor gene expression levels during in vitro differentiation, showing the relevance of the technology for identification of pancreatic-specific markers. The analysis of proteomes of the implanted cells in a longitudinal study shows that the neovascularization process linked to the extracellular matrix environment is time-dependent and conditions the proper maturation of the cells into β-like cells secreting insulin in response to glucose. CONCLUSIONS AND CLINICAL RELEVANCE: Proteomic profiling is valuable to qualify and better understand in vivo maturation of progenitor cells toward β-cells. This is critical for future clinical trials where in vivo maturation still needs to be improved for robustness and effectiveness of cell therapy. Novel biomarkers for predicting the efficiency of maturation represent noninvasive monitoring tools for following efficiency of the implant.
Affiliation(s)
- Corinne Haller
  - Stem Cells, Nestlé Institute of Health Sciences, Lausanne, Switzerland
- Prasad Chaskar
  - Stem Cells, Nestlé Institute of Health Sciences, Lausanne, Switzerland
- Julie Piccand
  - Stem Cells, Nestlé Institute of Health Sciences, Lausanne, Switzerland
- Ornella Cominetti
  - Proteomics, Nestlé Institute of Health Sciences, Lausanne, Switzerland
- Charlotte Macron
  - Proteomics, Nestlé Institute of Health Sciences, Lausanne, Switzerland
- Loïc Dayon
  - Proteomics, Nestlé Institute of Health Sciences, Lausanne, Switzerland
- Marine R-C Kraus
  - Stem Cells, Nestlé Institute of Health Sciences, Lausanne, Switzerland
|
28
|
Frantzi M, Klimou Z, Makridakis M, Zoidakis J, Latosinska A, Borràs DM, Janssen B, Giannopoulou I, Lygirou V, Lazaris AC, Anagnou NP, Mischak H, Roubelakis MG, Vlahou A. Silencing of Profilin-1 suppresses cell adhesion and tumor growth via predicted alterations in integrin and Ca2+ signaling in T24M-based bladder cancer models. Oncotarget 2016; 7:70750-70768. [PMID: 27683119 PMCID: PMC5342587 DOI: 10.18632/oncotarget.12218] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/22/2016] [Accepted: 09/13/2016] [Indexed: 12/14/2022] Open
Abstract
Bladder cancer (BC) is the second most common malignancy of the genitourinary system, characterized by the highest recurrence rate of all cancers. Treatment options are limited; thus a thorough understanding of the underlying molecular mechanisms is needed to guide the discovery of novel therapeutic targets. Profilins are actin binding proteins with attributed pleiotropic functions to cytoskeletal remodeling, cell adhesion, motility, even transcriptional regulation, not fully characterized yet. Earlier studies from our laboratory revealed that decreased tissue levels of Profilin-1 (PFN1) are correlated with BC progression to muscle invasive disease. Herein, we describe a comprehensive analysis of PFN1 silencing via shRNA, in vitro (by employing T24M cells) and in vivo [with T24M xenografts in non-obese diabetic severe combined immunodeficient (NOD/SCID) mice]. A combination of phenotypic and molecular assays, including migration, proliferation, adhesion assays, flow cytometry and total mRNA sequencing, as well as immunohistochemistry for investigation of selected findings in human specimens were applied. A decrease in BC cell adhesion and tumor growth in vivo following PFN downregulation is observed, likely associated with the concomitant downregulation of Fibronectin receptor, Endothelin-1, and Actin polymerization. A decrease in the levels of multiple key members of the non-canonical Wnt/Ca2+ signaling pathway is also detected following PFN1 suppression, providing the groundwork for future studies, addressing the specific role of PFN1 in Ca2+ signaling, particularly in the muscle invasive disease.
Affiliation(s)
- Maria Frantzi
  - Proteomics Laboratory, Biotechnology Division, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
  - Research and Development Department, Mosaiques Diagnostics GmbH, Hannover, Germany
- Zoi Klimou
  - Laboratory of Biology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
  - Cell and Gene Therapy Laboratory, Biomedical Research Foundation of The Academy of Athens, Athens, Greece
- Manousos Makridakis
  - Proteomics Laboratory, Biotechnology Division, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- Jerome Zoidakis
  - Proteomics Laboratory, Biotechnology Division, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- Agnieszka Latosinska
  - Proteomics Laboratory, Biotechnology Division, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- Daniel M Borràs
  - Research and Development Department, GenomeScan B.V., Leiden, The Netherlands
- Bart Janssen
  - Research and Development Department, GenomeScan B.V., Leiden, The Netherlands
- Ioanna Giannopoulou
  - First Department of Pathology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
- Vasiliki Lygirou
  - Proteomics Laboratory, Biotechnology Division, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
- Andreas C Lazaris
  - First Department of Pathology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
- Nicholas P Anagnou
  - Laboratory of Biology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
  - Cell and Gene Therapy Laboratory, Biomedical Research Foundation of The Academy of Athens, Athens, Greece
- Harald Mischak
  - Research and Development Department, Mosaiques Diagnostics GmbH, Hannover, Germany
- Maria G Roubelakis
  - Laboratory of Biology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
  - Cell and Gene Therapy Laboratory, Biomedical Research Foundation of The Academy of Athens, Athens, Greece
- Antonia Vlahou
  - Proteomics Laboratory, Biotechnology Division, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
|
29
|
Gobeill J, Gaudet P, Dopp D, Morrone A, Kahanda I, Hsu YY, Wei CH, Lu Z, Ruch P. Overview of the BioCreative VI text-mining services for Kinome Curation Track. Database (Oxford) 2018; 2018:5133467. [PMID: 30329035 PMCID: PMC6191643 DOI: 10.1093/database/bay104] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/03/2018] [Accepted: 09/13/2018] [Indexed: 11/30/2022]
Abstract
The text-mining services for kinome curation track, part of BioCreative VI, proposed a competition to assess the effectiveness of text mining to perform literature triage. The track has exploited an unpublished curated data set from the neXtProt database. This data set contained comprehensive annotations for 300 human protein kinases. For a given protein and a given curation axis [diseases or gene ontology (GO) biological processes], participants’ systems had to identify and rank relevant articles in a collection of 5.2 M MEDLINE citations (task 1) or 530 000 full-text articles (task 2). Explored strategies comprised named-entity recognition and machine-learning frameworks. For that latter approach, participants developed methods to derive a set of negative instances, as the databases typically do not store articles that were judged as irrelevant by curators. The supervised approaches proposed by the participating groups achieved significant improvements compared to the baseline established in a previous study and compared to a basic PubMed search.
Affiliation(s)
- Julien Gobeill
  - SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
  - HES-SO / HEG Geneva, Information Sciences, Geneva, Switzerland
- Pascale Gaudet
  - SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
- Yi-Yu Hsu
  - National Center for Biotechnology Information, Bethesda, MD, USA
- Chih-Hsuan Wei
  - National Center for Biotechnology Information, Bethesda, MD, USA
- Zhiyong Lu
  - National Center for Biotechnology Information, Bethesda, MD, USA
- Patrick Ruch
  - SIB Text Mining, Swiss Institute of Bioinformatics, Geneva, Switzerland
  - HES-SO / HEG Geneva, Information Sciences, Geneva, Switzerland
|
30
|
Deutsch EW, Orchard S, Binz PA, Bittremieux W, Eisenacher M, Hermjakob H, Kawano S, Lam H, Mayer G, Menschaert G, Perez-Riverol Y, Salek RM, Tabb DL, Tenzer S, Vizcaíno JA, Walzer M, Jones AR. Proteomics Standards Initiative: Fifteen Years of Progress and Future Work. J Proteome Res 2017; 16:4288-4298. [PMID: 28849660 PMCID: PMC5715286 DOI: 10.1021/acs.jproteome.7b00370] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Received: 06/02/2017] [Indexed: 12/21/2022]
Abstract
The Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) has now been developing and promoting open community standards and software tools in the field of proteomics for 15 years. Under the guidance of the chair, cochairs, and other leadership positions, the PSI working groups are tasked with the development and maintenance of community standards via special workshops and ongoing work. Among the existing ratified standards, the PSI working groups continue to update PSI-MI XML, MITAB, mzML, mzIdentML, mzQuantML, mzTab, and the MIAPE (Minimum Information About a Proteomics Experiment) guidelines with the advance of new technologies and techniques. Furthermore, new standards are currently either in the final stages of completion (proBed and proBAM for proteogenomics results as well as PEFF) or in early stages of design (a spectral library standard format, a universal spectrum identifier, the qcML quality control format, and the Protein Expression Interface (PROXI) web services Application Programming Interface). In this work we review the current status of all of these aspects of the PSI, describe synergies with other efforts such as the ProteomeXchange Consortium, the Human Proteome Project, and the metabolomics community, and provide a look at future directions of the PSI.
Affiliation(s)
- Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Pierre-Alain Binz
- CHUV Centre Hospitalier Universitaire Vaudois, 1011 Lausanne, Switzerland
| | - Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerp, Belgium
| | - Martin Eisenacher
- Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, National Center for Protein Sciences, Beijing, Beijing 102206, China
| | - Shin Kawano
- Database Center for Life Science, Joint Support Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Chiba 277-0871, Japan
| | - Henry Lam
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, P. R. China
- Department of Chemical and Biomolecular Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, P. R. China
| | - Gerhard Mayer
- Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
| | - Gerben Menschaert
- Lab of Bioinformatics and Computational Genomics (BioBix), Faculty of Bioscience Engineering, Ghent University, 9000 Ghent, Belgium
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Reza M. Salek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - David L. Tabb
- SA MRC Centre for TB Research, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
| | - Stefan Tenzer
- Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Andrew R. Jones
- Institute of Integrative Biology, University of Liverpool, South Wirral L64 4AY, United Kingdom
| |
|
31
|
Wang Y, Chen Y, Zhang Y, Wei W, Li Y, Zhang T, He F, Gao Y, Xu P. Multi-Protease Strategy Identifies Three PE2 Missing Proteins in Human Testis Tissue. J Proteome Res 2017; 16:4352-4363. [PMID: 28959888 DOI: 10.1021/acs.jproteome.7b00340] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 11/30/2022]
Abstract
Although 5 years of the missing proteins (MPs) study have been completed, searching for MPs remains one of the core missions of the Chromosome-Centric Human Proteome Project (C-HPP). Following the next-50-MPs challenge of the C-HPP, we have focused on the testis-enriched MPs by various strategies since 2015. On the basis of the theoretical analysis of MPs (2017-01, neXtProt) using multiprotease digestion, we found that nonconventional proteases (e.g., LysargiNase, GluC) could improve the peptide diversity and sequence coverage compared with trypsin. Therefore, a multiprotease strategy was used to search for more MPs in the same human testis tissues separated by 10% SDS-PAGE, followed by analysis on a high-resolution LC-MS/MS system (Q Exactive HF). A total of 7838 proteins were identified. Among them, three PE2 MPs in neXtProt 2017-01 have been identified: beta-defensin 123 (Q8N688, chr 20q), cancer/testis antigen family 45 member A10 (P0DMU9, chr Xq), and Histone H2A-Bbd type 2/3 (P0C5Z0, chr Xq). However, because only one unique peptide of ≥9 AA was identified in each of beta-defensin 123 and Histone H2A-Bbd type 2/3, further analysis indicates that each falls under the exceptions clause of the HPP Guidelines v2.1. After a spectrum quality check, isobaric PTM and single amino acid variant (SAAV) filtering, and verification with a synthesized peptide, and based on overlapping peptides from different proteases, these three MPs should be considered as exemplary examples of MPs found by exceptional criteria. Other MPs were considered as candidates but need further validation. All MS data sets have been deposited to the ProteomeXchange with identifier PXD006465.
Affiliation(s)
- Yihao Wang
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
- Department of Pharmacology and Toxicology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Yang Chen
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Yao Zhang
- State Key Laboratory of Biocontrol and Guangdong Provincial Key Laboratory of Plant Resources, College of Ecology and Evolution, Sun Yat-Sen University, Guangzhou 510275, China
| | - Wei Wei
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Yanchang Li
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Tao Zhang
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Fuchu He
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
| | - Yue Gao
- Department of Pharmacology and Toxicology, Beijing Institute of Radiation Medicine, Beijing 100850, China
| | - Ping Xu
- State Key Laboratory of Proteomics, National Center for Protein Sciences Beijing, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing 102206, China
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery of Ministry of Education, School of Pharmaceutical Sciences, Wuhan University, Wuhan 430072, China
- Graduate School, Anhui Medical University, Hefei 230032, China
- Tianjin Baodi Hospital, Tianjin 301800, China
| |
|
32
|
Maffioli E, Nonnis S, Angioni R, Santagata F, Calì B, Zanotti L, Negri A, Viola A, Tedeschi G. Proteomic analysis of the secretome of human bone marrow-derived mesenchymal stem cells primed by pro-inflammatory cytokines. J Proteomics 2017; 166:115-126. [DOI: 10.1016/j.jprot.2017.07.012] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 03/27/2017] [Revised: 06/07/2017] [Accepted: 07/17/2017] [Indexed: 02/07/2023]
|
33
|
Galperin MY, Fernández-Suárez XM, Rigden DJ. The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes. Nucleic Acids Res 2017; 45:D1-D11. [PMID: 28053160 PMCID: PMC5210597 DOI: 10.1093/nar/gkw1188] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 11/11/2016] [Accepted: 11/16/2016] [Indexed: 12/23/2022] Open
Abstract
This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein-protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as 'breakthrough' contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the 'golden set' of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the https://academic.oup.com/nar web site. An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/.
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | | | - Daniel J Rigden
- Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK
| |
|
34
|
Thul PJ, Åkesson L, Wiking M, Mahdessian D, Geladaki A, Ait Blal H, Alm T, Asplund A, Björk L, Breckels LM, Bäckström A, Danielsson F, Fagerberg L, Fall J, Gatto L, Gnann C, Hober S, Hjelmare M, Johansson F, Lee S, Lindskog C, Mulder J, Mulvey CM, Nilsson P, Oksvold P, Rockberg J, Schutten R, Schwenk JM, Sivertsson Å, Sjöstedt E, Skogs M, Stadler C, Sullivan DP, Tegel H, Winsnes C, Zhang C, Zwahlen M, Mardinoglu A, Pontén F, von Feilitzen K, Lilley KS, Uhlén M, Lundberg E. A subcellular map of the human proteome. Science 2017; 356:science.aal3321. [PMID: 28495876 DOI: 10.1126/science.aal3321] [Citation(s) in RCA: 1760] [Impact Index Per Article: 251.4] [Received: 11/02/2016] [Accepted: 03/31/2017] [Indexed: 12/13/2022]
Abstract
Resolving the spatial distribution of the human proteome at a subcellular level can greatly increase our understanding of human biology and disease. Here we present a comprehensive image-based map of subcellular protein distribution, the Cell Atlas, built by integrating transcriptomics and antibody-based immunofluorescence microscopy with validation by mass spectrometry. Mapping the in situ localization of 12,003 human proteins at a single-cell level to 30 subcellular structures enabled the definition of the proteomes of 13 major organelles. Exploration of the proteomes revealed single-cell variations in abundance or spatial distribution and localization of about half of the proteins to multiple compartments. This subcellular map can be used to refine existing protein-protein interaction networks and provides an important resource to deconvolute the highly complex architecture of the human cell.
Affiliation(s)
- Peter J Thul
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Lovisa Åkesson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Mikaela Wiking
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Diana Mahdessian
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Aikaterini Geladaki
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
| | - Hammou Ait Blal
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Tove Alm
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Anna Asplund
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden
| | - Lars Björk
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Lisa M Breckels
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Anna Bäckström
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Frida Danielsson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Linn Fagerberg
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Jenny Fall
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Laurent Gatto
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
- Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Christian Gnann
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Sophia Hober
- Department of Proteomics, School of Biotechnology, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
| | - Martin Hjelmare
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Fredric Johansson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Sunjae Lee
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Cecilia Lindskog
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden
| | - Jan Mulder
- Science for Life Laboratory, Department of Neuroscience, Karolinska Institute, SE-171 77 Stockholm, Sweden
| | - Claire M Mulvey
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Peter Nilsson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Per Oksvold
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Johan Rockberg
- Department of Proteomics, School of Biotechnology, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
| | - Rutger Schutten
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Jochen M Schwenk
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Åsa Sivertsson
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Evelina Sjöstedt
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden
| | - Marie Skogs
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Charlotte Stadler
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Devin P Sullivan
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Hanna Tegel
- Department of Proteomics, School of Biotechnology, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden
| | - Casper Winsnes
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Cheng Zhang
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Martin Zwahlen
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Adil Mardinoglu
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Fredrik Pontén
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, SE-751 85 Uppsala, Sweden
| | - Kalle von Feilitzen
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden
| | - Kathryn S Lilley
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QR, UK
| | - Mathias Uhlén
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden.
| | - Emma Lundberg
- Science for Life Laboratory, School of Biotechnology, KTH Royal Institute of Technology, SE-171 21 Stockholm, Sweden.
| |
|
35
|
Identification and Validation of Novel Subtype-Specific Protein Biomarkers in Pancreatic Ductal Adenocarcinoma. Pancreas 2017; 46:311-322. [PMID: 27846146 DOI: 10.1097/mpa.0000000000000743] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 12/13/2022]
Abstract
OBJECTIVES Pancreatic ductal adenocarcinoma (PDAC) has been subclassified into 3 molecular subtypes: classical, quasi-mesenchymal, and exocrine-like. These subtypes exhibit differences in patient survival and drug resistance to conventional therapies. The aim of the current study is to identify novel subtype-specific protein biomarkers facilitating subtype stratification of patients with PDAC and novel therapy development. METHODS A set of 12 human patient-derived primary cell lines was used as a starting material for an advanced label-free proteomics approach leading to the identification of novel cell surface and secreted biomarkers. Cell surface protein identification was achieved by in vitro biotinylation, followed by mass spectrometric analysis of purified biotin-tagged proteins. Proteins secreted into a chemically defined serum-free cell culture medium were analyzed by shotgun proteomics. RESULTS Of 3288 identified proteins, 2 pan-PDAC (protocadherin-1 and lipocalin-2) and 2 exocrine-like-specific (cadherin-17 and galectin-4) biomarker candidates have been validated. Proximity ligation assay analysis of the 2 exocrine-like biomarkers revealed their co-localization on the surface of exocrine-like cells. CONCLUSIONS The study reports the identification and validation of novel PDAC biomarkers relevant for the development of patient stratification tools. In addition, cadherin-17 and galectin-4 may serve as targets for bispecific antibodies as novel therapeutics in PDAC.
|
36
|
A Golden Age for Working with Public Proteomics Data. Trends Biochem Sci 2017; 42:333-341. [PMID: 28118949 PMCID: PMC5414595 DOI: 10.1016/j.tibs.2017.01.001] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 10/02/2016] [Revised: 12/13/2016] [Accepted: 01/02/2017] [Indexed: 11/23/2022]
Abstract
Data sharing in mass spectrometry (MS)-based proteomics is becoming a common scientific practice, as is now common in the case of other, more mature ‘omics’ disciplines like genomics and transcriptomics. We want to highlight that this situation, unprecedented in the field, opens a plethora of opportunities for data scientists. First, we explain in some detail some of the work already achieved, such as systematic reanalysis efforts. We also explain existing applications of public proteomics data, such as proteogenomics and the creation of spectral libraries and spectral archives. Finally, we discuss the main existing challenges and mention the first attempts to combine public proteomics data with other types of omics data sets. The field of proteomics has matured and diversified substantially over the past 10 years. Proteomics data are increasingly shared through centralized, public repositories. Standardization efforts have ensured that a large proportion of these public data can be read and processed by any interested researcher. Because any proteomics data set is only partially understood, there is great opportunity for (orthogonal) reuse of public data. While public proteomics data has so far remained outside ethics and privacy discussions, recent work indicates that there is an inherent risk.
|
37
|
Rauscher B, Heigwer F, Breinig M, Winter J, Boutros M. GenomeCRISPR - a database for high-throughput CRISPR/Cas9 screens. Nucleic Acids Res 2017; 45:D679-D686. [PMID: 27789686 PMCID: PMC5210668 DOI: 10.1093/nar/gkw997] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 08/06/2016] [Revised: 10/12/2016] [Accepted: 10/14/2016] [Indexed: 12/12/2022] Open
Abstract
Over the past years, CRISPR/Cas9 mediated genome editing has developed into a powerful tool for modifying genomes in various organisms. In high-throughput screens, CRISPR/Cas9 mediated gene perturbations can be used for the systematic functional analysis of whole genomes. Discoveries from such screens provide a wealth of knowledge about gene to phenotype relationships in various biological model systems. However, a database resource to query results efficiently has been lacking. To this end, we developed GenomeCRISPR (http://genomecrispr.org), a database for genome-scale CRISPR/Cas9 screens. Currently, GenomeCRISPR contains data on more than 550 000 single guide RNAs (sgRNA) derived from 84 different experiments performed in 48 different human cell lines, comprising all screens in human cells using CRISPR/Cas published to date. GenomeCRISPR provides data mining options and tools, such as gene or genomic region search. Phenotypic and genome track views allow users to investigate and compare the results of different screens, or the impact of different sgRNAs on the gene of interest. An Application Programming Interface (API) allows for automated data access and batch download. As more screening data will become available, we also aim at extending the database to include functional genomic data from other organisms and enable cross-species comparisons.
Affiliation(s)
- Benedikt Rauscher
- German Cancer Research Center (DKFZ), Division Signaling and Functional Genomics and Heidelberg University, Department of Cell and Molecular Biology, Medical Faculty Mannheim, 69120 Heidelberg, Germany
| | - Florian Heigwer
- German Cancer Research Center (DKFZ), Division Signaling and Functional Genomics and Heidelberg University, Department of Cell and Molecular Biology, Medical Faculty Mannheim, 69120 Heidelberg, Germany
| | - Marco Breinig
- German Cancer Research Center (DKFZ), Division Signaling and Functional Genomics and Heidelberg University, Department of Cell and Molecular Biology, Medical Faculty Mannheim, 69120 Heidelberg, Germany
| | - Jan Winter
- German Cancer Research Center (DKFZ), Division Signaling and Functional Genomics and Heidelberg University, Department of Cell and Molecular Biology, Medical Faculty Mannheim, 69120 Heidelberg, Germany
| | - Michael Boutros
- German Cancer Research Center (DKFZ), Division Signaling and Functional Genomics and Heidelberg University, Department of Cell and Molecular Biology, Medical Faculty Mannheim, 69120 Heidelberg, Germany
- German Cancer Consortium (DKTK), 69120 Heidelberg, Germany
| |
|
38
|
Abstract
The main databases devoted stricto sensu to cancer cytogenetics are the "Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer" (http://cgap.nci.nih.gov/Chromosomes/Mitelman), the "Atlas of Genetics and Cytogenetics in Oncology and Haematology" (http://atlasgeneticsoncology.org), and COSMIC (http://cancer.sanger.ac.uk/cosmic). However, being a complex multistep process, cancer cytogenetics are broadened to "cytogenomics," with complementary resources on: general databases (nucleic acid and protein sequences databases; cartography browsers: GenBank, RefSeq, UCSC, Ensembl, UniProtKB, and Entrez Gene), cancer genomic portals associated with recent international integrated programs, such as TCGA or ICGC, other fusion genes databases, array CGH databases, copy number variation databases, and mutation databases. Other resources such as the International System for Human Cytogenomic Nomenclature (ISCN), the International Classification of Diseases for Oncology (ICD-O), and the Human Gene Nomenclature Database (HGNC) allow a common language. Data within the scientific/medical community should be freely available. However, most of the institutional stakeholders are now gradually disengaging, and well-known databases are forced to beg or to disappear (which may happen!).
|
39
|
Gilany K, Minai-Tehrani A, Amini M, Agharezaee N, Arjmand B. The Challenge of Human Spermatozoa Proteome: A Systematic Review. J Reprod Infertil 2017; 18:267-279. [PMID: 29062791 PMCID: PMC5641436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022] Open
Abstract
Currently, there are 20,197 human protein-coding genes in the most expertly curated database (UniProtKB/Swiss-Prot). Big efforts have been made by the international consortium, the Chromosome-Centric Human Proteome Project (C-HPP), and by independent researchers to map the human proteome. In brief, as of 2017 the human proteome was outlined. The male factor contributes to 50% of infertility in couples. However, there are limited human spermatozoa proteomic studies. Firstly, the development of the mapping of the human spermatozoa proteome was analyzed. Human spermatozoa have been used as a model for missing proteins. It has been shown that human spermatozoa are excellent sources for finding missing proteins. Y chromosome proteome mapping is led by Iran. However, it seems that it is extremely challenging to map the human spermatozoa Y chromosome proteins based on current mass spectrometry-based proteomics technology. Post-translational modifications (PTMs) of the human spermatozoa proteome are the most unexplored area, and currently the exact role of PTMs in male infertility is unknown. Additionally, the state of clinical human spermatozoa proteomic analysis as of 2017 was reviewed in this study.
Affiliation(s)
- Kambiz Gilany
- Reproductive Biotechnology Research Center, Avicenna Research Institute, ACECR, Tehran, Iran, Metabolomics and Genomics Research Center, Endocrinology and Metabolism Molecular Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran,Corresponding Author: Kambiz Gilany, Reproductive Biotechnology Research Center, Avicenna Research Institute, ACECR, Tehran, Iran, P.O. Box: 19615-1177 E-mail:
| | - Arash Minai-Tehrani
- Nanobiotechnology Research Center, Avicenna Research Institute, ACECR, Tehran, Iran
| | - Mehdi Amini
- Reproductive Biotechnology Research Center, Avicenna Research Institute, ACECR, Tehran, Iran
| | - Niloofar Agharezaee
- Reproductive Biotechnology Research Center, Avicenna Research Institute, ACECR, Tehran, Iran, Department of Genetics, Tehran Medical Sciences Branch, Islamic Azad University, Tehran, Iran
| | - Babak Arjmand
- Metabolomics and Genomics Research Center, Endocrinology and Metabolism Molecular Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran, Cell Therapy and Regenerative Medicine Research Center, Endocrinology and Metabolism Molecular Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
| |
|
40
|
Lisacek F, Mariethoz J, Alocci D, Rudd PM, Abrahams JL, Campbell MP, Packer NH, Ståhle J, Widmalm G, Mullen E, Adamczyk B, Rojas-Macias MA, Jin C, Karlsson NG. Databases and Associated Tools for Glycomics and Glycoproteomics. Methods Mol Biol 2017; 1503:235-264. [PMID: 27743371 DOI: 10.1007/978-1-4939-6493-2_18] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 12/03/2022]
Abstract
The access to biodatabases for glycomics and glycoproteomics has proven to be essential for current glycobiological research. This chapter presents available databases that are devoted to different aspects of glycobioinformatics. This includes oligosaccharide sequence databases, experimental databases, 3D structure databases (of both glycans and glycorelated proteins) and association of glycans with tissue, disease, and proteins. Specific search protocols are also provided using tools associated with experimental databases for converting primary glycoanalytical data to glycan structural information. In particular, researchers using glycoanalysis methods by U/HPLC (GlycoBase), MS (GlycoWorkbench, UniCarb-DB, GlycoDigest), and NMR (CASPER) will benefit from this chapter. In addition we also include information on how to utilize glycan structural information to query databases that associate glycans with proteins (UniCarbKB) and with interactions with pathogens (SugarBind).
Affiliation(s)
- Frederique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Julien Mariethoz
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Davide Alocci
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland
- Pauline M Rudd
- NIBRT GlycoScience Group, NIBRT-The National Institute for Bioprocessing Research and Training, Fosters Avenue, Mount Merrion, Blackrock, Co. Dublin, Ireland
- Jodie L Abrahams
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW, Australia
- Matthew P Campbell
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW, Australia
- Nicolle H Packer
- Biomolecular Frontiers Research Centre, Macquarie University, North Ryde, NSW, Australia
- Jonas Ståhle
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, Stockholm, Sweden
- Göran Widmalm
- Department of Organic Chemistry, Arrhenius Laboratory, Stockholm University, Stockholm, Sweden
- Barbara Adamczyk
- NIBRT GlycoScience Group, NIBRT-The National Institute for Bioprocessing Research and Training, Fosters Avenue, Mount Merrion, Blackrock, Co. Dublin, Ireland
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Box 440, 405 30, Gothenburg, Sweden
- Miguel A Rojas-Macias
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Box 440, 405 30, Gothenburg, Sweden
- Chunsheng Jin
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Box 440, 405 30, Gothenburg, Sweden
- Niclas G Karlsson
- Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Box 440, 405 30, Gothenburg, Sweden
|
41
|
Gligorijević V, Malod-Dognin N, Pržulj N. Integrative methods for analyzing big data in precision medicine. Proteomics 2016; 16:741-58. [PMID: 26677817 DOI: 10.1002/pmic.201500396] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2015] [Revised: 11/16/2015] [Accepted: 12/09/2015] [Indexed: 12/19/2022]
Abstract
We provide an overview of recent developments in big data analyses in the context of precision medicine and health informatics. With the advance in technologies capturing molecular and medical data, we entered the area of "Big Data" in biology and medicine. These data offer many opportunities to advance precision medicine. We outline key challenges in precision medicine and present recent advances in data integration-based methods to uncover personalized information from big data produced by various omics studies. We survey recent integrative methods for disease subtyping, biomarkers discovery, and drug repurposing, and list the tools that are available to domain scientists. Given the ever-growing nature of these big data, we highlight key issues that big data integration methods will face.
Affiliation(s)
- Nataša Pržulj
- Department of Computing, Imperial College London, London, UK
|
42
|
Gaudet P, Michel PA, Zahn-Zabal M, Britan A, Cusin I, Domagalski M, Duek PD, Gateau A, Gleizes A, Hinard V, Rech de Laval V, Lin J, Nikitin F, Schaeffer M, Teixeira D, Lane L, Bairoch A. The neXtProt knowledgebase on human proteins: 2017 update. Nucleic Acids Res 2016; 45:D177-D182. [PMID: 27899619 PMCID: PMC5210547 DOI: 10.1093/nar/gkw1062] [Citation(s) in RCA: 131] [Impact Index Per Article: 16.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2016] [Revised: 10/19/2016] [Accepted: 10/24/2016] [Indexed: 01/14/2023] Open
Abstract
The neXtProt human protein knowledgebase (https://www.nextprot.org) continues to add new content and tools, with a focus on proteomics and genetic variation data. neXtProt now has proteomics data for over 85% of the human proteins, as well as new tools tailored to the proteomics community. Moreover, the neXtProt release 2016-08-25 includes over 8000 phenotypic observations for over 4000 variations in a number of genes involved in hereditary cancers and channelopathies. These changes are presented in the current neXtProt update. All of the neXtProt data are available via our user interface and FTP site. We also provide API access and a SPARQL endpoint for more technical applications.
Affiliation(s)
- Pascale Gaudet
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206; Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland, 1206
- Pierre-André Michel
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Monique Zahn-Zabal
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Aurore Britan
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Isabelle Cusin
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Marcin Domagalski
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206; Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland, 1206
- Paula D Duek
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Alain Gateau
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Anne Gleizes
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Valérie Hinard
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Valentine Rech de Laval
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206; Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland, 1206
- JinJin Lin
- Sun Yat-sen University, 135 Xingang W Rd, Haizhu, Guangzhou, Guangdong, China
- Frederic Nikitin
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Mathieu Schaeffer
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206; Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland, 1206
- Daniel Teixeira
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206
- Lydie Lane
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206; Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland, 1206
- Amos Bairoch
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneva, Switzerland, 1206; Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland, 1206
|
43
|
Wang D, Yang L, Zhang P, LaBaer J, Hermjakob H, Li D, Yu X. AAgAtlas 1.0: a human autoantigen database. Nucleic Acids Res 2016; 45:D769-D776. [PMID: 27924021 PMCID: PMC5210642 DOI: 10.1093/nar/gkw946] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2016] [Revised: 09/22/2016] [Accepted: 10/11/2016] [Indexed: 12/25/2022] Open
Abstract
Autoantibodies refer to antibodies that target self-antigens, which can play pivotal roles in maintaining homeostasis, distinguishing normal from tumor tissue and trigger autoimmune diseases. In the last three decades, tremendous efforts have been devoted to elucidate the generation, evolution and functions of autoantibodies, as well as their target autoantigens. However, reports of these countless previously identified autoantigens are randomly dispersed in the literature. Here, we constructed an AAgAtlas database 1.0 using text-mining and manual curation. We extracted 45 830 autoantigen-related abstracts and 94 313 sentences from PubMed using the keywords of either ‘autoantigen’ or ‘autoantibody’ or their lexical variants, which were further refined to 25 520 abstracts, 43 253 sentences and 3984 candidates by our bio-entity recognizer based on the Protein Ontology. Finally, we identified 1126 genes as human autoantigens and 1071 related human diseases, with which we constructed a human autoantigen database (AAgAtlas database 1.0). The database provides a user-friendly interface to conveniently browse, retrieve and download human autoantigens as well as their associated diseases. The database is freely accessible at http://biokb.ncpsb.org/aagatlas/. We believe this database will be a valuable resource to track and understand human autoantigens as well as to investigate their functions in basic and translational research.
Affiliation(s)
- Dan Wang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
- Liuhui Yang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
- Ping Zhang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
- Joshua LaBaer
- The Virginia G. Piper Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
- Henning Hermjakob
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Dong Li
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
- Xiaobo Yu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences-Beijing (PHOENIX Center), Beijing Institute of Radiation Medicine, Beijing 102206, China
|
44
|
Vit O, Man P, Kadek A, Hausner J, Sklenar J, Harant K, Novak P, Scigelova M, Woffendin G, Petrak J. Large-scale identification of membrane proteins based on analysis of trypsin-protected transmembrane segments. J Proteomics 2016; 149:15-22. [DOI: 10.1016/j.jprot.2016.03.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2015] [Revised: 02/03/2016] [Accepted: 03/04/2016] [Indexed: 01/06/2023]
|
45
|
Garin-Muga A, Odriozola L, Martínez-Val A, Del Toro N, Martínez R, Molina M, Cantero L, Rivera R, Garrido N, Dominguez F, Sanchez Del Pino MM, Vizcaíno JA, Corrales FJ, Segura V. Detection of Missing Proteins Using the PRIDE Database as a Source of Mass Spectrometry Evidence. J Proteome Res 2016; 15:4101-4115. [PMID: 27581094 PMCID: PMC5099979 DOI: 10.1021/acs.jproteome.6b00437] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
The current catalogue of the human proteome is not yet complete, as experimental proteomics evidence is still elusive for a group of proteins known as the missing proteins. The Human Proteome Project (HPP) has been successfully using technology and bioinformatic resources to improve the characterization of such challenging proteins. In this manuscript, we propose a pipeline starting with the mining of the PRIDE database to select a group of data sets potentially enriched in missing proteins that are subsequently analyzed for protein identification with a method based on the statistical analysis of proteotypic peptides. Spermatozoa and the HEK293 cell line were found to be a promising source of missing proteins and clearly merit further attention in future studies. After the analysis of the selected samples, we found 342 PSMs, suggesting the presence of 97 missing proteins in human spermatozoa or the HEK293 cell line, while only 36 missing proteins were potentially detected in the retina, frontal cortex, aorta thoracica, or placenta. The functional analysis of the missing proteins detected confirmed their tissue specificity, and the validation of a selected set of peptides using targeted proteomics (SRM/MRM assays) further supports the utility of the proposed pipeline. As illustrative examples, DNAH3 and TEPP in spermatozoa, and UNCX and ATAD3C in HEK293 cells were some of the more robust and remarkable identifications in this study. We provide evidence indicating the relevance to carefully analyze the ever-increasing MS/MS data available from PRIDE and other repositories as sources for missing proteins detection in specific biological matrices as revealed for HEK293 cells.
Affiliation(s)
- Alba Garin-Muga
- Proteomics and Bioinformatics Unit, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain
- Leticia Odriozola
- Proteomics and Bioinformatics Unit, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain; IdiSNA, Navarra Institute for Health Research, 31008, Pamplona, Spain
- Ana Martínez-Val
- Proteomics Unit, Spanish National Cancer Research Centre, 28029, Madrid, Spain
- Noemí Del Toro
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, U.K.
- Rocío Martínez
- Proteomics and Bioinformatics Unit, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain
- Manuela Molina
- Proteomics and Bioinformatics Unit, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain
- Laura Cantero
- Proteomics Unit (SCSIE), University of Valencia, 46010, Valencia, Spain
- Rocío Rivera
- Andrology Laboratory and Sperm Bank, Instituto Universitario IVI, 46015, Valencia, Spain
- Nicolás Garrido
- Andrology Laboratory and Sperm Bank, Instituto Universitario IVI, 46015, Valencia, Spain
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, U.K.
- Fernando J Corrales
- Proteomics and Bioinformatics Unit, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain; IdiSNA, Navarra Institute for Health Research, 31008, Pamplona, Spain; Division of Hepatology and Gene Therapy, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain
- Victor Segura
- Proteomics and Bioinformatics Unit, Center for Applied Medical Research, University of Navarra, 31008, Pamplona, Spain; IdiSNA, Navarra Institute for Health Research, 31008, Pamplona, Spain
|
46
|
Matthews H, Hanison J, Nirmalan N. "Omics"-Informed Drug and Biomarker Discovery: Opportunities, Challenges and Future Perspectives. Proteomes 2016; 4:E28. [PMID: 28248238 PMCID: PMC5217350 DOI: 10.3390/proteomes4030028] [Citation(s) in RCA: 95] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2016] [Revised: 09/01/2016] [Accepted: 09/07/2016] [Indexed: 12/21/2022] Open
Abstract
The pharmaceutical industry faces unsustainable program failure despite significant increases in investment. Dwindling discovery pipelines, rapidly expanding R&D budgets and increasing regulatory control, predict significant gaps in the future drug markets. The cumulative duration of discovery from concept to commercialisation is unacceptably lengthy, and adds to the deepening crisis. Existing animal models predicting clinical translations are simplistic, highly reductionist and, therefore, not fit for purpose. The catastrophic consequences of ever-increasing attrition rates are most likely to be felt in the developing world, where resistance acquisition by killer diseases like malaria, tuberculosis and HIV have paced far ahead of new drug discovery. The coming of age of Omics-based applications makes available a formidable technological resource to further expand our knowledge of the complexities of human disease. The standardisation, analysis and comprehensive collation of the "data-heavy" outputs of these sciences are indeed challenging. A renewed focus on increasing reproducibility by understanding inherent biological, methodological, technical and analytical variables is crucial if reliable and useful inferences with potential for translation are to be achieved. The individual Omics sciences (genomics, transcriptomics, proteomics and metabolomics) have the singular advantage of being complementary for cross validation, and together could potentially enable a much-needed systems biology perspective of the perturbations underlying disease processes. If current adverse trends are to be reversed, it is imperative that a shift in the R&D focus from speed to quality is achieved. In this review, we discuss the potential implications of recent Omics-based advances for the drug development process.
Affiliation(s)
- Holly Matthews
- Department of Life Sciences, Faculty of Natural Sciences, Imperial College, London SW7 2AZ, UK
- James Hanison
- Manchester Royal Infirmary, Oxford Road, Greater Manchester M13 9WL, UK
- Niroshini Nirmalan
- Environment and Life Sciences, University of Salford, Greater Manchester M5 4WT, UK
|
47
|
Deutsch EW, Sun Z, Campbell DS, Binz PA, Farrah T, Shteynberg D, Mendoza L, Omenn GS, Moritz RL. Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics. J Proteome Res 2016; 15:4091-4100. [PMID: 27577934 DOI: 10.1021/acs.jproteome.6b00445] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances, a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ∼20,000 primary isoforms plus contaminants to a very large database that includes almost all nonredundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the discovered peptides against a more complex database. We have set up an automated system that downloads all the source databases on the first of each month and automatically generates a new set of search databases and makes them available for download at http://www.peptideatlas.org/thisp/.
Affiliation(s)
- Eric W Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Zhi Sun
- Institute for Systems Biology, Seattle, Washington 98109, United States
- David S Campbell
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Pierre-Alain Binz
- CHUV Centre Universitaire Hospitalier Vaudois, 1011 Lausanne, Switzerland
- Terry Farrah
- Institute for Systems Biology, Seattle, Washington 98109, United States
- David Shteynberg
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Luis Mendoza
- Institute for Systems Biology, Seattle, Washington 98109, United States
- Gilbert S Omenn
- Institute for Systems Biology, Seattle, Washington 98109, United States; Departments of Computational Medicine & Bioinformatics, Internal Medicine, Human Genetics and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109, United States
- Robert L Moritz
- Institute for Systems Biology, Seattle, Washington 98109, United States
|
48
|
Poverennaya EV, Kopylov AT, Ponomarenko EA, Ilgisonis EV, Zgoda VG, Tikhonova OV, Novikova SE, Farafonova TE, Kiseleva YY, Radko SP, Vakhrushev IV, Yarygin KN, Moshkovskii SA, Kiseleva OI, Lisitsa AV, Sokolov AS, Mazur AM, Prokhortchouk EB, Skryabin KG, Kostrjukova ES, Tyakht AV, Gorbachev AY, Ilina EN, Govorun VM, Archakov AI. State of the Art of Chromosome 18-Centric HPP in 2016: Transcriptome and Proteome Profiling of Liver Tissue and HepG2 Cells. J Proteome Res 2016; 15:4030-4038. [PMID: 27527821 DOI: 10.1021/acs.jproteome.6b00380] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023]
Abstract
A gene-centric approach was applied for a large-scale study of expression products of a single chromosome. Transcriptome profiling of liver tissue and HepG2 cell line was independently performed using two RNA-Seq platforms (SOLiD and Illumina) and also by Droplet Digital PCR (ddPCR) and quantitative RT-PCR. Proteome profiling was performed using shotgun LC-MS/MS as well as selected reaction monitoring with stable isotope-labeled standards (SRM/SIS) for liver tissue and HepG2 cells. On the basis of SRM/SIS measurements, protein copy numbers were estimated for the Chromosome 18 (Chr 18) encoded proteins in the selected types of biological material. These values were compared with expression levels of corresponding mRNA. As a result, we obtained information about 158 and 142 transcripts for HepG2 cell line and liver tissue, respectively. SRM/SIS measurements and shotgun LC-MS/MS allowed us to detect 91 Chr 18-encoded proteins in total, while an intersection between the HepG2 cell line and liver tissue proteomes was ∼66%. In total, there were 16 proteins specifically observed in HepG2 cell line, while 15 proteins were found solely in the liver tissue. Comparison between proteome and transcriptome revealed a poor correlation (R² ≈ 0.1) between corresponding mRNA and protein expression levels. The SRM and shotgun data sets (obtained during 2015-2016) are available in PASSEL (PASS00697) and ProteomeXchange/PRIDE (PXD004407). All measurements were also uploaded into the in-house Chr 18 Knowledgebase at http://kb18.ru/protein/matrix/416126.
Affiliation(s)
- Arthur T Kopylov
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Elena A Ponomarenko
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Victor G Zgoda
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Olga V Tikhonova
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Svetlana E Novikova
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Tatyana E Farafonova
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Yana Yu Kiseleva
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Sergey P Radko
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Igor V Vakhrushev
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Konstantin N Yarygin
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Sergei A Moshkovskii
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia; Pirogov Russian National Research Medical University, Ostrovitianov Str. 1, Moscow 117997, Russia
- Olga I Kiseleva
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Andrey V Lisitsa
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
- Alexey S Sokolov
- Center "Bioengineering" Russian Academy of Sciences, Prospect 60-let Oktyabrya, 7, Build.1, Moscow 119071, Russia
- Alexander M Mazur
- Center "Bioengineering" Russian Academy of Sciences, Prospect 60-let Oktyabrya, 7, Build.1, Moscow 119071, Russia
- Egor B Prokhortchouk
- Center "Bioengineering" Russian Academy of Sciences, Prospect 60-let Oktyabrya, 7, Build.1, Moscow 119071, Russia
- Konstantin G Skryabin
- Center "Bioengineering" Russian Academy of Sciences, Prospect 60-let Oktyabrya, 7, Build.1, Moscow 119071, Russia
- Elena S Kostrjukova
- Scientific Research Institute of Physical-Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow 119435, Russia
- Alexander V Tyakht
- Scientific Research Institute of Physical-Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow 119435, Russia
- Alexey Yu Gorbachev
- Scientific Research Institute of Physical-Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow 119435, Russia
- Elena N Ilina
- Scientific Research Institute of Physical-Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow 119435, Russia
- Vadim M Govorun
- Scientific Research Institute of Physical-Chemical Medicine, Malaya Pirogovskaya, 1a, Moscow 119435, Russia
- Alexander I Archakov
- Institute of Biomedical Chemistry, Pogodinskaya Street, 10, Moscow 119121, Russia
|
49
|
Deutsch EW, Overall CM, Van Eyk JE, Baker MS, Paik YK, Weintraub ST, Lane L, Martens L, Vandenbrouck Y, Kusebauch U, Hancock WS, Hermjakob H, Aebersold R, Moritz RL, Omenn GS. Human Proteome Project Mass Spectrometry Data Interpretation Guidelines 2.1. J Proteome Res 2016; 15:3961-3970. [PMID: 27490519 DOI: 10.1021/acs.jproteome.6b00392] [Citation(s) in RCA: 134] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
Abstract
Every data-rich community research effort requires a clear plan for ensuring the quality of the data interpretation and comparability of analyses. To address this need within the Human Proteome Project (HPP) of the Human Proteome Organization (HUPO), we have developed through broad consultation a set of mass spectrometry data interpretation guidelines that should be applied to all HPP data contributions. For submission of manuscripts reporting HPP protein identification results, the guidelines are presented as a one-page checklist containing 15 essential points followed by two pages of expanded description of each. Here we present an overview of the guidelines and provide an in-depth description of each of the 15 elements to facilitate understanding of the intentions and rationale behind the guidelines, for both authors and reviewers. Broadly, these guidelines provide specific directions regarding how HPP data are to be submitted to mass spectrometry data repositories, how error analysis should be presented, and how detection of novel proteins should be supported with additional confirmatory evidence. These guidelines, developed by the HPP community, are presented to the broader scientific community for further discussion.
Affiliation(s)
- Eric W Deutsch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States
- Christopher M Overall
- Centre for Blood Research, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology, Faculty of Dentistry, University of British Columbia, Vancouver, British Columbia V6T 1Z3, Canada
- Jennifer E Van Eyk
- Advanced Clinical Biosystems Research Institute, Department of Medicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Mark S Baker
- Department of Biomedical Sciences, Faculty of Medicine and Health Science, Macquarie University, Sydney, New South Wales 2109, Australia
- Young-Ki Paik
- Yonsei Proteome Research Center and Department of Biochemistry, Yonsei University, 50 Yonsei-ro, Sudaemoon-ku, Seoul 120-749, Korea
- Susan T Weintraub
- The University of Texas, Health Science Center at San Antonio, San Antonio, Texas 78229, United States
- Lydie Lane
- SIB Swiss Institute of Bioinformatics and Department of Human Protein Science, Faculty of Medicine, University of Geneva, CMU, Michel Servet 1, 1211 Geneva 4, Switzerland
- Lennart Martens
- Department of Medical Protein Research, VIB, Ghent 9052, Belgium; Department of Biochemistry, Ghent University, Ghent B-9000, Belgium
- Yves Vandenbrouck
- French Proteomics Infrastructure, Biosciences and Biotechnology Institute of Grenoble (BIG), Université Grenoble Alpes, CEA, INSERM, U1038 Grenoble, France
- Ulrike Kusebauch
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States
- William S Hancock
- Department of Chemical Biology, Northeastern University, Boston, Massachusetts 02115, United States
- Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom; National Center for Protein Sciences, Beijing 102206, China
- Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland; Faculty of Science, University of Zurich, 8006 Zurich, Switzerland
- Robert L Moritz
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States
- Gilbert S Omenn
- Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109, United States; Departments of Computational Medicine & Bioinformatics, Internal Medicine, and Human Genetics and School of Public Health, University of Michigan, Ann Arbor, Michigan 48109-2218, United States
|
50
|
Vandenbrouck Y, Lane L, Carapito C, Duek P, Rondel K, Bruley C, Macron C, Gonzalez de Peredo A, Couté Y, Chaoui K, Com E, Gateau A, Hesse AM, Marcellin M, Méar L, Mouton-Barbosa E, Robin T, Burlet-Schiltz O, Cianferani S, Ferro M, Fréour T, Lindskog C, Garin J, Pineau C. Looking for Missing Proteins in the Proteome of Human Spermatozoa: An Update. J Proteome Res 2016; 15:3998-4019. [PMID: 27444420 DOI: 10.1021/acs.jproteome.6b00400] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]
Abstract
The Chromosome-Centric Human Proteome Project (C-HPP) aims to identify "missing" proteins in the neXtProt knowledgebase. We present an in-depth proteomics analysis of the human sperm proteome to identify testis-enriched missing proteins. Using protein extraction procedures and LC-MS/MS analysis, we detected 235 proteins (PE2-PE4) for which no previous evidence of protein expression was annotated. Through LC-MS/MS and LC-PRM analysis, data mining, and immunohistochemistry, we confirmed the expression of 206 missing proteins (PE2-PE4) in line with current HPP guidelines (version 2.0). Parallel reaction monitoring acquisition and synthetic heavy labeled peptides targeted 36 "one-hit wonder" candidates selected based on prior peptide spectrum match assessment. 24 were validated with additional predicted and specifically targeted peptides. Evidence was found for 16 more missing proteins using immunohistochemistry on human testis sections. The expression pattern for some of these proteins was specific to the testis, and they could possibly be valuable markers with fertility assessment applications. Strong evidence was also found of four "uncertain" proteins (PE5); their status should be re-examined. We show how using a range of sample preparation techniques combined with MS-based analysis, expert knowledge, and complementary antibody-based techniques can produce data of interest to the community. All MS/MS data are available via ProteomeXchange under identifier PXD003947. In addition to contributing to the C-HPP, we hope these data will stimulate continued exploration of the sperm proteome.
Affiliation(s)
- Yves Vandenbrouck
- CEA, DRF, BIG, Laboratoire de Biologie à Grande Echelle, 17 rue des martyrs, Grenoble F-38054, France; Inserm U1038, 17, rue des Martyrs, Grenoble F-38054, France; Université de Grenoble, Grenoble F-38054, France
- Lydie Lane
- Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, 1, rue Michel-Servet, 1211 Geneva 4, Switzerland; CALIPHO Group, SIB-Swiss Institute of Bioinformatics, CMU, rue Michel-Servet 1, CH-1211 Geneva 4, Switzerland
- Christine Carapito
- Laboratoire de Spectrométrie de Masse BioOrganique (LSMBO), IPHC, Université de Strasbourg, CNRS UMR7178, 25 Rue Becquerel, 67087 Strasbourg, France
- Paula Duek
- CALIPHO Group, SIB-Swiss Institute of Bioinformatics, CMU, rue Michel-Servet 1, CH-1211 Geneva 4, Switzerland
- Karine Rondel
- Protim, Inserm U1085, Irset, Campus de Beaulieu, Rennes 35042, France
- Christophe Bruley
- CEA, DRF, BIG, Laboratoire de Biologie à Grande Echelle, 17 rue des martyrs, Grenoble F-38054, France; Inserm U1038, 17, rue des Martyrs, Grenoble F-38054, France; Université de Grenoble, Grenoble F-38054, France
- Charlotte Macron
- Laboratoire de Spectrométrie de Masse BioOrganique (LSMBO), IPHC, Université de Strasbourg, CNRS UMR7178, 25 Rue Becquerel, 67087 Strasbourg, France
- Anne Gonzalez de Peredo
- Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
- Yohann Couté
- CEA, DRF, BIG, Laboratoire de Biologie à Grande Echelle, 17 rue des martyrs, Grenoble F-38054, France; Inserm U1038, 17, rue des Martyrs, Grenoble F-38054, France; Université de Grenoble, Grenoble F-38054, France
- Karima Chaoui
- Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
- Emmanuelle Com
- Protim, Inserm U1085, Irset, Campus de Beaulieu, Rennes 35042, France
- Alain Gateau
- CALIPHO Group, SIB-Swiss Institute of Bioinformatics, CMU, rue Michel-Servet 1, CH-1211 Geneva 4, Switzerland
- Anne-Marie Hesse
- CEA, DRF, BIG, Laboratoire de Biologie à Grande Echelle, 17 rue des martyrs, Grenoble F-38054, France; Inserm U1038, 17, rue des Martyrs, Grenoble F-38054, France; Université de Grenoble, Grenoble F-38054, France
- Marlene Marcellin
- Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
- Loren Méar
- Protim, Inserm U1085, Irset, Campus de Beaulieu, Rennes 35042, France
- Emmanuelle Mouton-Barbosa
- Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
- Thibault Robin
- Proteome Informatics Group, Centre Universitaire d'Informatique, Route de Drize 7, 1227 Carouge, CH, Switzerland
- Odile Burlet-Schiltz
- Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, UPS, 31062 Toulouse, France
- Sarah Cianferani
- Laboratoire de Spectrométrie de Masse BioOrganique (LSMBO), IPHC, Université de Strasbourg, CNRS UMR7178, 25 Rue Becquerel, 67087 Strasbourg, France
- Myriam Ferro
- CEA, DRF, BIG, Laboratoire de Biologie à Grande Echelle, 17 rue des martyrs, Grenoble F-38054, France; Inserm U1038, 17, rue des Martyrs, Grenoble F-38054, France; Université de Grenoble, Grenoble F-38054, France
- Thomas Fréour
- Service de Médecine de la Reproduction, CHU de Nantes, 38 boulevard Jean Monnet, 44093 Nantes cedex, France; INSERM UMR1064, Nantes 44093, France
- Cecilia Lindskog
- CEA, DRF, BIG, Laboratoire de Biologie à Grande Echelle, 17 rue des martyrs, Grenoble F-38054, France; Inserm U1038, 17, rue des Martyrs, Grenoble F-38054, France
- Jérôme Garin
- CEA, DRF, BIG, Laboratoire de Biologie à Grande Echelle, 17 rue des martyrs, Grenoble F-38054, France; Inserm U1038, 17, rue des Martyrs, Grenoble F-38054, France; Université de Grenoble, Grenoble F-38054, France
- Charles Pineau
- Protim, Inserm U1085, Irset, Campus de Beaulieu, Rennes 35042, France