1
Tsantilas KA, Merrihew GE, Robbins JE, Johnson RS, Park J, Plubell DL, Canterbury JD, Huang E, Riffle M, Sharma V, MacLean BX, Eckels J, Wu CC, Bereman MS, Spencer SE, Hoofnagle AN, MacCoss MJ. A Framework for Quality Control in Quantitative Proteomics. J Proteome Res 2024. [PMID: 39248652 DOI: 10.1021/acs.jproteome.4c00363] [Indexed: 09/10/2024]
Abstract
A thorough evaluation of the quality, reproducibility, and variability of bottom-up proteomics data is necessary at every stage of a workflow, from planning to analysis. We share vignettes applying adaptable quality control (QC) measures to assess sample preparation, system function, and quantitative analysis. System suitability samples are repeatedly measured longitudinally with targeted methods, and we share examples where they are used on three instrument platforms to identify severe system failures and track function over months to years. Internal QCs incorporated at the protein and peptide levels allow our team to assess sample preparation issues and to differentiate system failures from sample-specific issues. External QC samples prepared alongside our experimental samples are used to verify the consistency and quantitative potential of our results during batch correction and normalization before assessing biological phenotypes. We combine these controls with rapid analysis (Skyline), longitudinal QC metrics (AutoQC), and server-based data deposition (PanoramaWeb). We propose that this integrated approach to QC is a useful starting point for groups to facilitate rapid quality control assessment to ensure that valuable instrument time is used to collect the best quality data possible. Data are available on Panorama Public and ProteomeXchange under the identifier PXD051318.
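As a rough illustration of how a system suitability metric might be tracked over time, the sketch below flags runs that fall outside control limits derived from a set of baseline runs. It is a minimal example under stated assumptions, not the method used by the authors or by AutoQC: the 3 SD limit, the function name `flag_outliers`, and all peak areas are hypothetical.

```python
from statistics import mean, stdev

def flag_outliers(baseline, new_runs, n_sd=3.0):
    """Return indices of new_runs outside mean +/- n_sd * SD of the baseline runs."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    lower = center - n_sd * spread
    upper = center + n_sd * spread
    return [i for i, value in enumerate(new_runs) if not lower <= value <= upper]

# Hypothetical peak areas (arbitrary units) for one peptide in a
# system suitability sample measured repeatedly on the same instrument.
baseline_areas = [100.0, 102.0, 98.0, 101.0, 99.0]
later_areas = [100.5, 97.5, 60.0, 101.5]

flagged = flag_outliers(baseline_areas, later_areas)
print(flagged)
```

A flagged run would prompt a closer look at the instrument before more experimental samples are acquired; the threshold itself is a judgment call that each laboratory would need to set.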
Affiliation(s)
- Kristine A Tsantilas
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Gennifer E Merrihew
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Julia E Robbins
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Richard S Johnson
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Jea Park
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Deanna L Plubell
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Jesse D Canterbury
  - Thermo Fisher Scientific, 355 River Oaks Parkway, San Jose, California 95134, United States
- Eric Huang
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Michael Riffle
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, United States
- Vagisha Sharma
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Brendan X MacLean
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Josh Eckels
  - LabKey, 500 Union St #1000, Seattle, Washington 98101, United States
- Christine C Wu
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Michael S Bereman
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27607, United States
- Sandra E Spencer
  - Canada's Michael Smith Genome Sciences Centre (BC Cancer Research Institute), University of British Columbia, Vancouver, British Columbia V5Z 4S6, Canada
- Andrew N Hoofnagle
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, Washington 98195, United States
- Michael J MacCoss
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
2
Jiang Y, Rex DA, Schuster D, Neely BA, Rosano GL, Volkmar N, Momenzadeh A, Peters-Clarke TM, Egbert SB, Kreimer S, Doud EH, Crook OM, Yadav AK, Vanuopadath M, Hegeman AD, Mayta M, Duboff AG, Riley NM, Moritz RL, Meyer JG. Comprehensive Overview of Bottom-Up Proteomics Using Mass Spectrometry. ACS Measurement Science Au 2024; 4:338-417. [PMID: 39193565 PMCID: PMC11348894 DOI: 10.1021/acsmeasuresciau.3c00068] [Received: 11/15/2023] [Revised: 05/03/2024] [Accepted: 05/03/2024] [Indexed: 08/29/2024]
Abstract
Proteomics is the large-scale study of protein structure and function from biological systems through protein identification and quantification. "Shotgun proteomics" or "bottom-up proteomics" is the prevailing strategy, in which proteins are hydrolyzed into peptides that are analyzed by mass spectrometry. Proteomics can be applied to diverse studies ranging from simple protein identification to studies of proteoforms, protein-protein interactions, protein structural alterations, absolute and relative protein quantification, post-translational modifications, and protein stability. To enable this range of different experiments, there are diverse strategies for proteome analysis. The nuances of how proteomic workflows differ may be challenging to understand for new practitioners. Here, we provide a comprehensive overview of different proteomics methods. We cover topics ranging from biochemistry basics and protein extraction to biological interpretation and orthogonal validation. We expect this Review will serve as a handbook for researchers who are new to the field of bottom-up proteomics.
Affiliation(s)
- Yuming Jiang
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Devasahayam Arokia Balaya Rex
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India
- Dina Schuster
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Molecular Biology and Biophysics, ETH Zurich, Zurich 8093, Switzerland
  - Laboratory of Biomolecular Research, Division of Biology and Chemistry, Paul Scherrer Institute, Villigen 5232, Switzerland
- Benjamin A. Neely
  - Chemical Sciences Division, National Institute of Standards and Technology, NIST, Charleston, South Carolina 29412, United States
- Germán L. Rosano
  - Mass Spectrometry Unit, Institute of Molecular and Cellular Biology of Rosario, Rosario, 2000 Argentina
- Norbert Volkmar
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich 8093, Switzerland
- Amanda Momenzadeh
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Trenton M. Peters-Clarke
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, 94158, United States
- Susan B. Egbert
  - Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, R3T 2N2 Canada
- Simion Kreimer
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
- Emma H. Doud
  - Center for Proteome Analysis, Indiana University School of Medicine, Indianapolis, Indiana, 46202-3082, United States
- Oliver M. Crook
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
- Amit Kumar Yadav
  - Translational Health Science and Technology Institute, NCR Biotech Science Cluster 3rd Milestone Faridabad-Gurgaon Expressway, Faridabad, Haryana 121001, India
- Adrian D. Hegeman
  - Departments of Horticultural Science and Plant and Microbial Biology, University of Minnesota, Twin Cities, Minnesota 55108, United States
- Martín L. Mayta
  - School of Medicine and Health Sciences, Center for Health Sciences Research, Universidad Adventista del Plata, Libertador San Martin 3103, Argentina
  - Molecular Biology Department, School of Pharmacy and Biochemistry, Universidad Nacional de Rosario, Rosario 2000, Argentina
- Anna G. Duboff
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Nicholas M. Riley
  - Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Robert L. Moritz
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Jesse G. Meyer
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Smidt Heart Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
  - Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, California 90048, United States
3
Kim H, Huh S, Park J, Han Y, Ahn KG, Noh Y, Lee SJ, Chu H, Kim SS, Jung HS, Yun WG, Cho YJ, Kwon W, Jang JY, Kang UB. Development of a Fit-For-Purpose Multi-Marker Panel for Early Diagnosis of Pancreatic Ductal Adenocarcinoma. Mol Cell Proteomics 2024; 23:100824. [PMID: 39097268 DOI: 10.1016/j.mcpro.2024.100824] [Received: 02/02/2024] [Revised: 07/28/2024] [Accepted: 07/31/2024] [Indexed: 08/05/2024]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) suffers from a lack of an effective diagnostic method, which hampers improvement in patient survival. Carbohydrate antigen 19-9 (CA19-9) is the only FDA-approved blood biomarker for PDAC, yet its clinical utility is limited due to suboptimal performance. Liquid chromatography-mass spectrometry (LC-MS) has emerged as a burgeoning technology in clinical proteomics for the discovery, verification, and validation of novel biomarkers. A plethora of protein biomarker candidates for PDAC have been identified using LC-MS, yet few have successfully transitioned into clinical practice. This translational standstill is owed partly to insufficient consideration of the practical needs and perspectives of clinical implementation during biomarker development pipelines, such as demonstrating the analytical robustness of proposed biomarkers, which is critical for transitioning from research-grade to clinical-grade assays. Moreover, the throughput and cost-effectiveness of proposed assays ought to be considered concomitantly from the early phases of the biomarker pipelines to enhance widespread adoption in clinical settings. Here, we developed a fit-for-purpose multi-marker panel for PDAC diagnosis by consolidating analytically robust biomarkers as well as employing a relatively simple LC-MS protocol. In the discovery phase, we comprehensively surveyed putative PDAC biomarkers from both in-house data and prior studies. In the verification phase, we developed a multiple-reaction monitoring (MRM)-MS-based proteomic assay using surrogate peptides that passed stringent analytical validation tests. We adopted a high-throughput protocol including a short gradient (<10 min) and simple sample preparation (no depletion or enrichment steps). Additionally, we developed our assay using serum samples, which are usually the preferred biospecimen in clinical settings.
We developed predictive models based on our final panel of 12 protein biomarkers combined with CA19-9, which showed improved diagnostic performance compared to using CA19-9 alone in discriminating PDAC from non-PDAC controls including healthy individuals and patients with benign pancreatic diseases. A large-scale clinical validation is underway to demonstrate the clinical validity of our novel panel.
Affiliation(s)
- Hyeonji Kim
  - Bertis R&D Division, Bertis Inc, Gyeonggi-do, Republic of Korea
- Sunghyun Huh
  - Bertis R&D Division, Bertis Inc, Gyeonggi-do, Republic of Korea
- Youngmin Han
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Kyung-Geun Ahn
  - Bertis R&D Division, Bertis Inc, Gyeonggi-do, Republic of Korea
- Yiyoung Noh
  - Bertis R&D Division, Bertis Inc, Gyeonggi-do, Republic of Korea
- Seong-Jae Lee
  - Bertis R&D Division, Bertis Inc, Gyeonggi-do, Republic of Korea
- Hyosub Chu
  - Bertis R&D Division, Bertis Inc, Gyeonggi-do, Republic of Korea
- Sung-Soo Kim
  - Manufacturing and Technology Division, Bertis Inc, Gyeonggi-do, Republic of Korea
- Hye-Sol Jung
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Won-Gun Yun
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Young Jae Cho
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Wooil Kwon
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Jin-Young Jang
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Un-Beom Kang
  - Bertis R&D Division, Bertis Inc, Gyeonggi-do, Republic of Korea
4
Wang Q, Ding X, Xu Z, Wang B, Wang A, Wang L, Ding Y, Song S, Chen Y, Zhang S, Jiang L, Ding X. The mouse multi-organ proteome from infancy to adulthood. Nat Commun 2024; 15:5752. [PMID: 38982135 PMCID: PMC11233712 DOI: 10.1038/s41467-024-50183-6] [Received: 04/21/2023] [Accepted: 07/03/2024] [Indexed: 07/11/2024]
Abstract
Early-life organ development and maturation shape the fundamental blueprint for later-life phenotype. However, a multi-organ proteome atlas from infancy to adulthood is currently not available. Herein, we present a comprehensive proteomic analysis of ten mouse organs (brain, heart, lung, liver, kidney, spleen, stomach, intestine, muscle and skin) at three crucial developmental stages (1, 4 and 8 weeks after birth) acquired using data-independent acquisition mass spectrometry. We detect and quantify 11,533 protein groups across the ten organs and obtain 115 age-related differentially expressed protein groups that are co-expressed in all organs from infancy to adulthood. We find that spliceosome proteins prevalently play crucial regulatory roles in the early-life development of multiple organs, and detect organ-specific expression patterns and sexual dimorphism. This multi-organ proteome atlas provides a fundamental resource for understanding the molecular mechanisms underlying early-life organ development and maturation.
Affiliation(s)
- Qingwen Wang
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Xinwen Ding
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Zhixiao Xu
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Boqian Wang
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Aiting Wang
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Liping Wang
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yi Ding
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Sunfengda Song
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Youming Chen
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shuang Zhang
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
- Lai Jiang
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Xianting Ding
  - Department of Anesthesiology and Surgical Intensive Care Unit, Xinhua Hospital, School of Medicine and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
  - State Key Laboratory of Oncogenes and Related Genes, Institute for Personalized Medicine, Shanghai Jiao Tong University, Shanghai, China
5
Karpov OA, Stotland A, Raedschelders K, Chazarin B, Ai L, Murray CI, Van Eyk JE. Proteomics of the heart. Physiol Rev 2024; 104:931-982. [PMID: 38300522 PMCID: PMC11381016 DOI: 10.1152/physrev.00026.2023] [Received: 07/03/2023] [Revised: 12/25/2023] [Accepted: 01/14/2024] [Indexed: 02/02/2024]
Abstract
Mass spectrometry-based proteomics is a sophisticated identification tool specializing in portraying protein dynamics at a molecular level. Proteomics provides biologists with a snapshot of context-dependent protein and proteoform expression, structural conformations, dynamic turnover, and protein-protein interactions. Cardiac proteomics can offer a broader and deeper understanding of the molecular mechanisms that underlie cardiovascular disease, and it is foundational to the development of future therapeutic interventions. This review encapsulates the evolution, current technologies, and future perspectives of proteomic-based mass spectrometry as it applies to the study of the heart. Key technological advancements have allowed researchers to study proteomes at a single-cell level and employ robot-assisted automation systems for enhanced sample preparation techniques, and the increase in fidelity of the mass spectrometers has allowed for the unambiguous identification of numerous dynamic posttranslational modifications. Animal models of cardiovascular disease, ranging from early animal experiments to current sophisticated models of heart failure with preserved ejection fraction, have provided the tools to study a challenging organ in the laboratory. Further technological development will pave the way for bringing proteomics closer to the clinical setting, allowing not only scientists but also patients to benefit from an understanding of protein interplay as it relates to cardiac disease physiology.
Affiliation(s)
- Oleg A Karpov
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
- Aleksandr Stotland
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
- Koen Raedschelders
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
- Blandine Chazarin
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
- Lizhuo Ai
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
- Christopher I Murray
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
- Jennifer E Van Eyk
  - Smidt Heart Institute, Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States
6
Webel H, Niu L, Nielsen AB, Locard-Paulet M, Mann M, Jensen LJ, Rasmussen S. Imputation of label-free quantitative mass spectrometry-based proteomics data using self-supervised deep learning. Nat Commun 2024; 15:5405. [PMID: 38926340 PMCID: PMC11208500 DOI: 10.1038/s41467-024-48711-5] [Received: 02/01/2023] [Accepted: 05/13/2024] [Indexed: 06/28/2024]
Abstract
Imputation techniques provide a means to replace missing measurements with a value and are used in almost all downstream analyses of mass spectrometry (MS) based proteomics data using label-free quantification (LFQ). Here we demonstrate how collaborative filtering, denoising autoencoders, and variational autoencoders can impute missing values in the context of LFQ at different levels. We applied our method, proteomics imputation modeling mass spectrometry (PIMMS), to an alcohol-related liver disease (ALD) cohort with blood plasma proteomics data available for 358 individuals. After removing 20 percent of the intensities, we were able to recover 15 out of 17 significant abundant protein groups using PIMMS-VAE imputations. When analyzing the full dataset, we identified 30 additional proteins (+13.2%) that were significantly differentially abundant across disease stages compared to no imputation and found that some of these were predictive of ALD progression in machine learning models. We therefore suggest the use of deep learning approaches for imputing missing values in MS-based proteomics on larger datasets and provide workflows for these.
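The evaluation idea described in this abstract, hiding a fraction of observed intensities and checking how well they are recovered, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not PIMMS: a per-protein median imputer is swapped in for the deep learning models, and the function names and the small intensity matrix are hypothetical.

```python
import random
from statistics import median

def mask_values(matrix, fraction, seed=0):
    """Hide a random fraction of cells; return the masked copy and the hidden cell set."""
    rng = random.Random(seed)
    cells = [(r, c) for r, row in enumerate(matrix) for c in range(len(row))]
    hidden = set(rng.sample(cells, int(round(fraction * len(cells)))))
    masked = [[None if (r, c) in hidden else v for c, v in enumerate(row)]
              for r, row in enumerate(matrix)]
    return masked, hidden

def impute_column_median(masked):
    """Fill each missing cell with the median of the observed values in its column."""
    col_medians = []
    for c in range(len(masked[0])):
        observed = [row[c] for row in masked if row[c] is not None]
        # Fallback of 0.0 for a fully missing column is an arbitrary assumption.
        col_medians.append(median(observed) if observed else 0.0)
    return [[col_medians[c] if v is None else v for c, v in enumerate(row)]
            for row in masked]

def mean_absolute_error(truth, imputed, hidden):
    """Average absolute difference between true and imputed values on hidden cells."""
    return sum(abs(truth[r][c] - imputed[r][c]) for r, c in hidden) / len(hidden)

# Hypothetical log2 intensities: rows are samples, columns are protein groups.
truth = [
    [20.1, 18.4, 22.0, 19.5],
    [20.3, 18.0, 21.7, 19.9],
    [19.8, 18.6, 22.4, 19.2],
    [20.6, 18.2, 21.9, 19.7],
    [20.0, 18.5, 22.1, 19.4],
]
masked, hidden = mask_values(truth, fraction=0.2, seed=0)
imputed = impute_column_median(masked)
error = mean_absolute_error(truth, imputed, hidden)
print(f"hidden cells: {len(hidden)}, mean absolute error: {error:.3f}")
```

Any imputation method can be dropped in place of `impute_column_median` and compared on the same hidden cells, which is what makes this kind of masking useful for benchmarking.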
Affiliation(s)
- Henry Webel
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Lili Niu
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Annelaura Bach Nielsen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Marie Locard-Paulet
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
  - Institut de Pharmacologie et de Biologie Structurale (IPBS), Université de Toulouse, CNRS, Université Toulouse III - Paul Sabatier (UT3), Toulouse, France
- Matthias Mann
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
  - Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Lars Juhl Jensen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
- Simon Rasmussen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N, Denmark
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
7
Chen L, Zhang Z, Matsumoto C, Gao Y. High-Throughput Proteomics Enabled by a Fully Automated Dual-Trap and Dual-Column LC-MS. Anal Chem 2024; 96:9761-9766. [PMID: 38887087 DOI: 10.1021/acs.analchem.3c03182] [Indexed: 06/20/2024]
Abstract
This Technical Note describes a dual-column liquid chromatography system coupled to mass spectrometry (LC-MS) for high-throughput bottom-up proteomic analysis. This system made full use of two 2-position 10-port valves and a binary pump with an integrated loading pump of a commercial LC instrument to provide successive operation of two parallel subsystems. Each subsystem consisted of a set of trap columns and an analytical column. A T-junction union was used to split the mobile phase from the loading pump into two parts. This allowed one set of columns to be washed and equilibrated, followed by the injection of the next sample, while the previous sample was eluting and being analyzed on the other set of columns, thereby greatly increasing the analysis throughput. This approach showed high reproducibility for the analysis of HeLa tryptic digests with average relative standard deviation (RSD) values of 1.75%, 6.90%, and 5.19% for the number of identified proteins, peptides, and peptide-spectrum matches (PSMs), respectively, across 10 consecutive runs. The capacity for peptide and protein identification, as well as proteome depth, of the dual-column LC system was comparable to a conventional single-column system. Due to its simple equipment requirements and setup process, this method should be highly accessible for other laboratories.
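For readers unfamiliar with the reproducibility metric quoted above, RSD is the standard deviation expressed as a percentage of the mean. The sketch below shows the calculation for identification counts across replicate runs. It assumes the sample standard deviation (n - 1 denominator), which the abstract does not specify, and the counts are hypothetical rather than the study's data.

```python
from statistics import mean, stdev

def relative_standard_deviation(values):
    """RSD in percent: sample standard deviation divided by the mean, times 100."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical numbers of identified proteins in 10 consecutive runs.
protein_counts = [4000, 4100, 3950, 4050, 4020, 3980, 4075, 4010, 3990, 4060]

rsd = relative_standard_deviation(protein_counts)
print(f"RSD of protein identifications: {rsd:.2f}%")
```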
Affiliation(s)
- Liang Chen
  - College of Pharmacy, University of Illinois at Chicago, Chicago, Illinois 60612, United States
- Ziwei Zhang
  - College of Pharmacy, University of Illinois at Chicago, Chicago, Illinois 60612, United States
- Cory Matsumoto
  - College of Pharmacy, University of Illinois at Chicago, Chicago, Illinois 60612, United States
- Yu Gao
  - College of Pharmacy, University of Illinois at Chicago, Chicago, Illinois 60612, United States
8
Pelletier SJ, Leclercq M, Roux-Dalvai F, de Geus MB, Leslie S, Wang W, Lam TT, Nairn AC, Arnold SE, Carlyle BC, Precioso F, Droit A. BERNN: Enhancing classification of Liquid Chromatography Mass Spectrometry data with batch effect removal neural networks. Nat Commun 2024; 15:3777. [PMID: 38710683 PMCID: PMC11074280 DOI: 10.1038/s41467-024-48177-5] [Received: 06/28/2023] [Accepted: 04/24/2024] [Indexed: 05/08/2024]
Abstract
Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful method for profiling complex biological samples. However, batch effects typically arise from differences in sample processing protocols, experimental conditions, and data acquisition techniques, significantly impacting the interpretability of results. Correcting batch effects is crucial for the reproducibility of omics research, but current methods are not optimal for the removal of batch effects without compressing the genuine biological variation under study. We propose a suite of Batch Effect Removal Neural Networks (BERNN) to remove batch effects in large LC-MS experiments, with the goal of maximizing sample classification performance between conditions. More importantly, these models must efficiently generalize in batches not seen during training. A comparison of batch effect correction methods across five diverse datasets demonstrated that BERNN models consistently showed the strongest sample classification performance. However, the model producing the greatest classification improvements did not always perform best in terms of batch effect removal. Finally, we show that the overcorrection of batch effects resulted in the loss of some essential biological variability. These findings highlight the importance of balancing batch effect removal while preserving valuable biological diversity in large-scale LC-MS experiments.
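To make the notion of a batch effect concrete, the sketch below applies per-batch median centering to log intensities of a single analyte. This is a deliberately simple baseline chosen for illustration, not one of the BERNN models, and the function name, intensities, and batch labels are hypothetical.

```python
from statistics import median

def median_center_by_batch(values, batches):
    """Shift each batch so its median matches the global median of all values."""
    global_median = median(values)
    batch_medians = {
        b: median([v for v, vb in zip(values, batches) if vb == b])
        for b in set(batches)
    }
    return [v - batch_medians[b] + global_median for v, b in zip(values, batches)]

# Hypothetical log2 intensities of one analyte measured in two batches.
log_intensities = [10.0, 10.2, 9.8, 12.0, 12.2, 11.8]
batch_labels = ["A", "A", "A", "B", "B", "B"]

corrected = median_center_by_batch(log_intensities, batch_labels)
print(corrected)
```

If the biological conditions were confounded with the batches, this centering would remove the real difference along with the technical one, which is the kind of overcorrection the abstract warns about.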
Affiliation(s)
- Simon J Pelletier
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Mickaël Leclercq
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Florence Roux-Dalvai
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
  - Proteomics Platform, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
- Matthijs B de Geus
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
  - Leiden University Medical Center, Leiden, The Netherlands
- Shannon Leslie
  - Yale Department of Psychiatry, New Haven, CT, USA
  - Janssen Pharmaceuticals, San Diego, CA, USA
- Weiwei Wang
  - Keck MS & Proteomics Resource, Yale School of Medicine, New Haven, CT, USA
- TuKiet T Lam
  - Keck MS & Proteomics Resource, Yale School of Medicine, New Haven, CT, USA
  - Yale School of Medicine, Department of Molecular Biophysics and Biochemistry, New Haven, CT, USA
- Steven E Arnold
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
- Becky C Carlyle
  - Massachusetts General Hospital Department of Neurology, Charlestown, MA, USA
  - Oxford University Department of Physiology Anatomy and Genetics, Oxford, UK
  - Kavli Institute for Nanoscience Discovery, Oxford, UK
- Frédéric Precioso
  - Université Côte d'Azur, CNRS, INRIA, I3S, Sophia Antipolis, Nice, France
- Arnaud Droit
  - Computational Biology Laboratory, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
  - Proteomics Platform, CHU de Québec - Université Laval Research Center, Québec City, QC, Canada
Collapse
|
9
|
Nebauer DJ, Pearson LA, Neilan BA. Critical steps in an environmental metaproteomics workflow. Environ Microbiol 2024; 26:e16637. [PMID: 38760994 DOI: 10.1111/1462-2920.16637] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/30/2024] [Indexed: 05/20/2024]
Abstract
Environmental metaproteomics is a rapidly advancing field that provides insights into the structure, dynamics, and metabolic activity of microbial communities. As the field is still maturing, it lacks consistent workflows, making it challenging for non-expert researchers to navigate. This review aims to introduce the workflow of environmental metaproteomics. It outlines the standard practices for sample collection, processing, and analysis, and offers strategies to overcome the unique challenges presented by common environmental matrices such as soil, freshwater, marine environments, biofilms, sludge, and symbionts. The review also highlights the bottlenecks in data analysis that are specific to metaproteomics samples and provides suggestions for researchers to obtain high-quality datasets. It includes recent benchmarking studies and descriptions of software packages specifically built for metaproteomics analysis. The article is written without assuming the reader's familiarity with single-organism proteomic workflows, making it accessible to those new to proteomics or mass spectrometry in general. This primer for environmental metaproteomics aims to improve accessibility to this exciting technology and empower researchers to tackle challenging and ambitious research questions. While it is primarily a resource for those new to the field, it should also be useful for established researchers looking to streamline or troubleshoot their metaproteomics experiments.
Affiliation(s)
- Daniel J Nebauer
- School of Environmental and Life Sciences, The University of Newcastle, Callaghan, New South Wales, Australia
- Centre of Excellence in Synthetic Biology, Australian Research Council, Sydney, New South Wales, Australia
- Leanne A Pearson
- School of Environmental and Life Sciences, The University of Newcastle, Callaghan, New South Wales, Australia
- Centre of Excellence in Synthetic Biology, Australian Research Council, Sydney, New South Wales, Australia
- Brett A Neilan
- School of Environmental and Life Sciences, The University of Newcastle, Callaghan, New South Wales, Australia
- Centre of Excellence in Synthetic Biology, Australian Research Council, Sydney, New South Wales, Australia
10
Li L, Mayne J, Beltran A, Zhang X, Ning Z, Figeys D. RapidAIM 2.0: a high-throughput assay to study functional response of human gut microbiome to xenobiotics. Microbiome Research Reports 2024; 3:26. [PMID: 38841404 PMCID: PMC11149095 DOI: 10.20517/mrr.2023.57] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 03/03/2024] [Accepted: 03/25/2024] [Indexed: 06/07/2024]
Abstract
Aim: The gut microbiome has its own functionalities, which can be modulated by various xenobiotic and biotic components. Developing and applying a high-throughput functional screening approach for individual gut microbiomes accelerates drug discovery and our understanding of microbiome-drug interactions. We previously developed the rapid assay of individual microbiome (RapidAIM), which combined an optimized culturing model with metaproteomics to study gut microbiome responses to xenobiotics. In this study, we aim to incorporate automation and multiplexing techniques into RapidAIM to develop a high-throughput protocol.
Methods: To develop a 2.0 version of RapidAIM, we automated the protein analysis protocol and introduced a tandem mass tag (TMT) multiplexing technique. To demonstrate the typical outcome of the protocol, we used RapidAIM 2.0 to evaluate the effect of the prebiotic kestose on ex vivo individual human gut microbiomes biobanked with five different workflows.
Results: We describe the protocol of RapidAIM 2.0 with extensive details on stool sample collection, biobanking, in vitro culturing and stimulation, sample processing, metaproteomics measurement, and data analysis. An analysis depth of 5,014 ± 142 protein groups per multiplexed sample was achieved. A test of five biobanking methods using RapidAIM 2.0 showed a minimal effect of sample processing on live microbiota functional responses to kestose.
Conclusions: The depth and reproducibility of RapidAIM 2.0 are comparable to those of previous manual label-free metaproteomic analyses. At the same time, the protocol achieves culturing and sample preparation of 320 samples in six days, opening the door to extensively understanding the effects of xenobiotic and biotic factors on our internal ecology.
Affiliation(s)
- Daniel Figeys
- Correspondence to: Prof. Daniel Figeys, School of Pharmaceutical Sciences, Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa K1H 8M5, Ontario, Canada. E-mail:
11
Lin A, Torres CM, Hobbs EC, Bardhan J, Aley SB, Spencer CT, Taylor KL, Chiang T. Computational and Systems Biology Advances to Enable Bioagent Agnostic Signatures. Health Secur 2024; 22:130-139. [PMID: 38483337 PMCID: PMC11044874 DOI: 10.1089/hs.2023.0076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024] Open
Affiliation(s)
- Andy Lin
- Andy Lin, PhD, is a Linus Pauling Distinguished Postdoctoral Fellow in the National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA
- Cameron M. Torres
- Cameron M. Torres is a Graduate Research Assistant and Wieland Fellow, Department of Biological Sciences, at the University of Texas at El Paso, El Paso, TX
- Errett C. Hobbs
- Errett C. Hobbs, PhD, is a Data Scientist in the National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA
- Jaydeep Bardhan
- Jaydeep Bardhan, PhD, is a Research Line Manager, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA
- Stephen B. Aley
- Stephen B. Aley, PhD, is a Professor, Biological Sciences, and an Associate Vice President for Research, Sponsored Projects, at the University of Texas at El Paso, El Paso, TX
- Charles T. Spencer
- Charles T. Spencer, PhD, is an Associate Professor, Biological Sciences, and Edward and Barbara Brown Egbert Endowed Chair of the Department of Biological Sciences, at the University of Texas at El Paso, El Paso, TX
- Karen L. Taylor
- Karen L. Taylor, MS, is a Research Line Manager in the National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA
- Tony Chiang
- Tony Chiang, PhD, is a Data Scientist in the National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA
12
Langan LM, Lovin LM, Taylor RB, Scarlett KR, Chambliss CK, Chatterjee S, Scott JT, Brooks BW. Proteome changes in larval zebrafish (Danio rerio) and fathead minnow (Pimephales promelas) exposed to (±) anatoxin-a. Environment International 2024; 185:108514. [PMID: 38394915 DOI: 10.1016/j.envint.2024.108514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 02/16/2024] [Accepted: 02/17/2024] [Indexed: 02/25/2024]
Abstract
Anatoxin-a and its analogues are potent neurotoxins produced by several genera of cyanobacteria. Due in part to their high toxicity and potential presence in drinking water, these toxins pose threats to public health, companion animals and the environment. Anatoxin-a primarily exerts toxicity as a cholinergic agonist, with high affinity at neuromuscular junctions, but the molecular mechanisms by which it elicits toxicological responses are not fully understood. To advance understanding of this cyanotoxin, proteomic characterization (DIA shotgun proteomics) of two common fish models (zebrafish and fathead minnow) was performed following (±) anatoxin-a exposure. Specifically, proteome changes were identified and quantified in larval fish exposed for 96 h (0.01-3 mg/L (±) anatoxin-a and caffeine, a methodological positive control), with environmentally relevant treatment levels examined based on environmental exposure distributions of surface water data. Proteomic analysis revealed 48 and 29 proteins with concentration-response relationships for zebrafish and fathead minnow, respectively. In contrast, the highest number of differentially expressed proteins (DEPs) varied between zebrafish (n = 145) and fathead minnow (n = 300), with only fathead minnows displaying DEPs at all treatment levels. For both species, genes associated with reproduction were significantly downregulated, and pathway analysis broadly clustered genes into groups associated with DNA repair mechanisms. Importantly, significant differences in proteome response between the species were also observed, consistent with prior observations of differences in response using both behavioral assays and gene expression, adding further support to model-specific differences in organismal sensitivity and/or response. When DEPs were read across from humans to zebrafish, disease ontology enrichment identified diseases associated with cognition and muscle weakness, consistent with the prior literature. Our observations highlight the limited knowledge of how (±) anatoxin-a, a commonly used synthetic racemate surrogate, elicits responses at a molecular level, and advance its toxicological understanding.
Affiliation(s)
- Laura M Langan
- Department of Environmental Science, Baylor University, Waco, TX 76798, USA; Center for Reservoir and Aquatic Systems Research, Baylor University, Waco, TX 76798, USA; Department of Environmental Health Sciences, University of South Carolina, Columbia, SC 29208, USA.
- Lea M Lovin
- Department of Environmental Science, Baylor University, Waco, TX 76798, USA; Center for Reservoir and Aquatic Systems Research, Baylor University, Waco, TX 76798, USA; Department of Wildlife, Fish and Environmental Studies, Swedish University of Agricultural Sciences, Umeå, Sweden
- Raegyn B Taylor
- Center for Reservoir and Aquatic Systems Research, Baylor University, Waco, TX 76798, USA; Department of Chemistry, Baylor University, Waco, TX 76798, USA
- Kendall R Scarlett
- Department of Environmental Science, Baylor University, Waco, TX 76798, USA; Center for Reservoir and Aquatic Systems Research, Baylor University, Waco, TX 76798, USA
- C Kevin Chambliss
- Center for Reservoir and Aquatic Systems Research, Baylor University, Waco, TX 76798, USA; Department of Chemistry, Baylor University, Waco, TX 76798, USA
- Saurabh Chatterjee
- Department of Medicine, Department of Environmental and Occupational Health, University of California Irvine, Irvine, CA 92617, USA
- J Thad Scott
- Center for Reservoir and Aquatic Systems Research, Baylor University, Waco, TX 76798, USA; Department of Biology, Baylor University, Waco, TX 76798, USA
- Bryan W Brooks
- Department of Environmental Science, Baylor University, Waco, TX 76798, USA; Center for Reservoir and Aquatic Systems Research, Baylor University, Waco, TX 76798, USA.
13
Lin A, Torres C, Hobbs EC, Bardhan J, Aley S, Spencer CT, Taylor KL, Chiang T. Computational and Systems Biology Advances to Enable Bioagent Agnostic Signatures. arXiv 2024; arXiv:2310.13898v3. [PMID: 37961741 PMCID: PMC10635321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Enumerated threat agent lists have long driven biodefense priorities. The global SARS-CoV-2 pandemic demonstrated the limitations of searching for known threat agents as compared to a more agnostic approach. Recent technological advances are enabling agent-agnostic biodefense, especially through the integration of multi-modal observations of host-pathogen interactions directed by a human immunological model. Although well-developed technical assays exist for many aspects of human-pathogen interaction, the analytic methods and pipelines to combine and holistically interpret the results of such assays are immature and require further investments to exploit new technologies. In this manuscript, we discuss potential immunologically based bioagent-agnostic approaches and the computational tool gaps the community should prioritize filling.
Affiliation(s)
- Andy Lin
- National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
- Cameron Torres
- Department of Biological Sciences, University of Texas at El Paso, El Paso, Texas 79968 USA
- Errett C Hobbs
- National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
- Jaydeep Bardhan
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
- Stephen Aley
- Department of Biological Sciences, University of Texas at El Paso, El Paso, Texas 79968 USA
- Charles T Spencer
- Department of Biological Sciences, University of Texas at El Paso, El Paso, Texas 79968 USA
- Karen L Taylor
- National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
- Tony Chiang
- National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
- Department of Biological Sciences, University of Texas at El Paso, El Paso, Texas 79968 USA
- Department of Mathematics, University of Washington, Seattle 98102 USA
14
Shajari E, Gagné D, Malick M, Roy P, Noël JF, Gagnon H, Brunet MA, Delisle M, Boisvert FM, Beaulieu JF. Application of SWATH Mass Spectrometry and Machine Learning in the Diagnosis of Inflammatory Bowel Disease Based on the Stool Proteome. Biomedicines 2024; 12:333. [PMID: 38397935 PMCID: PMC10886680 DOI: 10.3390/biomedicines12020333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 01/17/2024] [Accepted: 01/25/2024] [Indexed: 02/25/2024] Open
Abstract
Inflammatory bowel disease (IBD) flare-ups exhibit symptoms that are similar to those of other diseases and conditions, making diagnosis and treatment complicated. Currently, the gold standard for diagnosing and monitoring IBD is colonoscopy and biopsy, which are invasive and uncomfortable procedures, and the fecal calprotectin test, which is not sufficiently accurate. Therefore, it is necessary to develop an alternative method. In this study, our aim was to provide proof of concept for the application of Sequential Window Acquisition of All Theoretical Mass Spectra-Mass Spectrometry (SWATH-MS) and machine learning to develop a non-invasive and accurate predictive model using the stool proteome to distinguish between active IBD patients and symptomatic non-IBD patients. Proteome profiles of 123 samples were obtained, and data processing procedures were optimized to select an appropriate pipeline. The differential abundance analysis identified 48 proteins. Using correlation-based feature selection (Cfs), 7 proteins were selected for subsequent steps. To identify the most appropriate predictive machine learning model, five of the most popular methods, including support vector machines (SVMs), random forests, logistic regression, naive Bayes, and k-nearest neighbors (KNN), were assessed. The generated model was validated by applying the algorithm to 45 prospective unseen datasets; the results showed a sensitivity of 96% and a specificity of 76%. In conclusion, this study illustrates the effectiveness of using the stool proteome obtained through SWATH-MS to accurately diagnose active IBD via a machine learning model.
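The validation metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes binary labels in which 1 marks active IBD and 0 marks symptomatic non-IBD; the function name and the toy labels are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: fraction of true positives that the model detects.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true negatives that the model rejects.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels, for illustration only (not the study's data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

A high sensitivity paired with a lower specificity, as reported above, means that few active cases are missed while a larger share of non-IBD cases are flagged as positive.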
Affiliation(s)
- Elmira Shajari
- Laboratory of Intestinal Physiopathology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- David Gagné
- Laboratory of Intestinal Physiopathology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Allumiqs, 975 Rue Léon-Trépanier, Sherbrooke, QC J1G 5J6, Canada
- Mandy Malick
- Laboratory of Intestinal Physiopathology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Patricia Roy
- Laboratory of Intestinal Physiopathology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Hugo Gagnon
- Allumiqs, 975 Rue Léon-Trépanier, Sherbrooke, QC J1G 5J6, Canada
- Marie A. Brunet
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Department of Pediatrics, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Maxime Delisle
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Department of Medicine, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- François-Michel Boisvert
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Jean-François Beaulieu
- Laboratory of Intestinal Physiopathology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Centre de Recherche du Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
- Department of Immunology and Cell Biology, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC J1H 5N4, Canada
15
Cheng C, Messerschmidt L, Bravo I, Waldbauer M, Bhavikatti R, Schenk C, Grujic V, Model T, Kubinec R, Barceló J. A General Primer for Data Harmonization. Sci Data 2024; 11:152. [PMID: 38297013 PMCID: PMC10831085 DOI: 10.1038/s41597-024-02956-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 01/11/2024] [Indexed: 02/02/2024] Open
Affiliation(s)
- Cindy Cheng
- Hochschule für Politik, Technical University of Munich, Richard-Wagner Str. 1, Munich, 80333, Bavaria, Germany.
- Luca Messerschmidt
- Hochschule für Politik, Technical University of Munich, Richard-Wagner Str. 1, Munich, 80333, Bavaria, Germany
- Isaac Bravo
- Hochschule für Politik, Technical University of Munich, Richard-Wagner Str. 1, Munich, 80333, Bavaria, Germany
- Marco Waldbauer
- Hochschule für Politik, Technical University of Munich, Richard-Wagner Str. 1, Munich, 80333, Bavaria, Germany
- Caress Schenk
- School of Humanities and Social Sciences, Nazarbayev University, Kabanbay Batry Ave., 53, Astana, 010000, Kazakhstan
- Vanja Grujic
- Faculty of Law, University of Brasilia, Campus Universitário Darcy Ribeiro Asa Norte, Brasília, 10587, Brazil
- Tim Model
- Delve, 2225 3rd St, San Francisco, 94107, California, USA
- Robert Kubinec
- Division of Social Science, New York University Abu Dhabi, Social Science Building (A5), Abu Dhabi, 129188, United Arab Emirates
- Joan Barceló
- Division of Social Science, New York University Abu Dhabi, Social Science Building (A5), Abu Dhabi, 129188, United Arab Emirates
16
Afonin AM, Piironen AK, de Sousa Maciel I, Ivanova M, Alatalo A, Whipp AM, Pulkkinen L, Rose RJ, van Kamp I, Kaprio J, Kanninen KM. Proteomic insights into mental health status: plasma markers in young adults. Transl Psychiatry 2024; 14:55. [PMID: 38267423 PMCID: PMC10808121 DOI: 10.1038/s41398-024-02751-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 01/05/2024] [Accepted: 01/08/2024] [Indexed: 01/26/2024] Open
Abstract
Global emphasis on enhancing prevention and treatment strategies necessitates an increased understanding of the biological mechanisms of psychopathology. Plasma proteomics is a powerful tool that has been applied in the context of specific mental disorders for biomarker identification. The p-factor, also known as the "general psychopathology factor", is a concept in psychopathology suggesting that there is a common underlying factor that contributes to the development of various forms of mental disorders. It has been proposed that the p-factor can be used to understand the overall mental health status of an individual. Here, we aimed to discover plasma proteins associated with the p-factor in 775 young adults in the FinnTwin12 cohort. Using liquid chromatography-tandem mass spectrometry, 13 proteins with a significant connection with the p-factor were identified, 8 of which were linked to epidermal growth factor receptor (EGFR) signaling. This exploratory study provides new insight into biological alterations associated with mental health status in young adults.
Affiliation(s)
- Alexey M Afonin
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Aino-Kaisa Piironen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Izaque de Sousa Maciel
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Mariia Ivanova
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Arto Alatalo
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Alyce M Whipp
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Lea Pulkkinen
- Department of Psychology, University of Jyvaskyla, Jyvaskyla, Finland
- Richard J Rose
- Department of Psychological & Brain Sciences, Indiana University, Bloomington, IN, USA
- Irene van Kamp
- Centre for Sustainability, Environment and Health, National Institute for Public Health and the Environment, Bilthoven, the Netherlands
- Jaakko Kaprio
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Department of Public Health, University of Helsinki, Helsinki, Finland
- Katja M Kanninen
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland.
17
Grégoire S, Vanderaa C, Dit Ruys SP, Kune C, Mazzucchelli G, Vertommen D, Gatto L. Standardized Workflow for Mass-Spectrometry-Based Single-Cell Proteomics Data Processing and Analysis Using the scp Package. Methods Mol Biol 2024; 2817:177-220. [PMID: 38907155 DOI: 10.1007/978-1-0716-3934-4_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/23/2024]
Abstract
Mass-spectrometry (MS)-based single-cell proteomics (SCP) explores cellular heterogeneity by focusing on the functional effectors of the cells: proteins. However, extracting meaningful biological information from MS data is far from trivial, especially with single cells. Currently, data analysis workflows differ substantially from one research team to another. Moreover, pipelines are difficult to evaluate because ground truths are missing. Our team has developed the R/Bioconductor package called scp to provide a standardized framework for SCP data analysis. It relies on the widely used QFeatures and SingleCellExperiment data structures. In addition, we used a design containing cell lines mixed in known proportions to generate controlled variability for data analysis benchmarking. In this chapter, we provide a flexible data analysis protocol for SCP data using the scp package, together with comprehensive explanations at each step of the processing. Our main steps are quality control at the feature and cell level, aggregation of the raw data into peptides and proteins, normalization, and batch correction. We validate our workflow using our ground truth data set. We illustrate how to use this modular, standardized framework and highlight some crucial steps.
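The aggregation, normalization, and batch correction steps named in this abstract can be sketched generically as follows. This is not the scp package API, which is an R/Bioconductor package; it is a plain-Python illustration that assumes log-transformed intensities with no missing values, median aggregation, per-cell median normalization, and per-batch mean centering. All function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def aggregate_to_proteins(peptide_rows, peptide_to_protein):
    """Median-aggregate peptide log-intensities (one value per cell) into proteins."""
    grouped = defaultdict(list)
    for peptide, values in peptide_rows.items():
        grouped[peptide_to_protein[peptide]].append(values)
    # zip(*rows) walks the cells; take the median across a protein's peptides.
    return {prot: [median(col) for col in zip(*rows)] for prot, rows in grouped.items()}


def normalize_cells(protein_rows):
    """Subtract each cell's median log-intensity (assumed normalization choice)."""
    n_cells = len(next(iter(protein_rows.values())))
    cell_medians = [median([row[i] for row in protein_rows.values()]) for i in range(n_cells)]
    return {prot: [v - m for v, m in zip(row, cell_medians)] for prot, row in protein_rows.items()}


def center_batches(protein_rows, batches):
    """Subtract the per-batch mean of each protein (a simple stand-in for batch correction)."""
    corrected = {}
    for prot, row in protein_rows.items():
        batch_means = {b: mean([v for v, bb in zip(row, batches) if bb == b]) for b in set(batches)}
        corrected[prot] = [v - batch_means[b] for v, b in zip(row, batches)]
    return corrected


# Hypothetical toy data: three peptides, four cells, two batches.
peptides = {
    "pepA1": [10.0, 11.0, 12.0, 13.0],
    "pepA2": [12.0, 13.0, 14.0, 15.0],
    "pepB1": [8.0, 10.0, 9.0, 13.0],
}
mapping = {"pepA1": "protA", "pepA2": "protA", "pepB1": "protB"}
batches = ["b1", "b1", "b2", "b2"]

proteins = aggregate_to_proteins(peptides, mapping)
corrected = center_batches(normalize_cells(proteins), batches)
print(corrected)
```

Mean centering per batch removes additive batch offsets only. It also removes biological signal when conditions are confounded with batches, so a balanced design is assumed here.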
Affiliation(s)
- Samuel Grégoire
- Computational Biology and Bioinformatics Unit, de Duve Institute, UCLouvain, Brussels, Belgium
- Christophe Vanderaa
- Computational Biology and Bioinformatics Unit, de Duve Institute, UCLouvain, Brussels, Belgium
- Christopher Kune
- Laboratory of Mass Spectrometry, MolSys Research Unit, University of Liège, Liège, Belgium
- Gabriel Mazzucchelli
- Laboratory of Mass Spectrometry, MolSys Research Unit, University of Liège, Liège, Belgium
- GIGA Proteomics Facility, University of Liège, Liège, Belgium
- Didier Vertommen
- Protein Phosphorylation Unit, de Duve Institute, UCLouvain, Brussels, Belgium
- Laurent Gatto
- Computational Biology and Bioinformatics Unit, de Duve Institute, UCLouvain, Brussels, Belgium.
18
Wang H, Lim KP, Kong W, Gao H, Wong BJH, Phua SX, Guo T, Goh WWB. MultiPro: DDA-PASEF and diaPASEF acquired cell line proteomic datasets with deliberate batch effects. Sci Data 2023; 10:858. [PMID: 38042886 PMCID: PMC10693559 DOI: 10.1038/s41597-023-02779-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Accepted: 11/23/2023] [Indexed: 12/04/2023] Open
Abstract
Mass spectrometry-based proteomics plays a critical role in current biological and clinical research. Technical issues like data integration, missing value imputation, batch effect correction and the exploration of inter-connections amongst these technical issues, can produce errors but are not well studied. Although proteomic technologies have improved significantly in recent years, this alone cannot resolve these issues. What is needed are better algorithms and data processing knowledge. But to obtain these, we need appropriate proteomics datasets for exploration, investigation, and benchmarking. To meet this need, we developed MultiPro (Multi-purpose Proteome Resource), a resource comprising four comprehensive large-scale proteomics datasets with deliberate batch effects using the latest parallel accumulation-serial fragmentation in both Data-Dependent Acquisition (DDA) and Data Independent Acquisition (DIA) modes. Each dataset contains a balanced two-class design based on well-characterized and widely studied cell lines (A549 vs K562 or HCC1806 vs HS578T) with 48 or 36 biological and technical replicates altogether, allowing for investigation of a multitude of technical issues. These datasets allow for investigation of inter-connections between class and batch factors, or to develop approaches to compare and integrate data from DDA and DIA platforms.
Affiliation(s)
- He Wang
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, 308232, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
- Kai Peng Lim
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, 308232, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
- Weijia Kong
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, 308232, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
- Huanhuan Gao
- Westlake Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, 310030, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, 310030, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang, 310030, China
- Bertrand Jern Han Wong
- School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
- Ser Xian Phua
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, 308232, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore
- Tiannan Guo
- Westlake Center for Intelligent Proteomics, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, 310030, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang Province, 310030, China
- Research Center for Industries of the Future, Westlake University, 600 Dunyu Road, Hangzhou, Zhejiang, 310030, China
- Wilson Wen Bin Goh
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, 308232, Singapore.
- School of Biological Sciences, Nanyang Technological University, Singapore, 637551, Singapore.
- Center for Biomedical Informatics, Nanyang Technological University, Singapore, 636921, Singapore.
19
de Azúa-López ZR, Pezzotti MR, González-Díaz Á, Meilhac O, Ureña J, Amaya-Villar R, Castellano A, Varela LM. HDL anti-inflammatory function is impaired and associated with high SAA1 and low APOA4 levels in aneurysmal subarachnoid hemorrhage. J Cereb Blood Flow Metab 2023; 43:1919-1930. [PMID: 37357772 PMCID: PMC10676137 DOI: 10.1177/0271678x231184806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 05/07/2023] [Accepted: 06/02/2023] [Indexed: 06/27/2023]
Abstract
Aneurysmal subarachnoid hemorrhage (aSAH) is a devastating disease with high morbidity and mortality rates. Within 24 hours after aSAH, monocytes are recruited and enter the subarachnoid space, where they mature into macrophages, increasing the inflammatory response and contributing, along with other factors, to delayed neurological dysfunction and poor outcomes. High-density lipoproteins (HDL) are lipid-protein complexes that exert anti-inflammatory effects but under pathological conditions undergo structural alterations that have been associated with loss of functionality. Plasma HDL were isolated from patients with aSAH and analyzed for their anti-inflammatory activity and protein composition. HDL isolated from patients lost the ability to prevent VCAM-1 expression in endothelial cells (HUVEC) and subsequent adhesion of THP-1 monocytes to the endothelium. Proteomic analysis showed that HDL particles from patients had an altered composition compared to those of healthy subjects. We confirmed by western blot that low levels of apolipoprotein A4 (APOA4) and high levels of serum amyloid A1 (SAA1) in HDL were associated with the lack of anti-inflammatory function observed in aSAH. Our results indicate that the study of HDL in the pathophysiology of aSAH is needed, and functional HDL supplementation could be considered a novel therapeutic approach to the treatment of the inflammatory response after aSAH.
Affiliation(s)
- Zaida Ruiz de Azúa-López
- Instituto de Biomedicina de Sevilla (IBiS)/Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, Spain
- Unidad de Cuidados Intensivos, Hospital Universitario Virgen del Rocío, Sevilla, Spain
| | - M Rosa Pezzotti
- Instituto de Biomedicina de Sevilla (IBiS)/Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, Spain
- Departamento de Fisiología Médica y Biofísica, Facultad de Medicina, Universidad de Sevilla, Sevilla, Spain
| | - Ángela González-Díaz
- Instituto de Biomedicina de Sevilla (IBiS)/Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, Spain
| | - Olivier Meilhac
- Université de La Réunion, INSERM, UMR 1188 Diabète athérothrombose Réunion Océan Indien (DéTROI), Saint-Pierre de La Réunion, France
- CHU de La Réunion, Saint-Pierre de la Réunion, France
| | - Juan Ureña
- Instituto de Biomedicina de Sevilla (IBiS)/Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, Spain
- Departamento de Fisiología Médica y Biofísica, Facultad de Medicina, Universidad de Sevilla, Sevilla, Spain
| | - Rosario Amaya-Villar
- Instituto de Biomedicina de Sevilla (IBiS)/Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, Spain
- Unidad de Cuidados Intensivos, Hospital Universitario Virgen del Rocío, Sevilla, Spain
| | - Antonio Castellano
- Instituto de Biomedicina de Sevilla (IBiS)/Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, Spain
- Departamento de Fisiología Médica y Biofísica, Facultad de Medicina, Universidad de Sevilla, Sevilla, Spain
| | - Lourdes M Varela
- Instituto de Biomedicina de Sevilla (IBiS)/Hospital Universitario Virgen del Rocío/CSIC/Universidad de Sevilla, Sevilla, Spain
- Departamento de Fisiología Médica y Biofísica, Facultad de Medicina, Universidad de Sevilla, Sevilla, Spain
|
20
|
Mengelkoch S, Miryam Schüssler-Fiorenza Rose S, Lautman Z, Alley JC, Roos LG, Ehlert B, Moriarity DP, Lancaster S, Snyder MP, Slavich GM. Multi-omics approaches in psychoneuroimmunology and health research: Conceptual considerations and methodological recommendations. Brain Behav Immun 2023; 114:475-487. [PMID: 37543247 PMCID: PMC11195542 DOI: 10.1016/j.bbi.2023.07.022] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/27/2023] [Revised: 07/04/2023] [Accepted: 07/30/2023] [Indexed: 08/07/2023] Open
Abstract
The field of psychoneuroimmunology (PNI) has grown substantially in both relevance and prominence over the past 40 years. Notwithstanding its impressive trajectory, a majority of PNI studies are still based on a relatively small number of analytes. To advance this work, we suggest that PNI, and health research in general, can benefit greatly from adopting a multi-omics approach, which involves integrating data across multiple biological levels (e.g., the genome, proteome, transcriptome, metabolome, lipidome, and microbiome/metagenome) to more comprehensively profile biological functions and relate these profiles to clinical and behavioral outcomes. To assist investigators in this endeavor, we provide an overview of multi-omics research, highlight recent landmark multi-omics studies investigating human health and disease risk, and discuss how multi-omics can be applied to better elucidate links between psychological, nervous system, and immune system activity. In doing so, we describe how to design high-quality multi-omics studies, decide which biological samples (e.g., blood, stool, urine, saliva, solid tissue) are most relevant, incorporate behavioral and wearable sensing data into multi-omics research, and understand key data quality, integration, analysis, and interpretation issues. PNI researchers are addressing some of the most interesting and important questions at the intersection of psychology, neuroscience, and immunology. Applying a multi-omics approach to this work will greatly expand the horizon of what is possible in PNI and has the potential to revolutionize our understanding of mind-body medicine.
Affiliation(s)
- Summer Mengelkoch
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, USA.
| | | | - Ziv Lautman
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Jenna C Alley
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, USA
| | - Lydia G Roos
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, USA
| | - Benjamin Ehlert
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Daniel P Moriarity
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, USA
| | | | | | - George M Slavich
- Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, USA.
|
21
|
Gómez-Varela D, Xian F, Grundtner S, Sondermann JR, Carta G, Schmidt M. Increasing taxonomic and functional characterization of host-microbiome interactions by DIA-PASEF metaproteomics. Front Microbiol 2023; 14:1258703. [PMID: 37908546 PMCID: PMC10613666 DOI: 10.3389/fmicb.2023.1258703] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/14/2023] [Accepted: 09/20/2023] [Indexed: 11/02/2023] Open
Abstract
Introduction: Metaproteomics is a rapidly advancing field that offers unique insights into the taxonomic composition and the functional activity of microbial communities, and their effects on host physiology. Classically, data-dependent acquisition (DDA) mass spectrometry (MS) has been applied for peptide identification and quantification in metaproteomics. However, DDA-MS exhibits well-known limitations in terms of depth, sensitivity, and reproducibility. Consequently, methodological improvements are required to better characterize the protein landscape of microbiomes and their interactions with the host. Methods: We present an optimized proteomic workflow that utilizes the information captured by Parallel Accumulation-Serial Fragmentation (PASEF) MS for comprehensive metaproteomic studies in complex fecal samples of mice. Results and discussion: We show that implementing PASEF using a DDA acquisition scheme (DDA-PASEF) increased peptide quantification up to 5 times and reached higher accuracy and reproducibility compared to previously published classical DDA and data-independent acquisition (DIA) methods. Furthermore, we demonstrate that the combination of DIA, PASEF, and neuronal-network-based data analysis was superior to DDA-PASEF in all mentioned parameters. Importantly, DIA-PASEF expanded the dynamic range towards low-abundant proteins and doubled the quantification of proteins with unknown or uncharacterized functions. Compared to previous classical DDA metaproteomic studies, DIA-PASEF resulted in the quantification of up to 4 times more taxonomic units using 16 times fewer injected peptides and 4 times shorter chromatography gradients. Moreover, 131 additional functional pathways distributed across more and even uniquely identified taxa were profiled as revealed by a peptide-centric taxonomic-functional analysis.
We tested our workflow on a validated preclinical mouse model of neuropathic pain to assess longitudinal changes in host-gut microbiome interactions associated with pain - an unexplored topic for metaproteomics. We uncovered the significant enrichment of two bacterial classes upon pain, and, in addition, the upregulation of metabolic activities previously linked to chronic pain as well as various hitherto unknown ones. Furthermore, our data revealed pain-associated dynamics of proteome complexes implicated in the crosstalk between the host immune system and the gut microbiome. In conclusion, the DIA-PASEF metaproteomic workflow presented here provides a stepping stone towards a deeper understanding of microbial ecosystems across the breadth of biomedical and biotechnological fields.
|
22
|
Cheng Y, Ren Y, Wang W, Zhang W. Similar proteome expression profiles of the aggregated lymphoid nodules area and Peyer's patches in Bactrian camel. BMC Genomics 2023; 24:608. [PMID: 37821839 PMCID: PMC10568864 DOI: 10.1186/s12864-023-09715-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 10/04/2023] [Indexed: 10/13/2023] Open
Abstract
BACKGROUND The presence of Aggregated Lymphoid Nodules Area (ALNA) is a notable anatomical characteristic observed in the abomasum of Bactrian camels. This area is comprised of two separate regions, namely the Reticular Mucosal Folds Region (RMFR) and the Longitudinal Mucosal Folds Region (LMFR). The histological properties of ALNA exhibit significant similarities to those of Peyer's patches (PPs) found in the gastrointestinal system. The functional characteristics of ALNA were examined in relation to mucosal immunity in the gastrointestinal system. RESULTS We used iTRAQ-based proteomic analysis on twelve Bactrian camels to measure the amount of proteins expressed in ALNA. In the experiment, we sampled the RMFR and LMFR separately from the ALNA and compared their proteomic quantification results with samples from the PPs. A total of 1253 proteins were identified, among which 39 differentially expressed proteins (DEPs) were found between RMFR and PPs, 33 DEPs were found between LMFR and PPs, and 22 DEPs were found between LMFR and RMFR. The proteins FLNA, MYH11, and HSPB1 were chosen for validation using the enzyme-linked immunosorbent assay (ELISA), and the observed expression profiles were found to be in agreement with the results obtained from the iTRAQ study. The InnateDB database was utilized to get data pertaining to immune-associated proteins in ALNA. It was observed that a significant proportion, specifically 76.6%, of these proteins were found to be associated with the same orthogroups as human immune-related genes. These proteins are acknowledged to be associated with a diverse range of functions, encompassing the uptake, processing and presentation of antigens, activation of lymphocytes, the signaling pathways of T-cell and B-cell receptors, and the control of actin polymerization. CONCLUSIONS The experimental results suggest that there are parallels in the immune-related proteins found in ALNA and PPs. 
Although there are variations in the structures of LMFR and RMFR, the proteins produced in both structures exhibit a high degree of similarity and perform comparable functions in the context of mucosal immune responses.
Affiliation(s)
- Yujiao Cheng
- College of Veterinary Medicine, Gansu Agricultural University, Lanzhou, Gansu, China
| | - Yan Ren
- The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia
| | - Wenhui Wang
- College of Veterinary Medicine, Gansu Agricultural University, Lanzhou, Gansu, China.
| | - Wangdong Zhang
- College of Veterinary Medicine, Gansu Agricultural University, Lanzhou, Gansu, China.
|
23
|
Yu Y, Zhang N, Mai Y, Ren L, Chen Q, Cao Z, Chen Q, Liu Y, Hou W, Yang J, Hong H, Xu J, Tong W, Dong L, Shi L, Fang X, Zheng Y. Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method. Genome Biol 2023; 24:201. [PMID: 37674217 PMCID: PMC10483871 DOI: 10.1186/s13059-023-03047-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 09/08/2023] Open
Abstract
BACKGROUND Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms are proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, the performance metrics, and the application scenarios. RESULTS As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability of accurately clustering cross-batch samples into their own donors. The ratio-based method, i.e., by scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interests. We further provide practical guidelines for implementing the ratio based approach in increasingly large-scale multiomics studies. CONCLUSIONS Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.
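The ratio-based scaling described in this abstract can be illustrated with a minimal sketch. It is not the Quartet Project implementation: the data layout, the `REF` sample-naming convention, the function name `ratio_correct`, and the use of a log2 ratio to the per-batch reference mean are all assumptions made for illustration.

```python
from math import log2


def ratio_correct(batches, reference_label="REF"):
    """Express each study sample as a log2 ratio to its batch's reference mean.

    batches: {batch_id: {sample_id: {feature: positive intensity}}}
    Samples whose id starts with reference_label are treated as the
    concurrently profiled reference material (a hypothetical convention).
    """
    corrected = {}
    for batch_id, samples in batches.items():
        refs = [v for k, v in samples.items() if k.startswith(reference_label)]
        if not refs:
            raise ValueError(f"batch {batch_id!r} has no reference sample")
        # Mean reference intensity per feature within this batch.
        features = list(refs[0].keys())
        ref_mean = {f: sum(r[f] for r in refs) / len(refs) for f in features}
        for sample_id, values in samples.items():
            if sample_id.startswith(reference_label):
                continue  # reference samples are used only as the denominator
            corrected[(batch_id, sample_id)] = {
                f: log2(values[f] / ref_mean[f]) for f in features
            }
    return corrected


# Toy data: batch "B2" reads every feature twice as high as batch "B1".
batches = {
    "B1": {"REF_1": {"P1": 100.0, "P2": 50.0}, "S1": {"P1": 200.0, "P2": 50.0}},
    "B2": {"REF_1": {"P1": 200.0, "P2": 100.0}, "S2": {"P1": 400.0, "P2": 100.0}},
}
result = ratio_correct(batches)
```

In this toy example the two study samples have identical ratios after scaling, because the multiplicative batch factor cancels against the reference measured in the same batch.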
Affiliation(s)
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
| | - Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
| | | | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
| | - Xiang Fang
- National Institute of Metrology, Beijing, China.
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
|
24
|
Goh WWB, Hui HWH, Wong L. How missing value imputation is confounded with batch effects and what you can do about it. Drug Discov Today 2023; 28:103661. [PMID: 37301250 DOI: 10.1016/j.drudis.2023.103661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 05/31/2023] [Accepted: 06/05/2023] [Indexed: 06/12/2023]
Abstract
In data-processing pipelines, upstream steps can influence downstream processes because of their sequential nature. Among these data-processing steps, batch effect (BE) correction (BEC) and missing value imputation (MVI) are crucial for ensuring data suitability for advanced modeling and reducing the likelihood of false discoveries. Although BEC-MVI interactions are not well studied, they are ultimately interdependent. Batch sensitization can improve the quality of MVI. Conversely, accounting for missingness also improves proper BE estimation in BEC. Here, we discuss how BEC and MVI are interconnected and interdependent. We show how batch sensitization can improve any MVI and bring attention to the idea of BE-associated missing values (BEAMs). Finally, we discuss how batch-class imbalance problems can be mitigated by borrowing ideas from machine learning.
Affiliation(s)
- Wilson Wen Bin Goh
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore; School of Biological Sciences, Nanyang Technological University, Singapore; Center for Biomedical Informatics, Nanyang Technological University, Singapore.
| | - Harvard Wai Hann Hui
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore; School of Biological Sciences, Nanyang Technological University, Singapore
| | - Limsoon Wong
- Department of Computer Science, National University of Singapore, Singapore; Department of Pathology, National University of Singapore, Singapore.
|
25
|
Bowser BL, Patterson KL, Robinson RA. Evaluating cPILOT Data toward Quality Control Implementation. J Am Soc Mass Spectrom 2023; 34:1741-1752. [PMID: 37459602 DOI: 10.1021/jasms.3c00179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Multiplexing enables the monitoring of hundreds to thousands of proteins in quantitative proteomics analyses and increases sample throughput. In most mass-spectrometry-based proteomics workflows, multiplexing is achieved by labeling biological samples with heavy isotopes via precursor isotopic labeling or isobaric tagging. Enhanced multiplexing strategies, such as combined precursor isotopic labeling and isobaric tagging (cPILOT), combine multiple technologies to afford an even higher sample throughput. Critical to enhanced multiplexing analyses is ensuring that analytical performance is optimal and that missingness of sample channels is minimized. Automation of sample preparation steps and use of quality control (QC) metrics can be incorporated into multiplexing analyses and reduce the likelihood of missing information, thus maximizing the amount of usable quantitative data. Here, we implemented QC metrics previously developed in our laboratory to evaluate a 36-plex cPILOT experiment that encompassed 144 mouse samples of various tissue types, time points, genotypes, and biological replicates. The evaluation focuses on the use of a sample pool generated from all samples in the experiment to monitor the daily instrument performance and to provide a means for data normalization across sample batches. Our results show that tracking QC metrics enabled the quantification of ∼7000 proteins in each sample batch, of which ∼70% had minimal missing values across up to 36 sample channels. Implementation of QC metrics for future cPILOT studies as well as other enhanced multiplexing strategies will help yield high-quality data sets.
Affiliation(s)
- Bailey L Bowser
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
| | - Khiry L Patterson
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
| | - Renã As Robinson
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Department of Neurology, Vanderbilt University Medical Center, Nashville, Tennessee 37232, United States
- Vanderbilt Memory & Alzheimer's Center, Nashville, Tennessee 37212, United States
- Vanderbilt Institute of Chemical Biology, Nashville, Tennessee 37232, United States
- Vanderbilt Brain Institute, Nashville, Tennessee 37232, United States
|
26
|
Droit A, Pelletier S, Leclerq M, Roux-Dalvai F, de Geus M, Leslie S, Wang W, Lam T, Nairn A, Arnold S, Carlyle B, Precioso F. Enhancing Classification of liquid chromatography mass spectrometry data with Batch Effect Removal Neural Networks (BERNN). Res Sq 2023:rs.3.rs-3112514. [PMID: 37461653 PMCID: PMC10350225 DOI: 10.21203/rs.3.rs-3112514/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/28/2023]
Abstract
Liquid Chromatography Mass Spectrometry (LC-MS) is a powerful method for profiling complex biological samples. However, batch effects typically arise from differences in sample processing protocols, experimental conditions and data acquisition techniques, significantly impacting the interpretability of results. Correcting batch effects is crucial for the reproducibility of proteomics research, but current methods are not optimal for removal of batch effects without compressing the genuine biological variation under study. We propose a suite of Batch Effect Removal Neural Networks (BERNN) to remove batch effects in large LC-MS experiments, with the goal of maximizing sample classification performance between conditions. More importantly, these models must efficiently generalize in batches not seen during training. Comparison of batch effect correction methods across three diverse datasets demonstrated that BERNN models consistently showed the strongest sample classification performance. However, the model producing the greatest classification improvements did not always perform best in terms of batch effect removal. Finally, we show that overcorrection of batch effects resulted in the loss of some essential biological variability. These findings highlight the importance of balancing batch effect removal while preserving valuable biological diversity in large-scale LC-MS experiments.
Affiliation(s)
- Arnaud Droit
- Centre de Recherche du CHU de Québec - Université Laval, Axe Endocrinologie et Néphrologie, Québec, Canada
| | | | | | | | | | | | - Weiwei Wang
- Keck MS & Proteomics Resource, Yale School of Medicine
| | - TuKiet Lam
- Keck MS & Proteomics Resource, Yale School of Medicine
| | | | - Steven Arnold
- Massachusetts General Hospital Department of Neurology
| | - Becky Carlyle
- Massachusetts General Hospital Department of Neurology
|
27
|
Alvarez-Rivera E, Ortiz-Hernández EJ, Lugo E, Lozada-Reyes LM, Boukli NM. Oncogenic Proteomics Approaches for Translational Research and HIV-Associated Malignancy Mechanisms. Proteomes 2023; 11:22. [PMID: 37489388 PMCID: PMC10366845 DOI: 10.3390/proteomes11030022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 06/09/2023] [Accepted: 06/29/2023] [Indexed: 07/26/2023] Open
Abstract
Recent advances in the field of proteomics have allowed extensive insights into the molecular regulations of the cell proteome. Specifically, this allows researchers to dissect a multitude of signaling arrays while targeting the discovery of novel protein signatures. These approaches based on data mining are becoming increasingly powerful for identifying both potential disease mechanisms and indicators of disease progression and overall survival, that is, predictive and prognostic molecular markers for cancer. Furthermore, mass spectrometry (MS) integrations satisfy the ongoing demand for in-depth biomarker validation. For the purpose of this review, we will highlight the current developments based on MS sensitivity, to place quantitative proteomics into clinical settings and provide a perspective to integrate proteomics data for future applications in cancer precision medicine. We will also discuss malignancies associated with oncogenic viruses such as Acquired Immunodeficiency Syndrome (AIDS) and suggest novel mechanisms behind this phenomenon. Human Immunodeficiency Virus type-1 (HIV-1) proteins are known to be oncogenic per se, to induce oxidative and endoplasmic reticulum stresses, and to be released from the infected or expressing cells. HIV-1 proteins can act alone or in collaboration with other known oncoproteins, which cause the bulk of malignancies in people living with HIV-1 on ART.
Affiliation(s)
- Eduardo Alvarez-Rivera
- Biomedical Proteomics Facility, Department of Microbiology and Immunology, Universidad Central del Caribe, School of Medicine, Bayamón, PR 00960, USA
| | - Emanuel J. Ortiz-Hernández
- Biomedical Proteomics Facility, Department of Microbiology and Immunology, Universidad Central del Caribe, School of Medicine, Bayamón, PR 00960, USA
| | - Elyette Lugo
- Biomedical Proteomics Facility, Department of Microbiology and Immunology, Universidad Central del Caribe, School of Medicine, Bayamón, PR 00960, USA
| | | | - Nawal M. Boukli
- Biomedical Proteomics Facility, Department of Microbiology and Immunology, Universidad Central del Caribe, School of Medicine, Bayamón, PR 00960, USA
|
28
|
Xu M, Xu K, Yin S, Sun W, Wang G, Zhang K, Mu J, Wu M, Xing B, Zhang X, Han J, Zhao X, Chang C, Wang Y, Xu D, Yu X. In-depth serum proteomics reveals the trajectory of hallmarks of cancer in hepatitis B virus-related liver diseases. Mol Cell Proteomics 2023:100574. [PMID: 37209815 PMCID: PMC10316086 DOI: 10.1016/j.mcpro.2023.100574] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/29/2022] [Revised: 04/25/2023] [Accepted: 05/16/2023] [Indexed: 05/22/2023] Open
Abstract
Hepatocellular carcinoma (HCC) is a prevalent cancer in China, with chronic hepatitis B (CHB) and liver cirrhosis (LC) being high-risk factors for developing HCC. Here, we determined the serum proteomes (762 proteins) of 125 healthy controls and Hepatitis B virus-infected CHB, LC, and HCC patients and constructed the first cancerous trajectory of liver diseases. The results not only reveal that the majority of altered biological processes were involved in the hallmarks of cancer (inflammation, metastasis, metabolism, vasculature, coagulation), but also identify potential therapeutic targets in cancerous pathways (i.e., IL17 signaling pathway). Notably, the biomarker panels for detecting HCC in CHB and LC high-risk populations were further developed using machine learning in two cohorts comprised of 200 samples (discovery cohort=125, validation cohort=75). The protein signatures significantly improved the area under the receiver operating characteristic curve (AUC) of HCC (CHB discovery and validation cohort = 0.953 and 0.891, respectively; LC discovery and validation cohort = 0.966 and 0.818, respectively) compared to using the traditional biomarker, alpha-fetoprotein (AFP), alone. Finally, selected biomarkers were validated with parallel reaction monitoring (PRM) mass spectrometry in an additional cohort (n=120). Altogether, our results provide fundamental insights into the continuous changes of cancer biology processes in liver diseases and identify candidate protein targets for early detection and intervention.
Affiliation(s)
- Meng Xu
- State Key Laboratory of Analytical Chemistry for Life Science, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, China; State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing, 102206, China
| | - Kaikun Xu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing, 102206, China; Research Unit of Proteomics Driven Cancer Precision Medicine, Chinese Academy of Medical Sciences, Beijing 102206, China
| | - Shangqi Yin
- Department of Clinical Laboratory, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
| | - Wei Sun
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing, 102206, China
| | - Guibin Wang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing, 102206, China
| | - Kai Zhang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing, 102206, China
| | - Jinsong Mu
- Department of Critical Care Medicine, The Fifth Medical Center, Chinese PLA General Hospital, Beijing, 100039, China
| | - Miantao Wu
- Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou 510060, China
| | - Baocai Xing
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Department of Hepato-Pancreato-Biliary Surgery I, Peking University Cancer Hospital and Institute, Beijing, 100036, China
| | - Xiaomei Zhang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing, 102206, China
| | - Jinyu Han
- Department of Clinical Laboratory, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
| | - Xiaohang Zhao
- State Key Laboratory of Molecular Oncology, Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
| | - Cheng Chang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing, 102206, China; Research Unit of Proteomics Driven Cancer Precision Medicine, Chinese Academy of Medical Sciences, Beijing 102206, China.
| | - Yajie Wang
- Department of Clinical Laboratory, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China.
| | - Danke Xu
- State Key Laboratory of Analytical Chemistry for Life Science, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, China.
| | - Xiaobo Yu
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences, Beijing Institute of Lifeomics, Beijing, 102206, China.
| |
Collapse
29. Maxwell CB, Sandhu JK, Cao TH, McCann GP, Ng LL, Jones DJL. The Edge Effect in High-Throughput Proteomics: A Cautionary Tale. J Am Soc Mass Spectrom 2023. [PMID: 37155737 DOI: 10.1021/jasms.3c00035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
For mass spectrometry to continue to grow as a platform for high-throughput clinical and translational research, careful consideration must be given to quality control, ensuring that the assay performs reproducibly, accurately, and precisely. In particular, the throughput required for large cohort clinical validation in biomarker discovery and diagnostic screening has driven the growth of multiplexed targeted liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) assays paired with sample preparation and analysis in multiwell plates. However, large-scale MS-based proteomics studies are often plagued by batch effects: technical variation in the data that can arise from diverse sources such as sample preparation batches, different reagent lots, or MS signal drift. These batch effects can confound the detection of true signal differences, leading to incorrect conclusions about the presence or absence of significant biological effects. Here, we present an intraplate batch effect termed the edge effect, which arises from temperature gradients in multiwell plates and is commonly reported in preclinical cell culture studies but has not yet been reported in a clinical proteomics setting. We present methods to ameliorate the phenomenon, including proper assessment of heating techniques for multiwell plates and incorporation of surrogate standards, which can normalize for intraplate variation.
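The idea of normalizing intraplate variation with a surrogate standard can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the simulated 96-well plate, the 20% signal loss in edge wells, and the simplification that the edge effect scales the analyte and the spiked standard by the same factor in each well.

```python
import statistics

ROWS, COLS = 8, 12  # standard 96-well plate layout


def is_edge(row, col):
    """True for wells on the outer ring of the plate."""
    return row in (0, ROWS - 1) or col in (0, COLS - 1)


def cv(values):
    """Coefficient of variation: population SD divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


# Hypothetical data: every well holds the same sample, but edge wells
# recover 20% less signal for both the analyte and the spiked standard.
analyte, standard = {}, {}
for r in range(ROWS):
    for c in range(COLS):
        recovery = 0.8 if is_edge(r, c) else 1.0
        analyte[(r, c)] = 1000.0 * recovery
        standard[(r, c)] = 500.0 * recovery

# Normalize each well by the surrogate standard measured in that same well.
ratio = {well: analyte[well] / standard[well] for well in analyte}

raw_cv = cv(list(analyte.values()))   # inflated by the edge wells
norm_cv = cv(list(ratio.values()))    # edge effect cancels in the ratio
```

In real data the standard and the analyte rarely respond identically, so the ratio would reduce rather than eliminate the edge-related spread.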
Affiliation(s)
- Colleen B Maxwell
  - The Leicester van Geest MultiOmics Facility, Hodgkin Building, University of Leicester, Leicester LE1 9HN, United Kingdom
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester LE3 9QP, United Kingdom
- Jatinderpal K Sandhu
  - The Leicester van Geest MultiOmics Facility, Hodgkin Building, University of Leicester, Leicester LE1 9HN, United Kingdom
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester LE3 9QP, United Kingdom
- Thong H Cao
  - The Leicester van Geest MultiOmics Facility, Hodgkin Building, University of Leicester, Leicester LE1 9HN, United Kingdom
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester LE3 9QP, United Kingdom
- Gerry P McCann
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester LE3 9QP, United Kingdom
- Leong L Ng
  - The Leicester van Geest MultiOmics Facility, Hodgkin Building, University of Leicester, Leicester LE1 9HN, United Kingdom
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester LE3 9QP, United Kingdom
- Donald J L Jones
  - The Leicester van Geest MultiOmics Facility, Hodgkin Building, University of Leicester, Leicester LE1 9HN, United Kingdom
  - Department of Cardiovascular Sciences and NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester LE3 9QP, United Kingdom
  - Leicester Cancer Research Centre, RKCSB, University of Leicester, Leicester LE2 7LX, United Kingdom
30. Sun W, Lin Y, Huang Y, Chan J, Terrillon S, Rosenbaum AI, Contrepois K. Robust and High-Throughput Analytical Flow Proteomics Analysis of Cynomolgus Monkey and Human Matrices with Zeno SWATH Data Independent Acquisition. Mol Cell Proteomics 2023:100562. [PMID: 37142056 DOI: 10.1016/j.mcpro.2023.100562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 04/17/2023] [Accepted: 04/26/2023] [Indexed: 05/06/2023] Open
Abstract
Modern mass spectrometers routinely allow deep proteome coverage in a single experiment. These methods are typically operated at nano and micro flow regimes, but they often lack throughput and chromatographic robustness, which is critical for large-scale studies. In this context, we have developed, optimized and benchmarked LC-MS methods combining the robustness and throughput of analytical flow chromatography with the added sensitivity provided by the Zeno trap across a wide range of cynomolgus monkey and human matrices of interest for toxicological studies and clinical biomarker discovery. SWATH data independent acquisition (DIA) experiments with Zeno trap activated (Zeno SWATH DIA) provided a clear advantage over conventional SWATH DIA in all sample types tested with improved sensitivity, quantitative robustness and signal linearity as well as increased protein coverage by up to 9-fold. Using a 10-min gradient chromatography, up to 3,300 proteins were identified in tissues at 2 μg peptide load. Importantly, the performance gains with Zeno SWATH translated into better biological pathway representation and improved the ability to identify dysregulated proteins and pathways associated with two metabolic diseases in human plasma. Finally, we demonstrate that this method is highly stable over time with the acquisition of reliable data over the injection of 1,000+ samples (14.2 days of uninterrupted acquisition) without the need for human intervention or normalization. Altogether, Zeno SWATH DIA methodology allows fast, sensitive and robust proteomic workflows using analytical flow and is amenable to large-scale studies. This work provides detailed method performance assessment on a variety of relevant biological matrices and serves as a valuable resource for the proteomics community.
Affiliation(s)
- Weiwen Sun
  - Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
- Yuan Lin
  - Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
- Yue Huang
  - Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
- Josolyn Chan
  - Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
- Sonia Terrillon
  - Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA
- Anton I Rosenbaum
  - Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA.
- Kévin Contrepois
  - Integrated Bioanalysis, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, South San Francisco, CA 94080, USA.
31. Smith JW, O'Meally RN, Burke SM, Ng DK, Chen JG, Kensler TW, Groopman JD, Cole RN. Global Discovery and Temporal Changes of Human Albumin Modifications by Pan-Protein Adductomics: Initial Application to Air Pollution Exposure. J Am Soc Mass Spectrom 2023; 34:595-607. [PMID: 36939690 DOI: 10.1021/jasms.2c00314] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/18/2023]
Abstract
Assessing personal exposure to environmental toxicants is a critical challenge for predicting disease risk. Previously, using human serum albumin (HSA)-based biomonitoring, we reported dosimetric relationships between adducts at HSA Cys34 and ambient air pollutant levels (Smith et al., Chem. Res. Toxicol. 2021, 34, 1183). These results provided the foundation to explore modifications at other sites in HSA to reveal novel adducts of complex exposures. Thus, the Pan-Protein Adductomics (PPA) technology reported here is the next step toward an unbiased, comprehensive characterization of the HSA adductome. The PPA workflow requires <2 μL serum/plasma and uses nanoflow-liquid chromatography, gas-phase fractionation, and overlapping-window data-independent acquisition high-resolution tandem mass spectrometry. PPA analysis of albumin from nonsmoking women exposed to high levels of air pollution uncovered 68 unique location-specific modifications (LSMs) across 21 HSA residues. While nearly half were located at Cys34 (33 LSMs), 35 were detected on other residues, including Lys, His, Tyr, Ser, Met, and Arg. HSA adduct relative abundances spanned a ∼400 000-fold range and included putative products of exogenous (SO2, benzene, phycoerythrobilin) and endogenous (oxidation, lipid peroxidation, glycation, carbamylation) origin, as well as 24 modifications without annotations. PPA quantification revealed statistically significant changes in LSM levels across the 84 days of monitoring (∼3 HSA lifetimes) in the following putative adducts: Cys34 trioxidation, β-methylthiolation, benzaldehyde, and benzene diol epoxide; Met329 oxidation; Arg145 dioxidation; and unannotated Cys34 and His146 adducts. Notably, the PPA workflow can be extended to any protein. Pan-Protein Adductomics is a novel and powerful strategy for untargeted global exploration of protein modifications.
Affiliation(s)
- Joshua W Smith
  - Department of Environmental Health and Engineering, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland 21205, United States
- Robert N O'Meally
  - Department of Biological Chemistry, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21205, United States
- Sean M Burke
  - Department of Environmental Health and Engineering, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland 21205, United States
- Derek K Ng
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland 21205, United States
- Jian-Guo Chen
  - Qidong Liver Cancer Institute, Qidong People's Hospital, Affiliated Qidong Hospital of Nantong University, Qidong, Jiangsu 226200, P. R. China
- Thomas W Kensler
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, United States
- John D Groopman
  - Department of Environmental Health and Engineering, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland 21205, United States
- Robert N Cole
  - Department of Biological Chemistry, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21205, United States
32. Woo J, Zhang Q. A Streamlined High-Throughput Plasma Proteomics Platform for Clinical Proteomics with Improved Proteome Coverage, Reproducibility, and Robustness. J Am Soc Mass Spectrom 2023; 34:754-762. [PMID: 36975161 PMCID: PMC10080683 DOI: 10.1021/jasms.3c00022] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 01/23/2023] [Revised: 03/16/2023] [Accepted: 03/20/2023] [Indexed: 06/18/2023]
Abstract
Mass spectrometry-based clinical proteomics requires high throughput, reproducibility, robustness, and comprehensive coverage to serve the needs of clinical diagnosis, prognosis, and personalized medicine. Oftentimes these requirements are contradictory to each other. We report the development of a streamlined High-Throughput Plasma Proteomics (sHTPP) platform for untargeted profiling of the blood plasma proteome, which includes 96-well plates and simplified procedures for sample preparation, disposable trap column for peptide loading, robust liquid chromatographic system for separation, data-independent acquisition in tandem mass spectrometry, and DIA-NN, FragPipe, and in-house peptide spectral library-based data analysis. Using the optimized platform at a throughput of 60 samples per day, over 600 protein groups including 57 FDA-approved biomarkers can be consistently identified from whole human plasma, and more than 85% of the detected proteins have 100% completeness in quantitative values across 300 samples. The balance achieved between proteome coverage, throughput, and reproducibility of this sHTPP platform makes it promising in clinical settings, where a large number of samples are to be measured quickly and reliably to support various needs of clinical medicine.
Affiliation(s)
- Jongmin Woo
  - Center for Translational Biomedical Research, University of North Carolina at Greensboro, North Carolina Research Campus, Kannapolis, North Carolina 28081, United States
- Qibin Zhang
  - Center for Translational Biomedical Research, University of North Carolina at Greensboro, North Carolina Research Campus, Kannapolis, North Carolina 28081, United States
  - Department of Chemistry & Biochemistry, University of North Carolina at Greensboro, Greensboro, North Carolina 27402, United States
33. Messner CB, Demichev V, Wang Z, Hartl J, Kustatscher G, Mülleder M, Ralser M. Mass spectrometry-based high-throughput proteomics and its role in biomedical studies and systems biology. Proteomics 2023; 23:e2200013. [PMID: 36349817 DOI: 10.1002/pmic.202200013] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 07/07/2022] [Revised: 10/13/2022] [Accepted: 10/13/2022] [Indexed: 11/11/2022]
Abstract
There are multiple reasons why the next generation of biological and medical studies require increasing numbers of samples. Biological systems are dynamic, and the effect of a perturbation depends on the genetic background and environment. As a consequence, many conditions need to be considered to reach generalizable conclusions. Moreover, human population and clinical studies only reach sufficient statistical power if conducted at scale and with precise measurement methods. Finally, many proteins remain without sufficient functional annotations, because they have not been systematically studied under a broad range of conditions. In this review, we discuss the latest technical developments in mass spectrometry (MS)-based proteomics that facilitate large-scale studies by fast and efficient chromatography, fast scanning mass spectrometers, data-independent acquisition (DIA), and new software. We further highlight recent studies which demonstrate how high-throughput (HT) proteomics can be applied to capture biological diversity, to annotate gene functions or to generate predictive and prognostic models for human diseases.
Affiliation(s)
- Christoph B Messner
  - Precision Proteomics Center, Swiss Institute of Allergy and Asthma Research (SIAF), University of Zurich, Davos, Switzerland
- Vadim Demichev
  - Institute of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Ziyue Wang
  - Institute of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Johannes Hartl
  - Institute of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Georg Kustatscher
  - Wellcome Centre for Cell Biology, University of Edinburgh, Max Born Crescent, Edinburgh, Scotland, UK
- Michael Mülleder
  - Core Facility High Throughput Mass Spectrometry, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Markus Ralser
  - Institute of Biochemistry, Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
34. Gatto L, Aebersold R, Cox J, Demichev V, Derks J, Emmott E, Franks AM, Ivanov AR, Kelly RT, Khoury L, Leduc A, MacCoss MJ, Nemes P, Perlman DH, Petelski AA, Rose CM, Schoof EM, Van Eyk J, Vanderaa C, Yates JR, Slavov N. Initial recommendations for performing, benchmarking and reporting single-cell proteomics experiments. Nat Methods 2023; 20:375-386. [PMID: 36864200 PMCID: PMC10130941 DOI: 10.1038/s41592-023-01785-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 39.0] [Received: 06/06/2022] [Accepted: 01/24/2023] [Indexed: 03/04/2023]
Abstract
Analyzing proteins from single cells by tandem mass spectrometry (MS) has recently become technically feasible. While such analysis has the potential to accurately quantify thousands of proteins across thousands of single cells, the accuracy and reproducibility of the results may be undermined by numerous factors affecting experimental design, sample preparation, data acquisition and data analysis. We expect that broadly accepted community guidelines and standardized metrics will enhance rigor, data quality and alignment between laboratories. Here we propose best practices, quality controls and data-reporting recommendations to assist in the broad adoption of reliable quantitative workflows for single-cell proteomics. Resources and discussion forums are available at https://single-cell.net/guidelines .
Affiliation(s)
- Laurent Gatto
  - Computational Biology and Bioinformatics Unit, de Duve Institute, Université Catholique de Louvain, Brussels, Belgium
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
- Juergen Cox
  - Max Planck Institute of Biochemistry, Martinsried, Germany
- Jason Derks
  - Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single-Cell Proteomics Center and Barnett Institute, Northeastern University, Boston, MA, USA
- Edward Emmott
  - Centre for Proteome Research, Department of Biochemistry and Systems Biology, University of Liverpool, Liverpool, UK
- Alexander M Franks
  - Department of Statistics and Applied Probability, University of California Santa Barbara, Santa Barbara, CA, USA
- Alexander R Ivanov
  - Department of Chemistry and Chemical Biology, Barnett Institute of Chemical and Biological Analysis, Northeastern University, Boston, MA, USA
- Ryan T Kelly
  - Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT, USA
- Luke Khoury
  - Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single-Cell Proteomics Center and Barnett Institute, Northeastern University, Boston, MA, USA
- Andrew Leduc
  - Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single-Cell Proteomics Center and Barnett Institute, Northeastern University, Boston, MA, USA
- Peter Nemes
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, MD, USA
- David H Perlman
  - Merck Exploratory Science Center, Merck Sharp & Dohme Corp., Cambridge, MA, USA
- Aleksandra A Petelski
  - Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single-Cell Proteomics Center and Barnett Institute, Northeastern University, Boston, MA, USA
  - Parallel Squared Technology Institute, Watertown, MA, USA
- Christopher M Rose
  - Department of Microchemistry, Proteomics and Lipidomics, Genentech Inc., South San Francisco, CA, USA
- Erwin M Schoof
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Lyngby, Denmark
- Christophe Vanderaa
  - Computational Biology and Bioinformatics Unit, de Duve Institute, Université Catholique de Louvain, Brussels, Belgium
- John R Yates
  - Departments of Molecular Medicine and Neurobiology, the Scripps Research Institute, La Jolla, CA, USA
- Nikolai Slavov
  - Departments of Bioengineering, Biology, Chemistry and Chemical Biology, Single-Cell Proteomics Center and Barnett Institute, Northeastern University, Boston, MA, USA.
  - Parallel Squared Technology Institute, Watertown, MA, USA.
35. Latent class analysis of psychotic-affective disorders with data-driven plasma proteomics. Transl Psychiatry 2023; 13:44. [PMID: 36746927 PMCID: PMC9902608 DOI: 10.1038/s41398-023-02321-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/22/2022] [Revised: 01/09/2023] [Accepted: 01/13/2023] [Indexed: 02/08/2023] Open
Abstract
Data-driven approaches to subtype transdiagnostic samples are important for understanding heterogeneity within disorders and overlap between disorders. Thus, this study was conducted to determine whether plasma proteomics-based clustering could subtype patients with transdiagnostic psychotic-affective disorder diagnoses. The study population included 504 patients with schizophrenia, bipolar disorder, and major depressive disorder and 160 healthy controls, aged 19 to 65 years. Multiple reaction monitoring was performed using plasma samples from each individual. Pathologic peptides were determined by linear regression between patients and healthy controls. Latent class analysis was conducted in patients after peptide values were stratified by sex and divided into tertile values. Significant demographic and clinical characteristics were determined for the latent clusters. The latent class analysis was repeated when healthy controls were included. Twelve peptides were significantly different between the patients and healthy controls after controlling for significant covariates. Latent class analysis based on these peptides after stratification by sex revealed two distinct classes of patients. The negative symptom factor of the Brief Psychiatric Rating Scale was significantly different between the classes (t = -2.070, p = 0.039). When healthy controls were included, two latent classes were identified, and the negative symptom factor of the Brief Psychiatric Rating Scale was still significant (t = -2.372, p = 0.018). In conclusion, negative symptoms should be considered a significant biological aspect for understanding the heterogeneity and overlap of psychotic-affective disorders.
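The preprocessing step described above, in which peptide values are stratified by sex and divided into tertiles before latent class analysis, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the cut points were computed, so the use of `statistics.quantiles` with its default method, the helper name `sex_stratified_tertiles`, and the toy records are all hypothetical.

```python
import statistics
from collections import defaultdict


def sex_stratified_tertiles(records):
    """Map each subject id to a tertile (1, 2, or 3) computed within its sex group.

    `records` is an iterable of (subject_id, sex, value) tuples.
    """
    by_sex = defaultdict(list)
    for _subject_id, sex, value in records:
        by_sex[sex].append(value)

    # Two cut points per sex split that group's values into three bins.
    cuts = {sex: statistics.quantiles(vals, n=3) for sex, vals in by_sex.items()}

    tertiles = {}
    for subject_id, sex, value in records:
        low, high = cuts[sex]
        tertiles[subject_id] = 1 if value <= low else 2 if value <= high else 3
    return tertiles


# Hypothetical peptide values; the male group sits on a higher scale.
records = [
    ("s1", "F", 1.0), ("s2", "F", 2.0), ("s3", "F", 3.0),
    ("s4", "F", 4.0), ("s5", "F", 5.0), ("s6", "F", 6.0),
    ("s7", "M", 10.0), ("s8", "M", 20.0), ("s9", "M", 30.0),
    ("s10", "M", 40.0), ("s11", "M", 50.0), ("s12", "M", 60.0),
]
result = sex_stratified_tertiles(records)
```

Because the cut points are computed per sex, subject `s7` falls in the lowest tertile even though its raw value exceeds every value in the other group, which is the point of stratifying before binning.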
36. Sundararaman N, Bhat A, Venkatraman V, Binek A, Dwight Z, Ariyasinghe NR, Escopete S, Joung SY, Cheng S, Parker SJ, Fert-Bober J, Van Eyk JE. BIRCH: An Automated Workflow for Evaluation, Correction, and Visualization of Batch Effect in Bottom-Up Mass Spectrometry-Based Proteomics Data. J Proteome Res 2023; 22:471-481. [PMID: 36695565 DOI: 10.1021/acs.jproteome.2c00671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
Recent surges in large-scale mass spectrometry (MS)-based proteomics studies demand a concurrent rise in methods to facilitate reliable and reproducible data analysis. Quantification of proteins in MS analysis can be affected by variations in technical factors such as sample preparation and data acquisition conditions, leading to batch effects that add noise to the data set. This may in turn affect the validity of any biological conclusions derived from the data. Here we present Batch-effect Identification, Representation, and Correction of Heterogeneous data (BIRCH), a workflow for analysis and correction of batch effects through an automated, versatile, and easy-to-use web-based tool with the goal of eliminating technical variation. BIRCH also supports diagnosis of the data to check for the presence of batch effects, the feasibility of batch correction, and imputation to deal with missing values in the data set. To illustrate the relevance of the tool, we explore two case studies, an iPSC-derived cell study and a Covid vaccine study, to show different context-specific use cases. Ultimately, this tool can be used as a powerful approach for eliminating technical bias while retaining biological bias, toward understanding disease mechanisms and potential therapeutics.
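As a rough illustration of what correcting a batch effect means, the sketch below applies per-batch median centering to log-scale intensities of a single protein. This is a deliberately simple stand-in chosen for clarity, not the method BIRCH implements; the function name `median_center_by_batch` and the toy values are hypothetical.

```python
import statistics
from collections import defaultdict


def median_center_by_batch(log_intensities, batches):
    """Shift each batch so that its median matches the overall median."""
    overall = statistics.median(log_intensities)

    grouped = defaultdict(list)
    for value, batch in zip(log_intensities, batches):
        grouped[batch].append(value)

    # Offset of each batch median from the overall median.
    offsets = {b: statistics.median(v) - overall for b, v in grouped.items()}
    return [v - offsets[b] for v, b in zip(log_intensities, batches)]


# Hypothetical log2 intensities: batch B reads about 2 units higher than batch A.
values = [10.0, 10.5, 11.0, 12.0, 12.5, 13.0]
batches = ["A", "A", "A", "B", "B", "B"]
corrected = median_center_by_batch(values, batches)
```

Centering of this kind also removes any biological difference that happens to coincide with batch, which is why checking whether batch correction is feasible for a given design matters before applying it.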
Affiliation(s)
- Niveda Sundararaman
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Archana Bhat
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Vidya Venkatraman
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Aleksandra Binek
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Zachary Dwight
  - Precision Biomarker Laboratories, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Nethika R Ariyasinghe
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Sean Escopete
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Sandy Y Joung
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Susan Cheng
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Sarah J Parker
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Justyna Fert-Bober
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Jennifer E Van Eyk
  - Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States; Advanced Clinical Biosystems Research Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
37. Vanderaa C, Gatto L. The Current State of Single-Cell Proteomics Data Analysis. Curr Protoc 2023; 3:e658. [PMID: 36633424 DOI: 10.1002/cpz1.658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
Abstract
Sound data analysis is essential to retrieve meaningful biological information from single-cell proteomics experiments. This analysis is carried out by computational methods that are assembled into workflows, and their implementations influence the conclusions that can be drawn from the data. In this work, we explore and compare the computational workflows that have been used over the last four years and identify a profound lack of consensus on how to analyze single-cell proteomics data. We highlight the need for benchmarking of computational workflows and standardization of computational tools and data, as well as carefully designed experiments. Finally, we cover the current standardization efforts that aim to fill the gap, list the remaining missing pieces, and conclude with lessons learned from the replication of published single-cell proteomics analyses. © 2023 Wiley Periodicals LLC.
Affiliation(s)
- Christophe Vanderaa
  - Computational Biology and Bioinformatics Unit (CBIO), de Duve Institute, Université catholique de Louvain, Belgium
- Laurent Gatto
  - Computational Biology and Bioinformatics Unit (CBIO), de Duve Institute, Université catholique de Louvain, Belgium
38. Staudt DE, Murray HC, Skerrett-Byrne DA, Smith ND, Jamaluddin MFB, Kahl RGS, Duchatel RJ, Germon ZP, McLachlan T, Jackson ER, Findlay IJ, Kearney PS, Mannan A, McEwen HP, Douglas AM, Nixon B, Verrills NM, Dun MD. Phospho-heavy-labeled-spiketide FAIMS stepped-CV DDA (pHASED) provides real-time phosphoproteomics data to aid in cancer drug selection. Clin Proteomics 2022; 19:48. [PMID: 36536316 PMCID: PMC9762002 DOI: 10.1186/s12014-022-09385-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022] Open
Abstract
Global high-throughput phosphoproteomic profiling is increasingly being applied to cancer specimens to identify the oncogenic signaling cascades responsible for promoting disease initiation and disease progression; pathways that are often invisible to genomics analysis. Hence, phosphoproteomic profiling has enormous potential to inform and improve individualized anti-cancer treatment strategies. However, to achieve the adequate phosphoproteomic depth and coverage necessary to identify the activated, and hence targetable, kinases responsible for driving oncogenic signaling pathways, affinity phosphopeptide enrichment techniques are required and often coupled with offline high-pressure liquid chromatographic (HPLC) separation prior to nanoflow liquid chromatography-tandem mass spectrometry (nLC-MS/MS). These complex and time-consuming procedures limit the utility of phosphoproteomics for the analysis of individual cancer patient specimens in real-time, and restrict phosphoproteomics to specialized laboratories often outside of the clinical setting. To address these limitations, here we have optimized a new protocol, phospho-heavy-labeled-spiketide FAIMS Stepped-CV DDA (pHASED), that employs online phosphoproteome deconvolution using high-field asymmetric waveform ion mobility spectrometry (FAIMS) and internal phosphopeptide standards to provide accurate label-free quantitation (LFQ) data in real-time. Compared with traditional single-shot LFQ phosphoproteomics workflows, pHASED provided increased phosphoproteomic depth and coverage (phosphopeptides = 4617 pHASED, 2789 LFQ), whilst eliminating the variability associated with offline prefractionation. pHASED was optimized using tyrosine kinase inhibitor (sorafenib) resistant isogenic FLT3-mutant acute myeloid leukemia (AML) cell line models. Bioinformatic analysis identified differential activation of the serine/threonine protein kinase ataxia-telangiectasia mutated (ATM) pathway, responsible for sensing and repairing DNA damage in sorafenib-resistant AML cell line models, thereby uncovering a potential therapeutic opportunity. Herein, we have optimized a rapid, reproducible, and flexible protocol for the characterization of complex cancer phosphoproteomes in real-time, a step towards the implementation of phosphoproteomics in the clinic to aid in the selection of anti-cancer therapies for patients.
Affiliation(s)
- Dilana E. Staudt
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Heather C. Murray
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- David A. Skerrett-Byrne
- School of Environmental and Life Sciences, College of Engineering, Science and Environment, University of Newcastle, Callaghan, NSW 2308 Australia; Infertility and Reproduction Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Nathan D. Smith
- Analytical and Biomolecular Research Facility (ABRF), Research Services, University of Newcastle, Callaghan, NSW 2308 Australia
- M. Fairuz B. Jamaluddin
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia
- Richard G. S. Kahl
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia
- Ryan J. Duchatel
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Zacary P. Germon
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Tabitha McLachlan
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Evangeline R. Jackson
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Izac J. Findlay
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Padraic S. Kearney
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Abdul Mannan
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Holly P. McEwen
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Alicia M. Douglas
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia
- Brett Nixon
- School of Environmental and Life Sciences, College of Engineering, Science and Environment, University of Newcastle, Callaghan, NSW 2308 Australia; Infertility and Reproduction Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Nicole M. Verrills
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Matthew D. Dun
- School of Biomedical Sciences and Pharmacy, College of Health, Medicine and Wellbeing, University of Newcastle, Callaghan, NSW 2308 Australia; Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
39
Leprêtre M, Geffard O, Espeyte A, Faugere J, Ayciriex S, Salvador A, Delorme N, Chaumot A, Degli-Esposti D. Multiple reaction monitoring mass spectrometry for the discovery of environmentally modulated proteins in an aquatic invertebrate sentinel species, Gammarus fossarum. Environ Pollut 2022; 315:120393. [PMID: 36223854 DOI: 10.1016/j.envpol.2022.120393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Revised: 10/03/2022] [Accepted: 10/05/2022] [Indexed: 06/16/2023]
Abstract
Multiple reaction monitoring (MRM) mass spectrometry is emerging as a relevant tool for measuring customized molecular markers in freshwater sentinel species. While this technique is typically used for the validation of protein molecular markers preselected from shotgun experiments, recent gains of MRM multiplexing capacity offer new possibilities to conduct large-scale screening of animal proteomes. By combining the strength of active biomonitoring strategies and MRM technologies, this study aims to propose a new strategy for the discovery of candidate proteins that respond to environmental variability. For this purpose, 249 peptides derived from 147 proteins were monitored by MRM in 273 male gammarids caged in 56 environmental sites, representative of the diversity of French water bodies. A methodology is here proposed to identify a set of customized housekeeping peptides (HKPs) used to correct analytical batch effects and allow proper comparison of peptide levels in gammarids. A comparative analysis performed on HKPs-normalized data resulted in the identification of peptides highly modulated in the environment and derived from proteins likely involved in the environmental stress response. Overall, this study proposes a breakthrough approach to screen and identify potential proteins responding to relevant environmental conditions in sentinel species.
Affiliation(s)
- Maxime Leprêtre
- INRAE, UR RiverLy, Laboratoire d'écotoxicologie, F-69625, Villeurbanne, France
- Olivier Geffard
- INRAE, UR RiverLy, Laboratoire d'écotoxicologie, F-69625, Villeurbanne, France
- Anabelle Espeyte
- INRAE, UR RiverLy, Laboratoire d'écotoxicologie, F-69625, Villeurbanne, France
- Julien Faugere
- Université de Lyon, Université Claude Bernard Lyon 1, Institut des Sciences Analytiques, CNRS UMR 5280, 5 rue de la Doua, F-69100, Villeurbanne, France
- Sophie Ayciriex
- Université de Lyon, Université Claude Bernard Lyon 1, Institut des Sciences Analytiques, CNRS UMR 5280, 5 rue de la Doua, F-69100, Villeurbanne, France
- Arnaud Salvador
- Université de Lyon, Université Claude Bernard Lyon 1, Institut des Sciences Analytiques, CNRS UMR 5280, 5 rue de la Doua, F-69100, Villeurbanne, France
- Nicolas Delorme
- INRAE, UR RiverLy, Laboratoire d'écotoxicologie, F-69625, Villeurbanne, France
- Arnaud Chaumot
- INRAE, UR RiverLy, Laboratoire d'écotoxicologie, F-69625, Villeurbanne, France
40
King CD, Kapp KL, Arul AB, Choi MJ, Robinson RAS. Advancements in automation for plasma proteomics sample preparation. Mol Omics 2022; 18:828-839. [PMID: 36048090 PMCID: PMC9879274 DOI: 10.1039/d2mo00122e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
Automation is necessary to increase sample processing throughput for large-scale clinical analyses. Replacement of manual pipettes with robotic liquid handler systems is especially helpful in processing blood-based samples, such as plasma and serum. These samples are very heterogeneous, and protein expression can vary greatly from sample to sample, even for healthy controls. Detection of true biological changes requires that variation from sample preparation steps and downstream analytical detection methods, such as mass spectrometry, remains low. In this mini-review, we discuss plasma proteomics protocols and the benefits of automation towards enabling detection of low-abundance proteins and providing low sample error and increased sample throughput. This discussion includes considerations for automation of major sample depletion and/or enrichment strategies for plasma toward mass spectrometry detection.
Affiliation(s)
- Christina D King
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, USA
- Kathryn L Kapp
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, USA
- Vanderbilt Institute of Chemical Biology, Vanderbilt University, Nashville, Tennessee 37232, USA
- Albert B Arul
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, USA
- Min Ji Choi
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, USA
- Renã A S Robinson
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, USA
- Vanderbilt Institute of Chemical Biology, Vanderbilt University, Nashville, Tennessee 37232, USA
- Department of Neurology, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
- Vanderbilt Memory & Alzheimer's Center, Vanderbilt University Medical Center, Nashville, Tennessee 37212, USA
- Vanderbilt Brain Institute, Vanderbilt University, Nashville, Tennessee 37232, USA
41
Adamer MF, Brüningk SC, Tejada-Arranz A, Estermann F, Basler M, Borgwardt K. reComBat: batch-effect removal in large-scale multi-source gene-expression data integration. Bioinform Adv 2022; 2:vbac071. [PMID: 36699372 PMCID: PMC9710604 DOI: 10.1093/bioadv/vbac071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 09/01/2022] [Accepted: 09/26/2022] [Indexed: 01/28/2023]
Abstract
Motivation: With the steadily increasing abundance of omics data produced all over the world under vastly different experimental conditions residing in public databases, a crucial step in many data-driven bioinformatics applications is that of data integration. The challenge of batch-effect removal for entire databases lies in the large number of batches and biological variation, which can result in design matrix singularity. This problem cannot currently be solved satisfactorily by any common batch-correction algorithm.
Results: We present reComBat, a regularized version of the empirical Bayes method to overcome this limitation and benchmark it against popular approaches for the harmonization of public gene-expression data (both microarray and bulk RNA-seq) of the human opportunistic pathogen Pseudomonas aeruginosa. Batch effects are successfully mitigated while biologically meaningful gene-expression variation is retained. reComBat fills the gap in batch-correction approaches applicable to large-scale, public omics databases and opens up new avenues for data-driven analysis of complex biological processes beyond the scope of a single study.
Availability and implementation: The code is available at https://github.com/BorgwardtLab/reComBat; all data and evaluation code can be found at https://github.com/BorgwardtLab/batchCorrectionPublicData.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Marek Basler
- Biozentrum, University of Basel, Basel 4056, Switzerland
- Karsten Borgwardt
- Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland; Swiss Institute for Bioinformatics (SIB), Lausanne 1015, Switzerland
42
What can scatterplots teach us about doing data science better? Int J Data Sci Anal 2022. [DOI: 10.1007/s41060-022-00362-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
43
Phua SX, Lim KP, Goh WWB. Perspectives for better batch effect correction in mass-spectrometry-based proteomics. Comput Struct Biotechnol J 2022; 20:4369-4375. [PMID: 36051874 PMCID: PMC9411064 DOI: 10.1016/j.csbj.2022.08.022] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/14/2022] [Revised: 08/09/2022] [Accepted: 08/09/2022] [Indexed: 11/08/2022]
Abstract
Mass-spectrometry-based proteomics presents some unique challenges for batch effect correction. Batch effects are technical sources of variation that can confound analysis and are usually non-biological in nature. As proteomic analysis involves several stages of data transformation from spectra to protein, the decision on when and what to apply batch correction on is often unclear. Here, we explore several issues pertinent to batch effect correction. The first involves applications of batch effect correction requiring prior knowledge on batch factors and exploring data to uncover new/unknown batch factors. The second considers recent literature that suggests there is no single best batch effect correction algorithm; i.e., instead of a best approach, one may instead ask, what is a suitable approach. The third section considers issues of batch effect detection. And finally, we look at potential developments for proteomic-specific batch effect correction methods and how to do better functional evaluations on batch corrected data.
Affiliation(s)
- Ser-Xian Phua
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore
- Kai-Peng Lim
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore
- Wilson Wen-Bin Goh
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore
- Center for Biomedical Informatics, Nanyang Technological University, Singapore
44
Characterisation of a Novel Cell Line (ICR-SS-1) Established from a Patient-Derived Xenograft of Synovial Sarcoma. Cells 2022; 11:cells11152418. [PMID: 35954262 PMCID: PMC9368503 DOI: 10.3390/cells11152418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Revised: 07/28/2022] [Accepted: 08/03/2022] [Indexed: 12/04/2022]
Abstract
Synovial sarcoma is a rare translocation-driven cancer with poor survival outcomes, particularly in the advanced setting. Previous synovial sarcoma preclinical studies have relied on a small panel of cell lines which suffer from the limitation of genomic and phenotypic drift as a result of being grown in culture for decades. Patient-derived xenografts (PDX) are a valuable tool for preclinical research as they retain many histopathological features of their originating human tumour; however, this approach is expensive, slow, and resource intensive, which hinders their utility in large-scale functional genomic and drug screens. To address some of these limitations, in this study, we have established and characterised a novel synovial sarcoma cell line, ICR-SS-1, which is derived from a PDX model and is amenable to high-throughput drug screens. We show that ICR-SS-1 grows readily in culture, retains the pathognomonic SS18::SSX1 fusion gene, and recapitulates the molecular features of human synovial sarcoma tumours as shown by proteomic profiling. Comparative analysis of drug response profiles with two other established synovial sarcoma cell lines (SYO-1 and HS-SY-II) finds that ICR-SS-1 harbours intrinsic resistance to doxorubicin and is sensitive to targeted inhibition of several oncogenic pathways including the PI3K-mTOR pathway. Collectively, our studies show that the ICR-SS-1 cell line model may be a valuable preclinical tool for studying the biology of anthracycline-resistant synovial sarcoma and identifying new salvage therapies following failure of doxorubicin.
45
Wang LR, Choy XY, Goh WWB. Doppelgänger Spotting in Biomedical Gene Expression Data. iScience 2022; 25:104788. [PMID: 35992056 PMCID: PMC9382272 DOI: 10.1016/j.isci.2022.104788] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/05/2022] [Revised: 05/13/2022] [Accepted: 07/13/2022] [Indexed: 11/29/2022]
Abstract
Doppelgänger effects (DEs) occur when samples exhibit chance similarities such that, when they are split across training and validation sets, the performance of the trained machine learning (ML) model is inflated. This inflationary effect causes misleading confidence in the deployability of the model. Thus far, there are no tools for doppelgänger identification or standard practices to manage their confounding implications. We present doppelgangerIdentifier, a software suite for doppelgänger identification and verification. Applying doppelgangerIdentifier across a multitude of diseases and data types, we show the pervasive nature of DEs in biomedical gene expression data. We also provide guidelines toward proper doppelgänger identification by exploring the ramifications of lingering batch effects from batch imbalances on the sensitivity of our doppelgänger identification algorithm. We suggest doppelgänger verification as a useful procedure to establish baselines for model evaluation that may inform on whether feature selection and ML on the data set may yield meaningful insights.
- Doppelgänger effects inflate the machine learning performance
- Doppelgänger effects exist in RNA-Seq and microarray gene expression data
- Developed doppelgangerIdentifier, a software to identify and verify doppelgängers
- Provide guidelines for proper doppelgänger identification
Affiliation(s)
- Li Rong Wang
- School of Computer Science and Engineering, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore
- Xin Yun Choy
- School of Computer Science and Engineering, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore
- Wilson Wen Bin Goh
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore
- Lee Kong Chian School of Medicine, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore
- Centre for Biomedical Informatics, Nanyang Technological University, 60 Nanyang Drive, 637551, Singapore
- Corresponding author
46
HarmonizR enables data harmonization across independent proteomic datasets with appropriate handling of missing values. Nat Commun 2022; 13:3523. [PMID: 35725563 PMCID: PMC9209422 DOI: 10.1038/s41467-022-31007-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 08/24/2021] [Accepted: 05/25/2022] [Indexed: 01/01/2023]
Abstract
Dataset integration is common practice to overcome limitations in statistically underpowered omics datasets. Proteome datasets display high technical variability and frequent missing values. Sophisticated strategies for batch effect reduction are lacking or rely on error-prone data imputation. Here we introduce HarmonizR, a data harmonization tool with appropriate missing value handling. The method exploits the structure of available data and matrix dissection for minimal data loss, without data imputation. This strategy implements two common batch effect reduction methods: ComBat and limma (removeBatchEffect()). The HarmonizR strategy, evaluated on four exemplarily analyzed datasets with up to 23 batches, demonstrated successful data harmonization for different tissue preservation techniques, LC-MS/MS instrumentation setups, and quantification approaches. Compared to data imputation methods, HarmonizR was more efficient and performed better in the detection of significant proteins. HarmonizR is an efficient tool for missing data tolerant experimental variance reduction and is easily adjustable for individual dataset properties and user preferences.
Dataset integration is common practice to overcome limitations in statistically underpowered omics datasets. Here the authors present "HarmonizR", a tool for missing data tolerant experimental variance reduction in large, integrated but independently generated datasets without data imputation, adjustable for individual dataset modalities, correction algorithm, and user preferences.
47
Proteomic Profiling Identifies Co-Regulated Expression of Splicing Factors as a Characteristic Feature of Intravenous Leiomyomatosis. Cancers (Basel) 2022; 14:cancers14122907. [PMID: 35740573 PMCID: PMC9221257 DOI: 10.3390/cancers14122907] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/19/2022] [Accepted: 06/07/2022] [Indexed: 11/17/2022]
Abstract
Intravenous leiomyomatosis (IVLM) is a rare benign smooth muscle tumour that is characterised by intravenous growth in the uterine and pelvic veins. Previous DNA copy number and transcriptomic studies have shown that IVLM harbors unique genomic and transcriptomic alterations when compared to uterine leiomyoma (uLM), which may account for their distinct clinical behaviour. Here we undertake the first comparative proteomic analysis of IVLM and other smooth muscle tumours (comprising uLM, soft tissue leiomyoma and benign metastasizing leiomyoma) utilising data-independent acquisition mass spectrometry. We show that, at the protein level, IVLM is defined by the unique co-regulated expression of splicing factors. In particular, IVLM is enriched in two clusters composed of co-regulated proteins from the hnRNP, LSm, SR and Sm classes of the spliceosome complex. One of these clusters (Cluster 3) is associated with key biological processes including nascent protein translocation and cell signalling by small GTPases. Taken together, our study provides evidence of co-regulated expression of splicing factors in IVLM compared to other smooth muscle tumours, which suggests a possible role for alternative splicing in the pathogenesis of IVLM.
48
Gürsoy UK, Kantarci A. Molecular biomarker research in periodontology: A roadmap for translation of science to clinical assay validation. J Clin Periodontol 2022; 49:556-561. [PMID: 35322451 PMCID: PMC9321848 DOI: 10.1111/jcpe.13617] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/09/2022] [Revised: 02/20/2022] [Accepted: 03/13/2022] [Indexed: 12/11/2022]
Abstract
The number of studies that aim to apply host- or microbe-derived biochemical biomarkers to periodontal disease diagnosis has increased significantly during the last three decades. The biochemical markers can reflect the presence, severity, and activity of periodontal diseases; however, heterogeneities in applied laboratory methods, data presentation, statistical analysis, and data interpretation prevent the translation of candidate host- or microbe-derived biochemical biomarkers to clinical assay validation. Here, we propose a roadmap for making the research outcomes comparable and re-analysable with the ultimate goal of translating research to clinical practice. This roadmap presents reporting recommendations for host- or microbe-derived biochemical biomarker studies in periodontology. We aim to make essential elements of the research work (including diagnostic criteria, clinical endpoint definitions, participant recruitment criteria, sample collection and storage techniques, biochemical and microbiological detection methods, and applied statistical analysis) visible and comparable.
Affiliation(s)
- Ulvi Kahraman Gürsoy
- Department of Periodontology, Institute of Dentistry, University of Turku, Turku, Finland
- Alpdogan Kantarci
- The Forsyth Institute, Cambridge, Massachusetts, USA; School of Dental Medicine, Harvard University, Boston, Massachusetts, USA
49
Are batch effects still relevant in the age of big data? Trends Biotechnol 2022; 40:1029-1040. [DOI: 10.1016/j.tibtech.2022.02.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/10/2021] [Revised: 02/13/2022] [Accepted: 02/18/2022] [Indexed: 12/30/2022]
50
Williams EG, Pfister N, Roy S, Statzer C, Haverty J, Ingels J, Bohl C, Hasan M, Čuklina J, Bühlmann P, Zamboni N, Lu L, Ewald CY, Williams RW, Aebersold R. Multiomic profiling of the liver across diets and age in a diverse mouse population. Cell Syst 2022; 13:43-57.e6. [PMID: 34666007 PMCID: PMC8776606 DOI: 10.1016/j.cels.2021.09.005] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/24/2021] [Revised: 07/12/2021] [Accepted: 09/14/2021] [Indexed: 01/21/2023]
Abstract
We profiled the liver transcriptome, proteome, and metabolome in 347 individuals from 58 isogenic strains of the BXD mouse population across age (7 to 24 months) and diet (low or high fat) to link molecular variations to metabolic traits. Several hundred genes are affected by diet and/or age at the transcript and protein levels. Orthologs of two aging-associated genes, St7 and Ctsd, were knocked down in C. elegans, reducing longevity in wild-type and mutant long-lived strains. The multiomics data were analyzed as segregating gene networks according to each independent variable, providing causal insight into dietary and aging effects. Candidates were cross-examined in an independent diversity outbred mouse liver dataset segregating for similar diets, with ∼80%-90% of diet-related candidate genes found in common across datasets. Together, we have developed a large multiomics resource for multivariate analysis of complex traits and demonstrate a methodology for moving from observational associations to causal connections.
Affiliation(s)
- Evan G Williams
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Niklas Pfister
- Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark
- Suheeta Roy
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Cyril Statzer
- Department of Health Sciences and Technology, ETH Zürich, Zurich, Switzerland
- Jack Haverty
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Jesse Ingels
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Casey Bohl
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Moaraj Hasan
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zurich, Switzerland
- Jelena Čuklina
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zurich, Switzerland
- Peter Bühlmann
- Department of Mathematics, Seminar for Statistics, ETH Zürich, Zurich, Switzerland
- Nicola Zamboni
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zurich, Switzerland
- Lu Lu
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Collin Y Ewald
- Department of Health Sciences and Technology, ETH Zürich, Zurich, Switzerland
- Robert W Williams
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Ruedi Aebersold
- Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zurich, Switzerland; Faculty of Science, University of Zürich, Zurich, Switzerland