1
Ozonoff A, Schaenman J, Jayavelu ND, Milliren CE, Calfee CS, Cairns CB, Kraft M, Baden LR, Shaw AC, Krammer F, van Bakel H, Esserman DA, Liu S, Sesma AF, Simon V, Hafler DA, Montgomery RR, Kleinstein SH, Levy O, Bime C, Haddad EK, Erle DJ, Pulendran B, Nadeau KC, Davis MM, Hough CL, Messer WB, Higuita NIA, Metcalf JP, Atkinson MA, Brakenridge SC, Corry D, Kheradmand F, Ehrlich LI, Melamed E, McComsey GA, Sekaly R, Diray-Arce J, Peters B, Augustine AD, Reed EF, McEnaney K, Barton B, Lentucci C, Saluvan M, Chang AC, Hoch A, Albert M, Shaheen T, Kho AT, Thomas S, Chen J, Murphy MD, Cooney M, Presnell S, Fragiadakis GK, Patel R, Guan L, Gygi J, Pawar S, Brito A, Khalil Z, Maguire C, Fourati S, Overton JA, Vita R, Westendorf K, Salehi-Rad R, Leligdowicz A, Matthay MA, Singer JP, Kangelaris KN, Hendrickson CM, Krummel MF, Langelier CR, Woodruff PG, Powell DL, Kim JN, Simmons B, Goonewardene IM, Smith CM, Martens M, Mosier J, Kimura H, Sherman AC, Walsh SR, Issa NC, Dela Cruz C, Farhadian S, Iwasaki A, Ko AI, Chinthrajah S, Ahuja N, Rogers AJ, Artandi M, Siegel SA, Lu Z, Drevets DA, Brown BR, Anderson ML, Guirgis FW, Thyagarajan RV, Rousseau JF, Wylie D, Busch J, Gandhi S, Triplett TA, Yendewa G, Giddings O, Anderson EJ, Mehta AK, Sevransky JE, Khor B, Rahman A, Stadlbauer D, Dutta J, Xie H, Kim-Schulze S, Gonzalez-Reiche AS, van de Guchte A, Farrugia K, Khan Z, Maecker HT, Elashoff D, Brook J, Ramires-Sanchez E, Llamas M, Rivera A, Perdomo C, Ward DC, Magyar CE, Fulcher JA, Abe-Jones Y, Asthana S, Beagle A, Bhide S, Carrillo SA, Chak S, Fragiadakis GK, Ghale R, Gonzalez A, Jauregui A, Jones N, Lea T, Lee D, Lota R, Milush J, Nguyen V, Pierce L, Prasad PA, Rao A, Samad B, Shaw C, Sigman A, Sinha P, Ward A, Willmore A, Zhan J, Rashid S, Rodriguez N, Tang K, Altamirano LT, Betancourt L, Curiel C, Sutter N, Paz MT, Tietje-Ulrich G, Leroux C, Connors J, Bernui M, Kutzler MA, Edwards C, Lee E, Lin E, Croen B, Semenza NC, Rogowski B, Melnyk N, Woloszczuk K, Cusimano G, Bell MR, Furukawa S, McLin R, Marrero P, Sheidy J, Tegos GP, Nagle C, Mege N, Ulring K, Seyfert-Margolis V, Conway M, Francisco D, Molzahn A, Erickson H, Wilson CC, Schunk R, Sierra B, Hughes T, Smolen K, Desjardins M, van Haren S, Mitre X, Cauley J, Li X, Tong A, Evans B, Montesano C, Licona JH, Krauss J, Chang JBP, Izaguirre N, Chaudhary O, Coppi A, Fournier J, Mohanty S, Muenker MC, Nelson A, Raddassi K, Rainone M, Ruff WE, Salahuddin S, Schulz WL, Vijayakumar P, Wang H, Wunder Jr. E, Young HP, Zhao Y, Saksena M, Altman D, Kojic E, Srivastava K, Eaker LQ, Bermúdez-González MC, Beach KF, Sominsky LA, Azad AR, Carreño JM, Singh G, Raskin A, Tcheou J, Bielak D, Kawabata H, Mulder LCF, Kleiner G, Lee AS, Do ED, Fernandes A, Manohar M, Hagan T, Blish CA, Din HN, Roque J, Yang S, Brunton A, Sullivan PE, Strnad M, Lyski ZL, Coulter FJ, Booth JL, Sinko LA, Moldawer LL, Borresen B, Roth-Manning B, Song LZ, Nelson E, Lewis-Smith M, Smith J, Tipan PG, Siles N, Bazzi S, Geltman J, Hurley K, Gabriele G, Sieg S, Vaysman T, Bristow L, Hussaini L, Hellmeister K, Samaha H, Cheng A, Spainhour C, Scherer EM, Johnson B, Bechnak A, Ciric CR, Hewitt L, Carter E, Mcnair N, Panganiban B, Huerta C, Usher J, Ribeiro SP, Altman MC, Becker PM, Rouphael N. Phenotypes of disease severity in a cohort of hospitalized COVID-19 patients: Results from the IMPACC study. EBioMedicine 2022; 83:104208. [PMID: 35952496 PMCID: PMC9359694 DOI: 10.1016/j.ebiom.2022.104208] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 03/24/2022] [Revised: 07/11/2022] [Accepted: 07/25/2022] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Better understanding of the association between characteristics of patients hospitalized with coronavirus disease 2019 (COVID-19) and outcome is needed to further improve upon patient management. METHODS Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) is a prospective, observational study of 1164 patients from 20 hospitals across the United States. Disease severity was assessed using a 7-point ordinal scale based on degree of respiratory illness. Patients were prospectively surveyed for 1 year after discharge for post-acute sequelae of COVID-19 (PASC) through quarterly surveys. Demographics, comorbidities, radiographic findings, clinical laboratory values, SARS-CoV-2 PCR and serology were captured over a 28-day period. Multivariable logistic regression was performed. FINDINGS The median age was 59 years (interquartile range [IQR] 20); 711 (61%) were men; overall mortality was 14%, and 228 (20%) required invasive mechanical ventilation. Unsupervised clustering of ordinal score over time revealed distinct disease course trajectories. Risk factors associated with prolonged hospitalization or death by day 28 included age ≥ 65 years (odds ratio [OR], 2.01; 95% CI 1.28-3.17), Hispanic ethnicity (OR, 1.71; 95% CI 1.13-2.57), elevated baseline creatinine (OR 2.80; 95% CI 1.63-4.80) or troponin (OR 1.89; 95% CI 1.03-3.47), baseline lymphopenia (OR 2.19; 95% CI 1.61-2.97), presence of infiltrate by chest imaging (OR 3.16; 95% CI 1.96-5.10), and high SARS-CoV-2 viral load (OR 1.53; 95% CI 1.17-2.00). Fatal cases had the lowest ratio of SARS-CoV-2 antibody to viral load levels compared to other trajectories over time (p=0.001). 589 survivors (51%) completed at least one survey at follow-up with 305 (52%) having at least one symptom consistent with PASC, most commonly dyspnea (56% among symptomatic patients). Female sex was the only associated risk factor for PASC.
INTERPRETATION Integration of PCR cycle threshold and antibody values with demographics, comorbidities, and laboratory/radiographic findings identified risk factors for 28-day outcome severity, though only female sex was associated with PASC. Longitudinal clinical phenotyping offers important insights and provides a framework for immunophenotyping for acute and long COVID-19. FUNDING NIH.
Observational Study
2
Gygi JP, Yu Q, Navarrete-Perea J, Rad R, Gygi SP, Paulo JA. Web-Based Search Tool for Visualizing Instrument Performance Using the Triple Knockout (TKO) Proteome Standard. J Proteome Res 2018; 18:687-693. [PMID: 30451507 DOI: 10.1021/acs.jproteome.8b00737] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Indexed: 01/08/2023]
Abstract
Multiplexing strategies are at the forefront of mass-spectrometry-based proteomics, with SPS-MS3 methods becoming increasingly commonplace. A known caveat of isobaric multiplexing is interference resulting from coisolated and cofragmented ions that do not originate from the selected precursor of interest. The triple knockout (TKO) standard was designed to benchmark data collection strategies to minimize interference. However, a limitation to its widespread use has been the lack of an automated analysis platform. We present a TKO Visualization Tool (TVT). The TVT viewer allows for automated, web-based, database searching of the TKO standard, returning traditional figures of merit, such as peptide and protein counts, scan-specific ion accumulation times, as well as the TKO-specific metric, the IFI (interference-free index). Moreover, the TVT viewer allows for plotting of two TKO standards to assess protocol optimizations, compare instruments, or measure degradation of instrument performance over time. We showcase the TVT viewer by probing the selection of (1) stationary phase resin, (2) MS2 isolation window width, and (3) number of synchronous precursor selection (SPS) ions for SPS-MS3 analysis. Using the TVT viewer will allow the proteomics community to search and compare TKO results to optimize user-specific data collection workflows.
Research Support, N.I.H., Extramural
3
Diray-Arce J, Fourati S, Doni Jayavelu N, Patel R, Maguire C, Chang AC, Dandekar R, Qi J, Lee BH, van Zalm P, Schroeder A, Chen E, Konstorum A, Brito A, Gygi JP, Kho A, Chen J, Pawar S, Gonzalez-Reiche AS, Hoch A, Milliren CE, Overton JA, Westendorf K, Cairns CB, Rouphael N, Bosinger SE, Kim-Schulze S, Krammer F, Rosen L, Grubaugh ND, van Bakel H, Wilson M, Rajan J, Steen H, Eckalbar W, Cotsapas C, Langelier CR, Levy O, Altman MC, Maecker H, Montgomery RR, Haddad EK, Sekaly RP, Esserman D, Ozonoff A, Becker PM, Augustine AD, Guan L, Peters B, Kleinstein SH. Multi-omic longitudinal study reveals immune correlates of clinical course among hospitalized COVID-19 patients. Cell Rep Med 2023; 4:101079. [PMID: 37327781 PMCID: PMC10203880 DOI: 10.1016/j.xcrm.2023.101079] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 08/30/2022] [Revised: 01/31/2023] [Accepted: 05/16/2023] [Indexed: 06/18/2023]
Abstract
The IMPACC cohort, composed of >1,000 hospitalized COVID-19 participants, contains five illness trajectory groups (TGs) during acute infection (first 28 days), ranging from milder (TG1-3) to more severe disease course (TG4) and death (TG5). Here, we report deep immunophenotyping, profiling of >15,000 longitudinal blood and nasal samples from 540 participants of the IMPACC cohort, using 14 distinct assays. These unbiased analyses identify cellular and molecular signatures present within 72 h of hospital admission that distinguish moderate from severe and fatal COVID-19 disease. Importantly, cellular and molecular states also distinguish participants with more severe disease that recover or stabilize within 28 days from those that progress to fatal outcomes (TG4 vs. TG5). Furthermore, our longitudinal design reveals that these biologic states display distinct temporal patterns associated with clinical outcomes. Characterizing host immune responses in relation to heterogeneity in disease course may inform clinical prognosis and opportunities for intervention.
Research Support, N.I.H., Extramural
4
Gygi JP, Rad R, Navarrete-Perea J, Younesi S, Gygi SP, Paulo JA. A Triple Knockout Isobaric-Labeling Quality Control Platform with an Integrated Online Database Search. J Am Soc Mass Spectrom 2020; 31:1344-1349. [PMID: 32202424 PMCID: PMC7332369 DOI: 10.1021/jasms.0c00029] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 05/24/2023]
Abstract
Sample multiplexing using isobaric tagging is a powerful strategy for proteome-wide protein quantification. One major caveat of isobaric tagging is ratio compression that results from the isolation, fragmentation, and quantification of coeluting, near-isobaric peptides, a phenomenon typically referred to as "ion interference". A robust platform to ensure quality control, optimize parameters, and enable comparisons across samples is essential as new instrumentation and analytical methods evolve. Here, we introduce TKO-iQC, an integrated platform consisting of the Triple Knockout (TKO) yeast digest standard and an automated web-based database search and protein profile visualization application. We highlight two new TKO standards based on the TMTpro reagent (TKOpro9 and TKOpro16) as well as an updated TKO Viewing Tool, TVT2.0. TKO-iQC greatly facilitates the comparison of instrument performance with a straightforward and streamlined workflow.
research-article
5
Navarrete-Perea J, Liu X, Rad R, Gygi JP, Gygi SP, Paulo JA. Assessing interference in isobaric tag-based sample multiplexing using an 18-plex interference standard. Proteomics 2021; 22:e2100317. [PMID: 34918453 DOI: 10.1002/pmic.202100317] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/01/2021] [Revised: 12/09/2021] [Accepted: 12/09/2021] [Indexed: 11/06/2022]
Abstract
Reporter ion interference remains a limitation of isobaric tag-based sample multiplexing. Advances in instrumentation and data acquisition modes, such as the recently developed real-time database search (RTS), can reduce interference. However, interference persists as does the need to benchmark upstream sample preparation and data acquisition strategies. Here, we present an updated Triple yeast KnockOut (TKO) standard as well as corresponding upgrades to the TKO Viewing Tool (TVT2.5, http://tko.hms.harvard.edu/). Specifically, we expand the TKO standard to incorporate the TMTpro18-plex reagents (TKO18). We also construct a variant thereof which has been digested only with LysC (TKO18L). We compare proteome coverage and interference levels of TKO18 and TKO18L data that are acquired under different data acquisition modes and analyzed using TVT2.5. Our data illustrate that RTS reduces interference while improving proteome coverage and suggest that digesting with LysC alone only modestly reduces interference, albeit at the expense of proteome depth. Collectively, the two new TKO standards coupled with the updated TVT represent a convenient and versatile platform for assessing and developing methods to reduce interference in isobaric tag-based experiments.
6
Diray-Arce J, Miller HER, Henrich E, Gerritsen B, Mulè MP, Fourati S, Gygi J, Hagan T, Tomalin L, Rychkov D, Kazmin D, Chawla DG, Meng H, Dunn P, Campbell J, Sarwal M, Tsang JS, Levy O, Pulendran B, Sekaly R, Floratos A, Gottardo R, Kleinstein SH, Suárez-Fariñas M. The Immune Signatures data resource, a compendium of systems vaccinology datasets. Sci Data 2022; 9:635. [PMID: 36266291 PMCID: PMC9584267 DOI: 10.1038/s41597-022-01714-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/24/2022] [Accepted: 09/22/2022] [Indexed: 01/04/2023] Open
Abstract
Vaccines are among the most cost-effective public health interventions for preventing infection-induced morbidity and mortality, yet much remains to be learned regarding the mechanisms by which vaccines protect. Systems immunology combines traditional immunology with modern 'omic profiling techniques and computational modeling to promote rapid and transformative advances in vaccinology and vaccine discovery. The NIH/NIAID Human Immunology Project Consortium (HIPC) has leveraged systems immunology approaches to identify molecular signatures associated with the immunogenicity of many vaccines. However, comparative analyses have been limited by the distributed nature of some data, potential batch effects across studies, and the absence of multiple relevant studies from non-HIPC groups in ImmPort. To support comparative analyses across different vaccines, we have created the Immune Signatures Data Resource, a compendium of standardized systems vaccinology datasets. This data resource is available through ImmuneSpace, along with code to reproduce the processing and batch normalization starting from the underlying study data in ImmPort and the Gene Expression Omnibus (GEO). The current release comprises 1405 participants from 53 cohorts profiling the response to 24 different vaccines. This novel systems vaccinology data release represents a valuable resource for comparative and meta-analyses that will accelerate our understanding of mechanisms underlying vaccine responses.
Dataset
7
Gygi JP, Kleinstein SH, Guan L. Predictive overfitting in immunological applications: Pitfalls and solutions. Hum Vaccin Immunother 2023; 19:2251830. [PMID: 37697867 PMCID: PMC10498807 DOI: 10.1080/21645515.2023.2251830] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/01/2023] [Revised: 07/27/2023] [Accepted: 08/21/2023] [Indexed: 09/13/2023] Open
Abstract
Overfitting describes the phenomenon where a highly predictive model on the training data generalizes poorly to future observations. It is a common concern when applying machine learning techniques to contemporary medical applications, such as predicting vaccination response and disease status in infectious disease or cancer studies. This review examines the causes of overfitting and offers strategies to counteract it, focusing on model complexity reduction, reliable model evaluation, and harnessing data diversity. Through discussion of the underlying mathematical models and illustrative examples using both synthetic data and published real datasets, our objective is to equip analysts and bioinformaticians with the knowledge and tools necessary to detect and mitigate overfitting in their research.
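The definition of overfitting in this abstract can be illustrated with a small sketch. It is not taken from the review: the data are synthetic noise, the 1-nearest-neighbour classifier is an assumed stand-in for any overly flexible model, and every name below (`make_data`, `nearest_label`, `accuracy`) is hypothetical. Because the labels are coin flips, a perfect training score here can only come from memorization.

```python
import random

def make_data(rng, n, n_features):
    # Features are pure noise and labels are coin flips, so there is no signal to learn.
    return [([rng.gauss(0, 1) for _ in range(n_features)], rng.randint(0, 1))
            for _ in range(n)]

def nearest_label(x, train):
    # 1-nearest-neighbour: copy the label of the closest training point (squared Euclidean distance).
    closest = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(x, item[0])))
    return closest[1]

def accuracy(data, train):
    # Fraction of points in `data` whose label matches the 1-NN prediction.
    return sum(nearest_label(x, train) == y for x, y in data) / len(data)

rng = random.Random(0)           # fixed seed so the run is repeatable
train = make_data(rng, 50, 20)   # few samples, many features
test = make_data(rng, 200, 20)   # held-out data from the same distribution

train_acc = accuracy(train, train)  # each training point is its own nearest neighbour
test_acc = accuracy(test, train)    # held-out labels are independent of the predictions
print(f"training accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

The training accuracy is perfect by construction, while the held-out accuracy should hover near chance. That gap is what evaluating on data the model has not seen is meant to expose.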
Review
8
Gygi JP, Maguire C, Patel RK, Shinde P, Konstorum A, Shannon CP, Xu L, Hoch A, Jayavelu ND, IMPACC Network, Haddad EK, Reed EF, Kraft M, McComsey GA, Metcalf J, Ozonoff A, Esserman D, Cairns CB, Rouphael N, Bosinger SE, Kim-Schulze S, Krammer F, Rosen LB, van Bakel H, Wilson M, Eckalbar W, Maecker H, Langelier CR, Steen H, Altman MC, Montgomery RR, Levy O, Melamed E, Pulendran B, Diray-Arce J, Smolen KK, Fragiadakis GK, Becker PM, Augustine AD, Sekaly RP, Ehrlich LIR, Fourati S, Peters B, Kleinstein SH, Guan L. Integrated longitudinal multi-omics study identifies immune programs associated with COVID-19 severity and mortality in 1152 hospitalized participants. bioRxiv 2023:2023.11.03.565292. [PMID: 37986828 PMCID: PMC10659275 DOI: 10.1101/2023.11.03.565292] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/22/2023]
Abstract
Hospitalized COVID-19 patients exhibit diverse clinical outcomes, with some individuals diverging over time even though their initial disease severity appears similar. A systematic evaluation of molecular and cellular profiles over the full disease course can link immune programs and their coordination with progression heterogeneity. In this study, we carried out deep immunophenotyping and conducted longitudinal multi-omics modeling integrating ten distinct assays on a total of 1,152 IMPACC participants and identified several immune cascades that were significant drivers of differential clinical outcomes. Increasing disease severity was driven by a temporal pattern that began with the early upregulation of immunosuppressive metabolites and then elevated levels of inflammatory cytokines, signatures of coagulation, NETosis, and T-cell functional dysregulation. A second immune cascade, predictive of 28-day mortality among critically ill patients, was characterized by reduced total plasma immunoglobulins and B cells, as well as dysregulated IFN responsiveness. We demonstrated that the balance disruption between IFN-stimulated genes and IFN inhibitors is a crucial biomarker of COVID-19 mortality, potentially contributing to the failure of viral clearance in patients with fatal illness. Our longitudinal multi-omics profiling study revealed novel temporal coordination across diverse omics that potentially explain disease progression, providing insights that inform the targeted development of therapies for hospitalized COVID-19 patients, especially those critically ill.
Preprint
9
Gygi JP, Maguire C, Patel RK, Shinde P, Konstorum A, Shannon CP, Xu L, Hoch A, Jayavelu ND, Haddad EK, IMPACC Network, Reed EF, Kraft M, McComsey GA, Metcalf JP, Ozonoff A, Esserman D, Cairns CB, Rouphael N, Bosinger SE, Kim-Schulze S, Krammer F, Rosen LB, van Bakel H, Wilson M, Eckalbar WL, Maecker HT, Langelier CR, Steen H, Altman MC, Montgomery RR, Levy O, Melamed E, Pulendran B, Diray-Arce J, Smolen KK, Fragiadakis GK, Becker PM, Sekaly RP, Ehrlich LI, Fourati S, Peters B, Kleinstein SH, Guan L. Integrated longitudinal multiomics study identifies immune programs associated with acute COVID-19 severity and mortality. J Clin Invest 2024; 134:e176640. [PMID: 38690733 PMCID: PMC11060740 DOI: 10.1172/jci176640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 03/12/2024] [Indexed: 05/03/2024] Open
Abstract
BACKGROUND Patients hospitalized for COVID-19 exhibit diverse clinical outcomes, with outcomes for some individuals diverging over time even though their initial disease severity appears similar to that of other patients. A systematic evaluation of molecular and cellular profiles over the full disease course can link immune programs and their coordination with progression heterogeneity. METHODS We performed deep immunophenotyping and conducted longitudinal multiomics modeling, integrating 10 assays for 1,152 Immunophenotyping Assessment in a COVID-19 Cohort (IMPACC) study participants and identifying several immune cascades that were significant drivers of differential clinical outcomes. RESULTS Increasing disease severity was driven by a temporal pattern that began with the early upregulation of immunosuppressive metabolites and then elevated levels of inflammatory cytokines, signatures of coagulation, formation of neutrophil extracellular traps, and T cell functional dysregulation. A second immune cascade, predictive of 28-day mortality among critically ill patients, was characterized by reduced total plasma Igs and B cells and dysregulated IFN responsiveness. We demonstrated that the balance disruption between IFN-stimulated genes and IFN inhibitors is a crucial biomarker of COVID-19 mortality, potentially contributing to failure of viral clearance in patients with fatal illness. CONCLUSION Our longitudinal multiomics profiling study revealed temporal coordination across diverse omics that potentially explain the disease progression, providing insights that can inform the targeted development of therapies for patients hospitalized with COVID-19, especially those who are critically ill. TRIAL REGISTRATION ClinicalTrials.gov NCT04378777. FUNDING NIH (5R01AI135803-03, 5U19AI118608-04, 5U19AI128910-04, 4U19AI090023-11, 4U19AI118610-06, R01AI145835-01A1S1, 5U19AI062629-17, 5U19AI057229-17, 5U19AI125357-05, 5U19AI128913-03, 3U19AI077439-13, 5U54AI142766-03, 5R01AI104870-07, 3U19AI089992-09, 3U19AI128913-03, and 5T32DA018926-18); NIAID, NIH (3U19AI1289130, U19AI128913-04S1, and R01AI122220); and National Science Foundation (DMS2310836).
Observational Study
10
Jayavelu ND, Samaha H, Wimalasena ST, Hoch A, Gygi JP, Gabernet G, Ozonoff A, Liu S, Milliren CE, Levy O, Baden LR, Melamed E, Ehrlich LIR, McComsey GA, Sekaly RP, Cairns CB, Haddad EK, Schaenman J, Shaw AC, Hafler DA, Montgomery RR, Corry DB, Kheradmand F, Atkinson MA, Brakenridge SC, Higuita NIA, Metcalf JP, Hough CL, Messer WB, Pulendran B, Nadeau KC, Davis MM, Geng LN, Sesma AF, Simon V, Krammer F, Kraft M, Bime C, Calfee CS, Erle DJ, Langelier CR, IMPACC Network, Guan L, Maecker HT, Peters B, Kleinstein SH, Reed EF, Diray-Arce J, Rouphael N, Altman MC. Machine learning models predict long COVID outcomes based on baseline clinical and immunologic factors. medRxiv 2025:2025.02.12.25322164. [PMID: 39990570 PMCID: PMC11844586 DOI: 10.1101/2025.02.12.25322164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
The post-acute sequelae of SARS-CoV-2 (PASC), also known as long COVID, remain a significant health issue that is incompletely understood. Predicting which acutely infected individuals will go on to develop long COVID is challenging due to the lack of established biomarkers, clear disease mechanisms, or well-defined sub-phenotypes. Machine learning (ML) models offer the potential to address this by leveraging clinical data to enhance diagnostic precision. We utilized clinical data, including antibody titers and viral load measurements collected at the time of hospital admission, to predict the likelihood of acute COVID-19 progressing to long COVID. Our machine learning models achieved median AUROC values ranging from 0.64 to 0.66 and AUPRC values between 0.51 and 0.54, demonstrating their predictive capabilities. Feature importance analysis revealed that low antibody titers and high viral loads at hospital admission were the strongest predictors of long COVID outcomes. Comorbidities, including chronic respiratory, cardiac, and neurologic diseases, as well as female sex, were also identified as significant risk factors for long COVID. Our findings suggest that ML models have the potential to identify patients at risk for developing long COVID based on baseline clinical characteristics. These models can help guide early interventions, improving patient outcomes and mitigating the long-term public health impacts of SARS-CoV-2.
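For readers unfamiliar with the AUROC figures quoted in this abstract, the following is a minimal sketch of how the metric can be computed from its pairwise definition. It is not the study's pipeline. The function name `auroc` and the toy labels and scores are illustrative assumptions, not study data.

```python
def auroc(labels, scores):
    """Probability that a randomly chosen positive is scored above a randomly
    chosen negative, counting ties as one half (pairwise definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy example: 1 = developed long COVID, 0 = did not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7]
value = auroc(labels, scores)
print(f"AUROC = {value:.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which puts the reported median values of 0.64 to 0.66 in context as modest discrimination.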
Preprint
11
Gabernet G, Maciuch J, Gygi JP, Moore JF, Hoch A, Syphurs C, Chu T, Jayavelu ND, Corry DB, Kheradmand F, Baden LR, Sekaly RP, McComsey GA, Haddad EK, Cairns CB, Rouphael N, Fernandez-Sesma A, Simon V, Metcalf JP, Agudelo Higuita NI, Hough CL, Messer WB, Davis MM, Nadeau KC, Pulendran B, Kraft M, Bime C, Reed EF, Schaenman J, Erle DJ, Calfee CS, Atkinson MA, Brackenridge SC, Melamed E, Shaw AC, Hafler DA, Ozonoff A, Bosinger SE, Eckalbar W, Maecker HT, Kim-Schulze S, Steen H, Krammer F, Westendorf K, IMPACC Network, Peters B, Fourati S, Altman MC, Levy O, Smolen KK, Montgomery RR, Diray-Arce J, Kleinstein SH, Guan L, Ehrlich LIR. Identification of a multi-omics factor predictive of long COVID in the IMPACC study. bioRxiv 2025:2025.02.12.637926. [PMID: 39990442 PMCID: PMC11844572 DOI: 10.1101/2025.02.12.637926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
Following SARS-CoV-2 infection, ∼10-35% of COVID-19 patients experience long COVID (LC), in which often debilitating symptoms persist for at least three months. Elucidating the biologic underpinnings of LC could identify therapeutic opportunities. We utilized machine learning methods on biologic analytes and patient reported outcome surveys provided over 12 months after hospital discharge from >500 hospitalized COVID-19 patients in the IMPACC cohort to identify a multi-omics "recovery factor". IMPACC participants who experienced LC had lower recovery factor scores compared to participants without LC. Biologic characterization revealed increased levels of plasma proteins associated with inflammation, elevated transcriptional signatures of heme metabolism, and decreased androgenic steroids in LC patients. The recovery factor was also associated with altered circulating immune cell frequencies. Notably, recovery factor scores were predictive of LC occurrence in patients as early as hospital admission, irrespective of acute disease severity. Thus, the recovery factor identifies patients at risk of LC early after SARS-CoV-2 infection and reveals LC biomarkers and potential treatment targets.
Preprint
12
Gygi JP, Konstorum A, Pawar S, Aron E, Kleinstein SH, Guan L. A supervised Bayesian factor model for the identification of multi-omics signatures. bioRxiv 2023:2023.01.25.525545. [PMID: 36747790 PMCID: PMC9900835 DOI: 10.1101/2023.01.25.525545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2023]
Abstract
MOTIVATION Predictive biological signatures provide utility as biomarkers for disease diagnosis and prognosis, as well as prediction of responses to vaccination or therapy. These signatures are identified from high-throughput profiling assays through a combination of dimensionality reduction and machine learning techniques. The genes, proteins, metabolites, and other biological analytes that compose signatures also generate hypotheses on the underlying mechanisms driving biological responses, thus improving biological understanding. Dimensionality reduction is a critical step in signature discovery to address the large number of analytes in omics datasets, especially for multi-omics profiling studies with tens of thousands of measurements. Latent factor models, which can account for the structural heterogeneity across diverse assays, effectively integrate multi-omics data and reduce dimensionality to a small number of factors that capture correlations and associations among measurements. These factors provide biologically interpretable features for predictive modeling. However, multi-omics integration and predictive modeling are generally performed independently in sequential steps, leading to suboptimal factor construction. Combining these steps can yield better multi-omics signatures that are more predictive while still being biologically meaningful. RESULTS We developed a supervised variational Bayesian factor model that extracts multi-omics signatures from high-throughput profiling datasets that can span multiple data types. Signature-based multiPle-omics intEgration via lAtent factoRs (SPEAR) adaptively determines factor rank, emphasis on factor structure, data relevance and feature sparsity. The method improves the reconstruction of underlying factors in synthetic examples and prediction accuracy of COVID-19 severity and breast cancer tumor subtypes. AVAILABILITY SPEAR is a publicly available R-package hosted at https://bitbucket.org/kleinstein/SPEAR.
Preprint | 2
13
Shinde P, Soldevila F, Reyna J, Aoki M, Rasmussen M, Willemsen L, Kojima M, Ha B, Greenbaum JA, Overton JA, Guzman-Orozco H, Nili S, Orfield S, Gygi JP, da Silva Antunes R, Sette A, Grant B, Olsen LR, Konstorum A, Guan L, Ay F, Kleinstein SH, Peters B. A multi-omics systems vaccinology resource to develop and test computational models of immunity. Cell Reports Methods 2024; 4:100731. [PMID: 38490204 PMCID: PMC10985234 DOI: 10.1016/j.crmeth.2024.100731] [Received: 09/13/2023] [Revised: 01/04/2024] [Accepted: 02/20/2024] [Indexed: 03/17/2024]
Abstract
Systems vaccinology studies have identified factors affecting individual vaccine responses, but comparing these findings is challenging due to varying study designs. To address this lack of reproducibility, we established a community resource for comparing Bordetella pertussis booster responses and to host annual contests for predicting patients' vaccination outcomes. We report here on our experiences with the "dry-run" prediction contest. We found that, among 20+ models adopted from the literature, the most successful model predicting vaccination outcome was based on age alone. This confirms our concerns about the reproducibility of conclusions between different vaccinology studies. Further, we found that, for newly trained models, handling of baseline information on the target variables was crucial. Overall, multiple co-inertia analysis gave the best results of the tested modeling approaches. Our goal is to engage community in these prediction challenges by making data and models available and opening a public contest in August 2024.
research-article | 1
14
Shinde P, Willemsen L, Anderson M, Aoki M, Basu S, Burel JG, Cheng P, Ghosh Dastidar S, Dunleavy A, Einav T, Forschmiedt J, Fourati S, Garcia J, Gibson W, Greenbaum JA, Guan L, Guan W, Gygi JP, Ha B, Hou J, Hsiao J, Huang Y, Jansen R, Kakoty B, Kang Z, Kobie JJ, Kojima M, Konstorum A, Lee J, Lewis SA, Li A, Lock EF, Mahita J, Mendes M, Meng H, Neher A, Nili S, Olsen LR, Orfield S, Overton JA, Pai N, Parker C, Qian B, Rasmussen M, Reyna J, Richardson E, Safo S, Sorenson J, Srinivasan A, Thrupp N, Tippalagama R, Trevizani R, Ventz S, Wang J, Wu CC, Ay F, Grant B, Kleinstein SH, Peters B. Putting computational models of immunity to the test - An invited challenge to predict B. pertussis vaccination responses. PLoS Comput Biol 2025; 21:e1012927. [PMID: 40163550 PMCID: PMC11978014 DOI: 10.1371/journal.pcbi.1012927] [Received: 09/23/2024] [Revised: 04/08/2025] [Accepted: 03/04/2025] [Indexed: 04/02/2025] Open
Abstract
Systems vaccinology studies have been used to build computational models that predict individual vaccine responses and identify the factors contributing to differences in outcome. Comparing such models is challenging due to variability in study designs. To address this, we established a community resource to compare models predicting B. pertussis booster responses and generate experimental data for the explicit purpose of model evaluation. We here describe our second computational prediction challenge using this resource, where we benchmarked 49 algorithms from 53 scientists. We found that the most successful models stood out in their handling of nonlinearities, reducing large feature sets to representative subsets, and advanced data preprocessing. In contrast, we found that models adopted from literature that were developed to predict vaccine antibody responses in other settings performed poorly, reinforcing the need for purpose-built models. Overall, this demonstrates the value of purpose-generated datasets for rigorous and open model evaluations to identify features that improve the reliability and applicability of computational models in vaccine response prediction.
research-article | 1
15
Wang K, Nie Y, Maguire C, Syphurs C, Sheen H, Karoly M, Lapp L, Gygi JP, Jayavelu ND, Patel RK, Hoch A, Corry D, Kheradmand F, McComsey GA, Fernandez-Sesma A, Simon V, Metcalf JP, Higuita NIA, Messer WB, Davis MM, Nadeau KC, Kraft M, Bime C, Schaenman J, Erle D, Calfee CS, Atkinson MA, Brackenridge SC, Hafler DA, Shaw A, Rahman A, Hough CL, Geng LN, Ozonoff A, Haddad EK, Reed EF, van Bakel H, Kim-Schultz S, Krammer F, Wilson M, Eckalbar W, Bosinger S, Langelier CR, Sekaly RP, Montgomery RR, Maecker HT, Krumholz H, Melamed E, Steen H, Pulendran B, Augustine AD, Cairns CB, Rouphael N, Becker PM, Fourati S, Shannon CP, Smolen KK, Peters B, Kleinstein SH, Levy O, Altman MC, Iwasaki A, Diray-Arce J, Ehrlich LIR, Guan L. Unraveling SARS-CoV-2 Host-Response Heterogeneity through Longitudinal Molecular Subtyping. bioRxiv 2024:2024.11.22.624784. [PMID: 39651165 PMCID: PMC11623532 DOI: 10.1101/2024.11.22.624784] [Indexed: 12/11/2024]
Abstract
Hospitalized COVID-19 patients exhibit diverse immune responses during acute infection, which are associated with a wide range of clinical outcomes. However, understanding these immune heterogeneities and their links to various clinical complications, especially long COVID, remains a challenge. In this study, we performed unsupervised subtyping of longitudinal multi-omics immunophenotyping in over 1,000 hospitalized patients, identifying two critical subtypes linked to mortality or mechanical ventilation with prolonged hospital stay and three severe subtypes associated with timely acute recovery. We confirmed that unresolved systemic inflammation and T-cell dysfunctions were hallmarks of increased severity and further distinguished patients with similar acute respiratory severity by their distinct immune profiles, which correlated with differences in demographic and clinical complications. Notably, one critical subtype (SubF) was uniquely characterized by early excessive inflammation, insufficient anticoagulation, and fatty acid dysregulation, alongside higher incidences of hematologic, cardiac, and renal complications, and an elevated risk of long COVID. Among the severe subtypes, significant differences in viral clearance and early antiviral responses were observed, with one subtype (SubC) showing strong early T-cell cytotoxicity but a poor humoral response, slower viral clearance, and greater risks of chronic organ dysfunction and long COVID. These findings provide crucial insights into the complex and context-dependent nature of COVID-19 immune responses, highlighting the importance of personalized therapeutic strategies to improve both acute and long-term outcomes.
Preprint | 1
16
Shinde P, Soldevila F, Reyna J, Aoki M, Rasmussen M, Willemsen L, Kojima M, Ha B, Greenbaum JA, Overton JA, Guzman-Orozco H, Nili S, Orfield S, Gygi JP, da Silva Antunes R, Sette A, Grant B, Olsen LR, Konstorum A, Guan L, Ay F, Kleinstein SH, Peters B. A systems vaccinology resource to develop and test computational models of immunity. bioRxiv 2023:2023.08.28.555193. [PMID: 37693565 PMCID: PMC10491180 DOI: 10.1101/2023.08.28.555193] [Indexed: 09/12/2023]
Abstract
Computational models that predict an individual's response to a vaccine offer the potential for mechanistic insights and personalized vaccination strategies. These models are increasingly derived from systems vaccinology studies that generate immune profiles from human cohorts pre- and post-vaccination. Most of these studies involve relatively small cohorts and profile the response to a single vaccine. The ability to assess the performance of the resulting models would be improved by comparing their performance on independent datasets, as has been done with great success in other areas of biology such as protein structure predictions. To transfer this approach to system vaccinology studies, we established a prototype platform that focuses on the evaluation of Computational Models of Immunity to Pertussis Booster vaccinations (CMI-PB). A community resource, CMI-PB generates experimental data for the explicit purpose of model evaluation, which is performed through a series of annual data releases and associated contests. We here report on our experience with the first such 'dry run' for a contest where the goal was to predict individual immune responses based on pre-vaccination multi-omic profiles. Over 30 models adopted from the literature were tested, but only one was predictive, and was based on age alone. The performance of new models built using CMI-PB training data was much better, but varied significantly based on the choice of pre-vaccination features used and the model building strategy. This suggests that previously published models developed for other vaccines do not generalize well to Pertussis Booster vaccination. Overall, these results reinforced the need for comparative analysis across models and datasets that CMI-PB aims to achieve. We are seeking wider community engagement for our first public prediction contest, which will open in early 2024.
Preprint | 2
17
Gygi JP, Konstorum A, Pawar S, Aron E, Kleinstein SH, Guan L. A supervised Bayesian factor model for the identification of multi-omics signatures. Bioinformatics 2024; 40:btae202. [PMID: 38603606 PMCID: PMC11078774 DOI: 10.1093/bioinformatics/btae202] [Received: 09/18/2023] [Revised: 02/29/2024] [Accepted: 04/10/2024] [Indexed: 04/13/2024] Open
Abstract
MOTIVATION Predictive biological signatures provide utility as biomarkers for disease diagnosis and prognosis, as well as prediction of responses to vaccination or therapy. These signatures are identified from high-throughput profiling assays through a combination of dimensionality reduction and machine learning techniques. The genes, proteins, metabolites, and other biological analytes that compose signatures also generate hypotheses on the underlying mechanisms driving biological responses, thus improving biological understanding. Dimensionality reduction is a critical step in signature discovery to address the large number of analytes in omics datasets, especially for multi-omics profiling studies with tens of thousands of measurements. Latent factor models, which can account for the structural heterogeneity across diverse assays, effectively integrate multi-omics data and reduce dimensionality to a small number of factors that capture correlations and associations among measurements. These factors provide biologically interpretable features for predictive modeling. However, multi-omics integration and predictive modeling are generally performed independently in sequential steps, leading to suboptimal factor construction. Combining these steps can yield better multi-omics signatures that are more predictive while still being biologically meaningful. RESULTS We developed a supervised variational Bayesian factor model that extracts multi-omics signatures from high-throughput profiling datasets that can span multiple data types. Signature-based multiPle-omics intEgration via lAtent factoRs (SPEAR) adaptively determines factor rank, emphasis on factor structure, data relevance and feature sparsity. The method improves the reconstruction of underlying factors in synthetic examples and prediction accuracy of coronavirus disease 2019 severity and breast cancer tumor subtypes. 
AVAILABILITY AND IMPLEMENTATION SPEAR is a publicly available R-package hosted at https://bitbucket.org/kleinstein/SPEAR.
Research Support, N.I.H., Extramural | 1
18
Shinde P, Willemsen L, Anderson M, Aoki M, Basu S, Burel JG, Cheng P, Dastidar SG, Dunleavy A, Einav T, Forschmiedt J, Fourati S, Garcia J, Gibson W, Greenbaum JA, Guan L, Guan W, Gygi JP, Ha B, Hou J, Hsiao J, Huang Y, Jansen R, Kakoty B, Kang Z, Kobie JJ, Kojima M, Konstorum A, Lee J, Lewis SA, Li A, Lock EF, Mahita J, Mendes M, Meng H, Neher A, Nili S, Olsen LR, Orfield S, Overton JA, Pai N, Parker C, Qian B, Rasmussen M, Reyna J, Richardson E, Safo S, Sorenson J, Srinivasan A, Thrupp N, Tippalagama R, Trevizani R, Ventz S, Wang J, Wu CC, Ay F, Grant B, Kleinstein SH, Peters B. Putting computational models of immunity to the test - an invited challenge to predict B. pertussis vaccination outcomes. bioRxiv 2024:2024.09.04.611290. [PMID: 39282381 PMCID: PMC11398469 DOI: 10.1101/2024.09.04.611290] [Indexed: 09/22/2024]
Abstract
Systems vaccinology studies have been used to build computational models that predict individual vaccine responses and identify the factors contributing to differences in outcome. Comparing such models is challenging due to variability in study designs. To address this, we established a community resource to compare models predicting B. pertussis booster responses and generate experimental data for the explicit purpose of model evaluation. We here describe our second computational prediction challenge using this resource, where we benchmarked 49 algorithms from 53 scientists. We found that the most successful models stood out in their handling of nonlinearities, reducing large feature sets to representative subsets, and advanced data preprocessing. In contrast, we found that models adopted from literature that were developed to predict vaccine antibody responses in other settings performed poorly, reinforcing the need for purpose-built models. Overall, this demonstrates the value of purpose-generated datasets for rigorous and open model evaluations to identify features that improve the reliability and applicability of computational models in vaccine response prediction.
Preprint | 1