1
Carrasco-Zanini J, Pietzner M, Koprulu M, Wheeler E, Kerrison ND, Wareham NJ, Langenberg C. Proteomic prediction of diverse incident diseases: a machine learning-guided biomarker discovery study using data from a prospective cohort study. Lancet Digit Health 2024; 6:e470-e479. [PMID: 38906612 DOI: 10.1016/s2589-7500(24)00087-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 04/03/2024] [Accepted: 04/19/2024] [Indexed: 06/23/2024]
Abstract
BACKGROUND Broad-capture proteomic technologies have the potential to improve disease prediction, enabling targeted prevention and management, but studies have so far been limited to very few selected diseases and have not evaluated predictive performance across multiple conditions. We aimed to evaluate the potential of serum proteins to improve risk prediction over and above health-derived information and polygenic risk scores across a diverse set of 24 outcomes. METHODS We designed multiple case-cohorts nested in the EPIC-Norfolk prospective study, from participants with available serum samples and genome-wide genotype data, with more than 32 974 person-years of follow-up. Participants were middle-aged individuals (aged 40-79 years at baseline) of European ancestry who were recruited from the general population of Norfolk, England, between March, 1993 and December, 1997. We selected participants who developed one of ten less common diseases within 10 years of follow-up; we also subsampled a randomly drawn control subcohort, which also served to investigate 14 more common outcomes (n>70), including all-cause premature mortality (death before the age of 75 years; case numbers 71-437; controls 608-1556). Individuals were excluded from the current study owing to failed genotyping or proteomic quality control, relatedness, or missing information on age, sex, BMI, or smoking status. We used a machine learning framework to derive sparse predictive protein models for the onset of the 23 individual diseases and all-cause premature mortality, and to derive a single common sparse multimorbidity signature that was predictive across multiple diseases from 2923 serum proteins. FINDINGS Participants who developed one of ten less common diseases within 10 years of follow-up included 482 women and 507 men, with a mean age at baseline of 64·56 years (8·08). The random subcohort included 990 women and 769 men, with a mean age of 58·79 years (9·31).
As few as five proteins alone outperformed polygenic risk scores for 17 of 23 outcomes (median difference in concordance index [C-index] 0·13 [0·10-0·17]) and improved predictive performance when added over basic patient-derived information models for seven outcomes, achieving a median C-index of 0·82 (IQR 0·77-0·82). This included diseases with poor prognosis such as lung cancer (C-index 0·85 [+/- cross-validation error 0·83-0·87]), for which we identified unreported biomarkers such as C-X-C motif chemokine ligand 17. A sparse multimorbidity signature of ten proteins improved prediction across seven outcomes over patient-derived information models, achieving performances (median C-index 0·81 [IQR 0·80-0·82]) similar to those of disease-specific signatures. INTERPRETATION We show the value of broad-capture proteomic biomarker discovery studies across multiple diseases of diverse causes, pointing to those that might benefit the most from proteomic approaches, and the potential to derive common sparse biomarker panels for prediction of multiple diseases at once. This framework could enable follow-up studies to explore the generalisability of proteomic models and to benchmark these against clinical assays, which are required to understand the translational potential of these findings. FUNDING Medical Research Council, Health Data Research UK, UK Research and Innovation-National Institute for Health and Care Research, Cancer Research UK, and Wellcome Trust.
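The concordance index (C-index) reported in this abstract summarises how well a risk score ranks participants by time to event. The following is a minimal, unweighted sketch of Harrell's C-index in Python, for illustration only. The function name `concordance_index` and the toy data are assumptions, and a case-cohort design would in practice call for a weighted estimator that the abstract does not specify.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Unweighted Harrell's C-index (illustrative sketch).

    times  : follow-up time for each participant
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher = expected earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when participant i had an observed
            # event strictly before participant j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ranked pair
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data: four participants, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.2, 0.4]
print(concordance_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the scale on which the reported medians of 0·81 to 0·82 should be read.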
Affiliation(s)
- Julia Carrasco-Zanini
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK; Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
- Maik Pietzner
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK; Computational Medicine, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany; Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
- Mine Koprulu
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK
- Eleanor Wheeler
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK
- Nicola D Kerrison
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK
- Nicholas J Wareham
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK
- Claudia Langenberg
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge, UK; Computational Medicine, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany; Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
2
Carrasco-Zanini J, Pietzner M, Wheeler E, Kerrison ND, Langenberg C, Wareham NJ. Multi-omic prediction of incident type 2 diabetes. Diabetologia 2024; 67:102-112. [PMID: 37889320 PMCID: PMC10709231 DOI: 10.1007/s00125-023-06027-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 08/30/2023] [Indexed: 10/28/2023]
Abstract
AIMS/HYPOTHESIS The identification of people who are at high risk of developing type 2 diabetes is a key part of population-level prevention strategies. Previous studies have evaluated the predictive utility of omics measurements, such as metabolites, proteins or polygenic scores, but have considered these separately. The improvement that combined omics biomarkers can provide over and above current clinical standard models is unclear. The aim of this study was to test the predictive performance of genome, proteome, metabolome and clinical biomarkers when added to established clinical prediction models for type 2 diabetes. METHODS We developed sparse interpretable prediction models in a prospective, nested type 2 diabetes case-cohort study (N=1105, incident type 2 diabetes cases=375) with 10,792 person-years of follow-up, selecting from 5759 features across the genome, proteome, metabolome and clinical biomarkers using least absolute shrinkage and selection operator (LASSO) regression. We compared the predictive performance of omics-derived predictors with a clinical model including the variables from the Cambridge Diabetes Risk Score and HbA1c. RESULTS Among single omics prediction models that did not include clinical risk factors, the top ten proteins alone achieved the highest performance (concordance index [C index]=0.82 [95% CI 0.75, 0.88]), suggesting the proteome as the most informative single omic layer in the absence of clinical information. However, the largest improvement in prediction of type 2 diabetes incidence over and above the clinical model was achieved by the top ten features across several omic layers (C index=0.87 [95% CI 0.82, 0.92], Δ C index=0.05, p=0.045). 
This improvement by the top ten omic features was also evident in individuals with HbA1c <42 mmol/mol (6.0%), the threshold for prediabetes (C index=0.84 [95% CI 0.77, 0.90], Δ C index=0.07, p=0.03), the group in whom prediction would be most useful since they are not targeted for preventative interventions by current clinical guidelines. In this subgroup, the type 2 diabetes polygenic risk score was the major contributor to the improvement in prediction, and achieved a comparable improvement in performance when added onto the clinical model alone (C index=0.83 [95% CI 0.75, 0.90], Δ C index=0.06, p=0.002). However, compared with those with prediabetes, individuals at high polygenic risk in this group had only around half the absolute risk for type 2 diabetes over a 20 year period. CONCLUSIONS/INTERPRETATION Omic approaches provided marginal improvements in prediction of incident type 2 diabetes. However, while a polygenic risk score does improve prediction in people with an HbA1c in the normoglycaemic range, the group in whom prediction would be most useful, even individuals with a high polygenic burden in that subgroup had a low absolute type 2 diabetes risk. This suggests a limited feasibility of implementing targeted population-based genetic screening for preventative interventions.
Affiliation(s)
- Julia Carrasco-Zanini
  - MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, Institute of Metabolic Science, Cambridge, UK
  - Computational Medicine, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
- Maik Pietzner
  - MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, Institute of Metabolic Science, Cambridge, UK
  - Computational Medicine, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
- Eleanor Wheeler
  - MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, Institute of Metabolic Science, Cambridge, UK
- Nicola D Kerrison
  - MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, Institute of Metabolic Science, Cambridge, UK
- Claudia Langenberg
  - MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, Institute of Metabolic Science, Cambridge, UK
  - Computational Medicine, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
- Nicholas J Wareham
  - MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, Institute of Metabolic Science, Cambridge, UK
3
Zanetti D, Stell L, Gustafsson S, Abbasi F, Tsao PS, Knowles JW, Zethelius B, Ärnlöv J, Balkau B, Walker M, Lazzeroni LC, Lind L, Petrie JR, Assimes TL. Plasma proteomic signatures of a direct measure of insulin sensitivity in two population cohorts. Diabetologia 2023; 66:1643-1654. [PMID: 37329449 PMCID: PMC10390625 DOI: 10.1007/s00125-023-05946-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 04/12/2023] [Indexed: 06/19/2023]
Abstract
AIMS/HYPOTHESIS The euglycaemic-hyperinsulinaemic clamp (EIC) is the reference standard for the measurement of whole-body insulin sensitivity but is laborious and expensive to perform. We aimed to assess the incremental value of high-throughput plasma proteomic profiling in developing signatures correlating with the M value derived from the EIC. METHODS We measured 828 proteins in the fasting plasma of 966 participants from the Relationship between Insulin Sensitivity and Cardiovascular disease (RISC) study and 745 participants from the Uppsala Longitudinal Study of Adult Men (ULSAM) using a high-throughput proximity extension assay. We used the least absolute shrinkage and selection operator (LASSO) approach using clinical variables and protein measures as features. Models were tested within and across cohorts. Our primary model performance metric was the proportion of the M value variance explained (R²). RESULTS A standard LASSO model incorporating 53 proteins in addition to routinely available clinical variables increased the M value R² from 0.237 (95% CI 0.178, 0.303) to 0.456 (0.372, 0.536) in RISC. A similar pattern was observed in ULSAM, in which the M value R² increased from 0.443 (0.360, 0.530) to 0.632 (0.569, 0.698) with the addition of 61 proteins. Models trained in one cohort and tested in the other also demonstrated significant improvements in R² despite differences in baseline cohort characteristics and clamp methodology (RISC to ULSAM: 0.491 [0.433, 0.539] for 51 proteins; ULSAM to RISC: 0.369 [0.331, 0.416] for 67 proteins). A randomised LASSO and stability selection algorithm selected only two proteins per cohort (three unique proteins), which improved R² but to a lesser degree than in standard LASSO models: 0.352 (0.266, 0.439) in RISC and 0.495 (0.404, 0.585) in ULSAM.
Reductions in improvements of R² with randomised LASSO and stability selection were less marked in cross-cohort analyses (RISC to ULSAM R² 0.444 [0.391, 0.497]; ULSAM to RISC R² 0.348 [0.300, 0.396]). Models of proteins alone were as effective as models that included both clinical variables and proteins using either standard or randomised LASSO. The single most consistently selected protein across all analyses and models was IGF-binding protein 2. CONCLUSIONS/INTERPRETATION A plasma proteomic signature identified using a standard LASSO approach improves the cross-sectional estimation of the M value over routine clinical variables. However, a small subset of these proteins identified using a stability selection algorithm affords much of this improvement, especially when considering cross-cohort analyses. Our approach provides opportunities to improve the identification of insulin-resistant individuals at risk of insulin resistance-related adverse health consequences.
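As a rough illustration of the standard LASSO step and the variance-explained metric described in this abstract, the sketch below fits an L1-penalised linear model by coordinate descent in plain Python and reports the proportion of variance explained. Everything here is an assumption made for illustration: the function names, the penalty value, and the toy data are hypothetical, and the study's actual software, tuning procedure, and randomised LASSO with stability selection are not reproduced.

```python
from typing import List, Sequence


def soft_threshold(value: float, penalty: float) -> float:
    """Shrink value towards zero by penalty; exactly zero inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    penalty: float,
    n_iter: int = 100,
) -> List[float]:
    """Minimise (1/2n)*||y - Xb||^2 + penalty*||b||_1 (no intercept).

    Assumes X and y are already centred (hypothetical preprocessing).
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0    # covariance of feature j with the partial residual
            scale = 0.0  # sum of squares of feature j
            for i in range(n):
                pred_without_j = sum(
                    X[i][k] * coef[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                scale += X[i][j] ** 2
            if scale == 0.0:
                coef[j] = 0.0  # constant-zero feature carries no signal
                continue
            coef[j] = soft_threshold(rho / n, penalty) / (scale / n)
    return coef


def r_squared(y: Sequence[float], predicted: Sequence[float]) -> float:
    """Proportion of variance in y explained by the predictions."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: the outcome depends on the first feature only.
toy_X = [[-1.5, 1.0], [-0.5, -1.0], [0.5, -1.0], [1.5, 1.0]]
toy_y = [-3.0, -1.0, 1.0, 3.0]
toy_coef = lasso_coordinate_descent(toy_X, toy_y, penalty=0.5)
toy_pred = [sum(x * b for x, b in zip(row, toy_coef)) for row in toy_X]
print(toy_coef, r_squared(toy_y, toy_pred))
```

In this toy setting the penalty shrinks the informative coefficient and sets the uninformative one to exactly zero, which is the behaviour that lets LASSO act as a feature selector over hundreds of proteins.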
Affiliation(s)
- Daniela Zanetti
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - VA Palo Alto Health Care System, Palo Alto, CA, USA
- Laurel Stell
  - VA Palo Alto Health Care System, Palo Alto, CA, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Fahim Abbasi
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
- Philip S Tsao
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - VA Palo Alto Health Care System, Palo Alto, CA, USA
  - Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA, USA
- Joshua W Knowles
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
  - Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA, USA
  - Stanford Prevention Research Center, Stanford University School of Medicine, Stanford, CA, USA
- Björn Zethelius
  - Department of Public Health/Geriatrics, Uppsala University, Uppsala, Sweden
- Johan Ärnlöv
  - Division of Family Medicine and Primary Care, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
  - Department of Health and Social Studies, Dalarna University, Falun, Sweden
- Beverley Balkau
  - Clinical Epidemiology, Centre for Research in Epidemiology and Population Health, Inserm U1018, Villejuif, France
- Mark Walker
  - Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, UK
- Laura C Lazzeroni
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
- Lars Lind
  - Department of Medical Sciences, Uppsala University, Uppsala, Sweden
- John R Petrie
  - School of Health and Wellbeing, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK
- Themistocles L Assimes
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - VA Palo Alto Health Care System, Palo Alto, CA, USA
  - Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
  - Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, USA
4
Abolbaghaei A, Turner M, Thibodeau JF, Holterman CE, Kennedy CRJ, Burger D. The Proteome of Circulating Large Extracellular Vesicles in Diabetes and Hypertension. Int J Mol Sci 2023; 24:ijms24054930. [PMID: 36902363 PMCID: PMC10003702 DOI: 10.3390/ijms24054930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 02/14/2023] [Accepted: 02/23/2023] [Indexed: 03/08/2023]
Abstract
Hypertension and diabetes induce vascular injury through processes that are not fully understood. Changes in extracellular vesicle (EV) composition could provide novel insights. Here, we examined the protein composition of circulating EVs from hypertensive, diabetic and healthy mice. EVs were isolated from transgenic mice overexpressing human renin in the liver (TtRhRen, hypertensive), OVE26 type 1 diabetic mice and wild-type (WT) mice. Protein content was analyzed using liquid chromatography-mass spectrometry. We identified 544 independent proteins, of which 408 were found in all groups, 34 were exclusive to WT, 16 were exclusive to OVE26 and 5 were exclusive to TtRhRen mice. Amongst the differentially expressed proteins, haptoglobin (HPT) was upregulated and ankyrin-1 (ANK1) was downregulated in OVE26 and TtRhRen mice compared with WT controls. Conversely, TSP4 and Co3A1 were upregulated and SAA4 was downregulated exclusively in diabetic mice; and PPN was upregulated and SPTB1 and SPTA1 were downregulated in hypertensive mice, compared to WT mice. Ingenuity pathway analysis identified enrichment in proteins associated with SNARE signaling, the complement system and NAD homeostasis in EVs from diabetic mice. Conversely, in EVs from hypertensive mice, there was enrichment in semaphorin and Rho signaling. Further analysis of these changes may improve understanding of vascular injury in hypertension and diabetes.
Affiliation(s)
- Akram Abolbaghaei
  - Chronic Disease Program, Kidney Research Centre, Ottawa Hospital Research Institute, Ottawa, ON K1H 8M5, Canada
- Maddison Turner
  - Chronic Disease Program, Kidney Research Centre, Ottawa Hospital Research Institute, Ottawa, ON K1H 8M5, Canada
- Jean-François Thibodeau
  - Chronic Disease Program, Kidney Research Centre, Ottawa Hospital Research Institute, Ottawa, ON K1H 8M5, Canada
- Chet E. Holterman
  - Chronic Disease Program, Kidney Research Centre, Ottawa Hospital Research Institute, Ottawa, ON K1H 8M5, Canada
- Christopher R. J. Kennedy
  - Chronic Disease Program, Kidney Research Centre, Ottawa Hospital Research Institute, Ottawa, ON K1H 8M5, Canada
  - Departments of Medicine and Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
- Dylan Burger
  - Chronic Disease Program, Kidney Research Centre, Ottawa Hospital Research Institute, Ottawa, ON K1H 8M5, Canada
  - Departments of Medicine and Cellular and Molecular Medicine, University of Ottawa, Ottawa, ON K1H 8M5, Canada
  - School of Pharmaceutical Sciences, University of Ottawa, Ottawa, ON K1H 8M5, Canada
  - Correspondence: Tel.: +1-613-562-5800 (ext. 8241)
5
Abstract
It is well established from clinical trials that behavioural interventions can halve the risk of progression from prediabetes to type 2 diabetes, but translating this evidence of efficacy into effective real-world interventions at scale is an ongoing challenge. A common suggestion is that future preventive interventions need to be more personalised in order to enhance effectiveness. This review evaluates the degree to which existing interventions are already personalised and outlines how greater personalisation could be achieved through better identification of those at high risk, division of type 2 diabetes into specific subgroups and, above all, more individualisation of the behavioural targets for preventive action. Approaches using more dynamic real-time data are in their scientific infancy. Although these approaches are promising, they need longer-term evaluation against clinical outcomes. Whatever personalised preventive approaches for type 2 diabetes are developed in the future, they will need to be complementary to existing individual-level interventions that are being rolled out and that are demonstrably effective. They will also need to ideally synergise with, and at the very least not detract attention from, efforts to develop and implement strategies that impact on type 2 diabetes risk at the societal level.
Affiliation(s)
- Nicholas J Wareham
  - Medical Research Council Epidemiology Unit, Institute of Metabolic Science, University of Cambridge Clinical School, Cambridge, UK
6
Ghanbari F, Yazdanpanah N, Yazdanpanah M, Richards JB, Manousaki D. Connecting Genomics and Proteomics to Identify Protein Biomarkers for Adult and Youth-Onset Type 2 Diabetes: A Two-Sample Mendelian Randomization Study. Diabetes 2022; 71:1324-1337. [PMID: 35234851 DOI: 10.2337/db21-1046] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/17/2021] [Accepted: 02/24/2022] [Indexed: 11/13/2022]
Abstract
Type 2 diabetes shows an increasing prevalence in both adults and children. Identification of biomarkers for both youth and adult-onset type 2 diabetes is crucial for development of screening tools or drug targets. In this study, using two-sample Mendelian randomization (MR), we identified 22 circulating proteins causally linked to adult type 2 diabetes and 11 proteins with suggestive evidence for association with youth-onset type 2 diabetes. Among these, colocalization analysis further supported a role in type 2 diabetes for C-type mannose receptor 2 (MR odds ratio [OR] 0.85 [95% CI 0.79-0.92] per genetically predicted SD increase in protein level), MANS domain containing 4 (MR OR 0.90 [95% CI 0.88-0.92]), sodium/potassium-transporting ATPase subunit β2 (MR OR 1.10 [95% CI 1.06-1.15]), endoplasmic reticulum oxidoreductase 1β (MR OR 1.09 [95% CI 1.05-1.14]), spermatogenesis-associated protein 20 (MR OR 1.12 [95% CI 1.06-1.18]), haptoglobin (MR OR 0.96 [95% CI 0.94-0.98]), and α1-3-N-acetylgalactosaminyltransferase and α1-3-galactosyltransferase (MR OR 1.04 [95% CI 1.03-1.05]). Our findings support a causal role in type 2 diabetes for a set of circulating proteins, which represent promising type 2 diabetes drug targets.
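For a single genetic instrument, an MR odds ratio of the kind reported in this abstract is often obtained from a Wald ratio, that is, the variant-outcome association divided by the variant-protein association. The sketch below shows that arithmetic in Python. It is an assumption that a Wald-type estimator applies here (the abstract does not name the estimator), and the function name and input values are hypothetical, not taken from the study.

```python
import math
from typing import Tuple


def wald_ratio_or(
    beta_exposure: float,
    beta_outcome: float,
    se_outcome: float,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Wald-ratio MR estimate as an odds ratio with an approximate 95% CI.

    beta_exposure : variant effect on protein level (SD units per allele)
    beta_outcome  : variant effect on disease (log odds per allele)
    se_outcome    : standard error of beta_outcome
    The standard error uses the first-order approximation that ignores
    uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    log_or = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return (
        math.exp(log_or),           # OR per SD increase in protein level
        math.exp(log_or - z * se),  # lower confidence bound
        math.exp(log_or + z * se),  # upper confidence bound
    )


# Hypothetical inputs for illustration only.
odds_ratio, lower, upper = wald_ratio_or(
    beta_exposure=0.5, beta_outcome=-0.05, se_outcome=0.01
)
print(round(odds_ratio, 3), round(lower, 3), round(upper, 3))
```

An odds ratio below 1 from such a calculation would be read, as in the abstract, as a lower odds of type 2 diabetes per genetically predicted SD increase in the protein.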
Affiliation(s)
- Faegheh Ghanbari
  - Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
- Nahid Yazdanpanah
  - Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
- Mojgan Yazdanpanah
  - Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
- J Brent Richards
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada
  - Department of Medicine, McGill University, Montreal, Quebec, Canada
  - Department of Human Genetics, McGill University, Montreal, Quebec, Canada
  - Department of Epidemiology and Biostatistics, McGill University, Montreal, Quebec, Canada
  - Department of Twin Research, King's College London, London, U.K.
- Despoina Manousaki
  - Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
  - Departments of Pediatrics, Biochemistry and Molecular Medicine, University of Montreal, Montreal, Canada