1
Dürauer A, Jungbauer A, Scharl T. Sensors and chemometrics in downstream processing. Biotechnol Bioeng 2024; 121:2347-2364. [PMID: 37470278 DOI: 10.1002/bit.28499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Revised: 06/14/2023] [Accepted: 07/07/2023] [Indexed: 07/21/2023]
Abstract
The biopharmaceutical industry is still running in batch mode, mostly because it is highly regulated. In the past, sensors were not readily available and in-process control was mainly executed offline. The most important product parameters are quantity, purity, and potency, in addition to adventitious agents and bioburden. New concepts using disposable single-use technologies and integrated bioprocessing for manufacturing will dominate the future of bioprocessing. To ensure the quality of pharmaceuticals, initiatives such as Process Analytical Technologies, Quality by Design, and Continuous Integrated Manufacturing have been established. The aim is that these initiatives, together with technology development, will pave the way for process automation and autonomous bioprocessing without any human intervention. Then, real-time release would be realized, leading to a highly predictive and robust biomanufacturing system. The steps toward such automated and autonomous bioprocessing are reviewed in the context of monitoring and control. It is possible to integrate real-time monitoring gradually, and it should be considered from a soft sensor perspective. This concept has already been successfully implemented in other industries and requires relatively simple model training and the use of established statistical tools, such as multivariate statistics or neural networks. This review describes a scenario for integrating soft sensors and predictive chemometrics into modern process control. This is exemplified by selective downstream processing steps, such as chromatography and membrane filtration, the most common unit operations for separation of biopharmaceuticals.
Affiliation(s)
- Astrid Dürauer
- Institute of Bioprocessing Science and Engineering, University of Natural Resources and Life Sciences, Vienna, Austria
| | - Alois Jungbauer
- Institute of Bioprocessing Science and Engineering, University of Natural Resources and Life Sciences, Vienna, Austria
- Austrian Centre of Industrial Biotechnology, Vienna, Austria
| | - Theresa Scharl
- Institute of Statistics, University of Natural Resources and Life Sciences, Vienna, Austria
2
Potts S, Bergherr E, Reinke C, Griesbach C. Prediction-based variable selection for component-wise gradient boosting. Int J Biostat 2024; 20:293-314. [PMID: 38000054 DOI: 10.1515/ijb-2023-0052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 09/18/2023] [Indexed: 11/26/2023]
Abstract
Model-based component-wise gradient boosting is a popular tool for data-driven variable selection. In order to improve its prediction and selection qualities even further, several modifications of the original algorithm have been developed that mainly focus on different stopping criteria, leaving the actual variable selection mechanism untouched. We investigate different prediction-based mechanisms for the variable selection step in model-based component-wise gradient boosting. These approaches include Akaike's Information Criterion (AIC) as well as a selection rule relying on the component-wise test error computed via cross-validation. We implemented the AIC and cross-validation routines for Generalized Linear Models and evaluated them regarding their variable selection properties and predictive performance. An extensive simulation study revealed improved selection properties, whereas the prediction error could be lowered in a real-world application with age-standardized COVID-19 incidence rates.
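For readers unfamiliar with the selection step the abstract refers to, the following is a minimal sketch of classical component-wise L2 boosting, in which each iteration fits every single-variable linear base-learner to the current residuals and updates only the one with the smallest residual sum of squares. It is not the authors' AIC or cross-validation variant; the function name `componentwise_l2_boost`, the step length `nu`, and the synthetic data are assumptions made for illustration.

```python
import random


def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
    """Component-wise L2 boosting with single-variable linear base-learners.

    X is a list of rows (lists of floats), y a list of floats.
    Returns (intercept, coefficients, selection_counts).
    """
    n, p = len(X), len(X[0])
    # Centre each column so the base-learners need no intercept of their own.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    coef = [0.0] * p
    counts = [0] * p
    resid = [yi - y_mean for yi in y]
    sq_norm = [sum(Xc[i][j] ** 2 for i in range(n)) for j in range(p)]

    for _ in range(n_steps):
        best_j, best_beta, best_rss = None, 0.0, float("inf")
        for j in range(p):
            if sq_norm[j] == 0.0:
                continue  # a constant column cannot explain anything
            # Least-squares fit of the current residuals on column j alone.
            beta = sum(Xc[i][j] * resid[i] for i in range(n)) / sq_norm[j]
            rss = sum((resid[i] - beta * Xc[i][j]) ** 2 for i in range(n))
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss
        if best_j is None:
            break
        # Update only the winning component, shrunk by the step length nu.
        coef[best_j] += nu * best_beta
        counts[best_j] += 1
        resid = [resid[i] - nu * best_beta * Xc[i][best_j] for i in range(n)]

    intercept = y_mean - sum(coef[j] * means[j] for j in range(p))
    return intercept, coef, counts


# Synthetic example (hypothetical data): only columns 0 and 2 carry signal.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(100)]
y = [2.0 * row[0] - 1.0 * row[2] + rng.gauss(0, 0.1) for row in X]
intercept, coef, counts = componentwise_l2_boost(X, y)
```

Replacing the residual-sum-of-squares comparison inside the inner loop with a prediction-based score is where a rule such as the AIC or a cross-validated test error would plug in.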
Affiliation(s)
- Sophie Potts
- Chair of Spatial Data Science and Statistical Learning, University of Goettingen, Goettingen, Germany
| | - Elisabeth Bergherr
- Chair of Spatial Data Science and Statistical Learning, University of Goettingen, Goettingen, Germany
| | - Constantin Reinke
- Chair of Empirical Methods in Social Science and Demography, University of Rostock, Rostock, Germany
| | - Colin Griesbach
- Chair of Spatial Data Science and Statistical Learning, University of Goettingen, Goettingen, Germany
3
Abiose O, Rutledge J, Moran‐Losada P, Belloy ME, Wilson EN, He Z, Trelle AN, Channappa D, Romero A, Park J, Yutsis MV, Sha SJ, Andreasson KI, Poston KL, Henderson VW, Wagner AD, Wyss‐Coray T, Mormino EC. Post-translational modifications linked to preclinical Alzheimer's disease-related pathological and cognitive changes. Alzheimers Dement 2024; 20:1851-1867. [PMID: 38146099 PMCID: PMC10984434 DOI: 10.1002/alz.13576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 11/08/2023] [Accepted: 11/13/2023] [Indexed: 12/27/2023]
Abstract
INTRODUCTION In this study, we leverage proteomic techniques to identify communities of proteins underlying Alzheimer's disease (AD) risk among clinically unimpaired (CU) older adults. METHODS We constructed a protein co-expression network using 3869 cerebrospinal fluid (CSF) proteins quantified by SomaLogic, Inc., in a cohort of participants along the AD clinical spectrum. We then replicated this network in an independent cohort of CU older adults and related these modules to clinically-relevant outcomes. RESULTS We discovered modules enriched for phosphorylation and ubiquitination that were associated with abnormal amyloid status, as well as p-tau181 (M4: β = 2.44, p < 0.001, M7: β = 2.57, p < 0.001) and executive function performance (M4: β = -2.00, p = 0.005, M7: β = -2.39, p < 0.001). DISCUSSION In leveraging CSF proteomic data from individuals spanning the clinical spectrum of AD, we highlight the importance of post-translational modifications for early cognitive and pathological changes.
Affiliation(s)
- Olamide Abiose
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
| | - Jarod Rutledge
- The Phil and Penny Knight Initiative for Brain Resilience, Stanford University, Stanford, California, USA
- Department of Genetics, Stanford University, Stanford, California, USA
| | - Patricia Moran‐Losada
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
- The Phil and Penny Knight Initiative for Brain Resilience, Stanford University, Stanford, California, USA
| | - Michael E. Belloy
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
| | - Edward N. Wilson
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
| | - Zihuai He
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, USA
| | - Alexandra N. Trelle
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
| | - Divya Channappa
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
| | - America Romero
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
| | - Jennifer Park
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
| | - Maya V. Yutsis
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
| | - Sharon J. Sha
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
| | - Katrin I. Andreasson
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | - Kathleen L. Poston
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
- The Phil and Penny Knight Initiative for Brain Resilience, Stanford University, Stanford, California, USA
| | - Victor W. Henderson
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Department of Epidemiology & Population Health, Stanford University School of Medicine, Stanford, California, USA
| | - Anthony D. Wagner
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
- Department of Psychology, Stanford University, Stanford, California, USA
| | - Tony Wyss‐Coray
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
- The Phil and Penny Knight Initiative for Brain Resilience, Stanford University, Stanford, California, USA
| | - Elizabeth C. Mormino
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California, USA
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
4
Pedrero-Martin Y, Falla D, Rodriguez-Brazzarola P, Torrontegui-Duarte M, Fernandez-Sanchez M, Jerez-Aragones JM, Liew BXW, Luque-Suarez A. Prognostic Factors of Perceived Disability and Perceived Recovery After Whiplash: A Longitudinal, Prospective Study With One-year Follow-up. Clin J Pain 2024; 40:165-173. [PMID: 38031848 DOI: 10.1097/ajp.0000000000001182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 11/20/2023] [Indexed: 12/01/2023]
Abstract
OBJECTIVES Understanding the role that cognitive and emotional factors play in how an individual recovers from a whiplash injury is important. Hence, we sought to evaluate whether pain-related cognitions (self-efficacy beliefs, expectation of recovery, pain catastrophizing, optimism, and pessimism) and emotions (kinesiophobia) are longitudinally associated with the transition to chronic whiplash-associated disorders in terms of perceived disability and perceived recovery at 6 and 12 months. METHODS One hundred sixty-one participants with acute or subacute whiplash-associated disorder were included. The predictors were: self-efficacy beliefs, expectation of recovery, pain catastrophizing, optimism, pessimism, pain intensity, and kinesiophobia. The 2 outcomes were the dichotomized scores of perceived disability and recovery expectations at 6 and 12 months. Stepwise regression with bootstrap resampling was performed to identify the predictors most strongly associated with the outcomes and the stability of such selection. RESULTS Baseline perceived disability, pain catastrophizing, and expectation of recovery were the most likely to be statistically significant, with an average frequency of 87.2%, 84.0%, and 84.0%, respectively. CONCLUSION Individuals with higher expectations of recovery and lower levels of pain catastrophizing and perceived disability at baseline have higher perceived recovery and perceived disability at 6 and 12 months. These results have important clinical implications as both factors are modifiable through health education approaches.
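The selection frequencies reported above come from repeating a variable selection procedure on bootstrap resamples and counting how often each predictor is retained. A minimal sketch of that bookkeeping follows. The selection rule used here (absolute Pearson correlation with the outcome above a threshold) is a deliberately simple stand-in for stepwise regression, and the names `select`, `selection_frequency`, the threshold, and the synthetic data are all assumptions for illustration rather than the study's procedure.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select(columns, y, threshold=0.3):
    """Stand-in selection rule: keep predictors with |r| above the threshold."""
    return {name for name, col in columns.items()
            if abs(pearson(col, y)) > threshold}


def selection_frequency(columns, y, n_boot=200, seed=0):
    """Percentage of bootstrap resamples in which each predictor is selected."""
    rng = random.Random(seed)
    n = len(y)
    counts = {name: 0 for name in columns}
    for _ in range(n_boot):
        # Draw n participants with replacement and rerun the selection.
        idx = [rng.randrange(n) for _ in range(n)]
        boot_cols = {name: [col[i] for i in idx] for name, col in columns.items()}
        boot_y = [y[i] for i in idx]
        for name in select(boot_cols, boot_y):
            counts[name] += 1
    return {name: 100.0 * c / n_boot for name, c in counts.items()}


# Synthetic example (hypothetical data): one informative and one pure-noise predictor
# for a dichotomized outcome.
rng = random.Random(1)
signal = [rng.gauss(0, 1) for _ in range(150)]
noise = [rng.gauss(0, 1) for _ in range(150)]
outcome = [1.0 if s + 0.5 * rng.gauss(0, 1) > 0 else 0.0 for s in signal]
freq = selection_frequency({"signal": signal, "noise": noise}, outcome)
```

A predictor that is selected in most resamples is a stable choice; one that appears only occasionally is likely to be an artefact of the particular sample.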
Affiliation(s)
- Yolanda Pedrero-Martin
- University of Malaga, Faculty of Health Sciences, Malaga, Spain
- University of Gimbernat-Cantabria, Cantabria, Spain
| | - Deborah Falla
- University of Birmingham, School of Sport Exercise and Rehabilitation Sciences, Birmingham. Centre of Precision Rehabilitation for Spinal Pain (CPR Spine)
| | - Bernard X W Liew
- School of Sport, Rehabilitation and Exercise Sciences, University of Essex, Colchester, Essex, UK
| | - Alejandro Luque-Suarez
- University of Malaga, Faculty of Health Sciences, Malaga, Spain
- Biomedical Research Institute-IBIMA, Malaga, Spain
5
Battauz M, Vidoni P. A boosting method to select the random effects in linear mixed models. Biometrics 2024; 80:ujae010. [PMID: 38465986 DOI: 10.1093/biomtc/ujae010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 12/07/2023] [Accepted: 01/29/2024] [Indexed: 03/12/2024]
Abstract
This paper proposes a novel likelihood-based boosting method for the selection of the random effects in linear mixed models. The nonconvexity of the objective function to minimize, which is the negative profile log-likelihood, requires the adoption of new solutions. In this respect, our optimization approach also employs the directions of negative curvature besides the usual Newton directions. A simulation study and a real-data application show the good performance of the proposal.
Affiliation(s)
- Michela Battauz
- Department of Economics and Statistics, University of Udine, Udine 33100, Italy
| | - Paolo Vidoni
- Department of Economics and Statistics, University of Udine, Udine 33100, Italy
6
Cardner M, Marass F, Gedvilaite E, Yang JL, Tsui DWY, Beerenwinkel N. Predicting tumour content of liquid biopsies from cell-free DNA. BMC Bioinformatics 2023; 24:368. [PMID: 37777714 PMCID: PMC10543881 DOI: 10.1186/s12859-023-05478-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 09/12/2023] [Indexed: 10/02/2023] Open
Abstract
BACKGROUND Liquid biopsy is a minimally-invasive method of sampling bodily fluids, capable of revealing evidence of cancer. The distribution of cell-free DNA (cfDNA) fragment lengths has been shown to differ between healthy subjects and cancer patients, whereby the distributional shift correlates with the sample's tumour content. These fragmentomic data have not yet been utilised to directly quantify the proportion of tumour-derived cfDNA in a liquid biopsy. RESULTS We used statistical learning to predict tumour content from Fourier and wavelet transforms of cfDNA length distributions in samples from 118 cancer patients. The model was validated on an independent dilution series of patient plasma. CONCLUSIONS This proof of concept suggests that our fragmentomic methodology could be useful for predicting tumour content in liquid biopsies.
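To make the idea of Fourier features of a fragment-length distribution concrete, here is a minimal sketch that turns a list of fragment lengths into a normalised histogram and then computes the magnitudes of its first discrete Fourier coefficients. It illustrates the kind of transform mentioned in the abstract, not the authors' pipeline; the function names, the length window, and the toy input are assumptions.

```python
import cmath


def length_histogram(lengths, min_len, max_len):
    """Normalised histogram of fragment lengths within [min_len, max_len]."""
    counts = [0] * (max_len - min_len + 1)
    for length in lengths:
        if min_len <= length <= max_len:
            counts[length - min_len] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs discrete Fourier coefficients."""
    n = len(signal)
    mags = []
    for k in range(n_coeffs):
        coeff = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(coeff))
    return mags


# Toy input (hypothetical): fragments only at lengths 100, 110, 120 and 130,
# i.e. a distribution with a strict 10-unit periodicity over a 40-unit window.
lengths = [100, 110, 120, 130] * 10
hist = length_histogram(lengths, 100, 139)
mags = dft_magnitudes(hist, 8)
```

With this toy input the periodicity shows up as a large coefficient at index 4 (four cycles across the 40-bin window), while coefficients 1 to 3 vanish. Feature vectors of this kind could then be passed to any regression model.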
Affiliation(s)
- Mathias Cardner
- Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, 4058, Basel, Switzerland
| | - Francesco Marass
- Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, 4058, Basel, Switzerland
- PetDx, Inc, La Jolla, USA
| | - Erika Gedvilaite
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Julie L Yang
- Epigenetics Research Center, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Dana W Y Tsui
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- PetDx, Inc, La Jolla, USA
| | - Niko Beerenwinkel
- Department of Biosystems Science and Engineering, ETH Zurich, 4058, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, 4058, Basel, Switzerland
7
Liew BXW, Kovacs FM, Rügamer D, Royuela A. Automatic Variable Selection Algorithms in Prognostic Factor Research in Neck Pain. J Clin Med 2023; 12:6232. [PMID: 37834877 PMCID: PMC10573798 DOI: 10.3390/jcm12196232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 09/21/2023] [Accepted: 09/26/2023] [Indexed: 10/15/2023] Open
Abstract
This study aims to compare the variable selection strategies of different machine learning (ML) and statistical algorithms in the prognosis of neck pain (NP) recovery. A total of 3001 participants with NP were included. Three dichotomous outcomes of an improvement in NP, arm pain (AP), and disability at 3 months follow-up were used. Twenty-five variables (twenty-eight parameters) were included as predictors. There were more parameters than variables, as some categorical variables had >2 levels. Eight modelling techniques were compared: stepwise regression based on unadjusted p values (stepP), on adjusted p values (stepPAdj), and on the Akaike information criterion (stepAIC), best subset regression (BestSubset), least absolute shrinkage and selection operator (LASSO), minimax concave penalty (MCP), model-based boosting (mboost), and multivariate adaptive regression splines (MuARS). The algorithm that selected the fewest predictors was stepPAdj (number of predictors, p = 4 to 8). MuARS was the algorithm with the second fewest predictors selected (p = 9 to 14). The predictor selected by all algorithms with the largest coefficient magnitude was "having undergone a neuroreflexotherapy intervention" for NP (β = from 1.987 to 2.296) and AP (β = from 2.639 to 3.554), and "Imaging findings: spinal stenosis" (β = from -1.331 to -1.763) for disability. Stepwise regression based on adjusted p-values resulted in the sparsest models, which enhanced clinical interpretability. MuARS appears to provide the optimal balance between model sparsity and high predictive performance across outcomes. Different algorithms produced similar performances but resulted in a different number of variables selected. Rather than relying on any single algorithm, confidence in the variable selection may be increased by using multiple algorithms.
Affiliation(s)
- Bernard X. W. Liew
- School of Sport, Rehabilitation and Exercise Sciences, University of Essex, Colchester CO4 3SQ, Essex, UK
| | - Francisco M. Kovacs
- Unidad de la Espalda Kovacs, HLA-Moncloa University Hospital, 28008 Madrid, Spain
| | - David Rügamer
- Department of Statistics, Ludwig-Maximilians-Universität München, 80539 Munich, Germany
| | - Ana Royuela
- Biostatistics Unit, Hospital Puerta de Hierro, Instituto Investigación Sanitaria Puerta de Hierro-Segovia de Arana, Consorcio de Investigación Biomédica en Red de Epidemiología y Salud Pública, Red Española de Investigadores en Dolencias de la Espalda, 28222 Madrid, Spain
8
Zanetti D, Stell L, Gustafsson S, Abbasi F, Tsao PS, Knowles JW, Zethelius B, Ärnlöv J, Balkau B, Walker M, Lazzeroni LC, Lind L, Petrie JR, Assimes TL. Plasma proteomic signatures of a direct measure of insulin sensitivity in two population cohorts. Diabetologia 2023; 66:1643-1654. [PMID: 37329449 PMCID: PMC10390625 DOI: 10.1007/s00125-023-05946-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 04/12/2023] [Indexed: 06/19/2023]
Abstract
AIMS/HYPOTHESIS The euglycaemic-hyperinsulinaemic clamp (EIC) is the reference standard for the measurement of whole-body insulin sensitivity but is laborious and expensive to perform. We aimed to assess the incremental value of high-throughput plasma proteomic profiling in developing signatures correlating with the M value derived from the EIC. METHODS We measured 828 proteins in the fasting plasma of 966 participants from the Relationship between Insulin Sensitivity and Cardiovascular disease (RISC) study and 745 participants from the Uppsala Longitudinal Study of Adult Men (ULSAM) using a high-throughput proximity extension assay. We used the least absolute shrinkage and selection operator (LASSO) approach using clinical variables and protein measures as features. Models were tested within and across cohorts. Our primary model performance metric was the proportion of the M value variance explained (R2). RESULTS A standard LASSO model incorporating 53 proteins in addition to routinely available clinical variables increased the M value R2 from 0.237 (95% CI 0.178, 0.303) to 0.456 (0.372, 0.536) in RISC. A similar pattern was observed in ULSAM, in which the M value R2 increased from 0.443 (0.360, 0.530) to 0.632 (0.569, 0.698) with the addition of 61 proteins. Models trained in one cohort and tested in the other also demonstrated significant improvements in R2 despite differences in baseline cohort characteristics and clamp methodology (RISC to ULSAM: 0.491 [0.433, 0.539] for 51 proteins; ULSAM to RISC: 0.369 [0.331, 0.416] for 67 proteins). A randomised LASSO and stability selection algorithm selected only two proteins per cohort (three unique proteins), which improved R2 but to a lesser degree than in standard LASSO models: 0.352 (0.266, 0.439) in RISC and 0.495 (0.404, 0.585) in ULSAM. 
Reductions in improvements of R2 with randomised LASSO and stability selection were less marked in cross-cohort analyses (RISC to ULSAM R2 0.444 [0.391, 0.497]; ULSAM to RISC R2 0.348 [0.300, 0.396]). Models of proteins alone were as effective as models that included both clinical variables and proteins using either standard or randomised LASSO. The single most consistently selected protein across all analyses and models was IGF-binding protein 2. CONCLUSIONS/INTERPRETATION A plasma proteomic signature identified using a standard LASSO approach improves the cross-sectional estimation of the M value over routine clinical variables. However, a small subset of these proteins identified using a stability selection algorithm affords much of this improvement, especially when considering cross-cohort analyses. Our approach provides opportunities to improve the identification of insulin-resistant individuals at risk of insulin resistance-related adverse health consequences.
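The contrast drawn in this abstract between a standard LASSO and a randomised LASSO with stability selection can be sketched in a few lines. The code below fits a LASSO by coordinate descent, then repeats the fit on random half-samples with randomly inflated per-feature penalties and records how often each feature receives a non-zero coefficient. It is a simplified illustration under stated assumptions, not the authors' analysis: the function names, the penalty `lam`, the weakness parameter `alpha`, the number of rounds, and the synthetic data are all hypothetical.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, penalty_weights=None, n_iter=50):
    """LASSO via cyclic coordinate descent on centred data.

    Minimises (1 / 2n) * RSS + lam * sum_j w_j * |beta_j|.
    """
    n, p = len(X), len(X[0])
    w = penalty_weights or [1.0] * p
    mx = [sum(row[j] for row in X) / n for j in range(p)]
    my = sum(y) / n
    Xc = [[row[j] - mx[j] for j in range(p)] for row in X]
    resid = [v - my for v in y]
    z = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (beta_j added back).
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + z[j] * beta[j]
            new = soft_threshold(rho, lam * w[j]) / z[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [resid[i] - delta * Xc[i][j] for i in range(n)]
                beta[j] = new
    return beta


def stability_selection(X, y, lam, n_rounds=50, alpha=0.5, seed=0):
    """Fraction of half-sample randomised-LASSO fits that select each feature."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    hits = [0] * p
    for _ in range(n_rounds):
        idx = rng.sample(range(n), n // 2)  # subsample without replacement
        # Randomised LASSO: inflate each penalty by 1 / W with W ~ U(alpha, 1).
        weights = [1.0 / rng.uniform(alpha, 1.0) for _ in range(p)]
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam, weights)
        for j in range(p):
            if beta[j] != 0.0:
                hits[j] += 1
    return [h / n_rounds for h in hits]


# Synthetic example (hypothetical data): only feature 0 drives the response.
rng = random.Random(2)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(80)]
y = [3.0 * row[0] + rng.gauss(0, 0.5) for row in X]
freq = stability_selection(X, y, lam=0.5)
```

Keeping only features whose selection frequency exceeds a high cut-off yields a much smaller set than a single LASSO fit, which is consistent with the two-proteins-per-cohort result described above.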
Affiliation(s)
- Daniela Zanetti
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- VA Palo Alto Health Care System, Palo Alto, CA, USA
| | - Laurel Stell
- VA Palo Alto Health Care System, Palo Alto, CA, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
| | - Fahim Abbasi
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
| | - Philip S Tsao
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- VA Palo Alto Health Care System, Palo Alto, CA, USA
- Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA, USA
| | - Joshua W Knowles
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Prevention Research Center, Stanford University School of Medicine, Stanford, CA, USA
| | - Björn Zethelius
- Department of Public Health/Geriatrics, Uppsala University, Uppsala, Sweden
| | - Johan Ärnlöv
- Division of Family Medicine and Primary Care, Department of Neurobiology, Care Sciences and Society, Karolinska Institute, Stockholm, Sweden
- Department of Health and Social Studies, Dalarna University, Falun, Sweden
| | - Beverley Balkau
- Clinical Epidemiology, Centre for Research in Epidemiology and Population Health, Inserm U1018, Villejuif, France
| | - Mark Walker
- Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, UK
| | - Laura C Lazzeroni
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| | - Lars Lind
- Department of Medical Sciences, Uppsala University, Uppsala, Sweden
| | - John R Petrie
- School of Health and Wellbeing, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK
| | - Themistocles L Assimes
- Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- VA Palo Alto Health Care System, Palo Alto, CA, USA
- Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA, USA
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, USA
9
Cardner M, Tuckwell D, Kostikova A, Forrer P, Siegel RM, Marti A, Vandemeulebroecke M, Ferrero E. Analysis of serum proteomics data identifies a quantitative association between beta-defensin 2 at baseline and clinical response to IL-17 blockade in psoriatic arthritis. RMD Open 2023; 9:e003042. [PMID: 37321668 DOI: 10.1136/rmdopen-2023-003042] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/30/2023] [Accepted: 05/22/2023] [Indexed: 06/17/2023] Open
Abstract
OBJECTIVES Despite several effective targeted therapies, biomarkers that predict whether a patient with psoriatic arthritis (PsA) will respond to a particular treatment are currently lacking. METHODS We analysed proteomics data from serum samples of nearly 2000 patients with PsA in placebo-controlled phase-III clinical trials of the interleukin-17 inhibitor secukinumab. To discover predictive biomarkers of clinical response, we used statistical learning with controlled feature selection. The top candidate was validated using an ELISA and was separately assessed in a trial of almost 800 patients with PsA treated with secukinumab or the tumour necrosis factor inhibitor adalimumab. RESULTS Serum levels of beta-defensin 2 (BD-2) at baseline were found to be robustly associated with subsequent clinical response (eg, American College of Rheumatology definition of 20%, 50% and 70% improvement) to secukinumab, but not to placebo. This finding was validated in two independent clinical studies not used for discovery. Although BD-2 is known to be associated with psoriasis severity, the predictivity of BD-2 was independent of baseline Psoriasis Area and Severity Index. The association between BD-2 and response to secukinumab was observed as early as 4 weeks and maintained up to 52 weeks. BD-2 was also found to predict response to treatment with adalimumab. Unlike in PsA, BD-2 was not predictive of response to secukinumab in rheumatoid arthritis. CONCLUSIONS In PsA, BD-2 at baseline is quantitatively associated with clinical response to secukinumab. Patients with high levels of BD-2 at baseline reach and sustain higher rates of clinical response after treatment with secukinumab.
Affiliation(s)
- Mathias Cardner
- Novartis Pharma AG, Basel, Switzerland
- Novartis Institutes for BioMedical Research, Basel, Switzerland
| | - Danny Tuckwell
- Novartis Institutes for BioMedical Research, Basel, Switzerland
| | - Anna Kostikova
- Novartis Institutes for BioMedical Research, Basel, Switzerland
| | - Enrico Ferrero
- Novartis Institutes for BioMedical Research, Basel, Switzerland
10
Simon T, Mayr GJ, Morgenstern D, Umlauf N, Zeileis A. Amplification of annual and diurnal cycles of alpine lightning. Climate Dynamics 2023; 61:4125-4137. [PMID: 37854482 PMCID: PMC10579137 DOI: 10.1007/s00382-023-06786-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 04/10/2023] [Indexed: 10/20/2023]
Abstract
The response of lightning to a changing climate is not fully understood. Historic trends of proxies known for fostering convective environments suggest an increase of lightning over large parts of Europe. Since lightning results from the interaction of processes on many scales, as many of these processes as possible must be considered for a comprehensive answer. Recent achievements of decade-long seamless lightning measurements and hourly reanalyses of atmospheric conditions, including cloud microphysics, combined with flexible regression techniques have made a reliable reconstruction of cloud-to-ground lightning down to its seasonally varying diurnal cycle feasible. The European Eastern Alps and their surroundings are chosen as the reconstruction region since this domain includes a large variety of land-cover, topographical and atmospheric circulation conditions. The most intense changes over the four decades from 1980 to 2019 occurred over the high Alps, where lightning activity doubled in the 2010s compared to the 1980s. There, the lightning season reaches a higher maximum and starts one month earlier. Diurnally, the peak is up to 50% stronger, with more lightning strikes in the afternoon and evening hours. Signals along the southern and northern alpine rim are similar but weaker, whereas the flatlands surrounding the Alps have no significant trend.
Affiliation(s)
- Thorsten Simon
- Department of Mathematics, Universität Innsbruck, Technikerstrasse 21a, 6020 Innsbruck, Austria
- Department of Statistics, Universität Innsbruck, Universitätsstrasse 15, 6020 Innsbruck, Austria
| | - Georg J. Mayr
- Department of Atmospheric and Cryospheric Sciences, Universität Innsbruck, Innrain 52, 6020 Innsbruck, Austria
| | - Deborah Morgenstern
- Department of Statistics, Universität Innsbruck, Universitätsstrasse 15, 6020 Innsbruck, Austria
- Department of Atmospheric and Cryospheric Sciences, Universität Innsbruck, Innrain 52, 6020 Innsbruck, Austria
| | - Nikolaus Umlauf
- Department of Statistics, Universität Innsbruck, Universitätsstrasse 15, 6020 Innsbruck, Austria
| | - Achim Zeileis
- Department of Statistics, Universität Innsbruck, Universitätsstrasse 15, 6020 Innsbruck, Austria
11
Lokmer A, Alladi CG, Troudet R, Bacq-Daian D, Boland-Auge A, Latapie V, Deleuze JF, RajKumar RP, Shewade DG, Bélivier F, Marie-Claire C, Jamain S. Risperidone response in patients with schizophrenia drives DNA methylation changes in immune and neuronal systems. Epigenomics 2023; 15:21-38. [PMID: 36919681 DOI: 10.2217/epi-2023-0017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023] Open
Abstract
Background: The choice of efficient antipsychotic therapy for schizophrenia relies on a time-consuming trial-and-error approach, whereas the social and economic burdens of the disease call for faster alternatives. Material & methods: In a search for predictive biomarkers of antipsychotic response, blood methylomes of 28 patients were analyzed before and 4 weeks into risperidone therapy. Results: Several CpGs exhibiting response-specific temporal dynamics were identified in otherwise temporally stable methylomes and noticeable global response-related differences were observed between good and bad responders. These were associated with genes involved in immunity, neurotransmission and neuronal development. Polymorphisms in many of these genes were previously linked with schizophrenia etiology and antipsychotic response. Conclusion: Antipsychotic response seems to be shaped by both stable and medication-induced methylation differences.
Affiliation(s)
- Ana Lokmer
- Univ Paris Est Créteil, INSERM, IMRB, Translational Neuropsychiatry, Créteil, F-94000, France.,Fondation FondaMental, Créteil, F-94000, France
| | - Charanraj Goud Alladi
- Université de Paris, INSERM UMRS 1144, Optimisation Thérapeutique en Neuropsychopharmacologie (OTeN), Paris, F-75006, France
| | - Réjane Troudet
- Univ Paris Est Créteil, INSERM, IMRB, Translational Neuropsychiatry, Créteil, F-94000, France.,Fondation FondaMental, Créteil, F-94000, France
| | - Delphine Bacq-Daian
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine (CNRGH), Evry, F-91057, France
| | - Anne Boland-Auge
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine (CNRGH), Evry, F-91057, France
| | - Violaine Latapie
- Univ Paris Est Créteil, INSERM, IMRB, Translational Neuropsychiatry, Créteil, F-94000, France.,Fondation FondaMental, Créteil, F-94000, France
| | - Jean-François Deleuze
- Université Paris-Saclay, CEA, Centre National de Recherche en Génomique Humaine (CNRGH), Evry, F-91057, France
| | - Ravi Philip RajKumar
- Department of Pharmacology, Jawaharlal Institute of Postgraduate Medical Education & Research, Puducherry, 605006, India
| | - Deepak Gopal Shewade
- Department of Psychiatry, Jawaharlal Institute of Postgraduate Medical Education & Research, Puducherry, 605006, India.,Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, F-91000, France
| | - Frank Bélivier
- Fondation FondaMental, Créteil, F-94000, France.,Université de Paris, INSERM UMRS 1144, Optimisation Thérapeutique en Neuropsychopharmacologie (OTeN), Paris, F-75006, France.,Hôpitaux Lariboisière-Fernand Widal, GHU APHP Nord, Département de Psychiatrie et de Médecine Addictologique, Paris, F-75010, France
| | - Cynthia Marie-Claire
- Université de Paris, INSERM UMRS 1144, Optimisation Thérapeutique en Neuropsychopharmacologie (OTeN), Paris, F-75006, France
| | - Stéphane Jamain
- Univ Paris Est Créteil, INSERM, IMRB, Translational Neuropsychiatry, Créteil, F-94000, France.,Fondation FondaMental, Créteil, F-94000, France
| |
|
12
|
Capanu M, Giurcanu M, Begg CB, Gönen M. Subsampling based variable selection for generalized linear models. Comput Stat Data Anal 2023; 184. [PMID: 37090139 PMCID: PMC10118238 DOI: 10.1016/j.csda.2023.107740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2023]
Abstract
A novel variable selection method for low-dimensional generalized linear models is introduced. The new approach, called AIC OPTimization via STABility Selection (OPT-STABS), repeatedly subsamples the data, minimizes Akaike's Information Criterion (AIC) over a sequence of nested models for each subsample, and includes in the final model those predictors selected in the minimum AIC model in a large fraction of the subsamples. New methods are also introduced to establish an optimal variable selection cutoff over repeated subsamples. An extensive simulation study examining a variety of proposed variable selection methods shows that, although no single method uniformly outperforms the others in all the scenarios considered, OPT-STABS is consistently among the best-performing methods in most settings while it performs competitively for the rest. This is in contrast to other candidate methods, which either perform poorly across the board or perform well in some settings but very poorly in others. In addition, the asymptotic properties of the OPT-STABS estimator are derived, and its root-n consistency and asymptotic normality are proved. The methods are applied to two datasets involving logistic and Poisson regressions.
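The subsample, minimize-AIC, and count-selections loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' OPT-STABS implementation: it uses a Gaussian linear model rather than a general GLM, the rule for building the nested sequence (ranking predictors by absolute marginal correlation) is an assumption made here for illustration, and the fixed `cutoff` stands in for the optimal cutoff procedure of the paper. The function names are hypothetical.

```python
import numpy as np


def gaussian_aic(X, y):
    """AIC of a least-squares fit with intercept under a Gaussian likelihood."""
    n = len(y)
    design = np.column_stack([np.ones(n), X]) if X.shape[1] else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus the error variance
    return n * np.log(rss / n) + 2 * k


def subsample_aic_selection(X, y, n_subsamples=100, frac=0.5, cutoff=0.6, seed=0):
    """Return per-predictor selection frequencies and the indices above the cutoff."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = int(frac * n)
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=m, replace=False)  # subsample without replacement
        Xs, ys = X[idx], y[idx]
        # Assumed ordering rule (not from the paper): absolute marginal correlation.
        corr = np.abs([np.corrcoef(Xs[:, j], ys)[0, 1] for j in range(p)])
        order = np.argsort(-corr)
        # Nested models use the first 0, 1, ..., p predictors in that order.
        aics = [gaussian_aic(Xs[:, order[:k]], ys) for k in range(p + 1)]
        best_k = int(np.argmin(aics))
        counts[order[:best_k]] += 1  # predictors in the minimum-AIC model
    freq = counts / n_subsamples
    return freq, np.flatnonzero(freq >= cutoff)
```

Calling `subsample_aic_selection(X, y)` on a numeric design matrix returns the selection frequency of each column together with the columns whose frequency reaches the cutoff; the frequencies themselves are often more informative than the final set.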
|
13
|
Ha CSR, Müller-Nurasyid M, Petrera A, Hauck SM, Marini F, Bartsch DK, Slater EP, Strauch K. Proteomics biomarker discovery for individualized prevention of familial pancreatic cancer using statistical learning. PLoS One 2023; 18:e0280399. [PMID: 36701413 PMCID: PMC9879447 DOI: 10.1371/journal.pone.0280399] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/23/2022] [Accepted: 12/28/2022] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND The low five-year survival rate of pancreatic ductal adenocarcinoma (PDAC) and the low diagnostic rate of early-stage PDAC via imaging highlight the need to discover novel biomarkers and improve the current screening procedures for early diagnosis. Familial pancreatic cancer (FPC) describes the cases of PDAC that are present in two or more individuals within a circle of first-degree relatives. Using innovative high-throughput proteomics, we were able to quantify the protein profiles of individuals at risk from FPC families in different potential pre-cancer stages. However, the high-dimensional proteomics data structure challenges the use of traditional statistical analysis tools. Hence, we applied advanced statistical learning methods to enhance the analysis and improve the results' interpretability. METHODS We applied model-based gradient boosting and adaptive lasso to deal with the small, unbalanced study design via simultaneous variable selection and model fitting. In addition, we used stability selection to identify a stable subset of selected biomarkers and, as a result, obtain even more interpretable results. In each step, we compared the performance of the different analytical pipelines and validated our approaches via simulation scenarios. RESULTS In the simulation study, model-based gradient boosting showed a more accurate prediction performance in the small, unbalanced, and high-dimensional datasets than adaptive lasso and could identify more relevant variables. Furthermore, using model-based gradient boosting, we discovered a subset of promising serum biomarkers that may potentially improve the current screening procedure of FPC. CONCLUSION Advanced statistical learning methods helped us overcome the shortcomings of an unbalanced study design in a valuable clinical dataset. 
The discovered serum biomarkers provide us with a clear direction for further investigations and more precise clinical hypotheses regarding the development of FPC and optimal strategies for its early detection.
Affiliation(s)
- Chung Shing Rex Ha
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Institute of Genetic Epidemiology, Helmholtz Zentrum München—German Research Center for Environmental Health, Neuherberg, Germany
- Faculty of Medicine, Institute for Medical Information Processing, Chair of Genetic Epidemiology, Biometry, and Epidemiology (IBE), LMU Munich, Munich, Germany
- * E-mail:
| | - Martina Müller-Nurasyid
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Institute of Genetic Epidemiology, Helmholtz Zentrum München—German Research Center for Environmental Health, Neuherberg, Germany
- Faculty of Medicine, Institute for Medical Information Processing, Biometry, and Epidemiology (IBE), LMU Munich, Munich, Germany
- Faculty of Medicine, Institute for Medical Information Processing, Pettenkofer School of Public Health Munich, Biometry, and Epidemiology (IBE), LMU Munich, Munich, Germany
| | - Agnese Petrera
- Research Unit Protein Science and Metabolomics and Proteomics Core Facility, Helmholtz Zentrum München—German Research Center for Environmental Health, Neuherberg, Germany
| | - Stefanie M. Hauck
- Research Unit Protein Science and Metabolomics and Proteomics Core Facility, Helmholtz Zentrum München—German Research Center for Environmental Health, Neuherberg, Germany
| | - Federico Marini
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Research Center for Immunotherapy (FZI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
| | - Detlef K. Bartsch
- Department of Visceral-, Thoracic- and Vascular Surgery, Philipps University, Marburg, Germany
| | - Emily P. Slater
- Department of Visceral-, Thoracic- and Vascular Surgery, Philipps University, Marburg, Germany
| | - Konstantin Strauch
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Institute of Genetic Epidemiology, Helmholtz Zentrum München—German Research Center for Environmental Health, Neuherberg, Germany
- Faculty of Medicine, Institute for Medical Information Processing, Chair of Genetic Epidemiology, Biometry, and Epidemiology (IBE), LMU Munich, Munich, Germany
| |
|
14
|
Failmezger H, Hessel H, Kapil A, Schmidt G, Harder N. Spatial heterogeneity of cancer associated protein expression in immunohistochemically stained images as an improved prognostic biomarker. Front Oncol 2022; 12:964716. [PMID: 36601480 PMCID: PMC9806230 DOI: 10.3389/fonc.2022.964716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 11/23/2022] [Indexed: 12/23/2022] Open
Abstract
The identification of new tumor biomarkers for patient stratification before therapy, for monitoring of disease progression, and for characterization of tumor biology plays a crucial role in cancer research. The status of these biomarkers is mostly scored manually by a pathologist, and such scores typically do not consider the spatial heterogeneity of the protein's expression in the tissue. Using advanced image analysis methods, marker expression can be determined quantitatively with high accuracy and reproducibility on a per-cell level. To aggregate such per-cell marker expressions on a patient level, the expression values for single cells are usually averaged for the whole tissue. However, averaging neglects the spatial heterogeneity of the marker expression in the tissue. We present two novel approaches for quantitative scoring of spatial marker expression heterogeneity. The first approach is based on a co-occurrence analysis of the marker expression in neighboring cells. The second approach accounts for the local variability of the protein's expression by tiling the tissue with a regular grid and assigning local spatial heterogeneity phenotypes per tile. We apply our novel scores to quantify the spatial expression of four different membrane markers, i.e., HER2, CMET, CD44, and EGFR, in immunohistochemically (IHC) stained tissue sections of colorectal cancer patients. We evaluate the prognostic relevance of our spatial scores in this cohort and show that the spatial heterogeneity scores clearly outperform the marker expression average as a prognostic factor (CMET: p-value=0.01 vs. p-value=0.3).
|
15
|
Comparison between LASSO and RT methods for prediction of generic E. coli concentration in pasture poultry farms. Food Res Int 2022; 161:111860. [DOI: 10.1016/j.foodres.2022.111860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Revised: 07/28/2022] [Accepted: 08/21/2022] [Indexed: 11/21/2022]
|
16
|
Huemer MT, Petrera A, Hauck SM, Drey M, Peters A, Thorand B. Proteomics of the phase angle: Results from the population-based KORA S4 study. Clin Nutr 2022; 41:1818-1826. [PMID: 35834914 DOI: 10.1016/j.clnu.2022.06.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 06/01/2022] [Accepted: 06/23/2022] [Indexed: 11/03/2022]
Abstract
BACKGROUND & AIMS The phase angle (PhA) measured with bioelectrical impedance analysis is considered to reflect the interrelated components body cell mass and fluid distribution based on technical and physical aspects of the PhA measurement. However, the biomedical meaning of the PhA remains vague. Previous studies mainly assessed associations of the PhA with numerous diseases and health outcomes, but few connected protein markers to the PhA. To broaden our understanding of the biomedical background of the PhA, we aimed to explore a proteomics profile associated with the PhA and related biological factors. METHODS The study sample encompassed 1484 participants (725 women and 759 men) aged 55-74 years from the population-based Cooperative Health Research in the Region of Augsburg (KORA) S4 study. Proteomics measurements were performed with a proximity extension assay. We employed boosting with stability selection to establish a set of markers that was strongly associated with the PhA from a group of 233 plasma protein markers. We integrated the selected protein markers into a network and enrichment analysis to identify gene ontology (GO) terms significantly overrepresented for the selected PhA protein markers. RESULTS Boosting with stability selection identified seven protein markers that were strongly and independently associated with the PhA: N-terminal prohormone brain natriuretic peptide (NT-proBNP), insulin-like growth factor-binding protein 2 (IGFBP2), adrenomedullin (ADM), myoglobin (MB), matrix metalloproteinase-9 (MMP9), protein-glutamine gamma-glutamyltransferase 2 (TGM2), and fractalkine (CX3CL1) [beta coefficient per 1 standard deviation increase in normalized protein expression values on a log 2 scale (95% confidence interval): -0.12 (-0.15, -0.08), -0.13 (-0.17, -0.09), -0.14 (-0.18, -0.10), 0.10 (0.07, 0.14), 0.07 (0.04, 0.10), 0.08 (0.05, 0.11), -0.06 (-0.10, -0.03), respectively]. 
According to the enrichment analysis, this protein profile was significantly overrepresented in the following top five GO terms: positive regulation of cell population proliferation (p-value: 1.32E-04), extracellular space (p-value: 1.34E-04), anatomical structure formation involved in morphogenesis (p-value: 2.92E-04), regulation of multicellular organismal development (p-value: 5.72E-04), and metal ion homeostasis (p-value: 8.86E-04). CONCLUSION Implementing a proteomics approach, we identified six new protein markers strongly associated with the PhA and confirmed that NT-proBNP is a key PhA marker. The main biological processes that were related to this PhA's protein profile are involved in regulating the amount and growth of cells, reinforcing, from a biomedical perspective, the current technical-based consensus of the PhA to reflect body cell mass.
Affiliation(s)
- Marie-Theres Huemer
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Ingolstädter Landstr. 1, 85764 Neuherberg, Germany.
| | - Agnese Petrera
- Research Unit Protein Science and Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Heidemannstr. 1, 80939 Munich, Germany.
| | - Stefanie M Hauck
- Research Unit Protein Science and Metabolomics and Proteomics Core, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Heidemannstr. 1, 80939 Munich, Germany.
| | - Michael Drey
- Department of Medicine IV, University Hospital, LMU Munich, Geriatrics, Ziemssenstr. 5, 80336 Munich, Germany.
| | - Annette Peters
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Ingolstädter Landstr. 1, 85764 Neuherberg, Germany; German Center for Diabetes Research (DZD), Ingolstädter Landstr. 1, 85764 München-Neuherberg, Germany; Chair of Epidemiology, Institute for Medical Information Processing, Biometry and Epidemiology, Medical Faculty, Ludwig-Maximilians-Universität München, Marchioninistr. 15, 81377 Munich, Germany.
| | - Barbara Thorand
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Ingolstädter Landstr. 1, 85764 Neuherberg, Germany; German Center for Diabetes Research (DZD), Ingolstädter Landstr. 1, 85764 München-Neuherberg, Germany.
| |
|
17
|
Priya S, Burns MB, Ward T, Mars RAT, Adamowicz B, Lock EF, Kashyap PC, Knights D, Blekhman R. Identification of shared and disease-specific host gene-microbiome associations across human diseases using multi-omic integration. Nat Microbiol 2022; 7:780-795. [PMID: 35577971 PMCID: PMC9159953 DOI: 10.1038/s41564-022-01121-z] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Received: 03/01/2021] [Accepted: 04/06/2022] [Indexed: 12/19/2022]
Abstract
While gut microbiome and host gene regulation independently contribute to gastrointestinal disorders, it is unclear how the two may interact to influence host pathophysiology. Here we developed a machine learning-based framework to jointly analyse paired host transcriptomic (n = 208) and gut microbiome (n = 208) profiles from colonic mucosal samples of patients with colorectal cancer, inflammatory bowel disease and irritable bowel syndrome. We identified associations between gut microbes and host genes that depict shared as well as disease-specific patterns. We found that a common set of host genes and pathways implicated in gastrointestinal inflammation, gut barrier protection and energy metabolism are associated with disease-specific gut microbes. Additionally, we also found that mucosal gut microbes that have been implicated in all three diseases, such as Streptococcus, are associated with different host pathways in each disease, suggesting that similar microbes can affect host pathophysiology in a disease-specific manner through regulation of different host genes. Our framework can be applied to other diseases for the identification of host gene-microbiome associations that may influence disease outcomes.
Affiliation(s)
- Sambhawa Priya
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN, USA
- Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN, USA
| | - Michael B Burns
- Department of Biology, Loyola University Chicago, Chicago, IL, USA
| | - Tonya Ward
- BioTechnology Institute, College of Biological Sciences, University of Minnesota, Minneapolis, MN, USA
| | - Ruben A T Mars
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA
| | - Beth Adamowicz
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN, USA
| | - Eric F Lock
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Purna C Kashyap
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA
| | - Dan Knights
- BioTechnology Institute, College of Biological Sciences, University of Minnesota, Minneapolis, MN, USA
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
| | - Ran Blekhman
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN, USA.
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Minneapolis, MN, USA.
| |
|
18
|
Teitsdottir UD, Darreh-Shori T, Lund SH, Jonsdottir MK, Snaedal J, Petersen PH. Phenotypic Displays of Cholinergic Enzymes Associate With Markers of Inflammation, Neurofibrillary Tangles, and Neurodegeneration in Pre- and Early Symptomatic Dementia Subjects. Front Aging Neurosci 2022; 14:876019. [PMID: 35693340 PMCID: PMC9178195 DOI: 10.3389/fnagi.2022.876019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Accepted: 05/02/2022] [Indexed: 11/13/2022] Open
Abstract
Background Cholinergic drugs are the most commonly used drugs for the treatment of Alzheimer’s disease (AD). Therefore, a better understanding of the cholinergic system and its relation to both AD-related biomarkers and cognitive functions is of high importance. Objectives To evaluate the relationships of cerebrospinal fluid (CSF) cholinergic enzymes with markers of amyloidosis, neurodegeneration, neurofibrillary tangles, inflammation and performance on verbal episodic memory in a memory clinic cohort. Methods In this cross-sectional study, 46 cholinergic drug-free subjects (median age = 71, 54% female, median MMSE = 28) were recruited from an Icelandic memory clinic cohort targeting early stages of cognitive impairment. Enzyme activity of acetylcholinesterase (AChE) and butyrylcholinesterase (BuChE) was measured in CSF as well as levels of amyloid-β1–42 (Aβ42), phosphorylated tau (P-tau), total-tau (T-tau), neurofilament light (NFL), YKL-40, S100 calcium-binding protein B (S100B), and glial fibrillary acidic protein (GFAP). Verbal episodic memory was assessed with the Rey Auditory Verbal Learning (RAVLT) and Story tests. Results No significant relationships were found between CSF Aβ42 levels and AChE or BuChE activity (p > 0.05). In contrast, T-tau (r = 0.46, p = 0.001) and P-tau (r = 0.45, p = 0.002) levels correlated significantly with AChE activity. Although neurodegeneration markers T-tau and NFL did correlate with each other (r = 0.59, p < 0.001), NFL did not correlate with AChE (r = 0.25, p = 0.09) or BuChE (r = 0.27, p = 0.06). Inflammation markers S100B and YKL-40 both correlated significantly with AChE (S100B: r = 0.43, p = 0.003; YKL-40: r = 0.32, p = 0.03) and BuChE (S100B: r = 0.47, p < 0.001; YKL-40: r = 0.38, p = 0.009) activity. A weak correlation was detected between AChE activity and the composite score reflecting verbal episodic memory (r = −0.34, p = 0.02). 
LASSO regression analyses with a stability approach were performed for the selection of a set of measures best predicting cholinergic activity and verbal episodic memory score. S100B was the predictor with the highest model selection frequency for both AChE (68%) and BuChE (73%) activity. Age (91%) was the most reliable predictor for verbal episodic memory, with selection frequency of both cholinergic enzymes below 10%. Conclusions Results indicate a relationship between higher activity of the ACh-degrading cholinergic enzymes with increased neurodegeneration, neurofibrillary tangles and inflammation in the stages of pre- and early symptomatic dementia, independent of CSF Aβ42 levels.
Affiliation(s)
- Unnur D. Teitsdottir
- Faculty of Medicine, Department of Anatomy, Biomedical Center, University of Iceland, Reykjavik, Iceland
- *Correspondence: Unnur D. Teitsdottir
| | - Taher Darreh-Shori
- Division of Clinical Geriatrics, Department of Neurobiology, Care Sciences and Society, Center for Alzheimer Research, Karolinska Institutet, Campus Flemingsberg, Stockholm, Sweden
| | | | - Maria K. Jonsdottir
- Department of Psychology, Reykjavik University, Reykjavik, Iceland
- Department of Psychiatry, Landspitali-National University Hospital, Reykjavik, Iceland
| | - Jon Snaedal
- Memory Clinic, Department of Geriatric Medicine, Landspitali-National University Hospital, Reykjavik, Iceland
| | - Petur H. Petersen
- Faculty of Medicine, Department of Anatomy, Biomedical Center, University of Iceland, Reykjavik, Iceland
| |
|
19
|
Huber KJ, Vieira S, Sikorski J, Wüst PK, Fösel BU, Gröngröft A, Overmann J. Differential Response of Acidobacteria to Water Content, Soil Type, and Land Use During an Extended Drought in African Savannah Soils. Front Microbiol 2022; 13:750456. [PMID: 35222321 PMCID: PMC8874233 DOI: 10.3389/fmicb.2022.750456] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/30/2021] [Accepted: 01/20/2022] [Indexed: 11/13/2022] Open
Abstract
Although climate change is expected to increase the extent of drylands worldwide, the effect of drought on the soil microbiome is still insufficiently understood, especially for dominant but little characterized phyla like the Acidobacteria. In the present study the active acidobacterial communities of Namibian soils differing in type, physicochemical parameters, and land use were characterized by high-throughput sequencing. Water content, pH, major ions and nutrients were distinct for sandy soils, woodlands or dry agriculture on loamy sands. Soils were repeatedly sampled over a 2-year time period and covered consecutively a strong rainy, a dry, a normal rainy and a weak rainy season. The increasing drought had differential effects on different soils. Linear modeling of the soil water content across all sampling locations and sampling dates revealed that the accumulated precipitation of the preceding season had only a weak, but statistically significant effect, whereas woodland and irrigation exerted a strong positive effect on water content. The decrease in soil water content was accompanied by a pronounced decrease in the fraction of active Acidobacteria (from 7.9% to 0.7%) while overall bacterial community size/cell counts remained constant. Notably, the strongest decline in the relative fraction of Acidobacteria was observed after the first cycle of rainy and dry season, rather than after the weakest rainy season at the end of the observation period. Over the 2-year period, the β-diversity of soil Acidobacteria also changed. During the first year this change in composition was related to soil type (loamy sand) and land use (woodland) as explanatory variables. A total of 188 different acidobacterial sequence variants affiliated with the "Acidobacteriia," Blastocatellia, and Vicinamibacteria changed significantly in abundance, suggesting either drought sensitivity or formation of dormant cell forms. 
Comparative physiological testing of 15 Namibian isolates revealed species-specific and differential responses in viability during long-term continuous desiccation or drying-rewetting cycles. These different responses were not determined by phylogenetic affiliation and provide a first explanation for the effect of drought on soil Acidobacteria. In conclusion, the response of acidobacterial communities to water availability is non-linear, most likely caused by the different physiological adaptations of the different taxa present.
Affiliation(s)
- Katharina J. Huber
- Leibniz Institute DSMZ—German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Selma Vieira
- Leibniz Institute DSMZ—German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Johannes Sikorski
- Leibniz Institute DSMZ—German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Pia K. Wüst
- Leibniz Institute DSMZ—German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Bärbel U. Fösel
- Leibniz Institute DSMZ—German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
| | - Alexander Gröngröft
- Department of Geosciences, Institute of Soil Science, University of Hamburg, Hamburg, Germany
| | - Jörg Overmann
- Leibniz Institute DSMZ—German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Institute of Microbiology, Technical University Braunschweig, Braunschweig, Germany
| |
|
20
|
Zhang B, Hepp T, Greven S, Bergherr E. Adaptive step-length selection in gradient boosting for Gaussian location and scale models. Comput Stat 2022. [DOI: 10.1007/s00180-022-01199-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Tuning of model-based boosting algorithms relies mainly on the number of iterations, while the step-length is fixed at a predefined value. For complex models with several predictors, such as generalized additive models for location, scale and shape (GAMLSS), imbalanced updates of predictors, where some distribution parameters are updated more frequently than others, can prevent some submodels from being appropriately fitted within a limited number of boosting iterations. We propose an approach using adaptive step-length (ASL) determination within a non-cyclical boosting algorithm for Gaussian location and scale models, as an important special case of the wider class of GAMLSS, to prevent such imbalance. Moreover, we discuss properties of the ASL and derive a semi-analytical form of the ASL that avoids manual selection of the search interval and numerical optimization to find the optimal step-length, and consequently improves computational efficiency. We show competitive behavior of the proposed approaches compared to penalized maximum likelihood and boosting with a fixed step-length for Gaussian location and scale models in two simulations and two applications, in particular for cases of large variance and/or more variables than observations. In addition, the underlying concept of the ASL is also applicable to the whole GAMLSS framework and to other models with more than one predictor, such as zero-inflated count models, and offers insights into the choice of reasonable defaults for the step-length in the simpler special case of (Gaussian) additive models.
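To illustrate why a closed-form step-length is available for the location parameter, the following short derivation follows directly from the Gaussian likelihood. The notation is an assumption made here and may differ from the paper's: $\mu_i$ and $\sigma_i$ are the current fitted location and scale for observation $i$, $h_i$ is the fitted value of the selected base-learner, and $\nu$ is the step-length. The scale parameter has no comparably simple solution, which is presumably why the abstract describes the overall result as semi-analytical.

```latex
% Negative Gaussian log-likelihood after the update \mu_i \leftarrow \mu_i + \nu h_i
% (terms not depending on \nu are collected in the constant):
\ell(\nu) = \sum_{i=1}^{n} \frac{\left(y_i - \mu_i - \nu h_i\right)^2}{2\sigma_i^2} + \text{const}

% Setting the derivative with respect to \nu to zero:
\frac{\partial \ell}{\partial \nu}
  = -\sum_{i=1}^{n} \frac{h_i \left(y_i - \mu_i - \nu h_i\right)}{\sigma_i^2} = 0

% gives the optimal step-length for the location update:
\nu^{*} = \frac{\sum_{i=1}^{n} h_i \left(y_i - \mu_i\right) / \sigma_i^2}
               {\sum_{i=1}^{n} h_i^2 / \sigma_i^2}
```

When all $\sigma_i$ are equal, $\nu^{*}$ reduces to the ordinary least-squares slope of the residuals $y_i - \mu_i$ on $h_i$.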
|
21
|
Marchais A, Marques Da Costa ME, Job B, Abbas R, Drubay D, Piperno-Neumann S, Fromigué O, Gomez-Brouchet A, Françoise R, Droit R, Lervat C, Entz-Werle N, Pacquement H, Devoldere C, Cupissol D, Bodet D, Gandemer V, Berger MG, Bérard PM, Jimenez M, Vassal G, Geoerger B, Brugieres L, Gaspar N. Immune infiltrate and tumor microenvironment transcriptional programs stratify pediatric osteosarcoma into prognostic groups at diagnosis. Cancer Res 2022; 82:974-985. [DOI: 10.1158/0008-5472.can-20-4189] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/15/2020] [Revised: 07/26/2021] [Accepted: 01/18/2022] [Indexed: 11/16/2022]
|
22
|
Strömer A, Staerk C, Klein N, Weinhold L, Titze S, Mayr A. Deselection of base-learners for statistical boosting - with an application to distributional regression. Stat Methods Med Res 2021; 31:207-224. [PMID: 34882438 DOI: 10.1177/09622802211051088] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
Abstract
We present a new procedure for enhanced variable selection for component-wise gradient boosting. Statistical boosting is a computational approach that emerged from machine learning and allows regression models to be fitted in the presence of high-dimensional data. Furthermore, the algorithm can lead to data-driven variable selection. In practice, however, the final models tend to include too many variables in some situations. This occurs particularly for low-dimensional data (p<n), where we observe a slow overfitting behavior of boosting. As a result, more variables get included into the final model without altering the prediction accuracy. Many of these false positives are incorporated with a small coefficient and therefore have a small impact, but lead to a larger model. We try to overcome this issue by giving the algorithm the chance to deselect base-learners with minor importance. We analyze the impact of the new approach on variable selection and prediction performance in comparison to alternative methods, including boosting with earlier stopping as well as twin boosting. We illustrate our approach with data from an ongoing cohort study of chronic kidney disease patients, where the most influential predictors for the health-related quality of life measure are selected in a distributional regression approach based on beta regression.
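The idea of letting boosting drop base-learners of minor importance can be sketched as follows. This is an illustrative reading of the abstract, not the authors' procedure: it assumes component-wise L2 boosting with simple linear base-learners, measures importance as each base-learner's share of the total risk (residual sum of squares) reduction, and refits using only the base-learners whose share reaches a threshold `tau`. The function names, `tau`, and the default settings are hypothetical.

```python
import numpy as np


def componentwise_boost(X, y, active, n_iter=200, nu=0.1):
    """Component-wise L2 boosting restricted to the columns flagged in `active`.

    Returns the intercept, the coefficients, and the risk reduction per column.
    """
    means = X.mean(axis=0)
    Xc = X - means  # centred columns, so no per-learner intercept is needed
    p = X.shape[1]
    coef = np.zeros(p)
    risk_reduction = np.zeros(p)
    resid = y - y.mean()
    norms = np.sum(Xc ** 2, axis=0)
    for _ in range(n_iter):
        slopes = Xc.T @ resid / norms      # least-squares fit of each column
        gains = slopes ** 2 * norms        # RSS reduction of a full step
        gains[~active] = -np.inf           # deselected columns cannot be chosen
        j = int(np.argmax(gains))
        before = np.sum(resid ** 2)
        coef[j] += nu * slopes[j]          # shrunken update of the best column
        resid = resid - nu * slopes[j] * Xc[:, j]
        risk_reduction[j] += before - np.sum(resid ** 2)
    intercept = y.mean() - coef @ means
    return intercept, coef, risk_reduction


def boost_with_deselection(X, y, tau=0.01, **kwargs):
    """Boost once, drop columns with a risk-reduction share below tau, then refit."""
    active = np.ones(X.shape[1], dtype=bool)
    _, _, rr = componentwise_boost(X, y, active, **kwargs)
    keep = rr / rr.sum() >= tau            # assumed importance rule
    intercept, coef, _ = componentwise_boost(X, y, keep, **kwargs)
    return intercept, coef, keep
```

In this sketch the refit uses the same number of iterations as the initial fit; whether and how the stopping iteration should be retuned after deselection is a design choice left open here.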
Affiliation(s)
- Annika Strömer
- Department of Medical Biometrics, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Germany
- Christian Staerk
- Department of Medical Biometrics, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Germany
- Nadja Klein
- Emmy Noether Research Group in Statistics and Data Science, Humboldt-Universität zu Berlin, Germany
- Leonie Weinhold
- Department of Medical Biometrics, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Germany
- Stephanie Titze
- Department of Nephrology and Hypertension, FAU Erlangen-Nuremberg, Germany
- Andreas Mayr
- Department of Medical Biometrics, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Germany
|
23
|
Abstract
Ranking problems, also known as preference learning problems, define a widespread class of statistical learning problems with many applications, including fraud detection, document ranking, medicine, chemistry, credit risk screening, image ranking or media memorability. While there already exist reviews concentrating on specific types of ranking problems like label and object ranking problems, there does not yet seem to exist an overview concentrating on instance ranking problems that includes both developments in distinguishing between different types of instance ranking problems and careful discussions about their differences and the applicability of the existing ranking algorithms to them. In instance ranking, one explicitly takes the responses into account with the goal to infer a scoring function which directly maps feature vectors to real-valued ranking scores, in contrast to object ranking problems where the ranks are given as preference information with the goal to learn a permutation. In this article, we systematically review different types of instance ranking problems and the corresponding loss functions or goodness criteria. We discuss the difficulties when trying to optimize those criteria. To provide a detailed and comprehensive overview of existing machine learning techniques to solve such ranking problems, we systematize existing techniques and recapitulate the corresponding optimization problems in a unified notation. We also discuss to which of the instance ranking problems the respective algorithms are tailored and identify their strengths and limitations. Computational aspects and open research problems are also considered.
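As a rough illustration of the instance ranking setting described above, the sketch below evaluates a scoring function with one common pairwise criterion: the share of pairs whose order by score disagrees with their order by response (often called a pairwise or hard ranking loss). The abstract does not specify this formula, so the tie handling and both function names are assumptions of this example.

```python
from itertools import combinations


def hard_ranking_loss(responses, scores):
    """Share of comparable pairs that the scores order incorrectly.

    Pairs with equal responses are skipped; tied scores count as half an error.
    """
    errors, comparable = 0.0, 0
    for i, j in combinations(range(len(responses)), 2):
        if responses[i] == responses[j]:
            continue
        comparable += 1
        agreement = (responses[i] - responses[j]) * (scores[i] - scores[j])
        if agreement < 0:
            errors += 1.0
        elif agreement == 0:
            errors += 0.5
    return errors / comparable if comparable else 0.0


def linear_score(weights, features):
    """Hypothetical scoring function: a weighted sum of the features."""
    return sum(w * x for w, x in zip(weights, features))
```

A loss of 0 means the scores reproduce the ordering of the responses exactly, and a loss of 1 means they reverse it. Because the criterion is piecewise constant in the scores, it cannot be minimized directly with gradient methods, which is one of the optimization difficulties the abstract alludes to.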
|
24
|
Predicting Physician Consultations for Low Back Pain Using Claims Data and Population-Based Cohort Data - An Interpretable Machine Learning Approach. Int J Environ Res Public Health 2021; 18:ijerph182212013. [PMID: 34831773 PMCID: PMC8622753 DOI: 10.3390/ijerph182212013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2021] [Revised: 10/24/2021] [Accepted: 11/12/2021] [Indexed: 11/17/2022]
Abstract
(1) Background: Predicting chronic low back pain (LBP) is of clinical and economic interest as LBP leads to disabilities and health service utilization. This study aims to build a competitive and interpretable prediction model; (2) Methods: We used clinical and claims data of 3837 participants of a population-based cohort study to predict future LBP consultations (ICD-10: M40.XX-M54.XX). Best subset selection (BSS) was applied in repeated random samples of training data (75% of data); scoring rules were used to identify the best subset of predictors. The prediction accuracy of BSS was compared to random forest and support vector machines (SVM) in the validation data (25% of data); (3) Results: The best subset comprised 16 out of 32 predictors. Previous occurrence of LBP increased the odds for future LBP consultations (odds ratio (OR) 6.91 [5.05; 9.45]), while concomitant diseases reduced the odds (1 vs. 0, OR: 0.74 [0.57; 0.98], >1 vs. 0: 0.37 [0.21; 0.67]). The area-under-curve (AUC) of BSS was acceptable (0.78 [0.74; 0.82]) and comparable with SVM (0.78 [0.74; 0.82]) and random forest (0.79 [0.75; 0.83]); (4) Conclusions: Regarding prediction accuracy, BSS has been considered competitive with established machine-learning approaches. Nonetheless, considerable misclassification is inherent and further refinements are required to improve predictions.
|
25
|
Koutroulis G, Botler L, Mutlu B, Diwold K, Römer K, Kern R. KOMPOS: Connecting Causal Knots in Large Nonlinear Time Series with Non-Parametric Regression Splines. ACM Trans Intell Syst Technol 2021. [DOI: 10.1145/3480971] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
Abstract
Recovering causality from copious time series data beyond mere correlations has been an important contributing factor in numerous scientific fields. Most existing works assume linearity in the data that may not comply with many real-world scenarios. Moreover, it is usually not sufficient to solely infer the causal relationships. Identifying the correct time delay of cause-effect is extremely vital for further insight and effective policies in inter-disciplinary domains. To bridge this gap, we propose KOMPOS, a novel algorithmic framework that combines a powerful concept from causal discovery of additive noise models with graphical ones. We primarily build our structural causal model from multivariate adaptive regression splines with inherent additive local nonlinearities, which render the underlying causal structure more easily identifiable. In contrast to other methods, our approach is not restricted to Gaussian or non-Gaussian noise due to the non-parametric attribute of the regression method. We conduct extensive experiments on both synthetic and real-world datasets, demonstrating the superiority of the proposed algorithm over existing causal discovery methods, especially for the challenging cases of autocorrelated and non-stationary time series.
Affiliation(s)
- Leo Botler
- Graz University of Technology, Graz, Austria
- Konrad Diwold
- Pro2Future GmbH and Graz University of Technology, Graz, Austria
- Kay Römer
- Graz University of Technology, Graz, Austria
- Roman Kern
- Graz University of Technology, Graz, Austria
|
26
|
Staerk C, Mayr A. Randomized boosting with multivariable base-learners for high-dimensional variable selection and prediction. BMC Bioinformatics 2021; 22:441. [PMID: 34530737 PMCID: PMC8447543 DOI: 10.1186/s12859-021-04340-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/24/2021] [Accepted: 08/24/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Statistical boosting is a computational approach to select and estimate interpretable prediction models for high-dimensional biomedical data, leading to implicit regularization and variable selection when combined with early stopping. Traditionally, the set of base-learners is fixed for all iterations and consists of simple regression learners including only one predictor variable at a time. Furthermore, the number of iterations is typically tuned by optimizing the predictive performance, leading to models which often include unnecessarily large numbers of noise variables. RESULTS We propose three consecutive extensions of classical component-wise gradient boosting. In the first extension, called Subspace Boosting (SubBoost), base-learners can consist of several variables, allowing for multivariable updates in a single iteration. To compensate for the larger flexibility, the ultimate selection of base-learners is based on information criteria leading to an automatic stopping of the algorithm. As the second extension, Random Subspace Boosting (RSubBoost) additionally includes a random preselection of base-learners in each iteration, enabling the scalability to high-dimensional data. In a third extension, called Adaptive Subspace Boosting (AdaSubBoost), an adaptive random preselection of base-learners is considered, focusing on base-learners which have proven to be predictive in previous iterations. Simulation results show that the multivariable updates in the three subspace algorithms are particularly beneficial in cases of high correlations among signal covariates. In several biomedical applications the proposed algorithms tend to yield sparser models than classical statistical boosting, while showing a very competitive predictive performance also compared to penalized regression approaches like the (relaxed) lasso and the elastic net. 
CONCLUSIONS The proposed randomized boosting approaches with multivariable base-learners are promising extensions of statistical boosting, particularly suited for highly-correlated and sparse high-dimensional settings. The incorporated selection of base-learners via information criteria induces automatic stopping of the algorithms, promoting sparser and more interpretable prediction models.
Affiliation(s)
- Christian Staerk
- Department of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Venusberg-Campus 1, 53127, Bonn, Germany
- Andreas Mayr
- Department of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Venusberg-Campus 1, 53127, Bonn, Germany
|
27
|
Freijeiro‐González L, Febrero‐Bande M, González‐Manteiga W. A Critical Review of LASSO and Its Derivatives for Variable Selection Under Dependence Among Covariates. Int Stat Rev 2021. [DOI: 10.1111/insr.12469] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/27/2022]
Affiliation(s)
- Laura Freijeiro‐González
- Department of Statistics, Mathematical Analysis and Optimization, Santiago de Compostela University, Santiago de Compostela, Spain
- Manuel Febrero‐Bande
- Department of Statistics, Mathematical Analysis and Optimization, Santiago de Compostela University, Santiago de Compostela, Spain
- Wenceslao González‐Manteiga
- Department of Statistics, Mathematical Analysis and Optimization, Santiago de Compostela University, Santiago de Compostela, Spain
|
28
|
Kidney Allograft Function Is a Confounder of Urine Metabolite Profiles in Kidney Allograft Recipients. Metabolites 2021; 11:metabo11080533. [PMID: 34436474 PMCID: PMC8399888 DOI: 10.3390/metabo11080533] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/15/2021] [Revised: 07/24/2021] [Accepted: 08/03/2021] [Indexed: 11/17/2022]
Abstract
Noninvasive biomarkers of kidney allograft status can help minimize the need for standard of care kidney allograft biopsies. Metabolites that are measured in the urine may inform about kidney function and health status, and potentially identify rejection events. To test these hypotheses, we conducted a metabolomics study of biopsy-matched urine cell-free supernatants from kidney allograft recipients who were diagnosed with two major types of acute rejections and no-rejection controls. Non-targeted metabolomics data for 674 metabolites and 577 unidentified molecules, for 192 biopsy-matched urine samples, were analyzed. Univariate and multivariate analyses identified metabolite signatures for kidney allograft rejection. The replicability of a previously developed urine metabolite signature was examined. Our study showed that metabolite profiles can serve as biomarkers for discriminating rejection biopsies from biopsies without rejection features, but also revealed a role of estimated Glomerular Filtration Rate (eGFR) as a major confounder of the metabolite signal.
|
29
|
Huemer MT, Bauer A, Petrera A, Scholz M, Hauck SM, Drey M, Peters A, Thorand B. Proteomic profiling of low muscle and high fat mass: a machine learning approach in the KORA S4/FF4 study. J Cachexia Sarcopenia Muscle 2021; 12:1011-1023. [PMID: 34151535 PMCID: PMC8350207 DOI: 10.1002/jcsm.12733] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/28/2021] [Revised: 05/12/2021] [Accepted: 05/21/2021] [Indexed: 12/14/2022]
Abstract
BACKGROUND The coexistence of low muscle mass and high fat mass, two interrelated conditions strongly associated with declining health status, has been characterized by only a few protein biomarkers. High-throughput proteomics enable concurrent measurement of numerous proteins, facilitating the discovery of potentially new biomarkers. METHODS Data derived from the prospective population-based Cooperative Health Research in the Region of Augsburg S4/FF4 cohort study (median follow-up time: 13.5 years) included 1478 participants (756 men and 722 women) aged 55-74 years in the cross-sectional and 608 participants (315 men and 293 women) in the longitudinal analysis. Appendicular skeletal muscle mass (ASMM) and body fat mass index (BFMI) were determined through bioelectrical impedance analysis at baseline and follow-up. At baseline, 233 plasma proteins were measured using proximity extension assay. We implemented boosting with stability selection to enable false positives-controlled variable selection to identify new protein biomarkers of low muscle mass, high fat mass, and their combination. We evaluated prediction models developed based on group least absolute shrinkage and selection operator (lasso) with 100× bootstrapping by cross-validated area under the curve (AUC) to investigate if proteins increase the prediction accuracy on top of classical risk factors. RESULTS In the cross-sectional analysis, we identified kallikrein-6, C-C motif chemokine 28 (CCL28), and tissue factor pathway inhibitor as previously unknown biomarkers for muscle mass [association with low ASMM: odds ratio (OR) per 1-SD increase in log2 normalized protein expression values (95% confidence interval (CI)): 1.63 (1.37-1.95), 1.31 (1.14-1.51), 1.24 (1.06-1.45), respectively] and serine protease 27 for fat mass [association with high BFMI: OR (95% CI): 0.73 (0.61-0.86)]. 
CCL28 and metalloproteinase inhibitor 4 (TIMP4) constituted new biomarkers for the combination of low muscle and high fat mass [association with low ASMM combined with high BFMI: OR (95% CI): 1.32 (1.08-1.61), 1.28 (1.03-1.59), respectively]. Including protein biomarkers selected in ≥90% of group lasso bootstrap iterations on top of classical risk factors improved the performance of models predicting low ASMM, high BFMI, and their combination [delta AUC (95% CI): 0.16 (0.13-0.20), 0.22 (0.18-0.25), 0.12 (0.08-0.17), respectively]. In the longitudinal analysis, N-terminal prohormone brain natriuretic peptide (NT-proBNP) was the only protein selected for loss in ASMM and loss in ASMM combined with gain in BFMI over 14 years [OR (95% CI): 1.40 (1.10-1.77), 1.60 (1.15-2.24), respectively]. CONCLUSIONS Proteomic profiling revealed CCL28 and TIMP4 as new biomarkers of low muscle mass combined with high fat mass and NT-proBNP as a key biomarker of loss in muscle mass combined with gain in fat mass. Proteomics enable us to accelerate biomarker discoveries in muscle research.
Affiliation(s)
- Marie-Theres Huemer
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Alina Bauer
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Agnese Petrera
- Research Unit Protein Science, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Markus Scholz
- Institute for Medical Informatics, Statistics and Epidemiology (IMISE), Universität Leipzig, Leipzig, Germany
- Stefanie M Hauck
- Research Unit Protein Science, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Michael Drey
- Medizinische Klinik und Poliklinik IV, Schwerpunkt Akutgeriatrie, Klinikum der Universität München (LMU), Munich, Germany
- Annette Peters
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Neuherberg, Germany; German Center for Diabetes Research (DZD), München-Neuherberg, Germany; Chair of Epidemiology, Institute for Medical Information Processing, Biometry and Epidemiology, Medical Faculty, Ludwig-Maximilians-Universität München, Munich, Germany
- Barbara Thorand
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Neuherberg, Germany; German Center for Diabetes Research (DZD), München-Neuherberg, Germany
|
30
|
Soerensen M, Debrabant B, Halekoh U, Møller JE, Hassager C, Frydland M, Hjelmborg J, Beck HC, Rasmussen LM. Does diabetes modify the effect of heparin on plasma proteins? - A proteomic search for plasma protein biomarkers for diabetes-related endothelial dysfunction. J Diabetes Complications 2021; 35:107906. [PMID: 33785251 DOI: 10.1016/j.jdiacomp.2021.107906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2020] [Revised: 02/11/2021] [Accepted: 03/07/2021] [Indexed: 11/23/2022]
Abstract
AIM Heparin administration affects the concentrations of many plasma proteins through their displacement from the endothelial glycocalyx. A differentiated protein response in diabetes will therefore, at least partly, reflect glycocalyx changes. This study aims at identifying biomarkers of endothelial dysfunction in diabetes by statistical exploration of plasma proteome data for interactions between diabetes status and heparin treatment. METHODS Diabetes-by-heparin interactions in relation to protein levels were inspected by regression modelling in plasma proteome data from 497 patients admitted for acute angiography. Analyses were conducted separately for all 273 proteins and as set-based analyses of 44 heparin-relevant proteins identified by gene ontology analysis and 42 heparin-influenced proteins previously reported. RESULTS Seventy-five patients had diabetes and 361 received heparin before hospitalization. In the proteome-wide analysis, no protein showed a diabetes-heparin interaction that passed correction for multiple testing. The overall set-based analyses revealed significant association for both protein sets (p-values < 2×10⁻⁴), while constraining on opposite directions of effect in diabetics and non-diabetics was insignificant (p-values = 0.11 and 0.17). CONCLUSIONS Our plasma proteome-wide interaction approach supports that diabetes influences heparin effects on protein levels; however, the direction of effects and individual proteins could not be definitively pinpointed, likely reflecting a complex protein basis for glycocalyx dysfunction in diabetes.
Affiliation(s)
- Mette Soerensen
- Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, J.B. Winsløws Vej 9B, 5000 Odense C, Denmark; Center for Individualized Medicine in Arterial Diseases, Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, J.B. Winsløws Vej 4, 5000 Odense C, Denmark; Department of Clinical Genetics, Odense University Hospital, J.B. Winsløws Vej 4, 5000 Odense C, Denmark
- Birgit Debrabant
- Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, J.B. Winsløws Vej 9B, 5000 Odense C, Denmark
- Ulrich Halekoh
- Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, J.B. Winsløws Vej 9B, 5000 Odense C, Denmark
- Jacob Eifer Møller
- Department of Clinical Cardiology, Odense University Hospital, J.B. Winsløws Vej 4, 5000 Odense C, Denmark; Department of Cardiology, Rigshospitalet, Blegdamsvej 9, 2100 Copenhagen Ø, Denmark; Department of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
- Christian Hassager
- Department of Cardiology, Rigshospitalet, Blegdamsvej 9, 2100 Copenhagen Ø, Denmark; Department of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
- Martin Frydland
- Department of Cardiology, Rigshospitalet, Blegdamsvej 9, 2100 Copenhagen Ø, Denmark; Department of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark
- Jacob Hjelmborg
- Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, J.B. Winsløws Vej 9B, 5000 Odense C, Denmark
- Hans Christian Beck
- Center for Individualized Medicine in Arterial Diseases, Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, J.B. Winsløws Vej 4, 5000 Odense C, Denmark
- Lars Melholt Rasmussen
- Center for Individualized Medicine in Arterial Diseases, Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, J.B. Winsløws Vej 4, 5000 Odense C, Denmark
|
31
|
Tozzo V, Azencott CA, Fiorini S, Fava E, Trucco A, Barla A. Where Do We Stand in Regularization for Life Science Studies? J Comput Biol 2021; 29:213-232. [PMID: 33926217 PMCID: PMC8968832 DOI: 10.1089/cmb.2019.0371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
Abstract
More and more biologists and bioinformaticians turn to machine learning to analyze large amounts of data. In this context, it is crucial to understand which is the most suitable data analysis pipeline for achieving reliable results. This process may be challenging, due to a variety of factors, the most crucial ones being the data type and the general goal of the analysis (e.g., explorative or predictive). Life science data sets require further consideration as they often contain measures with a low signal-to-noise ratio, high-dimensional observations, and relatively few samples. In this complex setting, regularization, which can be defined as the introduction of additional information to solve an ill-posed problem, is the tool of choice to obtain robust models. Different regularization practices may be used depending both on characteristics of the data and of the question asked, and different choices may lead to different results. In this article, we provide a comprehensive description of the impact and importance of regularization techniques in life science studies. In particular, we provide an intuition of what regularization is and of the different ways it can be implemented and exploited. We propose four general life sciences problems in which regularization is fundamental and should be exploited for robustness. For each of these large families of problems, we enumerate different techniques as well as examples and case studies. Lastly, we provide a unified view of how to approach each data type with various regularization techniques.
Affiliation(s)
- Veronica Tozzo
- Department of Informatics, Bioengineering, Robotics and System Engineering-DIBRIS, University of Genoa, Genoa, Italy
- Chloé-Agathe Azencott
- Centre for Computational Biology-CBIO, MINES ParisTech, PSL Research University, Paris, France; Institut Curie, PSL Research University, Paris, France; INSERM, U900, Paris, France
- Emanuele Fava
- Department of Electrical, Electronic, Telecommunications Engineering, and Naval Architecture (DITEN), University of Genoa, Genoa, Italy
- Andrea Trucco
- Department of Electrical, Electronic, Telecommunications Engineering, and Naval Architecture (DITEN), University of Genoa, Genoa, Italy
- Annalisa Barla
- Department of Informatics, Bioengineering, Robotics and System Engineering-DIBRIS, University of Genoa, Genoa, Italy
|
32
|
Seifer DB, Petok WD, Agrawal A, Glenn TL, Bayer AH, Witt BR, Burgin BD, Lieman HJ. Psychological experience and coping strategies of patients in the Northeast US delaying care for infertility during the COVID-19 pandemic. Reprod Biol Endocrinol 2021; 19:28. [PMID: 33618732 PMCID: PMC7899935 DOI: 10.1186/s12958-021-00721-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/14/2021] [Accepted: 02/17/2021] [Indexed: 12/21/2022]
Abstract
BACKGROUND On March 17, 2020 an expert ASRM task force recommended the temporary suspension of new, non-urgent fertility treatments during an ongoing world-wide pandemic of Covid-19. We surveyed at the time of resumption of fertility care the psychological experience and coping strategies of patients pausing their care due to Covid-19 and examined which factors were associated with and predictive of resilience, anxiety, stress and hopefulness. METHODS Cross sectional cohort patient survey using an anonymous, self-reported, single time, web-based, HIPAA compliant platform (REDCap). Survey sampled two Northeast academic fertility practices (Yale Medicine Fertility Center in CT and Montefiore's Institute for Reproductive Medicine and Health in NY). Data from multiple choice and open response questions collected demographic, reproductive history, experience and attitudes about Covid-19, prior infertility treatment, sense of hopefulness and stress, coping strategies for mitigating stress and two validated psychological surveys to assess anxiety (six-item short-form State Trait Anxiety Inventory (STAI-6)) and resilience (10-item Connor-Davidson Resilience Scale (CD-RISC-10)). RESULTS Seven hundred thirty-four patients were sent invitations to participate. Two hundred fourteen of 734 (29.2%) completed the survey. Patients reported their fertility journey had been delayed a mean of 10 weeks while 60% had been actively trying to conceive > 1.5 years. The top 5 ranked coping skills from a choice of 19 were establishing a daily routine, going outside regularly, exercising, maintaining social connection via phone, social media or Zoom and continuing to work. Having a history of anxiety (p < 0.0001) and having received oral medication as prior infertility treatment (p < 0.0001) were associated with lower resilience.
Increased hopefulness about having a child at the time of completing the survey (p < 0.0001) and higher resilience scores (p < 0.0001) were associated with decreased anxiety. Higher reported stress scores (p < 0.0001) were associated with increased anxiety. Multiple multivariate regression showed being non-Hispanic black (p = 0.035) to be predictive of more resilience while variables predictive of less resilience were being a full-time homemaker (p = 0.03), having received oral medication as prior infertility treatment (p = 0.003) and having higher scores on the STAI-6 (p < 0.0001). CONCLUSIONS Prior to and in anticipation of further pauses in treatment the clinical staff should consider pretreatment screening for psychological distress and provide referral sources. In addition, utilization of a patient centered approach to care should be employed.
Affiliation(s)
- David B. Seifer
- Department Obstetrics, Gynecology and Reproductive Sciences, Yale School of Medicine, New Haven, CT USA
- William D. Petok
- Department of Obstetrics and Gynecology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA USA
- Alisha Agrawal
- Department of Psychiatry, Dartmouth Hitchcock Medical Center, Lebanon, New Hampshire USA
- Tanya L. Glenn
- Department Obstetrics, Gynecology and Reproductive Sciences, Yale School of Medicine, New Haven, CT USA
- Arielle H. Bayer
- Department Obstetrics & Gynecology and Women’s Health, Albert Einstein School of Medicine, Bronx, NY USA
- Barry R. Witt
- Department Obstetrics, Gynecology and Reproductive Sciences, Yale School of Medicine, New Haven, CT USA
- Blair D. Burgin
- Department of Psychology, La Salle University, Philadelphia, PA USA
- Harry J. Lieman
- Department Obstetrics & Gynecology and Women’s Health, Albert Einstein School of Medicine, Bronx, NY USA
|
33
|
Halama A, Oliveira JM, Filho SA, Qasim M, Achkar IW, Johnson S, Suhre K, Vinardell T. Metabolic Predictors of Equine Performance in Endurance Racing. Metabolites 2021; 11:metabo11020082. [PMID: 33572513 PMCID: PMC7912089 DOI: 10.3390/metabo11020082] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/22/2020] [Revised: 01/23/2021] [Accepted: 01/26/2021] [Indexed: 11/16/2022]
Abstract
Equine performance in endurance racing depends on the interplay between physiological and metabolic processes. However, there is currently no parameter for estimating the readiness of animals for competition. Our objectives were to provide an in-depth characterization of metabolic consequences of endurance racing and to establish a metabolic performance profile for those animals. We monitored metabolite composition, using a broad non-targeted metabolomics approach, in blood plasma samples from 47 Arabian horses participating in endurance races. The samples were collected before and after the competition and a total of 792 metabolites were measured. We found significant alterations between before and after the race in 417 molecules involved in lipid and amino acid metabolism. Further, even before the race started, we found metabolic differences between animals that completed the race and those that did not. We identified a set of six metabolite predictors (imidazole propionate, pipecolate, ethylmalonate, 2R-3R-dihydroxybutyrate, β-hydroxy-isovalerate and X-25455) of animal performance in endurance competition; the resulting model had an area under the receiver operating characteristic curve (AUC) of 0.92 (95% CI: 0.85-0.98). This study provides an in-depth characterization of metabolic alterations driven by endurance races in equines. Furthermore, we showed the feasibility of identifying potential metabolic signatures as predictors of animal performance in endurance competition.
Affiliation(s)
- Anna Halama
- Department of Physiology and Biophysics, Weill Cornell Medicine-Qatar, Doha 24144, Qatar
- Joao M. Oliveira
- Equine Veterinary Medical Center, Qatar Foundation, Doha 5825, Qatar
- Silvio A. Filho
- Department of Endurance Racing, Al Shaqab, Doha 36623, Qatar
- Muhammad Qasim
- Equine Veterinary Medical Center, Qatar Foundation, Doha 5825, Qatar
- Iman W. Achkar
- Department of Physiology and Biophysics, Weill Cornell Medicine-Qatar, Doha 24144, Qatar
- Sarah Johnson
- Equine Veterinary Medical Center, Qatar Foundation, Doha 5825, Qatar
- Karsten Suhre
- Department of Physiology and Biophysics, Weill Cornell Medicine-Qatar, Doha 24144, Qatar
- Tatiana Vinardell
- Equine Veterinary Medical Center, Qatar Foundation, Doha 5825, Qatar
- College of Health and Life Sciences, Hamad Bin Khalifa University, Member of Qatar Foundation, Doha 34110, Qatar
|
34
|
Lima E, Hyde R, Green M. Model selection for inferential models with high dimensional data: synthesis and graphical representation of multiple techniques. Sci Rep 2021; 11:412. [PMID: 33431921 PMCID: PMC7801732 DOI: 10.1038/s41598-020-79317-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/11/2020] [Accepted: 12/07/2020] [Indexed: 12/18/2022] Open
Abstract
Inferential research commonly involves identification of causal factors from within high dimensional data, but selection of the 'correct' variables can be problematic. One specific problem is that results vary depending on the statistical method employed, and it has been argued that triangulation of multiple methods is advantageous to safely identify the correct, important variables. To date, no formal method of triangulation has been reported that incorporates both model stability and coefficient estimates; in this paper we develop an adaptable, straightforward method to achieve this. Six methods of variable selection were evaluated using simulated datasets of different dimensions with known underlying relationships. We used a bootstrap methodology to combine stability matrices across methods and estimate aggregated coefficient distributions. Novel graphical approaches provided a transparent route to visualise and compare results between methods. The proposed aggregated method provides a flexible route to formally triangulate results across any chosen number of variable selection methods and provides a combined result that incorporates uncertainty arising from between-method variability. In these simulated datasets, the combined method generally performed as well as or better than the individual methods, with low error rates and clearer demarcation of the true causal variables.
Affiliation(s)
- Eliana Lima
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire, LE12 5RD, UK
- Robert Hyde
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire, LE12 5RD, UK
- Martin Green
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire, LE12 5RD, UK.
|
35
|
Scelsi MA, Napolioni V, Greicius MD, Altmann A. Network propagation of rare variants in Alzheimer's disease reveals tissue-specific hub genes and communities. PLoS Comput Biol 2021; 17:e1008517. [PMID: 33411734 PMCID: PMC7817020 DOI: 10.1371/journal.pcbi.1008517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2020] [Revised: 01/20/2021] [Accepted: 11/10/2020] [Indexed: 11/18/2022] Open
Abstract
State-of-the-art rare variant association testing methods aggregate the contribution of rare variants in biologically relevant genomic regions to boost statistical power. However, testing single genes separately does not consider the complex interaction landscape of genes, nor the downstream effects of non-synonymous variants on protein structure and function. Here we present the NETwork Propagation-based Assessment of Genetic Events (NETPAGE), an integrative approach aimed at investigating the biological pathways through which rare variation results in complex disease phenotypes. We applied NETPAGE to sporadic, late-onset Alzheimer's disease (AD), using whole-genome sequencing from the AD Neuroimaging Initiative (ADNI) cohort, as well as whole-exome sequencing from the AD Sequencing Project (ADSP). NETPAGE is based on network propagation, a framework that models information flow on a graph and simulates the percolation of genetic variation through tissue-specific gene interaction networks. The result of network propagation is a set of smoothed gene scores that can be tested for association with disease status through sparse regression. The application of NETPAGE to AD enabled the identification of a set of connected genes whose smoothed variation profile was robustly associated with case-control status, based on gene interactions in the hippocampus. Additionally, smoothed scores correlated significantly with risk of conversion to AD in Mild Cognitive Impairment (MCI) subjects. Lastly, we investigated tissue-specific transcriptional dysregulation of the core genes in two independent RNA-seq datasets, as well as significant enrichments in terms of gene sets with known connections to AD. We present a framework that enables enhanced genetic association testing for a wide range of traits, diseases, and sample sizes.
Affiliation(s)
- Marzia Antonella Scelsi
- Centre for Medical Image Computing, Department of Medical Physics and Biomedical Engineering, University College London, London, United Kingdom
- Valerio Napolioni
- Functional Imaging in Neuropsychiatric Disorders (FIND) Lab, Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, United States of America
- Michael D Greicius
- Functional Imaging in Neuropsychiatric Disorders (FIND) Lab, Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, United States of America
- Andre Altmann
- Centre for Medical Image Computing, Department of Medical Physics and Biomedical Engineering, University College London, London, United Kingdom
|
36
|
Borchert C, Herman A, Roth M, Brooks AC, Friedenberg SG. RNA sequencing of whole blood in dogs with primary immune-mediated hemolytic anemia (IMHA) reveals novel insights into disease pathogenesis. PLoS One 2020; 15:e0240975. [PMID: 33091028 PMCID: PMC7580939 DOI: 10.1371/journal.pone.0240975] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/22/2020] [Accepted: 10/06/2020] [Indexed: 11/29/2022] Open
Abstract
Immune-mediated hemolytic anemia (IMHA) is a life-threatening autoimmune disorder characterized by a self-mediated attack on circulating red blood cells. The disease occurs naturally in both dogs and humans, but is significantly more prevalent in dogs. Because of its shared features across species, dogs offer a naturally occurring model for studying IMHA in people. In this study, we used RNA sequencing of whole blood from treatment-naïve dogs to study transcriptome-wide changes in gene expression in newly diagnosed animals compared to healthy controls. We found many overexpressed genes in pathways related to neutrophil function, coagulation, and hematopoiesis. In particular, the most highly overexpressed gene in cases was a phospholipase scramblase, which mediates the externalization of phosphatidylserine from the inner to the outer leaflet of cell membranes. This family of genes has been shown to be critically important for programmed cell death of erythrocytes as well as the initiation of the clotting cascade. Unexpectedly, we found marked underexpression of many genes related to lymphocyte function. We also identified groups of genes that are highly associated with the inflammatory response and red blood cell regeneration in affected dogs. We did not find any genes that distinguished dogs that lived vs. those that died at 30 days following diagnosis, nor did we find any relevant genomic signatures of microbial organisms in the blood of affected animals. Future studies are warranted to validate these findings and assess their implication in developing novel therapeutic approaches for dogs and humans with IMHA.
Affiliation(s)
- Corie Borchert
- Department of Veterinary Clinical Sciences, University of Minnesota College of Veterinary Medicine, St. Paul, Minnesota, United States of America
- Adam Herman
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota, United States of America
- Megan Roth
- Department of Veterinary Clinical Sciences, University of Minnesota College of Veterinary Medicine, St. Paul, Minnesota, United States of America
- Aimee C. Brooks
- Department of Veterinary Clinical Sciences, Purdue University College of Veterinary Medicine, West Lafayette, Indiana, United States of America
- Steven G. Friedenberg
- Department of Veterinary Clinical Sciences, University of Minnesota College of Veterinary Medicine, St. Paul, Minnesota, United States of America
|
37
|
Adde A, Darveau M, Barker N, Cumming S. Predicting spatiotemporal abundance of breeding waterfowl across Canada: A Bayesian hierarchical modelling approach. DIVERS DISTRIB 2020. [DOI: 10.1111/ddi.13129] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022] Open
Affiliation(s)
- Antoine Adde
- Department of Wood and Forest Sciences Laval University Quebec QC Canada
- Marcel Darveau
- Department of Wood and Forest Sciences Laval University Quebec QC Canada
- Ducks Unlimited Canada Quebec QC Canada
- Nicole Barker
- Canadian Wildlife Service Environment and Climate Change Canada Edmonton AB Canada
- Steven Cumming
- Department of Wood and Forest Sciences Laval University Quebec QC Canada
|
38
|
Cohen AS, Cox CR, Le TP, Cowan T, Masucci MD, Strauss GP, Kirkpatrick B. Using machine learning of computerized vocal expression to measure blunted vocal affect and alogia. NPJ SCHIZOPHRENIA 2020; 6:26. [PMID: 32978400 PMCID: PMC7519104 DOI: 10.1038/s41537-020-00115-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/06/2020] [Accepted: 08/06/2020] [Indexed: 11/16/2022]
Abstract
Negative symptoms are a transdiagnostic feature of serious mental illness (SMI) that can be potentially “digitally phenotyped” using objective vocal analysis. In prior studies, vocal measures show low convergence with clinical ratings, potentially because analysis has used small, constrained acoustic feature sets. We sought to evaluate (1) whether clinically rated blunted vocal affect (BvA)/alogia could be accurately modelled using machine learning (ML) with a large feature set from two separate tasks (i.e., a 20-s “picture” and a 60-s “free-recall” task), (2) whether “Predicted” BvA/alogia (computed from the ML model) are associated with demographics, diagnosis, psychiatric symptoms, and cognitive/social functioning, and (3) which key vocal features are central to BvA/Alogia ratings. Accuracy was high (>90%) and was improved when computed separately by speaking task. ML scores were associated with poor cognitive performance and social functioning and were higher in patients with schizophrenia versus depression or mania diagnoses. However, the features identified as most predictive of BvA/Alogia were generally not considered critical to their operational definitions. Implications for validating and implementing digital phenotyping to reduce SMI burden are discussed.
Affiliation(s)
- Alex S Cohen
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Christopher R Cox
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA
- Thanh P Le
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Tovah Cowan
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Michael D Masucci
- Department of Psychology, Louisiana State University, Baton Rouge, LA, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
- Brian Kirkpatrick
- Department of Psychiatry and Behavioral Sciences, University of Nevada, Reno, USA
|
39
|
Teitsdottir UD, Jonsdottir MK, Lund SH, Darreh-Shori T, Snaedal J, Petersen PH. Association of glial and neuronal degeneration markers with Alzheimer's disease cerebrospinal fluid profile and cognitive functions. ALZHEIMERS RESEARCH & THERAPY 2020; 12:92. [PMID: 32753068 PMCID: PMC7404927 DOI: 10.1186/s13195-020-00657-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/26/2019] [Accepted: 07/21/2020] [Indexed: 01/15/2023]
Abstract
BACKGROUND Neuroinflammation has gained increasing attention as a potential contributing factor in the onset and progression of Alzheimer's disease (AD). The objective of this study was to examine the association of selected cerebrospinal fluid (CSF) inflammatory and neuronal degeneration markers with the signature CSF AD profile and cognitive functions among subjects at the symptomatic pre- and early dementia stages. METHODS In this cross-sectional study, 52 subjects were selected from an Icelandic memory clinic cohort. Subjects were classified as having an AD (n = 28, age = 70, 39% female, Mini-Mental State Examination [MMSE] = 27) or non-AD (n = 24, age = 67, 33% female, MMSE = 28) profile based on the ratio between CSF total-tau (T-tau) and amyloid-β1-42 (Aβ42) values (cut-off point chosen as 0.52). Novel CSF biomarkers included neurofilament light (NFL), YKL-40, S100 calcium-binding protein B (S100B) and glial fibrillary acidic protein (GFAP), measured with enzyme-linked immunosorbent assays (ELISAs). Subjects underwent neuropsychological assessment for evaluation of different cognitive domains, including verbal episodic memory, non-verbal episodic memory, language, processing speed, and executive functions. RESULTS An accuracy coefficient for distinguishing between the two CSF profiles was calculated for each CSF marker and test. Novel CSF markers performed poorly (area under curve [AUC] coefficients ranging from 0.61 to 0.64) compared to tests reflecting verbal episodic memory, which all performed fairly (AUC > 0.70). LASSO regression with a stability approach was applied for the selection of CSF markers and demographic variables predicting performance on each cognitive domain, both among all subjects and only those with a CSF AD profile. Relationships between CSF markers and cognitive domains, where the CSF marker reached stability selection criteria of > 75%, were visualized with scatter plots. Before calculation of the corresponding Pearson correlation coefficients, composite scores for cognitive domains were adjusted for age and education. GFAP correlated with executive functions (r = −0.37, p = 0.01) overall, while GFAP correlated with processing speed (r = −0.68, p < 0.001) and NFL with verbal episodic memory (r = −0.43, p = 0.02) among subjects with a CSF AD profile. CONCLUSIONS The novel CSF markers NFL and GFAP show potential as markers for cognitive decline among individuals with core AD pathology at the symptomatic pre- and early stages of dementia.
Affiliation(s)
- Unnur D Teitsdottir
- Faculty of Medicine, Department of Anatomy, Biomedical Center, University of Iceland, Reykjavik, Iceland.
- Maria K Jonsdottir
- Department of Psychology, Reykjavik University, Reykjavik, Iceland; Department of Psychiatry, Landspitali - National University Hospital, Reykjavik, Iceland
- Taher Darreh-Shori
- Division of Clinical Geriatrics, Center for Alzheimer Research, NVS Department, Karolinska Institutet, Huddinge, Sweden
- Jon Snaedal
- Memory clinic, Department of Geriatric Medicine, Landspitali - National University Hospital, Reykjavik, Iceland
- Petur H Petersen
- Faculty of Medicine, Department of Anatomy, Biomedical Center, University of Iceland, Reykjavik, Iceland
|
40
|
Rotival M, Siddle KJ, Silvert M, Pothlichet J, Quach H, Quintana-Murci L. Population variation in miRNAs and isomiRs and their impact on human immunity to infection. Genome Biol 2020; 21:187. [PMID: 32731901 PMCID: PMC7391576 DOI: 10.1186/s13059-020-02098-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/11/2020] [Accepted: 07/08/2020] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND MicroRNAs (miRNAs) are key regulators of the immune system, yet their variation and contribution to intra- and inter-population differences in immune responses is poorly characterized. RESULTS We generate 977 miRNA-sequencing profiles from primary monocytes from individuals of African and European ancestry following activation of three TLR pathways (TLR4, TLR1/2, and TLR7/8) or infection with influenza A virus. We find that immune activation leads to important modifications in the miRNA and isomiR repertoire, particularly in response to viral challenges. These changes are much weaker than those observed for protein-coding genes, suggesting stronger selective constraints on the miRNA response to stimulation. This is supported by the limited genetic control of miRNA expression variability (miR-QTLs) and the lower occurrence of gene-environment interactions, in stark contrast with eQTLs that are largely context-dependent. We also detect marked differences in miRNA expression between populations, which are mostly driven by non-genetic factors. On average, miR-QTLs explain approximately 60% of population differences in expression of their cognate miRNAs and, in some cases, evolve adaptively, as shown in Europeans for a miRNA-rich cluster on chromosome 14. Finally, integrating miRNA and mRNA data from the same individuals, we provide evidence that the canonical model of miRNA-driven transcript degradation has a minor impact on miRNA-mRNA correlations, which are, in our setting, mainly driven by co-transcription. CONCLUSION Together, our results shed new light onto the factors driving miRNA and isomiR diversity at the population level and constitute a useful resource for evaluating their role in host differences of immunity to infection.
Affiliation(s)
- Maxime Rotival
- Human Evolutionary Genetics Unit, Institut Pasteur, CNRS UMR 2000, 75015 Paris, France
- Katherine J. Siddle
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138 USA
- Martin Silvert
- Human Evolutionary Genetics Unit, Institut Pasteur, CNRS UMR 2000, 75015 Paris, France
- Sorbonne Universités, École Doctorale Complexité du Vivant, 75005 Paris, France
- Julien Pothlichet
- Human Evolutionary Genetics Unit, Institut Pasteur, CNRS UMR 2000, 75015 Paris, France
- Present Address: DIACCURATE, Institut Pasteur, 75015 Paris, France
- Hélène Quach
- Human Evolutionary Genetics Unit, Institut Pasteur, CNRS UMR 2000, 75015 Paris, France
- Present Address: UMR7206, Muséum National d’Histoire Naturelle, CNRS, Université Paris Diderot, 75016 Paris, France
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, CNRS UMR 2000, 75015 Paris, France
- Chair Human Genomics and Evolution, Collège de France, 75005 Paris, France
|
41
|
Ploner T, Heß S, Grum M, Drewe-Boss P, Walker J. Using gradient boosting with stability selection on health insurance claims data to identify disease trajectories in chronic obstructive pulmonary disease. Stat Methods Med Res 2020; 29:3684-3694. [PMID: 32646307 DOI: 10.1177/0962280220938088] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
OBJECTIVE We propose a data-driven method to detect temporal patterns of disease progression in high-dimensional claims data based on gradient boosting with stability selection. MATERIALS AND METHODS We identified patients with chronic obstructive pulmonary disease in a German health insurance claims database with 6.5 million individuals and divided them into a group of patients with the highest disease severity and a group of control patients with lower severity. We then used gradient boosting with stability selection to determine variables correlating with a chronic obstructive pulmonary disease diagnosis of highest severity and subsequently model the temporal progression of the disease using the selected variables. RESULTS We identified a network of 20 diagnoses (e.g. respiratory failure), medications (e.g. anticholinergic drugs) and procedures associated with a subsequent chronic obstructive pulmonary disease diagnosis of highest severity. Furthermore, the network successfully captured temporal patterns, such as disease progressions from lower to higher severity grades. DISCUSSION The temporal trajectories identified by our data-driven approach are compatible with existing knowledge about chronic obstructive pulmonary disease showing that the method can reliably select relevant variables in a high-dimensional context. CONCLUSION We provide a generalizable approach for the automatic detection of disease trajectories in claims data. This could help to diagnose diseases early, identify unknown risk factors and optimize treatment plans.
Affiliation(s)
- Tina Ploner
- InGef-Institute for Applied Health Research Berlin GmbH, Berlin, Germany
- Steffen Heß
- InGef-Institute for Applied Health Research Berlin GmbH, Berlin, Germany
- Philipp Drewe-Boss
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Jochen Walker
- InGef-Institute for Applied Health Research Berlin GmbH, Berlin, Germany
|
42
|
Guinot F, Szafranski M, Chiquet J, Zancarini A, Le Signor C, Mougel C, Ambroise C. Fast computation of genome-metagenome interaction effects. Algorithms Mol Biol 2020; 15:13. [PMID: 32625242 PMCID: PMC7329492 DOI: 10.1186/s13015-020-00173-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/15/2019] [Accepted: 06/17/2020] [Indexed: 01/01/2023] Open
Abstract
Motivation Association studies have been widely used to search for associations between common genetic variants and a given phenotype. However, it is now generally accepted that genes and environment must be examined jointly when estimating phenotypic variance. In this work we consider two types of biological markers: genotypic markers, which characterize an observation in terms of inherited genetic information, and metagenomic markers, which are related to the environment. Both types of markers are available in their millions and can be used to characterize any observation uniquely. Objective Our focus is on detecting interactions between groups of genetic and metagenomic markers in order to gain a better understanding of the complex relationship between environment and genome in the expression of a given phenotype. Contributions We propose a novel approach for efficiently detecting interactions between complementary datasets in a high-dimensional setting with a reduced computational cost. The method, named SICOMORE, reduces the dimension of the search space by selecting a subset of supervariables in the two complementary datasets. These supervariables are given by a weighted group structure defined on sets of variables at different scales. A Lasso selection is then applied on each type of supervariable to obtain a subset of potential interactions that will be explored via linear model testing. Results We compare SICOMORE with other approaches in simulations, with varying sample sizes, noise, and numbers of true interactions. SICOMORE exhibits convincing results in terms of recall, as well as competitive performance with respect to running time. The method is also used to detect interactions between genomic markers in Medicago truncatula and metagenomic markers in its rhizosphere bacterial community. Software availability An R package is available [4], along with its documentation and associated scripts, allowing the reader to reproduce the results presented in the paper.
|
43
|
Richter-Heitmann T, Hofner B, Krah FS, Sikorski J, Wüst PK, Bunk B, Huang S, Regan KM, Berner D, Boeddinghaus RS, Marhan S, Prati D, Kandeler E, Overmann J, Friedrich MW. Stochastic Dispersal Rather Than Deterministic Selection Explains the Spatio-Temporal Distribution of Soil Bacteria in a Temperate Grassland. Front Microbiol 2020; 11:1391. [PMID: 32695081 PMCID: PMC7338559 DOI: 10.3389/fmicb.2020.01391] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 02/03/2020] [Accepted: 05/29/2020] [Indexed: 01/15/2023] Open
Abstract
Spatial and temporal processes shaping microbial communities are inseparably linked but rarely studied together. By Illumina 16S rRNA sequencing, we monitored soil bacteria at 360 stations on a 100 square meter plot distributed across six intra-annual samplings in a rarely managed, temperate grassland. Using a multi-tiered approach, we tested the extent to which stochastic or deterministic processes influenced the composition of local communities. A combination of phylogenetic turnover analysis and null modeling demonstrated that either homogenization by unlimited stochastic dispersal or scenarios in which neither stochastic processes nor deterministic forces dominated explained local assembly processes. Thus, the majority of all sampled communities (82%) was rather homogeneous with no significant changes in abundance-weighted composition. However, we detected strong and uniform taxonomic shifts within just nine samples in early summer. Thus, community snapshots sampled from single points in time or space do not necessarily reflect a representative community state. The potential for change despite the overall homogeneity was further demonstrated when the focus shifted to the rare biosphere. Rare OTU turnover, rather than nestedness, characterized abundance-independent β-diversity. Accordingly, boosted generalized additive models encompassing spatial, temporal and environmental variables revealed strong and highly diverse effects of space on OTU abundance, even within the same genus. This pure spatial effect increased with decreasing OTU abundance and frequency, whereas soil moisture – the most important environmental variable – had an opposite effect by impacting abundant OTUs more than the rare ones. These results indicate that – despite considerable oscillation in space and time – the abundant and resident OTUs provide a community backbone that supports much higher β-diversity of a dynamic rare biosphere. Our findings reveal complex interactions among space, time, and environmental filters within bacterial communities in a long-established temperate grassland.
Affiliation(s)
- Tim Richter-Heitmann
- Microbial Ecophysiology Group, Faculty of Biology/Chemistry, University of Bremen, Bremen, Germany; International Max Planck Research School of Marine Microbiology, Max Planck Institute for Marine Microbiology, Bremen, Germany
- Benjamin Hofner
- Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Franz-Sebastian Krah
- Biodiversity Conservation, Institute for Ecology, Evolution and Diversity, Biologicum, Goethe University Frankfurt, Frankfurt am Main, Germany
- Johannes Sikorski
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Pia K Wüst
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Boyke Bunk
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Sixing Huang
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Kathleen M Regan
- Institute of Soil Science and Land Evaluation, Soil Biology Department, University of Hohenheim, Stuttgart, Germany
- Doreen Berner
- Institute of Soil Science and Land Evaluation, Soil Biology Department, University of Hohenheim, Stuttgart, Germany
- Runa S Boeddinghaus
- Institute of Soil Science and Land Evaluation, Soil Biology Department, University of Hohenheim, Stuttgart, Germany
- Sven Marhan
- Institute of Soil Science and Land Evaluation, Soil Biology Department, University of Hohenheim, Stuttgart, Germany
- Daniel Prati
- Institute of Plant Sciences, University of Bern, Bern, Switzerland
- Ellen Kandeler
- Institute of Soil Science and Land Evaluation, Soil Biology Department, University of Hohenheim, Stuttgart, Germany
- Jörg Overmann
- Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Michael W Friedrich
- Microbial Ecophysiology Group, Faculty of Biology/Chemistry, University of Bremen, Bremen, Germany
|
44
|
Mirończuk MM, Protasiewicz J. Recognising innovative companies by using a diversified stacked generalisation method for website classification. APPL INTELL 2020. [DOI: 10.1007/s10489-019-01509-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|
45
|
Cascio L, Chen CF, Pauly R, Srikanth S, Jones K, Skinner CD, Stevenson RE, Schwartz CE, Boccuto L. Abnormalities in the genes that encode Large Amino Acid Transporters increase the risk of Autism Spectrum Disorder. Mol Genet Genomic Med 2019; 8:e1036. [PMID: 31701662 PMCID: PMC6978257 DOI: 10.1002/mgg3.1036] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 07/16/2019] [Revised: 10/08/2019] [Accepted: 10/16/2019] [Indexed: 12/12/2022] Open
Abstract
Background Autism spectrum disorder (ASD) is a common neurodevelopmental disorder whose molecular mechanisms are largely unknown. Several studies have shown an association between ASD and abnormalities in the metabolism of amino acids, specifically tryptophan and branched-chain amino acids (BCAAs). Methods Ninety-seven patients with ASD were screened by Sanger sequencing of the genes encoding the heavy (SLC3A2) and light subunits (SLC7A5 and SLC7A8) of the large amino acid transporters (LAT) 1 and 2. LAT1 and LAT2 are responsible for the transportation of tryptophan and BCAAs across the blood–brain barrier and are expressed both in blood and brain. Functional studies were performed employing the Biolog Phenotype Microarray Mammalian (PM-M) technology to investigate the metabolic profiling in lymphoblastoid cell lines from 43 patients with ASD and 50 controls, with particular focus on the amino acid substrates of LATs. Results We detected nine likely pathogenic variants in 11 of 97 patients (11.3%): three in SLC3A2, three in SLC7A5, and three in SLC7A8. Six variants of unknown significance were detected in eight patients, two of whom also carried a likely pathogenic variant. The functional studies showed a consistently reduced utilization of tryptophan, accompanied by evidence of reduced utilization of other large aromatic amino acids (LAAs), either alone or as part of a dipeptide. Conclusion Coding variants in the LAT genes were detected in 17 of 97 patients with ASD (17.5%). Metabolic assays indicate that such abnormalities affect the utilization of certain amino acids, particularly tryptophan and other LAAs, with potential consequences on their transport across the blood–brain barrier and their availability during brain development. Therefore, abnormalities in the LAT1 and LAT2 transporters are likely associated with an increased risk of developing ASD.
Affiliation(s)
- Lauren Cascio
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
- Chin-Fu Chen
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
- Rini Pauly
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
- Sujata Srikanth
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
- Kelly Jones
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
- Cindy D Skinner
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
- Roger E Stevenson
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
- Charles E Schwartz
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
- Luigi Boccuto
- JC Self Research Institute, Greenwood Genetic Center, Greenwood, SC, USA
|
46
|
Zwanenburg A. Radiomics in nuclear medicine: robustness, reproducibility, standardization, and how to avoid data analysis traps and replication crisis. Eur J Nucl Med Mol Imaging 2019; 46:2638-2655. [PMID: 31240330 DOI: 10.1007/s00259-019-04391-8] [Citation(s) in RCA: 169] [Impact Index Per Article: 33.8] [Received: 05/31/2019] [Accepted: 06/04/2019] [Indexed: 12/16/2022]
Abstract
Radiomics in nuclear medicine is rapidly expanding. Reproducibility of radiomics studies in multicentre settings is an important criterion for clinical translation. We therefore performed a meta-analysis to investigate reproducibility of radiomics biomarkers in PET imaging and to obtain quantitative information regarding their sensitivity to variations in various imaging and radiomics-related factors as well as their inherent sensitivity. Additionally, we identify and describe data analysis pitfalls that affect the reproducibility and generalizability of radiomics studies. After a systematic literature search, 42 studies were included in the qualitative synthesis, and data from 21 were used for the quantitative meta-analysis. Data concerning measurement agreement and reliability were collected for 21 of 38 different factors associated with image acquisition, reconstruction, segmentation and radiomics-specific processing steps. Variations in voxel size, segmentation and several reconstruction parameters strongly affected reproducibility, but the level of evidence remained weak. Based on the meta-analysis, we also assessed inherent sensitivity to variations of 110 PET image biomarkers. SUVmean and SUVmax were found to be reliable, whereas image biomarkers based on the neighbourhood grey tone difference matrix and most biomarkers based on the size zone matrix were found to be highly sensitive to variations, and should be used with care in multicentre settings. Lastly, we identify 11 data analysis pitfalls. These pitfalls concern model validation and information leakage during model development, but also relate to reporting and the software used for data analysis. Avoiding such pitfalls is essential for minimizing bias in the results and to enable reproduction and validation of radiomics studies.
Affiliation(s)
- Alex Zwanenburg
- OncoRay - National Center for Radiation Research in Oncology, Faculty of Medicine and University Hospital Carl Gustav Carus, Helmholtz-Zentrum Dresden - Rossendorf, Technische Universität Dresden, Dresden, Germany.
- National Center for Tumor Diseases (NCT), Partner Site Dresden, Dresden, Germany.
- German Cancer Research Center (DKFZ), Heidelberg, Germany.
- German Cancer Consortium (DKTK), Partner Site Dresden, Dresden, Germany.
|
47
|
Affiliation(s)
- Chunxia Zhang
- School of Mathematics and Statistics, Xi'an Jiaotong University, China
- Yilei Wu
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Mu Zhu
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
|
48
|
Sauer DG, Melcher M, Mosor M, Walch N, Berkemeyer M, Scharl-Hirsch T, Leisch F, Jungbauer A, Dürauer A. Real-time monitoring and model-based prediction of purity and quantity during a chromatographic capture of fibroblast growth factor 2. Biotechnol Bioeng 2019; 116:1999-2009. [PMID: 30934111 PMCID: PMC6618329 DOI: 10.1002/bit.26984] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 10/24/2018] [Revised: 03/15/2019] [Accepted: 03/28/2019] [Indexed: 12/14/2022]
Abstract
Process analytical technology combines understanding and control of the process with real‐time monitoring of critical quality and performance attributes. The goal is to ensure the quality of the final product. Currently, chromatographic processes in biopharmaceutical production are predominantly monitored with UV/Vis absorbance, which correlates only to a limited extent with purity and quantity. In this study, a chromatographic workstation was equipped with additional online sensors, such as multi‐angle light scattering, refractive index, attenuated total reflection Fourier‐transform infrared, and fluorescence spectroscopy. Models to predict quantity, host cell protein (HCP) content, and double‐stranded DNA (dsDNA) content simultaneously were developed and exemplified by a cation exchange capture step for fibroblast growth factor 2 expressed in Escherichia coli. Online data and corresponding offline data for product quantity and co‐eluting impurities, such as dsDNA and HCP, were analyzed using boosted structured additive regression. Different sensor combinations were used to achieve the best prediction performance for each quality attribute. Quantity can be adequately predicted by applying a small predictor set of the typical chromatographic workstation sensor signals, with a test error of 0.85 mg/ml (range in training data: 0.1–28 mg/ml). For HCP and dsDNA, additional fluorescence and/or attenuated total reflection Fourier‐transform infrared spectral information was important to achieve prediction errors of 200 ppm (range: 2–6579 ppm) and 340 ppm (range: 8–3773 ppm), respectively.
Affiliation(s)
- Michael Melcher
- Austrian Centre of Industrial Biotechnology, Vienna, Austria; Institute of Applied Statistics and Computing, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
- Magdalena Mosor
- Austrian Centre of Industrial Biotechnology, Vienna, Austria
- Nicole Walch
- Biopharmaceuticals Operations Austria, Manufacturing Science, Boehringer Ingelheim Regional Center Vienna GmbH & Co KG, Vienna, Austria
- Matthias Berkemeyer
- Biopharma Process Science Austria, Boehringer Ingelheim Regional Center Vienna GmbH & Co KG, Vienna, Austria
- Theresa Scharl-Hirsch
- Austrian Centre of Industrial Biotechnology, Vienna, Austria; Institute of Applied Statistics and Computing, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
- Friedrich Leisch
- Austrian Centre of Industrial Biotechnology, Vienna, Austria; Institute of Applied Statistics and Computing, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
- Alois Jungbauer
- Austrian Centre of Industrial Biotechnology, Vienna, Austria; Department of Biotechnology, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
- Astrid Dürauer
- Austrian Centre of Industrial Biotechnology, Vienna, Austria; Department of Biotechnology, University of Natural Resources and Life Sciences Vienna, Vienna, Austria
|
49
|
Hepp T, Schmid M, Mayr A. Significance Tests for Boosted Location and Scale Models with Linear Base-Learners. Int J Biostat 2019; 15:/j/ijb.ahead-of-print/ijb-2018-0110/ijb-2018-0110.xml. [PMID: 30990787 DOI: 10.1515/ijb-2018-0110] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/25/2018] [Accepted: 03/21/2019] [Indexed: 11/15/2022]
Abstract
Generalized additive models for location, scale and shape (GAMLSS) offer very flexible solutions to a wide range of statistical analysis problems, but can be challenging in terms of proper model specification. This complex task can be simplified using regularization techniques such as gradient boosting algorithms, but the estimates derived from such models are shrunken towards zero, and it is consequently not straightforward to calculate proper confidence intervals or test statistics. In this article, we propose two strategies to obtain p-values for linear effect estimates in Gaussian location and scale models, based on permutation tests and a parametric bootstrap approach. These procedures can provide a solution for one of the remaining problems in the application of gradient boosting algorithms for distributional regression in biostatistical data analyses. Results from extensive simulations indicate that in low-dimensional data both suggested approaches are able to hold the type-I error threshold and provide reasonable test power comparable to the Wald-type test for maximum likelihood inference. In high-dimensional data, when gradient boosting is the only feasible inference for this model class, the power decreases but the type-I error is still under control. In addition, we demonstrate the application of both tests in an epidemiological study to analyse the impact of physical exercise on both the average and the stability of lung function in elderly people in Germany.
Affiliation(s)
- Tobias Hepp
- Institut für medizinische Biometrie, Informatik und Epidemiologie, Medizinische Fakultät, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany; Institut für Medizininformatik, Biometrie und Epidemiologie, Medizinische Fakultät, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Matthias Schmid
- Institut für medizinische Biometrie, Informatik und Epidemiologie, Medizinische Fakultät, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
- Andreas Mayr
- Institut für medizinische Biometrie, Informatik und Epidemiologie, Medizinische Fakultät, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
|
50
|
Smith A, Hofner B, Lamb JS, Osenkowski J, Allison T, Sadoti G, McWilliams SR, Paton P. Modeling spatiotemporal abundance of mobile wildlife in highly variable environments using boosted GAMLSS hurdle models. Ecol Evol 2019; 9:2346-2364. [PMID: 30891185 PMCID: PMC6405508 DOI: 10.1002/ece3.4738] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/31/2018] [Revised: 10/11/2018] [Accepted: 10/30/2018] [Indexed: 11/07/2022] [Open Access]
Abstract
Modeling organism distributions from survey data involves numerous statistical challenges, including accounting for zero-inflation, overdispersion, and selection and incorporation of environmental covariates. In environments with high spatial and temporal variability, addressing these challenges often requires numerous assumptions regarding organism distributions and their relationships to biophysical features. These assumptions may limit the resolution or accuracy of predictions resulting from survey-based distribution models. We propose an iterative modeling approach that incorporates a negative binomial hurdle, followed by modeling of the relationship of organism distribution and abundance to environmental covariates using generalized additive models (GAM) and generalized additive models for location, scale, and shape (GAMLSS). Our approach accounts for key features of survey data by separating binary (presence-absence) from count (abundance) data, separately modeling the mean and dispersion of count data, and incorporating selection of appropriate covariates and response functions from a suite of potential covariates while avoiding overfitting. We apply our modeling approach to surveys of sea duck abundance and distribution in Nantucket Sound (Massachusetts, USA), which has been proposed as a location for offshore wind energy development. Our model results highlight the importance of spatiotemporal variation in this system, as well as identifying key habitat features including distance to shore, sediment grain size, and seafloor topographic variation. Our work provides a powerful, flexible, and highly repeatable modeling framework with minimal assumptions that can be broadly applied to the modeling of survey data with high spatiotemporal variability. Applying GAMLSS models to the count portion of survey data allows us to incorporate potential overdispersion, which can dramatically affect model results in highly dynamic systems. 
Our approach is particularly relevant to systems in which little a priori knowledge is available regarding relationships between organism distributions and biophysical features, since it incorporates simultaneous selection of covariates and their functional relationships with organism responses.
Affiliation(s)
- Adam Smith
- Department of Natural Resources Science, University of Rhode Island, Kingston, Rhode Island
- Present address: United States Fish and Wildlife Service, National Wildlife Refuge System, Inventory and Monitoring Branch, Athens, Georgia
- Benjamin Hofner
- Department of Medical Informatics, Biometry and Epidemiology, Friedrich‐Alexander‐University Erlangen‐Nuremberg, Erlangen, Germany
- Present address: Section Biostatistics, Paul‐Ehrlich‐Institut, Langen, Germany
- Juliet S. Lamb
- Department of Natural Resources Science, University of Rhode Island, Kingston, Rhode Island
- Jason Osenkowski
- Rhode Island Department of Environmental Management, West Kingston, Rhode Island
- Taber Allison
- American Wind Wildlife Institute, Washington, District of Columbia
- Scott R. McWilliams
- Department of Natural Resources Science, University of Rhode Island, Kingston, Rhode Island
- Peter Paton
- Department of Natural Resources Science, University of Rhode Island, Kingston, Rhode Island
|