1
Tempke R, Musho T. Autonomous generation of single photon emitting materials. Nanoscale 2024; 16:10239-10249. [PMID: 38726673 DOI: 10.1039/d3nr04944b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
The utilization of machine learning in Materials Science underscores the critical importance of the quality and quantity of data in training models effectively. Unlike fields such as image processing and natural language processing, there is limited availability of atomistic datasets, leading to biases in training data. Particularly in the domain of materials discovery, there exists an issue of continuity in atomistic datasets. Experimental data sourced from literature and patents is usually only available for favorable data, resulting in bias in the training dataset. This study focuses on developing a SMILES-based model for generating synthetic datasets of quantum materials using a variational autoencoder. This study centers on the generation of a synthetic dataset of quantum materials specifically for quantum sensing applications, with a focus on two-level quantum molecules that exhibit a dipole blockade. The proposed technique offers an improved sampling algorithm by incorporating newly generated data into the sampling algorithm to create a more normally distributed dataset. Through this technique, the study was able to generate over 1 000 000 candidate quantum materials from a small dataset of only 8000 materials. The generated dataset identified several iodine-containing molecules as promising single photon emitting materials for potential quantum sensing applications.
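The abstract names a variational autoencoder but gives no architectural details. As a rough illustration only, the sketch below shows two standard VAE ingredients: the reparameterized latent sample and the closed-form KL term against a standard normal prior. The latent size and the values of `mu` and `log_var` are hypothetical and are not taken from the paper.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

# Hypothetical encoder outputs for a single encoded SMILES string.
rng = random.Random(0)
mu = [0.5, -1.0]
log_var = [0.0, -2.0]

z = reparameterize(mu, log_var, rng)      # latent vector passed to the decoder
kl = kl_to_standard_normal(mu, log_var)   # regularization term in the VAE loss
```

The KL term is zero only when the encoder output matches the prior exactly (`mu = 0`, `log_var = 0`), which is what pulls the latent codes toward a normal distribution.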
Affiliation(s)
- Robert Tempke
  - Department of Mechanical, Materials and Aerospace Engineering, West Virginia University, P.O. Box 6106, Morgantown, WV, USA.
- Terence Musho
  - Department of Mechanical, Materials and Aerospace Engineering, West Virginia University, P.O. Box 6106, Morgantown, WV, USA.
2
Aghababa MP, Andrysek J. Exploration and demonstration of explainable machine learning models in prosthetic rehabilitation-based gait analysis. PLoS One 2024; 19:e0300447. [PMID: 38564508 PMCID: PMC10987001 DOI: 10.1371/journal.pone.0300447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 02/27/2024] [Indexed: 04/04/2024] Open
Abstract
Quantitative gait analysis is important for understanding the non-typical walking patterns associated with mobility impairments. Conventional linear statistical methods and machine learning (ML) models are commonly used to assess gait performance and related changes in the gait parameters. Nonetheless, explainable machine learning provides an alternative technique for distinguishing the significant and influential gait changes stemming from a given intervention. The goal of this work was to demonstrate the use of explainable ML models in gait analysis for prosthetic rehabilitation in both population- and sample-based interpretability analyses. Models were developed to classify amputee gait with two types of prosthetic knee joints. Sagittal plane gait patterns of 21 individuals with unilateral transfemoral amputations were video-recorded and 19 spatiotemporal and kinematic gait parameters were extracted and included in the models. Four ML models (logistic regression, support vector machine, random forest, and LightGBM) were assessed and tested for accuracy and precision. The Shapley Additive exPlanations (SHAP) framework was applied to examine global and local interpretability. Random Forest yielded the highest classification accuracy (98.3%). The SHAP framework quantified the level of influence of each gait parameter in the models, where knee flexion-related parameters were found to be the most influential factors in yielding the outcomes of the models. The sample-based explainable ML provided additional insights over the population-based analyses, including an understanding of the effect of the knee type on the walking style of a specific sample, and whether or not it agreed with global interpretations. It was concluded that explainable ML models can be powerful tools for the assessment of gait-related clinical interventions, revealing important parameters that may be overlooked using conventional statistical methods.
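To make the idea of sample-based (local) attribution concrete, the sketch below computes exact Shapley values by brute force for one sample of a tiny hand-written model. It is not the SHAP library and not the authors' pipeline: the model `toy_gait_model`, its three features, and all numbers are hypothetical, and "absent" features are simply set to a baseline value.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample, averaging marginal contributions
    over every feature ordering. Features not yet added take the baseline value."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]          # switch feature i from baseline to sample value
            new = predict(current)
            phi[i] += new - prev       # marginal contribution of feature i in this order
            prev = new
    return [p / len(orders) for p in phi]

def toy_gait_model(features):
    """Hypothetical score with an interaction term; cadence is deliberately unused."""
    knee_flexion, step_length, cadence = features
    return 0.02 * knee_flexion + 1.5 * step_length + 0.01 * knee_flexion * step_length

sample = [60.0, 0.7, 110.0]      # hypothetical knee flexion, step length, cadence
baseline = [50.0, 0.6, 100.0]    # hypothetical reference values
phi = shapley_values(toy_gait_model, sample, baseline)
```

The attributions sum to the difference between the model output for the sample and for the baseline, and the unused feature receives zero. Brute-force enumeration grows factorially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.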
Affiliation(s)
- Mohammad Pourmahmood Aghababa
  - Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada
  - Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, Toronto, Ontario, Canada
- Jan Andrysek
  - Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada
  - Bloorview Research Institute, Holland Bloorview Kids Rehabilitation Hospital, Toronto, Ontario, Canada
3
Iwasaki Y, Suemori T, Kobayashi Y. Predicting macroinvertebrate average score per taxon (ASPT) at water quality monitoring sites in Japanese rivers. Environmental Science and Pollution Research International 2024; 31:28538-28548. [PMID: 38561531 DOI: 10.1007/s11356-024-33053-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 03/19/2024] [Indexed: 04/04/2024]
Abstract
Biomonitoring with bioindicators such as river macroinvertebrates is fundamental for assessing the status of freshwater ecosystems. In Japan, water quality and biomonitoring surveys are conducted separately, leading to a lack of nationwide information on their relationships and the biological status of water quality monitoring (WQM) sites. To understand the biological status of WQM sites across Japan, we developed a multiple linear regression model to estimate the average score per taxon (ASPT) using river macroinvertebrate data surveyed at a total of 237 "aligned" sites based on the co-occurrence of biomonitoring and WQM sites. The resulting regression model with eight predictors, such as biological oxygen demand and the proportion of urban areas in the catchment, could predict ASPT with reasonable accuracy (e.g., an error of ±1 for 96% of the aligned data). Using this model, we estimated ASPT values at 2925 WQM sites in rivers nationwide, categorizing them into four levels of river environment quality: "very good" (29% of WQM sites), "good" (50%), "fairly good" (14%), and "not good" (8%). Furthermore, we observed statistically significant correlations (p < 0.05; 0.4 ≤ r ≤ 0.7) between ASPT and all eight macroinvertebrate metrics examined, such as mayfly and stonefly richness, providing ecological implications of changes in ASPT.
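Two quantities in this abstract are simple enough to sketch: ASPT itself (the mean of the scores assigned to the taxa found at a site) and the "error of ±1" accuracy measure (the share of sites whose predicted ASPT lies within one unit of the observed value). The taxon scores and the observed and predicted values below are invented for illustration and do not come from the study.

```python
def aspt(taxon_scores):
    """Average score per taxon: total score divided by the number of scoring taxa."""
    if not taxon_scores:
        raise ValueError("at least one scoring taxon is required")
    return sum(taxon_scores) / len(taxon_scores)

def fraction_within(observed, predicted, tolerance=1.0):
    """Share of sites where |observed - predicted| is at most the tolerance."""
    hits = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= tolerance)
    return hits / len(observed)

# Hypothetical scores for four taxa recorded at one site.
site_aspt = aspt([8, 9, 7, 4])

# Hypothetical observed vs. model-predicted ASPT at four sites.
observed = [7.0, 6.5, 5.0, 8.0]
predicted = [7.4, 5.0, 5.2, 7.1]
share_ok = fraction_within(observed, predicted)
```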
Affiliation(s)
- Yuichi Iwasaki
  - Research Institute of Science for Safety and Sustainability, National Institute of Advanced Industrial Science and Technology (AIST), 16-1 Onogawa, Tsukuba, Ibaraki, 305-8569, Japan.
- Tomomi Suemori
  - Research Institute of Science for Safety and Sustainability, National Institute of Advanced Industrial Science and Technology (AIST), 16-1 Onogawa, Tsukuba, Ibaraki, 305-8569, Japan
- Yuta Kobayashi
  - Field Science Center, Faculty of Agriculture, Tokyo University of Agriculture and Technology, 3-5-8 Saiwai-tyo, Fuchu, Tokyo, 183-8509, Japan
4
Truong B, Zheng J, Hornsby L, Fox B, Chou C, Qian J. Development and Validation of Machine Learning Algorithms to Predict 1-Year Ischemic Stroke and Bleeding Events in Patients with Atrial Fibrillation and Cancer. Cardiovasc Toxicol 2024; 24:365-374. [PMID: 38499940 PMCID: PMC10998799 DOI: 10.1007/s12012-024-09843-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 02/21/2024] [Indexed: 03/20/2024]
Abstract
In this study, we leveraged a machine learning (ML) approach to develop and validate new assessment tools for predicting stroke and bleeding among patients with atrial fibrillation (AFib) and cancer. We conducted a retrospective cohort study including patients who were newly diagnosed with AFib with a record of cancer from the 2012-2018 Surveillance, Epidemiology, and End Results (SEER)-Medicare database. The ML algorithms were developed and validated separately for each outcome by fitting elastic net, random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), and neural network models with tenfold cross-validation (train:test = 7:3). We obtained area under the curve (AUC), sensitivity, specificity, and F2 score as performance metrics. Model calibration was assessed using the Brier score. In sensitivity analysis, we resampled data using Synthetic Minority Oversampling Technique (SMOTE). Among 18,388 patients with AFib and cancer, 523 (2.84%) had ischemic stroke and 221 (1.20%) had major bleeding within one year after AFib diagnosis. In prediction of ischemic stroke, RF significantly outperformed other ML models [AUC (0.916, 95% CI 0.887-0.945), sensitivity 0.868, specificity 0.801, F2 score 0.375, Brier score = 0.035]. However, the performance of ML algorithms in prediction of major bleeding was low, with the highest AUC achieved by RF (0.623, 95% CI 0.554-0.692). RF models performed better than CHA2DS2-VASc and HAS-BLED scores. SMOTE did not improve the performance of the ML algorithms. Our study demonstrated a promising application of ML in stroke prediction among patients with AFib and cancer. This tool may be leveraged in assisting clinicians to identify patients at high risk of stroke and optimize treatment decisions.
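For readers unfamiliar with two of the metrics reported here, the sketch below gives their standard definitions: the F-beta score (with beta = 2, recall is weighted more heavily than precision) and the Brier score (mean squared difference between predicted probability and the 0/1 outcome). The inputs are toy values chosen for illustration, not data from the study.

```python
def fbeta(precision, recall, beta=2.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return 0.0 if denom == 0 else (1 + b2) * precision * recall / denom

def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and binary outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Toy example: high recall but low precision, as is common for rare outcomes.
f2 = fbeta(precision=0.1, recall=0.9)

# Toy example: four patients, one event.
brier = brier_score([1, 0, 0, 0], [0.8, 0.1, 0.2, 0.0])
```

With a rare outcome, a model can reach high sensitivity while most of its positive calls are false alarms, so a modest F2 score alongside a high AUC is not contradictory.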
Affiliation(s)
- Bang Truong
  - Department of Health Outcomes Research and Policy, Auburn University Harrison College of Pharmacy, 4306d Walker Building, Auburn, AL, 36849, USA
- Jingyi Zheng
  - Department of Mathematics and Statistics, Auburn University College of Sciences and Mathematics, Auburn, AL, USA
- Lori Hornsby
  - Department of Pharmacy Practice, Auburn University Harrison College of Pharmacy, Auburn, AL, USA
- Brent Fox
  - Department of Health Outcomes Research and Policy, Auburn University Harrison College of Pharmacy, 4306d Walker Building, Auburn, AL, 36849, USA
- Chiahung Chou
  - Department of Health Outcomes Research and Policy, Auburn University Harrison College of Pharmacy, 4306d Walker Building, Auburn, AL, 36849, USA
- Jingjing Qian
  - Department of Health Outcomes Research and Policy, Auburn University Harrison College of Pharmacy, 4306d Walker Building, Auburn, AL, 36849, USA.
5
Xia K, Chen D, Jin S, Yi X, Luo L. Prediction of lung papillary adenocarcinoma-specific survival using ensemble machine learning models. Sci Rep 2023; 13:14827. [PMID: 37684259 PMCID: PMC10491759 DOI: 10.1038/s41598-023-40779-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 08/16/2023] [Indexed: 09/10/2023] Open
Abstract
Accurate prognostic prediction is crucial for treatment decision-making in lung papillary adenocarcinoma (LPADC). The aim of this study was to predict cancer-specific survival in LPADC using ensemble machine learning and classical Cox regression models. Moreover, models were evaluated to provide recommendations based on quantitative data for personalized treatment of LPADC. Data of patients diagnosed with LPADC (2004-2018) were extracted from the Surveillance, Epidemiology, and End Results database. The set of samples was randomly divided into the training and validation sets at a ratio of 7:3. Three ensemble models were selected, namely gradient boosting survival (GBS), random survival forest (RSF), and extra survival trees (EST). In addition, Cox proportional hazards (CoxPH) regression was used to construct the prognostic models. The Harrell's concordance index (C-index), integrated Brier score (IBS), and area under the time-dependent receiver operating characteristic curve (time-dependent AUC) were used to evaluate the performance of the predictive models. A user-friendly web access panel was provided to easily evaluate the model for the prediction of survival and treatment recommendations. A total of 3615 patients were randomly divided into the training and validation cohorts (n = 2530 and 1085, respectively). The extra survival trees, RSF, GBS, and CoxPH models showed good discriminative ability and calibration in both the training and validation cohorts (mean of time-dependent AUC: > 0.84 and > 0.82; C-index: > 0.79 and > 0.77; IBS: < 0.16 and < 0.17, respectively). The RSF and GBS models were more consistent than the CoxPH model in predicting long-term survival. We implemented the developed models as web applications for deployment into clinical practice (accessible through https://shinyshine-820-lpaprediction-model-z3ubbu.streamlit.app/ ). All four prognostic models showed good discriminative ability and calibration. 
The RSF and GBS models exhibited the highest effectiveness among all models in predicting the long-term cancer-specific survival of patients with LPADC. This approach may facilitate the development of personalized treatment plans and prediction of prognosis for LPADC.
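Harrell's concordance index, used above to compare the survival models, can be sketched in a few lines. A pair of patients is comparable when the one with the shorter follow-up time had an observed event; the pair is concordant when that patient also received the higher predicted risk. This simplified version ignores tied survival times, and the four patients below are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (tied times are not handled).
    events[i] is 1 if the event was observed and 0 if the patient was censored."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # a censored patient cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:       # patient i failed before patient j's last follow-up
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5     # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical follow-up times (months), event indicators, and predicted risks.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.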
Affiliation(s)
- Kaide Xia
  - Guiyang Maternal and Child Health Care Hospital, Guiyang Children's Hospital, Guiyang, China
- Dinghua Chen
  - Department of General Surgery, The Forth People's Hospital of Guiyang, Guiyang, China
- Shuai Jin
  - School of Big Health, Guizhou Medical University, Guiyang, China
- Xinglin Yi
  - Department of Respiratory Medicine, Third Military Medical University, Chongqing, China
- Li Luo
  - Department of Clinical Laboratory, The Second People's Hospital of Guiyang, Guiyang, China.
6
Congdon P. The ethnic density effect as a contextual influence in ecological disease models: Establishing its quantitative expression. Health Place 2023; 83:103083. [PMID: 37544099 DOI: 10.1016/j.healthplace.2023.103083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 04/09/2023] [Accepted: 07/04/2023] [Indexed: 08/08/2023]
Abstract
Research suggests higher neighbourhood ethnic minority density to be associated with lessened chances of ethnic group illness. We focus on the density effect on psychosis, arguing that (at higher ethnic concentrations) it acts as a contextual influence attenuating the compositional influence whereby minority ethnicity is associated with higher psychosis risk. In terms of ecological disease regression, the ethnic density effect will then be apparent in nonlinear impacts of minority concentration. Contextual effects may also be evident in spatially varying regression coefficient models for psychosis. Nonlinearity or heterogeneity may be associated with other contextual processes where geography modifies demography (e.g. deprivation amplification). We illustrate these issues with an analysis of psychosis prevalence in 4835 London neighbourhoods. The data are collected in primary care (during 2019/20) using clinical diagnosis (e.g. based on referrals to specialists or psychosis hospitalisation), and refer to patients currently under care: such care may extend retrospectively over several years. The data offer a complete population perspective in contrast to survey data, which typically offer limited geographic perspectives. We consider impacts on psychosis prevalence of non-white ethnicity, as well as those of deprivation, social fragmentation and urbanicity. We find evidence suggesting nonlinear impacts of non-white ethnicity on psychosis (essentially flat risk above a threshold concentration), but find no evidence for deprivation amplification.
Affiliation(s)
- Peter Congdon
  - School of Geography, Queen Mary University of London, London, E1 4NS, UK.
7
Asadullah MN, Tham E. Learning and happiness during Covid-19 school closure in urban Malaysia. International Journal of Educational Development 2023; 101:102822. [PMID: 37347031 PMCID: PMC10258585 DOI: 10.1016/j.ijedudev.2023.102822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2023] [Revised: 05/29/2023] [Accepted: 06/08/2023] [Indexed: 06/23/2023]
Abstract
COVID-19 school closure has disrupted education systems globally, raising concerns over learning time loss. At the same time, social isolation at home has seen a decline in happiness level among young learners. Understanding the link between cognitive effort and emotional wellbeing is important for post-pandemic learning recovery interventions, particularly if there is a feedback loop from happiness to learning. In this context, we use primary survey data collected during the first school closure in urban Malaysia to study the complex association between learning loss and student happiness. Machine learning methods are used to accommodate the multi-dimensional and interaction effects between the covariates that influence this association. Empirically, we find that the most important covariates are student gender, socioeconomic status (SES) proxied by the number of books owned, and time spent on play and religious activity. Based on the results, we develop a conceptual framework of learning continuity by formalizing the importance of investment in emotional wellbeing.
Affiliation(s)
- M Niaz Asadullah
  - Monash University Malaysia, Malaysia
  - University of Reading, UK
  - North South University, Bangladesh
8
Gentilin A. Challenges in normalizing pulse wave velocity scores: Implications for assessing central artery stiffness. Vascular 2023:17085381231194145. [PMID: 37553123 DOI: 10.1177/17085381231194145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/10/2023]
9
Fife DA, D'Onofrio J. Common, uncommon, and novel applications of random forest in psychological research. Behav Res Methods 2023; 55:2447-2466. [PMID: 35915361 DOI: 10.3758/s13428-022-01901-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Accepted: 06/05/2022] [Indexed: 01/08/2023]
Abstract
Recent reform efforts have pushed toward a better understanding of the distinction between exploratory and confirmatory research, and appropriate use of each. As some utilize more exploratory tools, it may be tempting to employ multiple linear regression models. In this paper, we advocate for the use of random forest (RF) models. RF is able to obtain better predictive performance than traditional regression, while also inherently protecting against overfitting as well as detecting nonlinear effects and interactions among predictors. Given the advantages of RF compared to other statistical procedures, it is a tool commonly used within a plethora of industries, including stock trading, banking, pharmaceuticals, and patient healthcare planning. However, we find RF is used within the field of psychology comparatively less frequently. In the current paper, we advocate for RF as an important statistical tool within the context of behavioral and psychological research. In hopes of increasing the use of RF in the field of psychology, we provide information pertaining to the limitations one might confront in using RF and how to overcome such limitations. Moreover, we discuss various methods for how to optimally utilize RF with psychological data, such as nonparametric modeling, interaction and nonlinearity detection, variable selection, prediction and classification modeling, and assessing parameters of Monte Carlo simulations. Throughout, we illustrate the use of RF with visualization strategies, aimed to make RF models more comprehensible and intuitive.
10
Edlinger A, Garland G, Banerjee S, Degrune F, García-Palacios P, Herzog C, Pescador DS, Romdhane S, Ryo M, Saghaï A, Hallin S, Maestre FT, Philippot L, Rillig MC, van der Heijden MGA. The impact of agricultural management on soil aggregation and carbon storage is regulated by climatic thresholds across a 3000 km European gradient. Global Change Biology 2023; 29:3177-3192. [PMID: 36897740 DOI: 10.1111/gcb.16677] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/24/2022] [Accepted: 02/07/2023] [Indexed: 05/03/2023]
Abstract
Organic carbon and aggregate stability are key features of soil quality and are important to consider when evaluating the potential of agricultural soils as carbon sinks. However, we lack a comprehensive understanding of how soil organic carbon (SOC) and aggregate stability respond to agricultural management across wide environmental gradients. Here, we assessed the impact of climatic factors, soil properties and agricultural management (including land use, crop cover, crop diversity, organic fertilization, and management intensity) on SOC and the mean weight diameter of soil aggregates, commonly used as an indicator for soil aggregate stability, across a 3000 km European gradient. Soil aggregate stability (-56%) and SOC stocks (-35%) in the topsoil (20 cm) were lower in croplands compared with neighboring grassland sites (uncropped sites with perennial vegetation and little or no external inputs). Land use and aridity were strong drivers of soil aggregation explaining 33% and 20% of the variation, respectively. SOC stocks were best explained by calcium content (20% of explained variation) followed by aridity (15%) and mean annual temperature (10%). We also found a threshold-like pattern for SOC stocks and aggregate stability in response to aridity, with lower values at sites with higher aridity. The impact of crop management on aggregate stability and SOC stocks appeared to be regulated by these thresholds, with more pronounced positive effects of crop diversity and more severe negative effects of crop management intensity in nondryland compared with dryland regions. We link the higher sensitivity of SOC stocks and aggregate stability in nondryland regions to a higher climatic potential for aggregate-mediated SOC stabilization. The presented findings are relevant for improving predictions of management effects on soil structure and C storage and highlight the need for site-specific agri-environmental policies to improve soil quality and C sequestration.
Affiliation(s)
- Anna Edlinger
  - Agroscope, Plant-Soil Interactions Group, Zurich, Switzerland
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Gina Garland
  - Agroscope, Plant-Soil Interactions Group, Zurich, Switzerland
  - Department of Environmental System Science, ETH Zurich, Zurich, Switzerland
- Samiran Banerjee
  - Department of Microbiological Sciences, North Dakota State University, Fargo, North Dakota, USA
- Florine Degrune
  - Institute of Biology, Freie Universität Berlin, Berlin, Germany
  - Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
  - Soil Science and Environment Group, Changins, University of Applied Sciences and Arts Western Switzerland, Nyon, Switzerland
- Pablo García-Palacios
  - Instituto de Ciencias Agrarias, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- Chantal Herzog
  - Agroscope, Plant-Soil Interactions Group, Zurich, Switzerland
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- David Sánchez Pescador
  - Departamento de Biología y Geología, Física y Química Inorgánica, Escuela Superior de Ciencias Experimentales y Tecnología, Universidad Rey Juan Carlos, Móstoles, Spain
- Sana Romdhane
  - Department of Agroecology, INRA, AgroSup Dijon, University Bourgogne Franche Comte, Dijon, France
- Masahiro Ryo
  - Institute of Biology, Freie Universität Berlin, Berlin, Germany
  - Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
  - Leibniz Centre for Agricultural Landscape Research (ZALF), Müncheberg, Germany
  - Brandenburg University of Technology Cottbus-Senftenberg, Cottbus, Germany
- Aurélien Saghaï
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Sara Hallin
  - Department of Forest Mycology and Plant Pathology, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Fernando T Maestre
  - Instituto Multidisciplinar para el Estudio del Medio "Ramón Margalef", Universidad de Alicante, Alicante, Spain
  - Departamento de Ecología, Universidad de Alicante, Alicante, Spain
- Laurent Philippot
  - Department of Agroecology, INRA, AgroSup Dijon, University Bourgogne Franche Comte, Dijon, France
- Matthias C Rillig
  - Institute of Biology, Freie Universität Berlin, Berlin, Germany
  - Berlin-Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Marcel G A van der Heijden
  - Agroscope, Plant-Soil Interactions Group, Zurich, Switzerland
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
11
Haw JY, King RB. Understanding Filipino students’ achievement in PISA: The roles of personal characteristics, proximal processes, and social contexts. Social Psychology of Education 2023. [DOI: 10.1007/s11218-023-09773-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
12
Dedeloudi A, Weaver E, Lamprou DA. Machine learning in additive manufacturing & Microfluidics for smarter and safer drug delivery systems. Int J Pharm 2023; 636:122818. [PMID: 36907280 DOI: 10.1016/j.ijpharm.2023.122818] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/16/2023] [Revised: 02/23/2023] [Accepted: 03/06/2023] [Indexed: 03/13/2023]
Abstract
A new technological passage has emerged in the pharmaceutical field, concerning the management, application, and transfer of knowledge from humans to machines, as well as the implementation of advanced manufacturing and product optimisation processes. Machine Learning (ML) methods have been introduced to Additive Manufacturing (AM) and Microfluidics (MFs) to predict and generate learning patterns for precise fabrication of tailor-made pharmaceutical treatments. Moreover, regarding the diversity and complexity of personalised medicine, ML has been part of quality by design strategy, targeting towards the development of safe and effective drug delivery systems. The utilisation of different and novel ML techniques along with Internet of Things sensors in AM and MFs, have shown promising aspects regarding the development of well-defined automated procedures towards the production of sustainable and quality-based therapeutic systems. Thus, the effective data utilisation, prospects on a flexible and broader production of "on demand" treatments. In this study, a thorough overview has been achieved, concerning scientific achievements of the past decade, which aims to trigger the research interest on incorporating different types of ML in AM and MFs, as essential techniques for the enhancement of quality standards of customised medicinal applications, as well as the reduction of variability potency, throughout a pharmaceutical process.
Affiliation(s)
- Aikaterini Dedeloudi
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK
- Edward Weaver
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK
- Dimitrios A Lamprou
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK.
13
Rudar J, Golding GB, Kremer SC, Hajibabaei M. Decision Tree Ensembles Utilizing Multivariate Splits Are Effective at Investigating Beta Diversity in Medically Relevant 16S Amplicon Sequencing Data. Microbiol Spectr 2023; 11:e0206522. [PMID: 36877086 PMCID: PMC10100742 DOI: 10.1128/spectrum.02065-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 02/11/2023] [Indexed: 03/07/2023] Open
Abstract
Developing an understanding of how microbial communities vary across conditions is an important analytical step. We used 16S rRNA data isolated from human stool samples to investigate whether learned dissimilarities, such as those produced using unsupervised decision tree ensembles, can be used to improve the analysis of the composition of bacterial communities in patients suffering from Crohn's disease and adenomas/colorectal cancers. We also introduce a workflow capable of learning dissimilarities, projecting them into a lower dimensional space, and identifying features that impact the location of samples in the projections. For example, when used with the centered log ratio transformation, our new workflow (TreeOrdination) could identify differences in the microbial communities of Crohn's disease patients and healthy controls. Further investigation of our models elucidated the global impact amplicon sequence variants (ASVs) had on the locations of samples in the projected space and how each ASV impacted individual samples in this space. Furthermore, this approach can be used to integrate patient data easily into the model and results in models that generalize well to unseen data. Models employing multivariate splits can improve the analysis of complex high-throughput sequencing data sets because they are better able to learn about the underlying structure of the data set. IMPORTANCE There is an ever-increasing level of interest in accurately modeling and understanding the roles that commensal organisms play in human health and disease. We show that learned representations can be used to create informative ordinations. We also demonstrate that the application of modern model introspection algorithms can be used to investigate and quantify the impacts of taxa in these ordinations, and that the taxa identified by these approaches have been associated with immune-mediated inflammatory diseases and colorectal cancer.
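The centered log ratio (CLR) transformation mentioned above maps compositional count data to log ratios against the geometric mean of the sample. The sketch below is a minimal version; the pseudocount added to avoid taking the logarithm of zero is an assumption made here for illustration (zero handling varies between studies), and the ASV counts are invented.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log ratio: log of each component minus the mean log of the sample.
    The pseudocount (an illustrative assumption) keeps zero counts finite."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)      # log of the geometric mean
    return [value - mean_log for value in logs]

# Hypothetical ASV read counts for one stool sample.
sample_counts = [10, 0, 5, 85]
transformed = clr(sample_counts)
```

CLR values for a sample always sum to zero, and components above the geometric mean come out positive.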
Affiliation(s)
- Josip Rudar
  - Department of Integrative Biology & Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, Canada
- G. Brian Golding
  - Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Stefan C. Kremer
  - School of Computer Science, University of Guelph, Guelph, Ontario, Canada
- Mehrdad Hajibabaei
  - Department of Integrative Biology & Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, Canada
14
Dong W, Motairek I, Nasir K, Chen Z, Kim U, Khalifa Y, Freedman D, Griggs S, Rajagopalan S, Al-Kindi SG. Risk factors and geographic disparities in premature cardiovascular mortality in US counties: a machine learning approach. Sci Rep 2023; 13:2978. [PMID: 36808141 PMCID: PMC9941082 DOI: 10.1038/s41598-023-30188-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/22/2022] [Accepted: 02/17/2023] [Indexed: 02/22/2023] Open
Abstract
Disparities in premature cardiovascular mortality (PCVM) have been associated with socioeconomic, behavioral, and environmental risk factors. Understanding the "phenotypes", or combinations of characteristics associated with the highest risk of PCVM, and the geographic distributions of these phenotypes is critical to targeting PCVM interventions. This study applied the classification and regression tree (CART) to identify county phenotypes of PCVM and geographic information systems to examine the distributions of identified phenotypes. Random forest analysis was applied to evaluate the relative importance of risk factors associated with PCVM. The CART analysis identified seven county phenotypes of PCVM, where high-risk phenotypes were characterized by having greater percentages of people with lower income, higher physical inactivity, and higher food insecurity. These high-risk phenotypes were mostly concentrated in the Black Belt of the American South and the Appalachian region. The random forest analysis identified additional important risk factors associated with PCVM, including broadband access, smoking, receipt of Supplemental Nutrition Assistance Program benefits, and educational attainment. Our study demonstrates the use of machine learning approaches in characterizing community-level phenotypes of PCVM. Interventions to reduce PCVM should be tailored according to these phenotypes in corresponding geographic areas.
Affiliation(s)
- Weichuan Dong
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Issam Motairek
- Harrington Heart and Vascular Institute, University Hospitals, 11100 Euclid Ave, Cleveland, OH, 44106, USA
| | | | - Zhuo Chen
- Harrington Heart and Vascular Institute, University Hospitals, 11100 Euclid Ave, Cleveland, OH, 44106, USA
| | - Uriel Kim
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
- Kellogg School of Management, Northwestern University, Evanston, IL, 60208, USA
| | - Yassin Khalifa
- Harrington Heart and Vascular Institute, University Hospitals, 11100 Euclid Ave, Cleveland, OH, 44106, USA
| | - Darcy Freedman
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
- Mary Ann Swetland Center for Environmental Health, Case Western Reserve University, Cleveland, OH, 44106, USA
| | - Stephanie Griggs
- Frances Bolton School of Nursing, Case Western Reserve University, Cleveland, OH, 44106, USA
| | - Sanjay Rajagopalan
- Harrington Heart and Vascular Institute, University Hospitals, 11100 Euclid Ave, Cleveland, OH, 44106, USA
- Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
| | - Sadeer G Al-Kindi
- Harrington Heart and Vascular Institute, University Hospitals, 11100 Euclid Ave, Cleveland, OH, 44106, USA.
- Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA.
|
15
|
Pichler M, Hartig F. Machine learning and deep learning—A review for ecologists. Methods Ecol Evol 2023. [DOI: 10.1111/2041-210x.14061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Affiliation(s)
| | - Florian Hartig
- Theoretical Ecology, University of Regensburg, Regensburg, Germany
|
16
|
Hallin S. Environmental microbiology going computational-Predictive ecology and unpredicted discoveries. Environ Microbiol 2023; 25:111-114. [PMID: 36181387 PMCID: PMC10092848 DOI: 10.1111/1462-2920.16232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 09/28/2022] [Indexed: 01/21/2023]
Affiliation(s)
- Sara Hallin
- Swedish University of Agricultural Sciences, Department of Forest Mycology and Plant Pathology, Uppsala, Sweden
|
17
|
Tran TT, Lee J, Gunathilake M, Kim J, Kim SY, Cho H, Kim J. A comparison of machine learning models and Cox proportional hazards models regarding their ability to predict the risk of gastrointestinal cancer based on metabolic syndrome and its components. Front Oncol 2023; 13:1049787. [PMID: 36937438 PMCID: PMC10018751 DOI: 10.3389/fonc.2023.1049787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 01/20/2023] [Indexed: 03/06/2023] Open
Abstract
Background Little is known about applying machine learning (ML) techniques to identify the important variables contributing to the occurrence of gastrointestinal (GI) cancer in epidemiological studies. We aimed to compare different ML models to a Cox proportional hazards (CPH) model regarding their ability to predict the risk of GI cancer based on metabolic syndrome (MetS) and its components. Methods A total of 41,837 participants were included in a prospective cohort study. Incident cancer cases were identified by following up with participants until December 2019. We used CPH, random survival forest (RSF), survival trees (ST), gradient boosting (GB), survival support vector machine (SSVM), and extra survival trees (EST) models to explore the impact of MetS on GI cancer prediction. We used the C-index and integrated Brier score (IBS) to compare the models. Results In all, 540 incident GI cancer cases were identified. The GB and SSVM models exhibited comparable performance to the CPH model concerning the C-index (0.725). We also recorded a similar IBS for all models (0.017). Fasting glucose and waist circumference were considered important predictors. Conclusions Our study found comparably good performance concerning the C-index for the ML models and CPH model. This finding suggests that ML models may be considered another method for survival analysis when the CPH model's conditions are not satisfied.
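The C-index used above to compare survival models can be sketched as a pairwise count. This is a minimal version of Harrell's concordance index, assuming right-censored data and counting tied risk scores as half concordant; the function name and the toy data are hypothetical and are not taken from the study.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy cohort: four participants, one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risks)
```

In this toy cohort, four of the five comparable pairs are ranked correctly, so the index is 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.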
Affiliation(s)
- Tao Thi Tran
- Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Jeonghee Lee
- Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Madhawa Gunathilake
- Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Junetae Kim
- Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Sun-Young Kim
- Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Hyunsoon Cho
- Department of Cancer Control and Population Health, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
| | - Jeongseon Kim
- Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, Goyang-si, Gyeonggi-do, Republic of Korea
- *Correspondence: Jeongseon Kim,
|
18
|
Miller RJH, Hauser MT, Sharir T, Einstein AJ, Fish MB, Ruddy TD, Kaufmann PA, Sinusas AJ, Miller EJ, Bateman TM, Dorbala S, Di Carli M, Huang C, Liang JX, Han D, Dey D, Berman DS, Slomka PJ. Machine learning to predict abnormal myocardial perfusion from pre-test features. J Nucl Cardiol 2022; 29:2393-2403. [PMID: 35672567 PMCID: PMC9588501 DOI: 10.1007/s12350-022-03012-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/29/2022] [Revised: 04/22/2022] [Accepted: 04/22/2022] [Indexed: 11/24/2022]
Abstract
BACKGROUND Accurately predicting which patients will have abnormal perfusion on MPI based on pre-test clinical information may help physicians make test selection decisions. We developed and validated a machine learning (ML) model for predicting abnormal perfusion using pre-test features. METHODS We included consecutive patients who underwent SPECT MPI, with 20,418 patients from a multi-center (5 sites) international registry in the training population and 9019 patients (from 2 separate sites) in the external testing population. The ML (extreme gradient boosting) model utilized 30 pre-test features to predict the presence of abnormal myocardial perfusion by expert visual interpretation. RESULTS In external testing, the ML model had higher prediction performance for abnormal perfusion (area under receiver-operating characteristic curve [AUC] 0.762, 95% CI 0.750-0.774) compared to the clinical CAD consortium (AUC 0.689), basic CAD consortium (AUC 0.657), and updated Diamond-Forrester models (AUC 0.658, p < 0.001 for all). Calibration (validation of the continuous risk prediction) was superior for the ML model (Brier score 0.149) compared to the other models (Brier score 0.165 to 0.198, all p < 0.001). CONCLUSION ML can predict abnormal myocardial perfusion using readily available pre-test information. This model could be used to help guide physician decisions regarding non-invasive test selection.
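The two metrics reported above, AUC for discrimination and the Brier score for calibration, can be illustrated with a small hand-rolled sketch. The function names, labels, and predicted probabilities below are hypothetical; the study's own evaluation code is not shown in the abstract.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical outcomes (1 = abnormal perfusion) and model probabilities.
y_true = [1, 0, 1, 0]
y_prob = [0.8, 0.7, 0.6, 0.1]

model_brier = brier_score(y_true, y_prob)
model_auc = auc(y_true, y_prob)
```

For these toy values the Brier score is 0.175 and the AUC is 0.75. A lower Brier score indicates better-calibrated probabilities, while a higher AUC indicates better ranking of abnormal over normal cases.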
Affiliation(s)
- Robert J H Miller
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Sciences, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
- Department of Cardiac Sciences, University of Calgary, Calgary, AB, Canada
- Libin Cardiovascular Institute, Calgary, AB, Canada
| | - M Timothy Hauser
- Section of Nuclear Cardiology, Department of Clinical Imaging, Oklahoma Heart Hospital, Oklahoma City, OK, USA
| | - Tali Sharir
- Department of Nuclear Cardiology, Assuta Medical Centers, Tel Aviv, Israel
- Ben Gurion University of the Negev, Beer Sheba, Israel
| | - Andrew J Einstein
- Division of Cardiology, Department of Medicine and Department of Radiology, Columbia University Irving Medical Center, New York, NY, USA
- New York-Presbyterian Hospital, New York, NY, USA
| | - Mathews B Fish
- Oregon Heart and Vascular Institute, Sacred Heart Medical Center, Springfield, OR, USA
| | - Terrence D Ruddy
- Division of Cardiology, University of Ottawa Heart Institute, Ottawa, ON, Canada
| | - Philipp A Kaufmann
- Department of Nuclear Medicine, Cardiac Imaging, University Hospital Zurich, Zurich, Switzerland
| | - Albert J Sinusas
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA
| | - Edward J Miller
- Section of Cardiovascular Medicine, Department of Internal Medicine, Yale University School of Medicine, New Haven, CT, USA
| | | | - Sharmila Dorbala
- Division of Nuclear Medicine and Molecular Imaging, Department of Radiology, Brigham and Women's Hospital, Boston, MA, USA
| | - Marcelo Di Carli
- Division of Nuclear Medicine and Molecular Imaging, Department of Radiology, Brigham and Women's Hospital, Boston, MA, USA
| | - Cathleen Huang
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Sciences, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Joanna X Liang
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Sciences, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Donghee Han
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Sciences, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Damini Dey
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Sciences, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Daniel S Berman
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Sciences, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA
| | - Piotr J Slomka
- Departments of Medicine (Division of Artificial Intelligence in Medicine), Imaging and Biomedical Sciences, Cedars-Sinai Medical Center, 8700 Beverly Blvd, Suite Metro 203, Los Angeles, CA, 90048, USA.
|
19
|
Kim SI, Kang JW, Eun YG, Lee YC. Prediction of survival in oropharyngeal squamous cell carcinoma using machine learning algorithms: A study based on the surveillance, epidemiology, and end results database. Front Oncol 2022; 12:974678. [PMID: 36072804 PMCID: PMC9441569 DOI: 10.3389/fonc.2022.974678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Accepted: 08/08/2022] [Indexed: 11/28/2022] Open
Abstract
Background We determined appropriate survival prediction machine learning models for patients with oropharyngeal squamous cell carcinoma (OPSCC) using the “Surveillance, Epidemiology, and End Results” (SEER) database. Methods In total, 4039 patients diagnosed with OPSCC between 2004 and 2016 were enrolled in this study. In particular, 13 variables were selected and analyzed: age, sex, tumor grade, tumor size, neck dissection, radiation therapy, cancer directed surgery, chemotherapy, T stage, N stage, M stage, clinical stage, and human papillomavirus (HPV) status. The T-, N-, and clinical staging were reconstructed based on the American Joint Committee on Cancer (AJCC) Staging Manual, 8th Edition. The patients were randomly assigned to a development or test dataset at a 7:3 ratio. The extremely randomized survival tree (EST), conditional survival forest (CSF), and DeepSurv models were used to predict the overall and disease-specific survival in patients with OPSCC. A 10-fold cross-validation on a development dataset was used to build the training and internal validation data for all models. We evaluated the predictive performance of each model using test datasets. Results A higher c-index value and lower integrated Brier score (IBS), root mean square error (RMSE), and mean absolute error (MAE) indicate a better performance from a machine learning model. The C-index was the highest for the DeepSurv model (0.77). The IBS was also the lowest in the DeepSurv model (0.08). However, the RMSE and RAE were the lowest for the CSF model. Conclusions We demonstrated various machine-learning-based survival prediction models. The CSF model showed a better performance in predicting the survival of patients with OPSCC in terms of the RMSE and RAE. In this context, machine learning models based on personalized survival predictions can be used to stratify various complex risk factors. This could help in designing personalized treatments and predicting prognoses for patients.
|
20
|
Tachino J, Matsumoto H, Sugihara F, Seno S, Okuzaki D, Kitamura T, Komukai S, Kido Y, Kojima T, Togami Y, Katayama Y, Nakagawa Y, Ogura H. Development of clinical phenotypes and biological profiles via proteomic analysis of trauma patients. Crit Care 2022; 26:241. [PMID: 35933364 PMCID: PMC9357328 DOI: 10.1186/s13054-022-04103-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Accepted: 07/16/2022] [Indexed: 11/28/2022] Open
Abstract
Background Trauma is a heterogeneous condition, and specific clinical phenotypes may identify target populations that could benefit from certain treatment strategies. In this retrospective study, we determined clinical phenotypes and identified new target populations of trauma patients and their treatment strategies. Methods We retrospectively analyzed datasets from the Japan Trauma Data Bank and determined trauma death clinical phenotypes using statistical machine learning techniques and evaluation of biological profiles. Results The analysis included 71,038 blunt trauma patients [median age, 63 (interquartile range [IQR], 40–78) years; 45,479 (64.0%) males; median Injury Severity Score, 13 (IQR, 9–20)], and the derivation and validation cohorts included 42,780 (60.2%) and 28,258 (39.8%) patients, respectively. Of eight derived phenotypes (D-1–D-8), D-8 (n = 2178) had the highest mortality (48.6%) with characteristic severely disturbed consciousness and was further divided into four phenotypes: D-8α, multiple trauma in the young (n = 464); D-8β, head trauma with lower body temperature (n = 178); D-8γ, severe head injury in the elderly (n = 957); and D-8δ, multiple trauma, with higher predicted mortality than actual mortality (n = 579). Phenotype distributions were comparable in the validation cohort. Biological profile analysis of 90 trauma patients revealed that D-8 exhibited excessive inflammation, including enhanced acute inflammatory response, dysregulated complement activation pathways, and impaired coagulation, including downregulated coagulation and platelet degranulation pathways, compared with other phenotypes. Conclusions We identified clinical phenotypes with high mortality, and the evaluation of the molecular pathogenesis underlying these clinical phenotypes suggests that lethal trauma may involve excessive inflammation and coagulation disorders. 
Supplementary Information The online version contains supplementary material available at 10.1186/s13054-022-04103-z.
|
21
|
Prediction of Oil Palm Yield Using Machine Learning in the Perspective of Fluctuating Weather and Soil Moisture Conditions: Evaluation of a Generic Workflow. PLANTS 2022; 11:plants11131697. [PMID: 35807648 PMCID: PMC9268852 DOI: 10.3390/plants11131697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Revised: 06/20/2022] [Accepted: 06/24/2022] [Indexed: 11/19/2022]
Abstract
Current development in precision agriculture has underscored the role of machine learning in crop yield prediction. Machine learning algorithms are capable of learning linear and nonlinear patterns in complex agro-meteorological data. However, the application of machine learning methods for predictive analysis is lacking in the oil palm industry. This work evaluated a supervised machine learning approach to develop an explainable and reusable oil palm yield prediction workflow. The input data included 12 weather and three soil moisture parameters along with 420 months of actual yield records of the study site. Multisource data and conventional machine learning techniques were coupled with an automated model selection process. The performance of two top regression models, namely Extra Tree and AdaBoost, was evaluated using six statistical evaluation metrics. The prediction was followed by data preprocessing and feature selection. Selected regression models were compared with Random Forest, Gradient Boosting, Decision Tree, and other non-tree algorithms to prove the R²-driven performance superiority of tree-based ensemble models. In addition, the learning process of the models was examined using model-based feature importance, learning curve, validation curve, residual analysis, and prediction error. Results indicated that rainfall frequency, root-zone soil moisture, and temperature could make a significant impact on oil palm yield. Most influential features that contributed to the prediction process are rainfall, cloud amount, number of rain days, wind speed, and root zone soil wetness. It is concluded that the means of machine learning have great potential for the application to predict oil palm yield using weather and soil moisture data.
|
22
|
Deng H, Eftekhari Z, Carlin C, Veerapong J, Fournier KF, Johnston FM, Dineen SP, Powers BD, Hendrix R, Lambert LA, Abbott DE, Vande Walle K, Grotz TE, Patel SH, Clarke CN, Staley CA, Abdel-Misih S, Cloyd JM, Lee B, Fong Y, Raoof M. Development and Validation of an Explainable Machine Learning Model for Major Complications After Cytoreductive Surgery. JAMA Netw Open 2022; 5:e2212930. [PMID: 35612856 PMCID: PMC9133947 DOI: 10.1001/jamanetworkopen.2022.12930] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/17/2021] [Accepted: 03/31/2022] [Indexed: 11/16/2022] Open
Abstract
Importance Cytoreductive surgery (CRS) is one of the most complex operations in surgical oncology with significant morbidity, and improved risk prediction tools are critically needed. Machine learning models can potentially overcome the limitations of traditional multiple logistic regression (MLR) models and provide accurate risk estimates. Objective To develop and validate an explainable machine learning model for predicting major postoperative complications in patients undergoing CRS. Design, Setting, and Participants This prognostic study used patient data from tertiary care hospitals with expertise in CRS included in the US Hyperthermic Intraperitoneal Chemotherapy Collaborative Database between 1998 and 2018. Information from 147 variables was extracted to predict the risk of a major complication. An ensemble-based machine learning (gradient-boosting) model was optimized on 80% of the sample with subsequent validation on a 20% holdout data set. The machine learning model was compared with traditional MLR models. The artificial intelligence SHAP (Shapley additive explanations) method was used for interpretation of patient- and cohort-level risk estimates and interactions to define novel surgical risk phenotypes. Data were analyzed between November 2019 and August 2021. Exposures Cytoreductive surgery. Main Outcomes and Measures Area under the receiver operating characteristics (AUROC); area under the precision recall curve (AUPRC). Results Data from a total 2372 patients were included in model development (mean age, 55 years [range, 11-95 years]; 1366 [57.6%] women). The optimized machine learning model achieved high discrimination (AUROC: mean cross-validation, 0.75 [range, 0.73-0.81]; test, 0.74) and precision (AUPRC: mean cross-validation, 0.50 [range, 0.46-0.58]; test, 0.42). Compared with the optimized machine learning model, the published MLR model performed worse (test AUROC and AUPRC: 0.54 and 0.18, respectively). 
Higher volume of estimated blood loss, having pelvic peritonectomy, and longer operative time were the top 3 contributors to the high likelihood of major complications. SHAP dependence plots demonstrated insightful nonlinear interactive associations between predictors and major complications. For instance, high estimated blood loss (ie, above 500 mL) was only detrimental when operative time exceeded 9 hours. Unsupervised clustering of patients based on similarity of sources of risk allowed identification of 6 distinct surgical risk phenotypes. Conclusions and Relevance In this prognostic study using data from patients undergoing CRS, an optimized machine learning model demonstrated a superior ability to predict individual- and cohort-level risk of major complications vs traditional methods. Using the SHAP method, 6 distinct surgical phenotypes were identified based on sources of risk of major complications.
Affiliation(s)
- Huiyu Deng
- City of Hope National Medical Center, Duarte, California
| | | | - Cameron Carlin
- City of Hope National Medical Center, Duarte, California
| | | | | | | | | | | | - Ryan Hendrix
- University of Massachusetts, Worcester, Massachusetts
| | | | | | | | | | | | | | | | | | | | - Byrne Lee
- Stanford University, Stanford, California
| | - Yuman Fong
- City of Hope National Medical Center, Duarte, California
| | - Mustafa Raoof
- City of Hope National Medical Center, Duarte, California
|
23
|
Beckham JL, Wyss KM, Xie Y, McHugh EA, Li JT, Advincula PA, Chen W, Lin J, Tour JM. Machine Learning Guided Synthesis of Flash Graphene. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2022; 34:e2106506. [PMID: 35064973 DOI: 10.1002/adma.202106506] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 08/18/2021] [Revised: 12/15/2021] [Indexed: 06/14/2023]
Abstract
Advances in nanoscience have enabled the synthesis of nanomaterials, such as graphene, from low-value or waste materials through flash Joule heating. Though this capability is promising, the complex and entangled variables that govern nanocrystal formation in the Joule heating process remain poorly understood. In this work, machine learning (ML) models are constructed to explore the factors that drive the transformation of amorphous carbon into graphene nanocrystals during flash Joule heating. An XGBoost regression model of crystallinity achieves an r² score of 0.8051 ± 0.054. Feature importance assays and decision trees extracted from these models reveal key considerations in the selection of starting materials and the role of stochastic current fluctuations in flash Joule heating synthesis. Furthermore, partial dependence analyses demonstrate the importance of charge and current density as predictors of crystallinity, implying a progression from reaction-limited to diffusion-limited kinetics as flash Joule heating parameters change. Finally, a practical application of the ML models is shown by using Bayesian meta-learning algorithms to automatically improve bulk crystallinity over many Joule heating reactions. These results illustrate the power of ML as a tool to analyze complex nanomanufacturing processes and enable the synthesis of 2D crystals with desirable properties by flash Joule heating.
Affiliation(s)
- Jacob L Beckham
- Department of Chemistry, Rice University, 6100 Main Street MS 222, Houston, TX, 77005, USA
| | - Kevin M Wyss
- Department of Chemistry, Rice University, 6100 Main Street MS 222, Houston, TX, 77005, USA
| | - Yunchao Xie
- Department of Mechanical and Aerospace Engineering, University of Missouri, Columbia, MO, 65211, USA
| | - Emily A McHugh
- Department of Chemistry, Rice University, 6100 Main Street MS 222, Houston, TX, 77005, USA
| | - John Tianci Li
- Department of Chemistry, Rice University, 6100 Main Street MS 222, Houston, TX, 77005, USA
| | - Paul A Advincula
- Department of Chemistry, Rice University, 6100 Main Street MS 222, Houston, TX, 77005, USA
| | - Weiyin Chen
- Department of Chemistry, Rice University, 6100 Main Street MS 222, Houston, TX, 77005, USA
| | - Jian Lin
- Department of Mechanical and Aerospace Engineering, University of Missouri, Columbia, MO, 65211, USA
| | - James M Tour
- Department of Chemistry, Smalley-Curl Institute, NanoCarbon Center, Welch Institute for Advanced Materials, Department of Materials Science and Nanoengineering, Department of Computer Science, Rice University, 6100 Main Street MS 222, Houston, TX, 77005, USA
|
24
|
Nath B, Chowdhury R, Ni‐Meister W, Mahanta C. Predicting the Distribution of Arsenic in Groundwater by a Geospatial Machine Learning Technique in the Two Most Affected Districts of Assam, India: The Public Health Implications. GEOHEALTH 2022; 6:e2021GH000585. [PMID: 35340282 PMCID: PMC8934026 DOI: 10.1029/2021gh000585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2021] [Revised: 02/02/2022] [Accepted: 02/18/2022] [Indexed: 06/14/2023]
Abstract
Arsenic (As) is a well-known carcinogen and chemical contaminant in groundwater. The spatial heterogeneity in As distribution in groundwater makes it difficult to predict the location of safe areas for tube well installations, consumption, and agriculture. Geospatial machine learning techniques have been used to predict the location of safe and unsafe areas of groundwater As. We used a similar machine learning technique and developed a habitation-level (spatial resolution 250 m) predictive model to determine the risk and extent of As >10 μg/L in groundwater in the two most affected districts of Assam, India, with an aim to advise policymakers on targeted interventions. A random forest model was employed in Python environments to predict the probabilities of As at concentrations >10 μg/L using intrinsic and extrinsic predictor variables, which were selected for their inherent relationship with As occurrence in groundwater. The relationships between predictor variables and proportions of As occurrences >10 μg/L follow the well-documented processes leading to As release in groundwater. We identified potential As hotspots based on a probability of ≥0.7 for As >10 μg/L, including regions not previously surveyed and extending beyond previously known As hotspots. Of the total land area (6,500 km²), 25% was identified as a high-risk zone, with an estimated 155,000 people potentially consuming As through drinking water or cooking food. The ternary hazard probability map (showing high, moderate, and low risk for As >10 μg/L) could inform policymakers on establishing newer drinking water treatment plants and providing safe drinking water connections to rural households.
Affiliation(s)
- Bibhash Nath
- Department of Geography and Environmental Science, Hunter College of City University of New York, New York, NY, USA
| | - Runti Chowdhury
- Department of Geological Sciences, Gauhati University, Guwahati, India
| | - Wenge Ni‐Meister
- Department of Geography and Environmental Science, Hunter College of City University of New York, New York, NY, USA
| | - Chandan Mahanta
- Department of Civil Engineering, Indian Institute of Technology Guwahati, Guwahati, India
|
25
|
Linnik VG, Saveliev AA, Bauer TV, Minkina TM, Mandzhieva SS. Analysis and assessment of heavy metal contamination in the vicinity of Lake Atamanskoe (Rostov region, Russia) using multivariate statistical methods. ENVIRONMENTAL GEOCHEMISTRY AND HEALTH 2022; 44:511-526. [PMID: 33609207 DOI: 10.1007/s10653-021-00853-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2020] [Accepted: 02/05/2021] [Indexed: 06/12/2023]
Abstract
Assessment of spatial patterns of potentially toxic metals is one of the most urgent tasks in soil chemistry. In this study, descriptive statistics and three methods of multivariate statistical analysis, such as the hierarchical cluster analysis (HCA), correlation analysis, and conditional inference tree (CIT), were used to identify patterns and potential sources of heavy metals (Co, Ni, Cu, Cr, Pb, MnO, and Zn). The investigation was carried out on 81 sample points, using 20 testing parameters. A strong positive correlation found among Ni, Cu, Zn, and HCA results has confirmed the common origin of the elements from waste discharge. Hierarchical CA divided the 81 test sites into 5 classes based on the soil quality and HMs contamination similarity. Regression trees for Cr, Pb, Zn, and Cu were verified by the splitting factor including HMs content and soil chemistry factors. The CIT has revealed that the elements (Cr, Pb, Zn, and Cu) concentration values are split at the first level by some other metal, indicating common anthropogenic impact resulting from industrial waste discharges. The factors at the next hierarchical level of splitting, in addition to the HMs, include compounds belonging to soil chemistry variables (SiO₂, Al₂O₃, and K₂O). The CIT nonlinear regression model is in good agreement with the data: R² values for log-transformed concentrations of Cr, Pb, Zn, and Cu are equal to 0.775, 0.774, 0.775, and 0.804, respectively.
Affiliation(s)
- Vitaly G Linnik
- Institute of Geochemistry and Analytical Chemistry, Russian Academy of Sciences, Moscow, 119991, Russian Federation
| | - Anatoly A Saveliev
- Institute of Environmental Sciences, Kazan Federal University, Kazan, 420097, Russian Federation
| | - Tatiana V Bauer
- Federal Research Centre the Southern Scientific Centre of the Russian Academy of Sciences, Rostov-on-Don, 344006, Russian Federation
| | - Tatiana M Minkina
- Southern Federal University, 194/1 prosp. Stachki ave, Rostov-on-Don, 344006, Russian Federation
| | - Saglara S Mandzhieva
- Southern Federal University, 194/1 prosp. Stachki ave, Rostov-on-Don, 344006, Russian Federation.
|
26
|
Context is key: normalization as a novel approach to sport specific preprocessing of KPI's for match analysis in soccer. Sci Rep 2022; 12:1117. [PMID: 35064172 PMCID: PMC8782855 DOI: 10.1038/s41598-022-05089-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Accepted: 12/31/2021] [Indexed: 11/25/2022] Open
Abstract
Key Performance Indicators (KPIs) have been investigated, validated and applied in a multitude of sports for recruiting, coaching, opponent analysis, self-analysis, etc. Although a wide variety of in-game performance indicators have been used as KPIs, they lack sport-specific context. With the introduction of artificial intelligence and machine learning (AI/ML) in sports, the need for building intrinsic context into the independent variables is even greater, as AI/ML models seem to perform better in terms of predictability but lack interpretability. The study proposes a domain-specific feature preprocessing method (normalization) that can be utilized across a wide range of sports and demonstrates its value through a specific data transformation, using team possession as a normalizing factor while analyzing defensive performance in soccer. The study performed two linear regressions and three gradient boosting machine models to demonstrate the value of normalization while predicting defensive performance. The results demonstrate that the direction of correlation of the relevant variables changes after normalization when predicting the defensive performance of teams for the whole season. Both raw and normalized KPIs showed significant correlation with defensive performance (p < 0.001). The addition of the normalized variables contributes towards higher information gain, improved performance and increased interpretability of the models.
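Illustrative note: the abstract does not state the exact normalization formula, so the following is only a minimal sketch of one plausible reading, in which a raw defensive count is rescaled to a common 50% opponent-possession baseline. The function name, the 50% baseline, and the two match rows are hypothetical and are not taken from the paper.

```python
def per_opponent_possession(count: float, own_possession_pct: float) -> float:
    """Rescale a raw defensive count to a 50% opponent-possession baseline.

    Assumption: a team can only defend while the opponent has the ball,
    so the count is divided by the opponent's share of possession.
    """
    opponent_pct = 100.0 - own_possession_pct
    if opponent_pct <= 0:
        raise ValueError("opponent possession must be positive")
    return count * 50.0 / opponent_pct


# Hypothetical rows: (team, raw tackles, own possession in percent).
matches = [("A", 20, 60.0), ("B", 20, 40.0)]

# Both teams made 20 tackles, but team A had fewer chances to defend.
normalized = {team: per_opponent_possession(tackles, poss)
              for team, tackles, poss in matches}
```

With identical raw counts, the team that spent less time without the ball receives the higher normalized value, which is the kind of context the abstract argues raw KPIs lack.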
|
27
|
Tang M, Gao L, He B, Yang Y. Machine Learning-Based Prognostic Prediction Models of Non-Metastatic Colon Cancer: Analyses Based on Surveillance, Epidemiology and End Results Database and a Chinese Cohort. Cancer Manag Res 2022; 14:25-35. [PMID: 35018119 PMCID: PMC8742582 DOI: 10.2147/cmar.s340739] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/23/2021] [Accepted: 12/01/2021] [Indexed: 12/16/2022] Open
Abstract
Purpose The present study aimed to develop prognostic prediction models based on machine learning (ML) for non-metastatic colon cancer (CRC), which can provide a precise quantitative risk assessment and serve as an assistive method for treatment strategy development. The possibility of improving prediction accuracy using nonlinear methods compared to linear methods was investigated. Patients and Methods A cancer-specific survival (CSS) model constructed using logistic regression, extreme gradient boosting (XGBoost), and random forest algorithms was trained on the Surveillance, Epidemiology, and End Results datasets for 15,254 patients with non-metastatic CRC (split into training [70%] and internal validation [30%] datasets) and externally validated with an outpatient cohort of 311 cases from Xiyuan Hospital in China. A Chinese cohort was also used to develop recurrence and metastasis (R&M) models for CRC patients. The experiments for each model were performed 100 times to obtain average scores and 95% confidence intervals. The model performance was evaluated using the area under the receiver operating characteristic curve (AUC) values. Results The XGBoost approach showed the highest AUC values of 0.86 (0.84-0.88), 0.82 (0.81-0.83), and 0.81 (0.79-0.82) for one-, three-, and five-year CSS cohorts, respectively, along with a relatively high generalization ability. The XGBoost approach also performed best for the R&M model, with the AUC values of 0.71 (0.64-0.79), 0.79 (0.74-0.86), and 0.89 (0.82-0.95) for one-, three-, and five-year R&M cohorts, respectively. The rankings of predictor importance for the CSS and R&M models were different, and the higher model accuracy was associated with more prognostic predictors. Conclusion Three different ML algorithms for developing prognostic prediction models for non-metastatic CRC were compared. The predictive performance results showed that the nonlinear XGBoost approach performed best, suggesting that it can be used for quantifying the prognostic risk. It was also demonstrated that the model performance can be improved when more prognostic predictors are considered.
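Illustrative note: the abstract reports average AUC values with 95% confidence intervals over 100 repeated experiments but does not say how the repetitions were drawn. The sketch below assumes bootstrap resampling of a single evaluation set and a percentile interval; the toy labels and scores, the function names, and the resampling scheme are all hypothetical and should not be read as the authors' procedure.

```python
import random
import statistics


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_auc(labels, scores, runs=100, seed=0):
    """Mean AUC and a 95% percentile interval over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < runs:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes
            values.append(auc(ys, [scores[i] for i in idx]))
    values.sort()
    lower = values[int(0.025 * runs)]
    upper = values[int(0.975 * runs) - 1]
    return statistics.mean(values), lower, upper


# Hypothetical evaluation data: 1 = event, 0 = no event.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9]
mean_auc, ci_low, ci_high = repeated_auc(labels, scores)
```

Reporting the mean together with the interval, as the abstract does, makes it easier to judge whether differences between algorithms exceed run-to-run variation.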
Affiliation(s)
- Mo Tang
- Oncology Department, Xiyuan Hospital of China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
| | - Lihao Gao
- Smart City Business Unit, Baidu Inc., Beijing, People's Republic of China
| | - Bin He
- Oncology Department, Xiyuan Hospital of China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
| | - Yufei Yang
- Oncology Department, Xiyuan Hospital of China Academy of Chinese Medical Sciences, Beijing, People's Republic of China
|
28
|
Rousseau S, Polachek IS, Frenkel TI. A machine learning approach to identifying pregnant women's risk for persistent post-traumatic stress following childbirth. J Affect Disord 2022; 296:136-149. [PMID: 34601301 DOI: 10.1016/j.jad.2021.09.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/15/2021] [Revised: 08/11/2021] [Accepted: 09/12/2021] [Indexed: 10/20/2022]
Abstract
INTRO Recent literature identifies childbirth as a potentially traumatic event, following which mothers may develop symptoms of Post-Traumatic-Stress-Following-Childbirth (PTS-FC). Especially when persistent, PTS-FC may interfere with mothers' caregiving and associated infant development, underscoring the need for accurate predictive screening of risk. Drawing on recent developments in advanced statistical modeling, the aim of the current study was to identify a set of prenatal indicators and prediction rules that may accurately identify pregnant women's risk for developing symptoms of PTS-FC which persist throughout the early postpartum period. METHODS 182 women from the general population completed a comprehensive set of approximately 200 potentially predictive questions during pregnancy, and subsequently reported on their acute stress and PTS-FC at three days, one month, and three months postpartum (self-report and clinician-administered interview). Based on the postpartum acute stress and PTS-FC data, women were classified into profiles of "Stable-High-PTS-FC" and "Stable-Low-PTS-FC" by means of Latent-Class Analyses. Prenatal data were modeled to identify women at risk for "Stable-High PTS-FC". RESULTS Employing machine-learning decision-tree analyses, a total of 36 questions and 7 prediction-rules were selected. Based on a cost-rate of 15 versus 100 for false-negative "Stable-Low-PTS-FC" versus false-negative "Stable-High-PTS-FC", the final model showed 80.6% accuracy for "Stable-High-PTS-FC" prediction. DISCUSSION This study identifies a short set of questions and prediction rules that may be included in future large-scale validation studies aimed at developing and validating a brief PTS-FC screening instrument that could be implemented in general population prenatal healthcare practice. Accurate screening would allow for selective administering of preventive interventions towards women at risk.
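Illustrative note: the abstract gives a cost rate of 15 versus 100 for the two kinds of misclassification but not how the decision-tree software applied it. The sketch below shows only the generic expected-cost rule that such a ratio implies for a predicted probability; the function names and the example probabilities are hypothetical, and this is not presented as the authors' model.

```python
# Costs taken from the abstract: missing a "Stable-High" woman is weighted
# 100, missing a "Stable-Low" woman is weighted 15.
COST_MISSED_HIGH = 100.0
COST_MISSED_LOW = 15.0


def decision_threshold(cost_missed_high: float, cost_missed_low: float) -> float:
    """Probability above which predicting 'high' has the lower expected cost.

    Predict 'high' when p * cost_missed_high > (1 - p) * cost_missed_low,
    which rearranges to p > cost_missed_low / (cost_missed_low + cost_missed_high).
    """
    return cost_missed_low / (cost_missed_low + cost_missed_high)


def classify(p_high: float, threshold: float) -> str:
    """Label one woman given her estimated probability of the high profile."""
    return "Stable-High-PTS-FC" if p_high > threshold else "Stable-Low-PTS-FC"


threshold = decision_threshold(COST_MISSED_HIGH, COST_MISSED_LOW)

# Hypothetical predicted probabilities for two women.
label_a = classify(0.20, threshold)
label_b = classify(0.10, threshold)
```

Under these costs the threshold is 15/115, so a screening rule of this kind flags women at fairly low estimated risk, trading more false alarms for fewer missed cases.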
Affiliation(s)
- Sofie Rousseau
- Ziama Arkin Infancy Institute, Interdisciplinary Center (IDC) Herzliya, Hanadiv 71, 1st floor, Herzliya 46485, Israel; Baruch Ivcher School of Psychology, Interdisciplinary Center (IDC) Herzliya, HaUniversity 8, Herzliya 4610101, Israel
| | - Inbal Shlomi Polachek
- Be'er Ya'akov Medical Center, Israel; Tel Aviv University, Sackler School of Medicine, Tel Aviv, Israel
| | - Tahl I Frenkel
- Ziama Arkin Infancy Institute, Interdisciplinary Center (IDC) Herzliya, Hanadiv 71, 1st floor, Herzliya 46485, Israel; Baruch Ivcher School of Psychology, Interdisciplinary Center (IDC) Herzliya, HaUniversity 8, Herzliya 4610101, Israel.
|
29
|
Dong W, Bensken WP, Kim U, Rose J, Berger NA, Koroukian SM. Phenotype Discovery and Geographic Disparities of Late-Stage Breast Cancer Diagnosis across U.S. Counties: A Machine Learning Approach. Cancer Epidemiol Biomarkers Prev 2022; 31:66-76. [PMID: 34697059 PMCID: PMC8755627 DOI: 10.1158/1055-9965.epi-21-0838] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/07/2021] [Revised: 09/20/2021] [Accepted: 10/21/2021] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Disparities in the stage at diagnosis for breast cancer have been independently associated with various contextual characteristics. Understanding which combinations of these characteristics indicate highest risk, and where they are located, is critical to targeting interventions and improving outcomes for patients with breast cancer. METHODS The study included women diagnosed with invasive breast cancer between 2009 and 2018 from 680 U.S. counties participating in the Surveillance, Epidemiology, and End Results program. We used a machine learning approach called Classification and Regression Tree (CART) to identify county "phenotypes," combinations of characteristics that predict the percentage of patients with breast cancer presenting with late-stage disease. We then mapped the phenotypes and compared their geographic distributions. These findings were further validated using an alternate machine learning approach called random forest. RESULTS We discovered seven phenotypes of late-stage breast cancer. Common to most phenotypes associated with high risk of late-stage diagnosis were high uninsured rate, low mammography use, high area deprivation, rurality, and high poverty. Geographically, these phenotypes were most prevalent in southern and western states, while phenotypes associated with lower percentages of late-stage diagnosis were most prevalent in the northeastern states and select metropolitan areas. CONCLUSIONS The use of machine learning methods of CART and random forest together with geographic methods offers a promising avenue for future disparities research. IMPACT Local interventions to reduce late-stage breast cancer diagnosis, such as community education and outreach programs, can use machine learning and geographic modeling approaches to tailor strategies for early detection and resource allocation.
Affiliation(s)
- Weichuan Dong
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio.
- Center for Community Health Integration, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Department of Geography, Kent State University, Kent, Ohio
| | - Wyatt P Bensken
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio
| | - Uriel Kim
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Center for Community Health Integration, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio
| | - Johnie Rose
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Center for Community Health Integration, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio
| | - Nathan A Berger
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Center for Science, Health, and Society, Case Western Reserve University School of Medicine, Cleveland, Ohio
| | - Siran M Koroukian
- Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Center for Community Health Integration, Case Western Reserve University School of Medicine, Cleveland, Ohio
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, Ohio
|
30
|
Asher AL, Sammak SE, Michalopoulos GD, Yolcu YU, Alexander AY, Knightly JJ, Foley KT, Shaffrey CI, Harbaugh RE, Rose GA, Coric D, Bisson EF, Glassman SD, Mummaneni PV, Bydon M. Time trend analysis of database and registry use in the neurosurgical literature: evidence for the advance of registry science. J Neurosurg 2021:1-6. [PMID: 34920432 DOI: 10.3171/2021.9.jns212153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Anthony L Asher
- 1 Neuroscience Institute, Carolina Neurosurgery & Spine Associates, Carolinas Healthcare System, Charlotte, North Carolina
| | - Sally El Sammak
- 2 Mayo Clinic Neuro-Informatics Laboratory, Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota; 3 Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota
| | - Giorgos D Michalopoulos
- 2 Mayo Clinic Neuro-Informatics Laboratory, Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota; 3 Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota
| | - Yagiz U Yolcu
- 2 Mayo Clinic Neuro-Informatics Laboratory, Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota; 3 Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota
| | - A Yohan Alexander
- 2 Mayo Clinic Neuro-Informatics Laboratory, Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota; 3 Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota
| | | | - Kevin T Foley
- 5 Department of Neurosurgery, University of Tennessee, Memphis, Tennessee
| | - Christopher I Shaffrey
- 6 Duke Neurosurgery and Orthopaedic Surgery, Duke University Medical Center, Durham, North Carolina
| | - Robert E Harbaugh
- 7 Department of Neurosurgery, College of Medicine, Pennsylvania State University, Hershey, Pennsylvania
| | - Geoffrey A Rose
- 8 Sanger Heart & Vascular Institute, Atrium Health, Charlotte, North Carolina
| | - Domagoj Coric
- 1 Neuroscience Institute, Carolina Neurosurgery & Spine Associates, Carolinas Healthcare System, Charlotte, North Carolina
| | - Erica F Bisson
- 9 Department of Neurological Surgery, University of Utah, Salt Lake City, Utah
| | | | - Praveen V Mummaneni
- 11 Department of Neurological Surgery, University of California, San Francisco, California
| | - Mohamad Bydon
- 2 Mayo Clinic Neuro-Informatics Laboratory, Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota; 3 Department of Neurologic Surgery, Mayo Clinic, Rochester, Minnesota
|
31
|
Francisco ME, Carvajal TM, Ryo M, Nukazawa K, Amalin DM, Watanabe K. Dengue disease dynamics are modulated by the combined influences of precipitation and landscape: A machine learning approach. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 792:148406. [PMID: 34157535 DOI: 10.1016/j.scitotenv.2021.148406] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/21/2021] [Revised: 05/25/2021] [Accepted: 06/08/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND Dengue is an endemic vector-borne disease influenced by environmental factors such as landscape and climate. Previous studies separately assessed the effects of landscape and climate factors on mosquito occurrence and dengue incidence. However, both factors coexist in time and space and can interact, affecting mosquito development and dengue disease transmission. For example, eggs laid in a suitable environment can hatch after being submerged in rain water. It has been difficult for conventional statistical modeling approaches to demonstrate these combined influences due to mathematical constraints. OBJECTIVES To investigate the combined influences of landscape and climate factors on mosquito occurrence and dengue incidence. METHODS Entomological, epidemiological, and landscape data from the rainy season (July-December) were obtained from respective government agencies in Metropolitan Manila, Philippines, from 2012 to 2014. Temperature, precipitation and vegetation data were obtained through remote sensing. A random forest algorithm was used to select the landscape and climate variables. Afterward, using the identified key variables, a model-based (MOB) recursive partitioning was implemented to test the combined influences of landscape and climate factors on ovitrap index (vector mosquito occurrence) and dengue incidence. RESULTS The MOB recursive partitioning for ovitrap index indicated a high sensitivity of vector mosquito occurrence to environmental conditions generated by a combination of high residential density areas with low precipitation. Moreover, the MOB recursive partitioning indicated high sensitivity of dengue incidence to the effects of precipitation in areas with high proportions of residential density and commercial areas. CONCLUSIONS Dengue dynamics are not solely influenced by individual effects of either climate or landscape, but rather by their synergistic or combined effects. The presented findings have the potential to target vector surveillance in areas identified as suitable for mosquito occurrence under specific climatic conditions and may be relevant as part of urban planning strategies to control dengue.
Affiliation(s)
- Micanaldo Ernesto Francisco
- Center for Marine Environmental Studies (CMES), Ehime University, Matsuyama 790-8577, Japan; Graduate School of Science and Engineering, Ehime University, Matsuyama 790-8577, Japan
| | - Thaddeus M Carvajal
- Center for Marine Environmental Studies (CMES), Ehime University, Matsuyama 790-8577, Japan; Graduate School of Science and Engineering, Ehime University, Matsuyama 790-8577, Japan; Biology Department, De La Salle University, Taft Ave, Manila 1004, Philippines; Biological Control Research Unit, Center for Natural Science and Environmental Research, De La Salle University, Taft Ave, Manila, Philippines
| | - Masahiro Ryo
- Leibniz Centre for Agricultural Landscape Research (ZALF), Eberswalder Str. 84, 15374 Müncheberg, Germany; Environment and Natural Sciences, Brandenburg University of Technology Cottbus-Senftenberg, 03046 Cottbus, Germany
| | - Kei Nukazawa
- Department of Civil and Environmental Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
| | - Divina M Amalin
- Biology Department, De La Salle University, Taft Ave, Manila 1004, Philippines; Biological Control Research Unit, Center for Natural Science and Environmental Research, De La Salle University, Taft Ave, Manila, Philippines
| | - Kozo Watanabe
- Center for Marine Environmental Studies (CMES), Ehime University, Matsuyama 790-8577, Japan; Graduate School of Science and Engineering, Ehime University, Matsuyama 790-8577, Japan; Biology Department, De La Salle University, Taft Ave, Manila 1004, Philippines; Biological Control Research Unit, Center for Natural Science and Environmental Research, De La Salle University, Taft Ave, Manila, Philippines.
|
32
|
Havinga I, Marcos D, Bogaart PW, Hein L, Tuia D. Social media and deep learning capture the aesthetic quality of the landscape. Sci Rep 2021; 11:20000. [PMID: 34625594 PMCID: PMC8501120 DOI: 10.1038/s41598-021-99282-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/07/2021] [Accepted: 09/13/2021] [Indexed: 11/09/2022] Open
Abstract
People's recreation and well-being are closely related to their aesthetic enjoyment of the landscape. Ecosystem service (ES) assessments record the aesthetic contributions of landscapes to people's well-being in support of sustainable policy goals. However, the survey methods available to measure these contributions restrict modelling at large scales. As a result, most studies rely on environmental indicator models, but these do not incorporate people's actual use of the landscape. Now, social media has emerged as a rich new source of information to understand human-nature interactions while advances in deep learning have enabled large-scale analysis of the imagery uploaded to these platforms. In this study, we test the accuracy of Flickr and deep learning-based models of landscape quality using a crowdsourced survey in Great Britain. We find that this novel modelling approach generates a strong and comparable level of accuracy versus an indicator model and, in combination, captures additional aesthetic information. At the same time, social media provides a direct measure of individuals' aesthetic enjoyment, a point of view inaccessible to indicator models, as well as a greater independence of the scale of measurement and insights into how people's appreciation of the landscape changes over time. Our results show how social media and deep learning can support significant advances in modelling the aesthetic contributions of ecosystems for ES assessments.
Affiliation(s)
- Ilan Havinga
- Environmental Systems Analysis Group, Wageningen University, Wageningen, 6708 PB, The Netherlands.
| | - Diego Marcos
- Laboratory of Geo-Information Science and Remote Sensing, Wageningen University, Wageningen, 6708 PB, The Netherlands
| | - Patrick W Bogaart
- National Accounts Department, Statistics Netherlands, The Hague, 2492 JP, The Netherlands
| | - Lars Hein
- Environmental Systems Analysis Group, Wageningen University, Wageningen, 6708 PB, The Netherlands
| | - Devis Tuia
- Laboratory of Geo-Information Science and Remote Sensing, Wageningen University, Wageningen, 6708 PB, The Netherlands
- Environmental Computational Science and Earth Observation Laboratory, Ecole Polytechnique Fédérale de Lausanne, Industrie 17, Sion, Switzerland
|
33
|
Sharifi-Heris Z, Laitala J, Airola A, Rahmani AM, Bender M. Machine learning modeling for preterm birth prediction using health record: A systematic review (Preprint). JMIR Med Inform 2021; 10:e33875. [PMID: 35442214 PMCID: PMC9069277 DOI: 10.2196/33875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Revised: 01/29/2022] [Accepted: 02/26/2022] [Indexed: 11/24/2022] Open
Abstract
Background Preterm birth (PTB), a common pregnancy complication, is responsible for 35% of the 3.1 million pregnancy-related deaths each year and significantly affects around 15 million children annually worldwide. Conventional approaches to predict PTB lack reliable predictive power, leaving >50% of cases undetected. Recently, machine learning (ML) models have shown potential as an appropriate complementary approach for PTB prediction using health records (HRs). Objective This study aimed to systematically review the literature concerned with PTB prediction using HR data and the ML approach. Methods This systematic review was conducted in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement. A comprehensive search was performed in 7 bibliographic databases until May 15, 2021. The quality of the studies was assessed, and descriptive information, including descriptive characteristics of the data, ML modeling processes, and model performance, was extracted and reported. Results A total of 732 papers were screened through title and abstract. Of these 732 studies, 23 (3.1%) were screened by full text, resulting in 13 (1.8%) papers that met the inclusion criteria. The sample size varied from a minimum value of 274 to a maximum of 1,400,000. The time length for which data were extracted varied from 1 to 11 years, and the oldest and newest data were related to 1988 and 2018, respectively. Population, data set, and ML models’ characteristics were assessed, and the performance of the model was often reported based on metrics such as accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve. Conclusions Various ML models used for different HR data indicated potential for PTB prediction. However, evaluation metrics, software and package used, data size and type, selected features, and importantly data management method often remain unjustified, threatening the reliability, performance, and internal or external validity of the model. To understand the usefulness of ML in covering the existing gap, future studies should also compare it with a conventional method on the same data set.
Affiliation(s)
- Zahra Sharifi-Heris
- Sue & Bill Gross School of Nursing, University of California, Irvine, CA, United States
| | - Juho Laitala
- Department of Computing, University of Turku, Turku, Finland
| | - Antti Airola
- Department of Computing, University of Turku, Turku, Finland
| | - Amir M Rahmani
- Sue & Bill Gross School of Nursing, University of California, Irvine, CA, United States
| | - Miriam Bender
- Sue & Bill Gross School of Nursing, University of California, Irvine, CA, United States
|
34
|
Sharma D, Xu W. phyLoSTM: a novel deep learning model on disease prediction from longitudinal microbiome data. Bioinformatics 2021; 37:3707-3714. [PMID: 34213529 DOI: 10.1093/bioinformatics/btab482] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/19/2021] [Revised: 05/24/2021] [Accepted: 06/30/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Research shows that the human microbiome is highly dynamic on longitudinal timescales, changing with diet or due to medical interventions. In this paper, we propose a novel deep learning framework, "phyLoSTM", using a combination of Convolutional Neural Networks and Long Short Term Memory Networks (LSTM) for feature extraction and analysis of temporal dependency in longitudinal microbiome sequencing data, along with the host's environmental factors, for disease prediction. Additional novelty is proposed in handling variable timepoints in subjects through LSTMs, as well as in weight balancing between imbalanced cases and controls. RESULTS We simulated 100 datasets across multiple time points for model testing. To demonstrate the model's effectiveness, we also applied this novel method to two real longitudinal human microbiome studies: (i) the DIABIMMUNE three-country cohort with food allergy outcomes (Milk, Egg, Peanut and Overall); (ii) the DiGiulio study with preterm delivery as outcome. Extensive analysis and comparison of our approach yield encouraging performance, with an AUC of 0.897 (increased by 5%) on simulated studies and AUCs of 0.762 (increased by 19%) and 0.713 (increased by 8%) on the two real longitudinal microbiome studies, respectively, as compared to the next best performing method, Random Forest. The proposed methodology improves predictive accuracy on longitudinal human microbiome studies containing spatially correlated data, and evaluates the change of microbiome composition contributing to outcome prediction. AVAILABILITY AND IMPLEMENTATION https://github.com/divya031090/phyLoSTM.
Affiliation(s)
- Divya Sharma
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada
| | - Wei Xu
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada; Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
|
35
|
Sharma D, Paterson AD, Xu W. TaxoNN: ensemble of neural networks on stratified microbiome data for disease prediction. Bioinformatics 2021; 36:4544-4550. [PMID: 32449747 PMCID: PMC7750934 DOI: 10.1093/bioinformatics/btaa542] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 03/10/2020] [Revised: 05/08/2020] [Accepted: 05/19/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Research supports the potential use of microbiome as a predictor of some diseases. Motivated by the findings that microbiome data is complex in nature, and there is an inherent correlation due to hierarchical taxonomy of microbial Operational Taxonomic Units (OTUs), we propose a novel machine learning method incorporating a stratified approach to group OTUs into phylum clusters. Convolutional Neural Networks (CNNs) were used to train within each of the clusters individually. Further, through an ensemble learning approach, features obtained from each cluster were then concatenated to improve prediction accuracy. Our two-step approach comprising stratification prior to combining multiple CNNs, aided in capturing the relationships between OTUs sharing a phylum efficiently, as compared to using a single CNN ignoring OTU correlations. RESULTS We used simulated datasets containing 168 OTUs in 200 cases and 200 controls for model testing. Thirty-two OTUs, potentially associated with risk of disease were randomly selected and interactions between three OTUs were used to introduce non-linearity. We also implemented this novel method in two human microbiome studies: (i) Cirrhosis with 118 cases, 114 controls; (ii) type 2 diabetes (T2D) with 170 cases, 174 controls; to demonstrate the model's effectiveness. Extensive experimentation and comparison against conventional machine learning techniques yielded encouraging results. We obtained mean AUC values of 0.88, 0.92, 0.75, showing a consistent increment (5%, 3%, 7%) in simulations, Cirrhosis and T2D data, respectively, against the next best performing method, Random Forest. AVAILABILITY AND IMPLEMENTATION https://github.com/divya031090/TaxoNN_OTU. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Divya Sharma
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada M5T 3M7
| | - Andrew D Paterson
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada M5T 3M7; Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, ON, Canada, M5G 1X8
| | - Wei Xu
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada M5T 3M7; Department of Biostatistics, Princess Margaret Cancer Center, University Health Network, Toronto, ON, Canada, M5G 2C1
|
36
|
Artificial Intelligence in Acute Ischemic Stroke. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_287-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
37
|
Phellan R, Hachem B, Clin J, Mac-Thiong JM, Duong L. Real-time biomechanics using the finite element method and machine learning: Review and perspective. Med Phys 2020; 48:7-18. [PMID: 33222226 DOI: 10.1002/mp.14602] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/26/2020] [Revised: 09/26/2020] [Accepted: 11/02/2020] [Indexed: 12/27/2022] Open
Abstract
PURPOSE The finite element method (FEM) is the preferred method to simulate phenomena in anatomical structures. However, purely FEM-based mechanical simulations require considerable time, limiting their use in clinical applications that require real-time responses, such as haptics simulators. Machine learning (ML) approaches have been proposed to help with the reduction of the required time. The present paper reviews cases where ML could help to generate faster simulations, without considerably affecting the performance results. METHODS This review details the ML approaches used, considering the anatomical structures involved, the data collection strategies, the selected ML algorithms, with corresponding features, the metrics used for validation, and the resulting time gains. RESULTS A total of 41 references were found. ML algorithms are mainly trained with FEM-based simulations in 32 publications. The preferred ML approach is neural networks, including deep learning in 35 publications. Tissue deformation is simulated in 18 applications, but other features are also considered. The average distance error and mean squared error are the most frequently used performance metrics, in 14 and 17 publications, respectively. The time gains were considerable, going from hours or minutes for purely FEM-based simulations to milliseconds, when using ML. CONCLUSIONS ML algorithms can be used to accelerate FEM-based biomechanical simulations of anatomical structures, possibly reaching real-time responses. Fast and real-time simulations of anatomical structures, generated with ML algorithms, can help to reduce the time required by FEM-based simulations and accelerate their adoption in the clinical practice.
Affiliation(s)
- Renzo Phellan
- ETS Montreal, University of Quebec, 1100 Notre-Dame West, Montreal, QC, Canada
| | - Bahe Hachem
- Spinologics Inc., 6750 Esplanade Avenue #290, Montreal, QC, Canada
| | - Julien Clin
- Spinologics Inc., 6750 Esplanade Avenue #290, Montreal, QC, Canada
| | | | - Luc Duong
- ETS Montreal, University of Quebec, 1100 Notre-Dame West, Montreal, QC, Canada
|
38
|
Plant functional traits are correlated with species persistence in the herb layer of old-growth beech forests. Sci Rep 2020; 10:19253. [PMID: 33159118 PMCID: PMC7648635 DOI: 10.1038/s41598-020-76289-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/14/2018] [Accepted: 10/20/2020] [Indexed: 02/06/2023] Open
Abstract
This paper explores which traits are correlated with fine-scale (0.25 m²) species persistence patterns in the herb layer of old-growth forests. Four old-growth beech forests representing different climatic contexts (presence or absence of summer drought period) were selected along a north–south gradient in Italy. Eight surveys were conducted in each of the sites during the period spanning 1999–2011. We found that fine-scale species persistence was correlated with different sets of plant functional traits, depending on local ecological context. Seed mass was found to be important for fine-scale species persistence in the northern sites, while clonal and bud-bank traits were markedly correlated with persistence in the southern sites characterised by summer drought. Leaf traits appeared to correlate with species persistence in both the drier and the wetter sites. However, we found that different attributes, i.e. helomorphic vs scleromorphic leaves, were correlated with species persistence in the northernmost and southernmost sites, respectively. These differences appear to be dependent on local trait adaptation rather than plant phylogenetic history. Our findings suggest that the persistent species in the old-growth forests might adopt an acquisitive resource-use strategy (i.e. helomorphic leaves with high SLA) with higher seed mass in sites without summer drought, while under water-stressed conditions persistent species have a conservative resource-use strategy (i.e. scleromorphic leaves with low SLA) with an increased importance of clonal and resprouting ability.
|
39
|
Comparison of the Tree-Based Machine Learning Algorithms to Cox Regression in Predicting the Survival of Oral and Pharyngeal Cancers: Analyses Based on SEER Database. Cancers (Basel) 2020; 12:2802. [PMID: 33003533 PMCID: PMC7600270 DOI: 10.3390/cancers12102802] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 08/17/2020] [Revised: 09/23/2020] [Accepted: 09/27/2020] [Indexed: 12/24/2022] Open
Abstract
Simple Summary Formulating accurate survival prediction models of oral and pharyngeal cancers (OPCs) is important, as they might impact the decisions of clinicians and patients. Improving the quality of these clinical prediction modelling studies can benefit the reliability of the developed models and facilitate their implementation in clinical practice. Given the growing use of machine learning methods in cancer research, we present the use of popular tree-based machine learning algorithms and compare them to the standard Cox regression, with the aim of predicting OPC survival. The predictive models discussed here are based on a large cancer registry dataset incorporating various prognosis factors and different forms of bias. The comparable predictive performance between Cox and tree-based models suggested that these machine learning algorithms provide non-parametric alternatives to Cox regression and are of clinical use for estimating the survival probability of patients with OPCs. Abstract This study aims to demonstrate the use of tree-based machine learning algorithms to predict the 3- and 5-year disease-specific survival of oral and pharyngeal cancers (OPCs) and compare their performance with the traditional Cox regression. A total of 21,154 individuals diagnosed with OPCs between 2004 and 2009 were obtained from the Surveillance, Epidemiology, and End Results (SEER) database. Three tree-based machine learning algorithms (survival tree (ST), random forest (RF) and conditional inference forest (CF)), together with a reference technique (Cox proportional hazard models (Cox)), were used to develop the survival prediction models. To handle the missing values in predictors, we applied the substantive model compatible version of the fully conditional specification imputation approach to the Cox model, whereas we used RF to impute missing data for the ST, RF and CF models. 
For internal validation, we used 10-fold cross-validation with 50 iterations in the model development datasets. Following this, model performance was evaluated using the C-index, integrated Brier score (IBS) and calibration curves in the test datasets. For predicting the 3-year survival of OPCs with the complete cases, the C-index values in the development sets were 0.77 (0.77, 0.77), 0.70 (0.70, 0.70), 0.83 (0.83, 0.84) and 0.83 (0.83, 0.86) for Cox, ST, RF and CF, respectively. Similar results were observed in the 5-year survival prediction models, with C-index values for Cox, ST, RF and CF of 0.76 (0.76, 0.76), 0.69 (0.69, 0.70), 0.83 (0.83, 0.83) and 0.85 (0.84, 0.86), respectively, in the development datasets. The prediction error curves based on IBS showed a similar pattern for these models. The predictive performance remained unchanged in the analyses with imputed data. Additionally, a free web-based calculator was developed for potential clinical use. In conclusion, compared to Cox regression, ST had a lower and RF and CF had a higher predictive accuracy in predicting the 3- and 5-year OPC survival using SEER data. The RF and CF algorithms provide non-parametric alternatives to Cox regression that are of clinical use for estimating the survival probability of patients with OPCs.
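For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. The abstract does not say which estimator or software was used, so this pairwise definition is an assumption, and the four follow-up times, event flags and risk scores are made-up toy values.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's follow-up time. Ties in risk count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third subject is censored at t = 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]

c_index = concordance_index(times, events, risks)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the 0.69 to 0.85 values in the abstract should be read.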
|
40
|
Woody Aboveground Biomass Mapping of the Brazilian Savanna with a Multi-Sensor and Machine Learning Approach. REMOTE SENSING 2020. [DOI: 10.3390/rs12172685] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/16/2022]
Abstract
The tropical savanna in Brazil known as the Cerrado covers circa 23% of the Brazilian territory, but only 3% of this area is protected. High rates of deforestation and degradation in the woodland and forest areas have made the Cerrado the second-largest source of carbon emissions in Brazil. However, data on these emissions are highly uncertain because of the spatial and temporal variability of the aboveground biomass (AGB) in this biome. Remote-sensing data combined with local vegetation inventories provide the means to quantify the AGB at large scales. Here, we quantify the spatial distribution of woody AGB in the Rio Vermelho watershed, located in the centre of the Cerrado, at a high spatial resolution of 30 metres, with a random forest (RF) machine-learning approach. We produced the first high-resolution map of the AGB for a region in the Brazilian Cerrado using a combination of vegetation inventory plots, airborne light detection and ranging (LiDAR) data, and multispectral and radar satellite images (Landsat 8 and ALOS-2/PALSAR-2). A combination of random forest (RF) models and jackknife analyses enabled us to select the best remote-sensing variables to quantify the AGB on a large scale. Overall, the relationship between the ground data from vegetation inventories and remote-sensing variables was strong (R² = 0.89), with a root-mean-square error (RMSE) of 7.58 Mg ha⁻¹ and a bias of 0.43 Mg ha⁻¹.
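The three accuracy figures quoted at the end of this abstract (R², RMSE and bias) can be computed as in the sketch below. The abstract does not state which definition of R² or which sign convention for bias was used, so the coefficient-of-determination form and the predicted-minus-observed convention here are assumptions, and the four observed and predicted values are made up.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the data."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def bias(observed, predicted):
    """Mean signed error; positive means over-prediction (assumed convention)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical plot-level biomass values (Mg per hectare).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 41.0]

print(rmse(observed, predicted), bias(observed, predicted), r_squared(observed, predicted))
```

Reporting bias alongside RMSE separates systematic over- or under-estimation from overall scatter, which is why the abstract gives both.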
|
41
|
Vahl de Paula B, Squizani Arruda W, Etienne Parent L, Frank de Araujo E, Brunetto G. Nutrient Diagnosis of Eucalyptus at the Factor-Specific Level Using Machine Learning and Compositional Methods. PLANTS 2020; 9:1049. [PMID: 32824810 PMCID: PMC7464882 DOI: 10.3390/plants9081049] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/21/2020] [Revised: 08/11/2020] [Accepted: 08/13/2020] [Indexed: 12/21/2022]
Abstract
Brazil is home to 30% of the world’s Eucalyptus trees. The seedlings are fertilized at plantation to support biomass production until canopy closure. Thereafter, fertilization is guided by state standards that may not apply at the local scale where myriads of growth factors interact. Our objective was to customize the nutrient diagnosis of young Eucalyptus trees down to factor-specific levels. We collected 1861 observations across eight clones, 48 soil types, and 148 locations in southern Brazil. Cutoff diameter between low- and high-yielding specimens at breast height was set at 4.3 cm. The random forest classification model returned a relatively uninformative area under the curve (AUC) of 0.63 using tissue compositions only, and an informative AUC of 0.78 after adding local features. Compared to nutrient levels from quartile compatibility intervals of nutritionally balanced specimens at high-yield level, state guidelines appeared to be too high for Mg, B, Mn, and Fe and too low for Cu and Zn. Moreover, diagnosis using concentration ranges collapsed in the multivariate Euclidean hyper-space by denying nutrient interactions. Factor-specific diagnosis detected nutrient imbalance by computing the Euclidean distance between centered log-ratio transformed compositions of defective and successful neighbors at a local scale. Downscaling regional nutrient standards may thus fail to account for factor interactions at a local scale. Documenting factors at a local scale requires large datasets through close collaboration between stakeholders.
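The distance computation named in this abstract, the Euclidean distance between centered log-ratio (clr) transformed compositions, can be sketched as follows. The three-part compositions and the function names are hypothetical, and the sketch assumes strictly positive parts (zeros would need replacement before taking logarithms).

```python
import math


def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(a, b):
    """Euclidean distance between the clr transforms of two compositions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(clr(a), clr(b))))


# Hypothetical tissue compositions (fractions of three nutrients).
balanced = [0.50, 0.30, 0.20]
defective = [0.60, 0.30, 0.10]

distance = aitchison_distance(balanced, defective)

# The same defective specimen expressed in percent instead of fractions.
rescaled = [x * 100 for x in defective]
print(distance, aitchison_distance(balanced, rescaled))
```

Because the clr transform removes the overall scale, the distance is the same whether the parts are given as fractions or percentages, and it depends on all nutrient ratios jointly rather than on one concentration range at a time.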
Affiliation(s)
- Betania Vahl de Paula
- Departemento dos Solos, Universidade Federal de Santa Maria, Av. Roraima, 1000-Camobi, Santa Maria-RS 97105-900, Brazil; (W.S.A.); (L.E.P.); (G.B.)
- Correspondence: Tel.: +55-5532177117
| | - Wagner Squizani Arruda
- Departemento dos Solos, Universidade Federal de Santa Maria, Av. Roraima, 1000-Camobi, Santa Maria-RS 97105-900, Brazil; (W.S.A.); (L.E.P.); (G.B.)
| | - Léon Etienne Parent
- Departemento dos Solos, Universidade Federal de Santa Maria, Av. Roraima, 1000-Camobi, Santa Maria-RS 97105-900, Brazil; (W.S.A.); (L.E.P.); (G.B.)
- Department of Soils and Agrifood Engineering, Laval University, Quebec, QC G1V 0A6, Canada
| | - Elias Frank de Araujo
- Soil and Management Researcher of CMPC-Cellulose Rio Grandense, Rua São Geraldo 1680-Guaíba–RS, Brazil;
| | - Gustavo Brunetto
- Departemento dos Solos, Universidade Federal de Santa Maria, Av. Roraima, 1000-Camobi, Santa Maria-RS 97105-900, Brazil; (W.S.A.); (L.E.P.); (G.B.)
|
42
|
Mazumdar M, Lin JYJ, Zhang W, Li L, Liu M, Dharmarajan K, Sanderson M, Isola L, Hu L. Comparison of statistical and machine learning models for healthcare cost data: a simulation study motivated by Oncology Care Model (OCM) data. BMC Health Serv Res 2020; 20:350. [PMID: 32334595 PMCID: PMC7183716 DOI: 10.1186/s12913-020-05148-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 06/25/2019] [Accepted: 03/24/2020] [Indexed: 01/08/2023] Open
Abstract
Background The Oncology Care Model (OCM) was developed as a payment model to encourage participating practices to provide better-quality care for cancer patients at a lower cost. The risk-adjustment model used in OCM is a Gamma generalized linear model (Gamma GLM) with log-link. The predicted value of expense for the episodes identified for our academic medical center (AMC), based on the model fitted to the national data, did not correlate well with our observed expense. This motivated us to fit the Gamma GLM to our AMC data and compare it with two other flexible modeling methods: Random Forest (RF) and Partially Linear Additive Quantile Regression (PLAQR). We also performed a simulation study to assess comparative performance of these methods and examined the impact of non-linearity and interaction effects, two understudied aspects in the field of cost prediction. Methods The simulation was designed with an outcome of cost generated from four distributions: Gamma, Weibull, Log-normal with a heteroscedastic error term, and heavy-tailed. Simulation parameters both similar to and different from OCM data were considered. The performance metrics considered were the root mean square error (RMSE), mean absolute prediction error (MAPE), and cost accuracy (CA). Bootstrap resampling was utilized to estimate the operating characteristics of the performance metrics, which were described by boxplots. Results RF attained the best performance with lowest RMSE, MAPE, and highest CA for most of the scenarios. When the models were misspecified, their performance was further differentiated. Model performance differed more for non-exponential than exponential outcome distributions. Conclusions RF outperformed Gamma GLM and PLAQR in predicting overall and top decile costs. RF demonstrated improved prediction under various scenarios common in healthcare cost modeling. Additionally, RF did not require prespecification of outcome distribution, nonlinearity effect, or interaction terms. 
Therefore, RF appears to be the best tool to predict average cost. However, when the goal is to estimate extreme expenses, e.g., high cost episodes, the accuracy gained by RF versus its computational costs may need to be considered.
Affiliation(s)
- Madhu Mazumdar
- Institute for Healthcare Delivery Science, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Tisch Cancer Institute, Mount Sinai Hospital, New York, NY, 10029, USA
| | - Jung-Yi Joyce Lin
- Institute for Healthcare Delivery Science, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Tisch Cancer Institute, Mount Sinai Hospital, New York, NY, 10029, USA
| | - Wei Zhang
- Department of Mathematics and Statistics, University of Arkansas at Little Rock, Little Rock, AR, 72204, USA
| | - Lihua Li
- Institute for Healthcare Delivery Science, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Tisch Cancer Institute, Mount Sinai Hospital, New York, NY, 10029, USA
| | - Mark Liu
- Tisch Cancer Institute, Mount Sinai Hospital, New York, NY, 10029, USA
| | - Kavita Dharmarajan
- Department of Radiation Oncology, Brookdale Department of Geriatrics and Palliative Medicine Mount Sinai Hospital, Icahn School of Medicine at Mount Sinai, New York, USA
| | - Mark Sanderson
- Department of Health System Design and Global Health, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Luis Isola
- Tisch Cancer Institute, Mount Sinai Hospital, New York, NY, 10029, USA
| | - Liangyuan Hu
- Institute for Healthcare Delivery Science, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA; Tisch Cancer Institute, Mount Sinai Hospital, New York, NY, 10029, USA
|
43
|
D'Alelio D, Rampone S, Cusano LM, Morfino V, Russo L, Sanseverino N, Cloern JE, Lomas MW. Machine learning identifies a strong association between warming and reduced primary productivity in an oligotrophic ocean gyre. Sci Rep 2020; 10:3287. [PMID: 32098970 PMCID: PMC7042350 DOI: 10.1038/s41598-020-59989-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 10/27/2019] [Accepted: 02/06/2020] [Indexed: 12/20/2022] Open
Abstract
Phytoplankton play key roles in the oceans by regulating global biogeochemical cycles and production in marine food webs. Global warming is thought to affect phytoplankton production both directly, by impacting their photosynthetic metabolism, and indirectly, by modifying the physical environment in which they grow. In this respect, the Bermuda Atlantic Time-series Study (BATS) in the Sargasso Sea (North Atlantic gyre) provides a unique opportunity to explore effects of warming on phytoplankton production across the vast oligotrophic ocean regions because it is one of the few multidecadal records of measured net primary productivity (NPP). We analysed the time series of phytoplankton primary productivity at the BATS site using machine learning (ML) techniques to show that increased water temperature over a 27-year period (1990–2016), and the consequent weakening of vertical mixing in the upper ocean, induced a negative feedback on phytoplankton productivity by reducing the availability of essential resources, nitrogen and light. The unbalanced availability of these resources with warming, coupled with ecological changes at the community level, is expected to intensify the oligotrophic state of open-ocean regions that are far from land-based nutrient sources.
Affiliation(s)
- Domenico D'Alelio
- Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Villa Comunale, I-80121, Naples, Italy.
| | - Salvatore Rampone
- Università degli Studi del Sannio, Via Delle Puglie 76, I-82100, Benevento, Italy
| | - Luigi Maria Cusano
- Università degli Studi del Sannio, Via Delle Puglie 76, I-82100, Benevento, Italy
| | - Valerio Morfino
- Università degli Studi del Sannio, Via Delle Puglie 76, I-82100, Benevento, Italy
| | - Luca Russo
- Department of Integrative Marine Ecology, Stazione Zoologica Anton Dohrn, Villa Comunale, I-80121, Naples, Italy
| | - Nadia Sanseverino
- Università degli Studi del Sannio, Via Delle Puglie 76, I-82100, Benevento, Italy
| | - James E Cloern
- United States Geological Survey (emeritus), Menlo Park, CA, USA
| | - Michael W Lomas
- Bigelow Laboratory for Ocean Sciences, East Boothbay, ME, USA.
|
44
|
Wang X, Yang YQ, Liu SH, Hong XY, Sun XF, Shi JH. Comparing different venous thromboembolism risk assessment machine learning models in Chinese patients. J Eval Clin Pract 2020; 26:26-34. [PMID: 31840330 DOI: 10.1111/jep.13324] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/11/2019] [Revised: 11/06/2019] [Accepted: 11/14/2019] [Indexed: 12/14/2022]
Abstract
OBJECTIVE Venous thromboembolism (VTE) is a fatal complication and the most common preventable cause of death in hospitals. The risk-to-benefit ratio of thromboprophylaxis depends on the performance of the risk assessment model. A linear model, the Padua model, is recommended for medical inpatients in the United States but is not suitable for Chinese inpatients due to differences in race and disease spectrum. Currently, machine learning (ML) methods show advantages in modeling complex data patterns and have been applied to clinical data analysis. This study aimed to build VTE risk assessment ML models among Chinese inpatients and compare the predictive validity of the ML models with that of the Padua model. METHODS We used 376 patients, including 188 patients with VTE, to build a model and then evaluate the predictive validity of the model in a consecutive clinical dataset from Peking Union Medical College Hospital. Nine widely used ML methods were trained on the model derivation set and then compared with the Padua model. RESULTS Among the nine ML methods, random forest (RF), boosting-based methods, and logistic regression achieved a higher specificity, Youden index, positive predictive value, and area under the receiver operating characteristic curve than the Padua model on both the test and clinical validation sets. However, their sensitivities were inferior to that of the Padua model. Combined with the receiver operating characteristic curve, RF, as the best performing model, maintained high specificity with relatively better sensitivity and captured VTE patients' patterns more precisely. CONCLUSIONS Advances in ML technology provide powerful tools for medical data analysis, and choosing models conforming to the disease pattern would achieve good performance. Popular ML models do not surpass the Padua model on all indicators of validity, and the drawback of low sensitivity should be improved upon in the future.
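The validity indicators compared in this abstract (sensitivity, specificity, positive predictive value and the Youden index) all derive from a 2x2 confusion matrix, as the sketch below shows. The counts are made up for illustration and are not taken from the study; the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard validity indicators from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true VTE cases flagged
    specificity = tn / (tn + fp)   # share of non-VTE patients cleared
    ppv = tp / (tp + fp)           # share of flagged patients with VTE
    youden_index = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "youden_index": youden_index,
    }


# Hypothetical counts: 50 patients with VTE, 100 without.
metrics = diagnostic_metrics(tp=30, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The toy numbers mirror the trade-off the abstract describes: a model can score well on specificity, PPV and the Youden index while still missing a substantial share of true cases (low sensitivity).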
Affiliation(s)
- Xin Wang
- Department of Ultrasound, Peking Union Medical College Hospital, Beijing, China; Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China
| | - Yu-Qing Yang
- Computer Science and Technology, Tsinghua University, Beijing, China
| | - Si-Hua Liu
- Department of Respiration, Peking Union Medical College Hospital, Beijing, China; Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China
| | - Xin-Yu Hong
- Department of Respiration, Peking Union Medical College Hospital, Beijing, China; Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China
| | - Xue-Feng Sun
- Department of Respiration, Peking Union Medical College Hospital, Beijing, China; Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China
| | - Ju-Hong Shi
- Department of Respiration, Peking Union Medical College Hospital, Beijing, China; Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China
|
45
|
Lehmann A, Zheng W, Ryo M, Soutschek K, Roy J, Rongstock R, Maaß S, Rillig MC. Fungal Traits Important for Soil Aggregation. Front Microbiol 2020; 10:2904. [PMID: 31998249 PMCID: PMC6962133 DOI: 10.3389/fmicb.2019.02904] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/04/2019] [Accepted: 12/02/2019] [Indexed: 01/29/2023] Open
Abstract
Soil structure, the complex arrangement of soil into aggregates and pore spaces, is a key feature of soils and soil biota. Among them, filamentous saprobic fungi have well-documented effects on soil aggregation. However, it is unclear what properties, or traits, determine the overall positive effect of fungi on soil aggregation. To achieve progress, it would be helpful to systematically investigate a broad suite of fungal species for their trait expression and the relation of these traits to soil aggregation. Here, we apply a trait-based approach to a set of 15 traits measured under standardized conditions on 31 fungal strains including Ascomycota, Basidiomycota, and Mucoromycota, all isolated from the same soil. We find large differences among these fungi in their ability to aggregate soil, including neutral to positive effects, and we document large differences in trait expression among strains. We identify biomass density, i.e., the density with which a mycelium grows (positive effects), leucine aminopeptidase activity (negative effects) and phylogeny as important factors explaining differences in soil aggregate formation (SAF) among fungal strains; importantly, growth rate was not among the important traits. Our results point to a typical suite of traits characterizing fungi that are good soil aggregators, and our findings illustrate the power of employing a trait-based approach to unravel biological mechanisms underpinning soil aggregation. Such an approach could now be extended also to other soil biota groups. In an applied context of restoration and agriculture, such trait information can inform management, for example to prioritize practices that favor the expression of more desirable fungal traits.
Affiliation(s)
- Anika Lehmann
- Ecology of Plants, Institut für Biologie, Freie Universität Berlin, Berlin, Germany
- Berlin-Brandenburg Institute of Advanced Biodiversity Research, Berlin, Germany
| | | | - Masahiro Ryo
- Ecology of Plants, Institut für Biologie, Freie Universität Berlin, Berlin, Germany
- Berlin-Brandenburg Institute of Advanced Biodiversity Research, Berlin, Germany
| | - Katharina Soutschek
- Ecology of Plants, Institut für Biologie, Freie Universität Berlin, Berlin, Germany
| | - Julien Roy
- Ecology of Plants, Institut für Biologie, Freie Universität Berlin, Berlin, Germany
- Berlin-Brandenburg Institute of Advanced Biodiversity Research, Berlin, Germany
| | - Rebecca Rongstock
- Ecology of Plants, Institut für Biologie, Freie Universität Berlin, Berlin, Germany
| | - Stefanie Maaß
- Berlin-Brandenburg Institute of Advanced Biodiversity Research, Berlin, Germany
- Plant Ecology and Nature Conservation, Institut für Biochemie und Biologie, Universität Potsdam, Potsdam, Germany
| | - Matthias C. Rillig
- Ecology of Plants, Institut für Biologie, Freie Universität Berlin, Berlin, Germany
- Berlin-Brandenburg Institute of Advanced Biodiversity Research, Berlin, Germany
|
46
|
Ryo M, Jeschke JM, Rillig MC, Heger T. Machine learning with the hierarchy-of-hypotheses (HoH) approach discovers novel pattern in studies on biological invasions. Res Synth Methods 2020; 11:66-73. [PMID: 31219681 PMCID: PMC7003914 DOI: 10.1002/jrsm.1363] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/06/2018] [Revised: 06/05/2019] [Accepted: 06/10/2019] [Indexed: 11/11/2022]
Abstract
Research synthesis on simple yet general hypotheses and ideas is challenging in scientific disciplines studying highly context-dependent systems such as medical, social, and biological sciences. This study shows that machine learning, a form of equation-free statistical modeling from artificial intelligence, is a promising synthesis tool for discovering novel patterns and the sources of controversy in a general hypothesis. We apply a decision tree algorithm, assuming that evidence from various contexts can be adequately integrated in a hierarchically nested structure. As a case study, we analyzed 163 articles that studied a prominent hypothesis in invasion biology, the enemy release hypothesis. Framing the task as a classification problem, we explored whether any of the nine attributes that classify each study can differentiate the studies' conclusions. The results corroborated that machine learning can be useful for research synthesis, as the algorithm detected patterns that had already been highlighted in previous narrative reviews. Compared with the previous synthesis study that assessed the same evidence collection based on experts' judgement, the algorithm newly proposed that the studies focusing on Asian regions mostly supported the hypothesis, suggesting that more detailed investigations in these regions can enhance our understanding of the hypothesis. We suggest that machine learning algorithms can be a promising synthesis tool especially where studies (a) reformulate a general hypothesis from different perspectives, (b) use different methods or variables, or (c) report insufficient information for conducting meta-analyses.
Affiliation(s)
- Masahiro Ryo
- Institute of Biology, Freie Universität Berlin, Berlin, Germany
- Berlin‐Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
| | - Jonathan M. Jeschke
- Institute of Biology, Freie Universität Berlin, Berlin, Germany
- Berlin‐Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Leibniz‐Institute of Freshwater Ecology and Inland Fisheries (IGB), Berlin, Germany
| | - Matthias C. Rillig
- Institute of Biology, Freie Universität Berlin, Berlin, Germany
- Berlin‐Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
| | - Tina Heger
- Berlin‐Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin, Germany
- Biodiversity Research/Systematic Botany, University of Potsdam, Potsdam, Germany
- Restoration Ecology, Technical University of Munich, Freising, Germany
|
47
|
Pichler M, Boreux V, Klein A, Schleuning M, Hartig F. Machine learning algorithms to infer trait‐matching and predict species interactions in ecological networks. Methods Ecol Evol 2019. [DOI: 10.1111/2041-210x.13329] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Indexed: 12/11/2022]
Affiliation(s)
| | - Virginie Boreux
- Nature Conservation and Landscape Ecology, University of Freiburg, Freiburg, Germany
| | | | - Matthias Schleuning
- Senckenberg Biodiversity and Climate Research Centre (SBiK‐F), Frankfurt (Main), Germany
| | - Florian Hartig
- Theoretical Ecology, University of Regensburg, Regensburg, Germany
|
48
|
Eisenhauer N, Schielzeth H, Barnes AD, Barry K, Bonn A, Brose U, Bruelheide H, Buchmann N, Buscot F, Ebeling A, Ferlian O, Freschet GT, Giling DP, Hättenschwiler S, Hillebrand H, Hines J, Isbell F, Koller-France E, König-Ries B, de Kroon H, Meyer ST, Milcu A, Müller J, Nock CA, Petermann JS, Roscher C, Scherber C, Scherer-Lorenzen M, Schmid B, Schnitzer SA, Schuldt A, Tscharntke T, Türke M, van Dam NM, van der Plas F, Vogel A, Wagg C, Wardle DA, Weigelt A, Weisser WW, Wirth C, Jochum M. A multitrophic perspective on biodiversity-ecosystem functioning research. ADV ECOL RES 2019; 61:1-54. [PMID: 31908360 PMCID: PMC6944504 DOI: 10.1016/bs.aecr.2019.06.001] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Indexed: 02/05/2023]
Abstract
Concern about the functional consequences of unprecedented loss in biodiversity has prompted biodiversity-ecosystem functioning (BEF) research to become one of the most active fields of ecological research in the past 25 years. Hundreds of experiments have manipulated biodiversity as an independent variable and found compelling support that the functioning of ecosystems increases with the diversity of their ecological communities. This research has also identified some of the mechanisms underlying BEF relationships, some context-dependencies of the strength of relationships, as well as implications for various ecosystem services that mankind depends upon. In this paper, we argue that a multitrophic perspective of biotic interactions in random and non-random biodiversity change scenarios is key to advance future BEF research and to address some of its most important remaining challenges. We discuss that the study and the quantification of multitrophic interactions in space and time facilitates scaling up from small-scale biodiversity manipulations and ecosystem function assessments to management-relevant spatial scales across ecosystem boundaries. We specifically consider multitrophic conceptual frameworks to understand and predict the context-dependency of BEF relationships. Moreover, we highlight the importance of the eco-evolutionary underpinnings of multitrophic BEF relationships. We outline that FAIR data (meeting the standards of findability, accessibility, interoperability, and reusability) and reproducible processing will be key to advance this field of research by making it more integrative. Finally, we show how these BEF insights may be implemented for ecosystem management, society, and policy. 
Given that human well-being critically depends on the multiple services provided by diverse, multitrophic communities, integrating the approaches of evolutionary ecology, community ecology, and ecosystem ecology in future BEF research will be key to refine conservation targets and develop sustainable management strategies.
Affiliation(s)
- Nico Eisenhauer
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
| | - Holger Schielzeth
- Department of Population Ecology, Institute of Ecology and Evolution, Friedrich Schiller University Jena, Jena, Germany
| | - Andrew D Barnes
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
| | - Kathryn Barry
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Johannisallee 21-23, 04103 Leipzig, Germany
| | - Aletta Bonn
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
| | - Ulrich Brose
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- EcoNetLab, Institute of Biodiversity, Friedrich Schiller University Jena, Dornburger-Str. 159, 07743 Jena, Germany
| | - Helge Bruelheide
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology / Geobotany and Botanical Garden, Martin Luther University Halle-Wittenberg, Am Kirchtor 1, 06108 Halle (Saale), Germany
| | - Nina Buchmann
- Institute of Agricultural Sciences, ETH Zurich, Universitätstr. 2, 8092 Zurich, Switzerland
| | - François Buscot
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- UFZ - Helmholtz Centre for Environmental Research, Soil Ecology Department, Theodor-Lieser-Straße 4, 06120 Halle Saale, Germany
| | - Anne Ebeling
- Institute of Ecology and Evolution, Friedrich Schiller University Jena, Dornburger Str. 159, 07743 Jena, Germany
| | - Olga Ferlian
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
| | - Grégoire T Freschet
- Centre d'Ecologie Fonctionnelle et Evolutive, UMR 5175 (CNRS - Université de Montpellier - Université Paul-Valéry Montpellier - EPHE), 1919 Route de Mende, Montpellier 34293, France
| | - Darren P Giling
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Ecology and Evolution, Friedrich Schiller University Jena, Dornburger Straße 159, 07743 Jena, Germany
| | - Stephan Hättenschwiler
- Centre d'Ecologie Fonctionnelle et Evolutive, UMR 5175 (CNRS - Université de Montpellier - Université Paul-Valéry Montpellier - EPHE), 1919 Route de Mende, Montpellier 34293, France
| | - Helmut Hillebrand
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute for Chemistry and Biology of Marine Environments [ICBM], Carl-von-Ossietzky University Oldenburg, Schleusenstrasse 1, 26382 Wilhelmshaven, Germany
| | - Jes Hines
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
| | - Forest Isbell
- Department of Ecology, Evolution and Behavior, University of Minnesota, 1479 Gortner Avenue, St. Paul, MN 55108, USA
| | - Eva Koller-France
- Karlsruher Institut für Technologie (KIT), Institut für Geographie und Geoökologie, Reinhard-Baumeister-Platz 1, 76131 Karlsruhe, Germany
| | - Birgitta König-Ries
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Computer Science, Friedrich Schiller Universität Jena, Ernst-Abbe-Platz 2, 07743 Jena, Germany
| | - Hans de Kroon
- Radboud University, Institute for Water and Wetland Research, Animal Ecology and Physiology & Experimental Plant Ecology, PO Box 9100, 6500 GL Nijmegen, The Netherlands
| | - Sebastian T Meyer
- Terrestrial Ecology Research Group, Technical University of Munich, School of Life Sciences Weihenstephan, Hans-Carl-von-Carlowitz-Platz 2, 85354 Freising, Germany
| | - Alexandru Milcu
- Ecotron Européen de Montpellier, Centre National de la Recherche Scientifique (CNRS), Unité Propre de Service 3248, Campus Baillarguet, Montferrier-sur-Lez, France
- Centre d'Ecologie Fonctionnelle et Evolutive, UMR 5175 (CNRS - Université de Montpellier - Université Paul-Valéry Montpellier - EPHE), 1919 Route de Mende, Montpellier 34293, France
| | - Jörg Müller
- Field Station Fabrikschleichach, Department of Animal Ecology and Tropical Biology, Biocenter, University of Würzburg, Glashüttenstraße 5, 96181 Rauhenebrach, Germany
- Bavarian Forest National Park, Freyunger Str. 2, 94481 Grafenau, Germany
| | - Charles A Nock
- Geobotany, Faculty of Biology, University of Freiburg, Schaenzlestrasse 1, 79104 Freiburg, Germany
- Department of Renewable Resources, University of Alberta, 751 General Services Building, Edmonton, Canada, T6G 2H1
| | - Jana S Petermann
- Department of Biosciences, University of Salzburg, Hellbrunner Str. 34, 5020 Salzburg, Austria
| | - Christiane Roscher
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- UFZ - Helmholtz Centre for Environmental Research, Department Physiological Diversity, Permoserstrasse 15, 04318 Leipzig, Germany
| | - Christoph Scherber
- Institute of Landscape Ecology, University of Münster, Heisenbergstr. 2, 48149 Münster, Germany
| | - Michael Scherer-Lorenzen
- Geobotany, Faculty of Biology, University of Freiburg, Schaenzlestrasse 1, 79104 Freiburg, Germany
| | - Bernhard Schmid
- Department of Geography, University of Zürich, 190 Winterthurerstrasse, 8057, Zürich, Switzerland
| | | | - Andreas Schuldt
- Forest Nature Conservation, Faculty of Forest Sciences and Forest Ecology, University of Göttingen, Buesgenweg 3, 37077 Goettingen, Germany
| | - Teja Tscharntke
- Agroecology, Dept. of Crop Sciences, University of Göttingen, Germany
- Centre of Biodiversity and Sustainable Land Use (CBL), University of Göttingen, Germany
| | - Manfred Türke
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biological and Medical Imaging (IBMI), Helmholtz Zentrum München (HMGU) - German Research Center for Environmental Health, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
| | - Nicole M van Dam
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biodiversity, Friedrich Schiller University Jena, Dornburger-Str. 159, 07743 Jena, Germany
| | - Fons van der Plas
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
| | - Anja Vogel
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Ecology and Evolution, Friedrich Schiller University Jena, Dornburger Straße 159, 07743 Jena, Germany
| | - Cameron Wagg
- Fredericton Research and Development Centre, Agriculture and Agri-Food Canada, 850 Lincoln Road, E3B 8B7, Fredericton, Canada
- Department of Evolutionary Biology and Environmental Studies, University of Zürich, 190 Winterthurerstrasse, 8057, Zürich, Switzerland
| | - David A Wardle
- Asian School of the Environment, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798
| | - Alexandra Weigelt
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Johannisallee 21-23, 04103 Leipzig, Germany
| | - Wolfgang W Weisser
- Terrestrial Ecology Research Group, Technical University of Munich, School of Life Sciences Weihenstephan, Hans-Carl-von-Carlowitz-Platz 2, 85354 Freising, Germany
| | - Christian Wirth
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Johannisallee 21-23, 04103 Leipzig, Germany
| | - Malte Jochum
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Biology, Leipzig University, Deutscher Platz 5e, 04103 Leipzig, Germany
- Institute of Plant Sciences, University of Bern, Altenbergrain 21, 3013 Bern, Switzerland
| |
|
49
|
Jeltsch F, Grimm V, Reeg J, Schlägel UE. Give chance a chance: from coexistence to coviability in biodiversity theory. Ecosphere 2019. [DOI: 10.1002/ecs2.2700] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 02/06/2023]
Affiliation(s)
- Florian Jeltsch
- Department of Plant Ecology and Nature Conservation, University of Potsdam, Am Mühlenberg 3, Potsdam‐Golm DE‐14476, Germany
- Berlin‐Brandenburg Institute of Advanced Biodiversity Research (BBIB), Berlin DE‐14195, Germany
| | - Volker Grimm
- Department of Plant Ecology and Nature Conservation, University of Potsdam, Am Mühlenberg 3, Potsdam‐Golm DE‐14476, Germany
- Department of Ecological Modelling, Helmholtz Centre for Environmental Research‐UFZ, Permoserstraße 15, Leipzig 04318, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle‐Jena‐Leipzig, Deutscher Platz 5e, Leipzig 04103, Germany
| | - Jette Reeg
- Department of Plant Ecology and Nature Conservation, University of Potsdam, Am Mühlenberg 3, Potsdam‐Golm DE‐14476, Germany
| | - Ulrike E. Schlägel
- Department of Plant Ecology and Nature Conservation, University of Potsdam, Am Mühlenberg 3, Potsdam‐Golm DE‐14476, Germany
| |
|
50
|
On the Synergistic Use of Optical and SAR Time-Series Satellite Data for Small Mammal Disease Host Mapping. REMOTE SENSING 2018. [DOI: 10.3390/rs11010039] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
(1) Background: Echinococcus multilocularis (Em), a highly pathogenic parasitic tapeworm, is responsible for a significant burden of human disease. In this study, optical and time-series Synthetic Aperture Radar (SAR) data is used synergistically to model key land cover characteristics driving the spatial distributions of two small mammal intermediate host species, Ellobius tancrei and Microtus gregalis, which facilitate Em transmission in a highly endemic area of Kyrgyzstan. (2) Methods: A series of land cover maps are derived from (a) single-date Landsat Operational Land Imager (OLI) imagery, (b) time-series Sentinel-1 SAR data, and (c) Landsat OLI and time-series Sentinel-1 SAR data in combination. Small mammal distributions are analyzed in relation to the surrounding land cover class coverage using random forests, before being applied predictively over broader areas. A comparison of models derived from the three land cover maps are made, assessing their potential for use in cloud-prone areas. (3) Results: Classification accuracies demonstrated the combined OLI-SAR classification to be of highest accuracy, with the single-date OLI and time-series SAR derived classifications of equivalent quality. Random forest analysis identified statistically significant positive relationships between E. tancrei density and agricultural land, and between M. gregalis density and water and bushes. Predictive application of random forest models identified hotspots of high relative density of E. tancrei and M. gregalis across the broader study area. (4) Conclusions: This offers valuable information to improve the targeting of limited-resource disease control activities to disrupt disease transmission in this area. 
Time-series SAR derived land cover maps are shown to be of equivalent quality to those generated from single-date optical imagery, which enables application of these methods in cloud-affected areas where, previously, this was not possible due to the sparsity of cloud-free optical imagery.
|