1
Herrera-Luis E, Benke K, Volk H, Ladd-Acosta C, Wojcik GL. Gene-environment interactions in human health. Nat Rev Genet 2024:10.1038/s41576-024-00731-z. [PMID: 38806721 DOI: 10.1038/s41576-024-00731-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/03/2024] [Indexed: 05/30/2024]
Abstract
Gene-environment interactions (G × E), the interplay of genetic variation with environmental factors, have a pivotal impact on human complex traits and diseases. Statistically, G × E can be assessed by determining the deviation from expectation of predictive models based solely on the phenotypic effects of genetics or environmental exposures. Despite the unprecedented, widespread and diverse use of G × E analytical frameworks, heterogeneity in their application and reporting hinders their applicability in public health. In this Review, we discuss study design considerations as well as G × E analytical frameworks to assess polygenic liability dependent on the environment, to identify specific genetic variants exhibiting G × E, and to characterize environmental context for these dynamics. We conclude with recommendations to address the most common challenges and pitfalls in the conceptualization, methodology and reporting of G × E studies, as well as future directions.
Affiliation(s)
- Esther Herrera-Luis
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Kelly Benke
  - Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Heather Volk
  - Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Christine Ladd-Acosta
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Genevieve L Wojcik
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
2
Ma X, Tang Y, Wang C, Li Y, Zhang J, Luo Y, Xu Z, Wu F, Wang S. Interpretable XGBoost-SHAP Model Predicts Nanoparticles Delivery Efficiency Based on Tumor Genomic Mutations and Nanoparticle Properties. ACS APPLIED BIO MATERIALS 2023; 6:4326-4335. [PMID: 37683105 DOI: 10.1021/acsabm.3c00527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2023]
Abstract
Understanding the complex interaction between nanoparticles (NPs) and tumors in vivo and how it dominates the delivery efficiency of NPs is critical for the translation of nanomedicine. Herein, we proposed an interpretable XGBoost-SHAP model by integrating the information on NPs physicochemical properties and tumor genomic profile to predict the delivery efficiency. The correlation coefficients were 0.66, 0.75, and 0.54 for the prediction of maximum delivery efficiency, delivery efficiency at 24 and 168 h postinjection for test sets. The analysis of the feature importance revealed that the tumor genomic mutations and their interaction with NPs properties played important roles in the delivery of NPs. The biological pathways of the NP-delivery-related genes were further explored through gene ontology enrichment analysis. Our work provides a pipeline to predict and explain the delivery efficiency of NPs to heterogeneous tumors and highlights the power of simultaneously using omics data and interpretable machine learning algorithms for discovering interactions between NPs and individual tumors, which is important for the development of personalized precision nanomedicine.
Affiliation(s)
- Xingqun Ma
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
  - Department of Oncology, Nanjing Baiyi Hospital, Jinling Clinical College of Nanjing Medical University, Nanjing, Jiangsu 210000, China
- Yuxia Tang
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
- Chuanbing Wang
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
- Yang Li
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
- Jiulou Zhang
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
- Yafei Luo
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
- Ziqing Xu
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
- Feiyun Wu
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
- Shouju Wang
  - Laboratory of Molecular Imaging, Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210000, China
3
Ferrão LFV, Dhakal R, Dias R, Tieman D, Whitaker V, Gore MA, Messina C, Resende MFR. Machine learning applications to improve flavor and nutritional content of horticultural crops through breeding and genetics. Curr Opin Biotechnol 2023; 83:102968. [PMID: 37515935 DOI: 10.1016/j.copbio.2023.102968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 06/19/2023] [Accepted: 06/21/2023] [Indexed: 07/31/2023]
Abstract
Over the last decades, significant strides were made in understanding the biochemical factors influencing the nutritional content and flavor profile of fruits and vegetables. Product differentiation in the produce aisle is the natural consequence of increasing consumer power in the food industry. Cotton-candy grapes, specialty tomatoes, and pineapple-flavored white strawberries provide a few examples. Given the increased demand for flavorful varieties, and pressing need to reduce micronutrient malnutrition, we expect breeding to increase its prioritization toward these traits. Reaching this goal will, in part, necessitate knowledge of the genetic architecture controlling these traits, as well as the development of breeding methods that maximize their genetic gain. Can artificial intelligence (AI) help predict flavor preferences, and can such insights be leveraged by breeding programs? In this Perspective, we outline both the opportunities and challenges for the development of more flavorful and nutritious crops, and how AI can support these breeding initiatives.
Affiliation(s)
- Luís Felipe V Ferrão
  - Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Rakshya Dhakal
  - Plant Breeding Graduate Program, University of Florida, Gainesville, FL, United States
- Raquel Dias
  - Microbiology and Cell Science Department, University of Florida, Gainesville, FL, United States
- Denise Tieman
  - Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
- Vance Whitaker
  - Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
  - Plant Breeding Graduate Program, University of Florida, Gainesville, FL, United States
- Michael A Gore
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, United States
- Carlos Messina
  - Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
  - Plant Breeding Graduate Program, University of Florida, Gainesville, FL, United States
- Márcio F R Resende
  - Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
  - Plant Breeding Graduate Program, University of Florida, Gainesville, FL, United States
4
Cheng X, Du F, Long X, Huang J. Genetic Inheritance Models of Non-Syndromic Cleft Lip with or without Palate: From Monogenic to Polygenic. Genes (Basel) 2023; 14:1859. [PMID: 37895208 PMCID: PMC10606748 DOI: 10.3390/genes14101859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 09/18/2023] [Accepted: 09/21/2023] [Indexed: 10/29/2023] Open
Abstract
Non-syndromic cleft lip with or without palate (NSCL/P) is a prevalent birth defect that affects 1/500-1/1400 live births globally. The genetic basis of NSCL/P is intricate and involves both genetic and environmental factors. In the past few years, various genetic inheritance models have been proposed to elucidate the underlying mechanisms of NSCL/P. These models range from simple monogenic inheritance to more complex polygenic inheritance. Here, we present a comprehensive overview of the genetic inheritance model of NSCL/P exemplified by representative genes and regions from both monogenic and polygenic perspectives. We also summarize existing association studies and corresponding loci of NSCL/P within the Chinese population and highlight the potential of utilizing polygenic risk scores for risk stratification of NSCL/P. The potential application of polygenic models offers promising avenues for improved risk assessment and personalized approaches in the prevention and management of NSCL/P individuals.
Affiliation(s)
- Xi Cheng
  - Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, China
- Fengzhou Du
  - Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, China
  - Department of Plastic Surgery, Peking Union Medical College Hospital, Beijing 100730, China
- Xiao Long
  - Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, China
  - Department of Plastic Surgery, Peking Union Medical College Hospital, Beijing 100730, China
- Jiuzuo Huang
  - Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100730, China
  - Department of Plastic Surgery, Peking Union Medical College Hospital, Beijing 100730, China
5
Zhou GL, Xu FJ, Qiao JK, Che ZX, Xiang T, Liu XL, Li XY, Zhao SH, Zhu MJ. E-GWAS: an ensemble-like GWAS strategy that provides effective control over false positive rates without decreasing true positives. Genet Sel Evol 2023; 55:46. [PMID: 37407918 DOI: 10.1186/s12711-023-00820-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 06/23/2023] [Indexed: 07/07/2023] Open
Abstract
BACKGROUND: Genome-wide association studies (GWAS) are an effective way to explore genotype-phenotype associations in humans, animals, and plants. Various GWAS methods have been developed based on different genetic or statistical assumptions. However, no single method is optimal for all traits and, for many traits, the putative single nucleotide polymorphisms (SNPs) that are detected by the different methods do not entirely overlap due to the diversity of the genetic architecture of complex traits. Therefore, multi-tool-based GWAS strategies that combine different methods have been increasingly employed. To take this one step further, we propose an ensemble-like GWAS strategy (E-GWAS) that statistically integrates GWAS results from different single GWAS methods.
RESULTS: E-GWAS was compared with various single GWAS methods using simulated phenotype traits with different genetic architectures. E-GWAS performed stably across traits with different genetic architectures and effectively controlled the number of false positive genetic variants detected without decreasing the number of true positive variants. In addition, its performance could be further improved by using a bin-merged strategy and the addition of more distinct single GWAS methods. Our results show that the numbers of true and false positive SNPs detected by the E-GWAS strategy slightly increased and decreased, respectively, with increasing bin size and when the number and the diversity of individual GWAS methods that were integrated in E-GWAS increased, the latter being more effective than the bin-merged strategy. The E-GWAS strategy was also applied to a real dataset to study backfat thickness in a pig population, and 10 candidate genes related to this trait and expressed in adipose-associated tissues were identified.
CONCLUSIONS: Using both simulated and real datasets, we show that E-GWAS is a reliable and robust strategy that effectively integrates the GWAS results of different methods and reduces the number of false positive SNPs without decreasing that of true positive SNPs.
Affiliation(s)
- Guang-Liang Zhou
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
- Fang-Jun Xu
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
- Jia-Kun Qiao
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
- Zhao-Xuan Che
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
- Tao Xiang
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
  - The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, China
- Xiao-Lei Liu
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
  - The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, China
- Xin-Yun Li
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
  - The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, China
- Shu-Hong Zhao
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
  - The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, China
- Meng-Jin Zhu
  - Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, 430070, China
  - The Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, 430070, China
6
Vlahek D, Mongus D. An Efficient Iterative Approach to Explainable Feature Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:2606-2618. [PMID: 34478388 DOI: 10.1109/tnnls.2021.3107049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
This article introduces a new iterative approach to explainable feature learning. During each iteration, new features are first generated by applying arithmetic operations to the input set of features. These are then evaluated in terms of probability distribution agreements between values of samples belonging to different classes. Finally, a graph-based approach for feature selection is proposed, which allows for selecting high-quality and uncorrelated features to be used in feature generation during the next iteration. As shown by the results, the proposed method improved the accuracy of all tested classifiers, with the best accuracies achieved using random forest. In addition, the method turned out to be insensitive to both of the input parameters, and it outperformed the state of the art on nine out of 15 test sets while achieving comparable results on the others. Finally, we demonstrate the explainability of the learned feature representation for knowledge discovery.
7
Ferrato MH, Marsh AG, Franke KR, Huang BJ, Kolb EA, DeRyckere D, Grahm DK, Chandrasekaran S, Crowgey EL. Machine learning classifier approaches for predicting response to RTK-type-III inhibitors demonstrate high accuracy using transcriptomic signatures and ex vivo data. BIOINFORMATICS ADVANCES 2023; 3:vbad034. [PMID: 37250111 PMCID: PMC10209528 DOI: 10.1093/bioadv/vbad034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Revised: 02/16/2023] [Accepted: 03/21/2023] [Indexed: 05/31/2023]
Abstract
Motivation: The application of machine learning (ML) techniques in the medical field has demonstrated both successes and challenges in the precision medicine era. The ability to accurately classify a subject as a potential responder versus a nonresponder to a given therapy is still an active area of research, pushing the field to create new approaches for applying machine-learning techniques. In this study, we leveraged publicly available data through the BeatAML initiative. Specifically, we used gene count data, generated via RNA-seq, from 451 individuals matched with ex vivo data generated from treatment with RTK-type-III inhibitors. Three feature selection techniques were tested (principal component analysis, the Shapley Additive Explanation (SHAP) technique and differential gene expression analysis) with three different classifiers: XGBoost, LightGBM and random forest (RF). Sensitivity versus specificity was analyzed using the area under the receiver operating characteristic (ROC) curve (AUC) for every model developed.
Results: Our work demonstrated that the feature selection technique, rather than the classifier, had the greatest impact on model performance. The SHAP technique outperformed the other feature selection techniques and predicted outcome response with high accuracy; the highest performing model was for Foretinib, with 89% AUC using the SHAP technique and RF classifier. Our ML pipelines demonstrate that at the time of diagnosis, a transcriptomics signature exists that can potentially predict response to treatment, demonstrating the potential of using ML applications in precision medicine efforts.
Availability and implementation: https://github.com/UD-CRPL/RCDML.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Karl R Franke
  - Nemours Children Health System, Wilmington, DE 19803, USA
- Benjamin J Huang
  - Department of Pediatrics, University of California San Francisco, San Francisco, CA 94143, USA
  - Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94143, USA
- E Anders Kolb
  - Nemours Children Health System, Wilmington, DE 19803, USA
- Deborah DeRyckere
  - Department of Pediatrics, Emory University School of Medicine, Atlanta, GA 30322, USA
- Douglas K Grahm
  - Department of Pediatrics, Emory University School of Medicine, Atlanta, GA 30322, USA
8
Johnsen PV, Strümke I, Langaas M, DeWan AT, Riemer-Sørensen S. Inferring feature importance with uncertainties with application to large genotype data. PLoS Comput Biol 2023; 19:e1010963. [PMID: 36917581 PMCID: PMC10038287 DOI: 10.1371/journal.pcbi.1010963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Revised: 03/24/2023] [Accepted: 02/20/2023] [Indexed: 03/16/2023] Open
Abstract
Estimating feature importance, which is the contribution of a prediction or several predictions due to a feature, is an essential aspect of explaining data-based models. Besides explaining the model itself, an equally relevant question is which features are important in the underlying data generating process. We present a Shapley-value-based framework for inferring the importance of individual features, including uncertainty in the estimator. We build upon the recently published model-agnostic feature importance score of SAGE (Shapley additive global importance) and introduce Sub-SAGE. For tree-based models, it has the advantage that it can be estimated without computationally expensive resampling. We argue that for all model types the uncertainties in our Sub-SAGE estimator can be estimated using bootstrapping and demonstrate the approach for tree ensemble methods. The framework is exemplified on synthetic data as well as large genotype data for predicting feature importance with respect to obesity.
Affiliation(s)
- Pål Vegard Johnsen
  - SINTEF DIGITAL, Oslo, Norway
  - Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Inga Strümke
  - Department of Engineering Cybernetics, Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Holistic Systems, SimulaMet, Oslo, Norway
- Mette Langaas
  - Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Andrew Thomas DeWan
  - Department of Chronic Disease Epidemiology and Center for Perinatal, Pediatric and Environmental Epidemiology, Yale School of Public Health, New Haven, Connecticut, United States of America
9
Cui T, El Mekkaoui K, Reinvall J, Havulinna AS, Marttinen P, Kaski S. Gene-gene interaction detection with deep learning. Commun Biol 2022; 5:1238. [PMID: 36371468 PMCID: PMC9653457 DOI: 10.1038/s42003-022-04186-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2022] [Accepted: 10/27/2022] [Indexed: 11/13/2022] Open
Abstract
The extent to which genetic interactions affect observed phenotypes is generally unknown because current interaction detection approaches only consider simple interactions between top SNPs of genes. We introduce an open-source framework for increasing the power of interaction detection by considering all SNPs within a selected set of genes and complex interactions between them, beyond only the currently considered multiplicative relationships. In brief, the relation between SNPs and a phenotype is captured by a neural network, and the interactions are quantified by Shapley scores between hidden nodes, which are gene representations that optimally combine information from the corresponding SNPs. Additionally, we design a permutation procedure tailored for neural networks to assess the significance of interactions, which outperformed existing alternatives on simulated datasets with complex interactions, and in a cholesterol study on the UK Biobank it detected nine interactions which replicated on an independent FINRISK dataset.
Affiliation(s)
- Tianyu Cui
  - Department of Computer Science, Aalto University, Espoo, Finland
- Jaakko Reinvall
  - Department of Computer Science, Aalto University, Espoo, Finland
- Aki S Havulinna
  - Finnish Institute for Health and Welfare (THL), Helsinki, Finland
  - Institute for Molecular Medicine Finland, FIMM-HiLIFE, Helsinki, Finland
- Pekka Marttinen
  - Department of Computer Science, Aalto University, Espoo, Finland
  - Finnish Institute for Health and Welfare (THL), Helsinki, Finland
- Samuel Kaski
  - Department of Computer Science, Aalto University, Espoo, Finland
  - Department of Computer Science, University of Manchester, Manchester, UK
10
Schütz N, Knobel SEJ, Botros A, Single M, Pais B, Santschi V, Gatica-Perez D, Buluschek P, Urwyler P, Gerber SM, Müri RM, Mosimann UP, Saner H, Nef T. A systems approach towards remote health-monitoring in older adults: Introducing a zero-interaction digital exhaust. NPJ Digit Med 2022; 5:116. [PMID: 35974156 PMCID: PMC9381599 DOI: 10.1038/s41746-022-00657-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2022] [Accepted: 07/13/2022] [Indexed: 11/09/2022] Open
Abstract
Using connected sensing devices to remotely monitor health is a promising way to help transition healthcare from a rather reactive to a more precision medicine oriented proactive approach, which could be particularly relevant in the face of rapid population ageing and the challenges it poses to healthcare systems. Sensor derived digital measures of health, such as digital biomarkers or digital clinical outcome assessments, may be used to monitor health status or the risk of adverse events like falls. Current research around such digital measures has largely focused on exploring the use of few individual measures obtained through mobile devices. However, especially for long-term applications in older adults, this choice of technology may not be ideal and could further add to the digital divide. Moreover, large-scale systems biology approaches, like genomics, have already proven beneficial in precision medicine, making it plausible that the same could also hold for remote-health monitoring. In this context, we introduce and describe a zero-interaction digital exhaust: a set of 1268 digital measures that cover large parts of a person’s activity, behavior and physiology. Making this approach more inclusive of older adults, we base this set entirely on contactless, zero-interaction sensing technologies. Applying the resulting digital exhaust to real-world data, we then demonstrate the possibility to create multiple ageing relevant digital clinical outcome assessments. Paired with modern machine learning, we find these assessments to be surprisingly powerful and often on-par with mobile approaches. Lastly, we highlight the possibility to discover novel digital biomarkers based on this large-scale approach.
Affiliation(s)
- Narayan Schütz
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Samuel E J Knobel
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Angela Botros
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Michael Single
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Bruno Pais
  - LaSource School of Nursing Sciences, HES-SO University of Applied Sciences and Arts Western Switzerland, Lausanne, Switzerland
- Valérie Santschi
  - LaSource School of Nursing Sciences, HES-SO University of Applied Sciences and Arts Western Switzerland, Lausanne, Switzerland
- Daniel Gatica-Perez
  - Idiap Research Institute, Martigny, Switzerland
  - School of Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Prabitha Urwyler
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Stephan M Gerber
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- René M Müri
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
  - Department of Neurology, Inselspital, Bern, Switzerland
- Urs P Mosimann
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
- Hugo Saner
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
  - Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland
- Tobias Nef
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
  - Department of Neurology, Inselspital, Bern, Switzerland
11
Compressive Strength Estimation of Steel-Fiber-Reinforced Concrete and Raw Material Interactions Using Advanced Algorithms. Polymers (Basel) 2022; 14:polym14153065. [PMID: 35956580 PMCID: PMC9370679 DOI: 10.3390/polym14153065] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/31/2022] [Revised: 06/29/2022] [Accepted: 07/06/2022] [Indexed: 02/01/2023] Open
Abstract
Steel-fiber-reinforced concrete (SFRC) has been introduced as an effective alternative to conventional concrete in the construction sector. The incorporation of steel fibers into concrete provides a bridging mechanism to arrest cracks, improve the post-cracking behavior of concrete, and transfer stresses in concrete. Artificial intelligence (AI) approaches are in use nowadays to predict concrete properties to conserve time and money in the construction industry. Accordingly, this study aims to apply advanced machine-learning (ML) algorithms to predict SFRC compressive strength. In the current work, the applied ML approaches were gradient boosting, random forest, and XGBoost. The considered input variables were cement, fine aggregates (sand), coarse aggregates, water, silica fume, super-plasticizer, fly ash, steel fiber, fiber diameter, and fiber length. Previous studies have not addressed the effects of raw materials on compressive strength in considerable detail, leaving a research gap. The integration of a SHapley Additive exPlanations (SHAP) analysis with ML algorithms was also performed in this paper, addressing a current research need. A SHAP analysis is intended to provide an in-depth understanding of the SFRC mix design in terms of its strength factors via complicated, nonlinear behavior and the description of input factor contributions by assigning a weighting factor to each input component. The performances of all the algorithms were evaluated by applying statistical checks such as the determination coefficient (R²), the root mean square error (RMSE), and the mean absolute error (MAE). The random forest ML approach achieved the highest R² value (0.96) with the lowest errors, giving higher precision than the other models. The SFRC compressive strength could be anticipated by applying the random forest ML approach. Further, the SHAP analysis revealed that cement content had the highest positive influence on the compressive strength of SFRC. In this way, the current study is beneficial for researchers to effectively and quickly evaluate SFRC compressive strength.
12
Amin MN, Ahmad W, Khan K, Ahmad A, Nazar S, Alabdullah AA. Use of Artificial Intelligence for Predicting Parameters of Sustainable Concrete and Raw Ingredient Effects and Interactions. MATERIALS 2022; 15:ma15155207. [PMID: 35955144 PMCID: PMC9369900 DOI: 10.3390/ma15155207] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/30/2022] [Revised: 07/13/2022] [Accepted: 07/18/2022] [Indexed: 11/16/2022]
Abstract
Incorporating waste material into construction material, as in recycled coarse aggregate concrete (RCAC), can reduce environmental pollution. It is also well known that the inferior properties of recycled aggregates (RAs) can affect the mechanical properties of concrete that incorporates them, so it is necessary to evaluate the optimal performance. Artificial intelligence has recently been used to evaluate the compressive behaviour of concrete for different types of construction material. Accordingly, supervised machine learning techniques, i.e., DT-XGBoost, DT-Gradient Boosting, SVM-Bagging, and SVM-AdaBoost, are applied in the current study to predict RCAC’s compressive strength. Additionally, SHapley Additive exPlanations (SHAP) analysis shows the influence of the input parameters on the compressive strength of RCAC and the interactions between them. The coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) are used to assess the models’ performance, and k-fold cross-validation is then used to validate it. The R² value of 0.98 from DT-Gradient Boosting exceeds those of the other methods, i.e., DT-XGBoost, SVM-Bagging, and SVM-AdaBoost. With a higher R² value and lower errors (MAE, RMSE), the DT-Gradient Boosting model performed better than the other ensemble techniques. Applying machine learning techniques to predict concrete properties would consume fewer resources and take less time and effort for scholars in the respective engineering field. The predictions of the proposed DT-Gradient Boosting model are in close agreement with the actual experimental results, as indicated by the assessment output showing the improved estimation of RCAC’s compressive strength.
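The k-fold cross-validation step mentioned in this abstract can be illustrated with a small sketch. It assumes a one-variable least-squares line as a stand-in for the ensemble models of the study, and every name and data value below is hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for a real model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def cross_validate_mae(xs, ys, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, and return each fold's MAE."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [abs(ys[i] - (a + b * xs[i])) for i in held_out]
        scores.append(sum(errors) / len(errors))
    return scores

# Synthetic data, invented for illustration: y = 2x + 1 exactly
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
fold_mae = cross_validate_mae(xs, ys, k=5)
print(fold_mae)
```

Each sample is held out exactly once, so the spread of the per-fold scores indicates how sensitive the model is to the particular training split.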
Affiliation(s)
- Muhammad Nasir Amin
- Department of Civil and Environmental Engineering, College of Engineering, King Faisal University, Al-Ahsa 31982, Saudi Arabia; (K.K.); (A.A.A.)
- Correspondence: ; Tel.: +966-13-589-5431; Fax: +966-13-581-7068
- Waqas Ahmad
- Department of Civil Engineering, COMSATS University Islamabad, Abbottabad 22060, Pakistan; (W.A.); (S.N.)
- Kaffayatullah Khan
- Department of Civil and Environmental Engineering, College of Engineering, King Faisal University, Al-Ahsa 31982, Saudi Arabia; (K.K.); (A.A.A.)
- Ayaz Ahmad
- MaREI Centre, Ryan Institute and School of Engineering, College of Science and Engineering, National University of Ireland Galway, H91 HX31 Galway, Ireland;
- Sohaib Nazar
- Department of Civil Engineering, COMSATS University Islamabad, Abbottabad 22060, Pakistan; (W.A.); (S.N.)
- Anas Abdulalim Alabdullah
- Department of Civil and Environmental Engineering, College of Engineering, King Faisal University, Al-Ahsa 31982, Saudi Arabia; (K.K.); (A.A.A.)
13
Shen Z, Deifalla AF, Kamiński P, Dyczko A. Compressive Strength Evaluation of Ultra-High-Strength Concrete by Machine Learning. MATERIALS 2022; 15:ma15103523. [PMID: 35629548 PMCID: PMC9148046 DOI: 10.3390/ma15103523] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/13/2022] [Revised: 05/02/2022] [Accepted: 05/07/2022] [Indexed: 02/04/2023]
Abstract
In civil engineering, ultra-high-strength concrete (UHSC) is a useful and efficient building material. To save money and time in the construction sector, soft computing approaches have been used to estimate concrete properties. Accordingly, the current work used soft computing techniques, namely XGBoost, AdaBoost, and Bagging, to estimate the compressive strength of UHSC. The variables taken into account included cement content, fly ash, silica fume and silicate content, sand and water content, superplasticizer content, steel fiber, steel fiber aspect ratio, and curing time. Algorithm performance was evaluated using statistical metrics: the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). XGBoost, with a higher R² (0.90) and low errors, was more accurate than the other algorithms, which had a lower R². The compressive strength of UHSC can therefore be predicted using XGBoost. The SHapley Additive exPlanations (SHAP) analysis showed that curing time had the highest positive influence on UHSC compressive strength. Thus, scholars will be able to quickly and effectively determine the compressive strength of UHSC using this study’s findings.
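SHAP attributions are built on Shapley values, which average a feature's marginal contribution over all coalitions of the other features. The sketch below computes exact Shapley values for a single prediction by brute-force enumeration. It is an illustration under stated assumptions, not the SHAP library and not the study's pipeline: absent features are set to a fixed baseline, and the toy model, feature names, and numbers are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are set to their baseline value, a simple
    stand-in for the background averaging that SHAP estimators perform.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical toy model, invented for illustration: additive terms plus one interaction
def toy_strength(features):
    cement, water, curing_days = features
    return 0.1 * cement - 0.2 * water + 0.5 * curing_days + 0.01 * cement * curing_days

x = [400.0, 160.0, 28.0]
baseline = [300.0, 180.0, 7.0]
phi = shapley_values(toy_strength, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that gives SHAP its name. Enumeration costs grow exponentially with the number of features, so practical tools rely on approximations or model-specific shortcuts.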
Affiliation(s)
- Zhongjie Shen
- Xijing University, Xi’an 710123, China
- Correspondence: (Z.S.); (A.F.D.)
- Ahmed Farouk Deifalla
- Structural Engineering and Construction Management Department, Faculty of Engineering and Technology, Future University in Egypt, Cairo 11835, Egypt
- Correspondence: (Z.S.); (A.F.D.)
- Paweł Kamiński
- Faculty of Civil Engineering and Resource Management, AGH University of Science and Technology, Mickiewicza 30, 30-059 Kraków, Poland;
- Artur Dyczko
- Mineral and Energy Economy Research Institute of the Polish Academy of Sciences, J. Wybickiego 7a, 31-261 Kraków, Poland;
14
Medvedev A, Mishra Sharma S, Tsatsorin E, Nabieva E, Yarotsky D. Human genotype-to-phenotype predictions: Boosting accuracy with nonlinear models. PLoS One 2022; 17:e0273293. [PMID: 36044406 PMCID: PMC9432766 DOI: 10.1371/journal.pone.0273293] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/27/2021] [Accepted: 08/04/2022] [Indexed: 11/23/2022] Open
Abstract
Genotype-to-phenotype prediction is a central problem of human genetics. In recent years, it has become possible to construct complex predictive models for phenotypes, thanks to the availability of large genome data sets as well as efficient and scalable machine learning tools. In this paper, we make a threefold contribution to this problem. First, we ask whether state-of-the-art nonlinear predictive models, such as boosted decision trees, can be more efficient for phenotype prediction than conventional linear models. We find that this is indeed the case if model features include a sufficiently rich set of covariates, but probably not otherwise. Second, we ask whether the conventional selection of single nucleotide polymorphisms (SNPs) by genome-wide association studies (GWAS) can be replaced by a more efficient procedure that takes into account information in previously selected SNPs. We propose such a procedure, based on sequential feature importance estimation with decision trees, and show that it produces informative SNP sets that are much more compact than those selected with GWAS. Finally, we show that the highest prediction accuracy can ultimately be achieved by ensembling individual linear and nonlinear models. To the best of our knowledge, for some of the phenotypes that we consider (asthma, hypothyroidism), our results set a new state of the art.
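The abstract does not say how the linear and nonlinear models are ensembled. One common approach, shown here purely as an assumption, is a weighted average of the two models' predictions with the weight chosen on held-out validation data. All names and numbers below are hypothetical.

```python
def blend(pred_a, pred_b, weight):
    """Convex combination of two prediction vectors; weight applies to model A."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(pred_a, pred_b)]

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def best_blend_weight(y_val, pred_a, pred_b, steps=20):
    """Grid-search the weight on model A that minimises validation MSE."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: mse(y_val, blend(pred_a, pred_b, w)))

# Hypothetical validation data, invented for illustration
y_val = [1.0, 2.0, 3.0, 4.0]
linear_pred = [1.4, 2.4, 3.4, 4.4]     # overestimates by 0.4
nonlinear_pred = [0.4, 1.4, 2.4, 3.4]  # underestimates by 0.6
w = best_blend_weight(y_val, linear_pred, nonlinear_pred)
blended = blend(linear_pred, nonlinear_pred, w)
print(w, mse(y_val, blended))
```

When the two models err in different directions, as in this toy case, the blend can beat both of them, which is the intuition behind combining heterogeneous models.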
Affiliation(s)
- Elena Nabieva
- Skolkovo Institute of Science and Technology, Moscow, Russia
- Dmitry Yarotsky
- Skolkovo Institute of Science and Technology, Moscow, Russia