Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles: 43 (from Reference Citation Analysis)
Article PDFs: 11
Cited by > 0: 21
Searched Name: automated machine learning

Bifarin OO, Fernández FM. Automated Machine Learning and Explainable AI (AutoML-XAI) for Metabolomics: Improving Cancer Diagnostics. J Am Soc Mass Spectrom 2024. [PMID: 38690775 DOI: 10.1021/jasms.3c00403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2024]

Abstract

Metabolomics generates complex data necessitating advanced computational methods for generating biological insight. While machine learning (ML) is promising, the challenges of selecting the best algorithms and tuning hyperparameters, particularly for nonexperts, remain. Automated machine learning (AutoML) can streamline this process; however, the issue of interpretability could persist. This research introduces a unified pipeline that combines AutoML with explainable AI (XAI) techniques to optimize metabolomics analysis. We tested our approach on two data sets: renal cell carcinoma (RCC) urine metabolomics and ovarian cancer (OC) serum metabolomics. AutoML, using Auto-sklearn, surpassed standalone ML algorithms like SVM and k-Nearest Neighbors in differentiating between RCC and healthy controls, as well as OC patients and those with other gynecological cancers. The effectiveness of Auto-sklearn is highlighted by its AUC scores of 0.97 for RCC and 0.85 for OC, obtained from the unseen test sets. Importantly, on most of the metrics considered, Auto-sklearn demonstrated a better classification performance, leveraging a mix of algorithms and ensemble techniques. Shapley Additive Explanations (SHAP) provided a global ranking of feature importance, identifying dibutylamine and ganglioside GM(d34:1) as the top discriminative metabolites for RCC and OC, respectively. Waterfall plots offered local explanations by illustrating the influence of each metabolite on individual predictions. Dependence plots spotlighted metabolite interactions, such as the connection between hippuric acid and one of its derivatives in RCC, and between GM3(d34:1) and GM3(18:1_16:0) in OC, hinting at potential mechanistic relationships. Through decision plots, a detailed error analysis was conducted, contrasting feature importance for correctly versus incorrectly classified samples. In essence, our pipeline emphasizes the importance of harmonizing AutoML and XAI, facilitating both simplified ML application and improved interpretability in metabolomics data science.
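As an illustration of the idea behind the SHAP rankings mentioned in this abstract, the sketch below computes exact Shapley values for a toy model and ranks features by mean absolute contribution. It is not the authors' pipeline: the model, the samples, and the all-zero baseline are hypothetical, and the SHAP library itself typically relies on faster approximations rather than enumerating every feature ordering.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Averages each feature's marginal contribution over all feature
    orderings, switching features from the baseline value to the
    sample's value one at a time.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)      # start from the baseline sample
        prev = predict(current)
        for i in order:
            current[i] = x[i]         # reveal feature i
            new = predict(current)
            phi[i] += new - prev      # marginal contribution of feature i
            prev = new
    return [p / len(orderings) for p in phi]


def toy_model(f):
    """Hypothetical model: two linear terms plus one interaction."""
    return 2.0 * f[0] + 1.0 * f[1] + 0.5 * f[0] * f[2]


# Hypothetical samples and baseline (assumptions for illustration only).
samples = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0], [2.0, 1.0, 0.0]]
baseline = [0.0, 0.0, 0.0]

# Local explanations: one vector of contributions per sample.
local = [shapley_values(toy_model, s, baseline) for s in samples]

# Global ranking: mean absolute Shapley value per feature.
global_importance = [
    sum(abs(phi[j]) for phi in local) / len(local) for j in range(3)
]
ranking = sorted(range(3), key=lambda j: global_importance[j], reverse=True)
```

The per-sample vectors in `local` correspond to what a waterfall plot displays, and `ranking` corresponds to a global importance ordering. For each sample the contributions sum to the difference between the model output for that sample and for the baseline.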

Meng L, Zhu P, Xia K. Application value of the automated machine learning model based on modified CT index combined with serological indices in the early prediction of lung cancer. Front Public Health 2024;12:1368217. [PMID: 38645446 PMCID: PMC11027066 DOI: 10.3389/fpubh.2024.1368217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 03/19/2024] [Indexed: 04/23/2024] Open

Abstract

Background and objective

Accurately predicting the extent of lung tumor infiltration is crucial for improving patient survival and cure rates. This study aims to evaluate the application value of an improved CT index combined with serum biomarkers, obtained through an artificial intelligence recognition system analyzing CT features of pulmonary nodules, in early prediction of lung cancer infiltration using machine learning models.

Patients and methods

A retrospective analysis was conducted on clinical data of 803 patients hospitalized for lung cancer treatment from January 2020 to December 2023 at two hospitals: Hospital 1 (Affiliated Changshu Hospital of Soochow University) and Hospital 2 (Nantong Eighth People's Hospital). Data from Hospital 1 were used for internal training, while data from Hospital 2 were used for external validation. Five algorithms, including traditional logistic regression (LR) and machine learning techniques (generalized linear models [GLM], random forest [RF], gradient boosting machine [GBM], deep neural network [DL], and naive Bayes [NB]), were employed to construct models predicting early lung cancer infiltration and were analyzed. The models were comprehensively evaluated through area under the receiver operating characteristic curve (AUC) analysis based on LR, calibration curves, decision curve analysis (DCA), as well as global and individual interpretative analyses using variable feature importance and SHapley additive explanations (SHAP) plots.

Results

A total of 560 patients were used for model development in the training dataset, while a dataset comprising 243 patients was used for external validation. The GBM model exhibited the best performance among the five algorithms, with AUCs of 0.931 and 0.99 in the validation and test sets, respectively, and accuracies of 0.857 and 0.955 in the validation and test groups, respectively, outperforming other models. Additionally, the study found that nodule diameter and average CT value were the most significant features for predicting lung cancer infiltration using machine learning models.
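For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how AUC and accuracy can be computed from predicted scores. The labels, scores, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical example data (1 = infiltrative, 0 = non-infiltrative).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)    # 8 of 9 positive/negative pairs ordered correctly
acc = accuracy(labels, scores)   # 4 of 6 cases correct at threshold 0.5
```

AUC depends only on how the scores rank the cases, whereas accuracy depends on the chosen threshold, which is why the two figures can diverge for the same model.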

Conclusion

The GBM model established in this study can effectively predict the risk of infiltration in early-stage lung cancer patients, thereby improving the accuracy of lung cancer screening and facilitating timely intervention for infiltrative lung cancer patients by clinicians, leading to early diagnosis and treatment of lung cancer, and ultimately reducing lung cancer-related mortality.

Zhang S, Chen D, Sun H, Kemp GJ, Chen Y, Tan Q, Yang Y, Gong Q, Yue Q. Whole brain morphologic features improve the predictive accuracy of IDH status and VEGF expression levels in gliomas. Cereb Cortex 2024;34:bhae151. [PMID: 38642107 DOI: 10.1093/cercor/bhae151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 03/14/2024] [Accepted: 03/23/2024] [Indexed: 04/22/2024] Open

Bibi I, Schaffert D, Blauth M, Lull C, von Ahnen JA, Gross G, Weigandt WA, Knitza J, Kuhn S, Benecke J, Leipe J, Schmieder A, Olsavszky V. Automated Machine Learning Analysis of Patients With Chronic Skin Disease Using a Medical Smartphone App: Retrospective Study. J Med Internet Res 2023;25:e50886. [PMID: 38015608 PMCID: PMC10716771 DOI: 10.2196/50886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 09/18/2023] [Accepted: 09/19/2023] [Indexed: 11/29/2023] Open

Abstract

BACKGROUND

Rapid digitalization in health care has led to the adoption of digital technologies; however, limited trust in internet-based health decisions and the need for technical personnel hinder the use of smartphones and machine learning applications. To address this, automated machine learning (AutoML) is a promising tool that can empower health care professionals to enhance the effectiveness of mobile health apps.

OBJECTIVE

We used AutoML to analyze data from clinical studies involving patients with chronic hand and/or foot eczema or psoriasis vulgaris who used a smartphone monitoring app. The analysis focused on itching, pain, Dermatology Life Quality Index (DLQI) development, and app use.

METHODS

After extensive data set preparation, which consisted of combining 3 primary data sets by extracting common features and by computing new features, a new pseudonymized secondary data set with a total of 368 patients was created. Next, multiple machine learning classification models were built during AutoML processing, with the most accurate models ultimately selected for further data set analysis.

RESULTS

Itching development for 6 months was accurately modeled using the light gradient boosted trees classifier model (log loss: 0.9302 for validation, 1.0193 for cross-validation, and 0.9167 for holdout). Pain development for 6 months was assessed using the random forest classifier model (log loss: 1.1799 for validation, 1.1561 for cross-validation, and 1.0976 for holdout). Then, the random forest classifier model (log loss: 1.3670 for validation, 1.4354 for cross-validation, and 1.3974 for holdout) was used again to estimate the DLQI development for 6 months. Finally, app use was analyzed using an elastic net blender model (area under the curve: 0.6567 for validation, 0.6207 for cross-validation, and 0.7232 for holdout). Influential feature correlations were identified, including BMI, age, disease activity, DLQI, and Hospital Anxiety and Depression Scale-Anxiety scores at follow-up. App use increased with BMI >35, was less common in patients aged >47 years and those aged 23 to 31 years, and was more common in those with higher disease activity. A Hospital Anxiety and Depression Scale-Anxiety score >8 had a slightly positive effect on app use.
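To help read the log loss figures above, here is a minimal sketch of multiclass log loss. The class labels and probabilities are hypothetical. For scale, a model that assigns equal probability to each of k classes scores ln k (about 1.10 for k = 3 and 1.39 for k = 4); the number of classes used in the study is not stated in this abstract.

```python
import math


def log_loss(true_classes, probabilities, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    true_classes: list of integer class indices.
    probabilities: list of per-class probability lists, one per sample.
    eps clips probabilities away from 0 and 1 to keep the log finite.
    """
    total = 0.0
    for y, probs in zip(true_classes, probabilities):
        p = min(max(probs[y], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(true_classes)


# Hypothetical three-class example.
y_true = [0, 1, 2]
uniform = [[1 / 3, 1 / 3, 1 / 3]] * 3                            # no information
confident = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]  # mostly right

baseline_loss = log_loss(y_true, uniform)      # equals ln(3)
better_loss = log_loss(y_true, confident)      # equals -ln(0.8)
```

Lower is better, and values should be compared against the uniform-guess baseline for the same number of classes rather than read in isolation.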

CONCLUSIONS

This study provides valuable insights into the relationship between data characteristics and targeted outcomes in patients with chronic eczema or psoriasis, highlighting the potential of smartphone and AutoML techniques in improving chronic disease management and patient care.

Affiliation(s)

Igor Bibi Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, Center of Excellence in Dermatology, Heidelberg University, Mannheim, Germany
Daniel Schaffert Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, Center of Excellence in Dermatology, Heidelberg University, Mannheim, Germany
Mara Blauth Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, Center of Excellence in Dermatology, Heidelberg University, Mannheim, Germany
Christian Lull Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, Center of Excellence in Dermatology, Heidelberg University, Mannheim, Germany
Jan Alwin von Ahnen Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, Center of Excellence in Dermatology, Heidelberg University, Mannheim, Germany
Georg Gross Department of Medicine V, Division of Rheumatology, University Medical Centre and Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
Wanja Alexander Weigandt Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, Center of Excellence in Dermatology, Heidelberg University, Mannheim, Germany
Johannes Knitza Institute of Digital Medicine, Philipps-University Marburg and University Hospital of Giessen and Marburg, Marburg, Germany
Sebastian Kuhn Institute of Digital Medicine, Philipps-University Marburg and University Hospital of Giessen and Marburg, Marburg, Germany
Johannes Benecke Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, Center of Excellence in Dermatology, Heidelberg University, Mannheim, Germany
Jan Leipe Department of Medicine V, Division of Rheumatology, University Medical Centre and Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
Astrid Schmieder Department of Dermatology, Venereology, and Allergology, University Hospital Würzburg, Würzburg, Germany
Victor Olsavszky Department of Dermatology, Venereology and Allergology, University Medical Center and Medical Faculty Mannheim, Center of Excellence in Dermatology, Heidelberg University, Mannheim, Germany

Chen D, Wang SJ, Zhao ZJ, Ji X, Shen Q, Yu Y, Cui SD, Wang JG, Chen ZY, Wang JY, Guo ZY, Wu PX, Tang GQ. Genomic prediction of pig growth traits based on machine learning. Yi Chuan 2023;45:922-932. [PMID: 37872114 DOI: 10.16288/j.yczz.23-120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]

Abstract

This study aimed to assess and compare the performance of different machine learning models in predicting selected pig growth traits and genomic estimated breeding values (GEBV) using automated machine learning, with the goal of optimizing whole-genome evaluation methods in pig breeding. The research employed genomic information, pedigree matrices, fixed effects, and phenotype data from 9968 pigs across multiple companies to derive four optimal machine learning models: deep learning (DL), random forest (RF), gradient boosting machine (GBM), and extreme gradient boosting (XGB). Through 10-fold cross-validation, predictions were made for GEBV and phenotypes of pigs reaching weight milestones (100 kg and 115 kg) with adjustments for backfat and days to weight. The findings indicated that machine learning models exhibited higher accuracy in predicting GEBV compared to phenotypic traits. Notably, GBM demonstrated superior GEBV prediction accuracy, with values of 0.683, 0.710, 0.866, and 0.871 for B100, B115, D100, and D115, respectively, slightly outperforming other methods. In phenotype prediction, GBM emerged as the best-performing model for pigs with B100, B115, D100, and D115 traits, achieving prediction accuracies of 0.547, followed by DL at 0.547, and then XGB with accuracies of 0.672 and 0.670. In terms of model training time, RF required the most time, while GBM and DL fell in between, and XGB demonstrated the shortest training time. In summary, machine learning models obtained through automated techniques exhibited higher GEBV prediction accuracy compared to phenotypic traits. GBM emerged as the overall top performer in terms of prediction accuracy and training time efficiency, while XGB demonstrated the ability to train accurate prediction models within a short timeframe. RF, on the other hand, had longer training times and insufficient accuracy, rendering it unsuitable for predicting pig growth traits and GEBV.
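The 10-fold cross-validation mentioned in this abstract can be sketched as follows. This is a generic illustration rather than the authors' procedure: the shuffling, the seed, and the function name are assumptions, and only the sample count (9968) and the number of folds come from the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split sample indices into k (train, test) pairs.

    Indices are shuffled once with a fixed seed, then cut into k
    nearly equal test folds; each sample is tested exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)   # first `extra` folds get one more
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# 9968 animals, 10 folds: eight test folds of 997 and two of 996.
folds = k_fold_indices(9968, k=10)
fold_sizes = [len(test) for _, test in folds]
```

In each round a model would be trained on `train` and scored on `test`, with the reported accuracy typically being an average over the ten rounds.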

Affiliation(s)

Dong Chen Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Shu-Jie Wang Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Zhen-Jian Zhao Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Xiang Ji Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Qi Shen Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Yang Yu Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Sheng-di Cui Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Jun-Ge Wang Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Zi-Yang Chen Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
Jin-Yong Wang National Center of Technology Innovation for Pigs, Chongqing 402460, China
Zong-Yi Guo National Center of Technology Innovation for Pigs, Chongqing 402460, China
Ping-Xian Wu National Center of Technology Innovation for Pigs, Chongqing 402460, China
Guo-Qing Tang Key Laboratory of Livestock and Poultry Multi-omics, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu 611130, China State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China

Krupp L, Wiede C, Friedhoff J, Grabmaier A. Explainable Remaining Tool Life Prediction for Individualized Production Using Automated Machine Learning. Sensors (Basel) 2023;23:8523. [PMID: 37896615 PMCID: PMC10610891 DOI: 10.3390/s23208523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 10/07/2023] [Accepted: 10/15/2023] [Indexed: 10/29/2023]

Abstract

The increasing demand for customized products is a core driver of novel automation concepts in Industry 4.0. For the case of machining complex free-form workpieces, e.g., in die making and mold making, individualized manufacturing is already the industrial practice. The varying process conditions and demanding machining processes lead to a high relevance of machining domain experts and a low degree of manufacturing flow automation. In order to increase the degree of automation, online process monitoring and the prediction of the quality-related remaining cutting tool life are indispensable. However, the varying process conditions complicate this as the correlation between the sensor signals and tool condition is not directly apparent. Furthermore, machine learning (ML) knowledge is limited on the shop floor, preventing a manual adaption of the models to changing conditions. Therefore, this paper introduces a new method for remaining tool life prediction in individualized production using automated machine learning (AutoML). The method enables the incorporation of machining expert knowledge via the model inputs and outputs. It automatically creates end-to-end ML pipelines based on optimized ensembles of regression and forecasting models. An explainability algorithm visualizes the relevance of the model inputs for the decision making. The method is analyzed and compared to a manual state-of-the-art approach for series production in a comprehensive evaluation using a new milling dataset. The dataset represents gradual tool wear under changing workpieces and process parameters. Our AutoML method outperforms the state-of-the-art approach and the evaluation indicates that a transfer of methods designed for series production to variable process conditions is not easily possible. Overall, the new method optimizes individualized production economically and in terms of resources. Machining experts with limited ML knowledge can leverage their domain knowledge to develop, validate and adapt tool life models.

Thirunavukarasu AJ, Elangovan K, Gutierrez L, Li Y, Tan I, Keane PA, Korot E, Ting DSW. Democratizing Artificial Intelligence Imaging Analysis With Automated Machine Learning: Tutorial. J Med Internet Res 2023;25:e49949. [PMID: 37824185 PMCID: PMC10603560 DOI: 10.2196/49949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/21/2023] [Accepted: 09/13/2023] [Indexed: 10/13/2023] Open

Omar I, Khan M, Starr A, Abou Rok Ba K. Automated Prediction of Crack Propagation Using H2O AutoML. Sensors (Basel) 2023;23:8419. [PMID: 37896512 PMCID: PMC10611134 DOI: 10.3390/s23208419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 10/06/2023] [Accepted: 10/09/2023] [Indexed: 10/29/2023]

Abstract

Crack propagation is a critical phenomenon in materials science and engineering, significantly impacting structural integrity, reliability, and safety across various applications. The accurate prediction of crack propagation behavior is paramount for ensuring the performance and durability of engineering components, as extensively explored in prior research. Nevertheless, there is a pressing demand for automated models capable of efficiently and precisely forecasting crack propagation. In this study, we address this need by developing a machine learning-based automated model using the powerful H2O library. This model aims to accurately predict crack propagation behavior in various materials by analyzing intricate crack patterns and delivering reliable predictions. To achieve this, we employed a comprehensive dataset derived from measured instances of crack propagation in Acrylonitrile Butadiene Styrene (ABS) specimens. Rigorous evaluation metrics, including Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R-squared (R2) values, were applied to assess the model's predictive accuracy. Cross-validation techniques were utilized to ensure its robustness and generalizability across diverse datasets. Our results underscore the automated model's remarkable accuracy and reliability in predicting crack propagation. This study not only highlights the immense potential of the H2O library as a valuable tool for structural health monitoring but also advocates for the broader adoption of Automated Machine Learning (AutoML) solutions in engineering applications. In addition to presenting these findings, we define H2O as a powerful machine learning library and AutoML as Automated Machine Learning to ensure clarity and understanding for readers unfamiliar with these terms. This research not only demonstrates the significance of AutoML in future-proofing our approach to structural integrity and safety but also emphasizes the need for comprehensive reporting and understanding in scientific discourse.
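The three regression metrics named in this abstract (MAE, RMSE, and R2) can be computed as in the sketch below. The measured and predicted values are hypothetical and only illustrate the formulas; they are not data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R2) for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical crack lengths (observed vs. predicted).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

mae, rmse, r2 = regression_metrics(observed, predicted)
```

MAE and RMSE are in the units of the target, with RMSE penalizing large errors more heavily, while R2 expresses the share of variance in the observed values that the predictions account for.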

Sakagianni A, Koufopoulou C, Kalles D, Loupelis E, Verykios VS, Feretzakis G. Automated ML Techniques for Predicting COVID-19 Mortality in the ICU. Stud Health Technol Inform 2023;305:517-520. [PMID: 37387081 DOI: 10.3233/shti230547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023]

Valeri JA, Soenksen LR, Collins KM, Ramesh P, Cai G, Powers R, Angenent-Mari NM, Camacho DM, Wong F, Lu TK, Collins JJ. BioAutoMATED: An end-to-end automated machine learning tool for explanation and design of biological sequences. Cell Syst 2023;14:525-542.e9. [PMID: 37348466 PMCID: PMC10700034 DOI: 10.1016/j.cels.2023.05.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Revised: 02/17/2023] [Accepted: 05/22/2023] [Indexed: 06/24/2023]

Affiliation(s)

Jacqueline A Valeri Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Luis R Soenksen Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA
Katherine M Collins Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Department of Engineering, University of Cambridge, Trumpington St, Cambridge CB2 1PZ, UK
Pradeep Ramesh Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
George Cai Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
Rani Powers Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Pluto Biosciences, Golden, CO 80402, USA
Nicolaas M Angenent-Mari Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
Diogo M Camacho Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
Felix Wong Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Timothy K Lu Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Synthetic Biology Group, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
James J Collins Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA 02139, USA; Abdul Latif Jameel Clinic for Machine Learning in Health, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Jeong D, Jeong W, Lee JH, Park SY. Use of Automated Machine Learning for Classifying Hemoperitoneum on Ultrasonographic Images of Morrison's Pouch: A Multicenter Retrospective Study. J Clin Med 2023;12:4043. [PMID: 37373736 DOI: 10.3390/jcm12124043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 06/09/2023] [Accepted: 06/11/2023] [Indexed: 06/29/2023] Open

Liu L, Zhang R, Shi D, Li R, Wang Q, Feng Y, Lu F, Zong Y, Xu X. Automated machine learning to predict the difficulty for endoscopic resection of gastric gastrointestinal stromal tumor. Front Oncol 2023;13:1190987. [PMID: 37234977 PMCID: PMC10206233 DOI: 10.3389/fonc.2023.1190987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 04/26/2023] [Indexed: 05/28/2023] Open

Abstract

Background

Accurate preoperative assessment of surgical difficulty is crucial to the success of the surgery and patient safety. This study aimed to evaluate the difficulty for endoscopic resection (ER) of gastric gastrointestinal stromal tumors (gGISTs) using multiple machine learning (ML) algorithms.

Methods

From December 2010 to December 2022, 555 patients with gGISTs from multiple centers were retrospectively studied and assigned to training, validation, and test cohorts. A difficult case was defined as meeting at least one of the following criteria: an operative time ≥ 90 min, severe intraoperative bleeding, or conversion to laparoscopic resection. Five types of algorithms were employed in building models, including traditional logistic regression (LR) and automated machine learning (AutoML) analysis (gradient boost machine (GBM), deep neural net (DL), generalized linear model (GLM), and default random forest (DRF)). We assessed the performance of the models using the areas under the receiver operating characteristic curves (AUC), the calibration curve, and the decision curve analysis (DCA) based on LR, as well as feature importance, SHapley Additive exPlanation (SHAP) Plots and Local Interpretable Model Agnostic Explanation (LIME) based on AutoML.

Results

The GBM model outperformed other models with an AUC of 0.894 in the validation and 0.791 in the test cohorts. Furthermore, the GBM model achieved the highest accuracy among these AutoML models, with 0.935 and 0.911 in the validation and test cohorts, respectively. In addition, it was found that tumor size and endoscopists' experience were the most prominent features that significantly impacted the AutoML model's performance in predicting the difficulty for ER of gGISTs.

Conclusion

The AutoML model based on the GBM algorithm can accurately predict the difficulty for ER of gGISTs before surgery.
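The outcome definition given in the Methods can be written as a small labelling rule. The sketch below is illustrative only: the record type and its field names (`operative_time_min`, `severe_bleeding`, `converted_to_laparoscopy`) are hypothetical and are not taken from the study's dataset.

```python
from dataclasses import dataclass


@dataclass
class Resection:
    # Hypothetical field names; the abstract does not list the study's variables.
    operative_time_min: float
    severe_bleeding: bool
    converted_to_laparoscopy: bool


def is_difficult(case: Resection) -> bool:
    """Return True if the case meets at least one difficulty criterion from the abstract."""
    return (
        case.operative_time_min >= 90
        or case.severe_bleeding
        or case.converted_to_laparoscopy
    )


# Three made-up cases: short and uneventful, exactly 90 min, short but with bleeding.
cases = [
    Resection(45, False, False),
    Resection(90, False, False),
    Resection(30, True, False),
]
labels = [is_difficult(c) for c in cases]
```

Because the criteria are joined with "or", a single criterion is enough to label a case as difficult.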

Chen F, Zhou B, Yang L, Chen X, Zhuang J. Predicting bacterial transport through saturated porous media using an automated machine learning model. Front Microbiol 2023;14:1152059. [PMID: 37234532 PMCID: PMC10206036 DOI: 10.3389/fmicb.2023.1152059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Accepted: 04/25/2023] [Indexed: 05/28/2023] Open

Chung J, Oh DJ, Park J, Kim SH, Lim YJ. Automatic Classification of GI Organs in Wireless Capsule Endoscopy Using a No-Code Platform-Based Deep Learning Model. Diagnostics (Basel) 2023;13:1389. [PMID: 37189489 DOI: 10.3390/diagnostics13081389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Revised: 04/03/2023] [Accepted: 04/10/2023] [Indexed: 05/17/2023] Open

Lai FL, Gao F. Auto-Kla: a novel web server to discriminate lysine lactylation sites using automated machine learning. Brief Bioinform 2023;24:7068952. [PMID: 36869843 DOI: 10.1093/bib/bbad070] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/05/2022] [Revised: 01/24/2023] [Accepted: 02/07/2023] [Indexed: 03/05/2023] Open

González-Nóvoa JA, Campanioni S, Busto L, Fariña J, Rodríguez-Andina JJ, Vila D, Íñiguez A, Veiga C. Improving Intensive Care Unit Early Readmission Prediction Using Optimized and Explainable Machine Learning. Int J Environ Res Public Health 2023;20:3455. [PMID: 36834150 PMCID: PMC9960143 DOI: 10.3390/ijerph20043455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 02/10/2023] [Accepted: 02/14/2023] [Indexed: 06/18/2023]

Yuan P, Xu S, Zhai Z, Xu H. Research of intelligent reasoning system of Arabidopsis thaliana phenotype based on automated multi-task machine learning. Front Plant Sci 2023;14:1048016. [PMID: 36866380 PMCID: PMC9974140 DOI: 10.3389/fpls.2023.1048016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Accepted: 01/13/2023] [Indexed: 06/18/2023]

González-Nóvoa JA, Busto L, Campanioni S, Fariña J, Rodríguez-Andina JJ, Vila D, Veiga C. Two-Step Approach for Occupancy Estimation in Intensive Care Units Based on Bayesian Optimization Techniques. Sensors (Basel) 2023;23:1162. [PMID: 36772202 PMCID: PMC9919941 DOI: 10.3390/s23031162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 01/14/2023] [Accepted: 01/17/2023] [Indexed: 06/18/2023]

Chen T, Or CK. Automated machine learning-based prediction of the progression of knee pain, functional decline, and incidence of knee osteoarthritis in individuals at high risk of knee osteoarthritis: Data from the osteoarthritis initiative study. Digit Health 2023;9:20552076231216419. [PMID: 38033512 PMCID: PMC10685797 DOI: 10.1177/20552076231216419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 11/07/2023] [Indexed: 12/02/2023] Open

Abstract

Objective

This study aimed to examine the performance of machine learning models in predicting the progression of knee pain, functional decline, and incidence of knee osteoarthritis (OA) in high-risk individuals, with automated machine learning (AutoML) being used to automate the prediction process.

Design

There were four stages in the process of our AutoML-integrated prediction. Stage 1-Data preparation: The data of 3200 eligible individuals in the Osteoarthritis Initiative (OAI) study who were considered at high risk of knee OA at the baseline visit were extracted and used. Specifically, 1094 variables from the OAI study were used to predict the changes in knee pain, physical function, and incidence of knee OA (i.e. the first occurrence of frequent knee symptoms and definite tibial osteophytes (Kellgren and Lawrence grade ≥2)) over a 9-year period. Stage 2-Model training: The AutoML approach was used to automatically train nine widely used machine learning (ML) models. Stage 3-Model testing: The AutoML approach was used to automatically test the performance of the ML models. Stage 4-Selection of important input variables: The AutoML approach automated the process of computing the importance scores of all input variables and identifying the most important ones, using the technique of permutation feature importance.

Results

Using the AutoML approach, the weighted ensemble model and the CatBoost model showed the best performance among all nine ML models. For the prediction of each outcome in each year, the five most important input variables were identified, most of which were obtained from self-reported questionnaire surveys and radiographic imaging reports.

Conclusion

The AutoML approach has shown potential in automating the process of using ML models to predict long-term changes in knee OA-related outcomes. Its use could support the deployment of ML solutions, facilitating the provision of personalized interventions to prevent the deterioration of knee health and incident knee OA.
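Stage 4 of the Design relies on permutation feature importance. As a rough illustration of the idea (not the study's AutoML implementation), the sketch below shuffles one feature column at a time and records the resulting drop in accuracy. The toy data, the `toy_model` stand-in classifier, and the helper names are all assumptions made for this example.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return hits / len(rows)


def permutation_importance(model, rows, labels, n_repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data: the label depends only on feature 0; feature 1 is pure noise.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [int(row[0] > 0.5) for row in rows]


def toy_model(row):
    # Stand-in for a fitted classifier; it reads feature 0 only.
    return int(row[0] > 0.5)


importances = permutation_importance(toy_model, rows, labels)
```

In this toy setup, shuffling the noise feature leaves accuracy unchanged, so its importance is zero, while shuffling the informative feature lowers accuracy substantially. In practice the score is usually computed on held-out data rather than on the training set.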

Yu C, Li Y, Yin M, Gao J, Xi L, Lin J, Liu L, Zhang H, Wu A, Xu C, Liu X, Wang Y, Zhu J. Automated Machine Learning in Predicting 30-Day Mortality in Patients with Non-Cholestatic Cirrhosis. J Pers Med 2022;12:1930. [PMID: 36422105 PMCID: PMC9693570 DOI: 10.3390/jpm12111930] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/15/2022] [Revised: 11/09/2022] [Accepted: 11/18/2022] [Indexed: 07/30/2023] Open

Affiliation(s)

Chenyan Yu Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China; Department of Gastroenterology, Dushu Lake Hospital Affiliated to Soochow University, Suzhou 215000, China
Yao Li Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Minyue Yin Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Jingwen Gao Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Liting Xi Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Jiaxi Lin Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Lu Liu Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Huixian Zhang Department of Gastroenterology, Dushu Lake Hospital Affiliated to Soochow University, Suzhou 215000, China
Airong Wu Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Chunfang Xu Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Xiaolin Liu Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China
Yue Wang Department of Hepatology, The Fifth People’s Hospital of Suzhou, Suzhou 215000, China
Jinzhou Zhu Department of Gastroenterology, The First Affiliated Hospital of Soochow University, 188 Shizi Street, Suzhou 215006, China; Suzhou Clinical Center of Digestive Diseases, Suzhou 215000, China

Zhou K, Huang X, Song Q, Chen R, Hu X. Auto-GNN: Neural architecture search of graph neural networks. Front Big Data 2022;5:1029307. [PMID: 36466713 PMCID: PMC9714572 DOI: 10.3389/fdata.2022.1029307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2022] [Accepted: 10/26/2022] [Indexed: 09/19/2023] Open

Chen J, Zhang J, Zhao H. Quantifying Alignment Deviations for the In-Plane Biaxial Test System via a Shape-Optimised Cruciform Specimen. Materials (Basel) 2022;15:4949. [PMID: 35888416 DOI: 10.3390/ma15144949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 07/07/2022] [Accepted: 07/10/2022] [Indexed: 11/17/2022]

Topcu Dİ, Bayraktar N. Searching for the urine osmolality surrogate: an automated machine learning approach. Clin Chem Lab Med 2022;60:1911-1920. [PMID: 35778953 DOI: 10.1515/cclm-2022-0415] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/26/2022] [Accepted: 06/22/2022] [Indexed: 12/30/2022]

Yin M, Zhang R, Zhou Z, Liu L, Gao J, Xu W, Yu C, Lin J, Liu X, Xu C, Zhu J. Automated Machine Learning for the Early Prediction of the Severity of Acute Pancreatitis in Hospitals. Front Cell Infect Microbiol 2022;12:886935. [PMID: 35755847 PMCID: PMC9226483 DOI: 10.3389/fcimb.2022.886935] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/01/2022] [Accepted: 04/29/2022] [Indexed: 11/13/2022] Open

Abstract

Background

Machine learning (ML) algorithms are widely applied in building medical models because of their strong learning and generalization ability. This study aims to explore different ML models for early identification of severe acute pancreatitis (SAP) among patients hospitalized for acute pancreatitis.

Methods

This retrospective study enrolled patients with acute pancreatitis (AP) from multiple centers. Data from the First Affiliated Hospital and Changshu No. 1 Hospital of Soochow University were adopted for training and internal validation, and data from the Second Affiliated Hospital of Soochow University were adopted for external validation from January 2017 to December 2021. The diagnosis of AP and SAP was based on the 2012 revised Atlanta classification of acute pancreatitis. Models were built using traditional logistic regression (LR) and automated machine learning (AutoML) analysis with five types of algorithms. The performance of models was evaluated by the receiver operating characteristic (ROC) curve, the calibration curve, and the decision curve analysis (DCA) based on LR and feature importance, SHapley Additive exPlanation (SHAP) Plot, and Local Interpretable Model Agnostic Explanation (LIME) based on AutoML.

Results

A total of 1,012 patients were included in this study to develop the AutoML models in the training/validation dataset. An independent dataset of 212 patients was used to test the models. The model developed by the gradient boost machine (GBM) outperformed other models with an area under the ROC curve (AUC) of 0.937 in the validation set and an AUC of 0.945 in the test set. Furthermore, the GBM model achieved the highest sensitivity value of 0.583 among these AutoML models. The model developed by eXtreme Gradient Boosting (XGBoost) achieved the highest specificity value of 0.980 and the highest accuracy of 0.958 in the test set.

Conclusions

The AutoML model based on the GBM algorithm for early prediction of SAP showed evident clinical practicability.
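The Results compare models by the area under the ROC curve (AUC). One way to read that number is as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way on made-up scores; the function name `roc_auc` and the example data are assumptions for illustration, not the study's code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Four made-up cases: two negatives (label 0) and two positives (label 1).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Here three of the four positive-negative pairs are ordered correctly, so `auc` is 0.75. The pairwise loop is quadratic in the number of cases; rank-based formulas give the same value more efficiently on large datasets.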

Ritter Z, Papp L, Zámbó K, Tóth Z, Dezső D, Veres DS, Máthé D, Budán F, Karádi É, Balikó A, Pajor L, Szomor Á, Schmidt E, Alizadeh H. Two-Year Event-Free Survival Prediction in DLBCL Patients Based on In Vivo Radiomics and Clinical Parameters. Front Oncol 2022;12:820136. [PMID: 35756658 PMCID: PMC9216187 DOI: 10.3389/fonc.2022.820136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/22/2021] [Accepted: 05/18/2022] [Indexed: 12/11/2022] Open

Manduchi E, Le TT, Fu W, Moore JH. Genetic Analysis of Coronary Artery Disease Using Tree-Based Automated Machine Learning Informed By Biology-Based Feature Selection. IEEE/ACM Trans Comput Biol Bioinform 2022;19:1379-1386. [PMID: 34310318 PMCID: PMC9291719 DOI: 10.1109/tcbb.2021.3099068] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]

Yuan H, Xie F, Eng Hock Ong M, Ning Y, Lucas Chee M, Ehsan Saffari S, Rizal Abdullah H, Alan Goldstein B, Chakraborty B, Liu N. AutoScore-Imbalance: An Interpretable Machine Learning Tool for Development of Clinical Scores with Rare Events Data. J Biomed Inform 2022;129:104072. [PMID: 35421602 DOI: 10.1016/j.jbi.2022.104072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/14/2021] [Revised: 03/10/2022] [Accepted: 04/07/2022] [Indexed: 02/06/2023]

Abstract

BACKGROUND

Medical decision-making impacts both individual and public health. Clinical scores are commonly used among various decision-making models to determine the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. However, its current framework still leaves room for improvement when addressing unbalanced data of rare events.

METHODS

Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. Baseline techniques for performance comparison included the original AutoScore, full logistic regression, stepwise logistic regression, least absolute shrinkage and selection operator (LASSO), full random forest, and random forest with a reduced number of variables. These models were evaluated based on their area under the curve (AUC) in the receiver operating characteristic analysis and balanced accuracy (i.e., mean value of sensitivity and specificity). By utilizing a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and baseline approaches to predict inpatient mortality.

RESULTS

AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839), while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.800). The AutoScore-Imbalance sub-model (using a down-sampling algorithm) yielded an AUC of 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. Furthermore, AutoScore-Imbalance obtained the highest balanced accuracy of 0.757 (0.702-0.805), compared to 0.698 (0.643-0.753) by the original AutoScore and the maximum of 0.720 (0.664-0.769) by other baseline models.

CONCLUSIONS

We have developed an interpretable tool to handle clinical data imbalance, presented its structure, and demonstrated its superiority over baselines. The AutoScore-Imbalance tool can be applied to highly unbalanced datasets to gain further insight into rare medical events and facilitate real-world clinical decision-making.
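The Methods define balanced accuracy as the mean of sensitivity and specificity. A minimal sketch of that formula follows, with hypothetical confusion-matrix counts chosen to show why it is preferred over plain accuracy when events are rare; the counts are invented and do not come from the study.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical rare-event counts: 10 positives, 90 negatives.
tp, fn, tn, fp = 2, 8, 90, 0
bal = balanced_accuracy(tp, fn, tn, fp)
plain = (tp + tn) / (tp + fn + tn + fp)
```

With these counts the classifier misses most positives, yet plain accuracy is 0.92 because negatives dominate; balanced accuracy is 0.6, which reflects the poor sensitivity.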

Angarita-Zapata JS, Maestre-Gongora G, Calderín JF. A Bibliometric Analysis and Benchmark of Machine Learning and AutoML in Crash Severity Prediction: The Case Study of Three Colombian Cities. Sensors (Basel) 2021;21:8401. [PMID: 34960494 PMCID: PMC8708527 DOI: 10.3390/s21248401] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/07/2021] [Revised: 12/10/2021] [Accepted: 12/14/2021] [Indexed: 11/16/2022]

Abstract

Traffic accidents are of worldwide concern, as they are one of the leading causes of death globally. One policy designed to cope with them is the design and deployment of road safety systems. These aim to predict crashes based on historical records, provided by new Internet of Things (IoT) technologies, to enhance traffic flow management and promote safer roads. Increasing data availability has helped machine learning (ML) to address the prediction of crashes and their severity. The literature reports numerous contributions regarding survey papers, experimental comparisons of various techniques, and the design of new methods at the point where crash severity prediction (CSP) and ML converge. Despite such progress, and as far as we know, there are no comprehensive research articles that theoretically and practically approach the model selection problem (MSP) in CSP. Thus, this paper introduces a bibliometric analysis and experimental benchmark of ML and automated machine learning (AutoML) as a suitable approach to automatically address the MSP in CSP. Firstly, 2318 bibliographic references were consulted to identify relevant authors, trending topics, keywords evolution, and the most common ML methods used in related case studies, which revealed an opportunity for the use of AutoML in the transportation field. Then, we compared AutoML (AutoGluon, Auto-sklearn, TPOT) and ML (CatBoost, Decision Tree, Extra Trees, Gradient Boosting, Gaussian Naive Bayes, Light Gradient Boosting Machine, Random Forest) methods in three case studies using open data portals belonging to the cities of Medellín, Bogotá, and Bucaramanga in Colombia. Our experimentation reveals that AutoGluon and CatBoost are competitive and robust ML approaches to deal with various CSP problems. In addition, we concluded that general-purpose AutoML effectively supports the MSP in CSP without developing domain-focused AutoML methods for this supervised learning problem. Finally, based on the results obtained, we introduce challenges and research opportunities that the community should explore to enhance the contributions that ML and AutoML can bring to CSP and other transportation areas.

Manduchi E, Moore JH. Leveraging Automated Machine Learning for the Analysis of Global Public Health Data: A Case Study in Malaria. Int J Public Health 2021;66:614296. [PMID: 34744577 PMCID: PMC8565284 DOI: 10.3389/ijph.2021.614296] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2020] [Accepted: 03/17/2021] [Indexed: 11/13/2022] Open

Wang K, Xue Q, Lu JJ. Risky Driver Recognition with Class Imbalance Data and Automated Machine Learning Framework. Int J Environ Res Public Health 2021;18:7534. [PMID: 34299986 DOI: 10.3390/ijerph18147534] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/12/2021] [Revised: 06/26/2021] [Accepted: 07/03/2021] [Indexed: 11/26/2022]

Chen YW, Song Q, Liu X, Sastry PS, Hu X. On Robustness of Neural Architecture Search Under Label Noise. Front Big Data 2021;3:2. [PMID: 33693377 PMCID: PMC7931895 DOI: 10.3389/fdata.2020.00002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/02/2019] [Accepted: 01/10/2020] [Indexed: 11/30/2022] Open

Ikemura K, Bellin E, Yagi Y, Billett H, Saada M, Simone K, Stahl L, Szymanski J, Goldstein DY, Reyes Gil M. Using Automated Machine Learning to Predict the Mortality of Patients With COVID-19: Prediction Model Development Study. J Med Internet Res 2021;23:e23458. [PMID: 33539308 PMCID: PMC7919846 DOI: 10.2196/23458] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 08/17/2020] [Revised: 12/23/2020] [Accepted: 02/03/2021] [Indexed: 12/16/2022] Open

Abstract

BACKGROUND

During a pandemic, it is important for clinicians to stratify patients and decide who receives limited medical resources. Machine learning models have been proposed to accurately predict COVID-19 disease severity. Previous studies have typically tested only one machine learning algorithm and limited performance evaluation to area under the curve analysis. To obtain the best results possible, it may be important to test different machine learning algorithms to find the best prediction model.

OBJECTIVE

In this study, we aimed to use automated machine learning (autoML) to train various machine learning algorithms. We selected the model that best predicted patients' chances of surviving a SARS-CoV-2 infection. In addition, we identified which variables (ie, vital signs, biomarkers, comorbidities, etc) were the most influential in generating an accurate model.

METHODS

Data were retrospectively collected from all patients who tested positive for COVID-19 at our institution between March 1 and July 3, 2020. We collected 48 variables from each patient within 36 hours before or after the index time (ie, real-time polymerase chain reaction positivity). Patients were followed for 30 days or until death. Patients' data were used to build 20 machine learning models with various algorithms via autoML. The performance of machine learning models was measured by analyzing the area under the precision-recall curve (AUPRC). Subsequently, we established model interpretability via Shapley additive explanation and partial dependence plots to identify and rank variables that drove model predictions. Afterward, we conducted dimensionality reduction to extract the 10 most influential variables. AutoML models were retrained by only using these 10 variables, and the output models were evaluated against the model that used 48 variables.

RESULTS

Data from 4313 patients were used to develop the models. The best model that was generated by using autoML and 48 variables was the stacked ensemble model (AUPRC=0.807). The two best independent models were the gradient boost machine and extreme gradient boost models, which had an AUPRC of 0.803 and 0.793, respectively. The deep learning model (AUPRC=0.73) was substantially inferior to the other models. The 10 most influential variables for generating high-performing models were systolic and diastolic blood pressure, age, pulse oximetry level, blood urea nitrogen level, lactate dehydrogenase level, D-dimer level, troponin level, respiratory rate, and Charlson comorbidity score. After the autoML models were retrained with these 10 variables, the stacked ensemble model still had the best performance (AUPRC=0.791).

CONCLUSIONS

We used autoML to develop high-performing models that predicted the survival of patients with COVID-19. In addition, we identified important variables that correlated with mortality. This is proof of concept that autoML is an efficient, effective, and informative method for generating machine learning-based clinical decision support tools.
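The models above are ranked by the area under the precision-recall curve (AUPRC). A common step-wise estimate of that area is average precision: the mean of the precision values at each rank where a true positive is retrieved. The sketch below uses that estimate on made-up data; it is an assumption that this matches the study's tooling, which may interpolate the curve differently, and tied scores are not handled specially here.

```python
def average_precision(labels, scores):
    """Step-wise AUPRC estimate: mean precision at the rank of each positive case."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive case")
    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` cases
    return total / n_pos


# Made-up example: positives are ranked 1st and 3rd out of four cases.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

In this example the two positives contribute precisions of 1/1 and 2/3, so `ap` is 5/6. Unlike the ROC AUC, this measure ignores true negatives, which makes it sensitive to how well the minority class is ranked.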

Zhang S, Sun H, Su X, Yang X, Wang W, Wan X, Tan Q, Chen N, Yue Q, Gong Q. Automated machine learning to predict the co-occurrence of isocitrate dehydrogenase mutations and O⁶-methylguanine-DNA methyltransferase promoter methylation in patients with gliomas. J Magn Reson Imaging 2021;54:197-205. [PMID: 33393131 DOI: 10.1002/jmri.27498] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/02/2020] [Revised: 12/17/2020] [Accepted: 12/18/2020] [Indexed: 02/05/2023] Open

Abstract

Combining isocitrate dehydrogenase mutation (IDHmut) with O⁶-methylguanine-DNA methyltransferase promoter methylation (MGMTmet) has been identified as a critical prognostic molecular marker for gliomas. The aim of this study was to determine the ability of glioma radiomics features from magnetic resonance imaging (MRI) to predict the co-occurrence of IDHmut and MGMTmet by applying the tree-based pipeline optimization tool (TPOT), an automated machine learning (autoML) approach. This was a retrospective study, in which 162 patients with gliomas were evaluated, including 58 patients with co-occurrence of IDHmut and MGMTmet and 104 patients with other status comprising: IDH wildtype and MGMT unmethylated (n = 67), IDH wildtype and MGMTmet (n = 36), and IDHmut and MGMT unmethylated (n = 1). Three-dimensional (3D) T1-weighted images, gadolinium-enhanced 3D T1-weighted images (Gd-3DT1WI), T2-weighted images, and fluid-attenuated inversion recovery (FLAIR) images acquired at 3.0 T were used. Radiomics features were extracted from FLAIR and Gd-3DT1WI images. The TPOT was employed to generate the best machine learning pipeline, which contains both feature selector and classifier, based on input feature sets. A 4-fold cross-validation was used to evaluate the performance of automatically generated models. For each iteration, the training set included 121 subjects, while the test set included 41 subjects. Student's t-test or a chi-square test was applied on different clinical characteristics between two groups. Sensitivity, specificity, accuracy, kappa score, and AUC were used to evaluate the performance of TPOT-generated models. Finally, we compared the above metrics of TPOT-generated models to identify the best-performing model. Patients' ages and grades between two groups were significantly different (p = 0.002 and p = 0.000, respectively). The 4-fold cross-validation showed that gradient boosting classifier trained on shape and textual features from the Laplacian-of-Gaussian-filtered Gd-3DT1 achieved the best performance (average sensitivity = 81.1%, average specificity = 94%, average accuracy = 89.4%, average kappa score = 0.76, average AUC = 0.951). Using autoML based on radiomics features from MRI, a high discriminatory accuracy was achieved for predicting co-occurrence of IDHmut and MGMTmet in gliomas. LEVEL OF EVIDENCE: 3 TECHNICAL EFFICACY STAGE: 3.
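The 4-fold cross-validation described in this abstract can be pictured as splitting the 162 subjects into four test folds of near-equal size. The sketch below is a plain, unshuffled split written for illustration; the helper name `k_fold_indices` is an assumption, and the study's actual partitioning (for example, any shuffling or stratification) is not stated in the abstract.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(162, 4))
fold_sizes = [(len(train), len(test)) for train, test in folds]
```

Since 162 is not divisible by 4, this split gives two folds with 121 training and 41 test subjects and two folds with 122 training and 40 test subjects. Every subject appears in exactly one test fold.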

Affiliation(s)

Simin Zhang Huaxi MR Research Center (HMRRC), Functional and Molecular Imaging Key Laboratory of Sichuan Province, Department of Radiology, West China Hospital of Sichuan University, Chengdu, China; Huaxi Glioma Center, West China Hospital of Sichuan University, Chengdu, China
Huaiqiang Sun Huaxi MR Research Center (HMRRC), Functional and Molecular Imaging Key Laboratory of Sichuan Province, Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
Xiaorui Su Huaxi MR Research Center (HMRRC), Functional and Molecular Imaging Key Laboratory of Sichuan Province, Department of Radiology, West China Hospital of Sichuan University, Chengdu, China; Huaxi Glioma Center, West China Hospital of Sichuan University, Chengdu, China
Xibiao Yang Huaxi Glioma Center, West China Hospital of Sichuan University, Chengdu, China; Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
Weina Wang Huaxi MR Research Center (HMRRC), Functional and Molecular Imaging Key Laboratory of Sichuan Province, Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
Xinyue Wan Huaxi MR Research Center (HMRRC), Functional and Molecular Imaging Key Laboratory of Sichuan Province, Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
Qiaoyue Tan Huaxi MR Research Center (HMRRC), Functional and Molecular Imaging Key Laboratory of Sichuan Province, Department of Radiology, West China Hospital of Sichuan University, Chengdu, China; Division of Radiation Physics, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital of Sichuan University, Chengdu, China
Ni Chen Department of Pathology, West China Hospital of Sichuan University, Chengdu, China
Qiang Yue Huaxi Glioma Center, West China Hospital of Sichuan University, Chengdu, China; Department of Radiology, West China Hospital of Sichuan University, Chengdu, China
Qiyong Gong Huaxi MR Research Center (HMRRC), Functional and Molecular Imaging Key Laboratory of Sichuan Province, Department of Radiology, West China Hospital of Sichuan University, Chengdu, China


Dafflon J, Pinaya WHL, Turkheimer F, Cole JH, Leech R, Harris MA, Cox SR, Whalley HC, McIntosh AM, Hellyer PJ. An automated machine learning approach to predict brain age from cortical anatomical measures. Hum Brain Mapp 2020;41:3555-3566. [PMID: 32415917 PMCID: PMC7416036 DOI: 10.1002/hbm.25028] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 01/06/2020] [Revised: 04/10/2020] [Accepted: 04/21/2020] [Indexed: 12/31/2022]

Olsavszky V, Dosius M, Vladescu C, Benecke J. Time Series Analysis and Forecasting with Automated Machine Learning on a National ICD-10 Database. Int J Environ Res Public Health 2020;17:E4979. [PMID: 32664331 DOI: 10.3390/ijerph17144979] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/31/2020] [Revised: 06/29/2020] [Accepted: 07/07/2020] [Indexed: 12/22/2022]

Sakagianni A, Feretzakis G, Kalles D, Koufopoulou C, Kaldis V. Setting up an Easy-to-Use Machine Learning Pipeline for Medical Decision Support: A Case Study for COVID-19 Diagnosis Based on Deep Learning with CT Scans. Stud Health Technol Inform 2020;272:13-16. [PMID: 32604588 DOI: 10.3233/shti200481] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 11/15/2022]

Bhat GS, Shankar N, Panahi IMS. Automated machine learning based speech classification for hearing aid applications and its real-time implementation on smartphone. Annu Int Conf IEEE Eng Med Biol Soc 2020;2020:956-959. [PMID: 33018143 PMCID: PMC7545263 DOI: 10.1109/embc44109.2020.9175693] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]

Montesanto A, D'Aquila P, Lagani V, Paparazzo E, Geracitano S, Formentini L, Giacconi R, Cardelli M, Provinciali M, Bellizzi D, Passarino G. A New Robust Epigenetic Model for Forensic Age Prediction. J Forensic Sci 2020;65:1424-1431. [PMID: 32453457 DOI: 10.1111/1556-4029.14460] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/04/2020] [Revised: 04/22/2020] [Accepted: 05/04/2020] [Indexed: 12/12/2022]

Lesot MJ, Vieira S, Reformat MZ, Carvalho JP, Wilbik A, Bouchon-Meunier B, Yager RR. General-Purpose Automated Machine Learning for Transportation: A Case Study of Auto-sklearn for Traffic Forecasting. Information Processing and Management of Uncertainty in Knowledge-Based Systems 2020. [PMCID: PMC7274664 DOI: 10.1007/978-3-030-50143-3_57] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 05/02/2023]

Puri M. Automated Machine Learning Diagnostic Support System as a Computational Biomarker for Detecting Drug-Induced Liver Injury Patterns in Whole Slide Liver Pathology Images. Assay Drug Dev Technol 2020;18:1-10. [PMID: 31149832 PMCID: PMC6998050 DOI: 10.1089/adt.2019.919] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/01/2023]

Liu T, Nicholas J, Theilig MM, Guntuku SC, Kording K, Mohr DC, Ungar L. Machine Learning for Phone-Based Relationship Estimation: The Need to Consider Population Heterogeneity. Proc ACM Interact Mob Wearable Ubiquitous Technol 2019;3:145. [PMID: 32490330 PMCID: PMC7265570 DOI: 10.1145/3369820] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/11/2023]

Adamou M, Antoniou G, Greasidou E, Lagani V, Charonyktakis P, Tsamardinos I, Doyle M. Toward Automatic Risk Assessment to Support Suicide Prevention. Crisis 2018;40:249-256. [PMID: 30474411 DOI: 10.1027/0227-5910/a000561] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 01/06/2023]

Key Words
automated machine learning
clinical data
risk assessment
suicide prevention
text mining


Affiliation(s)
Marios Adamou
1 South West Yorkshire Partnership NHS Foundation Trust, Wakefield, UK; 2 Department of Computer Science, University of Huddersfield, UK
Grigoris Antoniou
2 Department of Computer Science, University of Huddersfield, UK
Elissavet Greasidou
3 Gnosis Data Analysis PC, Heraklion, Greece
Vincenzo Lagani
3 Gnosis Data Analysis PC, Heraklion, Greece; 5 Institute of Chemical Biology, Ilia State University, Tbilisi, Georgia
Paulos Charonyktakis
3 Gnosis Data Analysis PC, Heraklion, Greece
Ioannis Tsamardinos
2 Department of Computer Science, University of Huddersfield, UK; 3 Gnosis Data Analysis PC, Heraklion, Greece; 4 Computer Science Department, University of Crete, Heraklion, Greece
Michael Doyle
1 South West Yorkshire Partnership NHS Foundation Trust, Wakefield, UK

Orlenko A, Moore JH, Orzechowski P, Olson RS, Cairns J, Caraballo PJ, Weinshilboum RM, Wang L, Breitenstein MK. Considerations for automated machine learning in clinical metabolic profiling: Altered homocysteine plasma concentration associated with metformin exposure. Pac Symp Biocomput 2018;23:460-471. [PMID: 29218905 PMCID: PMC5882490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]

Abstract

With the maturation of metabolomics science and the proliferation of biobanks, clinical metabolic profiling is an increasingly opportunistic frontier for advancing translational clinical research. Automated Machine Learning (AutoML) approaches provide an exciting opportunity to guide feature selection in agnostic metabolic profiling endeavors, where potentially thousands of independent data points must be evaluated. In previous research, AutoML using high-dimensional data of varying types has been demonstrably robust, outperforming traditional approaches. However, considerations for application in clinical metabolic profiling remain to be evaluated, particularly regarding the robustness of AutoML in identifying and adjusting for common clinical confounders. In this study, we present a focused case study regarding AutoML considerations for using the Tree-Based Pipeline Optimization Tool (TPOT) in metabolic profiling of exposure to metformin in a biobank cohort. First, we propose a tandem rank-accuracy measure to guide agnostic feature selection and corresponding threshold determination in clinical metabolic profiling endeavors. Second, while AutoML with default parameters showed a potential lack of sensitivity to low-effect confounding clinical covariates, we demonstrated residual training and adjustment of metabolite features as an easily applicable approach to ensure AutoML adjustment for potential confounding characteristics. Finally, we present increased homocysteine with long-term exposure to metformin as a potentially novel, non-replicated metabolite association suggested by TPOT, an association not identified in parallel clinical metabolic profiling endeavors. While warranting independent replication, our tandem rank-accuracy measure suggests homocysteine to be the metabolite feature with the largest effect and the corresponding priority for further translational clinical research.
Residual training and adjustment for a potential confounding effect by BMI only slightly modified the suggested association. Increased homocysteine is thought to be associated with vitamin B12 deficiency; evaluation for potential clinical relevance is suggested. While considerations for clinical metabolic profiling are recommended, including adjustment approaches for clinical confounders, AutoML presents an exciting tool to enhance clinical metabolic profiling and advance translational research endeavors.
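The residual adjustment idea mentioned in this abstract, removing the part of a metabolite feature that a confounder such as BMI explains before fitting a model, can be illustrated with a one-covariate least-squares fit. This is a minimal sketch in plain Python, assuming a single linear confounder; the function name `residualize` and the toy values are hypothetical and do not reproduce the authors' procedure or data.

```python
from statistics import mean


def residualize(feature, confounder):
    """Return the feature with the linear effect of one confounder removed.

    Fits feature = intercept + slope * confounder by ordinary least
    squares and returns the residuals.
    """
    if len(feature) != len(confounder):
        raise ValueError("feature and confounder must have the same length")
    x_bar = mean(confounder)
    y_bar = mean(feature)
    sxx = sum((x - x_bar) ** 2 for x in confounder)
    if sxx == 0:
        raise ValueError("confounder has no variance")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(confounder, feature))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual = observed value minus the value predicted from the confounder.
    return [y - (intercept + slope * x) for x, y in zip(confounder, feature)]


# Toy values for illustration only (not data from the study).
bmi = [21.0, 24.5, 27.0, 30.5, 33.0]
homocysteine = [8.1, 9.0, 9.4, 10.6, 11.2]
adjusted = residualize(homocysteine, bmi)
print([round(r, 3) for r in adjusted])
```

By construction the adjusted values have zero mean and no linear correlation with the confounder, so a downstream feature-selection step cannot pick up the metabolite merely because it tracks BMI.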
