Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Luo G. MLBCD: a machine learning tool for big clinical data. Health Inf Sci Syst 2015;3:3. [PMID: 26417431 PMCID: PMC4584489 DOI: 10.1186/s13755-015-0011-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 04/27/2015] [Accepted: 09/22/2015] [Indexed: 12/12/2022] Open

Cited by Other Article(s)

Jiang S, Wang T, Zhang KH. Data-driven decision-making for precision diagnosis of digestive diseases. Biomed Eng Online 2023;22:87. [PMID: 37658345 PMCID: PMC10472739 DOI: 10.1186/s12938-023-01148-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 08/15/2023] [Indexed: 09/03/2023] Open

Yamanouchi Y, Nakamura T, Ikeda T, Usuku K. An Alternative Application of Natural Language Processing to Express a Characteristic Feature of Diseases in Japanese Medical Records. Methods Inf Med 2023;62:110-118. [PMID: 36809794 PMCID: PMC10462427 DOI: 10.1055/a-2039-3773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2021] [Accepted: 04/13/2022] [Indexed: 02/23/2023]

Abstract

BACKGROUND

Owing to the linguistic situation, Japanese natural language processing (NLP) requires morphological analyses for word segmentation using dictionary techniques.

OBJECTIVE

We aimed to clarify whether it can be substituted with an open-end discovery-based NLP (OD-NLP), which does not use any dictionary techniques.

METHODS

Clinical texts from the first medical visit were collected to compare OD-NLP with word dictionary-based NLP (WD-NLP). Topics were generated in each document using a topic model and later matched to the respective diseases as defined in the International Statistical Classification of Diseases and Related Health Problems, 10th Revision. The prediction accuracy and expressivity of each disease were examined in an equivalent number of entities/words after filtration with either term frequency-inverse document frequency (TF-IDF) or dominance value (DMV).

RESULTS

In documents from 10,520 observed patients, 169,913 entities and 44,758 words were segmented using OD-NLP and WD-NLP, respectively. Without filtering, accuracy and recall levels were low, and there was no difference in the harmonic mean of the F-measure between NLPs. However, physicians reported that OD-NLP contained more meaningful words than WD-NLP. When datasets were created with an equivalent number of entities/words using TF-IDF, the F-measure in OD-NLP was higher than in WD-NLP at lower thresholds. When the threshold increased, the number of datasets created decreased, resulting in increased values of the F-measure, although the differences disappeared. Two datasets near the maximum threshold showing differences in F-measure were examined to determine whether their topics were associated with diseases. The results showed that more diseases were found in OD-NLP at lower thresholds, indicating that the topics described characteristics of diseases. The superiority remained as large as with TF-IDF when filtration was changed to DMV.

CONCLUSION

The current findings favor the use of OD-NLP to express characteristics of diseases from Japanese clinical texts and may help in the construction of document summaries and retrieval in clinical settings.

Ji W, Xue M, Zhang Y, Yao H, Wang Y. A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population. Front Public Health 2022;10:846118. [PMID: 35444985 PMCID: PMC9013842 DOI: 10.3389/fpubh.2022.846118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Accepted: 02/23/2022] [Indexed: 12/12/2022] Open

Nwanosike EM, Conway BR, Merchant HA, Hasan SS. Potential applications and performance of machine learning techniques and algorithms in clinical practice: A systematic review. Int J Med Inform 2021;159:104679. [PMID: 34990939 DOI: 10.1016/j.ijmedinf.2021.104679] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 05/29/2021] [Revised: 12/08/2021] [Accepted: 12/27/2021] [Indexed: 12/11/2022]

Abstract

PURPOSE

The advent of clinically adapted machine learning algorithms can solve numerous problems ranging from disease diagnosis and prognosis to therapy recommendations. This systematic review examines the performance of machine learning (ML) algorithms and evaluates the progress made to date towards their implementation in clinical practice.

METHODS

Databases (PubMed, MEDLINE, Scopus, Google Scholar, Cochrane Library and WHO Covid-19 database) were searched systematically to identify original articles published between January 2011 and October 2021. Studies reporting ML techniques in clinical practice involving humans and ML algorithms with a performance metric were considered.

RESULTS

Of 873 unique articles identified, 36 studies were eligible for inclusion. The XGBoost (extreme gradient boosting) algorithm showed the highest potential for clinical applications (n = 7 studies), followed jointly by the random forest algorithm, logistic regression, and the support vector machine (n = 5 studies each). Prediction of outcomes (n = 33), in particular for inflammatory diseases (n = 7), received the most attention, followed by cancer and neuropsychiatric disorders (n = 5 for each) and Covid-19 (n = 4). Thirty-three of the thirty-six included studies passed more than 50% of the selected quality assessment criteria in the TRIPOD checklist. In contrast, none of the studies achieved an ideal overall bias rating of 'low' based on the PROBAST checklist. Moreover, only three studies showed evidence of the deployment of ML algorithm(s) in clinical practice.

CONCLUSIONS

ML is potentially a reliable tool for clinical decision support. Although advocated widely in clinical practice, work is still in progress to validate clinically adapted ML algorithms. Improving quality standards, transparency, and interpretability of ML models will further lower the barriers to acceptability.

Mijwil MM. Skin cancer disease images classification using deep learning solutions. Multimedia Tools and Applications 2021. [DOI: 10.1007/s11042-021-10952-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 08/16/2020] [Revised: 11/04/2020] [Accepted: 04/14/2021] [Indexed: 08/30/2023]

Automated machine learning: Review of the state-of-the-art and opportunities for healthcare. Artif Intell Med 2020;104:101822. [DOI: 10.1016/j.artmed.2020.101822] [Citation(s) in RCA: 197] [Impact Index Per Article: 49.3] [Received: 10/08/2019] [Revised: 01/17/2020] [Accepted: 02/17/2020] [Indexed: 12/13/2022]

Classification and prediction of diabetes disease using machine learning paradigm. Health Inf Sci Syst 2020;8:7. [PMID: 31949894 DOI: 10.1007/s13755-019-0095-z] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 08/21/2019] [Accepted: 12/21/2019] [Indexed: 12/19/2022] Open

Xue M, Su Y, Li C, Wang S, Yao H. Identification of Potential Type II Diabetes in a Large-Scale Chinese Population Using a Systematic Machine Learning Framework. J Diabetes Res 2020;2020:6873891. [PMID: 33029536 PMCID: PMC7532405 DOI: 10.1155/2020/6873891] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/12/2020] [Revised: 08/01/2020] [Accepted: 09/02/2020] [Indexed: 12/19/2022] Open

Abstract

BACKGROUND

An estimated 425 million people globally have diabetes, accounting for 12% of the world's health expenditures, and the number continues to grow, placing a huge burden on the healthcare system, especially in those remote, underserved areas.

METHODS

A total of 584,168 adult subjects who had participated in the national physical examination were enrolled in this study. The risk factors for type II diabetes mellitus (T2DM) were identified by p values and odds ratios, using logistic regression (LR) based on variables from physical measurements and a questionnaire. Combined with the risk factors selected by LR, we used a decision tree, a random forest, AdaBoost with a decision tree (AdaBoost), and an extreme gradient boosting decision tree (XGBoost) to identify individuals with T2DM, compared the performance of the four machine learning classifiers, and used the best-performing classifier to output variable importance scores for T2DM.

RESULTS

The results indicated that XGBoost had the best performance (accuracy = 0.906, precision = 0.910, recall = 0.902, F-1 = 0.906, and AUC = 0.968). The degree of variables' importance scores in XGBoost showed that BMI was the most significant feature, followed by age, waist circumference, systolic pressure, ethnicity, smoking amount, fatty liver, hypertension, physical activity, drinking status, dietary ratio (meat to vegetables), drink amount, smoking status, and diet habit (oil loving).
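As a quick consistency check on the figures above, the following is a minimal sketch that recomputes F-1 from the reported precision and recall. It assumes the standard definition of F-1 as the harmonic mean of precision and recall; the abstract does not state the formula used, and the function name `f1_from_precision_recall` is hypothetical.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F-1).

    Assumption: this is the usual textbook definition, not a formula
    taken from the cited study.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for XGBoost.
reported_precision = 0.910
reported_recall = 0.902

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 3))  # agrees with the reported F-1 of 0.906
```

Under this assumption, the recomputed value matches the reported F-1 to three decimal places.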

CONCLUSIONS

We proposed a classifier based on LR-XGBoost that uses fourteen easily obtained, noninvasive patient variables as predictors to identify potential incidents of T2DM. The classifier can accurately screen for the risk of diabetes in the early phase, and the variable importance scores give clues for preventing diabetes occurrence.

Wu DTY, Vennemeyer S, Brown K, Revalee J, Murdock P, Salomone S, France A, Clarke-Myers K, Hanke SP. Usability Testing of an Interactive Dashboard for Surgical Quality Improvement in a Large Congenital Heart Center. Appl Clin Inform 2019;10:859-869. [PMID: 31724143 DOI: 10.1055/s-0039-1698466] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/14/2022] Open

Abstract

BACKGROUND

Interactive data visualization and dashboards can be an effective way to explore meaningful patterns in large clinical data sets and to inform quality improvement initiatives. However, these interactive dashboards may have usability issues that undermine their effectiveness. These usability issues can be attributed to mismatched mental models between the designers and the users. Unfortunately, very few evaluation studies in visual analytics have specifically examined such mismatches between these two groups.

OBJECTIVES

We aimed to evaluate the usability of an interactive surgical dashboard and to seek opportunities for improvement. We also aimed to provide empirical evidence to demonstrate the mismatched mental models between the designers and the users of the dashboard.

METHODS

An interactive dashboard was developed in a large congenital heart center. This dashboard provides real-time, interactive access to clinical outcomes data for the surgical program. A mixed-method, two-phase study was conducted to collect user feedback. A group of designers (N = 3) and a purposeful sample of users (N = 12) were recruited. The qualitative data were analyzed thematically. The dashboards were compared using the System Usability Scale (SUS) and qualitative data.

RESULTS

The participating users gave an average SUS score of 82.9 on the new dashboard and 63.5 on the existing dashboard (p = 0.006). The participants achieved high task accuracy when using the new dashboard. The qualitative analysis revealed three opportunities for improvement. The data analysis and triangulation provided empirical evidence of the mismatched mental models.

CONCLUSION

We conducted a mixed-method usability study on an interactive surgical dashboard and identified areas for improvement. Our study design can be an effective and efficient way to evaluate visual analytics systems in health care. We encourage researchers and practitioners to conduct user-centered evaluation and implement education plans to mitigate potential usability challenges and increase user satisfaction and adoption.

Wang X, Williams C, Liu ZH, Croghan J. Big data management challenges in health research-a literature review. Brief Bioinform 2019;20:156-167. [PMID: 28968677 DOI: 10.1093/bib/bbx086] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 03/27/2017] [Indexed: 12/12/2022] Open

Zeng X, Luo G. Progressive sampling-based Bayesian optimization for efficient and automatic machine learning model selection. Health Inf Sci Syst 2017;5:2. [PMID: 29038732 PMCID: PMC5617811 DOI: 10.1007/s13755-017-0023-z] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 04/02/2017] [Accepted: 09/20/2017] [Indexed: 12/11/2022] Open

Luo G, Stone BL, Johnson MD, Tarczy-Hornoch P, Wilcox AB, Mooney SD, Sheng X, Haug PJ, Nkoy FL. Automating Construction of Machine Learning Models With Clinical Big Data: Proposal Rationale and Methods. JMIR Res Protoc 2017;6:e175. [PMID: 28851678 PMCID: PMC5596298 DOI: 10.2196/resprot.7757] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 03/26/2017] [Revised: 07/14/2017] [Accepted: 07/15/2017] [Indexed: 12/14/2022] Open

Abstract

Background

To improve health outcomes and cut health care costs, we often need to conduct prediction/classification using large clinical datasets (aka, clinical big data), for example, to identify high-risk patients for preventive interventions. Machine learning has been proposed as a key technology for doing this. Machine learning has won most data science competitions and could support many clinical activities, yet only 15% of hospitals use it for even limited purposes. Despite familiarity with data, health care researchers often lack machine learning expertise to directly use clinical big data, creating a hurdle in realizing value from their data. Health care researchers can work with data scientists with deep machine learning knowledge, but it takes time and effort for both parties to communicate effectively. Facing a shortage in the United States of data scientists and hiring competition from companies with deep pockets, health care systems have difficulty recruiting data scientists. Building and generalizing a machine learning model often requires hundreds to thousands of manual iterations by data scientists to select the following: (1) hyper-parameter values and complex algorithms that greatly affect model accuracy and (2) operators and periods for temporally aggregating clinical attributes (eg, whether a patient’s weight kept rising in the past year). This process becomes infeasible with limited budgets.

Objective

This study’s goal is to enable health care researchers to directly use clinical big data, make machine learning feasible with limited budgets and data scientist resources, and realize value from data.

Methods

This study will allow us to achieve the following: (1) finish developing the new software, Automated Machine Learning (Auto-ML), to automate model selection for machine learning with clinical big data and validate Auto-ML on seven benchmark modeling problems of clinical importance; (2) apply Auto-ML and novel methodology to two new modeling problems crucial for care management allocation and pilot one model with care managers; and (3) perform simulations to estimate the impact of adopting Auto-ML on US patient outcomes.

Results

We are currently writing Auto-ML’s design document. We intend to finish our study by around the year 2022.

Conclusions

Auto-ML will generalize to various clinical prediction/classification problems. With minimal help from data scientists, health care researchers can use Auto-ML to quickly build high-quality models. This will boost wider use of machine learning in health care and improve patient outcomes.

de Silva NHND. Relational Databases and Biomedical Big Data. Methods Mol Biol 2017;1617:69-81. [PMID: 28540677 DOI: 10.1007/978-1-4939-7046-9_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/22/2023]

Luo G. PredicT-ML: a tool for automating machine learning model building with big clinical data. Health Inf Sci Syst 2016;4:5. [PMID: 27280018 PMCID: PMC4897944 DOI: 10.1186/s13755-016-0018-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/02/2016] [Accepted: 06/01/2016] [Indexed: 12/16/2022] Open

Optimized Distributed Hyperparameter Search and Simulation for Lung Texture Classification in CT Using Hadoop. J Imaging 2016. [DOI: 10.3390/jimaging2020019] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/16/2022] Open

A review of automatic selection methods for machine learning algorithms and hyper-parameter values. Netw Model Anal Health Inform Bioinform 2016. [DOI: 10.1007/s13721-016-0125-6] [Citation(s) in RCA: 130] [Impact Index Per Article: 16.3] [Indexed: 11/26/2022]

Petridis AK, Fischer I, Cornelius JF, Kamp MA, Ringel F, Tortora A, Steiger HJ. Demographic distribution of hospital admissions for brain arteriovenous malformations in Germany--estimation of the natural course with the big-data approach. Acta Neurochir (Wien) 2016;158:791-796. [PMID: 26873715 DOI: 10.1007/s00701-016-2727-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 12/02/2015] [Accepted: 01/27/2016] [Indexed: 11/29/2022]

Abstract

BACKGROUND

Estimation of the natural history of arteriovenous malformations based on short-term observation is potentially biased by multiple factors. Retrieval of demographic information of all AVM patients of national data pools and comparison with the national demographic profile might be another way to approach the natural history.

MATERIALS AND METHODS

Upon request, the German Federal Statistical Office provided the numbers of patients admitted in Germany from 2009 through 2013 with ICD Q28.2 (brain AVM) as primary discharge diagnosis, and the corresponding age distribution. Age-related admission rates of AVM were calculated by comparison with the German demographic distribution.

RESULTS

A total of 6527 patients were hospitalized from 2009 through 2013 with brain AVM (Q28.2) as the principal diagnosis. The age-specific admission rate during the first year of life was high, at 19.0/100,000 during the 5-year study period, corresponding to a yearly admission rate of 3.8 per 100,000 babies. Apart from the high admission rate during the first year of life, the admission rate was low but steadily increased during the first decades of life, reaching a plateau of 11.1/100,000 in the age group 30-34 years, corresponding to an annual admission rate of 2.2/100,000. After the age of 30-34 years, admission rates decreased continuously, reaching 0 in the age group 90-95 years. The lifetime risk of admission in terms of admissions per 100,000 age-matched people was calculated by retrograde integration of the admission rates. At the age of 1 year, the cumulative number of future admissions for AVM during lifetime amounted to 131.3/100,000 children. For the older age groups, the chance of future admission for AVM decreased as expected, reaching 43.8/100,000 by the age of 50 and 0 by the age of 90.
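The conversion from 5-year to yearly admission rates quoted above can be reproduced with simple division. This is a minimal sketch that assumes the yearly figure is the period rate divided evenly by the number of study years; the abstract does not spell out the calculation, and the function name `annualized_rate` is hypothetical.

```python
def annualized_rate(period_rate_per_100k: float, years: int) -> float:
    """Spread a multi-year admission rate evenly over the study years.

    Assumption: the yearly rate is the period rate divided by the
    number of years, with no weighting by year.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    return period_rate_per_100k / years


STUDY_YEARS = 5  # 2009 through 2013

# First year of life: 19.0 per 100,000 over the study period.
print(round(annualized_rate(19.0, STUDY_YEARS), 1))  # matches the quoted 3.8

# Age group 30-34 years: 11.1 per 100,000 over the study period.
print(round(annualized_rate(11.1, STUDY_YEARS), 1))  # matches the quoted 2.2
```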

CONCLUSIONS

Despite some open issues, the current data suggest that achieving old age with an untreated brain AVM is unlikely. Furthermore, the data support the concept that most brain AVMs are not necessarily a congenital entity but develop during the first decades of life.

Predictive Business Process Monitoring Framework with Hyperparameter Optimization. Advanced Information Systems Engineering 2016. [DOI: 10.1007/978-3-319-39696-5_22] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 12/02/2022]