1
Niemann U, Boecking B, Brueggemann P, Spiliopoulou M, Mazurek B. Heterogeneity in response to treatment across tinnitus phenotypes. Sci Rep 2024; 14:2111. [PMID: 38267701 PMCID: PMC10808188 DOI: 10.1038/s41598-024-52651-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 01/22/2024] [Indexed: 01/26/2024] Open
Abstract
The clinical heterogeneity of chronic tinnitus poses major challenges to patient management and prompts the identification of distinct patient subgroups (or phenotypes) that respond more predictably to a particular treatment. We model heterogeneity in treatment response among phenotypes of tinnitus patients concerning their change in self-reported health burden, psychological characteristics, and tinnitus characteristics. Before and after a 7-day multimodal treatment, 989 tinnitus patients completed 14 assessment questionnaires, from which 64 variables measured general tinnitus characteristics, quality of life, pain experiences, somatic expressions, affective symptoms, tinnitus-related distress, internal resources, and perceived stress. Our approach encompasses mechanisms for patient phenotyping, visualizations of the phenotypes and their change with treatment in a projected space, and the extraction of patient subgroups based on their change with treatment. On average, all four distinct phenotypes identified at the pre-intervention baseline showed improved values for nearly all the considered variables following the intervention. However, considerable intra-phenotype heterogeneity was noted. Five clusters of change reflected variations in the observed improvements among individuals. These patterns of treatment effects were found to be associated with baseline phenotypes. Our exploratory approach establishes a groundwork for future studies incorporating control groups to pinpoint patient subgroups that are more likely to benefit from specific treatments. This strategy not only has the potential to advance personalized medicine but can also be extended to a broader spectrum of patients with various chronic conditions.
Affiliation(s)
- Uli Niemann
- University Library, Otto von Guericke University Magdeburg, Universitätsplatz 2, Magdeburg, 39106, Germany.
- Faculty of Computer Science, Otto von Guericke University Magdeburg, Universitätsplatz 2, Magdeburg, 39106, Germany.
- Benjamin Boecking
- Charité-Universitaetsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, 10117, Germany
- Petra Brueggemann
- Charité-Universitaetsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, 10117, Germany
- Myra Spiliopoulou
- Faculty of Computer Science, Otto von Guericke University Magdeburg, Universitätsplatz 2, Magdeburg, 39106, Germany
- Birgit Mazurek
- Charité-Universitaetsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, 10117, Germany
2
Zaji A, Liu Z, Bando T, Zhao L. Ontology-Based Driving Simulation for Traffic Lights Optimization. ACM T INTEL SYST TEC 2023. [DOI: 10.1145/3579839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
Abstract
Traffic light optimization is one of the principal means of improving traffic flow and reducing travel time in an urban area. The present article introduces a novel procedure for designing the traffic lights in a city using evolutionary optimization algorithms in combination with an ontology-based driving behavior simulation framework. Accordingly, an ontology-based knowledge base is introduced to provide machine-understandable knowledge of roads and intersections, traffic rules, and driving behaviors. Then, a simulation environment is developed to inspect car behavior in real time. To optimize the traffic lights, a sine-based equation was defined for each traffic light, and the total travel time of the vehicles was used as the cost function in the optimization algorithm. The optimization was performed with 5, 10, 15, 20, 25, and 30 vehicles in the urban areas. Based on the results, in contrast to uncontrolled intersections without traffic lights, optimized traffic lights can significantly reduce total travel time. In conclusion, the importance of optimized traffic lights grows with the number of vehicles, and unoptimized traffic lights can increase total travel time even beyond that of a city without any traffic lights.
Affiliation(s)
- Zheng Liu
- School of Engineering, University of British Columbia, Canada
- Takashi Bando
- Silicon Valley Innovation Center, DENSO International America, Inc., USA
3
Emmert-Streib F, Yli-Harja O. What Is a Digital Twin? Experimental Design for a Data-Centric Machine Learning Perspective in Health. Int J Mol Sci 2022; 23:13149. [PMID: 36361936 PMCID: PMC9653941 DOI: 10.3390/ijms232113149] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/19/2022] [Revised: 10/25/2022] [Accepted: 10/27/2022] [Indexed: 08/08/2023] Open
Abstract
The idea of a digital twin has recently gained widespread attention. While, so far, it has been used predominantly for problems in engineering and manufacturing, it is believed that a digital twin also holds great promise for applications in medicine and health. However, a problem that severely hampers progress in these fields is the lack of a solid definition of the concept behind a digital twin that would be directly amenable to such big data-driven fields requiring a statistical data analysis. In this paper, we address this problem. We will see that the term 'digital twin', as used in the literature, is like a Matryoshka doll. For this reason, we unstack the concept via a data-centric machine learning perspective, allowing us to define its main components. As a consequence, we suggest using the term Digital Twin System instead of digital twin because this highlights its complex interconnected substructure. In addition, we address ethical concerns that result from treatment suggestions for patients based on simulated data and a possible lack of explainability of the underlying models.
Affiliation(s)
- Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, 33100 Tampere, Finland
- Olli Yli-Harja
- Computational Systems Biology, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland
- Institute for Systems Biology, Seattle, WA 98195, USA
4
Zhang D, Li Y, Kalbaugh CA, Shi L, Divers J, Islam S, Annex BH. Machine Learning Approach to Predict In-Hospital Mortality in Patients Admitted for Peripheral Artery Disease in the United States. J Am Heart Assoc 2022; 11:e026987. [PMID: 36216437 DOI: 10.1161/jaha.122.026987] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/16/2022]
Abstract
Background: Peripheral artery disease (PAD) affects >10 million people in the United States. PAD is associated with poor outcomes, including premature death. Machine learning (ML) has been increasingly used on big data to predict clinical outcomes. This study aims to develop ML models to predict in-hospital mortality in patients hospitalized for PAD based on a national database. Methods and Results: Inpatient hospitalization data were obtained from the 2016 to 2019 National Inpatient Sample. A total of 150 921 inpatients were identified with a primary diagnosis of PAD and PAD-related procedures using codes of the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) and International Classification of Diseases, Tenth Revision, Procedure Coding System (ICD-10-PCS). Four ML models, including logistic regression, random forest, light gradient boosting, and extreme gradient boosting models, were trained to predict the risk of in-hospital death based on a selection of variables, including patient characteristics, comorbidities, procedures, and hospital-related factors. In-hospital mortality occurred in 1.8% of patients. The performance of the 4 models was comparable, with the area under the receiver operating characteristic curve ranging from 0.83 to 0.85, sensitivity of 77% to 82%, and specificity of 72% to 75%. These results suggest adequate predictability for clinical decision-making. In all 4 models, the total number of diagnoses and procedures, age, endovascular revascularization procedure, congestive heart failure, diabetes, and diabetes with complications were critical predictors of in-hospital mortality. Conclusions: This study demonstrates the feasibility of ML in predicting in-hospital mortality in patients with a primary PAD diagnosis. Findings highlight the potential of ML models in identifying high-risk patients for poor outcomes and guiding personalized intervention.
Affiliation(s)
- Donglan Zhang
- Division of Health Services Research, Department of Foundations of Medicine New York University Long Island School of Medicine Mineola NY
- Yike Li
- Department of Otolaryngology-Head and Neck Surgery, Bill Wilkerson Center Vanderbilt University Medical Center Nashville TN
- Lu Shi
- Department of Public Health Sciences Clemson University Clemson SC
- Jasmin Divers
- Division of Health Services Research, Department of Foundations of Medicine New York University Long Island School of Medicine Mineola NY
- Shahidul Islam
- Division of Health Services Research, Department of Foundations of Medicine New York University Long Island School of Medicine Mineola NY
- Brian H Annex
- Department of Medicine and Vascular Biology Center Medical College of Georgia Augusta GA
5
Mondol RK, Truong ND, Reza M, Ippolito S, Ebrahimie E, Kavehei O. AFExNet: An Adversarial Autoencoder for Differentiating Breast Cancer Sub-Types and Extracting Biologically Relevant Genes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2060-2070. [PMID: 33720833 DOI: 10.1109/tcbb.2021.3066086] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Technological advancements in high-throughput genomics enable the generation of complex and large data sets that can be used for classification, clustering, and bio-marker identification. Modern deep learning algorithms provide us with the opportunity to find the most significant features in such huge datasets to characterize diseases (e.g., cancer) and their sub-types. Thus, developing a deep learning method that can successfully extract meaningful features from various breast cancer sub-types is of current research interest. In this paper, we develop a dual-stage (unsupervised pre-training and supervised fine-tuning) neural network architecture termed AFExNet, based on an adversarial auto-encoder (AAE), to extract features from high-dimensional genetic data. We evaluated the performance of our model through twelve different supervised classifiers to verify the usefulness of the new features using a public RNA-Seq dataset of breast cancer. AFExNet provides consistent results in all performance metrics across twelve different classifiers, which makes our model classifier independent. We also develop a method named 'TopGene' to find highly weighted genes from the latent space, which could be useful for finding cancer bio-markers. Put together, AFExNet has great potential to extract features from biological data accurately and effectively. Our work is fully reproducible and source code can be downloaded from Github: https://github.com/NeuroSyd/breast-cancer-sub-types.
6
Martínez-García M, Hernández-Lemus E. Data Integration Challenges for Machine Learning in Precision Medicine. Front Med (Lausanne) 2022; 8:784455. [PMID: 35145977 PMCID: PMC8821900 DOI: 10.3389/fmed.2021.784455] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/27/2021] [Accepted: 12/28/2021] [Indexed: 12/19/2022] Open
Abstract
A main goal of precision medicine is to incorporate and integrate the vast corpora held in different databases about the molecular and environmental origins of disease into analytic frameworks, allowing the development of individualized, context-dependent diagnostic and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at prediction of personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to be able to efficiently manage, visualize, and integrate large datasets combining structured and unstructured formats. This needs to be done while constrained by different levels of confidentiality, ideally within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in the design of successful approaches to medical data analytics under the currently demanding performance conditions of personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we review some of these constraints and discuss possible avenues to overcome current challenges.
Affiliation(s)
- Mireya Martínez-García
- Clinical Research Division, National Institute of Cardiology ‘Ignacio Chávez’, Mexico City, Mexico
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
7
8
Lee Y, Veerubhotla K, Jeong MH, Lee CH. Deep Learning in Personalization of Cardiovascular Stents. J Cardiovasc Pharmacol Ther 2020; 25:110-120. [DOI: 10.1177/1074248419878405] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 08/30/2023]
Abstract
Deep learning (DL) has demonstrated enormous potential in accomplishing biomedical tasks, such as vessel segmentation, brain visualization, and speech recognition. This review article mainly covers recent advances in the principles of DL algorithms, existing DL software, and design strategies for DL models. The latest progress in cardiovascular devices, especially DL-based cardiovascular stents used for angioplasty, differential and advanced diagnostic means, and the treatment outcomes associated with coronary artery disease (CAD), is discussed. Also presented are the DL-based discovery of new materials and future medical technologies that will facilitate the development of tailored and personalized treatment strategies by identifying and forecasting individual impending risks of cardiovascular diseases.
Affiliation(s)
- Yugyung Lee
- School of Computing and Engineering, University of Missouri-Kansas City, MO, USA
- Krishna Veerubhotla
- Division of Pharmaceutical Sciences, School of Pharmacy, University of Missouri-Kansas City, MO, USA
- Myung Ho Jeong
- Department of Cardiovascular Medicine of Chonnam National University, Gwang-Ju, South Korea
- Chi H. Lee
- Division of Pharmaceutical Sciences, School of Pharmacy, University of Missouri-Kansas City, MO, USA
9
Zerka F, Barakat S, Walsh S, Bogowicz M, Leijenaar RTH, Jochems A, Miraglio B, Townend D, Lambin P. Systematic Review of Privacy-Preserving Distributed Machine Learning From Federated Databases in Health Care. JCO Clin Cancer Inform 2020; 4:184-200. [PMID: 32134684 PMCID: PMC7113079 DOI: 10.1200/cci.19.00047] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Accepted: 01/16/2020] [Indexed: 02/06/2023] Open
Abstract
Big data for health care is one of the potential solutions to deal with the numerous challenges of health care, such as rising cost, aging population, precision medicine, universal health coverage, and the increase of noncommunicable diseases. However, data centralization for big data raises privacy and regulatory concerns. Covered topics include (1) an introduction to privacy of patient data and distributed learning as a potential solution to preserving these data, a description of the legal context for patient data research, and a definition of machine/deep learning concepts; (2) a presentation of the adopted review protocol; (3) a presentation of the search results; and (4) a discussion of the findings, limitations of the review, and future perspectives. Distributed learning from federated databases makes data centralization unnecessary. Distributed algorithms iteratively analyze separate databases, essentially sharing research questions and answers between databases instead of sharing the data. In other words, one can learn from separate and isolated datasets without patient data ever leaving the individual clinical institutes. Distributed learning promises great potential to facilitate big data for medical application, in particular for international consortiums. Our purpose is to review the major implementations of distributed learning in health care.
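The idea of sharing questions and answers between databases rather than the data can be illustrated with a minimal sketch. It assumes a one-variable linear model fitted by gradient descent, where each site returns only a gradient and a row count. The names `local_gradient` and `federated_fit`, the learning rate, and the toy data are hypothetical and are not taken from any implementation covered by the review.

```python
def local_gradient(weights, rows):
    """Run at one site: gradient of mean squared error for y ~ w0 + w1 * x.

    Only the gradient and the row count leave the site, never the rows.
    """
    w0, w1 = weights
    g0 = g1 = 0.0
    for x, y in rows:
        err = (w0 + w1 * x) - y
        g0 += 2 * err
        g1 += 2 * err * x
    n = len(rows)
    return (g0 / n, g1 / n), n


def federated_fit(sites, rounds=2000, lr=0.05):
    """Coordinator loop: send weights (the question), combine gradients (the answers)."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        replies = [local_gradient(weights, rows) for rows in sites]
        total = sum(n for _, n in replies)
        # Weight each site's answer by its row count so the result
        # matches the gradient of the pooled data.
        g0 = sum(g[0] * n for g, n in replies) / total
        g1 = sum(g[1] * n for g, n in replies) / total
        weights = (weights[0] - lr * g0, weights[1] - lr * g1)
    return weights


# Toy data (assumed): two sites holding rows generated from y = 1 + 2x.
sites = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0)]]
print(federated_fit(sites))
```

In this sketch the coordinator never sees a patient row. Whether exchanged gradients are themselves sufficiently private is a separate question that the sketch does not address.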
Affiliation(s)
- Fadila Zerka
- The D-Lab, Department of Precision Medicine, GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- Oncoradiomics, Liège, Belgium
- Samir Barakat
- The D-Lab, Department of Precision Medicine, GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- Oncoradiomics, Liège, Belgium
- Sean Walsh
- The D-Lab, Department of Precision Medicine, GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- Oncoradiomics, Liège, Belgium
- Marta Bogowicz
- The D-Lab, Department of Precision Medicine, GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- Department of Radiation Oncology, University Hospital Zurich and University of Zurich, Zurich, Switzerland
- Ralph T. H. Leijenaar
- The D-Lab, Department of Precision Medicine, GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- Oncoradiomics, Liège, Belgium
- Arthur Jochems
- The D-Lab, Department of Precision Medicine, GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- David Townend
- Department of Health, Ethics, and Society, CAPHRI (Care and Public Health Research Institute), Maastricht University, Maastricht, The Netherlands
- Philippe Lambin
- The D-Lab, Department of Precision Medicine, GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
10
Yang Z, Dehmer M, Yli-Harja O, Emmert-Streib F. Combining deep learning with token selection for patient phenotyping from electronic health records. Sci Rep 2020; 10:1432. [PMID: 31996705 PMCID: PMC6989657 DOI: 10.1038/s41598-020-58178-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/26/2019] [Accepted: 01/13/2020] [Indexed: 01/05/2023] Open
Abstract
Artificial intelligence provides the opportunity to reveal important information buried in large amounts of complex data. Electronic health records (eHRs) are a source of such big data that provide a multitude of health-related clinical information about patients. However, text data from eHRs, e.g., discharge summary notes, are challenging to analyze because these notes are free-form texts and the writing formats and styles vary considerably between different records. For this reason, in this paper we study deep learning neural networks in combination with natural language processing to analyze text data from clinical discharge summaries. We provide a detailed analysis of patient phenotyping, i.e., the automatic prediction of ten patient disorders, by investigating the influence of network architectures, sample sizes and information content of tokens. Importantly, for patients suffering from Chronic Pain, the disorder that is the most difficult one to classify, we find the largest performance gain for a combined word- and sentence-level input convolutional neural network (ws-CNN). As a general result, we find that the combination of data quality and data quantity of the text data plays a crucial role in using more complex network architectures that improve significantly beyond a word-level input CNN model. From our investigations of learning curves and token selection mechanisms, we conclude that for such a transition one requires larger sample sizes because the amount of information per sample is quite small and only carried by few tokens and token categories. Interestingly, we found that the token frequency in the eHRs follows a Zipf law and we utilized this behavior to investigate the information content of tokens by defining a token selection mechanism. The latter also addresses issues of explainable AI.
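The Zipf-law observation can be made concrete with a short sketch. A Zipf law means that token frequency falls off roughly as a power of frequency rank, so log(frequency) plotted against log(rank) is close to a straight line. The code below estimates that slope by least squares and applies a simple rank-window filter. The function names and the rank-window rule are assumptions made for illustration and are not the paper's token selection mechanism.

```python
from collections import Counter
import math


def rank_frequency(tokens):
    """Return token counts sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def zipf_exponent(frequencies):
    """Negated least-squares slope of log(frequency) against log(rank).

    Needs at least two ranks; a value near 1 is the classic Zipf case.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


def select_tokens(tokens, min_rank, max_rank):
    """Hypothetical filter: keep tokens whose frequency rank is in [min_rank, max_rank]."""
    ranked = [tok for tok, _ in Counter(tokens).most_common()]
    keep = set(ranked[min_rank - 1:max_rank])
    return [tok for tok in tokens if tok in keep]


# Toy note (assumed): whitespace tokenization of a made-up sentence.
note = "pain in the back and pain in the neck and the arm".split()
print(zipf_exponent(rank_frequency(note)))
print(select_tokens(note, 2, 3))
```

A rank window is only one possible choice. It is used here because it shows how a rank-frequency curve can drive the decision about which tokens to keep.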
Affiliation(s)
- Zhen Yang
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Matthias Dehmer
- Steyr School of Management, University of Applied Sciences Upper Austria, 4400, Steyr Campus, Austria
- College of Artificial Intelligence, Nankai University, Tianjin, 300350, China
- Department of Biomedical Computer Science and Mechatronics, UMIT-The Health and Life Science University, 6060, Hall in Tyrol, Austria
- Olli Yli-Harja
- Computational Systems Biology Lab, Tampere University, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland
- Institute for Systems Biology, Seattle, WA, 98109, USA
- Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland.
- Institute of Biosciences and Medical Technology, Tampere University, Tampere, Korkeakoulunkatu 10, 33720, Tampere, Finland.
11
Emmert-Streib F, Dehmer M, Yli-Harja O. Ensuring Quality Standards and Reproducible Research for Data Analysis Services in Oncology: A Cooperative Service Model. Front Cell Dev Biol 2020; 7:349. [PMID: 31921859 PMCID: PMC6929679 DOI: 10.3389/fcell.2019.00349] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/21/2019] [Accepted: 12/04/2019] [Indexed: 11/13/2022] Open
Abstract
Modern molecular high-throughput devices, e.g., next-generation sequencing, have transformed medical research. Resulting data sets are usually high-dimensional on a genomic scale, providing multi-factorial information from intertwined molecular and cellular activities of genes and their products. This genomics revolution established precision medicine, offering breathtaking opportunities for patient diagnosis and treatment. However, due to the speed of these developments, the quality standards of the involved data analyses are lagging behind, as exemplified by the infamous Duke Saga. In this paper, we argue in favor of a two-stage cooperative service model that couples data generation and data analysis in the most beneficial way from the perspective of a patient to ensure data analysis quality standards including reproducible research.
Affiliation(s)
- Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
- Institute of Biosciences and Medical Technology, Tampere, Finland
- Matthias Dehmer
- Steyr School of Management, University of Applied Sciences Upper Austria, Steyr, Austria
- Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria
- College of Artificial Intelligence, Nankai University, Tianjin, China
- Olli Yli-Harja
- Institute of Biosciences and Medical Technology, Tampere, Finland
- Institute for Systems Biology, Seattle, WA, United States
12
Ahmed Z, Mohamed K, Zeeshan S, Dong X. Artificial intelligence with multi-functional machine learning platform development for better healthcare and precision medicine. Database (Oxford) 2020; 2020:baaa010. [PMID: 32185396 PMCID: PMC7078068 DOI: 10.1093/database/baaa010] [Citation(s) in RCA: 167] [Impact Index Per Article: 41.8] [Received: 11/02/2019] [Revised: 01/05/2020] [Accepted: 01/21/2020] [Indexed: 02/06/2023]
Abstract
Precision medicine is one of the recent and powerful developments in medical care, which has the potential to improve the traditional symptom-driven practice of medicine, allowing earlier interventions using advanced diagnostics and tailoring better and more economical personalized treatments. Identifying the best pathway to personalized and population medicine involves the ability to analyze comprehensive patient information together with broader aspects to monitor and distinguish between sick and relatively healthy people, which will lead to a better understanding of biological indicators that can signal shifts in health. While the complexities of disease at the individual level have made it difficult to utilize healthcare information in clinical decision-making, some of the existing constraints have been greatly reduced by technological advancements. To implement effective precision medicine with enhanced ability to positively impact patient outcomes and provide real-time decision support, it is important to harness the power of electronic health records by integrating disparate data sources and discovering patient-specific patterns of disease progression. Useful analytic tools, technologies, databases, and approaches are required to augment networking and interoperability of clinical, laboratory and public health systems, as well as to address, with an effective balance, ethical and social issues related to the privacy and protection of healthcare data. Developing multifunctional machine learning platforms for clinical data extraction, aggregation, management and analysis can support clinicians by efficiently stratifying subjects to understand specific scenarios and optimize decision-making. Implementation of artificial intelligence in healthcare is a compelling vision that has the potential to lead to significant improvements toward the goal of providing real-time, better personalized and population medicine at lower costs.
In this study, we focused on analyzing and discussing various published artificial intelligence and machine learning solutions, approaches and perspectives, aiming to advance academic solutions in paving the way for a new data-centric era of discovery in healthcare.
Affiliation(s)
- Zeeshan Ahmed
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, 112 Paterson Street, New Brunswick, NJ, USA
- Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson Street, New Brunswick, NJ, USA
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, 263 Farmington Ave., Farmington, CT, USA
- Institute for Systems Genomics, University of Connecticut, 67 North Eagleville Road, Storrs, CT, USA
- Khalid Mohamed
- Department of Genetics and Genome Sciences, School of Medicine, University of Connecticut Health Center, 263 Farmington Ave., Farmington, CT, USA
- Saman Zeeshan
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, USA
- XinQi Dong
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, 112 Paterson Street, New Brunswick, NJ, USA
- Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson Street, New Brunswick, NJ, USA
13
Emmert-Streib F, Yli-Harja O, Dehmer M. Utilizing Social Media Data for Psychoanalysis to Study Human Personality. Front Psychol 2019; 10:2596. [PMID: 31803123 PMCID: PMC6873989 DOI: 10.3389/fpsyg.2019.02596] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/21/2019] [Accepted: 10/31/2019] [Indexed: 11/13/2022] Open
Abstract
Social media data, for instance from Twitter or Facebook, provide a new type of data that consists of a mixture of text, image and video information. From a scientific point of view, the capabilities of this type of data from such microblogs are not well explored, and to date it is largely unknown what principal knowledge can be extracted from it. In this paper, we present a discussion of the capabilities of data from microblogs for performing a psychoanalysis. This could allow an analysis of the human personality of individual users. Such prospects raise serious concerns regarding the privacy of users of social media platforms.
Affiliation(s)
- Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
- Faculty of Medicine and Health Technology, Institute of Biosciences and Medical Technology, Tampere University, Tampere, Finland
- Olli Yli-Harja
- Faculty of Medicine and Health Technology, Institute of Biosciences and Medical Technology, Tampere University, Tampere, Finland
- Matthias Dehmer
- Faculty for Management, Institute for Intelligent Production, University of Applied Sciences Upper Austria, Steyr, Austria
- Department of Mechatronics and Biomedical Computer Science, University for Health Sciences, Medical Informatics and Technology (UMIT), Hall in Tirol, Austria
- College of Artificial Intelligence, Nankai University, Nankai, China
14
Understanding Statistical Hypothesis Testing: The Logic of Statistical Inference. Machine Learning and Knowledge Extraction 2019. [DOI: 10.3390/make1030054] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 11/17/2022]
Abstract
Statistical hypothesis testing is among the most misunderstood quantitative analysis methods in data science. Despite its seeming simplicity, it has complex interdependencies between its procedural components. In this paper, we discuss the underlying logic behind statistical hypothesis testing, the formal meaning of its components, and their connections. Our presentation is applicable to all statistical hypothesis tests as a generic backbone and, hence, useful across all application domains in data science and artificial intelligence.
15
Smolander J, Dehmer M, Emmert-Streib F. Comparing deep belief networks with support vector machines for classifying gene expression data from complex disorders. FEBS Open Bio 2019; 9:1232-1248. [PMID: 31074948 PMCID: PMC6609581 DOI: 10.1002/2211-5463.12652] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/12/2019] [Revised: 04/25/2019] [Accepted: 05/08/2019] [Indexed: 12/24/2022] Open
Abstract
Genomics data provide great opportunities for translational research and clinical practice, for example, for predicting disease stages. However, the classification of such data is a challenging task due to their high dimensionality, noise, and heterogeneity. In recent years, deep learning classifiers have generated much interest, but due to their complexity, little is known so far about the utility of this method for genomics. In this paper, we address this problem by studying a computational diagnostics task: the classification of breast cancer and inflammatory bowel disease patients based on high-dimensional gene expression data. We provide a comprehensive analysis of the classification performance of deep belief networks (DBNs) as a function of their multiple model parameters and in comparison with support vector machines (SVMs). Furthermore, we investigate combined classifiers that integrate DBNs with SVMs. Such a classifier uses a DBN as a representation learner whose output forms the input for an SVM. Overall, our results provide guidelines for the complex usage of DBNs for classifying gene expression data from complex diseases.
Affiliation(s)
- Johannes Smolander
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Finland
- Turku Centre for Biotechnology, University of Turku, Finland
- Matthias Dehmer
- Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Steyr, Austria
- Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria
- College of Computer and Control Engineering, Nankai University, Tianjin, China
- Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Finland
- Institute of Biosciences and Medical Technology, Tampere, Finland
16
Azam MF, Musa A, Dehmer M, Yli-Harja OP, Emmert-Streib F. Global Genetics Research in Prostate Cancer: A Text Mining and Computational Network Theory Approach. Front Genet 2019; 10:70. [PMID: 30838019 PMCID: PMC6383410 DOI: 10.3389/fgene.2019.00070] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/21/2018] [Accepted: 01/28/2019] [Indexed: 11/13/2022] Open
Abstract
Prostate cancer is the most common cancer type in men in Finland and the second most common worldwide. In this paper, we analyze almost 150,000 published papers about prostate cancer, authored by tens of thousands of scientists worldwide, with an integrated text mining and computational network theory approach. We demonstrate how to integrate text mining with network analysis to investigate the research contributions of countries and the collaborations within and between countries. Furthermore, we study the time evolution of individually and collectively studied genes. Finally, we investigate a collaboration network of Finland and compare the genes studied there with globally studied genes in prostate cancer genetics. Overall, our results provide a global overview of prostate cancer research in genetics. In addition, we present a specific discussion for Finland. Our results shed light on trends within the last 30 years and are useful for translational researchers across the full range from genetics to public health management and health policy.
Affiliation(s)
- Md Facihul Azam
- Predictive Society and Data Analysis Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
- Institute of Biosciences and Medical Technology, Tampere, Finland
- Aliyu Musa
- Predictive Society and Data Analysis Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
- Institute of Biosciences and Medical Technology, Tampere, Finland
- Matthias Dehmer
- Faculty for Management, Institute for Intelligent Production, University of Applied Sciences Upper Austria, Steyr, Austria
- Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria
- College of Computer and Control Engineering, Nankai University, Tianjin, China
- Olli P Yli-Harja
- Institute of Biosciences and Medical Technology, Tampere, Finland
- Computational Systems Biology, Faculty of Biomedical Engineering, Tampere University, Tampere, Finland
- Institute for Systems Biology, Seattle, WA, United States
- Frank Emmert-Streib
- Predictive Society and Data Analysis Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland
- Institute of Biosciences and Medical Technology, Tampere, Finland
17
Rivas AL, Hoogesteijn AL, Antoniades A, Tomazou M, Buranda T, Perkins DJ, Fair JM, Durvasula R, Fasina FO, Tegos GP, van Regenmortel MHV. Assessing the Dynamics and Complexity of Disease Pathogenicity Using 4-Dimensional Immunological Data. Front Immunol 2019; 10:1258. [PMID: 31249569 PMCID: PMC6582751 DOI: 10.3389/fimmu.2019.01258] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/19/2019] [Accepted: 05/17/2019] [Indexed: 02/05/2023] Open
Abstract
Investigating disease pathogenesis and personalized prognostics are major biomedical needs. Because patients sharing the same diagnosis can experience different outcomes, such as survival or death, physicians need new personalized tools, including those that rapidly differentiate several inflammatory phases. To address these topics, a pattern recognition-based method (PRM) that follows an inverse problem approach was designed to assess, in <10 min, eight concepts: synergy, pleiotropy, complexity, dynamics, ambiguity, circularity, personalized outcomes, and explanatory prognostics (pathogenesis). By creating thousands of secondary combinations derived from blood leukocyte data, the PRM measures synergic, pleiotropic, complex, and dynamic data interactions, which provide personalized prognostics while some undesirable features (such as false results and the ambiguity associated with data circularity) are prevented. Here, this method is compared to Principal Component Analysis (PCA) and evaluated with data collected from hantavirus-infected humans and birds that appeared to be healthy. When human data were examined, the PRM predicted 96.9% of all surviving patients while PCA did not distinguish outcomes. Demonstrating applications in personalized prognosis, eight PRM data structures sufficed to identify all but one of the survivors. Dynamic data patterns also distinguished survivors from non-survivors, as well as one subset of non-survivors, which exhibited chronic inflammation. When the PRM explored avian data, it differentiated immune profiles consistent with no, early, or late inflammation. Yet, PCA did not recognize patterns in avian data. Findings support the notion that immune responses, while variable, are rather deterministic: a low number of complex and dynamic data combinations may be enough to rapidly unmask conditions that are neither directly observable nor reliably forecasted.
Affiliation(s)
- Ariel L. Rivas
- School of Medicine, Center for Global Health-Division of Infectious Diseases, University of New Mexico, Albuquerque, NM, United States
- *Correspondence: Ariel L. Rivas
- Almira L. Hoogesteijn
- Human Ecology, Centro de Investigación y de Estudios Avanzados (CINVESTAV), Mérida, Mexico
- Tione Buranda
- Department of Pathology, School of Medicine, University of New Mexico, Albuquerque, NM, United States
- Douglas J. Perkins
- School of Medicine, Center for Global Health-Division of Infectious Diseases, University of New Mexico, Albuquerque, NM, United States
- Jeanne M. Fair
- Biosecurity and Public Health, Los Alamos National Laboratory, Los Alamos, NM, United States
- Ravi Durvasula
- Loyola University Medical Center, Chicago, IL, United States
- Folorunso O. Fasina
- Department of Veterinary Tropical Diseases, University of Pretoria, Pretoria, South Africa
- Food and Agriculture Organization of the United Nations, Dar es Salaam, Tanzania
- Marc H. V. van Regenmortel
- Centre National de la Recherche Scientifique (CNRS), School of Biotechnology, University of Strasbourg, Strasbourg, France