1
Mughaz D, HaCohen-Kerner Y, Gabbay D. Extraction of time-related expressions using text mining with application to Hebrew. PLoS One 2024; 19:e0293196. [PMID: 38394097 PMCID: PMC10889890 DOI: 10.1371/journal.pone.0293196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Accepted: 10/08/2023] [Indexed: 02/25/2024] Open
Abstract
In this research, we extract time-related expressions from a rabbinic text in a semi-automatic manner. These expressions usually appear next to rabbinic references (name / nickname / acronym / book-name). The first step toward our goal is to find all the expressions near references in the corpus. However, not all of the phrases around the references are time-related expressions. Therefore, these phrases are initially considered to be potential time-related expressions. To extract the time-related expressions, we formulate two new statistical functions, and we use screening and heuristic methods. We tested these statistical functions, grammatical screenings, and heuristic methods on a corpus containing responsa documents. In this corpus, many rabbinic citations are known and marked. The statistical functions and the screening methods filtered the potential time-related expressions and reduced their number by 99.88% (from 484,681 to 575).
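The reported reduction can be checked directly from the two counts given in the abstract. The sketch below is only arithmetic on those counts; the variable names are illustrative and are not taken from the paper.

```python
# Counts reported in the abstract above.
initial_expressions = 484_681
remaining_expressions = 575

# Fraction of candidate expressions that were filtered out.
reduction = (initial_expressions - remaining_expressions) / initial_expressions
reduction_pct = round(reduction * 100, 2)

print(f"Filtered out {reduction_pct}% of the candidate expressions")
```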
Affiliation(s)
- Dror Mughaz
  - Dept. of Computer Science, Jerusalem College of Technology–Lev Academic Center, Jerusalem, Israel
  - Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
- Yaakov HaCohen-Kerner
  - Dept. of Computer Science, Jerusalem College of Technology–Lev Academic Center, Jerusalem, Israel
- Dov Gabbay
  - Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
  - Dept. of Informatics, King's College London, Strand, London, United Kingdom
2
Feng Z, Shen Z, Li H, Li S. e-TSN: an interactive visual exploration platform for target-disease knowledge mapping from literature. Brief Bioinform 2022; 23:bbac465. [PMID: 36347537 PMCID: PMC9677481 DOI: 10.1093/bib/bbac465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 09/20/2022] [Accepted: 09/27/2022] [Indexed: 11/10/2022] Open
Abstract
Target discovery and identification processes are driven by the increasing amount of biomedical data. The vast numbers of unstructured texts of biomedical publications provide a rich source of knowledge for drug target discovery research and demand the development of specific algorithms or tools to facilitate finding disease genes and proteins. Text mining is a method that can automatically mine helpful information related to drug target discovery from massive biomedical literature. However, there is a substantial lag between biomedical publications and the subsequent abstraction of information extracted by text mining to databases. The knowledge graph is introduced to integrate heterogeneous biomedical data. Here, we describe e-TSN (Target significance and novelty explorer, http://www.lilab-ecust.cn/etsn/), a knowledge visualization web server integrating the largest database of associations between targets and diseases from the full scientific literature by constructing significance and novelty scoring methods based on bibliometric statistics. The platform aims to visualize target-disease knowledge graphs to assist in prioritizing candidate disease-related proteins. Approved drugs and associated bioactivities for each target of interest are also provided to facilitate the visualization of drug-target relationships. In summary, e-TSN is a fast and customizable visualization resource for investigating and analyzing the intricate target-disease networks, which could help researchers understand the mechanisms underlying complex disease phenotypes and improve the efficiency of drug discovery and development, especially during unexpected outbreaks of infectious disease pandemics like COVID-19.
Affiliation(s)
- Ziyan Feng
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Zihao Shen
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Honglin Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
  - Innovation Center for AI and Drug Discovery, East China Normal University, Shanghai 200062, China
  - Lingang Laboratory, Shanghai 200031, China
- Shiliang Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
  - Innovation Center for AI and Drug Discovery, East China Normal University, Shanghai 200062, China
3
Lazarczyk M, Duda K, Mickael ME, AK O, Paszkiewicz J, Kowalczyk A, Horbańczuk JO, Sacharczuk M. Adera2.0: A Drug Repurposing Workflow for Neuroimmunological Investigations Using Neural Networks. Molecules (Basel, Switzerland) 2022; 27:molecules27196453. [PMID: 36234990 PMCID: PMC9571571 DOI: 10.3390/molecules27196453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 09/25/2022] [Accepted: 09/26/2022] [Indexed: 11/16/2022]
Abstract
Drug repurposing in the context of neuroimmunological (NI) investigations is still in its early stages. Drug repurposing is an important method that bypasses lengthy drug discovery procedures and focuses on discovering new uses for known medications. Neuroimmunological diseases, such as Alzheimer's, Parkinson's, multiple sclerosis, and depression, include various pathologies that result from the interaction between the central nervous system and the immune system. However, the repurposing of NI medications is hindered by the vast amount of information that needs mining. We previously presented Adera1.0, which was capable of text mining PubMed for answering query-based questions. However, Adera1.0 was not able to automatically identify chemical compounds within relevant sentences. To address the need for repurposing known medications for neuroimmunological diseases, we built a deep neural network named Adera2.0 to perform drug repurposing. The workflow uses three deep learning networks. The first network is an encoder, and its main task is to embed text into matrices. The second network uses a mean squared error (MSE) loss function to predict answers in the form of embedded matrices. The third network, which constitutes the main novelty in our updated workflow, also uses an MSE loss function. Its main use is to extract compound names from the relevant sentences produced by the previous network. To optimize the network function, we compared eight different designs. We found that a deep neural network consisting of a recurrent neural network (RNN) and a leaky ReLU could achieve a loss of 0.0001 and 67% sensitivity. Additionally, we validated Adera2.0's ability to predict NI drug usage against the DRUG Repurposing Hub database. These results establish the ability of Adera2.0 to repurpose drug candidates, which can shorten the drug development cycle. The workflow can be downloaded online.
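For readers unfamiliar with the terms used in this abstract, the following is a minimal sketch of a leaky ReLU activation, an MSE loss, and sensitivity (recall). It is not the Adera2.0 code: the function names and the negative slope of 0.01 are assumptions chosen for illustration, and the abstract does not state which slope was used.

```python
def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for x >= 0, a small linear slope otherwise.

    The default slope of 0.01 is an assumed, commonly used value.
    """
    return x if x >= 0 else negative_slope * x


def mse(predicted, target):
    """Mean squared error between two equal-length sequences of numbers."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall): the share of actual positives that were found."""
    return true_positives / (true_positives + false_negatives)
```

A loss near zero, such as the 0.0001 reported above, means the predicted embeddings are numerically close to the targets; sensitivity is a separate measure of how many true compound mentions were recovered.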
Affiliation(s)
- Marzena Lazarczyk
  - Department of Experimental Genomics, Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, ul. Postepu 36A, Jastrzebiec, 05-552 Magdalenka, Poland
- Kamila Duda
  - Centre for Preclinical Research and Technology, Department of Pharmacodynamics, Faculty of Pharmacy with the Laboratory Medicine Division, Medical University of Warsaw, Banacha 1B, 02-091 Warsaw, Poland
- Michel Edwar Mickael
  - Department of Experimental Genomics, Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, ul. Postepu 36A, Jastrzebiec, 05-552 Magdalenka, Poland
  - PM Research Center, Väpnaregatan 22, 58649 Linköping, Sweden
  - Correspondence: (M.E.M.); (M.S.)
- Onurhan AK
  - Department of Sociology, Queen’s University at Kingston, 99 University Ave, Kingston, ON K7L 3N6, Canada
- Justyna Paszkiewicz
  - Department of Health, John Paul II University of Applied Sciences in Biala Podlaska, Sidorska 95/97, 21-500 Biała Podlaska, Poland
- Agnieszka Kowalczyk
  - Centre for Preclinical Research and Technology, Department of Pharmacodynamics, Faculty of Pharmacy with the Laboratory Medicine Division, Medical University of Warsaw, Banacha 1B, 02-091 Warsaw, Poland
- Jarosław Olav Horbańczuk
  - Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, ul. Postepu 36A, Jastrzebiec, 05-552 Magdalenka, Poland
- Mariusz Sacharczuk
  - Department of Experimental Genomics, Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, ul. Postepu 36A, Jastrzebiec, 05-552 Magdalenka, Poland
  - Department of Pharmacodynamics, Faculty of Pharmacy with the Laboratory Medicine Division, Medical University of Warsaw, Banacha 1B, 02-091 Warsaw, Poland
  - Correspondence: (M.E.M.); (M.S.)
4
Discovering Thematically Coherent Biomedical Documents Using Contextualized Bidirectional Encoder Representations from Transformers-Based Clustering. International Journal of Environmental Research and Public Health 2022; 19:ijerph19105893. [PMID: 35627429 PMCID: PMC9141535 DOI: 10.3390/ijerph19105893] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/20/2022] [Revised: 04/26/2022] [Accepted: 05/10/2022] [Indexed: 02/01/2023]
Abstract
The rapid growth of biomedical documents has increased the number of natural language textual resources related to current applications. Meanwhile, there has been great interest over the last decade in extracting useful information from meaningful, coherent groupings of textual documents. However, it is challenging to discover informative representations and identify relevant articles in the rapidly growing biomedical literature due to the unsupervised nature of document clustering. Moreover, empirical investigations have demonstrated that traditional text clustering methods produce unsatisfactory results with non-contextualized vector space representations because they neglect the semantic relationships between biomedical texts. Recently, pre-trained language models have proved successful in a wide range of natural language processing applications. In this paper, we propose an efficient Gaussian Mixture Model-based clustering framework that incorporates pre-trained, domain-specific BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) language representations to enhance clustering accuracy. Our proposed framework consists of three main phases. First, classic text pre-processing techniques are applied to biomedical document data crawled from the PubMed repository. Second, representative vectors are extracted from a pre-trained BioBERT language model for biomedical text mining. Third, we employ the Gaussian Mixture Model as a clustering algorithm, which allows us to assign a label to each biomedical document. To demonstrate the efficiency of our proposed model, we conducted a comprehensive experimental analysis using several clustering algorithms combined with diverse embedding techniques.
The experimental results show that the proposed model outperforms the benchmark models, reaching a Fowlkes-Mallows score, silhouette coefficient, adjusted Rand index, and Davies-Bouldin score of 0.7817, 0.3765, 0.4478, and 1.6849, respectively. We expect the outcomes of this study will assist domain specialists in comprehending thematically cohesive documents in the healthcare field.
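Of the four metrics listed, the Fowlkes-Mallows score is perhaps the least familiar. A minimal pure-Python sketch of its pair-counting definition follows; the function name and the toy labels are illustrative assumptions and are not drawn from the paper's code or data. The score is the geometric mean of pairwise precision and recall and ranges from 0 to 1, with higher values indicating closer agreement with the reference labels (for the Davies-Bouldin score, by contrast, lower is better).

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_true, labels_pred):
    """Fowlkes-Mallows index computed by counting pairs of samples.

    A pair is a true positive when both labelings put the two samples
    in the same cluster, a false positive when only the predicted
    labeling does, and a false negative when only the reference does.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


# Toy example: cluster ids are arbitrary, so a relabeled but identical
# partition still scores 1.0.
perfect = fowlkes_mallows([0, 0, 1, 1], [1, 1, 0, 0])
partial = fowlkes_mallows([0, 0, 1, 1], [0, 0, 0, 1])
```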
5
Whole-Genome Sequencing of 100 Genomes Identifies a Distinctive Genetic Susceptibility Profile of Qatari Patients with Hypertension. J Pers Med 2022; 12:jpm12050722. [PMID: 35629146 PMCID: PMC9144388 DOI: 10.3390/jpm12050722] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/03/2022] [Revised: 04/11/2022] [Accepted: 04/26/2022] [Indexed: 02/05/2023] Open
Abstract
Essential hypertension (EH) is a leading risk condition for cardiovascular and renal complications. While multiple genes are associated with EH, little is known about its genetic etiology. Therefore, this study aimed to screen for variants that are associated with EH in 100 hypertensive/100 control patients comprising Qatari individuals using GWASs of whole-genome sequencing and compare these findings with genetic data obtained from more than 10,000 published peer-reviewed studies on EH. The GWAS analysis performed with 21,096 SNPs revealed 38 SNPs with a significant association with EH (log-p value ≥4). The two SNPs most strongly associated with EH (rs921932379 and rs113688672) had a log-p value ≥5. These SNPs are located within the intergenic regions of GMPS-SETP14 and ISCA1P6-AC012451.1, respectively. Text mining yielded 3748 genes and 3078 SNPs, where 51 genes and 24 SNPs were mentioned in more than 30 and 10 different articles, respectively. Comparing our GWAS results to previously published articles revealed 194 that are unique to our patient cohort; of these, 13 genes that have 26 SNPs are the most significant, with a log-p value ≥4. Of these genes, C2orf47-SPATS2L contains nine EH-associated SNPs. Most EH-associated genes are related to ion gate channel activity and cardiac conduction. The disease–gene analysis revealed that a large number of EH-associated genes are associated with a variety of cardiovascular disorders. The clustering analysis using EH-associated SNPs across different ethnic groups showed a high frequency of the minor allele in different ethnic groups, including Africans, East Asians, and South Asians. The combination of GWAS and text mining helped in identifying the unique genetic susceptibility profile of Qatari patients with EH. To our knowledge, this is the first small study that searched for genetic factors associated with EH in Qatari patients.
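The abstract does not define "log-p value". A common GWAS convention, assumed here, is the negative base-10 logarithm of the p-value, so that a threshold of 4 corresponds to p ≤ 10⁻⁴ and a threshold of 5 to p ≤ 10⁻⁵. A short sketch of that conversion, with hypothetical function names:

```python
from math import log10


def neg_log10_p(p_value):
    """Convert a p-value to -log10(p); larger means stronger evidence."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in the interval (0, 1]")
    return -log10(p_value)


def passes_threshold(p_value, threshold=4.0):
    """True when -log10(p) meets the threshold (assumed convention)."""
    return neg_log10_p(p_value) >= threshold
```

Under this assumption, a SNP with p = 1e-5 passes the threshold of 4, while one with p = 1e-3 does not.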
6
Large-Scale Functional Genomics Screen to Identify Modulators of Human β-Cell Insulin Secretion. Biomedicines 2022; 10:biomedicines10010103. [PMID: 35052782 PMCID: PMC8773179 DOI: 10.3390/biomedicines10010103] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/29/2021] [Revised: 12/24/2021] [Accepted: 12/27/2021] [Indexed: 12/27/2022] Open
Abstract
Type 2 diabetes (T2D) is a chronic metabolic disorder affecting almost half a billion people worldwide. Impaired function of pancreatic β-cells is both a hallmark of T2D and an underlying factor in the pathophysiology of the disease. Understanding the cellular mechanisms regulating appropriate insulin secretion has been of long-standing interest in the scientific and clinical communities. To identify novel genes regulating insulin secretion we developed a robust arrayed siRNA screen measuring basal, glucose-stimulated, and augmented insulin secretion by EndoC-βH1 cells, a human β-cell line, in a 384-well plate format. We screened 521 candidate genes selected by text mining for relevance to T2D biology and identified 23 positive and 68 negative regulators of insulin secretion. Among these, we validated ghrelin receptor (GHSR), and two genes implicated in endoplasmic reticulum stress, ATF4 and HSPA5. Thus, we have demonstrated the feasibility of using EndoC-βH1 cells for large-scale siRNA screening to identify candidate genes regulating β-cell insulin secretion as potential novel drug targets. Furthermore, this screening format can be adapted to other disease-relevant functional endpoints to enable large-scale screening for targets regulating cellular mechanisms contributing to the progressive loss of functional β-cell mass occurring in T2D.
7
Mak KK, Balijepalli MK, Pichika MR. Success stories of AI in drug discovery - where do things stand? Expert Opin Drug Discov 2021; 17:79-92. [PMID: 34553659 DOI: 10.1080/17460441.2022.1985108] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 01/20/2023]
Abstract
INTRODUCTION Artificial intelligence (AI) in drug discovery and development (DDD) has gained more traction in the past few years. Many scientific reviews have already been made available in this area. Thus, in this review, the authors have focused on the success stories of AI-driven drug candidates and the scientometric analysis of the literature in this field. AREA COVERED The authors explore the literature to compile the success stories of AI-driven drug candidates that are currently being assessed in clinical trials or have investigational new drug (IND) status. The authors also provide the reader with their expert perspectives for future developments and their opinions on the field. EXPERT OPINION Partnerships between AI companies and the pharma industry are booming. The early signs of the impact of AI on DDD are encouraging, and the pharma industry is hoping for breakthroughs. AI can be a promising technology to unveil the greatest successes, but it has yet to be proven as AI is still at the embryonic stage.
Affiliation(s)
- Kit-Kay Mak
  - School of Postgraduate Studies and Research, International Medical University, Bukit Jalil, Malaysia
  - Department of Pharmaceutical Chemistry, School of Pharmacy, International Medical University, Bukit Jalil, Malaysia
  - Centre for Bioactive Molecules and Drug Delivery, Institute for Research, Development, and Innovation (Irdi), International Medical University, Bukit Jalil, Malaysia
- Mallikarjuna Rao Pichika
  - Department of Pharmaceutical Chemistry, School of Pharmacy, International Medical University, Bukit Jalil, Malaysia
  - Centre for Bioactive Molecules and Drug Delivery, Institute for Research, Development, and Innovation (Irdi), International Medical University, Bukit Jalil, Malaysia
8
Parolo S, Tomasoni D, Bora P, Ramponi A, Kaddi C, Azer K, Domenici E, Neves-Zaph S, Lombardo R. Reconstruction of the Cytokine Signaling in Lysosomal Storage Diseases by Literature Mining and Network Analysis. Front Cell Dev Biol 2021; 9:703489. [PMID: 34490253 PMCID: PMC8417786 DOI: 10.3389/fcell.2021.703489] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/30/2021] [Accepted: 07/30/2021] [Indexed: 11/13/2022] Open
Abstract
Lysosomal storage diseases (LSDs) are characterized by the abnormal accumulation of substrates in tissues due to the deficiency of lysosomal proteins. Among the numerous clinical manifestations, chronic inflammation has been consistently reported for several LSDs. However, the molecular mechanisms involved in the inflammatory response are still not completely understood. In this study, we performed text-mining and systems biology analyses to investigate the inflammatory signals in three LSDs characterized by sphingolipid accumulation: Gaucher disease, Acid Sphingomyelinase Deficiency (ASMD), and Fabry disease. We first identified the cytokines linked to the LSDs, and then built on the extracted knowledge to investigate the inflammatory signals. We found numerous transcription factors that are putative regulators of cytokine expression in a cell-specific context, such as the signaling axes controlled by STAT2, JUN, and NR4A2 as candidate regulators of the monocyte Gaucher disease cytokine network. Overall, our results suggest the presence of complex inflammatory signaling in LSDs involving many cellular and molecular players that could be further investigated as putative targets of anti-inflammatory therapies.
Affiliation(s)
- Silvia Parolo
  - Fondazione the Microsoft Research-University of Trento Centre for Computational and Systems Biology, Rovereto, Italy
- Danilo Tomasoni
  - Fondazione the Microsoft Research-University of Trento Centre for Computational and Systems Biology, Rovereto, Italy
- Pranami Bora
  - Fondazione the Microsoft Research-University of Trento Centre for Computational and Systems Biology, Rovereto, Italy
- Alan Ramponi
  - Fondazione the Microsoft Research-University of Trento Centre for Computational and Systems Biology, Rovereto, Italy
- Chanchala Kaddi
  - Data and Data Science - Translational Disease Modeling, Sanofi, Bridgewater, NJ, United States
- Karim Azer
  - Data and Data Science - Translational Disease Modeling, Sanofi, Bridgewater, NJ, United States
- Enrico Domenici
  - Fondazione the Microsoft Research-University of Trento Centre for Computational and Systems Biology, Rovereto, Italy
  - Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
- Susana Neves-Zaph
  - Data and Data Science - Translational Disease Modeling, Sanofi, Bridgewater, NJ, United States
- Rosario Lombardo
  - Fondazione the Microsoft Research-University of Trento Centre for Computational and Systems Biology, Rovereto, Italy
9
Henry S, Wijesinghe DS, Myers A, McInnes BT. Using Literature Based Discovery to Gain Insights Into the Metabolomic Processes of Cardiac Arrest. Front Res Metr Anal 2021; 6:644728. [PMID: 34250435 PMCID: PMC8267364 DOI: 10.3389/frma.2021.644728] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/21/2020] [Accepted: 05/07/2021] [Indexed: 12/19/2022] Open
Abstract
In this paper, we describe how we applied literature-based discovery (LBD) techniques to discover lecithin cholesterol acyltransferase (LCAT) as a druggable target for cardiac arrest. We fully describe our process, which includes the use of high-throughput metabolomic analysis to identify metabolites significantly related to cardiac arrest, and how we used LBD to gain insights into how these metabolites relate to cardiac arrest. These insights led to our proposal (for the first time) of LCAT as a druggable target, the effects of which are supported by in vivo studies that were prompted by this work. Metabolites are the end products of many biochemical pathways within the human body. Observed changes in metabolite levels are indicative of changes in these pathways and provide valuable insights into the cause, progression, and treatment of diseases. Following cardiac arrest, we observed changes in metabolite levels pre- and post-resuscitation. We used LBD to help discover diseases implicitly linked via these metabolites of interest. Results of LBD indicated a strong link between fish eye disease and cardiac arrest. Since fish eye disease is characterized by an LCAT deficiency, we began an investigation into the effects of LCAT on cardiac arrest survival. In the investigation, we found that decreased LCAT activity may increase cardiac arrest survival rates by increasing ω-3 polyunsaturated fatty acid availability in circulation. We verified the effects of ω-3 polyunsaturated fatty acids on increasing the survival rate following cardiac arrest in vivo using rat models.
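The abstract does not say which LBD formulation was used. One widely known formulation is Swanson's ABC model, in which a start term A is linked to a target term C through intermediate terms B that co-occur with both, even though A and C never co-occur directly. The sketch below illustrates that general idea only; the function name, the scoring by count of shared intermediates, and the placeholder terms are assumptions and do not reproduce the authors' method or data.

```python
from collections import defaultdict


def open_discovery(start_term, cooccurrences):
    """Rank terms implicitly linked to start_term via shared neighbours.

    cooccurrences is an iterable of (term, term) pairs, for example
    pairs of concepts mentioned together in the same abstract.
    """
    neighbours = defaultdict(set)
    for a, b in cooccurrences:
        neighbours[a].add(b)
        neighbours[b].add(a)

    direct = neighbours[start_term]
    scores = defaultdict(int)
    for b in direct:
        for c in neighbours[b]:
            # Keep only terms with no direct link to the start term.
            if c != start_term and c not in direct:
                scores[c] += 1

    # Highest number of linking terms first, then alphabetical order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Placeholder terms: A is the start concept, B1/B2 are intermediates
# (metabolites, in the abstract's setting), C1/C2 are candidate targets.
pairs = [("A", "B1"), ("A", "B2"), ("B1", "C1"), ("B2", "C1"), ("B2", "C2")]
ranked = open_discovery("A", pairs)
```

Here `C1` ranks above `C2` because two intermediates connect it to `A` rather than one.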
Affiliation(s)
- Sam Henry
  - Department of Physics, Computer Science and Engineering, Christopher Newport University, Newport News, VA, United States
- D. Shanaka Wijesinghe
  - Department of Pharmacotherapy and Outcomes Science, Virginia Commonwealth University, Richmond, VA, United States
- Aidan Myers
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Bridget T. McInnes
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
10
De Silva K, Mathews N, Teede H, Forbes A, Jönsson D, Demmer RT, Enticott J. Clinical notes as prognostic markers of mortality associated with diabetes mellitus following critical care: A retrospective cohort analysis using machine learning and unstructured big data. Comput Biol Med 2021; 132:104305. [PMID: 33705995 DOI: 10.1016/j.compbiomed.2021.104305] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/24/2020] [Revised: 02/23/2021] [Accepted: 02/27/2021] [Indexed: 12/14/2022]
Abstract
BACKGROUND Clinical notes are ubiquitous resources offering potential value in optimizing critical care via data mining technologies. OBJECTIVE To determine the predictive value of clinical notes as prognostic markers of 1-year all-cause mortality among people with diabetes following critical care. MATERIALS AND METHODS Mortality of diabetes patients was predicted using three cohorts of clinical text in a critical care database, written by physicians (n = 45253), nurses (n = 159027), and both (n = 204280). Natural language processing was used to pre-process text documents, and LASSO-regularized logistic regression models were trained and tested. Confusion matrix metrics of each model were calculated and AUROC estimates between models were compared. All predictive words and corresponding coefficients were extracted. The outcome probability associated with each text document was estimated. RESULTS Models built on clinical text of physicians, nurses, and the combined cohort predicted mortality with AUROC of 0.996, 0.893, and 0.922, respectively. Predictive performance of the models significantly differed from one another, whereas inter-rater reliability ranged from substantial to almost perfect across them. The number of predictive words with non-zero coefficients was 3994, 8159, and 10579 in the models of physicians, nurses, and the combined cohort, respectively. Physicians' and nursing notes, both individually and when combined, strongly predicted 1-year all-cause mortality among people with diabetes following critical care. CONCLUSION Clinical notes of physicians and nurses are strong and novel prognostic markers of diabetes-associated mortality in critical care, offering potentially generalizable and scalable applications. Clinical text-derived personalized risk estimates of prognostic outcomes such as mortality could be used to optimize patient care.
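The AUROC values reported above can be read as the probability that a randomly chosen positive case (here, a patient who died) receives a higher predicted score than a randomly chosen negative case. A minimal pure-Python sketch of that pairwise definition follows; the function name and the toy labels and scores are illustrative assumptions, not data from the study, and the quadratic loop is meant for small examples only.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes; scores: predicted probabilities.
    Tied scores between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: three of the four positive/negative pairs are ordered correctly.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```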
Affiliation(s)
- Kushan De Silva
  - Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
- Noel Mathews
  - Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
- Helena Teede
  - Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia
- Andrew Forbes
  - Biostatistics Unit, Division of Research Methodology, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Melbourne, 3004, Australia
- Daniel Jönsson
  - Department of Periodontology, Faculty of Odontology, Malmö University, Malmö, 21119, Sweden
  - Swedish Dental Service of Skane, Lund, 22647, Sweden
- Ryan T Demmer
  - Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
  - Mailman School of Public Health, Columbia University, New York, USA
- Joanne Enticott
  - Monash Centre for Health Research and Implementation, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing, and Health Sciences, Monash University, Clayton, 3168, Australia