1
Le NQK. Leveraging transformers-based language models in proteome bioinformatics. Proteomics 2023; 23:e2300011. [PMID: 37381841 DOI: 10.1002/pmic.202300011] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 05/14/2023] [Revised: 06/13/2023] [Accepted: 06/13/2023] [Indexed: 06/30/2023]
Abstract
In recent years, the rapid growth of biological data has increased interest in using bioinformatics to analyze and interpret this data. Proteomics, which studies the structure, function, and interactions of proteins, is a crucial area of bioinformatics. Using natural language processing (NLP) techniques in proteomics is an emerging field that combines machine learning and text mining to analyze biological data. Recently, transformer-based NLP models have gained significant attention for their ability to process variable-length input sequences in parallel, using self-attention mechanisms to capture long-range dependencies. In this review paper, we discuss the recent advancements in transformer-based NLP models in proteome bioinformatics and examine their advantages, limitations, and potential applications to improve the accuracy and efficiency of various tasks. Additionally, we highlight the challenges and future directions of using these models in proteome bioinformatics research. Overall, this review provides valuable insights into the potential of transformer-based NLP models to revolutionize proteome bioinformatics.
Affiliation(s)
- Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- AIBioMed Research Group, Taipei Medical University, Taipei, Taiwan
- Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan
- Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan
2
Goldrick S, Alosert H, Lovelady C, Bond NJ, Senussi T, Hatton D, Klein J, Cheeks M, Turner R, Savery J, Farid SS. Next-generation cell line selection methodology leveraging data lakes, natural language generation and advanced data analytics. Front Bioeng Biotechnol 2023; 11:1160223. [PMID: 37342509 PMCID: PMC10277482 DOI: 10.3389/fbioe.2023.1160223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 05/22/2023] [Indexed: 06/23/2023] Open
Abstract
Cell line development is an essential stage in biopharmaceutical development that often lies on the critical path. Failure to fully characterise the lead clone during initial screening can lead to lengthy project delays during scale-up, which can potentially compromise commercial manufacturing success. In this study, we propose a novel cell line development methodology, referenced as CLD 4, which involves four steps enabling autonomous data-driven selection of the lead clone. The first step involves the digitalisation of the process and storage of all available information within a structured data lake. The second step calculates a new metric referenced as the cell line manufacturability index (MI CL) quantifying the performance of each clone by considering the selection criteria relevant to productivity, growth and product quality. The third step implements machine learning (ML) to identify any potential risks associated with process operation and relevant critical quality attributes (CQAs). The final step of CLD 4 takes into account the available metadata and summarises all relevant statistics generated in steps 1-3 in an automated report utilising a natural language generation (NLG) algorithm. The CLD 4 methodology was implemented to select the lead clone of a recombinant Chinese hamster ovary (CHO) cell line producing high levels of an antibody-peptide fusion with a known product quality issue related to end-point trisulfide bond (TSB) concentration. CLD 4 identified sub-optimal process conditions leading to increased levels of trisulfide bond that would not be identified through conventional cell line development methodologies. CLD 4 embodies the core principles of Industry 4.0 and demonstrates the benefits of increased digitalisation, data lake integration, predictive analytics and autonomous report generation to enable more informed decision making.
Affiliation(s)
- Stephen Goldrick
- Department of Biochemical Engineering, University College London, London, United Kingdom
- Haneen Alosert
- Department of Biochemical Engineering, University College London, London, United Kingdom
- Clare Lovelady
- Cell Culture and Fermentation Science, Biopharmaceuticals Development, R&D, AstraZeneca, Cambridge, United Kingdom
- Nicholas J. Bond
- Analytical Sciences, Biopharmaceuticals Development, R&D, AstraZeneca, Cambridge, United Kingdom
- Tarik Senussi
- Cell Culture and Fermentation Science, Biopharmaceuticals Development, R&D, AstraZeneca, Cambridge, United Kingdom
- Diane Hatton
- Cell Culture and Fermentation Science, Biopharmaceuticals Development, R&D, AstraZeneca, Cambridge, United Kingdom
- John Klein
- Data Science and Modelling, Biopharmaceuticals Development, R&D, AstraZeneca, Cambridge, United Kingdom
- Matthew Cheeks
- Cell Culture and Fermentation Science, Biopharmaceuticals Development, R&D, AstraZeneca, Cambridge, United Kingdom
- Richard Turner
- Purification Process Sciences, Biopharmaceuticals Development, R&D, AstraZeneca, Cambridge, United Kingdom
- James Savery
- Data Science and Modelling, Biopharmaceuticals Development, R&D, AstraZeneca, Cambridge, United Kingdom
- Suzanne S. Farid
- Department of Biochemical Engineering, University College London, London, United Kingdom
3
Torrado JC, Husebo BS, Allore HG, Erdal A, Fæø SE, Reithe H, Førsund E, Tzoulis C, Patrascu M. Digital phenotyping by wearable-driven artificial intelligence in older adults and people with Parkinson's disease: Protocol of the mixed method, cyclic ActiveAgeing study. PLoS One 2022; 17:e0275747. [PMID: 36240173 PMCID: PMC9565381 DOI: 10.1371/journal.pone.0275747] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/10/2022] [Accepted: 09/22/2022] [Indexed: 11/28/2022] Open
Abstract
BACKGROUND Active ageing is described as the process of optimizing health, empowerment, and security to enhance the quality of life in the rapidly growing population of older adults. Meanwhile, multimorbidity and neurological disorders, such as Parkinson's disease (PD), lead to global public health and resource limitations. We introduce a novel user-centered paradigm of ageing based on wearable-driven artificial intelligence (AI) that may harness the autonomy and independence that accompany functional limitation or disability, and possibly elevate life expectancy in older adults and people with PD. METHODS ActiveAgeing is a 4-year, multicentre, mixed method, cyclic study that combines digital phenotyping via commercial devices (Empatica E4, Fitbit Sense, and Oura Ring) with traditional evaluation (clinical assessment scales, in-depth interviews, and clinical consultations) and includes four types of participants: (1) people with PD and (2) their informal caregiver; (3) healthy older adults from the Helgetun living environment in Norway, and (4) people on the Helgetun waiting list. For the first study, each group will be represented by N = 15 participants to test the data acquisition and to determine the sample size for the second study. To suggest lifestyle changes, modules for human expert-based advice, machine-generated advice, and self-generated advice from accessible data visualization will be designed. Quantitative analysis of physiological data will rely on digital signal processing (DSP) and AI techniques. The clinical assessment scales are the Unified Parkinson's Disease Rating Scale (UPDRS), Montreal Cognitive Assessment (MoCA), Geriatric Depression Scale (GDS), Geriatric Anxiety Inventory (GAI), Apathy Evaluation Scale (AES), and the REM Sleep Behaviour Disorder Screening Questionnaire (RBDSQ). 
A qualitative inquiry will be carried out with individual and focus group interviews and analysed using a hermeneutic approach including narrative and thematic analysis techniques. DISCUSSION We hypothesise that digital phenotyping is feasible to explore the ageing process from clinical and lifestyle perspectives including older adults and people with PD. Data is used for clinical decision-making by symptom tracking, predicting symptom evolution, and discovering new outcome measures for clinical trials.
Affiliation(s)
- Juan C. Torrado
- Faculty of Medicine, Department of Global Public Health and Primary Care, Centre for Elderly and Nursing Home Medicine (SEFAS), University of Bergen, Bergen, Norway
- Bettina S. Husebo
- Faculty of Medicine, Department of Global Public Health and Primary Care, Centre for Elderly and Nursing Home Medicine (SEFAS), University of Bergen, Bergen, Norway
- Department of Nursing Home Medicine, Municipality of Bergen, Bergen, Norway
- Heather G. Allore
- Yale School of Medicine and Yale School of Public Health, New Haven, CT, United States of America
- Ane Erdal
- Faculty of Medicine, Department of Global Public Health and Primary Care, Centre for Elderly and Nursing Home Medicine (SEFAS), University of Bergen, Bergen, Norway
- Stein E. Fæø
- Faculty of Health Studies, Department of Nursing, VID Specialized University, Bergen, Norway
- Haakon Reithe
- Faculty of Medicine, Department of Global Public Health and Primary Care, Centre for Elderly and Nursing Home Medicine (SEFAS), University of Bergen, Bergen, Norway
- Elise Førsund
- Faculty of Medicine, Department of Global Public Health and Primary Care, Centre for Elderly and Nursing Home Medicine (SEFAS), University of Bergen, Bergen, Norway
- Charalampos Tzoulis
- Department of Neurology, Neuro-SysMed Center, Haukeland University Hospital, Bergen, Norway
- K.G Jebsen Center for Translational Research in Parkinson’s Disease, University of Bergen, Bergen, Norway
- Department of Clinical Medicine, University of Bergen, Bergen, Norway
- Monica Patrascu
- Faculty of Medicine, Department of Global Public Health and Primary Care, Centre for Elderly and Nursing Home Medicine (SEFAS), University of Bergen, Bergen, Norway
4
op den Akker H, Cabrita M, Pnevmatikakis A. Digital Therapeutics: Virtual Coaching Powered by Artificial Intelligence on Real-World Data. Frontiers in Computer Science 2021. [DOI: 10.3389/fcomp.2021.750428] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Abstract
An ever-increasing number of people need to cope with one or more chronic conditions for a significant portion of their life. Digital Therapeutics (DTx) focused on the prevention, management, or treatment of chronic diseases are promising in alleviating the personal socio-economic burden caused. In this paper we describe a proposed DTx methodology covering three main components: observation (which data is collected), understanding (how to acquire knowledge based on the data collected), and coaching (how to communicate the acquired knowledge to the user). We focus on an emerging form of automated virtual coaching, delivered through conversational agents allowing interaction with end-users using natural language. Our methodology will be applied in the new generation of the Healthentia platform, an eClinical solution that captures clinical outcomes from mobile, medical and Internet of Things (IoT) devices, using a patient-centric mobile application and offers Artificial Intelligence (AI) driven smart services. While we are unable to provide data to prove its effectiveness, we illustrate the potential of the proposed architecture to deliver DTx by describing how the methodology can be applied to a use-case consisting of a clinical trial for treatment of a chronic condition, combining testing of a new medication and a lifestyle intervention, which will be partly implemented and evaluated in the context of the European research project RE-SAMPLE (REal-time data monitoring for Shared, Adaptive, Multi-domain and Personalised prediction, and decision making for Long-term Pulmonary care Ecosystems).
5
Woodward MA, Maganti N, Niziol LM, Amin S, Hou A, Singh K. Development and Validation of a Natural Language Processing Algorithm to Extract Descriptors of Microbial Keratitis From the Electronic Health Record. Cornea 2021; 40:1548-1553. [PMID: 34029244 PMCID: PMC8578049 DOI: 10.1097/ico.0000000000002755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2020] [Accepted: 03/17/2021] [Indexed: 11/26/2022]
Abstract
PURPOSE The purpose of this article was to develop and validate a natural language processing (NLP) algorithm to extract qualitative descriptors of microbial keratitis (MK) from electronic health records. METHODS In this retrospective cohort study, patients with MK diagnoses from 2 academic centers were identified using electronic health records. An NLP algorithm was created to extract MK centrality, depth, and thinning. A random sample of encounters of patients with MK (400 encounters of 100 patients) was used to train the algorithm, and its output was compared with expert chart review. The algorithm was evaluated in internal (n = 100) and external validation data sets (n = 59) in comparison with masked chart review. Outcomes were sensitivity and specificity of the NLP algorithm to extract qualitative MK features as compared with masked chart review performed by an ophthalmologist. RESULTS Across data sets, gold-standard chart review found centrality was documented in 64.0% to 79.3% of charts, depth in 15.0% to 20.3%, and thinning in 25.4% to 31.3%. Compared with chart review, the NLP algorithm had a sensitivity of 80.3%, 50.0%, and 66.7% for identifying central MK, 85.4%, 66.7%, and 100% for deep MK, and 100.0%, 95.2%, and 100% for thin MK, in the training, internal, and external validation samples, respectively. In the same three samples, respectively, specificity was 41.1%, 38.6%, and 46.2% for centrality and 100%, 83.3%, and 71.4% for depth; for thinning it was 93.3% and 100% in the training and internal samples and not applicable (n = 0) in the external sample. CONCLUSIONS MK features are not documented consistently, showing a lack of standardization in recording MK examination elements. NLP shows promise but will be limited if the available clinical data are missing from the chart.
Affiliation(s)
- Maria A. Woodward
- Department of Ophthalmology and Visual Sciences, W. K. Kellogg Eye Center, University of Michigan, Ann Arbor, Michigan
- Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, Michigan
- Nenita Maganti
- Department of Ophthalmology and Visual Sciences, W. K. Kellogg Eye Center, University of Michigan, Ann Arbor, Michigan
- Feinberg School of Medicine, Northwestern University, Chicago, Illinois
- Leslie M. Niziol
- Department of Ophthalmology and Visual Sciences, W. K. Kellogg Eye Center, University of Michigan, Ann Arbor, Michigan
- Sejal Amin
- Department of Ophthalmology, Henry Ford Health System, Detroit, Michigan
- Andrew Hou
- Department of Ophthalmology, Henry Ford Health System, Detroit, Michigan
- Karandeep Singh
- Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, Michigan
- Departments of Learning Health Systems and Internal Medicine, University of Michigan, Ann Arbor, Michigan
6
Goodwin TR, Harabagiu SM. Inferring Clinical Correlations from EEG Reports with Deep Neural Learning. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2018; 2017:770-779. [PMID: 29854143 PMCID: PMC5977577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Successful diagnosis and management of neurological dysfunction relies on proper communication between the neurologist and the primary physician (or other specialists). Because this communication is documented within medical records, the ability to automatically infer the clinical correlations for a patient from his or her medical records would provide an important step towards enabling health care systems to automatically identify patients requiring additional follow-up as well as flagging any unexpected clinical correlations for review. In this paper, we present a Deep Section Recovery Model (DSRM) which applies deep neural learning on a large body of EEG reports in order to infer the expected clinical correlations for a patient from the information in a given EEG report by (1) automatically extracting word- and report- level features from the report and (2) inferring the most likely clinical correlations and expressing those clinical correlations in natural language. We evaluated the performance of the DSRM by removing the clinical correlation sections from EEG reports and measuring how well the model could recover that information from the remainder of the report. The DSRM obtained a 17% improvement over the top-performing baseline, highlighting not only the power of the DSRM but also the promise of automatically recognizing unexpected clinical correlations in the future.
7
Vreeman DJ, Richoz C. Possibilities and Implications of Using the ICF and Other Vocabulary Standards in Electronic Health Records. Physiotherapy Research International 2013; 20:210-9. [PMID: 23897840 DOI: 10.1002/pri.1559] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 01/02/2013] [Revised: 04/25/2013] [Accepted: 05/20/2013] [Indexed: 11/07/2022]
Abstract
There is now widespread recognition of the powerful potential of electronic health record (EHR) systems to improve the health-care delivery system. The benefits of EHRs grow even larger when the health data within their purview are seamlessly shared, aggregated and processed across different providers, settings and institutions. Yet, the plethora of idiosyncratic conventions for identifying the same clinical content in different information systems is a fundamental barrier to fully leveraging the potential of EHRs. Only by adopting vocabulary standards that provide the lingua franca across these local dialects can computers efficiently move, aggregate and use health data for decision support, outcomes management, quality reporting, research and many other purposes. In this regard, the International Classification of Functioning, Disability, and Health (ICF) is an important standard for physiotherapists because it provides a framework and standard language for describing health and health-related states. However, physiotherapists and other health-care professionals capture a wide range of data such as patient histories, clinical findings, tests and measurements, procedures, and so on, for which other vocabulary standards such as Logical Observation Identifiers Names and Codes and Systematized Nomenclature Of Medicine Clinical Terms are crucial for interoperable communication between different electronic systems. In this paper, we describe how the ICF and other internationally accepted vocabulary standards could advance physiotherapy practice and research by enabling data sharing and reuse by EHRs. We highlight how these different vocabulary standards fit together within a comprehensive record system, and how EHRs can make use of them, with a particular focus on enhancing decision-making.
By incorporating the ICF and other internationally accepted vocabulary standards into our clinical information systems, physiotherapists will be able to leverage the potent capabilities of EHRs and contribute our unique clinical perspective to other health-care providers within the emerging electronic health information infrastructure.
Affiliation(s)
- Daniel J Vreeman
- Biomedical Informatics, Regenstrief Institute, Inc., Indianapolis, IN, 46202-3012, USA
- Indiana University School of Medicine, Indiana University, Indianapolis, IN, 46202-3012, USA
- Christophe Richoz
- Advanced Computing Research Centre, Health Informatics Lab, University of South Australia, Mawson Lakes, South Australia, 5095, Australia
8
Jonnalagadda SR, Del Fiol G, Medlin R, Weir C, Fiszman M, Mostafa J, Liu H. Automatically extracting sentences from Medline citations to support clinicians' information needs. J Am Med Inform Assoc 2012; 20:995-1000. [PMID: 23100128 DOI: 10.1136/amiajnl-2012-001347] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVE Online health knowledge resources contain answers to most of the information needs raised by clinicians in the course of care. However, significant barriers limit the use of these resources for decision-making, especially clinicians' lack of time. In this study we assessed the feasibility of automatically generating knowledge summaries for a particular clinical topic composed of relevant sentences extracted from Medline citations. METHODS The proposed approach combines information retrieval and semantic information extraction techniques to identify relevant sentences from Medline abstracts. We assessed this approach in two case studies on the treatment alternatives for depression and Alzheimer's disease. RESULTS A total of 515 of 564 (91.3%) sentences retrieved in the two case studies were relevant to the topic of interest. About one-third of the relevant sentences described factual knowledge or a study conclusion that can be used for supporting information needs at the point of care. CONCLUSIONS The high rate of relevant sentences is desirable, given that clinicians' lack of time is one of the main barriers to using knowledge resources at the point of care. Sentence rank was not significantly associated with relevancy, possibly due to most sentences being highly relevant. Sentences located closer to the end of the abstract and sentences with treatment and comparative predications were likely to be conclusive sentences. Our proposed technical approach to helping clinicians meet their information needs is promising. The approach can be extended for other knowledge resources and information need types.
9
Rosenbloom ST, Denny JC, Xu H, Lorenzi N, Stead WW, Johnson KB. Data from clinical notes: a perspective on the tension between structure and flexible documentation. J Am Med Inform Assoc 2011; 18:181-6. [PMID: 21233086 DOI: 10.1136/jamia.2010.007237] [Citation(s) in RCA: 226] [Impact Index Per Article: 17.4] [Indexed: 12/30/2022] Open
Abstract
Clinical documentation is central to patient care. The success of electronic health record system adoption may depend on how well such systems support clinical documentation. A major goal of integrating clinical documentation into electronic health record systems is to generate reusable data. As a result, there has been an emphasis on deploying computer-based documentation systems that prioritize direct structured documentation. Research has demonstrated that healthcare providers value different factors when writing clinical notes, such as narrative expressivity, amenability to the existing workflow, and usability. The authors explore the tension between expressivity and structured clinical documentation, review methods for obtaining reusable data from clinical notes, and recommend that healthcare providers be able to choose how to document patient care based on workflow and note content needs. When reusable data are needed from notes, providers can use structured documentation or rely on post-hoc text processing to produce structured data, as appropriate.
Affiliation(s)
- S Trent Rosenbloom
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
10
Rubinelli S, Schulz PJ, Hartung U. “Your risk is low, because …”: argument-driven online genetic counselling. Argument & Computation 2010. [DOI: 10.1080/19462166.2010.504884] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 10/19/2022]
11
Sarkar IN. Biomedical informatics and translational medicine. J Transl Med 2010; 8:22. [PMID: 20187952 PMCID: PMC2837642 DOI: 10.1186/1479-5876-8-22] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Received: 07/21/2009] [Accepted: 02/26/2010] [Indexed: 11/23/2022] Open
Abstract
Biomedical informatics involves a core set of methodologies that can provide a foundation for crossing the "translational barriers" associated with translational medicine. To this end, the fundamental aspects of biomedical informatics (e.g., bioinformatics, imaging informatics, clinical informatics, and public health informatics) may be essential in helping improve the ability to bring basic research findings to the bedside, evaluate the efficacy of interventions across communities, and enable the assessment of the eventual impact of translational medicine innovations on health policies. Here, a brief description is provided for a selection of key biomedical informatics topics (Decision Support, Natural Language Processing, Standards, Information Retrieval, and Electronic Health Records) and their relevance to translational medicine. Based on contributions and advancements in each of these topic areas, the article proposes that biomedical informatics practitioners ("biomedical informaticians") can be essential members of translational medicine teams.
Affiliation(s)
- Indra Neil Sarkar
- Center for Clinical and Translational Science, Department of Microbiology and Molecular Genetics, University of Vermont, College of Medicine, 89 Beaumont Ave, Given Courtyard N309, Burlington, VT 05405, USA.
12
When a graph is poorer than 100 words: A comparison of computerised natural language generation, human generated descriptions and graphical displays in neonatal intensive care. Applied Cognitive Psychology 2010. [DOI: 10.1002/acp.1545] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/07/2022]
13
Hunter J, Freer Y, Gatt A, Logie R, McIntosh N, van der Meulen M, Portet F, Reiter E, Sripada S, Sykes C. Summarising complex ICU data in natural language. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2008; 2008:323-327. [PMID: 18998961 PMCID: PMC2656014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2008] [Revised: 07/02/2008] [Indexed: 05/27/2023]
Abstract
It has been shown that summarizing complex multi-channel physiological and discrete data in natural language (text) can lead to better decision-making in the intensive care unit (ICU). As part of the BabyTalk project, we describe a prototype system (BT-45) which can generate such textual summaries automatically. Although these summaries are not yet as good as those generated by human experts, we have demonstrated experimentally that they lead to as good decision-making as can be achieved through presenting the same data graphically.
Affiliation(s)
- Jim Hunter
- Department of Computing Science, University of Aberdeen, UK
15
Johnson SB, Bakken S, Dine D, Hyun S, Mendonça E, Morrison F, Bright T, Van Vleck T, Wrenn J, Stetson P. An electronic health record based on structured narrative. J Am Med Inform Assoc 2008; 15:54-64. [PMID: 17947628 PMCID: PMC2274868 DOI: 10.1197/jamia.m2131] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.2] [Received: 04/20/2006] [Accepted: 09/20/2007] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVE To develop an electronic health record that facilitates rapid capture of detailed narrative observations from clinicians, with partial structuring of narrative information for integration and reuse. DESIGN We propose a design in which unstructured text and coded data are fused into a single model called structured narrative. Each major clinical event (e.g., encounter or procedure) is represented as a document that is marked up to identify gross structure (sections, fields, paragraphs, lists) as well as fine structure within sentences (concepts, modifiers, relationships). Marked up items are associated with standardized codes that enable linkage to other events, as well as efficient reuse of information, which can speed up data entry by clinicians. Natural language processing is used to identify fine structure, which can reduce the need for form-based entry. VALIDATION The model is validated through an example of use by a clinician, with discussion of relevant aspects of the user interface, data structures and processing rules. DISCUSSION The proposed model represents all patient information as documents with standardized gross structure (templates). Clinicians enter their data as free text, which is coded by natural language processing in real time making it immediately usable for other computation, such as alerts or critiques. In addition, the narrative data annotates and augments structured data with temporal relations, severity and degree modifiers, causal connections, clinical explanations and rationale. CONCLUSION Structured narrative has potential to facilitate capture of data directly from clinicians by allowing freedom of expression, giving immediate feedback, supporting reuse of clinical information and structuring data for subsequent processing, such as quality assurance and clinical research.
Affiliation(s)
- Stephen B Johnson
- Department of Biomedical Informatics, Columbia University, New York, NY, USA.
16
Dimarco C, Bray P, Covvey HD, Cowan DD, Diciccio V, Hovy E, Lipa J, Yang C. Authoring and generation of individualized patient education materials. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2006; 2006:195-9. [PMID: 17238330 PMCID: PMC1839536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2023]
Abstract
Although the pre-surgical patient-surgeon encounter is the opportunity to educate the patient, it is essential that the patient be given educational materials to complement the face-to-face exchange. This is virtually impossible to do well with brochures, because many combinations of procedures are possible, different patients have different concerns, and patients have varying levels of literacy and knowledge. In the extreme, a patient would either be given a set of brochures selected from 100s of variants, or all patients would be given the same set of brochures without regard for differing needs. We have been developing an information brochure generator that customizes material for every individual patient regardless of the complexity of the surgical intervention.
Affiliation(s)
- C Dimarco
- Waterloo Institute for Health Informatics Research, Waterloo; the Division of Plastic Surgery, University Health Network, University of Toronto, Toronto, Ontario, Canada; and the Information Sciences Institute, University of Southern California, USA
|
17
|
Morrison FP, Kukafka R, Johnson SB. Analyzing the structure and content of public health messages. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2005; 2005:540-4. [PMID: 16779098 PMCID: PMC1560424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
BACKGROUND Health messages are crucial to the field of public health in effecting behavior change, but little research is available to assist writers in composing the overall structure of a message. In order to develop software to assist non-expert message writers in constructing effective messages, the structure of existing health messages must be understood, and an appropriate method for analyzing health message structure developed. METHODS Seventy-two messages from expert sources were used for development of the method, which was then tested for reproducibility using ten randomly selected health messages. Four raters analyzed the messages and inter-coder agreement was calculated. RESULTS A method for analyzing the structure of the messages was developed using sublanguage analysis and discourse analysis. Overall kappa among the four coders was 0.69, demonstrating "substantial agreement." CONCLUSION A novel framework for characterizing health message structure and a method for analyzing messages appear to be reproducible.
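For readers unfamiliar with multi-rater agreement, the following is a minimal sketch of how a kappa over four coders could be computed. The abstract does not say which kappa statistic was used; Fleiss' kappa is assumed here because it is a common choice when more than two raters label each item. The function name, the labels "A" and "B", and the toy ratings are hypothetical and are not the study's data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of one label per rater.

    Every item must be rated by the same number of raters.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: proportion of agreeing rater pairs, per item.
    per_item = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    shares = [sum(c[k] for c in counts) / (n_items * n_raters)
              for k in categories]
    p_expected = sum(s * s for s in shares)

    if p_expected == 1.0:
        # Only one category was ever used; treat agreement as perfect.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data: three items, four raters each (hypothetical labels).
sample = [
    ["A", "A", "A", "B"],
    ["B", "B", "B", "B"],
    ["A", "A", "B", "B"],
]
kappa = fleiss_kappa(sample)
```

On the widely used Landis and Koch scale, values from 0.61 to 0.80 are labeled "substantial agreement", which matches the wording quoted in the abstract for 0.69.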
|
18
|
Green N. GenIE: an intelligent system for writing genetic counseling patient letters. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2005; 2005:969. [PMID: 16779256 PMCID: PMC1560849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
We are developing GenIE, a prototype intelligent system to create first drafts of genetic counseling patient letters. GenIE will apply natural language generation techniques to construct the first draft of a letter for subsequent review and editing, if needed, by the genetic counselor. For purposes of knowledge acquisition, we have been analyzing a corpus of patient letters. Based on the corpus analysis we are developing a knowledge base and text generation strategies.
Affiliation(s)
- Nancy Green
- Department of Mathematical Sciences/Computer Science Division, University of North Carolina at Greensboro, NC 27402, USA
|
19
|
Abstract
When authors of empirical science articles write abstracts, they employ a wide variety of distinct linguistic operations which interact to condense and rephrase a subset of sentences from the source text. An on-going comparison of biological and biomedical journal articles with their author-written abstracts is providing a basis for a more linguistically detailed model of abstract derivation using syntactic representations of selected source sentences. The description makes use of rich dictionary information to formulate paraphrasing rules of differing degrees of generality, including some which are sublanguage-specific, and others which appear valid in several languages when formulated using "lexical functions" to express important semantic relationships between lexical items. Some paraphrase operations may use both lexical functions and rhetorical relations between sentences to reformulate larger chunks of text in a concise abstract sentence. The descriptive framework is computable and utilizes existing linguistic resources.
Affiliation(s)
- Richard Kittredge
- Department of Linguistics and Translation, University of Montreal, CP 6128, Succ.A, Montréal, Que., Canada H3C 3J7.
|
20
|
Abstract
In this paper, we describe how user-adapted explanations about drug prescriptions can be generated from already existing data sources. We start by illustrating the two-step approach employed in the first version of the natural language generator and the limitations of the generated texts that we discovered through analytical and empirical evaluations. We claim that, although style refinement would be needed in these texts, particular care should be devoted to implementing some of the persuasion techniques that doctors employ in their explanations. This would require either thoroughly revising the text planning techniques employed or converting to a multistep generation architecture. We justify why we selected the second alternative and propose some heuristics to repair problems found in the first version of the generator. Some final considerations about the advantages of this approach and the possibility of generalizing it to other domains conclude the paper.
Affiliation(s)
- F de Rosis
- Dipartimento di Informatica, Università di Bari, Italy.
|
21
|
Abstract
A number of compositional Medical Concept Representation systems are being developed. Although these provide for a detailed conceptual representation of the underlying information, they have to be translated back to natural language for use by end users and applications. The GALEN programme has been developing one such representation and we report here on a tool developed to generate natural language phrases from the GALEN conceptual representations. This tool can be adapted to different source modelling schemes and to different destination languages or sublanguages of a domain. It is based on a multilingual approach to natural language generation, realised through a clean separation of the domain model from the linguistic model and their link by well-defined structures. Specific knowledge structures and operations have been developed for bridging between the modelling 'style' of the conceptual representation and natural language. Using the example of the scheme developed for modelling surgical operative procedures within the GALEN-IN-USE project, we show how the generator is adapted to such a scheme. The basic characteristics of the surgical procedures scheme are presented together with the basic principles of the generation tool. Using worked examples, we discuss the transformation operations which change the initial source representation into a form which can more directly be translated to a given natural language. In particular, the linguistic knowledge which has to be introduced, such as definitions of concepts and relationships, is described. We explain the overall generator strategy and how particular transformation operations are triggered by language-dependent and conceptual parameters. Results are shown for generated French phrases corresponding to surgical procedures from the urology domain.
Affiliation(s)
- J C Wagner
- Medical Informatics Division, University Hospital of Geneva, Switzerland.
|
22
|
Rassinoux AM, Lovis C, Baud RH, Scherrer JR. Versatility of a multilingual and bi-directional approach for medical language processing. Proc AMIA Symp 1998:668-72. [PMID: 9929303 PMCID: PMC2232097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Abstract
At the dawn of the 21st century, we are experiencing an exponential growth of online information that is mostly textual, and that benefits from new electronic media, such as the World Wide Web (WWW), to be broadly diffused across borders. However, there is a gap to bridge between holding information and accessing the deep underlying knowledge in a relevant way. Multilingual natural language processing (NLP), once tuned, is certainly the best solution to cope with this era of textual information. This paper focuses on the lessons learned through the joint development of an analyzer and a generator of medical language, within a multilingual context. Concrete examples, derived from the efforts under way in the European GALEN-IN-USE project, illustrate the use of these linguistic tools for the handling of surgical procedures.
Affiliation(s)
- A M Rassinoux
- Medical Informatics Division, University Hospital of Geneva, Switzerland
|
23
|
Carolis BD, Rosis FD, Andreoli C, Cavallo V, Cicco MLD. The dynamic generation of hypertext presentations of medical guidelines. NEW REV HYPERMEDIA M 1998. [DOI: 10.1080/13614569808914696] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 10/23/2022]
|