1
Webb BD, Lau LY, Tsevdos D, Shewcraft RA, Corrigan D, Shi L, Lee S, Tyler J, Li S, Wang Z, Stolovitzky G, Edelmann L, Chen R, Schadt EE, Li L. An algorithm to identify patients aged 0-3 with rare genetic disorders. Orphanet J Rare Dis 2024; 19:183. [PMID: 38698482 PMCID: PMC11064409 DOI: 10.1186/s13023-024-03188-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 04/17/2024] [Indexed: 05/05/2024] [Open Access]
Abstract
BACKGROUND: With over 7000 Mendelian disorders, identifying children with a specific rare genetic disorder diagnosis through structured electronic medical record data is challenging given incompleteness of records, inaccurate medical diagnosis coding, as well as heterogeneity in clinical symptoms and procedures for specific disorders. We sought to develop a digital phenotyping algorithm (PheIndex) using electronic medical records to identify children aged 0-3 diagnosed with genetic disorders or who present with illness with an increased risk for genetic disorders.
RESULTS: Through expert opinion, we established 13 criteria for the algorithm and derived a score and a classification. The performance of each criterion and the classification were validated by chart review. PheIndex identified 1,088 children out of 93,154 live births who may be at an increased risk for genetic disorders. Chart review demonstrated that the algorithm achieved 90% sensitivity, 97% specificity, and 94% accuracy.
CONCLUSIONS: The PheIndex algorithm can help identify when a rare genetic disorder may be present, alerting providers to consider ordering a diagnostic genetic test and/or referring a patient to a medical geneticist.
Affiliation(s)
- Bryn D Webb
  - Department of Pediatrics, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Lisa Y Lau
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Despina Tsevdos
  - Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ryan A Shewcraft
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- David Corrigan
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Lisong Shi
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Seungwoo Lee
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Jonathan Tyler
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Shilong Li
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Zichen Wang
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Gustavo Stolovitzky
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Lisa Edelmann
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Rong Chen
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
- Eric E Schadt
  - Department of Genetics and Genomic Sciences, The Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Li Li
  - GeneDx Holdings Corp, (formerly known as Sema4 Holdings Corp.), Stamford, Connecticut, CT, USA
2
Alsentzer E, Rasmussen MJ, Fontoura R, Cull AL, Beaulieu-Jones B, Gray KJ, Bates DW, Kovacheva VP. Zero-shot interpretable phenotyping of postpartum hemorrhage using large language models. NPJ Digit Med 2023; 6:212. [PMID: 38036723 PMCID: PMC10689487 DOI: 10.1038/s41746-023-00957-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 06/02/2023] [Accepted: 11/01/2023] [Indexed: 12/02/2023] [Open Access]
Abstract
Many areas of medicine would benefit from deeper, more accurate phenotyping, but there are limited approaches for phenotyping using clinical notes without substantial annotated data. Large language models (LLMs) have demonstrated immense potential to adapt to novel tasks with no additional training by specifying task-specific instructions. Here we report the performance of a publicly available LLM, Flan-T5, in phenotyping patients with postpartum hemorrhage (PPH) using discharge notes from electronic health records (n = 271,081). The language model achieves strong performance in extracting 24 granular concepts associated with PPH. Identifying these granular concepts accurately allows the development of interpretable, complex phenotypes and subtypes. The Flan-T5 model achieves high fidelity in phenotyping PPH (positive predictive value of 0.95), identifying 47% more patients with this complication compared to the current standard of using claims codes. This LLM pipeline can be used reliably for subtyping PPH and outperforms a claims-based approach on the three most common PPH subtypes associated with uterine atony, abnormal placentation, and obstetric trauma. The advantage of this approach to subtyping is its interpretability, as each concept contributing to the subtype determination can be evaluated. Moreover, as definitions may change over time due to new guidelines, using granular concepts to create complex phenotypes enables prompt and efficient updating of the algorithm. Using this language modelling approach enables rapid phenotyping without the need for any manually annotated training data across multiple clinical use cases.
Affiliation(s)
- Emily Alsentzer
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
- Matthew J Rasmussen
  - Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Romy Fontoura
  - Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Alexis L Cull
  - Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Brett Beaulieu-Jones
  - Section of Biomedical Data Science, Department of Medicine, University of Chicago, Chicago, IL, USA
- Kathryn J Gray
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Division of Maternal-Fetal Medicine, Brigham and Women's Hospital, Boston, MA, USA
- David W Bates
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA
  - Department of Health Care Policy and Management, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Vesela P Kovacheva
  - Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA
3
Patek K, Friedman P. Postpartum Hemorrhage-Epidemiology, Risk Factors, and Causes. Clin Obstet Gynecol 2023; 66:344-356. [PMID: 37130373 DOI: 10.1097/grf.0000000000000782] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
The incidence of postpartum hemorrhage (PPH) is increasing worldwide and in the United States. Coinciding, is the increased rate of severe maternal morbidity with blood transfusion in the United States over the past 2 decades. Consequences of PPH can be life-threatening and carry significant cost burden to the health care system. This review will discuss the current trends, distribution, and risk factors for PPH. Causes of PPH will be explored in detail.
Affiliation(s)
- Kara Patek
  - Corewell Health William Beaumont University Hospital, Royal Oak, Michigan
4
Alsentzer E, Rasmussen MJ, Fontoura R, Cull AL, Beaulieu-Jones B, Gray KJ, Bates DW, Kovacheva VP. Zero-shot Interpretable Phenotyping of Postpartum Hemorrhage Using Large Language Models. medRxiv 2023:2023.05.31.23290753. [PMID: 37398230 PMCID: PMC10312824 DOI: 10.1101/2023.05.31.23290753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Many areas of medicine would benefit from deeper, more accurate phenotyping, but there are limited approaches for phenotyping using clinical notes without substantial annotated data. Large language models (LLMs) have demonstrated immense potential to adapt to novel tasks with no additional training by specifying task-specific instructions. We investigated the performance of a publicly available LLM, Flan-T5, in phenotyping patients with postpartum hemorrhage (PPH) using discharge notes from electronic health records (n = 271,081). The language model achieved strong performance in extracting 24 granular concepts associated with PPH. Identifying these granular concepts accurately allowed the development of interpretable, complex phenotypes and subtypes. The Flan-T5 model achieved high fidelity in phenotyping PPH (positive predictive value of 0.95), identifying 47% more patients with this complication compared to the current standard of using claims codes. This LLM pipeline can be used reliably for subtyping PPH and outperformed a claims-based approach on the three most common PPH subtypes associated with uterine atony, abnormal placentation, and obstetric trauma. The advantage of this approach to subtyping is its interpretability, as each concept contributing to the subtype determination can be evaluated. Moreover, as definitions may change over time due to new guidelines, using granular concepts to create complex phenotypes enables prompt and efficient updating of the algorithm. Using this language modelling approach enables rapid phenotyping without the need for any manually annotated training data across multiple clinical use cases.
5
Boland MR, Elhadad N, Pratt W. Informatics for sex- and gender-related health: understanding the problems, developing new methods, and designing new solutions. J Am Med Inform Assoc 2022; 29:225-229. [PMID: 35024858 PMCID: PMC8757304 DOI: 10.1093/jamia/ocab287] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/20/2021] [Accepted: 12/20/2021] [Indexed: 01/14/2023] [Open Access]
Affiliation(s)
- Mary Regina Boland
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Center for Excellence in Environmental Toxicology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  - Leonard Davis Institute for Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Noémie Elhadad
  - Biomedical Informatics, Columbia University, New York, New York, USA
- Wanda Pratt
  - Information School, University of Washington, Seattle, Washington, USA