1
Schreidah CM, Gordon ER, Adeuyan O, Chen C, Lapolla BA, Kent JA, Reynolds GB, Fahmy LM, Weng C, Tatonetti NP, Chase HS, Pe’er I, Geskin LJ. Current status of artificial intelligence methods for skin cancer survival analysis: a scoping review. Front Med (Lausanne) 2024; 11:1243659. [PMID: 38711781 PMCID: PMC11070520 DOI: 10.3389/fmed.2024.1243659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 02/22/2024] [Indexed: 05/08/2024] Open
Abstract
Skin cancer mortality rates continue to rise, and survival analysis is increasingly needed to understand who is at risk and what interventions improve outcomes. However, current statistical methods are limited by their inability to synthesize multiple data types, such as patient genetics, clinical history, demographics, and pathology, and to reveal significant multimodal relationships through predictive algorithms. Advances in computing power and data science have enabled the rise of artificial intelligence (AI), which synthesizes vast amounts of data and applies algorithms that enable personalized diagnostic approaches. Here, we analyze AI methods used in skin cancer survival analysis, focusing on supervised learning, unsupervised learning, deep learning, and natural language processing. We illustrate strengths and weaknesses of these approaches with examples. Our PubMed search yielded 14 publications meeting inclusion criteria for this scoping review. Most publications focused on melanoma, particularly histopathologic interpretation with deep learning. Such concentration on a single type of skin cancer amid increasing focus on deep learning highlights growing areas for innovation; however, it also demonstrates an opportunity for additional analysis that addresses other types of cutaneous malignancies and expands the scope of prognostication to combine genetic, histopathologic, and clinical data. Moreover, researchers may leverage multiple AI methods for enhanced benefit in analyses. Expanding AI to this arena may enable improved survival analysis, targeted treatments, and outcomes.
Affiliation(s)
- Celine M. Schreidah
  - Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, United States
- Emily R. Gordon
  - Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, United States
- Oluwaseyi Adeuyan
  - Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, United States
- Caroline Chen
  - Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, United States
- Brigit A. Lapolla
  - Department of Dermatology, Columbia University Irving Medical Center, New York, NY, United States
- Joshua A. Kent
  - Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY, United States
- Lauren M. Fahmy
  - Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, United States
- Chunhua Weng
  - The Data Science Institute, Columbia University, New York, NY, United States
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States
- Nicholas P. Tatonetti
  - The Data Science Institute, Columbia University, New York, NY, United States
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
  - Cedars-Sinai Cancer, Cedars-Sinai Medical Center, Los Angeles, CA, United States
- Herbert S. Chase
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States
- Itsik Pe’er
  - The Data Science Institute, Columbia University, New York, NY, United States
  - Department of Systems Biology, Columbia University, New York, NY, United States
  - Department of Computer Science, Columbia University, New York, NY, United States
- Larisa J. Geskin
  - Department of Dermatology, Columbia University Irving Medical Center, New York, NY, United States
2
Colvin A, Youssef S, Noh H, Wright J, Jumonville G, LaRow Brown K, Tatonetti NP, Milner JD, Weng C, Bordone LA, Petukhova L. Inborn Errors of Immunity Contribute to the Burden of Skin Disease and Create Opportunities for Improving the Practice of Dermatology. J Invest Dermatol 2024; 144:307-315.e1. [PMID: 37716649 DOI: 10.1016/j.jid.2023.08.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 07/31/2023] [Accepted: 08/01/2023] [Indexed: 09/18/2023]
Abstract
Opportunities to improve the clinical management of skin disease are being created by advances in genomic medicine. Large-scale sequencing increasingly challenges notions about single-gene disorders. It is now apparent that monogenic etiologies make appreciable contributions to the population burden of disease and that they are underrecognized in clinical practice. A genetic diagnosis informs on molecular pathology and may direct targeted treatments and tailored prevention strategies for patients and family members. It also generates knowledge about disease pathogenesis and management that is relevant to patients without rare pathogenic variants. Inborn errors of immunity are a large class of monogenic etiologies that have been well-studied and contribute to the population burden of inflammatory diseases. To further delineate the contributions of inborn errors of immunity to the pathogenesis of skin disease, we performed a set of analyses that identified 316 inborn errors of immunity associated with skin pathologies, including common skin diseases. These data suggest that clinical sequencing is underutilized in dermatology. We next use these data to derive a network that illuminates the molecular relationships of these disorders and suggests an underlying etiological organization to immune-mediated skin disease. Our results motivate the further development of a molecularly derived and data-driven reorganization of clinical diagnoses of skin disease.
Affiliation(s)
- Annelise Colvin
  - Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Soundos Youssef
  - Department of Pediatrics and Adolescent Medicine, American University of Beirut Medical Center, Beirut, Lebanon
- Heeju Noh
  - Department of Systems Biology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Julia Wright
  - Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York, USA
- Ghislaine Jumonville
  - Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York, USA
- Kathleen LaRow Brown
  - Department of Biomedical Informatics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Nicholas P Tatonetti
  - Department of Biomedical Informatics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA; Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, California, USA; Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Joshua D Milner
  - Department of Pediatrics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Lindsey A Bordone
  - Department of Dermatology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Lynn Petukhova
  - Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York, USA; Department of Dermatology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
3
Brown KL, Ramlall V, Zietz M, Gisladottir U, Tatonetti NP. Estimating the heritability of SARS-CoV-2 susceptibility and COVID-19 severity. Nat Commun 2024; 15:367. [PMID: 38191623 PMCID: PMC10774300 DOI: 10.1038/s41467-023-44250-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 12/05/2023] [Indexed: 01/10/2024] Open
Abstract
SARS-CoV-2 has infected over 340 million people, prompting therapeutic research. While genetic studies can highlight potential drug targets, understanding the heritability of SARS-CoV-2 susceptibility and COVID-19 severity can contextualize their results. To date, loci from meta-analyses explain 1.2% and 5.8% of variation in susceptibility and severity respectively. Here we estimate the importance of shared environment and additive genetic variation to SARS-CoV-2 susceptibility and COVID-19 severity using pedigree data, PCR results, and hospitalization information. The relative importance of genetics and shared environment for susceptibility shifted during the study, with heritability ranging from 33% (95% CI: 20%-46%) to 70% (95% CI: 63%-74%). Heritability was greater for days hospitalized with COVID-19 (41%, 95% CI: 33%-57%) compared to shared environment (33%, 95% CI: 24%-38%). While our estimates suggest these genetic architectures are not fully understood, the shift in susceptibility estimates highlights the challenge of estimation during a pandemic, given environmental fluctuations and vaccine introduction.
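As a reading aid for the heritability and shared-environment percentages reported above, the following sketch assumes a standard additive variance-components decomposition. The abstract does not state the exact model, so the symbols $\sigma^2_A$, $\sigma^2_C$, and $\sigma^2_E$ and the residual calculation are assumptions rather than the authors' stated formulation.

```latex
% Assumed decomposition of phenotypic variance into additive genetic (A),
% shared environment (C), and residual (E) components.
\sigma^2_P = \sigma^2_A + \sigma^2_C + \sigma^2_E,
\qquad
h^2 = \frac{\sigma^2_A}{\sigma^2_P},
\qquad
c^2 = \frac{\sigma^2_C}{\sigma^2_P}

% Using the reported point estimates for days hospitalized with COVID-19
% (h^2 = 0.41, c^2 = 0.33), the share left to the residual term would be:
1 - h^2 - c^2 = 1 - 0.41 - 0.33 = 0.26
```

Under this assumed model, roughly a quarter of the variation in days hospitalized would remain unexplained by either component, before accounting for the width of the reported confidence intervals.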
Affiliation(s)
- Vijendra Ramlall
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
  - Department of Physiology & Cellular Biophysics, Columbia University, New York, NY, USA
- Michael Zietz
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Undina Gisladottir
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Nicholas P Tatonetti
  - Department of Biomedical Informatics, Columbia University, New York, NY, USA
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, CA, USA
  - Cedars-Sinai Cancer, Cedars-Sinai Medical Center, Los Angeles, CA, USA
4
Moore JH, Li X, Chang JH, Tatonetti NP, Theodorescu D, Chen Y, Asselbergs FW, Venkatesan M, Wang ZP. SynTwin: A graph-based approach for predicting clinical outcomes using digital twins derived from synthetic patients. Pac Symp Biocomput 2024; 29:96-107. [PMID: 38160272 PMCID: PMC10827004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2024]
Abstract
The concept of a digital twin came from the engineering, industrial, and manufacturing domains to create virtual objects or machines that could inform the design and development of real objects. This idea is appealing for precision medicine where digital twins of patients could help inform healthcare decisions. We have developed a methodology for generating and using digital twins for clinical outcome prediction. We introduce a new approach that combines synthetic data and network science to create digital twins (i.e. SynTwin) for precision medicine. First, our approach starts by estimating the distance between all subjects based on their available features. Second, the distances are used to construct a network with subjects as nodes and edges defining distance less than the percolation threshold. Third, communities or cliques of subjects are defined. Fourth, a large population of synthetic patients is generated using a synthetic data generation algorithm that models the correlation structure of the data to generate new patients. Fifth, digital twins are selected from the synthetic patient population that are within a given distance defining a subject community in the network. Finally, we compare and contrast community-based prediction of clinical endpoints using real subjects, digital twins, or both within and outside of the community. Key to this approach are the digital twins defined using patient similarity that represent hypothetical unobserved patients with patterns similar to nearby real patients as defined by network distance and community structure. We apply our SynTwin approach to predicting mortality in a population-based cancer registry (n=87,674) from the Surveillance, Epidemiology, and End Results (SEER) program from the National Cancer Institute (USA). Our results demonstrate that nearest network neighbor prediction of mortality in this study is significantly improved with digital twins (AUROC=0.864, 95% CI=0.857-0.872) over just using real data alone (AUROC=0.791, 95% CI=0.781-0.800). These results suggest a network-based digital twin strategy using synthetic patients may add value to precision medicine efforts.
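The steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SynTwin implementation: Euclidean distance, a fixed threshold in place of the percolation threshold, connected components in place of community detection, and Gaussian jitter in place of the synthetic data generator are all stand-ins, and every function name is hypothetical.

```python
import math
import random
from collections import defaultdict


def distance(a, b):
    """Euclidean distance between two feature tuples (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_network(features, threshold):
    """Step 2: link subjects whose pairwise distance is below the threshold."""
    adjacency = defaultdict(set)
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if distance(features[i], features[j]) < threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def find_communities(n_subjects, adjacency):
    """Step 3: connected components, a simple stand-in for community detection."""
    seen, groups = set(), []
    for start in range(n_subjects):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups


def make_synthetic(records, n_new, noise, rng):
    """Step 4 placeholder: jitter real records instead of a real generator."""
    synthetic = []
    for _ in range(n_new):
        features, outcome = rng.choice(records)
        jittered = tuple(x + rng.gauss(0.0, noise) for x in features)
        synthetic.append((jittered, outcome))
    return synthetic


def select_twins(community, records, synthetic, radius):
    """Step 5: keep synthetic patients within `radius` of any community member."""
    return [
        twin for twin in synthetic
        if any(distance(records[i][0], twin[0]) <= radius for i in community)
    ]


def nearest_neighbour_risk(query, pool, k=3):
    """Final step: mean outcome of the k nearest records in the pool."""
    nearest = sorted(pool, key=lambda rec: distance(query, rec[0]))[:k]
    return sum(outcome for _, outcome in nearest) / len(nearest)


if __name__ == "__main__":
    # Toy records: (features, died) pairs; values are illustrative only.
    records = [((0.0, 0.0), 0), ((0.0, 1.0), 0), ((5.0, 5.0), 1), ((5.0, 6.0), 1)]
    feats = [features for features, _ in records]
    groups = find_communities(len(feats), build_network(feats, threshold=1.5))
    synthetic = make_synthetic(records, n_new=50, noise=0.1, rng=random.Random(0))
    twins = select_twins(groups[1], records, synthetic, radius=1.0)
    print(groups, len(twins), nearest_neighbour_risk((5.0, 5.5), records + twins))
```

Comparing `nearest_neighbour_risk` on `records` alone against `records + twins` mirrors the real-versus-twin comparison described in the abstract, though on toy data the two pools may well agree.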
Affiliation(s)
- Jason H Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, CA, United States
  - Cedars-Sinai Cancer, Cedars-Sinai Medical Center, Los Angeles, CA, United States
5
Romano JD, Li H, Napolitano T, Realubit R, Karan C, Holford M, Tatonetti NP. Discovering Venom-Derived Drug Candidates Using Differential Gene Expression. Toxins (Basel) 2023; 15:451. [PMID: 37505720 PMCID: PMC10467105 DOI: 10.3390/toxins15070451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 06/16/2023] [Accepted: 07/07/2023] [Indexed: 07/29/2023] Open
Abstract
Venoms are a diverse and complex group of natural toxins that have been adapted to treat many types of human disease, but rigorous computational approaches for discovering new therapeutic activities are scarce. We have designed and validated a new platform, named VenomSeq, to systematically identify putative associations between venoms and drugs/diseases via high-throughput transcriptomics and perturbational differential gene expression analysis. In this study, we describe the architecture of VenomSeq and its evaluation using the crude venoms from 25 diverse animal species and 9 purified teretoxin peptides. By integrating comparisons to public repositories of differential expression, associations between regulatory networks and disease, and existing knowledge of venom activity, we provide a number of new therapeutic hypotheses linking venoms to human diseases supported by multiple layers of preliminary evidence.
Affiliation(s)
- Joseph D. Romano
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Center of Excellence in Environmental Toxicology, University of Pennsylvania, Philadelphia, PA 19104, USA
- Hai Li
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Columbia Genome Center, Columbia University, New York, NY 10032, USA
- Tanya Napolitano
  - Department of Chemistry, CUNY Hunter College, New York, NY 10032, USA
  - The PhD Program in Biochemistry, Graduate Center of the City University of New York, New York, NY 10016, USA
- Ronald Realubit
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Columbia Genome Center, Columbia University, New York, NY 10032, USA
- Charles Karan
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Columbia Genome Center, Columbia University, New York, NY 10032, USA
- Mandë Holford
  - Department of Chemistry, CUNY Hunter College, New York, NY 10032, USA
  - The PhD Program in Biochemistry, Graduate Center of the City University of New York, New York, NY 10016, USA
  - The PhD Program in Chemistry, Graduate Center of the City University of New York, New York, NY 10016, USA
  - The PhD Program in Biology, Graduate Center of the City University of New York, New York, NY 10016, USA
  - Department of Invertebrate Zoology, The American Museum of Natural History, New York, NY 10032, USA
- Nicholas P. Tatonetti
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90069, USA
6
Dioun S, Chen L, Hillyer G, Tatonetti NP, May BL, Melamed A, Wright JD. Association between neighborhood socioeconomic status, built environment and SARS-CoV-2 infection among cancer patients treated at a Tertiary Cancer Center in New York City. Cancer Rep (Hoboken) 2023; 6:e1714. [PMID: 36307215 PMCID: PMC9874553 DOI: 10.1002/cnr2.1714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Revised: 08/03/2022] [Accepted: 08/17/2022] [Indexed: 11/05/2022] Open
Abstract
BACKGROUND: Racial and ethnic minority groups experience a disproportionate burden of SARS-CoV-2 illness, and studies suggest that cancer patients are at particular risk for severe SARS-CoV-2 infection.
AIMS: The objective of this study was to examine the association between neighborhood characteristics and SARS-CoV-2 infection among patients with cancer.
METHODS AND RESULTS: We performed a cross-sectional study of New York City residents receiving treatment for cancer at a tertiary cancer center. Patients were linked by their address to data from the US Census Bureau's American Community Survey and to real estate tax data from New York's Department of City Planning. Models were used both to estimate odds ratios (ORs) per unit increase and to predict probabilities (and 95% CI) of SARS-CoV-2 infection. We identified 2350 New York City residents with cancer receiving treatment. Overall, 214 (9.1%) were infected with SARS-CoV-2. In adjusted models, the percentage of Hispanic/Latino population (aOR = 1.01; 95% CI, 1.005-1.02), unemployment rate (aOR = 1.10; 95% CI, 1.05-1.16), poverty rates (aOR = 1.02; 95% CI, 1.0002-1.03), rate of >1 person per room (aOR = 1.04; 95% CI, 1.01-1.07), average household size (aOR = 1.79; 95% CI, 1.23-2.59) and population density (aOR = 1.86; 95% CI, 1.27-2.72) were associated with SARS-CoV-2 infection.
CONCLUSION: Among cancer patients in New York City receiving anti-cancer therapy, SARS-CoV-2 infection was associated with neighborhood- and building-level markers of larger household membership, household crowding, and low socioeconomic status.
NOVELTY AND IMPACT: We performed a cross-sectional analysis of residents of New York City receiving treatment for cancer in which we linked subjects to census and real estate data. This linkage is a novel way to examine the neighborhood characteristics that influence SARS-CoV-2 infection. We found that among patients receiving anti-cancer therapy, SARS-CoV-2 infection was associated with building- and neighborhood-level markers of household crowding, larger household membership, and low socioeconomic status. With ongoing surges of SARS-CoV-2 infections, these data may help in the development of interventions to decrease the morbidity and mortality associated with SARS-CoV-2 among cancer patients.
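To make the "per unit increase" odds ratios above easier to interpret, the worked example below scales one of them to a larger change in the predictor. It assumes a logistic model that is linear in the predictor on the log-odds scale and that one unit of unemployment rate is one percentage point; neither assumption is stated in the abstract, and the five-point change is purely illustrative.

```latex
% Under a logistic model, an adjusted odds ratio per unit increase compounds
% multiplicatively over a change of \Delta units:
\mathrm{OR}(\Delta) = \mathrm{aOR}^{\Delta}

% Illustration with the reported unemployment-rate estimate (aOR = 1.10)
% and an assumed 5-unit difference between two neighborhoods:
\mathrm{OR}(5) = 1.10^{5} \approx 1.61
```

Read this way, a small per-unit odds ratio can still correspond to a substantial difference in odds between neighborhoods that differ by several units.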
Affiliation(s)
- Shayan Dioun
  - Columbia University College of Physicians and Surgeons, New York, New York, USA
  - New York Presbyterian Hospital, New York, New York, USA
- Ling Chen
  - Columbia University College of Physicians and Surgeons, New York, New York, USA
- Grace Hillyer
  - Mailman School of Public Health, Columbia University, New York, New York, USA
  - Herbert Irving Comprehensive Cancer Center, New York, New York, USA
- Nicholas P. Tatonetti
  - Columbia University College of Physicians and Surgeons, New York, New York, USA
  - Herbert Irving Comprehensive Cancer Center, New York, New York, USA
- Benjamin L. May
  - Columbia University College of Physicians and Surgeons, New York, New York, USA
- Alexander Melamed
  - Columbia University College of Physicians and Surgeons, New York, New York, USA
  - New York Presbyterian Hospital, New York, New York, USA
  - Herbert Irving Comprehensive Cancer Center, New York, New York, USA
- Jason D. Wright
  - Columbia University College of Physicians and Surgeons, New York, New York, USA
  - New York Presbyterian Hospital, New York, New York, USA
  - Herbert Irving Comprehensive Cancer Center, New York, New York, USA
7
Park J, Artin MG, Lee KE, May BL, Park M, Hur C, Tatonetti NP. Structured deep embedding model to generate composite clinical indices from electronic health records for early detection of pancreatic cancer. Patterns (N Y) 2023; 4:100636. [PMID: 36699740 PMCID: PMC9868652 DOI: 10.1016/j.patter.2022.100636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 08/18/2022] [Accepted: 10/24/2022] [Indexed: 12/12/2022]
Abstract
The high-dimensionality, complexity, and irregularity of electronic health records (EHR) data create significant challenges for both simplified and comprehensive health assessments, prohibiting an efficient extraction of actionable insights by clinicians. If we can provide human decision-makers with a simplified set of interpretable composite indices (i.e., combining information about groups of related measures into single representative values), it will facilitate effective clinical decision-making. In this study, we built a structured deep embedding model aimed at reducing the dimensionality of the input variables by grouping related measurements as determined by domain experts (e.g., clinicians). Our results suggest that composite indices representing liver function may consistently be the most important factor in the early detection of pancreatic cancer (PC). We propose our model as a basis for leveraging deep learning toward developing composite indices from EHR for predicting health outcomes, including but not limited to various cancers, with clinically meaningful interpretations.
Affiliation(s)
- Jiheum Park
  - Department of Medicine, Columbia University Irving Medical Center, New York, NY 10032, USA
- Michael G. Artin
  - Hospital of the University of Pennsylvania, Philadelphia, PA 19104, USA
- Kate E. Lee
  - Duke University Medical Center, Durham, NC 27710, USA
- Benjamin L. May
  - Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY 10032, USA
- Michael Park
  - Applied Info Partners, Inc, Worlds Fair Drive, Somerset, NJ 08873, USA
  - X-Mechanics, Cresskill, NJ 07626, USA
- Chin Hur
  - Department of Medicine, Columbia University Irving Medical Center, New York, NY 10032, USA
8
Biswas S, Shahriar S, Giangreco NP, Arvanitis P, Winkler M, Tatonetti NP, Brunken WJ, Cutforth T, Agalliu D. Mural Wnt/β-catenin signaling regulates Lama2 expression to promote neurovascular unit maturation. Development 2022; 149:dev200610. [PMID: 36098369 PMCID: PMC9578690 DOI: 10.1242/dev.200610] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/03/2022] [Accepted: 08/04/2022] [Indexed: 11/20/2022]
Abstract
Neurovascular unit and barrier maturation rely on vascular basement membrane (vBM) composition. Laminins, a major vBM component, are crucial for these processes, yet the signaling pathway(s) that regulate their expression remain unknown. Here, we show that mural cells have active Wnt/β-catenin signaling during central nervous system development in mice. Bulk RNA sequencing and validation using postnatal day 10 and 14 wild-type versus adenomatosis polyposis coli downregulated 1 (Apcdd1-/-) mouse retinas revealed that Lama2 mRNA and protein levels are increased in mutant vasculature with higher Wnt/β-catenin signaling. Mural cells are the main source of Lama2, and Wnt/β-catenin activation induces Lama2 expression in mural cells in vitro. Markers of mature astrocytes, including aquaporin 4 (a water channel in astrocyte endfeet) and integrin-α6 (a laminin receptor), are upregulated in Apcdd1-/- retinas with higher Lama2 vBM deposition. Thus, the Wnt/β-catenin pathway regulates Lama2 expression in mural cells to promote neurovascular unit and barrier maturation.
Affiliation(s)
- Saptarshi Biswas
  - Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA
- Sanjid Shahriar
  - Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA
  - Department of Pathology & Cell Biology, Columbia University Irving Medical Center, New York, NY 10032, USA
- Nicholas P. Giangreco
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY 10032, USA
- Panos Arvanitis
  - Department of Biomedical Engineering, Columbia University Irving Medical Center, New York, NY 10032, USA
- Markus Winkler
  - Faculty of Medicine, Institute of Anatomy, Ludwig-Maximilians Universität, Munich 80336, Germany
- Nicholas P. Tatonetti
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY 10032, USA
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY 10032, USA
- William J. Brunken
  - Department of Ophthalmology & Visual Sciences, SUNY Upstate Medical University, Syracuse, NY 13210, USA
- Tyler Cutforth
  - Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA
- Dritan Agalliu
  - Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA
  - Department of Pathology & Cell Biology, Columbia University Irving Medical Center, New York, NY 10032, USA
9
Giangreco NP, Lebreton G, Restaino S, Farr M, Zorn E, Colombo PC, Patel J, Soni RK, Leprince P, Kobashigawa J, Tatonetti NP, Fine BM. Alterations in the kallikrein-kinin system predict death after heart transplant. Sci Rep 2022; 12:14167. [PMID: 35986069 PMCID: PMC9391369 DOI: 10.1038/s41598-022-18573-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Accepted: 08/16/2022] [Indexed: 11/09/2022] Open
Abstract
Heart transplantation remains the definitive treatment for end-stage heart failure. Because availability is limited, risk stratification of candidates is crucial for optimizing both organ allocations and transplant outcomes. Here we utilize proteomics prior to transplant to identify new biomarkers that predict post-transplant survival in a multi-institutional cohort. Microvesicles were isolated from serum samples and underwent proteomic analysis using mass spectrometry. Monte Carlo cross-validation (MCCV) was used to predict survival after transplant incorporating select recipient pre-transplant clinical characteristics and serum microvesicle proteomic data. We identified six protein markers with prediction performance above AUROC of 0.6, including Prothrombin (F2), anti-plasmin (SERPINF2), Factor IX, carboxypeptidase 2 (CPB2), HGF activator (HGFAC) and low molecular weight kininogen (LK). No clinical characteristics demonstrated an AUROC > 0.6. Putative biological functions and pathways were assessed using gene set enrichment analysis (GSEA). Differential expression analysis identified enriched pathways prior to transplant that were associated with post-transplant survival, including activation of platelets and the coagulation pathway. Specifically, upregulation of coagulation cascade components of the kallikrein-kinin system (KKS) and downregulation of kininogen prior to transplant were associated with survival after transplant. Further prospective studies are warranted to determine if alterations in the KKS contribute to overall post-transplant survival.
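Monte Carlo cross-validation, as named in the abstract, repeatedly draws random train/test splits and averages a performance metric over them. The sketch below shows the idea for a single marker scored by AUROC. It is an illustration under assumptions, not the authors' pipeline: the "model" here only learns the direction of association on the training split, and the function names, split fraction, and number of repeats are hypothetical.

```python
import random


def auroc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def mccv_auroc(values, labels, n_splits=200, test_fraction=0.3, seed=0):
    """Average held-out AUROC of one marker over repeated random splits."""
    rng = random.Random(seed)
    indices = list(range(len(values)))
    n_test = max(2, int(round(test_fraction * len(indices))))
    held_out = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        train_labels = [labels[i] for i in train]
        test_labels = [labels[i] for i in test]
        # Skip splits where either side lacks one of the two classes.
        if len(set(train_labels)) < 2 or len(set(test_labels)) < 2:
            continue
        # Placeholder "fit": learn only whether higher values predict the event.
        train_auc = auroc([values[i] for i in train], train_labels)
        sign = 1.0 if train_auc >= 0.5 else -1.0
        held_out.append(auroc([sign * values[i] for i in test], test_labels))
    if not held_out:
        raise ValueError("no usable splits; check class balance")
    return sum(held_out) / len(held_out), len(held_out)


if __name__ == "__main__":
    # Toy marker values and survival labels; purely illustrative.
    marker = [0.2, 0.4, 0.1, 0.9, 0.8, 0.5, 0.7, 0.3, 0.6, 0.95]
    died = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1]
    print(mccv_auroc(marker, died))
```

Running the same routine per marker and keeping those whose mean held-out AUROC exceeds a cutoff such as 0.6 would resemble the screening described above, with the caveat that a real analysis would fit a proper model on each training split.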
Affiliation(s)
- Nicholas P Giangreco
  - Departments of Systems Biology, Biomedical Informatics, and Medicine, Columbia University, New York, NY, USA
- Guillaume Lebreton
  - Chirurgie Thoracique et Cardiovasculaire, Pitié-Salpêtrière University Hospital, Paris, France
- Susan Restaino
  - Department of Medicine, Division of Cardiology, Columbia University Irving Medical Center, New York, NY, USA
- Maryjane Farr
  - Department of Medicine, Division of Cardiology, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Emmanuel Zorn
  - Center for Translational Immunology, Columbia University Irving Medical Center, New York, NY, USA
- Paolo C Colombo
  - Department of Medicine, Division of Cardiology, Columbia University Irving Medical Center, New York, NY, USA
- Jignesh Patel
  - Cedars-Sinai Heart Institute, Cedars Sinai Medical Center, Los Angeles, CA, USA
- Rajesh Kumar Soni
  - Proteomics and Macromolecular Crystallography Shared Resource, Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, USA
- Pascal Leprince
  - Chirurgie Thoracique et Cardiovasculaire, Pitié-Salpêtrière University Hospital, Paris, France
- Jon Kobashigawa
  - Cedars-Sinai Heart Institute, Cedars Sinai Medical Center, Los Angeles, CA, USA
- Nicholas P Tatonetti
  - Departments of Systems Biology, Biomedical Informatics, and Medicine, Columbia University, New York, NY, USA
  - Institute for Genomic Medicine, Columbia University, New York, NY, USA
- Barry M Fine
  - Department of Medicine, Division of Cardiology, Columbia University Irving Medical Center, New York, NY, USA
10
Giangreco NP, Tatonetti NP. A database of pediatric drug effects to evaluate ontogenic mechanisms from child growth and development. Med 2022; 3:579-595.e7. [PMID: 35752163 PMCID: PMC9378670 DOI: 10.1016/j.medj.2022.06.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/30/2021] [Revised: 11/30/2021] [Accepted: 05/26/2022] [Indexed: 11/29/2022]
Abstract
BACKGROUND: Adverse drug effects (ADEs) in children are common and may result in disability and death, necessitating post-marketing monitoring of drug use. Evaluating drug safety is especially challenging in children due to the processes of growth and maturation, which can alter how children respond to treatment. Current drug safety-signal-detection methods do not account for these dynamics.
METHODS: We recently developed a method called disproportionality generalized additive models (dGAMs) to better identify safety signals for drugs across child-development stages.
FINDINGS: We used dGAMs on a database of 264,453 pediatric adverse-event reports and found 19,438 ADE signals associated with development and validated these signals against a small reference set of pediatric ADEs. Using our approach, we can hypothesize on the ontogenic dynamics of ADE signals, such as that montelukast-induced psychiatric disorders appear most significant in the second year of life. Additionally, we integrated pediatric enzyme expression data and found that pharmacogenes with dynamic childhood expression, such as CYP2C18 and CYP27B1, are associated with pediatric ADEs.
CONCLUSIONS: We curated KidSIDES, a database of pediatric drug safety signals, for the research community and developed the Pediatric Drug Safety portal (PDSportal) to facilitate evaluation of drug safety signals across childhood growth and development.
FUNDING: This study was supported by grants from the National Institutes of Health (NIH).
Affiliation(s)
- Nicholas P Giangreco
- Departments of Systems Biology and Biomedical Informatics, Columbia University, 622 W. 168th Street, New York, NY 10032, USA
| | - Nicholas P Tatonetti
- Departments of Systems Biology and Biomedical Informatics, Columbia University, 622 W. 168th Street, New York, NY 10032, USA
|
11
|
Park J, Artin MG, Lee KE, Pumpalova YS, Ingram MA, May BL, Park M, Hur C, Tatonetti NP. Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer. J Biomed Inform 2022; 131:104095. [PMID: 35598881 PMCID: PMC10286873 DOI: 10.1016/j.jbi.2022.104095] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/21/2022] [Revised: 04/04/2022] [Accepted: 05/16/2022] [Indexed: 11/26/2022]
Abstract
The multi-modal and unstructured nature of observational data in Electronic Health Records (EHR) is currently a significant obstacle for the application of machine learning towards risk stratification. In this study, we develop a deep learning framework for incorporating longitudinal clinical data from EHR to infer risk for pancreatic cancer (PC). This framework includes a novel training protocol, which enforces an emphasis on early detection by applying an independent Poisson-random mask on proximal-time measurements for each variable. Data fusion for irregular multivariate time-series features is enabled by a "grouped" neural network (GrpNN) architecture, which uses representation learning to generate a dimensionally reduced vector for each measurement set before making a final prediction. These models were evaluated using EHR data from Columbia University Irving Medical Center-New York Presbyterian Hospital. Our framework demonstrated better performance on early detection (AUROC 0.671, 95% CI 0.667 - 0.675, p < 0.001) at 12 months prior to diagnosis compared to logistic regression, XGBoost, and feedforward neural network baselines. We demonstrate that our masking strategy results in greater improvements at distal times prior to diagnosis, and that our GrpNN model improves generalizability by reducing overfitting relative to the feedforward baseline. The results were consistent across reported race. Our proposed algorithm is potentially generalizable to other diseases, including but not limited to cancer, where early detection can improve survival.
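The masking idea in this abstract can be illustrated with a minimal sketch. The abstract does not specify how the mask is constructed, so everything below is an assumption: the cutoff is drawn per variable from a Poisson distribution (in months before diagnosis), measurements closer to diagnosis than the cutoff are hidden, and the function names, variable names, and values are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def mask_proximal(series, lam, rng):
    """Hide measurements taken closer to diagnosis than a random cutoff.

    series: dict mapping a variable name to a list of
            (months_before_diagnosis, value) pairs.
    The cutoff is drawn independently for each variable (an assumption
    about what "independent Poisson-random mask" means).
    """
    masked = {}
    for name, measurements in series.items():
        cutoff = sample_poisson(lam, rng)
        masked[name] = [(t, v) for t, v in measurements if t >= cutoff]
    return masked


# Hypothetical lab series; names and values are illustrative only.
series = {
    "glucose": [(24, 5.1), (12, 5.6), (6, 6.0), (1, 6.8)],
    "alt": [(18, 30.0), (3, 41.0)],
}
masked = mask_proximal(series, lam=6, rng=random.Random(42))
```

Under these assumptions, a model trained on `masked` rather than `series` sees fewer measurements from just before diagnosis, which is one way a training protocol could push it toward earlier signals.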
Affiliation(s)
- Jiheum Park
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
| | - Michael G Artin
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
| | - Kate E Lee
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
| | - Yoanna S Pumpalova
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
| | - Myles A Ingram
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States
| | - Benjamin L May
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, United States
| | - Michael Park
- Applied Info Partners Inc, Worlds Fair Drive, Somerset, NJ, United States; X-Mechanics LLC, Cresskill, NJ, United States
| | - Chin Hur
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States.
| | - Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, NY, United States
|
12
|
Shen TH, Stauber J, Xu K, Jacunski A, Paragas N, Callahan M, Banlengchit R, Levitman AD, Desanti De Oliveira B, Beenken A, Grau MS, Mathieu E, Zhang Q, Li Y, Gopal T, Askanase N, Arumugam S, Mohan S, Good PI, Stevens JS, Lin F, Sia SK, Lin CS, D’Agati V, Kiryluk K, Tatonetti NP, Barasch J. Snapshots of nascent RNA reveal cell- and stimulus-specific responses to acute kidney injury. JCI Insight 2022; 7:e146374. [PMID: 35230973 PMCID: PMC8986083 DOI: 10.1172/jci.insight.146374] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022] Open
Abstract
The current strategy to detect acute injury of kidney tubular cells relies on changes in serum levels of creatinine. Yet serum creatinine (sCr) is a marker of both functional and pathological processes and does not adequately assay tubular injury. In addition, sCr may require days to reach diagnostic thresholds, yet tubular cells respond with programs of damage and repair within minutes or hours. To detect acute responses to clinically relevant stimuli, we created mice expressing Rosa26-floxed-stop uracil phosphoribosyltransferase (Uprt) and inoculated 4-thiouracil (4-TU) to tag nascent RNA at selected time points. Cre-driven 4-TU-tagged RNA was isolated from intact kidneys and demonstrated that volume depletion and ischemia induced different genetic programs in collecting ducts and intercalated cells. Even lineage-related cell types expressed different genes in response to the 2 stressors. TU tagging also demonstrated the transient nature of the responses. Because we placed Uprt in the ubiquitously active Rosa26 locus, nascent RNAs from many cell types can be tagged in vivo and their roles interrogated under various conditions. In short, 4-TU labeling identifies stimulus-specific, cell-specific, and time-dependent acute responses that are otherwise difficult to detect with other technologies and are entirely obscured when sCr is the sole metric of kidney damage.
Affiliation(s)
- Alexandra Jacunski
- Department of Biomedical Informatics, Columbia University, New York, New York, USA
| | - Neal Paragas
- Department of Medicine, University of Washington, Seattle, Washington, USA
| | - Sumit Mohan
- Department of Medicine, and
- Department of Epidemiology
| | - Chyuan-Sheng Lin
- Department of Pathology and Cell Biology, Columbia University, New York, New York, USA
| | - Vivette D’Agati
- Department of Pathology and Cell Biology, Columbia University, New York, New York, USA
|
13
|
Park J, Foox J, Hether T, Danko DC, Warren S, Kim Y, Reeves J, Butler DJ, Mozsary C, Rosiene J, Shaiber A, Afshin EE, MacKay M, Rendeiro AF, Bram Y, Chandar V, Geiger H, Craney A, Velu P, Melnick AM, Hajirasouliha I, Beheshti A, Taylor D, Saravia-Butler A, Singh U, Wurtele ES, Schisler J, Fennessey S, Corvelo A, Zody MC, Germer S, Salvatore S, Levy S, Wu S, Tatonetti NP, Shapira S, Salvatore M, Westblade LF, Cushing M, Rennert H, Kriegel AJ, Elemento O, Imielinski M, Rice CM, Borczuk AC, Meydan C, Schwartz RE, Mason CE. System-wide transcriptome damage and tissue identity loss in COVID-19 patients. Cell Rep Med 2022; 3:100522. [PMID: 35233546 PMCID: PMC8784611 DOI: 10.1016/j.xcrm.2022.100522] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/07/2021] [Revised: 12/22/2021] [Accepted: 01/16/2022] [Indexed: 01/07/2023]
Abstract
The molecular mechanisms underlying the clinical manifestations of coronavirus disease 2019 (COVID-19), and what distinguishes them from common seasonal influenza virus and other lung injury states such as acute respiratory distress syndrome, remain poorly understood. To address these challenges, we combine transcriptional profiling of 646 clinical nasopharyngeal swabs and 39 patient autopsy tissues to define body-wide transcriptome changes in response to COVID-19. We then match these data with spatial protein and expression profiling across 357 tissue sections from 16 representative patient lung samples and identify tissue-compartment-specific damage wrought by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, evident as a function of varying viral loads during the clinical course of infection and tissue-type-specific expression states. Overall, our findings reveal a systemic disruption of canonical cellular and transcriptional pathways across all tissues, which can inform subsequent studies to combat the mortality of COVID-19 and to better understand the molecular dynamics of lethal SARS-CoV-2 and other respiratory infections.
Affiliation(s)
- Jiwoon Park
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY 10065, USA
| | - Jonathan Foox
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - David C. Danko
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medicine, New York, NY, USA
| | - Youngmi Kim
- NanoString Technologies, Inc., Seattle, WA, USA
| | - Daniel J. Butler
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
| | - Christopher Mozsary
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
| | - Joel Rosiene
- New York Genome Center, New York, NY, USA
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Alon Shaiber
- New York Genome Center, New York, NY, USA
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Evan E. Afshin
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Matthew MacKay
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
| | - André F. Rendeiro
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine and the Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Yaron Bram
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Arryn Craney
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Priya Velu
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Ari M. Melnick
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Iman Hajirasouliha
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine and the Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Afshin Beheshti
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Deanne Taylor
- Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Amanda Saravia-Butler
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA, USA
- Logyx, LLC, Mountain View, CA, USA
| | - Urminder Singh
- Bioinformatics and Computational Biology Program, Center for Metabolic Biology, Department of Genetics, Development and Cell Biology Iowa State University, Ames, IA, USA
| | - Eve Syrkin Wurtele
- Bioinformatics and Computational Biology Program, Center for Metabolic Biology, Department of Genetics, Development and Cell Biology Iowa State University, Ames, IA, USA
| | - Jonathan Schisler
- McAllister Heart Institute at The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Pharmacology, and Department of Pathology and Lab Medicine at The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Steven Salvatore
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Shawn Levy
- HudsonAlpha Discovery Institute, Huntsville, AL, USA
| | - Shixiu Wu
- Hangzhou Cancer Institute, Hangzhou Cancer Hospital, Hangzhou, China
- Department of Radiation Oncology, Hangzhou Cancer Hospital, Hangzhou, China
| | - Nicholas P. Tatonetti
- Department of Biomedical Informatics, Department of Systems Biology, Department of Medicine, Institute for Genomic Medicine, Columbia University, New York, NY, USA
| | - Sagi Shapira
- Department of Biomedical Informatics, Department of Systems Biology, Department of Medicine, Institute for Genomic Medicine, Columbia University, New York, NY, USA
| | - Mirella Salvatore
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
| | - Lars F. Westblade
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Melissa Cushing
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Hanna Rennert
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Alison J. Kriegel
- Department of Physiology, Cardiovascular Center, Center of Systems Molecular Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Olivier Elemento
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medicine, New York, NY, USA
- Englander Institute for Precision Medicine and the Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA
| | - Marcin Imielinski
- New York Genome Center, New York, NY, USA
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Charles M. Rice
- Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY 10065, USA
| | - Alain C. Borczuk
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Cem Meydan
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
| | - Robert E. Schwartz
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA
| | - Christopher E. Mason
- Department of Physiology, Biophysics and Systems Biology, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- New York Genome Center, New York, NY, USA
- The Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA
|
14
|
Koleck TA, Topaz M, Tatonetti NP, George M, Miaskowski C, Smaldone A, Bakken S. Characterizing shared and distinct symptom clusters in common chronic conditions through natural language processing of nursing notes. Res Nurs Health 2021; 44:906-919. [PMID: 34637147 PMCID: PMC8641786 DOI: 10.1002/nur.22190] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/17/2021] [Revised: 09/14/2021] [Accepted: 09/21/2021] [Indexed: 01/08/2023]
Abstract
Data-driven characterization of symptom clusters in chronic conditions is essential for shared cluster detection and physiological mechanism discovery. This study aims to computationally describe symptom documentation from electronic nursing notes and compare symptom clusters among patients diagnosed with four chronic conditions: chronic obstructive pulmonary disease (COPD), heart failure, type 2 diabetes mellitus, and cancer. Nursing notes (N = 504,395; 133,977 patients) were obtained for the 2016 calendar year from a single medical center. We used NimbleMiner, a natural language processing application, to identify the presence of 56 symptoms. We calculated symptom documentation prevalence by note and patient for the corpus. Then, we visually compared documentation for a subset of patients (N = 22,657) diagnosed with COPD (n = 3339), heart failure (n = 6587), diabetes (n = 12,139), and cancer (n = 7269) and conducted multiple correspondence analysis and hierarchical clustering to discover underlying groups of patients who have similar symptom profiles (i.e., symptom clusters) for each condition. As expected, pain was the most frequently documented symptom. All conditions had a group of patients characterized by no symptoms. Shared clusters included cardiovascular symptoms for heart failure and diabetes; pain and other symptoms for COPD, diabetes, and cancer; and a newly identified cognitive and neurological symptom cluster for heart failure, diabetes, and cancer. Cancer (gastrointestinal symptoms and fatigue) and COPD (mental health symptoms) each contained a unique cluster. In summary, we report both shared and distinct, as well as established and novel, symptom clusters across chronic conditions. Findings support the use of electronic health record-derived notes and NLP methods to study symptoms and symptom clusters to advance symptom science.
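The distinction between prevalence by note and prevalence by patient can be made concrete with a small sketch. It assumes the NLP step has already produced, for each note, a patient identifier and the set of symptoms found; the data layout, function name, and toy records are hypothetical and are not NimbleMiner output.

```python
from collections import defaultdict

# Hypothetical NLP output: one (patient_id, symptoms found in that note) pair per note.
notes = [
    ("p1", {"pain", "fatigue"}),
    ("p1", {"pain"}),
    ("p2", set()),
    ("p3", {"nausea"}),
]


def documentation_prevalence(notes):
    """Return (by_note, by_patient) prevalence dictionaries keyed by symptom."""
    note_counts = defaultdict(int)
    patient_symptoms = defaultdict(set)
    for patient_id, symptoms in notes:
        # A patient counts for a symptom if any of their notes mentions it.
        patient_symptoms[patient_id] |= symptoms
        for symptom in symptoms:
            note_counts[symptom] += 1

    patient_counts = defaultdict(int)
    for symptoms in patient_symptoms.values():
        for symptom in symptoms:
            patient_counts[symptom] += 1

    n_notes, n_patients = len(notes), len(patient_symptoms)
    by_note = {s: c / n_notes for s, c in note_counts.items()}
    by_patient = {s: c / n_patients for s, c in patient_counts.items()}
    return by_note, by_patient


by_note, by_patient = documentation_prevalence(notes)
```

In this toy corpus, pain appears in half of the notes but in only one of three patients, which shows why the two denominators can give quite different pictures of the same symptom.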
Affiliation(s)
- Theresa A. Koleck
- School of Nursing, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Maxim Topaz
- School of Nursing, Columbia University, New York, New York
- Data Science Institute, Columbia University, New York, New York
| | - Nicholas P. Tatonetti
- Data Science Institute, Columbia University, New York, New York
- Department of Biomedical Informatics, Columbia University, New York, New York
- Department of Systems Biology, Columbia University, New York, New York
- Department of Medicine, Columbia University, New York, New York
- Institute for Genomic Medicine, Columbia University, New York, New York
| | - Maureen George
- School of Nursing, Columbia University, New York, New York
| | - Christine Miaskowski
- School of Nursing, University of California San Francisco, San Francisco, California
| | - Arlene Smaldone
- School of Nursing, Columbia University, New York, New York
- College of Dental Medicine, Columbia University, New York, New York
| | - Suzanne Bakken
- School of Nursing, Columbia University, New York, New York
- Data Science Institute, Columbia University, New York, New York
- Department of Biomedical Informatics, Columbia University, New York, New York
|
15
|
Giangreco NP, Lina S, Qian J, Kouame A, Subbian V, Boerwinkle E, Cicek M, Clark CR, Cohen E, Gebo KA, Loperena-Cortes R, Mayo K, Mockrin S, Ohno-Machado L, Schully SD, Tatonetti NP, Ramirez AH. Pediatric data from the All of Us research program: demonstration of pediatric obesity over time. JAMIA Open 2021; 4:ooab112. [PMID: 35155998 PMCID: PMC8827025 DOI: 10.1093/jamiaopen/ooab112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Revised: 11/17/2021] [Accepted: 12/15/2021] [Indexed: 11/12/2022] Open
Abstract
OBJECTIVE To describe and demonstrate use of pediatric data collected by the All of Us Research Program. MATERIALS AND METHODS All of Us participant physical measurements and electronic health record (EHR) data were analyzed including investigation of trends in childhood obesity and correlation with adult body mass index (BMI). RESULTS We identified 19 729 participants with legacy pediatric EHR data including diagnoses, prescriptions, visits, procedures, and measurements gathered since 1980. We found an increase in pediatric obesity diagnosis over time that correlates with BMI measurements recorded in participants' adult EHRs and those physical measurements taken at enrollment in the research program. DISCUSSION We highlight the availability of retrospective pediatric EHR data for nearly 20 000 All of Us participants. These data are relevant to current issues such as the rise in pediatric obesity. CONCLUSION All of Us contains a rich resource of retrospective pediatric EHR data to accelerate pediatric research studies.
Affiliation(s)
- Nicholas P Giangreco
- Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Department of Systems Biology, Columbia University, New York, New York, USA
| | - Sulieman Lina
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Jun Qian
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Aymone Kouame
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Vignesh Subbian
- Department of Biomedical Engineering, The University of Arizona, Tucson, Arizona, USA
- Department of Systems & Industrial Engineering, The University of Arizona, Tucson, Arizona, USA
| | - Eric Boerwinkle
- School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Mine Cicek
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota, USA
| | - Cheryl R Clark
- Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts, USA
| | - Elizabeth Cohen
- Hunter-Bellevue School of Nursing, Hunter College City University of New York, New York, New York, USA
| | - Kelly A Gebo
- Bloomberg School of Public Health, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
| | - Roxana Loperena-Cortes
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Kelsey Mayo
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA
| | - Stephen Mockrin
- All of Us Research Program, National Institutes of Health, Bethesda, Maryland, USA
- Leidos, Inc, Frederick, Maryland, USA
| | - Lucila Ohno-Machado
- Department of Biomedical Informatics, UCSD Health, La Jolla, California, USA
| | - Sheri D Schully
- All of Us Research Program, National Institutes of Health, Bethesda, Maryland, USA
| | - Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Department of Systems Biology, Columbia University, New York, New York, USA
| | - Andrea H Ramirez
- All of Us Research Program, National Institutes of Health, Bethesda, Maryland, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
|
16
|
Abstract
OBJECTIVES Provide an overview of the emerging themes and notable papers which were published in 2020 in the field of Bioinformatics and Translational Informatics (BTI) for the International Medical Informatics Association Yearbook. METHODS A team of 16 individuals scanned the literature from the past year. Using a scoring rubric, papers were evaluated on their novelty, importance, and objective quality. 1,224 Medical Subject Headings (MeSH) terms extracted from these papers were used to identify themes and research focuses. The authors then used the scoring results to select notable papers and trends presented in this manuscript. RESULTS The search phase identified 263 potential papers, and central themes of coronavirus disease 2019 (COVID-19), machine learning, and bioinformatics were examined in greater detail. CONCLUSIONS When addressing a once-in-a-century pandemic, scientists worldwide answered the call, with informaticians playing a critical role. Productivity and innovations reached new heights in both BTI and science, but significant research gaps remain.
Affiliation(s)
- Scott P. McGrath
- CITRIS Health, University of California Berkeley, Berkeley, CA, USA
| | - Maryam Tavakoli
- MTERMS Lab, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
|
17
|
Giangreco NP, Tatonetti NP. Evaluating risk detection methods to uncover ontogenic-mediated adverse drug effect mechanisms in children. BioData Min 2021; 14:34. [PMID: 34294093 PMCID: PMC8296590 DOI: 10.1186/s13040-021-00264-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/26/2021] [Accepted: 06/16/2021] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Identifying adverse drug effects (ADEs) in children, overall and within pediatric age groups, is essential for preventing disability and death from marketed drugs. At the same time, however, detection is challenging due to dynamic biological processes during growth and maturation, called ontogeny, that alter pharmacokinetics and pharmacodynamics. As a result, methodologies in pediatric drug safety have been limited to event surveillance and have not focused on investigating adverse event mechanisms. There is an opportunity to identify drug event patterns within observational databases for evaluating ontogenic-mediated adverse event mechanisms. The first step is to establish statistical models that can identify temporal trends of adverse effects across childhood. RESULTS Using simulation, we evaluated a population stratification method (the proportional reporting ratio or PRR) and a population modeling method (the generalized additive model or GAM) to identify and quantify ADE risk at varying reporting rates and dynamics. We found that GAMs showed improved performance over the PRR in detecting dynamic drug event reporting across child development stages. Moreover, GAMs exhibited normally distributed and robust ADE risk estimation at all development stages by sharing information across child development stages. CONCLUSIONS Our study underscores the opportunity for using population modeling techniques, which leverage drug event reporting across development stages, as biologically-inspired detection methods for evaluating ontogenic mechanisms.
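For readers unfamiliar with the stratified baseline, the sketch below computes the proportional reporting ratio separately within each development stage, using the standard 2x2 definition PRR = [a / (a + b)] / [c / (c + d)]. The stage names and report counts are hypothetical and chosen only for illustration; this is not the authors' simulation code.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of report counts.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports without the drug, with the event
    d: reports with neither the drug nor the event
    """
    drug_rate = a / (a + b)
    other_rate = c / (c + d)
    return drug_rate / other_rate


# Hypothetical counts (a, b, c, d) for each development stage.
stage_counts = {
    "infancy": (4, 96, 50, 4950),
    "childhood": (12, 88, 60, 5940),
    "adolescence": (5, 95, 100, 9900),
}

# Each stage is estimated on its own stratum, with no information shared.
prr_by_stage = {
    stage: proportional_reporting_ratio(*counts)
    for stage, counts in stage_counts.items()
}
```

Because each stratum is estimated independently, sparse stages yield noisy ratios. That is the limitation a model sharing information across stages, such as the GAM described in the abstract, is meant to address.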
Affiliation(s)
- Nicholas P. Giangreco
- Departments of Systems Biology and Biomedical Informatics, Columbia University, 622 W. 168th Street, New York, NY 10032 USA
| | - Nicholas P. Tatonetti
- Departments of Systems Biology and Biomedical Informatics, Columbia University, 622 W. 168th Street, New York, NY 10032 USA
|
18
|
Giangreco NP, Lebreton G, Restaino S, Jane Farr M, Zorn E, Colombo PC, Patel J, Levine R, Truby L, Soni RK, Leprince P, Kobashigawa J, Tatonetti NP, Fine BM. Plasma kallikrein predicts primary graft dysfunction after heart transplant. J Heart Lung Transplant 2021; 40:1199-1211. [PMID: 34330603 DOI: 10.1016/j.healun.2021.07.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/22/2021] [Revised: 06/21/2021] [Accepted: 07/01/2021] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND Primary graft dysfunction (PGD) is the leading cause of early mortality after heart transplant. Pre-transplant predictors of PGD remain elusive and its etiology remains unclear. METHODS Microvesicles were isolated from 88 pre-transplant serum samples and underwent proteomic evaluation using TMT mass spectrometry. Monte Carlo cross validation (MCCV) was used to predict the occurrence of severe PGD after transplant using recipient pre-transplant clinical characteristics and serum microvesicle proteomic data. Putative biological functions and pathways were assessed using gene set enrichment analysis (GSEA) within the MCCV prediction methodology. RESULTS Using our MCCV prediction methodology, decreased levels of plasma kallikrein (KLKB1), a critical regulator of the kinin-kallikrein system, were the most predictive factor identified for PGD (AUROC 0.6444 [0.6293, 0.6655]; odds 0.1959 [0.0592, 0.3663]). Furthermore, a predictive panel combining KLKB1 with inotrope therapy achieved peak performance (AUROC 0.7181 [0.7020, 0.7372]) across and within (AUROCs of 0.66-0.78) each cohort. A classifier utilizing KLKB1 and inotrope therapy outperforms existing composite scores by more than 50 percent. The diagnostic utility of the classifier was validated on 65 consecutive transplant patients, resulting in an AUROC of 0.71 and a negative predictive value of 0.92-0.96. Differential expression analysis revealed an enrichment in inflammatory and immune pathways prior to PGD. CONCLUSIONS Pre-transplant level of KLKB1 is a robust predictor of post-transplant PGD. The combination with pre-transplant inotrope therapy enhances the prediction of PGD compared to pre-transplant KLKB1 levels alone, and the resulting classifier equation validates within a prospective validation cohort. Inflammation and immune pathway enrichment characterize the pre-transplant proteomic signature predictive of PGD.
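Monte Carlo cross validation amounts to repeating a random train/test split many times and summarizing the test metric over the repeats. The sketch below is a generic illustration, not the authors' pipeline: the "model" only learns the direction of association of a single marker on the training split, the AUROC is computed by pairwise comparison, and the data, split fraction, and number of repeats are all hypothetical.

```python
import random
from statistics import mean


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mccv_auroc(marker, labels, n_splits=200, test_fraction=0.2, seed=0):
    """Repeat random train/test splits and return the test AUROC of each usable split."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    n_test = max(1, int(round(test_fraction * len(indices))))
    results = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        train_pos = [marker[i] for i in train if labels[i] == 1]
        train_neg = [marker[i] for i in train if labels[i] == 0]
        test_labels = [labels[i] for i in test]
        # Skip splits in which a class is missing from the train or test side.
        if not train_pos or not train_neg or len(set(test_labels)) < 2:
            continue
        # "Training" here learns only whether higher or lower values indicate risk.
        sign = 1.0 if mean(train_pos) >= mean(train_neg) else -1.0
        test_scores = [sign * marker[i] for i in test]
        results.append(auroc(test_scores, test_labels))
    return results


# Synthetic data: lower marker values tend to accompany the event (label 1).
data_rng = random.Random(1)
labels = [1] * 15 + [0] * 45
marker = [data_rng.gauss(-0.5, 1.0) for _ in range(15)] + [
    data_rng.gauss(0.5, 1.0) for _ in range(45)
]
aurocs = mccv_auroc(marker, labels)
print(f"{len(aurocs)} usable splits, mean AUROC {mean(aurocs):.3f}")
```

The spread of `aurocs` across splits is what supports an interval around the mean, in the spirit of the bracketed ranges reported in the abstract.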
Affiliation(s)
- Nicholas P Giangreco
- Departments of Systems Biology, Biomedical Informatics, and Medicine, Columbia University, New York, New York
| | - Guillaume Lebreton
- Chirurgie Thoracique et Cardiovasculaire, Pitié-Salpêtrière University Hospital, Paris, France
| | - Susan Restaino
- Department of Medicine, Division of Cardiology, Columbia University Irving Medical Center, New York, New York
| | - Mary Jane Farr
- Department of Medicine, Division of Cardiology, Columbia University Irving Medical Center, New York, New York
| | - Emmanuel Zorn
- Center for Translational Immunology, Columbia University Irving Medical Center, New York, New York
| | - Paolo C Colombo
- Department of Medicine, Division of Cardiology, Columbia University Irving Medical Center, New York, New York
| | - Jignesh Patel
- Cedars-Sinai Heart Institute, Cedars Sinai Medical Center, Los Angeles, California
| | - Ryan Levine
- Cedars-Sinai Heart Institute, Cedars Sinai Medical Center, Los Angeles, California
| | - Lauren Truby
- Department of Medicine, Division of Cardiology, Columbia University Irving Medical Center, New York, New York
| | - Rajesh Kumar Soni
- Proteomics and Macromolecular Crystallography Shared Resource, Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, New York
| | - Pascal Leprince
- Chirurgie Thoracique et Cardiovasculaire, Pitié-Salpêtrière University Hospital, Paris, France
| | - Jon Kobashigawa
- Cedars-Sinai Heart Institute, Cedars Sinai Medical Center, Los Angeles, California
| | - Nicholas P Tatonetti
- Departments of Systems Biology, Biomedical Informatics, and Medicine, Columbia University, New York, New York; Institute for Genomic Medicine, Columbia University, New York, New York
| | - Barry M Fine
- Department of Medicine, Division of Cardiology, Columbia University Irving Medical Center, New York, New York.
| |
|
19
|
Kennel PJ, Yahi A, Naka Y, Mancini DM, Marboe CC, Max K, Akat K, Tuschl T, Vasilescu EM, Zorn E, Tatonetti NP, Schulze PC. Longitudinal profiling of circulating miRNA during cardiac allograft rejection: a proof-of-concept study. ESC Heart Fail 2021; 8:1840-1849. [PMID: 33713567 PMCID: PMC8120386 DOI: 10.1002/ehf2.13238] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/17/2020] [Revised: 01/09/2021] [Accepted: 01/19/2021] [Indexed: 12/30/2022] Open
Abstract
AIMS Allograft rejection following heart transplantation (HTx) is a serious complication even in the era of modern immunosuppressive regimens and causes up to a third of early deaths after HTx. Allograft rejection is mediated by a cascade of immune mechanisms leading to acute cellular rejection (ACR) and/or antibody-mediated rejection (AMR). The gold standard for monitoring allograft rejection is invasive endomyocardial biopsy, which exposes patients to complications. Little is known about the potential of circulating miRNAs as biomarkers to detect cardiac allograft rejection. Here we present a systematic analysis of circulating miRNAs as biomarkers and predictors for allograft rejection after HTx using next-generation small RNA sequencing. METHODS AND RESULTS We used next-generation small RNA sequencing to investigate circulating miRNAs among HTx recipients (10 healthy controls, 10 heart failure patients, 13 ACR, and 10 AMR). MiRNA profiling was performed at different time points before, during, and after resolution of the rejection episode. We found three miRNAs with significantly increased serum levels in patients with biopsy-proven cardiac rejection when compared with patients without rejection: hsa-miR-139-5p, hsa-miR-151a-5p, and hsa-miR-186-5p. We identified miRNAs that may serve as potential predictors for the subsequent development of rejection: hsa-miR-29c-3p (ACR) and hsa-miR-486-5p (AMR). Overall, hsa-miR-486-5p was most strongly associated with acute rejection episodes. CONCLUSIONS Monitoring cardiac allograft rejection using circulating miRNAs might represent an alternative strategy to invasive endomyocardial biopsy.
Affiliation(s)
- Peter J. Kennel
- Division of Cardiology, Department of Medicine, Columbia University, New York, NY, USA
- Department of Medicine I, Division of Cardiology, University Hospital of Friedrich Schiller University Jena, Am Klinikum 1, Jena 07747, Germany
| | - Alexandre Yahi
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Medicine, Columbia University, New York, NY, USA
| | | | | | - Charles C. Marboe
- Department of Pathology and Cell Biology, Columbia University, New York, NY, USA
| | - Klaas Max
- Laboratory of RNA Molecular Biology, Rockefeller University, New York, NY, USA
| | - Kemal Akat
- Laboratory of RNA Molecular Biology, Rockefeller University, New York, NY, USA
| | - Thomas Tuschl
- Laboratory of RNA Molecular Biology, Rockefeller University, New York, NY, USA
| | | | - Emmanuel Zorn
- Columbia Center for Translational Immunology, Columbia University, New York, NY, USA
| | - Nicholas P. Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Medicine, Columbia University, New York, NY, USA
| | - Paul Christian Schulze
- Department of Medicine I, Division of Cardiology, University Hospital of Friedrich Schiller University Jena, Am Klinikum 1, Jena 07747, Germany
| |
|
20
|
Koleck TA, Tatonetti NP, Bakken S, Mitha S, Henderson MM, George M, Miaskowski C, Smaldone A, Topaz M. Identifying Symptom Information in Clinical Notes Using Natural Language Processing. Nurs Res 2021; 70:173-183. [PMID: 33196504 PMCID: PMC9109773 DOI: 10.1097/nnr.0000000000000488] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 12/22/2022]
Abstract
BACKGROUND Symptoms are a core concept of nursing interest. Large-scale secondary data reuse of notes in electronic health records (EHRs) has the potential to increase the quantity and quality of symptom research. However, the symptom language used in clinical notes is complex. A need exists for methods designed specifically to identify and study symptom information from EHR notes. OBJECTIVES We aim to describe a method that combines standardized vocabularies, clinical expertise, and natural language processing to generate comprehensive symptom vocabularies and identify symptom information in EHR notes. We piloted this method with five diverse symptom concepts: constipation, depressed mood, disturbed sleep, fatigue, and palpitations. METHODS First, we obtained synonym lists for each pilot symptom concept from the Unified Medical Language System. Then, we used two large bodies of text (clinical notes from Columbia University Irving Medical Center and PubMed abstracts containing Medical Subject Headings or key words related to the pilot symptoms) to further expand our initial vocabulary of synonyms for each pilot symptom concept. We used NimbleMiner, an open-source natural language processing tool, to accomplish these tasks and evaluated NimbleMiner symptom identification performance by comparison to a manually annotated set of nurse- and physician-authored common EHR note types. RESULTS Compared to the baseline Unified Medical Language System synonym lists, we identified up to 11 times more additional synonym words or expressions, including abbreviations, misspellings, and unique multiword combinations, for each symptom concept. Natural language processing system symptom identification performance was excellent. DISCUSSION Using our comprehensive symptom vocabularies and NimbleMiner to label symptoms in clinical notes produced excellent performance metrics. 
The ability to extract symptom information from EHR notes in an accurate and scalable manner has the potential to greatly facilitate symptom science research.
|
21
|
Shang N, Khan A, Polubriaginof F, Zanoni F, Mehl K, Fasel D, Drawz PE, Carrol RJ, Denny JC, Hathcock MA, Arruda-Olson AM, Peissig PL, Dart RA, Brilliant MH, Larson EB, Carrell DS, Pendergrass S, Verma SS, Ritchie MD, Benoit B, Gainer VS, Karlson EW, Gordon AS, Jarvik GP, Stanaway IB, Crosslin DR, Mohan S, Ionita-Laza I, Tatonetti NP, Gharavi AG, Hripcsak G, Weng C, Kiryluk K. Medical records-based chronic kidney disease phenotype for clinical care and "big data" observational and genetic studies. NPJ Digit Med 2021; 4:70. [PMID: 33850243 PMCID: PMC8044136 DOI: 10.1038/s41746-021-00428-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 08/07/2020] [Accepted: 02/25/2021] [Indexed: 12/19/2022] Open
Abstract
Chronic Kidney Disease (CKD) represents a slowly progressive disorder that is typically silent until late stages, but early intervention can significantly delay its progression. We designed a portable and scalable electronic CKD phenotype to facilitate early disease recognition and empower large-scale observational and genetic studies of kidney traits. The algorithm uses a combination of rule-based and machine-learning methods to automatically place patients on the staging grid of albuminuria by glomerular filtration rate ("A-by-G" grid). We manually validated the algorithm by 451 chart reviews across three medical systems, demonstrating overall positive predictive value of 95% for CKD cases and 97% for healthy controls. Independent case-control validation using 2350 patient records demonstrated diagnostic specificity of 97% and sensitivity of 87%. Application of the phenotype to 1.3 million patients demonstrated that over 80% of CKD cases are undetected using ICD codes alone. We also demonstrated several large-scale applications of the phenotype, including identifying stage-specific kidney disease comorbidities, in silico estimation of kidney trait heritability in thousands of pedigrees reconstructed from medical records, and biobank-based multicenter genome-wide and phenome-wide association studies.
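For readers unfamiliar with the "A-by-G" grid, the placement step can be illustrated with a minimal sketch. It assumes the standard KDIGO category cut-offs for estimated GFR (mL/min/1.73 m²) and urine albumin-to-creatinine ratio (mg/g); the function names are hypothetical, and the published phenotype combines rule-based and machine-learning methods that this sketch does not attempt to reproduce.

```python
from typing import Tuple


def gfr_category(egfr: float) -> str:
    """Map eGFR (mL/min/1.73 m^2) to a KDIGO G category."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def albuminuria_category(uacr: float) -> str:
    """Map urine albumin-to-creatinine ratio (mg/g) to a KDIGO A category."""
    if uacr < 30:
        return "A1"
    if uacr <= 300:
        return "A2"
    return "A3"


def a_by_g_cell(egfr: float, uacr: float) -> Tuple[str, str]:
    """Return the (G, A) cell of the staging grid for one measurement pair."""
    return gfr_category(egfr), albuminuria_category(uacr)
```

For example, `a_by_g_cell(52, 120)` places a patient in the G3a/A2 cell. A real phenotype would also need to establish chronicity from repeated measurements, which is outside this sketch.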
Affiliation(s)
- Ning Shang
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Atlas Khan
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Fernanda Polubriaginof
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Francesca Zanoni
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Karla Mehl
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - David Fasel
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Paul E Drawz
- Department of Medicine, University of Minnesota, Minnesota, MN, USA
| | - Robert J Carrol
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
| | - Joshua C Denny
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
- Departments of Medicine, Vanderbilt University, Nashville, TN, USA
| | | | | | | | - Richard A Dart
- Marshfield Clinic Research Institute, Marshfield, WI, USA
| | | | - Eric B Larson
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
| | - David S Carrell
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
| | | | | | | | | | | | | | - Adam S Gordon
- Center for Genetic Medicine, Northwestern University, Chicago, IL, USA
| | - Gail P Jarvik
- Departments of Medicine (Medical Genetics) and Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Ian B Stanaway
- Departments of Medicine (Medical Genetics) and Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - David R Crosslin
- Departments of Medicine (Medical Genetics) and Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA
| | - Sumit Mohan
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Iuliana Ionita-Laza
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA
| | - Nicholas P Tatonetti
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Ali G Gharavi
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - George Hripcsak
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Chunhua Weng
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Krzysztof Kiryluk
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA.
| |
|
22
|
Giangreco NP, Elias JE, Tatonetti NP. No population left behind: Improving paediatric drug safety using informatics and systems biology. Br J Clin Pharmacol 2020; 88:1464-1470. [PMID: 33332641 PMCID: PMC8209126 DOI: 10.1111/bcp.14705] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/23/2020] [Revised: 10/26/2020] [Accepted: 12/05/2020] [Indexed: 12/12/2022] Open
Abstract
Adverse drug effects (ADEs) in children are common and may result in disability and death. The current paediatric drug safety landscape, including clinical trials, is limited as it rarely includes children and relies on extrapolation from adults. Children are not small adults but go through an evolutionarily conserved and physiologically dynamic process of growth and maturation. Novel quantitative approaches, integrating observations from clinical trials and drug safety databases with dynamic mechanisms, can be used to systematically identify ADEs unique to childhood. In this perspective, we discuss three critical research directions using systems biology methodologies and novel informatics to improve paediatric drug safety, namely child versus adult drug safety profiles, age-dependent drug toxicities and genetic susceptibility of ADEs across childhood. We argue that a data-driven framework that leverages observational data, biomedical knowledge and systems biology modelling will reveal previously unknown mechanisms of paediatric adverse drug events and lead to improved paediatric drug safety.
Affiliation(s)
- Nicholas P Giangreco
- Department of Biomedical Informatics and Systems Biology, Columbia University, New York, NY, USA
| | - Jonathan E Elias
- Department of Pediatrics, Instructor in Pediatrics, Assistant Medical Director of Information Services, Weill Cornell Medical & NYP Weill Cornell Medical Center, New York, NY, USA
| | - Nicholas P Tatonetti
- Department of Biomedical Informatics and Systems Biology, Columbia University, New York, NY, USA
| |
|
23
|
Thangaraj PM, Kummer BR, Lorberbaum T, Elkind MSV, Tatonetti NP. Comparative analysis, applications, and interpretation of electronic health record-based stroke phenotyping methods. BioData Min 2020; 13:21. [PMID: 33372632 PMCID: PMC7720570 DOI: 10.1186/s13040-020-00230-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/26/2020] [Accepted: 11/15/2020] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND Accurate identification of acute ischemic stroke (AIS) patient cohorts is essential for a wide range of clinical investigations. Automated phenotyping methods that leverage electronic health records (EHRs) represent a fundamentally new approach to cohort identification without the current laborious and ungeneralizable generation of phenotyping algorithms. We systematically compared and evaluated the ability of machine learning algorithms and case-control combinations to phenotype acute ischemic stroke patients using data from an EHR. MATERIALS AND METHODS Using structured patient data from the EHR at a tertiary-care hospital system, we built and evaluated machine learning models to identify patients with AIS based on 75 different case-control and classifier combinations. We then estimated the prevalence of AIS patients across the EHR. Finally, we externally validated the ability of the models to detect AIS patients without AIS diagnosis codes using the UK Biobank. RESULTS Across all models, we found that the mean AUROC for detecting AIS was 0.963 ± 0.0520 and average precision score 0.790 ± 0.196 with minimal feature processing. Classifiers trained with cases with AIS diagnosis codes and controls with no cerebrovascular disease codes had the best average F1 score (0.832 ± 0.0383). In the external validation, we found that the top probabilities from a model-predicted AIS cohort were significantly enriched for AIS patients without AIS diagnosis codes (60-150 fold over expected). CONCLUSIONS Our findings support machine learning algorithms as a generalizable way to accurately identify AIS patients without using process-intensive manual feature curation. When a set of AIS patients is unavailable, diagnosis codes may be used to train classifier models.
Affiliation(s)
- Phyllis M Thangaraj
- Department of Biomedical Informatics, Columbia University, 622 W 168th St., PH-20, New York, NY, 10032, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Benjamin R Kummer
- Department of Neurology, Icahn School of Medicine at Mt. Sinai, New York, NY, USA
| | - Tal Lorberbaum
- Department of Biomedical Informatics, Columbia University, 622 W 168th St., PH-20, New York, NY, 10032, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Mitchell S V Elkind
- Department of Neurology, Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA
| | - Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University, 622 W 168th St., PH-20, New York, NY, 10032, USA.
- Department of Systems Biology, Columbia University, New York, NY, USA.
| |
|
24
|
Silver ER, Truong HQ, Ostvar S, Hur C, Tatonetti NP. Association of Neighborhood Deprivation Index With Success in Cancer Care Crowdfunding. JAMA Netw Open 2020; 3:e2026946. [PMID: 33270122 PMCID: PMC7716189 DOI: 10.1001/jamanetworkopen.2020.26946] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/12/2020] [Accepted: 09/25/2020] [Indexed: 01/16/2023] Open
Abstract
Importance Financial toxicity resulting from cancer care poses a substantial public health concern, leading some patients to turn to online crowdfunding. However, the practice may exacerbate existing socioeconomic cancer disparities by privileging those with access to interpersonal wealth and digital media literacy. Objective To test the hypotheses that higher county-level socioeconomic status and the presence (vs absence) of text indicators of beneficiary worth in campaign descriptions are associated with amount raised from cancer crowdfunding. Design, Setting, and Participants This cross-sectional analysis examined US cancer crowdfunding campaigns conducted between 2010 and 2019 and data from the American Community Survey (2013-2017). Data analysis was performed from December 2019 to March 2020. Exposures Neighborhood deprivation index of campaign location and campaign text features indicating the beneficiary's worth. Main Outcomes and Measures Amount of money raised. Results This study analyzed 144 061 US cancer crowdfunding campaigns. Campaigns in counties with higher neighborhood deprivation raised less (-26.07%; 95% CI, -27.46% to -24.65%; P < .001) than those in counties with less neighborhood deprivation. Campaigns raised more funds when legitimizing details were provided, including clinical details about the cancer type (9.58%; 95% CI, 8.00% to 11.18%; P < .001) and treatment type (6.58%; 95% CI, 5.44% to 7.79%; P < .001) and financial details, such as insurance status (1.39%; 95% CI, 0.20% to 2.63%; P = .02) and out-of-pocket costs (7.36%; 95% CI, 6.18% to 8.55%; P < .001). Campaigns raised more money when beneficiaries were described as warm (13.80%; 95% CI, 12.30% to 15.26%; P < .001), brave (15.40%; 95% CI, 14.11% to 16.65%; P < .001), or self-reliant (5.23%; 95% CI, 3.77% to 6.72%; P < .001). 
Conclusions and Relevance These findings suggest that cancer crowdfunding success may disproportionately benefit those in high-socioeconomic status areas and those with the internet literacy necessary to portray beneficiaries as worthy. By rewarding those with existing socioeconomic advantage, cancer crowdfunding may perpetuate socioeconomic disparities in cancer care access. The findings also underscore the widespread nature of financial toxicity resulting from cancer care.
Affiliation(s)
- Elisabeth R. Silver
- Department of Medicine, Columbia University Irving Medical Center, New York, New York
| | - Han Q. Truong
- Department of Medicine, Columbia University Irving Medical Center, New York, New York
| | - Sassan Ostvar
- Department of Medicine, Columbia University Irving Medical Center, New York, New York
| | - Chin Hur
- Department of Medicine, Columbia University Irving Medical Center, New York, New York
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, New York
- Division of Digestive and Liver Diseases, Columbia University Irving Medical Center, New York, New York
| | - Nicholas P. Tatonetti
- Department of Medicine, Columbia University Irving Medical Center, New York, New York
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, New York
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, New York
| |
|
25
|
Spurlin EE, Han ES, Silver ER, May BL, Tatonetti NP, Ingram MA, Jin Z, Hur C, Advincula AP, Hur HC. Where Have All the Emergencies Gone? The Impact of the COVID-19 Pandemic on Obstetric and Gynecologic Procedures and Consults at a New York City Hospital. J Minim Invasive Gynecol 2020; 28:1411-1419.e1. [PMID: 33248312 PMCID: PMC7688419 DOI: 10.1016/j.jmig.2020.11.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/27/2020] [Revised: 11/17/2020] [Accepted: 11/20/2020] [Indexed: 11/29/2022]
Abstract
Study Objective The purpose of this study was to assess the impact of the coronavirus disease 2019 (COVID-19) pandemic on surgical volume and emergency department (ED) consults across obstetrics-gynecology (OB-GYN) services at a New York City hospital. Design Retrospective cohort study. Setting Tertiary care academic medical center in New York City. Patients Women undergoing OB-GYN ED consults or surgeries between February 1, 2020 and April 15, 2020. Interventions March 16 institutional moratorium on elective surgeries. Measurements and Main Results The volume and types of surgeries and ED consults were compared before and after the COVID-19 moratorium. During the pandemic, the average weekly volume of ED consults and gynecology (GYN) surgeries decreased, whereas obstetric (OB) surgeries remained stable. The proportions of OB-GYN ED consults, GYN surgeries, and OB surgeries relative to all ED consults, all surgeries, and all labor and delivery patients were 1.87%, 13.8%, 54.6% in the pre–COVID-19 time frame (February 1–March 15) vs 1.53%, 21.3%, 79.7% in the COVID-19 time frame (March 16–April 15), representing no significant difference in proportions of OB-GYN ED consults (p = .464) and GYN surgeries (p = .310) before and during COVID-19, with a proportionate increase in OB surgeries (p <.002). The distribution of GYN surgical case types changed significantly during the pandemic with higher proportions of emergent surgeries for ectopic pregnancies, miscarriages, and concern for cancer (p <.001). Alternatively, the OB surgery distribution of case types remained relatively constant. Conclusion This study highlights how the pandemic has affected the ways that patients in OB-GYN access and receive care. Institutional policies suspending elective surgeries during the pandemic decreased GYN surgical volume and affected the types of cases performed. 
This decrease was not appreciated for OB surgical volume, reflecting the nonelective and time-sensitive nature of obstetric care. A decrease in ED consults was noted during the pandemic begging the question “Where have all the emergencies gone?” Although the moratorium on elective procedures was necessary, “elective” GYN surgeries remain medically indicated to address symptoms such as pain and bleeding and to prevent serious medical sequelae such as severe anemia requiring transfusion. As we continue to battle COVID-19, we must not lose sight of those patients whose care has been deferred.
Affiliation(s)
- Emily E Spurlin
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, (Dr. Spurlin)
| | - Esther S Han
- Division of Gynecologic Specialty Surgery, Department of Obstetrics and Gynecology, New York Presbyterian Hospital, Columbia University Irving Medical Center, (Drs. Han, Advincula, and H. Hur)
| | - Elisabeth R Silver
- Department of Medicine, Columbia University Irving Medical Center, (Dr. C. Hur and Ms. Silver and Mr. Ingram)
| | - Benjamin L May
- Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, (Mr. May)
| | - Nicholas P Tatonetti
- Departments of Biomedical Informatics, Systems Biology, and Medicine, Columbia University Irving Medical Center, (Dr. Tatonetti)
| | - Myles A Ingram
- Department of Medicine, Columbia University Irving Medical Center, (Dr. C. Hur and Ms. Silver and Mr. Ingram)
| | - Zhezhen Jin
- Department of Biostatistics, Columbia University, (Dr. Jin), New York, New York
| | - Chin Hur
- Department of Medicine, Columbia University Irving Medical Center, (Dr. C. Hur and Ms. Silver and Mr. Ingram)
| | - Arnold P Advincula
- Division of Gynecologic Specialty Surgery, Department of Obstetrics and Gynecology, New York Presbyterian Hospital, Columbia University Irving Medical Center, (Drs. Han, Advincula, and H. Hur)
| | - Hye-Chun Hur
- Division of Gynecologic Specialty Surgery, Department of Obstetrics and Gynecology, New York Presbyterian Hospital, Columbia University Irving Medical Center, (Drs. Han, Advincula, and H. Hur).
| |
|
26
|
Zietz M, Zucker J, Tatonetti NP. Associations between blood type and COVID-19 infection, intubation, and death. Nat Commun 2020; 11:5761. [PMID: 33188185 PMCID: PMC7666188 DOI: 10.1038/s41467-020-19623-x] [Citation(s) in RCA: 219] [Impact Index Per Article: 54.8] [Received: 04/15/2020] [Accepted: 10/16/2020] [Indexed: 01/06/2023] Open
Abstract
The rapid global spread of the novel coronavirus SARS-CoV-2 has strained healthcare and testing resources, making the identification and prioritization of individuals most at-risk a critical challenge. Recent evidence suggests blood type may affect risk of severe COVID-19. Here, we use observational healthcare data on 14,112 individuals tested for SARS-CoV-2 with known blood type in the New York Presbyterian (NYP) hospital system to assess the association between ABO and Rh blood types and infection, intubation, and death. We find slightly increased infection prevalence among non-O types. Risk of intubation was decreased among A and increased among AB and B types, compared with type O, while risk of death was increased for type AB and decreased for types A and B. We estimate Rh-negative blood type to have a protective effect for all three outcomes. Our results add to the growing body of evidence suggesting blood type may play a role in COVID-19. Recent evidence has suggested that blood type may be associated with severe COVID-19. Here, the authors use data from ~14,000 individuals tested for SARS-CoV-2 at a New York City hospital, and find that certain ABO and Rh blood types are associated with infection, intubation, and death.
Affiliation(s)
- Michael Zietz
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
| | - Jason Zucker
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
| | - Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Department of Medicine, Columbia University Irving Medical Center, New York, NY, USA
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Institute for Genomic Medicine, Columbia University Irving Medical Center, New York, NY, USA
| |
|
27
|
Chandak P, Tatonetti NP. Using Machine Learning to Identify Adverse Drug Effects Posing Increased Risk to Women. Patterns (N Y) 2020; 1:100108. [PMID: 33179017 PMCID: PMC7654817 DOI: 10.1016/j.patter.2020.100108] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/29/2020] [Revised: 08/14/2020] [Accepted: 08/27/2020] [Indexed: 11/27/2022]
Abstract
Adverse drug reactions are the fourth leading cause of death in the US. Although women take longer to metabolize medications and experience twice the risk of developing adverse reactions compared with men, these sex differences are not comprehensively understood. Real-world clinical data provide an opportunity to estimate safety effects in otherwise understudied populations, i.e., women. These data, however, are subject to confounding biases and correlated covariates. We present AwareDX, a pharmacovigilance algorithm that leverages advances in machine learning to predict sex risks. Our algorithm mitigates these biases and quantifies the differential risk of a drug causing an adverse event in either men or women. AwareDX demonstrates high precision during validation against clinical literature and pharmacogenetic mechanisms. We present a resource of 20,817 adverse drug effects posing sex-specific risks. AwareDX, and this resource, present an opportunity to minimize adverse events by tailoring drug prescription and dosage to sex.
Affiliation(s)
- Payal Chandak
- Department of Computer Science, Columbia University, New York, NY 10027, USA
| | | |
|
28
|
Ramlall V, Thangaraj PM, Meydan C, Foox J, Butler D, Kim J, May B, De Freitas JK, Glicksberg BS, Mason CE, Tatonetti NP, Shapira SD. Immune complement and coagulation dysfunction in adverse outcomes of SARS-CoV-2 infection. Nat Med 2020; 26:1609-1615. [PMID: 32747830 PMCID: PMC7809634 DOI: 10.1038/s41591-020-1021-2] [Citation(s) in RCA: 211] [Impact Index Per Article: 52.8] [Received: 06/25/2020] [Accepted: 07/16/2020] [Indexed: 11/08/2022]
Abstract
Understanding the pathophysiology of SARS-CoV-2 infection is critical for therapeutic and public health strategies. Viral-host interactions can guide discovery of disease regulators, and protein structure-function analysis points to several immune pathways, including complement and coagulation, as targets of coronaviruses. To determine whether conditions associated with dysregulated complement or coagulation systems impact disease, we performed a retrospective observational study and found that history of macular degeneration (a proxy for complement-activation disorders) and history of coagulation disorders (thrombocytopenia, thrombosis and hemorrhage) are risk factors for SARS-CoV-2-associated morbidity and mortality, effects that are independent of age, sex or history of smoking. Transcriptional profiling of nasopharyngeal swabs demonstrated that in addition to type-I interferon and interleukin-6-dependent inflammatory responses, infection results in robust engagement of the complement and coagulation pathways. Finally, in a candidate-driven genetic association study of severe SARS-CoV-2 disease, we identified putative complement and coagulation-associated loci including missense, eQTL and sQTL variants of critical complement and coagulation regulators. In addition to providing evidence that complement function modulates SARS-CoV-2 infection outcome, the data point to putative transcriptional genetic markers of susceptibility. The results highlight the value of using a multimodal analytical approach to reveal determinants and predictors of immunity, susceptibility and clinical outcome associated with infection.
Affiliation(s)
- Vijendra Ramlall
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Physiology & Cellular Biophysics, Columbia University, New York, NY, USA
- Phyllis M Thangaraj
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Cem Meydan
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA
- Jonathan Foox
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Daniel Butler
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- Jacob Kim
- Department of Systems Biology, Columbia University, New York, NY, USA
- Ben May
- Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA
- Jessica K De Freitas
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Benjamin S Glicksberg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Christopher E Mason
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
- The Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA
- Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Sagi D Shapira
- Department of Systems Biology, Columbia University, New York, NY, USA
|
29
|
Abstract
The rapid global spread of the novel coronavirus SARS-CoV-2 has strained healthcare and testing resources, making the identification and prioritization of individuals most at-risk a critical challenge. Recent evidence suggests blood type may affect risk of severe COVID-19. We used observational healthcare data on 14,112 individuals tested for SARS-CoV-2 with known blood type in the New York Presbyterian (NYP) hospital system to assess the association between ABO and Rh blood types and infection, intubation, and death. We found slightly increased infection prevalence among non-O types. Risk of intubation was decreased among A and increased among AB and B types, compared with type O, while risk of death was increased for type AB and decreased for types A and B. We estimated Rh-negative blood type to have a protective effect for all three outcomes. Our results add to the growing body of evidence suggesting blood type may play a role in COVID-19.
Affiliation(s)
- Michael Zietz
- Department of Biomedical Informatics, Columbia University Irving Medical Center
- Jason Zucker
- Department of Medicine, Columbia University Irving Medical Center
|
30
|
Ramlall V, Thangaraj PM, Meydan C, Foox J, Butler D, May B, De Freitas JK, Glicksberg BS, Mason CE, Tatonetti NP, Shapira SD. Identification of Immune complement function as a determinant of adverse SARS-CoV-2 infection outcome. medRxiv 2020:2020.05.05.20092452. [PMID: 32511494 PMCID: PMC7273262 DOI: 10.1101/2020.05.05.20092452] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]
Abstract
Understanding the pathophysiology of SARS-CoV-2 infection is critical for therapeutics and public health intervention strategies. Viral-host interactions can guide discovery of regulators of disease outcomes, and protein structure function analysis points to several immune pathways, including complement and coagulation, as targets of the coronavirus proteome. To determine if conditions associated with dysregulation of the complement or coagulation systems impact adverse clinical outcomes, we performed a retrospective observational study of 11,116 patients who presented with suspected SARS-CoV-2 infection. We found that history of macular degeneration (a proxy for complement activation disorders) and history of coagulation disorders (thrombocytopenia, thrombosis, and hemorrhage) are risk factors for morbidity and mortality in SARS-CoV-2 infected patients - effects that could not be explained by age, sex, or history of smoking. Further, transcriptional profiling of nasopharyngeal (NP) swabs from 650 control and SARS-CoV-2 infected patients demonstrated that in addition to innate Type-I interferon and IL-6 dependent inflammatory immune responses, infection results in robust engagement and activation of the complement and coagulation pathways. Finally, we conducted a candidate driven genetic association study of severe SARS-CoV-2 disease. Among the findings, our scan identified putative complement and coagulation associated loci including missense, eQTL and sQTL variants of critical regulators of the complement and coagulation cascades. In addition to providing evidence that complement function modulates SARS-CoV-2 infection outcome, the data point to putative transcriptional genetic markers of susceptibility. 
The results highlight the value of using a multi-modal analytical approach, combining molecular information from virus protein structure-function analysis with clinical informatics, transcriptomics, and genomics to reveal determinants and predictors of immunity, susceptibility, and clinical outcome associated with infection.
Affiliation(s)
- Vijendra Ramlall
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Physiology & Cellular Biophysics, Columbia University, New York, NY, USA
- Phyllis M. Thangaraj
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Cem Meydan
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA
- Jonathan Foox
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Daniel Butler
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Ben May
- Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA
- Jessica K. De Freitas
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, 10065
- Benjamin S. Glicksberg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, 10065
- Christopher E. Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
- The Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA
- Nicholas P. Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Sagi D. Shapira
- Department of Systems Biology, Columbia University, New York, NY, USA
|
31
|
Macesic N, Bear Don't Walk OJ, Pe'er I, Tatonetti NP, Peleg AY, Uhlemann AC. Predicting Phenotypic Polymyxin Resistance in Klebsiella pneumoniae through Machine Learning Analysis of Genomic Data. mSystems 2020; 5:e00656-19. [PMID: 32457240 PMCID: PMC7253370 DOI: 10.1128/msystems.00656-19] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 10/09/2019] [Accepted: 05/01/2020] [Indexed: 02/06/2023] Open
Abstract
Polymyxins are used as treatments of last resort for Gram-negative bacterial infections. Their increased use has led to concerns about emerging polymyxin resistance (PR). Phenotypic polymyxin susceptibility testing is resource intensive and difficult to perform accurately. The complex polygenic nature of PR and our incomplete understanding of its genetic basis make it difficult to predict PR using detection of resistance determinants. We therefore applied machine learning (ML) to whole-genome sequencing data from >600 Klebsiella pneumoniae clonal group 258 (CG258) genomes to predict phenotypic PR. Using a reference-based representation of genomic data with ML outperformed a rule-based approach that detected variants in known PR genes (area under receiver-operator curve [AUROC], 0.894 versus 0.791, P = 0.006). We noted modest increases in performance by using a bacterial genome-wide association study to filter relevant genomic features and by integrating clinical data in the form of prior polymyxin exposure. Conversely, reference-free representation of genomic data as k-mers was associated with decreased performance (AUROC, 0.692 versus 0.894, P = 0.015). When ML models were interpreted to extract genomic features, six of seven known PR genes were correctly identified by models without prior programming and several genes involved in stress responses and maintenance of the cell membrane were identified as potential novel determinants of PR. These findings are a proof of concept that whole-genome sequencing data can accurately predict PR in K. pneumoniae CG258 and may be applicable to other forms of complex antimicrobial resistance.
IMPORTANCE Polymyxins are last-resort antibiotics used to treat highly resistant Gram-negative bacteria. There are increasing reports of polymyxin resistance emerging, raising concerns of a postantibiotic era. Polymyxin resistance is therefore a significant public health threat, but current phenotypic methods for detection are difficult and time-consuming to perform. There have been increasing efforts to use whole-genome sequencing for detection of antibiotic resistance, but this has been difficult to apply to polymyxin resistance because of its complex polygenic nature. The significance of our research is that we successfully applied machine learning methods to predict polymyxin resistance in Klebsiella pneumoniae clonal group 258, a common health care-associated and multidrug-resistant pathogen. Our findings highlight that machine learning can be successfully applied even in complex forms of antibiotic resistance and represent a significant contribution to the literature that could be used to predict resistance in other bacteria and to other antibiotics.
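As an illustration of what a reference-free k-mer representation looks like, the following minimal Python sketch turns short DNA strings into fixed-length count vectors. It is not the authors' pipeline: the sequences, the isolate names, the choice of k = 3, and the helper names `kmer_counts` and `kmer_feature_vector` are all hypothetical and chosen only to keep the example small.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in a DNA sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_feature_vector(sequence: str, k: int, vocabulary: list) -> list:
    """Map a sequence to counts ordered by a shared k-mer vocabulary."""
    counts = kmer_counts(sequence, k)
    return [counts.get(kmer, 0) for kmer in vocabulary]


# Toy sequences (hypothetical, not real isolates); real genomes are far longer.
genomes = {"isolate_a": "ATGCGATGCA", "isolate_b": "ATGCGTTGCA"}
k = 3  # assumed value for illustration only

# Shared vocabulary: every k-mer seen in any genome, in sorted order.
vocabulary = sorted(set().union(*(kmer_counts(s, k) for s in genomes.values())))

# One fixed-length feature vector per genome, usable as input to a classifier.
features = {name: kmer_feature_vector(s, k, vocabulary) for name, s in genomes.items()}
```

Because no reference genome is involved, the feature space is defined entirely by the k-mers observed in the data, which is why it can grow very large for realistic values of k.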
Affiliation(s)
- Nenad Macesic
- Division of Infectious Diseases, Columbia University Irving Medical Center, New York, New York, USA
- Department of Infectious Diseases, The Alfred Hospital and Central Clinical School, Monash University, Melbourne, Australia
- Itsik Pe'er
- Department of Computer Science, Columbia University, New York, New York, USA
- Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Anton Y Peleg
- Department of Infectious Diseases, The Alfred Hospital and Central Clinical School, Monash University, Melbourne, Australia
- Infection and Immunity Program, Monash Biomedicine Discovery Institute, Department of Microbiology, Monash University, Clayton, Victoria, Australia
- Anne-Catrin Uhlemann
- Division of Infectious Diseases, Columbia University Irving Medical Center, New York, New York, USA
- Microbiome & Pathogen Genomics Core, Columbia University Irving Medical Center, New York, New York, USA
|
32
|
Krigel A, Tatonetti NP, Neugut AI, Lebwohl B. No Increased Risk of Colorectal Adenomas in Spouses of Patients with Colorectal Neoplasia. Clin Gastroenterol Hepatol 2020; 18:509-510. [PMID: 30928453 PMCID: PMC10855025 DOI: 10.1016/j.cgh.2019.03.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2018] [Revised: 03/18/2019] [Accepted: 03/24/2019] [Indexed: 02/07/2023]
Abstract
Although genetic factors such as family history have been associated with increased risk of developing colorectal cancer (CRC), multiple lifestyle and environmental risk factors for CRC have been identified, including smoking, diet, obesity, and physical activity [1,2]. Although couples typically have different genetic backgrounds, spouses are likely to share lifestyle and environmental exposures over the course of years, including similar home environments, geographical locations of residence, dietary exposures, and smoking exposures [3]. As such, one might expect that an increased CRC incidence would be seen among spouses of patients with CRC; however, studies on this topic have inconsistent results [3-6]. By using a large cohort of spouses who have undergone colonoscopy, we aimed to determine whether the risk of colorectal adenomas is increased among spouses of those with colorectal neoplasia (CRN) on colonoscopy.
Affiliation(s)
- Anna Krigel
- Division of Digestive and Liver Diseases, Department of Medicine, Columbia University Medical Center, New York, New York
- Nicholas P Tatonetti
- Division of Biomedical Informatics, Columbia University Medical Center, New York, New York
- Alfred I Neugut
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, New York; Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York
- Benjamin Lebwohl
- Division of Digestive and Liver Diseases, Department of Medicine, Columbia University Medical Center, New York, New York; Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, New York
|
33
|
Glicksberg BS, Oskotsky B, Thangaraj PM, Giangreco N, Badgeley MA, Johnson KW, Datta D, Rudrapatna VA, Rappoport N, Shervey MM, Miotto R, Goldstein TC, Rutenberg E, Frazier R, Lee N, Israni S, Larsen R, Percha B, Li L, Dudley JT, Tatonetti NP, Butte AJ. PatientExploreR: an extensible application for dynamic visualization of patient clinical history from electronic health records in the OMOP common data model. Bioinformatics 2019; 35:4515-4518. [PMID: 31214700 PMCID: PMC6821222 DOI: 10.1093/bioinformatics/btz409] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/16/2018] [Revised: 03/20/2019] [Accepted: 06/13/2019] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION Electronic health records (EHRs) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model (CDM) could accelerate more EHR-based research by making the data more accessible to researchers who lack computational expertise and/or domain knowledge. RESULTS We present PatientExploreR, an extensible application built on the R/Shiny framework that interfaces with a relational database of EHR data in the Observational Medical Outcomes Partnership CDM format. PatientExploreR produces patient-level interactive and dynamic reports and facilitates visualization of clinical data without any programming required. It allows researchers to easily construct and export patient cohorts from the EHR for analysis with other software. This application could enable easier exploration of patient-level data for physicians and researchers. PatientExploreR can incorporate EHR data from any institution that employs the CDM for users with approved access. The software code is free and open source under the MIT license, enabling institutions to install and users to expand and modify the application for their own purposes. AVAILABILITY AND IMPLEMENTATION PatientExploreR can be freely obtained from GitHub: https://github.com/BenGlicksberg/PatientExploreR. We provide instructions for how researchers with approved access to their institutional EHR can use this package. We also release an open sandbox server of synthesized patient data for users without EHR access to explore: http://patientexplorer.ucsf.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Benjamin S Glicksberg
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Boris Oskotsky
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Phyllis M Thangaraj
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Medicine, Columbia University, New York, NY, USA
- Nicholas Giangreco
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Medicine, Columbia University, New York, NY, USA
- Marcus A Badgeley
- Departments of Genomics and Data Science, Icahn Institute for Genomic Sciences and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Institute of Next Generation Healthcare, New York, NY, USA
- Kipp W Johnson
- Departments of Genomics and Data Science, Icahn Institute for Genomic Sciences and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Institute of Next Generation Healthcare, New York, NY, USA
- Debajyoti Datta
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Vivek A Rudrapatna
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Division of Gastroenterology, Department of Medicine, University of California, San Francisco, CA, USA
- Nadav Rappoport
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Mark M Shervey
- Departments of Genomics and Data Science, Icahn Institute for Genomic Sciences and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Institute of Next Generation Healthcare, New York, NY, USA
- Riccardo Miotto
- Departments of Genomics and Data Science, Icahn Institute for Genomic Sciences and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Institute of Next Generation Healthcare, New York, NY, USA
- Theodore C Goldstein
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Eugenia Rutenberg
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Remi Frazier
- Enterprise Information and Analytics, University of California, San Francisco, San Francisco, CA, USA
- Nelson Lee
- Enterprise Information and Analytics, University of California, San Francisco, San Francisco, CA, USA
- Sharat Israni
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Rick Larsen
- Enterprise Information and Analytics, University of California, San Francisco, San Francisco, CA, USA
- Bethany Percha
- Departments of Genomics and Data Science, Icahn Institute for Genomic Sciences and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Institute of Next Generation Healthcare, New York, NY, USA
- Li Li
- Departments of Genomics and Data Science, Icahn Institute for Genomic Sciences and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Institute of Next Generation Healthcare, New York, NY, USA
- Joel T Dudley
- Departments of Genomics and Data Science, Icahn Institute for Genomic Sciences and Multiscale Biology, Icahn School of Medicine at Mount Sinai, Institute of Next Generation Healthcare, New York, NY, USA
- Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Medicine, Columbia University, New York, NY, USA
- Atul J Butte
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Center for Data-Driven Insights and Innovation, University of California Health, Oakland, CA, USA
|
34
|
Lyudovyk O, Shen Y, Tatonetti NP, Hsiao SJ, Mansukhani MM, Weng C. Pathway analysis of genomic pathology tests for prognostic cancer subtyping. J Biomed Inform 2019; 98:103286. [PMID: 31499184 PMCID: PMC7136846 DOI: 10.1016/j.jbi.2019.103286] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/31/2019] [Revised: 09/01/2019] [Accepted: 09/05/2019] [Indexed: 10/26/2022]
Abstract
Genomic test results collected during the provision of medical care and stored in Electronic Health Record (EHR) systems represent an opportunity for clinical research into disease heterogeneity and clinical outcomes. In this paper, we evaluate the use of genomic test reports ordered for cancer patients in order to derive cancer subtypes and to identify biological pathways predictive of poor survival outcomes. A novel method is proposed to calculate patient similarity based on affected biological pathways rather than gene mutations. We demonstrate that this approach identifies subtypes of prognostic value and biological pathways linked to survival, with implications for precision treatment selection and a better understanding of the underlying disease. We also share lessons learned regarding the opportunities and challenges of secondary use of observational genomic data to conduct such research.
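The idea of comparing patients by affected pathways rather than by mutated genes can be sketched in a few lines of Python. The abstract does not state which similarity measure the authors used, so the Jaccard index here is an assumption, and the gene names, pathway names, and helper functions (`affected_pathways`, `jaccard`) are hypothetical placeholders rather than anything taken from the paper.

```python
def affected_pathways(mutated_genes: set, gene_to_pathways: dict) -> set:
    """Collect every pathway that contains at least one mutated gene."""
    pathways = set()
    for gene in mutated_genes:
        pathways.update(gene_to_pathways.get(gene, ()))
    return pathways


def jaccard(a: set, b: set) -> float:
    """Jaccard index: shared items divided by all items; 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical gene-to-pathway annotation (placeholder identifiers only).
gene_to_pathways = {
    "GENE_A": {"pathway_1"},
    "GENE_B": {"pathway_1", "pathway_2"},
    "GENE_C": {"pathway_3"},
}

# Two patients whose reports list different mutated genes.
patient_x = {"GENE_A"}
patient_y = {"GENE_B"}

# Gene-level comparison: no shared mutations.
gene_sim = jaccard(patient_x, patient_y)

# Pathway-level comparison: both patients affect pathway_1.
pathway_sim = jaccard(
    affected_pathways(patient_x, gene_to_pathways),
    affected_pathways(patient_y, gene_to_pathways),
)
```

In this toy case the two patients share no mutated gene yet overlap at the pathway level, which is the kind of relationship a pathway-based similarity is meant to expose.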
Affiliation(s)
- Olga Lyudovyk
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Yufeng Shen
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Susan J Hsiao
- Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York, NY, USA
- Mahesh M Mansukhani
- Department of Pathology and Cell Biology, Columbia University Irving Medical Center, New York, NY, USA
- Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
|
35
|
Desanti De Oliveira B, Xu K, Shen TH, Callahan M, Kiryluk K, D'Agati VD, Tatonetti NP, Barasch J, Devarajan P. Molecular nephrology: types of acute tubular injury. Nat Rev Nephrol 2019; 15:599-612. [PMID: 31439924 PMCID: PMC7303545 DOI: 10.1038/s41581-019-0184-x] [Citation(s) in RCA: 80] [Impact Index Per Article: 16.0] [Accepted: 07/19/2019] [Indexed: 12/29/2022]
Abstract
The acute loss of kidney function has been diagnosed for many decades using the serum concentration of creatinine, a muscle metabolite that is an insensitive and non-specific marker of kidney function but is now used for the very definition of acute kidney injury (AKI). Fortunately, myriad new tools have now been developed to better understand the relationship between acute tubular injury and elevation in serum creatinine (SCr). These tools include unbiased gene and protein expression analyses in kidney, urine and blood, the localization of specific gene transcripts in pathological biopsy samples by rapid in-situ RNA technology and single-cell RNA-sequencing analyses. However, this molecular approach to AKI has produced a series of unexpected problems, because the expression of specific kidney-derived molecules that are indicative of injury often does not correlate with SCr levels. This discrepancy between kidney injury markers and SCr level can be reconciled by the recognition that many separate subtypes of AKI exist, each with distinct patterning of molecular markers of tubular injury and SCr data. In this Review, we describe the weaknesses of isolated SCr-based diagnoses, the clinical and molecular subtyping of acute tubular injury, and the role of non-invasive biomarkers in clinical phenotyping. We propose a conceptual model that synthesizes molecular and physiological data along a time course spanning from acute cellular injury to organ failure.
Affiliation(s)
- Prasad Devarajan
- Cincinnati Children's Hospital Medical Center, University of Cincinnati College of Medicine, Cincinnati, OH, USA
|
36
|
Abstract
The ability to collect, store and analyze massive amounts of molecular and clinical data is fundamentally transforming the scientific method and its application in translational medicine. Collecting observations has always been a prerequisite for discovery, and great leaps in scientific understanding are accompanied by an expansion of this ability. Particle physics, astronomy and climate science, for example, have all greatly benefited from the development of new technologies enabling the collection of larger and more diverse data. Unlike medicine, however, each of these fields also has a mature theoretical framework on which new data can be evaluated and incorporated; to say it another way, there are no 'first principles' from which a healthy human could be analytically derived. The worry, and it is a valid concern, is that, without a strong theoretical underpinning, the inundation of data will cause medical research to devolve into a haphazard enterprise without discipline or rigor. The Age of Big Data harbors tremendous opportunity for biomedical advances, but will also be treacherous and demanding on future scientists.
Affiliation(s)
- Nicholas P Tatonetti
- Departments of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, NY, USA
|
37
|
Basile AO, Yahi A, Tatonetti NP. Artificial Intelligence for Drug Toxicity and Safety. Trends Pharmacol Sci 2019; 40:624-635. [PMID: 31383376 PMCID: PMC6710127 DOI: 10.1016/j.tips.2019.07.005] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 03/09/2019] [Revised: 07/10/2019] [Accepted: 07/10/2019] [Indexed: 12/13/2022]
Abstract
Interventional pharmacology is one of medicine's most potent weapons against disease. These drugs, however, can result in damaging side effects and must be closely monitored. Pharmacovigilance is the field of science that monitors, detects, and prevents adverse drug reactions (ADRs). Safety efforts begin during the development process, using in vivo and in vitro studies, continue through clinical trials, and extend to postmarketing surveillance of ADRs in real-world populations. Future toxicity and safety challenges, including increased polypharmacy and patient diversity, stress the limits of these traditional tools. Massive amounts of newly available data present an opportunity for using artificial intelligence (AI) and machine learning to improve drug safety science. Here, we explore recent advances as applied to preclinical drug safety and postmarketing surveillance with a specific focus on machine and deep learning (DL) approaches.
Affiliation(s)
- Anna O Basile
- Columbia University Medical Center, New York, NY, USA
|
38
|
Polubriaginof FCG, Ryan P, Salmasian H, Shapiro AW, Perotte A, Safford MM, Hripcsak G, Smith S, Tatonetti NP, Vawdrey DK. Challenges with quality of race and ethnicity data in observational databases. J Am Med Inform Assoc 2019; 26:730-736. [PMID: 31365089 PMCID: PMC6696496 DOI: 10.1093/jamia/ocz113] [Citation(s) in RCA: 99] [Impact Index Per Article: 19.8] [Received: 01/03/2019] [Revised: 05/14/2019] [Accepted: 06/14/2019] [Indexed: 12/12/2022] Open
Abstract
OBJECTIVE We sought to assess the quality of race and ethnicity information in observational health databases, including electronic health records (EHRs), and to propose patient self-recording as an improvement strategy. MATERIALS AND METHODS We assessed completeness of race and ethnicity information in large observational health databases in the United States (Healthcare Cost and Utilization Project and Optum Labs), and at a single healthcare system in New York City serving a racially and ethnically diverse population. We compared race and ethnicity data collected via administrative processes with data recorded directly by respondents via paper surveys (National Health and Nutrition Examination Survey and Hospital Consumer Assessment of Healthcare Providers and Systems). Respondent-recorded data were considered the gold standard for the collection of race and ethnicity information. RESULTS Among the 160 million patients from the Healthcare Cost and Utilization Project and Optum Labs datasets, race or ethnicity was unknown for 25%. Among the 2.4 million patients in the single New York City healthcare system's EHR, race or ethnicity was unknown for 57%. However, when patients directly recorded their race and ethnicity, 86% provided clinically meaningful information, and 66% of patients reported information that was discrepant with the EHR. DISCUSSION Race and ethnicity data are critical to support precision medicine initiatives and to determine healthcare disparities; however, the quality of this information in observational databases is concerning. Patient self-recording through the use of patient-facing tools can substantially increase the quality of the information while engaging patients in their health. CONCLUSIONS Patient self-recording may improve the completeness of race and ethnicity information.
Affiliation(s)
- Fernanda C G Polubriaginof
  - Value Institute, NewYork-Presbyterian Hospital, New York, New York, USA
  - Steele Institute for Health Innovation, Geisinger, Danville, Pennsylvania and Department of Biomedical Informatics, Columbia University, New York, New York
- Patrick Ryan
  - Steele Institute for Health Innovation, Geisinger, Danville, Pennsylvania and Department of Biomedical Informatics, Columbia University, New York, New York
  - Epidemiology Analytics, Janssen Research & Development, LLC, Titusville, New Jersey, USA
- Hojjat Salmasian
  - Division of General Internal Medicine, Harvard Medical School, Boston, Massachusetts, USA
  - Department of Quality and Safety, Brigham and Women’s Hospital, Boston, Massachusetts, USA
- Adler Perotte
  - Steele Institute for Health Innovation, Geisinger, Danville, Pennsylvania and Department of Biomedical Informatics, Columbia University, New York, New York
- Monika M Safford
  - Division of General Internal Medicine, Department of Medicine, Weill Cornell Medical College, New York, New York, USA
- George Hripcsak
  - Steele Institute for Health Innovation, Geisinger, Danville, Pennsylvania and Department of Biomedical Informatics, Columbia University, New York, New York
  - Medical Informatics Services, Human Resources, NewYork-Presbyterian Hospital, New York, New York, USA
- Shaun Smith
  - NewYork-Presbyterian Hospital, New York, New York, USA
- Nicholas P Tatonetti
  - Steele Institute for Health Innovation, Geisinger, Danville, Pennsylvania and Department of Biomedical Informatics, Columbia University, New York, New York
- David K Vawdrey
  - Value Institute, NewYork-Presbyterian Hospital, New York, New York, USA
  - Steele Institute for Health Innovation, Geisinger, Danville, Pennsylvania and Department of Biomedical Informatics, Columbia University, New York, New York
39
Romano JD, Tatonetti NP. Informatics and Computational Methods in Natural Product Drug Discovery: A Review and Perspectives. Front Genet 2019; 10:368. [PMID: 31114606 PMCID: PMC6503039 DOI: 10.3389/fgene.2019.00368] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 12/12/2018] [Accepted: 04/05/2019] [Indexed: 12/17/2022] Open
Abstract
The discovery of new pharmaceutical drugs is one of the preeminent tasks (scientifically, economically, and socially) in biomedical research. Advances in informatics and computational biology have increased productivity at many stages of the drug discovery pipeline. Nevertheless, drug discovery has slowed, largely due to the reliance on small molecules as the primary source of novel hypotheses. Natural products (such as plant metabolites, animal toxins, and immunological components) comprise a vast and diverse source of bioactive compounds, some of which are supported by thousands of years of traditional medicine, and are largely disjoint from the set of small molecules used commonly for discovery. However, natural products possess unique characteristics that distinguish them from traditional small molecule drug candidates, requiring new methods and approaches for assessing their therapeutic potential. In this review, we investigate a number of state-of-the-art techniques in bioinformatics, cheminformatics, and knowledge engineering for data-driven drug discovery from natural products. We focus on methods that aim to bridge the gap between traditional small-molecule drug candidates and different classes of natural products. We also explore the current informatics knowledge gaps and other barriers that need to be overcome to fully leverage these compounds for drug discovery. Finally, we conclude with a "road map" of research priorities that seeks to realize this goal.
Affiliation(s)
- Joseph D. Romano
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States
  - Department of Systems Biology, Columbia University, New York, NY, United States
  - Department of Medicine, Columbia University, New York, NY, United States
  - Data Science Institute, Columbia University, New York, NY, United States
- Nicholas P. Tatonetti
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States
  - Department of Systems Biology, Columbia University, New York, NY, United States
  - Department of Medicine, Columbia University, New York, NY, United States
  - Data Science Institute, Columbia University, New York, NY, United States
40
Tatonetti NP. The Next Generation of Drug Safety Science: Coupling Detection, Corroboration, and Validation to Discover Novel Drug Effects and Drug-Drug Interactions. Clin Pharmacol Ther 2018; 103:177-179. [PMID: 29313964 DOI: 10.1002/cpt.949] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/04/2017] [Revised: 11/17/2017] [Accepted: 11/18/2017] [Indexed: 11/10/2022]
Abstract
Rare adverse drug reactions and drug-drug interactions (DDIs) are difficult to detect in randomized trials and impossible to prove using observational studies. We must subscribe to a new way of conducting research that has the efficiency of a retrospective analysis and the rigor of a prospective trial. This can be achieved by integrating observational data from humans with laboratory experiments in model systems. The former establishes clinical significance and the latter supports causality.
Affiliation(s)
- Nicholas P Tatonetti
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
  - Department of Systems Biology, Columbia University, New York, New York, USA
  - Institute for Genomic Medicine, Columbia University, New York, New York, USA
  - Data Science Institute, Columbia University, New York, New York, USA
41
Glicksberg BS, Oskotsky B, Giangreco N, Thangaraj PM, Rudrapatna V, Datta D, Frazier R, Lee N, Larsen R, Tatonetti NP, Butte AJ. ROMOP: a light-weight R package for interfacing with OMOP-formatted electronic health record data. JAMIA Open 2019; 2:10-14. [PMID: 31633087 PMCID: PMC6800657 DOI: 10.1093/jamiaopen/ooy059] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/03/2018] [Revised: 10/26/2018] [Accepted: 12/02/2018] [Indexed: 12/03/2022] Open
Abstract
OBJECTIVES Electronic health record (EHR) data are increasingly used for biomedical discoveries. The nature of the data, however, requires expertise in both data science and EHR structure. The Observational Medical Outcomes Partnership (OMOP) common data model (CDM) standardizes the language and structure of EHR data to promote interoperability of EHR data for research. While the OMOP CDM is valuable and more attuned to research purposes, it still requires extensive domain knowledge to utilize effectively, potentially limiting more widespread adoption of EHR data for research and quality improvement. MATERIALS AND METHODS We have created ROMOP: an R package for direct interfacing with EHR data in the OMOP CDM format. RESULTS ROMOP streamlines typical EHR-related data processes. Its functions include exploration of data types, extraction and summarization of patient clinical and demographic data, and patient searches using any CDM vocabulary concept. CONCLUSION ROMOP is freely available under the Massachusetts Institute of Technology (MIT) license and can be obtained from GitHub (http://github.com/BenGlicksberg/ROMOP). We detail instructions for setup and use in the Supplementary Materials. Additionally, we provide a public sandbox server containing synthesized clinical data for users to explore OMOP data and ROMOP (http://romop.ucsf.edu).
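To make the idea of a "patient search using a CDM vocabulary concept" concrete, the following is a minimal sketch in Python with SQLite. It is not ROMOP's API (ROMOP is an R package and its function names are not given in the abstract). The table names `person` and `condition_occurrence` follow the OMOP CDM, but every row, the condition concept IDs 1001 and 1002, and the helper names `patients_with_concept` and `birth_year_summary` are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory toy database using two OMOP CDM table names.
# All rows are invented; condition concept IDs 1001/1002 are hypothetical,
# and 8532/8507 serve only as placeholder gender concept IDs.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        gender_concept_id INTEGER,
        year_of_birth INTEGER
    );
    CREATE TABLE condition_occurrence (
        condition_occurrence_id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(person_id),
        condition_concept_id INTEGER,
        condition_start_date TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, 8532, 1960), (2, 8507, 1975), (3, 8532, 1988)],
)
conn.executemany(
    "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
    [
        (10, 1, 1001, "2015-03-01"),
        (11, 1, 1001, "2016-07-19"),
        (12, 2, 1002, "2017-01-05"),
        (13, 3, 1001, "2018-11-30"),
    ],
)


def patients_with_concept(connection, concept_id):
    """Return the distinct person_ids with at least one occurrence of the concept."""
    rows = connection.execute(
        "SELECT DISTINCT person_id FROM condition_occurrence "
        "WHERE condition_concept_id = ? ORDER BY person_id",
        (concept_id,),
    ).fetchall()
    return [row[0] for row in rows]


def birth_year_summary(connection, concept_id):
    """Return (patient count, earliest birth year, latest birth year) for the concept."""
    return connection.execute(
        "SELECT COUNT(*), MIN(year_of_birth), MAX(year_of_birth) FROM person "
        "WHERE person_id IN (SELECT person_id FROM condition_occurrence "
        "WHERE condition_concept_id = ?)",
        (concept_id,),
    ).fetchone()


print(patients_with_concept(conn, 1001))  # [1, 3]
print(birth_year_summary(conn, 1001))     # (2, 1960, 1988)
```

A real OMOP instance would also join against the `concept` table to resolve concept names; that step is omitted to keep the sketch short.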
Affiliation(s)
- Benjamin S Glicksberg
  - Department of Pediatrics Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
- Boris Oskotsky
  - Department of Pediatrics Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
- Nicholas Giangreco
  - Departments of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, New York, USA
- Phyllis M Thangaraj
  - Departments of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, New York, USA
- Vivek Rudrapatna
  - Department of Pediatrics Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
- Debajyoti Datta
  - Department of Pediatrics Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
- Remi Frazier
  - Academic Research Systems, Department of Enterprise Data Warehouse University of California San Francisco, San Francisco, California, USA
- Nelson Lee
  - Academic Research Systems, Department of Enterprise Data Warehouse University of California San Francisco, San Francisco, California, USA
- Rick Larsen
  - Academic Research Systems, Department of Enterprise Data Warehouse University of California San Francisco, San Francisco, California, USA
- Nicholas P Tatonetti
  - Departments of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, New York, USA
- Atul J Butte
  - Department of Pediatrics Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
42
Polubriaginof FCG, Shang N, Hripcsak G, Tatonetti NP, Vawdrey DK. Low Screening Rates for Diabetes Mellitus Among Family Members of Affected Relatives. AMIA Annu Symp Proc 2018; 2018:1471-1477. [PMID: 30815192 PMCID: PMC6371358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Cardiovascular disease is the leading cause of death in the United States, and abnormal blood glucose is an important risk factor. Delayed diagnosis of diabetes mellitus can increase patients' morbidity. In an urban academic medical center with a large clinical data warehouse, we used a novel algorithm to identify 56,794 family members of diabetic patients who were eligible for disease screening. We found that 30.6% of patients did not receive diabetes screening as recommended by current guidelines. Further, our analysis showed that having more than one affected family member and being female were important contributors to being screened for diabetes mellitus. This study demonstrates that informatics methods applied to electronic health record data can be used to identify patients at risk for disease development, and therefore support clinical care.
Affiliation(s)
- Ning Shang
  - Department of Biomedical Informatics, Columbia University, New York, NY
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, New York, NY
- David K Vawdrey
  - Value Institute, NewYork-Presbyterian Hospital, New York, NY
  - Department of Biomedical Informatics, Columbia University, New York, NY
43
Spagnolo F, Cristofari P, Tatonetti NP, Ginzburg LR, Dykhuizen DE. Pathogen population structure can explain hospital outbreaks. ISME J 2018; 12:2835-2843. [PMID: 30046167 PMCID: PMC6246595 DOI: 10.1038/s41396-018-0235-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/30/2018] [Accepted: 06/22/2018] [Indexed: 02/07/2023]
Abstract
Hospitalized patients are at risk for increased length of stay, illness, or death due to hospital acquired infections. The majority of hospital transmission models describe dynamics on the level of the host rather than on the level of the pathogens themselves. Accordingly, epidemiologists often cannot complete transmission chains without direct evidence of either host-host contact or a large reservoir population. Here, we propose an ecology-based model to explain the transmission of pathogens in hospitals. The model is based upon metapopulation biology, which describes a group of interacting localized populations, and island biogeography, which provides a basis for how pathogens may be moving between locales. Computational simulation trials are used to assess the applicability of the model. Results indicate that pathogens survive for extended periods without the need for large reservoirs by living in localized ephemeral populations while continuously transmitting pathogens to new seed populations. Computational simulations show small populations spending significant portions of time at sizes too small to be detected by most surveillance protocols and that the number and type of these ephemeral populations enable the overall pathogen population to be sustained. By modeling hospital pathogens as a metapopulation, many observations characteristic of hospital acquired infection outbreaks for which there has previously been no sufficient biological explanation, including how and why empirically successful interventions work, can now be accounted for using population dynamic hypotheses. Epidemiological links between temporally isolated outbreaks are explained via pathogen population dynamics and potential outbreak intervention targets are identified.
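The abstract does not give the model's equations, so the following is only a sketch of the classical Levins patch-occupancy model from metapopulation biology, not the authors' simulation. It illustrates the general point that a population of short-lived local patches can persist without a large reservoir as long as colonization outpaces local extinction. The function name `simulate_levins` and all parameter values are assumptions made for illustration.

```python
def simulate_levins(colonization, extinction, p0=0.05, dt=0.01, steps=20_000):
    """Integrate dp/dt = c*p*(1 - p) - e*p with forward Euler steps.

    p is the fraction of habitable patches currently occupied,
    c the colonization rate, and e the local extinction rate.
    """
    p = p0
    for _ in range(steps):
        p += dt * (colonization * p * (1.0 - p) - extinction * p)
    return p


# Colonization faster than extinction: occupancy settles near 1 - e/c.
persisting = simulate_levins(colonization=0.5, extinction=0.2)
print(round(persisting, 4))  # 0.6

# Extinction faster than colonization: the metapopulation dies out.
collapsing = simulate_levins(colonization=0.2, extinction=0.5)
print(collapsing < 1e-6)  # True
```

In this textbook model each individual patch is ephemeral, yet the occupied fraction stabilizes at 1 - e/c whenever c exceeds e, which is the qualitative behavior the abstract describes for hospital pathogens.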
Affiliation(s)
- Fabrizio Spagnolo
  - Ecology, Evolution and Environmental Biology Department, Columbia University, New York, NY, 10027, USA
- Pierre Cristofari
  - Department of Biomedical Informatics, Columbia University Medical Center, New York, NY, 10032, USA
  - Astronomy Department, Columbia University, New York, NY, 10027, USA
- Nicholas P Tatonetti
  - Department of Biomedical Informatics, Columbia University Medical Center, New York, NY, 10032, USA
  - Department of Systems Biology, Columbia University Medical Center, New York, NY, 10032, USA
  - Department of Medicine, Columbia University, New York, NY, 10032, USA
- Daniel E Dykhuizen
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, 11794, USA
44
Ta CN, Dumontier M, Hripcsak G, Tatonetti NP, Weng C. Columbia Open Health Data, clinical concept prevalence and co-occurrence from electronic health records. Sci Data 2018; 5:180273. [PMID: 30480666 PMCID: PMC6257042 DOI: 10.1038/sdata.2018.273] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 07/09/2018] [Accepted: 10/16/2018] [Indexed: 12/11/2022] Open
Abstract
Columbia Open Health Data (COHD) is a publicly accessible database of electronic health record (EHR) prevalence and co-occurrence frequencies between conditions, drugs, procedures, and demographics. COHD was derived from Columbia University Irving Medical Center's Observational Health Data Sciences and Informatics (OHDSI) database. The lifetime dataset, derived from all records, contains 36,578 single concepts (11,952 conditions, 12,334 drugs, and 10,816 procedures) and 32,788,901 concept pairs from 5,364,781 patients. The 5-year dataset, derived from records from 2013-2017, contains 29,964 single concepts (10,159 conditions, 10,264 drugs, and 8,270 procedures) and 15,927,195 concept pairs from 1,790,431 patients. Exclusion of rare concepts (count ≤ 10) and Poisson randomization enable data sharing by eliminating risks to patient privacy. EHR prevalences are informative of healthcare consumption rates. Analysis of co-occurrence frequencies via relative frequency analysis and the observed-expected frequency ratio is informative of associations between clinical concepts, useful for biomedical research tasks such as drug repurposing and pharmacovigilance. COHD is publicly accessible through a web application-programming interface (API) and downloadable from the Figshare repository. The code is available on GitHub.
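The two association measures named in the abstract can be sketched from raw counts as follows. The definitions below (relative frequency as the pair count divided by one concept's count, and the expected pair count under independence as the product of the two single-concept counts divided by the number of patients) are common conventions assumed here, not formulas quoted from the paper. All counts are hypothetical, the helper names are invented for illustration, and Poisson randomization is not shown.

```python
import math


def relative_frequency(pair_count, base_count):
    """Fraction of patients with the base concept who also have the paired concept."""
    if base_count <= 0:
        raise ValueError("base_count must be positive")
    return pair_count / base_count


def log_observed_expected(pair_count, count_a, count_b, n_patients):
    """Natural log of the observed pair count over the count expected under independence."""
    expected = count_a * count_b / n_patients
    return math.log(pair_count / expected)


def suppress_rare(count, threshold=10):
    """Return None for counts at or below the threshold (count <= 10 in the abstract)."""
    return None if count <= threshold else count


# Hypothetical counts: 1,000,000 patients, concept A in 50,000, concept B in 20,000,
# and both concepts together in 4,000.
n_patients, count_a, count_b, pair_count = 1_000_000, 50_000, 20_000, 4_000

print(relative_frequency(pair_count, count_a))  # 0.08
print(round(log_observed_expected(pair_count, count_a, count_b, n_patients), 3))  # 1.386
print(suppress_rare(7), suppress_rare(11))  # None 11
```

Here the expected pair count is 1,000, so the pair is observed four times more often than independence would predict, giving a log ratio of ln 4.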
Affiliation(s)
- Casey N. Ta
  - Department of Biomedical Informatics, Columbia University, NY, USA
- Michel Dumontier
  - Institute of Data Science, Maastricht University, Maastricht, The Netherlands
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University, NY, USA
- Nicholas P. Tatonetti
  - Department of Biomedical Informatics, Columbia University, NY, USA
  - Department of Systems Biology, Columbia University, NY, USA
  - Department of Medicine, Columbia University, NY, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, NY, USA
45
Castillero E, Ali ZA, Akashi H, Giangreco N, Wang C, Stöhr EJ, Ji R, Zhang X, Kheysin N, Park JES, Hegde S, Patel S, Stein S, Cuenca C, Leung D, Homma S, Tatonetti NP, Topkara VK, Takeda K, Colombo PC, Naka Y, Sweeney HL, Schulze PC, George I. Structural and functional cardiac profile after prolonged duration of mechanical unloading: potential implications for myocardial recovery. Am J Physiol Heart Circ Physiol 2018; 315:H1463-H1476. [PMID: 30141986 PMCID: PMC6297806 DOI: 10.1152/ajpheart.00187.2018] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/28/2018] [Revised: 07/18/2018] [Accepted: 08/02/2018] [Indexed: 11/22/2022]
Abstract
Clinical and experimental studies have suggested that the duration of left ventricular assist device (LVAD) support may affect remodeling of the failing heart. We aimed to 1) characterize the changes in Ca2+/calmodulin-dependent protein kinase type-IIδ (CaMKIIδ), growth signaling, structural proteins, fibrosis, apoptosis, and gene expression before and after LVAD support and 2) assess whether the duration of support correlated with improvement or worsening of reverse remodeling. Left ventricular apex tissue and serum pairs were collected in patients with dilated cardiomyopathy (n = 25, 23 men and 2 women) at LVAD implantation and after LVAD support at cardiac transplantation/LVAD explantation. Normal cardiac tissue was obtained from healthy hearts (n = 4) and normal serum from age-matched control hearts (n = 4). The duration of LVAD support ranged from 48 to 1,170 days (median duration: 270 days). LVAD support was associated with CaMKIIδ activation, increased nuclear myocyte enhancer factor 2, sustained histone deacetylase-4 phosphorylation, increased circulating and cardiac myostatin (MSTN) and MSTN signaling mediated by SMAD2, ongoing structural protein dysregulation and sustained fibrosis and apoptosis (all P < 0.05). Increased CaMKIIδ phosphorylation, nuclear myocyte enhancer factor 2, and cardiac MSTN significantly correlated with the duration of support. Phosphorylation of SMAD2 and apoptosis decreased with a shorter duration of LVAD support but increased with a longer duration of LVAD support. Further study is needed to define the optimal duration of LVAD support in patients with dilated cardiomyopathy.
NEW & NOTEWORTHY A long duration of left ventricular assist device support may be detrimental for myocardial recovery, based on myocardial tissue experiments in patients with prolonged support showing significantly worsened activation of Ca2+/calmodulin-dependent protein kinase-IIδ, increased nuclear myocyte enhancer factor 2, increased myostatin and its signaling by SMAD2, and apoptosis as well as sustained histone deacetylase-4 phosphorylation, structural protein dysregulation, and fibrosis.
Affiliation(s)
- Estibaliz Castillero
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Ziad A Ali
  - Division of Cardiology, College of Physicians and Surgeons of Columbia University, New York, New York
- Hirokazu Akashi
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Nicholas Giangreco
  - Department of Biomedical Informatics, Systems Biology, Institute for Genomic Medicine, Data Science Institute, Columbia University, New York, New York
- Catherine Wang
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Eric J Stöhr
  - Division of Cardiology, College of Physicians and Surgeons of Columbia University, New York, New York
  - School of Sport and Health Sciences, Cardiff Metropolitan University, Cardiff, United Kingdom
- Ruping Ji
  - Division of Cardiology, College of Physicians and Surgeons of Columbia University, New York, New York
- Xiaokan Zhang
  - Division of Cardiology, College of Physicians and Surgeons of Columbia University, New York, New York
- Nathaniel Kheysin
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Joo-Eun S Park
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Sheetal Hegde
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Sanatkumar Patel
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Samantha Stein
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Carlos Cuenca
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Diana Leung
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Shunichi Homma
  - Division of Cardiology, College of Physicians and Surgeons of Columbia University, New York, New York
- Nicholas P Tatonetti
  - Department of Biomedical Informatics, Systems Biology, Institute for Genomic Medicine, Data Science Institute, Columbia University, New York, New York
- Veli K Topkara
  - Division of Cardiology, College of Physicians and Surgeons of Columbia University, New York, New York
- Koji Takeda
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- Paolo C Colombo
  - Division of Cardiology, College of Physicians and Surgeons of Columbia University, New York, New York
- Yoshifumi Naka
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
- H Lee Sweeney
  - Department of Pharmacology, University of Florida, Gainesville, Florida
- P Christian Schulze
  - Division of Cardiology, College of Physicians and Surgeons of Columbia University, New York, New York
- Isaac George
  - Division of Cardiothoracic Surgery, College of Physicians and Surgeons of Columbia University, New York, New York
46
Shameer K, Perez-Rodriguez MM, Bachar R, Li L, Johnson A, Johnson KW, Glicksberg BS, Smith MR, Readhead B, Scarpa J, Jebakaran J, Kovatch P, Lim S, Goodman W, Reich DL, Kasarskis A, Tatonetti NP, Dudley JT. Pharmacological risk factors associated with hospital readmission rates in a psychiatric cohort identified using prescriptome data mining. BMC Med Inform Decis Mak 2018; 18:79. [PMID: 30255805 PMCID: PMC6156906 DOI: 10.1186/s12911-018-0653-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Worldwide, over 14% of individuals hospitalized for psychiatric reasons are readmitted to hospital within 30 days after discharge. Predicting patients at risk and leveraging accelerated interventions can reduce the rates of early readmission, a negative clinical outcome (i.e., a treatment failure) that affects patients' quality of life. To implement individualized interventions, it is necessary to predict those individuals at highest risk for 30-day readmission. In this study, our aim was to conduct a data-driven investigation to find the pharmacological factors influencing 30-day all-cause, intra- and interdepartmental readmissions after an index psychiatric admission, using the compendium of prescription data (prescriptome) from electronic medical records (EMR). METHODS The data scientists in the project received a deidentified database from the Mount Sinai Data Warehouse, which was used to perform all analyses. Data was stored in a secured MySQL database, normalized and indexed using a unique hexadecimal identifier associated with the data for psychiatric illness visits. We used Bayesian logistic regression models to evaluate the association of prescription data with 30-day readmission risk. We constructed individual models and compiled results after adjusting for covariates, including drug exposure, age, and gender. We also performed a digital comorbidity survey using EMR data combined with the estimation of shared genetic architecture using genomic annotations to disease phenotypes. RESULTS Using an automated, data-driven approach, we identified prescription medications, side effects (primary side effects), and drug-drug interaction-induced side effects (secondary side effects) associated with readmission risk in a cohort of 1275 patients using prescriptome analytics. In our study, we identified 28 drugs associated with risk for readmission among psychiatric patients.
Based on prescription data, pravastatin was associated with the highest risk of readmission (OR = 13.10; 95% CI (2.82, 60.8)). We also identified enrichment of primary side effects (n = 4006) and secondary side effects (n = 36) induced by prescription drugs in the subset of readmitted patients (n = 89) compared to the non-readmitted subgroup (n = 1186). Digital comorbidity analyses and shared genetic analyses further reveal that cardiovascular disease and psychiatric conditions are comorbid and share functional gene modules (cardiomyopathy and anxiety disorder: shared genes (n = 37; P = 1.06815E-06)). CONCLUSIONS Large-scale prescriptome data is now available from EMRs and accessible for analytics that could improve healthcare outcomes. Such analyses could also drive hypothesis and data-driven research. In this study, we explored the utility of prescriptome data to identify factors driving readmission in a psychiatric cohort. Converging digital health data from EMRs and systems biology investigations reveal that a subset of patient populations with significant cardiovascular comorbidities is more likely to be readmitted. Further, the genetic architecture of psychiatric illness also suggests overlap with cardiovascular diseases. In summary, assessment of medications, side effects, and drug-drug interactions in a clinical setting, as well as genomic information, using a data mining approach could help identify factors that lower readmission rates in patients with mental illness.
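As a quick sanity check on the numbers reported above, the sketch below tests whether the pravastatin interval is symmetric around the odds ratio on the log scale, and recomputes the cohort's readmission rate from the two subgroup sizes. It is plain arithmetic on the reported figures, not a reimplementation of the study's Bayesian logistic regression; the implied standard error assumes a normal approximation with z = 1.96, and the helper name `log_scale_summary` is invented for illustration.

```python
import math


def log_scale_summary(ci_low, ci_high, z=1.96):
    """Return the geometric midpoint of an interval and the implied log-scale standard error."""
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    centre = math.exp((log_low + log_high) / 2)
    standard_error = (log_high - log_low) / (2 * z)
    return centre, standard_error


# Figures as reported in the abstract.
reported_or, ci_low, ci_high = 13.10, 2.82, 60.8
centre, standard_error = log_scale_summary(ci_low, ci_high)
print(round(centre, 2))          # 13.09, close to the reported OR of 13.10
print(round(standard_error, 2))  # 0.78 under the normal-approximation assumption

# Subgroup sizes as reported: 89 readmitted, 1186 not readmitted.
readmitted, not_readmitted = 89, 1186
total = readmitted + not_readmitted
print(total)                              # 1275, matching the stated cohort size
print(round(100 * readmitted / total, 1)) # 7.0 (percent readmitted)
```

The geometric midpoint of the interval lands almost exactly on the reported odds ratio, which is what one would expect if the interval was constructed on the log-odds scale; the wide interval reflects the small number of readmitted patients.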
Affiliation(s)
- Khader Shameer
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Roy Bachar
  - Department of Psychiatry, Mount Sinai Health System, New York, NY, USA
  - Hackensack Meridian Health Hackensack University Medical Center, Hackensack, NJ, USA
- Li Li
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Amy Johnson
  - Department of Psychiatry, Mount Sinai Health System, New York, NY, USA
- Kipp W Johnson
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Benjamin S Glicksberg
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Milo R Smith
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Ben Readhead
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Joseph Scarpa
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Patricia Kovatch
  - Mount Sinai Data Warehouse, Mount Sinai Health System, New York, NY, USA
- Sabina Lim
  - Department of Psychiatry, Mount Sinai Health System, New York, NY, USA
- Wayne Goodman
  - Department of Psychiatry, Mount Sinai Health System, New York, NY, USA
- David L Reich
  - Department of Anesthesiology, Mount Sinai Health System, New York, NY, USA
- Andrew Kasarskis
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
- Nicholas P Tatonetti
  - Departments of Biomedical Informatics, Systems Biology and Medicine, Columbia University, New York, NY, USA
- Joel T Dudley
  - Institute for Next Generation Healthcare, Mount Sinai Health System, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn Institute for Genomics and Multiscale Biology, New York, NY, USA
  - Department of Population Health Science and Policy; Icahn School of Medicine at Mount Sinai, Mount Sinai Health System, New York, NY, USA
47
Tatonetti NP. Science as a Culinary Art: How Data Science and Informatics Will Change Knowledge Discovery for Everyone. Annu Rev Biomed Data Sci 2018. [DOI: 10.1146/annurev-bd-01-041718-100011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Nicholas P. Tatonetti
  - Department of Biomedical Informatics, Department of Systems Biology, Department of Medicine, Institute for Genomic Medicine, and Data Science Institute, Columbia University, New York, NY 10032, USA
48
Hao Y, Quinnies K, Realubit R, Karan C, Tatonetti NP. Tissue-Specific Analysis of Pharmacological Pathways. CPT Pharmacometrics Syst Pharmacol 2018; 7:453-463. [PMID: 29920991 PMCID: PMC6063738 DOI: 10.1002/psp4.12305] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/29/2018] [Revised: 03/19/2018] [Accepted: 04/11/2018] [Indexed: 01/06/2023]
Abstract
Understanding the downstream consequences of pharmacologically targeted proteins is essential to drug design. Current approaches investigate molecular effects under tissue‐naïve assumptions. Many target proteins, however, have tissue‐specific expression. A systematic study connecting drugs to target pathways in in vivo human tissues is needed. We introduced a data‐driven method that integrates drug‐target relationships with gene expression, protein‐protein interaction, and pathway annotation data. We applied our method to four independent genomewide expression datasets and built 467,396 connections between 1,034 drugs and 954 pathways in 259 human tissues or cell lines. We validated our results using data from L1000 and Pharmacogenomics Knowledgebase (PharmGKB), and observed high precision and recall. We predicted and tested anticoagulant effects of 22 compounds experimentally that were previously unknown, and used clinical data to validate these effects retrospectively. Our systematic study provides a better understanding of the cellular response to drugs and can be applied to many research topics in systems pharmacology.
Affiliation(s)
- Yun Hao
- Departments of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, New York, USA
- Kayla Quinnies
- Departments of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, New York, USA
- Ronald Realubit
- Columbia Genome Center, Columbia University, New York, New York, USA
- Charles Karan
- Columbia Genome Center, Columbia University, New York, New York, USA
- Nicholas P Tatonetti
- Departments of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, New York, USA
- Institute for Genomic Medicine, Columbia University, New York, New York, USA
- Data Science Institute, Columbia University, New York, NY, USA
49
Hoffman KB, Dimbil M, Tatonetti NP, Kyle RF. A Pharmacovigilance Signaling System Based on FDA Regulatory Action and Post-Marketing Adverse Event Reports. Drug Saf 2017; 39:561-75. [PMID: 26946292 DOI: 10.1007/s40264-016-0409-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 10/22/2022]
Abstract
INTRODUCTION: Many serious drug adverse events (AEs) only manifest well after regulatory approval. Therefore, the development of signaling methods to use with post-approval AE databases appears vital to comprehensively assess real-world drug safety. However, with millions of potential drug-AE pairs to analyze, the issue of focus is daunting.
OBJECTIVE: Our objective was to develop a signaling platform that focuses on AEs with historically demonstrated regulatory interest and to analyze such AEs with a disproportional reporting method that offers broad signal detection and acceptable false-positive rates.
METHODS: We analyzed over 1500 US FDA regulatory actions (safety communications and drug label changes) from 2008 to 2015 to construct a list of eligible signal AEs. The FDA Adverse Event Reporting System (FAERS) was used to evaluate disproportional reporting rates, constrained by minimum case counts and confidence interval limits, of these selected AEs for 109 training drugs. This step led to 45 AEs that appeared to have a low likelihood of being added to a label by FDA, so they were removed from the signal eligible list. We measured disproportional reporting for the final group of eligible AEs on a test group of 29 drugs that were not used in either the eligible list construction or the training steps.
RESULTS: In a group of 29 test drugs, our model reduced the number of potential drug-AE signals from 41,834 to 97 and predicted 73% of individual drug label changes. The model also predicted at least one AE-drug pair label change in 66% of all the label changes for the test drugs.
CONCLUSIONS: By concentrating on AE types with already demonstrated interest to FDA, we constructed a signaling system that provided focus regarding drug-AE pairs and suitable accuracy with regard to the issuance of FDA labeling changes. We suggest that focus on historical regulatory actions may increase the utility of pharmacovigilance signaling systems.
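The disproportional reporting step described in the Methods can be illustrated with a proportional reporting ratio (PRR), one commonly used disproportionality measure. This is only a sketch: the abstract does not state which measure, which minimum case count, or which confidence limit the authors used, so the choice of PRR, the threshold of 3 cases, the requirement that the lower 95% bound exceed 1, and the function names `prr_with_ci` and `is_signal` are all assumptions made for illustration.

```python
import math


def prr_with_ci(a, b, c, d, z=1.96):
    """Return (PRR, lower, upper) for a 2x2 table of spontaneous reports.

    a: reports mentioning the drug and the adverse event (AE)
    b: reports mentioning the drug and any other AE
    c: reports mentioning other drugs and the AE
    d: reports mentioning other drugs and any other AE
    The interval is the usual log-normal approximation (z=1.96 for ~95%).
    """
    prr = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - z * se_log)
    upper = math.exp(math.log(prr) + z * se_log)
    return prr, lower, upper


def is_signal(a, b, c, d, min_cases=3, min_lower=1.0):
    """Flag a drug-AE pair; both thresholds are illustrative assumptions."""
    if a < min_cases or c == 0:
        return False  # too few cases, or no comparator reports to divide by
    _, lower, _ = prr_with_ci(a, b, c, d)
    return lower > min_lower


if __name__ == "__main__":
    # Made-up counts, not data from the study.
    print(prr_with_ci(20, 980, 100, 98900))
    print(is_signal(20, 980, 100, 98900))
```

Requiring both a minimum case count and a lower confidence bound above 1 mirrors the two constraints named in the abstract: the first guards against ratios built on a handful of reports, and the second against ratios whose uncertainty still includes no disproportionality.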
Affiliation(s)
- Keith B Hoffman
- Advera Health Analytics, Inc., 3663 N. Laughlin Road, Suite 102, Santa Rosa, CA, 95403, USA.
- Mo Dimbil
- Advera Health Analytics, Inc., 3663 N. Laughlin Road, Suite 102, Santa Rosa, CA, 95403, USA
- Robert F Kyle
- Advera Health Analytics, Inc., 3663 N. Laughlin Road, Suite 102, Santa Rosa, CA, 95403, USA
50
Boland MR, Polubriaginof F, Tatonetti NP. Development of A Machine Learning Algorithm to Classify Drugs Of Unknown Fetal Effect. Sci Rep 2017; 7:12839. [PMID: 28993650 PMCID: PMC5634437 DOI: 10.1038/s41598-017-12943-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 07/05/2017] [Accepted: 09/08/2017] [Indexed: 12/11/2022] [Open Access]
Abstract
Many drugs commonly prescribed during pregnancy lack a fetal safety recommendation; these are called FDA 'category C' drugs. This study aims to classify these drugs into harmful and safe categories using knowledge gained from chemoinformatics (i.e., pharmacological similarity with drugs of known fetal effect) and empirical data (i.e., derived from Electronic Health Records). Our fetal loss cohort contains 14,922 affected and 33,043 unaffected pregnancies and our congenital anomalies cohort contains 5,658 affected and 31,240 unaffected infants. We trained a random forest to classify drugs of unknown pregnancy class into harmful or safe categories, focusing on two distinct outcomes: fetal loss and congenital anomalies. Our models achieved an out-of-bag accuracy of 91% for fetal loss and 87% for congenital anomalies, outperforming null models. Fifty-seven 'category C' medications were classified as harmful for fetal loss and eleven for congenital anomalies. This includes medications with documented harmful effects, including naproxen, ibuprofen and rubella live vaccine. We also identified several novel drugs, e.g., haloperidol, that increased the risk of fetal loss. Our approach provides important information on the harmfulness of 'category C' drugs. This is needed, as no FDA recommendation exists for these drugs' fetal safety.
Affiliation(s)
- Mary Regina Boland
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, USA.
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, USA.
- Center of Excellence in Environmental Toxicology, University of Pennsylvania, Philadelphia, USA.
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, USA.
- Department of Biomedical Informatics, Columbia University, New York, USA.
- Department of Medicine, Columbia University, New York, USA.
- Department of Systems Biology, Columbia University, New York, USA.
- Observational Health Data Sciences and Informatics, Columbia University, New York, USA.
- Fernanda Polubriaginof
- Department of Biomedical Informatics, Columbia University, New York, USA
- Department of Medicine, Columbia University, New York, USA
- Department of Systems Biology, Columbia University, New York, USA
- Nicholas P Tatonetti
- Department of Biomedical Informatics, Columbia University, New York, USA.
- Department of Medicine, Columbia University, New York, USA.
- Department of Systems Biology, Columbia University, New York, USA.
- Observational Health Data Sciences and Informatics, Columbia University, New York, USA.