51
|
McPherson JD, Marra M, Hillier L, Waterston RH, Chinwalla A, Wallis J, Sekhon M, Wylie K, Mardis ER, Wilson RK, Fulton R, Kucaba TA, Wagner-McPherson C, Barbazuk WB, Gregory SG, Humphray SJ, French L, Evans RS, Bethel G, Whittaker A, Holden JL, McCann OT, Dunham A, Soderlund C, Scott CE, Bentley DR, Schuler G, Chen HC, Jang W, Green ED, Idol JR, Maduro VV, Montgomery KT, Lee E, Miller A, Emerling S, Gibbs R, Scherer S, Gorrell JH, Sodergren E, Clerc-Blankenburg K, Tabor P, Naylor S, Garcia D, de Jong PJ, Catanese JJ, Nowak N, Osoegawa K, Qin S, Rowen L, Madan A, Dors M, Hood L, Trask B, Friedman C, Massa H, Cheung VG, Kirsch IR, Reid T, Yonescu R, Weissenbach J, Bruls T, Heilig R, Branscomb E, Olsen A, Doggett N, Cheng JF, Hawkins T, Myers RM, Shang J, Ramirez L, Schmutz J, Velasquez O, Dixon K, Stone NE, Cox DR, Haussler D, Kent WJ, Furey T, Rogic S, Kennedy S, Jones S, Rosenthal A, Wen G, Schilhabel M, Gloeckner G, Nyakatura G, Siebert R, Schlegelberger B, Korenberg J, Chen XN, Fujiyama A, Hattori M, Toyoda A, Yada T, Park HS, Sakaki Y, Shimizu N, Asakawa S, Kawasaki K, Sasaki T, Shintani A, Shimizu A, Shibuya K, Kudoh J, Minoshima S, Ramser J, Seranski P, Hoff C, Poustka A, Reinhardt R, Lehrach H. A physical map of the human genome. Nature 2001; 409:934-41. [PMID: 11237014 DOI: 10.1038/35057157] [Citation(s) in RCA: 549] [Impact Index Per Article: 23.9] [Indexed: 12/13/2022]
Abstract
The human genome is by far the largest genome to be sequenced, and its size and complexity present many challenges for sequence assembly. The International Human Genome Sequencing Consortium constructed a map of the whole genome to enable the selection of clones for sequencing and for the accurate assembly of the genome sequence. Here we report the construction of the whole-genome bacterial artificial chromosome (BAC) map and its integration with previous landmark maps and information from mapping efforts focused on specific chromosomal regions. We also describe the integration of sequence data with the map.
|
52
|
Friedman C, Gatti G, Elstein A, Franz T, Murphy G, Wolf F. Are clinicians correct when they believe they are correct? Implications for medical decision support. Stud Health Technol Inform 2001; 84:454-8. [PMID: 11604781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2023]
Abstract
The process of clinical decision support is linked to the validity of clinicians' confidence in their judgments. Clinicians who are appropriately confident (highly confident when they are correct and less confident when they are incorrect) will access computer-based and other information resources only when they are needed. Clinicians who are consistently underconfident will rely on external resources when they are not needed. Those who are overconfident, who believe they are correct when in fact they are not, will be prone to medical errors. An extensive literature indicates a general tendency toward overconfidence in human judgment. This study explores the relationship between confidence and "correctness", across three levels of clinical experience, in the task domain of diagnosis in internal medicine. We created detailed synopses of 36 diagnostically challenging cases and divided them into four equivalent sets of nine cases each. We asked 216 subjects at three experience levels (72 senior medical students, 72 senior medical residents, and 72 faculty attendings) to generate a differential diagnosis for each of the nine cases in one randomly-assigned set, and simultaneously to indicate their level of confidence in each of their diagnoses. We then examined the relationship between the correctness of these diagnoses (the appearance of the correct diagnosis anywhere in the hypothesis list) and these confidence judgments, for all subjects and separately for subjects at each experience level. Results indicate a small but statistically significant relationship associating correctness with higher confidence for all subjects (Kendall's tau-b = -.106; p < .0001). This statistical relationship is strongest for the students (tau-b = -.121; p < .001), somewhat weaker but still significant for the faculty-level attendings (tau-b = -.103; p < .005), and non-significant (tau-b = -.041) for the residents. (The negative correlations are a coding artifact.)
Subjects in this study showed a tendency toward underconfidence: they had low confidence in correct diagnoses more often than they had high confidence when wrong. Nonetheless, they were overconfident and thus "error prone" for 17% of cases overall. The medical students were possibly overmatched by the difficulty of the cases, so their concordance between confidence and correctness may have resulted from an awareness that they were often guessing. The relatively low concordance seen in the residents and attendings makes a strong argument that decision support systems to reduce medical errors should include both "push" and "pull" models. In sum, these results indicate that medical decision support systems cannot rely exclusively on clinicians' perceptions of their information needs, as such perceptions will frequently be incorrect.
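The abstract above reports Kendall's tau-b between correctness and confidence. As a minimal sketch of how that statistic is computed (not the authors' analysis code; the function name and the sample data are hypothetical), the pairwise definition with tie correction can be written as:

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length numeric sequences (tie-corrected)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both denominator terms
        elif dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical data: 1 = correct diagnosis, 0 = incorrect; confidence on a 1-3 scale.
correct = [1, 1, 0, 0]
confidence = [3, 2, 2, 1]
print(round(kendall_tau_b(correct, confidence), 2))  # 0.67
```

Reversing the coding of either variable (for example, scoring high confidence as 1 and low confidence as 3) flips the sign of the coefficient without changing its magnitude, which is one way a negative value can arise from a positive association.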
|
53
|
The BAC Resource Consortium, Cheung VG, Nowak N, Jang W, Kirsch IR, Zhao S, Chen XN, Furey TS, Kim UJ, Kuo WL, Olivier M, Conroy J, Kasprzyk A, Massa H, Yonescu R, Sait S, Thoreen C, Snijders A, Lemyre E, Bailey JA, Bruzel A, Burrill WD, Clegg SM, Collins S, Dhami P, Friedman C, Han CS, Herrick S, Lee J, Ligon AH, Lowry S, Morley M, Narasimhan S, Osoegawa K, Peng Z, Plajzer-Frick I, Quade BJ, Scott D, Sirotkin K, Thorpe AA, Gray JW, Hudson J, Pinkel D, Ried T, Rowen L, Shen-Ong GL, Strausberg RL, Birney E, Callen DF, Cheng JF, Cox DR, Doggett NA, Carter NP, Eichler EE, Haussler D, Korenberg JR, Morton CC, Albertson D, Schuler G, de Jong PJ, Trask BJ. Integration of cytogenetic landmarks into the draft sequence of the human genome. Nature 2001; 409:953-8. [PMID: 11237021 PMCID: PMC7845515 DOI: 10.1038/35057192] [Citation(s) in RCA: 203] [Impact Index Per Article: 8.8] [Indexed: 12/31/2022]
Abstract
We have placed 7,600 cytogenetically defined landmarks on the draft sequence of the human genome to help with the characterization of genes altered by gross chromosomal aberrations that cause human disease. The landmarks are large-insert clones mapped to chromosome bands by fluorescence in situ hybridization. Each clone contains a sequence tag that is positioned on the genomic sequence. This genome-wide set of sequence-anchored clones allows structural and functional analyses of the genome. This resource represents the first comprehensive integration of cytogenetic, radiation hybrid, linkage and sequence maps of the human genome; provides an independent validation of the sequence map and framework for contig order and orientation; surveys the genome for large-scale duplications, which are likely to require special attention during sequence assembly; and allows a stringent assessment of sequence differences between the dark and light bands of chromosomes. It also provides insight into large-scale chromatin structure and the evolution of chromosomes and gene families and will accelerate our understanding of the molecular bases of human disease and cancer.
|
54
|
Lussier YA, Shagina L, Friedman C. Automating SNOMED coding using medical language understanding: a feasibility study. Proc AMIA Symp 2001:418-22. [PMID: 11825222 PMCID: PMC2243482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
This paper qualitatively evaluates the use of the MedLEE natural language processing system to code medical narratives directly into the SNOMED nomenclature, while retaining the MedLEE information model data structure. A gold standard is produced from narrative text manually coded in SNOMED. MedLEE then automatically parses the narrative text and codes it in SNOMED. By comparing MedLEE's output to that of the gold standard, the capacities of SNOMED and MedLEE to represent the clinical information are subsequently evaluated, leading to qualitative observations on their respective strengths and constraints. In this study, MedLEE coded to SNOMED and captured the codes in a sub-structure amenable to interoperability with the description logic of SNOMED RT, showing an approach that augments and formalizes SNOMED's compositional representation methods to accurately capture information from clinical narratives.
|
55
|
Friedman C, Liu H, Shagina L, Johnson S, Hripcsak G. Evaluating the UMLS as a source of lexical knowledge for medical language processing. Proc AMIA Symp 2001:189-93. [PMID: 11825178 PMCID: PMC2243298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.
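The abstract above reports accuracy, sensitivity, and specificity. As a brief sketch of the standard definitions of those measures (the function name and the counts are made up for illustration and are not taken from the study), they follow directly from the four cells of a confusion matrix:

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: positive cases the system found / missed.
    tn/fp: negative cases the system correctly rejected / wrongly flagged.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives rejected
    }


# Hypothetical counts, chosen only to show the arithmetic.
metrics = classification_metrics(tp=8, fp=2, tn=18, fn=2)
print(metrics["sensitivity"], metrics["specificity"])  # 0.8 0.9
```

Because sensitivity and specificity use different denominators, a lexicon can raise one while lowering the other, which is why the abstract reports all three measures rather than accuracy alone.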
|
56
|
Liu H, Lussier YA, Friedman C. A study of abbreviations in the UMLS. Proc AMIA Symp 2001:393-7. [PMID: 11825217 PMCID: PMC2243414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Abbreviations are widely used in medicine. The understanding of abbreviations is important for medical language processing and information retrieval systems. The Unified Medical Language System (UMLS) contains a large number of abbreviations. We hypothesized that extracting and studying the UMLS abbreviations can be helpful for understanding the characteristics of abbreviations in medicine. In this paper, we describe a method for extracting abbreviations from the UMLS. We evaluated the method and studied the ambiguous nature of the abbreviations. In addition, the coverage of the UMLS abbreviations in medical reports was studied. Using our method, we extracted 163,666 unique (abbreviation, full form) pairs from the UMLS with a precision of 97.5%, and a recall of 96%. The UMLS abbreviations were highly ambiguous: 33.1% of abbreviations with six characters or less had multiple meanings; the average number of different full forms for all abbreviations with six characters or less was 2.28. The coverage of the UMLS abbreviations in medical reports was over 66%.
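The two ambiguity figures in the abstract above (the share of short abbreviations with multiple meanings, and the average number of full forms per abbreviation) can be computed from a list of (abbreviation, full form) pairs. The following is only an illustrative sketch: the function name, the length cutoff parameter, and the sample pairs are assumptions, not the authors' extraction method or data.

```python
from collections import defaultdict


def ambiguity_stats(pairs, max_len=6):
    """Return (fraction of ambiguous abbreviations, mean full forms per abbreviation).

    Only abbreviations of at most `max_len` characters are counted.
    """
    forms = defaultdict(set)
    for abbreviation, full_form in pairs:
        if len(abbreviation) <= max_len:
            forms[abbreviation].add(full_form)
    if not forms:
        return 0.0, 0.0
    ambiguous = sum(1 for full_forms in forms.values() if len(full_forms) > 1)
    mean_forms = sum(len(full_forms) for full_forms in forms.values()) / len(forms)
    return ambiguous / len(forms), mean_forms


# Hypothetical pairs for illustration only.
sample_pairs = [
    ("PE", "pulmonary embolism"),
    ("PE", "physical examination"),
    ("CHF", "congestive heart failure"),
    ("MI", "myocardial infarction"),
]
fraction, mean_forms = ambiguity_stats(sample_pairs)
print(round(fraction, 2), round(mean_forms, 2))  # 0.33 1.33
```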
|
57
|
Krauthammer M, Rzhetsky A, Morozov P, Friedman C. Using BLAST for identifying gene and protein names in journal articles. Gene 2000; 259:245-52. [PMID: 11163982 DOI: 10.1016/s0378-1119(00)00431-5] [Citation(s) in RCA: 80] [Impact Index Per Article: 3.3] [Indexed: 10/18/2022]
Abstract
We describe a system which automatically identifies gene and protein names in journal articles, an important and non-trivial first step in knowledge extraction of protein and gene actions. Our system uses a database of gene and protein names and is based on BLAST [Altschul et al., Nucleic Acids Res. 25 (1997) 3389-3402], a popular tool for DNA and protein sequence comparison. We describe a method that consists of mapping sequences of text characters into sequences of nucleotides that can be processed by BLAST. We demonstrate that this approach is feasible: the system matches gene and protein names with a recall of 78.8% and a precision of 71.7%, which includes names that are not part of the system database. An analysis of the results suggests techniques that can be used to improve performance further.
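The core idea in the abstract above, mapping sequences of text characters into sequences of nucleotides, can be sketched as follows. This is an assumption-laden illustration rather than the paper's actual encoding: the abstract does not give the code table, so the fixed-length three-nucleotide code, the character set, and the function names here are all hypothetical.

```python
from itertools import product
import string

# Assumed character set: lowercase letters, digits, space, and hyphen (38 symbols).
ALPHABET = string.ascii_lowercase + string.digits + " -"
# All 64 three-letter words over A, C, G, T; one is assigned to each character.
CODES = ["".join(p) for p in product("ACGT", repeat=3)]
ENCODE = dict(zip(ALPHABET, CODES))
DECODE = {code: ch for ch, code in ENCODE.items()}


def text_to_nucleotides(text):
    """Encode text as a nucleotide string; characters outside ALPHABET are dropped."""
    return "".join(ENCODE[ch] for ch in text.lower() if ch in ENCODE)


def nucleotides_to_text(sequence):
    """Decode a nucleotide string produced by text_to_nucleotides."""
    return "".join(DECODE[sequence[i:i + 3]] for i in range(0, len(sequence), 3))


encoded = text_to_nucleotides("Interleukin-2")
print(len(encoded))                  # 39
print(nucleotides_to_text(encoded))  # interleukin-2
```

Once names and article text are both in this form, a sequence-comparison tool can tolerate small spelling differences in the same way it tolerates point mutations, which is the property the abstract relies on to match names that are not in the database verbatim.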
|
58
|
Giorgi D, Friedman C, Trask BJ, Rouquier S. Characterization of nonfunctional V1R-like pheromone receptor sequences in human. Genome Res 2000; 10:1979-85. [PMID: 11116092 PMCID: PMC313059 DOI: 10.1101/gr.10.12.1979] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.5] [Indexed: 11/24/2022]
Abstract
The vomeronasal organ (VNO) or Jacobson's organ is responsible in terrestrial vertebrates for the sensory perception of pheromones, chemicals that elicit stereotyped behaviors among individuals of the same species. Pheromone-induced behaviors and a functional VNO have been described in a number of mammals, but the existence of this sensory system in human is still debated. Recently, two nonhomologous gene families, V1R and V2R, encoding pheromone receptors have been identified in rat. These receptors belong to the seven-transmembrane domain G-protein-coupled receptor superfamily. We sought to characterize V1R-like genes in the human genome. We have identified seven different human sequences by PCR and library screening with rodent sequences. These human sequences exhibit characteristic features of V1R receptors and show 52%-59% of amino acid sequence identity with the rat sequences. Using PCR on a monochromosomal somatic cell hybrid panel and/or FISH, we demonstrate that these V1R-like sequences are distributed on chromosomes 7, 16, 20, 13, 14, 15, 21, and 22 and possibly on additional chromosomes. One sequence hybridizes to pericentromeric locations on all the acrocentric chromosomes (13, 14, 15, 21, and 22). All of the seven V1R-like sequences analyzed show interrupted reading frames, indicating that they represent nonfunctional pseudogenes. The preponderance of pseudogenes among human V1R sequences and the striking anatomical differences between rodent and human VNO raise the possibility that humans may have lost the V1R/VNO-mediated sensory functions of rodents.
|
59
|
Rzhetsky A, Koike T, Kalachikov S, Gomez SM, Krauthammer M, Kaplan SH, Kra P, Russo JJ, Friedman C. A knowledge model for analysis and simulation of regulatory networks. Bioinformatics 2000; 16:1120-8. [PMID: 11159331 DOI: 10.1093/bioinformatics/16.12.1120] [Citation(s) in RCA: 43] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022]
Abstract
MOTIVATION: In order to aid in hypothesis-driven experimental gene discovery, we are designing a computer application for the automatic retrieval of signal transduction data from electronic versions of scientific publications using natural language processing (NLP) techniques, as well as for visualizing and editing representations of regulatory systems. These systems describe both signal transduction and biochemical pathways within complex multicellular organisms, yeast, and bacteria. This computer application in turn requires the development of a domain-specific ontology, or knowledge model. RESULTS: We introduce an ontological model for the representation of biological knowledge related to regulatory networks in vertebrates. We outline a taxonomy of the concepts, define their 'whole-to-part' relationships, describe the properties of major concepts, and outline a set of the most important axioms. The ontology is partially realized in a computer system designed to aid researchers in biology and medicine in visualizing and editing a representation of a signal transduction system.
|
60
|
Friedman C. Infection control outside the hospital: developing a continuum of care. The Quality Letter for Healthcare Leaders 2000; 12:12-3, 1. [PMID: 10947527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
Abstract
Pressures to limit or eliminate more expensive inpatient care have led the way to rapidly expanded use of ambulatory care and extended care services. A consensus panel of infection control specialists has devised new recommendations on what can be addressed when infections occur outside the hospital.
|
61
|
Catalano PJ, Post K, Sen C, Costantino P, Friedman C. Prevention of cerebrospinal fluid rhinorrhea in neurotologic surgery. The American Journal of Otology 2000; 21:265-9. [PMID: 10733195 DOI: 10.1016/s0196-0709(00)80020-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
OBJECTIVE: To determine the efficacy and safety of quick-setting hydroxyapatite cement in eliminating cerebrospinal fluid (CSF) rhinorrhea following neurotologic surgery. STUDY DESIGN: A prospective study of 40 consecutive patients undergoing neurotologic surgery in whom the dura was opened. SETTING: All patients were treated as hospital inpatients at a tertiary referral center. PATIENTS: 25 men and 15 women between the ages of 20 and 72 years (mean age 51 years) underwent neurotologic surgery at the parent institution. INTERVENTION: Various neurotologic procedures were performed for the resection of 25 acoustic tumors, 5 meningiomas, 3 glomus tumors, 2 vestibular nerve sections, 2 chordomas, 1 epidermoid tumor, and 1 meningoencephalocele, and for 2 patients referred to our institution with known CSF leaks following acoustic tumor surgery. A new form of quick-setting hydroxyapatite cement, which hardens within 3 to 5 minutes, was used to seal the air cell tracts of the temporal bone in all cases. MAIN OUTCOME MEASURE: The presence of CSF rhinorrhea postoperatively. RESULTS: CSF rhinorrhea occurred in 2 patients following acoustic tumor surgery, the first through an occult air cell tract at the margin of the drilled internal auditory canal, and the second via an oval window fistula 1 month after a translabyrinthine approach. CONCLUSIONS: This form of hydroxyapatite cement appears safe, reliable, effective, and economical for the prevention of CSF rhinorrhea following neurotologic surgery. CSF rhinorrhea cannot be eliminated unless our ability to identify all potential air cell tract communications improves.
|
62
|
Lin B, White JT, Ferguson C, Bumgarner R, Friedman C, Trask B, Ellis W, Lange P, Hood L, Nelson PS. PART-1: a novel human prostate-specific, androgen-regulated gene that maps to chromosome 5q12. Cancer Res 2000; 60:858-63. [PMID: 10706094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Genes regulated by androgenic hormones are of critical importance for the normal physiological function of the human prostate gland, and they contribute to the development and progression of prostate carcinoma. We used cDNA microarrays containing 1500 prostate-derived cDNAs to profile transcripts regulated by androgens in prostate cancer cells. This study identified a novel gene that we have designated PART-1 (prostate androgen-regulated transcript 1), which exhibited increased expression upon exposure to androgens in the LNCaP prostate cancer cell line. Northern analysis demonstrated that PART-1 is highly expressed in the prostate gland relative to other normal human tissues and is expressed as different transcripts using at least three different polyadenylation signals. The PART-1 cDNA and putative protein are not significantly homologous to any sequences in the nonredundant public sequence databases. Cloning and analysis of the putative PART-1 promoter region identified a potential binding site for the homeobox gene PBX-1a, but no consensus androgen response element or sterol-regulatory element binding sites were identified. We used a radiation hybrid panel and fluorescence in situ hybridization to map the PART-1 gene to chromosome 5q12, a region that has been suggested to harbor a prostate tumor suppressor gene. These results identify a new gene involved in the androgen receptor-regulated gene network of the human prostate that may play a role in the etiology of prostate carcinogenesis.
|
63
|
Elkins JS, Friedman C, Boden-Albala B, Sacco RL, Hripcsak G. Coding neuroradiology reports for the Northern Manhattan Stroke Study: a comparison of natural language processing and manual review. Computers and Biomedical Research, an International Journal 2000; 33:1-10. [PMID: 10772780 DOI: 10.1006/cbmr.1999.1535] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.0] [Indexed: 11/22/2022]
Abstract
Automated systems using natural language processing may greatly speed chart review tasks for clinical research, but their accuracy in this setting is unknown. The objective of this study was to compare the accuracy of automated and manual coding in the data acquisition tasks of an ongoing clinical research study, the Northern Manhattan Stroke Study (NOMASS). We identified 471 neuroradiology reports of brain images used in the NOMASS study. Using both automated and manual coding, we completed a standardized NOMASS imaging form with the information contained in these reports. We then generated ROC curves for both manual and automated coding by comparing our results to the original NOMASS data, where study investigators directly coded their interpretations of brain images. The areas under the ROC curves for both manual and automated coding were the main outcome measure. The overall predictive value of the automated system (ROC area 0.85, 95% CI 0.84-0.87) was not statistically different from the predictive value of the manual coding (ROC area 0.87, 95% CI 0.83-0.91). Measured in terms of accuracy, the automated system performed slightly worse than manual coding. The overall accuracy of the automated system was 84% (CI 83-85%). The overall accuracy of manual coding was 86% (CI 84-88%). The difference in accuracy between the two methods was small but statistically significant (P = 0.026). Errors in manual coding appeared to be due to differences between neurologists' and neuroradiologists' interpretation, different use of detailed anatomic terms, and lack of clinical information. Automated systems can use natural language processing to rapidly perform complex data acquisition tasks. Although there is a small decrease in the accuracy of the data as compared to traditional methods, automated systems may greatly expand the power of chart review in clinical research design and implementation.
|
64
|
Liu H, Friedman C. A method for vocabulary development and visualization based on medical language processing and XML. Proc AMIA Symp 2000:502-6. [PMID: 11079934 PMCID: PMC2243989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
A comprehensive controlled clinical vocabulary is critical to the effectiveness of many automated clinical systems. Vocabulary development and maintenance is an important aspect of a vocabulary, and should be linked to terms physicians actually use. This paper presents a method to help vocabulary builders capture, visualize, and analyze both compositional and quantitative information related to terms physicians use. The method includes several components: an MLP system, a corpus of relevant reports and a visualization tool based on XML and JAVA.
|
65
|
Barrows Jr RC, Busuioc M, Friedman C. Limited parsing of notational text visit notes: ad-hoc vs. NLP approaches. Proc AMIA Symp 2000:51-5. [PMID: 11079843 PMCID: PMC2243829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
This paper describes the extraction of structured data relevant to glaucoma diagnosis and progression from visit notes typed as "notational text" by ophthalmologists during patient encounters. We compared two text processing systems: a limited pattern matching system called GDP (Glaucoma Dedicated Parser) and MedLEE, a proven natural language processing system which is in routine use encoding findings from chest radiograph and mammogram reports at the New York-Presbyterian hospital's Columbia-Presbyterian Center. We also evaluated the use of GDP as a preprocessor program to transform notational text into constructions recognizable by MedLEE. These systems have been evaluated according to their recall and precision in the particular task of processing a corpus of "notational text" documents to extract information related to glaucoma disease.
|
66
|
Friedman C. A broad-coverage natural language processing system. Proc AMIA Symp 2000:270-4. [PMID: 11079887 PMCID: PMC2243979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
Natural language processing (NLP) systems that extract clinical information from textual reports have been shown to be effective for limited domains and for particular applications. Because an NLP system typically requires substantial resources to develop, it is beneficial if it is designed to be easily extendible to multiple domains and applications. This paper describes multiple extensions of an NLP system called MedLEE, which was originally developed for the domain of radiological reports of the chest, but has subsequently been extended to mammography, discharge summaries, all of radiology, electrocardiography, echocardiography, and pathology.
|
67
|
Deng Y, Madan A, Banta AB, Friedman C, Trask BJ, Hood L, Li L. Characterization, chromosomal localization, and the complete 30-kb DNA sequence of the human Jagged2 (JAG2) gene. Genomics 2000; 63:133-8. [PMID: 10662552 DOI: 10.1006/geno.1999.6045] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
The genomic sequence of the human Jagged2 (JAG2) gene, which encodes a ligand for the Notch receptors, was determined. The 30-kb DNA sequence spanning the JAG2 gene contains 26 exons and a putative promoter region. Several potential binding sites for transcription factors, including NF-kappaB, E47, E12, E2F, Ets-1, MyoD, and OCT-1, were found in the human JAG2 promoter region. The JAG2 gene was also mapped to the chromosomal region 14q32 using fluorescence in situ hybridization.
|
68
|
Friedman C, Dolgin JG. Adverse events of Kytril Injection questioned. Oncol Nurs Forum 1999; 26:1587-9. [PMID: 10573674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/14/2023]
|
69
|
Friedman C, Barnette M, Buck AS, Ham R, Harris JA, Hoffman P, Johnson D, Manian F, Nicolle L, Pearson ML, Perl TM, Solomon SL. Requirements for infrastructure and essential activities of infection control and epidemiology in out-of-hospital settings: a consensus panel report. Association for Professionals in Infection Control and Epidemiology and Society for Healthcare Epidemiology of America. Infect Control Hosp Epidemiol 1999; 20:695-705. [PMID: 10530650 DOI: 10.1086/501569] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.3] [Indexed: 11/04/2022]
Abstract
In 1997 the Association for Professionals in Infection Control and Epidemiology and the Society for Healthcare Epidemiology of America established a consensus panel to develop recommendations for optimal infrastructure and essential activities of infection control and epidemiology programs in out-of-hospital settings. The following report represents the Consensus Panel's best assessment of requirements for a healthy and effective out-of-hospital-based infection control and epidemiology program. The recommendations fall into 5 categories: managing critical data and information; developing and recommending policies and procedures; intervening directly to prevent infections; educating and training of health care workers, patients, and nonmedical caregivers; and resources. The Consensus Panel used an evidence-based approach and categorized recommendations according to modifications of the scheme developed by the Clinical Affairs Committee of the Infectious Diseases Society of America and the Centers for Disease Control and Prevention's Healthcare Infection Control Practices Advisory Committee.
|
70
|
Friedman C, Barnette M, Buck AS, Ham R, Harris JA, Hoffman P, Johnson D, Manian F, Nicolle L, Pearson ML, Perl TM, Solomon SL. Requirements for infrastructure and essential activities of infection control and epidemiology in out-of-hospital settings: a Consensus Panel report. Am J Infect Control 1999; 27:418-30. [PMID: 10511489 DOI: 10.1016/s0196-6553(99)70008-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]
Abstract
In 1997 the Association for Professionals in Infection Control and Epidemiology and the Society for Healthcare Epidemiology of America established a consensus panel to develop recommendations for optimal infrastructure and essential activities of infection control and epidemiology programs in out-of-hospital settings. The following report represents the Consensus Panel's best assessment of requirements for a healthy and effective out-of-hospital-based infection control and epidemiology program. The recommendations fall into 5 categories: managing critical data and information; developing and recommending policies and procedures; intervening directly to prevent infections; educating and training of health care workers, patients, and nonmedical caregivers; and resources. The Consensus Panel used an evidence-based approach and categorized recommendations according to modifications of the scheme developed by the Clinical Affairs Committee of the Infectious Diseases Society of America and the Centers for Disease Control and Prevention's Healthcare Infection Control Practices Advisory Committee.
|
71
|
Friedman C, Hripcsak G. Natural language processing and its future in medicine. Acad Med 1999; 74:890-5. [PMID: 10495728 DOI: 10.1097/00001888-199908000-00012] [Citation(s) in RCA: 87] [Impact Index Per Article: 3.5] [Indexed: 05/23/2023]
Abstract
If accurate clinical information were available electronically, automated applications could be developed to use this information to improve patient care and lower costs. However, to be fully retrievable, clinical information must be structured or coded. Many online patient reports are not coded, but are recorded in natural-language text that cannot be reliably accessed. Natural language processing (NLP) can solve this problem by extracting and structuring text-based clinical information, making clinical data available for use. NLP systems are quite difficult to develop, as they require substantial amounts of knowledge, but progress has definitely been made. Some NLP systems have been developed and tested and have demonstrated promising performance in practical clinical applications; some of these systems have already been deployed. The authors provide background information about NLP, briefly describe some of the systems that have been recently developed, and discuss the future of NLP in medicine.
|
72
|
Friedman C, Elstein A, Wolf F, Murphy G, Franz T, Fine P, Heckerling P, Miller T. Measuring the quality of diagnostic hypothesis sets for studies of decision support. Stud Health Technol Inform 1999; 52 Pt 2:864-8. [PMID: 10384584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2023]
Abstract
Within medical informatics there is widespread interest in computer-based decision support and the evaluation of its impact. It is widely recognized that the measurement of dependent variables, or outcomes, represents the most challenging aspect of this work. This paper describes and reports the reliability and validity of an outcome metric for studies of diagnostic decision support. The results of this study will guide the analytic methods used in our ongoing multi-site study of the effects of decision support on diagnostic reasoning. Our measurement approach conceptualizes the quality of a diagnostic hypothesis set as having two components summed to generate a composite index: a Plausibility Component derived from ratings of each hypothesis in the set, whether correct or incorrect; and a Location Component derived from the location of the correct diagnosis if it appears in the set. The reliability of this metric is determined by the extent of interrater agreement on the plausibility of diagnostic hypotheses. Validity is determined by the extent to which the index generates scores that make sense on inspection (face validity), as well as the extent to which the component scores are non-redundant and discriminate the performance of novices and experts (construct validity). Using data from the pilot and main phases of our ongoing study (n = 124 subjects working 1116 cases), the reliability of our diagnostic quality metric was found to be 0.85-0.88. The metric was found to generate, on inspection, no clearly counterintuitive scores. Using data from the pilot phase of our study (n = 12 subjects working 108 cases), the component scores were moderately correlated (r = 0.68). The composite index, computed by equally weighting both components, was found to discriminate the hypotheses of medical students and attending physicians by 0.97 standard deviation units. Based on these findings, we have adopted this metric for use in our further research exploring the impact of decision support systems on diagnostic reasoning and will make it available to the informatics research community.
|
73
|
Horan-Murphy E, Barnard B, Chenoweth C, Friedman C, Hazuka B, Russell B, Foster M, Goldman C, Bullock P, Docken L, McDonald L. APIC/CHICA-Canada Infection Control and Epidemiology: Professional and Practice Standards. Association for Professionals in Infection Control and Epidemiology, Inc, and the Community and Hospital Infection Control Association-Canada. Am J Infect Control 1999; 27:47-51. [PMID: 10223902 DOI: 10.1016/s0196-6553(99)70073-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.1] [Indexed: 10/25/2022]
|
74
|
Friedman C, Hripcsak G, Shagina L, Liu H. Representing information in patient reports using natural language processing and the extensible markup language. J Am Med Inform Assoc 1999; 6:76-87. [PMID: 9925230 PMCID: PMC61346 DOI: 10.1136/jamia.1999.0060076] [Citation(s) in RCA: 88] [Impact Index Per Article: 3.5] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVE: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model. METHODS: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the XML output that was generated was validated using an XML validating parser. RESULTS: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD. CONCLUSIONS: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. This integrated document model provides a representation where documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. Using an XML model of tagging provides an additional benefit in that software tools that manipulate XML documents are readily available.
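The idea of an enriched document, in which structured elements point back to spans of the original report, can be illustrated with a minimal sketch. This is not the paper's DTD or NLP output: the element names (`report`, `structured`, `finding`, `text`, `sent`), the attributes (`name`, `certainty`, `ref`, `id`), and the sample sentences are all hypothetical, chosen only to show how querying the structured component can retrieve the linked source text. Python's standard `xml.etree.ElementTree` parses the document but does not validate it against a DTD, so the validation step described in the abstract is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical enriched document. Element and attribute names are
# illustrative assumptions, not the DTD described in the paper.
DOC = """<report>
  <structured>
    <finding name="pneumonia" certainty="high" ref="s1"/>
    <finding name="effusion" certainty="no" ref="s2"/>
  </structured>
  <text>
    <sent id="s1">There is evidence of pneumonia in the left lower lobe.</sent>
    <sent id="s2">No pleural effusion is seen.</sent>
  </text>
</report>"""


def find_sentences(xml_text, finding_name):
    """Return the original sentences linked to findings with the given name."""
    root = ET.fromstring(xml_text)
    # Index the original sentences by their id so findings can point at them.
    sentences = {s.get("id"): s.text for s in root.iter("sent")}
    # Query the structured component, then follow each link to the source text.
    return [
        sentences[f.get("ref")]
        for f in root.iter("finding")
        if f.get("name") == finding_name
    ]


print(find_sentences(DOC, "pneumonia"))
```

Because each hypothetical `finding` carries a `ref` to a sentence `id`, the same lookup could also drive highlighting of the salient sentences during manual review.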
|
75
|
Shablinsky I, Starren J, Friedman C. What do ER physicians really want? A method for elucidating ER information needs. Proc AMIA Symp 1999:390-4. [PMID: 10566387 PMCID: PMC2232515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/14/2023] Open
Abstract
Prior discharge summaries are a critical source of information for treating emergency room patients. However, reading discharge summaries may occupy more time than emergency care clinicians can afford. It would be beneficial to present vital information in the reports to them so that they would be able to quickly extract and digest it. There are several possible ways to present the information without changing the structure or content of the report itself. As a prelude to an effective study concerning the efficiency of the various presentation approaches, it is first necessary to know which diagnoses would benefit from past history, and what kind of information is most important to present for each of the diagnoses. In this study, we present a method for elucidating emergency care information needs from clinicians. Analysis of the data obtained from clinicians resulted in generation of a list of important diagnoses and informational categories. For validation, the clinicians were shown sample reports and were asked to highlight critical information. Overall, predicted important items correlated with physicians' highlighting (Pearson correlation coefficient of 0.650, significance level 0.01).
|