1
Peluso A, Danciu I, Yoon HJ, Yusof JM, Bhattacharya T, Spannaus A, Schaefferkoetter N, Durbin EB, Wu XC, Stroup A, Doherty J, Schwartz S, Wiggins C, Coyle L, Penberthy L, Tourassi GD, Gao S. Deep learning uncertainty quantification for clinical text classification. J Biomed Inform 2024; 149:104576. [PMID: 38101690 DOI: 10.1016/j.jbi.2023.104576]
Abstract
INTRODUCTION Machine learning algorithms are expected to work side-by-side with humans in decision-making pipelines, so the ability of classifiers to make reliable decisions is of paramount importance. Deep neural networks (DNNs) are the state-of-the-art models for real-world classification. Although the strength of activation in DNNs is often correlated with the network's confidence, in-depth analyses are needed to establish whether such confidences are well calibrated. METHOD In this paper, we demonstrate the use of DNN-based classification tools to benefit cancer registries by automating information extraction of disease at diagnosis and at surgery from electronic text pathology reports from the US National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) population-based cancer registries. In particular, we introduce multiple methods for selective classification that achieve a target level of accuracy on multiple classification tasks while minimizing the rejection amount, that is, the number of electronic pathology reports for which the model's predictions are unreliable. We evaluate the proposed methods by comparing our approach with the current in-house deep learning-based abstaining classifier. RESULTS Overall, all the proposed selective classification methods achieve the targeted level of accuracy or higher in a trade-off analysis aimed at minimizing the rejection rate. On in-distribution validation and holdout test data, all the proposed methods reach the required target level of accuracy on all tasks with a lower rejection rate than the deep abstaining classifier (DAC). Interpreting the results for the out-of-distribution test data is more complex; nevertheless, in this case as well, the rejection rate of the best among the proposed methods achieving 97% accuracy or higher is lower than the rejection rate of the DAC.
CONCLUSIONS We show that although both approaches can flag the samples that should be manually reviewed and labeled by human annotators, the newly proposed methods retain a larger fraction of samples and do so without retraining, thus offering a reduced computational cost compared with the in-house deep learning-based abstaining classifier.
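The abstract does not reproduce the paper's scoring functions; as a generic, hypothetical sketch of selective classification, one can threshold a model's confidence and reject predictions below the smallest threshold that meets a target accuracy on validation data (all names and values below are invented for illustration):

```python
# Generic selective-classification sketch (not the paper's exact method):
# given (confidence, is_correct) pairs from a validation set, pick the
# lowest confidence threshold whose accepted subset meets a target accuracy.

def select_threshold(preds, target_acc):
    """preds: list of (confidence, is_correct). Returns (threshold, rejection_rate)."""
    for thr in sorted({conf for conf, _ in preds}):
        accepted = [ok for conf, ok in preds if conf >= thr]
        if accepted and sum(accepted) / len(accepted) >= target_acc:
            return thr, 1 - len(accepted) / len(preds)
    return None, 1.0  # no threshold reaches the target: reject everything

# Toy validation predictions (confidence, correct?) -- invented values.
preds = [(0.99, True), (0.95, True), (0.90, False), (0.85, True),
         (0.70, True), (0.60, False), (0.55, False), (0.40, True)]
thr, rej = select_threshold(preds, target_acc=0.8)  # thr=0.70, rejects 3/8
```

Thresholding a trained model's outputs in this way needs no retraining, which is the practical contrast the conclusions draw with abstaining classifiers such as the DAC, where the rejection option is learned during training.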
Affiliation(s)
- Alina Peluso
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
- Ioana Danciu
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
- Hong-Jun Yoon
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
- Adam Spannaus
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
- Eric B Durbin
- University of Kentucky, Lexington, KY 40536, United States
- Xiao-Cheng Wu
- Louisiana State University, New Orleans, LA 70112, United States
- Antoinette Stroup
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ 08901, United States
- Stephen Schwartz
- Fred Hutchinson Cancer Research Center, Seattle, WA 98109, United States
- Charles Wiggins
- University of New Mexico, Albuquerque, NM 87131, United States
- Linda Coyle
- Information Management Services Inc., Calverton, MD 20705, United States
- Lynne Penberthy
- National Cancer Institute, Bethesda, MD 20814, United States
- Shang Gao
- Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States
2
Greenburg J, Lu Y, Lu S, Kamau U, Hamilton R, Pettus J, Preum S, Vaickus L, Levy J. Development of an interactive web dashboard to facilitate the reexamination of pathology reports for instances of underbilling of CPT codes. J Pathol Inform 2023; 14:100187. [PMID: 36700236 PMCID: PMC9867971 DOI: 10.1016/j.jpi.2023.100187]
Abstract
Current Procedural Terminology (CPT) codes form a numerical coding system used to bill for medical procedures and services and, crucially, represent a major reimbursement pathway. Given that pathology services are a consequential source of hospital revenue, understanding instances where codes may have been misassigned or underbilled is critical. Several algorithms have been proposed that can identify improperly billed CPT codes in existing datasets of pathology reports. Estimating the fiscal impact of these reports requires a coder (i.e., billing staff) to review the original reports and manually code them again. As the re-assignment of codes using machine learning algorithms can be done quickly, the bottleneck in validating these reassignments is this manual re-coding process, which can prove cumbersome. This work documents the development of a rapidly deployable dashboard for examining reports that the original coder may have misbilled. Our dashboard features three main components: (1) a bar plot showing the predicted probabilities for each CPT code, (2) an interpretation plot showing how each word in the report contributes to the overall prediction, and (3) a field where the user inputs the CPT code they have chosen to assign. The dashboard uses the algorithms developed to accurately identify CPT codes to highlight the codes missed by the original coders. To demonstrate the function of this web application, we recruited pathologists to use it to highlight reports with incorrectly assigned codes. We expect this application to accelerate the validation of re-assigned codes by facilitating rapid review of false-positive pathology reports. In the future, we will use this technology to review thousands of past cases in order to estimate the impact underbilling has on departmental revenue.
Affiliation(s)
- Jack Greenburg
- Department of Computer Science, Middlebury College, Middlebury, VT, USA
- Yunrui Lu
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Shuyang Lu
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Uhuru Kamau
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Robert Hamilton
- Department of Pathology, Johns Hopkins Hospital, Baltimore, MD, USA
- Jason Pettus
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH, USA
- Sarah Preum
- Department of Computer Science, Dartmouth College, Hanover, NH, USA
- Louis Vaickus
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH, USA
- Joshua Levy
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH, USA
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH, USA
- Department of Dermatology, Dartmouth Health, Lebanon, NH, USA
- Corresponding author at: Emerging Diagnostic and Investigative Technologies, Biostatistics and Bioinformatics Shared Resource, Dartmouth Cancer Center, Dartmouth-Hitchcock Medical Center, 1 Medical Center Drive, Department of Pathology and Laboratory Medicine, Lebanon, NH 03756, USA
3
Amin A, DeLellis RA, Fava JL. Modifying phrases in surgical pathology reports: introduction of Standardized Scheme of Reporting Certainty in Pathology Reports (SSRC-Path). Virchows Arch 2021. [PMID: 34272982 DOI: 10.1007/s00428-021-03155-w]
Abstract
Pathologists often incorporate modifying phrases in their diagnoses to imply varying levels of diagnostic certainty; however, what is implied by the pathologist is not always equivalent to what is perceived by referring physicians and patients. This discordance can have significant implications for management, safety, and cost. We aimed to identify inconsistencies in the interpretation of modifying phrases by comparing the perceived level of certainty between pathologists and non-pathologists, and to introduce a standard scheme for reporting uncertainty in pathology reports modeled on the experience with imaging reporting and data systems. In this study, a list of the 18 modifying phrases most commonly used in pathology reports was distributed as a questionnaire survey to separate cohorts of pathologists (N = 17) and non-pathology clinicians (N = 225), and the participants were asked to assign a certainty level to each phrase. All participants had practice privileges in Brown University-affiliated teaching hospitals. The survey was completed by 207 participants (17 pathologists, 190 non-pathologists). It revealed significant discordance in the interpretation of the modifying phrases between the two cohorts, with significant variation among subgroups of non-pathology clinicians. There was also disagreement between pathologists and other clinicians regarding the causes of miscommunication triggered by pathology reports. Pathologists and non-pathology clinicians should be mindful of these potential sources of misunderstanding and take the necessary actions to prevent and clarify the uncertainties. Using a standard scheme for reporting uncertainty in pathology reports is recommended.
4
Lee J, Song HJ, Yoon E, Park SB, Park SH, Seo JW, Park P, Choi J. Automated extraction of Biomarker information from pathology reports. BMC Med Inform Decis Mak 2018; 18:29. [PMID: 29783980 PMCID: PMC5963015 DOI: 10.1186/s12911-018-0609-7]
Abstract
Background Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports. Methods We designed a new data model for representing biomarker knowledge. The automated system parses immunohistochemistry reports based on a “slide paragraph” unit defined as a set of immunohistochemistry findings obtained for the same tissue slide. Pathology reports are parsed using context-free grammar for immunohistochemistry, and using a tree-like structure for surgical pathology. The performance of the approach was validated on manually annotated pathology reports of 100 randomly selected patients managed at Seoul National University Hospital. Results High F-scores were obtained for parsing biomarker name and corresponding test results (0.999 and 0.998, respectively) from the immunohistochemistry reports, compared to relatively poor performance for parsing surgical pathology findings. However, applying the proposed approach to our single-center dataset revealed information on 221 unique biomarkers, which represents a richer result than biomarker profiles obtained based on the published literature. Owing to the data representation model, the proposed approach can associate biomarker profiles extracted from an immunohistochemistry report with corresponding pathology findings listed in one or more surgical pathology reports. Term variations are resolved by normalization to corresponding preferred terms determined by expanded dictionary look-up and text similarity-based search. Conclusions Our proposed approach for biomarker data extraction addresses key limitations regarding data representation and can handle reports prepared in the clinical setting, which often contain incomplete sentences, typographical errors, and inconsistent formatting. 
Electronic supplementary material The online version of this article (10.1186/s12911-018-0609-7) contains supplementary material, which is available to authorized users.
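The paper's context-free grammar is not reproduced here; as a loose, hypothetical sketch of the "slide paragraph" idea, a minimal parser can pull marker/result pairs out of the free-text findings for one tissue slide (the regular expression, marker names, and result vocabulary below are all invented for illustration, not the authors' grammar):

```python
import re

# Minimal sketch: parse "marker: result" findings from one
# immunohistochemistry "slide paragraph", i.e. the set of findings
# reported for a single tissue slide.
FINDING = re.compile(
    r"(?P<marker>[A-Za-z0-9-]+)\s*[:=]\s*(?P<result>positive|negative|\d+%)",
    re.IGNORECASE,
)

def parse_slide_paragraph(text):
    """Return {MARKER: result} for one slide's IHC findings."""
    return {m.group("marker").upper(): m.group("result").lower()
            for m in FINDING.finditer(text)}

report = "Slide A1. CD20: positive; CD3: negative; Ki-67: 40%"
profile = parse_slide_paragraph(report)
```

Normalizing marker names (here, crudely, by upper-casing) plays the role the paper assigns to dictionary look-up and text-similarity search for resolving term variations.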
Affiliation(s)
- Jeongeun Lee
- Interdisciplinary Program for Bioengineering, Graduate School, Seoul National University, Seoul, Republic of Korea
- Hyun-Je Song
- School of Computer Science and Engineering, Kyungpook National University, Daegu, Republic of Korea
- Eunsil Yoon
- PAS1 team, TmaxSoft, Gyeonggi-do, Republic of Korea
- Seong-Bae Park
- School of Computer Science and Engineering, Kyungpook National University, Daegu, Republic of Korea
- Sung-Hye Park
- Department of Pathology, College of Medicine, Seoul National University, Seoul, Republic of Korea
- Jeong-Wook Seo
- Department of Pathology, College of Medicine, Seoul National University, Seoul, Republic of Korea
- Peom Park
- Department of Industrial Engineering, Ajou University, Suwon, Republic of Korea
- Jinwook Choi
- Interdisciplinary Program for Bioengineering, Graduate School, Seoul National University, Seoul, Republic of Korea
- Department of Biomedical Engineering, College of Medicine, Seoul National University, Seoul, Republic of Korea
5
Tang R, Ouyang L, Li C, He Y, Griffin M, Taghian A, Smith B, Yala A, Barzilay R, Hughes K. Machine learning to parse breast pathology reports in Chinese. Breast Cancer Res Treat 2018; 169:243-250. [PMID: 29380208 DOI: 10.1007/s10549-018-4668-3]
Abstract
INTRODUCTION Large structured databases of pathology findings are valuable for deriving new clinical insights. However, they are labor intensive to create and generally require manual annotation. There has been some work in the bioinformatics community on automating this task via machine learning for English-language reports. Our contribution is an automated approach to constructing such structured databases in Chinese, setting the stage for extraction from other languages. METHODS We collected 2104 de-identified Chinese benign and malignant breast pathology reports from Hunan Cancer Hospital. Physicians with native Chinese proficiency reviewed the reports and annotated a variety of binary and numerical pathologic entities. After excluding 78 cases with a bilateral lesion in the same report, 1216 cases were used as the training set for the algorithm, which was then refined on 405 development cases. The natural language processing algorithm was tested on the remaining 405 cases to evaluate the machine learning outcome. The model was used to extract 13 binary entities and 8 numerical entities. RESULTS Compared against physicians with native Chinese proficiency, the model showed per-entity accuracy from 91% to 100% for all common diagnoses on the test set. The overall accuracy was 98% for binary entities and 95% for numerical entities. In a per-report evaluation of binary entities with more than 100 training cases, 85% of the test reports were completely correct and 11% had an error in 1 out of 22 entities. CONCLUSION We have demonstrated that Chinese breast pathology reports can be automatically parsed into structured data using standard machine learning approaches. The results of our study demonstrate that techniques effective in parsing English reports can be scaled to other languages.
Affiliation(s)
- Rong Tang
- Division of Surgical Oncology, MGH, Boston, USA
- Lizhi Ouyang
- Department of Breast Surgery, Hunan Cancer Hospital, Changsha, Hunan, China
- Clara Li
- Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, USA
- Yue He
- Department of Breast Surgery, Hunan Cancer Hospital, Changsha, Hunan, China
- Adam Yala
- Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, USA
- Regina Barzilay
- Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, USA
6
Machado I, Pozo JJ, Marcilla D, Cruz J, Tardío JC, Astudillo A, Bagué S. [Protocol for the study of bone tumours and standardization of pathology reports]. Rev Esp Patol 2017; 50:34-44. [PMID: 29179963 DOI: 10.1016/j.patol.2016.08.003]
Abstract
Primary bone neoplasms represent a rare and heterogeneous group of mesenchymal tumours. The prevalence of benign and malignant tumours varies; the latter (sarcomas) account for less than 0.2% of all malignant tumours. Primary bone neoplasms are usually diagnosed and classified according to the criteria established and published by the World Health Organization (WHO 2013). These criteria are a result of advances in molecular pathology, which complements the histopathological diagnosis. Bone tumours should be diagnosed and treated in referral centers by a multidisciplinary team including pathologists, radiologists, orthopedic surgeons and oncologists. We analyzed different national and international protocols in order to provide a guide of recommendations for the improvement of pathological evaluation and management of bone tumours. We include specific recommendations for the pre-analytical, analytical, and post-analytical phases, as well as protocols for gross and microscopic pathology.
7
Ou Y, Patrick J. Automatic negation detection in narrative pathology reports. Artif Intell Med 2015; 64:41-50. [PMID: 25990897 DOI: 10.1016/j.artmed.2015.03.001]
Abstract
OBJECTIVE To detect negations of medical entities in free-text pathology reports with different approaches, and to evaluate their performance. METHODS AND MATERIAL Three approaches were applied to negation detection: the lexicon-based approach was a rule-based method relying on trigger terms and termination clues; the syntax-based approach was also rule-based, with rules and negation patterns designed using the dependency output of the Stanford parser; the machine-learning-based approach used a support vector machine classifier built on a number of features. A total of 284 English pathology reports of lymphoma were used for the study. RESULTS The machine-learning-based approach had the best overall performance on the test set, with a micro-averaged F-score of 82.56%, while the syntax-based approach performed worst, with a 78.62% F-score. The lexicon-based approach attained an overall average precision of 89.74% and recall of 76.09%, significantly better than the results achieved by Negation Tagger with a similar approach. DISCUSSION The lexicon-based approach benefitted more than the other two methods from being customized to the corpus. The errors of the syntax-based approach, which produced the poorest performance, were mainly due to poor parsing results, while the errors of the other methods were probably caused by abnormal grammatical structures. CONCLUSIONS A machine-learning-based approach has potential advantages for negation detection and may be preferable for the task. To improve overall performance, one possible solution is to apply a different approach to each section of the reports.
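As a hypothetical illustration of the lexicon-based idea (trigger terms plus termination clues), not the authors' actual rule set: a trigger term opens a negation scope over the tokens that follow it, and a termination clue closes that scope. The trigger and terminator lists below are invented minimal examples.

```python
# Sketch of lexicon-based negation scoping: a trigger negates subsequent
# tokens until a termination clue (or the end of the sentence) is reached.
TRIGGERS = {"no", "not", "without"}
TERMINATORS = {"but", "however", "although"}

def negated_spans(tokens):
    """Return the set of token indices that fall inside a negation scope."""
    scope, negating = set(), False
    for i, tok in enumerate(tokens):
        low = tok.lower()
        if low in TRIGGERS:
            negating = True       # open a negation scope
            continue
        if low in TERMINATORS:
            negating = False      # termination clue closes the scope
        if negating:
            scope.add(i)
    return scope

tokens = "No evidence of lymphoma but fibrosis is present".split()
scope = negated_spans(tokens)  # indices of "evidence of lymphoma"
```

Real systems such as the corpus-customized lexicon the abstract describes also handle multi-word triggers and post-negation phrases, which this single-token sketch deliberately omits.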
8
Luo Y, Sohani AR, Hochberg EP, Szolovits P. Automatic lymphoma classification with sentence subgraph mining from pathology reports. J Am Med Inform Assoc 2014; 21:824-32. [PMID: 24431333 PMCID: PMC4147603 DOI: 10.1136/amiajnl-2013-002443]
Abstract
OBJECTIVE Pathology reports are rich in narrative statements that encode a complex web of relations among medical concepts. These relations are routinely used by doctors to reason on diagnoses, but often require hand-crafted rules or supervised learning to extract into prespecified forms for computational disease modeling. We aim to automatically capture relations from narrative text without supervision. METHODS We design a novel framework that translates sentences into graph representations, automatically mines sentence subgraphs, reduces redundancy in mined subgraphs, and automatically generates subgraph features for subsequent classification tasks. To ensure meaningful interpretations over the sentence graphs, we use the Unified Medical Language System Metathesaurus to map token subsequences to concepts, and in turn sentence graph nodes. We test our system with multiple lymphoma classification tasks that together mimic the differential diagnosis by a pathologist. To this end, we prevent our classifiers from looking at explicit mentions or synonyms of lymphomas in the text. RESULTS AND CONCLUSIONS We compare our system with three baseline classifiers using standard n-grams, full MetaMap concepts, and filtered MetaMap concepts. Our system achieves high F-measures on multiple binary classifications of lymphoma (Burkitt lymphoma, 0.8; diffuse large B-cell lymphoma, 0.909; follicular lymphoma, 0.84; Hodgkin lymphoma, 0.912). Significance tests show that our system outperforms all three baselines. Moreover, feature analysis identifies subgraph features that contribute to improved performance; these features agree with the state-of-the-art knowledge about lymphoma classification. We also highlight how these unsupervised relation features may provide meaningful insights into lymphoma classification.
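The authors' UMLS-based subgraph mining is far richer than can be shown here; as a loose, hypothetical sketch of the underlying idea, one can treat each sentence as a set of concept nodes, take concept pairs as minimal "subgraph" features, and keep only the pairs frequent enough to be informative for classification (the concept names below are invented):

```python
from collections import Counter
from itertools import combinations

# Minimal "subgraph feature" sketch: count concept co-occurrence pairs
# across sentences and keep the pairs seen at least min_count times.
def pair_features(sentences, min_count=2):
    counts = Counter()
    for concepts in sentences:
        counts.update(combinations(sorted(set(concepts)), 2))
    return {pair for pair, n in counts.items() if n >= min_count}

# Toy "sentences" already mapped to concept identifiers -- invented values.
sentences = [
    ["large_cell", "diffuse_pattern", "cd20"],
    ["large_cell", "diffuse_pattern"],
    ["follicular_pattern", "cd10"],
]
features = pair_features(sentences)
```

The paper's system additionally preserves the graph structure linking concepts within a sentence and prunes redundant subgraphs, which is what makes its features interpretable rather than mere co-occurrence counts.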
Affiliation(s)
- Yuan Luo
- Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Aliyah R Sohani
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Cambridge, Massachusetts, USA
- Ephraim P Hochberg
- Center for Lymphoma, Massachusetts General Hospital, Cambridge, Massachusetts, USA
- Department of Medicine, Harvard Medical School, Cambridge, Massachusetts, USA
- Peter Szolovits
- Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA