1.
Abstract
Central nervous system tumours represent one of the most lethal cancer types, particularly among children1. Primary treatment includes neurosurgical resection of the tumour, in which a delicate balance must be struck between maximizing the extent of resection and minimizing risk of neurological damage and comorbidity2,3. However, surgeons have limited knowledge of the precise tumour type prior to surgery. Current standard practice relies on preoperative imaging and intraoperative histological analysis, but these are not always conclusive and occasionally wrong. Using rapid nanopore sequencing, a sparse methylation profile can be obtained during surgery4. Here we developed Sturgeon, a patient-agnostic transfer-learned neural network, to enable molecular subclassification of central nervous system tumours based on such sparse profiles. Sturgeon delivered an accurate diagnosis within 40 minutes after starting sequencing in 45 out of 50 retrospectively sequenced samples (abstaining from diagnosis of the other 5 samples). Furthermore, we demonstrated its applicability in real time during 25 surgeries, achieving a diagnostic turnaround time of less than 90 min. Of these, 18 (72%) diagnoses were correct and 7 did not reach the required confidence threshold. We conclude that machine-learned diagnosis based on low-cost intraoperative sequencing can assist neurosurgical decision-making, potentially preventing neurological comorbidity and avoiding additional surgeries.
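
As an illustration of the workflow this abstract describes — classify a sparse methylation profile and report a diagnosis only when the model's confidence clears a threshold — the following minimal Python sketch uses synthetic data, a generic scikit-learn network, and an arbitrary 0.8 cutoff; it is not the authors' Sturgeon implementation.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Hypothetical setup: 300 CpG sites, each 1 (methylated), -1 (unmethylated),
# or 0 (not yet covered by the sparse intraoperative run); 3 toy tumour classes.
n_sites, n_classes = 300, 3
X_train = rng.choice([-1, 0, 1], size=(200, n_sites))
y_train = rng.integers(0, n_classes, size=200)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200, random_state=0)
clf.fit(X_train, y_train)

def diagnose(sparse_profile, threshold=0.8):
    """Return (class, confidence); class is None (abstain) below the threshold."""
    probs = clf.predict_proba(sparse_profile.reshape(1, -1))[0]
    best = int(np.argmax(probs))
    return (best if probs[best] >= threshold else None, float(probs[best]))

print(diagnose(rng.choice([-1, 0, 1], size=n_sites)))
```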

2. The uncovered biases and errors in clinical determination of bone age by using deep learning models. Eur Radiol 2023;33:3544-3556. PMID: 36538072. DOI: 10.1007/s00330-022-09330-0.
Abstract
OBJECTIVES To evaluate AI biases and errors in estimating bone age (BA) by comparing AI and radiologists' clinical determinations of BA. METHODS We established three deep learning models from a Chinese private dataset (CHNm), an American public dataset (USAm), and a joint dataset combining the two (JOIm). The test data CHNt (n = 1246) were labeled by ten senior pediatric radiologists. The effects of data site differences, interpretation bias, and interobserver variability on BA assessment were evaluated. The differences between the AI models' and radiologists' clinical determinations of BA (normal, advanced, and delayed BA groups by using the Brush data) were evaluated with the chi-square test and kappa values. The heatmaps of CHNm-CHNt were generated using Grad-CAM. RESULTS We obtained an MAD value of 0.42 years on CHNm-CHNt; this indicated appropriate accuracy for the whole group but not accurate estimation of individual BA, because with a kappa value of 0.714 the agreement between AI and human clinical determinations of BA differed significantly. The features of the heatmaps were not fully consistent with human vision on the X-ray films. Variable performance in BA estimation by different AI models and the disagreement between AI and radiologists' clinical determinations of BA may be caused by data biases, including patients' sex and age, institutions, and radiologists. CONCLUSIONS The deep learning models predicted BA better on the internal and joint datasets than on external validation. However, the biases and errors in the models' clinical determinations of child development should be carefully considered. KEY POINTS • With a kappa value of 0.714, clinical determinations of bone age by AI did not accord well with clinical determinations by radiologists. • Several biases, including patients' sex and age, institutions, and radiologists, may cause variable performance by AI bone age models and disagreement between AI and radiologists' clinical determinations of bone age. • AI heatmaps of bone age were not fully consistent with human vision on X-ray films.
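
The agreement statistic quoted here (kappa between AI and radiologist bone-age categories) follows the standard Cohen's kappa definition; a minimal sketch with made-up category labels, not the study data:

```python
from sklearn.metrics import cohen_kappa_score, confusion_matrix

# Hypothetical categorical bone-age calls: 0 = delayed, 1 = normal, 2 = advanced.
radiologist = [1, 1, 0, 2, 1, 2, 0, 1, 1, 2]
ai_model    = [1, 1, 1, 2, 1, 1, 0, 1, 2, 2]

print("kappa:", round(cohen_kappa_score(radiologist, ai_model), 3))
print(confusion_matrix(radiologist, ai_model))
```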

3. Deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI. Eur Radiol 2023;33:2519-2528. PMID: 36371606. PMCID: PMC10017633. DOI: 10.1007/s00330-022-09239-8.
Abstract
OBJECTIVES Prostate volume (PV) in combination with prostate-specific antigen (PSA) yields PSA density, an increasingly important biomarker. Calculating PV from MRI is a time-consuming, radiologist-dependent task. The aim of this study was to assess whether a deep learning algorithm can replace the PI-RADS 2.1-based ellipsoid formula (EF) for calculating PV. METHODS Eight different measures of PV were retrospectively collected for each of 124 patients who underwent radical prostatectomy and preoperative MRI of the prostate (multicenter and multi-scanner MRI, 1.5 and 3 T). Agreement between volumes obtained from the deep learning algorithm (PVDL) and the ellipsoid formula applied by two radiologists (PVEF1 and PVEF2) was evaluated against the reference standard PV obtained by manual planimetry by an expert radiologist (PVMPE). A sensitivity analysis was performed using the prostatectomy specimen as the reference standard. Inter-reader agreement was evaluated between the radiologists using the ellipsoid formula and between the expert and inexperienced radiologists performing manual planimetry. RESULTS PVDL showed better agreement and precision than PVEF1 and PVEF2 against the reference standard PVMPE (mean difference [95% limits of agreement] PVDL: -0.33 [-10.80; 10.14], PVEF1: -3.83 [-19.55; 11.89], PVEF2: -3.05 [-18.55; 12.45]) or the PV determined from specimen weight (PVDL: -4.22 [-22.52; 14.07], PVEF1: -7.89 [-30.50; 14.73], PVEF2: -6.97 [-30.13; 16.18]). Inter-reader agreement was excellent between the two experienced radiologists using the ellipsoid formula and good between the expert and inexperienced radiologists performing manual planimetry. CONCLUSION The deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI. KEY POINTS • A commercially available deep learning algorithm performs similarly to radiologists in the assessment of prostate volume on MRI. • The deep learning algorithm was not previously trained on this heterogeneous multicenter day-to-day practice MRI data set.
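
Both the PI-RADS ellipsoid formula and the Bland-Altman mean difference with 95% limits of agreement quoted above follow standard definitions; a small sketch with illustrative volumes (not the study data):

```python
import numpy as np

def ellipsoid_volume(length_cm, width_cm, height_cm):
    # PI-RADS ellipsoid formula: V = L x W x H x pi / 6
    return length_cm * width_cm * height_cm * np.pi / 6

# Hypothetical paired prostate volumes (mL): reference planimetry vs. an automated method.
pv_reference = np.array([42.0, 55.3, 61.8, 38.4, 70.1])
pv_automated = np.array([40.9, 57.0, 60.2, 39.5, 68.0])

diff = pv_automated - pv_reference
mean_diff = diff.mean()
limits = (mean_diff - 1.96 * diff.std(ddof=1), mean_diff + 1.96 * diff.std(ddof=1))
print("ellipsoid formula example:", round(ellipsoid_volume(5.0, 4.5, 4.0), 1), "mL")
print("mean difference:", round(mean_diff, 2), "mL; 95% limits of agreement:", np.round(limits, 2))
```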

4.
Abstract
BACKGROUND Artificial intelligence (AI)-enabled analysis of 12-lead ECGs may facilitate efficient estimation of incident atrial fibrillation (AF) risk. However, it remains unclear whether AI provides meaningful and generalizable improvement in predictive accuracy beyond clinical risk factors for AF. METHODS We trained a convolutional neural network (ECG-AI) to infer 5-year incident AF risk using 12-lead ECGs in patients receiving longitudinal primary care at Massachusetts General Hospital (MGH). We then fit 3 Cox proportional hazards models, composed of ECG-AI 5-year AF probability, CHARGE-AF clinical risk score (Cohorts for Heart and Aging Research in Genomic Epidemiology-Atrial Fibrillation), and terms for both ECG-AI and CHARGE-AF (CH-AI), respectively. We assessed model performance by calculating discrimination (area under the receiver operating characteristic curve) and calibration in an internal test set and 2 external test sets (Brigham and Women's Hospital [BWH] and UK Biobank). Models were recalibrated to estimate 2-year AF risk in the UK Biobank given limited available follow-up. We used saliency mapping to identify ECG features most influential on ECG-AI risk predictions and assessed correlation between ECG-AI and CHARGE-AF linear predictors. RESULTS The training set comprised 45 770 individuals (age 55±17 years, 53% women, 2171 AF events) and the test sets comprised 83 162 individuals (age 59±13 years, 56% women, 2424 AF events). Area under the receiver operating characteristic curve was comparable using CHARGE-AF (MGH, 0.802 [95% CI, 0.767-0.836]; BWH, 0.752 [95% CI, 0.741-0.763]; UK Biobank, 0.732 [95% CI, 0.704-0.759]) and ECG-AI (MGH, 0.823 [95% CI, 0.790-0.856]; BWH, 0.747 [95% CI, 0.736-0.759]; UK Biobank, 0.705 [95% CI, 0.673-0.737]). Area under the receiver operating characteristic curve was highest using CH-AI (MGH, 0.838 [95% CI, 0.807-0.869]; BWH, 0.777 [95% CI, 0.766-0.788]; UK Biobank, 0.746 [95% CI, 0.716-0.776]). Calibration error was low using ECG-AI (MGH, 0.0212; BWH, 0.0129; UK Biobank, 0.0035) and CH-AI (MGH, 0.012; BWH, 0.0108; UK Biobank, 0.0001). In saliency analyses, the ECG P-wave had the greatest influence on AI model predictions. ECG-AI and CHARGE-AF linear predictors were correlated (Pearson r: MGH, 0.61; BWH, 0.66; UK Biobank, 0.41). CONCLUSIONS AI-based analysis of 12-lead ECGs has similar predictive usefulness to a clinical risk factor model for incident AF, and the approaches are complementary. ECG-AI may enable efficient quantification of future AF risk.
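
The modeling step described above — a Cox proportional hazards model over an ECG-derived AF probability and a clinical risk score — can be sketched as follows. The lifelines library, the synthetic cohort, and all column names are illustrative choices, not the authors' pipeline:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 500

# Hypothetical cohort: a neural-network AF probability, a clinical risk score,
# follow-up time in years, and an incident-AF event indicator.
cohort = pd.DataFrame({
    "ecg_ai_prob": rng.uniform(0, 1, n),
    "charge_af":   rng.normal(12, 1, n),
    "years":       rng.exponential(5, n).clip(0.1, 5),
    "incident_af": rng.integers(0, 2, n),
})

# "CH-AI"-style model: both predictors enter the same proportional hazards model.
cph = CoxPHFitter()
cph.fit(cohort, duration_col="years", event_col="incident_af")
print(cph.summary[["coef", "exp(coef)", "p"]])
print("concordance:", round(cph.concordance_index_, 3))
```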

5.
Abstract
BACKGROUND The microvasculature, the smallest blood vessels in the body, has key roles in maintenance of organ health and tumorigenesis. The retinal fundus is a window for human in vivo noninvasive assessment of the microvasculature. Large-scale complementary machine learning-based assessment of the retinal vasculature with phenome-wide and genome-wide analyses may yield new insights into human health and disease. METHODS We used 97 895 retinal fundus images from 54 813 UK Biobank participants. Using convolutional neural networks to segment the retinal microvasculature, we calculated vascular density and fractal dimension as a measure of vascular branching complexity. We associated these indices with 1866 incident International Classification of Diseases-based conditions (median 10-year follow-up) and 88 quantitative traits, adjusting for age, sex, smoking status, and ethnicity. RESULTS Low retinal vascular fractal dimension and density were significantly associated with higher risks for incident mortality, hypertension, congestive heart failure, renal failure, type 2 diabetes, sleep apnea, anemia, and multiple ocular conditions, as well as corresponding quantitative traits. Genome-wide association of vascular fractal dimension and density identified 7 and 13 novel loci, respectively, that were enriched for pathways linked to angiogenesis (eg, vascular endothelial growth factor, platelet-derived growth factor receptor, angiopoietin, and WNT signaling pathways) and inflammation (eg, interleukin, cytokine signaling). CONCLUSIONS Our results indicate that the retinal vasculature may serve as a biomarker for future cardiometabolic and ocular disease and provide insights into genes and biological pathways influencing microvascular indices. Moreover, such a framework highlights how deep learning of images can quantify an interpretable phenotype for integration with electronic health record, biomarker, and genetic data to inform risk prediction and risk modification.
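
The two microvascular indices used above, vascular density and fractal dimension, can be computed from a binary vessel mask; a compact box-counting sketch on a synthetic mask (illustrative only, not the study's segmentation pipeline):

```python
import numpy as np

def box_counting_dimension(mask, box_sizes=(2, 4, 8, 16, 32)):
    """Estimate the fractal dimension of a binary vessel mask by box counting."""
    counts = []
    for s in box_sizes:
        h, w = mask.shape
        boxes = mask[: h - h % s, : w - w % s].reshape(h // s, s, w // s, s)
        counts.append(boxes.any(axis=(1, 3)).sum())   # boxes containing any vessel pixel
    # Slope of log(count) versus log(1/box size) estimates the dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope

mask = np.zeros((256, 256), dtype=bool)
mask[128, :] = True                      # a single straight "vessel" has dimension ~1
print("vascular density:", mask.mean())
print("fractal dimension:", round(box_counting_dimension(mask), 2))
```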

6. Deep Learning for Basal Cell Carcinoma Detection for Reflectance Confocal Microscopy. J Invest Dermatol 2022;142:97-103. PMID: 34265329. PMCID: PMC9338423. DOI: 10.1016/j.jid.2021.06.015.
Abstract
Basal cell carcinoma (BCC) is the most common skin cancer, with over 2 million cases diagnosed annually in the United States. Conventionally, BCC is diagnosed by naked eye examination and dermoscopy. Suspicious lesions are either removed or biopsied for histopathological confirmation, thus lowering the specificity of noninvasive BCC diagnosis. Recently, reflectance confocal microscopy, a noninvasive diagnostic technique that can image skin lesions at cellular-level resolution, has been shown to improve specificity in BCC diagnosis and to reduce the number needed to biopsy by 2-3 times. In this study, we developed and evaluated a deep learning-based artificial intelligence model to automatically detect BCC in reflectance confocal microscopy images. The proposed model achieved an area under the receiver operating characteristic curve of 89.7% (stack level) and 88.3% (lesion level), performance on par with that of reflectance confocal microscopy experts. Furthermore, the model achieved an area under the curve of 86.1% on a held-out test set from international collaborators, demonstrating the reproducibility and generalizability of the proposed automated diagnostic approach. These results provide a clear indication that the clinical deployment of decision support systems for the detection of BCC in reflectance confocal microscopy images has the potential for optimizing the evaluation and diagnosis of patients with skin cancer.

7. Data Homogeneity Effect in Deep Learning-Based Prediction of Type 1 Diabetic Retinopathy. J Diabetes Res 2021;2021:2751695. PMID: 35071603. PMCID: PMC8776492. DOI: 10.1155/2021/2751695.
Abstract
This study aimed to evaluate a deep transfer learning-based model for identifying diabetic retinopathy (DR) that was trained using a dataset with high variability and predominantly type 2 diabetes (T2D), and to compare model performance with that in patients with type 1 diabetes (T1D). The publicly available Kaggle dataset was divided into training and testing Kaggle datasets. For the comparison dataset, we collected retinal fundus images of T1D patients at Chang Gung Memorial Hospital in Taiwan from 2013 to 2020, and the images were divided into training and testing T1D datasets. The model was developed using 4 different convolutional neural networks (Inception-V3, DenseNet-121, VGG1, and Xception). Model performance in predicting DR was evaluated using testing images from each dataset, and area under the curve (AUC), sensitivity, and specificity were calculated. The model trained using the Kaggle dataset had an average (range) AUC of 0.74 (0.03) and 0.87 (0.01) in the testing Kaggle and T1D datasets, respectively. The model trained using the T1D dataset had an AUC of 0.88 (0.03), which decreased to 0.57 (0.02) in the testing Kaggle dataset. Heatmaps showed that the model focused on retinal hemorrhage, vessels, and exudation to predict DR. In wrongly predicted images, artifacts and low image quality affected model performance. The model developed with the high-variability, T2D-predominant dataset could be applied to T1D patients. Dataset homogeneity could affect the performance, trainability, and generalization of the model.

8. Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images. Nat Commun 2021;12:6311. PMID: 34728629. PMCID: PMC8563931. DOI: 10.1038/s41467-021-26643-8.
Abstract
Machine-assisted pathological recognition has focused on supervised learning (SL), which suffers from a significant annotation bottleneck. We propose a semi-supervised learning (SSL) method based on the mean teacher architecture using 13,111 whole slide images of colorectal cancer from 8803 subjects from 13 independent centers. SSL (~3150 labeled, ~40,950 unlabeled; ~6300 labeled, ~37,800 unlabeled patches) performs significantly better than SL. No significant difference is found between SSL (~6300 labeled, ~37,800 unlabeled) and SL (~44,100 labeled) for patch-level diagnoses (area under the curve (AUC): 0.980 ± 0.014 vs. 0.987 ± 0.008, P value = 0.134) or patient-level diagnoses (AUC: 0.974 ± 0.013 vs. 0.980 ± 0.010, P value = 0.117), both close to human pathologists (average AUC: 0.969). The evaluation on 15,000 lung and 294,912 lymph node images also confirms that SSL can achieve performance similar to that of SL with massive annotations. SSL dramatically reduces the annotation burden and has great potential for effectively building expert-level pathological artificial intelligence platforms in practice.
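
The mean teacher architecture referenced above pairs a supervised loss on labeled patches with a consistency loss against a teacher whose weights are an exponential moving average of the student's; a minimal PyTorch sketch with toy tensors (not the authors' implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_model():
    return nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 64), nn.ReLU(), nn.Linear(64, 2))

student, teacher = make_model(), make_model()
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def ema_update(student, teacher, alpha=0.99):
    # Teacher weights follow an exponential moving average of the student.
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.data.mul_(alpha).add_(ps.data, alpha=1 - alpha)

# One toy training step on fake "patches": a labeled and an unlabeled batch.
x_lab, y_lab = torch.randn(8, 1, 32, 32), torch.randint(0, 2, (8,))
x_unlab = torch.randn(16, 1, 32, 32)

sup_loss = F.cross_entropy(student(x_lab), y_lab)
with torch.no_grad():
    teacher_probs = F.softmax(teacher(x_unlab), dim=1)
consistency = F.mse_loss(F.softmax(student(x_unlab), dim=1), teacher_probs)

loss = sup_loss + 1.0 * consistency
opt.zero_grad(); loss.backward(); opt.step()
ema_update(student, teacher)
print(float(sup_loss), float(consistency))
```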

9.
Abstract
Deep learning algorithms are powerful tools to analyse, restore and transform bioimaging data, increasingly used in life sciences research. These approaches now outperform most other algorithms for a broad range of image analysis tasks. In particular, one of the promises of deep learning is the possibility to provide parameter-free, one-click data analysis achieving expert-level performances in a fraction of the time previously required. However, as with most new and upcoming technologies, the potential for inappropriate use is raising concerns among the biomedical research community. This perspective aims to provide a short overview of key concepts that we believe are important for researchers to consider when using deep learning for their microscopy studies. These comments are based on our own experience gained while optimising various deep learning tools for bioimage analysis and discussions with colleagues from both the developer and user community. In particular, we focus on describing how results obtained using deep learning can be validated and discuss what should, in our views, be considered when choosing a suitable tool. We also suggest what aspects of a deep learning analysis would need to be reported in publications to describe the use of such tools to guarantee that the work can be reproduced. We hope this perspective will foster further discussion between developers, image analysis specialists, users and journal editors to define adequate guidelines and ensure that this transformative technology is used appropriately.

10.
Abstract
PURPOSE OF REVIEW In this article, we introduce the concept of model interpretability, review its applications in deep learning models for clinical ophthalmology, and discuss its role in the integration of artificial intelligence in healthcare. RECENT FINDINGS The advent of deep learning in medicine has introduced models with remarkable accuracy. However, the inherent complexity of these models undermines its users' ability to understand, debug and ultimately trust them in clinical practice. Novel methods are being increasingly explored to improve models' 'interpretability' and draw clearer associations between their outputs and features in the input dataset. In the field of ophthalmology, interpretability methods have enabled users to make informed adjustments, identify clinically relevant imaging patterns, and predict outcomes in deep learning models. SUMMARY Interpretability methods support the transparency necessary to implement, operate and modify complex deep learning models. These benefits are becoming increasingly demonstrated in models for clinical ophthalmology. As quality standards for deep learning models used in healthcare continue to evolve, interpretability methods may prove influential in their path to regulatory approval and acceptance in clinical practice.

11. Mixed-data deep learning in repeated predictions of general medicine length of stay: a derivation study. Intern Emerg Med 2021;16:1613-1617. PMID: 33728577. DOI: 10.1007/s11739-021-02697-w.
Abstract
The accurate prediction of likely discharges and estimates of length of stay (LOS) aid in effective hospital administration and help to prevent access block. Machine learning (ML) may be able to help with these tasks. For consecutive patients admitted under General Medicine at the Royal Adelaide Hospital over an 8-month period, daily ward round notes and relevant discrete data fields were collected from the electronic medical record. These data were then split into training and testing sets (7-month/1-month train/test split) prior to use in ML analyses aiming to predict discharge within the next 2 days, discharge within the next 7 days and an estimated date of discharge (EDD). Artificial neural networks and logistic regression were effective at predicting discharge within 48 h of a given ward round note. These models achieved an area under the receiver operating characteristic curve (AUC) of 0.80 and 0.78, respectively. Prediction of discharge within 7 days of a given note was less accurate, with the artificial neural network returning an AUC of 0.68 and logistic regression an AUC of 0.61. The generation of an exact EDD remains inaccurate. This study has shown that repeated estimates of LOS using daily ward round notes and mixed-data inputs are effective in the prediction of general medicine discharges in the next 48 h. Further research may seek to prospectively and externally validate models for prediction of upcoming discharge, as well as combination human-ML approaches for generating EDDs.
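
A minimal sketch of a mixed-data model in the spirit described above — free-text ward-round notes combined with discrete fields in a logistic regression — using hypothetical rows and column names (not the study's data or code):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical rows: one ward-round note per patient-day plus discrete fields.
df = pd.DataFrame({
    "note": ["afebrile, mobilising well, plan home",
             "new oxygen requirement, IV antibiotics",
             "awaiting physio review, likely weekend discharge",
             "delirium overnight, family meeting planned"],
    "age": [72, 81, 65, 88],
    "days_in_hospital": [3, 5, 4, 9],
    "discharged_within_48h": [1, 0, 1, 0],
})

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "note"),                  # bag-of-words note features
    ("num", StandardScaler(), ["age", "days_in_hospital"]),
])
model = Pipeline([("features", features), ("clf", LogisticRegression(max_iter=1000))])
model.fit(df[["note", "age", "days_in_hospital"]], df["discharged_within_48h"])
print(model.predict_proba(df[["note", "age", "days_in_hospital"]])[:, 1])
```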

12. Inter-database validation of a deep learning approach for automatic sleep scoring. PLoS One 2021;16:e0256111. PMID: 34398931. PMCID: PMC8366993. DOI: 10.1371/journal.pone.0256111.
Abstract
STUDY OBJECTIVES Development of inter-database generalizable sleep staging algorithms is challenging because of increased data variability across different datasets. Sharing data between different centers is also problematic because of restrictions related to patient privacy protection. In this work, we describe a new deep learning approach for automatic sleep staging and address its generalization capabilities on a wide range of public sleep staging databases. We also examine the suitability of a novel approach that uses an ensemble of individual local models and evaluate its impact on the resulting inter-database generalization performance. METHODS A general deep learning network architecture for automatic sleep staging is presented. Different preprocessing and architectural variants are tested. The resulting prediction capabilities are evaluated and compared on a heterogeneous collection of six public sleep staging datasets. Validation is carried out in the context of independent local and external dataset generalization scenarios. RESULTS Best results were achieved using the CNN_LSTM_5 neural network variant. Average prediction on independent local testing sets achieved a kappa score of 0.80. When individual local models predict data from external datasets, the average kappa score decreases to 0.54. Using the proposed ensemble-based approach, average kappa performance in the external dataset prediction scenario increases to 0.62. To our knowledge, this is the largest study to date, by number of datasets, to validate the generalization capabilities of an automatic sleep staging algorithm using external databases. CONCLUSIONS Validation results show good general performance of our method compared with expected levels of human agreement and with state-of-the-art automatic sleep staging methods. The proposed ensemble-based approach enables flexible and scalable design, allowing dynamic integration of local models into the final ensemble, preserving data locality, and increasing generalization capabilities of the resulting system at the same time.
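
The ensemble-based approach described above amounts to averaging the stage probabilities of independently trained local models before taking the per-epoch argmax; a compact sketch with random probabilities standing in for real model outputs:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-epoch probabilities over 5 sleep stages from three models,
# each trained on a different local database, applied to one external recording.
local_model_probs = [rng.dirichlet(np.ones(5), size=100) for _ in range(3)]

ensemble_probs = np.mean(local_model_probs, axis=0)   # average the local models
predicted_stages = ensemble_probs.argmax(axis=1)      # most likely stage per 30-s epoch
print(predicted_stages[:10])
```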

13. Gastrointestinal cancer classification and prognostication from histology using deep learning: Systematic review. Eur J Cancer 2021;155:200-215. PMID: 34391053. DOI: 10.1016/j.ejca.2021.07.012.
Abstract
BACKGROUND Gastrointestinal cancers account for approximately 20% of all cancer diagnoses and are responsible for 22.5% of cancer deaths worldwide. Artificial intelligence-based diagnostic support systems, in particular convolutional neural network (CNN)-based image analysis tools, have shown great potential in medical computer vision. In this systematic review, we summarise recent studies reporting CNN-based approaches for digital biomarkers for characterization and prognostication of gastrointestinal cancer pathology. METHODS PubMed and Medline were screened for peer-reviewed papers dealing with CNN-based gastrointestinal cancer analyses from histological slides, published between 2015 and 2020. Seven hundred and ninety titles and abstracts were screened, and 58 full-text articles were assessed for eligibility. RESULTS Sixteen publications fulfilled our inclusion criteria dealing with tumor or precursor lesion characterization or prognostic and predictive biomarkers: 14 studies on colorectal or rectal cancer, three studies on gastric cancer and none on esophageal cancer. These studies were categorised according to their end-points: polyp characterization, tumor characterization and patient outcome. Regarding the translation into clinical practice, we identified several studies demonstrating generalization of the classifier with external tests and comparisons with pathologists, but none presenting clinical implementation. CONCLUSIONS Results of recent studies on CNN-based image analysis in gastrointestinal cancer pathology are promising, but studies were conducted in observational and retrospective settings. Large-scale trials are needed to assess performance and predict clinical usefulness. Furthermore, large-scale trials are required for approval of CNN-based prediction models as medical devices.

14. A Simulated Prospective Evaluation of a Deep Learning Model for Real-Time Prediction of Clinical Deterioration Among Ward Patients. Crit Care Med 2021;49:1312-1321. PMID: 33711001. PMCID: PMC8282687. DOI: 10.1097/ccm.0000000000004966.
Abstract
OBJECTIVES The National Early Warning Score, Modified Early Warning Score, and quick Sepsis-related Organ Failure Assessment can predict clinical deterioration. These scores exhibit only moderate performance and are often evaluated using aggregated measures over time. A simulated prospective validation strategy that assesses multiple predictions per patient-day would provide the best pragmatic evaluation. We developed a deep recurrent neural network deterioration model and conducted a simulated prospective evaluation. DESIGN Retrospective cohort study. SETTING Four hospitals in Pennsylvania. PATIENTS Inpatient adults discharged between July 1, 2017, and June 30, 2019. INTERVENTIONS None. MEASUREMENTS AND MAIN RESULTS We trained a deep recurrent neural network and logistic regression model using data from electronic health records to predict hourly the 24-hour composite outcome of transfer to ICU or death. We analyzed 146,446 hospitalizations with 16.75 million patient-hours. The hourly event rate was 1.6% (12,842 transfers or deaths, corresponding to 260,295 patient-hours within the predictive horizon). On a hold-out dataset, the deep recurrent neural network achieved an area under the precision-recall curve of 0.042 (95% CI, 0.04-0.043), comparable with the logistic regression model (0.043; 95% CI, 0.041-0.045), and outperformed the National Early Warning Score (0.034; 95% CI, 0.032-0.035), Modified Early Warning Score (0.028; 95% CI, 0.027-0.030), and quick Sepsis-related Organ Failure Assessment (0.021; 95% CI, 0.021-0.022). For a fixed sensitivity of 50%, the deep recurrent neural network achieved a positive predictive value of 3.4% (95% CI, 3.4-3.5) and outperformed the logistic regression model (3.1%; 95% CI, 3.1-3.2), National Early Warning Score (2.0%; 95% CI, 2.0-2.0), Modified Early Warning Score (1.5%; 95% CI, 1.5-1.5), and quick Sepsis-related Organ Failure Assessment (1.5%; 95% CI, 1.5-1.5). CONCLUSIONS Commonly used early warning scores for clinical decompensation, along with a logistic regression model and a deep recurrent neural network model, show very poor performance characteristics when assessed using a simulated prospective validation. None of these models may be suitable for real-time deployment.
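
The headline metrics above — area under the precision-recall curve and positive predictive value at a fixed 50% sensitivity — can be reproduced on synthetic hourly predictions as follows (the 1.6% event rate mirrors the abstract; everything else is illustrative):

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(3)

# Hypothetical hourly predictions: ~1.6% of patient-hours are followed by
# ICU transfer or death within 24 hours.
y_true = (rng.uniform(size=100_000) < 0.016).astype(int)
y_score = np.clip(0.016 + 0.05 * y_true + rng.normal(0, 0.03, size=y_true.size), 0, 1)

print("AUPRC:", round(average_precision_score(y_true, y_score), 4))

precision, recall, _ = precision_recall_curve(y_true, y_score)
ppv_at_50_sens = precision[np.argmin(np.abs(recall - 0.5))]
print("PPV at 50% sensitivity:", round(float(ppv_at_50_sens), 4))
```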

15.
Abstract
Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure1. Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold2, at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective.

16.
Abstract
Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort1-4, the structures of around 100,000 unique proteins have been determined5, but this represents a small fraction of the billions of known protein sequences6,7. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence-the structure prediction component of the 'protein folding problem'8-has been an important open research problem for more than 50 years9. Despite recent progress10-14, existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14)15, demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.

17. Deep learning for semi-automated unidirectional measurement of lung tumor size in CT. Cancer Imaging 2021;21:43. PMID: 34162439. PMCID: PMC8220702. DOI: 10.1186/s40644-021-00413-7.
Abstract
BACKGROUND Performing Response Evaluation Criteria in Solid Tumors (RECIST) measurement is a non-trivial task requiring much expertise and time. A deep learning-based algorithm has the potential to assist with rapid and consistent lesion measurement. PURPOSE The aim of this study was to develop and evaluate a deep learning (DL) algorithm for semi-automated unidirectional CT measurement of lung lesions. METHODS This retrospective study included 1617 lung CT images from 8 publicly available datasets. A convolutional neural network was trained using 1373 training and validation images annotated by two radiologists. Performance of the DL algorithm was evaluated on 244 test images annotated by one radiologist. The DL algorithm's measurement consistency with the human radiologist was evaluated using the intraclass correlation coefficient (ICC) and Bland-Altman plotting. Bonferroni's method was used to analyze differences in their diagnostic behavior attributable to tumor characteristics. Statistical significance was set at p < 0.05. RESULTS The DL algorithm yielded an ICC of 0.959 with the human radiologist. Bland-Altman plotting suggested that 240 (98.4%) measurements fell within the upper and lower limits of agreement (LOA). Some measurements outside the LOA revealed differences in clinical reasoning between the DL algorithm and the human radiologist. Overall, the algorithm marginally overestimated lesion size by 2.97% compared with human radiologists. Further investigation indicated that tumor characteristics may be associated with the DL algorithm's tendency to over- or underestimate lesion size compared with the human radiologist. CONCLUSIONS The DL algorithm for unidirectional measurement of lung tumor size demonstrated excellent agreement with the human radiologist.

18. Predictive Analytics for Care and Management of Patients With Acute Diseases: Deep Learning-Based Method to Predict Crucial Complication Phenotypes. J Med Internet Res 2021;23:e18372. PMID: 33576744. PMCID: PMC7910123. DOI: 10.2196/18372.
Abstract
BACKGROUND Acute diseases present severe complications that develop rapidly, exhibit distinct phenotypes, and have profound effects on patient outcomes. Predictive analytics can enhance physicians' care and management of patients with acute diseases by predicting crucial complication phenotypes for a timely diagnosis and treatment. However, effective phenotype predictions require several challenges to be overcome. First, patient data collected in the early stages of an acute disease (eg, clinical data and laboratory results) are less informative for predicting phenotypic outcomes. Second, patient data are temporal and heterogeneous; for example, patients receive laboratory tests at different time intervals and frequencies. Third, imbalanced distributions of patient outcomes create additional complexity for predicting complication phenotypes. OBJECTIVE To predict crucial complication phenotypes among patients with acute diseases, we propose a novel, deep learning-based method that uses recurrent neural network-based sequence embedding to represent disease progression while considering temporal heterogeneities in patient data. Our method incorporates a latent regulator to alleviate data insufficiency constraints by accounting for the underlying mechanisms that are not observed in patient data. The proposed method also includes cost-sensitive learning to address imbalanced outcome distributions in patient data for improved predictions. METHODS From a major health care organization in Taiwan, we obtained a sample of 10,354 electronic health records that pertained to 6545 patients with peritonitis. The proposed method projects these temporal, heterogeneous clinical data into a substantially reduced feature space and then incorporates a latent regulator (latent parameter matrix) to obviate data insufficiencies and account for variations in phenotypic expressions. Moreover, our method employs cost-sensitive learning to further increase the predictive performance. RESULTS We evaluated the efficacy of the proposed method for predicting two hepatic complication phenotypes in patients with peritonitis: acute hepatic encephalopathy and hepatorenal syndrome. The following three benchmark techniques were evaluated: temporal multiple measurement case-based reasoning (MMCBR), temporal long short-term memory (T-LSTM) networks, and time fusion convolutional neural network (CNN). For acute hepatic encephalopathy predictions, our method attained an area under the curve (AUC) value of 0.82, which outperforms temporal MMCBR by 64%, T-LSTM by 26%, and time fusion CNN by 26%. For hepatorenal syndrome predictions, our method achieved an AUC value of 0.64, which is 29% better than that of temporal MMCBR (0.54). Overall, the evaluation results show that the proposed method significantly outperforms all the benchmarks, as measured by recall, F-measure, and AUC, while maintaining comparable precision values. CONCLUSIONS The proposed method learns a short-term temporal representation from patient data to predict complication phenotypes and offers greater predictive utilities than prevalent data-driven techniques. This method is generalizable and can be applied to different acute disease (illness) scenarios that are characterized by insufficient patient clinical data availability, temporal heterogeneities, and imbalanced distributions of important patient outcomes.
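
Cost-sensitive learning over a recurrent sequence embedding, as described above, is commonly realized by weighting the loss in inverse proportion to class frequency; a minimal PyTorch sketch with hypothetical counts and dimensions (not the authors' architecture):

```python
import torch
import torch.nn as nn

# Hypothetical imbalance: the complication phenotype occurs in ~10% of patients,
# so its class receives a proportionally larger weight in the loss.
class_counts = torch.tensor([900.0, 100.0])
class_weights = class_counts.sum() / (2 * class_counts)   # [0.56, 5.0]

criterion = nn.CrossEntropyLoss(weight=class_weights)

# A GRU over a (batch, time, features) sequence of visit embeddings stands in
# for the recurrent sequence-embedding component described above.
gru = nn.GRU(input_size=16, hidden_size=32, batch_first=True)
head = nn.Linear(32, 2)

x = torch.randn(4, 20, 16)            # 4 patients, 20 time steps, 16 features
_, h = gru(x)
loss = criterion(head(h[-1]), torch.tensor([0, 0, 1, 0]))
print(float(loss))
```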

19. Deep neural networks identify signaling mechanisms of ErbB-family drug resistance from a continuous cell morphology space. Cell Rep 2021;34:108657. PMID: 33472071. DOI: 10.1016/j.celrep.2020.108657.
Abstract
It is well known that the development of drug resistance in cancer cells can lead to changes in cell morphology. Here, we describe the use of deep neural networks to analyze this relationship, demonstrating that complex cell morphologies can encode states of signaling networks and unravel cellular mechanisms hidden to conventional approaches. We perform high-content screening of 17 cancer cell lines, generating more than 500 billion data points from ∼850 million cells. We analyze these data using a deep learning model, resulting in the identification of a continuous 27-dimension space describing all of the observed cell morphologies. From its morphology alone, we could thus predict whether a cell was resistant to ErbB-family drugs, with an accuracy of 74%, and predict the potential mechanism of resistance, subsequently validating the role of MET and insulin-like growth factor 1 receptor (IGF1R) as drivers of cetuximab resistance in in vitro models of lung and head/neck cancer.

20. Lower body kinematics estimation from wearable sensors for walking and running: A deep learning approach. Gait Posture 2021;83:185-193. PMID: 33161275. DOI: 10.1016/j.gaitpost.2020.10.026.
Abstract
BACKGROUND Inertial measurement units (IMUs) are promising tools for collecting human movement data. Model-based filtering approaches (e.g. the Extended Kalman Filter) have been proposed to estimate joint angles from IMU data, but little is known about the potential of data-driven approaches. RESEARCH QUESTION Can deep learning models accurately predict lower limb joint angles from IMU data during gait? METHODS Lower-limb kinematic data were simultaneously measured with a marker-based motion capture system and running leggings with 5 integrated IMUs measuring acceleration and angular velocity at the pelvis, thighs and tibias. Data acquisition was performed on 27 participants (26.5 (3.9) years, 1.75 (0.07) m, 68.3 (10.0) kg) while walking at 4 and 6 km/h and running at 8, 10, 12 and 14 km/h on a treadmill. The model input consists of raw IMU data, while the output estimates the joint angles of the lower body. The model was trained with a nested k-fold cross-validation and tested using a user-independent approach. Mean error (ME), mean absolute error (MAE) and Pearson correlation coefficient (r) were computed between the ground truth and predicted joint angles. RESULTS MAE for the DOFs ranged from 2.2(0.9) to 5.1(2.7)° with an average of 3.6(2.1)°. r ranged from 0.67(0.23) to 0.99(0.01); moderate correlation (0.4≤r<0.7) was found for right hip rotation and lumbar extension, strong correlation (0.7≤r<0.9) for left hip rotation and right/left ankle inversion, while all other DOFs showed very strong correlation (r≥0.9). SIGNIFICANCE The proposed model can reliably predict joint kinematics for walking, running and gait transitions without specific knowledge about the body characteristics of the wearer, or the position and orientation of the IMU relative to the attached segment. These results have been validated with treadmill gait, and have not yet been confirmed for gait in other settings.

21. Deep Learning for Osteoporosis Classification Using Hip Radiographs and Patient Clinical Covariates. Biomolecules 2020;10:biom10111534. PMID: 33182778. PMCID: PMC7697189. DOI: 10.3390/biom10111534.
Abstract
This study considers the use of deep learning to diagnose osteoporosis from hip radiographs, and whether adding clinical data improves diagnostic performance over the image-only model. For objective labeling, we collected a dataset containing 1131 images from patients who underwent both skeletal bone mineral density measurement and hip radiography at a single general hospital between 2014 and 2019. Osteoporosis was assessed from the hip radiographs using five convolutional neural network (CNN) models. We also investigated ensemble models with clinical covariates added to each CNN. The accuracy, precision, recall, specificity, negative predictive value (NPV), F1 score, and area under the curve (AUC) score were calculated for each network. In the evaluation of the five CNN models using only hip radiographs, GoogleNet and EfficientNet b3 exhibited the best accuracy, precision, and specificity. Among the five ensemble models, EfficientNet b3 exhibited the best accuracy, recall, NPV, F1 score, and AUC score when patient variables were included. The CNN models diagnosed osteoporosis from hip radiographs with high accuracy, and their performance improved further with the addition of clinical covariates from patient records.
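
One common way to build the image-plus-covariates ensemble models described above is to concatenate the CNN's pooled image embedding with the clinical variables before a final classifier; a hedged PyTorch sketch with untrained weights and hypothetical covariates, assuming a recent torchvision (the 1536-dim embedding matches EfficientNet-B3):

```python
import torch
import torch.nn as nn
import torchvision

# Backbone for the hip radiograph; classifier removed to expose the image embedding.
backbone = torchvision.models.efficientnet_b3(weights=None)
backbone.classifier = nn.Identity()

class ImagePlusCovariates(nn.Module):
    def __init__(self, n_covariates=3):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(1536 + n_covariates, 2)   # osteoporosis vs. normal

    def forward(self, image, covariates):
        feats = self.backbone(image)                     # (batch, 1536) image features
        return self.head(torch.cat([feats, covariates], dim=1))

model = ImagePlusCovariates()
# Hypothetical covariates per patient: age, BMI, sex (0/1).
logits = model(torch.randn(2, 3, 300, 300),
               torch.tensor([[78.0, 22.4, 1.0], [65.0, 27.8, 0.0]]))
print(logits.shape)   # torch.Size([2, 2])
```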

22. Marrying Medical Domain Knowledge With Deep Learning on Electronic Health Records: A Deep Visual Analytics Approach. J Med Internet Res 2020;22:e20645. PMID: 32985996. PMCID: PMC7551124. DOI: 10.2196/20645.
Abstract
BACKGROUND Deep learning models have attracted significant interest from health care researchers during the last few decades. There have been many studies that apply deep learning to medical applications and achieve promising results. However, there are three limitations to the existing models: (1) most clinicians are unable to interpret the results from the existing models, (2) existing models cannot incorporate complicated medical domain knowledge (eg, a disease causes another disease), and (3) most existing models lack visual exploration and interaction. Both the electronic health record (EHR) data set and the deep model results are complex and abstract, which impedes clinicians from exploring and communicating with the model directly. OBJECTIVE The objective of this study is to develop an interpretable and accurate risk prediction model as well as an interactive clinical prediction system to support EHR data exploration, knowledge graph demonstration, and model interpretation. METHODS A domain-knowledge-guided recurrent neural network (DG-RNN) model is proposed to predict clinical risks. The model takes medical event sequences as input and incorporates medical domain knowledge by attending to a subgraph of the whole medical knowledge graph. A global pooling operation and a fully connected layer are used to output the clinical outcomes. The middle results and the parameters of the fully connected layer are helpful in identifying which medical events cause clinical risks. DG-Viz is also designed to support EHR data exploration, knowledge graph demonstration, and model interpretation. RESULTS We conducted both risk prediction experiments and a case study on a real-world data set. A total of 554 patients with heart failure and 1662 control patients without heart failure were selected from the data set. The experimental results show that the proposed DG-RNN outperforms the state-of-the-art approaches by approximately 1.5%. The case study demonstrates how our medical physician collaborator can effectively explore the data and interpret the prediction results using DG-Viz. CONCLUSIONS In this study, we present DG-Viz, an interactive clinical prediction system, which brings together the power of deep learning (ie, a DG-RNN-based model) and visual analytics to predict clinical risks and visually interpret the EHR prediction results. Experimental results and a case study on heart failure risk prediction tasks demonstrate the effectiveness and usefulness of the DG-Viz system. This study will pave the way for interactive, interpretable, and accurate clinical risk predictions.

23. Dynamics and Development of the COVID-19 Epidemic in the United States: A Compartmental Model Enhanced With Deep Learning Techniques. J Med Internet Res 2020;22:e21173. PMID: 32763892. PMCID: PMC7451112. DOI: 10.2196/21173.
Abstract
BACKGROUND Compartmental models dominate epidemic modeling. Transmission parameters between compartments are typically estimated through stochastic parameterization processes that depend on detailed statistics of transmission characteristics, which are costly and resource-intensive to collect. OBJECTIVE We aim to apply deep learning techniques as a lower-data-dependency alternative for estimating the transmission parameters of a customized compartmental model, for the purpose of simulating the dynamics of the US coronavirus disease (COVID-19) epidemic and projecting its further development. METHODS We constructed a compartmental model and developed a multistep deep learning methodology to estimate the model's transmission parameters. We then fed the estimated transmission parameters to the model to predict development of the US COVID-19 epidemic for 35 and 42 days. Epidemics are considered suppressed when the basic reproduction number (R0) is less than 1. RESULTS The deep learning-enhanced compartmental model predicts that R0 will fall to <1 around August 17-19, 2020, at which point the epidemic will effectively start to die out, and that the US "infected" population will peak around August 16-18, 2020, at 3,228,574 to 3,308,911 individual cases. The model also predicted that the number of cumulative confirmed cases will cross the 5 million mark around August 7, 2020. CONCLUSIONS Current compartmental models require stochastic parameterization to estimate the transmission parameters. These models' effectiveness depends upon detailed statistics on transmission characteristics. As an alternative, deep learning techniques are effective in estimating these stochastic parameters with greatly reduced dependency on data particularity.
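
A toy discrete-time SIR model illustrates how externally estimated, time-varying transmission parameters drive the infected count and the reproduction number; the declining schedule below is illustrative only, not the paper's fitted parameters or its customized compartmental structure:

```python
import numpy as np

def simulate_sir(beta_by_day, gamma=1 / 10, population=330e6, infected0=1e5, days=42):
    """Discrete-time SIR with an externally estimated, time-varying transmission rate."""
    S, I, R = population - infected0, infected0, 0.0
    history = []
    for t in range(days):
        beta = beta_by_day[min(t, len(beta_by_day) - 1)]
        new_infections = beta * S * I / population
        recoveries = gamma * I
        S, I, R = S - new_infections, I + new_infections - recoveries, R + recoveries
        history.append((t, I, beta / gamma))      # beta/gamma is the reproduction number
    return history

# Hypothetical declining transmission rates, e.g. as output by a parameter-estimation model.
beta_schedule = np.linspace(0.16, 0.08, 42)
for day, infected, r0 in simulate_sir(beta_schedule)[::14]:
    print(f"day {day}: infected = {infected:,.0f}, R0 = {r0:.2f}")
```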

24. A New Image Classification Approach via Improved MobileNet Models with Local Receptive Field Expansion in Shallow Layers. Comput Intell Neurosci 2020;2020:8817849. PMID: 32802028. PMCID: PMC7416240. DOI: 10.1155/2020/8817849.
Abstract
Because deep neural networks (DNNs) are both memory-intensive and computation-intensive, they are difficult to apply to embedded systems with limited hardware resources. Therefore, DNN models need to be compressed and accelerated. By applying depthwise separable convolutions, MobileNet can decrease the number of parameters and computational complexity with less loss of classification precision. Based on MobileNet, 3 improved MobileNet models with local receptive field expansion in shallow layers, also called Dilated-MobileNet (Dilated Convolution MobileNet) models, are proposed, in which dilated convolutions are introduced into a specific convolutional layer of the MobileNet model. Without increasing the number of parameters, dilated convolutions are used to increase the receptive field of the convolution filters to obtain better classification accuracy. The experiments were performed on the Caltech-101, Caltech-256, and Tübingen Animals with Attributes datasets, respectively. The results show that Dilated-MobileNets can obtain up to 2% higher classification accuracy than MobileNet.
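
The Dilated-MobileNet modification replaces the 3x3 depthwise convolution in a shallow layer with a dilated one, enlarging its receptive field without adding parameters; a minimal PyTorch sketch of such a block (illustrative channel sizes, not the authors' exact configuration):

```python
import torch
import torch.nn as nn

class DilatedDepthwiseSeparable(nn.Module):
    """Depthwise separable convolution with a dilated depthwise stage."""
    def __init__(self, in_ch, out_ch, dilation=2):
        super().__init__()
        # padding = dilation keeps the spatial size for a 3x3 kernel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=dilation,
                                   dilation=dilation, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn1, self.bn2 = nn.BatchNorm2d(in_ch), nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU6(inplace=True)

    def forward(self, x):
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))

block = DilatedDepthwiseSeparable(32, 64, dilation=2)
print(block(torch.randn(1, 32, 56, 56)).shape)   # torch.Size([1, 64, 56, 56])
```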

25. Asthma Exacerbation Prediction and Risk Factor Analysis Based on a Time-Sensitive, Attentive Neural Network: Retrospective Cohort Study. J Med Internet Res 2020;22:e16981. PMID: 32735224. PMCID: PMC7428917. DOI: 10.2196/16981.
Abstract
BACKGROUND Asthma exacerbation is an acute or subacute episode of progressive worsening of asthma symptoms and can have a significant impact on patients' quality of life. However, efficient methods that can help identify personalized risk factors and make early predictions are lacking. OBJECTIVE This study aims to use advanced deep learning models to better predict the risk of asthma exacerbations and to explore potential risk factors involved in progressive asthma. METHODS We proposed a novel time-sensitive, attentive neural network to predict asthma exacerbation using clinical variables from large electronic health records. The clinical variables were collected from the Cerner Health Facts database between 1992 and 2015, including 31,433 adult patients with asthma. Interpretations on both patient and cohort levels were investigated based on the model parameters. RESULTS The proposed model obtained an area under the curve value of 0.7003 through a five-fold cross-validation, which outperformed the baseline methods. The results also demonstrated that the addition of elapsed time embeddings considerably improved the prediction performance. Further analysis revealed diverse distributions of contributing factors across patients, as well as some possible cohort-level risk factors, such as respiratory diseases and esophageal reflux, for which supporting evidence could be found in the peer-reviewed literature. CONCLUSIONS The proposed neural network model performed better than previous methods for the prediction of asthma exacerbation. We believe that personalized risk scores and analyses of contributing factors can help clinicians better assess the individual's level of disease progression and afford the opportunity to adjust treatment, prevent exacerbation, and improve outcomes.
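
A simplified stand-in for the elapsed-time embeddings described above: project each time gap into the event-embedding space, add it to the clinical-event embedding, and attention-pool over the sequence (hypothetical vocabulary size and dimensions, not the authors' model):

```python
import torch
import torch.nn as nn

class TimeAwareEventEncoder(nn.Module):
    """Adds an embedding of elapsed time (in days) to each clinical-event embedding."""
    def __init__(self, n_codes=2000, dim=64):
        super().__init__()
        self.code_emb = nn.Embedding(n_codes, dim)
        self.time_proj = nn.Linear(1, dim)    # continuous elapsed time -> embedding space
        self.attn_score = nn.Linear(dim, 1)   # one attention weight per event

    def forward(self, codes, days_since_prev):
        x = self.code_emb(codes) + self.time_proj(days_since_prev.unsqueeze(-1))
        weights = torch.softmax(self.attn_score(x).squeeze(-1), dim=1)
        return (weights.unsqueeze(-1) * x).sum(dim=1)     # attention-pooled patient vector

enc = TimeAwareEventEncoder()
codes = torch.randint(0, 2000, (4, 12))                   # 4 patients, 12 coded events
gaps = torch.rand(4, 12) * 90                             # days since previous event
print(enc(codes, gaps).shape)                             # torch.Size([4, 64])
```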

26.
Abstract
BACKGROUND Identifying drug-target interactions is a key element in drug discovery. In silico prediction of drug-target interactions can speed up the process of identifying unknown interactions between drugs and target proteins. In recent studies, handcrafted features, similarity metrics and machine learning methods have been proposed for predicting drug-target interactions. However, these methods cannot fully learn the underlying relations between drugs and targets. In this paper, we propose a new framework for drug-target interaction prediction that learns latent features from the drug-target interaction network. RESULTS We present a framework that utilizes the network topology to identify interacting and non-interacting drug-target pairs. We model the problem as a semi-bipartite graph in which we are able to use drug-drug and protein-protein similarity in a drug-protein network. We then use a graph labeling method for vertex ordering in our graph embedding process. Finally, we employ a deep neural network to learn the complex pattern of interacting pairs from the embedded graphs. We show that our approach is able to learn sophisticated drug-target topological features and outperforms other state-of-the-art approaches. CONCLUSIONS The proposed learning model on the semi-bipartite graph can integrate drug-drug and protein-protein similarities, which are semantically different from drug-protein information, in a drug-target interaction network. We show that our model can determine the interaction likelihood for each drug-target pair and outperforms other heuristics.

27. Solo: Doublet Identification in Single-Cell RNA-Seq via Semi-Supervised Deep Learning. Cell Syst 2020;11:95-101.e5. PMID: 32592658. DOI: 10.1016/j.cels.2020.05.010.
Abstract
Single-cell RNA sequencing (scRNA-seq) measurements of gene expression enable an unprecedented high-resolution view into cellular state. However, current methods often result in two or more cells that share the same cell-identifying barcode; these "doublets" violate the fundamental premise of single-cell technology and can lead to incorrect inferences. Here, we describe Solo, a semi-supervised deep learning approach that identifies doublets with greater accuracy than existing methods. Solo embeds cells unsupervised using a variational autoencoder and then appends a feed-forward neural network layer to the encoder to form a supervised classifier. We train this classifier to distinguish simulated doublets from the observed data. Solo can be applied in combination with experimental doublet detection methods to further purify scRNA-seq data to true single cells. It is freely available from https://github.com/calico/solo. A record of this paper's transparent peer review process is included in the Supplemental Information.
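
The abstract's "simulated doublets" can be sketched by summing the counts of two randomly chosen observed barcodes, the construction commonly used by doublet detectors of this kind; synthetic counts only, not the released implementation:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical scRNA-seq counts: 1,000 barcodes x 200 genes.
counts = rng.poisson(1.0, size=(1000, 200))

# Simulated doublets: sum the counts of two randomly chosen observed barcodes,
# providing positive (doublet) examples for the classifier.
idx_a = rng.integers(0, counts.shape[0], size=300)
idx_b = rng.integers(0, counts.shape[0], size=300)
simulated_doublets = counts[idx_a] + counts[idx_b]

X = np.vstack([counts, simulated_doublets])
y = np.concatenate([np.zeros(counts.shape[0]), np.ones(simulated_doublets.shape[0])])
print(X.shape, y.mean())   # classifier training set: observed cells vs. simulated doublets
```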
|
28
|
Identification of the Facial Features of Patients With Cancer: A Deep Learning-Based Pilot Study. J Med Internet Res 2020; 22:e17234. [PMID: 32347802 PMCID: PMC7221634 DOI: 10.2196/17234] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2019] [Revised: 02/12/2020] [Accepted: 03/05/2020] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND Cancer has become the second leading cause of death globally. Most cancer cases are due to genetic mutations, which affect metabolism and result in facial changes. OBJECTIVE In this study, we aimed to identify the facial features of patients with cancer using the deep learning technique. METHODS Images of the faces of patients with cancer were collected to build the cancer face image data set. A face image data set of people without cancer was built by randomly selecting images from the publicly available MegaAge data set according to the sex and age distribution of the cancer face image data set. Each face image was preprocessed to obtain an upright centered face chip, after which the background was filtered out to exclude the effects of irrelevant factors. A residual neural network was constructed to classify cancer and noncancer cases. Transfer learning, mini-batches, few epochs, L2 regularization, and random dropout training strategies were used to prevent overfitting. Moreover, guided gradient-weighted class activation mapping was used to reveal the relevant features. RESULTS A total of 8124 face images of patients with cancer (men: n=3851, 47.4%; women: n=4273, 52.6%) were collected from January 2018 to January 2019. The ages of the patients ranged from 1 year to 70 years (median age 52 years). The average faces of both male and female patients with cancer displayed more obvious facial adiposity than the average faces of people without cancer, which was supported by a landmark comparison. Training was terminated after 5 epochs. On the test data set, the area under the receiver operating characteristic curve was 0.94, and the accuracy rate was 0.82. The main relevant feature of cancer cases was facial skin, while the relevant features of noncancer cases were extracted from the complementary face region. CONCLUSIONS In this study, we built a face data set of patients with cancer and constructed a deep learning model to classify the faces of people with and those without cancer. We found that facial skin and adiposity were closely related to the presence of cancer.
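A minimal sketch of the training strategy named in the abstract (transfer learning, random dropout, and L2 regularization via weight decay) might look as follows; the backbone choice, layer freezing and hyperparameters are assumptions, since the paper's exact configuration is not given here.

```python
# Sketch of a transfer-learning setup with dropout and weight decay (assumed
# backbone and hyperparameters; not the authors' exact configuration).
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained backbone
for p in model.parameters():                 # freeze the pretrained layers
    p.requires_grad = False
model.fc = nn.Sequential(                    # new head: random dropout + 2-class output
    nn.Dropout(p=0.5),
    nn.Linear(model.fc.in_features, 2),
)
optimizer = torch.optim.Adam(
    model.fc.parameters(), lr=1e-4, weight_decay=1e-4  # weight_decay acts as L2 regularization
)
# Train with small mini-batches for only a few epochs, as in the study, to limit overfitting.
```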
|
29
|
Emerging role of deep learning-based artificial intelligence in tumor pathology. Cancer Commun (Lond) 2020; 40:154-166. [PMID: 32277744 PMCID: PMC7170661 DOI: 10.1002/cac2.12012] [Citation(s) in RCA: 143] [Impact Index Per Article: 35.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2020] [Accepted: 02/06/2020] [Indexed: 12/11/2022] Open
Abstract
The development of digital pathology and the progression of state-of-the-art algorithms for computer vision have led to increasing interest in the use of artificial intelligence (AI), especially deep learning (DL)-based AI, in tumor pathology. DL-based algorithms have been developed to conduct all kinds of work involved in tumor pathology, including tumor diagnosis, subtyping, grading, staging, and prognostic prediction, as well as the identification of pathological features, biomarkers and genetic changes. The applications of AI in pathology not only contribute to improved diagnostic accuracy and objectivity but also reduce the workload of pathologists, enabling them to spend more time on high-level decision-making tasks. In addition, AI can help pathologists meet the requirements of precision oncology. However, there are still challenges to the implementation of AI, including issues of algorithm validation and interpretability, computing systems, skepticism among pathologists, clinicians and patients, as well as regulatory and reimbursement issues. Herein, we present an overview of how AI-based approaches could be integrated into the workflow of pathologists and discuss the challenges and perspectives of the implementation of AI in tumor pathology.
|
30
|
Agreement of two pre-trained deep-learning neural networks built with transfer learning with six pathologists on 6000 patches of prostate cancer from Gleason2019 Challenge. ROMANIAN JOURNAL OF MORPHOLOGY AND EMBRYOLOGY = REVUE ROUMAINE DE MORPHOLOGIE ET EMBRYOLOGIE 2020; 61:513-519. [PMID: 33544803 PMCID: PMC7864291 DOI: 10.47162/rjme.61.2.21] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
INTRODUCTION While visual inspection of histopathology images by expert pathologists remains the gold-standard method for grading prostate cancer, the quest to develop automated algorithms for this task is underway, and deep-learning techniques have emerged ahead of other approaches. METHODS Two deep-learning networks, obtained by transfer learning from two general-purpose classification networks - AlexNet and GoogleNet - and originally trained on a proprietary dataset of prostate cancer, were used to classify 6000 cropped images from the Gleason2019 Challenge. RESULTS The average agreement between the two networks and the six pathologists was found to be substantial for AlexNet and moderate for GoogleNet. When tested against the majority vote of the six pathologists, the agreement was perfect for AlexNet and moderate for GoogleNet. Despite our expectations, the average inter-pathologist agreement was moderate, while the agreement between the two networks was substantial. The resulting accuracies for AlexNet and GoogleNet, tested against the majority vote as ground truth, were 85.51% and 74.75%, respectively. These results were higher than the scores obtained on the dataset the networks were trained on, demonstrating their generalization capability. CONCLUSIONS Both the agreement and the accuracy indicate a better performance of AlexNet over GoogleNet, making it suitable for clinical deployment and thus potentially contributing to faster, more accurate and more reproducible prostate cancer diagnosis.
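The agreement analysis itself is straightforward to outline: derive a majority-vote label per patch and compute Cohen's kappa and accuracy against each network's predictions. The sketch below uses placeholder labels, and the exact kappa variant (weighted or unweighted) is an assumption.

```python
# Sketch of the agreement analysis: Cohen's kappa between model predictions and
# the pathologists' majority vote (placeholder labels; Gleason grades as integers).
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score

def majority_vote(pathologist_labels):
    """pathologist_labels: array of shape (n_patches, n_pathologists)."""
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 1, pathologist_labels)

rng = np.random.default_rng(0)
pathologists = rng.integers(0, 4, size=(6000, 6))   # placeholder annotations
model_pred = rng.integers(0, 4, size=6000)          # placeholder network output
truth = majority_vote(pathologists)
print("kappa:", cohen_kappa_score(model_pred, truth))
print("accuracy:", accuracy_score(truth, model_pred))
```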
|
31
|
Effect of a deep-learning computer-aided detection system on adenoma detection during colonoscopy (CADe-DB trial): a double-blind randomised study. Lancet Gastroenterol Hepatol 2020; 5:343-351. [PMID: 31981517 DOI: 10.1016/s2468-1253(19)30411-x] [Citation(s) in RCA: 245] [Impact Index Per Article: 61.3] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/18/2019] [Revised: 11/05/2019] [Accepted: 11/07/2019] [Indexed: 12/17/2022]
|
32
|
CNN-MHSA: A Convolutional Neural Network and multi-head self-attention combined approach for detecting phishing websites. Neural Netw 2020; 125:303-312. [PMID: 32172140 DOI: 10.1016/j.neunet.2020.02.013] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2019] [Revised: 01/19/2020] [Accepted: 02/20/2020] [Indexed: 11/19/2022]
Abstract
The growing number of phishing sites poses great threats because such sites are difficult for users to recognize as malicious. They rely on users mistaking them for legitimate sites in order to steal user information and assets without notice. The conventional way to mitigate such threats is to set up blacklists. However, blacklists cannot detect one-time Uniform Resource Locators (URLs) that have not yet appeared in the list. As an improvement, deep learning methods have been applied to increase detection accuracy and reduce the misjudgment ratio. However, some of them focus only on the characters in URLs and ignore the relationships between characters, so detection accuracy still needs to be improved. Considering that multi-head self-attention (MHSA) can learn the inner structure of URLs, in this paper we propose CNN-MHSA, an approach that combines a Convolutional Neural Network (CNN) with MHSA for highly precise detection of phishing websites. To achieve this goal, CNN-MHSA first takes a URL string as the input data and feeds it into a mature CNN model to extract its features. Meanwhile, MHSA is applied to exploit the relationships between characters in the URL and to calculate the corresponding weights for the CNN-learned features. Finally, CNN-MHSA produces a highly precise detection result for a URL by integrating its features and their weights. Thorough experiments on a dataset collected in a real-world environment demonstrate that our method achieves 99.84% accuracy, outperforming the classical CNN-LSTM method and exceeding other similar methods by at least 6.25% on average.
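One way to realize the described combination, CNN features over URL characters reweighted by multi-head self-attention, is sketched below. Layer sizes, the gating of CNN features by the attention output, and the pooling step are assumptions; the paper's exact architecture may differ.

```python
# Rough sketch of a CNN + multi-head self-attention combination over URL characters
# (assumed layer sizes; not the published architecture).
import torch
import torch.nn as nn

class CNNMHSA(nn.Module):
    def __init__(self, n_chars=128, emb=32, channels=64, heads=4):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb, padding_idx=0)
        self.conv = nn.Conv1d(emb, channels, kernel_size=3, padding=1)
        self.mhsa = nn.MultiheadAttention(emb, heads, batch_first=True)
        self.gate = nn.Linear(emb, channels)        # map attention output to feature weights
        self.out = nn.Linear(channels, 1)

    def forward(self, url_ids):                     # url_ids: (batch, max_len) char indices
        x = self.emb(url_ids)
        feats = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)  # CNN features
        attn, _ = self.mhsa(x, x, x)                # character relationships via MHSA
        weights = torch.sigmoid(self.gate(attn))    # per-position feature weights
        pooled = (feats * weights).mean(dim=1)      # weighted, pooled features
        return torch.sigmoid(self.out(pooled)).squeeze(-1)   # phishing probability
```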
|
33
|
Quantitative Assessment of the Effects of Compression on Deep Learning in Digital Pathology Image Analysis. JCO Clin Cancer Inform 2020; 4:221-233. [PMID: 32155093 PMCID: PMC7113072 DOI: 10.1200/cci.19.00068] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/15/2020] [Indexed: 11/20/2022] Open
Abstract
PURPOSE Deep learning (DL), a class of approaches involving self-learned discriminative features, is increasingly being applied to digital pathology (DP) images for tasks such as disease identification and segmentation of tissue primitives (eg, nuclei, glands, lymphocytes). One application of DP is in telepathology, which involves digitally transmitting DP slides over the Internet for secondary diagnosis by an expert at a remote location. Unfortunately, the places benefiting most from telepathology often have poor Internet quality, resulting in prohibitive transmission times of DP images. Image compression may help, but the degree to which image compression affects performance of DL algorithms has been largely unexplored. METHODS We investigated the effects of image compression on the performance of DL strategies in the context of 3 representative use cases involving segmentation of nuclei (n = 137), segmentation of lymph node metastasis (n = 380), and lymphocyte detection (n = 100). For each use case, test images at various levels of compression (JPEG compression quality score ranging from 1-100 and JPEG2000 compression peak signal-to-noise ratio ranging from 18-100 dB) were evaluated by a DL classifier. Performance metrics including F1 score and area under the receiver operating characteristic curve were computed at the various compression levels. RESULTS Our results suggest that DP images can be compressed by 85% while still maintaining the performance of the DL algorithms at 95% of what is achievable without any compression. Interestingly, the maximum compression level sustainable by DL algorithms is similar to where pathologists also reported difficulties in providing accurate interpretations. CONCLUSION Our findings seem to suggest that in low-resource settings, DP images can be significantly compressed before transmission for DL-based telepathology applications.
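The experimental loop is easy to emulate: re-encode each test image at a range of JPEG quality scores and re-run a fixed evaluation. A sketch with Pillow follows; `evaluate_model` is a placeholder for whichever segmentation or detection metric is being measured.

```python
# Sketch of a JPEG-quality sweep for a fixed, already-trained model
# (`evaluate_model` is a placeholder, not part of any library).
import io
from PIL import Image

def compressed_copy(path, quality):
    """Return an in-memory JPEG re-encoding of the image at the given quality score."""
    buf = io.BytesIO()
    Image.open(path).convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf)

def sweep(paths, evaluate_model, qualities=(1, 5, 10, 25, 50, 75, 100)):
    """Evaluate the same model on progressively less-compressed copies of the test set."""
    return {q: evaluate_model([compressed_copy(p, q) for p in paths]) for q in qualities}
```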
|
34
|
The diagnostic accuracy of artificial intelligence in thoracic diseases: A protocol for systematic review and meta-analysis. Medicine (Baltimore) 2020; 99:e19114. [PMID: 32049826 PMCID: PMC7035064 DOI: 10.1097/md.0000000000019114] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open
Abstract
INTRODUCTION Thoracic diseases include a variety of common human primary malignant tumors, of which lung cancer and esophageal cancer rank among the top 10 in cancer incidence and mortality. Early diagnosis is an important part of cancer treatment, so artificial intelligence (AI) systems have been developed for the accurate and automated detection and diagnosis of thoracic tumors. However, complicated AI architectures and image processing pipelines can make the diagnostic results of AI-based systems unstable. The purpose of this study is to systematically review published evidence to explore the accuracy of AI systems in diagnosing thoracic cancers. METHODS AND ANALYSIS We will conduct a systematic review and meta-analysis of the diagnostic accuracy of AI systems for the prediction of thoracic diseases. The primary objective is to assess the diagnostic accuracy of AI systems for thoracic cancers, including assessing potential biases and calculating combined estimates of sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). The secondary objective is to evaluate the factors associated with different models, classifiers, and radiomics information. We will search databases such as PubMed/MEDLINE, Embase (via OVID), and the Cochrane Library. Two reviewers will independently screen titles and abstracts, perform full article reviews and extract study data. We will report study characteristics and assess methodological quality using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. RevMan 5.3 and Meta-DiSc 1.4 software will be used for data synthesis. If pooling is appropriate, we will produce summary receiver operating characteristic (SROC) curves, summary operating points (pooled sensitivity and specificity), and 95% confidence intervals around the summary operating points. Methodological subgroup and sensitivity analyses will be performed to explore heterogeneity. PROSPERO REGISTRATION NUMBER CRD42019135247.
|
35
|
SurgAI: deep learning for computerized laparoscopic image understanding in gynaecology. Surg Endosc 2020; 34:5377-5383. [PMID: 31996995 DOI: 10.1007/s00464-019-07330-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2019] [Accepted: 12/24/2019] [Indexed: 11/25/2022]
Abstract
BACKGROUND In laparoscopy, the digital camera offers surgeons the opportunity to receive support from image-guided surgery systems. Such systems require image understanding: the ability for a computer to understand what the laparoscope sees. Image understanding has recently progressed owing to the emergence of artificial intelligence and especially deep learning techniques. However, the state of the art of deep learning in gynaecology only offers image-based detection, reporting the presence or absence of an anatomical structure, without finding its location. A solution to the localisation problem is given by the concept of semantic segmentation, giving the detection and pixel-level location of a structure in an image. The state-of-the-art results in semantic segmentation are achieved by deep learning, whose usage requires a massive amount of annotated data. We propose the first dataset dedicated to this task and the first evaluation of deep learning-based semantic segmentation in gynaecology. METHODS We used the deep learning method called Mask R-CNN. Our dataset has 461 laparoscopic images manually annotated with three classes: uterus, ovaries and surgical tools. We split our dataset into 361 images for training Mask R-CNN and 100 images for evaluating its performance. RESULTS The segmentation accuracy is reported in terms of the percentage of overlap between the regions segmented by Mask R-CNN and the manually annotated ones. The accuracy is 84.5%, 29.6% and 54.5% for the uterus, ovaries and surgical tools, respectively. Automatic detection of these structures was then inferred from the semantic segmentation results, which led to state-of-the-art detection performance except for the ovaries. Specifically, the detection accuracy is 97%, 24% and 86% for the uterus, ovaries and surgical tools, respectively. CONCLUSION Our preliminary results are very promising, given the relatively small size of our initial dataset. The creation of an international surgical database seems essential.
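For context, adapting torchvision's off-the-shelf Mask R-CNN to this three-class problem follows the standard fine-tuning recipe sketched below (background plus uterus, ovaries and surgical tools); the authors' exact configuration and training schedule are not reproduced here.

```python
# Sketch of re-heading torchvision's Mask R-CNN for 3 foreground classes + background
# (default weights and head sizes; not the study's configuration).
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 4  # background + uterus + ovaries + surgical tools
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, num_classes)
in_feat_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feat_mask, 256, num_classes)
# Fine-tune on the 361 training images; evaluate segmentation overlap on the held-out 100.
```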
|
36
|
Training high-performance and large-scale deep neural networks with full 8-bit integers. Neural Netw 2020; 125:70-82. [PMID: 32070857 DOI: 10.1016/j.neunet.2019.12.027] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2019] [Revised: 11/16/2019] [Accepted: 12/27/2019] [Indexed: 11/18/2022]
Abstract
Deep neural network (DNN) quantization, which converts floating-point (FP) data in the network to integers (INT), is an effective way to shrink the model size for memory saving and to simplify the operations for compute acceleration. Recently, research on DNN quantization has developed from inference to training, laying a foundation for online training on accelerators. However, existing schemes, which leave batch normalization (BN) untouched during training, are mostly incomplete quantizations that still adopt high-precision FP in some parts of the data paths. Currently, there is no solution that can use only low bit-width INT data during the whole training process of large-scale DNNs with acceptable accuracy. In this work, by decomposing all the computation steps in DNNs and fusing three special quantization functions to satisfy the different precision requirements, we propose a unified complete quantization framework termed "WAGEUBN" to quantize DNNs across all data paths, including W (Weights), A (Activation), G (Gradient), E (Error), U (Update), and BN. Moreover, the Momentum optimizer is also quantized to realize a completely quantized framework. Experiments on ResNet18/34/50 models demonstrate that WAGEUBN can achieve competitive accuracy on the ImageNet dataset. For the first time, the study of quantization in large-scale DNNs is advanced to the full 8-bit INT level. In this way, all the operations in training and inference can be bit-wise operations, pushing towards faster processing speed, decreased memory cost, and higher energy efficiency. Our complete quantization framework has great potential for future efficient portable devices with online learning ability.
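As a simplified illustration of mapping FP tensors to 8-bit integers, a per-tensor symmetric quantizer is sketched below. WAGEUBN's actual quantization functions for W, A, G, E, U and BN are more elaborate and path-specific, so this shows only the basic idea.

```python
# Minimal per-tensor symmetric int8 quantizer (illustration only; not the WAGEUBN
# quantization functions).
import torch

def quantize_int8(x: torch.Tensor):
    qmax = 127
    scale = x.abs().max().clamp(min=1e-8) / qmax            # per-tensor scale
    q = torch.clamp(torch.round(x / scale), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    return q.float() * scale

w = torch.randn(256, 256)
q, s = quantize_int8(w)
print("max abs quantization error:", (dequantize(q, s) - w).abs().max().item())
```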
|
37
|
Abstract
Despite the high level of interest in the use of machine learning (ML) and neuroimaging to detect psychosis at the individual level, the reliability of the findings is unclear due to potential methodological issues that may have inflated the existing literature. This study aimed to elucidate the extent to which the application of ML to neuroanatomical data allows detection of first episode psychosis (FEP), while putting in place methodological precautions to avoid overoptimistic results. We tested both traditional ML and an emerging approach known as deep learning (DL) using 3 feature sets of interest: (1) surface-based regional volumes and cortical thickness, (2) voxel-based gray matter volume (GMV) and (3) voxel-based cortical thickness (VBCT). To assess the reliability of the findings, we repeated all analyses in 5 independent datasets, totaling 956 participants (514 FEP and 444 within-site matched controls). The performance was assessed via nested cross-validation (CV) and cross-site CV. Accuracies ranged from 50% to 70% for surface-based features; from 50% to 63% for GMV; and from 51% to 68% for VBCT. The best accuracies (70%) were achieved when DL was applied to surface-based features; however, these models generalized poorly to other sites. Findings from this study suggest that, when methodological precautions are adopted to avoid overoptimistic results, detection of individuals in the early stages of psychosis is more challenging than originally thought. In light of this, we argue that the current evidence for the diagnostic value of ML and structural neuroimaging should be reconsidered toward a more cautious interpretation.
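The nested cross-validation used as a safeguard against overoptimistic accuracy can be sketched with scikit-learn as follows; the classifier, hyperparameter grid and placeholder data are illustrative assumptions, not the study's pipeline.

```python
# Sketch of nested cross-validation: the inner loop tunes hyperparameters, the
# outer loop estimates generalization (placeholder data, not neuroimaging features).
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 50)), rng.integers(0, 2, 200)   # placeholder features/labels
inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
clf = GridSearchCV(make_pipeline(StandardScaler(), SVC()),
                   {"svc__C": [0.1, 1, 10]}, cv=inner)        # hyperparameters tuned inside
print("nested CV accuracy:", cross_val_score(clf, X, y, cv=outer).mean())
```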
|
38
|
Automatic detection of lesion load change in Multiple Sclerosis using convolutional neural networks with segmentation confidence. Neuroimage Clin 2019; 25:102104. [PMID: 31927500 PMCID: PMC6953959 DOI: 10.1016/j.nicl.2019.102104] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2019] [Revised: 09/27/2019] [Accepted: 11/18/2019] [Indexed: 12/19/2022]
Abstract
The detection of new or enlarged white-matter lesions is a vital task in the monitoring of patients undergoing disease-modifying treatment for multiple sclerosis. However, the definition of 'new or enlarged' is not fixed, and it is known that lesion counting is highly subjective, with a high degree of inter- and intra-rater variability. Automated methods for lesion quantification, if accurate enough, hold the potential to make the detection of new and enlarged lesions consistent and repeatable. However, the majority of lesion segmentation algorithms are not evaluated for their ability to separate radiologically progressive from radiologically stable patients, despite this being a pressing clinical use case. In this paper, we explore the ability of a deep learning segmentation classifier to separate stable from progressive patients by lesion volume and lesion count, and find that neither measure provides a good separation. Instead, we propose a method for identifying lesion changes of high certainty, and establish on an internal dataset of longitudinal multiple sclerosis cases that this method is able to separate progressive from stable time points with a very high level of discrimination (AUC = 0.999), whereas changes in lesion volume are much less able to perform this separation (AUC = 0.71). Validation of the method on two external datasets confirms that it is able to generalize beyond the setting in which it was trained, achieving accuracies of 75% and 85% in separating stable and progressive time points.
|
39
|
Deep Learning Based on Standard H&E Images of Primary Melanoma Tumors Identifies Patients at Risk for Visceral Recurrence and Death. Clin Cancer Res 2019; 26:1126-1134. [PMID: 31636101 PMCID: PMC8142811 DOI: 10.1158/1078-0432.ccr-19-1495] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2019] [Revised: 08/09/2019] [Accepted: 10/16/2019] [Indexed: 12/22/2022]
Abstract
PURPOSE Biomarkers for disease-specific survival (DSS) in early-stage melanoma are needed to select patients for adjuvant immunotherapy and accelerate clinical trial design. We present a pathology-based computational method using a deep neural network architecture for DSS prediction. EXPERIMENTAL DESIGN The model was trained on 108 patients from four institutions and tested on 104 patients from Yale School of Medicine (YSM, New Haven, CT). A receiver operating characteristic (ROC) curve was generated on the basis of vote aggregation of individual image sequences, an optimized cutoff was selected, and the computational model was tested on a third independent population of 51 patients from Geisinger Health Systems (GHS). RESULTS Area under the curve (AUC) in the YSM patients was 0.905 (P < 0.0001). AUC in the GHS patients was 0.880 (P < 0.0001). Using the cutoff selected in the YSM cohort, the computational model predicted DSS in the GHS cohort based on Kaplan-Meier (KM) analysis (P < 0.0001). CONCLUSIONS The novel method presented is applicable to digital images, obviating the need for sample shipment and manipulation and representing a practical advance over current genetic and IHC-based methods.
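The vote-aggregation and cutoff-selection steps can be outlined as below. The aggregation rule (mean of per-sequence votes) and the cutoff criterion (Youden's J) are assumptions made for illustration; the placeholder arrays stand in for per-patient votes and disease-specific survival events.

```python
# Sketch of aggregating per-sequence votes into a patient score and selecting an
# operating cutoff on a ROC curve (placeholder data; the authors' aggregation rule
# and cutoff criterion may differ).
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def patient_score(sequence_votes):
    """Mean of binary votes across a patient's image sequences."""
    return np.mean(sequence_votes)

rng = np.random.default_rng(0)
per_patient_votes = [rng.integers(0, 2, size=20) for _ in range(104)]  # placeholder votes
dss_event = rng.integers(0, 2, size=104)                               # placeholder outcomes

scores = np.array([patient_score(v) for v in per_patient_votes])
fpr, tpr, thresholds = roc_curve(dss_event, scores)
print("AUC:", roc_auc_score(dss_event, scores))
cutoff = thresholds[np.argmax(tpr - fpr)]   # Youden's J as one possible optimized cutoff
```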
|
40
|
Breast Cancer Detection and Diagnosis Using Mammographic Data: Systematic Review. J Med Internet Res 2019; 21:e14464. [PMID: 31350843 PMCID: PMC6688437 DOI: 10.2196/14464] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2019] [Revised: 06/11/2019] [Accepted: 06/12/2019] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Machine learning (ML) has become a vital part of medical imaging research. ML methods have evolved over the years from manually seeded inputs to automatic initializations. The advancements in the field of ML have led to more intelligent and self-reliant computer-aided diagnosis (CAD) systems, as the learning ability of ML methods has been constantly improving. More and more automated methods are emerging with deep feature learning and representations. Recent advancements in ML with deeper and more extensive representation approaches, commonly known as deep learning (DL) approaches, have made a significant impact on improving the diagnostic capabilities of CAD systems. OBJECTIVE This review aimed to survey both traditional ML and DL literature with particular application to breast cancer diagnosis. The review also provides a brief insight into some well-known DL networks. METHODS In this paper, we present an overview of ML and DL techniques with particular application to breast cancer. Specifically, we searched the PubMed, Google Scholar, MEDLINE, ScienceDirect, Springer, and Web of Science databases and retrieved the studies in DL from the past 5 years that have used multiview mammogram datasets. RESULTS The analysis of traditional ML reveals the limited usage of these methods, whereas the DL methods have great potential for implementation in clinical analysis and for improving the diagnostic capability of existing CAD systems. CONCLUSIONS From the literature, it can be found that heterogeneous breast densities make masses more challenging to detect and classify compared with calcifications. The traditional ML methods present confined approaches, limited to either a particular density type or particular datasets. Although the DL methods show promising improvements in breast cancer diagnosis, there are still issues of data scarcity and computational cost, which have been overcome to a significant extent by applying data augmentation and the improved computational power of DL algorithms.
|
41
|
Fully Automated, Quality-Controlled Cardiac Analysis From CMR: Validation and Large-Scale Application to Characterize Cardiac Function. JACC Cardiovasc Imaging 2019; 13:684-695. [PMID: 31326477 PMCID: PMC7060799 DOI: 10.1016/j.jcmg.2019.05.030] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/30/2019] [Revised: 04/26/2019] [Accepted: 05/16/2019] [Indexed: 12/13/2022]
Abstract
Objectives This study sought to develop a fully automated framework for cardiac function analysis from cardiac magnetic resonance (CMR), including comprehensive quality control (QC) algorithms to detect erroneous output. Background Analysis of cine CMR imaging using deep learning (DL) algorithms could automate ventricular function assessment. However, variable image quality, variability in phenotypes of disease, and unavoidable weaknesses in training of DL algorithms currently prevent their use in clinical practice. Methods The framework consists of a pre-analysis DL image QC, followed by a DL algorithm for biventricular segmentation in long-axis and short-axis views, myocardial feature-tracking (FT), and a post-analysis QC to detect erroneous results. The study validated the framework in healthy subjects and cardiac patients by comparison against manual analysis (n = 100) and evaluation of the QC steps’ ability to detect erroneous results (n = 700). Next, this method was used to obtain reference values for cardiac function metrics from the UK Biobank. Results Automated analysis correlated highly with manual analysis for left and right ventricular volumes (all r > 0.95), strain (circumferential r = 0.89, longitudinal r > 0.89), and filling and ejection rates (all r ≥ 0.93). There was no significant bias for cardiac volumes and filling and ejection rates, except for right ventricular end-systolic volume (bias +1.80 ml; p = 0.01). The bias for FT strain was <1.3%. The sensitivity of detection of erroneous output was 95% for volume-derived parameters and 93% for FT strain. Finally, reference values were automatically derived from 2,029 CMR exams in healthy subjects. Conclusions The study demonstrates a DL-based framework for automated, quality-controlled characterization of cardiac function from cine CMR, without the need for direct clinician oversight.
|
42
|
Adverse Drug Event Detection from Electronic Health Records Using Hierarchical Recurrent Neural Networks with Dual-Level Embedding. Drug Saf 2019; 42:113-122. [PMID: 30649736 DOI: 10.1007/s40264-018-0765-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023]
Abstract
INTRODUCTION Adverse drug event (ADE) detection is a vital step towards effective pharmacovigilance and prevention of future incidents caused by potentially harmful ADEs. The electronic health records (EHRs) of patients in hospitals contain valuable information regarding ADEs and hence are an important source for detecting ADE signals. However, EHR texts tend to be noisy. Yet applying off-the-shelf tools for EHR text preprocessing jeopardizes the subsequent ADE detection performance, which depends on a well tokenized text input. OBJECTIVE In this paper, we report our experience with the NLP Challenges for Detecting Medication and Adverse Drug Events from Electronic Health Records (MADE1.0), which aims to promote deep innovations on this subject. In particular, we have developed rule-based sentence and word tokenization techniques to deal with the noise in the EHR text. METHODS We propose a detection methodology by adapting a three-layered, deep learning architecture of (1) recurrent neural network [bi-directional long short-term memory (Bi-LSTM)] for character-level word representation to encode the morphological features of the medical terminology, (2) Bi-LSTM for capturing the contextual information of each word within a sentence, and (3) conditional random fields for the final label prediction by also considering the surrounding words. We experiment with different word embedding methods commonly used in word-level classification tasks and demonstrate the impact of an integrated usage of both domain-specific and general-purpose pre-trained word embedding for detecting ADEs from EHRs. RESULTS Our system was ranked first for the named entity recognition task in the MADE1.0 challenge, with a micro-averaged F1-score of 0.8290 (official score). CONCLUSION Our results indicate that the integration of two widely used sequence labeling techniques that complement each other along with dual-level embedding (character level and word level) to represent words in the input layer results in a deep learning architecture that achieves excellent information extraction accuracy for EHR notes.
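A stripped-down version of the dual-level embedding architecture, a character-level Bi-LSTM feeding word representations into a word-level Bi-LSTM, is sketched below. All dimensions are assumptions, and the CRF output layer used in the paper is replaced by a plain linear projection to keep the example short.

```python
# Sketch of dual-level (character + word) Bi-LSTM encoding for sequence labeling
# (assumed dimensions; the paper's CRF layer is replaced by a linear projection).
import torch
import torch.nn as nn

class DualLevelTagger(nn.Module):
    def __init__(self, n_chars, n_words, n_labels, char_dim=25, word_dim=100, hidden=100):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.char_lstm = nn.LSTM(char_dim, char_dim, bidirectional=True, batch_first=True)
        self.word_emb = nn.Embedding(n_words, word_dim, padding_idx=0)
        self.word_lstm = nn.LSTM(word_dim + 2 * char_dim, hidden,
                                 bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden, n_labels)   # a CRF would normally sit here

    def forward(self, char_ids, word_ids):
        # char_ids: (batch, sent_len, word_len); word_ids: (batch, sent_len)
        b, s, c = char_ids.shape
        _, (h, _) = self.char_lstm(self.char_emb(char_ids.view(b * s, c)))
        char_repr = torch.cat([h[0], h[1]], dim=-1).view(b, s, -1)   # morphology features
        x = torch.cat([self.word_emb(word_ids), char_repr], dim=-1)  # dual-level embedding
        out, _ = self.word_lstm(x)
        return self.proj(out)   # per-token label scores (e.g. BIO tags for ADEs/medications)
```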
|
43
|
Application of deep learning (3-dimensional convolutional neural network) for the prediction of pathological invasiveness in lung adenocarcinoma: A preliminary study. Medicine (Baltimore) 2019; 98:e16119. [PMID: 31232960 PMCID: PMC6636940 DOI: 10.1097/md.0000000000016119] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open
Abstract
To compare results for radiological prediction of pathological invasiveness in lung adenocarcinoma between radiologists and a deep learning (DL) system. Ninety patients (50 men, 40 women; mean age, 66 years; range, 40-88 years) who underwent pre-operative chest computed tomography (CT) with 0.625-mm slice thickness were included in this retrospective study. Twenty-four cases of adenocarcinoma in situ (AIS), 20 cases of minimally invasive adenocarcinoma (MIA), and 46 cases of invasive adenocarcinoma (IVA) were pathologically diagnosed. Three radiologists of different levels of experience diagnosed each nodule by using previously documented CT findings to predict pathological invasiveness. The DL system was structured using a 3-dimensional (3D) convolutional neural network (3D-CNN) constructed with 2 successive pairs of convolution and max-pooling layers, and 2 fully connected layers. The output layer comprises 3 nodes to recognize the 3 conditions of adenocarcinoma (AIS, MIA, and IVA) or 2 nodes for 2 conditions (AIS and MIA/IVA). Results from DL and the 3 radiologists were statistically compared. No significant differences in pathological diagnostic accuracy rates were seen between DL and the 3 radiologists (P > .11). Receiver operating characteristic analysis demonstrated that the area under the curve for DL (0.712) was almost the same as that for the radiologist with extensive experience (0.714; P = .98). Compared with the consensus results from the radiologists, DL offered significantly inferior sensitivity (P = .0005) but significantly superior specificity (P = .02). Despite the small training data set, the diagnostic performance of DL was almost the same as that of the radiologist with extensive experience. In particular, DL provided higher specificity than the radiologists.
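The architecture description is concrete enough to sketch directly: two convolution/max-pooling pairs followed by two fully connected layers and a 3-node output. Channel counts and the input patch size below are assumptions, since they are not stated in the abstract.

```python
# Sketch of the described 3D-CNN: 2 x (conv + max-pool) followed by 2 fully connected
# layers with a 3-node output (AIS, MIA, IVA). Channels and patch size are assumed.
import torch
import torch.nn as nn

class Invasiveness3DCNN(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8 * 8 * 8, 128), nn.ReLU(),   # assumes 32x32x32 input patches
            nn.Linear(128, n_classes),                   # 3 nodes, or 2 for AIS vs MIA/IVA
        )

    def forward(self, x):          # x: (batch, 1, 32, 32, 32) CT nodule patch
        return self.classifier(self.features(x))

logits = Invasiveness3DCNN()(torch.randn(2, 1, 32, 32, 32))
```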
|
44
|
Deep learning-based preoperative predictive analytics for patient-reported outcomes following lumbar discectomy: feasibility of center-specific modeling. Spine J 2019; 19:853-861. [PMID: 30453080 DOI: 10.1016/j.spinee.2018.11.009] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/06/2018] [Revised: 11/12/2018] [Accepted: 11/12/2018] [Indexed: 02/03/2023]
Abstract
BACKGROUND CONTEXT There is considerable variability in patient-reported outcome measures following surgery for lumbar disc herniation. Individualized prediction tools that are derived from center- or even surgeon-specific data could provide valuable insights for shared decision-making. PURPOSE To evaluate the feasibility of deriving robust deep learning-based predictive analytics from single-center, single-surgeon data. STUDY DESIGN Derivation of predictive models from a prospective registry. PATIENT SAMPLE Patients who underwent single-level tubular microdiscectomy for lumbar disc herniation. OUTCOME MEASURES Numeric rating scales for leg and back pain severity and Oswestry Disability Index scores at 12 months postoperatively. METHODS Data were derived from a prospective registry. We trained deep neural network-based and logistic regression-based prediction models for patient-reported outcome measures. The primary endpoint was achievement of the minimum clinically important difference (MCID) in numeric rating scales and Oswestry Disability Index, defined as a 30% or greater improvement from baseline. Univariate predictors of MCID were also identified using conventional statistics. RESULTS A total of 422 patients were included (mean [SD] age: 48.5 [11.5] years; 207 [49%] female). After 1 year, 337 (80%), 219 (52%), and 337 (80%) patients reported a clinically relevant improvement in leg pain, back pain, and functional disability, respectively. The deep learning models predicted MCID with high area-under-the-curve of 0.87, 0.90, and 0.84, as well as accuracy of 85%, 87%, and 75%. The regression models provided inferior performance measures for each of the outcomes. CONCLUSIONS Our study demonstrates that generating personalized and robust deep learning-based analytics for outcome prediction is feasible even with limited amounts of center-specific data. With prospective validation, the ability to preoperatively and reliably inform patients about the likelihood of symptom improvement could prove useful in patient counselling and shared decision-making.
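The outcome definition and model comparison can be sketched as follows: label MCID as a 30% or greater improvement from baseline, then compare a small neural network against logistic regression under cross-validation. The features and data here are placeholders, not the registry variables.

```python
# Sketch of MCID labeling (>= 30% improvement from baseline) and a neural network vs.
# logistic regression comparison (placeholder data, not the prospective registry).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

def mcid_label(baseline, followup_12m):
    """1 if the 12-month score improved by at least 30% from baseline."""
    return ((baseline - followup_12m) / np.clip(baseline, 1e-6, None) >= 0.30).astype(int)

rng = np.random.default_rng(0)
X = rng.normal(size=(422, 20))                                # placeholder preoperative features
y = mcid_label(rng.uniform(4, 10, 422), rng.uniform(0, 10, 422))
for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("deep net", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500))]:
    print(name, cross_val_score(model, X, y, scoring="roc_auc", cv=5).mean())
```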
|