Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Tabe-Bordbar S, Emad A, Zhao SD, Sinha S. A closer look at cross-validation for assessing the accuracy of gene regulatory networks and models. Sci Rep 2018;8:6620. [PMID: 29700343 PMCID: PMC5920056 DOI: 10.1038/s41598-018-24937-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 09/22/2017] [Accepted: 04/09/2018] [Indexed: 11/26/2022] Open

Cited by Other Article(s)

Ab Rasid AM, Muazu Musa R, Abdul Majeed APP, Musawi Maliki ABH, Abdullah MR, Mohd Razmaan MA, Abu Osman NA. Physical fitness and motor ability parameters as predictors for skateboarding performance: A logistic regression modelling analysis. PLoS One 2024;19:e0296467. [PMID: 38329954 PMCID: PMC10852284 DOI: 10.1371/journal.pone.0296467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Accepted: 12/13/2023] [Indexed: 02/10/2024] Open

Hasegawa S, Sawada T, Serizawa T. Identification of Water-Soluble Polymers through Machine Learning of Fluorescence Signals from Multiple Peptide Sensors. ACS Applied Bio Materials 2023;6:4598-4602. [PMID: 37889623 PMCID: PMC10664068 DOI: 10.1021/acsabm.3c00736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 10/22/2023] [Accepted: 10/23/2023] [Indexed: 10/29/2023]

Sousa H, Musa RM, Clemente FM, Sarmento H, Gouveia ÉR. Physical predictors for retention and dismissal of professional soccer head coaches: an analysis of locomotor variables using logistic regression pipeline. Front Sports Act Living 2023;5:1301845. [PMID: 38053523 PMCID: PMC10694450 DOI: 10.3389/fspor.2023.1301845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 11/07/2023] [Indexed: 12/07/2023] Open

Greenberg ZF, Graim KS, He M. Towards artificial intelligence-enabled extracellular vesicle precision drug delivery. Adv Drug Deliv Rev 2023:114974. [PMID: 37356623 DOI: 10.1016/j.addr.2023.114974] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/06/2023] [Revised: 06/21/2023] [Accepted: 06/22/2023] [Indexed: 06/27/2023]

Nghiem N, Atkinson J, Nguyen BP, Tran-Duy A, Wilson N. Predicting high health-cost users among people with cardiovascular disease using machine learning and nationwide linked social administrative datasets. Health Economics Review 2023;13:9. [PMID: 36738348 PMCID: PMC9898915 DOI: 10.1186/s13561-023-00422-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 01/23/2023] [Indexed: 06/18/2023]

Baldrighi GN, Nova A, Bernardinelli L, Fazia T. A Pipeline for Phasing and Genotype Imputation on Mixed Human Data (Parents-Offspring Trios and Unrelated Subjects) by Reviewing Current Methods and Software. Life (Basel, Switzerland) 2022;12:life12122030. [PMID: 36556394 PMCID: PMC9781110 DOI: 10.3390/life12122030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 12/01/2022] [Accepted: 12/02/2022] [Indexed: 12/09/2022]

Szymborski J, Emad A. RAPPPID: towards generalizable protein interaction prediction with AWD-LSTM twin networks. Bioinformatics 2022;38:3958-3967. [PMID: 35771595 DOI: 10.1093/bioinformatics/btac429] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/13/2021] [Revised: 04/30/2022] [Accepted: 06/27/2022] [Indexed: 12/24/2022] Open

Wang CW, Chang CC, Lee YC, Lin YJ, Lo SC, Hsu PC, Liou YA, Wang CH, Chao TK. Weakly supervised deep learning for prediction of treatment effectiveness on ovarian cancer from histopathology images. Comput Med Imaging Graph 2022;99:102093. [PMID: 35752000 DOI: 10.1016/j.compmedimag.2022.102093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Revised: 05/13/2022] [Accepted: 06/03/2022] [Indexed: 11/30/2022]

Abstract

Despite the progress made during the last two decades in the surgery and chemotherapy of ovarian cancer, more than 70% of patients with advanced disease experience recurrence and die. Surgical debulking of tumors following chemotherapy is the conventional treatment for advanced carcinoma, but patients given this treatment remain at great risk of recurrence and of developing drug resistance, and only about 30% of the women affected will be cured. Bevacizumab is a humanized monoclonal antibody that blocks VEGF signaling in cancer, inhibits angiogenesis, and causes tumor shrinkage; it has recently been approved by the FDA as a monotherapy for advanced ovarian cancer in combination with chemotherapy. Considering the cost, the potential toxicity, and the finding that only a portion of patients will benefit from these drugs, the identification of new predictive methods for the treatment of ovarian cancer remains an urgent unmet medical need. In this study, we develop weakly supervised deep learning approaches to accurately predict the therapeutic effect of bevacizumab in ovarian cancer patients from histopathological hematoxylin and eosin stained whole slide images, without any pathologist-provided locally annotated regions. To the authors' best knowledge, this is the first model demonstrated to be effective for predicting the therapeutic response of patients with epithelial ovarian cancer to bevacizumab. Quantitative evaluation on a whole section dataset shows that the proposed method achieves high accuracy (0.882 ± 0.06), precision (0.921 ± 0.04), recall (0.912 ± 0.03), and F-measure (0.917 ± 0.07) using 5-fold cross validation and outperforms two state-of-the-art deep learning approaches, Coudray et al. (2018) and Campanella et al. (2019). For an independent TMA testing set, the three proposed methods obtain promising results with high recall (sensitivity) of 0.946, 0.893 and 0.964, respectively. The results suggest that the proposed method could be useful for guiding treatment by sparing patients without a positive therapeutic response from further treatment while keeping patients with a positive response in the treatment process. Furthermore, according to a Cox proportional hazards model analysis, patients predicted by the proposed model to be non-responsive had a significantly higher risk of cancer recurrence (hazard ratio = 13.727, p < 0.05) than patients predicted to be responsive.

Rajendran P, Pramanik M. High frame rate (∼3 Hz) circular photoacoustic tomography using single-element ultrasound transducer aided with deep learning. Journal of Biomedical Optics 2022;27:066005. [PMID: 36452448 PMCID: PMC9209813 DOI: 10.1117/1.jbo.27.6.066005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/27/2022] [Accepted: 06/01/2022] [Indexed: 05/29/2023]

Chintalapudi N, Angeloni U, Battineni G, di Canio M, Marotta C, Rezza G, Sagaro GG, Silenzi A, Amenta F. LASSO Regression Modeling on Prediction of Medical Terms among Seafarers’ Health Documents Using Tidy Text Mining. Bioengineering (Basel) 2022;9:bioengineering9030124. [PMID: 35324813 PMCID: PMC8945331 DOI: 10.3390/bioengineering9030124] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/07/2022] [Revised: 03/02/2022] [Accepted: 03/16/2022] [Indexed: 12/31/2022] Open

Li F, Zhou Y, Zhang Y, Yin J, Qiu Y, Gao J, Zhu F. POSREG: proteomic signature discovered by simultaneously optimizing its reproducibility and generalizability. Brief Bioinform 2022;23:6532538. [PMID: 35183059 DOI: 10.1093/bib/bbac040] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 11/30/2021] [Revised: 01/21/2022] [Accepted: 01/27/2022] [Indexed: 12/17/2022] Open

Jamdade R, Upadhyay M, Al Shaer K, Al Harthi E, Al Sallani M, Al Jasmi M, Al Ketbi A. Evaluation of Arabian Vascular Plant Barcodes (rbcL and matK): Precision of Unsupervised and Supervised Learning Methods towards Accurate Identification. Plants (Basel, Switzerland) 2021;10:plants10122741. [PMID: 34961211 PMCID: PMC8708657 DOI: 10.3390/plants10122741] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/19/2021] [Revised: 09/16/2021] [Accepted: 09/23/2021] [Indexed: 06/14/2023]

Abstract

Arabia is the largest peninsula in the world, with >3000 species of vascular plants. Little effort has been made to generate a multi-locus marker barcode library to identify and discriminate the recorded plant species. This study aimed to determine the reliability of the available Arabian plant barcodes (>1500; rbcL and matK) in the public repository (NCBI GenBank) using unsupervised and supervised methods. Comparative analysis was carried out with the standard dataset (FINBOL) to assess the reliability of the methods and markers. Our analysis suggests that, among the unsupervised methods, TaxonDNA's All Species Barcode criterion (ASB) exhibits the highest accuracy for rbcL barcodes, followed by the matK barcodes, using the aligned dataset (FINBOL). However, for the Arabian plant barcode dataset (GBMA), the supervised method performed better than the unsupervised method, with the Random Forest and K-Nearest Neighbor (gappy kernel) classifiers proving robust. These classifiers successfully recognized true species from both barcode markers belonging to the aligned and alignment-free datasets, respectively. The multi-class classifier showed high species resolution following the two classifiers, though its performance declined when employed to recognize true species. Similar results were observed for the FINBOL dataset through the supervised learning approach; overall, the matK marker showed higher accuracy than rbcL. However, the lower rate of species identification for matK in the GBMA data could be due to the higher evolutionary rate or to gaps and missing data, as observed for the ASB criterion in the FINBOL dataset. Further, a lower number of sequences and singletons could also affect the rate of species resolution, as observed in the GBMA dataset. The GBMA dataset lacks sufficient species membership. We would encourage taxonomists from the Arabian Peninsula to join our campaign on the Arabian Barcode of Life at the Barcode of Life Data (BOLD) systems. Our efforts together could help improve the rate of species identification for the Arabian vascular plants.

Guo MG, Sosa DN, Altman RB. Challenges and opportunities in network-based solutions for biological questions. Brief Bioinform 2021;23:6438103. [PMID: 34849568 PMCID: PMC8769687 DOI: 10.1093/bib/bbab437] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/23/2021] [Revised: 09/09/2021] [Accepted: 09/22/2021] [Indexed: 11/28/2022] Open

Wang CW, Huang SC, Lee YC, Shen YJ, Meng SI, Gaol JL. Deep learning for bone marrow cell detection and classification on whole-slide images. Med Image Anal 2021;75:102270. [PMID: 34710655 DOI: 10.1016/j.media.2021.102270] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 08/21/2020] [Revised: 10/06/2021] [Accepted: 10/13/2021] [Indexed: 12/19/2022]

Abstract

Bone marrow (BM) examination is an essential step in both diagnosing and managing numerous hematologic disorders. BM nucleated differential count (NDC) analysis, as part of BM examination, holds the most fundamental and crucial information. However, there are many challenges in performing automated BM NDC analysis on whole-slide images (WSIs), including the large dimensions of the data to process and complicated cell types with subtle differences. To the authors' best knowledge, this is the first study on fully automatic BM NDC using WSIs with 40x objective magnification, which can replace traditional manual counting relying on light microscopy via an oil-immersion 100x objective lens with a total 1000x magnification. In this study, we develop an efficient and fully automatic hierarchical deep learning framework for BM NDC WSI analysis in seconds. The proposed hierarchical framework consists of (1) a deep learning model for rapid localization of BM particles and cellular trails generating regions of interest (ROI) for further analysis, (2) a patch-based deep learning model for cell identification of 16 cell types, including megakaryocytes, mitotic cells, and four stages of erythroblasts, which have not been demonstrated in previous studies, and (3) a fast stitching model for integrating patch-based results and producing final outputs. In evaluation, the proposed method is first tested on a dataset with a total of 12,426 annotated cells using cross validation, achieving high recall and accuracy of 0.905 ± 0.078 and 0.989 ± 0.006, respectively, and taking only 44 seconds to perform BM NDC analysis for a WSI. To further examine the generalizability of our model, we conduct an evaluation on a second independent dataset with a total of 3005 cells, and the results show that the proposed method also obtains high recall and accuracy of 0.842 and 0.988, respectively. In comparison with the existing small-image-based benchmark methods, the proposed method demonstrates superior performance in recall, accuracy and computational time.

Pourashraf T, Shokri S, Yousefi M, Ahmadi A, Azar PA. Implementing Machine Learning in Laboratory Synthesis by Hybrid of SVR Model and Optimization Algorithms. Advanced Theory and Simulations 2021. [DOI: 10.1002/adts.202100225] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/21/2022]

Molina Mora JA, Montero-Manso P, García-Batán R, Campos-Sánchez R, Vilar-Fernández J, García F. A first perturbome of Pseudomonas aeruginosa: Identification of core genes related to multiple perturbations by a machine learning approach. Biosystems 2021;205:104411. [PMID: 33757842 DOI: 10.1016/j.biosystems.2021.104411] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/24/2020] [Revised: 03/11/2021] [Accepted: 03/12/2021] [Indexed: 01/27/2023]

Syed M, Syed S, Sexton K, Syeda HB, Garza M, Zozus M, Syed F, Begum S, Syed AU, Sanford J, Prior F. Application of Machine Learning in Intensive Care Unit (ICU) Settings Using MIMIC Dataset: Systematic Review. Informatics-Basel 2021;8. [PMID: 33981592 DOI: 10.3390/informatics8010016] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/16/2022]

Affiliation(s)

Mahanazuddin Syed Department of Biomedical Informatics, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA
Shorabuddin Syed Department of Biomedical Informatics, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA
Kevin Sexton Department of Biomedical Informatics, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA; Department of Surgery, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA; Department of Health Policy and Management, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA
Hafsa Bareen Syeda Department of Biomedical Informatics, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA
Maryam Garza Department of Biomedical Informatics, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA
Meredith Zozus Department of Population Health Sciences, University of Texas Health Science Center at San Antonio, San Antonio, Texas 78229, USA
Farhanuddin Syed Shadan Institute of Medical Sciences, College of Medicine, Hyderabad, Telangana 500086, India
Salma Begum Department of Information Technology, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA
Abdullah Usama Syed Department of Information Science, University of Arkansas at Little Rock (UALR), Little Rock, Arkansas 72205, USA
Joseph Sanford Department of Biomedical Informatics, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA; Department of Anesthesiology, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA
Fred Prior Department of Biomedical Informatics, University of Arkansas for Medical Sciences (UAMS), Little Rock, Arkansas 72205, USA

Lüftinger L, Májek P, Beisken S, Rattei T, Posch AE. Learning From Limited Data: Towards Best Practice Techniques for Antimicrobial Resistance Prediction From Whole Genome Sequencing Data. Front Cell Infect Microbiol 2021;11:610348. [PMID: 33659219 PMCID: PMC7917081 DOI: 10.3389/fcimb.2021.610348] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/25/2020] [Accepted: 01/11/2021] [Indexed: 01/20/2023] Open

Abstract

Antimicrobial resistance prediction from whole genome sequencing data (WGS) is an emerging application of machine learning, promising to improve antimicrobial resistance surveillance and outbreak monitoring. Despite significant reductions in sequencing cost, the availability and sampling diversity of WGS data with matched antimicrobial susceptibility testing (AST) profiles required for training of WGS-AST prediction models remains limited. Best practice machine learning techniques are required to ensure trained models generalize to independent data for optimal predictive performance. Limited data restricts the choice of machine learning training and evaluation methods and can result in overestimation of model performance. We demonstrate that the widely used random k-fold cross-validation method is ill-suited for application to small bacterial genomics datasets and offer an alternative cross-validation method based on genomic distance. We benchmarked three machine learning architectures previously applied to the WGS-AST problem on a set of 8,704 genome assemblies from five clinically relevant pathogens across 77 species-compound combinations collated from public databases. We show that individual models can be effectively ensembled to improve model performance. By combining models via stacked generalization with cross-validation, a model ensembling technique suitable for small datasets, we improved average sensitivity and specificity of individual models by 1.77% and 3.20%, respectively. Furthermore, stacked models exhibited improved robustness and were thus less prone to outlier performance drops than individual component models. In this study, we highlight best practice techniques for antimicrobial resistance prediction from WGS data and introduce the combination of genome distance aware cross-validation and stacked generalization for robust and accurate WGS-AST.

Bobak CA, Kang L, Workman L, Bateman L, Khan MS, Prins M, May L, Franchina FA, Baard C, Nicol MP, Zar HJ, Hill JE. Breath can discriminate tuberculosis from other lower respiratory illness in children. Sci Rep 2021;11:2704. [PMID: 33526828 PMCID: PMC7851130 DOI: 10.1038/s41598-021-80970-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/11/2020] [Accepted: 12/28/2020] [Indexed: 01/30/2023] Open

Affiliation(s)

Carly A. Bobak Thayer School of Engineering, Dartmouth College, Hanover, NH, USA (GRID grid.254880.3; ISNI 0000 0001 2179 2404); Geisel School of Medicine, Dartmouth College, Hanover, NH, USA (GRID grid.254880.3; ISNI 0000 0001 2179 2404)
Lili Kang Thayer School of Engineering, Dartmouth College, Hanover, NH, USA (GRID grid.254880.3; ISNI 0000 0001 2179 2404)
Lesley Workman Department of Pediatrics and Child Health, MRC Unit on Child and Adolescent Health, University of Cape Town and Red Cross War Memorial Children’s Hospital, Cape Town, South Africa (GRID grid.415742.1; ISNI 0000 0001 2296 3850)
Lindy Bateman Department of Pediatrics and Child Health, MRC Unit on Child and Adolescent Health, University of Cape Town and Red Cross War Memorial Children’s Hospital, Cape Town, South Africa (GRID grid.415742.1; ISNI 0000 0001 2296 3850)
Mohammad S. Khan Thayer School of Engineering, Dartmouth College, Hanover, NH, USA (GRID grid.254880.3; ISNI 0000 0001 2179 2404)
Margaretha Prins Department of Pediatrics and Child Health, MRC Unit on Child and Adolescent Health, University of Cape Town and Red Cross War Memorial Children’s Hospital, Cape Town, South Africa (GRID grid.415742.1; ISNI 0000 0001 2296 3850)
Lloyd May Thayer School of Engineering, Dartmouth College, Hanover, NH, USA (GRID grid.254880.3; ISNI 0000 0001 2179 2404)
Flavio A. Franchina Thayer School of Engineering, Dartmouth College, Hanover, NH, USA (GRID grid.254880.3; ISNI 0000 0001 2179 2404); Molecular Systems, Organic and Biological Analytical Chemistry Group, University of Liège, Liège, Belgium (GRID grid.4861.b; ISNI 0000 0001 0805 7253)
Cynthia Baard Department of Pediatrics and Child Health, MRC Unit on Child and Adolescent Health, University of Cape Town and Red Cross War Memorial Children’s Hospital, Cape Town, South Africa (GRID grid.415742.1; ISNI 0000 0001 2296 3850)
Mark P. Nicol Division of Medical Microbiology and Institute for Infectious Diseases and Molecular Medicine, University of Cape Town, Cape Town, South Africa (GRID grid.7836.a; ISNI 0000 0004 1937 1151); School of Biomedical Sciences, University of Western Australia, Perth, Australia (GRID grid.1012.2; ISNI 0000 0004 1936 7910)
Heather J. Zar Department of Pediatrics and Child Health, MRC Unit on Child and Adolescent Health, University of Cape Town and Red Cross War Memorial Children’s Hospital, Cape Town, South Africa (GRID grid.415742.1; ISNI 0000 0001 2296 3850)
Jane E. Hill Thayer School of Engineering, Dartmouth College, Hanover, NH, USA (GRID grid.254880.3; ISNI 0000 0001 2179 2404)

Alafeef M, Srivastava I, Pan D. Machine Learning for Precision Breast Cancer Diagnosis and Prediction of the Nanoparticle Cellular Internalization. ACS Sens 2020;5:1689-1698. [PMID: 32466640 DOI: 10.1021/acssensors.0c00329] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 12/14/2022]

Abstract

In the field of theranostics, diagnostic nanoparticles are designed to collect highly patient-selective disease profiles, which are then leveraged by a set of nanotherapeutics to improve the therapeutic results. Despite their early promise, high interpatient and intratumoral heterogeneities make any rational design and analysis of these theranostics platforms extremely problematic. Recent advances in deep-learning-based tools may help bridge this gap, using pattern recognition algorithms for better diagnostic precision and therapeutic outcome. Triple-negative breast cancer (TNBC) is a conundrum because of its complex molecular diversity, which makes its diagnosis and therapy challenging. To address these challenges, we propose a method to predict the cellular internalization of nanoparticles (NPs) against different cancer stages using artificial intelligence. Here, we demonstrate for the first time that a combination of a machine-learning (ML) algorithm and characteristic cellular uptake responses for individual cancer cell types can be successfully used to classify various cancer cell types. Utilizing this approach, we can optimize the nanomaterials to get an optimum structure-internalization response for a given particle. This methodology predicted the structure-internalization response of the evaluated nanoparticles with remarkable accuracy (Q² = 0.9). We anticipate that it can reduce the effort by minimizing the number of nanoparticles that need to be tested and could be utilized as a screening tool for designing nanotherapeutics. Following this, we have proposed a diagnostic nanomaterial-based platform used to assemble a patient-specific cancer profile with the assistance of machine learning (ML). The platform is composed of eight carbon nanoparticles (CNPs) with multifarious surface chemistries that can differentiate healthy breast cells from cancerous cells and then subclassify TNBC cells vs non-TNBC cells, within the TNBC group. The artificial neural network (ANN) algorithm has been successfully used in identifying the type of cancer cells from 36 unknown cancer samples with an overall accuracy of >98%, providing potential applications in cancer diagnostics.

Machine learning-based lifetime breast cancer risk reclassification compared with the BOADICEA model: impact on screening recommendations. Br J Cancer 2020;123:860-867. [PMID: 32565540 PMCID: PMC7463251 DOI: 10.1038/s41416-020-0937-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/02/2019] [Revised: 05/13/2020] [Accepted: 05/29/2020] [Indexed: 12/17/2022] Open

Mayhew MB, Buturovic L, Luethy R, Midic U, Moore AR, Roque JA, Shaller BD, Asuni T, Rawling D, Remmel M, Choi K, Wacker J, Khatri P, Rogers AJ, Sweeney TE. A generalizable 29-mRNA neural-network classifier for acute bacterial and viral infections. Nat Commun 2020;11:1177. [PMID: 32132525 PMCID: PMC7055276 DOI: 10.1038/s41467-020-14975-w] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 06/18/2019] [Accepted: 02/13/2020] [Indexed: 02/07/2023] Open

Nair JKR, Saeed UA, McDougall CC, Sabri A, Kovacina B, Raidu BVS, Khokhar RA, Probst S, Hirsh V, Chankowsky J, Van Kempen LC, Taylor J. Radiogenomic Models Using Machine Learning Techniques to Predict EGFR Mutations in Non-Small Cell Lung Cancer. Can Assoc Radiol J 2020;72:109-119. [PMID: 32063026 DOI: 10.1177/0846537119899526] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Indexed: 12/31/2022] Open

Huang EW, Bhope A, Lim J, Sinha S, Emad A. Tissue-guided LASSO for prediction of clinical drug response using preclinical samples. PLoS Comput Biol 2020;16:e1007607. [PMID: 31967990 PMCID: PMC6975549 DOI: 10.1371/journal.pcbi.1007607] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 08/06/2019] [Accepted: 12/15/2019] [Indexed: 12/12/2022] Open

Abstract

Prediction of clinical drug response (CDR) of cancer patients, based on their clinical and molecular profiles obtained prior to administration of the drug, can play a significant role in individualized medicine. Machine learning models have the potential to address this issue but training them requires data from a large number of patients treated with each drug, limiting their feasibility. While large databases of drug response and molecular profiles of preclinical in-vitro cancer cell lines (CCLs) exist for many drugs, it is unclear whether preclinical samples can be used to predict CDR of real patients. We designed a systematic approach to evaluate how well different algorithms, trained on gene expression and drug response of CCLs, can predict CDR of patients. Using data from two large databases, we evaluated various linear and non-linear algorithms, some of which utilized information on gene interactions. Then, we developed a new algorithm called TG-LASSO that explicitly integrates information on samples' tissue of origin with gene expression profiles to improve prediction performance. Our results showed that regularized regression methods provide better prediction performance. However, including the network information or common methods of including information on the tissue of origin did not improve the results. On the other hand, TG-LASSO improved the predictions and distinguished resistant and sensitive patients for 7 out of 13 drugs. Additionally, TG-LASSO identified genes associated with the drug response, including known targets and pathways involved in the drugs' mechanism of action. Moreover, genes identified by TG-LASSO for multiple drugs in a tissue were associated with patient survival. In summary, our analysis suggests that preclinical samples can be used to predict CDR of patients and identify biomarkers of drug sensitivity and survival.

Setting the standards for machine learning in biology. Nat Rev Mol Cell Biol 2019;20:659-660. [DOI: 10.1038/s41580-019-0176-5] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Indexed: 02/08/2023]

Picart-Armada S, Barrett SJ, Willé DR, Perera-Lluna A, Gutteridge A, Dessailly BH. Benchmarking network propagation methods for disease gene identification. PLoS Comput Biol 2019;15:e1007276. [PMID: 31479437 PMCID: PMC6743778 DOI: 10.1371/journal.pcbi.1007276] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/24/2019] [Revised: 09/13/2019] [Accepted: 07/16/2019] [Indexed: 12/17/2022] Open

Abstract

In-silico identification of potential target genes for disease is an essential aspect of drug target discovery. Recent studies suggest that successful targets can be found by leveraging genetic, genomic and protein interaction information. Here, we systematically tested the ability of 12 varied algorithms, based on network propagation, to identify genes that have been targeted by any drug, on gene-disease data from 22 common non-cancerous diseases in OpenTargets. We considered two biological networks and six performance metrics, and compared two types of input gene-disease association scores. The impact of the design factors on performance was quantified through additive explanatory models. Standard cross-validation led to over-optimistic performance estimates due to the presence of protein complexes. In order to obtain realistic estimates, we introduced two novel protein complex-aware cross-validation schemes. When seeding biological networks with known drug targets, machine learning and diffusion-based methods found around 2-4 true targets within the top 20 suggestions. Seeding the networks with genes associated with disease by genetics decreased performance below 1 true hit on average. The use of a larger network, although noisier, improved overall performance. We conclude that diffusion-based prioritisers and machine learning applied to diffusion-based features are suited for drug discovery in practice and improve over simpler neighbour-voting methods. We also demonstrate the large impact of choosing an adequate validation strategy and the definition of seed disease genes.

The use of biological network data has proven its effectiveness in many areas of computational biology. Networks consist of nodes, usually genes or proteins, and edges that connect pairs of nodes, representing information such as physical interactions, regulatory roles or co-occurrence. In order to find new candidate nodes for a given biological property, the so-called network propagation algorithms start from the set of known nodes with that property and leverage the connections from the biological network to make predictions. Here, we assess the performance of several network propagation algorithms to find sensible gene targets for 22 common non-cancerous diseases, i.e. those that have been found promising enough to start clinical trials with any compound. We focus on obtaining performance metrics that reflect a practical scenario in drug development where only a small set of genes can be assayed. We found that the presence of protein complexes biased the performance estimates, leading to over-optimistic conclusions, and introduced two novel strategies to address it. Our results support that network propagation is still a viable approach to find drug targets, but that special care needs to be taken with the validation strategy. Algorithms benefitted from the use of a larger, although noisier, network and of direct evidence data, rather than indirect genetic associations to disease.

Zhang W, Li W, Zhang J, Wang N. Data Integration of Hybrid Microarray and Single Cell Expression Data to Enhance Gene Network Inference. Curr Bioinform 2019. [DOI: 10.2174/1574893614666190104142228] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]