Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Eid FE, Elmarakeby HA, Chan YA, Fornelos N, ElHefnawi M, Van Allen EM, Heath LS, Lage K. Systematic auditing is essential to debiasing machine learning in biology. Commun Biol 2021;4:183. [PMID: 33568741 DOI: 10.1038/s42003-021-01674-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/01/2020] [Accepted: 11/12/2020] [Indexed: 12/20/2022]

Cited by Other Article(s)

Makarov V, Chabbert C, Koletou E, Psomopoulos F, Kurbatova N, Ramirez S, Nelson C, Natarajan P, Neupane B. Good machine learning practices: Learnings from the modern pharmaceutical discovery enterprise. Comput Biol Med 2024;177:108632. [PMID: 38788373 DOI: 10.1016/j.compbiomed.2024.108632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 05/07/2024] [Accepted: 05/18/2024] [Indexed: 05/26/2024]

Cappelletti L, Rekerle L, Fontana T, Hansen P, Casiraghi E, Ravanmehr V, Mungall CJ, Yang JJ, Spranger L, Karlebach G, Caufield JH, Carmody L, Coleman B, Oprea TI, Reese J, Valentini G, Robinson PN. Node-degree aware edge sampling mitigates inflated classification performance in biomedical random walk-based graph representation learning. Bioinform Adv 2024;4:vbae036. [PMID: 38577542 PMCID: PMC10994718 DOI: 10.1093/bioadv/vbae036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 01/31/2024] [Accepted: 02/29/2024] [Indexed: 04/06/2024]

Affiliation(s)

Luca Cappelletti: AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milano 20133, Italy
Lauren Rekerle: The Jackson Laboratory for Genomic Medicine, CT 06032, United States
Tommaso Fontana: AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milano 20133, Italy
Peter Hansen: The Jackson Laboratory for Genomic Medicine, CT 06032, United States
Elena Casiraghi: AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milano 20133, Italy; Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, United States
Vida Ravanmehr: The Jackson Laboratory for Genomic Medicine, CT 06032, United States
Christopher J Mungall: Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, United States
Jeremy J Yang: Department of Internal Medicine and UNM Comprehensive Cancer Center, UNM School of Medicine, Albuquerque, NM 87102, United States
Leonard Spranger: Institute of Bioinformatics, Freie Universität Berlin, Berlin, 14195, Germany
Guy Karlebach: The Jackson Laboratory for Genomic Medicine, CT 06032, United States
J Harry Caufield: Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, United States
Leigh Carmody: The Jackson Laboratory for Genomic Medicine, CT 06032, United States
Ben Coleman: The Jackson Laboratory for Genomic Medicine, CT 06032, United States; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, United States
Tudor I Oprea: Department of Internal Medicine and UNM Comprehensive Cancer Center, UNM School of Medicine, Albuquerque, NM 87102, United States
Justin Reese: Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, United States
Giorgio Valentini: AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Milano 20133, Italy; ELLIS – European Laboratory for Learning and Intelligent Systems
Peter N Robinson: The Jackson Laboratory for Genomic Medicine, CT 06032, United States; Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, United States; ELLIS – European Laboratory for Learning and Intelligent Systems; Berlin Institute of Health, Charité – Universitätsmedizin Berlin, Berlin, 10117, Germany

Aguilera-Puga MDC, Cancelarich NL, Marani MM, de la Fuente-Nunez C, Plisson F. Accelerating the Discovery and Design of Antimicrobial Peptides with Artificial Intelligence. Methods Mol Biol 2024;2714:329-352. [PMID: 37676607 DOI: 10.1007/978-1-0716-3441-7_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]

Fernandez ME, Martinez-Romero J, Aon MA, Bernier M, Price NL, de Cabo R. How is Big Data reshaping preclinical aging research? Lab Anim (NY) 2023;52:289-314. [PMID: 38017182 DOI: 10.1038/s41684-023-01286-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 10/10/2023] [Indexed: 11/30/2023]

Ahlquist KD, Sugden LA, Ramachandran S. Enabling interpretable machine learning for biological data with reliability scores. PLoS Comput Biol 2023;19:e1011175. [PMID: 37235578 PMCID: PMC10249903 DOI: 10.1371/journal.pcbi.1011175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Revised: 06/08/2023] [Accepted: 05/10/2023] [Indexed: 05/28/2023]

Abstract

Machine learning tools have proven useful across biological disciplines, allowing researchers to draw conclusions from large datasets, and opening up new opportunities for interpreting complex and heterogeneous biological data. Alongside the rapid growth of machine learning, there have also been growing pains: some models that appear to perform well have later been revealed to rely on features of the data that are artifactual or biased; this feeds into the general criticism that machine learning models are designed to optimize model performance over the creation of new biological insights. A natural question arises: how do we develop machine learning models that are inherently interpretable or explainable? In this manuscript, we describe the SWIF(r) reliability score (SRS), a method building on the SWIF(r) generative framework that reflects the trustworthiness of the classification of a specific instance. The concept of the reliability score has the potential to generalize to other machine learning methods. We demonstrate the utility of the SRS when faced with common challenges in machine learning including: 1) an unknown class present in testing data that was not present in training data, 2) systemic mismatch between training and testing data, and 3) instances of testing data that have missing values for some attributes. We explore these applications of the SRS using a range of biological datasets, from agricultural data on seed morphology, to 22 quantitative traits in the UK Biobank, and population genetic simulations and 1000 Genomes Project data. With each of these examples, we demonstrate how the SRS can allow researchers to interrogate their data and training approach thoroughly, and to pair their domain-specific knowledge with powerful machine-learning frameworks. We also compare the SRS to related tools for outlier and novelty detection, and find that it has comparable performance, with the advantage of being able to operate when some data are missing. 
The SRS, and the broader discussion of interpretable scientific machine learning, will aid researchers in the biological machine learning space as they seek to harness the power of machine learning without sacrificing rigor and biological insight.

Couckuyt A, Seurinck R, Emmaneel A, Quintelier K, Novak D, Van Gassen S, Saeys Y. Challenges in translational machine learning. Hum Genet 2022;141:1451-1466. [PMID: 35246744 PMCID: PMC8896412 DOI: 10.1007/s00439-022-02439-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/09/2021] [Accepted: 02/08/2022] [Indexed: 11/25/2022]

Das S, Taylor K, Beaulah S, Gardner S. Systematic indication extension for drugs using patient stratification insights generated by combinatorial analytics. Patterns (N Y) 2022;3:100496. [PMID: 35755863 PMCID: PMC9214305 DOI: 10.1016/j.patter.2022.100496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]

Navigating the pitfalls of applying machine learning in genomics. Nat Rev Genet 2022;23:169-181. [PMID: 34837041 DOI: 10.1038/s41576-021-00434-9] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Accepted: 10/28/2021] [Indexed: 11/08/2022]

Barsi S, Szalai B. Modeling in systems biology: Causal understanding before prediction? Patterns (N Y) 2021;2:100280. [PMID: 34179849 PMCID: PMC8212131 DOI: 10.1016/j.patter.2021.100280] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]

Wu Z, Johnston KE, Arnold FH, Yang KK. Protein sequence design with deep generative models. Curr Opin Chem Biol 2021;65:18-27. [PMID: 34051682 DOI: 10.1016/j.cbpa.2021.04.004] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 02/24/2021] [Revised: 04/02/2021] [Accepted: 04/07/2021] [Indexed: 12/20/2022]

Deep Automation Bias: How to Tackle a Wicked Problem of AI? Big Data Cogn Comput 2021. [DOI: 10.3390/bdcc5020018] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/08/2023]