1
Turley J, Chenchiah IV, Martin P, Liverpool TB, Weavers H. Deep learning for rapid analysis of cell divisions in vivo during epithelial morphogenesis and repair. eLife 2024; 12:RP87949. [PMID: 39312468 PMCID: PMC11419669 DOI: 10.7554/elife.87949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024] Open
Abstract
Cell division is fundamental to all healthy tissue growth, as well as being rate-limiting in the tissue repair response to wounding and during cancer progression. However, the role that cell divisions play in tissue growth is a collective one, requiring the integration of many individual cell division events. It is particularly difficult to accurately detect and quantify multiple features of large numbers of cell divisions (including their spatio-temporal synchronicity and orientation) over extended periods of time. It would thus be advantageous to perform such analyses in an automated fashion, which can naturally be enabled using deep learning. Hence, we develop a pipeline of deep learning models that accurately identify dividing cells in time-lapse movies of epithelial tissues in vivo. Our pipeline also determines their axis of division orientation, as well as their shape changes before and after division. This strategy enables us to analyse the dynamic profile of cell divisions within the Drosophila pupal wing epithelium, both as it undergoes developmental morphogenesis and as it repairs following laser wounding. We show that the division axis is biased according to lines of tissue tension and that wounding triggers a synchronised (but not oriented) burst of cell divisions back from the leading edge.
Affiliation(s)
- Jake Turley
  - School of Mathematics, University of Bristol, Bristol, United Kingdom
  - School of Biochemistry, University of Bristol, Bristol, United Kingdom
  - Mechanobiology Institute, National University of Singapore, Singapore, Singapore
- Paul Martin
  - School of Biochemistry, University of Bristol, Bristol, United Kingdom
- Helen Weavers
  - School of Biochemistry, University of Bristol, Bristol, United Kingdom
2
Baniasadi A, Das JP, Prendergast CM, Beizavi Z, Ma HY, Jaber MY, Capaccione KM. Imaging at the nexus: how state of the art imaging techniques can enhance our understanding of cancer and fibrosis. J Transl Med 2024; 22:567. [PMID: 38872212 PMCID: PMC11177383 DOI: 10.1186/s12967-024-05379-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2024] [Accepted: 06/06/2024] [Indexed: 06/15/2024] Open
Abstract
Both cancer and fibrosis are diseases involving dysregulation of cell signaling pathways, resulting in an altered cellular microenvironment which ultimately leads to progression of the condition. The two disease entities share common molecular pathophysiology, and recent research has illuminated how each promotes the other. Multiple imaging techniques have been developed to aid in the early and accurate diagnosis of each disease, and given the commonalities between the pathophysiology of the conditions, advances in imaging one disease have opened new avenues to study the other. Here, we detail the most up-to-date advances in imaging techniques for each disease and how they have crossed over to improve detection and monitoring of the other. We explore techniques in positron emission tomography (PET), magnetic resonance imaging (MRI), second generation harmonic imaging (SGHI), ultrasound (US), radiomics, and artificial intelligence (AI). A new diagnostic imaging tool in PET/computed tomography (CT) is the use of radiolabeled fibroblast activation protein inhibitor (FAPI). SGHI uses high-frequency sound waves to penetrate deeper into the tissue, providing a more detailed view of the tumor microenvironment. Artificial intelligence with the aid of advanced deep learning (DL) algorithms has been highly effective in training computer systems to diagnose and classify neoplastic lesions in multiple organs. Ultimately, advancing imaging techniques in cancer and fibrosis can lead to significantly more timely and accurate diagnoses of both diseases, resulting in better patient outcomes.
Affiliation(s)
- Alireza Baniasadi
  - Department of Radiology, Columbia University Irving Medical Center, 622 W 168th Street, New York, NY, 10032, USA
- Jeeban P Das
  - Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Conor M Prendergast
  - Department of Radiology, Columbia University Irving Medical Center, 622 W 168th Street, New York, NY, 10032, USA
- Zahra Beizavi
  - Department of Radiology, Columbia University Irving Medical Center, 622 W 168th Street, New York, NY, 10032, USA
- Hong Y Ma
  - Department of Radiology, Columbia University Irving Medical Center, 622 W 168th Street, New York, NY, 10032, USA
- Kathleen M Capaccione
  - Department of Radiology, Columbia University Irving Medical Center, 622 W 168th Street, New York, NY, 10032, USA
3
Rivero-Garcia I, Torres M, Sánchez-Cabo F. Deep generative models in single-cell omics. Comput Biol Med 2024; 176:108561. [PMID: 38749321 DOI: 10.1016/j.compbiomed.2024.108561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 04/30/2024] [Accepted: 05/05/2024] [Indexed: 05/31/2024]
Abstract
Deep Generative Models (DGMs) are becoming instrumental for inferring probability distributions inherent to complex processes, such as most questions in biomedical research. For many years, there was a lack of mathematical methods that would allow this inference in the scarce data scenario of biomedical research. The advent of single-cell omics has finally squared the so-called "skinny matrix", allowing the application of mathematical methods already extensively used in other areas. Moreover, it is now possible to integrate data at different molecular levels in thousands or even millions of samples, thanks to the number of single-cell atlases being collaboratively generated. Additionally, DGMs have proven useful in other frequent tasks in single-cell analysis pipelines, from dimensionality reduction and cell type annotation to RNA velocity inference. In spite of their promise, DGMs need to be used with caution in biomedical research, paying special attention to their use to answer the right questions and to the definition of appropriate error metrics and validation checkpoints that confirm not only their correct use but also their relevance. All in all, DGMs provide an exciting tool that opens a bright future for the integrative analysis of single-cell omics to understand health and disease.
Affiliation(s)
- Inés Rivero-Garcia
  - Universidad Politécnica de Madrid, Madrid, 28040, Spain; Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, 28029, Spain
- Miguel Torres
  - Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, 28029, Spain
- Fátima Sánchez-Cabo
  - Centro Nacional de Investigaciones Cardiovasculares (CNIC), Madrid, 28029, Spain
4
Lv Q, Liu Y, Sun Y, Wu M. Insight into deep learning for glioma IDH medical image analysis: A systematic review. Medicine (Baltimore) 2024; 103:e37150. [PMID: 38363910 PMCID: PMC10869095 DOI: 10.1097/md.0000000000037150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 01/11/2024] [Indexed: 02/18/2024] Open
Abstract
BACKGROUND: Deep learning techniques show enormous potential for medical image analysis, particularly in digital pathology. Concurrently, molecular markers have gained increasing significance over the past decade in the context of glioma patients, providing novel insights into diagnosis and more personalized treatment options. Deep learning combined with imaging and molecular analysis enables more accurate prognostication of patients, more accurate treatment plan proposals, and accurate biomarker (IDH) prediction for gliomas. This systematic study examines the development of deep learning techniques for IDH prediction using histopathology images, spanning the period from 2019 to 2023. METHOD: The study adhered to the PRISMA reporting requirements, and databases including PubMed, Google Scholar, Google Search, and preprint repositories (such as arXiv) were systematically queried for pertinent literature spanning the period from 2019 to the 30th of 2023. Search phrases related to deep learning, digital pathology, glioma, and IDH were used in combination. RESULTS: Fifteen papers meeting the inclusion criteria were included in the analysis. These criteria specifically encompassed studies utilizing deep learning for the analysis of hematoxylin and eosin images to determine the IDH status in patients with gliomas. CONCLUSIONS: When predicting the status of IDH, classifiers built on digital pathological images demonstrate exceptional performance. Predictive effectiveness is enhanced when an appropriate deep learning model is used. However, external verification is necessary to demonstrate their resilience and universality. Larger sample sizes and multicenter samples are necessary for more comprehensive research to evaluate performance and confirm clinical advantages.
Affiliation(s)
- Qingqing Lv
  - Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410008, Hunan, China
  - The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, 410078, Hunan, China
- Yihao Liu
  - Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410008, Hunan, China
  - The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, 410078, Hunan, China
- Yingnan Sun
  - Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410008, Hunan, China
- Minghua Wu
  - Hunan Cancer Hospital, The Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha 410008, Hunan, China
  - The Key Laboratory of Carcinogenesis of the Chinese Ministry of Health, The Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, 410078, Hunan, China
5
Ferro M, Falagario UG, Barone B, Maggi M, Crocetto F, Busetto GM, Giudice FD, Terracciano D, Lucarelli G, Lasorsa F, Catellani M, Brescia A, Mistretta FA, Luzzago S, Piccinelli ML, Vartolomei MD, Jereczek-Fossa BA, Musi G, Montanari E, Cobelli OD, Tataru OS. Artificial Intelligence in the Advanced Diagnosis of Bladder Cancer-Comprehensive Literature Review and Future Advancement. Diagnostics (Basel) 2023; 13:2308. [PMID: 37443700 DOI: 10.3390/diagnostics13132308] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/07/2023] [Revised: 07/03/2023] [Accepted: 07/05/2023] [Indexed: 07/15/2023] Open
Abstract
Artificial intelligence is widely regarded as the most promising future technology that will have a great impact on healthcare across all specialties. Its subsets, machine learning, deep learning, and artificial neural networks, are able to automatically learn from massive amounts of data and can improve prediction algorithms to enhance their performance. This area is still under development, but the latest evidence shows great potential in the diagnosis, prognosis, and treatment of urological diseases, including bladder cancer, which currently rely on old prediction tools and historical nomograms. This review focuses on highly significant and comprehensive literature evidence of artificial intelligence in the management of bladder cancer and investigates its near-term introduction into clinical practice.
Affiliation(s)
- Matteo Ferro
  - Department of Urology, IEO-European Institute of Oncology, IRCCS-Istituto di Ricovero e Cura a Carattere Scientifico, 20141 Milan, Italy
- Ugo Giovanni Falagario
  - Department of Urology and Organ Transplantation, University of Foggia, 71121 Foggia, Italy
- Biagio Barone
  - Urology Unit, Department of Surgical Sciences, AORN Sant'Anna e San Sebastiano, 81100 Caserta, Italy
- Martina Maggi
  - Department of Maternal Infant and Urologic Sciences, Policlinico Umberto I Hospital, Sapienza University of Rome, 00161 Rome, Italy
- Felice Crocetto
  - Department of Neurosciences and Reproductive Sciences and Odontostomatology, University of Naples Federico II, 80131 Naples, Italy
- Gian Maria Busetto
  - Department of Urology and Organ Transplantation, University of Foggia, 71121 Foggia, Italy
- Francesco Del Giudice
  - Department of Maternal Infant and Urologic Sciences, Policlinico Umberto I Hospital, Sapienza University of Rome, 00161 Rome, Italy
- Daniela Terracciano
  - Department of Translational Medical Sciences, University of Naples "Federico II", 80131 Naples, Italy
- Giuseppe Lucarelli
  - Urology, Andrology and Kidney Transplantation Unit, Department of Emergency and Organ Transplantation, University of Bari, 70124 Bari, Italy
- Francesco Lasorsa
  - Urology, Andrology and Kidney Transplantation Unit, Department of Emergency and Organ Transplantation, University of Bari, 70124 Bari, Italy
- Michele Catellani
  - Department of Urology, ASST Papa Giovanni XXIII, 24127 Bergamo, Italy
- Antonio Brescia
  - Department of Urology, IEO-European Institute of Oncology, IRCCS-Istituto di Ricovero e Cura a Carattere Scientifico, 20141 Milan, Italy
- Francesco Alessandro Mistretta
  - Department of Urology, IEO-European Institute of Oncology, IRCCS-Istituto di Ricovero e Cura a Carattere Scientifico, 20141 Milan, Italy
  - Department of Oncology and Hemato-Oncology, University of Milan, 20122 Milan, Italy
- Stefano Luzzago
  - Department of Urology, IEO-European Institute of Oncology, IRCCS-Istituto di Ricovero e Cura a Carattere Scientifico, 20141 Milan, Italy
  - Department of Oncology and Hemato-Oncology, University of Milan, 20122 Milan, Italy
- Mattia Luca Piccinelli
  - Department of Urology, IEO-European Institute of Oncology, IRCCS-Istituto di Ricovero e Cura a Carattere Scientifico, 20141 Milan, Italy
- Barbara Alicja Jereczek-Fossa
  - Department of Oncology and Hemato-Oncology, University of Milan, 20122 Milan, Italy
  - Division of Radiation Oncology, IEO-European Institute of Oncology IRCCS, 20141 Milan, Italy
- Gennaro Musi
  - Department of Urology, IEO-European Institute of Oncology, IRCCS-Istituto di Ricovero e Cura a Carattere Scientifico, 20141 Milan, Italy
  - Department of Oncology and Hemato-Oncology, University of Milan, 20122 Milan, Italy
- Emanuele Montanari
  - Department of Urology, Foundation IRCCS Ca' Granda-Ospedale Maggiore Policlinico, 20122 Milan, Italy
  - Department of Clinical Sciences and Community Health, University of Milan, 20122 Milan, Italy
- Ottavio de Cobelli
  - Department of Urology, IEO-European Institute of Oncology, IRCCS-Istituto di Ricovero e Cura a Carattere Scientifico, 20141 Milan, Italy
  - Department of Oncology and Hemato-Oncology, University of Milan, 20122 Milan, Italy
- Octavian Sabin Tataru
  - Department of Simulation Applied in Medicine, George Emil Palade University of Medicine, Pharmacy, Science and Technology of Târgu Mures, 540142 Târgu Mures, Romania
6
Fallahzadeh R, Bidoki NH, Stelzer IA, Becker M, Marić I, Chang AL, Culos A, Phongpreecha T, Xenochristou M, Francesco DD, Espinosa C, Berson E, Verdonk F, Angst MS, Gaudilliere B, Aghaeepour N. In-silico generation of high-dimensional immune response data in patients using a deep neural network. Cytometry A 2023; 103:392-404. [PMID: 36507780 PMCID: PMC10182197 DOI: 10.1002/cyto.a.24709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 10/14/2022] [Accepted: 11/29/2022] [Indexed: 12/15/2022]
Abstract
Technologies for single-cell profiling of the immune system have enabled researchers to extract rich interconnected networks of cellular abundance, phenotypical and functional cellular parameters. These studies can power machine learning approaches to understand the role of the immune system in various diseases. However, the performance of these approaches and the generalizability of the findings have been hindered by limited cohort sizes in translational studies, partially due to logistical demands and costs associated with longitudinal data collection in sufficiently large patient cohorts. An evolving challenge is the requirement for ever-increasing cohort sizes as the dimensionality of datasets grows. We propose a deep learning model derived from a novel pipeline of optimal temporal cell matching and overcomplete autoencoders that uses data from a small subset of patients to learn to forecast an entire patient's immune response in a high dimensional space from one timepoint to another. In our analysis of 1.08 million cells from patients pre- and post-surgical intervention, we demonstrate that the generated patient-specific data are qualitatively and quantitatively similar to real patient data by demonstrating fidelity, diversity, and usefulness.
Affiliation(s)
- Ramin Fallahzadeh
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Neda H. Bidoki
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Ina A. Stelzer
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
- Martin Becker
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Ivana Marić
  - Department of Pediatrics, Stanford University, Stanford, California, USA
- Alan L. Chang
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Anthony Culos
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Thanaphong Phongpreecha
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
  - Department of Pathology, Stanford University, Stanford, California, USA
- Maria Xenochristou
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Davide De Francesco
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Camilo Espinosa
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Eloise Berson
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Franck Verdonk
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
- Martin S. Angst
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
- Brice Gaudilliere
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Pediatrics, Stanford University, Stanford, California, USA
- Nima Aghaeepour
  - Department of Anesthesiology, Pain and Perioperative Medicine, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
  - Department of Pediatrics, Stanford University, Stanford, California, USA
7
Bakrania A, Joshi N, Zhao X, Zheng G, Bhat M. Artificial intelligence in liver cancers: Decoding the impact of machine learning models in clinical diagnosis of primary liver cancers and liver cancer metastases. Pharmacol Res 2023; 189:106706. [PMID: 36813095 DOI: 10.1016/j.phrs.2023.106706] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 01/14/2023] [Revised: 02/17/2023] [Accepted: 02/19/2023] [Indexed: 02/22/2023]
Abstract
Liver cancers are the fourth leading cause of cancer-related mortality worldwide. In the past decade, breakthroughs in the field of artificial intelligence (AI) have inspired development of algorithms in the cancer setting. A growing body of recent studies has evaluated machine learning (ML) and deep learning (DL) algorithms for pre-screening, diagnosis and management of liver cancer patients through diagnostic image analysis, biomarker discovery and prediction of personalized clinical outcomes. Despite the promise of these early AI tools, there is a significant need to explain the 'black box' of AI and work towards deployment to enable ultimate clinical translatability. Certain emerging fields such as RNA nanomedicine for targeted liver cancer therapy may also benefit from application of AI, specifically in nano-formulation research and development, given that they are still largely reliant on lengthy trial-and-error experiments. In this paper, we put forward the current landscape of AI in liver cancers along with the challenges of AI in liver cancer diagnosis and management. Finally, we discuss the future perspectives of AI application in liver cancer and how a multidisciplinary approach using AI in nanomedicine could accelerate the transition of personalized liver cancer medicine from the bench to the clinic.
Affiliation(s)
- Anita Bakrania
  - Toronto General Hospital Research Institute, Toronto, ON, Canada; Ajmera Transplant Program, University Health Network, Toronto, ON, Canada; Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Xun Zhao
  - Toronto General Hospital Research Institute, Toronto, ON, Canada; Ajmera Transplant Program, University Health Network, Toronto, ON, Canada
- Gang Zheng
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Institute of Biomedical Engineering, University of Toronto, Toronto, ON, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
- Mamatha Bhat
  - Toronto General Hospital Research Institute, Toronto, ON, Canada; Ajmera Transplant Program, University Health Network, Toronto, ON, Canada; Division of Gastroenterology, Department of Medicine, University Health Network and University of Toronto, Toronto, ON, Canada; Department of Medical Sciences, Toronto, ON, Canada
8
Robotic data acquisition with deep learning enables cell image-based prediction of transcriptomic phenotypes. Proc Natl Acad Sci U S A 2023; 120:e2210283120. [PMID: 36577074 PMCID: PMC9910600 DOI: 10.1073/pnas.2210283120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2022] Open
Abstract
Single-cell whole-transcriptome analysis is the gold standard approach to identifying molecularly defined cell phenotypes. However, this approach cannot be used for dynamics measurements such as live-cell imaging. Here, we developed a multifunctional robot, the automated live imaging and cell picking system (ALPS) and used it to perform single-cell RNA sequencing for microscopically observed cells with multiple imaging modes. Using robotically obtained data that linked cell images and the whole transcriptome, we successfully predicted transcriptome-defined cell phenotypes in a noninvasive manner using cell image-based deep learning. This noninvasive approach opens a window to determine the live-cell whole transcriptome in real time. Moreover, this work, which is based on a data-driven approach, is a proof of concept for determining the transcriptome-defined phenotypes (i.e., not relying on specific genes) of any cell from cell images using a model trained on linked datasets.
9
Walther BA, Bergmann M. Plastic pollution of four understudied marine ecosystems: a review of mangroves, seagrass meadows, the Arctic Ocean and the deep seafloor. Emerg Top Life Sci 2022; 6:371-387. [PMID: 36214383 DOI: 10.1042/etls20220017] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/09/2022] [Revised: 09/14/2022] [Accepted: 09/15/2022] [Indexed: 02/06/2023]
Abstract
Plastic pollution is now a worldwide phenomenon affecting all marine ecosystems, but some ecosystems and regions remain understudied. Here, we review the presence and impacts of macroplastics and microplastics for four such ecosystems: mangroves, seagrass meadows, the Arctic Ocean and the deep seafloor. Plastic production has grown steadily, and thus the impact on species and ecosystems has increased, too. The accumulated evidence also indicates that plastic pollution is an additional and increasing stressor to these already stressed ecosystems and many of the species living in them. However, laboratory or field studies that provide strong correlational or experimental evidence of ecological harm due to plastic pollution remain scarce or absent for these ecosystems. Based on these findings, we give some research recommendations for the future.
Affiliation(s)
- Bruno Andreas Walther
  - Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany
- Melanie Bergmann
  - Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany
10
Tsimenidis S, Vrochidou E, Papakostas GA. Omics Data and Data Representations for Deep Learning-Based Predictive Modeling. Int J Mol Sci 2022; 23:12272. [PMID: 36293133 PMCID: PMC9603455 DOI: 10.3390/ijms232012272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 10/03/2022] [Accepted: 10/12/2022] [Indexed: 11/25/2022] Open
Abstract
Medical discoveries mainly depend on the capability to process and analyze biological datasets, which inundate the scientific community and are still expanding as the cost of next-generation sequencing technologies decreases. Deep learning (DL) is a viable method to exploit this massive data stream, since it has advanced quickly through successive innovations. However, an obstacle to scientific progress emerges: the difficulty of applying DL to biology. Both fields are evolving at a breakneck pace, making it hard for an individual to occupy the front lines of both. This paper aims to bridge the gap and help computer scientists bring their valuable expertise into the life sciences. This work provides an overview of the most common types of biological data and data representations that are used to train DL models, with additional information on the models themselves and the various tasks that are being tackled. This is the essential information a DL expert with no background in biology needs in order to participate in DL-based research projects in biomedicine, biotechnology, and drug discovery. Alternatively, this study could also be useful to researchers in biology who wish to understand and utilize the power of DL to gain better insights into, and extract important information from, omics data.
Affiliation(s)
- George A. Papakostas
  - MLV Research Group, Department of Computer Science, International Hellenic University, 65404 Kavala, Greece
11
Hajiabadi H, Mamontova I, Prizak R, Pancholi A, Koziolek A, Hilbert L. Deep-learning microscopy image reconstruction with quality control reveals second-scale rearrangements in RNA polymerase II clusters. PNAS NEXUS 2022; 1:pgac065. [PMID: 36741438 PMCID: PMC9896941 DOI: 10.1093/pnasnexus/pgac065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/20/2022] [Accepted: 05/17/2022] [Indexed: 02/07/2023]
Abstract
Fluorescence microscopy, a central tool of biological research, is subject to inherent trade-offs in experiment design. For instance, image acquisition speed can only be increased in exchange for a lowered signal quality, or for an increased rate of photo-damage to the specimen. Computational denoising can recover some loss of signal, extending the trade-off margin for high-speed imaging. Recently proposed denoising on the basis of neural networks shows exceptional performance but raises concerns of errors typical of neural networks. Here, we present a work-flow that supports an empirically optimized reduction of exposure times, as well as per-image quality control to exclude images with reconstruction errors. We implement this work-flow on the basis of the denoising tool Noise2Void and assess the molecular state and 3D shape of RNA polymerase II (Pol II) clusters in live zebrafish embryos. Image acquisition speed could be tripled, achieving 2-s time resolution and 350-nm lateral image resolution. The obtained data reveal stereotyped events of approximately 10 s duration: initially, the molecular mark for recruited Pol II increases, then the mark for active Pol II increases, and finally Pol II clusters take on a stretched and unfolded shape. An independent analysis based on fixed sample images reproduces this sequence of events, and suggests that they are related to the transient association of genes with Pol II clusters. Our work-flow consists of procedures that can be implemented on commercial fluorescence microscopes without any hardware or software modification, and should, therefore, be transferable to many other applications.
Affiliation(s)
- Roshan Prizak
- Institute of Biological and Chemical Systems, Department of Biological Information Processing, Karlsruhe Institute of Technology, 76344, Eggenstein-Leopoldshafen, Germany
- Agnieszka Pancholi
- Institute of Biological and Chemical Systems, Department of Biological Information Processing, Karlsruhe Institute of Technology, 76344, Eggenstein-Leopoldshafen, Germany
12
da Costa AH, Santos RACD, Cerri R. Investigating deep feedforward neural networks for classification of transposon-derived piRNAs. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-021-00531-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
PIWI-interacting RNAs (piRNAs) form an important class of non-coding RNAs that play a key role in gene expression regulation and genome integrity by silencing transposable elements. However, despite the importance of piRNAs and the wide application of deep learning in computational biology, there are few studies of deep learning for piRNA prediction. Moreover, current methods focus on using advanced architectures like CNNs and their variations. This paper presents an investigation of deep feedforward network models for classification of human transposon-derived piRNAs. We developed a lightweight predictor (when compared to other deep learning methods) and we show by practical evidence that simple neural networks can perform as well as or better than complex neural networks when using the appropriate hyperparameters. For that, we train, analyze and compare the results of a multilayer perceptron with different hyperparameter choices, such as numbers of hidden layers, activation functions and optimizers, clarifying the advantages and disadvantages of each choice. Our proposed predictor reached an F-score of 0.872, outperforming other state-of-the-art methods for human transposon-derived piRNA classification. In addition, to better assess the generalization of our proposal, we also showed it achieved competitive results when classifying piRNAs of other species.
13
Machine Learning and Deep Learning Applications in Multiple Myeloma Diagnosis, Prognosis, and Treatment Selection. Cancers (Basel) 2022; 14:cancers14030606. [PMID: 35158874 PMCID: PMC8833500 DOI: 10.3390/cancers14030606] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 12/28/2021] [Revised: 01/20/2022] [Accepted: 01/24/2022] [Indexed: 02/01/2023] Open
Abstract
Simple Summary: Multiple myeloma is a malignant neoplasm of plasma cells with complex pathogenesis. With major progress in multiple myeloma research, it is essential that we reconsider our methods for diagnosing and monitoring the disease. This requires the integration of serology, histology, radiology, and genetic data; therefore, multiple myeloma research has generated massive quantities of granular high-dimensional data exceeding human understanding. With improved computational techniques, artificial intelligence tools for data processing and analysis are becoming more and more relevant. Artificial intelligence represents a wide set of algorithms, among which machine learning and deep learning are presently the most impactful. This review focuses on artificial intelligence applications in multiple myeloma research, first illustrating machine learning and deep learning procedures and workflow, followed by how these algorithms are used for multiple myeloma diagnosis, prognosis, bone lesion identification, and evaluation of response to treatment.
Abstract: Artificial intelligence has recently modified the panorama of oncology investigation thanks to the use of machine learning algorithms and deep learning strategies. Machine learning is a branch of artificial intelligence that involves algorithms that analyse information, learn from that information, and then employ their discoveries to make informed choices, while deep learning is a field of machine learning represented by algorithms inspired by the organization and function of the brain, named artificial neural networks. In this review, we examine the possible applications of artificial intelligence in multiple myeloma evaluation, and we report the most significant experiments with machine and deep learning procedures in the relevant field. Multiple myeloma is one of the most common haematological malignancies in the world, and among them, it is one of the most difficult to cure due to the high occurrence of relapse and chemoresistance. Machine learning- and deep learning-based studies are expected to be among the future strategies to challenge this negative-prognosis tumour via the detection of new markers for its prompt discovery and therapy selection and by a better evaluation of its relapse and survival.
14
Wang C, Caragea D, Kodadinne Narayana N, Hein NT, Bheemanahalli R, Somayanda IM, Jagadish SVK. Deep learning based high-throughput phenotyping of chalkiness in rice exposed to high night temperature. PLANT METHODS 2022; 18:9. [PMID: 35065667 PMCID: PMC8783510 DOI: 10.1186/s13007-022-00839-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/24/2021] [Accepted: 01/06/2022] [Indexed: 05/02/2023]
Abstract
BACKGROUND Rice is a major staple food crop for more than half the world's population. As the global population is expected to reach 9.7 billion by 2050, increasing the production of high-quality rice is needed to meet the anticipated increased demand. However, global environmental changes, especially increasing temperatures, can affect grain yield and quality. Heat stress is one of the major causes of an increased proportion of chalkiness in rice, which compromises quality and reduces the market value. Researchers have identified 140 quantitative trait loci linked to chalkiness mapped across 12 chromosomes of the rice genome. However, the available genetic information acquired by employing advances in genetics has not been adequately exploited due to a lack of a reliable, rapid and high-throughput phenotyping tool to capture chalkiness. To derive extensive benefit from the genetic progress achieved, tools that facilitate high-throughput phenotyping of rice chalkiness are needed.
RESULTS We use a fully automated approach based on convolutional neural networks (CNNs) and Gradient-weighted Class Activation Mapping (Grad-CAM) to detect chalkiness in rice grain images. Specifically, we train a CNN model to distinguish between chalky and non-chalky grains and subsequently use Grad-CAM to identify the area of a grain that is indicative of the chalky class. The area identified by the Grad-CAM approach takes the form of a smooth heatmap that can be used to quantify the degree of chalkiness. Experimental results on both polished and unpolished rice grains using standard instance classification and segmentation metrics have shown that Grad-CAM can accurately identify chalky grains and detect the chalkiness area.
CONCLUSIONS We have successfully demonstrated the application of a Grad-CAM based tool to accurately capture high night temperature induced chalkiness in rice. The models trained will be made publicly available. They are easy-to-use, scalable and can be readily incorporated into ongoing rice breeding programs, without rice researchers requiring computer science or machine learning expertise.
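As a rough illustration of how a Grad-CAM-style heatmap could be turned into a single chalkiness score, the sketch below thresholds a heatmap inside a grain mask and reports the fraction of grain pixels above the threshold. The function name `chalkiness_fraction`, the 0.5 threshold, and the toy arrays are assumptions made for illustration; they are not taken from the paper and may differ from the authors' procedure.

```python
def chalkiness_fraction(heatmap, grain_mask, threshold=0.5):
    """Fraction of grain pixels whose heatmap value exceeds `threshold`.

    heatmap: 2D list of floats in [0, 1] (e.g. a normalised Grad-CAM map).
    grain_mask: 2D list of 0/1 flags marking pixels that belong to the grain.
    threshold: hypothetical cut-off, chosen here only for illustration.
    """
    grain_pixels = 0
    chalky_pixels = 0
    for heat_row, mask_row in zip(heatmap, grain_mask):
        for heat, inside in zip(heat_row, mask_row):
            if inside:
                grain_pixels += 1
                if heat > threshold:
                    chalky_pixels += 1
    if grain_pixels == 0:
        raise ValueError("grain mask is empty")
    return chalky_pixels / grain_pixels


# Toy 3x3 example: six grain pixels, two of them above the threshold.
toy_heatmap = [
    [0.9, 0.8, 0.1],
    [0.2, 0.3, 0.0],
    [0.1, 0.0, 0.0],
]
toy_mask = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
]
score = chalkiness_fraction(toy_heatmap, toy_mask)
```

A fixed threshold is the simplest choice; a real pipeline would more likely calibrate it against manually annotated chalky regions.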
Affiliation(s)
- Chaoxin Wang
- Department of Computer Science, Kansas State University, Manhattan, KS 66506 USA
- Doina Caragea
- Department of Computer Science, Kansas State University, Manhattan, KS 66506 USA
- Nisarga Kodadinne Narayana
- Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762 USA
- Nathan T. Hein
- Department of Agronomy, Kansas State University, Manhattan, KS 66506 USA
- Raju Bheemanahalli
- Department of Plant and Soil Sciences, Mississippi State University, Mississippi State, MS 39762 USA
- Impa M. Somayanda
- Department of Agronomy, Kansas State University, Manhattan, KS 66506 USA
15
Watson ER, Taherian Fard A, Mar JC. Computational Methods for Single-Cell Imaging and Omics Data Integration. Front Mol Biosci 2022; 8:768106. [PMID: 35111809 PMCID: PMC8801747 DOI: 10.3389/fmolb.2021.768106] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/14/2021] [Accepted: 11/29/2021] [Indexed: 12/12/2022] Open
Abstract
Integrating single cell omics and single cell imaging allows for a more effective characterisation of the underlying mechanisms that drive a phenotype at the tissue level, creating a comprehensive profile at the cellular level. Although the use of imaging data is well established in biomedical research, its primary application has been to observe phenotypes at the tissue or organ level, often using medical imaging techniques such as MRI, CT, and PET. These imaging technologies complement omics-based data in biomedical research because they are helpful for identifying associations between genotype and phenotype, along with functional changes occurring at the tissue level. Single cell imaging can act as an intermediary between these levels. Meanwhile new technologies continue to arrive that can be used to interrogate the genome of single cells and its related omics datasets. As these two areas, single cell imaging and single cell omics, each advance independently with the development of novel techniques, the opportunity to integrate these data types becomes more and more attractive. This review outlines some of the technologies and methods currently available for generating, processing, and analysing single-cell omics- and imaging data, and how they could be integrated to further our understanding of complex biological phenomena like ageing. We include an emphasis on machine learning algorithms because of their ability to identify complex patterns in large multidimensional data.
Affiliation(s)
- Atefeh Taherian Fard
- Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
- Jessica Cara Mar
- Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
16
He S, Leanse LG, Feng Y. Artificial intelligence and machine learning assisted drug delivery for effective treatment of infectious diseases. Adv Drug Deliv Rev 2021; 178:113922. [PMID: 34461198 DOI: 10.1016/j.addr.2021.113922] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 05/01/2021] [Revised: 07/14/2021] [Accepted: 08/09/2021] [Indexed: 12/23/2022]
Abstract
In the era of antimicrobial resistance, the prevalence of multidrug-resistant microorganisms that resist conventional antibiotic treatment has steadily increased. Thus, it is now unquestionable that infectious diseases are significant global burdens that urgently require innovative treatment strategies. Emerging studies have demonstrated that artificial intelligence (AI) can transform drug delivery to promote effective treatment of infectious diseases. In this review, we propose to evaluate the significance, essential principles, and popular tools of AI in drug delivery for infectious disease treatment. Specifically, we will focus on the achievements and key findings of current research, as well as the applications of AI on drug delivery throughout the whole antimicrobial treatment process, with an emphasis on drug development, treatment regimen optimization, drug delivery system and administration route design, and drug delivery outcome prediction. To that end, the challenges of AI in drug delivery for infectious disease treatments and their current solutions and future perspective will be presented and discussed.
Affiliation(s)
- Sheng He
- Boston Children's Hospital, Harvard Medical School, Harvard University, Boston, MA, USA
- Leon G Leanse
- Massachusetts General Hospital, Harvard Medical School, Harvard University, Boston, MA, USA
- Yanfang Feng
- Massachusetts General Hospital, Harvard Medical School, Harvard University, Boston, MA, USA
17
Tran KA, Kondrashova O, Bradley A, Williams ED, Pearson JV, Waddell N. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med 2021; 13:152. [PMID: 34579788 PMCID: PMC8477474 DOI: 10.1186/s13073-021-00968-x] [Citation(s) in RCA: 258] [Impact Index Per Article: 86.0] [Received: 12/13/2020] [Accepted: 09/12/2021] [Indexed: 12/13/2022] Open
Abstract
Deep learning is a subdiscipline of artificial intelligence that uses a machine learning technique called artificial neural networks to extract patterns and make predictions from large data sets. The increasing adoption of deep learning across healthcare domains together with the availability of highly characterised cancer datasets has accelerated research into the utility of deep learning in the analysis of the complex biology of cancer. While early results are promising, this is a rapidly evolving field with new knowledge emerging in both cancer biology and deep learning. In this review, we provide an overview of emerging deep learning techniques and how they are being applied to oncology. We focus on the deep learning applications for omics data types, including genomic, methylation and transcriptomic data, as well as histopathology-based genomic inference, and provide perspectives on how the different data types can be integrated to develop decision support tools. We provide specific examples of how deep learning may be applied in cancer diagnosis, prognosis and treatment management. We also assess the current limitations and challenges for the application of deep learning in precision oncology, including the lack of phenotypically rich data and the need for more explainable deep learning models. Finally, we conclude with a discussion of how current obstacles can be overcome to enable future clinical utilisation of deep learning.
Affiliation(s)
- Khoa A. Tran
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
- School of Biomedical Sciences, Faculty of Health, Queensland University of Technology (QUT), Brisbane, 4059 Australia
- Olga Kondrashova
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
- Andrew Bradley
- Faculty of Engineering, Queensland University of Technology (QUT), Brisbane, 4000 Australia
- Elizabeth D. Williams
- School of Biomedical Sciences, Faculty of Health, Queensland University of Technology (QUT), Brisbane, 4059 Australia
- Australian Prostate Cancer Research Centre - Queensland (APCRC-Q) and Queensland Bladder Cancer Initiative (QBCI), Brisbane, 4102 Australia
- John V. Pearson
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
- Nicola Waddell
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
18
Lebleux M, Denimal E, De Oliveira D, Marin A, Desroche N, Alexandre H, Weidmann S, Rousseaux S. Prediction of Genetic Groups within Brettanomyces bruxellensis through Cell Morphology Using a Deep Learning Tool. J Fungi (Basel) 2021; 7:jof7080581. [PMID: 34436120 PMCID: PMC8396822 DOI: 10.3390/jof7080581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2021] [Revised: 07/16/2021] [Accepted: 07/18/2021] [Indexed: 11/16/2022] Open
Abstract
Brettanomyces bruxellensis is described as a wine spoilage yeast with many mainly strain-dependent genetic characteristics, bestowing tolerance against environmental stresses and persistence during the winemaking process. Thus, it is essential to discriminate B. bruxellensis isolates at the strain level in order to predict their stress resistance capacities. Few predictive tools are available to reveal intraspecific diversity within B. bruxellensis species; also, they require expertise and can be expensive. In this study, a Random Amplified Polymorphic DNA (RAPD) adapted PCR method was used with three different primers to discriminate 74 different B. bruxellensis isolates. High correlation between the results of this method using the primer OPA-09 and those of a previous microsatellite analysis was obtained, allowing us to cluster the isolates among four genetic groups more quickly and cheaply than microsatellite analysis. To make analysis even faster, we further investigated the correlation suggested in a previous study between genetic groups and cell polymorphism using the analysis of optical microscopy images via deep learning. A Convolutional Neural Network (CNN) was trained to predict the genetic group of B. bruxellensis isolates with 96.6% accuracy. These methods make intraspecific discrimination among B. bruxellensis species faster, simpler and less costly. These results open up very promising new perspectives in oenology for the study of microbial ecosystems.
Affiliation(s)
- Manon Lebleux
- Laboratoire VAlMiS-IUVV, AgroSup Dijon, UMR PAM A 02.102, University Bourgogne Franche-Comté, F-21000 Dijon, France
- Emmanuel Denimal
- AgroSup Dijon, Direction Scientifique, Appui à la Recherche, 26 Boulevard Docteur Petitjean, F-21000 Dijon, France
- Déborah De Oliveira
- Laboratoire VAlMiS-IUVV, AgroSup Dijon, UMR PAM A 02.102, University Bourgogne Franche-Comté, F-21000 Dijon, France
- Ambroise Marin
- Plateau D’imagerie DimaCell, Esplanade Erasme, Agrosup Dijon, UMR PAM A 02.102, University Bourgogne Franche-Comté, F-21000 Dijon, France
- Hervé Alexandre
- Laboratoire VAlMiS-IUVV, AgroSup Dijon, UMR PAM A 02.102, University Bourgogne Franche-Comté, F-21000 Dijon, France
- Stéphanie Weidmann
- Laboratoire VAlMiS-IUVV, AgroSup Dijon, UMR PAM A 02.102, University Bourgogne Franche-Comté, F-21000 Dijon, France
- Sandrine Rousseaux
- Laboratoire VAlMiS-IUVV, AgroSup Dijon, UMR PAM A 02.102, University Bourgogne Franche-Comté, F-21000 Dijon, France
19
Routhier E, Bin Kamruddin A, Mozziconacci J. keras_dna: a wrapper for fast implementation of deep learning models in genomics. Bioinformatics 2021; 37:1593-1594. [PMID: 33135730 DOI: 10.1093/bioinformatics/btaa929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2020] [Revised: 10/14/2020] [Accepted: 10/19/2020] [Indexed: 11/12/2022] Open
Abstract
SUMMARY Prediction of genomic annotations from DNA sequences using deep learning is today becoming a flourishing field with many applications. Nevertheless, there are still difficulties in handling data in order to conveniently build and train models dedicated to specific end-user tasks. keras_dna is designed for an easy implementation of Keras models (TensorFlow high level API) for genomics. It can handle standard bioinformatic file formats as inputs, such as bigwig, gff, bed, wig, bedGraph or fasta, and returns standardized inputs for model training. keras_dna is designed to implement existing models but also to facilitate the development of new models that can have single or multiple targets or inputs.
AVAILABILITY AND IMPLEMENTATION Freely available with an MIT License using pip install keras_dna or cloning the github repo at https://github.com/etirouthier/keras_dna.git.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Etienne Routhier
- Sorbonne Universite, CNRS, Laboratoire de Physique Théorique de la Matière Condensée (LPTMC), Paris F-75252, France
- Ayman Bin Kamruddin
- Sorbonne Universite, CNRS, Laboratoire de Physique Théorique de la Matière Condensée (LPTMC), Paris F-75252, France
- Muséum National d'Histoire Naturelle, Structure et Instabilité des Génomes, UMR7196, Paris 75231, France
- Julien Mozziconacci
- Sorbonne Universite, CNRS, Laboratoire de Physique Théorique de la Matière Condensée (LPTMC), Paris F-75252, France
- Muséum National d'Histoire Naturelle, Structure et Instabilité des Génomes, UMR7196, Paris 75231, France
20
Ali MAS, Misko O, Salumaa SO, Papkov M, Palo K, Fishman D, Parts L. Evaluating Very Deep Convolutional Neural Networks for Nucleus Segmentation from Brightfield Cell Microscopy Images. SLAS DISCOVERY 2021; 26:1125-1137. [PMID: 34167359 PMCID: PMC8458686 DOI: 10.1177/24725552211023214] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/15/2022]
Abstract
Advances in microscopy have increased output data volumes, and powerful image analysis methods are required to match. In particular, finding and characterizing nuclei from microscopy images, a core cytometry task, remains difficult to automate. While deep learning models have given encouraging results on this problem, the most powerful approaches have not yet been tested for attacking it. Here, we review and evaluate state-of-the-art very deep convolutional neural network architectures and training strategies for segmenting nuclei from brightfield cell images. We tested U-Net as a baseline model; considered U-Net++, Tiramisu, and DeepLabv3+ as latest instances of advanced families of segmentation models; and propose PPU-Net, a novel light-weight alternative. The deeper architectures outperformed standard U-Net and results from previous studies on the challenging brightfield images, with balanced pixel-wise accuracies of up to 86%. PPU-Net achieved this performance with 20-fold fewer parameters than the comparably accurate methods. All models perform better on larger nuclei and in sparser images. We further confirmed that in the absence of plentiful training data, augmentation and pretraining on other data improve performance. In particular, using only 16 images with data augmentation is enough to achieve a pixel-wise F1 score that is within 5% of the one achieved with a full data set for all models. The remaining segmentation errors are mainly due to missed nuclei in dense regions, overlapping cells, and imaging artifacts, indicating the major outstanding challenges.
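The two pixel-wise metrics quoted above can be made concrete with a small sketch. It uses the common definitions (balanced accuracy as the mean of sensitivity and specificity, F1 as the harmonic mean of precision and recall); whether the paper computes them in exactly this way is an assumption, and the function names and toy labels are illustrative only.

```python
def confusion_counts(truth, pred):
    """Count true/false positives and negatives for binary pixel labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(truth, pred):
        if t and p:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (nucleus pixels) and specificity (background pixels)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in count form."""
    return 2 * tp / (2 * tp + fp + fn)


# Toy flattened mask: 1 = nucleus pixel, 0 = background pixel.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
tp, fp, tn, fn = confusion_counts(truth, pred)
bal_acc = balanced_accuracy(tp, fp, tn, fn)
f1 = f1_score(tp, fp, fn)
```

Balanced accuracy is useful here because background pixels usually far outnumber nucleus pixels, so plain accuracy would reward a model that predicts background everywhere.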
Affiliation(s)
- Mohammed A S Ali
- Department of Computer Science, University of Tartu, Tartu, Estonia
- Oleg Misko
- Ukrainian Catholic University, Lviv, L'vìvs'ka, Ukraine
- Mikhail Papkov
- Department of Computer Science, University of Tartu, Tartu, Estonia
- Kaupo Palo
- PerkinElmer Cellular Technologies Germany GmbH, Hamburg, Germany
- Dmytro Fishman
- Department of Computer Science, University of Tartu, Tartu, Estonia
- Leopold Parts
- Department of Computer Science, University of Tartu, Tartu, Estonia
- Wellcome Sanger Institute, Hinxton, Cambridgeshire, UK
21
Sajjad H, Imtiaz S, Noor T, Siddiqui YH, Sajjad A, Zia M. Cancer models in preclinical research: A chronicle review of advancement in effective cancer research. Animal Model Exp Med 2021; 4:87-103. [PMID: 34179717 PMCID: PMC8212826 DOI: 10.1002/ame2.12165] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 11/13/2020] [Accepted: 03/04/2021] [Indexed: 12/15/2022] Open
Abstract
Cancer is a major stress for public well-being and is one of the most dreaded diseases. The models used in the discovery of cancer treatment are continuously changing and extending toward advanced preclinical studies. Cancer models are either naturally existing or artificially prepared experimental systems that share features with human tumors, although the heterogeneous nature of tumors is well known. The choice of the most fitting model to best reflect a given tumor system is one of the real difficulties of cancer research. Therefore, extensive studies have been conducted on cancer models to develop a better understanding of cancer invasion, progression, and early detection. These models give an insight into cancer etiology, molecular basis, host-tumor interaction, the role of the microenvironment, and tumor heterogeneity in tumor metastasis. These models are also used to predict novel cancer markers and targeted therapies, and are extremely helpful in drug development. In this review, the potential of cancer models to be used as a platform for drug screening and therapeutic discoveries is highlighted. None of the cancer models is regarded as ideal, because each is associated with essential caveats that restrain its application. However, by bridging the gap between preliminary cancer research and translational medicine, they promise a brighter future for cancer treatment.
Affiliation(s)
- Humna Sajjad
- Department of Biotechnology, Quaid-i-Azam University, Islamabad, Pakistan
- Saiqa Imtiaz
- Department of Biotechnology, Quaid-i-Azam University, Islamabad, Pakistan
- Tayyaba Noor
- Department of Biotechnology, Quaid-i-Azam University, Islamabad, Pakistan
- Anila Sajjad
- Department of Biotechnology, Quaid-i-Azam University, Islamabad, Pakistan
- Muhammad Zia
- Department of Biotechnology, Quaid-i-Azam University, Islamabad, Pakistan
22
Application of Deep Neural Networks as a Prescreening Tool to Assign Individualized Absorption Models in Pharmacokinetic Analysis. Pharmaceutics 2021; 13:pharmaceutics13060797. [PMID: 34073609 PMCID: PMC8227048 DOI: 10.3390/pharmaceutics13060797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2021] [Revised: 05/03/2021] [Accepted: 05/19/2021] [Indexed: 11/17/2022] Open
Abstract
A specific model for drug absorption is necessarily assumed in pharmacokinetic (PK) analyses following extravascular dosing. Unfortunately, an inappropriate absorption model may force other model parameters to be poorly estimated. An added complexity arises in population PK analyses when different individuals appear to have different absorption patterns. The aim of this study is to demonstrate that a deep neural network (DNN) can be used to prescreen data and assign an individualized absorption model consistent with either a first-order, Erlang, or split-peak process. Ten thousand profiles were simulated for each of the three aforementioned shapes and used for training the DNN algorithm with a 30% hold-out validation set. During the training phase, a 99.7% accuracy was attained, with 99.4% accuracy during the validation process. In testing the algorithm's classification performance with external patient data, a 93.7% accuracy was reached. This algorithm was developed to prescreen individual data and assign a particular absorption model prior to a population PK analysis. We envision it being used as an efficient prescreening tool in other situations that involve a model component that appears to be variable across subjects. It has the potential to reduce the time needed to perform a manual visual assignment and eliminate inter-assessor variability and bias in assigning a sub-model.
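To make the simulation-and-hold-out setup more tangible, here is a minimal sketch that generates concentration-time profiles for the first-order case only, using the standard one-compartment closed form, and then sets aside 30% of them for validation. The parameter ranges, sampling times, dose, volume, and helper names are all assumptions chosen for illustration; the Erlang and split-peak shapes and the network itself are not shown.

```python
import math
import random


def first_order_profile(times, dose, ka, ke, volume, bioavailability=1.0):
    """One-compartment model with first-order absorption (ka) and elimination (ke)."""
    if math.isclose(ka, ke):
        raise ValueError("ka and ke must differ for this closed form")
    scale = bioavailability * dose * ka / (volume * (ka - ke))
    return [scale * (math.exp(-ke * t) - math.exp(-ka * t)) for t in times]


def holdout_split(items, holdout_fraction=0.3, seed=0):
    """Shuffle a copy of `items` and return (training, hold-out) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Hypothetical sampling times (hours) and parameter ranges, for illustration only.
rng = random.Random(42)
times = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 12.0, 24.0]
profiles = []
for _ in range(100):
    ka = rng.uniform(0.5, 2.0)   # absorption rate constant (1/h)
    ke = rng.uniform(0.05, 0.3)  # elimination rate constant (1/h)
    profiles.append(first_order_profile(times, dose=100.0, ka=ka, ke=ke, volume=20.0))

train, validation = holdout_split(profiles)
```

In a three-class setting, each profile would also carry a label for its absorption shape, and the split would be applied to the (profile, label) pairs.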
23
Fang Z, Zhou H. VirionFinder: Identification of Complete and Partial Prokaryote Virus Virion Protein From Virome Data Using the Sequence and Biochemical Properties of Amino Acids. Front Microbiol 2021; 12:615711. [PMID: 33613485 PMCID: PMC7894196 DOI: 10.3389/fmicb.2021.615711] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/09/2020] [Accepted: 01/04/2021] [Indexed: 01/22/2023] Open
Abstract
Viruses are some of the most abundant biological entities on Earth, and prokaryote viruses are the dominant members of the viral community. Because of the diversity of prokaryote viruses, functional annotation cannot be performed on a large number of genes from newly discovered prokaryote viruses by searching the current database; therefore, the development of an alignment-free algorithm for functional annotation of prokaryote virus proteins is important to understand the viral community. The identification of prokaryote virus virion proteins (PVVPs) is a critical step for many viral analyses, such as species classification, phylogenetic analysis and the exploration of how prokaryote viruses interact with their hosts. Although a series of PVVP prediction tools have been developed, the performance of these tools is still not satisfactory. Moreover, viral metagenomic data contain fragmented sequences, leading to the existence of some incomplete genes. Therefore, a tool that can identify partial prokaryote virus virion proteins is also needed. In this work, we present a novel algorithm, called VirionFinder, to distinguish complete and partial PVVPs from non-prokaryote virus virion proteins (non-PVVPs). VirionFinder uses the sequence and biochemical properties of 20 amino acids as the mathematical model to encode the protein sequences and uses a deep learning technique to identify whether a given protein is a PVVP. Compared with state-of-the-art tools on artificial benchmark datasets, the results show that under the same specificity (Sp), the sensitivity (Sn) of VirionFinder is approximately 10-34% higher than the Sn of these tools on both complete and partial proteins. When evaluating related tools using real virome data, the recognition rate of PVVP-like sequences of VirionFinder is also much higher than that of the other tools. We expect that VirionFinder will be a powerful tool for identifying novel virion proteins from both complete prokaryote virus genomes and viral metagenomic data. VirionFinder is freely available at https://github.com/zhenchengfang/VirionFinder.
Affiliation(s)
- Zhencheng Fang
- Microbiome Medicine Center, Department of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Center for Quantitative Biology, Peking University, Beijing, China
- Hongwei Zhou
- Microbiome Medicine Center, Department of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- State Key Laboratory of Organ Failure Research, Southern Medical University, Guangzhou, China
24
Routhier E, Pierre E, Khodabandelou G, Mozziconacci J. Genome-wide prediction of DNA mutation effect on nucleosome positions for yeast synthetic genomics. Genome Res 2021; 31:317-326. [PMID: 33355297 PMCID: PMC7849406 DOI: 10.1101/gr.264416.120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/15/2020] [Accepted: 12/11/2020] [Indexed: 12/15/2022]
Abstract
Genetically modified genomes are often used today in many areas of fundamental and applied research. In many studies, coding or noncoding regions are modified in order to change protein sequences or gene expression levels. Modifying one or several nucleotides in a genome can also lead to unexpected changes in the epigenetic regulation of genes. When designing a synthetic genome with many mutations, it would thus be very informative to be able to predict the effect of these mutations on chromatin. We develop here a deep learning approach that quantifies the effect of every possible single mutation on nucleosome positions on the full Saccharomyces cerevisiae genome. This type of annotation track can be used when designing a modified S. cerevisiae genome. We further highlight how this track can provide new insights on the sequence-dependent mechanisms that drive nucleosomes' positions in vivo.
Affiliation(s)
- Etienne Routhier
  - Sorbonne Université, CNRS, Laboratoire de Physique Théorique de la Matière Condensée, LPTMC, Paris F-75252, France
- Edgard Pierre
  - Sorbonne Université, CNRS, Laboratoire de Physique Théorique de la Matière Condensée, LPTMC, Paris F-75252, France
- Julien Mozziconacci
  - Sorbonne Université, CNRS, Laboratoire de Physique Théorique de la Matière Condensée, LPTMC, Paris F-75252, France
  - Muséum National d'Histoire Naturelle, Structure et Instabilité des Génomes, UMR7196, Paris 75231, France
  - Institut Universitaire de France, Paris 75005, France
25
Zhang Y, Tian Y, Wu P, Chen D. Application of Skeleton Data and Long Short-Term Memory in Action Recognition of Children with Autism Spectrum Disorder. Sensors (Basel, Switzerland) 2021; 21:E411. [PMID: 33430118 PMCID: PMC7827022 DOI: 10.3390/s21020411] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/07/2020] [Revised: 12/25/2020] [Accepted: 01/06/2021] [Indexed: 11/16/2022]
Abstract
The recognition of stereotyped actions is one of the core diagnostic criteria of Autism Spectrum Disorder (ASD). However, it mainly relies on parent interviews and clinical observations, which leads to a long diagnostic cycle and prevents children with ASD from receiving timely treatment. To speed up the recognition of stereotyped actions, a method based on skeleton data and Long Short-Term Memory (LSTM) is proposed in this paper. In the first stage of our method, the OpenPose algorithm is used to obtain the initial skeleton data from videos of children with ASD, and four denoising methods are proposed to eliminate the noise in the initial skeleton data. In the second stage, we track multiple children in the same scene by matching the distance between current skeletons and previous skeletons. In the last stage, a neural network based on LSTM is proposed to classify the children's actions. The experiments show that our proposed method is effective for action recognition in children with ASD. Compared with previous traditional schemes, our scheme has higher accuracy and is almost non-invasive for the children.
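The second stage described above, tracking several children by matching current skeletons to previous ones by distance, can be sketched as a greedy nearest-centroid assignment. This is only an illustration under stated assumptions: the names `centroid` and `match_skeletons`, the use of skeleton centroids, and the `max_dist` threshold are hypothetical and are not taken from the paper.

```python
import math


def centroid(skeleton):
    """Mean (x, y) position of a skeleton given as a list of (x, y) keypoints."""
    xs = [x for x, _ in skeleton]
    ys = [y for _, y in skeleton]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def match_skeletons(previous, current, max_dist=50.0):
    """Greedily assign track ids to the skeletons of the current frame.

    previous: dict mapping track id -> skeleton from the previous frame.
    current:  list of skeletons detected in the current frame.
    Returns a dict mapping index in `current` -> track id. Skeletons with
    no previous skeleton within max_dist (assumed threshold) get a new id.
    """
    candidates = []
    for track_id, prev in previous.items():
        for j, cur in enumerate(current):
            d = math.dist(centroid(prev), centroid(cur))
            if d <= max_dist:
                candidates.append((d, track_id, j))
    candidates.sort()  # closest pairs are assigned first

    assigned, used_tracks = {}, set()
    for _, track_id, j in candidates:
        if track_id in used_tracks or j in assigned:
            continue
        assigned[j] = track_id
        used_tracks.add(track_id)

    next_id = max(previous, default=-1) + 1
    for j in range(len(current)):
        if j not in assigned:
            assigned[j] = next_id
            next_id += 1
    return assigned


previous = {0: [(0, 0), (2, 2)], 1: [(100, 100), (102, 102)]}
current = [[(101, 101), (103, 103)], [(1, 1), (3, 3)], [(500, 500)]]
tracks = match_skeletons(previous, current)
```

Greedy matching is simple and adequate when children are well separated; an optimal assignment method could be substituted when skeletons are close together.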
Affiliation(s)
- Yunkai Zhang
  - School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China
- Yinghong Tian
  - School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China
- Pingyi Wu
  - Experimental Teaching Center for Teacher Education, East China Normal University, Shanghai 200241, China
- Dongfan Chen
  - Department of Rehabilitation Sciences, East China Normal University, Shanghai 200062, China
26
Akay M, Du Y, Sershen CL, Wu M, Chen TY, Assassi S, Mohan C, Akay YM. Deep Learning Classification of Systemic Sclerosis Skin Using the MobileNetV2 Model. IEEE Open Journal of Engineering in Medicine and Biology 2021; 2:104-110. [PMID: 35402975 PMCID: PMC8901014 DOI: 10.1109/ojemb.2021.3066097] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/01/2021] [Revised: 03/03/2021] [Accepted: 03/08/2021] [Indexed: 11/21/2022] Open
Abstract
Goal: Systemic sclerosis (SSc) is a rare autoimmune, systemic disease with prominent fibrosis of skin and internal organs. Early diagnosis of the disease is crucial for designing effective therapy and management plans. Machine learning algorithms, especially deep learning, have been found to be greatly useful in biology, medicine, healthcare, and biomedical applications, in the areas of medical image processing and speech recognition. However, the need for a large training data set and the requirement for a graphics processing unit (GPU) have hindered the wide application of machine learning algorithms as a diagnostic tool in resource-constrained environments (e.g., clinics). Methods: In this paper, we propose a novel mobile deep learning network for the characterization of SSc skin. The proposed network architecture consists of the UNet, a dense connectivity convolutional neural network (CNN) with added classifier layers that, when combined with limited training data, yields better image segmentation and more accurate classification, and a mobile training module. In addition, to improve the computational efficiency and diagnostic accuracy, the highly efficient training model called “MobileNetV2,” which is designed for mobile and embedded applications, was used to train the network. Results: The proposed network was implemented using a standard laptop (2.5 GHz Intel Core i7). After fine tuning, our results showed the proposed network reached 100% accuracy on the training image set, 96.8% accuracy on the validation image set, and 95.2% on the testing image set. The training time was less than 5 hours. We also analyzed the same normal vs SSc skin image sets with the CNN on the same laptop. The CNN reached 100% accuracy on the training image set, 87.7% accuracy on the validation image set, and 82.9% on the testing image set. Additionally, it took more than 14 hours to train the CNN architecture. We also utilized the MobileNetV2 model to analyze an additional dataset of images and classified them as normal, early (mid and moderate) SSc or late (severe) SSc skin images. The network reached 100% accuracy on the training image set, 97.2% on the validation set, and 94.8% on the testing image set. Using the same normal, early and late phase SSc skin images, the CNN reached 100% accuracy on the training image set, 87.7% accuracy on the validation image set, and 82.9% on the testing image set. These results indicated that the MobileNetV2 architecture is more accurate and efficient than the CNN at classifying normal, early and late phase SSc skin images. Conclusions: Our preliminary study, intended to show the efficacy of the proposed network architecture, holds promise in the characterization of SSc. We believe that the proposed network architecture could easily be implemented in a clinical setting, providing a simple, inexpensive, and accurate screening tool for SSc.
Affiliation(s)
- Metin Akay
  - Biomedical Engineering Department, University of Houston, Houston, TX 77204, USA
- Yong Du
  - Biomedical Engineering Department, University of Houston, Houston, TX 77204, USA
- Cheryl L Sershen
  - Biomedical Engineering Department, University of Houston, Houston, TX 77204, USA
- Minghua Wu
  - Division of Rheumatology and Clinical Immunogenetics, Department of Internal Medicine, UTHealth, Houston, TX 77030, USA
- Ting Y Chen
  - Biomedical Engineering Department, University of Houston, Houston, TX 77204, USA
- Shervin Assassi
  - Division of Rheumatology and Clinical Immunogenetics, Department of Internal Medicine, UTHealth, Houston, TX 77030, USA
- Chandra Mohan
  - Biomedical Engineering Department, University of Houston, Houston, TX 77204, USA
- Yasemin M Akay
  - Biomedical Engineering Department, University of Houston, Houston, TX 77204, USA
27
Lai CQ, Ibrahim H, Abd Hamid AI, Abdullah JM. Classification of Non-Severe Traumatic Brain Injury from Resting-State EEG Signal Using LSTM Network with ECOC-SVM. Sensors (Basel, Switzerland) 2020; 20:E5234. [PMID: 32937801 PMCID: PMC7570640 DOI: 10.3390/s20185234] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/08/2020] [Revised: 09/09/2020] [Accepted: 09/11/2020] [Indexed: 12/21/2022]
Abstract
Traumatic brain injury (TBI) is a common injury that occurs when the human head receives an impact due to an accident or fall, and it is one of the most frequently submitted insurance claims. However, such claims are often misused when individuals attempt insurance fraud by reporting false medical conditions. Therefore, there is a need for an instant brain condition classification system. This study presents a novel classification architecture that can classify non-severe TBI patients and healthy subjects using the resting-state electroencephalogram (EEG) as the input, solving the immobility issue of computed tomography (CT) and magnetic resonance imaging (MRI) scans. The proposed architecture makes use of long short-term memory (LSTM) and an error-correcting output coding support vector machine (ECOC-SVM) to perform multiclass classification. The pre-processed EEG time series are supplied to the network time step by time step, and important information from the previous time step is remembered by the LSTM cell. Activations from the LSTM cell are used to train an ECOC-SVM. The approach amplifies the temporal advantages of the EEG and achieved a classification accuracy of 100%. The proposed method was compared with existing works in the literature and was shown to be superior in terms of classification accuracy, sensitivity, specificity, and precision.
Affiliation(s)
- Chi Qin Lai
  - School of Electrical and Electronic Engineering, Engineering Campus, Universiti Sains Malaysia, Nibong Tebal 14300, Penang, Malaysia
- Haidi Ibrahim
  - School of Electrical and Electronic Engineering, Engineering Campus, Universiti Sains Malaysia, Nibong Tebal 14300, Penang, Malaysia
- Aini Ismafairus Abd Hamid
  - Brain and Behaviour Cluster, Department of Neurosciences, School of Medical Sciences, Universiti Sains Malaysia, Health Campus, Jalan Raja Perempuan Zainab 2, Kubang Kerian 16150, Kota Bharu, Kelantan, Malaysia
- Jafri Malin Abdullah
  - Brain and Behaviour Cluster, Department of Neurosciences, School of Medical Sciences, Universiti Sains Malaysia, Health Campus, Jalan Raja Perempuan Zainab 2, Kubang Kerian 16150, Kota Bharu, Kelantan, Malaysia
28
Wilentzik Müller R, Gat-Viks I. Exploring Neural Networks and Related Visualization Techniques in Gene Expression Data. Front Genet 2020; 11:402. [PMID: 32499810 PMCID: PMC7243731 DOI: 10.3389/fgene.2020.00402] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/13/2019] [Accepted: 03/30/2020] [Indexed: 12/04/2022] Open
Abstract
Over the past decade, neural networks have become one of the cutting-edge methods in various research fields, excelling particularly in complex classification problems. In this paper, we make two main contributions: first, we conduct a methodological study of neural network modeling for classifying biological traits based on structured gene expression data. Then, we suggest an innovative approach for utilizing deep learning visualization techniques in order to reveal the specific genes important for the correct classification of each trait within the trained models. Our data suggest that this approach has great potential for becoming a standard feature importance tool used in complex medical research problems, and that it can further be generalized to various structured data classification problems outside the biological domain.
Affiliation(s)
- Roni Wilentzik Müller
  - School of Molecular Cell Biology & Biotechnology, Tel Aviv University, Tel Aviv, Israel
- Irit Gat-Viks
  - School of Molecular Cell Biology & Biotechnology, Tel Aviv University, Tel Aviv, Israel
29
Al-Ajlan A, El Allali A. CNN-MGP: Convolutional Neural Networks for Metagenomics Gene Prediction. Interdiscip Sci 2019; 11:628-635. [PMID: 30588558 PMCID: PMC6841655 DOI: 10.1007/s12539-018-0313-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/23/2018] [Revised: 11/22/2018] [Accepted: 12/07/2018] [Indexed: 12/30/2022]
Abstract
Accurate gene prediction in metagenomics fragments is a computationally challenging task due to the short read length and the incomplete, fragmented nature of the data. Most gene-prediction programs are based on extracting a large number of features and then applying statistical approaches or supervised classification approaches to predict genes. In our study, we introduce a convolutional neural network for metagenomics gene prediction (CNN-MGP) program that predicts genes in metagenomics fragments directly from raw DNA sequences, without the need for manual feature extraction and feature selection stages. CNN-MGP is able to learn the characteristics of coding and non-coding regions and distinguish coding and non-coding open reading frames (ORFs). We train 10 CNN models on 10 mutually exclusive datasets based on pre-defined GC content ranges. We extract ORFs from each fragment; the ORFs are then encoded numerically and fed into the appropriate CNN model based on the GC content of the fragment. The output from the CNN is the probability that an ORF encodes a gene. Finally, a greedy algorithm is used to select the final gene list. Overall, CNN-MGP is effective and achieves 91% accuracy on the testing dataset. CNN-MGP shows the ability of deep learning to predict genes in metagenomics fragments, and it achieves an accuracy higher than or comparable to state-of-the-art gene-prediction programs that use pre-defined features.
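The pre-processing steps in the CNN-MGP abstract (extract ORFs from a fragment, then route them to a model chosen by the fragment's GC content) can be sketched as follows. The equal-width GC ranges, the forward-strand-only scan for complete ATG-to-stop ORFs, and the names `gc_content`, `gc_bin` and `forward_orfs` are assumptions made for illustration; the paper's actual GC ranges and its handling of incomplete ORFs and the reverse strand are not reproduced.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_bin(gc, n_bins=10):
    """Index of the GC range containing gc, assuming equal-width ranges."""
    return min(int(gc * n_bins), n_bins - 1)


def forward_orfs(seq, min_len=6):
    """Return (start, end) pairs of ATG...stop ORFs in the three forward frames."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3))
                start = None
    return orfs


fragment = "CCATGAAATAGGC"
orfs = forward_orfs(fragment)                  # candidate ORFs to encode
model_index = gc_bin(gc_content(fragment))     # which of the 10 models to use
```

Each ORF in `orfs` would then be encoded numerically and scored by the model selected by `model_index`.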
Affiliation(s)
- Amani Al-Ajlan
  - Computer Science Department, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Achraf El Allali
  - Computer Science Department, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
30
Mansouri K, Cariello NF, Korotcov A, Tkachenko V, Grulke CM, Sprankle CS, Allen D, Casey WM, Kleinstreuer NC, Williams AJ. Open-source QSAR models for pKa prediction using multiple machine learning approaches. J Cheminform 2019; 11:60. [PMID: 33430972 PMCID: PMC6749653 DOI: 10.1186/s13321-019-0384-1] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 05/23/2019] [Accepted: 09/03/2019] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND The logarithmic acid dissociation constant pKa reflects the ionization of a chemical, which affects lipophilicity, solubility, protein binding, and ability to pass through the plasma membrane. Thus, pKa affects chemical absorption, distribution, metabolism, excretion, and toxicity properties. Multiple proprietary software packages exist for the prediction of pKa, but to the best of our knowledge no free and open-source programs exist for this purpose. Using a freely available data set and three machine learning approaches, we developed open-source models for pKa prediction. METHODS The experimental strongest acidic and strongest basic pKa values in water for 7912 chemicals were obtained from DataWarrior, a freely available software package. Chemical structures were curated and standardized for quantitative structure-activity relationship (QSAR) modeling using KNIME, and a subset comprising 79% of the initial set was used for modeling. To evaluate different approaches to modeling, several datasets were constructed based on different processing of chemical structures with acidic and/or basic pKas. Continuous molecular descriptors, binary fingerprints, and fragment counts were generated using PaDEL, and pKa prediction models were created using three machine learning methods, (1) support vector machines (SVM) combined with k-nearest neighbors (kNN), (2) extreme gradient boosting (XGB) and (3) deep neural networks (DNN). RESULTS The three methods delivered comparable performances on the training and test sets with a root-mean-squared error (RMSE) around 1.5 and a coefficient of determination (R2) around 0.80. Two commercial pKa predictors from ACD/Labs and ChemAxon were used to benchmark the three best models developed in this work, and performance of our models compared favorably to the commercial products. 
CONCLUSIONS This work provides multiple QSAR models to predict the strongest acidic and strongest basic pKas of chemicals, built using publicly available data, and provided as free and open-source software on GitHub.
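The two performance figures quoted in the abstract above, root-mean-squared error (RMSE) and coefficient of determination (R2), follow standard definitions that can be written out directly. The sketch below uses made-up pKa values purely to show the arithmetic; they are not data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (hypothetical experimental vs predicted pKa).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 8.0]
error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
```

An RMSE of about 1.5, as reported in the abstract, therefore means a typical prediction error of roughly 1.5 pKa units.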
Affiliation(s)
- Kamel Mansouri
  - Integrated Laboratory Systems, Inc., P.O. Box 13501, Research Triangle Park, NC 27709 USA
- Neal F. Cariello
  - Integrated Laboratory Systems, Inc., P.O. Box 13501, Research Triangle Park, NC 27709 USA
- Alexandru Korotcov
  - Science Data Software LLC, 14914 Bradwill Court, Rockville, MD 20850 USA
- Valery Tkachenko
  - Science Data Software LLC, 14914 Bradwill Court, Rockville, MD 20850 USA
- Chris M. Grulke
  - National Center for Computational Toxicology, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Mail Code D143-02, Research Triangle Park, NC 27709 USA
- Catherine S. Sprankle
  - Integrated Laboratory Systems, Inc., P.O. Box 13501, Research Triangle Park, NC 27709 USA
- David Allen
  - Integrated Laboratory Systems, Inc., P.O. Box 13501, Research Triangle Park, NC 27709 USA
- Warren M. Casey
  - National Institute of Environmental Health Sciences, P.O. Box 12233, Mail Stop K2-16, Research Triangle Park, NC 27709 USA
- Nicole C. Kleinstreuer
  - National Institute of Environmental Health Sciences, P.O. Box 12233, Mail Stop K2-16, Research Triangle Park, NC 27709 USA
- Antony J. Williams
  - National Center for Computational Toxicology, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Mail Code D143-02, Research Triangle Park, NC 27709 USA
31
Rungruangsak-Torrissen K, Manoonpong P. Neural computational model GrowthEstimate: A model for studying living resources through digestive efficiency. PLoS One 2019; 14:e0216030. [PMID: 31461459 PMCID: PMC6713322 DOI: 10.1371/journal.pone.0216030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/03/2018] [Accepted: 04/13/2019] [Indexed: 11/18/2022] Open
Abstract
The neural computational model GrowthEstimate is introduced, focusing on new perspectives for the practical estimation of weight specific growth rate (SGR, % day⁻¹). It is developed using recurrent neural networks of reservoir computing type, for estimating SGR based on the known data of three key biological factors relating to growth. These factors are: (1) weight (g) for specifying the age of the growth stage; (2) digestive efficiency through the pyloric caecal activity ratio of trypsin to chymotrypsin (T/C ratio) for specifying genetic differences in food utilization and growth potential, basically resulting from food consumption under variations in food quality and environmental conditions; and (3) protein growth efficiency through the condition factor (CF, 100 × g cm⁻³), since a higher dietary protein level promotes greater skeletal growth (length) and results in a lower CF. The computational model was trained using four datasets of different salmonids with size variations. It was evaluated with 15% of each dataset, resulting in an acceptable range of SGR outputs. Additional tests with different species indicated similarity between the estimated SGR outputs and the real SGR values, and the same ranking of wild population growth. The developed model GrowthEstimate is exceptionally useful for the precise and comparable growth estimation of living resources at individual levels, especially in natural ecosystems where the studied individuals, environmental conditions, food availability and consumption rates cannot be controlled. It is a revelation and will help to minimize uncertainty in the wild stock assessment process. This will improve our knowledge in nutritional ecology, through the biochemical effects of climate change and environmental impact on the growth performance quality of aquatic living resources in the wild, as well as in aquaculture. The original GrowthEstimate software is available in a GitHub repository (https://github.com/RungruangsakTorrissenManoonpong/GrowthEstimate). All other relevant data are within the paper. The model will be improved for generality in future use, which will require co-operation on biodata collection for different species from different climate zones; opportunities for such co-operation will therefore be available.
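Two of the quantities named in the abstract above have units that match their conventional fisheries definitions: specific growth rate in % day⁻¹ and condition factor in 100 × g cm⁻³. The sketch below writes those conventional formulas out; it is assumed, not confirmed by the abstract, that the paper uses exactly these definitions, and the function names are illustrative. It does not reproduce the GrowthEstimate neural model, which estimates SGR from weight, T/C ratio and CF.

```python
import math


def specific_growth_rate(w_start_g, w_end_g, days):
    """Conventional SGR in % per day: 100 * (ln W_end - ln W_start) / days."""
    return 100.0 * (math.log(w_end_g) - math.log(w_start_g)) / days


def condition_factor(weight_g, length_cm):
    """Conventional condition factor: 100 * weight (g) / length (cm) cubed."""
    return 100.0 * weight_g / length_cm ** 3


# Illustrative numbers only, not data from the paper.
sgr = specific_growth_rate(100.0, 100.0 * math.e, days=10)
cf = condition_factor(1000.0, 50.0)
```

With these definitions, a fish that grows longer without a matching gain in weight has a lower CF, consistent with the relationship between skeletal growth and CF described in the abstract.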
Affiliation(s)
- Krisna Rungruangsak-Torrissen
  - Institute of Marine Research, Ecosystem Processes Research Group, Matredal, Norway
  - Freelance Researcher, Bergen, Norway
- Poramate Manoonpong
  - Embodied Artificial Intelligence and Neurorobotics Lab, Centre for Biorobotics, The Mærsk Mc-Kinney Møller Institute, University of Southern Denmark, Odense M, Denmark
  - Bio-inspired Robotics and Neural Engineering Lab, School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology, Rayong, Thailand
32
Eraslan G, Avsec Ž, Gagneur J, Theis FJ. Deep learning: new computational modelling techniques for genomics. Nat Rev Genet 2019; 20:389-403. [PMID: 30971806 DOI: 10.1038/s41576-019-0122-6] [Citation(s) in RCA: 526] [Impact Index Per Article: 105.2] [Indexed: 12/17/2022]
Abstract
As a data-driven science, genomics largely utilizes machine learning to capture dependencies in data and derive novel biological hypotheses. However, the ability to extract new insights from the exponentially increasing volume of genomics data requires more expressive machine learning models. By effectively leveraging large data sets, deep learning has transformed fields such as computer vision and natural language processing. Now, it is becoming the method of choice for many genomics modelling tasks, including predicting the impact of genetic variation on gene regulatory mechanisms such as DNA accessibility and splicing.
Affiliation(s)
- Gökcen Eraslan
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Žiga Avsec
  - Department of Informatics, Technical University of Munich, Garching, Germany
- Julien Gagneur
  - Department of Informatics, Technical University of Munich, Garching, Germany
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
  - Department of Mathematics, Technical University of Munich, Garching, Germany
33
Hériché JK, Alexander S, Ellenberg J. Integrating Imaging and Omics: Computational Methods and Challenges. Annu Rev Biomed Data Sci 2019. [DOI: 10.1146/annurev-biodatasci-080917-013328] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 11/09/2022]
Abstract
Fluorescence microscopy imaging has long been complementary to DNA sequencing- and mass spectrometry–based omics in biomedical research, but these approaches are now converging. On the one hand, omics methods are moving from in vitro methods that average across large cell populations to in situ molecular characterization tools with single-cell sensitivity. On the other hand, fluorescence microscopy imaging has moved from a morphological description of tissues and cells to quantitative molecular profiling with single-molecule resolution. Recent technological developments underpinned by computational methods have started to blur the lines between imaging and omics and have made their direct correlation and seamless integration an exciting possibility. As this trend continues rapidly, it will allow us to create comprehensive molecular profiles of living systems with spatial and temporal context and subcellular resolution. Key to achieving this ambitious goal will be novel computational methods and successfully dealing with the challenges of data integration and sharing as well as cloud-enabled big data analysis.
Affiliation(s)
- Jean-Karim Hériché
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
- Stephanie Alexander
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
- Jan Ellenberg
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
34
Deep Learning in the Biomedical Applications: Recent and Future Status. Applied Sciences-Basel 2019. [DOI: 10.3390/app9081526] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Indexed: 02/06/2023]
Abstract
Deep neural networks are currently the most effective machine learning technology in the biomedical domain. In this domain, the areas of interest are Omics (the study of the genome, i.e. genomics, and of proteins: transcriptomics, proteomics, and metabolomics), bioimaging (the study of biological cells and tissues), medical imaging (the study of human organs by creating visual representations), BBMI (the study of the brain and body machine interface) and public and medical health management (PmHM). This paper reviews the major deep learning concepts pertinent to such biomedical applications. Concise overviews are provided for the Omics and the BBMI. We end our analysis with a critical discussion, interpretation and relevant open challenges.
35
Wang W, Corominas R, Lin GN. De novo Mutations From Whole Exome Sequencing in Neurodevelopmental and Psychiatric Disorders: From Discovery to Application. Front Genet 2019; 10:258. [PMID: 31001316 PMCID: PMC6456656 DOI: 10.3389/fgene.2019.00258] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 07/09/2018] [Accepted: 03/08/2019] [Indexed: 12/13/2022] Open
Abstract
Neurodevelopmental and psychiatric disorders are a highly disabling and heterogeneous group of developmental and mental disorders, resulting from complex interactions of genetic and environmental risk factors. The nature of multifactorial traits and the presence of comorbidity and polygenicity in these disorders present challenges in both disease risk identification and clinical diagnoses. The genetic component has been firmly established, but the identification of all the causative variants remains elusive. The development of next-generation sequencing, especially whole exome sequencing (WES), has greatly enriched our knowledge of the precise genetic alterations of human diseases, including brain-related disorders. In particular, the extensive usage of WES in research studies has uncovered the important contribution of de novo mutations (DNMs) to these disorders. Trio and quad familial WES are a particularly useful approach to discover DNMs. Here, we review the major WES studies in neurodevelopmental and psychiatric disorders and summarize how genes hit by discovered DNMs are shared among different disorders. Next, we discuss different integrative approaches utilized to interrogate DNMs and to identify biological pathways that may disrupt brain development and shed light on our understanding of the genetic architecture underlying these disorders. Lastly, we discuss the current state of the transition from WES research to its routine clinical application. This review will assist researchers and clinicians in the interpretation of variants obtained from WES studies, and highlights the need to develop consensus analytical protocols and validated lists of genes appropriate for clinical laboratory analysis, in order to reach the growing demands.
Affiliation(s)
- Weidi Wang
  - Shanghai Mental Health Center, School of Biomedical Engineering, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai, China
  - Brain Science and Technology Research Center, Shanghai Jiao Tong University, Shanghai, China
- Roser Corominas
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras, Valencia, Spain
  - Institut de Biomedicina de la Universitat de Barcelona, Barcelona, Spain
  - Institut de Recerca Sant Joan de Déu, Esplugues de Llobregat, Barcelona, Spain
- Guan Ning Lin
  - Shanghai Mental Health Center, School of Biomedical Engineering, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai, China
  - Brain Science and Technology Research Center, Shanghai Jiao Tong University, Shanghai, China
36
Zemouri R, Devalland C, Valmary-Degano S, Zerhouni N. [Neural network: A future in pathology?]. Ann Pathol 2019; 39:119-129. [PMID: 30773224 DOI: 10.1016/j.annpat.2019.01.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/23/2018] [Revised: 12/28/2018] [Accepted: 01/15/2019] [Indexed: 02/07/2023]
Abstract
Artificial intelligence methods, in particular deep neural networks, are the most widely used machine learning techniques in the biomedical field. Artificial neural networks are inspired by biological neurons; they are interconnected and follow mathematical models. Two phases are required: a learning phase and a usage phase. The two main applications are classification and regression. Computational tools such as GPU accelerators and development tools such as MATLAB libraries are used. Their application field is vast and allows the management of big data in genomics and molecular biology as well as the automated analysis of histological slides. The Whole Slide Image scanner can acquire and store slides in the form of digital images. This scanning, combined with deep learning algorithms, allows automatic recognition of lesions through the automatic recognition of regions of interest previously validated by the pathologist. These computer-aided diagnosis techniques are being tested in particular in mammary pathology and dermatopathology. They will allow a more efficient and more comprehensive view, and will provide diagnostic assistance in pathology by correlating several types of biomedical data, such as clinical, radiological and molecular biology data.
Affiliation(s)
- Ryad Zemouri
- CEDRIC laboratory of the Conservatoire national des arts et métiers (CNAM), HESAM université, 292, rue Saint-Martin, 75141 Paris cedex 03, France.
- Christine Devalland
- Service d'anatomie et cytologie pathologiques, hôpital nord Franche-Comté, 100, route de Moval, 90400 Trevenans, France.
- Séverine Valmary-Degano
- TSA10217, service d'anatomie et cytologie pathologiques, CHU de Grenoble-Alpes, 38043 Grenoble cedex, France.
- Noureddine Zerhouni
- ENSMM, CNR, FEMTO-ST institute, université de Bourgogne Franche-Comté, 25000 Besançon, France.
|
37
|
|
38
|
Amidi A, Amidi S, Vlachakis D, Megalooikonomou V, Paragios N, Zacharaki EI. EnzyNet: enzyme classification using 3D convolutional neural networks on spatial representation. PeerJ 2018; 6:e4750. [PMID: 29740518 PMCID: PMC5937476 DOI: 10.7717/peerj.4750] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 07/25/2017] [Accepted: 04/21/2018] [Indexed: 11/20/2022] Open
Abstract
During the past decade, with the significant progress of computational power as well as ever-rising data availability, deep learning techniques have become increasingly popular due to their excellent performance on computer vision problems. The size of the Protein Data Bank (PDB) has increased more than 15-fold since 1999, which enabled the expansion of models that aim at predicting the function of enzymes from their amino acid composition. Amino acid sequence, however, is less conserved in nature than protein structure and is therefore considered a less reliable predictor of protein function. This paper presents EnzyNet, a novel 3D convolutional neural network classifier that predicts the Enzyme Commission number of enzymes based only on their voxel-based spatial structure. The spatial distribution of biochemical properties was also examined as complementary information. The two-layer architecture was investigated on a large dataset of 63,558 enzymes from the PDB and achieved an accuracy of 78.4% by exploiting only the binary representation of the protein shape. Code and datasets are available at https://github.com/shervinea/enzynet.
Affiliation(s)
- Afshine Amidi
- Massachusetts Institute of Technology, Cambridge, MA, USA
- Center for Visual Computing, Department of Applied Mathematics, Ecole Centrale de Paris (CentraleSupélec), Châtenay-Malabry, France
- Shervine Amidi
- Center for Visual Computing, Department of Applied Mathematics, Ecole Centrale de Paris (CentraleSupélec), Châtenay-Malabry, France
- Dimitrios Vlachakis
- MDAKM Group, Department of Computer Engineering and Informatics, University of Patras, Patras, Greece
- Vasileios Megalooikonomou
- MDAKM Group, Department of Computer Engineering and Informatics, University of Patras, Patras, Greece
- Nikos Paragios
- Center for Visual Computing, Department of Applied Mathematics, Ecole Centrale de Paris (CentraleSupélec), Châtenay-Malabry, France
- Evangelia I Zacharaki
- Center for Visual Computing, Department of Applied Mathematics, Ecole Centrale de Paris (CentraleSupélec), Châtenay-Malabry, France
- MDAKM Group, Department of Computer Engineering and Informatics, University of Patras, Patras, Greece
|
39
|
Telenti A, Lippert C, Chang PC, DePristo M. Deep learning of genomic variation and regulatory network data. Hum Mol Genet 2018; 27:R63-R71. [PMID: 29648622 PMCID: PMC6499235 DOI: 10.1093/hmg/ddy115] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 01/09/2018] [Revised: 03/26/2018] [Accepted: 03/27/2018] [Indexed: 02/07/2023] Open
Abstract
The human genome is now investigated through high-throughput functional assays, and through the generation of population genomic data. These advances support the identification of functional genetic variants and the prediction of traits (e.g. deleterious variants and disease). This review summarizes lessons learned from the large-scale analyses of genome and exome data sets, modeling of population data and machine-learning strategies to solve complex genomic sequence regions. The review also portrays the rapid adoption of artificial intelligence/deep neural networks in genomics; in particular, deep learning approaches are well suited to model the complex dependencies in the regulatory landscape of the genome, and to provide predictors for genetic variant calling and interpretation.
Affiliation(s)
- Amalio Telenti
- Scripps Translational Science Institute, The Scripps Research Institute, La Jolla, CA 92037, USA
|
40
|
Adjeroh D, Allaga M, Tan J, Lin J, Jiang Y, Abbasi A, Zhou X. Feature-Based and String-Based Models for Predicting RNA-Protein Interaction. Molecules 2018; 23:E697. [PMID: 29562711 PMCID: PMC6017419 DOI: 10.3390/molecules23030697] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 12/21/2017] [Revised: 02/17/2018] [Accepted: 02/21/2018] [Indexed: 12/13/2022] Open
Abstract
In this work, we study two approaches for the problem of RNA-Protein Interaction (RPI). In the first approach, we use a feature-based technique by combining extracted features from both sequences and secondary structures. The feature-based approach enhanced the prediction accuracy as it included much more available information about the RNA-protein pairs. In the second approach, we apply search algorithms and data structures to extract effective string patterns for prediction of RPI, using both sequence information (protein and RNA sequences), and structure information (protein and RNA secondary structures). This led to different string-based models for predicting interacting RNA-protein pairs. We show results that demonstrate the effectiveness of the proposed approaches, including comparative results against leading state-of-the-art methods.
Affiliation(s)
- Donald Adjeroh
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26508, USA.
- Maen Allaga
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26508, USA.
- Jun Tan
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26508, USA.
- Jie Lin
- Faculty of Software, Fujian Normal University, Fuzhou 350108, China.
- Yue Jiang
- Faculty of Software, Fujian Normal University, Fuzhou 350108, China.
- Ahmed Abbasi
- McIntire School of Commerce, University of Virginia, Charlottesville, VA 22904, USA.
- Xiaobo Zhou
- McGovern Medical School, and School of Biomedical Informatics, The University of Texas Health Science Center at Houston (UTHealth), Houston, TX 77030, USA.
|
41
|
Integrating the whole from the sum of the parts: vignettes in computational biology. Emerg Top Life Sci 2017; 1:241-243. [PMID: 33525804 DOI: 10.1042/etls20170137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2017] [Revised: 10/05/2017] [Accepted: 10/05/2017] [Indexed: 11/17/2022]
Abstract
As is typical of contemporary cutting-edge interdisciplinary fields, computational biology touches and impacts many disciplines, ranging from fundamental studies in genomics, proteomics, transcriptomics, and lipidomics to practical applications such as personalized medicine, drug discovery, and synthetic biology. This editorial examines the multifaceted role that computational biology plays. Using the tools of deep learning, it can make powerful predictions of many biological variables, although these predictions may not provide a deep understanding of which factors contribute to the phenomena. Alternatively, it can provide the how and the why of biological processes. Most importantly, it can help guide the choice of experiments and biological systems to study, and help interpret them.
|