1
Eldawlatly S. On the role of generative artificial intelligence in the development of brain-computer interfaces. BMC Biomed Eng 2024; 6:4. [PMID: 38698495 PMCID: PMC11064240 DOI: 10.1186/s42490-024-00080-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Accepted: 04/24/2024] [Indexed: 05/05/2024] Open
Abstract
Since their inception more than 50 years ago, Brain-Computer Interfaces (BCIs) have held promise to compensate for functions lost by people with disabilities by allowing direct communication between the brain and external devices. While research over the past decades has demonstrated the feasibility of BCI as a successful assistive technology, widespread use of BCI outside the lab is still beyond reach. This can be attributed to a number of challenges that need to be addressed for BCI to be of practical use, including limited data availability, the limited temporal and spatial resolution of brain signals recorded non-invasively, and inter-subject variability. In addition, for a very long time, BCI development has been mainly confined to specific simple brain patterns, while developing other BCI applications that rely on complex brain patterns has proven infeasible. Generative Artificial Intelligence (GAI) has recently emerged as an artificial intelligence domain in which trained models can be used to generate new data with properties resembling those of available data. Given the enhancements observed in other domains that face challenges similar to those of BCI development, GAI has recently been employed in a multitude of BCI development applications to generate synthetic brain activity, thereby augmenting the recorded brain activity. Here, a brief review of the recent adoption of GAI techniques to overcome the aforementioned BCI challenges is provided, demonstrating the enhancements achieved using GAI techniques in augmenting limited EEG data, enhancing the spatiotemporal resolution of recorded EEG data, enhancing cross-subject performance of BCI systems, and implementing end-to-end BCI applications. GAI could represent the means by which BCI would be transformed into a prevalent assistive technology, thereby improving the quality of life of people with disabilities, and helping in adopting BCI as an emerging human-computer interaction technology for general use.
Affiliation(s)
- Seif Eldawlatly
- Computer and Systems Engineering Department, Faculty of Engineering, Ain Shams University, 1 El-Sarayat St., Abbassia, Cairo, Egypt.
- Computer Science and Engineering Department, The American University in Cairo, Cairo, Egypt.
2
Satheesh Kumar J, Vinoth Kumar V, Mahesh TR, Alqahtani MS, Prabhavathy P, Manikandan K, Guluwadi S. Detection of Marchiafava Bignami disease using distinct deep learning techniques in medical diagnostics. BMC Med Imaging 2024; 24:100. [PMID: 38684964 PMCID: PMC11059769 DOI: 10.1186/s12880-024-01283-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Accepted: 04/25/2024] [Indexed: 05/02/2024] Open
Abstract
PURPOSE: To detect Marchiafava Bignami Disease (MBD) using a distinct deep learning technique.
BACKGROUND: Advanced deep learning methods are becoming more crucial in contemporary medical diagnostics, particularly for detecting intricate and uncommon neurological illnesses such as MBD. This rare neurodegenerative disorder, sometimes associated with persistent alcoholism, is characterized by the loss of myelin or tissue death in the corpus callosum. It poses significant diagnostic difficulties owing to its infrequency and the subtle signs it exhibits in its first stages, both clinically and on radiological scans.
METHODS: Variational Autoencoders (VAEs) are used in conjunction with attention mechanisms to identify MBD accurately. VAEs are well known for their proficiency in unsupervised learning and anomaly detection. They excel at analyzing extensive brain imaging datasets to uncover subtle patterns and abnormalities that traditional diagnostic approaches may overlook, especially those related to specific diseases. The use of attention mechanisms enhances this technique, enabling the model to concentrate on the most crucial elements of the imaging data, similar to the discerning observation of a skilled radiologist. Thus, we utilized the VAE with attention mechanisms in this study to detect MBD. Such a combination enables the prompt identification of MBD and assists in formulating more customized and efficient treatment strategies.
RESULTS: A significant breakthrough in this field is the creation of a VAE equipped with attention mechanisms, which has shown outstanding performance by achieving accuracy rates of over 90% in differentiating MBD from other neurodegenerative disorders.
CONCLUSION: This model, which underwent training using a diverse range of MRI images, has shown a notable level of sensitivity and specificity, significantly minimizing the frequency of false positive results and strengthening the confidence and dependability of these sophisticated automated diagnostic tools.
Affiliation(s)
- J Satheesh Kumar
- Department of Electronics and Instrumentation Engineering, Dayananda Sagar College of Engineering, Bangalore, India
- V Vinoth Kumar
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, India
- T R Mahesh
- Department of Computer Science and Engineering, JAIN (Deemed-to-Be University), Bengaluru, 562112, India
- Mohammed S Alqahtani
- Radiological Sciences Department, College of Applied Medical Sciences, King Khalid University, 61421, Abha, Saudi Arabia
- P Prabhavathy
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, India
- K Manikandan
- School of Computer Science and Engineering (SCOPE), Vellore Institute of Technology (VIT), Vellore, India
- Suresh Guluwadi
- Adama Science and Technology University, 302120, Adama, Ethiopia.
3
Tung CH, Hsiao YJ, Chen HL, Huang GR, Porcar L, Chang MC, Carrillo JM, Wang Y, Sumpter BG, Shinohara Y, Taylor J, Do C, Chen WR. Unveiling mesoscopic structures in distorted lamellar phases through deep learning-based small angle neutron scattering analysis. J Colloid Interface Sci 2024; 659:739-750. [PMID: 38211491 DOI: 10.1016/j.jcis.2024.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 12/18/2023] [Accepted: 01/02/2024] [Indexed: 01/13/2024]
Abstract
HYPOTHESIS: The formation of distorted lamellar phases, distinguished by their arrangement of crumpled, stacked layers, is frequently accompanied by the disruption of long-range order, leading to the formation of interconnected network structures commonly observed in the sponge phase. Nevertheless, traditional scattering functions grounded in deterministic modeling fall short of fully representing these intricate structural characteristics. Our hypothesis posits that a deep learning method, in conjunction with the generalized leveled wave approach used for describing structural features of distorted lamellar phases, can quantitatively unveil the inherent spatial correlations within these phases.
EXPERIMENTS AND SIMULATIONS: This report outlines a novel strategy that integrates convolutional neural networks and variational autoencoders, supported by stochastically generated density fluctuations, into a regression analysis framework for extracting structural features of distorted lamellar phases from small angle neutron scattering data. To evaluate the efficacy of our proposed approach, we conducted computational accuracy assessments and applied it to the analysis of experimentally measured small angle neutron scattering spectra of AOT surfactant solutions, a frequently studied lamellar system.
FINDINGS: The findings unambiguously demonstrate that deep learning provides a dependable and quantitative approach for investigating the morphology of wide variations of distorted lamellar phases. It is adaptable for deciphering structures from the lamellar to sponge phase including intermediate structures exhibiting fused topological features. This research highlights the effectiveness of deep learning methods in tackling complex issues in the field of soft matter structural analysis and beyond.
Affiliation(s)
- Chi-Huan Tung
- Department of Chemical Engineering, National Tsing Hua University, Hsinchu, 30013, Taiwan
- Yu-Jung Hsiao
- Department of Chemical Engineering, National Tsing Hua University, Hsinchu, 30013, Taiwan
- Hsin-Lung Chen
- Department of Chemical Engineering, National Tsing Hua University, Hsinchu, 30013, Taiwan
- Guan-Rong Huang
- Department of Materials and Optoelectronic Science, National Sun Yat-sen University, Kaohsiung, 80424, Taiwan
- Lionel Porcar
- Institut Laue-Langevin, B.P. 156, F-38042 Grenoble Cedex 9, France
- Ming-Ching Chang
- Department of Computer Science, University at Albany - State University of New York, Albany, 12222, NY, United States
- Jan-Michael Carrillo
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, 37831, TN, United States
- Yangyang Wang
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, 37831, TN, United States
- Bobby G Sumpter
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, 37831, TN, United States
- Yuya Shinohara
- Materials Science and Technology Division, Oak Ridge National Laboratory, Oak Ridge, 37831, TN, United States
- Jon Taylor
- Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, 37831, TN, United States
- Changwoo Do
- Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, 37831, TN, United States
- Wei-Ren Chen
- Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, 37831, TN, United States.
4
McMillan L, Fayaz J, Varga L. Domain-informed variational neural networks and support vector machines based leakage detection framework to augment self-healing in water distribution networks. Water Res 2024; 249:120983. [PMID: 38118223 DOI: 10.1016/j.watres.2023.120983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 11/17/2023] [Accepted: 12/05/2023] [Indexed: 12/22/2023]
Abstract
The reduction of water leakage is essential for ensuring sustainable and resilient water supply systems. Despite recent investments in sensing technologies, pipe leakage remains a significant challenge for the water sector, particularly in developed nations like the UK, which suffer from aging water infrastructure. Conventional models and analytical methods for detecting pipe leakage often face reliability issues and are generally limited to detecting leaks during nighttime hours. Moreover, leakages are frequently detected by the customers rather than the water companies. To achieve substantial reductions in leakage and enhance public confidence in water supply and management, adopting an intelligent detection method is crucial. Such a method should effectively leverage existing sensor data for reliable leakage identification across the network. This not only helps in minimizing water loss and the associated energy costs of water treatment but also aids in steering the water sector towards a more sustainable and resilient future. As a step towards 'self-healing' water infrastructure systems, this study presents a novel framework for rapidly identifying potential leakages at the district meter area (DMA) level. The framework involves training a domain-informed variational autoencoder (VAE) for real-time dimensionality reduction of water flow time series data and developing a two-dimensional surrogate latent variable (LV) mapping which sufficiently and efficiently captures the distinct characteristics of leakage and regular (non-leakage) flow. The domain-informed training employs a novel loss function that ensures a distinct but regulated LV space for the two classes of flow groupings (i.e., leakage and non-leakage). Subsequently, a binary SVM classifier is used to provide a hyperplane for separating the two classes of LVs corresponding to the flow groupings. Hence, the proposed framework can be efficiently utilised to classify the incoming flow as leakage or non-leakage based on the encoded surrogate LVs of the flow time series using the trained VAE encoder. The framework is trained and tested on a dataset of over 2000 DMAs in North Yorkshire, UK, containing water flow time series recorded at 15-minute intervals over one year. The framework performs exceptionally well for both regular and leakage water flow groupings, with a classification accuracy of over 98% on the unobserved test dataset.
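The second stage described in this abstract, a binary classifier that separates two-dimensional latent variables with a hyperplane, can be illustrated with a minimal sketch. It is not the authors' implementation: the trained VAE encoder is not reproduced, so synthetic 2-D points stand in for encoded flow series, and a hand-written linear soft-margin SVM (stochastic subgradient descent on the hinge loss) stands in for whatever SVM solver the study used. All names (`make_latents`, `train_linear_svm`, `predict`), cluster centres, and hyperparameters are hypothetical.

```python
import random


def make_latents(n, center, spread, rng):
    """Synthetic stand-in for 2-D latent variables produced by an encoder."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]


def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=200, seed=0):
    """Fit a separating line w.x + b = 0 by stochastic subgradient descent
    on the L2-regularised hinge loss (labels must be +1 or -1)."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    b = 0.0
    order = list(range(len(points)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = points[i], labels[i]
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            # Shrink the weights (gradient of the regulariser).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Points inside the margin also contribute a hinge-loss gradient.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (leakage) or -1 (regular flow) for a 2-D latent point."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1


# Assumed, well-separated clusters; real latent spaces may overlap.
rng = random.Random(42)
regular = make_latents(50, (-2.0, 0.0), 0.3, rng)
leak = make_latents(50, (2.0, 0.0), 0.3, rng)
points = regular + leak
labels = [-1] * len(regular) + [1] * len(leak)

w, b = train_linear_svm(points, labels)
print("weights:", w, "bias:", b)
print("new point (1.5, 0.2) ->", predict(w, b, (1.5, 0.2)))
```

In a setting like the one the abstract describes, each incoming flow series would first be encoded to its latent point by the trained encoder, and only that point would be passed to `predict`.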
Affiliation(s)
- Lauren McMillan
- Infrastructure Systems Institute, Department of Civil, Environmental and Geomatic Engineering, University College London (UCL), London, UK.
- Jawad Fayaz
- Infrastructure Systems Institute, Department of Civil, Environmental and Geomatic Engineering, University College London (UCL), London, UK; School of Computing, Engineering and Digital Technologies, Teesside University (TU), Teesside, UK
- Liz Varga
- Infrastructure Systems Institute, Department of Civil, Environmental and Geomatic Engineering, University College London (UCL), London, UK
5
Monachino G, Zanchi B, Fiorillo L, Conte G, Auricchio A, Tzovara A, Faraci FD. Deep Generative Models: The winning key for large and easily accessible ECG datasets? Comput Biol Med 2023; 167:107655. [PMID: 37976830 DOI: 10.1016/j.compbiomed.2023.107655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 10/04/2023] [Accepted: 10/31/2023] [Indexed: 11/19/2023]
Abstract
Large high-quality datasets are essential for building powerful artificial intelligence (AI) algorithms capable of supporting advancement in cardiac clinical research. However, researchers working with electrocardiogram (ECG) signals struggle to access or build such datasets. The aim of the present work is to shed light on a potential solution to address the lack of large and easily accessible ECG datasets. First, the main causes of this lack are identified and examined. Afterward, the potential and limitations of cardiac data generation via deep generative models (DGMs) are analyzed in depth. These very promising algorithms have been found capable not only of generating large quantities of ECG signals but also of supporting data anonymization processes, to simplify data sharing while respecting patients' privacy. Their application could help research progress and cooperation in the name of open science. However, several aspects, such as standardized synthetic data quality evaluation and algorithm stability, need to be further explored.
Affiliation(s)
- Giuliana Monachino
- Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland; Institute of Informatics, University of Bern, Neubrückstrasse 10, Bern 3012, Switzerland.
- Beatrice Zanchi
- Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland; Department of Quantitative Biomedicine, University of Zurich, Schmelzbergstrasse 26, Zurich 8091, Switzerland
- Luigi Fiorillo
- Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland
- Giulio Conte
- Division of Cardiology, Fondazione Cardiocentro Ticino, Via Tesserete 48, Lugano 6900, Switzerland; Centre for Computational Medicine in Cardiology, Faculty of Informatics, Università della Svizzera Italiana, Via la Santa 1, Lugano 6900, Switzerland
- Angelo Auricchio
- Division of Cardiology, Fondazione Cardiocentro Ticino, Via Tesserete 48, Lugano 6900, Switzerland; Centre for Computational Medicine in Cardiology, Faculty of Informatics, Università della Svizzera Italiana, Via la Santa 1, Lugano 6900, Switzerland
- Athina Tzovara
- Institute of Informatics, University of Bern, Neubrückstrasse 10, Bern 3012, Switzerland; Sleep Wake Epilepsy Center | NeuroTec, Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16, Bern 3010, Switzerland
- Francesca Dalia Faraci
- Institute of Digital Technologies for Personalized Healthcare - MeDiTech, Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Via la Santa 1, Lugano 6900, Switzerland
6
Celard P, Iglesias EL, Sorribes-Fdez JM, Romero R, Vieira AS, Borrajo L. A survey on deep learning applied to medical images: from simple artificial neural networks to generative models. Neural Comput Appl 2023; 35:2291-2323. [PMID: 36373133 PMCID: PMC9638354 DOI: 10.1007/s00521-022-07953-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 05/23/2022] [Accepted: 10/12/2022] [Indexed: 11/06/2022]
Abstract
Deep learning techniques, in particular generative models, have taken on great importance in medical image analysis. This paper surveys fundamental deep learning concepts related to medical image generation. It provides concise overviews of studies that apply state-of-the-art models from recent years to medical images of different injured body areas or organs with an associated disease (e.g., brain tumor and COVID-19 lung pneumonia). The motivation for this study is to offer a comprehensive overview of artificial neural networks (NNs) and deep generative models in medical imaging, so that more groups and authors who are not familiar with deep learning consider using it in medical work. We review the use of generative models, such as generative adversarial networks and variational autoencoders, as techniques to achieve semantic segmentation, data augmentation, and better classification algorithms, among other purposes. In addition, a collection of widely used public medical datasets containing magnetic resonance (MR) images, computed tomography (CT) scans, and common pictures is presented. Finally, we feature a summary of the current state of generative models in medical imaging, including key features, current challenges, and future research paths.
Affiliation(s)
- P. Celard
- Computer Science Department, Universidade de Vigo, Escuela Superior de Ingeniería Informática, Campus Universitario As Lagoas, 32004 Ourense, Spain; CINBIO - Biomedical Research Centre, Universidade de Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain; SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- E. L. Iglesias
- Computer Science Department, Universidade de Vigo, Escuela Superior de Ingeniería Informática, Campus Universitario As Lagoas, 32004 Ourense, Spain; CINBIO - Biomedical Research Centre, Universidade de Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain; SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- J. M. Sorribes-Fdez
- Computer Science Department, Universidade de Vigo, Escuela Superior de Ingeniería Informática, Campus Universitario As Lagoas, 32004 Ourense, Spain; CINBIO - Biomedical Research Centre, Universidade de Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain; SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- R. Romero
- Computer Science Department, Universidade de Vigo, Escuela Superior de Ingeniería Informática, Campus Universitario As Lagoas, 32004 Ourense, Spain; CINBIO - Biomedical Research Centre, Universidade de Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain; SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- A. Seara Vieira
- Computer Science Department, Universidade de Vigo, Escuela Superior de Ingeniería Informática, Campus Universitario As Lagoas, 32004 Ourense, Spain; CINBIO - Biomedical Research Centre, Universidade de Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain; SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- L. Borrajo
- Computer Science Department, Universidade de Vigo, Escuela Superior de Ingeniería Informática, Campus Universitario As Lagoas, 32004 Ourense, Spain; CINBIO - Biomedical Research Centre, Universidade de Vigo, Campus Universitario Lagoas-Marcosende, 36310 Vigo, Spain; SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
7
Dimitriadis A, Trivizakis E, Papanikolaou N, Tsiknakis M, Marias K. Enhancing cancer differentiation with synthetic MRI examinations via generative models: a systematic review. Insights Imaging 2022; 13:188. [PMID: 36503979 PMCID: PMC9742072 DOI: 10.1186/s13244-022-01315-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Accepted: 07/24/2022] [Indexed: 12/14/2022] Open
Abstract
Contemporary deep learning-based decision systems are well known for requiring high-volume datasets in order to produce generalized, reliable, and high-performing models. However, the collection of such datasets is challenging, requiring time-consuming processes that also involve expert clinicians with limited time. In addition, data collection often raises ethical and legal issues and depends on costly and invasive procedures. Deep generative models such as generative adversarial networks and variational autoencoders can capture the underlying distribution of the examined data, allowing them to create new and unique instances of samples. This study aims to shed light on generative data augmentation techniques and corresponding best practices. Through in-depth investigation, we underline the limitations and potential methodological pitfalls from a critical standpoint and aim to promote open science research by identifying publicly available open-source repositories and datasets.
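For readers unfamiliar with how a variational autoencoder "captures the underlying distribution", the standard textbook objective can be written out. The abstract itself gives no formula, so the symbols below (encoder $q_\phi$, decoder $p_\theta$, prior $p(z)$) follow the usual formulation and are assumptions about notation, not something taken from the reviewed studies. Training maximizes the evidence lower bound for each sample $x$, and new samples are then drawn by decoding latent codes sampled from the prior:

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr),
\qquad
z \sim p(z), \quad \tilde{x} \sim p_\theta(x \mid z).
```

The first term rewards faithful reconstruction; the second keeps the encoded distribution close to the prior, which is what makes sampling from $p(z)$ a plausible way to generate synthetic examples for augmentation.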
Affiliation(s)
- Avtantil Dimitriadis
- Computational Biomedicine Laboratory (CBML), Foundation for Research and Technology Hellas (FORTH), 70013 Heraklion, Greece (GRID grid.4834.b; ISNI 0000 0004 0635 685X); Department of Electrical and Computer Engineering, Hellenic Mediterranean University, 71410 Heraklion, Greece (GRID grid.419879.a; ISNI 0000 0004 0393 8299)
- Eleftherios Trivizakis
- Computational Biomedicine Laboratory (CBML), Foundation for Research and Technology Hellas (FORTH), 70013 Heraklion, Greece (GRID grid.4834.b; ISNI 0000 0004 0635 685X); Medical School, University of Crete, 71003 Heraklion, Greece (GRID grid.8127.c; ISNI 0000 0004 0576 3437)
- Nikolaos Papanikolaou
- Computational Biomedicine Laboratory (CBML), Foundation for Research and Technology Hellas (FORTH), 70013 Heraklion, Greece (GRID grid.4834.b; ISNI 0000 0004 0635 685X); Computational Clinical Imaging Group, Centre of the Unknown, Champalimaud Foundation, 1400-038 Lisbon, Portugal (GRID grid.421010.6; ISNI 0000 0004 0453 9636); The Royal Marsden NHS Foundation Trust, The Institute of Cancer Research, London, UK (GRID grid.18886.3f)
- Manolis Tsiknakis
- Computational Biomedicine Laboratory (CBML), Foundation for Research and Technology Hellas (FORTH), 70013 Heraklion, Greece (GRID grid.4834.b; ISNI 0000 0004 0635 685X); Department of Electrical and Computer Engineering, Hellenic Mediterranean University, 71410 Heraklion, Greece (GRID grid.419879.a; ISNI 0000 0004 0393 8299)
- Kostas Marias
- Computational Biomedicine Laboratory (CBML), Foundation for Research and Technology Hellas (FORTH), 70013 Heraklion, Greece (GRID grid.4834.b; ISNI 0000 0004 0635 685X); Department of Electrical and Computer Engineering, Hellenic Mediterranean University, 71410 Heraklion, Greece (GRID grid.419879.a; ISNI 0000 0004 0393 8299)
8
Tevosyan A, Khondkaryan L, Khachatrian H, Tadevosyan G, Apresyan L, Babayan N, Stopper H, Navoyan Z. Improving VAE based molecular representations for compound property prediction. J Cheminform 2022; 14:69. [PMID: 36242073 PMCID: PMC9569108 DOI: 10.1186/s13321-022-00648-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/15/2022] [Accepted: 10/01/2022] [Indexed: 11/25/2022] Open
Abstract
Collecting labeled data for many important tasks in chemoinformatics is time consuming and requires expensive experiments. In recent years, machine learning has been used to learn rich representations of molecules using large scale unlabeled molecular datasets and transfer the knowledge to solve the more challenging tasks with limited datasets. Variational autoencoders are one of the tools that have been proposed to perform the transfer for both chemical property prediction and molecular generation tasks. In this work we propose a simple method to improve chemical property prediction performance of machine learning models by incorporating additional information on correlated molecular descriptors in the representations learned by variational autoencoders. We verify the method on three property prediction tasks. We explore the impact of the number of incorporated descriptors, correlation between the descriptors and the target properties, sizes of the datasets etc. Finally, we show the relation between the performance of property prediction models and the distance between property prediction dataset and the larger unlabeled dataset in the representation space.
Affiliation(s)
- Ani Tevosyan
- YerevaNN, Charents str. 20, 0025, Yerevan, Armenia
- Lusine Khondkaryan
- Laboratory of Cell Technologies, Institute of Molecular Biology, National Academy of Sciences of RA, Hasratyan str. 7, 0014, Yerevan, Armenia
- Hrant Khachatrian
- YerevaNN, Charents str. 20, 0025, Yerevan, Armenia; Yerevan State University, Alex Manoogian str. 1, 0025, Yerevan, Armenia
- Gohar Tadevosyan
- Laboratory of Cell Technologies, Institute of Molecular Biology, National Academy of Sciences of RA, Hasratyan str. 7, 0014, Yerevan, Armenia
- Lilit Apresyan
- Laboratory of Cell Technologies, Institute of Molecular Biology, National Academy of Sciences of RA, Hasratyan str. 7, 0014, Yerevan, Armenia
- Nelly Babayan
- Laboratory of Cell Technologies, Institute of Molecular Biology, National Academy of Sciences of RA, Hasratyan str. 7, 0014, Yerevan, Armenia; Toxometris.ai, Sarmen str. 7, 0009, Yerevan, Armenia
- Helga Stopper
- Department of Toxicology, Institute of Pharmacology and Toxicology, University of Würzburg, Versbacher str. 9, 97078, Würzburg, Germany
- Zaven Navoyan
- Toxometris.ai, Sarmen str. 7, 0009, Yerevan, Armenia.
9
Kshirsagar M, Yuan H, Ferres JL, Leslie C. BindVAE: Dirichlet variational autoencoders for de novo motif discovery from accessible chromatin. Genome Biol 2022; 23:174. [PMID: 35971180 PMCID: PMC9380350 DOI: 10.1186/s13059-022-02723-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Accepted: 06/28/2022] [Indexed: 11/10/2022] Open
Abstract
We present a novel unsupervised deep learning approach called BindVAE, based on Dirichlet variational autoencoders, for jointly decoding multiple TF binding signals from open chromatin regions. BindVAE can disentangle an input DNA sequence into distinct latent factors that encode cell-type specific in vivo binding signals for individual TFs, composite patterns for TFs involved in cooperative binding, and genomic context surrounding the binding sites. On the task of retrieving the motifs of expressed TFs in a given cell type, BindVAE is competitive with existing motif discovery approaches.
Affiliation(s)
- Han Yuan
- Calico Life Sciences, South San Francisco, CA, USA
10
Ren Z, Li J, Xue X, Li X, Yang F, Jiao Z, Gao X. Reconstructing seen image from brain activity by visually-guided cognitive representation and adversarial learning. Neuroimage 2021; 228:117602. [PMID: 33395572 DOI: 10.1016/j.neuroimage.2020.117602] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/04/2019] [Revised: 11/13/2020] [Accepted: 11/23/2020] [Indexed: 11/20/2022] Open
Abstract
Reconstructing a perceived stimulus (image) solely from human brain activity measured with functional Magnetic Resonance Imaging (fMRI) is a significant task in brain decoding. However, the inconsistent distribution and representation between fMRI signals and visual images cause a large 'domain gap'. Moreover, the limited fMRI data instances generally suffer from low signal-to-noise ratio (SNR), extremely high dimensionality, and limited spatial resolution. Existing methods are often affected by these issues, so satisfactory reconstruction is still an open problem. In this paper, we show that it is possible to obtain a promising solution by learning visually-guided latent cognitive representations from the fMRI signals, and inversely decoding them to the image stimuli. The resulting framework is called Dual-Variational Autoencoder/Generative Adversarial Network (D-Vae/Gan), which combines the advantages of adversarial representation learning with knowledge distillation. In addition, we introduce a novel three-stage learning strategy which enables the (cognitive) encoder to gradually distill useful knowledge from the paired (visual) encoder during the learning process. Extensive experimental results on both artificial and natural images have demonstrated that our method could achieve surprisingly good results and outperform the available alternatives.
11
Derkarabetian S, Castillo S, Koo PK, Ovchinnikov S, Hedin M. A demonstration of unsupervised machine learning in species delimitation. Mol Phylogenet Evol 2019; 139:106562. [PMID: 31323334 PMCID: PMC6880864 DOI: 10.1016/j.ympev.2019.106562] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 03/17/2019] [Revised: 07/03/2019] [Accepted: 07/15/2019] [Indexed: 01/13/2023]
Abstract
One major challenge to delimiting species with genetic data is successfully differentiating population structure from species-level divergence, an issue exacerbated in taxa inhabiting naturally fragmented habitats. Many fields of science are now using machine learning, and in evolutionary biology supervised machine learning has recently been used to infer species boundaries. These supervised methods require training data with associated labels. Conversely, unsupervised machine learning (UML) uses inherent data structure and does not require user-specified training labels, potentially providing more objectivity in species delimitation. In the context of integrative taxonomy, we demonstrate the utility of three UML approaches (random forests, variational autoencoders, t-distributed stochastic neighbor embedding) for species delimitation in an arachnid taxon with high population genetic structure (Opiliones, Laniatores, Metanonychus). We find that UML approaches successfully cluster samples according to species-level divergences and not high levels of population structure, while model-based validation methods severely over-split putative species. UML offers intuitive data visualization in two-dimensional space, the ability to accommodate various data types, and has potential in many areas of systematic and evolutionary biology. We argue that machine learning methods are ideally suited for species delimitation and may perform well in many natural systems and across taxa with diverse biological characteristics.
Affiliation(s)
- Shahan Derkarabetian
- Department of Organismic and Evolutionary Biology, Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138, United States; Department of Biology, San Diego State University, San Diego, CA 92182, United States; Department of Evolution, Ecology, and Organismal Biology, University of California, Riverside, Riverside, CA 92521, United States.
- Stephanie Castillo
- Department of Biology, San Diego State University, San Diego, CA 92182, United States; Department of Entomology, University of California, Riverside, Riverside, CA 92521, United States
- Peter K Koo
- Howard Hughes Medical Institute, Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, United States
- Sergey Ovchinnikov
- Center for Systems Biology, Harvard University, Cambridge, MA 02138, United States
- Marshal Hedin
- Department of Biology, San Diego State University, San Diego, CA 92182, United States