51
Tesei G, Giampanis S, Shi J, Norgeot B. Learning end-to-end patient representations through self-supervised covariate balancing for causal treatment effect estimation. J Biomed Inform 2023; 140:104339. PMID: 36940895. DOI: 10.1016/j.jbi.2023.104339.
Abstract
A causal effect can be defined as a comparison of outcomes that result from two or more alternative actions, with only one of the action-outcome pairs actually being observed. In healthcare, the gold standard for causal effect measurements is randomized controlled trials (RCTs), in which a target population is explicitly defined and each study sample is randomly assigned to either the treatment or the control cohort. The great potential to derive actionable insights from causal relationships has led to a growing body of machine-learning research applying causal effect estimators to observational data in the fields of healthcare, education, and economics. The primary difference between causal effect studies utilizing observational data and RCTs is that for observational data, the study occurs after the treatment, and therefore we have no control over the treatment assignment mechanism. This can lead to massive differences in covariate distributions between control and treatment samples, making a comparison of causal effects confounded and unreliable. Classical approaches have sought to solve this problem piecemeal, predicting treatment assignment and treatment effect separately. Recent work extended part of these approaches to a new family of representation-learning algorithms, showing that the upper bound of the expected treatment effect estimation error is determined by two factors: the outcome generalization error of the representation and the distance between the treated and control distributions induced by the representation. To achieve minimal dissimilarity in learning such distributions, in this work we propose a specific auto-balancing, self-supervised objective. Experiments on real and benchmark datasets revealed that our approach consistently produced less biased estimates than previously published state-of-the-art methods.
We demonstrate that the reduction in error can be directly attributed to the ability to learn representations that explicitly reduce such dissimilarity; further, in case of violations of the positivity assumption (frequent in observational data), we show our approach performs significantly better than the previous state of the art. Thus, by learning representations that induce similar distributions of the treated and control cohorts, we present evidence to support the error bound dissimilarity hypothesis as well as providing a new state-of-the-art model for causal effect estimation.
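The balancing term at the heart of such objectives can be illustrated with a small numpy sketch: an RBF-kernel maximum mean discrepancy (MMD) between treated and control representations, one common choice of distribution distance in representation-balancing estimators. This is an illustrative sketch only, not the paper's exact self-supervised objective.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Pairwise squared distances, then a Gaussian kernel.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(phi_treated, phi_control, sigma=1.0):
    """Squared MMD between the two induced representation distributions."""
    Ktt = rbf_kernel(phi_treated, phi_treated, sigma)
    Kcc = rbf_kernel(phi_control, phi_control, sigma)
    Ktc = rbf_kernel(phi_treated, phi_control, sigma)
    return Ktt.mean() + Kcc.mean() - 2 * Ktc.mean()

rng = np.random.default_rng(0)
phi_t = rng.normal(0.0, 1.0, size=(100, 8))       # treated representations
phi_c_far = rng.normal(2.0, 1.0, size=(100, 8))   # shifted control cohort
phi_c_near = rng.normal(0.0, 1.0, size=(100, 8))  # balanced control cohort

# A balanced representation yields a much smaller MMD term.
print(mmd2(phi_t, phi_c_far) > mmd2(phi_t, phi_c_near))  # True
```

A training objective of the kind described would add a term like `mmd2` to the outcome-prediction loss, pushing the encoder toward representations with similar treated and control distributions.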
52
|
Gajendran S, Manjula D, Sugumaran V, Hema R. Extraction of knowledge graph of Covid-19 through mining of unstructured biomedical corpora. Comput Biol Chem 2023; 102:107808. PMID: 36621289. PMCID: PMC9807269. DOI: 10.1016/j.compbiolchem.2022.107808.
Abstract
The number of biomedical articles published is increasing rapidly over the years. Currently there are about 30 million articles in PubMed and over 25 million mentions in Medline. Among the fundamental literature-analysis tasks, Biomedical Named Entity Recognition (BioNER) and Biomedical Relation Extraction (BioRE) are the most essential. In the biomedical domain, a Knowledge Graph is used to visualize the relationships between entities such as proteins, chemicals, and diseases. Scientific publications have increased dramatically as a result of the search for treatments and potential cures for the new coronavirus, but efficiently analysing, integrating, and utilising related sources of information remains a difficulty. In order to effectively combat the disease during pandemics like COVID-19, the literature must be used quickly and effectively. In this paper, we introduce a fully automated framework consisting of BERT-BiLSTM, Knowledge Graph, and Representation Learning models to extract the top diseases, chemicals, and proteins related to COVID-19 from the literature. The proposed framework uses Named Entity Recognition models for disease, chemical, and protein recognition, followed by Chemical-Disease and Chemical-Protein Relation Extraction models. The system extracts the entities and relations from the CORD-19 dataset using these models, creates a Knowledge Graph from the extracted relations and entities, and performs Representation Learning on this KG to obtain embeddings of all entities and retrieve the top related diseases, chemicals, and proteins with respect to COVID-19.
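The tail end of such a pipeline — collecting extracted (head, relation, tail) triples into a graph and ranking the entities most connected to a query node — can be sketched in plain Python. The triples and entity names below are invented for illustration, not drawn from CORD-19.

```python
from collections import Counter, defaultdict

# Toy (head, relation, tail) triples, as an NER + relation-extraction
# stage might emit them; names are illustrative only.
triples = [
    ("remdesivir", "treats", "COVID-19"),
    ("COVID-19", "associated_with", "pneumonia"),
    ("remdesivir", "inhibits", "RdRp"),
    ("COVID-19", "associated_with", "ARDS"),
    ("dexamethasone", "treats", "COVID-19"),
    ("COVID-19", "associated_with", "pneumonia"),  # duplicate mention
]

def build_graph(triples):
    # Undirected co-mention counts between entities.
    graph = defaultdict(Counter)
    for head, _rel, tail in triples:
        graph[head][tail] += 1
        graph[tail][head] += 1
    return graph

def top_related(graph, query, k=3):
    # Entities most frequently linked to the query node.
    return [e for e, _ in graph[query].most_common(k)]

kg = build_graph(triples)
print(top_related(kg, "COVID-19"))  # pneumonia ranks first (2 mentions)
```

A real system would rank with learned KG embeddings rather than raw counts, but the graph-building step is the same shape.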
53
|
Lu M, Zhang Y, Zhang S, Shi H, Huang Z. Knowledge-aware patient representation learning for multiple disease subtypes. J Biomed Inform 2023; 138:104292. PMID: 36641030. DOI: 10.1016/j.jbi.2023.104292.
Abstract
Learning latent representations of patients with a target disease is a core problem in a broad range of downstream applications, such as clinical endpoint prediction. The disease patients suffer from may have multiple subtypes with certain similarities and differences, which need to be addressed to learn effective patient representations that facilitate the downstream tasks. However, existing studies either ignore the distinction between disease subtypes and learn only disease-level representations, or neglect the correlations between subtypes and learn only subtype-level representations, which affects the performance of patient representation learning. To alleviate this problem, we studied how to effectively integrate data from all disease subtypes to improve the representation of each subtype. Specifically, we proposed a knowledge-aware shared-private neural network model that explicitly uses disease-oriented knowledge and learns shared and specific representations from the disease and subtype perspectives. To evaluate the feasibility of the proposed model, we conducted a particular downstream task, i.e., clinical endpoint prediction, on the basis of the learned patient representations. The results on real-world clinical datasets demonstrated that our model yields a significant improvement over state-of-the-art models.
54
|
Grigorashvili EI, Chervontseva ZS, Gelfand MS. Predicting RNA secondary structure by a neural network: what features may be learned? PeerJ 2022; 10:e14335. PMID: 36530406. PMCID: PMC9756865. DOI: 10.7717/peerj.14335.
Abstract
Deep learning is a class of machine learning techniques capable of creating internal representations of data without explicit preprogramming. Hence, in addition to practical applications, it is of interest to analyze what features of biological data may be learned by such models. Here, we describe PredPair, a deep learning neural network trained to predict base pairs in RNA structure from sequence alone, without any incorporated prior knowledge such as stacking energies or possible spatial structures. PredPair learned the Watson-Crick and wobble base-pairing rules and created an internal representation of the stacking energies and helices. Application to independent experimental (DMS-Seq) data on nucleotide accessibility in mRNA showed that the nucleotides predicted as paired indeed tend to be involved in the RNA structure. The performance of the constructed model was comparable with that of the state-of-the-art method based on the thermodynamic approach, but with a higher false-positive rate. On the other hand, it successfully predicted pseudoknots. t-SNE clusters of embeddings of RNA sequences created by PredPair tend to contain embeddings from particular Rfam families, supporting that PredPair's predictions are in line with biological classification.
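The base-pairing rules the network is reported to have learned are easy to state explicitly; a minimal helper makes them concrete (canonical Watson-Crick pairs plus the G-U wobble pair).

```python
# Canonical Watson-Crick pairs plus the G-U wobble pair: the rules
# PredPair is reported to have recovered from sequence alone.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"),
                 ("G", "C"), ("C", "G"),
                 ("G", "U"), ("U", "G")}   # wobble

def can_pair(a, b):
    """True if two RNA nucleotides can form an allowed base pair."""
    return (a.upper(), b.upper()) in ALLOWED_PAIRS

print(can_pair("G", "U"))  # True  (wobble)
print(can_pair("A", "C"))  # False
```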
55
|
MERGE: A Multi-graph Attentive Representation learning framework integrating Group information from similar patients. Comput Biol Med 2022; 151:106245. PMID: 36335809. DOI: 10.1016/j.compbiomed.2022.106245.
Abstract
It is an important research task in the field of medical big data to predict a patient's future health status from historical temporal Electronic Health Records (EHRs). Most existing deep learning-based medical prediction methods focus only on the patient's individual information. However, due to the sparseness and low quality of EHR data, the individual clinical records of a single patient often cannot provide complete health information, which severely limits the accuracy of prediction models. In this paper, we propose a Multi-graph attEntive Representation learning framework integrating Group information from similar patiEnts (MERGE) for medical prediction. In this framework, while capturing the individual patient's temporal characteristics through the individual representation learning module, the group representation learning module is used to learn group representations of similar patients from different aspects as a supplement, thereby effectively improving the accuracy of patient representations. We evaluate our method on the MIMIC-III dataset for the task of in-hospital mortality prediction and on the Xiangya dataset for cardiovascular disease (CVD) prediction. The experimental results show that MERGE outperforms the state-of-the-art methods.
56
|
Thanapalasingam T, van Berkel L, Bloem P, Groth P. Relational graph convolutional networks: a closer look. PeerJ Comput Sci 2022; 8:e1073. PMID: 36426239. PMCID: PMC9680895. DOI: 10.7717/peerj-cs.1073.
Abstract
In this article, we describe a reproduction of the Relational Graph Convolutional Network (RGCN). Using our reproduction, we explain the intuition behind the model. Our reproduction results empirically validate the correctness of our implementations using benchmark Knowledge Graph datasets on node classification and link prediction tasks. Our explanation provides a friendly understanding of the different components of the RGCN for both users and researchers extending the RGCN approach. Furthermore, we introduce two new configurations of the RGCN that are more parameter efficient. The code and datasets are available at https://github.com/thiviyanT/torch-rgcn.
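The RGCN propagation rule — a degree-normalized, per-relation transform of neighbor features plus a self-loop term — can be sketched in a few lines of numpy (a single layer without basis decomposition; shapes and data below are toy values, not from the paper's benchmarks).

```python
import numpy as np

def rgcn_layer(H, adj_per_relation, W_rel, W_self):
    """One RGCN layer: per-relation neighbor aggregation plus a self-loop.

    H: (n, d_in) node features.
    adj_per_relation: list of (n, n) adjacency matrices, one per relation.
    W_rel: list of (d_in, d_out) weights, one per relation.
    W_self: (d_in, d_out) self-loop weight.
    """
    out = H @ W_self
    for A, W in zip(adj_per_relation, W_rel):
        deg = A.sum(axis=1, keepdims=True)  # per-node, per-relation degree
        norm = np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)
        out += (norm @ H) @ W               # mean over relation-r neighbors
    return np.maximum(out, 0.0)             # ReLU

rng = np.random.default_rng(1)
n, d_in, d_out, n_rel = 5, 4, 3, 2
H = rng.normal(size=(n, d_in))
adjs = [(rng.random((n, n)) < 0.4).astype(float) for _ in range(n_rel)]
Ws = [rng.normal(size=(d_in, d_out)) for _ in range(n_rel)]
W0 = rng.normal(size=(d_in, d_out))
print(rgcn_layer(H, adjs, Ws, W0).shape)  # (5, 3)
```

The parameter-efficient configurations the article introduces modify how the per-relation weights `W_rel` are parameterized; the propagation itself stays as above.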
57
|
Lee CE, Park H, Shin YG, Chung M. Voxel-wise adversarial semi-supervised learning for medical image segmentation. Comput Biol Med 2022; 150:106152. PMID: 36208595. DOI: 10.1016/j.compbiomed.2022.106152.
Abstract
BACKGROUND AND OBJECTIVE Semi-supervised learning for medical image segmentation is an important area of research for alleviating the huge cost associated with the construction of reliable large-scale annotations in the medical domain. Recent semi-supervised approaches have demonstrated promising results by employing consistency regularization, pseudo-labeling techniques, and adversarial learning. These methods primarily attempt to learn the distribution of labeled and unlabeled data by enforcing consistency in the predictions or embedding context. However, previous approaches have focused only on local discrepancy minimization or context relations across single classes. METHODS In this paper, we introduce a novel adversarial learning-based semi-supervised segmentation method that effectively embeds both local and global features from multiple hidden layers and learns context relations between multiple classes. Our voxel-wise adversarial learning method utilizes a voxel-wise feature discriminator, which takes multilayer voxel-wise features (involving both local and global features) as input by embedding class-specific voxel-wise feature distributions. Furthermore, our previous representation learning method is improved by overcoming its information loss and learning stability problems, which enables rich representations of labeled data. RESULTS In the experiments, we used the Left Atrial Segmentation Challenge dataset and the Abdominal Multi-Organ dataset to prove the effectiveness of our method in both single-class and multiclass segmentation. The experimental results demonstrate that our method outperforms the current best-performing state-of-the-art semi-supervised learning approaches. Our proposed adversarial learning-based semi-supervised segmentation method successfully leveraged unlabeled data to improve network performance by 2% in Dice score on the multi-organ dataset. CONCLUSION We evaluated our approach on a wide range of medical datasets and showed that our method can be adapted to embed class-specific features. Furthermore, visual interpretation of the feature space demonstrates that our proposed method enables a well-distributed and separated feature space for both labeled and unlabeled data, which improves the overall prediction results.
58
|
Gutman Music M, Holur P, Bulkeley K. Mapping dreams in a computational space: A phrase-level model for analyzing Fight/Flight and other typical situations in dream reports. Conscious Cogn 2022; 106:103428. PMID: 36341867. DOI: 10.1016/j.concog.2022.103428.
Abstract
This article demonstrates that an automated system of linguistic analysis, the Oneirograph, can be developed to analyze large collections of dreams and computationally map their contents in terms of typical situations involving an interplay of characters, activities, and settings. Focusing the analysis first on the twin situations of fighting and fleeing, the results provide densely detailed empirical evidence of the underlying semantic structures of typical dreams. The results also indicate that the Oneirograph analytic system can be applied to other typical dream situations as well (e.g., flying, falling), each of which can be computationally mapped in terms of a distinctive constellation of characters, activities, and settings.
59
|
Chu J, Zhang Y, Huang F, Si L, Huang S, Huang Z. Disentangled representation for sequential treatment effect estimation. Comput Methods Programs Biomed 2022; 226:107175. PMID: 36242866. DOI: 10.1016/j.cmpb.2022.107175.
Abstract
BACKGROUND AND OBJECTIVE Treatment effect estimation, a fundamental problem in causal inference, focuses on estimating the difference in outcomes between different treatments. However, in clinical observational data, some patient covariates (such as gender and age) affect not only the outcomes but also the treatment assignment. Such covariates, named confounders, produce distribution discrepancies between treatment groups, thereby introducing selection bias into the estimation of treatment effects. The situation is even more complicated in longitudinal data, because the confounders are time-varying: they depend on patient history and in turn affect future outcomes and treatment assignments. Existing methods mainly work on cross-sectional data obtained at a specific time point and cannot handle the time-varying confounders hidden in longitudinal data. METHODS In this study, we address this problem for the first time by disentangled representation learning, which treats the observational data as consisting of three components: outcome-specific factors, treatment-specific factors, and time-varying confounders. Based on this, the proposed approach adopts a recurrent neural network-based framework to process sequential information, learns the disentangled representations of the components from longitudinal observational sequences, and captures the posterior distributions of the latent factors via a multi-task learning strategy. Moreover, mutual information-based regularization is adopted to eliminate the time-varying confounders. In this way, the association between patient history and treatment assignment is removed and the estimation can be conducted effectively. RESULTS We evaluate our model in a realistic set-up using a model of tumor growth. The proposed model achieves the best performance over benchmark models for both one-step-ahead prediction (0.70% vs 0.74% for the state-of-the-art model when γ = 3; measured by normalized root mean square error, lower is better) and five-step-ahead prediction (1.47% vs 1.83%) in most cases. As the effect of confounders increases, our proposed model consistently shows superiority over the state-of-the-art model. In addition, we adopted t-SNE to visualize the disentangled representations and present the effectiveness of disentanglement explicitly and intuitively. CONCLUSIONS The experimental results indicate the powerful capacity of our model in learning disentangled representations from longitudinal observational data and dealing with time-varying confounders, and demonstrate its superior performance on dynamic treatment effect estimation.
60
|
Fu Z, Jiao J, Yasrab R, Drukker L, Papageorghiou AT, Noble JA. Anatomy-Aware Contrastive Representation Learning for Fetal Ultrasound. Computer Vision - ECCV 2022 (European Conference on Computer Vision Proceedings) 2022:422-436. PMID: 37250853. PMCID: PMC7614575. DOI: 10.1007/978-3-031-25066-8_23.
Abstract
Self-supervised contrastive representation learning offers the advantage of learning meaningful visual representations from unlabeled medical datasets for transfer learning. However, applying current contrastive learning approaches to medical data without considering its domain-specific anatomical characteristics may lead to visual representations that are inconsistent in appearance and semantics. In this paper, we propose to improve visual representations of medical images via anatomy-aware contrastive learning (AWCL), which incorporates anatomy information to augment the positive/negative pair sampling in a contrastive learning manner. The proposed approach is demonstrated on automated fetal ultrasound imaging tasks, enabling positive pairs from the same or different ultrasound scans that are anatomically similar to be pulled together, thus improving the representation learning. We empirically investigate the effect of including anatomy information at coarse and fine granularity in contrastive learning, and find that learning with fine-grained anatomy information, which preserves intra-class differences, is more effective than its counterpart. We also analyze the impact of the anatomy ratio on our AWCL framework and find that using more distinct but anatomically similar samples to compose positive pairs results in better-quality representations. Extensive experiments on a large-scale fetal ultrasound dataset demonstrate that our approach is effective for learning representations that transfer well to three clinical downstream tasks, and achieves superior performance compared to ImageNet-supervised and current state-of-the-art contrastive learning methods. In particular, AWCL outperforms the ImageNet-supervised method by 13.8% and the state-of-the-art contrastive-based method by 7.1% on a cross-domain segmentation task. The code is available at https://github.com/JianboJiao/AWCL.
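The sampling idea — treating anatomically similar images, even from different scans, as positives — can be sketched as a supervised-contrastive-style loss in numpy. This is an illustrative sketch, not the paper's implementation; `labels` here stands in for coarse anatomy classes.

```python
import numpy as np

def anatomy_contrastive_loss(Z, labels, tau=0.1):
    """Supervised-contrastive-style loss where all same-anatomy
    samples in the batch count as positives.

    Z: (n, d) L2-normalized embeddings; labels: (n,) anatomy classes.
    """
    n = Z.shape[0]
    sim = Z @ Z.T / tau
    np.fill_diagonal(sim, -np.inf)            # exclude self-similarity
    # Log-softmax over each row (the contrastive denominator).
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)
    loss = 0.0
    for i in range(n):
        pos = same[i]
        if pos.any():                          # average over all positives
            loss -= logp[i, pos].mean()
    return loss / n

rng = np.random.default_rng(2)
Z = rng.normal(size=(8, 16))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)
labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])   # toy anatomy classes
print(float(anatomy_contrastive_loss(Z, labels)) > 0)  # True
```

Minimizing such a loss pulls same-anatomy embeddings together across scans while pushing different-anatomy embeddings apart.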
61
|
Zhao C, Wang H, Qi W, Liu S. Toward drug-miRNA resistance association prediction by positional encoding graph neural network and multi-channel neural network. Methods 2022; 207:81-89. PMID: 36167292. DOI: 10.1016/j.ymeth.2022.09.005.
Abstract
Drug discovery is a costly and time-consuming process, and most drugs exert therapeutic efficacy by targeting specific proteins. However, a large number of proteins are not targeted by any drug. Recently, miRNA-based therapeutics have become increasingly important, since miRNAs can regulate the expression of specific genes and affect a variety of human diseases. Therefore, it is of great significance to study the associations between miRNAs and drugs to enable drug discovery and disease treatment. In this work, we propose a novel method named DMR-PEG, which facilitates drug-miRNA resistance association (DMRA) prediction by leveraging a positional encoding graph neural network with layer attention (LAPEG) and a multi-channel neural network (MNN). LAPEG considers both the potential information in the miRNA-drug resistance heterogeneous network and the specific characteristics of entities (i.e., drugs and miRNAs) to learn favorable representations of drugs and miRNAs. MNN models various sophisticated relations and synthesizes the predictions from different perspectives effectively. In comprehensive experiments, DMR-PEG achieves an area under the precision-recall curve (AUPR) score of 0.2793 and an area under the receiver-operating characteristic curve (AUC) score of 0.9475, outperforming state-of-the-art methods. Further experimental results show that our proposed method has good robustness and stability. The ablation study demonstrates that each component in DMR-PEG is essential for drug-miRNA resistance association prediction. A real-world case study shows that DMR-PEG is promising for DMRA inference.
62
|
Fashi PA, Hemati S, Babaie M, Gonzalez R, Tizhoosh H. A self-supervised contrastive learning approach for whole slide image representation in digital pathology. J Pathol Inform 2022; 13:100133. PMID: 36605114. PMCID: PMC9808093. DOI: 10.1016/j.jpi.2022.100133.
Abstract
Image analysis in digital pathology has proven to be one of the most challenging fields in medical imaging for AI-driven classification and search tasks. Due to their gigapixel dimensions, whole slide images (WSIs) are difficult to represent for computational pathology. Self-supervised learning (SSL) has recently demonstrated excellent performance in learning effective representations on pretext objectives, which may improve the generalization of downstream tasks. Previous self-supervised representation methods rely on patch selection and classification, such that the effect of SSL on end-to-end WSI representation has not been investigated. In contrast to existing augmentation-based SSL methods, this paper proposes a novel self-supervised learning scheme based on the available primary-site information. We also design a fully supervised contrastive learning setup to increase the robustness of the representations for WSI classification and search in both pretext and downstream tasks. We trained and evaluated the model on more than 6000 WSIs from The Cancer Genome Atlas (TCGA) repository provided by the National Cancer Institute. The proposed architecture achieved excellent results on most primary sites and cancer subtypes. We also achieved the best validation result on a lung cancer classification task.
63
|
Kouhsar M, Kashaninia E, Mardani B, Rabiee HR. CircWalk: a novel approach to predict CircRNA-disease association based on heterogeneous network representation learning. BMC Bioinformatics 2022; 23:331. PMID: 35953785. PMCID: PMC9367077. DOI: 10.1186/s12859-022-04883-9.
Abstract
Background Several types of RNA in the cell are usually involved in biological processes with multiple functions. Coding RNAs code for proteins, while non-coding RNAs regulate gene expression. Some single-strand RNAs can adopt a circular shape via the back-splicing process, converting into a new type called circular RNA (circRNA). circRNAs are among the essential non-coding RNAs in the cell and are involved in multiple disorders. One of the critical functions of circRNAs is to regulate the expression of other genes through sponging micro RNAs (miRNAs) in diseases. This mechanism, known as the competing endogenous RNA (ceRNA) hypothesis, together with additional information obtained from biological datasets, can be used by computational approaches to predict novel associations between diseases and circRNAs.
Results We applied multiple classifiers to validate the extracted features from the heterogeneous network and selected the most appropriate one based on several evaluation criteria. Then, XGBoost is utilized in our pipeline to generate a novel approach, called CircWalk, to predict CircRNA-Disease associations. Our results demonstrate that CircWalk has reasonable accuracy and AUC compared with other state-of-the-art algorithms. We also use CircWalk to predict novel circRNAs associated with lung, gastric, and colorectal cancers as a case study. The results show that our approach can accurately detect novel circRNAs related to these diseases. Conclusions Considering the ceRNA hypothesis, we integrate multiple resources to construct a heterogeneous network from circRNAs, mRNAs, miRNAs, and diseases. Next, the DeepWalk algorithm is applied to the network to extract feature vectors for circRNAs and diseases. The extracted features are used to learn a classifier and generate a model to predict novel CircRNA-Disease associations. Our approach uses the concept of the ceRNA hypothesis and the miRNA sponge effect of circRNAs to predict their associations with diseases. Our results show that this outlook can help identify CircRNA-Disease associations more accurately. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04883-9.
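The first stage of the described pipeline — DeepWalk-style truncated random walks over the heterogeneous network, whose node sequences are later fed to a skip-gram model — can be sketched in plain Python. The toy graph and node names below are illustrative, not from the paper's data.

```python
import random

def random_walks(adj, walks_per_node=10, walk_len=5, seed=0):
    """Generate DeepWalk-style truncated random walks over a graph
    given as an adjacency dict {node: [neighbors]}."""
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len:
                nbrs = adj[walk[-1]]
                if not nbrs:          # dead end: truncate the walk
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

# Toy heterogeneous network mixing circRNA / miRNA / disease nodes.
adj = {
    "circRNA_A": ["miR_21", "disease_X"],
    "miR_21": ["circRNA_A", "disease_X"],
    "disease_X": ["circRNA_A", "miR_21"],
}
walks = random_walks(adj)
print(len(walks))  # 30 walks: 10 per node
```

In the full method, these walk sequences train skip-gram embeddings, which are then paired per (circRNA, disease) and fed to the XGBoost classifier.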
64
|
Masset P, Qin S, Zavatone-Veth JA. Drifting neuronal representations: Bug or feature? Biol Cybern 2022; 116:253-266. PMID: 34993613. DOI: 10.1007/s00422-021-00916-3.
Abstract
The brain displays a remarkable ability to sustain stable memories, allowing animals to execute precise behaviors or recall stimulus associations years after they were first learned. Yet, recent long-term recording experiments have revealed that single-neuron representations continuously change over time, contravening the classical assumption that learned features remain static. How do unstable neural codes support robust perception, memories, and actions? Here, we review recent experimental evidence for such representational drift across brain areas, as well as dissections of its functional characteristics and underlying mechanisms. We emphasize theoretical proposals for how drift need not only be a form of noise for which the brain must compensate. Rather, it can emerge from computationally beneficial mechanisms in hierarchical networks performing robust probabilistic computations.
65
|
Das R, Kaur K, Walia E. Feature Generalization for Breast Cancer Detection in Histopathological Images. Interdiscip Sci 2022; 14:566-581. PMID: 35482216. DOI: 10.1007/s12539-022-00515-1.
Abstract
Recent period has witnessed benchmarked performance of transfer learning using deep architectures in computer-aided diagnosis (CAD) of breast cancer. In this perspective, the pre-trained neural network needs to be fine-tuned with relevant data to extract useful features from the dataset. However, in addition to the computational overhead, it suffers the curse of overfitting in case of feature extraction from smaller datasets. Handcrafted feature extraction techniques as well as feature extraction using pre-trained deep networks come into rescue in aforementioned situation and have proved to be much more efficient and lightweight compared to deep architecture-based transfer learning techniques. This research has identified the competence of classifying breast cancer images using feature engineering and representation learning over the established and contemporary notion of using transfer learning techniques. Moreover, it has revealed superior feature learning capacity with feature fusion in contrast to the conventional belief of understanding unknown feature patterns better with representation learning alone. Experiments have been conducted on two different and popular breast cancer image datasets, namely, KIMIA Path960 and BreakHis datasets. A comparison of image-level accuracy is performed on these datasets using the above-mentioned feature extraction techniques. Image level accuracy of 97.81% is achieved for KIMIA Path960 dataset using individual features extracted with handcrafted (color histogram) technique. Fusion of uniform Local Binary Pattern (uLBP) and color histogram features has resulted in 99.17% of highest accuracy for the same dataset. Experimentation with BreakHis dataset has resulted in highest classification accuracy of 88.41% with color histogram features for images with 200X magnification factor. 
Finally, the results are contrasted with the state of the art, and superior performance is observed on many occasions with the proposed fusion-based techniques. On the BreakHis dataset, the highest accuracies of 87.60% (with the least standard deviation) and 85.77% are recorded for the 200X and 400X magnification factors, respectively, and the results for these magnification factors exceed the state of the art.
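To make the fusion pipeline described above concrete, here is a minimal sketch on synthetic data. The `lbp_histogram` helper is a simplified 3x3 LBP, not the rotation-invariant uniform LBP the paper uses, and the histogram sizes are illustrative assumptions; the point is only the concatenation of two normalized descriptors into one fused feature vector.

```python
import numpy as np

def color_histogram(img, bins=8):
    # Per-channel intensity histogram, L1-normalized and concatenated.
    feats = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
             for c in range(img.shape[-1])]
    h = np.concatenate(feats).astype(float)
    return h / h.sum()

def lbp_histogram(gray):
    # Basic 3x3 local binary pattern histogram (a simplified stand-in for
    # the uniform LBP variant used in the paper).
    c = gray[1:-1, 1:-1]
    code = np.zeros(c.shape, dtype=np.int32)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.int32) << bit
    h = np.bincount(code.ravel(), minlength=256).astype(float)
    return h / h.sum()

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
gray = img.mean(axis=-1)

# Feature fusion: concatenate the two normalized descriptors; the fused
# vector would then be fed to any off-the-shelf classifier.
fused = np.concatenate([color_histogram(img), lbp_histogram(gray)])
print(fused.shape)
```

The fused vector here has 24 color-histogram bins plus 256 LBP bins; the paper's exact bin counts may differ.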
|
66
|
Piaggesi S, Panisson A. Time-varying graph representation learning via higher-order skip-gram with negative sampling. EPJ DATA SCIENCE 2022; 11:33. [PMID: 35668814 PMCID: PMC9143726 DOI: 10.1140/epjds/s13688-022-00344-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/11/2021] [Accepted: 05/09/2022] [Indexed: 06/15/2023]
Abstract
Representation learning models for graphs are a successful family of techniques that project nodes into feature spaces that can be exploited by other machine learning algorithms. Since many real-world networks are inherently dynamic, with interactions among nodes changing over time, these techniques can be defined both for static and for time-varying graphs. Here, we show how the skip-gram embedding approach can be generalized to perform implicit tensor factorization on different tensor representations of time-varying graphs. We show that higher-order skip-gram with negative sampling (HOSGNS) is able to disentangle the role of nodes and time, with a small fraction of the number of parameters needed by other approaches. We empirically evaluate our approach using time-resolved face-to-face proximity data, showing that the learned representations outperform state-of-the-art methods when used to solve downstream tasks such as network reconstruction. Good performance on predicting the outcome of dynamical processes such as disease spreading shows the potential of this method to estimate contagion risk, providing early risk awareness based on contact tracing data. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1140/epjds/s13688-022-00344-8.
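The core idea above — generalizing skip-gram with negative sampling from a matrix to a higher-order tensor — can be sketched with one embedding table per tensor mode. This toy sketch (synthetic triples, a single negative per step, a rank-1 CP-style score) is an assumption-laden illustration of the objective, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_times, dim = 6, 4, 5
# One embedding matrix per tensor mode: center node, context node, time.
U = rng.normal(scale=0.1, size=(n_nodes, dim))
V = rng.normal(scale=0.1, size=(n_nodes, dim))
T = rng.normal(scale=0.1, size=(n_times, dim))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def score(i, j, k):
    # Higher-order analogue of the dot product: sum over the elementwise
    # product of one embedding per tensor mode.
    return float(np.sum(U[i] * V[j] * T[k]))

def sgns_step(pos, negs, lr=0.1):
    # One SGD step on the SGNS loss:
    # -log sigmoid(s_pos) - sum over negatives of log sigmoid(-s_neg)
    for (i, j, k), label in [(pos, 1.0)] + [(t, 0.0) for t in negs]:
        g = sigmoid(score(i, j, k)) - label      # d(loss)/d(score)
        gU, gV, gT = g * V[j] * T[k], g * U[i] * T[k], g * U[i] * V[j]
        U[i] -= lr * gU
        V[j] -= lr * gV
        T[k] -= lr * gT

before = score(0, 1, 2)
for _ in range(200):
    # One positive (node, context, time) triple plus one random negative.
    sgns_step(pos=(0, 1, 2),
              negs=[(0, int(rng.integers(n_nodes)), int(rng.integers(n_times)))])
after = score(0, 1, 2)
print(round(before, 3), "->", round(after, 3))
```

After training, the score of an observed (node, context, time) triple rises, which is the implicit tensor-factorization view of the objective.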
|
67
|
Gao F, Zhang W, Baccarelli AA, Shen Y. Predicting chemical ecotoxicity by learning latent space chemical representations. ENVIRONMENT INTERNATIONAL 2022; 163:107224. [PMID: 35395577 PMCID: PMC9044254 DOI: 10.1016/j.envint.2022.107224] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/04/2022] [Revised: 03/29/2022] [Accepted: 03/30/2022] [Indexed: 05/31/2023]
Abstract
In silico prediction of chemical ecotoxicity (HC50) represents an important complement to improve in vivo and in vitro toxicological assessment of manufactured chemicals. Recent application of machine learning models to predict chemical HC50 yields variable prediction performance that depends on effectively learning chemical representations from high-dimension data. To improve HC50 prediction performance, we developed an autoencoder model by learning latent space chemical embeddings. This novel approach achieved state-of-the-art prediction performance of HC50 with R2 of 0.668 ± 0.003 and mean absolute error (MAE) of 0.572 ± 0.001, and outperformed other dimension reduction methods including principal component analysis (PCA) (R2 = 0.601 ± 0.031 and MAE = 0.629 ± 0.005), kernel PCA (R2 = 0.631 ± 0.008 and MAE = 0.625 ± 0.006), and uniform manifold approximation and projection dimensionality reduction (R2 = 0.400 ± 0.008 and MAE = 0.801 ± 0.002). A simple linear layer with chemical embeddings learned from the autoencoder model performed better than random forest (R2 = 0.663 ± 0.007 and MAE = 0.591 ± 0.008), fully connected neural network (R2 = 0.614 ± 0.016 and MAE = 0.610 ± 0.008), least absolute shrinkage and selection operator (R2 = 0.617 ± 0.037 and MAE = 0.619 ± 0.007), and ridge regression (R2 = 0.638 ± 0.007 and MAE = 0.613 ± 0.005) using unlearned raw input features. Our results highlighted the usefulness of learning latent chemical representations, and our autoencoder model provides an alternative approach for robust HC50 prediction.
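The pipeline above — learn latent embeddings with an autoencoder, then fit a simple linear layer on them — can be sketched as follows. This uses a tied-weight linear autoencoder on synthetic descriptor data with a made-up target standing in for HC50; the real model is a nonlinear deep autoencoder, so treat this purely as an illustration of the two-stage design:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 20, 4
# Synthetic descriptors lying near a k-dimensional subspace, with a
# continuous target (a stand-in for HC50) driven by the latent factors.
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, d)) + 0.05 * rng.normal(size=(n, d))
y = Z[:, 0] - 2.0 * Z[:, 1]

# Tied-weight linear autoencoder: encode H = X W, decode X_hat = H W^T,
# trained by gradient descent on the reconstruction error.
W = rng.normal(scale=0.1, size=(d, k))

def recon_error(W):
    return float(np.linalg.norm(X @ W @ W.T - X))

err_before = recon_error(W)
lr = 1e-3
for _ in range(500):
    R = X @ W @ W.T - X
    # Gradient of ||X W W^T - X||^2 w.r.t. W (through both occurrences of W).
    W -= lr * 2.0 * (X.T @ R @ W + R.T @ X @ W) / n
err_after = recon_error(W)

# Simple linear read-out on the learned latent codes (least squares).
H = np.c_[X @ W, np.ones(n)]
coef, *_ = np.linalg.lstsq(H, y, rcond=None)
pred = H @ coef
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(err_before, 2), "->", round(err_after, 2), "R2:", round(r2, 3))
```

The linear read-out on the learned codes mirrors the paper's finding that a simple layer on good embeddings can compete with heavier models on raw features.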
|
68
|
Subakti A, Murfi H, Hariadi N. The performance of BERT as data representation of text clustering. JOURNAL OF BIG DATA 2022; 9:15. [PMID: 35194542 PMCID: PMC8848302 DOI: 10.1186/s40537-022-00564-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/25/2021] [Accepted: 01/13/2022] [Indexed: 05/12/2023]
Abstract
Text clustering is the task of grouping a set of texts so that texts in the same group are more similar to one another than to texts in other groups. Grouping text manually requires a significant amount of time and labor, so automation using machine learning is necessary. One of the most frequently used methods of representing textual data is Term Frequency-Inverse Document Frequency (TF-IDF). However, TF-IDF cannot take into account the position and context of a word in a sentence. The Bidirectional Encoder Representations from Transformers (BERT) model can produce text representations that incorporate the position and context of a word in a sentence. This research analyzes the performance of the BERT model as a data representation for text. Moreover, various feature extraction and normalization methods are also applied to the BERT representations. To examine the performance of BERT, we use four clustering algorithms: k-means clustering, eigenspace-based fuzzy c-means, deep embedded clustering, and improved deep embedded clustering. Our simulations show that BERT outperforms the TF-IDF method on 28 out of 36 metrics. Furthermore, different feature extraction and normalization methods produce varied performance, so their choice must be adapted to the text clustering algorithm used.
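The TF-IDF baseline plus k-means setup that the paper compares BERT against can be sketched in a few lines. The toy corpus is invented, and the BERT side is deliberately omitted (in practice one would swap the `tfidf` matrix for sentence embeddings from a BERT encoder); this only illustrates the representation-then-cluster pipeline:

```python
import numpy as np

docs = ["the cat sat on the mat",
        "a cat and a dog",
        "stock markets fell sharply",
        "markets rallied as stocks rose"]

# --- TF-IDF: the bag-of-words baseline the paper compares against ---
vocab = sorted({w for d in docs for w in d.split()})
idx = {w: i for i, w in enumerate(vocab)}
tf = np.zeros((len(docs), len(vocab)))
for r, d in enumerate(docs):
    for w in d.split():
        tf[r, idx[w]] += 1
df = (tf > 0).sum(axis=0)                  # document frequency per term
tfidf = tf * np.log(len(docs) / df)        # simple (unsmoothed) IDF weighting

# --- plain k-means over whatever document vectors are supplied ---
def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

labels = kmeans(tfidf, k=2)
print(labels)
```

Replacing `tfidf` with contextual embeddings, and optionally normalizing them first, reproduces the comparison the paper runs at scale.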
|
69
|
Lee CE, Chung M, Shin YG. Voxel-level Siamese Representation Learning for Abdominal Multi-Organ Segmentation. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 213:106547. [PMID: 34839269 DOI: 10.1016/j.cmpb.2021.106547] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/12/2021] [Revised: 10/18/2021] [Accepted: 11/15/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVE Recent works in medical image segmentation have actively explored various deep learning architectures and objective functions to encode high-level features from volumetric data, owing to limited image annotations. However, most existing approaches tend to ignore cross-volume global context and define context relations in the decision space. In this work, we propose a novel voxel-level Siamese representation learning method for abdominal multi-organ segmentation that improves the representation space. METHODS The proposed method enforces voxel-wise feature relations in the representation space, leveraging limited datasets more comprehensively to achieve better performance. Inspired by recent progress in contrastive learning, we encourage voxel-wise features from the same class to be projected to the same point, without using negative samples. Moreover, we introduce a multi-resolution context aggregation method that aggregates features from multiple hidden layers, encoding both the global and local contexts for segmentation. RESULTS In our experiments on a multi-organ dataset, the method outperformed existing approaches by 2% in Dice similarity coefficient. Qualitative visualizations of the representation spaces demonstrate that the improvements were gained primarily through a disentangled feature space. CONCLUSION Our new representation learning method successfully encoded high-level features in the representation space using a limited dataset, showing superior accuracy in medical image segmentation compared to other contrastive-loss-based methods. Moreover, our method can be easily applied to other networks without additional parameters at inference.
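The negative-free objective described above — pulling same-class voxel features toward a common point — can be sketched with a cosine loss against per-class prototypes. The data is synthetic and the "update" acts directly on the features rather than on network weights, so this is only an assumption-level illustration of the loss, not the paper's training procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels, dim = 300, 16
feats = rng.normal(size=(n_voxels, dim))   # voxel-level features
labels = rng.integers(3, size=n_voxels)    # organ class per voxel

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def intra_class_loss(feats, labels):
    # Negative-free objective: 1 - cosine similarity between each voxel
    # feature and its class prototype (mean direction), averaged.
    z = normalize(feats)
    total = 0.0
    for c in np.unique(labels):
        proto = normalize(z[labels == c].mean(axis=0))
        total += np.sum(1.0 - z[labels == c] @ proto)
    return total / len(feats)

before = intra_class_loss(feats, labels)
# Heuristic update moving features toward their class prototype
# (a stand-in for backpropagating the loss into network weights).
for _ in range(100):
    z = normalize(feats)
    for c in np.unique(labels):
        proto = normalize(z[labels == c].mean(axis=0))
        feats[labels == c] += 0.5 * (proto - z[labels == c])
after = intra_class_loss(feats, labels)
print(round(before, 3), "->", round(after, 3))
```

Because no negatives are used, nothing pushes different classes apart here; in the paper that separation emerges from the supervised segmentation loss trained jointly.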
|
70
|
Deep generative modeling for protein design. Curr Opin Struct Biol 2021; 72:226-236. [PMID: 34963082 DOI: 10.1016/j.sbi.2021.11.008] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2021] [Revised: 11/01/2021] [Accepted: 11/22/2021] [Indexed: 11/21/2022]
Abstract
Deep learning approaches have produced substantial breakthroughs in fields such as image classification and natural language processing and are making rapid inroads in the area of protein design. Many generative models of proteins have been developed that encompass all known protein sequences, model specific protein families, or extrapolate the dynamics of individual proteins. Those generative models can learn protein representations that are often more informative of protein structure and function than hand-engineered features. Furthermore, they can be used to quickly propose millions of novel proteins that resemble the native counterparts in terms of expression level, stability, or other attributes. The protein design process can further be guided by discriminative oracles to select candidates with the highest probability of having the desired properties. In this review, we discuss five classes of generative models that have been most successful at modeling proteins and provide a framework for model guided protein design.
|
71
|
Imitation and mirror systems in robots through Deep Modality Blending Networks. Neural Netw 2021; 146:22-35. [PMID: 34839090 DOI: 10.1016/j.neunet.2021.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2021] [Revised: 09/29/2021] [Accepted: 11/04/2021] [Indexed: 11/23/2022]
Abstract
Learning to interact with the environment not only empowers the agent with manipulation capability but also generates information to facilitate building of action understanding and imitation capabilities. This seems to be a strategy adopted by biological systems, in particular primates, as evidenced by the existence of mirror neurons that seem to be involved in multi-modal action understanding. How to benefit from the interaction experience of the robots to enable understanding actions and goals of other agents is still a challenging question. In this study, we propose a novel method, deep modality blending networks (DMBN), that creates a common latent space from multi-modal experience of a robot by blending multi-modal signals with a stochastic weighting mechanism. We show for the first time that deep learning, when combined with a novel modality blending scheme, can facilitate action recognition and produce structures to sustain anatomical and effect-based imitation capabilities. Our proposed system, which is based on conditional neural processes, can be conditioned on any desired sensory/motor value at any time step, and can generate a complete multi-modal trajectory consistent with the desired conditioning in one-shot by querying the network for all the sampled time points in parallel avoiding the accumulation of prediction errors. Based on simulation experiments with an arm-gripper robot and an RGB camera, we showed that DMBN could make accurate predictions about any missing modality (camera or joint angles) given the available ones outperforming recent multimodal variational autoencoder models in terms of long-horizon high-dimensional trajectory predictions. We further showed that given desired images from different perspectives, i.e. images generated by the observation of other robots placed on different sides of the table, our system could generate image and joint angle sequences that correspond to either anatomical or effect-based imitation behavior. 
To achieve this mirror-like behavior, our system does not perform pixel-based template matching but instead relies on the common latent space constructed using both joint and image modalities, as shown by additional experiments. Moreover, we showed that mirror learning in our system does not depend on visual experience alone and cannot be achieved without proprioceptive experience. Our experiments showed that, out of ten training scenarios with different initial configurations, the proposed DMBN model achieved mirror learning in all cases, whereas the model using only visual information failed in half of them. Overall, the proposed DMBN architecture not only serves as a computational model for sustaining mirror neuron-like capabilities, but also stands as a powerful machine learning architecture for high-dimensional multi-modal temporal data, with robust retrieval capabilities operating from partial information in one or multiple modalities.
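The central mechanism — blending modality-specific latent codes with a stochastic convex weight so that either modality alone can reconstruct the other — can be sketched with linear generators and encoders. Everything here (linear maps, dimensions, exact pseudo-inverse encoders) is an idealized assumption; DMBN uses learned deep encoders and conditional neural processes:

```python
import numpy as np

rng = np.random.default_rng(0)
latent, d_img, d_joint, n = 5, 12, 7, 50
s = rng.normal(size=(n, latent))        # shared latent trajectory
A = rng.normal(size=(latent, d_img))    # "camera image" generator
B = rng.normal(size=(latent, d_joint))  # "joint angle" generator
x_img, x_joint = s @ A, s @ B           # paired multi-modal observations

# Per-modality linear encoders (least-squares inverses of the generators).
enc_img, enc_joint = np.linalg.pinv(A), np.linalg.pinv(B)

def blend(z_img, z_joint, w):
    # Stochastic modality blending: a convex combination of the two
    # modality-specific latent codes (w is drawn at random during training).
    return w * z_img + (1 - w) * z_joint

z = blend(x_img @ enc_img, x_joint @ enc_joint, w=rng.uniform())
joint_pred = z @ B                      # decode joints from the blended code
img_pred_from_joints = (x_joint @ enc_joint) @ A   # condition on joints only
print(np.allclose(joint_pred, x_joint), np.allclose(img_pred_from_joints, x_img))
```

Because both encoders map into the same latent space, any blending weight, including conditioning on a single modality (w = 0 or 1), still supports decoding every modality, which is the property the paper exploits for imitation.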
|
72
|
Multi-layer information fusion based on graph convolutional network for knowledge-driven herb recommendation. Neural Netw 2021; 146:1-10. [PMID: 34826774 DOI: 10.1016/j.neunet.2021.11.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Revised: 10/16/2021] [Accepted: 11/09/2021] [Indexed: 11/24/2022]
Abstract
The prescriptions of Traditional Chinese Medicine (TCM) are a precious treasure accumulated over the long-term development of TCM. Artificial intelligence (AI) technology is used to build herb recommendation models that deeply understand the regularities in prescriptions, which is of great significance for the clinical application of TCM and the discovery of new prescriptions. Most herb recommendation models constructed in the past ignored the nature information of herbs, and most used statistical bag-of-words models, which make it difficult for a model to perceive the complex correlations between symptoms and herbs. In this paper, we introduce the properties of herbs as additional auxiliary information by constructing a herb knowledge graph, and we propose a graph convolution model with multi-layer information fusion to obtain symptom and herb feature representations with rich information and less noise. We apply the proposed model to a TCM prescription dataset, and the experimental results show that our model outperforms the baseline models by 6.2% in Precision@5, 16.0% in Recall@5, and 12.0% in F1-Score@5.
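The building block of such a model, one symmetric-normalized graph convolution layer plus fusion of representations from successive layers, can be sketched on a toy graph. The tiny adjacency matrix and feature sizes are invented; only the propagation rule and the layer-concatenation idea come from the standard GCN formulation the abstract builds on:

```python
import numpy as np

# Symmetric-normalized graph convolution: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)   # toy symptom/herb graph
A_hat = A + np.eye(4)                        # add self-loops
deg = A_hat.sum(axis=1)
norm = A_hat / np.sqrt(np.outer(deg, deg))   # D^-1/2 (A+I) D^-1/2

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))    # initial node features
W1 = rng.normal(size=(3, 2))   # first-layer weights
W2 = rng.normal(size=(2, 2))   # second-layer weights

H1 = np.maximum(norm @ H @ W1, 0.0)
H2 = np.maximum(norm @ H1 @ W2, 0.0)

# Multi-layer information fusion: concatenate the per-layer representations
# so that both shallow and deep neighborhood information survive.
fused = np.concatenate([H1, H2], axis=1)
print(fused.shape)
```

Each extra layer widens the receptive field by one hop, so fusing layers lets the recommender see both direct symptom-herb links and multi-hop regularities.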
|
73
|
Marin R, Rampini A, Castellani U, Rodolà E, Ovsjanikov M, Melzi S. Spectral Shape Recovery and Analysis Via Data-driven Connections. Int J Comput Vis 2021; 129:2745-2760. [PMID: 34720402 PMCID: PMC8550494 DOI: 10.1007/s11263-021-01492-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2021] [Accepted: 06/15/2021] [Indexed: 11/25/2022]
Abstract
We introduce a novel learning-based method to recover shapes from their Laplacian spectra, based on establishing and exploring connections in a learned latent space. The core of our approach consists in a cycle-consistent module that maps between a learned latent space and sequences of eigenvalues. This module provides an efficient and effective link between the shape geometry, encoded in a latent vector, and its Laplacian spectrum. Our proposed data-driven approach replaces the need for ad-hoc regularizers required by prior methods, while providing more accurate results at a fraction of the computational cost. Moreover, these latent space connections enable novel applications for both analyzing and controlling the spectral properties of deformable shapes, especially in the context of a shape collection. Our learning model and the associated analysis apply without modifications across different dimensions (2D and 3D shapes alike), representations (meshes, contours and point clouds), nature of the latent space (generated by an auto-encoder or a parametric model), as well as across different shape classes, and admits arbitrary resolution of the input spectrum without affecting complexity. The increased flexibility allows us to address notoriously difficult tasks in 3D vision and geometry processing within a unified framework, including shape generation from spectrum, latent space exploration and analysis, mesh super-resolution, shape exploration, style transfer, spectrum estimation for point clouds, segmentation transfer and non-rigid shape matching.
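The "input" side of this method, the Laplacian spectrum of a shape, is easy to make concrete on a graph. Here an 8-node cycle stands in for a discretized shape (a real mesh would use the cotangent Laplacian); the sorted eigenvalues are the isometry-invariant sequence the learned module maps to and from the latent space:

```python
import numpy as np

# Combinatorial graph Laplacian L = D - A of an 8-node cycle, used here as
# a stand-in for a discretized shape.
n = 8
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# The Laplacian spectrum: sorted eigenvalues of the symmetric PSD matrix L.
evals = np.sort(np.linalg.eigvalsh(L))
print(np.round(evals, 4))
```

For the cycle the spectrum is known in closed form (2 - 2cos(2*pi*k/n)), starting at 0 and topping out at 4, which makes it a convenient sanity check before feeding real mesh spectra to a learned model.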
|
74
|
Bramlage L, Cortese A. Generalized attention-weighted reinforcement learning. Neural Netw 2021; 145:10-21. [PMID: 34710787 DOI: 10.1016/j.neunet.2021.09.023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2021] [Revised: 08/19/2021] [Accepted: 09/24/2021] [Indexed: 11/26/2022]
Abstract
In neuroscience, attention has been shown to interact bidirectionally with reinforcement learning (RL) to reduce the dimensionality of task representations, restricting computations to relevant features. In machine learning, despite their popularity, attention mechanisms have seldom been applied to decision-making problems. Here, we leverage a theoretical model from computational neuroscience, the attention-weighted RL (AWRL) model, which defines how humans identify task-relevant features (i.e., those that allow value predictions), to design an applied deep RL paradigm. We formally demonstrate that the conjunction of the self-attention mechanism, widely employed in machine learning, with value function approximation is a general formulation of the AWRL model. To evaluate our agent, we train it on three Atari tasks at different complexity levels, incorporating both task-relevant and irrelevant features. Because the model uses semantic observations, we can uncover not only which features the agent elects to base decisions on, but also how it chooses to compile more complex, relational features from simpler ones. We first show that performance depends in large part on the ability to compile new compound features, rather than mere focus on individual features. In line with neuroscience predictions, self-attention leads to high resiliency to noise (irrelevant features) compared with other benchmark models. Finally, we highlight the importance and separate contributions of bottom-up and top-down attention in the learning process. Together, these results demonstrate the broader validity of the AWRL framework in complex task scenarios and illustrate the benefits of a deeper integration between neuroscience-derived models and RL for decision making in machine learning.
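The formal ingredient above — self-attention over semantic feature tokens feeding a value-function approximator — can be sketched as scaled dot-product attention followed by a linear value read-out. The token count, dimensions, and the flattened linear read-out are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Scaled dot-product self-attention over feature tokens.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)       # row-stochastic attention
    return w @ V, w

rng = np.random.default_rng(0)
n_feats, d, dk = 5, 8, 4
X = rng.normal(size=(n_feats, d))              # semantic feature tokens
Wq, Wk, Wv = (rng.normal(size=(d, dk)) for _ in range(3))
attended, attn = self_attention(X, Wq, Wk, Wv)

# Value function approximation on the attended representation: here a
# plain linear read-out over the flattened attended features.
w_val = rng.normal(size=(n_feats * dk,))
value = float(attended.ravel() @ w_val)
print(attn.shape, round(value, 3))
```

Inspecting the rows of `attn` is what lets this family of agents expose which features (and feature combinations) the value estimate is built from.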
|
75
|
Guha Roy A, Ren J, Azizi S, Loh A, Natarajan V, Mustafa B, Pawlowski N, Freyberg J, Liu Y, Beaver Z, Vo N, Bui P, Winter S, MacWilliams P, Corrado GS, Telang U, Liu Y, Cemgil T, Karthikesalingam A, Lakshminarayanan B, Winkens J. Does your dermatology classifier know what it doesn't know? Detecting the long-tail of unseen conditions. Med Image Anal 2021; 75:102274. [PMID: 34731777 DOI: 10.1016/j.media.2021.102274] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2021] [Revised: 08/12/2021] [Accepted: 10/15/2021] [Indexed: 11/15/2022]
Abstract
Supervised deep learning models have proven to be highly effective in classification of dermatological conditions. These models rely on the availability of abundant labeled training examples. However, in the real-world, many dermatological conditions are individually too infrequent for per-condition classification with supervised learning. Although individually infrequent, these conditions may collectively be common and therefore are clinically significant in aggregate. To prevent models from generating erroneous outputs on such examples, there remains a considerable unmet need for deep learning systems that can better detect such infrequent conditions. These infrequent 'outlier' conditions are seen very rarely (or not at all) during training. In this paper, we frame this task as an out-of-distribution (OOD) detection problem. We set up a benchmark ensuring that outlier conditions are disjoint between the model training, validation, and test sets. Unlike traditional OOD detection benchmarks where the task is to detect dataset distribution shift, we aim at the more challenging task of detecting subtle differences resulting from a different pathology or condition. We propose a novel hierarchical outlier detection (HOD) loss, which assigns multiple abstention classes corresponding to each training outlier class and jointly performs a coarse classification of inliers vs. outliers, along with fine-grained classification of the individual classes. We demonstrate that the proposed HOD loss based approach outperforms leading methods that leverage outlier data during training. Further, performance is significantly boosted by using recent representation learning methods (BiT, SimCLR, MICLe). Further, we explore ensembling strategies for OOD detection and propose a diverse ensemble selection process for the best result. 
We also perform a subgroup analysis over conditions of varying risk levels and different skin types to investigate how OOD performance changes for each subgroup, and we demonstrate the gains of our framework over the baseline. Furthermore, we go beyond traditional performance metrics and introduce a cost matrix for model-trust analysis to approximate downstream clinical impact. We use this cost matrix to compare the proposed method against the baseline, making a stronger case for its effectiveness in real-world scenarios.
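The HOD loss described above can be sketched as a fine-grained cross-entropy over all classes (inliers plus per-outlier abstention classes) combined with a coarse cross-entropy over {inlier, outlier}, where the coarse probabilities aggregate the fine ones. The logit values and class counts are invented; this is a schematic of the loss structure, not the paper's implementation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def hod_loss(logits, n_inlier, fine_target, is_outlier):
    # Hierarchical outlier detection loss (sketch): fine-grained CE over
    # all classes plus coarse CE over {inlier, outlier}, with the coarse
    # probabilities obtained by summing the fine ones.
    p = softmax(logits)
    p_in, p_out = p[:n_inlier].sum(), p[n_inlier:].sum()
    fine_ce = -np.log(p[fine_target] + 1e-12)
    coarse_ce = -np.log((p_out if is_outlier else p_in) + 1e-12)
    return float(fine_ce + coarse_ce)

logits = np.array([2.0, 0.5, -1.0, 0.1, 0.3])  # 3 inlier + 2 abstention classes
loss_in = hod_loss(logits, n_inlier=3, fine_target=0, is_outlier=False)
loss_out = hod_loss(logits, n_inlier=3, fine_target=4, is_outlier=True)

# At test time, the outlier score is the total abstention-class mass.
outlier_score = float(softmax(logits)[3:].sum())
print(round(loss_in, 3), round(loss_out, 3), round(outlier_score, 3))
```

With these inlier-favoring logits the outlier example is penalized more heavily, and the aggregated abstention mass provides the OOD score used for detection.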
|