1
|
Iliadis D, De Baets B, Pahikkala T, Waegeman W. A comparison of embedding aggregation strategies in drug-target interaction prediction. BMC Bioinformatics 2024; 25:59. [PMID: 38321386 PMCID: PMC10845509 DOI: 10.1186/s12859-024-05684-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024] Open
Abstract
The prediction of interactions between novel drugs and biological targets is a vital step in the early stage of the drug discovery pipeline. Many deep learning approaches have been proposed over the last decade, with a substantial fraction of them sharing the same underlying two-branch architecture. Their distinction is limited to the use of different types of feature representations and branches (multi-layer perceptrons, convolutional neural networks, graph neural networks and transformers). In contrast, the strategy used to combine the outputs (embeddings) of the branches has remained mostly the same. The same general architecture has also been used extensively in the area of recommender systems, where the choice of an aggregation strategy is still an open question. In this work, we investigate the effectiveness of three different embedding aggregation strategies in the area of drug-target interaction (DTI) prediction. We formally define these strategies and prove their universal approximator capabilities. We then present experiments that compare the different strategies on benchmark datasets from the area of DTI prediction, showcasing conditions under which specific strategies could be the obvious choice.
Affiliation(s)
- Dimitrios Iliadis
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
- Bernard De Baets
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
- Tapio Pahikkala
  - Department of Computing, University of Turku, 20500, Turku, Finland
- Willem Waegeman
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, 9000, Ghent, Belgium
|
2
|
Drug-target interaction prediction via an ensemble of weighted nearest neighbors with interaction recovery. APPL INTELL 2022. [DOI: 10.1007/s10489-021-02495-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 10/20/2022]
|
3
|
Fitzgerald JM, Webb EK, Weis CN, Huggins AA, Bennett KP, Miskovich TA, Krukowski JL, deRoon-Cassini TA, Larson CL. Hippocampal Resting-State Functional Connectivity Forecasts Individual Posttraumatic Stress Disorder Symptoms: A Data-Driven Approach. BIOLOGICAL PSYCHIATRY. COGNITIVE NEUROSCIENCE AND NEUROIMAGING 2022; 7:139-149. [PMID: 34478884 PMCID: PMC8825698 DOI: 10.1016/j.bpsc.2021.08.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/14/2021] [Revised: 07/18/2021] [Accepted: 08/22/2021] [Indexed: 02/03/2023]
Abstract
BACKGROUND Posttraumatic stress disorder (PTSD) is a debilitating disorder, and there is no current accurate prediction of who develops it after trauma. Neurobiologically, individuals with chronic PTSD exhibit aberrant resting-state functional connectivity (rsFC) between the hippocampus and other brain regions (e.g., amygdala, prefrontal cortex, posterior cingulate), and these aberrations correlate with severity of illness. Previous small-scale research (n < 25) has also shown that hippocampal rsFC measured acutely after trauma is predictive of future severity using a region-of-interest-based approach. While this is a promising biomarker, to date, no study has used a data-driven approach to test whole-brain hippocampal FC patterns in forecasting the development of PTSD symptoms.
METHODS A total of 98 adults at risk of PTSD were recruited from the emergency department after traumatic injury and completed resting-state functional magnetic resonance imaging (8 min) within 1 month; 6 months later, they completed the Clinician-Administered PTSD Scale for DSM-5 for assessment of PTSD symptom severity. Whole-brain rsFC values with bilateral hippocampi were extracted (using CONN) and used in a machine learning kernel ridge regression analysis (PRoNTo); a k-folds (k = 10) and 70/30 testing versus training split approach were used for cross-validation (1000 iterations to bootstrap confidence intervals for significance values).
RESULTS Acute hippocampal rsFC significantly predicted Clinician-Administered PTSD Scale for DSM-5 scores at 6 months (r = 0.30, p = .006; mean squared error = 120.58, p = .006; R² = 0.09, p = .025). In post hoc analyses, hippocampal rsFC remained significant after controlling for demographics, PTSD symptoms at baseline, and depression, anxiety, and stress severity at 6 months (B = 0.59, SE = 0.20, p = .003).
CONCLUSIONS Findings suggest that functional connectivity of the hippocampus across the brain acutely after traumatic injury is associated with prospective PTSD symptom severity.
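The kernel ridge regression with 10-fold cross-validation mentioned in the METHODS can be sketched as follows. This is a minimal illustration in plain NumPy, not the PRoNTo pipeline used in the study: the linear kernel, the regularization value `lam`, the function names, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def krr_fit(K_train, y_train, lam):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    n = K_train.shape[0]
    return np.linalg.solve(K_train + lam * np.eye(n), y_train)

def kfold_krr_predictions(X, y, lam=1.0, k=10, seed=0):
    """Return out-of-fold predictions from linear-kernel ridge regression."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    y_pred = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Linear kernel (assumed here): inner products between feature vectors.
        K_train = X[train_idx] @ X[train_idx].T
        K_test = X[test_idx] @ X[train_idx].T
        alpha = krr_fit(K_train, y[train_idx], lam)
        y_pred[test_idx] = K_test @ alpha
    return y_pred

# Synthetic stand-in data (hypothetical): 98 subjects, 20 connectivity features.
rng = np.random.default_rng(42)
X = rng.normal(size=(98, 20))
y = X @ rng.normal(size=20) + 0.1 * rng.normal(size=98)

y_pred = kfold_krr_predictions(X, y, lam=1.0, k=10)
r = np.corrcoef(y, y_pred)[0, 1]      # correlation between observed and predicted
mse = np.mean((y - y_pred) ** 2)      # mean squared error of out-of-fold predictions
```

The two summary statistics computed at the end, `r` and `mse`, correspond to the kind of metrics reported in the RESULTS; the numbers obtained on this synthetic data say nothing about the study itself.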
Affiliation(s)
- Elisabeth Kate Webb
  - University of Wisconsin-Milwaukee, Department of Psychology, Milwaukee, WI, USA
- Carissa N. Weis
  - University of Wisconsin-Milwaukee, Department of Psychology, Milwaukee, WI, USA
- Ashley A. Huggins
  - Medical University of South Carolina, Department of Psychiatry, Charleston, SC, USA
- Terri A. deRoon-Cassini
  - Medical College of Wisconsin, Department of Surgery, Division of Trauma & Acute Care Surgery, Milwaukee, WI, USA
- Christine L. Larson
  - University of Wisconsin-Milwaukee, Department of Psychology, Milwaukee, WI, USA
|
4
|
Viljanen M, Airola A, Pahikkala T. Generalized vec trick for fast learning of pairwise kernel models. Mach Learn 2022. [DOI: 10.1007/s10994-021-06127-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Pairwise learning corresponds to the supervised learning setting where the goal is to make predictions for pairs of objects. Prominent applications include predicting drug-target or protein-protein interactions, or customer-product preferences. In this work, we present a comprehensive review of pairwise kernels, that have been proposed for incorporating prior knowledge about the relationship between the objects. Specifically, we consider the standard, symmetric and anti-symmetric Kronecker product kernels, metric-learning, Cartesian, ranking, as well as linear, polynomial and Gaussian kernels. Recently, a $O(nm+nq)$ time generalized vec trick algorithm, where $n$, $m$, and $q$ denote the number of pairs, drugs and targets, was introduced for training kernel methods with the Kronecker product kernel. This was a significant improvement over previous $O(n^2)$ training methods, since in most real-world applications $m, q \ll n$. In this work we show how all the reviewed kernels can be expressed as sums of Kronecker products, allowing the use of generalized vec trick for speeding up their computation. In the experiments, we demonstrate how the introduced approach allows scaling pairwise kernels to much larger data sets than previously feasible, and provide an extensive comparison of the kernels on a number of biological interaction prediction tasks.
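To make the complexity claim concrete, the following sketch multiplies the Kronecker product kernel matrix over $n$ observed drug-target pairs by a vector without ever forming the $n \times n$ matrix. It illustrates the idea under stated assumptions and is not the authors' implementation: the function names, the index arrays `d_idx` and `t_idx`, and the random test kernels are hypothetical.

```python
import numpy as np

def kron_kernel_matvec(Kd, Kt, d_idx, t_idx, v):
    """Compute u[i] = sum_j Kd[d_i, d_j] * Kt[t_i, t_j] * v[j] in O(n*m + n*q)."""
    m, q = Kd.shape[0], Kt.shape[0]
    # Step 1, O(n*q): PT[d, t] = sum over pairs j with drug d of v[j] * Kt[t, t_j].
    PT = np.zeros((m, q))
    np.add.at(PT, d_idx, v[:, None] * Kt[:, t_idx].T)
    # Step 2, O(n*m): u[i] = sum_d Kd[d_i, d] * PT[d, t_i].
    return np.einsum("ij,ij->i", Kd[d_idx, :], PT[:, t_idx].T)

def kron_kernel_matvec_naive(Kd, Kt, d_idx, t_idx, v):
    """Reference O(n^2) version that builds the full pairwise kernel matrix."""
    K_pairs = Kd[np.ix_(d_idx, d_idx)] * Kt[np.ix_(t_idx, t_idx)]
    return K_pairs @ v

# Hypothetical toy problem: m = 5 drugs, q = 4 targets, n = 12 observed pairs.
rng = np.random.default_rng(0)
A, B = rng.normal(size=(5, 3)), rng.normal(size=(4, 3))
Kd, Kt = A @ A.T, B @ B.T                 # symmetric positive semidefinite kernels
d_idx = rng.integers(0, 5, size=12)       # drug index of each pair
t_idx = rng.integers(0, 4, size=12)       # target index of each pair
v = rng.normal(size=12)

u_fast = kron_kernel_matvec(Kd, Kt, d_idx, t_idx, v)
u_naive = kron_kernel_matvec_naive(Kd, Kt, d_idx, t_idx, v)
```

Both routines return the same vector; the fast one only touches an $m \times q$ intermediate, which is where the savings come from when $m, q \ll n$.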
|
5
|
Stock M, Piot N, Vanbesien S, Meys J, Smagghe G, De Baets B. Pairwise learning for predicting pollination interactions based on traits and phylogeny. Ecol Modell 2021. [DOI: 10.1016/j.ecolmodel.2021.109508] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/15/2022]
|
6
|
Cold-Start Problems in Data-Driven Prediction of Drug-Drug Interaction Effects. Pharmaceuticals (Basel) 2021; 14:ph14050429. [PMID: 34063324 PMCID: PMC8147651 DOI: 10.3390/ph14050429] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/22/2021] [Revised: 04/27/2021] [Accepted: 04/28/2021] [Indexed: 02/02/2023] Open
Abstract
Combining drugs, a phenomenon often referred to as polypharmacy, can induce additional adverse effects. The identification of adverse combinations is a key task in pharmacovigilance. In this context, in silico approaches based on machine learning are promising as they can learn from a limited number of combinations to predict for all. In this work, we identify various subtasks in predicting effects caused by drug–drug interaction. Predicting drug–drug interaction effects for drugs that already exist is very different from predicting outcomes for newly developed drugs, commonly called a cold-start problem. We propose suitable validation schemes for the different subtasks that emerge. These validation schemes are critical to correctly assess the performance. We develop a new model that obtains AUC-ROC = 0.843 for the hardest cold-start task up to AUC-ROC = 0.957 for the easiest one on the benchmark dataset of Zitnik et al. Finally, we illustrate how our predictions can be used to improve post-market surveillance systems or detect drug–drug interaction effects earlier during drug development.
|
7
|
Haneczok J, Piskorski J. Shallow and deep learning for event relatedness classification. Inf Process Manag 2020. [DOI: 10.1016/j.ipm.2020.102371] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/23/2022]
|
8
|
Stock M, Piot N, Vanbesien S, Vaissière B, Coiffait-Gombault C, Smagghe G, De Baets B. Information content in pollination network reveals missing interactions. Ecol Modell 2020. [DOI: 10.1016/j.ecolmodel.2020.109161] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/25/2022]
|
9
|
Daly AJ, Stock M, Baetens JM, De Baets B. Guiding Mineralization Co-Culture Discovery Using Bayesian Optimization. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2019; 53:14459-14469. [PMID: 31682110 DOI: 10.1021/acs.est.9b05942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Many disciplines rely on testing combinations of compounds, materials, proteins, or bacterial species to drive scientific discovery. It is time-consuming and expensive to determine experimentally, via trial-and-error or random selection approaches, which of the many possible combinations will lead to desirable outcomes. Hence, there is a pressing need for more rational and efficient experimental design approaches to reduce experimental effort. In this work, we demonstrate the potential of machine learning methods for the in silico selection of promising co-culture combinations in the application of bioaugmentation. We use the example of pollutant removal in drinking water treatment plants, which can be achieved using co-cultures of a specialized pollutant degrader with combinations of bacterial isolates. To reduce the experimental effort needed to discover high-performing combinations, we propose a data-driven experimental design. Based on a dataset of mineralization performance for all pairs of 13 bacterial species co-cultured with MSH1, we built a Gaussian process regression model to predict the Gompertz mineralization parameters of the co-cultures of two and three species, based on the single-strain parameters. We subsequently used this model in a Bayesian optimization scheme to suggest potentially high-performing combinations of bacteria. We achieved good performance with this approach, both for predicting mineralization parameters and for selecting effective co-cultures, despite the limited dataset. As a novel application of Bayesian optimization in bioremediation, this experimental design approach has promising applications for highlighting co-culture combinations for in vitro testing in various settings, to lessen the experimental burden and perform more targeted screenings.
Affiliation(s)
- Aisling J Daly
  - KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, B-9000 Ghent, Belgium
- Michiel Stock
  - KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, B-9000 Ghent, Belgium
- Jan M Baetens
  - KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, B-9000 Ghent, Belgium
- Bernard De Baets
  - KERMIT, Department of Data Analysis and Mathematical Modelling, Ghent University, Coupure Links 653, B-9000 Ghent, Belgium
|
10
|
Xu X, Tsang IW, Liu C. Improving Generalization via Attribute Selection on Out-of-the-Box Data. Neural Comput 2019; 32:485-514. [PMID: 31835004 DOI: 10.1162/neco_a_01256] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/04/2022]
Abstract
Zero-shot learning (ZSL) aims to recognize unseen objects (test classes) given some other seen objects (training classes) by sharing information of attributes between different objects. Attributes are artificially annotated for objects and treated equally in recent ZSL tasks. However, some inferior attributes with poor predictability or poor discriminability may have negative impacts on the ZSL system performance. This letter first derives a generalization error bound for ZSL tasks. Our theoretical analysis verifies that selecting the subset of key attributes can improve the generalization performance of the original ZSL model, which uses all the attributes. Unfortunately, previous attribute selection methods have been conducted based on the seen data, and their selected attributes have poor generalization capability to the unseen data, which is unavailable in the training stage of ZSL tasks. Inspired by learning from pseudo-relevance feedback, this letter introduces out-of-the-box data (pseudo-data generated by an attribute-guided generative model) to mimic the unseen data. We then present an iterative attribute selection (IAS) strategy that iteratively selects key attributes based on the out-of-the-box data. Since the distribution of the generated out-of-the-box data is similar to that of the test data, the key attributes selected by IAS can be effectively generalized to test data. Extensive experiments demonstrate that IAS can significantly improve existing attribute-based ZSL methods and achieve state-of-the-art performance.
Affiliation(s)
- Xiaofeng Xu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China, and Centre for Artificial Intelligence, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Ivor W Tsang
  - Centre for Artificial Intelligence, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Chuancai Liu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China, and Collaborative Innovation Center of IoT Technology and Intelligent Systems, Minjiang University, Fuzhou, Fujian 350000, China
|
11
|
Abstract
The deep multiple kernel learning (DMKL) method has attracted widespread attention because it gives better results than shallow multiple kernel learning. However, existing DMKL methods, which have a fixed number of layers and fixed types of kernels, adapt poorly to different data sets, and it is difficult to find suitable model parameters for them to improve the test accuracy. In this paper, we propose a self-adaptive deep multiple kernel learning (SA-DMKL) method. Our SA-DMKL method adapts the model by optimizing the parameters of each kernel function with a grid search and by changing the number and types of kernel functions in each layer according to a generalization bound that is evaluated with Rademacher chaos complexity. Experiments on three datasets from the University of California, Irvine (UCI) repository and on the image dataset Caltech 256 validate the effectiveness of the proposed method on three aspects.
|
12
|
Stock M, Pahikkala T, Airola A, Waegeman W, De Baets B. Algebraic shortcuts for leave-one-out cross-validation in supervised network inference. Brief Bioinform 2018; 21:262-271. [PMID: 30329015 DOI: 10.1093/bib/bby095] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/27/2018] [Revised: 08/21/2018] [Accepted: 09/06/2018] [Indexed: 12/20/2022] Open
Abstract
Supervised machine learning techniques have traditionally been very successful at reconstructing biological networks, such as protein-ligand interaction, protein-protein interaction and gene regulatory networks. Many supervised techniques for network prediction use linear models on a possibly nonlinear pairwise feature representation of edges. Recently, much emphasis has been placed on the correct evaluation of such supervised models. It is vital to distinguish between using a model to either predict new interactions in a given network or to predict interactions for a new vertex not present in the original network. This distinction matters because (i) the performance might dramatically differ between the prediction settings and (ii) tuning the model hyperparameters to obtain the best possible model depends on the setting of interest. Specific cross-validation schemes need to be used to assess the performance in such different prediction settings. In this work we discuss a state-of-the-art kernel-based network inference technique called two-step kernel ridge regression. We show that this regression model can be trained efficiently, with a time complexity scaling with the number of vertices rather than the number of edges. Furthermore, this framework leads to a series of cross-validation shortcuts that allow one to rapidly estimate the model performance for any relevant network prediction setting. This allows computational biologists to fully assess the capabilities of their models. The machine learning techniques with the algebraic shortcuts are implemented in the RLScore software package: https://github.com/aatapa/RLScore.
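The following sketch illustrates the two ideas in this abstract on a toy bipartite network: fitting two-step kernel ridge regression with matrices whose size depends only on the number of vertices, and obtaining leave-one-out predictions for a held-out row vertex without retraining. It is an illustration under assumptions, not the RLScore API and not a reproduction of the paper's full set of shortcuts: the function names are hypothetical, and the shortcut shown is the classic ridge leave-one-out identity applied to the first regression step.

```python
import numpy as np

def hat_matrix(K, lam):
    """H = (K + lam*I)^{-1} K, which equals K (K + lam*I)^{-1} for symmetric K."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), K)

def two_step_krr(Ku, Kv, Y, lam_u, lam_v):
    """Fitted values F = Hu @ Y @ Hv, using only vertex-sized matrices."""
    Hu, Hv = hat_matrix(Ku, lam_u), hat_matrix(Kv, lam_v)
    return Hu @ Y @ Hv, Hu, Hv

def loo_new_row_vertex(F, Hu, Hv, Y):
    """Predictions for each row vertex as if that whole row of Y were held out."""
    Z = Y @ Hv                       # labels after the column-side ridge step
    h = np.diag(Hu)[:, None]         # leverage of each row vertex
    return (F - h * Z) / (1.0 - h)

def loo_new_row_vertex_naive(Ku, Kv, Y, lam_u, lam_v):
    """Reference version: retrain the row-side step once per held-out vertex."""
    n = Ku.shape[0]
    Z = Y @ hat_matrix(Kv, lam_v)
    out = np.empty_like(Z)
    for i in range(n):
        keep = np.arange(n) != i
        K_sub = Ku[np.ix_(keep, keep)] + lam_u * np.eye(n - 1)
        out[i] = Ku[i, keep] @ np.linalg.solve(K_sub, Z[keep])
    return out

# Hypothetical toy network: 6 row vertices, 5 column vertices.
rng = np.random.default_rng(1)
A, B = rng.normal(size=(6, 3)), rng.normal(size=(5, 3))
Ku, Kv = A @ A.T, B @ B.T
Y = rng.normal(size=(6, 5))

F, Hu, Hv = two_step_krr(Ku, Kv, Y, lam_u=0.5, lam_v=0.5)
F_loo_fast = loo_new_row_vertex(F, Hu, Hv, Y)
F_loo_naive = loo_new_row_vertex_naive(Ku, Kv, Y, lam_u=0.5, lam_v=0.5)
```

The shortcut and the retraining loop agree, but the shortcut needs only the diagonal of `Hu`, so the cost of evaluating the "new vertex" setting is a single fit rather than one fit per vertex.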
Affiliation(s)
- Michiel Stock
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Belgium
- Tapio Pahikkala
  - Department of Future Technologies, University of Turku, Finland
- Antti Airola
  - Department of Future Technologies, University of Turku, Finland
- Willem Waegeman
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Belgium
- Bernard De Baets
  - Department of Data Analysis and Mathematical Modelling, Ghent University, Belgium
|