1
|
Provable Unrestricted Adversarial Training without Compromise with Generalizability. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; PP:1-18. [PMID: 38743549 DOI: 10.1109/tpami.2024.3400988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Adversarial training (AT) is widely considered the most promising strategy for defending against adversarial attacks and has drawn increasing interest from researchers. However, existing AT methods still suffer from two challenges. First, they are unable to handle unrestricted adversarial examples (UAEs), which are built from scratch, as opposed to restricted adversarial examples (RAEs), which are created by adding perturbations bounded by an ℓp norm to observed examples. Second, existing AT methods often achieve adversarial robustness at the expense of standard generalizability (i.e., the accuracy on natural examples) because they make a tradeoff between the two. To overcome these challenges, we propose a unique viewpoint that understands UAEs as imperceptibly perturbed unobserved examples. We also find that the tradeoff results from the separation of the distributions of adversarial examples and natural examples. Based on these ideas, we propose a novel AT approach called Provable Unrestricted Adversarial Training (PUAT), which can provide a target classifier with comprehensive adversarial robustness against both UAEs and RAEs while simultaneously improving its standard generalizability. In particular, PUAT utilizes partially labeled data to achieve effective UAE generation by accurately capturing the natural data distribution through a novel augmented triple-GAN. At the same time, PUAT extends traditional AT by introducing the supervised loss of the target classifier into the adversarial loss and, in collaboration with the augmented triple-GAN, aligns the UAE distribution, the natural data distribution, and the distribution learned by the classifier. Finally, solid theoretical analysis and extensive experiments conducted on widely used benchmarks demonstrate the superiority of PUAT.
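To make the notion of a restricted adversarial example concrete, the following is a minimal sketch of the norm constraint for the common case p = ∞: a candidate perturbed input is projected back so that no coordinate moves more than eps away from the observed example. The function name `project_linf` and the list-of-floats representation are illustrative assumptions and are not taken from the paper.

```python
def project_linf(x_adv, x, eps):
    """Project x_adv onto the l-infinity ball of radius eps centred at x.

    Hypothetical helper for illustration: each coordinate of the candidate
    adversarial example is clipped to the interval [x_i - eps, x_i + eps].
    """
    return [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]


if __name__ == "__main__":
    observed = [0.5, 0.5]
    candidate = [1.0, 0.0]  # moved too far from the observed example
    print(project_linf(candidate, observed, eps=0.25))
```

An unrestricted adversarial example, by contrast, is not tied to any observed input, so no such projection applies to it.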
|
2
|
A Survey on Deep Learning Event Extraction: Approaches and Applications. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6301-6321. [PMID: 36269921 DOI: 10.1109/tnnls.2022.3213168] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
Event extraction (EE) is a crucial research task for promptly apprehending event information from massive textual data. With the rapid development of deep learning, EE based on deep learning technology has become a research hotspot. Numerous methods, datasets, and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This article fills the research gap by reviewing the state-of-the-art approaches, especially focusing on the general domain EE based on deep learning models. We introduce a new literature classification of current general domain EE research according to the task definition. Afterward, we summarize the paradigm and models of EE approaches, and then discuss each of them in detail. As an important aspect, we summarize the benchmarks that support tests of predictions and evaluation metrics. A comprehensive comparison among different approaches is also provided in this survey. Finally, we conclude by summarizing future research directions facing the research area.
|
3
|
Reinforced GNNs for Multiple Instance Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; PP:1-15. [PMID: 38687672 DOI: 10.1109/tnnls.2024.3392575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2024]
Abstract
Multiple instance learning (MIL) trains models from bags of instances, where each bag contains multiple instances, and only bag-level labels are available for supervision. The application of graph neural networks (GNNs) in capturing intrabag topology effectively improves MIL. Existing GNNs usually require filtering low-confidence edges among instances and adapting graph neural architectures to new bag structures. However, such asynchronous adjustments to structure and architecture are tedious and ignore their correlations. To tackle these issues, we propose a reinforced GNN framework for MIL (RGMIL), pioneering the exploitation of multiagent deep reinforcement learning (MADRL) in MIL tasks. MADRL enables the flexible definition or extension of factors that influence bag graphs or GNNs and provides synchronous control over them. Moreover, MADRL explores structure-to-architecture correlations while automating adjustments. Experimental results on multiple MIL datasets demonstrate that RGMIL achieves the best performance with excellent explainability. The code and data are available at https://github.com/RingBDStack/RGMIL.
|
4
|
Towards complex dynamic physics system simulation with graph neural ordinary equations. Neural Netw 2024; 176:106341. [PMID: 38692189 DOI: 10.1016/j.neunet.2024.106341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 03/03/2024] [Accepted: 04/23/2024] [Indexed: 05/03/2024]
Abstract
The strong learning ability of deep learning helps us comprehend the real physical world, making learning to simulate complicated particle systems a promising endeavour in both academia and industry. However, the complex laws of the physical world pose significant challenges to learning-based simulations, such as the varying spatial dependencies between interacting particles and the varying temporal dependencies between particle system states at different time stamps, which dominate particles' interacting behavior and the physical systems' evolution patterns. Existing learning-based methods fail to fully account for these complexities, making them unable to yield satisfactory simulations. To better comprehend the complex physical laws, we propose a novel model, Graph Networks with Spatial-Temporal neural Ordinary Differential Equations (GNSTODE), that characterizes the varying spatial and temporal dependencies in particle systems using a unified end-to-end framework. Through training with real-world particle-particle interaction observations, GNSTODE can simulate any possible particle system with high precision. We empirically evaluate GNSTODE's simulation performance on two real-world particle systems, Gravity and Coulomb, with varying levels of spatial and temporal dependencies. The results show that GNSTODE yields better simulations than state-of-the-art methods, indicating that GNSTODE can serve as an effective tool for particle simulation in real-world applications. Our code is available at https://github.com/Guangsi-Shi/AI-for-physics-GNSTODE.
|
5
|
MultiFair: Model Fairness With Multiple Sensitive Attributes. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; PP:1-14. [PMID: 38648122 DOI: 10.1109/tnnls.2024.3384181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2024]
Abstract
While existing fairness interventions show promise in mitigating biased predictions, most studies concentrate on single-attribute protections. Although a few methods consider multiple attributes, they either require additional constraints or prediction heads, incurring high computational overhead or jeopardizing the stability of the training process. More critically, they consider per-attribute protection approaches, raising concerns about fairness gerrymandering where certain attribute combinations remain unfair. This work aims to construct a neutral domain containing fused information across all subgroups and attributes. It delivers fair predictions as the fused input contains neutralized information for all considered attributes. Specifically, we adopt mixup operations to generate samples with fused information. However, our experiments reveal that directly adopting the operations leads to degraded prediction results. The excessive mixup operations result in unrecognizable training data. To this end, we design three distinct mixup schemes that balance information fusion across attributes while retaining distinct visual features critical for training valid models. Extensive experiments with multiple datasets and up to eight sensitive attributes demonstrate that the proposed MultiFair method can deliver fairness protections for multiple attributes while maintaining valid prediction results.
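As a rough illustration of the mixup operation mentioned in this abstract, the sketch below forms a convex combination of two samples and of their labels with a mixing coefficient `lam`. It shows only the standard, generic mixup; the three MultiFair-specific schemes are not described in the abstract and are not reproduced here. The function name and the list-based representation are assumptions made for the example.

```python
def mixup(x_i, x_j, y_i, y_j, lam):
    """Generic mixup: convex combination of two samples and their labels.

    lam is the mixing coefficient in [0, 1]; lam = 1 returns the first
    sample unchanged and lam = 0 returns the second.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


if __name__ == "__main__":
    # Two toy samples with one-hot labels, mixed in equal proportion.
    x, y = mixup([0.0, 2.0], [2.0, 0.0], [1.0, 0.0], [0.0, 1.0], lam=0.5)
    print(x, y)
```

Values of `lam` close to 0.5 blend the two inputs most heavily, which is consistent with the abstract's remark that excessive mixing can make training data unrecognizable.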
|
6
|
A Comprehensive Survey on Community Detection With Deep Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:4682-4702. [PMID: 35263257 DOI: 10.1109/tnnls.2021.3137396] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Indexed: 06/14/2023]
Abstract
Detecting a community in a network is a matter of discerning the distinct features and connections of a group of members that are different from those in other communities. The ability to do this is of great significance in network analysis. However, beyond the classic spectral clustering and statistical inference methods, there have been significant developments with deep learning techniques for community detection in recent years, particularly when it comes to handling high-dimensional network data. Hence, a comprehensive review of the latest progress in community detection through deep learning is timely. To frame the survey, we have devised a new taxonomy covering different state-of-the-art methods, including deep learning models based on deep neural networks (DNNs), deep nonnegative matrix factorization, and deep sparse filtering. The main category, i.e., DNNs, is further divided into convolutional networks, graph attention networks, generative adversarial networks, and autoencoders. The popular benchmark datasets, evaluation metrics, and open-source implementations to address experimentation settings are also summarized. This is followed by a discussion on the practical applications of community detection in various domains. The survey concludes with suggestions of challenging topics that would make for fruitful future research directions in this fast-growing deep learning field.
|
7
|
MVSTT: A Multiview Spatial-Temporal Transformer Network for Traffic-Flow Forecasting. IEEE TRANSACTIONS ON CYBERNETICS 2024; 54:1582-1595. [PMID: 37015356 DOI: 10.1109/tcyb.2022.3223918] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
Accurate traffic-flow prediction remains a critical challenge due to complicated spatial dependencies, temporal factors, and unpredictable events. Most existing approaches focus on single- or dual-view learning and thus face limitations in systematically learning complex spatial-temporal features. In this work, we propose a novel multiview spatial-temporal transformer (MVSTT) network that can effectively learn complex spatial-temporal domain correlations and potential patterns from multiple views. First, we examine a temporal view and design a short-range gated convolution component from a short-term subview, and a long-range gated convolution component from a long-term subview. These two components effectively aggregate knowledge of the temporal domain at multiple granularities and mine patterns of node evolution across time steps. Meanwhile, in the spatial view, we design a dual-graph spatial learning module that captures fixed and dynamic spatial dependencies of nodes, as well as the evolution patterns of edges, from the static and dynamic graph subviews, respectively. In addition, we further design a spatial-temporal transformer to mine different levels of spatial-temporal features through multiview knowledge fusion. Extensive experiments on four real-world traffic datasets show that our method consistently outperforms the state-of-the-art baseline. The code of MVSTT is available at https://github.com/JianSoL/MVSTT.
|
8
|
Motif-Based Contrastive Learning for Community Detection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; PP:1-14. [PMID: 38408012 DOI: 10.1109/tnnls.2024.3367873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2024]
Abstract
Community detection has become a prominent task in complex network analysis. However, most of the existing methods for community detection only focus on the lower order structure at the level of individual nodes and edges and ignore the higher order connectivity patterns that characterize the fundamental building blocks within the network. In recent years, researchers have shown interest in motifs and their role in network analysis. However, most of the existing higher order approaches are based on shallow methods, failing to capture the intricate nonlinear relationships between nodes. In order to better fuse higher order and lower order structural information, a novel deep learning framework called motif-based contrastive learning for community detection (MotifCC) is proposed. First, a higher order network is constructed based on motifs. Subnetworks are then obtained by removing isolated nodes, addressing the fragmentation issue in the higher order network. Next, the concept of contrastive learning is applied to effectively fuse various kinds of information from nodes, edges, and higher order and lower order structures. This aims to maximize the similarity of corresponding node information, while distinguishing different nodes and different communities. Finally, based on the community structure of subnetworks, the community labels of all nodes are obtained by using the idea of label propagation. Extensive experiments on real-world datasets validate the effectiveness of MotifCC.
|
9
|
Multi-View Graph Learning by Joint Modeling of Consistency and Inconsistency. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:2848-2862. [PMID: 35895654 DOI: 10.1109/tnnls.2022.3192445] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/15/2023]
Abstract
Graph learning has emerged as a promising technique for multi-view clustering due to its ability to learn a unified and robust graph from multiple views. However, existing graph learning methods mostly focus on the multi-view consistency issue, yet often neglect the inconsistency between views, which makes them vulnerable to possibly low-quality or noisy datasets. To overcome this limitation, we propose a new multi-view graph learning framework, which for the first time simultaneously and explicitly models multi-view consistency and inconsistency in a unified objective function, through which the consistent and inconsistent parts of each single-view graph as well as the unified graph that fuses the consistent parts can be iteratively learned. Though optimizing the objective function is NP-hard, we design a highly efficient optimization algorithm that can obtain an approximate solution with linear time complexity in the number of edges in the unified graph. Furthermore, our multi-view graph learning approach can be applied to both similarity graphs and dissimilarity graphs, which lead to two graph fusion-based variants in our framework. Experiments on 12 multi-view datasets have demonstrated the robustness and efficiency of the proposed approach. The code is available at https://github.com/youweiliang/Multi-view_Graph_Learning.
|
10
|
A Weighted Symmetric Graph Embedding Approach for Link Prediction in Undirected Graphs. IEEE TRANSACTIONS ON CYBERNETICS 2024; 54:1037-1047. [PMID: 35759583 DOI: 10.1109/tcyb.2022.3181810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Link prediction is an important task in social network analysis and mining because of its various applications. A large number of link prediction methods have been proposed. Among them, the deep learning-based embedding methods, which encode each node and edge as an embedding vector, exhibit excellent performance and enable easy integration with traditional machine learning algorithms. However, some problems remain unsolved for this kind of method, especially in the steps of node embedding and edge embedding. First, these methods either share exactly the same weight among all neighbors or assign a completely different weight to each node to obtain the node embedding. Second, they can hardly keep the symmetry of edge embeddings obtained from node representations by direct concatenation or other binary operations such as averaging and Hadamard product. To solve these problems, we propose a weighted symmetric graph embedding approach for link prediction. In node embedding, the proposed approach aggregates neighbors of different orders with different aggregating weights. In edge embedding, the proposed approach bidirectionally concatenates node pairs both forwardly and backwardly to guarantee the symmetry of edge representations while preserving local structural information. The experimental results show that our proposed approach can better predict network links, outperforming the state-of-the-art methods. The appropriate aggregating weight assignment and the bidirectional concatenation enable us to learn more accurate and symmetric edge representations for link prediction.
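The symmetry issue raised in this abstract can be seen in a small sketch: a plain concatenation of two node vectors depends on their order, whereas combining the forward and backward concatenations gives the same edge vector for (u, v) and (v, u). The elementwise sum used to combine the two directions is an assumption chosen only to make the example symmetric; the abstract does not state how the paper combines them, and the function names are hypothetical.

```python
def concat(a, b):
    """Plain concatenation of two node embeddings (order-dependent)."""
    return list(a) + list(b)


def symmetric_edge_embedding(u, v):
    """Illustrative symmetric edge embedding.

    Builds the forward concatenation [u; v] and the backward concatenation
    [v; u], then sums them elementwise. The sum is an assumed combiner, not
    the paper's operator; it is used here because it is order-invariant.
    """
    forward = concat(u, v)
    backward = concat(v, u)
    return [f + b for f, b in zip(forward, backward)]


if __name__ == "__main__":
    u, v = [1.0, 2.0], [3.0, 4.0]
    print(concat(u, v) == concat(v, u))  # order matters for plain concat
    print(symmetric_edge_embedding(u, v) == symmetric_edge_embedding(v, u))
```

For an undirected graph this order invariance is the property of interest, since the edge (u, v) and the edge (v, u) denote the same link.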
|
11
|
Omni-Training: Bridging Pre-Training and Meta-Training for Few-Shot Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:15275-15291. [PMID: 37751343 DOI: 10.1109/tpami.2023.3319517] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/28/2023]
Abstract
Few-shot learning aims to quickly adapt a deep model from a few examples. While pre-training and meta-training can create deep models that are powerful for few-shot generalization, we find that pre-training and meta-training focus respectively on cross-domain transferability and cross-task transferability, which restricts their data efficiency in the entangled settings of domain shift and task shift. We thus propose the Omni-Training framework to seamlessly bridge pre-training and meta-training for data-efficient few-shot learning. Our first contribution is a tri-flow Omni-Net architecture. Besides the joint representation flow, Omni-Net introduces two parallel flows for pre-training and meta-training, responsible for improving domain transferability and task transferability respectively. Omni-Net further coordinates the parallel flows by routing their representations via the joint flow, enabling knowledge transfer across flows. Our second contribution is the Omni-Loss, which introduces a self-distillation strategy separately on the pre-training and meta-training objectives to boost knowledge transfer throughout different training stages. Omni-Training is a general framework that accommodates many existing algorithms. Evaluations show that our single framework consistently and clearly outperforms the individual state-of-the-art methods on both cross-task and cross-domain settings in a variety of classification, regression, and reinforcement learning problems.
|
12
|
G3SR: Global Graph Guided Session-Based Recommendation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:9671-9684. [PMID: 35324448 DOI: 10.1109/tnnls.2022.3159592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Session-based recommendation tries to make use of anonymous session data to deliver high-quality recommendations under the condition that user profiles and the complete historical behavioral data of a target user are unavailable. Previous works consider each session individually and try to capture user interests within a session. Despite their encouraging results, these models can only perceive intra-session items and cannot draw upon the massive historical relational information. To solve this problem, we propose a novel method named global graph guided session-based recommendation (G3SR). G3SR decomposes the session-based recommendation workflow into two steps. First, a global graph is built upon all session data, from which the global item representations are learned in an unsupervised manner. Then, these representations are refined on session graphs under the graph networks, and a readout function is used to generate session representations for each session. Extensive experiments on two real-world benchmark datasets show remarkable and consistent improvements of the G3SR method over the state-of-the-art methods, especially for cold items.
|
13
|
Model-Based Self-Advising for Multi-Agent Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:7934-7945. [PMID: 35157599 DOI: 10.1109/tnnls.2022.3147221] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
In multiagent learning, one of the main ways to improve learning performance is to ask for advice from another agent. Contemporary advising methods share a common limitation: a teacher agent can only advise a student agent if the teacher has experience with an identical state. However, in highly complex learning scenarios, such as autonomous driving, it is rare for two agents to experience exactly the same state, which makes the advice less of a learning aid and more of a one-time instruction. In these scenarios, with contemporary methods, agents do not really help each other learn, and the main outcome of their back-and-forth requests for advice is an exorbitant communication overhead. In human interactions, teachers are often asked for advice on what to do in situations that students are personally unfamiliar with. In these situations, we generally draw from similar experiences to formulate advice. This inspired us to provide agents with the same ability when asked for advice on an unfamiliar state. Hence, we propose a model-based self-advising method that allows agents to train a model based on states similar to the state in question to inform their responses. As a result, the advice given can be used to resolve not only the current dilemma but also, via self-advising, many other similar situations that the student may come across in the future. Compared with contemporary methods, our method brings a significant improvement in learning performance with much lower communication overhead.
|
14
|
Balancing Learning Model Privacy, Fairness, and Accuracy With Early Stopping Criteria. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:5557-5569. [PMID: 34878980 DOI: 10.1109/tnnls.2021.3129592] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
As deep learning models mature, one of the most pressing questions we face is: what is the ideal tradeoff among accuracy, fairness, and privacy (AFP)? Unfortunately, both the privacy and the fairness of a model come at the cost of its accuracy. Hence, an efficient and effective means of fine-tuning the balance among this trinity of needs is critical. Motivated by some curious observations in privacy-accuracy tradeoffs with differentially private stochastic gradient descent (DP-SGD), where fair models sometimes result, we conjecture that fairness might be better managed as an indirect byproduct of this process. Hence, we conduct a series of analyses, both theoretical and empirical, on the impacts of implementing DP-SGD in deep neural network models through gradient clipping and noise addition. The results show that, in deep learning, the number of training epochs is central to striking a balance among AFP because DP-SGD makes the training less stable, providing the possibility of model updates at a low discrimination level without much loss in accuracy. Based on this observation, we designed two different early stopping criteria to help analysts choose the optimal epoch at which to stop training a model so as to achieve their ideal tradeoff. Extensive experiments show that our methods can achieve an ideal balance among AFP.
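The two DP-SGD ingredients named in this abstract, gradient clipping and noise addition, can be sketched as follows. This is the commonly described form of the DP-SGD gradient step (clip each per-example gradient to a maximum L2 norm, sum, add Gaussian noise scaled by that norm, then average); it is a simplified illustration and not the paper's experimental code. The function names and the `noise_multiplier` parameter name are assumptions made for the example.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale grad down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, and average.

    The noise standard deviation is noise_multiplier * max_norm, so a
    noise_multiplier of 0 reduces this to an average of clipped gradients.
    """
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[k] for g in clipped) for k in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / len(clipped) for n in noisy]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    grads = [[3.0, 4.0], [0.0, 0.5]]
    print(dp_sgd_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng))
```

Because fresh noise is drawn at every step, the parameter trajectory fluctuates from epoch to epoch, which is in line with the abstract's observation that DP-SGD makes training less stable.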
|
15
|
Comprehensive evaluation of deep and graph learning on drug-drug interactions prediction. Brief Bioinform 2023:bbad235. [PMID: 37401373 DOI: 10.1093/bib/bbad235] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 03/27/2023] [Revised: 05/30/2023] [Accepted: 06/05/2023] [Indexed: 07/05/2023] Open
Abstract
Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). A DDI refers to a change in the effect of one drug due to the presence of another drug in the human body, and DDIs play an essential role in drug discovery and clinical research. DDI prediction through traditional clinical trials and experiments is an expensive and time-consuming process. To correctly apply advanced AI and deep learning, developers and users face various challenges, such as the availability and encoding of data resources and the design of computational methods. This review summarizes chemical structure-based, network-based, natural language processing-based, and hybrid methods, providing an updated and accessible guide for the broad research and development community with different domain knowledge. We introduce widely used molecular representations and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDI prediction.
|
16
|
Adaptive Subgraph Neural Network With Reinforced Critical Structure Mining. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:8063-8080. [PMID: 37018637 DOI: 10.1109/tpami.2023.3235931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
While graph representation learning methods have shown success in various graph mining tasks, what knowledge is exploited for predictions is less discussed. This paper proposes a novel Adaptive Subgraph Neural Network named AdaSNN to find critical structures in graph data, i.e., subgraphs that are dominant to the prediction results. To detect critical subgraphs of arbitrary size and shape in the absence of explicit subgraph-level annotations, AdaSNN designs a Reinforced Subgraph Detection Module to search subgraphs adaptively without heuristic assumptions or predefined rules. To encourage the subgraph to be predictive at the global scale, we design a Bi-Level Mutual Information Enhancement Mechanism including both global-aware and label-aware mutual information maximization to further enhance the subgraph representations in the perspective of information theory. By mining critical subgraphs that reflect the intrinsic property of a graph, AdaSNN can provide sufficient interpretability to the learned results. Comprehensive experimental results on seven typical graph datasets demonstrate that AdaSNN has a significant and consistent performance improvement and provides insightful results.
|
17
|
Transferring From Textual Entailment to Biomedical Named Entity Recognition. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:2577-2586. [PMID: 37018664 DOI: 10.1109/tcbb.2023.3236477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Biomedical Named Entity Recognition (BioNER) aims at identifying biomedical entities such as genes, proteins, diseases, and chemical compounds in the given textual data. However, due to issues of ethics, privacy, and the high specialization of biomedical data, BioNER suffers from a more severe lack of quality labeled data than the general domain, especially at the token level. Facing extremely limited labeled biomedical data, this work studies the problem of gazetteer-based BioNER, which aims at building a BioNER system from scratch. The system needs to identify the entities in the given sentences when zero token-level annotations are available for training. Previous works usually use sequential labeling models to solve the NER or BioNER task and obtain weakly labeled data from gazetteers when full annotations are unavailable. However, these labeled data are quite noisy, since labels are needed for each token and the entity coverage of the gazetteers is limited. Here we propose to formulate the BioNER task as a Textual Entailment problem and solve the task via Textual Entailment with Dynamic Contrastive learning (TEDC). TEDC not only alleviates the noisy labeling issue, but also transfers the knowledge from pre-trained textual entailment models. Additionally, the dynamic contrastive learning framework contrasts the entities and non-entities in the same sentence and improves the model's discrimination ability. Experiments on two real-world biomedical datasets show that TEDC can achieve state-of-the-art performance for gazetteer-based BioNER.
|
18
|
C-DeepTrust: A Context-Aware Deep Trust Prediction Model in Online Social Networks. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:2767-2780. [PMID: 34550893 DOI: 10.1109/tnnls.2021.3107948] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/02/2023]
Abstract
Trust prediction provides valuable support for decision making, information dissemination, and product promotion in online social networks. As a complex concept in the social network community, trust relationships among people can be established virtually based on: 1) their interaction behaviors, e.g., the ratings and comments that they provided; 2) the contextual information associated with their interactions, e.g., location and culture; and 3) the relative temporal features of interactions and the time periods when the trust relationships hold. Most of the existing works only focus on some aspects of trust, and there is not a comprehensive study of user trust development that considers and incorporates 1)-3) in trust prediction. In this article, we propose a context-aware deep trust prediction model C-DeepTrust to fill this gap. First, we conduct user feature modeling to obtain the user's static and dynamic preference features in each context. Static user preference features are obtained from all the ratings and reviews that a user provided, while dynamic user preference features are obtained from the items rated/reviewed by the user in time series. The obtained context-aware user features are then combined and fed into the multilayer projection structure to further mine the context-aware latent features. Finally, the context-aware trust relationships between users are calculated by their context-aware feature vector cosine similarities according to the social homophily theory, which shows a pervasive property of social networks that trust relationships are more likely to be developed among similar people. Extensive experiments conducted on two real-world datasets show the superior performance of our approach compared with the representative baseline methods.
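The final step described in this abstract scores a pair of users by the cosine similarity of their context-aware feature vectors. A minimal sketch of that similarity computation is given below; the feature vectors themselves come from the model's projection layers, which are not reproduced here, and the convention of returning 0.0 for a zero vector is an assumption made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors.

    Returns 0.0 when either vector is all zeros (an assumed convention,
    chosen to avoid a division by zero).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical latent features of two users in the same context.
    user_a = [0.2, 0.7, 0.1]
    user_b = [0.25, 0.6, 0.15]
    print(cosine_similarity(user_a, user_b))
```

Under the homophily assumption stated in the abstract, a higher similarity between two users' feature vectors is read as a higher likelihood of a trust relationship.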
|
19
|
Sinkhorn Distance Minimization for Adaptive Semi-Supervised Social Network Alignment. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; PP:1-14. [PMID: 37216231 DOI: 10.1109/tnnls.2023.3267126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Social network alignment, aiming at linking identical identities across different social platforms, is a fundamental task in social graph mining. Most existing approaches are supervised models and require a large number of manually labeled data, which are infeasible in practice considering the yawning gap between social platforms. Recently, isomorphism across social networks is incorporated as complementary to link identities from the distribution level, which contributes to alleviating the dependency on sample-level annotations. Adversarial learning is adopted to learn a shared projection function by minimizing the distance between two social distributions. However, the hypothesis of isomorphism might not always hold true as social user behaviors are generally unpredictable, and thus a shared projection function is insufficient to handle the sophisticated cross-platform correlations. In addition, adversarial learning suffers from training instability and uncertainty, which may hinder model performance. In this article, we propose a novel meta-learning-based social network alignment model Meta-SNA to effectively capture the isomorphism and the unique characteristics of each identity. Our motivation lies in learning a shared meta-model to preserve the global cross-platform knowledge and an adaptor to learn a specific projection function for each identity. Sinkhorn distance is further introduced as the distribution closeness measurement to tackle the limitations of adversarial learning, which owns an explicitly optimal solution and can be efficiently computed by the matrix scaling algorithm. Empirically, we evaluate the proposed model over multiple datasets, and the experimental results demonstrate the superiority of Meta-SNA.
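The abstract notes that the Sinkhorn distance can be computed efficiently by the matrix scaling algorithm. The sketch below shows the generic Sinkhorn iteration for two discrete distributions in plain Python. It is not the Meta-SNA implementation, and the function name, the regularisation value `reg`, and the toy inputs are assumptions chosen for illustration.

```python
import math

def sinkhorn_distance(a, b, cost, reg=0.1, n_iters=500):
    """Entropy-regularised transport cost between histograms a and b.

    a, b  : lists of non-negative weights, each summing to 1
    cost  : cost[i][j] is the cost of moving mass from bin i of a to bin j of b
    reg   : entropic regularisation strength (illustrative default)
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / reg); very small reg can underflow to 0.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Matrix scaling: alternately rescale rows and columns so that the
        # marginals of diag(u) K diag(v) match a and b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    dist = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return dist, plan

# Two identical two-bin histograms: almost no mass needs to move.
dist, plan = sinkhorn_distance([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
```

Each iteration costs only a pair of matrix-vector products, which is what makes the measure cheap to evaluate inside a training loop.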
|
20
|
A Hybrid Two-Stage Teaching-Learning-Based Optimization Algorithm for Feature Selection in Bioinformatics. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:1746-1760. [PMID: 36251903 DOI: 10.1109/tcbb.2022.3215129] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/06/2023]
Abstract
The "curse of dimensionality" brings new challenges to the feature selection (FS) problem, especially in the bioinformatics field. In this paper, we propose a hybrid Two-Stage Teaching-Learning-Based Optimization (TS-TLBO) algorithm to improve the performance of bioinformatics data classification. In the selection reduction stage, potentially informative features, as well as noisy features, are selected to effectively reduce the search space. In the following comparative self-learning stage, the teacher and the worst student with self-learning evolve together based on the duality of the FS problems to enhance the exploitation capabilities. In addition, an opposition-based learning strategy is utilized to generate initial solutions to rapidly improve the quality of the solutions. We further develop a self-adaptive mutation mechanism to improve the search performance by dynamically adjusting the mutation rate according to the teacher's convergence ability. Moreover, we integrate a differential evolutionary method with TLBO to boost the exploration ability of our algorithm. We conduct comparative experiments on 31 public data sets with different data dimensions, including 7 bioinformatics datasets, and evaluate our TS-TLBO algorithm against 11 related methods. The experimental results show that the TS-TLBO algorithm obtains a good feature subset with better classification performance, and indicate its generality to FS problems.
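Opposition-based learning for initialisation, as mentioned in the abstract, is commonly realised by generating random candidates, forming their "opposites", and keeping the best of the combined pool. The sketch below applies that common scheme to binary feature masks, where the opposite of a mask flips every bit. The exact variant used in TS-TLBO is not described in the abstract, so the function, the toy fitness, and the set of "informative" features are all assumptions.

```python
import random

def opposition_based_init(pop_size, n_features, fitness, rng=None):
    """Return pop_size binary feature masks chosen from random masks and their opposites."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    # The opposite of a binary mask selects exactly the features the mask left out.
    opposites = [[1 - bit for bit in mask] for mask in population]
    candidates = population + opposites
    # Keep the fittest pop_size masks from the combined pool.
    candidates.sort(key=fitness, reverse=True)
    return candidates[:pop_size]

# Hypothetical fitness: reward two "informative" features, penalise mask size.
INFORMATIVE = (0, 3)

def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)

initial = opposition_based_init(pop_size=6, n_features=5, fitness=toy_fitness)
```

In a real feature selection setting the fitness would typically be a cross-validated classification score rather than this toy function.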
|
21
|
Higher Order Connection Enhanced Community Detection in Adversarial Multiview Networks. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:3060-3074. [PMID: 34767522 DOI: 10.1109/tcyb.2021.3125227] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
Community detection in multiview networks has drawn an increasing amount of attention in recent years. Many approaches have been developed from different perspectives. Despite the success, the problem of community detection in adversarial multiview networks remains largely unsolved. An adversarial multiview network is a multiview network that suffers an adversarial attack on community detection in which the attackers may deliberately remove some critical edges so as to hide the underlying community structure, leading to the performance degeneration of the existing approaches. To address this problem, we propose a novel approach, called higher order connection enhanced multiview modularity (HCEMM). The main idea lies in enhancing the intracommunity connection of each view by means of utilizing the higher order connection structure. The first step is to discover the view-specific higher order Microcommunities (VHM-communities) from the higher order connection structure. Then, for each view of the original multiview network, additional edges are added to make the nodes in each of its VHM-communities fully connected like a clique, by which the intracommunity connection of the multiview network can be enhanced. Therefore, the proposed approach is able to discover the underlying community structure in a multiview network while recovering the missing edges. Extensive experiments conducted on 16 real-world datasets confirm the effectiveness of the proposed approach.
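The edge-enhancement step described in the abstract, adding edges so that the nodes of each VHM-community become fully connected, can be sketched for a single view as follows. How the VHM-communities are discovered is not covered here: the communities are taken as given input, and the function name and toy graph are hypothetical.

```python
from itertools import combinations

def complete_to_cliques(edges, communities):
    """Return the undirected edge set of one view after making each community a clique."""
    # Store edges as frozensets so (u, v) and (v, u) count as the same edge.
    enhanced = {frozenset(edge) for edge in edges}
    for community in communities:
        for u, v in combinations(sorted(community), 2):
            enhanced.add(frozenset((u, v)))
    return enhanced

# Toy view in which an attacker has removed the edge between nodes 1 and 3.
view_edges = [(1, 2), (2, 3), (4, 5)]
vhm_communities = [{1, 2, 3}, {4, 5}]  # assumed to be discovered beforehand
enhanced_edges = complete_to_cliques(view_edges, vhm_communities)
```

Here the missing edge between nodes 1 and 3 is restored, while the already complete community of nodes 4 and 5 is left unchanged.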
|
22
|
Effectively Identifying Compound-Protein Interaction Using Graph Neural Representation. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:932-943. [PMID: 35951570 DOI: 10.1109/tcbb.2022.3198003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Effectively identifying compound-protein interactions (CPIs) is crucial for new drug design, which is an important step in in silico drug discovery. Current machine learning methods for CPI prediction mainly use one-dimensional (1D) compound/protein strings and/or the specific descriptors. However, they often ignore the fact that molecules are essentially modeled by the molecular graph. We observe that in real-world scenarios, the topological structure information of the molecular graph usually provides an overview of how the atoms are connected, and the local chemical context reveals the functionality of the protein sequence in CPI. These two types of information are complementary to each other and they are both significant for modeling compound-protein pairs. Motivated by this, we propose an end-to-end deep learning framework named GraphCPI, which captures the structural information of compounds and leverages the chemical context of protein sequences for solving the CPI prediction task. Our framework can integrate any popular graph neural networks for learning compounds, and it combines with a convolutional neural network for embedding sequences. To compare our method with classic and state-of-the-art deep learning methods, we conduct extensive experiments based on several widely-used CPI datasets. The experimental results show the feasibility and competitiveness of our proposed method.
|
23
|
SOR-TC: Self-Attentive Octave ResNet with Temporal Consistency for Compressed Video Action Recognition. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.02.045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
|
24
|
CIFair: Constructing continuous domains of invariant features for image fair classifications. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2023]
|
25
|
Multiview Clustering via Proximity Learning in Latent Representation Space. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:973-986. [PMID: 34432638 DOI: 10.1109/tnnls.2021.3104846] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
Most existing multiview clustering methods are based on the original feature space. However, the feature redundancy and noise in the original feature space limit their clustering performance. To address this problem, some multiview clustering methods learn the latent data representation linearly, but their performance may decline if the relation between the latent data representation and the original data is nonlinear. Other methods, which learn the latent data representation nonlinearly, usually conduct the latent representation learning and clustering separately, so the latent data representation might not be well adapted to clustering. Furthermore, none of them model the intercluster relation and intracluster correlation of data points, which limits the quality of the learned latent data representation and therefore influences the clustering performance. To solve these problems, this article proposes a novel multiview clustering method via proximity learning in latent representation space, named multiview latent proximity learning (MLPL). For one thing, MLPL learns the latent data representation in a nonlinear manner which takes the intercluster relation and intracluster correlation into consideration simultaneously. For another, through conducting the latent representation learning and consensus proximity learning simultaneously, MLPL learns a consensus proximity matrix with k connected components to output the clustering result directly. Extensive experiments are conducted on seven real-world datasets to demonstrate the effectiveness and superiority of the MLPL method compared with the state-of-the-art multiview clustering methods.
|
26
|
PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:2208-2225. [PMID: 35380958 DOI: 10.1109/tpami.2022.3165153] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 06/14/2023]
Abstract
The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical context, where the visual dynamics are believed to have modular structures that can be learned with compositional subsystems. This paper models these structures by presenting PredRNN, a new recurrent network, in which a pair of memory cells are explicitly decoupled, operate in nearly independent transition manners, and finally form unified representations of the complex environment. Concretely, besides the original memory cell of LSTM, this network is featured by a zigzag memory flow that propagates in both bottom-up and top-down directions across all layers, enabling the learned visual dynamics at different levels of RNNs to communicate. It also leverages a memory decoupling loss to keep the memory cells from learning redundant features. We further propose a new curriculum learning strategy to force PredRNN to learn long-term dynamics from context frames, which can be generalized to most sequence-to-sequence models. We provide detailed ablation studies to verify the effectiveness of each component. Our approach is shown to obtain highly competitive results on five datasets for both action-free and action-conditioned predictive learning scenarios.
|
27
|
Reinforced, Incremental and Cross-Lingual Event Detection From Social Messages. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:980-998. [PMID: 35077355 DOI: 10.1109/tpami.2022.3144993] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 06/14/2023]
Abstract
Detecting hot social events (e.g., political scandal, momentous meetings, natural hazards, etc.) from social messages is crucial as it highlights significant happenings to help people understand the real world. On account of the streaming nature of social messages, incremental social event detection models in acquiring, preserving, and updating messages over time have attracted great attention. However, the challenge is that the existing event detection methods towards streaming social messages are generally confronted with ambiguous events features, dispersive text contents, and multiple languages, and hence result in low accuracy and generalization ability. In this paper, we present a novel reinForced, incremental and cross-lingual social Event detection architecture, namely FinEvent, from streaming social messages. Concretely, we first model social messages into heterogeneous graphs integrating both rich meta-semantics and diverse meta-relations, and convert them to weighted multi-relational message graphs. Second, we propose a new reinforced weighted multi-relational graph neural network framework by using a Multi-agent Reinforcement Learning algorithm to select optimal aggregation thresholds across different relations/edges to learn social message embeddings. To solve the long-tail problem in social event detection, a balanced sampling strategy guided Contrastive Learning mechanism is designed for incremental social message representation learning. Third, a new Deep Reinforcement Learning guided density-based spatial clustering model is designed to select the optimal minimum number of samples required to form a cluster and optimal minimum distance between two clusters in social event detection tasks. Finally, we implement incremental social message representation learning based on knowledge preservation on the graph neural network and achieve the transferring cross-lingual social event detection. 
We conduct extensive experiments to evaluate the FinEvent on Twitter streams, demonstrating a significant and consistent improvement in model quality with 14%-118%, 8%-170%, and 2%-21% increases in performance on offline, online, and cross-lingual social event detection tasks.
|
28
|
DC-FUDA: Improving deep clustering via fully unsupervised domain adaptation. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
|
29
|
Privacy-Preserving Federated Mining of Frequent Itemsets. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
|
30
|
Domain-Invariant Feature Progressive Distillation with Adversarial Adaptive Augmentation for Low-Resource Cross-Domain NER. ACM T ASIAN LOW-RESO 2022. [DOI: 10.1145/3570502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
Considering the expensive annotation in Named Entity Recognition (NER), cross-domain NER enables NER in low-resource target domains with few or no labeled data by transferring the knowledge of high-resource domains. However, the discrepancy between different domains causes a domain shift problem that hampers the performance of cross-domain NER in low-resource scenarios. In this article, we first propose an adversarial adaptive augmentation, where we integrate the adversarial strategy into a multi-task learner to augment and qualify domain adaptive data. We extract domain-invariant features of the adaptive data to bridge the cross-domain gap and alleviate the label-sparsity problem simultaneously. Therefore, another important component in this article is the progressive domain-invariant feature distillation framework. A multi-grained MMD (Maximum Mean Discrepancy) approach in the framework extracts the multi-level domain-invariant features and enables knowledge transfer across domains through the adversarial adaptive data. An advanced Knowledge Distillation (KD) schema progressively performs domain adaptation through the powerful pre-trained language models and multi-level domain-invariant features. Extensive comparative experiments over 4 English and 2 Chinese benchmarks show the importance of adversarial augmentation and effective adaptation from high-resource domains to low-resource target domains. Comparison with 2 vanilla and 4 latest baselines indicates state-of-the-art performance and superiority in both zero-resource and minimal-resource scenarios.
|
31
|
An Integrated Cluster Detection, Optimization, and Interpretation Approach for Financial Data. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:13848-13861. [PMID: 34550896 DOI: 10.1109/tcyb.2021.3109066] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 05/23/2023]
Abstract
In many financial applications, such as fraud detection, reject inference, and credit evaluation, detecting clusters automatically is critical because it helps to understand the subpatterns of the data that can be used to infer users' behaviors and identify potential risks. Due to the complexity of human behaviors and changing social environments, the distributions of financial data are usually complex and it is challenging to find clusters and give reasonable interpretations. The goal of this study is to develop an integrated approach to detect clusters in financial data, and optimize the scope of the clusters such that the clusters can be easily interpreted. Specifically, we first proposed a new cluster quality evaluation criterion, which is free from large-scale computation and can guide base clustering algorithms such as k-Means to detect hyperellipsoidal clusters adaptively. Then, we designed a new solver for a revised support vector data description model, which efficiently refines the centroids and scopes of the detected clusters to make the clusters tighter such that the data in the clusters share greater similarities, and thus, the clusters can be easily interpreted with eigenvectors. Using ten financial datasets, the experiments showed that the proposed algorithm can efficiently find a reasonable number of clusters. The proposed approach is suitable for large-scale financial datasets whose features are meaningful, and also applicable to financial mining tasks, such as data distribution interpretation and anomaly detection.
|
32
|
Privacy and Robustness in Federated Learning: Attacks and Defenses. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:1-21. [PMID: 36355741 DOI: 10.1109/tnnls.2022.3216981] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
As data are increasingly being stored in different silos and societies are becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this article, we conduct a comprehensive survey on privacy and robustness in FL over the past five years. Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) privacy attacks and defenses; and 3) poisoning attacks and defenses, we provide an accessible review of this important topic. We highlight the intuitions, key techniques, and fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions toward robust and privacy-preserving FL, and their interplays with the multidisciplinary goals of FL.
|
33
|
VideoDG: Generalizing Temporal Relations in Videos to Novel Domains. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:7989-8004. [PMID: 34596532 DOI: 10.1109/tpami.2021.3116945] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
This paper introduces video domain generalization where most video classification networks degenerate due to the lack of exposure to the target domains of divergent distributions. We observe that the global temporal features are less generalizable, due to the temporal domain shift that videos from other unseen domains may have an unexpected absence or misalignment of the temporal relations. This finding has motivated us to solve video domain generalization by effectively learning the local-relation features of different timescales that are more generalizable, and exploiting them along with the global-relation features to maintain the discriminability. This paper presents the VideoDG framework with two technical contributions. The first is a new deep architecture named the Adversarial Pyramid Network, which improves the generalizability of video features by capturing the local-relation, global-relation, and cross-relation features progressively. On the basis of pyramid features, the second contribution is a new and robust approach of adversarial data augmentation that can bridge different video domains by improving the diversity and quality of augmented data. We construct three video domain generalization benchmarks in which domains are divided according to different datasets, different consequences of actions, or different camera views, respectively. VideoDG consistently outperforms the combinations of previous video classification models and existing domain generalization methods on all benchmarks.
|
34
|
Abstract
Graph Neural Networks (GNNs) have been widely used for the representation learning of various structured graph data, typically through message passing among nodes by aggregating their neighborhood information via different operations. While promising, most existing GNNs oversimplify the complexity and diversity of the edges in the graph and thus are inefficient to cope with ubiquitous heterogeneous graphs, which are typically in the form of multi-relational graph representations. In this article, we propose RioGNN, a novel Reinforced, recursive, and flexible neighborhood selection guided multi-relational Graph Neural Network architecture, to navigate complexity of neural network structures whilst maintaining relation-dependent representations. We first construct a multi-relational graph, according to the practical task, to reflect the heterogeneity of nodes, edges, attributes, and labels. To avoid the embedding over-assimilation among different types of nodes, we employ a label-aware neural similarity measure to ascertain the most similar neighbors based on node attributes. A reinforced relation-aware neighbor selection mechanism is developed to choose the most similar neighbors of a targeting node within a relation before aggregating all neighborhood information from different relations to obtain the eventual node embedding. Particularly, to improve the efficiency of neighbor selecting, we propose a new recursive and scalable reinforcement learning framework with estimable depth and width for different scales of multi-relational graphs. RioGNN can learn more discriminative node embedding with enhanced explainability due to the recognition of individual importance of each relation via the filtering threshold mechanism. Comprehensive experiments on real-world graph data and practical tasks demonstrate the advancements of effectiveness, efficiency, and the model explainability, as opposed to other comparative GNN models.
|
35
|
A resource scheduling method for reliable and trusted distributed composite services in cloud environment based on deep reinforcement learning. Front Genet 2022; 13:964784. [PMID: 36299577 PMCID: PMC9588937 DOI: 10.3389/fgene.2022.964784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 09/21/2022] [Indexed: 11/13/2022] Open
Abstract
With the vigorous development of Internet technology, applications are increasingly migrating to the cloud. Cloud, a distributed network environment, has been widely extended to many fields such as digital finance, supply chain management, and biomedicine. In order to meet the needs of the rapid development of the modern biomedical industry, the biological cloud platform is an inevitable choice for the integration and analysis of medical information. It improves the work efficiency of the biological information system and also realizes reliable and credible intelligent processing of biological resources. Cloud services in bioinformatics are mainly for the processing of biological data, such as the analysis and processing of genes, the testing and detection of human tissues and organs, and the storage and transportation of vaccines. Biomedical companies form a data chain on the cloud, and they provide services and transfer data to each other to create composite services. Therefore, our motivation is to improve process efficiency of biological cloud services. Users’ business requirements have become complicated and diversified, which puts forward higher requirements for service scheduling strategies in cloud computing platforms. In addition, deep reinforcement learning shows strong perception and continuous decision-making capabilities in automatic control problems, which provides a new idea and method for solving the service scheduling and resource allocation problems in the cloud computing field. Therefore, this paper designs a composite service scheduling model under the containers instance mode which hybrids reservation and on-demand. The containers in the cluster are divided into two instance modes: reservation and on-demand. A composite service is described as a three-level structure: a composite service consists of multiple services, and a service consists of multiple service instances, where the service instance is the minimum scheduling unit. 
In addition, an improved Deep Q-Network (DQN) algorithm is proposed and applied to the scheduling algorithm of composite services. The experimental results show that applying our improved DQN algorithm to the composite services scheduling problem in the container cloud environment can effectively reduce the completion time of the composite services. Meanwhile, the method improves Quality of Service (QoS) and resource utilization in the container cloud environment.
|
36
|
Fairness in graph-based semi-supervised learning. Knowl Inf Syst 2022. [DOI: 10.1007/s10115-022-01738-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Machine learning is widely deployed in society, unleashing its power in a wide range of applications owing to the advent of big data. One emerging problem faced by machine learning is the discrimination from data, and such discrimination is reflected in the eventual decisions made by the algorithms. A recent study has proved that increasing the size of training (labeled) data will promote the fairness criteria with model performance being maintained. In this work, we aim to explore a more general case where quantities of unlabeled data are provided, indeed leading to a new form of learning paradigm, namely fair semi-supervised learning. Given the popularity of graph-based approaches in semi-supervised learning, we study this problem on both the conventional label propagation method and graph neural networks, where various fairness criteria can be flexibly integrated. Our developed algorithms are proved to be non-trivial extensions to the existing supervised models with fairness constraints. Extensive experiments on real-world datasets show that our methods achieve a better trade-off between classification accuracy and fairness than the compared baselines.
|
37
|
An Adaptive Graph Pre-training Framework for Localized Collaborative Filtering. ACM T INFORM SYST 2022. [DOI: 10.1145/3555372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
Abstract
Graph neural networks (GNNs) have been widely applied in the recommendation tasks and have achieved very appealing performance. However, most GNN-based recommendation methods suffer from the problem of data sparsity in practice. Meanwhile, pre-training techniques have achieved great success in mitigating data sparsity in various domains such as natural language processing (NLP) and computer vision (CV). Thus, graph pre-training has the great potential to alleviate data sparsity in GNN-based recommendations. However, pre-training GNNs for recommendations face unique challenges. For example, user-item interaction graphs in different recommendation tasks have distinct sets of users and items, and they often present different properties. Therefore, the successful mechanisms commonly used in NLP and CV to transfer knowledge from pre-training tasks to downstream tasks such as sharing learned embeddings or feature extractors are not directly applicable to existing GNN-based recommendations models. To tackle these challenges, we delicately design an adaptive graph pre-training framework for localized collaborative filtering (ADAPT). It does not require transferring user/item embeddings, and is able to capture both the common knowledge across different graphs and the uniqueness for each graph simultaneously. Extensive experimental results have demonstrated the effectiveness and superiority of ADAPT.
|
38
|
Abstract
Multiview subspace clustering (MVSC) is a recently emerging technique that aims to discover the underlying subspace in multiview data and thereby cluster the data based on the learned subspace. Though quite a few MVSC methods have been proposed in recent years, most of them cannot explicitly preserve the locality in the learned subspaces and also neglect the subspacewise grouping effect, which restricts their ability of multiview subspace learning. To address this, in this article, we propose a novel MVSC with grouping effect (MvSCGE) approach. Particularly, our approach simultaneously learns the multiple subspace representations for multiple views with smooth regularization, and then exploits the subspacewise grouping effect in these learned subspaces by means of a unified optimization framework. Meanwhile, the proposed approach is able to ensure the cross-view consistency and learn a consistent cluster indicator matrix for the final clustering results. Extensive experiments on several benchmark datasets have been conducted to validate the superiority of the proposed approach.
|
39
|
Node Pair Information Preserving Network Embedding Based on Adversarial Networks. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5908-5922. [PMID: 33284768 DOI: 10.1109/tcyb.2020.3035066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Network embedding aims to learn the low-dimensional node representations for networks, which has attracted an increasing amount of attention in recent years. Most existing efforts in this field attempt to embed the network based on node similarity, which generally relies on edge existence statistics of the network. Instead of relying on the global edge existence statistics for every node pair, in this article, we utilize the information between a pair of nodes in a local way and propose a model, called node pair information preserving network embedding (NINE), based on adversarial networks. The main idea lies in preserving the node pair information (NI) by means of adversarial networks. The architecture of the proposed NINE model consists of three main components, namely: 1) NI embedder; 2) NI generator; and 3) NI discriminator. In the NI embedder, to avoid the complicated similarity calculation for a pair of nodes, the original NI vector calculated from the direct neighbor information of the two nodes is adopted as features, and the edge existence information is taken as labels to learn the embedded NI vector in a supervised learning manner. The second component is the NI generator, which takes the original node representation vectors of a node pair as input and outputs the generated NI vector. In order to make the generated NI vector follow the same distribution as the corresponding embedded NI vector, the generative adversarial network (GAN) is adopted, resulting in the third component, called the NI discriminator. Extensive experiments are conducted on seven real-world datasets in three downstream tasks, namely: 1) network reconstruction; 2) link prediction; and 3) node classification. Comparison results with seven state-of-the-art models demonstrate the effectiveness, efficiency, and rationality of our model.
|
40
|
Deep learning for drug repurposing: Methods, databases, and applications. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1597] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/15/2022]
|
41
|
Deep reinforcement learning guided graph neural networks for brain network analysis. Neural Netw 2022; 154:56-67. [DOI: 10.1016/j.neunet.2022.06.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Revised: 05/25/2022] [Accepted: 06/28/2022] [Indexed: 10/17/2022]
|
42
|
Multivariate Correlation-aware Spatio-temporal Graph Convolutional Networks for Multi-scale Traffic Prediction. ACM T INTEL SYST TEC 2022. [DOI: 10.1145/3469087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
Traffic flow prediction based on vehicle trajectories collected from installed GPS devices is critically important to Intelligent Transportation Systems (ITS). One limitation of existing traffic prediction models is that they mostly focus on predicting road-segment level traffic conditions, which can be considered a fine-grained prediction. In many scenarios, however, a coarse-grained prediction, such as predicting the traffic flows among different urban areas covering multiple road links, is also required to help governments better understand traffic conditions from a macroscopic point of view. This is especially useful in urban planning and public transportation planning. Another limitation is that the correlations among different types of traffic-related features are largely ignored. For example, traffic flow and traffic speed are usually negatively correlated. Existing works regard these traffic-related features as independent features without considering their correlations. In this article, we study, for the first time, the novel problem of multivariate correlation-aware multi-scale traffic flow prediction, and we propose feature correlation-aware spatio-temporal graph convolutional networks, named MC-STGCN, to effectively address it. Specifically, given a road graph, we first construct a coarse-grained road graph based on both the topology closeness and the traffic flow similarity among the nodes (road links). Then a cross-scale spatial-temporal feature learning and fusion technique is proposed for dealing with both the fine- and coarse-grained traffic data. In the spatial domain, a cross-scale GCN is proposed to learn the multi-scale spatial features jointly and fuse them together. In the temporal domain, a cross-scale temporal network built on hierarchical attention is designed for effectively capturing intra- and inter-scale temporal correlations. To effectively capture the feature correlations, a feature correlation learning component is also designed. Finally, a structural constraint is introduced to make the predictions on the two scales of traffic data consistent. We conduct extensive evaluations over two real traffic datasets, and the results demonstrate the superior performance of the proposal on both fine- and coarse-grained traffic predictions.
|
43
|
Visual Sentiment Analysis With Social Relations-Guided Multiattention Networks. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:4472-4484. [PMID: 33175687 DOI: 10.1109/tcyb.2020.3027766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
These days, social media users tend to express their feelings by sharing images online. Capturing the emotions embedded in these social images poses significant research challenges and has practical value. Most existing works concentrate on extracting the visual feature from a global view, while ignoring the fact that visual objects are also rich in emotion. How to leverage the multilevel visual features to improve the sentiment analysis performance is important yet challenging. Besides, existing works view each social image as an independent sample while ignoring the rich correlations among social images, which may be helpful in detecting visual emotion. In this article, we propose a novel model called social relations-guided multiattention networks (SRGMANs) to incorporate both the multilevel (region-level and object-level) visual features of a single image and the correlations among multiple social images to conduct visual sentiment analysis. Specifically, we first construct a heterogeneous network consisting of various types of social relations and introduce a heterogeneous network embedding method to learn the network representation for each image. Then, two visual attention branches (region attention network and object attention network) are devised to extract emotional and discriminative visual features. For each branch, we design a self-attention module to capture the emotional dependencies among visual parts. Besides, a network-guided attention module is also designed in each branch to focus on more network-related emotional visual parts with the guidance of the topology information. Finally, the attended visual features from the two attention models, together with network representation features, are combined within a holistic framework to predict the sentiment of social images. Extensive experiments demonstrate the superiority of our model on three benchmark datasets.
|
44
|
Joint Stance and Rumor Detection in Hierarchical Heterogeneous Graph. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:2530-2542. [PMID: 34714751 DOI: 10.1109/tnnls.2021.3114027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Recently, large volumes of false or unverified information (e.g., fake news and rumors) have appeared frequently in emerging social media; such content is often discussed on a large scale and widely disseminated, with harmful consequences. Many studies on rumor detection indicate that the stance distribution of posts is closely related to the rumor veracity. However, these two tasks are generally considered separately or handled with just a shared encoder/layer via multitask learning, without exploring the more profound correlation between them. In particular, the performance of existing methods relies heavily on the quality of hand-crafted features and the quantity of labeled data, which is not conducive to early rumor detection and few-shot detection. In this article, we construct a hierarchical heterogeneous graph by associating posts containing the same high-frequency words to facilitate cross-topic feature propagation and jointly formulate stance and rumor detection as multistage classification tasks. To realize the updating of node embeddings jointly driven by stance and rumor detection, we propose a multigraph neural network framework, which can more flexibly capture the attribute and structure information of the context. Experiments on real datasets collected from Twitter and Reddit show that our method outperforms state-of-the-art methods by a large margin on both stance and rumor detection. The experimental results also show that our method has better interpretability and requires less labeled data.
|
45
|
An Autoencoder Framework With Attention Mechanism for Cross-Domain Recommendation. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5229-5241. [PMID: 33156800 DOI: 10.1109/tcyb.2020.3029002] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
In recent years, recommender systems have been widely used in online platforms, where they extract useful information from giant volumes of data and recommend suitable items to users according to their preferences. However, recommender systems usually suffer from sparsity and cold-start problems. Cross-domain recommendation, as a particular example of transfer learning, has been used to solve the aforementioned problems. However, many existing cross-domain recommendation approaches are based on matrix factorization, which can only learn the shallow and linear characteristics of users and items. Therefore, in this article, we propose a novel autoencoder framework with an attention mechanism (AAM) for cross-domain recommendation, which can transfer and fuse information between different domains and make a more accurate rating prediction. The main idea of the proposed framework lies in utilizing an autoencoder, a multilayer perceptron, and self-attention to extract user and item features, learn the user and item-latent factors, and fuse the user-latent factors from different domains, respectively. In addition, to learn the affinity of the user-latent factors between different domains at a multiaspect level, we also strengthen the self-attention mechanism by using multihead self-attention and propose AAM++. Experiments conducted on two real-world datasets empirically demonstrate that our proposed methods outperform the state-of-the-art methods in cross-domain recommendation and AAM++ performs better than AAM on sparse and large-scale datasets.
|
46
|
Differential Advising in Multiagent Reinforcement Learning. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5508-5521. [PMID: 33232260 DOI: 10.1109/tcyb.2020.3034424] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
Agent advising is one of the main approaches to improve agent learning performance by enabling agents to share advice. Existing advising methods have a common limitation that an adviser agent can offer advice to an advisee agent only if the advice is created in the same state as the advisee's state. However, in complex environments, it is a very strong requirement that two states are the same, because a state may consist of multiple dimensions and two states being the same means that all these dimensions in the two states are correspondingly identical. Therefore, this requirement may limit the applicability of existing advising methods to complex environments. In this article, inspired by the differential privacy scheme, we propose a differential advising method that relaxes this requirement by enabling agents to use advice in a state even if the advice is created in a slightly different state. Compared with the existing methods, agents using the proposed method have more opportunity to take advice from others. This article is the first to adopt the concept of differential privacy on advising to improve agent learning performance instead of addressing security issues. The experimental results demonstrate that the proposed method is more efficient in complex environments than the existing methods.
|
47
|
Abstract
Text classification is the most fundamental and essential task in natural language processing. The last decade has seen a surge of research in this area due to the unprecedented success of deep learning. Numerous methods, datasets, and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021, ranging from traditional models to deep learning. We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification. We then discuss each of these categories in detail, dealing with both the technical developments and benchmark datasets that support tests of predictions. A comprehensive comparison between different techniques, together with an account of the pros and cons of various evaluation metrics, is also provided in this survey. Finally, we conclude by summarizing key implications, future research directions, and the challenges facing the research area.
|
48
|
A survey on heterogeneous information network based recommender systems: Concepts, methods, applications and resources. AI OPEN 2022. [DOI: 10.1016/j.aiopen.2022.03.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/18/2022] Open
|
49
|
Federated Social Recommendation with Graph Neural Network. ACM T INTEL SYST TEC 2022. [DOI: 10.1145/3501815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
Recommender systems have become widespread; they are designed to predict users’ potential interests in items by learning embeddings. Recent developments in Graph Neural Networks (GNNs) also provide recommender systems with powerful backbones to learn embeddings from a user-item graph. However, leveraging only the user-item interactions suffers from the cold-start issue due to the difficulty of data collection. Hence, current endeavors propose fusing social information with user-item interactions to alleviate it, which is the social recommendation problem. Existing work employs GNNs to aggregate both social links and user-item interactions simultaneously. However, these methods all require centralized storage of the social links and item interactions of users, which leads to privacy concerns. Additionally, under the strict privacy protection of the General Data Protection Regulation, centralized data storage may not be feasible in the future, which calls for a decentralized framework for social recommendation.
As a result, we design a federated learning recommender system for the social recommendation task, which is rather challenging because of its heterogeneity, personalization, and privacy protection requirements. To this end, we devise a novel framework, Federated Social recommendation with Graph neural network (FeSoG). Firstly, FeSoG adopts relational attention and aggregation to handle heterogeneity. Secondly, FeSoG infers user embeddings using local data to retain personalization. Last but not least, the proposed model employs pseudo-labeling techniques with item sampling to protect privacy and enhance training. Extensive experiments on three real-world datasets justify the effectiveness of FeSoG in completing social recommendation and privacy protection. To the best of our knowledge, this is the first work to propose a federated learning framework for social recommendation.
|
50
|
A Survey on Knowledge Graphs: Representation, Acquisition, and Applications. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:494-514. [PMID: 33900922 DOI: 10.1109/tnnls.2021.3070843] [Citation(s) in RCA: 144] [Impact Index Per Article: 72.0] [Indexed: 06/12/2023]
Abstract
Human knowledge provides a formal understanding of the world. Knowledge graphs that represent structural relations between entities have become an increasingly popular research direction toward cognition and human-level intelligence. In this survey, we provide a comprehensive review of the knowledge graph covering overall research topics about: 1) knowledge graph representation learning; 2) knowledge acquisition and completion; 3) temporal knowledge graph; and 4) knowledge-aware applications, and we summarize recent breakthroughs and prospective directions to facilitate future research. We propose a full-view categorization and new taxonomies on these topics. Knowledge graph embedding is organized from four aspects: representation space, scoring function, encoding models, and auxiliary information. For knowledge acquisition, especially knowledge graph completion, embedding methods, path inference, and logical rule reasoning are reviewed. We further explore several emerging topics, including metarelational learning, commonsense reasoning, and temporal knowledge graphs. To facilitate future research on knowledge graphs, we also provide a curated collection of data sets and open-source libraries on different tasks. In the end, we provide a thorough outlook on several promising research directions.
|