1
Ran Y, Xu XK, Jia T. The maximum capability of a topological feature in link prediction. PNAS NEXUS 2024; 3:pgae113. [PMID: 38528954 PMCID: PMC10962729 DOI: 10.1093/pnasnexus/pgae113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 02/21/2024] [Indexed: 03/27/2024]
Abstract
Networks offer a powerful approach to modeling complex systems by representing the underlying set of pairwise interactions. Link prediction is the task that predicts links of a network that are not directly visible, with profound applications in biological, social, and other complex systems. Despite intensive utilization of the topological feature in this task, it is unclear to what extent a feature can be leveraged to infer missing links. Here, we aim to unveil the capability of a topological feature in link prediction by identifying its prediction performance upper bound. We introduce a theoretical framework that is compatible with different indexes to gauge the feature, different prediction approaches to utilize the feature, and different metrics to quantify the prediction performance. The maximum capability of a topological feature follows a simple yet theoretically validated expression, which only depends on the extent to which the feature is held in missing and nonexistent links. Because a family of indexes based on the same feature shares the same upper bound, the potential of all others can be estimated from one single index. Furthermore, a feature's capability is lifted in the supervised prediction, which can be mathematically quantified, allowing us to estimate the benefit of applying machine learning algorithms. The universality of the pattern uncovered is empirically verified by 550 structurally diverse networks. The findings have applications in feature and method selection, and shed light on network characteristics that make a topological feature effective in link prediction.
Affiliation(s)
- Yijun Ran
  - College of Computer and Information Science, Southwest University, Chongqing 400715, P.R. China
  - Center for Computational Communication Research, Beijing Normal University, Zhuhai 519087, P.R. China
  - School of Journalism and Communication, Beijing Normal University, Beijing 100875, P.R. China
- Xiao-Ke Xu
  - Center for Computational Communication Research, Beijing Normal University, Zhuhai 519087, P.R. China
  - School of Journalism and Communication, Beijing Normal University, Beijing 100875, P.R. China
- Tao Jia
  - College of Computer and Information Science, Southwest University, Chongqing 400715, P.R. China
2
Collaboration prediction based on multilayer all-author tripartite citation networks: A case study of gene editing. J Informetr 2023. [DOI: 10.1016/j.joi.2022.101374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
3
Wang CD, Shi W, Huang L, Lin KY, Huang D, Yu PS. Node Pair Information Preserving Network Embedding Based on Adversarial Networks. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5908-5922. [PMID: 33284768 DOI: 10.1109/tcyb.2020.3035066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Network embedding aims to learn the low-dimensional node representations for networks, which has attracted an increasing amount of attention in recent years. Most existing efforts in this field attempt to embed the network based on node similarity, which generally relies on edge existence statistics of the network. Instead of relying on the global edge existence statistics for every node pair, in this article, we utilize the information between a pair of nodes in a local way and propose a model, called node pair information preserving network embedding (NINE), based on adversarial networks. The main idea lies in preserving the node pair information (NI) by means of adversarial networks. The architecture of the proposed NINE model consists of three main components, namely: 1) NI embedder; 2) NI generator; and 3) NI discriminator. In the NI embedder, to avoid the complicated similarity calculation for a pair of nodes, the original NI vector calculated from the direct neighbor information of the two nodes is adopted as features, and the edge existence information is taken as labels to learn the embedded NI vector in a supervised learning manner. The second component is the NI generator, which takes the original node representation vectors of a node pair as input and outputs the generated NI vector. In order to make the generated NI vector follow the same distribution of the corresponding embedded NI vector, the generative adversarial network (GAN) is adopted, resulting in the third component, called the NI discriminator. Extensive experiments are conducted on seven real-world datasets in three downstream tasks, namely: 1) network reconstruction; 2) link prediction; and 3) node classification. Comparison results with seven state-of-the-art models demonstrate the effectiveness, efficiency, and rationality of our model.
4
Su X, Hu L, You Z, Hu P, Zhao B. Attention-based Knowledge Graph Representation Learning for Predicting Drug-drug Interactions. Brief Bioinform 2022; 23:6572660. [PMID: 35453147 DOI: 10.1093/bib/bbac140] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 12/22/2021] [Revised: 03/02/2022] [Accepted: 03/27/2022] [Indexed: 02/06/2023]
Abstract
Drug-drug interactions (DDIs) are known as the main cause of life-threatening adverse events, and their identification is a key task in drug development. Existing computational algorithms mainly solve this problem by using advanced representation learning techniques. Though effective, few of them are capable of performing their tasks on biomedical knowledge graphs (KGs) that provide more detailed information about drug attributes and drug-related triple facts. In this work, an attention-based KG representation learning framework, namely DDKG, is proposed to fully utilize the information of KGs for improved performance of DDI prediction. In particular, DDKG first initializes the representations of drugs with their embeddings derived from drug attributes with an encoder-decoder layer, and then learns the representations of drugs by recursively propagating and aggregating first-order neighboring information along top-ranked network paths determined by neighboring node embeddings and triple facts. Last, DDKG estimates the probability of being interacting for pairwise drugs with their representations in an end-to-end manner. To evaluate the effectiveness of DDKG, extensive experiments have been conducted on two practical datasets with different sizes, and the results demonstrate that DDKG is superior to state-of-the-art algorithms on the DDI prediction task in terms of different evaluation metrics across all datasets.
Affiliation(s)
- Xiaorui Su
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Lun Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Zhuhong You
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
- Pengwei Hu
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
- Bowei Zhao
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
5
Yu H, Dong W, Shi J. RANEDDI: Relation-aware network embedding for drug-drug interaction prediction. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2021.09.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/26/2022]
6
Dai Q, Shen X, Zheng Z, Zhang L, Li Q, Wang D. Adversarial training regularization for negative sampling based network embedding. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.07.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
7
Guo JN, Mao XL, Lin SY, Wei W, Huang H. Deep kernel supervised hashing for node classification in structural networks. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.03.068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
8
Dhayne H, Kilany R, Haque R, Taher Y. EMR2vec: Bridging the gap between patient data and clinical trial. COMPUTERS & INDUSTRIAL ENGINEERING 2021; 156:107236. [PMID: 33746344 PMCID: PMC7959675 DOI: 10.1016/j.cie.2021.107236] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/03/2020] [Revised: 02/17/2021] [Accepted: 03/08/2021] [Indexed: 06/12/2023]
Abstract
The human suffering from diseases caused by life-threatening viruses such as SARS, Ebola, and COVID-19 motivated many of us to study and discover the best means to harness the potential of data integration to assist clinical researchers to curb these viruses. Integrating patients data with clinical trials data is enormously promising as it provides a comprehensive knowledge base that accelerates the clinical research response-ability to tackle emerging infectious disease outbreaks. This work introduces EMR2vec, a platform that customises advanced NLP, machine learning and semantic web techniques to link potential patients to suitable clinical trials. Linking these two different but complementary datasets allows clinicians and researchers to compare patients to clinical research opportunities or to automatically select patients for personalized clinical care. The platform derives a 'bag of medical terms' (BoMT) from eligibility criteria by normalizing extracted entities through SNOMED-CT ontology. With the usage of BoMT, an ontological reasoning method is proposed to represent EMR and clinical trials in a vector space model. The platform presents a matching process that reduces vector dimensionality using a neural network, then applies orthogonality projection to measure the similarity between vectors. Finally, the proposed EMR2vec platform is evaluated with an extendable prototype based on Big data tools.
Affiliation(s)
- Rima Kilany
  - Saint Joseph University, Mar Roukos, Beirut, Lebanon
- Rafiqul Haque
  - Intelligencia, 66 Avenue des Champs Elysees, Paris, France
- Yehia Taher
  - David lab, 45 Avenue des Etats Unis, Versailles, France
9
Zhang W, Guo X, Wang W, Tian Q, Pan L, Jiao P. Role-based network embedding via structural features reconstruction with degree-regularized constraint. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106872] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/22/2022]
12
Zhang T, Yang X, Wang X, Wang R. Deep joint neural model for single image haze removal and color correction. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.05.105] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
14
Mata ASD. Complex Networks: a Mini-review. BRAZILIAN JOURNAL OF PHYSICS 2020; 50:658-672. [PMCID: PMC7357442 DOI: 10.1007/s13538-020-00772-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/18/2020] [Indexed: 06/13/2023]
Abstract
Network analysis is a powerful tool that provides a fruitful framework for describing phenomena related to social, technological, and many other real-world complex systems. In this paper, we present a brief review of complex networks, including fundamental quantities, examples of network models, and the essential role of network topology in the investigation of dynamical processes such as epidemics, rumor spreading, and synchronization. Many advances have been made in this field, and many other authors have also reviewed the main contributions in this area over the years. However, we offer an overview from a different perspective. Our aim is to provide basic information to a broad audience and more detailed references for those who would like to explore the topic in greater depth.