Ding K, Nouri E, Zheng G, Liu H, White R. Toward Robust Graph Semi-Supervised Learning Against Extreme Data Scarcity. IEEE Transactions on Neural Networks and Learning Systems. 2024;35:11661-11670. PMID: 38421848. DOI: 10.1109/TNNLS.2024.3351938.
[Indexed: 03/02/2024]
Abstract
The success of graph neural networks (GNNs) in graph-based web mining relies heavily on abundant human-annotated data, which is laborious to obtain in practice. When only a few labeled nodes are available, improving the robustness of the learned models is key to achieving replicable and sustainable graph semi-supervised learning. Although self-training is a powerful technique for semi-supervised learning, applying it to graph-structured data may fail because 1) larger receptive fields are not leveraged to capture long-range node interactions, which exacerbates the difficulty of propagating feature-label patterns from labeled to unlabeled nodes, and 2) limited labeled data makes it challenging to learn well-separated decision boundaries for different node classes without explicitly capturing the underlying semantic structure. To address the challenges of capturing informative structural and semantic knowledge, we propose a new graph data augmentation framework, augmented graph self-training (AGST), which adds two new augmentation modules (structural and semantic) on top of a decoupled graph self-training (GST) backbone. In this work, we investigate whether this framework can learn a robust graph predictive model in the low-data regime. We conduct comprehensive evaluations on semi-supervised node classification under different scenarios of limited labeled-node data. The experimental results demonstrate the unique contributions of the proposed data augmentation framework to node classification with scarce labels.
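To make the self-training idea in the abstract concrete, the following is a minimal, hedged sketch of generic confidence-thresholded pseudo-labeling for node classification. It is not the AGST method from the paper: the nearest-centroid classifier stands in for a trained GNN, and the features, threshold `tau`, and loop structure are illustrative assumptions only.

```python
import math

def fit_centroids(X, y):
    """Mean feature vector per class (a simple stand-in for training a GNN)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        s = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(centroids, x):
    """Return (label, confidence), with confidence from a softmax
    over negative distances to the class centroids."""
    exps = {c: math.exp(-math.dist(x, mu)) for c, mu in centroids.items()}
    z = sum(exps.values())
    label = max(exps, key=exps.get)
    return label, exps[label] / z

def self_train(X, labels, tau=0.9, max_rounds=10):
    """labels: class id for labeled nodes, None for unlabeled ones.
    Each round: fit on all currently-labeled nodes, then adopt
    pseudo-labels whose confidence exceeds tau. Stop when no new
    pseudo-labels are added."""
    labels = list(labels)
    for _ in range(max_rounds):
        known = [(x, y) for x, y in zip(X, labels) if y is not None]
        centroids = fit_centroids([x for x, _ in known],
                                  [y for _, y in known])
        added = 0
        for i, x in enumerate(X):
            if labels[i] is None:
                y_hat, conf = predict(centroids, x)
                if conf >= tau:  # only adopt confident pseudo-labels
                    labels[i] = y_hat
                    added += 1
        if added == 0:
            break
    return labels

# Toy example: two labeled nodes, two unlabeled nodes near them.
X = [[0, 0], [10, 10], [1, 1], [9, 9]]
final = self_train(X, [0, 1, None, None])
```

The confidence threshold is the crux: with extreme label scarcity, early pseudo-labels near the decision boundary can be wrong and then reinforced in later rounds, which is the failure mode the paper's structural and semantic augmentation modules are designed to mitigate.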