1,320 results on '"Gao, Junbin"'
Search Results
2. Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning
- Author
-
Lin, Lequan, Shi, Dai, Han, Andi, Wang, Zhiyong, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Graph Neural Networks (GNNs) are proficient in graph representation learning and achieve promising performance on versatile tasks such as node classification and link prediction. Usually, a comprehensive hyperparameter tuning is essential for fully unlocking GNN's top performance, especially for complicated tasks such as node classification on large graphs and long-range graphs. This is usually associated with high computational and time costs and careful design of appropriate search spaces. This work introduces a graph-conditioned latent diffusion framework (GNN-Diff) to generate high-performing GNNs based on the model checkpoints of sub-optimal hyperparameters selected by a light-tuning coarse search. We validate our method through 166 experiments across four graph tasks: node classification on small, large, and long-range graphs, as well as link prediction. Our experiments involve 10 classic and state-of-the-art target models and 20 publicly available datasets. The results consistently demonstrate that GNN-Diff: (1) boosts the performance of GNNs with efficient hyperparameter tuning; and (2) presents high stability and generalizability on unseen data across multiple generation runs. The code is available at https://github.com/lequanlin/GNN-Diff.
- Published
- 2024
3. When Graph Neural Networks Meet Dynamic Mode Decomposition
- Author
-
Shi, Dai, Lin, Lequan, Han, Andi, Wang, Zhiyong, Guo, Yi, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Graph Neural Networks (GNNs) have emerged as fundamental tools for a wide range of prediction tasks on graph-structured data. Recent studies have drawn analogies between GNN feature propagation and diffusion processes, which can be interpreted as dynamical systems. In this paper, we delve deeper into this perspective by connecting the dynamics in GNNs to modern Koopman theory and its numerical method, Dynamic Mode Decomposition (DMD). We illustrate how DMD can estimate a low-rank, finite-dimensional linear operator based on multiple states of the system, effectively approximating potential nonlinear interactions between nodes in the graph. This approach allows us to capture complex dynamics within the graph accurately and efficiently. We theoretically establish a connection between the DMD-estimated operator and the original dynamic operator between system states. Building upon this foundation, we introduce a family of DMD-GNN models that effectively leverage the low-rank eigenfunctions provided by the DMD algorithm. We further discuss the potential of enhancing our approach by incorporating domain-specific constraints such as symmetry into the DMD computation, allowing the corresponding GNN models to respect known physical properties of the underlying system. Our work paves the path for applying advanced dynamical system analysis tools via GNNs. We validate our approach through extensive experiments on various learning tasks, including directed graphs, large-scale graphs, long-range interactions, and spatial-temporal graphs. We also empirically verify that our proposed models can serve as powerful encoders for link prediction tasks. The results demonstrate that our DMD-enhanced GNNs achieve state-of-the-art performance, highlighting the effectiveness of integrating DMD into GNN frameworks.
- Published
- 2024
4. A Riemannian Approach to Ground Metric Learning for Optimal Transport
- Author
-
Jawanpuria, Pratik, Shi, Dai, Mishra, Bamdev, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Optimal transport (OT) theory has attracted much attention in machine learning and signal processing applications. OT defines a notion of distance between probability distributions of source and target data points. A crucial factor that influences OT-based distances is the ground metric of the embedding space in which the source and target data points lie. In this work, we propose to learn a suitable latent ground metric parameterized by a symmetric positive definite matrix. We use the rich Riemannian geometry of symmetric positive definite matrices to jointly learn the OT distance along with the ground metric. Empirical results illustrate the efficacy of the learned metric in OT-based domain adaptation.
- Published
- 2024
5. STLLM-DF: A Spatial-Temporal Large Language Model with Diffusion for Enhanced Multi-Mode Traffic System Forecasting
- Author
-
Shao, Zhiqi, Xi, Haoning, Lu, Haohui, Wang, Ze, Bell, Michael G. H., and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,I.2.7 ,I.2.1 - Abstract
The rapid advancement of Intelligent Transportation Systems (ITS) presents challenges, particularly with missing data in multi-modal transportation and the complexity of handling diverse sequential tasks within a centralized framework. To address these issues, we propose the Spatial-Temporal Large Language Model Diffusion (STLLM-DF), an innovative model that leverages Denoising Diffusion Probabilistic Models (DDPMs) and Large Language Models (LLMs) to improve multi-task transportation prediction. The DDPM's robust denoising capabilities enable it to recover underlying data patterns from noisy inputs, making it particularly effective in complex transportation systems. Meanwhile, the non-pretrained LLM dynamically adapts to spatial-temporal relationships within multi-modal networks, allowing the system to efficiently manage diverse transportation tasks in both long-term and short-term predictions. Extensive experiments demonstrate that STLLM-DF consistently outperforms existing models, achieving an average reduction of 2.40\% in MAE, 4.50\% in RMSE, and 1.51\% in MAPE. This model significantly advances centralized ITS by enhancing predictive accuracy, robustness, and overall system performance across multiple tasks, thus paving the way for more effective spatio-temporal traffic forecasting through the integration of frozen transformer language models and diffusion techniques., Comment: 26 pages, 11 figures
- Published
- 2024
6. Machine Learning-Based Prediction of Key Genes Correlated to the Subretinal Lesion Severity in a Mouse Model of Age-Related Macular Degeneration
- Author
-
Yan, Kuan, Zeng, Yue, Shi, Dai, Zhang, Ting, Matsypura, Dmytro, Gillies, Mark C., Zhu, Ling, and Gao, Junbin
- Subjects
Quantitative Biology - Genomics ,Computer Science - Machine Learning - Abstract
Age-related macular degeneration (AMD) is a major cause of blindness in older adults, severely affecting vision and quality of life. Despite advances in understanding AMD, the molecular factors driving the severity of subretinal scarring (fibrosis) remain elusive, hampering the development of effective therapies. This study introduces a machine learning-based framework to predict key genes that are strongly correlated with lesion severity and to identify potential therapeutic targets to prevent subretinal fibrosis in AMD. Using an original RNA sequencing (RNA-seq) dataset from the diseased retinas of JR5558 mice, we developed a novel and specific feature engineering technique, including pathway-based dimensionality reduction and gene-based feature expansion, to enhance prediction accuracy. Two iterative experiments were conducted by leveraging Ridge and ElasticNet regression models to assess biological relevance and gene impact. The results highlight the biological significance of several key genes and demonstrate the framework's effectiveness in identifying novel therapeutic targets. The key findings provide valuable insights for advancing drug discovery efforts and improving treatment strategies for AMD, with the potential to enhance patient outcomes by targeting the underlying genetic mechanisms of subretinal lesion development.
- Published
- 2024
7. Global Stock Market Volatility Forecasting Incorporating Dynamic Graphs and All Trading Days
- Author
-
Chi, Zhengyang, Gao, Junbin, and Wang, Chao
- Subjects
Quantitative Finance - General Finance ,Economics - General Economics - Abstract
This study introduces a global stock market volatility forecasting model that enhances forecasting accuracy and practical utility in real-world financial decision-making by integrating dynamic graph structures and encompassing the union of active trading days of different stock markets. The model employs a spatial-temporal graph neural network (GNN) architecture to capture the volatility spillover effect, where shocks in one market spread to others through the interconnective global economy. By calculating the volatility spillover index to depict the volatility network as graphs, the model effectively mirrors the volatility dynamics for the chosen stock market indices. In the empirical analysis, the proposed model surpasses the benchmark model in all forecasting scenarios and is shown to be sensitive to the underlying volatility interrelationships.
- Published
- 2024
8. Hierarchical Multi-modal Transformer for Cross-modal Long Document Classification
- Author
-
Liu, Tengfei, Hu, Yongli, Gao, Junbin, Sun, Yanfeng, and Yin, Baocai
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Long Document Classification (LDC) has gained significant attention recently. However, multi-modal data in long documents such as texts and images are not being effectively utilized. Prior studies in this area have attempted to integrate texts and images in document-related tasks, but they have only focused on short text sequences and images of pages. How to classify long documents with hierarchical structure texts and embedding images is a new problem and faces multi-modal representation difficulties. In this paper, we propose a novel approach called Hierarchical Multi-modal Transformer (HMT) for cross-modal long document classification. The HMT conducts multi-modal feature interaction and fusion between images and texts in a hierarchical manner. Our approach uses a multi-modal transformer and a dynamic multi-scale multi-modal transformer to model the complex relationships between image features, and the section and sentence features. Furthermore, we introduce a new interaction strategy called the dynamic mask transfer module to integrate these two transformers by propagating features between them. To validate our approach, we conduct cross-modal LDC experiments on two newly created and two publicly available multi-modal long document datasets, and the results show that the proposed HMT outperforms state-of-the-art single-modality and multi-modality methods., Comment: IEEE Transactions on Multimedia
- Published
- 2024
9. Unleash Graph Neural Networks from Heavy Tuning
- Author
-
Lin, Lequan, Shi, Dai, Han, Andi, Wang, Zhiyong, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Graph Neural Networks (GNNs) are deep-learning architectures designed for graph-type data, where understanding relationships among individual observations is crucial. However, achieving promising GNN performance, especially on unseen data, requires comprehensive hyperparameter tuning and meticulous training. Unfortunately, these processes come with high computational costs and significant human effort. Additionally, conventional searching algorithms such as grid search may result in overfitting on validation data, diminishing generalization accuracy. To tackle these challenges, we propose a graph conditional latent diffusion framework (GNN-Diff) to generate high-performing GNNs directly by learning from checkpoints saved during a light-tuning coarse search. Our method: (1) unleashes GNN training from heavy tuning and complex search space design; (2) produces GNN parameters that outperform those obtained through comprehensive grid search; and (3) establishes higher-quality generation for GNNs compared to diffusion frameworks designed for general neural networks.
- Published
- 2024
10. ST-MambaSync: The Complement of Mamba and Transformers for Spatial-Temporal in Traffic Flow Prediction
- Author
-
Shao, Zhiqi, Yao, Xusheng, Wang, Ze, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,53A45 ,I.2.0 - Abstract
Accurate traffic flow prediction is crucial for optimizing traffic management, enhancing road safety, and reducing environmental impacts. Existing models face challenges with long sequence data, requiring substantial memory and computational resources, and often suffer from slow inference times due to the lack of a unified summary state. This paper introduces ST-MambaSync, an innovative traffic flow prediction model that combines transformer technology with the ST-Mamba block, representing a significant advancement in the field. We are the pioneers in employing the Mamba mechanism which is an attention mechanism integrated with ResNet within a transformer framework, which significantly enhances the model's explainability and performance. ST-MambaSync effectively addresses key challenges such as data length and computational efficiency, setting new benchmarks for accuracy and processing speed through comprehensive comparative analysis. This development has significant implications for urban planning and real-time traffic management, establishing a new standard in traffic flow prediction technology., Comment: 11 pages. arXiv admin note: substantial text overlap with arXiv:2404.13257
- Published
- 2024
11. ST-Mamba: Spatial-Temporal Selective State Space Model for Traffic Flow Prediction
- Author
-
Shao, Zhiqi, Bell, Michael G. H., Wang, Ze, Geers, D. Glenn, Xi, Haoning, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,53A45 ,I.2.0 - Abstract
Traffic flow prediction, a critical aspect of intelligent transportation systems, has been increasingly popular in the field of artificial intelligence, driven by the availability of extensive traffic data. The current challenges of traffic flow prediction lie in integrating diverse factors while balancing the trade-off between computational complexity and the precision necessary for effective long-range and large-scale predictions. To address these challenges, we introduce a Spatial-Temporal Selective State Space (ST-Mamba) model, which is the first to leverage the power of spatial-temporal learning in traffic flow prediction without using graph modeling. The ST-Mamba model can effectively capture the long-range dependency for traffic flow data, thereby avoiding the issue of over-smoothing. The proposed ST-Mamba model incorporates an effective Spatial-Temporal Mixer (ST-Mixer) to seamlessly integrate spatial and temporal data processing into a unified framework and employs a Spatial-Temporal Selective State Space (ST-SSM) block to improve computational efficiency. The proposed ST-Mamba model, specifically designed for spatial-temporal data, simplifies processing procedure and enhances generalization capabilities, thereby significantly improving the accuracy of long-range traffic flow prediction. Compared to the previous state-of-the-art (SOTA) model, the proposed ST-Mamba model achieves a 61.11\% improvement in computational speed and increases prediction accuracy by 0.67\%. Extensive experiments with real-world traffic datasets demonstrate that the \textsf{ST-Mamba} model sets a new benchmark in traffic flow prediction, achieving SOTA performance in computational efficiency for both long- and short-range predictions and significantly improving the overall efficiency and effectiveness of traffic management., Comment: 25 pages, 6 figures
- Published
- 2024
12. IME: Integrating Multi-curvature Shared and Specific Embedding for Temporal Knowledge Graph Completion
- Author
-
Wang, Jiapu, Cui, Zheng, Wang, Boyue, Pan, Shirui, Gao, Junbin, Yin, Baocai, and Gao, Wen
- Subjects
Computer Science - Artificial Intelligence - Abstract
Temporal Knowledge Graphs (TKGs) incorporate a temporal dimension, allowing for a precise capture of the evolution of knowledge and reflecting the dynamic nature of the real world. Typically, TKGs contain complex geometric structures, with various geometric structures interwoven. However, existing Temporal Knowledge Graph Completion (TKGC) methods either model TKGs in a single space or neglect the heterogeneity of different curvature spaces, thus constraining their capacity to capture these intricate geometric structures. In this paper, we propose a novel Integrating Multi-curvature shared and specific Embedding (IME) model for TKGC tasks. Concretely, IME models TKGs into multi-curvature spaces, including hyperspherical, hyperbolic, and Euclidean spaces. Subsequently, IME incorporates two key properties, namely space-shared property and space-specific property. The space-shared property facilitates the learning of commonalities across different curvature spaces and alleviates the spatial gap caused by the heterogeneous nature of multi-curvature spaces, while the space-specific property captures characteristic features. Meanwhile, IME proposes an Adjustable Multi-curvature Pooling (AMP) approach to effectively retain important information. Furthermore, IME innovatively designs similarity, difference, and structure loss functions to attain the stated objective. Experimental results clearly demonstrate the superior performance of IME over existing state-of-the-art TKGC models.
- Published
- 2024
13. CCDSReFormer: Traffic Flow Prediction with a Criss-Crossed Dual-Stream Enhanced Rectified Transformer Model
- Author
-
Shao, Zhiqi, Bell, Michael G. H., Wang, Ze, Geers, D. Glenn, Yao, Xusheng, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,I.2.0 - Abstract
Accurate, and effective traffic forecasting is vital for smart traffic systems, crucial in urban traffic planning and management. Current Spatio-Temporal Transformer models, despite their prediction capabilities, struggle with balancing computational efficiency and accuracy, favoring global over local information, and handling spatial and temporal data separately, limiting insight into complex interactions. We introduce the Criss-Crossed Dual-Stream Enhanced Rectified Transformer model (CCDSReFormer), which includes three innovative modules: Enhanced Rectified Spatial Self-attention (ReSSA), Enhanced Rectified Delay Aware Self-attention (ReDASA), and Enhanced Rectified Temporal Self-attention (ReTSA). These modules aim to lower computational needs via sparse attention, focus on local information for better traffic dynamics understanding, and merge spatial and temporal insights through a unique learning method. Extensive tests on six real-world datasets highlight CCDSReFormer's superior performance. An ablation study also confirms the significant impact of each component on the model's predictive accuracy, showcasing our model's ability to forecast traffic flow effectively., Comment: 18 pages
- Published
- 2024
14. Design your own universe: a physics-informed agnostic method for enhancing graph neural networks
- Author
-
Shi, Dai, Han, Andi, Lin, Lequan, Guo, Yi, Wang, Zhiyong, and Gao, Junbin
- Published
- 2024
- Full Text
- View/download PDF
15. DGNN: Decoupled Graph Neural Networks with Structural Consistency between Attribute and Graph Embedding Representations
- Author
-
Wang, Jinlu, Guo, Jipeng, Sun, Yanfeng, Gao, Junbin, Wang, Shaofan, Yang, Yachao, and Yin, Baocai
- Subjects
Computer Science - Machine Learning - Abstract
Graph neural networks (GNNs) demonstrate a robust capability for representation learning on graphs with complex structures, showcasing superior performance in various applications. The majority of existing GNNs employ a graph convolution operation by using both attribute and structure information through coupled learning. In essence, GNNs, from an optimization perspective, seek to learn a consensus and compromise embedding representation that balances attribute and graph information, selectively exploring and retaining valid information. To obtain a more comprehensive embedding representation of nodes, a novel GNNs framework, dubbed Decoupled Graph Neural Networks (DGNN), is introduced. DGNN explores distinctive embedding representations from the attribute and graph spaces by decoupled terms. Considering that semantic graph, constructed from attribute feature space, consists of different node connection information and provides enhancement for the topological graph, both topological and semantic graphs are combined for the embedding representation learning. Further, structural consistency among attribute embedding and graph embeddings is promoted to effectively remove redundant information and establish soft connection. This involves promoting factor sharing for adjacency reconstruction matrices, facilitating the exploration of a consensus and high-level correlation. Finally, a more powerful and complete representation is achieved through the concatenation of these embeddings. Experimental results conducted on several graph benchmark datasets verify its superiority in node classification task.
- Published
- 2024
16. SpecSTG: A Fast Spectral Diffusion Framework for Probabilistic Spatio-Temporal Traffic Forecasting
- Author
-
Lin, Lequan, Shi, Dai, Han, Andi, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Traffic forecasting, a crucial application of spatio-temporal graph (STG) learning, has traditionally relied on deterministic models for accurate point estimations. Yet, these models fall short of quantifying future uncertainties. Recently, many probabilistic methods, especially variants of diffusion models, have been proposed to fill this gap. However, existing diffusion methods typically deal with individual sensors separately when generating future time series, resulting in limited usage of spatial information in the probabilistic learning process. In this work, we propose SpecSTG, a novel spectral diffusion framework, to better leverage spatial dependencies and systematic patterns inherent in traffic data. More specifically, our method generates the Fourier representation of future time series, transforming the learning process into the spectral domain enriched with spatial information. Additionally, our approach incorporates a fast spectral graph convolution designed for Fourier input, alleviating the computational burden associated with existing models. Compared with state-of-the-arts, SpecSTG achieves up to 8% improvements on point estimations and up to 0.78% improvements on quantifying future uncertainties. Furthermore, SpecSTG's training and validation speed is 3.33X of the most efficient existing diffusion method for STG forecasting. The source code for SpecSTG is available at https://anonymous.4open.science/r/SpecSTG.
- Published
- 2024
17. Quasi-framelets: robust graph neural networks via adaptive framelet convolution
- Author
-
Yang, Mengxi, Shi, Dai, Zheng, Xuebin, Yin, Jie, and Gao, Junbin
- Published
- 2024
- Full Text
- View/download PDF
18. Exposition on over-squashing problem on GNNs: Current Methods, Benchmarks and Challenges
- Author
-
Shi, Dai, Han, Andi, Lin, Lequan, Guo, Yi, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Graph-based message-passing neural networks (MPNNs) have achieved remarkable success in both node and graph-level learning tasks. However, several identified problems, including over-smoothing (OSM), limited expressive power, and over-squashing (OSQ), still limit the performance of MPNNs. In particular, OSQ serves as the latest identified problem, where MPNNs gradually lose their learning accuracy when long-range dependencies between graph nodes are required. In this work, we provide an exposition on the OSQ problem by summarizing different formulations of OSQ from current literature, as well as the three different categories of approaches for addressing the OSQ problem. In addition, we also discuss the alignment between OSQ and expressive power and the trade-off between OSQ and OSM. Furthermore, we summarize the empirical methods leveraged from existing works to verify the efficiency of OSQ mitigation approaches, with illustrations of their computational complexities. Lastly, we list some open questions that are of interest for further exploration of the OSQ problem along with potential directions from the best of our knowledge.
- Published
- 2023
19. From Continuous Dynamics to Graph Neural Networks: Neural Diffusion and Beyond
- Author
-
Han, Andi, Shi, Dai, Lin, Lequan, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Statistics - Machine Learning - Abstract
Graph neural networks (GNNs) have demonstrated significant promise in modelling relational data and have been widely applied in various fields of interest. The key mechanism behind GNNs is the so-called message passing where information is being iteratively aggregated to central nodes from their neighbourhood. Such a scheme has been found to be intrinsically linked to a physical process known as heat diffusion, where the propagation of GNNs naturally corresponds to the evolution of heat density. Analogizing the process of message passing to the heat dynamics allows to fundamentally understand the power and pitfalls of GNNs and consequently informs better model design. Recently, there emerges a plethora of works that proposes GNNs inspired from the continuous dynamics formulation, in an attempt to mitigate the known limitations of GNNs, such as oversmoothing and oversquashing. In this survey, we provide the first systematic and comprehensive review of studies that leverage the continuous perspective of GNNs. To this end, we introduce foundational ingredients for adapting continuous dynamics to GNNs, along with a general framework for the design of graph neural dynamics. We then review and categorize existing works based on their driven mechanisms and underlying dynamics. We also summarize how the limitations of classic GNNs can be addressed under the continuous framework. We conclude by identifying multiple open research directions.
- Published
- 2023
20. Bregman Graph Neural Network
- Author
-
Zhai, Jiayu, Lin, Lequan, Shi, Dai, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Numerous recent research on graph neural networks (GNNs) has focused on formulating GNN architectures as an optimization problem with the smoothness assumption. However, in node classification tasks, the smoothing effect induced by GNNs tends to assimilate representations and over-homogenize labels of connected nodes, leading to adverse effects such as over-smoothing and misclassification. In this paper, we propose a novel bilevel optimization framework for GNNs inspired by the notion of Bregman distance. We demonstrate that the GNN layer proposed accordingly can effectively mitigate the over-smoothing issue by introducing a mechanism reminiscent of the "skip connection". We validate our theoretical results through comprehensive empirical studies in which Bregman-enhanced GNNs outperform their original counterparts in both homophilic and heterophilic graphs. Furthermore, our experiments also show that Bregman GNNs can produce more robust learning accuracy even when the number of layers is high, suggesting the effectiveness of the proposed method in alleviating the over-smoothing issue.
- Published
- 2023
21. Unifying over-smoothing and over-squashing in graph neural networks: A physics informed approach and beyond
- Author
-
Shao, Zhiqi, Shi, Dai, Han, Andi, Guo, Yi, Zhao, Qibin, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Graph Neural Networks (GNNs) have emerged as one of the leading approaches for machine learning on graph-structured data. Despite their great success, critical computational challenges such as over-smoothing, over-squashing, and limited expressive power continue to impact the performance of GNNs. In this study, inspired from the time-reversal principle commonly utilized in classical and quantum physics, we reverse the time direction of the graph heat equation. The resulted reversing process yields a class of high pass filtering functions that enhance the sharpness of graph node features. Leveraging this concept, we introduce the Multi-Scaled Heat Kernel based GNN (MHKG) by amalgamating diverse filtering functions' effects on node features. To explore more flexible filtering conditions, we further generalize MHKG into a model termed G-MHKG and thoroughly show the roles of each element in controlling over-smoothing, over-squashing and expressive power. Notably, we illustrate that all aforementioned issues can be characterized and analyzed via the properties of the filtering functions, and uncover a trade-off between over-smoothing and over-squashing: enhancing node feature sharpness will make model suffer more from over-squashing, and vice versa. Furthermore, we manipulate the time again to show how G-MHKG can handle both two issues under mild conditions. Our conclusive experiments highlight the effectiveness of proposed models. It surpasses several GNN baseline models in performance across graph datasets characterized by both homophily and heterophily.
- Published
- 2023
22. How Curvature Enhance the Adaptation Power of Framelet GCNs
- Author
-
Shi, Dai, Guo, Yi, Shao, Zhiqi, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Graph neural network (GNN) has been demonstrated powerful in modeling graph-structured data. However, despite many successful cases of applying GNNs to various graph classification and prediction tasks, whether the graph geometrical information has been fully exploited to enhance the learning performance of GNNs is not yet well understood. This paper introduces a new approach to enhance GNN by discrete graph Ricci curvature. Specifically, the graph Ricci curvature defined on the edges of a graph measures how difficult the information transits on one edge from one node to another based on their neighborhoods. Motivated by the geometric analogy of Ricci curvature in the graph setting, we prove that by inserting the curvature information with different carefully designed transformation function $\zeta$, several known computational issues in GNN such as over-smoothing can be alleviated in our proposed model. Furthermore, we verified that edges with very positive Ricci curvature (i.e., $\kappa_{i,j} \approx 1$) are preferred to be dropped to enhance model's adaption to heterophily graph and one curvature based graph edge drop algorithm is proposed. Comprehensive experiments show that our curvature-based GNN model outperforms the state-of-the-art baselines in both homophily and heterophily graph datasets, indicating the effectiveness of involving graph geometric information in GNNs.
- Published
- 2023
23. Frameless Graph Knowledge Distillation
- Author
-
Shi, Dai, Shao, Zhiqi, Guo, Yi, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Knowledge distillation (KD) has shown great potential for transferring knowledge from a complex teacher model to a simple student model in which the heavy learning task can be accomplished efficiently and without losing too much prediction accuracy. Recently, many attempts have been made by applying the KD mechanism to the graph representation learning models such as graph neural networks (GNNs) to accelerate the model's inference speed via student models. However, many existing KD-based GNNs utilize MLP as a universal approximator in the student model to imitate the teacher model's process without considering the graph knowledge from the teacher model. In this work, we provide a KD-based framework on multi-scaled GNNs, known as graph framelet, and prove that by adequately utilizing the graph knowledge in a multi-scaled manner provided by graph framelet decomposition, the student model is capable of adapting both homophilic and heterophilic graphs and has the potential of alleviating the over-squashing issue with a simple yet effectively graph surgery. Furthermore, we show how the graph knowledge supplied by the teacher is learned and digested by the student model via both algebra and geometry. Comprehensive experiments show that our proposed model can generate learning accuracy identical to or even surpass the teacher model while maintaining the high speed of inference.
- Published
- 2023
24. Combating Confirmation Bias: A Unified Pseudo-Labeling Framework for Entity Alignment
- Author
-
Ding, Qijie, Yin, Jie, Zhang, Daokun, and Gao, Junbin
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Entity alignment (EA) aims at identifying equivalent entity pairs across different knowledge graphs (KGs) that refer to the same real-world identity. To systematically combat confirmation bias for pseudo-labeling-based entity alignment, we propose a Unified Pseudo-Labeling framework for Entity Alignment (UPL-EA) that explicitly eliminates pseudo-labeling errors to boost the accuracy of entity alignment. UPL-EA consists of two complementary components: (1) The Optimal Transport (OT)-based pseudo-labeling uses discrete OT modeling as an effective means to enable more accurate determination of entity correspondences across two KGs and to mitigate the adverse impact of erroneous matches. A simple but highly effective criterion is further devised to derive pseudo-labeled entity pairs that satisfy one-to-one correspondences at each iteration. (2) The cross-iteration pseudo-label calibration operates across multiple consecutive iterations to further improve the pseudo-labeling precision rate by reducing the local pseudo-label selection variability with a theoretical guarantee. The two components are respectively designed to eliminate Type I and Type II pseudo-labeling errors identified through our analyse. The calibrated pseudo-labels are thereafter used to augment prior alignment seeds to reinforce subsequent model training for alignment inference. The effectiveness of UPL-EA in eliminating pseudo-labeling errors is both theoretically supported and experimentally validated. The experimental results show that our approach achieves competitive performance with limited prior alignment seeds.
- Published
- 2023
25. Variational Counterfactual Prediction under Runtime Domain Corruption
- Author
-
Wen, Hechuan, Chen, Tong, Chai, Li Kheng, Sadiq, Shazia, Gao, Junbin, and Yin, Hongzhi
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
To date, various neural methods have been proposed for causal effect estimation based on observational data, where a default assumption is the same distribution and availability of variables at both training and inference (i.e., runtime) stages. However, distribution shift (i.e., domain shift) could happen during runtime, and bigger challenges arise from the impaired accessibility of variables. This is commonly caused by increasing privacy and ethical concerns, which can make arbitrary variables unavailable in the entire runtime data and imputation impractical. We term the co-occurrence of domain shift and inaccessible variables runtime domain corruption, which seriously impairs the generalizability of a trained counterfactual predictor. To counter runtime domain corruption, we subsume counterfactual prediction under the notion of domain adaptation. Specifically, we upper-bound the error w.r.t. the target domain (i.e., runtime covariates) by the sum of source domain error and inter-domain distribution distance. In addition, we build an adversarially unified variational causal effect model, named VEGAN, with a novel two-stage adversarial domain adaptation scheme to reduce the latent distribution disparity between treated and control groups first, and between training and runtime variables afterwards. We demonstrate that VEGAN outperforms other state-of-the-art baselines on individual-level treatment effect estimation in the presence of runtime domain corruption on benchmark datasets.
- Published
- 2023
26. Efficient and Interpretable Compressive Text Summarisation with Unsupervised Dual-Agent Reinforcement Learning
- Author
-
Tang, Peggy, Gao, Junbin, Zhang, Lei, and Wang, Zhiyong
- Subjects
Computer Science - Computation and Language - Abstract
Recently, compressive text summarisation offers a balance between the conciseness issue of extractive summarisation and the factual hallucination issue of abstractive summarisation. However, most existing compressive summarisation methods are supervised, relying on the expensive effort of creating a new training dataset with corresponding compressive summaries. In this paper, we propose an efficient and interpretable compressive summarisation method that utilises unsupervised dual-agent reinforcement learning to optimise a summary's semantic coverage and fluency by simulating human judgment on summarisation quality. Our model consists of an extractor agent and a compressor agent, and both agents have a multi-head attentional pointer-based structure. The extractor agent first chooses salient sentences from a document, and then the compressor agent compresses these extracted sentences by selecting salient words to form a summary without using reference summaries to compute the summary reward. To our best knowledge, this is the first work on unsupervised compressive summarisation. Experimental results on three widely used datasets (e.g., Newsroom, CNN/DM, and XSum) show that our model achieves promising performance and a significant improvement on Newsroom in terms of the ROUGE metric, as well as interpretability of semantic coverage of summarisation results., Comment: The 4th Workshop on Simple and Efficient Natural Language Processing (SustaiNLP 2023), co-located with ACL 2023
- Published
- 2023
27. A novel approach for assessing fairness in deployed machine learning algorithms
- Author
-
Uddin, Shahadat, Lu, Haohui, Rahman, Ashfaqur, and Gao, Junbin
- Published
- 2024
- Full Text
- View/download PDF
28. Enhancing framelet GCNs with generalized p-Laplacian regularization
- Author
-
Shao, Zhiqi, Shi, Dai, Han, Andi, Vasnev, Andrey, Guo, Yi, and Gao, Junbin
- Published
- 2024
- Full Text
- View/download PDF
29. Riemannian block SPD coupling manifold and its application to optimal transport
- Author
-
Han, Andi, Mishra, Bamdev, Jawanpuria, Pratik, and Gao, Junbin
- Published
- 2024
- Full Text
- View/download PDF
30. Differentially private Riemannian optimization
- Author
-
Han, Andi, Mishra, Bamdev, Jawanpuria, Pratik, and Gao, Junbin
- Published
- 2024
- Full Text
- View/download PDF
31. Revisiting Generalized p-Laplacian Regularized Framelet GCNs: Convergence, Energy Dynamic and Training with Non-Linear Diffusion
- Author
-
Shi, Dai, Shao, Zhiqi, Guo, Yi, Zhao, Qibin, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
This paper presents a comprehensive theoretical analysis of the graph p-Laplacian regularized framelet network (pL-UFG) to establish a solid understanding of its properties. We conduct a convergence analysis on pL-UFG, addressing the gap in the understanding of its asymptotic behaviors. Further by investigating the generalized Dirichlet energy of pL-UFG, we demonstrate that the Dirichlet energy remains non-zero throughout convergence, ensuring the avoidance of over-smoothing issues. Additionally, we elucidate the energy dynamic perspective, highlighting the synergistic relationship between the implicit layer in pL-UFG and graph framelets. This synergy enhances the model's adaptability to both homophilic and heterophilic data. Notably, we reveal that pL-UFG can be interpreted as a generalized non-linear diffusion process, thereby bridging the gap between pL-UFG and differential equations on the graph. Importantly, these multifaceted analyses lead to unified conclusions that offer novel insights for understanding and implementing pL-UFG, as well as other graph neural network (GNN) models. Finally, based on our dynamic analysis, we propose two novel pL-UFG models with manually controlled energy dynamics. We demonstrate empirically and theoretically that our proposed models not only inherit the advantages of pL-UFG but also significantly reduce computational costs for training on large-scale graph datasets.
- Published
- 2023
32. Diffusion Models for Time Series Applications: A Survey
- Author
-
Lin, Lequan, Li, Zhengkun, Li, Ruikun, Li, Xuliang, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
Diffusion models, a family of generative models based on deep learning, have become increasingly prominent in cutting-edge machine learning research. With a distinguished performance in generating samples that resemble the observed data, diffusion models are widely used in image, video, and text synthesis nowadays. In recent years, the concept of diffusion has been extended to time series applications, and many powerful models have been developed. Considering the deficiency of a methodical summary and discourse on these models, we provide this survey as an elementary resource for new researchers in this area and also an inspiration to motivate future research. For better understanding, we include an introduction about the basics of diffusion models. Except for this, we primarily focus on diffusion-based methods for time series forecasting, imputation, and generation, and present them respectively in three individual sections. We also compare different methods for the same application and highlight their connections if applicable. Lastly, we conclude the common limitation of diffusion-based methods and highlight potential future research directions.
- Published
- 2023
33. DiffMoCa: Diffusion Model Based Multi-modality Cut and Paste
- Author
-
Zhang, Junjie, Wu, Shaojin, Gao, Junbin, Yu, Fusheng, Xu, Hao, Zeng, Zhigang, Hartmanis, Juris, Founding Editor, van Leeuwen, Jan, Series Editor, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Kobsa, Alfred, Series Editor, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Nierstrasz, Oscar, Series Editor, Pandu Rangan, C., Editorial Board Member, Sudan, Madhu, Series Editor, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Weikum, Gerhard, Series Editor, Vardi, Moshe Y, Series Editor, Goos, Gerhard, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Le, Xinyi, editor, and Zhang, Zhijun, editor
- Published
- 2024
- Full Text
- View/download PDF
34. Graph Contrastive Learning with Implicit Augmentations
- Author
-
Liang, Huidong, Du, Xingjian, Zhu, Bilei, Ma, Zejun, Chen, Ke, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Existing graph contrastive learning methods rely on augmentation techniques based on random perturbations (e.g., randomly adding or dropping edges and nodes). Nevertheless, altering certain edges or nodes can unexpectedly change the graph characteristics, and choosing the optimal perturbing ratio for each dataset requires onerous manual tuning. In this paper, we introduce Implicit Graph Contrastive Learning (iGCL), which utilizes augmentations in the latent space learned from a Variational Graph Auto-Encoder by reconstructing graph topological structure. Importantly, instead of explicitly sampling augmentations from latent distributions, we further propose an upper bound for the expected contrastive loss to improve the efficiency of our learning algorithm. Thus, graph semantics can be preserved within the augmentations in an intelligent way without arbitrary manual design or prior human knowledge. Experimental results on both graph-level and node-level tasks show that the proposed method achieves state-of-the-art performance compared to other benchmarks, where ablation studies in the end demonstrate the effectiveness of modules in iGCL.
- Published
- 2022
35. Generalized Laplacian Regularized Framelet Graph Neural Networks
- Author
-
Shao, Zhiqi, Han, Andi, Shi, Dai, Vasnev, Andrey, and Gao, Junbin
- Subjects
Computer Science - Machine Learning - Abstract
This paper introduces a novel Framelet Graph approach based on p-Laplacian GNN. The proposed two models, named p-Laplacian undecimated framelet graph convolution (pL-UFG) and generalized p-Laplacian undecimated framelet graph convolution (pL-fUFG) inherit the nature of p-Laplacian with the expressive power of multi-resolution decomposition of graph signals. The empirical study highlights the excellent performance of the pL-UFG and pL-fUFG in different graph learning tasks including node classification and signal denoising.
- Published
- 2022
36. A Magnetic Framelet-Based Convolutional Neural Network for Directed Graphs
- Author
-
Lin, Lequan and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Electrical Engineering and Systems Science - Signal Processing - Abstract
Spectral Graph Convolutional Networks (spectral GCNNs), a powerful tool for analyzing and processing graph data, typically apply frequency filtering via Fourier transform to obtain representations with selective information. Although research shows that spectral GCNNs can be enhanced by framelet-based filtering, the massive majority of such research only considers undirected graphs. In this paper, we introduce Framelet-MagNet, a magnetic framelet-based spectral GCNN for directed graphs (digraphs). The model applies the framelet transform to digraph signals to form a more sophisticated representation for filtering. Digraph framelets are constructed with the complex-valued magnetic Laplacian, simultaneously leading to signal processing in both real and complex domains. We empirically validate the predictive power of Framelet-MagNet over a range of state-of-the-art models in node classification, link prediction, and denoising., Comment: Accepted by ICASSP 2023
- Published
- 2022
37. SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP
- Author
-
Chen, Jie, Chen, Shouzhen, Bai, Mingyuan, Gao, Junbin, Zhang, Junping, and Pu, Jian
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
The message-passing mechanism helps Graph Neural Networks (GNNs) achieve remarkable results on various node classification tasks. Nevertheless, the recursive nodes fetching and aggregation in message-passing cause inference latency when deploying GNNs to large-scale graphs. One promising inference acceleration direction is to distill the GNNs into message-passing-free student multi-layer perceptrons (MLPs). However, the MLP student cannot fully learn the structure knowledge due to the lack of structure inputs, which causes inferior performance in the heterophily and inductive scenarios. To address this, we intend to inject structure information into MLP-like students in low-latency and interpretable ways. Specifically, we first design a Structure-Aware MLP (SA-MLP) student that encodes both features and structures without message-passing. Then, we introduce a novel structure-mixing knowledge distillation strategy to enhance the learning ability of MLPs for structure information. Furthermore, we design a latent structure embedding approximation technique with two-stage distillation for inductive scenarios. Extensive experiments on eight benchmark datasets under both transductive and inductive settings show that our SA-MLP can consistently outperform the teacher GNNs, while maintaining faster inference as MLPs. The source code of our work can be found in https://github.com/JC-202/SA-MLP.
- Published
- 2022
38. Generalized energy and gradient flow via graph framelets
- Author
-
Han, Andi, Shi, Dai, Shao, Zhiqi, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
In this work, we provide a theoretical understanding of the framelet-based graph neural networks through the perspective of energy gradient flow. By viewing the framelet-based models as discretized gradient flows of some energy, we show it can induce both low-frequency and high-frequency-dominated dynamics, via the separate weight matrices for different frequency components. This substantiates its good empirical performance on both homophilic and heterophilic graphs. We then propose a generalized energy via framelet decomposition and show its gradient flow leads to a novel graph neural network, which includes many existing models as special cases. We then explain how the proposed model generally leads to more flexible dynamics, thus potentially enhancing the representation power of graph neural networks.
- Published
- 2022
39. Recent advances in artificial intelligence generated content
- Author
-
Zhang, Junping, Sun, Lingyun, Jin, Cong, Gao, Junbin, Li, Xiaobing, Luo, Jiebo, Pan, Zhigeng, Tang, Ying, and Wang, Jingdong
- Published
- 2024
- Full Text
- View/download PDF
40. Diffusion models for time-series applications: a survey
- Author
-
Lin, Lequan, Li, Zhengkun, Li, Ruikun, Li, Xuliang, and Gao, Junbin
- Published
- 2024
- Full Text
- View/download PDF
41. Riemannian accelerated gradient methods via extrapolation
- Author
-
Han, Andi, Mishra, Bamdev, Jawanpuria, Pratik, and Gao, Junbin
- Subjects
Mathematics - Optimization and Control ,Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
In this paper, we propose a simple acceleration scheme for Riemannian gradient methods by extrapolating iterates on manifolds. We show when the iterates are generated from Riemannian gradient descent method, the accelerated scheme achieves the optimal convergence rate asymptotically and is computationally more favorable than the recently proposed Riemannian Nesterov accelerated gradient methods. Our experiments verify the practical benefit of the novel acceleration strategy.
- Published
- 2022
42. Universal Deep GNNs: Rethinking Residual Connection in GNNs from a Path Decomposition Perspective for Preventing the Over-smoothing
- Author
-
Chen, Jie, Liu, Weiqi, Huang, Zhizhong, Gao, Junbin, Zhang, Junping, and Pu, Jian
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
The performance of GNNs degrades as they become deeper due to the over-smoothing. Among all the attempts to prevent over-smoothing, residual connection is one of the promising methods due to its simplicity. However, recent studies have shown that GNNs with residual connections only slightly slow down the degeneration. The reason why residual connections fail in GNNs is still unknown. In this paper, we investigate the forward and backward behavior of GNNs with residual connections from a novel path decomposition perspective. We find that the recursive aggregation of the median length paths from the binomial distribution of residual connection paths dominates output representation, resulting in over-smoothing as GNNs go deeper. Entangled propagation and weight matrices cause gradient smoothing and prevent GNNs with residual connections from optimizing to the identity mapping. Based on these findings, we present a Universal Deep GNNs (UDGNN) framework with cold-start adaptive residual connections (DRIVE) and feedforward modules. Extensive experiments demonstrate the effectiveness of our method, which achieves state-of-the-art results over non-smooth heterophily datasets by simply stacking standard GNNs.
- Published
- 2022
43. Embedding Graphs on Grassmann Manifold
- Author
-
Zhou, Bingxin, Zheng, Xuebin, Wang, Yu Guang, Li, Ming, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Learning efficient graph representation is the key to favorably addressing downstream tasks on graphs, such as node or graph property prediction. Given the non-Euclidean structural property of graphs, preserving the original graph data's similarity relationship in the embedded space needs specific tools and a similarity metric. This paper develops a new graph representation learning scheme, namely EGG, which embeds approximated second-order graph characteristics into a Grassmann manifold. The proposed strategy leverages graph convolutions to learn hidden representations of the corresponding subspace of the graph, which is then mapped to a Grassmann point of a low dimensional manifold through truncated singular value decomposition (SVD). The established graph embedding approximates denoised correlationship of node attributes, as implemented in the form of a symmetric matrix space for Euclidean calculation. The effectiveness of EGG is demonstrated using both clustering and classification tasks at the node level and graph level. It outperforms baseline models on various benchmarks.
- Published
- 2022
- Full Text
- View/download PDF
44. Differentially private Riemannian optimization
- Author
-
Han, Andi, Mishra, Bamdev, Jawanpuria, Pratik, and Gao, Junbin
- Subjects
Mathematics - Optimization and Control ,Computer Science - Cryptography and Security ,Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
In this paper, we study the differentially private empirical risk minimization problem where the parameter is constrained to a Riemannian manifold. We introduce a framework of differentially private Riemannian optimization by adding noise to the Riemannian gradient on the tangent space. The noise follows a Gaussian distribution intrinsically defined with respect to the Riemannian metric. We adapt the Gaussian mechanism from the Euclidean space to the tangent space compatible to such generalized Gaussian distribution. We show that this strategy presents a simple analysis as compared to directly adding noise on the manifold. We further show privacy guarantees of the proposed differentially private Riemannian (stochastic) gradient descent using an extension of the moments accountant technique. Additionally, we prove utility guarantees under geodesic (strongly) convex, general nonconvex objectives as well as under the Riemannian Polyak-{\L}ojasiewicz condition. We show the efficacy of the proposed framework in several applications.
- Published
- 2022
45. A Simple Yet Effective SVD-GCN for Directed Graphs
- Author
-
Zou, Chunya, Han, Andi, Lin, Lequan, and Gao, Junbin
- Subjects
Computer Science - Machine Learning ,Computer Science - Social and Information Networks - Abstract
In this paper, we propose a simple yet effective graph neural network for directed graphs (digraph) based on the classic Singular Value Decomposition (SVD), named SVD-GCN. The new graph neural network is built upon the graph SVD-framelet to better decompose graph signals on the SVD ``frequency'' bands. Further the new framelet SVD-GCN is also scaled up for larger scale graphs via using Chebyshev polynomial approximation. Through empirical experiments conducted on several node classification datasets, we have found that SVD-GCN has remarkable improvements in a variety of graph node learning tasks and it outperforms GCN and many other state-of-the-art graph neural networks for digraphs. Moreover, we empirically demonstate that the SVD-GCN has great denoising capability and robustness to high level graph data attacks. The theoretical and experimental results prove that the SVD-GCN is effective on a variant of graph datasets, meanwhile maintaining stable and even better performance than the state-of-the-arts., Comment: 14 pages
- Published
- 2022
46. Riemannian Hamiltonian methods for min-max optimization on manifolds
- Author
-
Han, Andi, Mishra, Bamdev, Jawanpuria, Pratik, Kumar, Pawan, and Gao, Junbin
- Subjects
Mathematics - Optimization and Control ,Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
In this paper, we study min-max optimization problems on Riemannian manifolds. We introduce a Riemannian Hamiltonian function, minimization of which serves as a proxy for solving the original min-max problems. Under the Riemannian Polyak--{\L}ojasiewicz condition on the Hamiltonian function, its minimizer corresponds to the desired min-max saddle point. We also provide cases where this condition is satisfied. For geodesic-bilinear optimization in particular, solving the proxy problem leads to the correct search direction towards global optimality, which becomes challenging with the min-max formulation. To minimize the Hamiltonian function, we propose Riemannian Hamiltonian methods (RHM) and present their convergence analyses. We extend RHM to include consensus regularization and to the stochastic setting. We illustrate the efficacy of the proposed RHM in applications such as subspace robust Wasserstein distance, robust training of neural networks, and generative adversarial networks., Comment: Extended version with proofs
- Published
- 2022
47. OTExtSum: Extractive Text Summarisation with Optimal Transport
- Author
-
Tang, Peggy, Hu, Kun, Yan, Rui, Zhang, Lei, Gao, Junbin, and Wang, Zhiyong
- Subjects
Computer Science - Computation and Language - Abstract
Extractive text summarisation aims to select salient sentences from a document to form a short yet informative summary. While learning-based methods have achieved promising results, they have several limitations, such as dependence on expensive training and lack of interpretability. Therefore, in this paper, we propose a novel non-learning-based method by for the first time formulating text summarisation as an Optimal Transport (OT) problem, namely Optimal Transport Extractive Summariser (OTExtSum). Optimal sentence extraction is conceptualised as obtaining an optimal summary that minimises the transportation cost to a given document regarding their semantic distributions. Such a cost is defined by the Wasserstein distance and used to measure the summary's semantic coverage of the original document. Comprehensive experiments on four challenging and widely used datasets - MultiNews, PubMed, BillSum, and CNN/DM demonstrate that our proposed method outperforms the state-of-the-art non-learning-based methods and several recent learning-based methods in terms of the ROUGE metric., Comment: Findings of NAACL 2022
- Published
- 2022
48. Exploiting Neighbor Effect: Conv-Agnostic GNNs Framework for Graphs with Heterophily
- Author
-
Chen, Jie, Chen, Shouzhen, Gao, Junbin, Huang, Zengfeng, Zhang, Junping, and Pu, Jian
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Due to the homophily assumption in graph convolution networks (GNNs), a common consensus in the graph node classification task is that GNNs perform well on homophilic graphs but may fail on heterophilic graphs with many inter-class edges. However, the previous inter-class edges perspective and related homo-ratio metrics cannot well explain the GNNs performance under some heterophilic datasets, which implies that not all the inter-class edges are harmful to GNNs. In this work, we propose a new metric based on von Neumann entropy to re-examine the heterophily problem of GNNs and investigate the feature aggregation of inter-class edges from an entire neighbor identifiable perspective. Moreover, we propose a simple yet effective Conv-Agnostic GNN framework (CAGNNs) to enhance the performance of most GNNs on heterophily datasets by learning the neighbor effect for each node. Specifically, we first decouple the feature of each node into the discriminative feature for downstream tasks and the aggregation feature for graph convolution. Then, we propose a shared mixer module to adaptively evaluate the neighbor effect of each node to incorporate the neighbor information. The proposed framework can be regarded as a plug-in component and is compatible with most GNNs. The experimental results over nine well-known benchmark datasets indicate that our framework can significantly improve performance, especially for the heterophily graphs. The average performance gain is 9.81%, 25.81%, and 20.61% compared with GIN, GAT, and GCN, respectively. Extensive ablation studies and robustness analysis further verify the effectiveness, robustness, and interpretability of our framework. Code is available at https://github.com/JC-202/CAGNN., Comment: accepted by TNNLS 2023
- Published
- 2022
49. Contrastive optimized graph convolution network for traffic forecasting
- Author
-
Guo, Kan, Tian, Daxin, Hu, Yongli, Sun, Yanfeng, Qian, Zhen (Sean), Zhou, Jianshan, Gao, Junbin, and Yin, Baocai
- Published
- 2024
- Full Text
- View/download PDF
50. Robust Graph Representation Learning for Local Corruption Recovery
- Author
-
Zhou, Bingxin, Jiang, Yuanhong, Wang, Yu Guang, Liang, Jingwei, Gao, Junbin, Pan, Shirui, and Zhang, Xiaoqun
- Subjects
Computer Science - Machine Learning - Abstract
The performance of graph representation learning is affected by the quality of graph input. While existing research usually pursues a globally smoothed graph embedding, we believe the rarely observed anomalies are as well harmful to an accurate prediction. This work establishes a graph learning scheme that automatically detects (locally) corrupted feature attributes and recovers robust embedding for prediction tasks. The detection operation leverages a graph autoencoder, which does not make any assumptions about the distribution of the local corruptions. It pinpoints the positions of the anomalous node attributes in an unbiased mask matrix, where robust estimations are recovered with sparsity promoting regularizer. The optimizer approaches a new embedding that is sparse in the framelet domain and conditionally close to input observations. Extensive experiments are provided to validate our proposed model can recover a robust graph representation from black-box poisoning and achieve excellent performance., Comment: WWW '23: Proceedings of the ACM Web Conference 2023
- Published
- 2022
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.