Author: "Zhou, Hongkuan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zhou, Hongkuan"' showing total 102 results

1. Visual Representation Learning Guided By Multi-modal Prior Knowledge

Author: Zhou, Hongkuan, Halilaj, Lavdim, Monka, Sebastian, Schmid, Stefan, Zhu, Yuqicheng, Xiong, Bo, and Staab, Steffen
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Despite the remarkable success of deep neural networks (DNNs) in computer vision, they fail to remain high-performing when facing distribution shifts between training and testing data. In this paper, we propose Knowledge-Guided Visual representation learning (KGV), a distribution-based learning approach leveraging multi-modal prior knowledge, to improve generalization under distribution shift. We use prior knowledge from two distinct modalities: 1) a knowledge graph (KG) with hierarchical and association relationships; and 2) generated synthetic images of visual elements semantically represented in the KG. The respective embeddings are generated from the given modalities in a common latent space, i.e., visual embeddings from original and synthetic images as well as knowledge graph embeddings (KGEs). These embeddings are aligned via a novel variant of translation-based KGE methods, where the node and relation embeddings of the KG are modeled as Gaussian distributions and translations respectively. We claim that incorporating multi-modal prior knowledge enables more regularized learning of image representations. Thus, the models are able to better generalize across different data distributions. We evaluate KGV on different image classification tasks with major or minor distribution shifts, namely road sign classification across datasets from Germany, China, and Russia, image classification with the mini-ImageNet dataset and its variants, as well as the DVM-CAR dataset. The results demonstrate that KGV consistently exhibits higher accuracy and data efficiency than the baselines across all experiments.
Published: 2024

2. Learning Personalized Scoping for Graph Neural Networks under Heterophily

Author: Deng, Gangda, Zhou, Hongkuan, Kannan, Rajgopal, and Prasanna, Viktor
Subjects: Computer Science - Machine Learning, Computer Science - Social and Information Networks
Abstract: Heterophilous graphs, where dissimilar nodes tend to connect, pose a challenge for graph neural networks (GNNs) as their superior performance typically comes from aggregating homophilous information. Increasing the GNN depth can expand the scope (i.e., receptive field), potentially finding homophily from the higher-order neighborhoods. However, uniformly expanding the scope results in subpar performance since real-world web graphs often exhibit homophily disparity between nodes. An ideal way is personalized scopes, allowing nodes to have varying scope sizes. Existing methods typically add node-adaptive weights for each hop. Although expressive, they inevitably suffer from severe overfitting. To address this issue, we formalize personalized scoping as a separate scope classification problem that overcomes GNN overfitting in node classification. Specifically, we predict the optimal GNN depth for each node. Our theoretical and empirical analysis suggests that accurately predicting the depth can significantly enhance generalization. We further propose Adaptive Scope (AS), a lightweight approach that only participates in GNN inference. AS encodes structural patterns and predicts the depth to select the best model for each node's prediction. Experimental results show that AS is highly flexible with various GNN architectures across a wide range of datasets while significantly improving accuracy.
Published: 2024

3. TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning

Author: Deng, Gangda, Zhou, Hongkuan, Zeng, Hanqing, Xia, Yinglong, Leung, Christopher, Li, Jianbo, Kannan, Rajgopal, and Prasanna, Viktor
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Recently, Temporal Graph Neural Networks (TGNNs) have demonstrated state-of-the-art performance in various high-impact applications, including fraud detection and content recommendation. Despite the success of TGNNs, they are prone to the prevalent noise found in real-world dynamic graphs like time-deprecated links and skewed interaction distribution. The noise causes two critical issues that significantly compromise the accuracy of TGNNs: (1) models are supervised by inferior interactions, and (2) noisy input induces high variance in the aggregated messages. However, current TGNN denoising techniques do not consider the diverse and dynamic noise pattern of each node. In addition, they also suffer from the excessive mini-batch generation overheads caused by traversing more neighbors. We believe the remedy for fast and accurate TGNNs lies in temporal adaptive sampling. In this work, we propose TASER, the first adaptive sampling method for TGNNs optimized for accuracy, efficiency, and scalability. TASER adapts its mini-batch selection based on training dynamics and temporal neighbor selection based on the contextual, structural, and temporal properties of past interactions. To alleviate the bottleneck in mini-batch generation, TASER implements a pure GPU-based temporal neighbor finder and a dedicated GPU feature cache. We evaluate the performance of TASER using two state-of-the-art backbone TGNNs. On five popular datasets, TASER outperforms the corresponding baselines by an average of 2.3% in Mean Reciprocal Rank (MRR) while achieving an average of 5.1x speedup in training time. Comment: IPDPS 2024
Published: 2024
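
Note on the metric: the Mean Reciprocal Rank (MRR) quoted in this abstract is a standard ranking metric, the average of 1/rank of the correct item over all queries. The sketch below shows one common way to compute it. The function names and the tie-handling rule (rank = 1 + number of strictly higher scores) are illustrative assumptions and are not taken from the TASER code.

```python
def rank_of_true(scores, true_index):
    """1-based rank of the true candidate; higher score means better.

    Assumption: ties are resolved optimistically, i.e. only candidates
    with a strictly higher score are ranked ahead of the true one.
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Three queries whose true items were ranked 1st, 2nd and 4th.
example_mrr = mean_reciprocal_rank([1, 2, 4])
```

An MRR of 1.0 means the correct item is always ranked first, so an improvement of a few points reflects correct items moving closer to the top of the ranking.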

4. Language-conditioned Learning for Robotic Manipulation: A Survey

Author: Zhou, Hongkuan, Yao, Xiangtong, Meng, Yuan, Sun, Siming, Bing, Zhenshan, Huang, Kai, and Knoll, Alois
Subjects: Computer Science - Robotics
Abstract: Language-conditioned robotic manipulation represents a cutting-edge area of research, enabling seamless communication and cooperation between humans and robotic agents. This field focuses on teaching robotic systems to comprehend and execute instructions conveyed in natural language. To achieve this, the development of robust language understanding models capable of extracting actionable insights from textual input is essential. In this comprehensive survey, we systematically explore recent advancements in language-conditioned approaches within the context of robotic manipulation. We analyze these approaches based on their learning paradigms, which encompass reinforcement learning, imitation learning, and the integration of foundational models, such as large language models and vision-language models. Furthermore, we conduct an in-depth comparative analysis, considering aspects like semantic information extraction, environment & evaluation, auxiliary tasks, and task representation. Finally, we outline potential future research directions in the realm of language-conditioned learning for robotic manipulation, with the topic of generalization capabilities and safety issues. The GitHub repository of this paper can be found at https://github.com/hk-zh/language-conditioned-robot-manipulation-models
Published: 2023

5. What Matters to Enhance Traffic Rule Compliance of Imitation Learning for End-to-End Autonomous Driving

Author: Zhou, Hongkuan, Cao, Wei, Sui, Aifen, and Bing, Zhenshan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Robotics
Abstract: End-to-end autonomous driving, where the entire driving pipeline is replaced with a single neural network, has recently gained research attention because of its simpler structure and faster inference time. Although this approach greatly reduces the complexity of the driving pipeline, it also raises safety issues because the trained policy is not always compliant with the traffic rules. In this paper, we propose P-CSG, a penalty-based imitation learning approach with contrastive-based cross semantics generation sensor fusion technologies to increase the overall performance of end-to-end autonomous driving. In this method, we introduce three penalties (red light, stop sign, and curvature speed) to make the agent more sensitive to traffic rules. The proposed cross semantics generation helps to align the shared information of different input modalities. We assessed our model's performance using the CARLA Leaderboard - Town 05 Long Benchmark and Longest6 Benchmark, achieving 8.5% and 2.0% driving score improvement compared to the baselines. Furthermore, we conducted robustness evaluations against adversarial attacks like FGSM and Dot attacks, revealing a substantial increase in robustness compared to other baseline models. More detailed information can be found at https://hk-zh.github.io/p-csg-plus. Comment: 14 pages, 3 figures
Published: 2023

6. DistTGL: Distributed Memory-Based Temporal Graph Neural Network Training

Author: Zhou, Hongkuan, Zheng, Da, Song, Xiang, Karypis, George, and Prasanna, Viktor
Subjects: Computer Science - Machine Learning
Abstract: Memory-based Temporal Graph Neural Networks are powerful tools in dynamic graph representation learning and have demonstrated superior performance in many real-world applications. However, their node memory favors smaller batch sizes to capture more dependencies in graph events and needs to be maintained synchronously across all trainers. As a result, existing frameworks suffer from accuracy loss when scaling to multiple GPUs. Even worse, the tremendous overhead of synchronizing the node memory makes it impractical to deploy them on distributed GPU clusters. In this work, we propose DistTGL, an efficient and scalable solution to train memory-based TGNNs on distributed GPU clusters. DistTGL has three improvements over existing solutions: an enhanced TGNN model, a novel training algorithm, and an optimized system. In experiments, DistTGL achieves near-linear convergence speedup, outperforming the state-of-the-art single-machine method by 14.5% in accuracy and 10.17x in training throughput. Comment: SC'23
Published: 2023

7. Language-Conditioned Imitation Learning with Base Skill Priors under Unstructured Data

Author: Zhou, Hongkuan, Bing, Zhenshan, Yao, Xiangtong, Su, Xiaojie, Yang, Chenguang, Huang, Kai, and Knoll, Alois
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence
Abstract: The growing interest in language-conditioned robot manipulation aims to develop robots capable of understanding and executing complex tasks, with the objective of enabling robots to interpret language commands and manipulate objects accordingly. While language-conditioned approaches demonstrate impressive capabilities for addressing tasks in familiar environments, they encounter limitations in adapting to unfamiliar environment settings. In this study, we propose a general-purpose, language-conditioned approach that combines base skill priors and imitation learning under unstructured data to enhance the algorithm's generalization in adapting to unfamiliar environments. We assess our model's performance in both simulated and real-world environments using a zero-shot setting. In the simulated environment, the proposed approach surpasses previously reported scores for CALVIN benchmark, especially in the challenging Zero-Shot Multi-Environment setting. The average completed task length, indicating the average number of tasks the agent can continuously complete, improves more than 2.5 times compared to the state-of-the-art method HULC. In addition, we conduct a zero-shot evaluation of our policy in a real-world setting, following training exclusively in simulated environments without additional specific adaptations. In this evaluation, we set up ten tasks and achieved an average 30% improvement in our approach compared to the current state-of-the-art approach, demonstrating a high generalization capability in both simulated environments and the real world. For further details, including access to our code and videos, please refer to https://hk-zh.github.io/spil/
Published: 2023

8. HTNet: Dynamic WLAN Performance Prediction using Heterogenous Temporal GNN

Author: Zhou, Hongkuan, Kannan, Rajgopal, Swami, Ananthram, and Prasanna, Viktor
Subjects: Computer Science - Networking and Internet Architecture
Abstract: Predicting the throughput of WLAN deployments is a classic problem that occurs in the design of robust and high performance WLAN systems. However, due to the increasingly complex communication protocols and the increase in interference between devices in ever denser WLAN deployments, traditional methods either have substantial runtime or enormous prediction error and hence cannot be applied in downstream tasks. Recently, Graph Neural Networks have been proven to be powerful graph analytic models and have been broadly applied to various networking problems such as link scheduling and power allocation. In this work, we propose HTNet, a specialized Heterogeneous Temporal Graph Neural Network that extracts features from dynamic WLAN deployments. Analyzing the unique graph structure of WLAN deployment graphs, we show that HTNet achieves the maximum expressive power on each snapshot. Based on a powerful message passing scheme, HTNet requires fewer layers than other GNN-based methods, which entails less supporting data and runtime. To evaluate the performance of HTNet, we prepare six different setups with more than five thousand dense dynamic WLAN deployments that cover a wide range of real-world scenarios. HTNet achieves the lowest prediction error on all six setups with an average improvement of 25.3% over the state-of-the-art methods. Comment: InfoCom'23
Published: 2023

9. Penalty-Based Imitation Learning With Cross Semantics Generation Sensor Fusion for Autonomous Driving

Author: Zhou, Hongkuan, Sui, Aifen, Shi, Letian, and Li, Yinxian
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence
Abstract: In recent times, there has been a growing focus on end-to-end autonomous driving technologies. This technology involves the replacement of the entire driving pipeline with a single neural network, which has a simpler structure and faster inference time. However, while this approach reduces the number of components in the driving pipeline, it also presents challenges related to interpretability and safety. For instance, the trained policy may not always comply with traffic rules, and it is difficult to determine the reason for such misbehavior due to the lack of intermediate outputs. Additionally, the successful implementation of autonomous driving technology heavily depends on the reliable and expedient processing of sensory data to accurately perceive the surrounding environment. In this paper, we provide a penalty-based imitation learning approach combined with cross semantics generation sensor fusion technologies (P-CSG) to efficiently integrate multiple modalities of information and enable the autonomous agent to effectively adhere to traffic regulations. Our model is evaluated on the Town 05 Long benchmark, where we observe an increase in the driving score of more than 12% compared to the state-of-the-art (SOTA) model, InterFuser. Notably, our model achieves this performance enhancement while achieving a 7-fold increase in inference speed and reducing the model size by approximately 30%. More detailed information, including code-based resources, can be found at https://hk-zh.github.io/p-csg/
Published: 2023

10. Learning from Symmetry: Meta-Reinforcement Learning with Symmetrical Behaviors and Language Instructions

Author: Yao, Xiangtong, Bing, Zhenshan, Zhuang, Genghang, Chen, Kejia, Zhou, Hongkuan, Huang, Kai, and Knoll, Alois
Subjects: Computer Science - Artificial Intelligence
Abstract: Meta-reinforcement learning (meta-RL) is a promising approach that enables the agent to learn new tasks quickly. However, most meta-RL algorithms show poor generalization in multi-task scenarios due to the insufficient task information provided only by rewards. Language-conditioned meta-RL improves the generalization capability by matching language instructions with the agent's behaviors. Both behaviors and language instructions exhibit symmetry, which can speed up human learning of new knowledge. Thus, combining symmetry and language instructions into meta-RL can help improve the algorithm's generalization and learning efficiency. We propose a dual-MDP meta-reinforcement learning method that enables learning new tasks efficiently with symmetrical behaviors and language instructions. We evaluate our method in multiple challenging manipulation tasks, and experimental results show that our method can greatly improve the generalization and learning efficiency of meta-reinforcement learning. Videos are available at https://tumi6robot.wixsite.com/symmetry/.
Published: 2022

11. TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs

Author: Zhou, Hongkuan, Zheng, Da, Nisa, Israt, Ioannidis, Vasileios, Song, Xiang, and Karypis, George
Subjects: Computer Science - Machine Learning
Abstract: Many real-world graphs contain time domain information. Temporal Graph Neural Networks capture temporal information as well as structural and contextual information in the generated dynamic node embeddings. Researchers have shown that these embeddings achieve state-of-the-art performance in many different tasks. In this work, we propose TGL, a unified framework for large-scale offline Temporal Graph Neural Network training where users can compose various Temporal Graph Neural Networks with simple configuration files. TGL comprises five main components: a temporal sampler, a mailbox, a node memory module, a memory updater, and a message passing engine. We design a Temporal-CSR data structure and a parallel sampler to efficiently sample temporal neighbors to form training mini-batches. We propose a novel random chunk scheduling technique that mitigates the problem of obsolete node memory when training with a large batch size. To address the limitations of current TGNNs only being evaluated on small-scale datasets, we introduce two large-scale real-world datasets with 0.2 and 1.3 billion temporal edges. We evaluate the performance of TGL on four small-scale datasets with a single GPU and the two large datasets with multiple GPUs for both link prediction and node classification tasks. We compare TGL with the open-sourced code of five methods and show that TGL achieves similar or better accuracy with an average of 13x speedup. Our temporal parallel sampler achieves an average of 173x speedup on a multi-core CPU compared with the baselines. On a 4-GPU machine, TGL can train one epoch of more than one billion temporal edges within 1-10 hours. To the best of our knowledge, this is the first work that proposes a general framework for large-scale Temporal Graph Neural Network training on multiple GPUs. Comment: VLDB'22
Published: 2022
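
Note on temporal neighbor sampling: the abstract names a Temporal-CSR data structure but does not describe its layout. The sketch below is only one plausible reading, assuming a CSR adjacency whose per-node edge lists are sorted by timestamp, so that the neighbors seen before a query time can be located by binary search. The names `build_temporal_csr` and `recent_neighbors` are hypothetical and are not part of the TGL API.

```python
from bisect import bisect_left


def build_temporal_csr(num_nodes, edges):
    """Build (indptr, indices, times) from (src, dst, timestamp) edges.

    Assumption: each node's outgoing edges are stored contiguously and
    sorted by timestamp, which is what makes binary search possible.
    """
    buckets = [[] for _ in range(num_nodes)]
    for src, dst, t in edges:
        buckets[src].append((t, dst))
    indptr, indices, times = [0], [], []
    for bucket in buckets:
        bucket.sort()
        for t, dst in bucket:
            times.append(t)
            indices.append(dst)
        indptr.append(len(indices))
    return indptr, indices, times


def recent_neighbors(csr, node, query_time, k):
    """Return up to k most recent (neighbor, time) pairs with time < query_time."""
    indptr, indices, times = csr
    lo, hi = indptr[node], indptr[node + 1]
    # First position in this node's slice whose timestamp is >= query_time.
    end = bisect_left(times, query_time, lo, hi)
    start = max(lo, end - k)
    return list(zip(indices[start:end], times[start:end]))


csr = build_temporal_csr(4, [(0, 1, 5), (0, 2, 1), (0, 3, 9), (1, 0, 4)])
```

Restricting the search to edges strictly before the query time keeps future interactions out of a mini-batch, which is the usual requirement when sampling on temporal graphs.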

12. Design and Implementation of Knowledge Base for Runtime Management of Software Defined Hardware

Author: Zhou, Hongkuan, Srivastava, Ajitesh, Kannan, Rajgopal, and Prasanna, Viktor
Subjects: Computer Science - Software Engineering, Computer Science - Databases
Abstract: Runtime-reconfigurable software coupled with reconfigurable hardware is highly desirable as a means towards maximizing runtime efficiency without compromising programmability. Compilers for such software systems are extremely difficult to design as they must leverage different types of hardware at runtime. To address the need for static and dynamic compiler optimization of workflows matched to dynamically reconfigurable hardware, we propose a novel design of the central component of a dynamic software compiler for software defined hardware. Our comprehensive design focuses not just on static knowledge but also on semi-supervised extraction of knowledge from program executions and developing their performance models. Specifically, our novel dynamic and extensible knowledge base 1) continuously gathers knowledge during execution of workflows and 2) identifies optimal implementations of workflows on optimal (available) hardware configurations. It plays a hub role in storing information from, and providing information to other components of the compiler, as well as the human analyst. Through a rich tripartite graph representation, the knowledge base captures and learns extensive information on decomposition and mapping of code steps to kernels and mapping of kernels to available hardware configurations. The knowledge base is implemented using the C++ Boost Library and is capable of quickly processing offline and online queries and updates. We show that our knowledge base can answer queries in 1 ms regardless of the number of workflows it stores. To the best of our knowledge, this is the first design of a dynamic and extensible knowledge base to support compilation of high-level languages to leverage arbitrary reconfigurable platforms. Comment: HPEC'19
Published: 2022

13. Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA

Author: Zhou, Hongkuan, Zhang, Bingyi, Kannan, Rajgopal, Prasanna, Viktor, and Busart, Carl
Subjects: Computer Science - Hardware Architecture, Computer Science - Machine Learning
Abstract: Temporal Graph Neural Networks (TGNNs) are powerful models to capture temporal, structural, and contextual information on temporal graphs. The generated temporal node embeddings outperform other methods in many downstream tasks. Real-world applications require high performance inference on real-time streaming dynamic graphs. However, these models usually rely on complex attention mechanisms to capture relationships between temporal neighbors. In addition, maintaining vertex memory suffers from intrinsic temporal data dependency that hinders task-level parallelism, making it inefficient on general-purpose processors. In this work, we present a novel model-architecture co-design for inference in memory-based TGNNs on FPGAs. The key modeling optimizations we propose include a light-weight method to compute attention scores and a related temporal neighbor pruning strategy to further reduce computation and memory accesses. These are holistically coupled with key hardware optimizations that leverage FPGA hardware. We replace the temporal sampler with an on-chip FIFO based hardware sampler and the time encoder with a look-up-table. We train our simplified models using knowledge distillation to ensure similar accuracy vis-à-vis the original model. Taking advantage of the model optimizations, we propose a principled hardware architecture using batching, pipelining, and prefetching techniques to further improve the performance. We also propose a hardware mechanism to ensure the chronological vertex updating without sacrificing the computation parallelism. We evaluate the performance of the proposed hardware accelerator on three real-world datasets. Comment: IPDPS'22
Published: 2022

14. SeDyT: A General Framework for Multi-Step Event Forecasting via Sequence Modeling on Dynamic Entity Embeddings

Author: Zhou, Hongkuan, Orme-Rogers, James, Kannan, Rajgopal, and Prasanna, Viktor
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Temporal Knowledge Graphs store events in the form of subjects, relations, objects, and timestamps which are often represented by dynamic heterogeneous graphs. Event forecasting is a critical and challenging task in Temporal Knowledge Graph reasoning that predicts the subject or object of an event in the future. To obtain temporal embeddings multi-step away in the future, existing methods learn generative models that capture the joint distribution of the observed events. To reduce the high computation costs, these methods rely on unrealistic assumptions of independence and approximations in training and inference. In this work, we propose SeDyT, a discriminative framework that performs sequence modeling on the dynamic entity embeddings to solve the multi-step event forecasting problem. SeDyT consists of two components: a Temporal Graph Neural Network that generates dynamic entity embeddings in the past and a sequence model that predicts the entity embeddings in the future. Compared with the generative models, SeDyT does not rely on any heuristic-based probability model and has low computation complexity in both training and inference. SeDyT is compatible with most Temporal Graph Neural Networks and sequence models. We also design an efficient training method that trains the two components in one gradient descent propagation. We evaluate the performance of SeDyT on five popular datasets. By combining temporal Graph Neural Network models and sequence models, SeDyT achieves an average of 2.4% MRR improvement when not using the validation set and more than 10% MRR improvement when using the validation set.
Published: 2021

15. A concurrent fault diagnosis method for electric isolation valves in nuclear power plants based on rule-based reasoning and data-driven methods

Author: Ai, Xin, Liu, Yongkuo, Shan, Longfei, Xie, Chunli, and Zhou, Hongkuan
Published: 2024

16. Accelerating Large Scale Real-Time GNN Inference using Channel Pruning

Author: Zhou, Hongkuan, Srivastava, Ajitesh, Zeng, Hanqing, Kannan, Rajgopal, and Prasanna, Viktor
Subjects: Computer Science - Machine Learning
Abstract: Graph Neural Networks (GNNs) are proven to be powerful models to generate node embedding for downstream applications. However, due to the high computation complexity of GNN inference, it is hard to deploy GNNs for large-scale or real-time applications. In this paper, we propose to accelerate GNN inference by pruning the dimensions in each layer with negligible accuracy loss. Our pruning framework uses a novel LASSO regression formulation for GNNs to identify feature dimensions (channels) that have high influence on the output activation. We identify two inference scenarios and design pruning schemes based on their computation and memory usage for each. To further reduce the inference complexity, we effectively store and reuse hidden features of visited nodes, which significantly reduces the number of supporting nodes needed to compute the target embedding. We evaluate the proposed method with the node classification problem on five popular datasets and a real-time spam detection application. We demonstrate that the pruned GNN models greatly reduce computation and memory usage with little accuracy loss. For full inference, the proposed method achieves an average of 3.27x speedup with only 0.002 drop in F1-Micro on GPU. For batched inference, the proposed method achieves an average of 6.67x speedup with only 0.003 drop in F1-Micro on CPU. To the best of our knowledge, we are the first to accelerate large scale real-time GNN inference through channel pruning.
Published: 2021

17. A time series and deep fusion framework for rotating machinery fault diagnosis

Author: Zhang, Jiasheng, Hu, Di, Yang, Tao, Zhou, Hongkuan, and Li, Xianling
Published: 2024

18. Accurate, Efficient and Scalable Training of Graph Neural Networks

Author: Zeng, Hanqing, Zhou, Hongkuan, Srivastava, Ajitesh, Kannan, Rajgopal, and Prasanna, Viktor
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Graph Neural Networks (GNNs) are powerful deep learning models to generate node embeddings on graphs. When applying deep GNNs on large graphs, it is still challenging to perform training in an efficient and scalable way. We propose a novel parallel training framework. Through sampling small subgraphs as minibatches, we reduce training workload by orders of magnitude compared with state-of-the-art minibatch methods. We then parallelize the key computation steps on tightly-coupled shared memory systems. For graph sampling, we exploit parallelism within and across sampler instances, and propose an efficient data structure supporting concurrent accesses from samplers. The parallel sampler theoretically achieves near-linear speedup with respect to number of processing units. For feature propagation within subgraphs, we improve cache utilization and reduce DRAM traffic by data partitioning. Our partitioning is a 2-approximation strategy for minimizing the communication cost compared to the optimal. We further develop a runtime scheduler to reorder the training operations and adjust the minibatch subgraphs to improve parallel performance. Finally, we generalize the above parallelization strategies to support multiple types of GNN models and graph samplers. The proposed training outperforms the state-of-the-art in scalability, efficiency and accuracy simultaneously. On a 40-core Xeon platform, we achieve 60x speedup (with AVX) in the sampling step and 20x speedup in the feature propagation step, compared to the serial implementation. Our algorithm enables fast training of deeper GNNs, as demonstrated by orders of magnitude speedup compared to the TensorFlow implementation. We open-source our code at https://github.com/GraphSAINT/GraphSAINT. Comment: 43 pages, 8 figures. arXiv admin note: text overlap with arXiv:1810.11899
Published: 2020

19. GraphSAINT: Graph Sampling Based Inductive Learning Method

Author: Zeng, Hanqing, Zhou, Hongkuan, Srivastava, Ajitesh, Kannan, Rajgopal, and Prasanna, Viktor
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Graph Convolutional Networks (GCNs) are powerful models for learning representations of attributed graphs. To scale GCNs to large graphs, state-of-the-art methods use various layer sampling techniques to alleviate the "neighbor explosion" problem during minibatch training. We propose GraphSAINT, a graph sampling based inductive learning method that improves training efficiency and accuracy in a fundamentally different way. By changing perspective, GraphSAINT constructs minibatches by sampling the training graph, rather than the nodes or edges across GCN layers. In each iteration, a complete GCN is built from the properly sampled subgraph. Thus, we ensure a fixed number of well-connected nodes in all layers. We further propose a normalization technique to eliminate bias, and sampling algorithms for variance reduction. Importantly, we can decouple the sampling from the forward and backward propagation, and extend GraphSAINT with many architecture variants (e.g., graph attention, jumping connection). GraphSAINT demonstrates superior performance in both accuracy and training time on five large graphs, and achieves new state-of-the-art F1 scores for PPI (0.995) and Reddit (0.970). Comment: Published at ICLR 2020; Code release: github.com/GraphSAINT/GraphSAINT
Published: 2019
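
Note on graph sampling: the abstract describes building each minibatch from a sampled subgraph of the training graph rather than sampling per layer. The sketch below illustrates that idea with the simplest possible choice, a uniform random node sampler followed by taking the induced subgraph. It is an assumption-laden illustration, not the GraphSAINT implementation: the function name is hypothetical, and the bias-eliminating normalization and variance-reducing samplers mentioned in the abstract are omitted.

```python
import random


def sample_induced_subgraph(adj, num_sampled, rng):
    """Sample nodes uniformly and keep only the edges among them.

    adj: dict mapping each node to the set of its neighbors.
    Returns (nodes, sub_adj), where sub_adj is the induced adjacency.
    """
    nodes = set(rng.sample(sorted(adj), num_sampled))
    sub_adj = {u: adj[u] & nodes for u in nodes}
    return nodes, sub_adj


# Small undirected example graph.
adj = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1},
    3: {1, 4},
    4: {3},
}
nodes, sub_adj = sample_induced_subgraph(adj, 3, random.Random(0))
```

Because every layer of the model then runs on the same small subgraph, the number of nodes per layer stays fixed instead of growing with depth.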

20. Terahertz transfer characterization for composite delamination under variable conditions based on deep adversarial domain adaptation

Author: Xu, Yafei, Lian, Guanghui, Zhou, Hongkuan, Hou, Yushan, Zhang, Hao, Zhang, Liuyang, Yan, Ruqiang, and Chen, Xuefeng
Published: 2023

21. Accurate, Efficient and Scalable Graph Embedding

Author: Zeng, Hanqing, Zhou, Hongkuan, Srivastava, Ajitesh, Kannan, Rajgopal, and Prasanna, Viktor
Subjects: Computer Science - Machine Learning, Computer Science - Performance, Statistics - Machine Learning
Abstract: The Graph Convolutional Network (GCN) model and its variants are powerful graph embedding tools for facilitating classification and clustering on graphs. However, a major challenge is to reduce the complexity of layered GCNs and make them parallelizable and scalable on very large graphs; state-of-the-art techniques are unable to achieve scalability without losing accuracy and efficiency. In this paper, we propose novel parallelization techniques for graph sampling-based GCNs that achieve superior scalable performance on very large graphs without compromising accuracy. Specifically, our GCN guarantees work-efficient training and produces order-of-magnitude savings in computation and communication. To scale GCN training on tightly-coupled shared memory systems, we develop parallelization strategies for the key steps in training: For the graph sampling step, we exploit parallelism within and across multiple sampling instances, and devise an efficient data structure for concurrent accesses that provides theoretical guarantee of near-linear speedup with number of processing units. For the feature propagation step within the sampled graph, we improve cache utilization and reduce DRAM communication by data partitioning. We prove that our partitioning strategy is a 2-approximation for minimizing the communication time compared to the optimal strategy. We demonstrate that our parallel graph embedding outperforms state-of-the-art methods in scalability (with respect to number of processors, graph size and GCN model size), efficiency and accuracy on several large datasets. On a 40-core Xeon platform, our parallel training achieves 64x speedup (with AVX) in the sampling step and 25x speedup in the feature propagation step, compared to the serial implementation, resulting in a net speedup of 21x. Comment: 10 pages. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
Published: 2018
Full Text: View/download PDF

22. Sea Clutter Distribution Modeling: A Kernel Density Estimation Approach

Author: Zhou, Hongkuan, Li, Yuzhou, and Jiang, Tao
Subjects: Statistics - Applications, Computer Science - Information Theory
Abstract: An accurate sea clutter distribution is crucial for decision region determination when detecting sea-surface floating targets. However, traditional parametric models possibly have a considerable gap to the realistic distribution of sea clutters due to the volatile sea states. In this paper, we develop a kernel density estimation based framework to model the sea clutter distributions without requiring any prior knowledge. In this framework, we jointly consider two embedded fundamental problems, the selection of a proper kernel density function and the determination of its corresponding optimal bandwidth. Regarding these two problems, we adopt the Gaussian, Gamma, and Weibull distributions as the kernel functions, and derive the closed-form optimal bandwidth equations for them. To deal with the highly complicated equations for the three kernels, we further design a fast iterative bandwidth selection algorithm to solve them. Experimental results show that, compared with existing methods, our proposed approach can significantly decrease the error incurred by sea clutter modeling (about two orders of magnitude reduction) and improve the target detection probability (up to $36\%$ in low false alarm rate cases)., Comment: 6 pages, 4 figures, 1 table, to appear in Proc. International Conference on Wireless Communications & Signal Processing (WCSP), Hangzhou, China, Oct. 2018
Published: 2018

23. To Relay or not to Relay: Open Distance and Optimal Deployment for Linear Underwater Acoustic Networks

Author: Li, Yuzhou, Zhang, Yu, Zhou, Hongkuan, and Jiang, Tao
Subjects: Computer Science - Information Theory
Abstract: Existing works have widely studied relay-aided underwater acoustic networks under some specialized relay distributions, e.g., equidistant and rectangular-grid. In this paper, we investigate two fundamental problems in linear underwater acoustic networks: under which conditions a relay should be deployed, and where to deploy it if necessary, in terms of energy and delay performance. To address these two problems, we first accurately approximate the complicated effective bandwidth and transmit power in the logarithm domain to formulate an energy minimization problem. By analyzing the formulation, we discover a critical transmission distance, defined as open distance, and explicitly show that a relay should not be deployed if the transmission distance is less than the open distance and should be deployed otherwise. Most importantly, we derive a closed-form and easy-to-calculate expression for the open distance and also strictly prove that the optimal relay position is at the middle point of the link when a relay should be introduced. Moreover, although this paper considers a linear two-hop relay network as the first step, our derived results can be applied to construct energy-efficient and delay-friendly multi-hop networks. Simulation results validate our theoretical analysis and show that properly introducing a relay can dramatically reduce the network energy consumption almost without increasing the end-to-end delay., Comment: 13 pages, 13 figures
Published: 2018
Full Text: View/download PDF

24. Real-time terahertz characterization for composite delamination using a lightweight CPU adaptive network

Author: Xu, Yafei, Wang, Xingyu, Zhou, Hongkuan, Hou, Yushan, Wen, Bihan, Zhang, Liuyang, Yan, Ruqiang, and Chen, Xuefeng
Published: 2022
Full Text: View/download PDF

25. Full scale promoted convolution neural network for intelligent terahertz 3D characterization of GFRP delamination

Author: Xu, Yafei, Zhou, Hongkuan, Cui, Yuqing, Wang, Xingyu, Citrin, D.S., Zhang, Liuyang, Yan, Ruqiang, and Chen, Xuefeng
Published: 2022
Full Text: View/download PDF

26. Accurate, efficient and scalable training of Graph Neural Networks

Author: Zeng, Hanqing, Zhou, Hongkuan, Srivastava, Ajitesh, Kannan, Rajgopal, and Prasanna, Viktor
Published: 2021
Full Text: View/download PDF

27. An Efficient Distributed Graph Engine for Deep Learning on Graphs

Author: Deng, Gangda, primary, Akgül, Ömer Faruk, additional, Zhou, Hongkuan, additional, Zeng, Hanqing, additional, Xia, Yinglong, additional, Li, Jianbo, additional, and Prasanna, Viktor, additional
Published: 2023
Full Text: View/download PDF

28. DistTGL: Distributed Memory-Based Temporal Graph Neural Network Training

Author: Zhou, Hongkuan, primary, Zheng, Da, additional, Song, Xiang, additional, Karypis, George, additional, and Prasanna, Viktor, additional
Published: 2023
Full Text: View/download PDF

29. Learning from Symmetry: Meta-Reinforcement Learning with Symmetrical Behaviors and Language Instructions

Author: Yao, Xiangtong, primary, Bing, Zhenshan, additional, Zhuang, Genghang, additional, Chen, Kejia, additional, Zhou, Hongkuan, additional, Huang, Kai, additional, and Knoll, Alois, additional
Published: 2023
Full Text: View/download PDF

30. Spectral Efficiency Analysis of Downlink Transmission for Two-Way Cell-Free Massive MIMO System With Few-Bit ADCs

Author: Cui, Jiaxi, primary, Liu, Pei, additional, Wang, Kehao, additional, Zhou, Hongkuan, additional, Zhang, Yue, additional, Sun, Xinghua, additional, and Buzzi, Stefano, additional
Published: 2023
Full Text: View/download PDF

31. A Prognosis Method for Condenser Fouling Based on Differential Modeling

Author: Zhang, Ying, primary, Yang, Tao, additional, Zhou, Hongkuan, additional, Lyu, Dongzhen, additional, Zheng, Wei, additional, and Li, Xianling, additional
Published: 2023
Full Text: View/download PDF

32. HTNet: Dynamic WLAN Performance Prediction using Heterogenous Temporal GNN

Author: Zhou, Hongkuan, primary, Kannan, Rajgopal, additional, Swami, Ananthram, additional, and Prasanna, Viktor, additional
Published: 2023
Full Text: View/download PDF

33. Solving Robotic Manipulation With Sparse Reward Reinforcement Learning Via Graph-Based Diversity and Proximity

Author: Bing, Zhenshan, primary, Zhou, Hongkuan, additional, Li, Rui, additional, Su, Xiaojie, additional, Morin, Fabrice O., additional, Huang, Kai, additional, and Knoll, Alois, additional
Published: 2023
Full Text: View/download PDF

34. What Matters to Enhance Traffic Rule Compliance of Imitation Learning for Automated Driving

Author: Zhou, Hongkuan, Sui, Aifen, Cao, Wei, and Bing, Zhenshan
Abstract: More research attention has recently been given to end-to-end autonomous driving technologies where the entire driving pipeline is replaced with a single neural network because of its simpler structure and faster inference time. Despite this appealing approach largely reducing the components in the driving pipeline, its simplicity also leads to interpretability problems and safety issues. The trained policy is not always compliant with the traffic rules and it is also hard to discover the reason for the misbehavior because of the lack of intermediate outputs. Meanwhile, sensors are also critical to autonomous driving's security and feasibility to perceive the surrounding environment under complex driving scenarios. In this paper, we propose P-CSG, a penalty-based imitation learning approach with cross semantics generation sensor fusion technologies to increase the overall performance of end-to-end autonomous driving. In this method, we introduce three penalties - red light, stop sign, and curvature speed penalty - to make the agent more sensitive to traffic rules. The proposed cross semantics generation helps to align the shared information from different input modalities. We assessed our model's performance using the CARLA leaderboard - Town 05 Long benchmark and Longest6 Benchmark, achieving an impressive driving score improvement. Furthermore, we conducted robustness evaluations against adversarial attacks such as FGSM and Dot attacks, revealing a substantial increase in robustness compared to baseline models. More detailed information, such as code base resources and videos, can be found at https://hk-zh.github.io/p-csg-plus., Comment: 10 pages, 2 figures
Published: 2023

35. TGL

Author: Zhou, Hongkuan, Zheng, Da, Nisa, Israt, Ioannidis, Vasileios, Song, Xiang, and Karypis, George
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, General Engineering, Machine Learning (cs.LG)
Abstract: Many real-world graphs contain time domain information. Temporal Graph Neural Networks capture temporal information as well as structural and contextual information in the generated dynamic node embeddings. Researchers have shown that these embeddings achieve state-of-the-art performance in many different tasks. In this work, we propose TGL, a unified framework for large-scale offline Temporal Graph Neural Network training where users can compose various Temporal Graph Neural Networks with simple configuration files. TGL comprises five main components: a temporal sampler, a mailbox, a node memory module, a memory updater, and a message passing engine. We design a Temporal-CSR data structure and a parallel sampler to efficiently sample temporal neighbors to form training mini-batches. We propose a novel random chunk scheduling technique that mitigates the problem of obsolete node memory when training with a large batch size. To address the limitations of current TGNNs only being evaluated on small-scale datasets, we introduce two large-scale real-world datasets with 0.2 and 1.3 billion temporal edges. We evaluate the performance of TGL on four small-scale datasets with a single GPU and the two large datasets with multiple GPUs for both link prediction and node classification tasks. We compare TGL with the open-sourced code of five methods and show that TGL achieves similar or better accuracy with an average of 13x speedup. Our temporal parallel sampler achieves an average of 173x speedup on a multi-core CPU compared with the baselines. On a 4-GPU machine, TGL can train one epoch of more than one billion temporal edges within 1-10 hours. To the best of our knowledge, this is the first work that proposes a general framework for large-scale Temporal Graph Neural Network training on multiple GPUs., Comment: VLDB'22
Published: 2022

36. Causal-Trivial Attention Graph Neural Network for Fault Diagnosis of Complex Industrial Processes

Author: Wang, Hao, Liu, Ruonan, Ding, Steven X., Hu, Qinghua, Li, Zengxiang, and Zhou, Hongkuan
Abstract: In modern industrial systems, components have complex interactions with each other, which makes identifying the operational conditions of these systems a challenging task. Because the embedded components of an industrial system and their interactions can be expressed as nodes and edges in a graph, respectively, graph representation algorithms are powerful tools for fault diagnosis of industrial systems. As one of the most commonly used graph representation algorithms, graph neural networks (GNNs) mainly follow the law of “learning to attend.” GNNs extract features from the training data and learn the statistical correlations between features and labels, so the attended graph tends to favor noncausal features as a shortcut for prediction. This shortcut feature is unstable and depends on the data distribution characteristics in the training dataset, which reduces the generalization ability of the classifier. A causal analysis of GNN modeling for graph representation shows that shortcut features act as confounding factors between causal features and predictions, causing classifiers to learn wrong correlations. Therefore, to discover patterns of causality and weaken the confounding effects of shortcut features, a causal-trivial attention graph neural network strategy is proposed. First, node and edge representations are given by estimating soft masks. Second, through disentanglement, both causal features and shortcut features are obtained from the graph. Third, the backdoor adjustment of the causal theory is parameterized to combine each causal feature with a variety of shortcut features. Finally, comparative experiments on the three-phase flow facility dataset illustrate the effectiveness of the proposed method.
Published: 2024
Full Text: View/download PDF

37. Throughput optimization in heterogeneous MIMO networks

Author: Wang, Ta-Yang, primary, Zhou, Hongkuan, additional, Kannan, Rajgopal, additional, Swami, Ananthram, additional, and Prasanna, Viktor, additional
Published: 2022
Full Text: View/download PDF

38. Research on intelligent diagnosis system for SBLOCA in nuclear power plant

Author: Chai Wenting, Li Kaiyu, Zhou Hongkuan, Li Xianling, Wang Chenyang, and Tao Mo
Published: 2022

39. Weighted Data-Based Fault Detection Approach for Nonlinear Nuclear Power System

Author: Chen, Zhaoxu, additional, Zhou, Hongkuan, additional, Ke, Zhiwu, additional, Qi, Xiao, additional, and Qiu, Zhiqiang, additional
Published: 2022
Full Text: View/download PDF

40. Data Reconstruction of Faulty Sensors for the Nuclear Power Plants Control System: A Strong Tracking Filter Approach

Author: Zhou, Hongkuan, additional, Zheng, Wei, additional, Tao, Mo, additional, Guo, Xiaojie, additional, Huang, Chonghai, additional, Chai, Wenting, additional, Chen, Kai, additional, and Chen, Zhaoxu, additional
Published: 2022
Full Text: View/download PDF

41. Research on Stability Analysis and Guaranteed Performance Control of Networked Control System in Nuclear Power Plants

Author: Tao, Mo, primary, Wang, Chenyang, additional, Ke, Zhiwu, additional, Guo, Xiaojie, additional, Zheng, Wei, additional, Zhou, Hongkuan, additional, Chai, Wenting, additional, Feng, Yi, additional, and Sun, Quqin, additional
Published: 2022
Full Text: View/download PDF

42. Active Adaptive Fault-tolerant Control and Its Application in Ship Speed/Course Coordination

Author: Qiu, Zhiqiang, primary, Guo, Xiaojie, additional, Zheng, Wei, additional, Zhou, Hongkuan, additional, and Ke, Zhiwu, additional
Published: 2022
Full Text: View/download PDF

43. Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA

Author: Zhou, Hongkuan, primary, Zhang, Bingyi, additional, Kannan, Rajgopal, additional, Prasanna, Viktor, additional, and Busart, Carl, additional
Published: 2022
Full Text: View/download PDF

44. Anti-jamming strategy based on game theory in single-channel UAV communication network

Author: Xu, Jiangwei, primary, Wang, Kehao, additional, Zhang, Xun, additional, Liu, Pei, additional, Kong, Dejin, additional, and Zhou, Hongkuan, additional
Published: 2021
Full Text: View/download PDF

45. SeDyT: A General Framework for Multi-Step Event Forecasting via Sequence Modeling on Dynamic Entity Embeddings

Author: Zhou, Hongkuan, primary, Orme-Rogers, James, additional, Kannan, Rajgopal, additional, and Prasanna, Viktor, additional
Published: 2021
Full Text: View/download PDF

46. Accelerating large scale real-time GNN inference using channel pruning

Author: Zhou, Hongkuan, primary, Srivastava, Ajitesh, additional, Zeng, Hanqing, additional, Kannan, Rajgopal, additional, and Prasanna, Viktor, additional
Published: 2021
Full Text: View/download PDF

47. Deep Dynamic Adaptive Transfer Network for Rolling Bearing Fault Diagnosis With Considering Cross-Machine Instance

Author: Zhou, Yuxuan, primary, Dong, Yining, additional, Zhou, Hongkuan, additional, and Tang, Gang, additional
Published: 2021
Full Text: View/download PDF

48. Sensor Correlation Network Based Anomaly Detection for Thermal Systems on Ships

Author: Zheng, Wei, primary, Zhou, Hongkuan, additional, Qiu, Zhiqiang, additional, Ke, Zhiwu, additional, Tao, Mo, additional, and Chen, Zhaoxu, additional
Published: 2020
Full Text: View/download PDF

49. Design and Implementation of Knowledge Base for Runtime Management of Software Defined Hardware

Author: Zhou, Hongkuan, primary, Srivastava, Ajitesh, additional, Kannan, Rajgopal, additional, and Prasanna, Viktor, additional
Published: 2019
Full Text: View/download PDF

50. Decision Tree Based Sea-Surface Weak Target Detection With False Alarm Rate Controllable

Author: Zhou, Hongkuan, primary and Jiang, Tao, additional
Published: 2019
Full Text: View/download PDF
