64,856 results on '"Wang, Chao"'
Search Results
2. Autoencoder Enhanced Realised GARCH on Volatility Forecasting
- Author
-
Zhao, Qianli, Wang, Chao, Gerlach, Richard, Storti, Giuseppe, and Zhang, Lingxiang
- Subjects
Quantitative Finance - Risk Management, Computer Science - Machine Learning, Economics - Econometrics
- Abstract
Realised volatility has become increasingly prominent in volatility forecasting due to its ability to capture intraday price fluctuations. With a growing variety of realised volatility estimators, each with unique advantages and limitations, selecting an optimal estimator may introduce challenges. In this thesis, aiming to synthesise the impact of various realised volatility measures on volatility forecasting, we propose an extension of the Realised GARCH model that incorporates an autoencoder-generated synthetic realised measure, combining the information from multiple realised measures in a nonlinear manner. Our proposed model extends existing linear methods, such as Principal Component Analysis and Independent Component Analysis, to reduce the dimensionality of realised measures. The empirical evaluation, conducted across four major stock markets from January 2000 to June 2022 and including the period of COVID-19, demonstrates both the feasibility of applying an autoencoder to synthesise volatility measures and the superior effectiveness of the proposed model in one-step-ahead rolling volatility forecasting. The model exhibits enhanced flexibility in parameter estimations across each rolling window, outperforming traditional linear approaches. These findings indicate that nonlinear dimension reduction offers further adaptability and flexibility in improving the synthetic realised measure, with promising implications for future volatility forecasting applications.
- Comment
48 pages, 6 figures
- Published
- 2024
3. Knowledge-aware Evolutionary Graph Neural Architecture Search
- Author
-
Wang, Chao, Zhao, Jiaxuan, Li, Lingling, Jiao, Licheng, Liu, Fang, Liu, Xu, and Yang, Shuyuan
- Subjects
Computer Science - Neural and Evolutionary Computing, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Graph neural architecture search (GNAS) can customize high-performance graph neural network architectures for specific graph tasks or datasets. However, existing GNAS methods begin searching for architectures from a zero-knowledge state, ignoring the prior knowledge that may improve the search efficiency. The available knowledge base (e.g. NAS-Bench-Graph) contains many rich architectures and their multiple performance metrics, such as the accuracy (#Acc) and number of parameters (#Params). This study proposes exploiting such prior knowledge to accelerate the multi-objective evolutionary search on a new graph dataset, named knowledge-aware evolutionary GNAS (KEGNAS). KEGNAS employs the knowledge base to train a knowledge model and a deep multi-output Gaussian process (DMOGP) in one go, which generates and evaluates transfer architectures in only a few GPU seconds. The knowledge model first establishes a dataset-to-architecture mapping, which can quickly generate candidate transfer architectures for a new dataset. Subsequently, the DMOGP with architecture and dataset encodings is designed to predict multiple performance metrics for candidate transfer architectures on the new dataset. According to the predicted metrics, non-dominated candidate transfer architectures are selected to warm-start the multi-objective evolutionary algorithm for optimizing the #Acc and #Params on a new dataset. Empirical studies on NAS-Bench-Graph and five real-world datasets show that KEGNAS swiftly generates top-performance architectures, achieving 4.27% higher accuracy than advanced evolutionary baselines and 11.54% higher accuracy than advanced differentiable baselines. In addition, ablation studies demonstrate that the use of prior knowledge significantly improves the search performance.
- Comment
This work has been accepted by Knowledge-Based Systems
- Published
- 2024
4. Self-testing quantum randomness expansion on an integrated photonic chip
- Author
-
Zhang, Gong, Primaatmaja, Ignatius William, Chen, Yue, Ng, Si Qi, Ng, Hong Jie, Pistoia, Marco, Gong, Xiao, Goh, Koon Tong, Wang, Chao, and Lim, Charles
- Subjects
Quantum Physics
- Abstract
The power of quantum random number generation is more than just the ability to create truly random numbers; it can also enable self-testing, which allows the user to verify the implementation integrity of certain critical quantum components with minimal assumptions. In this work, we develop and implement a self-testing quantum random number generator (QRNG) chipset capable of generating 15.33 Mbits of certifiable randomness in each run (an expansion rate of $5.11\times 10^{-4}$ at a repetition rate of 10 MHz). The chip design is based on a highly loss-and-noise tolerant measurement-device-independent protocol, where random coherent states encoded using quadrature phase shift keying are used to self-test the quantum homodyne detection unit, which is well known to be challenging to characterise in practice. Importantly, this proposal opens up the possibility to implement miniaturised self-testing QRNG devices at production scale using standard silicon photonics foundry platforms.
- Comment
15 pages, 5 figures, and 2 tables
- Published
- 2024
5. Superpixel-informed Implicit Neural Representation for Multi-Dimensional Data
- Author
-
Li, Jiayi, Zhao, Xile, Wang, Jianli, Wang, Chao, and Wang, Min
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Recently, implicit neural representations (INRs) have attracted increasing attention for multi-dimensional data recovery. However, INRs simply map coordinates via a multi-layer perceptron (MLP) to corresponding values, ignoring the inherent semantic information of the data. To leverage semantic priors from the data, we propose a novel Superpixel-informed INR (S-INR). Specifically, we suggest utilizing generalized superpixels instead of pixels as an alternative basic unit of INR for multi-dimensional data (e.g., images and weather data). The coordinates of generalized superpixels are first fed into exclusive attention-based MLPs, and then the intermediate results interact with a shared dictionary matrix. The elaborately designed modules in S-INR allow us to ingeniously exploit the semantic information within and across generalized superpixels. Extensive experiments on various applications validate the effectiveness and efficacy of our S-INR compared to state-of-the-art INR methods.
- Comment
Accepted at ECCV 2024, 18 pages, 7 figures
- Published
- 2024
6. Imprints of black hole charge on the precessing jet nozzle of M87*
- Author
-
Meng, Xiang-Cheng, Wang, Chao-Hui, and Wei, Shao-Wen
- Subjects
General Relativity and Quantum Cosmology, Astrophysics - High Energy Astrophysical Phenomena, High Energy Physics - Theory
- Abstract
The observed jet precession period of approximately 11 years for M87* strongly suggests the presence of a supermassive rotating black hole with a tilted accretion disk at the center of the galaxy. By modeling the motion of the tilted accretion disk particle with the spherical orbits around a Kerr-Newman black hole, we study the effect of charge on the observation of the precession period, thereby exploring the potential of this strong-gravity observation in constraining multiple black hole parameters. Firstly, we study the spherical orbits around a Kerr-Newman black hole and find that their precession periods increase with the charge. Secondly, we utilize the observed M87* jet precession period to constrain the relationship between the spin, charge, and warp radius, specifically detailing the correlations between each pair of these three quantities. Moreover, to further refine constraints on the charge, we explore the negative correlation between the maximum warp radius and charge. A significant result shows that the gap between the maximum warp radii of the prograde and retrograde orbits decreases with the black hole charge. If the warp radius is provided by other observations, different constraints on the charge can be derived for the prograde and retrograde cases. These results suggest that in the era of multi-messenger astronomy, such strong-gravity observations of the precessing jet nozzle present a promising avenue for constraining black hole parameters.
- Comment
15 pages, 7 figures
- Published
- 2024
7. Physics-informed Kolmogorov-Arnold Network with Chebyshev Polynomials for Fluid Mechanics
- Author
-
Guo, Chunyu, Sun, Lucheng, Li, Shilong, Yuan, Zelong, and Wang, Chao
- Subjects
Physics - Fluid Dynamics
- Abstract
Solving partial differential equations (PDEs) is essential in scientific forecasting and fluid dynamics. Traditional approaches often incur expensive computational costs and trade-offs in efficiency and accuracy. Recent deep neural networks improve accuracy but require quality training data. Physics-informed neural networks (PINNs) effectively integrate physical laws, reducing data reliance in limited sample scenarios. A novel machine-learning framework, Chebyshev physics-informed Kolmogorov-Arnold network (ChebPIKAN), is proposed to integrate the robust architectures of Kolmogorov-Arnold networks (KAN) with physical constraints to enhance calculation accuracy of PDEs for fluid mechanics. We explore the fundamentals of KAN, emphasize the advantages of using the orthogonality of Chebyshev polynomial basis functions in spline fitting, and describe the incorporation of physics-informed loss functions tailored to specific PDEs in fluid dynamics, including the Allen-Cahn equation, the nonlinear Burgers equation, two-dimensional Helmholtz equations, two-dimensional Kovasznay flow and two-dimensional Navier-Stokes equations. Extensive experiments demonstrate that the proposed ChebPIKAN model significantly outperforms the standard KAN architecture in solving various PDEs by embedding essential physical information more effectively. These results indicate that augmenting KAN with physical constraints can not only alleviate overfitting issues of KAN but also improve extrapolation performance. Consequently, this study highlights the potential of ChebPIKAN as a powerful tool in computational fluid dynamics, proposing a path toward fast and reliable predictions in fluid mechanics and beyond.
- Published
- 2024
8. TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
- Author
-
Wu, Wei, Pan, Zhuoshi, Wang, Chao, Chen, Liyi, Bai, Yunchu, Fu, Kun, Wang, Zheng, and Xiong, Hui
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
With the development of large language models (LLMs), the ability to handle longer contexts has become a key capability for Web applications such as cross-document understanding and LLM-powered search systems. However, this progress faces two major challenges: performance degradation due to out-of-distribution sequence lengths, and excessively long inference times caused by the quadratic computational complexity of attention. These issues hinder the application of LLMs in long-context scenarios. In this paper, we propose Dynamic Token-Level KV Cache Selection (TokenSelect), a model-agnostic, training-free method for efficient and accurate long-context inference. TokenSelect builds upon the observation of non-contiguous attention sparsity, using Query-Key dot products to measure per-head KV Cache criticality at the token level. Through a per-head soft voting mechanism, TokenSelect selectively involves a small number of critical KV cache tokens in the attention calculation without sacrificing accuracy. To further accelerate TokenSelect, we designed the Selection Cache based on observations of consecutive Query similarity and implemented an efficient dot product kernel, significantly reducing the overhead of token selection. A comprehensive evaluation of TokenSelect demonstrates up to 23.84x speedup in attention computation and up to 2.28x acceleration in end-to-end latency, while providing superior performance compared to state-of-the-art long-context inference methods.
- Published
- 2024
9. Efficient learning of mixed-state tomography for photonic quantum walk
- Author
-
Wang, Qin-Qin, Dong, Shaojun, Li, Xiao-Wei, Xu, Xiao-Ye, Wang, Chao, Han, Shuai, Yung, Man-Hong, Han, Yong-Jian, Li, Chuan-Feng, and Guo, Guang-Can
- Subjects
Quantum Physics, Physics - Optics
- Abstract
Noise-enhanced applications in open quantum walk (QW) have recently seen a surge due to their ability to improve performance. However, verifying the success of open QW is challenging, as mixed-state tomography is a resource-intensive process, and implementing all required measurements is almost impossible due to various physical constraints. To address this challenge, we present a neural-network-based method for reconstructing mixed states with a high fidelity (~97.5%) while costing only 50% of the number of measurements typically required for open discrete-time QW in one dimension. Our method uses a neural density operator that models the system and environment, followed by a generalized natural gradient descent procedure that significantly speeds up the training process. Moreover, we introduce a compact interferometric measurement device, improving the scalability of our photonic QW setup that enables experimental learning of mixed states. Our results demonstrate that highly expressive neural networks can serve as powerful alternatives to traditional state tomography.
- Published
- 2024
10. OwMatch: Conditional Self-Labeling with Consistency for Open-World Semi-Supervised Learning
- Author
-
Niu, Shengjie, Lin, Lifan, Huang, Jian, and Wang, Chao
- Subjects
Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
- Abstract
Semi-supervised learning (SSL) offers a robust framework for harnessing the potential of unannotated data. Traditionally, SSL mandates that all classes possess labeled instances. However, the emergence of open-world SSL (OwSSL) introduces a more practical challenge, wherein unlabeled data may encompass samples from unseen classes. This scenario leads to misclassification of unseen classes as known ones, consequently undermining classification accuracy. To overcome this challenge, this study revisits two methodologies from self-supervised and semi-supervised learning, self-labeling and consistency, tailoring them to address the OwSSL problem. Specifically, we propose an effective framework called OwMatch, combining conditional self-labeling and open-world hierarchical thresholding. Theoretically, we analyze the estimation of class distribution on unlabeled data through rigorous statistical analysis, thus demonstrating that OwMatch can ensure the unbiasedness of the self-label assignment estimator with reliability. Comprehensive empirical analyses demonstrate that our method yields substantial performance enhancements across both known and unknown classes in comparison to previous studies. Code is available at https://github.com/niusj03/OwMatch.
- Comment
NeurIPS 2024 camera-ready (10 pages, 4 figures) with the appendices (10 pages, 7 figures)
- Published
- 2024
11. Towards Constraint-aware Learning for Resource Allocation in NFV-enabled Networks
- Author
-
Wang, Tianfu, Yang, Long, Wang, Chao, Qin, Chuan, Deng, Liwei, Shen, Li, and Xiong, Hui
- Subjects
Computer Science - Networking and Internet Architecture
- Abstract
Virtual Network Embedding (VNE) is a challenging combinatorial optimization problem that refers to resource allocation associated with hard and multifaceted constraints in network function virtualization (NFV). Existing works for VNE struggle to handle such complex constraints, leading to compromised system performance and stability. In this paper, we propose a CONstraint-Aware Learning framework for VNE, named CONAL, to achieve efficient constraint management. Concretely, we formulate the VNE problem as a constrained Markov decision process with violation tolerance. This modeling approach aims to improve both resource utilization and solution feasibility by precisely evaluating solution quality and the degree of constraint violation. We also propose a reachability-guided optimization with an adaptive reachability budget method that dynamically assigns budget values. This method achieves persistent zero violation to guarantee the feasibility of VNE solutions and more stable policy optimization by handling instances without any feasible solution. Furthermore, we propose a constraint-aware graph representation method to efficiently learn cross-graph relations and constrained path connectivity in VNE. Finally, extensive experimental results demonstrate the superiority of our proposed method over state-of-the-art baselines. Our code is available at https://github.com/GeminiLight/conal-vne.
- Published
- 2024
12. Graph Signal Processing for Global Stock Market Volatility Forecasting
- Author
-
Chi, Zhengyang, Gao, Junbin, and Wang, Chao
- Subjects
Quantitative Finance - General Finance
- Abstract
The interconnectedness of global financial markets has brought increasing attention to modeling volatility spillover effects. By incorporating Graph Signal Processing techniques, a novel multivariate framework, extending the traditional Heterogeneous Auto-Regressive model, is developed in the spectral domain constructed by the graph Fourier transformation method. Further, a set of convolution filters with learnable weights is employed to more flexibly aggregate the past mid-term and long-term information. Using 24 global stock market indices, the effectiveness of the proposed model is demonstrated through comprehensive empirical evaluations.
- Published
- 2024
13. Quantum Computational Insurance and Actuarial Science
- Author
-
Liu, Huan-Yu, Zhuang, Xi-Ning, Wang, Chao, Dou, Meng-Han, Chen, Zhao-Yun, Xue, Cheng, Wu, Yu-Chun, and Guo, Guo-Ping
- Subjects
Quantum Physics
- Abstract
In recent years, quantum computation has been rapidly advancing, driving a technological revolution with significant potential across various sectors, particularly in finance. Despite this, the insurance industry, an essential tool for mitigating unforeseen risks and losses, has received limited attention. This paper provides an initial exploration into the realm of quantum computational insurance and actuarial science. After introducing key insurance models and challenges, we examine quantum algorithms designed to address complex insurance issues. Our study includes experimental and numerical demonstrations of quantum applications in non-life insurance, life insurance, and reinsurance. Additionally, we explore the timeline for quantum insurance, the development of quantum-enhanced insurance products, and the challenges posed by quantum computational advancements.
- Comment
10 pages, 5 figures
- Published
- 2024
14. MMDocBench: Benchmarking Large Vision-Language Models for Fine-Grained Visual Document Understanding
- Author
-
Zhu, Fengbin, Liu, Ziyang, Ng, Xiang Yao, Wu, Haohui, Wang, Wenjie, Feng, Fuli, Wang, Chao, Luan, Huanbo, and Chua, Tat Seng
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Large Vision-Language Models (LVLMs) have achieved remarkable performance in many vision-language tasks, yet their capabilities in fine-grained visual understanding remain insufficiently evaluated. Existing benchmarks either contain limited fine-grained evaluation samples that are mixed with other data, or are confined to object-level assessments in natural images. To holistically assess LVLMs' fine-grained visual understanding capabilities, we propose using document images with multi-granularity and multi-modal information to supplement natural images. In this light, we construct MMDocBench, a benchmark with various OCR-free document understanding tasks for the evaluation of fine-grained visual perception and reasoning abilities. MMDocBench defines 15 main tasks with 4,338 QA pairs and 11,353 supporting regions, covering various document images such as research papers, receipts, financial reports, Wikipedia tables, charts, and infographics. Based on MMDocBench, we conduct extensive experiments using 13 open-source and 3 proprietary advanced LVLMs, assessing their strengths and weaknesses across different tasks and document image types. The benchmark, task instructions, and evaluation code will be made publicly available.
- Comment
Under review
- Published
- 2024
15. PMM-Net: Single-stage Multi-agent Trajectory Prediction with Patching-based Embedding and Explicit Modal Modulation
- Author
-
Liu, Huajian, Dong, Wei, Fan, Kunpeng, Wang, Chao, and Gao, Yongzhuo
- Subjects
Computer Science - Robotics, Computer Science - Artificial Intelligence
- Abstract
Analyzing and forecasting trajectories of agents like pedestrians plays a pivotal role for embodied intelligent applications. The inherent indeterminacy of human behavior and complex social interaction among a rich variety of agents make this task more challenging than common time-series forecasting. In this letter, we aim to explore a distinct formulation for a multi-agent trajectory prediction framework. Specifically, we propose a patching-based temporal feature extraction module and a graph-based social feature extraction module, enabling effective feature extraction and cross-scenario generalization. Moreover, we reassess the role of social interaction and present a novel method based on explicit modality modulation to integrate temporal and social features, thereby constructing an efficient single-stage inference pipeline. Results on public benchmark datasets demonstrate the superior performance of our model compared with the state-of-the-art methods. The code is available at: github.com/TIB-K330/pmm-net.
- Published
- 2024
16. PiLocNet: Physics-informed neural network on 3D localization with rotating point spread function
- Author
-
Lu, Mingda, Ao, Zitian, Wang, Chao, Prasad, Sudhakar, and Chan, Raymond H.
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Physics - Optics
- Abstract
For the 3D localization problem using point spread function (PSF) engineering, we propose a novel enhancement of our previously introduced localization neural network, LocNet. The improved network is a physics-informed neural network (PINN) that we call PiLocNet. Previous works on the localization problem may be categorized separately into model-based optimization and neural network approaches. Our PiLocNet combines the unique strengths of both approaches by incorporating forward-model-based information into the network via a data-fitting loss term that constrains the neural network to yield results that are physically sensible. We additionally incorporate certain regularization terms from the variational method, which further improves the robustness of the network in the presence of image noise, as we show for the Poisson and Gaussian noise models. This framework accords interpretability to the neural network, and the results we obtain show its superiority. Although the paper focuses on the use of single-lobe rotating PSF to encode the full 3D source location, we expect the method to be widely applicable to other PSFs and imaging problems that are constrained by known forward processes.
- Comment
25 pages, 4 figures
- Published
- 2024
17. A Lightweight Target-Driven Network of Stereo Matching for Inland Waterways
- Author
-
Su, Jing, Zhou, Yiqing, Zhang, Yu, Wang, Chao, and Wei, Yi
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Stereo matching for inland waterways is one of the key technologies for the autonomous navigation of Unmanned Surface Vehicles (USVs), which involves dividing the stereo images into reference images and target images for pixel-level matching. However, due to the challenges of the inland waterway environment, such as blurred textures, large spatial scales, and computational resource constraints of the USV platform, the participation of geometric features from the target image is required for efficient target-driven matching. Based on this target-driven concept, we propose a lightweight target-driven stereo matching neural network, named LTNet. Specifically, a lightweight and efficient 4D cost volume, named the Geometry Target Volume (GTV), is designed to fully utilize the geometric information of target features by employing the shifted target features as the filtered feature volume. Subsequently, to address the substantial texture interference and object occlusions present in the waterway environment, a Left-Right Consistency Refinement (LRR) module is proposed. The LRR utilizes the pixel-level differences in left and right disparities to introduce soft constraints, thereby enhancing the accuracy of predictions during the intermediate stages of the network. Moreover, knowledge distillation is utilized to enhance the generalization capability of lightweight models on the USVInland dataset. Furthermore, a new large-scale benchmark, named Spring, is utilized to validate the applicability of LTNet across various scenarios. In experiments on the aforementioned two datasets, LTNet achieves competitive results, with only 3.7M parameters. The code is available at https://github.com/Open-YiQingZhou/LTNet.
- Comment
12 pages, 6 figures
- Published
- 2024
18. A Variational Bayesian Inference Theory of Elasticity and Its Mixed Probabilistic Finite Element Method for Inverse Deformation Solutions in Any Dimension
- Author
-
Wang, Chao and Li, Shaofan
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Mathematics - Numerical Analysis
- Abstract
In this work, we have developed a variational Bayesian inference theory of elasticity, which is accomplished by using a mixed Variational Bayesian inference Finite Element Method (VBI-FEM) that can be used to solve the inverse deformation problems of continua. In the proposed variational Bayesian inference theory of continuum mechanics, the elastic strain energy is used as a prior in a Bayesian inference network, which can intelligently recover the detailed continuum deformation mappings given only the information on the deformed and undeformed continuum body shapes, without knowing the interior deformation, the precise actual boundary conditions (both traction and displacement boundary conditions), or the actual material constitutive relation. Moreover, we have implemented the related finite element formulation in a computational probabilistic mechanics framework. To numerically solve the mixed variational problem, we developed an operator splitting or staggered algorithm that consists of the finite element (FE) step and the Bayesian learning (BL) step as an analogue of the well-known Expectation-Maximization (EM) algorithm. By solving the mixed probabilistic Galerkin variational problem, we demonstrated that the proposed method is able to inversely predict continuum deformation mappings with strong discontinuity or fracture without knowing the external load conditions. The proposed method provides a robust machine intelligent solution for the long-sought-after inverse problem solution, which has been a major challenge in structure failure forensic pattern analysis in the past several decades. The proposed method may become a promising artificial intelligence-based inverse method for solving general partial differential equations.
- Published
- 2024
19. SPikE-SSM: A Sparse, Precise, and Efficient Spiking State Space Model for Long Sequences Learning
- Author
-
Zhong, Yan, Zhao, Ruoyu, Wang, Chao, Guo, Qinghai, Zhang, Jianguo, Lu, Zhichao, and Leng, Luziwei
- Subjects
Computer Science - Neural and Evolutionary Computing, Computer Science - Artificial Intelligence
- Abstract
Spiking neural networks (SNNs) provide an energy-efficient solution by utilizing the spike-based and sparse nature of biological systems. Since the advent of Transformers, SNNs have struggled to compete with artificial networks on long sequential tasks, until the recent emergence of state space models (SSMs), which offer superior computational efficiency and modeling capability. However, applying the highly capable SSMs to SNNs for long sequences learning poses three major challenges: (1) The membrane potential is determined by the past spiking history of the neuron, leading to reduced efficiency for sequence modeling in parallel computing scenarios. (2) Complex dynamics of biological spiking neurons are crucial for functionality but challenging to simulate and exploit effectively in large networks. (3) It is arduous to maintain high sparsity while achieving high accuracy for spiking neurons without resorting to dense computing, as utilized in artificial neuron-based SSMs. To address them, we propose a sparse, precise and efficient spiking SSM framework, termed SPikE-SSM. For (1), we propose a boundary compression strategy (PMBC) to accelerate the inference of the spiking neuron model, enabling parallel processing for long sequence learning. For (2), we propose a novel and concise neuron model incorporating a reset-refractory mechanism to leverage the inherent temporal dimension for dynamic computing with biological interpretability. For (3), we hierarchically integrate the proposed neuron model into the original SSM block, and enhance the dynamics of SPikE-SSM by incorporating trainable thresholds and refractory magnitudes to balance accuracy and sparsity. Extensive experiments verify the effectiveness and robustness of SPikE-SSM on the long range arena benchmarks and large language dataset WikiText-103, showing the potential of dynamic spiking neurons in efficient long sequence learning.
- Comment
23 pages, 5 figures
- Published
- 2024
20. Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding
- Author
-
Wu, Wei, Wang, Chao, Chen, Liyi, Yin, Mingze, Zhu, Yiheng, Fu, Kun, Ye, Jieping, Xiong, Hui, and Wang, Zheng
- Subjects
Computer Science - Computation and Language, Quantitative Biology - Biomolecules
- Abstract
Proteins, as essential biomolecules, play a central role in biological processes, including metabolic reactions and DNA replication. Accurate prediction of their properties and functions is crucial in biological applications. Recent development of protein language models (pLMs) with supervised fine tuning provides a promising solution to this problem. However, the fine-tuned model is tailored for a particular downstream prediction task, and achieving general-purpose protein understanding remains a challenge. In this paper, we introduce the Structure-Enhanced Protein Instruction Tuning (SEPIT) framework to bridge this gap. Our approach integrates a novel structure-aware module into pLMs to inform them with structural knowledge, and then connects these enhanced pLMs to large language models (LLMs) to generate understanding of proteins. In this framework, we propose a novel two-stage instruction tuning pipeline that first establishes a basic understanding of proteins through caption-based instructions and then refines this understanding using a mixture of experts (MoEs) to learn more complex properties and functional information with the same amount of activated parameters. Moreover, we construct the largest and most comprehensive protein instruction dataset to date, which allows us to train and evaluate the general-purpose protein understanding model. Extensive experimental results on open-ended generation and closed-set answer tasks demonstrate the superior performance of SEPIT over both closed-source general LLMs and open-source LLMs trained with protein knowledge.
- Published
- 2024
21. An Approach to Elicit Human-Understandable Robot Expressions to Support Human-Robot Interaction
- Author
-
Leusmann, Jan, Villa, Steeven, Liang, Thomas, Wang, Chao, Schmidt, Albrecht, and Mayer, Sven
- Subjects
Computer Science - Robotics ,Computer Science - Human-Computer Interaction - Abstract
Understanding the intentions of robots is essential for natural and seamless human-robot collaboration. Ensuring that robots have means for non-verbal communication is a basis for intuitive and implicit interaction. For this, we contribute an approach to elicit and design human-understandable robot expressions. We outline the approach in the context of non-humanoid robots. We paired human mimicking and enactment with research from gesture elicitation in two phases: first, to elicit expressions, and second, to ensure they are understandable. We present an example application through two studies (N=16 & N=260) of our approach to elicit expressions for a simple 6-DoF robotic arm. We show that it enabled us to design robot expressions that signal curiosity and interest in getting attention. Our main contribution is an approach to generate and validate understandable expressions for robots, enabling more natural human-robot interaction.
- Published
- 2024
22. Normalizing flow regularization for photoacoustic tomography
- Author
-
Wang, Chao and Thiery, Alexandre H.
- Subjects
Mathematics - Optimization and Control ,Mathematics - Probability - Abstract
Proper regularization is crucial in inverse problems to achieve high-quality reconstruction, even with an ill-conditioned measurement system. This is particularly true for three-dimensional photoacoustic tomography, which is computationally demanding and requires rapid scanning, often leading to incomplete measurements. Deep neural networks, known for their efficiency in handling big data, are anticipated to be adept at extracting underlying information from images sharing certain characteristics, such as specific types of natural or medical images. We introduce a Normalizing Flow Regularization (NFR) method designed to reconstruct images from incomplete and noisy measurements. The method involves training a normalizing flow network to understand the statistical distribution of sample images by mapping them to Gaussian distributions. This well-trained network then acts as a regularization tool within a Bayesian inversion framework. Additionally, we explore the concept of adaptive regularization selection, providing theoretical proof of its admissibility. A significant challenge in three-dimensional image training is the extensive memory and computation requirements. We address this by training the normalizing flow model using only small-size images and applying a patch-based model for reconstructing larger images. Our approach is model-independent, allowing the reuse of a well-trained network as regularization for various imaging systems. Moreover, as a data-driven prior, NFR effectively leverages the available dataset information, outperforming artificial priors. This advantage is demonstrated through numerical simulations of three-dimensional photoacoustic tomography under various conditions of sparsity, noise levels, and limited-view scenarios.
- Published
- 2024
23. GraphGI: A GNN Explanation Method using Game Interaction
- Author
-
Xian, Xingping, Liu, Jianlu, Wu, Tao, Yuan, Lin, Wang, Chao, and Chen, Baiyun
- Subjects
Computer Science - Machine Learning ,Computer Science - Social and Information Networks - Abstract
Graph Neural Networks (GNNs) have garnered significant attention and have been extensively utilized across various domains. However, similar to other deep learning models, GNNs are often viewed as black-box models, making it challenging to interpret their prediction mechanisms. Current graph explanation techniques focus on identifying key nodes or edges, attributing the critical data features that drive model predictions. Nevertheless, these features do not independently influence the model's outcomes; rather, they interact with one another to collectively affect predictions. In this work, we propose a novel explanatory method GraphGI, which identifies the coalition with the highest interaction strength and presents it as an explanatory subgraph. Given a trained model and an input graph, our method explains predictions by gradually incorporating significant edges into the selected subgraph. We utilize game-theoretic interaction values to assess the interaction strength after edge additions, ensuring that the newly added edges confer maximum interaction strength to the explanatory subgraph. To enhance computational efficiency, we adopt effective approximation techniques for calculating Shapley values and game-theoretic interaction values. Empirical evaluations demonstrate that our method achieves superior fidelity and sparsity, maintaining the interpretability of the results at a comprehensible level.
- Published
- 2024
24. Scalable tensor network algorithm for thermal quantum many-body systems in two dimension
- Author
-
Zhang, Meng, Zhang, Hao, Wang, Chao, and He, Lixin
- Subjects
Condensed Matter - Strongly Correlated Electrons - Abstract
Simulating strongly-correlated quantum many-body systems at finite temperatures is a significant challenge in computational physics. In this work, we present a scalable finite-temperature tensor network algorithm for two-dimensional quantum many-body systems. We employ the (fermionic) projected entangled pair state (PEPS) to represent the vectorization of the quantum thermal state and utilize a stochastic reconfiguration method to cool down the quantum states from infinite temperature. We validate our method by benchmarking it against the 2D antiferromagnetic Heisenberg model, the $J_1$-$J_2$ model, and the Fermi-Hubbard model, comparing physical properties such as internal energy, specific heat, and magnetic susceptibility with results obtained from stochastic series expansion (SSE), exact diagonalization, and determinant quantum Monte Carlo (DQMC).
- Published
- 2024
25. Global Stock Market Volatility Forecasting Incorporating Dynamic Graphs and All Trading Days
- Author
-
Chi, Zhengyang, Gao, Junbin, and Wang, Chao
- Subjects
Quantitative Finance - General Finance ,Economics - General Economics - Abstract
This study introduces a global stock market volatility forecasting model that enhances forecasting accuracy and practical utility in real-world financial decision-making by integrating dynamic graph structures and encompassing the union of active trading days of different stock markets. The model employs a spatial-temporal graph neural network (GNN) architecture to capture the volatility spillover effect, where shocks in one market spread to others through the interconnective global economy. By calculating the volatility spillover index to depict the volatility network as graphs, the model effectively mirrors the volatility dynamics for the chosen stock market indices. In the empirical analysis, the proposed model surpasses the benchmark model in all forecasting scenarios and is shown to be sensitive to the underlying volatility interrelationships.
- Published
- 2024
26. RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs
- Author
-
Wu, Jiaxing, Ning, Lin, Liu, Luyang, Lee, Harrison, Wu, Neo, Wang, Chao, Prakash, Sushant, O'Banion, Shawn, Green, Bradley, and Xie, Jun
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
LLM-powered personalization agent systems employ Large Language Models (LLMs) to predict users' behavior from their past activities. However, their effectiveness often hinges on the ability to leverage extensive, long user historical data, which is difficult because of the inherent noise and length of such data. Existing pretrained LLMs may generate summaries that are concise but lack the necessary context for downstream tasks, hindering their utility in personalization systems. To address these challenges, we introduce Reinforcement Learning from Prediction Feedback (RLPF). RLPF fine-tunes LLMs to generate concise, human-readable user summaries that are optimized for downstream task performance. By maximizing the usefulness of the generated summaries, RLPF effectively distills extensive user history data while preserving essential information for downstream tasks. Our empirical evaluation demonstrates significant improvements in both extrinsic downstream task utility and intrinsic summary quality, surpassing baseline methods by up to 22% on downstream task performance and achieving an up to 84.59% win rate on Factuality, Abstractiveness, and Readability. RLPF also achieves a remarkable 74% reduction in context length while improving performance on 16 out of 19 unseen tasks and/or datasets, showcasing its generalizability. This approach offers a promising solution for enhancing LLM personalization by effectively transforming long, noisy user histories into informative and human-readable representations.
- Published
- 2024
27. Enabling Practical and Privacy-Preserving Image Processing
- Author
-
Wang, Chao, Yang, Shubing, Sun, Xiaoyan, Dai, Jun, and Zhao, Dongfang
- Subjects
Computer Science - Cryptography and Security ,C.2.0 ,K.6.5 - Abstract
Fully Homomorphic Encryption (FHE) enables computations on encrypted data, preserving confidentiality without the need for decryption. However, FHE is often hindered by significant performance overhead, particularly for high-precision and complex data like images. Due to serious efficiency issues, traditional FHE methods often encrypt images by monolithic data blocks (such as pixel rows), instead of pixels. However, this strategy compromises the advantages of homomorphic operations and disables pixel-level image processing. In this study, we address these challenges by proposing and implementing a pixel-level homomorphic encryption approach, iCHEETAH, based on the CKKS scheme. To enhance computational efficiency, we introduce three novel caching mechanisms to pre-encrypt radix values or frequently occurring pixel values, substantially reducing redundant encryption operations. Extensive experiments demonstrate that our approach achieves up to a 19-fold improvement in encryption speed compared to the original CKKS, while maintaining high image quality. Additionally, real-world image applications such as mean filtering, brightness enhancement, image matching and watermarking are tested based on FHE, showcasing up to a 91.53% speed improvement. We also proved that our method is IND-CPA (Indistinguishability under Chosen Plaintext Attack) secure, providing strong encryption security. These results underscore the practicality and efficiency of iCHEETAH, marking a significant advancement in privacy-preserving image processing at scale., Comment: 16 pages, 10 figures
- Published
- 2024
28. FairQuant: Certifying and Quantifying Fairness of Deep Neural Networks
- Author
-
Kim, Brian Hyeongseok, Wang, Jingbo, and Wang, Chao
- Subjects
Computer Science - Machine Learning ,Computer Science - Software Engineering - Abstract
We propose a method for formally certifying and quantifying individual fairness of deep neural networks (DNN). Individual fairness guarantees that any two individuals who are identical except for a legally protected attribute (e.g., gender or race) receive the same treatment. While there are existing techniques that provide such a guarantee, they tend to suffer from lack of scalability or accuracy as the size and input dimension of the DNN increase. Our method overcomes this limitation by applying abstraction to a symbolic interval based analysis of the DNN followed by iterative refinement guided by the fairness property. Furthermore, our method lifts the symbolic interval based analysis from conventional qualitative certification to quantitative certification, by computing the percentage of individuals whose classification outputs are provably fair, instead of merely deciding if the DNN is fair. We have implemented our method and evaluated it on deep neural networks trained on four popular fairness research datasets. The experimental results show that our method is not only more accurate than state-of-the-art techniques but also several orders-of-magnitude faster., Comment: Accepted at ICSE 2025; To Appear In Proceedings of the 47th IEEE/ACM International Conference on Software Engineering
- Published
- 2024
29. UserSumBench: A Benchmark Framework for Evaluating User Summarization Approaches
- Author
-
Wang, Chao, Wu, Neo, Ning, Lin, Wu, Jiaxing, Liu, Luyang, Xie, Jun, O'Banion, Shawn, and Green, Bradley
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Large language models (LLMs) have shown remarkable capabilities in generating user summaries from a long list of raw user activity data. These summaries capture essential user information such as preferences and interests, and therefore are invaluable for LLM-based personalization applications, such as explainable recommender systems. However, the development of new summarization techniques is hindered by the lack of ground-truth labels, the inherent subjectivity of user summaries, and human evaluation which is often costly and time-consuming. To address these challenges, we introduce UserSumBench, a benchmark framework designed to facilitate iterative development of LLM-based summarization approaches. This framework offers two key components: (1) A reference-free summary quality metric. We show that this metric is effective and aligned with human preferences across three diverse datasets (MovieLens, Yelp and Amazon Review). (2) A novel robust summarization method that leverages time-hierarchical summarizer and self-critique verifier to produce high-quality summaries while eliminating hallucination. This method serves as a strong baseline for further innovation in summarization techniques.
- Published
- 2024
30. Multi-modal Adversarial Training for Zero-Shot Voice Cloning
- Author
-
Janiczek, John, Chong, Dading, Dai, Dongyang, Faria, Arlo, Wang, Chao, Wang, Tao, and Liu, Yuzong
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing ,Computer Science - Machine Learning ,Computer Science - Sound - Abstract
A text-to-speech (TTS) model trained to reconstruct speech given text tends towards predictions that are close to the average characteristics of a dataset, failing to model the variations that make human speech sound natural. This problem is magnified for zero-shot voice cloning, a task that requires training data with high variance in speaking styles. We build off of recent works which have used Generative Adversarial Networks (GAN) by proposing a Transformer encoder-decoder architecture that conditionally discriminates between real and generated speech features. The discriminator is used in a training pipeline that improves both the acoustic and prosodic features of a TTS model. We introduce our novel adversarial training technique by applying it to a FastSpeech2 acoustic model and training on Libriheavy, a large multi-speaker dataset, for the task of zero-shot voice cloning. Our model achieves improvements over the baseline in terms of speech quality and speaker similarity. Audio examples from our system are available online., Comment: Accepted at INTERSPEECH 2024
- Published
- 2024
31. Communication-Free Robust Wireless Power Transfer with Constant Output Power and Stable Frequency
- Author
-
Zhang, Zhuoyu, Lai, Junan, Huang, Yuangen, Hao, Xianglin, Yin, Ke, Jiang, Zhiqin, Wang, Chao, Ma, Xikui, Huang, Ming, and Dong, Tianyu
- Subjects
Physics - Applied Physics - Abstract
A primary challenge in wireless power transfer (WPT) systems is to achieve efficient and stable power transmission without complex control strategies when load conditions change dynamically. Addressing this issue, we propose a third-order pseudo-Hermitian WPT system whose output characteristics exhibit a stable frequency and constant power. The frequency selection mechanism and energy efficiency of the nonlinear WPT system based on pseudo-Hermitian under the coupling mode theory approximation are analyzed. Theoretical analysis indicates that under certain coupling coefficients and load conditions, the proposed system can achieve frequency adaptation in a stable frequency mode without the need to change the circuit frequency. When the load changes dynamically, the stability of the power output is maintained using a proportional integral (PI) control strategy that only collects the voltage and current at the transmitting end, eliminating the need for wireless communication circuits with feedback from the receiving side. Experimental results demonstrate that the proposed design scheme can achieve constant power transmission when load conditions change, maintaining stable and relatively high transmission efficiency. The proposed scheme exhibits benefits in practical applications since no communication is required.
- Published
- 2024
32. SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models
- Author
-
Shen, Shuaijie, Wang, Chao, Huang, Renzhuo, Zhong, Yan, Guo, Qinghai, Lu, Zhichao, Zhang, Jianguo, and Leng, Luziwei
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning ,Computer Science - Neural and Evolutionary Computing - Abstract
Known as low energy consumption networks, spiking neural networks (SNNs) have gained a lot of attention within the past decades. While SNNs are increasingly competitive with artificial neural networks (ANNs) for vision tasks, they are rarely used for long sequence tasks, despite their intrinsic temporal dynamics. In this work, we develop spiking state space models (SpikingSSMs) for long sequence learning by leveraging the sequence learning abilities of state space models (SSMs). Inspired by dendritic neuron structure, we hierarchically integrate neuronal dynamics with the original SSM block, while realizing sparse synaptic computation. Furthermore, to resolve the conflict between event-driven neuronal dynamics and parallel computing, we propose a light-weight surrogate dynamic network which accurately predicts the after-reset membrane potential and is compatible with learnable thresholds, enabling orders of acceleration in training speed compared with conventional iterative methods. On the long range arena benchmark task, SpikingSSM achieves performance competitive with state-of-the-art SSMs while realizing on average 90% network sparsity. On language modeling, our network significantly surpasses existing spiking large language models (spikingLLMs) on the WikiText-103 dataset with only a third of the model size, demonstrating its potential as a backbone architecture for low computation cost LLMs.
- Published
- 2024
33. Embedding periodic maps of surfaces into those of spheres with minimal dimensions
- Author
-
Wang, Chao, Wang, Shicheng, and Wang, Zhongzi
- Subjects
Mathematics - Geometric Topology ,Primary 57R40, Secondary 57M12, 57M60 - Abstract
It is known that any periodic map of order $n$ on a closed oriented surface of genus $g$ can be equivariantly embedded into $S^m$ for some $m$. In the orientable and smooth category, we determine the smallest possible $m$ when $n\geq 3g$. We show that for each integer $k>1$ there exist infinitely many periodic maps such that the smallest possible $m$ is equal to $k$., Comment: 20 pages, 8 Figures
- Published
- 2024
34. Loss-based Bayesian Sequential Prediction of Value at Risk with a Long-Memory and Non-linear Realized Volatility Model
- Author
-
Peiris, Rangika, Tran, Minh-Ngoc, Wang, Chao, and Gerlach, Richard
- Subjects
Quantitative Finance - Risk Management ,Statistics - Machine Learning - Abstract
A long memory and non-linear realized volatility model class is proposed for direct Value at Risk (VaR) forecasting. This model, referred to as RNN-HAR, extends the heterogeneous autoregressive (HAR) model, a framework known for efficiently capturing long memory in realized measures, by integrating a Recurrent Neural Network (RNN) to handle non-linear dynamics. Loss-based generalized Bayesian inference with Sequential Monte Carlo is employed for model estimation and sequential prediction in RNN-HAR. The empirical analysis is conducted using daily closing prices and realized measures from 2000 to 2022 across 31 market indices. The proposed model's one-step-ahead VaR forecasting performance is compared against a basic HAR model and its extensions. The results demonstrate that the proposed RNN-HAR model consistently outperforms all other models considered in the study.
- Published
- 2024
35. First Activations Matter: Training-Free Methods for Dynamic Activation in Large Language Models
- Author
-
Ma, Chi, Huang, Mincong, Zhang, Ying, Wang, Chao, Wang, Yujie, Yu, Lei, Liu, Chuan, and Lin, Wei
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Dynamic activation (DA) techniques, such as DejaVu and MoEfication, have demonstrated their potential to significantly enhance the inference efficiency of large language models (LLMs). However, these techniques often rely on ReLU activation functions or require additional parameters and training to maintain performance. This paper introduces a training-free Threshold-based Dynamic Activation (TDA) method that leverages sequence information to exploit the inherent sparsity of models across various architectures. This method is designed to accelerate generation speed by 18-25% without significantly compromising task performance, thereby addressing the limitations of existing DA techniques. Moreover, we delve into the root causes of LLM sparsity and theoretically analyze two of its critical features: history-related activation uncertainty and semantic-irrelevant activation inertia. Our comprehensive analyses not only provide a robust theoretical foundation for DA methods but also offer valuable insights to guide future research in optimizing LLMs for greater efficiency and effectiveness.
- Published
- 2024
36. Microsatellite-based real-time quantum key distribution
- Author
-
Li, Yang, Cai, Wen-Qi, Ren, Ji-Gang, Wang, Chao-Ze, Yang, Meng, Zhang, Liang, Wu, Hui-Ying, Chang, Liang, Wu, Jin-Cai, Jin, Biao, Xue, Hua-Jian, Li, Xue-Jiao, Liu, Hui, Yu, Guang-Wen, Tao, Xue-Ying, Chen, Ting, Liu, Chong-Fei, Luo, Wen-Bin, Zhou, Jie, Yong, Hai-Lin, Li, Yu-Huai, Li, Feng-Zhi, Jiang, Cong, Chen, Hao-Ze, Wu, Chao, Tong, Xin-Hai, Xie, Si-Jiang, Zhou, Fei, Liu, Wei-Yue, Liu, Nai-Le, Li, Li, Xu, Feihu, Cao, Yuan, Yin, Juan, Shu, Rong, Wang, Xiang-Bin, Zhang, Qiang, Wang, Jian-Yu, Liao, Sheng-Kai, Peng, Cheng-Zhi, and Pan, Jian-Wei
- Subjects
Quantum Physics - Abstract
A quantum network provides an infrastructure connecting quantum devices with revolutionary computing, sensing, and communication capabilities. As the best-known application of a quantum network, quantum key distribution (QKD) shares secure keys guaranteed by the laws of quantum mechanics. A quantum satellite constellation offers a solution to facilitate the quantum network on a global scale. The Micius satellite has verified the feasibility of satellite quantum communications; however, scaling up quantum satellite constellations is challenging, requiring small lightweight satellites, portable ground stations and real-time secure key exchange. Here we tackle these challenges and report the development of a quantum microsatellite capable of performing space-to-ground QKD using portable ground stations. The quantum microsatellite features a payload weighing approximately 23 kg, while the portable ground station weighs about 100 kg. These weights represent reductions by more than an order and two orders of magnitude, respectively, compared to the Micius satellite. Additionally, we multiplex bidirectional satellite-ground optical communication with quantum communication, enabling key distillation and secure communication in real-time. Using the microsatellite and the portable ground stations, we demonstrate satellite-based QKD with multiple ground stations and achieve the sharing of up to 0.59 million bits of secure keys during a single satellite pass. The compact quantum payload can be readily assembled on existing space stations or small satellites, paving the way for a satellite-constellation-based quantum and classical network for widespread real-life applications., Comment: 40 pages, 8 figures
- Published
- 2024
37. MambaEVT: Event Stream based Visual Object Tracking using State Space Model
- Author
-
Wang, Xiao, Wang, Chao, Wang, Shiao, Wang, Xixi, Zhao, Zhicheng, Zhu, Lin, and Jiang, Bo
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Event camera-based visual tracking has drawn more and more attention in recent years due to the unique imaging principle and advantages of low energy consumption, high dynamic range, and dense temporal resolution. Current event-based tracking algorithms are gradually hitting their performance bottlenecks, due to the utilization of vision Transformer and the static template for target object localization. In this paper, we propose a novel Mamba-based visual tracking framework that adopts the state space model with linear complexity as a backbone network. The search regions and target template are fed into the vision Mamba network for simultaneous feature extraction and interaction. The output tokens of search regions will be fed into the tracking head for target localization. More importantly, we consider introducing a dynamic template update strategy into the tracking framework using the Memory Mamba network. By considering the diversity of samples in the target template library and making appropriate adjustments to the template memory module, a more effective dynamic template can be integrated. The effective combination of dynamic and static templates allows our Mamba-based tracking algorithm to achieve a good balance between accuracy and computational cost on multiple large-scale datasets, including EventVOT, VisEvent, and FE240hz. The source code will be released on https://github.com/Event-AHU/MambaEVT, Comment: In Peer Review
- Published
- 2024
38. Beyond App Markets: Demystifying Underground Mobile App Distribution Via Telegram
- Author
-
Guo, Yanhui, Wang, Dong, Wang, Liu, Fang, Yongsheng, Wang, Chao, Yang, Minghui, Liu, Tianming, and Wang, Haoyu
- Subjects
Computer Science - Cryptography and Security - Abstract
Within the thriving mobile app ecosystem, a subset of apps provides illicit services such as gambling and pornography to pursue economic gains, collectively referred to as "underground economy apps". While previous studies have examined these apps' characteristics and identification methods, investigations into their distribution via platforms beyond app markets (like Telegram) remain scarce, even though Telegram has emerged as a crucial channel for underground activities and cybercrime due to its robust encryption and user anonymity. This study provides the first comprehensive exploration of the underground mobile app ecosystem on Telegram. Overcoming the complexities of the Telegram environment, we build a novel dataset and analyze the prevalence, promotional strategies, and characteristics of these apps. Our findings reveal the significant prevalence of these apps on Telegram, with the total sum of subscription user numbers across channels promoting these apps equivalent to 1% of Telegram's user base. We find these apps primarily cater to gambling and pornography services. We uncover sophisticated promotional strategies involving complex networks of apps, websites, users, and channels, and identify significant gaps in Telegram's content moderation capabilities. Our analysis also exposes the misuse of iOS features for app distribution and the prevalence of malicious behaviors in these apps. This research not only enhances our understanding of the underground app ecosystem but also provides valuable insights for developing effective regulatory measures and protecting users from potential risks associated with these covert operations. Our findings provide implications for platform regulators, app market operators, law enforcement agencies, and cybersecurity professionals in combating the proliferation of underground apps on encrypted messaging platforms., Comment: To appear in SIGMETRICS 2025
- Published
- 2024
39. Beyond Boundaries: efficient Projected Entangled Pair States methods for periodic quantum systems
- Author
-
Dong, Shaojun, Wang, Chao, Zhang, Hao, Zhang, Meng, and He, Lixin
- Subjects
Condensed Matter - Strongly Correlated Electrons ,Quantum Physics - Abstract
Projected Entangled Pair States (PEPS) are recognized as a potent tool for exploring two-dimensional quantum many-body systems. However, a significant challenge emerges when applying conventional PEPS methodologies to systems with periodic boundary conditions (PBC), attributed to the prohibitive computational scaling with the bond dimension. This has notably restricted the study of systems with complex boundary conditions. To address this challenge, we have developed a strategy that involves the superposition of PEPS with open boundary conditions (OBC) to treat systems with PBC. This approach significantly reduces the computational complexity of such systems while maintaining their translational invariance and the PBC. We benchmark this method against the Heisenberg model and the $J_1$-$J_2$ model, demonstrating its capability to yield highly accurate results at low computational costs, even for large system sizes. The techniques are adaptable to other boundary conditions, including cylindrical and twisted boundary conditions, and therefore significantly expand the application scope of the PEPS approach, shining new light on numerous applications.
- Published
- 2024
40. A lower bound of the crossing number of composite knots
- Author
-
Qiu, Ruifeng and Wang, Chao
- Subjects
Mathematics - Geometric Topology ,Primary 57M25, Secondary 57N10 - Abstract
Let $c(K)$ denote the crossing number of a knot $K$ and let $K_1\# K_2$ denote the connected sum of two oriented knots $K_1$ and $K_2$. It is a long-standing open question whether $c(K_1\# K_2)=c(K_1)+c(K_2)$. In this paper we show that $c(K_1\# K_2)> (c(K_1)+c(K_2))/16$., Comment: 46 pages, 45 figures
- Published
- 2024
41. A Tutorial on Fluid Antenna System for 6G Networks: Encompassing Communication Theory, Optimization Methods and Hardware Designs
- Author
-
New, Wee Kiat, Wong, Kai-Kit, Xu, Hao, Wang, Chao, Ghadi, Farshad Rostami, Zhang, Jichen, Rao, Junhui, Murch, Ross, Ramírez-Espinosa, Pablo, Morales-Jimenez, David, Chae, Chan-Byoung, and Tong, Kin-Fai
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
The advent of the sixth-generation (6G) networks presents another round of revolution for the mobile communication landscape, promising an immersive experience, robust reliability, minimal latency, extreme connectivity, ubiquitous coverage, and capabilities beyond communication, including intelligence and sensing. To achieve these ambitious goals, it is apparent that 6G networks need to incorporate state-of-the-art technologies. One of the technologies that has garnered rising interest is the fluid antenna system (FAS), which represents any software-controllable fluidic, conductive, or dielectric structure capable of dynamically changing its shape and position to reconfigure essential radio-frequency (RF) characteristics. Compared to traditional antenna systems (TASs) with fixed-position radiating elements, the core idea of FAS revolves around the unique flexibility of reconfiguring the radiating elements within a given space. One recent driver of FAS is the recognition of its position-flexibility as a new degree of freedom (dof) to harness diversity and multiplexing gains. In this paper, we provide a comprehensive tutorial, covering channel modeling, signal processing and estimation methods, information-theoretic insights, new multiple access techniques, and hardware designs. Moreover, we delineate the challenges of FAS and explore the potential of using FAS to improve the performance of other contemporary technologies. By providing insights and guidance, this tutorial paper serves to inspire researchers to explore new horizons and fully unleash the potential of FAS., Comment: 53 pages, 45 figures, 5 tables. Accepted by IEEE Communications Surveys and Tutorials
- Published
- 2024
42. 52B to 1T: Lessons Learned via Tele-FLM Series
- Author
-
Li, Xiang, Yao, Yiqun, Jiang, Xin, Fang, Xuezhi, Wang, Chao, Liu, Xinzhang, Wang, Zihan, Zhao, Yu, Wang, Xin, Huang, Yuyao, Song, Shuangyong, Li, Yongxiang, Zhang, Zheng, Zhao, Bo, Sun, Aixin, Wang, Yequan, He, Zhongjiang, Wang, Zhongyuan, Li, Xuelong, and Huang, Tiejun
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence. As scaling laws underscore the potential of increasing model sizes, the academic community has intensified its investigations into LLMs with capacities exceeding 50 billion parameters. This technical report builds on our prior work with Tele-FLM (also known as FLM-2), a publicly available 52-billion-parameter model. We delve into two primary areas: we first discuss our observation of Supervised Fine-tuning (SFT) on Tele-FLM-52B, which supports the "less is more" approach for SFT data construction; second, we demonstrate our experiments and analyses on the best practices for progressively growing a model from 52 billion to 102 billion, and subsequently to 1 trillion parameters. We will open-source a 1T model checkpoint, namely Tele-FLM-1T, to advance further training and research., Comment: For the Tele-FLM-52B tech report, see also 2404.16645
- Published
- 2024
43. An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges
- Author
-
Tao, Laifa, Li, Shangyu, Liu, Haifei, Huang, Qixuan, Ma, Liang, Ning, Guoao, Chen, Yiling, Wu, Yunlong, Li, Bin, Zhang, Weiwei, Zhao, Zhengduo, Zhan, Wenchao, Cao, Wenyan, Wang, Chao, Liu, Hongmei, Ma, Jian, Suo, Mingliang, Cheng, Yujie, Ding, Yu, Song, Dengwei, and Lu, Chen
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Software Engineering ,Electrical Engineering and Systems Science - Signal Processing ,Electrical Engineering and Systems Science - Systems and Control - Abstract
Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Large Model, heralds a technological revolution with the potential to fundamentally reshape traditional technological fields and human production methods. Its capabilities, including strong generalization, reasoning, and generative attributes, present opportunities to address PHM's bottlenecks. To this end, based on a systematic analysis of the current challenges and bottlenecks in PHM, as well as the research status and advantages of Large Model, we propose a novel concept and three progressive paradigms of Prognosis and Health Management Large Model (PHM-LM) through the integration of the Large Model with PHM. Subsequently, we provide feasible technical approaches for PHM-LM to bolster PHM's core capabilities within the framework of the three paradigms. Moreover, to address core issues confronting PHM, we discuss a series of technical challenges of PHM-LM throughout the entire process of construction and application. This comprehensive effort offers a holistic PHM-LM technical framework, and provides avenues for new PHM technologies, methodologies, tools, platforms and applications, which also potentially innovates design, research & development, verification and application mode of PHM. And furthermore, a new generation of PHM with AI will also capably be realized, i.e., from custom to generalized, from discriminative to generative, and from theoretical conditions to practical applications.
- Published
- 2024
44. An idiogram on pachytene bivalents with high resolution multiple bands of Epinephelus malabaricus
- Author
-
Zhou, Aiguo, Xie, Shaolin, Wang, Zhenlu, Chen, Yanfeng, Feng, Yongyong, Wang, Chao, Ye, Qiao, Fan, Lanfen, Wang, Meifang, and Zou, Jixing
- Published
- 2019
45. Activation of the helper NRC4 immune receptor forms a hexameric resistosome
- Author
-
Liu, Furong, Yang, Zhenlin, Wang, Chao, You, Zhang, Martin, Raoul, Qiao, Wenjie, Huang, Jian, Jacob, Pierre, Dangl, Jeffery L, Carette, Jan E, Luan, Sheng, Nogales, Eva, and Staskawicz, Brian J
- Subjects
Biochemistry and Cell Biology ,Biomedical and Clinical Sciences ,Biological Sciences ,Immunology ,ETI ,NLR proteins ,NRC0 ,NRC2 ,NRC3 ,NRC4 resistosome ,PTI ,calcium influx ,pathogen recognition ,plant immunity ,Medical and Health Sciences ,Developmental Biology ,Biological sciences ,Biomedical and clinical sciences - Abstract
Innate immune responses to microbial pathogens are regulated by intracellular receptors known as nucleotide-binding leucine-rich repeat receptors (NLRs) in both the plant and animal kingdoms. Across plant innate immune systems, "helper" NLRs (hNLRs) work in coordination with "sensor" NLRs (sNLRs) to modulate disease resistance signaling pathways. Activation mechanisms of hNLRs based on structures are unknown. Our research reveals that the hNLR, known as NLR required for cell death 4 (NRC4), assembles into a hexameric resistosome upon activation by the sNLR Bs2 and the pathogenic effector AvrBs2. This conformational change triggers immune responses by facilitating the influx of calcium ions (Ca2+) into the cytosol. The activation mimic alleles of NRC2, NRC3, or NRC4 alone did not induce Ca2+ influx and cell death in animal cells, suggesting that unknown plant-specific factors regulate NRCs' activation in plants. These findings significantly advance our understanding of the regulatory mechanisms governing plant immune responses.
- Published
- 2024
46. GraphMU: Repairing Robustness of Graph Neural Networks via Machine Unlearning
- Author
-
Wu, Tao, Cao, Xinwen, Wang, Chao, Qiao, Shaojie, Xian, Xingping, Yuan, Lin, Cui, Canyixing, and Liu, Yanbing
- Subjects
Computer Science - Social and Information Networks ,Computer Science - Machine Learning - Abstract
Graph Neural Networks (GNNs) have demonstrated significant application potential in various fields. However, GNNs are still vulnerable to adversarial attacks. Numerous adversarial defense methods on GNNs are proposed to address the problem of adversarial attacks. However, these methods can only serve as a defense before poisoning, but cannot repair a poisoned GNN. Therefore, there is an urgent need for a method to repair poisoned GNNs. In this paper, we address this gap by introducing the novel concept of model repair for GNNs. We propose a repair framework, Repairing Robustness of Graph Neural Networks via Machine Unlearning (GraphMU), which aims to fine-tune a poisoned GNN to forget adversarial samples without the need for complete retraining. We also introduce an unlearning validation method to ensure that our approach effectively forgets specified poisoned data. To evaluate the effectiveness of GraphMU, we explore three fine-tuned subgraph construction scenarios based on the available perturbation information: (i) Known Perturbation Ratios, (ii) Known Complete Knowledge of Perturbations, and (iii) Unknown any Knowledge of Perturbations. Our extensive experiments, conducted across four citation datasets and four adversarial attack scenarios, demonstrate that GraphMU can effectively restore the performance of a poisoned GNN.
- Published
- 2024
47. Explainable AI Security: Exploring Robustness of Graph Neural Networks to Adversarial Attacks
- Author
-
Wu, Tao, Cui, Canyixing, Xian, Xingping, Qiao, Shaojie, Wang, Chao, Yuan, Lin, and Yu, Shui
- Subjects
Computer Science - Machine Learning ,Computer Science - Social and Information Networks - Abstract
Graph neural networks (GNNs) have achieved tremendous success, but recent studies have shown that GNNs are vulnerable to adversarial attacks, which significantly hinders their use in safety-critical scenarios. Therefore, the design of robust GNNs has attracted increasing attention. However, existing research has mainly been conducted via experimental trial and error, and thus far, there remains a lack of a comprehensive understanding of the vulnerability of GNNs. To address this limitation, we systematically investigate the adversarial robustness of GNNs by considering graph data patterns, model-specific factors, and the transferability of adversarial examples. Through extensive experiments, a set of principled guidelines is obtained for improving the adversarial robustness of GNNs, for example: (i) rather than highly regular graphs, the training graph data with diverse structural patterns is crucial for model robustness, which is consistent with the concept of adversarial training; (ii) the large model capacity of GNNs with sufficient training data has a positive effect on model robustness, and only a small percentage of neurons in GNNs are affected by adversarial attacks; (iii) adversarial transfer is not symmetric and the adversarial examples produced by the small-capacity model have stronger adversarial transferability. This work illuminates the vulnerabilities of GNNs and opens many promising avenues for designing robust GNNs.
- Published
- 2024
48. MOYU: A Theoretical Study on Massive Over-activation Yielded Uplifts in LLMs
- Author
-
Ma, Chi, Huang, Mincong, Wang, Chao, Wang, Yujie, and Yu, Lei
- Subjects
Computer Science - Machine Learning - Abstract
Massive Over-activation Yielded Uplifts (MOYU) is an inherent property of large language models, and dynamic activation (DA) based on the MOYU property is a clever yet under-explored strategy designed to accelerate inference in these models. Existing methods that utilize MOYU often face a significant 'Impossible Trinity': struggling to simultaneously maintain model performance, enhance inference speed, and extend applicability across various architectures. Due to the theoretical ambiguities surrounding MOYU, this paper elucidates the root cause of the MOYU property and outlines the mechanisms behind two primary limitations encountered by current DA methods: 1) history-related activation uncertainty, and 2) semantic-irrelevant activation inertia. Our analysis not only underscores the limitations of current dynamic activation strategies within large-scale LLaMA models but also proposes opportunities for refining the design of future sparsity schemes.
- Published
- 2024
49. Evolutionary Spiking Neural Networks: A Survey
- Author
-
Shen, Shuaijie, Zhang, Rui, Wang, Chao, Huang, Renzhuo, Tuerhong, Aiersi, Guo, Qinghai, Lu, Zhichao, Zhang, Jianguo, and Leng, Luziwei
- Subjects
Computer Science - Neural and Evolutionary Computing - Abstract
Spiking neural networks (SNNs) are gaining increasing attention as potential computationally efficient alternatives to traditional artificial neural networks (ANNs). However, the unique information propagation mechanisms and the complexity of SNN neuron models pose challenges for adapting traditional methods developed for ANNs to SNNs. These challenges include both weight learning and architecture design. While surrogate gradient learning has shown some success in addressing the former challenge, the latter remains relatively unexplored. Recently, a novel paradigm utilizing evolutionary computation methods has emerged to tackle these challenges. This approach has resulted in the development of a variety of energy-efficient and high-performance SNNs across a wide range of machine learning benchmarks. In this paper, we present a survey of these works and initiate discussions on potential challenges ahead.
- Published
- 2024
50. Job-SDF: A Multi-Granularity Dataset for Job Skill Demand Forecasting and Benchmarking
- Author
-
Chen, Xi, Qin, Chuan, Fang, Chuyu, Wang, Chao, Zhu, Chen, Zhuang, Fuzhen, Zhu, Hengshu, and Xiong, Hui
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
In a rapidly evolving job market, skill demand forecasting is crucial as it enables policymakers and businesses to anticipate and adapt to changes, ensuring that workforce skills align with market needs, thereby enhancing productivity and competitiveness. Additionally, by identifying emerging skill requirements, it directs individuals towards relevant training and education opportunities, promoting continuous self-learning and development. However, the absence of comprehensive datasets presents a significant challenge, impeding research and the advancement of this field. To bridge this gap, we present Job-SDF, a dataset designed to train and benchmark job-skill demand forecasting models. Based on 10.35 million public job advertisements collected from major online recruitment platforms in China between 2021 and 2023, this dataset encompasses monthly recruitment demand for 2,324 types of skills across 521 companies. Our dataset uniquely enables evaluating skill demand forecasting models at various granularities, including occupation, company, and regional levels. We benchmark a range of models on this dataset, evaluating their performance in standard scenarios, in predictions focused on lower value ranges, and in the presence of structural breaks, providing new insights for further research. Our code and dataset are publicly accessible at https://github.com/Job-SDF/benchmark., Comment: NeurIPS 2024 Accepted
- Published
- 2024