Author: "Song, Linqi" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Song, Linqi"' showing total 347 results

Start Over Author "Song, Linqi"

347 results on '"Song, Linqi"'

1. VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking

Author: Qian, Zekun, Han, Ruize, Hou, Junhui, Song, Linqi, and Feng, Wei
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Open-vocabulary multi-object tracking (OVMOT) represents a critical new challenge involving the detection and tracking of diverse object categories in videos, encompassing both seen categories (base classes) and unseen categories (novel classes). This issue amalgamates the complexities of open-vocabulary object detection (OVD) and multi-object tracking (MOT). Existing approaches to OVMOT often merge OVD and MOT methodologies as separate modules, predominantly focusing on the problem through an image-centric lens. In this paper, we propose VOVTrack, a novel method that integrates object states relevant to MOT and video-centric training to address this challenge from a video object tracking standpoint. First, we consider the tracking-related state of the objects during tracking and propose a new prompt-guided attention mechanism for more accurate localization and classification (detection) of the time-varying objects. Subsequently, we leverage raw video data without annotations for training by formulating a self-supervised object similarity learning technique to facilitate temporal object association (tracking). Experimental results underscore that VOVTrack outperforms existing methods, establishing itself as a state-of-the-art solution for open-vocabulary tracking task.
Published: 2024

2. Mitigating the Language Mismatch and Repetition Issues in LLM-based Machine Translation via Model Editing

Author: Wang, Weichuan, Li, Zhaoyi, Lian, Defu, Ma, Chen, Song, Linqi, and Wei, Ying
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Large Language Models (LLMs) have recently revolutionized the NLP field, while they still fall short in some specific down-stream tasks. In the work, we focus on utilizing LLMs to perform machine translation, where we observe that two patterns of errors frequently occur and drastically affect the translation quality: language mismatch and repetition. The work sets out to explore the potential for mitigating these two issues by leveraging model editing methods, e.g., by locating Feed-Forward Network (FFN) neurons or something that are responsible for the errors and deactivating them in the inference time. We find that directly applying such methods either limited effect on the targeted errors or has significant negative side-effect on the general translation quality, indicating that the located components may also be crucial for ensuring machine translation with LLMs on the rails. To this end, we propose to refine the located components by fetching the intersection of the locating results under different language settings, filtering out the aforementioned information that is irrelevant to targeted errors. The experiment results empirically demonstrate that our methods can effectively reduce the language mismatch and repetition ratios and meanwhile enhance or keep the general translation quality in most cases., Comment: 20 pages, EMNLP'2024 Main Conference
Published: 2024

3. Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling

Author: Yao, Yuxuan, Wu, Han, Liu, Mingyang, Luo, Sichun, Han, Xiongwei, Liu, Jie, Guo, Zhijiang, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large language models (LLMs) exhibit varying strengths and weaknesses across different tasks, prompting recent studies to explore the benefits of ensembling models to leverage their complementary advantages. However, existing LLM ensembling methods often overlook model compatibility and struggle with inefficient alignment of probabilities across the entire vocabulary. In this study, we empirically investigate the factors influencing ensemble performance, identifying model performance, vocabulary size, and response style as key determinants, revealing that compatibility among models is essential for effective ensembling. This analysis leads to the development of a simple yet effective model selection strategy that identifies compatible models. Additionally, we introduce the \textsc{Uni}on \textsc{T}op-$k$ \textsc{E}nsembling (\textsc{UniTE}), a novel approach that efficiently combines models by focusing on the union of the top-k tokens from each model, thereby avoiding the need for full vocabulary alignment and reducing computational overhead. Extensive evaluations across multiple benchmarks demonstrate that \textsc{UniTE} significantly enhances performance compared to existing methods, offering a more efficient framework for LLM ensembling.
Published: 2024

4. Surrogate-Assisted Search with Competitive Knowledge Transfer for Expensive Optimization

Author: Xue, Xiaoming, Hu, Yao, Feng, Liang, Zhang, Kai, Song, Linqi, and Tan, Kay Chen
Subjects: Computer Science - Neural and Evolutionary Computing
Abstract: Expensive optimization problems (EOPs) have attracted increasing research attention over the decades due to their ubiquity in a variety of practical applications. Despite many sophisticated surrogate-assisted evolutionary algorithms (SAEAs) that have been developed for solving such problems, most of them lack the ability to transfer knowledge from previously-solved tasks and always start their search from scratch, making them troubled by the notorious cold-start issue. A few preliminary studies that integrate transfer learning into SAEAs still face some issues, such as defective similarity quantification that is prone to underestimate promising knowledge, surrogate-dependency that makes the transfer methods not coherent with the state-of-the-art in SAEAs, etc. In light of the above, a plug and play competitive knowledge transfer method is proposed to boost various SAEAs in this paper. Specifically, both the optimized solutions from the source tasks and the promising solutions acquired by the target surrogate are treated as task-solving knowledge, enabling them to compete with each other to elect the winner for expensive evaluation, thus boosting the search speed on the target task. Moreover, the lower bound of the convergence gain brought by the knowledge competition is mathematically analyzed, which is expected to strengthen the theoretical foundation of sequential transfer optimization. Experimental studies conducted on a series of benchmark problems and a practical application from the petroleum industry verify the efficacy of the proposed method. The source code of the competitive knowledge transfer is available at https://github.com/XmingHsueh/SAS-CKT., Comment: 22 pages, 14 figures
Published: 2024

5. Error Correction Decoding Algorithms of RS Codes Based on An Earlier Termination Algorithm to Find The Error Locator Polynomial

Author: Jiang, Zhengyi, Shi, Hao, Huang, Zhongyi, Song, Linqi, Bai, Bo, Zhang, Gong, and Hou, Hanxu
Subjects: Computer Science - Information Theory
Abstract: Reed-Solomon (RS) codes are widely used to correct errors in storage systems. Finding the error locator polynomial is one of the key steps in the error correction procedure of RS codes. Modular Approach (MA) is an effective algorithm for solving the Welch-Berlekamp (WB) key-equation problem to find the error locator polynomial that needs $2t$ steps, where $t$ is the error correction capability. In this paper, we first present a new MA algorithm that only requires $2e$ steps and then propose two fast decoding algorithms for RS codes based on our MA algorithm, where $e$ is the number of errors and $e\leq t$. We propose Improved-Frequency Domain Modular Approach (I-FDMA) algorithm that needs $2e$ steps to solve the error locator polynomial and present our first decoding algorithm based on the I-FDMA algorithm. We show that, compared with the existing methods based on MA algorithms, our I-FDMA algorithm can effectively reduce the decoding complexity of RS codes when $e
Published: 2024

6. OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking

Author: Qian, Zekun, Han, Ruize, Feng, Wei, Hou, Junhui, Song, Linqi, and Wang, Song
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: We study a novel yet practical problem of open-corpus multi-object tracking (OCMOT), which extends the MOT into localizing, associating, and recognizing generic-category objects of both seen (base) and unseen (novel) classes, but without the category text list as prompt. To study this problem, the top priority is to build a benchmark. In this work, we build OCTrackB, a large-scale and comprehensive benchmark, to provide a standard evaluation platform for the OCMOT problem. Compared to previous datasets, OCTrackB has more abundant and balanced base/novel classes and the corresponding samples for evaluation with less bias. We also propose a new multi-granularity recognition metric to better evaluate the generative object recognition in OCMOT. By conducting the extensive benchmark evaluation, we report and analyze the results of various state-of-the-art methods, which demonstrate the rationale of OCMOT, as well as the usefulness and advantages of OCTrackB.
Published: 2024

7. OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling

Author: Yang, Zhicheng, Wang, Yiwei, Huang, Yinya, Guo, Zhijiang, Shi, Wei, Han, Xiongwei, Feng, Liang, Song, Linqi, Liang, Xiaodan, and Tang, Jing
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: Large language models (LLMs) have exhibited their problem-solving abilities in mathematical reasoning. Solving realistic optimization (OPT) problems in application scenarios requires advanced and applied mathematics ability. However, current OPT benchmarks that merely solve linear programming are far from complex realistic situations. In this work, we propose OptiBench, a benchmark for End-to-end optimization problem-solving with human-readable inputs and outputs. OptiBench contains rich optimization problems, including linear and nonlinear programming with or without tabular data, which can comprehensively evaluate LLMs' solving ability. In our benchmark, LLMs are required to call a code solver to provide precise numerical answers. Furthermore, to alleviate the data scarcity for optimization problems, and to bridge the gap between open-source LLMs on a small scale (e.g., Llama-3-8b) and closed-source LLMs (e.g., GPT-4), we further propose a data synthesis method namely ReSocratic. Unlike general data synthesis methods that proceed from questions to answers, \ReSocratic first incrementally synthesizes formatted optimization demonstration with mathematical formulations step by step and then back-translates the generated demonstrations into questions. Based on this, we synthesize the ReSocratic-29k dataset. We further conduct supervised fine-tuning with ReSocratic-29k on multiple open-source models. Experimental results show that ReSocratic-29k significantly improves the performance of open-source models.
Published: 2024

8. FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving

Author: Lin, Xiaohan, Cao, Qingxing, Huang, Yinya, Wang, Haiming, Lu, Jianqiao, Liu, Zhengying, Song, Linqi, and Liang, Xiaodan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Formal verification (FV) has witnessed growing significance with current emerging program synthesis by the evolving large language models (LLMs). However, current formal verification mainly resorts to symbolic verifiers or hand-craft rules, resulting in limitations for extensive and flexible verification. On the other hand, formal languages for automated theorem proving, such as Isabelle, as another line of rigorous verification, are maintained with comprehensive rules and theorems. In this paper, we propose FVEL, an interactive Formal Verification Environment with LLMs. Specifically, FVEL transforms a given code to be verified into Isabelle, and then conducts verification via neural automated theorem proving with an LLM. The joined paradigm leverages the rigorous yet abundant formulated and organized rules in Isabelle and is also convenient for introducing and adjusting cutting-edge LLMs. To achieve this goal, we extract a large-scale FVELER3. The FVELER dataset includes code dependencies and verification processes that are formulated in Isabelle, containing 758 theories, 29,125 lemmas, and 200,646 proof steps in total with in-depth dependencies. We benchmark FVELER in the FVEL environment by first fine-tuning LLMs with FVELER and then evaluating them on Code2Inv and SV-COMP. The results show that FVEL with FVELER fine-tuned Llama3- 8B solves 17.39% (69 -> 81) more problems, and Mistral-7B 12% (75 -> 84) more problems in SV-COMP. And the proportion of proof errors is reduced. Project page: https://fveler.github.io/.
Published: 2024

9. Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector

Author: Jiang, Gangwei, Jiang, Caigao, Li, Zhaoyi, Xue, Siqiao, Zhou, Jun, Song, Linqi, Lian, Defu, and Wei, Ying
Subjects: Computer Science - Artificial Intelligence
Abstract: Fine-tuning large language models (LLMs) can cause them to lose their general capabilities. However, the intrinsic mechanisms behind such forgetting remain unexplored. In this paper, we begin by examining this phenomenon by focusing on knowledge understanding and instruction following, with the latter identified as the main contributor to forgetting during fine-tuning. Consequently, we propose the Instruction Vector (IV) framework to capture model representations highly related to specific instruction-following capabilities, thereby making it possible to understand model-intrinsic forgetting. Through the analysis of IV dynamics pre and post-training, we suggest that fine-tuning mostly adds specialized reasoning patterns instead of erasing previous skills, which may appear as forgetting. Building on this insight, we develop IV-guided training, which aims to preserve original computation graph, thereby mitigating catastrophic forgetting. Empirical tests on three benchmarks confirm the efficacy of this new approach, supporting the relationship between IVs and forgetting. Our code will be made available soon.
Published: 2024

10. Bi-Chainer: Automated Large Language Models Reasoning with Bidirectional Chaining

Author: Liu, Shuqi, He, Bowei, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) have shown human-like reasoning abilities but still face challenges in solving complex logical problems. Existing unidirectional chaining methods, such as forward chaining and backward chaining, suffer from issues like low prediction accuracy and efficiency. To address these, we propose a bidirectional chaining method, Bi-Chainer, which dynamically switches to depth-first reasoning in the opposite reasoning direction when it encounters multiple branching options within the current direction. Thus, the intermediate reasoning results can be utilized as guidance to facilitate the reasoning process. We show that Bi-Chainer achieves sizable accuracy boots over unidirectional chaining frameworks on four challenging logical reasoning datasets. Moreover, Bi-Chainer enhances the accuracy of intermediate proof steps and reduces the average number of inference calls, resulting in more efficient and accurate reasoning., Comment: Accepted by ACL 2024
Published: 2024

11. Privacy in LLM-based Recommendation: Recent Advances and Future Directions

Author: Luo, Sichun, Shao, Wei, Yao, Yuxuan, Xu, Jian, Liu, Mingyang, Li, Qintong, He, Bowei, Wang, Maolin, Deng, Guanzhi, Hou, Hanxu, Zhang, Xinyi, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: Nowadays, large language models (LLMs) have been integrated with conventional recommendation models to improve recommendation performance. However, while most of the existing works have focused on improving the model performance, the privacy issue has only received comparatively less attention. In this paper, we review recent advancements in privacy within LLM-based recommendation, categorizing them into privacy attacks and protection mechanisms. Additionally, we highlight several challenges and propose future directions for the community to address these critical problems.
Published: 2024

12. DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs

Author: Lin, Haokun, Xu, Haobo, Wu, Yichen, Cui, Jingzhi, Zhang, Yingtao, Mou, Linzhan, Song, Linqi, Sun, Zhenan, and Wei, Ying
Subjects: Computer Science - Computation and Language
Abstract: Quantization of large language models (LLMs) faces significant challenges, particularly due to the presence of outlier activations that impede efficient low-bit representation. Traditional approaches predominantly address $\textit{Normal Outliers}$, which are activations across all tokens with relatively large magnitudes. However, these methods struggle with smoothing $\textit{Massive Outliers}$ that display significantly larger values, which leads to significant performance degradation in low-bit quantization. In this paper, we introduce DuQuant, a novel approach that utilizes rotation and permutation transformations to more effectively mitigate both massive and normal outliers. First, DuQuant starts by constructing rotation matrices, using specific outlier dimensions as prior knowledge, to redistribute outliers to adjacent channels by block-wise rotation. Second, We further employ a zigzag permutation to balance the distribution of outliers across blocks, thereby reducing block-wise variance. A subsequent rotation further smooths the activation landscape, enhancing model performance. DuQuant simplifies the quantization process and excels in managing outliers, outperforming the state-of-the-art baselines across various sizes and types of LLMs on multiple tasks, even with 4-bit weight-activation quantization. Our code is available at https://github.com/Hsu1023/DuQuant., Comment: 26 pages, 13 figures, Website at https://duquant.github.io
Published: 2024

13. Reed-Solomon Codes over Cyclic Polynomial Ring with Lower Encoding/Decoding Complexity

Author: Liu, Wenhao, Jiang, Zhengyi, Huang, Zhongyi, Song, Linqi, and Hou, Hanxu
Subjects: Computer Science - Information Theory
Abstract: Reed-Solomon (RS) codes are constructed over a finite field that have been widely employed in storage and communication systems. Many fast encoding/decoding algorithms such as fast Fourier transform (FFT) and modular approach are designed for RS codes to reduce the encoding/decoding complexity defined as the number of XORs involved in the encoding/decoding procedure. In this paper, we present the construction of RS codes over the cyclic polynomial ring $ \mathbb{F}_2[x]/(1+x+\ldots+x^{p-1})$ and show that our codes are maximum distance separable (MDS) codes. Moreover, we propose the FFT and modular approach over the ring that can be employed in our codes for encoding/decoding complexity reduction. We show that our codes have 17.9\% encoding complexity reduction and 7.5\% decoding complexity reduction compared with RS codes over finite field, for $(n,k)=(2048,1984)$.
Published: 2024

14. Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation

Author: Zhong, Tianqi, Li, Zhaoyi, Wang, Quan, Song, Linqi, Wei, Ying, Lian, Defu, and Mao, Zhendong
Subjects: Computer Science - Computation and Language
Abstract: Compositional generalization, representing the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text generation (MCTG) methods. Nonetheless, a comprehensive compositional generalization evaluation benchmark of MCTG is still lacking. We propose CompMCTG, a benchmark encompassing diverse multi-aspect labeled datasets and a crafted three-dimensional evaluation protocol, to holistically evaluate the compositional generalization of MCTG approaches. We observe that existing MCTG works generally confront a noticeable performance drop in compositional testing. To mitigate this issue, we introduce Meta-MCTG, a training framework incorporating meta-learning, where we enable models to learn how to generalize by simulating compositional generalization scenarios in the training phase. We demonstrate the effectiveness of Meta-MCTG through achieving obvious improvement (by at most 3.64%) for compositional testing performance in 94.4% cases., Comment: Accepted to ACL 2024 (Main); 32 pages
Published: 2024

15. Learning From Correctness Without Prompting Makes LLM Efficient Reasoner

Author: Yao, Yuxuan, Wu, Han, Guo, Zhijiang, Zhou, Biyan, Gao, Jiahui, Luo, Sichun, Hou, Hanxu, Fu, Xiaojin, and Song, Linqi
Subjects: Computer Science - Computation and Language
Abstract: Large language models (LLMs) have demonstrated outstanding performance across various tasks, yet they still exhibit limitations such as hallucination, unfaithful reasoning, and toxic content. One potential approach to mitigate these issues is learning from human or external feedback (e.g. tools). In this paper, we introduce an intrinsic self-correct reasoning framework for LLMs that eliminates the need for human feedback, external tools, and handcraft prompts. The proposed framework, based on a multi-step reasoning paradigm \textbf{Le}arning from \textbf{Co}rrectness (\textsc{LeCo}), improves reasoning performance without needing to learn from errors. This paradigm prioritizes learning from correct reasoning steps, and a unique method to measure confidence for each reasoning step based on generation logits. Experimental results across various multi-step reasoning tasks demonstrate the effectiveness of the framework in improving reasoning performance with reduced token consumption., Comment: Accepted to COLM 2024
Published: 2024

16. MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric

Author: Lin, Haokun, Bai, Haoli, Liu, Zhili, Hou, Lu, Sun, Muyi, Song, Linqi, Wei, Ying, and Sun, Zhenan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Multimedia
Abstract: Vision-language pre-trained models have achieved impressive performance on various downstream tasks. However, their large model sizes hinder their utilization on platforms with limited computational resources. We find that directly using smaller pre-trained models and applying magnitude-based pruning on CLIP models leads to inflexibility and inferior performance. Recent efforts for VLP compression either adopt uni-modal compression metrics resulting in limited performance or involve costly mask-search processes with learnable masks. In this paper, we first propose the Module-wise Pruning Error (MoPE) metric, accurately assessing CLIP module importance by performance decline on cross-modal tasks. Using the MoPE metric, we introduce a unified pruning framework applicable to both pre-training and task-specific fine-tuning compression stages. For pre-training, MoPE-CLIP effectively leverages knowledge from the teacher model, significantly reducing pre-training costs while maintaining strong zero-shot capabilities. For fine-tuning, consecutive pruning from width to depth yields highly competitive task-specific models. Extensive experiments in two stages demonstrate the effectiveness of the MoPE metric, and MoPE-CLIP outperforms previous state-of-the-art VLP compression methods., Comment: 18 pages, 8 figures, Published in CVPR2024
Published: 2024

17. Can LLM Substitute Human Labeling? A Case Study of Fine-grained Chinese Address Entity Recognition Dataset for UAV Delivery

Author: Yao, Yuxuan, Luo, Sichun, Zhao, Haohan, Deng, Guanzhi, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval
Abstract: We present CNER-UAV, a fine-grained \textbf{C}hinese \textbf{N}ame \textbf{E}ntity \textbf{R}ecognition dataset specifically designed for the task of address resolution in \textbf{U}nmanned \textbf{A}erial \textbf{V}ehicle delivery systems. The dataset encompasses a diverse range of five categories, enabling comprehensive training and evaluation of NER models. To construct this dataset, we sourced the data from a real-world UAV delivery system and conducted a rigorous data cleaning and desensitization process to ensure privacy and data integrity. The resulting dataset, consisting of around 12,000 annotated samples, underwent human experts and \textbf{L}arge \textbf{L}anguage \textbf{M}odel annotation. We evaluated classical NER models on our dataset and provided in-depth analysis. The dataset and models are publicly available at \url{https://github.com/zhhvvv/CNER-UAV}., Comment: Accepted by TheWebConf'24 (WWW'24) as a Resource Paper
Published: 2024

18. Understanding and Patching Compositional Reasoning in LLMs

Author: Li, Zhaoyi, Jiang, Gangwei, Xie, Hong, Song, Linqi, Lian, Defu, and Wei, Ying
Subjects: Computer Science - Computation and Language
Abstract: LLMs have marked a revolutonary shift, yet they falter when faced with compositional reasoning tasks. Our research embarks on a quest to uncover the root causes of compositional reasoning failures of LLMs, uncovering that most of them stem from the improperly generated or leveraged implicit reasoning results. Inspired by our empirical findings, we resort to Logit Lens and an intervention experiment to dissect the inner hidden states of LLMs. This deep dive reveals that implicit reasoning results indeed surface within middle layers and play a causative role in shaping the final explicit reasoning results. Our exploration further locates multi-head self-attention (MHSA) modules within these layers, which emerge as the linchpins in accurate generation and leveraing of implicit reasoning results. Grounded on the above findings, we develop CREME, a lightweight method to patch errors in compositional reasoning via editing the located MHSA modules. Our empirical evidence stands testament to CREME's effectiveness, paving the way for autonomously and continuously enhancing compositional reasoning capabilities in language models., Comment: Accepted by ACL'2024 Findings
Published: 2024

19. MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data

Author: Huang, Yinya, Lin, Xiaohan, Liu, Zhengying, Cao, Qingxing, Xin, Huajian, Wang, Haiming, Li, Zhenguo, Song, Linqi, and Liang, Xiaodan
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Formal Languages and Automata Theory, Computer Science - Machine Learning, Computer Science - Programming Languages
Abstract: Recent large language models (LLMs) have witnessed significant advancement in various tasks, including mathematical reasoning and theorem proving. As these two tasks require strict and formal multi-step inference, they are appealing domains for exploring the reasoning ability of LLMs but still face important challenges. Previous studies such as Chain-of-Thought (CoT) have revealed the effectiveness of intermediate steps guidance. However, such step-wise annotation requires heavy labor, leading to insufficient training steps for current benchmarks. To fill this gap, this work introduces MUSTARD, a data generation framework that masters uniform synthesis of theorem and proof data of high quality and diversity. MUSTARD synthesizes data in three stages: (1) It samples a few mathematical concept seeds as the problem category. (2) Then, it prompts a generative language model with the sampled concepts to obtain both the problems and their step-wise formal solutions. (3) Lastly, the framework utilizes a proof assistant (e.g., Lean Prover) to filter the valid proofs. With the proposed MUSTARD, we present a theorem-and-proof benchmark MUSTARDSAUCE with 5,866 valid data points. Each data point contains an informal statement, an informal proof, and a translated formal proof that passes the prover validation. We perform extensive analysis and demonstrate that MUSTARD generates validated high-quality step-by-step data. We further apply the MUSTARDSAUCE for fine-tuning smaller language models. The fine-tuned Llama 2-7B achieves a 15.41% average relative performance gain in automated theorem proving, and 8.18% in math word problems. Codes and data are available at https://github.com/Eleanor-H/MUSTARD.
Published: 2024

20. From Synthetic to Real: Unveiling the Power of Synthetic Data for Video Person Re-ID

Author: Zhang, Xiangqun, Feng, Wei, Han, Ruize, Wang, Likai, Song, Linqi, and Hou, Junhui
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this study, we investigate the novel challenge of cross-domain video-based person re-identification (Re-ID). Here, we utilize synthetic video datasets as the source domain for training and real-world videos for testing, notably reducing the reliance on expensive real data acquisition and annotation. To harness the potential of synthetic data, we first propose a self-supervised domain-invariant feature learning strategy for both static and dynamic (temporal) features. Additionally, to enhance person identification accuracy in the target domain, we propose a mean-teacher scheme incorporating a self-supervised ID consistency loss. Experimental results across five real datasets validate the rationale behind cross-synthetic-real domain adaptation and demonstrate the efficacy of our method. Notably, the discovery that synthetic data outperforms real data in the cross-domain scenario is a surprising outcome. The code and data will be publicly available at https://github.com/XiangqunZhang/UDA_Video_ReID.
Published: 2024

21. Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning

Author: Yan, Guangfeng, Li, Tan, Xiao, Yuanzhang, Hou, Hanxu, and Song, Linqi
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Gradient compression has surfaced as a key technique to address the challenge of communication efficiency in distributed learning. In distributed deep learning, however, it is observed that gradient distributions are heavy-tailed, with outliers significantly influencing the design of compression strategies. Existing parameter quantization methods experience performance degradation when this heavy-tailed feature is ignored. In this paper, we introduce a novel compression scheme specifically engineered for heavy-tailed gradients, which effectively combines gradient truncation with quantization. This scheme is adeptly implemented within a communication-limited distributed Stochastic Gradient Descent (SGD) framework. We consider a general family of heavy-tail gradients that follow a power-law distribution, we aim to minimize the error resulting from quantization, thereby determining optimal values for two critical parameters: the truncation threshold and the quantization density. We provide a theoretical analysis on the convergence error bound under both uniform and non-uniform quantization scenarios. Comparative experiments with other benchmarks demonstrate the effectiveness of our proposed method in managing the heavy-tailed gradients in a distributed learning environment., Comment: arXiv admin note: substantial text overlap with arXiv:2402.01160
Published: 2024

22. Truncated Non-Uniform Quantization for Distributed SGD

Author: Yan, Guangfeng, Li, Tan, Xiao, Yuanzhang, Li, Congduan, and Song, Linqi
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: To address the communication bottleneck challenge in distributed learning, our work introduces a novel two-stage quantization strategy designed to enhance the communication efficiency of distributed Stochastic Gradient Descent (SGD). The proposed method initially employs truncation to mitigate the impact of long-tail noise, followed by a non-uniform quantization of the post-truncation gradients based on their statistical characteristics. We provide a comprehensive convergence analysis of the quantized distributed SGD, establishing theoretical guarantees for its performance. Furthermore, by minimizing the convergence error, we derive optimal closed-form solutions for the truncation threshold and non-uniform quantization levels under given communication constraints. Both theoretical insights and extensive experimental evaluations demonstrate that our proposed algorithm outperforms existing quantization schemes, striking a superior balance between communication efficiency and convergence performance.
Published: 2024

23. Towards Quantum-Safe Federated Learning via Homomorphic Encryption: Learning with Gradients

Author: Yan, Guangfeng, Lyu, Shanxiang, Hou, Hanxu, Zheng, Zhiyong, and Song, Linqi
Subjects: Computer Science - Cryptography and Security
Abstract: This paper introduces a privacy-preserving distributed learning framework via private-key homomorphic encryption. Thanks to the randomness of the quantization of gradients, our learning with error (LWE) based encryption can eliminate the error terms, thus avoiding the issue of error expansion in conventional LWE-based homomorphic encryption. The proposed system allows a large number of learning participants to engage in neural network-based deep learning collaboratively over an honest-but-curious server, while ensuring the cryptographic security of participants' uploaded gradients.
Published: 2024

24. PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models

Author: Tan, Haochen, Guo, Zhijiang, Shi, Zhan, Xu, Lu, Liu, Zhili, Feng, Yunlong, Li, Xiaoguang, Wang, Yasheng, Shang, Lifeng, Liu, Qun, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) have succeeded remarkably in understanding long-form contents. However, exploring their capability for generating long-form contents, such as reports and articles, has been relatively unexplored and inadequately assessed by existing benchmarks. The prevalent evaluation methods, which predominantly rely on crowdsourcing, are recognized for their labor-intensive nature and lack of efficiency, whereas automated metrics, such as the ROUGE score, demonstrate discordance with human judgment criteria. In this paper, we propose ProxyQA, an innovative framework dedicated to assessing long-text generation. ProxyQA comprises in-depth human-curated meta-questions spanning various domains, each accompanied by specific proxy-questions with pre-annotated answers. LLMs are tasked to generate extensive content in response to these meta-questions, by engaging an evaluator and incorporating the generated texts as contextual background, ProxyQA assesses the generated content's quality through the evaluator's accuracy in addressing the proxy-questions. We examine multiple LLMs, emphasizing ProxyQA's demanding nature as a high-quality assessment tool. Human evaluation demonstrates that the proxy-question method is notably self-consistent and aligns closely with human evaluative standards. The dataset and leaderboard is available at \url{https://proxy-qa.com}., Comment: Accepted to ACL 2024 main conference
Published: 2024

25. Integrating Large Language Models into Recommendation via Mutual Augmentation and Adaptive Aggregation

Author: Luo, Sichun, Yao, Yuxuan, He, Bowei, Huang, Yinya, Zhou, Aojun, Zhang, Xinyi, Xiao, Yuanzhang, Zhan, Mingjie, and Song, Linqi
Subjects: Computer Science - Information Retrieval
Abstract: Conventional recommendation methods have achieved notable advancements by harnessing collaborative or sequential information from user behavior. Recently, large language models (LLMs) have gained prominence for their capabilities in understanding and reasoning over textual semantics, and have found utility in various domains, including recommendation. Conventional recommendation methods and LLMs each have their strengths and weaknesses. While conventional methods excel at mining collaborative information and modeling sequential behavior, they struggle with data sparsity and the long-tail problem. LLMs, on the other hand, are proficient at utilizing rich textual contexts but face challenges in mining collaborative or sequential information. Despite their individual successes, there is a significant gap in leveraging their combined potential to enhance recommendation performance. In this paper, we introduce a general and model-agnostic framework known as \textbf{L}arge \textbf{la}nguage model with \textbf{m}utual augmentation and \textbf{a}daptive aggregation for \textbf{Rec}ommendation (\textbf{Llama4Rec}). Llama4Rec synergistically combines conventional and LLM-based recommendation models. Llama4Rec proposes data augmentation and prompt augmentation strategies tailored to enhance the conventional model and LLM respectively. An adaptive aggregation module is adopted to combine the predictions of both kinds of models to refine the final recommendation results. Empirical studies on three real-world datasets validate the superiority of Llama4Rec, demonstrating its consistent outperformance of baseline methods and significant improvements in recommendation performance.
Published: 2024

26. 3D Shape Completion on Unseen Categories:A Weakly-supervised Approach

Author: Wu, Lintai, Hou, Junhui, Song, Linqi, and Xu, Yong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: 3D shapes captured by scanning devices are often incomplete due to occlusion. 3D shape completion methods have been explored to tackle this limitation. However, most of these methods are only trained and tested on a subset of categories, resulting in poor generalization to unseen categories. In this paper, we introduce a novel weakly-supervised framework to reconstruct the complete shapes from unseen categories. We first propose an end-to-end prior-assisted shape learning network that leverages data from the seen categories to infer a coarse shape. Specifically, we construct a prior bank consisting of representative shapes from the seen categories. Then, we design a multi-scale pattern correlation module for learning the complete shape of the input by analyzing the correlation between local patterns within the input and the priors at various scales. In addition, we propose a self-supervised shape refinement model to further refine the coarse shape. Considering the shape variability of 3D objects across categories, we construct a category-specific prior bank to facilitate shape refinement. Then, we devise a voxel-based partial matching loss and leverage the partial scans to drive the refinement process. Extensive experimental results show that our approach is superior to state-of-the-art methods by a large margin., Comment: 15 pages,10 figures
Published: 2024

27. RecRanker: Instruction Tuning Large Language Model as Ranker for Top-k Recommendation

Author: Luo, Sichun, He, Bowei, Zhao, Haohan, Shao, Wei, Qi, Yanlin, Huang, Yinya, Zhou, Aojun, Yao, Yuxuan, Li, Zongpeng, Xiao, Yuanzhang, Zhan, Mingjie, and Song, Linqi
Subjects: Computer Science - Information Retrieval
Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities and have been extensively deployed across various domains, including recommender systems. Prior research has employed specialized \textit{prompts} to leverage the in-context learning capabilities of LLMs for recommendation purposes. More recent studies have utilized instruction tuning techniques to align LLMs with human preferences, promising more effective recommendations. However, existing methods suffer from several limitations. The full potential of LLMs is not fully elicited due to low-quality tuning data and the overlooked integration of conventional recommender signals. Furthermore, LLMs may generate inconsistent responses for different ranking tasks in the recommendation, potentially leading to unreliable results. In this paper, we introduce \textbf{RecRanker}, tailored for instruction tuning LLMs to serve as the \textbf{Ranker} for top-\textit{k} \textbf{Rec}ommendations. Specifically, we introduce an adaptive sampling module for sampling high-quality, representative, and diverse training data. To enhance the prompt, we introduce a position shifting strategy to mitigate position bias and augment the prompt with auxiliary information from conventional recommendation models, thereby enriching the contextual understanding of the LLM. Subsequently, we utilize the sampled data to assemble an instruction-tuning dataset with the augmented prompts comprising three distinct ranking tasks: pointwise, pairwise, and listwise rankings. We further propose a hybrid ranking method to enhance the model performance by ensembling these ranking tasks. Our empirical evaluations demonstrate the effectiveness of our proposed RecRanker in both direct and sequential recommendation scenarios.
Published: 2023

28. Layered Randomized Quantization for Communication-Efficient and Privacy-Preserving Distributed Learning

Author: Yan, Guangfeng, Li, Tan, Lan, Tian, Wu, Kui, and Song, Linqi
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Next-generation wireless networks, such as edge intelligence and wireless distributed learning, face two critical challenges: communication efficiency and privacy protection. In this work, our focus is on addressing these issues in a distributed learning framework. We consider a new approach that simultaneously achieves communication efficiency and privacy protection by exploiting the privacy advantage offered by quantization. Specifically, we use a quantization scheme called \textbf{Gau}ssian \textbf{L}ayered \textbf{R}andomized \textbf{Q}uantization (Gau-LRQ) that compresses the raw model gradients using a layer multishift coupler. By adjusting the parameters of Gau-LRQ, we shape the quantization error to follow the expected Gaussian distribution, thus ensuring client-level differential privacy (CLDP). We demonstrate the effectiveness of our proposed Gau-LRQ in the distributed stochastic gradient descent (SGD) framework and theoretically quantify the trade-offs between communication, privacy, and convergence performance. We further improve the convergence performance by enabling dynamic private budget and quantization bit allocation. We achieve this by using an optimization formula that minimizes convergence error subject to the privacy budget constraint. We evaluate our approach on multiple datasets, including MNIST, CIFAR-10, and CIFAR-100, and show that our proposed method outperforms the baselines in terms of learning performance under various privacy constraints. Moreover, we observe that dynamic privacy allocation yields additional accuracy improvements for the models compared to the fixed scheme.
Published: 2023

29. CLOMO: Counterfactual Logical Modification with Large Language Models

Author: Huang, Yinya, Hong, Ruixin, Zhang, Hongming, Shao, Wei, Yang, Zhicheng, Yu, Dong, Zhang, Changshui, Liang, Xiaodan, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: In this study, we delve into the realm of counterfactual reasoning capabilities of large language models (LLMs). Our primary objective is to cultivate the counterfactual thought processes within LLMs and rigorously assess these processes for their validity. Specifically, we introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark. In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship. To effectively evaluate a generation model's counterfactual capabilities, we propose an innovative evaluation metric, the decomposed Self-Evaluation Score (SES) to directly evaluate the natural language output of LLMs instead of modeling the task as a multiple-choice problem. Analysis shows that the proposed automatic metric aligns well with human preference. Our experimental results show that while LLMs demonstrate a notable capacity for logical counterfactual thinking, there remains a discernible gap between their current abilities and human performance. Code and data are available at https://github.com/Eleanor-H/CLOMO.
Published: 2023

30. Fine-grained Conversational Decoding via Isotropic and Proximal Search

Author: Yao, Yuxuan, Wu, Han, Xu, Qiling, and Song, Linqi
Subjects: Computer Science - Computation and Language
Abstract: General-purpose text decoding approaches are usually adopted for dialogue response generation. Although the quality of the generated responses can be improved with dialogue-specific encoding methods, conversational decoding methods are still under-explored. Inspired by \citet{wu2023learning} that a good dialogue feature space should follow the rules of locality and isotropy, we present a fine-grained conversational decoding method, termed \textit{isotropic and proximal search (IPS)}. Our method is designed to generate the semantic-concentrated response, while still maintaining informativeness and discrimination against the context. Experiments show that our approach outperforms existing decoding strategies in the dialogue field across both automatic and human evaluation metrics. More in-depth analyses further confirm the effectiveness of our approach., Comment: Accepted to EMNLP 2023 Main Conference
Published: 2023

31. MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning

Author: Wang, Ke, Ren, Houxing, Zhou, Aojun, Lu, Zimu, Luo, Sichun, Shi, Weikang, Zhang, Renrui, Song, Linqi, Zhan, Mingjie, and Li, Hongsheng
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: The recently released GPT-4 Code Interpreter has demonstrated remarkable proficiency in solving challenging math problems, primarily attributed to its ability to seamlessly reason with natural language, generate code, execute code, and continue reasoning based on the execution output. In this paper, we present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations and, consequently, enhancing their mathematical reasoning abilities. We propose a method of generating novel and high-quality datasets with math problems and their code-based solutions, referred to as MathCodeInstruct. Each solution interleaves natural language, code, and execution results. We also introduce a customized supervised fine-tuning and inference approach. This approach yields the MathCoder models, a family of models capable of generating code-based solutions for solving challenging math problems. Impressively, the MathCoder models achieve state-of-the-art scores among open-source LLMs on the MATH (45.2%) and GSM8K (83.9%) datasets, substantially outperforming other open-source alternatives. Notably, the MathCoder model not only surpasses ChatGPT-3.5 and PaLM-2 on GSM8K and MATH but also outperforms GPT-4 on the competition-level MATH dataset. The dataset and models will be released at https://github.com/mathllm/MathCoder., Comment: The state-of-the-art open-source language models for mathematical reasoning
Published: 2023

32. Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification

Author: Zhou, Aojun, Wang, Ke, Lu, Zimu, Shi, Weikang, Luo, Sichun, Qin, Zipeng, Lu, Shaoqing, Jia, Anya, Song, Linqi, Zhan, Mingjie, and Li, Hongsheng
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent progress in large language models (LLMs) like GPT-4 and PaLM-2 has brought significant advancements in addressing math reasoning problems. In particular, OpenAI's latest version of GPT-4, known as GPT-4 Code Interpreter, shows remarkable performance on challenging math datasets. In this paper, we explore the effect of code on enhancing LLMs' reasoning capability by introducing different constraints on the \textit{Code Usage Frequency} of GPT-4 Code Interpreter. We found that its success can be largely attributed to its powerful skills in generating and executing code, evaluating the output of code execution, and rectifying its solution when receiving unreasonable outputs. Based on this insight, we propose a novel and effective prompting method, explicit \uline{c}ode-based \uline{s}elf-\uline{v}erification~(CSV), to further boost the mathematical reasoning potential of GPT-4 Code Interpreter. This method employs a zero-shot prompt on GPT-4 Code Interpreter to encourage it to use code to self-verify its answers. In instances where the verification state registers as ``False'', the model shall automatically amend its solution, analogous to our approach of rectifying errors during a mathematics examination. Furthermore, we recognize that the states of the verification result indicate the confidence of a solution, which can improve the effectiveness of majority voting. With GPT-4 Code Interpreter and CSV, we achieve an impressive zero-shot accuracy on MATH dataset \textbf{(53.9\% $\to$ 84.3\%)}., Comment: Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification
Published: 2023

33. Reconstruct Before Summarize: An Efficient Two-Step Framework for Condensing and Summarizing Meeting Transcripts

Author: Tan, Haochen, Wu, Han, Shao, Wei, Zhang, Xinyun, Zhan, Mingjie, Hou, Zhaohui, Liang, Ding, and Song, Linqi
Subjects: Computer Science - Computation and Language
Abstract: Meetings typically involve multiple participants and lengthy conversations, resulting in redundant and trivial content. To overcome these challenges, we propose a two-step framework, Reconstruct before Summarize (RbS), for effective and efficient meeting summarization. RbS first leverages a self-supervised paradigm to annotate essential contents by reconstructing the meeting transcripts. Secondly, we propose a relative positional bucketing (RPB) algorithm to equip (conventional) summarization models to generate the summary. Despite the additional reconstruction process, our proposed RPB significantly compressed the input, leading to faster processing and reduced memory consumption compared to traditional summarization methods. We validate the effectiveness and efficiency of our method through extensive evaluations and analysis. On two meeting summarization datasets, AMI and ICSI, our approach outperforms previous state-of-the-art approaches without relying on large-scale pre-training or expert-grade annotating tools., Comment: Accepted to EMNLP 2023 main conference
Published: 2023

34. PerFedRec++: Enhancing Personalized Federated Recommendation with Self-Supervised Pre-Training

Author: Luo, Sichun, Xiao, Yuanzhang, Zhang, Xinyi, Liu, Yang, Ding, Wenbo, and Song, Linqi
Subjects: Computer Science - Information Retrieval
Abstract: Federated recommendation systems employ federated learning techniques to safeguard user privacy by transmitting model parameters instead of raw user data between user devices and the central server. Nevertheless, the current federated recommender system faces challenges such as heterogeneity and personalization, model performance degradation, and communication bottleneck. Previous studies have attempted to address these issues, but none have been able to solve them simultaneously. In this paper, we propose a novel framework, named PerFedRec++, to enhance the personalized federated recommendation with self-supervised pre-training. Specifically, we utilize the privacy-preserving mechanism of federated recommender systems to generate two augmented graph views, which are used as contrastive tasks in self-supervised graph learning to pre-train the model. Pre-training enhances the performance of federated models by improving the uniformity of representation learning. Also, by providing a better initial state for federated training, pre-training makes the overall training converge faster, thus alleviating the heavy communication burden. We then construct a collaborative graph to learn the client representation through a federated graph neural network. Based on these learned representations, we cluster users into different user groups and learn personalized models for each cluster. Each user learns a personalized model by combining the global federated model, the cluster-level federated model, and its own fine-tuned local model. Experiments on three real-world datasets show that our proposed method achieves superior performance over existing methods.
Published: 2023

35. VCSUM: A Versatile Chinese Meeting Summarization Dataset

Author: Wu, Han, Zhan, Mingjie, Tan, Haochen, Hou, Zhaohui, Liang, Ding, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Compared to news and chat summarization, the development of meeting summarization is hugely decelerated by the limited data. To this end, we introduce a versatile Chinese meeting summarization dataset, dubbed VCSum, consisting of 239 real-life meetings, with a total duration of over 230 hours. We claim our dataset is versatile because we provide the annotations of topic segmentation, headlines, segmentation summaries, overall meeting summaries, and salient sentences for each meeting transcript. As such, the dataset can adapt to various summarization tasks or methods, including segmentation-based summarization, multi-granularity summarization and retrieval-then-generate summarization. Our analysis confirms the effectiveness and robustness of VCSum. We also provide a set of benchmark models regarding different downstream summarization tasks on VCSum to facilitate further research. The dataset and code will be released at https://github.com/hahahawu/VCSum., Comment: Findings of ACL 2023 (long paper). GitHub: https://github.com/hahahawu/VCSum
Published: 2023

36. Killing Two Birds with One Stone: Quantization Achieves Privacy in Distributed Learning

Author: Yan, Guangfeng, Li, Tan, Wu, Kui, and Song, Linqi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security
Abstract: Communication efficiency and privacy protection are two critical issues in distributed machine learning. Existing methods tackle these two issues separately and may have a high implementation complexity that constrains their application in a resource-limited environment. We propose a comprehensive quantization-based solution that could simultaneously achieve communication efficiency and privacy protection, providing new insights into the correlated nature of communication and privacy. Specifically, we demonstrate the effectiveness of our proposed solutions in the distributed stochastic gradient descent (SGD) framework by adding binomial noise to the uniformly quantized gradients to reach the desired differential privacy level but with a minor sacrifice in communication efficiency. We theoretically capture the new trade-offs between communication, privacy, and learning performance.
Published: 2023

37. A Scalable Test Problem Generator for Sequential Transfer Optimization

Author: Xue, Xiaoming, Yang, Cuie, Feng, Liang, Zhang, Kai, Song, Linqi, and Tan, Kay Chen
Subjects: Computer Science - Neural and Evolutionary Computing, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Sequential transfer optimization (STO), which aims to improve the optimization performance on a task of interest by exploiting the knowledge captured from several previously-solved optimization tasks stored in a database, has been gaining increasing research attention over the years. However, despite the remarkable advances in algorithm design, the development of a systematic benchmark suite for comprehensive comparisons of STO algorithms received far less attention. Existing test problems are either simply generated by assembling other benchmark functions or extended from specific practical problems with limited scalability. The relationships between the optimal solutions of the source and target tasks in these problems are also often manually configured, limiting their ability to model different similarity relationships presented in real-world problems. Consequently, the good performance achieved by an algorithm on these problems might be biased and hard to be generalized to other problems. In light of the above, in this study, we first introduce four concepts for characterizing STO problems and present an important problem feature, namely similarity distribution, which quantitatively delineates the relationship between the optima of the source and target tasks. Then, we present the general design guidelines of STO problems and a particular STO problem generator with good scalability. Specifically, the similarity distribution of a problem can be easily customized, enabling a continuous spectrum of representation of the diverse similarity relationships of real-world problems. Lastly, a benchmark suite with 12 STO problems featured by a variety of customized similarity relationships is developed using the proposed generator. The source code of the problem generator is available at https://github.com/XmingHsueh/STOP-G.
Published: 2023

38. Gap-closing Matters: Perceptual Quality Evaluation and Optimization of Low-Light Image Enhancement

Author: Chen, Baoliang, Zhu, Lingyu, Zhu, Hanwei, Yang, Wenhan, Song, Linqi, and Wang, Shiqi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: There is a growing consensus in the research community that the optimization of low-light image enhancement approaches should be guided by the visual quality perceived by end users. Despite the substantial efforts invested in the design of low-light enhancement algorithms, there has been comparatively limited focus on assessing subjective and objective quality systematically. To mitigate this gap and provide a clear path towards optimizing low-light image enhancement for better visual quality, we propose a gap-closing framework. In particular, our gap-closing framework starts with the creation of a large-scale dataset for Subjective QUality Assessment of REconstructed LOw-Light Images (SQUARE-LOL). This database serves as the foundation for studying the quality of enhanced images and conducting a comprehensive subjective user study. Subsequently, we propose an objective quality assessment measure that plays a critical role in bridging the gap between visual quality and enhancement. Finally, we demonstrate that our proposed objective quality measure can be incorporated into the process of optimizing the learning of the enhancement model toward perceptual optimality. We validate the effectiveness of our proposed framework through both the accuracy of quality prediction and the perceptual quality of image enhancement. Our database and codes are publicly available at https://github.com/Baoliang93/IACA_For_Lowlight_IQA., Comment: Basis Angle Consistency in Sec.3.2 will be revised
Published: 2023
Full Text: View/download PDF

39. A Framework of Large-Scale Peer-to-Peer Learning System

Author: Luo, Yongkang, Han, Peiyi, Luo, Wenjian, Xue, Shaocong, Chen, Kesheng, Song, Linqi, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
Published: 2024
Full Text: View/download PDF

40. Adaptive Top-K in SGD for Communication-Efficient Distributed Learning

Author: Ruan, Mengzhe, Yan, Guangfeng, Xiao, Yuanzhang, Song, Linqi, and Xu, Weitao
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing, Mathematics - Optimization and Control
Abstract: Distributed stochastic gradient descent (SGD) with gradient compression has become a popular communication-efficient solution for accelerating distributed learning. One commonly used method for gradient compression is Top-K sparsification, which sparsifies the gradients by a fixed degree during model training. However, there has been a lack of an adaptive approach to adjust the sparsification degree to maximize the potential of the model's performance or training speed. This paper proposes a novel adaptive Top-K in SGD framework that enables an adaptive degree of sparsification for each gradient descent step to optimize the convergence performance by balancing the trade-off between communication cost and convergence error. Firstly, an upper bound of convergence error is derived for the adaptive sparsification scheme and the loss function. Secondly, an algorithm is designed to minimize the convergence error under the communication cost constraints. Finally, numerical results on the MNIST and CIFAR-10 datasets demonstrate that the proposed adaptive Top-K algorithm in SGD achieves a significantly better convergence rate compared to state-of-the-art methods, even after considering error compensation., Comment: 6 pages, 10 figures, has been accepted by GlobeCom 2023
Published: 2022

41. Hiding Images in Deep Probabilistic Models

Author: Chen, Haoyu, Song, Linqi, Qian, Zhenxing, Zhang, Xinpeng, and Ma, Kede
Subjects: Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Data hiding with deep neural networks (DNNs) has experienced impressive successes in recent years. A prevailing scheme is to train an autoencoder, consisting of an encoding network to embed (or transform) secret messages in (or into) a carrier, and a decoding network to extract the hidden messages. This scheme may suffer from several limitations regarding practicability, security, and embedding capacity. In this work, we describe a different computational framework to hide images in deep probabilistic models. Specifically, we use a DNN to model the probability density of cover images, and hide a secret image in one particular location of the learned distribution. As an instantiation, we adopt a SinGAN, a pyramid of generative adversarial networks (GANs), to learn the patch distribution of one cover image. We hide the secret image by fitting a deterministic mapping from a fixed set of noise maps (generated by an embedding key) to the secret image during patch distribution learning. The stego SinGAN, behaving as the original SinGAN, is publicly communicated; only the receiver with the embedding key is able to extract the secret image. We demonstrate the feasibility of our SinGAN approach in terms of extraction accuracy and model security. Moreover, we show the flexibility of the proposed method in terms of hiding multiple images for different receivers and obfuscating the secret image.
Published: 2022

42. Towards Communication Efficient and Fair Federated Personalized Sequential Recommendation

Author: Luo, Sichun, Xiao, Yuanzhang, Liu, Yang, Li, Congduan, and Song, Linqi
Subjects: Computer Science - Information Retrieval
Abstract: Federated recommendations leverage the federated learning (FL) techniques to make privacy-preserving recommendations. Though recent success in the federated recommender system, several vital challenges remain to be addressed: (i) The majority of federated recommendation models only consider the model performance and the privacy-preserving ability, while ignoring the optimization of the communication process; (ii) Most of the federated recommenders are designed for heterogeneous systems, causing unfairness problems during the federation process; (iii) The personalization techniques have been less explored in many federated recommender systems. In this paper, we propose a Communication efficient and Fair personalized Federated personalized Sequential Recommendation algorithm (CF-FedSR) to tackle these challenges. CF-FedSR introduces a communication-efficient scheme that employs adaptive client selection and clustering-based sampling to accelerate the training process. A fairness-aware model aggregation algorithm that can adaptively capture the data and performance imbalance among different clients to address the unfairness problems is proposed. The personalization module assists clients in making personalized recommendations and boosts the recommendation performance via local fine-tuning and model adaption. Extensive experimental results show the effectiveness and efficiency of our proposed method.
Published: 2022

43. HySAGE: A Hybrid Static and Adaptive Graph Embedding Network for Context-Drifting Recommendations

Author: Luo, Sichun, Zhang, Xinyi, Xiao, Yuanzhang, and Song, Linqi
Subjects: Computer Science - Information Retrieval
Abstract: The recent popularity of edge devices and Artificial Intelligent of Things (AIoT) has driven a new wave of contextual recommendations, such as location based Point of Interest (PoI) recommendations and computing resource-aware mobile app recommendations. In many such recommendation scenarios, contexts are drifting over time. For example, in a mobile game recommendation, contextual features like locations, battery, and storage levels of mobile devices are frequently drifting over time. However, most existing graph-based collaborative filtering methods are designed under the assumption of static features. Therefore, they would require frequent retraining and/or yield graphical models burgeoning in sizes, impeding their suitability for context-drifting recommendations. In this work, we propose a specifically tailor-made Hybrid Static and Adaptive Graph Embedding (HySAGE) network for context-drifting recommendations. Our key idea is to disentangle the relatively static user-item interaction and rapidly drifting contextual features. Specifically, our proposed HySAGE network learns a relatively static graph embedding from user-item interaction and an adaptive embedding from drifting contextual features. These embeddings are incorporated into an interest network to generate the user interest in some certain context. We adopt an interactive attention module to learn the interactions among static graph embeddings, adaptive contextual embeddings, and user interest, helping to achieve a better final representation. Extensive experiments on real-world datasets demonstrate that HySAGE significantly improves the performance of the existing state-of-the-art recommendation algorithms.
Published: 2022

44. Personalized Federated Recommendation via Joint Representation Learning, User Clustering, and Model Adaptation

Author: Luo, Sichun, Xiao, Yuanzhang, and Song, Linqi
Subjects: Computer Science - Information Retrieval
Abstract: Federated recommendation applies federated learning techniques in recommendation systems to help protect user privacy by exchanging models instead of raw user data between user devices and the central server. Due to the heterogeneity in user's attributes and local data, attaining personalized models is critical to help improve the federated recommendation performance. In this paper, we propose a Graph Neural Network based Personalized Federated Recommendation (PerFedRec) framework via joint representation learning, user clustering, and model adaptation. Specifically, we construct a collaborative graph and incorporate attribute information to jointly learn the representation through a federated GNN. Based on these learned representations, we cluster users into different user groups and learn personalized models for each cluster. Then each user learns a personalized model by combining the global federated model, the cluster-level federated model, and the user's fine-tuned local model. To alleviate the heavy communication burden, we intelligently select a few representative users (instead of randomly picked users) from each cluster to participate in training. Experiments on real-world datasets show that our proposed method achieves superior performance over existing methods.
Published: 2022

45. Towards Efficient Communications in Federated Learning: A Contemporary Survey

Author: Zhao, Zihao, Mao, Yuzhu, Liu, Yang, Song, Linqi, Ouyang, Ye, Chen, Xinlei, and Ding, Wenbo
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: In the traditional distributed machine learning scenario, the user's private data is transmitted between clients and a central server, which results in significant potential privacy risks. In order to balance the issues of data privacy and joint training of models, federated learning (FL) is proposed as a particular distributed machine learning procedure with privacy protection mechanisms, which can achieve multi-party collaborative computing without revealing the original data. However, in practice, FL faces a variety of challenging communication problems. This review seeks to elucidate the relationship between these communication issues by methodically assessing the development of FL communication research from three perspectives: communication efficiency, communication environment, and communication resource allocation. Firstly, we sort out the current challenges existing in the communications of FL. Second, we have collated FL communications-related papers and described the overall development trend of the field based on their logical relationship. Ultimately, we discuss the future directions of research for communications in FL.
Published: 2022

46. Towards Better Understanding with Uniformity and Explicit Regularization of Embeddings in Embedding-based Neural Topic Models

Author: Shao, Wei, Huang, Lei, Liu, Shuqi, Ma, Shihua, and Song, Linqi
Subjects: Computer Science - Computation and Language
Abstract: Embedding-based neural topic models could explicitly represent words and topics by embedding them to a homogeneous feature space, which shows higher interpretability. However, there are no explicit constraints for the training of embeddings, leading to a larger optimization space. Also, a clear description of the changes in embeddings and the impact on model performance is still lacking. In this paper, we propose an embedding regularized neural topic model, which applies the specially designed training constraints on word embedding and topic embedding to reduce the optimization space of parameters. To reveal the changes and roles of embeddings, we introduce \textbf{uniformity} into the embedding-based neural topic model as the evaluation metric of embedding space. On this basis, we describe how embeddings tend to change during training via the changes in the uniformity of embeddings. Furthermore, we demonstrate the impact of changes in embeddings in embedding-based neural topic models through ablation studies. The results of experiments on two mainstream datasets indicate that our model significantly outperforms baseline models in terms of the harmony between topic quality and document modeling. This work is the first attempt to exploit uniformity to explore changes in embeddings of embedding-based neural topic models and their impact on model performance to the best of our knowledge., Comment: IJCNN 2022
Published: 2022

47. Learning Locality and Isotropy in Dialogue Modeling

Author: Wu, Han, Tan, Haochen, Zhan, Mingjie, Zhao, Gangming, Lu, Shaoqing, Liang, Ding, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: Existing dialogue modeling methods have achieved promising performance on various dialogue tasks with the aid of Transformer and the large-scale pre-trained language models. However, some recent studies revealed that the context representations produced by these methods suffer the problem of anisotropy. In this paper, we find that the generated representations are also not conversational, losing the conversation structure information during the context modeling stage. To this end, we identify two properties in dialogue modeling, i.e., locality and isotropy, and present a simple method for dialogue representation calibration, namely SimDRC, to build isotropic and conversational feature spaces. Experimental results show that our approach significantly outperforms the current state-of-the-art models on three dialogue tasks across the automatic and human evaluation metrics. More in-depth analyses further confirm the effectiveness of our proposed approach., Comment: To appear in ICLR 2023
Published: 2022

48. Zero-shot Cross-lingual Conversational Semantic Role Labeling

Author: Wu, Han, Tan, Haochen, Xu, Kun, Liu, Shuqi, Wu, Lianwei, and Song, Linqi
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: While conversational semantic role labeling (CSRL) has shown its usefulness on Chinese conversational tasks, it is still under-explored in non-Chinese languages due to the lack of multilingual CSRL annotations for the parser training. To avoid expensive data collection and error-propagation of translation-based methods, we present a simple but effective approach to perform zero-shot cross-lingual CSRL. Our model implicitly learns language-agnostic, conversational structure-aware and semantically rich representations with the hierarchical encoders and elaborately designed pre-training objectives. Experimental results show that our model outperforms all baselines by large margins on two newly collected English CSRL test sets. More importantly, we confirm the usefulness of CSRL to non-Chinese conversational tasks such as the question-in-context rewriting task in English and the multi-turn dialogue response generation tasks in English, German and Japanese by incorporating the CSRL information into the downstream conversation-based models. We believe this finding is significant and will facilitate the research of non-Chinese dialogue tasks which suffer the problems of ellipsis and anaphora., Comment: NAACL 2022 findings
Published: 2022

49. A Sentence is Worth 128 Pseudo Tokens: A Semantic-Aware Contrastive Learning Framework for Sentence Embeddings

Author: Tan, Haochen, Shao, Wei, Wu, Han, Yang, Ke, and Song, Linqi
Subjects: Computer Science - Computation and Language
Abstract: Contrastive learning has shown great potential in unsupervised sentence embedding tasks, e.g., SimCSE. However, We find that these existing solutions are heavily affected by superficial features like the length of sentences or syntactic structures. In this paper, we propose a semantics-aware contrastive learning framework for sentence embeddings, termed Pseudo-Token BERT (PT-BERT), which is able to exploit the pseudo-token space (i.e., latent semantic space) representation of a sentence while eliminating the impact of superficial features such as sentence length and syntax. Specifically, we introduce an additional pseudo token embedding layer independent of the BERT encoder to map each sentence into a sequence of pseudo tokens in a fixed length. Leveraging these pseudo sequences, we are able to construct same-length positive and negative pairs based on the attention mechanism to perform contrastive learning. In addition, we utilize both the gradient-updating and momentum-updating encoders to encode instances while dynamically maintaining an additional queue to store the representation of sentence embeddings, enhancing the encoder's learning performance for negative examples. Experiments show that our model outperforms the state-of-the-art baselines on six standard semantic textual similarity (STS) tasks. Furthermore, experiments on alignments and uniformity losses, as well as hard examples with different sentence lengths and syntax, consistently verify the effectiveness of our method., Comment: Long paper; ACL 2022 (Findings)
Published: 2022

50. Privacy-Preserving Communication-Efficient Federated Multi-Armed Bandits

Author: Li, Tan and Song, Linqi
Subjects: Computer Science - Machine Learning, Computer Science - Networking and Internet Architecture
Abstract: Communication bottleneck and data privacy are two critical concerns in federated multi-armed bandit (MAB) problems, such as situations in decision-making and recommendations of connected vehicles via wireless. In this paper, we design the privacy-preserving communication-efficient algorithm in such problems and study the interactions among privacy, communication and learning performance in terms of the regret. To be specific, we design privacy-preserving learning algorithms and communication protocols and derive the learning regret when networked private agents are performing online bandit learning in a master-worker, a decentralized and a hybrid structure. Our bandit learning algorithms are based on epoch-wise sub-optimal arm eliminations at each agent and agents exchange learning knowledge with the server/each other at the end of each epoch. Furthermore, we adopt the differential privacy (DP) approach to protect the data privacy at each agent when exchanging information; and we curtail communication costs by making less frequent communications with fewer agents participation. By analyzing the regret of our proposed algorithmic framework in the master-worker, decentralized and hybrid structures, we theoretically show tradeoffs between regret and communication costs/privacy. Finally, we empirically show these trade-offs which are consistent with our theoretical analysis.
Published: 2021

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

347 results on '"Song, Linqi"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources