Author: "Chen, Qi-an" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Chen, Qi-an"' showing total 24,156 results

Start Over Author "Chen, Qi-an"

24,156 results on '"Chen, Qi-an"'

1. RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Author: Liu, Di, Chen, Meng, Lu, Baotong, Jiang, Huiqiang, Han, Zhenhua, Zhang, Qianxi, Chen, Qi, Zhang, Chengruidong, Ding, Bailu, Zhang, Kai, Chen, Chen, Yang, Fan, Yang, Yuqing, and Qiu, Lili
Subjects: Computer Science - Machine Learning, Computer Science - Computation and Language
Abstract: Transformer-based Large Language Models (LLMs) have become increasingly important. However, due to the quadratic time complexity of attention computation, scaling LLMs to longer contexts incurs extremely slow inference latency and high GPU memory consumption for caching key-value (KV) vectors. This paper proposes RetrievalAttention, a training-free approach to both accelerate attention computation and reduce GPU memory consumption. By leveraging the dynamic sparsity of attention mechanism, RetrievalAttention proposes to use approximate nearest neighbor search (ANNS) indexes for KV vectors in CPU memory and retrieves the most relevant ones with vector search during generation. Unfortunately, we observe that the off-the-shelf ANNS indexes are often ineffective for such retrieval tasks due to the out-of-distribution (OOD) between query vectors and key vectors in attention mechanism. RetrievalAttention addresses the OOD challenge by designing an attention-aware vector search algorithm that can adapt to the distribution of query vectors. Our evaluation shows that RetrievalAttention only needs to access 1--3% of data while maintaining high model accuracy. This leads to significant reduction in the inference cost of long-context LLMs with much lower GPU memory footprint. In particular, RetrievalAttention only needs a single NVIDIA RTX4090 (24GB) for serving 128K tokens in LLMs with 8B parameters, which is capable of generating one token in 0.188 seconds., Comment: 16 pages
Published: 2024

2. Revisiting Physical-World Adversarial Attack on Traffic Sign Recognition: A Commercial Systems Perspective

Author: Wang, Ningfei, Xie, Shaoyuan, Sato, Takami, Luo, Yunpeng, Xu, Kaidi, and Chen, Qi Alfred
Subjects: Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition
Abstract: Traffic Sign Recognition (TSR) is crucial for safe and correct driving automation. Recent works revealed a general vulnerability of TSR models to physical-world adversarial attacks, which can be low-cost, highly deployable, and capable of causing severe attack effects such as hiding a critical traffic sign or spoofing a fake one. However, so far existing works generally only considered evaluating the attack effects on academic TSR models, leaving the impacts of such attacks on real-world commercial TSR systems largely unclear. In this paper, we conduct the first large-scale measurement of physical-world adversarial attacks against commercial TSR systems. Our testing results reveal that it is possible for existing attack works from academia to have highly reliable (100\%) attack success against certain commercial TSR system functionality, but such attack capabilities are not generalizable, leading to much lower-than-expected attack success rates overall. We find that one potential major factor is a spatial memorization design that commonly exists in today's commercial TSR systems. We design new attack success metrics that can mathematically model the impacts of such design on the TSR system-level attack success, and use them to revisit existing attacks. Through these efforts, we uncover 7 novel observations, some of which directly challenge the observations or claims in prior works due to the introduction of the new metrics., Comment: Accepted by NDSS 2025
Published: 2024
Full Text: View/download PDF

3. Analyzing Tumors by Synthesis

Author: Chen, Qi, Lai, Yuxiang, Chen, Xiaoxi, Hu, Qixin, Yuille, Alan, and Zhou, Zongwei
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Computer-aided tumor detection has shown great potential in enhancing the interpretation of over 80 million CT scans performed annually in the United States. However, challenges arise due to the rarity of CT scans with tumors, especially early-stage tumors. Developing AI with real tumor data faces issues of scarcity, annotation difficulty, and low prevalence. Tumor synthesis addresses these challenges by generating numerous tumor examples in medical images, aiding AI training for tumor detection and segmentation. Successful synthesis requires realistic and generalizable synthetic tumors across various organs. This chapter reviews AI development on real and synthetic data and summarizes two key trends in synthetic data for cancer imaging research: modeling-based and learning-based approaches. Modeling-based methods, like Pixel2Cancer, simulate tumor development over time using generic rules, while learning-based methods, like DiffTumor, learn from a few annotated examples in one organ to generate synthetic tumors in others. Reader studies with expert radiologists show that synthetic tumors can be convincingly realistic. We also present case studies in the liver, pancreas, and kidneys reveal that AI trained on synthetic tumors can achieve performance comparable to, or better than, AI only trained on real data. Tumor synthesis holds significant promise for expanding datasets, enhancing AI reliability, improving tumor detection performance, and preserving patient privacy., Comment: Accepted as a chapter in the Springer Book: "Generative Machine Learning Models in Medical Image Computing."
Published: 2024

4. A model for inflaton induced baryogenesis and its phenomenological consequences

Author: An, Haipeng, Chen, Qi, and Yin, Yuan
Subjects: Astrophysics - Cosmology and Nongalactic Astrophysics, High Energy Physics - Phenomenology
Abstract: In this study, we introduce a novel approach aimed at addressing the longstanding baryon-anti-baryon asymmetry conundrum. Our proposed mechanism suggests that baryon numbers were generated during the inflationary epoch through the dynamics of the inflaton field coupled with an explicit baryon number violating interaction. Notably, during inflation, it is possible to halt the baryon number generation process via a symmetry restoration phase transition. We elucidate that prior to this phase transition, baryon numbers could be synthesized and preserved within classical field configurations. Subsequently, following the phase transition, these baryon numbers were liberated as particles. Crucially, we demonstrate that this mechanism of baryon number production is intricately linked with significant cosmological collider signals and gravitational wave (GW) signals, offering a compelling framework to explore the origins of the universe's matter-antimatter asymmetry.
Published: 2024

5. Maximizing orientation of a three-state molecule in a cavity with analytically designed pulses

Author: Fan, Li-Bao, Li, Hai-Ji, Chen, Qi, Zhou, Hang, Liu, Heng, and Shu, Chuan-Cun
Subjects: Quantum Physics
Abstract: We theoretically explore the precise control of a molecular polariton by strongly coupling the lowest three rotational states of a single molecule with a single-mode cavity. We examine two distinct cavity resonance configurations: a fundamental frequency cavity ($\omega_c = 2B$ with the rotational constant $B$) resonating with the lowest two rotational states, and a second harmonic cavity ($\omega_c = 4B$) coupling with the first and second excited rotational states. We propose two control schemes based on the two polariton configurations and derive the corresponding pulse-area theorems to achieve a theoretical maximum orientation of 0.7746, identical to the molecule in the absence of the cavity. The control schemes are analyzed in Carbonyl Sulfide (OCS) molecules in their ground rotational state. Our numerical simulation results demonstrate the theoretical control schemes and analyze the sensitivity of the molecular polariton orientation degree to the control field bandwidth and phases. This work provides a valuable reference for achieving maximum field-free orientation of ultracold three-state molecules in a cavity using analytically designed pulses., Comment: 22 pages, 7 figures
Published: 2024

6. HEAD: A Bandwidth-Efficient Cooperative Perception Approach for Heterogeneous Connected and Autonomous Vehicles

Author: Qu, Deyuan, Chen, Qi, Zhu, Yongqi, Zhu, Yihao, Avedisov, Sergei S., Fu, Song, and Yang, Qing
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In cooperative perception studies, there is often a trade-off between communication bandwidth and perception performance. While current feature fusion solutions are known for their excellent object detection performance, transmitting the entire sets of intermediate feature maps requires substantial bandwidth. Furthermore, these fusion approaches are typically limited to vehicles that use identical detection models. Our goal is to develop a solution that supports cooperative perception across vehicles equipped with different modalities of sensors. This method aims to deliver improved perception performance compared to late fusion techniques, while achieving precision similar to the state-of-art intermediate fusion, but requires an order of magnitude less bandwidth. We propose HEAD, a method that fuses features from the classification and regression heads in 3D object detection networks. Our method is compatible with heterogeneous detection networks such as LiDAR PointPillars, SECOND, VoxelNet, and camera Bird's-eye View (BEV) Encoder. Given the naturally smaller feature size in the detection heads, we design a self-attention mechanism to fuse the classification head and a complementary feature fusion layer to fuse the regression head. Our experiments, comprehensively evaluated on the V2V4Real and OPV2V datasets, demonstrate that HEAD is a fusion method that effectively balances communication bandwidth and perception performance., Comment: Accepted by ECCV 2024 Workshop
Published: 2024

7. XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training

Author: Wu, Biao, Xie, Yutong, Zhang, Zeyu, Phan, Minh Hieu, Chen, Qi, Chen, Ling, and Wu, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision-and-language pretraining (VLP) in the medical field utilizes contrastive learning on image-text pairs to achieve effective transfer across tasks. Yet, current VLP approaches with the masked modelling strategy face two challenges when applied to the medical domain. First, current models struggle to accurately reconstruct key pathological features due to the scarcity of medical data. Second, most methods only adopt either paired image-text or image-only data, failing to exploit the combination of both paired and unpaired data. To this end, this paper proposes a XLIP (Masked modelling for medical Language-Image Pre-training) framework to enhance pathological learning and feature learning via unpaired data. First, we introduce the attention-masked image modelling (AttMIM) and entity-driven masked language modelling module (EntMLM), which learns to reconstruct pathological visual and textual tokens via multi-modal feature interaction, thus improving medical-enhanced features. The AttMIM module masks a portion of the image features that are highly responsive to textual features. This allows XLIP to improve the reconstruction of highly similar image data in medicine efficiency. Second, our XLIP capitalizes unpaired data to enhance multimodal learning by introducing disease-kind prompts. The experimental results show that XLIP achieves SOTA for zero-shot and fine-tuning classification performance on five datasets. Our code will be available at https://github.com/White65534/XLIP
Published: 2024

8. InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation

Author: Zhang, Zeyu, Liu, Akide, Chen, Qi, Chen, Feng, Reid, Ian, Hartley, Richard, Zhuang, Bohan, and Tang, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Text-to-motion generation holds potential for film, gaming, and robotics, yet current methods often prioritize short motion generation, making it challenging to produce long motion sequences effectively: (1) Current methods struggle to handle long motion sequences as a single input due to prohibitively high computational cost; (2) Breaking down the generation of long motion sequences into shorter segments can result in inconsistent transitions and requires interpolation or inpainting, which lacks entire sequence modeling. To solve these challenges, we propose InfiniMotion, a method that generates continuous motion sequences of arbitrary length within an autoregressive framework. We highlight its groundbreaking capability by generating a continuous 1-hour human motion with around 80,000 frames. Specifically, we introduce the Motion Memory Transformer with Bidirectional Mamba Memory, enhancing the transformer's memory to process long motion sequences effectively without overwhelming computational resources. Notably our method achieves over 30% improvement in FID and 6 times longer demonstration compared to previous state-of-the-art methods, showcasing significant advancements in long motion generation. See project webpage: https://steve-zeyu-zhang.github.io/InfiniMotion/
Published: 2024

9. Measurement of microwave photon correlations at millikelvin with a thermal detector

Author: Keränen, Aarne, Chen, Qi-Ming, Gunyhó, András, Singh, Priyank, Ma, Jian, Vesterinen, Visa, Govenius, Joonas, and Möttönen, Mikko
Subjects: Quantum Physics, Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Microwave photons are important carriers of quantum information in many promising platforms for quantum computing. They can be routinely generated, controlled, and teleported in experiments, indicating a variety of applications in quantum technology. However, observation of quantum statistical properties of microwave photons remains demanding: The energy of several microwave photons is considerably smaller than the thermal fluctuation of any room-temperature detector, while amplification necessarily induces noise. Here, we present a measurement technique with a nanobolometer that directly measures the photon statistics at millikelvin and overcomes this trade-off. We apply our method to thermal states generated by a blackbody radiator operating in the regime of circuit quantum electrodynamics. We demonstrate the photon number resolvedness of the nanobolometer, and reveal the n(n+1)-scaling law of the photon number variance as indicated by the Bose--Einstein distribution. By engineering the coherent and incoherent proportions of the input field, we observe the transition between super-Poissonian and Poissonian statistics of the microwave photons from the bolometric second-order correlation measurement. This technique is poised to serve in fundamental tests of quantum mechanics with microwave photons and function as a scalable readout solution for a quantum information processor.
Published: 2024

10. MMR-Mamba: Multi-Modal MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion

Author: Zou, Jing, Liu, Lanqing, Chen, Qi, Wang, Shujun, Hu, Zhanli, Xing, Xiaohan, and Qin, Jing
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Multi-modal MRI offers valuable complementary information for diagnosis and treatment; however, its utility is limited by prolonged scanning times. To accelerate the acquisition process, a practical approach is to reconstruct images of the target modality, which requires longer scanning times, from under-sampled k-space data using the fully-sampled reference modality with shorter scanning times as guidance. The primary challenge of this task is comprehensively and efficiently integrating complementary information from different modalities to achieve high-quality reconstruction. Existing methods struggle with this: 1) convolution-based models fail to capture long-range dependencies; 2) transformer-based models, while excelling in global feature modeling, struggle with quadratic computational complexity. To address this, we propose MMR-Mamba, a novel framework that thoroughly and efficiently integrates multi-modal features for MRI reconstruction, leveraging Mamba's capability to capture long-range dependencies with linear computational complexity while exploiting global properties of the Fourier domain. Specifically, we first design a Target modality-guided Cross Mamba (TCM) module in the spatial domain, which maximally restores the target modality information by selectively incorporating relevant information from the reference modality. Then, we introduce a Selective Frequency Fusion (SFF) module to efficiently integrate global information in the Fourier domain and recover high-frequency signals for the reconstruction of structural details. Furthermore, we devise an Adaptive Spatial-Frequency Fusion (ASFF) module, which mutually enhances the spatial and frequency domains by supplementing less informative channels from one domain with corresponding channels from the other., Comment: 10 pages, 5 figure
Published: 2024

11. On Zero-Error Capacity of Graphs with One Edge

Author: Cao, Qi, Chen, Qi, and Bai, Baoming
Subjects: Computer Science - Information Theory
Abstract: In this paper, we study the zero-error capacity of channels with memory, which are represented by graphs. We provide a method to construct code for any graph with one edge, thereby determining a lower bound on its zero-error capacity. Moreover, this code can achieve zero-error capacity when the symbols in a vertex with degree one are the same. We further apply our method to the one-edge graphs representing the binary channels with two memories. There are 28 possible graphs, which can be organized into 11 categories based on their symmetries. The code constructed by our method is proved to achieve the zero-error capacity for all these graphs except for the two graphs in Case 11.
Published: 2024

12. Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance

Author: Li, Haoling, Zhang, Xin, Liu, Xiao, Gong, Yeyun, Wang, Yifan, Yang, Yujiu, Chen, Qi, and Cheng, Peng
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Large language models (LLMs) have revolutionized lots of fields of research. Although it is well-known that fine-tuning is essential for enhancing the capabilities of LLMs, existing research suggests that there is potential redundancy in the fine-tuning process and therefore proposes to update only a subset of parameters. However, these methods fail to leverage the task-specific information to identify important parameters during training. Based on the insight that gradients inherently contain information on task-specific data, we propose Gradient-Mask Tuning (GMT), a method that selectively updates parameters during training based on their gradient information. Specifically, we compute the absolute values of the gradients and apply masking to those with relatively smaller magnitudes. Our empirical results across various tasks demonstrate that GMT not only outperforms traditional fine-tuning methods but also elevates the upper limits of LLM performance. Further analysis indicates that GMT exhibits insensitivity to mask ratio and possesses computational efficiency comparable to vanilla SFT.
Published: 2024

13. Meta-Learning Neural Procedural Biases

Author: Raymond, Christian, Chen, Qi, Xue, Bing, and Zhang, Mengjie
Subjects: Computer Science - Machine Learning
Abstract: The goal of few-shot learning is to generalize and achieve high performance on new unseen learning tasks, where each task has only a limited number of examples available. Gradient-based meta-learning attempts to address this challenging task by learning how to learn new tasks by embedding inductive biases informed by prior learning experiences into the components of the learning algorithm. In this work, we build upon prior research and propose Neural Procedural Bias Meta-Learning (NPBML), a novel framework designed to meta-learn task-adaptive procedural biases. Our approach aims to consolidate recent advancements in meta-learned initializations, optimizers, and loss functions by learning them simultaneously and making them adapt to each individual task to maximize the strength of the learned inductive biases. This imbues each learning task with a unique set of procedural biases which is specifically designed and selected to attain strong learning performance in only a few gradient steps. The experimental results show that by meta-learning the procedural biases of a neural network, we can induce strong inductive biases towards a distribution of learning tasks, enabling robust learning performance across many well-established few-shot learning benchmarks.
Published: 2024

14. ControlLoc: Physical-World Hijacking Attack on Visual Perception in Autonomous Driving

Author: Ma, Chen, Wang, Ningfei, Zhao, Zhengyu, Wang, Qian, Chen, Qi Alfred, and Shen, Chao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent research in adversarial machine learning has focused on visual perception in Autonomous Driving (AD) and has shown that printed adversarial patches can attack object detectors. However, it is important to note that AD visual perception encompasses more than just object detection; it also includes Multiple Object Tracking (MOT). MOT enhances the robustness by compensating for object detection errors and requiring consistent object detection results across multiple frames before influencing tracking results and driving decisions. Thus, MOT makes attacks on object detection alone less effective. To attack such robust AD visual perception, a digital hijacking attack has been proposed to cause dangerous driving scenarios. However, this attack has limited effectiveness. In this paper, we introduce a novel physical-world adversarial patch attack, ControlLoc, designed to exploit hijacking vulnerabilities in entire AD visual perception. ControlLoc utilizes a two-stage process: initially identifying the optimal location for the adversarial patch, and subsequently generating the patch that can modify the perceived location and shape of objects with the optimal location. Extensive evaluations demonstrate the superior performance of ControlLoc, achieving an impressive average attack success rate of around 98.1% across various AD visual perceptions and datasets, which is four times greater effectiveness than the existing hijacking attack. The effectiveness of ControlLoc is further validated in physical-world conditions, including real vehicle tests under different conditions such as outdoor light conditions with an average attack success rate of 77.5%. AD system-level impact assessments are also included, such as vehicle collision, using industry-grade AD systems and production-grade AD simulators with an average vehicle collision rate and unnecessary emergency stop rate of 81.3%.
Published: 2024

15. SlowPerception: Physical-World Latency Attack against Visual Perception in Autonomous Driving

Author: Ma, Chen, Wang, Ningfei, Zhao, Zhengyu, Chen, Qi Alfred, and Shen, Chao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security
Abstract: Autonomous Driving (AD) systems critically depend on visual perception for real-time object detection and multiple object tracking (MOT) to ensure safe driving. However, high latency in these visual perception components can lead to significant safety risks, such as vehicle collisions. While previous research has extensively explored latency attacks within the digital realm, translating these methods effectively to the physical world presents challenges. For instance, existing attacks rely on perturbations that are unrealistic or impractical for AD, such as adversarial perturbations affecting areas like the sky, or requiring large patches that obscure most of a camera's view, thus making them impossible to be conducted effectively in the real world. In this paper, we introduce SlowPerception, the first physical-world latency attack against AD perception, via generating projector-based universal perturbations. SlowPerception strategically creates numerous phantom objects on various surfaces in the environment, significantly increasing the computational load of Non-Maximum Suppression (NMS) and MOT, thereby inducing substantial latency. Our SlowPerception achieves second-level latency in physical-world settings, with an average latency of 2.5 seconds across different AD perception systems, scenarios, and hardware configurations. This performance significantly outperforms existing state-of-the-art latency attacks. Additionally, we conduct AD system-level impact assessments, such as vehicle collisions, using industry-grade AD systems with production-grade AD simulators with a 97% average rate. We hope that our analyses can inspire further research in this critical domain, enhancing the robustness of AD systems against emerging vulnerabilities., Comment: This submission was made without all contributors' consent
Published: 2024

16. Intersectional Unfairness Discovery

Author: Xu, Gezheng, Chen, Qi, Ling, Charles, Wang, Boyu, and Shui, Changjian
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society
Abstract: AI systems have been shown to produce unfair results for certain subgroups of population, highlighting the need to understand bias on certain sensitive attributes. Current research often falls short, primarily focusing on the subgroups characterized by a single sensitive attribute, while neglecting the nature of intersectional fairness of multiple sensitive attributes. This paper focuses on its one fundamental aspect by discovering diverse high-bias subgroups under intersectional sensitive attributes. Specifically, we propose a Bias-Guided Generative Network (BGGN). By treating each bias value as a reward, BGGN efficiently generates high-bias intersectional sensitive attributes. Experiments on real-world text and image datasets demonstrate a diverse and efficient discovery of BGGN. To further evaluate the generated unseen but possible unfair intersectional sensitive attributes, we formulate them as prompts and use modern generative AI to produce new texts and images. The results of frequently generating biased data provides new insights of discovering potential unfairness in popular modern generative AI systems. Warning: This paper contains generative examples that are offensive in nature., Comment: ICML-2024 camera-ready
Published: 2024

17. Exploring Backdoor Attacks against Large Language Model-based Decision Making

Author: Jiao, Ruochen, Xie, Shaoyuan, Yue, Justin, Sato, Takami, Wang, Lixu, Wang, Yixuan, Chen, Qi Alfred, and Zhu, Qi
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) have shown significant promise in decision-making tasks when fine-tuned on specific applications, leveraging their inherent common sense and reasoning abilities learned from vast amounts of data. However, these systems are exposed to substantial safety and security risks during the fine-tuning phase. In this work, we propose the first comprehensive framework for Backdoor Attacks against LLM-enabled Decision-making systems (BALD), systematically exploring how such attacks can be introduced during the fine-tuning phase across various channels. Specifically, we propose three attack mechanisms and corresponding backdoor optimization methods to attack different components in the LLM-based decision-making pipeline: word injection, scenario manipulation, and knowledge injection. Word injection embeds trigger words directly into the query prompt. Scenario manipulation occurs in the physical environment, where a high-level backdoor semantic scenario triggers the attack. Knowledge injection conducts backdoor attacks on retrieval augmented generation (RAG)-based LLM systems, strategically injecting word triggers into poisoned knowledge while ensuring the information remains factually accurate for stealthiness. We conduct extensive experiments with three popular LLMs (GPT-3.5, LLaMA2, PaLM2), using two datasets (HighwayEnv, nuScenes), and demonstrate the effectiveness and stealthiness of our backdoor triggers and mechanisms. Finally, we critically assess the strengths and weaknesses of our proposed approaches, highlight the inherent vulnerabilities of LLMs in decision-making tasks, and evaluate potential defenses to safeguard LLM-based decision making systems., Comment: 27 pages, including main paper, references, and appendix
Published: 2024

18. MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

Author: Chen, Qi, Geng, Xiubo, Rosset, Corby, Buractaon, Carolyn, Lu, Jingwen, Shen, Tao, Zhou, Kun, Xiong, Chenyan, Gong, Yeyun, Bennett, Paul, Craswell, Nick, Xie, Xing, Yang, Fan, Tower, Bryan, Rao, Nikhil, Dong, Anlei, Jiang, Wenqi, Liu, Zheng, Li, Mingqin, Liu, Chuanjie, Li, Zengzhong, Majumder, Rangan, Neville, Jennifer, Oakley, Andy, Risvik, Knut Magne, Simhadri, Harsha Vardhan, Varma, Manik, Wang, Yujing, Yang, Linjun, Yang, Mao, and Zhang, Ce
Subjects: Computer Science - Information Retrieval
Abstract: Recent breakthroughs in large models have highlighted the critical significance of data scale, labels and modals. In this paper, we introduce MS MARCO Web Search, the first large-scale information-rich web dataset, featuring millions of real clicked query-document labels. This dataset closely mimics real-world web document and query distribution, provides rich information for various kinds of downstream tasks and encourages research in various areas, such as generic end-to-end neural indexer models, generic embedding models, and next generation information access system with large language models. MS MARCO Web Search offers a retrieval benchmark with three web retrieval challenge tasks that demand innovations in both machine learning and information retrieval system research domains. As the first dataset that meets large, real and rich data requirements, MS MARCO Web Search paves the way for future advancements in AI and system research. MS MARCO Web Search dataset is available at: https://github.com/microsoft/MS-MARCO-Web-Search., Comment: 10 pages, 6 figures, for associated dataset, see http://github.com/microsoft/MS-MARCO-Web-Search
Published: 2024
Full Text: View/download PDF

19. Sharpness-Aware Minimization for Evolutionary Feature Construction in Regression

Author: Zhang, Hengzhe, Chen, Qi, Xue, Bing, Banzhaf, Wolfgang, and Zhang, Mengjie
Subjects: Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing
Abstract: In recent years, genetic programming (GP)-based evolutionary feature construction has achieved significant success. However, a primary challenge with evolutionary feature construction is its tendency to overfit the training data, resulting in poor generalization on unseen data. In this research, we draw inspiration from PAC-Bayesian theory and propose using sharpness-aware minimization in function space to discover symbolic features that exhibit robust performance within a smooth loss landscape in the semantic space. By optimizing sharpness in conjunction with cross-validation loss, as well as designing a sharpness reduction layer, the proposed method effectively mitigates the overfitting problem of GP, especially when dealing with a limited number of instances or in the presence of label noise. Experimental results on 58 real-world regression datasets show that our approach outperforms standard GP as well as six state-of-the-art complexity measurement methods for GP in controlling overfitting. Furthermore, the ensemble version of GP with sharpness-aware minimization demonstrates superior performance compared to nine fine-tuned machine learning and symbolic regression algorithms, including XGBoost and LightGBM., Comment: Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence
Published: 2024

20. AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding

Author: Liu, Tao, Chen, Feilong, Fan, Shuai, Du, Chenpeng, Chen, Qi, Chen, Xie, and Yu, Kai
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: The paper introduces AniTalker, an innovative framework designed to generate lifelike talking faces from a single portrait. Unlike existing models that primarily focus on verbal cues such as lip synchronization and fail to capture the complex dynamics of facial expressions and nonverbal cues, AniTalker employs a universal motion representation. This innovative representation effectively captures a wide range of facial dynamics, including subtle expressions and head movements. AniTalker enhances motion depiction through two self-supervised learning strategies: the first involves reconstructing target video frames from source frames within the same identity to learn subtle motion representations, and the second develops an identity encoder using metric learning while actively minimizing mutual information between the identity and motion encoders. This approach ensures that the motion representation is dynamic and devoid of identity-specific details, significantly reducing the need for labeled data. Additionally, the integration of a diffusion model with a variance adapter allows for the generation of diverse and controllable facial animations. This method not only demonstrates AniTalker's capability to create detailed and realistic facial movements but also underscores its potential in crafting dynamic avatars for real-world applications. Synthetic results can be viewed at https://github.com/X-LANCE/AniTalker., Comment: 14 pages, 7 figures
Published: 2024

21. Particle production from gluon-nucleon interactions in relativistic heavy ion collisions

Author: Fu, Yong-Ping, Huang, Fei-Jie, and Chen, Qi-Hui
Subjects: Nuclear Theory, High Energy Physics - Phenomenology
Abstract: We propose a particle production mechanism analogous to the particle photoproduction processes, arising from the gluon-nucleon interactions in relativistic heavy ion collisions. The comparison is made on the effect of the gluon-nucleon interactions on the photon production in Au+Au collisions at $\sqrt{s_{NN}}=$200 GeV and Pb+Pb collisions at $\sqrt{s_{NN}}=$2.76 TeV. The numerical results indicate that as the collision energy increases, the contribution of gluon-nucleon interactions becomes more prominent., Comment: 6 pages,7 figures
Published: 2024

22. STT: Stateful Tracking with Transformers for Autonomous Driving

Author: Jing, Longlong, Yu, Ruichi, Chen, Xu, Zhao, Zhengli, Sheng, Shiwei, Graber, Colin, Chen, Qi, Li, Qinru, Wu, Shangxuan, Deng, Han, Lee, Sangjin, Sweeney, Chris, He, Qiurui, Hung, Wei-Chih, He, Tong, Zhou, Xingyi, Moussavi, Farshid, Guo, Zijian, Zhou, Yin, Tan, Mingxing, Yang, Weilong, and Li, Congcong
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Tracking objects in three-dimensional space is critical for autonomous driving. To ensure safety while driving, the tracker must be able to reliably track objects across frames and accurately estimate their states such as velocity and acceleration in the present. Existing works frequently focus on the association task while either neglecting the model performance on state estimation or deploying complex heuristics to predict the states. In this paper, we propose STT, a Stateful Tracking model built with Transformers, that can consistently track objects in the scenes while also predicting their states accurately. STT consumes rich appearance, geometry, and motion signals through long term history of detections and is jointly optimized for both data association and state estimation tasks. Since the standard tracking metrics like MOTA and MOTP do not capture the combined performance of the two tasks in the wider spectrum of object states, we extend them with new metrics called S-MOTA and MOTPS that address this limitation. STT achieves competitive real-time performance on the Waymo Open Dataset., Comment: ICRA 2024
Published: 2024

23. Symmetric Entropy Regions of Degrees Six and Seven

Author: Li, Zihan, Liu, Shaocheng, and Chen, Qi
Subjects: Computer Science - Information Theory
Abstract: In this paper, we classify all G-symmetric almost entropic regions according to their Shannon-tightness, that is, whether they can be fully characterized by Shannon-type inequalities, where G is a permutation group of degree 6 or 7.
Published: 2024

24. GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting

Author: Chen, Bo, Hu, Shoukang, Chen, Qi, Du, Chenpeng, Yi, Ran, Qian, Yanmin, and Chen, Xie
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present GStalker, a 3D audio-driven talking face generation model with Gaussian Splatting for both fast training (40 minutes) and real-time rendering (125 FPS) with a 3$\sim$5 minute video for training material, in comparison with previous 2D and 3D NeRF-based modeling frameworks which require hours of training and seconds of rendering per frame. Specifically, GSTalker learns an audio-driven Gaussian deformation field to translate and transform 3D Gaussians to synchronize with audio information, in which multi-resolution hashing grid-based tri-plane and temporal smooth module are incorporated to learn accurate deformation for fine-grained facial details. In addition, a pose-conditioned deformation field is designed to model the stabilized torso. To enable efficient optimization of the condition Gaussian deformation field, we initialize 3D Gaussians by learning a coarse static Gaussian representation. Extensive experiments in person-specific videos with audio tracks validate that GSTalker can generate high-fidelity and audio-lips synchronized results with fast training and real-time rendering speed.
Published: 2024

25. EC-SLAM: Real-time Dense Neural RGB-D SLAM System with Effectively Constrained Global Bundle Adjustment

Author: Li, Guanghao, Chen, Qi, Yan, YuXiang, and Pu, Jian
Subjects: Computer Science - Robotics
Abstract: We introduce EC-SLAM, a real-time dense RGB-D simultaneous localization and mapping (SLAM) system utilizing Neural Radiance Fields (NeRF). Although recent NeRF-based SLAM systems have demonstrated encouraging outcomes, they have yet to completely leverage NeRF's capability to constrain pose optimization. By employing an effectively constrained global bundle adjustment (BA) strategy, our system makes use of NeRF's implicit loop closure correction capability. This improves the tracking accuracy by reinforcing the constraints on the keyframes that are most pertinent to the optimized current frame. In addition, by implementing a feature-based and uniform sampling strategy that minimizes the number of ineffective constraint points for pose optimization, we mitigate the effects of random sampling in NeRF. EC-SLAM utilizes sparse parametric encodings and the truncated signed distance field (TSDF) to represent the map in order to facilitate efficient fusion, resulting in reduced model parameters and accelerated convergence velocity. A comprehensive evaluation conducted on the Replica, ScanNet, and TUM datasets showcases cutting-edge performance, including enhanced reconstruction accuracy resulting from precise pose estimation, 21 Hz run time, and tracking precision improvements of up to 50\%. The source code is available at https://github.com/Lightingooo/EC-SLAM.
Published: 2024

26. DKE-Research at SemEval-2024 Task 2: Incorporating Data Augmentation with Generative Models and Biomedical Knowledge to Enhance Inference Robustness

Author: Wang, Yuqi, Wang, Zeqiang, Wang, Wei, Chen, Qi, Huang, Kaizhu, Nguyen, Anh, and De, Suparna
Subjects: Computer Science - Computation and Language
Abstract: Safe and reliable natural language inference is critical for extracting insights from clinical trial reports but poses challenges due to biases in large pre-trained language models. This paper presents a novel data augmentation technique to improve model robustness for biomedical natural language inference in clinical trials. By generating synthetic examples through semantic perturbations and domain-specific vocabulary replacement and adding a new task for numerical and quantitative reasoning, we introduce greater diversity and reduce shortcut learning. Our approach, combined with multi-task learning and the DeBERTa architecture, achieved significant performance gains on the NLI4CT 2024 benchmark compared to the original language models. Ablation studies validate the contribution of each augmentation method in improving robustness. Our best-performing model ranked 12th in terms of faithfulness and 8th in terms of consistency, respectively, out of the 32 participants.
Published: 2024

27. G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images

Author: Huang, Zixiong, Chen, Qi, Sun, Libo, Yang, Yifan, Wang, Naizhou, Tan, Mingkui, and Wu, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Novel view synthesis aims to generate new view images of a given view image collection. Recent attempts address this problem relying on 3D geometry priors (e.g., shapes, sizes, and positions) learned from multi-view images. However, such methods encounter the following limitations: 1) they require a set of multi-view images as training data for a specific scene (e.g., face, car or chair), which is often unavailable in many real-world scenarios; 2) they fail to extract the geometry priors from single-view images due to the lack of multi-view supervision. In this paper, we propose a Geometry-enhanced NeRF (G-NeRF), which seeks to enhance the geometry priors by a geometry-guided multi-view synthesis approach, followed by a depth-aware training. In the synthesis process, inspired that existing 3D GAN models can unconditionally synthesize high-fidelity multi-view images, we seek to adopt off-the-shelf 3D GAN models, such as EG3D, as a free source to provide geometry priors through synthesizing multi-view data. Simultaneously, to further improve the geometry quality of the synthetic data, we introduce a truncation method to effectively sample latent codes within 3D GAN models. To tackle the absence of multi-view supervision for single-view images, we design the depth-aware training approach, incorporating a depth-aware discriminator to guide geometry priors through depth maps. Experiments demonstrate the effectiveness of our method in terms of both qualitative and quantitative results., Comment: CVPR 2024 Accepted Paper
Published: 2024

28. PairAug: What Can Augmented Image-Text Pairs Do for Radiology?

Author: Xie, Yutong, Chen, Qi, Wang, Sinuo, To, Minh-Son, Lee, Iris, Khoo, Ee Win, Hendy, Kerolos, Koh, Daniel, Xia, Yong, and Wu, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Current vision-language pre-training (VLP) methodologies predominantly depend on paired image-text datasets, a resource that is challenging to acquire in radiology due to privacy considerations and labelling complexities. Data augmentation provides a practical solution to overcome the issue of data scarcity, however, most augmentation methods exhibit a limited focus, prioritising either image or text augmentation exclusively. Acknowledging this limitation, our objective is to devise a framework capable of concurrently augmenting medical image and text data. We design a Pairwise Augmentation (PairAug) approach that contains an Inter-patient Augmentation (InterAug) branch and an Intra-patient Augmentation (IntraAug) branch. Specifically, the InterAug branch of our approach generates radiology images using synthesised yet plausible reports derived from a Large Language Model (LLM). The generated pairs can be considered a collection of new patient cases since they are artificially created and may not exist in the original dataset. In contrast, the IntraAug branch uses newly generated reports to manipulate images. This process allows us to create new paired data for each individual with diverse medical conditions. Our extensive experiments on various downstream tasks covering medical image classification zero-shot and fine-tuning analysis demonstrate that our PairAug, concurrently expanding both image and text data, substantially outperforms image-/text-only expansion baselines and advanced medical VLP baselines. Our code is released at \url{https://github.com/YtongXie/PairAug}., Comment: Accepted to CVPR2024
Published: 2024

29. A dataset of primary nasopharyngeal carcinoma MRI with multi-modalities segmentation

Author: Li, Yin, Chen, Qi, Wang, Kai, Li, Meige, Si, Liping, Guo, Yingwei, Xiong, Yu, Wang, Qixing, Qin, Yang, Xu, Ling, van der Smagt, Patrick, Tang, Jun, and Chen, Nutan
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Multi-modality magnetic resonance imaging data with various sequences facilitate the early diagnosis, tumor segmentation, and disease staging in the management of nasopharyngeal carcinoma (NPC). The lack of publicly available, comprehensive datasets limits advancements in diagnosis, treatment planning, and the development of machine learning algorithms for NPC. Addressing this critical need, we introduce the first comprehensive NPC MRI dataset, encompassing MR axial imaging of 277 primary NPC patients. This dataset includes T1-weighted, T2-weighted, and contrast-enhanced T1-weighted sequences, totaling 831 scans. In addition to the corresponding clinical data, manually annotated and labeled segmentations by experienced radiologists offer high-quality data resources from untreated primary NPC.
Published: 2024

30. How Risky Are Cyber Security Threats Against Autonomous Vehicles?

Author: Chakraborty, Trishna and Chen, Qi Alfred
Abstract: To operate safely, autonomous vehicles (AVs) rely on external sensors such as cameras, light detection and ranging (LiDAR) technology, and radar. These sensors pair with machine learning-based perception modules that interpret the surrounding environment and enable the AV to act accordingly. Perception modules are the “eyes and ears” of the vehicle and are vulnerable to cybersecurity attacks. The most critical and practical threats, however, arise from physical attacks that do not require access to the AV’s internal systems. The risks of these types of attacks are still unknown. To advance the field in this area, we conducted the first ever quantitative risk assessment for physical adversarial attacks on AVs. First, we identified relevant attack vectors, or types of cyber security attacks, targeting AV perception modules. Next, we conducted an in-depth analysis of the stages of an attack. Finally, we used these exercises to identify risk metrics and perform a subsequent computation of risk scores for different attack vectors. Through this process, we were able to quantitatively rank the real-life risks posed by different attack vectors identified in existing research. This analysis provides a framework for comprehensive risk analysis to ensure the safety of AVs on our roadways.
Published: 2024

31. Risk Assessment for Security Threats and Vulnerabilities of Autonomous Vehicles

Author: Chakraborty, Trishna and Chen, Qi Alfred, PhD
Subjects: Autonomous vehicles, machine learning, cybersecurity, risk assessment, visual texture recognition
Abstract: Autonomous vehicles (AVs) heavily rely on machine learning-based perception models to accurately interpret their surroundings. However, these crucial perception components are vulnerable to a range of malicious attacks. Even though individual attacks can be highly successful, the actual security risks such attacks can pose to our daily life are unclear. Various factors, such as lack of stealthiness, cost-effectiveness, and ease of deployment, can deter potential attackers from employing certain attacks, thereby reducing the actual risk. This research report presents the first quantitative risk assessment for physical adversarial attacks on AVs. The specific focus is on attacks on AV’s perception components due to their highly critical function and representation in existing research. The report defines the daily-life risk as the likelihood that a given type of attack will be employed in real life and the authors develop a problem-specific risk scoring system and accompanying metrics. They perform an initial evaluation of the proposed risk assessment method for all the reported attacks on AVs from 2017 to 2023. They quantitatively rank the daily-life risks posed by each of eight different categories of attacks s and find three attacks with the highest risks: 2D printed images, 2D patches, and coated camouflage stickers, which deserve more focused attention for potential future mitigation strategy development and policy making.
Published: 2024

32. Pulsed Field Ablation Index-Guided Ablation for Lesion Formation: Impact of Contact Force and Number of Applications in the Ventricular Model.

Author: Di Biase, Luigi, Marazzato, Jacopo, Govari, Assaf, Altman, Andreas, Beeckler, Christopher, Keyes, Joe, Sharma, Tushar, Grupposo, Vito, Zou, Fengwei, Sugawara, Masafumi, Ikeda, Atsushi, Raissi, Farshad, Bhardwaj, Rahul, Lee, Mark, Banker, Rajesh, Mohanty, Sanghamitra, Natale, Andrea, Chen, Qi, Parikh, Paras, Zhang, Xiaodong, Nakagawa, Hiroshi, and Hsu, Jonathan
Subjects: catheter ablation, electroporation, heart ventricles, swine, Swine, Animals, Catheter Ablation, Heart Ventricles, Catheters, Equipment Design
Abstract: BACKGROUND: The effect of contact force (CF) on lesion formation is not clear during pulsed field ablation (PFA). The aim of this study was to evaluate the impact of CF, PFA, and their interplay through the PFA index (PF index) formula on the ventricular lesion size in swine. METHODS: PFA was delivered through the CF-sensing OMNYPULSE catheter. Predefined PFA applications (×3, ×6, ×9, and ×12) were delivered maintaining low (5-25 g), high (26-50 g), and very high (51-80 g) CFs. First, PFA lesions were evaluated on necropsy in 11 swine to investigate the impact of CF/PFA-and their integration in the PF index equation-on lesion size (study characterization). Then, 3 different PF index thresholds-300, 450, and 600-were tested in 6 swine to appraise the PF index accuracy to predict the ventricular lesion depth (study validation). RESULTS: In the study characterization data set, 111 PFA lesions were analyzed. CF was 32±17 g. The average lesion depth and width were 3.5±1.2 and 12.0±3.5 mm, respectively. More than CF and PFA dose alone, it was their combined effect to impact lesion depth through an asymptotically increasing relationship. Likewise, not only was the PF index related to lesion depth in the study validation data set (r2=0.66; P
Published: 2024

33. Fast and Efficient Local Search for Genetic Programming Based Loss Function Learning

Author: Raymond, Christian, Chen, Qi, Xue, Bing, and Zhang, Mengjie
Subjects: Computer Science - Neural and Evolutionary Computing, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: In this paper, we develop upon the topic of loss function learning, an emergent meta-learning paradigm that aims to learn loss functions that significantly improve the performance of the models trained under them. Specifically, we propose a new meta-learning framework for task and model-agnostic loss function learning via a hybrid search approach. The framework first uses genetic programming to find a set of symbolic loss functions. Second, the set of learned loss functions is subsequently parameterized and optimized via unrolled differentiation. The versatility and performance of the proposed framework are empirically validated on a diverse set of supervised learning tasks. Results show that the learned loss functions bring improved convergence, sample efficiency, and inference performance on tabulated, computer vision, and natural language processing problems, using a variety of task-specific neural network architectures., Comment: arXiv admin note: substantial text overlap with arXiv:2209.08907
Published: 2024

34. Towards Generalizable Tumor Synthesis

Author: Chen, Qi, Chen, Xiaoxi, Song, Haorui, Xiong, Zhiwei, Yuille, Alan, Wei, Chen, and Zhou, Zongwei
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Tumor synthesis enables the creation of artificial tumors in medical images, facilitating the training of AI models for tumor detection and segmentation. However, success in tumor synthesis hinges on creating visually realistic tumors that are generalizable across multiple organs and, furthermore, the resulting AI models being capable of detecting real tumors in images sourced from different domains (e.g., hospitals). This paper made a progressive stride toward generalizable tumor synthesis by leveraging a critical observation: early-stage tumors (< 2cm) tend to have similar imaging characteristics in computed tomography (CT), whether they originate in the liver, pancreas, or kidneys. We have ascertained that generative AI models, e.g., Diffusion Models, can create realistic tumors generalized to a range of organs even when trained on a limited number of tumor examples from only one organ. Moreover, we have shown that AI models trained on these synthetic tumors can be generalized to detect and segment real tumors from CT volumes, encompassing a broad spectrum of patient demographics, imaging protocols, and healthcare facilities., Comment: The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR 2024)
Published: 2024

35. Understanding the Weakness of Large Language Model Agents within a Complex Android Environment

Author: Xing, Mingzhe, Zhang, Rongkai, Xue, Hui, Chen, Qi, Yang, Fan, and Xiao, Zhen
Subjects: Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Software Engineering
Abstract: Large language models (LLMs) have empowered intelligent agents to execute intricate tasks within domain-specific software such as browsers and games. However, when applied to general-purpose software systems like operating systems, LLM agents face three primary challenges. Firstly, the action space is vast and dynamic, posing difficulties for LLM agents to maintain an up-to-date understanding and deliver accurate responses. Secondly, real-world tasks often require inter-application cooperation}, demanding farsighted planning from LLM agents. Thirdly, agents need to identify optimal solutions aligning with user constraints, such as security concerns and preferences. These challenges motivate AndroidArena, an environment and benchmark designed to evaluate LLM agents on a modern operating system. To address high-cost of manpower, we design a scalable and semi-automated method to construct the benchmark. In the task evaluation, AndroidArena incorporates accurate and adaptive metrics to address the issue of non-unique solutions. Our findings reveal that even state-of-the-art LLM agents struggle in cross-APP scenarios and adhering to specific constraints. Additionally, we identify a lack of four key capabilities, i.e., understanding, reasoning, exploration, and reflection, as primary reasons for the failure of LLM agents. Furthermore, we provide empirical analysis on the failure of reflection, and improve the success rate by 27% with our proposed exploration strategy. This work is the first to present valuable insights in understanding fine-grained weakness of LLM agents, and offers a path forward for future research in this area. Environment, benchmark, and evaluation code for AndroidArena are released at https://github.com/AndroidArenaAgent/AndroidArena.
Published: 2024

36. Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale

Author: Shu, Yangyang, Cao, Xiaofeng, Chen, Qi, Zhang, Bowen, Zhou, Ziqin, Hengel, Anton van den, and Liu, Lingqiao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data. The primary difficulty in this task is that the model's predictions may be inaccurate, and using these inaccurate predictions for model adaptation can lead to misleading results. To address this issue, this paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis. By consolidating these hypothesis rationales, we identify the most likely correct hypotheses, which we then use as a pseudo-labeled set to support a semi-supervised learning procedure for model adaptation. To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning. Extensive experimental results demonstrate that our approach achieves state-of-the-art performance in the SFUDA task and can be easily integrated into existing approaches to improve their performance. The codes are available at \url{https://github.com/GANPerf/HCPR}.
Published: 2024

37. Security-Sensitive Task Offloading in Integrated Satellite-Terrestrial Networks

Author: Lan, Wenjun, Chen, Kongyang, Cao, Jiannong, Li, Yikai, Li, Ning, Chen, Qi, and Sahni, Yuvraj
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Cryptography and Security, Computer Science - Networking and Internet Architecture
Abstract: With the rapid development of sixth-generation (6G) communication technology, global communication networks are moving towards the goal of comprehensive and seamless coverage. In particular, low earth orbit (LEO) satellites have become a critical component of satellite communication networks. The emergence of LEO satellites has brought about new computational resources known as the \textit{LEO satellite edge}, enabling ground users (GU) to offload computing tasks to the resource-rich LEO satellite edge. However, existing LEO satellite computational offloading solutions primarily focus on optimizing system performance, neglecting the potential issue of malicious satellite attacks during task offloading. In this paper, we propose the deployment of LEO satellite edge in an integrated satellite-terrestrial networks (ISTN) structure to support \textit{security-sensitive computing task offloading}. We model the task allocation and offloading order problem as a joint optimization problem to minimize task offloading delay, energy consumption, and the number of attacks while satisfying reliability constraints. To achieve this objective, we model the task offloading process as a Markov decision process (MDP) and propose a security-sensitive task offloading strategy optimization algorithm based on proximal policy optimization (PPO). Experimental results demonstrate that our algorithm significantly outperforms other benchmark methods in terms of performance.
Published: 2024

38. Prospects for Joint Detection of Gravitational Waves with Counterpart Gamma-Ray Bursts Detected by the HADAR Experiment

Author: Hu, Pei-Jin, Chen, Qi-Ling, Chen, Tian-Lu, Kang, Ming-Ming, Guo, Yi-Qing, Luo-Bu, Dan-Zeng, Feng, You-Liang, Gao, Qi, Gou, Quan-Bu, Hu, Hong-Bo, Li, Hai-Jin, Liu, Cheng, Liu, Mao-Yuan, Liu, Wei, Qian, Xiang-Li, Qiao, Bing-Qiang, Su, Jing-Jing, Sun, Hui-Ying, Wang, Xu, Wang, Zhen, Xin, Guang-Guang, Yang, Chao-Wen, Yao, Yu-Hua, Yuan, Qiang, and Zhang, Yi
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: The detection of GW170817/GRB170817A implied the strong association between short gamma-ray bursts (SGRBs) and binary neutron star (BNS) mergers which produce gravitational waves (GWs). More evidence is needed to confirm the association and reveal the physical processes of BNS mergers. The upcoming High Altitude Detection of Astronomical Radiation (HADAR) experiment, excelling in a wide field of view (FOV) and a large effective area above tens of GeV, is a hope for the prompt detection of very-high-energy (VHE; > 10 GeV) SGRBs. The aim of this paper is to simulate and analyse GW/SGRB joint detections by future GW detector networks in synergy with HADAR, including the second generation LIGO, Virgo and KAGRA and the third generation ET and CE. We provide a brief introduction of the HADAR experiment for SGRB simulations and its expected SGRB detections. For GW simulations, we adopt a phenomenological model to describe GWs produced by BNS mergers and introduce the signal-noise ratios (SNRs) as detector responses. Following a theoretical analysis we compute the redshift-dependent efficiency functions of GW detector networks. We then construct the simulation of GW detection by Monte Carlo sampling. We compare the simulated results of LIGO-Virgo O2 and O3 runs with their actual detections as a check. The combination of GW and SGRB models is then discussed for joint detection, including parameter correlations, triggered SNRs and efficiency skymaps. The estimated joint detection rates are 0.09-2.52 per year for LHVK network with HADAR under different possible configurations, and approximately 0.27-7.89 per year for ET+CE network with HADAR.
Published: 2024

39. Towards Automated Driving Violation Cause Analysis in Scenario-Based Testing for Autonomous Driving Systems

Author: Wan, Ziwen, Huai, Yuqi, Chen, Yuntianyi, Garcia, Joshua, and Chen, Qi Alfred
Subjects: Computer Science - Software Engineering
Abstract: The rapid advancement of Autonomous Vehicles (AVs), exemplified by companies like Waymo and Cruise offering 24/7 paid taxi services, highlights the paramount importance of ensuring AVs' compliance with various policies, such as safety regulations, traffic rules, and mission directives. Despite significant progress in the development of Autonomous Driving System (ADS) testing tools, there has been a notable absence of research on attributing the causes of driving violations. Counterfactual causality analysis has emerged as a promising approach for identifying the root cause of program failures. While it has demonstrated effectiveness in pinpointing error-inducing inputs, its direct application to the AV context to determine which computation result, generated by which component, serves as the root cause poses a considerable challenge. A key obstacle lies in our inability to straightforwardly eliminate the influence of a specific internal message to establish the causal relationship between the output of each component and a system-level driving violation. In this work, we propose a novel driving violation cause analysis (DVCA) tool. We design idealized component substitutes to enable counterfactual analysis of ADS components by leveraging the unique opportunity provided by the simulation. We evaluate our tool on a benchmark with real bugs and injected faults. The results show that our tool can achieve perfect component-level attribution accuracy (100%) and almost (>98%) perfect message-level accuracy. Our tool can reduce the debugging scope from hundreds of complicated interdependent messages to one single computation result generated by one component.
Published: 2024

40. Invisible Reflections: Leveraging Infrared Laser Reflections to Target Traffic Sign Perception

Author: Sato, Takami, Bhupathiraju, Sri Hrushikesh Varma, Clifford, Michael, Sugawara, Takeshi, Chen, Qi Alfred, and Rampazzi, Sara
Subjects: Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition
Abstract: All vehicles must follow the rules that govern traffic behavior, regardless of whether the vehicles are human-driven or Connected Autonomous Vehicles (CAVs). Road signs indicate locally active rules, such as speed limits and requirements to yield or stop. Recent research has demonstrated attacks, such as adding stickers or projected colored patches to signs, that cause CAV misinterpretation, resulting in potential safety issues. Humans can see and potentially defend against these attacks. But humans can not detect what they can not observe. We have developed an effective physical-world attack that leverages the sensitivity of filterless image sensors and the properties of Infrared Laser Reflections (ILRs), which are invisible to humans. The attack is designed to affect CAV cameras and perception, undermining traffic sign recognition by inducing misclassification. In this work, we formulate the threat model and requirements for an ILR-based traffic sign perception attack to succeed. We evaluate the effectiveness of the ILR attack with real-world experiments against two major traffic sign recognition architectures on four IR-sensitive cameras. Our black-box optimization methodology allows the attack to achieve up to a 100% attack success rate in indoor, static scenarios and a >80.5% attack success rate in our outdoor, moving vehicle scenarios. We find the latest state-of-the-art certifiable defense is ineffective against ILR attacks as it mis-certifies >33.5% of cases. To address this, we propose a detection strategy based on the physical properties of IR laser reflections which can detect 96% of ILR attacks., Comment: The first two authors are co-first. Accepted to NDSS '24
Published: 2024
Full Text: View/download PDF

41. Research on the Strengthening and Toughening Mechanism and Regulation Principle of 20Mn2Cr

Author: Liu, Mei-ling, Chen, Qi-wei, Tang, Heng-qiang, and Han, Lu
Published: 2024
Full Text: View/download PDF

42. Fixed-time adaptive fault-tolerant control of stochastic nonlinear systems with time-varying full-state constrains

Author: Chen, Qi and Wu, Li-Bing
Published: 2024
Full Text: View/download PDF

43. Sleep effects on memory consolidation in adolescent English vocabulary memorization: investigating links between sleep quality, memory, and mood disturbance

Author: Ma, Huizhong, Li, Haoke, Chen, Chun, Xu, Naihong, Chen, Qi, and Ma, Jingni
Published: 2024
Full Text: View/download PDF

44. Impact of chemotherapy delay on long-term prognosis of laparoscopic radical surgery for locally advanced gastric cancer: a pooled analysis of four randomized controlled trials

Author: Zhong, Qing, Liu, Zhi-Yu, Shang-Guan, Zhi-Xin, Li, Yi-Fan, Li, Yi, Wu, Ju, Huang, Qiang, Li, Ping, Xie, Jian-Wei, Chen, Qi-Yue, Huang, Chang-Ming, and Zheng, Chao-Hui
Published: 2024
Full Text: View/download PDF

45. Exosome miRNA-203 promotes M1 macrophage polarization and inhibits prostate cancer tumor progression

Author: Zhang, Lian-Sheng, Chen, Qi-Chao, Zong, Hong-Tao, and Xia, Qiang
Published: 2024
Full Text: View/download PDF

46. Developing a modified textbook outcome for elderly patients with gastric cancer: a multi-center study

Author: Zhong, Qing, Zheng, Zi-Fang, Wu, Dong, Shang-Guan, Zhi-Xin, Liu, Zhi-Yu, Zheng, Lin-Yong, Lin, Jian-Xian, Chen, Qi-Yue, Wang, Jia-Bin, Xie, Jian-Wei, Lin, Mi, Lin, Wei, Zheng, Chao-Hui, Huang, Chang-Ming, and Li, Ping
Published: 2024
Full Text: View/download PDF

47. Safety assessment of coronary arteries during left bundle branch area pacing

Author: Kong, Qiling, Chen, Huolong, Hua, Juan, Xiong, Ziyi, Le, Shuyun, Liu, Jinwei, Wang, Dandan, and Chen, Qi
Published: 2024
Full Text: View/download PDF

48. Super-stealth dicing of transparent solids with nanometric precision

Author: Li, Zhen-Ze, Fan, Hua, Wang, Lei, Zhang, Xu, Zhao, Xin-Jing, Yu, Yan-Hao, Xu, Yi-Shi, Wang, Yi, Wang, Xiao-Jie, Juodkazis, Saulius, Chen, Qi-Dai, and Sun, Hong-Bo
Published: 2024
Full Text: View/download PDF

49. Multi-agent dynamic formation interception control based on rigid graph

Author: Wang, Chuanyun, Sun, Yunfei, Ma, Xiaoping, Chen, Qi, Gao, Qian, and Liu, Xiaona
Published: 2024
Full Text: View/download PDF

50. High dimensional proteomic mapping of bone marrow immune characteristics in immune thrombocytopenia

Author: Liu, Feng-Qi, Qu, Qing-Yuan, Lei, Ying, Chen, Qi, Chen, Yu-Xiu, Li, Meng-Lin, Sun, Xue-Yan, Wu, Ye-Jun, Huang, Qiu-Sha, Fu, Hai-Xia, Kong, Yuan, Li, Yue-Ying, Wang, Qian-Fei, Huang, Xiao-Jun, and Zhang, Xiao-Hui
Published: 2024
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

24,156 results on '"Chen, Qi-an"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources