105,525 results on '"WANG, LEI"'
Search Results
2. Rapid Automatic Multiple Moving Objects Detection Method Based on Feature Extraction from Images with Non-sidereal Tracking
- Author
Wang, Lei, Zhang, Xiaoming, Bai, Chunhai, Xie, Haiwen, Li, Juan, Ge, Jiayi, Wang, Jianfeng, Zeng, Xianqun, Sun, Jiantao, and Jiang, Xiaojun
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics
- Abstract
Optically observing and monitoring moving objects, both natural and artificial, is important to human space security. Non-sidereal tracking can improve the system's limiting magnitude for moving objects, which benefits the surveillance. However, images with non-sidereal tracking include complex backgrounds, as well as objects with different brightness and motion modes, posing a significant challenge for accurate multi-object detection in such images, especially in wide field of view (WFOV) telescope images. To achieve higher detection precision at a higher speed, we propose a novel object detection method, which combines source feature extraction and a neural network. First, our method extracts object features from optical images such as centroid, shape, and flux. Then it conducts a naive labeling based on those features to distinguish moving objects from stars. After balancing the labeled data, we employ it to train a neural network aimed at creating a classification model for point-like and streak-like objects. Ultimately, based on the neural network model's classification outcomes, moving objects whose motion modes are consistent with the tracked objects are detected via track association, while objects with different motion modes are detected using morphological statistics. The validation, based on space object images captured in target tracking mode with the 1-meter telescope at Nanshan, Xinjiang Astronomical Observatory, demonstrates that our method achieves 94.72% detection accuracy with a false alarm rate of merely 5.02% and a processing time of 0.66 s per frame. Consequently, our method can rapidly and accurately detect objects with different motion modes from wide-field images with non-sidereal tracking.
- Published
- 2024
3. Searching for MeV-scale Axion-like Particles and Dark Photons with PandaX-4T
- Author
PandaX Collaboration, Li, Tao, Bo, Zihao, Chen, Wei, Chen, Xun, Chen, Yunhua, Cheng, Zhaokan, Cui, Xiangyi, Fan, Yingjie, Fang, Deqing, Gao, Zhixing, Geng, Lisheng, Giboni, Karl, Guo, Xunan, Guo, Xuyuan, Guo, Zichao, Han, Chencheng, Han, Ke, He, Changda, He, Jinrong, Huang, Di, Huang, Houqi, Huang, Junting, Hou, Ruquan, Hou, Yu, Ji, Xiangdong, Ji, Xiangpan, Ju, Yonglin, Li, Chenxiang, Li, Jiafu, Li, Mingchuan, Li, Shuaijie, Li, Zhiyuan, Lin, Qing, Liu, Jianglai, Lu, Congcong, Lu, Xiaoying, Luo, Lingyin, Luo, Yunyang, Ma, Wenbo, Ma, Yugang, Mao, Yajun, Meng, Yue, Ning, Xuyang, Pang, Binyu, Qi, Ningchun, Qian, Zhicheng, Ren, Xiangxiang, Shan, Dong, Shang, Xiaofeng, Shao, Xiyuan, Shen, Guofang, Shen, Manbin, Sun, Wenliang, Tao, Yi, Wang, Anqing, Wang, Guanbo, Wang, Hao, Wang, Jiamin, Wang, Lei, Wang, Meng, Wang, Qiuhong, Wang, Shaobo, Wang, Siguang, Wang, Wei, Wang, Xiuli, Wang, Xu, Wang, Zhou, Wei, Yuehuan, Wu, Weihao, Wu, Yuan, Xiao, Mengjiao, Xiao, Xiang, Xiong, Kaizhi, Xu, Yifan, Yao, Shunyu, Yan, Binbin, Yan, Xiyu, Yang, Yong, Ye, Peihua, Yu, Chunxu, Yuan, Ying, Yuan, Zhe, Yun, Youhui, Zeng, Xinning, Zhang, Minzhen, Zhang, Peng, Zhang, Shibo, Zhang, Shu, Zhang, Tao, Zhang, Wei, Zhang, Yang, Zhang, Yingxin, Zhang, Yuanyuan, Zhao, Li, Zhou, Jifang, Zhou, Jiaxu, Zhou, Jiayi, Zhou, Ning, Zhou, Xiaopeng, Zhou, Yubo, and Zhou, Zhizhen
- Subjects
High Energy Physics - Experiment
- Abstract
Axion-like particles (ALPs) and dark photons (DPs) are viable dark matter particle candidates. We have searched for possible ALP/DP signals in the PandaX-4T liquid xenon detector using 94.8 days of data. A binned likelihood fit is constructed to search for possible mono-energetic peaks induced by the absorption processes between ALPs/DPs and atomic electrons of xenon. A detailed temporal model of decays associated with xenon isotopes is introduced to constrain the number of background events. No signal excess over background expectations is observed, and we have established the most stringent exclusion limits for most ALP/DP masses ranging from 150 keV/$c^2$ to 1 MeV/$c^2$.
- Published
- 2024
4. MoManifold: Learning to Measure 3D Human Motion via Decoupled Joint Acceleration Manifolds
- Author
Dang, Ziqiang, Fan, Tianxing, Zhao, Boming, Shen, Xujie, Wang, Lei, Zhang, Guofeng, and Cui, Zhaopeng
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Incorporating temporal information effectively is important for accurate 3D human motion estimation and generation, which have wide applications from human-computer interaction to AR/VR. In this paper, we present MoManifold, a novel human motion prior, which models plausible human motion in continuous high-dimensional motion space. Different from existing mathematical or VAE-based methods, our representation is designed based on the neural distance field, which makes human dynamics explicitly quantified to a score and thus can measure human motion plausibility. Specifically, we propose novel decoupled joint acceleration manifolds to model human dynamics from existing limited motion data. Moreover, we introduce a novel optimization method using the manifold distance as guidance, which facilitates a variety of motion-related tasks. Extensive experiments demonstrate that MoManifold outperforms existing SOTAs as a prior in several downstream tasks such as denoising real-world human mocap data, recovering human motion from partial 3D observations, mitigating jitters for SMPL-based pose estimators, and refining the results of motion in-betweening. Comment: Accepted by BMVC 2024. Supplementary material is included at the end of the main paper (12 pages, 11 figures, 5 tables)
- Published
- 2024
5. Could Bibliometrics Reveal Top Science and Technology Achievements and Researchers? The Case for Evaluatology-based Science and Technology Evaluation
- Author
Kang, Guoxin, Gao, Wanling, Wang, Lei, Luo, Chunjie, Ye, Hainan, He, Qian, Dai, Shaopeng, and Zhan, Jianfeng
- Subjects
Computer Science - Computational Engineering, Finance, and Science; Computer Science - Computers and Society
- Abstract
By utilizing statistical methods to analyze bibliographic data, bibliometrics faces inherent limitations in identifying the most significant science and technology achievements and researchers. To overcome this challenge, we present an evaluatology-based science and technology evaluation methodology. At the heart of this approach lies the concept of an extended evaluation condition (EC), encompassing eight crucial components derived from a field. We define four relationships that illustrate the connections among various achievements based on their mapped extended EC components, as well as their temporal and citation links. Within a relationship under an extended evaluation condition, evaluators can effectively compare these achievements by carefully addressing the influence of confounding variables. We establish a real-world evaluation system encompassing an entire collection of achievements, each of which is mapped to several components of an extended EC. Within a specific field like chip technology or open source, we construct a perfect evaluation model that can accurately trace the evolution and development of all achievements in terms of four relationships based on the real-world evaluation system. Building upon the foundation of the perfect evaluation model, we put forth four-round rules to eliminate non-significant achievements by utilizing four relationships. This process allows us to establish a pragmatic evaluation model that effectively captures the essential achievements, serving as a curated collection of the top N achievements within a specific field during a specific timeframe. We present a case study on the top 100 Chip achievements, which highlights its practical application and efficacy in identifying significant achievements and researchers that otherwise cannot be identified by using bibliometrics. Comment: 18 pages, 8 figures, and 2 tables
- Published
- 2024
6. UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images
- Author
Zhu, Enze, Chen, Zhan, Wang, Dingkai, Shi, Hanru, Liu, Xiaoxuan, and Wang, Lei
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Semantic segmentation of high-resolution remote sensing images is vital in downstream applications such as land-cover mapping, urban planning and disaster assessment. Existing Transformer-based methods suffer from the constraint between accuracy and efficiency, while the recently proposed Mamba is renowned for being efficient. Therefore, to overcome the dilemma, we propose UNetMamba, a UNet-like semantic segmentation model based on Mamba. It incorporates a mamba segmentation decoder (MSD) that can efficiently decode the complex information within high-resolution images, and a local supervision module (LSM), which is train-only but can significantly enhance the perception of local contents. Extensive experiments demonstrate that UNetMamba outperforms the state-of-the-art methods with mIoU increased by 0.87% on LoveDA and 0.36% on ISPRS Vaihingen, while achieving high efficiency through the lightweight design, smaller memory footprint and reduced computational cost. The source code is available at https://github.com/EnzeZhu2001/UNetMamba. Comment: 5 pages, 3 figures
- Published
- 2024
7. Spatio-Temporal Communication Compression for Distributed Prime-Dual Optimization
- Author
Ren, Zihao, Wang, Lei, Yuan, Deming, Su, Hongye, and Shi, Guodong
- Subjects
Electrical Engineering and Systems Science - Systems and Control
- Abstract
In this paper, for the problem of distributed computing, we propose a general spatio-temporal compressor and discuss its compression methods. This compressor comprehensively considers both temporal and spatial information, encompassing many existing specific compressors. We use the average consensus algorithm as a starting point and further study distributed optimization algorithms, taking the Prime-Dual algorithm as an example, in both continuous and discrete time forms. We find that under stronger additional assumptions, the spatio-temporal compressor can be directly applied to distributed computing algorithms, while its default form can also be successfully applied through observer-based differential compression methods, ensuring the linear convergence of the algorithm when the objective function is strongly convex. On this basis, we also discuss the acceleration of the algorithm, filter-based compression methods in the literature, and the addition of randomness to the spatio-temporal compressor. Finally, numerical simulations illustrate the generality of the spatio-temporal compressor, compare different compression methods, and verify the algorithm's performance in the convex objective function scenario. Comment: 21 pages. arXiv admin note: text overlap with arXiv:2408.02332
- Published
- 2024
8. Exploring New Physics with PandaX-4T Low Energy Electronic Recoil Data
- Author
PandaX Collaboration, Zeng, Xinning, Bo, Zihao, Chen, Wei, Chen, Xun, Chen, Yunhua, Cheng, Zhaokan, Cui, Xiangyi, Fan, Yingjie, Fang, Deqing, Gao, Zhixing, Geng, Lisheng, Giboni, Karl, Guo, Xunan, Guo, Xuyuan, Guo, Zichao, Han, Chencheng, Han, Ke, He, Changda, He, Jinrong, Huang, Di, Huang, Houqi, Huang, Junting, Hou, Ruquan, Hou, Yu, Ji, Xiangdong, Ji, Xiangpan, Ju, Yonglin, Li, Chenxiang, Li, Jiafu, Li, Mingchuan, Li, Shuaijie, Li, Tao, Li, Zhiyuan, Lin, Qing, Liu, Jianglai, Lu, Congcong, Lu, Xiaoying, Luo, Lingyin, Luo, Yunyang, Ma, Wenbo, Ma, Yugang, Mao, Yajun, Meng, Yue, Ning, Xuyang, Pang, Binyu, Qi, Ningchun, Qian, Zhicheng, Ren, Xiangxiang, Shan, Dong, Shang, Xiaofeng, Shao, Xiyuan, Shen, Guofang, Shen, Manbin, Sun, Wenliang, Tao, Yi, Wang, Anqing, Wang, Guanbo, Wang, Hao, Wang, Jiamin, Wang, Lei, Wang, Meng, Wang, Qiuhong, Wang, Shaobo, Wang, Siguang, Wang, Wei, Wang, Xiuli, Wang, Xu, Wang, Zhou, Wei, Yuehuan, Wu, Weihao, Wu, Yuan, Xiao, Mengjiao, Xiao, Xiang, Xiong, Kaizhi, Xu, Yifan, Yao, Shunyu, Yan, Binbin, Yan, Xiyu, Yang, Yong, Ye, Peihua, Yu, Chunxu, Yuan, Ying, Yuan, Zhe, Yun, Youhui, Zhang, Minzhen, Zhang, Peng, Zhang, Shibo, Zhang, Shu, Zhang, Tao, Zhang, Wei, Zhang, Yang, Zhang, Yingxin, Zhang, Yuanyuan, Zhao, Li, Zhou, Jifang, Zhou, Jiaxu, Zhou, Jiayi, Zhou, Ning, Zhou, Xiaopeng, Zhou, Yubo, and Zhou, Zhizhen
- Subjects
High Energy Physics - Experiment
- Abstract
New particles beyond the Standard Model of particle physics, such as axions, can be effectively searched for through their interactions with electrons. We use the large liquid xenon detector PandaX-4T to search for novel electronic recoil signals induced by solar axions, neutrinos with anomalous magnetic moment, axion-like particles, dark photons, and light fermionic dark matter. A detailed background model is established with the latest datasets with 1.54 $\rm tonne \cdot year$ exposure. No significant excess above the background has been observed, and we have obtained competitive constraints for axion couplings, neutrino magnetic moment, and fermionic dark matter interactions.
- Published
- 2024
9. MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials
- Author
Chen, Yan, Wang, Xueru, Deng, Xiaobin, Liu, Yilun, Chen, Xi, Zhang, Yunwei, Wang, Lei, and Xiao, Hang
- Subjects
Condensed Matter - Materials Science; Physics - Computational Physics
- Abstract
Inverse design of solid-state materials with desired properties represents a formidable challenge in materials science. Although recent generative models have demonstrated potential, their adoption has been hindered by limitations such as inefficiency, architectural constraints and restricted open-source availability. The representation of crystal structures using the SLICES (Simplified Line-Input Crystal-Encoding System) notation as a string of characters enables the use of state-of-the-art natural language processing models, such as Transformers, for crystal design. Drawing inspiration from the success of GPT models in generating coherent text, we trained a generative Transformer on the next-token prediction task to generate solid-state materials with targeted properties. We demonstrate MatterGPT's capability to generate de novo crystal structures with targeted single properties, including both lattice-insensitive (formation energy) and lattice-sensitive (band gap) properties. Furthermore, we extend MatterGPT to simultaneously target multiple properties, addressing the complex challenge of multi-objective inverse design of crystals. Our approach showcases high validity, uniqueness, and novelty in generated structures, as well as the ability to generate materials with properties beyond the training data distribution. This work represents a significant step forward in computational materials discovery, offering a powerful and open tool for designing materials with tailored properties for various applications in energy, electronics, and beyond. Comment: 20 pages, 6 figures
- Published
- 2024
10. Probing a light long-lived pseudo-scalar from Higgs decay via displaced taus at the LHC
- Author
Shan, Lianyou, Wang, Lei, Yang, Jin Min, and Zhu, Rui
- Subjects
High Energy Physics - Phenomenology
- Abstract
A light (GeV mass) long-lived ($c\tau$ around dozens of millimeters) CP-odd scalar can be readily predicted in new physics models. In this work we investigate the Higgs decay into such a light scalar plus a $Z$-boson and take the aligned two-Higgs-doublet model (2HDM) as an example. This light long-lived scalar, with the dominant decay to tau leptons, will fly over a distance from the production point and present a displaced vertex in the Inner Detector of a general-purpose experiment like ATLAS or CMS. In our study we focus on the LHC experiment and perform Monte Carlo simulations for the signal and backgrounds. We demonstrate some benchmark points for the aligned 2HDM and find the signal to be detectable when the luminosity is accumulated to 300 fb$^{-1}$. Our study therefore suggests an experimental search for this process at the ongoing LHC. Comment: 18 pages, 6 figures
- Published
- 2024
11. Prioritizing Modalities: Flexible Importance Scheduling in Federated Multimodal Learning
- Author
Bian, Jieming, Wang, Lei, and Xu, Jie
- Subjects
Computer Science - Machine Learning; Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Federated Learning (FL) is a distributed machine learning approach that enables devices to collaboratively train models without sharing their local data, ensuring user privacy and scalability. However, applying FL to real-world data presents challenges, particularly as most existing FL research focuses on unimodal data. Multimodal Federated Learning (MFL) has emerged to address these challenges, leveraging modality-specific encoder models to process diverse datasets. Current MFL methods often uniformly allocate computational frequencies across all modalities, which is inefficient for IoT devices with limited resources. In this paper, we propose FlexMod, a novel approach to enhance computational efficiency in MFL by adaptively allocating training resources for each modality encoder based on their importance and training requirements. We employ prototype learning to assess the quality of modality encoders, use Shapley values to quantify the importance of each modality, and adopt the Deep Deterministic Policy Gradient (DDPG) method from deep reinforcement learning to optimize the allocation of training resources. Our method prioritizes critical modalities, optimizing model performance and resource utilization. Experimental results on three real-world datasets demonstrate that our proposed method significantly improves the performance of MFL models. Comment: Submitted to IEEE TMC, under review
- Published
- 2024
12. LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration
- Author
Mo, Zhiwen, Wang, Lei, Wei, Jianyu, Zeng, Zhichen, Cao, Shijie, Ma, Lingxiao, Jing, Naifeng, Cao, Ting, Xue, Jilong, Yang, Fan, and Yang, Mao
- Subjects
Computer Science - Hardware Architecture; Computer Science - Machine Learning
- Abstract
As large language model (LLM) inference demands ever-greater resources, there is a rapidly growing trend of using low-bit weights to shrink memory usage and boost inference efficiency. However, these low-bit LLMs introduce the need for mixed-precision matrix multiplication (mpGEMM), which is a crucial yet under-explored operation that involves multiplying lower-precision weights with higher-precision activations. Unfortunately, current hardware does not natively support mpGEMM, resulting in indirect and inefficient dequantization-based implementations. To address the mpGEMM requirements in low-bit LLMs, we explored the lookup table (LUT)-based approach for mpGEMM. However, a conventional LUT implementation falls short of its potential. To fully harness the power of LUT-based mpGEMM, we introduce LUT Tensor Core, a software-hardware co-design optimized for low-bit LLM inference. Specifically, we introduce software-based operator fusion and table symmetrization techniques to optimize table precompute and table storage, respectively. Then, LUT Tensor Core proposes a hardware design featuring an elongated tiling shape to enhance table reuse and a bit-serial design to support various precision combinations in mpGEMM. Moreover, we design an end-to-end compilation stack with new instructions for LUT-based mpGEMM, enabling efficient LLM compilation and optimizations. The evaluation on low-bit LLMs (e.g., BitNet, LLAMA) shows that LUT Tensor Core achieves more than an order of magnitude improvement in both compute density and energy efficiency.
- Published
- 2024
13. A 95 GeV Higgs boson and spontaneous CP-violation at the finite temperature
- Author
Gao, Jing, Ma, Jinghong, Wang, Lei, and Xu, Haotian
- Subjects
High Energy Physics - Phenomenology
- Abstract
The ATLAS and CMS collaborations reported a diphoton excess in the invariant mass distribution around 95.4 GeV with a local significance of $3.1\sigma$. Moreover, there is another $2.3\sigma$ local excess in the $b\bar{b}$ final state at LEP in the same mass region. A plausible solution is that the Higgs sector is extended to include an additional Higgs boson with a mass of $95.4$ GeV. We study a complex singlet scalar extension of the two-Higgs-doublet model in which the 95.4 GeV Higgs is from the mixing of three CP-even Higgs fields. In addition, the extended Higgs potential can achieve spontaneous CP-violation at finite temperature and restore CP symmetry at the present temperature of the Universe. We find that the model can simultaneously explain the baryon asymmetry of the Universe and the diphoton and $b\bar{b}$ excesses around 95.4 GeV while satisfying various relevant constraints, including those from collider and electric dipole moment experiments. Comment: 28 pages, 8 figures, 1 table, add references. arXiv admin note: text overlap with arXiv:2311.02828
- Published
- 2024
14. MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents
- Author
Dai, Yanqi, Hu, Huanran, Wang, Lei, Jin, Shengjie, Chen, Xu, and Lu, Zhiwu
- Subjects
Computer Science - Artificial Intelligence
- Abstract
Recently, Role-Playing Agents (RPAs) have garnered increasing attention for their potential to deliver emotional value and facilitate sociological research. However, existing studies are primarily confined to the textual modality, unable to simulate humans' multimodal perceptual capabilities. To bridge this gap, we introduce the concept of Multimodal Role-Playing Agents (MRPAs), and propose a comprehensive framework, MMRole, for their development and evaluation, which comprises a personalized multimodal dataset and a robust evaluation method. Specifically, we construct a large-scale, high-quality dataset, MMRole-Data, consisting of 85 characters, 11K images, and 14K single or multi-turn dialogues. Additionally, we present a robust evaluation method, MMRole-Eval, encompassing eight metrics across three dimensions, where a reward model is trained to score MRPAs with the constructed ground-truth data for comparison. Moreover, we develop the first specialized MRPA, MMRole-Agent. Extensive evaluation results demonstrate the improved performance of MMRole-Agent and highlight the primary challenges in developing MRPAs, emphasizing the need for enhanced multimodal understanding and role-playing consistency. The data, code, and models will be available at https://github.com/YanqiDai/MMRole.
- Published
- 2024
15. Spatio-Temporal Communication Compression in Distributed Prime-Dual Flows
- Author
Ren, Zihao, Wang, Lei, Yuan, Deming, Su, Hongye, and Shi, Guodong
- Subjects
Electrical Engineering and Systems Science - Systems and Control
- Abstract
In this paper, we study distributed prime-dual flows for multi-agent optimization with spatio-temporal compressions. The central aim of multi-agent optimization is for a network of agents to collaboratively solve a system-level optimization problem with local objective functions and node-to-node communication by distributed algorithms. The scalability of such algorithms crucially depends on the complexity of the communication messages, and a number of communication compressors for distributed optimization have recently been proposed in the literature. First of all, we introduce a general spatio-temporal compressor characterized by the stability of the resulting dynamical system along the vector field of the compressor. We show that several important distributed optimization compressors such as the greedy sparsifier, the uniform quantizer, and the scalarizer all fall into the category of this spatio-temporal compressor. Next, we propose two distributed prime-dual flows with the spatio-temporal compressors being applied to local node states and local error states, respectively, and prove (exponential) convergence of the node trajectories to the global optimizer for (strongly) convex cost functions. Finally, a few numerical examples are presented to illustrate our theoretical results.
- Published
- 2024
16. Dark Matter Search Results from 1.54 Tonne$\cdot$Year Exposure of PandaX-4T
- Author
PandaX Collaboration, Bo, Zihao, Chen, Wei, Chen, Xun, Chen, Yunhua, Cheng, Zhaokan, Cui, Xiangyi, Fan, Yingjie, Fang, Deqing, Gao, Zhixing, Geng, Lisheng, Giboni, Karl, Guo, Xunan, Guo, Xuyuan, Guo, Zichao, Han, Chencheng, Han, Ke, He, Changda, He, Jinrong, Huang, Di, Huang, Houqi, Huang, Junting, Hou, Ruquan, Hou, Yu, Ji, Xiangdong, Ji, Xiangpan, Ju, Yonglin, Li, Chenxiang, Li, Jiafu, Li, Mingchuan, Li, Shuaijie, Li, Tao, Li, Zhiyuan, Lin, Qing, Liu, Jianglai, Lu, Congcong, Lu, Xiaoying, Luo, Lingyin, Luo, Yunyang, Ma, Wenbo, Ma, Yugang, Mao, Yajun, Meng, Yue, Ning, Xuyang, Pang, Binyu, Qi, Ningchun, Qian, Zhicheng, Ren, Xiangxiang, Shan, Dong, Shang, Xiaofeng, Shao, Xiyuan, Shen, Guofang, Shen, Manbin, Sun, Wenliang, Tao, Yi, Wang, Anqing, Wang, Guanbo, Wang, Hao, Wang, Jiamin, Wang, Lei, Wang, Meng, Wang, Qiuhong, Wang, Shaobo, Wang, Siguang, Wang, Wei, Wang, Xiuli, Wang, Xu, Wang, Zhou, Wei, Yuehuan, Wu, Weihao, Wu, Yuan, Xiao, Mengjiao, Xiao, Xiang, Xiong, Kaizhi, Xu, Yifan, Yao, Shunyu, Yan, Binbin, Yan, Xiyu, Yang, Yong, Ye, Peihua, Yu, Chunxu, Yuan, Ying, Yuan, Zhe, Yun, Youhui, Zeng, Xinning, Zhang, Minzhen, Zhang, Peng, Zhang, Shibo, Zhang, Shu, Zhang, Tao, Zhang, Wei, Zhang, Yang, Zhang, Yingxin, Zhang, Yuanyuan, Zhao, Li, Zhou, Jifang, Zhou, Jiaxu, Zhou, Jiayi, Zhou, Ning, Zhou, Xiaopeng, Zhou, Yubo, and Zhou, Zhizhen
- Subjects
High Energy Physics - Experiment
- Abstract
In this letter, we report the dark matter search results from the commissioning run and the first science run of the PandaX-4T experiment. A blind analysis is carried out on the entire data set. The data processing is improved compared to previous work, unifying the low-level signal reconstruction in a wide energy range up to 120 keV. With a total exposure of 1.54 tonne$\cdot$year, no significant excess of nuclear recoil events is found. The lowest 90% confidence level exclusion on the spin-independent cross section is $1.6 \times 10^{-47} \mathrm{cm}^2$ at a dark matter mass of 40 GeV$/c^2$. Our results represent the most stringent constraint for a dark matter mass above 100 GeV$/c^2$.
- Published
- 2024
17. RoCo: Robust Collaborative Perception By Iterative Object Matching and Pose Adjustment
- Author
Huang, Zhe, Wang, Shuo, Wang, Yongcai, Li, Wanting, Li, Deying, and Wang, Lei
- Subjects
Computer Science - Artificial Intelligence
- Abstract
Collaborative autonomous driving with multiple vehicles usually requires the data fusion from multiple modalities. To ensure effective fusion, the data from each individual modality shall maintain a reasonably high quality. However, in collaborative perception, the quality of object detection based on a modality is highly sensitive to the relative pose errors among the agents. This leads to feature misalignment and significantly reduces collaborative performance. To address this issue, we propose RoCo, a novel unsupervised framework to conduct iterative object matching and agent pose adjustment. To the best of our knowledge, our work is the first to model the pose correction problem in collaborative perception as an object matching task, which reliably associates common objects detected by different agents. On top of this, we propose a graph optimization process to adjust the agent poses by minimizing the alignment errors of the associated objects, and the object matching is re-done based on the adjusted agent poses. This process is carried out iteratively until convergence. Experimental study on both simulated and real-world datasets demonstrates that the proposed framework RoCo consistently outperforms existing relevant methods in terms of the collaborative object detection performance, and exhibits highly desired robustness when the pose information of agents contains high-level noise. Ablation studies are also provided to show the impact of its key parameters and components. The code is released at https://github.com/HuangZhe885/RoCo. Comment: ACM MM2024
- Published
- 2024
18. Long-distance distribution of telecom time-energy entanglement generated on a silicon chip
- Author
Zhao, Yuan-yuan, Yue, Fuyong, Gao, Feng, Wang, Qibing, Li, Chao, Liu, Zichen, Wang, Lei, and He, Zhixue
- Subjects
Quantum Physics; Physics - Optics
- Abstract
Entanglement distribution is a critical technique that enables numerous quantum applications. Most fiber-based long-distance experiments reported to date have utilized photon pair sources generated in bulk optical crystals, with the entanglement encoded in the polarization degree of freedom. Here, we create time-energy entanglement for photon pairs generated from an on-chip silicon ring resonator via the SFWM process and report the distribution of the entanglement over standard optical fiber with a distance >81 km. Our work paves the way for future large-scale quantum networks connecting distant quantum nodes. Comment: 8 pages, 4 figures
- Published
- 2024
19. ThinK: Thinner Key Cache by Query-Driven Pruning
- Author
Xu, Yuhui, Jie, Zhanming, Dong, Hanze, Wang, Lei, Lu, Xudong, Zhou, Aojun, Saha, Amrita, Xiong, Caiming, and Sahoo, Doyen
- Subjects
Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- Abstract
Large Language Models (LLMs) have revolutionized the field of natural language processing, achieving unprecedented performance across a variety of applications by leveraging increased model sizes and sequence lengths. However, the associated rise in computational and memory costs poses significant challenges, particularly in managing long sequences due to the quadratic complexity of the transformer attention mechanism. This paper focuses on the long-context scenario, addressing the inefficiencies in KV cache memory consumption during inference. Unlike existing approaches that optimize the memory based on the sequence lengths, we uncover that the channel dimension of the KV cache exhibits significant redundancy, characterized by unbalanced magnitude distribution and low-rank structure in attention weights. Based on these observations, we propose ThinK, a novel query-dependent KV cache pruning method designed to minimize attention weight loss while selectively pruning the least significant channels. Our approach not only maintains or enhances model accuracy but also achieves a reduction in memory costs by over 20% compared with vanilla KV cache eviction methods. Extensive evaluations on the LLaMA3 and Mistral models across various long-sequence datasets confirm the efficacy of ThinK, setting a new precedent for efficient LLM deployment without compromising performance. We also outline the potential of extending our method to value cache pruning, demonstrating ThinK's versatility and broad applicability in reducing both memory and computational overheads. Comment: 20 pages, 6 figures
- Published
- 2024
20. Distributed Adaptive Time-Varying Optimization with Global Asymptotic Convergence
- Author
Jiang, Liangze, Wu, Zheng-Guang, and Wang, Lei
- Subjects
Electrical Engineering and Systems Science - Systems and Control; Mathematics - Optimization and Control
- Abstract
In this note, we study distributed time-varying optimization for a multi-agent system. We first focus on a class of time-varying quadratic cost functions, and develop a new distributed algorithm that integrates an average estimator and an adaptive optimizer, with both bridged by a Dead Zone Algorithm. Based on a composite Lyapunov function and finite escape-time analysis, we prove the closed-loop global asymptotic convergence to the optimal solution under mild assumptions. In particular, the introduction of the estimator relaxes the requirement for the Hessians of cost functions, and the integrated design eliminates the waiting time required in the relevant literature for estimating global parameters during algorithm implementation. We then extend this result to a more general class of time-varying cost functions. Two examples are used to verify the proposed designs. Comment: 11 pages, 7 figures
- Published
- 2024
21. Text-Region Matching for Multi-Label Image Recognition with Missing Labels
- Author
-
Ma, Leilei, Xie, Hongxing, Wang, Lei, Fu, Yanping, Sun, Dengdi, and Zhao, Haifeng
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recently, large-scale visual language pre-trained (VLP) models have demonstrated impressive performance across various downstream tasks. Motivated by these advancements, pioneering efforts have emerged in multi-label image recognition with missing labels, leveraging VLP prompt-tuning technology. However, they usually cannot match text and vision features well, due to complicated semantic gaps and missing labels in a multi-label image. To tackle this challenge, we propose Text-Region Matching for optimizing Multi-Label prompt tuning, namely TRM-ML, a novel method for enhancing meaningful cross-modal matching. Compared to existing methods, we advocate exploring the information of category-aware regions rather than the entire image or pixels, which contributes to bridging the semantic gap between textual and visual representations in a one-to-one matching manner. Concurrently, we further introduce multimodal contrastive learning to narrow the semantic gap between textual and visual modalities and establish intra-class and inter-class relationships. Additionally, to deal with missing labels, we propose a multimodal category prototype that leverages intra- and inter-category semantic relationships to estimate unknown labels, facilitating pseudo-label generation. Extensive experiments on the MS-COCO, PASCAL VOC, Visual Genome, NUS-WIDE, and CUB-200-2011 benchmark datasets demonstrate that our proposed framework outperforms the state-of-the-art methods by a significant margin. Our code is available here: https://github.com/yu-gi-oh-leilei/TRM-ML., Comment: Accepted to ACM International Conference on Multimedia (ACM MM) 2024
- Published
- 2024
22. Neural Modulation Alteration to Positive and Negative Emotions in Depressed Patients: Insights from fMRI Using Positive/Negative Emotion Atlas
- Author
-
Feng, Yu, Zeng, Weiming, Xie, Yifan, Chen, Hongyu, Wang, Lei, Wang, Yingying, Yan, Hongjie, Zhang, Kaile, Tao, Ran, Siok, Wai Ting, and Wang, Nizhuan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Background: Although it has been noticed that depressed patients show differences in processing emotions, the precise neural modulation mechanisms of positive and negative emotions remain elusive. fMRI is a cutting-edge medical imaging technology renowned for its high spatial resolution and dynamic temporal information, making it particularly suitable for research into the neural dynamics of depression. Methods: To address this gap, our study first leveraged fMRI to delineate activated regions associated with positive and negative emotions in healthy individuals, resulting in the creation of a positive emotion atlas (PEA) and a negative emotion atlas (NEA). Subsequently, we examined neuroimaging changes in depressed patients using these atlases and evaluated their diagnostic performance based on machine learning. Results: Our findings demonstrate that the classification accuracy of depressed patients based on PEA and NEA exceeded 0.70, a notable improvement compared to the whole-brain atlases. Furthermore, ALFF analysis unveiled significant differences between depressed patients and healthy controls in eight functional clusters within the NEA, focusing on the left cuneus, cingulate gyrus, and superior parietal lobule. In contrast, the PEA revealed more pronounced differences across fifteen clusters, involving the right fusiform gyrus, parahippocampal gyrus, and inferior parietal lobule. Limitations: Due to the limited sample size and subtypes of depressed patients, the efficacy may need further validation in the future. Conclusions: These findings emphasize the complex interplay between emotion modulation and depression, showcasing significant alterations in both PEA and NEA among depressed patients. This research enhances our understanding of emotion modulation in depression, with implications for diagnosis and treatment evaluation.
- Published
- 2024
23. Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners
- Author
-
Gao, Yifei, Ou, Jie, Wang, Lei, Shang, Fanhua, Wu, Jaji, and Cheng, Jun
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,I.2.7 - Abstract
Large Language Models (LLMs) showcase remarkable performance and robust deductive capabilities, yet their expansive size complicates deployment and raises environmental concerns due to substantial resource consumption. The recent development of a quantization technique known as Learnable Singular-value Increment (LSI) has addressed some of these quantization challenges. Leveraging insights from LSI and our extensive research, we have developed innovative methods that enhance the performance of quantized LLMs, particularly in low-bit settings. Our methods consistently deliver state-of-the-art results across various quantization scenarios and offer deep theoretical insights into the quantization process, elucidating the potential of quantized models for widespread application., Comment: Efficient Quantization Methods for LLMs
- Published
- 2024
24. Thermocapillary migration of a self-rewetting droplet on an inclined surface: A phase-field simulation
- Author
-
Yan, He, Wang, Lei, Huang, Jiangxu, and Yu, Yuan
- Subjects
Physics - Fluid Dynamics ,Physics - Computational Physics - Abstract
In this paper, we investigated the thermocapillary migration of a self-rewetting droplet on an inclined surface using a phase-field-based lattice Boltzmann method. Unlike a normal fluid, whose surface tension decreases linearly with temperature, the self-rewetting fluid considered in the current work has a quadratic temperature dependence of surface tension with a well-defined minimum. We first explored the influence of the Marangoni number on droplet migration, and found that the droplet hardly deforms and migrates slowly when the Marangoni number is small. However, as the Marangoni number increases, the droplet begins to deform and elongate, and its migration speed increases. Subsequently, we studied the effect of surface wettability on droplet migration. The results show that the droplet migrates towards regions of higher surface energy on hydrophilic surfaces and in the opposite direction on hydrophobic surfaces. Furthermore, by varying the viscosity ratio and the inclination angle of the plate, we found that the droplet's migration speed decreases with an increase in the viscosity ratio. In particular, two vortices appear inside the droplet at a high viscosity ratio, whereas only one vortex is present at a low viscosity ratio.
- Published
- 2024
25. Unipotent radicals, one-dimensional transitive groups, and solvable factors of classical groups
- Author
-
Feng, Tao, Li, Cai Heng, Li, Conghui, Wang, Lei, Xia, Binzhou, and Zou, Hanlin
- Subjects
Mathematics - Group Theory - Abstract
By developing a tangible way to decompose unipotent radicals into irreducible submodules of Singer cycles, we achieve a classification of solvable factors of finite classical groups of Lie type. This completes previous work on factorizations of classical groups with a solvable factor. In particular, it resolves the final uncertain case in the long-standing problem of determining exact factorizations of almost simple groups. As a byproduct of the classification, we also obtain a new characterization of one-dimensional transitive groups, offering further insights into their group structure.
- Published
- 2024
26. First Measurement of Solar $^8$B Neutrino Flux through Coherent Elastic Neutrino-Nucleus Scattering in PandaX-4T
- Author
-
PandaX Collaboration, Bo, Zihao, Chen, Wei, Chen, Xun, Chen, Yunhua, Cheng, Zhaokan, Cui, Xiangyi, Fan, Yingjie, Fang, Deqing, Gao, Zhixing, Geng, Lisheng, Giboni, Karl, Guo, Xunan, Guo, Xuyuan, Guo, Zichao, Han, Chencheng, Han, Ke, He, Changda, He, Jinrong, Huang, Di, Huang, Houqi, Huang, Junting, Hou, Ruquan, Hou, Yu, Ji, Xiangdong, Ji, Xiangpan, Ju, Yonglin, Li, Chenxiang, Li, Jiafu, Li, Mingchuan, Li, Shuaijie, Li, Tao, Li, Zhiyuan, Lin, Qing, Liu, Jianglai, Lu, Congcong, Lu, Xiaoying, Luo, Lingyin, Luo, Yunyang, Ma, Wenbo, Ma, Yugang, Mao, Yajun, Meng, Yue, Ning, Xuyang, Pang, Binyu, Qi, Ningchun, Qian, Zhicheng, Ren, Xiangxiang, Shan, Dong, Shang, Xiaofeng, Shao, Xiyuan, Shen, Guofang, Shen, Manbin, Sun, Wenliang, Tao, Yi, Wang, Anqing, Wang, Guanbo, Wang, Hao, Wang, Jiamin, Wang, Lei, Wang, Meng, Wang, Qiuhong, Wang, Shaobo, Wang, Siguang, Wang, Wei, Wang, Xiuli, Wang, Xu, Wang, Zhou, Wei, Yuehuan, Wu, Weihao, Wu, Yuan, Xiao, Mengjiao, Xiao, Xiang, Xiong, Kaizhi, Xu, Yifan, Yao, Shunyu, Yan, Binbin, Yan, Xiyu, Yang, Yong, Ye, Peihua, Yu, Chunxu, Yuan, Ying, Yuan, Zhe, Yun, Youhui, Zeng, Xinning, Zhang, Minzhen, Zhang, Peng, Zhang, Shibo, Zhang, Shu, Zhang, Tao, Zhang, Wei, Zhang, Yang, Zhang, Yingxin, Zhang, Yuanyuan, Zhao, Li, Zhou, Jifang, Zhou, Jiaxu, Zhou, Jiayi, Zhou, Ning, Zhou, Xiaopeng, Zhou, Yubo, and Zhou, Zhizhen
- Subjects
High Energy Physics - Experiment ,Astrophysics - Solar and Stellar Astrophysics ,Nuclear Experiment - Abstract
The PandaX-4T liquid xenon detector at the China Jinping Underground Laboratory is used to measure the solar $^8$B neutrino flux by detecting neutrinos through coherent scattering with xenon nuclei. Data samples requiring the coincidence of scintillation and ionization signals (paired), as well as unpaired ionization-only signals (US2), are selected with energy threshold of approximately 1.1 keV (0.33 keV) nuclear recoil energy. Combining the commissioning run and the first science run of PandaX-4T, a total exposure of 1.25 and 1.04 tonne$\cdot$year are collected for the paired and US2, respectively. After unblinding, 3 and 332 events are observed with an expectation of 2.8$\pm$0.5 and 251$\pm$32 background events, for the paired and US2 data, respectively. A combined analysis yields a best-fit $^8$B neutrino signal of 3.5 (75) events from the paired (US2) data sample, with $\sim$37\% uncertainty, and the background-only hypothesis is disfavored at 2.64$\sigma$ significance. This gives a solar $^8$B neutrino flux of ($8.4\pm3.1$)$\times$10$^6$ cm$^{-2}$s$^{-1}$, consistent with the standard solar model prediction. This is the first indication of solar $^8$B neutrino ``fog'' in a dark matter direct detection experiment.
- Published
- 2024
27. Automated high-resolution backscattered-electron imaging at macroscopic scale
- Author
-
Lang, Zhiyuan, Zhang, Zunshuai, Wang, Lei, Liu, Yuhan, Qian, Weixiong, Zhou, Shenghua, Jiang, Ying, Zhang, Tongyi, and Yang, Jiong
- Subjects
Condensed Matter - Materials Science ,Electrical Engineering and Systems Science - Image and Video Processing - Abstract
Scanning electron microscopy (SEM) has been widely utilized in the field of materials science due to its significant advantages, such as large depth of field, wide field of view, and excellent stereoscopic imaging. However, at high magnification, the limited imaging range in SEM cannot cover all the possible inhomogeneous microstructures. In this research, we propose a novel approach for generating high-resolution SEM images across multiple scales, enabling a single image to capture physical dimensions at the centimeter level while preserving submicron-level details. We adopted SEM imaging of the AlCoCrFeNi2.1 eutectic high entropy alloy (EHEA) as an example. SEM videos and image stitching are combined to fulfill this goal, and the video-extracted low-definition (LD) images are clarified by a well-trained denoising model. Furthermore, we segment the macroscopic image of the EHEA, and the areas of various microstructures are distinguished. Combining the segmentation results and hardness experiments, we found that the hardness is positively correlated with the content of body-centered cubic (BCC) phase, negatively correlated with the lamella width, and not significantly related to the proportion of lamellar structures. Our work provides a feasible solution to generate macroscopic images based on SEM for further analysis of the correlations between the microstructures and spatial distribution, and can be widely applied to other types of microscope., Comment: 22 pages, 12 figures
- Published
- 2024
28. EMPL: A novel Efficient Meta Prompt Learning Framework for Few-shot Unsupervised Domain Adaptation
- Author
-
Yang, Wanqi, Wang, Haoran, Wang, Lei, Song, Ge, and Gao, Yang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Few-shot unsupervised domain adaptation (FS-UDA) utilizes few-shot labeled source domain data to realize effective classification in an unlabeled target domain. However, current FS-UDA methods still suffer from two issues: 1) the data from different domains cannot be effectively aligned by few-shot labeled data due to the large domain gaps, and 2) it is unstable and time-consuming to generalize to new FS-UDA tasks. To address these issues, we put forward a novel Efficient Meta Prompt Learning Framework for FS-UDA. Within this framework, we use a pre-trained CLIP model as the feature learning base model. First, we design domain-shared prompt learning vectors composed of virtual tokens, which mainly learn the meta knowledge from a large number of meta tasks to mitigate domain gaps. Secondly, we also design a task-shared prompt learning network to adaptively learn specific prompt vectors for each task, which aims to realize fast adaptation and task generalization. Thirdly, we learn a task-specific cross-domain alignment projection and a task-specific classifier with closed-form solutions for each meta task, which can efficiently adapt the model to new tasks in one step. The whole learning process is formulated as a bilevel optimization problem, and a good initialization of model parameters is learned through meta-learning. An extensive experimental study demonstrates the promising performance of our framework on benchmark datasets. Our method achieves an improvement of at least 15.4% on 5-way 1-shot and 8.7% on 5-way 5-shot, compared with the state-of-the-art methods. Also, the performance of our method on all the test tasks is more stable than that of the other methods.
- Published
- 2024
29. MHNet: Multi-view High-order Network for Diagnosing Neurodevelopmental Disorders Using Resting-state fMRI
- Author
-
Li, Yueyang, Zeng, Weiming, Dong, Wenhao, Cai, Luhui, Wang, Lei, Chen, Hongyu, Yan, Hongjie, Bian, Lingbin, and Wang, Nizhuan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Background: Deep learning models have shown promise in diagnosing neurodevelopmental disorders (NDD) like ASD and ADHD. However, many models either use graph neural networks (GNN) to construct single-level brain functional networks (BFNs) or employ spatial convolution filtering for local information extraction from rs-fMRI data, often neglecting high-order features crucial for NDD classification. Methods: We introduce a Multi-view High-order Network (MHNet) to capture hierarchical and high-order features from multi-view BFNs derived from rs-fMRI data for NDD prediction. MHNet has two branches: the Euclidean Space Features Extraction (ESFE) module and the Non-Euclidean Space Features Extraction (Non-ESFE) module, followed by a Feature Fusion-based Classification (FFC) module for NDD identification. ESFE includes a Functional Connectivity Generation (FCG) module and a High-order Convolutional Neural Network (HCNN) module to extract local and high-order features from BFNs in Euclidean space. Non-ESFE comprises a Generic Internet-like Brain Hierarchical Network Generation (G-IBHN-G) module and a High-order Graph Neural Network (HGNN) module to capture topological and high-order features in non-Euclidean space. Results: Experiments on three public datasets show that MHNet outperforms state-of-the-art methods using both AAL1 and Brainnetome Atlas templates. Extensive ablation studies confirm the superiority of MHNet and the effectiveness of using multi-view fMRI information and high-order features. Our study also offers atlas options for constructing more sophisticated hierarchical networks and explains the association between key brain regions and NDD. Conclusion: MHNet leverages multi-view feature learning from both Euclidean and non-Euclidean spaces, incorporating high-order information from BFNs to enhance NDD classification performance., Comment: 18 pages
- Published
- 2024
30. Motion meets Attention: Video Motion Prompts
- Author
-
Chen, Qixiang, Wang, Lei, Koniusz, Piotr, and Gedeon, Tom
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Videos contain rich spatio-temporal information. Traditional methods for extracting motion, used in tasks such as action recognition, often rely on visual contents rather than precise motion features. This phenomenon is referred to as 'blind motion extraction' behavior, which proves inefficient in capturing motions of interest due to a lack of motion-guided cues. Recently, attention mechanisms have enhanced many computer vision tasks by effectively highlighting salient visual areas. Inspired by this, we propose using a modified Sigmoid function with learnable slope and shift parameters as an attention mechanism to activate and modulate motion signals derived from frame differencing maps. This approach generates a sequence of attention maps that enhance the processing of motion-related video content. To ensure temporal continuity and smoothness of the attention maps, we apply pair-wise temporal attention variation regularization to remove unwanted motions (e.g., noise) while preserving important ones. We then take the Hadamard product between each pair of attention maps and the original video frames to highlight the evolving motions of interest over time. These highlighted motions, termed video motion prompts, are subsequently used as inputs to the model instead of the original video frames. We formalize this process as a motion prompt layer and incorporate the regularization term into the loss function to learn better motion prompts. This layer serves as an adapter between the model and the video data, bridging the gap between traditional 'blind motion extraction' and the extraction of relevant motions of interest., Comment: Research report
- Published
- 2024
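The motion prompt layer described in the abstract above (frame differencing, a sigmoid with slope and shift, a Hadamard product with the frames, and a temporal variation penalty) can be sketched in NumPy as follows. This is a minimal sketch, not the authors' implementation: `slope` and `shift` are fixed constants standing in for the learnable parameters, and the squared-difference form of the regularizer is an assumption.

```python
import numpy as np


def motion_prompts(frames, slope=5.0, shift=0.1):
    """Return attention maps, motion prompts, and a temporal variation penalty.

    frames: array of shape (T, H, W) with values in [0, 1].
    slope and shift stand in for the learnable sigmoid parameters (assumed values).
    """
    # Frame differencing maps, shape (T - 1, H, W).
    diffs = np.abs(frames[1:] - frames[:-1])
    # Modified sigmoid: slope sharpens the response, shift sets the threshold.
    attention = 1.0 / (1.0 + np.exp(-slope * (diffs - shift)))
    # Hadamard product of each attention map with its video frame.
    prompts = attention * frames[1:]
    # Pair-wise temporal variation of consecutive attention maps
    # (squared difference is an assumed form of the regularizer).
    if attention.shape[0] > 1:
        penalty = float(np.mean((attention[1:] - attention[:-1]) ** 2))
    else:
        penalty = 0.0
    return attention, prompts, penalty


if __name__ == "__main__":
    # A single bright pixel moving one column per frame.
    frames = np.zeros((4, 5, 5))
    for t in range(4):
        frames[t, 2, t] = 1.0
    attention, prompts, penalty = motion_prompts(frames)
    print(attention.shape, prompts.shape, penalty)
```

In a trainable version, the penalty would be added to the task loss so that the slope and shift are learned jointly with the model.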
31. Multi-level Reliable Guidance for Unpaired Multi-view Clustering
- Author
-
Xin, Like, Yang, Wanqi, Wang, Lei, and Yang, Ming
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
In this paper, we address the challenging problem of unpaired multi-view clustering (UMC), aiming to perform effective joint clustering using unpaired observed samples across multiple views. Traditional incomplete multi-view clustering (IMC) methods often depend on paired samples to capture complementary information between views. However, this strategy becomes impractical in UMC due to the absence of paired samples. Although some researchers have attempted to tackle the issue by preserving consistent cluster structures across views, they frequently neglect the confidence of these cluster structures, especially for boundary samples and uncertain cluster structures during the initial training. Therefore, we propose a method called Multi-level Reliable Guidance for UMC (MRG-UMC), which leverages multi-level clustering to aid in learning a trustworthy cluster structure across inner-view, cross-view, and common-view, respectively. Specifically, within each view, multi-level clustering fosters a trustworthy cluster structure across different levels and reduces clustering error. In cross-view learning, reliable view guidance enhances the confidence of the cluster structures in other views. Similarly, within the multi-level framework, the incorporation of a common view aids in aligning different views, thereby reducing the clustering error and uncertainty of cluster structure. Finally, as evidenced by extensive experiments, our method for UMC demonstrates significant efficiency improvements compared to 20 state-of-the-art methods.
- Published
- 2024
32. A thermodynamically consistent phase-field lattice Boltzmann method for two-phase electrohydrodynamic flows
- Author
-
Xiong, Fang, Wang, Lei, Huang, Jiangxu, and Luo, Kang
- Subjects
Physics - Fluid Dynamics - Abstract
In this work, we aim to develop a phase-field-based lattice Boltzmann (LB) method for simulating two-phase electrohydrodynamic (EHD) flows, which allows for different properties (densities, viscosities, conductivity and permittivity) of each phase while maintaining thermodynamic consistency. To this end, we first present a theoretical analysis of two-phase EHD flows using Onsager's variational principle, which is an extension of Rayleigh's principle of least energy dissipation and, naturally, guarantees thermodynamic consistency. It shows that the governing equations of the model include the hydrodynamic equations, the Cahn-Hilliard equation coupled with an additional electrical effect, and the full Poisson-Nernst-Planck electrokinetic equations. After that, a coupled LB scheme is constructed for simulating two-phase EHD flows. In particular, in order to handle two-phase EHD flows with a relatively larger electric permittivity ratio, we also introduce a delicately designed discrete forcing term into the LB equation for the electrostatic field. Moreover, some numerical examples, including two-phase EHD flows in planar layers and charge diffusion of a Gaussian bell, are simulated with the developed LB method. It is shown that our numerical scheme has a second-order convergence rate in space in predicting electric potential and charge density. Finally, we used the current model to simulate the deformation of a droplet under an electric field and the dynamics of droplet detachment in reversed electrowetting. Our numerical results align well with the theoretical solutions and the available experimental/numerical data, demonstrating that the proposed method is feasible for simulating two-phase EHD flows.
- Published
- 2024
33. Biospecific Chemistry for Covalent Linking of Biomacromolecules.
- Author
-
Cao, Li and Wang, Lei
- Subjects
Humans ,Animals ,Metal-Organic Frameworks ,Bioreactors ,Proteins ,Amino Acids ,Hydrophobic and Hydrophilic Interactions ,RNA ,Cross-Linking Reagents ,Protein Engineering - Abstract
Interactions among biomacromolecules, predominantly noncovalent, underpin biological processes. However, recent advancements in biospecific chemistry have enabled the creation of specific covalent bonds between biomolecules, both in vitro and in vivo. This Review traces the evolution of biospecific chemistry in proteins, emphasizing the role of genetically encoded latent bioreactive amino acids. These amino acids react selectively with adjacent natural groups through proximity-enabled bioreactivity, enabling targeted covalent linkages. We explore various latent bioreactive amino acids designed to target different protein residues, ribonucleic acids, and carbohydrates. We then discuss how these novel covalent linkages can drive challenging protein properties and capture transient protein-protein and protein-RNA interactions in vivo. Additionally, we examine the application of covalent peptides as potential therapeutic agents and site-specific conjugates for native antibodies, highlighting their capacity to form stable linkages with target molecules. A significant focus is placed on proximity-enabled reactive therapeutics (PERx), a pioneering technology in covalent protein therapeutics. We detail its wide-ranging applications in immunotherapy, viral neutralization, and targeted radionuclide therapy. Finally, we present a perspective on the existing challenges within biospecific chemistry and discuss the potential avenues for future exploration and advancement in this rapidly evolving field.
- Published
- 2024
34. Machine-Type Communication Waveforms: An Exploration of New Dimensions
- Author
-
Wang, Michael, Wang, Lei, and You, Xiaohu
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
This paper derives a generalized class of waveforms with an application to machine-type communication (MTC) while studying its underlying structural characteristics in relation to conventional modulation waveforms. First, a canonical waveform of frequency-error tolerance is identified for a unified preamble and traffic signal design, ideal for MTC use as a composite waveform, commonly known as a transmission burst. It is shown that the most widely used modulation schemes for mIoT traffic signals, e.g., FSK and LoRa modulation, are simply subsets of the canonical waveform. The intrinsic characteristics and degrees of freedom the waveform offers are then explored. Most significantly, a new waveform dimension is uncovered and exploited as additional degrees of freedom for satisfying the MTC requirements, i.e., energy and resource efficiency and robustness. The corresponding benefits are evaluated analytically and numerically in AWGN, frequency-flat, and selective channels. We demonstrate that neither FSK nor LoRa can fully address the mIoT requirements since neither fully exploits the degrees of freedom from the perspective of the generalized waveform class. Finally, a solution is devised to optimize energy and resource efficiency under various deployment environments and practical constraints while maintaining the low-complexity property., Comment: 17 pages, 9 figures
- Published
- 2024
35. YuLan: An Open-source Large Language Model
- Author
-
Zhu, Yutao, Zhou, Kun, Mao, Kelong, Chen, Wentong, Sun, Yiding, Chen, Zhipeng, Cao, Qian, Wu, Yihan, Chen, Yushuo, Wang, Feng, Zhang, Lei, Li, Junyi, Wang, Xiaolei, Wang, Lei, Zhang, Beichen, Dong, Zican, Cheng, Xiaoxue, Chen, Yuhan, Tang, Xinyu, Hou, Yupeng, Ren, Qiangqiang, Pang, Xincheng, Xie, Shufang, Zhao, Wayne Xin, Dou, Zhicheng, Mao, Jiaxin, Lin, Yankai, Song, Ruihua, Xu, Jun, Chen, Xu, Yan, Rui, Wei, Zhewei, Hu, Di, Huang, Wenbing, Gao, Ze-Feng, Chen, Yueguo, Lu, Weizheng, and Wen, Ji-Rong
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of training details hinders further research and development. This paper presents the development of YuLan, a series of open-source LLMs with 12 billion parameters. The base model of YuLan is pre-trained on approximately 1.7T tokens derived from a diverse corpus, including massive English, Chinese, and multilingual texts. We design a three-stage pre-training method to enhance YuLan's overall capabilities. Subsequent phases of training incorporate instruction-tuning and human alignment, employing a substantial volume of high-quality synthesized data. To facilitate the learning of complex and long-tail knowledge, we devise a curriculum-learning framework across these stages, which helps LLMs learn knowledge in an easy-to-hard manner. YuLan's training was finished in January 2024, and the model has achieved performance on par with state-of-the-art LLMs across various English and Chinese benchmarks. This paper outlines a comprehensive technical roadmap for developing LLMs from scratch. Our model and codes are available at https://github.com/RUC-GSAI/YuLan-Chat.
- Published
- 2024
36. T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
- Author
-
Wei, Jianyu, Cao, Shijie, Cao, Ting, Ma, Lingxiao, Wang, Lei, Zhang, Yanyong, and Yang, Mao
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Artificial Intelligence - Abstract
The deployment of Large Language Models (LLMs) on edge devices is increasingly important to enhance on-device intelligence. Weight quantization is crucial for reducing the memory footprint of LLMs on devices. However, low-bit LLMs necessitate mixed precision matrix multiplication (mpGEMM) of low precision weights and high precision activations during inference. Existing systems, lacking native support for mpGEMM, resort to dequantizing weights for high precision computation. Such an indirect approach can lead to significant inference overhead. In this paper, we introduce T-MAC, an innovative lookup table (LUT)-based method designed for efficient low-bit LLM (i.e., weight-quantized LLM) inference on CPUs. T-MAC directly supports mpGEMM without dequantization, while simultaneously eliminating multiplications and reducing the additions required. Specifically, T-MAC transforms the traditional data-type-centric multiplication to bit-wise table lookup, and enables a unified and scalable mpGEMM solution. Our LUT-based kernels scale linearly with the weight bit-width. Evaluated on low-bit Llama and BitNet models, T-MAC demonstrates up to a 4x increase in throughput and a 70% reduction in energy consumption compared to llama.cpp. For BitNet-b1.58-3B, T-MAC delivers a token generation throughput of 30 tokens/s with a single core and 71 tokens/s with eight cores on M2-Ultra, and 11 tokens/s on lower-end devices like Raspberry Pi 5, which significantly exceeds the adult average reading speed. T-MAC, with its LUT-based computing paradigm, paves the way for the practical deployment of low-bit LLMs on resource-constrained edge devices without compromising computational efficiency. The system is open-sourced at https://github.com/microsoft/T-MAC.
- Published
- 2024
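The bit-wise table lookup described in the T-MAC abstract above can be illustrated with a toy matrix-vector product in NumPy. This is a sketch under assumptions, not the T-MAC kernel: the names `build_tables` and `lut_matvec`, the group size of 4, and the unsigned weight encoding are all chosen here for illustration. Weights are split into bit-planes; for each group of four activations, a 16-entry table of partial sums is precomputed once, and every weight row then only looks up and accumulates table entries.

```python
import numpy as np

GROUP = 4  # activations per lookup table (assumed group size)


def build_tables(activations):
    """Precompute, per group, the sum of activations for each 4-bit pattern."""
    groups = activations.reshape(-1, GROUP)                 # (K / GROUP, GROUP)
    # patterns[p, j] is bit j of the integer p, for p in 0..15.
    patterns = (np.arange(2 ** GROUP)[:, None] >> np.arange(GROUP)) & 1
    # Entry [g, p] is the sum of the activations in group g selected by p.
    # The 0/1 matmul is used only for brevity; it amounts to additions.
    return groups @ patterns.T                              # (K / GROUP, 16)


def lut_matvec(weights, activations, bits):
    """Compute weights @ activations for unsigned integer weights < 2**bits.

    weights: (M, K) integer array, K divisible by GROUP.
    """
    tables = build_tables(activations)
    n_groups = tables.shape[0]
    group_ids = np.arange(n_groups)
    place = 1 << np.arange(GROUP)
    out = np.zeros(weights.shape[0])
    for b in range(bits):                                   # one pass per bit-plane
        plane = (weights >> b) & 1                          # (M, K) of 0/1
        # Pack each group of 4 weight bits into a table index.
        idx = plane.reshape(weights.shape[0], n_groups, GROUP) @ place
        out += (1 << b) * tables[group_ids, idx].sum(axis=1)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.integers(0, 4, size=(3, 8))               # 2-bit weights
    activations = rng.normal(size=8)
    print(np.allclose(lut_matvec(weights, activations, bits=2), weights @ activations))
```

The loop runs once per bit-plane, which is why the cost in this sketch grows linearly with the weight bit-width, matching the scaling the abstract reports.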
37. Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other
- Author
-
Gao, Yifei, Ou, Jie, Wang, Lei, Xiao, Yuting, Xiang, Zhiyuan, Dai, Ruiting, and Cheng, Jun
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,F.2.3 - Abstract
Emergent Large Language Models (LLMs) distinguish themselves from traditional language models through their extraordinary performance and powerful deduction capacity. However, the computational and storage costs of these LLMs are staggering, which has made quantization a trending topic. To address the accuracy decay caused by quantization, two streams of work in post-training quantization stand out. One uses other weights to compensate for existing quantization error, while the other transfers the quantization difficulty to other parts of the model. Combining the merits of both, we introduce Learnable Singular value Increment (LSI) as an advanced solution. LSI uses Singular Value Decomposition to extract the singular values of the weights and makes them learnable to help weights compensate each other conditioned on activation. Incorporating LSI with existing techniques, we achieve state-of-the-art performance in diverse quantization settings, whether in weight-only, weight-activation, or extremely low-bit scenarios. By unleashing the potential of LSI, efficient finetuning on quantized models is no longer a prohibitive problem., Comment: Efficient quantization method
- Published
- 2024
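The core operation named in the LSI abstract above, making the singular values of a weight matrix adjustable before quantization, can be sketched in NumPy. This is a minimal sketch under assumptions: the helper names, the symmetric round-to-nearest quantizer, and the 4-bit setting are illustrative and not taken from the paper, and the training loop that would learn `delta` from activations is omitted.

```python
import numpy as np


def apply_singular_increment(weight, delta):
    """Rebuild the weight as U diag(s + delta) V^T; delta is the learnable part."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return (u * (s + delta)) @ vt


def quantize(weight, bits=4):
    """Symmetric per-tensor round-to-nearest quantization (assumed quantizer)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(weight).max() / qmax
    return np.clip(np.round(weight / scale), -qmax, qmax) * scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weight = rng.normal(size=(6, 4))
    inputs = rng.normal(size=(32, 6))
    delta = np.zeros(4)  # starting point; a training loop would update this
    quantized = quantize(apply_singular_increment(weight, delta))
    error = np.linalg.norm(inputs @ weight - inputs @ quantized)
    print("output error with delta = 0:", error)
```

With `delta` set to zero the reconstruction equals the original weight, so learning can start from the unmodified model and adjust only a handful of values per matrix.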
38. You Only Acquire Sparse-channel (YOAS): A Unified Framework for Dense-channel EEG Generation
- Author
-
Chen, Hongyu, Zeng, Weiming, Cai, Luhui, Wang, Lei, Lu, Jia, Li, Yueyang, Yan, Hongjie, Siok, Wai Ting, and Wang, Nizhuan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
High-precision acquisition of dense-channel electroencephalogram (EEG) signals is often impeded by the costliness and lack of portability of equipment. In contrast, generating dense-channel EEG signals effectively from sparse channels shows promise and economic viability. However, sparse-channel EEG poses challenges such as reduced spatial resolution, information loss, signal mixing, and heightened susceptibility to noise and interference. To address these challenges, we first theoretically formulate the dense-channel EEG generation problem as optimizing a set of cross-channel EEG signal generation problems. Then, we propose the YOAS framework for generating dense-channel data from sparse-channel EEG signals. YOAS consists of four sequential stages: Data Preparation, Data Preprocessing, Biased-EEG Generation, and Synthetic EEG Generation. Data Preparation and Preprocessing carefully consider the distribution of EEG electrodes and the low signal-to-noise ratio problem of EEG signals. Biased-EEG Generation includes the sub-modules BiasEEGGanFormer and BiasEEGDiffFormer, which facilitate long-term feature extraction with attention and generate signals by combining electrode position alignment with a diffusion model, respectively. Synthetic EEG Generation synthesizes the final signals, employing a deduction paradigm for multi-channel EEG generation. Extensive experiments confirmed YOAS's feasibility, efficiency, and theoretical validity, even remarkably enhancing data discernibility. This breakthrough in dense-channel EEG signal generation from sparse-channel data opens new avenues for exploration in EEG signal processing and application.
- Published
- 2024
39. Unsupervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion
- Author
-
Liu, Runze, Zhu, Dongchen, Zhang, Guanghui, Xu, Yue, Shi, Wenjun, Zhang, Xiaolin, Wang, Lei, and Li, Jiamao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Unsupervised monocular depth estimation has received widespread attention because of its capability to train without ground truth. In real-world scenarios, the images may be blurry or noisy due to the influence of weather conditions and inherent limitations of the camera. Therefore, it is particularly important to develop a robust depth estimation model. Benefiting from the training strategies of generative networks, generative-based methods often exhibit enhanced robustness. In light of this, we employ a well-converging diffusion model among generative networks for unsupervised monocular depth estimation. Additionally, we propose a hierarchical feature-guided denoising module. This module significantly enriches the model's capacity for learning and interpreting depth distribution by fully leveraging image features to guide the denoising process. Furthermore, we explore the implicit depth within reprojection and design an implicit depth consistency loss. This loss function serves to enhance the performance of the model and ensure the scale consistency of depth within a video sequence. We conduct experiments on the KITTI, Make3D, and our self-collected SIMIT datasets. The results indicate that our approach stands out among generative-based models, while also showcasing remarkable robustness.
- Published
- 2024
40. STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning
- Author
-
Shao, Wei, Kang, Yufan, Peng, Ziyan, Xiao, Xiao, Wang, Lei, Yang, Yuhui, and Salim, Flora D
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence - Abstract
Accuracy and timeliness are often conflicting goals in prediction tasks. Premature predictions may yield a higher rate of false alarms, whereas delaying predictions to gather more information can render them too late to be useful. In applications such as wildfires, crimes, and traffic jams, timely forecasting is vital for safeguarding human life and property. Consequently, finding a balance between accuracy and timeliness is crucial. In this paper, we propose an early spatio-temporal forecasting model based on Multi-Objective reinforcement learning that can either implement an optimal policy given a preference or infer the preference based on a small number of samples. The model addresses two primary challenges: 1) enhancing the accuracy of early forecasting and 2) providing the optimal policy for determining the most suitable prediction time for each area. Our method demonstrates superior performance on three large-scale real-world datasets, surpassing existing methods in early spatio-temporal forecasting tasks. Comment: Accepted paper in KDD 2024
- Published
- 2024
41. A novel measurement method for SiPM external crosstalk probability at low temperature
- Author
-
Li, Guanda, Wang, Lei, Sun, Xilei, Liu, Fang, Guo, Cong, Zhao, Kangkang, Tian, Lei, Yu, Zeyuan, Hou, Zhilong, Li, Chi, Lei, Yu, Wang, Bin, and Zhou, Rongbin
- Subjects
Physics - Instrumentation and Detectors, Nuclear Experiment - Abstract
Silicon photomultipliers (SiPMs) are being considered as potential replacements for conventional photomultiplier tubes (PMTs). However, a significant disadvantage of SiPMs is crosstalk (CT), wherein photons propagate through other pixels, resulting in secondary avalanches. CT can be categorized into internal crosstalk and external crosstalk based on whether the secondary avalanche occurs within the same SiPM or a different one. Numerous methods exist for quantitatively estimating the percentage of internal crosstalk (iCT). However, external crosstalk (eCT) has not been extensively studied. This article presents a novel measurement method for the probability of emitting an external crosstalk photon during a single pixel avalanche, using a setup involving two identical SiPMs facing each other, and without the need for complex optical designs. The entire apparatus is enclosed within a stainless steel chamber, functioning as a light-tight enclosure, and maintained at liquid nitrogen temperature. The experimental setup incorporates two Sensl J-60035 SiPM chips along with two 0.5-inch Hamamatsu Photonics (HPK) VUV4 S13370-6050CN SiPM arrays. The findings show a linear relationship between the probability of emitting an external crosstalk photon and the SiPM overvoltage for both SiPM samples. Surprisingly, this novel measurement method also provides measurements of the SiPM photon detection efficiency (PDE) for eCT photons at low temperature.
- Published
- 2024
42. NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models
- Author
-
Xu, Ancheng, Tan, Minghuan, Wang, Lei, Yang, Min, and Xu, Ruifeng
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence - Abstract
Numeral systems and units of measurement are two conjoined topics in activities of human beings and have mutual effects with the languages expressing them. Currently, the evaluation of Large Language Models (LLMs) often involves mathematical reasoning, yet little attention is given to how minor changes in numbers or units can drastically alter the complexity of problems and the performance of LLMs. In this paper, we scrutinize existing LLMs on processing of numerals and units of measurement by constructing datasets with perturbations. We first anatomize the reasoning of math word problems into different sub-procedures like numeral conversions from language to numbers and measurement conversions based on units. Then we further annotate math word problems from ancient Chinese arithmetic works which are challenging in numerals and units of measurement. Experiments on perturbed datasets demonstrate that LLMs still encounter difficulties in handling numeral and measurement conversions. Comment: Findings of ACL 2024
- Published
- 2024
43. Stoner instabilities and Ising excitonic states in twisted transition metal dichalcogenides
- Author
-
Ghiotto, Augusto, Wei, LingNan, Song, Larry, Zang, Jiawei, Tazi, Aya Batoul, Ostrom, Daniel, Watanabe, Kenji, Taniguchi, Takashi, Hone, James C., Rhodes, Daniel A., Millis, Andrew J., Dean, Cory R., Wang, Lei, and Pasupathy, Abhay N.
- Subjects
Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
Moiré transition metal dichalcogenide (TMD) systems provide a tunable platform for studying electron-correlation driven quantum phases. Such phases have so far been found at rational fillings of the moiré superlattice, and it is believed that lattice commensurability plays a key role in their stability. In this work, we show via magnetotransport measurements on twisted WSe2 that new correlated electronic phases can exist away from commensurability. The first phase is an antiferromagnetic metal that is driven by proximity to the van Hove singularity. The second is a re-entrant magnetic field-driven insulator. This insulator is formed from a small and equal density of electrons and holes with opposite spin projections - an Ising excitonic insulator.
- Published
- 2024
44. Bridging the Gap Between Domain-specific Frameworks and Multiple Hardware Devices
- Author
-
Wen, Xu, Gao, Wanling, Wang, Lei, and Zhan, Jianfeng
- Subjects
Computer Science - Software Engineering - Abstract
The rapid development of domain-specific frameworks has presented us with a significant challenge: The current approach of implementing solutions on a case-by-case basis incurs a theoretical complexity of O(M*N), thereby increasing the cost of porting applications to different hardware platforms. To address these challenges, we propose a systematic methodology that effectively bridges the gap between domain-specific frameworks and multiple hardware devices, reducing porting complexity to O(M+N). The approach utilizes multi-layer abstractions. Different domain-specific abstractions are employed to represent applications from various domains. These abstractions are then transformed into a unified abstraction, which is subsequently translated into combinations of primitive operators. Finally, these operators are mapped to multiple hardware platforms. The implemented unified framework supports deep learning, classical machine learning, and data analysis across X86, ARM, RISC-V, IoT devices, and GPU. It outperforms existing solutions like scikit-learn, hummingbird, Spark, and pandas, achieving impressive speedups: 1.1x to 3.83x on X86 servers, 1.06x to 4.33x on ARM IoT devices, 1.25x to 3.72x on RISC-V IoT devices, and 1.93x on GPU. The source code is available at https://github.com/BenchCouncil/bridger.git. Comment: 15 pages, 8 figures
- Published
- 2024
45. Information Leakage from Embedding in Large Language Models
- Author
-
Wan, Zhipeng, Cheng, Anda, Wang, Yinggui, and Wang, Lei
- Subjects
Computer Science - Machine Learning, Computer Science - Cryptography and Security - Abstract
The widespread adoption of large language models (LLMs) has raised concerns regarding data privacy. This study aims to investigate the potential for privacy invasion through input reconstruction attacks, in which a malicious model provider could potentially recover user inputs from embeddings. We first propose two base methods to reconstruct original texts from a model's hidden states. We find that these two methods are effective in attacking the embeddings from shallow layers, but their effectiveness decreases when attacking embeddings from deeper layers. To address this issue, we then present Embed Parrot, a Transformer-based method, to reconstruct input from embeddings in deep layers. Our analysis reveals that Embed Parrot effectively reconstructs original inputs from the hidden states of ChatGLM-6B and Llama2-7B, showcasing stable performance across various token lengths and data distributions. To mitigate the risk of privacy breaches, we introduce a defense mechanism to deter exploitation of the embedding reconstruction process. Our findings emphasize the importance of safeguarding user privacy in distributed learning systems and contribute valuable insights to enhance the security protocols within such environments.
- Published
- 2024
46. Strongly coupled magneto-exciton condensates in large-angle twisted double bilayer graphene
- Author
-
Li, Qingxin, Chen, Yiwei, Wei, LingNan, Chen, Hong, Huang, Yan, Zhu, Yujian, Zhu, Wang, An, Dongdong, Song, Junwei, Gan, Qikang, Zhang, Qi, Watanabe, Kenji, Taniguchi, Takashi, Shi, Xiaoyang, Novoselov, Kostya S., Wang, Rui, Yu, Geliang, and Wang, Lei
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
Excitons, the bosonic quasiparticles emerging from Coulomb interaction between electrons and holes, will undergo a Bose-Einstein condensation (BEC) and transition into a superfluid state with global phase coherence at low temperatures. An important platform to study such excitonic physics is built on double-layer quantum wells or recent two-dimensional material heterostructures, where two parallel planes of electrons and holes are separated by a thin insulating layer. Lowering this separation distance ($d$) enhances the interlayer Coulomb interaction and thereby strengthens the exciton binding energy. However, an exceedingly small $d$ will lead to undesired interlayer tunneling, which results in the annihilation of excitons. Here, we report the observation of a sequence of robust exciton condensates (ECs) in double bilayer graphenes twisted to $\sim 10^\circ$ with no insulating mid-layer. The large momentum mismatch between the two graphene layers strongly suppresses the interlayer tunneling, allowing us to reach the separation lower limit $\sim$ 0.334 nm and investigate ECs in the extreme coupling regime. Carrying out transport measurements on the bulk and edge of the devices, we find incompressible states corresponding to ECs when both layers are half-filled in the $N=0$ and $N=1$ Landau levels (LLs). The comparison between these ECs and theoretical calculations suggests that the low-energy charged excitation of ECs can be a meron-antimeron or particle-hole pair, depending on both LL index and carrier type. Our results establish large-angle twisted bilayers as an experimental platform with extreme coupling strength for studying quantum bosonic phases and their low-energy excitations.
- Published
- 2024
47. A thermodynamic and analytical description on the quantitative phase-field model with enhanced interface diffusivity
- Author
-
Li, Yue, Wang, Lei, Li, Junjie, Wang, Jincheng, and Wang, Zhijun
- Subjects
Condensed Matter - Materials Science - Abstract
Based on the idea of maintaining physical diffuse interface kinetics, enhancing interfacial diffusivity has recently provided a new direction for quantitative phase-field simulation at microstructural length and time scale. Establishing a general relationship between interface diffusivity and width is vital to facilitate the practical application. However, it is still limited by time-consuming numerical corrections, and its relationship with non-dilute thermodynamic properties still needs to be revealed. In this study, we present a new thermodynamic and analytical method for determining interfacial diffusivity enhancement. Unlike previous numerical corrections of partition coefficients and interface temperature, this new method aims to keep several thermodynamic quantities unchanged after enlarging the interface width. These essential quantities are theoretically proven to be diffusion potential jump across the diffuse interface and free energy dissipation by trans-interface diffusion. Since no dilute approximation has been employed in model derivation, the present method is available for binary alloys with arbitrary thermodynamic properties and can be easily extended to describe multicomponent systems. Therefore, the present method is expected to advance the recent quantitative phase-field framework and facilitate its practical applications. Comment: 22 pages, 10+1 figures
- Published
- 2024
48. SignLLM: Sign Languages Production Large Language Models
- Author
-
Fang, Sen, Wang, Lei, Zheng, Ce, Tian, Yapeng, and Chen, Chen
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language - Abstract
In this paper, we introduce the first comprehensive multilingual sign language dataset named Prompt2Sign, which builds from public data including American Sign Language (ASL) and seven others. Our dataset transforms a vast array of videos into a streamlined, model-friendly format, optimized for training with translation models like seq2seq and text2text. Building on this new dataset, we propose SignLLM, the first multilingual Sign Language Production (SLP) model, which includes two novel multilingual SLP modes that allow for the generation of sign language gestures from input text or prompt. Both of the modes can use a new loss and a module based on reinforcement learning, which accelerates the training by enhancing the model's capability to autonomously sample high-quality data. We present benchmark results of SignLLM, which demonstrate that our model achieves state-of-the-art performance on SLP tasks across eight sign languages. Comment: 33 pages, website at https://signllm.github.io/
- Published
- 2024
49. Attainability of the best constant of Hardy-Sobolev inequality with full boundary singularities
- Author
-
Sun, Liming and Wang, Lei
- Subjects
Mathematics - Analysis of PDEs, 35A23, 35B09, 35B44, 35J75, 35J91 - Abstract
We consider a type of Hardy-Sobolev inequality, whose weight function is singular on the whole domain boundary. We are concerned with the attainability of the best constant of such inequality. In dimension two, we link the inequality to a conformally invariant one using the conformal radius of the domain. The best constant of such inequality on a smooth bounded domain is achieved if and only if the domain is non-convex. In higher dimensions, the best constant is achieved if the domain has negative mean curvature somewhere. If the mean curvature vanishes but is non-umbilic somewhere, we also establish the attainability for some special cases. In the other direction, we also show the best constant is not achieved if the domain is sufficiently close to a ball in $C^2$ sense. Comment: 42 pages
- Published
- 2024
50. On the Adversarial Robustness of Learning-based Image Compression Against Rate-Distortion Attacks
- Author
-
Wu, Chenhao, Wu, Qingbo, Wei, Haoran, Chen, Shuai, Wang, Lei, Ngan, King Ngi, Meng, Fanman, and Li, Hongliang
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing - Abstract
Despite demonstrating superior rate-distortion (RD) performance, learning-based image compression (LIC) algorithms have been found to be vulnerable to malicious perturbations in recent studies. However, the adversarial attacks considered in existing literature remain divergent from real-world scenarios, both in terms of the attack direction and bitrate. Additionally, existing methods focus solely on empirical observations of the model vulnerability, neglecting to identify its origin. These limitations hinder the comprehensive investigation and in-depth understanding of the adversarial robustness of LIC algorithms. To address the aforementioned issues, this paper considers the arbitrary nature of the attack direction and the uncontrollable compression ratio faced by adversaries, and presents two practical rate-distortion attack paradigms, i.e., Specific-ratio Rate-Distortion Attack (SRDA) and Agnostic-ratio Rate-Distortion Attack (ARDA). Using the performance variations as indicators, we evaluate the adversarial robustness of eight predominant LIC algorithms against diverse attacks. Furthermore, we propose two novel analytical tools for in-depth analysis, i.e., Entropy Causal Intervention and Layer-wise Distance Magnify Ratio, and reveal that the hyperprior significantly increases the bitrate and Inverse Generalized Divisive Normalization (IGDN) significantly amplifies input perturbations when under attack. Lastly, we examine the efficacy of adversarial training and introduce the use of online updating for defense. By comparing their advantages and disadvantages, we provide a reference for constructing more robust LIC algorithms against the rate-distortion attacks.
- Published
- 2024