62,797 results on '"Chen, Xi"'
Search Results
2. A Powder Diffraction-AI Solution for Crystalline Structure
- Author
Wu, Di, Wang, Pengkun, Zhou, Shiming, Zhang, Bochun, Yu, Liheng, Chen, Xi, Wang, Xu, Zhou, Zhengyang, Wang, Yang, Wang, Sujing, and Du, Jiangfeng
- Subjects
Condensed Matter - Materials Science
- Abstract
Determining the atomic-level structure of crystalline solids is critically important across a wide array of scientific disciplines. The challenges associated with obtaining samples suitable for single-crystal diffraction, coupled with the limitations inherent in classical structure determination methods that primarily utilize powder diffraction for most polycrystalline materials, underscore an urgent need to develop alternative approaches for elucidating the structures of commonly encountered crystalline compounds. In this work, we present an artificial intelligence-directed leapfrog model capable of accurately determining the structures of both organic and inorganic-organic hybrid crystalline solids through direct analysis of powder X-ray diffraction data. This model not only offers a comprehensive solution that effectively circumvents issues related to insoluble challenges in conventional structure solution methodologies but also demonstrates applicability to crystal structures across all conceivable space groups. Furthermore, it exhibits notable compatibility with routine powder diffraction data typically generated by standard instruments, featuring rapid data collection and normal resolution levels.
- Published
- 2024
3. The Flattest Infrared Extinction Curve in Four Isolated Dense Molecular Cloud Cores
- Author
Li, Jun, Chen, Bingqiu, Jiang, Biwei, Zhao, He, Jiang, Botao, and Chen, Xi
- Subjects
Astrophysics - Astrophysics of Galaxies, Astrophysics - Solar and Stellar Astrophysics
- Abstract
The extinction curve of interstellar dust in the dense molecular cloud cores is crucial for understanding dust properties, particularly size distribution and composition. We investigate the infrared extinction law in four nearby isolated molecular cloud cores, L429, L483, L673, and L1165, across the 1.2 - 8.0 $\mu$m wavelength range, using deep near-infrared (NIR) and mid-infrared (MIR) photometric data from UKIDSS and Spitzer Space Telescope. These observations probe an unprecedented extinction depth, reaching $A_V\sim$ 40-60 mag in these dense cloud cores. We derive color-excess ratios $E(K-\lambda)/E(H-K)$ by fitting color-color diagrams of $(K-\lambda)$ versus $(H-K)$, which are subsequently used to calculate the extinction law $A_\lambda/A_K$. Our analysis reveals remarkably similar and exceptionally flat infrared extinction curves for all four cloud cores, exhibiting the most pronounced flattening reported in the literature to date. This flatness is consistent with the presence of large dust grains, suggesting significant grain growth in dense environments. Intriguingly, our findings align closely with the Astrodust model for a diffuse interstellar environment proposed by Hensley \& Draine. This agreement between dense core observations and a diffuse medium model highlights the complexity of dust evolution and the need for further investigation into the processes governing dust properties in different interstellar environments. Comment: Accepted for publication in The Astrophysical Journal Letters (15 pages, 8 figures, 3 tables)
- Published
- 2024
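The step from color-excess ratios to the extinction law mentioned in the abstract of entry 3 can be written out as a short derivation. This is a sketch using the standard definitions of color excess; the symbol $k_\lambda$ is introduced here for illustration, and the value of $A_H/A_K$ is an external input that the abstract does not specify.

```latex
% Standard color-excess definitions (differences of extinctions):
%   E(K-\lambda) = A_K - A_\lambda,   E(H-K) = A_H - A_K
\begin{align}
k_\lambda &\equiv \frac{E(K-\lambda)}{E(H-K)}
           = \frac{A_K - A_\lambda}{A_H - A_K}, \\
\frac{A_\lambda}{A_K} &= 1 - k_\lambda\left(\frac{A_H}{A_K} - 1\right).
\end{align}
% A flat curve means A_\lambda / A_K stays close to constant as \lambda increases.
```

The second line follows by solving the first for $A_\lambda$ and dividing by $A_K$, so the derived curve depends on whichever $A_H/A_K$ is adopted.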
4. Large Language Models (LLMs) for Wireless Networks: An Overview from the Prompt Engineering Perspective
- Author
Zhou, Hao, Hu, Chengming, Yuan, Dun, Yuan, Ye, Wu, Di, Chen, Xi, Tabassum, Hina, and Liu, Xue
- Subjects
Computer Science - Networking and Internet Architecture
- Abstract
Recently, large language models (LLMs) have been successfully applied to many fields, showing outstanding comprehension and reasoning capabilities. Despite their great potential, LLMs usually require dedicated pre-training and fine-tuning for domain-specific applications such as wireless networks. These adaptations can be extremely demanding for computational resources and datasets, while most network devices have limited computation power, and there are a limited number of high-quality networking datasets. To this end, this work explores LLM-enabled wireless networks from the prompt engineering perspective, i.e., designing prompts to guide LLMs to generate desired output without updating LLM parameters. Compared with other LLM-driven methods, prompt engineering can better align with the demands of wireless network devices, e.g., higher deployment flexibility, rapid response time, and lower requirements on computation power. In particular, this work first introduces LLM fundamentals and compares different prompting techniques such as in-context learning, chain-of-thought, and self-refinement. Then we propose two novel prompting schemes for network applications: iterative prompting for network optimization, and self-refined prompting for network prediction. The case studies show that the proposed schemes can achieve comparable performance as conventional machine learning techniques, and our proposed prompting-based methods avoid the complexity of dedicated model training and fine-tuning, which is one of the key bottlenecks of existing machine learning techniques.
- Published
- 2024
5. Landau-Level Quantization and Band Splitting of FeSe Monolayers Revealed by Scanning Tunneling Spectroscopy
- Author
Huang, Wantong, Lin, Haicheng, Yin, Yuguo, Zheng, Cheng, Chen, Wei, Ji, Lichen, Hughes, Jack, Kusmartsev, Fedor, Kusmartseva, Anna, Xue, Qi-Kun, Chen, Xi, and Ji, Shuai-Hua
- Subjects
Condensed Matter - Superconductivity, Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
Two-dimensional (2D) superconductors that reside on substrates must be influenced by Rashba spin-orbit coupling (SOC). The intriguing effect of Rashba-type SOCs on iron-based superconductors (IBSs) has remained largely a mystery. In this work, we unveil modified Landau-level spectroscopy and the intricate band splitting of FeSe monolayers through the precision of scanning tunneling spectroscopy, which unequivocally demonstrates the presence of Rashba SOC. The discovery sheds light on a nonparabolic electron band at the X/Y point, displaying a distinctive Landau quantization behavior characterized by $E_n\propto(nB)^{4/3}$. The theoretical model aligns with our experimental insights, positing that the k$^4$-term of the electron band becomes predominant and profoundly reshapes the band structure. Our research underscores the pivotal role of the Rashba SOC effect on 2D superconductors and sets the stage to probe new quantum states in systems with remarkably low carrier concentrations. Comment: 21 pages, 5 figures
- Published
- 2024
6. Lower Bounds for Convexity Testing
- Author
Chen, Xi, De, Anindya, Nadimpalli, Shivam, Servedio, Rocco A., and Waingarten, Erik
- Subjects
Computer Science - Computational Complexity, Computer Science - Data Structures and Algorithms
- Abstract
We consider the problem of testing whether an unknown and arbitrary set $S \subseteq \mathbb{R}^n$ (given as a black-box membership oracle) is convex, versus $\varepsilon$-far from every convex set, under the standard Gaussian distribution. The current state-of-the-art testing algorithms for this problem make $2^{\tilde{O}(\sqrt{n})\cdot \mathrm{poly}(1/\varepsilon)}$ non-adaptive queries, both for the standard testing problem and for tolerant testing. We give the first lower bounds for convexity testing in the black-box query model. First, we show that any one-sided tester (which may be adaptive) must use at least $n^{\Omega(1)}$ queries in order to test to some constant accuracy $\varepsilon>0$. Second, we show that any non-adaptive tolerant tester (which may make two-sided errors) must use at least $2^{\Omega(n^{1/4})}$ queries to distinguish sets that are $\varepsilon_1$-close to convex versus $\varepsilon_2$-far from convex, for some absolute constants $0<\varepsilon_1<\varepsilon_2$. Finally, we also show that for any constant $c>0$, any non-adaptive tester (which may make two-sided errors) must use at least $n^{1/4 - c}$ queries in order to test to some constant accuracy $\varepsilon>0$. Comment: 52 pages, to appear in SODA 2025
- Published
- 2024
7. Multiple collisions in N59 bubble: Sequential cloud-cloud collisions
- Author
Chen, En, Chen, Xi, Chen, Xuepeng, Fang, Min, and He, Qianru
- Subjects
Astrophysics - Astrophysics of Galaxies
- Abstract
We report that the gas components in the N59 bubble suffered from sequential multiple cloud-cloud collision (CCC) processes. The molecular gas in the N59 bubble can be decomposed into four velocity components, namely Cloud A [95, 108] km s$^{-1}$, Cloud B [86, 95] km s$^{-1}$, Cloud C [79, 86] km s$^{-1}$ and Cloud D [65, 79] km s$^{-1}$. Four CCC processes occurred among these four velocity components, i.e., Cloud A vs. Cloud B, Cloud A vs. Cloud C, Cloud C vs. Cloud D, and Cloud A vs. Cloud D. Using Spitzer MIR and UKIDSS NIR photometric point source catalogs, we identified 514 YSO candidates clustered in 13 YSO groups, and most of them (~60$\%$) were located at the colliding interfaces, indicating that they were mainly triggered by these four CCC processes. We also found that these four collisions occurred in a time sequential order: the earliest and most violent collision occurred between Cloud A and Cloud D about 2 Myr ago, then Cloud B collided with Cloud A about 1 Myr ago, and finally, Cloud C collided with Clouds A and D simultaneously about 0.4 Myr ago.
- Published
- 2024
8. Designing a Validation Experiment for Radio Frequency Condensation
- Author
Fu, Lanke, Mitra, E. Litvinova, Nies, R., Reiman, A. H., Austin, M., Bardoczi, L., Brookman, M., Chen, Xi, Choi, W., Fisch, N. J., Hu, Q., Hyatt, A., Jung, E., La Haye, R., Logan, N. C., Maraschek, M., McClenaghan, J. J., Strait, E., Welander, A., Yang, J., and team, ASDEX Upgrade
- Subjects
Physics - Plasma Physics
- Abstract
Theoretical studies have suggested that nonlinear effects can lead to "radio frequency condensation", which coalesces RF power deposition and driven current near the center of a magnetic island. It is predicted that an initially broad current profile can coalesce in islands when they reach sufficient width, providing automatic stabilization. Experimental validation of the theory has thus far been lacking. This paper proposes experiments on DIII-D for testing and refining the theory of the nonlinear effects.
- Published
- 2024
9. Relative-error monotonicity testing
- Author
Chen, Xi, De, Anindya, Huang, Yizhi, Li, Yuhao, Nadimpalli, Shivam, Servedio, Rocco A., and Yang, Tianqi
- Subjects
Computer Science - Computational Complexity, Computer Science - Discrete Mathematics, Computer Science - Data Structures and Algorithms
- Abstract
The standard model of Boolean function property testing is not well suited for testing $\textit{sparse}$ functions which have few satisfying assignments, since every such function is close (in the usual Hamming distance metric) to the constant-0 function. In this work we propose and investigate a new model for property testing of Boolean functions, called $\textit{relative-error testing}$, which provides a natural framework for testing sparse functions. This new model defines the distance between two functions $f, g: \{0,1\}^n \to \{0,1\}$ to be $$\textsf{reldist}(f,g) := { \frac{|f^{-1}(1) \triangle g^{-1}(1)|} {|f^{-1}(1)|}}.$$ This is a more demanding distance measure than the usual Hamming distance ${ {|f^{-1}(1) \triangle g^{-1}(1)|}/{2^n}}$ when $|f^{-1}(1)| \ll 2^n$; to compensate for this, algorithms in the new model have access both to a black-box oracle for the function $f$ being tested and to a source of independent uniform satisfying assignments of $f$. In this paper we first give a few general results about the relative-error testing model; then, as our main technical contribution, we give a detailed study of algorithms and lower bounds for relative-error testing of $\textit{monotone}$ Boolean functions. We give upper and lower bounds which are parameterized by $N=|f^{-1}(1)|$, the sparsity of the function $f$ being tested. Our results show that there are interesting differences between relative-error monotonicity testing of sparse Boolean functions, and monotonicity testing in the standard model. These results motivate further study of the testability of Boolean function properties in the relative-error model.
- Published
- 2024
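The relative-error distance defined in the abstract of entry 9 can be illustrated with a brute-force computation on small $n$. This is only a sketch for intuition: the function names (`satisfying_set`, `reldist`, `hamming_dist`) are made up here, and enumerating all $2^n$ inputs is not how a query-efficient tester would work.

```python
from itertools import product
from typing import Callable, Sequence, Set, Tuple

BoolFn = Callable[[Sequence[int]], int]


def satisfying_set(f: BoolFn, n: int) -> Set[Tuple[int, ...]]:
    """Return f^{-1}(1) by enumerating all 2^n inputs (small n only)."""
    return {x for x in product((0, 1), repeat=n) if f(x) == 1}


def reldist(f: BoolFn, g: BoolFn, n: int) -> float:
    """|f^{-1}(1) symmetric-difference g^{-1}(1)| / |f^{-1}(1)|."""
    ones_f = satisfying_set(f, n)
    ones_g = satisfying_set(g, n)
    if not ones_f:
        raise ValueError("reldist is undefined when f has no satisfying assignment")
    return len(ones_f ^ ones_g) / len(ones_f)


def hamming_dist(f: BoolFn, g: BoolFn, n: int) -> float:
    """Usual Hamming distance: same numerator, normalized by 2^n."""
    return len(satisfying_set(f, n) ^ satisfying_set(g, n)) / 2 ** n


# A sparse function (AND of all bits) versus the constant-0 function.
n = 4
and_all = lambda x: int(all(x))
const_zero = lambda x: 0

print(hamming_dist(and_all, const_zero, n))  # small: 1 / 2^n
print(reldist(and_all, const_zero, n))       # maximal: 1 / 1
```

The example mirrors the abstract's motivation: a sparse function is close to constant-0 in Hamming distance, while its relative-error distance to constant-0 is 1.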
10. Symmetry-enhanced Counterdiabatic Quantum Algorithm for Qudits
- Author
Bottarelli, Alberto, de Andoin, Mikel Garcia, Chandarana, Pranav, Paul, Koushik, Chen, Xi, Sanz, Mikel, and Hauke, Philipp
- Subjects
Quantum Physics
- Abstract
Qubit-based variational quantum algorithms have undergone rapid development in recent years but still face several challenges. In this context, we propose a symmetry-enhanced digitized counterdiabatic quantum algorithm utilizing qudits instead of qubits. This approach offers three types of compression with respect to conventional variational circuits. First, compression in the circuit depth is achieved by counterdiabatic protocols. Second, information about the problem is compressed by replacing qubits with qudits, allowing for a more efficient representation of the problem. Lastly, the number of parameters is reduced by employing the symmetries of the system. We illustrate this approach by tackling a graph-based optimization problem, Max-3-Cut, and a highly-entangled state preparation, the qutrit W state. As our numerical results show, we achieve a better convergence with a lower circuit depth and less measurement overhead in all the cases considered. This work leads to a better design of shallow variational quantum circuits, improving the feasibility of their implementation on near-term qudit devices.
- Published
- 2024
11. TRACE: Temporal Grounding Video LLM via Causal Event Modeling
- Author
Guo, Yongxin, Liu, Jingyu, Li, Mingda, Tang, Xiaoying, Liu, Qingbin, and Chen, Xi
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Video Temporal Grounding (VTG) is a crucial capability for video understanding models and plays a vital role in downstream tasks such as video browsing and editing. To effectively handle various tasks simultaneously and enable zero-shot prediction, there is a growing trend in employing video LLMs for VTG tasks. However, current video LLM-based methods rely exclusively on natural language generation, lacking the ability to model the clear structure inherent in videos, which restricts their effectiveness in tackling VTG tasks. To address this issue, this paper first formally introduces a causal event modeling framework, which represents videos as sequences of events and predicts the current event using previous events, video inputs, and textual instructions. Each event consists of three components: timestamps, salient scores, and textual captions. We then propose a novel task-interleaved video LLM called TRACE to effectively implement the causal event modeling framework in practice. TRACE processes visual frames, timestamps, salient scores, and text as distinct tasks, employing various encoders and decoding heads for each. Task tokens are arranged in an interleaved sequence according to the causal event modeling framework's formulation. Extensive experiments on various VTG tasks and datasets demonstrate the superior performance of TRACE compared to state-of-the-art video LLMs. Our model and code are available at https://github.com/gyxxyg/TRACE.
- Published
- 2024
12. Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
- Author
Chen, Xi, Feng, Kaituo, Li, Changsheng, Lai, Xunhao, Yue, Xiangyu, Yuan, Ye, and Wang, Guoren
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Low-rank training has emerged as a promising approach for reducing memory usage in training Large Language Models (LLMs). Previous methods either rely on decomposing weight matrices (e.g., LoRA), or seek to decompose gradient matrices (e.g., GaLore) to ensure reduced memory consumption. However, both of them constrain the training in a low-rank subspace, thus inevitably leading to sub-optimal performance. This raises a question: whether it is possible to consistently preserve the low-rank constraint for memory efficiency, while achieving full-rank training (i.e., training with full-rank gradients of full-rank weights) to avoid inferior outcomes? In this paper, we propose a new plug-and-play training framework for LLMs called Fira, as the first attempt to achieve this goal. First, we observe an interesting phenomenon during LLM training: the scaling impact of adaptive optimizers (e.g., Adam) on the gradient norm remains similar from low-rank to full-rank training. Based on this observation, we propose a norm-based scaling method, which utilizes the scaling impact of low-rank optimizers as substitutes for that of original full-rank optimizers to enable full-rank training. In this way, we can preserve the low-rank constraint in the optimizer while achieving full-rank training for better performance. Moreover, we find that there are sudden gradient rises during the optimization process, potentially causing loss spikes. To address this, we further put forward a norm-growth limiter to smooth the gradient via regulating the relative increase of gradient norms. Extensive experiments on the pre-training and fine-tuning of LLMs show that Fira outperforms both LoRA and GaLore, achieving performance that is comparable to or even better than full-rank training. Comment: Add further analysis of the scaling factor, code is available at: https://github.com/xichen-fy/Fira
- Published
- 2024
13. LLM+KG@VLDB'24 Workshop Summary
- Author
Khan, Arijit, Wu, Tianxing, and Chen, Xi
- Subjects
Computer Science - Databases, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
The unification of large language models (LLMs) and knowledge graphs (KGs) has emerged as a hot topic. At the LLM+KG'24 workshop, held in conjunction with VLDB 2024 in Guangzhou, China, one of the key themes explored was important data management challenges and opportunities due to the effective interaction between LLMs and KGs. This report outlines the major directions and approaches presented by various speakers during the LLM+KG'24 workshop. Comment: 7 pages, 1 figure
- Published
- 2024
14. Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness
- Author
Li, Jian, Huang, Haojing, Zhang, Yujia, Xu, Pengfei, Chen, Xi, Song, Rui, Shi, Lida, Wang, Jingwen, and Xu, Hao
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Recently, there has been significant interest in replacing the reward model in Reinforcement Learning with Human Feedback (RLHF) methods for Large Language Models (LLMs), such as Direct Preference Optimization (DPO) and its variants. These approaches commonly use a binary cross-entropy mechanism on pairwise samples, i.e., minimizing and maximizing the loss based on preferred or dis-preferred responses, respectively. However, while this training strategy omits the reward model, it also overlooks the varying preference degrees within different responses. We hypothesize that this is a key factor hindering LLMs from sufficiently understanding human preferences. To address this problem, we propose a novel Self-supervised Preference Optimization (SPO) framework, which constructs a self-supervised preference degree loss combined with the alignment loss, thereby helping LLMs improve their ability to understand the degree of preference. Extensive experiments are conducted on two widely used datasets of different tasks. The results demonstrate that SPO can be seamlessly integrated with existing preference optimization methods and significantly boost their performance to achieve state-of-the-art performance. We also conduct detailed analyses to offer comprehensive insights into SPO, which verifies its effectiveness. The code is available at https://github.com/lijian16/SPO. Comment: Accepted at EMNLP 2024 Findings
- Published
- 2024
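The "binary cross-entropy mechanism on pairwise samples" that the abstract of entry 14 attributes to DPO-style methods can be sketched for a single preference pair. This shows the widely used DPO objective, not the SPO loss proposed in the paper; the function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_pair_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # assumed temperature; not taken from the abstract
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    # Implicit reward margin: how much more the policy (relative to the
    # reference model) prefers the chosen response over the rejected one.
    margin = beta * (
        (policy_logp_chosen - ref_logp_chosen)
        - (policy_logp_rejected - ref_logp_rejected)
    )
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The loss treats every pair as a binary label: it depends only on the margin,
# not on how strongly the chosen response is preferred by annotators.
print(dpo_pair_loss(-1.0, -3.0, -2.0, -2.0))
print(dpo_pair_loss(-3.0, -1.0, -2.0, -2.0))
```

Because the target is the same binary label for every pair, the objective carries no information about the degree of preference, which is the gap the abstract says SPO aims to address.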
15. AI Delegates with a Dual Focus: Ensuring Privacy and Strategic Self-Disclosure
- Author
Chen, Xi, Zhang, Zhiyang, Yang, Fangkai, Qin, Xiaoting, Du, Chao, Cheng, Xi, Liu, Hangxin, Lin, Qingwei, Rajmohan, Saravan, Zhang, Dongmei, and Zhang, Qi
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computers and Society
- Abstract
Large language model (LLM)-based AI delegates are increasingly utilized to act on behalf of users, assisting them with a wide range of tasks through conversational interfaces. Despite their advantages, concerns arise regarding the potential risk of privacy leaks, particularly in scenarios involving social interactions. While existing research has focused on protecting privacy by limiting the access of AI delegates to sensitive user information, many social scenarios require disclosing private details to achieve desired outcomes, necessitating a balance between privacy protection and disclosure. To address this challenge, we conduct a pilot study to investigate user preferences for AI delegates across various social relations and task scenarios, and then propose a novel AI delegate system that enables privacy-conscious self-disclosure. Our user study demonstrates that the proposed AI delegate strategically protects privacy, pioneering its use in diverse and dynamic social interactions.
- Published
- 2024
16. Nonlinear field dependence of Hall effect and high-mobility multi-carrier transport in an altermagnet CrSb
- Author
Bai, Yuqing, Xiang, Xinji, Pan, Shuang, Zhang, Shichao, Chen, Haifeng, Chen, Xi, Han, Zhida, Xu, Guizhou, and Xu, Feng
- Subjects
Condensed Matter - Materials Science
- Abstract
As a promising candidate for altermagnet, CrSb possesses a distinctive compensated spin split band structure that could bring groundbreaking concepts to the field of spintronics. In this work, we have grown high-quality CrSb single crystals and comprehensively investigated their electronic and magneto-transport properties. We have observed large, positive, and non-saturated magnetoresistance (MR) in CrSb, which well obeys Kohler's rule, indicating its classic Lorentz scattering origins. Remarkably, a nonlinear magnetic field dependence of Hall effect resembling the spontaneous anomalous Hall is identified over a wide temperature range. After careful analysis of the transport data, we conclude the non-linearity mainly stems from the incorporation of different carriers in the magnetoconductivity. According to the Fermi surface analyses of CrSb, we applied the three-carrier model to fit the conductivity data, yielding good agreement. The extracted carrier concentration and mobility indicate that CrSb behaves more like a semimetal, with the highest mobility reaching $3\times10^{3}$ cm$^{2}$V$^{-1}$s$^{-1}$. Furthermore, calculations using the semiclassical Boltzmann transport theory have successfully reproduced the main features of the experimental MR and Hall effect in CrSb. These exceptional transport properties make CrSb unique for applications in spintronics as an altermagnet.
- Published
- 2024
17. Lyapunov Controlled Counterdiabatic Quantum Optimization
- Author
Chandarana, Pranav, Paul, Koushik, Swain, Kasturi Ranjan, Chen, Xi, and del Campo, Adolfo
- Subjects
Quantum Physics, Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
We introduce a quantum algorithm integrating counterdiabatic (CD) protocols with quantum Lyapunov control (QLC) to tackle combinatorial optimization problems. This approach offers versatility, allowing implementation as either a digital-analog or purely digital algorithm based on selected control strategies. By examining spin-glass Hamiltonians, we illustrate how the algorithm can explore alternative paths to enhance solution outcomes compared to conventional CD techniques. This method reduces the dependence on extensive higher-order CD terms and classical optimization techniques, rendering it more suitable for existing quantum computing platforms. The combination of digital compression via CD protocols and the adaptable nature of QLC methods positions this approach as a promising candidate for near-term quantum computing.
- Published
- 2024
18. Scrambling in the Charging of Quantum Batteries
- Author
Romero, Sebastián V., Ding, Yongcheng, Chen, Xi, and Ban, Yue
- Subjects
Quantum Physics, Condensed Matter - Strongly Correlated Electrons, High Energy Physics - Theory, Nonlinear Sciences - Chaotic Dynamics
- Abstract
Exponentially fast scrambling of an initial state characterizes quantum chaotic systems. Given the importance of quickly populating higher energy levels from low-energy states in quantum battery charging protocols, this Letter investigates the role of quantum scrambling in quantum batteries and its effect on optimal power and charging times. We adopt a bare representation with normalized bandwidths to suppress system energy dependence. To our knowledge, this is the first in-depth exploration of quantum scrambling in the context of quantum batteries. By analyzing the dynamics of out-of-time-order correlators, our findings indicate that quantum scrambling does not necessarily lead to faster charging, despite its potential for accelerating the process. Comment: Main text: 4 pages, 4 figures. Supplemental material: 4 pages, 4 figures
- Published
- 2024
19. Enhancing Long Video Understanding via Hierarchical Event-Based Memory
- Author
Cheng, Dingxin, Li, Mingda, Liu, Jingyu, Guo, Yongxin, Jiang, Bin, Liu, Qingbin, Chen, Xi, and Zhao, Bo
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Recently, integrating visual foundation models into large language models (LLMs) to form video understanding systems has attracted widespread attention. Most of the existing models compress diverse semantic information within the whole video and feed it into LLMs for content comprehension. While this method excels in short video understanding, it may result in a blend of multiple event information in long videos due to coarse compression, which causes information redundancy. Consequently, the semantics of key events might be obscured within the vast information that hinders the model's understanding capabilities. To address this issue, we propose a Hierarchical Event-based Memory-enhanced LLM (HEM-LLM) for better understanding of long videos. Firstly, we design a novel adaptive sequence segmentation scheme to divide multiple events within long videos. In this way, we can perform individual memory modeling for each event to establish intra-event contextual connections, thereby reducing information redundancy. Secondly, while modeling the current event, we compress and inject the information of the previous event to enhance the long-term inter-event dependencies in videos. Finally, we perform extensive experiments on various video understanding tasks and the results show that our model achieves state-of-the-art performances.
- Published
- 2024
20. Freely Suspended Nematic and Smectic Films and Free-Standing Smectic Filaments in the Ferroelectric Nematic Realm
- Author
Hedlund, Keith G., Martinez, Vikina, Chen, Xi, Park, Cheol S., Maclennan, Joseph E., Glaser, Matthew A., and Clark, Noel A.
- Subjects
Condensed Matter - Soft Condensed Matter
- Abstract
We show that stable, freely suspended liquid crystal films can be made from the ferroelectric nematic ($\mathrm{N_F}$) phase and from the recently discovered polar, lamellar $\mathrm{SmZ_A}$ and $\mathrm{SmA_F}$ phases. The $\mathrm{N_F}$ films display two-dimensional, smectic-like parabolic focal conic textures comprising director/polarization bend that are a manifestation of the electrostatic suppression of director splay in the film plane. In the $\mathrm{SmZ_A}$ and $\mathrm{SmA_F}$ phases, the smectic layers orient preferentially normal to the film surfaces, a condition never found in typical thermotropic or lyotropic lamellar LC phases, with the $\mathrm{SmZ_A}$ films exhibiting focal-conic fan textures mimicking the appearance of typical smectics in glass cells when the layers are oriented normal to the plates, and the $\mathrm{SmA_F}$ films showing a texture of plaquettes of uniform in-plane orientation where both bend and splay are suppressed, separated by grain boundaries. The $\mathrm{SmA_F}$ phase can also be drawn into thin filaments, in which X-ray scattering reveals that the smectic layer planes are normal to the filament axis. Remarkably, the filaments are mechanically stable even if they break, forming free-standing, fluid filaments supported only at one end. The unique architectures of these films and filaments are stabilized by the electrostatic self-interaction of the liquid crystal polarization field, which enables the formation of confined, fluid structures that are fundamentally different from those of their counterparts made using previously known liquid crystal phases. Comment: Main paper 25 pages (5 figures); Supplement: 7 pages (7 figures)
- Published
- 2024
21. FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation
- Author
Chen, Xi, Yang, Haosen, Jin, Sheng, Zhu, Xiatian, and Yao, Hongxun
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Open-vocabulary segmentation poses significant challenges, as it requires segmenting and recognizing objects across an open set of categories in unconstrained environments. Building on the success of powerful vision-language (ViL) foundation models, such as CLIP, recent efforts sought to harness their zero-shot capabilities to recognize unseen categories. Despite notable performance improvements, these models still encounter the critical issue of generating precise mask proposals for unseen categories and scenarios, resulting in inferior segmentation performance eventually. To address this challenge, we introduce a novel approach, FrozenSeg, designed to integrate spatial knowledge from a localization foundation model (e.g., SAM) and semantic knowledge extracted from a ViL model (e.g., CLIP), in a synergistic framework. Taking the ViL model's visual encoder as the feature backbone, we inject the space-aware feature into the learnable queries and CLIP features within the transformer decoder. In addition, we devise a mask proposal ensemble strategy for further improving the recall rate and mask quality. To fully exploit pre-trained knowledge while minimizing training overhead, we freeze both foundation models, focusing optimization efforts solely on a lightweight transformer decoder for mask proposal generation, the performance bottleneck. Extensive experiments demonstrate that FrozenSeg advances state-of-the-art results across various segmentation benchmarks, trained exclusively on COCO panoptic data, and tested in a zero-shot manner. Code is available at https://github.com/chenxi52/FrozenSeg. Comment: 14 pages, 9 figures
- Published
- 2024
22. TC-LLaVA: Rethinking the Transfer from Image to Video Understanding with Temporal Considerations
- Author
Gao, Mingze, Liu, Jingyu, Li, Mingda, Xie, Jiangtao, Liu, Qingbin, Zhao, Bo, Chen, Xi, and Xiong, Hui
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Multimodal Large Language Models (MLLMs) have significantly improved performance across various image-language applications. Recently, there has been a growing interest in adapting image pre-trained MLLMs for video-related tasks. However, most efforts concentrate on enhancing the vision encoder and projector components, while the core part, Large Language Models (LLMs), remains comparatively under-explored. In this paper, we propose two strategies to enhance the model's capability in video understanding tasks by improving inter-layer attention computation in LLMs. Specifically, the first approach focuses on the enhancement of Rotary Position Embedding (RoPE) with Temporal-Aware Dual RoPE, which introduces temporal position information to strengthen the MLLM's temporal modeling capabilities while preserving the relative position relationships of both visual and text tokens. The second approach involves enhancing the Attention Mask with the Frame-wise Block Causal Attention Mask, a simple yet effective method that broadens visual token interactions within and across video frames while maintaining the causal inference mechanism. Based on these proposed methods, we adapt LLaVA for video understanding tasks, naming it Temporal-Considered LLaVA (TC-LLaVA). Our TC-LLaVA achieves new state-of-the-art performance across various video understanding benchmarks with only supervised fine-tuning (SFT) on video-related datasets.
- Published
- 2024
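As a rough illustration of the Frame-wise Block Causal Attention Mask named in entry 22 (TC-LLaVA), the sketch below builds a boolean mask in which visual tokens of the same frame see each other in full, while everything else stays causal. The exact masking rule used by the authors is not given in the abstract, so the rule here, the function name `frame_block_causal_mask`, and the use of `None` to mark text tokens are all assumptions.

```python
from typing import List, Optional


def frame_block_causal_mask(frame_ids: List[Optional[int]]) -> List[List[bool]]:
    """Return mask[i][j] == True when token i may attend to token j.

    frame_ids[k] is the frame index of a visual token, or None for a text
    token (a hypothetical encoding chosen for this sketch).
    """
    n = len(frame_ids)
    # Start from the ordinary causal (lower-triangular) mask.
    mask = [[j <= i for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(n):
            same_frame = frame_ids[i] is not None and frame_ids[i] == frame_ids[j]
            if same_frame:
                # Assumed rule: tokens inside one frame attend to each other
                # in both directions, forming a dense block on the diagonal.
                mask[i][j] = True
    return mask


if __name__ == "__main__":
    # Two frames of two visual tokens each, followed by one text token.
    for row in frame_block_causal_mask([0, 0, 1, 1, None]):
        print("".join("x" if allowed else "." for allowed in row))
```

Later frames still see earlier frames through the causal part of the mask, and text tokens keep a purely causal view, which is one way to read "within and across video frames while maintaining the causal inference mechanism".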
23. CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models
- Author
-
Zeng, Rui, Chen, Xi, Pu, Yuwen, Zhang, Xuhong, Du, Tianyu, and Ji, Shouling
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Backdoors can be injected into NLP models to induce misbehavior when the input text contains a specific feature, known as a trigger, which the attacker secretly selects. Unlike fixed words, phrases, or sentences used in the static text trigger, NLP dynamic backdoor attacks design triggers associated with abstract and latent text features, making them considerably stealthier than traditional static backdoor attacks. However, existing research on NLP backdoor detection primarily focuses on defending against static backdoor attacks, while detecting dynamic backdoors in NLP models remains largely unexplored. This paper presents CLIBE, the first framework to detect dynamic backdoors in Transformer-based NLP models. CLIBE injects a "few-shot perturbation" into the suspect Transformer model by crafting optimized weight perturbation in the attention layers to make the perturbed model classify a limited number of reference samples as a target label. Subsequently, CLIBE leverages the generalization ability of this few-shot perturbation to determine whether the original model contains a dynamic backdoor. Extensive evaluation on three advanced NLP dynamic backdoor attacks, two widely-used Transformer frameworks, and four real-world classification tasks strongly validates the effectiveness of CLIBE. We also demonstrate the robustness of CLIBE against various adaptive attacks. Furthermore, we employ CLIBE to scrutinize 49 popular Transformer models on Hugging Face and discover one exhibiting a high probability of containing a dynamic backdoor. We have contacted Hugging Face and provided detailed evidence of this model's backdoor behavior. Moreover, we extend CLIBE to detect backdoor text generation models modified to exhibit toxic behavior. To the best of our knowledge, CLIBE is the first framework capable of detecting backdoors in text generation models without access to trigger input test samples., Comment: To appear in the Network and Distributed System Security (NDSS) Symposium, February, 2025
- Published
- 2024
24. Direct and indirect regulation of β-glucocerebrosidase by the transcription factors USF2 and ONECUT2.
- Author
-
Ging, Kathi, Frick, Lukas, Schlachetzki, Johannes, Armani, Andrea, Zhu, Yanping, Gilormini, Pierre-André, Dhingra, Ashutosh, Böck, Desirée, Marques, Ana, Deen, Matthew, Chen, Xi, Serdiuk, Tetiana, Trevisan, Chiara, Sellitto, Stefano, Pisano, Claudio, Glass, Christopher, Heutink, Peter, Yin, Jiang-An, Vocadlo, David, and Aguzzi, Adriano
- Abstract
Mutations in GBA1 encoding the lysosomal enzyme β-glucocerebrosidase (GCase) are among the most prevalent genetic susceptibility factors for Parkinson's disease (PD), with 10-30% of carriers developing the disease. To identify genetic modifiers contributing to the incomplete penetrance, we examined the effect of 1634 human transcription factors (TFs) on GCase activity in lysates of an engineered human glioblastoma line homozygous for the pathogenic GBA1 L444P variant. Using an arrayed CRISPR activation library, we uncovered 11 TFs as regulators of GCase activity. Among these, activation of MITF and TFEC increased lysosomal GCase activity in live cells, while activation of ONECUT2 and USF2 decreased it. While MITF, TFEC, and USF2 affected GBA1 transcription, ONECUT2 might control GCase trafficking. The effects of MITF, TFEC, and USF2 on lysosomal GCase activity were reproducible in iPSC-derived neurons from PD patients. Our study provides a systematic approach to identifying modulators of GCase activity and deepens our understanding of the mechanisms regulating GCase.
- Published
- 2024
25. Conan-embedding: General Text Embedding with More and Better Negative Samples
- Author
-
Li, Shiyu, Tang, Yang, Chen, Shizhe, and Chen, Xi
- Subjects
Computer Science - Computation and Language - Abstract
With the growing popularity of RAG, the capabilities of embedding models are gaining increasing attention. Embedding models are primarily trained through contrastive loss learning, with negative examples being a key component. Previous work has proposed various hard negative mining strategies, but these strategies are typically employed as preprocessing steps. In this paper, we propose the conan-embedding model, which maximizes the utilization of more and higher-quality negative examples. First, since the model's ability to handle preprocessed negative examples evolves during training, we propose a dynamic hard negative mining method to expose the model to more challenging negative examples throughout the training process. Second, contrastive learning requires as many negative examples as possible but is limited by GPU memory constraints. Therefore, we use a Cross-GPU balancing Loss to provide more negative examples for embedding training and balance the batch size across multiple tasks. We also found that the prompt-response pairs from LLMs can be used for embedding training. Our approach effectively enhances the capabilities of embedding models, currently ranking first on the Chinese leaderboard of the Massive Text Embedding Benchmark.
- Published
- 2024
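To make the role of negatives in entry 25 (Conan-embedding) concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss together with a re-ranking step that picks the currently hardest negatives. This is a textbook formulation, not the authors' loss or their Cross-GPU balancing Loss; the function names, the temperature value, and the toy vectors are assumptions.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def info_nce_loss(query, positive, negatives, temperature: float = 0.05) -> float:
    """Cross-entropy of picking the positive among positive + negatives."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, neg) / temperature for neg in negatives]
    top = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum_exp - logits[0]


def hardest_negatives(query, candidates: List[Sequence[float]], k: int):
    """Re-rank candidates by current similarity to the query and keep the top k.

    Calling this repeatedly as the embeddings change is one simple reading of
    "dynamic" hard negative mining (an assumption of this sketch).
    """
    return sorted(candidates, key=lambda c: cosine(query, c), reverse=True)[:k]


if __name__ == "__main__":
    q, pos = [1.0, 0.0], [1.0, 0.0]
    pool = [[-1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    hard = hardest_negatives(q, pool, k=1)
    print("hardest negative:", hard)
    print("loss with hard negative:", info_nce_loss(q, pos, hard))
    print("loss with easy negative:", info_nce_loss(q, pos, [pool[0]]))
```

A negative that sits close to the query contributes far more to the loss than a distant one, which is why mining harder negatives during training gives a stronger learning signal.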
26. The Impact of Group Discussion and Formation on Student Performance: An Experience Report in a Large CS1 Course
- Author
-
Wu, Tong, Tang, Xiaohang, Wong, Sam, Chen, Xi, Shaffer, Clifford A., and Chen, Yan
- Subjects
Computer Science - Computers and Society - Abstract
Programming instructors often conduct collaborative learning activities, such as Peer Instruction (PI), to enhance student motivation, engagement, and learning gains. However, the impact of group discussion and formation mechanisms on student performance remains unclear. To investigate this, we conducted an 11-session experiment in a large, in-person CS1 course. We employed both random and expertise-balanced grouping methods to examine the efficacy of different group mechanisms and the impact of expert students' presence on collaborative learning. Our observations revealed complex dynamics within the collaborative learning environment. Among 255 groups, 146 actively engaged in discussions, with 96 of these groups demonstrating improvement for poor-performing students. Interestingly, our analysis revealed that different grouping methods (expertise-balanced or random) did not significantly influence discussion engagement or poor-performing students' improvement. In our deeper qualitative analysis, we found that struggling students often derived benefits from interactions with expert peers, but this positive effect was not consistent across all groups. We identified challenges that expert students face in peer instruction interactions, highlighting the complexity of leveraging expertise within group discussions.
- Published
- 2024
27. A Low-dose CT Reconstruction Network Based on TV-regularized OSEM Algorithm
- Author
-
An, Ran, Zhang, Yinghui, Chen, Xi, Li, Lemeng, Chen, Ke, and Li, Hongwei
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition ,I.4.5 - Abstract
Low-dose computed tomography (LDCT) offers significant advantages in reducing the potential harm to human bodies. However, reducing the X-ray dose in CT scanning often leads to severe noise and artifacts in the reconstructed images, which might adversely affect diagnosis. By utilizing the expectation maximization (EM) algorithm, statistical priors could be combined with artificial priors to improve LDCT reconstruction quality. However, conventional EM-based regularization methods adopt an alternating solving strategy, i.e. full reconstruction followed by image-regularization, resulting in over-smoothing and slow convergence. In this paper, we propose to integrate TV regularization into the "M"-step of the EM algorithm, thus achieving effective and efficient regularization. Besides, by employing the Chambolle-Pock (CP) algorithm and the ordered subset (OS) strategy, we propose the OSEM-CP algorithm for LDCT reconstruction, in which both reconstruction and regularization are conducted view-by-view. Furthermore, by unrolling OSEM-CP, we propose an end-to-end reconstruction neural network (NN), named OSEM-CPNN, with remarkable performance and efficiency that achieves high-quality reconstructions in just one full-view iteration. Experiments on different models and datasets demonstrate our methods' outstanding performance compared to traditional and state-of-the-art deep-learning methods., Comment: 11 pages, 8 figures
- Published
- 2024
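For readers unfamiliar with the ordered-subset EM (OSEM) iteration that entry 27 builds on, the following is a minimal sketch of the plain, unregularized update on a toy system. It does not include the TV regularization, the Chambolle-Pock step, or the unrolled network described in the abstract; the matrix, the subsets, and the function name `osem_pass` are made up for illustration.

```python
from typing import List, Sequence


def osem_pass(A: Sequence[Sequence[float]], y: Sequence[float],
              x: Sequence[float], subsets: Sequence[Sequence[int]]) -> List[float]:
    """One full pass of OSEM: an EM update per subset of measurement rows.

    For each subset S the multiplicative update is
        x_j <- x_j / sum_{i in S} A[i][j] * sum_{i in S} A[i][j] * y[i] / (A x)[i]
    """
    x = list(x)
    for subset in subsets:
        # Forward-project the current image onto the rows of this subset.
        proj = {i: sum(A[i][j] * x[j] for j in range(len(x))) for i in subset}
        updated = []
        for j in range(len(x)):
            sensitivity = sum(A[i][j] for i in subset)
            if sensitivity == 0:
                updated.append(x[j])  # pixel not seen by this subset
                continue
            back = sum(A[i][j] * y[i] / proj[i] for i in subset if proj[i] > 0)
            updated.append(x[j] * back / sensitivity)
        x = updated
    return x


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 2.0]]  # toy system matrix
    y = [2.0, 3.0, 5.0, 8.0]                               # noiseless data for x = [2, 3]
    x = [1.0, 1.0]
    for _ in range(50):
        x = osem_pass(A, y, x, subsets=[[0, 3], [1, 2]])
    print(x)
```

Each subset plays the role of a group of views, so the image is refreshed several times per pass over the data, which is the source of the speed-up over classic EM.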
28. Self-Refined Generative Foundation Models for Wireless Traffic Prediction
- Author
-
Hu, Chengming, Zhou, Hao, Wu, Di, Chen, Xi, Yan, Jun, and Liu, Xue
- Subjects
Electrical Engineering and Systems Science - Systems and Control - Abstract
With a broad range of emerging applications in 6G networks, wireless traffic prediction has become a critical component of network management. However, the dynamically shifting distribution of wireless traffic in non-stationary 6G networks presents significant challenges to achieving accurate and stable predictions. Motivated by recent advancements in Generative AI (GAI)-enabled 6G networks, this paper proposes a novel self-refined Large Language Model (LLM) for wireless traffic prediction, namely TrafficLLM, through in-context learning without parameter fine-tuning or model training. The proposed TrafficLLM harnesses the powerful few-shot learning abilities of LLMs to enhance the scalability of traffic prediction in dynamically changing wireless environments. Specifically, our proposed TrafficLLM embraces an LLM to iteratively refine its predictions through a three-step process: traffic prediction, feedback generation, and prediction refinement. Initially, the proposed TrafficLLM conducts traffic predictions using task-specific demonstration prompts. Recognizing that LLMs may generate incorrect predictions on the first attempt, we subsequently incorporate feedback demonstration prompts designed to provide multifaceted and valuable feedback related to these initial predictions. Following this comprehensive feedback, our proposed TrafficLLM introduces refinement demonstration prompts, enabling the same LLM to further refine its predictions and thereby enhance prediction performance. The evaluations on two realistic datasets demonstrate that the proposed TrafficLLM outperforms state-of-the-art methods with performance improvements of 23.17% and 17.09%, respectively.
- Published
- 2024
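The three-step process in entry 28 (traffic prediction, feedback generation, prediction refinement) can be sketched as a small control loop around any text-in, text-out model. The prompt wording, the `llm` callable, and the function name `self_refine` are hypothetical; the abstract does not specify the demonstration prompts, so none are reproduced here.

```python
from typing import Callable, List


def self_refine(llm: Callable[[str], str], history: List[float], rounds: int = 1) -> str:
    """Predict, then alternate feedback and refinement with the same model.

    `llm` is any function mapping a prompt string to a response string
    (a stand-in for a real LLM client, which this sketch does not assume).
    """
    # Step 1: initial prediction from the observed traffic history.
    prediction = llm(f"PREDICT the next traffic value given history {history}")
    for _ in range(rounds):
        # Step 2: ask the same model to critique its own prediction.
        feedback = llm(f"FEEDBACK on prediction {prediction} given history {history}")
        # Step 3: ask it to revise the prediction in light of that critique.
        prediction = llm(f"REFINE prediction {prediction} using feedback: {feedback}")
    return prediction


if __name__ == "__main__":
    def echo_model(prompt: str) -> str:
        """Toy stand-in that just reports which step it was asked to do."""
        return f"<{prompt.split()[0].lower()} output>"

    print(self_refine(echo_model, [10.0, 12.0, 11.5], rounds=2))
```

No model weights are touched in this loop, which matches the abstract's description of in-context learning without fine-tuning.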
29. Demonstration of Hardware Efficient Photonic Variational Quantum Algorithm
- Author
-
Agresti, Iris, Paul, Koushik, Schiansky, Peter, Steiner, Simon, Yin, Zhengao, Pentangelo, Ciro, Piacentini, Simone, Crespi, Andrea, Ban, Yue, Ceccarelli, Francesco, Osellame, Roberto, Chen, Xi, and Walther, Philip
- Subjects
Quantum Physics - Abstract
Quantum computing has brought a paradigm change in computer science, where non-classical technologies have promised to outperform their classical counterparts. Such an advantage has only been demonstrated for tasks without practical applications, while practical ones remain out of reach for state-of-the-art quantum technologies. In this context, a promising strategy to find practical use of quantum computers is to exploit hybrid quantum-classical models, where a quantum device estimates a hard-to-compute quantity, while a classical optimizer trains the parameters of the model. In this work, we demonstrate that single photons and linear optical networks are sufficient for implementing Variational Quantum Algorithms, when the problem specification, or ansatz, is tailored to this specific platform. We show this by a proof-of-principle demonstration of a variational approach to tackle an instance of a factorization task, whose solution is encoded in the ground state of a suitable Hamiltonian. This work, which combines Variational Quantum Algorithms with hardware-efficient ansatzes for linear-optics networks, showcases a promising pathway towards practical applications for photonic quantum platforms.
- Published
- 2024
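Entry 29 mentions a factorization task whose solution is encoded in the ground state of a Hamiltonian. One common way to build such a cost function, assumed here and not taken from the abstract, is the diagonal energy E(p, q) = (N - p·q)², which is zero exactly when p·q = N. The sketch below finds the zero-energy bit strings by brute force, standing in for the variational optimization performed on hardware.

```python
from itertools import product
from typing import List, Sequence, Tuple


def bits_to_int(bits: Sequence[int]) -> int:
    """Interpret bits as a little-endian binary number."""
    return sum(b << k for k, b in enumerate(bits))


def factoring_energy(n: int, p_bits: Sequence[int], q_bits: Sequence[int]) -> int:
    """Classical energy (n - p*q)^2 of one computational-basis state."""
    return (n - bits_to_int(p_bits) * bits_to_int(q_bits)) ** 2


def ground_states(n: int, num_bits: int) -> Tuple[int, List[Tuple[int, int]]]:
    """Enumerate all basis states and return the minimum energy and its (p, q) pairs."""
    best_energy, best = None, []
    for bits in product((0, 1), repeat=2 * num_bits):
        p_bits, q_bits = bits[:num_bits], bits[num_bits:]
        energy = factoring_energy(n, p_bits, q_bits)
        pair = (bits_to_int(p_bits), bits_to_int(q_bits))
        if best_energy is None or energy < best_energy:
            best_energy, best = energy, [pair]
        elif energy == best_energy:
            best.append(pair)
    return best_energy, sorted(best)


if __name__ == "__main__":
    print(ground_states(15, num_bits=3))
```

A variational algorithm would replace the exhaustive loop with a parameterized quantum state whose measured energy a classical optimizer drives toward this minimum.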
30. Learning Robust Treatment Rules for Censored Data
- Author
-
Cui, Yifan, Liu, Junyi, Shen, Tao, Qi, Zhengling, and Chen, Xi
- Subjects
Statistics - Methodology ,Mathematics - Statistics Theory ,Statistics - Computation ,Statistics - Machine Learning - Abstract
There is a fast-growing literature on estimating optimal treatment rules directly by maximizing the expected outcome. In biomedical studies and operations applications, censored survival outcomes are frequently observed, in which case the restricted mean survival time and survival probability are of great interest. In this paper, we propose two robust criteria for learning optimal treatment rules with censored survival outcomes; the former targets an optimal treatment rule maximizing the restricted mean survival time, where the restriction is specified by a given quantile such as the median; the latter targets an optimal treatment rule maximizing buffered survival probabilities, where the predetermined threshold is adjusted to account for the restricted mean survival time. We provide theoretical justifications for the proposed optimal treatment rules and develop a sampling-based difference-of-convex algorithm for learning them. In simulation studies, our estimators show improved performance compared to existing methods. We also demonstrate the proposed method using AIDS clinical trial data.
- Published
- 2024
31. MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials
- Author
-
Chen, Yan, Wang, Xueru, Deng, Xiaobin, Liu, Yilun, Chen, Xi, Zhang, Yunwei, Wang, Lei, and Xiao, Hang
- Subjects
Condensed Matter - Materials Science ,Physics - Computational Physics - Abstract
Inverse design of solid-state materials with desired properties represents a formidable challenge in materials science. Although recent generative models have demonstrated potential, their adoption has been hindered by limitations such as inefficiency, architectural constraints and restricted open-source availability. The representation of crystal structures using the SLICES (Simplified Line-Input Crystal-Encoding System) notation as a string of characters enables the use of state-of-the-art natural language processing models, such as Transformers, for crystal design. Drawing inspiration from the success of GPT models in generating coherent text, we trained a generative Transformer on the next-token prediction task to generate solid-state materials with targeted properties. We demonstrate MatterGPT's capability to generate de novo crystal structures with targeted single properties, including both lattice-insensitive (formation energy) and lattice-sensitive (band gap) properties. Furthermore, we extend MatterGPT to simultaneously target multiple properties, addressing the complex challenge of multi-objective inverse design of crystals. Our approach showcases high validity, uniqueness, and novelty in generated structures, as well as the ability to generate materials with properties beyond the training data distribution. This work represents a significant step forward in computational materials discovery, offering a powerful and open tool for designing materials with tailored properties for various applications in energy, electronics, and beyond., Comment: 20 pages, 6 figures
- Published
- 2024
32. FAST detection of OH emission in the carbon-rich planetary nebula NGC 7027
- Author
-
Ouyang, Xu-Jia, Zhang, Yong, Zhang, Chuan-Peng, Jiang, Peng, Nakashima, Jun-ichi, Chen, Xi, Qiao, Hai-Hua, Zhang, Xu-Ying, Sun, Hao-Min, Li, Xiao-Hu, and Zijlstra, Albert
- Subjects
Astrophysics - Astrophysics of Galaxies ,Astrophysics - Solar and Stellar Astrophysics - Abstract
We present the first detection of the ground-state OH emission line at 1612 MHz toward the prototypical carbon-rich planetary nebula (PN) NGC 7027, utilizing the newly installed ultra-wideband (UWB) receiver of the Five-hundred-meter Aperture Spherical radio Telescope (FAST). This emission is likely to originate from the interface of the neutral shell and the ionized region. The other three ground-state OH lines at 1665, 1667, and 1721 MHz are observed in absorption and have velocities well matched with that of HCO$^+$ absorption. We infer that the OH absorption is from the outer shell of NGC 7027, although the possibility that they are associated with a foreground cloud cannot be completely ruled out. All the OH lines exhibit a single blue-shifted component with respect to the central star. The formation of OH in carbon-rich environments might be via photodissociation-induced chemical processes. Our observations offer significant constraints for chemical simulations, and they underscore the potent capability of the UWB receiver of FAST to search for nascent PNe., Comment: 17 pages, 3 figures, accepted for publication in ApJ
- Published
- 2024
33. Conformal Trajectory Prediction with Multi-View Data Integration in Cooperative Driving
- Author
-
Chen, Xi, Bhadani, Rahul, and Head, Larry
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
Current research on trajectory prediction primarily relies on data collected by onboard sensors of an ego vehicle. With the rapid advancement in connected technologies, such as vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication, valuable information from alternate views becomes accessible via wireless networks. The integration of information from alternative views has the potential to overcome the inherent limitations associated with a single viewpoint, such as occlusions and limited field of view. In this work, we introduce V2INet, a novel trajectory prediction framework designed to model multi-view data by extending existing single-view models. Unlike previous approaches where the multi-view data is manually fused or formulated as a separate training stage, our model supports end-to-end training, enhancing both flexibility and performance. Moreover, the predicted multimodal trajectories are calibrated by a post-hoc conformal prediction module to get valid and efficient confidence regions. We evaluated the entire framework using the real-world V2I dataset V2X-Seq. Our results demonstrate superior performance in terms of Final Displacement Error (FDE) and Miss Rate (MR) using a single GPU. The code is publicly available at: https://github.com/xichennn/V2I_trajectory_prediction.
- Published
- 2024
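The post-hoc conformal prediction step mentioned in entry 33 can be illustrated with generic split conformal prediction: displacement errors on a held-out calibration set give a radius that covers a new endpoint with probability at least 1 - alpha, under the usual exchangeability assumption. This is the standard recipe, not the authors' module for multimodal trajectories; the function names and the sample numbers are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def final_displacement_error(predicted: Sequence[Point], actual: Sequence[Point]) -> float:
    """Euclidean distance between the last predicted and last actual position."""
    return math.dist(predicted[-1], actual[-1])


def conformal_radius(calibration_errors: List[float], alpha: float) -> float:
    """Split conformal quantile of the calibration errors.

    Uses the rank ceil((n + 1) * (1 - alpha)); if that rank exceeds n the
    calibration set is too small for the requested coverage and the radius
    is infinite.
    """
    n = len(calibration_errors)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(calibration_errors)[rank - 1]


if __name__ == "__main__":
    errors = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]  # made-up calibration errors (m)
    radius = conformal_radius(errors, alpha=0.5)
    print("radius of the 50% confidence disc around a predicted endpoint:", radius)
```

A disc of that radius around each predicted endpoint is the simplest form of the "confidence region" the abstract refers to; tighter regions are more efficient, and coverage at the nominal level is what makes them valid.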
34. MSMA: Multi-agent Trajectory Prediction in Connected and Autonomous Vehicle Environment with Multi-source Data Integration
- Author
-
Chen, Xi, Bhadani, Rahul, Sun, Zhanbo, and Head, Larry
- Subjects
Computer Science - Robotics ,Computer Science - Machine Learning - Abstract
The prediction of surrounding vehicle trajectories is crucial for collision-free path planning. In this study, we focus on a scenario where a connected and autonomous vehicle (CAV) serves as the central agent, utilizing both sensors and communication technologies to perceive its surrounding traffic, which consists of autonomous vehicles (AVs), connected vehicles (CVs), and human-driven vehicles (HDVs). Our trajectory prediction task is aimed at all the detected surrounding vehicles. To effectively integrate the multi-source data from both sensor and communication technologies, we propose a deep learning framework called MSMA utilizing a cross-attention module for multi-source data fusion. Vector map data is utilized to provide contextual information. The trajectory dataset is collected in the CARLA simulator with synthesized data errors introduced. Numerical experiments demonstrate that in a mixed traffic flow scenario, the integration of data from different sources enhances our understanding of the environment. This notably improves trajectory prediction accuracy, particularly in situations with a high CV market penetration rate. The code is available at: https://github.com/xichennn/MSMA.
- Published
- 2024
35. Shortcuts for Adiabatic and Variational Algorithms in Molecular Simulation
- Author
-
Ferreiro-Vélez, Julián, Iriarte-Zendoia, Iñaki, Ban, Yue, and Chen, Xi
- Subjects
Quantum Physics - Abstract
Quantum algorithms are prominent in the pursuit of achieving quantum advantage in various computational tasks. However, addressing challenges, such as limited qubit coherence and high error rates in near-term devices, requires extensive efforts. In this paper, we present a substantial stride in quantum chemistry by integrating shortcuts-to-adiabaticity techniques into adiabatic and variational algorithms for calculating the molecular ground state. Our approach includes the counter-diabatic driving that accelerates adiabatic evolution by mitigating adiabatic errors. Additionally, we introduce the counter-diabatic terms as the adiabatic gauge ansatz for the variational quantum eigensolver, which exhibits favorable convergence properties with fewer parameters, thereby reducing the circuit depth. Our approach achieves comparable accuracy to other established ansatzes, while enhancing the potential for applications in material science, drug discovery, and molecular simulations., Comment: 10 pages, 3 figures
- Published
- 2024
36. Stronger sum uncertainty relations for non-Hermitian operators
- Author
-
Song, Xiao-Feng, Ren, Yi-Fang, Liu, Shuang, Chen, Xi-Hao, and Turek, Yusuf
- Subjects
Quantum Physics ,Physics - Optics - Abstract
Uncertainty relations for two arbitrary incompatible observables have traditionally been expressed through the product of variances; expressing them through the sum of variances is preferable because the resulting bound is guaranteed to be nontrivial for two incompatible operators in some special cases. Although uncertainty relations formulated as the sum of variances have been confirmed for unitary operators, their general forms for arbitrary non-Hermitian operators have not yet been investigated in detail. Thus, this study develops four sum uncertainty relations for arbitrary non-Hermitian operators acting on system states by utilizing an appropriate Hilbert-space metric. The compatible forms of our sum inequalities with the conventional quantum mechanics are also provided via $G$-metric formalism. Concrete examples demonstrate the validity of the proposed sum uncertainty relations in both $\mathcal{PT}$-symmetric and $\mathcal{PT}$-broken phases. The proposed methods and results can help the reader understand in depth the usefulness of $G$-metric formalism in non-Hermitian quantum mechanics and the sum uncertainty relations of incompatible operators within.
- Published
- 2024
37. Synthetic monopole with half-integer magnetic charge in Bose-Einstein condensates
- Author
-
Chen, Xi-Yu, Jiang, Lijia, Bai, Wen-Kai, Yang, Tao, and Zheng, Jun-Hui
- Subjects
Condensed Matter - Quantum Gases ,Nonlinear Sciences - Pattern Formation and Solitons ,Quantum Physics - Abstract
We propose a scheme to create monopoles with half-integer magnetic charges in a spinful cold atom system. With a minimal monopole in the center, we derive the ground-state single-vortex wave function on the sphere and develop the vortex's kinematic equation in the presence of an external electromagnetic field. The vortex's trajectory is generally depicted by the precession of the system. We further formulate the inter-vortex interaction and build up a theory of multi-vortex dynamics in high-charge monopole systems. We predict the vortices' trajectory in the bi-vortex system and figure out stable vortex (line) patterns in multi-vortex systems. Our study provides deep insights into properties of magnetic monopoles and vortices and paves the way for experimental verification., Comment: 6+2+3 pages, 4+1 figures, 1 table
- Published
- 2024
38. DTFormer: A Transformer-Based Method for Discrete-Time Dynamic Graph Representation Learning
- Author
-
Chen, Xi, Xiong, Yun, Zhang, Siwei, Zhang, Jiawei, Zhang, Yao, Zhou, Shiyang, Wu, Xixi, Zhang, Mingyang, Liu, Tengfei, and Wang, Weiqiang
- Subjects
Computer Science - Machine Learning - Abstract
Discrete-Time Dynamic Graphs (DTDGs), which are prevalent in real-world implementations and notable for their ease of data acquisition, have garnered considerable attention from both academic researchers and industry practitioners. The representation learning of DTDGs has been extensively applied to model the dynamics of temporally changing entities and their evolving connections. Currently, DTDG representation learning predominantly relies on GNN+RNN architectures, which manifest the inherent limitations of both Graph Neural Networks (GNNs) and Recurrent Neural Networks (RNNs). GNNs suffer from the over-smoothing issue as the model's architecture goes deeper, while RNNs struggle to capture long-term dependencies effectively. GNN+RNN architectures also grapple with scaling to large graph sizes and long sequences. Additionally, these methods often compute node representations separately and focus solely on individual node characteristics, thereby overlooking the behavior intersections between the two nodes whose link is being predicted, such as instances where the two nodes appear together in the same context or share common neighbors. This paper introduces a novel representation learning method DTFormer for DTDGs, pivoting from the traditional GNN+RNN framework to a Transformer-based architecture. Our approach exploits the attention mechanism to concurrently process topological information within the graph at each timestamp and temporal dynamics of graphs along the timestamps, circumventing the aforementioned fundamental weakness of both GNNs and RNNs. Moreover, we enhance the model's expressive capability by incorporating the intersection relationships among nodes and integrating a multi-patching module. Extensive experiments conducted on six public dynamic graph benchmark datasets confirm our model's efficacy, achieving the SOTA performance., Comment: 11 pages, 3 figures
- Published
- 2024
39. Self-Reasoning Assistant Learning for non-Abelian Gauge Fields Design
- Author
-
Sun, Jinyang, Chen, Xi, Wang, Xiumei, Zhu, Dandan, and Zhou, Xingping
- Subjects
Computer Science - Machine Learning ,Condensed Matter - Mesoscale and Nanoscale Physics ,Computer Science - Artificial Intelligence - Abstract
Non-Abelian braiding has attracted substantial attention because of its pivotal role in describing the exchange behaviour of anyons, in which the input and outcome of non-Abelian braiding are connected by a unitary matrix. Implementing braiding in a classical system can assist the experimental investigation of non-Abelian physics. However, the design of non-Abelian gauge fields faces numerous challenges stemming from the intricate interplay of group structures, Lie algebra properties, representation theory, topology, and symmetry breaking. The extreme diversity makes it a powerful tool for the study of condensed matter physics. Whereas the widely used artificial intelligence with data-driven approaches has greatly promoted the development of physics, most works are limited to data-to-data design. Here we propose a self-reasoning assistant learning framework capable of directly generating non-Abelian gauge fields. This framework utilizes the forward diffusion process to capture and reproduce the complex patterns and details inherent in the target distribution through continuous transformation. Then the reverse diffusion process is used to make the generated data closer to the distribution of the original situation. Thus, it has strong self-reasoning capabilities, allowing it to automatically discover the feature representation and capture more subtle relationships from the dataset. Moreover, the self-reasoning eliminates the need for manual feature engineering and simplifies the process of model building. Our framework offers a disruptive paradigm shift to parse complex physical processes, automatically uncovering patterns from massive datasets.
- Published
- 2024
40. LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
- Author
-
Chen, Xi, Zhang, Songyang, Bai, Qibing, Chen, Kai, and Nakamura, Satoshi
- Subjects
Computer Science - Computation and Language - Abstract
We introduce LLaST, a framework for building high-performance Large Language model based Speech-to-text Translation systems. We address the limitations of end-to-end speech translation (E2E ST) models by exploring model architecture design and optimization techniques tailored for LLMs. Our approach includes LLM-based speech translation architecture design, ASR-augmented training, multilingual data augmentation, and dual-LoRA optimization. Our approach demonstrates superior performance on the CoVoST-2 benchmark and showcases exceptional scaling capabilities powered by LLMs. We believe this effective method will serve as a strong baseline for speech translation and provide insights for future improvements of the LLM-based speech translation framework. We release the data, code and models at https://github.com/openaudiolab/LLaST.
- Published
- 2024
41. Gaussian Process Model with Tensorial Inputs and Its Application to the Design of 3D Printed Antennas
- Author
-
Chen, Xi, Sharma, Yashika, Zhang, Hao Helen, Hao, Xin, and Zhou, Qiang
- Subjects
Computer Science - Machine Learning - Abstract
In simulation-based engineering design with time-consuming simulators, Gaussian process (GP) models are widely used as fast emulators to speed up the design optimization process. In its most commonly used form, the input of GP is a simple list of design parameters. With the rapid development of additive manufacturing (also known as 3D printing), design inputs with 2D/3D spatial information become prevalent in some applications, for example, neighboring relations between pixels/voxels and material distributions in heterogeneous materials. Such spatial information, vital to 3D printed designs, is hard to incorporate into existing GP models with common kernels such as squared exponential or Matérn. In this work, we propose to embed a generalized distance measure into a GP kernel, offering a novel and convenient technique to incorporate spatial information from freeform 3D printed designs into the GP framework. The proposed method allows complex design problems for 3D printed objects to take advantage of a plethora of tools available from the GP surrogate-based simulation optimization such as designed experiments and GP-based optimizations including Bayesian optimization. We investigate the properties of the proposed method and illustrate its performance by several numerical examples of 3D printed antennas. The dataset is publicly available at: https://github.com/xichennn/GP_dataset.
- Published
- 2024
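The idea in entry 41 of embedding a distance measure into a GP kernel can be sketched as a kernel that takes the distance function as a parameter. The abstract does not define the authors' generalized distance, so this example substitutes the Hamming distance between flattened binary voxel grids and an exponential form; both are illustrative assumptions. For binary inputs the Hamming distance equals the squared Euclidean distance, so this particular combination is an ordinary Gaussian kernel and therefore positive semidefinite, which is not true of every distance one might plug in.

```python
import math
from typing import Callable, List, Sequence

Design = Sequence[int]  # flattened binary grid: 1 = material, 0 = empty


def hamming_distance(a: Design, b: Design) -> int:
    """Number of voxels in which two designs differ (stand-in distance)."""
    return sum(x != y for x, y in zip(a, b))


def distance_kernel(a: Design, b: Design,
                    distance: Callable[[Design, Design], float] = hamming_distance,
                    length_scale: float = 1.0, variance: float = 1.0) -> float:
    """Kernel value variance * exp(-d(a, b) / length_scale)."""
    return variance * math.exp(-distance(a, b) / length_scale)


def gram_matrix(designs: List[Design], **kernel_args) -> List[List[float]]:
    """Covariance matrix of a set of designs under distance_kernel."""
    return [[distance_kernel(a, b, **kernel_args) for b in designs] for a in designs]


if __name__ == "__main__":
    designs = [(0, 1, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1)]
    for row in gram_matrix(designs, length_scale=2.0):
        print([round(value, 3) for value in row])
```

Swapping `hamming_distance` for a distance that reflects neighbor relations or material layout is where spatial information would enter; the rest of the GP machinery only ever sees the resulting Gram matrix.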
42. ViLLa: Video Reasoning Segmentation with Large Language Model
- Author
-
Zheng, Rongkun, Qi, Lu, Chen, Xi, Wang, Yi, Wang, Kun, Qiao, Yu, and Zhao, Hengshuang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Although video perception models have made remarkable advancements in recent years, they still heavily rely on explicit text descriptions or pre-defined categories to identify target instances before executing video perception tasks. These models, however, fail to proactively comprehend and reason about the user's intentions via textual input. Even though previous works attempt to investigate solutions to incorporate reasoning with image segmentation, they fail to reason with videos due to the video's complexity in object motion. To bridge the gap between image and video, in this work, we propose a new video segmentation task: video reasoning segmentation. The task is designed to output tracklets of segmentation masks given a complex input text query. What's more, to promote research in this unexplored area, we construct a reasoning video segmentation benchmark. Finally, we present ViLLa: Video reasoning segmentation with a Large Language Model, which incorporates the language generation capabilities of multimodal Large Language Models (LLMs) while retaining the capabilities of detecting, segmenting, and tracking multiple instances. We use a temporal-aware context aggregation module to incorporate contextual visual cues to text embeddings and propose a video-frame decoder to build temporal correlations across segmentation tokens. Remarkably, our ViLLa demonstrates capability in handling complex reasoning and referring video segmentation. Also, our model shows impressive ability in different temporal understanding benchmarks. Both quantitative and qualitative experiments show our method effectively unlocks new video reasoning segmentation capabilities for multimodal LLMs. The code and dataset will be available at https://github.com/rkzheng99/ViLLa., Comment: 15 pages, 6 figures
- Published
- 2024
43. Composable Generation Strategy Framework Enabled Bidirectional Design on Topological Circuits
- Author
-
Chen, Xi, Sun, Jinyang, Wang, Xiumei, Chen, Maoxin, Lin, Qingyuan, Xia, Minggang, and Zhou, Xingping
- Subjects
Physics - Applied Physics, Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
Topological insulators show important properties, such as topological phase transitions and topological edge states. Although these properties and phenomena can be simulated by well-designed circuits, it is undoubtedly difficult to design such topological circuits due to the complex physical principles and calculations involved. Therefore, achieving a framework that can automatically complete the bidirectional design of topological circuits is very significant. Here, we propose an effective bidirectional collaborative design framework with strong task adaptability, which can automatically generate specific results according to our requirements. In the framework, a composable generation strategy is employed, which involves building a shared multimodal space by bridging alignment in the diffusion process. For simplicity, a series of two-dimensional (2D) Su-Schrieffer-Heeger (SSH) circuits are constructed with different structural parameters. The framework is first applied to find the relationship between the structural information and topological features. Then, the correctness of the results can be verified through experimental measurements on a Printed Circuit Board (PCB) manufactured from the automatically generated circuit diagram. The framework is demonstrated by achieving good results in the reverse design of circuit structures and forward prediction of topological edge states, reaching an accuracy of 94%. Overall, our research demonstrates the enormous potential of the proposed bidirectional deep learning framework in complex tasks and provides insights for collaborative design tasks.
- Published
- 2024
44. LogoSticker: Inserting Logos into Diffusion Models for Customized Generation
- Author
-
Zhu, Mingkang, Chen, Xi, Wang, Zhongdao, Zhao, Hengshuang, and Jia, Jiaya
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Recent advances in text-to-image model customization have underscored the importance of integrating new concepts with a few examples. Yet, this progress is largely confined to widely recognized subjects, which can be learned with relative ease through models' adequate shared prior knowledge. In contrast, for logos, characterized by unique patterns and textual elements, it is hard to establish shared knowledge within diffusion models, which presents a unique challenge. To bridge this gap, we introduce the task of logo insertion. Our goal is to insert logo identities into diffusion models and enable their seamless synthesis in varied contexts. We present a novel two-phase pipeline LogoSticker to tackle this task. First, we propose the actor-critic relation pre-training algorithm, which addresses the nontrivial gaps in models' understanding of the potential spatial positioning of logos and interactions with other objects. Second, we propose a decoupled identity learning algorithm, which enables precise localization and identity extraction of logos. LogoSticker can generate logos accurately and harmoniously in diverse contexts. We comprehensively validate the effectiveness of LogoSticker over customization methods and large models such as DALLE 3. Project page: https://mingkangz.github.io/logosticker. Comment: ECCV 2024
- Published
- 2024
45. On the global complexity of a derivative-free Levenberg-Marquardt algorithm via orthogonal spherical smoothing
- Author
-
Chen, Xi and Fan, Jinyan
- Subjects
Mathematics - Numerical Analysis, Mathematics - Optimization and Control
- Abstract
In this paper, we propose a derivative-free Levenberg-Marquardt algorithm for nonlinear least squares problems, where the Jacobian matrices are approximated via orthogonal spherical smoothing. It is shown that the gradient models which use the approximate Jacobian matrices are probabilistically first-order accurate, and the high probability complexity bound of the algorithm is also given.
- Published
- 2024
46. Pulse-based variational quantum optimization and metalearning in superconducting circuits
- Author
-
Wang, Yapeng, Ding, Yongcheng, Cárdenas-López, Francisco Andrés, and Chen, Xi
- Subjects
Quantum Physics
- Abstract
Solving optimization problems using variational algorithms stands out as a crucial application for noisy intermediate-scale devices. Instead of constructing gate-based quantum computers, our focus centers on designing variational quantum algorithms within the analog paradigm. This involves optimizing parameters that directly control pulses, driving quantum states towards target states without the necessity of compiling a quantum circuit. In this work, we introduce pulse-based variational quantum optimization (PBVQO) as a hardware-level framework. We illustrate the framework by optimizing external fluxes on superconducting quantum interference devices, effectively driving the wave function of this specific quantum architecture to the ground state of an encoded problem Hamiltonian. Given that the performance of variational algorithms heavily relies on appropriate initial parameters, we introduce a global optimizer as a meta-learning technique to tackle a simple problem. The synergy between PBVQO and meta-learning provides an advantage over conventional gate-based variational algorithms. Comment: 9 pages, 4 figures
- Published
- 2024
47. Trace reconstruction from local statistical queries
- Author
-
Chen, Xi, De, Anindya, Lee, Chin Ho, and Servedio, Rocco A.
- Subjects
Computer Science - Data Structures and Algorithms
- Abstract
The goal of trace reconstruction is to reconstruct an unknown $n$-bit string $x$ given only independent random traces of $x$, where a random trace of $x$ is obtained by passing $x$ through a deletion channel. A Statistical Query (SQ) algorithm for trace reconstruction is an algorithm which can only access statistical information about the distribution of random traces of $x$ rather than individual traces themselves. Such an algorithm is said to be $\ell$-local if each of its statistical queries corresponds to an $\ell$-junta function over some block of $\ell$ consecutive bits in the trace. Since several -- but not all -- known algorithms for trace reconstruction fall under the local statistical query paradigm, it is interesting to understand the abilities and limitations of local SQ algorithms for trace reconstruction. In this paper we establish nearly-matching upper and lower bounds on local Statistical Query algorithms for both worst-case and average-case trace reconstruction. For the worst-case problem, we show that there is an $\tilde{O}(n^{1/5})$-local SQ algorithm that makes all its queries with tolerance $\tau \geq 2^{-\tilde{O}(n^{1/5})}$, and also that any $\tilde{O}(n^{1/5})$-local SQ algorithm must make some query with tolerance $\tau \leq 2^{-\tilde{\Omega}(n^{1/5})}$. For the average-case problem, we show that there is an $O(\log n)$-local SQ algorithm that makes all its queries with tolerance $\tau \geq 1/\mathrm{poly}(n)$, and also that any $O(\log n)$-local SQ algorithm must make some query with tolerance $\tau \leq 1/\mathrm{poly}(n)$. Comment: RANDOM 2024
- Published
- 2024
48. Deep Bag-of-Words Model: An Efficient and Interpretable Relevance Architecture for Chinese E-Commerce
- Author
-
Lin, Zhe, Tan, Jiwei, Ou, Dan, Chen, Xi, Yao, Shaowei, and Zheng, Bo
- Subjects
Computer Science - Information Retrieval, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
Text relevance or text matching of query and product is an essential technique for the e-commerce search system to ensure that the displayed products can match the intent of the query. Many studies focus on improving the performance of the relevance model in search systems. Recently, pre-trained language models like BERT have achieved promising performance on the text relevance task. While these models perform well on the offline test dataset, there are still obstacles to deploying pre-trained language models in online systems because of their high latency. The two-tower model is extensively employed in industrial scenarios, owing to its ability to harmonize performance with computational efficiency. Regrettably, such models present an opaque "black box" nature, which prevents developers from making special optimizations. In this paper, we propose the deep Bag-of-Words (DeepBoW) model, an efficient and interpretable relevance architecture for Chinese e-commerce. Our approach encodes the query and the product into a sparse BoW representation, which is a set of word-weight pairs. Each weight represents the importance or relevance score between the corresponding word and the raw text. The relevance score is measured by the accumulation of the matched words between the sparse BoW representations of the query and the product. Compared to the popular dense distributed representation, which usually suffers from the drawback of being a black box, the main advantage of the proposed representation model is that it is highly explainable and open to intervention, which is a significant advantage for the deployment and operation of online search engines. Moreover, the online efficiency of the proposed model is even better than the most efficient inner product form of dense representation ... Comment: KDD'24 accepted paper
- Published
- 2024
49. PaliGemma: A versatile 3B VLM for transfer
- Author
-
Beyer, Lucas, Steiner, Andreas, Pinto, André Susano, Kolesnikov, Alexander, Wang, Xiao, Salz, Daniel, Neumann, Maxim, Alabdulmohsin, Ibrahim, Tschannen, Michael, Bugliarello, Emanuele, Unterthiner, Thomas, Keysers, Daniel, Koppula, Skanda, Liu, Fangyu, Grycner, Adam, Gritsenko, Alexey, Houlsby, Neil, Kumar, Manoj, Rong, Keran, Eisenschlos, Julian, Kabra, Rishabh, Bauer, Matthias, Bošnjak, Matko, Chen, Xi, Minderer, Matthias, Voigtlaender, Paul, Bica, Ioana, Balazevic, Ivana, Puigcerver, Joan, Papalampidi, Pinelopi, Henaff, Olivier, Xiong, Xi, Soricut, Radu, Harmsen, Jeremiah, and Zhai, Xiaohua
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
PaliGemma is an open Vision-Language Model (VLM) that is based on the SigLIP-So400m vision encoder and the Gemma-2B language model. It is trained to be a versatile and broadly knowledgeable base model that is effective to transfer. It achieves strong performance on a wide variety of open-world tasks. We evaluate PaliGemma on almost 40 diverse tasks including standard VLM benchmarks, but also more specialized tasks such as remote-sensing and segmentation. Comment: v2 adds Appendix H and I and a few citations
- Published
- 2024
50. A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints
- Author
-
Chen, Xi, Liu, Mo, Wang, Yining, and Zhou, Yuan
- Subjects
Mathematics - Optimization and Control, Statistics - Machine Learning
- Abstract
In this paper, we consider a multi-stage dynamic assortment optimization problem with multi-nomial choice modeling (MNL) under resource knapsack constraints. Given the current resource inventory levels, the retailer makes an assortment decision at each period, and the goal of the retailer is to maximize the total profit from purchases. With the exact optimal dynamic assortment solution being computationally intractable, a practical strategy is to adopt the re-solving technique that periodically re-optimizes deterministic linear programs (LP) arising from fluid approximation. However, the fractional structure of MNL makes the fluid approximation in assortment optimization highly non-linear, which brings new technical challenges. To address this challenge, we propose a new epoch-based re-solving algorithm that effectively transforms the denominator of the objective into the constraint. Theoretically, we prove that the regret (i.e., the gap between the resolving policy and the optimal objective of the fluid approximation) scales logarithmically with the length of time horizon and resource capacities.
- Published
- 2024