32,442 results on '"Cheng, Wei"'
Search Results
2. A Common Origin for Nano-Hz Gravitational Wave Background and Black Hole Merger Events
- Author
-
Lu, Bo-Qiang, Chiang, Cheng-Wei, and Li, Tianjun
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics, High Energy Physics - Phenomenology
- Abstract
We explore the potential primordial connection between the black hole merger events detected by LIGO and the nano-Hz stochastic gravitational wave background observed by pulsar timing arrays. We propose an innovative mechanism for the formation of primordial black holes, suggesting that the Poisson fluctuations within the domain wall network can give rise to horizon-sized overdense regions. Our results indicate a plausible common origin for gravitational wave observations in two different frequency bands, potentially linked to the annihilation of the domain wall network at the QCD scale, while accounting for the accretion effects on primordial black holes. Furthermore, we demonstrate that the bias potential induced by the QCD instanton effect may naturally facilitate the annihilation of the domain wall network during the QCD phase transition. Additionally, our scenario can yield the correct axion dark matter relic abundance, particularly if realized within the clockwork axion framework.
Comment: 6 pages, 2 figures, 1 table
- Published
- 2024
3. Primordial black hole from domain wall fluctuations
- Author
-
Lu, Bo-Qiang, Chiang, Cheng-Wei, and Li, Tianjun
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics, High Energy Physics - Phenomenology
- Abstract
Domain walls are topological defects produced by the spontaneous breaking of a discrete symmetry during cosmological phase transitions. Horizon-size domain walls can significantly contribute to the energy density in the late-evolution stage. We propose that the density perturbations from the fluctuations in the number density of horizon-size domain walls could collapse to form primordial black holes. This mechanism becomes effective when the ratio of the domain wall energy density to that of the radiation reaches about 0.1 in the radiation-dominated universe. We find that models with $Z_2$ symmetry are excluded for interpreting pulsar timing array observations of the nano-Hz gravitational wave background, since the domain wall number density fluctuations in these models could lead to an overabundance of primordial black holes. Moreover, models with $N\sim 10$ domain walls also suffer strong constraints from the overabundance of primordial black holes.
Comment: 39 pages, 11 figures, 4 tables
- Published
- 2024
4. A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement
- Author
-
Zhang, Huan, Cheng, Wei, Wu, Yuhan, and Hu, Wei
- Subjects
Computer Science - Software Engineering, Computer Science - Artificial Intelligence
- Abstract
Large language models (LLMs) have achieved impressive performance on code generation. Although prior studies enhanced LLMs with prompting techniques and code refinement, they still struggle with complex programming problems due to rigid solution plans. In this paper, we draw on pair programming practices to propose PairCoder, a novel LLM-based framework for code generation. PairCoder incorporates two collaborative LLM agents, namely a Navigator agent for high-level planning and a Driver agent for specific implementation. The Navigator is responsible for proposing promising solution plans, selecting the current optimal plan, and directing the next iteration round based on execution feedback. The Driver follows the guidance of Navigator to undertake initial code generation, code testing, and refinement. This interleaved and iterative workflow involves multi-plan exploration and feedback-based refinement, which mimics the collaboration of pair programmers. We evaluate PairCoder with both open-source and closed-source LLMs on various code generation benchmarks. Extensive experimental results demonstrate the superior accuracy of PairCoder, achieving relative pass@1 improvements of 12.00%-162.43% compared to prompting LLMs directly.
Comment: Accepted in the 39th IEEE/ACM International Conference on Automated Software Engineering (ASE 2024)
- Published
- 2024
5. Variational construction of singular characteristics and propagation of singularities
- Author
-
Cannarsa, Piermarco, Cheng, Wei, Hong, Jiahui, and Wang, Kaizhi
- Subjects
Mathematics - Analysis of PDEs, Mathematics - Dynamical Systems, 35F21, 49L25, 37J50
- Abstract
On a smooth closed manifold $M$, we introduce a novel theory of maximal slope curves for any pair $(\phi,H)$ with $\phi$ a semiconcave function and $H$ a Hamiltonian. By using the notion of maximal slope curve from gradient flow theory, the intrinsic singular characteristics constructed in [Cannarsa, P.; Cheng, W., \textit{Generalized characteristics and Lax-Oleinik operators: global theory}. Calc. Var. Partial Differential Equations 56 (2017), no. 5, 56:12], the smooth approximation method developed in [Cannarsa, P.; Yu, Y. \textit{Singular dynamics for semiconcave functions}. J. Eur. Math. Soc. 11 (2009), no. 5, 999--1024], and the broken characteristics studied in [Khanin, K.; Sobolevski, A., \textit{On dynamics of Lagrangian trajectories for Hamilton-Jacobi equations}. Arch. Ration. Mech. Anal. 219 (2016), no. 2, 861--885], we prove the existence and stability of such maximal slope curves and discuss certain new weak KAM features. We also prove that maximal slope curves for any pair $(\phi,H)$ are exactly broken characteristics which have right derivatives everywhere. Applying this theory, we establish a global variational construction of strict singular characteristics and broken characteristics. Moreover, we prove a result on the global propagation of cut points along generalized characteristics, as well as a result on the propagation of singular points along strict singular characteristics, for weak KAM solutions. We also obtain the continuity equation along strict singular characteristics which clarifies the mass transport nature in the problem of propagation of singularities.
- Published
- 2024
6. A geometric approach to Mather quotient problem
- Author
-
Cheng, Wei and Wei, Wenxue
- Subjects
Mathematics - Dynamical Systems, Mathematics - Differential Geometry, 35F21, 49L25, 37J50
- Abstract
Let $(M,g)$ be a closed, connected and orientable Riemannian manifold with nonnegative Ricci curvature. Consider a Lagrangian $L(x,v):TM\to\mathbb{R}$ defined by $L(x,v):=\frac{1}{2}g_x(v,v)-\omega(v)+c$, where $c\in\mathbb{R}$ and $\omega$ is a closed 1-form. From the perspective of differential geometry, we estimate the Laplacian of the weak KAM solution $u$ to the associated Hamilton-Jacobi equation $H(x,du)=c[L]$ in the barrier sense. This analysis enables us to prove that each weak KAM solution $u$ is constant if and only if $\omega$ is a harmonic 1-form. Furthermore, we explore several applications to the Mather quotient and Mañé's Lagrangian.
- Published
- 2024
7. Tetraquark nature of the $a_0(980)$ meson in hadronic $D$ decays
- Author
-
Cheng, Hai-Yang, Chiang, Cheng-Wei, and Xu, Fanrong
- Subjects
High Energy Physics - Phenomenology, High Energy Physics - Experiment
- Abstract
The internal structure of the light scalar meson $a_0(980)$ is explored in the three-body $D$ decays of $D\to a_0(980)P\to P_1P_2P$ through the intermediate state $a_0(980)$, where $P$ denotes a pseudoscalar meson. The quasi-two-body $D\to a_0(980)^+P$ decays are governed by the external $W$-emission diagram in which $a_0(980)^+$ is emitted. The predicted branching fractions in the $q\bar q$ model of $a_0(980)$ are too small by one to two orders of magnitude compared to experiment as the amplitude is suppressed by the smallness of the $a_0(980)^+$ decay constant, while those for $D^+\to a_0(980)^0 P$ and $D^0\to a_0(980)^{-}P$ are usually too large. These discrepancies can be resolved provided that $a_0(980)$ is a tetraquark state. In this case, there exist two additional $T$-like topological amplitudes, denoted by $\overline{T}$ and $\tilde T$, which readily account for the discrepancies. An important implication of the tetraquark model is that the $D_s^+\to a_0(980)^+\pi^0+a_0(980)^0\pi^+$ decay is not a purely $W$-annihilation process as in the diquark model of $a_0(980)$; it receives dominant contributions from $\overline{T}$ newly noticed in this work. Therefore, measurements of $(D,D_s^+)\to a_0(980)P$ decays lend strong support to the tetraquark picture of $a_0(980)$.
Comment: 14 pages, 1 figure. arXiv admin note: text overlap with arXiv:2201.00460
- Published
- 2024
8. Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search
- Author
-
Light, Jonathan, Cai, Min, Chen, Weiqin, Wang, Guanzhi, Chen, Xiusi, Cheng, Wei, Yue, Yisong, and Hu, Ziniu
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
In this paper, we propose a new method Strategist that utilizes LLMs to acquire new skills for playing multi-agent games through a self-improvement process. Our method gathers quality feedback through self-play simulations with Monte Carlo tree search and LLM-based reflection, which can then be used to learn high-level strategic skills such as how to evaluate states that guide the low-level execution. We showcase how our method can be used in both action planning and dialogue generation in the context of games, achieving good performance on both tasks. Specifically, we demonstrate that our method can help train agents with better performance than both traditional reinforcement learning-based approaches and other LLM-based skill learning approaches in games including the Game of Pure Strategy (GOPS) and The Resistance: Avalon.
Comment: website: https://llm-strategist.github.io
- Published
- 2024
9. High-Capacity Metasurface at Limits of Polarization and Wavelength Multiplexing
- Author
-
Bao, Yanjun, Shi, Hongsheng, Wei, Rui, Wang, Boyou, Zhou, Zhou, Qiu, Cheng-Wei, and Li, Baojun
- Subjects
Physics - Optics
- Abstract
Polarization and wavelength multiplexing are the two most widely employed techniques to improve the capacity of metasurfaces. Existing works have pushed each technique to its individual limits. For example, the polarization multiplexing channels working at a single wavelength have been significantly increased by using noise engineering. However, it is still challenging to achieve the multiplexing limits of wavelength and polarization simultaneously. Besides, such multiplexing methods suffer from computational inefficiencies, hindering their application in tasks like image recognition that require extensive training computation. In this work, we introduce a gradient-based optimization algorithm using a deep neural network (DNN) to achieve the limits of both polarization and wavelength multiplexing with high computational efficiency. We experimentally demonstrate this capability, achieving a record-breaking capacity of 15 holographic images across five wavelengths and the maximum of three independent polarization channels, as well as 18 holographic images across three wavelengths and six correlated polarization channels. Moreover, leveraging the high computational efficiency of our DNN-based method, which is well-suited for processing large datasets, we implement large-scale image recognition tasks across 36 classes encoded in a record of nine multiplexed channels (three wavelengths × three polarizations), achieving 96% classification accuracy in calculations and 91.5% in experiments. This work sets a new benchmark for high-capacity multiplexing with metasurfaces and demonstrates the power of gradient-based inverse design for realizing multi-functional optical elements.
- Published
- 2024
10. Combinatorial synthesis and characterization of thin film Al1-xRExN (RE = Pr3+, Tb3+) heterostructural alloys
- Author
-
Paudel, Binod, Mangum, John S., Rom, Christopher L., Egbo, Kingsley, Lee, Cheng-Wei, Guthrey, Harvey, Allen, Sean, Haegel, Nancy M., Yazawa, Keisuke, Brennecka, Geoff L., and Smaha, Rebecca W.
- Subjects
Condensed Matter - Materials Science
- Abstract
The potential impact of cation-substituted AlN-based materials, such as Al1-xScxN, Al1-xGaxN, and Al1-xBxN, with exceptional electronic, electromechanical, and dielectric properties has spurred research into this broad family of materials. Rare earth (RE) cations are particularly appealing as they could additionally impart optoelectronic or magnetic functionality. However, success in incorporating a significant level of RE cations into AlN has been limited so far because it is thermodynamically challenging to stabilize such heterostructural alloys. Using combinatorial co-sputtering, we synthesized Al1-xRExN (RE = Pr, Tb) thin films and performed a rapid survey of the composition-structure-property relationships as a function of RE alloying. Under our growth conditions, we observe that Al1-xPrxN maintains a phase-pure wurtzite structure until transitioning to amorphous for x>0.22. Al1-xTbxN exhibits a phase-pure wurtzite structure until x<0.15, then exhibits mixed wurtzite and rocksalt phases for 0.16
- Published
- 2024
11. Polarization entanglement enabled by orthogonally stacked van der Waals NbOCl2 crystals
- Author
-
Guo, Qiangbing, Wu, Yun-Kun, Zhang, Di, Zhang, Qiuhong, Guo, Guang-Can, Alù, Andrea, Ren, Xi-Feng, and Qiu, Cheng-Wei
- Subjects
Physics - Optics, Condensed Matter - Materials Science
- Abstract
Polarization entanglement holds significant importance for photonic quantum technologies. Recently emerging subwavelength nonlinear quantum light sources, e.g., GaP and LiNbO3 thin films, benefiting from the relaxed phase-matching constraints and volume confinement, have shown intriguing properties, such as high-dimensional hyperentanglement and robust entanglement anti-degradation. Van der Waals (vdW) NbOCl2 crystal, renowned for its superior optical nonlinearities, has emerged as one of the ideal candidates for ultrathin quantum light sources [Nature 613, 53 (2023)]. However, polarization entanglement is inaccessible in NbOCl2 crystal due to its unfavorable nonlinear susceptibility tensor. Here, by leveraging the twist-stacking degree of freedom inherent in vdW systems, we showcase the preparation of tunable polarization entanglement and quantum Bell states. Our work not only provides a new and tunable polarization-entangled vdW photon-pair source, but also introduces a new knob in engineering the entanglement state of quantum light at the nanoscale.
Comment: 16 pages, 4 figures
- Published
- 2024
12. Unconventional Thermophotonic Charge Density Wave
- Author
-
Zhou, Cheng-Long, Torbatian, Zahra, Yang, Shui-Hua, Zhang, Yong, Yi, Hong-Liang, Antezza, Mauro, Novko, Dino, and Qiu, Cheng-Wei
- Subjects
Condensed Matter - Materials Science, Condensed Matter - Strongly Correlated Electrons
- Abstract
Charge-order states of broken symmetry, such as charge density wave (CDW), are able to induce exceptional physical properties; however, the precise understanding of the underlying physics is still elusive. Here, we combine fluctuational electrodynamics and density functional theory to reveal an unconventional thermophotonic effect in CDW-bearing TiSe$_2$, referred to as thermophotonic-CDW ($tp$-CDW). The interplay of plasmon polariton and CDW electron excitations gives rise to an anomalous negative temperature dependency in thermal photon transport, offering an intuitive fingerprint for a transformation of the electron order. Additionally, the demonstrated nontrivial features of $tp$-CDW transition hold promise for a controllable manipulation of heat flow, which could be extensively utilized in various fields such as thermal science and electron dynamics, as well as in next-generation energy devices.
Comment: 7 pages, 4 figures
- Published
- 2024
13. Enabling all-to-circular polarization upconversion by nonlinear chiral metasurfaces with rotational symmetry
- Author
-
Gromyko, Dmitrii, Loh, Jun Siang, Feng, Jiangang, Qiu, Cheng-Wei, and Wu, Lin
- Subjects
Physics - Optics
- Abstract
We implement a stacking strategy in designing chiral metasurfaces with high rotational symmetry, enabling quasi-bound-in-the-continuum (quasi-BIC) resonances characterized by absolute chirality. The rotational symmetry allows a circularly polarized pump to be converted into a circularly polarized nonlinear signal. Meanwhile, our bilayered metasurface can be engineered to respond solely to one selected circular polarization. Consequently, integrating resonant chiral response and rotational symmetry endows a unique category of metasurfaces to upconvert any linear or unpolarized pump into a circularly polarized nonlinear signal. Our results reveal that when such a metasurface is subjected to a linearly polarized pump, the intensity ratio of the resultant circularly polarized signals varies with the order of the nonlinear process. Counterintuitively, this ratio scales as the fourth power of the local field enhancement in the second harmonic process and the second power in the third harmonic process. Our work offers a comprehensive theoretical description of the nonlinear processes in chiral structures with rotation and provides universal guidelines for designing nonlinear all-dielectric metasurfaces with a strong chiral response.
- Published
- 2024
14. StraightLine: An End-to-End Resource-Aware Scheduler for Machine Learning Application Requests
- Author
-
Ching, Cheng-Wei, Guan, Boyuan, Xu, Hailu, and Hu, Liting
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Machine Learning
- Abstract
The life cycle of machine learning (ML) applications consists of two stages: model development and model deployment. However, traditional ML systems (e.g., training-specific or inference-specific systems) focus on one particular stage or phase of the life cycle of ML applications. These systems often aim at optimizing model training or accelerating model inference, and they frequently assume homogeneous infrastructure, which may not always reflect real-world scenarios that include cloud data centers, local servers, containers, and serverless platforms. We present StraightLine, an end-to-end resource-aware scheduler that schedules the optimal resources (e.g., container, virtual machine, or serverless) for different ML application requests in a hybrid infrastructure. The key innovation is an empirical dynamic placing algorithm that intelligently places requests based on their unique characteristics (e.g., request frequency, input data size, and data distribution). In contrast to existing ML systems, StraightLine offers end-to-end resource-aware placement, so it can significantly reduce response time and failure rate for model deployment when facing different computing resources in the hybrid infrastructure.
Comment: 6 pages, 8 figures, to appear in AIoTC'24
- Published
- 2024
15. AgileDART: An Agile and Scalable Edge Stream Processing Engine
- Author
-
Hu, Liting and Ching, Cheng-Wei
- Subjects
Computer Science - Databases; Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Edge applications generate a large influx of sensor data at massive scales. Under many time-critical scenarios, these massive data streams must be processed in a very short time to derive actionable intelligence. However, traditional data processing systems (e.g., stream processing systems, cloud-based IoT data processing systems) are not well-suited for these edge applications. This is because they often do not scale well with a large number of concurrent stream queries, do not support low-latency processing under limited edge computing resources, and do not adapt to the level of heterogeneity and dynamicity commonly present in edge computing environments. These gaps suggest a need for a new edge stream processing system that advances the stream processing paradigm to achieve efficiency and flexibility under the constraints presented by edge computing architectures. We present AgileDart, an agile and scalable edge stream processing engine that enables fast stream processing of a large number of concurrently running low-latency edge applications' queries at scale in dynamic, heterogeneous edge environments. The novelty of our work lies in a dynamic dataflow abstraction that leverages distributed hash table (DHT) based peer-to-peer (P2P) overlay networks to automatically place, chain, and scale stream operators to reduce query latencies, adapt to workload variations, and recover from failures; and a bandit-based path planning model that can re-plan the data shuffling paths to adapt to unreliable and heterogeneous edge networks. We show analytically and empirically that AgileDart outperforms Storm and EdgeWise on query latency and significantly improves scalability and adaptability when processing a large number of real-world edge stream applications' queries.
Comment: 18 pages, 18 figures
- Published
- 2024
16. From Prediction to Experimental Realization of Ferroelectric Wurtzite Al$_{1-x}$Gd$_{x}$N Alloys
- Author
-
Lee, Cheng-Wei, Smaha, Rebecca W., Brennecka, Geoff L., Haegel, Nancy, Gorai, Prashun, and Yazawa, Keisuke
- Subjects
Condensed Matter - Materials Science, Physics - Computational Physics
- Abstract
AlN-based alloys find widespread application in high-power microelectronics, optoelectronics, and electromechanics. The realization of ferroelectricity in wurtzite AlN-based heterostructural alloys has opened up the possibility of directly integrating ferroelectrics with conventional microelectronics based on tetrahedral semiconductors such as Si, SiC and III-Vs, enabling compute-in-memory architectures, high-density data storage, and more. The discovery of AlN-based wurtzite ferroelectrics has been driven to date by chemical intuition and empirical explorations. Here, we demonstrate the computationally-guided discovery and experimental demonstration of new ferroelectric wurtzite Al$_{1-x}$Gd$_x$N alloys. First-principles calculations indicate that the minimum energy pathway for switching changes from a collective to an individual switching process with a lower overall energy barrier, at a rare-earth fraction $x$ of $x>$ 0.10$-$0.15. Experimentally, ferroelectric switching is observed at room temperature in Al$_{1-x}$Gd$_x$N films with $x>$ 0.12, which strongly supports the switching mechanisms in wurtzite ferroelectrics proposed previously (Lee et al., $\textit{Science Advances}$ 10, eadl0848, 2024). This is also the first demonstration of ferroelectricity in an AlN-based alloy with a magnetic rare-earth element, which could pave the way for additional functionalities such as multiferroicity and opto-ferroelectricity in this exciting class of AlN-based materials.
- Published
- 2024
17. Dielectric Fano Nanoantennas for Enabling Sub-Nanosecond Lifetimes in NV-based Single Photon Emitters
- Author
-
An, Shu, Kalashnikov, Dmitry, Shi, Wenqiao, Mahfoud, Zackaria, Chew, Ah Bian, Liu, Yan, Wu, Jing, Zhu, Di, Gao, Weibo, Qiu, Cheng-Wei, Leong, Victor, and Dong, Zhaogang
- Subjects
Physics - Optics, Physics - Applied Physics, Quantum Physics
- Abstract
Solid-state quantum emitters are essential sources of single photons, and enhancing their emission rates is of paramount importance for applications in quantum communications, computing, and metrology. One approach is to couple quantum emitters with resonant photonic nanostructures, where the emission rate is enhanced due to the Purcell effect. Dielectric nanoantennas are promising as they provide strong emission enhancement compared to plasmonic ones, which suffer from high Ohmic loss. Here, we designed and fabricated a dielectric Fano resonator based on a pair of silicon (Si) ellipses and a disk, which supports the mode hybridization between quasi-bound-states-in-the-continuum (quasi-BIC) and Mie resonance. We demonstrated the performance of the developed resonant system by interfacing it with single photon emitters (SPEs) based on nitrogen-vacancy (NV-) centers in nanodiamonds (NDs). We observed that the interfaced emitters have a Purcell enhancement factor of ~10, with sub-ns emission lifetime and a polarization contrast of 9. Our results indicate a promising method for developing efficient and compact single-photon sources for integrated quantum photonics applications.
Comment: 20 pages, 4 figures
- Published
- 2024
18. Indirect detection constraints on semi-annihilation of inert scalar multiplets
- Author
-
Beauchesne, Hugues and Chiang, Cheng-Wei
- Subjects
High Energy Physics - Phenomenology
- Abstract
Certain models of inert multiplets allow for semi-annihilation processes, in which two dark matter candidates annihilate to a dark matter particle and a non-dark matter particle. The existence of these processes can alleviate certain constraints and substantially modify the indirect detection signal. In this paper, we study current indirect detection constraints on the semi-annihilation of inert scalar multiplets. We show that there exist gauge numbers for which dark matter can be thermally produced and be compatible with indirect detection constraints even for very cuspy galactic dark matter density profiles.
Comment: 24 pages, 4 figures
- Published
- 2024
19. Antibiotic-Induced Gut Microbiota Dysbiosis Modulates Host Transcriptome and m6A Epitranscriptome via Bile Acid Metabolism.
- Author
-
Yang, Meng, Zheng, Xiaoqi, Fan, Jiajun, Cheng, Wei, Yan, Tong-Meng, Lai, Yushan, Zhang, Nianping, Lu, Yi, Qi, Jiali, Huo, Zhengyi, Xu, Zihe, Huang, Jia, Jiao, Yuting, Liu, Biaodi, Pang, Rui, Zhong, Xiang, Huang, Shi, Luo, Guan-Zheng, Lee, Gina, Jobin, Christian, Eren, A, Chang, Eugene, Wei, Hong, Pan, Tao, and Wang, Xiaoyun
- Subjects
N6‐Methyladenosine, bile acids, epitranscriptome, gut microbiota, transcriptome, Animals, Gastrointestinal Microbiome, Bile Acids and Salts, Dysbiosis, Mice, Transcriptome, Anti-Bacterial Agents, Adenosine, Disease Models, Animal, Mice, Inbred C57BL, Male
- Abstract
Gut microbiota can influence host gene expression and physiology through metabolites. Besides, the presence or absence of gut microbiome can reprogram host transcriptome and epitranscriptome as represented by N6-methyladenosine (m6A), the most abundant mammalian mRNA modification. However, which and how gut microbiota-derived metabolites reprogram host transcriptome and m6A epitranscriptome remain poorly understood. Here, investigation is conducted into how gut microbiota-derived metabolites impact host transcriptome and m6A epitranscriptome using multiple mouse models and multi-omics approaches. Various antibiotics-induced dysbiotic mice are established, followed by fecal microbiota transplantation (FMT) into germ-free mice, and the results show that bile acid metabolism is significantly altered along with the abundance change in bile acid-producing microbiota. Unbalanced gut microbiota and bile acids drastically change the host transcriptome and the m6A epitranscriptome in multiple tissues. Mechanistically, the expression of m6A writer proteins is regulated in animals treated with antibiotics and in cultured cells treated with bile acids, indicating a direct link between bile acid metabolism and m6A biology. Collectively, these results demonstrate that antibiotic-induced gut dysbiosis regulates the landscape of host transcriptome and m6A epitranscriptome via bile acid metabolism pathway. This work provides novel insights into the interplay between microbial metabolites and host gene expression.
- Published
- 2024
20. Neural Network-Assisted End-to-End Design for Dispersive Full-Parameter Control of Meta-Optics
- Author
-
Chi, Hanbin, Hu, Yueqiang, Ou, Xiangnian, Jiang, Yuting, Yu, Dian, Lou, Shaozhen, Wang, Quan, Xie, Qiong, Qiu, Cheng-Wei, and Duan, Huigao
- Subjects
Physics - Optics, Physics - Applied Physics
- Abstract
Flexible control of the light field across multiple parameters is the cornerstone of versatile and miniaturized optical devices. Metasurfaces, comprising subwavelength scatterers, offer a potent platform for executing such precise manipulations. However, the inherent mutual constraints between parameters of metasurfaces make it challenging for traditional approaches to achieve full-parameter control across multiple wavelengths. Here, we propose a universal end-to-end inverse design framework to directly optimize the geometric parameter layout of meta-optics based on the target functionality of full-parameter control across multiple wavelengths. This framework employs a differentiable forward simulator integrating a neural network-based dispersive full-parameter Jones matrix and Fourier propagation to facilitate gradient-based optimization. Its superiority over sequential forward designs in dual-polarization channel color holography with higher quality and tri-polarization three-dimensional color holography with higher multiplexed capacity is showcased. To highlight the universality, we further present polarized spectral multi-information processing with six arbitrary polarizations and three wavelengths. This versatile, differentiable, system-level design framework is poised to expedite the advancement of meta-optics in integrated multi-information display, imaging, and communication, extending to multi-modal sensing applications.
- Published
- 2024
21. Unidirectional Chiral Emission via Twisted Bi-layer Metasurfaces
- Author
-
Gromyko, Dmitrii, An, Shu, Gorelik, Sergey, Xu, Jiahui, Lim, Li Jun, Lee, Henry Yit Loong, Tjiptoharsono, Febiana, Tan, Zhi-Kuang, Qiu, Cheng-Wei, Dong, Zhaogang, and Wu, Lin
- Subjects
Physics - Optics
- Abstract
Controlling and channelling light emissions from unpolarized quantum dots into specific directions with chiral polarization remains a key challenge in modern photonics. Stacked metasurface designs offer a potential compact solution for chirality and directionality engineering. However, experimental observations of directional chiral radiation from resonant metasurfaces with quantum emitters remain obscure. In this paper, we present experimental observations of unidirectional chiral emission from a twisted bi-layer metasurface via multi-dimensional control, including twist angle, interlayer distance, and lateral displacement between the top and bottom layers, as enabled by doublet alignment lithography (DAL). First, maintaining alignment, the metasurface demonstrates a resonant intrinsic optical chirality with near-unity circular dichroism of 0.94 and reflectance difference of 74%, where a high circular dichroism greater than 0.9 persists across a wide range of angles from -11 to 11 degrees. Second, engineered lateral displacement induces a unidirectional chiral resonance, resulting in unidirectional chiral emission from the quantum dots deposited onto the metasurface. Our bi-layer metasurfaces offer a universal compact platform for efficient radiation manipulation over a wide angular range, promising potential applications in miniaturized lasers, grating couplers, and chiral nanoantennas.
Comment: 16 pages, 4 figures
- Published
- 2024
22. Improving Logits-based Detector without Logits from Black-box LLMs
- Author
-
Zeng, Cong, Tang, Shengkun, Yang, Xianjun, Chen, Yuanzhou, Sun, Yiyou, Xu, Zhiqiang, Li, Yao, Chen, Haifeng, Cheng, Wei, and Xu, Dongkuan
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
The advent of Large Language Models (LLMs) has revolutionized text generation, producing outputs that closely mimic human writing. This blurring of lines between machine- and human-written text presents new challenges in distinguishing one from the other, a task further complicated by the frequent updates and closed nature of leading proprietary LLMs. Traditional logits-based detection methods leverage surrogate models for identifying LLM-generated content when the exact logits are unavailable from black-box LLMs. However, these methods grapple with the misalignment between the distributions of the surrogate and the often undisclosed target models, leading to performance degradation, particularly with the introduction of new, closed-source models. Furthermore, while current methodologies are generally effective when the source model is identified, they falter in scenarios where the model version remains unknown, or the test set comprises outputs from various source models. To address these limitations, we present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection even without logits from source LLMs. DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations with minimal training investment. By leveraging corpus samples from publicly accessible outputs of advanced models such as ChatGPT, GPT-4 and Claude-3, DALD fine-tunes surrogate models to synchronize with unknown source model distributions effectively.
- Published
- 2024
23. Ultrafast optical switching to a heterochiral charge-density wave state
- Author
-
Huang, Wayne Cheng-Wei, Mu, Sai, von Witte, Gevin, Li, Yanshuo Sophie, Kurtz, Felix, Hung, Sheng-Hsiung, Jeng, Horng-Tay, Rossnagel, Kai, Horstmann, Jan Gerrit, and Ropers, Claus
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
Optical control of correlated electronic states promises unprecedented tunability of novel functional materials. Tailored optical excitations can steer a system along non-equilibrium pathways to metastable states with specific structural or electronic properties. A much-desired feature is the reproducible and ultrafast switching to functional states. The light-induced hidden state of 1T-TaS$_{2}$, with its strongly enhanced conductivity and exceptionally long lifetime, represents a unique model system for studying the switching of correlated electronic states using femtosecond optical stimuli. However, despite intense investigation, the switching mechanism and the structural origins of the distinctive electronic properties of the hidden state have not been fully uncovered. Here, we use surface-sensitive electron diffraction in combination with a femtosecond optical quench to reveal coexistent charge-density wave chiralities as a new structural feature of the hidden state. We find that a single-pulse optical quench produces a state with long-range structural order and different weights of the two chiral enantiomorphs of the charge-density wave. Harnessing a double-pulse optical quench, we trace the origin of the mixed chirality to the transient electronic excitation of the host crystal. The coexistent long-range-order of both chiralities suggests the presence of extended heterochiral charge-density wave interfaces, which results in a higher-level, fractal-type moir\'{e} superstructure. Density functional theory simulations for such a charge-density wave moir\'{e} superstructure reveal multiple flat bands, Dirac cones, and a kagome electronic subsystem around the Fermi energy. Our findings shed light on novel electronic properties gained by chiral interface engineering, and create avenues for light-induced moir\'{e} superstructures in quasi-two-dimensional materials.
- Published
- 2024
24. MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
- Author
-
Chen, Sijin, Chen, Xin, Pang, Anqi, Zeng, Xianfang, Cheng, Wei, Fu, Yijun, Yin, Fukun, Wang, Yanru, Wang, Zhibin, Zhang, Chi, Yu, Jingyi, Yu, Gang, Fu, Bin, and Chen, Tao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The polygon mesh representation of 3D data exhibits great flexibility, fast rendering speed, and storage efficiency, and is widely preferred in various applications. However, given its unstructured graph representation, the direct generation of high-fidelity 3D meshes is challenging. Fortunately, with a pre-defined ordering strategy, 3D meshes can be represented as sequences, and the generation process can be seamlessly treated as an auto-regressive problem. In this paper, we validate that the Neural Coordinate Field (NeurCF), an explicit coordinate representation with implicit neural embeddings, is a simple-yet-effective representation for large-scale sequential mesh modeling. After that, we present MeshXL, a family of generative pre-trained auto-regressive models, which addresses the process of 3D mesh generation with modern large language model approaches. Extensive experiments show that MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
- Published
- 2024
25. Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion
- Author
-
Cheng, Wei, Wu, Yuhan, and Hu, Wei
- Subjects
Computer Science - Software Engineering ,Computer Science - Computation and Language - Abstract
Recent years have witnessed the deployment of code language models (LMs) in various code intelligence tasks such as code completion. Yet, it is challenging for pre-trained LMs to generate correct completions in private repositories. Previous studies retrieve cross-file context based on import relations or text similarity, which is insufficiently relevant to completion targets. In this paper, we propose a dataflow-guided retrieval augmentation approach, called DraCo, for repository-level code completion. DraCo parses a private repository into code entities and establishes their relations through an extended dataflow analysis, forming a repo-specific context graph. Whenever triggering code completion, DraCo precisely retrieves relevant background knowledge from the repo-specific context graph and generates well-formed prompts to query code LMs. Furthermore, we construct a large Python dataset, ReccEval, with more diverse completion targets. Our experiments demonstrate the superior accuracy and applicable efficiency of DraCo, improving code exact match by 3.43% and identifier F1-score by 3.27% on average compared to the state-of-the-art approach., Comment: Accepted in the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)
- Published
- 2024
26. Low-Resource Crop Classification from Multi-Spectral Time Series Using Lossless Compressors
- Author
-
Cheng, Wei, Ye, Hongrui, Wen, Xiao, Zhang, Jiachen, Xu, Jiping, and Zhang, Feifan
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Deep learning has significantly improved the accuracy of crop classification using multispectral temporal data. However, these models have complex structures with numerous parameters, requiring large amounts of data and costly training. In low-resource situations with fewer labeled samples, deep learning models perform poorly due to insufficient data. Conversely, compressors are data-type agnostic, and non-parametric methods do not bring underlying assumptions. Inspired by this insight, we propose a non-training alternative to deep learning models, aiming to address these situations. Specifically, the Symbolic Representation Module is proposed to convert the reflectivity into symbolic representations. The symbolic representations are then cross-transformed in both the channel and time dimensions to generate symbolic embeddings. Next, the Multi-scale Normalised Compression Distance (MNCD) is designed to measure the correlation between any two symbolic embeddings. Finally, based on the MNCDs, high-quality crop classification can be achieved using only a k-nearest-neighbor (kNN) classifier. The entire framework is ready-to-use and lightweight. Without any training, it outperformed, on average, 7 advanced deep learning models trained at scale on three benchmark datasets. It also outperforms more than half of these models in the few-shot setting with sparse crop labels. Therefore, the high performance and robustness of our non-training framework make it truly applicable to real-world crop mapping. Codes are available at: https://github.com/qinfengsama/Compressor-Based-Crop-Mapping., Comment: 8 pages, 10 figures
- Published
- 2024
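The compressor-plus-kNN idea in the abstract above can be illustrated with the standard (single-scale) Normalised Compression Distance. This is only a minimal sketch under stated assumptions: it uses `zlib` as the compressor and plain byte strings as inputs, and the helper names (`compressed_size`, `ncd`, `knn_predict`) are hypothetical. It does not reproduce the paper's Symbolic Representation Module or its multi-scale MNCD.

```python
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard Normalised Compression Distance:
    (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def knn_predict(query: bytes, train: list[tuple[bytes, str]], k: int = 3) -> str:
    """Majority vote over the k training samples closest to the query in NCD."""
    nearest = sorted(train, key=lambda item: ncd(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

No training step is involved: classification only requires compressing pairs of sequences, which is the property the abstract refers to as "non-training".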
27. Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search
- Author
-
Liu, Max, Yu, Chan-Hung, Lee, Wei-Hsu, Hung, Cheng-Wei, Chen, Yen-Chun, and Sun, Shao-Hua
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Programming Languages - Abstract
Programmatic reinforcement learning (PRL) has been explored for representing policies through programs as a means to achieve interpretability and generalization. Despite promising outcomes, current state-of-the-art PRL methods are hindered by sample inefficiency, necessitating tens of millions of program-environment interactions. To tackle this challenge, we introduce a novel LLM-guided search framework (LLM-GS). Our key insight is to leverage the programming expertise and common sense reasoning of LLMs to enhance the efficiency of assumption-free, random-guessing search methods. We address the challenge of LLMs' inability to generate precise and grammatically correct programs in domain-specific languages (DSLs) by proposing a Pythonic-DSL strategy - an LLM is instructed to initially generate Python codes and then convert them into DSL programs. To further optimize the LLM-generated programs, we develop a search algorithm named Scheduled Hill Climbing, designed to efficiently explore the programmatic search space to consistently improve the programs. Experimental results in the Karel domain demonstrate the superior effectiveness and efficiency of our LLM-GS framework. Extensive ablation studies further verify the critical role of our Pythonic-DSL strategy and Scheduled Hill Climbing algorithm.
- Published
- 2024
28. Pruning as a Domain-specific LLM Extractor
- Author
-
Zhang, Nan, Liu, Yanchi, Zhao, Xujiang, Cheng, Wei, Bao, Runxue, Zhang, Rui, Mitra, Prasenjit, and Chen, Haifeng
- Subjects
Computer Science - Computation and Language - Abstract
Large Language Models (LLMs) have exhibited remarkable proficiency across a wide array of NLP tasks. However, the escalation in model size also engenders substantial deployment costs. While a few efforts have explored model pruning techniques to reduce the size of LLMs, they mainly center on general or task-specific weights. This leads to suboptimal performance due to lacking specificity on the target domain or generality on different tasks when applied to domain-specific challenges. This work introduces an innovative unstructured dual-pruning methodology, D-Pruner, for domain-specific compression of LLMs. It extracts a compressed, domain-specific, and task-agnostic LLM by identifying LLM weights that are pivotal for general capabilities, like linguistic capability and multi-task solving, and domain-specific knowledge. More specifically, we first assess general weight importance by quantifying the error incurred upon their removal with the help of an open-domain calibration dataset. Then, we utilize this general weight importance to refine the training loss, so that it preserves generality when fitting into a specific domain. Moreover, by efficiently approximating weight importance with the refined training loss on a domain-specific calibration dataset, we obtain a pruned model emphasizing generality and specificity. Our comprehensive experiments across various tasks in healthcare and legal domains show the effectiveness of D-Pruner in domain-specific compression. Our code is available at https://github.com/psunlpgroup/D-Pruner., Comment: NAACL 2024 Findings
- Published
- 2024
29. Protecting Your LLMs with Information Bottleneck
- Author
-
Liu, Zichuan, Wang, Zefan, Xu, Linjie, Wang, Jinyu, Song, Lei, Wang, Tianchun, Chen, Chunlin, Cheng, Wei, and Bian, Jiang
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Cryptography and Security - Abstract
The advent of large language models (LLMs) has revolutionized the field of natural language processing, yet they might be attacked to produce harmful content. Despite efforts to ethically align LLMs, these are often fragile and can be circumvented by jailbreaking attacks through optimized or manual adversarial prompts. To address this, we introduce the Information Bottleneck Protector (IBProtector), a defense mechanism grounded in the information bottleneck principle, and we modify the objective to avoid trivial solutions. The IBProtector selectively compresses and perturbs prompts, facilitated by a lightweight and trainable extractor, preserving only essential information for the target LLMs to respond with the expected answer. Moreover, we further consider a situation where the gradient is not visible to be compatible with any LLM. Our empirical evaluations show that IBProtector outperforms current defense methods in mitigating jailbreak attempts, without overly affecting response quality or inference speed. Its effectiveness and adaptability across various attack methods and target LLMs underscore the potential of IBProtector as a novel, transferable defense that bolsters the security of LLMs without requiring modifications to the underlying models., Comment: 23 pages, 7 figures, 8 tables
- Published
- 2024
30. Δυσχέρεια and Ἀπορία: The Formation of a Philosophical Term
- Author
-
Cheng, Wei
- Published
- 2018
31. Quantum Advantage of One-Way Squeezing in Enhancing Weak-Force Sensing
- Author
-
Wang, Jie, Zhang, Qian, Jiao, Ya-Feng, Zhang, Sheng-Dian, Lu, Tian-Xiang, Li, Zhipeng, Qiu, Cheng-Wei, and Jing, Hui
- Subjects
Quantum Physics - Abstract
Cavity optomechanical (COM) sensors, featuring efficient light-motion couplings, have been widely used for ultrasensitive measurements of various physical quantities ranging from displacements to accelerations or weak forces. Previous works, however, have mainly focused on reciprocal COM systems. Here, we propose how to further improve the performance of quantum COM sensors by breaking reciprocal symmetry in the purely quantum regime. Specifically, we consider a spinning COM resonator and show that by selectively driving it in opposite directions, highly nonreciprocal optical squeezing can emerge, which in turn provides an efficient way to surpass the standard quantum limit that otherwise exists in conventional reciprocal devices. Our work confirms that breaking reciprocal symmetry, already achieved in diverse systems well beyond spinning systems, can serve as a new strategy to further enhance the abilities of advanced quantum sensors, for applications ranging from testing fundamental physical laws to practical quantum metrology., Comment: 7 pages, 3 figures
- Published
- 2024
32. Novel quantum spin liquid ground state in the trimer rhodate Ba$_4$NbRh$_3$O$_{12}$
- Author
-
Bandyopadhyay, Abhisek, Lee, S., Adroja, D. T., Stenning, G. B. G., Berlie, Adam, Lees, M. R., Saha, R. A., Takegami, D., Melendez-Sans, A., Poelchen, G., Yoshimura, M., Tsuei, K. D., Hu, Z., Kao, Cheng-Wei, Huang, Yu-Cheng, Chan, Ting-Shan, and Cho, Kwang-Yong
- Subjects
Condensed Matter - Strongly Correlated Electrons ,Condensed Matter - Materials Science - Abstract
Frustrated magnets offer a plethora of exotic magnetic ground states, including quantum spin liquids (QSLs), in which enhanced quantum fluctuations prevent a long-range magnetic ordering of the strongly correlated spins down to lowest temperature. Here we have investigated the trimer based mixed valence hexagonal rhodate Ba$_4$NbRh$_3$O$_{12}$ using a combination of dc and ac magnetization, electrical resistivity, specific heat, and muon spin rotation/relaxation ($\mu$SR) measurements. Despite the substantial antiferromagnetic exchange interactions, as evident from the Weiss temperature ($\theta_{\mathrm{W}}\sim -35$ to -45 K), among the Rh-local moments, neither long-range magnetic ordering nor spin-freezing is observed down to at least 50 mK, in ac-susceptibility, specific heat and ZF-$\mu$SR measurements (down to 0.26 K). We ascribe the absence of any magnetic transition to enhanced quantum fluctuations as a result of geometrical frustration arising out of the edge-sharing equilateral Rh-triangular network in the structure. Our longitudinal-field $\mu$SR result evidences persistent spin fluctuations down to 0.26~K, thus stabilizing a dynamic QSL ground state in Ba$_4$NbRh$_3$O$_{12}$. Furthermore, the magnetic specific heat ($C_{\mathrm{m}}$) data at low-$T$ reveal a significant $T$-linear contribution plus a quadratic $T$-dependence. A $T$-linear behavior is evocative of gapless spin excitations, while the $T^2$-term of $C_{\mathrm{m}}$ may indicate the Dirac QSL phenomenology of the spinon excitations with a linear dispersion., Comment: 21 pages, 11 figures
- Published
- 2024
33. Rare $B$ and $K$ decays in a scotogenic model
- Author
-
Chen, Chuan-Hung and Chiang, Cheng-Wei
- Subjects
High Energy Physics - Phenomenology ,High Energy Physics - Experiment - Abstract
A scotogenic model can radiatively generate the observed neutrino mass, provide a dark matter candidate, and lead to rare lepton flavor-violating processes. We aim to extend the model to establish a potential connection to the quark flavor-related processes within the framework of scotogenesis, enhancing the unexpectedly large branching ratio (BR) of $B^+\to K^+ \nu \bar\nu$, observed by Belle II Collaboration. Meanwhile, the model can address tensions between some experimental measurements and standard model (SM) predictions in flavor physics, such as the muon $g-2$ excess and the higher BR of $B_s \to \mu^- \mu^+$. We introduce in the model the following dark particles: a neutral singlet Dirac-type lepton ($N$); two inert Higgs doublets ($\eta_{1,2}$), with one of which carrying a lepton number; a charged singlet dark scalar $(\chi^+)$, and a singlet vector-like up-type dark quark ($T$). The first two entities are responsible for the radiative neutrino mass, and $\chi^+$ couples to right-handed quarks and leptons and can resolve the tensions existing in muon $g-2$ and $B_s\to \mu^- \mu^+$. Furthermore, the BR of $B^+ \to K^+ \nu \bar\nu$ can be enhanced up to a factor of 2 compared to the SM prediction through the mediations of the dark $T$ and the charged scalars. In addition, we also study the impacts on the $K\to \pi \nu \bar\nu$ decays., Comment: 34 pages, 6 figures, references added, text revised
- Published
- 2024
34. Dark matter semi-annihilation for inert scalar multiplets
- Author
-
Beauchesne, Hugues and Chiang, Cheng-Wei
- Subjects
High Energy Physics - Phenomenology - Abstract
Dark matter semi-annihilation is a process through which two dark matter candidates annihilate to a single dark matter particle and a non-dark matter particle. Such processes are common when the symmetry stabilizing the dark matter differs from $\mathbb{Z}_2$ and can lead to qualitatively different phenomenology. In this work, we study the viability of semi-annihilation models including one or two inert multiplets. For one multiplet, we show that there does not exist any viable model in which semi-annihilation is efficient. For two multiplets, semi-annihilation can be efficient, but the number of viable and technically natural models is limited. We then perform a detailed study of the most promising model, showing that the correct relic abundance can be obtained for a wide range of masses., Comment: 26 pages, 4 figures, references added, treatment of SE improved, matches published version
- Published
- 2024
35. InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration
- Author
-
Wang, Fali, Bao, Runxue, Wang, Suhang, Yu, Wenchao, Liu, Yanchi, Cheng, Wei, and Chen, Haifeng
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Though Large Language Models (LLMs) have shown remarkable open-generation capabilities across diverse domains, they struggle with knowledge-intensive tasks. To alleviate this issue, knowledge integration methods have been proposed to enhance LLMs with domain-specific knowledge graphs using external modules. However, they suffer from data inefficiency as they require both known and unknown knowledge for fine-tuning. Thus, we study a novel problem of integrating unknown knowledge into LLMs efficiently without unnecessary overlap of known knowledge. Injecting new knowledge poses the risk of forgetting previously acquired knowledge. To tackle this, we propose a novel Infuser-Guided Knowledge Integration (InfuserKI) framework that utilizes transformer internal states to determine whether to enhance the original LLM output with additional information, thereby effectively mitigating knowledge forgetting. Evaluations on the UMLS-2.5k and MetaQA domain knowledge graphs demonstrate that InfuserKI can effectively acquire new knowledge and outperform state-of-the-art baselines by 9% and 6%, respectively, in reducing knowledge forgetting., Comment: 12 pages, 5 figures
- Published
- 2024
36. Parametric Augmentation for Time Series Contrastive Learning
- Author
-
Zheng, Xu, Wang, Tianchun, Cheng, Wei, Ma, Aitian, Chen, Haifeng, Sha, Mo, and Luo, Dongsheng
- Subjects
Computer Science - Machine Learning - Abstract
Modern techniques like contrastive learning have been effectively used in many areas, including computer vision, natural language processing, and graph-structured data. Creating positive examples that assist the model in learning robust and discriminative representations is a crucial stage in contrastive learning approaches. Usually, preset human intuition directs the selection of relevant data augmentations. Due to patterns that are easily recognized by humans, this rule of thumb works well in the vision and language domains. However, it is impractical to visually inspect the temporal structures in time series. The diversity of time series augmentations at both the dataset and instance levels makes it difficult to choose meaningful augmentations on the fly. In this study, we address this gap by analyzing time series data augmentation using information theory and summarizing the most commonly adopted augmentations in a unified format. We then propose a contrastive learning framework with parametric augmentation, AutoTCL, which can be adaptively employed to support time series representation learning. The proposed approach is encoder-agnostic, allowing it to be seamlessly integrated with different backbone encoders. Experiments on univariate forecasting tasks demonstrate the highly competitive results of our method, with an average 6.5\% reduction in MSE and 4.7\% in MAE over the leading baselines. In classification tasks, AutoTCL achieves a $1.2\%$ increase in average accuracy., Comment: Accepted by International Conference on Learning Representations (ICLR 2024)
- Published
- 2024
37. Uncertainty Quantification for In-Context Learning of Large Language Models
- Author
-
Ling, Chen, Zhao, Xujiang, Zhang, Xuchao, Cheng, Wei, Liu, Yanchi, Sun, Yiyou, Oishi, Mika, Osaki, Takao, Matsuda, Katsushi, Ji, Jie, Bai, Guangji, Zhao, Liang, and Chen, Haifeng
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs) and revolutionized various fields by providing a few task-relevant demonstrations in the prompt. However, trustworthiness issues with LLMs' responses, such as hallucination, have also been actively discussed. Existing works have been devoted to quantifying the uncertainty in LLMs' responses, but they often overlook the complex nature of LLMs and the uniqueness of in-context learning. In this work, we delve into the predictive uncertainty of LLMs associated with in-context learning, highlighting that such uncertainties may stem from both the provided demonstrations (aleatoric uncertainty) and ambiguities tied to the model's configurations (epistemic uncertainty). We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties. The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion. Extensive experiments are conducted to demonstrate the effectiveness of the decomposition. The code and data are available at: https://github.com/lingchen0331/UQ_ICL., Comment: Accepted to the main conference of NAACL 2024
- Published
- 2024
38. PAC Learnability under Explanation-Preserving Graph Perturbations
- Author
-
Zheng, Xu, Shirani, Farhad, Wang, Tianchun, Gao, Shouwei, Dong, Wenqian, Cheng, Wei, and Luo, Dongsheng
- Subjects
Computer Science - Machine Learning - Abstract
Graphical models capture relations between entities in a wide range of applications including social networks, biology, and natural language processing, among others. Graph neural networks (GNN) are neural models that operate over graphs, enabling the model to leverage the complex relationships and dependencies in graph-structured data. A graph explanation is a subgraph which is an `almost sufficient' statistic of the input graph with respect to its classification label. Consequently, the classification label is invariant, with high probability, to perturbations of graph edges not belonging to its explanation subgraph. This work considers two methods for leveraging such perturbation invariances in the design and training of GNNs. First, explanation-assisted learning rules are considered. It is shown that the sample complexity of explanation-assisted learning can be arbitrarily smaller than explanation-agnostic learning. Next, explanation-assisted data augmentation is considered, where the training set is enlarged by artificially producing new training samples via perturbation of the non-explanation edges in the original training set. It is shown that such data augmentation methods may improve performance if the augmented data is in-distribution, however, it may also lead to worse sample complexity compared to explanation-agnostic learning rules if the augmented data is out-of-distribution. Extensive empirical evaluations are provided to verify the theoretical analysis., Comment: 21 pages, 6 figures, 4 tables
- Published
- 2024
39. DFA-RAG: Conversational Semantic Router for Large Language Model with Definite Finite Automaton
- Author
-
Sun, Yiyou, Hu, Junjie, Cheng, Wei, and Chen, Haifeng
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
This paper introduces the retrieval-augmented large language model with Definite Finite Automaton (DFA-RAG), a novel framework designed to enhance the capabilities of conversational agents using large language models (LLMs). Traditional LLMs face challenges in generating regulated and compliant responses in special scenarios with predetermined response guidelines, like emotional support and customer service. Our framework addresses these challenges by embedding a Definite Finite Automaton (DFA), learned from training dialogues, within the LLM. This structured approach acts as a semantic router which enables the LLM to adhere to a deterministic response pathway. The routing is achieved by the retrieval-augmentation generation (RAG) strategy, which carefully selects dialogue examples aligned with the current conversational context. The advantages of DFA-RAG include an interpretable structure through human-readable DFA, context-aware retrieval for responses in conversations, and plug-and-play compatibility with existing LLMs. Extensive benchmarks validate DFA-RAG's effectiveness, indicating its potential as a valuable contribution to the conversational agent., Comment: Accepted to ICML 2024
- Published
- 2024
40. Optimal transport in the frame of abstract Lax-Oleinik operator revisited
- Author
-
Cheng, Wei, Hong, Jiahui, and Shi, Tianqi
- Subjects
Mathematics - Analysis of PDEs ,Mathematics - Dynamical Systems - Abstract
This is our first paper on the extension of our recent work on the Lax-Oleinik commutators and its applications to the intrinsic approach of propagation of singularities of the viscosity solutions of Hamilton-Jacobi equations. We reformulate Kantorovich-Rubinstein duality theorem in the theory of optimal transport in terms of abstract Lax-Oleinik operators, and analyze the relevant optimal transport problem in the case the cost function $c(x,y)=h(t_1,t_2,x,y)$ is the fundamental solution of Hamilton-Jacobi equation. For further applications to the problem of cut locus and propagation of singularities in optimal transport, we introduce corresponding random Lax-Oleinik operators. We also study the problem of singularities for $c$-concave functions and its dynamical implication when $c$ is the fundamental solution with $t_2-t_1\ll1$ and $t_2-t_1<\infty$, and $c$ is the Peierls' barrier respectively.
- Published
- 2024
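For orientation, the duality that the abstract above reformulates can be written in its general Kantorovich form. This is the standard textbook statement, given here only as a hedged reminder: the symbols $\mu$, $\nu$, $\Pi(\mu,\nu)$, $\phi$ and $\psi$ are notation assumed for this note, and the paper's own formulation in terms of abstract Lax-Oleinik operators is not reproduced.

```latex
\inf_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\, d\pi(x,y)
  \;=\;
\sup_{\phi(y) - \psi(x) \,\le\, c(x,y)}
  \left( \int \phi \, d\nu \;-\; \int \psi \, d\mu \right)
```

Here $\Pi(\mu,\nu)$ denotes the set of couplings of the probability measures $\mu$ and $\nu$, and $c$ is the cost function, which the abstract takes to be $c(x,y)=h(t_1,t_2,x,y)$.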
41. TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution
- Author
-
Hua, Wenyue, Yang, Xianjun, Jin, Mingyu, Cheng, Wei, Tang, Ruixiang, and Zhang, Yongfeng
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning ,Computer Science - Multiagent Systems - Abstract
The rise of LLM-based agents shows great potential to revolutionize task planning, capturing significant attention. Given that these agents will be integrated into high-stake domains, ensuring their reliability and safety is crucial. This paper presents an Agent-Constitution-based agent framework, TrustAgent, with a particular focus on improving the LLM-based agent safety. The proposed framework ensures strict adherence to the Agent Constitution through three strategic components: pre-planning strategy which injects safety knowledge to the model before plan generation, in-planning strategy which enhances safety during plan generation, and post-planning strategy which ensures safety by post-planning inspection. Our experimental results demonstrate that the proposed framework can effectively enhance an LLM agent's safety across multiple domains by identifying and mitigating potential dangers during the planning. Further analysis reveals that the framework not only improves safety but also enhances the helpfulness of the agent. Additionally, we highlight the importance of the LLM reasoning ability in adhering to the Constitution. This paper sheds light on how to ensure the safe integration of LLM-based agents into human-centric environments. Data and code are available at https://github.com/agiresearch/TrustAgent.
- Published
- 2024
42. Experimental test of the Crooks fluctuation theorem in a single nuclear spin
- Author
-
Cheng, Wei, Liu, Wenquan, Niu, Zhibo, Duan, Chang-Kui, Rong, Xing, and Du, Jiangfeng
- Subjects
Quantum Physics - Abstract
We experimentally test the Crooks fluctuation theorem in a quantum spin system. Our results show that the Crooks fluctuation theorem is valid for different speeds of the nonequilibrium processes and under various effective temperatures. Work is not an observable in quantum systems, which makes tests of quantum thermodynamic theorems challenging. In this work, we developed high-fidelity single-shot readouts of a single nuclear spin in diamond and implemented the two-point work measurement protocol, enabling a direct experimental test of the Crooks fluctuation theorem. Our results provide a quantum insight into fluctuations and the methods we developed can be utilized to study other quantum thermodynamic theorems.
- Published
- 2024
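For readers unfamiliar with the relation being tested in the abstract above, the Crooks fluctuation theorem is usually stated as follows. This is the standard textbook form with notation assumed for this note ($P_F$, $P_R$, $\beta$, $\Delta F$, $E^i_n$, $E^f_m$); it is not taken from the paper.

```latex
\frac{P_F(W)}{P_R(-W)} = e^{\beta\,(W - \Delta F)},
\qquad
W = E^{f}_{m} - E^{i}_{n}
```

$P_F$ and $P_R$ are the work distributions of the forward and time-reversed processes, $\beta$ is the inverse temperature, and $\Delta F$ is the free-energy difference between the final and initial equilibrium states. In the two-point measurement protocol, the work $W$ for a single run is the difference between the energy eigenvalue $E^{f}_{m}$ measured at the end and the eigenvalue $E^{i}_{n}$ measured at the start, which is how work is defined even though it is not itself an observable.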
43. Observation of quantum strong Mpemba effect
- Author
-
Zhang, Jie, Xia, Gang, Wu, Chun-Wang, Chen, Ting, Zhang, Qian, Xie, Yi, Su, Wen-Bo, Wu, Wei, Qiu, Cheng-Wei, Chen, Ping-xing, Li, Weibin, Jing, Hui, and Zhou, Yan-Li
- Subjects
Quantum Physics - Abstract
An ancient and counterintuitive phenomenon known as the Mpemba effect (water can cool faster when initially heated up) showcases the critical role of initial conditions in relaxation processes. How to realize and utilize this effect for speeding up relaxation has so far remained an important but challenging task in purely quantum systems. Here, we report the first experiment, as far as we know, on the strong Mpemba effect in a single trapped ion system, in which an exponentially expedited relaxation in time is observed by preparing an optimal initial state with no excitation of the slowest decaying mode. Also, we find that the condition for realizing such an effect coincides with the Liouvillian exceptional point, featuring the coalescence of both the eigenvalues and the eigenmodes of the system. Our work provides an efficient strategy to exponentially accelerate the relaxation of quantum systems to their stationary states, and suggests an as-yet unexplored link between the Mpemba effect and non-Hermitian physics. It could open the door to engineering a wide range of dissipative quantum systems by utilizing the anomalous Mpemba effect, for applications in quantum simulation and quantum information processing.
- Published
- 2024
44. Deep Learning to Improve the Sensitivity of Di-Higgs Searches in the $4b$ Channel
- Author
-
Chiang, Cheng-Wei, Hsieh, Feng-Yang, Hsu, Shih-Chieh, and Low, Ian
- Subjects
High Energy Physics - Phenomenology ,High Energy Physics - Experiment - Abstract
The study of di-Higgs events, both resonant and non-resonant, plays a crucial role in understanding the fundamental interactions of the Higgs boson. In this work we consider di-Higgs events decaying into four $b$-quarks and propose to improve the experimental sensitivity by utilizing a novel machine learning algorithm known as Symmetry Preserving Attention Network (\textsc{Spa-Net}) -- a neural network structure whose architecture is designed to incorporate the inherent symmetries in particle reconstruction tasks. We demonstrate that the \textsc{Spa-Net} can enhance the experimental reach over baseline methods such as the cut-based and the Deep Neural Network (DNN)-based analyses. At the Large Hadron Collider, with a 14-TeV centre-of-mass energy and an integrated luminosity of 300 fb$^{-1}$, the \textsc{Spa-Net} allows us to establish 95\% C.L. upper limits on resonant production cross-sections that are 10\% to 45\% stronger than baseline methods. For non-resonant di-Higgs production, \textsc{Spa-Net} enables us to place a constraint on the self-coupling that is 9\% more stringent than the baseline method.
- Published
- 2024
45. Exploring the Impact of Dissipation Coefficient in Warm Higgs Inflation
- Author
-
Cheng, Wei, Chen, Xue-Wen, Zhou, Ruiyu, Jiang, Jiu-Jiang, Dai, Xin-Rui, Zhang, Zi-Han, and Qin, Tong
- Subjects
Astrophysics - Cosmology and Nongalactic Astrophysics ,Astrophysics - High Energy Astrophysical Phenomena - Abstract
In this study, we conducted a detailed analysis of the core parameter of Warm Higgs Inflation (WHI): the dissipation coefficient ($Q$). As a crucial parameter in the warm inflation process, $Q$ exerts a profound influence on the entire evolutionary process. By carefully deriving the relationships between various quantities and $Q$, we avoided the common prior assumptions of strong or weak dissipation, laying the foundation for a more accurate exploration of their interconnections. Taking into account the constraints imposed by the Cosmic Microwave Background, we observed that the dissipation coefficient $Q$ remains at extremely low levels throughout the entire warm inflation process, i.e., $Q \ll 1$. This observation indicates that WHI falls under the category of weakly dissipative warm inflation. Despite being weakly dissipative, $Q$ still plays a crucial role in the evolution of temperature, energy, and other quantities, and is therefore non-negligible. We further examined the impact of the primordial power spectrum on the dissipation coefficient $Q$ during the warm inflation process, finding that the dependence is not significant. Consequently, the gravitational wave power spectrum also depends only weakly on $Q$. Finally, we found that gravitational waves generated by WHI hold the potential for verification in future observational experiments, especially through the SKA100 experiment. These findings provide theoretical support for a more profound understanding of the early evolution of the universe., Comment: 16 pages, 5 figures
- Published
- 2024
46. Updated analysis of $D\to PP, V\!P$ and $VV$ decays: Implications for $K_S^0-K_L^0$ asymmetries and $D^0$-$\overline {D}^0$ mixing
- Author
-
Cheng, Hai-Yang and Chiang, Cheng-Wei
- Subjects
High Energy Physics - Phenomenology ,High Energy Physics - Experiment - Abstract
An updated analysis of the two-body $D\to PP, V\!P$ and $VV$ decays within the framework of the topological diagram approach is performed. A global fit to the Cabibbo-favored (CF) modes in the $V\!P$ sector gives many solutions with similarly small local minima in $\chi^2$. The solution degeneracy is lifted once we use them to make predictions for the singly Cabibbo-suppressed (SCS) modes. Topological amplitudes are extracted for the $\eta-\eta'$ mixing angles $\phi=40.4^\circ$ and $43.5^\circ$. The $K_S^0-K_L^0$ asymmetries in $D\to K_{S,L}^0M$ decays denoted by $R(D,M)$ are studied. While the predicted $R(D^0,P)$ for $P=\pi^0, \eta$ and $\eta'$ agree with experiment, the calculated $R(D^+,\pi^+)$, $R(D_s^+, K^+)$, $R(D^0,\omega)$ and $R(D^0,\phi)$ deviate from the data. We conjecture that the relative phase between the topological amplitudes $(C+A)$ and $(T+C)$ should be slightly smaller than $90^\circ$ in order to explain the first two discrepancies and that additional singlet contributions due to the SU(3)-singlet nature of $\omega$ and $\phi$ are needed to account for the last two. For doubly Cabibbo-suppressed (DCS) $D\to V\!P$ decays, their topological amplitudes (double-primed) cannot be all the same as the corresponding ones in the CF modes. The assumption of $E_{V,P}''=E_{V,P}$ for the $W$-exchange amplitude leads to some inconsistencies with experiment. Through the measured relative phases between CF and DCS channels, the relations of $E_{V,P}''$ with $E_{V,P}$ are determined. Long-distance contributions to the $D^0$-$\overline {D}^0$ mixing parameter $y$ are evaluated in the exclusive approach. In particular, we focus on $D\to PP$ and $V\!P$ decays where $y$ can be reliably estimated. We conclude that $y_{_{P\!P}}\sim (0.110\pm 0.011)\%$ and the lower bound on $y_{_{V\!P}}$ is $(0.220\pm 0.071)\%$., Comment: 33 pages, accepted by PRD
- Published
- 2024
47. Impact of non-thermal phase-space distributions on dark matter abundance in secluded sectors
- Author
-
Beauchesne, Hugues and Chiang, Cheng-Wei
- Subjects
High Energy Physics - Phenomenology - Abstract
Many new physics models include secluded sectors that interact little with the Standard Model and whose internal interactions control the dark matter abundance. If these same interactions are responsible for maintaining kinematic equilibrium within the secluded sector, it is possible that the phase-space distributions will differ considerably from their thermal values during freeze-out. This can potentially result in deviations of the dark matter abundance from that computed under the assumption of thermal distributions. In this paper, we revisit dark matter abundance computations for a benchmark secluded sector by numerically tracking the phase-space distributions. Namely, we show that the dark matter abundance can deviate considerably from standard results during the freeze-out process, but that a longer period of annihilation ultimately leaves only a slight excess., Comment: 14 pages, 2 figures
- Published
- 2024
48. Neutrophils in Atopic Dermatitis
- Author
-
Chiang, Chih-Chao, Cheng, Wei-Jen, Dela Cruz, Joseph Renz Marion Santiago, Raviraj, Thiyagarajan, Wu, Nan-Lin, Korinek, Michal, and Hwang, Tsong-Long
- Published
- 2024
49. Identification and characterization of hull-less barley (Hordeum vulgare L.) germplasms for salt tolerance
- Author
-
Sreesaeng, Jakkrit, Qiu, Cheng-Wei, Zhang, Shuo, Shi, Shou-Heng, Luo, Liming, Holford, Paul, and Wu, Feibo
- Published
- 2024
50. Super-resolved snapshot hyperspectral imaging of solid-state quantum emitters for high-throughput integrated quantum technologies
- Author
-
Liu, Shunfa, Li, Xueshi, Liu, Hanqing, Qiu, Guixin, Ma, Jiantao, Nie, Liang, Meng, Yun, Hu, Xiaolong, Ni, Haiqiao, Niu, Zhichuan, Qiu, Cheng-Wei, Wang, Xuehua, and Liu, Jin
- Published
- 2024
Discovery Service for Jio Institute Digital Library