54,979 results on '"LI, Jia"'
Search Results
2. Sifting through the Chaff: On Utilizing Execution Feedback for Ranking the Generated Code Candidates
- Author
- Sun, Zhihong, Wan, Yao, Li, Jia, Zhang, Hongyu, Jin, Zhi, Li, Ge, and Lyu, Chen
- Subjects
- Computer Science - Software Engineering
- Abstract
Large Language Models (LLMs), such as GPT-4, StarCoder, and CodeLlama, are transforming the way developers approach programming by automatically generating code based on given natural language descriptions. Despite advancements, generating syntactically and semantically correct code remains challenging, especially for complex programming tasks. Typically, individuals generate multiple candidate solutions using LLMs to increase the likelihood of producing correct code. However, selecting the correct code from these candidates, a process known as code ranking, remains a major challenge. Current research on code ranking can be categorized into execution-based and non-execution-based methods. Execution-based methods, although effective, encounter notable limitations, such as scarcity of quality unit tests and security risks. Non-execution-based methods like CodeRanker, which rely solely on classification labels to train a code ranker, struggle to capture subtle errors and provide detailed error insights. Recognizing the strengths and limitations of both approaches, we propose a new method. The key insight of our work is that an effective code ranker is expected to genuinely comprehend the underlying causes of erroneous code, as relying solely on classification labels is insufficient. Inspired by this, this paper puts forward RankEF, an innovative approach for code ranking that leverages execution feedback. RankEF employs multi-task learning to integrate code classification with execution feedback generation. This approach enables the model to understand the reasons behind incorrect code, distinguishing between correct and incorrect solutions without the need to execute the code during the ranking phase. Experiments on three code generation benchmarks demonstrate that RankEF significantly outperforms the state-of-the-art CodeRanker. Comment: Accepted by the 39th IEEE/ACM International Conference on Automated Software Engineering (ASE 2024)
- Published
- 2024
3. Tackling Noisy Clients in Federated Learning with End-to-end Label Correction
- Author
- Jiang, Xuefeng, Sun, Sheng, Li, Jia, Xue, Jingjing, Li, Runhan, Wu, Zhiyuan, Xu, Gang, Wang, Yuwei, and Liu, Min
- Subjects
- Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Recently, federated learning (FL) has achieved wide success in diverse privacy-sensitive applications without sacrificing the sensitive private information of clients. However, the data quality of client datasets cannot be guaranteed, since the corresponding annotations of different clients often contain complex label noise of varying degrees, which inevitably causes performance degradation. Intuitively, the performance degradation is dominated by clients with higher noise rates, since their trained models contain more misinformation from data; thus it is necessary to devise an effective optimization scheme to mitigate the negative impacts of these noisy clients. In this work, we propose a two-stage framework, FedELC, to tackle this complicated label noise issue. The first stage aims to guide the detection of noisy clients with higher label noise, while the second stage aims to correct the labels of noisy clients' data via an end-to-end label correction framework, which is achieved by learning possible ground-truth labels of noisy clients' datasets via back propagation. We implement sixteen related methods and evaluate on five datasets with three types of complicated label noise scenarios for a comprehensive comparison. Extensive experimental results demonstrate that our proposed framework achieves superior performance to its counterparts in different scenarios. Additionally, we effectively improve the data quality of detected noisy clients' local datasets with our label correction framework. The code is available at https://github.com/Sprinter1999/FedELC. Comment: To appear in ACM CIKM'24 full research paper track
- Published
- 2024
4. Exceptional features in nonlinear Hermitian systems
- Author
- Fang, Liang, Bai, Kai, Guo, Cheng, Liu, Tian-Rui, Li, Jia-Zheng, and Xiao, Meng
- Subjects
- Physics - Optics
- Abstract
Non-Hermitian systems and their topological singularities, such as exceptional points (EPs), lines, and surfaces, have recently attracted intense interest. The investigation of these exceptional constituents has led to fruitful applications. The responsivity of the eigenvalue diverges at EPs, and chiral state transfer occurs when encircling an EP. Traditionally, it was believed that these exceptional features were unique to non-Hermitian systems requiring gain, loss, or nonreciprocal hopping. Here, we show that these exceptional features are also present in nonlinear Hermitian systems. We consider two coupled resonators with Kerr nonlinearity in one resonator, and no non-Hermitian terms. We identify EP-like points (ELPs) on the eigenspectra where the critical behaviors are the same as those of typical EPs. Additionally, this nonlinear Hermitian system can be mapped to linear non-Hermitian systems, with ELPs corresponding to EPs. We also demonstrate that encirclement around an ELP in the parameter space leads to unique chiral state transfer behavior. Comment: 15 pages, 3 figures and 43 references
- Published
- 2024
5. Large positive magnetoconductance in carbon nanoscrolls
- Author
- Zhong, Yu-Jie, Huang, Xuan-Fu, Chen, Ting-Zhen, Zhang, Jia-Ren, Li, Jia-Cheng, Huang, Angus, Hsu, Hsiu-Chuan, Ortix, Carmine, and Chang, Ching-Hao
- Subjects
- Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Disordered Systems and Neural Networks, Condensed Matter - Materials Science, Quantum Physics
- Abstract
We theoretically demonstrate that carbon nanoscrolls -- spirally wrapped graphene layers with open endpoints -- can be characterized by a large positive magnetoconductance. We show that when a carbon nanoscroll is subject to an axial magnetic field of ~10 T, the ballistic conductance at low carrier densities of the nanoscroll increases by about 200%. Importantly, we find that this positive magnetoconductance is not only preserved but can even be enhanced in the presence of on-site disorder. We prove that the positive magnetoconductance comes about from the emergence of magnetic field-induced zero-energy modes, specific to rolled-up geometries. Our results establish curved graphene systems as a new material platform displaying sizable magnetoresistive phenomena. Comment: 13 pages, 4 figures
- Published
- 2024
6. E$^3$NeRF: Efficient Event-Enhanced Neural Radiance Fields from Blurry Images
- Author
- Qi, Yunshan, Li, Jia, Zhao, Yifan, Zhang, Yu, and Zhu, Lin
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Neural Radiance Fields (NeRF) achieve impressive rendering performance by learning volumetric 3D representation from several images of different views. However, it is difficult to reconstruct a sharp NeRF from blurry input as it often occurs in the wild. To solve this problem, we propose a novel Efficient Event-Enhanced NeRF (E$^3$NeRF) by utilizing the combination of RGB images and event streams. To effectively introduce event streams into the neural volumetric representation learning process, we propose an event-enhanced blur rendering loss and an event rendering loss, which guide the network via modeling the real blur process and event generation process, respectively. Specifically, we leverage spatial-temporal information from the event stream to evenly distribute learning attention over temporal blur while simultaneously focusing on blurry texture through the spatial attention. Moreover, a camera pose estimation framework for real-world data is built with the guidance of the events to generalize the method to practical applications. Compared to previous image-based or event-based NeRF, our framework makes more profound use of the internal relationship between events and images. Extensive experiments on both synthetic data and real-world data demonstrate that E$^3$NeRF can effectively learn a sharp NeRF from blurry images, especially in non-uniform motion and low-light scenes.
- Published
- 2024
7. Building Trust in Mental Health Chatbots: Safety Metrics and LLM-Based Evaluation Tools
- Author
- Park, Jung In, Abbasian, Mahyar, Azimi, Iman, Bounds, Dawn, Jun, Angela, Han, Jaesu, McCarron, Robert, Borelli, Jessica, Li, Jia, Mahmoudi, Mona, Wiedenhoeft, Carmen, and Rahmani, Amir
- Subjects
- Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Computer Science - Machine Learning
- Abstract
Objective: This study aims to develop and validate an evaluation framework to ensure the safety and reliability of mental health chatbots, which are increasingly popular due to their accessibility, human-like interactions, and context-aware support. Materials and Methods: We created an evaluation framework with 100 benchmark questions and ideal responses, and five guideline questions for chatbot responses. This framework, validated by mental health experts, was tested on a GPT-3.5-turbo-based chatbot. Automated evaluation methods explored included large language model (LLM)-based scoring, an agentic approach using real-time data, and embedding models to compare chatbot responses against ground truth standards. Results: The results highlight the importance of guidelines and ground truth for improving LLM evaluation accuracy. The agentic method, dynamically accessing reliable information, demonstrated the best alignment with human assessments. Adherence to a standardized, expert-validated framework significantly enhanced chatbot response safety and reliability. Discussion: Our findings emphasize the need for comprehensive, expert-tailored safety evaluation metrics for mental health chatbots. While LLMs have significant potential, careful implementation is necessary to mitigate risks. The superior performance of the agentic approach underscores the importance of real-time data access in enhancing chatbot reliability. Conclusion: The study validated an evaluation framework for mental health chatbots, proving its effectiveness in improving safety and reliability. Future work should extend evaluations to accuracy, bias, empathy, and privacy to ensure holistic assessment and responsible integration into healthcare. Standardized evaluations will build trust among users and professionals, facilitating broader adoption and improved mental health support through technology.
- Published
- 2024
8. PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network
- Author
- Xia, Changqun, Xie, Chenxi, He, Zhentao, Yu, Tianshu, and Li, Jia
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
We present an advanced study on the more challenging task of high-resolution salient object detection (HRSOD) from both dataset and network framework perspectives. To compensate for the lack of HRSOD datasets, we thoughtfully collect a large-scale high-resolution salient object detection dataset, called UHRSD, containing 5,920 images from real-world complex scenarios at 4K-8K resolutions. All the images are finely annotated at the pixel level, far exceeding previous low-resolution SOD datasets. Aiming to overcome the contradiction between the sampling depth and the receptive field size in past methods, we propose a novel one-stage framework for the HRSOD task using a pyramid grafting mechanism. In general, transformer-based and CNN-based backbones are adopted to extract features from images of different resolutions independently, and these features are then grafted from the transformer branch to the CNN branch. An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine broken detailed information more holistically, guided by features from different sources during the decoding process. Moreover, we design an Attention Guided Loss (AGL) to explicitly supervise the attention matrix generated by CMGM to help the network better interact with the attention from different branches. Comprehensive experiments on UHRSD and widely used SOD datasets demonstrate that our method can simultaneously locate salient objects and preserve rich details, outperforming state-of-the-art methods. To verify the generalization ability of the proposed framework, we apply it to the camouflaged object detection (COD) task. Notably, our method outperforms most state-of-the-art COD methods without bells and whistles.
- Published
- 2024
9. Constraints on large-scale polarization in northern hemisphere
- Author
- Zhang, Dongdong, Wang, Bo, Li, Jia-Rui, Cai, Yi-Fu, and Feng, Chang
- Subjects
- Astrophysics - Cosmology and Nongalactic Astrophysics
- Abstract
Present cosmic microwave background (CMB) observations have significantly advanced our understanding of the universe's origin, especially with primordial gravitational waves (PGWs). Currently, ground-based CMB telescopes are mainly located in the southern hemisphere, leaving an untapped potential for observations in the northern hemisphere. In this work, we investigate the prospects of a northern hemisphere CMB polarization telescope (NHT) for detecting PGWs and present mock data for such a project. We forecast the detection sensitivity of NHT to the tensor-to-scalar ratio $r$, compare it with that of existing ground-based experiments, and search for optimal experimental configurations that achieve the best sensitivity to $r$. Our results indicate that, considering realistic experimental conditions, the first year of NHT observations combined with Planck can achieve a precision of $\sigma(r) = 0.015$, reaching the level of BICEP2/Keck, with significant potential for improvement with subsequent instrumentation parameter enhancements.
- Published
- 2024
10. Estimate collective cooperativeness of driving agents in mixed traffic flow
- Author
- Chen, Di, Li, Jia, and Zhang, H. Michael
- Subjects
- Physics - Physics and Society, Computer Science - Machine Learning, Computer Science - Multiagent Systems
- Abstract
Cooperation is a ubiquitous phenomenon in many natural, social, and engineered systems that contain multiple agents. Characterizing and quantifying cooperativeness of driving agents is of interest and significance for two reasons. Theoretically, it will enhance the understanding of micro-macro connections and emergence of cooperation in mixed traffic. Pragmatically, this understanding will benefit the design and operations of automated and mixed-autonomy transportation systems. However, it remains unclear how the cooperativeness can be accurately defined and quantified from empirical data, and it remains open when and to what extent collective cooperativeness exists. This paper is intended to fill the gap. We propose a unified conceptual framework to estimate collective cooperativeness of driving agents leveraging a recent behavioral equilibrium model of mixed autonomy traffic (Li et al. 2022a). This framework is interpretable, theoretically consistent, and enables quantifying collective cooperativeness of traffic agents from trajectory data. We apply the framework to multilane freeway traffic employing NGSIM I-80 trajectory data set and careful data selection. Our case study indicates the existence of collective cooperativeness between human-driven passenger cars and trucks in real-world traffic and reveals its other properties that are otherwise unknown.
- Published
- 2024
11. Two-Component gamma-ray Emission Spectrum and X-Ray Polarization of the Radio Galaxy Pictor A
- Author
- Li, Jia-Xuan, Hu, Xin-Ke, Lian, Ji-Shun, Yu, Yu-Wei, Deng, Wei, Liu, Kuan, Zhang, Hai-Ming, Chen, Liang, and Zhang, Jin
- Subjects
- Astrophysics - High Energy Astrophysical Phenomena
- Abstract
Pictor A is a $\gamma$-ray emitting radio galaxy and has a bright hotspot called WHS, located $\sim$4 arcmin away from the nucleus. In this letter, we present an analysis of its 16-year Fermi-LAT data and report the first Imaging X-ray Polarimetry Explorer (IXPE) observation for this source. Our analysis of the Fermi-LAT observations reveals evidence of two components in the average $\gamma$-ray spectrum of Pictor A, exhibiting a statistically significant hardening from $\Gamma^1_{\gamma}=3.25\pm0.15$ to $\Gamma^2_{\gamma}=1.81\pm0.07$ at a break energy of $2.46\pm0.09$ GeV. Evident $\gamma$-ray variability is observed in Pictor A. Interestingly, the variability is dominated by the component below the break energy, and the component above the break energy shows no variability. Furthermore, we find that a power-law function can adequately fit the spectrum during high-flux states, whereas a broken power-law is still required to explain the spectrum during the low-flux state. We suggest that the low-energy component originates from the nucleus, while the high-energy component primarily stems from WHS. The broadband spectral energy distributions of both nucleus and WHS can be well represented by a simple leptonic model, with both $\gamma$-ray components attributed to the synchrotron-self-Compton (SSC) process. The analysis of IXPE data on the nucleus yields an upper limit to the polarization degree $\Pi_{\rm X}<8.9\%$ in the 2--8 keV band, agreeing with its X-ray emission originating from SSC. However, $\Pi_{\rm X}=23.5\%\pm5.6\%$ is observed at a confidence level of $>99\%$ in the 5--7 keV band, and the possible physical origin of this narrow-energy-band polarization signal is discussed. Comment: 14 Pages, 4 Figures, 3 Tables, submitted, comments are welcome
- Published
- 2024
12. Spin Bott Indices For Time-Reversal-Invariant Higher-Order Topological Superconductors
- Author
- Luo, Xun-Jiang, Li, Jia-Zheng, Xiao, Meng, and Wu, Fengcheng
- Subjects
- Condensed Matter - Superconductivity
- Abstract
The abundance of bulk and boundary topologies in higher-order topological phases offers remarkable tunability and diversity to boundary states but also poses a challenge to their unified topological characterization. In this work, we propose a theoretical framework to characterize time-reversal invariant topological superconductors hosting Majorana Kramers pairs (MKP) of corner states by using a series of spin Bott indices, which capture both bulk and boundary state topology. The developed invariants can characterize MKP in arbitrarily shaped systems and all distinct spatial distribution patterns of MKP. As an illustrative example, we apply our theory to analyze the Kane-Mele model with sublattice-dependent superconducting pairing potentials. In this representative model, various patterns of MKP can be engineered through edge cleavage, and despite their high sensitivity to boundary terminations, MKP can be faithfully characterized by the proposed topological invariants. Comment: 19 pages, 7 figures
- Published
- 2024
13. Parameter-Efficient Fine-Tuning via Circular Convolution
- Author
- Chen, Aochuan, Cheng, Jiashun, Liu, Zijing, Gao, Ziqi, Tsung, Fugee, Li, Yu, and Li, Jia
- Subjects
- Computer Science - Machine Learning, Computer Science - Computation and Language
- Abstract
Low-Rank Adaptation (LoRA) has gained popularity for fine-tuning large foundation models, leveraging low-rank matrices $\mathbf{A}$ and $\mathbf{B}$ to represent weight changes (i.e., $\Delta \mathbf{W} = \mathbf{B} \mathbf{A}$). This method reduces trainable parameters and mitigates heavy memory consumption associated with full delta matrices by sequentially multiplying $\mathbf{A}$ and $\mathbf{B}$ with the activation. Despite its success, the intrinsic low-rank characteristic may limit its performance. Although several variants have been proposed to address this issue, they often overlook the crucial computational and memory efficiency brought by LoRA. In this paper, we propose Circular Convolution Adaptation (C$^3$A), which not only achieves high-rank adaptation with enhanced performance but also excels in both computational power and memory utilization. Extensive experiments demonstrate that C$^3$A consistently outperforms LoRA and its variants across various fine-tuning tasks. Comment: Work in progress
- Published
- 2024
14. Code Structure-Aware through Line-level Semantic Learning for Code Vulnerability Detection
- Author
- Wang, Ziliang, Li, Ge, Li, Jia, Dong, Yihong, Xiong, Yingfei, and Jin, Zhi
- Subjects
- Computer Science - Software Engineering
- Abstract
Unlike the flow semantics of natural languages, programming languages are inherently rigid in structure and grammar. Existing fine-tuning methodologies for code vulnerability detection generally treat code as long text sequences, stripping away structural elements such as newlines ('\n') and whitespace. However, this approach inadvertently results in the loss of crucial structural information, diminishing the distinct characteristics of code and impairing the accuracy of vulnerability detection. To address these challenges, we propose a novel network architecture method based on pre-trained code models, which incorporates structural information awareness. We propose an enhanced code text processing workflow that retains structural elements prior to modeling. This refinement allows the model to retain and exploit line-level structural information and semantic information during the modeling process. Furthermore, we introduce a new network architecture, the Code Structure-Aware Network through Line-level Semantic Learning (CSLS), which integrates three key components: global vulnerability awareness, line-structural awareness, and sensitive-line awareness. Extensive experiments were conducted on vulnerability detection datasets derived from real-world projects. The results demonstrate that our new code pre-processing flow significantly improves existing baselines (e.g., a 3% accuracy improvement on the Devign dataset when applied to popular models such as CodeBERT and UniXcoder). The proposed network architecture also demonstrates superior accuracy in detecting vulnerabilities, surpassing newly established benchmarks. These findings underscore the importance of structural information in enhancing the efficacy of code vulnerability detection models.
- Published
- 2024
15. The Oscars of AI Theater: A Survey on Role-Playing with Language Models
- Author
- Chen, Nuo, Wang, Yan, Deng, Yang, and Li, Jia
- Subjects
- Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
This survey explores the burgeoning field of role-playing with language models, focusing on their development from early persona-based models to advanced character-driven simulations facilitated by Large Language Models (LLMs). Initially confined to simple persona consistency due to limited model capabilities, role-playing tasks have now expanded to embrace complex character portrayals involving character consistency, behavioral alignment, and overall attractiveness. We provide a comprehensive taxonomy of the critical components in designing these systems, including data, models and alignment, agent architecture and evaluation. This survey not only outlines the current methodologies and challenges, such as managing dynamic personal profiles and achieving high-level persona consistency, but also suggests avenues for future research in improving the depth and realism of role-playing applications. The goal is to guide future research by offering a structured overview of current methodologies and identifying potential areas for improvement. Related resources and papers are available at https://github.com/nuochenpku/Awesome-Role-Play-Papers. Comment: 28 pages
- Published
- 2024
16. Implications of scattering for CMB foreground emission modelling
- Author
- Li, Jia-Rui, Delabrouille, Jacques, Cai, Yi-Fu, and Zhang, Dongdong
- Subjects
- Astrophysics - Cosmology and Nongalactic Astrophysics
- Abstract
Context. The extreme precision and accuracy of forthcoming observations of CMB temperature and polarization anisotropies, aiming to detect the tiny signatures of primordial gravitational waves or of light relic particles beyond the standard three light neutrinos, require commensurate precision in the modelling of foreground Galactic emission that contaminates CMB observations. Aims. We evaluate the impact of second-order effects in Galactic foreground emission due to Thomson scattering off interstellar free electrons and to Rayleigh scattering off interstellar dust particles. Methods. We use existing sky survey data and models of the distribution of free electrons and dust within the Milky Way to estimate the amplitude and power spectra of the emission originating from radiation scattered either by free electrons or by dust grains at CMB frequencies. Results. Both processes generate corrections to the total emission that are small compared to direct emission, and are small enough not to pose problems for current-generation observations. Conclusions. However, B-modes generated by Thomson scattering of incoming radiation by interstellar free electrons at CMB frequencies are within an order of magnitude of the sensitivity of the most advanced forthcoming CMB telescopes, and might require more precise evaluation in the future. Comment: 8 pages, 5 figures
- Published
- 2024
17. GLBench: A Comprehensive Benchmark for Graph with Large Language Models
- Author
- Li, Yuhan, Wang, Peisong, Zhu, Xiao, Chen, Aochuan, Jiang, Haiyun, Cai, Deng, Chan, Victor Wai Kin, and Li, Jia
- Subjects
- Computer Science - Machine Learning, Computer Science - Computation and Language
- Abstract
The emergence of large language models (LLMs) has revolutionized the way we interact with graphs, leading to a new paradigm called GraphLLM. Despite the rapid development of GraphLLM methods in recent years, the progress and understanding of this field remain unclear due to the lack of a benchmark with consistent experimental protocols. To bridge this gap, we introduce GLBench, the first comprehensive benchmark for evaluating GraphLLM methods in both supervised and zero-shot scenarios. GLBench provides a fair and thorough evaluation of different categories of GraphLLM methods, along with traditional baselines such as graph neural networks. Through extensive experiments on a collection of real-world datasets with consistent data processing and splitting strategies, we have uncovered several key findings. Firstly, GraphLLM methods outperform traditional baselines in supervised settings, with LLM-as-enhancers showing the most robust performance. However, using LLMs as predictors is less effective and often leads to uncontrollable output issues. We also notice that no clear scaling laws exist for current GraphLLM methods. In addition, both structures and semantics are crucial for effective zero-shot transfer, and our proposed simple baseline can even outperform several models tailored for zero-shot scenarios. The data and code of the benchmark can be found at https://github.com/NineAbyss/GLBench. Comment: arXiv admin note: text overlap with arXiv:2306.10280 by other authors
- Published
- 2024
18. Taxonomic resolution of the hillstream suck-loach Beaufortia pingi species group (Cypriniformes, Gastromyzontidae) and two new species from Southwest China– Beaufortia granulopinna and Beaufortia viridis
- Author
- Chen, Jing-Chen, Li, Jia-Jia, Tang, Wenqiao, Pu, Xin-Rui, Lei, Hao-Tian, and Pensoft Publishers
- Subjects
- Beaufortia, Beaufortia pingi, Beaufortia zebroida, molecular phylogeny, Morphology, redescription
- Published
- 2024
19. GraphArena: Benchmarking Large Language Models on Graph Computational Problems
- Author
- Tang, Jianheng, Zhang, Qifan, Li, Yuhan, and Li, Jia
- Subjects
- Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
The "arms race" of Large Language Models (LLMs) demands novel, challenging, and diverse benchmarks to faithfully examine their progress. We introduce GraphArena, a benchmarking tool designed to evaluate LLMs on graph computational problems using million-scale real-world graphs from diverse scenarios such as knowledge graphs, social networks, and molecular structures. GraphArena offers a suite of 10 computational tasks, encompassing four polynomial-time (e.g., Shortest Distance) and six NP-complete challenges (e.g., Travelling Salesman Problem). It features a rigorous evaluation framework that classifies LLM outputs as correct, suboptimal (feasible but not optimal), or hallucinatory (properly formatted but infeasible). Evaluation of 10 leading LLMs, including GPT-4o and LLaMA3-70B-Instruct, reveals that even top-performing models struggle with larger, more complex graph problems and exhibit hallucination issues. Despite the application of strategies such as chain-of-thought prompting, these issues remain unresolved. GraphArena contributes a valuable supplement to the existing LLM benchmarks and is open-sourced at https://github.com/squareRoot3/GraphArena.
- Published
- 2024
20. The impact of shear on the rotation of Galactic plane molecular clouds
- Author
- Rani, Raffaele, Li, Jia-Lun, Moore, Toby J. T., Eden, David J., Rigby, Andrew J., Park, Geumsook, and Lee, Yueh-Ning
- Subjects
- Astrophysics - Astrophysics of Galaxies
- Abstract
Stars form in the densest regions of molecular clouds; however, there is no universal understanding of the factors that regulate cloud dynamics and their influence on the gas-to-stars conversion. This study considers the impact of Galactic shear on the rotation of giant molecular clouds (GMCs) and its relation to the solenoidal modes of turbulence. We estimate the direction of rotation for a large sample of clouds in the $^{13}$CO/C$^{18}$O (3-2) Heterodyne Inner Milky Way Plane Survey (CHIMPS) and their corresponding sources in a new segmentation of the $^{12}$CO (3-2) High-Resolution Survey (COHRS). To quantify the strength of shear, we introduce a parameter that describes the shear's ability to disrupt growing density perturbations within the cloud. Although we find no correlation between the direction of cloud rotation, the shear parameter, and the magnitude of the velocity gradient, the solenoidal fraction of the turbulence in the CHIMPS sample is positively correlated with the shear parameter and behaves similarly when plotted over Galactocentric distance. GMCs may thus not be large or long-lived enough to be affected by shear to the point of showing rotational alignment. In theory, Galactic shear can facilitate the rise of solenoidal turbulence and thus contribute to suppressing star formation. These results also suggest that the rotation of clouds is not strictly related to the overall rotation of the disc, but is more likely to be the imprint of Kelvin-Helmholtz instabilities in the colliding flows that formed the clouds. Comment: Accepted, MNRAS
- Published
- 2024
21. The dynamics of Tonks-Girardeau gas excited by a pulse drive
- Author
- Li, Jia and Hao, Yajiang
- Subjects
- Condensed Matter - Quantum Gases
- Abstract
In this paper we study the dynamics of Tonks-Girardeau (TG) gases in a harmonic potential driven by a Gaussian pulse, an analogue of the excitation dynamics of electrons in matter driven by an ultrashort laser pulse. The evolving dynamics of the TG gas are obtained with the Bose-Fermi mapping method combined with numerical techniques. We calculate the evolving dynamics of the occupation distribution of single-particle energy levels, the density distribution, and the momentum distribution of the system. It is shown that the system arrives at a dynamically stable state at the end of driving. In the high-frequency regime the TG gas returns to the ground state, while in the low-frequency regime population inversion appears and all atoms occupy high levels. Comment: 9 pages, 6 figures
- Published
- 2024
22. Relaxing Continuous Constraints of Equivariant Graph Neural Networks for Physical Dynamics Learning
- Author
-
Zheng, Zinan, Liu, Yang, Li, Jia, Yao, Jianhua, and Rong, Yu
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Incorporating Euclidean symmetries (e.g. rotation equivariance) as inductive biases into graph neural networks has improved their generalization ability and data efficiency in unbounded physical dynamics modeling. However, in various scientific and engineering applications, the symmetries of dynamics are frequently discrete due to the boundary conditions. Thus, existing GNNs either overlook necessary symmetry, resulting in suboptimal representation ability, or impose excessive equivariance, which fails to generalize to unobserved symmetric dynamics. In this work, we propose a general Discrete Equivariant Graph Neural Network (DEGNN) that guarantees equivariance to a given discrete point group. Specifically, we show that such discrete equivariant message passing could be constructed by transforming geometric features into permutation-invariant embeddings. Through relaxing continuous equivariant constraints, DEGNN can employ more geometric feature combinations to approximate unobserved physical object interaction functions. Two implementation approaches of DEGNN are proposed based on ranking or pooling permutation-invariant functions. We apply DEGNN to various physical dynamics, ranging from particle, molecular, crowd to vehicle dynamics. In twenty scenarios, DEGNN significantly outperforms existing state-of-the-art approaches. Moreover, we show that DEGNN is data efficient, learning with less data, and can generalize across scenarios such as unobserved orientation.
- Published
- 2024
23. Deblurring Neural Radiance Fields with Event-driven Bundle Adjustment
- Author
-
Qi, Yunshan, Zhu, Lin, Zhao, Yifan, Bao, Nan, and Li, Jia
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Neural Radiance Fields (NeRF) achieves impressive 3D representation learning and novel view synthesis results with high-quality multi-view images as input. However, motion blur in images often occurs in low-light and high-speed motion scenes, which significantly degrades the reconstruction quality of NeRF. Previous deblurring NeRF methods struggle to estimate pose and lighting changes during the exposure time, making them unable to accurately model the motion blur. The bio-inspired event camera, which measures intensity changes with high temporal resolution, makes up for this information deficiency. In this paper, we propose Event-driven Bundle Adjustment for Deblurring Neural Radiance Fields (EBAD-NeRF) to jointly optimize the learnable poses and NeRF parameters by leveraging the hybrid event-RGB data. An intensity-change-metric event loss and a photo-metric blur loss are introduced to strengthen the explicit modeling of camera motion blur. Experiments on both synthetic and real-captured data demonstrate that EBAD-NeRF can obtain an accurate camera trajectory during the exposure time and learn sharper 3D representations compared to prior works., Comment: Accepted by 32nd ACM International Conference on Multimedia (MM 2024)
- Published
- 2024
24. Harnessing spontaneous emission of correlated photon pairs from ladder-type giant atoms
- Author
-
Gao, Zhao-Min, Li, Jia-Qi, Wu, Ying-Huan, Liu, Wen-Xiao, and Wang, Xin
- Subjects
Quantum Physics - Abstract
The realization of correlated multi-photon processes usually depends on the interaction between nonlinear media and atoms. However, the nonlinearity of optical materials is generally weak, making it still very challenging to achieve correlated multi-photon dynamics at the few-photon level. Meanwhile, studies of giant atoms, a novel paradigm in quantum optics with the capability for multi-point coupling, mostly focus on the single-photon field. In this work, using the method described in Phys. Rev. Res. 6, 013279 (2024), we reveal that the ladder-type three-level giant atom spontaneously emits strongly correlated photon pairs with high efficiency by designing and optimizing the target function. In addition, by encoding local phases into the optimal coupling sequence, directional two-photon correlated transfer can be achieved. This method does not require a nonlinear waveguide and can be realized in the conventional environment. We show that the photon pairs emitted in both the bidirectional and the chiral case exhibit strong correlation properties in both time and space. Such correlated photon pairs have great potential for applications in quantum information processing. For example, numerical results show that our proposal can realize the two-photon mediated cascaded quantum system., Comment: 12 pages; 10 figures
- Published
- 2024
25. SynthTree: Co-supervised Local Model Synthesis for Explainable Prediction
- Author
-
Kuriabov, Evgenii and Li, Jia
- Subjects
Statistics - Methodology ,Statistics - Applications ,Statistics - Machine Learning - Abstract
Explainable machine learning (XML) has emerged as a major challenge in artificial intelligence (AI). Although black-box models such as Deep Neural Networks and Gradient Boosting often exhibit exceptional predictive accuracy, their lack of interpretability is a notable drawback, particularly in domains requiring transparency and trust. This paper tackles this core AI problem by proposing a novel method to enhance explainability with minimal accuracy loss, using a Mixture of Linear Models (MLM) estimated under the co-supervision of black-box models. We have developed novel methods for estimating MLM by leveraging AI techniques. Specifically, we explore two approaches for partitioning the input space: agglomerative clustering and decision trees. The agglomerative clustering approach provides greater flexibility in model construction, while the decision tree approach further enhances explainability, yielding a decision tree model with linear or logistic regression models at its leaf nodes. Comparative analyses with widely-used and state-of-the-art predictive models demonstrate the effectiveness of our proposed methods. Experimental results show that statistical models can significantly enhance the explainability of AI, thereby broadening their potential for real-world applications. Our findings highlight the critical role that statistical methodologies can play in advancing explainable AI.
- Published
- 2024
26. SimXRD-4M: Big Simulated X-ray Diffraction Data Accelerate the Crystalline Symmetry Classification
- Author
-
Cao, Bin, Liu, Yang, Zheng, Zinan, Tan, Ruifeng, Li, Jia, and Zhang, Tong-yi
- Subjects
Condensed Matter - Materials Science - Abstract
Spectroscopic data, particularly diffraction data, contain detailed crystal and microstructure information and thus are crucial for materials discovery. Powder X-ray diffraction (XRD) patterns are greatly effective in identifying crystals. Although machine learning (ML) has significantly advanced the analysis of powder XRD patterns, the progress is hindered by a lack of training data. To address this, we introduce SimXRD, the largest open-source simulated XRD pattern dataset so far, to accelerate the development of crystallographic informatics. SimXRD comprises 4,065,346 simulated powder X-ray diffraction patterns, representing 119,569 distinct crystal structures under 33 simulated conditions that mimic real-world variations. We find that the crystal symmetry inherently follows a long-tailed distribution and evaluate 21 sequence learning models on SimXRD. The results indicate that existing neural networks struggle with low-frequency crystal classifications. The present work highlights the academic significance and the engineering novelty of simulated XRD patterns in this interdisciplinary field.
- Published
- 2024
27. Photonic realization of chiral hinge states in a Chern-insulator stack
- Author
-
Xia, Han-Rong, Li, Jia-Zheng, Yuan, Si-Yu, and Xiao, Meng
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
Higher-order topological insulators, as a novel family of topological phases, are a hot frontier in condensed matter physics due to their adherence to unconventional bulk-boundary correspondence. A three-dimensional second-order topological insulator can support one-dimensional modes along its hinges (dubbed hinge states). Here, we present a simple and direct method to construct chiral hinge modes based on a Chern-insulator stack. We analyze the existence of the hinge modes through the nontrivial quadrupole indices, and then design a photonic crystal to realize the specific flowing pattern of the hinge mode in our model. The experimental results align well with full-wave simulations, clearly demonstrating the existence of chiral hinge states. We also verify the robustness of these hinge states against defects in our photonic system.
- Published
- 2024
28. M2CVD: Enhancing Vulnerability Semantic through Multi-Model Collaboration for Code Vulnerability Detection
- Author
-
Wang, Ziliang, Li, Ge, Li, Jia, Xiong, Yingfei, Yan, Meng, and Jin, Zhi
- Subjects
Computer Science - Software Engineering - Abstract
Large Language Models (LLMs) have strong capabilities in code comprehension, but fine-tuning costs and semantic alignment issues limit their project-specific optimization; conversely, code models such as CodeBERT are easy to fine-tune, but often struggle to learn vulnerability semantics from complex code languages. To address these challenges, this paper introduces the Multi-Model Collaborative Vulnerability Detection approach (M2CVD) that leverages the strong capability of analyzing vulnerability semantics from LLMs to improve the detection accuracy of code models. M2CVD employs a novel collaborative process: first enhancing the quality of vulnerability semantic descriptions produced by LLMs through the understanding of project code by code models, and then using these improved vulnerability semantic descriptions to boost the detection accuracy of code models. We demonstrated M2CVD's effectiveness on two real-world datasets, where M2CVD significantly outperformed the baseline. In addition, we demonstrate that the M2CVD collaborative method can extend to other LLMs and code models to improve their accuracy in vulnerability detection tasks.
- Published
- 2024
29. ProG: A Graph Prompt Learning Benchmark
- Author
-
Zi, Chenyi, Zhao, Haihong, Sun, Xiangguo, Lin, Yiqing, Cheng, Hong, and Li, Jia
- Subjects
Computer Science - Machine Learning - Abstract
Artificial general intelligence on graphs has shown significant advancements across various applications, yet the traditional 'Pre-train & Fine-tune' paradigm faces inefficiencies and negative transfer issues, particularly in complex and few-shot settings. Graph prompt learning emerges as a promising alternative, leveraging lightweight prompts to manipulate data and fill the task gap by reformulating downstream tasks to the pretext. However, several critical challenges still remain: how to unify diverse graph prompt models, how to evaluate the quality of graph prompts, and how to improve their usability for practical comparisons and selection. In response to these challenges, we introduce the first comprehensive benchmark for graph prompt learning. Our benchmark integrates SIX pre-training methods and FIVE state-of-the-art graph prompt techniques, evaluated across FIFTEEN diverse datasets to assess performance, flexibility, and efficiency. We also present 'ProG', an easy-to-use open-source library that streamlines the execution of various graph prompt models, facilitating objective evaluations. Additionally, we propose a unified framework that categorizes existing graph prompt methods into two main approaches: prompts as graphs and prompts as tokens. This framework enhances the applicability and comparison of graph prompt techniques. The code is available at: https://github.com/sheldonresearch/ProG.
- Published
- 2024
30. DualTime: A Dual-Adapter Multimodal Language Model for Time Series Representation
- Author
-
Zhang, Weiqi, Ye, Jiexia, Li, Ziyue, Li, Jia, and Tsung, Fugee
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
The recent rapid development of language models (LMs) has attracted attention in the field of time series, including multimodal time series modeling. However, we note that current time series multimodal methods are biased, often assigning a primary role to one modality while the other assumes a secondary role. They overlook the mutual benefits and complementarity of different modalities. For example, in seizure diagnosis, relying solely on textual clinical reports makes it difficult to pinpoint the area and type of the disease, while electroencephalograms (EEGs) alone cannot provide an accurate diagnosis without considering the symptoms. In this study, based on the complementary information mining of time series multimodal data, we propose DualTime, a Dual-adapter multimodal language model for Time series representation implementing temporal-primary and textual-primary modeling simultaneously. By injecting lightweight adaption tokens, the LM pipeline shared by dual adapters encourages embedding alignment and achieves efficient fine-tuning. Empirically, our method outperforms state-of-the-art models in both supervised and unsupervised settings, highlighting the complementary benefits of different modalities. In addition, we conduct few-shot label transfer experiments, which further verify the transferability and expressiveness of our proposed DualTime., Comment: 15 pages, 12 figures, 5 tables
- Published
- 2024
31. Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning
- Author
-
Wang, Jiaxu, Zhang, Ziyi, Zhang, Qiang, Li, Jia, Sun, Jingkai, Sun, Mingyuan, He, Junhao, and Xu, Renjing
- Subjects
Computer Science - Robotics - Abstract
Latent scene representation plays a significant role in training reinforcement learning (RL) agents. To obtain good latent vectors describing the scenes, recent works incorporate the 3D-aware latent-conditioned NeRF pipeline into scene representation learning. However, these NeRF-related methods struggle to perceive 3D structural information due to the inefficient dense sampling in volumetric rendering. Moreover, their scene representation vectors lack fine-grained semantic information because they treat free and occupied spaces equally. Both issues can degrade the performance of downstream RL tasks. To address the above challenges, we propose a novel framework that adopts the efficient 3D Gaussian Splatting (3DGS) to learn 3D scene representation for the first time. In brief, we present the Query-based Generalizable 3DGS to bridge the 3DGS technique and scene representations with more geometrical awareness than those in NeRFs. Moreover, we present the Hierarchical Semantics Encoding to ground the fine-grained semantic features to 3D Gaussians and further distill them into the scene representation vectors. We conduct extensive experiments on two RL platforms, Maniskill2 and Robomimic, across 10 different tasks. The results show that our method outperforms the other 5 baselines by a large margin. We achieve the best success rates on 8 tasks and the second-best on the other two tasks.
- Published
- 2024
32. Activity-driven polymer knotting for macromolecular topology engineering
- Author
-
Li, Jia-Xiang, Wu, Song, Hao, Li-Li, Lei, Qun-Li, and Ma, Yu-Qiang
- Subjects
Condensed Matter - Soft Condensed Matter - Abstract
Macromolecules can gain special properties by adopting knotted conformations, but engineering knotted macromolecules is a challenging task. Here, we observe, surprisingly, that knotting can be produced very effectively in active polymers. When one end of an actively reptative polymer is anchored, it can undergo continual self-knotting as a result of intermittent giant conformation fluctuations and the outward reptative motion. Once a knot is formed, it migrates to the anchored point due to a non-equilibrium ratchet effect. Moreover, when the active polymer is grafted on the end of a passive polymer, it can function as a self-propelling soft needle to either transfer its own knots to the passive polymer or directly braid knots on the passive polymer. We further show that these active needles can create inter-molecular bridging knots between two passive polymers. Our finding highlights the non-equilibrium effects in modifying the dynamic pathways of polymer systems, which have potential applications in macromolecular topology engineering, e.g., manipulating topological states of proteins and nucleic acids, as well as macromolecular braiding.
- Published
- 2024
33. One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments
- Author
-
Yi, Ke, Xu, Yuhui, Chang, Heng, Tang, Chen, Meng, Yuan, Zhang, Tong, and Li, Jia
- Subjects
Computer Science - Artificial Intelligence - Abstract
Large Language Models (LLMs) have advanced rapidly but face significant memory demands. While quantization has shown promise for LLMs, current methods typically require lengthy training to alleviate the performance degradation from quantization loss. However, deploying LLMs across diverse scenarios with different resource constraints, e.g., servers and personal computers, requires repeated training per application, which amplifies the lengthy training problem. Given that, it is advantageous to train a once-for-all (OFA) supernet capable of yielding diverse optimal subnets for downstream applications through one-shot training. Nonetheless, the scale of current language models impedes efficiency and amplifies interference from weight sharing between subnets. We make an initial attempt to extend the once-for-all framework to large language models. Specifically, we decouple shared weights to eliminate the interference and incorporate Low-Rank adapters for training efficiency. Furthermore, we observe an imbalanced allocation of training resources under traditional uniform sampling. A non-parametric scheduler is introduced to adjust the sampling rate for each quantization configuration, achieving a more balanced allocation among subnets with varying demands. We validate the approach on LLaMA2 families, and downstream evaluation confirms our ability to maintain high performance while significantly reducing deployment time when faced with multiple scenarios.
- Published
- 2024
34. DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories
- Author
-
Li, Jia, Li, Ge, Zhao, Yunfei, Li, Yongmin, Liu, Huanyu, Zhu, Hao, Wang, Lecheng, Liu, Kaibo, Fang, Zheng, Wang, Lanshen, Ding, Jiazheng, Zhang, Xuanming, Zhu, Yuqi, Dong, Yihong, Jin, Zhi, Li, Binhua, Huang, Fei, and Li, Yongbin
- Subjects
Computer Science - Computation and Language ,Computer Science - Software Engineering - Abstract
How to evaluate the coding abilities of Large Language Models (LLMs) remains an open question. We find that existing benchmarks are poorly aligned with real-world code repositories and are insufficient to evaluate the coding abilities of LLMs. To address the knowledge gap, we propose a new benchmark named DevEval, which has three advances. (1) DevEval aligns with real-world repositories in multiple dimensions, e.g., code distributions and dependency distributions. (2) DevEval is annotated by 13 developers and contains comprehensive annotations (e.g., requirements, original repositories, reference code, and reference dependencies). (3) DevEval comprises 1,874 testing samples from 117 repositories, covering 10 popular domains (e.g., Internet, Database). Based on DevEval, we propose repository-level code generation and evaluate 8 popular LLMs on DevEval (e.g., gpt-4, gpt-3.5, StarCoder 2, DeepSeek Coder, CodeLLaMa). Our experiments reveal these LLMs' coding abilities in real-world code repositories. For example, in our experiments, the highest Pass@1 of gpt-4-turbo is only 53.04%. We also analyze LLMs' failed cases and summarize their shortcomings. We hope DevEval can facilitate the development of LLMs in real code repositories. DevEval, prompts, and LLMs' predictions have been released., Comment: Accepted by the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024). arXiv admin note: substantial text overlap with arXiv:2404.00599, arXiv:2401.06401
- Published
- 2024
35. Text Guided Image Editing with Automatic Concept Locating and Forgetting
- Author
-
Li, Jia, Hu, Lijie, He, Zhixian, Zhang, Jingfeng, Zheng, Tianhang, and Wang, Di
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
With the advancement of image-to-image diffusion models guided by text, significant progress has been made in image editing. However, a persistent challenge remains in seamlessly incorporating objects into images based on textual instructions, without relying on extra user-provided guidance. Text and images are inherently distinct modalities, which makes it difficult to fully capture the semantic intent conveyed through language and accurately translate it into the desired visual modifications. Therefore, text-guided image editing models often produce generations with residual object attributes that do not fully align with human expectations. To address this challenge, the models should comprehend the image content effectively, avoiding a disconnect between the provided textual editing prompts and the actual modifications made to the image. In our paper, we propose a novel method called Locate and Forget (LaF), which effectively locates potential target concepts in the image for modification by comparing the syntactic trees of the target prompt and scene descriptions in the input image, with the aim of forgetting the clues to their existence in the generated image. Compared to the baselines, our method demonstrates its superiority in text-guided image editing tasks both qualitatively and quantitatively.
- Published
- 2024
36. Chiral quantum heating and cooling with an optically controlled ion
- Author
-
Bu, Jin-Tao, Zhang, Jian-Qi, Ding, Ge-Yi, Li, Jia-Chong, Zhang, Jia-Wei, Wang, Bin, Ding, Wen-Qiang, Yuan, Wen-Fei, Chen, Liang, Zhong, Qi, Keçebaş, Ali, Özdemir, Şahin K., Zhou, Fei, Jing, Hui, and Feng, Mang
- Subjects
Quantum Physics - Abstract
Quantum heat engines and refrigerators are open quantum systems, whose dynamics can be well understood using a non-Hermitian formalism. A prominent feature of non-Hermiticity is the existence of exceptional points (EPs), which has no counterpart in closed quantum systems. It has been shown in classical systems that dynamical encirclement in the vicinity of an EP, whether the loop includes the EP or not, could lead to chiral mode conversion. Here, we show that this is valid also for quantum systems when dynamical encircling is performed in the vicinity of their Liouvillian EPs (LEPs) which include the effects of quantum jumps and associated noise - an important quantum feature not present in previous works. We demonstrate, using a Paul-trapped ultracold ion, the first chiral quantum heating and refrigeration by dynamically encircling a closed loop in the vicinity of an LEP. We witness the cycling direction to be associated with the chirality and heat release (absorption) of the quantum heat engine (quantum refrigerator). Our experiments have revealed that not only the adiabaticity-breakdown but also the Landau-Zener-Stückelberg process play an essential role during dynamic encircling, resulting in chiral thermodynamic cycles. Our observations contribute to further understanding of chiral and topological features in non-Hermitian systems and pave the way to exploring the relation between chirality and quantum thermodynamics., Comment: Accepted by Light: Science & Applications
- Published
- 2024
37. 4-bit Shampoo for Memory-Efficient Network Training
- Author
-
Wang, Sike, Li, Jia, Zhou, Pan, and Huang, Hua
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Second-order optimizers, maintaining a matrix termed a preconditioner, are superior to first-order optimizers in both theory and practice. The states forming the preconditioner and its inverse root restrict the maximum size of models trained by second-order optimizers. To address this, compressing 32-bit optimizer states to lower bitwidths has shown promise in reducing memory usage. However, current approaches only pertain to first-order optimizers. In this paper, we propose the first 4-bit second-order optimizers, exemplified by 4-bit Shampoo, maintaining performance similar to that of 32-bit ones. We show that quantizing the eigenvector matrix of the preconditioner in 4-bit Shampoo is remarkably better than quantizing the preconditioner itself both theoretically and experimentally. By rectifying the orthogonality of the quantized eigenvector matrix, we enhance the approximation of the preconditioner's eigenvector matrix, which also benefits the computation of its inverse 4-th root. Besides, we find that linear square quantization slightly outperforms dynamic tree quantization when quantizing second-order optimizer states. Evaluation on various networks for image classification demonstrates that our 4-bit Shampoo achieves comparable test accuracy to its 32-bit counterpart while being more memory-efficient. The source code will be made available.
- Published
- 2024
38. Standardizing the Gamma-ray burst as a standard candle and applying to the cosmological probes: constraints on the two-component dark energy model
- Author
-
Li, Jia-Lun, Yang, Yu-Peng, Yi, Shuang-Xi, Hu, Jian-Ping, Qu, Yan-Kun, and Wang, Fa-Yin
- Subjects
Astrophysics - High Energy Astrophysical Phenomena ,Astrophysics - Cosmology and Nongalactic Astrophysics - Abstract
As one of the most energetic and brightest events, gamma-ray bursts (GRBs) have been used as a standard candle for cosmological probes. Based on the relevant features of GRB light curves, namely a plateau phase followed by a decay phase, we obtain X-ray samples of 31 GRBs and optical samples of 50 GRBs, which are thought to be caused by the same physical mechanism. We standardize GRBs using the two-dimensional fundamental plane relation of the rest-frame luminosity of the plateau emission ($L_{b,z}$) and the end time of plateau ($T_{b,z}$) $L_{b,z}-T_{b,z}$, as well as the three-dimensional fundamental plane correlation including the peak energy ($E_{p,i}$) $L_{b,z}-T_{b,z}-E_{p,i}$. For the cosmological probes, we consider the $\omega$CDM model in which the dark energy consists of one component, and mainly focus on the $X_1X_2$CDM model in which the dark energy is made up of two independent components. We obtain the constraints on the related parameters of the cosmological models using the type Ia supernovae (SNe Ia) data and selected X-ray and optical samples. For the $X_1X_2$CDM model, we find that the values of the equation-of-state parameters of the two dark energies, $\omega_1$ and $\omega_2$, are very close. We also compare the models using the Bayesian information criterion, and find that the $\omega$CDM model is favoured., Comment: 13 pages, 8 figures and 3 tables, accepted for publication in Astronomy & Astrophysics
- Published
- 2024
39. Canonical Variates in Wasserstein Metric Space
- Author
-
Li, Jia and Lin, Lin
- Subjects
Statistics - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
In this paper, we address the classification of instances each characterized not by a singular point, but by a distribution on a vector space. We employ the Wasserstein metric to measure distances between distributions, which are then used by distance-based classification algorithms such as k-nearest neighbors, k-means, and pseudo-mixture modeling. Central to our investigation is dimension reduction within the Wasserstein metric space to enhance classification accuracy. We introduce a novel approach grounded in the principle of maximizing Fisher's ratio, defined as the quotient of between-class variation to within-class variation. The directions in which this ratio is maximized are termed discriminant coordinates or canonical variates axes. In practice, we define both between-class and within-class variations as the average squared distances between pairs of instances, with the pairs either belonging to the same class or to different classes. This ratio optimization is achieved through an iterative algorithm, which alternates between optimal transport and maximization steps within the vector space. We conduct empirical studies to assess the algorithm's convergence and, through experimental validation, demonstrate that our dimension reduction technique substantially enhances classification performance. Moreover, our method outperforms well-established algorithms that operate on vector representations derived from distributional data. It also exhibits robustness against variations in the distributional representations of data clouds., Comment: double space 37 pages, 6 figures
- Published
- 2024
40. Data-driven Nucleus Subclassification on Colon H&E using Style-transferred Digital Pathology
- Author
-
Remedios, Lucas W., Bao, Shunxing, Remedios, Samuel W., Lee, Ho Hin, Cai, Leon Y., Li, Thomas, Deng, Ruining, Newlin, Nancy R., Saunders, Adam M., Cui, Can, Li, Jia, Liu, Qi, Lau, Ken S., Roland, Joseph T., Washington, Mary K, Coburn, Lori A., Wilson, Keith T., Huo, Yuankai, and Landman, Bennett A.
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
Understanding the way cells communicate, co-locate, and interrelate is essential to furthering our understanding of how the body functions. H&E is widely available; however, cell subtyping often requires expert knowledge and the use of specialized stains. To reduce the annotation burden, AI has been proposed for the classification of cells on H&E. For example, the recent Colon Nucleus Identification and Classification (CoNIC) Challenge focused on labeling 6 cell types on H&E of the colon. However, the CoNIC Challenge was unable to classify epithelial subtypes (progenitor, enteroendocrine, goblet), lymphocyte subtypes (B, helper T, cytotoxic T), and connective subtypes (fibroblasts). We use inter-modality learning to label previously un-labelable cell types on H&E. We take advantage of multiplexed immunofluorescence (MxIF) histology to label 14 cell subclasses. We performed style transfer on the same MxIF tissues to synthesize realistic virtual H&E which we paired with the MxIF-derived cell subclassification labels. We evaluated the efficacy of using a supervised learning scheme where the input was realistic-quality virtual H&E and the labels were MxIF-derived cell subclasses. We assessed our model on private virtual H&E and public real H&E. On virtual H&E, we were able to classify helper T cells and epithelial progenitors with positive predictive values of $0.34 \pm 0.15$ (prevalence $0.03 \pm 0.01$) and $0.47 \pm 0.1$ (prevalence $0.07 \pm 0.02$) respectively, when using ground truth centroid information. On real H&E we could classify helper T cells and epithelial progenitors with upper bound positive predictive values of $0.43 \pm 0.03$ (parent class prevalence 0.21) and $0.94 \pm 0.02$ (parent class prevalence 0.49) when using ground truth centroid information. This is the first work to provide cell type classification for helper T and epithelial progenitor nuclei on H&E., Comment: arXiv admin note: text overlap with arXiv:2401.05602
- Published
- 2024
41. Demonstrating a universal logical gate set on a superconducting quantum processor
- Author
-
Zhang, Jiaxuan, Chen, Zhao-Yun, Wang, Yun-Jie, Lu, Bin-Han, Zhang, Hai-Feng, Li, Jia-Ning, Duan, Peng, Wu, Yu-Chun, and Guo, Guo-Ping
- Subjects
Quantum Physics - Abstract
Fault-tolerant quantum computing (FTQC) is essential for achieving large-scale practical quantum computation. Implementing arbitrary FTQC requires the execution of a universal gate set on logical qubits, which is highly challenging. In particular, in superconducting systems, two-qubit gates on surface code logical qubits have not been realized. Here, we experimentally implement a logical CNOT gate as well as arbitrary single-qubit rotation gates on distance-2 surface codes using the superconducting quantum processor Wukong, thereby demonstrating a universal logical gate set. In the experiment, we design encoding circuits to prepare the required logical states, where the fidelities of the fault-tolerantly prepared logical states surpass those of the physical states. Furthermore, we demonstrate the transversal CNOT gate between two logical qubits and fault-tolerantly prepare four logical Bell states, all with fidelities exceeding those of the Bell states on the physical qubits. Using the logical CNOT gate and an ancilla logical state, an arbitrary single-qubit rotation gate is implemented through gate teleportation. All logical gates are characterized on a complete state set and their fidelities are evaluated by logical Pauli transfer matrices. Implementation of the universal logical gate set and entangled logical states beyond physical fidelity marks a significant step towards FTQC on superconducting quantum processors., Comment: 15 pages, 12 figures
- Published
- 2024
42. Multi-wavelength Emission of Gamma-ray Burst Prompt Phase. I. Time-resolved and Time-integrated Polarizations
- Author
-
Li, Jia-Sheng, Lan, Mi-Xiang, and Wang, Hao-Bing
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
The time-integrated polarization degree (PD) in the prompt optical band of gamma-ray bursts (GRBs) was predicted to be less than $20\%$, while the time-resolved one can reach as high as $75\%$ in the photosphere model. Polarizations in the optical band during the GRB prompt phase had not been studied under the framework of the magnetic reconnection model. Here, a three-segment power-law energy spectrum is used to reconstruct the Stokes parameters of the magnetic reconnection model. The multi-wavelength light curves and polarization curves from the optical band to MeV gamma-rays in the GRB prompt phase are studied. We find that, depending mainly on the jet dynamics, there is a long-lasting high-PD phase at all calculated energy bands for the typical parameter sets. The time-resolved PD could be as high as $50\%$, while the time-integrated one is roughly $17\%$ in the optical band. The time-resolved PD can reach $60\%$ in X-rays, where the time-integrated one is around $(30-40)\%$. The polarization angle (PA) evolution is random in both optical and gamma-ray bands for the photosphere model, while it is roughly constant in the synchrotron models. Therefore, future time-resolved PA observations in the prompt optical or gamma-ray band could distinguish between the photosphere and the synchrotron models., Comment: 16 pages, 9 figures, submitted
- Published
- 2024
43. Parameter-Efficient Fine-Tuning with Discrete Fourier Transform
- Author
-
Gao, Ziqi, Wang, Qichao, Chen, Aochuan, Liu, Zijing, Wu, Bingzhe, Chen, Liang, and Li, Jia
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Low-rank adaptation (LoRA) has recently gained much interest in fine-tuning foundation models. It effectively reduces the number of trainable parameters by incorporating low-rank matrices $A$ and $B$ to represent the weight change, i.e., $\Delta W=BA$. Despite LoRA's progress, it faces storage challenges when handling extensive customization adaptations or larger base models. In this work, we aim to further compress trainable parameters by exploiting the powerful expressiveness of the Fourier transform. Specifically, we introduce FourierFT, which treats $\Delta W$ as a matrix in the spatial domain and learns only a small fraction of its spectral coefficients. With the trained spectral coefficients, we apply the inverse discrete Fourier transform to recover $\Delta W$. Empirically, our FourierFT method shows comparable or better performance with fewer parameters than LoRA on various tasks, including natural language understanding, natural language generation, instruction tuning, and image classification. For example, when performing instruction tuning on the LLaMA2-7B model, FourierFT surpasses LoRA with only 0.064M trainable parameters, compared to LoRA's 33.5M. Our code is released at https://github.com/Chaos96/fourierft., Comment: Accepted by ICML 2024
- Published
- 2024
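Note on entry 43: the abstract describes recovering $\Delta W$ from a small set of learned spectral coefficients via the inverse discrete Fourier transform. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The function names, the choice to keep only the real part, and the scaling factor `alpha` are illustrative assumptions. The layer configuration used for the parameter counts (32 layers, 2 adapted matrices per layer, hidden size 4096, LoRA rank 64, 1000 coefficients per matrix) is also an assumption, chosen because it reproduces the 33.5M and 0.064M figures quoted in the abstract.

```python
import cmath


def idft2(spectrum):
    """Naive 2D inverse DFT of a d1 x d2 complex matrix (O(n^4), illustration only)."""
    d1, d2 = len(spectrum), len(spectrum[0])
    out = [[0j] * d2 for _ in range(d1)]
    for p in range(d1):
        for q in range(d2):
            total = 0j
            for j in range(d1):
                for k in range(d2):
                    if spectrum[j][k] != 0:  # skip the untrained (zero) coefficients
                        angle = 2 * cmath.pi * (p * j / d1 + q * k / d2)
                        total += spectrum[j][k] * cmath.exp(1j * angle)
            out[p][q] = total / (d1 * d2)
    return out


def fourier_delta_w(d1, d2, positions, coeffs, alpha=1.0):
    """Hypothetical helper: place trainable coefficients at fixed spectral
    positions, invert the transform, and keep the real part (an assumption)."""
    spectrum = [[0j] * d2 for _ in range(d1)]
    for (j, k), c in zip(positions, coeffs):
        spectrum[j][k] = c
    spatial = idft2(spectrum)
    return [[alpha * v.real for v in row] for row in spatial]


def lora_params(d1, d2, rank):
    """Trainable parameters of one LoRA pair: B is d1 x rank, A is rank x d2."""
    return rank * (d1 + d2)


# Assumed configuration (not stated in the abstract).
LAYERS, MATRICES_PER_LAYER, HIDDEN = 32, 2, 4096
lora_total = LAYERS * MATRICES_PER_LAYER * lora_params(HIDDEN, HIDDEN, 64)
fourier_total = LAYERS * MATRICES_PER_LAYER * 1000

# Tiny demonstration: one coefficient at frequency (0, 0) gives a constant matrix.
delta_w = fourier_delta_w(2, 2, [(0, 0)], [4.0])

print(lora_total, fourier_total)  # 33554432 64000
```

Under these assumptions, the storage saving comes from the fact that the spectral positions are fixed and only the coefficient values are trained, so the count per matrix is independent of the matrix dimensions.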
44. Semi-supervised Symmetric Matrix Factorization with Low-Rank Tensor Representation
- Author
-
Jia, Yuheng, Li, Jia-Nan, Wu, Wenhui, and Wang, Ran
- Subjects
Computer Science - Machine Learning - Abstract
Semi-supervised symmetric non-negative matrix factorization (SNMF) utilizes the available supervisory information (usually in the form of pairwise constraints) to improve the clustering ability of SNMF. Previous methods introduce the pairwise constraints from a local perspective, i.e., they either directly refine the similarity matrix element-wise or restrain the distance of the decomposed vectors in pairs according to the pairwise constraints. This overlooks the global perspective: in the ideal case, the pairwise constraint matrix and the ideal similarity matrix possess the same low-rank structure. To this end, we first propose a novel semi-supervised SNMF model by seeking a low-rank representation for the tensor synthesized from the pairwise constraint matrix and a similarity matrix obtained as the product of the embedding matrix and its transpose, which could strengthen those two matrices simultaneously from a global perspective. We then propose an enhanced SNMF model, making the embedding matrix tailored to the above tensor low-rank representation. We finally refine the similarity matrix using the strengthened pairwise constraints. We repeat the above steps to continuously boost the similarity matrix and pairwise constraint matrix, leading to a high-quality embedding matrix. Extensive experiments substantiate the superiority of our method. The code is available at https://github.com/JinaLeejnl/TSNMF.
- Published
- 2024
45. A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model
- Author
-
Ye, Jiexia, Zhang, Weiqi, Yi, Ke, Yu, Yongzi, Li, Ziyue, Li, Jia, and Tsung, Fugee
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Time series data are ubiquitous across various domains, making time series analysis critically important. Traditional time series models are task-specific, featuring singular functionality and limited generalization capacity. Recently, large language foundation models have unveiled their remarkable capabilities for cross-task transferability, zero-shot/few-shot learning, and decision-making explainability. This success has sparked interest in the exploration of foundation models to solve multiple time series challenges simultaneously. There are two main research lines, namely pre-training foundation models from scratch for time series and adapting large language foundation models for time series. They both contribute to the development of a unified model that is highly generalizable, versatile, and comprehensible for time series analysis. This survey offers a 3E analytical framework for comprehensive examination of related research. Specifically, we examine existing works from three dimensions, namely Effectiveness, Efficiency and Explainability. In each dimension, we focus on discussing how related works devise tailored solutions by considering unique challenges in the realm of time series. Furthermore, we provide a domain taxonomy to help followers keep up with the domain-specific advancements. In addition, we introduce extensive resources to facilitate the field's development, including datasets, open-source, time series libraries. A GitHub repository is also maintained for resource updates (https://github.com/start2020/Awesome-TimeSeries-LLM-FM)., Comment: 5 figures, 6 tables, 41 pages
- Published
- 2024
46. Origin of the Very High Energy $\gamma$-rays in the Low-luminosity Active Galactic Nucleus NGC 4278
- Author
-
Lian, Ji-Shun, Li, Jia-Xuan, Hu, Xin-Ke, Gan, Ying-Ying, Wu, Tan-Zheng, Zhang, Hai-Ming, and Zhang, Jin
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
NGC 4278, a low-luminosity active galactic nucleus (AGN), is generally classified as a low-ionization nuclear emission line region (LINER). Recently, it has been reported to be associated with a very high energy $\gamma$-ray source 1LHAASO J1219+2915 in the first Large High Altitude Air Shower Observatory source catalog. However, no associated counterpart has been detected by analyzing the data collected by the Large Area Telescope on board the Fermi Gamma-ray Space Telescope. By analyzing its X-ray observation data from Swift-XRT, we find NGC 4278 is in a high-flux state on MJD 59546, with the X-ray flux more than one order of magnitude higher than that observed $\sim$11.7 years earlier by Chandra. Interestingly, this Swift-XRT observation was conducted during the active phase of the $\gamma$-ray source 1LHAASO J1219+2915. We propose that the detection of VHE $\gamma$-rays from NGC 4278 may be attributed to the presence of an active nucleus in its center. To reproduce the spectral energy distribution (SED) of NGC 4278, we employ a one-zone leptonic model, typically used for fitting broadband SEDs of BL Lacs, and find that a smaller magnetic field strength is required than that of typical TeV BL Lacs. Furthermore, NGC 4278 exhibits significantly lower luminosity in both radio and TeV bands when compared with typical TeV BL Lacs. In the radio-luminosity vs. Eddington-ratio plane, NGC 4278 shows greater similarity to Seyfert galaxies and LINERs than to BL Lacs; however, it still roughly follows the extension towards lower luminosity seen in BL Lacs., Comment: 2 Comments, 14 Pages, 8 Figures, 2 Tables, Accepted for Publication in ApJ
- Published
- 2024
47. Exact Universal Characterization of Chiral-Symmetric Higher-Order Topological Phases
- Author
-
Li, Jia-Zheng, Luo, Xun-Jiang, Wu, Fengcheng, and Xiao, Meng
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Quantum Gases - Abstract
Utilizing a series of Bott indices formulated through polynomials of position operators, we establish a comprehensive framework for characterizing topological zero-energy corner states in systems with chiral symmetry. Our framework covers systems with arbitrary shape, including topological phases that are not characterizable by previously proposed invariants such as multipole moments or multipole chiral numbers. A key feature of our framework is its ability to capture the real-space pattern of zero-energy corner states. We provide a rigorous analytical proof of its higher-order correspondence. To demonstrate the effectiveness of our theory, we examine several model systems with representative patterns of zero-energy corner states that previous frameworks fail to classify., Comment: 16 pages, 3 figures
- Published
- 2024
48. A Comprehensive Evaluation on Event Reasoning of Large Language Models
- Author
-
Tao, Zhengwei, Jin, Zhi, Zhang, Yifan, Chen, Xiancai, Zhao, Haiyan, Li, Jia, Liang, Bing, Tao, Chongyang, Liu, Qun, and Wong, Kam-Fai
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Event reasoning is a fundamental ability that underlies many applications. It requires event schema knowledge to perform global reasoning and needs to deal with the diversity of the inter-event relations and the reasoning paradigms. How well LLMs accomplish event reasoning on various relations and reasoning paradigms remains unknown. To fill this gap, we comprehensively evaluate the event reasoning abilities of LLMs. We introduce a novel benchmark EV2 for EValuation of EVent reasoning. EV2 consists of two levels of evaluation, schema and instance, and is comprehensive in relations and reasoning paradigms. We conduct extensive experiments on EV2. We find that LLMs are able to accomplish event reasoning but their performance is far from satisfactory. We also notice an imbalance of event reasoning abilities in LLMs. Moreover, LLMs have event schema knowledge; however, they are not aligned with humans in how they utilize that knowledge. Based on these findings, we guide the LLMs in utilizing the event schema knowledge as memory, leading to improvements on event reasoning.
- Published
- 2024
49. Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey
- Author
-
Conde, Marcos V., Vasluianu, Florin-Alexandru, Timofte, Radu, Zhang, Jianxing, Li, Jia, Wang, Fan, Li, Xiaopeng, Liu, Zikun, Park, Hyunhee, Song, Sejun, Kim, Changho, Huang, Zhijuan, Yu, Hongyuan, Wan, Cheng, Xiang, Wending, Lin, Jiamin, Zhong, Hang, Zhang, Qiaosong, Sun, Yue, Yin, Xuanwu, Zuo, Kunlong, Xu, Senyan, Jiang, Siyuan, Sun, Zhijing, Zhu, Jiaying, Li, Liangyan, Chen, Ke, Li, Yunzhe, Ning, Yimo, Zhao, Guanhua, Chen, Jun, Yu, Jinyang, Xu, Kele, Xu, Qisheng, and Dou, Yong
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Electrical Engineering and Systems Science - Image and Video Processing - Abstract
This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines; however, this problem is not as explored as in the RGB domain. The goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as noise and blur. In the challenge, a total of 230 participants registered, and 45 submitted results during the challenge period. The performance of the top-5 submissions is reviewed and provided here as a gauge for the current state-of-the-art in RAW Image Super-Resolution., Comment: CVPR 2024 - NTIRE Workshop
- Published
- 2024
50. Exploring and Unleashing the Power of Large Language Models in Automated Code Translation
- Author
-
Yang, Zhen, Liu, Fang, Yu, Zhongxing, Keung, Jacky Wai, Li, Jia, Liu, Shuo, Hong, Yifan, Ma, Xiaoxue, Jin, Zhi, and Li, Ge
- Subjects
Computer Science - Software Engineering ,Computer Science - Artificial Intelligence - Abstract
Code translation tools (transpilers) are developed for automatic source-to-source translation. Although learning-based transpilers have shown impressive enhancement against rule-based counterparts, owing to their task-specific pre-training on extensive monolingual corpora, their current performance still remains unsatisfactory for practical deployment, and the associated training resources are also prohibitively expensive. LLMs pre-trained on huge amounts of human-written code/text have shown remarkable performance in many code intelligence tasks due to their powerful generality, even without task-specific training. Thus, LLMs can potentially circumvent the above limitations, but they have not been exhaustively explored yet. This paper investigates diverse LLMs and learning-based transpilers for automated code translation tasks, finding that although certain LLMs have outperformed current transpilers, they still have some accuracy issues, where most of the failures are induced by a lack of comprehension of source programs, missing clear instructions on I/O types in translation, and ignoring discrepancies between source and target programs. Enlightened by the above findings, we further propose UniTrans, a Unified code Translation framework, applicable to various LLMs, for unleashing their power in this field. Specifically, UniTrans first crafts a series of test cases for target programs with the assistance of source programs. Next, it harnesses the above auto-generated test cases to augment the code translation and then evaluates the translations' correctness via execution. Afterward, UniTrans further (iteratively) repairs incorrectly translated programs prompted by test case execution results. Extensive experiments are conducted on six settings of translation datasets between Python, Java, and C++. Three recent LLMs of diverse sizes are tested with UniTrans, and all achieve substantial improvements., Comment: 23 pages, 7 figures, accepted by FSE'24 (2024 ACM International Conference on the Foundations of Software Engineering)
- Published
- 2024