Author: "Zhang Yi" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zhang Yi"' showing total 100,283 results

Start Over Author "Zhang Yi"

100,283 results on '"Zhang Yi"'

1. SM3-Text-to-Query: Synthetic Multi-Model Medical Text-to-Query Benchmark

Author: Sivasubramaniam, Sithursan, Osei-Akoto, Cedric, Zhang, Yi, Stockinger, Kurt, and Fuerst, Jonathan
Subjects: Computer Science - Databases, Computer Science - Artificial Intelligence
Abstract: Electronic health records (EHRs) are stored in various database systems with different database models on heterogeneous storage architectures, such as relational databases, document stores, or graph databases. These different database models have a big impact on query complexity and performance. While this has been a known fact in database research, its implications for the growing number of Text-to-Query systems have surprisingly not been investigated so far. In this paper, we present SM3-Text-to-Query, the first multi-model medical Text-to-Query benchmark based on synthetic patient data from Synthea, following the SNOMED-CT taxonomy -- a widely used knowledge graph ontology covering medical terminology. SM3-Text-to-Query provides data representations for relational databases (PostgreSQL), document stores (MongoDB), and graph databases (Neo4j and GraphDB (RDF)), allowing the evaluation across four popular query languages, namely SQL, MQL, Cypher, and SPARQL. We systematically and manually develop 408 template questions, which we augment to construct a benchmark of 10K diverse natural language question/query pairs for these four query languages (40K pairs overall). On our dataset, we evaluate several common in-context-learning (ICL) approaches for a set of representative closed and open-source LLMs. Our evaluation sheds light on the trade-offs between database models and query languages for different ICL strategies and LLMs. Last, SM3-Text-to-Query is easily extendable to additional query languages or real, standard-based patient databases., Comment: NeurIPS 2024 Track Datasets and Benchmarks
Published: 2024

2. A versatile framework for attitude tuning of beamlines at advanced light sources

Author: Li, Peng-Cheng, Bi, Xiao-Xue, Zhang, Zhen, Deng, Xiao-Bao, Li, Chun, Wang, Li-Wen, Liu, Gong-Fa, Zhang, Yi, Zhou, Ai-Yu, and Liu, Yu
Subjects: Physics - Instrumentation and Detectors, High Energy Physics - Experiment
Abstract: Aside from regular beamline experiments at light sources, the preparation steps before these experiments are also worth systematic consideration in terms of automation; a representative category in these steps is attitude tuning, which typically appears in names like beam focusing, sample alignment etc. With the goal of saving time and manpower in both writing and using in mind, a Mamba-based attitude-tuning framework is created. It supports flexible input/output ports, easy integration of diverse evaluation functions, and free selection of optimisation algorithms; with the help from Mamba's infrastructure, machine learning (ML) and artificial intelligence (AI) technologies can also be readily integrated. The tuning of a polycapillary lens and of an X-ray emission spectrometer are given as examples for the general use of this framework, featuring powerful command-line interfaces (CLIs) and friendly graphical user interfaces (GUIs) that allow comfortable human-in-the-loop control. The tuning of a Raman spectrometer demonstrates more specialised use of the framework with customised optimisation algorithms. With similar applications in mind, our framework is estimated to be capable of fulfilling a majority of attitude-tuning needs. Also reported is a virtual-beamline mechanism based on easily customisable simulated detectors and motors, which facilitates both testing for developers and training for users., Comment: 12 pages, 8 figures
Published: 2024

3. Detector integration at HEPS: a systematic, efficient and high-performance approach

Author: Zhang, Qun, Li, Peng-Cheng, Bian, Ling-Zhu, Li, Chun, Yue, Zong-Yang, Zhang, Cheng-Long, Zhao, Zhuo-Feng, Zhang, Yi, Li, Gang, Zhou, Ai-Yu, and Liu, Yu
Subjects: Physics - Instrumentation and Detectors, High Energy Physics - Experiment
Abstract: At least 25 kinds of detector-like devices need to be integrated in Phase I of the High Energy Photon Source (HEPS), and the work needs to be carefully planned to maximise productivity with highly limited human resources. After a systematic analysis on the actual work involved in detector integration, a separation of concerns between collaborating groups of personnel is established to minimise the duplication of efforts. To facilitate software development for detector integration, the ADGenICam library, which abstracts repeated code in EPICS modules for cameras, is extended to support a much wider range of detectors. An increasingly considerable fraction of detectors, both inside and outside HEPS, offer performance that exceed capabilities of the areaDetector framework in EPICS. Given this background, areaDetector's limitations in performance and architecture are analysed, and a QueueIOC -based framework that overcomes these limitations is introduced. A simple, flexible ZeroMQ-based protocol is used for data transport in this framework, while RDMA transport and multi-node readout will be explored for higher data throughputs. By calling C/C++ libraries from within Python, the performance of the former and the expressiveness of the latter can coexist nicely; the expressiveness allows for much higher efficiency in the implementation and use of integration modules functionally comparable to their EPICS counterparts., Comment: 11 pages, 3 figures
Published: 2024

4. Detection of two TeV gamma-ray outbursts from NGC 1275 by LHAASO

Author: Cao, Zhen, Aharonian, F., Axikegu, Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Cai, J. T., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, della Volpe, D., Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, K., Feng, C. F., Feng, L., Feng, S. H., Feng, X. T., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, L. Q., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., He, H. H., He, H. N., He, J. Y., He, X. B., He, Y., Heller, M., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Z. C., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Ke, T., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, H. Y., Li, J., Li, Jian, Li, Jie, Li, K., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, S. J., Liu, B., Liu, C., Liu, D., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. Y., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Lu, R., Luo, Q., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, Z. W., Pang, B. Y., Pattarakijwanich, P., Pei, Z. Y., Qi, M. Y., Qi, Y. Q., Qiao, B. Q., Qin, J. J., Ruffolo, D., Sáiz, A., Semikoz, D., Shao, C. Y., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, Q. N., Sun, X. N., Sun, Z. B., Tam, P. H. T., Tang, Q. W., Tang, Z. B., Tian, W. W., Wang, C., Wang, C. B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, G. G., Xin, Y. L., Xing, Y., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, F., Yang, F. F., Yang, H. W., Yang, J. Y., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, S. B., Yao, Y. H., Yao, Z. G., Ye, Y. M., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, L. X., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zheng, F., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., and Zuo., X.
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: The Water Cherenkov Detector Array (WCDA) is one of the components of Large High Altitude Air Shower Observatory (LHAASO) and can monitor any sources over two-thirds of the sky for up to 7 hours per day with >98\% duty cycle. In this work, we report the detection of two outbursts of the Fanaroff-Riley I radio galaxy NGC 1275 that were detected by LHAASO-WCDA between November 2022 and January 2023 with statistical significance of 5.2~$\sigma$ and 8.3~$\sigma$. The observed spectral energy distribution in the range from 500 GeV to 3 TeV is fitted by a power-law with a best-fit spectral index of $\alpha=-3.37\pm0.52$ and $-3.35\pm0.29$, respectively. The outburst flux above 0.5~TeV was ($4.55\pm 4.21)\times~10^{-11}~\rm cm^{-2}~s^{-1}$ and ($3.45\pm 1.78)\times~10^{-11}~\rm cm^{-2}~s^{-1}$, corresponding to 60\%, 45\% of Crab Nebula flux. Variation analysis reveals the variability time-scale of days at the TeV energy band. A simple test by one-zone synchrotron self-Compton model reproduces the data in the gamma-ray band well., Comment: 11 pages, 8 figures, 3 tables
Published: 2024

5. Right this way: Can VLMs Guide Us to See More to Answer Questions?

Author: Liu, Li, Yang, Diji, Zhong, Sijia, Tholeti, Kalyana Suma Sree, Ding, Lei, Zhang, Yi, and Gilpin, Leilani H.
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: In question-answering scenarios, humans can assess whether the available information is sufficient and seek additional information if necessary, rather than providing a forced answer. In contrast, Vision Language Models (VLMs) typically generate direct, one-shot responses without evaluating the sufficiency of the information. To investigate this gap, we identify a critical and challenging task in the Visual Question Answering (VQA) scenario: can VLMs indicate how to adjust an image when the visual information is insufficient to answer a question? This capability is especially valuable for assisting visually impaired individuals who often need guidance to capture images correctly. To evaluate this capability of current VLMs, we introduce a human-labeled dataset as a benchmark for this task. Additionally, we present an automated framework that generates synthetic training data by simulating ``where to know'' scenarios. Our empirical results show significant performance improvements in mainstream VLMs when fine-tuned with this synthetic data. This study demonstrates the potential to narrow the gap between information assessment and acquisition in VLMs, bringing their performance closer to humans., Comment: NeurIPS 2024
Published: 2024

6. Decoupled structure-preserving discretization of incompressible MHD equations with general boundary conditions

Author: Zhang, Yi, Palha, Artur, Brugnoli, Andrea, Toshniwal, Deepesh, and Gerritsma, Marc
Subjects: Mathematics - Numerical Analysis
Abstract: In the framework of a mixed finite element method, a structure-preserving formulation for incompressible MHD equations with general boundary conditions is proposed. A leapfrog-type temporal scheme fully decouples the fluid part from the Maxwell part by means of staggered discrete time sequences and, in doing so, partially linearizes the system. Conservation and dissipation properties of the formulation before and after the decoupling are analyzed. We demonstrate optimal spatial and second-order temporal error convergence and conservation and dissipation properties of the proposed method using manufactured solutions, and apply it to the benchmark Orszag-Tang and lid-driven cavity test cases.
Published: 2024

7. The SRG/eROSITA diffuse soft X-ray background. I. The local hot bubble in the western Galactic hemisphere

Author: Yeung, Michael C. H., Ponti, Gabriele, Freyberg, Michael J., Dennerl, Konrad, Liu, Teng, Locatelli, Nicola, Mayer, Martin G. F., Sanders, Jeremy S., Sasaki, Manami, Strong, Andy, Zhang, Yi, Zheng, Xueying, and Gatuzz, Efrain
Subjects: Astrophysics - Astrophysics of Galaxies
Abstract: The SRG/eROSITA All-Sky Surveys (eRASSs) combine the advantages of complete sky coverage and the energy resolution provided by the charge couple device and offer the most holistic and detailed view of the diffuse soft X-ray background (SXRB) to date. The first eRASS (eRASS1) was completed at solar minimum, when solar wind charge exchange emission was minimal, providing the clearest view of the SXRB. We aim to extract spatial and spectral information from each constituent of the SXRB in the western Galactic hemisphere, focusing on the local hot bubble (LHB). We extracted and analysed eRASS1 spectra from almost all directions in the western Galactic hemisphere by dividing the sky into equal signal-to-noise bins. We fitted all bins with fixed spectral templates of known background constituents. We find the temperature of the LHB exhibits a north-south dichotomy at high latitudes ($|b|>30^{\circ}$), with the south being hotter, with a mean temperature at $kT=121.8\pm0.6\,$eV and the north at $kT=100.8\pm0.5\,$eV. At low latitudes, the LHB temperature increases towards the Galactic plane, especially towards the inner Galaxy. The LHB emission measure (${\rm EM_{LHB}}$) enhances approximately towards the Galactic poles. The ${\rm EM_{LHB}}$ map shows clear anti-correlation with the local dust column density. In particular, we found tunnels of dust cavities filled with hot plasma, potentially forming a wider network of hot interstellar medium. We also constructed a three-dimensional LHB model from ${\rm EM_{LHB}}$, assuming constant density. The average thermal pressure of the LHB is $P_{\rm thermal}/k=10100^{+1200}_{-1500}\,{\rm cm^{-3}\,K}$, a lower value than typical supernova remnants and wind-blown bubbles. This could be an indication of the LHB being open towards high Galactic latitudes., Comment: 31 pages, 37 figures, published in A&A
Published: 2024
Full Text: View/download PDF

8. P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics

Author: Wang, Qi, Ren, Pu, Zhou, Hao, Liu, Xin-Yang, Deng, Zhiwen, Zhang, Yi, Chengze, Ruizhi, Liu, Hongsheng, Wang, Zidong, Wang, Jian-Xun, Ji-Rong_Wen, Sun, Hao, and Liu, Yang
Subjects: Mathematics - Numerical Analysis, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: When solving partial differential equations (PDEs), classical numerical methods often require fine mesh grids and small time stepping to meet stability, consistency, and convergence conditions, leading to high computational cost. Recently, machine learning has been increasingly utilized to solve PDE problems, but they often encounter challenges related to interpretability, generalizability, and strong dependency on rich labeled data. Hence, we introduce a new PDE-Preserved Coarse Correction Network (P$^2$C$^2$Net) to efficiently solve spatiotemporal PDE problems on coarse mesh grids in small data regimes. The model consists of two synergistic modules: (1) a trainable PDE block that learns to update the coarse solution (i.e., the system state), based on a high-order numerical scheme with boundary condition encoding, and (2) a neural network block that consistently corrects the solution on the fly. In particular, we propose a learnable symmetric Conv filter, with weights shared over the entire model, to accurately estimate the spatial derivatives of PDE based on the neural-corrected system state. The resulting physics-encoded model is capable of handling limited training data (e.g., 3--5 trajectories) and accelerates the prediction of PDE solutions on coarse spatiotemporal grids while maintaining a high accuracy. P$^2$C$^2$Net achieves consistent state-of-the-art performance with over 50\% gain (e.g., in terms of relative prediction error) across four datasets covering complex reaction-diffusion processes and turbulent flows.
Published: 2024

9. MARCO: Multi-Agent Real-time Chat Orchestration

Author: Shrimal, Anubhav, Kanagaraj, Stanley, Biswas, Kriti, Raghuraman, Swarnalatha, Nediyanchath, Anish, Zhang, Yi, and Yenigalla, Promod
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning, Computer Science - Multiagent Systems
Abstract: Large language model advancements have enabled the development of multi-agent frameworks to tackle complex, real-world problems such as to automate tasks that require interactions with diverse tools, reasoning, and human collaboration. We present MARCO, a Multi-Agent Real-time Chat Orchestration framework for automating tasks using LLMs. MARCO addresses key challenges in utilizing LLMs for complex, multi-step task execution. It incorporates robust guardrails to steer LLM behavior, validate outputs, and recover from errors that stem from inconsistent output formatting, function and parameter hallucination, and lack of domain knowledge. Through extensive experiments we demonstrate MARCO's superior performance with 94.48% and 92.74% accuracy on task execution for Digital Restaurant Service Platform conversations and Retail conversations datasets respectively along with 44.91% improved latency and 33.71% cost reduction. We also report effects of guardrails in performance gain along with comparisons of various LLM models, both open-source and proprietary. The modular and generic design of MARCO allows it to be adapted for automating tasks across domains and to execute complex usecases through multi-turn interactions., Comment: EMNLP 2024 Industry Track
Published: 2024

10. Gain-Loss Coupled Systems

Author: Zhang, Chunlei, Kim, Mun, Zhang, Yi-Hui, Wang, Yi-Pu, Trivedi, Deepanshu, Krasnok, Alex, Wang, Jianbo, Isleifson, Dustin, Roshko, Roy, and Hu, Can-Ming
Subjects: Quantum Physics, Condensed Matter - Mesoscale and Nanoscale Physics, Physics - Applied Physics, Physics - Optics
Abstract: Achieving oscillations with small dimensions, high power, high coherence, and low phase noise has been a long-standing goal in wave physics, driving innovations across classical electromagnetic theory and quantum physics. Key applications include electronic oscillators, lasers, and spin-torque oscillations. In recent decades, physicists have increasingly focused on harnessing passive oscillatory modes to manipulate these oscillations, leading to the development of diverse gain-loss coupled systems, including photon-photon, exciton-photon, photon-magnon, magnon-phonon, and magnon-magnon couplings. This review provides a comprehensive overview of these systems, exploring their fundamental physical structures, key experimental observations, and theoretical insights. By synthesizing insights from these studies, we propose future research directions to further advance the understanding and application of gain-loss coupled systems for quantum science and quantum technologies. (The field of gain-loss coupled systems is vast. The authors welcome suggestions and feedback from the community to continuously improve this review article until it is published)., Comment: 20 pages, 10 figures
Published: 2024

11. 1D Spontaneous Symmetry Breaking in thermal equilibrium via Non-Hermitian Construction

Author: Wang, Jia-Bao, Dong, Zi-Hao, and Zhang, Yi
Subjects: Quantum Physics, Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Strongly Correlated Electrons, Physics - Computational Physics
Abstract: Spontaneous symmetry breaking generally circumvents one-dimensional systems with local interactions in thermal equilibrium. Here, we analyze a category of one-dimensional Hermitian models via local non-Hermitian constructions. Notably, spontaneous symmetry breaking and long-range order may emerge at finite temperatures in such systems under periodic boundary conditions, in sharp contrast to Hermitian constructions. We demonstrate clear numerical evidence, such as order parameters and specific heat, supporting phase diagrams with robust ordered phases. Non-Hermitian physics plays a vital role in prohibiting domain-wall proliferation and promoting spontaneous symmetry breaking. The fermions exhibit an exotic topological nature in their path-integral windings, which uphold nonzero integers -- commonly a non-Hermitian signature -- in the ordered phases, thus offering a novel and spontaneous origin for both symmetry breaking and non-Hermiticity., Comment: 7 pages, 5 figures
Published: 2024

12. Double Banking on Knowledge: Customized Modulation and Prototypes for Multi-Modality Semi-supervised Medical Image Segmentation

Author: Chen, Yingyu, Yang, Ziyuan, Yan, Ming, Zhang, Zhongzhou, Yu, Hui, Liu, Yan, and Zhang, Yi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multi-modality (MM) semi-supervised learning (SSL) based medical image segmentation has recently gained increasing attention for its ability to utilize MM data and reduce reliance on labeled images. However, current methods face several challenges: (1) Complex network designs hinder scalability to scenarios with more than two modalities. (2) Focusing solely on modality-invariant representation while neglecting modality-specific features, leads to incomplete MM learning. (3) Leveraging unlabeled data with generative methods can be unreliable for SSL. To address these problems, we propose Double Bank Dual Consistency (DBDC), a novel MM-SSL approach for medical image segmentation. To address challenge (1), we propose a modality all-in-one segmentation network that accommodates data from any number of modalities, removing the limitation on modality count. To address challenge (2), we design two learnable plug-in banks, Modality-Level Modulation bank (MLMB) and Modality-Level Prototype (MLPB) bank, to capture both modality-invariant and modality-specific knowledge. These banks are updated using our proposed Modality Prototype Contrastive Learning (MPCL). Additionally, we design Modality Adaptive Weighting (MAW) to dynamically adjust learning weights for each modality, ensuring balanced MM learning as different modalities learn at different rates. Finally, to address challenge (3), we introduce a Dual Consistency (DC) strategy that enforces consistency at both the image and feature levels without relying on generative methods. We evaluate our method on a 2-to-4 modality segmentation task using three open-source datasets, and extensive experiments show that our method outperforms state-of-the-art approaches.
Published: 2024

13. Improving Instance Optimization in Deformable Image Registration with Gradient Projection

Author: Zhang, Yi, Zhao, Yidong, and Tao, Qian
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Deformable image registration is inherently a multi-objective optimization (MOO) problem, requiring a delicate balance between image similarity and deformation regularity. These conflicting objectives often lead to poor optimization outcomes, such as being trapped in unsatisfactory local minima or experiencing slow convergence. Deep learning methods have recently gained popularity in this domain due to their efficiency in processing large datasets and achieving high accuracy. However, they often underperform during test time compared to traditional optimization techniques, which further explore iterative, instance-specific gradient-based optimization. This performance gap is more pronounced when a distribution shift between training and test data exists. To address this issue, we focus on the instance optimization (IO) paradigm, which involves additional optimization for test-time instances based on a pre-trained model. IO effectively combines the generalization capabilities of deep learning with the fine-tuning advantages of instance-specific optimization. Within this framework, we emphasize the use of gradient projection to mitigate conflicting updates in MOO. This technique projects conflicting gradients into a common space, better aligning the dual objectives and enhancing optimization stability. We validate our method using a state-of-the-art foundation model on the 3D Brain inter-subject registration task (LUMIR) from the Learn2Reg 2024 Challenge. Our results show significant improvements over standard gradient descent, leading to more accurate and reliable registration results., Comment: Learn2Reg Challenge at MICCAI 2024
Published: 2024

14. Inverse scattering transform for the defocusing-defocusing coupled Hirota equations with non-parallel boundary conditions at infinity

Author: Han, Peng-Fei, Ma, Wen-Xiu, and Zhang, Yi
Subjects: Nonlinear Sciences - Exactly Solvable and Integrable Systems
Abstract: The inverse scattering transform for the defocusing-defocusing coupled Hirota equations is strictly discussed with non-zero boundary conditions at infinity including non-parallel boundary conditions, specifically referring to the asymptotic polarization vectors. To address the non-analyticity encountered in some of the Jost eigenfunctions, the "adjoint" Lax pair is employed. The inverse problem is formulated as an appropriate matrix Riemann-Hilbert problem. A key difference between non-parallel and parallel boundary conditions lies in the asymptotic behavior of the scattering coefficients, which significantly impacts the normalization of the eigenfunctions and the properties of sectionally meromorphic matrices within the Riemann-Hilbert problem framework. When the asymptotic polarization vectors are non-orthogonal, two distinct methodologies are introduced to convert the Riemann-Hilbert problem into a series of linear algebraic-integral equations. In contrast, when the asymptotic polarization vectors are orthogonal, only one method is feasible. Ultimately, it is demonstrated that pure soliton solutions do not exist in both orthogonal and non orthogonal polarization vector cases. This study provides a comprehensive framework for analyzing the defocusing-defocusing coupled Hirota equations using the inverse scattering transform, offering new insights into the characteristics and solutions of the equations.
Published: 2024

15. DFlow: Diverse Dialogue Flow Simulation with Large Language Models

Author: Du, Wanyu, Feng, Song, Gung, James, Sun, Lijia, Zhang, Yi, Mansour, Saab, and Qi, Yanjun
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Developing language model-based dialogue agents requires effective data to train models that can follow specific task logic. However, most existing data augmentation methods focus on increasing diversity in language, topics, or dialogue acts at the utterance level, largely neglecting a critical aspect of task logic diversity at the dialogue level. This paper proposes a novel data augmentation method designed to enhance the diversity of synthetic dialogues by focusing on task execution logic. Our method uses LLMs to generate decision tree-structured task plans, which enables the derivation of diverse dialogue trajectories for a given task. Each trajectory, referred to as a "dialog flow", guides the generation of a multi-turn dialogue that follows a unique trajectory. We apply this method to generate a task-oriented dialogue dataset comprising 3,886 dialogue flows across 15 different domains. We validate the effectiveness of this dataset using the next action prediction task, where models fine-tuned on our dataset outperform strong baselines, including GPT-4. Upon acceptance of this paper, we plan to release the code and data publicly., Comment: 16 pages
Published: 2024

16. Open Domain Question Answering with Conflicting Contexts

Author: Liu, Siyi, Ning, Qiang, Halder, Kishaloy, Xiao, Wei, Qi, Zheng, Htut, Phu Mon, Zhang, Yi, John, Neha Anna, Min, Bonan, Benajiba, Yassine, and Roth, Dan
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Open domain question answering systems frequently rely on information retrieved from large collections of text (such as the Web) to answer questions. However, such collections of text often contain conflicting information, and indiscriminately depending on this information may result in untruthful and inaccurate answers. To understand the gravity of this problem, we collect a human-annotated dataset, Question Answering with Conflicting Contexts (QACC), and find that as much as 25% of unambiguous, open domain questions can lead to conflicting contexts when retrieved using Google Search. We evaluate and benchmark three powerful Large Language Models (LLMs) with our dataset QACC and demonstrate their limitations in effectively addressing questions with conflicting information. To explore how humans reason through conflicting contexts, we request our annotators to provide explanations for their selections of correct answers. We demonstrate that by finetuning LLMs to explain their answers, we can introduce richer information into their training that guide them through the process of reasoning with conflicting contexts.
Published: 2024

17. Domain-Conditioned Transformer for Fully Test-time Adaptation

Author: Tang, Yushun, Chen, Shuoshuo, Jia, Jiyuan, Zhang, Yi, and He, Zhihai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Fully test-time adaptation aims to adapt a network model online based on sequential analysis of input samples during the inference stage. We observe that, when applying a transformer network model into a new domain, the self-attention profiles of image samples in the target domain deviate significantly from those in the source domain, which results in large performance degradation during domain changes. To address this important issue, we propose a new structure for the self-attention modules in the transformer. Specifically, we incorporate three domain-conditioning vectors, called domain conditioners, into the query, key, and value components of the self-attention module. We learn a network to generate these three domain conditioners from the class token at each transformer network layer. We find that, during fully online test-time adaptation, these domain conditioners at each transform network layer are able to gradually remove the impact of domain shift and largely recover the original self-attention profile. Our extensive experimental results demonstrate that the proposed domain-conditioned transformer significantly improves the online fully test-time domain adaptation performance and outperforms existing state-of-the-art methods by large margins.
Published: 2024

18. Learning from the past: predicting critical transitions with machine learning trained on surrogates of historical data

Author: Ma, Zhiqin, Zeng, Chunhua, Zhang, Yi-Cheng, and Bury, Thomas M.
Subjects: Physics - Data Analysis, Statistics and Probability, Computer Science - Machine Learning
Abstract: Complex systems can undergo critical transitions, where slowly changing environmental conditions trigger a sudden shift to a new, potentially catastrophic state. Early warning signals for these events are crucial for decision-making in fields such as ecology, biology and climate science. Generic early warning signals motivated by dynamical systems theory have had mixed success on real noisy data. More recent studies found that deep learning classifiers trained on synthetic data could improve performance. However, neither of these methods take advantage of historical, system-specific data. Here, we introduce an approach that trains machine learning classifiers directly on surrogate data of past transitions, namely surrogate data-based machine learning (SDML). The approach provides early warning signals in empirical and experimental data from geology, climatology, sociology, and cardiology with higher sensitivity and specificity than two widely used generic early warning signals -- variance and lag-1 autocorrelation. Since the approach is trained directly on surrogates of historical data, it is not bound by the restricting assumption of a local bifurcation like previous methods. This system-specific approach can contribute to improved early warning signals to help humans better prepare for or avoid undesirable critical transitions.
Published: 2024

19. ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection

Author: Yan, Yibo, Wang, Shen, Huo, Jiahao, Li, Hang, Li, Boyan, Su, Jiamin, Gao, Xiong, Zhang, Yi-Fan, Xu, Tianlong, Chu, Zhendong, Zhong, Aoxiao, Wang, Kun, Xiong, Hui, Yu, Philip S., Hu, Xuming, and Wen, Qingsong
Subjects: Computer Science - Computation and Language
Abstract: As the field of Multimodal Large Language Models (MLLMs) continues to evolve, their potential to revolutionize artificial intelligence is particularly promising, especially in addressing mathematical reasoning tasks. Current mathematical benchmarks predominantly focus on evaluating MLLMs' problem-solving ability, yet there is a crucial gap in addressing more complex scenarios such as error detection, for enhancing reasoning capability in complicated settings. To fill this gap, we formally formulate the new task: multimodal error detection, and introduce ErrorRadar, the first benchmark designed to assess MLLMs' capabilities in such a task. ErrorRadar evaluates two sub-tasks: error step identification and error categorization, providing a comprehensive framework for evaluating MLLMs' complex mathematical reasoning ability. It consists of 2,500 high-quality multimodal K-12 mathematical problems, collected from real-world student interactions in an educational organization, with rigorous annotation and rich metadata such as problem type and error category. Through extensive experiments, we evaluated both open-source and closed-source representative MLLMs, benchmarking their performance against educational expert evaluators. Results indicate significant challenges still remain, as GPT-4o with best performance is still around 10% behind human evaluation. The dataset will be available upon acceptance.
Published: 2024

20. LHAASO detection of very-high-energy gamma-ray emission surrounding PSR J0248+6021

Author: Cao, Zhen, Aharonian, F., An, Q., Axikegu, Bai, Y. X., Bao, Y. W., Bastieri, D., Bi, X. J., Bi, Y. J., Cai, J. T., Cao, Q., Cao, W. Y., Cao, Zhe, Chang, J., Chang, J. F., Chen, A. M., Chen, E. S., Chen, Liang, Chen, Lin, Chen, Long, Chen, M. J., Chen, M. L., Chen, Q. H., Chen, S. H., Chen, S. Z., Chen, T. L., Chen, Y., Cheng, N., Cheng, Y. D., Cui, M. Y., Cui, S. W., Cui, X. H., Cui, Y. D., Dai, B. Z., Dai, H. L., Dai, Z. G., Danzengluobu, Dong, X. Q., Duan, K. K., Fan, J. H., Fan, Y. Z., Fang, J., Fang, K., Feng, C. F., Feng, L., Feng, S. H., Feng, X. T., Feng, Y. L., Gabici, S., Gao, B., Gao, C. D., Gao, L. Q., Gao, Q., Gao, W., Gao, W. K., Ge, M. M., Geng, L. S., Giacinti, G., Gong, G. H., Gou, Q. B., Gu, M. H., Guo, F. L., Guo, X. L., Guo, Y. Q., Guo, Y. Y., Han, Y. A., He, H. H., He, H. N., He, J. Y., He, X. B., He, Y., Hor, Y. K., Hou, B. W., Hou, C., Hou, X., Hu, H. B., Hu, Q., Hu, S. C., Huang, D. H., Huang, T. Q., Huang, W. J., Huang, X. T., Huang, X. Y., Huang, Y., Huang, Z. C., Ji, X. L., Jia, H. Y., Jia, K., Jiang, K., Jiang, X. W., Jiang, Z. J., Jin, M., Kang, M. M., Ke, T., Kuleshov, D., Kurinov, K., Li, B. B., Li, Cheng, Li, Cong, Li, D., Li, F., Li, H. B., Li, H. C., Li, H. Y., Li, J., Li, Jian, Li, Jie, Li, K., Li, W. L., Li, X. R., Li, Xin, Li, Y. Z., Li, Zhe, Li, Zhuo, Liang, E. W., Liang, Y. F., Lin, J., Liu, B., Liu, C., Liu, D., Liu, H., Liu, H. D., Liu, J., Liu, J. L., Liu, J. Y., Liu, M. Y., Liu, R. Y., Liu, S. M., Liu, W., Liu, Y., Liu, Y. N., Lu, R., Luo, Q., Lv, H. K., Ma, B. Q., Ma, L. L., Ma, X. H., Mao, J. R., Min, Z., Mitthumsiri, W., Mu, H. J., Nan, Y. C., Neronov, A., Ou, Z. W., Pang, B. Y., Pattarakijwanich, P., Pei, Z. Y., Qi, M. Y., Qi, Y. Q., Qiao, B. Q., Qin, J. J., Ruffolo, D., Sáiz, A., Semikoz, D., Shao, C. Y., Shao, L., Shchegolev, O., Sheng, X. D., Shu, F. W., Song, H. C., Stenkin, Yu. V., Stepanov, V., Su, Y., Sun, Q. N., Sun, X. N., Sun, Z. B., Tam, P. H. T., Tang, Q. W., Tang, Z. B., Tian, W. W., Wang, C., Wang, C. B., Wang, G. W., Wang, H. G., Wang, H. H., Wang, J. C., Wang, K., Wang, L. P., Wang, L. Y., Wang, P. H., Wang, R., Wang, W., Wang, X. G., Wang, X. Y., Wang, Y., Wang, Y. D., Wang, Y. J., Wang, Z. H., Wang, Z. X., Wang, Zhen, Wang, Zheng, Wei, D. M., Wei, J. J., Wei, Y. J., Wen, T., Wu, C. Y., Wu, H. R., Wu, S., Wu, X. F., Wu, Y. S., Xi, S. Q., Xia, J., Xia, J. J., Xiang, G. M., Xiao, D. X., Xiao, G., Xin, G. G., Xin, Y. L., Xing, Y., Xiong, Z., Xu, D. L., Xu, R. F., Xu, R. X., Xu, W. L., Xue, L., Yan, D. H., Yan, J. Z., Yan, T., Yang, C. W., Yang, F., Yang, F. F., Yang, H. W., Yang, J. Y., Yang, L. L., Yang, M. J., Yang, R. Z., Yang, S. B., Yao, Y. H., Yao, Z. G., Ye, Y. M., Yin, L. Q., Yin, N., You, X. H., You, Z. Y., Yu, Y. H., Yuan, Q., Yue, H., Zeng, H. D., Zeng, T. X., Zeng, W., Zha, M., Zhang, B. B., Zhang, F., Zhang, H. M., Zhang, H. Y., Zhang, J. L., Zhang, L. X., Zhang, Li, Zhang, P. F., Zhang, P. P., Zhang, R., Zhang, S. B., Zhang, S. R., Zhang, S. S., Zhang, X., Zhang, X. P., Zhang, Y. F., Zhang, Yi, Zhang, Yong, Zhao, B., Zhao, J., Zhao, L., Zhao, L. Z., Zhao, S. P., Zheng, F., Zheng, J. H., Zhou, B., Zhou, H., Zhou, J. N., Zhou, M., Zhou, P., Zhou, R., Zhou, X. X., Zhu, C. G., Zhu, F. R., Zhu, H., Zhu, K. J., Zou, Y. C., and Zuo, X.
Subjects: Astrophysics - High Energy Astrophysical Phenomena
Abstract: We report the detection of an extended very-high-energy (VHE) gamma-ray source coincident with the locations of middle-aged (62.4~\rm kyr) pulsar PSR J0248+6021, by using the LHAASO-WCDA data of live 796 days and LHAASO-KM2A data of live 1216 days. A significant excess of \gray induced showers is observed both by WCDA in energy bands of 1-25~\rm TeV and KM2A in energy bands of $>$ 25~\rm TeV with 7.3 $\sigma$ and 13.5 $\sigma$, respectively. The best-fit position derived through WCDA data is R.A. = 42.06$^\circ \pm$ 0.12$^\circ$ and Dec. = 60.24$^\circ \pm $ 0.13$^\circ$ with an extension of 0.69$^\circ\pm$0.15$^\circ$ and that of the KM2A data is R.A.= 42.29$^\circ \pm $ 0.13$^\circ$ and Dec. = 60.38$^\circ \pm$ 0.07$^\circ$ with an extension of 0.37$^\circ\pm$0.07$^\circ$. No clear extended multiwavelength counterpart of this LHAASO source has been found from the radio band to the GeV band. The most plausible explanation of the VHE \gray emission is the inverse Compton process of highly relativistic electrons and positrons injected by the pulsar. These electrons/positrons are hypothesized to be either confined within the pulsar wind nebula or to have already escaped into the interstellar medium, forming a pulsar halo., Comment: 12 pages, 10 figures, Accepted by Sci. China-Phys. Mech. Astron
Published: 2024

21. Structured List-Grounded Question Answering

Author: Sung, Mujeen, Feng, Song, Gung, James, Shu, Raphael, Zhang, Yi, and Mansour, Saab
Subjects: Computer Science - Computation and Language
Abstract: Document-grounded dialogue systems aim to answer user queries by leveraging external information. Previous studies have mainly focused on handling free-form documents, often overlooking structured data such as lists, which can represent a range of nuanced semantic relations. Motivated by the observation that even advanced language models like GPT-3.5 often miss semantic cues from lists, this paper aims to enhance question answering (QA) systems for better interpretation and use of structured lists. To this end, we introduce the LIST2QA dataset, a novel benchmark to evaluate the ability of QA systems to respond effectively using list information. This dataset is created from unlabeled customer service documents using language models and model-based filtering processes to enhance data quality, and can be used to fine-tune and evaluate QA models. Apart from directly generating responses through fine-tuned models, we further explore the explicit use of Intermediate Steps for Lists (ISL), aligning list items with user backgrounds to better reflect how humans interpret list items before generating responses. Our experimental results demonstrate that models trained on LIST2QA with our ISL approach outperform baselines across various metrics. Specifically, our fine-tuned Flan-T5-XL model shows increases of 3.1% in ROUGE-L, 4.6% in correctness, 4.5% in faithfulness, and 20.6% in completeness compared to models without applying filtering and the proposed ISL method.
Published: 2024

22. Lost in Tracking: Uncertainty-guided Cardiac Cine MRI Segmentation at Right Ventricle Base

Author: Zhao, Yidong, Zhang, Yi, Simonetti, Orlando, Han, Yuchi, and Tao, Qian
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Accurate biventricular segmentation of cardiac magnetic resonance (CMR) cine images is essential for the clinical evaluation of heart function. However, compared to left ventricle (LV), right ventricle (RV) segmentation is still more challenging and less reproducible. Degenerate performance frequently occurs at the RV base, where the in-plane anatomical structures are complex (with atria, valve, and aorta) and vary due to the strong interplanar motion. In this work, we propose to address the currently unsolved issues in CMR segmentation, specifically at the RV base, with two strategies: first, we complemented the public resource by reannotating the RV base in the ACDC dataset, with refined delineation of the right ventricle outflow tract (RVOT), under the guidance of an expert cardiologist. Second, we proposed a novel dual encoder U-Net architecture that leverages temporal incoherence to inform the segmentation when interplanar motions occur. The inter-planar motion is characterized by loss-of-tracking, via Bayesian uncertainty of a motion-tracking model. Our experiments showed that our method significantly improved RV base segmentation taking into account temporal incoherence. Furthermore, we investigated the reproducibility of deep learning-based segmentation and showed that the combination of consistent annotation and loss of tracking could enhance the reproducibility of RV segmentation, potentially facilitating a large number of clinical studies focusing on RV.
Published: 2024
Full Text: View/download PDF

23. CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation

Author: He, Han, Liu, Qianchu, Xu, Lei, Shivade, Chaitanya, Zhang, Yi, Srinivasan, Sundararajan, and Kirchhoff, Katrin
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Existing automatic prompt engineering methods are typically designed for discriminative tasks, where new task prompts are iteratively refined with limited feedback from a single metric reflecting a single aspect. However, these approaches are suboptimal for generative tasks, which require more nuanced guidance beyond a single numeric metric to improve the prompt and optimize multiple aspects of the generated text. To address these challenges, we propose a novel multi-aspect Critique-Suggestion-guided automatic Prompt Optimization (CriSPO) approach. CriSPO introduces a critique-suggestion module as its core component. This module spontaneously discovers aspects, and compares generated and reference texts across these aspects, providing specific suggestions for prompt modification. These clear critiques and actionable suggestions guide a receptive optimizer module to make more substantial changes, exploring a broader and more effective search space. To further improve CriSPO with multi-metric optimization, we introduce an Automatic Suffix Tuning (AST) extension to enhance the performance of task prompts across multiple metrics. We evaluate CriSPO on 4 state-of-the-art LLMs across 4 summarization and 5 QA datasets. Extensive experiments show 3-4\% ROUGE score improvement on summarization and substantial improvement of various metrics on QA.
Published: 2024

24. Stability of a class of supercritical volume-filling chemotaxis-fluid model near Couette flow

Author: Wang, Lili, Wang, Wendong, and Zhang, Yi
Subjects: Mathematics - Analysis of PDEs
Abstract: Consider a class of chemotaxis-fluid model incorporating a volume-filling effect in the sense of Painter and Hillen (Can. Appl. Math. Q. 2002; 10(4): 501-543), which is a supercritical parabolic-elliptic Keller-Segel system. As shown by Winkler et al., for any given mass, there exists a corresponding solution of the same mass that blows up in either finite or infinite time. In this paper, we investigate the stability properties of the two dimensional Patlak-Keller-Segel-type chemotaxis-fluid model near the Couette flow $ (Ay, 0) $ in $ \mathbb{T}\times\mathbb{R}, $ and show that the solutions are global in time as long as the initial cell mass $M<\frac{2\pi}{\sqrt{3}} $ and the shear flow is sufficiently strong ($A$ is large enough)., Comment: arXiv admin note: text overlap with arXiv:2405.10337
Published: 2024

25. PhyMPGN: Physics-encoded Message Passing Graph Network for spatiotemporal PDE systems

Author: Zeng, Bocheng, Wang, Qi, Yan, Mengtao, Liu, Yang, Chengze, Ruizhi, Zhang, Yi, Liu, Hongsheng, Wang, Zidong, and Sun, Hao
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computational Engineering, Finance, and Science
Abstract: Solving partial differential equations (PDEs) serves as a cornerstone for modeling complex dynamical systems. Recent progresses have demonstrated grand benefits of data-driven neural-based models for predicting spatiotemporal dynamics (e.g., tremendous speedup gain compared with classical numerical methods). However, most existing neural models rely on rich training data, have limited extrapolation and generalization abilities, and suffer to produce precise or reliable physical prediction under intricate conditions (e.g., irregular mesh or geometry, complex boundary conditions, diverse PDE parameters, etc.). To this end, we propose a new graph learning approach, namely, Physics-encoded Message Passing Graph Network (PhyMPGN), to model spatiotemporal PDE systems on irregular meshes given small training datasets. Specifically, we incorporate a GNN into a numerical integrator to approximate the temporal marching of spatiotemporal dynamics for a given PDE system. Considering that many physical phenomena are governed by diffusion processes, we further design a learnable Laplace block, which encodes the discrete Laplace-Beltrami operator, to aid and guide the GNN learning in a physically feasible solution space. A boundary condition padding strategy is also designed to improve the model convergence and accuracy. Extensive experiments demonstrate that PhyMPGN is capable of accurately predicting various types of spatiotemporal dynamics on coarse unstructured meshes, consistently achieves the state-of-the-art results, and outperforms other baselines with considerable gains.
Published: 2024

26. Mamba Neural Operator: Who Wins? Transformers vs. State-Space Models for PDEs

Author: Cheng, Chun-Wun, Huang, Jiahao, Zhang, Yi, Yang, Guang, Schönlieb, Carola-Bibiane, and Aviles-Rivero, Angelica I
Subjects: Computer Science - Machine Learning, Mathematics - Numerical Analysis
Abstract: Partial differential equations (PDEs) are widely used to model complex physical systems, but solving them efficiently remains a significant challenge. Recently, Transformers have emerged as the preferred architecture for PDEs due to their ability to capture intricate dependencies. However, they struggle with representing continuous dynamics and long-range interactions. To overcome these limitations, we introduce the Mamba Neural Operator (MNO), a novel framework that enhances neural operator-based techniques for solving PDEs. MNO establishes a formal theoretical connection between structured state-space models (SSMs) and neural operators, offering a unified structure that can adapt to diverse architectures, including Transformer-based models. By leveraging the structured design of SSMs, MNO captures long-range dependencies and continuous dynamics more effectively than traditional Transformers. Through extensive analysis, we show that MNO significantly boosts the expressive power and accuracy of neural operators, making it not just a complement but a superior framework for PDE-related tasks, bridging the gap between efficient representation and accurate solution approximation.
Published: 2024

27. CVVLSNet: Vehicle Location and Speed Estimation Using Partial Connected Vehicle Trajectory Data

Author: Ye, Jiachen, Wang, Dingyu, Jia, Shaocheng, Pei, Xin, Yang, Zi, Zhang, Yi, and Wong, S. C.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Real-time estimation of vehicle locations and speeds is crucial for developing many beneficial transportation applications in traffic management and control, e.g., adaptive signal control. Recent advances in communication technologies facilitate the emergence of connected vehicles (CVs), which can share traffic information with nearby CVs or infrastructures. At the early stage of connectivity, only a portion of vehicles are CVs. The locations and speeds for those non-CVs (NCs) are not accessible and must be estimated to obtain the full traffic information. To address the above problem, this paper proposes a novel CV-based Vehicle Location and Speed estimation network, CVVLSNet, to simultaneously estimate the vehicle locations and speeds exclusively using partial CV trajectory data. A road cell occupancy (RCO) method is first proposed to represent the variable vehicle state information. Spatiotemporal interactions can be integrated by simply fusing the RCO representations. Then, CVVLSNet, taking the Coding-RAte TransformEr (CRATE) network as a backbone, is introduced to estimate the vehicle locations and speeds. Moreover, physical vehicle size constraints are also considered in loss functions. Extensive experiments indicate that the proposed method significantly outperformed the existing method under various CV penetration rates, signal timings, and volume-to-capacity ratios.
Published: 2024

28. Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey

Author: Zhang, Yi, Chen, Zhen, Cheng, Chih-Hong, Ruan, Wenjie, Huang, Xiaowei, Zhao, Dezong, Flynn, David, Khastgir, Siddartha, and Zhao, Xingyu
Subjects: Computer Science - Machine Learning
Abstract: Text-to-Image (T2I) Diffusion Models (DMs) have garnered widespread attention for their impressive advancements in image generation. However, their growing popularity has raised ethical and social concerns related to key non-functional properties of trustworthiness, such as robustness, fairness, security, privacy, factuality, and explainability, similar to those in traditional deep learning (DL) tasks. Conventional approaches for studying trustworthiness in DL tasks often fall short due to the unique characteristics of T2I DMs, e.g., the multi-modal nature. Given the challenge, recent efforts have been made to develop new methods for investigating trustworthiness in T2I DMs via various means, including falsification, enhancement, verification \& validation and assessment. However, there is a notable lack of in-depth analysis concerning those non-functional properties and means. In this survey, we provide a timely and focused review of the literature on trustworthy T2I DMs, covering a concise-structured taxonomy from the perspectives of property, means, benchmarks and applications. Our review begins with an introduction to essential preliminaries of T2I DMs, and then we summarise key definitions/metrics specific to T2I tasks and analyses the means proposed in recent literature based on these definitions/metrics. Additionally, we review benchmarks and domain applications of T2I DMs. Finally, we highlight the gaps in current research, discuss the limitations of existing methods, and propose future research directions to advance the development of trustworthy T2I DMs. Furthermore, we keep up-to-date updates in this field to track the latest developments and maintain our GitHub repository at: https://github.com/wellzline/Trustworthy_T2I_DMs, Comment: under review
Published: 2024

29. Properties of the QCD Matter: A Review of Selected Results from the ALICE Experiment

Author: Shou, Qi-Ye, Ma, Yu-Gang, Zhang, Song, Zhu, Jian-Hui, Mao, Ya-Xian, Pei, Hua, Yin, Zhong-Bao, Zhang, Xiao-Ming, Zhou, Dai-Cui, Peng, Xin-Ye, Bai, Xiao-Zhi, Tang, Ze-Bo, Zhang, Yi-Fei, and Li, Xiao-Mei
Subjects: Nuclear Experiment, High Energy Physics - Experiment
Abstract: The Large Hadron Collider (LHC), the world's largest and most powerful particle accelerator, has been a pivotal tool in advancing our understanding of fundamental physics. By colliding heavy ions (such as lead ions), the LHC recreates conditions similar to those just after the Big Bang. This allows scientists to study the Quark-Gluon Plasma (QGP), a state of matter where quarks and gluons are not confined within protons and neutrons. These studies provide insights into the strong force and the early universe's behavior. In this paper, we provide a comprehensive overview of recent significant findings from A Large Ion Collider Experiment (ALICE) at LHC. The topics encompass measurements regarding to properties of the QGP, particle production, flow and correlations, dileptons, quarkonia and electromagnetic probes, heavy flavor, and jets. Additionally, we introduce future plans for detector upgrades of the ALICE experiment., Comment: 29 pages, 32 figures. This review is dedicated to Professor Wenqing Shen in honor of his leadership and significant impact on the Chinese heavy-ion physics community. All authors contributed equally to this work
Published: 2024

30. Creation of independently controllable and long lifetime polar skyrmion textures in ferroelectric-metallic heterostructures

Author: Sun, Fei, Ren, Jianhua, Li, Hongfang, Wu, Yiwei, Liang, Jianwei, Yang, Hui, Zhang, Yi, Liu, Jianyi, Liu, Linjie, Wu, Mengjun, Zhang, Xiaoyue, Zhu, Wenpeng, Chen, Weijin, and Zheng, Yue
Subjects: Condensed Matter - Materials Science
Abstract: Topological textures like vortices, labyrinths and skyrmions formed in ferroic materials have attracted extensive interests during the past decade for their fundamental physics, intriguing topology, and technological prospects. So far, polar skyrmions remain scarce in ferroelectrics as they require a delicate balance between various dipolar interactions. Here, we report that PbTiO3 thin films in a metallic contact undergo a topological phase transition and stabilize a broad family of skyrmion-like textures (e.g., skyrmion bubbles, multiple {\pi}-twist target skyrmions, and skyrmion bags) with independent controllability, analogous to those reported in magnetic systems. Weakly-interacted skyrmion arrays with a density over 300 Gb/inch2 are successfully written, erased and read-out by local electrical and mechanical stimuli of a scanning probe. Interestingly, in contrast to the relatively short lifetime <20 hours of the skyrmion bubbles, the multiple {\pi}-twist target skyrmions and skyrmion bags show topology-enhanced stability with lifetime over two weeks. Experimental and theoretical analysis implies the heterostructures carry electric Dzyaloshinskii-Moriya interaction mediated by oxygen octahedral tiltings. Our results demonstrate ferroelectric-metallic heterostructures as fertile playground for topological states and emergent phenomena.
Published: 2024

31. Few-Shot Testing of Autonomous Vehicles with Scenario Similarity Learning

Author: Li, Shu, He, Honglin, Yang, Jingxuan, Hu, Jianming, Zhang, Yi, and Feng, Shuo
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: Testing and evaluation are critical to the development and deployment of autonomous vehicles (AVs). Given the rarity of safety-critical events such as crashes, millions of tests are typically needed to accurately assess AV safety performance. Although techniques like importance sampling can accelerate this process, it usually still requires too many numbers of tests for field testing. This severely hinders the testing and evaluation process, especially for third-party testers and governmental bodies with very limited testing budgets. The rapid development cycles of AV technology further exacerbate this challenge. To fill this research gap, this paper introduces the few-shot testing (FST) problem and proposes a methodological framework to tackle it. As the testing budget is very limited, usually smaller than 100, the FST method transforms the testing scenario generation problem from probabilistic sampling to deterministic optimization, reducing the uncertainty of testing results. To optimize the selection of testing scenarios, a cross-attention similarity mechanism is proposed to learn to extract the information of AV's testing scenario space. This allows iterative searches for scenarios with the smallest evaluation error, ensuring precise testing within budget constraints. Experimental results in cut-in scenarios demonstrate the effectiveness of the FST method, significantly enhancing accuracy and enabling efficient, precise AV testing., Comment: 12 pages, 8 figures
Published: 2024

32. Trustworthy Hate Speech Detection Through Visual Augmentation

Author: Yang, Ziyuan, Yan, Ming, Chen, Yingyu, Wang, Hui, Lu, Zexin, and Zhang, Yi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: The surge of hate speech on social media platforms poses a significant challenge, with hate speech detection~(HSD) becoming increasingly critical. Current HSD methods focus on enriching contextual information to enhance detection performance, but they overlook the inherent uncertainty of hate speech. We propose a novel HSD method, named trustworthy hate speech detection method through visual augmentation (TrusV-HSD), which enhances semantic information through integration with diffused visual images and mitigates uncertainty with trustworthy loss. TrusV-HSD learns semantic representations by effectively extracting trustworthy information through multi-modal connections without paired data. Our experiments on public HSD datasets demonstrate the effectiveness of TrusV-HSD, showing remarkable improvements over conventional methods.
Published: 2024

33. Quantifying Observational Projection Effects with a Simulation-based hot CGM model

Author: Shreeram, Soumya, Comparat, Johan, Merloni, Andrea, Zhang, Yi, Ponti, Gabriele, Nandra, Kirpal, ZuHone, John, Marini, Ilaria, Vladutescu-Zopp, Stephan, Popesso, Paola, Pakmor, Ruediger, Seppi, Riccardo, Peroux, Celine, and Sorini, Daniele
Subjects: Astrophysics - Astrophysics of Galaxies, Astrophysics - High Energy Astrophysical Phenomena
Abstract: The hot phase of the circumgalactic medium (CGM) allows us to probe the inflow and outflow of gas within a galaxy, which is responsible for dictating the evolution of the galaxy. Studying the hot CGM sheds light on a better understanding of gas physics, which is crucial to inform and constrain simulation models. With the recent advances in observational measurements probing the hot CGM in X-rays and tSZ, we have a new avenue for widening our knowledge of gas physics and feedback by exploiting the information from current/future observations. In this paper, we use the TNG300 hydrodynamical simulations to build a fully self-consistent forward model for the hot CGM. We construct a lightcone and generate mock X-ray observations. We quantify the projection effects, namely the locally correlated large-scale structure in X-rays and the effect due to satellite galaxies misclassified as centrals which affects the measured hot CGM galactocentric profiles in stacking experiments. We present an analytical model that describes the intrinsic X-ray surface brightness profile across the stellar and halo mass bins. The increasing stellar mass bins result in decreasing values of $\beta$, the exponent quantifying the slope of the intrinsic galactocentric profiles. We carry forward the current state-of-the-art by also showing the impact of the locally correlated environment on the measured X-ray surface brightness profiles. We also present, for the first time, the effect of misclassified centrals in stacking experiments for three stellar mass bins: $10^{10.5-11}\ M_\odot$, $10^{11-11.2}\ M_\odot$, and $10^{11.2-11.5}\ M_\odot$. We find that the contaminating effect of the misclassified centrals on the stacked profiles increases when the stellar mass decreases., Comment: 14 pages, 10 figures, Submitted to A&A, comments welcome
Published: 2024

34. Josephson diodes induced by the loop current states

Author: Shen, Qi-Kai and Zhang, Yi
Subjects: Condensed Matter - Superconductivity, Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: We study the diode effect of the supercurrent in the Josephson junctions with the loop current states as the tunneling barrier. The loop current states are realized by the Haldane model which preserves the inversion symmetry and thus forbids the diode effect. We demonstrate how the inversion symmetry can be broken in the monolayer and bilayer systems. In the monolayer system, inversion symmetry can be broken by either introducing a sublattice staggered potential for the Haldane model or introducing a modified Haldane model, and in the bilayer system, it can be broken by either staking the two layers with opposite current directions or by directly applying an electric field perpendicular to the layers. We further show that the diode efficiency can be tuned by the interlayer coupling and the strength of the electric field or interlayer voltage. Our results provide another route to realize the Josephson diode effect by breaking the time-reversal symmetry through the loop-current states., Comment: 10 pages, 8 figures, and 1 table
Published: 2024

35. Robust Coulomb Gap and Varied-temperature Study of Epitaxial 1T'-WSe$_2$ Monolayers

Author: Chen, Wang, Hu, Mengli, Zong, Junyu, Xie, Xuedong, Ren, Wei, Meng, Qinghao, Yu, Fan, Tian, Qichao, Jin, Shaoen, Qiu, Xiaodong, Wang, Kaili, Wang, Can, Liu, Junwei, Li, Fang-Sen, Wang, Li, and Zhang, Yi
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science
Abstract: The transition metal dichalcogenides (TMDCs) with a 1T' structural phase are predicted to be two-dimensional topological insulators at zero temperature. Although the quantized edge conductance of 1T'-WTe$_2$ has been confirmed to survive up to 100 K, this temperature is still relatively low for industrial applications. Addressing the limited studies on temperature effects in 1T'-TMDCs, our research focuses on the electronic and crystal properties of the epitaxial 1T'-WSe$_2$ monolayers grown on bilayer graphene (BLG) and SrTiO$_3$(100) substrates at various temperatures. For the 1T'-WSe$_2$ grown on BLG, we observed a significant thermal expansion effect on its band structures with a thermal expansion coefficient of $\sim$60$\times$10$^{-6}$ K$^{-1}$. In contrast, the 1T'-WSe$_2$ grown on SrTiO$_3$(100) exhibits minimal changes with varied temperatures due to the enhanced strain exerted by the substrate. Besides, A significant Coulomb gap (CG) was observed pinned at the Fermi level in the angle-resolved photoemission spectroscopy (ARPES) and scanning tunneling spectroscopy (STS). The CG was founded to decrease with increasing temperatures, and can persist up to 200 K for 1T'-WSe$_2$/BLG, consistent with our Monte Carlo simulations. The robustness of the CG and the positive fundamental gap endow the epitaxial 1T'-WSe$_2$ monolayers with huge potential for realizing the quantum spin Hall devices.
Published: 2024

36. AI-Driven Virtual Teacher for Enhanced Educational Efficiency: Leveraging Large Pretrain Models for Autonomous Error Analysis and Correction

Author: Xu, Tianlong, Zhang, Yi-Fan, Chu, Zhendong, Wang, Shen, and Wen, Qingsong
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Multimedia
Abstract: Students frequently make mistakes while solving mathematical problems, and traditional error correction methods are both time-consuming and labor-intensive. This paper introduces an innovative \textbf{V}irtual \textbf{A}I \textbf{T}eacher system designed to autonomously analyze and correct student \textbf{E}rrors (VATE). Leveraging advanced large language models (LLMs), the system uses student drafts as a primary source for error analysis, which enhances understanding of the student's learning process. It incorporates sophisticated prompt engineering and maintains an error pool to reduce computational overhead. The AI-driven system also features a real-time dialogue component for efficient student interaction. Our approach demonstrates significant advantages over traditional and machine learning-based error correction methods, including reduced educational costs, high scalability, and superior generalizability. The system has been deployed on the Squirrel AI learning platform for elementary mathematics education, where it achieves 78.3\% accuracy in error analysis and shows a marked improvement in student learning efficiency. Satisfaction surveys indicate a strong positive reception, highlighting the system's potential to transform educational practices.
Published: 2024

37. xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing

Author: Niu, Haoyi, Chen, Qimao, Liu, Tenglong, Li, Jianxiong, Zhou, Guyue, Zhang, Yi, Hu, Jianming, and Zhan, Xianyuan
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: Reusing pre-collected data from different domains is an appealing solution for decision-making tasks that have insufficient data in the target domain but are relatively abundant in other related domains. Existing cross-domain policy transfer methods mostly aim at learning domain correspondences or corrections to facilitate policy learning, such as learning domain/task-specific discriminators, representations, or policies. This design philosophy often results in heavy model architectures or task/domain-specific modeling, lacking flexibility. This reality makes us wonder: can we directly bridge the domain gaps universally at the data level, instead of relying on complex downstream cross-domain policy transfer models? In this study, we propose the Cross-Domain Trajectory EDiting (xTED) framework that employs a specially designed diffusion model for cross-domain trajectory adaptation. Our proposed model architecture effectively captures the intricate dependencies among states, actions, and rewards, as well as the dynamics patterns within target data. By utilizing the pre-trained diffusion as a prior, source domain trajectories can be transformed to match with target domain properties while preserving original semantic information. This process implicitly corrects underlying domain gaps, enhancing state realism and dynamics reliability in the source data, and allowing flexible incorporation with various downstream policy learning methods. Despite its simplicity, xTED demonstrates superior performance in extensive simulation and real-robot experiments., Comment: xTED offers a novel, generic, flexible, simple and effective paradigm that casts cross-domain policy adaptation as a data pre-processing problem
Published: 2024

38. LogoRA: Local-Global Representation Alignment for Robust Time Series Classification

Author: Zhang, Huanyu, Zhang, Yi-Fan, Zhang, Zhang, Wen, Qingsong, and Wang, Liang
Subjects: Computer Science - Machine Learning
Abstract: Unsupervised domain adaptation (UDA) of time series aims to teach models to identify consistent patterns across various temporal scenarios, disregarding domain-specific differences, which can maintain their predictive accuracy and effectively adapt to new domains. However, existing UDA methods struggle to adequately extract and align both global and local features in time series data. To address this issue, we propose the Local-Global Representation Alignment framework (LogoRA), which employs a two-branch encoder, comprising a multi-scale convolutional branch and a patching transformer branch. The encoder enables the extraction of both local and global representations from time series. A fusion module is then introduced to integrate these representations, enhancing domain-invariant feature alignment from multi-scale perspectives. To achieve effective alignment, LogoRA employs strategies like invariant feature learning on the source domain, utilizing triplet loss for fine alignment and dynamic time warping-based feature alignment. Additionally, it reduces source-target domain gaps through adversarial training and per-class prototype alignment. Our evaluations on four time-series datasets demonstrate that LogoRA outperforms strong baselines by up to $12.52\%$, showcasing its superiority in time series UDA tasks., Comment: Accepted by IEEE Transactions on Knowledge and Data Engineering
Published: 2024

39. Active-Passive Federated Learning for Vertically Partitioned Multi-view Data

Author: Liu, Jiyuan, Liu, Xinwang, Wang, Siqi, Hu, Xingchen, Liao, Qing, Wan, Xinhang, Zhang, Yi, Lv, Xin, and He, Kunlun
Subjects: Computer Science - Machine Learning
Abstract: Vertical federated learning is a natural and elegant approach to integrate multi-view data vertically partitioned across devices (clients) while preserving their privacies. Apart from the model training, existing methods requires the collaboration of all clients in the model inference. However, the model inference is probably maintained for service in a long time, while the collaboration, especially when the clients belong to different organizations, is unpredictable in real-world scenarios, such as concellation of contract, network unavailablity, etc., resulting in the failure of them. To address this issue, we, at the first attempt, propose a flexible Active-Passive Federated learning (APFed) framework. Specifically, the active client is the initiator of a learning task and responsible to build the complete model, while the passive clients only serve as assistants. Once the model built, the active client can make inference independently. In addition, we instance the APFed framework into two classification methods with employing the reconstruction loss and the contrastive loss on passive clients, respectively. Meanwhile, the two methods are tested in a set of experiments and achieves desired results, validating their effectiveness.
Published: 2024

40. The Giant Radio Array for Neutrino Detection (GRAND) Collaboration -- Contributions to the 10th International Workshop on Acoustic and Radio EeV Neutrino Detection Activities (ARENA 2024)

Author: Batista, Rafael Alves, Benoit-Lévy, Aurélien, Bister, Teresa, Bohacova, Martina, Bustamante, Mauricio, Carvalho, Washington, Chen, Yiren, Cheng, LingMei, Chiche, Simon, Colley, Jean-Marc, Correa, Pablo, Laurenciu, Nicoleta Cucu, Dai, Zigao, de Almeida, Rogerio M., de Errico, Beatriz, de Jong, Sijbrand, Neto, João R. T. de Mello, de Vries, Krijn D, Decoene, Valentin, Denton, Peter B., Duan, Bohao, Duan, Kaikai, Engel, Ralph, Erba, William, Fan, Yizhong, Ferrière, Arsène, Gou, QuanBu, Gu, Junhua, Guelfand, Marion, Guo, Jianhua, Guo, Yiqing, Guépin, Claire, Gülzow, Lukas, Haungs, Andreas, Havelka, Matej, He, Haoning, Hivon, Eric, Hu, Hongbo, Huang, Xiaoyuan, Huang, Yan, Huege, Tim, Jiang, Wen, Koirala, Ramesh, Kong, ChuiZheng, Kotera, Kumiko, Köhler, Jelena, Lago, Bruno L., Lai, Zhisen, Coz, Sandra Le, Legrand, François, Leisos, Antonios, Li, Rui, Li, Xingyu, Li, YiFei, Liu, Cheng, Liu, Ruoyu, Liu, Wei, Ma, Pengxiong, Macias, Oscar, Magnard, Frédéric, Marcowith, Alexandre, Martineau-Huynh, Olivier, McKinley, Thomas, Minodier, Paul, Mitra, Pragati, Mostafá, Miguel, Murase, Kohta, Niess, Valentin, Nonis, Stavros, Ogio, Shoichi, Oikonomou, Foteini, Pan, Hongwei, Papageorgiou, Konstantinos, Pierog, Tanguy, Piotrowski, Lech Wiktor, Prunet, Simon, Qian, Xiangli, Roth, Markus, Sako, Takashi, Schoorlemmer, Harm, Szálas-Motesiczky, Dániel, Sławiński, Szymon, Tian, Xishui, Timmermans, Anne, Timmermans, Charles, Tobiska, Petr, Tsirigotis, Apostolos, Tueros, Matías, Vittakis, George, Wang, Hanrui, Wang, Jiale, Wang, Shen, Wang, Xiangyu, Wang, Xu, Wei, Daming, Wei, Feng, Wu, Xiangping, Wu, Xuefeng, Xu, Xin, Xu, Xing, Yang, Fufu, Yang, Lili, Yang, Xuan, Yuan, Qiang, Zarka, Philippe, Zeng, Houdun, Zhang, Chao, Zhang, Jianli, Zhang, Kewen, Zhang, Pengfei, Zhang, Qingchi, Zhang, Songbo, Zhang, Yi, Zhou, Hao, Wissel, Stephanie, Zeolla, Andrew, Deaconu, Cosmin, Hughes, Kaeli, Martin, Zachary, Mulrey, Katharine, Cummings, Austin, Krömer, Oliver, Plant, Kathryn, and Schroeder, Frank G.
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics, Astrophysics - High Energy Astrophysical Phenomena, High Energy Physics - Experiment, High Energy Physics - Phenomenology
Abstract: This is an index of the contributions by the Giant Radio Array for Neutrino Detection (GRAND) Collaboration to the 10th International Workshop on Acoustic and Radio EeV Neutrino Detection Activities (ARENA 2024, University of Chicago, June 11-14, 2024). The contributions include an overview of GRAND in its present and future incarnations, methods of radio-detection that are being developed for them, and ongoing joint work between the GRAND and BEACON experiments., Comment: Note: To access the list of contributions, please follow the "HTML" link that can be found on the arXiv page
Published: 2024

41. CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models

Author: Liu, Wentao, Pan, Qianjun, Zhang, Yi, Liu, Zhuo, Wu, Ji, Zhou, Jie, Zhou, Aimin, Chen, Qin, Jiang, Bo, and He, Liang
Subjects: Computer Science - Computation and Language
Abstract: Large language models (LLMs) have obtained promising results in mathematical reasoning, which is a foundational skill for human intelligence. Most previous studies focus on improving and measuring the performance of LLMs based on textual math reasoning datasets (e.g., MATH, GSM8K). Recently, a few researchers have released English multimodal math datasets (e.g., MATHVISTA and MATH-V) to evaluate the effectiveness of large multimodal models (LMMs). In this paper, we release a Chinese multimodal math (CMM-Math) dataset, including benchmark and training parts, to evaluate and enhance the mathematical reasoning of LMMs. CMM-Math contains over 28,000 high-quality samples, featuring a variety of problem types (e.g., multiple-choice, fill-in-the-blank, and so on) with detailed solutions across 12 grade levels from elementary to high school in China. Specifically, the visual context may be present in the questions or opinions, which makes this dataset more challenging. Through comprehensive analysis, we discover that state-of-the-art LMMs on the CMM-Math dataset face challenges, emphasizing the necessity for further improvements in LMM development. We also propose a Multimodal Mathematical LMM (Math-LMM) to handle the problems with mixed input of multiple images and text segments. We train our model using three stages, including foundational pre-training, foundational fine-tuning, and mathematical fine-tuning. The extensive experiments indicate that our model effectively improves math reasoning performance by comparing it with the SOTA LMMs over three multimodal mathematical datasets.
Published: 2024

42. BEAVER: An Enterprise Benchmark for Text-to-SQL

Author: Chen, Peter Baile, Wenz, Fabian, Zhang, Yi, Kayali, Moe, Tatbul, Nesime, Cafarella, Michael, Demiralp, Çağatay, and Stonebraker, Michael
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Databases
Abstract: Existing text-to-SQL benchmarks have largely been constructed using publicly available tables from the web with human-generated tests containing question and SQL statement pairs. They typically show very good results and lead people to think that LLMs are effective at text-to-SQL tasks. In this paper, we apply off-the-shelf LLMs to a benchmark containing enterprise data warehouse data. In this environment, LLMs perform poorly, even when standard prompt engineering and RAG techniques are utilized. As we will show, the reasons for poor performance are largely due to three characteristics: (1) public LLMs cannot train on enterprise data warehouses because they are largely in the "dark web", (2) schemas of enterprise tables are more complex than the schemas in public data, which leads the SQL-generation task innately harder, and (3) business-oriented questions are often more complex, requiring joins over multiple tables and aggregations. As a result, we propose a new dataset BEAVER, sourced from real enterprise data warehouses together with natural language queries and their correct SQL statements which we collected from actual user history. We evaluated this dataset using recent LLMs and demonstrated their poor performance on this task. We hope this dataset will facilitate future researchers building more sophisticated text-to-SQL systems which can do better on this important class of data.
Published: 2024

43. MedSAM-U: Uncertainty-Guided Auto Multi-Prompt Adaptation for Reliable MedSAM

Author: Zhou, Nan, Zou, Ke, Ren, Kai, Luo, Mengting, He, Linchao, Wang, Meng, Chen, Yidi, Zhang, Yi, Chen, Hu, and Fu, Huazhu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The Medical Segment Anything Model (MedSAM) has shown remarkable performance in medical image segmentation, drawing significant attention in the field. However, its sensitivity to varying prompt types and locations poses challenges. This paper addresses these challenges by focusing on the development of reliable prompts that enhance MedSAM's accuracy. We introduce MedSAM-U, an uncertainty-guided framework designed to automatically refine multi-prompt inputs for more reliable and precise medical image segmentation. Specifically, we first train a Multi-Prompt Adapter integrated with MedSAM, creating MPA-MedSAM, to adapt to diverse multi-prompt inputs. We then employ uncertainty-guided multi-prompt to effectively estimate the uncertainties associated with the prompts and their initial segmentation results. In particular, a novel uncertainty-guided prompts adaptation technique is then applied automatically to derive reliable prompts and their corresponding segmentation outcomes. We validate MedSAM-U using datasets from multiple modalities to train a universal image segmentation model. Compared to MedSAM, experimental results on five distinct modal datasets demonstrate that the proposed MedSAM-U achieves an average performance improvement of 1.7\% to 20.5\% across uncertainty-guided prompts., Comment: 10 pages, 4 figures
Published: 2024

44. Temporally distinct 3D multi-omic dynamics in the developing human brain

Author: Heffel, Matthew G, Zhou, Jingtian, Zhang, Yi, Lee, Dong-Sung, Hou, Kangcheng, Pastor-Alonso, Oier, Abuhanna, Kevin D, Galasso, Joseph, Kern, Colin, Tai, Chu-Yi, Garcia-Padilla, Carlos, Nafisi, Mahsa, Zhou, Yi, Schmitt, Anthony D, Li, Terence, Haeussler, Maximilian, Wick, Brittney, Zhang, Martin Jinye, Xie, Fangming, Ziffra, Ryan S, Mukamel, Eran A, Eskin, Eleazar, Nowakowski, Tomasz J, Dixon, Jesse R, Pasaniuc, Bogdan, Ecker, Joseph R, Zhu, Quan, Bintu, Bogdan, Paredes, Mercedes F, and Luo, Chongyuan
Subjects: Biological Sciences, Genetics, Human Genome, Brain Disorders, Mental Illness, Mental Health, Neurosciences, 2.1 Biological and endogenous factors, 1.1 Normal biological development and functioning, Neurological, General Science & Technology
Abstract: The human hippocampus and prefrontal cortex play critical roles in learning and cognition1,2, yet the dynamic molecular characteristics of their development remain enigmatic. Here we investigated the epigenomic and three-dimensional chromatin conformational reorganization during the development of the hippocampus and prefrontal cortex, using more than 53,000 joint single-nucleus profiles of chromatin conformation and DNA methylation generated by single-nucleus methyl-3C sequencing (snm3C-seq3)3. The remodelling of DNA methylation is temporally separated from chromatin conformation dynamics. Using single-cell profiling and multimodal single-molecule imaging approaches, we have found that short-range chromatin interactions are enriched in neurons, whereas long-range interactions are enriched in glial cells and non-brain tissues. We reconstructed the regulatory programs of cell-type development and differentiation, finding putatively causal common variants for schizophrenia strongly overlapping with chromatin loop-connected, cell-type-specific regulatory regions. Our data provide multimodal resources for studying gene regulatory dynamics in brain development and demonstrate that single-cell three-dimensional multi-omics is a powerful approach for dissecting neuropsychiatric risk loci.
Published: 2024

45. Hadronic cross section measurements with the DAMPE space mission using 20GeV-10TeV cosmic-ray protons and $^4$He

Author: Alemanno, F., An, Q., Azzarello, P., Barbato, F. C. T., Bernardini, P., Bi, X. J., Cagnoli, I., Cai, M. S., Casilli, E., Catanzani, E., Chang, J., Chen, D. Y., Chen, J. L., Chen, Z. F., Coppin, P., Cui, M. Y., Cui, T. S., Cui, Y. X., Dai, H. T., De Benedittis, A., De Mitri, I., de Palma, F., Di Giovanni, A., Ding, Q., Dong, T. K., Dong, Z. X., Donvito, G., Droz, D., Duan, J. L., Duan, K. K., Fan, R. R., Fan, Y. Z., Fang, F., Fang, K., Feng, C. Q., Feng, L., Frieden, J. M., Fusco, P., Gao, M., Gargano, F., Gong, K., Gong, Y. Z., Guo, D. Y., Guo, J. H., Han, S. X., Hu, Y. M., Huang, G. S., Huang, X. Y., Huang, Y. Y., Ionica, M., Jiang, L. Y., Jiang, Y. Z., Jiang, W., Kong, J., Kotenko, A., Kyratzis, D., Lei, S. J., Li, W. H., Li, W. L., Li, X., Li, X. Q., Liang, Y. M., Liu, C. M., Liu, H., Liu, J., Liu, S. B., Liu, Y., Loparco, F., Luo, C. N., Ma, M., Ma, P. X., Ma, T., Ma, X. Y., Marsella, G., Mazziotta, M. N., Mo, D., Niu, X. Y., Pan, X., Parenti, A., Peng, W. X., Peng, X. Y., Perrina, C., Putti-Garcia, E., Qiao, R., Rao, J. N., Ruina, A., Sarkar, R., Savina, P., Serpolla, A., Shangguan, Z., Shen, W. H., Shen, Z. Q., Shen, Z. T., Silveri, L., Song, J. X., Stolpovskiy, M., Su, H., Su, M., Sun, H. R., Sun, Z. Y., Surdo, A., Teng, X. J., Tykhonov, A., Wang, J. Z., Wang, L. G., Wang, S., Wang, S. X., Wang, X. L., Wang, Y., Wang, Y. F., Wang, Y. Z., Wang, Z. M., Wei, D. M., Wei, J. J., Wei, Y. F., Wu, D., Wu, J., Wu, S. S., Wu, X., Xia, Z. Q., Xu, H. T., Xu, J., Xu, Z. H., Xu, Z. L., Xu, E. H., Xu, Z. Z., Xue, G. F., Yang, H. B., Yang, P., Yang, Y. Q., Yao, H. J., Yu, Y. H., Yuan, G. W., Yuan, Q., Yue, C., Zang, J. J., Zhang, S. X., Zhang, W. Z., Zhang, Yan, Zhang, Yi, Zhang, Y. J., Zhang, Y. L., Zhang, Y. P., Zhang, Y. Q., Zhang, Z., Zhang, Z. Y., Zhao, C., Zhao, H. Y., Zhao, X. F., Zhou, C. Y., and Zhu, Y.
Subjects: High Energy Physics - Experiment
Abstract: Precise direct cosmic-ray (CR) measurements provide an important probe to study the energetic particle sources in our Galaxy, and the interstellar environment through which these particles propagate. Uncertainties on hadronic models, ion-nucleon cross sections in particular, are currently the limiting factor towards obtaining more accurate CR ion flux measurements with calorimetric space-based experiments. We present an energy-dependent measurement of the inelastic cross section of protons and helium-4 nuclei (alpha particles) on a Bi$_4$Ge$_3$O$_{12}$ target, using 88 months of data collected by the DAMPE space mission. The kinetic energy range per nucleon of the measurement points ranges from 18 GeV to 9 TeV for protons, and from 5 GeV/n to 3 TeV/n for helium-4 nuclei. Our results lead to a significant improvement of the CR flux normalisation. In the case of helium-4, these results correspond to the first cross section measurements on a heavy target material at energies above 10 GeV/n., Comment: 17 pages, submitted to PRD
Published: 2024

46. Spin-triplet pair density wave superconductors

Author: Zhang, Yi and Wang, Ziqiang
Subjects: Condensed Matter - Superconductivity, Condensed Matter - Strongly Correlated Electrons
Abstract: Recent experiments have shown that the nonzero center of mass momentum pair density wave (PDW) is a widespread phenomenon observed over different superconducting materials. However, concrete theoretical model realizations of the PDW order have remained elusive. Here, we study a one-dimensional model with nearest-neighbor pairing attraction, i.e. a spinful Kitaev chain, under generic spin-orbit couplings such that the spin-rotation symmetry is fully broken. The most general superconducting order parameter is described by a spatial dependent $\mathbf{d}_i$-vector. We show that a spin-triplet pair density wave (t-PDW) emerges in the ground state and occupies a large part of the phase diagram. The $\mathbf{d_i}$-vector of the t-PDW rotates with a pitch $Q_{\rm pdw}$ along the chain and spans an ellipsoid. The pure t-PDW is fully-gapped and a class-DIII topological superconductor with two Majorana zero modes localized at each end of the chain and protected by time-reversal symmetry. Our findings reveal unprecedented insights into the exotic pure PDW superconductor and provide a possible explanation for the one-dimensional PDW detected along domain walls in monolayer iron-based superconductor Fe(Te,Se) and potentially realizable using other quantum structures in unconventional superconductors., Comment: 10 pages, 6 figures
Published: 2024

47. MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

Author: Zhang, Yi-Fan, Zhang, Huanyu, Tian, Haochen, Fu, Chaoyou, Zhang, Shuangqing, Wu, Junfei, Li, Feng, Wang, Kun, Wen, Qingsong, Zhang, Zhang, Wang, Liang, Jin, Rong, and Tan, Tieniu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Comprehensive evaluation of Multimodal Large Language Models (MLLMs) has recently garnered widespread attention in the research community. However, we observe that existing benchmarks present several common barriers that make it difficult to measure the significant challenges that models face in the real world, including: 1) small data scale leads to a large performance variance; 2) reliance on model-based annotations results in restricted data quality; 3) insufficient task difficulty, especially caused by the limited image resolution. To tackle these issues, we introduce MME-RealWorld. Specifically, we collect more than $300$K images from public datasets and the Internet, filtering $13,366$ high-quality images for annotation. This involves the efforts of professional $25$ annotators and $7$ experts in MLLMs, contributing to $29,429$ question-answer pairs that cover $43$ subtasks across $5$ real-world scenarios, extremely challenging even for humans. As far as we know, MME-RealWorld is the largest manually annotated benchmark to date, featuring the highest resolution and a targeted focus on real-world applications. We further conduct a thorough evaluation involving $28$ prominent MLLMs, such as GPT-4o, Gemini 1.5 Pro, and Claude 3.5 Sonnet. Our results show that even the most advanced models struggle with our benchmarks, where none of them reach $60\%$ accuracy. The challenges of perceiving high-resolution images and understanding complex real-world scenarios remain urgent issues to be addressed. The data and evaluation code are released at https://mme-realworld.github.io/ ., Comment: Project Page: https://mme-realworld.github.io/
Published: 2024

48. A Survey of Embodied Learning for Object-Centric Robotic Manipulation

Author: Zheng, Ying, Yao, Lei, Su, Yuejiao, Zhang, Yi, Wang, Yi, Zhao, Sicheng, Zhang, Yiyi, and Chau, Lap-Pui
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Embodied learning for object-centric robotic manipulation is a rapidly developing and challenging area in embodied AI. It is crucial for advancing next-generation intelligent robots and has garnered significant interest recently. Unlike data-driven machine learning methods, embodied learning focuses on robot learning through physical interaction with the environment and perceptual feedback, making it especially suitable for robotic manipulation. In this paper, we provide a comprehensive survey of the latest advancements in this field and categorize the existing work into three main branches: 1) Embodied perceptual learning, which aims to predict object pose and affordance through various data representations; 2) Embodied policy learning, which focuses on generating optimal robotic decisions using methods such as reinforcement learning and imitation learning; 3) Embodied task-oriented learning, designed to optimize the robot's performance based on the characteristics of different tasks in object grasping and manipulation. In addition, we offer an overview and discussion of public datasets, evaluation metrics, representative applications, current challenges, and potential future research directions. A project associated with this survey has been established at https://github.com/RayYoh/OCRM_survey.
Published: 2024

49. GRANDlib: A simulation pipeline for the Giant Radio Array for Neutrino Detection (GRAND)

Author: GRAND Collaboration, Batista, Rafael Alves, Benoit-Lévy, Aurélien, Bister, Teresa, Bohacova, Martina, Bustamante, Mauricio, Carvalho, Washington, Chen, Yiren, Cheng, LingMei, Chiche, Simon, Colley, Jean-Marc, Correa, Pablo, Laurenciu, Nicoleta Cucu, Dai, Zigao, de Almeida, Rogerio M., de Errico, Beatriz, de Jong, Sijbrand, Neto, João R. T. de Mello, de Vries, Krijn D., Decoene, Valentin, Denton, Peter B., Duan, Bohao, Duan, Kaikai, Engel, Ralph, Erba, William, Fan, Yizhong, Ferrière, Arsène, Gou, QuanBu, Gu, Junhua, Guelfand, Marion, Guo, Jianhua, Guo, Yiqing, Guépin, Claire, Gülzow, Lukas, Haungs, Andreas, Havelka, Matej, He, Haoning, Hivon, Eric, Hu, Hongbo, Huang, Xiaoyuan, Huang, Yan, Huege, Tim, Jiang, Wen, Koirala, Ramesh, Kong, ChuiZheng, Kotera, Kumiko, Köhler, Jelena, Lago, Bruno L., Lai, Zhisen, Coz, Sandra Le, Legrand, François, Leisos, Antonios, Li, Rui, Li, Xingyu, Li, YiFei, Liu, Cheng, Liu, Ruoyu, Liu, Wei, Ma, Pengxiong, Macias, Oscar, Magnard, Frédéric, Marcowith, Alexandre, Martineau-Huynh, Olivier, McKinley, Thomas, Minodier, Paul, Mitra, Pragati, Mostafá, Miguel, Murase, Kohta, Niess, Valentin, Nonis, Stavros, Ogio, Shoichi, Oikonomou, Foteini, Pan, Hongwei, Papageorgiou, Konstantinos, Pierog, Tanguy, Piotrowski, Lech Wiktor, Prunet, Simon, Qian, Xiangli, Roth, Markus, Sako, Takashi, Schoorlemmer, Harm, Szálas-Motesiczky, Dániel, Sławiński, Szymon, Tian, Xishui, Timmermans, Anne, Timmermans, Charles, Tobiska, Petr, Tsirigotis, Apostolos, Tueros, Matías, Vittakis, George, Wang, Hanrui, Wang, Jiale, Wang, Shen, Wang, Xiangyu, Wang, Xu, Wei, Daming, Wei, Feng, Wu, Xiangping, Wu, Xuefeng, Xu, Xin, Xu, Xing, Yang, Fufu, Yang, Lili, Yang, Xuan, Yuan, Qiang, Zarka, Philippe, Zeng, Houdun, Zhang, Chao, Zhang, Jianli, Zhang, Kewen, Zhang, Pengfei, Zhang, Qingchi, Zhang, Songbo, Zhang, Yi, and Zhou, Hao
Subjects: Astrophysics - Instrumentation and Methods for Astrophysics, High Energy Physics - Experiment, High Energy Physics - Phenomenology
Abstract: The operation of upcoming ultra-high-energy cosmic-ray, gamma-ray, and neutrino radio-detection experiments, like the Giant Radio Array for Neutrino Detection (GRAND), poses significant computational challenges involving the production of numerous simulations of particle showers and their detection, and a high data throughput. GRANDlib is an open-source software tool designed to meet these challenges. Its primary goal is to perform end-to-end simulations of the detector operation, from the interaction of ultra-high-energy particles, through -- by interfacing with external air-shower simulations -- the ensuing particle shower development and its radio emission, to its detection by antenna arrays and its processing by data-acquisition systems. Additionally, GRANDlib manages the visualization, storage, and retrieval of experimental and simulated data. We present an overview of GRANDlib to serve as the basis of future GRAND analyses., Comment: 11 pages, 9 figures, plus appendices
Published: 2024

50. Cornwall-Jackiw-Tomboulis effective field theory to nonuniversal equation of state of an ultracold Bose gas

Author: Zhang, Yi and Liang, Zhaoxin
Subjects: Condensed Matter - Quantum Gases
Abstract: The equation of state (EOS) serves as a cornerstone in elucidating the properties of quantum many-body systems. A recent highlight along this research line consists of the derivation of the nonuniversal Lee-Huang-Yang (LHY) EOS for an ultracold quantum bosonic gas with finite-range interatomic interactions using one-loop effective path-integral field theory. The purpose of this work is to extend Salasnich's pioneering work to uncover beyond-LHY corrections to the EOS by employing the Cornwall-Jackiw-Tomboulis (CJT) effective field theory, leveraging its two-loop approximation. In this end, we expand Salasnich's remarkable findings of EOS to the next leading order characterized by $\left(\rho a_{\text{s}}^{3}\right)^{2}$, with $\rho$ and $a_{\text{s}}$ being the density and the $s$-wave scattering length. Notably, we derive analytical expressions for quantum depletion and chemical potential, representing the next-to-LHY corrections to nonuniversal EOS induced by finite-range effects. Moreover, we propose an experimental protocol of observing the nonuniversal next-to-LHY corrections to the EOS by calculating fractional frequency shifts in the breathing modes. The nonuniversal beyond-LHY EOS in this work paves the way of using LHY effects in quantum simulation experiments and for investigations beyond the LHY regime., Comment: 15 pages, 2 figures, accepted by Phys. Rev. A
Published: 2024
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Category

Publication Type

Journal

Region

Database

Publisher

100,283 results on '"Zhang Yi"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources