36,236 results on '"Zhou, Yu"'
Search Results
2. Innovation in a Science-Based Sector: The Institutional Evolution behind China's Emerging Biopharmaceutical Innovation Boom
- Author
-
Zhou, Yu and Coplin, Abigail E.
- Published
- 2022
3. Anchor Attention, Small Cache: Code Generation with Large Language Models
- Author
-
Zhang, Xiangyu, Zhou, Yu, Yang, Guang, Gall, Harald C., and Chen, Taolue
- Subjects
Computer Science - Software Engineering, 68N19, D.2.3
- Abstract
The development of large language models (LLMs) has revolutionized automated code generation. However, their high demand of computation resources has hindered a broader deployment and raised environmental concerns. A common strategy for diminishing computational demands is to cache Key-Value (KV) states from the attention mechanism which is adopted predominately by mainstream LLMs. It can mitigate the need of repeated attention computations, but brings significant memory overhead. Current practices in NLP often use sparse attention which may, unfortunately, lead to substantial inaccuracies, or hallucinations, in code generation tasks. In this paper, we analyze the attention weights distribution within code generation models via an empirical study, uncovering a sparsity pattern, i.e., the aggregation of information at specific anchor points. Based on this observation, we propose a novel approach, AnchorCoder, which features token-wise anchor attention designed to extract and compress the contextual information, and layer-wise anchor attention enabling cross-layer communication to mitigate the issue of excessive superposition caused by the compression. The extensive experiments across multiple benchmark datasets confirm the effectiveness of AnchorCoder, which can consistently achieve a significant (at least 70%) reduction in KV cache requirements, while preserving the majority of model's performance., Comment: 14 pages, 8 figures
- Published
- 2024
4. Memory Remedy: An AI-Enhanced Interactive Story Exploring Human-Robot Interaction and Companionship
- Author
-
Han, Lei, Zhou, Yu, Chen, Qiongyan, and Yip, David
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
We present our approach to using AI-generated content (AIGC) and multiple media to develop an immersive, game-based, interactive story experience. The narrative of the story, "Memory Remedy", unfolds through flashbacks, allowing the audience to gradually uncover the story and the complex relationship between the robot protagonist and the older adults. This exploration explores important themes such as the journey of life, the profound influence of memories, and the concept of post-human emotional care. By engaging with this AIGC-based interactive story, audiences are encouraged to reflect on the potential role of robotic companionship in the lives of older adults in the future; and to encourage deeper reflection on the complex relationship between artificial intelligence and humanity., Comment: The 17th International Symposium on Visual Information Communication and Interaction (VINCI 2024), December 11--13, 2024, Hsinchu, Taiwan
- Published
- 2024
5. Less is More: DocString Compression in Code Generation
- Author
-
Yang, Guang, Zhou, Yu, Cheng, Wei, Zhang, Xiangyu, Chen, Xiang, Zhuo, Terry Yue, Liu, Ke, Zhou, Xin, Lo, David, and Chen, Taolue
- Subjects
Computer Science - Software Engineering
- Abstract
The widespread use of Large Language Models (LLMs) in software engineering has intensified the need for improved model and resource efficiency. In particular, for neural code generation, LLMs are used to translate function/method signature and DocString to executable code. DocStrings, which capture user requirements for the code and are used as the prompt for LLMs, often contain redundant information. Recent advancements in prompt compression have shown promising results in Natural Language Processing (NLP), but their applicability to code generation remains uncertain. Our empirical study shows that the state-of-the-art prompt compression methods achieve only about 10% reduction, as further reductions would cause significant performance degradation. In our study, we propose a novel compression method, ShortenDoc, dedicated to DocString compression for code generation. Our extensive experiments on six code generation datasets, five open-source LLMs (1B to 10B parameters), and one closed-source LLM GPT-4o confirm that ShortenDoc achieves 25-40% compression while preserving the quality of generated code, outperforming other baseline methods at similar compression levels. The benefit of this research is improved efficiency and reduced cost while maintaining the quality of the generated code, especially when calling third-party APIs, with the token processing cost reduced by 25-40%., Comment: UNDER REVIEW
- Published
- 2024
6. Effective Action and Gravitational Pair Production in (A)dS Spacetime
- Author
-
Zhou, Yu and Zhang, Hai-Qing
- Subjects
General Relativity and Quantum Cosmology, High Energy Physics - Theory
- Abstract
We compute the effective action for a massive scalar field in (A)dS spacetime using the Euclidean heat kernel method. We highlight that in even-dimensional dS spacetimes, the effective action exhibits a non-trivial imaginary part, reminiscent of the Schwinger effect in quantum electrodynamics. We find consistency between the results obtained from the Euclidean heat kernel method with those from the Green's function approach in Lorentzian signature. Additionally, we compare our results with the perturbative calculations and find that the perturbation theory almost fails to capture the correct non-perturbative imaginary part of the effective action. This discrepancy presents a challenge to computing the gravitational pair production using the perturbation theory.
- Published
- 2024
7. SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning
- Author
-
Dai, Zhewei, Zeng, Shilei, Liu, Haotian, Li, Xurui, Xue, Feng, and Zhou, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Current segmentation methods require many training images and precise masks, while insufficient anomaly images hinder their application in industrial scenarios. To address such an issue, we explore producing diverse anomalies and accurate pixel-wise annotations. By observing the real production lines, we find that anomalies vary randomly in shape and appearance, whereas products hold globally consistent patterns with slight local variations. Such a characteristic inspires us to develop a Separation and Sharing Fine-tuning (SeaS) approach using only a few abnormal and some normal images. Firstly, we propose the Unbalanced Abnormal (UA) Text Prompt tailored to industrial anomaly generation, consisting of one product token and several anomaly tokens. Then, for anomaly images, we propose a Decoupled Anomaly Alignment (DA) loss to bind the attributes of the anomalies to different anomaly tokens. Re-blending such attributes may produce never-seen anomalies, achieving a high diversity of anomalies. For normal images, we propose a Normal-image Alignment (NA) loss to learn the products' key features that are used to synthesize products with both global consistency and local variations. The two training processes are separated but conducted on a shared U-Net. Finally, SeaS produces high-fidelity annotations for the generated anomalies by fusing discriminative features of U-Net and high-resolution VAE features. Extensive evaluations on the challenging MVTec AD and MVTec 3D AD dataset demonstrate the effectiveness of our approach. For anomaly image generation, we achieve 1.88 on IS and 0.34 on IC-LPIPS on MVTec AD dataset, 1.95 on IS and 0.30 on IC-LPIPS on MVTec 3D AD dataset. For downstream task, using our generated anomaly image-mask pairs, three common segmentation methods achieve an average 11.17% improvement on IoU on MVTec AD dataset, and a 15.49% enhancement in IoU on MVTec 3D AD dataset.
- Published
- 2024
8. Axion effects on gamma-ray spectral irregularities. II: EBL absorption models
- Author
-
Li, Hai-Jun, Chao, Wei, Tan, Xiu-Hui, and Zhou, Yu-Feng
- Subjects
High Energy Physics - Phenomenology, Astrophysics - High Energy Astrophysical Phenomena
- Abstract
In this study, we explore how the extragalactic background light (EBL) absorption effect influences the photon to axionlike particle (ALP) conversions from the very-high-energy gamma-ray spectral irregularities. For our analysis, we select two well-known BL Lac blazars: Markarian 421 and Markarian 501 with their low and well-defined redshifts $z_0=0.031$ and 0.034, respectively. Their gamma-ray data are recently measured by Fermi-LAT and HAWC with the 1038 days of exposure from 2015 June to 2018 July. We first discuss the EBL absorption effect on the gamma-ray spectral energy distributions by using three common EBL spectral models: Franceschini-08, Finke-10, and Gilmore-12. Then we consider the photon-ALP conversions in the astrophysical magnetic fields. Under the ALP assumption with the parameter space of $\{m_a, g_{a\gamma}\}$, we calculate the best-fit chi-square distribution of the EBL models and define a new delta chi-square $\chi_d^2$ to quantify the chi-square difference. Our results show that the impact from these different EBL spectral models are non-dominated at the low-redshift gamma-ray axionscope., Comment: 17 pages, 5 figures
- Published
- 2024
9. AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios
- Author
-
Huang, Ziming, Li, Xurui, Liu, Haotian, Xue, Feng, Wang, Yuzhe, and Zhou, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In the industrial scenario, anomaly detection could locate but cannot classify anomalies. To complete their capability, we study to automatically discover and recognize visual classes of industrial anomalies. In terms of multi-class anomaly classification, previous methods cluster anomalies represented by frozen pre-trained models but often fail due to poor discrimination. Novel class discovery (NCD) has the potential to tackle this. However, it struggles with non-prominent and semantically weak anomalies that challenge network learning focus. To address these, we introduce AnomalyNCD, a multi-class anomaly classification framework compatible with existing anomaly detection methods. This framework learns anomaly-specific features and classifies anomalies in a self-supervised manner. Initially, a technique called Main Element Binarization (MEBin) is first designed, which segments primary anomaly regions into masks to alleviate the impact of incorrect detections on learning. Subsequently, we employ mask-guided contrastive representation learning to improve feature discrimination, which focuses network attention on isolated anomalous regions and reduces the confusion of erroneous inputs through re-corrected pseudo labels. Finally, to enable flexible classification at both region and image levels during inference, we develop a region merging strategy that determines the overall image category based on the classified anomaly regions. Our method outperforms the state-of-the-art works on the MVTec AD and MTD datasets. Compared with the current methods, AnomalyNCD combined with zero-shot anomaly detection method achieves a 10.8% $F_1$ gain, 8.8% NMI gain, and 9.5% ARI gain on MVTec AD, 12.8% $F_1$ gain, 5.7% NMI gain, and 10.8% ARI gain on MTD. The source code is available at https://github.com/HUST-SLOW/AnomalyNCD.
- Published
- 2024
10. First Creating Backgrounds Then Rendering Texts: A New Paradigm for Visual Text Blending
- Author
-
Li, Zhenhang, Shu, Yan, Zeng, Weichao, Yang, Dongbao, and Zhou, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Diffusion models, known for their impressive image generation abilities, have played a pivotal role in the rise of visual text generation. Nevertheless, existing visual text generation methods often focus on generating entire images with text prompts, leading to imprecise control and limited practicality. A more promising direction is visual text blending, which focuses on seamlessly merging texts onto text-free backgrounds. However, existing visual text blending methods often struggle to generate high-fidelity and diverse images due to a shortage of backgrounds for synthesis and limited generalization capabilities. To overcome these challenges, we propose a new visual text blending paradigm including both creating backgrounds and rendering texts. Specifically, a background generator is developed to produce high-fidelity and text-free natural images. Moreover, a text renderer named GlyphOnly is designed for achieving visually plausible text-background integration. GlyphOnly, built on a Stable Diffusion framework, utilizes glyphs and backgrounds as conditions for accurate rendering and consistency control, and is equipped with an adaptive text block exploration strategy for small-scale text rendering. We also explore several downstream applications based on our method, including scene text dataset synthesis for boosting scene text detectors, as well as text image customization and editing. Code and model will be available at https://github.com/Zhenhang-Li/GlyphOnly., Comment: Accepted to ECAI2024
- Published
- 2024
11. TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control
- Author
-
Zeng, Weichao, Shu, Yan, Li, Zhenhang, Yang, Dongbao, and Zhou, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Centred on content modification and style preservation, Scene Text Editing (STE) remains a challenging task despite considerable progress in text-to-image synthesis and text-driven image manipulation recently. GAN-based STE methods generally encounter a common issue of model generalization, while Diffusion-based STE methods suffer from undesired style deviations. To address these problems, we propose TextCtrl, a diffusion-based method that edits text with prior guidance control. Our method consists of two key components: (i) By constructing fine-grained text style disentanglement and robust text glyph structure representation, TextCtrl explicitly incorporates Style-Structure guidance into model design and network training, significantly improving text style consistency and rendering accuracy. (ii) To further leverage the style prior, a Glyph-adaptive Mutual Self-attention mechanism is proposed which deconstructs the implicit fine-grained features of the source image to enhance style consistency and vision quality during inference. Furthermore, to fill the vacancy of the real-world STE evaluation benchmark, we create the first real-world image-pair dataset termed ScenePair for fair comparisons. Experiments demonstrate the effectiveness of TextCtrl compared with previous methods concerning both style fidelity and text accuracy.
- Published
- 2024
12. Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
- Author
-
Li, Manling, Zhao, Shiyu, Wang, Qineng, Wang, Kangrui, Zhou, Yu, Srivastava, Sanjana, Gokmen, Cem, Lee, Tony, Li, Li Erran, Zhang, Ruohan, Liu, Weiyu, Liang, Percy, Fei-Fei, Li, Mao, Jiayuan, and Wu, Jiajun
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Robotics
- Abstract
We aim to evaluate Large Language Models (LLMs) for embodied decision making. While a significant body of work has been leveraging LLMs for decision making in embodied environments, we still lack a systematic understanding of their performance because they are usually applied in different domains, for different purposes, and built based on different inputs and outputs. Furthermore, existing evaluations tend to rely solely on a final success rate, making it difficult to pinpoint what ability is missing in LLMs and where the problem lies, which in turn blocks embodied agents from leveraging LLMs effectively and selectively. To address these limitations, we propose a generalized interface (Embodied Agent Interface) that supports the formalization of various types of tasks and input-output specifications of LLM-based modules. Specifically, it allows us to unify 1) a broad set of embodied decision-making tasks involving both state and temporally extended goals, 2) four commonly-used LLM-based modules for decision making: goal interpretation, subgoal decomposition, action sequencing, and transition modeling, and 3) a collection of fine-grained metrics which break down evaluation into various types of errors, such as hallucination errors, affordance errors, various types of planning errors, etc. Overall, our benchmark offers a comprehensive assessment of LLMs' performance for different subtasks, pinpointing the strengths and weaknesses in LLM-powered embodied AI systems, and providing insights for effective and selective use of LLMs in embodied decision making., Comment: Accepted for oral presentation at NeurIPS 2024 in the Datasets and Benchmarks track. Camera-ready version
- Published
- 2024
13. Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study of Chinese Criminal Law
- Author
-
Chen, Yongming, Chen, Miner, Zhu, Ye, Pei, Juan, Chen, Siyu, Zhou, Yu, Wang, Yi, Zhou, Yifan, Li, Hao, and Zhang, Songan
- Subjects
Computer Science - Information Retrieval, Computer Science - Artificial Intelligence
- Abstract
Court efficiency is vital for social stability. However, in most countries around the world, the grassroots courts face case backlogs, with decisions relying heavily on judicial personnel's cognitive labor, lacking intelligent tools to improve efficiency. To address this issue, we propose an efficient law article recommendation approach utilizing a Knowledge Graph (KG) and a Large Language Model (LLM). Firstly, we propose a Case-Enhanced Law Article Knowledge Graph (CLAKG) as a database to store current law statutes, historical case information, and correspondence between law articles and historical cases. Additionally, we introduce an automated CLAKG construction method based on LLM. On this basis, we propose a closed-loop law article recommendation method. Finally, through a series of experiments using judgment documents from the website "China Judgements Online", we have improved the accuracy of law article recommendation in cases from 0.549 to 0.694, demonstrating that our proposed method significantly outperforms baseline approaches.
- Published
- 2024
14. HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
- Author
-
Zhou, Yu, Wu, Xingyu, Wu, Jibin, Feng, Liang, and Tan, Kay Chen
- Subjects
Computer Science - Machine Learning
- Abstract
Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability. It has gained popularity in large pretrained model development due to its ability to bypass the need for original training data and further training processes. However, most existing model merging approaches focus solely on exploring the parameter space, merging models with identical architectures. Merging within the architecture space, despite its potential, remains in its early stages due to the vast search space and the challenges of layer compatibility. This paper marks a significant advance toward more flexible and comprehensive model merging techniques by modeling the architecture-space merging process as a reinforcement learning task. We train policy and value networks using offline sampling of weight vectors, which are then employed for the online optimization of merging strategies. Moreover, a multi-objective optimization paradigm is introduced to accommodate users' diverse task preferences, learning the Pareto front of optimal models to offer customized merging suggestions. Experimental results across multiple tasks, including text translation, mathematical reasoning, and code generation, validate the effectiveness and superiority of the proposed framework in model merging. The code will be made publicly available after the review process.
- Published
- 2024
15. A Complete Landscape of EFX Allocations of Mixed Manna on Graphs
- Author
-
Zhou, Yu, Wei, Tianze, Li, Minming, and Li, Bo
- Subjects
Computer Science - Computer Science and Game Theory
- Abstract
We study envy-free up to any item (EFX) allocations on graphs where vertices and edges represent agents and items respectively. An agent is only interested in items that are incident to her and all other items have zero marginal values to her. Christodoulou et al. [EC, 2023] first proposed this setting and studied the case of goods. We extend this setting to the case of mixed manna where an item may be liked or disliked by its endpoint agents. In our problem, an agent has an arbitrary valuation over her incident items such that the items she likes have non-negative marginal values to her and those she dislikes have non-positive marginal values. We provide a complete study of the four notions of EFX for mixed manna in the literature, which differ by whether the removed item can have zero marginal value. We prove that an allocation that satisfies the notion of EFX where the virtually-removed item could always have zero marginal value may not exist and determining its existence is NP-complete, while one that satisfies any of the other three notions always exists and can be computed in polynomial time. We also prove that an orientation (i.e., a special allocation where each edge must be allocated to one of its endpoint agents) that satisfies any of the four notions may not exist, and determining its existence is NP-complete., Comment: Accepted in IJCAI 2024
- Published
- 2024
16. Seat Selection as a Function of Cultural and Individual Differences: Insights from Undergraduate Students in China
- Author
-
Lu Kehan, Amrita Kaur, Zhou Yu, He Yuzhen, Huang Yuchong, Zhan Yinuo, and Mohammad Noman
- Abstract
Students' seating selection is a significant physical variable that has implications for both teachers and students. These seating preferences have been linked to students' personalities, motivation, and academic performance. However, there is limited knowledge regarding the cultural influences on these preferences. In this exploratory qualitative study, we aim to investigate the cultural factors that influence the seating choices of undergraduate students. The study participants were recruited using purposive sampling. Face-to-face interviews and scenario simulation surveys were utilized to collect data, which was analyzed using thematic analysis. The study's findings suggest that seating preferences are largely a function of individual differences and personal preferences, which often stem from personal and cultural factors. These factors are discussed under five primary themes: course academic value, gaining positive experiences, avoiding negative experiences, modesty and humility, and social belonging. These findings have implications for teaching and learning and for instructors, especially those from foreign cultures.
- Published
- 2024
17. Study of the relativistic charged particle beam propagation in Earth's magnetic field
- Author
-
Fang, Meihua, Liang, Zheng, Gong, Yingkui, Chen, Jianfei, Zhu, Guiping, Liu, Ting, Tian, Yu, and Zhou, Yu
- Subjects
Physics - Space Physics, Astrophysics - Instrumentation and Methods for Astrophysics, High Energy Physics - Phenomenology
- Abstract
Relativistic charged particle beam can be used as destructive beam weapons in space for debris removal tasks. The trajectories of charged particles are affected by both electric and magnetic forces in the Earth's magnetic field. In this paper, we firstly analyzed the correlation parameters of the charged particle beam as a weapon when it propagated in the geomagnetic field. Then the models were constructed based on COMSOL Multiphysics and the IGRF model was adopted in the simulation. The gyro-radius and the related uncertainty were analyzed by simulation of the charged particle transport in the geomagnetic field at different altitudes. The charged beam spot radius divergency was also simulated. The magnetic field pinch effect can be found and can limit the beam spreading., Comment: 10 pages, 7 figures
- Published
- 2024
18. Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper
- Author
-
Xu, Tianyi, Huang, Kaixun, Guo, Pengcheng, Zhou, Yu, Huang, Longtao, Xue, Hui, and Xie, Lei
- Subjects
Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Pre-trained multilingual speech foundation models, like Whisper, have shown impressive performance across different languages. However, adapting these models to new or specific languages is computationally extensive and faces catastrophic forgetting problems. Addressing these issues, our study investigates strategies to enhance the model on new languages in the absence of original training data, while also preserving the established performance on the original languages. Specifically, we first compare various LoRA-based methods to find out their vulnerability to forgetting. To mitigate this issue, we propose to leverage the LoRA parameters from the original model for approximate orthogonal gradient descent on the new samples. Additionally, we also introduce a learnable rank coefficient to allocate trainable parameters for more efficient training. Our experiments with a Chinese Whisper model (for Uyghur and Tibetan) yield better results with a more compact parameter set.
- Published
- 2024
19. ARMADA: Attribute-Based Multimodal Data Augmentation
- Author
-
Jin, Xiaomeng, Kim, Jeonghwan, Zhou, Yu, Huang, Kuan-Hao, Wu, Te-Lin, Peng, Nanyun, and Ji, Heng
- Subjects
Computer Science - Artificial Intelligence
- Abstract
In Multimodal Language Models (MLMs), the cost of manually annotating high-quality image-text pair data for fine-tuning and alignment is extremely high. While existing multimodal data augmentation frameworks propose ways to augment image-text pairs, they either suffer from semantic inconsistency between texts and images, or generate unrealistic images, causing knowledge gap with real world examples. To address these issues, we propose Attribute-based Multimodal Data Augmentation (ARMADA), a novel multimodal data augmentation method via knowledge-guided manipulation of visual attributes of the mentioned entities. Specifically, we extract entities and their visual attributes from the original text data, then search for alternative values for the visual attributes under the guidance of knowledge bases (KBs) and large language models (LLMs). We then utilize an image-editing model to edit the images with the extracted attributes. ARMADA is a novel multimodal data generation framework that: (i) extracts knowledge-grounded attributes from symbolic KBs for semantically consistent yet distinctive image-text pair generation, (ii) generates visually similar images of disparate categories using neighboring entities in the KB hierarchy, and (iii) uses the commonsense knowledge of LLMs to modulate auxiliary visual attributes such as backgrounds for more robust representation of original entities. Our empirical results over four downstream tasks demonstrate the efficacy of our framework to produce high-quality data and enhance the model performance. This also highlights the need to leverage external knowledge proxies for enhanced interpretability and real-world grounding.
- Published
- 2024
20. Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval
- Author
-
Zeng, Gangyan, Zhang, Yuan, Wei, Jin, Yang, Dongbao, Zhang, Peng, Gao, Yiwen, Qin, Xugong, and Zhou, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Scene text retrieval aims to find all images containing the query text from an image gallery. Current efforts tend to adopt an Optical Character Recognition (OCR) pipeline, which requires complicated text detection and/or recognition processes, resulting in inefficient and inflexible retrieval. Different from them, in this work we propose to explore the intrinsic potential of Contrastive Language-Image Pre-training (CLIP) for OCR-free scene text retrieval. Through empirical analysis, we observe that the main challenges of CLIP as a text retriever are: 1) limited text perceptual scale, and 2) entangled visual-semantic concepts. To this end, a novel model termed FDP (Focus, Distinguish, and Prompt) is developed. FDP first focuses on scene text via shifting the attention to the text area and probing the hidden text knowledge, and then divides the query text into content word and function word for processing, in which a semantic-aware prompting scheme and a distracted queries assistance module are utilized. Extensive experiments show that FDP significantly enhances the inference speed while achieving better or competitive retrieval accuracy compared to existing methods. Notably, on the IIIT-STR benchmark, FDP surpasses the state-of-the-art model by 4.37% with a 4 times faster speed. Furthermore, additional experiments under phrase-level and attribute-aware scene text retrieval settings validate FDP's particular advantages in handling diverse forms of query text. The source code will be publicly available at https://github.com/Gyann-z/FDP., Comment: Accepted by ACM MM 2024
- Published
- 2024
21. Mass mixing between QCD axions
- Author
-
Li, Hai-Jun and Zhou, Yu-Feng
- Subjects
High Energy Physics - Phenomenology, Astrophysics - Cosmology and Nongalactic Astrophysics
- Abstract
We introduce a novel level crossing in the mass mixing between two QCD axions, one canonical QCD axion and one $Z_{\mathcal N}$ QCD axion. The level crossing can take place at the QCD phase transition critical temperature or slightly before it, depending on the ratio of the axion decay constants $\sim1.69$. The cosmological evolution of the mass eigenvalues in these two cases is similar, however, the transition of axion energy density is completely different. Finally, we estimate the relic density of the QCD axion dark matter. This level crossing may also have some cosmological implications., Comment: 6 pages, 4 figures
- Published
- 2024
22. Quantum optical coherence theory based on Feynman's path integral
- Author
-
Liu, Jianbin, Zhou, Yu, Chen, Hui, Zheng, Huaibin, He, Yuchen, Li, Fuli, and Xu, Zhuo
- Subjects
Quantum Physics
- Abstract
Compared to classical optical coherence theory based on Maxwell's electromagnetic theory and Glauber's quantum optical coherence theory based on matrix mechanics formulation of quantum mechanics, quantum optical coherence theory based on Feynman's path integral formulation of quantum mechanics provides a novel tool to study optical coherence. It has the advantage of understanding the connection between mathematical calculations and physical interpretations better. Quantum optical coherence theory based on Feynman's path integral is introduced and reviewed in this paper. Based on the results of transient first-order interference of two independent light beams, it is predicted that the classical model for electric field of thermal light introduced by classical optical textbooks may not be accurate. The physics of two-photon bunching of thermal light and Hong-Ou-Mandel dip of entangled photon pairs is the same, which can be interpreted by constructive and destructive two-photon interference, respectively. Quantum optical coherence theory based on Feynman's path integral is helpful to understand the coherence properties of light, which may eventually lead us to the answer of the question: what is a photon?, Comment: 40 pages, 35 figures
- Published
- 2024
23. Unlocking Discovery Potential for Decaying Dark Matter and Faint X-ray Sources with XRISM
- Author
-
Zhou, Yu, Takhistov, Volodymyr, and Mitsuda, Kazuhisa
- Subjects
Astrophysics - High Energy Astrophysical Phenomena ,Astrophysics - Cosmology and Nongalactic Astrophysics ,Astrophysics - Astrophysics of Galaxies ,High Energy Physics - Phenomenology - Abstract
Astrophysical emission lines arising from particle decays can offer unique insights into the nature of dark matter (DM). Using dedicated simulations with background and foreground modeling, we comprehensively demonstrate that the recently launched XRISM space telescope with powerful X-ray spectroscopy capabilities is particularly well-suited to probe decaying DM, such as sterile neutrinos and axion-like particles, in the mass range of a few to tens of keV. We analyze and map XRISM's DM discovery potential parameter space by considering the Milky Way Galactic DM halo, including establishing an optimal line-of-sight search, as well as dwarf galaxies where we identify Segue 1 as a remarkably promising target. We demonstrate that with only 100 ks exposure the XRISM/Resolve instrument is capable of probing the underexplored DM parameter window around a few keV and testing DM couplings with sensitivity that exceeds existing Segue 1 limits by two orders of magnitude. Further, we demonstrate that the XRISM/Xtend instrument sensitivity enables discovery of the nature of faint astrophysical X-ray sources, especially in Segue 1, which could shed light on star-formation history. We discuss implications for decaying DM searches with improved detector energy resolution in future experiments., Comment: 12 pages, 9 figures
- Published
- 2024
24. DeCE: Deceptive Cross-Entropy Loss Designed for Defending Backdoor Attacks
- Author
-
Yang, Guang, Zhou, Yu, Chen, Xiang, Zhang, Xiangyu, Zhuo, Terry Yue, Lo, David, and Chen, Taolue
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Software Engineering - Abstract
Code Language Models (CLMs), particularly those leveraging deep learning, have achieved significant success in the code intelligence domain. However, the issue of security, particularly backdoor attacks, is often overlooked in this process. Previous research has focused on designing backdoor attacks for CLMs, but effective defenses have not been adequately addressed. In particular, existing defense methods from natural language processing, when directly applied to CLMs, are not effective enough and lack generality, working well in some models and scenarios but failing in others, thus falling short of consistently mitigating backdoor attacks. To bridge this gap, we first confirm the phenomenon of "early learning" as a general occurrence during the training of CLMs. In this phenomenon, a model initially focuses on the main features of training data but may become more sensitive to backdoor triggers over time, leading to overfitting and susceptibility to backdoor attacks. Our analysis then attributes overfitting to backdoor triggers to the use of the cross-entropy loss function, whose unboundedness leads the model to increasingly concentrate on the features of the poisoned data. Based on this insight, we propose a general and effective loss function DeCE (Deceptive Cross-Entropy) by blending deceptive distributions and applying label smoothing to keep the gradient bounded, which prevents the model from overfitting to backdoor triggers and thereby enhances the security of CLMs against backdoor attacks. To verify the effectiveness of our defense method, we select code synthesis tasks as our experimental scenarios. Our experiments across various code synthesis datasets, models, and poisoning ratios demonstrate the applicability and effectiveness of DeCE in enhancing the security of CLMs., Comment: Under Review; Waiting for updates
- Published
- 2024
25. Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition
- Author
-
Wu, Daiqing, Yang, Dongbao, Shen, Huawen, Ma, Can, and Zhou, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Computation and Language ,Computer Science - Multimedia ,Computer Science - Social and Information Networks - Abstract
With the proliferation of social media posts in recent years, the need to detect sentiments in multimodal (image-text) content has grown rapidly. Since posts are user-generated, the image and text from the same post can express different or even contradictory sentiments, leading to potential sentiment discrepancy. However, existing works mainly adopt a single-branch fusion structure that primarily captures the consistent sentiment between image and text. Ignoring discrepant sentiment, or modeling it only implicitly, results in compromised unimodal encoding and limited performance. In this paper, we propose a semantics Completion and Decomposition (CoDe) network to resolve the above issue. In the semantics completion module, we complement image and text representations with the semantics of the OCR text embedded in the image, helping bridge the sentiment gap. In the semantics decomposition module, we decompose image and text representations with exclusive projection and contrastive learning, thereby explicitly capturing the discrepant sentiment between modalities. Finally, we fuse image and text representations by cross-attention and combine them with the learned discrepant sentiment for final classification. Extensive experiments conducted on four multimodal sentiment datasets demonstrate the superiority of CoDe against SOTA methods., Comment: 8 pages, 6 figures
- Published
- 2024
26. Quantum fluctuation on the worldsheet of probe string in BTZ black hole
- Author
-
Zhou, Yu-Ting and Kuang, Xiao-Mei
- Subjects
High Energy Physics - Theory - Abstract
In this paper, we investigate the second-order normal quantum fluctuation on the worldsheet of a probe string in the Bañados-Teitelboim-Zanelli (BTZ) black hole. These fluctuations are treated as the projection of Hawking radiation on the worldsheet and indeed modify the action growth of the string. Then in the string field theory/boundary conformal field theory framework, via the boundary vertex operator we study the correlation function of the Schrödinger functional of excited fields on the worldsheet and further extract the field's formula. Our study could shed light on the potential connection between complexity growth and correlation function., Comment: 18 pages, 2 figures
- Published
- 2024
27. CodeScore-R: An Automated Robustness Metric for Assessing the Functional Correctness of Code Synthesis
- Author
-
Yang, Guang, Zhou, Yu, Chen, Xiang, and Zhang, Xiangyu
- Subjects
Computer Science - Software Engineering - Abstract
Evaluation metrics are crucial in the field of code synthesis. Commonly used code evaluation metrics can be classified into three types: match-based, semantic-based, and execution-based. Among them, the execution-based Pass@k metric accurately assesses the functionality of predicted code by executing test cases. However, calculating this metric requires a significant amount of overhead, necessitating the design of an automated evaluation metric that can assess the functionality of predicted code without the need for test cases. Additionally, a good evaluation metric should be robust, that is the metric can maintain its accuracy even when the predicted code undergoes minor changes. To address these challenges, we propose an automated robust metric, called CodeScore-R, based on UniXcoder and contrastive learning, for evaluating the functionality of code synthesis. CodeScore-R employs techniques such as sketch-based processing, syntactic-equivalent transformations, and mutation testing to effectively mitigate the interference caused by identifiers, syntax structures, and operators on evaluation results. Experimental results demonstrate that in the tasks of code generation and migration in Java and Python, CodeScore-R outperforms other evaluation metrics and is more closely aligned with the Pass@k metric, while exhibiting stronger robustness., Comment: in Chinese language, Journal of Computer Research and Development
- Published
- 2024
28. Revolutionizing Wireless Networks with Self-Supervised Learning: A Pathway to Intelligent Communications
- Author
-
Yang, Zhixiang, Du, Hongyang, Niyato, Dusit, Wang, Xudong, Zhou, Yu, Feng, Lei, Zhou, Fanqin, Li, Wenjing, and Qiu, Xuesong
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
With the rapid proliferation of mobile devices and data, next-generation wireless communication systems face stringent requirements for ultra-low latency, ultra-high reliability, and massive connectivity. Traditional AI-driven wireless network designs, while promising, often suffer from limitations such as dependency on labeled data and poor generalization. To address these challenges, we present an integration of self-supervised learning (SSL) into wireless networks. SSL leverages large volumes of unlabeled data to train models, enhancing scalability, adaptability, and generalization. This paper offers a comprehensive overview of SSL, categorizing its application scenarios in wireless network optimization and presenting a case study on its impact on semantic communication. Our findings highlight the potential of SSL to significantly improve wireless network performance without extensive labeled data, paving the way for more intelligent and efficient communication systems.
- Published
- 2024
29. A DAFT Based Unified Waveform Design Framework for High-Mobility Communications
- Author
-
Zhang, Xingyao, Yin, Haoran, Tang, Yanqun, Zhou, Yu, Liu, Yuqing, Du, Jinming, and Ding, Yipeng
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
With the increasing demand for multi-carrier communication in high-mobility scenarios, it is urgent to design new multi-carrier communication waveforms that can resist large delay-Doppler spreads. Various multi-carrier waveforms in the transform domain were proposed for the fast time-varying channels, including orthogonal time frequency space (OTFS), orthogonal chirp division multiplexing (OCDM), and affine frequency division multiplexing (AFDM). Among these, the AFDM is a strong candidate for its low implementation complexity and ability to achieve optimal diversity. This paper unifies the waveforms based on the discrete affine Fourier transform (DAFT) by using the chirp slope factor "k" in the time-frequency representation to construct a unified design framework for high-mobility communications. The design framework is employed to verify that the bit error rate performance of the DAFT-based waveform can be enhanced when the signal-to-noise ratio (SNR) is sufficiently high by adjusting the chirp slope factor "k".
- Published
- 2024
30. Self-Modifying State Modeling for Simultaneous Machine Translation
- Author
-
Yu, Donglei, Kang, Xiaomian, Liu, Yuchen, Zhou, Yu, and Zong, Chengqing
- Subjects
Computer Science - Computation and Language - Abstract
Simultaneous Machine Translation (SiMT) generates target outputs while receiving streaming source inputs and requires a read/write policy to decide whether to wait for the next source token or generate a new target token, whose decisions form a decision path. Existing SiMT methods, which learn the policy by exploring various decision paths in training, face inherent limitations. These methods not only fail to precisely optimize the policy due to the inability to accurately assess the individual impact of each decision on SiMT performance, but also cannot sufficiently explore all potential paths because of their vast number. Besides, building decision paths requires unidirectional encoders to simulate streaming source inputs, which impairs the translation quality of SiMT models. To solve these issues, we propose Self-Modifying State Modeling (SM$^2$), a novel training paradigm for the SiMT task. Without building decision paths, SM$^2$ individually optimizes decisions at each state during training. To precisely optimize the policy, SM$^2$ introduces a Self-Modifying process to independently assess and adjust decisions at each state. For sufficient exploration, SM$^2$ proposes Prefix Sampling to efficiently traverse all potential states. Moreover, SM$^2$ ensures compatibility with bidirectional encoders, thus achieving higher translation quality. Experiments show that SM$^2$ outperforms strong baselines. Furthermore, SM$^2$ allows offline machine translation models to acquire SiMT ability with fine-tuning., Comment: Accepted to ACL 2024 main conference. 15 pages, 13 figures, 9 tables
- Published
- 2024
31. Upper limit on the axion-photon coupling from Markarian 421
- Author
-
Li, Hai-Jun, Chao, Wei, and Zhou, Yu-Feng
- Subjects
High Energy Physics - Phenomenology ,Astrophysics - High Energy Astrophysical Phenomena - Abstract
Markarian 421 is a well-known nearby BL Lac blazar at the redshift $z=0.031$. Many previous works have constrained the axion-photon coupling using its TeV gamma-ray observations, showing the upper limit on the coupling constant $g_{a\gamma} \lesssim 2.0\times 10^{-11} \rm \, GeV^{-1}$ for the axion mass $[5.0\times10^{-10} \, {\rm eV} \lesssim m_a \lesssim 5.0\times10^{-7} \, {\rm eV}]$. In this work, we obtain a more stringent upper limit on the axion-photon coupling from 1038 days of gamma-ray observations of the blazar Markarian 421. The long-term gamma-ray spectra were measured by the Large Area Telescope on board NASA's Fermi Gamma-ray Space Telescope (Fermi-LAT) and the High Altitude Water Cherenkov (HAWC) Gamma-Ray Observatory collaborations from 2015 June to 2018 July. We show the best-fit spectral energy distributions (SEDs) of Markarian 421 under the null and axion hypotheses. Then we set the axion-photon limit in the $\{m_a, \, g_{a\gamma}\}$ plane. The 99% $\rm C.L.$ upper limit set by Markarian 421 is $g_{a\gamma} \lesssim 4.0\times 10^{-12} \rm \, GeV^{-1}$ for the axion mass $[1.0\times10^{-9} \, {\rm eV} \lesssim m_a \lesssim 1.0\times10^{-8} \, {\rm eV}]$. It is the most stringent upper limit in this axion mass region., Comment: 11 pages, 4 figures. Published in PLB
- Published
- 2024
32. Development of the Low Frequency Telescope focal plane detector arrays for LiteBIRD
- Author
-
Ghigna, Tommaso, Suzuki, Aritoki, Westbrook, Benjamin, Raum, Christopher, Akamatsu, Hiroki, Beckman, Shawn, Farias, Nicole, de Haan, Tijmen, Halverson, Nils, Hazumi, Masashi, Hubmayr, Johannes, Jaehnig, Greg, Lee, Adrian T., Stever, Samantha L., and Zhou, Yu
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,Astrophysics - Cosmology and Nongalactic Astrophysics ,Physics - Instrumentation and Detectors - Abstract
LiteBIRD, a forthcoming JAXA mission, aims to accurately study the microwave sky within the 40-400 GHz frequency range divided into 15 distinct nominal bands. The primary objective is to constrain the CMB inflationary signal, specifically the primordial B-modes. LiteBIRD targets the CMB B-mode signal on large angular scales, where the primordial inflationary signal is expected to dominate, with the goal of reaching a tensor-to-scalar ratio sensitivity of $\sigma_r\sim0.001$. LiteBIRD frequency bands will be split among three telescopes, with some overlap between telescopes for better control of systematic effects. Here we report on the development status of the detector arrays for the Low Frequency Telescope (LFT), which spans the 34-161 GHz range, with 12 bands subdivided between four types of trichroic pixels consisting of lenslet-coupled sinuous antennas. The signal from the antenna is bandpass filtered and sensed by AlMn Transition-Edge Sensors (TES). We provide an update on the status of the design and development of LiteBIRD's LFT LF1 (40-60-78 GHz), LF2 (50-68-89 GHz) pixels. We discuss design choices motivated by LiteBIRD scientific goals. In particular we focus on the details of the optimization of the design parameters of the sinuous antenna, on-chip bandpass filters, cross-under and impedance transformers and all the RF components that define the LF1 and LF2 pixel detection chain. We present this work in the context of the technical challenges and physical constraints imposed by the finite size of the instrument., Comment: 12 pages, 10 figures, 1 table, SPIE 2024
- Published
- 2024
33. Intermediate-mass-ratio inspirals with general dynamical friction in dark matter minispikes
- Author
-
Zhou, Yu-Chen, Jin, Hong-Bo, Qiao, Cong-Feng, and Wu, Yue-Liang
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
The intermediate-mass-ratio inspirals (IMRIs) may be surrounded by dark matter (DM) minispikes. The dynamical friction from these DM minispike structures can affect the dynamics and the gravitational wave (GW) emission of the IMRIs. We analyze the effects of general dynamical friction, with a particular contribution from DM particles moving faster than the stellar-mass black hole in an eccentric IMRI. The results show that the dynamical friction caused by these DM particles tends to eccentricify the orbit, and the general dynamical friction is able to increase the eccentricity. We also analyze the effects of general dynamical friction on the GW characteristic strain. The results indicate that the peak value of the characteristic strain occurs at higher frequencies as the power law index of DM minispike $\gamma_\mathrm{sp}$ increases. For the first time, a general analytical relation between the frequency peak value of characteristic strain of GWs and $\gamma_\mathrm{sp}$ is established. Using the analytical relation, the presence of DM and its halo density may be determined potentially from future GW data., Comment: 7 pages, 4 figures, submitted to PRX
- Published
- 2024
34. Implicit Neural Image Field for Biological Microscopy Image Compression
- Author
-
Dai, Gaole, Tseng, Cheng-Ching, Wuwu, Qingpo, Zhang, Rongyu, Wang, Shaokang, Lu, Ming, Huang, Tiejun, Zhou, Yu, Tuz, Ali Ata, Gunzer, Matthias, Chen, Jianxu, and Zhang, Shanghang
- Subjects
Computer Science - Artificial Intelligence - Abstract
The rapid pace of innovation in biological microscopy imaging has led to large images, putting pressure on data storage and impeding efficient sharing, management, and visualization. This necessitates the development of efficient compression solutions. Traditional CODEC methods struggle to adapt to the diverse bioimaging data and often suffer from sub-optimal compression. In this study, we propose an adaptive compression workflow based on Implicit Neural Representation (INR). This approach permits application-specific compression objectives and supports compression of images of any shape as well as arbitrary pixel-wise decompression. We demonstrated on a wide range of microscopy images from real applications that our workflow not only achieved high, controllable compression ratios (e.g., 512x) but also preserved detailed information critical for downstream analysis.
- Published
- 2024
35. A Method of Measuring TES Complex ETF Response in Frequency-domain Multiplexed Readout by Single Sideband Power Modulation
- Author
-
Zhou, Yu, de Haan, Tijmen, Akamatsu, Hiroki, Kaneko, Daisuke, Hazumi, Masashi, Hasegawa, Masaya, Suzuki, Aritoki, and Lee, Adrian T.
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,Physics - Instrumentation and Detectors - Abstract
The digital frequency domain multiplexing (DfMux) technique is widely used for astrophysical instruments with large detector arrays. Detailed detector characterization is required for instrument calibration and systematics control. We conduct the TES complex electrothermal-feedback (ETF) response measurement with the DfMux readout system as follows. By injecting a single sideband signal, we induce modulation in TES power dissipation over a frequency range encompassing the detector response. The modulated current signal induced by TES heating effect is measured, allowing for the ETF response characterization of the detector. With the injection of an upper sideband, the TES readout current shows both an upper and a lower sideband. We model the upper and lower sideband complex ETF response and verify the model by fitting to experimental data. The model not only can fit for certain physical parameters of the detector, such as loop gain, temperature sensitivity, current sensitivity, and time constant, but also enables us to estimate the systematic effect introduced by the multiplexed readout. The method is therefore useful for in-situ detector calibration and for estimating systematic effects during astronomical telescope observations, such as those performed by the upcoming LiteBIRD satellite., Comment: 9 pages, 4 figures, accepted to Journal of Low Temperature Physics
- Published
- 2024
36. Towards Cross-Scale Attention and Surface Supervision for Fractured Bone Segmentation in CT
- Author
-
Zhou, Yu, Zou, Xiahao, and Wang, Yi
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Bone segmentation is an essential step for the preoperative planning of fracture trauma surgery. The automated segmentation of fractured bone from computed tomography (CT) scans remains challenging, due to the large differences of fractures in position and morphology, and also the inherent anatomical characteristics of different bone structures. To alleviate these issues, we propose a cross-scale attention mechanism as well as a surface supervision strategy for fractured bone segmentation in CT. Specifically, a cross-scale attention mechanism is introduced to effectively aggregate the features among different scales to provide more powerful fracture representation. Moreover, a surface supervision strategy is employed, which explicitly constrains the network to pay more attention to the bone boundary. The efficacy of the proposed method is evaluated on a public dataset containing CT scans with hip fractures. The evaluation metrics are Dice similarity coefficient (DSC), average symmetric surface distance (ASSD), and Hausdorff distance (95HD). The proposed method achieves an average DSC of 93.36%, an ASSD of 0.85 mm, and a 95HD of 7.51 mm. Our method offers an effective fracture segmentation approach for pelvic CT examinations, and has the potential to be used for improving the segmentation performance of other types of fractures.
- Published
- 2024
37. CausalBench: A Comprehensive Benchmark for Causal Learning Capability of LLMs
- Author
-
Zhou, Yu, Wu, Xingyu, Huang, Beicheng, Wu, Jibin, Feng, Liang, and Tan, Kay Chen
- Subjects
Computer Science - Machine Learning - Abstract
The ability to understand causality significantly impacts the competence of large language models (LLMs) in output explanation and counterfactual reasoning, as causality reveals the underlying data distribution. However, the lack of a comprehensive benchmark currently limits the evaluation of LLMs' causal learning capabilities. To fill this gap, this paper develops CausalBench based on data from the causal research community, enabling comparative evaluations of LLMs against traditional causal learning algorithms. To provide a comprehensive investigation, we offer three tasks of varying difficulties, including correlation, causal skeleton, and causality identification. Evaluations of 19 leading LLMs reveal that, while closed-source LLMs show potential for simple causal relationships, they significantly lag behind traditional algorithms on larger-scale networks ($>50$ nodes). Specifically, LLMs struggle with collider structures but excel at chain structures, especially at long-chain causality analogous to Chains-of-Thought techniques. This supports the current prompt approaches while suggesting directions to enhance LLMs' causal reasoning capability. Furthermore, CausalBench incorporates background knowledge and training data into prompts to thoroughly unlock LLMs' text-comprehension ability during evaluation; the findings indicate that LLMs understand causality through semantic associations with distinct entities, rather than directly from contextual information or numerical distributions.
- Published
- 2024
38. Exploring the True Potential: Evaluating the Black-box Optimization Capability of Large Language Models
- Author
-
Huang, Beichen, Wu, Xingyu, Zhou, Yu, Wu, Jibin, Feng, Liang, Cheng, Ran, and Tan, Kay Chen
- Subjects
Computer Science - Neural and Evolutionary Computing - Abstract
Large language models (LLMs) have demonstrated exceptional performance not only in natural language processing tasks but also in a great variety of non-linguistic domains. In diverse optimization scenarios, there is also a rising trend of applying LLMs. However, whether the application of LLMs in the black-box optimization problems is genuinely beneficial remains unexplored. This paper endeavors to offer deep insights into the potential of LLMs in optimization through a comprehensive investigation, which covers both discrete and continuous optimization problems to assess the efficacy and distinctive characteristics that LLMs bring to this field. Our findings reveal both the limitations and advantages of LLMs in optimization. Specifically, on the one hand, despite the significant power consumed for running the models, LLMs exhibit subpar performance in pure numerical tasks, primarily due to a mismatch between the problem domain and their processing capabilities; on the other hand, although LLMs may not be ideal for traditional numerical optimization, their potential in broader optimization contexts remains promising, where LLMs exhibit the ability to solve problems in non-numerical domains and can leverage heuristics from the prompt to enhance their performance. To the best of our knowledge, this work presents the first systematic evaluation of LLMs for numerical optimization. Our findings pave the way for a deeper understanding of LLMs' role in optimization and guide future application of LLMs in a wide range of scenarios.
- Published
- 2024
39. GI-Free Pilot-Aided Channel Estimation for Affine Frequency Division Multiplexing Systems
- Author
-
Zhou, Yu, Yin, Haoran, Zhou, Nanhao, Tang, Yanqun, Zhang, Xiaoying, and Yuan, Weijie
- Subjects
Electrical Engineering and Systems Science - Signal Processing - Abstract
The recently developed affine frequency division multiplexing (AFDM) can achieve full diversity in doubly selective channels, providing a comprehensive sparse representation of the delay-Doppler domain channel. Thus, accurate channel estimation is feasible by using just one pilot symbol. However, traditional AFDM channel estimation schemes necessitate the use of guard intervals (GI) to mitigate data-pilot interference, leading to spectral efficiency degradation. In this paper, we propose a GI-free pilot-aided channel estimation algorithm for AFDM systems, which improves spectral efficiency significantly. To mitigate the interference between the pilot and data symbols caused by the absence of GI, we perform joint interference cancellation, channel estimation, and signal detection iteratively. Simulation results show that the bit error rate (BER) performance of the proposed method can approach the ideal case with perfect channel estimation.
- Published
- 2024
40. Linear dynamics and classical tests of the gravitational quantum field theory
- Author
-
Gao, Yuan-Kun, Huang, Da, Ma, Yong-Liang, Tang, Yong, Wu, Yue-Liang, and Zhou, Yu-Feng
- Subjects
General Relativity and Quantum Cosmology ,High Energy Physics - Phenomenology ,High Energy Physics - Theory - Abstract
We explore the new physics phenomena of gravidynamics governed by the inhomogeneous spin gauge symmetry based on the gravitational quantum field theory. Such a gravidynamics enables us to derive the generalized Einstein equation and an equation beyond it. To simplify the analyses, we linearize the dynamic equations of gravitational interaction by keeping terms up to the leading order in the dual gravigauge field. We then apply the linearized dynamic equations to two particular gravitational phenomena. First, we consider the linearized equations in the absence of source fields, which are shown to have five physical propagating polarizations as gravitational waves, i.e., two tensor modes, two vector modes, and one scalar, instead of the two tensor polarizations of general relativity. Second, we examine the Newtonian limit in which the gravitational fields and the matter source distribution are weak and static. By deriving the associated Poisson equation, we obtain the exact relation of the fundamental interaction coupling in the gravidynamics with the experimentally measured Newtonian constant. We also make use of nonrelativistic objects and relativistic photons to probe the Newtonian field configurations. In particular, the experiments from the gravitational deflection of light rays and the Shapiro time delay can place stringent constraints on the linearized gravidynamics in the gravitational quantum field theory., Comment: 7 pages, 1 figure
- Published
- 2024
41. Robust Finite-time Stabilization of Linear Systems with Limited State Quantization
- Author
-
Zhou, Yu, Polyakov, Andrey, and Zheng, Gang
- Subjects
Mathematics - Optimization and Control ,Electrical Engineering and Systems Science - Systems and Control - Abstract
This paper investigates the robust asymptotic stabilization of a linear time-invariant (LTI) system by a static feedback with a static state quantization. It is shown that the controllable LTI system can be stabilized to zero in a finite time by means of a nonlinear feedback with a quantizer having a limited (finite) number of values (quantization seeds) even when all parameters of the controller and the quantizer are time-invariant. The control design is based on generalized homogeneity. A homogeneous spherical quantizer is introduced. The static homogeneous feedback is shown to be a local (or global) finite-time stabilizer for the linear system (depending on the system matrix). The tuning rules for both the quantizer and the feedback law are obtained in the form of Linear Matrix Inequalities (LMIs). The closed-loop system is proven to be robust with respect to some bounded matched and vanishing mismatched perturbations. Theoretical results are supported by numerical simulations.
- Published
- 2024
42. A geometric realization of Koszul duality for graded gentle algebras
- Author
-
Li, Zixu, Qiu, Yu, and Zhou, Yu
- Subjects
Mathematics - Representation Theory ,Mathematics - Category Theory ,Mathematics - Rings and Algebras - Abstract
We show that the Koszul functor of a homologically smooth graded gentle algebra can be realized as the half rotation in a geometric model. As a byproduct, we prove an intersection-dim formula involving the Koszul functor., Comment: 29 pages, 17 figures. Any comments are welcome
- Published
- 2024
43. TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model
- Author
-
Lyu, Jiahao, Wei, Jin, Zeng, Gangyan, Li, Zeng, Xie, Enze, Wang, Wei, and Zhou, Yu
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Existing scene text spotters are designed to locate and transcribe texts from images. However, it is challenging for a spotter to achieve precise detection and recognition of scene texts simultaneously. Inspired by the glimpse-focus spotting pipeline of human beings and impressive performances of Pre-trained Language Models (PLMs) on visual tasks, we ask: 1) "Can machines spot texts without precise detection just like human beings?", and if yes, 2) "Is text block another alternative for scene text spotting other than word or character?" To this end, our proposed scene text spotter leverages advanced PLMs to enhance performance without fine-grained detection. Specifically, we first use a simple detector for block-level text detection to obtain rough positional information. Then, we finetune a PLM using a large-scale OCR dataset to achieve accurate recognition. Benefiting from the comprehensive language knowledge gained during the pre-training phase, the PLM-based recognition module effectively handles complex scenarios, including multi-line, reversed, occluded, and incomplete-detection texts. Taking advantage of the fine-tuned language model on scene recognition benchmarks and the paradigm of text block detection, extensive experiments demonstrate the superior performance of our scene text spotter across multiple public benchmarks. Additionally, we attempt to spot texts directly from an entire scene image to demonstrate the potential of PLMs, even Large Language Models (LLMs)., Comment: 12 pages, 8 figures
- Published
- 2024
44. Constraints on evaporating primordial black holes from the AMS-02 positron data
- Author
-
Huang, Jia-Zhi and Zhou, Yu-Feng
- Subjects
High Energy Physics - Phenomenology
- Abstract
Cosmic-ray (CR) positrons are relatively rare due to their secondary origin and thus sensitive to exotic contributions. Primordial black holes (PBHs) with masses above $\sim 5\times10^{14}\,\mathrm{g}$ can be stable sources of CR positrons due to Hawking radiation. We show that the CR positron flux measured by AMS-02 can place stringent constraints on the energy fraction of PBHs relative to that of dark matter $f_{\text{PBH}}$. Making use of the state-of-the-art models for CR propagation in both the Galaxy and heliosphere, we obtain a conservative upper limit of $f_{\text{PBH}}\lesssim3\times 10^{-4}$ at $M_{\mathrm{PBH}}\simeq2\times 10^{16}\,\mathrm{g}$, which improves the previous constraints obtained from the Voyager CR all-electron data by around an order of magnitude.
- Published
- 2024
45. Biomimetic optoelectronics with nanomaterials for artificial vision
- Author
-
Long, Zhenghao, Zhou, Yu, Ding, Yucheng, Qiu, Xiao, Poddar, Swapnadeep, and Fan, Zhiyong
- Published
- 2024
46. Reverse Size Effect of the Unconfined Compressive Strength of Crystalline Rock: A Grain-Scale Perspective
- Author
-
Liang, Qinyuan, Lan, Hengxing, Zhou, Yu, Li, Bo, Sun, Weifeng, Liu, Shijie, and Lv, Wenjun
- Published
- 2024
47. Surface quality evaluation of cold plasma and NMQL multi-field coupling eco-friendly micro-milling 7075-T6 aluminum alloy
- Author
-
Duan, Zhen-Jing, Wang, Shuai-Shuai, Shi, Shu-Yan, Liu, Ji-Yu, Li, Yu-Heng, Wang, Zi-Heng, Li, Chang-He, Zhou, Yu-Yang, Song, Jin-Long, and Liu, Xin
- Published
- 2024
48. Altered energy dynamics of soil nematode food web modify multifunctionality under precipitation regime change in a temperate grassland
- Author
-
Mo, Xiaomei, Zhou, Yu, Hou, Shuangli, Hu, Zhongmin, Zheng, Guo, and Cui, Shuyan
- Published
- 2024
49. Effect of Grain Boundary Character Distribution on the Precipitation Behavior: A Comparative Study for 304 Steel and T91 Steel
- Author
-
Li, Hongjun, Zhou, Yu, Hong, Lin, Huang, Ming, and Yang, Sen
- Published
- 2024
50. Forensic age estimation in adults based on multidetector computed tomography analysis of bone density in the medial meta-epiphyseal region of clavicle
- Author
-
Shi, Lei, Luo, Shuai, Liu, Meng, Zhang, Xing‑tao, Zhou, Yu-chi, Yang, Hui-kun, Deng, Zhen-hua, Zhan, Meng-jun, and Chen, Yi-jiu
- Published
- 2024