54,252 results on '"Zhang, Yang"'
Search Results
2. Inverse Melting of Polar Order in a Ferroelectric Oxide
- Author
-
Zhang, Yang, Sung, Suk Hyun, Clement, Colin B., Cheong, Sang-Wook, and Baggari, Ismail El
- Subjects
Condensed Matter - Disordered Systems and Neural Networks
- Abstract
In many condensed matter systems, long range order emerges at low temperatures as thermal fluctuations subside. In the presence of competing interactions or quenched disorder, however, some systems can show unusual configurations that become more disordered at low temperature, a rare phenomenon known as "inverse melting". Here, we discover an inverse melting of the polar order in a ferroelectric oxide with quenched chemical disorder (BaTi$_{1-x}$Zr$_x$O$_3$) through direct atomic-scale visualization using in situ scanning transmission electron microscopy. In contrast to the clean BaTiO$_3$ parent system in which long range order tracks lower temperatures, we observe in the doped system BaTi$_{1-x}$Zr$_x$O$_3$ that thermally driven fluctuations at high temperature give way to a more ordered state and then to a re-entrant disordered configuration at even lower temperature. Such an inverse melting of the polar order is likely linked to the random field generated by Zr dopants, which modulates the energy landscape arising from the competition between thermal fluctuations and random field pinning potential. These visualizations highlight a rich landscape of order and disorder in materials with quenched disorder, which may be key to understanding their advanced functionalities.
- Published
- 2024
3. Inverse Nonlinear Scattering by a Metric
- Author
-
Hintz, Peter, Barreto, Antônio Sá, Uhlmann, Gunther, and Zhang, Yang
- Subjects
Mathematics - Analysis of PDEs; 35P25, 58J50
- Abstract
We study the inverse problem of determining a time-dependent globally hyperbolic Lorentzian metric from the scattering operator for semilinear wave equations. Comment: 62 pages
- Published
- 2024
4. LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing
- Author
-
Nie, Xiaonan, Liu, Qibin, Fu, Fangcheng, Zhu, Shenhan, Miao, Xupeng, Li, Xiaoyang, Zhang, Yang, Liu, Shouda, and Cui, Bin
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing
- Abstract
Larger transformer models always perform better on various tasks but are more costly to scale up. To efficiently enlarge models, the mixture-of-experts (MoE) architecture is widely adopted, which consists of a gate network and a series of experts and keeps the training cost constant by routing the input data to a fixed number of experts instead of all. In existing large-scale MoE training systems, experts would be distributed among different GPUs for parallelization, and thus input data requires additional all-to-all communications to access the target experts and conduct corresponding computations. However, upon evaluating the training process of three mainstream MoE models on commonly used GPU clusters, we found that the all-to-all communication ratio averaged around 45%, which significantly hinders the efficiency and scalability of training MoE models. In this paper, we propose LSH-MoE, a communication-efficient MoE training framework using locality-sensitive hashing (LSH). We first present the problems of scaling MoE training in existing systems and highlight the potential of exploiting token similarity to facilitate data compression. Then, we introduce an efficient LSH-based compression technique, which utilizes the cross-polytope hashing for rapid clustering and implements a residual-based error compensation scheme to alleviate the adverse impact of compression. To verify the effectiveness of our methods, we conduct experiments on both language models (e.g., RoBERTa, GPT, and T5) and vision models (e.g., Swin) for pre-training and fine-tuning tasks. The results demonstrate that our method substantially outperforms its counterparts across different tasks, with speedups of 1.28x to 2.2x. Comment: Accepted by NeurIPS 2024
- Published
- 2024
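The cross-polytope hashing mentioned in the abstract above can be illustrated with a minimal sketch. This is not the LSH-MoE implementation: the function names (`make_projection`, `cross_polytope_hash`, `cluster_tokens`), the Gaussian projection matrix, and the bucket numbering are assumptions chosen for illustration. Each vector is projected with a random Gaussian matrix and hashed to the nearest signed coordinate axis, so that similar token vectors tend to share a bucket.

```python
import random


def make_projection(dim, seed=0):
    """Hypothetical helper: dim x dim matrix of i.i.d. Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]


def cross_polytope_hash(vec, projection):
    """Hash vec to the closest signed axis (+e_i or -e_i) after projection.

    Returns an integer in [0, 2*dim): i for +e_i, i + dim for -e_i.
    """
    projected = [sum(p * v for p, v in zip(row, vec)) for row in projection]
    # The closest cross-polytope vertex is the coordinate of largest magnitude.
    idx = max(range(len(projected)), key=lambda i: abs(projected[i]))
    return idx if projected[idx] >= 0 else idx + len(projected)


def cluster_tokens(tokens, projection):
    """Group token positions by hash bucket (illustrative clustering step)."""
    buckets = {}
    for pos, tok in enumerate(tokens):
        buckets.setdefault(cross_polytope_hash(tok, projection), []).append(pos)
    return buckets
```

Because the hash depends only on the direction of the projected vector, rescaling a token leaves its bucket unchanged, while negating it moves it to the opposite vertex.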
5. Generative AI for Data Augmentation in Wireless Networks: Analysis, Applications, and Case Study
- Author
-
Wen, Jinbo, Kang, Jiawen, Niyato, Dusit, Zhang, Yang, Wang, Jiacheng, Sikdar, Biplab, and Zhang, Ping
- Subjects
Computer Science - Networking and Internet Architecture; Computer Science - Artificial Intelligence
- Abstract
Data augmentation is a powerful technique to mitigate data scarcity. However, owing to fundamental differences in wireless data structures, traditional data augmentation techniques may not be suitable for wireless data. Fortunately, Generative Artificial Intelligence (GenAI) can be an effective alternative to wireless data augmentation due to its excellent data generation capability. This article systemically explores the potential and effectiveness of GenAI-driven data augmentation in wireless networks. We first briefly review data augmentation techniques, discuss their limitations in wireless networks, and introduce generative data augmentation, including reviewing GenAI models and their applications in data augmentation. We then explore the application prospects of GenAI-driven data augmentation in wireless networks from the physical, network, and application layers, which provides a GenAI-driven data augmentation architecture for each application. Subsequently, we propose a general generative diffusion model-based data augmentation framework for Wi-Fi gesture recognition, which uses transformer-based diffusion models to generate high-quality channel state information data. Furthermore, we develop residual neural network models for Wi-Fi gesture recognition to evaluate the role of augmented data and conduct a case study based on a real dataset. Simulation results demonstrate the effectiveness of the proposed framework. Finally, we discuss research directions for generative data augmentation.
- Published
- 2024
6. AdaS&S: a One-Shot Supernet Approach for Automatic Embedding Size Search in Deep Recommender System
- Author
-
Wei, He, Yang, Yuekui, Zhang, Yang, Wu, Haiyang, Liu, Meixi, and Ma, Shaoping
- Subjects
Computer Science - Information Retrieval; Computer Science - Machine Learning
- Abstract
Deep Learning Recommendation Models (DLRMs) utilize the embedding layer to represent various categorical features. Traditional DLRMs adopt a unified embedding size for all features, leading to suboptimal performance and redundant parameters. Thus, many Automatic Embedding size Search (AES) works focus on obtaining mixed embedding sizes with strong model performance. However, previous AES works can hardly address several challenges together: (1) The search results of embedding sizes are unstable; (2) Recommendation effect with AES results is unsatisfactory; (3) Memory cost of embeddings is uncontrollable. To address these challenges, we propose a novel one-shot AES framework called AdaS&S, in which a supernet encompassing various candidate embeddings is built and AES is performed as searching network architectures within it. Our framework contains two main stages: In the first stage, we decouple training parameters from searching embedding sizes, and propose the Adaptive Sampling method to yield a well-trained supernet, which further helps to produce stable AES results. In the second stage, to obtain embedding sizes that benefit the model effect, we design a reinforcement learning search process which utilizes the supernet trained previously. Meanwhile, to adapt searching to a specific resource constraint, we introduce the resource competition penalty to balance the model effectiveness and memory cost of embeddings. We conduct extensive experiments on public datasets to show the superiority of AdaS&S. Our method could improve AUC by about 0.3% while saving about 20% of model parameters. Empirical analysis also shows that the stability of searching results in AdaS&S significantly exceeds other methods.
- Published
- 2024
7. Regularized stress tensor of vector fields in de Sitter space
- Author
-
Zhang, Yang and Ye, Xuan
- Subjects
General Relativity and Quantum Cosmology; Mathematical Physics; Quantum Physics
- Abstract
We study the Stueckelberg field in de Sitter space, which is a massive vector field with the gauge fixing (GF) term $\frac{1}{2\zeta} (A^\mu\,_{;\, \mu})^2$. We obtain the vacuum stress tensor, which consists of the transverse, longitudinal, temporal, and GF parts, and each contains various UV divergences. By the minimal subtraction rule, we regularize each part of the stress tensor to its pertinent adiabatic order. The transverse stress tensor is regularized to the 0th adiabatic order, while the longitudinal, temporal, and GF stress tensors are regularized to the 2nd adiabatic order. The resulting total regularized vacuum stress tensor is convergent and maximally-symmetric, has a positive energy density, and respects the covariant conservation, and thus can be identified as the cosmological constant that drives the de Sitter inflation. Under the Lorenz condition $A^\mu\,_{;\, \mu}=0$, the regularized Stueckelberg stress tensor reduces to the regularized Proca stress tensor that contains only the transverse and longitudinal modes. In the massless limit, the regularized Stueckelberg stress tensor becomes zero, and is the same as that of the Maxwell field with the GF term, and no trace anomaly exists. If the order of adiabatic regularization were lower than our prescription, some divergences would remain. If the order were higher, say, under the conventional 4th-order regularization, more terms than necessary would be subtracted off, leading to an unphysical negative energy density and the trace anomaly simultaneously. Comment: 42 pages, 10 figures
- Published
- 2024
8. Transferable Sequential Recommendation via Vector Quantized Meta Learning
- Author
-
Yue, Zhenrui, Zeng, Huimin, Zhang, Yang, McAuley, Julian, and Wang, Dong
- Subjects
Computer Science - Information Retrieval; Computer Science - Artificial Intelligence
- Abstract
While sequential recommendation achieves significant progress on capturing user-item transition patterns, transferring such large-scale recommender systems remains challenging due to the disjoint user and item groups across domains. In this paper, we propose vector quantized meta learning for transferable sequential recommenders (MetaRec). Without requiring additional modalities or shared information across domains, our approach leverages user-item interactions from multiple source domains to improve the target domain performance. To solve the input heterogeneity issue, we adopt vector quantization that maps item embeddings from heterogeneous input spaces to a shared feature space. Moreover, our meta transfer paradigm exploits limited target data to guide the transfer of source domain knowledge to the target domain (i.e., learn to transfer). In addition, MetaRec adaptively transfers from multiple source tasks by rescaling meta gradients based on the source-target domain similarity, enabling selective learning to improve recommendation performance. To validate the effectiveness of our approach, we perform extensive experiments on benchmark datasets, where MetaRec consistently outperforms baseline methods by a considerable margin. Comment: Accepted to BigData 2024
- Published
- 2024
9. Growth of Gravitational Wave Spectrum from Sound Waves in a Universe with Generic Expansion Rate
- Author
-
Guo, Huai-Ke, Hu, Jiahang, Xiao, Yang, Yang, Jin Min, and Zhang, Yang
- Subjects
General Relativity and Quantum Cosmology; Astrophysics - Cosmology and Nongalactic Astrophysics; High Energy Physics - Phenomenology
- Abstract
We derived here the factor $\Upsilon$, which quantifies how the gravitational wave spectrum generated by sound waves in the radiation sector grows over time, in a universe with a generic expansion rate set by another dominant energy content. When the dominant energy density satisfies $\rho \propto a^{-3(1+w)}$, we found that $\Upsilon$ has a compact analytical expression: $\Upsilon =\frac{2[1-y^{3(w-1)/2}]}{3(1-w)}$, where $y = a(t)/a(t_s)$ is the ratio of the scale factor at a later time $t$ to that at $t_s$, when gravitational wave production from sound waves starts. This generic result reduces to that derived previously for radiation-dominated and matter-dominated cases, thus generalizing previous formulas to more general cosmological contexts and providing more accurate results. The derivation relies solely on a stationary source, implying that this generic result of $\Upsilon$ serves as a universal factor in describing the growth of the gravitational wave production and can appear beyond cosmological phase transitions. Comment: 9 pages, 3 figures
- Published
- 2024
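As a worked check of the expression quoted in the abstract above, substituting $w=1/3$ (radiation domination) and $w=0$ (matter domination) into $\Upsilon =\frac{2[1-y^{3(w-1)/2}]}{3(1-w)}$ gives the two special cases the abstract refers to. The limits for $y\to\infty$ and $w\to 1$ are added here as direct consequences of the same formula and are not taken from the abstract.

```latex
\begin{align}
w=\tfrac{1}{3}:\quad \Upsilon &= \frac{2\,[1-y^{3(1/3-1)/2}]}{3(1-1/3)}
  = \frac{2\,[1-y^{-1}]}{2} = 1-\frac{1}{y},\\
w=0:\quad \Upsilon &= \frac{2\,[1-y^{3(0-1)/2}]}{3(1-0)}
  = \frac{2}{3}\left(1-y^{-3/2}\right),\\
w<1,\ y\to\infty:\quad \Upsilon &\to \frac{2}{3(1-w)},\\
w\to 1:\quad \Upsilon &\to \ln y .
\end{align}
```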
10. Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning
- Author
-
Bao, Keqin, Yan, Ming, Zhang, Yang, Zhang, Jizhi, Wang, Wenjie, Feng, Fuli, and He, Xiangnan
- Subjects
Computer Science - Information Retrieval
- Abstract
Frequently updating Large Language Model (LLM)-based recommender systems to adapt to new user interests -- as done for traditional ones -- is impractical due to high training costs, even with acceleration methods. This work explores adapting to dynamic user interests without any model updates by leveraging In-Context Learning (ICL), which allows LLMs to learn new tasks from few-shot examples provided in the input. Using new-interest examples as the ICL few-shot examples, LLMs may learn real-time interest directly, avoiding the need for model updates. However, existing LLM-based recommenders often lose the in-context learning ability during recommendation tuning, while the original LLM's in-context learning lacks recommendation-specific focus. To address this, we propose RecICL, which customizes recommendation-specific in-context learning for real-time recommendations. RecICL organizes training examples in an in-context learning format, ensuring that in-context learning ability is preserved and aligned with the recommendation task during tuning. Extensive experiments demonstrate RecICL's effectiveness in delivering real-time recommendations without requiring model updates. Our code is available at https://github.com/ym689/rec_icl.
- Published
- 2024
11. Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation
- Author
-
Zhang, Yang, You, Juntao, Bai, Yimeng, Zhang, Jizhi, Bao, Keqin, Wang, Wenjie, and Chua, Tat-Seng
- Subjects
Computer Science - Information Retrieval; Computer Science - Artificial Intelligence
- Abstract
Recent advancements in recommender systems have focused on leveraging Large Language Models (LLMs) to improve user preference modeling, yielding promising outcomes. However, current LLM-based approaches struggle to fully leverage user behavior sequences, resulting in suboptimal preference modeling for personalized recommendations. In this study, we propose a novel Counterfactual Fine-Tuning (CFT) method to address this issue by explicitly emphasizing the role of behavior sequences when generating recommendations. Specifically, we employ counterfactual reasoning to identify the causal effects of behavior sequences on model output and introduce a task that directly fits the ground-truth labels based on these effects, achieving the goal of explicit emphasis. Additionally, we develop a token-level weighting mechanism to adjust the emphasis strength for different item tokens, reflecting the diminishing influence of behavior sequences from earlier to later tokens when predicting an item. Extensive experiments on real-world datasets demonstrate that CFT effectively improves behavior sequence modeling. Our code is available at https://github.com/itsmeyjt/CFT.
- Published
- 2024
12. Thermodynamics of Barrow Einstein-power-Yang-Mills AdS black hole in the restricted phase space
- Author
-
Du, Yun-Zhi, Zhao, Hui-Hua, Zhang, Yang, and Gu, Qiang
- Subjects
High Energy Physics - Theory
- Abstract
Due to quantum gravitational effects, black hole horizons are, following Barrow, ``fractalized'' into a sphereflake. Motivated by this, in this work we investigate the phase structure and stability of Einstein-Power-Yang-Mills AdS black holes with a fractal structure on the black hole horizon in the restricted phase space. Through the first law of thermodynamics and the Smarr relation in the restricted phase space, we observe that the mass parameter is understood as the internal energy and that the Smarr relation is not a homogeneous function of order one for all quantities, owing to the fractal structure. The fractal structure can also be regarded as a probe of phase transitions. When the central charge of this system is fixed, there exists a novel phenomenon: the supercritical phase transition. Furthermore, the effects of the fractal parameter and the non-linear Yang-Mills parameter on the thermodynamic stability of this system are also investigated.
- Published
- 2024
13. HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion
- Author
-
Zeng, Yu, Zhang, Yang, Liu, Jiachen, Shen, Linlin, Deng, Kaijun, He, Weizhao, and Wang, Jinbao
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Hair editing is a critical image synthesis task that aims to edit hair color and hairstyle using text descriptions or reference images, while preserving irrelevant attributes (e.g., identity, background, cloth). Many existing methods are based on StyleGAN to address this task. However, due to the limited spatial distribution of StyleGAN, it struggles with multiple hair color editing and facial preservation. Considering the advancements in diffusion models, we utilize Latent Diffusion Models (LDMs) for hairstyle editing. Our approach introduces Multi-stage Hairstyle Blend (MHB), effectively separating control of hair color and hairstyle in diffusion latent space. Additionally, we train a warping module to align the hair color with the target region. To further enhance multi-color hairstyle editing, we fine-tuned a CLIP model using a multi-color hairstyle dataset. Our method not only tackles the complexity of multi-color hairstyles but also addresses the challenge of preserving original colors during diffusion editing. Extensive experiments showcase the superiority of our method in editing multi-color hairstyles while preserving facial attributes given textual descriptions and reference images.
- Published
- 2024
14. Direct observation of a photoinduced topological phase transition in Bi-doped (Pb,Sn)Se
- Author
-
Mogi, Masataka, Choi, Dongsung, Primeau, Louis, Lv, Baiqing, Azoury, Doron, Su, Yifan, Fu, Liang, Zhang, Yang, and Gedik, Nuh
- Subjects
Condensed Matter - Materials Science; Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
Ultrafast photoexcitation offers a novel approach to manipulating quantum materials. One of the long-standing goals in this field is to achieve optical control over topological properties. However, the impact on their electronic structures, which host gapless surface states, has yet to be directly observed. Here, using time- and angle-resolved photoemission spectroscopy, we visualize the photo-induced evolution of the band structure in Bi$_y$(Pb$_{1-x}$Sn$_x$)$_{1-y}$Se(111) films from topological to trivial insulators. Following near-infrared ultrafast laser excitation, we observe that the topological surface state opens a substantial gap of up to 0.1 eV. Considering the topological phase diagram associated with lattice distortion and atomic displacement, we show that a uniaxial strain generated by the ultrafast optical pulse is sufficiently effective and strong for the observed topological phase transition. Our study highlights the potential of optical tuning of materials through laser excitation to control topological properties on ultrafast timescales. Comment: Accepted in PRL, 17 pages, 4 figures
- Published
- 2024
15. Insulating charge transfer ferromagnetism
- Author
-
Zhang, Yixin and Zhang, Yang
- Subjects
Condensed Matter - Strongly Correlated Electrons; Condensed Matter - Materials Science
- Abstract
We propose a mechanism for insulating ferromagnetism in the honeycomb Hubbard model of semiconductor moiré superlattices. The ferromagnetism emerges in the critical charge transfer regime, stabilizing the quantum anomalous Hall state without Hund's coupling. We further note the ferromagnetic exchange applies to general charge transfer systems when breaking particle-hole symmetry. Comment: 4+6 pages, 4+6 figures
- Published
- 2024
16. The evolution of two-point correlation function of galaxies with a twin-peak initial power spectrum
- Author
-
Zhang, Yang and Li, Bichu
- Subjects
Astrophysics - Astrophysics of Galaxies; General Relativity and Quantum Cosmology
- Abstract
The evolution equation of the two-point correlation function $\xi$ of galaxies can analytically describe the large scale structure of the galaxy distribution, and the solution depends also upon the initial condition. The primeval spectrum of the baryon acoustic oscillations (BAO) contains multiple peaks that survived the Silk damping, and, as a relevant portion, two peaks of the primeval BAO spectrum fall into the range of current galaxy surveys. Incorporating this portion, we use a twin-peak initial power spectrum of the galaxies, and obtain the evolution solution from a redshift $z=8$ to $z=0$ in the Gaussian approximation. The outcome $\xi(r)$ at $z=0.6$ still exhibits the 100 Mpc periodic bumps as observed by the WiggleZ survey, a feature largely determined by the Jeans length $\lambda_J$ in the equation. In particular, due to the superposition of the twin peaks in the initial condition, $\xi(r)$ shows a shallow trough at $\sim 70 h^{-1}$Mpc and a deep trough at $\sim 140 h^{-1}$Mpc, agreeing with the observational data, much better than our previous work that used a simple one-peak initial spectrum. Comment: 10 pages, 5 figures
- Published
- 2024
17. A Stock Price Prediction Approach Based on Time Series Decomposition and Multi-Scale CNN using OHLCT Images
- Author
-
Pei, Zhiyuan, Yan, Jianqi, Yan, Jin, Yang, Bailing, Li, Ziyuan, Zhang, Lin, Liu, Xin, and Zhang, Yang
- Subjects
Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Quantitative Finance - Statistical Finance
- Abstract
Recently, deep learning in stock prediction has become an important branch. Image-based methods show potential by capturing complex visual patterns and spatial correlations, offering advantages in interpretability over time series models. However, image-based approaches are more prone to overfitting, hindering robust predictive performance. To improve accuracy, this paper proposes a novel method, named Sequence-based Multi-scale Fusion Regression Convolutional Neural Network (SMSFR-CNN), for predicting stock price movements in the China A-share market. By utilizing CNN to learn sequential features and combining them with image features, we improve the accuracy of stock trend prediction on the A-share market stock dataset. This approach reduces the search space for image features and stabilizes and accelerates the training process. Extensive comparative experiments on 4,454 A-share stocks show that the model achieves a 61.15% positive predictive value and a 63.37% negative predictive value for the next 5 days, resulting in a total profit of 165.09%. Comment: 32 pages, 5 figures, 5 tables
- Published
- 2024
18. Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning
- Author
-
Liu, Yujian, Chang, Shiyu, Jaakkola, Tommi, and Zhang, Yang
- Subjects
Computer Science - Computation and Language
- Abstract
Recent studies have identified one aggravating factor of LLM hallucinations as the knowledge inconsistency between pre-training and fine-tuning, where unfamiliar fine-tuning data mislead the LLM to fabricate plausible but wrong outputs. In this paper, we propose a novel fine-tuning strategy called Prereq-Tune to address this knowledge inconsistency and reduce hallucinations. Fundamentally, Prereq-Tune disentangles the learning of skills and knowledge, so the model learns only the task skills without being impacted by the knowledge inconsistency. To achieve this, Prereq-Tune introduces an additional prerequisite learning stage to learn the necessary knowledge for SFT, allowing subsequent SFT to focus only on task skills. Prereq-Tune can also be combined with fictitious synthetic data to enhance the grounding of LLM outputs to their internal knowledge. Experiments show that Prereq-Tune outperforms existing baselines in improving LLM's factuality across short QA and long-form generation tasks. It also opens new possibilities for knowledge-controlled generation in LLMs. Our code is available at https://github.com/UCSB-NLP-Chang/Prereq_tune.git.
- Published
- 2024
19. Predicting 30-Day Hospital Readmission in Medicare Patients: Insights from an LSTM Deep Learning Model
- Author
-
Li, Xintao, Liu, Sibei, Yu, Dezhi, Zhang, Yang, and Liu, Xiaoyu
- Subjects
Computer Science - Machine Learning
- Abstract
Readmissions among Medicare beneficiaries are a major problem for the US healthcare system from a perspective of both healthcare operations and patient caregiving outcomes. Our study analyzes Medicare hospital readmissions using LSTM networks with feature engineering to assess feature contributions. We selected variables from admission-level data, inpatient medical history and patient demography. The LSTM model is designed to capture temporal dynamics from admission-level and patient-level data. In a case study on the MIMIC dataset, the LSTM model outperformed the logistic regression baseline, accurately leveraging temporal features to predict readmission. The major features were the Charlson Comorbidity Index, hospital length of stay, and the hospital admissions over the past 6 months, while demographic variables were less impactful. This work suggests that LSTM networks offer a more promising approach to improve Medicare patient readmission prediction. It captures temporal interactions in patient databases, enhancing current prediction models for healthcare providers. Adoption of predictive models into clinical practice may be more effective in identifying Medicare patients to provide early and targeted interventions to improve patient outcomes. Comment: 5 pages, 1 table, 5 figures. Accepted by the 2024 3rd International Conference on Cloud Computing, Big Data Application and Software Engineering (CBASE 2024); the final version will be published in the IEEE conference proceedings
- Published
- 2024
20. Exploring multi-step electroweak phase transitions in the 2HDM+$\boldsymbol{a}$
- Author
-
Si, Zong-guo, Wang, Hong-xin, Wang, Lei, and Zhang, Yang
- Subjects
High Energy Physics - Phenomenology
- Abstract
Multiple electroweak phase transitions occurring sequentially in the early universe can give rise to intriguing phenomenology, compared to the typical single-step electroweak phase transition. In this work, we investigate this scenario within the framework of the two-Higgs-doublet model with a pseudoscalar, utilizing the complete one-loop finite-temperature effective potential. After considering relevant experimental and theoretical constraints, we identify four distinct types of phase transitions. In the first case, only the configuration of the CP-even Higgs acquires a non-zero value via a first-order or a cross-over electroweak phase transition, leading to electroweak symmetry breaking. In the remaining three cases, the pseudoscalar fields can obtain vacuum expectation values at different phases of the multi-step phase transition process, leading to spontaneous breaking of the CP symmetry. As the temperature decreases, the phase shifts to the vacuum observed today via a first-order electroweak phase transition; at this point, the vacuum expectation value of the pseudoscalar field returns to zero, restoring the CP symmetry. Finally, we compare the transition strength and the stochastic gravitational wave background generated in the four situations along with the projected detection limits. Comment: 24 pages, 7 figures
- Published
- 2024
21. Personalized Image Generation with Large Multimodal Models
- Author
-
Xu, Yiyan, Wang, Wenjie, Zhang, Yang, Biao, Tang, Yan, Peng, Feng, Fuli, and He, Xiangnan
- Subjects
Computer Science - Information Retrieval
- Abstract
Personalized content filtering, such as recommender systems, has become a critical infrastructure to alleviate information overload. However, these systems merely filter existing content and are constrained by its limited diversity, making it difficult to meet users' varied content needs. To address this limitation, personalized content generation has emerged as a promising direction with broad applications. Nevertheless, most existing research focuses on personalized text generation, with relatively little attention given to personalized image generation. The limited work in personalized image generation faces challenges in accurately capturing users' visual preferences and needs from noisy user-interacted images and complex multimodal instructions. Worse still, there is a lack of supervised data for training personalized image generation models. To overcome the challenges, we propose a Personalized Image Generation Framework named Pigeon, which adopts exceptional large multimodal models with three dedicated modules to capture users' visual preferences and needs from noisy user history and multimodal instructions. To alleviate the data scarcity, we introduce a two-stage preference alignment scheme, comprising masked preference reconstruction and pairwise preference alignment, to align Pigeon with the personalized image generation task. We apply Pigeon to personalized sticker and movie poster generation, where extensive quantitative results and human evaluation highlight its superiority over various generative baselines.
- Published
- 2024
22. Aegis:An Advanced LLM-Based Multi-Agent for Intelligent Functional Safety Engineering
- Author
-
Shi, Lu, Qi, Bin, Luo, Jiarui, Zhang, Yang, Liang, Zhanzhao, Gao, Zhaowei, Deng, Wenke, and Sun, Lin
- Subjects
Computer Science - Multiagent Systems
- Abstract
Functional safety is a critical aspect of automotive engineering, encompassing all phases of a vehicle's lifecycle, including design, development, production, operation, and decommissioning. This domain involves highly knowledge-intensive tasks. This paper introduces Aegis: An Advanced LLM-Based Multi-Agent for Intelligent Functional Safety Engineering. Aegis is specifically designed to support complex functional safety tasks within the automotive sector. It is tailored to perform Hazard Analysis and Risk Assessment (HARA), document Functional Safety Requirements (FSR), and plan test cases for Automatic Emergency Braking (AEB) systems. The most advanced version, Aegis-Max, leverages Retrieval-Augmented Generation (RAG) and reflective mechanisms to enhance its capability in managing complex, knowledge-intensive tasks. Additionally, targeted prompt refinement by professional functional safety practitioners can significantly optimize Aegis's performance in the functional safety domain. This paper demonstrates the potential of Aegis to improve the efficiency and effectiveness of functional safety processes in automotive engineering.
- Published
- 2024
23. Optimization and Application of Cloud-based Deep Learning Architecture for Multi-Source Data Prediction
- Author
-
Zhang, Yang, Wang, Fa, Huang, Xin, Li, Xintao, Liu, Sibei, and Zhang, Hansong
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Databases; Computer Science - Machine Learning; Quantitative Biology - Quantitative Methods
- Abstract
This study develops a cloud-based deep learning system for early prediction of diabetes, leveraging the distributed computing capabilities of the AWS cloud platform and deep learning technologies to achieve efficient and accurate risk assessment. The system utilizes EC2 p3.8xlarge GPU instances to accelerate model training, reducing training time by 93.2% while maintaining a prediction accuracy of 94.2%. With an automated data processing and model training pipeline built using Apache Airflow, the system can complete end-to-end updates within 18.7 hours. In clinical applications, the system demonstrates a prediction accuracy of 89.8%, sensitivity of 92.3%, and specificity of 95.1%. Early interventions based on predictions lead to a 37.5% reduction in diabetes incidence among the target population. The system's high performance and scalability provide strong support for large-scale diabetes prevention and management, showcasing significant public health value. Comment: 6 pages, 5 figures, 3 tables. The final version will be published in the proceedings of the IEEE conference
- Published
- 2024
24. Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
- Author
-
Hao, Yilun, Zhang, Yang, and Fan, Chuchu
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
While large language models (LLMs) have recently demonstrated strong potential in solving planning problems, there is a trade-off between flexibility and complexity. LLMs, as zero-shot planners themselves, are still not capable of directly generating valid plans for complex planning problems such as multi-constraint or long-horizon tasks. On the other hand, many frameworks aiming to solve complex planning problems often rely on task-specific preparatory efforts, such as task-specific in-context examples and pre-defined critics/verifiers, which limits their cross-task generalization capability. In this paper, we tackle these challenges by observing that the core of many planning problems lies in optimization problems: searching for the optimal solution (best plan) with goals subject to constraints (preconditions and effects of decisions). With LLMs' commonsense, reasoning, and programming capabilities, this opens up the possibilities of a universal LLM-based approach to planning problems. Inspired by this observation, we propose LLMFP, a general-purpose framework that leverages LLMs to capture key information from planning problems and formally formulate and solve them as optimization problems from scratch, with no task-specific examples needed. We apply LLMFP to 9 planning problems, ranging from multi-constraint decision making to multi-step planning problems, and demonstrate that LLMFP achieves on average 83.7% and 86.8% optimal rate across 9 tasks for GPT-4o and Claude 3.5 Sonnet, significantly outperforming the best baseline (direct planning with OpenAI o1-preview) with 37.6% and 40.7% improvements. We also validate components of LLMFP with ablation experiments and analyze the underlying success and failure reasons., Comment: 50 pages, 25 figures, 7 tables
- Published
- 2024
25. A Hitchhiker's Guide to Scaling Law Estimation
- Author
-
Choshen, Leshem, Zhang, Yang, and Andreas, Jacob
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Scaling laws predict the loss of a target machine learning model by extrapolating from easier-to-train models with fewer parameters or smaller training sets. This provides an efficient way for practitioners and researchers alike to compare pretraining decisions involving optimizers, datasets, and model architectures. Despite the widespread use of scaling laws to model the dynamics of language model training, there has been little work on understanding how to best estimate and interpret them. We collect (and release) a large-scale dataset containing losses and downstream evaluations for 485 previously published pretrained models. We use these to estimate more than 1000 scaling laws, then derive a set of best practices for estimating scaling laws in new model families. We find that fitting scaling laws to intermediate checkpoints of training runs (and not just their final losses) substantially improves accuracy, and that -- all else equal -- estimates of performance are generally most accurate when derived from other models of similar sizes. However, because there is a significant degree of variability across model seeds, training multiple small models is sometimes more useful than training a single large one. Moreover, while different model families differ in scaling behavior, they are often similar enough that a target model's behavior can be predicted from a single model with the same architecture, along with scaling parameter estimates derived from other model families.
- Published
- 2024
26. Tracing Human Stress from Physiological Signals using UWB Radar
- Author
-
Xu, Jia, Xiao, Teng, Lv, Pin, Chen, Zhe, Cai, Chao, Zhang, Yang, and Xiong, Zehui
- Subjects
Computer Science - Human-Computer Interaction ,Computer Science - Hardware Architecture ,Computer Science - Machine Learning ,Electrical Engineering and Systems Science - Signal Processing - Abstract
Stress tracing is an important research domain that supports many applications, such as health care and stress management; and its closest related works are derived from stress detection. However, these existing works cannot well address two important challenges facing stress detection. First, most of these studies involve asking users to wear physiological sensors to detect their stress states, which has a negative impact on the user experience. Second, these studies have failed to effectively utilize multimodal physiological signals, which results in less satisfactory detection results. This paper formally defines the stress tracing problem, which emphasizes the continuous detection of human stress states. A novel deep stress tracing method, named DST, is presented. Note that DST proposes tracing human stress based on physiological signals collected by a noncontact ultrawideband radar, which is more friendly to users when collecting their physiological signals. In DST, a signal extraction module is carefully designed at first to robustly extract multimodal physiological signals from the raw RF data of the radar, even in the presence of body movement. Afterward, a multimodal fusion module is proposed in DST to ensure that the extracted multimodal physiological signals can be effectively fused and utilized. Extensive experiments are conducted on three real-world datasets, including one self-collected dataset and two public datasets. Experimental results show that the proposed DST method significantly outperforms all the baselines in terms of tracing human stress states. On average, DST provides a 6.31% increase in detection accuracy on all datasets, compared with the best baselines., Comment: 19 pages, 11 figures
- Published
- 2024
27. $\texttt{ModSCAN}$: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
- Author
-
Jiang, Yukun, Li, Zheng, Shen, Xinyue, Liu, Yugeng, Backes, Michael, and Zhang, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Computers and Society - Abstract
Large vision-language models (LVLMs) have been rapidly developed and widely used in various fields, but the (potential) stereotypical bias in the model is largely unexplored. In this study, we present a pioneering measurement framework, $\texttt{ModSCAN}$, to $\underline{SCAN}$ the stereotypical bias within LVLMs from both vision and language $\underline{Mod}$alities. $\texttt{ModSCAN}$ examines stereotypical biases with respect to two typical stereotypical attributes (gender and race) across three kinds of scenarios: occupations, descriptors, and persona traits. Our findings suggest that 1) the currently popular LVLMs show significant stereotype biases, with CogVLM emerging as the most biased model; 2) these stereotypical biases may stem from the inherent biases in the training dataset and pre-trained models; 3) the utilization of specific prompt prefixes (from both vision and language modalities) performs well in reducing stereotypical biases. We believe our work can serve as the foundation for understanding and addressing stereotypical bias in LVLMs., Comment: Accepted in EMNLP 2024. 29 pages, 22 figures
- Published
- 2024
28. Novel inverse multi-objective optimization-empowered design of microperforated panels for enhanced low-frequency noise mitigation
- Author
-
Zhang, Duo, Zhang, Yang, Yuan, Sichen, Tang, Jiong, and Zhou, Kai
- Subjects
Physics - General Physics - Abstract
Microperforated panels (MPPs) display excellent capacity in noise control applications owing to their high strength, simple design, and efficacy in low-frequency sound absorption. Traditionally, the development of MPPs has relied on a trial-and-error design approach. Although simple optimization-based methods have recently begun to be employed, these designs often overlook practical considerations, such as the increased costs associated with adding more MPP layers, which presents a gap to achieve the practical feasibility of MPP deployment. To address this, the study aims to develop an inverse multi-objective optimization-empowered framework for MPP design to enhance low-frequency noise mitigation while minimizing fabrication costs. Specifically, a finite element (FE) model is established to conduct the acoustic analysis of MPPs, followed by thorough experimental validation. A novel multi-objective particle swarm optimization algorithm (MOPSO) is then developed to cope with mixed-type design variables with interrelations inherent to the MPP architecture. Using the high-fidelity FE model as a cornerstone, the MOPSO guides the inverse optimization analysis to yield multiple non-dominant solutions. These solutions not only avoid the trap of local optima, but also allow for continuous screening to ensure the engineering viability based on empirical judgment. The results clearly demonstrate the effectiveness of the proposed methodology. The MPPs designed in this study show great potential for mitigating indoor noise in buildings, addressing noise issues arising from rapid urbanization and transportation development. Furthermore, the novel optimization strategy proposed in this study holds wide applicability for other sound absorption materials., Comment: 32 pages, 11 figures
- Published
- 2024
29. Unitary branching rules for the general linear Lie superalgebra
- Author
-
Gould, Mark and Zhang, Yang
- Subjects
Mathematics - Representation Theory ,Mathematical Physics ,17B10, 05E10 - Abstract
In terms of highest weights, we establish branching rules for finite dimensional unitary simple modules of the general linear Lie superalgebra $\mathfrak{gl}_{m|n}$. Our proof uses the Howe duality for $\mathfrak{gl}_{m|n}$, as well as branching rules for Kac modules. Moreover, we derive the branching rules of type 2 unitary simple $\mathfrak{gl}_{m|n}$-modules, which are dual to the aforementioned unitary modules., Comment: 14 pages
- Published
- 2024
30. Task-agnostic Pre-training and Task-guided Fine-tuning for Versatile Diffusion Planner
- Author
-
Fan, Chenyou, Bai, Chenjia, Shan, Zhao, He, Haoran, Zhang, Yang, and Wang, Zhen
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Diffusion models have demonstrated their capabilities in modeling trajectories of multi-tasks. However, existing multi-task planners or policies typically rely on task-specific demonstrations via multi-task imitation, or require task-specific reward labels to facilitate policy optimization via Reinforcement Learning (RL). To address these challenges, we aim to develop a versatile diffusion planner that can leverage large-scale inferior data that contains task-agnostic sub-optimal trajectories, with the ability to fast adapt to specific tasks. In this paper, we propose SODP, a two-stage framework that leverages Sub-Optimal data to learn a Diffusion Planner, which is generalizable for various downstream tasks. Specifically, in the pre-training stage, we train a foundation diffusion planner that extracts general planning capabilities by modeling the versatile distribution of multi-task trajectories, which can be sub-optimal and has wide data coverage. Then for downstream tasks, we adopt RL-based fine-tuning with task-specific rewards to fast refine the diffusion planner, which aims to generate action sequences with higher task-specific returns. Experimental results from multi-task domains including Meta-World and Adroit demonstrate that SODP outperforms state-of-the-art methods with only a small amount of data for reward-guided fine-tuning.
- Published
- 2024
31. Offline Signature Verification Based on Feature Disentangling Aided Variational Autoencoder
- Author
-
Zhang, Hansong, Guo, Jiangjian, Li, Kun, Zhang, Yang, and Zhao, Yimei
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Offline handwritten signature verification systems are used to verify the identity of individuals, through recognizing their handwritten signature image as genuine signatures or forgeries. The main tasks of signature verification systems include extracting features from signature images and training a classifier for classification. The challenges of these tasks are twofold. First, genuine signatures and skilled forgeries are highly similar in their appearances, resulting in a small inter-class distance. Second, the instances of skilled forgeries are often unavailable, when signature verification models are being trained. To tackle these problems, this paper proposes a new signature verification method. It is the first model that employs a variational autoencoder (VAE) to extract features directly from signature images. To make the features more discriminative, it improves the traditional VAEs by introducing a new loss function for feature disentangling. In addition, it relies on SVM (Support Vector Machine) for classification according to the extracted features. Extensive experiments are conducted on two public datasets: MCYT-75 and GPDS-synthetic where the proposed method significantly outperformed $13$ representative offline signature verification methods. The achieved improvement in distinctive datasets indicates the robustness and great potential of the developed system in real application.
- Published
- 2024
32. MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning
- Author
-
Chen, Tieyuan, Liu, Huabin, He, Tianyao, Chen, Yihang, Gan, Chaofan, Ma, Xiao, Zhong, Cheng, Zhang, Yang, Wang, Yingxue, Lin, Hui, and Lin, Weiyao
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Video causal reasoning aims to achieve a high-level understanding of video content from a causal perspective. However, current video reasoning tasks are limited in scope, primarily executed in a question-answering paradigm and focusing on short videos containing only a single event and simple causal relationships, lacking comprehensive and structured causality analysis for videos with multiple events. To fill this gap, we introduce a new task and dataset, Multi-Event Causal Discovery (MECD). It aims to uncover the causal relationships between events distributed chronologically across long videos. Given visual segments and textual descriptions of events, MECD requires identifying the causal associations between these events to derive a comprehensive, structured event-level video causal diagram explaining why and how the final result event occurred. To address MECD, we devise a novel framework inspired by the Granger Causality method, using an efficient mask-based event prediction model to perform an Event Granger Test, which estimates causality by comparing the predicted result event when premise events are masked versus unmasked. Furthermore, we integrate causal inference techniques such as front-door adjustment and counterfactual inference to address challenges in MECD like causality confounding and illusory causality. Experiments validate the effectiveness of our framework in providing causal relationships in multi-event videos, outperforming GPT-4o and VideoLLaVA by 5.7% and 4.1%, respectively., Comment: Accepted at NeurIPS 2024 as a spotlight paper
- Published
- 2024
33. Investigating Layer Importance in Large Language Models
- Author
-
Zhang, Yang, Dong, Yanfei, and Kawaguchi, Kenji
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
Large language models (LLMs) have gained increasing attention due to their prominent ability to understand and process texts. Nevertheless, LLMs largely remain opaque. The lack of understanding of LLMs has obstructed the deployment in safety-critical scenarios and hindered the development of better models. In this study, we advance the understanding of LLM by investigating the significance of individual layers in LLMs. We propose an efficient sampling method to faithfully evaluate the importance of layers using Shapley values, a widely used explanation framework in feature attribution and data valuation. In addition, we conduct layer ablation experiments to assess the performance degradation resulting from the exclusion of specific layers. Our findings reveal the existence of cornerstone layers, wherein certain early layers can exhibit a dominant contribution over others. Removing one cornerstone layer leads to a drastic collapse of the model performance, often reducing it to random guessing. Conversely, removing non-cornerstone layers results in only marginal performance changes. This study identifies cornerstone layers in LLMs and underscores their critical role for future research.
- Published
- 2024
34. Tunable Anomalous Hall Effect in a Kagome Ferromagnetic Weyl Semimetal
- Author
-
Pate, Samuel E., Wang, Bin, Zhang, Yang, Shen, Bing, Liu, Enke, Martin, Ivar, Jiang, J. Samuel, Zhou, Xiuquan, Chung, Duck Young, Kanatzidis, Mercouri G., Welp, Ulrich, Kwok, Wai-Kwong, and Xiao, Zhi-Li
- Subjects
Condensed Matter - Materials Science - Abstract
Emerging from the intricate interplay of topology and magnetism, the giant anomalous Hall effect (AHE) is the best-known topological property of the recently discovered kagome ferromagnetic Weyl semimetal Co_3Sn_2S_2 with the magnetic Co atoms arranged on a kagome lattice. Here we report that the AHE in Co_3Sn_2S_2 can be fine-tuned by an applied magnetic field orientated within ~2 degrees of the kagome plane, while beyond this regime, it stays unchanged. Particularly, it can vanish in magnetic fields parallel to the kagome plane and even decrease in magnetic fields collinear with the spin direction. This tunable AHE can be attributed to local spin switching enabled by the geometrical frustration of the magnetic kagome lattice, revealing that spins in a kagome ferromagnet change their switching behavior as the magnetic field approaches the kagome plane. Our results also suggest a versatile way to tune the properties of a kagome magnet.
- Published
- 2024
35. Axial Attention Transformer Networks: A New Frontier in Breast Cancer Detection
- Author
-
He, Weijie, Bao, Runyuan, Cang, Yiru, Wei, Jianjun, Zhang, Yang, and Hu, Jiacheng
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Artificial Intelligence ,Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
This paper delves into the challenges and advancements in the field of medical image segmentation, particularly focusing on breast cancer diagnosis. The authors propose a novel Transformer-based segmentation model that addresses the limitations of traditional convolutional neural networks (CNNs), such as U-Net, in accurately localizing and segmenting small lesions within breast cancer images. The model introduces an axial attention mechanism to enhance the computational efficiency and address the issue of global contextual information that is often overlooked by CNNs. Additionally, the paper discusses improvements tailored to the small dataset challenge, including the incorporation of relative position information and a gated axial attention mechanism to refine the model's focus on relevant features. The proposed model aims to significantly improve the segmentation accuracy of breast cancer images, offering a more efficient and effective tool for computer-aided diagnosis.
- Published
- 2024
36. Efficient Fine-Tuning of Large Language Models for Automated Medical Documentation
- Author
-
Leong, Hui Yi, Gao, Yi Fan, Shuai, Ji, Zhang, Yang, and Pamuksuz, Uktu
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Scientific research indicates that for every hour spent in direct patient care, physicians spend nearly two additional hours on administrative tasks, particularly on electronic health records (EHRs) and desk work. This excessive administrative burden not only reduces the time available for patient care but also contributes to physician burnout and inefficiencies in healthcare delivery. To address these challenges, this study introduces MediGen, a fine-tuned large language model (LLM) designed to automate the generation of medical reports from medical dialogues. By leveraging state-of-the-art methodologies for fine-tuning open-source pretrained models, including LLaMA3-8B, MediGen achieves high accuracy in transcribing and summarizing clinical interactions. The fine-tuned LLaMA3-8B model demonstrated promising results, achieving a ROUGE score of 58% and a BERTScore-F1 of 72%, indicating its effectiveness in generating accurate and clinically relevant medical reports. These findings suggest that MediGen has the potential to significantly reduce the administrative workload on physicians, improving both healthcare efficiency and physician well-being., Comment: 4 pages, 3 Figures, 3 Tables. The final version will be published in the proceedings of the IEEE conference
- Published
- 2024
37. Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
- Author
-
Bai, Ye, Chen, Haonan, Chen, Jitong, Chen, Zhuo, Deng, Yi, Dong, Xiaohong, Hantrakul, Lamtharn, Hao, Weituo, Huang, Qingqing, Huang, Zhongyi, Jia, Dongya, La, Feihu, Le, Duc, Li, Bochen, Li, Chumin, Li, Hui, Li, Xingxing, Liu, Shouda, Lu, Wei-Tsung, Lu, Yiqing, Shaw, Andrew, Spijkervet, Janne, Sun, Yakun, Wang, Bo, Wang, Ju-Chiang, Wang, Yuping, Wang, Yuxuan, Xu, Ling, Yang, Yifeng, Yao, Chao, Zhang, Shuo, Zhang, Yang, Zhang, Yilin, Zhao, Hang, Zhao, Ziyi, Zhong, Dejian, Zhou, Shicen, and Zou, Pei
- Subjects
Computer Science - Sound ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
We introduce Seed-Music, a suite of music generation systems capable of producing high-quality music with fine-grained style control. Our unified framework leverages both auto-regressive language modeling and diffusion approaches to support two key music creation workflows: controlled music generation and post-production editing. For controlled music generation, our system enables vocal music generation with performance controls from multi-modal inputs, including style descriptions, audio references, musical scores, and voice prompts. For post-production editing, it offers interactive tools for editing lyrics and vocal melodies directly in the generated audio. We encourage readers to listen to demo audio examples at https://team.doubao.com/seed-music., Comment: Seed-Music technical report, 20 pages, 5 figures
- Published
- 2024
38. Generated Data with Fake Privacy: Hidden Dangers of Fine-tuning Large Language Models on Generated Data
- Author
-
Akkus, Atilla, Li, Mingjie, Chu, Junjie, Backes, Michael, Zhang, Yang, and Sav, Sinem
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Large language models (LLMs) have shown considerable success in a range of domain-specific tasks, especially after fine-tuning. However, fine-tuning with real-world data usually leads to privacy risks, particularly when the fine-tuning samples exist in the pre-training data. To avoid the shortcomings of real data, developers often employ methods to automatically generate synthetic data for fine-tuning, as data generated by traditional models are often far away from the real-world pre-training data. However, given the advanced capabilities of LLMs, the distinction between real data and LLM-generated data has become negligible, which may also lead to privacy risks like real data. In this paper, we present an empirical analysis of this underexplored issue by investigating a key question: "Does fine-tuning with LLM-generated data enhance privacy, or does it pose additional privacy risks?" Based on the structure of LLM's generated data, our research focuses on two primary approaches to fine-tuning with generated data: supervised fine-tuning with unstructured generated data and self-instruct tuning. The number of successful Personal Information Identifier (PII) extractions for Pythia after fine-tuning on our generated data rose by over 20%. Furthermore, the ROC-AUC score of membership inference attacks for Pythia-6.9b after self-instruct tuning improves by more than 40% over base models. The results indicate the potential privacy risks in LLMs when fine-tuning with the generated data.
- Published
- 2024
39. Two-loop amplitudes for $\mathcal{O}(\alpha_s^2)$ corrections to $W\gamma\gamma$ production at the LHC
- Author
-
Badger, Simon, Hartanto, Heribertus Bayu, Wu, Zihao, Zhang, Yang, and Zoia, Simone
- Subjects
High Energy Physics - Phenomenology - Abstract
We present the two-loop helicity amplitudes contributing to the next-to-next-to-leading order QCD predictions for W-boson production in association with two photons at the Large Hadron Collider. We derive compact analytic expressions for the two-loop amplitudes in the leading colour limit, and provide numerical results for the subleading colour contributions. We employ a compact system of integration-by-parts identities provided by the NeatIBP package, allowing for an efficient computation of the rational coefficients of the scattering amplitudes over finite fields., Comment: 31 pages, 5 figures, 8 tables, 2 appendices. v2: fix author affiliation
- Published
- 2024
40. Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning
- Author
-
Yan, Teng, Ruan, Zhendong, Cai, Yaobang, Han, Yu, Li, Wenxian, and Zhang, Yang
- Subjects
Computer Science - Machine Learning ,Computer Science - Robotics - Abstract
As a data-driven paradigm, offline reinforcement learning (Offline RL) has been formulated as sequence modeling, where the Decision Transformer (DT) has demonstrated exceptional capabilities. Unlike previous reinforcement learning methods that fit value functions or compute policy gradients, DT adjusts the autoregressive model based on the expected returns, past states, and actions, using a causally masked Transformer to output the optimal action. However, due to the inconsistency between the sampled returns within a single trajectory and the optimal returns across multiple trajectories, it is challenging to set an expected return to output the optimal action and stitch together suboptimal trajectories. Decision ConvFormer (DC) is easier to understand in the context of modeling RL trajectories within a Markov Decision Process compared to DT. We propose the Q-value Regularized Decision ConvFormer (QDC), which combines the understanding of RL trajectories by DC and incorporates a term that maximizes action values using dynamic programming methods during training. This ensures that the expected returns of the sampled actions are consistent with the optimal returns. QDC achieves excellent performance on the D4RL benchmark, outperforming or approaching the optimal level in all tested environments. It particularly demonstrates outstanding competitiveness in trajectory stitching capability.
- Published
- 2024
41. Quantum Oscillations Evidence for Topological Bands in Kagome Metal ScV6Sn6
- Author
-
Zheng, Guoxin, Zhu, Yuan, Mozaffari, Shirin, Mao, Ning, Chen, Kuan-Wen, Jenkins, Kaila, Zhang, Dechen, Chan, Aaron, Arachchige, Hasitha W. Suriya, Madhogaria, Richa P., Cothrine, Matthew, Meier, William R., Zhang, Yang, Mandrus, David, and Li, Lu
- Subjects
Condensed Matter - Strongly Correlated Electrons - Abstract
Metals with a kagome lattice provide bulk materials that host both flat-band and Dirac electronic dispersions. A new family of kagome metals was recently discovered in AV6Sn6. The Dirac electronic structures of this material require further experimental confirmation. In this manuscript, we investigate this problem by resolving the quantum oscillations in both electrical transport and magnetization in ScV6Sn6. The revealed orbits are consistent with the electronic band structure models. Furthermore, the Berry phase of a dominating orbit is revealed to be around $\pi$, providing direct evidence for the topological band structure, which is consistent with calculations. Our results demonstrate rich physics and shed light on the correlated topological ground state of this kagome metal., Comment: 5 figures, accepted version
- Published
- 2024
42. Understanding Data Importance in Machine Learning Attacks: Does Valuable Data Pose Greater Harm?
- Author
-
Wen, Rui, Backes, Michael, and Zhang, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Machine learning has revolutionized numerous domains, playing a crucial role in driving advancements and enabling data-centric processes. The significance of data in training models and shaping their performance cannot be overstated. Recent research has highlighted the heterogeneous impact of individual data samples, particularly the presence of valuable data that significantly contributes to the utility and effectiveness of machine learning models. However, a critical question remains unanswered: are these valuable data samples more vulnerable to machine learning attacks? In this work, we investigate the relationship between data importance and machine learning attacks by analyzing five distinct attack types. Our findings reveal notable insights. For example, we observe that high importance data samples exhibit increased vulnerability in certain attacks, such as membership inference and model stealing. By analyzing the linkage between membership inference vulnerability and data importance, we demonstrate that sample characteristics can be integrated into membership metrics by introducing sample-specific criteria, therefore enhancing the membership inference performance. These findings emphasize the urgent need for innovative defense mechanisms that strike a balance between maximizing utility and safeguarding valuable data against potential exploitation., Comment: To Appear in Network and Distributed System Security (NDSS) Symposium 2025
- Published
- 2024
43. Velocity-resolved Reverberation Mapping of Changing-look Active Galactic Nucleus NGC 4151 during Outburst Stage. II. Four Season Observation Results
- Author
-
Feng, Hai-Cheng, Li, Sha-Sha, Bai, J. M., Liu, H. T., Lu, Kai-Xing, Pang, Yu-Xuan, Sun, Mouyuan, Wang, Jian-Guo, Zhang, Yang-Wei, and Zhou, Shuying
- Subjects
Astrophysics - Astrophysics of Galaxies - Abstract
We present the results of a four-year velocity-resolved reverberation mapping (RM) campaign of the changing-look active galactic nucleus (CL-AGN) NGC 4151 during its outburst phase. By measuring the time lags of the Hα, Hβ, Hγ, He I, and He II emission lines, we confirm a stratified broad-line region (BLR) structure that aligns with predictions from photoionization models. Intriguingly, we observed an "anti-breathing" phenomenon, where the lags of broad emission lines decreased with increasing luminosity, contrary to the typical expectation. This anomaly may be attributed to the influence of the ultraviolet-optical lag or non-virialized motions in the BLR gas. Velocity-resolved RM and ionization mapping analyses revealed rapid and significant changes in the BLR geometry and kinematics on timescales within one year, which cannot be interpreted by any single mechanism, such as an inhomogeneous BLR, variations in radiation pressure, or changes in the illuminated ionizing field. Additionally, the Hβ lags of NGC 4151 and other CL-AGNs agree with the radius-luminosity relationship established for AGNs with low accretion rates, implying that the CL phenomenon is more likely driven by intrinsic changes in the accretion rate rather than obscuration. These findings provide new insights into the complex internal processes of CL-AGNs and highlight the importance of long-term, multi-line RM for understanding BLR structures, geometry, and kinematics., Comment: 15 pages, 7 figures, 3 tables, Comments welcome! Submitted to ApJ
- Published
- 2024
44. Membership Inference Attacks Against In-Context Learning
- Author
-
Wen, Rui, Li, Zheng, Backes, Michael, and Zhang, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Computation and Language - Abstract
Adapting Large Language Models (LLMs) to specific tasks introduces concerns about computational efficiency, prompting an exploration of efficient methods such as In-Context Learning (ICL). However, the vulnerability of ICL to privacy attacks under realistic assumptions remains largely unexplored. In this work, we present the first membership inference attack tailored for ICL, relying solely on generated texts without their associated probabilities. We propose four attack strategies tailored to various constrained scenarios and conduct extensive experiments on four popular large language models. Empirical results show that our attacks can accurately determine membership status in most cases, e.g., 95% accuracy advantage against LLaMA, indicating that the associated risks are much higher than those shown by existing probability-based attacks. Additionally, we propose a hybrid attack that synthesizes the strengths of the aforementioned strategies, achieving an accuracy advantage of over 95% in most cases. Furthermore, we investigate three potential defenses targeting data, instruction, and output. Results demonstrate that combining defenses from orthogonal dimensions significantly reduces privacy leakage and offers enhanced privacy assurances., Comment: To Appear in the ACM Conference on Computer and Communications Security, October 14-18, 2024
- Published
- 2024
45. Striped magnetization plateau and chirality-reversible anomalous Hall effect in a magnetic kagome metal
- Author
-
Cheng, Erjian, Mao, Ning, Yang, Xiaotian, Song, Boqing, Lou, Rui, Ying, Tianping, Nie, Simin, Fedorov, Alexander, Bertran, François, Ding, Pengfei, Suvorov, Oleksandr, Zhang, Shu, Changdar, Susmita, Schnelle, Walter, Koban, Ralf, Yi, Changjiang, Burkhardt, Ulrich, Büchner, Bernd, Wang, Shancai, Zhang, Yang, Wang, Wenbo, and Felser, Claudia
- Subjects
Condensed Matter - Strongly Correlated Electrons ,Condensed Matter - Materials Science - Abstract
Kagome materials with magnetic frustration in two-dimensional networks are known for their exotic properties, such as the anomalous Hall effect (AHE) with non-collinear spin textures. However, the effects of one-dimensional (1D) spin chains within these networks are less understood. Here, we report a distinctive AHE in the bilayer-distorted kagome material GdTi$_3$Bi$_4$, featuring 1D Gd zigzag spin chains, a one-third magnetization plateau, and two successive metamagnetic transitions. At these metamagnetic transitions, Hall resistivity shows abrupt jumps linked to the formation of stripe domain walls, while within the plateau, the absence of detectable domain walls suggests the possible presence of a skyrmion phase. Reducing the sample size to a few microns reveals additional Hall resistivity spikes, indicating domain wall skew scattering contributions. Magnetic atomistic spin dynamics simulations reveal that the magnetic textures at these transitions have reverse chirality, explaining the evolution of AHE and domain walls with fields. These results underscore the potential of magnetic and crystal symmetry interplay, and magnetic field-engineered spin chirality, for controlling domain walls and tuning transverse properties, advancing spintronic applications.
- Published
- 2024
46. SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation
- Author
-
Zhang, Yang, Zhang, Rui, Nie, Xuecheng, Li, Haochen, Chen, Jikun, Hao, Yifan, Zhang, Xin, Liu, Luoqi, and Li, Ling
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Recent text-to-image models have achieved remarkable success in generating high-quality images. However, when tasked with multi-concept generation, which creates images containing multiple characters or objects, existing methods often suffer from attribute confusion, resulting in severe text-image inconsistency. We found that attribute confusion occurs when a certain region of the latent features attends to multiple or incorrect prompt tokens. In this work, we propose a novel Semantic Protection Diffusion (SPDiffusion) to protect the semantics of regions from the influence of irrelevant tokens, eliminating the confusion of non-corresponding attributes. In the SPDiffusion framework, we design a Semantic Protection Mask (SP-Mask) to represent the relevance of the regions and the tokens, and propose a Semantic Protection Cross-Attention (SP-Attn) to shield the influence of irrelevant tokens on specific regions in the generation process. To evaluate our method, we created a diverse multi-concept benchmark, and SPDiffusion achieves state-of-the-art results on this benchmark, proving its effectiveness. Our method can be combined with many other application methods or backbones, such as ControlNet, Story Diffusion, PhotoMaker and PixArt-alpha to enhance their multi-concept capabilities, demonstrating strong compatibility and scalability.
- Published
- 2024
47. Searching for MeV-scale Axion-like Particles and Dark Photons with PandaX-4T
- Author
-
PandaX Collaboration, Li, Tao, Bo, Zihao, Chen, Wei, Chen, Xun, Chen, Yunhua, Cheng, Zhaokan, Cui, Xiangyi, Fan, Yingjie, Fang, Deqing, Gao, Zhixing, Geng, Lisheng, Giboni, Karl, Guo, Xunan, Guo, Xuyuan, Guo, Zichao, Han, Chencheng, Han, Ke, He, Changda, He, Jinrong, Huang, Di, Huang, Houqi, Huang, Junting, Hou, Ruquan, Hou, Yu, Ji, Xiangdong, Ji, Xiangpan, Ju, Yonglin, Li, Chenxiang, Li, Jiafu, Li, Mingchuan, Li, Shuaijie, Li, Zhiyuan, Lin, Qing, Liu, Jianglai, Lu, Congcong, Lu, Xiaoying, Luo, Lingyin, Luo, Yunyang, Ma, Wenbo, Ma, Yugang, Mao, Yajun, Meng, Yue, Ning, Xuyang, Pang, Binyu, Qi, Ningchun, Qian, Zhicheng, Ren, Xiangxiang, Shan, Dong, Shang, Xiaofeng, Shao, Xiyuan, Shen, Guofang, Shen, Manbin, Sun, Wenliang, Tao, Yi, Wang, Anqing, Wang, Guanbo, Wang, Hao, Wang, Jiamin, Wang, Lei, Wang, Meng, Wang, Qiuhong, Wang, Shaobo, Wang, Siguang, Wang, Wei, Wang, Xiuli, Wang, Xu, Wang, Zhou, Wei, Yuehuan, Wu, Weihao, Wu, Yuan, Xiao, Mengjiao, Xiao, Xiang, Xiong, Kaizhi, Xu, Yifan, Yao, Shunyu, Yan, Binbin, Yan, Xiyu, Yang, Yong, Ye, Peihua, Yu, Chunxu, Yuan, Ying, Yuan, Zhe, Yun, Youhui, Zeng, Xinning, Zhang, Minzhen, Zhang, Peng, Zhang, Shibo, Zhang, Shu, Zhang, Tao, Zhang, Wei, Zhang, Yang, Zhang, Yingxin, Zhang, Yuanyuan, Zhao, Li, Zhou, Jifang, Zhou, Jiaxu, Zhou, Jiayi, Zhou, Ning, Zhou, Xiaopeng, Zhou, Yubo, and Zhou, Zhizhen
- Subjects
High Energy Physics - Experiment - Abstract
Axion-like particles (ALPs) and dark photons (DPs) are viable dark matter particle candidates. We have searched for possible ALP/DP signals in the PandaX-4T liquid xenon detector using 94.8 days of data. A binned likelihood fit is constructed to search for possible mono-energetic peaks induced by the absorption processes between ALPs/DPs and atomic electrons of xenon. A detailed temporal model of decays associated with xenon isotopes is introduced to constrain the number of background events. No signal excess over background expectations is observed, and we have established the most stringent exclusion limits for most ALP/DP masses ranging from 150 keV/$c^2$ to 1 MeV/$c^2$.
- Published
- 2024
48. Haptic artificial muscle skin for extended reality
- Author
-
Guo, Yuxuan, Luo, Yang, Plamthottam, Roshan, Pei, Siyou, Wei, Chen, Han, Ziqing, Fan, Jiacheng, Possinger, Mason, Liu, Kede, Zhu, Yingke, Fei, Zhangqing, Winardi, Isabelle, Hong, Hyeonji, Zhang, Yang, Jin, Lihua, and Pei, Qibing
- Subjects
Information and Computing Sciences ,Human-Centred Computing ,Engineering ,Electronics ,Sensors and Digital Hardware ,Clinical Research ,Humans ,Skin ,Artificial ,Touch ,Wearable Electronic Devices ,Muscle ,Skeletal - Abstract
Existing haptic actuators are often rigid and limited in their ability to replicate real-world tactile sensations. We present a wearable haptic artificial muscle skin (HAMS) based on fully soft, millimeter-scale, multilayer dielectric elastomer actuators (DEAs) capable of significant out-of-plane deformation, a capability that typically requires rigid or liquid biasing. The DEAs use a thickness-varying multilayer structure to achieve large out-of-plane displacement and force, maintaining comfort and wearability. Experimental results demonstrate that HAMS can produce complex tactile feedback with high perception accuracy. Moreover, we show that HAMS can be integrated into extended reality (XR) systems, enhancing immersion and offering potential applications in entertainment, education, and assistive technologies.
- Published
- 2024
49. Image-Perfect Imperfections: Safety, Bias, and Authenticity in the Shadow of Text-To-Image Model Evolution
- Author
-
Wu, Yixin, Shen, Yun, Backes, Michael, and Zhang, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Text-to-image models, such as Stable Diffusion (SD), undergo iterative updates to improve image quality and address concerns such as safety. Improvements in image quality are straightforward to assess. However, how model updates resolve existing concerns and whether they raise new questions remain unexplored. This study takes an initial step in investigating the evolution of text-to-image models from the perspectives of safety, bias, and authenticity. Our findings, centered on Stable Diffusion, indicate that model updates paint a mixed picture. While updates progressively reduce the generation of unsafe images, the bias issue, particularly in gender, intensifies. We also find that negative stereotypes either persist within the same Non-White race group or shift towards other Non-White race groups through SD updates, yet with minimal association of these traits with the White race group. Additionally, our evaluation reveals a new concern stemming from SD updates: State-of-the-art fake image detectors, initially trained for earlier SD versions, struggle to identify fake images generated by updated versions. We show that fine-tuning these detectors on fake images generated by updated versions achieves at least 96.6% accuracy across various SD versions, addressing this issue. Our insights highlight the importance of continued efforts to mitigate biases and vulnerabilities in evolving text-to-image models., Comment: To Appear in the ACM Conference on Computer and Communications Security, October 14-18, 2024
- Published
- 2024
50. EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs
- Author
-
Fan, Zhen, Dai, Peng, Su, Zhuo, Gao, Xu, Lv, Zheng, Zhang, Jiarui, Du, Tianyuan, Wang, Guidong, and Zhang, Yang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Egocentric human pose estimation (HPE) using wearable sensors is essential for VR/AR applications. Most methods rely solely on either egocentric-view images or sparse Inertial Measurement Unit (IMU) signals, leading to inaccuracies due to self-occlusion in images or the sparseness and drift of inertial sensors. Most importantly, the lack of real-world datasets containing both modalities is a major obstacle to progress in this field. To overcome the barrier, we propose EMHI, a multimodal Egocentric human Motion dataset with Head-Mounted Display (HMD) and body-worn IMUs, with all data collected under the real VR product suite. Specifically, EMHI provides synchronized stereo images from downward-sloping cameras on the headset and IMU data from body-worn sensors, along with pose annotations in SMPL format. This dataset consists of 885 sequences captured by 58 subjects performing 39 actions, totaling about 28.5 hours of recording. We evaluate the annotations by comparing them with optical marker-based SMPL fitting results. To substantiate the reliability of our dataset, we introduce MEPoser, a new baseline method for multimodal egocentric HPE, which employs a multimodal fusion encoder, temporal feature encoder, and MLP-based regression heads. The experiments on EMHI show that MEPoser outperforms existing single-modal methods and demonstrates the value of our dataset in solving the problem of egocentric HPE. We believe the release of EMHI and the method could advance the research of egocentric HPE and expedite the practical implementation of this technology in VR/AR products.
- Published
- 2024