48,426 results on '"Wang, Qing"'
Search Results
2. BFA-YOLO: Balanced multiscale object detection network for multi-view building facade attachments detection
- Author
-
Chen, Yangguang, Wang, Tong, Chen, Guanzhou, Zhu, Kun, Tan, Xiaoliang, Wang, Jiaqi, Xie, Hong, Zhou, Wenlin, Zhao, Jingyi, Wang, Qing, Luo, Xiaolong, and Zhang, Xiaodong
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Detection of building facade attachments such as doors, windows, balconies, air conditioner units, billboards, and glass curtain walls plays a pivotal role in numerous applications. Building facade attachments detection aids in building information modeling (BIM) construction and meeting Level of Detail 3 (LOD3) standards. Yet, it faces challenges like uneven object distribution, small object detection difficulty, and background interference. To counter these, we propose BFA-YOLO, a model for detecting facade attachments in multi-view images. BFA-YOLO incorporates three novel innovations: the Feature Balanced Spindle Module (FBSM) for addressing uneven distribution, the Target Dynamic Alignment Task Detection Head (TDATH) aimed at improving small object detection, and the Position Memory Enhanced Self-Attention Mechanism (PMESA) to combat background interference, with each component specifically designed to solve its corresponding challenge. Detection efficacy of deep network models deeply depends on the dataset's characteristics. Existing open source datasets related to building facades are limited by their single perspective, small image pool, and incomplete category coverage. We propose a novel method for building facade attachments detection dataset construction and construct the BFA-3D dataset for facade attachments detection. The BFA-3D dataset features multi-view, accurate labels, diverse categories, and detailed classification. BFA-YOLO surpasses YOLOv8 by 1.8% and 2.9% in mAP@0.5 on the multi-view BFA-3D and street-view Facade-WHU datasets, respectively. These results underscore BFA-YOLO's superior performance in detecting facade attachments. Comment: 22 pages
- Published
- 2024
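The BFA-YOLO abstract above reports gains in mAP@0.5, a metric in which a predicted box counts as a correct detection only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that matching test only, not the authors' evaluation code; the `(x1, y1, x2, y2)` box layout and the function names are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: the IoU >= 0.5 test behind mAP@0.5."""
    return iou(pred_box, gt_box) >= threshold
```

Full mAP additionally ranks predictions by confidence and averages precision over classes; that part is omitted here.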
3. NPU-NTU System for Voice Privacy 2024 Challenge
- Author
-
Yao, Jixun, Kuzmin, Nikita, Wang, Qing, Guo, Pengcheng, Ning, Ziqian, Guo, Dake, Lee, Kong Aik, Chng, Eng-Siong, and Xie, Lei
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Speaker anonymization is an effective privacy protection solution that conceals the speaker's identity while preserving the linguistic content and paralinguistic information of the original speech. To establish a fair benchmark and facilitate comparison of speaker anonymization systems, the VoicePrivacy Challenge (VPC) was held in 2020 and 2022, with a new edition planned for 2024. In this paper, we describe our proposed speaker anonymization system for VPC 2024. Our system employs a disentangled neural codec architecture and a serial disentanglement strategy to gradually disentangle the global speaker identity and time-variant linguistic content and paralinguistic information. We introduce multiple distillation methods to disentangle linguistic content, speaker identity, and emotion. These methods include semantic distillation, supervised speaker distillation, and frame-level emotion distillation. Based on these distillations, we anonymize the original speaker identity using a weighted sum of a set of candidate speaker identities and a randomly generated speaker identity. Our system achieves the best trade-off of privacy protection and emotion preservation in VPC 2024., Comment: System description for VPC 2024
- Published
- 2024
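The NPU-NTU abstract above describes anonymizing the speaker identity as a weighted sum of candidate speaker identities and a randomly generated identity. A rough sketch of that mixing step follows, under stated assumptions: identities are plain lists of floats, the random identity is drawn from a standard Gaussian, and the weights and the name `anonymize_identity` are hypothetical rather than taken from the paper.

```python
import random


def anonymize_identity(candidates, weights, random_weight, rng):
    """Mix candidate identity vectors with one randomly generated vector.

    candidates: list of equal-length identity vectors (assumed layout).
    weights: one mixing weight per candidate (assumed, not from the paper).
    random_weight: weight applied to the random identity.
    rng: a random.Random instance, passed in for reproducibility.
    """
    dim = len(candidates[0])
    # Assumption: the random identity is sampled from a standard Gaussian.
    random_identity = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    mixed = [random_weight * value for value in random_identity]
    for weight, candidate in zip(weights, candidates):
        for i, value in enumerate(candidate):
            mixed[i] += weight * value
    return mixed
```

Setting `random_weight` to zero reduces the result to a plain weighted average of the candidates, which is a convenient sanity check.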
4. Cross-sectional imaging of speed-of-sound distribution using photoacoustic reversal beacons
- Author
-
Wang, Yang, Wang, Danni, Zhong, Liting, Zhou, Yi, Wang, Qing, Chen, Wufan, and Qi, Li
- Subjects
Physics - Medical Physics, Electrical Engineering and Systems Science - Image and Video Processing
- Abstract
Photoacoustic tomography (PAT) enables non-invasive cross-sectional imaging of biological tissues, but it fails to map the spatial variation of speed-of-sound (SOS) within tissues. While SOS is intimately linked to density and elastic modulus of tissues, the imaging of SOS distribution serves as a complementary imaging modality to PAT. Moreover, an accurate SOS map can be leveraged to correct for PAT image degradation arising from acoustic heterogeneities. Herein, we propose a novel approach for SOS reconstruction using only the PAT imaging modality. Our method is based on photoacoustic reversal beacons (PRBs), which are small light-absorbing targets with strong photoacoustic contrast. We excite and scan a number of PRBs positioned at the periphery of the target, and the generated photoacoustic waves propagate through the target from various directions, thereby achieving spatial sampling of the internal SOS. We formulate a linear inverse model for pixel-wise SOS reconstruction and solve it with an iterative optimization technique. We validate the feasibility of the proposed method through simulations, phantoms, and ex vivo biological tissue tests. Experimental results demonstrate that our approach can achieve accurate reconstruction of SOS distribution. Leveraging the obtained SOS map, we further demonstrate significantly enhanced PAT image reconstruction with acoustic correction.
- Published
- 2024
5. PatUntrack: Automated Generating Patch Examples for Issue Reports without Tracked Insecure Code
- Author
-
Jiang, Ziyou, Shi, Lin, Yang, Guowei, and Wang, Qing
- Subjects
Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Software Engineering
- Abstract
Security patches are essential for enhancing the stability and robustness of projects in the software community. While vulnerabilities are officially expected to be patched before being disclosed, patching vulnerabilities is complicated and remains a struggle for many organizations. To patch vulnerabilities, security practitioners typically track vulnerable issue reports (IRs), and analyze their relevant insecure code to generate potential patches. However, the relevant insecure code may not be explicitly specified and practitioners cannot track the insecure code in the repositories, thus limiting their ability to generate patches. In such cases, providing examples of insecure code and the corresponding patches would help security developers better locate and fix the insecure code. In this paper, we propose PatUntrack to automatically generate patch examples from IRs without tracked insecure code. It auto-prompts Large Language Models (LLMs) to make them applicable to analyzing the vulnerabilities. It first generates the completed description of the Vulnerability-Triggering Path (VTP) from vulnerable IRs. Then, it corrects hallucinations in the VTP description with external golden knowledge. Finally, it generates Top-K pairs of Insecure Code and Patch Example based on the corrected VTP description. To evaluate the performance, we conducted experiments on 5,465 vulnerable IRs. The experimental results show that PatUntrack can obtain the highest performance and improve the traditional LLM baselines by +14.6% (Fix@10) on average in patch example generation. Furthermore, PatUntrack was applied to generate patch examples for 76 newly disclosed vulnerable IRs. 27 out of 37 replies from the authors of these IRs confirmed the usefulness of the patch examples generated by PatUntrack, indicating that they can benefit from these examples for patching the vulnerabilities. Comment: Accepted by ASE'24
- Published
- 2024
6. Adversarial Robustness of Open-source Text Classification Models and Fine-Tuning Chains
- Author
-
Qin, Hao, Li, Mingyang, Wang, Junjie, and Wang, Qing
- Subjects
Computer Science - Software Engineering
- Abstract
Context: With the advancement of artificial intelligence (AI) technology and applications, numerous AI models have been developed, leading to the emergence of open-source model hosting platforms like Hugging Face (HF). Thanks to these platforms, individuals can directly download and use models, as well as fine-tune them to construct more domain-specific models. However, just like traditional software supply chains face security risks, AI models and fine-tuning chains also encounter new security risks, such as adversarial attacks. Therefore, the adversarial robustness of these models has garnered attention, potentially influencing people's choices regarding open-source models. Objective: This paper aims to explore the adversarial robustness of open-source AI models and their chains formed by the upstream-downstream relationships via fine-tuning to provide insights into the potential adversarial risks. Method: We collect text classification models on HF and construct the fine-tuning chains. Then, we conduct an empirical analysis of model reuse and associated robustness risks under existing adversarial attacks from two aspects, i.e., models and their fine-tuning chains. Results: Despite the models' widespread downloading and reuse, they are generally susceptible to adversarial attack risks, with an average of 52.70% attack success rate. Moreover, fine-tuning typically exacerbates this risk, resulting in an average 12.60% increase in attack success rates. We also delve into the influence of factors such as attack techniques, datasets, and model architectures on the success rate, as well as the transitivity along the model chains.
- Published
- 2024
7. MUSA: Multi-lingual Speaker Anonymization via Serial Disentanglement
- Author
-
Yao, Jixun, Wang, Qing, Guo, Pengcheng, Ning, Ziqian, Yang, Yuguang, Pan, Yu, and Xie, Lei
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Speaker anonymization is an effective privacy protection solution designed to conceal the speaker's identity while preserving the linguistic content and para-linguistic information of the original speech. While most prior studies focus solely on a single language, an ideal speaker anonymization system should be capable of handling multiple languages. This paper proposes MUSA, a Multi-lingual Speaker Anonymization approach that employs a serial disentanglement strategy to perform a step-by-step disentanglement from a global time-invariant representation to a temporal time-variant representation. By utilizing semantic distillation and self-supervised speaker distillation, the serial disentanglement strategy can avoid strong inductive biases and exhibit superior generalization performance across different languages. Meanwhile, we propose a straightforward anonymization strategy that employs an empty embedding with zero values to simulate the speaker identity concealment process, eliminating the need for conversion to a pseudo-speaker identity and thereby reducing the complexity of the speaker anonymization process. Experimental results on VoicePrivacy official datasets and multi-lingual datasets demonstrate that MUSA can effectively protect speaker privacy while preserving linguistic content and para-linguistic information. Comment: Submitted to TASLP
- Published
- 2024
8. Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification
- Author
-
Zhang, Li, Jiang, Ning, Wang, Qing, Li, Yue, Lu, Quan, and Xie, Lei
- Subjects
Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Trained on 680,000 hours of massive speech data, Whisper is a multitasking, multilingual speech foundation model demonstrating superior performance in automatic speech recognition, translation, and language identification. However, its applicability in speaker verification (SV) tasks remains unexplored, particularly in low-data-resource scenarios where labeled speaker data in specific domains are limited. To fill this gap, we propose a lightweight adaptor framework to boost SV with Whisper, namely Whisper-SV. Given that Whisper is not specifically optimized for SV tasks, we introduce a representation selection module to quantify the speaker-specific characteristics contained in each layer of Whisper and select the top-k layers with prominent discriminative speaker features. To aggregate pivotal speaker-related features while diminishing non-speaker redundancies across the selected top-k distinct layers of Whisper, we design a multi-layer aggregation module in Whisper-SV to integrate multi-layer representations into a singular, compacted representation for SV. In the multi-layer aggregation module, we employ convolutional layers with shortcut connections among different layers to refine speaker characteristics derived from multi-layer representations from Whisper. In addition, an attention aggregation layer is used to reduce non-speaker interference and amplify speaker-specific cues for SV tasks. Finally, a simple classification module is used for speaker classification. Experiments on VoxCeleb1, FFSVC, and IMSV datasets demonstrate that Whisper-SV achieves EER/minDCF of 2.22%/0.307, 6.14%/0.488, and 7.50%/0.582, respectively, showing superior performance in low-data-resource SV scenarios.
- Published
- 2024
9. Vision-driven Automated Mobile GUI Testing via Multimodal Large Language Model
- Author
-
Liu, Zhe, Li, Cheng, Chen, Chunyang, Wang, Junjie, Wu, Boyu, Wang, Yawen, Hu, Jun, and Wang, Qing
- Subjects
Computer Science - Software Engineering
- Abstract
With the advancement of software rendering techniques, GUI pages in mobile apps now encompass a wealth of visual information, where the visual semantics of each page contribute to the overall app logic, presenting new challenges to software testing. Despite the progress in automated Graphical User Interface (GUI) testing, the absence of testing oracles has constrained its efficacy to identify only crash bugs with evident abnormal signals. Nonetheless, there are still a considerable number of non-crash bugs, ranging from unexpected behaviors to misalignments, often evading detection by existing techniques. While these bugs can exhibit visual cues that serve as potential testing oracles, they often entail a sequence of screenshots, and detecting them necessitates an understanding of the operational logic among GUI page transitions, which is challenging for traditional techniques. Considering the remarkable performance of Multimodal Large Language Models (MLLM) in visual and language understanding, this paper proposes a vision-driven automated GUI testing approach, VisionDroid, to detect non-crash functional bugs with MLLM. It begins by extracting GUI text information and aligning it with screenshots to form a vision prompt, enabling MLLM to understand GUI context. The function-aware explorer then employs MLLM for deeper and function-oriented GUI page exploration, while the logic-aware bug detector segments the entire exploration history into logically cohesive parts and prompts the MLLM for bug detection. We evaluate VisionDroid on three datasets and compare it with 10 baselines, demonstrating its excellent performance. The ablation study further proves the contribution of each module. Moreover, VisionDroid identifies 29 new bugs on Google Play, of which 19 have been confirmed and fixed.
- Published
- 2024
10. Generalized Topology in Lattice Models without Chiral Symmetry
- Author
-
Wang, Qing and Hao, Ning
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics, Condensed Matter - Materials Science
- Abstract
The Su-Schrieffer-Heeger (SSH) model is a fundamental lattice model used to study topological physics. Here, we propose a new versatile one-dimensional (1D) lattice model that extends beyond the SSH model. Our 1D model breaks chiral symmetry and has generalized topology characterized by a projected winding number $W_{1D,P}=1$. When this model is extended to 2D, it can generate a second-order topological insulator (SOTI) phase. The generalized topology of the SOTI phase is protected by a pair of opposite winding numbers $W_{2D,P}^{\pm}=\pm1$, which count the opposite phase windings of a projected vortex and antivortex pair defined in the manifold of the entire parameter space. Thus, the topology of our models is robust and the end (corner) modes are independent of the selection of unit cells and boundary configurations. More significantly, we demonstrate that the model is very general and can be inherently realized in many categories of crystalline materials such as BaHCl., Comment: 6 pages, 4 figures
- Published
- 2024
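The entry above takes the Su-Schrieffer-Heeger (SSH) model as its starting point. For the standard SSH chain with intracell hopping v and intercell hopping w, the winding number counts how many times h(k) = v + w e^{ik} encircles the origin as k sweeps the Brillouin zone: 1 when w > v and 0 when w < v. The sketch below evaluates that textbook quantity numerically; it is not the projected winding number $W_{1D,P}$ defined in the paper, and the function name and sampling density are assumptions.

```python
import cmath
import math


def ssh_winding_number(v, w, samples=2000):
    """Winding of h(k) = v + w*exp(i k) around the origin for k in [0, 2*pi]."""
    total = 0.0
    previous = cmath.phase(v + w)  # phase of h(k) at k = 0
    for n in range(1, samples + 1):
        k = 2.0 * math.pi * n / samples
        current = cmath.phase(v + w * cmath.exp(1j * k))
        delta = current - previous
        # Unwrap jumps across the branch cut of the phase at +/- pi.
        if delta > math.pi:
            delta -= 2.0 * math.pi
        elif delta < -math.pi:
            delta += 2.0 * math.pi
        total += delta
        previous = current
    return round(total / (2.0 * math.pi))
```

Calling `ssh_winding_number(0.5, 1.0)` probes the topological regime and `ssh_winding_number(1.0, 0.5)` the trivial one; the result is undefined at the gap-closing point v = w, where h(k) passes through the origin.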
11. Markov Switching Multiple-equation Tensor Regressions
- Author
-
Casarin, Roberto, Craiu, Radu, and Wang, Qing
- Subjects
Statistics - Methodology
- Abstract
We propose a new flexible tensor model for multiple-equation regression that accounts for latent regime changes. The model allows for dynamic coefficients and multi-dimensional covariates that vary across equations. We assume the coefficients are driven by a common hidden Markov process that addresses structural breaks to enhance the model flexibility and preserve parsimony. We introduce a new Soft PARAFAC hierarchical prior to achieve dimensionality reduction while preserving the structural information of the covariate tensor. The proposed prior includes a new multi-way shrinking effect to address over-parametrization issues. We developed theoretical results to help hyperparameter choice. An efficient MCMC algorithm based on random scan Gibbs and back-fitting strategy is developed to achieve better computational scalability of the posterior sampling. The validity of the MCMC algorithm is demonstrated theoretically, and its computational efficiency is studied using numerical experiments in different parameter settings. The effectiveness of the model framework is illustrated using two original real data analyses. The proposed model exhibits superior performance when compared to the current benchmark, Lasso regression.
- Published
- 2024
12. Enhancing Terrestrial Net Primary Productivity Estimation with EXP-CASA: A Novel Light Use Efficiency Model Approach
- Author
-
Chen, Guanzhou, Zhang, Kaiqi, Zhang, Xiaodong, Xie, Hong, Yang, Haobo, Tan, Xiaoliang, Wang, Tong, Ma, Yule, Wang, Qing, Cao, Jinzhou, and Cui, Weihong
- Subjects
Quantitative Biology - Quantitative Methods
- Abstract
The Light Use Efficiency model, epitomized by the CASA model, is extensively applied in the quantitative estimation of vegetation Net Primary Productivity. However, the classic CASA model is marked by significant complexity: the estimation of environmental stress parameters, in particular, necessitates multi-source observation data, adding to the complexity and uncertainty of the model's operation. Additionally, the saturation effect of the Normalized Difference Vegetation Index (NDVI), a key variable in the CASA model, weakened the accuracy of CASA's NPP predictions in densely vegetated areas. To address these limitations, this study introduces the Exponential-CASA (EXP-CASA) model. The EXP-CASA model effectively improves the CASA model by using novel functions for estimating the fraction of absorbed photosynthetically active radiation (FPAR) and environmental stress, by utilizing long-term observational data from FLUXNET and MODIS surface reflectance data. In a comparative analysis of NPP estimation accuracy among four different NPP products, EXP-CASA ($R^2 = 0.68, RMSE= 1.1gC\cdot m^{-2} \cdot d^{-1}$) outperforms others, followed by GLASS-NPP, and lastly MODIS-NPP and classic CASA. Additionally, this research assesses the EXP-CASA model's adaptability to various vegetation indices, evaluates the sensitivity and stability of its parameters over time, and compares its accuracy against other leading NPP estimation products. The findings reveal that the EXP-CASA model exhibits strong adaptability to diverse vegetation indices and stability of model parameters over time series. By introducing a novel estimation approach that optimizes model construction, the EXP-CASA model remarkably improves the accuracy of NPP estimations and paves the way for global-scale, consistent, and continuous assessment of vegetation NPP.
- Published
- 2024
13. Repairing Catastrophic-Neglect in Text-to-Image Diffusion Models via Attention-Guided Feature Enhancement
- Author
-
Chang, Zhiyuan, Li, Mingyang, Wang, Junjie, Liu, Yi, Wang, Qing, and Liu, Yang
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
- Abstract
Text-to-Image Diffusion Models (T2I DMs) have garnered significant attention for their ability to generate high-quality images from textual descriptions. However, these models often produce images that do not fully align with the input prompts, resulting in semantic inconsistencies. The most prominent issue among these semantic inconsistencies is catastrophic-neglect, where the images generated by T2I DMs miss key objects mentioned in the prompt. We first conduct an empirical study on this issue, exploring the prevalence of catastrophic-neglect, potential mitigation strategies with feature enhancement, and the insights gained. Guided by the empirical findings, we propose an automated repair approach named Patcher to address catastrophic-neglect in T2I DMs. Specifically, Patcher first determines whether there are any neglected objects in the prompt, and then applies attention-guided feature enhancement to these neglected objects, resulting in a repaired prompt. Experimental results on three versions of Stable Diffusion demonstrate that Patcher effectively repairs the issue of catastrophic-neglect, achieving 10.1%-16.3% higher Correct Rate in image generation compared to baselines., Comment: 11 pages, 3 figures
- Published
- 2024
14. Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios
- Author
-
Jiang, Ya, Wang, Qing, Du, Jun, Hu, Maocheng, Hu, Pengfei, Liu, Zeyan, Cheng, Shi, Nian, Zhaoxu, Dong, Yuxuan, Cai, Mingqi, Fang, Xin, and Lee, Chin-Hui
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing, Electrical Engineering and Systems Science - Signal Processing
- Abstract
This study presents an audio-visual information fusion approach to sound event localization and detection (SELD) in low-resource scenarios. We aim at utilizing audio and video modality information through cross-modal learning and multi-modal fusion. First, we propose a cross-modal teacher-student learning (TSL) framework to transfer information from an audio-only teacher model, trained on a rich collection of audio data with multiple data augmentation techniques, to an audio-visual student model trained with only a limited set of multi-modal data. Next, we propose a two-stage audio-visual fusion strategy, consisting of an early feature fusion and a late video-guided decision fusion to exploit synergies between audio and video modalities. Finally, we introduce an innovative video pixel swapping (VPS) technique to extend an audio channel swapping (ACS) method to an audio-visual joint augmentation. Evaluation results on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2023 Challenge data set demonstrate significant improvements in SELD performances. Furthermore, our submission to the SELD task of the DCASE 2023 Challenge ranks first place by effectively integrating the proposed techniques into a model ensemble., Comment: accepted by icme2024
- Published
- 2024
15. Calculation of the chiral Lagrangian coefficients with light vector mesons
- Author
-
Geng, Zi-Kan and Wang, Qing
- Subjects
High Energy Physics - Phenomenology, High Energy Physics - Theory
- Abstract
We calculate the coefficients in the effective chiral Lagrangian from QCD, which includes pseudo-scalar mesons and vector mesons (with hidden symmetry), up to $O(p^4)$. This encompasses both the normal and anomalous parts. Our work builds on a previous study that derived the chiral Lagrangian from first principles of QCD, where the low-energy coefficients are defined in terms of specific Green's functions in QCD. This research extends our earlier efforts that focused on calculating the low-energy coefficients of the chiral Lagrangian for pure pseudo-scalar mesons. This marks the first calculation of chiral Lagrangian coefficients for vector mesons from QCD, particularly for the important parameters a and g, which are typically considered inputs in existing literature. Notably, the regularization method used previously is inadequate for this broader scope. We find that cut-off regularization yields reasonable results for both pseudo-scalar mesons and vector mesons, though it has certain limitations. Finally, we demonstrate that our method aligns with the Weinberg sum rules. Comment: 34 pages, 2 figures
- Published
- 2024
16. SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding
- Author
-
Ma, Jiefeng, Wang, Yan, Liu, Chenyu, Du, Jun, Hu, Yu, Zhang, Zhenrong, Hu, Pengfei, Wang, Qing, and Zhang, Jianshu
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Accurately identifying and organizing textual content is crucial for the automation of document processing in the field of form understanding. Existing datasets, such as FUNSD and XFUND, support entity classification and relationship prediction tasks but are typically limited to local and entity-level annotations. This limitation overlooks the hierarchically structured representation of documents, constraining comprehensive understanding of complex forms. To address this issue, we present the SRFUND, a hierarchically structured multi-task form understanding benchmark. SRFUND provides refined annotations on top of the original FUNSD and XFUND datasets, encompassing five tasks: (1) word to text-line merging, (2) text-line to entity merging, (3) entity category classification, (4) item table localization, and (5) entity-based full-document hierarchical structure recovery. We meticulously supplemented the original dataset with missing annotations at various levels of granularity and added detailed annotations for multi-item table regions within the forms. Additionally, we introduce global hierarchical structure dependencies for entity relation prediction tasks, surpassing traditional local key-value associations. The SRFUND dataset includes eight languages including English, Chinese, Japanese, German, French, Spanish, Italian, and Portuguese, making it a powerful tool for cross-lingual form understanding. Extensive experimental results demonstrate that the SRFUND dataset presents new challenges and significant opportunities in handling diverse layouts and global hierarchical structures of forms, thus providing deep insights into the field of form understanding. The original dataset and implementations of baseline methods are available at https://sprateam-ustc.github.io/SRFUND, Comment: NeurIPS 2024 Track on Datasets and Benchmarks under review
- Published
- 2024
17. FireBench: A High-fidelity Ensemble Simulation Framework for Exploring Wildfire Behavior and Data-driven Modeling
- Author
-
Wang, Qing, Ihme, Matthias, Gazen, Cenk, Chen, Yi-Fan, and Anderson, John
- Subjects
Physics - Computational Physics
- Abstract
Background. Wildfire research uses ensemble methods to analyze fire behaviors and assess uncertainties. Nonetheless, current research methods are either confined to simple models or complex simulations with limits. Modern computing tools could allow for efficient, high-fidelity ensemble simulations. Aims. This study proposes a high-fidelity ensemble wildfire simulation framework for studying wildfire behavior, ML tasks, fire-risk assessment, and uncertainty analysis. Methods. In this research, we present a simulation framework that integrates the Swirl-Fire large-eddy simulation tool for wildfire predictions with the Vizier optimization platform for automated run-time management of ensemble simulations and large-scale batch processing. All simulations are executed on tensor-processing units to enhance computational efficiency. Key results. A dataset of 117 simulations is created, each with 1.35 billion mesh points. The simulations are compared to existing experimental data and show good agreement in terms of fire rate of spread. Computations are done for fire acceleration, mean rate of spread, and fireline intensity. Conclusions. Strong coupling between these 2 parameters is observed for the fire spread and intermittency. A critical Froude number that delineates fires from plume-driven to convection-driven is identified and confirmed with literature observations. Implications. The ensemble simulation framework is efficient in facilitating parametric wildfire studies.
- Published
- 2024
18. A Roadmap for Software Testing in Open Collaborative Development Environments
- Author
-
Wang, Qing, Wang, Junjie, Li, Mingyang, Wang, Yawen, and Liu, Zhe
- Subjects
Computer Science - Software Engineering
- Abstract
Amidst the ever-expanding digital sphere, the evolution of the Internet has not only fostered an atmosphere of information transparency and sharing but has also sparked a revolution in software development practices. The distributed nature of open collaborative development, along with its diverse contributors and rapid iterations, presents new challenges for ensuring software quality. This paper offers a comprehensive review and analysis of recent advancements in software quality assurance within open collaborative development environments. Our examination covers various aspects, including process management, personnel dynamics, and technological advancements, providing valuable insights into effective approaches for maintaining software quality in such collaborative settings. Furthermore, we delve into the challenges and opportunities arising from emerging technologies such as LLMs and the AI model-centric development paradigm. By addressing these topics, our study contributes to a deeper understanding of software quality assurance in open collaborative environments and lays the groundwork for future exploration and innovation.
- Published
- 2024
19. A Quantum Neural Network-Based Approach to Power Quality Disturbances Detection and Recognition
- Author
-
Li, Guo-Dong, He, Hai-Yan, Li, Yue, Li, Xin-Hao, Liu, Hao, Wang, Qing-Le, and Cheng, Long
- Subjects
Quantum Physics
- Abstract
Power quality disturbances (PQDs) significantly impact the stability and reliability of power systems, necessitating accurate and efficient detection and recognition methods. While numerous classical algorithms for PQD detection and recognition have been extensively studied and applied, related work in the quantum domain is still in its infancy. In this paper, an improved quantum neural network (QNN) model for PQD detection and recognition is proposed. Specifically, the model constructs a quantum circuit comprising data qubits and ancilla qubits. Classical data is transformed into quantum data by embedding it into data qubits via the encoding layer. Subsequently, parametric quantum gates are utilized to form the variational layer, which facilitates qubit information transformation, thereby extracting essential feature information for detection and recognition. The expected value is obtained by measuring ancilla qubits, enabling the completion of disturbance classification based on this expected value. An analysis reveals that the runtime and space complexities of the QNN are $O\left ( poly\left ( N \right ) \right )$ and $O\left ( N \right )$, respectively. Extensive experiments validate the feasibility and superiority of the proposed model in PQD detection and recognition. The model achieves accuracies of 99.75%, 97.85% and 95.5% in experiments involving the detection of disturbances, recognition of seven single disturbances, and recognition of ten mixed disturbances, respectively. Additionally, noise simulation and comparative experiments demonstrate that the proposed model exhibits robust anti-noise capabilities, requires few training parameters, and maintains high accuracy.
- Published
- 2024
20. Affine vertex operator superalgebra $L_{\widehat{osp(1|2)}}(\mathcal{l},0)$ at admissible level
- Author
-
Li, Huaimin and Wang, Qing
- Subjects
Mathematics - Quantum Algebra - Abstract
Let $L_{\widehat{osp(1|2)}}(\mathcal{l},0)$ be the simple affine vertex operator superalgebra with admissible level $\mathcal{l}$. We prove that the category of weak $L_{\widehat{osp(1|2)}}(\mathcal{l},0)$-modules on which the positive part of $\widehat{osp(1|2)}$ acts locally nilpotently is semisimple. Then we prove that $\mathbb{Q}$-graded vertex operator superalgebras $(L_{\widehat{osp(1|2)}}(\mathcal{l},0),\omega_\xi)$ with new Virasoro elements $\omega_\xi$ are rational and the irreducible modules are exactly the admissible modules for $\widehat{osp(1|2)}$, where $0<\xi<1$ is a rational number. Furthermore, we determine Zhu's algebras $A(L_{\widehat{osp(1|2)}}(\mathcal{l},0))$ and their bimodules $A(L(\mathcal{l},\mathcal{j}))$ for $(L_{\widehat{osp(1|2)}}(\mathcal{l},0),\omega_\xi)$, where $\mathcal{j}$ is the admissible weight. As an application, we calculate the fusion rules among the irreducible ordinary modules of $(L_{\widehat{osp(1|2)}}(\mathcal{l},0),\omega_\xi)$., Comment: 25 pages
- Published
- 2024
21. A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition
- Author
-
Guo, Zilu, Wang, Qing, Du, Jun, Pan, Jia, Liu, Qing-Feng, and Chin-Hui
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
In this paper, we propose a variance-preserving interpolation framework to improve diffusion models for single-channel speech enhancement (SE) and automatic speech recognition (ASR). This new variance-preserving interpolation diffusion model (VPIDM) approach requires only 25 iterative steps and obviates the need for a corrector, an essential element in the existing variance-exploding interpolation diffusion model (VEIDM). Two notable distinctions between VPIDM and VEIDM are the scaling function of the mean of state variables and the constraint imposed on the variance relative to the mean's scale. We conduct a systematic exploration of the theoretical mechanism underlying VPIDM and develop insights regarding VPIDM's applications in SE and ASR using VPIDM as a frontend. Our proposed approach, evaluated on two distinct data sets, demonstrates VPIDM's superior performance over conventional discriminative SE algorithms. We also assess the performance of the proposed model under varying signal-to-noise ratio (SNR) levels. The investigation reveals VPIDM's improved robustness in target noise elimination when compared to VEIDM. Furthermore, utilizing the mid-outputs of both VPIDM and VEIDM results in enhanced ASR accuracies, thereby highlighting the practical efficacy of our proposed approach.
- Published
- 2024
22. Distinctive and Natural Speaker Anonymization via Singular Value Transformation-assisted Matrix
- Author
-
Yao, Jixun, Wang, Qing, Guo, Pengcheng, Ning, Ziqian, and Xie, Lei
- Subjects
Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
Speaker anonymization is an effective privacy protection solution that aims to conceal the speaker's identity while preserving the naturalness and distinctiveness of the original speech. Mainstream approaches use an utterance-level vector from a pre-trained automatic speaker verification (ASV) model to represent speaker identity, which is then averaged or modified for anonymization. However, these systems suffer from deterioration in the naturalness of anonymized speech, degradation in speaker distinctiveness, and severe privacy leakage against powerful attackers. To address these issues and especially generate more natural and distinctive anonymized speech, we propose a novel speaker anonymization approach that models a matrix related to speaker identity and transforms it into an anonymized singular value transformation-assisted matrix to conceal the original speaker identity. Our approach extracts frame-level speaker vectors from a pre-trained ASV model and employs an attention mechanism to create a speaker-score matrix and speaker-related tokens. Notably, the speaker-score matrix acts as the weight for the corresponding speaker-related token, representing the speaker's identity. The singular value transformation-assisted matrix is generated by recomposing the decomposed orthonormal eigenvector matrices and the non-linearly transformed singular values obtained through Singular Value Decomposition (SVD). Experiments on VoicePrivacy Challenge datasets demonstrate the effectiveness of our approach in protecting speaker privacy under all attack scenarios while maintaining speech naturalness and distinctiveness., Comment: Accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing
- Published
- 2024
23. Angle-Resolved Magneto-Chiral Anisotropy in a Non-Centrosymmetric Atomic Layer Superlattice
- Author
-
Cheng, Long, Bao, Mingrui, Zhang, Jingxian, Zhang, Xue, Yang, Qun, Li, Qiang, Cao, Hui, Qiu, Dawei, Liu, Jia, Ye, Fei, Wang, Qing, Liang, Genhao, Li, Hui, Cheng, Guanglei, Zhou, Hua, Zuo, Jian-Min, Zhou, Xiaodong, Shen, Jian, Zhu, Zhifeng, Mu, Sai, Wang, Wenbo, and Zhai, Xiaofang
- Subjects
Condensed Matter - Materials Science - Abstract
Chirality in solid-state materials has sparked significant interest due to potential applications of topologically-protected chiral states in next-generation information technology. The electrical magneto-chiral effect (eMChE), arising from relativistic spin-orbit interactions, shows great promise for developing chiral materials and devices for electronic integration. Here we demonstrate an angle-resolved eMChE in an A-B-C-C type atomic-layer superlattice lacking time and space inversion symmetry. We observe non-superimposable enantiomers of left-handed and right-handed tilted uniaxial magnetic anisotropy as the sample rotates under static fields, with the tilting angle reaching a striking 45 degrees. Magnetic force microscopy and atomistic simulations correlate the tilt to the emergence and evolution of chiral spin textures. The Dzyaloshinskii-Moriya interaction lock effect in competition with the Zeeman effect is demonstrated to be responsible for the angle-resolved eMChE. Our findings open up a new horizon for engineering angle-resolved magneto-chiral anisotropy, shedding light on the development of novel angle-resolved sensing or writing techniques in chiral spintronics.
- Published
- 2024
24. The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report
- Author
-
Ren, Bin, Li, Yawei, Mehta, Nancy, Timofte, Radu, Yu, Hongyuan, Wan, Cheng, Hong, Yuxin, Han, Bingnan, Wu, Zhuoyuan, Zou, Yajun, Liu, Yuqing, Li, Jizhe, He, Keji, Fan, Chao, Zhang, Heng, Zhang, Xiaolin, Yin, Xuanwu, Zuo, Kunlong, Liao, Bohao, Xia, Peizhe, Peng, Long, Du, Zhibo, Di, Xin, Li, Wangkai, Wang, Yang, Zhai, Wei, Pei, Renjing, Guo, Jiaming, Xu, Songcen, Cao, Yang, Zha, Zhengjun, Wang, Yan, Liu, Yi, Wang, Qing, Zhang, Gang, Zhang, Liou, Zhao, Shijie, Sun, Long, Pan, Jinshan, Dong, Jiangxin, Tang, Jinhui, Liu, Xin, Yan, Min, Wang, Qian, Zhou, Menghan, Yan, Yiqiang, Liu, Yixuan, Chan, Wensong, Tang, Dehua, Zhou, Dong, Wang, Li, Tian, Lu, Emad, Barsoum, Jia, Bohan, Qiao, Junbo, Zhou, Yunshuai, Zhang, Yun, Li, Wei, Lin, Shaohui, Zhou, Shenglong, Chen, Binbin, Liao, Jincheng, Zhao, Suiyi, Zhang, Zhao, Wang, Bo, Luo, Yan, Wei, Yanyan, Li, Feng, Wang, Mingshen, Guan, Jinhan, Hu, Dehua, Yu, Jiawei, Xu, Qisheng, Sun, Tao, Lan, Long, Xu, Kele, Lin, Xin, Yue, Jingtong, Yang, Lehan, Du, Shiyi, Qi, Lu, Ren, Chao, Han, Zeyu, Wang, Yuhan, Chen, Chaolin, Li, Haobo, Zheng, Mingjun, Yang, Zhongbao, Song, Lianhong, Yan, Xingzhuo, Fu, Minghan, Zhang, Jingyi, Li, Baiang, Zhu, Qi, Xu, Xiaogang, Guo, Dan, Guo, Chunle, Chen, Jiadi, Long, Huanhuan, Duanmu, Chunjiang, Lei, Xiaoyan, Liu, Jie, Jia, Weilin, Cao, Weifeng, Zhang, Wenlong, Mao, Yanyu, Guo, Ruilong, Zhang, Nihao, Pandey, Manoj, Chernozhukov, Maksym, Le, Giang, Cheng, Shuli, Wang, Hongyuan, Wei, Ziyan, Tang, Qingting, Wang, Liejun, Li, Yongming, Guo, Yanhui, Xu, Hao, Khatami-Rizi, Akram, Mahmoudi-Aznaveh, Ahmad, Hsu, Chih-Chung, Lee, Chia-Ming, Chou, Yi-Shiuan, Joshi, Amogh, Akalwadi, Nikhil, Malagi, Sampada, Yashaswini, Palani, Desai, Chaitra, Tabib, Ramesh Ashok, Patil, Ujwala, and Mudenagudi, Uma
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Electrical Engineering and Systems Science - Image and Video Processing - Abstract
This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of ×4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such as runtime, parameters, and FLOPs, while still maintaining a peak signal-to-noise ratio (PSNR) of approximately 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. In addition, this challenge has 4 tracks including the main track (overall performance), sub-track 1 (runtime), sub-track 2 (FLOPs), and sub-track 3 (parameters). In the main track, all three metrics (i.e., runtime, FLOPs, and parameter count) were considered. The ranking of the main track is calculated based on a weighted sum of the scores of all other sub-tracks. In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking. In sub-track 2, the number of FLOPs was considered. The score calculated based on the corresponding FLOPs was used to determine the ranking. In sub-track 3, the number of parameters was considered. The score calculated based on the corresponding parameters was used to determine the ranking. RLFN is set as the baseline for efficiency measurement. The challenge had 262 registered participants, and 34 teams made valid submissions. They gauge the state-of-the-art in efficient single-image super-resolution. To facilitate the reproducibility of the challenge and enable other researchers to build upon these findings, the code and the pre-trained model of validated solutions are made publicly available at https://github.com/Amazingren/NTIRE2024_ESR/., Comment: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024
- Published
- 2024
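The main-track ranking described in the NTIRE 2024 ESR abstract (a weighted sum of the sub-track scores) can be sketched as follows. This is only an illustration: the abstract does not give the weights or the score definitions, so `ASSUMED_WEIGHTS`, the team names, the score values, and the "higher is better" convention are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical weights; the abstract does not state the real weighting.
ASSUMED_WEIGHTS: Dict[str, float] = {"runtime": 0.5, "flops": 0.25, "params": 0.25}


def main_track_score(sub_scores: Dict[str, float],
                     weights: Dict[str, float] = ASSUMED_WEIGHTS) -> float:
    """Combine one team's sub-track scores into a single weighted sum."""
    return sum(weights[name] * sub_scores[name] for name in weights)


def rank_teams(teams: Dict[str, Dict[str, float]],
               weights: Dict[str, float] = ASSUMED_WEIGHTS) -> List[Tuple[str, float]]:
    """Return (team, score) pairs sorted best first (assumes higher is better)."""
    scored = [(team, main_track_score(scores, weights)) for team, scores in teams.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up example scores for two hypothetical teams.
teams = {
    "team_a": {"runtime": 0.9, "flops": 0.6, "params": 0.5},
    "team_b": {"runtime": 0.7, "flops": 0.9, "params": 0.9},
}
ranking = rank_teams(teams)
```

With these made-up numbers, `team_b` ranks first because its advantage on FLOPs and parameters outweighs `team_a`'s better runtime score.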
25. Unblind Text Inputs: Predicting Hint-text of Text Input in Mobile Apps via LLM
- Author
-
Liu, Zhe, Chen, Chunyang, Wang, Junjie, Chen, Mengzhuo, Wu, Boyu, Huang, Yuekai, Hu, Jun, and Wang, Qing
- Subjects
Computer Science - Human-Computer Interaction - Abstract
Mobile apps have become indispensable for accessing and participating in various environments, especially for low-vision users. Users with visual impairments can use screen readers to read the content of each screen and understand the content that needs to be operated. Screen readers need to read the hint-text attribute in the text input component to remind visually impaired users what to fill in. Unfortunately, based on our analysis of 4,501 Android apps with text inputs, over 76% of them are missing hint-text. These issues are mostly caused by developers' lack of awareness when considering visually impaired individuals. To overcome these challenges, we developed an LLM-based hint-text generation model called HintDroid, which analyzes the GUI information of input components and uses in-context learning to generate the hint-text. To ensure the quality of hint-text generation, we further designed a feedback-based inspection mechanism to adjust the hint-text. Automated experiments demonstrate high BLEU scores, and a user study further confirms its usefulness. HintDroid can not only help visually impaired individuals, but also help ordinary people understand the requirements of input components. HintDroid demo video: https://youtu.be/FWgfcctRbfI., Comment: Accepted by the 2024 CHI Conference on Human Factors in Computing Systems
- Published
- 2024
26. Holistic numerical simulation of a quenching process on a real-size multifilamentary superconducting coil
- Author
-
Xue, Cun, Ren, Han-Xi, Jia, Peng, Wang, Qing-Yu, Liu, Wei, Ou, Xian-Jin, Sun, Liang-Ting, and Silhanek, Alejandro V
- Subjects
Condensed Matter - Superconductivity ,Physics - Applied Physics - Abstract
Superconductors play a crucial role in the advancement of high-field electromagnets. Unfortunately, their performance can be compromised by thermomagnetic instabilities, wherein the interplay of rapid magnetic and slow heat diffusion can result in catastrophic flux jumps eventually leading to irreversible damage. This issue has long plagued high-$J_c$ Nb$_3$Sn wires at the core of high-field magnets. In this study, we introduce a groundbreaking large-scale GPU-optimized algorithm aimed at tackling the complex intertwined effects of electromagnetism, heating, and strain acting concomitantly during the quenching process of superconducting coils. We validate our model by conducting comparisons with magnetization measurements obtained from short multifilamentary Nb$_3$Sn wires and further experimental tests conducted on solenoid coils while subject to ramping transport currents. Furthermore, leveraging our developed numerical algorithm, we unveil the dynamic propagation mechanisms underlying thermomagnetic instabilities (including flux jumps and quenches) within the coils. Remarkably, our findings reveal that the velocity field of flux jumps and quenches within the coil is correlated with the amount of Joule heating experienced by each wire over a specific time interval, rather than solely being dependent on instantaneous Joule heating or maximum temperature. These insights have the potential to pave the way for optimizing the design of next-generation superconducting magnets, thereby directly influencing a wide array of technologically relevant and multidisciplinary applications.
- Published
- 2024
27. Generalization of Graph Neural Networks through the Lens of Homomorphism
- Author
-
Li, Shouheng, Kim, Dongwoo, and Wang, Qing
- Subjects
Computer Science - Machine Learning - Abstract
Despite the celebrated popularity of Graph Neural Networks (GNNs) across numerous applications, the ability of GNNs to generalize remains less explored. In this work, we propose to study the generalization of GNNs through a novel perspective - analyzing the entropy of graph homomorphism. By linking graph homomorphism with information-theoretic measures, we derive generalization bounds for both graph and node classifications. These bounds are capable of capturing subtleties inherent in various graph structures, including but not limited to paths, cycles and cliques. This enables a data-dependent generalization analysis with robust theoretical guarantees. To shed light on the generality of our proposed bounds, we present a unifying framework that can characterize a broad spectrum of GNN models through the lens of graph homomorphism. We validate the practical applicability of our theoretical findings by showing the alignment between the proposed bounds and the empirically observed generalization gaps over both real-world and synthetic datasets., Comment: 17 pages, 3 figures
- Published
- 2024
28. Local Vertex Colouring Graph Neural Networks
- Author
-
Li, Shouheng, Kim, Dongwoo, and Wang, Qing
- Subjects
Computer Science - Machine Learning - Abstract
In recent years, there has been a significant amount of research focused on expanding the expressivity of Graph Neural Networks (GNNs) beyond the Weisfeiler-Lehman (1-WL) framework. While many of these studies have yielded advancements in expressivity, they have frequently come at the expense of decreased efficiency or have been restricted to specific types of graphs. In this study, we investigate the expressivity of GNNs from the perspective of graph search. Specifically, we propose a new vertex colouring scheme and demonstrate that classical search algorithms can efficiently compute graph representations that extend beyond the 1-WL. We show the colouring scheme inherits useful properties from graph search that can help solve problems like graph biconnectivity. Furthermore, we show that under certain conditions, the expressivity of GNNs increases hierarchically with the radius of the search neighbourhood. To further investigate the proposed scheme, we develop a new type of GNN based on two search strategies, breadth-first search and depth-first search, highlighting the graph properties they can capture on top of 1-WL. Our code is available at https://github.com/seanli3/lvc., Comment: 22 pages, 8 figures
- Published
- 2024
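For context on the 1-WL baseline that the Local Vertex Colouring abstract aims to go beyond, the sketch below implements classical 1-WL colour refinement. It is not the paper's colouring scheme (that is in the linked repository); the function names and the example graphs are illustrative assumptions.

```python
from collections import Counter
from typing import Dict, Hashable, List


def wl_refine(adj: Dict[Hashable, List[Hashable]]) -> Dict[Hashable, int]:
    """Classical 1-WL colour refinement on an undirected graph.

    adj maps each node to its list of neighbours. Every node starts with
    colour 0; each round recolours a node by its own colour plus the sorted
    multiset of its neighbours' colours, until the partition stops splitting.
    """
    colours = {v: 0 for v in adj}
    for _ in range(len(adj)):
        signatures = {
            v: (colours[v], tuple(sorted(colours[u] for u in adj[v]))) for v in adj
        }
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colours = {v: palette[signatures[v]] for v in adj}
        # Refinement only ever splits classes, so an equal class count
        # means the partition is stable.
        if len(set(new_colours.values())) == len(set(colours.values())):
            return new_colours
        colours = new_colours
    return colours


def colour_histogram(adj: Dict[Hashable, List[Hashable]]) -> List[int]:
    """Sorted sizes of the final colour classes (a simple graph fingerprint)."""
    return sorted(Counter(wl_refine(adj).values()).values())


# A 6-cycle and two disjoint triangles: both are 2-regular on 6 nodes.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
path3 = {0: [1], 1: [0, 2], 2: [1]}
```

The 6-cycle and the pair of triangles end up with identical colour histograms even though one is connected and the other is not, which is the kind of limitation that motivates schemes more expressive than 1-WL.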
29. Case studies on time-dependent Ginzburg-Landau simulations for superconducting applications
- Author
-
Xue, Cun, Wang, Qing-Yu, Ren, Han-Xi, He, An, and Silhanek, A. V.
- Subjects
Condensed Matter - Superconductivity ,Physics - Applied Physics - Abstract
The macroscopic electromagnetic properties of type II superconductors are primarily influenced by the behavior of microscopic superconducting flux quantum units. Time-dependent Ginzburg-Landau (TDGL) equations provide an elegant and powerful tool for describing and examining both the statics and dynamics of these superconducting entities. They have been instrumental in replicating and elucidating numerous experimental results over the past decades. This paper provides a comprehensive overview of the progress in TDGL simulations, focusing on three key aspects of superconductor applications. The initial section delves into vortex rectification in superconductors described within the TDGL framework. We specifically highlight the superconducting diode effect achieved through asymmetric pinning landscapes and the reversible manipulation of vortex ratchets with dynamic pinning landscapes. The subsequent section reviews the achievements of TDGL simulations concerning the critical current density of superconductors, emphasizing the optimization of pinning sites, particularly vortex pinning and dynamics in polycrystalline Nb$_3$Sn with grain boundaries. The third part concentrates on numerical modeling of vortex penetration and dynamics in superconducting radio frequency (SRF) cavities, including a discussion of superconductor-insulator-superconductor multilayer structures. In the last section, we present key findings, insights, and perspectives derived from the discussed simulations., Comment: 20 pages, 13 figures
- Published
- 2024
30. VEglue: Testing Visual Entailment Systems via Object-Aligned Joint Erasing
- Author
-
Chang, Zhiyuan, Li, Mingyang, Wang, Junjie, Li, Cheng, and Wang, Qing
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Software Engineering - Abstract
Visual entailment (VE) is a multimodal reasoning task consisting of image-sentence pairs whereby a premise is defined by an image, and a hypothesis is described by a sentence. The goal is to predict whether the image semantically entails the sentence. VE systems have been widely adopted in many downstream tasks. Metamorphic testing is the most common technique for testing AI algorithms, but applying it to VE testing poses a significant challenge. Existing approaches either consider perturbations on only a single modality, which results in ineffective tests because the relationship within the image-text pair is destroyed, or conduct only shallow perturbations on the inputs, which can hardly detect the decision errors made by VE systems. Motivated by the fact that objects in the image are the fundamental element for reasoning, we propose VEglue, an object-aligned joint erasing approach for VE systems testing. It first aligns the object regions in the premise and object descriptions in the hypothesis to identify linked and un-linked objects. Then, based on the alignment information, three Metamorphic Relations are designed to jointly erase the objects of the two modalities. We evaluate VEglue on four widely-used VE systems involving two public datasets. Results show that VEglue could detect 11,609 issues on average, which is 194%-2,846% more than the baselines. In addition, VEglue could reach 52.5% Issue Finding Rate (IFR) on average, and significantly outperform the baselines by 17.1%-38.2%. Furthermore, we leverage the tests generated by VEglue to retrain the VE systems, which largely improves model performance (50.8% increase in accuracy) on newly generated tests without sacrificing the accuracy on the original test set., Comment: 12 pages, 3 figures
- Published
- 2024
31. Adversarial Testing for Visual Grounding via Image-Aware Property Reduction
- Author
-
Chang, Zhiyuan, Li, Mingyang, Wang, Junjie, Li, Cheng, Wu, Boyu, Xu, Fanjiang, and Wang, Qing
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Due to the advantages of fusing information from various modalities, multimodal learning is gaining increasing attention. As a fundamental task of multimodal learning, Visual Grounding (VG) aims to locate objects in images through natural language expressions. Ensuring the quality of VG models presents significant challenges due to the complex nature of the task. In the black box scenario, existing adversarial testing techniques often fail to fully exploit the potential of both modalities of information. They typically apply perturbations based solely on either the image or text information, disregarding the crucial correlation between the two modalities, which would lead to failures in test oracles or an inability to effectively challenge VG models. To this end, we propose PEELING, a text perturbation approach via image-aware property reduction for adversarial testing of the VG model. The core idea is to reduce the property-related information in the original expression while ensuring the reduced expression can still uniquely describe the original object in the image. To achieve this, PEELING first conducts the object and properties extraction and recombination to generate candidate property reduction expressions. It then selects the satisfied expressions that accurately describe the original object while ensuring no other objects in the image fulfill the expression, through querying the image with a visual understanding technique. We evaluate PEELING on the state-of-the-art VG model, i.e., OFA-VG, involving three commonly used datasets. Results show that the adversarial tests generated by PEELING achieve 21.4% in MultiModal Impact score (MMI), and outperform state-of-the-art baselines for images and texts by 8.2%--15.1%., Comment: 14 pages, 6 figures
- Published
- 2024
32. Re-Examine Distantly Supervised NER: A New Benchmark and a Simple Approach
- Author
-
Li, Yuepei, Zhou, Kang, Qiao, Qiao, Wang, Qing, and Li, Qi
- Subjects
Computer Science - Computation and Language ,Computer Science - Machine Learning - Abstract
This paper delves into Named Entity Recognition (NER) under the framework of Distant Supervision (DS-NER), where the main challenge lies in the compromised quality of labels due to inherent errors such as false positives, false negatives, and positive type errors. We critically assess the efficacy of current DS-NER methodologies using a real-world benchmark dataset named QTL, revealing that their performance often does not meet expectations. To tackle the prevalent issue of label noise, we introduce a simple yet effective approach, Curriculum-based Positive-Unlabeled Learning (CuPUL), which strategically starts on "easy" and cleaner samples during the training process to enhance model resilience to noisy samples. Our empirical results highlight the capability of CuPUL to significantly reduce the impact of noisy labels and outperform existing methods. The QTL dataset and our code are available on GitHub.
- Published
- 2024
33. Play Guessing Game with LLM: Indirect Jailbreak Attack with Implicit Clues
- Author
-
Chang, Zhiyuan, Li, Mingyang, Liu, Yi, Wang, Junjie, Wang, Qing, and Liu, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence ,Computer Science - Human-Computer Interaction - Abstract
With the development of LLMs, the security threats of LLMs are getting more and more attention. Numerous jailbreak attacks have been proposed to assess the security defense of LLMs. Current jailbreak attacks primarily utilize scenario camouflage techniques. However, their explicit mention of malicious intent is easily recognized and defended against by LLMs. In this paper, we propose an indirect jailbreak attack approach, Puzzler, which can bypass the LLM's defense strategy and obtain malicious responses by implicitly providing LLMs with some clues about the original malicious query. In addition, inspired by the wisdom of "When unable to attack, defend" from Sun Tzu's Art of War, we adopt a defensive stance to gather clues about the original malicious query through LLMs. Extensive experimental results show that Puzzler achieves a query success rate of 96.6% on closed-source LLMs, which is 57.9%-82.7% higher than baselines. Furthermore, when tested against the state-of-the-art jailbreak detection approaches, Puzzler proves to be more effective at evading detection compared to baselines., Comment: 13 pages, 6 figures
- Published
- 2024
34. Causality and a possible interpretation of quantum mechanics
- Author
-
Tu, Kaixun and Wang, Qing
- Subjects
Quantum Physics - Abstract
From the ancient Einstein-Podolsky-Rosen paradox to the recent Sorkin-type impossible measurements problem, the contradictions between relativistic causality, quantum non-locality, and quantum measurement have persisted. Based on quantum field theory, our work provides a framework that harmoniously integrates these three aspects. This framework consists of causality expressed by reduced density matrices and an interpretation of quantum mechanics that considers quantum mechanics to be complete. Specifically, we use reduced density matrices to represent the local information of the quantum state and show that the reduced density matrices cannot evolve superluminally. Unlike recent approaches that focus on causality by introducing new operators to describe detectors, we consider that everything--including detectors, environments, and humans--is composed of the same fundamental fields, which prompts us to question the validity of the derivation of Schrodinger's cat paradox and leads us to propose an interpretation of quantum mechanics that does not require any additional assumptions and is compatible with relativity., Comment: 27 pages, 1 figure
- Published
- 2024
35. QQMR: A Structure-Preserving Quaternion Quasi-Minimal Residual Method for Non-Hermitian Quaternion Linear Systems
- Author
-
Li, Tao, Wang, Qing-Wen, and Zhang, Xin-Fang
- Subjects
Mathematics - Numerical Analysis ,15B33, 65F08, 65F10, 94A08 - Abstract
The quaternion biconjugate gradient (QBiCG) method, as a novel variant of quaternion Lanczos-type methods for solving non-Hermitian quaternion linear systems, does not yield a minimization property. This means that the method possesses a rather irregular convergence behavior, which leads to numerical instability. In this paper, we propose a new structure-preserving quaternion quasi-minimal residual method, based on the quaternion biconjugate orthonormalization procedure with coupled two-term recurrences, which overcomes the drawback of QBiCG. The computational cost and storage required by the proposed method are much less than those of the traditional QMR iterations for the real representation of quaternion linear systems. Some convergence properties of the method are also established. Finally, we report the numerical results to show the robustness and effectiveness of the proposed method compared with QBiCG., Comment: 25 pages
- Published
- 2024
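The QQMR abstract contrasts the proposed method with iterations on the real representation of a quaternion system. As a minimal sketch of what a real representation is (for a single quaternion only, not the paper's algorithm), the code below pairs the standard Hamilton product with the 4x4 real matrix that performs the same left multiplication. The function names are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Quaternion = Tuple[float, float, float, float]  # (a, b, c, d) = a + b*i + c*j + d*k


def qmul(p: Quaternion, q: Quaternion) -> Quaternion:
    """Hamilton product p * q (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def real_rep(p: Quaternion) -> List[List[float]]:
    """4x4 real matrix L(p) such that L(p) @ q equals the coefficients of p * q."""
    a, b, c, d = p
    return [
        [a, -b, -c, -d],
        [b, a, -d, c],
        [c, d, a, -b],
        [d, -c, b, a],
    ]


def matvec(m: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    """Plain matrix-vector product for small dense matrices."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


p = (1, 2, 3, 4)
q = (5, 6, 7, 8)
product = qmul(p, q)
via_matrix = matvec(real_rep(p), q)
```

Each quaternion entry becomes a 4x4 real block, so an n-by-n quaternion matrix grows to a 4n-by-4n real one, which is consistent with the abstract's point that working on the real representation costs more computation and storage.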
36. Quantum Secret Sharing Enhanced: Utilizing W States for Anonymous and Secure Communication
- Author
-
Li, Guo-Dong, Cheng, Wen-Chuan, Wang, Qing-Le, Cheng, Long, Mao, Ying, and Jia, Heng-Yue
- Subjects
Quantum Physics - Abstract
Quantum secret sharing (QSS) is the result of merging the principles of quantum mechanics with secret information sharing. It enables a sender to share a secret among receivers, and the receivers can then collectively recover the secret when the need arises. To enhance the practicality of these quantum protocols, an innovative concept of quantum anonymous secret sharing (QASS) is advanced. In this paper, we propose a QASS protocol via W states, which can share secrets while ensuring recover-ability, recover-security, and recover-anonymity. We have rigorously evaluated our protocols, verifying their accuracy and fortifying their security against scenarios involving the active adversary. This includes considerations for dishonest receivers and non-receivers. Moreover, acknowledging the imperfections inherent in real-world communication channels, we have also undertaken an exhaustive analysis of our protocol's security and effectiveness in a quantum network where some form of noise is present. Our investigations reveal that W states exhibit good performance in mitigating noise interference, making them apt for practical applications., Comment: 18 pages, 6 figures
- Published
- 2024
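The resource used by the QASS protocol above is the W state, the equal superposition of all n-qubit basis states with exactly one qubit in state 1. The sketch below only builds its amplitude vector and measurement probabilities; it does not implement the protocol, and the function names and bit-ordering convention are illustrative assumptions.

```python
import math
from typing import List


def w_state(n: int) -> List[float]:
    """Amplitudes of the n-qubit W state in the computational basis.

    The basis index is the bitstring read as an integer, so the states
    with a single excited qubit sit at indices 1, 2, 4, ..., 2**(n-1).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    amplitudes = [0.0] * (2 ** n)
    for k in range(n):
        amplitudes[1 << k] = 1.0 / math.sqrt(n)
    return amplitudes


def measurement_probabilities(amplitudes: List[float]) -> List[float]:
    """Born-rule probabilities for real amplitudes."""
    return [a * a for a in amplitudes]


amps = w_state(3)
probs = measurement_probabilities(amps)
support = [i for i, a in enumerate(amps) if a != 0.0]
```

For three qubits the state is supported on indices 1, 2, and 4 (bitstrings 001, 010, 100), each observed with probability 1/3.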
37. SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks
- Author
-
Dong, Xingning, Guo, Qingpei, Gan, Tian, Wang, Qing, Wu, Jianlong, Ren, Xiangyuan, Cheng, Yuan, and Chu, Wei
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Multimedia - Abstract
We present a framework for learning cross-modal video representations by directly pre-training on raw data to facilitate various downstream video-text tasks. Our main contributions lie in the pre-training framework and proxy tasks. First, based on the shortcomings of two mainstream pixel-level pre-training architectures (limited applications or less efficient), we propose Shared Network Pre-training (SNP). By employing one shared BERT-type network to refine textual and cross-modal features simultaneously, SNP is lightweight and could support various downstream applications. Second, based on the intuition that people always pay attention to several "significant words" when understanding a sentence, we propose the Significant Semantic Strengthening (S3) strategy, which includes a novel masking and matching proxy task to promote the pre-training performance. Experiments conducted on three downstream video-text tasks and six datasets demonstrate that we establish a new state-of-the-art in pixel-level video-text pre-training; we also achieve a satisfactory balance between the pre-training efficiency and the fine-tuning performance. The codebase is available at https://github.com/alipay/Ant-Multi-Modal-Framework/tree/main/prj/snps3_vtp., Comment: Accepted by TCSVT (IEEE Transactions on Circuits and Systems for Video Technology)
- Published
- 2024
38. Dual-View Data Hallucination with Semantic Relation Guidance for Few-Shot Image Recognition
- Author
-
Wu, Hefeng, Ye, Guangzhi, Zhou, Ziyang, Tian, Ling, Wang, Qing, and Lin, Liang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Learning to recognize novel concepts from just a few image samples is very challenging, as the learned model easily overfits the few available samples, resulting in poor generalizability. One promising but underexplored solution is to compensate for the scarcity of novel-class data by generating plausible samples. However, most existing works along this line exploit visual information only, leaving the generated data easily distracted by challenging factors contained in the few available samples. Being aware of the semantic information in the textual modality that reflects human concepts, this work proposes a novel framework that exploits semantic relations to guide dual-view data hallucination for few-shot image recognition. The proposed framework enables generating more diverse and reasonable data samples for novel classes through effective information transfer from base classes. Specifically, an instance-view data hallucination module hallucinates each sample of a novel class to generate new data by employing local semantic correlated attention and global semantic feature fusion derived from base classes. Meanwhile, a prototype-view data hallucination module exploits a semantic-aware measure to estimate the prototype of a novel class and the associated distribution from the few samples, which thereby harvests the prototype as a more stable sample and enables resampling a large number of samples. We conduct extensive experiments and comparisons with state-of-the-art methods on several popular few-shot benchmarks to verify the effectiveness of the proposed framework., Comment: Accepted by IEEE Transactions on Multimedia
- Published
- 2024
39. Phase transition like behaviors of Propagation of Passenger Stranding phenomena in Subway Networks
- Author
-
Li, Xinyi, Zhao, Shengda, Wang, Liang, Wang, Qing, Zhang, Xun, Liu, Fang, Zhang, Xiaodong, Gong, Daqing, and Zhang, Xinghua
- Subjects
Physics - Physics and Society
- Abstract
The subway, as the most important mode of transportation for daily urban commuting, is a typical non-equilibrium complex system composed of two types of basic units linked by a service relationship. One challenge in operating it is that passengers become stranded at stations, which arises from the spatiotemporal mismatch between supply scale and demand scale. More seriously, there is a special phenomenon, the propagation of passenger stranding (PPS), in which stranded stations form clusters, significantly reducing service efficiency. In this study, the Beijing subway is taken as an example to reveal the nature of PPS phenomena from the viewpoint of statistical physics. The simulation results demonstrate phase-transition-like behaviors depending on the ratio of service supply scale to demand scale. The transition point can quantitatively characterize the resilience failure threshold of the service. The eigen microstate method is used to extract the fundamental patterns of PPS phenomena. Moreover, this study offers a theoretical foundation for strategies to improve service, such as topological planning and train timetable optimization. The methodology developed in the present work has significant implications for the study of other service systems.
- Published
- 2024
40. Mixed QCD-EW corrections to $W$-pair production at electron-positron colliders
- Author
-
Li, Zhe, Zhang, Ren-You, Li, Shu-Xiang, Wang, Xiao-Feng, He, Wen-Jie, Han, Liang, Jiang, Yi, and Wang, Qing-hai
- Subjects
High Energy Physics - Phenomenology
- Abstract
The discrepancy between the CDF measurement and the Standard Model theoretical prediction for the $W$-boson mass underscores the importance of conducting high-precision studies on the $W$ boson, which is one of the predominant objectives of proposed future $e^+e^-$ colliders. We investigate in detail the production of $W$-boson pairs at $e^+e^-$ colliders, and compute the next-to-next-to-leading order mixed QCD-EW corrections to both the integrated cross section and various kinematic distributions. By employing the method of differential equations, we analytically calculate the two-loop master integrals for the mixed QCD-EW virtual corrections to $e^+e^- \rightarrow W^+W^-$. Utilizing the Magnus transformation, we derive a set of canonical master integrals for each integral family. This canonical basis satisfies a system of differential equations in which the dependence on the dimensional regulator is linearly factorized from the kinematics. We then express all these canonical master integrals as Taylor series in $\epsilon$ up to $\epsilon^4$, with coefficients articulated in terms of Goncharov polylogarithms up to weight four. Upon applying our analytic expressions of these master integrals to the phenomenological analysis of $W$-pair production, we observe that the $\mathcal{O}(\alpha\alpha_s)$ corrections are significantly impactful in the $\alpha(0)$ scheme, particularly in certain phase-space regions. However, these mixed QCD-EW corrections can be heavily suppressed by adopting the $G_{\mu}$ scheme., Comment: 32 pages, 9 figures
- Published
- 2024
41. CAFs-derived Exosomal miR-889-3p Might Repress M1 Macrophage Polarization to Boost ESCC Development by Regulating STAT1
- Author
-
Zhang, Shaofeng, Li, Danqing, Wang, Haijun, Liu, Bo, Du, Fan, and Wang, Qing
- Published
- 2024
- Full Text
- View/download PDF
42. Tracking cognitive trajectories in older survivors of COVID-19 up to 2.5 years post-infection
- Author
-
Liu, Yu-Hui, Wu, Quan-Xin, Wang, Qing-Hua, Zhang, Qiao-Feng, Tang, Yi, Liu, Di, Wang, Jing-Juan, Liu, Xiao-Yu, Wang, Ling-Ru, Li, Li, Xu, Cheng, Zhu, Jie, and Wang, Yan-Jiang
- Published
- 2024
- Full Text
- View/download PDF
43. Multi-band Feature Images Concrete Crack Segmentation Framework Using Deep Learning
- Author
-
Zhou, Shuang Xi, Pan, Yuan, Guan, Jingyuan, and Wang, Qing
- Published
- 2024
- Full Text
- View/download PDF
44. Enhancing Strength-Ductility Synergy and Corrosion Residual Strength of Hot-Rolled Mg-2Zn-0.85Y Alloy
- Author
-
Li, Pengyu, Wang, Wanting, Wang, Qing, Mei, Di, Sun, Yufeng, Guan, Shaokang, and Wang, Jianfeng
- Published
- 2024
- Full Text
- View/download PDF
45. The equivalence of titanium alloys defined via β phase decomposition paths
- Author
-
Wang, Cenyang, Zhu, Zhihao, Song, Mengfan, Zhang, Shuang, Wang, Qing, and Dong, Chuang
- Published
- 2024
- Full Text
- View/download PDF
46. Concrete Crack Identification Framework Using Optimized Unet and I–V Fusion Algorithm for Infrastructure
- Author
-
Pan, Yuan, Zhou, Shuang-xi, Guan, Jing-yuan, Wang, Qing, and Ding, Yang
- Published
- 2024
- Full Text
- View/download PDF
47. Catalytic Complete Oxidation of Ethyl Acetate on MnOx/MgAl2O4 Catalysts
- Author
-
Peng, Dong, Wang, Qing, Zang, Shaohong, and Mo, Liuye
- Published
- 2024
- Full Text
- View/download PDF
48. Neuroprotective and vasoprotective effects of herb pair of Zhiqiao-Danggui in ischemic stroke uncovered by LC-MS/MS-based metabolomics approach
- Author
-
Yao, Benxing, Xu, Di, Wang, Qing, Liu, Lin, Hu, Ziyun, Liu, Wenya, Zheng, Qi, Meng, Huihui, Xiao, Ran, Xu, Qian, Hu, Yudie, and Wang, Junsong
- Published
- 2024
- Full Text
- View/download PDF
49. Experimental Research and Numerical Analysis on the Concrete-Filled Square CFRP Steel Tube Column Under Compressive-Shear Loading
- Author
-
Peng, Kuan, Wang, Qing-li, and Shao, Yong-bo
- Published
- 2024
- Full Text
- View/download PDF
50. Bacillus subtilis QM3, a Plant Growth-Promoting Rhizobacteria, can Promote Wheat Seed Germination by Gibberellin Pathway
- Author
-
Hu, Qingping, Xiao, Ya, Liu, Zhiqin, Huang, Xia, Dong, Bingqi, and Wang, Qing
- Published
- 2024
- Full Text
- View/download PDF