67,775 results on '"WANG Qi"'
Search Results
2. Real-Time Text Detection with Similar Mask in Traffic, Industrial, and Natural Scenes
- Author
- Han, Xu, Gao, Junyu, Yang, Chuang, Yuan, Yuan, and Wang, Qi
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Text in intelligent transportation scenes contains a wealth of information. Fully harnessing this information is one of the critical drivers for advancing intelligent transportation. Unlike the general scene, detecting text in transportation has extra demands, such as fast inference speed in addition to high accuracy. Most existing real-time text detection methods are based on the shrink mask, which loses some geometric semantic information and needs complex post-processing. In addition, previous methods usually focus on correct output, ignoring feature correction and lacking guidance during the intermediate process. To this end, we propose an efficient multi-scene text detector that contains an effective text representation, the similar mask (SM), and a feature correction module (FCM). Unlike previous methods, the former aims to preserve the geometric information of the instances as much as possible. Its post-processing saves 50% of the time, accurately and efficiently reconstructing text contours. The latter encourages false positive features to move away from the positive feature center, optimizing the predictions at the feature level. Ablation studies demonstrate the efficiency of the SM and the effectiveness of the FCM. Moreover, the deficiencies of existing traffic datasets (such as low-quality annotation or the unavailability of closed-source data) motivated us to collect and annotate a traffic text dataset, which introduces motion blur. In addition, to validate the scene robustness of the SM-Net, we conduct experiments on traffic, industrial, and natural scene datasets. Extensive experiments verify that it achieves state-of-the-art (SOTA) performance on several benchmarks. The code and dataset are available at https://github.com/fengmulin/SMNet
- Published
- 2024
3. Gardner transition coincides with the emergence of jamming scalings in hard spheres and disks
- Author
- Wang, Qi, Pan, Deng, and Jin, Yuliang
- Subjects
- Condensed Matter - Soft Condensed Matter; Condensed Matter - Disordered Systems and Neural Networks; Condensed Matter - Statistical Mechanics
- Abstract
The Gardner transition in structural glasses is characterized by full-replica symmetry breaking of the free-energy landscape and the onset of anomalous aging dynamics due to marginal stability. Here we show that this transition also has a structural signature in finite-dimensional glasses consisting of hard spheres and disks. By analyzing the distribution of inter-particle gaps in the simulated static configurations at different pressures, we find that the Gardner transition coincides with the emergence of two well-known jamming scalings in the gap distribution, which enables the extraction of a structural order parameter. The jamming scalings reflect a compressible effective force network formed by contact and quasi-contact gaps, while non-contact gaps that do not participate in the effective force network are incompressible. Our results suggest that the Gardner transition in hard-particle glasses is a precursor of the jamming transition. The proposed structural signature and order parameter provide a convenient approach to detecting the Gardner transition in future granular experiments.
- Comment
- 8 pages, 8 figures
- Published
- 2024
4. Theoretical Investigations and Practical Enhancements on Tail Task Risk Minimization in Meta Learning
- Author
- Lv, Yiqin, Wang, Qi, Liang, Dong, and Xie, Zheng
- Subjects
- Computer Science - Machine Learning
- Abstract
Meta learning is a promising paradigm in the era of large models and task distributional robustness has become an indispensable consideration in real-world scenarios. Recent advances have examined the effectiveness of tail task risk minimization in fast adaptation robustness improvement \citep{wang2023simple}. This work contributes to more theoretical investigations and practical enhancements in the field. Specifically, we reduce the distributionally robust strategy to a max-min optimization problem, constitute the Stackelberg equilibrium as the solution concept, and estimate the convergence rate. In the presence of tail risk, we further derive the generalization bound, establish connections with estimated quantiles, and practically improve the studied strategy. Accordingly, extensive evaluations demonstrate the significance of our proposal and its scalability to multimodal large models in boosting robustness.
- Published
- 2024
5. P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics
- Author
- Wang, Qi, Ren, Pu, Zhou, Hao, Liu, Xin-Yang, Deng, Zhiwen, Zhang, Yi, Chengze, Ruizhi, Liu, Hongsheng, Wang, Zidong, Wang, Jian-Xun, Wen, Ji-Rong, Sun, Hao, and Liu, Yang
- Subjects
- Mathematics - Numerical Analysis; Computer Science - Artificial Intelligence; Computer Science - Machine Learning
- Abstract
When solving partial differential equations (PDEs), classical numerical methods often require fine mesh grids and small time stepping to meet stability, consistency, and convergence conditions, leading to high computational cost. Recently, machine learning has been increasingly utilized to solve PDE problems, but such methods often encounter challenges related to interpretability, generalizability, and strong dependency on rich labeled data. Hence, we introduce a new PDE-Preserved Coarse Correction Network (P$^2$C$^2$Net) to efficiently solve spatiotemporal PDE problems on coarse mesh grids in small data regimes. The model consists of two synergistic modules: (1) a trainable PDE block that learns to update the coarse solution (i.e., the system state), based on a high-order numerical scheme with boundary condition encoding, and (2) a neural network block that consistently corrects the solution on the fly. In particular, we propose a learnable symmetric Conv filter, with weights shared over the entire model, to accurately estimate the spatial derivatives of the PDE based on the neural-corrected system state. The resulting physics-encoded model is capable of handling limited training data (e.g., 3-5 trajectories) and accelerates the prediction of PDE solutions on coarse spatiotemporal grids while maintaining high accuracy. P$^2$C$^2$Net achieves consistent state-of-the-art performance with over 50% gain (e.g., in terms of relative prediction error) across four datasets covering complex reaction-diffusion processes and turbulent flows.
- Published
- 2024
6. 'We do use it, but not how hearing people think': How the Deaf and Hard of Hearing Community Uses Large Language Model Tools
- Author
- Huffman, Shuxu, Chen, Si, Mack, Kelly Avery, Su, Haotian, Wang, Qi, and Kushalnagar, Raja
- Subjects
- Computer Science - Human-Computer Interaction
- Abstract
Generative AI tools, particularly those utilizing large language models (LLMs), have become increasingly prevalent in both professional and personal contexts, offering powerful capabilities for text generation and communication support. While these tools are widely used to enhance productivity and accessibility, there has been limited exploration of how Deaf and Hard of Hearing (DHH) individuals engage with text-based generative AI tools, as well as the challenges they may encounter. This paper presents a mixed-method survey study investigating how the DHH community uses Text AI tools, such as ChatGPT, to reduce communication barriers, bridge Deaf and hearing cultures, and improve access to information. Through a survey of 80 DHH participants and separate interviews with 11 other participants, we found that while these tools provide significant benefits, including enhanced communication and mental health support, they also introduce barriers, such as a lack of American Sign Language (ASL) support and understanding of Deaf cultural nuances. Our findings highlight unique usage patterns within the DHH community and underscore the need for inclusive design improvements. We conclude by offering practical recommendations to enhance the accessibility of Text AI for the DHH community and suggest directions for future research in AI and accessibility.
- Published
- 2024
7. On-Site Precise Screening of SARS-CoV-2 Systems Using a Channel-Wise Attention-Based PLS-1D-CNN Model with Limited Infrared Signatures
- Author
- Zhang, Wenwen, Tang, Zhouzhuo, Feng, Yingmei, Yu, Xia, Wang, Qi Jie, and Lin, Zhiping
- Subjects
- Electrical Engineering and Systems Science - Signal Processing; Computer Science - Artificial Intelligence; Computer Science - Machine Learning; Quantitative Biology - Biomolecules
- Abstract
During the early stages of respiratory virus outbreaks, such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the efficient use of limited nasopharyngeal swabs for rapid and accurate screening is crucial for public health. In this study, we present a methodology that integrates attenuated total reflection-Fourier transform infrared spectroscopy (ATR-FTIR) with the adaptive iteratively reweighted penalized least squares (airPLS) preprocessing algorithm and a channel-wise attention-based partial least squares one-dimensional convolutional neural network (PLS-1D-CNN) model, enabling accurate screening of infected individuals within 10 minutes. Two cohorts of nasopharyngeal swab samples, comprising 126 and 112 samples from suspected SARS-CoV-2 Omicron variant cases, were collected at Beijing You'an Hospital for verification. Given that ATR-FTIR spectra are highly sensitive to variations in experimental conditions, which can affect their quality, we propose a biomolecular importance (BMI) evaluation method to assess signal quality across different conditions, validated by comparing BMI with PLS-GBM and PLS-RF results. For the ATR-FTIR signals in cohort 2, which exhibited a higher BMI, airPLS was utilized for signal preprocessing, followed by the application of the channel-wise attention-based PLS-1D-CNN model for screening. The experimental results demonstrate that our model outperforms recently reported methods in the field of respiratory virus spectrum detection, achieving a recognition screening accuracy of 96.48%, a sensitivity of 96.24%, a specificity of 97.14%, an F1-score of 96.12%, and an AUC of 0.99. It meets the World Health Organization (WHO) recommended criteria for an acceptable product: sensitivity of 95.00% or greater and specificity of 97.00% or greater for testing prior SARS-CoV-2 infection in moderate to high volume scenarios.
- Published
- 2024
8. Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
- Author
- Mao, Yixiu, Wang, Qi, Chen, Chen, Qu, Yun, and Ji, Xiangyang
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
In offline reinforcement learning (RL), addressing the out-of-distribution (OOD) action issue has been a focus, but we argue that there exists an OOD state issue that also impairs performance yet has been underexplored. Such an issue describes the scenario when the agent encounters states out of the offline dataset during the test phase, leading to uncontrolled behavior and performance degradation. To this end, we propose SCAS, a simple yet effective approach that unifies OOD state correction and OOD action suppression in offline RL. Technically, SCAS achieves value-aware OOD state correction, capable of correcting the agent from OOD states to high-value in-distribution states. Theoretical and empirical results show that SCAS also exhibits the effect of suppressing OOD actions. On standard offline RL benchmarks, SCAS achieves excellent performance without additional hyperparameter tuning. Moreover, benefiting from its OOD state correction feature, SCAS demonstrates enhanced robustness against environmental perturbations.
- Comment
- Accepted to NeurIPS 2024
- Published
- 2024
9. Research on gesture recognition method based on SEDCNN-SVM
- Author
- Zhang, Mingjin, Wang, Jiahao, Wang, Jianming, and Wang, Qi
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Gesture recognition based on the surface electromyographic signal (sEMG) is one of the most used methods. Traditional manual feature extraction can only extract some low-level signal features, which causes poor classifier performance and low recognition accuracy when dealing with some complex signals. A recognition method, namely SEDCNN-SVM, is proposed to recognize the sEMG of different gestures. SEDCNN-SVM consists of an improved deep convolutional neural network (DCNN) and a support vector machine (SVM). The DCNN can automatically extract and learn the feature information of sEMG through the convolution operation of the convolutional layer, so that it can capture the complex and high-level features in the data. The Squeeze and Excitation Networks (SE-Net) and the residual module were added to the model, so that the feature representation of each channel could be improved, the loss of feature information in convolutional operations was reduced, useful feature information was captured, and the problem of network gradient vanishing was eased. The SVM can improve the generalization ability and classification accuracy of the model by constructing an optimal hyperplane of the feature space. Hence, the SVM was used to replace the fully connected layer and the Softmax function layer of the DCNN; the use of a suitable kernel function in the SVM can improve the model's generalization ability and classification accuracy. To verify the effectiveness of the proposed classification algorithm, this method is analyzed and compared with other classification methods. The recognition accuracy of SEDCNN-SVM can reach 0.955, a significant improvement over other classification methods, and the SEDCNN-SVM model performs recognition online in real time.
- Published
- 2024
10. Data-Efficient CLIP-Powered Dual-Branch Networks for Source-Free Unsupervised Domain Adaptation
- Author
- Li, Yongguang, Cao, Yueqi, Li, Jindong, Wang, Qi, and Wang, Shengsheng
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Source-free Unsupervised Domain Adaptation (SF-UDA) aims to transfer a model's performance from a labeled source domain to an unlabeled target domain without direct access to source samples, addressing critical data privacy concerns. However, most existing SF-UDA approaches assume the availability of abundant source domain samples, which is often impractical due to the high cost of data annotation. To address the dual challenges of limited source data and privacy concerns, we introduce a data-efficient, CLIP-powered dual-branch network (CDBN). This architecture consists of a cross-domain feature transfer branch and a target-specific feature learning branch, leveraging high-confidence target domain samples to transfer text features of source domain categories while learning target-specific soft prompts. By fusing the outputs of both branches, our approach not only effectively transfers source domain category semantic information to the target domain but also reduces the negative impacts of noise and domain gaps during target training. Furthermore, we propose an unsupervised optimization strategy driven by accurate classification and diversity, preserving the classification capability learned from the source domain while generating more confident and diverse predictions in the target domain. CDBN achieves near state-of-the-art performance with far fewer source domain samples than existing methods across 31 transfer tasks on seven datasets.
- Comment
- This update includes: (1) language polishing for clarity and conciseness, (2) new CLIP zero-shot results in Office-31, and (3) expanded results in Table 8 with more random seeds to enhance reliability
- Published
- 2024
11. ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction
- Author
- He, Haoyu, Luo, Haozheng, and Wang, Qi R.
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
Predicting human mobility across multiple cities presents significant challenges due to the complex and diverse spatial-temporal dynamics inherent in different urban environments. In this study, we propose a robust approach to predict human mobility patterns called ST-MoE-BERT. Compared to existing methods, our approach frames the prediction task as a spatial-temporal classification problem. Our methodology integrates the Mixture-of-Experts architecture with the BERT model to capture complex mobility dynamics and perform the downstream human mobility prediction task. Additionally, transfer learning is integrated to solve the challenge of data scarcity in cross-city prediction. We demonstrate the effectiveness of the proposed model on GEO-BLEU and DTW, comparing it to several state-of-the-art methods. Notably, ST-MoE-BERT achieves an average improvement of 8.29%.
- Comment
- 2nd ACM SIGSPATIAL International Workshop on the Human Mobility Prediction Challenge
- Published
- 2024
12. Towards Next-Generation LLM-based Recommender Systems: A Survey and Beyond
- Author
- Wang, Qi, Li, Jindong, Wang, Shiqi, Xing, Qianli, Niu, Runliang, Kong, He, Li, Rui, Long, Guodong, Chang, Yi, and Zhang, Chengqi
- Subjects
- Computer Science - Information Retrieval; Computer Science - Artificial Intelligence
- Abstract
Large language models (LLMs) have not only revolutionized the field of natural language processing (NLP) but also have the potential to bring a paradigm shift in many other fields due to their remarkable abilities of language understanding, as well as impressive generalization capabilities and reasoning skills. As a result, recent studies have actively attempted to harness the power of LLMs to improve recommender systems, and it is imperative to thoroughly review the recent advances and challenges of LLM-based recommender systems. Unlike existing work, this survey does not merely analyze the classifications of LLM-based recommendation systems according to the technical framework of LLMs. Instead, it investigates how LLMs can better serve recommendation tasks from the perspective of the recommender system community, thus enhancing the integration of large language models into research on recommender systems and their practical application. In addition, the long-standing gap between academic research and industrial applications related to recommender systems has not been well discussed, especially in the era of large language models. In this review, we introduce a novel taxonomy that originates from the intrinsic essence of recommendation, delving into the application of large language model-based recommendation systems and their industrial implementation. Specifically, we propose a three-tier structure that more accurately reflects the developmental progression of recommendation systems from research to practical implementation, including representing and understanding, scheming and utilizing, and industrial deployment. Furthermore, we discuss critical challenges and opportunities in this emerging field. A more up-to-date version of the papers is maintained at: https://github.com/jindongli-Ai/Next-Generation-LLM-based-Recommender-Systems-Survey.
- Published
- 2024
13. Effort Allocation for Deadline-Aware Task and Motion Planning: A Metareasoning Approach
- Author
- Sung, Yoonchang, Shperberg, Shahaf S., Wang, Qi, and Stone, Peter
- Subjects
- Computer Science - Robotics
- Abstract
In robot planning, tasks can often be achieved through multiple options, each consisting of several actions. This work specifically addresses deadline constraints in task and motion planning, aiming to find a plan that can be executed within the deadline despite uncertain planning and execution times. We propose an effort allocation problem, formulated as a Markov decision process (MDP), to find such a plan by leveraging metareasoning perspectives to allocate computational resources among the given options. We formally prove the NP-hardness of the problem by reducing it from the knapsack problem. Both a model-based approach, where transition models are learned from past experience, and a model-free approach, which overcomes the unavailability of prior data acquisition through reinforcement learning, are explored. For the model-based approach, we investigate Monte Carlo tree search (MCTS) to approximately solve the proposed MDP and further design heuristic schemes to tackle NP-hardness, leading to the approximate yet efficient algorithm called DP_Rerun. In experiments, DP_Rerun demonstrates promising performance comparable to MCTS while requiring negligible computation time.
- Comment
- 48 pages, 6 figures
- Published
- 2024
14. Observation of Higgs and Goldstone modes in U(1) symmetry-broken Rydberg atomic systems
- Author
- Liu, Bang, Zhang, Li-Hua, Wang, Ya-Jun, Zhang, Jun, Wang, Qi-Feng, Ma, Yu, Han, Tian-Yu, Zhang, Zheng-Yuan, Shao, Shi-Yao, Li, Qing, Chen, Han-Chao, Nan, Jia-Dou, Zhu, Dong-Yang, Yin, Yi-Ming, Shi, Bao-Sen, and Ding, Dong-Sheng
- Subjects
- Condensed Matter - Quantum Gases; Quantum Physics
- Abstract
Higgs and Goldstone modes manifest as fluctuations in the order parameter of a system, offering insights into its phase transitions and symmetry properties. Exploring the dynamics of these collective excitations in a Rydberg atom system advances various branches of condensed matter, particle physics, and cosmology. Here, we report an experimental signature of Higgs and Goldstone modes in a U(1) symmetry-broken Rydberg atomic gas. By constructing two probe fields to excite atoms, we observe the distinct phase and amplitude fluctuations of Rydberg-atom collective excitations under the particle-hole symmetry. Due to the van der Waals interactions between the Rydberg atoms, we detect a symmetric variance spectrum divided by the divergent regime and phase boundary, capturing the full dynamics of the additional Higgs and Goldstone modes. Studying the Higgs and Goldstone modes in Rydberg atoms allows us to explore fundamental aspects of quantum phase transitions and symmetry-breaking phenomena, while leveraging the unique properties of these highly interacting systems to uncover new physics and potential applications in quantum simulation.
- Published
- 2024
15. Studying the $B_{d(s)} \rightarrow K^{(\ast)}\bar{K}^{(\ast)}$ puzzle and $B^+ \rightarrow K^+\nu\bar{\nu}$ in $R$-parity violating MSSM with seesaw mechanism
- Author
- Zheng, Min-Di, Wang, Qi-Liang, Lai, Li-Fen, and Zhang, Hong-Hao
- Subjects
- High Energy Physics - Phenomenology
- Abstract
We study the non-leptonic puzzle of $B_{d(s)} \rightarrow K^{(\ast)}\bar{K}^{(\ast)}$ decay in the $R$-parity violating minimal supersymmetric standard model (RPV-MSSM) extended with the inverse seesaw mechanism. In this model, the chiral flip of sneutrinos can contribute to the observables $L_{K\bar{K}}$ and $L_{K^{\ast}\bar{K}^{\ast}}$, which helps explain the relevant puzzle. We also find that this unique effect can contribute to the $B_s$-$\bar{B}_s$ mixing. We utilize the scenario of complex $\lambda^\prime$ couplings to fulfill the recent stringent constraint of $B_s$-$\bar{B}_s$ mixing, and examine other related bounds of $B,K$-meson decays, lepton decays, neutrino data, $Z$-pole results, CP violations (CPV), etc. Besides, inspired by the new measurement of ${\cal B}(B^+ \rightarrow K^+\nu\bar{\nu})$ by Belle II, which is about $2.7\sigma$ higher than the Standard Model (SM) prediction, we investigate the NP enhancement to this observable and find that this tension can also be explained in this model.
- Comment
- 26 pages; 2 figures
- Published
- 2024
16. Open-World Reinforcement Learning over Long Short-Term Imagination
- Author
- Li, Jiajian, Wang, Qi, Wang, Yunbo, Jin, Xin, Li, Yang, Zeng, Wenjun, and Yang, Xiaokang
- Subjects
- Computer Science - Machine Learning
- Abstract
Training visual reinforcement learning agents in a high-dimensional open world presents significant challenges. While various model-based methods have improved sample efficiency by learning interactive world models, these agents tend to be "short-sighted", as they are typically trained on short snippets of imagined experiences. We argue that the primary obstacle in open-world decision-making is improving the efficiency of off-policy exploration across an extensive state space. In this paper, we present LS-Imagine, which extends the imagination horizon within a limited number of state transition steps, enabling the agent to explore behaviors that potentially lead to promising long-term feedback. The foundation of our approach is to build a long short-term world model. To achieve this, we simulate goal-conditioned jumpy state transitions and compute corresponding affordance maps by zooming in on specific areas within single images. This facilitates the integration of direct long-term values into behavior learning. Our method demonstrates significant improvements over state-of-the-art techniques in MineDojo.
- Published
- 2024
17. EmojiHeroVR: A Study on Facial Expression Recognition under Partial Occlusion from Head-Mounted Displays
- Author
- Ortmann, Thorben, Wang, Qi, and Putzar, Larissa
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Emotion recognition promotes the evaluation and enhancement of Virtual Reality (VR) experiences by providing emotional feedback and enabling advanced personalization. However, facial expressions are rarely used to recognize users' emotions, as Head-Mounted Displays (HMDs) occlude the upper half of the face. To address this issue, we conducted a study with 37 participants who played our novel affective VR game EmojiHeroVR. The collected database, EmoHeVRDB (EmojiHeroVR Database), includes 3,556 labeled facial images of 1,778 reenacted emotions. For each labeled image, we also provide 29 additional frames recorded directly before and after the labeled image to facilitate dynamic Facial Expression Recognition (FER). Additionally, EmoHeVRDB includes data on the activations of 63 facial expressions captured via the Meta Quest Pro VR headset for each frame. Leveraging our database, we conducted a baseline evaluation on the static FER classification task with six basic emotions and neutral using the EfficientNet-B0 architecture. The best model achieved an accuracy of 69.84% on the test set, indicating that FER under HMD occlusion is feasible but significantly more challenging than conventional FER.
- Published
- 2024
18. Customizing Generated Signs and Voices of AI Avatars: Deaf-Centric Mixed-Reality Design for Deaf-Hearing Communication
- Author
- Chen, Si, Cheng, Haocong, Su, Suzy, Patterson, Stephanie, Kushalnagar, Raja, Wang, Qi, and Huang, Yun
- Subjects
- Computer Science - Human-Computer Interaction
- Abstract
This study investigates innovative interaction designs for communication and collaborative learning between learners of mixed hearing and signing abilities, leveraging advancements in mixed reality technologies like Apple Vision Pro and generative AI for animated avatars. Adopting a participatory design approach, we engaged 15 d/Deaf and hard of hearing (DHH) students to brainstorm ideas for an AI avatar with interpreting ability (sign language to English, voice to English) that would facilitate their face-to-face communication with hearing peers. Participants envisioned the AI avatars addressing some issues with human interpreters, such as lack of availability, and providing affordable alternatives to expensive personalized interpreting services. Our findings indicate a range of preferences for integrating the AI avatars with actual human figures of both DHH and hearing communication partners. The participants highlighted the importance of having control over customizing the AI avatar, such as AI-generated signs, voices, facial expressions, and their synchronization for enhanced emotional display in communication. Based on our findings, we propose a suite of design recommendations that balance respecting sign language norms with adherence to hearing social norms. Our study offers insights on improving the authenticity of generative AI in scenarios involving specific, and sometimes unfamiliar, social norms.
- Published
- 2024
19. Fast switchable unidirectional magnon emitter
- Author
- Wang, Yueqi, Guo, Mengying, Davídková, Kristýna, Verba, Roman, Guo, Xueyu, Dubs, Carsten, Chumak, Andrii V., Pirro, Philipp, and Wang, Qi
- Subjects
- Condensed Matter - Mesoscale and Nanoscale Physics
- Abstract
Magnon spintronics is an emerging field that explores the use of magnons, the quanta of spin waves in magnetic materials, for information processing and communication. Achieving unidirectional information transport with fast switching capability is critical for the development of fast integrated magnonic circuits, which offer significant advantages in high-speed, low-power information processing. However, previous work on unidirectional information transport has primarily focused on Damon-Eshbach spin wave modes, which are non-switchable as their propagation direction is defined by the direction of the external field and cannot be changed in a short time. Here, we experimentally demonstrate a fast switchable unidirectional magnon emitter in the forward volume spin wave mode by a current-induced asymmetric Oersted field. Our findings reveal significant nonreciprocity and nanosecond switchability, underscoring the potential of the method to advance high-speed spin-wave processing networks.
- Comment
- 15 pages, 4 figures
- Published
- 2024
20. PhyMPGN: Physics-encoded Message Passing Graph Network for spatiotemporal PDE systems
- Author
- Zeng, Bocheng, Wang, Qi, Yan, Mengtao, Liu, Yang, Chengze, Ruizhi, Zhang, Yi, Liu, Hongsheng, Wang, Zidong, and Sun, Hao
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Computational Engineering, Finance, and Science
- Abstract
Solving partial differential equations (PDEs) serves as a cornerstone for modeling complex dynamical systems. Recent progress has demonstrated the great benefits of data-driven neural-based models for predicting spatiotemporal dynamics (e.g., tremendous speedup gain compared with classical numerical methods). However, most existing neural models rely on rich training data, have limited extrapolation and generalization abilities, and struggle to produce precise or reliable physical predictions under intricate conditions (e.g., irregular mesh or geometry, complex boundary conditions, diverse PDE parameters, etc.). To this end, we propose a new graph learning approach, namely, Physics-encoded Message Passing Graph Network (PhyMPGN), to model spatiotemporal PDE systems on irregular meshes given small training datasets. Specifically, we incorporate a GNN into a numerical integrator to approximate the temporal marching of spatiotemporal dynamics for a given PDE system. Considering that many physical phenomena are governed by diffusion processes, we further design a learnable Laplace block, which encodes the discrete Laplace-Beltrami operator, to aid and guide the GNN learning in a physically feasible solution space. A boundary condition padding strategy is also designed to improve the model convergence and accuracy. Extensive experiments demonstrate that PhyMPGN is capable of accurately predicting various types of spatiotemporal dynamics on coarse unstructured meshes, consistently achieves state-of-the-art results, and outperforms other baselines with considerable gains.
- Published
- 2024
21. Inclusive Emotion Technologies: Addressing the Needs of d/Deaf and Hard of Hearing Learners in Video-Based Learning
- Author
- Chen, Si, Situ, Jason, Cheng, Haocong, Su, Suzy, Kirst, Desiree, Ming, Lu, Wang, Qi, Angrave, Lawrence, and Huang, Yun
- Subjects
- Computer Science - Human-Computer Interaction
- Abstract
Accessibility efforts for d/Deaf and hard of hearing (DHH) learners in video-based learning have mainly focused on captions and interpreters, with limited attention to learners' emotional awareness--an important yet challenging skill for effective learning. Current emotion technologies are designed to support learners' emotional awareness and social needs; however, little is known about whether and how DHH learners could benefit from these technologies. Our study explores how DHH learners perceive and use emotion data from two collection approaches, self-reported and automatic emotion recognition (AER), in video-based learning. By comparing the use of these technologies between DHH (N=20) and hearing learners (N=20), we identified key differences in their usage and perceptions: 1) DHH learners enhanced their emotional awareness by rewatching the video to self-report their emotions and called for alternative methods for self-reporting emotion, such as using sign language or expressive emoji designs; and 2) while the AER technology could be useful for detecting emotional patterns in learning experiences, DHH learners expressed more concerns about the accuracy and intrusiveness of the AER data. Our findings provide novel design implications for improving the inclusiveness of emotion technologies to support DHH learners, such as leveraging DHH peer learners' emotions to elicit reflections.
- Published
- 2024
22. Motion Design Principles for Accessible Video-based Learning: Addressing Cognitive Challenges for Deaf and Hard of Hearing Learners
- Author
-
Cheng, Si, Cheng, Haocong, Su, Suzy, Ming, Lu, Masud, Sarah, Wang, Qi, and Huang, Yun
- Subjects
Computer Science - Human-Computer Interaction - Abstract
Deaf and Hard-of-Hearing (DHH) learners face unique challenges in video-based learning due to the complex interplay between visual and auditory information in videos. Traditional approaches to making video content accessible primarily focus on captioning, but these solutions often neglect the cognitive demands of processing both visual and textual information simultaneously. This paper introduces a set of Motion design guidelines, aimed at mitigating these cognitive challenges and improving video learning experiences for DHH learners. Through a two-phase study, we identified five key challenges, including misaligned content and visual overload, and proposed five design principles accordingly. A user study with 16 DHH participants showed that improving visual-audio relevance and guiding visual attention significantly enhance the learning experience by reducing physical demand, alleviating temporal pressure, and improving learning satisfaction. Our findings highlight the potential of Motion design to transform educational content for DHH learners, and we discuss implications for inclusive video learning tools.
- Published
- 2024
23. 'Real Learner Data Matters' Exploring the Design of LLM-Powered Question Generation for Deaf and Hard of Hearing Learners
- Author
-
Cheng, Si, Huffman, Shuxu, Zhu, Qingxiaoyang, Su, Haotian, Kushalnagar, Raja, and Wang, Qi
- Subjects
Computer Science - Human-Computer Interaction - Abstract
Deaf and Hard of Hearing (DHH) learners face unique challenges in learning environments, often due to a lack of tailored educational materials that address their specific needs. This study explores the potential of Large Language Models (LLMs) to generate personalized quiz questions to enhance DHH students' video-based learning experiences. We developed a prototype leveraging LLMs to generate questions with emphasis on two unique strategies: Visual Questions, which identify video segments where visual information might be misrepresented, and Emotion Questions, which highlight moments where previous DHH learners experienced learning difficulty manifested in emotional responses. Through user studies with DHH undergraduates, we evaluated the effectiveness of these LLM-generated questions in supporting the learning experience. Our findings indicate that while LLMs offer significant potential for personalized learning, challenges remain in the interaction accessibility for the diverse DHH community. The study highlights the importance of considering language diversity and culture in LLM-based educational technology design.
- Published
- 2024
24. Supervised Multi-Modal Fission Learning
- Author
-
Mao, Lingchao, Wang, Qi, Su, Yi, Lure, Fleming, and Li, Jing
- Subjects
Computer Science - Machine Learning ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Learning from multimodal datasets can leverage complementary information and improve performance in prediction tasks. A commonly used strategy to account for feature correlations in high-dimensional datasets is the latent variable approach. Several latent variable methods have been proposed for multimodal datasets. However, these methods either focus on extracting the shared component across all modalities or on extracting both a shared component and individual components specific to each modality. To address this gap, we propose a Multi-Modal Fission Learning (MMFL) model that simultaneously identifies globally joint, partially joint, and individual components underlying the features of multimodal datasets. Unlike existing latent variable methods, MMFL uses supervision from the response variable to identify predictive latent components and has a natural extension for incorporating incomplete multimodal data. Through simulation studies, we demonstrate that MMFL outperforms various existing multimodal algorithms in both complete and incomplete modality settings. We applied MMFL to a real-world case study for early prediction of Alzheimer's Disease using multimodal neuroimaging and genomics data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. MMFL provided more accurate predictions and better insights into within- and across-modality correlations compared to existing methods.
- Published
- 2024
25. Group Distributionally Robust Optimization can Suppress Class Imbalance Effect in Network Traffic Classification
- Author
-
Du, Wumei, Wang, Qi, Lv, Yiqin, Liang, Dong, Wu, Guanlin, Liang, Xingxing, and Xie, Zheng
- Subjects
Statistics - Machine Learning ,Computer Science - Machine Learning - Abstract
Internet services have led to an explosion of traffic, and machine learning on these Internet data has become an indispensable tool, especially when the application is risk-sensitive. This paper focuses on network traffic classification in the presence of class imbalance, which fundamentally and ubiquitously exists in Internet data analysis. Class imbalance tends to shift the optimal decision boundary, resulting in a suboptimal solution for machine learning models. To alleviate this effect, we propose strategies based on group distributionally robust optimization. Our approach iteratively updates the non-parametric weights for separate classes and optimizes the learning model by minimizing reweighted losses. We interpret the optimization steps as a Stackelberg game and perform extensive experiments on typical benchmarks. Results show that our approach can not only suppress the negative effect of class imbalance but also improve the comprehensive performance in prediction.
- Published
- 2024
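Illustrative note on entry 25: the abstract describes alternating between updating per-class weights and minimizing a reweighted loss. The sketch below shows only the weight-update half, under the assumption that the weights follow the exponentiated-gradient rule commonly used in group distributionally robust optimization. The abstract does not state the exact rule, and `update_class_weights`, `reweighted_loss`, `step_size`, and the loss values are hypothetical.

```python
import math


def update_class_weights(weights, class_losses, step_size=0.5):
    """Exponentiated-gradient ascent on class weights (assumed rule): up-weight high-loss classes."""
    scaled = [w * math.exp(step_size * loss) for w, loss in zip(weights, class_losses)]
    total = sum(scaled)
    return [s / total for s in scaled]  # renormalize so the weights sum to 1


def reweighted_loss(weights, class_losses):
    """Objective the model would minimize for the current weights."""
    return sum(w * loss for w, loss in zip(weights, class_losses))


# Hypothetical per-class average losses; the last class stands in for a rare, poorly fit class.
class_losses = [0.2, 0.4, 1.5]
uniform = [1.0 / 3.0] * 3
weights = list(uniform)
for _ in range(5):
    weights = update_class_weights(weights, class_losses)

print(weights)
print(reweighted_loss(uniform, class_losses), reweighted_loss(weights, class_losses))
```

In a full training loop the losses would be recomputed after each model update, so the weights and the model respond to each other in turn, which is consistent with the leader-follower reading the abstract gives via a Stackelberg game.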
26. Focus Entirety and Perceive Environment for Arbitrary-Shaped Text Detection
- Author
-
Han, Xu, Gao, Junyu, Yang, Chuang, Yuan, Yuan, and Wang, Qi
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Due to the diversity of scene text in aspects such as font, color, shape, and size, accurately and efficiently detecting text is still a formidable challenge. Among the various detection approaches, segmentation-based approaches have emerged as prominent contenders owing to their flexible pixel-level predictions. However, these methods typically model text instances in a bottom-up manner, which is highly susceptible to noise. In addition, the prediction of pixels is isolated without introducing pixel-feature interaction, which also influences the detection performance. To alleviate these problems, we propose a multi-information level arbitrary-shaped text detector consisting of a focus entirety module (FEM) and a perceive environment module (PEM). The former extracts instance-level features and adopts a top-down scheme to model texts to reduce the influence of noise. Specifically, it assigns consistent entirety information to pixels within the same instance to improve their cohesion. In addition, it emphasizes the scale information, enabling the model to distinguish texts of varying scale effectively. The latter extracts region-level information and encourages the model to focus on the distribution of positive samples in the vicinity of a pixel, which perceives environment information. It treats the kernel pixels as positive samples and helps the model differentiate text and kernel features. Extensive experiments demonstrate the FEM's ability to efficiently support the model in handling texts of different scales and confirm that the PEM can assist in perceiving pixels more accurately by focusing on pixel vicinities. Comparisons show that the proposed model outperforms existing state-of-the-art approaches on four public datasets.
- Published
- 2024
27. Spotlight Text Detector: Spotlight on Candidate Regions Like a Camera
- Author
-
Han, Xu, Gao, Junyu, Yang, Chuang, Yuan, Yuan, and Wang, Qi
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The irregular contour representation is one of the tough challenges in scene text detection. Although segmentation-based methods have achieved significant progress with the help of flexible pixel prediction, the overlap of geographically close texts hinders detecting them separately. To alleviate this problem, some shrink-based methods predict text kernels and expand them to restructure texts. However, the text kernel is an artificial object with incomplete semantic features that are prone to incorrect or missing detection. In addition, unlike general objects, the geometry features (aspect ratio, scale, and shape) of scene texts vary significantly, which makes it difficult to detect them accurately. To address the above problems, we propose an effective spotlight text detector (STD), which consists of a spotlight calibration module (SCM) and a multivariate information extraction module (MIEM). The former concentrates efforts on the candidate kernel, like a camera focusing on its target. It obtains candidate features through a mapping filter and calibrates them precisely to eliminate some false positive samples. The latter designs different shape schemes to explore multiple geometric features for scene texts. It helps extract various spatial relationships to improve the model's ability to recognize kernel regions. Ablation studies prove the effectiveness of the designed SCM and MIEM. Extensive experiments verify that our STD is superior to existing state-of-the-art methods on various datasets, including ICDAR2015, CTW1500, MSRA-TD500, and Total-Text.
- Published
- 2024
28. Towards heavy double-gluon hybrid mesons with exotic quantum numbers in QCD sum rules
- Author
-
Lian, Ding-Kun, Wang, Qi-Nan, Chen, Xu-Liang, Yang, Peng-Fei, Chen, Wei, and Chen, Hua-Xing
- Subjects
High Energy Physics - Phenomenology - Abstract
The double-gluon hybrid meson configuration was recently proposed and investigated within QCD sum rules. In this talk, we discuss the color structures of the double-gluon hybrid meson and construct current operators with exotic quantum numbers $J^{PC}=1^{-+}$ and $2^{+-}$ for two of the structures. In the framework of QCD sum rules, we consider the condensates up to dimension-8 at the leading order of $\alpha_{s}$ for both the charmonium and the bottomonium systems. The results indicate that the masses of the $1^{-+}$ and $2^{+-}$ charmonium double-gluon hybrid mesons are approximately $6.1-7.2$ GeV and $6.3-6.4$ GeV, respectively. As for the bottomonium systems, their masses fall within the range of $13.7-14.3$ GeV and $12.6-13.3$ GeV for the $1^{-+}$ and $2^{+-}$ channels, respectively. Additionally, the charmonium hybrids could be produced in the radiative decays of bottomonium mesons in the Belle II experiment. Comment: 9 pages, 5 figures, 4 tables. Proceedings article for QCD24: 27th High-Energy Physics International Conference in Quantum Chromodynamics. arXiv admin note: substantial text overlap with arXiv:2403.18696
- Published
- 2024
29. The well-posedness and regularity of the Non-stationary Stokes and Navier-Stokes equations with the friction-type interface condition
- Author
-
Wang, Qi, Kashiwabara, Takahito, and Zhou, Guanyu
- Subjects
Mathematics - Analysis of PDEs - Abstract
The friction-type interface condition (FIC) is introduced to describe the phenomenon in which slip and leak of fluid flow on the interface occur only when the difference of stress force is above a threshold. The FIC involves the subdifferential and can be regarded as an intermediate form of the Dirichlet and the Neumann boundary conditions. This work is devoted to the well-posedness of the non-stationary (Navier-)Stokes equations with FIC in 2D and 3D, the weak forms of which are parabolic variational inequalities of the second type. We establish the existence theorems using the regularization technique and the Galerkin method. For the Stokes case, we prove the global unique existence and investigate the $H^2$ regularity. In the case of the 2D Navier-Stokes equation, we show the global unique existence of the weak and strong solutions, respectively. For the Navier-Stokes case in 3D, we demonstrate the global existence of the weak solution and the local unique existence of the strong solution.
- Published
- 2024
30. Unsupervised Attention-Based Multi-Source Domain Adaptation Framework for Drift Compensation in Electronic Nose Systems
- Author
-
Zhang, Wenwen, Hu, Shuhao, Zhang, Zhengyuan, Zheng, Yuanjin, Wang, Qi Jie, and Lin, Zhiping
- Subjects
Electrical Engineering and Systems Science - Signal Processing ,Computer Science - Artificial Intelligence - Abstract
Continuous, long-term monitoring of hazardous, noxious, explosive, and flammable gases in industrial environments using electronic nose (E-nose) systems faces the significant challenge of reduced gas identification accuracy due to time-varying drift in gas sensors. To address this issue, we propose a novel unsupervised attention-based multi-source domain shared-private feature fusion adaptation (AMDS-PFFA) framework for gas identification with drift compensation in E-nose systems. The AMDS-PFFA model effectively leverages labeled data from multiple source domains collected during the initial stage to accurately identify gases in unlabeled gas sensor array drift signals from the target domain. To validate the model's effectiveness, extensive experimental evaluations were conducted using both the University of California, Irvine (UCI) standard drift gas dataset, collected over 36 months, and drift signal data from our self-developed E-nose system, spanning 30 months. Compared to recent drift compensation methods, the AMDS-PFFA model achieves the highest average gas recognition accuracy with strong convergence, attaining 83.20% on the UCI dataset and 93.96% on data from our self-developed E-nose system across all target domain batches. These results demonstrate the superior performance of the AMDS-PFFA model in gas identification with drift compensation, significantly outperforming existing methods.
- Published
- 2024
31. Implicit Government Guarantee Measurement Based on PMC Index Model
- Author
-
Zhang, Yan, Tian, Yixiang, Chen, Lin, and Wang, Qi
- Subjects
Quantitative Finance - General Finance - Abstract
The implicit government guarantee hampers the recognition and management of risks by all stakeholders in the bond market, and it has led to excessive debt for local governments or state-owned enterprises. To prevent the risk of local government debt defaults and reduce investors' expectations of implicit government guarantees, various regulatory departments have issued a series of policy documents related to municipal investment bonds. By employing text mining techniques on policy documents related to municipal investment bonds and utilizing the PMC index model to assess the effectiveness of these documents, this paper proposes a novel method for quantifying the intensity of implicit governmental guarantees based on the PMC index model. The intensity of implicit governmental guarantees is inversely correlated with the PMC index of policies aimed at de-implicitizing governmental guarantees: as these policies become more effective, the intensity of implicit governmental guarantees diminishes correspondingly. These findings indicate that recent policies related to municipal investment bonds have indeed succeeded in reducing implicit governmental guarantee intensity, and these policies have achieved the goal of risk management. Furthermore, it is shown that the intensity of implicit governmental guarantees is affected by diverse aspects of these policies, such as effectiveness, clarity, and specificity, as well as incentive and assurance mechanisms. Comment: 22 pages, 6 figures
- Published
- 2024
32. Bridging the Gap: GRB 230812B -- A Three-Second Supernova-Associated Burst Detected by the GRID Mission
- Author
-
Wang, Chen-Yu, Yin, Yi-Han Iris, Zhang, Bin-Bin, Feng, Hua, Zeng, Ming, Xiong, Shao-Lin, Pan, Xiao-Fan, Yang, Jun, Zhang, Yan-Qiu, Li, Chen, Yan, Zhen-Yu, Wang, Chen-Wei, Zheng, Xu-Tao, Liu, Jia-Cong, Wang, Qi-Dong, Yang, Zi-Rui, Li, Long-Hao, Liu, Qi-Ze, Zhao, Zheng-Yang, Hu, Bo, Liu, Yi-Qi, Lu, Si-Yuan, Luo, Zi-You, Cang, Ji-Rong, Cao, De-Zhi, Han, Wen-Tao, Jia, Li-Ping, Pan, Xing-Yu, Tian, Yang, Xu, Ben-Da, Yang, Xiao, and Zeng, Zhi
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
GRB 230812B, detected by the Gamma-Ray Integrated Detectors (GRID) constellation mission, is an exceptionally bright gamma-ray burst (GRB) with a duration of only 3 seconds. Sitting near the traditional boundary ($\sim$ 2 s) between long and short GRBs, GRB 230812B is notably associated with a supernova (SN), indicating a massive star progenitor. This makes it a rare example of a short-duration GRB resulting from stellar collapse. Our analysis, using a time-evolving synchrotron model, suggests that the burst has an emission radius of approximately $10^{14.5}$~cm. We propose that the short duration of GRB 230812B is due to the combined effects of the central engine's activity time and the time required for the jet to break through the stellar envelope. Our findings provide another case that challenges the conventional view that short-duration GRBs originate exclusively from compact object mergers, demonstrating that a broader range of durations exists for GRBs arising from the collapse of massive stars. Comment: 10 pages, 3 tables, 11 figures
- Published
- 2024
33. Dynamical topological phase transition in cold Rydberg quantum gases
- Author
-
Zhang, Jun, Wang, Ya-Jun, Liu, Bang, Zhang, Li-Hua, Zhang, Zheng-Yuan, Shao, Shi-Yao, Li, Qing, Chen, Han-Chao, Ma, Yu, Han, Tian-Yu, Wang, Qi-Feng, Nan, Jia-Dou, Yin, Yi-Ming, Zhu, Dong-Yang, Shi, Bao-Sen, and Ding, Dong-Sheng
- Subjects
Condensed Matter - Quantum Gases ,Quantum Physics - Abstract
The study of phase transitions provides insights into how a many-body system behaves under different conditions, enabling us to understand symmetry breaking, critical phenomena, and topological properties. Strong long-range interactions in highly excited Rydberg atoms create a versatile platform for exploring exotic emergent topological phases. Here, we report the experimental observation of dynamical topological phase transitions in cold Rydberg atomic gases driven by a microwave field. By measuring the system transmission curves while varying the probe intensity, we observe complex hysteresis trajectories characterized by distinct winding numbers as they cross the critical point. At the transition state, where the winding number flips, the topology of these hysteresis trajectories evolves into more non-trivial structures. The topological trajectories are shown to be robust against noise, confirming their rigidity in dynamic conditions. These findings offer insight into the emergence of complex dynamical topological phases in many-body systems.
- Published
- 2024
34. Scalable Reshaping of Diamond Particles via Programmable Nanosculpting
- Author
-
Zhang, Tongtong, Sun, Fuqiang, Wang, Yaorong, Li, Yingchi, Wang, Jing, Wang, Zhongqiang, Li, Kwai Hei, Zhu, Ye, Wang, Qi, Shao, Lei, Wong, Ngai, Lei, Dangyuan, Lin, Yuan, and Chu, Zhiqin
- Subjects
Condensed Matter - Materials Science ,Condensed Matter - Mesoscale and Nanoscale Physics - Abstract
Diamond particles have many interesting properties and possible applications. However, producing diamond particles with well-defined shapes at scale is challenging because diamonds are chemically inert and extremely hard. Here, we show that air oxidation, a routine method for purifying diamonds, can be used to precisely shape diamond particles at scale. By exploiting the distinct reactivities of different crystal facets and defects inside the diamond, layer-by-layer outward-to-inward and inward-to-outward oxidation produced diverse diamond shapes including spheres, twisted surfaces, pyramidal islands, inverted pyramids, nano-flowers, and hollow polygons. The nanosculpted diamonds had more and finer features that enabled them to outperform the original raw diamonds in various applications. Using experimental observations and Monte Carlo simulations, we built a shape library that guides the design and fabrication of diamond particles with well-defined shapes and functional value. Our study presents a simple, economical and scalable way to produce shape-customized diamonds for various photonics, catalysis, quantum and information technology applications.
- Published
- 2024
35. Spatial Deep Convolutional Neural Networks
- Author
-
Wang, Qi, Parker, Paul A., and Lund, Robert B.
- Subjects
Statistics - Methodology ,Statistics - Applications - Abstract
Spatial prediction problems often use Gaussian process models, which can be computationally burdensome in high dimensions. Specification of an appropriate covariance function for the model can be challenging when complex non-stationarities exist. Recent work has shown that pre-computed spatial basis functions and a feed-forward neural network can capture complex spatial dependence structures while remaining computationally efficient. This paper builds on this literature by tailoring spatial basis functions for use in convolutional neural networks. Through both simulated and real data, we demonstrate that this approach yields more accurate spatial predictions than existing methods. Uncertainty quantification is also considered.
- Published
- 2024
36. Enhancing Convolutional Neural Networks with Higher-Order Numerical Difference Methods
- Author
-
Wang, Qi, Gao, Zijun, Sui, Mingxiu, Mei, Taiyuan, Cheng, Xiaohan, and Li, Iris
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Computer Vision and Pattern Recognition - Abstract
With the rise of deep learning technology in practical applications, Convolutional Neural Networks (CNNs) have been able to assist humans in solving many real-world problems. To enhance the performance of CNNs, numerous network architectures have been explored. Some of these architectures are designed based on the accumulated experience of researchers over time, while others are designed through neural architecture search methods. The improvements made to CNNs by these methods are quite significant, but most of them are limited in practice by model size and environmental constraints, making it difficult to fully realize the improved performance. In recent years, research has found that many CNN structures can be explained by the discretization of ordinary differential equations. This implies that we can design theoretically supported deep network structures using higher-order numerical difference methods. It should be noted that most previous CNN model structures are based on low-order numerical methods. Therefore, considering that the accuracy of linear multi-step numerical difference methods is higher than that of the forward Euler method, this paper proposes a stacking scheme based on the linear multi-step method. This scheme enhances the performance of ResNet without increasing the model size, and it is compared with the Runge-Kutta scheme. The experimental results show that the performance of the stacking scheme proposed in this paper is superior to existing stacking schemes (ResNet and HO-ResNet), and it has the capability to be extended to other types of neural networks.
- Published
- 2024
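Illustrative note on entry 36: the abstract rests on the claim that a linear multi-step method is more accurate than forward Euler, the scheme a plain residual update x + f(x) resembles. The sketch below checks that claim on a scalar test equation, dx/dt = -x, using the two-step Adams-Bashforth method as one example of a linear multi-step method. The abstract does not say which multi-step formula the paper stacks, so this choice, the test equation, and the function names are assumptions made for illustration.

```python
import math


def f(x):
    """Right-hand side of the test ODE dx/dt = -x (stand-in for a residual branch)."""
    return -x


def forward_euler(x0, h, steps):
    """x_{n+1} = x_n + h * f(x_n): the update a plain residual connection resembles."""
    x = x0
    for _ in range(steps):
        x = x + h * f(x)
    return x


def adams_bashforth2(x0, h, steps):
    """x_{n+1} = x_n + h * (1.5 * f(x_n) - 0.5 * f(x_{n-1})), started with one Euler step."""
    prev, cur = x0, x0 + h * f(x0)
    for _ in range(steps - 1):
        prev, cur = cur, cur + h * (1.5 * f(cur) - 0.5 * f(prev))
    return cur


h, steps = 0.1, 10            # integrate from t = 0 to t = 1
exact = math.exp(-1.0)        # exact solution of dx/dt = -x with x(0) = 1
euler_err = abs(forward_euler(1.0, h, steps) - exact)
ab2_err = abs(adams_bashforth2(1.0, h, steps) - exact)
print(euler_err, ab2_err)
```

Both schemes evaluate `f` once per step, since the two-step method reuses the previous evaluation. That mirrors the abstract's point that accuracy can improve without enlarging the model.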
37. Chalcogenide Metasurfaces Enabling Ultra-Wideband Detectors from Visible to Mid-infrared
- Author
-
Zhang, Shutao, An, Shu, Dai, Mingjin, Wu, Qing Yang Steve, Adanan, Nur Qalishah, Zhang, Jun, Liu, Yan, Lee, Henry Yit Loong, Wong, Nancy Lai Mun, Suwardi, Ady, Ding, Jun, Simpson, Robert Edward, Wang, Qi Jie, Yang, Joel K. W., and Dong, Zhaogang
- Subjects
Physics - Optics - Abstract
Thermoelectric materials can be designed to support optical resonances across multiple spectral ranges to enable ultra-wideband photodetection. For instance, antimony telluride (Sb2Te3) chalcogenide exhibits interband plasmonic resonances in the visible range and Mie resonances in the mid-infrared (mid-IR) range, while simultaneously possessing large thermoelectric Seebeck coefficients. In this paper, we designed and fabricated Sb2Te3 metasurface devices to achieve resonant absorption, enabling photodetectors that operate across an ultra-wideband spectrum, from visible to mid-IR. Furthermore, relying on an asymmetric Sb2Te3 metasurface, we demonstrated thermoelectric photodetectors with polarization selectivity. This work provides a potential platform for portable ultra-wideband spectrometers operating at room temperature, for environmental sensing applications.
- Published
- 2024
38. Topological Quantum Materials with Kagome Lattice
- Author
-
Wang, Qi, Lei, Hechang, Qi, Yanpeng, and Felser, Claudia
- Subjects
Condensed Matter - Superconductivity ,Condensed Matter - Materials Science ,Condensed Matter - Strongly Correlated Electrons - Abstract
In this account, we will give an overview of our research progress on novel quantum properties in topological quantum materials with kagome lattice. Here, there are mainly two categories of kagome materials: magnetic kagome materials and nonmagnetic ones. On one hand, magnetic kagome materials mainly focus on the 3d transition-metal-based kagome systems, including Fe$_3$Sn$_2$, Co$_3$Sn$_2$S$_2$, YMn$_6$Sn$_6$, FeSn, and CoSn. The interplay between magnetism and topological bands manifests vital influence on the electronic response. For example, the existence of massive Dirac or Weyl fermions near the Fermi level significantly enhances the magnitude of Berry curvature in momentum space, leading to a large intrinsic anomalous Hall effect. In addition, the peculiar frustrated structure of kagome materials enables them to host a topologically protected skyrmion lattice or noncoplanar spin texture, yielding a topological Hall effect that arises from the real-space Berry phase. On the other hand, nonmagnetic kagome materials in the absence of long-range magnetic order include CsV$_3$Sb$_5$, with the coexistence of superconductivity, charge density wave state, and band topology, and the van der Waals semiconductor Pd$_3$P$_2$S$_8$. For these two kagome materials, the tunability of electric response in terms of high pressure or carrier doping helps to reveal the interplay between electronic correlation effects and band topology and discover the novel emergent quantum phenomena in kagome materials. Comment: 10 pages, 7 figures
- Published
- 2024
39. MICHELE RUGGIERI'S TIANZHU SHILU (THE TRUE RECORD OF THE LORD OF HEAVEN, 1584). Edited and Translated by Daniel Canaris with Contributions by Wang Huiyu, Wang Yuan, and Wang Qi. Studies in the History of Christianity in East Asia, 5. Leiden and Boston: Brill, 2023. Pp. ix + 313. Hardback, €115.00.
- Author
-
Xiong, Wei, primary
- Published
- 2023
40. Single-Loop Deterministic and Stochastic Interior-Point Algorithms for Nonlinearly Constrained Optimization
- Author
-
Curtis, Frank E., Jiang, Xin, and Wang, Qi
- Subjects
Mathematics - Optimization and Control ,Computer Science - Machine Learning - Abstract
An interior-point algorithm framework is proposed, analyzed, and tested for solving nonlinearly constrained continuous optimization problems. The main setting of interest is when the objective and constraint functions may be nonlinear and/or nonconvex, and when constraint values and derivatives are tractable to compute, but objective function values and derivatives can only be estimated. The algorithm is intended primarily for a setting that is similar to that of stochastic-gradient methods for unconstrained optimization, namely, the setting when stochastic-gradient estimates are available and employed in place of gradients of the objective, and when no objective function values (nor estimates of them) are employed. This is achieved by the interior-point framework having a single-loop structure rather than the nested-loop structure that is typical of contemporary interior-point methods. For completeness, convergence guarantees for the framework are provided both for deterministic and stochastic settings. Numerical experiments show that the algorithm yields good performance on a large set of test problems.
- Published
- 2024
41. S4D: Streaming 4D Real-World Reconstruction with Gaussians and 3D Control Points
- Author
-
He, Bing, Chen, Yunuo, Lu, Guo, Wang, Qi, Gu, Qunshan, Xie, Rong, Song, Li, and Zhang, Wenjun
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Dynamic scene reconstruction using Gaussians has recently attracted increased interest. Mainstream approaches typically employ a global deformation field to warp a 3D scene in canonical space. However, the inherent low-frequency nature of implicit neural fields often leads to ineffective representations of complex motions. Moreover, their structural rigidity can hinder adaptation to scenes with varying resolutions and durations. To address these challenges, we introduce a novel approach for streaming 4D real-world reconstruction utilizing discrete 3D control points. This method physically models local rays and establishes a motion-decoupling coordinate system. By effectively merging traditional graphics with learnable pipelines, it provides a robust and efficient local 6-degrees-of-freedom (6-DoF) motion representation. Additionally, we have developed a generalized framework that integrates our control points with Gaussians. Starting from an initial 3D reconstruction, our workflow decomposes the streaming 4D reconstruction into four independent submodules: 3D segmentation, 3D control point generation, object-wise motion manipulation, and residual compensation. Experimental results demonstrate that our method outperforms existing state-of-the-art 4D Gaussian splatting techniques on both the Neu3DV and CMU-Panoptic datasets. Notably, the optimization of our 3D control points is achievable in 100 iterations and within just 2 seconds per frame on a single NVIDIA 4070 GPU. Comment: 20 pages, 9 figures, 5 tables
- Published
- 2024
42. Folded multistability and hidden critical point in microwave-driven Rydberg atoms
- Author
-
Ma, Yu, Liu, Bang, Zhang, Li-Hua, Wang, Ya-Jun, Zhang, Zheng-Yuan, Shao, Shi-Yao, Li, Qing, Chen, Han-Chao, Zhang, Jun, Han, Tian-Yu, Wang, Qi-Feng, Nan, Jia-Dou, Yin, Yi-Ming, Zhu, Dong-Yang, Shi, Bao-Sen, and Ding, Dong-Sheng
- Subjects
Condensed Matter - Quantum Gases, Quantum Physics - Abstract
The interactions between Rydberg atoms and microwave fields provide a valuable framework for studying complex out-of-equilibrium dynamics, exotic phases, and critical phenomena in many-body physics. This unique interplay allows us to explore various regimes of nonlinearity and phase transitions. Here, we observe a phase transition from a state in the bistable regime to one in the multistable regime in strongly interacting Rydberg atoms by varying the microwave field intensity, accompanied by the breaking of Z3 symmetry. During the phase transition, the system passes through a hidden critical point, at which the multistable states are difficult to identify. By changing the initial state of the system, we can identify a hidden multistable state and reveal a hidden trajectory of the phase transition, allowing us to track it to a hidden critical point. In addition, we observe multiple phase transitions in the spectra, suggesting higher-order symmetry breaking. The reported results shed light on manipulating multistability in dissipative Rydberg atom systems and hold promise for applications in non-equilibrium many-body physics. Comment: 10 pages, 5 figures
- Published
- 2024
43. A Training-Free Framework for Video License Plate Tracking and Recognition with Only One-Shot
- Author
-
Ding, Haoxuan, Wang, Qi, Gao, Junyu, and Li, Qiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Traditional license plate detection and recognition models are often trained on closed datasets, limiting their ability to handle the diverse license plate formats across different regions. The emergence of large-scale pre-trained models has shown exceptional generalization capabilities, enabling few-shot and zero-shot learning. We propose OneShotLP, a training-free framework for video-based license plate detection and recognition, leveraging these advanced models. Starting with the license plate position in the first video frame, our method tracks this position across subsequent frames using a point tracking module, creating a trajectory of prompts. These prompts are input into a segmentation module that uses a promptable large segmentation model to generate local masks of the license plate regions. The segmented areas are then processed by multimodal large language models (MLLMs) for accurate license plate recognition. OneShotLP offers significant advantages, including the ability to function effectively without extensive training data and adaptability to various license plate styles. Experimental results on UFPR-ALPR and SSIG-SegPlate datasets demonstrate the superior accuracy of our approach compared to traditional methods. This highlights the potential of leveraging pre-trained models for diverse real-world applications in intelligent transportation systems. The code is available at https://github.com/Dinghaoxuan/OneShotLP.
- Published
- 2024
44. Exceptional point and hysteresis trajectories in cold Rydberg atomic gases
- Author
-
Zhang, Jun, Li, En-Ze, Wang, Ya-Jun, Liu, Bang, Zhang, Li-Hua, Zhang, Zheng-Yuan, Shao, Shi-Yao, Li, Qing, Chen, Han-Chao, Ma, Yu, Han, Tian-Yu, Wang, Qi-Feng, Nan, Jia-Dou, Ying, Yi-Ming, Zhu, Dong-Yang, Shi, Bao-Sen, and Ding, Dong-Sheng
- Subjects
Condensed Matter - Quantum Gases, Quantum Physics - Abstract
The interplay between strong long-range interactions and coherent driving contributes to the formation of complex patterns, symmetry, and novel phases of matter in many-body systems. However, long-range interactions may induce an additional dissipation channel, resulting in non-Hermitian many-body dynamics and the emergence of exceptional points in the spectrum. Here, we report the experimental observation of interaction-induced exceptional points in cold Rydberg atomic gases, revealing the breaking of charge-conjugation parity symmetry. By measuring the transmission spectrum under increasing and decreasing probe intensity, we observe interaction-induced hysteresis trajectories, which give rise to non-Hermitian dynamics. We record the area enclosed by the hysteresis loops and investigate their dynamics. The reported exceptional points and hysteresis trajectories in cold Rydberg atomic gases provide valuable insights into the underlying non-Hermitian physics in many-body systems, allowing us to study the interplay between long-range interactions and non-Hermiticity.
- Published
- 2024
45. Algorithm Research of ELMo Word Embedding and Deep Learning Multimodal Transformer in Image Description
- Author
-
Cheng, Xiaohan, Mei, Taiyuan, Zi, Yun, Wang, Qi, Gao, Zijun, and Yang, Haowei
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence - Abstract
Zero-shot learning is an effective approach to data deficiency. Existing embedding-based zero-shot learning methods use only the known classes to construct the embedding space, so they overfit to the known classes at test time. This project uses category semantic similarity measures to classify multiple tags, which allows unknown classes that have the same meaning as currently known classes to be incorporated into the vector space when it is built. At the same time, most existing zero-shot learning algorithms directly use the deep features of medical images as input, and the feature extraction process does not consider semantic information. This project takes ELMo-MCT as the main task and obtains multiple visual features related to the original image through a self-attention mechanism. A large number of experiments are carried out on three zero-shot learning reference datasets, and the best harmonic mean accuracy is obtained compared with the most advanced algorithms.
- Published
- 2024
46. OutfitAnyone: Ultra-high Quality Virtual Try-On for Any Clothing and Any Person
- Author
-
Sun, Ke, Cao, Jian, Wang, Qi, Tian, Linrui, Zhang, Xindi, Zhuo, Lian, Zhang, Bang, Bo, Liefeng, Zhou, Wenbo, Zhang, Weiming, and Gao, Daiheng
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Virtual Try-On (VTON) has become a transformative technology, empowering users to experiment with fashion without ever having to physically try on clothing. However, existing methods often struggle with generating high-fidelity and detail-consistent results. While diffusion models, such as the Stable Diffusion series, have shown their capability in creating high-quality and photorealistic images, they encounter formidable challenges in conditional generation scenarios like VTON. Specifically, these models struggle to maintain a balance between control and consistency when generating images for virtual clothing trials. OutfitAnyone addresses these limitations by leveraging a two-stream conditional diffusion model, enabling it to adeptly handle garment deformation for more lifelike results. It distinguishes itself with scalability-modulating factors such as pose, body shape and broad applicability, extending from anime to in-the-wild images. OutfitAnyone's performance in diverse scenarios underscores its utility and readiness for real-world deployment. For more details and animated results, please see https://humanaigc.github.io/outfit-anyone/. Comment: 10 pages, 13 figures
- Published
- 2024
47. EaDeblur-GS: Event assisted 3D Deblur Reconstruction with Gaussian Splatting
- Author
-
Weng, Yuchen, Shen, Zhengwen, Chen, Ruofan, Wang, Qi, and Wang, Jun
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
3D deblurring reconstruction techniques have recently seen significant advancements with the development of Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS). Although these techniques can recover relatively clear 3D reconstructions from blurry image inputs, they still face limitations in handling severe blurring and complex camera motion. To address these issues, we propose Event-assisted 3D Deblur Reconstruction with Gaussian Splatting (EaDeblur-GS), which integrates event camera data to enhance the robustness of 3DGS against motion blur. By employing an Adaptive Deviation Estimator (ADE) network to estimate Gaussian center deviations and using novel loss functions, EaDeblur-GS achieves sharp 3D reconstructions in real time, demonstrating performance comparable to state-of-the-art methods.
- Published
- 2024
48. Representation-finite tensor product algebras
- Author
-
Wang, Qi
- Subjects
Mathematics - Representation Theory - Abstract
In this paper, we complete the classification of representation-finite tensor product algebras in terms of quivers with relations. Comment: 18 pages, comments welcome
- Published
- 2024
49. Space Adaptive Search for Nonholonomic Mobile Robots Path Planning
- Author
-
Wang, Qi
- Subjects
Computer Science - Robotics, 68T20, I.2.8 - Abstract
Path planning for a nonholonomic mobile robot is a challenging problem. This paper proposes a novel space adaptive search (SAS) approach that greatly reduces the computation cost of nonholonomic mobile robot path planning. Classic search-based path planning updates only the state at the current location in each step, which is very inefficient and can therefore easily become trapped in a local minimum. The SAS updates not only the state of the current location but also all states in the neighborhood, and the size of the neighborhood is adaptively varied based on the clearance around the current location at each step. Since a great many states can be updated immediately, the search can explore a local minimum and escape it very quickly. As a result, the proposed approach can effectively deal with clustered environments with a large number of local minima. The SAS also utilizes a set of predefined motion primitives and dynamically scales them during the search to create various new primitives with differing sizes and curvatures. This greatly improves the flexibility of the search in more complex environments. Unlike the A* family, which uses a heuristic to accelerate the search, the experiments show that the SAS, even without a heuristic, requires much less computation time and memory than the weighted A* algorithm, while still preserving the optimality of the produced path. However, the SAS can also be applied together with a heuristic or other path planning algorithms. Comment: 12 pages, 62 figures
- Published
- 2024
50. DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition
- Author
-
Wang, Qi, Xu, Zhou, Lin, Yuming, Ye, Jingtao, Li, Hongsheng, Zhu, Guangming, Shah, Syed Afaq Ali, Bennamoun, Mohammed, and Zhang, Liang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Neuromorphic sensors, specifically event cameras, revolutionize visual data acquisition by capturing pixel intensity changes with exceptional dynamic range, minimal latency, and energy efficiency, setting them apart from conventional frame-based cameras. The distinctive capabilities of event cameras have ignited significant interest in the domain of event-based action recognition, recognizing their vast potential for advancement. However, the development in this field is currently slowed by the lack of comprehensive, large-scale datasets, which are critical for developing robust recognition frameworks. To bridge this gap, we introduce DailyDVS-200, a meticulously curated benchmark dataset tailored for the event-based action recognition community. DailyDVS-200 is extensive, covering 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences. This dataset is designed to reflect a broad spectrum of action types, scene complexities, and data acquisition diversity. Each sequence in the dataset is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions. Moreover, DailyDVS-200 is structured to facilitate a wide range of research paths, offering a solid foundation for both validating existing approaches and inspiring novel methodologies. By setting a new benchmark in the field, we challenge the current limitations of neuromorphic data processing and invite a surge of new approaches in event-based action recognition techniques, which paves the way for future explorations in neuromorphic computing and beyond. The dataset and source code are available at https://github.com/QiWang233/DailyDVS-200. Comment: Accepted to ECCV 2024
- Published
- 2024