32,012 results on '"Yu, Xin"'
Search Results
2. An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models
- Author
Shiri, Fatemeh, Guo, Xiao-Yu, Far, Mona Golestan, Yu, Xin, Haffari, Gholamreza, and Li, Yuan-Fang
- Subjects
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence
- Abstract
Large Multimodal Models (LMMs) have achieved strong performance across a range of vision and language tasks. However, their spatial reasoning capabilities are under-investigated. In this paper, we construct a novel VQA dataset, Spatial-MM, to comprehensively study LMMs' spatial understanding and reasoning capabilities. Our analyses on object-relationship and multi-hop reasoning reveal several important findings. Firstly, bounding boxes and scene graphs, even synthetic ones, can significantly enhance LMMs' spatial reasoning. Secondly, LMMs struggle more with questions posed from the human perspective than the camera perspective about the image. Thirdly, chain of thought (CoT) prompting does not improve model performance on complex multi-hop questions involving spatial relations. Moreover, spatial reasoning steps are much less accurate than non-spatial ones across MLLMs. Lastly, our perturbation analysis on GQA-spatial reveals that LMMs are much stronger at basic object detection than complex spatial reasoning. We believe our benchmark dataset and in-depth analyses can spark further research on LMMs' spatial reasoning. Spatial-MM benchmark is available at: https://github.com/FatemehShiri/Spatial-MM
- Published
- 2024
3. Efficient preparation of Dicke states
- Author
Yu, Jeffery, Muleady, Sean R., Wang, Yu-Xin, Schine, Nathan, Gorshkov, Alexey V., and Childs, Andrew M.
- Subjects
Quantum Physics
- Abstract
We present an algorithm utilizing mid-circuit measurement and feedback that prepares Dicke states with polylogarithmically many ancillas and polylogarithmic depth. Our algorithm uses only global mid-circuit projective measurements and adaptively-chosen global rotations. This improves over prior work that was only efficient for Dicke states of low weight, or was not efficient in both depth and width. Our algorithm can also naturally be implemented in a cavity QED context using polylogarithmic time, zero ancillas, and atom-photon coupling scaling with the square root of the system size. Comment: 7 pages plus end matter and supplement, 4 figures
- Published
- 2024
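For readers unfamiliar with the target state in the entry above, the standard textbook definition of a Dicke state can be written out as follows. The notation (n qubits, weight k, |x| for the Hamming weight of the bit string x) is an assumption of this note and is not taken from the abstract; the second formula is a small worked instance.

```latex
\[
  |D^n_k\rangle \;=\; \binom{n}{k}^{-1/2} \sum_{\substack{x \in \{0,1\}^n \\ |x| = k}} |x\rangle ,
  \qquad
  |D^3_1\rangle \;=\; \tfrac{1}{\sqrt{3}}\bigl(|100\rangle + |010\rangle + |001\rangle\bigr).
\]
```

Here "low weight" in the abstract refers to small k in this notation.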
4. Search for exotic gravitational wave signals beyond general relativity using deep learning
- Author
Wang, Yu-Xin, Wei, Xiaotong, Li, Chun-Yue, Sun, Tian-Yang, Jin, Shang-Jie, Wang, He, Cui, Jing-Lei, Zhang, Jing-Fei, and Zhang, Xin
- Subjects
General Relativity and Quantum Cosmology; Astrophysics - Cosmology and Nongalactic Astrophysics; High Energy Physics - Phenomenology
- Abstract
The direct detection of gravitational waves by LIGO has confirmed general relativity (GR) and sparked rapid growth in gravitational wave (GW) astronomy. However, subtle post-Newtonian (PN) deviations observed during the analysis of high signal-to-noise ratio events from the observational runs suggest that standard waveform templates, which assume strict adherence to GR, might overlook signals from alternative theories of gravity. Incorporating these exotic signals into traditional search algorithms is computationally infeasible due to the vast template space required. This paper introduces a deep learning framework for detecting exotic GW signals, leveraging neural networks trained on GR-based templates. Through their generalization ability, neural networks learn intricate features from the data, enabling the detection of signals that deviate from GR. We present the first study evaluating the capability of deep learning to detect beyond-GR signals, including a variety of PN orders. Our model achieves rapid and accurate identification of exotic GW signals across different luminosity distances, with performance comparable to GR-based detections. Applying the model to the GW150914 event demonstrates excellent performance, highlighting the potential of AI-driven methods for detecting previously overlooked signals beyond GR. This work paves the way for new discoveries in gravitational wave astronomy, enabling the detection of signals that might escape traditional search pipelines. Comment: 10 pages, 7 figures
- Published
- 2024
5. Diverse Sign Language Translation
- Author
Shen, Xin, Shen, Lei, Yuan, Shaozu, Du, Heming, Sun, Haiyang, and Yu, Xin
- Subjects
Computer Science - Multimedia; Computer Science - Computer Vision and Pattern Recognition
- Abstract
As in spoken languages, a single sign language expression can correspond to multiple valid textual interpretations. Hence, learning a rigid one-to-one mapping for sign language translation (SLT) models might be inadequate, particularly in the case of limited data. In this work, we introduce a Diverse Sign Language Translation (DivSLT) task, aiming to generate diverse yet accurate translations for sign language videos. Firstly, we employ large language models (LLMs) to generate multiple references for the widely-used CSL-Daily and PHOENIX14T SLT datasets. Here, native speakers are only invited to touch up inaccurate references, thus significantly improving the annotation efficiency. Secondly, we provide a benchmark model to spur research in this task. Specifically, we investigate multi-reference training strategies to enable our DivSLT model to achieve diverse translations. Then, to enhance translation accuracy, we employ the max-reward-driven reinforcement learning objective that maximizes the reward of the translated result. Additionally, we utilize multiple metrics to assess the accuracy, diversity, and semantic precision of the DivSLT task. Experimental results on the enriched datasets demonstrate that our DivSLT method achieves not only better translation performance but also diverse translation results.
- Published
- 2024
6. MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset
- Author
Shen, Xin, Du, Heming, Sheng, Hongwei, Wang, Shuyun, Chen, Hui, Chen, Huiqiang, Wu, Zhuojie, Du, Xiaobiao, Ying, Jiaying, Lu, Ruihan, Xu, Qingzheng, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Isolated Sign Language Recognition (ISLR) focuses on identifying individual sign language glosses. Considering the diversity of sign languages across geographical regions, developing region-specific ISLR datasets is crucial for supporting communication and research. Auslan, as a sign language specific to Australia, still lacks a dedicated large-scale word-level dataset for the ISLR task. To fill this gap, we curate the first large-scale Multi-view Multi-modal Word-Level Australian Sign Language recognition dataset, dubbed MM-WLAuslan. Compared to other publicly available datasets, MM-WLAuslan exhibits three significant advantages: (1) the largest amount of data, (2) the most extensive vocabulary, and (3) the most diverse multi-modal camera views. Specifically, we record 282K+ sign videos covering 3,215 commonly used Auslan glosses presented by 73 signers in a studio environment. Moreover, our filming system includes two different types of cameras, i.e., three Kinect-V2 cameras and a RealSense camera. We position cameras hemispherically around the front half of the model and simultaneously record videos using all four cameras. Furthermore, we benchmark results with state-of-the-art methods for various multi-modal ISLR settings on MM-WLAuslan, including multi-view, cross-camera, and cross-view. Experiment results indicate that MM-WLAuslan is a challenging ISLR dataset, and we hope this dataset will contribute to the development of Auslan and the advancement of sign languages worldwide. All datasets and benchmarks are available at MM-WLAuslan.
- Published
- 2024
7. Observation of anomalous information scrambling in a Rydberg atom array
- Author
Liang, Xinhui, Yue, Zongpei, Chao, Yu-Xin, Hua, Zhen-Xing, Lin, Yige, Tey, Meng Khoon, and You, Li
- Subjects
Quantum Physics; Condensed Matter - Quantum Gases; Physics - Atomic Physics
- Abstract
Quantum information scrambling, which describes the propagation and effective loss of local information, is crucial for understanding the dynamics of quantum many-body systems. In general, a typical interacting system would thermalize under time evolution, leading to the emergence of ergodicity and linear lightcones of information scrambling. By contrast, for a many-body localized system, strong disorders give rise to an extensive number of conserved quantities that prevent the system from thermalization, resulting in full ergodicity breaking and a logarithmic lightcone for information spreading. Here, we report the experimental observation of anomalous information scrambling in an atomic tweezer array. Working in the Rydberg blockade regime, where van der Waals interaction dominates, we observe a suppressed linear lightcone of information spreading characterized by out-of-time-order correlators for the initial Néel state, accompanied by persistent oscillations within the lightcone. Such anomalous dynamics differs from both generic thermal and many-body localized scenarios. It originates from weak ergodicity breaking and is the characteristic feature for quantum many-body scars. The high-quality single-atom manipulations and coherent constraint dynamics, augmented by the effective protocol for time-reversed evolution we demonstrate, establish a versatile hybrid analog-digital simulation approach to explore diverse exotic non-equilibrium dynamics with atomic tweezer arrays.
- Published
- 2024
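The out-of-time-order correlator mentioned in the entry above has a standard general form, sketched here for orientation. The operators W and V, the Hamiltonian H, and the state |ψ⟩ are generic placeholders assumed for this note; the abstract does not say which operators the experiment measures.

```latex
\[
  F(t) = \langle \psi |\, W^\dagger(t)\, V^\dagger\, W(t)\, V \,| \psi \rangle,
  \qquad W(t) = e^{iHt}\, W\, e^{-iHt},
\]
\[
  C(t) = \langle \psi |\, [W(t), V]^\dagger [W(t), V] \,| \psi \rangle = 2\bigl(1 - \operatorname{Re} F(t)\bigr)
  \quad \text{for unitary } W,\ V .
\]
```

A lightcone in this language is the region of space and time in which C(t) has grown appreciably away from zero.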
8. Instability of steady-state mixed-state symmetry-protected topological order to strong-to-weak spontaneous symmetry breaking
- Author
Shah, Jeet, Fechisin, Christopher, Wang, Yu-Xin, Iosue, Joseph T., Watson, James D., Wang, Yan-Qi, Ware, Brayden, Gorshkov, Alexey V., and Lin, Cheng-Ju
- Subjects
Quantum Physics; Condensed Matter - Statistical Mechanics; Condensed Matter - Strongly Correlated Electrons
- Abstract
Recent experimental progress in controlling open quantum systems enables the pursuit of mixed-state nonequilibrium quantum phases. We investigate whether open quantum systems hosting mixed-state symmetry-protected topological states as steady states retain this property under symmetric perturbations. Focusing on the decohered cluster state -- a mixed-state symmetry-protected topological state protected by a combined strong and weak symmetry -- we construct a parent Lindbladian that hosts it as a steady state. This Lindbladian can be mapped onto exactly solvable reaction-diffusion dynamics, even in the presence of certain perturbations, allowing us to solve the parent Lindbladian in detail and reveal previously-unknown steady states. Using both analytical and numerical methods, we find that typical symmetric perturbations cause strong-to-weak spontaneous symmetry breaking at arbitrarily small perturbations, destabilizing the steady-state mixed-state symmetry-protected topological order. However, when perturbations introduce only weak symmetry defects, the steady-state mixed-state symmetry-protected topological order remains stable. Additionally, we construct a quantum channel which replicates the essential physics of the Lindbladian and can be efficiently simulated using only Clifford gates, Pauli measurements, and feedback. Comment: 21+12 pages, 10+4 figures
- Published
- 2024
9. The Effect of Personalization in FedProx: A Fine-grained Analysis on Statistical Accuracy and Communication Efficiency
- Author
Yu, Xin, He, Zelin, Sun, Ying, Xue, Lingzhou, and Li, Runze
- Subjects
Statistics - Machine Learning; Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Machine Learning; Mathematics - Statistics Theory; Statistics - Computation
- Abstract
FedProx is a simple yet effective federated learning method that enables model personalization via regularization. Despite remarkable success in practice, a rigorous analysis of how such a regularization provably improves the statistical accuracy of each client's local model hasn't been fully established. Setting the regularization strength heuristically presents a risk, as an inappropriate choice may even degrade accuracy. This work fills in the gap by analyzing the effect of regularization on statistical accuracy, thereby providing a theoretical guideline for setting the regularization strength for achieving personalization. We prove that by adaptively choosing the regularization strength under different statistical heterogeneity, FedProx can consistently outperform pure local training and achieve a nearly minimax-optimal statistical rate. In addition, to shed light on resource allocation, we design an algorithm, provably showing that stronger personalization reduces communication complexity without increasing the computation cost overhead. Finally, our theory is validated on both synthetic and real-world datasets and its generalizability is verified in a non-convex setting.
- Published
- 2024
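The regularization described in the entry above can be illustrated with a minimal sketch of a proximal local update, in which each client minimizes its own loss plus a penalty pulling it toward the shared global model. The function names, the least-squares client loss, the symbol `lam` for the regularization strength, and the step-size settings are all assumptions made for this illustration; the abstract does not give the paper's exact objective or algorithm.

```python
import numpy as np

def fedprox_local_step(w, w_global, grad_fn, lam, lr):
    """One gradient step on f_i(w) + (lam / 2) * ||w - w_global||^2."""
    # The proximal term contributes lam * (w - w_global) to the gradient.
    grad = grad_fn(w) + lam * (w - w_global)
    return w - lr * grad

def least_squares_grad(X, y):
    """Gradient of the hypothetical client loss 0.5 * mean((X w - y)^2)."""
    n = X.shape[0]
    return lambda w: X.T @ (X @ w - y) / n

def fit_personalized(X, y, w_global, lam, lr=0.1, steps=500):
    """Run plain gradient descent on the regularized local objective."""
    w = w_global.copy()
    grad_fn = least_squares_grad(X, y)
    for _ in range(steps):
        w = fedprox_local_step(w, w_global, grad_fn, lam, lr)
    return w

# Toy client: lam = 0 recovers pure local training, larger lam pulls
# the personalized model toward the global model.
X = np.eye(2)
y = np.array([1.0, 2.0])
w_global = np.zeros(2)
w_local = fit_personalized(X, y, w_global, lam=0.0)
w_mixed = fit_personalized(X, y, w_global, lam=0.5)
```

With `lam = 0` the client ignores the global model entirely, and as `lam` grows the personalized model approaches `w_global`; choosing a value between these extremes is the question the abstract says the paper analyzes.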
10. Exponential entanglement advantage in sensing correlated noise
- Author
Wang, Yu-Xin, Bringewatt, Jacob, Seif, Alireza, Brady, Anthony J., Oh, Changhun, and Gorshkov, Alexey V.
- Subjects
Quantum Physics
- Abstract
In this work, we propose a new form of exponential quantum advantage in the context of sensing correlated noise. Specifically, we focus on the problem of estimating parameters associated with Lindblad dephasing dynamics, and show that entanglement can lead to an exponential enhancement in the sensitivity (as quantified via quantum Fisher information of the sensor state) for estimating a small parameter characterizing the deviation of system Lindbladians from a class of maximally correlated dephasing dynamics. This result stands in stark contrast with previously studied scenarios of sensing uncorrelated dephasing noise, where one can prove that entanglement does not lead to an advantage in the signal-to-noise ratio. Our work thus opens a novel pathway towards achieving entanglement-based sensing advantage, which may find applications in characterizing decoherence dynamics of near-term quantum devices. Further, our approach provides a potential quantum-enhanced probe of many-body correlated phases by measuring noise generated by a sensing target. We also discuss realization of our protocol using near-term quantum hardware. Comment: 7+2 pages, 1 figure
- Published
- 2024
11. MVGS: Multi-view-regulated Gaussian Splatting for Novel View Synthesis
- Author
Du, Xiaobiao, Wang, Yida, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Recent works in volume rendering, e.g., NeRF and 3D Gaussian Splatting (3DGS), significantly advance the rendering quality and efficiency with the help of the learned implicit neural radiance field or 3D Gaussians. Rendering on top of an explicit representation, the vanilla 3DGS and its variants deliver real-time efficiency by optimizing the parametric model with single-view supervision per iteration during training, which is adopted from NeRF. Consequently, certain views are overfitted, leading to unsatisfying appearance in novel-view synthesis and imprecise 3D geometries. To solve the aforementioned problems, we propose a new 3DGS optimization method embodying four key novel contributions: 1) We transform the conventional single-view training paradigm into a multi-view training strategy. With our proposed multi-view regulation, 3D Gaussian attributes are further optimized without overfitting certain training views. As a general solution, we improve the overall accuracy in a variety of scenarios and different Gaussian variants. 2) Inspired by the benefit introduced by additional views, we further propose a cross-intrinsic guidance scheme, leading to a coarse-to-fine training procedure concerning different resolutions. 3) Built on top of our multi-view regulated training, we further propose a cross-ray densification strategy, densifying more Gaussian kernels in the ray-intersect regions from a selection of views. 4) By further investigating the densification strategy, we found that the effect of densification should be enhanced when certain views differ dramatically. As a solution, we propose a novel multi-view augmented densification strategy, where 3D Gaussians are encouraged to get densified to a sufficient number accordingly, resulting in improved reconstruction accuracy. Comment: Project page: https://xiaobiaodu.github.io/mvgs-project/
- Published
- 2024
12. TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm
- Author
Zhang, Bingqing, Cao, Zhuo, Du, Heming, Yu, Xin, Li, Xue, Liu, Jiajun, and Wang, Sen
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Text-Video Retrieval (TVR) methods typically match query-candidate pairs by aligning text and video features in coarse-grained, fine-grained, or combined (coarse-to-fine) manners. However, these frameworks predominantly employ a one(query)-to-one(candidate) alignment paradigm, which struggles to discern nuanced differences among candidates, leading to frequent mismatches. Inspired by Comparative Judgement in human cognitive science, where decisions are made by directly comparing items rather than evaluating them independently, we propose TokenBinder. This innovative two-stage TVR framework introduces a novel one-to-many coarse-to-fine alignment paradigm, imitating the human cognitive process of identifying specific items within a large collection. Our method employs a Focused-view Fusion Network with a sophisticated cross-attention mechanism, dynamically aligning and comparing features across multiple videos to capture finer nuances and contextual variations. Extensive experiments on six benchmark datasets confirm that TokenBinder substantially outperforms existing state-of-the-art methods. These results demonstrate its robustness and the effectiveness of its fine-grained alignment in bridging intra- and inter-modality information gaps in TVR tasks.
- Published
- 2024
13. EM-DARTS: Hierarchical Differentiable Architecture Search for Eye Movement Recognition
- Author
Qin, Huafeng, Zhu, Hongyu, Jin, Xin, Yu, Xin, El-Yacoubi, Mounim A., and Gao, Xinbo
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Eye movement biometrics has received increasing attention thanks to its highly secure identification. Although deep learning (DL) models have recently been applied successfully to eye movement recognition, the DL architecture is still determined by human prior knowledge. Differentiable Neural Architecture Search (DARTS) automates the manual process of architecture design with high search efficiency. DARTS, however, usually stacks the same multiple learned cells to form a final neural network for evaluation, thereby limiting the diversity of the network. In addition, DARTS usually searches the architecture in a shallow network while evaluating it in a deeper one, which results in a large gap between the architecture depths in the search and evaluation scenarios. To address this issue, we propose EM-DARTS, a hierarchical differentiable architecture search algorithm to automatically design the DL architecture for eye movement recognition. First, we define a supernet and propose a global and local alternate Neural Architecture Search method to search the optimal architecture alternately with a differentiable neural architecture search. The local search strategy aims to find an optimal architecture for different cells while the global search strategy is responsible for optimizing the architecture of the target network. To further reduce redundancy, a transfer entropy is proposed to compute the information amount of each layer, so as to further simplify the search network. Our experiments on three public databases demonstrate that the proposed EM-DARTS is capable of producing an optimal architecture that leads to state-of-the-art recognition performance. Comment: Submitted to IEEE Transactions on Information Forensics and Security
- Published
- 2024
14. FreeAvatar: Robust 3D Facial Animation Transfer by Learning an Expression Foundation Model
- Author
Qiu, Feng, Zhang, Wei, Liu, Chen, An, Rudong, Li, Lincheng, Ding, Yu, Fan, Changjie, Hu, Zhipeng, and Yu, Xin
- Subjects
Computer Science - Graphics; Computer Science - Artificial Intelligence
- Abstract
Video-driven 3D facial animation transfer aims to drive avatars to reproduce the expressions of actors. Existing methods have achieved remarkable results by constraining both geometric and perceptual consistency. However, geometric constraints (like those designed on facial landmarks) are insufficient to capture subtle emotions, while expression features trained on classification tasks lack fine granularity for complex emotions. To address this, we propose FreeAvatar, a robust facial animation transfer method that relies solely on our learned expression representation. Specifically, FreeAvatar consists of two main components: the expression foundation model and the facial animation transfer model. In the first component, we initially construct a facial feature space through a face reconstruction task and then optimize the expression feature space by exploring the similarities among different expressions. Benefiting from training on large amounts of unlabeled facial images and a re-collected expression comparison dataset, our model adapts freely and effectively to any in-the-wild input facial images. In the facial animation transfer component, we propose a novel Expression-driven Multi-avatar Animator, which first maps expressive semantics to the facial control parameters of 3D avatars and then imposes perceptual constraints between the input and output images to maintain expression consistency. To make the entire process differentiable, we employ a trained neural renderer to translate rig parameters into corresponding images. Furthermore, unlike previous methods that require separate decoders for each avatar, we propose a dynamic identity injection module that allows for the joint training of multiple avatars within a single network. Comment: 11 pages, 10 figures
- Published
- 2024
15. StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads
- Author
Wang, Suzhen, Ma, Yifeng, Ding, Yu, Hu, Zhipeng, Fan, Changjie, Lv, Tangjie, Deng, Zhidong, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Individuals have unique facial expression and head pose styles that reflect their personalized speaking styles. Existing one-shot talking head methods cannot capture such personalized characteristics and therefore fail to produce diverse speaking styles in the final videos. To address this challenge, we propose a one-shot style-controllable talking face generation method that can obtain speaking styles from reference speaking videos and drive the one-shot portrait to speak with the reference speaking styles and another piece of audio. Our method aims to synthesize the style-controllable coefficients of a 3D Morphable Model (3DMM), including facial expressions and head movements, in a unified framework. Specifically, the proposed framework first leverages a style encoder to extract the desired speaking styles from the reference videos and transform them into style codes. Then, the framework uses a style-aware decoder to synthesize the coefficients of 3DMM from the audio input and style codes. During decoding, our framework adopts a two-branch architecture, which generates the stylized facial expression coefficients and stylized head movement coefficients, respectively. After obtaining the coefficients of 3DMM, an image renderer renders the expression coefficients into a specific person's talking-head video. Extensive experiments demonstrate that our method generates visually authentic talking head videos with diverse speaking styles from only one portrait image and an audio clip. Comment: TPAMI 2024. arXiv admin note: text overlap with arXiv:2301.01081
- Published
- 2024
16. CF-PRNet: Coarse-to-Fine Prototype Refining Network for Point Cloud Completion and Reconstruction
- Author
Chen, Zhi, Wei, Tianqi, Zhao, Zecheng, Lim, Jia Syuen, Luo, Yadan, Zhang, Hu, Yu, Xin, Chapman, Scott, and Huang, Zi
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
In modern agriculture, precise monitoring of plants and fruits is crucial for tasks such as high-throughput phenotyping and automated harvesting. This paper addresses the challenge of reconstructing accurate 3D shapes of fruits from partial views, which is common in agricultural settings. We introduce CF-PRNet, a coarse-to-fine prototype refining network that leverages high-resolution 3D data during the training phase but requires only a single RGB-D image for real-time inference. Our approach begins by extracting features, with a series of convolutional blocks, from the incomplete point cloud data constructed from a partial view of a fruit. The extracted features inform the generation of scaling vectors that refine two sequentially constructed 3D mesh prototypes - one coarse and one fine-grained. This progressive refinement facilitates the detailed completion of the final point clouds, achieving detailed and accurate reconstructions. CF-PRNet demonstrates excellent performance metrics with a Chamfer Distance of 3.78, an F1 Score of 66.76%, a Precision of 56.56%, and a Recall of 85.31%, and won first place in the Shape Completion and Reconstruction of Sweet Peppers Challenge. Comment: Technical Report of the 1st place solution to CVPPA@ECCV2024: Shape Completion and Reconstruction of Sweet Peppers Challenge
- Published
- 2024
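The Chamfer Distance reported in the entry above measures how far two point clouds are from each other. A minimal sketch of one common symmetric variant follows, assuming unsquared Euclidean distances averaged in each direction; the abstract does not state which variant, scaling, or units the challenge uses, so the figure of 3.78 should not be expected to match this exact definition.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise Euclidean distances between every point in a and every point in b.
    diff = a[:, None, :] - b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # shape (N, M)
    # Mean nearest-neighbor distance from a to b, plus the same from b to a.
    return dist.min(axis=1).mean() + dist.min(axis=0).mean()

# Toy point clouds standing in for a predicted and a reference shape.
predicted = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
reference = np.array([[0.0, 0.0, 0.0]])
cd = chamfer_distance(predicted, reference)
```

Other variants square the distances, sum instead of averaging, or halve the total, so values are only comparable when the definition is fixed.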
17. PlantSeg: A Large-Scale In-the-wild Dataset for Plant Disease Segmentation
- Author
Wei, Tianqi, Chen, Zhi, Yu, Xin, Chapman, Scott, Melloy, Paul, and Huang, Zi
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Plant diseases pose significant threats to agriculture, necessitating proper diagnosis and effective treatment to safeguard crop yields. To automate the diagnosis process, image segmentation is usually adopted for precisely identifying diseased regions, thereby advancing precision agriculture. Developing robust image segmentation models for plant diseases demands high-quality annotations across numerous images. However, existing plant disease datasets typically lack segmentation labels and are often confined to controlled laboratory settings, which do not adequately reflect the complexity of natural environments. Motivated by this fact, we established PlantSeg, a large-scale segmentation dataset for plant diseases. PlantSeg distinguishes itself from existing datasets in three key aspects. (1) Annotation type: Unlike the majority of existing datasets that only contain class labels or bounding boxes, each image in PlantSeg includes detailed and high-quality segmentation masks, associated with plant types and disease names. (2) Image source: Unlike typical datasets that contain images from laboratory settings, PlantSeg primarily comprises in-the-wild plant disease images. This choice enhances the practical applicability, as the trained models can be applied for integrated disease management. (3) Scale: PlantSeg is extensive, featuring 11,400 images with disease segmentation masks and an additional 8,000 healthy plant images categorized by plant type. Extensive technical experiments validate the high quality of PlantSeg's annotations. This dataset not only allows researchers to evaluate their image classification methods but also provides a critical foundation for developing and benchmarking advanced plant disease segmentation algorithms.
- Published
- 2024
18. Prenatal Exposure to Source-Specific Fine Particulate Matter and Autism Spectrum Disorder.
- Author
Luglio, David, Kleeman, Michael, Yu, Xin, Lin, Jane, Chow, Ting, Martinez, Mayra, Chen, Zhanghua, Chen, Jiu-Chiuan, Eckel, Sandrah, Schwartz, Joel, Lurmann, Frederick, McConnell, Rob, Xiang, Anny, and Rahman, Md
- Subjects
PM2.5; air pollution sources; autism spectrum disorders; gasoline; pregnancy; prenatal exposures; Particulate Matter; Autism Spectrum Disorder; Female; Pregnancy; Humans; Prenatal Exposure Delayed Effects; Air Pollutants; Adult; California; Retrospective Studies; Maternal Exposure; Vehicle Emissions
- Abstract
In this study, associations between prenatal exposure to fine particulate matter (PM2.5) from 9 sources and development of autism spectrum disorder (ASD) were assessed in a population-based retrospective pregnancy cohort in southern California. The cohort included 318,750 mother-child singleton pairs. ASD cases (N = 4559) were identified by ICD codes. Source-specific PM2.5 concentrations were estimated from a chemical transport model with a 4 × 4 km2 resolution and assigned to maternal pregnancy residential addresses. Cox proportional hazard models were used to estimate the hazard ratios (HR) of ASD development for each individual source. We also adjusted for total PM2.5 mass and in a separate model for all other sources simultaneously. Increased ASD risk was observed with on-road gasoline (HR [CI]: 1.18 [1.13, 1.24]), off-road gasoline (1.15 [1.12, 1.19]), off-road diesel (1.08 [1.05, 1.10]), food cooking (1.05 [1.02, 1.08]), aircraft (1.04 [1.01, 1.06]), and natural gas combustion (1.09 [1.06, 1.11]), each scaled to standard deviation increases in concentration. On-road gasoline and off-road gasoline were robust for other pollutant groups. PM2.5 emitted from different sources may have different impacts on ASD. The results also identify PM source mixtures for toxicological investigations that may provide evidence for future public health policies.
- Published
- 2024
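The hazard ratios in the entry above come from Cox proportional hazard models and are scaled to one-standard-deviation increases in exposure. Under the standard form of the model, sketched below, that scaling works as follows. The symbols (baseline hazard h_0, coefficient β_j, and exposure standard deviation σ_j for source j) are notation assumed for this note, and the covariates the study adjusted for are not listed in the abstract.

```latex
\[
  h(t \mid x) = h_0(t)\, \exp\!\bigl(\beta^\top x\bigr),
  \qquad
  \mathrm{HR}_j = \exp\!\bigl(\beta_j\, \sigma_j\bigr),
\]
\[
  \mathrm{HR}_j = 1.18 \;\Longleftrightarrow\; \beta_j\, \sigma_j = \ln 1.18 \approx 0.166 .
\]
```

So the reported value of 1.18 for on-road gasoline corresponds to an estimated 18% higher hazard per standard-deviation increase in that source's PM2.5 concentration, holding the other model terms fixed.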
19. Snap and Diagnose: An Advanced Multimodal Retrieval System for Identifying Plant Diseases in the Wild
- Author
Wei, Tianqi, Chen, Zhi, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Information Retrieval
- Abstract
Plant disease recognition is a critical task that ensures crop health and mitigates the damage caused by diseases. A handy tool that enables farmers to receive a diagnosis based on query pictures or the text description of suspicious plants is in high demand for initiating treatment before potential diseases spread further. In this paper, we develop a multimodal plant disease image retrieval system to support disease search based on either image or text prompts. Specifically, we utilize the largest in-the-wild plant disease dataset PlantWild, which includes over 18,000 images across 89 categories, to provide a comprehensive view of potential diseases relating to the query. Furthermore, cross-modal retrieval is achieved in the developed system, facilitated by a novel CLIP-based vision-language model that encodes both disease descriptions and disease images into the same latent space. Built on top of the retriever, our retrieval system allows users to upload either plant disease images or disease descriptions to retrieve the corresponding images with similar characteristics from the disease dataset to suggest candidate diseases for end users' consideration.
- Published
- 2024
20. InstantStyleGaussian: Efficient Art Style Transfer with 3D Gaussian Splatting
- Author
Yu, Xin-Yi, Yu, Jun-Xin, Zhou, Li-Bo, Wei, Yan, and Ou, Lin-Lin
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
We present InstantStyleGaussian, an innovative 3D style transfer method based on the 3D Gaussian Splatting (3DGS) scene representation. By inputting a target-style image, it quickly generates new 3D GS scenes. Our method operates on pre-reconstructed GS scenes, combining diffusion models with an improved iterative dataset update strategy. It utilizes diffusion models to generate target style images, adds these new images to the training dataset, and uses this dataset to iteratively update and optimize the GS scenes, significantly accelerating the style editing process while ensuring the quality of the generated scenes. Extensive experimental results demonstrate that our method ensures high-quality stylized scenes while offering significant advantages in style transfer speed and consistency.
- Published
- 2024
21. WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models
- Author
-
Gupta, Prannaya, Yau, Le Qi, Low, Hao Han, Lee, I-Shiang, Lim, Hugo Maximus, Teoh, Yu Xin, Koh, Jia Hng, Liew, Dar Win, Bhardwaj, Rishabh, Bhardwaj, Rajat, and Poria, Soujanya
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
WalledEval is a comprehensive AI safety testing toolkit designed to evaluate large language models (LLMs). It accommodates a diverse range of models, including both open-weight and API-based ones, and features over 35 safety benchmarks covering areas such as multilingual safety, exaggerated safety, and prompt injections. The framework supports both LLM and judge benchmarking and incorporates custom mutators to test safety against various text-style mutations, such as future tense and paraphrasing. Additionally, WalledEval introduces WalledGuard, a new, small, and performant content moderation tool, and two datasets: SGXSTest and HIXSTest, which serve as benchmarks for assessing the exaggerated safety of LLMs and judges in cultural contexts. We make WalledEval publicly available at https://github.com/walledai/walledeval., Comment: Under review
- Published
- 2024
22. Benchmarking In-the-wild Multimodal Disease Recognition and A Versatile Baseline
- Author
-
Wei, Tianqi, Chen, Zhi, Huang, Zi, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Existing plant disease classification models have achieved remarkable performance in recognizing in-laboratory diseased images. However, their performance often significantly degrades in classifying in-the-wild images. Furthermore, we observed that in-the-wild plant images may exhibit similar appearances across various diseases (i.e., small inter-class discrepancy) while the same diseases may look quite different (i.e., large intra-class variance). Motivated by this observation, we propose an in-the-wild multimodal plant disease recognition dataset that contains not only the largest number of disease classes but also text-based descriptions for each disease. Particularly, the newly provided text descriptions are introduced to provide rich information in textual modality and facilitate in-the-wild disease classification with small inter-class discrepancy and large intra-class variance issues. Therefore, our proposed dataset can be regarded as an ideal testbed for evaluating disease recognition methods in the real world. In addition, we further present a strong yet versatile baseline that models text descriptions and visual data through multiple prototypes for a given class. By fusing the contributions of multimodal prototypes in classification, our baseline can effectively address the small inter-class discrepancy and large intra-class variance issues. Remarkably, our baseline model can not only classify diseases but also recognize diseases in few-shot or training-free scenarios. Extensive benchmarking results demonstrate that our proposed in-the-wild multimodal dataset sets many new challenges to the plant disease recognition task and that there is ample room for improvement in future work.
- Published
- 2024
23. Chiral spin liquid in a generalized Kitaev honeycomb model with $\mathbb{Z}_4$ 1-form symmetry
- Author
-
Yang, Yu-Xin, Cheng, Meng, and Chen, Ji-Yao
- Subjects
Condensed Matter - Strongly Correlated Electrons ,Quantum Physics - Abstract
We explore a large $N$ generalization of the Kitaev model on the honeycomb lattice with a simple nearest-neighbor interacting Hamiltonian. In particular, we focus on the $\mathbb{Z}_4$ case with isotropic couplings, which is characterized by an exact $\mathbb{Z}_4$ one-form symmetry. Guided by symmetry considerations and an analytical study in the single chain limit, on the infinitely long cylinders, we find the model is gapped with an extremely short correlation length. Combined with the $\mathbb{Z}_4$ one-form symmetry, this suggests the model is topologically ordered. To pin down the nature of this phase, we further study the model on both finite and infinitely long strips, where we consistently find a $c=1$ conformal field theory (CFT) description, suggesting the existence of chiral edge modes described by a free boson CFT. Further evidence is found by studying the dimer correlators on infinitely long strips. We find the dimer correlation functions show a power-law decay with the exponent close to 2 on the boundary of the strip, while decaying much faster in the bulk. Combined with the topological entanglement entropy extracted from cylinder geometry, we identify the spin liquid as chiral and supporting a $\mathrm{U}(1)_{-8}$ chiral topological order. A unified perspective for all $\mathbb{Z}_N$ type Kitaev models is also discussed., Comment: 10 pages, 5 figures
- Published
- 2024
24. Robust High-frequency Laser Phase Noise Suppression by Adaptive Pound-Drever-Hall Feedforward
- Author
-
Chao, Yu-Xin, Hua, Zhen-Xing, Liang, Xin-Hui, Yue, Zong-Pei, Jia, Chen, You, Li, and Tey, Meng Khoon
- Subjects
Physics - Optics ,Physics - Atomic Physics - Abstract
Suppressing high-frequency laser phase noise, particularly at frequencies near and beyond typical feedback bandwidths of a few MHz, is a critical yet challenging task in many advanced applications. Feedforward-based methods generally outperform feedback in the high-frequency range, but their performance is more susceptible to perturbations. In this work, we focus on the Pound-Drever-Hall (PDH)-feedforward method we demonstrated recently [Yu-Xin Chao et al., Optica 11(7), 945-950 (2024)] and analyze the factors that affect its long-term stability. By constructing a simple circuit allowing for adaptive control of the feedforward gain in response to power fluctuations of cavity transmission, we demonstrate a robust $\geq 40$ dB suppression of laser phase noise around 2 MHz and a noise suppression bandwidth up to 50 MHz. In comparison, when using normal PDH feedback, robust noise suppression of over 40 dB can only occur for frequencies below tens of kHz in most setups. Our findings may pave the way for general usage of PDH feedforward and allow for simple construction of low-noise lasers for precise quantum controls and precision metrology.
- Published
- 2024
25. Collective optical properties of moiré excitons
- Author
-
Huang, Tsung-Sheng, Wang, Yu-Xin, Wang, Yan-Qi, Chang, Darrick, Hafezi, Mohammad, and Grankin, Andrey
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Materials Science ,Condensed Matter - Strongly Correlated Electrons ,Quantum Physics - Abstract
We propose that excitons in moiré transition metal dichalcogenide bilayers offer a promising platform for investigating collective radiative properties. While some of these optical properties resemble those of cold atom arrays, moiré excitons extend to the deep subwavelength limit, beyond the reach of current optical lattice experiments. Remarkably, we show that the collective optical properties can be exploited to probe certain correlated electron states. Specifically, we illustrate that the Wigner crystal states of electrons doped into these bilayers act as an emergent periodic potential for excitons. Moreover, the collective dissipative excitonic bands and their associated Berry curvature can reveal various charge orders that emerge at the corresponding electronic doping. Our study provides a promising pathway for future research on the interplay between collective effects and strong correlations involving moiré excitons.
- Published
- 2024
26. Direct observation of quantum vortex fractionalization in multiband superconductors
- Author
-
Zheng, Yu, Hu, Quanxin, Ji, Haijiao, Timoshuk, Igor, Xu, Hanxiang, Li, Yongwei, Gao, Ye, Yu, Xin, Wu, Rui, Lu, Xingye, Grinenko, Vadim, Babaev, Egor, Yuan, Noah F. Q., Lv, Baiqing, Yim, Chi-Ming, and Ding, Hong
- Subjects
Condensed Matter - Superconductivity - Abstract
A magnetic field is expelled from a superconductor unless it forms quantum vortices, each consisting of a core singularity with current circulating around it. The London quantization condition implies that there is one core singularity per quantum of magnetic flux in single-component superconductors, while in multiband materials fractional vortices are possible. Here, we report the first observation of quantum vortex core fractionalization on the potassium-terminated surface of multiband superconductor KFe2As2 by scanning tunneling microscopy. We observe splitting of an integer-flux vortex into several fractional vortices, leading to disparity between numbers of flux quanta and vortex cores. Our findings demonstrate that fractionalized core singularities are possible in a multiband superconductor, opening an avenue for new experimental platforms with quasiparticles with fractional statistics., Comment: 16 pages, 4 figures
- Published
- 2024
27. DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction
- Author
-
Du, Xiaobiao, Sun, Haiyang, Lu, Ming, Zhu, Tianqing, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Self-driving industries usually employ professional artists to build exquisite 3D cars. However, it is expensive to craft large-scale digital assets. Since there are already numerous datasets available that contain a vast number of images of cars, we focus on reconstructing high-quality 3D car models from these datasets. However, these datasets only contain one side of cars in the forward-moving scene. We try to use the existing generative models to provide more supervision information, but they struggle to generalize well to cars since they are trained on synthetic datasets that are not car-specific. In addition, the reconstructed 3D car texture is misaligned due to large errors in camera pose estimation when dealing with in-the-wild images. These restrictions make it challenging for previous methods to reconstruct complete 3D cars. To address these problems, we propose a novel method, named DreamCar, which can reconstruct high-quality 3D cars given a few images or even a single image. To generalize the generative model, we collect a car dataset, named Car360, with over 5,600 vehicles. With this dataset, we make the generative model more robust to cars. We use this generative prior specific to the car to guide its reconstruction via Score Distillation Sampling. To further complement the supervision information, we utilize the geometric and appearance symmetry of cars. Finally, we propose a pose optimization method that rectifies poses to tackle texture misalignment. Extensive experiments demonstrate that our method significantly outperforms existing methods in reconstructing high-quality 3D cars. Our code is available at https://xiaobiaodu.github.io/dreamcar-project/., Comment: Project Page: https://xiaobiaodu.github.io/dreamcar-project/
- Published
- 2024
28. Simulation study of performance of the Very Large Area gamma-ray Space Telescope
- Author
-
Pan, Xu, Jiang, Wei, Yue, Chuan, Lei, Shi-Jun, Cui, Yu-Xin, and Yuan, Qiang
- Subjects
Astrophysics - Instrumentation and Methods for Astrophysics ,Astrophysics - High Energy Astrophysical Phenomena ,High Energy Physics - Experiment ,High Energy Physics - Phenomenology ,Physics - Instrumentation and Detectors - Abstract
The Very Large Area gamma-ray Space Telescope (VLAST) is a mission concept proposed to detect gamma-ray photons through both the Compton scattering and electron-positron pair production mechanisms, enabling the detection of photons with energies ranging from MeV to TeV. This project aims to conduct a comprehensive survey of the gamma-ray sky from a low Earth orbit using an anti-coincidence detector, a tracker detector that also serves as a low energy calorimeter, and a high energy imaging calorimeter. We developed a Monte Carlo simulation application of the detector with the GEANT4 toolkit to evaluate the instrument performance, including the effective area, angular resolution and energy resolution, and explored specific optimizations of the detector configuration. Our simulation-based analysis indicates that the VLAST's current design is physically feasible, with an acceptance larger than 10 $\rm m^2\ sr$, which is four times larger than that of Fermi-LAT, an energy resolution better than 2% at 10 GeV, and an angular resolution better than 0.2 degrees at 10 GeV. The VLAST project is expected to make a significant contribution to the field of gamma-ray astronomy and to enhance our understanding of the cosmos., Comment: 15 pages, 16 figures; Nuclear Science and Techniques in press
- Published
- 2024
29. Affective Behaviour Analysis via Progressive Learning
- Author
-
Liu, Chen, Zhang, Wei, Qiu, Feng, Li, Lincheng, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Affective Behavior Analysis aims to develop emotionally intelligent technology that can recognize and respond to human emotions. To advance this, the 7th Affective Behavior Analysis in-the-wild (ABAW) competition establishes two tracks: the Multi-task Learning (MTL) Challenge and the Compound Expression (CE) Challenge, based on the Aff-Wild2 and C-EXPR-DB datasets. In this paper, we present our methods and experimental results for the two competition tracks. Specifically, our approach can be summarized in the following four aspects: 1) To attain high-quality facial features, we train a Masked Autoencoder in a self-supervised manner. 2) We devise a temporal convergence module to capture the temporal information between video frames and explore the impact of window size and sequence length on each sub-task. 3) To facilitate the joint optimization of various sub-tasks, we explore the impact of sub-task joint training and feature fusion from individual tasks on each task performance improvement. 4) We utilize curriculum learning to transition the model from recognizing single expressions to recognizing compound expressions, thereby improving the accuracy of compound expression recognition. Extensive experiments demonstrate the superiority of our designs., Comment: Technical Report for 7th ABAW Competition
- Published
- 2024
30. Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks
- Author
-
Feng, Pu, Liang, Junkang, Wang, Size, Yu, Xin, Ji, Xin, Chen, Yiting, Zhang, Kui, Shi, Rongye, and Wu, Wenjun
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Multiagent Systems ,Computer Science - Robotics - Abstract
In multi-agent reinforcement learning (MARL), the Centralized Training with Decentralized Execution (CTDE) framework is pivotal but struggles due to a gap: global state guidance in training versus reliance on local observations in execution, lacking global signals. Inspired by human societal consensus mechanisms, we introduce the Hierarchical Consensus-based Multi-Agent Reinforcement Learning (HC-MARL) framework to address this limitation. HC-MARL employs contrastive learning to foster a global consensus among agents, enabling cooperative behavior without direct communication. This approach enables agents to form a global consensus from local observations, using it as an additional piece of information to guide collaborative actions during execution. To cater to the dynamic requirements of various tasks, consensus is divided into multiple layers, encompassing both short-term and long-term considerations. Short-term observations prompt the creation of an immediate, low-layer consensus, while long-term observations contribute to the formation of a strategic, high-layer consensus. This process is further refined through an adaptive attention mechanism that dynamically adjusts the influence of each consensus layer. This mechanism optimizes the balance between immediate reactions and strategic planning, tailoring it to the specific demands of the task at hand. Extensive experiments and real-world applications in multi-robot systems showcase our framework's superior performance, marking significant advancements over baselines., Comment: 8 pages, 10 figures. Accepted for presentation at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)
- Published
- 2024
31. A quantitative analysis of Gravitational Wave spectrum sourced from First-Order Chiral Phase Transition of QCD
- Author
-
Zheng, Hui-wen, Gao, Fei, Bian, Ligong, Qin, Si-xue, and Liu, Yu-xin
- Subjects
High Energy Physics - Phenomenology ,Astrophysics - Cosmology and Nongalactic Astrophysics - Abstract
We investigate the cosmological first-order chiral phase transition of QCD, and for the first time calculate its parameters which can fully determine the gravitational wave spectrum. With the state-of-the-art calculation from the functional QCD method, we found that the large chemical potential of QCD phase transition results in very weak and fast first-order phase transitions at the temperature lower than $\mathcal{O}(10^2)$ MeV. These results further suggest that the GW signals of NANOGrav are very unlikely sourced from the chiral phase transition of QCD.
- Published
- 2024
32. DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection
- Author
-
Lim, Jia Syuen, Chen, Zhuoxiao, Baktashmotlagh, Mahsa, Chen, Zhi, Yu, Xin, Huang, Zi, and Luo, Yadan
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Class-agnostic object detection (OD) can be a cornerstone or a bottleneck for many downstream vision tasks. Despite considerable advancements in bottom-up and multi-object discovery methods that leverage basic visual cues to identify salient objects, consistently achieving a high recall rate remains difficult due to the diversity of object types and their contextual complexity. In this work, we investigate using vision-language models (VLMs) to enhance object detection via a self-supervised prompt learning strategy. Our initial findings indicate that manually crafted text queries often result in undetected objects, primarily because detection confidence diminishes when the query words exhibit semantic overlap. To address this, we propose a Dispersing Prompt Expansion (DiPEx) approach. DiPEx progressively learns to expand a set of distinct, non-overlapping hyperspherical prompts to enhance recall rates, thereby improving performance in downstream tasks such as out-of-distribution OD. Specifically, DiPEx initiates the process by self-training generic parent prompts and selecting the one with the highest semantic uncertainty for further expansion. The resulting child prompts are expected to inherit semantics from their parent prompts while capturing more fine-grained semantics. We apply dispersion losses to ensure high inter-class discrepancy among child prompts while preserving semantic consistency between parent-child prompt pairs. To prevent excessive growth of the prompt sets, we utilize the maximum angular coverage (MAC) of the semantic space as a criterion for early termination. We demonstrate the effectiveness of DiPEx through extensive class-agnostic OD and OOD-OD experiments on MS-COCO and LVIS, surpassing other prompting methods by up to 20.1% in AR and achieving a 21.3% AP improvement over SAM. The code is available at https://github.com/jason-lim26/DiPEx., Comment: 19 pages
- Published
- 2024
33. Enhancing Single-Slice Segmentation with 3D-to-2D Unpaired Scan Distillation
- Author
-
Yu, Xin, Yang, Qi, Liu, Han, Lee, Ho Hin, Tang, Yucheng, Remedios, Lucas W., Kim, Michael E., Zhang, Rendong, Bao, Shunxing, Huo, Yuankai, Moore, Ann Zenobia, Ferrucci, Luigi, and Landman, Bennett A.
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science - Computer Vision and Pattern Recognition - Abstract
2D single-slice abdominal computed tomography (CT) enables the assessment of body habitus and organ health with low radiation exposure. However, single-slice data necessitates the use of 2D networks for segmentation, but these networks often struggle to capture contextual information effectively. Consequently, even when trained on identical datasets, 3D networks typically achieve superior segmentation results. In this work, we propose a novel 3D-to-2D distillation framework, leveraging pre-trained 3D models to enhance 2D single-slice segmentation. Specifically, we extract the prediction distribution centroid from the 3D representations, to guide the 2D student by learning intra- and inter-class correlation. Unlike traditional knowledge distillation methods that require the same data input, our approach employs unpaired 3D CT scans with any contrast to guide the 2D student model. Experiments conducted on 707 subjects from the single-slice Baltimore Longitudinal Study of Aging (BLSA) dataset demonstrate that state-of-the-art 2D multi-organ segmentation methods can benefit from the 3D teacher model, achieving enhanced performance in single-slice multi-organ segmentation. Notably, our approach demonstrates considerable efficacy in low-data regimes, outperforming the model trained with all available training subjects even when utilizing only 200 training subjects. Thus, this work underscores the potential to alleviate manual annotation burdens.
- Published
- 2024
34. RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation
- Author
-
Wang, Shuting, Yu, Xin, Wang, Mang, Chen, Weipeng, Zhu, Yutao, and Dou, Zhicheng
- Subjects
Computer Science - Computation and Language - Abstract
Retrieval-augmented generation (RAG) effectively addresses issues of static knowledge and hallucination in large language models. Existing studies mostly focus on question scenarios with clear user intents and concise answers. However, it is prevalent that users issue broad, open-ended queries with diverse sub-intents, for which they desire rich and long-form answers covering multiple relevant aspects. To tackle this important yet underexplored problem, we propose a novel RAG framework, namely RichRAG. It includes a sub-aspect explorer to identify potential sub-aspects of input questions, a multi-faceted retriever to build a candidate pool of diverse external documents related to these sub-aspects, and a generative list-wise ranker, which is a key module to provide the top-k most valuable documents for the final generator. These ranked documents sufficiently cover various query aspects and are aware of the generator's preferences, hence incentivizing it to produce rich and comprehensive responses for users. The training of our ranker involves a supervised fine-tuning stage to ensure the basic coverage of documents, and a reinforcement learning stage to align downstream LLM's preferences to the ranking of documents. Experimental results on two publicly available datasets prove that our framework effectively and efficiently provides comprehensive and satisfying responses to users.
- Published
- 2024
35. 3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views
- Author
-
Du, Xiaobiao, Sun, Haiyang, Wang, Shuyun, Wu, Zhuojie, Sheng, Hongwei, Ying, Jiaying, Lu, Ming, Zhu, Tianqing, Zhan, Kun, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
3D cars are commonly used in self-driving systems, virtual/augmented reality, and games. However, existing 3D car datasets are either synthetic or low-quality, presenting a significant gap toward the high-quality real-world 3D car datasets and limiting their applications in practical scenarios. In this paper, we propose the first large-scale 3D real car dataset, termed 3DRealCar, offering three distinctive features. (1) High-Volume: 2,500 cars are meticulously scanned by 3D scanners, obtaining car images and point clouds with real-world dimensions; (2) High-Quality: Each car is captured in an average of 200 dense, high-resolution 360-degree RGB-D views, enabling high-fidelity 3D reconstruction; (3) High-Diversity: The dataset contains various cars from over 100 brands, collected under three distinct lighting conditions, including reflective, standard, and dark. Additionally, we offer detailed car parsing maps for each instance to promote research in car parsing tasks. Moreover, we remove background point clouds and standardize the car orientation to a unified axis for the reconstruction only on cars without background and controllable rendering. We benchmark 3D reconstruction results with state-of-the-art methods across each lighting condition in 3DRealCar. Extensive experiments demonstrate that the standard lighting condition part of 3DRealCar can be used to produce a large number of high-quality 3D cars, improving various 2D and 3D tasks related to cars. Notably, our dataset brings insight into the fact that recent 3D reconstruction methods face challenges in reconstructing high-quality 3D cars under reflective and dark lighting conditions. Our dataset is available at https://xiaobiaodu.github.io/3drealcar/., Comment: Project Page: https://xiaobiaodu.github.io/3drealcar
- Published
- 2024
36. Simulation of DAMPE silicon microstrip detectors in the $\rm Allpix^{2}$ framework
- Author
-
Cui, Yu-Xin, Li, Xiang, Wang, Shen, Yue, Chuan, Wan, Qiang, Lei, Shi-Jun, Yuan, Guan-Wen, Hu, Yi-Ming, Wei, Jia-Ju, and Guo, Jian-Hua
- Subjects
Physics - Instrumentation and Detectors ,High Energy Physics - Experiment - Abstract
Silicon strip detectors have been widely utilized in space experiments for gamma-ray and cosmic-ray detections thanks to their high spatial resolution and stable performance. For a silicon micro-strip detector, the Monte Carlo simulation is recognized as a practical and cost-effective approach to verify the detector performance. In this study, a technique for the simulation of the silicon micro-strip detector with the $\rm Allpix^{2}$ framework is developed. By incorporating the electric field into the particle transport simulation based on Geant4, this framework could precisely emulate the carrier drift in the silicon micro-strip detector. The simulation results are validated using the beam test data as well as the flight data of the DAMPE experiment, which suggests that the $\rm Allpix^{2}$ framework is a powerful tool to obtain the performance of the silicon micro-strip detector.
- Published
- 2024
37. Neurostructural subgroup in 4291 individuals with schizophrenia identified using the subtype and stage inference algorithm.
- Author
-
Jiang, Yuchao, Luo, Cheng, Wang, Jijun, Palaniyappan, Lena, Chang, Xiao, Xiang, Shitong, Zhang, Jie, Duan, Mingjun, Huang, Huan, Gaser, Christian, Nemoto, Kiyotaka, Miura, Kenichiro, Hashimoto, Ryota, Westlye, Lars, Richard, Genevieve, Fernandez-Cabello, Sara, Parker, Nadine, Andreassen, Ole, Kircher, Tilo, Nenadić, Igor, Stein, Frederike, Thomas-Odenthal, Florian, Teutenberg, Lea, Usemann, Paula, Dannlowski, Udo, Hahn, Tim, Grotegerd, Dominik, Meinert, Susanne, Lencer, Rebekka, Tang, Yingying, Zhang, Tianhong, Li, Chunbo, Yue, Weihua, Zhang, Yuyanan, Yu, Xin, Zhou, Enpeng, Lin, Ching-Po, Tsai, Shih-Jen, Rodrigue, Amanda, Glahn, David, Pearlson, Godfrey, Blangero, John, Karuk, Andriana, Pomarol-Clotet, Edith, Salvador, Raymond, Fuentes-Claramonte, Paola, Garcia-León, María, Spalletta, Gianfranco, Piras, Fabrizio, Vecchio, Daniela, Banaj, Nerisa, Cheng, Jingliang, Liu, Zhening, Yang, Jie, Gonul, Ali, Uslu, Ozgul, Burhanoglu, Birce, Uyar Demir, Aslihan, Rootes-Murdy, Kelly, Calhoun, Vince, Sim, Kang, Green, Melissa, Quidé, Yann, Chung, Young, Kim, Woo-Sung, Sponheim, Scott, Demro, Caroline, Ramsay, Ian, Iasevoli, Felice, de Bartolomeis, Andrea, Barone, Annarita, Ciccarelli, Mariateresa, Brunetti, Arturo, Cocozza, Sirio, Pontillo, Giuseppe, Tranfa, Mario, Park, Min, Kirschner, Matthias, Georgiadis, Foivos, Kaiser, Stefan, Van Rheenen, Tamsyn, Rossell, Susan, Hughes, Matthew, Woods, William, Carruthers, Sean, Sumner, Philip, Ringin, Elysha, Spaniel, Filip, Skoch, Antonin, Tomecek, David, Homan, Philipp, Homan, Stephanie, Omlor, Wolfgang, Cecere, Giacomo, Nguyen, Dana, Preda, Adrian, Thomopoulos, Sophia, Jahanshad, Neda, Cui, Long-Biao, and Yao, Dezhong
- Subjects
Humans ,Schizophrenia ,Male ,Female ,Adult ,Algorithms ,Magnetic Resonance Imaging ,Gray Matter ,Machine Learning ,Middle Aged ,Brain ,Cross-Sectional Studies ,Europe ,Neuroimaging ,Reproducibility of Results ,North America ,Hippocampus - Abstract
Machine learning can be used to define subtypes of psychiatric conditions based on shared biological foundations of mental disorders. Here we analyzed cross-sectional brain images from 4,222 individuals with schizophrenia and 7038 healthy subjects pooled across 41 international cohorts from the ENIGMA, non-ENIGMA cohorts and public datasets. Using the Subtype and Stage Inference (SuStaIn) algorithm, we identify two distinct neurostructural subgroups by mapping the spatial and temporal trajectory of gray matter change in schizophrenia. Subgroup 1 was characterized by an early cortical-predominant loss with enlarged striatum, whereas subgroup 2 displayed an early subcortical-predominant loss in the hippocampus, striatum and other subcortical regions. We confirmed the reproducibility of the two neurostructural subtypes across various sample sites, including Europe, North America and East Asia. This imaging-based taxonomy holds the potential to identify individuals with shared neurobiological attributes, thereby suggesting the viability of redefining existing disorder constructs based on biological factors.
- Published
- 2024
38. QGait: Toward Accurate Quantization for Gait Recognition with Binarized Input
- Author
-
Tian, Senmao, Gao, Haoyu, Hong, Gangyi, Wang, Shuyun, Wang, JingJie, Yu, Xin, and Zhang, Shunli
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Existing deep learning methods have made significant progress in gait recognition. Typically, appearance-based models binarize inputs into silhouette sequences. However, mainstream quantization methods prioritize minimizing task loss over quantization error, which is detrimental to gait recognition with binarized inputs. Minor variations in silhouette sequences can be diminished in the network's intermediate layers due to the accumulation of quantization errors. To address this, we propose a differentiable soft quantizer, which better simulates the gradient of the round function during backpropagation. This enables the network to learn from subtle input perturbations. However, our theoretical analysis and empirical studies reveal that directly applying the soft quantizer can hinder network convergence. We further refine the training strategy to ensure convergence while simulating quantization errors. Additionally, we visualize the distribution of outputs from different samples in the feature space and observe significant changes compared to the full precision network, which harms performance. Based on this, we propose an Inter-class Distance-guided Distillation (IDD) strategy to preserve the relative distance between the embeddings of samples with different labels. Extensive experiments validate the effectiveness of our approach, demonstrating state-of-the-art accuracy across various settings and datasets. The code will be made publicly available.
- Published
- 2024
39. Refining radiative decay studies in singly heavy baryons
- Author
-
Peng, Yu-Xin, Luo, Si-Qiang, and Liu, Xiang
- Subjects
High Energy Physics - Phenomenology ,High Energy Physics - Experiment - Abstract
In this work, we systematically study the radiative decays of singly heavy baryons, a crucial aspect of their spectroscopic behavior. To enhance the accuracy of our calculations, we utilize numerical spatial wave functions for the singly heavy baryons obtained through the Gaussian expansion method, which also yields their mass spectrum. As hadron spectroscopy enters an era of high precision, we believe our study of the radiative decays of singly heavy baryons will provide valuable insights for further exploration of these particles., Comment: 25 pages, 1 figure, 14 tables. Published version in Phys. Rev. D
- Published
- 2024
40. 3D MR Fingerprinting for Dynamic Contrast-Enhanced Imaging of Whole Mouse Brain
- Author
-
Zhu, Yuran, Wang, Guanhua, Gu, Yuning, Zhao, Walter, Lu, Jiahao, Zhu, Junqing, MacAskill, Christina J., Dupuis, Andrew, Griswold, Mark A., Ma, Dan, Flask, Chris A., and Yu, Xin
- Subjects
Quantitative Biology - Quantitative Methods - Abstract
Quantitative MRI enables direct quantification of contrast agent concentrations in contrast-enhanced scans. However, the lengthy scan times required by conventional methods are inadequate for tracking contrast agent transport dynamically in mouse brain. We developed a 3D MR fingerprinting (MRF) method for simultaneous T1 and T2 mapping across the whole mouse brain with 4.3-min temporal resolution. We designed a 3D MRF sequence with variable acquisition segment lengths and magnetization preparations on a 9.4T preclinical MRI scanner. Model-based reconstruction approaches were employed to improve the accuracy and speed of MRF acquisition. The method's accuracy for T1 and T2 measurements was validated in vitro, while its repeatability of T1 and T2 measurements was evaluated in vivo (n=3). The utility of the 3D MRF sequence for dynamic tracking of intracisternally infused Gd-DTPA in the whole mouse brain was demonstrated (n=5). Phantom studies confirmed accurate T1 and T2 measurements by 3D MRF with an undersampling factor up to 48. Dynamic contrast-enhanced (DCE) MRF scans achieved a spatial resolution of 192 × 192 × 500 μm³ and a temporal resolution of 4.3 min, allowing for the analysis and comparison of dynamic changes in concentration and transport kinetics of intracisternally infused Gd-DTPA across brain regions. The sequence also enabled highly repeatable, high-resolution T1 and T2 mapping of the whole mouse brain (192 × 192 × 250 μm³) in 30 min. We present the first dynamic and multi-parametric approach for quantitatively tracking contrast agent transport in the mouse brain using 3D MRF.
- Published
- 2024
41. A Comprehensive Survey and Taxonomy on Point Cloud Registration Based on Deep Learning
- Author
-
Zhang, Yu-Xin, Gui, Jie, Cong, Xiaofeng, Gong, Xin, and Tao, Wenbing
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Point cloud registration (PCR) involves determining a rigid transformation that aligns one point cloud to another. Despite the plethora of outstanding deep learning (DL)-based registration methods proposed, comprehensive and systematic studies on DL-based PCR techniques are still lacking. In this paper, we present a comprehensive survey and taxonomy of recently proposed PCR methods. Firstly, we conduct a taxonomy of commonly utilized datasets and evaluation metrics. Secondly, we classify the existing research into two main categories: supervised and unsupervised registration, providing insights into the core concepts of various influential PCR models. Finally, we highlight open challenges and potential directions for future research. A curated collection of valuable resources is made available at https://github.com/yxzhang15/PCR., Comment: This paper is accepted by IJCAI 2024
- Published
- 2024
42. Machine Unlearning via Null Space Calibration
- Author
-
Chen, Huiqiang, Zhu, Tianqing, Yu, Xin, and Zhou, Wanlei
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence - Abstract
Machine unlearning aims to enable models to forget specific data instances when receiving deletion requests. Current research centres on efficient unlearning to erase the influence of data from the model and neglects the subsequent impacts on the remaining data. Consequently, existing unlearning algorithms degrade the model's performance after unlearning, known as over-unlearning. This paper addresses this critical yet under-explored issue by introducing machine Unlearning via Null Space Calibration (UNSC), which can accurately unlearn target samples without over-unlearning. On the contrary, by calibrating the decision space during unlearning, UNSC can significantly improve the model's performance on the remaining samples. In particular, our approach hinges on confining the unlearning process to a specified null space tailored to the remaining samples, which is augmented by strategically pseudo-labeling the unlearning samples. Comparative analyses against several established baselines affirm the superiority of our approach. Code is released at https://github.com/HQC-ML/Machine-Unlearning-via-Null-Space-Calibration., Comment: Accepted by IJCAI-2024
- Published
- 2024
43. Average Rényi Entanglement Entropy in Gaussian Boson Sampling
- Author
-
Youm, Jason, Iosue, Joseph T., Ehrenberg, Adam, Wang, Yu-Xin, and Gorshkov, Alexey V.
- Subjects
Quantum Physics - Abstract
Recently, many experiments have been conducted with the goal of demonstrating a quantum advantage over classical computation. One popular framework for these experiments is Gaussian Boson Sampling, where quadratic photonic input states are interfered via a linear optical unitary and subsequently measured in the Fock basis. In this work, we study the modal entanglement of the output states in this framework just before the measurement stage. Specifically, we compute Page curves as measured by various Rényi-$\alpha$ entropies, where the Page curve describes the entanglement between two partitioned groups of output modes averaged over all linear optical unitaries. We derive these formulas for $\alpha = 1$ (i.e. the von Neumann entropy), and, more generally, for all positive integer $\alpha$, in the asymptotic limit of an infinite number of modes and for input states that are composed of single-mode squeezed vacuum states with equal squeezing strength. We then analyze the limiting behaviors when the squeezing is small and large. Having determined the averages, we then explicitly calculate the Rényi-$\alpha$ variance for integers $\alpha > 1$, and we are able to show that these entropies are weakly typical., Comment: 7+11 pages, 1+2 figures
- Published
- 2024
44. A novel quark pairing in sQGP induced by the non-Abelian feature of the interaction
- Author
-
Gao, Fei, Lu, Yi, and Liu, Yu-Xin
- Subjects
High Energy Physics - Phenomenology ,High Energy Physics - Theory ,Nuclear Theory - Abstract
We solve the coupled Dyson-Schwinger equations for the quark propagator and the quark-gluon vertex in the Nambu-Gorkov basis, which is widely applied to study color superconductivity. After considering the non-Abelian feature in the off-diagonal part of the quark-gluon vertex, we obtain a quark pairing gap in the chiral limit above the chiral phase transition temperature $T_c$. The gap persists up to $2-3\,T_c$ and vanishes at higher temperature. Such a quark pairing characterizes the strongly coupled quark-gluon plasma phase as a new phase, distinct from the phase with quasi quarks and gluons. Its new features can be disclosed in heavy-ion collision experiments., Comment: 8 pages, 5 figures
- Published
- 2024
45. ICE: Interactive 3D Game Character Editing via Dialogue
- Author
-
Wu, Haoqian, Zhao, Minda, Hu, Zhipeng, Li, Lincheng, Chen, Weijie, Zhao, Rui, Fan, Changjie, and Yu, Xin
- Subjects
Computer Science - Multimedia ,Computer Science - Human-Computer Interaction - Abstract
Most recent popular Role-Playing Games (RPGs) allow players to create in-game characters with hundreds of adjustable parameters, including bone positions and various makeup options. Although text-driven auto-customization systems have been developed to simplify the complex process of adjusting these intricate character parameters, they are limited by their single-round generation and lack the capability for further editing and fine-tuning. In this paper, we propose an Interactive Character Editing framework (ICE) to achieve a multi-round dialogue-based refinement process. In a nutshell, our ICE offers a more user-friendly way to enable players to convey creative ideas iteratively while ensuring that created characters align with the expectations of players. Specifically, we propose an Instruction Parsing Module (IPM) that utilizes large language models (LLMs) to parse multi-round dialogues into clear editing instruction prompts in each round. To reliably and swiftly modify character control parameters at a fine-grained level, we propose a Semantic-guided Low-dimension Parameter Solver (SLPS) that edits character control parameters according to prompts in a zero-shot manner. Our SLPS first localizes the character control parameters related to the fine-grained modification, and then optimizes the corresponding parameters in a low-dimension space to avoid unrealistic results. Extensive experimental results demonstrate the effectiveness of our proposed ICE for in-game character creation and the superior editing performance of ICE.
- Published
- 2024
46. Affective Behaviour Analysis via Integrating Multi-Modal Knowledge
- Author
-
Zhang, Wei, Qiu, Feng, Liu, Chen, Li, Lincheng, Du, Heming, Guo, Tiancheng, and Yu, Xin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Affective Behavior Analysis aims to make technology emotionally smart, creating a world where devices can understand and react to our emotions as humans do. To comprehensively evaluate the authenticity and applicability of emotional behavior analysis techniques in natural environments, the 6th competition on Affective Behavior Analysis in-the-wild (ABAW) utilizes the Aff-Wild2, Hume-Vidmimic2, and C-EXPR-DB datasets to set up five competitive tracks, i.e., Valence-Arousal (VA) Estimation, Expression (EXPR) Recognition, Action Unit (AU) Detection, Compound Expression (CE) Recognition, and Emotional Mimicry Intensity (EMI) Estimation. In this paper, we present our method designs for the five tasks. Specifically, our design mainly includes three aspects: 1) Utilizing a transformer-based feature fusion module to fully integrate emotional information provided by audio signals, visual images, and transcripts, offering high-quality expression features for the downstream tasks. 2) To achieve high-quality facial feature representations, we employ a Masked Autoencoder as the visual feature extraction model and fine-tune it with our facial dataset. 3) Considering the complexity of the video collection scenes, we conduct a more detailed dataset division based on scene characteristics and train a classifier for each scene. Extensive experiments demonstrate the superiority of our designs., Comment: 11 pages, 1 figure
- Published
- 2024
47. Exploring the Nuclear Shape Phase Transition in Ultra-Relativistic $^{129}$Xe+$^{129}$Xe Collisions at the LHC
- Author
-
Zhao, Shujun, Xu, Hao-jie, Zhou, You, Liu, Yu-Xin, and Song, Huichao
- Subjects
Nuclear Theory ,High Energy Physics - Phenomenology ,Nuclear Experiment - Abstract
The shape phase transition for certain isotope or isotone chains, associated with the quantum phase transition of finite nuclei, is an intriguing phenomenon in nuclear physics. A notable case is the Xe isotope chain, where the structure transits from a $\gamma$-soft rotor to a spherical vibrator, with the second-order shape phase transition occurring in the vicinity of $^{128-130}$Xe. In this letter, we focus on investigating the $\gamma$-soft deformation of $^{129}$Xe associated with the second-order shape phase transition by constructing novel correlators for ultra-relativistic $^{129}$Xe+$^{129}$Xe collisions. In particular, our iEBE-VISHNU model calculations show that the $v_2^2-[p_T]$ correlation $\rho_{2}$ and the mean transverse momentum fluctuation $\Gamma_{p_T}$, which were previously interpreted as the evidence for the rigid triaxial deformation of $^{129}$Xe, can also be well explained by the $\gamma$-soft deformation of $^{129}$Xe. We also propose two novel correlators $\rho_{4,2}$ and $\rho_{2,4}$, which carry non-trivial higher-order correlations and show unique capabilities to distinguish between the $\gamma$-soft and the rigid triaxial deformation of $^{129}$Xe in $^{129}$Xe+$^{129}$Xe collisions at the LHC. The present study also provides a novel way to explore the second-order shape phase transition of finite nuclei with ultra-relativistic heavy ion collisions.
- Published
- 2024
48. Impact of (magneto-)thermoelectric effect on diffusion of conserved charges in hot and dense hadronic matter
- Author
-
Zhang, He-Xia, Shen, Ke-Ming, Xiao, Yu-Xin, and Zhang, Ben-Wei
- Subjects
Nuclear Theory ,High Energy Physics - Phenomenology ,High Energy Physics - Theory - Abstract
We investigate the thermoelectric effect, which describes the generation of an electric field induced by temperature and conserved charge chemical potential gradients, in the hot and dense hadronic matter created in heavy-ion collisions. Utilizing the Boltzmann kinetic theory within the repulsive mean-field hadron resonance gas model, we evaluate both the diffusion thermopower matrix and diffusion coefficient matrix for the baryon number ($B$), electric charge ($Q$), and strangeness ($S$). The Landau-Lifshitz choice for the rest frame of the fluid is enforced in the derivation. We find that the thermoelectric effect hinders the diffusion processes of multiple conserved charges, particularly reducing the coupling between electric charge and baryon number (strangeness) in baryon (strangeness) diffusion. Given that the repulsive mean-field interactions between hadrons have a significant effect on the diffusion thermopower matrix and diffusion coefficient matrix in the baryon-rich region, we extend the investigation to include the impact of magnetic fields, analyzing the magneto-thermoelectric effect on both the diffusion coefficient matrix and the Hall-like diffusion coefficient matrix. The sensitivities of the magnetic field-dependent diffusion thermopower matrix and magneto-thermoelectric modified diffusion coefficient matrix to the choices of various transverse conditions are also studied., Comment: 21 pages, 12 figures, Version accepted by Phys. Rev. D
- Published
- 2024
49. Progress in High Intensity Focused Ultrasound Ablation for Fertility Preservation Therapy of Uterine Fibroids and Adenomyosis
- Author
-
Zhang, Guorui, Li, Lei, Sun, Mengyuan, and Yu, Xin
- Published
- 2024
50. Recent advances in silver-mediated/catalyzed synthesis of trifluoromethoxy compounds
- Author
-
Altaf, Muhammad Bilal, Luan, Yu-Xin, and Tang, Pingping
- Published
- 2024