2,187 results for "Zhou, Xinyu"
Search Results
2. HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation
- Author
-
Zhou, Xinyu, Fan, Simin, and Jaggi, Martin
- Subjects
Computer Science - Machine Learning ,Statistics - Machine Learning - Abstract
Influence functions provide a principled method to assess the contribution of individual training samples to a specific target. Yet, their high computational costs limit their application to large-scale models and datasets. Existing methods proposed for influence function approximation have significantly reduced the computational overhead, but they mostly suffer from inaccurate estimation because the underlying algorithms lack strong convergence guarantees. The family of hyperpower methods is well-known for rigorous convergence guarantees on matrix inverse approximation, but the required matrix multiplications can incur intractable memory and computation costs on large-scale models. We propose HyperINF, an efficient and accurate influence function approximation method which leverages the hyperpower method, specifically Schulz's iterative algorithm. To deal with the computation-intensive matrix multiplication, we incorporate the generalized Fisher information matrix (GFIM) as a low-rank approximation of the Hessian matrix, which reduces the memory and computation overheads to constant costs independent of rank on LoRA-tuned models. We first demonstrate the superior accuracy and stability of HyperINF compared to other baselines through a synthetic convergence simulation for matrix inversion. We further validate the efficacy of HyperINF through extensive real-world data attribution tasks, including mislabeled data detection and data selection for LLM and VLM fine-tuning. On LoRA-tuned models, HyperINF achieves superior downstream performance with minimal memory and computational overhead, while other baselines suffer from significant degradation. Our codebase is available at https://github.com/Blackzxy/HyperINF. (A brief illustrative sketch of the Schulz iteration follows this entry.)
- Published
- 2024
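Illustrative sketch for entry 2: HyperINF builds on Schulz's hyperpower iteration for matrix inversion. The snippet below is a minimal NumPy sketch of the generic Newton-Schulz update X_{k+1} = X_k(2I - A X_k) only; it does not implement HyperINF's GFIM-based low-rank variant, and the matrix sizes are arbitrary toy values.

```python
import numpy as np

def schulz_inverse(A, num_iters=30):
    """Approximate A^{-1} with the Schulz (Newton-Schulz) iteration.
    Converges quadratically once ||I - A @ X0|| < 1; the initialization
    X0 = A.T / (||A||_1 * ||A||_inf) is a standard safe choice."""
    n = A.shape[0]
    I = np.eye(n)
    X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    for _ in range(num_iters):
        X = X @ (2 * I - A @ X)  # Schulz update
    return X

# Toy usage: invert a well-conditioned symmetric matrix (e.g., a damped Hessian).
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)
X = schulz_inverse(A)
print(np.linalg.norm(X @ A - np.eye(50)))  # close to zero after convergence
```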
3. X-Prompt: Multi-modal Visual Prompt for Video Object Segmentation
- Author
-
Guo, Pinxue, Li, Wanyun, Huang, Hao, Hong, Lingyi, Zhou, Xinyu, Chen, Zhaoyu, Li, Jinglun, Jiang, Kaixun, Zhang, Wei, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Multi-modal Video Object Segmentation (VOS), including RGB-Thermal, RGB-Depth, and RGB-Event, has garnered attention due to its capability to address challenging scenarios where traditional VOS methods struggle, such as extreme illumination, rapid motion, and background distraction. Existing approaches often involve designing specific additional branches and performing full-parameter fine-tuning for fusion in each task. However, this paradigm not only duplicates research efforts and hardware costs but also risks model collapse with the limited multi-modal annotated data. In this paper, we propose a universal framework named X-Prompt for all multi-modal video object segmentation tasks, designated as RGB+X. The X-Prompt framework first pre-trains a video object segmentation foundation model using RGB data, and then utilizes the additional modality as a prompt to adapt it to downstream multi-modal tasks with limited data. Within the X-Prompt framework, we introduce the Multi-modal Visual Prompter (MVP), which allows prompting the foundation model with various modalities to segment objects precisely. We further propose the Multi-modal Adaptation Experts (MAEs) to adapt the foundation model with pluggable modality-specific knowledge without compromising the generalization capacity. To evaluate the effectiveness of the X-Prompt framework, we conduct extensive experiments on 3 tasks across 4 benchmarks. The proposed universal X-Prompt framework consistently outperforms the full fine-tuning paradigm and achieves state-of-the-art performance. Code: https://github.com/PinxueGuo/X-Prompt.git, Comment: ACMMM'2024
- Published
- 2024
4. General Compression Framework for Efficient Transformer Object Tracking
- Author
-
Hong, Lingyi, Li, Jinglun, Zhou, Xinyu, Yan, Shilin, Guo, Pinxue, Jiang, Kaixun, Chen, Zhaoyu, Gao, Shuyong, Zhang, Wei, Lu, Hong, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Transformer-based trackers have established a dominant role in the field of visual object tracking. While these trackers exhibit promising performance, their deployment on resource-constrained devices remains challenging due to inefficiencies. To improve the inference efficiency and reduce the computation cost, prior approaches have aimed to either design lightweight trackers or distill knowledge from larger teacher models into more compact student trackers. However, these solutions often sacrifice accuracy for speed. Thus, we propose a general model compression framework for efficient transformer object tracking, named CompressTracker, to reduce the size of a pre-trained tracking model into a lightweight tracker with minimal performance degradation. Our approach features a novel stage division strategy that segments the transformer layers of the teacher model into distinct stages, enabling the student model to emulate each corresponding teacher stage more effectively. Additionally, we design a unique replacement training technique that involves randomly substituting specific stages in the student model with those from the teacher model, as opposed to training the student model in isolation. Replacement training enhances the student model's ability to replicate the teacher model's behavior. To further force the student model to emulate the teacher model, we incorporate prediction guidance and stage-wise feature mimicking to provide additional supervision during the compression process. Our framework CompressTracker is structurally agnostic, making it compatible with any transformer architecture. We conduct a series of experiments to verify the effectiveness and generalizability of CompressTracker. Our CompressTracker-4 with 4 transformer layers, which is compressed from OSTrack, retains about 96% of the original performance on LaSOT (66.1% AUC) while achieving a 2.17x speed-up.
- Published
- 2024
5. Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability
- Author
-
Duan, Xufeng, Zhou, Xinyu, Xiao, Bei, and Cai, Zhenguang G.
- Subjects
Computer Science - Computation and Language - Abstract
As large language models (LLMs) advance in their linguistic capacity, understanding how they capture aspects of language competence remains a significant challenge. This study therefore employs psycholinguistic paradigms, which are well-suited for probing deeper cognitive aspects of language processing, to explore neuron-level representations in language models across three tasks: sound-shape association, sound-gender association, and implicit causality. Our findings indicate that while GPT-2-XL struggles with the sound-shape task, it demonstrates human-like abilities in both sound-gender association and implicit causality. Targeted neuron ablation and activation manipulation reveal a crucial relationship: when GPT-2-XL displays a linguistic ability, specific neurons correspond to that competence; conversely, the absence of such an ability indicates a lack of specialized neurons. This study is the first to utilize psycholinguistic experiments to investigate deep language competence at the neuron level, providing a new level of granularity in model interpretability and insights into the internal mechanisms driving language ability in transformer-based LLMs.
- Published
- 2024
6. Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models
- Author
-
Zhou, Xinyu, Chen, Delong, Cahyawijaya, Samuel, Duan, Xufeng, and Cai, Zhenguang G.
- Subjects
Computer Science - Computation and Language - Abstract
We introduce a novel analysis that leverages linguistic minimal pairs to probe the internal linguistic representations of Large Language Models (LLMs). By measuring the similarity between LLM activation differences across minimal pairs, we quantify linguistic similarity and gain insight into the linguistic knowledge captured by LLMs. Our large-scale experiments, spanning 100+ LLMs and 150k minimal pairs in three languages, reveal properties of linguistic similarity from four key aspects: consistency across LLMs, relation to theoretical categorizations, dependence on semantic context, and cross-lingual alignment of relevant phenomena. Our findings suggest that 1) linguistic similarity is significantly influenced by training data exposure, leading to higher cross-LLM agreement in higher-resource languages. 2) Linguistic similarity strongly aligns with fine-grained theoretical linguistic categories but weakly with broader ones. 3) Linguistic similarity shows a weak correlation with semantic similarity, showing its context-dependent nature. 4) LLMs exhibit limited cross-lingual alignment in their understanding of relevant linguistic phenomena. This work demonstrates the potential of minimal pairs as a window into the neural representations of language in LLMs, shedding light on the relationship between LLMs and linguistic theory., Comment: Codes and data are available at https://github.com/ChenDelong1999/Linguistic-Similarity (A simplified illustrative sketch of the activation-difference comparison follows this entry.)
- Published
- 2024
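Illustrative sketch for entry 6: the analysis compares LLM activation differences across minimal pairs. The following is a hypothetical, simplified illustration (not the authors' pipeline): it mean-pools hidden states from Hugging Face transformers for two minimal pairs and compares the two difference vectors with cosine similarity. The model choice, pooling, and layer are assumptions made only for the example.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

def sentence_vec(text, layer=-1):
    """Mean-pooled hidden state of one sentence from a chosen layer."""
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[layer].mean(dim=1).squeeze(0)

def pair_difference(acceptable, unacceptable):
    """Activation-difference vector for one minimal pair."""
    return sentence_vec(acceptable) - sentence_vec(unacceptable)

# Two minimal pairs probing the same phenomenon (subject-verb agreement).
d1 = pair_difference("The keys are on the table.", "The keys is on the table.")
d2 = pair_difference("The dogs bark loudly.", "The dogs barks loudly.")

similarity = torch.nn.functional.cosine_similarity(d1, d2, dim=0)
print(f"linguistic similarity (cosine): {similarity.item():.3f}")
```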
7. The Effects of Unilateral Slope Loading on Lower Limb Plantar Flexor Muscle EMG Signals in Young Healthy Males
- Author
-
Zhou, Xinyu, Dong, Gengshang, and Zhang, Pengxuan
- Subjects
Quantitative Biology - Quantitative Methods - Abstract
Different loading modes can significantly affect human gait, posture, and lower limb biomechanics. This study investigated the activity intensity of the lower-limb soleus muscle in young, healthy adult male subjects walking on slopes under unilateral loading. Ten subjects held dumbbells equal to 5% and 10% of their body weight (BW) and walked at a fixed speed on slopes of 5 degrees and 10 degrees, respectively. The changes in electromyography (EMG) of the bilateral soleus muscles of the lower limbs were recorded. One-way analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA) were used to examine the relationship between load weight, slope angle, and muscle activity intensity. The data provided by this research can help promote the development of lower-limb assistive exoskeletons. The results fill the data gap for unilateral loading on slopes, provide data support for future assistance systems, and promote the formation of relevant datasets, so as to improve the terrain recognition ability and the movement ability of the device wearer. (A minimal ANOVA sketch on simulated values follows this entry.)
- Published
- 2024
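Illustrative sketch for entry 7: the study analyzes EMG activity with one-way ANOVA and MANOVA across load and slope conditions. The snippet below runs a one-way ANOVA with SciPy on simulated, purely illustrative amplitude values (not the study's data).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated normalized soleus EMG amplitudes for three load conditions
# (0%, 5%, and 10% of body weight); the numbers are illustrative only.
emg_0bw = rng.normal(loc=0.30, scale=0.05, size=10)
emg_5bw = rng.normal(loc=0.35, scale=0.05, size=10)
emg_10bw = rng.normal(loc=0.42, scale=0.05, size=10)

# One-way ANOVA: does load level affect soleus EMG amplitude?
f_stat, p_value = stats.f_oneway(emg_0bw, emg_5bw, emg_10bw)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```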
8. Hierarchical Visual Categories Modeling: A Joint Representation Learning and Density Estimation Framework for Out-of-Distribution Detection
- Author
-
Li, Jinglun, Zhou, Xinyu, Guo, Pinxue, Sun, Yixuan, Huang, Yiwen, Ge, Weifeng, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Detecting out-of-distribution inputs for visual recognition models has become critical in safe deep learning. This paper proposes a novel hierarchical visual category modeling scheme to separate out-of-distribution data from in-distribution data through joint representation learning and statistical modeling. We learn a mixture of Gaussian models for each in-distribution category, so different visual categories are described by separate Gaussian mixture models. With these Gaussian models, we design an in-distribution score function by aggregating multiple Mahalanobis-based metrics. We do not use any auxiliary outlier data as training samples, since such data may hurt the generalization ability of out-of-distribution detection algorithms. We split the ImageNet-1k dataset into ten folds randomly, use one fold as the in-distribution dataset, and use the others as out-of-distribution datasets to evaluate the proposed method. We also conduct experiments on seven popular benchmarks, including CIFAR, iNaturalist, SUN, Places, Textures, ImageNet-O, and OpenImage-O. Extensive experiments indicate that the proposed method clearly outperforms state-of-the-art algorithms. Meanwhile, we find that our visual representation is competitive with features learned by classical methods. These results demonstrate that the proposed method does not weaken the discriminative ability of visual recognition models and remains efficient at detecting out-of-distribution samples., Comment: Accepted by ICCV2023 (A generic Mahalanobis-score sketch follows this entry.)
- Published
- 2024
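Illustrative sketch for entry 8: the in-distribution score aggregates Mahalanobis-based metrics under per-category Gaussian models. The sketch below shows the generic minimum-Mahalanobis-distance score with a single Gaussian per class and a shared covariance, a common simplification rather than the paper's full mixture-of-Gaussians scheme.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Fit one Gaussian (class mean, shared covariance) per in-distribution class."""
    classes = np.unique(labels)
    means = {c: features[labels == c].mean(axis=0) for c in classes}
    centered = np.concatenate([features[labels == c] - means[c] for c in classes])
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return means, np.linalg.inv(cov)

def in_distribution_score(x, means, cov_inv):
    """Negative minimum Mahalanobis distance over classes; lower means more OOD."""
    dists = [(x - mu) @ cov_inv @ (x - mu) for mu in means.values()]
    return -min(dists)

# Toy usage with random features for two in-distribution classes.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (100, 8)), rng.normal(3, 1, (100, 8))])
labels = np.array([0] * 100 + [1] * 100)
means, cov_inv = fit_class_gaussians(feats, labels)

print(in_distribution_score(rng.normal(0, 1, 8), means, cov_inv))   # in-distribution-like
print(in_distribution_score(rng.normal(10, 1, 8), means, cov_inv))  # far away: much lower score
```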
9. TagOOD: A Novel Approach to Out-of-Distribution Detection via Vision-Language Representations and Class Center Learning
- Author
-
Li, Jinglun, Zhou, Xinyu, Jiang, Kaixun, Hong, Lingyi, Guo, Pinxue, Chen, Zhaoyu, Ge, Weifeng, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Multimodal fusion, leveraging data like vision and language, is rapidly gaining traction. This enriched data representation improves performance across various tasks. Existing methods for out-of-distribution (OOD) detection, a critical area where AI models encounter unseen data in real-world scenarios, rely heavily on whole-image features. These image-level features can include irrelevant information that hinders the detection of OOD samples, ultimately limiting overall performance. In this paper, we propose TagOOD, a novel approach for OOD detection that leverages vision-language representations to achieve label-free object feature decoupling from whole images. This decomposition enables a more focused analysis of object semantics, enhancing OOD detection performance. Subsequently, TagOOD trains a lightweight network on the extracted object features to learn representative class centers. These centers capture the central tendencies of in-distribution (IND) object classes, minimizing the influence of irrelevant image features during OOD detection. Finally, our approach efficiently detects OOD samples by calculating distance-based metrics as OOD scores between learned centers and test samples. We conduct extensive experiments to evaluate TagOOD on several benchmark datasets and demonstrate its superior performance compared to existing OOD detection methods. This work presents a novel perspective for further exploration of multimodal information utilization in OOD detection, with potential applications across various tasks., Comment: Accepted by ACMMM2024
- Published
- 2024
10. XNN: Paradigm Shift in Mitigating Identity Leakage within Cloud-Enabled Deep Learning
- Author
-
Liu, Kaixin, Xiong, Huixin, Duan, Bingyu, Cheng, Zexuan, Zhou, Xinyu, Zhang, Wanqian, and Zhang, Xiangyu
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Computer Vision and Pattern Recognition - Abstract
In the domain of cloud-based deep learning, the imperative for external computational resources coexists with acute privacy concerns, particularly identity leakage. To address this challenge, we introduce XNN and XNN-d, pioneering methodologies that infuse neural network features with randomized perturbations, striking a harmonious balance between utility and privacy. XNN, designed for the training phase, ingeniously blends random permutation with matrix multiplication techniques to obfuscate feature maps, effectively shielding private data from potential breaches without compromising training integrity. Concurrently, XNN-d, devised for the inference phase, employs adversarial training to integrate generative adversarial noise. This technique effectively counters black-box access attacks aimed at identity extraction, while a distilled face recognition network adeptly processes the perturbed features, ensuring accurate identification. Our evaluation demonstrates XNN's effectiveness, significantly outperforming existing methods in reducing identity leakage while maintaining a high model accuracy.
- Published
- 2024
11. LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters
- Author
-
Zhou, Xinyu, Knyazev, Boris, Jolicoeur-Martineau, Alexia, and Fu, Jie
- Subjects
Computer Science - Machine Learning - Abstract
A good initialization of deep learning models is essential since it can help them converge better and faster. However, pretraining large models is unaffordable for many researchers, which makes the ability to predict good initial parameters all the more desirable. Graph HyperNetworks (GHNs), one approach to predicting model parameters, have recently shown strong performance in initializing large vision models. Unfortunately, predicting the parameters of very wide networks relies on copying small chunks of parameters multiple times and requires an extremely large number of parameters to support full prediction, which greatly hinders adoption in practice. To address this limitation, we propose LoGAH (Low-rank GrAph Hypernetworks), a GHN with a low-rank parameter decoder that scales to significantly wider networks without requiring the excessive increase in parameters seen in previous attempts. LoGAH allows us to predict the parameters of 774-million-parameter neural networks in a memory-efficient manner. We show that vision and language models (i.e., ViT and GPT-2) initialized with LoGAH achieve better performance than those initialized randomly or using existing hypernetworks. Furthermore, we show promising transfer learning results w.r.t. training LoGAH on small datasets and using the predicted parameters to initialize for larger tasks. The code is available at https://github.com/Blackzxy/LoGAH., Comment: 16 pages (A toy low-rank decoder sketch follows this entry.)
- Published
- 2024
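Illustrative sketch for entry 11: LoGAH's memory savings come from a low-rank parameter decoder. The toy module below conveys only the general idea of predicting a large weight matrix as the product of two thin factors emitted from a node embedding; the shapes, module names, and the comparison baseline are hypothetical, not LoGAH's actual architecture.

```python
import torch
import torch.nn as nn

class LowRankParamDecoder(nn.Module):
    """Predict a (d_out x d_in) weight matrix from a graph-node embedding
    by emitting two rank-r factors instead of all entries directly."""
    def __init__(self, embed_dim, d_out, d_in, rank=32):
        super().__init__()
        self.d_out, self.d_in, self.rank = d_out, d_in, rank
        self.to_a = nn.Linear(embed_dim, d_out * rank)  # left factor
        self.to_b = nn.Linear(embed_dim, rank * d_in)   # right factor

    def forward(self, node_embedding):
        a = self.to_a(node_embedding).view(self.d_out, self.rank)
        b = self.to_b(node_embedding).view(self.rank, self.d_in)
        return a @ b  # low-rank predicted weight of shape (d_out, d_in)

# Predict a 768x3072 weight (a GPT-2 MLP layer shape) from a 128-d node embedding.
decoder = LowRankParamDecoder(embed_dim=128, d_out=768, d_in=3072, rank=32)
predicted_w = decoder(torch.randn(128))
print(predicted_w.shape)  # torch.Size([768, 3072])

# The decoder is far smaller than a dense decoder that maps the embedding
# to every weight entry (embed_dim * d_out * d_in parameters).
n_lowrank = sum(p.numel() for p in decoder.parameters())
print(f"{n_lowrank:,} low-rank decoder params vs {128 * 768 * 3072:,} for a dense decoder")
```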
12. LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation
- Author
-
Hong, Lingyi, Liu, Zhongying, Chen, Wenchao, Tan, Chenzhi, Feng, Yuang, Zhou, Xinyu, Guo, Pinxue, Li, Jinglun, Chen, Zhaoyu, Gao, Shuyong, Zhang, Wei, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Video object segmentation (VOS) aims to distinguish and track target objects in a video. Despite the excellent performance achieved by off-the-shelf VOS models, existing VOS benchmarks mainly focus on short-term videos lasting about 5 seconds, where objects remain visible most of the time. However, these benchmarks poorly represent practical applications, and the absence of long-term datasets restricts further investigation of VOS in realistic scenarios. Thus, we propose a novel benchmark named LVOS, comprising 720 videos with 296,401 frames and 407,945 high-quality annotations. Videos in LVOS last 1.14 minutes on average, approximately 5 times longer than videos in existing datasets. Each video includes various attributes, especially challenges arising in the wild, such as long-term reappearance and cross-temporal similar objects. Compared to previous benchmarks, LVOS better reflects VOS models' performance in real scenarios. Based on LVOS, we evaluate 20 existing VOS models under 4 different settings and conduct a comprehensive analysis. On LVOS, these models suffer a large performance drop, highlighting the challenge of achieving precise tracking and segmentation in real-world scenarios. Attribute-based analysis indicates that the key factor in the accuracy decline is increased video length, emphasizing LVOS's crucial role. We hope LVOS can advance the development of VOS in real scenes. Data and code are available at https://lingyihongfd.github.io/lvos.github.io/., Comment: LVOS V2
- Published
- 2024
13. HeR-DRL:Heterogeneous Relational Deep Reinforcement Learning for Decentralized Multi-Robot Crowd Navigation
- Author
-
Zhou, Xinyu, Piao, Songhao, Chi, Wenzheng, Chen, Liguo, and Li, Wei
- Subjects
Computer Science - Robotics - Abstract
Crowd navigation has received significant research attention in recent years, especially DRL-based methods. While single-robot crowd scenarios have dominated research, they offer limited applicability to real-world complexities. The heterogeneity of interaction among multiple agent categories, as in decentralized multi-robot pedestrian scenarios, is frequently disregarded. This "interaction blind spot" hinders generalizability and restricts progress towards robust navigation algorithms. In this paper, we propose heterogeneous relational deep reinforcement learning (HeR-DRL), based on a customised heterogeneous GNN, to improve navigation strategies in decentralized multi-robot crowd navigation. Firstly, we devise a method for constructing a robot-crowd heterogeneous relation graph that effectively models the heterogeneous pair-wise interaction relationships. We then propose a new heterogeneous graph neural network for transferring and aggregating the heterogeneous state information. Finally, we incorporate the encoded information into deep reinforcement learning to explore the optimal policy. HeR-DRL is rigorously evaluated by comparing it to state-of-the-art algorithms in both single-robot and multi-robot circle-crossing scenarios. The experimental results demonstrate that HeR-DRL surpasses the state-of-the-art approaches in overall performance, particularly excelling in safety and comfort metrics. This underscores the significance of interaction heterogeneity for crowd navigation. The source code will be publicly released at https://github.com/Zhouxy-Debugging-Den/HeR-DRL.
- Published
- 2024
14. OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning
- Author
-
Hong, Lingyi, Yan, Shilin, Zhang, Renrui, Li, Wanyun, Zhou, Xinyu, Guo, Pinxue, Jiang, Kaixun, Chen, Yiting, Li, Jinglun, Chen, Zhaoyu, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Visual object tracking aims to localize the target object in each frame based on its initial appearance in the first frame. Depending on the input modality, tracking tasks can be divided into RGB tracking and RGB+X (e.g. RGB+N, and RGB+D) tracking. Despite the different input modalities, the core aspect of tracking is temporal matching. Based on this common ground, we present a general framework to unify various tracking tasks, termed OneTracker. OneTracker first performs large-scale pre-training on an RGB tracker called Foundation Tracker. This pretraining phase equips the Foundation Tracker with a stable ability to estimate the location of the target object. Then we regard other modality information as a prompt and build Prompt Tracker upon Foundation Tracker. By freezing the Foundation Tracker and only adjusting some additional trainable parameters, Prompt Tracker inherits the strong localization ability of Foundation Tracker and achieves parameter-efficient finetuning on downstream RGB+X tracking tasks. To evaluate the effectiveness of our general framework OneTracker, which consists of Foundation Tracker and Prompt Tracker, we conduct extensive experiments on 6 popular tracking tasks across 11 benchmarks, and our OneTracker outperforms other models and achieves state-of-the-art performance., Comment: Accepted to CVPR 2024
- Published
- 2024
15. OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework
- Author
-
Li, Wanyun, Guo, Pinxue, Zhou, Xinyu, Hong, Lingyi, He, Yangji, Zheng, Xiangyu, Zhang, Wei, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Contemporary Video Object Segmentation (VOS) approaches typically consist of stages for feature extraction, matching, memory management, and multiple-object aggregation. Recent advanced models either employ discrete modeling of these components in a sequential manner, or optimize a combined pipeline through substructure aggregation. However, these explicit staged approaches prevent the VOS framework from being optimized as a unified whole, leading to limited capacity and suboptimal performance in tackling complex videos. In this paper, we propose OneVOS, a novel framework that unifies the core components of VOS with an All-in-One Transformer. Specifically, to unify all the aforementioned modules into a vision transformer, we model all the features of frames, masks and memory for multiple objects as transformer tokens, and integrally accomplish feature extraction, matching and memory management of multiple objects through the flexible attention mechanism. Furthermore, a Unidirectional Hybrid Attention is proposed through a double decoupling of the original attention operation, to rectify semantic errors and ambiguities of stored tokens in the OneVOS framework. Finally, to alleviate the storage burden and expedite inference, we propose the Dynamic Token Selector, which unveils the working mechanism of OneVOS and naturally leads to a more efficient version of OneVOS. Extensive experiments demonstrate the superiority of OneVOS, achieving state-of-the-art performance across 7 datasets, particularly excelling in the complex LVOS and MOSE datasets with 70.1% and 66.4% $J \& F$ scores, surpassing previous state-of-the-art methods by 4.2% and 7.0%, respectively. Our code will be available for reproducibility and further research., Comment: 19 pages, 7 figures
- Published
- 2024
16. Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction
- Author
-
Jiang, Jianping, Zhou, Xinyu, Wang, Bingxuan, Deng, Xiaoming, Xu, Chao, and Shi, Boxin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Reliable hand mesh reconstruction (HMR) from commonly-used color and depth sensors is challenging, especially under scenarios with varied illumination and fast motion. Event cameras are a highly promising alternative owing to their high dynamic range and dense temporal resolution, but they lack the key texture appearance needed for hand mesh reconstruction. In this paper, we propose EvRGBHand -- the first approach for 3D hand mesh reconstruction with an event camera and an RGB camera compensating for each other. By fusing the two modalities of data across time, space, and information dimensions, EvRGBHand can tackle overexposure and motion blur issues in RGB-based HMR and foreground scarcity and background overflow issues in event-based HMR. We further propose EvRGBDegrader, which allows our model to generalize effectively in challenging scenes, even when trained solely on standard scenes, thus reducing data acquisition costs. Experiments on real-world data demonstrate that EvRGBHand can effectively solve the challenging issues of using either type of camera alone by retaining the merits of both, and show the potential of generalization to outdoor scenes and other types of event cameras.
- Published
- 2024
17. ClickVOS: Click Video Object Segmentation
- Author
-
Guo, Pinxue, Hong, Lingyi, Zhou, Xinyu, Gao, Shuyong, Li, Wanyun, Li, Jinglun, Chen, Zhaoyu, Li, Xiaoqiang, Zhang, Wei, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Video Object Segmentation (VOS) aims to segment objects in videos. However, previous settings either require time-consuming manual masks of target objects at the first frame during inference or lack the flexibility to specify arbitrary objects of interest. To address these limitations, we propose the setting named Click Video Object Segmentation (ClickVOS), which segments objects of interest across the whole video according to a single click per object in the first frame. We also provide the extended datasets DAVIS-P and YouTubeVOSP with point annotations to support this task. ClickVOS has significant practical and research implications: indicating an object takes only 1-2 seconds of interaction, compared to the several minutes needed to annotate an object mask. However, ClickVOS also presents increased challenges. To address this task, we propose an end-to-end baseline approach named Attention Before Segmentation (ABS), motivated by the attention process of humans. ABS utilizes the given point in the first frame to perceive the target object through a concise yet effective segmentation attention. Although the initial object mask may be inaccurate, in ABS the imprecise mask can self-heal as the video goes on instead of deteriorating from error accumulation, which is attributed to our designed improvement memory that continuously records stable global object memory and updates detailed dense memory. In addition, we conduct various baseline explorations utilizing off-the-shelf algorithms from related fields, which could provide insights for further exploration of ClickVOS. The experimental results demonstrate the superiority of the proposed ABS approach. Extended datasets and codes will be available at https://github.com/PinxueGuo/ClickVOS.
- Published
- 2024
18. Comparison of Public Responses to Containment Measures During the Initial Outbreak and Resurgence of COVID-19 in China: Infodemiology Study
- Author
-
Zhou, Xinyu, Song, Yi, Jiang, Hao, Wang, Qian, Qu, Zhiqiang, Zhou, Xiaoyu, Jit, Mark, Hou, Zhiyuan, and Lin, Leesa
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Public aspects of medicine ,RA1-1270 - Abstract
Background: COVID-19 cases resurged worldwide in the second half of 2020. Not much is known about the changes in public responses to containment measures from the initial outbreak to resurgence. Monitoring public responses is crucial to inform policy measures to prepare for COVID-19 resurgence. Objective: This study aimed to assess and compare public responses to containment measures during the initial outbreak and resurgence of COVID-19 in China. Methods: We curated all COVID-19-related posts from Sina Weibo (China’s version of Twitter) during the initial outbreak and resurgence of COVID-19 in Beijing, China. With a Python script, we constructed subsets of Weibo posts focusing on 3 containment measures: lockdown, the test-trace-isolate strategy, and suspension of gatherings. The Baidu open-source sentiment analysis model and latent Dirichlet allocation topic modeling, a widely used machine learning algorithm, were used to assess public engagement, sentiments, and frequently discussed topics on each containment measure. Results: A total of 8,985,221 Weibo posts were curated. In China, the containment measures evolved from a complete lockdown for the general population during the initial outbreak to a more targeted response strategy for high-risk populations during COVID-19 resurgence. Between the initial outbreak and resurgence, the average daily proportion of Weibo posts with negative sentiments decreased from 57% to 47% for the lockdown, 56% to 51% for the test-trace-isolate strategy, and 55% to 48% for the suspension of gatherings. Among the top 3 frequently discussed topics on lockdown measures, discussions on containment measures accounted for approximately 32% in both periods, but those on the second-most frequently discussed topic shifted from the expression of negative emotions (11%) to its impacts on daily life or work (26%). The public expressed a high level of panic (21%) during the initial outbreak but almost no panic (1%) during resurgence. The more targeted test-trace-isolate measure received the most support (60%) among all 3 containment measures in the initial outbreak, and its support rate approached 90% during resurgence. Conclusions: Compared to the initial outbreak, the public expressed less engagement and less negative sentiments on containment measures and were more supportive toward containment measures during resurgence. Targeted test-trace-isolate strategies were more acceptable to the public. Our results indicate that when COVID-19 resurges, more targeted test-trace-isolate strategies for high-risk populations should be promoted to balance pandemic control and its impact on daily life and the economy.
- Published
- 2021
- Full Text
- View/download PDF
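Illustrative sketch for entry 18: the study applies sentiment analysis and latent Dirichlet allocation (LDA) topic modeling to Weibo posts. The snippet below shows generic LDA topic extraction with scikit-learn on a few English placeholder documents; it is a stand-in, not the Baidu sentiment model or the Chinese-text preprocessing used in the study.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Placeholder documents standing in for preprocessed social-media posts.
posts = [
    "lockdown keeps families at home and closes shops",
    "mass testing and contact tracing isolate high risk cases",
    "gatherings and events suspended to slow transmission",
    "testing sites expanded while tracing apps notify contacts",
    "home delivery surges as shops stay closed during lockdown",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(posts)          # document-term counts

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-5:][::-1]]
    print(f"topic {k}: {', '.join(top)}")
```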
19. Prognostic analysis and risk assessment based on RNA editing in hepatocellular carcinoma
- Author
-
Shi, Xintong, Bu, Xiaoyuan, Zhou, Xinyu, Shen, Ningjia, Chang, Yanxin, Yu, Wenlong, and Wu, Yingjun
- Published
- 2024
- Full Text
- View/download PDF
20. Cross-Country Comparison of Public Awareness, Rumors, and Behavioral Responses to the COVID-19 Epidemic: Infodemiology Study
- Author
-
Hou, Zhiyuan, Du, Fanxing, Zhou, Xinyu, Jiang, Hao, Martin, Sam, Larson, Heidi, and Lin, Leesa
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Public aspects of medicine ,RA1-1270 - Abstract
Background: Understanding public behavioral responses to the coronavirus disease (COVID-19) epidemic and the accompanying infodemic is crucial to controlling the epidemic. Objective: The aim of this study was to assess real-time public awareness and behavioral responses to the COVID-19 epidemic across 12 selected countries. Methods: Internet surveillance was used to collect real-time data from the general public to assess public awareness and rumors (China: Baidu; worldwide: Google Trends) and behavior responses (China: Ali Index; worldwide: Google Shopping). These indices measured the daily number of searches or purchases and were compared with the numbers of daily COVID-19 cases. The trend comparisons across selected countries were observed from December 1, 2019 (prepandemic baseline) to April 11, 2020 (at least one month after the governments of selected countries took actions for the pandemic). Results: We identified missed windows of opportunity for early epidemic control in 12 countries, when public awareness was very low despite the emerging epidemic. China's epidemic and the declaration of a public health emergency of international concern did not prompt a worldwide public reaction to adopt health-protective measures; instead, most countries and regions only responded to the epidemic after their own case counts increased. Rumors and misinformation led to a surge of sales in herbal remedies in China and antimalarial drugs worldwide, and timely clarification of rumors mitigated the rush to purchase unproven remedies. Conclusions: Our comparative study highlights the urgent need for international coordination to promote mutual learning about epidemic characteristics and effective control measures as well as to trigger early and timely responses in individual countries. Early release of official guidelines and timely clarification of rumors led by governments are necessary to guide the public to take rational action.
- Published
- 2020
- Full Text
- View/download PDF
21. Differentially Private Worst-group Risk Minimization
- Author
-
Zhou, Xinyu and Bassily, Raef
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Cryptography and Security - Abstract
We initiate a systematic study of worst-group risk minimization under $(\epsilon, \delta)$-differential privacy (DP). The goal is to privately find a model that approximately minimizes the maximal risk across $p$ sub-populations (groups) with different distributions, where each group distribution is accessed via a sample oracle. We first present a new algorithm that achieves excess worst-group population risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon} + \sqrt{\frac{p}{K}})$, where $K$ is the total number of samples drawn from all groups and $d$ is the problem dimension. Our rate is nearly optimal when each distribution is observed via a fixed-size dataset of size $K/p$. Our result is based on a new stability-based analysis for the generalization error. In particular, we show that $\Delta$-uniform argument stability implies $\tilde{O}(\Delta + \frac{1}{\sqrt{n}})$ generalization error w.r.t. the worst-group risk, where $n$ is the number of samples drawn from each sample oracle. Next, we propose an algorithmic framework for worst-group population risk minimization using any DP online convex optimization algorithm as a subroutine. Hence, we give another excess risk bound of $\tilde{O}\left( \sqrt{\frac{d^{1/2}}{\epsilon K}} +\sqrt{\frac{p}{K\epsilon^2}} \right)$. Assuming the typical setting of $\epsilon=\Theta(1)$, this bound is more favorable than our first bound in a certain range of $p$ as a function of $K$ and $d$. Finally, we study differentially private worst-group empirical risk minimization in the offline setting, where each group distribution is observed by a fixed-size dataset. We present a new algorithm with nearly optimal excess risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon})$.
- Published
- 2024
22. Learning to Deblur Polarized Images
- Author
-
Zhou, Chu, Teng, Minggui, Zhou, Xinyu, Xu, Chao, and Shi, Boxin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
A polarization camera can capture four polarized images with different polarizer angles in a single shot, which is useful in polarization-based vision applications since the degree of polarization (DoP) and the angle of polarization (AoP) can be directly computed from the captured polarized images. However, since the on-chip micro-polarizers block part of the light so that the sensor often requires a longer exposure time, the captured polarized images are prone to motion blur caused by camera shake, leading to noticeable degradation in the computed DoP and AoP. Deblurring methods for conventional images often show degraded performance when handling polarized images since they only focus on deblurring without considering the polarization constraints. In this paper, we propose a polarized image deblurring pipeline that solves the problem in a polarization-aware manner by adopting a divide-and-conquer strategy to explicitly decompose the problem into two less ill-posed sub-problems, and design a two-stage neural network to handle the two sub-problems respectively. Experimental results show that our method achieves state-of-the-art performance on both synthetic and real-world images, and can improve the performance of polarization-based vision applications such as image dehazing and reflection removal. (A short sketch of the standard DoP/AoP computation follows this entry.)
- Published
- 2024
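Illustrative sketch for entry 22: as background for the problem setting (not the paper's deblurring network), the degree and angle of polarization are computed from the four polarizer-angle captures via the linear Stokes parameters.

```python
import numpy as np

def dop_aop(i0, i45, i90, i135, eps=1e-8):
    """DoP and AoP from captures at 0, 45, 90, and 135 degree polarizer angles."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90                        # horizontal minus vertical
    s2 = i45 - i135                      # diagonal difference
    dop = np.sqrt(s1**2 + s2**2) / (s0 + eps)
    aop = 0.5 * np.arctan2(s2, s1)       # radians
    return dop, aop

# Toy usage on small synthetic captures of a partially polarized scene.
rng = np.random.default_rng(0)
base = rng.uniform(0.2, 0.8, (4, 4))
i0, i45, i90, i135 = base * 1.2, base * 1.0, base * 0.8, base * 1.0
dop, aop = dop_aop(i0, i45, i90, i135)
print(dop.round(3))
print(np.degrees(aop).round(1))
```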
23. Reading Relevant Feature from Global Representation Memory for Visual Object Tracking
- Author
-
Zhou, Xinyu, Guo, Pinxue, Hong, Lingyi, Li, Jinglun, Zhang, Wei, Ge, Weifeng, and Zhang, Wenqiang
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Reference features from a template or historical frames are crucial for visual object tracking. Prior works utilize all features from a fixed template or memory for visual object tracking. However, due to the dynamic nature of videos, the required reference historical information for different search regions at different time steps is also inconsistent. Therefore, using all features in the template and memory can lead to redundancy and impair tracking performance. To alleviate this issue, we propose a novel tracking paradigm, consisting of a relevance attention mechanism and a global representation memory, which can adaptively assist the search region in selecting the most relevant historical information from reference features. Specifically, the proposed relevance attention mechanism differs from previous approaches in that it can dynamically choose and build the optimal global representation memory for the current frame by accessing cross-frame information globally. Moreover, it can flexibly read the relevant historical information from the constructed memory to reduce redundancy and counteract the negative effects of harmful information. Extensive experiments validate the effectiveness of the proposed method, achieving competitive performance on five challenging datasets at 71 FPS., Comment: 9 pages, 5 figures, accepted by the Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)
- Published
- 2024
24. Me LLaMA: Foundation Large Language Models for Medical Applications
- Author
-
Xie, Qianqian, Chen, Qingyu, Chen, Aokun, Peng, Cheng, Hu, Yan, Lin, Fongci, Peng, Xueqing, Huang, Jimin, Zhang, Jeffrey, Keloth, Vipina, Zhou, Xinyu, Qian, Lingfei, He, Huan, Shung, Dennis, Ohno-Machado, Lucila, Wu, Yonghui, Xu, Hua, and Bian, Jiang
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Recent advancements in large language models (LLMs) like ChatGPT and LLaMA show promise in medical applications, yet challenges remain in medical language comprehension. This study presents Me-LLaMA, a new medical LLM family based on open-source LLaMA models, optimized for medical text analysis and diagnosis by leveraging large-scale, domain-specific datasets. The Me-LLaMA family, including foundation models Me-LLaMA 13/70B and their chat-enhanced versions, was developed through continued pre-training and instruction tuning with 129B tokens and 214K samples from biomedical and clinical sources. Training the 70B models required over 100,000 A100 GPU hours. Me-LLaMA's performance was evaluated across six medical text analysis tasks using 12 benchmark datasets and complex clinical case diagnosis, with automatic and human evaluations. Results indicate Me-LLaMA outperforms LLaMA and other open-source medical LLMs in zero-shot and supervised settings. Task-specific tuning further boosts performance, surpassing ChatGPT on 7 of 8 datasets and GPT-4 on 5 of 8. For complex clinical cases, Me-LLaMA achieves performance comparable to ChatGPT and GPT-4. This work underscores the importance of domain-specific data in developing medical LLMs and addresses the high computational costs involved in training, highlighting a balance between pre-training and fine-tuning strategies. Me-LLaMA models are now accessible under user agreements, providing a valuable resource for advancing medical AI., Comment: 21 pages, 4 figures, 8 tables
- Published
- 2024
25. EvPlug: Learn a Plug-and-Play Module for Event and Image Fusion
- Author
-
Jiang, Jianping, Zhou, Xinyu, Duan, Peiqi, and Shi, Boxin
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Event cameras and RGB cameras exhibit complementary characteristics in imaging: the former possesses high dynamic range (HDR) and high temporal resolution, while the latter provides rich texture and color information. This makes the integration of event cameras into middle- and high-level RGB-based vision tasks highly promising. However, challenges arise in multi-modal fusion, data annotation, and model architecture design. In this paper, we propose EvPlug, which learns a plug-and-play event and image fusion module from the supervision of an existing RGB-based model. The learned fusion module integrates event streams with image features in the form of a plug-in, making the RGB-based model robust to HDR and fast-motion scenes while enabling high temporal resolution inference. Our method only requires unlabeled event-image pairs (no pixel-wise alignment required) and does not alter the structure or weights of the RGB-based model. We demonstrate the superiority of EvPlug in several vision tasks such as object detection, semantic segmentation, and 3D hand pose estimation.
- Published
- 2023
26. Resource Allocation for Semantic Communication under Physical-layer Security
- Author
-
Li, Yang, Zhou, Xinyu, and Zhao, Jun
- Subjects
Electrical Engineering and Systems Science - Signal Processing ,Computer Science - Cryptography and Security ,Computer Science - Machine Learning - Abstract
Semantic communication is deemed a revolution of Shannon's paradigm for sixth-generation (6G) wireless networks. It aims at transmitting the extracted information rather than the original data, which receivers will try to recover. Intuitively, the larger the extracted information, the longer the latency of semantic communication will be. Besides, larger extracted information results in more accurate reconstructed information, and thus a higher utility of the semantic communication system. Shorter latency and higher utility are desirable objectives for the system, so there is a trade-off between utility and latency. This paper proposes a joint optimization algorithm for total latency and utility. Moreover, security is essential for the semantic communication system. We incorporate the secrecy rate, a physical-layer security method, into the optimization problem. The secrecy rate is the communication rate at which no information is disclosed to an eavesdropper. Experimental results demonstrate that the proposed algorithm obtains the best joint optimization performance compared to the baselines., Comment: This paper appears in IEEE Global Communications Conference (GLOBECOM) 2023 (A textbook secrecy-rate snippet follows this entry.)
- Published
- 2023
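Illustrative sketch for entry 26: the optimization incorporates the physical-layer secrecy rate, i.e., the rate at which no information is disclosed to an eavesdropper. For a Gaussian wiretap channel, a textbook form is the positive part of the legitimate-channel rate minus the eavesdropper-channel rate; the snippet below evaluates that expression only and is not the paper's resource-allocation algorithm.

```python
import math

def secrecy_rate(snr_legit, snr_eve):
    """Secrecy rate (bits/s/Hz) of a Gaussian wiretap channel:
    max(0, log2(1 + SNR_legitimate) - log2(1 + SNR_eavesdropper))."""
    return max(0.0, math.log2(1 + snr_legit) - math.log2(1 + snr_eve))

# Example: legitimate receiver at 15 dB SNR, eavesdropper at 5 dB SNR.
snr_b = 10 ** (15 / 10)
snr_e = 10 ** (5 / 10)
print(f"secrecy rate ~ {secrecy_rate(snr_b, snr_e):.2f} bits/s/Hz")
```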
27. OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework
- Author
-
Li, Wanyun, Guo, Pinxue, Zhou, Xinyu, Hong, Lingyi, He, Yangji, Zheng, Xiangyu, Zhang, Wei, Zhang, Wenqiang, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
- Published
- 2025
- Full Text
- View/download PDF
28. Effects of drought stress and re-watering on nitrogen content in soybean at different growth stages
- Author
-
Dong, Shoukun, Zhou, Xinyu, Qu, Zhipeng, and Wang, Xiyue
- Published
- 2024
- Full Text
- View/download PDF
29. Judicial Waves, Ethical Shifts: Bankruptcy Courts and Corporate ESG Performance
- Author
-
Zhou, Zixun, Zhou, Xinyu, Zhang, Xuezhi, and Chen, Wei
- Published
- 2024
- Full Text
- View/download PDF
30. Dysregulations of amino acid metabolism and lipid metabolism in urine of children and adolescents with major depressive disorder: a case-control study
- Author
-
Jiang, Yuanliang, Cai, Yuping, Teng, Teng, Wang, Xiaolin, Yin, Bangmin, Li, Xuemei, Yu, Ying, Liu, Xueer, Wang, Jie, Wu, Hongyan, He, Yuqian, Zhu, Zheng-Jiang, and Zhou, Xinyu
- Published
- 2024
- Full Text
- View/download PDF
31. Falling Damage Behavior Analysis and Degree Prediction for Wooden Pallet Based on Piezoelectric Effect and Acoustic Emission
- Author
-
Ai, Mengyao, Zhou, Xinyu, Gao, Ge, Gao, Shan, and Du, Xinyu
- Published
- 2024
- Full Text
- View/download PDF
32. Correction: A Randomized, Double-Blind, Positive-Controlled, Multicenter Clinical Trial on the Efficacy and Safety of ShuganJieyu Capsule and St. John’s Wort for Major Depressive Disorder with Somatic Complaints
- Author
-
Xiang, Yajie, Wang, Lihua, Gu, Ping, Wang, Chunxue, Tian, Yuling, Shi, Wanying, Deng, Fang, Zhang, Yongbo, Gao, Li, Wang, Kai, Wang, Yi, He, Jincai, Zhao, Wenfeng, Bi, Xiaoying, Hu, Jian, Zhong, Lianmei, Guo, Yi, Zhou, Xinyu, Wang, Hongxing, and Xie, Peng
- Published
- 2024
- Full Text
- View/download PDF
33. The Association Between the Levels of Oxidative Stress Indicators (MDA, SOD, and GSH) in Seminal Plasma and the Risk of Idiopathic Oligo-asthenotera-tozoospermia: Does Cu or Se Level Alter the Association?
- Author
-
Yin, Tao, Yue, Xinyu, Li, Qian, Zhou, Xinyu, Dong, Rui, Chen, Jiayi, Zhang, Runtao, Wang, Xin, He, Shitao, Jiang, Tingting, Tao, Fangbiao, Cao, Yunxia, Ji, Dongmei, and Liang, Chunmei
- Published
- 2024
- Full Text
- View/download PDF
34. Polyphyllin I induces rapid ferroptosis in acute myeloid leukemia through simultaneous targeting PI3K/SREBP-1/SCD1 axis and triggering of lipid peroxidation
- Author
-
Zhou, Xinyu, Zhang, Duanna, Lei, Jieting, Ren, Jixia, Yang, Bo, Cao, Zhixing, Guo, Chuanjie, and Li, Yuzhi
- Published
- 2024
- Full Text
- View/download PDF
35. Depth-Based Statistical Inferences in the Spike Train Space
- Author
-
Zhou, Xinyu and Wu, Wei
- Subjects
Statistics - Applications - Abstract
Metric-based summary statistics such as the mean and covariance have been introduced in the neural spike train space. They can properly describe the template and variability in spike train data, but are often sensitive to outliers and expensive to compute. Recent studies also examine outlier detection and classification methods on point processes. These tools provide reasonable and efficient results, but their accuracy remains low in certain cases. In this study, we propose to adapt the well-established notion of statistical depth to the spike train space. This framework naturally defines the median of a set of spike trains, which provides a robust description of the 'center' or 'template' of the observations. It also provides a principled method to identify 'outliers' in the data and classify data from different categories. We systematically compare the median with the state-of-the-art 'mean spike trains' in terms of robustness and efficiency, and compare the performance of our novel outlier detection and classification tools with previous methods. The results show that the median describes the 'template' better than the mean, and that the proposed outlier detection and classification are more accurate than previous methods. These advantages are well illustrated with simulations and real data. (A toy depth-based median sketch follows this entry.)
- Published
- 2023
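Illustrative sketch for entry 35: the paper takes the deepest spike train under a statistical depth as a robust median. The toy code below only conveys the "deepest point as median" idea with a crude distance-based depth proxy on binned spike trains; it is not the paper's depth definition or spike-train metric.

```python
import numpy as np

def bin_spike_train(spike_times, t_max, bin_width=0.05):
    """Represent a spike train as a vector of spike counts on [0, t_max]."""
    bins = np.arange(0.0, t_max + bin_width, bin_width)
    counts, _ = np.histogram(spike_times, bins=bins)
    return counts.astype(float)

def deepest_train_index(spike_trains, t_max):
    """Index of the 'deepest' train: smallest average distance to the others
    (a crude depth proxy; the deepest train serves as the median)."""
    vecs = np.stack([bin_spike_train(st, t_max) for st in spike_trains])
    dists = np.linalg.norm(vecs[:, None, :] - vecs[None, :, :], axis=-1)
    return int(np.argmin(dists.mean(axis=1)))

# Toy data: ten similar trains plus one outlier with many extra spikes.
rng = np.random.default_rng(1)
trains = [np.sort(rng.uniform(0, 1, size=20)) for _ in range(10)]
trains.append(np.sort(rng.uniform(0, 1, size=80)))  # outlier
print("median spike train index:", deepest_train_index(trains, t_max=1.0))
```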
36. A new connectivity bound for a tournament to be highly linked
- Author
-
Chen, Bin, Hou, Xinmin, Yu, Gexin, and Zhou, Xinyu
- Subjects
Mathematics - Combinatorics ,05C20, 05C38, 05C40 - Abstract
A digraph $D$ is $k$-linked if for any pair of two disjoint sets $\{x_{1},x_{2},\ldots,x_{k}\}$ and $\{y_{1},y_{2},\ldots,y_{k}\}$ of vertices in $D$, there exist vertex disjoint dipaths $P_{1},P_{2},\ldots,P_{k}$ such that $P_{i}$ is a dipath from $x_{i}$ to $y_{i}$ for each $i\in[k]$. Pokrovskiy (JCTB, 2015) confirmed a conjecture of K\"{u}hn et al. (Proc. Lond. Math. Soc., 2014) by verifying that every $452k$-connected tournament is $k$-linked. Meng et al. (Eur. J. Comb., 2021) improved this upper bound by showing that any $(40k-31)$-connected tournament is $k$-linked. In this paper, we show a better upper bound by proving that every $\lceil 12.5k-6\rceil$-connected tournament with minimum out-degree at least $21k-14$ is $k$-linked. Furthermore, we improve a key lemma that was first introduced by Pokrovskiy (JCTB, 2015) and later enhanced by Meng et al. (Eur. J. Comb., 2021)., Comment: 10 pages
- Published
- 2023
37. Logical Bias Learning for Object Relation Prediction
- Author
-
Zhou, Xinyu, Ji, Zihan, and Zhu, Anna
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Machine Learning - Abstract
Scene graph generation (SGG) aims to automatically map an image into a semantic structural graph for better scene understanding. It has attracted significant attention for its ability to provide object and relation information, enabling graph reasoning for downstream tasks. However, it faces severe limitations in practice due to biased data and training methods. In this paper, we present a more rational and effective strategy based on causal inference for object relation prediction. To further evaluate the superiority of our strategy, we propose an object enhancement module to conduct ablation studies. Experimental results on the Visual Genome 150 (VG-150) dataset demonstrate the effectiveness of our proposed method. These contributions can provide great potential for foundation models for decision-making.
- Published
- 2023
38. Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model
- Author
-
Zhou, Xinyu, Chen, Delong, and Chen, Yudong
- Subjects
Computer Science - Computation and Language ,Computer Science - Sound ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
This paper explores the potential of constructing an AI spoken dialogue system that "thinks how to respond" and "thinks how to speak" simultaneously, which more closely aligns with the human speech production process compared to the current cascade pipeline of independent chatbot and Text-to-Speech (TTS) modules. We hypothesize that Large Language Models (LLMs) with billions of parameters possess significant speech understanding capabilities and can jointly model dialogue responses and linguistic features. We conduct two sets of experiments: 1) Prosodic structure prediction, a typical front-end task in TTS, demonstrating the speech understanding ability of LLMs, and 2) Further integrating dialogue response and a wide array of linguistic features using a unified encoding format. Our results indicate that the LLM-based approach is a promising direction for building unified spoken dialogue systems.
- Published
- 2023
39. Direct visualization of electric current induced dipoles of atomic impurities
- Author
-
Liu, Yaowu, Zhang, Zichun, Chen, Sidan, Xu, Shengnan, Ji, Lichen, Chen, Wei, Zhou, Xinyu, Luo, Jiaxin, Hu, Xiaopen, Duan, Wenhui, Chen, Xi, Xue, Qi-Kun, and Ji, Shuai-Hua
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Materials Science - Abstract
Learning the electron scattering around atomic impurities is a fundamental step to fully understand the basic electronic transport properties of realistic conducting materials. Although many efforts have been made in this field for several decades, atomic-scale transport around single point-like impurities has yet to be achieved. Here, we report the direct visualization of the electric-current-induced dipoles around single atomic impurities in epitaxial bilayer graphene by multi-probe low-temperature scanning tunneling potentiometry as the local current density is raised to around 25 A/m, which is considerably higher than that in previous studies. We find that the directions of these dipoles, which are parallel or anti-parallel to the local current, are determined by the charge polarity of the impurities, providing direct evidence for the carrier density modulation effect proposed by Landauer in 1976. Furthermore, by in situ tuning of the local current direction with contact probes, these dipoles are redirected correspondingly. Our work paves the way to explore electronic quantum transport phenomena at the single atomic impurity level and potential future electronics toward or beyond the end of Moore's Law.
- Published
- 2023
40. Few shot font generation via transferring similarity guided global style and quantization local style
- Author
-
Pan, Wei, Zhu, Anna, Zhou, Xinyu, Iwana, Brian Kenji, and Li, Shilin
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Automatic few-shot font generation (AFFG), aiming at generating new fonts with only a few glyph references, reduces the labor cost of manually designing fonts. However, the traditional AFFG paradigm of style-content disentanglement cannot capture the diverse local details of different fonts. So, many component-based approaches are proposed to tackle this problem. The issue with component-based approaches is that they usually require special pre-defined glyph components, e.g., strokes and radicals, which is infeasible for AFFG of different languages. In this paper, we present a novel font generation approach by aggregating styles from character similarity-guided global features and stylized component-level representations. We calculate the similarity scores of the target character and the referenced samples by measuring the distance along the corresponding channels from the content features, and assigning them as the weights for aggregating the global style features. To better capture the local styles, a cross-attention-based style transfer module is adopted to transfer the styles of reference glyphs to the components, where the components are self-learned discrete latent codes through vector quantization without manual definition. With these designs, our AFFG method could obtain a complete set of component-level style representations, and also control the global glyph characteristics. The experimental results reflect the effectiveness and generalization of the proposed method on different linguistic scripts, and also show its superiority when compared with other state-of-the-art methods. The source code can be found at https://github.com/awei669/VQ-Font., Comment: Accepted by ICCV 2023
- Published
- 2023
41. Direct measurement of photoinduced transient conducting state in multilayer 2H-MoTe2
- Author
-
Zhou, XinYu, Wang, H, Liu, Q M, Zhang, S J, Xu, S X, Wu, Q, Li, R S, Yue, L, Hu, T C, Yuan, J Y, Han, S S, Dong, T, Wu, D, and Wang, N L
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Strongly Correlated Electrons - Abstract
Ultrafast light-matter interaction has emerged as a powerful tool to control and probe the macroscopic properties of functional materials, especially two-dimensional transition metal dichalcogenides, which can form different structural phases with distinct physical properties. However, it is often difficult to accurately determine the transient optical constants. In this work, we developed a near-infrared pump - terahertz to mid-infrared (12-22 THz) probe system in transmission geometry to measure the transient optical conductivity of the layered material 2H-MoTe2. By performing separate measurements on bulk and thin-film samples, we are able to overcome issues related to nonuniform substrate thickness and penetration depth mismatch and to extract the transient optical constants reliably. Our results show that photoexcitation at 690 nm induces a transient insulator-metal transition, while photoexcitation at 2 μm has a much smaller effect because the photon energy is smaller than the band gap of the material. Combining this with a single-color pump-probe measurement, we show that the transient response evolves towards the 1T' phase at higher fluence. Our work provides a comprehensive understanding of the photoinduced phase transition in the 2H-MoTe2 system., Comment: 9 pages, 11 figures
- Published
- 2023
- Full Text
- View/download PDF
42. NBMOD: Find It and Grasp It in Noisy Background
- Author
-
Cao, Boyuan, Zhou, Xinyu, Guo, Congmin, Zhang, Baohua, Liu, Yuchen, and Tan, Qianqiu
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Robotics - Abstract
Grasping objects is a fundamental yet important capability of robots, and many tasks such as sorting and picking rely on this skill. The prerequisite for stable grasping is the ability to correctly identify suitable grasping positions. However, finding appropriate grasping points is challenging due to the diverse shapes, varying density distributions, and significant differences between the barycenters of various objects. In the past few years, researchers have proposed many methods to address the above-mentioned issues and achieved very good results on publicly available datasets such as the Cornell dataset and the Jacquard dataset. The problem is that the backgrounds of the Cornell and Jacquard datasets are relatively simple - typically just a whiteboard - while in real-world operational environments the background can be complex and noisy. Moreover, in real-world scenarios, robots usually only need to grasp fixed types of objects. To address the aforementioned issues, we propose a large-scale grasp detection dataset called NBMOD: Noisy Background Multi-Object Dataset for grasp detection, which consists of 31,500 RGB-D images of 20 different types of fruits. Accurately predicting angles has always been a challenging problem in oriented bounding box detection. This paper presents a Rotation Anchor Mechanism (RAM) to address this issue. Considering the high real-time requirements of robotic systems, we propose a series of lightweight architectures called RA-GraspNet (GraspNet with Rotation Anchor): RARA (network with Rotation Anchor and Region Attention), RAST (network with Rotation Anchor and Semi Transformer), and RAGT (network with Rotation Anchor and Global Transformer) to tackle this problem. Among them, the RAGT-3/3 model achieves an accuracy of 99% on the NBMOD dataset. NBMOD and our code are available at https://github.com/kmittle/Grasp-Detection-NBMOD.
- Published
- 2023
43. Ride-hailing pick-up area recommendation in a vehicle-cloud collaborative environment: a feature-aware personalized clustering federated learning approach
- Author
-
Zhou, Xinyu, Liao, ZhuHua, Zhao, Yijiang, Liu, Yizhi, and Yi, Aiping
- Published
- 2025
- Full Text
- View/download PDF
44. How service robots’ human-like appearance impacts consumer trust: a study across diverse cultures and service settings
- Author
-
Li, Yi, Zhou, Xinyu, Jiang, Xia, Fan, Fan, and Song, Bo
- Published
- 2024
- Full Text
- View/download PDF
45. Gmd: Gaussian mixture descriptor for pair matching of 3D fragments
- Author
-
Xiong, Meijun, Shi, Zhenguo, Zhou, Xinyu, Zhang, Yuhe, and Zhang, Shunli
- Published
- 2024
- Full Text
- View/download PDF
46. Stability in change: building a stable ecological security pattern in Northeast China under climate and land use changes
- Author
-
Zhang, Boyan, Zou, Hui, Duan, Detai, Zhou, Xinyu, Chen, Jianxi, Sun, Zhonghua, and Zhang, Xinxin
- Published
- 2024
- Full Text
- View/download PDF
47. The S100 family is a prognostic biomarker and correlated with immune cell infiltration in pan-cancer
- Author
-
Liang, Xiaojie, Huang, Xiaoshan, Cai, Zihong, Deng, Yeling, Liu, Dan, Hu, Jiayi, Jin, Zhihao, Zhou, Xinyu, Zhou, Hongsheng, and Wang, Liang
- Published
- 2024
- Full Text
- View/download PDF
48. Identifying plasma metabolic characteristics of major depressive disorder, bipolar disorder, and schizophrenia in adolescents
- Author
-
Yin, Bangmin, Cai, Yuping, Teng, Teng, Wang, Xiaolin, Liu, Xueer, Li, Xuemei, Wang, Jie, Wu, Hongyan, He, Yuqian, Ren, Fandong, Kou, Tianzhang, Zhu, Zheng-Jiang, and Zhou, Xinyu
- Published
- 2024
- Full Text
- View/download PDF
49. Distribution and protection of Thesium chinense Turcz. under climate and land use change
- Author
-
Zhang, Boyan, Chen, Bingrui, Zhou, Xinyu, Zou, Hui, Duan, Detai, Zhang, Xiyuan, and Zhang, Xinxin
- Published
- 2024
- Full Text
- View/download PDF
50. Exploring the effect of competing mechanism in an immersive learning game based on augmented reality
- Author
-
Zhan, Zehui, Zhou, Xinyu, Cai, Shaohua, and Lan, Xixin
- Published
- 2024
- Full Text
- View/download PDF