Author: "Cai, Yuxuan" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Cai, Yuxuan"' showing total 264 results

1. LLaVA-KD: A Framework of Distilling Multimodal Large Language Models

Author: Cai, Yuxuan, Zhang, Jiangning, He, Haoyang, He, Xinwei, Tong, Ao, Gan, Zhenye, Wang, Chengjie, and Bai, Xiang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The success of Large Language Models (LLM) has led researchers to explore Multimodal Large Language Models (MLLM) for unified visual and linguistic understanding. However, the increasing model size and computational complexity of MLLM limit their use in resource-constrained environments. Small-scale MLLM (s-MLLM) aims to retain the capabilities of the large-scale model (l-MLLM) while reducing computational demands, but this results in a significant decline in performance. To address the aforementioned issues, we propose a novel LLaVA-KD framework to transfer knowledge from l-MLLM to s-MLLM. Specifically, we introduce Multimodal Distillation (MDist) to minimize the divergence between the visual-textual output distributions of l-MLLM and s-MLLM, and Relation Distillation (RDist) to transfer l-MLLM's ability to model correlations between visual features. Additionally, we propose a three-stage training scheme to fully exploit the potential of s-MLLM: 1) Distilled Pre-Training to align visual-textual representations, 2) Supervised Fine-Tuning to equip the model with multimodal understanding, and 3) Distilled Fine-Tuning to further transfer l-MLLM capabilities. Our approach significantly improves performance without altering the small model's architecture. Extensive experiments and ablation studies validate the effectiveness of each proposed component. Code will be available at https://github.com/Fantasyele/LLaVA-KD., Comment: Under review
Published: 2024

2. Allegro: Open the Black Box of Commercial-Level Video Generation Model

Author: Zhou, Yuan, Wang, Qiuyue, Cai, Yuxuan, and Yang, Huan
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Significant advancements have been made in the field of video generation, with the open-source community contributing a wealth of research papers and tools for training high-quality models. However, despite these efforts, the available information and resources remain insufficient for achieving commercial-level performance. In this report, we open the black box and introduce Allegro, an advanced video generation model that excels in both quality and temporal consistency. We also highlight the current limitations in the field and present a comprehensive methodology for training high-performance, commercial-level video generation models, addressing key aspects such as data, model architecture, training pipeline, and evaluation. Our user study shows that Allegro surpasses existing open-source models and most commercial models, ranking just behind Hailuo and Kling. Code: https://github.com/rhymes-ai/Allegro , Model: https://huggingface.co/rhymes-ai/Allegro , Gallery: https://rhymes.ai/allegro_gallery .
Published: 2024

3. Attention-Guided Perturbation for Unsupervised Image Anomaly Detection

Author: Huang, Tingfeng, Cheng, Yuxuan, Xia, Jingbo, Yu, Rui, Cai, Yuxuan, Xiang, Jinhai, He, Xinwei, and Bai, Xiang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Reconstruction-based methods have significantly advanced modern unsupervised anomaly detection. However, the strong capacity of neural networks often violates the underlying assumptions by reconstructing abnormal samples well. To alleviate this issue, we present a simple yet effective reconstruction framework named Attention-Guided Perturbation Network (AGPNet), which learns to add perturbation noise with an attention mask, for accurate unsupervised anomaly detection. Specifically, it consists of two branches, i.e., a plain reconstruction branch and an auxiliary attention-based perturbation branch. The reconstruction branch is simply a plain reconstruction network that learns to reconstruct normal samples, while the auxiliary branch aims to produce attention masks to guide the noise perturbation process for normal samples from easy to hard. By doing so, we expect to synthesize hard yet more informative anomalies for training, which enable the reconstruction branch to learn important inherent normal patterns both comprehensively and efficiently. Extensive experiments are conducted on three popular benchmarks covering MVTec-AD, VisA, and MVTec-3D, and show that our framework obtains leading anomaly detection performance under various setups including few-shot, one-class, and multi-class setups.
Published: 2024

4. A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection

Author: Zhang, Jiangning, He, Haoyang, Gan, Zhenye, He, Qingdong, Cai, Yuxuan, Xue, Zhucun, Wang, Yabiao, Wang, Chengjie, Xie, Lei, and Liu, Yong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Visual anomaly detection aims to identify anomalous regions in images through unsupervised learning paradigms, with increasing application demand and value in fields such as industrial inspection and medical lesion detection. Despite significant progress in recent years, there is a lack of comprehensive benchmarks to adequately evaluate the performance of various mainstream methods across different datasets under the practical multi-class setting. The absence of standardized experimental setups can lead to potential biases in training epochs, resolution, and metric results, resulting in erroneous conclusions. This paper addresses this issue by proposing a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework that is highly extensible for new methods. The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics. Additionally, we have proposed the GPU-assisted ADEval package to address the slow evaluation problem of metrics like time-consuming mAU-PRO on large-scale data, significantly reducing evaluation time by more than 1000-fold. Through extensive experimental results, we objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection. We hope that ADer will become a valuable resource for researchers and practitioners in the field, promoting the development of more robust and generalizable anomaly detection systems. Full codes are open-sourced at https://github.com/zhangzjn/ader.
Published: 2024

5. High-Performance Temporal Reversible Spiking Neural Networks with $O(L)$ Training Memory and $O(1)$ Inference Cost

Author: Hu, JiaKui, Yao, Man, Qiu, Xuerui, Chou, Yuhong, Cai, Yuxuan, Qiao, Ning, Tian, Yonghong, XU, Bo, and Li, Guoqi
Subjects: Computer Science - Neural and Evolutionary Computing
Abstract: Multi-timestep simulation of brain-inspired Spiking Neural Networks (SNNs) boosts memory requirements during training and increases inference energy cost. Current training methods cannot simultaneously solve both training and inference dilemmas. This work proposes a novel Temporal Reversible architecture for SNNs (T-RevSNN) to jointly address the training and inference challenges by altering the forward propagation of SNNs. We turn off the temporal dynamics of most spiking neurons and design multi-level temporal reversible interactions at temporal turn-on spiking neurons, resulting in an $O(L)$ training memory. Combined with the temporal reversible nature, we redesign the input encoding and network organization of SNNs to achieve $O(1)$ inference energy cost. Then, we finely adjust the internal units and residual connections of the basic SNN block to ensure the effectiveness of sparse temporal information interaction. T-RevSNN achieves excellent accuracy on ImageNet, while the memory efficiency, training time acceleration, and inference energy efficiency can be significantly improved by $8.6 \times$, $2.0 \times$, and $1.6 \times$, respectively. This work is expected to break the technical bottleneck of significantly increasing memory cost and training time for large-scale SNNs while maintaining high performance and low inference energy cost. Source code and models are available at: https://github.com/BICLab/T-RevSNN., Comment: Accepted by ICML2024
Published: 2024

6. Anomaly Detection by Adapting a pre-trained Vision Language Model

Author: Cai, Yuxuan, He, Xinwei, Liang, Dingkang, Tong, Ao, and Bai, Xiang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Recently, large vision and language models have shown their success when adapting them to many downstream tasks. In this paper, we present a unified framework named CLIP-ADA for Anomaly Detection by Adapting a pre-trained CLIP model. To this end, we make two important improvements: 1) To acquire unified anomaly detection across industrial images of multiple categories, we introduce the learnable prompt and propose to associate it with abnormal patterns through self-supervised learning. 2) To fully exploit the representation power of CLIP, we introduce an anomaly region refinement strategy to refine the localization quality. During testing, the anomalies are localized by directly calculating the similarity between the representation of the learnable prompt and the image. Comprehensive experiments demonstrate the superiority of our framework, e.g., we achieve the state-of-the-art 97.5/55.6 and 89.3/33.1 on MVTec-AD and VisA for anomaly detection and localization. In addition, the proposed method also achieves encouraging performance with marginal training data, which is more challenging.
Published: 2024

7. Yi: Open Foundation Models by 01.AI

Author: AI, 01., Young, Alex, Chen, Bei, Li, Chao, Huang, Chengen, Zhang, Ge, Zhang, Guanwei, Li, Heng, Zhu, Jiangcheng, Chen, Jianqun, Chang, Jing, Yu, Kaidong, Liu, Peng, Liu, Qiang, Yue, Shawn, Yang, Senbin, Yang, Shiming, Yu, Tao, Xie, Wen, Huang, Wenhao, Hu, Xiaohui, Ren, Xiaoyi, Niu, Xinyao, Nie, Pengcheng, Xu, Yuchi, Liu, Yudong, Wang, Yue, Cai, Yuxuan, Gu, Zhenyu, Liu, Zhiyuan, and Dai, Zonghong
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: We introduce the Yi model family, a series of language and multimodal models that demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and 34B pretrained language models, then we extend them to chat models, 200K long context models, depth-upscaled models, and vision-language models. Our base models achieve strong performance on a wide range of benchmarks like MMLU, and our finetuned chat models deliver strong human preference rate on major evaluation platforms like AlpacaEval and Chatbot Arena. Building upon our scalable super-computing infrastructure and the classical transformer architecture, we attribute the performance of Yi models primarily to its data quality resulting from our data-engineering efforts. For pretraining, we construct 3.1 trillion tokens of English and Chinese corpora using a cascaded data deduplication and quality filtering pipeline. For finetuning, we polish a small scale (less than 10K) instruction dataset over multiple iterations such that every single instance has been verified directly by our machine learning engineers. For vision-language, we combine the chat language model with a vision transformer encoder and train the model to align visual representations to the semantic space of the language model. We further extend the context length to 200K through lightweight continual pretraining and demonstrate strong needle-in-a-haystack retrieval performance. We show that extending the depth of the pretrained checkpoint through continual pretraining further improves performance. We believe that given our current results, continuing to scale up model parameters using thoroughly optimized data will lead to even stronger frontier models.
Published: 2024

8. PSO-ECM: particle swarm optimization-based evidential C-means algorithm

Author: Cai, Yuxuan, Zhou, Qianli, and Deng, Yong
Published: 2024

9. A Discrepancy Aware Framework for Robust Anomaly Detection

Author: Cai, Yuxuan, Liang, Dingkang, Luo, Dongliang, He, Xinwei, Yang, Xin, and Bai, Xiang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Defect detection is a critical research area in artificial intelligence. Recently, synthetic data-based self-supervised learning has shown great potential on this task. Although many sophisticated synthesizing strategies exist, little research has been done to investigate the robustness of models when faced with different strategies. In this paper, we focus on this issue and find that existing methods are highly sensitive to them. To alleviate this issue, we present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies across different anomaly detection benchmarks. We hypothesize that the high sensitivity to synthetic data of existing self-supervised methods arises from their heavy reliance on the visual appearance of synthetic data during decoding. In contrast, our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance. To this end, inspired by existing knowledge distillation methods, we employ a teacher-student network, which is trained based on synthesized outliers, to compute the discrepancy map as the cue. Extensive experiments on two challenging datasets prove the robustness of our method. Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance. Code is available at: https://github.com/caiyuxuan1120/DAF., Comment: Accepted by IEEE Transactions on Industrial Informatics. Code is available at: https://github.com/caiyuxuan1120/DAF
Published: 2023

10. RevColV2: Exploring Disentangled Representations in Masked Image Modeling

Author: Han, Qi, Cai, Yuxuan, and Zhang, Xiangyu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Masked image modeling (MIM) has become a prevalent pre-training setup for vision foundation models and attains promising performance. Despite its success, existing MIM methods discard the decoder network during downstream applications, which results in inconsistent representations between pre-training and fine-tuning and can hamper downstream task performance. In this paper, we propose a new architecture, RevColV2, which tackles this issue by keeping the entire autoencoder architecture during both pre-training and fine-tuning. The main body of RevColV2 contains bottom-up columns and top-down columns, between which information is reversibly propagated and gradually disentangled. Such design enables our architecture with the nice property: maintaining disentangled low-level and semantic information at the end of the network in MIM pre-training. Our experimental results suggest that a foundation model with decoupled features can achieve competitive performance across multiple downstream vision tasks such as image classification, semantic segmentation and object detection. For example, after intermediate fine-tuning on ImageNet-22K dataset, RevColV2-L attains 88.4% top-1 accuracy on ImageNet-1K classification and 58.6 mIoU on ADE20K semantic segmentation. With extra teacher and large scale dataset, RevColV2-L achieves 62.1 box AP on COCO detection and 60.4 mIoU on ADE20K semantic segmentation. Code and models are released at https://github.com/megvii-research/RevCol
Published: 2023

11. Reversible Column Networks

Author: Cai, Yuxuan, Zhou, Yizhuang, Han, Qi, Sun, Jianjian, Kong, Xiangwen, Li, Jun, and Zhang, Xiangyu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We propose a new neural network design paradigm Reversible Column Network (RevCol). The main body of RevCol is composed of multiple copies of subnetworks, named columns respectively, between which multi-level reversible connections are employed. Such architectural scheme attributes RevCol very different behavior from conventional networks: during forward propagation, features in RevCol are learned to be gradually disentangled when passing through each column, whose total information is maintained rather than compressed or discarded as other network does. Our experiments suggest that CNN-style RevCol models can achieve very competitive performances on multiple computer vision tasks such as image classification, object detection and semantic segmentation, especially with large parameter budget and large dataset. For example, after ImageNet-22K pre-training, RevCol-XL obtains 88.2% ImageNet-1K accuracy. Given more pre-training data, our largest model RevCol-H reaches 90.0% on ImageNet-1K, 63.8% APbox on COCO detection minival set, 61.0% mIoU on ADE20k segmentation. To our knowledge, it is the best COCO detection and ADE20k segmentation result among pure (static) CNN models. Moreover, as a general macro architecture fashion, RevCol can also be introduced into transformers or other neural networks, which is demonstrated to improve the performances in both computer vision and NLP tasks. We release code and models at https://github.com/megvii-research/RevCol, Comment: Accepted by ICLR 2023
Published: 2022

12. An Adaptive Detection Method of Spatial Circular Feature Based on Arc Segment Under Different Lighting Conditions

Author: Ding, Yibing, Zhang, Zongzheng, Cai, Yuxuan, Su, Zhenhua, Wang, Haiming, Luo, Ying, Gong, Shengping, Wen, Hao, Urbach, H. Paul, editor, Li, Deren, editor, and Yu, Dengyun, editor
Published: 2024

13. New insights into assembly processes and driving factors of urban soil microbial community under environmental stress in Beijing

Author: Chen, Ying, Tao, Shiyang, Ma, Jin, Qu, Yajing, Sun, Yi, Wang, Meiying, and Cai, Yuxuan
Published: 2024

14. BiOIO3/Bi12SiO20 core-shell S-type heterojunction for efficient photocatalytic removal of bisphenol A: Performance and mechanism study

Author: Huang, Min, Xiong, Jianhua, Xiao, Xiangyu, Jiang, Zhongqin, Liang, Yinna, Cai, Yuxuan, Zeng, Yingxia, and Chen, Yongli
Published: 2024

15. Predictive analysis and risk assessment of potentially toxic elements in Beijing gas station soils using machine learning and two-dimensional Monte Carlo simulations

Author: Wang, Meiying, Gou, Zilun, Zhao, Wenhao, Qu, Yajing, Chen, Ying, Sun, Yi, Cai, Yuxuan, and Ma, Jin
Published: 2024

16. Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration

Author: Gong, Yifan, Yuan, Geng, Zhan, Zheng, Niu, Wei, Li, Zhengang, Zhao, Pu, Cai, Yuxuan, Liu, Sijia, Ren, Bin, Lin, Xue, Tang, Xulong, and Wang, Yanzhi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Weight pruning is an effective model compression technique to tackle the challenges of achieving real-time deep neural network (DNN) inference on mobile devices. However, prior pruning schemes have limited application scenarios due to accuracy degradation, difficulty in leveraging hardware acceleration, and/or restriction on certain types of DNN layers. In this paper, we propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations that are applicable to any type of DNN layer while achieving high accuracy and hardware inference performance. With the flexibility of applying different pruning schemes to different layers enabled by our compiler optimizations, we further probe into the new problem of determining the best-suited pruning scheme considering the different acceleration and accuracy performance of various pruning schemes. Two pruning scheme mapping methods, one is search-based and the other is rule-based, are proposed to automatically derive the best-suited pruning regularity and block size for each layer of any given DNN. Experimental results demonstrate that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework with up to 2.48$\times$ and 1.73$\times$ DNN inference acceleration on CIFAR-10 and ImageNet dataset without accuracy loss.
Published: 2021

17. Achieving Real-Time Object Detection on Mobile Devices with Neural Pruning Search

Author: Zhao, Pu, Niu, Wei, Yuan, Geng, Cai, Yuxuan, Ren, Bin, Wang, Yanzhi, and Lin, Xue
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Object detection plays an important role in self-driving cars for security development. However, mobile systems on self-driving cars with limited computation resources lead to difficulties for object detection. To facilitate this, we propose a compiler-aware neural pruning search framework to achieve high-speed inference on autonomous vehicles for 2D and 3D object detection. The framework automatically searches the pruning scheme and rate for each layer to find a best-suited pruning for optimizing detection accuracy and speed performance under compiler optimization. Our experiments demonstrate that for the first time, the proposed method achieves (close-to) real-time, 55ms and 99ms inference times for YOLOv4 based 2D object detection and PointPillars based 3D detection, respectively, on an off-the-shelf mobile phone with minor (or no) accuracy loss., Comment: Presented on the HiPEAC 2021 workshop (cogarch 2021)
Published: 2021

18. Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI

Author: Yuan, Geng, Liao, Zhiheng, Ma, Xiaolong, Cai, Yuxuan, Kong, Zhenglun, Shen, Xuan, Fu, Jingyan, Li, Zhengang, Zhang, Chengming, Peng, Hongwu, Liu, Ning, Ren, Ao, Wang, Jinhui, and Wang, Yanzhi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Performance
Abstract: Recent research demonstrated the promise of using resistive random access memory (ReRAM) as an emerging technology to perform inherently parallel analog domain in-situ matrix-vector multiplication -- the intensive and key computation in deep neural networks (DNNs). However, hardware failure, such as stuck-at-fault defects, is one of the main concerns that impedes the ReRAM devices to be a feasible solution for real implementations. The existing solutions to address this issue usually require an optimization to be conducted for each individual device, which is impractical for mass-produced products (e.g., IoT devices). In this paper, we rethink the value of weight pruning in ReRAM-based DNN design from the perspective of model fault tolerance. And a differential mapping scheme is proposed to improve the fault tolerance under a high stuck-on fault rate. Our method can tolerate almost an order of magnitude higher failure rate than the traditional two-column method in representative DNN tasks. More importantly, our method does not require extra hardware cost compared to the traditional two-column mapping scheme. The improvement is universal and does not require the optimization process for each individual device., Comment: In Proceedings of the 22nd International Symposium on Quality Electronic Design (ISQED), 2021
Published: 2021

19. Evaluating implied urban nature vitality in San Francisco: An interdisciplinary approach combining census data, street view images, and social media analysis

Author: Chen, Mingze, Cai, Yuxuan, Guo, Shuying, Sun, Ruilin, Song, Yang, and Shen, Xiwei
Published: 2024

20. Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device

Author: Zhao, Pu, Niu, Wei, Yuan, Geng, Cai, Yuxuan, Sung, Hsin-Hsuan, Liu, Sijia, Shen, Xipeng, Ren, Bin, Wang, Yanzhi, and Lin, Xue
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: 3D object detection is an important task, especially in the autonomous driving application domain. However, it is challenging to support the real-time performance with the limited computation and memory resources on edge-computing devices in self-driving cars. To achieve this, we propose a compiler-aware unified framework incorporating network enhancement and pruning search with the reinforcement learning techniques, to enable real-time inference of 3D object detection on the resource-limited edge-computing devices. Specifically, a generator Recurrent Neural Network (RNN) is employed to provide the unified scheme for both network enhancement and pruning search automatically, without human expertise and assistance. And the evaluated performance of the unified schemes can be fed back to train the generator RNN. The experimental results demonstrate that the proposed framework firstly achieves real-time 3D object detection on mobile devices (Samsung Galaxy S20 phone) with competitive detection performance.
Published: 2020

21. NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration

Author: Li, Zhengang, Yuan, Geng, Niu, Wei, Zhao, Pu, Li, Yanyu, Cai, Yuxuan, Shen, Xuan, Zhan, Zheng, Kong, Zhenglun, Jin, Qing, Chen, Zhiyu, Liu, Sijia, Yang, Kaiyuan, Ren, Bin, Wang, Yanzhi, and Lin, Xue
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing
Abstract: With the increasing demand to efficiently deploy DNNs on mobile edge devices, it becomes much more important to reduce unnecessary computation and increase the execution speed. Prior methods towards this goal, including model compression and network architecture search (NAS), are largely performed independently and do not fully consider compiler-level optimizations which is a must-do for mobile acceleration. In this work, we first propose (i) a general category of fine-grained structured pruning applicable to various DNN layers, and (ii) a comprehensive, compiler automatic code generation framework supporting different DNNs and different pruning schemes, which bridge the gap of model compression and NAS. We further propose NPAS, a compiler-aware unified network pruning, and architecture search. To deal with large search space, we propose a meta-modeling procedure based on reinforcement learning with fast evaluation and Bayesian optimization, ensuring the total number of training epochs comparable with representative NAS frameworks. Our framework achieves 6.7ms, 5.9ms, 3.9ms ImageNet inference times with 78.2%, 75% (MobileNet-V3 level), and 71% (MobileNet-V2 level) Top-1 accuracy respectively on an off-the-shelf mobile phone, consistently outperforming prior work., Comment: Accepted as an oral paper in the Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Published: 2020

22. Study on the degradation of triethylamine wastewater by catalytic ozone-biological coupled system of Ce-Fe@ZSM-5

Author: Xiao, Xiangyu, Zhou, Zhenqi, Jiang, Zhongqin, Jiao, Chunlin, Liang, Yinna, Du, Ang, Cai, Yuxuan, Xiong, Jianhua, and Chen, Yongli
Published: 2024

23. YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design

Author: Cai, Yuxuan, Li, Hongjia, Yuan, Geng, Niu, Wei, Li, Yanyu, Tang, Xulong, Ren, Bin, and Wang, Yanzhi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: The rapid development and wide utilization of object detection techniques have aroused attention on both accuracy and speed of object detectors. However, the current state-of-the-art object detection works are either accuracy-oriented using a large model but leading to high latency or speed-oriented using a lightweight model but sacrificing accuracy. In this work, we propose YOLObile framework, a real-time object detection on mobile devices via compression-compilation co-design. A novel block-punched pruning scheme is proposed for any kernel size. To improve computational efficiency on mobile devices, a GPU-CPU collaborative scheme is adopted along with advanced compiler-assisted optimizations. Experimental results indicate that our pruning scheme achieves 14$\times$ compression rate of YOLOv4 with 49.0 mAP. Under our YOLObile framework, we achieve 17 FPS inference speed using GPU on Samsung Galaxy S20. By incorporating our proposed GPU-CPU collaborative scheme, the inference speed is increased to 19.1 FPS, and outperforms the original YOLOv4 by 5$\times$ speedup. Source code is at: https://github.com/nightsnack/YOLObile.
Published: 2020

24. Typical daily occupancy profiles of express hotels and its stochasticity effect on building heating and cooling loads

Author: Chen, Shuqin, Lv, Yinyan, Wang, Zhichao, Ma, Yuhang, Huang, Yurui, Wang, Yichao, Cai, Yuxuan, and Rao, Zhiqin
Published: 2023

25. Occupant-centric dynamic heating and cooling loads simplified prediction model for urban community at energy planning stage

Author: Chen, Shuqin, Huang, Yurui, Zhang, Xiyong, Kuznik, Frédéric, He, Xi, Ma, Yuhang, and Cai, Yuxuan
Published: 2023

26. Comparing BOLD and VASO-CBV population receptive field estimates in human visual cortex

Author: Oliveira, Ícaro A.F., Cai, Yuxuan, Hofstetter, Shir, Siero, Jeroen C.W., van der Zwaag, Wietske, and Dumoulin, Serge O.
Published: 2022

27. NEAT Based Approach for Cognitive Covert Communication Assisted by UAV-IRS

Author: Cai, Yuxuan, Liao, Xiaomin, Lu, Yaobin, Han, Yang, and Wang, Yulai
Published: 2024

28. Source Attribution Analysis of an Ozone Concentration Increase Event in the Main Urban Area of Xi'an Using the WRF-CMAQ Model

Author: Wang, Ju, Cai, Yuxuan, Zou, Sainan, Zhou, Xiaowei, and Fang, Chunsheng
Abstract: The significant increase in ambient ozone (O3) levels across China highlights the urgent need to investigate the sources and mechanisms driving regional O3 events, particularly in densely populated urban areas. This study focuses on Xi'an, located in northwestern China on the Guanzhong Plain near the Qinling Mountains, where the unique topography contributes to pollutant accumulation. Urbanization and industrial activities have significantly increased pollutant emissions. Utilizing the Weather Research and Forecasting–Community Multiscale Air Quality Model (WRF-CMAQ), we analyzed the contributions of specific regional and industrial sources to rising O3 levels, particularly during an atypical winter event characterized by unusually high concentrations. Our findings indicated that boundary conditions were the primary contributor to elevated O3 levels during this event. Notably, Xianyang and Baoji accounted for 30% and 22% of the increased O3 levels in Xi'an, respectively. Additionally, residential sources and transportation accounted for 31% and 28% of the O3 increase. Within the Xi'an metropolitan area, Baqiao District (18–27%) and Weiyang District (23–30%) emerged as leading contributors. The primary industries contributing to this rise included residential sources (28–37%) and transportation (35–43%). These insights underscore the need for targeted regulatory measures to mitigate O3 pollution in urban settings. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. The role of neural tuning in quantity perception

Author: Tsouli, Andromachi, Harvey, Ben M., Hofstetter, Shir, Cai, Yuxuan, van der Smagt, Maarten J., te Pas, Susan F., and Dumoulin, Serge O.
Published: 2022
Full Text: View/download PDF

30. Research on Infrared Decoy Throwing Strategy Based on Hierarchical Clustering Algorithm

Author: Cai Yuxuan, Wu Youli, Chen Bian, Gan Yuepeng, Wu Xin
Subjects: infrared countermeasure, infrared decoy, throwing strategy, hierarchical clustering algorithm, pedigree clustering diagram, Motor vehicles. Aeronautics. Astronautics, TL1-4050
Abstract: Infrared decoy is often used in infrared countermeasure because of its high cost performance and good effect. Research on its throwing strategy can provide reference for the use of decoy, so as to achieve the expected interference effect. Taking the infrared decoy throwing strategy as the research object, three countermeasure situations are set. The experimental data are obtained through the simulation system, and then the hierarchical clustering algorithm is used for cluster analysis. The clustering results show that in the set confrontation situation, the number of decoys in each group and the time of decoy throwing have a great influence on the interference effect. When the entry angle is 10° and the target has no maneuver, the way of early decoy throwing time and more decoy groups is selected. When the entry angle is 140°, the method of late throwing time and less decoy per group is selected. If the target has a turning maneuver, the method of late throwing time and more simultaneous decoy is selected. Thus, better interference effect can be obtained.
Published: 2021
Full Text: View/download PDF

31. Multi-energy driven form-stable phase change materials based on SEBS and reduced graphene oxide aerogel

Author: Cai, Yuxuan, Zhang, Nan, Yuan, Yanping, Zhong, Wei, and Yu, Nanyang
Published: 2021
Full Text: View/download PDF

32. A New Strategy for the Treatment of Old Corrugated Container Pulping Wastewater by the Ozone-Catalyzed Polyurethane Sponge Biodegradation Process

Author: Cai, Yuxuan, primary, Huang, Shaozhe, additional, and Xiong, Jianhua, additional
Published: 2024
Full Text: View/download PDF

33. An Ecolinguistic Perspective Study on Linguistic Resources Emotions of Migrants in Huizhou

Author: Cai, Yuxuan, primary
Published: 2024
Full Text: View/download PDF

34. A Discrepancy Aware Framework for Robust Anomaly Detection

Author: Cai, Yuxuan, primary, Liang, Dingkang, additional, Luo, Dongliang, additional, He, Xinwei, additional, Yang, Xin, additional, and Bai, Xiang, additional
Published: 2024
Full Text: View/download PDF

35. Individualized cognitive neuroscience needs 7T: Comparing numerosity maps at 3T and 7T MRI

Author: Cai, Yuxuan, Hofstetter, Shir, van der Zwaag, Wietske, Zuiderbaan, Wietske, and Dumoulin, Serge O.
Published: 2021
Full Text: View/download PDF

36. Adaptation to visual numerosity changes neural numerosity selectivity

Author: Tsouli, Andromachi, Cai, Yuxuan, van Ackooij, Martijn, Hofstetter, Shir, Harvey, Ben M., te Pas, Susan F., van der Smagt, Maarten J., and Dumoulin, Serge O.
Published: 2021
Full Text: View/download PDF

37. Familial Resilience in Crisis: Navigating the Mediating Landscape of Depressive Symptoms Between Uncertainty Stress and Suicide Behavior Among Chinese University Students

Author: Yan, Na, Zhou, Tong, Hu, Mingming, Cai, Yuxuan, Qi, Ling, Shiferaw, Blen Dereje, Wang, Wei, and Miao, Chunxia
Abstract: Na Yan,1 Tong Zhou,1 Mingming Hu,1 Yuxuan Cai,1 Ling Qi,1 Blen Dereje Shiferaw,1 Wei Wang,1–3 Chunxia Miao4 1School of Public Health, Xuzhou Medical University, Xuzhou, 221004, People's Republic of China; 2Research Center for Mental Crisis Prevention and Intervention of College Students in Jiangsu Province, Xuzhou Medical University, Xuzhou, 221004, People's Republic of China; 3Jiangsu Engineering Research Center of Biological Data Mining and Healthcare Transformation, Xuzhou Medical University, Xuzhou, 221004, People's Republic of China; 4School of Management, Xuzhou Medical University, Xuzhou, 221004, People's Republic of China. Correspondence: Wei Wang, School of Public Health, Xuzhou Medical University, 209 Tong Shan Road, Xuzhou, Jiangsu, 221004, People's Republic of China, Email weiwang90@163.com; Chunxia Miao, School of Management, Xuzhou Medical University, 209 Tongshan Road, Xuzhou, Jiangsu, 221004, People's Republic of China, Email miaochunxia1978@163.com. Background: Previous findings indicate that stress has a profound influence on suicide behavior, but the potential mediating and moderating mechanisms between uncertainty stress and suicide behavior are unknown. The present study, therefore, examined the relationship between uncertainty stress and suicide behavior, the mediating effect of depressive symptoms, and the moderating effect of family relationship in a sample of university students in China. Methods: 1828 university students were assessed anonymously using the Uncertainty Stress Scale, Center for Epidemiologic Studies Depression Scale, Brief Suicidal Scale, and Family Relationship Scale between May and June 2021. SPSS 26.0 was used for descriptive statistics and Spearman correlation analysis. PROCESS 3.5 was used to calculate the significance of the mediating and moderating effects of the variables. Results: Moderated mediation model analyses showed that: (a) depressive symptoms partially mediated the link between uncertainty stres
Published: 2024

38. ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

Author: Zhang, Jiangning, He, Haoyang, Gan, Zhenye, He, Qingdong, Cai, Yuxuan, Xue, Zhucun, Wang, Yabiao, Wang, Chengjie, Xie, Lei, and Liu, Yong
Abstract: Visual anomaly detection aims to identify anomalous regions in images through unsupervised learning paradigms, with increasing application demand and value in fields such as industrial inspection and medical lesion detection. Despite significant progress in recent years, there is a lack of comprehensive benchmarks to adequately evaluate the performance of various mainstream methods across different datasets under the practical multi-class setting. The absence of standardized experimental setups can lead to potential biases in training epochs, resolution, and metric results, resulting in erroneous conclusions. This paper addresses this issue by proposing a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework that is highly extensible for new methods. The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics. Additionally, we have open-sourced the GPU-assisted ADEval package (https://pypi.org/project/ADEval) to address the slow evaluation problem of metrics like time-consuming mAU-PRO on large-scale data, significantly reducing evaluation time by more than 1000-fold. Through extensive experimental results, we objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection. We hope that ADer will become a valuable resource for researchers and practitioners in the field, promoting the development of more robust and generalizable anomaly detection systems. Full codes have been attached in Appendix and open-sourced at https://github.com/zhangzjn/ader.
Published: 2024

39. Empathy and redemption: Exploring the narrative transformation of online support for mental health across communities before and after Covid-19.

Author: Cai, Yuxuan, Wei, Ertong, and Cai, Xintong
Subjects: *VIRTUAL communities, *MENTAL health, *COVID-19, *EMPATHY, *PUBLIC health, *INTERNET forums
Abstract: This study examines the impact of the COVID-19 pandemic on individuals' mental health and their online interactions, particularly within Reddit's mental health communities. By analyzing data from 15 subreddits categorized into mental health and control groups from 2018 to 2022, we observed that forums dedicated to mental health exhibited higher levels of user engagement and received more supportive responses than those in other categories. However, as the pandemic evolved, a significant decrease in online support was noted, especially within these mental health groups. This decline hints at a risk of emotional burnout among users, which poses a particularly acute challenge for individuals grappling with mental health issues. Intimate relationships also have an impact on the online expression of mental health. The research underscores the pandemic's effect on online support and interaction dynamics, signaling the necessity for a deeper understanding and the development of strategies to maintain support within online communities during times of crisis. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. Fumarate Hydratase Enhances the Therapeutic Effect of PD-1 Antibody in Colorectal Cancer by Regulating PCSK9

Author: Qin, Le, primary, Shi, Liang, additional, Wang, Yu, additional, Yu, Haixin, additional, Du, Zhouyuan, additional, Chen, Mian, additional, Cai, Yuxuan, additional, Cao, Yinghao, additional, Deng, Shenghe, additional, Wang, Jun, additional, Cheng, Denglong, additional, Heng, Yixin, additional, Xu, Jiaxin, additional, Cai, Kailin, additional, and Wu, Ke, additional
Published: 2024
Full Text: View/download PDF

41. Familial Resilience in Crisis: Navigating the Mediating Landscape of Depressive Symptoms Between Uncertainty Stress and Suicide Behavior Among Chinese University Students

Author: Yan, Na, primary, Zhou, Tong, additional, Hu, Mingming, additional, Cai, Yuxuan, additional, Qi, Ling, additional, Shiferaw, Blen Dereje, additional, Wang, Wei, additional, and Miao, Chunxia, additional
Published: 2024
Full Text: View/download PDF

42. SecFed: A Secure and Efficient Federated Learning Based on Multi-Key Homomorphic Encryption

Author: Cai, Yuxuan, primary, Ding, Wenxiu, additional, Xiao, Yuxuan, additional, Yan, Zheng, additional, Liu, Ximeng, additional, and Wan, Zhiguo, additional
Published: 2024
Full Text: View/download PDF

43. Topographic numerosity maps cover subitizing and estimation ranges

Author: Cai, Yuxuan, Hofstetter, Shir, van Dijk, Jelle, Zuiderbaan, Wietske, van der Zwaag, Wietske, Harvey, Ben M., and Dumoulin, Serge O.
Published: 2021
Full Text: View/download PDF

44. Topographic maps representing haptic numerosity reveals distinct sensory representations in supramodal networks

Author: Hofstetter, Shir, Cai, Yuxuan, Harvey, Ben M., and Dumoulin, Serge O.
Published: 2021
Full Text: View/download PDF

45. Temperature Effect of Railway PC Part Cable-stayed Bridge Based on Long-term Monitoring

Author: Lin, Guangting, primary, Cai, Yujie, additional, Cai, Yuxuan, additional, and Wang, Xiao, additional
Published: 2023
Full Text: View/download PDF

46. ESMAC: Efficient and Secure Multi-Owner Access Control With TEE in Multi-Level Data Processing

Author: Liu, Dan, primary, Yan, Zheng, additional, Ding, Wenxiu, additional, Cai, Yuxuan, additional, Chen, Yaxing, additional, and Wan, Zhiguo, additional
Published: 2023
Full Text: View/download PDF

47. Association between resting-state brain network topological organization and creative ability: Evidence from a multiple linear regression model

Author: Jiao, Bingqing, Zhang, Delong, Liang, Aiying, Liang, Bishan, Wang, Zengjian, Li, Junchao, Cai, Yuxuan, Gao, Mengxia, Gao, Zhenni, Chang, Song, Huang, Ruiwang, and Liu, Ming
Published: 2017
Full Text: View/download PDF

48. Relation of visual creative imagery manipulation to resting-state brain oscillations

Author: Cai, Yuxuan, Zhang, Delong, Liang, Bishan, Wang, Zengjian, Li, Junchao, Gao, Zhenni, Gao, Mengxia, Chang, Song, Jiao, Bingqing, Huang, Ruiwang, and Liu, Ming
Published: 2018
Full Text: View/download PDF

49. Ultra-light and flexible graphene aerogel-based form-stable phase change materials for energy conversion and energy storage

Author: Cai, Yuxuan, primary, Zhang, Nan, additional, Cao, Xiaoling, additional, Yuan, Yanping, additional, Zhang, Zhaoli, additional, and Yu, Nanyang, additional
Published: 2023
Full Text: View/download PDF

50. Nonsymbolic Numerosity Maps at the Occipitotemporal Cortex Respond to Symbolic Numbers

Author: Cai, Yuxuan, primary, Hofstetter, Shir, additional, and Dumoulin, Serge O., additional
Published: 2023
Full Text: View/download PDF
