Author: "Yue, Yutao" / Publication Year Range: Last 10 years - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Yue, Yutao"' showing total 133 results

Start Over Author "Yue, Yutao" Publication Year Range Last 10 years

133 results on '"Yue, Yutao"'

1. Maintaining Informative Coherence: Migrating Hallucinations in Large Language Models via Absorbing Markov Chains

Author: Wu, Jiemin, Lai, Songning, Xiao, Ruiqiang, Xue, Tianlang, Yang, Jiayu, and Yue, Yutao
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) are powerful tools for text generation, translation, and summarization, but they often suffer from hallucinations-instances where they fail to maintain the fidelity and coherence of contextual information during decoding, sometimes overlooking critical details due to their sampling strategies and inherent biases from training data and fine-tuning discrepancies. These hallucinations can propagate through the web, affecting the trustworthiness of information disseminated online. To address this issue, we propose a novel decoding strategy that leverages absorbing Markov chains to quantify the significance of contextual information and measure the extent of information loss during generation. By considering all possible paths from the first to the last token, our approach enhances the reliability of model outputs without requiring additional training or external data. Evaluations on datasets including TruthfulQA, FACTOR, and HaluEval highlight the superior performance of our method in mitigating hallucinations, underscoring the necessity of ensuring accurate information flow in web-based applications.
Published: 2024

2. radarODE-MTL: A Multi-Task Learning Framework with Eccentric Gradient Alignment for Robust Radar-Based ECG Reconstruction

Author: Zhang, Yuanyuan, Yang, Rui, Yue, Yutao, and Lim, Eng Gee
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Artificial Intelligence
Abstract: Millimeter-wave radar is promising to provide robust and accurate vital sign monitoring in an unobtrusive manner. However, the radar signal might be distorted in propagation by ambient noise or random body movement, ruining the subtle cardiac activities and destroying the vital sign recovery. In particular, the recovery of electrocardiogram (ECG) signal heavily relies on the deep-learning model and is sensitive to noise. Therefore, this work creatively deconstructs the radar-based ECG recovery into three individual tasks and proposes a multi-task learning (MTL) framework, radarODE-MTL, to increase the robustness against consistent and abrupt noises. In addition, to alleviate the potential conflicts in optimizing individual tasks, a novel multi-task optimization strategy, eccentric gradient alignment (EGA), is proposed to dynamically trim the task-specific gradients based on task difficulties in orthogonal space. The proposed radarODE-MTL with EGA is evaluated on the public dataset with prominent improvements in accuracy, and the performance remains consistent under noises. The experimental results indicate that radarODE-MTL could reconstruct accurate ECG signals robustly from radar signals and imply the application prospect in real-life situations. The code is available at: http://github.com/ZYY0844/radarODE-MTL.
Published: 2024

3. Wolf2Pack: The AutoFusion Framework for Dynamic Parameter Fusion

Author: Tian, Bowen, Lai, Songning, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: In the rapidly evolving field of deep learning, specialized models have driven significant advancements in tasks such as computer vision and natural language processing. However, this specialization leads to a fragmented ecosystem where models lack the adaptability for broader applications. To overcome this, we introduce AutoFusion, an innovative framework that fuses distinct model parameters(with the same architecture) for multi-task learning without pre-trained checkpoints. Using an unsupervised, end-to-end approach, AutoFusion dynamically permutes model parameters at each layer, optimizing the combination through a loss-minimization process that does not require labeled data. We validate AutoFusion's effectiveness through experiments on commonly used benchmark datasets, demonstrating superior performance over established methods like Weight Interpolation, Git Re-Basin, and ZipIt. Our framework offers a scalable and flexible solution for model integration, positioning it as a powerful tool for future research and practical applications., Comment: Under review
Published: 2024

4. CAT: Concept-level backdoor ATtacks for Concept Bottleneck Models

Author: Lai, Songning, Yang, Jiayu, Huang, Yu, Hu, Lijie, Xue, Tianlang, Hu, Zhangyi, Li, Jiaxu, Liao, Haicheng, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security
Abstract: Despite the transformative impact of deep learning across multiple domains, the inherent opacity of these models has driven the development of Explainable Artificial Intelligence (XAI). Among these efforts, Concept Bottleneck Models (CBMs) have emerged as a key approach to improve interpretability by leveraging high-level semantic information. However, CBMs, like other machine learning models, are susceptible to security threats, particularly backdoor attacks, which can covertly manipulate model behaviors. Understanding that the community has not yet studied the concept level backdoor attack of CBM, because of "Better the devil you know than the devil you don't know.", we introduce CAT (Concept-level Backdoor ATtacks), a methodology that leverages the conceptual representations within CBMs to embed triggers during training, enabling controlled manipulation of model predictions at inference time. An enhanced attack pattern, CAT+, incorporates a correlation function to systematically select the most effective and stealthy concept triggers, thereby optimizing the attack's impact. Our comprehensive evaluation framework assesses both the attack success rate and stealthiness, demonstrating that CAT and CAT+ maintain high performance on clean data while achieving significant targeted effects on backdoored datasets. This work underscores the potential security risks associated with CBMs and provides a robust testing methodology for future security assessments.
Published: 2024

5. MINER: Mining the Underlying Pattern of Modality-Specific Neurons in Multimodal Large Language Models

Author: Huang, Kaichen, Huo, Jiahao, Yan, Yibo, Wang, Kun, Yue, Yutao, and Hu, Xuming
Subjects: Computer Science - Computation and Language
Abstract: In recent years, multimodal large language models (MLLMs) have significantly advanced, integrating more modalities into diverse applications. However, the lack of explainability remains a major barrier to their use in scenarios requiring decision transparency. Current neuron-level explanation paradigms mainly focus on knowledge localization or language- and domain-specific analyses, leaving the exploration of multimodality largely unaddressed. To tackle these challenges, we propose MINER, a transferable framework for mining modality-specific neurons (MSNs) in MLLMs, which comprises four stages: (1) modality separation, (2) importance score calculation, (3) importance score aggregation, (4) modality-specific neuron selection. Extensive experiments across six benchmarks and two representative MLLMs show that (I) deactivating ONLY 2% of MSNs significantly reduces MLLMs performance (0.56 to 0.24 for Qwen2-VL, 0.69 to 0.31 for Qwen2-Audio), (II) different modalities mainly converge in the lower layers, (III) MSNs influence how key information from various modalities converges to the last token, (IV) two intriguing phenomena worth further investigation, i.e., semantic probing and semantic telomeres. The source code is available at this URL.
Published: 2024

6. UniBEVFusion: Unified Radar-Vision BEVFusion for 3D Object Detection

Author: Zhao, Haocheng, Guan, Runwei, Wu, Taoyu, Man, Ka Lok, Yu, Limin, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: 4D millimeter-wave (MMW) radar, which provides both height information and dense point cloud data over 3D MMW radar, has become increasingly popular in 3D object detection. In recent years, radar-vision fusion models have demonstrated performance close to that of LiDAR-based models, offering advantages in terms of lower hardware costs and better resilience in extreme conditions. However, many radar-vision fusion models treat radar as a sparse LiDAR, underutilizing radar-specific information. Additionally, these multi-modal networks are often sensitive to the failure of a single modality, particularly vision. To address these challenges, we propose the Radar Depth Lift-Splat-Shoot (RDL) module, which integrates radar-specific data into the depth prediction process, enhancing the quality of visual Bird-Eye View (BEV) features. We further introduce a Unified Feature Fusion (UFF) approach that extracts BEV features across different modalities using shared module. To assess the robustness of multi-modal models, we develop a novel Failure Test (FT) ablation experiment, which simulates vision modality failure by injecting Gaussian noise. We conduct extensive experiments on the View-of-Delft (VoD) and TJ4D datasets. The results demonstrate that our proposed Unified BEVFusion (UniBEVFusion) network significantly outperforms state-of-the-art models on the TJ4D dataset, with improvements of 1.44 in 3D and 1.72 in BEV object detection accuracy., Comment: 6 pages, 4 figues, conference
Published: 2024

7. DRIVE: Dependable Robust Interpretable Visionary Ensemble Framework in Autonomous Driving

Author: Lai, Songning, Xue, Tianlang, Xiao, Hongru, Hu, Lijie, Wu, Jiemin, Feng, Ninghui, Guan, Runwei, Liao, Haicheng, Li, Zhenning, and Yue, Yutao
Subjects: Computer Science - Robotics, Computer Science - Computer Vision and Pattern Recognition
Abstract: Recent advancements in autonomous driving have seen a paradigm shift towards end-to-end learning paradigms, which map sensory inputs directly to driving actions, thereby enhancing the robustness and adaptability of autonomous vehicles. However, these models often sacrifice interpretability, posing significant challenges to trust, safety, and regulatory compliance. To address these issues, we introduce DRIVE -- Dependable Robust Interpretable Visionary Ensemble Framework in Autonomous Driving, a comprehensive framework designed to improve the dependability and stability of explanations in end-to-end unsupervised autonomous driving models. Our work specifically targets the inherent instability problems observed in the Driving through the Concept Gridlock (DCG) model, which undermine the trustworthiness of its explanations and decision-making processes. We define four key attributes of DRIVE: consistent interpretability, stable interpretability, consistent output, and stable output. These attributes collectively ensure that explanations remain reliable and robust across different scenarios and perturbations. Through extensive empirical evaluations, we demonstrate the effectiveness of our framework in enhancing the stability and dependability of explanations, thereby addressing the limitations of current models. Our contributions include an in-depth analysis of the dependability issues within the DCG model, a rigorous definition of DRIVE with its fundamental properties, a framework to implement DRIVE, and novel metrics for evaluating the dependability of concept-based explainable autonomous driving models. These advancements lay the groundwork for the development of more reliable and trusted autonomous driving systems, paving the way for their broader acceptance and deployment in real-world applications.
Published: 2024

8. PEPL: Precision-Enhanced Pseudo-Labeling for Fine-Grained Image Classification in Semi-Supervised Learning

Author: Tian, Bowen, Lai, Songning, Li, Lujundong, Shuai, Zhihao, Guan, Runwei, Wu, Tian, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Fine-grained image classification has witnessed significant advancements with the advent of deep learning and computer vision technologies. However, the scarcity of detailed annotations remains a major challenge, especially in scenarios where obtaining high-quality labeled data is costly or time-consuming. To address this limitation, we introduce Precision-Enhanced Pseudo-Labeling(PEPL) approach specifically designed for fine-grained image classification within a semi-supervised learning framework. Our method leverages the abundance of unlabeled data by generating high-quality pseudo-labels that are progressively refined through two key phases: initial pseudo-label generation and semantic-mixed pseudo-label generation. These phases utilize Class Activation Maps (CAMs) to accurately estimate the semantic content and generate refined labels that capture the essential details necessary for fine-grained classification. By focusing on semantic-level information, our approach effectively addresses the limitations of standard data augmentation and image-mixing techniques in preserving critical fine-grained features. We achieve state-of-the-art performance on benchmark datasets, demonstrating significant improvements over existing semi-supervised strategies, with notable boosts in accuracy and robustness.Our code has been open sourced at https://github.com/TianSuya/SemiFG., Comment: Under review
Published: 2024

9. NanoMVG: USV-Centric Low-Power Multi-Task Visual Grounding based on Prompt-Guided Camera and 4D mmWave Radar

Author: Guan, Runwei, Liu, Jianan, Jia, Liye, Zhao, Haocheng, Yao, Shanliang, Zhu, Xiaohui, Man, Ka Lok, Lim, Eng Gee, Smith, Jeremy, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: Recently, visual grounding and multi-sensors setting have been incorporated into perception system for terrestrial autonomous driving systems and Unmanned Surface Vehicles (USVs), yet the high complexity of modern learning-based visual grounding model using multi-sensors prevents such model to be deployed on USVs in the real-life. To this end, we design a low-power multi-task model named NanoMVG for waterway embodied perception, guiding both camera and 4D millimeter-wave radar to locate specific object(s) through natural language. NanoMVG can perform both box-level and mask-level visual grounding tasks simultaneously. Compared to other visual grounding models, NanoMVG achieves highly competitive performance on the WaterVG dataset, particularly in harsh environments and boasts ultra-low power consumption for long endurance., Comment: 8 pages, 6 figures
Published: 2024

10. radarODE: An ODE-Embedded Deep Learning Model for Contactless ECG Reconstruction from Millimeter-Wave Radar

Author: Zhang, Yuanyuan, Guan, Runwei, Li, Lingxiao, Yang, Rui, Yue, Yutao, and Lim, Eng Gee
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Artificial Intelligence
Abstract: Radar-based contactless cardiac monitoring has become a popular research direction recently, but the fine-grained electrocardiogram (ECG) signal is still hard to reconstruct from millimeter-wave radar signal. The key obstacle is to decouple the cardiac activities in the electrical domain (i.e., ECG) from that in the mechanical domain (i.e., heartbeat), and most existing research only uses pure data-driven methods to map such domain transformation as a black box. Therefore, this work first proposes a signal model for domain transformation, and then a novel deep learning framework called radarODE is designed to fuse the temporal and morphological features extracted from radar signals and generate ECG. In addition, ordinary differential equations are embedded in radarODE as a decoder to provide morphological prior, helping the convergence of the model training and improving the robustness under body movements. After being validated on the dataset, the proposed radarODE achieves better performance compared with the benchmark in terms of missed detection rate, root mean square error, Pearson correlation coefficient with the improvement of 9%, 16% and 19%, respectively. The validation results imply that radarODE is capable of recovering ECG signals from radar signals with high fidelity and can be potentially implemented in real-life scenarios.
Published: 2024

11. MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model

Author: Huo, Jiahao, Yan, Yibo, Hu, Boren, Yue, Yutao, and Hu, Xuming
Subjects: Computer Science - Computation and Language
Abstract: Projecting visual features into word embedding space has become a significant fusion strategy adopted by Multimodal Large Language Models (MLLMs). However, its internal mechanisms have yet to be explored. Inspired by multilingual research, we identify domain-specific neurons in multimodal large language models. Specifically, we investigate the distribution of domain-specific neurons and the mechanism of how MLLMs process features from diverse domains. Furthermore, we propose a three-stage mechanism for language model modules in MLLMs when handling projected image features, and verify this hypothesis using logit lens. Extensive experiments indicate that while current MLLMs exhibit Visual Question Answering (VQA) capability, they may not fully utilize domain-specific information. Manipulating domain-specific neurons properly will result in a 10% change of accuracy at most, shedding light on the development of cross-domain, all-encompassing MLLMs in the future. The source code is available at https://github.com/Z1zs/MMNeuron., Comment: Accepted by the Main Conference of Empirical Methods in Natural Language Processing (EMNLP) 2024
Published: 2024

12. FTS: A Framework to Find a Faithful TimeSieve

Author: Lai, Songning, Feng, Ninghui, Gao, Jiechao, Wang, Hao, Sui, Haochen, Zou, Xin, Yang, Jiayu, Chen, Wenshuo, Zhao, Hang, Hu, Xuming, and Yue, Yutao
Subjects: Computer Science - Machine Learning
Abstract: The field of time series forecasting has garnered significant attention in recent years, prompting the development of advanced models like TimeSieve, which demonstrates impressive performance. However, an analysis reveals certain unfaithfulness issues, including high sensitivity to random seeds, input and layer noise perturbations and parametric perturbations. Recognizing these challenges, we embark on a quest to define the concept of \textbf{\underline{F}aithful \underline{T}ime\underline{S}ieve \underline{(FTS)}}, a model that consistently delivers reliable and robust predictions. To address these issues, we propose a novel framework aimed at identifying and rectifying unfaithfulness in TimeSieve. Our framework is designed to enhance the model's stability and faithfulness, ensuring that its outputs are less susceptible to the aforementioned factors. Experimentation validates the effectiveness of our proposed framework, demonstrating improved faithfulness in the model's behavior.
Published: 2024

13. Multi-Modal UAV Detection, Classification and Tracking Algorithm -- Technical Report for CVPR 2024 UG2 Challenge

Author: Deng, Tianchen, Zhou, Yi, Wu, Wenhua, Li, Mingrui, Huang, Jingwei, Liu, Shuhong, Song, Yanzeng, Zuo, Hao, Wang, Yanbo, Yue, Yutao, Wang, Hesheng, and Chen, Weidong
Subjects: Computer Science - Robotics, Computer Science - Computer Vision and Pattern Recognition
Abstract: This technical report presents the 1st winning model for UG2+, a task in CVPR 2024 UAV Tracking and Pose-Estimation Challenge. This challenge faces difficulties in drone detection, UAV-type classification and 2D/3D trajectory estimation in extreme weather conditions with multi-modal sensor information, including stereo vision, various Lidars, Radars, and audio arrays. Leveraging this information, we propose a multi-modal UAV detection, classification, and 3D tracking method for accurate UAV classification and tracking. A novel classification pipeline which incorporates sequence fusion, region of interest (ROI) cropping, and keyframe selection is proposed. Our system integrates cutting-edge classification techniques and sophisticated post-processing steps to boost accuracy and robustness. The designed pose estimation pipeline incorporates three modules: dynamic points analysis, a multi-object tracker, and trajectory completion techniques. Extensive experiments have validated the effectiveness and precision of our approach. In addition, we also propose a novel dataset pre-processing method and conduct a comprehensive ablation study for our design. We finally achieved the best performance in the classification and tracking of the MMUAD dataset. The code and configuration of our method are available at https://github.com/dtc111111/Multi-Modal-UAV., Comment: Accepted by CVPR 2024 workshop. The 1st winning model in CVPR 2024 UG2+ challenge. The code and configuration of our method are available at https://github.com/dtc111111/Multi-Modal-UAV
Published: 2024

14. Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension

Author: Guan, Runwei, Zhang, Ruixiao, Ouyang, Ningwei, Liu, Jianan, Man, Ka Lok, Cai, Xiaohao, Xu, Ming, Smith, Jeremy, Lim, Eng Gee, Yue, Yutao, and Xiong, Hui
Subjects: Computer Science - Robotics, Computer Science - Computer Vision and Pattern Recognition
Abstract: Embodied perception is essential for intelligent vehicles and robots in interactive environmental understanding. However, these advancements primarily focus on vision, with limited attention given to using 3D modeling sensors, restricting a comprehensive understanding of objects in response to prompts containing qualitative and quantitative queries. Recently, as a promising automotive sensor with affordable cost, 4D millimeter-wave radars provide denser point clouds than conventional radars and perceive both semantic and physical characteristics of objects, thereby enhancing the reliability of perception systems. To foster the development of natural language-driven context understanding in radar scenes for 3D visual grounding, we construct the first dataset, Talk2Radar, which bridges these two modalities for 3D Referring Expression Comprehension (REC). Talk2Radar contains 8,682 referring prompt samples with 20,558 referred objects. Moreover, we propose a novel model, T-RadarNet, for 3D REC on point clouds, achieving State-Of-The-Art (SOTA) performance on the Talk2Radar dataset compared to counterparts. Deformable-FPN and Gated Graph Fusion are meticulously designed for efficient point cloud feature modeling and cross-modal fusion between radar and text features, respectively. Comprehensive experiments provide deep insights into radar-based 3D REC. We release our project at https://github.com/GuanRunwei/Talk2Radar., Comment: 8 pages, 5 figures
Published: 2024

15. Referring Flexible Image Restoration

Author: Guan, Runwei, Hu, Rongsheng, Zhou, Zhuhao, Xue, Tianlang, Man, Ka Lok, Smith, Jeremy, Lim, Eng Gee, Ding, Weiping, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: In reality, images often exhibit multiple degradations, such as rain and fog at night (triple degradations). However, in many cases, individuals may not want to remove all degradations, for instance, a blurry lens revealing a beautiful snowy landscape (double degradations). In such scenarios, people may only desire to deblur. These situations and requirements shed light on a new challenge in image restoration, where a model must perceive and remove specific degradation types specified by human commands in images with multiple degradations. We term this task Referring Flexible Image Restoration (RFIR). To address this, we first construct a large-scale synthetic dataset called RFIR, comprising 153,423 samples with the degraded image, text prompt for specific degradation removal and restored image. RFIR consists of five basic degradation types: blur, rain, haze, low light and snow while six main sub-categories are included for varying degrees of degradation removal. To tackle the challenge, we propose a novel transformer-based multi-task model named TransRFIR, which simultaneously perceives degradation types in the degraded image and removes specific degradation upon text prompt. TransRFIR is based on two devised attention modules, Multi-Head Agent Self-Attention (MHASA) and Multi-Head Agent Cross Attention (MHACA), where MHASA and MHACA introduce the agent token and reach the linear complexity, achieving lower computation cost than vanilla self-attention and cross-attention and obtaining competitive performances. Our TransRFIR achieves state-of-the-art performances compared with other counterparts and is proven as an effective architecture for image restoration. We release our project at https://github.com/GuanRunwei/FIR-CP., Comment: 15 pages, 19 figures
Published: 2024

16. WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar

Author: Guan, Runwei, Jia, Liye, Yang, Fengyufan, Yao, Shanliang, Purwanto, Erick, Zhu, Xiaohui, Lim, Eng Gee, Smith, Jeremy, Man, Ka Lok, Hu, Xuming, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia, Computer Science - Robotics
Abstract: The perception of waterways based on human intent is significant for autonomous navigation and operations of Unmanned Surface Vehicles (USVs) in water environments. Inspired by visual grounding, we introduce WaterVG, the first visual grounding dataset designed for USV-based waterway perception based on human prompts. WaterVG encompasses prompts describing multiple targets, with annotations at the instance level including bounding boxes and masks. Notably, WaterVG includes 11,568 samples with 34,987 referred targets, whose prompts integrates both visual and radar characteristics. The pattern of text-guided two sensors equips a finer granularity of text prompts with visual and radar features of referred targets. Moreover, we propose a low-power visual grounding model, Potamoi, which is a multi-task model with a well-designed Phased Heterogeneous Modality Fusion (PHMF) mode, including Adaptive Radar Weighting (ARW) and Multi-Head Slim Cross Attention (MHSCA). Exactly, ARW extracts required radar features to fuse with vision for prompt alignment. MHSCA is an efficient fusion module with a remarkably small parameter count and FLOPs, elegantly fusing scenario context captured by two sensors with linguistic features, which performs expressively on visual grounding tasks. Comprehensive experiments and evaluations have been conducted on WaterVG, where our Potamoi archives state-of-the-art performances compared with counterparts., Comment: 10 pages, 10 figures
Published: 2024

17. Achelous++: Power-Oriented Water-Surface Panoptic Perception Framework on Edge Devices based on Vision-Radar Fusion and Pruning of Heterogeneous Modalities

Author: Guan, Runwei, Zhao, Haocheng, Yao, Shanliang, Man, Ka Lok, Zhu, Xiaohui, Yu, Limin, Yue, Yong, Smith, Jeremy, Lim, Eng Gee, Ding, Weiping, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computational Engineering, Finance, and Science, Computer Science - Robotics
Abstract: Urban water-surface robust perception serves as the foundation for intelligent monitoring of aquatic environments and the autonomous navigation and operation of unmanned vessels, especially in the context of waterway safety. It is worth noting that current multi-sensor fusion and multi-task learning models consume substantial power and heavily rely on high-power GPUs for inference. This contributes to increased carbon emissions, a concern that runs counter to the prevailing emphasis on environmental preservation and the pursuit of sustainable, low-carbon urban environments. In light of these concerns, this paper concentrates on low-power, lightweight, multi-task panoptic perception through the fusion of visual and 4D radar data, which is seen as a promising low-cost perception method. We propose a framework named Achelous++ that facilitates the development and comprehensive evaluation of multi-task water-surface panoptic perception models. Achelous++ can simultaneously execute five perception tasks with high speed and low power consumption, including object detection, object semantic segmentation, drivable-area segmentation, waterline segmentation, and radar point cloud semantic segmentation. Furthermore, to meet the demand for developers to customize models for real-time inference on low-performance devices, a novel multi-modal pruning strategy known as Heterogeneous-Aware SynFlow (HA-SynFlow) is proposed. Besides, Achelous++ also supports random pruning at initialization with different layer-wise sparsity, such as Uniform and Erdos-Renyi-Kernel (ERK). Overall, our Achelous++ framework achieves state-of-the-art performance on the WaterScenes benchmark, excelling in both accuracy and power efficiency compared to other single-task and multi-task models. We release and maintain the code at https://github.com/GuanRunwei/Achelous., Comment: 18 pages, 9 figures
Published: 2023

18. Exploring Radar Data Representations in Autonomous Driving: A Comprehensive Review

Author: Yao, Shanliang, Guan, Runwei, Peng, Zitian, Xu, Chenhang, Shi, Yilu, Ding, Weiping, Lim, Eng Gee, Yue, Yong, Seo, Hyungjoon, Man, Ka Lok, Ma, Jieming, Zhu, Xiaohui, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: With the rapid advancements of sensor technology and deep learning, autonomous driving systems are providing safe and efficient access to intelligent vehicles as well as intelligent transportation. Among these equipped sensors, the radar sensor plays a crucial role in providing robust perception information in diverse environmental conditions. This review focuses on exploring different radar data representations utilized in autonomous driving systems. Firstly, we introduce the capabilities and limitations of the radar sensor by examining the working principles of radar perception and signal processing of radar measurements. Then, we delve into the generation process of five radar representations, including the ADC signal, radar tensor, point cloud, grid map, and micro-Doppler signature. For each radar representation, we examine the related datasets, methods, advantages and limitations. Furthermore, we discuss the challenges faced in these data representations and propose potential research directions. Above all, this comprehensive review offers an in-depth insight into how these representations enhance autonomous system capabilities, providing guidance for radar perception researchers. To facilitate retrieval and comparison of different data representations, datasets and methods, we provide an interactive website at https://radar-camera-fusion.github.io/radar., Comment: 24 pages, 10 figures, 5 tables. arXiv admin note: text overlap with arXiv:2304.10410
Published: 2023

19. ASY-VRNet: Waterway Panoptic Driving Perception Model based on Asymmetric Fair Fusion of Vision and 4D mmWave Radar

Author: Guan, Runwei, Yao, Shanliang, Zhu, Xiaohui, Man, Ka Lok, Yue, Yong, Smith, Jeremy, Lim, Eng Gee, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: Panoptic Driving Perception (PDP) is critical for the autonomous navigation of Unmanned Surface Vehicles (USVs). A PDP model typically integrates multiple tasks, necessitating the simultaneous and robust execution of various perception tasks to facilitate downstream path planning. The fusion of visual and radar sensors is currently acknowledged as a robust and cost-effective approach. However, most existing research has primarily focused on fusing visual and radar features dedicated to object detection or utilizing a shared feature space for multiple tasks, neglecting the individual representation differences between various tasks. To address this gap, we propose a pair of Asymmetric Fair Fusion (AFF) modules with favorable explainability designed to efficiently interact with independent features from both visual and radar modalities, tailored to the specific requirements of object detection and semantic segmentation tasks. The AFF modules treat image and radar maps as irregular point sets and transform these features into a crossed-shared feature space for multitasking, ensuring equitable treatment of vision and radar point cloud features. Leveraging AFF modules, we propose a novel and efficient PDP model, ASY-VRNet, which processes image and radar features based on irregular super-pixel point sets. Additionally, we propose an effective multitask learning method specifically designed for PDP models. Compared to other lightweight models, ASY-VRNet achieves state-of-the-art performance in object detection, semantic segmentation, and drivable-area segmentation on the WaterScenes benchmark. Our project is publicly available at https://github.com/GuanRunwei/ASY-VRNet., Comment: Accepted by IROS 2024
Published: 2023

20. Radar-STDA: A High-Performance Spatial-Temporal Denoising Autoencoder for Interference Mitigation of FMCW Radars

Author: Liu, Lulu, Guan, Runwei, Ma, Fei, Smith, Jeremy, and Yue, Yutao
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: With its small size, low cost and all-weather operation, millimeter-wave radar can accurately measure the distance, azimuth and radial velocity of a target compared to other traffic sensors. However, in practice, millimeter-wave radars are plagued by various interferences, leading to a drop in target detection accuracy or even failure to detect targets. This is undesirable in autonomous vehicles and traffic surveillance, as it is likely to threaten human life and cause property damage. Therefore, interference mitigation is of great significance for millimeter-wave radar-based target detection. Currently, the development of deep learning is rapid, but existing deep learning-based interference mitigation models still have great limitations in terms of model size and inference speed. For these reasons, we propose Radar-STDA, a Radar-Spatial Temporal Denoising Autoencoder. Radar-STDA is an efficient nano-level denoising autoencoder that takes into account both spatial and temporal information of range-Doppler maps. Among other methods, it achieves a maximum SINR of 17.08 dB with only 140,000 parameters. It obtains 207.6 FPS on an RTX A4000 GPU and 56.8 FPS on an NVIDIA Jetson AGXXavier respectively when denoising range-Doppler maps for three consecutive frames. Moreover, we release a synthetic data set called Ra-inf for the task, which involves 384,769 range-Doppler maps with various clutters from objects of no interest and receiver noise in realistic scenarios. To the best of our knowledge, Ra-inf is the first synthetic dataset of radar interference. To support the community, our research is open-source via the link \url{https://github.com/GuanRunwei/rd_map_temporal_spatial_denoising_autoencoder}.
Published: 2023

21. Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar

Author: Guan, Runwei, Yao, Shanliang, Zhu, Xiaohui, Man, Ka Lok, Lim, Eng Gee, Smith, Jeremy, Yue, Yong, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: Current perception models for different tasks usually exist in modular forms on Unmanned Surface Vehicles (USVs), which infer extremely slowly in parallel on edge devices, causing the asynchrony between perception results and USV position, and leading to error decisions of autonomous navigation. Compared with Unmanned Ground Vehicles (UGVs), the robust perception of USVs develops relatively slowly. Moreover, most current multi-task perception models are huge in parameters, slow in inference and not scalable. Oriented on this, we propose Achelous, a low-cost and fast unified panoptic perception framework for water-surface perception based on the fusion of a monocular camera and 4D mmWave radar. Achelous can simultaneously perform five tasks, detection and segmentation of visual targets, drivable-area segmentation, waterline segmentation and radar point cloud segmentation. Besides, models in Achelous family, with less than around 5 million parameters, achieve about 18 FPS on an NVIDIA Jetson AGX Xavier, 11 FPS faster than HybridNets, and exceed YOLOX-Tiny and Segformer-B0 on our collected dataset about 5 mAP$_{\text{50-95}}$ and 0.7 mIoU, especially under situations of adverse weather, dark environments and camera failure. To our knowledge, Achelous is the first comprehensive panoptic perception framework combining vision-level and point-cloud-level tasks for water-surface perception. To promote the development of the intelligent transportation community, we release our codes in \url{https://github.com/GuanRunwei/Achelous}., Comment: Accepted by ITSC 2023
Published: 2023

22. WaterScenes: A Multi-Task 4D Radar-Camera Fusion Dataset and Benchmarks for Autonomous Driving on Water Surfaces

Author: Yao, Shanliang, Guan, Runwei, Wu, Zhaodong, Ni, Yi, Huang, Zile, Liu, Ryan Wen, Yue, Yong, Ding, Weiping, Lim, Eng Gee, Seo, Hyungjoon, Man, Ka Lok, Ma, Jieming, Zhu, Xiaohui, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: Autonomous driving on water surfaces plays an essential role in executing hazardous and time-consuming missions, such as maritime surveillance, survivors rescue, environmental monitoring, hydrography mapping and waste cleaning. This work presents WaterScenes, the first multi-task 4D radar-camera fusion dataset for autonomous driving on water surfaces. Equipped with a 4D radar and a monocular camera, our Unmanned Surface Vehicle (USV) proffers all-weather solutions for discerning object-related information, including color, shape, texture, range, velocity, azimuth, and elevation. Focusing on typical static and dynamic objects on water surfaces, we label the camera images and radar point clouds at pixel-level and point-level, respectively. In addition to basic perception tasks, such as object detection, instance segmentation and semantic segmentation, we also provide annotations for free-space segmentation and waterline segmentation. Leveraging the multi-task and multi-modal data, we conduct benchmark experiments on the uni-modality of radar and camera, as well as the fused modalities. Experimental results demonstrate that 4D radar-camera fusion can considerably improve the accuracy and robustness of perception on water surfaces, especially in adverse lighting and weather conditions. WaterScenes dataset is public on https://waterscenes.github.io., Comment: Accepted by IEEE Transactions on Intelligent Transportation Systems
Published: 2023

23. Drive Current Boost in Double-Channeled Nanotube Gate all Around Field Effect Transistor

Author: Qin, Laixiang, Li, Chunlai, Wei, Yiqun, Xu, Zhangwei, He, Jin, He, Yandong, and Yue, Yutao
Subjects: Physics - Applied Physics
Abstract: We demonstrate an exotic doubled-channeled NT GAAFET (DC NT GAAFET) structure with Ion boost in comparison with NT GAAFET and NW GAAFET with the same footprint. Ion gains of 64.8% and 1.7 times have been obtained in DC NT GAAFET in compared with NT GAAFET and NW GAAFET. Ioff of DC NT GAAFET degrades by 61.8% than that of NT GAAFET, SS is almost comparable in two kinds of device structures, whereas Ion/Ioff ratio in DC NT GAAFET still gains subtly, by 2.4%, than NT GAAFET thanks to the substantial Ion aggrandizement, indicating the sustained superior gate electrostatic controllability in DC NT GAAFET with regarding to NT GAAFET regardless of additional channel incorporated. On the other side, both DC NT GAAFET and NT GAAFET exhibit superior device performance than NW GAAFET in terms of high operation speed and better electrostatic controllability manifested by suppressed SCEs., Comment: 8 page, 4figures
Published: 2023

24. FindVehicle and VehicleFinder: a NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system

Author: Guan, Runwei, Man, Ka Lok, Chen, Feifan, Yao, Shanliang, Hu, Rongsheng, Zhu, Xiaohui, Smith, Jeremy, Lim, Eng Gee, and Yue, Yutao
Published: 2024
Full Text: View/download PDF

25. Self-Supervised Learning for Point Clouds Data: A Survey

Author: Zeng, Changyu, Wang, Wei, Nguyen, Anh, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: 3D point clouds are a crucial type of data collected by LiDAR sensors and widely used in transportation applications due to its concise descriptions and accurate localization. Deep neural networks (DNNs) have achieved remarkable success in processing large amount of disordered and sparse 3D point clouds, especially in various computer vision tasks, such as pedestrian detection and vehicle recognition. Among all the learning paradigms, Self-Supervised Learning (SSL), an unsupervised training paradigm that mines effective information from the data itself, is considered as an essential solution to solve the time-consuming and labor-intensive data labelling problems via smart pre-training task design. This paper provides a comprehensive survey of recent advances on SSL for point clouds. We first present an innovative taxonomy, categorizing the existing SSL methods into four broad categories based on the pretexts' characteristics. Under each category, we then further categorize the methods into more fine-grained groups and summarize the strength and limitations of the representative methods. We also compare the performance of the notable SSL methods in literature on multiple downstream tasks on benchmark datasets both quantitatively and qualitatively. Finally, we propose a number of future research directions based on the identified limitations of existing SSL research on point clouds.
Published: 2023

26. FindVehicle and VehicleFinder: A NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system

Author: Guan, Runwei, Man, Ka Lok, Chen, Feifan, Yao, Shanliang, Hu, Rongsheng, Zhu, Xiaohui, Smith, Jeremy, Lim, Eng Gee, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Natural language (NL) based vehicle retrieval is a task aiming to retrieve a vehicle that is most consistent with a given NL query from among all candidate vehicles. Because NL query can be easily obtained, such a task has a promising prospect in building an interactive intelligent traffic system (ITS). Current solutions mainly focus on extracting both text and image features and mapping them to the same latent space to compare the similarity. However, existing methods usually use dependency analysis or semantic role-labelling techniques to find keywords related to vehicle attributes. These techniques may require a lot of pre-processing and post-processing work, and also suffer from extracting the wrong keyword when the NL query is complex. To tackle these problems and simplify, we borrow the idea from named entity recognition (NER) and construct FindVehicle, a NER dataset in the traffic domain. It has 42.3k labelled NL descriptions of vehicle tracks, containing information such as the location, orientation, type and colour of the vehicle. FindVehicle also adopts both overlapping entities and fine-grained entities to meet further requirements. To verify its effectiveness, we propose a baseline NL-based vehicle retrieval model called VehicleFinder. Our experiment shows that by using text encoders pre-trained by FindVehicle, VehicleFinder achieves 87.7\% precision and 89.4\% recall when retrieving a target vehicle by text command on our homemade dataset based on UA-DETRAC. The time cost of VehicleFinder is 279.35 ms on one ARM v8.2 CPU and 93.72 ms on one RTX A4000 GPU, which is much faster than the Transformer-based system. The dataset is open-source via the link https://github.com/GuanRunwei/FindVehicle, and the implementation can be found via the link https://github.com/GuanRunwei/VehicleFinder-CTIM.
Published: 2023

27. Radar-Camera Fusion for Object Detection and Semantic Segmentation in Autonomous Driving: A Comprehensive Review

Author: Yao, Shanliang, Guan, Runwei, Huang, Xiaoyu, Li, Zhuoxiao, Sha, Xiangyu, Yue, Yong, Lim, Eng Gee, Seo, Hyungjoon, Man, Ka Lok, Zhu, Xiaohui, and Yue, Yutao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Robotics
Abstract: Driven by deep learning techniques, perception technology in autonomous driving has developed rapidly in recent years, enabling vehicles to accurately detect and interpret surrounding environment for safe and efficient navigation. To achieve accurate and robust perception capabilities, autonomous vehicles are often equipped with multiple sensors, making sensor fusion a crucial part of the perception system. Among these fused sensors, radars and cameras enable a complementary and cost-effective perception of the surrounding environment regardless of lighting and weather conditions. This review aims to provide a comprehensive guideline for radar-camera fusion, particularly concentrating on perception tasks related to object detection and semantic segmentation.Based on the principles of the radar and camera sensors, we delve into the data processing process and representations, followed by an in-depth analysis and summary of radar-camera fusion datasets. In the review of methodologies in radar-camera fusion, we address interrogative questions, including "why to fuse", "what to fuse", "where to fuse", "when to fuse", and "how to fuse", subsequently discussing various challenges and potential research directions within this domain. To ease the retrieval and comparison of datasets and fusion methods, we also provide an interactive website: https://radar-camera-fusion.github.io., Comment: Accepted by IEEE Transactions on Intelligent Vehicles (T-IV)
Published: 2023
Full Text: View/download PDF

28. Quantify the Causes of Causal Emergence: Critical Conditions of Uncertainty and Asymmetry in Causal Structure

Author: Jia, Liye, Yang, Fengyufan, Man, Ka Lok, Purwanto, Erick, Guan, Sheng-Uei, Smith, Jeremy, and Yue, Yutao
Subjects: Computer Science - Information Theory, Computer Science - Artificial Intelligence, Physics - Computational Physics, H.1.0, J.2, I.2.m
Abstract: Beneficial to advanced computing devices, models with massive parameters are increasingly employed to extract more information to enhance the precision in describing and predicting the patterns of objective systems. This phenomenon is particularly pronounced in research domains associated with deep learning. However, investigations of causal relationships based on statistical and informational theories have posed an interesting and valuable challenge to large-scale models in the recent decade. Macroscopic models with fewer parameters can outperform their microscopic counterparts with more parameters in effectively representing the system. This valuable situation is called "Causal Emergence." This paper introduces a quantification framework, according to the Effective Information and Transition Probability Matrix, for assessing numerical conditions of Causal Emergence as theoretical constraints of its occurrence. Specifically, our results quantitatively prove the cause of Causal Emergence. By a particular coarse-graining strategy, optimizing uncertainty and asymmetry within the model's causal structure is significantly more influential than losing maximum information due to variations in model scales. Moreover, by delving into the potential exhibited by Partial Information Decomposition and Deep Learning networks in the study of Causal Emergence, we discuss potential application scenarios where our quantification framework could play a role in future investigations of Causal Emergence., Comment: 18 pages, 14 figures
Published: 2022

29. A World-Self Model Towards Understanding Intelligence

Author: Yue, Yutao
Subjects: Computer Science - Artificial Intelligence, Computer Science - Information Theory, I.2.0, G.0, H.1.1
Abstract: The symbolism, connectionism and behaviorism approaches of artificial intelligence have achieved a lot of successes in various tasks, while we still do not have a clear definition of "intelligence" with enough consensus in the community (although there are over 70 different "versions" of definitions). The nature of intelligence is still in darkness. In this work we do not take any of these three traditional approaches, instead we try to identify certain fundamental aspects of the nature of intelligence, and construct a mathematical model to represent and potentially reproduce these fundamental aspects. We first stress the importance of defining the scope of discussion and granularity of investigation. We carefully compare human and artificial intelligence, and qualitatively demonstrate an information abstraction process, which we propose to be the key to connect perception and cognition. We then present the broader idea of "concept", separate the idea of self model out of the world model, and construct a new model called world-self model (WSM). We show the mechanisms of creating and connecting concepts, and the flow of how the WSM receives, processes and outputs information with respect to an arbitrary type of problem to solve. We also consider and discuss the potential computer implementation issues of the proposed theoretical framework, and finally we propose a unified general framework of intelligence based on WSM., Comment: 21 pages, 5 figures
Published: 2022
Full Text: View/download PDF

30. Double channeled nanotube gate all around field effect transistor with drive current boosted

Author: Qin, Laixiang, Tian, He, Li, Chunlai, Wei, Yiqun, He, Jin, He, Yandong, Ren, Tianling, Xu, Zhangwei, and Yue, Yutao
Published: 2024
Full Text: View/download PDF

31. Deep neural network training method based on vectorgraphs for designing of metamaterial broadband polarization converters

Author: Gao, Jiale, Feng, Chunjie, Wu, Xingyi, Wu, Yanghui, Zhu, Xiaobo, Sun, Daying, Yue, Yutao, and Gu, Wenhua
Published: 2023
Full Text: View/download PDF

32. Self-supervised learning for point cloud data: A survey

Author: Zeng, Changyu, Wang, Wei, Nguyen, Anh, Xiao, Jimin, and Yue, Yutao
Published: 2024
Full Text: View/download PDF

33. Mask-VRDet: A robust riverway panoptic perception model based on dual graph fusion of vision and 4D mmWave radar

Author: Guan, Runwei, Yao, Shanliang, Liu, Lulu, Zhu, Xiaohui, Man, Ka Lok, Yue, Yong, Smith, Jeremy, Lim, Eng Gee, and Yue, Yutao
Published: 2024
Full Text: View/download PDF

34. MAN and CAT: mix attention to nn and concatenate attention to YOLO

Author: Guan, Runwei, Man, Ka Lok, Zhao, Haocheng, Zhang, Ruixiao, Yao, Shanliang, Smith, Jeremy, Lim, Eng Gee, and Yue, Yutao
Published: 2023
Full Text: View/download PDF

35. Multi-object detection at night for traffic investigations based on improved SSD framework

Author: Zhang, Qiang, Hu, Xiaojian, Yue, Yutao, Gu, Yanbiao, and Sun, Yizhou
Published: 2022
Full Text: View/download PDF

36. TPC Together with Overlapped Time Domain Multiplexing System Based on Turbo Structure

Author: Zheng, Hao, Xing, Mingjun, Yue, Yutao, Li, Xue, Li, Daoben, and Ji, Chunlin
Subjects: Computer Science - Information Theory
Abstract: Overlapped time domain multiplexing (OvTDM) is a novel technique for utilizing inter-symbol interference (ISI) to benefit a communication system. We implement the OvTDM technique based on turbo structure and associate a turbo product code (TPC) to construct a novel coded turbo-structure OvTDM system. Two schemes of the iterative receiver and soft input and soft output (SISO) decoding algorithms are presented. Simulation results show the advantage of structures in this paper. In addition, an attractive transmission rate and symbol efficiency of the designed system can also be observed.
Published: 2017

37. Steep Slope Field Effect Transistors Based on 2D Materials

Author: Qin, Laixiang, primary, Tian, He, additional, Li, Chunlai, additional, Xie, Ziang, additional, Wei, Yiqun, additional, Li, Yi, additional, He, Jin, additional, Yue, Yutao, additional, and Ren, Tian‐Ling, additional
Published: 2024
Full Text: View/download PDF

38. Radar-Based Swimming Activity Recognition with Temporal Dynamic Convolution and Spectral Data Augmentation

Author: Zhou, Yi, primary, Yu, Xuliang, additional, Lopez-Benitez, Miguel, additional, Yu, Limin, additional, and Yue, Yutao, additional
Published: 2024
Full Text: View/download PDF

39. Text2Doppler: Generating Radar Micro-Doppler Signatures for Human Activity Recognition via Textual Descriptions

Author: Zhou, Yi, primary, López-Benítez, Miguel, additional, Yu, Limin, additional, and Yue, Yutao, additional
Published: 2024
Full Text: View/download PDF

40. WaterScenes: A Multi-Task 4D Radar-Camera Fusion Dataset and Benchmarks for Autonomous Driving on Water Surfaces

Author: Yao, Shanliang, Guan, Runwei, Wu, Zhaodong, Ni, Yi, Huang, Zile, Liu, Ryan Wen, Yue, Yong, Ding, Weiping, Lim, Eng Gee, Seo, Hyungjoon, Man, Ka Lok, Ma, Jieming, Zhu, Xiaohui, Yue, Yutao, Yao, Shanliang, Guan, Runwei, Wu, Zhaodong, Ni, Yi, Huang, Zile, Liu, Ryan Wen, Yue, Yong, Ding, Weiping, Lim, Eng Gee, Seo, Hyungjoon, Man, Ka Lok, Ma, Jieming, Zhu, Xiaohui, and Yue, Yutao
Abstract: Autonomous driving on water surfaces plays an essential role in executing hazardous and time-consuming missions, such as maritime surveillance, survivor rescue, environmental monitoring, hydrography mapping and waste cleaning. This work presents WaterScenes, the first multi-task 4D radar-camera fusion dataset for autonomous driving on water surfaces. Equipped with a 4D radar and a monocular camera, our Unmanned Surface Vehicle (USV) proffers all-weather solutions for discerning object-related information, including color, shape, texture, range, velocity, azimuth, and elevation. Focusing on typical static and dynamic objects on water surfaces, we label the camera images and radar point clouds at pixel-level and point-level, respectively. In addition to basic perception tasks, such as object detection, instance segmentation and semantic segmentation, we also provide annotations for free-space segmentation and waterline segmentation. Leveraging the multi-task and multi-modal data, we conduct benchmark experiments on the uni-modality of radar and camera, as well as the fused modalities. Experimental results demonstrate that 4D radar-camera fusion can considerably improve the accuracy and robustness of perception on water surfaces, especially in adverse lighting and weather conditions. WaterScenes dataset is public on https://waterscenes.github.io.
Published: 2024

41. Steep Slope Field Effect Transistors Based on 2D Materials

Author: Qin, Laixiang, Tian, He, Li, Chunlai, Xie, Ziang, Wei, Yiqun, Li, Yi, He, Jin, Yue, Yutao, Ren, Tian-Ling, Qin, Laixiang, Tian, He, Li, Chunlai, Xie, Ziang, Wei, Yiqun, Li, Yi, He, Jin, Yue, Yutao, and Ren, Tian-Ling
Abstract: With field effect transistor (FET) sustained to downscale to sub-10 nm nodes, performance degradation originates from short channel effects (SCEs) degradation and power consumption increment attributed to inhibition of supply voltage (VDD) scaling down proportionally caused by thermionic limit subthreshold swing (SS) (60 mV dec−1) pose substantial challenges for today's semiconductor industry. To further sustain the Moore's law life, incorporation of new device concepts or new materials are imperative. 2D materials are predicted to be able to combat SCEs by virtue of high carrier mobility maintainability regardless of thickness thinning down, dangling bonds free surface and atomic thickness, which contributes to super gate electrostatic controllability. To overcome increasing power dissipation problem, new device structures including negative capacitance FET (NCFET), tunnel FET (TFET), dirac source FET (DSFET) and the like, which show superiority in decreasing VDD by lowering SS below thermionic limit of 60 mV dec−1 have been brought out. The combination of 2D materials and ultralow steep slope device structures holds great promise for low power-dissipation electronics, which encompass both suppressed SCEs and reduced VDD simultaneously, leading to improved device performance and lowered power dissipation. © 2024 The Authors. Advanced Electronic Materials published by Wiley-VCH GmbH.
Published: 2024

42. Structural, magnetic, and electrical properties of LaMnO3 doped with Na by microwave-assisted synthesis method.

Author: Chen, Guohao, Jiang, Luwen, Li, Chunlai, Wei, Yiqun, Hu, Guoqing, Yue, Yutao, Wang, Xiaomeng, and He, Jin
Published: 2024
Full Text: View/download PDF

43. Corruption Robustness Analysis of Radar Micro-Doppler Classification for Human Activity Recognition

Author: Zhou, Yi, primary, Yu, Xuliang, additional, Lopez-Benitez, Miguel, additional, Yu, Limin, additional, and Yue, Yutao, additional
Published: 2024
Full Text: View/download PDF

44. Radar-Stda: A High-Performance Spatial-Temporal Denoising Autoencoder for Interference Mitigation of Fmcw Radars

Author: Liu, Lulu, primary, Guan, Runwei, additional, Ma, Fei, additional, Man, Ka Lok, additional, and Yue, Yutao, additional
Published: 2024
Full Text: View/download PDF

45. PLSR: Unstructured Pruning with Layer-Wise Sparsity Ratio

Author: Zhao, Haocheng, primary, Yu, Limin, additional, Guan, Runwei, additional, Jia, Liye, additional, Zhang, Junqing, additional, and Yue, Yutao, additional
Published: 2023
Full Text: View/download PDF

46. Radar-Based Swimming Activity Recognition with Temporal Dynamic Convolution and Spectral Data Augmentation

Author: Zhou, Yi, primary, Yu, Xuliang, primary, Lopez-Benitez, Miguel, primary, Yu, Limin, primary, and Yue, Yutao, primary
Published: 2023
Full Text: View/download PDF

47. Robustness Analysis of Radar Micro-Doppler Classification under Corruptions

Author: Zhou, Yi, primary, Yu, Xuliang, primary, Lopez-Benitez, Miguel, primary, Yu, Limin, primary, and Yue, Yutao, primary
Published: 2023
Full Text: View/download PDF

48. Review of: "Artificial Intelligence and Organizational Change"

Author: Yue, Yutao, primary
Published: 2023
Full Text: View/download PDF

49. FindVehicle and VehicleFinder: a NER dataset for natural language-based vehicle retrieval and a keyword-based cross-modal vehicle retrieval system

Author: Guan, Runwei, primary, Man, Ka Lok, additional, Chen, Feifan, additional, Yao, Shanliang, additional, Hu, Rongsheng, additional, Zhu, Xiaohui, additional, Smith, Jeremy, additional, Lim, Eng Gee, additional, and Yue, Yutao, additional
Published: 2023
Full Text: View/download PDF

50. Clutter Detection in Automotive Radar Point Clouds Based on Deep Learning with Self-attention

Author: Liu, Lulu, primary, Guan, Runwei, additional, Zhao, Haocheng, additional, Ma, Fei, additional, and Yue, Yutao, additional
Published: 2023
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

133 results on '"Yue, Yutao"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources