Author: "Liu, Zhaoxiang" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Liu, Zhaoxiang"' showing total 378 results

Start Over Author "Liu, Zhaoxiang"

378 results on '"Liu, Zhaoxiang"'

1. GlitchMiner: Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization

Author: Wu, Zihui, Gao, Haichang, Wang, Ping, Zhang, Shudong, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Artificial Intelligence
Abstract: Glitch tokens in Large Language Models (LLMs) can trigger unpredictable behaviors, threatening model reliability and safety. Existing detection methods rely on predefined patterns, limiting their adaptability across diverse LLM architectures. We propose GlitchMiner, a gradient-based discrete optimization framework that efficiently identifies glitch tokens by introducing entropy as a measure of prediction uncertainty and employing a local search strategy to explore the token space. Experiments across multiple LLM architectures demonstrate that GlitchMiner outperforms existing methods in detection accuracy and adaptability, achieving over 10% average efficiency improvement. This method enhances vulnerability assessment in LLMs, contributing to the development of more robust and reliable applications. Code is available at https://github.com/wooozihui/GlitchMiner.
Published: 2024

2. Piculet: Specialized Models-Guided Hallucination Decrease for MultiModal Large Language Models

Author: Wang, Kohou, Liu, Xiang, Liu, Zhaoxiang, Wang, Kai, and Lian, Shiguo
Subjects: Computer Science - Artificial Intelligence
Abstract: Multimodal Large Language Models (MLLMs) have made significant progress in bridging the gap between visual and language modalities. However, hallucinations in MLLMs, where the generated text does not align with image content, continue to be a major challenge. Existing methods for addressing hallucinations often rely on instruction-tuning, which requires retraining the model with specific data, which increases the cost of utilizing MLLMs further. In this paper, we introduce a novel training-free method, named Piculet, for enhancing the input representation of MLLMs. Piculet leverages multiple specialized models to extract descriptions of visual information from the input image and combine these descriptions with the original image and query as input to the MLLM. We evaluate our method both quantitively and qualitatively, and the results demonstrate that Piculet greatly decreases hallucinations of MLLMs. Our method can be easily extended to different MLLMs while being universal., Comment: 14 pages, 5 figures
Published: 2024

3. Methodology of Adapting Large English Language Models for Specific Cultural Contexts

Author: Zhang, Wenjing, Xiao, Siqi, Lei, Xuejiao, Wang, Ning, Zhang, Huazheng, An, Meijuan, Yang, Bikun, Liu, Zhaoxiang, Wang, Kai, and Lian, Shiguo
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The rapid growth of large language models(LLMs) has emerged as a prominent trend in the field of artificial intelligence. However, current state-of-the-art LLMs are predominantly based on English. They encounter limitations when directly applied to tasks in specific cultural domains, due to deficiencies in domain-specific knowledge and misunderstandings caused by differences in cultural values. To address this challenge, our paper proposes a rapid adaptation method for large models in specific cultural contexts, which leverages instruction-tuning based on specific cultural knowledge and safety values data. Taking Chinese as the specific cultural context and utilizing the LLaMA3-8B as the experimental English LLM, the evaluation results demonstrate that the adapted LLM significantly enhances its capabilities in domain-specific knowledge and adaptability to safety values, while maintaining its original expertise advantages., Comment: 11 pages, 2 figures
Published: 2024

4. CHiSafetyBench: A Chinese Hierarchical Safety Benchmark for Large Language Models

Author: Zhang, Wenjing, Lei, Xuejiao, Liu, Zhaoxiang, An, Meijuan, Yang, Bikun, Zhao, KaiKai, Wang, Kai, and Lian, Shiguo
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: With the profound development of large language models(LLMs), their safety concerns have garnered increasing attention. However, there is a scarcity of Chinese safety benchmarks for LLMs, and the existing safety taxonomies are inadequate, lacking comprehensive safety detection capabilities in authentic Chinese scenarios. In this work, we introduce CHiSafetyBench, a dedicated safety benchmark for evaluating LLMs' capabilities in identifying risky content and refusing answering risky questions in Chinese contexts. CHiSafetyBench incorporates a dataset that covers a hierarchical Chinese safety taxonomy consisting of 5 risk areas and 31 categories. This dataset comprises two types of tasks: multiple-choice questions and question-answering, evaluating LLMs from the perspectives of risk content identification and the ability to refuse answering risky questions respectively. Utilizing this benchmark, we validate the feasibility of automatic evaluation as a substitute for human evaluation and conduct comprehensive automatic safety assessments on mainstream Chinese LLMs. Our experiments reveal the varying performance of different models across various safety domains, indicating that all models possess considerable potential for improvement in Chinese safety capabilities. Our dataset is publicly available at https://github.com/UnicomAI/UnicomBenchmark/tree/main/CHiSafetyBench., Comment: 16 pages, 5 figures
Published: 2024

5. What is the best model? Application-driven Evaluation for Large Language Models

Author: Lian, Shiguo, Zhao, Kaikai, Liu, Xinhui, Lei, Xuejiao, Yang, Bikun, Zhang, Wenjing, Wang, Kai, and Liu, Zhaoxiang
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: General large language models enhanced with supervised fine-tuning and reinforcement learning from human feedback are increasingly popular in academia and industry as they generalize foundation models to various practical tasks in a prompt manner. To assist users in selecting the best model in practical application scenarios, i.e., choosing the model that meets the application requirements while minimizing cost, we introduce A-Eval, an application-driven LLMs evaluation benchmark for general large language models. First, we categorize evaluation tasks into five main categories and 27 sub-categories from a practical application perspective. Next, we construct a dataset comprising 678 question-and-answer pairs through a process of collecting, annotating, and reviewing. Then, we design an objective and effective evaluation method and evaluate a series of LLMs of different scales on A-Eval. Finally, we reveal interesting laws regarding model scale and task difficulty level and propose a feasible method for selecting the best model. Through A-Eval, we provide clear empirical and engineer guidance for selecting the best model, reducing barriers to selecting and using LLMs and promoting their application and development. Our benchmark is publicly available at https://github.com/UnicomAI/DataSet/tree/main/TestData/GeneralAbility.
Published: 2024

6. Compact low-half-wave-voltage thin film lithium niobate electro-optic phase modulator fabricated by photolithography assisted chemo-mechanical etching

Author: Gao, Lang, Liang, Youting, Chen, Jinming, Yu, Jianping, Qi, Jia, Song, Lvbin, Liu, Jian, Liu, Zhaoxiang, Qi, Hongxin, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: This paper presents a compact dual-arm thin film lithium niobate (TFLN) electro-optic phase modulator fabricated using the photolithography-assisted chemo-mechanical etching (PLACE) technique. The design of the device allows for complete utilization of the microwave electric field, doubling the modulation efficiency compared to single-arm modulators in theory. With a half-wave voltage of approximately 3 V and a modulation length of 1 cm, the device outperforms conventional phase modulators. Furthermore, the phase modulator exhibits low sensitivity to optical wavelengths in the range of 1510-1600 nm and offers a low insertion loss of 2.8 dB. The capability to generate multiple sideband signals for optical frequency comb applications is also demonstrated, producing 29 sideband signals at an input microwave power of 2 W.
Published: 2024

7. An erbium-doped waveguide amplifier on thin film lithium niobate with an output power exceeding 100 mW

Author: Bao, Rui, Fang, Zhiwei, Liu, Jian, Liu, Zhaoxiang, Chen, Jinming, Wang, Min, Wu, Rongbo, Zhang, Haisu, and Cheng, Ya
Subjects: Physics - Optics
Abstract: We demonstrate high-power thin film lithium niobate (TFLN) erbium-doped waveguide amplifier (EDWA) with a maximum on-chip output power of 113 mW and a gain of 16 dB. The on-chip integrated EDWA is composed of large mode area (LMA) waveguide structures with a total length of 7 cm and a footprint of 1x1 cm2. Particularly, we connect segmented LMA waveguides with waveguide tapers to achieve on-chip mode conversion which maintains single-mode propagation all over the EDWA even at the waveguide bends. The design leads to significant increase of the amplified signal power by orders of magnitude and will open an avenue for applications such as on-chip high-power lasers and amplifiers system., Comment: 13 pages, 4 figures
Published: 2024

8. TP3M: Transformer-based Pseudo 3D Image Matching with Reference Image

Author: Han, Liming, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Image matching is still challenging in such scenes with large viewpoints or illumination changes or with low textures. In this paper, we propose a Transformer-based pseudo 3D image matching method. It upgrades the 2D features extracted from the source image to 3D features with the help of a reference image and matches to the 2D features extracted from the destination image by the coarse-to-fine 3D matching. Our key discovery is that by introducing the reference image, the source image's fine points are screened and furtherly their feature descriptors are enriched from 2D to 3D, which improves the match performance with the destination image. Experimental results on multiple datasets show that the proposed method achieves the state-of-the-art on the tasks of homography estimation, pose estimation and visual localization especially in challenging scenes., Comment: Accepted by ICRA 2024
Published: 2024
Full Text: View/download PDF

9. On-chip wavelength division multiplexing by angled multimode interferometer fabricated on erbium-doped thin film lithium niobate on insulator

Author: Han, Jinli, Bao, Rui, Wu, Rongbo, Liu, Zhaoxiang, Wang, Zhe, Sun, Chao, Zhang, Zhihao, Li, Mengqi, Fang, Zhiwei, Wang, Min, Zhang, Haisu, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: Photonic integrated circuits based on erbium doped thin film lithium niobate on insulator has attracted broad interests with insofar various waveguide amplifiers and microlasers demonstrated. Wideband operation facilitated by the broadband absorption and emission of erbium ions necessitates the functional integration of wavelength filter and multiplexer on the same chip. Here a low-loss wavelength division multiplexer at the resonant pumping and emission wavelengths (~1480 nm and 1530~1560 nm) of erbium ions based on angled multimode interferometer, is realized in the erbium doped thin film lithium niobate on insulator fabricated by the photolithography assisted chemomechanical etching technique. The minimum on-chip insertion losses of the fabricated device are <0.7 dB for both wavelength ranges, and a 3-dB bandwidth of >20 nm is measured at the telecom C-band. Besides, direct visualization of the multimode interference pattern by the visible upconversion fluorescence of erbium ions compares well with the simulated light propagation in the multimode interferometer. Spectral tuning of the wavelength division multiplexer by structural design is also demonstrated and discussed., Comment: 11 pages, 5 figures
Published: 2024

10. A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis

Author: Liu, Xiang, Liu, Zhaoxiang, Hu, Huan, Chen, Zezhou, Wang, Kohou, Wang, Kai, Lian, Shiguo, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
Published: 2025
Full Text: View/download PDF

11. Thin Film Lithium Niobate Electro-optic Isolator Fabricated by photolithography assisted chemo-mechanical etching (PLACE)

Author: Gao, Lang, Liang, Youting, Song, Lvbin, Yin, Difeng, Qi, Jia, Chen, Jinming, Liu, Zhaoxiang, Yu, Jianping, Liu, Jian, Zhang, Haisu, Fang, Zhiwei, Qi, Hongxin, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: We report a thin-film lithium niobate electro-optic isolator fabricated by photolithography-assisted chemo-mechanical etching in this work. The device demonstrates 39.50 dB isolation when subjected to a 24 GHz microwave of 25.5 dBm on its electrodes. The measured isolation remains consistently above 30 dB within the 1510 nm to 1600 nm wavelength range. The overall device insertion loss, specifically the fiber-to-fiber insert loss, has been measured to be 2.6 dB, which is attributed to our highly efficient spot size converter and the low propagation loss observed in the fabricated waveguides.
Published: 2023

12. On-chip coherent beam combination of waveguide amplifiers on Er$^{3+}$-doped thin film lithium niobate

Author: Bao, Rui, Song, Lvbin, Fang, Zhiwei, Chen, Jinmin, Wang, Zhe, Liu, Jian, Gao, Lang, Liu, Zhaoxiang, Zhang, Zhihao, Wang, Min, Zhang, Haisu, and Cheng, Ya
Subjects: Physics - Optics
Abstract: We demonstrate on-chip coherent beam combination of two waveguide amplifiers on Er$^{3+}$-doped thin film lithium niobate (Er: TFLN) platform. Our device is built based on an electro-optic modulator fabricated on Er: TFLN. The output power of the coherently combined amplifiers is measured as high as 12.9 mW, surpassing that of previous single waveguide amplifiers based on Er$^{3+}$-doped thin film lithium niobate platform.
Published: 2023

13. Patch-wise Auto-Encoder for Visual Anomaly Detection

Author: Cui, Yajie, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Anomaly detection without priors of the anomalies is challenging. In the field of unsupervised anomaly detection, traditional auto-encoder (AE) tends to fail based on the assumption that by training only on normal images, the model will not be able to reconstruct abnormal images correctly. On the contrary, we propose a novel patch-wise auto-encoder (Patch AE) framework, which aims at enhancing the reconstruction ability of AE to anomalies instead of weakening it. Each patch of image is reconstructed by corresponding spatially distributed feature vector of the learned feature representation, i.e., patch-wise reconstruction, which ensures anomaly-sensitivity of AE. Our method is simple and efficient. It advances the state-of-the-art performances on Mvtec AD benchmark, which proves the effectiveness of our model. It shows great potential in practical industrial application scenarios.
Published: 2023
Full Text: View/download PDF

14. Semi-supervised Object Detection: A Survey on Recent Research and Progress

Author: Wang, Yanyang, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: In recent years, deep learning technology has been maturely applied in the field of object detection, and most algorithms tend to be supervised learning. However, a large amount of labeled data requires high costs of human resources, which brings about low efficiency and limitations. Semi-supervised object detection (SSOD) has been paid more and more attentions due to its high research value and practicability. It is designed to learn information by using small amounts of labeled data and large amounts of unlabeled data. In this paper, we present a comprehensive and up-to-date survey on the SSOD approaches from five aspects. We first briefly introduce several ways of data augmentation. Then, we dive the mainstream semi-supervised strategies into pseudo labels, consistent regularization, graph based and transfer learning based methods, and introduce some methods in challenging settings. We further present widely-used loss functions, and then we outline the common benchmark datasets and compare the accuracy among different representative approaches. Finally, we conclude this paper and present some promising research directions for the future. Our survey aims to provide researchers and practitioners new to the field as well as more advanced readers with a solid understanding of the main approaches developed over the past few years., Comment: 10 pages, 20 figures, 2 tables
Published: 2023

15. On-chip arrayed waveguide grating fabricated on thin film lithium niobate

Author: Wang, Zhe, Fang, Zhiwei, Liu, Zhaoxiang, Liang, Youting, Liu, Jian, Yu, Jianping, Huang, Ting, Zhou, Yuan, Zhang, Haisu, Wang, Min, and Cheng, Ya
Subjects: Physics - Optics
Abstract: We design an on-chip 8-channel TFLN AWG and fabricate the device using photolithography assisted chemo-mechanical etching (PLACE) technique. We experimentally measure the transmission of the fabricated TFLN AWG near the central wavelength of 1550 nm. We obtain an on-chip loss as low as 3.32 dB, a single-channel bandwidth of 1.6 nm and a total-channel bandwidth of 12.8 nm. The crosstalk between adjacent channels was measured to be below -7.01 dB within the wavelength range from 1543 nm to 1558 nm, and the crosstalk between non-adjacent channels was below -15 dB.
Published: 2023

16. Electro-optically programmable photonic circuits enabled by wafer-scale integration on thin-film lithium niobate

Author: Zheng, Yong, Zhong, Haozong, Zhang, Haisu, Wu, Rongbo, Liu, Jian, Liang, Youting, Song, Lvbin, Liu, Zhaoxiang, Chen, Jinming, Zhou, Junxia, Fang, Zhiwei, Wang, Min, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: Programmable photonic circuits performing universal linear-optical transformations underpin vital functions in photonic quantum information processing, quantum-enhanced sensor networks, machine learning and many other intriguing applications. Recent advances in photonic integrated circuits facilitate monolithic integration of externally controlled Mach-Zehnder interferometers which can implement arbitrary unitary transformation on a large number of input/output modes. In this work, we demonstrate a 4x4 programmable linear photonic circuit on lithium niobate on insulator platform employing fast, power-efficient and low-loss electro-optical phase shifters, showing enormous advantages in terms of configuration rate and power consumption. Our device is capable of fast switching with 500 ps rise time and 1.7 ns fall time, and possesses a total on-chip power dissipation of only 0.015 mW when operated at 1 MHz modulation, and an insertion loss of 0.15 dB for each modulator and an on-chip extinction ratio of -34 dB for both cross and bar routes.
Published: 2023

17. Ultra-high-speed high-resolution laser lithography for lithium niobate integrated photonics

Author: Chen, Jinming, Liu, Zhaoxiang, Song, Lvbin, Sun, Chao, Wang, Guanhua, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: Photolithography assisted chemo-mechanical etching (PLACE), a technique specifically developed for fabricating highquality large-scale photonic integrated circuits (PICs) on thin-film lithium niobate (TFLN), has enabled fabrication of a series of building blocks of PICs ranging from high-quality (high-Q) microresonators and low-loss waveguides to electrooptically (EO) tunable lasers and waveguide amplifiers. Aiming at high-throughput manufacturing of the PIC devices and systems, we have developed an ultra-high-speed high-resolution laser lithography fabrication system employing a high repetition rate femtosecond laser and a high-speed polygon laser scanner, by which a lithography fabrication efficiency of 4.8 cm2/h has been achieved at a spatial resolution of 200 nm. We demonstrate wafer-scale fabrication of TFLN-based photonic structures, optical phase masks as well as color printing
Published: 2023

18. Query Expansion and Verification with Large Language Model for Information Retrieval

Author: Zhang, Wenjing, Liu, Zhaoxiang, Wang, Kai, Lian, Shiguo, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Si, Zhanjun, editor, and Zhang, Chuanlei, editor
Published: 2024
Full Text: View/download PDF

19. Self-supervised Visual Anomaly Detection with Image Patch Generation and Comparison Networks

Author: Huang, Jianfeng, Zhao, Kaikai, Li, Chenyang, Lin, Yimin, Liu, Zhaoxiang, Wang, Kai, Lian, Shiguo, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Zhang, Chuanlei, editor, and Guo, Jiayang, editor
Published: 2024
Full Text: View/download PDF

20. Spatial-Temporal Transformer Network for Continuous Action Recognition in Industrial Assembly

Author: Huang, Jianfeng, Liu, Xiang, Hu, Huan, Tang, Shanghua, Li, Chenyang, Zhao, Shaoan, Lin, Yimin, Wang, Kai, Liu, Zhaoxiang, Lian, Shiguo, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Zhang, Chuanlei, editor, and Guo, Jiayang, editor
Published: 2024
Full Text: View/download PDF

21. Design and Data Analysis of Muskmelon Sugar Content Detection System

Author: Chen, Lintao, Liu, Zhaoxiang, Mou, Xiangwei, Lan, Ying, Yu, Xinye, Ceccarelli, Marco, Series Editor, Corves, Burkhard, Advisory Editor, Glazunov, Victor, Advisory Editor, Hernández, Alfonso, Advisory Editor, Huang, Tian, Advisory Editor, Jauregui Correa, Juan Carlos, Advisory Editor, Takeda, Yukio, Advisory Editor, Agrawal, Sunil K., Advisory Editor, Tan, Jianrong, editor, Liu, Yu, editor, Huang, Hong-Zhong, editor, Yu, Jingjun, editor, and Wang, Zequn, editor
Published: 2024
Full Text: View/download PDF

22. Application-Driven AI Paradigm for Hand-Held Action Detection

Author: Wang, Kohou, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: In practical applications especially with safety requirement, some hand-held actions need to be monitored closely, including smoking cigarettes, dialing, eating, etc. Taking smoking cigarettes as example, existing smoke detection algorithms usually detect the cigarette or cigarette with hand as the target object only, which leads to low accuracy. In this paper, we propose an application-driven AI paradigm for hand-held action detection based on hierarchical object detection. It is a coarse-to-fine hierarchical detection framework composed of two modules. The first one is a coarse detection module with the human pose consisting of the whole hand, cigarette and head as target object. The followed second one is a fine detection module with the fingers holding cigarette, mouth area and the whole cigarette as target. Some experiments are done with the dataset collected from real-world scenarios, and the results show that the proposed framework achieve higher detection rate with good adaptation and robustness in complex environments.
Published: 2022

23. Application-Driven AI Paradigm for Human Action Recognition

Author: Chen, Zezhou, Cui, Yajie, Zhao, Kaikai, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Human action recognition in computer vision has been widely studied in recent years. However, most algorithms consider only certain action specially with even high computational cost. That is not suitable for practical applications with multiple actions to be identified with low computational cost. To meet various application scenarios, this paper presents a unified human action recognition framework composed of two modules, i.e., multi-form human detection and corresponding action classification. Among them, an open-source dataset is constructed to train a multi-form human detection model that distinguishes a human being's whole body, upper body or part body, and the followed action classification model is adopted to recognize such action as falling, sleeping or on-duty, etc. Some experimental results show that the unified framework is effective for various application scenarios. It is expected to be a new application-driven AI paradigm for human action recognition.
Published: 2022

24. Data-Centric AI Paradigm Based on Application-Driven Fine-Grained Dataset Design

Author: Hu, Huan, Cui, Yajie, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Deep learning has a wide range of applications in industrial scenario, but reducing false alarm (FA) remains a major difficulty. Optimizing network architecture or network parameters is used to tackle this challenge in academic circles, while ignoring the essential characteristics of data in application scenarios, which often results in increased FA in new scenarios. In this paper, we propose a novel paradigm for fine-grained design of datasets, driven by industrial applications. We flexibly select positive and negative sample sets according to the essential features of the data and application requirements, and add the remaining samples to the training set as uncertainty classes. We collect more than 10,000 mask-wearing recognition samples covering various application scenarios as our experimental data. Compared with the traditional data design methods, our method achieves better results and effectively reduces FA. We make all contributions available to the research community for broader use. The contributions will be available at https://github.com/huh30/OpenDatasets.
Published: 2022

25. Monolithically integrated active passive waveguide array fabricated on thin film lithium niobate using a single continuous photolithography process

Author: Zhou, Yuan, Zhu, Yiran, Fang, Zhiwei, Yu, Shupeng, Huang, Ting, Zhou, Junxia, Wu, Rongbo, Liu, Jian, Ma, Yu, Wang, Zhe, Yu, Jianping, Liu, Zhaoxiang, Zhang, Haisu, Wang, Zhenhua, Wang, Min, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: We demonstrate a robust low-loss optical interface by tiling passive (i.e., without doping of active ions) thin film lithium niobate (TFLN) and active (i.e., doped with rare earth ions) TFLN substrates for monolithic integration of passive/active lithium niobate photonics. The tiled substrates composed of both active and passive areas allow to pattern the mask of the integrated active passive photonic device at once using a single continuous photolithography process. The interface loss of tiled substrate is measured as low as 0.26 dB. Thanks to the stability provided by this approach, a four-channel waveguide amplifier is realized in a straightforward manner, which shows a net gain of ~5 dB at 1550-nm wavelength and that of ~8 dB at 1530-nm wavelength for each channel. The robust low-loss optical interface for passive/active photonic integration will facilitate large-scale high performance photonic devices which require on-chip light sources and amplifiers., Comment: 13 pages, 5 figures
Published: 2022

26. On-chip wavelength division multiplexing by angled multimode interferometer fabricated on erbium-doped thin film lithium niobate on insulator

Author: Han Jinli, Bao Rui, Wu Rongbo, Liu Zhaoxiang, Wang Zhe, Sun Chao, Zhang Zhihao, Li Mengqi, Fang Zhiwei, Wang Min, Zhang Haisu, and Cheng Ya
Subjects: integrated optics, erbium-doped lithium niobate on insulator, wavelength division multiplexing, Physics, QC1-999
Abstract: Photonic-integrated circuits based on erbium-doped thin film lithium niobate on insulator has attracted broad interests with insofar various waveguide amplifiers and microlasers demonstrated. Wideband operation facilitated by the broadband absorption and emission of erbium ions necessitates the functional integration of wavelength filter and multiplexer on the same chip. Here, a low-loss wavelength division multiplexer at the resonant pumping and emission wavelengths (∼1480 nm and 1530–1560 nm) of erbium ions based on angled multimode interferometer is realized in the erbium-doped thin film lithium niobate on insulator fabricated by the photolithography assisted chemomechanical etching technique. The minimum on-chip insertion losses of the fabricated device are 20 nm is measured at the telecom C-band. Besides, direct visualization of the multimode interference pattern by the visible upconversion fluorescence of erbium ions compares well with the simulated light propagation in the multimode interferometer. Spectral tuning of the wavelength division multiplexer by structural design is also demonstrated and discussed.
Published: 2024
Full Text: View/download PDF

27. A Waste Copper Granules Rating System Based on Machine Vision

Author: Zhao, Kaikai, Cui, Yajie, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In the field of waste copper granules recycling, engineers should be able to identify all different sorts of impurities in waste copper granules and estimate their mass proportion relying on experience before rating. This manual rating method is costly, lacking in objectivity and comprehensiveness. To tackle this problem, we propose a waste copper granules rating system based on machine vision and deep learning. We firstly formulate the rating task into a 2D image recognition and purity regression task. Then we design a two-stage convolutional rating network to compute the mass purity and rating level of waste copper granules. Our rating network includes a segmentation network and a purity regression network, which respectively calculate the semantic segmentation heatmaps and purity results of the waste copper granules. After training the rating network on the augmented datasets, experiments on real waste copper granules demonstrate the effectiveness and superiority of the proposed network. Specifically, our system is superior to the manual method in terms of accuracy, effectiveness, robustness, and objectivity.
Published: 2022

28. A Survey on Unsupervised Anomaly Detection Algorithms for Industrial Images

Author: Cui, Yajie, Liu, Zhaoxiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In line with the development of Industry 4.0, surface defect detection/anomaly detection becomes a topical subject in the industry field. Improving efficiency as well as saving labor costs has steadily become a matter of great concern in practice, where deep learning-based algorithms perform better than traditional vision inspection methods in recent years. While existing deep learning-based algorithms are biased towards supervised learning, which not only necessitates a huge amount of labeled data and human labor, but also brings about inefficiency and limitations. In contrast, recent research shows that unsupervised learning has great potential in tackling the above disadvantages for visual industrial anomaly detection. In this survey, we summarize current challenges and provide a thorough overview of recently proposed unsupervised algorithms for visual industrial anomaly detection covering five categories, whose innovation points and frameworks are described in detail. Meanwhile, publicly available datasets for industrial anomaly detection are introduced. By comparing different classes of methods, the advantages and disadvantages of anomaly detection algorithms are summarized. Based on the current research framework, we point out the core issue that remains to be resolved and provide further improvement directions. Meanwhile, based on the latest technological trends, we offer insights into future research directions. It is expected to assist both the research community and industry in developing a broader and cross-domain perspective.
Published: 2022
Full Text: View/download PDF

29. 3D large-scale fused silica microfluidic chips enabled by hybrid laser microfabrication for continuous-flow UV photochemical synthesis

Author: Zhang, Aodong, Xu, Jian, Li, Yucen, Hu, Ming, Lin, Zijie, Song, Yunpeng, Qi, Jia, Chen, Wei, Liu, Zhaoxiang, and Cheng, Ya
Subjects: Physics - Applied Physics
Abstract: We demonstrate a hybrid laser microfabrication approach, which combines the technical merits of ultrafast laser-assisted chemical etching and carbon dioxide laser-induced in-situ melting, for centimeter-scale and bonding-free fabrication of 3D complex hollow microstructures in fused silica glass. With the developed approach, large-scale fused silica microfluidic chips with integrated 3D cascaded micromixing units can be reliably manufactured. High-performance on-chip mixing and continuous-flow photochemical synthesis under UV LEDs irradiation at ~280 nm were demonstrated using the manufactured chip, indicating a powerful capability for versatile fabrication of highly transparent all-glass microfluidic reactors for on-chip photochemical synthesis.
Published: 2022
Full Text: View/download PDF

30. Hybrid attention transformer with re-parameterized large kernel convolution for image super-resolution

Author: Ma, Zhicheng, Liu, Zhaoxiang, Wang, Kai, and Lian, Shiguo
Published: 2024
Full Text: View/download PDF

31. Monolithically integrated waveguide-coupled single-frequency microlaser on erbium-doped thin film lithium niobate

Author: Liang, Youting, Zhou, Junxia, Wu, Rongbo, Fang, Zhiwei, Liu, Zhaoxiang, Yu, Shupeng, Yin, Difeng, Zhang, Haisu, Zhou, Yuan, Liu, Jian, Wang, Zhenhua, Wang, Min, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: We overcome the difficulty in realizing a monolithic waveguide-coupled microring laser integrated on erbium-doped thin film lithium niobate (Er: TFLN) using photolithography assisted chemo-mechanical etching (PLACE) technique. We demonstrate an integrated single-frequency microring laser operating around 1531 nm wavelength. The PLACE technique, enabling integrated Er: TFLN photonics with low propagation loss, can thus be used to realize low cost mass production of monolithic on-chip microlasers with applications ranging from optical communication and photonic integrated circuit (PIC) to precision metrology and large-scale sensing., Comment: 4 pages, 5 figures
Published: 2022

32. A high-gain cladded waveguide amplifier on erbium doped thin-film lithium niobate fabricated using photolithography assisted chemo-mechanical etching

Author: Liang, Youting, Zhou, Junxia, Liu, Zhaoxiang, Zhang, Haisu, Fang, Zhiwei, Zhou, Yuan, Yin, Difeng, Lin, Jintian, Yu, Jianping, Wu, Rongbo, Wang, Min, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: Erbium doped integrated waveguide amplifier and laser prevail in power consumption, footprint, stability and scalability over the counterparts in bulk materials, underpinning the lightwave communication and large-scale sensing. Subject to the highly confined mode and moderate propagation loss, gain and power scaling in such integrated micro-to-nanoscale devices prove to be more challenging compared to their bulk counterparts. In this work, stimulated by the prevalent success of double-cladding optical fiber in high-gain/power operation, a Ta2O5 cladding is employed in the erbium doped lithium niobate (LN) waveguide amplifier fabricated on the thin film lithium niobate on insulator (LNOI) wafer by the photolithography assisted chemomechanical etching (PLACE) technique. Above 20 dB small signal internal net gain is achieved at the signal wavelength around 1532 nm in the 10 cm long LNOI amplifier pumped by the diode laser at ~980 nm. Experimental characterizations reveal the advantage of Ta2O5 cladding in higher optical gain compared with the air-clad amplifier, which is further explained by the theoretical modeling of the LNOI amplifier including the guided mode structures and the steady-state response of erbium ions., Comment: 11 pages, 5 figures
Published: 2021

33. On-chip multi-color microdisk laser on Yb3+-doped thin-film lithium niobate

Author: Zhou, Yuan, Wang, Zhe, Fang, Zhiwei, Liu, Zhaoxiang, Zhang, Haisu, Yin, Difeng, Liang, Youting, Zhang, Zhihao, Liu, Jian, Huang, Ting, Bao, Rui, Wu, Rongbo, Lin, Jintian, Wang, Min, and Cheng, Ya
Subjects: Physics - Optics
Abstract: We demonstrate an on-chip Yb3+-doped lithium niobate (LN) microdisk laser. The intrinsic quality factors of the fabricated Yb3+-doped LN microdisk resonator are measured up to 3.79x10^5 at 976 nm wavelength and 1.1x10^6 at 1514 nm wavelength. The multi-mode laser emissions are obtained in a band from 1020 nm to 1070 nm pumped by 984 nm laser and with the low threshold of 103 {\mu}W, resulting in a slope efficiency of 0.53% at room temperature. Furthermore, the second-harmonic frequency of pump light and the sum-frequency of the pump light and laser emissions are both generated in the on-chip Yb3+-doped LN microdisk benefited from the strong \c{hi}(2) nonlinearity of LN. These microdisk lasers are expected to contribute to the high-density integration of LNOI-based photonic chip., Comment: 4 pages, 5 figures
Published: 2021
Full Text: View/download PDF

34. Fabrication of high-aspect-ratio fused silica microstructures with large depths using Bessel-beam femtosecond laser-assisted etching

Author: Song, Yunpeng, Xu, Jian, Liu, Zhaoxiang, Zhang, Aodong, Yu, Jianping, Qi, Jia, Chen, Wei, and Cheng, Ya
Published: 2024
Full Text: View/download PDF

35. An electro-optically tunable microring laser monolithically integrated on lithium niobate on insulator

Author: Yin, Difeng, Zhou, Yuan, Liu, Zhaoxiang, Wang, Zhe, Zhang, Haisu, Fang, Zhiwei, Chu, Wei, Wu, Rongbo, Zhang, Jianhao, Chen, Wei, Wang, Min, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: We demonstrate monolithic integration of an electro-optically (EO) tunable microring laser on lithium niobate on insulator (LNOI) platform. The device is fabricated by photolithography assisted chemo-mechanical etching (PLACE), and the pump laser is evanescently coupled into the erbium (Er3+) doped LN microring laser using an undoped LN waveguide mounted above the microring. The quality factor of the LN microring resonator is measured as high as 1.54x10^5 at the wavelength of 1542 nm. Lasing action can be observed at a pump power threshold below 3.5 mW using a 980 nm continuous-wave pump laser. Finally, tuning of the laser wavelength is achieved by varying the electric voltage on the microelectrodes fabricated in the vicinity of microring waveguide, showing an EO coefficient of 0.33 pm/V., Comment: 4 pages, 4 figures
Published: 2021
Full Text: View/download PDF

36. On-chip integrated waveguide amplifiers on Erbium-doped thin film lithium niobate on insulator

Author: Zhou, Junxia, Liang, Youting, Liu, Zhaoxiang, Chu, Wei, Zhang, Haisu, Yin, Difeng, Fang, Zhiwei, Wu, Rongbo, Zhang, Jianhao, Chen, Wei, Wang, Zhe, Zhou, Yuan, Wang, Min, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: We demonstrate on-chip light amplification with integrated optical waveguide fabricated on erbium-doped thin film lithium niobate on insulator (TFLNOI) using the photolithography assisted chemo-mechanical etching (PLACE) technique. A maximum internal net gain of 18 dB in the small-signal-gain regime is measured at the peak emission wavelength of 1530 nm for a waveguide length of 3.6 cm, indicating a differential gain per unit length of 5 dB/cm. This work paves the way to the monolithic integration of diverse active and passive photonic components on the TFLNOI platform., Comment: 4 pages, 6 figures
Published: 2021

37. An on-chip tunable micro-disk laser fabricated on Er3+ doped lithium niobate on insulator (LNOI)

Author: Wang, Zhe, Fang, Zhiwei, Liu, Zhaoxiang, Chu, Wei, Zhou, Yuan, Zhang, Jianhao, Wu, Rongbo, Wang, Min, Lu, Tao, and Cheng, Ya
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: We demonstrate a C-band wavelength-tunable microlaser with an Er3+ doped high quality (~1.02x10^6) lithium niobate microdisk resonator. With a 976 nm continuous-wave pump laser, lasing action can be observed at a pump power threshold as low as ~250 {\mu}W at room temperature. Furthermore, the microdisk laser wavelength can be tuned by varying the pump laser power, showing a tuning efficiency of ~-17.03 pm/mW at low pump power blow 13 mW, and 10.58 pm/mW at high pump power above 13 mW.
Published: 2020
Full Text: View/download PDF

38. A spectrally bright wavelength-switchable vacuum ultraviolet source driven by quantum coherence in strong-field-ionized molecules

Author: Wan, Yuexin, Liu, Zhaoxiang, Yao, Jinping, Xu, Bo, Chen, Jinming, Zhang, Fangbo, Zhang, Zhihao, Qiao, Lingling, and Cheng, Ya
Subjects: Physics - Optics, Physics - Atomic Physics
Abstract: We report generation of spectrally bright vacuum ultraviolet (VUV) and deep UV (DUV) coherent radiations driven by quantum coherence in tunnel-ionized carbon monoxide (CO) molecules. Our technique allows us to switch between multiple wavelengths provided by the abundant energy levels of molecular ions. The DUV/VUV sources can have arbitrary polarization states by manipulating the pump laser polarization. The superior temporal and spectral properties of the developed source give rise to a broadband Raman comb in the DUV/VUV region., Comment: 5 pages, 5 figures
Published: 2020

39. Extreme nonlinear Raman interaction of an ultrashort nitrogen ion laser with an impulsively excited molecular wavepacket

Author: Liu, Zhaoxiang, Yao, Jinping, Zhang, Haisu, Xu, Bo, Chen, Jinming, Zhang, Fangbo, Zhang, Zhihao, Wan, Yuexin, Chu, Wei, Wang, Zhenhua, and Cheng, Ya
Subjects: Physics - Optics
Abstract: We report generation of cascaded rotational Raman scattering up to 58th orders in coherently excited CO_2 molecules. The high-order Raman scattering, which produces a quasiperiodic frequency comb with more than 600 sidebands, is obtained using an intense femtosecond laser to impulsively excite rotational coherence and the femtosecond-laser-induced N_2^+ lasing to generate cascaded Raman signals. The novel configuration allows this experiment to be performed with a single femtosecond laser beam at free-space standoff locations. It is revealed that the efficient spectral extension of Raman signals is attributed to the specific spectra-temporal structures of N_2^+ lasing, the ideal spatial overlap of femtosecond laser and N2+ lasing, and the guiding effect of molecular alignment. The Raman spectrum extending above 2000 cm^-1 naturally corresponds to a femtosecond pulse train due to the periodic revivals of molecular rotational wavepackets., Comment: 17 pages, 4 figures
Published: 2019
Full Text: View/download PDF

40. How Old Are You? Face Age Translation with Identity Preservation Using GANs

Author: Wang, Zipeng, Liu, Zhaoxiang, Huang, Jianfeng, Lian, Shiguo, and Lin, Yimin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present a novel framework to generate images of different age while preserving identity information, which is known as face aging. Different from most recent popular face aging networks utilizing Generative Adversarial Networks(GANs) application, our approach do not simply transfer a young face to an old one. Instead, we employ the edge map as intermediate representations, firstly edge maps of young faces are extracted, a CycleGAN-based network is adopted to transfer them into edge maps of old faces, then another pix2pixHD-based network is adopted to transfer the synthesized edge maps, concatenated with identity information, into old faces. In this way, our method can generate more realistic transfered images, simultaneously ensuring that face identity information be preserved well, and the apparent age of the generated image be accurately appropriate. Experimental results demonstrate that our method is feasible for face age translation., Comment: 9 pages, 10 figures
Published: 2019

41. A Realistic Face-to-Face Conversation System based on Deep Neural Networks

Author: Chen, Zezhou, Liu, Zhaoxiang, Hu, Huan, Bai, Jinqiang, Lian, Shiguo, Shi, Fuyuan, and Wang, Kai
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: To improve the experiences of face-to-face conversation with avatar, this paper presents a novel conversation system. It is composed of two sequence-to-sequence models respectively for listening and speaking and a Generative Adversarial Network (GAN) based realistic avatar synthesizer. The models exploit the facial action and head pose to learn natural human reactions. Based on the models' output, the synthesizer uses the Pixel2Pixel model to generate realistic facial images. To show the improvement of our system, we use a 3D model based avatar driving scheme as a reference. We train and evaluate our neural networks with the data from ESPN shows. Experimental results show that our conversation system can generate natural facial reactions and realistic facial images., Comment: Accepted to ICCV 2019 workshop
Published: 2019

42. A Neural Virtual Anchor Synthesizer based on Seq2Seq and GAN Models

Author: Wang, Zipeng, Liu, Zhaoxiang, Chen, Zezhou, Hu, Huan, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper presents a novel framework to generate realistic face video of an anchor, who is reading certain news. This task is also known as Virtual Anchor. Given some paragraphs of words, we first utilize a pretrained Word2Vec model to embed each word into a vector; then we utilize a Seq2Seq-based model to translate these word embeddings into action units and head poses of the target anchor; these action units and head poses will be concatenated with facial landmarks as well as the former $n$ synthesized frames, and the concatenation serves as input of a Pix2PixHD-based model to synthesize realistic facial images for the virtual anchor. The experimental results demonstrate our framework is feasible for the synthesis of virtual anchor., Comment: Accepted to ISMAR 2019
Published: 2019

43. Video synthesis of human upper body with realistic face

Author: Liu, Zhaoxiang, Hu, Huan, Wang, Zipeng, Wang, Kai, Bai, Jinqiang, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper presents a generative adversarial learning-based human upper body video synthesis approach to generate an upper body video of target person that is consistent with the body motion, face expression, and pose of the person in source video. We use upper body keypoints, facial action units and poses as intermediate representations between source video and target video. Instead of directly transferring the source video to the target video, we firstly map the source person's facial action units and poses into the target person's facial landmarks, then combine the normalized upper body keypoints and generated facial landmarks with spatio-temporal smoothing to generate the corresponding target video's image. Experimental results demonstrated the effectiveness of our method., Comment: 3 pages, 4 figures,Accepted by ISMAR 2019
Published: 2019

44. Feature Aggregation Network for Video Face Recognition

Author: Liu, Zhaoxiang, Hu, Huan, Bai, Jinqiang, Li, Shaohua, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: This paper aims to learn a compact representation of a video for video face recognition task. We make the following contributions: first, we propose a meta attention-based aggregation scheme which adaptively and fine-grained weighs the feature along each feature dimension among all frames to form a compact and discriminative representation. It makes the best to exploit the valuable or discriminative part of each frame to promote the performance of face recognition, without discarding or despising low quality frames as usual methods do. Second, we build a feature aggregation network comprised of a feature embedding module and a feature aggregation module. The embedding module is a convolutional neural network used to extract a feature vector from a face image, while the aggregation module consists of cascaded two meta attention blocks which adaptively aggregate the feature vectors into a single fixed-length representation. The network can deal with arbitrary number of frames, and is insensitive to frame order. Third, we validate the performance of proposed aggregation scheme. Experiments on publicly available datasets, such as YouTube face dataset and IJB-A dataset, show the effectiveness of our method, and it achieves competitive performances on both the verification and identification protocols., Comment: 9 pages, 4 figures, Accepted by ICCV 2019 workshop
Published: 2019

45. Facial Pose Estimation by Deep Learning from Label Distributions

Author: Liu, Zhaoxiang, Chen, Zezhou, Bai, Jinqiang, Li, Shaohua, and Lian, Shiguo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Facial pose estimation has gained a lot of attentions in many practical applications, such as human-robot interaction, gaze estimation and driver monitoring. Meanwhile, end-to-end deep learning-based facial pose estimation is becoming more and more popular. However, facial pose estimation suffers from a key challenge: the lack of sufficient training data for many poses, especially for large poses. Inspired by the observation that the faces under close poses look similar, we reformulate the facial pose estimation as a label distribution learning problem, considering each face image as an example associated with a Gaussian label distribution rather than a single label, and construct a convolutional neural network which is trained with a multi-loss function on AFLW dataset and 300W-LP dataset to predict the facial poses directly from color image. Extensive experiments are conducted on several popular benchmarks, including AFLW2000, BIWI, AFLW and AFW, where our approach shows a significant advantage over other state-of-the-art methods., Comment: 9 pages,5 figures, Accepted by ICCV 2019 workshop
Published: 2019

46. Virtual-Blind-Road Following Based Wearable Navigation Device for Blind People

Author: Bai, Jinqiang, Lian, Shiguo, Liu, Zhaoxiang, Wang, Kai, and Liu, Dijun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: To help the blind people walk to the destination efficiently and safely in indoor environment, a novel wearable navigation device is presented in this paper. The locating, way-finding, route following and obstacle avoiding modules are the essential components in a navigation system, while it remains a challenging task to consider obstacle avoiding during route following, as the indoor environment is complex, changeable and possibly with dynamic objects. To address this issue, we propose a novel scheme which utilizes a dynamic sub-goal selecting strategy to guide the users to the destination and help them bypass obstacles at the same time. This scheme serves as the key component of a complete navigation system deployed on a pair of wearable optical see-through glasses for the ease of use of blind people's daily walks. The proposed navigation device has been tested on a collection of individuals and proved to be effective on indoor navigation tasks. The sensors embedded are of low cost, small volume and easy integration, making it possible for the glasses to be widely used as a wearable consumer device., Comment: 8 pages, 9 figures, TCE accepted
Published: 2019
Full Text: View/download PDF

47. Deep Learning Based Robot for Automatically Picking up Garbage on the Grass

Author: Bai, Jinqiang, Lian, Shiguo, Liu, Zhaoxiang, Wang, Kai, and Liu, Dijun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: This paper presents a novel garbage pickup robot which operates on the grass. The robot is able to detect the garbage accurately and autonomously by using a deep neural network for garbage recognition. In addition, with the ground segmentation using a deep neural network, a novel navigation strategy is proposed to guide the robot to move around. With the garbage recognition and automatic navigation functions, the robot can clean garbage on the ground in places like parks or schools efficiently and autonomously. Experimental results show that the garbage recognition accuracy can reach as high as 95%, and even without path planning, the navigation strategy can reach almost the same cleaning efficiency with traditional methods. Thus, the proposed robot can serve as a good assistance to relieve dustman's physical labor on garbage cleaning tasks., Comment: 8 pages, 13 figures,TCE accepted
Published: 2019
Full Text: View/download PDF

48. Wearable Travel Aid for Environment Perception and Navigation of Visually Impaired People

Author: Bai, Jinqiang, Liu, Zhaoxiang, Lin, Yimin, Li, Ye, Lian, Shiguo, and Liu, Dijun
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Human-Computer Interaction
Abstract: This paper presents a wearable assistive device with the shape of a pair of eyeglasses that allows visually impaired people to navigate safely and quickly in unfamiliar environment, as well as perceive the complicated environment to automatically make decisions on the direction to move. The device uses a consumer Red, Green, Blue and Depth (RGB-D) camera and an Inertial Measurement Unit (IMU) to detect obstacles. As the device leverages the ground height continuity among adjacent image frames, it is able to segment the ground from obstacles accurately and rapidly. Based on the detected ground, the optimal walkable direction is computed and the user is then informed via converted beep sound. Moreover, by utilizing deep learning techniques, the device can semantically categorize the detected obstacles to improve the users' perception of surroundings. It combines a Convolutional Neural Network (CNN) deployed on a smartphone with a depth-image-based object detection to decide what the object type is and where the object is located, and then notifies the user of such information via speech. We evaluated the device's performance with different experiments in which 20 visually impaired people were asked to wear the device and move in an office, and found that they were able to avoid obstacle collisions and find the way in complicated scenarios., Comment: 7 pages, 12 figures
Published: 2019

49. Deep Global-Relative Networks for End-to-End 6-DoF Visual Localization and Odometry

Author: Lin, Yimin, Liu, Zhaoxiang, Huang, Jianfeng, Wang, Chaopeng, Du, Guoguang, Bai, Jinqiang, Lian, Shiguo, and Huang, Bill
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Although a wide variety of deep neural networks for robust Visual Odometry (VO) can be found in the literature, they are still unable to solve the drift problem in long-term robot navigation. Thus, this paper aims to propose novel deep end-to-end networks for long-term 6-DoF VO task. It mainly fuses relative and global networks based on Recurrent Convolutional Neural Networks (RCNNs) to improve the monocular localization accuracy. Indeed, the relative sub-networks are implemented to smooth the VO trajectory, while global subnetworks are designed to avoid drift problem. All the parameters are jointly optimized using Cross Transformation Constraints (CTC), which represents temporal geometric consistency of the consecutive frames, and Mean Square Error (MSE) between the predicted pose and ground truth. The experimental results on both indoor and outdoor datasets show that our method outperforms other state-of-the-art learning-based VO methods in terms of pose accuracy., Comment: 7 pages, 6 figures
Published: 2018

50. DTjRTL: A Configurable Framework for Automated Hardware Trojan Insertion at RTL

Author: Dai, Ruochen, primary, Liu, Zhaoxiang, additional, Arias, Orlando, additional, Guo, Xiaolong, additional, and Yavuz, Tuba, additional
Published: 2024
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

378 results on '"Liu, Zhaoxiang"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources