Author: "Du, Songlin" / Publication Type: Magazines - Searchworks@Jio Institute Digital Library Search Results

1. Bi-Pose: Bidirectional 2D-3D Transformation for Human Pose Estimation From a Monocular Camera

Author: Du, Songlin, Wang, Hao, Yuan, Zhiwei, and Ikenaga, Takeshi
Abstract: Automatically estimating 3D human poses in video and inferring their meanings play an essential role in many human-centered automation systems. Existing researches made remarkable progresses by first estimating 2D human joints in video and then reconstructing 3D human pose from the 2D joints. However, mono-directionally reconstructing 3D pose from 2D joints ignores the interaction between information in 3D space and 2D space, losses rich information of original video, therefore limits the ceiling of estimation accuracy. To this end, this paper proposes a bidirectional 2D-3D transformation framework that bidirectionally exchanges 2D and 3D information and utilizes video information to estimate an offset for refining 3D human pose. In addition, a bone-length stability loss is utilized for the purpose of exploring human body structure to make the estimated 3D pose more natural and to further increase the overall accuracy. By evaluation, estimation error of the proposed method, measured by the mean per joint position error (MPJPE), is only 46.5 mm, which is much lower than state-of-the-art methods under the same experimental condition. The improvement on accuracy will make machines to better understand human poses for building superior human-centered automation systems. Note to Practitioners—This paper was motivated by the demand of human-centered automation systems needing to accurately understand human poses. Existing approaches mainly focus on inferring 3D human pose from 2D joints mono-directionally. Although they made remarkable contributions to estimating 3D human pose in such a mono-directional way, we found that they ignore the 2D-3D interaction and do not use original video when inferring 3D pose from 2D joints. This paper therefore suggests a bidirectional 2D-3D transformation that exchanges 2D and 3D information and utilizes video information to estimate more accurate 3D human pose for human-centered automation systems. This work is a pioneering attempt of interactively using 2D and 3D information for more accurate estimation of human pose. Benefited from the state-of-the-art accuracy, the proposed approach is expected to make significant contributions to many human-centered automation systems, such as human-machine interaction, biomimetic manipulation, and automatic surveillance systems.
Published: 2024
Full Text: View/download PDF

2. Toward Better Generalization: Shape Feature-Enhanced Fastener Defect Detection With Diffusion Model

Author: Su, Shixiang, Du, Songlin, Wang, Dezhou, and Lu, Xiaobo
Abstract: Fastener defect detection is important for making sure that railroads operate reliably. A robust fastener detector should perform well in most railway track circumstances, especially in an unseen environment. However, with a limited fastener dataset, this requirement remains challenging due to a large domain shift between the available dataset and unseen fasteners. This article aims to build a robust fastener defect detection architecture by instance segmentation, where only a single domain of data is available for training. To this end, we propose a novel end-to-end fastener detection model that can be divided into a base detector and two key components: a mask prediction head (MPH) and a mask refinement head (MRH). MPH generates the initial masks based on the predicted boxes supplied from the base detector. By combining the shape priors of the fastener, a simple yet effective shape enhancement module is introduced to encourage MPH to better capture the shape feature. MRH is presented to improve the initial masks with an iteratively refined process, which is achieved by a denoising diffusion model (DDM). Meanwhile, a shape guidance module (SGM) is designed to enhance the recovery capability of DDM. Experiments on the track fastener dataset revealed that the proposed heads, i.e., MPH and MRH, can be equipped with various detection frameworks and have shown consistent generalization improvements on many base detectors. Moreover, our approach significantly boosts the detection performance on different unseen sites.
Published: 2024
Full Text: View/download PDF

3. Reactive power optimization for distribution network with distributed generators by improved driving training‑based optimization

Author: Loskot, Pavel, Ding, Xiaofeng, Du, Songlin, Hai, Tao, Lu, Jianfeng, and Wang, Jun
Published: 2023
Full Text: View/download PDF

4. RFS-Net: Railway Track Fastener Segmentation Network With Shape Guidance

Author: Su, Shixiang, Du, Songlin, Wei, Xuan, and Lu, Xiaobo
Abstract: The fastener is one of the main components of a rail track system. In recent years, deep learning methods such as image segmentation have greatly boosted the fastener state detection process. However, there is still a need to improve the segmentation accuracy and speed, especially for the fasteners in complex environments. To handle this problem, a fast and accurate fastener semantic segmentation network named RFS-Net is proposed based on shape guidance, which can offer a better speed/accuracy trade-off performance via a very shallow architecture. Specifically, in the encoder, a two-stream structure (i.e., regular stream and shape stream) that processes the fastener and shape image in parallel is introduced. The shape image is created based on the geometric structure of the fastener, and it is served as input to the shape stream to guide the segmentation of the fastener. The decoder integrates deep features from the two-stream encoder and then recovers the shape information by the shape attention blocks with skipping connections. We provide two versions of RFS-Net: RFS-Net_S (1.0M, 1014FPS) and RFS-Net_L (12.01M, 453FPS) on the NVIDIA RTX 3060. Experimental results demonstrate the effectiveness of our method by achieving a promising trade-off between accuracy and inference speed. In particular, our method is faster and more accurate on a challenging dataset, from fast modes: 1014 FPS for RFS-Net_S versus 724 FPS for Segmenter, to high-quality segmentation: better performance than STDC with nearly one percent (92.36% versus 91.48% Mean IoU score).
Published: 2023
Full Text: View/download PDF

5. Geometric Constraint and Image Inpainting-Based Railway Track Fastener Sample Generation for Improving Defect Inspection

Author: Su, Shixiang, Du, Songlin, and Lu, Xiaobo
Abstract: Defective fastener images detection is an essential task in the vision-based railway track safety inspection. Although existing methods have achieved some level of success, the detection accuracy in this field suffers from the defective fasteners being far less common than normal fasteners. One way to tackle this problem is to expand the defect sample. However, current state-of-the-art defective fastener generation methods mainly rely on generative adversarial networks or simply augment the defect data through traditional image processing. These methods may not be ideal as it is difficult to produce images with high quality and rich diversity at the same time. This paper proposes a new method for fastener sample generation that actively divides the sample generation into two independent parts: defective foregrounds generation and complete backgrounds generation. The key to this method is to generate foregrounds and backgrounds based on geometric constraint and image inpainting, respectively. Specifically, we adopt a skeleton mapping algorithm to directionally control the generated types of defective foregrounds. Meanwhile, an image inpainting network is employed to expand the background. The experiments show that this enables us to generate better-quality and richer-diversity images by combining deep learning and image processing advantages. To the best of our knowledge, our method is the first to achieve state-of-the-art performance, i.e., the classification accuracy reaches 97.97%, without using real defective fastener images during the defect classification network training process.
Published: 2022
Full Text: View/download PDF

6. Highly-Parallel Hardwired Deep Convolutional Neural Network for 1-ms Dual-Hand Tracking

Author: Zhang, Peiqi, Hu, Tingting, Luo, Dingli, Du, Songlin, and Ikenaga, Takeshi
Abstract: 1-ms vision systems represent an extreme case of temporal development in video sensing techniques. Moreover, a 1-ms dual-hand tracking system leverages the dexterous functionality of hands and thus serves as a seamless and intuitive interface for Human-Computer Interaction. Deep CNN is promising for high tracking robustness, however, neither GPU-based nor FPGA-based implementation addresses the tracking task with ultra-high-speed. This paper proposes: (a) A paradigm to directly map a deep CNN as a hardwired circuit, so the entire network runs in parallel and high processing speed is obtained. The network is exempted from memory access since all intermediate neural values are implicitly represented in hardware states. And condensed binarization is used to reduce resource utilization; (b) Hardware design of the hardwired network on FPGA, inside which kernel-adapted convolutional trees are devised to maximize the parallelism. The speed bottleneck of the network is therefore removed by implementing convolutional layers as fine-grained pipelines with unified components; (c) FPGA-GPU hetero complementation, which utilizes an auxiliary GPU network to compensate for accuracy of the FPGA network without affecting its speed. The quick primary results on FPGA are intermittently refined using delayed but accurate hints from GPU. Implementation results show that the proposed method reaches 973fps and consumes merely 1.30ms to process on $640\times 480$ images, while the accuracy is only 4.7% lower compared with the general method on test sequences. Video demonstrations are available at https://wcms.waseda.jp/em/5f9d020f136e7.
Published: 2022
Full Text: View/download PDF

7. Backwards pre-analysis and flexible quantization group size adjustment based quality optimization for screen content coding

Author: Su, Ruidan, Zhao, Ziyan, Li, Yihang, Liu, Qin, Du, Songlin, and Ikenaga, Takeshi
Published: 2020
Full Text: View/download PDF

8. Bottom-up check and temporal rate proportion based fast InterIMV algorithm in versatile video coding

Author: Su, Ruidan, Li, Yihang, Zhao, Ziyan, Liu, Qin, Du, Songlin, and Ikenaga, Takeshi
Published: 2020
Full Text: View/download PDF

9. Blind image quality assessment with the histogram sequences of high-order local derivative patterns.

Author: Du, Songlin, Yan, Yaping, and Ma, Yide
Subjects: *COMPUTER vision, *DIGITAL image processing, *HISTOGRAMS, *IMAGE quality analysis, *PATTERN recognition systems, *IMAGE databases
Abstract: Automatic assessment of the perceptual quality of digital image is an important and challenging issue in computer vision. Although human visual system (HVS) is sensitive to degradations on spatial structures, most of the existing methods do not take into account the spatial distribution of local structures. This paper reports a novel approach coined high-order local derivative pattern (LDP) based metric (HOLDPM). In particular, HOLDPM extracts local image structures with LDPs in multi-directions to yield an accurate assessment of image quality. HOLDPM is extensively evaluated on three large-scale public databases. Experimental results demonstrate that HOLDPM is able to achieve high assessment accuracy. Besides, objective assessment result of the HOLDPM is consistent with the subjective assessment result of the HVS. Specifically, the experimental results also indicate that HOLDPM outperforms most of the state-of-the-art methods in distortion specific tests. Additionally, HOLDPM shows competitive overall performance when measured with the weighted average of Spearman rank-order correlation coefficient (SROCC) and the weighted average of Pearson linear correlation coefficient (PLCC) over the test databases. [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

10. Human Visual Characteristics Inspired Adaptive Image Quantization Method

Author: Huang, Yi, Ma, Jianlin, Du, Songlin, and Ma, Yide
Abstract: Combined with human visual characteristics, this paper proposes an adaptive image quantization method using pulse coupled neural networks (PCNN). It is acknowledged that PCNN satisfies human visual characteristics which can be used to quantize images. We have been trying to obtain the optimal image quantized with minimum amount of data for a long time and finally discover that the proposed algorithm comes up to the expected standard. First, the gray scale image is imported into PCNN and then the parameters of PCNN are set automatically. Secondly, the image is quantized by the hand of proposed algorithm. The experimental results of the gray natural images from the standard image library prove the validity and efficiency of our proposed quantization method.
Published: 2014
Full Text: View/download PDF

11. Learning the histogram sequences of generalized local ternary patterns for blind image quality assessment

Author: Xie, Yi, Wang, Yulin, Jiang, Xudong, Yan, Yaping, Du, Songlin, Zhang, Hongjuan, and Ma, Yide
Published: 2015
Full Text: View/download PDF

12. Effective and fully automatic image segmentation using quantum entropy and pulse-coupled neural networks

Author: Wang, Yulin, Jiang, Xudong, Zhang, David, Du, Songlin, Yan, Yaping, and Ma, Yide
Published: 2015
Full Text: View/download PDF

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

12 results on '"Du, Songlin"'

1. Bi-Pose: Bidirectional 2D-3D Transformation for Human Pose Estimation From a Monocular Camera

2. Toward Better Generalization: Shape Feature-Enhanced Fastener Defect Detection With Diffusion Model

3. Reactive power optimization for distribution network with distributed generators by improved driving training‑based optimization

4. RFS-Net: Railway Track Fastener Segmentation Network With Shape Guidance

5. Geometric Constraint and Image Inpainting-Based Railway Track Fastener Sample Generation for Improving Defect Inspection

6. Highly-Parallel Hardwired Deep Convolutional Neural Network for 1-ms Dual-Hand Tracking

7. Backwards pre-analysis and flexible quantization group size adjustment based quality optimization for screen content coding

8. Bottom-up check and temporal rate proportion based fast InterIMV algorithm in versatile video coding

9. Blind image quality assessment with the histogram sequences of high-order local derivative patterns.

10. Human Visual Characteristics Inspired Adaptive Image Quantization Method

11. Learning the histogram sequences of generalized local ternary patterns for blind image quality assessment

12. Effective and fully automatic image segmentation using quantum entropy and pulse-coupled neural networks

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

12 results on '"Du, Songlin"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources