34 results for "Jingsheng Lei"
Search Results
2. CIMFNet: Cross-Layer Interaction and Multiscale Fusion Network for Semantic Segmentation of High-Resolution Remote Sensing Images
- Authors: Wujie Zhou, Jianhui Jin, Jingsheng Lei, and Lu Yu
- Subjects: Signal Processing, Electrical and Electronic Engineering
- Published: 2022
3. FRNet: Feature Reconstruction Network for RGB-D Indoor Scene Parsing
- Authors: Wujie Zhou, Enquan Yang, Jingsheng Lei, and Lu Yu
- Subjects: Signal Processing, Electrical and Electronic Engineering
- Published: 2022
4. A Time-Specified Zeroing Neural Network for Quadratic Programming With Its Redundant Manipulator Application
- Authors: Xianyi Li, Ying Kong, Yunliang Jiang, and Jingsheng Lei
- Subjects: Constraint (information theory), Artificial neural network, Terminal (electronics), Control and Systems Engineering, Computer science, Control theory, Convergence (routing), Attractor, Quadratic programming, State (computer science), Electrical and Electronic Engineering, Constant (mathematics)
- Abstract
To solve time-varying quadratic programming with an equality constraint, a new time-specified zeroing neural network (TSZNN) is proposed and analyzed. Unlike existing methods such as the Zhang neural network (ZNN) with different activation functions (AFs) and the finite-time neural network (FTNN), the TSZNN model incorporates a terminal attractor, and its convergent error is guaranteed to reduce to zero by a prescribed time (rather than merely in finite time). The greatest advantage of the TSZNN model is its independence from the initial state of the system dynamics, in contrast to finite-time convergence, which depends on the initial conditions; this substantially improves the convergence performance. Mathematical analyses substantiate the pre-specified convergence of the TSZNN model and its high convergence precision under various convergence-time settings. Pre-specified convergence of the TSZNN model for a quadratic programming problem is mathematically proved under different convergence-constant settings. In addition, simulations on the repeatable trajectory planning of a redundant manipulator demonstrate the validity of the proposed TSZNN model.
- Published: 2022
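The zeroing-neural-network design principle behind this abstract can be illustrated with a minimal numpy sketch of a conventional ZNN (not the paper's time-specified TSZNN, whose terminal attractor is specific to that work): the design imposes de/dt = -gamma * e on the error e(t) = A(t)x - b(t) of a time-varying linear system. The example system, gain, and step size are illustrative assumptions.

```python
import numpy as np

def znn_solve(A, b, x0, gamma=50.0, dt=1e-3, T=1.0):
    """Integrate conventional ZNN dynamics for the time-varying system
    A(t) x(t) = b(t). Imposing e' = -gamma*e on e = A x - b and expanding
    e' = A x' + A' x - b' gives x' = A^{-1}(b' - gamma*e - A' x).
    Time derivatives of A and b are estimated by forward differences."""
    x = np.array(x0, dtype=float)
    t = 0.0
    while t < T:
        At, bt = A(t), b(t)
        Ad = (A(t + dt) - At) / dt      # numerical dA/dt
        bd = (b(t + dt) - bt) / dt      # numerical db/dt
        e = At @ x - bt                 # zeroing error e(t)
        x += dt * np.linalg.solve(At, bd - gamma * e - Ad @ x)
        t += dt
    return x
```

With a moderate gain the discretized error stays small, so the returned state tracks the exact time-varying solution closely.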
5. MFFENet: Multiscale Feature Fusion and Enhancement Network For RGB–Thermal Urban Road Scene Parsing
- Authors: Jenq-Neng Hwang, Lu Yu, Jingsheng Lei, Wujie Zhou, and Lin Xinyang
- Subjects: Fusion, Parsing, Computer science, business.industry, Deep learning, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, computer.software_genre, Computer Science Applications, Robustness (computer science), Signal Processing, Shadow, Media Technology, RGB color model, Computer vision, Artificial intelligence, Electrical and Electronic Engineering, Layer (object-oriented design), business, Encoder, computer
- Abstract
Compared with traditional handcrafted features, deep learning has greatly improved the performance of scene parsing. However, scene parsing remains challenging under the various environmental conditions caused by imaging limitations. Thermal imaging cameras have several advantages over visible-spectrum cameras, such as operation in total darkness, robustness to shadow effects, insensitivity to illumination variations, and a strong ability to penetrate smog and haze. These advantages make thermal imaging cameras ideal for parsing semantic objects in both daytime and nighttime scenes. In this paper, we propose a novel multiscale feature fusion and enhancement network (MFFENet) for accurate parsing of RGB-thermal urban road scenes even when the quality of the available RGB data is compromised. The proposed MFFENet consists of two encoders, a feature fusion layer, and a multi-label supervision layer. We concatenate the multiscale features with the features that contain global semantic information. Furthermore, we explore the cross-modal fusion of RGB and thermal features at multiple stages, rather than fusing them once at a low or high stage. Then, we propose a spatial attention mechanism module that assigns a higher weight to (focuses more on) the foreground area, allowing MFFENet to emphasize foreground objects. Finally, multi-label supervision is introduced to optimize the parameters of the proposed MFFENet. Experimental results confirm that the proposed MFFENet outperforms similar high-performing methods.
- Published: 2022
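The spatial attention mechanism the MFFENet abstract describes, weighting foreground locations more heavily, is commonly realized by squeezing the channel dimension and gating each spatial location. A minimal numpy sketch under that generic assumption (not the paper's exact module):

```python
import numpy as np

def spatial_attention(feat):
    """Spatial attention over a (C, H, W) feature map: squeeze channels with
    mean- and max-pooling, squash the combination to a (1, H, W) weight map
    in (0, 1), and reweight the features so that high-response (foreground)
    locations dominate."""
    avg = feat.mean(axis=0, keepdims=True)     # (1, H, W) channel average
    mx = feat.max(axis=0, keepdims=True)       # (1, H, W) channel maximum
    att = 1.0 / (1.0 + np.exp(-(avg + mx)))    # sigmoid gate in (0, 1)
    return feat * att, att
```

Locations with strong activations receive gates near 1, while flat background locations stay near the sigmoid midpoint of 0.5 and are relatively suppressed.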
6. Multiscale Progressive Complementary Fusion Network for Fine-Grained Visual Classification
- Authors: Jingsheng Lei, Xinqi Yang, and Shengying Yang
- Subjects: General Computer Science, General Engineering, General Materials Science, Electrical and Electronic Engineering
- Published: 2022
7. RTLNet: Recursive Triple-Path Learning Network for Scene Parsing of RGB-D Images
- Authors: Yuchun Yue, Wujie Zhou, Jingsheng Lei, and Lu Yu
- Subjects: Applied Mathematics, Signal Processing, Electrical and Electronic Engineering
- Published: 2022
8. CCAFNet: Crossflow and Cross-Scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images
- Authors: Yun Zhu, Wujie Zhou, Wan Jian, Jingsheng Lei, and Lu Yu
- Subjects: Fusion, Relation (database), Channel (digital image), business.industry, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Pattern recognition, 02 engineering and technology, Computer Science Applications, Convolution, Feature (computer vision), Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, RGB color model, Leverage (statistics), 020201 artificial intelligence & image processing, Artificial intelligence, Electrical and Electronic Engineering, business, Spatial analysis
- Abstract
Owing to the widespread adoption of depth sensors, salient object detection (SOD) supported by depth maps, which provide reliable complementary information, is being increasingly investigated. Existing SOD models mainly exploit the relation between an RGB image and its corresponding depth information across three fusion domains: input RGB-D images, extracted feature maps, and the output salient object. However, these models do not leverage the crossflows between high- and low-level information well. Moreover, the decoder in these models uses conventional convolution, which involves considerable computation. To further improve RGB-D SOD, we propose a crossflow and cross-scale adaptive fusion network (CCAFNet) to detect salient objects in RGB-D images. First, a channel fusion module enables the effective fusion of depth and high-level RGB features, extracting accurate semantic information from the high-level RGB features. Meanwhile, a spatial fusion module combines low-level RGB and depth features with accurate boundaries and subsequently extracts detailed spatial information from the low-level depth features. Finally, a purification loss is proposed to precisely learn the boundaries of salient objects and obtain additional details of the objects. The results of comprehensive experiments on seven common RGB-D SOD datasets indicate that the performance of the proposed CCAFNet is comparable to those of state-of-the-art RGB-D SOD models.
- Published: 2022
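The channel fusion the CCAFNet abstract sketches, using depth to recalibrate high-level RGB channels, is often implemented with squeeze-and-excitation-style per-channel gates. A minimal numpy sketch under that generic assumption (the gating form is an illustration, not the paper's exact module):

```python
import numpy as np

def channel_fusion(rgb, depth):
    """Channel fusion of high-level RGB and depth features, both (C, H, W):
    squeeze the depth features to per-channel statistics, squash them to
    (0, 1) gates, and use the gates to recalibrate the RGB channels before
    summing with the depth features."""
    squeeze = depth.mean(axis=(1, 2))            # (C,) per-channel statistic
    gate = 1.0 / (1.0 + np.exp(-squeeze))        # sigmoid gate per channel
    return rgb * gate[:, None, None] + depth     # recalibrated fusion
```

Channels where the depth branch responds strongly pass the RGB signal almost unchanged; weak channels are attenuated toward half strength.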
9. CEGFNet: Common Extraction and Gate Fusion Network for Scene Parsing of Remote Sensing Images
- Authors: Jenq-Neng Hwang, Jianhui Jin, Wujie Zhou, and Jingsheng Lei
- Subjects: Data stream, Parsing, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Context (language use), computer.software_genre, Convolutional neural network, Upsampling, Feature (computer vision), General Earth and Planetary Sciences, Electrical and Electronic Engineering, computer, Spatial analysis, AND gate, Remote sensing
- Abstract
Scene parsing of high spatial resolution (HSR) remote sensing images has achieved notable progress in recent years through the adoption of convolutional neural networks. However, for scene parsing of multimodal remote sensing images, effectively integrating complementary information remains challenging. For instance, the decrease in feature-map resolution through a neural network causes loss of spatial information, likely leading to blurred object boundaries and misclassification of small objects. In addition, object scales in a remote sensing image vary substantially, undermining parsing performance. To solve these problems, we propose an end-to-end common extraction and gate fusion network (CEGFNet) to capture both high-level semantic features and low-level spatial details for scene parsing of remote sensing images. Specifically, we introduce a gate fusion module to extract complementary features from spectral data and digital surface model data. A gate mechanism removes redundant features in the data stream and extracts complementary features that improve multimodal feature fusion. In addition, a global context module and a multilayer aggregation decoder handle scale variations between objects and the loss of spatial details due to downsampling, respectively. The proposed CEGFNet was quantitatively evaluated on benchmark scene parsing datasets containing HSR remote sensing images, and it achieved state-of-the-art performance.
- Published: 2022
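The gate mechanism the CEGFNet abstract describes, weighing spectral against digital-surface-model features per element, can be sketched with a sigmoid gate that forms a convex combination of the two modalities. This is a hypothetical stand-in for the paper's module; `w` and `b` are illustrative gate parameters, not values from the paper.

```python
import numpy as np

def gate_fusion(spec, dsm, w, b):
    """Gate fusion of two modality feature vectors: a sigmoid gate g decides,
    per element, how much of the spectral feature to keep versus the DSM
    feature, suppressing whichever side is redundant at that position."""
    g = 1.0 / (1.0 + np.exp(-(w * (spec - dsm) + b)))  # elementwise gate in (0, 1)
    return g * spec + (1.0 - g) * dsm                  # convex combination
```

With `w = 0` and `b = 0` the gate is uniformly 0.5 and the fusion degenerates to a plain average; a large `w` makes the gate favor whichever modality has the stronger response.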
10. LFF-YOLO: A YOLO Algorithm With Lightweight Feature Fusion Network for Multi-Scale Defect Detection
- Authors: Xiaohong Qian, Xu Wang, Shengying Yang, and Jingsheng Lei
- Subjects: General Computer Science, General Engineering, General Materials Science, Electrical and Electronic Engineering
- Published: 2022
11. THCANet: Two-layer hop cascaded asymptotic network for robot-driving road-scene semantic segmentation in RGB-D images
- Authors: Gao Xu, Wujie Zhou, Xiaohong Qian, Yulai Zhang, Jingsheng Lei, and Lu Yu
- Subjects: Computational Theory and Mathematics, Artificial Intelligence, Applied Mathematics, Signal Processing, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Statistics, Probability and Uncertainty
- Published: 2023
12. Global and Local-Contrast Guides Content-Aware Fusion for RGB-D Saliency Prediction
- Authors: Ying Lv, Jingsheng Lei, Lu Yu, and Wujie Zhou
- Subjects: Fusion, Computer science, business.industry, Deep learning, Feature extraction, 020206 networking & telecommunications, Pattern recognition, 02 engineering and technology, Computer Science Applications, Visualization, Human-Computer Interaction, Upsampling, Control and Systems Engineering, 0202 electrical engineering, electronic engineering, information engineering, RGB color model, 020201 artificial intelligence & image processing, Artificial intelligence, Electrical and Electronic Engineering, business, Image resolution, Scaling, Software
- Abstract
Many RGB-D visual attention models with diverse fusion schemes have been proposed; thus, the main challenge lies in the differences between the results of these models. To address this challenge, we propose a local-global fusion model for fixation prediction on RGB-D images; this method combines global and local information through a content-aware fusion module (CAFM). First, the model comprises a channel-based upsampling block that exploits global contextual information and scales it up to the same resolution as the input. Second, our Deconv block contains a contrast feature module that utilizes multilevel local features stage by stage for superior local feature representation. The experimental results demonstrate that the proposed model exhibits competitive performance on two databases.
- Published: 2021
13. HFF-SRGAN: super-resolution generative adversarial network based on high-frequency feature fusion
- Authors: Jingsheng Lei, Hanbo Xue, Shengying Yang, Wenbin Shi, Shuping Zhang, and Yi Wu
- Subjects: Electrical and Electronic Engineering, Atomic and Molecular Physics, and Optics, Computer Science Applications
- Published: 2022
14. Two-Stage Cascaded Decoder for Semantic Segmentation of RGB-D Images
- Authors: Yuchun Yue, Wujie Zhou, Lu Yu, and Jingsheng Lei
- Subjects: Computer science, business.industry, Applied Mathematics, Feature extraction, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020206 networking & telecommunications, Pattern recognition, 02 engineering and technology, Image segmentation, Feature (computer vision), Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, Benchmark (computing), RGB color model, Segmentation, Noise (video), Artificial intelligence, Electrical and Electronic Engineering, business, Decoding methods
- Abstract
Exploiting RGB and depth information can boost the performance of semantic segmentation. However, owing to the differences between RGB images and the corresponding depth maps, such multimodal information must be used and combined effectively. Most existing methods apply the same fusion strategy to explore complementary information at all levels, likely ignoring the different contributions that features at various levels make to segmentation. To address this problem, we propose a network with a two-stage cascaded decoder (TCD), embedding a detail polishing module, to effectively integrate high- and low-level features and suppress noise from low-level details. Additionally, we introduce a depth filter and fusion module to extract informative regions from depth cues under the guidance of RGB images. The proposed TCD network achieves performance comparable to state-of-the-art RGB-D semantic segmentation methods on the benchmark NYUDv2 and SUN RGB-D datasets.
- Published: 2021
15. MRINet: Multilevel Reverse-Context Interactive-Fusion Network for Detecting Salient Objects in RGB-D Images
- Authors: Pan Sijia, Lu Yu, Jingsheng Lei, and Wujie Zhou
- Subjects: Computer science, business.industry, Applied Mathematics, Feature extraction, Multilevel model, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Context (language use), Pattern recognition, Semantics, Salient, Feature (computer vision), Signal Processing, RGB color model, Artificial intelligence, Electrical and Electronic Engineering, business, ComputingMethodologies_COMPUTERGRAPHICS, Block (data storage)
- Abstract
The use of RGB-D information for salient object detection (SOD) is being increasingly explored. Traditional multilevel models handle low- and high-level features similarly, using the same number of features for blending. Unlike these models, in this paper, we propose a multilevel reverse-context interactive-fusion network (MRINet) for RGB-D SOD. Specifically, we first extract and reuse different numbers of features depending on their level: the deeper the features, the more times we perform extraction. Deeper features contain more semantic cues, which are important for locating salient regions. Thereafter, we use an RGB MRI block (MRIB) to merge RGB information at different levels; furthermore, we use depth features as auxiliary information and an RGB-D MRIB to fully merge them with the RGB information. The RGB and RGB-D MRIBs reconstruct the high-level feature map at high resolution and integrate the low-level feature map to enhance boundary details. Extensive experiments demonstrate the effectiveness of the proposed MRINet and its state-of-the-art performance in RGB-D SOD.
- Published: 2021
16. Salient Object Detection in Stereoscopic 3D Images Using a Deep Convolutional Residual Autoencoder
- Authors: Junwei Wu, Lu Yu, Wujie Zhou, Jenq-Neng Hwang, and Jingsheng Lei
- Subjects: Computer science, business.industry, Feature extraction, Supervised learning, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Stereoscopy, 02 engineering and technology, Autoencoder, Object detection, Computer Science Applications, law.invention, law, Feature (computer vision), Signal Processing, Pyramid, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, RGB color model, 020201 artificial intelligence & image processing, Computer vision, Artificial intelligence, Pyramid (image processing), Electrical and Electronic Engineering, business, Encoder
- Abstract
In recent years, the detection of distinctive objects in stereoscopic 3D images has drawn increasing attention. Unlike 2D salient object detection, salient object detection in stereoscopic 3D images is highly challenging. Hence, we propose a novel Deep Convolutional Residual Autoencoder (DCRA) for end-to-end salient object detection in stereoscopic 3D images. The core trainable architecture of the salient object detection model employs raw stereoscopic 3D images as the inputs and their corresponding ground truth saliency masks as the labels. A convolutional residual module is applied to both the encoder and the decoder as a basic building block in the DCRA, and long-range skip connections are employed to bypass the equal-sized feature maps between the encoder and the decoder. To explore the complex relationships and exploit the complementarity between RGB (photometric) and depth (geometric) information, multiple feature map fusion modules are constructed. These modules integrate texture and structure information between the RGB and depth branches of the encoder and fuse their features over several multiscale layers. Finally, to efficiently optimize DCRA parameters, a supervision pyramid based on boundary loss and background prior loss is adopted, which employs supervised learning over the multiscale layers in the decoder to prevent vanishing gradients and accelerate the training at the fusion stage. We compare the proposed DCRA with state-of-the-art methods on two challenging benchmark datasets. The results of these experiments demonstrate that our proposed DCRA performs favorably against the comparison models.
- Published: 2021
17. MFENet: Multitype fusion and enhancement network for detecting salient objects in RGB-T images
- Authors: Junyi Wu, Wujie Zhou, Xiaohong Qian, Jingsheng Lei, Lu Yu, and Ting Luo
- Subjects: Computational Theory and Mathematics, Artificial Intelligence, Applied Mathematics, Signal Processing, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Statistics, Probability and Uncertainty
- Published: 2023
18. CCFNet: Cross-Complementary fusion network for RGB-D scene parsing of clothing images
- Authors: Gao Xu, Wujie Zhou, Xiaohong Qian, Lv Ye, Jingsheng Lei, and Lu Yu
- Subjects: Signal Processing, Media Technology, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering
- Published: 2023
19. Global contextually guided lightweight network for RGB-thermal urban scene understanding
- Authors: Tingting Gong, Wujie Zhou, Xiaohong Qian, Jingsheng Lei, and Lu Yu
- Subjects: Artificial Intelligence, Control and Systems Engineering, Electrical and Electronic Engineering
- Published: 2023
20. Asymmetric Deeply Fused Network for Detecting Salient Objects in RGB-D Images
- Authors: Yuzhen Chen, Chang Liu, Jingsheng Lei, and Wujie Zhou
- Subjects: business.industry, Computer science, Applied Mathematics, Feature extraction, 020206 networking & telecommunications, Pattern recognition, 02 engineering and technology, Salient objects, Object detection, Visualization, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, RGB color model, Artificial intelligence, Electrical and Electronic Engineering, business
- Abstract
Most RGB-D salient object detection (SOD) models use the same network to process RGB images and their corresponding depth maps. Subsequently, these models perform direct concatenation and summation at deep or shallow layers. However, these models ignore the complementarity of multi-level features extracted from RGB images and depth maps. This paper presents an asymmetric deeply fused network (ADFNet) for RGB-D SOD. Two different backbone networks, i.e., ResNet-50 and VGG-16, are utilized to process RGB images and related depth maps. We use an aggregation decoder and adaptive attention transformer module (AATM) to avoid information loss in the decoding process. Additionally, we use an attention early fusion module (AEFM) and deep fusion module (DFM) to deal with the deep features in various complex situations. Experiments validate the effectiveness of the proposed ADFNet, which outperforms thirteen recent RGB-D SOD models in the analysis of five public RGB-D SOD datasets.
- Published: 2020
21. LBENet: Lightweight boundary enhancement network for detecting salient objects in RGB-D images
- Authors: Junwei Wu, Wujie Zhou, Jingsheng Lei, Qiang Li, and Lu Yu
- Subjects: Electrical and Electronic Engineering, Atomic and Molecular Physics, and Optics, Electronic, Optical and Magnetic Materials
- Published: 2022
22. Efficient power component identification with long short-term memory and deep neural network
- Authors: Wenbin Shi, Fengyong Li, Zhichao Lei, and Jingsheng Lei
- Subjects: 0209 industrial biotechnology, Computer science, lcsh:TK7800-8360, Image processing, Context (language use), Convolutional neural network, 02 engineering and technology, Interference (wave propagation), Power component identification, 020901 industrial engineering & automation, Live work inspection, Component (UML), 0202 electrical engineering, electronic engineering, information engineering, Long short-term memory, Electrical and Electronic Engineering, Artificial neural network, business.industry, lcsh:Electronics, Anti-interference, Pattern recognition, Identification (information), Signal Processing, Pattern recognition (psychology), 020201 artificial intelligence & image processing, Artificial intelligence, business, Information Systems
- Abstract
This paper tackles a recent challenge in patrol image processing: how to improve identification accuracy for power components, especially in scenarios containing many interfering objects. Our proposed method makes full use of the patrol image information from live work, and it is thus different from traditional power component identification methods. First, we use long short-term memory networks to synthesize context information in a convolutional neural network. Then, we construct a Mask LSTM-CNN model by combining the existing Mask R-CNN method with this context information. Further, by extracting features specific to power components, we design an optimization algorithm to tune the parameters of the Mask LSTM-CNN model. Our solution is competitive in the sense that power components are still identified accurately even when patrol images contain substantial interference. Extensive experiments show that the proposed scheme improves the accuracy of component recognition and has excellent anti-interference ability. Compared with the existing R-FCN and Faster R-CNN models, the proposed method demonstrates significantly superior detection performance, improving average recognition accuracy by 8 to 11%.
- Published: 2018
23. PGDENet: Progressive Guided Fusion and Depth Enhancement Network for RGB-D Indoor Scene Parsing
- Authors: Wujie Zhou, Enquan Yang, Jingsheng Lei, Jian Wan, and Lu Yu
- Subjects: Signal Processing, Media Technology, Electrical and Electronic Engineering, Computer Science Applications
- Published: 2022
24. Boundary-aware pyramid attention network for detecting salient objects in RGB-D images
- Authors: Ting Luo, Xi Zhou, Wujie Zhou, Lu Yu, Yuzhen Chen, and Jingsheng Lei
- Subjects: Pixel, business.industry, Computer science, Applied Mathematics, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020206 networking & telecommunications, 02 engineering and technology, Object (computer science), Convolutional neural network, Computational Theory and Mathematics, Artificial Intelligence, Robustness (computer science), Salient, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, RGB color model, 020201 artificial intelligence & image processing, Segmentation, Computer vision, Computer Vision and Pattern Recognition, Pyramid (image processing), Artificial intelligence, Electrical and Electronic Engineering, Statistics, Probability and Uncertainty, business
- Abstract
Recent developments in convolutional neural networks (CNNs) have significantly improved the results of salient object detection (SOD), particularly RGB-D SOD. This article proposes BPA-Net (Boundary-aware Pyramid Attention Network), which addresses two key issues in CNN-based RGB-D SOD: 1) accurately locking onto the position of an object when it is unclear whether the scene contains multiple objects or a single object, and 2) depicting fine edges and filling pixels while remaining robust to complex scenes and similarly colored backgrounds. Accordingly, we model three network branches to solve these problems separately. To address the first problem, we devise the Multi-scale Attention Branch, a pyramid attention network that collects the positions of objects, thereby eliminating interference from non-objects. The second is addressed via a Boundary Refine Branch that uses the depth image to capture the edges of objects. This step refines object boundaries and emphasizes the importance of salient edge information. These branches are learned to obtain precise salient boundaries and position estimates and are subsequently combined with a coarse salient map generated by the Coarse Salient Detection Branch, an encoder-decoder SOD network, to improve salient object segmentation. Extensive experiments show that our BPA-Net outperforms state-of-the-art approaches on two popular benchmarks.
- Published: 2021
25. Multi-layer fusion network for blind stereoscopic 3D visual quality prediction
- Authors: Xi Zhou, Lin Xinyang, Jingsheng Lei, Wujie Zhou, Ting Luo, and Lu Yu
- Subjects: Fusion, business.industry, Computer science, media_common.quotation_subject, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020206 networking & telecommunications, Stereoscopy, Pattern recognition, 02 engineering and technology, law.invention, law, Perception, Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Quality (business), Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, business, Binocular vision, Multi layer, Software, media_common
- Abstract
Stereoscopic 3D (S3D) visual quality prediction (VQP) aims to predict human perception of the visual quality of S3D images accurately and automatically. Unlike 2D VQP, quality prediction for S3D images is more difficult owing to complex binocular vision mechanisms. In this study, inspired by the binocular fusion and competition of the binocular visual system (BVS), we designed a blind deep visual quality predictor for S3D images. The proposed predictor is a multi-layer fusion network that fuses different levels of features. The left- and right-view sub-networks use the same structure and parameters, and the weights and qualities of the left- and right-view patches of S3D images can be predicted. Furthermore, training on patches with more saliency information improves the accuracy of the prediction results, which also makes the predictor more robust. The LIVE 3D Phase I and II datasets were used to evaluate the proposed predictor. The results demonstrate that its performance surpasses most existing predictors on both asymmetrically and symmetrically distorted S3D images.
- Published: 2021
26. Attention-based contextual interaction asymmetric network for RGB-D saliency prediction
- Authors: Xinyue Zhang, Wujie Zhou, Jingsheng Lei, and Jin Ting
- Subjects: Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, 020207 software engineering, Pattern recognition, 02 engineering and technology, Salient, Feature (computer vision), Depth map, Complementarity (molecular biology), Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, Media Technology, RGB color model, Contextual information, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, Representation (mathematics), business, Block (data storage)
- Abstract
Saliency prediction on RGB-D images is an underexplored and challenging task in computer vision. We propose a channel-wise attention and contextual interaction asymmetric network for RGB-D saliency prediction. In the proposed network, a common feature extractor provides cross-modal complementarity between the RGB image and corresponding depth map. In addition, we introduce a four-stream feature-interaction module that fully leverages multiscale and cross-modal features for extracting contextual information. Moreover, we propose a channel-wise attention module to highlight the feature representation of salient regions. Finally, we refine coarse maps through a corresponding refinement block. Experimental results show that the proposed network achieves a performance comparable with state-of-the-art saliency prediction methods on two representative datasets.
- Published: 2021
27. Multiscale multilevel context and multimodal fusion for RGB-D salient object detection
- Authors: Junwei Wu, Lu Yu, Ting Luo, Wujie Zhou, and Jingsheng Lei
- Subjects: Multimodal fusion, business.industry, Computer science, Aggregate (data warehouse), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Boundary (topology), 020206 networking & telecommunications, Context (language use), Pattern recognition, 02 engineering and technology, Function (mathematics), Salient object detection, Control and Systems Engineering, Feature (computer vision), Salience (neuroscience), Signal Processing, 0202 electrical engineering, electronic engineering, information engineering, RGB color model, 020201 artificial intelligence & image processing, Saliency map, Computer Vision and Pattern Recognition, Artificial intelligence, Electrical and Electronic Engineering, business, Software
- Abstract
Red–green–blue and depth (RGB-D) saliency detection has recently attracted much research attention; however, the effective use of depth information remains challenging. This paper proposes a method that leverages depth information in clear shapes to detect the boundary of salient objects. As context plays an important role in saliency detection, the method incorporates a proposed end-to-end multiscale multilevel context and multimodal fusion network (MCMFNet) to aggregate multiscale multilevel context feature maps for accurate saliency detection from objects of varying sizes. Finally, a coarse-to-fine approach is applied to an attention module retrieving multilevel and multimodal feature maps to produce the final saliency map. A comprehensive loss function is also incorporated in MCMFNet to optimize the network parameters. Extensive experiments demonstrate the effectiveness of the proposed method and its substantial improvement over state-of-the-art methods for RGB-D salient object detection on four representative datasets.
- Published: 2021
28. Opinion-unaware blind picture quality measurement using deep encoder–decoder architecture
- Authors: Lin Xinyang, Jingsheng Lei, Wujie Zhou, Xi Zhou, Ting Luo, and Lu Yu
- Subjects: Image quality, Computer science, media_common.quotation_subject, 02 engineering and technology, Overfitting, Similarity (network science), Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, Quality (business), Pyramid (image processing), Electrical and Electronic Engineering, media_common, business.industry, Applied Mathematics, Supervised learning, 020206 networking & telecommunications, Pattern recognition, Computational Theory and Mathematics, Signal Processing, Metric (mathematics), 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Artificial intelligence, Statistics, Probability and Uncertainty, business, Scale (map)
- Abstract
Recently, deep-learning-based blind picture quality measurement (BPQM) metrics have gained significant attention. However, training a robust deep BPQM metric remains difficult because of the limited number of subject-rated training samples. State-of-the-art full-reference (FR) picture quality measurement (PQM) metrics agree well with human subjective quality scores; they can therefore be employed to approximate human scores when training BPQM metrics. Inspired by this, we propose a deep encoder–decoder architecture (DEDA) for opinion-unaware (OU) BPQM that does not require human-labeled distorted samples for training. In the training procedure, to avoid overfitting and to ensure the independence of the training and testing samples, we first construct 6,000 distorted pictures and use their objective quality/similarity maps, obtained with a high-performance FR-PQM metric, as training labels. Subsequently, an end-to-end mapping from the distorted pictures to their objective quality/similarity maps (labels) is learned and represented as the DEDA, which takes a distorted picture as input and outputs its predicted quality/similarity map. In the DEDA, a pyramid supervision training strategy applies supervised learning over three scale layers to efficiently optimize the parameters. In the testing procedure, the quality/similarity maps of the testing samples, which help localize distortions, are predicted with the trained DEDA. The predicted quality/similarity maps are then gradually pooled together to obtain the overall objective quality scores. Comparative experiments on three publicly available standard PQM datasets demonstrate that the proposed DEDA metric agrees better with subjective assessment than previous state-of-the-art OU-BPQM metrics.
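The final pooling step described in the abstract, in which predicted quality/similarity maps are gradually pooled into one overall score, might look roughly like the sketch below. Plain multiscale average pooling is an assumption made here for illustration; the paper's exact pooling rule may differ.

```python
def downsample(m):
    """Average-pool a 2-D map by a factor of 2 in each dimension."""
    h, w = len(m) // 2, len(m[0]) // 2
    return [[(m[2 * i][2 * j] + m[2 * i][2 * j + 1]
              + m[2 * i + 1][2 * j] + m[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w)] for i in range(h)]

def pooled_quality_score(quality_map, levels=3):
    """Pool a predicted quality/similarity map into a single scalar score
    by averaging it at several pyramid scales, then averaging the scales."""
    scores, m = [], quality_map
    for _ in range(levels):
        flat = [v for row in m for v in row]
        scores.append(sum(flat) / len(flat))
        if len(m) >= 2 and len(m[0]) >= 2:
            m = downsample(m)
    return sum(scores) / len(scores)
```

A uniform map of value 0.5 pools to a score of 0.5 at every scale, which is a quick sanity check of the pooling chain.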
- Published
- 2020
29. Three-branch architecture for stereoscopic 3D salient object detection
- Author
-
Jingsheng Lei, Xi Zhou, Wujie Zhou, Ting Luo, Pan Sijia, and Lu Yu
- Subjects
Computer science ,Stereoscopy ,02 engineering and technology ,law.invention ,Artificial Intelligence ,Robustness (computer science) ,law ,0202 electrical engineering, electronic engineering, information engineering ,Computer vision ,Electrical and Electronic Engineering ,Architecture ,Artificial neural network ,business.industry ,Applied Mathematics ,020206 networking & telecommunications ,Boundary refinement ,Salient object detection ,Computational Theory and Mathematics ,Signal Processing ,Fuse (electrical) ,RGB color model ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Statistics, Probability and Uncertainty ,business - Abstract
Existing stereoscopic 3D (S3D) salient object detection (SOD) networks typically employ a two-branch architecture in which the RGB and depth channels are learned independently. Conventional methods based on convolutional neural networks generally fuse the two branches by combining their deep representations at a late stage through a single path, which can be inefficient and insufficient for retaining a large amount of cross-modal data. In this study, we combine the RGB branch and the depth branch to generate a third branch. The first branch is the embedded attention branch, in which we introduce an embedded attention module that allocates the available processing resources to the most informative components of the input signal. The second branch is the boundary refinement branch, which combines the low-level information of the RGB and depth images; here we propose a new module, the detail correlation module, to ensure clear object boundaries and refined salient objects. The third branch is the global deep-view branch, which contains a global view module that fuses high-level information and expands the receptive field. We also use three loss functions tailored to this SOD network. Extensive experiments demonstrate the effectiveness and robustness of the proposed architecture and show that it significantly improves on other state-of-the-art SOD approaches.
- Published
- 2020
30. A simple and robust approach to energy disaggregation in the presence of outliers
- Author
-
Jiuyang Tang, Guoming Tang, Kui Wu, and Jingsheng Lei
- Subjects
Data cleansing ,General Computer Science ,Virtual appliance ,Computer science ,020209 energy ,020208 electrical & electronic engineering ,02 engineering and technology ,Energy consumption ,computer.software_genre ,Outlier ,0202 electrical engineering, electronic engineering, information engineering ,Overhead (computing) ,Data mining ,Electrical and Electronic Engineering ,Hidden Markov model ,Cluster analysis ,computer ,Energy (signal processing) - Abstract
Energy disaggregation discovers the energy consumption of individual appliances from their aggregated energy readings. Most existing approaches rely on either appliances' signatures or their state-transition patterns, both of which are hard to obtain in practice. In addition, load data may be corrupted for various reasons. To overcome these problems, this paper utilizes easily accessible knowledge of appliances and the sparsity of switching events to design a Sparse Switching Event Recovering (SSER) method. Furthermore, a robust version of this method (RSSER) is developed to tackle the problems caused by corrupted data and unknown appliances. By minimizing the total variation of the sparse event matrix and introducing a virtual appliance, RSSER obtains accurate energy disaggregation results in the presence of outliers, without any explicit data cleansing. To speed up RSSER, a Parallel Local Optimization Algorithm (PLOA) is proposed that solves the problem over active epochs of appliance activity in parallel. To automatically acquire the power-consumption knowledge of unknown appliances, we develop a K-median-clustering-based power division approach and establish an appliance power configuration platform. Using real-world trace data from our energy monitoring platform, the performance of RSSER is compared with that of state-of-the-art solutions, including least-squares estimation methods and a machine-learning method using an iterative Hidden Markov Model. The results show that RSSER not only performs better overall in both detection accuracy and overhead but also tolerates the interference of corrupted data and unknown appliances.
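The K-median power-division idea in the abstract can be illustrated with a minimal 1-D sketch, not the authors' implementation: the deterministic quantile initialization and the two-appliance test data below are assumptions made for illustration. The key point is that the median, unlike the mean, is robust to corrupted readings.

```python
def k_medians_1d(readings, k, iters=100):
    """Cluster 1-D power readings into k groups, each summarized by its
    median. The median resists outliers, which suits corrupted load data."""
    s = sorted(readings)
    # Deterministic initialization at evenly spaced quantiles.
    centers = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in readings:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        new_centers = []
        for cluster, old in zip(clusters, centers):
            if cluster:
                c = sorted(cluster)
                new_centers.append(c[len(c) // 2])
            else:
                new_centers.append(old)  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return sorted(centers)
```

On readings drawn from a ~60 W appliance and a ~1500 W appliance plus one corrupted spike, the recovered medians stay near the true power levels because the spike cannot drag a median the way it would drag a mean.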
- Published
- 2016
31. Unsupervised Hyperspectral Band Selection by Dominant Set Extraction
- Author
-
Zhongqin Bi, Jingsheng Lei, Yuancheng Huang, Guokang Zhu, and Feifei Xu
- Subjects
business.industry ,Feature extraction ,0211 other engineering and technologies ,Hyperspectral imaging ,Pattern recognition ,02 engineering and technology ,Spectral bands ,Set (abstract data type) ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,General Earth and Planetary Sciences ,Graph (abstract data type) ,020201 artificial intelligence & image processing ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Representation (mathematics) ,Independence (probability theory) ,021101 geological & geomatics engineering ,Mathematics - Abstract
Unsupervised hyperspectral band selection has been an important topic in hyperspectral imagery. This technique aims to select critical, decisive spectral bands from an original image for compact representation, without compromising or distorting the raw information in the relevant spectral bands. Although much effort has been devoted to this topic, structural information has not yet been well exploited during band selection, and several deficiencies in search strategies leave room for further improvement. This paper tackles unsupervised hyperspectral band selection from a global perspective and proposes a novel method with two main contributions: 1) structure-aware measures of band informativeness and independence; and 2) a graph formulation of band selection that allows an efficient integrated search by means of dominant set extraction. Experiments on three real hyperspectral images demonstrate the superiority of the proposed band selector over benchmark methods.
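Dominant set extraction is commonly implemented with replicator dynamics on an affinity graph; the sketch below shows that generic procedure on a toy affinity matrix. The affinity values and thresholds are illustrative assumptions, not the paper's structure-aware measures: a tightly coupled group of "bands" is recovered as the dominant set.

```python
def dominant_set(A, iters=200, tol=1e-8, support_eps=1e-4):
    """Run replicator dynamics x_i <- x_i * (Ax)_i / (x'Ax) on a symmetric,
    non-negative affinity matrix A with zero diagonal. The indices whose
    weights survive approximate the dominant set of the graph."""
    n = len(A)
    x = [1.0 / n] * n  # start from the barycenter of the simplex
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(xi * axi for xi, axi in zip(x, Ax))
        if total <= tol:
            break
        x_new = [xi * axi / total for xi, axi in zip(x, Ax)]
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            x = x_new
            break  # converged
        x = x_new
    return [i for i, xi in enumerate(x) if xi > support_eps]

def toy_affinity():
    """Five nodes: 0-2 strongly mutually similar, 3-4 weakly attached."""
    A = [[0.0] * 5 for _ in range(5)]
    for i in range(3):
        for j in range(3):
            if i != j:
                A[i][j] = 1.0
    A[3][4] = A[4][3] = 0.2
    for i in range(3):
        for j in (3, 4):
            A[i][j] = A[j][i] = 0.1
    return A
```

Replicator dynamics reward nodes with high affinity to the currently weighted set, so mass concentrates on the tight cluster {0, 1, 2} while the weakly connected nodes decay toward zero weight.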
- Published
- 2016
32. Anti-compression JPEG steganography over repetitive compression networks
- Author
-
Fengyong Li, Kui Wu, Jingsheng Lei, and Chuan Qin
- Subjects
Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Data_CODINGANDINFORMATIONTHEORY ,02 engineering and technology ,Compression (functional analysis) ,0202 electrical engineering, electronic engineering, information engineering ,Discrete cosine transform ,Computer vision ,Dither ,Electrical and Electronic Engineering ,Steganography ,business.industry ,Process (computing) ,020206 networking & telecommunications ,computer.file_format ,JPEG ,Transmission (telecommunications) ,Control and Systems Engineering ,Signal Processing ,Bit error rate ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,computer ,Software - Abstract
Existing work on steganography mostly assumes that an image remains unchanged during transmission from the sender to the receiver. This assumption, however, may not hold in the era of the Internet because of unknown compression applied by network service providers, in which case the hidden information cannot be correctly recovered at the receiver. To address this problem, we design a new JPEG steganographic method that can resist repetitive compression during network transmission, without knowing the compression process controlled by the network service providers. Our method uses a simulated repetitive compression network and, based on its feedback, performs adaptive dither adjustment to dynamically modify the DCT coefficients disturbed by the compression process. The original secret messages can be successfully extracted from stego images generated with our method, even after the stego images pass through multiple unknown compression processes during network transmission. Extensive experiments demonstrate that, compared with existing JPEG steganographic methods, our method effectively resists repetitive compression while maintaining a lower bit error rate and strong anti-steganalysis capability.
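The feedback loop in the abstract (embed, simulate recompression, adjust, repeat) can be sketched with a toy one-coefficient model. Parity-of-the-quantized-index embedding and a single fixed quantization step are illustrative assumptions here, not the authors' DCT-domain scheme; the sketch only shows the shape of the adaptive-adjustment idea.

```python
def recompress(coeff, step=8):
    """Toy stand-in for an unknown recompression: re-quantize one value."""
    return round(coeff / step) * step

def embed_bit(coeff, bit, step=8, max_adjust=16):
    """Nudge one coefficient until the desired bit (the parity of its
    quantized index) survives a simulated recompression: a feedback loop."""
    for delta in range(max_adjust + 1):
        for cand in (coeff + delta, coeff - delta):
            idx = round(recompress(cand, step) / step)
            if idx % 2 == bit:
                return cand
    raise ValueError("no robust embedding found within max_adjust")

def extract_bit(coeff, step=8):
    """Recover the bit as the parity of the re-quantized index."""
    return round(recompress(coeff, step) / step) % 2
```

Because `recompress` is idempotent (a multiple of the step re-quantizes to itself), a bit embedded this way also survives repeated recompression, which is the property the paper targets at full JPEG scale.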
- Published
- 2020
33. CO-MAP: Improving Mobile Multiple Access Efficiency With Location Input
- Author
-
Wan Du, Jingsheng Lei, and Mo Li
- Subjects
Mobile identification number ,business.industry ,Computer science ,computer.internet_protocol ,Applied Mathematics ,Mobile broadband ,IMT Advanced ,Mobile computing ,Mobile Web ,Computer Science Applications ,Mobile station ,Location-based service ,Wireless Application Protocol ,Electrical and Electronic Engineering ,business ,computer ,Computer network - Published
- 2014
34. Improving the Uplink Performance of Drive-Thru Internet via Platoon-Based Cooperative Retransmission
- Author
-
Dongyao Jia, Jingsheng Lei, Jianping Wang, Rui Zhang, Kejie Lu, and Zhongqin Bi
- Subjects
Engineering ,Vehicular ad hoc network ,Computer Networks and Communications ,business.industry ,Retransmission ,ComputerSystemsOrganization_COMPUTER-COMMUNICATIONNETWORKS ,Aerospace Engineering ,Throughput ,Traffic flow (computer networking) ,Upload ,Automotive Engineering ,Telecommunications link ,Platoon ,The Internet ,Electrical and Electronic Engineering ,business ,Computer network - Abstract
For many vehicular safety applications, it is critical to deliver multimedia data from a traveling vehicle to a roadside access point (AP) in a timely and reliable manner over an error-prone vehicular ad hoc network (VANET), a typical uplink scenario for drive-thru Internet. To achieve this goal, we propose a cooperative retransmission scheme that exploits a common phenomenon in practice: consecutive vehicles naturally form a platoon to reduce energy consumption. We develop a 4-D Markov chain to model the proposed scheme and analyze the uplink throughput of drive-thru Internet, which also reveals fundamental relationships among traffic flow, platoon parameters, and system throughput. We conduct extensive simulations in OMNeT++ to validate our scheme and the analytical model. Numerical results show that the proposed platoon-based cooperative retransmission scheme significantly improves the uplink throughput of drive-thru Internet and considerably decreases the total number of transmissions for a given quantity of uploaded data, thereby achieving greener mobile multimedia communication.
- Published
- 2014