59 results for "Jingsheng Lei"
Search Results
2. MENet: Lightweight multimodality enhancement network for detecting salient objects in RGB-thermal images
- Author
-
Junyi Wu, Wujie Zhou, Xiaohong Qian, Jingsheng Lei, Lu Yu, and Ting Luo
- Subjects
Artificial Intelligence, Cognitive Neuroscience, Computer Science Applications - Published
- 2023
3. MTANet: Multitask-Aware Network With Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding
- Author
-
Wujie Zhou, Shaohua Dong, Jingsheng Lei, and Lu Yu
- Subjects
Control and Optimization, Artificial Intelligence, Automotive Engineering - Published
- 2023
4. APNet: Adversarial Learning Assistance and Perceived Importance Fusion Network for All-Day RGB-T Salient Object Detection
- Author
-
Wujie Zhou, Yun Zhu, Jingsheng Lei, Jian Wan, and Lu Yu
- Subjects
Computational Mathematics, Control and Optimization, Artificial Intelligence, Computer Science Applications - Published
- 2022
5. Two-stage sequential recommendation for side information fusion and long-term and short-term preferences modeling
- Author
-
Jingsheng Lei, Yuexin Li, Shengying Yang, Wenbin Shi, and Yi Wu
- Subjects
Artificial Intelligence, Computer Networks and Communications, Hardware and Architecture, Software, Information Systems - Published
- 2022
6. HFNet: Hierarchical feedback network with multilevel atrous spatial pyramid pooling for RGB-D saliency detection
- Author
-
Wujie Zhou, Chang Liu, Jingsheng Lei, Lu Yu, and Ting Luo
- Subjects
Artificial Intelligence, Cognitive Neuroscience, Computer Science Applications - Published
- 2022
7. TMFNet: Three-Input Multilevel Fusion Network for Detecting Salient Objects in RGB-D Images
- Author
-
Sijia Pan, Lu Yu, Wujie Zhou, and Jingsheng Lei
- Subjects
Fusion, Control and Optimization, Computer Science, Deep Learning, Pattern Recognition, Computer Science Applications, Computational Mathematics, Artificial Intelligence, Depth Map, RGB Color Model - Abstract
The use of depth information acquired by depth sensors for salient object detection (SOD) is being increasingly explored. Despite the remarkable results of recent deep learning approaches to RGB-D SOD, they fail to fully exploit the original, accurate information in RGB-D images to express the details of salient objects. Here, we propose an RGB-D SOD model using a three-input multilevel fusion network (TMFNet), which differs from existing methods based on two-stream networks. In addition to the RGB input (first input) and the depth input (second input), the RGB image and depth map are combined into a four-channel representation (RGBD input) that constitutes the third input to the TMFNet. The RGBD input generates multilevel features that reflect the details of the RGB-D image. The proposed TMFNet also aggregates diverse region-based contextual information without discarding RGB and depth features. To this end, we introduce a cross-fusion module; benefiting from rich low- and high-level information in the salient features, the fused features improve the localization of salient objects. The proposed TMFNet achieves state-of-the-art performance on six benchmark SOD datasets.
- Published
- 2022
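The TMFNet abstract above describes combining an RGB image and its depth map into a four-channel representation that serves as a third network input. A minimal sketch of that preprocessing step (the function name and array layout are assumptions for illustration, not the authors' code):

```python
import numpy as np

def make_rgbd_input(rgb, depth):
    """Stack an (H, W, 3) RGB image and an (H, W) depth map into the
    four-channel RGBD representation described in the TMFNet abstract,
    used as a third input alongside the separate RGB and depth streams."""
    return np.concatenate([rgb, depth[..., None]], axis=-1)  # (H, W, 4)
```

In a two-stream model only `rgb` and `depth` are fed to the encoders; here the concatenated tensor additionally lets early convolutions see photometric and geometric information jointly.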
8. MFFENet: Multiscale Feature Fusion and Enhancement Network For RGB–Thermal Urban Road Scene Parsing
- Author
-
Jenq-Neng Hwang, Lu Yu, Jingsheng Lei, Wujie Zhou, and Xinyang Lin
- Subjects
Fusion, Parsing, Computer Science, Deep Learning, Computer Science Applications, Robustness, Signal Processing, Shadow, Media Technology, RGB Color Model, Computer Vision, Artificial Intelligence, Electrical and Electronic Engineering, Encoder - Abstract
Compared with traditional handcrafted features, deep learning has greatly improved the performance of scene parsing. However, scene parsing remains challenging under the varied environmental conditions caused by imaging limitations. Thermal imaging cameras have several advantages over visible-spectrum cameras, such as operation in total darkness, robustness to shadow effects, insensitivity to illumination variations, and a strong ability to penetrate smog and haze. These advantages make thermal imaging cameras ideal for parsing semantic objects in both daytime and nighttime scenes. In this paper, we propose a novel multiscale feature fusion and enhancement network (MFFENet) for accurate parsing of RGB-thermal urban road scenes even when the quality of the available RGB data is compromised. The proposed MFFENet consists of two encoders, a feature fusion layer, and a multi-label supervision layer. We concatenate the multiscale features with features that contain global semantic information. Furthermore, we explore the cross-modal fusion of RGB and thermal features at multiple stages, rather than fusing them once at a low or high stage. Then, we propose a spatial attention mechanism module that assigns higher weight to (focuses more on) the foreground area, allowing MFFENet to emphasize foreground objects. Finally, multi-label supervision is introduced to optimize the parameters of the proposed MFFENet. Experimental results confirm that the proposed MFFENet outperforms similar high-performing methods.
- Published
- 2022
9. CCAFNet: Crossflow and Cross-Scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images
- Author
-
Yun Zhu, Wujie Zhou, Jian Wan, Jingsheng Lei, and Lu Yu
- Subjects
Fusion, Computer Science, Pattern Recognition, Computer Science Applications, Convolution, Feature Extraction, Signal Processing, Media Technology, RGB Color Model, Artificial Intelligence, Electrical and Electronic Engineering, Spatial Analysis - Abstract
Owing to the widespread adoption of depth sensors, salient object detection (SOD) supported by depth maps, which provide reliable complementary information, is being increasingly investigated. Existing SOD models mainly exploit the relation between an RGB image and its corresponding depth information across three fusion domains: input RGB-D images, extracted feature maps, and the output salient object. However, these models do not leverage the crossflows between high- and low-level information well. Moreover, the decoder in these models uses conventional convolution, which involves considerable computation. To further improve RGB-D SOD, we propose a crossflow and cross-scale adaptive fusion network (CCAFNet) to detect salient objects in RGB-D images. First, a channel fusion module effectively fuses depth and high-level RGB features, extracting accurate semantic information from the high-level RGB features. Meanwhile, a spatial fusion module combines low-level RGB and depth features with accurate boundaries and subsequently extracts detailed spatial information from the low-level depth features. Finally, a purification loss is proposed to precisely learn the boundaries of salient objects and recover additional details of the objects. The results of comprehensive experiments on seven common RGB-D SOD datasets indicate that the performance of the proposed CCAFNet is comparable to that of state-of-the-art RGB-D SOD models.
- Published
- 2022
10. A Finite-Time Convergent Neural Network for Solving Time-Varying Linear Equations with Inequality Constraints Applied to Redundant Manipulator
- Author
-
Renji Han, Jingsheng Lei, Tanglong Hu, and Ying Kong
- Subjects
Artificial Neural Networks, Computer Networks and Communications, Computer Science, General Neuroscience, Stability, Complex Systems, Computational Intelligence, Slack Variables, Recurrent Neural Networks, Artificial Intelligence, Applied Mathematics, Software, Linear Equations - Abstract
The Zhang neural network (ZNN), a special recurrent neural network, has recently been established as an effective alternative for solving time-varying linear equations with inequality constraints (TLEIC). However, the convergence time of the ZNN model always tends to infinity. In contrast to the ZNN, a finite-time convergent neural network (FCNN) is proposed for the TLEIC problem. By introducing a non-negative slack variable, the initial form of the TLEIC is transformed into a system of time-varying linear equations. The stability and finite-time performance of the FCNN model are then substantiated by theoretical analysis. Simulation results further verify the effectiveness and superiority of the proposed FCNN model compared with the ZNN model for solving the TLEIC problem. Finally, the proposed FCNN model is successfully applied to the trajectory planning of redundant manipulators with joint limitations, thereby illustrating the applicability of the new neural network model.
- Published
- 2021
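The ZNN design that this abstract builds on can be made concrete: define the error e(t) = A(t)x(t) - b(t) and impose decaying dynamics on it, which yields an explicit ODE for x(t). The Euler-discretized solver below is a generic illustration of that idea under stated assumptions, not the authors' FCNN; the slack-variable handling of inequality constraints is omitted, and the sign-power activation shown in the docstring is the kind of choice used to obtain finite-time (rather than exponential) convergence.

```python
import numpy as np

def znn_solve(A, b, dA, db, x0, gamma=10.0, dt=1e-3, T=2.0,
              act=lambda e: e):
    """Euler-discretized Zhang neural network for A(t) x(t) = b(t).

    Enforces d/dt e(t) = -gamma * act(e) with e = A x - b, which gives
    x' = A^{-1} (b' - A' x - gamma * act(A x - b)).  The identity
    activation yields exponential convergence; a sign-power activation
    such as  lambda e: np.sign(e) * np.abs(e) ** 0.5  is the kind of
    choice associated with finite-time convergence (FCNN-style).
    A, b, dA, db are callables of t; dA and db are time derivatives.
    """
    x, t = np.asarray(x0, dtype=float).copy(), 0.0
    while t < T:
        e = A(t) @ x - b(t)
        # Feedforward terms (db, dA) track the moving solution exactly;
        # the -gamma*act(e) term drives the residual to zero.
        xdot = np.linalg.solve(A(t), db(t) - dA(t) @ x - gamma * act(e))
        x += dt * xdot
        t += dt
    return x
```

With the feedforward derivative terms included, the residual obeys the imposed decay law exactly in continuous time; the Euler step only adds an O(dt) discretization error.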
11. TSNet: Three-Stream Self-Attention Network for RGB-D Indoor Semantic Segmentation
- Author
-
Jingsheng Lei, Wujie Zhou, Ting Luo, and Jianzhong Yuan
- Subjects
Computer Networks and Communications, Computer Science, Feature Extraction, Bilinear Interpolation, Image Segmentation, Upsampling, Artificial Intelligence, RGB Color Model, Computer Vision, Segmentation, Encoder - Abstract
This article proposes a three-stream self-attention network (TSNet) for indoor semantic segmentation comprising two asymmetric input streams (an asymmetric encoder structure) and a cross-modal distillation stream with a self-attention module. The two asymmetric input streams are ResNet34 for the red-green-blue (RGB) stream and VGGNet16 for the depth stream. Accompanying the RGB and depth streams, the cross-modal distillation stream with a self-attention module extracts new RGB-plus-depth features at each level in the bottom-up path. In addition, while using bilinear upsampling to recover the spatial resolution of the feature map, we incorporate the feature information of both the RGB and depth flows through the self-attention module. We evaluated the TSNet on the NYU Depth V2 dataset and achieved results comparable to those of current state-of-the-art methods.
- Published
- 2021
12. THCANet: Two-layer hop cascaded asymptotic network for robot-driving road-scene semantic segmentation in RGB-D images
- Author
-
Gao Xu, Wujie Zhou, Xiaohong Qian, Yulai Zhang, Jingsheng Lei, and Lu Yu
- Subjects
Computational Theory and Mathematics, Artificial Intelligence, Applied Mathematics, Signal Processing, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Statistics, Probability and Uncertainty - Published
- 2023
13. Global and Local-Contrast Guides Content-Aware Fusion for RGB-D Saliency Prediction
- Author
-
Ying Lv, Jingsheng Lei, Lu Yu, and Wujie Zhou
- Subjects
Fusion, Computer Science, Deep Learning, Feature Extraction, Pattern Recognition, Computer Science Applications, Visualization, Human-Computer Interaction, Upsampling, Control and Systems Engineering, RGB Color Model, Electrical and Electronic Engineering, Image Resolution, Scaling, Software - Abstract
Many RGB-D visual attention models with diverse fusion schemes have been proposed; thus, the main challenge lies in the differences among the results of these models. To address this challenge, we propose a local-global fusion model for fixation prediction on RGB-D images; this method combines global and local information through a content-aware fusion module (CAFM). First, it comprises a channel-based upsampling block that exploits global contextual information and scales it up to the same resolution as the input. Second, our Deconv block contains a contrast feature module that utilizes multilevel local features stage-by-stage for superior local feature representation. The experimental results demonstrate that the proposed model exhibits competitive performance on two databases.
- Published
- 2021
14. Two-Stage Cascaded Decoder for Semantic Segmentation of RGB-D Images
- Author
-
Yuchun Yue, Wujie Zhou, Lu Yu, and Jingsheng Lei
- Subjects
Computer Science, Applied Mathematics, Feature Extraction, Pattern Recognition, Image Segmentation, Signal Processing, RGB Color Model, Segmentation, Artificial Intelligence, Electrical and Electronic Engineering, Decoding Methods - Abstract
Exploiting RGB and depth information can boost the performance of semantic segmentation. However, owing to the differences between RGB images and the corresponding depth maps, such multimodal information should be effectively used and combined. Most existing methods use the same fusion strategy to explore multilevel complementary information at various levels, likely ignoring different feature contributions at various levels for segmentation. To address this problem, we propose a network using a two-stage cascaded decoder (TCD), embedding a detail polishing module, to effectively integrate high- and low-level features and suppress noise from low-level details. Additionally, we introduce a depth filter and fusion module to extract informative regions from depth cues with the guidance of RGB images. The proposed TCD network achieves comparable performance to state-of-the-art RGB-D semantic segmentation methods on the benchmark NYUDv2 and SUN RGB-D datasets.
- Published
- 2021
15. Visual Saliency Prediction Using Attention-based Cross-modal Integration Network in RGB-D Images
- Author
-
Ting Jin, Mingjie Han, Zhichao Cao, Jingsheng Lei, and Xinyue Zhang
- Subjects
Computational Theory and Mathematics, Artificial Intelligence, Computer Science, Computer Vision, Software, Theoretical Computer Science, Visual Saliency - Published
- 2021
16. MRINet: Multilevel Reverse-Context Interactive-Fusion Network for Detecting Salient Objects in RGB-D Images
- Author
-
Sijia Pan, Lu Yu, Jingsheng Lei, and Wujie Zhou
- Subjects
Computer Science, Applied Mathematics, Feature Extraction, Multilevel Models, Pattern Recognition, Semantics, Signal Processing, RGB Color Model, Artificial Intelligence, Electrical and Electronic Engineering - Abstract
The use of RGB-D information for salient object detection (SOD) is being increasingly explored. Traditional multilevel models handle low- and high-level features similarly, using the same number of features for blending. Unlike these models, in this paper, we propose a multilevel reverse-context interactive-fusion network (MRINet) for RGB-D SOD. Specifically, we first extract and reuse different numbers of features depending on their level: the deeper the level, the more times we perform extraction, because deeper information contains more semantic cues, which are important for locating salient regions. Thereafter, we use an RGB MRI block (MRIB) to merge RGB information at different levels; furthermore, we use depth features as auxiliary information and an RGB-D MRIB for full merging with the RGB information. The RGB and RGB-D MRIBs reconstruct the high-level feature map at high resolution and integrate the low-level feature map to enhance boundary details. Extensive experiments demonstrate the effectiveness of the proposed MRINet and its state-of-the-art performance in RGB-D SOD.
- Published
- 2021
17. Salient Object Detection in Stereoscopic 3D Images Using a Deep Convolutional Residual Autoencoder
- Author
-
Junwei Wu, Lu Yu, Wujie Zhou, Jenq-Neng Hwang, and Jingsheng Lei
- Subjects
Computer Science, Feature Extraction, Supervised Learning, Stereoscopy, Autoencoder, Object Detection, Computer Science Applications, Signal Processing, Media Technology, RGB Color Model, Computer Vision, Artificial Intelligence, Electrical and Electronic Engineering, Encoder - Abstract
In recent years, the detection of distinctive objects in stereoscopic 3D images has drawn increasing attention. Unlike 2D salient object detection, salient object detection in stereoscopic 3D images is highly challenging. Hence, we propose a novel Deep Convolutional Residual Autoencoder (DCRA) for end-to-end salient object detection in stereoscopic 3D images. The core trainable architecture of the salient object detection model employs raw stereoscopic 3D images as the inputs and their corresponding ground truth saliency masks as the labels. A convolutional residual module is applied to both the encoder and the decoder as a basic building block in the DCRA, and long-range skip connections are employed to bypass the equal-sized feature maps between the encoder and the decoder. To explore the complex relationships and exploit the complementarity between RGB (photometric) and depth (geometric) information, multiple feature map fusion modules are constructed. These modules integrate texture and structure information between the RGB and depth branches of the encoder and fuse their features over several multiscale layers. Finally, to efficiently optimize DCRA parameters, a supervision pyramid based on boundary loss and background prior loss is adopted, which employs supervised learning over the multiscale layers in the decoder to prevent vanishing gradients and accelerate the training at the fusion stage. We compare the proposed DCRA with state-of-the-art methods on two challenging benchmark datasets. The results of these experiments demonstrate that our proposed DCRA performs favorably against the comparison models.
- Published
- 2021
18. MFENet: Multitype fusion and enhancement network for detecting salient objects in RGB-T images
- Author
-
Junyi Wu, Wujie Zhou, Xiaohong Qian, Jingsheng Lei, Lu Yu, and Ting Luo
- Subjects
Computational Theory and Mathematics, Artificial Intelligence, Applied Mathematics, Signal Processing, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Statistics, Probability and Uncertainty - Published
- 2023
19. Global contextually guided lightweight network for RGB-thermal urban scene understanding
- Author
-
Tingting Gong, Wujie Zhou, Xiaohong Qian, Jingsheng Lei, and Lu Yu
- Subjects
Artificial Intelligence, Control and Systems Engineering, Electrical and Electronic Engineering - Published
- 2023
20. Asymmetric Deeply Fused Network for Detecting Salient Objects in RGB-D Images
- Author
-
Yuzhen Chen, Chang Liu, Jingsheng Lei, and Wujie Zhou
- Subjects
Computer Science, Applied Mathematics, Feature Extraction, Pattern Recognition, Salient Objects, Object Detection, Visualization, Signal Processing, RGB Color Model, Artificial Intelligence, Electrical and Electronic Engineering - Abstract
Most RGB-D salient object detection (SOD) models use the same network to process RGB images and their corresponding depth maps. Subsequently, these models perform direct concatenation and summation at deep or shallow layers. However, they ignore the complementarity of the multilevel features extracted from RGB images and depth maps. This paper presents an asymmetric deeply fused network (ADFNet) for RGB-D SOD. Two different backbone networks, i.e., ResNet-50 and VGG-16, are utilized to process RGB images and the related depth maps. We use an aggregation decoder and an adaptive attention transformer module (AATM) to avoid information loss in the decoding process. Additionally, we use an attention early fusion module (AEFM) and a deep fusion module (DFM) to handle the deep features in various complex situations. Experiments validate the effectiveness of the proposed ADFNet, which outperforms thirteen recent RGB-D SOD models on five public RGB-D SOD datasets.
- Published
- 2020
21. Blind Binocular Visual Quality Predictor Using Deep Fusion Network
- Author
-
Ting Luo, Lu Yu, Qiuping Jiang, Wujie Zhou, and Jingsheng Lei
- Subjects
Monocular, Computer Science, Deep Learning, Feature Extraction, Pattern Recognition, Convolutional Neural Networks, Computer Science Applications, Visualization, Computational Mathematics, Signal Processing, Artificial Intelligence - Abstract
Blind binocular visual quality prediction (BVQP) is more challenging than blind monocular visual quality prediction (MVQP). Recently, the application of convolutional neural networks (CNNs) to blind MVQP has resulted in significant progress in that area. In contrast, the adoption of deep learning for blind BVQP has received scant attention. In this study, we devised an end-to-end deep fusion network (DFNet) model trained in a unified framework for blind BVQP. This core prediction engine comprises monocular feature encoding networks and binocular feature fusion networks, followed by a quality prediction layer. The monocular feature encoding networks are first established to capture the low- and high-level monocular features of the left and right retinal views, respectively. Subsequently, these monocular features are integrated by the binocular feature fusion networks to obtain binocular deep features. Finally, the final binocular visual quality is predicted by quality prediction networks. Comparisons via experiments using two standard subject-rated BVQP datasets indicate that the proposed DFNet architecture achieves highly consistent alignment with human assessment and outperforms most relevant existing models.
- Published
- 2020
22. GMNet: Graded-Feature Multilabel-Learning Network for RGB-Thermal Urban Scene Semantic Segmentation
- Author
-
Wujie Zhou, Jinfu Liu, Jingsheng Lei, Lu Yu, and Jenq-Neng Hwang
- Subjects
Robotic Sensing, Computer Science, Feature Extraction, Pattern Recognition, Image Segmentation, Semantics, Computer Graphics and Computer-Aided Design, RGB Color Model, Segmentation, Artificial Intelligence, Software - Abstract
Semantic segmentation is a fundamental task in computer vision, and it has various applications in fields such as robotic sensing, video surveillance, and autonomous driving. A major research topic in urban road semantic segmentation is the proper integration and use of cross-modal information for fusion. Here, we attempt to leverage inherent multimodal information and acquire graded features to develop a novel multilabel-learning network for RGB-thermal urban scene semantic segmentation. Specifically, we propose a strategy for graded-feature extraction to split multilevel features into junior, intermediate, and senior levels. Then, we integrate RGB and thermal modalities with two distinct fusion modules, namely a shallow feature fusion module and deep feature fusion module for junior and senior features. Finally, we use multilabel supervision to optimize the network in terms of semantic, binary, and boundary characteristics. Experimental results confirm that the proposed architecture, the graded-feature multilabel-learning network, outperforms state-of-the-art methods for urban scene semantic segmentation, and it can be generalized to depth data.
- Published
- 2021
23. Efficient power component identification with long short-term memory and deep neural network
- Author
-
Wenbin Shi, Fengyong Li, Zhichao Lei, and Jingsheng Lei
- Subjects
Computer Science, Image Processing, Convolutional Neural Networks, Power Component Identification, Live Work Inspection, Long Short-Term Memory, Electrical and Electronic Engineering, Artificial Neural Networks, Anti-Interference, Pattern Recognition, Signal Processing, Artificial Intelligence, Information Systems - Abstract
This paper tackles a recent challenge in patrol image processing: how to improve identification accuracy for power components, especially in scenarios containing many interfering objects. Our proposed method makes full use of patrol image information from live work and thus differs from traditional power component identification methods. First, we use long short-term memory (LSTM) networks to synthesize context information in a convolutional neural network. Then, we construct the Mask LSTM-CNN model by combining the existing Mask R-CNN method with this context information. Further, by extracting features specific to power components, we design an optimization algorithm to tune the parameters of the Mask LSTM-CNN model. Our solution is competitive in the sense that power components are still identified accurately even when patrol images contain substantial interference. Extensive experiments show that the proposed scheme improves the accuracy of component recognition and has excellent anti-interference ability. Compared with the existing R-FCN and Faster R-CNN models, the proposed method demonstrates significantly superior detection performance, and the average recognition accuracy is improved by 8-11%.
- Published
- 2018
24. Attention-based fusion network for human eye-fixation prediction in 3D images
- Author
-
Wujie Zhou, Jingsheng Lei, Ying Lv, Lv Ye, and Ting Luo
- Subjects
Machine Vision, Computer Science, Feature Extraction, Ocular Fixation, Convolutional Neural Networks, Optics, Three-Dimensional Imaging, Salience, Saliency Maps, Segmentation, Attention, Artificial Neural Networks, Pattern Recognition, RGB Color Model, Artificial Intelligence, Algorithms - Abstract
Human eye-fixation prediction in 3D images is important for many 3D applications, such as fine-grained 3D video object segmentation and intelligent bulletproof curtains. While the vast majority of existing 2D-based approaches cannot be applied, the main challenge lies in the inconsistency, or even conflict, between the RGB and depth saliency maps. In this paper, we propose a three-stream architecture to accurately predict human visual attention on 3D images end-to-end. First, a two-stream feature extraction network based on advanced convolutional neural networks is trained for RGB and depth, and hierarchical information is extracted from each ResNet-18. Then, these multi-level features are fed into the channel attention mechanism to suppress the feature space inconsistency and make the network focus on a significant target. The enhanced saliency map is fused step-by-step by VGG-16 to generate the final coarse saliency map. Finally, each coarse map is refined empirically through refinement blocks, and the network's own identification errors are corrected based on the acquired knowledge, thus converting the prediction saliency map from coarse to fine. The results of comparison of our model with six other state-of-the-art approaches on the NUS dataset (CC of 0.5579, KLDiv of 1.0903, AUC of 0.8339, and NSS of 2.3373) and the NCTU dataset (CC of 0.8614, KLDiv of 0.2681, AUC of 0.9143, and NSS of 2.3795) indicate that the proposed model consistently outperforms them by a considerable margin as it fully employs the channel attention mechanism.
- Published
- 2019
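The channel attention mechanism this abstract relies on (suppressing feature-space inconsistency by reweighting channels) can be illustrated with a generic squeeze-and-excitation-style sketch. The shapes and names below are assumptions for illustration; the authors' module may differ in detail.

```python
import numpy as np

def channel_attention(feat, W1, W2):
    """SE-style channel attention over a (C, H, W) feature map.

    Squeeze: global average pool each channel to a scalar, giving a
    C-vector.  Excite: a two-layer bottleneck (W1: C -> C/r,
    W2: C/r -> C) with ReLU then sigmoid produces per-channel weights
    in (0, 1).  Scale: each channel is multiplied by its weight, so
    informative channels are emphasized and inconsistent ones damped.
    """
    c = feat.shape[0]
    squeeze = feat.reshape(c, -1).mean(axis=1)          # (C,)
    hidden = np.maximum(W1 @ squeeze, 0.0)              # ReLU
    weights = 1.0 / (1.0 + np.exp(-(W2 @ hidden)))      # sigmoid, (C,)
    return feat * weights[:, None, None]
```

In a trained network W1 and W2 are learned; the sigmoid keeps every weight strictly between 0 and 1, so attention can only attenuate channels relative to the input, never amplify them.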
25. Kernelized random KISS metric learning for person re-identification
- Author
-
Jingsheng Lei, Yipeng Chen, Xuekuan Wang, Wai Keung Wong, Cairong Zhao, and Duoqian Miao
- Subjects
Covariance Matrix, Cognitive Neuroscience, Gaussian Distribution, Pattern Recognition, Covariance, Computer Science Applications, KISS Principle, Artificial Intelligence, Metric Learning, Mathematics - Abstract
Person re-identification is critical for human tracking in video surveillance and has attracted increasing attention in recent years. Various recent approaches have greatly improved re-identification performance using metric learning techniques; among them, the Keep It Simple and Straightforward (KISS) metric learning method is notable for its simplicity and high efficiency. The KISS method is based on the assumption that the differences between feature pairs obey a Gaussian distribution. However, for most existing person re-identification features, the distributions of differences between feature pairs are irregular and undulant. Therefore, prior to the Gaussian-based metric learning step, it is important to make the data distribution more Gaussian without losing discriminability. Moreover, most metric learning methods, including KISS, are strongly affected by the small sample size (SSS) problem, which renders the covariance matrices non-invertible. To solve these two problems, we present the Kernelized Random KISS (KRKISS) metric learning method. By transforming the original features into kernelized features, the differences between feature pairs better fit the Gaussian distribution, making them more suitable for models based on the Gaussian assumption. To solve the covariance matrix inversion problem, we apply a random subspace ensemble method that obtains an exact estimate of the covariance matrix by randomly selecting and combining several different subspaces; in each subspace, the influence of the SSS problem is minimized. Experimental results on three challenging person re-identification datasets demonstrate that KRKISS significantly improves on the KISS method and achieves better performance than most existing metric learning approaches.
- Published
- 2018
26. Boundary-aware pyramid attention network for detecting salient objects in RGB-D images
- Author
-
Ting Luo, Xi Zhou, Wujie Zhou, Lu Yu, Yuzhen Chen, and Jingsheng Lei
- Subjects
Pixel ,business.industry ,Computer science ,Applied Mathematics ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020206 networking & telecommunications ,02 engineering and technology ,Object (computer science) ,Convolutional neural network ,Computational Theory and Mathematics ,Artificial Intelligence ,Robustness (computer science) ,Salient ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,RGB color model ,020201 artificial intelligence & image processing ,Segmentation ,Computer vision ,Computer Vision and Pattern Recognition ,Pyramid (image processing) ,Artificial intelligence ,Electrical and Electronic Engineering ,Statistics, Probability and Uncertainty ,business - Abstract
Recent developments in convolutional neural networks (CNNs) have significantly improved the results of salient object detection (SOD), particularly RGB-D SOD. This article proposes BPA-Net (Boundary-aware Pyramid Attention Network), which addresses two key issues in CNN-based RGB-D SOD: 1) accurately locating an object's position when it is unclear whether the scene contains a single object or multiple objects, and 2) depicting fine edges and filling pixels while remaining robust to complex scenes and similarly colored backgrounds. Accordingly, we model three network branches to solve these problems separately. To address the first problem, we devise the Multi-scale Attention Branch, a pyramid attention network that collects the positions of objects, thereby eliminating interference from non-objects. The second is addressed via a Boundary Refine Branch that uses the depth image to capture the edges of objects; this step refines object boundaries and emphasizes the importance of salient edge information. These branches are learned to obtain precise salient boundaries and position estimates, and are subsequently combined with a coarse saliency map generated by the Coarse Salient Detection Branch, an encoder–decoder SOD network, to improve salient object segmentation. Extensive experiments show that BPA-Net outperforms state-of-the-art approaches on two popular benchmarks.
- Published
- 2021
27. Unsupervised steganalysis over social networks based on multi-reference sub-image sets
- Author
-
Kui Wu, Jingsheng Lei, Yanli Ren, Fengyong Li, and Mi Wen
- Subjects
Scheme (programming language) ,Computer Networks and Communications ,Computer science ,Calibration (statistics) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,0211 other engineering and technologies ,02 engineering and technology ,computer.software_genre ,Image (mathematics) ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,computer.programming_language ,Steganalysis ,021110 strategic, defence & security studies ,Steganography ,business.industry ,Pattern recognition ,computer.file_format ,JPEG ,Hardware and Architecture ,020201 artificial intelligence & image processing ,Artificial intelligence ,Data mining ,business ,computer ,Software - Abstract
This work proposes a new unsupervised steganalysis scheme that tackles the challenge of identifying an individual JPEG image as stego or cover. The proposed scheme does not need a large number of samples to train a classification model, and is thus significantly different from existing supervised steganalysis schemes. It employs calibration technology to construct multiple reference images from one suspicious image; these reference images are considered imitations of the cover. Furthermore, randomized sampling is performed to construct sub-image sets from the suspicious image and the reference images, respectively. By calculating the maximum mean discrepancy between any two sub-image sets, an efficient measure is provided for making the optimal decision on the suspicious image. Experimental results show that the proposed scheme is effective and efficient in identifying individual images, and outperforms state-of-the-art steganalysis schemes.
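The decision measure described above can be illustrated with a small sketch of the (biased) squared maximum mean discrepancy between two feature sets. The calibration and sub-image sampling steps are not shown, and the RBF kernel with parameter `gamma` is an assumption for illustration:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian RBF kernel matrix between row-vector sets A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=0.5):
    """Squared (biased) maximum mean discrepancy between samples X and Y.
    A small value suggests the two sub-image sets come from one distribution
    (cover-like); a large value flags the suspicious image as a stego candidate."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean())
```

Two samples drawn from the same distribution yield a near-zero MMD, while a distribution shift (as steganographic embedding would induce in the feature space) yields a clearly larger value.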
- Published
- 2017
28. Short-Term Power Load Forecasting for Larger Consumer Based on TensorFlow Deep Learning Framework and Clustering-Regression Model
- Author
-
Hao Jiawei, Ding Yang, Deng Menghua, Rui Wang, Gu Haoliang, Jingsheng Lei, Liu Yihua, and Zewen Huang
- Subjects
business.industry ,Computer science ,020209 energy ,Deep learning ,Big data ,Regression analysis ,02 engineering and technology ,computer.software_genre ,Term (time) ,Data set ,Robustness (computer science) ,0202 electrical engineering, electronic engineering, information engineering ,Data mining ,Artificial intelligence ,Electric power ,business ,Cluster analysis ,computer - Abstract
This paper tackles a new challenge in high-precision load forecasting for large consumers against the background of electric power big data. The proposed short-term power load forecasting method is based on the TensorFlow deep learning framework and a clustering-regression model. The scheme first clusters users with different electrical attributes and then obtains the load curve of each cluster, which characterizes the regional total load and represents the features of various types of consumers. The clustering-regression model is then used to forecast the power load of a given region, implemented with the TensorFlow deep learning framework. Extensive experiments show that the proposed scheme can reasonably predict the short-term power load and has excellent robustness. Compared with traditional models, the proposed method handles large-scale datasets more efficiently and can be effectively applied to power load forecasting.
- Published
- 2018
29. Multi-layer fusion network for blind stereoscopic 3D visual quality prediction
- Author
-
Xi Zhou, Lin Xinyang, Jingsheng Lei, Wujie Zhou, Ting Luo, and Lu Yu
- Subjects
Fusion ,business.industry ,Computer science ,media_common.quotation_subject ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020206 networking & telecommunications ,Stereoscopy ,Pattern recognition ,02 engineering and technology ,law.invention ,law ,Perception ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Quality (business) ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Binocular vision ,Multi layer ,Software ,media_common - Abstract
Stereoscopic 3D (S3D) visual quality prediction (VQP) aims to predict human perception of the visual quality of S3D images accurately and automatically. Unlike 2D VQP, quality prediction for S3D images is more difficult owing to complex binocular vision mechanisms. In this study, inspired by the binocular fusion and competition of the binocular visual system (BVS), we designed a blind deep visual quality predictor for S3D images. The proposed predictor is a multi-layer fusion network that fuses different levels of features. The left- and right-view sub-networks use the same structure and parameters, and the weights and qualities of the left- and right-view patches of S3D images can be predicted. Furthermore, training patches with more saliency information improves the accuracy of the prediction results, which also makes the predictor more robust. The LIVE 3D Phase I and II datasets were used to evaluate the proposed predictor. The results demonstrate that its performance surpasses most existing predictors on both asymmetrically and symmetrically distorted S3D images.
- Published
- 2021
30. Attention-based contextual interaction asymmetric network for RGB-D saliency prediction
- Author
-
Xinyue Zhang, Wujie Zhou, Jingsheng Lei, and Jin Ting
- Subjects
Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,020207 software engineering ,Pattern recognition ,02 engineering and technology ,Salient ,Feature (computer vision) ,Depth map ,Complementarity (molecular biology) ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,RGB color model ,Contextual information ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Electrical and Electronic Engineering ,Representation (mathematics) ,business ,Block (data storage) - Abstract
Saliency prediction on RGB-D images is an underexplored and challenging task in computer vision. We propose a channel-wise attention and contextual interaction asymmetric network for RGB-D saliency prediction. In the proposed network, a common feature extractor provides cross-modal complementarity between the RGB image and corresponding depth map. In addition, we introduce a four-stream feature-interaction module that fully leverages multiscale and cross-modal features for extracting contextual information. Moreover, we propose a channel-wise attention module to highlight the feature representation of salient regions. Finally, we refine coarse maps through a corresponding refinement block. Experimental results show that the proposed network achieves a performance comparable with state-of-the-art saliency prediction methods on two representative datasets.
- Published
- 2021
31. Deep Binocular Fixation Prediction using a Hierarchical Multimodal Fusion Network
- Author
-
Ting Luo, Wenyu Liu, Wujie Zhou, Jingsheng Lei, and Lu Yu
- Subjects
Computer science ,business.industry ,Feature vector ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,02 engineering and technology ,Fixation (psychology) ,03 medical and health sciences ,0302 clinical medicine ,Image texture ,Artificial Intelligence ,Feature (computer vision) ,0202 electrical engineering, electronic engineering, information engineering ,RGB color model ,020201 artificial intelligence & image processing ,Artificial intelligence ,Pyramid (image processing) ,business ,Spatial analysis ,030217 neurology & neurosurgery ,Software ,Block (data storage) - Abstract
RGB-D data are increasingly being used for myriad computer vision tasks. For such tasks, most methods simply concatenate or add feature vectors from RGB images and depth maps and allow the two modalities to complement each other mutually. However, such a fusion strategy results in inefficient and inadequate performance. In this study, we propose deep binocular fixation prediction based on a hierarchical multimodal fusion network that suitably combines RGB and depth maps. In the proposed method, a novel convolutional block attention module completely extracts image texture features and retains spatial information. In addition, a pyramid dilated-convolution module refines feature information, further improving the fusion of RGB and depth maps. Experimental results indicate that the proposed network achieves state-of-the-art performance on the NUS and NCTU datasets.
- Published
- 2021
32. Multiscale multilevel context and multimodal fusion for RGB-D salient object detection
- Author
-
Junwei Wu, Lu Yu, Ting Luo, Wujie Zhou, and Jingsheng Lei
- Subjects
Multimodal fusion ,business.industry ,Computer science ,Aggregate (data warehouse) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Boundary (topology) ,020206 networking & telecommunications ,Context (language use) ,Pattern recognition ,02 engineering and technology ,Function (mathematics) ,Salient object detection ,Control and Systems Engineering ,Feature (computer vision) ,Salience (neuroscience) ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,RGB color model ,020201 artificial intelligence & image processing ,Saliency map ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Software - Abstract
Red–green–blue and depth (RGB-D) saliency detection has recently attracted much research attention; however, the effective use of depth information remains challenging. This paper proposes a method that leverages depth information in clear shapes to detect the boundary of salient objects. As context plays an important role in saliency detection, the method incorporates a proposed end-to-end multiscale multilevel context and multimodal fusion network (MCMFNet) to aggregate multiscale multilevel context feature maps for accurate saliency detection from objects of varying sizes. Finally, a coarse-to-fine approach is applied to an attention module retrieving multilevel and multimodal feature maps to produce the final saliency map. A comprehensive loss function is also incorporated in MCMFNet to optimize the network parameters. Extensive experiments demonstrate the effectiveness of the proposed method and its substantial improvement over state-of-the-art methods for RGB-D salient object detection on four representative datasets.
- Published
- 2021
33. Opinion-unaware blind picture quality measurement using deep encoder–decoder architecture
- Author
-
Lin Xinyang, Jingsheng Lei, Wujie Zhou, Xi Zhou, Ting Luo, and Lu Yu
- Subjects
Image quality ,Computer science ,media_common.quotation_subject ,02 engineering and technology ,Overfitting ,Similarity (network science) ,Artificial Intelligence ,0202 electrical engineering, electronic engineering, information engineering ,Quality (business) ,Pyramid (image processing) ,Electrical and Electronic Engineering ,media_common ,business.industry ,Applied Mathematics ,Supervised learning ,020206 networking & telecommunications ,Pattern recognition ,Computational Theory and Mathematics ,Signal Processing ,Metric (mathematics) ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Statistics, Probability and Uncertainty ,business ,Scale (map) - Abstract
Recently, deep-learning-based blind picture quality measurement (BPQM) metrics have gained significant attention. However, training a robust deep BPQM metric remains a difficult and challenging task because of the limited number of subject-rated training samples. State-of-the-art full-reference (FR) picture quality measurement (PQM) metrics are in good agreement with human subjective quality scores; therefore, they can be employed to approximate human subjective quality scores when training BPQM metrics. Inspired by this, we propose a deep encoder–decoder architecture (DEDA) for opinion-unaware (OU) BPQM that does not require human-labeled distorted samples for training. In the training procedure, to avoid overfitting and to ensure the independence of the training and testing samples, we first construct 6,000 distorted pictures and use their objective quality/similarity maps, obtained with a high-performance FR-PQM metric, as training labels. Subsequently, an end-to-end mapping between the distorted pictures and their objective quality/similarity maps (labels) is learned, represented as the DEDA, which takes a distorted picture as input and outputs its predicted quality/similarity map. In the DEDA, a pyramid supervision training strategy is used, applying supervised learning over three scale layers to efficiently optimize the parameters. In the testing procedure, the quality/similarity maps of the testing samples, which help localize distortions, are predicted with the trained DEDA. The predicted quality/similarity maps are then gradually pooled together to obtain the overall objective quality scores. Comparative experiments on three publicly available standard PQM datasets demonstrate that our proposed DEDA metric is in better agreement with subjective assessment than previous state-of-the-art OU-BPQM metrics.
- Published
- 2020
34. Three-branch architecture for stereoscopic 3D salient object detection
- Author
-
Jingsheng Lei, Xi Zhou, Wujie Zhou, Ting Luo, Pan Sijia, and Lu Yu
- Subjects
Computer science ,Stereoscopy ,02 engineering and technology ,law.invention ,Artificial Intelligence ,Robustness (computer science) ,law ,0202 electrical engineering, electronic engineering, information engineering ,Computer vision ,Electrical and Electronic Engineering ,Architecture ,Artificial neural network ,business.industry ,Applied Mathematics ,020206 networking & telecommunications ,Boundary refinement ,Salient object detection ,Computational Theory and Mathematics ,Signal Processing ,Fuse (electrical) ,RGB color model ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Statistics, Probability and Uncertainty ,business - Abstract
Existing stereoscopic 3D (S3D) salient object detection (SOD) networks typically employ a two-branch architecture in which the RGB and depth channels are learned independently. Conventional methods based on convolutional neural networks generally fuse the two branches by combining their deep representations at a later stage through only one path, which can be inefficient and insufficient for retaining a large amount of cross-modal data. In this study, we combine the RGB branch and depth branch to generate a third branch. The first branch is the embedded attention branch containing the attention mechanism; we introduce the embedded attention module in this branch to allocate available processing resources to the most informative components of an input signal. The second branch is the boundary refinement branch, combined with the low-level information of the RGB and depth images. Additionally, we propose a new module, called the detail correlation module, to ensure clear object boundaries and salient object refinement. The third branch is the global deep-view branch containing the global view module, which fuses high-level information and expands the receptive field. We also use three different loss functions to match our special SOD network. Extensive experiments demonstrate the effectiveness and robustness of the proposed architecture and show that it represents a significant improvement over other state-of-the-art SOD approaches.
- Published
- 2020
35. Steganalysis Over Large-Scale Social Networks With High-Order Joint Features and Clustering Ensembles
- Author
-
Zhongqin Bi, Fengyong Li, Chunhua Gu, Mi Wen, Kui Wu, and Jingsheng Lei
- Subjects
Steganalysis ,021110 strategic, defence & security studies ,Steganography ,Computer Networks and Communications ,Computer science ,business.industry ,Feature extraction ,Payload (computing) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,0211 other engineering and technologies ,Pattern recognition ,Data_CODINGANDINFORMATIONTHEORY ,02 engineering and technology ,computer.file_format ,JPEG ,0202 electrical engineering, electronic engineering, information engineering ,Discrete cosine transform ,020201 artificial intelligence & image processing ,Artificial intelligence ,Safety, Risk, Reliability and Quality ,Cluster analysis ,business ,computer ,Transform coding - Abstract
This paper tackles a recent challenge in identifying culprit actors, who try to hide confidential payload with steganography, among many innocent actors in social media networks. The problem is called steganographer detection problem and is significantly different from the traditional stego detection problem that classifies an individual object as a cover or a stego. To solve the steganographer detection problem over large-scale social media networks, this paper proposes a method that uses high-order joint features and clustering ensembles. It employs 250-D features calculated from the high-order joint matrices of Discrete Cosine Transform (DCT) coefficients of JPEG images, which indicate the dependencies of image content. Furthermore, a number of hierarchical sub-clusterings trained by the features are integrated as a clustering ensemble based on the majority voting strategy, which is used to make optimal decisions on suspicious steganographers. Experimental results show that the proposed scheme is effective and efficient in identifying potential steganographers in large-scale social media networks, and has better performance when tested against the state-of-the-art steganographic methods.
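The detection idea, flagging one actor per sub-clustering and taking a majority vote across the ensemble, can be sketched as follows. Here `flag_outlier` is a simplified stand-in for the hierarchical sub-clustering step, and generic feature vectors replace the 250-D DCT joint features; both names are illustrative:

```python
import numpy as np
from collections import Counter

def flag_outlier(F):
    """Flag the actor whose features sit farthest from everyone else's
    (a stand-in for separating one actor by hierarchical clustering)."""
    D = np.linalg.norm(F[:, None] - F[None], axis=-1)
    return int(np.argmax(D.sum(1)))

def detect_steganographer(features, n_votes=15, seed=0):
    """Majority vote over sub-clusterings built on random feature subsets:
    the actor flagged most often is reported as the suspicious steganographer."""
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    votes = [flag_outlier(features[:, rng.choice(d, max(2, d // 2), replace=False)])
             for _ in range(n_votes)]
    return Counter(votes).most_common(1)[0][0]
```

Because each vote sees only a random feature subset, a single noisy feature cannot dominate the final decision.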
- Published
- 2016
36. Unsupervised Hyperspectral Band Selection by Dominant Set Extraction
- Author
-
Zhongqin Bi, Jingsheng Lei, Yuancheng Huang, Guokang Zhu, and Feifei Xu
- Subjects
business.industry ,Feature extraction ,0211 other engineering and technologies ,Hyperspectral imaging ,Pattern recognition ,02 engineering and technology ,Spectral bands ,Set (abstract data type) ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,General Earth and Planetary Sciences ,Graph (abstract data type) ,020201 artificial intelligence & image processing ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Representation (mathematics) ,Independence (probability theory) ,021101 geological & geomatics engineering ,Mathematics - Abstract
Unsupervised hyperspectral band selection has been an important topic in hyperspectral imagery. This technique aims at selecting some critical and decisive spectral bands from an original image for compact representation without compromising and distorting the raw information in the relevant spectral bands. Although many efforts have been made to this topic, the structural information has not yet been well exploited during band selection, and there are still several deficiencies in search strategies, leaving room for further improvement. This paper tackles the unsupervised hyperspectral band selection problem from a global perspective and proposes a novel method claiming the following main contributions: 1) structure-aware measures for band informativeness and independence; and 2) a graph formulation of band selection allowing for an efficient integrated search by means of dominant set extraction. Experiments on three real hyperspectral images demonstrate the superiority of the proposed band selector in comparison with benchmark methods.
- Published
- 2016
37. Anti-compression JPEG steganography over repetitive compression networks
- Author
-
Fengyong Li, Kui Wu, Jingsheng Lei, and Chuan Qin
- Subjects
Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Data_CODINGANDINFORMATIONTHEORY ,02 engineering and technology ,Compression (functional analysis) ,0202 electrical engineering, electronic engineering, information engineering ,Discrete cosine transform ,Computer vision ,Dither ,Electrical and Electronic Engineering ,Steganography ,business.industry ,Process (computing) ,020206 networking & telecommunications ,computer.file_format ,JPEG ,Transmission (telecommunications) ,Control and Systems Engineering ,Signal Processing ,Bit error rate ,020201 artificial intelligence & image processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,computer ,Software - Abstract
Existing work on steganography mostly assumes that an image remains unchanged during transmission from the sender to the receiver. This assumption, however, may not hold in the era of the Internet due to unknown compression from the network service providers. As a result, the hidden information cannot be correctly recovered at the receiver. To address the problem, we design a new JPEG steganographic method that can resist repetitive compression during network transmission, without even knowing the compression process controlled by the network service providers. Our method uses a simulated repetitive compression network, and based on its feedback performs adaptive dither adjustment to dynamically modify the DCT coefficients disturbed by the compression process. Stego images generated with our method can be used to successfully extract the original secret messages, even after the stego images pass through multiple unknown compression processes during network transmission. Extensive experiments demonstrate that compared with existing JPEG steganographic methods, our method can effectively resist repetitive compression, while maintaining a lower bit error rate and strong anti-steganalysis capability.
- Published
- 2020
38. Multi-Source Stego Detection with Low-Dimensional Textural Feature and Clustering Ensembles
- Author
-
Jingsheng Lei, Mi Wen, Xinpeng Zhang, Kui Wu, and Fengyong Li
- Subjects
Majority rule ,Physics and Astronomy (miscellaneous) ,Computer science ,General Mathematics ,0211 other engineering and technologies ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,multimedia security ,steganalysis ,steganographer detection ,image texture feature ,clustering ensembles ,Digital image processing ,0202 electrical engineering, electronic engineering, information engineering ,Computer Science (miscellaneous) ,Cluster analysis ,Steganalysis ,021110 strategic, defence & security studies ,Steganography ,business.industry ,lcsh:Mathematics ,Pattern recognition ,lcsh:QA1-939 ,Hierarchical clustering ,Chemistry (miscellaneous) ,020201 artificial intelligence & image processing ,Artificial intelligence ,Local ternary patterns ,business ,Multi-source - Abstract
This work tackles a recent challenge in digital image processing: identifying steganographic images produced by a steganographer who is hidden among multiple innocent actors. The method does not need a large number of samples to train a classification model, and is thus significantly different from traditional steganalysis. The proposed scheme combines textural features and clustering ensembles. Local ternary patterns (LTP) are employed to design low-dimensional textural features, which are more sensitive to steganographic changes in the texture regions of an image. Furthermore, we use the extracted low-dimensional textural features to train a number of hierarchical clusterings, which are integrated into an ensemble based on a majority voting strategy. Finally, the ensemble is used to make the optimal decision for a suspected image. Extensive experiments show that the proposed scheme is effective and efficient, and outperforms state-of-the-art steganalysis methods with an average gain of 4% to 6%.
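The LTP feature extraction can be sketched as follows. This is a common formulation of local ternary patterns with tolerance `t`, coding each neighborhood into an "upper" and a "lower" binary pattern; the exact low-dimensional variant used in the paper may differ:

```python
import numpy as np

def ltp_histograms(img, t=3):
    """Local ternary patterns: each 3x3 neighbourhood is coded +1/0/-1 against
    the centre pixel with tolerance t, then split into 'upper' and 'lower'
    binary patterns whose 256-bin histograms form the textural feature."""
    img = img.astype(np.int32)
    c = img[1:-1, 1:-1]                         # centre pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    upper = np.zeros_like(c)
    lower = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        upper += (n >= c + t).astype(np.int32) << bit   # neighbour clearly brighter
        lower += (n <= c - t).astype(np.int32) << bit   # neighbour clearly darker
    hu = np.bincount(upper.ravel(), minlength=256)
    hl = np.bincount(lower.ravel(), minlength=256)
    return np.concatenate([hu, hl]).astype(float)
```

On a perfectly flat image every code is zero, so all mass falls into the first bin of each 256-bin histogram.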
- Published
- 2018
- Full Text
- View/download PDF
39. A clustering ensemble: Two-level-refined co-association matrix with path-based transformation
- Author
-
Caiming Zhong, Xiaodong Yue, Zehua Zhang, and Jingsheng Lei
- Subjects
Fuzzy clustering ,business.industry ,Correlation clustering ,Single-linkage clustering ,Pattern recognition ,Spectral clustering ,Biclustering ,ComputingMethodologies_PATTERNRECOGNITION ,Distance matrix ,Artificial Intelligence ,Signal Processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Cluster analysis ,business ,Software ,k-medians clustering ,Mathematics - Abstract
The aim of clustering ensemble is to combine multiple base partitions into a robust, stable and accurate partition. One of the key problems of clustering ensemble is how to exploit the cluster structure information in each base partition. Evidence accumulation is an effective framework which can convert the base partitions into a co-association matrix. This matrix describes the frequency of a pair of points partitioned into the same cluster, but ignores some hidden information in the base partitions. In this paper, we reveal some of those information by refining the co-association matrix from data point and base cluster level. From the data point level, as pairs of points in the same base cluster may have varied similarities, their contributions to the co-association matrix can be different. From the cluster level, since the base clusters may have diversified qualities, the contribution of a base cluster as a whole can also be different from those of others. After being refined, the co-association matrix is transformed into a path-based similarity matrix so that more global information of the cluster structure is incorporated into the matrix. Finally, spectral clustering is applied to the matrix to generate the final clustering result. Experimental results on 8 synthetic and 8 real data sets demonstrate that the clustering ensemble based on the refined co-association matrix outperforms some state-of-the-art clustering ensemble schemes.
Highlights:
- A two-level-refined co-association matrix for cluster ensemble is proposed.
- The refined co-association matrix is transformed by a path-based measure.
- A theoretical background of the refinement is given.
- The proposed method outperforms some state-of-the-art ensemble methods.
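The evidence-accumulation step, building a co-association matrix from base partitions, can be sketched as follows. The two-level refinement weights and the spectral clustering step are simplified away; `cluster_by_threshold` is an illustrative stand-in for the final clustering, not the paper's method:

```python
import numpy as np

def co_association(partitions):
    """Evidence accumulation: C[i, j] = fraction of base partitions that put
    points i and j in the same cluster (refinement weights omitted here)."""
    P = np.asarray(partitions)                     # shape (n_partitions, n_points)
    return (P[:, :, None] == P[:, None, :]).mean(0)

def cluster_by_threshold(C, tau=0.5):
    """Final partition as connected components of the graph C >= tau,
    found by depth-first search over the co-association matrix."""
    n = C.shape[0]
    labels = -np.ones(n, dtype=int)
    cur = 0
    for s in range(n):
        if labels[s] >= 0:
            continue
        stack, labels[s] = [s], cur
        while stack:
            i = stack.pop()
            for j in np.nonzero((C[i] >= tau) & (labels < 0))[0]:
                labels[j] = cur
                stack.append(j)
        cur += 1
    return labels
```

Points that most base partitions group together end up in one component, even if no single base partition is fully correct.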
- Published
- 2015
40. Visual hierarchical cluster structure: A refined co-association matrix based visual assessment of cluster tendency
- Author
-
Jingsheng Lei, Caiming Zhong, and Xiaodong Yue
- Subjects
Fuzzy clustering ,business.industry ,Single-linkage clustering ,Correlation clustering ,Pattern recognition ,Complete-linkage clustering ,Spectral clustering ,Hierarchical clustering ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial Intelligence ,CURE data clustering algorithm ,Signal Processing ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Cluster analysis ,Software ,Mathematics - Abstract
Highlights:
- We employ a refined and transformed co-association matrix as the input of VAT.
- An efficient path-based similarity algorithm is presented; its time complexity is O(N²).
- A simple approach to analyze D* and obtain the clustering is designed.
- A visual hierarchical cluster structure can be presented.
A hierarchical clustering algorithm, such as Single-linkage, can depict the hierarchical relationship of clusters, but its clustering quality mainly depends on the similarity measure used. Visual assessment of cluster tendency (VAT) reorders a similarity matrix to reveal the cluster structure of a data set, and a VAT-based clustering discovers clusters by image segmentation techniques. Although VAT can visually present the cluster structure, its performance also relies on the similarity matrix employed. In this paper, we take a refined co-association matrix, which is originally used in ensemble clustering, as an initial similarity matrix and transform it by path-based measure, and then apply it to VAT. The final clustering is achieved by directly analyzing the transformed and reordered similarity matrix. The proposed method can deal with data sets with some complex cluster structures and reveal the relationship of clusters hierarchically. The experimental results on synthetic and real data sets demonstrate the above mentioned properties.
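The path-based transformation of a distance matrix can be sketched as a minimax-path closure: the transformed distance between two points is the smallest possible "largest edge" over all paths connecting them. This Floyd-Warshall-style O(N³) version is for clarity only; the paper reports an O(N²) algorithm for the same quantity:

```python
import numpy as np

def path_based_distance(D):
    """Minimax path transform: d*(i, j) = min over paths of the largest
    edge on the path. Points linked by a chain of small steps become
    close, exposing elongated cluster structures."""
    Dstar = D.astype(float).copy()
    n = len(Dstar)
    for k in range(n):
        # relax every pair through intermediate point k
        Dstar = np.minimum(Dstar, np.maximum(Dstar[:, k:k + 1], Dstar[k:k + 1, :]))
    return Dstar
```

For three collinear points at 0, 1 and 2, the direct distance from the first to the last is 2, but the minimax path through the middle point has largest edge 1, so the transformed distance drops to 1.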
- Published
- 2015
41. Finding influential agent groups in complex multiagent software systems based on citation network analyses
- Author
-
Jingsheng Lei, J.C. Jiang, and J.Y. Yu
- Subjects
Group (mathematics) ,Computer science ,business.industry ,Multi-agent system ,General Engineering ,ComputingMethodologies_ARTIFICIALINTELLIGENCE ,Outcome (game theory) ,Agent-based social simulation ,Software agent ,Software system ,Artificial intelligence ,business ,Citation ,Centrality ,Software - Abstract
Current complex engineering software systems are often composed of many components and can be built using a multiagent approach, resulting in what are called complex multiagent software systems. In such a system, software agents may cite the operation results of others, and the citation relationships among agents form a citation network; the importance of a software agent can therefore be described by the citations it receives from other agents. Moreover, the software agents in a system are often divided into groups, each containing agents that perform similar tasks or have related functions; it is thus necessary to find the influential agent group (not only the influential individual agent) that influences the system's outcome utilities more than the others. To solve this problem, this paper presents a new model for finding influential agent groups based on group centrality analyses of citation networks. In the presented model, a concept of extended group centrality is introduced to evaluate the impact of an agent group, determined collectively by both direct and indirect citations from agents outside the group. The model addresses two typical types of agent groups: adjacent groups, whose agents are adjacent in the citation network, and scattering groups, whose agents are distributed separately in the citation network. Finally, we present case studies and simulation experiments to demonstrate the effectiveness of the model.
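One way to sketch the flavor of "extended group centrality", counting direct citers at full weight and indirect citers at a decayed weight, is shown below. The `decay` parameter and the exact weighting are assumptions for illustration, not the paper's definition:

```python
def group_centrality(adj, group, decay=0.5):
    """Hypothetical extended group centrality: an outside agent that cites the
    group through a shortest citation chain of length k contributes
    decay**(k-1). adj[i][j] == 1 means agent i cites agent j."""
    n = len(adj)
    members = set(group)
    score = 0.0
    for s in range(n):
        if s in members:
            continue
        # breadth-first search from s along citation edges until the group is hit
        dist, frontier, seen = None, [s], {s}
        k = 0
        while frontier and dist is None:
            k += 1
            nxt = []
            for i in frontier:
                for j in range(n):
                    if adj[i][j] and j not in seen:
                        if j in members:
                            dist = k
                            break
                        seen.add(j)
                        nxt.append(j)
                if dist is not None:
                    break
            frontier = nxt
        if dist is not None:
            score += decay ** (dist - 1)
    return score
```

In a four-agent chain where agent 0 cites 1 and agent 1 cites 2, the group {2} earns 1 from its direct citer (agent 1) and 0.5 from the indirect citer (agent 0), while the isolated agent 3 contributes nothing.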
- Published
- 2015
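The extended group centrality described in entry 41 could be sketched as follows. This is a toy illustration only: the example graph, the per-hop `decay` discount, and the `max_hops` cutoff are assumptions of this sketch, not the paper's actual formulation.

```python
def extended_group_centrality(cites, group, decay=0.5, max_hops=3):
    """Toy extended group centrality: direct citations into the group
    count fully; indirect citations (citers of citers, outside the
    group) are discounted by `decay` per extra hop.

    cites: dict mapping agent -> set of agents it cites.
    """
    group = set(group)
    # Reverse the edges: for each agent, who cites it.
    cited_by = {}
    for src, targets in cites.items():
        for dst in targets:
            cited_by.setdefault(dst, set()).add(src)

    score = 0.0
    seen = set(group)      # group members never count as citers
    frontier = set(group)
    # Breadth-first walk outward from the group along reverse edges.
    for hop in range(1, max_hops + 1):
        nxt = set()
        for node in frontier:
            for citer in cited_by.get(node, ()):
                if citer not in seen:
                    nxt.add(citer)
        seen |= nxt
        score += decay ** (hop - 1) * len(nxt)
        frontier = nxt
    return score

cites = {
    "a1": {"g1"}, "a2": {"g1", "g2"},   # direct citers of the group
    "a3": {"a1"},                        # indirect citer (cites a citer)
    "g1": set(), "g2": {"g1"},
}
print(extended_group_centrality(cites, {"g1", "g2"}))  # 2 direct + 0.5*1 indirect = 2.5
```

A scattering group is handled the same way: the breadth-first walk simply starts from all members at once, wherever they sit in the network.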
42. Towards building a social emotion detection system for online news
- Author
-
Jingsheng Lei, Xiaojun Quan, Liu Wenyin, Qing Li, and Yanghui Rao
- Subjects
Computer Networks and Communications ,Computer science ,business.industry ,media_common.quotation_subject ,Context (language use) ,Ambiguity ,Lexicon ,Part of speech ,computer.software_genre ,Task (project management) ,Hardware and Architecture ,Selection (linguistics) ,Artificial intelligence ,business ,computer ,Software ,Natural language processing ,media_common - Abstract
Social emotion detection for online users has become an important task in mining public opinion. It aims to predict the emotions evoked in readers by news articles, tweets, and similar content. In this article, we focus on building a social emotion detection system for online news. The system is built from modules for document selection, part-of-speech (POS) tagging, and social emotion lexicon generation. Empirical studies are conducted extensively on a large-scale, real-world collection of news articles. Experiments show that the document selection algorithm has a positive effect on social emotion detection, and that the system performs better with combined word-and-POS features than with a feature set consisting of words alone. POS information is also useful for detecting the emotional ambiguity of words and the context dependence of their sentiment orientations. Furthermore, the proposed lexicon generation method outperforms the baselines in terms of social emotion prediction.
- Published
- 2014
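One ingredient of entry 42, combining each word with its POS tag so that a word's emotional sense can depend on its part of speech, might be sketched like this (the helper name is hypothetical; tags follow the Penn Treebank convention):

```python
def word_pos_features(tagged_tokens):
    """Fuse each token with its POS tag into a single feature, so that
    an ambiguous word like 'fine' yields distinct features as an
    adjective ('fine/JJ') and as a noun ('fine/NN')."""
    return [f"{word.lower()}/{tag}" for word, tag in tagged_tokens]

tagged = [("The", "DT"), ("verdict", "NN"), ("was", "VBD"), ("fine", "JJ")]
print(word_pos_features(tagged))
# → ['the/DT', 'verdict/NN', 'was/VBD', 'fine/JJ']
```

The resulting word/tag features can then be fed to any lexicon or classifier in place of bare words.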
43. Building emotional dictionary for sentiment analysis of online news
- Author
-
Mingliang Chen, Qing Li, Liu Wenyin, Jingsheng Lei, and Yanghui Rao
- Subjects
Topic model ,Information retrieval ,Social emotions ,Web 2.0 ,Computer Networks and Communications ,Computer science ,business.industry ,Microblogging ,Sentiment analysis ,computer.software_genre ,Hardware and Architecture ,Social media ,Pruning (decision trees) ,Artificial intelligence ,Construct (philosophy) ,business ,computer ,Software ,Natural language processing ,Word (computer architecture) - Abstract
Sentiment analysis of online documents such as news articles, blogs, and microblogs has received increasing attention in recent years. In this article, we propose an efficient algorithm and three pruning strategies for automatically building a word-level emotional dictionary for social emotion detection. In the dictionary, each word is associated with a distribution over a set of human emotions. In addition, a method based on topic modeling is proposed to construct a topic-level dictionary, in which each topic is correlated with social emotions. Experiments on real-world data sets validate the effectiveness and reliability of the methods. Compared with other lexicons, a dictionary generated by our approach is language-independent, fine-grained, and unlimited in volume. The generated dictionary has a wide range of applications, including predicting the emotional distribution of news articles and identifying social emotions associated with particular entities and news events.
- Published
- 2013
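The word-level emotional dictionary of entry 43 pairs each word with a distribution over reader emotions. A minimal sketch of how such a dictionary could be built from reader-voted articles follows; the data layout and the `min_count` threshold are assumptions of this sketch, and the paper's pruning strategies are omitted:

```python
from collections import Counter, defaultdict

def build_emotion_dictionary(docs, min_count=1):
    """docs: list of (tokens, emotion_votes) pairs, where emotion_votes
    maps an emotion label to the number of reader votes the article
    received. Returns word -> normalized emotion distribution."""
    raw = defaultdict(Counter)
    for tokens, votes in docs:
        # Each distinct word in the article inherits its reader votes.
        for word in set(tokens):
            raw[word].update(votes)
    dictionary = {}
    for word, counts in raw.items():
        total = sum(counts.values())
        if total >= min_count:
            # Normalize counts into a probability distribution.
            dictionary[word] = {e: c / total for e, c in counts.items()}
    return dictionary

docs = [
    (["earthquake", "victims"], {"sadness": 80, "anger": 20}),
    (["earthquake", "rescue"], {"touched": 60, "sadness": 40}),
]
d = build_emotion_dictionary(docs)
print(d["earthquake"])  # sadness 0.6, anger 0.1, touched 0.3
```

Predicting the emotional distribution of a new article then reduces to averaging the distributions of its known words.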
44. Toward Automatic Answers in User-Interactive Question Answering Systems
- Author
-
Jingsheng Lei, Feifei Xu, Qing Li, Tianyong Hao, and Liu Wenyin
- Subjects
Structure (mathematical logic) ,Information retrieval ,Computer science ,business.industry ,media_common.quotation_subject ,Ambiguity ,Representation (arts) ,Semantics ,computer.software_genre ,Semantic computing ,Similarity (psychology) ,Question answering ,Pattern matching ,Artificial intelligence ,business ,computer ,Natural language processing ,media_common - Abstract
This paper proposes a strategy for automatically retrieving answers to repeated or similar questions in user-interactive systems by employing semantic question patterns. A semantic question pattern is a generalized representation of a group of questions with both similar structure and relevant semantics. Specifically, it attaches semantic annotations (or constraints) to the variable components of the pattern, which enriches the semantic representation and greatly reduces the ambiguity of a question instance expressed with the pattern. The proposed method consists of four major steps: structure processing; similar-pattern matching and filtering; automatic pattern generation; and question similarity evaluation with answer retrieval. Preliminary experiments in a real question answering system show that the method achieves a precision of more than 90%.
- Published
- 2011
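The semantic question patterns of entry 44 constrain each variable slot with a semantic class. A toy sketch of such pattern matching, with the word classes and slot syntax invented purely for illustration:

```python
import re

# Hypothetical semantic classes; the real system's annotations are richer.
WORD_CLASSES = {
    "city": {"paris", "london", "shanghai"},
    "attribute": {"population", "area", "mayor"},
}

def match_pattern(pattern, question):
    """pattern uses [class] slots, e.g. 'what is the [attribute] of [city]'.
    Returns the slot bindings if the question matches both the structure
    and every slot's semantic constraint, else None."""
    slots = re.findall(r"\[(\w+)\]", pattern)
    # Turn each [class] slot into a capture group, keeping the rest literal.
    regex = "^" + re.sub(r"\\\[\w+\\\]", r"(\\w+)", re.escape(pattern)) + "$"
    m = re.match(regex, question.lower())
    if not m:
        return None  # structure does not match
    bindings = dict(zip(slots, m.groups()))
    # Reject instances whose fillers violate the semantic constraints.
    if all(word in WORD_CLASSES[cls] for cls, word in bindings.items()):
        return bindings
    return None

print(match_pattern("what is the [attribute] of [city]",
                    "What is the population of Paris"))
# → {'attribute': 'population', 'city': 'paris'}
```

A matched instance can then be compared against stored question-answer pairs that share the same pattern and bindings.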
45. Network Computing and Information Security : Second International Conference, NCIS 2012, Shanghai, China, December 7-9, 2012, Proceedings
- Author
-
Jingsheng Lei, Fu Lee Wang, Mo Li, and Yuan Luo
- Subjects
- Data protection, Computer networks, Artificial intelligence, Electronic digital computers—Evaluation
- Abstract
This book constitutes the proceedings of the Second International Conference on Network Computing and Information Security, NCIS 2012, held in Shanghai, China, in December 2012. The 104 revised papers presented in this volume were carefully reviewed and selected from 517 submissions. They are organized in topical sections named: applications of cryptography; authentication and non-repudiation; cloud computing; communication and information systems; design and analysis of cryptographic algorithms; information hiding and watermarking; intelligent networked systems; multimedia computing and intelligence; network and wireless network security; network communication; parallel and distributed systems; security modeling and architectures; sensor network; signal and information processing; virtualization techniques and applications; and wireless network.
- Published
- 2013
46. Artificial Intelligence and Computational Intelligence : 4th International Conference, AICI 2012, Chengdu, China, October 26-28, 2012, Proceedings
- Author
-
Jingsheng Lei, Fu Lee Wang, Hepu Deng, and Duoqian Miao
- Subjects
- Artificial intelligence, Pattern recognition systems, Computer vision, Data mining, Computer science, Algorithms
- Abstract
This proceedings volume contains revised selected papers from the 4th International Conference on Artificial Intelligence and Computational Intelligence, AICI 2012, held in Chengdu, China, in October 2012. The 163 high-quality papers presented were carefully reviewed and selected from 724 submissions. The papers are organized into topical sections on applications of artificial intelligence, applications of computational intelligence, data mining and knowledge discovery, evolution strategy, expert and decision support systems, fuzzy computation, information security, intelligent control, intelligent image processing, intelligent information fusion, intelligent signal processing, machine learning, neural computation, neural networks, particle swarm optimization, and pattern recognition.
- Published
- 2012
47. Emerging Research in Artificial Intelligence and Computational Intelligence : International Conference, AICI 2012, Chengdu, China, October 26-28, 2012. Proceedings
- Author
-
Jingsheng Lei, Fu Lee Wang, Hepu Deng, and Duoqian Miao
- Subjects
- Artificial intelligence, Computer science, Data mining, Computer vision, Pattern recognition systems
- Abstract
This book constitutes the refereed proceedings of the International Conference on Artificial Intelligence and Computational Intelligence, AICI 2012, held in Chengdu, China, in October 2012. The 163 revised full papers presented were carefully reviewed and selected from 724 submissions. The papers are organized in topical sections on applications of artificial intelligence; applications of computational intelligence; data mining and knowledge discovery; evolution strategy; intelligent image processing; machine learning; neural networks; and pattern recognition.
- Published
- 2012
48. Artificial Intelligence and Computational Intelligence : Second International Conference, AICI 2011, Taiyuan, China, September 24-25, 2011, Proceedings, Part III
- Author
-
Hepu Deng, Duoqian Miao, Jingsheng Lei, and Fu Lee Wang
- Subjects
- Artificial intelligence, Application software, Algorithms, Information storage and retrieval systems, Computer science, Software engineering
- Abstract
This three-volume set of proceedings contains revised selected papers from the Second International Conference on Artificial Intelligence and Computational Intelligence, AICI 2011, held in Taiyuan, China, in September 2011. The 265 high-quality papers presented were carefully reviewed and selected from 1073 submissions. The topics covered in Part III are: machine vision; natural language processing; nature computation; neural computation; neural networks; particle swarm optimization; pattern recognition; rough set theory; and support vector machines.
- Published
- 2011
49. A formal measurement of the cognitive complexity of texts in cognitive linguistics
- Author
-
Robert C. Berwick, Yingxu Wang, Jingsheng Lei, and Xiangfeng Luo
- Subjects
Cognitive models of information retrieval ,Computer science ,business.industry ,Cognitive musicology ,Cognitive complexity ,Cognitive semantics ,Cognitive architecture ,computer.software_genre ,Language and Communication Technologies ,Quantitative linguistics ,Artificial intelligence ,business ,Cognitive linguistics ,computer ,Natural language processing - Abstract
The cognitive complexity of texts in natural languages is a fundamental measure of the properties of syntax and semantics in textual comprehension, processing, and search. This paper presents a formal metric of text comprehension complexity in cognitive linguistics. Both objective and subjective aspects of text comprehension and their complexity are formally modeled. A formal language model is established that characterizes the discourse of natural languages, and the mathematical models of the cognitive complexity of texts and their comprehension are rigorously described. On the basis of these cognitive and mathematical models, the measurement of the cognitive complexity of texts is quantitatively established and tested through a set of case studies. A wide range of applications of the measurement is identified in cognitive linguistics and in contemporary web technologies such as search engines, online document retrieval, natural language processing, cognitive computing, cognitive machine learning, and computing with words.
- Published
- 2012
50. Mechanism design for finding experts using locally constructed social referral web
- Author
-
Yunhao Liu, Jingsheng Lei, Jiaguang Sun, Lan Zhang, and Xiang-Yang Li
- Subjects
Mechanism design ,Matching (statistics) ,Information retrieval ,Social network ,Referral ,Computer science ,business.industry ,Machine learning ,computer.software_genre ,Computational Theory and Mathematics ,Hardware and Architecture ,Signal Processing ,Artificial intelligence ,Data mining ,business ,computer ,Mechanism (sociology) ,Social structure ,Local search (constraint satisfaction) - Abstract
In this work, we address the problem of distributed expert finding using chains of social referrals and profile matching with only local information in online social networks. Assuming that users are selfish, rational, and have privately known costs of participating in referrals, we design a novel truthful and efficient mechanism in which an expert-finding query is relayed by intermediate users. When receiving a referral request, a participant locally chooses one of her neighbors to relay the request. Several closely coupled methods are carefully designed to improve the performance of this distributed search, including profile matching, social-acquaintance prediction, a score function for locally choosing relay neighbors, and budget estimation. We conduct extensive experiments on several online social network data sets. The study shows that our mechanism finds closely matched experts with a success rate of about 90 percent using only local search and a limited budget, a significant improvement over the previous best rate of 20 percent. The overall cost of finding an expert with our truthful mechanism is about 20 percent of that of untruthful methods, e.g., a method that always selects high-degree neighbors. Using our localized search decisions, the median length of social referral chains is 6, which surprisingly matches the well-known small-world phenomenon of global social structures.
- Published
- 2012
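The localized referral search of entry 50 can be sketched as a greedy walk that scores neighbors by profile match under a budget. Everything here is an assumption of the sketch, not the paper's mechanism: the Jaccard score, the unit hop cost, the 0.8 match threshold, and the example network; the paper's truthfulness guarantees and budget estimation are not modeled.

```python
def profile_match(profile_a, profile_b):
    """Jaccard similarity of keyword profiles (a stand-in for the
    paper's profile-matching component)."""
    a, b = set(profile_a), set(profile_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def find_expert(start, query_profile, neighbors, profiles, budget):
    """Greedy local referral: the current user forwards the query to her
    best-matching unvisited neighbor, paying a unit cost per hop, until
    the budget runs out or a sufficiently matched expert is reached."""
    chain, current, visited = [start], start, {start}
    while budget > 0:
        if profile_match(profiles[current], query_profile) >= 0.8:
            return chain  # closely matched expert found
        candidates = [n for n in neighbors[current] if n not in visited]
        if not candidates:
            return None   # dead end: no unvisited neighbor to relay to
        current = max(candidates,
                      key=lambda n: profile_match(profiles[n], query_profile))
        visited.add(current)
        chain.append(current)
        budget -= 1       # unit participation cost per referral hop
    return None           # budget exhausted before finding an expert

profiles = {
    "alice": {"ml"}, "bob": {"ml", "nlp"}, "carol": {"nlp", "qa"},
}
neighbors = {"alice": ["bob"], "bob": ["carol"], "carol": []}
print(find_expert("alice", {"nlp", "qa"}, neighbors, profiles, budget=6))
# → ['alice', 'bob', 'carol']
```

Each decision uses only the current user's neighborhood, which is what keeps the search local in the sense of the abstract.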