47 results for "Song, Zhanjie"
Search Results
2. Frame-part-activated deep reinforcement learning for Action Prediction.
- Author
-
Chen, Lei and Song, Zhanjie
- Subjects
- *
DEEP reinforcement learning , *REINFORCEMENT learning , *ACTIVE learning , *REINFORCEMENT (Psychology) , *HUMAN body - Abstract
In this paper, we propose frame-part-activated deep reinforcement learning (FPA-DRL) for action prediction. Most existing methods for action prediction use the evolution of whole frames to model actions, which cannot avoid the noise of the current action, especially in early prediction. Moreover, the loss of structural information of the human body diminishes the capacity of features to describe actions. To address this, we design FPA-DRL to exploit the structure of the human body by extracting skeleton proposals and to reduce the redundancy of frames under a deep reinforcement learning framework. Specifically, we extract features from different parts of the human body individually, and activate the action-related parts in features and the action-related frames in videos to enhance the representation. Our method not only exploits the structural information of the human body but also attends to the frames that best express actions. We evaluate our method on three popular action prediction datasets: UT-Interaction, BIT-Interaction, and UCF101. Experimental results demonstrate that our method achieves highly competitive performance compared with the state of the art. • We design a part-activated module to enhance the action-related parts of features. • We design a frame-activated module to reduce the redundancy of frames. • We achieve highly competitive results against state-of-the-art methods on three datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. Learning cross-task relations for panoptic driving perception.
- Author
-
Song, Zhanjie and Zhao, Linqing
- Subjects
- *
INFORMATION sharing , *AUTONOMOUS vehicles - Abstract
Accurately understanding traffic surroundings is crucial for various autonomous and assisted driving scenarios. The visual perception system must capture the entire scene, including vehicle positions, road conditions, and lane configurations. While existing methods co-train models for these tasks simultaneously, they overlook the topological relationships among roads, lanes, and traffic objects in images. In this paper, we propose leveraging inherent structural relations among these tasks to enhance precise panoptic driving perception. We introduce a cross-task relation mining (CRM) method to achieve this goal. Self-attention mechanisms are used to blend key spatial features within each task, and cross-attention facilitates essential information exchange between tasks, resulting in a more comprehensive scene interpretation. Extensive experiments demonstrate the effectiveness of our approach in complex traffic scenarios. • We present a cross-task relation mining technique that enhances the overall comprehension of driving scenarios. • We propose a multi-scale attention-based interaction module to capture mutual priors across all tasks. • Our method improves multi-task consistency and achieves the best performance across all tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
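The key-feature blending in the CRM abstract boils down to scaled dot-product cross-attention. A minimal pure-Python sketch of that generic mechanism (not the paper's actual module; the toy feature vectors are made up):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each query from one task
    blends the other task's value vectors, weighted by similarity."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)
        out.append([sum(wj * v[i] for wj, v in zip(w, values))
                    for i in range(len(values[0]))])
    return out

# toy example: detection features attend to lane-segmentation features
det_feats  = [[1.0, 0.0], [0.0, 1.0]]
lane_feats = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_attention(det_feats, lane_feats, lane_feats)
```

Each fused row is a convex combination of the other task's features, which is how information is exchanged between tasks.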
4. Approximation of Nonhomogeneous Random Field from Local Averages.
- Author
-
Song, Zhanjie and Zhang, Shuo
- Subjects
- *
RANDOM fields , *APPROXIMATION error , *SAMPLING theorem - Abstract
In this article, we extend the Shannon sampling series reconstruction theorem to nonhomogeneous random fields using local average sampling, which improves certain earlier results. The upper bound on the mean square truncation error of the sampling approximation is made more precise, and we establish an approximation result in the almost sure sense. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
5. Learning Hybrid Semantic Affinity for Point Cloud Segmentation.
- Author
-
Song, Zhanjie, Zhao, Linqing, and Zhou, Jie
- Subjects
- *
POINT cloud , *BLENDED learning , *VIDEO coding , *IMAGE segmentation , *NEIGHBORHOODS , *TASK analysis - Abstract
In this paper, we present a hybrid semantic affinity learning method (HSA) to capture and leverage the dependencies of categories for 3D semantic segmentation. Unlike existing methods that only use the cross-entropy loss to perform one-to-one supervision and ignore the semantic relations between points, our approach aims to learn the label dependencies between 3D points from a hybrid perspective. From a global view, we introduce the structural correlations among different classes to provide global priors for point features. Specifically, we fuse word embeddings of labels and scene-level features as category nodes, which are processed via a graph convolutional network (GCN) to produce the sample-adapted global priors. These priors are then combined with point features to enhance the rationality of semantic predictions. From a local view, we propose the concept of local affinity to effectively model the intra-class and inter-class semantic similarities for adjacent neighborhoods, making the predictions more discriminative. Experimental results show that our method consistently improves the performance of state-of-the-art models across indoor (S3DIS, ScanNet), outdoor (SemanticKITTI), and synthetic (ShapeNet) datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
6. Average sampling theorem for the homogeneous random fields in a reproducing kernel subspace of mixed Lebesgue space.
- Author
-
Wang, Suping and Song, Zhanjie
- Subjects
- *
SAMPLING theorem - Abstract
In this paper, we mainly investigate the average sampling problem for the homogeneous random fields in a reproducing kernel subspace of mixed Lebesgue space. Based on the counterpart sampling result for the deterministic signals in the same space, a mean square convergence result for recovering the homogeneous random fields by the iterative reconstruction algorithm is obtained. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
7. Average Sampling Theorems on Multidimensional Random Signals.
- Author
-
Song, Zhanjie and Zhang, Shuo
- Subjects
- *
SAMPLING theorem , *STATISTICAL sampling - Abstract
In this paper, a lower bound for sequences in the classical sampling theorem for random signals is given and used to extend the applicable range of frames. The convergence of the sampling series and estimates of the truncation error, both in the mean square sense and in the almost sure sense, are analyzed for multidimensional random signals under asymmetrical average sampling. Using recent results on frames, the famous Shannon sampling theorem is extended to multidimensional random signals with a new idea. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
8. Signal Separation Operator Based on Wavelet Transform for Non-Stationary Signal Decomposition.
- Author
-
Han, Ningning, Pei, Yongzhen, and Song, Zhanjie
- Abstract
This paper develops a new framework for non-stationary signal separation that combines the wavelet transform, a clustering strategy, and local maximum approximation. We provide a rigorous mathematical theoretical analysis and prove that the proposed algorithm can estimate instantaneous frequencies and sub-signal modes from a blind source signal. Error bounds for instantaneous frequency estimation and sub-signal recovery are provided. Numerical experiments on synthetic and real data demonstrate the effectiveness and efficiency of the proposed algorithm. Our wavelet-transform-based method can be extended to other time–frequency transforms, offering a new perspective on time–frequency analysis tools for the non-stationary signal separation problem. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. Space–time inhomogeneous background intensity estimators for semi-parametric space–time self-exciting point process models.
- Author
-
Li, Chenlong, Song, Zhanjie, and Wang, Wenjun
- Subjects
- *
POINT processes , *EXPECTATION-maximization algorithms , *BANDWIDTHS , *HISTOGRAMS - Abstract
Histogram maximum likelihood estimators of semi-parametric space–time self-exciting point process models obtained via the expectation–maximization (EM) algorithm can be biased when the background process is inhomogeneous. We explore an alternative estimation method based on variable bandwidth kernel density estimation (KDE) and the EM algorithm. The proposed method expands the semi-parametric models by incorporating a background process that is inhomogeneous in both space and time, and applies the variable bandwidth KDE to estimate the background intensity function. Using an example, we show how the background intensity can be estimated this way. Two simulation examples based on residual analysis are designed to evaluate and validate the ability of our methods to recover the background intensity function and the parametric triggering intensity function. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
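The variable bandwidth KDE mentioned above can be sketched in a few lines: a sample-point estimator whose per-event bandwidth shrinks where the pilot density is high and widens for isolated events. The pilot bandwidth, the sensitivity exponent 1/2, and the 1-D setting are simplifying assumptions; the paper works in space–time:

```python
import math

def variable_bandwidth_kde(events, pilot_bw=1.0, alpha=0.5):
    """Sample-point variable-bandwidth KDE: each event gets its own
    bandwidth, scaled by the pilot density at that event."""
    gauss = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    n = len(events)
    # fixed-bandwidth pilot density evaluated at the events themselves
    pilot = [sum(gauss((x - y) / pilot_bw) for y in events) / (n * pilot_bw)
             for x in events]
    g = math.exp(sum(math.log(p) for p in pilot) / n)   # geometric mean
    bw = [pilot_bw * (g / p) ** alpha for p in pilot]   # per-event bandwidths

    def density(x):
        return sum(gauss((x - y) / h) / h for y, h in zip(events, bw)) / n
    return density, bw

events = [0.0, 0.1, 0.2, 5.0]        # tight cluster plus one isolated event
density, bw = variable_bandwidth_kde(events)
```

The isolated event at 5.0 receives a wider bandwidth than the clustered events, and the estimator still integrates to one.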
10. Task-Aware Attention Model for Clothing Attribute Prediction.
- Author
-
Zhang, Sanyi, Song, Zhanjie, Cao, Xiaochun, Zhang, Hua, and Zhou, Jie
- Subjects
- *
FORECASTING , *IMAGE color analysis - Abstract
Clothing attribute recognition, especially in unconstrained street images, is a challenging multimedia task. Existing methods for multi-task clothing attribute prediction often ignore the relation between specific attributes and positions. However, the attribute response is location-sensitive: different spatial locations contribute differently to attributes. Inspired by the locality of clothing attributes, in this paper we introduce an attention mechanism that incorporates the influence of position on clothing attribute prediction using only image-level annotations. Directly applying a traditional spatial attention model to each task yields only limited improvement, since it does not take the influence of the other tasks into account. Instead, we propose a novel task-aware attention mechanism, which estimates the importance of each position across different tasks. We first evaluate the task attention network with an end-to-end multi-task clothing attribute learning architecture on the shop domain. We then employ a curriculum learning strategy that transfers the well-trained shop-domain attribute knowledge to street-domain attribute prediction. Experiments are conducted on three clothing benchmarks: a cross-domain clothing attribute dataset, a woman clothing dataset, and a man clothing dataset. The attribute prediction performance demonstrates the superiority of the proposed task-aware attention mechanism over several state-of-the-art methods in both the shop and street domains. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
11. Mining periodic patterns and cascading bursts phenomenon in individual e-mail communication.
- Author
-
Li, Chenlong, Song, Zhanjie, Wang, Wenjun, and Wang, Xu (Sunny)
- Subjects
- *
POISSON processes , *STOCHASTIC processes , *HUMAN behavior , *EMAIL , *ECONOMIC trends , *COMMUNICATION patterns - Abstract
Quantitative understanding of human activity is very important as many social and economic trends are driven by human actions. We propose a novel stochastic process, the Multi-state Markov Cascading Non-homogeneous Poisson Process (M2CNPP), to analyze human e-mail communication involving both periodic patterns and bursts phenomenon. The model parameters are estimated using the Generalized Expectation Maximization (GEM) algorithm while the hidden states are treated as missing values. The empirical results demonstrate that the proposed model adequately captures the major temporal cascading features as well as the periodic patterns in e-mail communication. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
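The non-homogeneous Poisson ingredient of the M2CNPP model can be simulated by Lewis thinning. A sketch with an assumed sinusoidal periodic intensity (the multi-state Markov cascade layer is omitted):

```python
import math
import random

def simulate_nhpp(intensity, t_max, lam_max, seed=7):
    """Lewis thinning: draw candidates from a homogeneous Poisson process
    of rate lam_max, accept each at time t with prob intensity(t)/lam_max."""
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        t += rng.expovariate(lam_max)
        if t > t_max:
            break
        if rng.random() * lam_max <= intensity(t):
            events.append(t)
    return events

# periodic pattern: intensity oscillates once per unit period, mean rate 1
periodic = lambda t: 1.0 + math.sin(2 * math.pi * t)
events = simulate_nhpp(periodic, t_max=200.0, lam_max=2.0)
```

With mean rate 1 over [0, 200], roughly 200 events are produced, clustered around the intensity peaks of each period.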
12. Learning principal orientations and residual descriptor for action recognition.
- Author
-
Chen, Lei, Song, Zhanjie, Lu, Jiwen, and Zhou, Jie
- Subjects
- *
PATTERN recognition systems , *MACHINE learning , *FEATURE extraction , *DISTRIBUTION (Probability theory) , *ARTIFICIAL neural networks - Abstract
Highlights: • We exploit the distribution information of the principal orientations of a dataset by learning the projection matrix with trajectories on both the spatial and temporal domains, extracting features informatively. • We exploit the residual information of projected features in the projection subspace by maximizing the residual value of features from the principal orientations. • We consider the correlation between the RGB channels and the depth channel for RGB-D action recognition and jointly learn the projection matrices on the corresponding channels. Abstract: In this paper, we propose an unsupervised representation method that learns principal orientations and a residual descriptor (PORD) for action recognition. Our PORD aims to learn the statistical principal orientations and to represent the local features of action videos with residual values. Existing hand-crafted-feature methods require high prior knowledge and lack the ability to represent the distribution of features in the dataset. Most deep-learned-feature methods are data adaptive, but they consider neither the projection orientations of features nor the loss incurred by quantizing locally aggregated descriptors. We propose the principal orientations and residual descriptor because the principal orientations reflect the distribution of local features in the dataset and the residual of the projection contains discriminative information about local features. Moreover, we propose a multi-modality PORD method that reduces the modality gap between the RGB channels and the depth channel at the feature level, making our method applicable to RGB-D action recognition. To evaluate performance, we conduct experiments on five challenging action datasets: Hollywood2, UCF101, HMDB51, MSRDaily, and MSR-Pair. The results show that our method is competitive with state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
13. An improved signal determination method on machined surface topography.
- Author
-
Sun, Jingjing, Song, Zhanjie, He, Gaiyun, and Sang, Yicun
- Subjects
- *
SURFACE topography , *SIGNAL denoising , *HILBERT-Huang transform , *TRANSFER functions , *FOURIER analysis - Abstract
The characteristic signals of a machined surface are a mixture of actual signals and noise. Wavelet denoising can make the features distinct; however, some deterministic signals may be lost when the noise is removed, and the resulting loss of energy makes it difficult to judge the real components of the surface. An improved signal determination method, wavelet denoising with compensation of the loss (WDCL), is proposed in this paper. The compensation method uses ensemble empirical mode decomposition (EEMD) and a transfer function whose instantaneous frequency is calculated by the Hilbert transform (HT). The coefficients of the transfer function are adjusted by raising the passing rate of the deterministic signals and lowering the passing rate of noise. The results show that WDCL can enhance the resolution of the real signals and further reduce noise. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
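The wavelet-denoising front end of WDCL can be illustrated with a one-level Haar transform and soft thresholding; the EEMD/transfer-function compensation step is beyond this sketch:

```python
def haar_forward(x):
    """One-level Haar transform (len(x) must be even)."""
    approx = [(x[2*i] + x[2*i + 1]) / 2 for i in range(len(x) // 2)]
    detail = [(x[2*i] - x[2*i + 1]) / 2 for i in range(len(x) // 2)]
    return approx, detail

def haar_inverse(approx, detail):
    x = []
    for a, d in zip(approx, detail):
        x.extend([a + d, a - d])
    return x

def wavelet_denoise(x, threshold):
    """Soft-threshold the detail coefficients, keep the approximation."""
    a, d = haar_forward(x)
    d = [max(abs(c) - threshold, 0.0) * (1.0 if c >= 0 else -1.0) for c in d]
    return haar_inverse(a, d)
```

With threshold 0 the signal is reconstructed exactly; a positive threshold suppresses the small high-frequency details, which is precisely where some deterministic signal energy can be lost, motivating the paper's compensation.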
14. Optimal distribution of integration time in degree of linear polarization polarimetry based on the expected variance.
- Author
-
Song, Zhanjie, Li, Xiaobo, and Liu, Tiegen
- Subjects
- *
POLARIMETRY , *ANALYSIS of variance , *OPTICAL polarization , *DISTRIBUTION (Probability theory) , *RANDOM noise theory - Abstract
Previous studies show that if the total integration time of the intensity measurements is fixed, the variance of the DOLP (degree of linear polarization) estimator depends on how that time is distributed between the two intensity measurements. However, the optimal distribution depends on the quantity being measured. In this paper, the expected variance is used to define a cost function for optimizing the estimator, which overcomes this limitation. Minimizing the expected variance is equivalent to minimizing the noise power of the estimator, and thus improves its precision. We also derive the closed-form solution for the optimal distribution of the integration time under additive Gaussian noise by the Lagrange multiplier method. The theoretical analysis shows that the variance of the DOLP estimator can be decreased for most values of DOLP without prior knowledge, and the proposed method statistically improves the measurement accuracy of the polarimetry system. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
15. Learning Pooling for Convolutional Neural Network.
- Author
-
Sun, Manli, Song, Zhanjie, Jiang, Xiaoheng, Pan, Jing, and Pang, Yanwei
- Subjects
- *
ARTIFICIAL neural networks , *MACHINE learning , *OBJECT recognition (Computer vision) , *PARAMETER estimation , *FEATURE selection - Abstract
Convolutional neural networks (CNNs) consist of alternating convolutional layers and pooling layers. The pooling layer applies a pooling operator to aggregate information within each small region of the input feature channels and then downsamples the results. Typically, hand-crafted pooling operations are used to aggregate information within a region, but they are not guaranteed to minimize the training error. To overcome this drawback, we propose a learned pooling operation, obtained by end-to-end training, called LEAP (LEArning Pooling). Specifically, one shared linear combination of the neurons in the region is learned for each feature channel (map). Average pooling can in fact be seen as a special case of our method in which all the weights are equal. In addition, inspired by the LEAP operation, we propose a simplified convolution operation to replace the traditional convolution, which consumes many extra parameters. The simplified convolution greatly reduces the number of parameters while maintaining comparable performance. Combining the proposed LEAP method and the simplified convolution, we demonstrate state-of-the-art classification performance with a moderate number of parameters on three public object recognition benchmarks: CIFAR10, CIFAR100, and ImageNet2012. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
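The LEAP idea, one shared linear combination per pooling region of a feature channel, reduces to a weighted 2x2 pooling. A sketch with fixed weights (in the paper they are learned end-to-end), showing average pooling as the equal-weight special case:

```python
def leap_pool(channel, weights):
    """LEAP-style pooling sketch: one shared 2x2 weight vector for the
    channel; each output neuron is a linear combination of its region."""
    h, w = len(channel), len(channel[0])
    out = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            region = [channel[i][j], channel[i][j + 1],
                      channel[i + 1][j], channel[i + 1][j + 1]]
            row.append(sum(wt * v for wt, v in zip(weights, region)))
        out.append(row)
    return out

feat = [[1.0, 2.0],
        [3.0, 4.0]]
# average pooling is the special case where all four weights are equal
avg = leap_pool(feat, [0.25, 0.25, 0.25, 0.25])
```

Any other weight vector, e.g. one that keeps only the top-left neuron, is equally expressible, which is the extra freedom training can exploit.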
16. Profile error evaluation of free-form surface using sequential quadratic programming algorithm.
- Author
-
Lang, Ailei, Song, Zhanjie, He, Gaiyun, and Sang, Yicun
- Subjects
- *
ERROR analysis in mathematics , *QUADRATIC programming , *LOCALIZATION theory , *LITHOGRAPHY , *LINEAR differential equations - Abstract
The profile error of a free-form surface is evaluated in this paper using a sequential quadratic programming (SQP) algorithm. First, the optimal localization model is established with the minimum zone criterion. Subsequently, the surface subdivision method or the STL (STereoLithography) model is used to compute the point-to-surface distance, and an approximate linear differential movement model of the signed distance is derived to simplify the updating of the alignment parameters. Finally, the optimization model for profile error evaluation of the free-form surface is solved with the SQP algorithm. Simulation examples indicate that the results acquired by the SQP method are closer to the ideal results than those of other algorithms when solving for the transformation parameters. In addition, real-part experiments show that the maximum distance between the measurement points and their corresponding closest points on the design model is shorter with the SQP-based algorithm. Results on a workpiece with an S form illustrate that the SQP-based profile error evaluation algorithm can dramatically reduce the number of iterations while keeping the precision of the result. Furthermore, a simulation is conducted to test the robustness of the proposed method. In short, this study proposes a new algorithm of high accuracy and low computational cost. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
17. Precursory waves and eigenfrequencies identified from acoustic emission data based on Singular Spectrum Analysis and laboratory rock-burst experiments.
- Author
-
Gong, Yuxin, Song, Zhanjie, He, Manchao, Gong, Weili, and Ren, Fuqiang
- Subjects
- *
ACOUSTIC emission , *EIGENFREQUENCIES , *SPECTRUM analysis , *ROCK bursts , *SHOCK waves - Abstract
An important task in acoustic emission (AE) monitoring is to detect the frequency shift phenomenon and intense periodic components. In the present research, we investigate the time dynamics embedded in AE signals acquired in a laboratory rock burst experiment on a limestone sample. Applying the Singular Spectrum Analysis (SSA)-based algorithm developed in this research, we reconstruct the decomposed components and then select the main component through a decision-making process based on the criterion that it be significant both in the eigenvector space and in the spectral domain, termed the eigenfrequency. The frequency shift phenomenon is consistently represented by the eigenfrequencies of the first main component. The precursory waves of the first main component represent the time dynamics of the rock burst process: elastic waves over the low-level loading phase, high-frequency waves with self-oscillating envelopes at unloading, low-frequency quasi-shock waves during the rheological delay phase, and a low-frequency shock wave at complete rock burst failure. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
18. An advanced inversion algorithm for significant wave height estimation based on random field.
- Author
-
Zhang, Shuo, Song, Zhanjie, and Li, Ying
- Subjects
- *
OCEAN waves , *MATHEMATICAL transformations , *HEIGHT measurement , *ORTHOGONAL functions , *PRINCIPAL components analysis , *STANDARD deviations - Abstract
To describe the random movement of ocean waves exactly, random field theory for local average sampling is introduced. In this work, rotated empirical orthogonal function (REOF) analysis is proposed to estimate the significant wave height (SWH) from data captured with marine X-band radar. After obtaining the rotated principal components (PCs) of the radar image sequences, the standard deviation of the rotated PC serves as a new means of estimating SWH. The results show that the SWH retrieved by radar with the advanced algorithm correlates well with buoy measurements. The root mean squared error (RMSE) of the SWH between the advanced model and the buoy is only 0.1440 m, about 22% lower than that of the EOF analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
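The estimation step, reading SWH off the standard deviation of a rotated PC, amounts to a linear calibration against buoy data. A sketch with synthetic numbers (the assumed model SWH ≈ a·std(PC) + c is illustrative, not the paper's exact regression):

```python
def std(series):
    """Population standard deviation of a PC time series."""
    m = sum(series) / len(series)
    return (sum((v - m) ** 2 for v in series) / len(series)) ** 0.5

def fit_linear(xs, ys):
    """Ordinary least squares for y ≈ a*x + c."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def rmse(pred, truth):
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)) ** 0.5

# synthetic proxies (std of a rotated PC per radar sequence) vs. buoy SWH
proxies = [0.5, 0.8, 1.1, 1.4]
buoy    = [1.0, 1.6, 2.2, 2.8]
a, c = fit_linear(proxies, buoy)
est = [a * x + c for x in proxies]
```

The quoted 0.1440 m figure is exactly an `rmse(est, buoy)`-style comparison between calibrated radar estimates and buoy measurements.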
19. Bayesian multiple measurement vector problem with spatial structured sparsity patterns.
- Author
-
Han, Ningning and Song, Zhanjie
- Subjects
- *
BAYESIAN analysis , *STATISTICAL decision making , *SPARSE graphs , *STATISTICAL correlation , *GRAPHIC methods in statistics - Abstract
A promising line of research that has drawn considerable attention is exploiting the inherent structure of sparse signals. In this work, we apply this property to the multiple measurement vector (MMV) problem, in which a group of collected sparse signals sharing an identical sparsity support is recovered from undersampled measurements. The main objective of this paper is to introduce a Bayesian model that takes both spatial and temporal dependencies into account and to show how this model can be used for MMV with spatially structured sparsity patterns. Because the beta process decomposes a sparse representation into values and sparsity indicators, the proposed algorithm captures the temporal correlation structure by learning the amplitudes and the spatial correlation structure by estimating the clustered sparsity patterns. Detailed numerical experiments on synthetic and real data demonstrate the effectiveness of the proposed algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
20. Cluster-based image super-resolution via jointly low-rank and sparse representation.
- Author
-
Han, Ningning, Song, Zhanjie, and Li, Ying
- Subjects
- *
ALGORITHM research , *DESIGN research , *VISUAL communication , *DIGITAL techniques for visual communication , *IMAGE representation - Abstract
In this paper, we propose a novel algorithm for single-image super-resolution built on the concept of a cluster, rather than a patch, as the basic unit. All patches are split into numerous subspaces, and the optimal representation problem is solved with joint low-rank and sparse regularization for each subspace. By enforcing a global consistency constraint on each subspace with nuclear-norm regularization and capturing the local linear structure of each patch with ℓ1-norm regularization, effective matching functions for test and exemplar patches can be created. Accordingly, desirable results are obtained with low computational complexity. Experimental results show that the proposed algorithm generates high-quality images in comparison with other state-of-the-art methods. [Copyright Elsevier]
- Published
- 2016
- Full Text
- View/download PDF
21. An improved IRLS algorithm for sparse recovery with intra-block correlation.
- Author
-
Lei, Yang and Song, Zhanjie
- Subjects
- *
COMPUTER algorithms , *STATISTICAL correlation , *SIGNAL processing , *ELECTROCARDIOGRAPHY , *COVARIANCE matrices , *ITERATIVE methods (Mathematics) - Abstract
The non-convex ℓ2/ℓq (0 < q < 1) minimization method can efficiently recover block-sparse signals whose non-zero coefficients occur in a few blocks. However, in many applications such as face recognition and fetal ECG monitoring, real-world signals also exhibit intra-block correlations beyond standard block sparsity. To recover such signals exactly and robustly, the block sparse Bayesian learning framework is studied in this paper. In contrast to ℓ2/ℓq norm minimization, the proposed method involves a quadratic Mahalanobis distance measure on each block and a covariance matrix modeling the intra-block correlation. An improved iteratively reweighted least-squares (IRLS) algorithm is proposed for the induced framework, going beyond the recently known algorithm for mixed ℓ2/ℓq optimization. The proposed algorithm is tested and compared with the mixed ℓ2/ℓq algorithm on a series of signals modeled by autoregressive processes. Numerical results demonstrate that the proposed algorithm outperforms the baseline, especially at low sample ratios and large unknown noise levels. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
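For intuition, classic IRLS for plain ℓq minimization (without the paper's Mahalanobis block weighting) can be written in a few lines for a single-measurement toy problem:

```python
def irls_lq(a, b, q=0.5, iters=60, eps=1e-10):
    """Classic IRLS sketch for min ||x||_q s.t. sum_i a[i]*x[i] = b.
    With one equation, each weighted least-squares step has the closed
    form x[i] = a[i]*d[i]*b / sum_j a[j]^2*d[j], where d[i] = 1/w[i]
    and w[i] = (x[i]^2 + eps)^(q/2 - 1) is the smoothed lq reweighting."""
    x = [ai * b / sum(aj * aj for aj in a) for ai in a]  # least-squares start
    for _ in range(iters):
        d = [(xi * xi + eps) ** (1 - q / 2) for xi in x]
        s = sum(ai * ai * di for ai, di in zip(a, d))
        x = [ai * di * b / s for ai, di in zip(a, d)]
    return x

# underdetermined toy: x[0] + 0.9*x[1] = 1 has the sparse solution [1, 0]
x = irls_lq([1.0, 0.9], 1.0, q=0.5)
```

Each reweighting step amplifies the already-large coefficient, driving the iterate to the sparse solution while the constraint is satisfied exactly at every step.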
22. An inexact continuation accelerated proximal gradient algorithm for low n -rank tensor recovery.
- Author
-
Liu, Huihui and Song, Zhanjie
- Subjects
- *
ALGORITHMS , *TENSOR fields , *OPERATOR theory , *MATHEMATICAL optimization , *SET theory , *LIPSCHITZ spaces - Abstract
The low n-rank tensor recovery problem is an interesting extension of compressed sensing. It consists of finding a tensor of minimum n-rank subject to linear equality constraints and arises in many areas such as data mining, machine learning, and computer vision. In this paper, an operator splitting technique and a convex relaxation technique are adapted to transform the low n-rank tensor recovery problem into a convex, unconstrained optimization problem, in which the objective function is the sum of a convex smooth function with Lipschitz continuous gradient and a convex function on a set of matrices. To solve this unconstrained nonsmooth convex optimization problem, an accelerated proximal gradient algorithm is proposed, and computational techniques are used to improve it. Preliminary numerical results demonstrate the potential value and application of the tensor formulation as well as the efficiency of the proposed algorithm. [ABSTRACT FROM PUBLISHER]
- Published
- 2014
- Full Text
- View/download PDF
23. Robust face recognition based on sparse representation in 2D Fisherface space.
- Author
-
Cheng, Guangtao and Song, Zhanjie
- Subjects
- *
HUMAN facial recognition software , *ROBUST control , *TWO-dimensional models , *MATHEMATICAL proofs , *COMPUTER algorithms , *PIXELS - Abstract
Abstract: Sparse representation has proven effective for many tasks in face recognition. In this paper, we propose an efficient face recognition algorithm via sparse representation in 2D Fisherface space. We first transform the 2D image into a 2D Fisherface in preprocessing, and classify the testing image via sparse representation in the 2D Fisherface space. We then extend the proposed method with supplementary matrices to deal with random pixel corruption. For face images with contiguous occlusion, we partition each image into blocks and define a new rule combining sparsity and reconstruction residual to discard the occluded blocks; the final result is aggregated by voting over the classification results of the valid individual blocks. Experimental results show that the proposed algorithm achieves satisfying performance in both accuracy and robustness. [Copyright Elsevier]
- Published
- 2014
- Full Text
- View/download PDF
24. Concentrative sparse representation based classification.
- Author
-
Cheng, Guangtao and Song, Zhanjie
- Subjects
- *
MATHEMATICAL proofs , *PATTERN recognition systems , *CLASSIFICATION algorithms , *PERFORMANCE evaluation , *OPTICAL measurements , *ACCURACY - Abstract
Abstract: Sparse representation has proven effective for many tasks in pattern recognition. In this paper, an efficient classification algorithm based on concentrative sparse representation is proposed to address the problem caused by insufficient training samples in each class. We first compute the representation coefficient of the testing sample over the training sample matrix using the subspace pursuit recovery algorithm. We then define a concentration measurement function to determine whether the sparse representation coefficient is concentrative, and repeatedly apply subspace pursuit to revise the sparse representation until concentration is met. Such a concentrative sparse representation contributes to discriminative residuals that are critical for accurate classification. Experimental results show that the proposed algorithm achieves satisfying performance in both accuracy and efficiency. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
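A plausible form of the concentration measurement, the fraction of ℓ1 coefficient mass captured by the best single class, can be sketched as follows; the paper's exact function may differ:

```python
def concentration(coeffs, labels):
    """Fraction of total l1 coefficient mass falling on the single best
    class (an assumed form of the concentration measurement function)."""
    total = sum(abs(c) for c in coeffs)
    if total == 0.0:
        return 0.0
    classes = set(labels)
    best = max(sum(abs(c) for c, l in zip(coeffs, labels) if l == cls)
               for cls in classes)
    return best / total

# coefficients of a test sample over training samples from two classes
coeffs = [0.9, 0.05, 0.0, 0.05]
labels = [0, 0, 1, 1]
level = concentration(coeffs, labels)   # mass concentrated on class 0
```

When `level` stays below a chosen threshold, the classifier would re-run subspace pursuit to revise the representation, as the abstract describes.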
25. An Improved Nyquist–Shannon Irregular Sampling Theorem From Local Averages.
- Author
-
Song, Zhanjie, Liu, Bei, Pang, Yanwei, Hou, Chunping, and Li, Xuelong
- Subjects
- *
IRREGULAR sampling (Signal processing) , *SAMPLING theorem , *APPROXIMATION theory , *PIECEWISE linear approximation , *EMAIL systems , *SIGNAL reconstruction , *BANDWIDTHS , *STOCHASTIC convergence - Abstract
The Nyquist–Shannon sampling theorem concerns the reconstruction of a band-limited signal from its uniformly spaced samples. The higher the signal bandwidth, the more challenging uniform sampling becomes. To deal with this problem, signal reconstruction from local averages has been studied in the literature. In this paper, we obtain an improved Nyquist–Shannon sampling theorem from general local averages; in practice, the measurement apparatus gives a weighted average over an asymmetrical interval. As a special case, for local averages over a symmetrical interval, we show that the sampling rate is much lower than that of a result by Gröchenig. Moreover, we obtain two exact dual frames from local averages, one of which improves a result by Sun and Zhou. At the end of the paper, as an example application of local average sampling, we consider a reconstruction algorithm: piecewise linear approximation. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
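A toy illustration of the local-average sampling setting described above: uniform-weight averages over symmetric intervals, followed by the piecewise linear reconstruction mentioned at the end of the abstract. The signal, sampling step and interval width are arbitrary choices, not the paper's:

```python
import numpy as np

def signal(t):
    # Toy low-frequency signal standing in for a band-limited function.
    return np.sin(2 * np.pi * 0.3 * t) + 0.5 * np.cos(2 * np.pi * 0.1 * t)

def local_averages(t_k, half_width, n_pts=201):
    """Uniform-weight average of the signal over [t_k - d, t_k + d],
    mimicking what a measurement apparatus actually records."""
    return np.array([
        signal(np.linspace(tk - half_width, tk + half_width, n_pts)).mean()
        for tk in t_k
    ])

t_k = np.arange(0.0, 10.0 + 1e-9, 0.5)       # sampling instants
avg = local_averages(t_k, half_width=0.1)    # measured local averages
t_dense = np.linspace(0.0, 10.0, 1001)
recon = np.interp(t_dense, t_k, avg)         # piecewise linear reconstruction
err = np.max(np.abs(recon - signal(t_dense)))
```

For a smooth signal like this, the reconstruction error is dominated by the squared sampling step (linear-interpolation error) plus a small averaging bias of order the squared interval half-width, so halving the step roughly quarters the error.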
26. Approximation of WKS Sampling Theorem on Random Signals.
- Author
-
He, Gaiyun and Song, Zhanjie
- Subjects
- *
SIGNAL processing , *APPROXIMATION theory , *FUNCTIONAL analysis , *TELECOMMUNICATION , *PROBABILITY theory , *ENGINEERING mathematics , *STATISTICAL sampling - Abstract
The WKS sampling theorem is a fundamental result in the field of telecommunication and signal processing, and a perfect example of the synthesis of traditionally distinct disciplines in mathematics, engineering analysis, and the sciences. The theorem recovers a bandlimited signal from its samples at uniformly spaced points; its purpose is to analyze deterministic functions, or deterministic signals. However, few counterparts of this theorem exist for random signals so far. This article gives a new WKS sampling principle for random signals from local averages, which holds with probability 1. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
27. Approximation of signals from local averages
- Author
-
Song, Zhanjie, Yang, Shouyuan, and Zhou, Xingwei
- Subjects
- *
ERROR analysis in mathematics , *MATHEMATICAL statistics , *INSTRUMENTAL variables (Statistics) , *NUMERICAL analysis - Abstract
Abstract: This work is concerned with the approximation of a signal from local averages. It improves a result of Butzer and Lei [P.L. Butzer, J. Lei, Approximation of signals using measured sampled values and error analysis, Commun. Appl. Anal. 4 (2000) 245–255]. [Copyright Elsevier]
- Published
- 2006
- Full Text
- View/download PDF
28. Stability of neutral-type neural network with Lévy noise and mixed time-varying delays.
- Author
-
Cui, Kaiyan, Song, Zhanjie, and Zhang, Shuo
- Subjects
- *
EXPONENTIAL stability , *LINEAR matrix inequalities , *NOISE , *PSYCHOLOGICAL feedback - Abstract
In this paper, stability and stabilization are considered for a neutral-type neural network with Lévy noise and mixed time-varying delays. By employing a class of appropriate Lyapunov functionals, the mean square exponential stability of the neutral-type neural network with Lévy noise and mixed time-varying delays can be analyzed effectively. Based on the linear matrix inequality (LMI) technique, sufficient conditions are presented to ensure the mean square exponential stability of the system. For the case where the system is unstable, a feedback controller is designed to stabilize it, and the corresponding LMI conditions are given. Finally, two numerical examples show the validity of the obtained results. • A class of Lyapunov functionals is employed to analyze the mean square exponential stability of the neutral-type neural network. • Sufficient conditions for the mean square exponential stability of the system are presented. • A feedback controller is designed to stabilize the system, and the corresponding LMI conditions are given. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
29. Pointwise Estimate for Exponential-Type Operators.
- Author
-
Song, Zhanjie, Liu, Xiwu, and Guo, Shunsheng
- Subjects
- *
LINEAR operators , *MODULES (Algebra) , *EXPONENTS - Abstract
Abstract. The purpose of this paper is to derive the pointwise estimate for three general exponential-type operators by using new Ditzian-Totik modulus. [ABSTRACT FROM AUTHOR]
- Published
- 2000
- Full Text
- View/download PDF
30. Automatic identification of atrial fibrillation based on the modified Elman neural network with exponential moving average algorithm.
- Author
-
Song, Zhanjie and Wang, Jibin
- Subjects
- *
MOVING average process , *ALGORITHMS , *AUTOMATIC identification , *COMPUTER-aided diagnosis , *ATRIAL fibrillation , *ATRIAL arrhythmias , *PHYSICIANS - Abstract
Atrial fibrillation is the most common arrhythmia. Early and accurate detection is critical for treatment and for limiting the progression of this disease. Visual examination of electrocardiogram signals is the most extensively used diagnostic approach, but it is cumbersome and inefficient. In this work, we propose an intelligent network model based on a modified Elman neural network for signal discrimination. Motivated by the exponential moving average strategy, the proposed model is capable of fully modeling the information feedback and of effectively and efficiently striking a balance between the current and historical information representations of the original Elman neural network. To evaluate its practicability, the model is also plugged into a convolutional neural network framework, and two control subjects are established for a fair comparison. Experiments on the MIT-BIH atrial fibrillation and arrhythmia databases show that the proposed model enjoys a consistent improvement in classification performance, with accuracies of 98.2% and 97.2% respectively, and exhibits a lower convergence rate than the existing Elman network. Thanks to its high performance, we plan to develop the model into a computer-aided diagnosis system to assist physicians. • We design a modified Elman network (MENN) for atrial fibrillation (AF) detection. • Patient-independent validation ensures the model's robustness. • Separate feature extraction and classification steps are not required. • To our knowledge, this is the first time the ENN has been redesigned for AF detection. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
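The abstract's core idea, feeding back an exponential moving average of hidden states rather than only the previous state, can be sketched as a single recurrent cell. The dimensions, initialization and beta value here are illustrative; this is not the authors' exact MENN architecture:

```python
import numpy as np

class EMAElmanCell:
    """Elman-style recurrent cell whose context unit is an exponential
    moving average of past hidden states; beta balances historical vs.
    current information in the feedback."""

    def __init__(self, n_in, n_hidden, beta=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.1 * rng.standard_normal((n_hidden, n_in))     # input weights
        self.U = 0.1 * rng.standard_normal((n_hidden, n_hidden)) # context weights
        self.b = np.zeros(n_hidden)
        self.beta = beta

    def run(self, xs):
        context = np.zeros_like(self.b)
        hs = []
        for x in xs:
            h = np.tanh(self.W @ x + self.U @ context + self.b)
            # EMA feedback: blend old context with the new hidden state.
            context = self.beta * context + (1 - self.beta) * h
            hs.append(h)
        return np.array(hs)
```

With beta close to 1 the context changes slowly and emphasizes history; beta = 0 recovers the plain Elman feedback of the previous hidden state.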
31. Ambiguousness-Aware State Evolution for Action Prediction.
- Author
-
Chen, Lei, Lu, Jiwen, Song, Zhanjie, and Zhou, Jie
- Subjects
- *
AMBIGUITY , *ACTIVE learning , *GENERATIVE adversarial networks , *CLASS actions , *FORECASTING , *SUPERVISED learning - Abstract
In this paper, we propose an ambiguousness-aware state evolution (AASE) method which represents the uncertainty of the input sequence and evolves the subsequent skeletons to generate a reasonable full-length sequence for action prediction. Unlike most existing methods, which supervise partial sequences with the labels of full-length videos and ignore the semantic information of the subsequent action, we develop an evolution method that predicts the instructional actions and generates reasonable candidate subsequent actions, so that the ambiguity of supervising partial actions with the full sequence's label is effectively alleviated. Our method generates rational subsequent actions under the instructional action class to complement the partially observed action sequence. We design two criteria for rational generation: 1) the instruction of the subsequent action keeps semantic consistency with the observed sequence; 2) the generated sequence matches the distribution of real data sequences. Moreover, we design an uncertainty module to decide the instructional action class for the generation network. AASE predicts instructional actions with uncertainty learning and evolves different instructional actions by generating the subsequent skeletons, finding the most probable action to represent the partially observed action by learning to perceive the tendency of the ongoing action. We conduct experiments on seven widely used action datasets: NTU-60, NTU-120, UCF101, UT-Interaction, BIT, PKU-MMD and HMDB51, and our experimental results clearly demonstrate that our method achieves very competitive performance with the state of the art. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
32. A Hybrid Network for Large-Scale Action Recognition from RGB and Depth Modalities.
- Author
-
Wang, Huogen, Song, Zhanjie, Li, Wanqing, and Wang, Pichao
- Subjects
- *
CONVOLUTIONAL neural networks , *RECURRENT neural networks , *SUPPORT vector machines , *FEED analysis - Abstract
The paper presents a novel hybrid network for large-scale action recognition from multiple modalities. The network is built upon the proposed weighted dynamic images. It effectively leverages the strengths of the emerging Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approaches to specifically address the challenges that occur in large-scale action recognition and are not fully dealt with by the state-of-the-art methods. Specifically, the proposed hybrid network consists of a CNN based component and an RNN based component. Features extracted by the two components are fused through canonical correlation analysis and then fed to a linear Support Vector Machine (SVM) for classification. The proposed network achieved state-of-the-art results on the ChaLearn LAP IsoGD, NTU RGB+D and Multi-modal & Multi-view & Interactive (M2I) datasets and outperformed existing methods by a large margin (over 10 percentage points in some cases). [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
33. Bayesian Robust Principal Component Analysis with Adaptive Singular Value Penalty.
- Author
-
Cui, Kaiyan, Wang, Guan, Song, Zhanjie, and Han, Ningning
- Subjects
- *
MONTE Carlo method , *PRINCIPAL components analysis , *PATTERN recognition systems , *MARKOV processes , *IMAGE processing , *DIMENSION reduction (Statistics) - Abstract
Robust principal component analysis (RPCA) has recently seen ubiquitous activity for dimensionality reduction in image processing, visualization and pattern recognition. Conventional RPCA methods model the low-rank component by regularizing each singular value equally. However, in numerous modern applications each singular value has a different physical meaning and should be treated differently. This is one of the main reasons why RPCA techniques cannot work well on many realistic problems. To solve this problem, a novel hierarchical Bayesian RPCA model with an adaptive singular value penalty is proposed. This model enforces the low-rank constraint by introducing an adaptive penalty function on the singular values of the low-rank component. In particular, we impose a hierarchical Exponent-Gamma prior on the singular values of the low-rank component and a Beta-Bernoulli prior on the sparsity indicators. The variational Bayesian framework and Markov chain Monte Carlo-based Bayesian inference are considered for inferring the posteriors of all latent variables involved in the low-rank and sparse components. Numerical experiments demonstrate the competitive performance of the proposed model on synthetic and real data. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
34. The convergence guarantee of the iterative hard thresholding algorithm with suboptimal feedbacks for large systems.
- Author
-
Han, Ningning, Li, Shidong, and Song, Zhanjie
- Subjects
- *
THRESHOLDING algorithms - Abstract
Thresholding-based iterative algorithms have a trade-off between effectiveness and optimality. Some are effective but involve sub-matrix inversions at every iteration. For systems of large size, such algorithms can be computationally expensive or prohibitive. The null space tuning algorithm with hard thresholding and feedbacks (NST+HT+FB) has a means to expedite its procedure by a suboptimal feedback, in which the sub-matrix inversion is replaced by an eigenvalue-based approximation. The resulting suboptimal feedback scheme becomes exceedingly effective for large system recovery problems. An adaptive algorithm based on thresholding, suboptimal feedback and null space tuning (AdptNST+HT+subOptFB) that requires no prior knowledge of the sparsity level is also proposed and analyzed. Convergence analysis is the focus of this article. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
35. A novel robust principal component analysis method for image and video processing.
- Author
-
Huan, Guoqiang, Li, Ying, and Song, Zhanjie
- Subjects
- *
ROBUST statistics , *PRINCIPAL components analysis , *IMAGE processing , *VIDEO processing , *ERROR analysis in mathematics , *MARKOV processes - Abstract
Research on robust principal component analysis has been attracting much attention recently. Generally, the model assumes sparse noise and characterizes the error term by the ℓ1-norm. However, the sparse noise has a clustering effect in practice, so simply using the ℓ1-norm is not appropriate for modeling. In this paper, we propose a novel method based on sparse Bayesian learning principles and Markov random fields. The method is proved to be very effective for low-rank matrix recovery and contiguous outlier detection, by enforcing the low-rank constraint in a matrix factorization formulation and incorporating the contiguity prior as a sparsity constraint. Experiments on both synthetic data and some practical computer vision applications show that the novel method proposed in this paper is competitive when compared with other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
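The low-rank-plus-sparse decomposition that these RPCA variants build on can be sketched with the classical principal component pursuit iteration: an inexact augmented Lagrangian loop alternating singular-value and entrywise soft-thresholding. Note this is the plain nuclear-norm/ℓ1 baseline, not the sparse-Bayesian/MRF method of the paper; parameter choices follow common defaults:

```python
import numpy as np

def svd_shrink(X, tau):
    """Soft-threshold the singular values of X by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    """Entrywise soft-thresholding."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca_ialm(M, lam=None, tol=1e-7, max_iter=200):
    """Decompose M ~ L (low rank) + S (sparse) by inexact ALM for
    principal component pursuit."""
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    norm2 = np.linalg.svd(M, compute_uv=False)[0]
    Y = M / max(norm2, np.abs(M).max() / lam)   # dual variable init
    mu, rho = 1.25 / norm2, 1.5
    S = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_shrink(M - S + Y / mu, 1.0 / mu)
        S = soft(M - L + Y / mu, lam / mu)
        resid = M - L - S
        Y = Y + mu * resid
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(resid) <= tol * np.linalg.norm(M):
            break
    return L, S
```

In the standard recovery regime (low rank, a small fraction of large sparse corruptions) this converges in a few dozen iterations.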
36. Bayesian robust principal component analysis with structured sparse component.
- Author
-
Han, Ningning, Song, Yumeng, and Song, Zhanjie
- Subjects
- *
BAYESIAN analysis , *MULTIPLE correspondence analysis (Statistics) , *ROBUST statistics , *ESTIMATION theory , *LATENT variables - Abstract
Robust principal component analysis (RPCA) refers to the decomposition of an observed matrix into a low-rank component and a sparse component. Conventional methods model the sparse component as pixel-wise sparse (e.g., using the ℓ1-norm for the sparsity). However, in many practical scenarios, elements in the sparse part are not truly independently sparse but are distributed with contiguous structures. This is the reason why representative RPCA techniques fail to work well in realistic complex situations. To solve this problem, a Bayesian framework for RPCA with a structured sparse component is proposed, where both the amplitude and the support correlation structure are considered simultaneously in recovering the sparse component. The model learning is based on variational Bayesian inference, which can potentially be applied to estimate the posteriors of all latent variables. Experimental results on synthetic and real data validate the proposed methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
37. Error evaluation of free-form surface based on distance function of measured point to surface.
- Author
-
He, Gaiyun, Zhang, Mei, and Song, Zhanjie
- Subjects
- *
ERROR analysis in mathematics , *SURFACES (Technology) , *MATHEMATICAL functions , *ALGORITHMS , *DIFFERENTIAL evolution - Abstract
As free-form surfaces are widely used in engineering, advanced methodology for detecting and evaluating profile error is urgently needed. To this end, the semantics of profile tolerance in ASME Y14.5.1M are reviewed and the mathematical definition of profile tolerance is discussed. Subsequently, a mathematical model for error evaluation is built. This model is augmented, based on the distance function, by considering the second-order terms in the computation of the distance from a point to the surface. Then, a profile error evaluation algorithm combining the Differential Evolution (DE) and Nelder–Mead (NM) algorithms is developed to solve this model. The proposed model and optimization algorithm are validated with simulation results from a case study. Additionally, the model is superior to the least-squares (LS) model in simplicity, efficiency and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
38. Depth recovery and refinement from a single image using defocus cues.
- Author
-
Tang, Chang, Hou, Chunping, and Song, Zhanjie
- Subjects
- *
IMAGE processing , *IMAGE denoising , *IMAGE converters , *PARAMETER estimation , *TEXTURE analysis (Image processing) - Abstract
In this paper, we present a technique to recover and refine the depth map from a single image captured by a conventional camera. Our method builds on the universal imaging principle: only scene content at the focus distance converges to a single sharp point on the imaging sensor, while other content yields blur effects that vary with its distance from the camera lens. We first estimate depth values at edge locations via spectrum contrast and then recover the full depth map using a depth matting optimization method. Because some blurred textures, such as soft shadows or blur patterns, produce ambiguous results during depth estimation, we use a total variation-based image smoothing method to smooth the original image, generating a smoothed image with detailed texture suppressed. Taking this smoothed image as a reference, a guided filter is used to refine the final depth map. [ABSTRACT FROM PUBLISHER]
- Published
- 2015
- Full Text
- View/download PDF
40. Efficient iterative thresholding algorithms with functional feedbacks and null space tuning.
- Author
-
Han, Ningning, Li, Shidong, and Song, Zhanjie
- Subjects
- *
THRESHOLDING algorithms , *GREEDY algorithms , *PSYCHOLOGICAL feedback , *LENGTH measurement , *ACCELERATED life testing , *COMPRESSED sensing - Abstract
• An accelerated class of adaptive iterative thresholding schemes based on the feedback mechanism of null space tuning techniques is studied. • An accelerated convergence rate and improved convergence conditions are obtained by selecting an appropriate size of the index support per iteration. • The theoretical findings are sufficiently demonstrated and confirmed by extensive numerical experiments. An accelerated class of adaptive iterative thresholding algorithms is studied analytically and empirically. They are based on the feedback mechanism of the null space tuning techniques. The main contribution of this article is the accelerated convergence analysis and proofs with variable/adaptive index selection and different feedback principles at each iteration. The convergence analysis no longer requires a priori knowledge of the sparsity level s of a signal. It is shown that uniform recovery of all s-sparse signals from given linear measurements can be achieved under reasonable (preconditioned) restricted isometry conditions. An accelerated convergence rate and improved convergence conditions are obtained by selecting an appropriate size of the index support per iteration. The theoretical findings are sufficiently demonstrated and confirmed by extensive numerical experiments. It is also observed that the proposed algorithms achieve a clearly advantageous balance of efficiency, adaptivity and accuracy compared with other state-of-the-art greedy iterative algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
41. Pointwise Estimates for Linear Combinations of Gamma Operators.
- Author
-
Qi, Qiulan, Guo, Shunsheng, Song, Zhanjie, and Liu, Lixia
- Subjects
- *
OPERATOR theory , *MODULI theory - Abstract
Presents an equivalence theorem for linear combinations of Gamma operators in terms of Ditzian–Totik moduli of smoothness; includes theorems, lemmas and proofs.
- Published
- 2002
42. Human Parsing With Pyramidical Gather-Excite Context.
- Author
-
Zhang, Sanyi, Qi, Guo-Jun, Cao, Xiaochun, Song, Zhanjie, and Zhou, Jie
- Subjects
- *
PROBLEM solving , *MULTISCALE modeling , *SOURCE code , *HUMAN beings - Abstract
Human parsing, especially in the wild, has attracted a lot of attention due to its great potential in many real-world applications. The Pyramid Spatial Parsing (PSP) module has shown superior performance in scene and human parsing tasks. However, the basic AvgPool operation in PSP aggregates the spatial clues of a local region equally, and thus mixes up the influences of the different human parts presented in this region. It results in failures in capturing useful contexts relevant to parsing different parts. To address this problem, a suitable mechanism to collect spatial clues aligned with different human parts is proposed in this paper. We employ a Gather-Excite (GE) operation, a replacement of the AvgPool-Upsample operation in a pyramidical structure, to accurately reflect relevant human parts of various scales. The GE operation contains two steps: the gather operation, which adaptively aggregates spatial clues to relevant human parts, and the excite operation, which generates new feature maps with the gathered contextual information. This results in a novel Pyramidical Gather-Excite Context (PGEC) module to solve the multi-scale problem and parse persons at various scales. The PGEC module is composed of multiple GE operations with different spatial extents and aggregates local and global spatial clues in parallel for better modeling of multi-scale contextual information. Moreover, we integrate the PGEC module with fine-grained details, an edge-preserving module and deep supervision to formulate a novel PGEC Network (PGECNet) for human parsing. The proposed PGECNet has achieved state-of-the-art performance on four single-person human parsing datasets (i.e., LIP, PPSS, ATR and Fashion Clothing) and two multi-person human parsing datasets (i.e., PASCAL-Person-Part and CIHP). The experimental results show that the proposed PGEC is superior to the PSP and ASPP modules, especially in the single-person parsing task. The source code is publicly available at https://github.com/31sy/PGECNet. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
43. Dual-Resolution Dual-Path Convolutional Neural Networks for Fast Object Detection.
- Author
-
Pan, Jing, Sun, Hanqing, Song, Zhanjie, and Han, Jungong
- Subjects
- *
SPINE , *ROBOTICS , *PIPELINES , *DETECTORS , *VISION - Abstract
Downsampling input images is a simple trick to speed up visual object-detection algorithms, especially on robotic vision and applied mobile vision systems. However, this trick comes with a significant decline in accuracy. In this paper, dual-resolution dual-path Convolutional Neural Networks (CNNs), named DualNets, are proposed to bump up the accuracy of those detection applications. In contrast to previous methods that simply downsample the input images, DualNets explicitly take dual inputs in different resolutions and extract complementary visual features from these using dual CNN paths. The two paths in a DualNet are a backbone path and an auxiliary path that accepts larger inputs and then rapidly downsamples them to relatively small feature maps. With the help of the carefully designed auxiliary CNN paths in DualNets, auxiliary features are extracted from the larger input with controllable computation. Auxiliary features are then fused with the backbone features using a proposed progressive residual fusion strategy to enrich feature representation. This architecture, as the feature extractor, is further integrated with the Single Shot Detector (SSD) to accomplish latency-sensitive visual object-detection tasks. We evaluate the resulting detection pipeline on Pascal VOC and MS COCO benchmarks. Results show that the proposed DualNets can raise the accuracy of those CNN detection applications that are sensitive to computation payloads. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
44. Watch fashion shows to tell clothing attributes.
- Author
-
Zhang, Sanyi, Liu, Si, Cao, Xiaochun, Song, Zhanjie, and Zhou, Jie
- Subjects
- *
CLOTHING & dress , *FASHION shows , *QUALITY (Aesthetics) , *ARTIFICIAL neural networks , *PREDICTION models - Abstract
In this paper, we propose a novel semi-supervised method to predict clothing attributes with the assistance of unlabeled data such as fashion shows. To this end, a two-stage framework is built: an unsupervised triplet network pre-training stage, which ensures that frames in the same video have coherent representations while frames from different videos have larger feature distances, and a supervised clothing attribute prediction stage to estimate the values of the attributes. Specifically, we first detect the clothes in frames of the collected 18,737 female fashion shows and 21,224 male fashion shows, which carry no extra labels. Then a triplet neural network is constructed by embedding the temporal appearance consistency between frames in the same video and the representation gap between different videos. Finally, we transfer the triplet model parameters to a multi-task clothing attribute prediction model and fine-tune it with clothing images holding attribute labels. Extensive experiments demonstrate the advantages of the proposed method on two clothing datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
45. Stereoscopic image quality assessment method based on binocular combination saliency model.
- Author
-
Liu, Yun, Yang, Jiachen, Meng, Qinggang, Lv, Zhihan, Song, Zhanjie, and Gao, Zhiqun
- Subjects
- *
STEREOSCOPE , *IMAGE quality analysis , *BINOCULAR vision , *COMPUTATIONAL complexity , *INFORMATION theory , *METRIC spaces - Abstract
The objective quality assessment of stereoscopic images plays an important role in three-dimensional (3D) technologies. In this paper, we propose an effective method to evaluate the quality of stereoscopic images that are afflicted by symmetric distortions. The major technical contribution of this paper is that the binocular combination behaviors and human 3D visual saliency characteristics are both considered. In particular, a new 3D saliency map is developed, which not only greatly reduces the computational complexity by avoiding calculation of the depth information, but also assigns appropriate weights to the image contents. Experimental results indicate that the proposed metric not only significantly outperforms conventional 2D quality metrics, but also achieves higher performance than the existing 3D quality assessment models. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
46. A perceptual stereoscopic image quality assessment model accounting for binocular combination behavior.
- Author
-
Yang, Jiachen, Liu, Yun, Gao, Zhiqun, Chu, Rongrong, and Song, Zhanjie
- Subjects
- *
IMAGE quality analysis , *THREE-dimensional imaging , *DEPTH perception , *IMAGE processing , *STEREOSCOPIC views - Abstract
Stereoscopic image quality assessment (SIQA) plays an important role in the development of 3D image processing. In this paper, a full-reference objective SIQA model is built based on a binocular summation channel and a binocular difference channel. In our framework, binocular combination behavior and the way depth perception is experienced are thought to be the key factors in evaluating the quality of stereoscopic images. Differing from current depth map methods, this method focuses on a new aspect, and an effective combination model is proposed based on physiological findings about the Human Visual System (HVS). Experimental results demonstrate that the proposed quality assessment metric significantly outperforms existing metrics and can achieve higher consistency with subjective quality assessment when predicting the quality of stereoscopic images that have been symmetrically distorted. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
47. Balance between object and background: Object-enhanced features for scene image classification.
- Author
-
Ji, Zhong, Wang, Jing, Su, Yuting, Song, Zhanjie, and Xing, Shikai
- Subjects
- *
CLASSIFICATION , *IMAGE analysis , *PERFORMANCE evaluation , *HISTOGRAMS , *FEATURE selection , *VISUAL perception - Abstract
Abstract: An unsupervised object-enhanced feature generation mechanism is proposed, which balances the different effects of object regions and background regions for scene image classification. The proposed method strengthens the characteristics of the whole image with object regions in a biased way and accords with the human perception mechanism. Furthermore, it can easily be embedded in the extraction process of existing prominent histogram-based feature representations, such as BOW (bag-of-visual-words) and HOG (histogram of oriented gradients). The paper takes the BOW feature as the primary example and presents a feature named OE-BOW. The proposed method increases classification accuracy by about 1.0–2.5% over the original feature on some popular scene datasets, and its performance is comparable to recently reported results. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF