53 results on '"Hanqing Lu"'
Search Results
2. A new scene breakpoint detection algorithm using slice of video stream
- Author
-
Weixin Kong, Yao Ren, and Hanqing Lu
- Published
- 1998
3. Local Structure Preserving Based Subspace Analysis Methods and Applications
- Author
-
Jian Cheng and Hanqing Lu
- Subjects
Probabilistic latent semantic analysis, business.industry, Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Probabilistic logic, Pattern recognition, Latent Dirichlet allocation, Facial recognition system, symbols.namesake, ComputingMethodologies_PATTERNRECOGNITION, Kernel (image processing), Local consistency, symbols, Artificial intelligence, Cluster analysis, business, Subspace topology
- Abstract
Subspace analysis is an effective approach for image representation. Local structure preserving has been widely adopted to learn a subspace that reflects the intrinsic attributes of samples. In this chapter, inspired by the idea of local structure preserving, we propose two novel subspace methods for face recognition and image clustering tasks. The first, named Supervised Kernel Locality Preserving Projections (SKLPP), targets the face recognition task: geometric relations are preserved according to prior class-label information, and complex nonlinear variations of real face images are represented by a nonlinear kernel mapping. The second is a novel probabilistic topic model for the image clustering task, named Dual Local Consistency Probabilistic Latent Semantic Analysis (DLC-PLSA). The proposed DLC-PLSA model can learn an effective and robust mid-level representation in the latent semantic space for image analysis. As our model considers both the local image structure and local word consistency simultaneously when estimating the probabilistic topic distributions, the image representations can have more powerful description ability in the learned latent semantic space. Extensive experiments on face recognition and image clustering show that the proposed subspace analysis methods are promising.
- Published
- 2014
4. Learning Binary Codes with Bagging PCA
- Author
-
Ting Yuan, Hanqing Lu, Cong Leng, Jian Cheng, and Xiao Bai
- Subjects
business.industry, Hash function, Short Code, Pattern recognition, Linear code, Projection method, Leverage (statistics), Binary code, Artificial intelligence, business, computer, Eigendecomposition of a matrix, Eigenvalues and eigenvectors, Mathematics, computer.programming_language
- Abstract
For eigendecomposition-based hashing approaches, the information captured in different dimensions is unbalanced, and most of it is typically contained in the top eigenvectors. This often leads to the unexpected phenomenon that a longer code does not necessarily yield better performance. This paper attempts to leverage the bootstrap sampling idea and integrate it with PCA, resulting in a new projection method called Bagging PCA, in order to learn effective binary codes. Specifically, a small fraction of the training data is randomly sampled to learn the PCA directions each time, and only the top eigenvectors are kept to generate one piece of short code. This process is repeated several times and the obtained short codes are concatenated into one piece of long code. By considering each piece of short code as a "super-bit", the whole process is closely connected with the core idea of LSH. Both theoretical and experimental analyses demonstrate the effectiveness of the proposed method.
- Published
- 2014
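The sample-project-concatenate loop described in the Bagging PCA abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`top_direction`, `bagging_pca_fit`, `encode`) are hypothetical, each piece of short code is reduced to a single bit taken from the leading principal direction (the abstract keeps several top eigenvectors per piece), the leading direction is found by plain power iteration, and the subsample is drawn without replacement.

```python
import random


def top_direction(rows, iters=100):
    """Return (mean, v): the sample mean and the leading principal
    direction of `rows`, estimated by power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0 / d ** 0.5] * d  # unit-length starting vector
    for _ in range(iters):
        # w = X^T (X v), which is proportional to (covariance matrix) * v
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # degenerate sample with no variance along v
            break
        v = [x / norm for x in w]
    return mean, v


def bagging_pca_fit(data, n_pieces=4, sample_frac=0.5, seed=0):
    """Learn one (mean, direction) pair per random subsample of the data."""
    rng = random.Random(seed)
    k = max(2, int(sample_frac * len(data)))
    # Assumption: subsampling without replacement stands in for the bootstrap.
    return [top_direction(rng.sample(data, k)) for _ in range(n_pieces)]


def encode(x, model):
    """Concatenate one bit per piece: the sign of the projection of x
    onto that piece's direction."""
    bits = []
    for mean, v in model:
        score = sum((x[j] - mean[j]) * v[j] for j in range(len(x)))
        bits.append(1 if score >= 0 else 0)
    return bits


# Toy usage on synthetic 2-D points (made-up data, for illustration only).
data = [[float(i), float(i + (i % 3))] for i in range(20)]
model = bagging_pca_fit(data, n_pieces=4)
code = encode(data[0], model)
```

Because every piece is learned from a different subsample, each one again draws on a top eigenvector, which is the imbalance the abstract says single-pass eigendecomposition hashing suffers from.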
5. Collaborative Tracking: Dynamically Fusing Short-Term Trackers and Long-Term Detector
- Author
-
Guibo Zhu, Changsheng Li, Hanqing Lu, and Jinqiao Wang
- Subjects
BitTorrent tracker, business.industry, Computer science, Frame (networking), Detector, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Tracking (particle physics), Object (computer science), Term (time), Trajectory, Object detector, Computer vision, Artificial intelligence, business
- Abstract
This paper addresses the problem of long-term tracking of an unknown object in a video stream, given only its location in the first frame and no other information. The problem is very challenging because of factors such as frame cuts, sudden appearance changes, and long-lasting occlusions. We propose a novel collaborative tracking framework that fuses short-term trackers and a long-term object detector. The short-term trackers consist of a frame-to-frame tracker and a weakly supervised tracker, which is updated using weakly supervised information and re-initialized by the long-term detector when the trackers fail. Additionally, the short-term trackers provide multiple instance samples on the object trajectory for training the long-term detector with bag samples under P-N constraints. Comprehensive experiments and comparisons demonstrate that our approach achieves better performance than the state-of-the-art methods.
- Published
- 2013
6. Modeling Hidden Topics with Dual Local Consistency for Image Analysis
- Author
-
Jian Cheng, Hanqing Lu, and Peng Li
- Subjects
Topic model, Probabilistic latent semantic analysis, business.industry, Probabilistic logic, computer.software_genre, Latent Dirichlet allocation, symbols.namesake, Semantic similarity, Expectation–maximization algorithm, symbols, Local consistency, Artificial intelligence, Data mining, business, computer, Natural language processing, Mathematics, Semantic gap
- Abstract
Image representation is a crucial component of image analysis and understanding. However, the widely used low-level features cannot correctly represent the high-level semantic content of images in many situations due to the "semantic gap". To bridge this gap, in this brief we present a novel topic model that can learn an effective and robust mid-level representation in the latent semantic space for image analysis. In our model, the l1-graph is constructed to model the local image neighborhood structure, and the word co-occurrence is computed to capture the local word consistency. Then, the local information is incorporated into the model for topic discovery. Finally, the generalized EM algorithm is used to estimate the parameters. As our model considers both the local image structure and local word consistency simultaneously when estimating the probabilistic topic distributions, the image representations can have more powerful description ability in the learned latent semantic space. Extensive experiments on publicly available databases demonstrate the effectiveness of our approach.
- Published
- 2013
7. Object Categorization Using Local Feature Context
- Author
-
Jing Liu, Hanqing Lu, Chunjie Zhang, and Tao Sun
- Subjects
business.industry, Computer science, Novel object, Scale-invariant feature transform, Pattern recognition, Boosting methods for object categorization, Machine learning, computer.software_genre, Categorization, Discriminative model, Bag-of-words model in computer vision, Visual Word, Artificial intelligence, Invariant (mathematics), business, computer
- Abstract
Recently, the use of context has been proven very effective for object categorization. However, most of the researchers only used context information at the visual word level without considering the context information of local features. To tackle this problem, in this paper, we propose a novel object categorization method by considering the local feature context. Given a position in an image, to represent this position’s visual information, we use the local feature on this position as well as other local features based on their distances and angles to this position. The use of local feature context is more discriminative and is also invariant to rotation and scale change. The local feature context can then be combined with the state-of-the-art methods for object categorization. Experimental results on the UIUC-Sports dataset and the Caltech-101 dataset demonstrate the effectiveness of the proposed method.
- Published
- 2013
8. Efficient Clothing Retrieval with Semantic-Preserving Visual Phrases
- Author
-
Hanqing Lu, Jianlong Fu, Min Xu, Jinqiao Wang, and Zechao Li
- Subjects
Structure (mathematical logic), Computer science, business.industry, Process (computing), Inverted index, Clothing, computer.software_genre, Constraint (information theory), Discriminative model, Artificial Intelligence & Image Processing, Segmentation, Visual Word, Artificial intelligence, business, computer, Natural language processing
- Abstract
In this paper, we address the problem of large scale cross-scenario clothing retrieval with semantic-preserving visual phrases (SPVP). Since the human parts are important cues for clothing detection and segmentation, we firstly detect human parts as the semantic context, and refine the regions of human parts with sparse background reconstruction. Then, the semantic parts are encoded into the vocabulary tree under the bag-of-visual-word (BOW) framework, and the contextual constraint of visual words among different human parts is exploited through the SPVP. Moreover, the SPVP is integrated into the inverted index structure for accelerating the retrieval process. Experiments and comparisons on our clothing dataset indicate that the SPVP significantly enhances the discriminative power of local features with a slight increase of memory usage or runtime consumption compared to the BOW model. Therefore, the approach is superior to both the state-of-the-art approach and two clothing search engines. © 2013 Springer-Verlag.
- Published
- 2013
9. Hierarchical Object Representations for Visual Recognition via Weakly Supervised Learning
- Author
-
Rui Cai, Tianzhu Zhang, Hanqing Lu, Zhiwei Li, and Lei Zhang
- Subjects
genetic structures, business.industry, Supervised learning, Pattern recognition, Pascal (programming language), Visual recognition, Form perception, Bounding overwatch, Visual patterns, Artificial intelligence, business, computer, computer.programming_language, Mathematics
- Abstract
In this paper, we propose a weakly supervised approach to learn hierarchical object representations for visual recognition. The learning process is carried out in a bottom-up manner to discover latent visual patterns in multiple scales. To relieve the disturbance of complex backgrounds in natural images, bounding boxes of foreground objects are adopted as weak knowledge in the learning stage to promote those visual patterns which are more related to the target objects. The difference between the patterns of foreground objects and backgrounds is relatively vague at low-levels, but becomes more distinct along with the feature transformations to high-levels. In the test stage, an input image is verified against the learnt patterns level-by-level, and the responses at each level construct a hierarchy of representations which indicates the occurring possibilities of the target object at various scales. Experiments on two PASCAL datasets showed encouraging results for visual recognition.
- Published
- 2013
10. Weighted Interaction Force Estimation for Abnormality Detection in Crowd Scenes
- Author
-
Hanqing Lu, Jinqiao Wang, Jing Liu, Xiaobin Zhu, and Wei Fu
- Subjects
business.industry, Computer science, Latent Dirichlet allocation, symbols.namesake, Discriminative model, Frequency domain, Video tracking, Social force model, symbols, Computer vision, Visual Word, Artificial intelligence, business, Coding (social sciences), Abnormality detection
- Abstract
In this paper, we propose a weighted interaction force estimation in a social force model (SFM)-based framework, in which the properties of surrounding individuals, in terms of motion consistency, distance apart, and angle-of-view along moving directions, are fully utilized to discriminate more precisely between normal and abnormal crowd behaviors. To avoid the challenges of object tracking in crowded videos, we first perform particle advection to capture the continuity of crowd flow and use these moving particles as individuals for the interaction force estimation. For a more reasonable interaction force estimation, we jointly consider the properties of surrounding individuals, assuming that individuals with consistent motion (as a particle group) and those outside the angle-of-view have no influence on each other, and that individuals farther apart have weaker influence. In particular, particle groups are clustered by a spectral clustering algorithm, which uses a novel and highly discriminative gait feature in the frequency domain, combined with spatial and motion features. The estimated interaction forces are mapped to image span to form force flow, from which bag-of-word features are extracted. A Sparse Topical Coding (STC) model is used to find abnormal events. Experiments conducted on three datasets demonstrate the promising performance of our work against other related ones.
- Published
- 2013
11. A Weighted One Class Collaborative Filtering with Content Topic Features
- Author
-
Hanqing Lu, Xi Zhang, Ting Yuan, Jian Cheng, and Qingshan Liu
- Subjects
Topic model, Class (computer programming), Information retrieval, business.industry, Computer science, Recommender system, Missing data, Machine learning, computer.software_genre, Weighting, User experience design, Collaborative filtering, Feature (machine learning), Artificial intelligence, business, computer
- Abstract
A task that naturally emerges in recommender systems is to improve user experience through personalized recommendations based on users' implicit feedback, as in news recommendation and scientific paper recommendation. Recommendation from implicit feedback is mostly treated as One Class Collaborative Filtering (OCCF), in which only positive examples can be observed and the majority of the data are missing. Introducing weights for treating missing data as negatives has been shown to help in OCCF. However, existing weighting approaches mainly use the statistical properties of the feedback to determine the weights, which is not very reasonable and is not personalized for each user-item pair. In this paper, we propose to improve recommendation by using the rich user and item content information to assist in weighting the unknown data in OCCF. To incorporate the useful content information, we obtain a content topic feature for each user and item using a probabilistic topic modeling method, and determine the personalized weight of every unknown user-item pair from these content topic features. Extensive experiments show that our algorithm can achieve better performance than the state-of-the-art methods.
- Published
- 2013
12. A Dynamic Batch Sampling Mode for SVM Active Learning in Image Retrieval
- Author
-
Xiaoyu Zhang, Hanqing Lu, Changsheng Xu, Songde Ma, and Jian Cheng
- Subjects
Scheme (programming language), business.industry, Active learning (machine learning), Computer science, Relevance feedback, Sampling (statistics), Pattern recognition, Machine learning, computer.software_genre, Support vector machine, Batch processing, Artificial intelligence, business, computer, Image retrieval, Selection (genetic algorithm), computer.programming_language
- Abstract
In relevance feedback of content-based image retrieval, active learning is an effective method to alleviate the burden of labeling by selecting the most informative examples to label. Traditionally, batch mode is adopted in selective sampling, which suffers from two problems. On the one hand, the existing classification boundary which dominates example selection is usually unreliable for lack of labeled examples; on the other hand, the previously labeled examples in the batch cannot offer instructive information for further selection. In this paper, we propose a novel dynamic batch sampling mode for SVM active learning which addresses the above problems. We select a batch of examples dynamically, using the previously labeled examples as guidance. Experimental results demonstrate the advantage of the proposed scheme in comparison with the traditional ones.
- Published
- 2012
13. Adaptive Model for Robust Pedestrian Counting
- Author
-
Hanqing Lu, Jingjing Liu, and Jinqiao Wang
- Subjects
business.industry, Computer science, Heuristic, Pedestrian detection, Template matching, Visibility (geometry), Pedestrian, Reversible-jump Markov chain Monte Carlo, Grid, Computer vision, Artificial intelligence, Branch structure, business, Algorithm
- Abstract
Toward robust pedestrian counting under partial occlusion, we put forward a novel model-based approach for pedestrian detection. Our approach consists of two stages: pre-detection and verification. Firstly, based on a whole pedestrian model built in advance, adaptive models are dynamically determined by the occlusion conditions of the corresponding body parts. To this end, a heuristic approach with grid masks is proposed to examine the visibility of a given body part. Using part models for template matching, we adopt an approximate branch structure for preliminary detection. Secondly, a Bayesian framework is utilized to verify and optimize the pre-detection results. The Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm is used to solve this high-dimensional problem. Experiments and comparisons demonstrate the promise of the proposed approach.
- Published
- 2011
14. Image Classification Using Spatial Pyramid Coding and Visual Word Reweighting
- Author
-
Songde Ma, Jinqiao Wang, Jing Liu, Hanqing Lu, Changsheng Xu, Chunjie Zhang, and Qi Tian
- Subjects
Vocabulary, Contextual image classification, business.industry, Computer science, media_common.quotation_subject, Speech recognition, Pattern recognition, Bag-of-words model in computer vision, Embedding, Visual Word, Artificial intelligence, business, Neural coding, Spatial analysis, Coding (social sciences), media_common
- Abstract
Ignoring the spatial information and the semantics of visual words is a main obstacle in the bag-of-visual-words (BoW) method for image classification. To address these obstacles, we present an improved BoW representation using spatial pyramid coding (SPC) and visual word reweighting. In the SPC procedure, we adopt the sparse coding technique to encode visual features with a spatial constraint. Visual features from the same spatial sub-region of images are collected to generate the visual vocabulary. Additionally, a relaxed but simple solution for semantic embedding into visual words is proposed. We relax the semantic embedding from ideal semantic correspondence to naive semantic purity of visual words, and reweight each visual word according to its semantic purity. Higher weights are given to semantically distinctive visual words, and lower weights to semantically general ones. Experiments on a public dataset demonstrate the effectiveness of the proposed method.
- Published
- 2011
15. Correlated PLSA for Image Clustering
- Author
-
Zechao Li, Jian Cheng, Hanqing Lu, and Peng Li
- Subjects
Topic model, Probabilistic latent semantic analysis, business.industry, Computer science, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, Cosine similarity, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Conditional probability, Pattern recognition, computer.software_genre, Image (mathematics), ComputingMethodologies_PATTERNRECOGNITION, Expectation–maximization algorithm, Artificial intelligence, Data mining, Representation (mathematics), Cluster analysis, business, computer
- Abstract
Probabilistic Latent Semantic Analysis (PLSA) has become a popular topic model for image clustering. However, the traditional PLSA method considers each image (document) independently, which often conflicts with reality. In this paper, we present an improved PLSA model, named Correlated Probabilistic Latent Semantic Analysis (C-PLSA). Unlike PLSA, the topics of a given image are modeled using the images that are related to it. In our method, each image is represented by a bag of visual words. With this representation, we calculate the cosine similarity between each pair of images to capture their correlations. Then we use our C-PLSA model to generate K latent topics, and the Expectation Maximization (EM) algorithm is utilized for parameter estimation. Based on the latent topics, image clustering is carried out according to the estimated conditional probabilities. Extensive experiments are conducted on a publicly available database. The comparison results show that our approach is superior to the traditional PLSA for image clustering.
- Published
- 2011
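The pairwise-similarity step mentioned in the C-PLSA abstract above (cosine similarity between bag-of-visual-words histograms) can be illustrated with a short sketch. It covers only that one step, not the C-PLSA model or its EM updates; the function names and the toy histograms are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length histograms.
    Returns 0.0 if either histogram is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(histograms):
    """Symmetric matrix of cosine similarities between every pair of images."""
    return [[cosine_similarity(a, b) for b in histograms] for a in histograms]


# Toy usage: three images described by counts over a 4-word visual vocabulary.
histograms = [
    [3, 0, 1, 0],
    [6, 0, 2, 0],  # same word proportions as the first image
    [0, 5, 0, 2],  # shares no visual words with the first image
]
sims = similarity_matrix(histograms)
```

Cosine similarity depends only on the proportions of visual words, so two images with the same word mix but different numbers of detected features are still treated as strongly correlated.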
16. Abnormal Change Detection of Image Quality Metric Series Using Diffusion Process and Stopping Time Theory
- Author
-
Jian Cheng, Haoting Liu, and Hanqing Lu
- Subjects
Series (mathematics), Image quality, Stopping time, Metric (mathematics), Autoregressive–moving-average model, Time series, Algorithm, Simulation, Change detection, Statistical hypothesis testing, Mathematics
- Abstract
To evaluate and monitor the Image Quality (IQ) change of a surveillance sequence for video analysis, this paper presents a model based on a diffusion process and stopping time theory, which can describe the uncertainty of an actual stochastic series rationally. First, we calculate the IQ metric for each frame. Then we connect all these discrete data together to form an Image Quality Metric Series (IQMS). After that, a diffusion process model based on a non-parametric estimation technique is used to fit the fluctuation path of the IQMS. Finally, a stopping time based model is employed to detect the abnormal change. Unlike the conventional diffusion process method, the functional forms of our model are estimated online and confirmed by the result of a hypothesis test. Extensive experiments comparing against a traditional time series model, such as the ARMA model, show that this method is effective and efficient at detecting abnormal changes.
- Published
- 2010
17. Personalized Sports Video Customization for Mobile Devices
- Author
-
Jian Cheng, Yu Jiang, Chao Liang, Hanqing Lu, Xiaowei Luo, Changsheng Xu, Jian Ma, Jinqiao Wang, and Yu Fu
- Subjects
Annotation, Multimedia, Computer science, Mobile phone, Event (computing), Interface (computing), Mobile search, computer.software_genre, Automatic summarization, computer, Mobile device, Personalization
- Abstract
In this paper, we design and implement a mobile personalized sports video customization system, which aims to provide mobile users with interesting video clips according to their personal preferences. Built on a B/S architecture, the system comprises an intelligent multimedia content server and a client interface on smart phones. On the content server, web-casting text is used to detect live events from sports video, which yields both accurate event locations and rich content descriptions. The annotation results are stored in the MPEG-7 format, and the server can then provide personalized video retrieval and summarization services based on both game content and user preference. For the client interface, a friendly UI is designed for mobile users to customize their favorite video clips. Using a new '4C' evaluation criterion, our proposed system is shown to be effective by both quantitative and qualitative experiments conducted on five sports matches.
- Published
- 2010
18. Visual Attention Model Based Object Tracking
- Author
-
Jinqiao Wang, Jing Liu, Lili Ma, Hanqing Lu, and Jian Cheng
- Subjects
Object tracking algorithm, business.industry, Computer science, Video tracking, Visual attention, Pattern recognition, Computer vision, Saliency map, Top-down and bottom-up design, Artificial intelligence, Attention model, business
- Abstract
An object tracking algorithm based on biological visual attention is proposed. This algorithm combines top-down, task-dependent attention with bottom-up, stimulus-driven attention. The image is first decomposed into different feature maps according to the bottom-up attention model. Then, under the assumption that the object region attracts more attention than the background, logistic regression is employed to tune the feature maps, which enhances the object features that differ from the background while inhibiting the background features. In this way the saliency map is computed, and the object location can be predicted using an efficient search strategy in the saliency map. Experiments show the robustness of the algorithm in object tracking. Moreover, the saliency map can be integrated into other object tracking methods as a prior to increase the robustness and efficiency of tracking.
- Published
- 2010
19. Extended CBIR via Learning Semantics of Query Image
- Author
-
Jing Liu, Hanqing Lu, Chuanghua Gui, and Changsheng Xu
- Subjects
Information retrieval, Web search query, Discriminative model, business.industry, Computer science, Feature (computer vision), ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Semantic search, Feature selection, business, Semantics, Image retrieval, Social Semantic Web
- Abstract
This demo presents a web image search engine that learns the semantics of the query image. Unlike traditional CBIR systems, which search images according to visual similarity, our system implements an extended CBIR (ExCBIR) that returns both visually and semantically relevant images. Given a query image, we first automatically learn its semantic representation from visually similar images, and then combine the semantic representation with their visual properties to produce the search result. Considering that different visual features have varying discriminative power under a given semantic context, we give more confidence to the feature whose result images are more semantically consistent. Experiments on large-scale web images demonstrate the effectiveness of our system.
- Published
- 2010
20. People Detection by Boosting Features in Nonlinear Subspace
- Author
-
Jinqiao Wang, Jie Yang, and Hanqing Lu
- Subjects
Boosting (machine learning), business.industry, Detector, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Pattern recognition, Kernel principal component analysis, Nonlinear system, ComputingMethodologies_PATTERNRECOGNITION, Discriminative model, Computer Science::Computer Vision and Pattern Recognition, Histogram, AdaBoost, Artificial intelligence, business, Subspace topology, Mathematics
- Abstract
In this paper, we propose a novel approach to detect people by boosting features in the nonlinear subspace. Firstly, three types of the HOG (Histograms of Oriented Gradients) descriptor are extracted and grouped into one descriptor to represent the samples. Then, the nonlinear subspace with higher dimension is constructed for positive and negative samples respectively by using Kernel PCA. The final features of the samples are derived by projecting the grouped HOG descriptors onto the nonlinear subspace. Finally, AdaBoost is used to select the discriminative features in the nonlinear subspace and train the detector. Experimental results demonstrate the effectiveness of the proposed method.
- Published
- 2010
21. Human Action Recognition in Videos Using Hybrid Motion Features
- Author
-
Si Liu, Jing Liu, Hanqing Lu, and Tianzhu Zhang
- Subjects
Computer science, business.industry, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Optical flow, Markov process, Pattern recognition, Motion (physics), symbols.namesake, Bag-of-words model, Histogram, Motion estimation, Feature (machine learning), symbols, Computer vision, Artificial intelligence, business, Representation (mathematics)
- Abstract
In this paper, we present hybrid motion features to improve action recognition in videos. The features are composed of two complementary components from different views of motion information. On one hand, the period feature is extracted to capture global motion in the time domain. On the other hand, the enhanced histograms of motion words (EHOM) are proposed to describe local motion information. Each word is represented by the optical flow of a frame, the correlations between words are encoded into the transition matrix of a Markov process, and its stationary distribution is then extracted as the final EHOM. Compared to the traditional Bag of Words representation, EHOM preserves not only relationships between words but also, to some extent, temporal information in videos. We show that by integrating local and global features, we get improved recognition rates on a variety of standard datasets.
- Published
- 2010
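The EHOM abstract above describes encoding word-to-word correlations in a Markov transition matrix and taking its stationary distribution as the descriptor. A minimal sketch of that idea follows, under stated assumptions: the transition matrix is estimated simply by counting consecutive word pairs in a sequence of per-frame word indices, and the stationary distribution is found by power iteration. The function names are hypothetical, and the paper's actual construction of motion words and correlations may differ.

```python
def transition_matrix(words, n_words):
    """Row-stochastic matrix P where P[a][b] estimates the probability
    that word b follows word a in the sequence `words`."""
    counts = [[0.0] * n_words for _ in range(n_words)]
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        if total > 0:
            matrix.append([c / total for c in row])
        else:
            # Assumption: a word with no observed successor gets a uniform row.
            matrix.append([1.0 / n_words] * n_words)
    return matrix


def stationary_distribution(P, iters=1000, tol=1e-12):
    """Approximate pi with pi = pi P by repeated multiplication.
    May fail to converge for periodic chains."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Toy usage: a short sequence of motion-word indices over a 3-word vocabulary.
sequence = [0, 1, 1, 2, 0, 1, 2, 2, 0]
P = transition_matrix(sequence, n_words=3)
descriptor = stationary_distribution(P)
```

Unlike a plain word histogram, the resulting vector depends on which words follow which, which is how the representation retains some ordering information.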
22. A Hierarchical Semantics-Matching Approach for Sports Video Annotation
- Author
-
Hanqing Lu, Yi Zhang, Jinqiao Wang, Changsheng Xu, and Chao Liang
- Subjects
Matching (statistics), Information retrieval, Multimedia, Computer science, Event (computing), Search engine indexing, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Semantics, computer.software_genre, Automatic summarization, Video tracking, Key (cryptography), Timestamp, computer
- Abstract
Text-facilitated sports video analysis has achieved extensive success in video indexing, retrieval, and summarization. A commonly adopted basis in previous work is the separate alignment of timestamps between sports video and game text, which is not a robust method for generic cross-media analysis. In this paper, we propose a hierarchical semantics-matching approach to annotate sports video. Our key idea is to link video and text through high-level semantics rather than low-level features, and to find the optimal video-text alignment based on the integral structure rather than individual conditions. For accurate event location, the whole algorithm is implemented in a hierarchical way to generate refined and accurate video annotation results. Experiments conducted on both basketball and football matches demonstrate that our proposed approach is effective for text-facilitated sports video annotation.
- Published
- 2009
23. Visual Tracking Using Particle Filters with Gaussian Process Regression
- Author
-
Hanqing Lu, Yi Wu, and Hongwei Li
- Subjects
Mathematical optimization, symbols.namesake, Kriging, Robustness (computer science), Visual Objects, Resampling, symbols, Eye tracking, Particle filter, computer, Gaussian process, Algorithm, Auxiliary particle filter, Mathematics, computer.programming_language
- Abstract
Particle degeneracy is one of the main problems when particle filters are applied to visual tracking. Effective remedies for the degeneracy phenomenon include a good choice of proposal distribution and the use of resampling. In this paper, we propose a novel visual-tracking algorithm using particle filters with Gaussian process regression and resampling techniques, which effectively abate the influence of particle degeneracy and improve the robustness of visual tracking. The main characteristic of the proposed algorithm is that we combine particle filters with Gaussian process regression, which can learn highly effective proposal distributions for the particle filters to track visual objects. Experimental results on challenging sequences demonstrate the effectiveness and robustness of the proposed method.
- Published
- 2009
24. A Novel Role-Based Movie Scene Segmentation Method
- Author
-
Hanqing Lu, Yifan Zhang, Chao Liang, Changsheng Xu, and Jian Cheng
- Subjects
Structure (mathematical logic), Semantic link, Scene segmentation, business.industry, Character (computing), Computer science, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Novelty, Computer vision, Artificial intelligence, Objective evaluation, business, Hidden Markov model, Semantic gap
- Abstract
Semantic scene segmentation is a crucial step in movie video analysis, and extensive research efforts have been devoted to this area. However, previous methods rely heavily on the video content itself and therefore lack an objective evaluation criterion and the necessary semantic link, due to the semantic gap. In this paper, we propose a novel role-based approach for movie scene segmentation using the script. The script is a text description of the movie content that contains the scene structure information and related character names, and it can be regarded as an objective evaluation criterion and a useful external reference. The main novelty of our approach is that we convert movie scene segmentation into a movie-script alignment problem and propose an HMM alignment algorithm to map the script scene structure to the movie content. The promising results obtained from three Hollywood movies demonstrate the effectiveness of our proposed approach.
- Published
- 2009
25. Concept-Specific Visual Vocabulary Construction for Object Categorization
- Author
-
Yi Ouyang, Jing Liu, Chunjie Zhang, Songde Ma, and Hanqing Lu
- Subjects
Vocabulary ,Boosting (machine learning) ,Traditional learning ,Computer science ,business.industry ,media_common.quotation_subject ,k-means clustering ,Pattern recognition ,Pascal (programming language) ,computer.software_genre ,Categorization ,Bag-of-words model in computer vision ,Visual Word ,Artificial intelligence ,business ,computer ,Natural language processing ,computer.programming_language ,media_common - Abstract
Recently, the bag-of-words (BOW) based image representation has become popular in object categorization. However, there is no available visual vocabulary and it has to be learned. In traditional learning methods, the vocabulary is constructed by exploring only one type of feature or by simply concatenating all kinds of visual features into a long vector. Such constructions neglect the distinct roles of different features in discriminating object categories. To address the problem, we propose a novel method to construct a concept-specific visual vocabulary. First, we extract various visual features from local image patches, and cluster them separately according to different features to generate an initial vocabulary. Second, we formulate concept-specific visual word selection and object categorization in a boosting framework. Experimental results on the PASCAL 2006 challenge data set demonstrate the encouraging performance of the proposed method.
- Published
- 2009
26. Selective Sampling Based on Dynamic Certainty Propagation for Image Retrieval
- Author
-
Hanqing Lu, Jian Cheng, Xiaoyu Zhang, and Songde Ma
- Subjects
Scheme (programming language) ,Active learning (machine learning) ,business.industry ,Computer science ,media_common.quotation_subject ,Relevance feedback ,Pattern recognition ,Semi-supervised learning ,Certainty ,computer.software_genre ,Correlation ,Point (geometry) ,Artificial intelligence ,Data mining ,business ,Image retrieval ,computer ,computer.programming_language ,media_common - Abstract
In relevance feedback for image retrieval, selective sampling is often used to alleviate the burden of labeling by selecting only the most informative data to label. Traditional data selection schemes often select a batch of data at a time and label them all together, which neglects the correlation among the data and thus jeopardizes effectiveness. In this paper, we propose a novel Dynamic Certainty Propagation (DCP) scheme for informative data selection. For each unlabeled data point, we define the notion of certainty to quantify our confidence in its predicted label. At each step, we label only the single data point with the lowest degree of certainty. Then we dynamically update the certainty of the remaining unlabeled data according to their correlation. This one-by-one labeling offers extra guidance from the last labeled data point for the next labeling. Experiments show that the DCP scheme clearly outperforms the traditional method.
- Published
- 2008
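The one-by-one selection loop described in the DCP abstract above can be sketched as follows. The abstract does not give the certainty update rule, so the rule used here (a neighbour's certainty moves toward 1 in proportion to its correlation with the newly labeled point) is purely an assumption, as are the function and parameter names.

```python
def select_one_by_one(certainty, similarity, budget):
    """Pick `budget` points to label, one at a time.

    certainty:  dict mapping point id -> confidence in [0, 1] of its predicted label
    similarity: function (a, b) -> correlation in [0, 1] (hypothetical helper)
    """
    certainty = dict(certainty)  # work on a copy
    chosen = []
    for _ in range(min(budget, len(certainty))):
        # Label the single point we are least certain about.
        pick = min(certainty, key=certainty.get)
        chosen.append(pick)
        del certainty[pick]
        # Assumed update rule: points correlated with the newly labeled
        # one become more certain, so they are less likely to be picked next.
        for other in certainty:
            s = similarity(pick, other)
            certainty[other] += s * (1.0 - certainty[other])
    return chosen
```

With two highly correlated low-certainty points, a batch scheme would label both, whereas this loop labels one and then moves on to an uncorrelated point, which is the behaviour the abstract argues for.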
27. A Spatial-Temporal-Scale Registration Approach for Video Copy Detection
- Author
-
Tao Wang, Jinqiao Wang, Shi Chen, Hanqing Lu, Jianguo Li, and Yimin Zhang
- Subjects
Speedup ,business.industry ,Computer science ,Hash function ,Search engine indexing ,Video copy detection ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Field (computer science) ,Transformation (function) ,Computer vision ,Artificial intelligence ,Zoom ,business - Abstract
Video copy detection is an active research field in copyright control, business intelligence, advertisement monitoring, etc. The main issues are transformation-invariant feature extraction and robust registration at the object level. This paper proposes a novel video copy detection approach based on spatial-temporal-scale registration. In detail, we first build trajectories of interest points using speeded up robust features (SURF). Then we use an efficient voting-based spatial-temporal-scale registration approach to estimate the optimal transformation parameters, and obtain the final video copy detection results by propagating video segments in both the spatial-temporal and scale directions. To speed up detection, we use locality-sensitive hashing (LSH) to index trajectories for fast queries of candidate trajectories. Compared with existing approaches, our approach can detect many kinds of copy transformations, including cropping, zooming in/out, camcording, and re-encoding. Extensive experiments on 200 hours of video demonstrate the effectiveness of our approach.
- Published
- 2008
28. Image Segmentation Based on Supernodes and Region Size Estimation
- Author
-
Lihong Ma, Hanqing Lu, and Yuan Yuan
- Subjects
Segmentation-based object categorization ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Scale-space segmentation ,Pattern recognition ,Image segmentation ,Image texture ,Minimum spanning tree-based segmentation ,Region growing ,Computer Science::Computer Vision and Pattern Recognition ,Artificial intelligence ,Range segmentation ,business ,Connected-component labeling ,Mathematics - Abstract
A self-adaptive image segmentation algorithm whose main framework is based on a graph structure is introduced in this paper. Two contributions are made in our work. First, super-pixels act as the graph nodes for computational efficiency; at the same time, more local features can be extracted from the pre-segmented image. Second, region size is estimated during the process to reduce interaction between human and computer. Experimental results demonstrate that the improved method is unsupervised and gives satisfactory segmentation.
- Published
- 2008
29. Image Segmentation Using Co-EM Strategy
- Author
-
Jian Cheng, Zhenglong Li, Hanqing Lu, and Qingshan Liu
- Subjects
Segmentation-based object categorization ,business.industry ,InformationSystems_INFORMATIONSYSTEMSAPPLICATIONS ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Scale-space segmentation ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,Pattern recognition ,Image segmentation ,Mixture model ,Image (mathematics) ,Rate of convergence ,Expectation–maximization algorithm ,Segmentation ,Artificial intelligence ,business ,Mathematics - Abstract
Inspired by the idea of multi-view learning, we propose an image segmentation algorithm using a co-EM strategy in this paper. Image data are modeled using a Gaussian Mixture Model (GMM), and two sets of features, i.e. two views, are employed using the co-EM strategy instead of conventional single-view EM to estimate the parameters of the GMM. Compared with single-view GMM-EM methods, the proposed segmentation method using the co-EM strategy has several advantages. First, the imperfection of a single view can be compensated by the other view in co-EM. Second, by employing two views, the co-EM strategy can offer more reliable segmentation results. Third, the drawback of local optimality in single-view EM can be overcome to some extent. Fourth, the convergence rate is improved: the average time is far less than that of single-view methods. We test the proposed method on a large number of images with no specified contents. The experimental results verify the above advantages, and the method outperforms the single-view GMM-EM segmentation methods.
- Published
- 2007
30. A Geometric Contour Framework with Vector Field Support
- Author
-
Zhenglong Li, Hanqing Lu, and Qingshan Liu
- Subjects
Active contour model ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Computer Science::Computer Vision and Pattern Recognition ,Piecewise ,Segmentation ,Vector field ,Artificial intelligence ,Spurious relationship ,Geometric modeling ,business ,Smoothing ,ComputingMethodologies_COMPUTERGRAPHICS ,Mathematics - Abstract
In this paper, we propose a new geometric contour framework supported by a specified vector field. First we define three criteria for selecting the vector field in a geometric model. According to these criteria, EdgeFlow, a powerful segmentation tool, is selected to generate a desirable initial vector field. In order to overcome the drawbacks of conventional geometric models, multi-source external forces, such as those from texture and multi-spectral data, are integrated to provide the ability to segment texture-rich and complex scene images. Instead of the common smoothing pre-processing used to denoise and suppress possible spurious edges, more advanced complex diffusion filters are adopted in our algorithm, which result in a piecewise filtered image that helps detect sharp transition regions. We test our model on the Berkeley Segmentation Database, and the experimental results are promising.
- Published
- 2006
31. Motion Detection in Driving Environment Using U-V-Disparity
- Author
-
Jia Wang, Keiichi Uchimura, Hanqing Lu, and Zhencheng Hu
- Subjects
Motion analysis ,Adaptive control ,Geometric analysis ,business.industry ,Iterative method ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Motion detection ,Stereopsis ,Match moving ,Motion estimation ,Computer vision ,Artificial intelligence ,business - Abstract
Motion detection in a driving environment, which aims to detect REAL moving objects against a continuously changing background, is vital for Adaptive Cruise Control (ACC) applications. This paper presents an efficient solution to this problem using a stereovision-based method. First, a comprehensive analysis of 3D global motion is given based on the “U-V-disparity” concept, in which a 5-parameter model is deduced to describe global motion within the U-V-disparity domain and an iterative Least Square Estimation method is proposed to estimate the parameters. Then, in order to identify separate objects, geometric analysis segments the road scene into 3D object-surfaces based on U-V-disparity features of road surfaces, roadside structures and obstacles. Finally, the motions of the segmented object-surfaces are compared with the estimated global motion to find REAL moving surfaces, which correspond to the real moving objects. The proposed algorithm has been tested on real road sequences, and experimental results verify its efficiency.
- Published
- 2006
32. Multiple Similarities Based Kernel Subspace Learning for Image Classification
- Author
-
Hanqing Lu, Songde Ma, Qingshan Liu, and Wang Yan
- Subjects
Kernel method ,String kernel ,business.industry ,Kernel embedding of distributions ,Polynomial kernel ,Kernel (statistics) ,Radial basis function kernel ,Computer vision ,Artificial intelligence ,Tree kernel ,business ,Kernel principal component analysis ,Mathematics - Abstract
In this paper, we propose a new method for image classification, in which matrix-based kernel features are designed to capture the multiple similarities between images in different low-level visual cues. Based on the property that a dot product kernel can be regarded as a similarity measure, we apply kernel functions to different low-level visual features respectively to measure the similarities between two images, and obtain a kernel feature matrix for each image. In order to deal with the problems of overfitting and numerical computation, a revised version of the Two-Dimensional PCA algorithm is developed to learn the intrinsic subspace of matrix features for classification. Extensive experiments on the Corel database show the advantage of the proposed method.
- Published
- 2006
33. Automatic TV Logo Detection, Tracking and Removal in Broadcast Video
- Author
-
Ling-Yu Duan, Qingshan Liu, Changsheng Xu, Hanqing Lu, and Jinqiao Wang
- Subjects
business.industry ,Computer science ,Tensor (intrinsic definition) ,Logo ,Computer vision ,Artificial intelligence ,Broadcasting ,business ,TRECVID ,Domain (software engineering) - Abstract
TV logo detection, tracking and removal play an important role in applications such as claiming video content ownership, logo-based broadcasting surveillance, commercial skipping, and program rebroadcasting with new logos. In this paper, we present a novel and robust framework using a tensor method for these three tasks. First, we use a tensor-based generalized gradient and the OTSU binarization algorithm for logo detection, and propose a two-level coarse-to-fine framework to track the TV logos. Finally, we extend the regularization PDEs by incorporating temporal information to inpaint the logo region. Due to the introduction of the structure tensor, the generalized gradient based method can detect the logo region by tracking the change rate of pixels in the spatio-temporal domain, and the region of the removed logo is filled in a structure-preserving way. Since the temporal correlation of multiple consecutive frames is considered, the proposed method can deal with opaque, semi-transparent, and animated logos. The experiments and comparison with previous methods are conducted on part of the TRECVID 2005 news corpus and several Chinese TV channels with challenging TV logos, and the experimental results are promising.
- Published
- 2006
34. A Fuzzy Segmentation of Salient Region of Interest in Low Depth of Field Image
- Author
-
Hanqing Lu, Qi Zhao, ZhenYu Wang, KeDai Zhang, and MiYi Duan
- Subjects
business.industry ,Computer science ,Fuzzy set ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Image segmentation ,Defuzzification ,Fuzzy logic ,Edge detection ,Region of interest ,Salient ,Computer Science::Computer Vision and Pattern Recognition ,Segmentation ,Computer vision ,Artificial intelligence ,Mean-shift ,Range segmentation ,business - Abstract
Unsupervised segmentation of regions of interest in images is very useful in content-based applications such as image indexing for content-based retrieval and target recognition. The proposed method applies fuzzy theory to automatically separate the salient region of interest from the background in low depth of field (DOF) images. First, the image is divided into regions based on the mean shift method, and the regions are characterized by color features and wavelet modulus maxima edge point densities. Then the regions are described as fuzzy sets by fuzzification. Finally, the salient region of interest and the background are separated by defuzzification on the fuzzy sets. The segmentation method is fully automatic and requires no empirical parameters.
- Published
- 2006
35. Automatic Moving Object Segmentation with Accurate Boundaries
- Author
-
Hanqing Lu, Jia Wang, Qingshan Liu, and Haifeng Wang
- Subjects
Motion analysis ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Motion detection ,Motion field ,Cut ,Motion estimation ,Structure from motion ,Computer vision ,Segmentation ,Artificial intelligence ,business - Abstract
This paper presents a layer-model based method to segment moving objects from an image sequence with accurate boundaries. The segmentation framework involves three stages: motion seed detection, motion layer expansion and motion boundary refinement. In the first stage, motion seeds, which determine the number and initial positions of motion layers, are detected by corner matching between consecutive frames, and classified by global motion analysis. In the second stage, the detected motion seeds are expanded into motion layers. To preserve spatial continuity, an energy function is defined to evaluate the spatial smoothness and accuracy of the layers. Then, the Graph Cuts technique is used to solve the energy minimization problem and extract motion layers. In the last stage, the extracted layers are combined with edge information to find accurate boundaries of moving objects. The proposed method is tested on several image sequences and the experimental results illustrate its promising performance.
- Published
- 2006
36. Dynamic Similarity Kernel for Visual Recognition
- Author
-
Hanqing Lu, Songde Ma, Wang Yan, and Qingshan Liu
- Subjects
Graph kernel ,Computer science ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,Machine learning ,computer.software_genre ,Facial recognition system ,Kernel (linear algebra) ,symbols.namesake ,Polynomial kernel ,Gaussian function ,Image retrieval ,Gaussian process ,Contextual image classification ,business.industry ,Nonlinear dimensionality reduction ,Pattern recognition ,Linear discriminant analysis ,Kernel method ,Kernel (image processing) ,Kernel embedding of distributions ,Metric (mathematics) ,Radial basis function kernel ,symbols ,Artificial intelligence ,Tree kernel ,business ,computer - Abstract
Inspired by studies of cognitive psychology, we proposed a new dynamic similarity kernel for visual recognition. This kernel has great consistency with human visual similarity judgement by incorporating the perceptual distance function. Moreover, this kernel can be seen as an extension of Gaussian kernel, and therefore can deal with nonlinear variations well like the traditional kernels. Experimental results on natural image classification and face recognition show its superior performance compared to other kernels.
- Published
- 2006
37. Fusion Method of Fingerprint Quality Evaluation: From the Local Gabor Feature to the Global Spatial-Frequency Structures
- Author
-
Lihong Ma, Zhiqing Chen, Decong Yu, and Hanqing Lu
- Subjects
Gabor filter ,Fingerprint ,business.industry ,Computer science ,Feature (computer vision) ,Image quality ,Pattern recognition (psychology) ,Feature extraction ,Computer vision ,Image processing ,Artificial intelligence ,Fingerprint recognition ,business - Abstract
We propose a new fusion method to evaluate fingerprint quality by combining both spatial and frequency features of a fingerprint image. In the frequency domain, a ring structure of the DFT magnitude and directional Gabor features are applied. In the spatial domain, the black pixel ratio of the central area is taken into account. These three features are the most efficient indexes for fingerprint quality assessment. Though additional features could be introduced, their slight improvement in performance would be offset to some extent by added complexity and computational load. Thus in this paper, each of the three features is first employed on its own to assess fingerprint quality, and their evaluation performance is also discussed. Then the suggested fusion approach of the three features is presented to obtain the final quality scores. We test the fusion method on our public security fingerprint database. Experimental results demonstrate that the proposed scheme can estimate the quality of fingerprint images accurately. It provides a feasible way to reject poor fingerprint images before they are presented to the fingerprint recognition system for feature extraction and matching.
- Published
- 2006
38. Fast Global Motion Estimation Via Iterative Least-Square Method
- Author
-
Jia Wang, Qingshan Liu, Hanqing Lu, and Haifeng Wang
- Subjects
Motion field ,Iterative method ,business.industry ,Computation ,Motion estimation ,Image processing ,Artificial intelligence ,business ,Thresholding ,Algorithm ,Gradient method ,Square (algebra) ,Mathematics - Abstract
This paper presents a fast algorithm for global motion estimation based on the Iterative Least-Square Estimation (ILSE) technique. Compared with the traditional framework, three improvements are made to accelerate the computation. First, a new 3-parameter linear model, together with its solution using a modified ILSE method, is proposed to describe and estimate global motion, which is simple and reasonable. Second, a pre-analysis method, the Gradient Thresholding (GT) method, is introduced to pre-analyze the image macro-blocks before global motion estimation using their gradient information, which reduces the computational cost by reducing the number of blocks involved. Lastly, the Successive Elimination Algorithm (SEA), which is used to calculate the motion field, is improved by a newly presented matching criterion that considers both the gradient information and the intensity information. The presented method has been tested on a variety of image sequences, and experimental results illustrate its promising performance.
- Published
- 2006
39. A Semantic Image Category for Structuring TV Broadcast Video Streams
- Author
-
Jesse S. Jin, Hanqing Lu, Jinqiao Wang, and Ling-Yu Duan
- Subjects
Set (abstract data type) ,Information retrieval ,Multimedia ,Computer science ,Search engine indexing ,Unsupervised learning ,Image processing ,Graphics ,computer.software_genre ,TRECVID ,computer - Abstract
A TV broadcast video stream consists of various kinds of programs such as sitcoms, news, sports, commercials, weather, etc. In this paper, we propose a semantic image category, named Program Oriented Informative Images (POIM), to facilitate the segmentation, indexing and retrieval of different programs. The assumption is that most stations tend to insert lead-in/-out video shots to explicitly introduce the current program and indicate the transitions between consecutive programs within TV streams. Such shots often utilize the overlapping of text, graphics, and storytelling images to create an image sequence of POIM as a visual representation of the current program. With the advance of post-editing effects, POIM is becoming an effective indicator for structuring TV streams, and is also a fairly common “prop” in program content production. We have attempted to develop a POIM recognizer involving a set of global/local visual features and supervised/unsupervised learning. Comparison experiments have been carried out. A promising result, F1 = 90.2%, has been achieved on a part of the TRECVID 2005 video corpus. The recognition of POIM, together with other audiovisual features, can be used to further determine program boundaries.
- Published
- 2006
40. Boosting Multi-gabor Subspaces for Face Recognition
- Author
-
Hongliang Jin, Hanqing Lu, Songde Ma, Xiaoou Tang, and Qingshan Liu
- Subjects
Boosting (machine learning) ,Computational complexity theory ,business.industry ,Gabor wavelet ,Supervised learning ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Facial recognition system ,Linear subspace ,ComputingMethodologies_PATTERNRECOGNITION ,Gabor filter ,Computer Science::Sound ,Computer Science::Computer Vision and Pattern Recognition ,Computer vision ,Artificial intelligence ,business ,Subspace topology ,Mathematics - Abstract
In this paper, we propose a new scheme for Gabor-based face recognition. Based on the fact that different Gabor filters have different properties, we first learn a discriminating subspace for each kind of Gabor image. Then boosting learning is performed to fuse all the Gabor discriminating subspaces for recognition. Compared with previous work, the proposed method makes three contributions: (1) We make sufficient use of the respective properties of the Gabor filters, and learn different discriminant subspaces for different Gabor images; (2) The boosting-based fusion method adaptively determines the discriminating vectors and dimensionality of each subspace according to its discriminating capacity, so as to further improve the recognition performance; (3) The problem of computational complexity is well handled by subspace analysis and boosting-based fusion. Extensive experiments show its encouraging performance.
- Published
- 2006
41. Improving ICA Performance for Modeling Image Appearance with the Kernel Trick
- Author
-
Jian Cheng, Songde Ma, Hanqing Lu, and Qingshan Liu
- Subjects
business.industry ,Computer science ,Feature vector ,Pattern recognition ,Real image ,Facial recognition system ,Independent component analysis ,Nonlinear system ,Kernel method ,Distortion ,Principal component analysis ,Computer vision ,Artificial intelligence ,business - Abstract
Independent Component Analysis (ICA) is a popular method for modeling image appearance, but because it is inherently linear, it is inadequate for describing the complex nonlinear variations of real images caused by illumination, distortion, and other factors. In this paper, we propose to incorporate the nonlinear kernel trick to improve the performance of ICA for modeling image appearance. First, the kernel trick is used to project the input data into a high-dimensional implicit feature space, and then ICA is performed in this implicit feature space to extract nonlinear independent components of the input data. Extensive experiments show that the proposed method outperforms ICA for describing real images.
- Published
- 2004
42. Automatic Synchronized Browsing of Images Across Multiple Devices
- Author
-
Xing Xie, Wei-Ying Ma, Hanqing Lu, and Zhigang Hua
- Subjects
Multimedia ,Computer science ,Human–computer interaction ,computer.software_genre ,computer ,Mobile device - Abstract
Mobile devices have undergone considerable progress in recent years. Using these portable devices, people can easily capture and share photos even when they are on the move. However, browsing a large number of images on such small-form-factor devices is still hard and time-consuming, especially when the images are distributed across various devices. In this paper, we propose a novel synchronized approach to facilitate image browsing across multiple devices. In this approach, similar images across multiple devices can be presented simultaneously so that users can view or search them comparatively. Experimental results show that the synchronized approach improves users' browsing experience.
- Published
- 2004
43. Random Independent Subspace for Face Recognition
- Author
-
Hanqing Lu, Jian Cheng, Qingshan Liu, and Yen-Wei Chen
- Subjects
Random subspace method ,ComputingMethodologies_PATTERNRECOGNITION ,business.industry ,Computer science ,Speech recognition ,Pattern recognition ,Artificial intelligence ,business ,Classifier (UML) ,Linear subspace ,Independent component analysis ,Facial recognition system ,Subspace topology - Abstract
Independent Component Analysis (ICA) is a popular approach for face recognition. However, face recognition is often a small sample size problem, which weakens the recognition performance of the ICA classifier. In this paper, a novel method is proposed to enhance the ICA classifier for the small sample size problem. First, we use a random resampling method to generate a number of random independent subspaces, and a classifier is constructed in each subspace. Then a voting strategy is adopted to integrate these classifiers for discrimination. Experimental results on a publicly available face database show that the proposed method can clearly improve the performance of the ICA classifier.
- Published
- 2004
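The resample-then-vote structure described in the abstract above can be sketched in a few lines. This is not the paper's method: the ICA step is omitted, the resampling is done over feature dimensions (the abstract does not say what is resampled), and a 1-nearest-neighbour rule stands in for the per-subspace classifier. All names and default values are hypothetical.

```python
import random
from collections import Counter

def nearest_label(train, query, dims):
    """1-NN label using only the feature indices in `dims`."""
    def dist(x):
        return sum((x[d] - query[d]) ** 2 for d in dims)
    _, label = min(train, key=lambda item: dist(item[0]))
    return label

def random_subspace_vote(train, query, n_classifiers=15, subspace_size=2, seed=0):
    """Majority vote over classifiers built on random feature subsets.

    train: list of (feature_tuple, label) pairs
    query: feature tuple of the same length
    """
    rng = random.Random(seed)
    n_features = len(query)
    votes = []
    for _ in range(n_classifiers):
        # Each classifier sees only a random subset of the feature dimensions.
        dims = rng.sample(range(n_features), subspace_size)
        votes.append(nearest_label(train, query, dims))
    # The ensemble decision is the most frequent label.
    return Counter(votes).most_common(1)[0][0]
```

The point of the ensemble is that each weak classifier overfits a different low-dimensional view of a small training set, and voting averages those errors out.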
44. Face Recognition Using Overcomplete Independent Component Analysis
- Author
-
Jian Cheng, Yen-Wei Chen, Hanqing Lu, and Xiangyan Zeng
- Subjects
Set (abstract data type) ,ComputingMethodologies_PATTERNRECOGNITION ,Training set ,business.industry ,Computer science ,Speech recognition ,Pattern recognition ,Basis function ,Artificial intelligence ,business ,Facial recognition system ,Independent component analysis ,Subspace topology - Abstract
Most current face recognition algorithms find a set of basis functions in a subspace by training on the input data. However, in many applications, only limited training data are available. In this case, the performance of these classic algorithms degrades rapidly. Overcomplete independent component analysis (overcomplete ICA) can separate out more source signals than there are input data. In this paper, we use overcomplete ICA for face recognition with limited training data. The experimental results show that overcomplete ICA can efficiently improve the recognition rate.
- Published
- 2003
45. Multilevel Relevance Judgment, Loss Function, and Performance Measure in Image Retrieval
- Author
-
Hanqing Lu, Hong Wu, and Songde Ma
- Subjects
Computer science ,business.industry ,Relevance feedback ,Real image ,computer.software_genre ,Machine learning ,Ordinal regression ,Support vector machine ,Relevance (information retrieval) ,Artificial intelligence ,Data mining ,Precision and recall ,business ,Image retrieval ,computer - Abstract
Most learning algorithms for image retrieval are based on dichotomous relevance judgment (relevant and non-relevant), though this measurement of relevance is too coarse. To better identify the user's needs and preferences, a good retrieval system should be able to handle multilevel relevance judgment. In this paper, we focus on relevance feedback with multilevel relevance judgment. We consider relevance feedback as an ordinal regression problem, and discuss its properties and loss function. Since traditional performance measures such as precision and recall are based on dichotomous relevance judgment, we adopt a performance measure that is based on the preference of one image over another. Furthermore, we develop a new relevance feedback scheme based on a support vector learning algorithm for ordinal regression. Our solution is tested on a real image database, and promising results are achieved.
- Published
- 2003
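A preference-based performance measure of the kind mentioned in the abstract above can be illustrated with a small sketch. The abstract does not define its measure, so the formula here (the fraction of image pairs with different relevance levels that the retrieval scores rank in the right order) is an assumption, as is the function name.

```python
from itertools import combinations

def pairwise_preference_accuracy(relevance, scores):
    """Fraction of preference pairs that the scores order correctly.

    relevance: multilevel relevance judgments (higher = more relevant)
    scores:    retrieval scores for the same images (higher = ranked earlier)
    """
    correct = 0
    total = 0
    for i, j in combinations(range(len(relevance)), 2):
        if relevance[i] == relevance[j]:
            continue  # no preference between equally relevant images
        total += 1
        # Correct when the more relevant image also has the higher score.
        if (relevance[i] - relevance[j]) * (scores[i] - scores[j]) > 0:
            correct += 1
    return correct / total if total else 0.0
```

Unlike precision and recall, this kind of measure needs no cut-off between "relevant" and "non-relevant", so it works directly with three or more relevance levels.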
46. An ICA-Based Illumination-Free Texture Model and Its Application to Image Retrieval
- Author
-
Hanqing Lu, Yen-Wei Chen, Zensho Nakao, and Xiangyan Zeng
- Subjects
Pixel ,business.industry ,Computer science ,Feature vector ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Grayscale ,Image texture ,Feature (computer vision) ,Computer Science::Computer Vision and Pattern Recognition ,Computer vision ,Artificial intelligence ,Pattern matching ,business ,Image retrieval - Abstract
We propose a novel pixel pattern-based approach for texture classification, which is independent of the variance of illumination. Gray scale images are first transformed into pattern maps in which edges and lines, used for characterizing texture information, are classified by pattern matching. We employ independent component analysis (ICA) which is widely applied to feature extraction. We use the basis functions learned through PCA as templates for pattern matching. Using PCA pattern maps, the feature vector is comprised of the numbers of the pixels belonging to a specific pattern. The effectiveness of the new feature is demonstrated by applications to image retrieval of Brodatz texture database. Comparisons with multichannel and multiresolution features indicate that the new feature is quite time saving, free of the influence of illumination, and has notable accuracy. The applicability of the proposed method to image retrieval has also been demonstrated.
- Published
- 2002
47. Video Segmentation by Two Measures and Two Thresholds
- Author
-
DaLong Li, Hanqing Lu, and Dong Zhang
- Subjects
Video production ,Similarity (geometry) ,business.industry ,Computer science ,Shot (filmmaking) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image segmentation ,Measure (mathematics) ,Histogram ,Key (cryptography) ,Segmentation ,Computer vision ,Artificial intelligence ,business - Abstract
Video partitioning is a key issue in video classification that facilitates the management of video resources. Video partitioning involves the detection of boundaries between uninterrupted segments (video shots). Shot boundaries can be classified into two categories, gradual transitions and abrupt changes. Detection of a gradual transition is considered difficult, and few methods have been reported for it. In this paper, a new approach called Two Measures Two Thresholds (TMTT) is proposed. The method uses two measures and two thresholds. First, possible shot boundaries are located by comparing consecutive frames against a smaller threshold (Ts). Then false boundaries are discarded by comparing their color ratio histograms against another threshold that is used to measure the similarity of the content of the frames. The efficiency of TMTT is promising according to the analysis of some experimental results.
- Published
- 2000
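The two-stage idea in the abstract above (a permissive first threshold to collect candidates, then a second measure to discard false boundaries) can be sketched as follows. This is a minimal illustration under stated assumptions, not the TMTT algorithm itself: the abstract does not define the first measure, so an L1 histogram difference is assumed here, and histogram intersection stands in for the color ratio histogram comparison. The names two_stage_boundaries, t_small and t_content are hypothetical.

```python
def l1_difference(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def intersection(h1, h2):
    """Histogram intersection: larger means more similar content."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def two_stage_boundaries(coarse_hists, content_hists, t_small, t_content):
    """Return indices i where a shot boundary lies between frame i and i+1.

    Stage 1 (assumed measure): keep transitions whose L1 difference
    exceeds the small threshold t_small.
    Stage 2 (assumed measure): drop candidates whose content histograms
    are still similar, i.e. intersection >= t_content.
    """
    candidates = [
        i for i in range(len(coarse_hists) - 1)
        if l1_difference(coarse_hists[i], coarse_hists[i + 1]) > t_small
    ]
    return [
        i for i in candidates
        if intersection(content_hists[i], content_hists[i + 1]) < t_content
    ]


# Toy data: four frames, each with a 2-bin normalized histogram per measure.
coarse = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
content = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
boundaries = two_stage_boundaries(coarse, content, t_small=0.15, t_content=0.5)
```

In the toy data the small threshold flags all three transitions as candidates, and the second measure keeps only the one where the frame content actually changes.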
48. Virtual Mouse—Inputting Device by Hand Gesture Tracking and Recognition
- Author
-
Songde Ma, Changbo Hu, Lichen Liang, and Hanqing Lu
- Subjects
Property (programming) ,Computer science ,business.industry ,Input device ,Tracking (particle physics) ,Gesture recognition ,Active shape model ,Eye tracking ,Computer vision ,Condensation algorithm ,Artificial intelligence ,Hidden Markov model ,business ,Gesture - Abstract
In this paper, we develop a system to track and recognize hand motion in nearly real time. An important application of this system is to simulate a mouse as a visual input device. The tracking approach is based on the Condensation algorithm and an active shape model. Our contribution is combining multi-modal templates to increase the tracking performance. A weighting value is given to the sampling ratio of Condensation by applying the prior property of the templates. The recognition approach is based on HMM. Experiments show that our system is very promising as an auxiliary input device.
- Published
- 2000
49. Improvement of Shot Detection Using Illumination Invariant Metric and Dynamic Threshold Selection
- Author
-
Xianfeng Ding, Songde Ma, Hanqing Lu, and Weixin Kong
- Subjects
Color histogram ,Signal processing ,Video production ,business.industry ,Computer science ,Color image ,Histogram ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Computer vision ,Image segmentation ,Artificial intelligence ,business - Abstract
Automatic shot detection is the first step, and an important one, in content-based parsing and indexing of video data. Many methods have been introduced to address this problem, e.g. pixel-by-pixel comparisons and histogram comparisons. However, the gray or color histograms used in most existing methods ignore the problem of illumination variation inherent in the video production process, so they often fail when the incident illumination varies. Moreover, because shot change is basically a local process of a video, it is difficult to find an appropriate global threshold for an absolute difference measure. In this paper, new techniques for shot detection are proposed. We use color ratio histograms as the frame content measure, because they are robust to illumination changes. A local adaptive threshold technique is adopted to utilize the local characteristic of shot change. The effectiveness of our methods is validated by experiments on some real-world video sequences. Some experimental results are also discussed in this paper.
- Published
- 1999
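The local adaptive threshold mentioned in the abstract above can be illustrated with a sliding-window rule: a frame difference counts as a shot change only if it stands out from its own neighbourhood. The abstract does not give the actual rule, so the mean-plus-k-standard-deviations test, the function name adaptive_shot_changes, and the parameters half_window and k are assumptions for this sketch. Computing the color ratio histograms themselves is not shown; diffs is taken to be a precomputed list of frame-to-frame differences.

```python
from statistics import mean, pstdev


def adaptive_shot_changes(diffs, half_window=3, k=3.0):
    """Flag indices whose difference exceeds a locally computed threshold.

    diffs[i] is the difference between frame i and frame i + 1.
    The threshold for index i is mean + k * std of the neighbouring
    differences inside the window, excluding diffs[i] itself
    (an assumed rule, not taken from the paper).
    """
    changes = []
    for i, d in enumerate(diffs):
        lo = max(0, i - half_window)
        hi = min(len(diffs), i + half_window + 1)
        neighbours = diffs[lo:i] + diffs[i + 1:hi]
        if len(neighbours) < 2:
            continue  # not enough context to estimate a local threshold
        threshold = mean(neighbours) + k * pstdev(neighbours)
        if d > threshold:
            changes.append(i)
    return changes


# Toy sequence: a quiet segment with one cut, then a busier segment
# with a higher baseline difference and another cut.
diffs = [0.1, 0.12, 0.09, 0.9, 0.11, 0.1, 0.5, 0.52, 0.48, 2.0, 0.5, 0.51]
changes = adaptive_shot_changes(diffs)
```

Because the threshold is recomputed per window, the higher baseline of the second segment raises the bar there without affecting detection in the quieter first segment.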
50. Robust Tracking of Video Objects through Topological Constraint on Homogeneous Motion
- Author
-
Yi Li, Ming Liao, Hanqing Lu, and Songde Ma
- Subjects
Motion analysis ,Motion compensation ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Image segmentation ,Topology ,Quarter-pixel motion ,Constraint (information theory) ,Motion field ,Video tracking ,Motion estimation ,Segmentation ,Computer vision ,Artificial intelligence ,business ,Block-matching algorithm - Abstract
Considering the currently available methods for the motion analysis of video objects, we notice that the topological constraint on homogeneous motion is usually ignored in piecewise methods, or improperly imposed by blocks that do not have physical correspondence. In this paper we address the idea of area-based parametric motion estimation with a spatial constraint involved, so that the semantic segmentation and tracking of non-rigid objects can be undertaken in an interactive environment, which is the central requirement of applications such as MPEG-4/7 or content-based video retrieval. Global motion and occlusion can also be estimated through the tracking of background areas. In addition, based on the proposed hierarchical robust framework, accurate motion parameters between corresponding areas can be obtained and the computational efficiency is improved remarkably.
- Published
- 1999
Discovery Service for Jio Institute Digital Library