22 results for "HanQing Lu"
Search Results
2. Progressive rectification network for irregular text recognition
- Author
-
Yingying Chen, Yunze Gao, Hanqing Lu, and Jinqiao Wang
- Subjects
Computer science, Pattern recognition, Perspective (graphical), Transformation (function), Rectification, Iterative refinement, Line (geometry), Envelope (motion), Artificial intelligence - Abstract
Scene text recognition has received increasing attention in the research community. Text in the wild often possesses irregular arrangements, which typically include perspective, curved, and oriented texts. Most existing methods do not work well for irregular text, especially for severely distorted text. In this paper, we propose a novel progressive rectification network (PRN) for irregular scene text recognition. Our PRN progressively rectifies the irregular text to a front-horizontal view and further boosts the recognition performance. The distortions are removed step by step by leveraging the observation that the intermediate rectified result provides good guidance for subsequent higher-quality rectification. Moreover, by decomposing the rectification process into multiple procedures, the difficulty of each step is considerably mitigated. First, we perform a rough rectification, and then adopt iterative refinement to gradually achieve optimal rectification. Additionally, to avoid the boundary damage problem in direct iterations, we design an envelope-refinement structure to maintain the integrity of the text during the iterative process. Instead of the rectified images, the text line envelope is tracked and continually refined, which implicitly models the transformation information. The original input image is then consistently used for transformation based on the refined envelope. In this manner, the original character information is preserved until the final transformation. These designs lead to optimal rectification and boost the performance of the subsequent recognition. Extensive experiments on eight challenging datasets demonstrate the superiority of our method, especially on irregular benchmarks.
- Published
- 2020
3. Automatic group activity annotation for mobile videos
- Author
-
Chaoyang Zhao, Hanqing Lu, Jinqiao Wang, and Jianqiang Li
- Subjects
Computer Networks and Communications, Computer science, Inference, Context (language use), Machine learning, Annotation, Upload, Human–computer interaction, Media Technology, Feature (machine learning), Categorization, Hardware and Architecture, Artificial intelligence, Mobile device, Software, Information Systems - Abstract
Due to the rapid growth of modern mobile devices, users can capture a variety of videos anytime and anywhere. The explosive growth of mobile videos makes their categorization and management difficult and challenging. In this paper, we propose a novel approach to annotate group activities for mobile videos, which tags each person with an activity label and thus helps users efficiently manage their uploaded videos. To extract rich context information, we jointly model three co-existing cues: the activity duration time, the individual action feature, and the context information shared between person interactions. These appearance and context cues are then modeled within a structure learning framework, which can be solved by inference with a greedy forward search. Moreover, we can infer the group activity labels of all persons together with their activity durations, especially for situations where multiple group activities co-exist. Experimental results on a mobile video dataset show that the proposed approach achieves outstanding results for group activity classification and annotation.
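The inference step mentioned in the abstract, a greedy forward search over joint label assignments, can be sketched generically. This is a hypothetical illustration rather than the paper's code: the `score` callable standing in for the learned structured model is an assumption, and real models would score duration and interaction terms as well.

```python
def greedy_forward_search(n_items, labels, score):
    """Greedy forward inference sketch: repeatedly commit the single
    (item, label) assignment that most increases a joint score, until
    every item carries a label. `score` evaluates a partial assignment
    given as a dict {item_index: label}."""
    assignment = {}
    while len(assignment) < n_items:
        best_gain, best_move = None, None
        base = score(assignment)
        for i in range(n_items):
            if i in assignment:
                continue
            for lab in labels:
                assignment[i] = lab          # tentatively add the move
                gain = score(assignment) - base
                del assignment[i]            # roll it back
                if best_gain is None or gain > best_gain:
                    best_gain, best_move = gain, (i, lab)
        i, lab = best_move                   # commit the best move
        assignment[i] = lab
    return assignment
```

With a purely additive score this reduces to per-item argmax; the interest of the scheme is that `score` may couple persons through interaction terms, in which case earlier commitments steer later ones.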
- Published
- 2016
4. Learning discriminative context models for concurrent collective activity recognition
- Author
-
Jinqiao Wang, Chaoyang Zhao, and Hanqing Lu
- Subjects
Computer Networks and Communications, Computer science, Context (language use), Machine learning, Activity recognition, Discriminative model, Hardware and Architecture, Activity classification, Media Technology, Artificial intelligence, Software, Discriminative learning - Abstract
Collective activity classification is the task of identifying activities in which multiple persons participate, which often involves context information such as person relationships and person interactions. Most existing approaches assume that all individuals in a single image share the same activity label. However, in many real-world scenarios, multiple activities co-exist and serve as context cues for each other. Based on this observation, in this paper, a unified discriminative learning framework of multiple context models is proposed for concurrent collective activity recognition. First, both the intra-class and inter-class behaviour interactions among persons in a scenario are considered. Second, the scenario in which activities happen also provides additional context information for recognizing specific collective activities. Finally, we jointly model the multiple context cues (intra-class, inter-class and global-context) with a max-margin learning framework. A greedy forward search method is utilized to label the activities in the testing scenes. Experimental results demonstrate the superiority of our approach in activity recognition.
- Published
- 2016
5. Enriching one-class collaborative filtering with content information from social media
- Author
-
Hanqing Lu, Qinshan Liu, Ting Yuan, Xi Zhang, and Jian Cheng
- Subjects
Topic model, Information retrieval, Computer Networks and Communications, Computer science, Recommender system, Missing data, Information overload, Hardware and Architecture, Media Technology, Collaborative filtering, Profiling (information science), Social media, Software, Information Systems - Abstract
In recent years, recommender systems have become popular for handling the information overload problem of social media websites. The most widely used Collaborative Filtering methods make recommendations by mining users' rating history. However, users' behaviors in social media are usually implicit, where no ratings are available. This is a One-Class Collaborative Filtering (OCCF) problem with only positive examples. How to distinguish the negative examples from missing data is important for OCCF. Existing OCCF methods tackle this using the statistical properties of users' historical behavior; however, they ignore the rich content information on social media websites, which provides additional evidence for profiling users and items. In this paper, we propose to improve OCCF accuracy by exploiting social media content information to find potential negative examples among the missing user-item pairs. Specifically, we obtain a content topic feature for each user and item by probabilistic topic modeling and embed them into the Matrix Factorization model. Extensive experiments show that our algorithm achieves better performance than the state-of-the-art methods.
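The core idea, embedding content topic features into matrix factorization so that missing pairs with dissimilar topics count as more confident negatives, can be sketched in a few lines. This is a simplified, hypothetical reading of the approach: the paper's exact objective and optimizer are not reproduced, and the `1 - similarity` confidence weighting for missing pairs is an assumption.

```python
import numpy as np

def occf_mf(R, user_topics, item_topics, k=8, lr=0.02, reg=0.01, epochs=100, seed=0):
    """One-class MF sketch: R is a binary implicit-feedback matrix
    (1 = observed interaction, 0 = missing). Missing pairs whose
    user/item topic vectors disagree get higher negative confidence."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    sim = user_topics @ item_topics.T                 # topic agreement in [0, 1]
    W = np.where(R > 0, 1.0, 1.0 - sim)               # per-pair confidence weight
    for _ in range(epochs):
        E = W * (R - P @ Q.T)                         # confidence-weighted residual
        P += lr * (E @ Q - reg * P)                   # gradient step on user factors
        Q += lr * (E.T @ P - reg * Q)                 # gradient step on item factors
    return P, Q
```

Recommendations would then rank items for a user by the scores `P[u] @ Q.T`, with topic-dissimilar unobserved items pushed down by the weighted fit.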
- Published
- 2014
6. Boosted MIML method for weakly-supervised image semantic segmentation
- Author
-
Zechao Li, Hanqing Lu, Jing Liu, and Yang Liu
- Subjects
Training set, Boosting (machine learning), Computer Networks and Communications, Computer science, LabelMe, Scale-space segmentation, Pattern recognition, Machine learning, Hardware and Architecture, Media Technology, Segmentation, Artificial intelligence, Software - Abstract
Weakly-supervised image semantic segmentation aims to segment images into semantically consistent regions when only image-level labels are available, and is of great significance for fine-grained image analysis, retrieval and other applications. In this paper, we propose a Boosted Multi-Instance Multi-Label (BMIML) learning method to address this problem; the approach is built upon the following principles. We formulate the image semantic segmentation task as a MIML problem under the boosting framework, where the goal is to simultaneously split the superpixels obtained from over-segmented images into groups and train one classifier for each group. In the method, a loss function which uses the image-level labels as weakly-supervised constraints is employed to assign suitable semantic labels to these classifiers. At the same time, a contextual loss term is incorporated to reduce the ambiguities in the training data. In each boosting round, we introduce an "objectness" measure to jointly reweight the instances, in order to overcome the disturbance from highly frequent background superpixels. We demonstrate that BMIML outperforms the state-of-the-art methods for weakly-supervised semantic segmentation on two widely used datasets, i.e., MSRC and LabelMe.
- Published
- 2014
7. Chat with illustration
- Author
-
Yu Jiang, Jing Liu, and Hanqing Lu
- Subjects
Multimedia, Computer Networks and Communications, Computer science, First language, Automatic summarization, Computer graphics, Hardware and Architecture, Media Technology, Social media, Cluster analysis, Software, Information Systems - Abstract
Instant messaging is an important aspect of social media and has grown rapidly over the last decades. Traditional instant messaging services transfer information mainly through textual messages, while visual messages are ignored to a great extent. Such services are thus far from satisfactory for all-around information communication. In this paper, we propose a novel visual-assisted instant messaging scheme named Chat with Illustration (CWI), which automatically presents users with visual messages associated with their textual messages. When users start a chat, the system first identifies meaningful keywords from the dialogue content and analyzes grammatical and logical relations. Then CWI performs keyword-based image search on a hierarchically clustered image database built offline. Finally, according to the grammatical and logical relations, CWI assembles these images properly and presents an optimal visual message. With the combination of textual and visual messages, users get a more interesting and vivid communication experience. Especially for speakers of different native languages, CWI can help cross the language barrier to some degree. In addition, a visual dialogue summarization is also proposed, which helps users recall past dialogues. In-depth user studies demonstrate the effectiveness of our visual-assisted instant messaging scheme.
- Published
- 2014
8. Learning latent semantic model with visual consistency for image analysis
- Author
-
Peng Li, Ting Rui, Jian Cheng, and Hanqing Lu
- Subjects
Probabilistic latent semantic analysis, Computer Networks and Communications, Computer science, Latent semantic analysis, Feature vector, Document-term matrix, Semantic data model, Latent Dirichlet allocation, Semantic similarity, Hardware and Architecture, Explicit semantic analysis, Semantic computing, Media Technology, Artificial intelligence, Cluster analysis, Software, Natural language processing, Latent semantic indexing - Abstract
Latent semantic models (e.g. PLSA and LDA) have been successfully used in document analysis. In recent years, many latent semantic models have also proven promising for visual content analysis tasks, such as image clustering and classification. The topics and words, two of the key components in latent semantic models, have explicit semantic meaning in document analysis. However, these topics and words are difficult to describe or represent in visual content analysis tasks, which often leads to failure in practice. In this paper, we simultaneously consider topic consistency and word consistency in the semantic space to adapt the traditional PLSA model to visual content analysis tasks. In our model, the l1-graph is constructed to model the local neighborhood structure of images in the feature space, and the word co-occurrence is computed to capture the local word consistency. Then, the local information is incorporated into the model for topic discovery. Finally, the generalized EM algorithm is used to estimate the parameters. Extensive experiments on publicly available databases demonstrate the effectiveness of our approach.
- Published
- 2014
9. Finding logos in real-world images with point-context representation-based region search
- Author
-
Hanqing Lu, Jianlong Fu, and Jinqiao Wang
- Subjects
Logo recognition, Computer Networks and Communications, Computer science, Pattern recognition, Inverted index, Hardware and Architecture, Media Technology, Search problem, Clutter, Computer vision, Artificial intelligence, Software, Information Systems - Abstract
Finding logos in real-world images is a challenging task due to their small size, simple shapes, limited texture and cluttered backgrounds. In this paper, through visual logo analysis with different types of features, we propose a novel framework for finding visual logos in real-world images. First, we exploit the contextual shape and patch information around feature points and merge them into a combined feature representation (point-context). Considering the characteristics of logos, this kind of fusion effectively enhances the discriminability of single point features. Second, to eliminate the interference of complex and noisy backgrounds, we transform logo recognition into a region-to-image search problem by segmenting real-world images into region trees. A weak geometric constraint based on regions is encoded into an inverted file structure to accelerate the search process. Third, we apply global features to refine the initial results in the re-ranking stage. Finally, we combine the region scores in both max-response and accumulate-response modes to obtain the final results. The performance of the proposed approach is evaluated on both our CASIA-LOGO dataset and the standard Flickr logos 27 dataset. Experiments and comparisons show that our approach is superior to the state-of-the-art approaches.
- Published
- 2013
10. Semi-supervised Unified Latent Factor learning with multi-view data
- Author
-
Hanqing Lu, Yu Jiang, Jing Liu, and Zechao Li
- Subjects
Optimization problem, Iterative method, Semi-supervised learning, Machine learning, Computer Science Applications, Non-negative matrix factorization, Hardware and Architecture, Computer Vision and Pattern Recognition, Artificial intelligence, Data mining, Cluster analysis, Representation (mathematics), Software, Mathematics - Abstract
Explosive amounts of multimedia resources are generated on the web, which can typically be considered multi-view data in nature. In this paper, we present a Semi-supervised Unified Latent Factor learning approach (SULF) to learn a predictive unified latent representation by leveraging both the complementary information among multiple views and the supervision from partial label information. On one hand, SULF employs a collaborative Nonnegative Matrix Factorization formulation to discover a unified latent space shared across multiple views. On the other hand, SULF adopts a regularized regression model to minimize the prediction loss on partially labeled data with the latent representation. Consequently, the obtained parts-based representation has more discriminating power. In addition, we develop a mechanism to learn the weights of different views automatically. To solve the proposed optimization problem, we design an effective iterative algorithm. Extensive experiments are conducted for both classification and clustering tasks on three real-world datasets, and the comparison results demonstrate the superiority of our approach.
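The shared-latent-space half of this formulation, several views factorized with one common coefficient matrix, can be sketched with standard multiplicative NMF updates. The sketch below deliberately omits the paper's supervised regression term and automatic view weighting, so it is only the unsupervised core under those assumptions.

```python
import numpy as np

def multiview_nmf(Xs, k=5, iters=100, seed=0):
    """Joint NMF sketch: each nonnegative view X_v (features x samples)
    gets its own basis U_v, but ALL views share one coefficient matrix H,
    fitted by multiplicative updates for sum_v ||X_v - U_v H||_F^2."""
    rng = np.random.default_rng(seed)
    n = Xs[0].shape[1]
    Us = [rng.random((X.shape[0], k)) for X in Xs]
    H = rng.random((k, n))
    eps = 1e-9                                     # guards against division by zero
    for _ in range(iters):
        for v, X in enumerate(Xs):                 # per-view basis update
            Us[v] *= (X @ H.T) / (Us[v] @ H @ H.T + eps)
        num = sum(U.T @ X for U, X in zip(Us, Xs)) # shared-H update pools all views
        den = sum(U.T @ U @ H for U in Us) + eps
        H *= num / den
    return Us, H
```

The columns of `H` serve as the unified sample representation; a downstream classifier or clustering algorithm would operate on them.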
- Published
- 2013
11. Key observation selection-based effective video synopsis for camera network
- Author
-
Jinqiao Wang, Xiaobin Zhu, Hanqing Lu, and Jing Liu
- Subjects
Motion compensation, Video post-processing, Multimedia, Computer science, Video processing, Computer Science Applications, Hardware and Architecture, Video tracking, Video browsing, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, Software - Abstract
Nowadays, a tremendous amount of video is captured endlessly by the increasing number of video cameras distributed around the world. Since needless information is abundant in raw videos, video browsing and retrieval are inefficient and time-consuming. Video synopsis is an effective way to browse and index such video, producing a short video representation while keeping the essential activities of the original video. However, video synopsis for a single camera is limited in its view scope, while understanding and monitoring the overall activity of large scenarios is valuable and demanding. To solve these issues, we propose a novel video synopsis algorithm for a partially overlapping camera network. Our main contributions reside in three aspects. First, our algorithm can generate video synopses for large scenarios, which facilitates understanding of overall activities. Second, to generate the overall activity, we adopt a novel unsupervised graph matching algorithm to associate trajectories across cameras. Third, a novel multiple kernel similarity is adopted in selecting key observations to eliminate content redundancy in the video synopsis. We demonstrate the effectiveness of our approach on real surveillance videos captured by our camera network.
- Published
- 2013
12. Sparse semantic metric learning for image retrieval
- Author
-
Hanqing Lu, Zechao Li, and Jing Liu
- Subjects
Optimization problem, Computer Networks and Communications, Computer science, Visual space, Overfitting, Machine learning, Hardware and Architecture, Feature (computer vision), Metric (mathematics), Media Technology, Visual Word, Artificial intelligence, Image retrieval, Software, Information Systems, Semantic gap - Abstract
Typical content-based image retrieval solutions usually cannot achieve satisfactory performance due to the semantic gap challenge. With the popularity of social media applications, large amounts of social images associated with user tagging information are available, which can be leveraged to boost image retrieval. In this paper, we propose a sparse semantic metric learning (SSML) algorithm that discovers knowledge from these social media resources, and apply the learned metric to search relevant images for users. Different from traditional metric learning approaches that use similar or dissimilar constraints over a homogeneous visual space, the proposed method exploits heterogeneous information from two views of images and formulates the learning problem with the following principles. The semantic structure in the text space is expected to be preserved in the transformed space. To prevent overfitting the noisy, incomplete, or subjective tagging information of images, we expect that the mapping space induced by the learned metric does not deviate from the original visual space. In addition, the metric is directly constrained to be row-wise sparse with the l2,1-norm to suppress noisy or redundant visual feature dimensions. We present an iterative algorithm with proven convergence to solve the optimization problem. With the learned metric for image retrieval, we conduct extensive experiments on a real-world dataset and validate the effectiveness of our approach compared with other related work.
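The row-wise sparsity mentioned above hinges on the l2,1-norm, whose proximal operator soft-thresholds whole rows at once and thereby zeroes out entire feature dimensions. A minimal sketch of that operator (not the paper's full solver, which couples it with the semantic-preservation terms):

```python
import numpy as np

def prox_l21(M, t):
    """Proximal operator of t * ||M||_{2,1}: each row of M is shrunk
    toward zero by t in Euclidean norm, and rows with norm <= t are
    set exactly to zero (suppressing that feature dimension)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return M * scale
```

Inside a proximal-gradient loop, alternating a gradient step on the smooth loss with `prox_l21` produces metrics whose zero rows correspond to discarded visual feature dimensions.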
- Published
- 2013
13. A three-level framework for affective content analysis and its case studies
- Author
-
Xiangjian He, Jesse S. Jin, Min Xu, Jinqiao Wang, Hanqing Lu, and Suhuai Luo
- Subjects
Modality (human–computer interaction), Multimedia, Computer Networks and Communications, Computer science, Software Engineering, Affect (psychology), Hardware and Architecture, Perception, Media Technology, Subtitle, Artificial Intelligence & Image Processing, Software, Cognitive psychology - Abstract
Emotional factors directly reflect audiences' attention, evaluation and memory. Recently, video affective content analysis has attracted more and more research effort. Most existing methods map low-level affective features directly to emotions by applying machine learning. Compared to the human perception process, there is a gap between low-level features and the high-level human perception of emotion. In order to bridge this gap, we propose a three-level affective content analysis framework by introducing a mid-level representation to indicate dialog, audio emotional events (e.g., horror sounds and laughter) and textual concepts (e.g., informative keywords). The mid-level representation is obtained from machine learning on low-level features and is used to infer high-level affective content. We further apply the proposed framework to a number of case studies. Audio emotional events, dialog and subtitles are studied to assist affective content detection in different video domains/genres. Multiple modalities are considered for affective analysis, since each modality has its own merit in evoking emotions. Experimental results show that the proposed framework is effective and efficient for affective content analysis, and that audio emotional events, dialog and subtitles are promising mid-level representations.
- Published
- 2012
14. Interactive ads recommendation with contextual search on product topic space
- Author
-
Jinqiao Wang, Hanqing Lu, Bo Wang, Ling-Yu Duan, and Qi Tian
- Subjects
Computer Networks and Communications, Computer science, Contextual advertising, Automatic summarization, Digital media, World Wide Web, Hardware and Architecture, Media Technology, Cluster analysis, Software - Abstract
The rapid popularization of various online media services has attracted large numbers of consumers and reveals a large potential market for video advertising. In this paper, we propose interactive service recommendation based on an ad concept hierarchy and contextual search. Instead of the traditional ODP (Open Directory Project) based approach, we build an ad-domain concept hierarchy to make the most of the product details on e-commerce sites. First, we capture the summarization images related to the advertised product in the video content and search for visually similar product images in the built product image database. Then, we aggregate the visual tags and textual tags with K-line clustering. Finally, we map them to the product concept space and suggest keywords, and users can interactively select keyframes or keywords to personalize their intentions through textual re-search. Experiments and comparisons show that the system can accurately provide effective advertising suggestions.
- Published
- 2011
15. A hierarchical approach for background modeling and moving objects detection
- Author
-
Jinqiao Wang, Jie Yang, and Hanqing Lu
- Subjects
Speedup, Pixel, Computer science, Robotics, Motion detection, Mechatronics, Computer Science Applications, Visual surveillance, Control and Systems Engineering, Computer vision, Artificial intelligence - Abstract
In this paper, we propose a hierarchical approach for background modeling and moving object detection in intelligent visual surveillance systems. The proposed approach models the background hierarchically at the block level and the pixel level; the background is represented by texture information at the block level and by color information at the pixel level. Meanwhile, a variable learning rate for the model parameters is proposed to speed up their convergence in the pixel model. The proposed approach provides many advantages compared to the state-of-the-art. Experimental results demonstrate the effectiveness and efficiency of the proposed approach.
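A minimal two-level sketch of this kind of hierarchy might look as follows. The specific block test (a mean-intensity shift rather than the paper's texture features) and the running-average pixel model are simplifying assumptions made for illustration, as are all thresholds.

```python
import numpy as np

class HierarchicalBackground:
    """Two-level background-subtraction sketch for grayscale frames:
    a coarse per-block test flags blocks whose mean intensity shifts,
    then only flagged blocks are tested per pixel; static blocks update
    a running-average background model instead."""

    def __init__(self, first_frame, block=8, alpha=0.05, tau_block=10.0, tau_pix=25.0):
        self.bg = first_frame.astype(float)   # background model, initialized from frame 0
        self.block, self.alpha = block, alpha
        self.tau_block, self.tau_pix = tau_block, tau_pix

    def apply(self, frame):
        frame = frame.astype(float)
        h, w = frame.shape
        b = self.block
        mask = np.zeros((h, w), bool)         # True = moving pixel
        for y in range(0, h, b):
            for x in range(0, w, b):
                fb = frame[y:y+b, x:x+b]
                gb = self.bg[y:y+b, x:x+b]
                if abs(fb.mean() - gb.mean()) > self.tau_block:
                    # candidate moving block: refine with a per-pixel test
                    mask[y:y+b, x:x+b] = np.abs(fb - gb) > self.tau_pix
                else:
                    # static block: let the pixel model absorb the new frame
                    self.bg[y:y+b, x:x+b] = (1 - self.alpha) * gb + self.alpha * fb
        return mask
```

The block level acts as a cheap gate, so the per-pixel test and model update each run on only part of the frame per step.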
- Published
- 2010
16. Personalized retrieval of sports video based on multi-modal analysis and user preference acquisition
- Author
-
Changsheng Xu, Hanqing Lu, Xiaoyu Zhang, and Yifan Zhang
- Subjects
Information retrieval, Computer Networks and Communications, Computer science, Relevance feedback, Text annotation, Semantics, Preference, Annotation, Hardware and Architecture, Video tracking, Web page, Media Technology, Software - Abstract
In this paper, we present a novel framework for personalized retrieval of sports video, which includes two research tasks: semantic annotation and user preference acquisition. For semantic annotation, web-casting texts corresponding to sports videos are first captured from webpages using data region segmentation and labeling. Incorporating the text, we detect events in the sports video and generate video event clips. These video clips are annotated with the semantics extracted from the web-casting texts and indexed in a sports video database. Based on the annotation, these video clips can be retrieved by different semantic attributes according to the user preference. For user preference acquisition, we utilize click-through data as feedback from the user. Relevance feedback is applied to the text annotation and visual features to infer the intention and points of interest of the user. A user preference model is learned to re-rank the initial results. Experiments are conducted on broadcast soccer and basketball videos and show an encouraging performance of the proposed method.
- Published
- 2009
17. Automatic composition of broadcast sports video
- Author
-
Qi Tian, Eng Siong Chng, Hanqing Lu, Changsheng Xu, and Jinjun Wang
- Subjects
Video production, Multimedia, Computer Networks and Communications, Computer science, Video capture, Video content analysis, Video processing, Video editing, Hardware and Architecture, Video tracking, Media Technology, Software, Information Systems - Abstract
This study examines an automatic broadcast soccer video composition system. The research is important because the ability to automatically compose broadcast sports video will not only improve broadcast video generation efficiency, but also make it possible to customize sports video broadcasting. We present a novel approach to the two major modules required in the system's implementation: the camera view selection/switching module and the automatic replay generation module. In our implementation, we use a multi-modal framework to perform video content analysis and event and event boundary detection from the raw unedited main/sub-camera captures. This framework explores the possible cues using mid-level representations to bridge the gap between low-level features and high-level semantics. The video content analysis results are utilized for camera view selection/switching in the generated video composition, and the event detection results and mid-level representations are used to generate replays which are automatically inserted into the broadcast soccer video. Our experimental results are promising and found to be comparable to those generated by broadcast professionals.
- Published
- 2008
18. An improved variable-size block-matching algorithm
- Author
-
Hanqing Lu, Qingshan Liu, and Haifeng Wang
- Subjects
Matching (statistics), Speedup, Computer Networks and Communications, Computer science, Variable size, Pattern recognition, Hardware and Architecture, Motion estimation, Media Technology, Artificial intelligence, Algorithm, Software, Block-matching algorithm - Abstract
In this paper, we propose an improved "bottom-up" variable-size block matching method. Different from previous work, the proposed method does not need any threshold during matching; we simply keep all the motion vectors leading to the minimum matching error. A macro-block mode prediction method is put forward to speed up the motion estimation procedure without introducing any loss in prediction precision. The improved variable-size block matching algorithm achieves exactly the same prediction precision as a full-search-based fixed-size block matching algorithm. In order to reduce the effect of illumination change on mode selection, we propose an illumination removal method, which acts as a post-processing step to prevent the macro-blocks from over-splitting. Experiments show its encouraging performance.
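The fixed-size full-search baseline whose precision the method matches can be sketched directly. The block size, search range, and SAD criterion below are conventional choices for illustration, not parameters taken from the paper.

```python
import numpy as np

def full_search_sad(ref, cur, y, x, b=8, search=4):
    """Exhaustive block matching: for the b x b block of `cur` anchored
    at (y, x), scan every displacement within a +/-`search` window of
    `ref` and return the motion vector (dy, dx) with the minimum sum of
    absolute differences (SAD), together with that SAD."""
    block = cur[y:y+b, x:x+b].astype(int)
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + b > ref.shape[0] or xx + b > ref.shape[1]:
                continue                      # candidate block leaves the frame
            sad = np.abs(ref[yy:yy+b, xx:xx+b].astype(int) - block).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best, best_sad
```

A variable-size scheme in the bottom-up spirit would run this on small blocks first and merge neighbours that share a motion vector into larger blocks.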
- Published
- 2007
19. Kernel-based nonlinear discriminant analysis for face recognition
- Author
-
Hanqing Lu, Songde Ma, Rui Huang, and Qingshan Liu
- Subjects
Computer science ,Feature vector ,Facial recognition system ,Kernel principal component analysis ,Theoretical Computer Science ,Kernel (linear algebra) ,Polynomial kernel ,business.industry ,Dimensionality reduction ,Pattern recognition ,Linear discriminant analysis ,Linear subspace ,Computer Science Applications ,Kernel method ,Computational Theory and Mathematics ,Hardware and Architecture ,Kernel embedding of distributions ,Variable kernel density estimation ,Optimal discriminant analysis ,Principal component analysis ,Radial basis function kernel ,Principal component regression ,Artificial intelligence ,Kernel Fisher discriminant analysis ,business ,Software ,Subspace topology - Abstract
Linear subspace analysis methods have been successfully applied to extract features for face recognition, but because of their linear nature they are inadequate for representing the complex, nonlinear variations of real face images, such as changes in illumination, facial expression, and pose. In this paper, a nonlinear subspace analysis method, Kernel-based Nonlinear Discriminant Analysis (KNDA), is presented for face recognition. KNDA combines the nonlinear kernel trick with Fisher Linear Discriminant Analysis (FLDA), a linear subspace analysis method. First, the kernel trick is used to project the input data into an implicit feature space; then FLDA is performed in this feature space, yielding nonlinear discriminant features of the input data. In addition, to reduce the computational complexity, a geometry-based feature vector selection scheme is adopted. A similar nonlinear subspace analysis method is Kernel-based Principal Component Analysis (KPCA), which combines the kernel trick with linear Principal Component Analysis (PCA). Experiments are performed with the polynomial kernel, and KNDA is compared with KPCA and FLDA. Extensive experimental results show that KNDA achieves a higher recognition rate than both KPCA and FLDA.
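The two-stage pipeline the abstract describes (kernel trick first, then FLDA in the induced feature space) can be sketched under simplifying assumptions: here kernel PCA supplies explicit coordinates for the feature space, and a two-class Fisher direction is then computed in those coordinates. The function names and the concentric-circles data are illustrative, not from the paper.

```python
import numpy as np

def poly_kernel(A, B, degree=2):
    # polynomial kernel k(x, y) = (x . y + 1)^degree
    return (A @ B.T + 1.0) ** degree

def kpca_features(K, n_components):
    # center the kernel matrix, then eigendecompose (standard kernel PCA)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # project training points onto the normalized eigenvectors
    return Kc @ (vecs / np.sqrt(np.maximum(vals, 1e-12)))

def fisher_direction(X0, X1, reg=1e-6):
    # FLDA in the kernel-induced coordinates: w = Sw^-1 (m1 - m0)
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    return np.linalg.solve(Sw + reg * np.eye(X0.shape[1]), m1 - m0)
```

With a degree-2 polynomial kernel, two concentric rings (which no linear discriminant can separate) become linearly separable in the feature space, which is the point the abstract makes against purely linear methods.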
- Published
- 2003
20. A non-parameter Bayesian classifier for face recognition
- Author
-
Hanqing Lu, Songde Ma, and Qingshan Liu
- Subjects
business.industry ,Kernel density estimation ,Pattern recognition ,Facial recognition system ,Kernel principal component analysis ,k-nearest neighbors algorithm ,Naive Bayes classifier ,ComputingMethodologies_PATTERNRECOGNITION ,Variable kernel density estimation ,Kernel (statistics) ,Principal component analysis ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Mathematics - Abstract
A non-parametric Bayesian classifier based on Kernel Density Estimation (KDE) is presented for face recognition; in formulation, it can be regarded as a weighted Nearest Neighbor (NN) classifier. The class-conditional density is estimated by KDE, and the bandwidth of the kernel function is estimated by the Expectation-Maximization (EM) algorithm. Two subspace analysis methods, linear Principal Component Analysis (PCA) and Kernel-based PCA (KPCA), are used to extract features, and the proposed method is compared with the Probabilistic Reasoning Models (PRM), Nearest Center (NC), and NN classifiers widely used in face recognition systems. Experiments on two benchmarks show that the KDE-based classifier outperforms the PRM, NC, and NN classifiers.
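The classifier's core idea can be sketched minimally: estimate each class-conditional density with a Gaussian KDE and predict the class with the highest density. The fixed bandwidth and equal class priors here are simplifying assumptions, as are the function names; the paper estimates the bandwidth with EM.

```python
import numpy as np

def kde_log_density(x, samples, bandwidth):
    # Gaussian KDE: average of isotropic Gaussians centered at the samples
    d = samples.shape[1]
    sq = ((samples - x) ** 2).sum(axis=1)
    log_k = -sq / (2 * bandwidth ** 2) - 0.5 * d * np.log(2 * np.pi * bandwidth ** 2)
    return np.log(np.exp(log_k).mean())

def kde_bayes_predict(x, class_samples, bandwidth):
    # non-parametric Bayes rule: argmax over classes of the KDE
    # class-conditional density (equal priors assumed)
    scores = {c: kde_log_density(x, s, bandwidth) for c, s in class_samples.items()}
    return max(scores, key=scores.get)
```

Because each training sample contributes a kernel weight that decays with distance, this is exactly the "weighted NN classifier" reading the abstract mentions.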
- Published
- 2003
21. Head tracking using shapes and adaptive color histograms
- Author
-
Songde Ma, Hanqing Lu, and Qingshan Liu
- Subjects
Normalization (statistics) ,Color histogram ,Balanced histogram thresholding ,business.industry ,Computer science ,Color normalization ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Histogram matching ,Computer Science Applications ,Theoretical Computer Science ,Computational Theory and Mathematics ,Hardware and Architecture ,Histogram ,RGB color model ,Computer vision ,Adaptive histogram equalization ,Artificial intelligence ,Mean-shift ,business ,Software ,Histogram equalization - Abstract
A new method is presented for tracking a person's head in real time. The head is modeled as an ellipse, and an adaptively updated RGB color histogram represents the tracked object (the head). The method consists of two parts. First, a robust nonparametric technique, the mean shift algorithm, is adopted for histogram matching to estimate the head's location in the current frame. Second, a local search is performed after histogram matching to maximize the normalized gradient magnitude around the boundary of the elliptical head, yielding a more accurate location and the best scale for the head. On several test sequences, the method is demonstrated to run in real time and to be robust to clutter, scale variation, occlusion, rotation, and camera motion.
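The mean-shift step of such a tracker can be sketched as follows, assuming the color histogram has already been back-projected into a per-pixel weight map: the search window repeatedly moves to the centroid of the weights it covers until it stops shifting. The rectangular window (rather than an ellipse), the function name, and the test data are illustrative simplifications.

```python
import numpy as np

def mean_shift(weights, window, n_iter=20):
    """Shift a (y, x, h, w) window toward the local mode of a 2D
    weight map (e.g. a histogram back-projection)."""
    y, x, h, w = window
    for _ in range(n_iter):
        patch = weights[y:y + h, x:x + w]
        m = patch.sum()
        if m == 0:
            break  # no support under the window; cannot shift
        ys, xs = np.mgrid[0:h, 0:w]
        cy = (ys * patch).sum() / m          # centroid of the weights
        cx = (xs * patch).sum() / m
        dy = int(round(cy - (h - 1) / 2))    # shift window center to centroid
        dx = int(round(cx - (w - 1) / 2))
        y = int(np.clip(y + dy, 0, weights.shape[0] - h))
        x = int(np.clip(x + dx, 0, weights.shape[1] - w))
        if dy == 0 and dx == 0:
            break  # converged to the mode
    return y, x
```

The paper's second stage, the gradient-based boundary search, would then refine the converged position and pick the best ellipse scale.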
- Published
- 2002
22. Emission of M2X+ cluster ions in thermal ionization mass spectrometry in the presence of graphite
- Author
-
Yinmin Zhou, Haizheng Wei, Yingkai Xiao, Yun-Hui Wang, Hanqing Lu, Weiguo Liu, and Qingtao Wang
- Subjects
Lattice energy ,Ionic radius ,Physics::Plasma Physics ,Chemistry ,Physics::Atomic and Molecular Clusters ,Analytical chemistry ,Solvation ,Thermal ionization ,Thermal ionization mass spectrometry ,Mass spectrometry ,Biochemistry ,Ion source ,Ion - Abstract
The emission of M2X+ cluster ions in thermal ionization mass spectrometry when graphite is loaded on the heating filaments was studied. The emission model of non-reductive thermal ionization of graphite was preliminarily discussed and factors influencing the thermal emission of M2X+ ions were investigated. The results show that the intensities of M2X+ cluster ions are related to ionic radius and crystal lattice energy, and possibly also to the solvation energies of ions. The intensities of M2Cl+ (M stands for K, Rb, and Cs) cluster ions, the M2Cl+/M+ ratios, and the 37Cl/35Cl ratios determined from M2Cl+ ion measurement usually increase with measurement time. The variation of the 37Cl/35Cl ratios determined from Cs2Cl+ ion measurement is lower than those based on K2Cl+ and Rb2Cl+ ion measurement, indicating the lowest isotopic fractionation.
- Published
- 2001