332 results
Search Results
2. A biometric system based on Gabor feature extraction with SVM classifier for Finger-Knuckle-Print.
- Author
- Muthukumar, A. and Kavipriya, A.
- Subjects
- FEATURE extraction, HAMMING distance, SUPPORT vector machines, GABOR filters, BIOMETRIC fingerprinting, PATTERN perception
- Abstract
• New idea for FKP authentication using Gabor feature extraction to generate short and long features. • Enrolled templates are matched against the query with the Hamming distance for further classification. • The SVM classifier plays a major role in separating genuine users from impostors. • The double-instance combination of FKP shows best results of 92.33% (MIN rule) and 96.01% (MAX rule). An authentic personal identification infrastructure helps control access in order to secure data and information. Biometric technology is mainly based on physiological or behavioural characteristics of the human body. This paper elucidates a Finger Knuckle Print (FKP) biometric system based on a feature extraction methodology using short and long Gabor features. The FKP authentication system involves all basic processes: pre-processing, feature extraction and classification. Feature extraction is done by a Gabor filter, which extracts the important features from the FKP dataset. The query FKP Gabor features are matched and compared with the enrolled template using the Hamming distance (HD). Finally, this paper proposes FKP recognition based on Support Vector Machines (SVM) combined with score-level fusion to improve the recognition performance of FKP by integrating the Gabor features. The main aim of this paper is to utilize the ability of SVM in pattern recognition and classification together with the Hamming distance, which helps improve the False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR). This new double-instance combination of FKP shows better results (96.01% for the MAX rule and 92.33% for the MIN rule) than the single-instance performance of 89.11%. This idea shows good results in Finger Knuckle Print recognition of a person. [ABSTRACT FROM AUTHOR]
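Not from the paper itself: the matching step the abstract names — binary feature codes compared with the normalized Hamming distance, then MAX/MIN-rule score fusion over two finger instances — can be sketched generically. The 8-bit codes and the 0.8 acceptance threshold below are illustrative assumptions, not the authors' values:

```python
def hamming_distance(a: int, b: int, nbits: int) -> float:
    """Normalized Hamming distance between two nbits-wide binary codes."""
    return bin(a ^ b).count("1") / nbits

def fuse(scores, rule="max"):
    """Score-level fusion of per-instance match scores (MAX or MIN rule)."""
    return max(scores) if rule == "max" else min(scores)

# toy 8-bit enrolled templates for two finger instances vs. a query
enrolled = [0b10110010, 0b11100101]
query = [0b10100010, 0b01100101]
# convert each distance to a similarity score in [0, 1]
scores = [1.0 - hamming_distance(e, q, 8) for e, q in zip(enrolled, query)]
decision = fuse(scores, rule="max") > 0.8  # accept if the fused score clears the threshold
```

In a real system the threshold would be tuned on a development set to trade FAR against GAR.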
- Published
- 2019
- Full Text
- View/download PDF
3. A review of Convolutional-Neural-Network-based action recognition.
- Author
- Yao, Guangle, Lei, Tao, and Zhong, Jiandan
- Subjects
- NEURAL computers, DEEP learning, PATTERN perception, ROBOTICS, HUMAN-computer interaction
- Abstract
Highlights • We provide a review of Convolutional Neural Network based action recognition. • The review follows the thread of temporal information exploitation. • We discuss the performance of action recognition on recent large-scale benchmarks. • We indicate the limitations and future research of Convolutional Neural Network based action recognition. Abstract Video action recognition is widely applied in video indexing, intelligent surveillance, multimedia understanding, and other fields. Recently, it was greatly improved by incorporating the learning of deep information using Convolutional Neural Networks (CNN). This motivated us to review the notable CNN-based action recognition works. Because CNN is primarily designed to extract 2D spatial features from still images, while videos are naturally viewed as 3D spatiotemporal signals, the core issue in extending CNN from images to video is temporal information exploitation. We divide the solutions for exploiting temporal information into three strategies: 1) 3D CNN; 2) taking motion-related information as the CNN input; and 3) fusion. In this paper, we present a comprehensive review of CNN-based action recognition methods according to these strategies. We also discuss the action recognition performance on recent large-scale benchmarks, as well as the limitations and future research directions of CNN-based action recognition. This paper offers an objective and clear review of CNN-based action recognition and provides a guide for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2019
4. Perceiving the person and their interactions with the others for social robotics – A review.
- Author
- Tapus, Adriana, Bandera, Antonio, Vazquez-Martin, Ricardo, and Calderita, Luis V.
- Subjects
- SOCIAL robots, HUMAN-robot interaction, PATTERN perception, SPATIAL behavior, FACE perception
- Abstract
Highlights • This review summarizes techniques for understanding the activities of a person and/or of a group of people. • It includes recent developments in recognition approaches based on recurrent neural networks. • Future research directions for integrating these advances into robotics are indicated. Abstract Social robots need to understand human activities, dynamics, and the intentions behind behaviors. Most of the time, this implies modeling the whole scene. The activities and intentions of a person are inferred from the perception of the individual, but also from their interactions with the rest of the environment (i.e., objects and/or people). Centering on the social nature of the person, robots need to understand human social cues, which include verbal but also nonverbal behavioral signals such as actions, gestures, body postures, facial emotions, and proxemics. Correct understanding of these signals helps such robots anticipate the needs and expectations of people. It also avoids abrupt changes in the human–robot interaction, as the temporal dynamics of interactions are anchored and driven by a large repertoire of social landmarks. Within the general framework of interaction between robots and their human counterparts, this paper reviews recent approaches for recognizing human activities, and also for perceiving the social signals emanating from a person or a group of people during an interaction. The perception of visual and/or audio signals allows robots to correctly localize themselves with respect to the humans in the environment while navigating and/or interacting with a person or a group of people. [ABSTRACT FROM AUTHOR]
- Published
- 2019
5. VF3-Light: A lightweight subgraph isomorphism algorithm and its experimental evaluation.
- Author
- Carletti, Vincenzo, Foggia, Pasquale, Greco, Antonio, Vento, Mario, and Vigilante, Vincenzo
- Subjects
- ALGORITHMS, DENSE graphs, DATA structures, PATTERN perception, ISOMORPHISM (Mathematics)
- Abstract
• VF3-Light is a simplified version of the VF3 subgraph isomorphism algorithm. • The use of simpler heuristics makes VF3-Light more effective on some kinds of graphs. • An experimental evaluation shows good performance on several kinds of graphs. • VF3-Light performs well even with respect to other state-of-the-art algorithms. In this paper we introduce VF3-Light, a simplification of VF3, a recently introduced, general-purpose subgraph isomorphism algorithm. While VF3 has been shown to be very effective on several datasets, especially on very large and very dense graphs, we show that on some classes of graphs the full power of VF3 can be overkill; indeed, by removing some of its heuristics, and consequently some of the data structures they require, we obtain an algorithm (VF3-Light) that is actually faster. To characterize this modified algorithm, we performed an evaluation using several publicly available graph datasets. Besides comparing VF3-Light with VF3, we also included in the comparison other recent algorithms rated among the fastest in the state of the art. [ABSTRACT FROM AUTHOR]
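For readers unfamiliar with the problem VF3 and VF3-Light solve, a naive backtracking subgraph-isomorphism matcher (with none of their heuristics, and far slower) fits in a few lines; the dict-of-neighbour-sets graph encoding is purely illustrative:

```python
def subgraph_isomorphisms(pattern, target):
    """Yield every injective mapping pattern_node -> target_node that maps each
    pattern edge onto a target edge. Graphs are undirected, encoded as
    dicts: node -> set of neighbours."""
    p_nodes = list(pattern)

    def extend(mapping):
        if len(mapping) == len(p_nodes):
            yield dict(mapping)  # copy: the working mapping is mutated below
            return
        u = p_nodes[len(mapping)]
        used = set(mapping.values())
        for v in target:
            if v in used:
                continue
            # every already-mapped neighbour of u must map to a neighbour of v
            if all(mapping[w] in target[v] for w in pattern[u] if w in mapping):
                mapping[u] = v
                yield from extend(mapping)
                del mapping[u]  # backtrack

    yield from extend({})
```

VF3-style algorithms prune this exponential search with node ordering and feasibility heuristics; VF3-Light's point, per the abstract, is that on some graph classes fewer heuristics win.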
- Published
- 2019
6. Supervised discriminative manifold learning with subsidiary-view information for near infrared spectroscopic classification of crop seeds.
- Author
- Li, Jia, Ge, Wenzhang, Wei, Yaoguang, and An, Dong
- Subjects
- SEED crops, PATTERN perception, FEATURE extraction, CLASSIFICATION, INFORMATION commons
- Abstract
• A novel manifold learning method is proposed for classification. • Subsidiary-view information fusion is further added to the proposed method. • The proposed method is validated on NIRS classification of crop seeds. This paper introduces a novel manifold learning method for dimensionality reduction and feature extraction in near infrared spectroscopic classification. First, similar spectra from different categories, when reduced by traditional methods, are seriously mixed, which becomes an obstacle to building an effective model. We present a novel application of manifold learning to discovering the spectral manifold structure. Second, we propose a new supervised discriminative manifold learning method that expands category overlap for dimensionality reduction in classification and pattern recognition. The proposed method constructs a feature space in which samples of different categories in the overlapping region are pushed far away from each other, while samples of the same category in the non-overlapping region are kept close to each other. Accordingly, the low-dimensional feature space with an expanded overlap region is more conducive to classification. Moreover, a subsidiary-view version of the proposed method is introduced to further improve the classification accuracy. Finally, an authentic spectral dataset obtained from 200 maize seeds is used to build a haploid identification model that verifies the method. Experimental results show that the proposed method outperforms several relevant manifold learning methods and common dimensionality reduction methods for spectra. Furthermore, it is validated that subsidiary-view manifold learning is better than single-view learning under appropriate parameters for near infrared spectral dimensionality reduction and classification. [ABSTRACT FROM AUTHOR]
- Published
- 2019
7. Kernel modified optimal margin distribution machine for imbalanced data classification.
- Author
- Zhang, Xiaogang, Wang, Dingxiang, Zhou, Yicong, Chen, Hua, Cheng, Fanyong, and Liu, Min
- Subjects
- KERNEL functions, PATTERN perception, HYPERPLANES, DATA modeling, MACHINING
- Abstract
• The optimal margin distribution machine (ODM) fails to deal with imbalanced data. • The conventional conformal function of the kernel scaling method is not suitable for ODM. • A novel conformal function is designed to improve the kernel scaling method for ODM. • The kernel modified ODM (KMODM) can alleviate the skew of the separator caused by imbalanced data. • KMODM inherits the good generalization performance of ODM and obtains a balanced detection rate. Although the optimal margin distribution machine (ODM) has better generalization performance in pattern recognition than traditional classifiers, ODM, like traditional classifiers, often suffers from data imbalance. To address this, this paper proposes a kernel modified ODM (KMODM) to eliminate the side effects of imbalanced data. Following the mechanism of ODM, a novel conformal function is designed to scale the kernel matrix of ODM, which increases the separability of the training data in the feature space. In addition, to eliminate the skew of the separator toward the minority class, KMODM introduces two free parameters into the conformal function to balance the influence of different training data on the separating hyperplane. Experimental results on two-dimensional visualization data show that KMODM can alleviate the skew of the separating hyperplane caused by imbalanced data. For most of ten UCI data sets, KMODM broadens the margin of the minority class and achieves the highest average G-mean and F1 score. This means that KMODM has a more balanced detection rate and better generalization performance than other baseline methods, especially in the presence of heavily imbalanced training data. [ABSTRACT FROM AUTHOR]
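The G-mean and F1 metrics the abstract reports are standard imbalanced-classification measures and easy to state concretely; this is a generic sketch of the metrics only, not of KMODM itself:

```python
def confusion(y_true, y_pred, positive=1):
    """Binary confusion counts (tp, fp, fn, tn) for the given positive label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def g_mean_and_f1(y_true, y_pred, positive=1):
    """G-mean = sqrt(sensitivity * specificity); F1 = harmonic mean of
    precision and recall. G-mean rewards balanced per-class detection rates."""
    tp, fp, fn, tn = confusion(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = (sensitivity * specificity) ** 0.5
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return g_mean, f1
```

A classifier that ignores the minority class entirely can still score high accuracy, but its G-mean collapses to zero, which is why imbalanced-learning papers report it.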
- Published
- 2019
8. BitStream: An efficient framework for inference of binary neural networks on CPUs.
- Author
- Jiang, Yanshu, Zhao, Tianli, He, Xiangyu, Leng, Cong, and Cheng, Jian
- Subjects
- BINARY sequences, CACHE memory, NATURAL language processing, PATTERN recognition systems, PATTERN perception, POWER resources
- Abstract
• We propose a memory management strategy for Binary Neural Networks. • We propose a new computation flow for Binary Neural Networks. • We implement a framework for efficient inference of Binary Neural Networks. Convolutional Neural Networks (CNN) have been well studied and widely used in the field of pattern recognition. Many pattern recognition algorithms need features extracted from CNN models to adapt to complex tasks such as image classification, object detection, and natural language processing. However, to deal with increasingly complex tasks, modern CNN models are becoming larger and larger; they contain large numbers of parameters and operations, leading to high consumption of memory, computational, and power resources during inference. This makes it difficult to run CNN-based applications in real time on mobile devices, where memory, computational, and power resources are limited. Binarization of neural networks has been proposed to reduce the memory and computational complexity of CNN. However, traditional implementations of Binary Neural Networks (BNN) follow the conventional im2col-based convolution computation flow, which is widely used in floating-point networks but is not cache-friendly enough for binarized networks. In this paper, we propose BitStream, a general architecture for efficient inference of BNN on CPUs. In BitStream, we propose a simple but novel computation flow for BNN. Unlike existing implementations of BNN, in BitStream all the layers, including convolutional, binarization, and pooling layers, are calculated in binary precision. Comprehensive analyses demonstrate that our proposed computation flow consumes less memory during inference of BNN and is cache-friendly because of its continuous memory access. [ABSTRACT FROM AUTHOR]
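The binary-precision arithmetic that makes BNN inference cheap is commonly an XNOR followed by a popcount. A minimal generic sketch follows; the LSB-first bit packing is an illustrative assumption, not BitStream's actual memory layout:

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two length-n vectors over {-1, +1}, each packed into an
    integer (bit 1 encodes +1, bit 0 encodes -1), via XNOR + popcount."""
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # bit set where the signs agree
    matches = bin(xnor).count("1")
    return 2 * matches - n  # each agreement contributes +1, each disagreement -1

# example: a = (+1, +1, -1, +1), w = (+1, -1, -1, +1), packed LSB-first,
# whose dot product is 1 - 1 + 1 + 1 = 2
result = binary_dot(0b1011, 0b1001, 4)
```

On a CPU the same idea runs 64 multiply-accumulates per XOR/popcount instruction pair, which is where the speed and memory savings over im2col-based floating-point convolution come from.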
- Published
- 2019
9. Deep Sparse Representation Classifier for facial recognition and detection system.
- Author
- Cheng, Eric-Juwei, Chou, Kuang-Pen, Rajora, Shantanu, Jin, Bo-Hao, Tanveer, M., Lin, Chin-Teng, Young, Ku-Young, Lin, Wen-Chieh, and Prasad, Mukesh
- Subjects
- HUMAN facial recognition software, FEATURE extraction, PATTERN perception
- Abstract
• The proposed two-layer Convolutional Neural Network (CNN) is able to learn high-level features. • The feature maps extracted by the proposed CNN-based model are sparse and selective. • An averaging-model approach is adopted, training several different models on subsets of the dataset. • The proposed model generates a pool of features for training and selecting effective classifiers. • The discriminative power of the face recognition system is improved even with small datasets. This paper proposes a two-layer Convolutional Neural Network (CNN) to learn high-level features, which are utilized for face identification via sparse representation. Feature extraction plays a vital role in real-world pattern recognition and classification tasks. A detailed description of the given input face image significantly improves the performance of the facial recognition system. The Sparse Representation Classifier (SRC) is a popular face classifier that sparsely represents a face image by a subset of the training data and is known to be insensitive to the choice of feature space. The proposed method shows the performance improvement of SRC via a precisely selected feature extractor. The experimental results show that the proposed method outperforms other methods on the given datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2019
10. New set of generalized Legendre moment invariants for pattern recognition.
- Author
- Benouini, Rachid, Batioua, Imad, Zenkouar, Khalid, Mrabti, Fatiha, and El fadili, Hakim
- Subjects
- PATTERN perception, CARTESIAN coordinates, FEATURE extraction, OBJECT recognition (Computer vision)
- Abstract
• Introduce a new set of moment invariants for pattern recognition based on Legendre polynomials of fractional order. • Prove their geometric invariance to rotation, scale and translation transforms. • Present a systematic parameter selection method for finding the optimal fractional parameter values. • Propose an adaptive feature extraction method using the proposed moment invariants. • Provide numerical experiments to demonstrate their validity and superiority. In this paper, we present a new set of rotation, scale and translation invariants, named Generalized Legendre Moment Invariants (GLMI). This new set of invariants is defined in the Cartesian coordinate system, where the GLMI can be derived from the algebraic relation between the fractional-order Legendre polynomials and the geometric basis. Several experiments are carried out to evaluate the performance of the proposed GLMI with regard to invariability, object recognition capability and computational efficiency, in comparison with the most representative families of moment invariants. In addition, we present a systematic parameter selection method for finding the optimal fractional parameter values for pattern recognition applications. Just as important, we introduce an adaptive scheme that sets the fractional parameters according to the characteristics of the image. The obtained results clearly show that the proposed invariants provide higher feature accuracy and discrimination power even in the presence of noise. [ABSTRACT FROM AUTHOR]
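The (integer-order) Legendre polynomials underlying such moments are cheap to evaluate with Bonnet's recurrence. This is a generic sketch of that recurrence only; the fractional-order generalization the paper builds on is not reproduced here:

```python
def legendre(n: int, x: float) -> float:
    """Evaluate the Legendre polynomial P_n(x) via Bonnet's recurrence:
    k * P_k(x) = (2k - 1) * x * P_{k-1}(x) - (k - 1) * P_{k-2}(x)."""
    p_prev, p_curr = 1.0, x  # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(2, n + 1):
        p_prev, p_curr = p_curr, ((2 * k - 1) * x * p_curr - (k - 1) * p_prev) / k
    return p_curr

# sanity check against the closed form P_2(x) = (3x^2 - 1) / 2
value = legendre(2, 0.5)  # -0.125
```

A moment of order (n, m) is then a weighted sum of P_n(x) * P_m(y) over the image pixels, with coordinates mapped into [-1, 1], which is what makes the basis orthogonal.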
- Published
- 2019
11. How to utilize syllable distribution patterns as the input of LSTM for Korean morphological analysis.
- Author
- Kim, Hyemin, Yang, Seon, and Ko, Youngjoong
- Subjects
- PATTERN perception, DEEP learning, SHORT-term memory, MORPHEMICS, TASK performance
- Abstract
Abstract This paper proposes the use of syllable distribution patterns as deep learning inputs for morphological analysis. The proposed syllable distribution pattern comprises two parts: a distributed syllable embedding vector and a morpheme syllable-level distribution pattern. As a learning method, we utilize bidirectional long short-term memory with a conditional random field layer (Bi-LSTM-CRF) for Korean part-of-speech tagging tasks. After syllable-level outputs are generated by Bi-LSTM-CRF, a morpheme restoration process is performed utilizing pre-analyzed dictionaries that were automatically created from a training corpus. Experimental results reveal outstanding performance for the proposed method with an F1-score of 98.65%. [ABSTRACT FROM AUTHOR]
- Published
- 2019
12. Depth matrix and adaptive Bayes classifier based dynamic hand gesture recognition.
- Author
- Kane, Lalit and Khanna, Pritee
- Subjects
- HAND physiology, GESTURE, PATTERN perception, WINDOWS (Graphical user interfaces), HUMAN-computer interaction
- Abstract
Abstract A sequence of apparently ad-hoc hand postures can generate meaningful dynamic gestures, which can be utilized in interface controls for computers, televisions, or games. In order to develop deployable systems based on these gestures, the selected descriptors should be fast enough to meet live recognition requirements. This paper proposes a framework for a practical system capable of recognizing continuous dynamic gestures characterized by short-duration posture sequences. A depth-based modification of the shape matrix is devised to describe hand silhouettes, giving a faster alternative to region-based descriptors. Postures are recognized using the depth matrix and a 1-nearest-neighbor strategy. Posture sequence labels are predicted by a dynamic naive Bayes classifier working in association with an adaptive windowing mechanism. The conducted experiments report up to 96.2% accuracy, with a mean accuracy of 95.2%, on a dynamic gesture dataset. Depth matrix computation takes at most 2 ms. [ABSTRACT FROM AUTHOR]
- Published
- 2019
13. Learning skeleton representations for human action recognition.
- Author
- Saggese, Alessia, Strisciuglio, Nicola, Vento, Mario, and Petkov, Nicolai
- Subjects
- LEARNING, SKELETON, HUMAN behavior, COMPUTER vision, PATTERN perception
- Abstract
Highlights • We propose trainable feature extractors for the representation of skeleton poses. • We employ the proposed feature extractors for classification of human actions. • We designed a human action classification method based on string kernels. • We carried out experiments on the MHAD, MSRDA and MIVIA-S action datasets. • We publicly released the MIVIA-S dataset for research purposes. Abstract Automatic interpretation of human actions has gained strong interest among researchers in pattern recognition and computer vision because of its wide range of applications, such as in social and home robotics, elderly health care, and surveillance, among others. In this paper, we propose a method for recognition of human actions by analysis of skeleton poses. The method is based on novel trainable feature extractors, which can learn the representation of prototype skeleton examples and can be employed to recognize skeleton poses of interest. We combine the proposed feature extractors with an approach for classification of pose sequences based on string kernels. We carried out experiments on three benchmark data sets (MIVIA-S, MSRDA and MHAD) and the results that we achieved are comparable to or higher than those obtained by other existing methods. A further important contribution of this work is the MIVIA-S dataset, which we collected and made publicly available. [ABSTRACT FROM AUTHOR]
- Published
- 2019
14. Effective vector representation for the Korean named-entity recognition.
- Author
- Kwon, Sunjae, Ko, Youngjoong, and Seo, Jungyun
- Subjects
- PATTERN perception, DATA mining, NEURAL circuitry, ROBUST statistics, LINGUISTICS
- Abstract
Highlights • Propose a new named-entity recognition method based on syllable bigram units. • Jointly train positional information into the syllable bigram vector representation. • Compared with existing methods, both speed and performance are improved. Abstract Named-entity recognition, part of information extraction, is the task of finding the positions of proper names in a sentence and assigning them to the correct category. Existing studies approach Korean named-entity recognition with morphological-level methods, which perform named-entity recognition using the results of morphological analysis as input. While this approach has the advantage of using various linguistic clues, it suffers from the error propagation problem of morphological analysis. In this paper, we propose an effective method for Korean syllable-level named-entity recognition that solves the above problem. First, we suggest an approach that uses a syllable bigram vector representation for Korean syllable-level named-entity recognition. Second, influenced by the linguistic characteristics of Korean, we suggest a novel way to build a joint vector representation of the syllable bigram and the positional information of the Korean eojeol. In the experiments, we evaluated our methods on two Korean named-entity recognition corpora, using Bi-directional LSTM-CRFs as the sequence labeler. Experimental results verify that our methods significantly improve the performance of syllable-level named-entity recognition and reach performance similar to existing morphological-level named-entity recognition. In addition, further experiments show that our syllable-level named-entity recognition is not only more robust but also faster than traditional morphological-level named-entity recognition, as it eliminates the morphological analysis step. [ABSTRACT FROM AUTHOR]
- Published
- 2019
15. l2,1-norm minimization based negative label relaxation linear regression for feature selection.
- Author
- Peng, Yali, Sehdev, Paramjit, Liu, Shigang, Li, Jun, and Wang, Xili
- Subjects
- REGRESSION analysis, FEATURE selection, T-matrix, PATTERN perception, PATTERN recognition systems
- Abstract
Highlights • An l2,1-norm minimization based negative label relaxation linear regression for feature selection is presented. • The technique called negative label relaxation is integrated into the LR model. • An l2,1-norm regularization constraint is imposed on the transformation matrix to achieve row sparsity. • We devised an efficient algorithm to solve the objective function. • Our feature selection method is more efficient due to jointly relaxing class labels and applying the l2,1-norm. Abstract Feature selection (FS) is an important issue in the field of pattern recognition and machine learning. Linear regression (LR) and its variants have been widely used for classification problems. In this paper, we propose a novel feature selection method, i.e., l2,1-norm minimization based negative label relaxation linear regression for feature selection (NLRL21-FS). The core idea of our method is that a technique called negative label relaxation is integrated into the LR model for classification, and an l2,1-norm regularization constraint is imposed on the transformation matrix of the modified LR model to achieve row sparsity. Thus we incorporate feature selection into the training process of the classifier. An efficient algorithm is devised to solve the objective function, and a convergence analysis of the proposed algorithm is presented. Our feature selection method is more efficient due to jointly relaxing class labels and applying l2,1-norm regularization. Three groups of experiments on 6 benchmark data sets of different types demonstrate the effectiveness of the proposed feature selection method. [ABSTRACT FROM AUTHOR]
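The l2,1-norm driving the row sparsity above is simply the sum of row-wise Euclidean norms of the transformation matrix; once rows are driven toward zero, features are ranked by row norm. This sketch shows only that norm and the resulting selection step, not the NLRL21-FS optimization itself:

```python
def l21_norm(W):
    """l2,1-norm of a matrix (list of rows): sum of the Euclidean norms of
    the rows. Penalizing it pushes whole rows toward zero (row sparsity)."""
    return sum(sum(x * x for x in row) ** 0.5 for row in W)

def top_features(W, k):
    """Rank features (rows of the learned transformation matrix) by row norm
    and keep the k strongest; zeroed rows correspond to discarded features."""
    norms = [(sum(x * x for x in row) ** 0.5, i) for i, row in enumerate(W)]
    return [i for _, i in sorted(norms, reverse=True)[:k]]
```

Unlike an entrywise l1 penalty, the l2,1 penalty zeroes a feature's entire row at once, which is what makes it a feature *selection* (rather than weight sparsification) mechanism.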
- Published
- 2018
16. Hexagonal Grid based triangulated feature descriptor for shape retrieval.
- Author
- P, Govindaraj and MS, Sudhakar
- Subjects
- IMAGE retrieval, FEATURE extraction, TESSELLATIONS (Mathematics), PATTERN perception, COMPUTER vision
- Abstract
Highlights • Hexagonal grid based triangular tessellation for acute feature characterisation. • Geometrical congruence achieved by employing triangle tessellation. • Local-global approach for realizing acute and simple shape histograms. • Superior BER scores over its predecessors. • Complexity analysis reveals the simplicity of the proposed approach. Abstract Shape characterization schemes catering to object recognition and retrieval have been undergoing intense study in computer vision. As shape encompasses much of the image information, yielding higher retrieval performance with less complexity continues to be a challenging issue. Accordingly, a simple approach blending hexagonal grid modelling with triangular tessellation for shape matching and retrieval is proposed in this paper. At the onset, a shape image is decomposed into overlapping hexagonal grids that are then divided into six non-overlapping equilateral triangles. Next, the intensity differences along each triangle side are evaluated and the maximum of the three sides is retained. This process is repeated on the remaining triangles, producing six maximum values corresponding to the six triangles. These six values then replace the six corners of the hexagonal subregion to finally produce a feature map that is uniquely packed into feature histograms representing the given shape. Quantitative and qualitative examinations on the MPEG-7, TARI-1000 and Kimia's 99 datasets reveal a consistent BER greater than 90%. The superior performance over its competitors can be attributed to the congruent nature of the hexagonal and triangular tessellation. [ABSTRACT FROM AUTHOR]
- Published
- 2018
17. Data augmentation and directional feature maps extraction for in-air handwritten Chinese character recognition based on convolutional neural network.
- Author
- Qu, Xiwen, Wang, Weiqiang, Lu, Ke, and Zhou, Jianshe
- Subjects
- ARTIFICIAL neural networks, PATTERN recognition systems, CHINESE characters, RECOGNITION (Philosophy), PATTERN perception
- Abstract
Recently, convolutional neural networks (CNN) have demonstrated remarkable performance in various classification problems. In this paper, we introduce CNN into in-air handwritten Chinese character recognition (IAHCCR) and propose new directional feature maps, named bend directional feature maps. We then integrate a combination of various types of directional feature maps with the CNN and obtain better recognition performance compared with other methods reported for IAHCCR. To further improve the recognition rate, we propose a new data augmentation method dedicated to in-air handwritten Chinese characters. The proposed data augmentation method combines global transformation with local distortion and effectively enlarges the training dataset. Experimental results demonstrate that our proposed methods can greatly improve the recognition rate for IAHCCR. [ABSTRACT FROM AUTHOR]
- Published
- 2018
18. Symmetry detection based on multiscale pairwise texture boundary segment interactions.
- Author
- Mignotte, Max
- Subjects
- DATABASES, ALGORITHMS, SYMMETRY, OPTICAL images, PATTERN perception
- Abstract
In this paper, we propose a new, simple, unsupervised approach to local symmetry detection of ribbon-like structures in natural images. The proposed model quantifies the presence of a partial medial-axis segment, existing between each pair of (preliminarily detected) line segments delineating the boundary of two textured regions, by a set of heuristics related both to the geometrical structure of each pair of line segments and to its ability to locally delimit a homogeneous texture region in the image. This semi-local approach is finally embedded in a two-step algorithm: an amplification step, via a Hough-style voting approach performed at different scales and in different coordinate spaces, which aims to determine the dominant local symmetries present in the image, and a final denoising step, via an averaging procedure, which aims to remove noise and spurious local symmetries. The experiments reported in this paper, conducted on the recent extension of the Berkeley Segmentation Dataset for the local symmetry detection task, demonstrate that the proposed symmetry detector performs well compared to the best state-of-the-art algorithms recently proposed in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2016
19. Recovering and matching minutiae patterns from finger knuckle images.
- Author
- Kumar, Ajay and Wang, Bichai
- Subjects
- PATTERN perception, BIOMETRIC identification, IMAGE analysis, LAW enforcement, IMAGE quality analysis
- Abstract
Personal identification approaches using finger knuckle patterns are receiving increasing attention in the biometrics literature. Several approaches have been explored; these methods consider knuckle patterns as texture-like patterns, similar to the iris, and illustrate promising results. However, much of the law-enforcement and forensic analysis for hand biometrics still relies on the recovery and matching of minutiae patterns, which has matured over the last several decades. This is largely because the identification of minutiae patterns is believed to be more scientific and pervasive in connecting with the anatomy and uniqueness of individuals. The availability of high-resolution finger dorsal images acquired through recreational or covert imaging can provide important cues for forensic investigation and analysis, especially when finger dorsal patterns are the only piece of evidence available for identification. Identification of finger knuckle patterns using minutiae recovery and matching is expected to significantly help in the prosecution of such suspects. This paper therefore investigates the recovery and matching of minutiae patterns from finger knuckle images. We investigate the effective use of minutiae quality in improving performance for knuckle pattern matching. Our study also presents a comparative evaluation of performance using three popular minutiae matching approaches. The experimental results, obtained from a database of 120 different subjects, are highly encouraging and validate this first attempt to study minutiae recovery and matching from finger knuckle images. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
20. On the integration of crowd knowledge in pattern recognition.
- Author
-
Zhang, Richong and Mao, Yongyi
- Subjects
- *
CROWDSOURCING , *PATTERN perception , *IMAGE recognition (Computer vision) , *THEORY of knowledge , *MONTE Carlo method , *ERROR probability - Abstract
This paper is concerned with the fundamentals of integrating crowd knowledge such as ratings, opinions or tags provided by internet users. As a concrete example, we consider the problem of image recognition based on user-provided tags. Each user is assumed to have certain knowledge about the images, which can be incomplete or only of partial relevance to the recognition task. Each user is also assumed to have his or her own choice of tag vocabulary, possibly different from the set of prescribed labels for image recognition. We argue that a user’s knowledge can be separated into the structure of the knowledge and the representation of that structure (namely, the tag vocabulary). This perspective advocates a systematic three-step methodology for crowd knowledge integration in such applications, whereby the problem of interest is decoupled into three sub-problems in tandem: knowledge structure aggregation, vocabulary interpretation, and label assignment. We derive a lower bound on the achievable error probability. Using this bound and via Monte-Carlo simulations, we investigate the performance of a knowledge integration system in relation to various parameter settings. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
21. Lattice-Support repetitive local feature detection for visual search.
- Author
-
Manandhar, Dipu, Yap, Kim-Hui, Miao, Zhenwei, and Chau, Lap-Pui
- Subjects
- *
REPETITIVE patterns (Decorative arts) , *IMAGE retrieval , *PATTERN perception , *HOUGH transforms , *ESTIMATION theory , *GEOMETRIC analysis - Abstract
Repetitive patterns such as building facades, floor tiles, vegetation, and wallpapers are commonly found in sceneries and images. The presence of such repetitive patterns in images often leads to visual burstiness and geometric ambiguity, which poses a challenge for state-of-the-art visual search technologies. To alleviate these problems, we propose a new lattice-support repetitive local feature detection method to detect repetitive patterns, estimate the underlying lattice structure, and enhance descriptors used for subsequent visual image search. Existing methods for repetitive pattern detection are commonly based on determining the underlying lattice structures. However, these structures do not correspond directly to robust features that are scale- and rotation-invariant. This paper proposes a new lattice-support repetitive local feature (LS-RLF) detection method that aims to integrate lattice information into repeated local feature detection and extraction. The advantage of the proposed method is that the detected features can be directly used by current visual search technologies. The LS-RLF method estimates the undetected repeated features in the lattice structure using Hough transform-based feature estimation. Further, to handle the visual burstiness issue, a new LS-RLF-based image retrieval framework is developed. Experiments performed on benchmark datasets show that the proposed method outperforms the state-of-the-art methods by mean Average Precision (mAP) margins of 4.5%, 5.5%, and 3.2% on the Oxford, Paris, and INRIA Holidays datasets, respectively. This demonstrates the effectiveness of the proposed method in performing visual search for images that contain a wide range of repeated patterns. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
22. On-line anomaly detection and resilience in classifier ensembles.
- Author
-
Sagha, Hesam, Bayati, Hamidreza, Millán, José del R., and Chavarriaga, Ricardo
- Subjects
- *
ANOMALY detection (Computer security) , *STATISTICAL ensembles , *PATTERN perception , *INFORMATION theory , *MATHEMATICAL models , *SUPPORT vector machines , *PARAMETER estimation - Abstract
Abstract: Detection of anomalies is a broad field of study, applied in areas such as data monitoring, navigation, and pattern recognition. In this paper we propose two measures to detect anomalous behaviors in an ensemble of classifiers by monitoring their decisions: one based on Mahalanobis distance and another based on information theory. These approaches are useful when an ensemble of classifiers is used and a decision is made by ordinary classifier fusion methods, while each classifier is devoted to monitoring part of the environment. Upon detection of anomalous classifiers we propose a strategy that attempts to minimize the adverse effects of faulty classifiers by excluding them from the ensemble. We applied this method to an artificial dataset and to sensor-based human activity datasets, with different sensor configurations and two types of noise (additive and rotational on inertial sensors). We compared our method with two other well-known approaches, generalized likelihood ratio (GLR) and One-Class Support Vector Machine (OCSVM), which detect anomalies at the data/feature level, and found that our method is comparable with both. Its advantage over them is that it avoids monitoring raw data or features and only takes into account the decisions made by the classifiers; it is therefore independent of sensor modality and the nature of the anomaly. On the other hand, we found that OCSVM is very sensitive to the chosen parameters and may react differently to different types of anomaly. In this paper we discuss the application domains which benefit from our method. [Copyright Elsevier]
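The Mahalanobis-based measure described above monitors classifier decisions rather than raw data. A minimal sketch of that idea, with the function name and toy decision statistics invented here rather than taken from the authors' implementation:

```python
import numpy as np

def mahalanobis_score(decision, mean, cov):
    """Mahalanobis distance of one classifier's decision vector from the
    statistics of its past decisions; large values flag anomalous behavior."""
    diff = decision - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Toy history of 2-D decision vectors (e.g. class posteriors) for one classifier.
rng = np.random.default_rng(0)
history = rng.normal(loc=[0.7, 0.3], scale=0.05, size=(200, 2))
mean, cov = history.mean(axis=0), np.cov(history, rowvar=False)

normal = np.array([0.72, 0.31])     # consistent with past decisions
anomalous = np.array([0.10, 0.95])  # e.g. after a sensor failure
```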
- Published
- 2013
- Full Text
- View/download PDF
23. When is the area under the receiver operating characteristic curve an appropriate measure of classifier performance?
- Author
-
Hand, D.J. and Anagnostopoulos, C.
- Subjects
- *
RECEIVER operating characteristic curves , *DISCRIMINANT analysis , *CLASSIFIERS (Linguistics) , *DATA analysis , *PATTERN perception , *CLASSIFICATION - Abstract
Abstract: The area under the receiver operating characteristic curve is a widely used measure of the performance of classification rules. This paper shows that when classifications are based solely on data describing individual objects to be classified, the area under the receiver operating characteristic curve is an incoherent measure of performance, in the sense that the measure itself depends on the classifier being measured. It significantly extends earlier work by showing that this incoherence is not a consequence of a cost-based interpretation of misclassifications, but is a fundamental property of the area under the curve itself. The paper also shows that if additional information, such as the class assignments of other objects, is taken into account when making a classification, then the area under the curve is a coherent measure, although in those circumstances it makes an assumption which is seldom if ever appropriate. [Copyright Elsevier]
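For readers unfamiliar with the measure under discussion: the empirical AUC equals the probability that a randomly chosen positive object scores above a randomly chosen negative one, with ties counting half. A minimal sketch, with hypothetical score lists:

```python
def auc(scores_pos, scores_neg):
    """Empirical AUC as the Mann-Whitney statistic: fraction of
    positive/negative pairs ranked correctly (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))
```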
- Published
- 2013
- Full Text
- View/download PDF
24. Double distribution support vector machine.
- Author
-
Cheng, Fanyong, Zhang, Jing, Li, Zuoyong, and Tang, Mingzhu
- Subjects
- *
SUPPORT vector machines , *PATTERN perception , *GENERALIZATION , *MARGINAL distributions , *PERFORMANCE - Abstract
This paper studies the role of the sample mean in binary classification based on margin theory. The Support Vector Machine (SVM), which maximizes the minimum margin, is widely used in pattern recognition, but it sometimes induces a weak margin distribution, which harms generalization performance. The Double Distribution Support Vector Machine (DDSVM) is therefore proposed to obtain strong generalization performance by maximizing both the margin distribution of the two classes' sample means and the minimum margin. The sample mean is usually a good description of the samples, and DDSVM can improve the margin distribution and thereby the generalization performance. DDSVM is a general learning approach, and its superiority is verified both theoretically and experimentally. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
25. Modeling temporal structure of complex actions using Bag-of-Sequencelets.
- Author
-
Jung, Hyun-Joo and Hong, Ki-Sang
- Subjects
- *
PATTERN perception , *SOFTWARE frameworks , *BIG data , *OLYMPIC Games , *CHRONOLOGY - Abstract
This paper proposes a new framework for modeling the temporal structure of complex human actions. Inspired by the fact that a complex action is a temporally ordered composition of sub-actions, we develop a new model named Bag-of-Sequencelets (BoS). To construct a BoS model, a video is represented as a sequence of Primitive Actions (PAs). A PA is a representative motion pattern that constitutes actions and is learned in an unsupervised manner. Representing a video as a sequence of PAs preserves their temporal order. A sequencelet is an informative sub-sequence that describes the partial structure of actions while preserving temporal relations among PAs. In a BoS model, an action is modeled as an ensemble of sequencelets. We can use sequential pattern mining to automatically learn sequencelets without any annotation or prior knowledge of action structure. Because the BoS model has both compositional and chronological properties, it can effectively model the structures of complex actions despite intra-class variations such as viewpoint change. Experimental results show the effectiveness of the BoS model in temporal structure modeling. Applied to the Olympic Sports and UCF YouTube datasets, BoS achieves greater classification accuracy than state-of-the-art methods. [ABSTRACT FROM AUTHOR]
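The ordered (not necessarily contiguous) subsequence relation that sequencelets rely on can be sketched in a few lines; the PA names below are invented purely for illustration:

```python
def contains_subsequence(seq, pattern):
    """True if `pattern` occurs in `seq` as an ordered, not necessarily
    contiguous, subsequence -- the temporal-order property of sequencelets."""
    it = iter(seq)
    # `p in it` advances the iterator, so matches must occur in order
    return all(p in it for p in pattern)

# A video as a sequence of primitive actions (hypothetical labels).
video = ["squat", "lift", "stand", "dip", "push", "stand"]
```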
- Published
- 2017
- Full Text
- View/download PDF
26. An optimized palmprint recognition approach based on image sharpness.
- Author
-
Zhang, Kunai, Huang, Da, and Zhang, David
- Subjects
- *
PALMPRINT recognition , *PATTERN perception , *BIOMETRIC identification , *IMAGE analysis , *SECURITY management - Abstract
Biometric identification is an essential field in biometric security, and the preprocessing of a palmprint image is essential to recognition performance. Most researchers use clear palmprint images for recognition and assume that the higher the image sharpness, the better the performance. However, we found that palmprint recognition performance can be improved by using low-sharpness images, as long as the sharpness lies within a range we call the optimal range. In this paper, a method for evaluating palmprint image sharpness is introduced and an approach for tuning the image sharpness into the optimal range is proposed. When all images are tuned to this optimal range, palmprint recognition performance can be significantly improved. Experiments were conducted on the PolyU Palmprint Database and the IIT Delhi database using CompCode and POC to validate the proposed approach and find the optimal range. The experimental results show that the proposed approach can improve palmprint recognition performance by 15%. [ABSTRACT FROM AUTHOR]
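The abstract does not specify the paper's sharpness measure; as an assumed stand-in, the variance of a discrete Laplacian response is a common sharpness proxy, and the sketch below illustrates how blurring lowers such a score (all names here are hypothetical):

```python
import numpy as np

def sharpness(img):
    """Variance of a 5-point discrete Laplacian response -- a common
    sharpness proxy, not necessarily the measure used in the paper."""
    lap = (-4 * img[1:-1, 1:-1] + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return float(lap.var())

rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
# 2x2 box filter as a crude blur: should reduce the sharpness score
blurred = (sharp[:-1, :-1] + sharp[1:, :-1] + sharp[:-1, 1:] + sharp[1:, 1:]) / 4
```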
- Published
- 2017
- Full Text
- View/download PDF
27. Bayes covariant multi-class classification.
- Author
-
Šuch, Ondrej and Barreda, Santiago
- Subjects
- *
COVARIANT field theories , *PAIRED comparisons (Mathematics) , *ANALYSIS of covariance , *PATTERN perception , *FEEDBACK control systems - Abstract
We consider multi-class classification models built from complete sets of pairwise binary classifiers. The Bradley–Terry model is often used to estimate posterior distributions in this setting. We introduce the notion of Bayes covariance, which holds if the multi-class classifier respects the multiplicative group action on class priors. As a consequence, a Bayes covariant method yields the same result whether new priors are applied before or after combination of the individual classifiers, which has several practical advantages for systems with feedback. In the paper, we construct a Bayes covariant combining method and compare it with previously published methods, both in Monte Carlo simulations and on a practical speech frame recognition task. [ABSTRACT FROM AUTHOR]
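The Bradley–Terry estimation step mentioned above can be sketched with the standard minorization–maximization fixed point; the function and toy probabilities below are illustrative, not the authors' combining method:

```python
def bradley_terry(pairwise, n, iters=200):
    """Recover normalized class scores s from pairwise probabilities
    pairwise[i][j] ~ s[i] / (s[i] + s[j]) via the standard MM iteration."""
    s = [1.0 / n] * n
    for _ in range(iters):
        # expected "wins" of class i against all others
        wins = [sum(pairwise[i][j] for j in range(n) if j != i) for i in range(n)]
        s = [wins[i] / sum(1.0 / (s[i] + s[j]) for j in range(n) if j != i)
             for i in range(n)]
        total = sum(s)
        s = [v / total for v in s]  # renormalize each round
    return s

# Toy pairwise probabilities generated from true scores (0.5, 0.3, 0.2).
s_true = [0.5, 0.3, 0.2]
pairwise = [[0.0 if i == j else s_true[i] / (s_true[i] + s_true[j])
             for j in range(3)] for i in range(3)]
est = bradley_terry(pairwise, 3)
```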
- Published
- 2016
- Full Text
- View/download PDF
28. Semantic parts based top-down pyramid for action recognition.
- Author
-
Zhao, Zhichen, Ma, Huimin, and Chen, Xiaozhi
- Subjects
- *
SEMANTIC computing , *PYRAMIDS (Geometry) , *PATTERN perception , *DETECTORS , *SPATIAL arrangement - Abstract
We focus on the problem of recognizing actions in still images, and this paper provides an approach that arranges features of different semantic parts in spatial order. Our approach includes three components: (1) a semantic learning algorithm that collects a set of part detectors, (2) an efficient detection method that divides multiple images by the same grid and evaluates them in parallel, and (3) a top-down spatial arrangement that increases the inter-class variance. The proposed semantic parts learning algorithm captures both interactive objects and discriminative poses. Our spatial arrangement can be seen as a kind of adaptive pyramid, which highlights the spatial distribution of body parts in different actions and provides more discriminative representations. Experimental results show that our approach significantly outperforms the state-of-the-art on two challenging benchmarks: (1) PASCAL VOC 2012 and (2) Stanford-40 (by 2.6% mAP and 5.2% mAP, respectively). [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
29. Efficient and effective algorithms for training single-hidden-layer neural networks
- Author
-
Yu, Dong and Deng, Li
- Subjects
- *
COMPUTER algorithms , *ARTIFICIAL neural networks , *MACHINE learning , *GAMMA ray telescopes , *COMPARATIVE studies , *PATTERN perception - Abstract
Abstract: Recently there has been renewed interest in single-hidden-layer neural networks (SHLNNs), owing to their powerful modeling ability and the existence of some efficient learning algorithms. A prominent example of such algorithms is the extreme learning machine (ELM), which assigns random values to the lower-layer weights. While ELM can be trained efficiently, it requires many more hidden units than are typically needed by conventional neural networks to achieve matched classification accuracy. The use of a large number of hidden units translates to significantly increased test time, which in practice is more valuable than training time. In this paper, we propose a series of new efficient learning algorithms for SHLNNs. Our algorithms exploit both the structure of SHLNNs and the gradient information over all training epochs, and update the weights in the direction along which the overall squared error is reduced the most. Experiments on the MNIST handwritten digit recognition task and the MAGIC gamma telescope dataset show that the proposed algorithms obtain significantly better classification accuracy than ELM when the same number of hidden units is used. To obtain the same classification accuracy, our best algorithm requires only 1/16 of the model size, and thus approximately 1/16 of the test time, compared with ELM. This large advantage is gained at the expense of at most 5 times the training cost incurred by ELM training. [Copyright Elsevier]
- Published
- 2012
- Full Text
- View/download PDF
30. Estimating redundancy information of selected features in multi-dimensional pattern classification
- Author
-
Jung, Chi-Sang, Seo, Hyunson, and Kang, Hong-Goo
- Subjects
- *
CLASSIFICATION , *PATTERN perception , *ALGORITHMS , *DISCRIMINATION learning , *CONDITIONED response , *EXPERIMENTAL design - Abstract
Abstract: This paper proposes a novel criterion for estimating the redundancy information of selected feature sets in multi-dimensional pattern classification. An appropriate feature selection process typically maximizes the relevancy of features to each class and minimizes the redundancy between selected features. Unlike the relevancy information, which can be measured by mutual information, the redundancy information is difficult to estimate because its dynamic range varies with the characteristics of the features and classes. By utilizing a conceptual diagram of the relationship between candidate features, selected features, and class variables, this paper proposes a new criterion to accurately compute the amount of redundancy. Specifically, the redundancy term is estimated by the conditional mutual information between selected and candidate features given each class variable, which does not require the cumbersome normalization process that conventional algorithms do. The proposed algorithm is implemented in a speech/music discrimination system to evaluate classification performance. Experimental results obtained by varying the number of selected features verify that the proposed method achieves higher classification accuracy than conventional algorithms. [ABSTRACT FROM AUTHOR]
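The conditional mutual information underlying the redundancy term can be estimated for discrete features with a simple plug-in estimator; the function and toy data below are illustrative sketches, not the paper's system:

```python
from collections import Counter
from math import log2

def cond_mutual_info(xs, ys, zs):
    """Plug-in estimate of I(X;Y|Z) in bits for discrete sequences."""
    n = len(xs)
    pxyz = Counter(zip(xs, ys, zs))
    pxz = Counter(zip(xs, zs))
    pyz = Counter(zip(ys, zs))
    pz = Counter(zs)
    mi = 0.0
    for (x, y, z), c in pxyz.items():
        # p(x,y,z) * log[ p(x,y|z) / (p(x|z) p(y|z)) ], counts cancel the 1/n factors
        mi += (c / n) * log2(c * pz[z] / (pxz[(x, z)] * pyz[(y, z)]))
    return mi

# X and Y are identical binary features, so they share 1 bit given a constant Z.
x = [0, 1] * 50
y = x[:]
z = [0] * 100
```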
- Published
- 2011
- Full Text
- View/download PDF
31. A 2-point algorithm for 3D reconstruction of horizontal lines from a single omni-directional image
- Author
-
Chen, Wang, Cheng, Irene, Xiong, Zihui, Basu, Anup, and Zhang, Maojun
- Subjects
- *
THREE-dimensional imaging , *ALGORITHMS , *IMAGE reconstruction , *FEATURE extraction , *PATTERN perception , *CATADIOPTRIC systems - Abstract
Abstract: Reconstruction of 3D scenes with abundant straight-line features has many applications in computer vision and robot navigation. Most approaches to this problem involve stereo techniques, which require a solution to the correspondence problem between at least two different images. In contrast, this paper studies 3D reconstruction of straight horizontal lines from a single 2D omni-directional image. The authors show that, for symmetric non-central catadioptric systems, a 3D horizontal line can be estimated using only two points extracted from a single image of the line: one is the symmetry point of the image curve of the horizontal line, and the other is a generic point on the image curve. This paper improves on several prior works, including horizontal line detection in omni-directional images and line reconstruction from four viewing rays, while being simpler and more robust than those methods. We evaluate how the precision of feature point extraction affects line reconstruction accuracy, and discuss preliminary experimental results. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
32. Efficient computation of new extinction values from extended component tree
- Author
-
Silva, Alexandre Gonçalves and Lotufo, Roberto de Alencar
- Subjects
- *
TREE graphs , *GEOMETRIC surfaces , *MATHEMATICAL decomposition , *COMPUTER vision , *PATTERN perception , *QUASILINEARIZATION - Abstract
Abstract: A gray-scale image can be interpreted as a topographical surface, and represented by a component tree, based on the inclusion relation of connected components obtained by threshold decomposition. Relations between plateaus, valleys or mountains of this relief are useful in computer vision systems. An important definition to characterize the topographical surface is the dynamics, introduced by , associated with each regional minimum. This concept has been extended, by , by the definition of extinction values associated with each extremum of the image. This paper proposes three new extinction values – two based on the topology of the component tree: (i) number of descendants and (ii) sub-tree height; and one geometric: (iii) level component bounding box (subdivided into extinctions of height, width or diagonal). This paper describes an efficient computation of these extinction values based on the incremental determination of attributes from the component tree construction in quasi-linear time, compares the computation time of the method and illustrates the usefulness of these new extinction values from real examples. [Copyright Elsevier]
- Published
- 2011
- Full Text
- View/download PDF
33. Joint discriminative–generative modelling based on statistical tests for classification
- Author
-
Xue, Jing-Hao and Titterington, D. Michael
- Subjects
- *
DISCRIMINANT analysis , *STATISTICAL hypothesis testing , *CLASSIFICATION , *PATTERN perception , *LINEAR statistical models , *LOGISTIC regression analysis - Abstract
Abstract: In statistical pattern classification, generative approaches, such as linear discriminant analysis (LDA), assume a data-generating process (DGP), whereas discriminative approaches, such as linear logistic regression (LLR), do not model the DGP. In general, a generative classifier performs better than its discriminative counterpart if the DGP is well-specified and worse than the latter if the DGP is clearly mis-specified. In view of this, this paper presents a joint discriminative–generative modelling (JoDiG) approach, by partitioning predictor variables X into two sub-vectors, namely , to which a generative approach is applied, and , to be treated by a discriminative approach. This partitioning of X is based on statistical tests of the assumed DGP: the variables that clearly fail the tests are grouped as and the rest as . Then the generative and discriminative approaches are combined in a probabilistic rather than a heuristic way. The principle of the JoDiG approach is quite generic, but for illustrative purposes numerical studies of the paper focus on a widely-used case, in which the DGP assumes a multivariate normal distribution for each class. In this case, the JoDiG approach uses LDA for and LLR for . Numerical experiments on real and simulated data demonstrate that the performance of this new approach to classification is similar to or better than that of its discriminative and generative counterparts, in particular when the size of the training-set is comparable to the dimension of the data. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
34. Tuning cost and performance in multi-biometric systems: A novel and consistent view of fusion strategies based on the Sequential Probability Ratio Test (SPRT)
- Author
-
Allano, Lorene, Dorizzi, Bernadette, and Garcia-Salicetti, Sonia
- Subjects
- *
COST analysis , *PERFORMANCE evaluation , *BIOMETRY , *SEQUENTIAL analysis , *PROBABILITY theory , *STATISTICAL matching , *PATTERN perception - Abstract
Abstract: In this paper we propose a novel sequential score fusion strategy for multi-biometric systems. The strategy's aim is to reduce the cost of a multi-biometric system by dynamically fusing only the optimal number of systems required to take the final decision. In this way we optimise cost and performance in the system at the same time. The novelty of this paper lies in the automatic tuning of the decision parameters (thresholds) to a desired level of performance by revisiting the Sequential Probability Ratio Test (SPRT). [Copyright Elsevier]
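Wald's SPRT, which the paper revisits to tune its thresholds, can be sketched as a running log-likelihood-ratio sum with two stopping thresholds; all names and numbers below are illustrative, not the paper's fusion rule:

```python
from math import log

def sprt_decision(llrs, alpha=0.01, beta=0.01):
    """Wald's SPRT: accumulate per-matcher log-likelihood ratios and stop
    as soon as the sum crosses a threshold set by the target error rates;
    otherwise consult the next (more costly) matcher."""
    upper = log((1 - beta) / alpha)  # accept H1 (genuine)
    lower = log(beta / (1 - alpha))  # accept H0 (impostor)
    s = 0.0
    for used, llr in enumerate(llrs, start=1):
        s += llr
        if s >= upper:
            return "genuine", used
        if s <= lower:
            return "impostor", used
    # all matchers consulted without crossing a threshold: fall back to the sign
    return ("genuine" if s >= 0 else "impostor"), len(llrs)
```

The cost saving comes from the second return value: strong early evidence means fewer biometric systems are consulted.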
- Published
- 2010
- Full Text
- View/download PDF
35. A hybrid biometric cryptosystem for securing fingerprint minutiae templates
- Author
-
Nagar, Abhishek, Nandakumar, Karthik, and Jain, Anil K.
- Subjects
- *
BIOMETRIC identification , *CRYPTOGRAPHY , *TEMPLATE matching (Digital image processing) , *ALGORITHMS , *FUZZY systems , *PERFORMANCE evaluation , *PATTERN perception - Abstract
Abstract: Security concerns regarding stored biometric data are impeding the widespread public acceptance of biometric technology. Though a number of bio-crypto algorithms have been proposed, they have limited practical applicability due to the trade-off between recognition performance and security of the template. In this paper, we improve both the recognition performance and the security of a fingerprint-based biometric cryptosystem called the fingerprint fuzzy vault. We incorporate minutiae descriptors, which capture ridge orientation and frequency information in a minutia's neighborhood, into the vault construction using the fuzzy commitment approach. Experimental results show that with the use of minutiae descriptors, fingerprint matching performance improves from an FAR of 0.7% to 0.01% at a GAR of 95%, with some improvement in security as well. An analysis of security under two different attack scenarios is also presented. A preliminary version of this paper appeared in the International Conference on Pattern Recognition, 2008, and was selected as the Best Scientific Paper in the biometrics track. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
36. Context information from search engines for document recognition
- Author
-
Donoser, Michael, Wagner, Silke, and Bischof, Horst
- Subjects
- *
INFORMATION theory , *PATTERN perception , *PERFORMANCE evaluation , *TEMPLATE matching (Digital image processing) , *WEB search engines - Abstract
Abstract: In this work we propose the use of contextual information provided by web search engine queries to improve text recognition performance. We first describe a framework for automated text recognition from images, based on detecting text areas through analysis of Maximally Stable Extremal Regions (MSERs) and recognizing characters by simple template matching. The main emphasis of the paper is on introducing a novel method for exploiting contextual information to improve the obtained recognition results. We propose to analyze the results of web search engine queries on two levels of detail (word and sentence level), both of which significantly improve the overall text recognition performance. Experimental evaluations on reference datasets prove that dictionary-based methods are outperformed, and that even with a low-quality single-character recognition method the proposed web search engine extension enables reasonable text recognition results. This work received the "Best Scientific Paper Award" at the International Conference on Pattern Recognition (ICPR), 2008. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
37. Unsupervised writer adaptation of whole-word HMMs with application to word-spotting
- Author
-
Rodríguez-Serrano, José A., Perronnin, Florent, Sánchez, Gemma, and Lladós, Josep
- Subjects
- *
SUPERVISED learning , *GRAPHOLOGY , *PATTERN perception , *MARKOV processes , *ADAPTIVE control systems - Abstract
Abstract: In this paper we propose a novel approach for writer adaptation in a handwritten word-spotting task. The method exploits the fact that the semi-continuous hidden Markov model separates the word model parameters into (i) a codebook of shapes and (ii) a set of word-specific parameters. Our main contribution is to employ this property to derive writer-specific word models by statistically adapting an initial universal codebook to each document. This process is unsupervised and does not even require the appearance of the keyword(s) in the searched document. Experimental results show an increase in performance when this adaptation technique is applied. To the best of our knowledge, this is the first work dealing with adaptation for word-spotting. The preliminary version of this paper obtained an IBM Best Student Paper Award at the 19th International Conference on Pattern Recognition. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
38. Novel Gaussianized vector representation for improved natural scene categorization
- Author
-
Zhou, Xi, Zhuang, Xiaodan, Tang, Hao, Hasegawa-Johnson, Mark, and Huang, Thomas S.
- Subjects
- *
GAUSSIAN processes , *VECTOR analysis , *IMAGE analysis , *MACHINE learning , *DISTRIBUTION (Probability theory) , *PERFORMANCE evaluation , *PATTERN perception , *EXPECTATION-maximization algorithms , *CONFERENCES & conventions - Abstract
Abstract: We present a novel Gaussianized vector representation for scene images, obtained by an unsupervised approach. Each image is first encoded as an ensemble of orderless bag-of-features. A global Gaussian Mixture Model (GMM) learned from all images is then used to randomly distribute each feature into one Gaussian component by a multinomial trial, where the posteriors of the feature on all the Gaussian components serve as the parameters of the multinomial distribution. Finally, the normalized means of the features distributed in every Gaussian component are concatenated to form a supervector, which is a compact representation for each scene image. We prove that these supervectors follow the standard normal distribution. The Gaussianized vector representation is a more generalized form of the widely used histogram representation. Our experiments on scene categorization tasks using this vector representation show significantly improved performance compared with the histogram-of-features representation. This paper is an extended version of our work that won the IBM Best Student Paper Award at the 2008 International Conference on Pattern Recognition (ICPR 2008). [Copyright Elsevier]
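A simplified version of the supervector construction can be sketched with diagonal-covariance GMM posteriors and numpy only; the soft assignment and normalization below are assumptions in the spirit of GMM-supervector approaches, not the paper's exact formulation:

```python
import numpy as np

def gmm_supervector(features, means, variances, weights):
    """Soft-assign each local feature to the GMM components, then
    concatenate the normalized, posterior-weighted component means."""
    # log N(x | mu_k, diag(var_k)) for every feature x and component k
    diff = features[:, None, :] - means[None, :, :]                     # (n, K, d)
    logp = -0.5 * (np.log(2 * np.pi * variances)[None]
                   + diff**2 / variances[None]).sum(-1)                 # (n, K)
    logp += np.log(weights)[None]
    post = np.exp(logp - logp.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)                             # posteriors
    nk = post.sum(axis=0)                                               # soft counts
    comp_means = (post[:, :, None] * features[:, None, :]).sum(0) / nk[:, None]
    # center and scale by the global GMM statistics, then flatten
    return ((comp_means - means) / np.sqrt(variances)).ravel()

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 2))                 # toy local features
means = np.array([[-1.0, 0.0], [1.0, 0.0]])       # toy 2-component GMM
variances = np.ones((2, 2))
weights = np.array([0.5, 0.5])
sv = gmm_supervector(feats, means, variances, weights)
```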
- Published
- 2010
- Full Text
- View/download PDF
39. Learning explicit and implicit visual manifolds by information projection
- Author
-
Zhu, Song-Chun, Shi, Kent, and Si, Zhangzhang
- Subjects
- *
MACHINE learning , *INFORMATION processing , *VISUAL perception , *COMPUTATIONAL complexity , *PATTERN perception , *STOCHASTIC processes - Abstract
Abstract: Natural images have a vast amount of visual patterns distributed in a wide spectrum of subspaces of varying complexities and dimensions. Understanding the characteristics of these subspaces and their compositional structures is of fundamental importance for pattern modeling, learning and recognition. In this paper, we start with small image patches and define two types of atomic subspaces: explicit manifolds of low dimensions for structural primitives and implicit manifolds of high dimensions for stochastic textures. Then we present an information-theoretical learning framework that derives common models for these manifolds through information projection, and study a manifold pursuit algorithm that clusters image patches into those atomic subspaces and ranks them according to their information gains. We further show how those atomic subspaces change over an image scaling process and how they are composed to form larger and more complex image patterns. Finally, we integrate the implicit and explicit manifolds to form a primal sketch model as a generic representation in early vision and to generate a hybrid image template representation for object category recognition in high-level vision. The study of the mathematical structures in the image space sheds light on some basic questions in human vision, such as atomic elements in visual perception, the perceptual metrics in various manifolds, and the perceptual transitions over image scales. This paper is based on the J.K. Aggarwal Prize lecture by the first author at the International Conference on Pattern Recognition, Tampa, FL, 2008. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
40. Multi-resolution recognition of 3D objects based on visual resolution limits
- Author
-
Ma, Huimin, Huang, Tiantian, and Wang, Yanzhi
- Subjects
- *
PATTERN perception , *VISUAL perception , *OPTICAL resolution , *THREE-dimensional display systems , *SCIENTIFIC experimentation - Abstract
Abstract: This paper presents a multi-resolution recognition method for 3D objects, based on the human visual model. In the first part of this paper, we propose a new visual resolution limit (VRL) calculation method that considers lens size, the scale of imaging cells and the distance, orientation and velocity of the object. In addition, we simplify 3D models with a novel mesh simplification method based on edge collapse, which controls the simplification degree with VRL. After applying viewpoint space partitioning to the simplified models at different resolutions, we develop a multi-resolution aspect graph library indexed by observation distance. Finally, we propose a 3D object recognition method based on multi-resolution aspect graphs and implement a real-time gradual multi-resolution recognition system that imitates human vision. We design and execute a set of experiments based on plane, car and ship models. Our results demonstrate that our recognition method is effective. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
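The paper's VRL depends on lens size, imaging-cell scale, and the object's distance, orientation and velocity; as a rough sketch of the distance dependence alone, a pinhole-model stand-in (the function, its parameters and the minimum-pixel threshold are all illustrative assumptions, not the authors' formula) could gate the level of detail like this:

```python
def resolution_level(object_size_m, distance_m, focal_px, min_px=4):
    """Illustrative stand-in for a visual resolution limit: the object's
    projected size in pixels under a pinhole camera, converted into the
    number of halvings of detail it supports before features would fall
    below a minimum discernible size (min_px)."""
    projected_px = focal_px * object_size_m / distance_m
    level = 0
    while projected_px / 2 >= min_px:
        projected_px /= 2
        level += 1
    return level
```

A nearer object then supports more simplification levels before detail drops below the limit, which is the property a distance-indexed multi-resolution aspect-graph library exploits.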
41. Acoustic event detection in meeting-room environments
- Author
-
Temko, Andrey and Nadeu, Climent
- Subjects
- *
COMPUTATIONAL auditory scene analysis , *CONFERENCE rooms , *DIGITAL signal processing , *MICROPHONES , *SUPPORT vector machines , *AUDITORY perception , *PATTERN perception - Abstract
Abstract: Acoustic event detection (AED) aims at determining the identity of sounds and their temporal position in the signals that are captured by one or several microphones. The AED problem has been recently proposed for meeting-room or class-room environments, where a specific set of meaningful sounds has been defined, and several evaluations have been carried out (within the international CLEAR evaluation campaigns). This paper reports some work in AED done by the authors in that framework, and particularly presents the extension to the difficult problem of detecting overlapped sounds. Actually, temporal overlaps accounted for more than 70% of errors in the real-world interactive seminar recordings used in CLEAR 2007 evaluations. An attempt to deal with that problem at the level of models using our SVM-based AED system is reported in the paper. The proposed two-step system noticeably outperforms the baseline system for both an artificially generated database and a real seminar recording database. The databases and metrics developed for the CLEAR 2007 evaluations are also described. Finally, a real-time AED system implemented in the UPC’s smart-room using several microphones is reported, along with a GUI-based demo that also includes the output of an acoustic source localization system. [Copyright © Elsevier]
- Published
- 2009
- Full Text
- View/download PDF
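The system itself is SVM-based; as a minimal illustration of why per-class (rather than single-label-per-frame) decisions permit overlapped events, here is a toy detector that thresholds per-class frame scores independently and merges runs of active frames into timed events. The class names and scores are made up.

```python
def detect_events(frame_scores, threshold=0.5):
    """Toy acoustic event detection: frame_scores maps each class to a
    per-frame confidence sequence (e.g. one detector per class).
    Frames at or above threshold are active; runs of active frames are
    merged into (class, start_frame, end_frame) events.  Because each
    class is thresholded independently, events may overlap in time."""
    events = []
    for cls, scores in frame_scores.items():
        start = None
        for i, s in enumerate(scores + [0.0]):      # sentinel closes a trailing event
            if s >= threshold and start is None:
                start = i
            elif s < threshold and start is not None:
                events.append((cls, start, i - 1))
                start = None
    return sorted(events, key=lambda e: e[1])

scores = {
    "speech":    [0.9, 0.9, 0.8, 0.2, 0.1, 0.1],
    "door_slam": [0.1, 0.1, 0.7, 0.8, 0.1, 0.1],  # overlaps speech at frame 2
}
```

A single-label-per-frame detector would have to suppress one of the two events at frame 2; independent per-class decisions keep both.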
42. Recovery of upper body poses in static images based on joints detection
- Author
-
Hu, Zhilan, Wang, Guijin, Lin, Xinggang, and Yan, Hong
- Subjects
- *
PATTERN perception , *HUMAN body , *JOINTS (Anatomy) , *CLOTHING & dress , *IMAGE analysis , *HEURISTIC programming , *MARKOV processes , *MONTE Carlo method - Abstract
Abstract: Recovering human body poses from static images is challenging without prior knowledge of pose, appearance, background and clothing. In this paper, we propose a novel model-based upper-body pose recovery method via effective joints detection. In our research, three observables are firstly detected: face, skin, and torso. Then the joints are properly initialized according to the observables and some heuristic configuration constraints. Finally, the sample-based Markov chain Monte Carlo (MCMC) method is employed to determine the final pose. The main contributions of this paper include a robust torso detector through maximizing a posterior estimation, effective joints initialization, and two continuous likelihood functions developed for effective pose inference. Experiments on 250 real world images show that our method can accurately recover upper body poses from images with a variety of individuals, poses, backgrounds and clothing. [Copyright © Elsevier]
- Published
- 2009
- Full Text
- View/download PDF
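The sample-based MCMC step can be sketched with a generic Metropolis-Hastings sampler over a single scalar "pose" parameter; the Gaussian toy likelihood below stands in for the paper's continuous likelihood functions, and all parameter values are illustrative.

```python
import math
import random

def metropolis_hastings(log_likelihood, init, steps=5000, proposal_std=0.5, seed=0):
    """Generic Metropolis-Hastings: a Gaussian random-walk proposal,
    accepted with probability min(1, L(candidate) / L(current))."""
    rng = random.Random(seed)
    x = init
    log_lx = log_likelihood(x)
    samples = []
    for _ in range(steps):
        cand = x + rng.gauss(0.0, proposal_std)
        log_lc = log_likelihood(cand)
        if log_lc >= log_lx or rng.random() < math.exp(log_lc - log_lx):
            x, log_lx = cand, log_lc
        samples.append(x)
    return samples

# Toy likelihood: a 'joint angle' observed to be near 1.5 rad.
log_lik = lambda x: -((x - 1.5) ** 2) / (2 * 0.1 ** 2)
samples = metropolis_hastings(log_lik, init=0.0)
estimate = sum(samples[1000:]) / len(samples[1000:])   # discard burn-in
```

Even starting far from the true value, the chain concentrates around the likelihood mode; in the full method each joint position is a dimension of the sampled state.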
43. Synthetic data generation technique in Signer-independent sign language recognition
- Author
-
Jiang, Feng, Gao, Wen, Yao, Hongxun, Zhao, Debin, and Chen, Xilin
- Subjects
- *
SIGN language , *PATTERN perception , *DATA analysis , *SYNTHETIC training devices , *INFORMATION processing , *ALGORITHMS - Abstract
Abstract: The lack of training samples is an important problem in the field of sign language recognition. This paper presents a method of generating synthetic multi-stream samples so as to enlarge the training set of signs. The mean shift algorithm is able to obtain the directions of maximum increase and decrease in the density function, so it is used to control the direction and the intensity of synthetic data generation. The synthetic data generation proposed in this paper satisfies the requirement that synthetic samples carry a large amount of effective information about unseen signers. The proposed method is evaluated under different experimental conditions, such as the generating strategy, the capacity of the model, as well as the intensity and direction of the generating process. The results show that in most cases recognition accuracy is improved, and in some cases greatly improved. [Copyright © Elsevier]
- Published
- 2009
- Full Text
- View/download PDF
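The mean shift vector, the kernel-weighted mean of the neighbours minus the current point, gives exactly the density-increase direction the abstract describes for steering generation. A minimal sketch with a Gaussian kernel and made-up 2-D samples (not the paper's multi-stream sign features):

```python
import math

def mean_shift_direction(x, neighbors, bandwidth=1.0):
    """Mean shift vector at x: the Gaussian-weighted mean of the
    neighbours minus x, i.e. the direction of maximum density increase."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, n)) / (2 * bandwidth ** 2))
               for n in neighbors]
    total = sum(weights)
    mean = [sum(w * n[i] for w, n in zip(weights, neighbors)) / total
            for i in range(len(x))]
    return [m - xi for m, xi in zip(mean, x)]

def synthesize(sample, neighbors, intensity=0.5):
    """A synthetic sample: the real one moved along (or, with negative
    intensity, against) the density-increasing direction."""
    shift = mean_shift_direction(sample, neighbors)
    return [s + intensity * d for s, d in zip(sample, shift)]

samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new = synthesize([0.0, 0.0], samples, intensity=0.5)   # pulled toward the centroid
```

The `intensity` parameter corresponds to the generation intensity the paper evaluates; its sign selects the increase or decrease direction.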
44. Non-rigid object tracking in complex scenes
- Author
-
Zhou, Huiyu, Yuan, Yuan, Zhang, Yi, and Shi, Chunmei
- Subjects
- *
AUTOMATIC tracking , *ALGORITHMS , *PATTERN perception , *COMPUTER vision , *PROBABILITY theory , *DISTRIBUTION (Probability theory) , *ANALYSIS of covariance - Abstract
Abstract: A colour-histogram-based tracking algorithm was recently proposed for tracking objects in real circumstances [Zivkovic, Z., Kröse, B. 2004. An EM-like algorithm for color-histogram-based object tracking. In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 798–803]. To improve the performance of this technique in complex scenes, in this paper we propose a new algorithm for optimally adapting the ellipse outlining the objects of interest. This paper presents a Lagrangian based method to integrate a regularising component into the covariance matrix to be computed. Technically, we intend to reduce the residuals between the estimated probability distribution and the expected one. We argue that, by doing this, the shape of the ellipse can be properly adapted in the tracking stage. Experimental results show that the proposed method has favourable performance in shape adaptation and object localisation. [Copyright © Elsevier]
- Published
- 2009
- Full Text
- View/download PDF
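The Lagrangian formulation is specific to the paper, but the effect of adding a regularising component to the tracked ellipse's covariance can be sketched with a simple shrinkage-toward-identity rule (an assumption for illustration, not the authors' exact update):

```python
def regularize_covariance(cov, lam=0.1):
    """Shrink a 2x2 covariance toward a scaled identity:
    cov_reg = (1 - lam) * cov + lam * (tr(cov) / 2) * I.
    This floors the ellipse's minor axis so it cannot degenerate."""
    trace_half = (cov[0][0] + cov[1][1]) / 2.0
    return [[(1 - lam) * cov[i][j] + (lam * trace_half if i == j else 0.0)
             for j in range(2)] for i in range(2)]

# A nearly degenerate ellipse: large major axis, vanishing minor axis.
cov = [[4.0, 0.0], [0.0, 1e-6]]
reg = regularize_covariance(cov)
```

Without such a component, a covariance estimated from noisy colour statistics can collapse along one axis, and the outlining ellipse degenerates to a line segment.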
45. Parametric active contours for object tracking based on matching degree image of object contour points
- Author
-
Chen, Qiang, Sun, Quan-Sen, Heng, Pheng-Ann, and Xia, De-Shen
- Subjects
- *
MAGNETIC resonance , *BRAIN stem , *MAGNETIC fields , *PATTERN perception , *PATTERN recognition systems - Abstract
Abstract: A parametric active contour model is presented for object tracking based on matching degree image of object contour points. We first construct a matching degree image according to object contour points, and track the object using parametric active contours. This paper presents a new feature matching approach and a new directional filter. Assuming that the motion of objects between frames is small, we constrain the motion of the object contour within the contour vicinity defined by a band, which is constructed by the generation method of the narrow band of the level set method. Experimental results demonstrate that our method can effectively track rigid and non-rigid objects. We apply the proposed tracking method to face tracking, outer contour segmentation of left ventricle magnetic resonance (MR) images, and brain stem segmentation of Chinese visible human datasets, which demonstrates that our method is feasible for practical applications. [Copyright © Elsevier]
- Published
- 2008
- Full Text
- View/download PDF
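The band constraining contour motion can be sketched as the set of pixels within a fixed distance of the current contour points (Chebyshev distance here purely for simplicity; the paper constructs it with the level-set narrow-band generation method):

```python
def narrow_band(contour_points, width):
    """Pixels within Chebyshev distance `width` of any contour point:
    the band inside which the contour is allowed to move next frame."""
    band = set()
    for (x, y) in contour_points:
        for dx in range(-width, width + 1):
            for dy in range(-width, width + 1):
                band.add((x + dx, y + dy))
    return band
```

Restricting the active-contour search to this band is what encodes the small-motion assumption: candidate contour points outside the band are never evaluated.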
46. Segmentation of heterogeneous blob objects through voting and level set formulation
- Author
-
Chang, Hang, Yang, Qing, and Parvin, Bahram
- Subjects
- *
PATTERN perception , *PATTERN recognition systems , *CURVES , *ATTENTION , *SENSORY perception - Abstract
Abstract: Blob-like structures occur often in nature, where they aid in cueing and the pre-attentive process. These structures often overlap, form perceptual boundaries, and are heterogeneous in shape, size, and intensity. In this paper, voting, Voronoi tessellation, and level set methods are combined to delineate blob-like structures. Voting and subsequent Voronoi tessellation provide the initial condition and the boundary constraints for each blob, while curve evolution through level set formulation provides refined segmentation of each blob within the Voronoi region. The paper concludes with the application of the proposed method to a dataset produced from cell based fluorescence assays and stellar data. [Copyright © Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
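The Voronoi tessellation step amounts to assigning each pixel to its nearest voted seed, which then bounds each blob's level-set evolution. A brute-force sketch with made-up seed positions:

```python
def voronoi_labels(seeds, width, height):
    """Assign every pixel to its nearest seed (squared Euclidean),
    giving the Voronoi regions that bound each blob's evolution."""
    labels = {}
    for x in range(width):
        for y in range(height):
            labels[(x, y)] = min(range(len(seeds)),
                                 key=lambda i: (x - seeds[i][0]) ** 2
                                             + (y - seeds[i][1]) ** 2)
    return labels

seeds = [(2, 2), (7, 7)]       # e.g. two voting maxima (blob centres)
labels = voronoi_labels(seeds, 10, 10)
```

Because every pixel belongs to exactly one region, two overlapping blobs cannot both claim the same pixel, which is what keeps the per-blob curve evolutions from merging.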
47. Homography-based partitioning of curved surface for stereo correspondence establishment
- Author
-
Su, Jianbo, Chung, Ronald, and Jin, Liang
- Subjects
- *
COLLINEATION , *PATTERN perception , *IMAGE processing , *CORRESPONDENCE analysis (Communications) , *MEASUREMENT - Abstract
Planar homography (collineation) is an image-to-image mapping that could be used to pinpoint stereo correspondences, but its usage has been limited to only planar scenes. This paper describes a mechanism that generalizes the use of planar homography for establishing stereo correspondences over curved scenes. Piecewise-linear approximation is used to describe the curved scene, so that stereo correspondences over images of the scene are captured by a collection of local homographies. Unlike the classical approaches, the mechanism does not employ the smoothness heuristic in establishing correspondences, but is instead based upon the interplay of two processes in an iterative manner: (1) partitioning of the scene into local planar patches based upon the most current set of confirmed stereo correspondences; and (2) prediction of new stereo correspondences by the use of the local homographies defined by the partitions, plus confirmation of such predictions by the image data, thereby enlarging the set of confirmed correspondences in every iteration until no more improvement could be obtained. A key step of the mechanism is to decide what local planar homography, and thereby what planar patch, is to be used for correspondence prediction of any given unmatched feature in any given iteration of the mechanism. With knowledge of the epipolar geometry, homography could be defined by any 3 non-collinear feature correspondences. Thus for any given unmatched feature, there are choices of the aforementioned local homography as defined by any set of three non-collinear matched features in the vicinity of the feature. In this paper, we explore what criteria are to be used to decide which of such triplet sets of matched features should be used. We analyze what errors in correspondence prediction could come with a planar homography. In particular, we classify the errors into scene-related geometric error and computation-related algebraic error. We examine their effects through experiments on simulated data, and propose two ways of deciding which local homography to adopt for any given unmatched feature point. Real image experiments show that the methods could lead to promising results. [Copyright © Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
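A local planar homography is determined by point correspondences; the standard Direct Linear Transform (DLT) estimate from four or more pairs, which is one conventional way to compute the homographies such a mechanism relies on, looks like:

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct Linear Transform: estimate the 3x3 homography H with
    dst ~ H @ src (homogeneous coordinates) from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the right null vector of the constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]          # homographies are defined up to scale

def apply_h(H, pt):
    """Map a 2-D point through H and dehomogenize."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return (p[0] / p[2], p[1] / p[2])

src = [(0, 0), (1, 0), (1, 1), (0, 1)]                  # unit square
dst = [(0, 0), (2, 0), (2.2, 1.1), (0.1, 1.0)]          # perspective warp of it
H = homography_dlt(src, dst)
```

With known epipolar geometry, as the paper notes, three non-collinear correspondences already suffice; the four-point DLT above is the more general unconstrained case.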
48. Optimization based grayscale image colorization
- Author
-
Nie, Dongdong, Ma, Qinyong, Ma, Lizhuang, and Xiao, Shuangjiu
- Subjects
- *
COLORING matter , *IMAGING systems , *PATTERN perception , *PIXELS , *IMAGE processing - Abstract
An optimization based interactive grayscale image colorization method is presented in this paper. The only thing the user needs to do is provide some color hints by scribbles or seed pixels. The main contribution of this paper is that the colorization method greatly reduces computation time, with the same good image quality, through quadtree decomposition based non-uniform sampling. Moreover, by introducing a new simple weighting function to represent intensity similarity in the cost function, annoying color diffusion among different regions is alleviated. Experiments show that this method gives the same good quality of colorized images as the method of Levin et al. with a fraction of the computational cost. [Copyright © Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
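The role of an intensity-similarity weighting can be sketched with a Gaussian affinity and a 1-D Jacobi-style propagation. This is a toy of the optimization idea, not the paper's quadtree-sampled solver; sigma, the hint positions and the intensity profile are all made up.

```python
import math

def affinity(y_r, y_s, sigma=0.1):
    """Intensity-similarity weight between neighbouring pixels: close
    intensities give a large weight, so colour diffuses within a region
    but is blocked at intensity edges."""
    return math.exp(-((y_r - y_s) ** 2) / (2 * sigma ** 2))

def propagate_color(intensity, hints, iters=200):
    """1-D toy of optimization-based colorization: each unconstrained
    pixel repeatedly takes the affinity-weighted average of its
    neighbours' colours; hinted pixels stay fixed."""
    color = [hints.get(i, 0.0) for i in range(len(intensity))]
    for _ in range(iters):
        new = color[:]
        for i in range(len(intensity)):
            if i in hints:
                continue
            nbrs = [j for j in (i - 1, i + 1) if 0 <= j < len(intensity)]
            w = [affinity(intensity[i], intensity[j]) for j in nbrs]
            new[i] = sum(wi * color[j] for wi, j in zip(w, nbrs)) / sum(w)
        color = new
    return color

# Two flat regions split by an intensity edge, one colour hint each.
intensity = [0.2, 0.2, 0.2, 0.8, 0.8, 0.8]
color = propagate_color(intensity, {0: 1.0, 5: -1.0})
```

Each hint floods its own region, while the near-zero affinity across the intensity edge is what suppresses the colour diffusion between regions that the abstract mentions.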
49. Global, local and personalised modeling and pattern discovery in bioinformatics: An integrated approach
- Author
-
Kasabov, Nikola
- Subjects
- *
BIOINFORMATICS , *PATTERN perception , *DATA analysis , *DECISION support systems , *ARTIFICIAL intelligence research , *COMPARATIVE studies , *CANCER treatment , *GENE expression - Abstract
The paper offers a comparative study of major modeling and pattern discovery approaches applicable to the area of data analysis and decision support systems in general, and to Bioinformatics and Medicine in particular. Inductive versus transductive reasoning and global, local, and personalised modeling are compared, and all these approaches are illustrated on a case study of gene expression and clinical data related to cancer outcome prognosis. While inductive modeling is used to develop a model (function) from data on the whole problem space and then to recall it on new data, transductive modeling is concerned with the creation of a single model for every new input vector based on some closest vectors from the existing problem space. A new method, WWKNN (weighted distance, weighted variables K-nearest neighbors), and a framework for the integration of global, local and personalised models for a single input vector are proposed. Integration of data (e.g. clinical and genetic) and of models (e.g. global, local and personalised) for better pattern discovery, adaptation and accuracy of the results are the major points of the paper. [Copyright © Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
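The distance-weighting half of WWKNN can be sketched as a k-NN vote where each neighbour's vote is inversely proportional to its distance; the per-variable importance weighting that the method adds is omitted here, and the two-class data is made up.

```python
def weighted_knn(train, query, k=3):
    """Distance-weighted k-NN: each of the k nearest neighbours votes
    for its label with weight 1 / (distance + eps).
    train is a list of (feature_vector, label) pairs."""
    dist = lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda t: dist(t[0], query))[:k]
    votes = {}
    for vec, label in nearest:
        votes[label] = votes.get(label, 0.0) + 1.0 / (dist(vec, query) + 1e-9)
    return max(votes, key=votes.get)

# Made-up two-class gene-expression-style vectors.
train = [([0.0, 0.0], "low"), ([0.1, 0.1], "low"),
         ([1.0, 1.0], "high"), ([0.9, 1.1], "high")]
```

This is a transductive, personalised model in the paper's sense: a fresh local decision is built around each query vector rather than recalling one global function.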
50. Video text recognition using sequential Monte Carlo and error voting methods
- Author
-
Chen, Datong and Odobez, Jean-Marc
- Subjects
- *
PATTERN perception , *PATTERN recognition systems , *ARTIFICIAL intelligence , *OPTICAL pattern recognition - Abstract
Abstract: This paper addresses the issue of segmentation and recognition of text embedded in video sequences from their associated text image sequence extracted by a text detection module. To this end, we propose a probabilistic algorithm based on Bayesian adaptive thresholding and Monte-Carlo sampling. The algorithm approximates the posterior distribution of segmentation thresholds of text pixels in an image by a set of weighted samples. The set of samples is initialized by applying a classical segmentation algorithm on the first video frame and further refined by random sampling under a temporal Bayesian framework. One important contribution of the paper is to show that, thanks to the proposed methodology, the likelihood of a segmentation parameter sample can be estimated not using a classification criterion or a visual quality criterion based on the produced segmentation map, but directly from the induced text recognition result, which is directly relevant to our task. Furthermore, as a second contribution of the paper, we propose to align text recognition results from high confidence samples gathered over time, and to composite a final result using the ROVER error-voting technique at the character level. Experiments are conducted on a two hour video database. Character recognition rates higher than 93% and word recognition rates higher than 90% are achieved, which are 4% and 3% higher than state-of-the-art methods applied to the same database. [Copyright © Elsevier]
- Published
- 2005
- Full Text
- View/download PDF
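The character-level compositing can be sketched as a per-position majority vote over already-aligned hypotheses (real ROVER also performs the alignment itself; the strings here are made up):

```python
from collections import Counter

def rover_vote(hypotheses):
    """Character-level voting over aligned recognition hypotheses:
    each position takes the majority character across hypotheses."""
    assert len({len(h) for h in hypotheses}) == 1, "hypotheses must be aligned"
    return "".join(Counter(chars).most_common(1)[0][0]
                   for chars in zip(*hypotheses))

# Three noisy OCR readings of the same video text, gathered over time.
hyps = ["BREAKING", "BREAK1NG", "BRE4KING"]
result = rover_vote(hyps)
```

No single reading is fully correct, but each error is outvoted by the other two frames, which is why compositing over time beats any single-frame recognition.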