196 results on '"Local Descriptor"'
Search Results
2. Descriptor Distillation: A Teacher-Student-Regularized Framework for Learning Local Descriptors.
- Author
- Liu, Yuzhen and Dong, Qiulei
- Subjects
- *COMPUTER vision, *DEEP learning, *COMPARATIVE method, *LOCAL knowledge, *DISTILLATION
- Abstract
Learning a fast and discriminative patch descriptor is a challenging topic in computer vision. Recently, many existing works focus on training various descriptor learning networks by minimizing a triplet loss (or its variants), which is expected to decrease the distance between each positive pair and increase the distance between each negative pair. However, such an expectation has to be lowered due to the non-perfect convergence of network optimizer to a local solution. Addressing this problem and the open computational speed problem, we propose a Descriptor Distillation framework for local descriptor learning, called DesDis, where a student model gains knowledge from a pre-trained teacher model, and it is further enhanced via a designed teacher-student regularizer. This teacher-student regularizer is to constrain the difference between the positive (also negative) pair similarity from the teacher model and that from the student model, and we theoretically prove that a more effective student model could be trained by minimizing a weighted combination of the triplet loss and this regularizer, than its teacher which is trained by minimizing the triplet loss singly. Under the proposed DesDis, many existing descriptor networks could be embedded as the teacher model, and accordingly, both equal-weight and light-weight student models could be derived, which outperform their teacher in either accuracy or speed. Experimental results on 3 public datasets demonstrate that the equal-weight student models, derived from the proposed DesDis framework by utilizing three typical descriptor learning networks as teacher models, could achieve significantly better performances than their teachers and several other comparative methods. In addition, the derived light-weight models could achieve 8 times or even faster speeds than the comparative methods under similar patch verification performances. [ABSTRACT FROM AUTHOR]
- Published
- 2024
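The objective described in the abstract above (a weighted combination of the triplet loss and a regularizer on the gap between teacher and student pair similarities) could be sketched roughly as follows. This is only an illustration, not the authors' implementation: the margin, the weight `lam`, the dot-product similarity, the squared-difference form of the regularizer, and all function names are assumptions that the abstract does not specify.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot(a, b):
    """Dot-product similarity (assumed similarity measure)."""
    return sum(x * y for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    return max(0.0, l2_distance(anchor, positive)
               - l2_distance(anchor, negative) + margin)

def teacher_student_regularizer(s_anchor, s_pos, s_neg, t_anchor, t_pos, t_neg):
    """Assumed form: squared gap between student and teacher pair similarities,
    summed over the positive pair and the negative pair."""
    pos_gap = dot(s_anchor, s_pos) - dot(t_anchor, t_pos)
    neg_gap = dot(s_anchor, s_neg) - dot(t_anchor, t_neg)
    return pos_gap ** 2 + neg_gap ** 2

def distillation_loss(student, teacher, margin=1.0, lam=0.5):
    """student / teacher: (anchor, positive, negative) descriptor triples.
    Returns triplet loss on the student plus lam times the regularizer."""
    return (triplet_loss(*student, margin=margin)
            + lam * teacher_student_regularizer(*student, *teacher))
```

When the student reproduces the teacher's pair similarities exactly, the regularizer term vanishes and only the triplet term remains.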
3. Universally describing keypoints from a semi-global to local perspective, without any specific training
- Author
- Su, Shuai, Liu, Chengju, and Chen, Qijun
- Published
- 2024
4. Learning more discriminative local descriptors with parameter-free weighted attention for few-shot learning.
- Author
- Song, Qijun, Zhou, Siyun, and Chen, Die
- Abstract
Few-shot learning for image classification comes up as a hot topic in computer vision, which aims at fast learning from a limited number of labeled images and generalize over the new tasks. In this paper, motivated by the idea of Fisher Score, we propose a Discriminative Local Descriptors Attention model that uses the ratio of intra-class and inter-class similarity to adaptively highlight the representative local descriptors without introducing any additional parameters, while most of the existing local descriptors based methods utilize the neural networks that inevitably involve the tedious parameter tuning. Experiments on four benchmark datasets show that our method achieves higher accuracy compared with the state-of-art approaches for few-shot learning. Specifically, our method is optimal on the CUB-200 dataset, and outperforms the second best competitive algorithm by 4.12 % and 0.49 % under the 5-way 1-shot and 5-way 5-shot settings, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. AWEDD: a descriptor simultaneously encoding multiscale extrinsic and intrinsic shape features.
- Author
- Liu, Shengjun, Luo, Feifan, Li, Qinsong, Liu, Xinru, and Hu, Ling
- Subjects
- *ENCODING, *ANISOTROPY, *SYMMETRY, *GEOMETRY
- Abstract
We construct a novel descriptor called anisotropic wavelet energy decomposition descriptor (AWEDD) for non-rigid shape analysis, based on anisotropic diffusion geometry. We first extend the Dirichlet energy of the vertex coordinate function to an anisotropic version, then use multiscale anisotropic spectral manifold wavelets to decompose the Dirichlet energy to all vertices and collect local energy at each vertex to form AWEDD. AWEDD simultaneously encodes multiscale extrinsic and intrinsic shape features, which are more informative and robust than purely intrinsic or extrinsic descriptors. And the introduction of anisotropy endows AWEDD with stronger abilities of feature discrimination and intrinsic symmetry identification. Our results demonstrate that AWEDD is more discriminative than current state-of-the-art descriptors. In addition, we show that AWEDD is an excellent choice of the initial inputs for various shape analysis approaches, such as functional map pipelines and deep convolutional architectures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. AFSRNet: learning local descriptors with adaptive multi-scale feature fusion and symmetric regularization.
- Author
- Li, Dong, Liang, Haowen, and Lam, Kin-Man
- Subjects
- CONVOLUTIONAL neural networks, COMPUTER performance, DESCRIPTOR systems, DEEP learning
- Abstract
Multi-scale feature fusion has been widely used in handcrafted descriptors, but has not been fully explored in deep learning-based descriptor extraction. Simple concatenation of descriptors of different scales has not been successful in significantly improving performance for computer vision tasks. In this paper, we propose a novel convolutional neural network, based on center-surround adaptive multi-scale feature fusion. Our approach enables the network to focus on different center-surround scales, resulting in improved performance. We also introduce a novel regularization technique that uses second-order similarity to constrain the learning of local descriptors, based on the symmetric property of the similarity matrix. The proposed method outperforms single-scale or simple-concatenation descriptors on two datasets and achieves state-of-the-art results on the Brown dataset. Furthermore, our method demonstrates excellent generalization ability on the HPatches dataset. Our code is released on GitHub: https://github.com/Leung-GD/AFSRNet/tree/main. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Anomaly Detection from Crowded Video by Convolutional Neural Network and Descriptors Algorithm: Survey.
- Author
- Hussan Altalbi, Ali Abid, Shaker, Shaimaa Hameed, and Ali, Akbas Ezaldeen
- Subjects
- CONVOLUTIONAL neural networks, VIDEO surveillance, ALGORITHMS, VIDEOS
- Abstract
The definition of an anomaly depends on the context of interest. A video event that is not expected to take place in the video is regarded as an anomaly. Describing uncommon events in complicated scenes can be difficult, but this problem is frequently addressed by using high-dimensional features and descriptors. Creating a reliable model trained with these descriptors is difficult because it needs a huge number of training samples and is computationally complex. The extracted features typically represent spatiotemporal changes or trajectories. This work presents numerous investigations that address the detection of abnormal events in crowded video, together with their methodology, through the use of low-level features, like global features, local features, and feature features. To detect and identify anomalous behavior in videos as accurately as possible, and to compare the various techniques, this work uses a more crowded and difficult dataset and requires lightweight methods for diagnosing anomalies in objects by recording and tracking movements and extracting features; these features should therefore be strong and able to differentiate objects. After reviewing previous works, this work found a need for greater accuracy and reduced processing time in video modeling, and therefore attempted to work on real-time and outdoor scenes. [ABSTRACT FROM AUTHOR]
- Published
- 2023
8. BDLA: Bi-directional local alignment for few-shot learning.
- Author
- Zheng, Zijun, Feng, Xiang, Yu, Huiqun, Li, Xiuquan, and Gao, Mengqi
- Subjects
- IMAGE representation, DEEP learning, COMPUTER vision, LEARNING goals
- Abstract
Deep learning has been successfully exploited to various computer vision tasks, which depend on abundant annotations. The core goal of few-shot learning, in contrast, is to learn a classifier to recognize new classes from only a few labeled examples that produce a key challenge of visual recognition. However, most of the existing methods often adopt image-level features or local monodirectional manner-based similarity measures, which suffer from the interference of non-dominant objects. To tackle this limitation, we propose a Bi-Directional Local Alignment (BDLA) approach for the few-shot visual classification problem. Specifically, building upon the episodic learning mechanism, we first adopt a shared embedding network to encode the 3D tensor features with semantic information, which can effectively describe the spatial geometric representation of the image. Afterwards, we construct a forward and a backward distance by exploring the nearest neighbor search to determine the semantic region-wise feature corresponding to each local descriptor of query sets and support sets. The bi-directional distance can encourage the alignment between similar semantic information while filtering out the interference information. Finally, we design a convex combination to merge the bi-directional distance and optimize the network in an end-to-end manner. Extensive experiments also show that our proposed approach outperforms several previous methods on four standard few-shot classification datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
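The forward and backward nearest-neighbor distances and their convex combination, as described in the abstract above, might look roughly like this. It is a sketch under stated assumptions rather than the BDLA implementation: the Euclidean metric, the averaging over descriptors, the fixed mixing weight `alpha`, and the function names are all hypothetical choices.

```python
import math

def l2(a, b):
    """Euclidean distance between two local descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def directed_nn_distance(source, target):
    """Average, over source descriptors, of the distance to the nearest
    target descriptor (nearest-neighbor search by brute force)."""
    return sum(min(l2(s, t) for t in target) for s in source) / len(source)

def bidirectional_distance(query, support, alpha=0.5):
    """Convex combination of the forward (query -> support) and
    backward (support -> query) distances; alpha must lie in [0, 1]."""
    forward = directed_nn_distance(query, support)
    backward = directed_nn_distance(support, query)
    return alpha * forward + (1.0 - alpha) * backward
```

The two directions generally differ: a query descriptor with no close match in the support set raises the forward term but leaves the backward term unchanged.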
9. Multi-scale local cues and hierarchical attention-based LSTM for stock price trend prediction.
- Author
- Teng, Xiao, Zhang, Xiang, and Luo, Zhigang
- Subjects
- *STOCK prices, *PRICE fluctuations, *STOCK exchanges, *PRICES, *FUTURES sales & prices, *EARNINGS forecasting, *HISTOGRAMS
- Abstract
Stock price trend prediction is to seek profit maximum of stock investment by estimating future stock price tendency. Nevertheless, it is still a tough task due to noisy and non-stationary properties of stock market. Thus, it is important how to relieve such negative effects and to improve prediction accuracy. In this paper, we leverage four diverse local descriptors in short durations to alleviate noisy fluctuations of stock price. In detail, piecewise aggregate approximation (PAA) collects relatively stable average values; the derivatives of short-time series reflect the change ratio of stock price; the slope implies the short-time price trend; hog-1D aggregates different oriented gradients into histograms in a statistical fashion. They provide diverse and comprehensive cues about the stock price series across different aspects. Building upon such local descriptors, we propose a multi-scale local cues and hierarchical attention-based LSTM model (MLCA-LSTM) to capture the underlying price trend patterns. It has two advantages: 1) multi-scale information is further enriched by performing different scale sliding windows over stock price series to induce diverse local descriptors, 2) temporal dependency and multi-scale interactions are jointly attended and aggregated through the hierarchical attention mechanism and multi-branch LSTM structure. Experiments on the real stock price data confirm the efficacy of the proposed model as compared to the state-of-the-art counterparts. [ABSTRACT FROM AUTHOR]
- Published
- 2022
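Three of the four local descriptors named in the abstract above (piecewise aggregate approximation, derivative, and slope) can be illustrated on a short price window. This is a minimal sketch: treating the derivative as a first difference and the slope as a least-squares fit against the time index are assumptions, the function names are hypothetical, and hog-1D is omitted.

```python
def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each equal-length segment.
    Assumes len(series) is divisible by n_segments."""
    seg = len(series) // n_segments
    return [sum(series[i * seg:(i + 1) * seg]) / seg for i in range(n_segments)]

def derivative(series):
    """First difference between consecutive values (assumed derivative)."""
    return [b - a for a, b in zip(series, series[1:])]

def slope(series):
    """Least-squares slope of the values against their time index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den
```

Applying these functions over sliding windows of several lengths would give the kind of multi-scale cues the abstract mentions.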
10. SDNet: Spatial adversarial perturbation local descriptor learned with the dynamic probabilistic weighting loss.
- Author
- Huang, Kaiji, Yang, Hua, Jiang, Yuyang, and Yin, Zhouping
- Subjects
- *CONVOLUTIONAL neural networks, *COMPUTER vision, *IMAGE registration, *DEEP learning, *GENERALIZATION
- Abstract
The local descriptor is an important upstream component in computer vision tasks. Despite considerable advances with deep learning-based descriptors, recent descriptors are not robust enough to handle widespread viewpoint changes in image matching tasks such as localization and 3D reconstruction. In this study, SDNet, a robust descriptor utilizing spatial adversarial perturbations, is proposed; it is trained with a novel dynamic probabilistic weighting loss to enhance performance under such challenges. First, to increase the robustness and generalization ability of the network across spatially transformed instances, an innovative module for generating hard negative samples via spatial adversarial perturbations is designed. By maximizing adversarial loss, this module generates more complex patches, significantly enhancing the geometric robustness of the descriptor. Importantly, this module integrates seamlessly with existing patch-based descriptors without necessitating extra training data. Second, to mitigate the imbalance in the matching relationship between generated positive and negative pairs, the label weighted triplet loss is proposed, which markedly improves descriptor performance. Third, a comprehensive theoretical analysis of preceding studies is carried out from a gradient perspective, and a probabilistic dynamic weighting approach that adaptively emphasizes weighting functions with higher likelihoods is proposed to improve training performance of the descriptor. Extensive experiments are carried out on mainstream datasets. These comprehensive experiments demonstrate the effectiveness of SDNet, and the proposed method achieves significant improvements on the UBC, HPatches and ETH datasets, outperforming current state-of-the-art methods. The code is available at https://github.com/webd111/sdnet. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Monocular Depth Estimation from a Single Infrared Image.
- Author
- Han, Daechan and Choi, Yukyung
- Subjects
- INFRARED imaging, THERMOGRAPHY, MONOCULARS, IMAGE registration, NETWORK performance
- Abstract
Thermal infrared imaging is attracting much attention due to its strength against illuminance variation. However, because of the spectral difference between thermal infrared images and RGB images, the existing research on self-supervised monocular depth estimation has performance limitations. Therefore, in this study, we propose a novel Self-Guided Framework using a Pseudolabel predicted from RGB images. Our proposed framework, which solves the problem of appearance matching loss in the existing framework, transfers the high accuracy of Pseudolabel to the thermal depth estimation network by comparing low- and high-level pixels. Furthermore, we propose Patch-NetVLAD Loss, which strengthens local detail and global context information in the depth map from thermal infrared imaging by comparing locally global patch-level descriptors. Finally, we introduce an Image Matching Loss to estimate a more accurate depth map in a thermal depth network by enhancing the performance of the Pseudolabel. We demonstrate that the proposed framework shows significant performance improvement even when applied to various depth networks in the KAIST Multispectral Dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2022
12. High-order histogram-based local clustering patterns in polar coordinate for facial recognition and retrieval.
- Author
- Lin, Chih-Wei and Hong, Sidi
- Subjects
- *FACE perception, *HUMAN facial recognition software, *CONVOLUTIONAL neural networks, *CARTESIAN coordinates, *COMPUTER vision, *DEEP learning
- Abstract
Local feature patterns are conspicuous and are widely used in computer vision, especially in face recognition and retrieval. However, designing a statistical descriptor that can be used in various scenarios and effectively presents the detailed local discrimination information of face images remains a challenging and open task, even though deep learning technology is widespread. In this study, we propose a novel local pattern descriptor called the Local Clustering Pattern (LCP) in high-order derivative space for facial recognition and retrieval. Unlike prior methods, LCP exploits the concept of clustering to analyze the relationship of intra- and inter-classes of the referenced pixel and its adjacent pixels to encode the local descriptor for facial recognition. There are three tasks: (1) Local Clustering Pattern (LCP), (2) Clustering Coding Scheme, and (3) High-order Local Clustering Pattern. To generate the local clustering pattern, the local derivative variations in multiple directions are considered and integrated on the rectangular coordinate system with the pairwise combinatorial direction. Moreover, to generate the discriminative local pattern, the features of local derivative variations are transformed from the rectangular coordinate system into the polar coordinate system to generate the characteristics of magnitude (m) and orientation (θ). Then, we shift and project the features (m and θ), which are scattered in the four quadrants of the polar coordinate system, into the first quadrant of polar coordinates to strengthen the relationship of intra- and inter-classes of the referenced pixel and its adjacent pixels. To encode the local pattern, we consider the spatial relationship between the reference pixel and its adjacent pixels and fuse the clustering algorithm into the coding scheme by utilizing the relationship of intra- and inter-classes in a local patch.
In addition, we extend the LCP from low- into high-order derivative space to extract detailed and abundant information for facial description. LCP efficiently encodes the feature of a local region that is discriminative for the inter-classes and robust for the intra-class of the related pixels to describe a face image. This study has three main contributions: (1) we generate novel features with magnitude (m) and orientation (θ) based on the pairs of the derivative variations to describe the characteristics of each pixel, (2) we shift and project the features from the four quadrants of the polar coordinate system into the first quadrant of polar coordinates to strengthen the relationship of intra- and inter-classes between pixels in a local patch, (3) we exploit the concept of clustering, which considers the relationship of intra- and inter-classes of the referenced pixel and its adjacent pixels, to encode the local descriptor in a polar coordinate system for facial recognition and retrieval. Experimental results show that LCP outperforms the existing descriptors (LBP, ELBP, LDP, LTrP, LVP, LDZP, LGHP) on six public datasets (ORL, Extended Yale B, CAS PEAL, LFW, CMU-PIE and FERET) for both face recognition and retrieval tasks. Moreover, we further compare the proposed facial descriptor with popular deep convolutional neural networks to demonstrate the discrimination of the extracted features and the applicability of our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2022
13. Multimodal biometric system combining left and right palmprints
- Author
- Taouche, Chérif and Belhadef, Hacene
- Published
- 2020
14. Shape binary patterns: an efficient local descriptor and keypoint detector for point clouds.
- Author
- Romero-González, Cristina, García-Varea, Ismael, and Martínez-Gómez, Jesus
- Subjects
- POINT cloud, DETECTORS, COMPUTER vision, AUTONOMOUS robots, THREE-dimensional imaging, ROBOT vision
- Abstract
Many of the research problems in robot vision involve the detection of keypoints, areas with salient information in the input images and the generation of local descriptors, that encode relevant information for such keypoints. Computer vision solutions have recently relied on Deep Learning techniques, which make extensive use of the computational capabilities available. In autonomous robots, these capabilities are usually limited and, consequently, images cannot be processed adequately. For this reason, some robot vision tasks still benefit from a more classic approach based on keypoint detectors and local descriptors. In 2D images, the use of binary representations for visual tasks has shown that, with lower computational requirements, they can obtain a performance comparable to classic real-value techniques. However, these achievements have not been fully translated to 3D images, where research is mainly focused on real-value approaches. Thus, in this paper, we propose a keypoint detector and local descriptor based on 3D binary patterns. The experimentation demonstrates that our proposal is competitive against state-of-the-art techniques, while its processing can be performed more efficiently. [ABSTRACT FROM AUTHOR]
- Published
- 2022
15. Learning local descriptors with multi-level feature aggregation and spatial context pyramid.
- Author
- Liang, Pengpeng, Ji, Haoxuanye, Cheng, Erkang, Chai, Yumei, Wang, Liming, and Ling, Haibin
- Subjects
- *DESCRIPTOR systems, *PYRAMIDS, *CONVOLUTIONAL neural networks
- Abstract
• Strengthen the descriptor by effectively combining features at different levels of CNN.
• Capture the spatial information of a local patch with spatial context pyramid.
• Comparable performance with state-of-the-art and comprehensive ablation experiments.
Although efforts have shifted from the hand-crafted realm to learning local descriptors with convolutional neural networks (CNNs), the inherent feature hierarchy within CNN has rarely been explored. To increase both the invariant and discriminative abilities of the CNN-based local descriptors by making use of the complementary representation powers of the feature maps at different levels of CNN, in this paper, we design a multi-level feature aggregation (MLFA) module to communicate information across pyramid levels effectively. Then, each level extracts a feature vector after feature fusion and the final descriptor concatenates these outputs. Moreover, to leverage the spatial structure within a local patch, we propose a novel spatial context pyramid (SCP) module to capture the spatial information. SCP is devised in a residual manner and only several additional parameters are introduced to the model. We implement our algorithm based on the HardNet framework and carry out comprehensive evaluation on the UBC Phototour, HPatches and ETH datasets. The experimental results demonstrate that the proposed method performs favorably against the state-of-the-art ones. Ablation study is also provided to show the effectiveness of each component. [ABSTRACT FROM AUTHOR]
- Published
- 2021
16. Combining Statistical Features and Local Pattern Features for Texture Image Retrieval
- Author
- Hengbin Wang, Huaijing Qu, Jia Xu, Jiwei Wang, Yanan Wei, and Zhisheng Zhang
- Subjects
- Texture image retrieval, local descriptor, statistical modeling, feature fusion, similarity measurement, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The complementary fusion of global and local features can effectively improve the performance of image retrieval. This article proposes a new local texture descriptor, combined with statistical modeling in transform domain for texture image retrieval. The proposed local descriptor calculates the eight directions of the central pixel by using the relationship between the central pixel and the neighboring pixels in six directions, which is called the local eight direction pattern (LEDP). In the texture image retrieval system of this article, the feature extraction part combines global statistical features and local pattern features. Among them, both the relative magnitude (RM) sub-band coefficients and relative phase (RP) sub-band coefficients are modeled as wrapped Cauchy (WC) distribution in the dual-tree complex wavelet transform (DTCWT) domain, and the global statistical features employ the parameters of this model; while the local pattern features respectively choose the local binary pattern (LBP) histogram features in the spatial domain and the LEDP histogram features of each direction sub-band in the DTCWT domain. On the other hand, the similarity measurement selects matching distances for different features and combines them in the form of convex linear optimization. Texture image retrieval experiments are conducted in the Corel-1k database (DB1), Brodatz texture database (DB2) and MIT VisTex texture database (DB3), respectively. Experimental results show that, compared with the best existing methods, the approach proposed in this article has achieved better retrieval performance.
- Published
- 2020
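The local binary pattern (LBP) histogram features named in the abstract above can be illustrated with the basic 8-neighbor LBP. This sketch covers only classic spatial-domain LBP, not the proposed LEDP or the DTCWT-domain modeling; the clockwise bit ordering, the `>=` comparison, skipping of border pixels, and the function names are assumptions made for illustration.

```python
def lbp_code(image, row, col):
    """Basic 8-neighbor LBP code of an interior pixel: each neighbor whose
    intensity is >= the center contributes one bit (clockwise from top-left)."""
    center = image[row][col]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

Histograms of this kind are typically normalized before being compared with a distance measure.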
17. Face recognition with a new local descriptor based on strings of successive values.
- Author
- Zaaraoui, H., El Kaddouhi, S., Saaidi, A., and Abarkan, M.
- Subjects
- FACE perception, PIXELS, ENCYCLOPEDIAS & dictionaries
- Abstract
In this paper, a novel face recognition approach based on strings of successive values (SSV) is presented. In contrast to most of the existing local descriptors which encode only a limited number of pixels included in a mask, the strings extract more discriminative information over the whole face region, by moving from the current pixel to the next one, and to the other next, and so on, according to the variations of their intensities. Therefore, the SSV can be stopped in any place of the face area, which allows us to encode more edge information and texture information than the existing methods. The proposed face recognition scheme requires several steps. Firstly, the images are divided into non-overlapping sub-regions from which the strings are extracted since each pixel produces two different strings. Thereafter, the dictionary of visual words is created to reduce the number of strings obtained from each patch of the image. Therefore, the face image is described only by visual words, because each string is replaced by its nearest dictionary word. As a result, the occurrence of visual words is computed in a histogram as a face descriptor. Finally, the recognition is performed by using the nearest neighbor classifier with the Hellinger distance. The effectiveness of the proposed approach is evaluated on three different databases, and the experimental results show that the recognition performances achieved are competitive or even outperform the literature state of the art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
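The final recognition step in the abstract above (a nearest neighbor classifier over visual-word histograms using the Hellinger distance) could be sketched as below. The histogram normalization, the gallery layout as `(label, histogram)` pairs, and the function names are assumptions; string extraction and dictionary construction are not shown.

```python
import math

def normalize(hist):
    """Scale a non-negative histogram so that its bins sum to 1."""
    total = sum(hist)
    return [h / total for h in hist]

def hellinger(p, q):
    """Hellinger distance between two normalized histograms (range 0 to 1)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2
                         for a, b in zip(p, q))) / math.sqrt(2)

def nearest_neighbor_label(query_hist, gallery):
    """Return the label of the gallery histogram closest to the query.
    gallery: list of (label, histogram) pairs (hypothetical layout)."""
    q = normalize(query_hist)
    best = min(gallery, key=lambda item: hellinger(q, normalize(item[1])))
    return best[0]
```

For example, `nearest_neighbor_label([3, 1, 0], [("A", [6, 2, 0]), ("B", [0, 1, 3])])` selects `"A"`, whose normalized histogram is identical to the query's.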
18. Robust H∞ deconvolution filtering of 2-D digital systems of orthogonal local descriptor.
- Author
- El Mallahi, Mostafa, Boukili, Bensalem, Zouhri, Amal, Hmamed, Abdelaziz, and Qjidaa, Hassan
- Subjects
- ORTHOGONAL systems, DECONVOLUTION (Mathematics), FEATURE extraction, LIGHT filters, LINEAR matrix inequalities
- Abstract
In this work, we propose a new H∞ deconvolution filtering scheme for 2-D color images using local-descriptor feature extraction and the Fornasini-Marchesini II (FM-II) model. The principal goal is to design a 2-D deconvolution filter that reconstructs the noisy color image from the minimal information extracted from the local Krawtchouk moment, such that the filtering error system is asymptotically stable and satisfies the H∞ performance index. A sufficient condition is given to ensure the H∞ performance of the filtering error system through Lyapunov theory, and the local Krawtchouk moment provides the feature extraction according to an order defined in advance, instead of using the global color image. Moreover, the 2-D deconvolution filter is designed to achieve the H∞ performance index, with the filter parameters determined by solving a certain optimization problem. Finally, a simulation example is provided to demonstrate the usefulness of the proposed design methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
19. QSSR Modeling of Bacillus Subtilis Lipase A Peptide Collision Cross-Sections in Ion Mobility Spectrometry: Local Descriptor Versus Global Descriptor.
- Author
- Ni, Zhong, Wang, Anlin, Kang, Lingyu, and Zhang, Tiancheng
- Subjects
- *MONTE Carlo method, *ION mobility spectroscopy, *BACILLUS subtilis, *LIPASES, *COLLISION broadening, *AMINO acid residues, *MOLECULAR structure, *TRIPEPTIDES
- Abstract
To investigate the structure-dependent peptide mobility behavior in ion mobility spectrometry (IMS), quantitative structure-spectrum relationship (QSSR) is systematically modeled and predicted for the collision cross section Ω values of totally 162 single-protonated tripeptide fragments extracted from the Bacillus subtilis lipase A. Two different types of structure characterization methods, namely, local and global descriptor as well as three machine learning methods, namely, partial least squares (PLS), support vector machine (SVM) and Gaussian process (GP), are employed to parameterize and correlate the structures and Ω values of these peptide samples. In this procedure, the local descriptor is derived from the principal component analysis (PCA) of 516 physicochemical properties for 20 standard amino acids, which can be used to sequentially characterize the three amino acid residues composing a tripeptide. The global descriptor is calculated using CODESSA method, which can generate > 200 statistically significant variables to characterize the whole molecular structure of a tripeptide. The obtained QSSR models are evaluated rigorously via tenfold cross-validation and Monte Carlo cross-validation (MCCV). A comprehensive comparison is performed on the resulting statistics arising from the systematic combination of different descriptor types and machine learning methods. It is revealed that the local descriptor-based QSSR models have a better fitting ability and predictive power, but worse interpretability, than those based on the global descriptor. In addition, since the QSSR modeling using local descriptor does not consider the three-dimensional conformation of tripeptide samples, the method would be largely efficient as compared to the global descriptor. [ABSTRACT FROM AUTHOR]
- Published
- 2021
20. Using Deep Neural Networks to Improve the Performance of Protein–Protein Interactions Prediction.
- Author
- Gui, Yuan-Miao, Wang, Ru-Jing, Wang, Xue, and Wei, Yuan-Yuan
- Subjects
- *PROTEIN-protein interactions, *FORECASTING, *NETWORK performance, *AMINO acid sequence, *DRUG development
- Abstract
Protein–protein interactions (PPIs) help to elucidate the molecular mechanisms of life activities and have a certain role in promoting disease treatment and new drug development. With the advent of the proteomics era, some PPIs prediction methods have emerged. However, the performances of these PPIs prediction methods still need to be optimized and improved. In order to optimize the performance of the PPIs prediction methods, we used the dropout method to reduce over-fitting by deep neural networks (DNNs), and combined with three types of feature extraction methods, conjoint triad (CT), auto covariance (AC) and local descriptor (LD), to build DNN models based on amino acid sequences. The results showed that the accuracy of the CT, AC and LD increased from 97.11% to 98.12%, 96.84% to 98.17%, and 95.30% to 95.60%, respectively. The loss values of the CT, AC and LD decreased from 27.47% to 14.96%, 65.91% to 17.82% and 36.23% to 15.34%, respectively. Experimental results show that dropout can optimize the performances of the DNN models. The results can provide a resource for scholars in future studies involving the prediction of PPIs. The experimental code is available at https://github.com/smalltalkman/hppi-tensorflow. [ABSTRACT FROM AUTHOR]
- Published
- 2020
21. A dynamic inverse distance weighting-based local face descriptor.
- Author
-
Cevik, Nazife
- Subjects
HUMAN facial recognition software ,DISTANCES ,PIXELS - Abstract
This paper proposes a novel high-performance dynamic inverse distance weighting based local descriptor (DIDWLD) for facial recognition. Studies proposed thus far have focused on finding local descriptors that best represent the texture of the face. However, the robustness of the descriptors against rotational variances and noise effects has been largely neglected. Thus, this study is concerned not only with proposing a highly discriminative descriptor, but also one that is robust against rotational changes and noise effects. DIDWLD is mainly based on Inverse Distance Weighting (IDW). That is, for each pixel in the image, a new descriptive value is calculated, taking into account the intensity values of the neighboring pixels and their distances to the reference pixel. A dynamic distance-decay parameter is applied throughout the image rather than keeping it uniform as done in ordinary IDW. The calculated descriptor is independent of changes in rotation because, when calculating the descriptor, the intensity values of the surrounding pixels and their distances to the reference pixel are taken into consideration, while their directional relation to the reference pixel is ignored. Furthermore, when a pixel is affected by noise, its neighboring pixels are inherently affected as well. Hence, by taking into account the effect of the surrounding pixels as well as the original intensity value of the pixel, the degrading impact of noise on recognition performance is mitigated. The results of extensive simulations show the remarkable and competitive performance of the proposed method regarding recognition accuracy and robustness against rotational variances and noise effects. [ABSTRACT FROM AUTHOR]
- Published
- 2020
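The inverse distance weighting idea summarized in the abstract of entry 21 can be illustrated with a minimal sketch. This is not the DIDWLD method itself: it implements ordinary IDW with a fixed decay parameter, omits the dynamic distance-decay parameter and the contribution of the pixel's own intensity, and the names `idw_value`, `radius` and `power` are hypothetical. It only shows why a value built from neighbor intensities and distances, with direction ignored, does not change when the neighborhood is rotated by 90 degrees.

```python
import math


def idw_value(image, row, col, radius=1, power=2.0):
    """Ordinary inverse-distance-weighted mean of the neighbours of (row, col).

    Assumption: `power` is a fixed decay parameter; the paper's dynamic
    parameter is not specified in the abstract and is not modelled here.
    """
    height, width = len(image), len(image[0])
    weighted_sum = 0.0
    weight_total = 0.0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc == 0:
                continue  # skip the reference pixel itself
            r, c = row + dr, col + dc
            if 0 <= r < height and 0 <= c < width:
                # Only the distance enters the weight, never the direction.
                weight = 1.0 / math.hypot(dr, dc) ** power
                weighted_sum += weight * image[r][c]
                weight_total += weight
    if weight_total == 0.0:
        return float(image[row][col])  # isolated pixel: nothing to average
    return weighted_sum / weight_total


patch = [[10, 0, 0], [0, 5, 0], [0, 0, 0]]
rotated = [[0, 0, 10], [0, 5, 0], [0, 0, 0]]  # same patch rotated by 90 degrees
print(idw_value(patch, 1, 1), idw_value(rotated, 1, 1))
```

The two printed values agree up to floating-point rounding, because the bright pixel sits at the same distance from the center in both patches.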
22. Evaluating dynamic texture descriptors to recognize human iris in video image sequence.
- Author
-
de Melo Langoni, Virgílio and Gonzaga, Adilson
- Subjects
- *
TEXTURE analysis (Image processing) , *IRIS recognition , *MOTION analysis , *TIME dilation , *TEXTURES , *VIDEOS - Abstract
In the last decades, iris features have been widely used in biometric systems. Because iris features are virtually unique for each person, their usage is highly reliable. However, biometric systems based on iris features are not completely fraud-resistant, as most systems use static images and do not distinguish between a live iris and a photograph. The iris structure and texture change with light variations, and traditional techniques for iris recognition always identify the iris texture in a controlled environment. However, in uncontrolled environments, live irises are recognized by their dynamic response to light: If the light changes, the pupils dilate or contract, and their texture dynamically changes. If a biometric system can identify people during the constriction or dilation time interval, that system will be more fraud-resistant. This paper proposes a new methodology to evaluate the "dynamic texture" from iris image sequences (motion analysis) and measure the discriminant power of these features for biometric system applications. We propose two new dynamic descriptors—dynamic local mapped pattern and dynamic sampled local mapped pattern—which are extensions of the local mapped pattern previously published for texture classification. We applied our proposed dynamic texture descriptors in a sequence of iris images segmented from video under light variation. Then, we compared our results with the well-known dynamic texture descriptor local binary pattern from three orthogonal planes (LBP-TOP). We used statistical measures to evaluate the performance of both descriptors and concluded that our methodology performed better than the LBP-TOP. Moreover, our descriptors can extract dynamic textures faster than the LBP-TOP. [ABSTRACT FROM AUTHOR]
- Published
- 2020
23. LOG-GABOR BINARIZED STATISTICAL DESCRIPTOR FOR FINGER KNUCKLE PRINT RECOGNITION SYSTEM.
- Author
-
Chaa, Mourad
- Subjects
ERROR rates - Abstract
This paper proposes a new local image descriptor for Finger Knuckle Print Recognition Systems (FKPRS), named the Log-Gabor Binarized Statistical Image Features descriptor (LGBSIF). The idea of LGBSIF is based on the Log-Gabor wavelet representation of the image and the Binarized Statistical Image Features (BSIF). First, the Region of Interest (ROI) of each FKP image is analyzed with a 1D Log-Gabor wavelet to extract the preliminary features, which are given by the real and imaginary parts of the filtered image. The main motive of LGBSIF is to enhance the Log-Gabor real and imaginary features by applying the BSIF coding method. Second, the histograms extracted from the encoded real and imaginary images are concatenated into one large feature vector. Third, the PCA+LDA technique is used to reduce the dimensionality of this feature vector and enhance its discriminatory power. Finally, a Nearest Neighbor Classifier that uses the Cosine distance is employed for the matching process. The performance of the proposed system is evaluated on the Poly-U FKP database. The experimental results show that the proposed system achieves better results than other state-of-the-art systems and confirm the robustness of the proposed descriptor. Further, the results show that the recognition rate (Rank1) and equal error rate (EER) of the introduced system are 100% and 0% for the identification and verification modes, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2020
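The final matching step named in the abstract of entry 23, a nearest neighbor classifier with the cosine distance, can be sketched as follows. This is a generic illustration, not the authors' code: the function names and the toy gallery are hypothetical, the feature vectors are assumed to be non-zero, and the Log-Gabor, BSIF and PCA+LDA stages that would produce the vectors are not shown.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b); assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbor(query, gallery):
    """Return the label of the gallery vector closest to `query`.

    `gallery` is a list of (label, feature_vector) pairs (hypothetical layout).
    """
    best_label, _ = min(gallery, key=lambda item: cosine_distance(query, item[1]))
    return best_label


# Toy gallery with two enrolled identities (illustrative values only).
gallery = [("subject_a", [1.0, 0.0]), ("subject_b", [0.0, 1.0])]
print(nearest_neighbor([2.0, 0.1], gallery))
```

Because the cosine distance depends only on the angle between vectors, the query `[2.0, 0.1]` is matched to `subject_a` even though its length differs from the enrolled vector.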
24. Steel Surface Defect Classification Based on Discriminant Manifold Regularized Local Descriptor
- Author
-
Jiuliang Zhao, Yishu Peng, and Yunhui Yan
- Subjects
Steel surface defect classification ,local descriptor ,discriminant manifold learning ,manifold metric ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Steel surfaces exhibit various sorts of defects due to the production technique and environment. The appearance of a defect follows a much more random pattern than that of a normal texture image. Therefore, it is challenging to capture the discriminant information needed to categorize the defects. Defect images are not registered in grayscale, and thus local descriptors tend to be used for feature extraction. In previous works that use a local descriptor to categorize defect images, a thresholding operator participates in the hand-crafted feature extraction, as in local binary patterns and histogram of oriented gradients, leading to sub-optimal features. By introducing a learning mechanism into the construction of the local descriptor, a novel algorithm named discriminant manifold regularized local descriptor (DMRLD) is proposed in this paper for the defect classification task. First, the DMRLD computes the dense pixel difference vector (DPDV) to capture the local information of defect images. Then, the manifold of these DPDVs is constructed by searching for a number of linear models to represent the feature. In order to enhance the discriminant ability of the feature, a projection on the manifold is learned to obtain a low-dimensional subspace. Finally, the manifold distance defined in the subspace is used for matching to determine the category of the defect image. The proposed algorithm is first applied to the Kylberg texture dataset to evaluate its texture feature extraction performance, and experiments on a real steel surface defect dataset are then conducted to illustrate the effectiveness of DMRLD compared with other local descriptors.
- Published
- 2018
25. Pyramid Histogram of Double Competitive Pattern for Finger Vein Recognition
- Author
-
Yu Lu, Sook Yoon, Shiqian Wu, and Dong Sun Park
- Subjects
Competitive pattern ,finger vein ,local descriptor ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Finger vein is a new and secure biometric for personal authentication due to its line-structure network with abundant local and orientation features. However, these features cannot be well represented by the existing local descriptors. To effectively utilize these rich orientation features in finger vein images, this paper proposes a new local descriptor, namely, pyramid histogram of double competitive pattern (PHDCP). For a finger vein image, the PHDCP first obtains a bank of filtered images using Gabor filters with large kernel size and rich orientations, by which the local line features are captured. Then, the orientation orders with the largest and smallest responses, which are the most robust features, are selected to generate an encoded map. Finally, a column-partition-based pyramid histogram extraction method is presented to capture the hierarchical features from the encoded image. Numerical experiments are conducted on two public finger vein data sets, MMCBNU_6000 and UTFVP. The experimental results demonstrate that the proposed PHDCP performs much better than the existing local descriptors.
- Published
- 2018
26. Background modelling using discriminative motion representation
- Author
-
Zuofeng Zhong, Yong Xu, Zuoyong Li, and Yinnan Zhao
- Subjects
discriminative motion representation ,background modelling method ,local descriptor ,weighted combination ,differential excitations ,discriminability enhancement ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Computer software ,QA76.75-76.765 - Abstract
Robustness is an important factor for background modelling in various scenarios. The current pixel-based adaptive segmentation method cannot effectively handle diverse objects simultaneously. To address this problem, in this study, a background modelling method using discriminative motion representation is proposed. Instead of simply using intensity to construct the background model, the proposed method extracts a new local descriptor which uses a weighted combination of differential excitations for each pixel to enhance the discriminability of pixels. On the basis of this background model, different categories of objects can be quickly identified by a simple but effective classification rule and accurately represented in the background model by a smart selection of updating strategies. Therefore, the authors' background modelling method can generate a complete representation for static objects and decrease false detections caused by dynamic background or illumination variations. Extensive experiments have been conducted to demonstrate that the proposed method performs better at foreground detection than the state-of-the-art methods. In addition, the proposed method provides a computationally efficient algorithm for foreground detection tasks.
- Published
- 2017
27. Revisiting correlation-based filters for low-resolution and long-term visual tracking.
- Author
-
Fazl-Ersi, Ehsan and Kazemi Nooghabi, Masoud
- Subjects
- *
FILTERS & filtration , *ARTIFICIAL satellite tracking , *NEIGHBORHOODS - Abstract
In this paper, we revisit the problem of visual tracking by introducing a novel low-dimensional descriptor based on gradient distribution and specifically focus our attention on the problem of low-resolution and long-term visual tracking. We show that our tracking solution empowered by our proposed descriptor can effectively address the existing challenges in low-resolution and long-term visual tracking. Compared to the existing descriptors, the proposed method provides better robustness against local geometric and photometric variations. It adopts a new approach for aggregating information in a local neighborhood such that the sensitivity of the descriptor to noise and unreliable texture information is reduced. Integrating the proposed descriptor into a correlation-based tracking framework results in a robust and fast visual tracker. An extensive set of experiments on a number of large-scale benchmark datasets shows that the proposed method outperforms the state-of-the-art methods on low-resolution and long-term challenges, while achieving state-of-the-art performance in generic tracking. [ABSTRACT FROM AUTHOR]
- Published
- 2019
28. Local directional relation pattern for unconstrained and robust face retrieval.
- Author
-
Dubey, Shiv Ram
- Subjects
DESCRIPTOR systems ,HUMAN facial recognition software ,IMAGE retrieval ,DEEP learning - Abstract
Face recognition is still a very demanding area of research. The problem becomes more challenging in unconstrained environments and in the presence of variations such as pose, illumination and expression. Local descriptors are widely used for this task. Most of the existing local descriptors consider only a few immediate local neighbors and are not able to utilize the wider local information to make the descriptor more discriminative. Descriptors based on wider local information mainly suffer from increased dimensionality. In this paper, this problem is solved by encoding the relationship among directional neighbors in an efficient manner. The relationship between the center pixel and the encoded directional neighbors is then utilized to form the proposed local directional relation pattern (LDRP). The descriptor is inherently invariant to uniform illumination changes. A multi-scale mechanism is also adopted to further boost the discriminative ability of the descriptor. The proposed descriptor is evaluated under the image retrieval framework over face databases. Very challenging databases such as PaSC, LFW, PubFig, ESSEX, FERET, AT&T, and FaceScrub are used to test the discriminative ability and robustness of the LDRP descriptor. Results are also compared with recent state-of-the-art face descriptors such as LBP, LTP, LDP, LDN, LVP, DCP, LDGP and LGHP. Very promising performance is observed using the proposed descriptor over very appealing face databases as compared to the existing face descriptors. The proposed LDRP descriptor also outperforms the pre-trained ImageNet CNN models on the large-scale FaceScrub face dataset. Moreover, it also outperforms the deep learning based DLib face descriptor in many scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2019
29. A Fast Action Recognition Strategy Based on Motion Trajectory Occurrences.
- Author
-
Garzón, G. and Martínez, F.
- Abstract
A few light stimuli coherently distributed in space and time are the essential input that a visual system needs to perceive motion. Inspired by this fact, a compact motion descriptor is herein proposed to describe patterns of neighboring trajectories for human action recognition. The proposed method introduces a strategy that models the local distribution of neighboring points by defining a spatial point process around motion trajectories. In particular, a two-level occurrence analysis is carried out to discover the motion patterns that underlie the trajectory point representation. Firstly, local occurrence words are computed over a circular grid layout that is centered at a fixed position for each trajectory. Then, a regional occurrence description is achieved by representing actions as the most frequent local words that occur in a particular video. This second occurrence layer can be computed for the entire video or for each frame to achieve online recognition. This compact descriptor, with a local size of 72 and a sequence descriptor size of 400, is well suited to real-time applications and environments with hardware restrictions. The proposed strategy was evaluated on the KTH and Weizmann datasets, achieving an average accuracy of 91.2% and 78%, respectively. Moreover, online recognition was further performed on UT-Interaction, achieving an accuracy of 67% by using only the first 25% of the video sequences. [ABSTRACT FROM AUTHOR]
- Published
- 2019
30. Face retrieval using frequency decoded local descriptor.
- Author
-
Dubey, Shiv Ram
- Subjects
DESCRIPTOR systems ,COMPUTER vision ,PROBLEM solving - Abstract
Local descriptors have been the backbone of most computer vision problems. Most of the existing local descriptors are generated over the raw input images. In order to increase the discriminative power of local descriptors, some researchers have converted the raw image into multiple images with the help of high and low pass frequency filters; the local descriptors are then computed over each filtered image and finally concatenated into a single descriptor. By doing so, these approaches do not utilize the inter-frequency relationship, which limits the improvement in the discriminative power of the descriptor that could otherwise be achieved. In this paper, this problem is solved by applying the decoder concept of the multi-channel decoded local binary pattern to the multi-frequency patterns. A frequency decoded local binary pattern (FDLBP) is proposed with two decoders. Each decoder works with one low frequency pattern and two high frequency patterns. Finally, the descriptors from both decoders are concatenated to form a single descriptor. The face retrieval experiments are conducted over four challenging benchmark databases: PaSC, LFW, PubFig, and ESSEX. The experimental results confirm the superiority of the FDLBP descriptor as compared to state-of-the-art descriptors such as LBP, SOBEL_LBP, BoF_LBP, SVD_S_LBP, mdLBP, etc. [ABSTRACT FROM AUTHOR]
- Published
- 2019
31. DLGBD: A directional local gradient based descriptor for face recognition.
- Author
-
Cevik, Nazife and Cevik, Taner
- Subjects
HUMAN facial recognition software ,PIXELS ,ROTATIONAL motion - Abstract
This paper proposes a novel high-performance gradient-based local descriptor that handles the prominent challenges of face recognition, such as resistance against rotation, illumination changes and noise effects. One of the novelties of this study is that, while processing the gradient for each direction, the analysis considers the predecessors of the corresponding pixel as well as its successors in that direction. Furthermore, earlier studies represent these local relationships by encoding them in binary because they consider only positive and negative intensity changes. We instead propose an alternative representation, called Directional Local Gradient Based Descriptor (DLGBD), that encodes the relationships between each pixel and its neighbors in a multi-valued logic manner. Our method considers not only variations but also uniformity. A threshold value is defined to identify whether an intensity variation is present in the specified direction. If the intensity change exceeds the threshold value, it is evaluated as either a positive or a negative variation, depending on the direction of the change. Three states of the relationship between multiple pixels in each direction yield a more discriminative descriptor for face retrieval. Ternary logic is applied to express the three states. The ternary values calculated in each direction are concatenated, and the reference pixel is replaced with the resulting compound ternary value. In this way, a more discriminative face descriptor is achieved which is resistant to noise and to the challenges of unconstrained environments. Extensive simulations are conducted over benchmark datasets and the performance of DLGBD is compared to other state-of-the-art methods. As shown by the simulation results, DLGBD achieves very high discriminating performance and provides resistance against rotation and illumination variations. [ABSTRACT FROM AUTHOR]
- Published
- 2019
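The thresholded three-state encoding described in the abstract of entry 31 can be sketched as below. This is a simplified illustration under stated assumptions, not the DLGBD algorithm: it compares the reference pixel with a single neighbor per direction (the paper also analyses predecessors and successors along each direction), the digit assigned to each state is an arbitrary choice, and the names `ternary_state` and `ternary_code` are hypothetical.

```python
def ternary_state(center, neighbour, threshold):
    """Map an intensity change to one of three states.

    Assumed digit assignment (not taken from the paper):
    2 = positive variation, 0 = negative variation, 1 = uniform.
    """
    diff = neighbour - center
    if diff > threshold:
        return 2
    if diff < -threshold:
        return 0
    return 1


def ternary_code(center, neighbours, threshold):
    """Concatenate the per-direction states into one base-3 number."""
    code = 0
    for value in neighbours:
        code = code * 3 + ternary_state(center, value, threshold)
    return code


# Reference pixel 100 with four directional neighbours and threshold 5:
# the states are 2, 1, 0, 1, so the code is 2*27 + 1*9 + 0*3 + 1 = 64.
print(ternary_code(100, [110, 100, 90, 103], 5))
```

With a binary encoding the neighbors 100 and 103 would be forced into a positive or negative class; the middle state lets small changes below the threshold be recorded as uniform.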
32. DLFace: Deep local descriptor for cross-modality face recognition.
- Author
-
Peng, Chunlei, Wang, Nannan, Li, Jie, and Gao, Xinbo
- Subjects
- *
HUMAN facial recognition software , *DESCRIPTOR systems , *ARTIFICIAL neural networks , *DATA distribution , *DISCRIMINANT analysis - Abstract
Highlights
• We develop a deep local descriptor for cross-modality face recognition, which can learn discriminant information from image patches.
• We propose an enumeration loss function to eliminate the modality gap at the local patch level, which is integrated into a convolutional neural network.
• Extensive experiments show that DLFace outperforms existing methods, which demonstrates the effectiveness of our method.
Abstract
Cross-modality face recognition aims to identify faces across different modalities, such as matching sketches with photos, low resolution face images with high resolution images, and near infrared images with visual lighting images, which is challenging because of the modality gap caused by texture, resolution, and illumination variations. Existing approaches either use hand-crafted features, which ignore the inherent characteristics of the data distribution, or apply deep learning-based algorithms to holistic face images, ignoring local facial information. In this paper, we propose a deep local descriptor learning framework for cross-modality face recognition, which aims to learn discriminant and compact local information directly from raw facial patches. A novel cross-modality enumeration loss is proposed to eliminate the modality gap at the local patch level, and is then integrated into a convolutional neural network for deep local descriptor extraction. The proposed deep local descriptor can be easily applied to any traditional face recognition system, and we use Fisherface as an example in the paper. Extensive experiments on six widely used cross-modality face recognition datasets demonstrate the superiority of the proposed method over state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
33. A Novel Multi-Feature Representation of Images for Heterogeneous IoTs
- Author
-
Laihang Yu, Lin Feng, Chen Chen, Tie Qiu, Li Li, and Jun Wu
- Subjects
Internet of Things ,feature extraction ,image retrieval ,image representation ,local descriptor ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
With the heterogeneous applications of Internet of Things (IoT) technology, heterogeneous IoT systems generate a large amount of heterogeneous data, including videos and images. How to efficiently represent these images is an important and challenging task. As a local descriptor, texton analysis has attracted wide attention in the field of image processing. A variety of texton-based methods have been proposed in the past few years and have achieved excellent performance. However, some problems remain to be solved; in particular, it is difficult to describe images of complex scenes from the IoT. To address this problem, this paper proposes a multi-feature representation method called the diagonal structure descriptor. It is more suitable for intermediate feature extraction and conducive to multi-feature fusion. Based on the visual attention mechanism, five kinds of diagonal structure textons are defined by the color differences of diagonal pixels. Then, four types of visual features are extracted from the mapping sub-graphs and integrated into a 1-D vector. Various experiments on three Corel datasets demonstrate that the proposed method performs better than several state-of-the-art methods.
- Published
- 2016
34. Selecting Discriminative Binary Patterns for a Local Feature
- Author
-
Li Yingying, Tan Jieqing, and Zhong Jinqin
- Subjects
selecting patterns ,searching tree ,local descriptor ,matching ,binary pattern ,Cybernetics ,Q300-390 - Abstract
Local descriptors based on a binary pattern feature have state-of-the-art distinctiveness. However, their high dimensionality prevents fast matching and use on low-end devices. In this paper we propose an efficient and feasible learning method to select discriminative binary patterns for constructing a compact local descriptor. In the selection, a search tree with Branch&Bound is used instead of exhaustive enumeration, in order to avoid excessive computation during training. New local descriptors are constructed based on the selected patterns. The effectiveness of the binary pattern selection is confirmed by evaluating the performance of these new local descriptors in image matching and object recognition experiments.
- Published
- 2015
35. Acoustic Scene Classification Using Efficient Summary Statistics and Multiple Spectro-Temporal Descriptor Fusion.
- Author
-
Ye, Jiaxing, Kobayashi, Takumi, Toyama, Nobuyuki, Tsuda, Hiroshi, and Murakawa, Masahiro
- Subjects
FEATURE extraction ,ARITHMETIC mean ,EXTREME value theory - Abstract
This paper presents a novel approach for acoustic scene classification based on efficient acoustic feature extraction using spectro-temporal descriptor fusion. Grounded on the finding in neuroscience that the "auditory system summarizes the temporal details of sounds using time-averaged statistics to understand acoustic scenes", we devise an efficient computational framework for sound scene classification by using the fusion of multiple time-frequency descriptors with discriminant information enhancement. To characterize the rich information of sound, i.e., local structures on the time-frequency plane, we adopt 2-dimensional local descriptors. A more critical issue arises in how to logically 'summarize' those local details into a compact feature vector for scene classification. Although 'time-averaged statistics' are suggested by the psychological investigation, directly computing the time average of local acoustic features is not a sound approach, since the arithmetic mean is vulnerable to extreme values, which are expected to be generated by interfering sounds that are irrelevant to the scene category. To tackle this problem, we develop a time-frame weighting approach to enhance sound textures as well as to suppress scene-irrelevant events. Subsequently, robust acoustic features for scene classification can be efficiently characterized. The proposed method was validated using the Rouen dataset, which consists of 19 acoustic scene categories with 3029 real samples. Extensive results demonstrated the effectiveness of the proposed scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2018
36. Local descriptor margin projections (LDMP) for face recognition.
- Author
-
Yang, Zhangjing, Huang, Pu, Wan, Minghua, Zhang, Fanlong, Yang, Guowei, Qian, Chengshan, Zhang, Jincheng, and Li, Zuoyong
- Abstract
Feature extraction is a key problem in face recognition systems. This paper tackles this problem by combining the strengths of an image descriptor with dimensionality reduction, and proposes a new efficient face recognition method, local descriptor margin projections (LDMP). Firstly, we propose a novel local descriptor for face image representation. At this step, an effective and simple metric named gray value accumulating distance (GAD) is proposed, and a novel local descriptor based on GAD is then presented to effectively capture the local structure information between a central pixel and its neighbors. Secondly, we propose a dimensionality reduction algorithm named maximum margin learning projections (MMLP), which can obtain low-dimensional and discriminative features. Finally, experimental results on the Yale, Extended Yale B, PIE, AR and LFW face databases show the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2018
37. Sparse projections matrix binary descriptors for face recognition.
- Author
-
Fan, Chunxiao, Tian, Lei, Ming, Yue, Hong, Xiaopeng, Zhao, Guoying, and Pietikäinen, Matti
- Subjects
- *
BINARY codes , *HUMAN facial recognition software , *HASHING , *COMPUTATIONAL biology , *ALGORITHMS - Abstract
In recent years, binary feature descriptors such as the local binary pattern (LBP) have achieved great success in the field of face recognition (FR). It is well known that high-dimensional feature representations can contain more discriminative information; it is therefore natural to construct high-dimensional binary features for the FR task. However, high-dimensional representations lead to high computational cost and overfitting, so an effective sparsity regularizer is necessary. In this paper, we introduce a sparsity constraint into the objective function of the general binary code learning framework, so that the problems of high computational cost and overfitting can be alleviated. There are three main requirements in our objective function: First, we require that the high-dimensional binary codes have minimal quantization loss compared with the centered original data. Second, we require that the projection matrices are sparse, so that the projection process does not consume a large amount of computational resources even when faced with high-dimensional original data. Third, for a mapping (hashing) function, bit-independence and bit-balance are two desirable properties for generating discriminative binary codes. We also show empirically that the high-dimensional binary codes gain more discriminative ability through a pooling process with an unsupervised clustering method. In this way, a discriminative and low-cost Sparse Projection Matrix Binary Descriptor (SPMBD) is learned in a data-driven way. Extensive experimental results on four public datasets show that our SPMBD descriptor outperforms other existing face recognition algorithms and demonstrate the effectiveness and robustness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2018
38. DHNet: working double hard to learn a convolutional neural network-based local descriptor.
- Author
-
Li, Dandan, Zeng, Dan, and Zhaob, Kai
- Subjects
- *
ARTIFICIAL neural networks , *COMPUTER vision , *IMAGE registration , *BIG data , *ALGORITHMS - Abstract
Designing effective local descriptors is crucial for many computer vision tasks such as image matching and patch verification. We propose a convolutional neural network (CNN)-based local descriptor named DHNet with a considerate sampling strategy and a dedicated loss function. By considerate sampling, both the closest nonmatching sample and the farther matching sample can be obtained for effectively training a discriminative model. In addition, an improved triplet loss is designed by adding a constraint that limits the absolute distance for the closest nonmatching pair. Based on hard samples and the constraint, our lightweight CNN can quickly generate local descriptors with enhanced intraclass compactness and interclass separation. Experimental results show that our method significantly outperforms the state-of-the-art methods in terms of strong discrimination ability, as evidenced by a considerable performance improvement on several benchmarks. ©2018 SPIE and IS&T [DOI: 10.1117/1.JEI.27.4.043008] [ABSTRACT FROM AUTHOR]
- Published
- 2018
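The loss described in the abstract of entry 38, a triplet loss computed on hard samples with an extra constraint on the absolute distance of the closest nonmatching pair, can be sketched roughly as follows. The abstract does not give the exact formula, so the additive hinge term and the parameters `margin` and `min_negative_distance` are assumptions made for illustration; this is not DHNet's published loss, and no network is trained here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_triplet_loss(anchor, positive, negatives, margin=1.0,
                      min_negative_distance=None):
    """Triplet hinge loss using the closest (hardest) nonmatching sample.

    Assumption: the absolute-distance constraint is modelled as an extra
    hinge term that penalises a hardest negative closer than
    `min_negative_distance`; the paper's exact form is not in the abstract.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = min(euclidean(anchor, n) for n in negatives)  # hardest negative
    loss = max(0.0, margin + d_pos - d_neg)
    if min_negative_distance is not None:
        loss += max(0.0, min_negative_distance - d_neg)
    return loss


anchor, positive = [0.0, 0.0], [0.0, 1.0]
negatives = [[3.0, 4.0], [0.0, 2.0]]  # the second one is the closest
print(hard_triplet_loss(anchor, positive, negatives, margin=1.5))
print(hard_triplet_loss(anchor, positive, negatives, min_negative_distance=3.0))
```

In the second call the relative margin is already satisfied, yet the loss stays positive because the closest nonmatching descriptor is still nearer than the assumed absolute bound.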
39. Locally aggregated histogram-based descriptors.
- Author
-
Lu, Xiusheng, Yao, Hongxun, Sun, Xin, and Zhang, Yanhao
- Abstract
Histograms are commonly used in feature design. However, most existing histogram-based descriptors ignore the distribution of points within each bin. Motivated by VLAD, we introduce a local aggregation strategy into the design of hand-crafted features to address this issue, and put forward several locally aggregated histogram-based descriptors, including LA-HOG, LA-HOF and LA-MBH, based on HOG, HOF and MBH, respectively. In the binning process of the proposed descriptors, we accumulate the differences between the local information and the nearest bin centers, which describes the distribution of the local information in each bin. The proposed descriptors are applied to object and action recognition tasks, and the experiments demonstrate that they are complementary to the original descriptors. The comparison results show that their combinations outperform the original descriptors alone by about 2% on average in both tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
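The binning step described in the abstract of entry 39 (accumulating differences between local values and their nearest bin centers) can be sketched for the one-dimensional case as follows. This is an illustration under assumptions, not the authors' implementation: the function name `locally_aggregated_histogram` and the scalar inputs are hypothetical, and the real descriptors operate on gradient or flow information.

```python
def locally_aggregated_histogram(values, centers):
    """Return (counts, residuals) for scalar values and bin centers.

    counts[k]    is the ordinary histogram count of bin k.
    residuals[k] is the VLAD-style sum of (value - center) over the
                 values whose nearest center is centers[k].
    """
    counts = [0] * len(centers)
    residuals = [0.0] * len(centers)
    for v in values:
        # Index of the nearest bin center (first one wins on ties).
        k = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
        counts[k] += 1
        residuals[k] += v - centers[k]
    return counts, residuals
```

Two bins with the same count can still have different residual sums, which is the within-bin distribution information that a plain histogram discards.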
40. Multi-trend binary code descriptor: a novel local texture feature descriptor for image retrieval.
- Author
-
Yu, Laihang, Feng, Lin, Wang, Huibing, Li, Li, Liu, Yang, and Liu, Shenglan
- Abstract
With the development of image vision technology, local descriptors have attracted wide attention in the fields of image retrieval and classification. Even though a variety of methods based on local descriptors have achieved excellent performance, most of them cannot effectively represent the trend of pixel change, and they neglect the mutual occurrence of patterns. Therefore, how to construct local descriptors is of vital importance but challenging. In order to solve this problem, this paper proposes a multi-trend binary code descriptor (MTBCD). MTBCD mimics human visual perception to describe images by constructing a set of multi-trend descriptors which are encoded with binary codes. The method exploits the trend of pixel change in four symmetric directions to obtain the texture feature, and extracts the spatial correlation information using a co-occurrence matrix. These intermediate features are integrated into one histogram using a new fusion strategy. The proposed method not only captures the global color features, but also reflects the local texture information. Extensive experiments have demonstrated the excellent performance of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
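The abstract of entry 40 mentions extracting spatial correlation with a co-occurrence matrix. A minimal sketch of a generic co-occurrence count over a grid of integer codes is given below. How MTBCD actually builds and fuses its matrices is not stated in the abstract, so the function name `cooccurrence`, the single `offset` parameter, and the default horizontal offset are assumptions made for illustration.

```python
def cooccurrence(codes, levels, offset=(0, 1)):
    """Count how often code a at (r, c) co-occurs with code b at (r+dr, c+dc).

    codes  : 2-D list of integer codes in range(levels)
    offset : (dr, dc) displacement between the paired positions
    """
    dr, dc = offset
    rows, cols = len(codes), len(codes[0])
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            # Skip pairs whose second position falls outside the grid.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[codes[r][c]][codes[r2][c2]] += 1
    return matrix
```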
41. 3D traversability awareness for rough terrain mobile robots
- Author
-
Giulio Reina, Mauro Bellone, Luigi Spedicato, and Nicola Ivan Giannoccaro
- Published
- 2014
- Full Text
- View/download PDF
42. Monocular Depth Estimation from a Single Infrared Image
- Author
-
Daechan Han and Yukyung Choi
- Subjects
Computer Networks and Communications, Hardware and Architecture, Control and Systems Engineering, Computer Science::Computer Vision and Pattern Recognition, Signal Processing, ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION, Electrical and Electronic Engineering, monocular depth estimation, self-supervised learning, infrared image, night vision, pseudo-label, local descriptor
- Abstract
Thermal infrared imaging is attracting much attention due to its strength against illuminance variation. However, because of the spectral difference between thermal infrared images and RGB images, the existing research on self-supervised monocular depth estimation has performance limitations. Therefore, in this study, we propose a novel Self-Guided Framework using a Pseudolabel predicted from RGB images. Our proposed framework, which solves the problem of appearance matching loss in the existing framework, transfers the high accuracy of Pseudolabel to the thermal depth estimation network by comparing low- and high-level pixels. Furthermore, we propose Patch-NetVLAD Loss, which strengthens local detail and global context information in the depth map from thermal infrared imaging by comparing locally global patch-level descriptors. Finally, we introduce an Image Matching Loss to estimate a more accurate depth map in a thermal depth network by enhancing the performance of the Pseudolabel. We demonstrate that the proposed framework shows significant performance improvement even when applied to various depth networks in the KAIST Multispectral Dataset.
- Published
- 2022
- Full Text
- View/download PDF
43. Low-dimension local descriptor for dense stereo matching and scene reconstruction.
- Author
-
Chao Zhang, Bindang Xue, and Fugen Zhou
- Subjects
*DEPTH maps (Digital image processing), *IMAGE reconstruction, *DISCRETE cosine transforms
- Abstract
The DAISY descriptor has been widely used in dense stereo matching and scene reconstruction. However, DAISY is vulnerable to similar feature regions because its construction method sequentially arranges the descriptions of the center and neighbor sample points and does not consider their relationships. To enhance the discriminative power of the local descriptor and accelerate dense matching and scene reconstruction, we propose a low-dimensional local descriptor. The proposed descriptor is inspired by the local binary pattern (LBP). In image space, LBP describes local detail texture by computing the difference between center and neighbor sample points. We introduce this advantage in scale space to extend the DAISY descriptor and make it more efficient for densely matching similar features in different regions. On this basis, a two-dimensional discrete cosine transform (2D-DCT) is utilized to reduce the dimensions of the descriptor as well as the computation cost of dense matching and scene reconstruction. Through a variety of experiments on benchmark laser-scanned ground truth scenes as well as indoor and outdoor scenes, we show that the proposed descriptor yields more accurate depth maps and more complete reconstruction results than other common descriptors, and that it is much faster than DAISY. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
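The abstract of entry 43 builds on the local binary pattern, which compares a center sample with its neighbors. A minimal sketch of the basic image-space LBP code for a single 3x3 patch follows. It does not reproduce the paper's scale-space extension of DAISY or its 2D-DCT reduction; the function name `lbp_code`, the clockwise bit order, and the `>=` comparison are conventions assumed here for illustration.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 patch (list of three rows).

    A neighbor contributes a 1 bit when its value is >= the center value.
    Bits are assigned clockwise, starting at the top-left neighbor.
    """
    center = patch[1][1]
    neighbors = [(0, 0), (0, 1), (0, 2), (1, 2),
                 (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(neighbors):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code
```

Because only the sign of each difference is kept, the code is unchanged by any monotonic brightness change applied to the whole patch.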
44. Augment color discrimination of local descriptors.
- Author
-
Liu, Yan, Wang, Hua, and Qin, YeYang
- Subjects
*OPTICAL resolution, *OPTICAL properties, *COLOR, *GRAY, *IMAGE processing
- Abstract
Scale invariance and color invariance are two critical characteristics of robust local descriptors. For scale invariance, most local descriptors are constructed on a scale pyramid only, without considering the influence of resolution. For color invariance, most local descriptors are constructed by joining color features to gray features. This enhances the color discrimination of local descriptors but doubles the descriptor length. In this paper, we analyze the influence of resolution on descriptor robustness and give a way to construct the base layer of the scale pyramid by resolution analysis. We then propose a color transformation method to extract robust, color-discriminating local descriptors. Finally, we validate the proposed method on image sets with different colors and resolutions. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
45. Learning spherical hashing based binary codes for face recognition.
- Author
-
Tian, Lei, Fan, Chunxiao, and Ming, Yue
- Subjects
HUMAN facial recognition software, BINARY codes, HASHING, CANONICAL correlation (Statistics), PIXELS, DESCRIPTOR systems
- Abstract
Local feature descriptors have been widely used in the computer vision field due to their excellent discriminative power and strong robustness. However, the forms of such local descriptors are predefined in a hand-crafted way, which requires strong domain knowledge to design them. In this paper, we propose a simple and efficient Spherical Hashing based Binary Codes (SHBC) feature learning method to learn a discriminative and robust binary face descriptor in a data-driven way. First, we extract patch-wise pixel difference vectors (PDVs) by computing the difference between the center patch and its neighboring patches. Then, inspired by the fact that hyperspheres provide much stronger power in defining a tighter closed region in the original data space than hyperplanes, we learn a hypersphere-based hashing function to map these PDVs into low-dimensional binary codes by an efficient iterative optimization process, which achieves both balanced bit partitioning of data points and independence between hashing functions. In order to better capture the semantic information of the dataset, our SHBC can also be used with a supervised data embedding method, such as Canonical Correlation Analysis (CCA), yielding Supervised-SHBC (S-SHBC). Lastly, we cluster and pool these learned binary codes into a histogram-based feature that describes the co-occurrence of binary codes, and we take this histogram-based feature as the final feature representation for each face image. We investigate the performance of our SHBC and S-SHBC on the FERET, CAS-PEAL-R1, LFW and PaSC databases. Extensive experimental results demonstrate that our SHBC descriptor outperforms other state-of-the-art face descriptors. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
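The hypersphere-based hashing mentioned in the abstract of entry 45 can be illustrated with a small sketch of the encoding step only. Each bit is assumed to record whether the input vector lies inside one hypersphere. The iterative optimization that learns the pivots and radii is not shown, and the function name `spherical_hash` and the hand-picked spheres in the test are hypothetical.

```python
import math


def spherical_hash(vector, spheres):
    """Encode a vector as one bit per hypersphere.

    spheres : list of (pivot, radius) pairs; in the paper these would be
              learned, here they are simply supplied by the caller.
    Bit k is 1 when the vector lies inside or on sphere k, else 0.
    """
    bits = []
    for pivot, radius in spheres:
        dist = math.sqrt(sum((v - p) ** 2 for v, p in zip(vector, pivot)))
        bits.append(1 if dist <= radius else 0)
    return bits
```

Unlike a hyperplane, which splits the space into two unbounded halves, each sphere marks off a closed region, which is the property the abstract appeals to.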
46. An application of chain code-based local descriptor and its extension to face recognition.
- Author
-
Karczmarek, Paweł, Kiersztyn, Adam, Pedrycz, Witold, and Dolecki, Michał
- Subjects
*CHAIN codes (Data compression), *HUMAN facial recognition software, *BIG data, *ALGORITHMS, *DESCRIPTOR systems
- Abstract
Local descriptors are a widely used feature extraction technique for obtaining information about both local and global properties of an object. Here, we discuss an application of the Chain Code-Based Local Descriptor to face recognition by focusing on various datasets and considering different variants of this description method. We augment the generic form of the descriptor by adding the possibility of grouping pixels into blocks, i.e., effectively describing larger neighborhoods. The results of experiments show the efficiency of the approach. We demonstrate that the obtained results are comparable to or even better than those delivered by other important algorithms in the class of methods based on the Bag-of-Visual-Words paradigm. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
47. L2SSP: Robust keypoint description using local second-order statistics with soft-pooling.
- Author
-
Song, Tiecheng, Meng, Fanman, Wu, Qingbo, Luo, Bing, Zhang, Tianqi, and Xu, Yongjun
- Subjects
*MATHEMATICAL symmetry, *HISTOGRAMS, *ROBUST control, *PIXELS, *IMAGE processing
- Abstract
In recent years, local image descriptors based on histograms of oriented gradients (e.g., SIFT, DAISY) and intensity orders (e.g., LBP, LIOP) have been popular for keypoint matching. However, by relying on dominant orientation estimation or on interactions among several pixels to achieve rotation invariance, these descriptors tend to be error-prone or noise-sensitive. Moreover, they represent features as histograms, which are restricted to low-order statistics. In this paper, we propose to use local second-order statistics with soft-pooling (L2SSP) for robust keypoint description. To this end, a feature set is first designed by modeling each pixel as local spatial-frequency patterns and local extremum patterns. Such a feature set is rotationally invariant, highly discriminative and also robust to noise. Then, a soft spatial binning is introduced to encode the gradient information in a rotation-invariant way. Finally, the descriptor is constructed by concatenating all sub-descriptors, which are obtained by pooling local features within each spatial bin via second-order statistics (the covariance matrix). The proposed local descriptor, L2SSP, has been extensively evaluated on the Oxford dataset and some synthesized image pairs. Experimental results demonstrate the superior performance of L2SSP over state-of-the-art methods under a variety of image transformations. The source code of L2SSP is publicly available at https://pan.baidu.com/s/1pLIVwLH#path=/%252Fl2ssp . [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
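The abstract of entry 47 pools the local features in each spatial bin through a covariance matrix. The sketch below computes an unweighted sample covariance of the feature vectors that fall in one bin. The soft spatial weighting described in the abstract is omitted, and the function name `covariance_matrix` and the use of the `n - 1` normalization are assumptions.

```python
def covariance_matrix(features):
    """Sample covariance (n - 1 normalization) of a list of feature vectors.

    features : list of equal-length sequences, one per pixel in the bin;
               at least two vectors are required.
    """
    n = len(features)
    d = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    return [
        [
            sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in features) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
```

A histogram would only record how often each feature value occurs; the off-diagonal entries here additionally capture how pairs of feature channels vary together.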
48. LCO: A robust and efficient local descriptor for image matching.
- Author
-
Duo, Jingyun, Chen, Pengfeng, and Zhao, Long
- Subjects
*IMAGE registration, *DESCRIPTOR systems, *ROBUST statistics, *PIXELS, *IMAGE databases
- Abstract
In this paper, a discriminative yet very simple and efficient local descriptor, called local contrast and ordering (LCO), is presented for image matching. The LCO descriptor employs the sign and ordering of intensity differences to represent different local region characteristics. Specifically, for each pixel in a given interest region, the LCO descriptor calculates only the intensity difference between the current pixel and the interest point. By using an exact ordering technique, ordering ambiguity is eliminated, and all pixels can be accurately divided into 6 intervals that simulate the positive and negative differences of high, middle, and low frequency. Our experiments are performed on a standard image data set, and the experimental results indicate that the LCO descriptor achieves superior matching performance and higher computational efficiency than many popular local feature descriptors. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
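The six-interval partition described in the abstract of entry 48 can be sketched roughly as follows. The abstract does not specify how the intervals are bounded or how the exact ordering breaks ties, so everything concrete here is an assumption: differences are split by sign, ranked by magnitude with ties broken by pixel index, and each sign group is cut into three equal-sized rank groups. The function name `lco_intervals` is hypothetical.

```python
def lco_intervals(pixels, center):
    """Assign each pixel one of 6 labels from its difference to the center.

    Labels 0-2: non-negative differences, low / middle / high magnitude.
    Labels 3-5: negative differences, low / middle / high magnitude.
    """
    diffs = [p - center for p in pixels]
    labels = [None] * len(diffs)
    for positive, base in ((True, 0), (False, 3)):
        idx = [i for i, d in enumerate(diffs) if (d >= 0) == positive]
        # Assumed tie-break: equal magnitudes are ordered by pixel index,
        # so every pixel receives a unique rank.
        idx.sort(key=lambda i: (abs(diffs[i]), i))
        n = len(idx)
        for rank, i in enumerate(idx):
            labels[i] = base + rank * 3 // n
    return labels
```

A histogram of these labels over the interest region would then give a six-bin summary of sign and relative contrast.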
49. Distance based k-NN Classification of Gabor Jet Local Descriptors.
- Author
-
Lefkovits, Szidónia and Lefkovits, László
- Subjects
DESCRIPTOR systems, IMAGE recognition (Computer vision), ARTIFICIAL intelligence research, ROBOTICS, GAUSSIAN processes
- Abstract
In the domain of artificial intelligence, object detection has become a popular area over the past several years. The creation of an automatic detection system plays an important role in several domains of interest, such as bioinformatics, traffic supervision, access control, identification and authentication systems, and industries using intelligent robots. Creating such a detection system is a challenge for every researcher in this domain. The main difficulty comes from the extreme diversity with which objects appear: they vary widely in appearance, aspect, form, dimension, color, position, rotation angle, illumination, shadow, and occlusion. In this approach we analyze part-based object detection systems. These can generally be separated into three main phases: detection of interest points, the local descriptor, and the object model. This paper proposes a new local descriptor for the second phase and compares its detection performance under several classification algorithms. The developed patch descriptor is based on two-dimensional Gabor wavelets. Gabor filters are Gaussian-modulated sinusoidal waves that describe the neighborhood of a given image pixel in two-dimensional space. A filter is defined by 9 degrees of freedom, and each of these parameters can have an infinite definition domain. The goal is to reduce the infinite number of possible values and to determine the most adequate filters for a given object. The definition domain of the nine parameters is narrowed by theoretical considerations and by the dimensions of the image patch analyzed. From our experiments, we have derived a set of Gabor filter responses that characterize the region of interest in a given image. The goal of this approach is to find the most characteristic Gabor filters for the object of interest.
After defining several thousand such filters, we use a selection algorithm to determine the n most discriminative filters based on the training set of images and the total number of defined Gabor filter descriptors. Only the best n classifiers play a role in the classification of the image patch. This paper compares the k-NN decision to other learning methods, such as the Gentle Boost algorithm and SVM classification, which we have used in our previous works. The choice of Gabor wavelets for object detection is well founded because it has been physiologically demonstrated that the human visual cortex works similarly; in other words, it can be modeled by Gabor filter decomposition [5] and [6]. In our previous works we defined novel Gabor-filter-based patch descriptors for object detection. In this approach we classify them with different classification methods. We have obtained high classification performance with a computationally simple algorithm, the k-nearest neighbors (k-NN) method, which reduces the training process. Our contribution is finding the most adequate Gabor filter parameters for a given object or object part. After defining a novel patch descriptor based on these responses, we compare several classification methods in order to obtain the best possible detection performance. The main contribution of this paper is to apply a simple but efficient classification method, k-NN, in order to reduce the computations compared to previously used classification algorithms [1], [2], [3] and [4]. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
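The two ingredients named in the abstract of entry 49, a Gaussian-modulated sinusoid and a k-NN decision, can be sketched as below. The abstract speaks of 9 degrees of freedom per filter; the `gabor` function here uses the common five-parameter real-valued form (wavelength, orientation, sigma, aspect ratio, phase), which is a simplification assumed for illustration. The function names, the Euclidean distance, and the majority vote are likewise assumptions rather than the authors' setup.

```python
import math
from collections import Counter


def gabor(x, y, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor function evaluated at (x, y)."""
    # Rotate the coordinates by the filter orientation theta.
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    carrier = math.cos(2 * math.pi * xr / wavelength + psi)
    return envelope * carrier


def knn_classify(query, samples, k=3):
    """Majority vote among the k training samples closest to query.

    samples : list of (descriptor, label) pairs.
    """
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

k-NN needs no training beyond storing the labeled descriptors, which matches the abstract's point about reducing the training process; the cost moves to query time instead.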
50. Computerized Ultrasonic Imaging Inspection: From Shallow to Deep Learning
- Author
-
Jiaxing Ye, Shunya Ito, and Nobuyuki Toyama
- Subjects
nondestructive evaluation, ultrasonic imaging, computer vision, deep learning, local descriptor, convolutional neural networks, Chemical technology, TP1-1185
- Abstract
For many decades, ultrasonic imaging inspection has been adopted as a principal method to detect multiple defects, e.g., void and corrosion. However, the data interpretation relies on an inspector’s subjective judgment, thus making the results vulnerable to human error. Nowadays, advanced computer vision techniques reveal new perspectives on the high-level visual understanding of universal tasks. This research aims to develop an efficient automatic ultrasonic image analysis system for nondestructive testing (NDT) using the latest visual information processing technique. To this end, we first established an ultrasonic inspection image dataset containing 6849 ultrasonic scan images with full defect/no-defect annotations. Using the dataset, we performed a comprehensive experimental comparison of various computer vision techniques, including both conventional methods using hand-crafted visual features and the most recent convolutional neural networks (CNN) which generate multiple-layer stacking for representation learning. In the computer vision community, the two groups are referred to as shallow and deep learning, respectively. Experimental results make it clear that the deep learning-enabled system outperformed conventional (shallow) learning schemes by a large margin. We believe this benchmarking could be used as a reference for similar research dealing with automatic defect detection in ultrasonic imaging inspection.
- Published
- 2018
- Full Text
- View/download PDF