32 results on '"Xiaochun Cao"'
Search Results
2. Difference Residual Graph Neural Networks
- Author
-
Liang Yang, Weihang Peng, Wenmiao Zhou, Bingxin Niu, Junhua Gu, Chuan Wang, Yuanfang Guo, Dongxiao He, and Xiaochun Cao
- Published
- 2022
3. Confederated Learning: Going Beyond Centralization
- Author
-
Zitai Wang, Qianqian Xu, Ke Ma, Xiaochun Cao, and Qingming Huang
- Published
- 2022
4. A Unified Framework against Topology and Class Imbalance
- Author
-
Junyu Chen, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, and Qingming Huang
- Published
- 2022
5. Imitated Detectors
- Author
-
Siyuan Liang, Aishan Liu, Jiawei Liang, Longkang Li, Yang Bai, and Xiaochun Cao
- Published
- 2022
6. Recurrent Meta-Learning against Generalized Cold-start Problem in CTR Prediction
- Author
-
Junyu Chen, Qianqian Xu, Zhiyong Yang, Ke Ma, Xiaochun Cao, and Qingming Huang
- Published
- 2022
7. Graph Neural Networks Beyond Compromise Between Attribute and Topology
- Author
-
Liang Yang, Wenmiao Zhou, Weihang Peng, Bingxin Niu, Junhua Gu, Chuan Wang, Xiaochun Cao, and Dongxiao He
- Published
- 2022
8. Implicit Feedbacks are Not Always Favorable: Iterative Relabeled One-Class Collaborative Filtering against Noisy Interactions
- Author
-
Zitai Wang, Qingming Huang, Zhiyong Yang, Qianqian Xu, and Xiaochun Cao
- Subjects
Class (computer programming); Preference learning; Exploit; Computer science; business.industry; Recommender system; Machine learning; computer.software_genre; Task (project management); Noise; Negative feedback; Collaborative filtering; Artificial intelligence; business; computer
- Abstract
Due to privacy concerns, the Recommender System community increasingly favors the One-class Collaborative Filtering (OCCF) framework, which predicts user preferences based only on binary implicit feedback (e.g., click or not-click, rated or unrated). The major challenge in the OCCF problem stems from the inherent noise in implicit interactions. Previous approaches have taken into account the noise in unobserved interactions (i.e., not-click only means a missing value, rather than negative feedback). However, they generally ignore the noise in observed interactions (i.e., click does not necessarily represent positive feedback), which might induce performance degradation. To address this issue, we propose a novel iterative relabeling framework to jointly mitigate the noise in both observed and unobserved interactions. As the core of the framework, the iterative relabeling module exploits the self-training principle to dynamically generate pseudo labels for user preferences. The downstream module for a recommendation task is then trained with the refreshed labels, in which the noisy patterns are largely alleviated. Finally, extensive experiments on three real-world datasets demonstrate the effectiveness of our proposed methods.
- Published
- 2021
9. Deep Self-Supervised t-SNE for Multi-modal Subspace Clustering
- Author
-
Wei Xia, Xiaochun Cao, Qianqian Wang, Zhiqiang Tao, and Quanxue Gao
- Subjects
Structure (mathematical logic); Modalities; business.industry; Computer science; Pattern recognition; ComputingMethodologies_PATTERNRECOGNITION; Modal; Discriminative model; Feature (computer vision); Artificial intelligence; Layer (object-oriented design); business; Cluster analysis; Encoder
- Abstract
Existing multi-modal subspace clustering methods, aiming to exploit the correlation information between different modalities, have achieved promising preliminary results. However, these methods might be incapable of handling real problems with complex heterogeneous structures between different modalities, since the large heterogeneous structure makes it difficult to directly learn a discriminative shared self-representation for multi-modal clustering. To tackle this problem, in this paper, we propose a deep Self-supervised t-SNE method (StSNE) for multi-modal subspace clustering, which learns soft label features by multi-modal encoders and utilizes the common label feature to supervise soft label feature of each modal by adversarial training and reconstruction networks. Specifically, the proposed StSNE consists of four components: 1) multi-modal convolutional encoders; 2) a self-supervised t-SNE module; 3) a self-expressive layer; 4) multi-modal convolutional decoders. Multi-modal data are fed to encoders to obtain soft label features, for which the self-supervised t-SNE module is added to make full use of the label information among different modalities. Simultaneously, the latent representations given by encoders are constrained by a self-expressive layer to capture the hierarchical information of each modal, followed by decoders reconstructing the encoded features to preserve the structure of the original data. Experimental results on several public datasets demonstrate the superior clustering performance of the proposed method over state-of-the-art methods.
- Published
- 2021
10. Identity-Preserving Face Anonymization via Adaptively Facial Attributes Obfuscation
- Author
-
Xiaochun Cao, Lili Wang, Lutong Han, Hua Zhang, Ruoyu Chen, Jingzhi Li, and Bing Han
- Subjects
Focus (computing); Computer science; business.industry; Face (geometry); Obfuscation; Identity (object-oriented programming); Pattern recognition; Artificial intelligence; Visual appearance; business; Facial recognition system; Generator (mathematics); Variety (cybernetics)
- Abstract
With the popularity of computer vision technology in monitoring systems, there is increasing societal concern about intrusions on people's privacy, as the captured images and videos may contain identity-related information, e.g., people's faces. Existing methods for protecting such privacy focus on removing the identity-related information from faces. However, this weakens the utility of current monitoring systems. In this paper, we develop a face anonymization framework that can obfuscate visual appearance while preserving identity discriminability. The framework is composed of two parts: an identity-aware region discovery module and an identity-aware face confusion module. The former adaptively locates the identity-independent attributes on human faces, and the latter generates the privacy-preserving faces using original faces and discovered facial attributes. To optimize the face generator, we employ a multi-task loss function, which consists of discriminator loss, identity-preserving loss, and reconstruction loss functions. Our model can achieve a balance between recognition utility and appearance anonymization by modifying different numbers of facial attributes according to practical demands, and can provide a variety of results. Extensive experiments conducted on two public benchmarks, Celeb-A and VGG-Face2, demonstrate the effectiveness of our model under distinct face recognition scenarios.
- Published
- 2021
11. Quaternion-Based Knowledge Graph Network for Recommendation
- Author
-
Qianqian Xu, Xiaochun Cao, Zhaopeng Li, Yangbangyan Jiang, and Qingming Huang
- Subjects
Structure (mathematical logic); Hypercomplex number; Theoretical computer science; Computer science; 02 engineering and technology; Recommender system; 020204 information systems; Core (graph theory); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Limit (mathematics); Representation (mathematics); Quaternion; Semantic matching
- Abstract
Recently, to alleviate the data sparsity and cold-start problems, many research efforts have been devoted to the use of knowledge graphs (KGs) in recommender systems. Most existing KG-based models represent users and items using real-valued embeddings. However, compared with complex or hypercomplex numbers, these real-valued vectors have less representation capacity and no intrinsic asymmetrical properties, and thus may limit the modeling of interactions between entities and relations in a KG. In this paper, we propose the Quaternion-based Knowledge Graph Network (QKGN) for recommendation, which represents users and items with quaternion embeddings in hypercomplex space, so that the latent inter-dependencies between entities and relations can be captured effectively. At the core of our model, a semantic matching principle based on the Hamilton product is applied to learn expressive quaternion representations from the unified user-item KG. On top of this, those embeddings are attentively updated by a customized preference propagation mechanism that takes structure information into account. Finally, we apply the proposed QKGN to three real-world datasets of music, movies, and books, and experimental results show the validity of our method.
- Published
- 2020
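Note on entry 11: the abstract names the Hamilton product as the basis of its semantic matching principle. The sketch below shows only the standard Hamilton product of two quaternions in plain Python; it is not the QKGN model, and the function name `hamilton_product` and the tuple layout `(a, b, c, d)` for `a + bi + cj + dk` are assumptions made for illustration.

```python
from typing import Tuple

# Assumed layout (illustrative): (a, b, c, d) stands for a + b*i + c*j + d*k.
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Return the Hamilton product p * q of two quaternions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# The product is not commutative: i * j = k, while j * i = -k.
i_unit = (0, 1, 0, 0)
j_unit = (0, 0, 1, 0)
ij = hamilton_product(i_unit, j_unit)
ji = hamilton_product(j_unit, i_unit)
```

The order dependence (`ij` differs from `ji`) is the kind of built-in asymmetry that the abstract says real-valued embeddings lack.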
12. Graph Attention Topic Modeling Network
- Author
-
Yuanfang Guo, Di Jin, Chuan Wang, Xiaochun Cao, Fan Wu, Junhua Gu, and Liang Yang
- Subjects
Independent and identically distributed random variables; Topic model; Theoretical computer science; Word embedding; Computer science; Inference; 02 engineering and technology; Latent variable; 010501 environmental sciences; Overfitting; computer.software_genre; 01 natural sciences; Latent Dirichlet allocation; Dirichlet distribution; symbols.namesake; Stochastic block model; 0202 electrical engineering, electronic engineering, information engineering; 0105 earth and related environmental sciences; Probabilistic latent semantic analysis; Document classification; symbols; Graph (abstract data type); Topological graph theory; 020201 artificial intelligence & image processing; computer; Latent semantic indexing
- Abstract
Existing topic modeling approaches have several issues, including the overfitting issue of Probabilistic Latent Semantic Indexing (pLSI), the failure to capture the rich topical correlations among topics in Latent Dirichlet Allocation (LDA), and high inference complexity. In this paper, we provide a new method to overcome the overfitting issue of pLSI by using amortized inference with word embeddings as input, instead of the Dirichlet prior in LDA. For generative topic models, the large number of free latent variables is the root of overfitting. To reduce the number of parameters, amortized inference replaces the inference of latent variables with a function that has shared (amortized) learnable parameters. The number of shared parameters is fixed and independent of the scale of the corpus. To overcome the limitation that amortized inference applies only to independent and identically distributed (i.i.d.) data, a novel graph neural network, Graph Attention TOpic Network (GATON), is proposed to model the topic structure of non-i.i.d. documents according to the following two observations. First, pLSI can be interpreted as a stochastic block model (SBM) on a specific bipartite graph. Second, the graph attention network (GAT) can be explained as the semi-amortized inference of SBM, which relaxes the i.i.d. data assumption of vanilla amortized inference. GATON provides a novel scheme, i.e., a scheme based on the graph convolution operation, to integrate word similarity and word co-occurrence structure. Specifically, the bag-of-words document representation is modeled as a bipartite graph topology. Meanwhile, the word embedding, which captures word similarity, is modeled as the attribute of the word node, and the term frequency vector is adopted as the attribute of the document node. Based on the weighted (attention) graph convolution operation, the word co-occurrence structure and word similarity patterns are seamlessly integrated for topic identification. Extensive experiments demonstrate that the effectiveness of GATON on topic identification not only benefits document classification, but also significantly refines the input word embedding.
- Published
- 2020
13. Collaborative Preference Embedding against Sparse Labels
- Author
-
Qianqian Xu, Zhiyong Yang, Ke Ma, Shilong Bao, Qingming Huang, and Xiaochun Cao
- Subjects
business.industry; Computer science; 010102 general mathematics; 02 engineering and technology; Recommender system; Machine learning; computer.software_genre; 01 natural sciences; 0202 electrical engineering, electronic engineering, information engineering; Collaborative filtering; Embedding; 020201 artificial intelligence & image processing; The Internet; Artificial intelligence; 0101 mathematics; business; computer
- Abstract
Living in the era of the internet, we are now faced with a big bang of online information. As a consequence, we often find ourselves troubled by hundreds or thousands of options before making a decision. As a way to improve the quality of users' online experience, a Recommendation System aims to facilitate personalized online decision-making processes by predicting users' responses toward different options. However, the vast majority of the literature in the field focuses merely on datasets with a sufficient number of samples. Different from the traditional methods, we propose a novel method named Collaborative Preference Embedding (CPE), which directly deals with sparse and insufficient user preference information. Specifically, we represent the intrinsic pattern of users/items with a high-dimensional embedding space. On top of this embedding space, we design two schemes specifically against the limited generalization ability caused by sparse labels. On one hand, we construct a margin function which indicates the consistency between the embedding space and the true user preference. From the margin theory point of view, we then propose a generalization enhancement scheme for sparse and insufficient labels via optimizing the margin distribution. On the other hand, regarding the embedding as a code for a user/item, we then improve the generalization ability from the coding point of view. Specifically, we learn a compact embedding space by reducing the dependency across different dimensions of a code (embedding). Finally, extensive experiments on a number of real-world datasets demonstrate the superior generalization performance of the proposed algorithm.
- Published
- 2019
14. Duet Robust Deep Subspace Clustering
- Author
-
Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Yangbangyan Jiang, and Qingming Huang
- Subjects
Computer science; business.industry; Deep learning; 020206 networking & telecommunications; 02 engineering and technology; Machine learning; computer.software_genre; Regularization (mathematics); ComputingMethodologies_PATTERNRECOGNITION; Subspace clustering; Robustness (computer science); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; Cluster analysis; computer; Smoothing; Subspace topology
- Abstract
Subspace clustering has long been recognized as vulnerable to gross corruptions, which can easily mislead the estimation of the underlying subspace structure. Recently, deep extensions of traditional subspace clustering methods have shown their great power to boost clustering performance. However, deep learning methods are, in themselves, more prone to be affected by data corruptions. This motivates us to design specific robust extensions for deep subspace clustering methods. More precisely, we contribute a new robust deep framework called Duet Robust Deep Subspace Clustering (DRDSC). Our main idea is to explicitly model the corrupted patterns from both the data reconstruction perspective and the latent self-expression perspective with two regularization norms. Moreover, since the two norms involved are non-smooth, we implement a smoothing technique for these norms to facilitate the back-propagation of our proposed network. Experiments carried out on real-world vision tasks with different noise settings demonstrate the effectiveness of our proposed method.
- Published
- 2019
15. Adversarial Preference Learning with Pairwise Comparisons
- Author
-
Qianqian Xu, Yangbangyan Jiang, Xiaochun Cao, Zitai Wang, Ke Ma, and Qingming Huang
- Subjects
Preference learning; Computer science; business.industry; Human intelligence; 02 engineering and technology; 010501 environmental sciences; Recommender system; Machine learning; computer.software_genre; 01 natural sciences; Empirical research; Ranking; Discriminative model; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Leverage (statistics); Pairwise comparison; Artificial intelligence; business; computer; 0105 earth and related environmental sciences
- Abstract
When facing rich multimedia content and making a decision, users tend to be overwhelmed with redundant options. A recommendation system can improve the users' experience by predicting the possible preference of a given user. The vast majority of the literature adopts the collaborative framework, which relies on a static and fixed formulation of the rating score prediction function (in most cases an inner product function). However, such a static learning paradigm is not consistent with the dynamic nature of human intelligence. Motivated by this, we present a novel adversarial framework for collaborative ranking. On one hand, we leverage a deep generator to approximate an arbitrary continuous score function in terms of pairwise comparison. On the other hand, a discriminator provides personalized supervision signals with increasing difficulty. Different from the traditional static learning framework, our proposed approach enjoys a dynamic nature and unifies both the generative and the discriminative model for collaborative ranking. Comprehensive empirical studies on three real-world datasets show significant improvements of the adversarial framework over the state-of-the-art methods.
- Published
- 2019
16. When to Learn What
- Author
-
Yangbangyan Jiang, Qianqian Xu, Xiaochun Cao, Qingming Huang, and Zhiyong Yang
- Subjects
Computer science; business.industry; Deep learning; 02 engineering and technology; 010501 environmental sciences; Machine learning; computer.software_genre; 01 natural sciences; Linear subspace; Outlier; 0202 electrical engineering, electronic engineering, information engineering; Benchmark (computing); 020201 artificial intelligence & image processing; Noise (video); Artificial intelligence; Cluster analysis; Representation (mathematics); business; computer; 0105 earth and related environmental sciences
- Abstract
Subspace clustering aims at clustering data points drawn from a union of low-dimensional subspaces. Recently, deep neural networks have been introduced into this problem to improve both representation ability and precision for non-linear data. However, such models are sensitive to noise and outliers, since both difficult and easy samples are treated equally. In contrast, in the human cognitive process, individuals tend to follow a learning paradigm from easy to hard and from less to more. In other words, human beings always learn simple concepts first, then gradually absorb more complicated ones. Inspired by this learning scheme, in this paper we propose a robust deep subspace clustering framework based on the principle of the human cognitive process. Specifically, we measure the easiness of samples dynamically so that our proposed method can gradually utilize instances from easy to more complex ones in a robust way. Meanwhile, a promising solution is designed to update the weights and parameters using an alternative optimization strategy, followed by a theoretical analysis to demonstrate the rationality of the proposed method. Experimental results on three popular benchmark datasets demonstrate the validity of the proposed method.
- Published
- 2018
17. A Margin-based MLE for Crowdsourced Partial Ranking
- Author
-
Qianqian Xu, Zhiyong Yang, Xinwei Sun, Qingming Huang, Jiechao Xiong, Yuan Yao, and Xiaochun Cao
- Subjects
FOS: Computer and information sciences; Generalized linear model; Computer Science - Machine Learning; Computer science; business.industry; Probabilistic logic; Machine Learning (stat.ML); 02 engineering and technology; Crowdsourcing; computer.software_genre; Machine Learning (cs.LG); Multimedia (cs.MM); Ranking; Statistics - Machine Learning; Margin (machine learning); 020204 information systems; Convex optimization; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Pairwise comparison; Data mining; business; Preference (economics); computer; Computer Science - Multimedia
- Abstract
A preference order or ranking aggregated from pairwise comparison data is commonly understood as a strict total order. However, in real-world scenarios, some items are intrinsically ambiguous in comparisons, which may very well be an inherent uncertainty of the data. In this case, the conventional total order ranking cannot capture such uncertainty with mere global ranking or utility scores. In this paper, we are specifically interested in the recent surge in crowdsourcing applications to predict partial but more accurate (i.e., making fewer incorrect statements) orders rather than complete ones. To do so, we propose a novel framework to learn some probabilistic models of partial orders as a margin-based Maximum Likelihood Estimate (MLE) method. We prove that the induced MLE is a joint convex optimization problem with respect to all the parameters, including the global ranking scores and margin parameter. Moreover, three kinds of generalized linear models are studied, including the basic uniform model, the Bradley-Terry model, and the Thurstone-Mosteller model, equipped with some theoretical analysis on FDR and Power control for the proposed methods. The validity of these models is supported by experiments with both simulated and real-world datasets, which show that the proposed models exhibit improvements compared with traditional state-of-the-art algorithms. (9 pages; accepted by ACM Multimedia 2018 as a full paper.)
- Published
- 2018
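Note on entry 17: the abstract lists the Bradley-Terry model among the generalized linear models it studies. As a rough sketch, the textbook Bradley-Terry probability that item i is preferred over item j, with log-strength scores s_i and s_j, is the logistic function of s_i - s_j. The code below illustrates only that standard form; it does not include the paper's margin parameter or its partial-order extension, and the function names and the `(winner, loser)` data layout are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def bradley_terry_prob(score_i: float, score_j: float) -> float:
    """Standard Bradley-Terry probability that item i is preferred over item j."""
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def log_likelihood(scores: Dict[str, float],
                   comparisons: List[Tuple[str, str]]) -> float:
    """Log-likelihood of observed (winner, loser) pairs under the scores."""
    return sum(
        math.log(bradley_terry_prob(scores[winner], scores[loser]))
        for winner, loser in comparisons
    )


# Hypothetical scores and comparison data, for illustration only.
scores = {"a": 1.0, "b": 0.0}
observed = [("a", "b"), ("a", "b"), ("b", "a")]
ll = log_likelihood(scores, observed)
```

Maximizing `log_likelihood` over the scores gives the usual total-order ranking; the abstract's contribution concerns predicting partial orders instead, which this sketch does not attempt.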
18. Who to Ask
- Author
-
Yangbangyan Jiang, Xiaochun Cao, Qianqian Xu, and Qingming Huang
- Subjects
Human–computer interaction; Ask price; business.industry; Computer science; 0202 electrical engineering, electronic engineering, information engineering; 020207 software engineering; 020201 artificial intelligence & image processing; 02 engineering and technology; Clothing; business
- Abstract
Humankind has always been in pursuit of fashion. Nevertheless, people are often troubled by collocating clothes, e.g., tops, bottoms, shoes, and accessories, from numerous fashion items in their closets. Moreover, it may be expensive and inconvenient to employ a fashion stylist. In this paper, we present Stile, an end-to-end intelligent fashion consultant system, to generate stylish outfits for given items. Unlike previous systems, our framework considers the global compatibility of fashion items in the outfit and models the dependencies among items in a fixed order via a bidirectional LSTM. Therefore, it can guarantee that items in the same outfit should share a similar style and neither redundant nor missing items exist in the resulting outfit for essential categories. The demonstration shows that our proposed system provides people with a practical and convenient solution to find natural and proper fashion outfits.
- Published
- 2018
19. LEAF
- Author
-
Xiaochun Cao, Rui Wang, Hua Zhang, and Changqing Zhang
- Subjects
Computer science; business.industry; Feature selection; Pattern recognition; 02 engineering and technology; 010501 environmental sciences; Machine learning; computer.software_genre; 01 natural sciences; ComputingMethodologies_PATTERNRECOGNITION; Discriminative model; 0202 electrical engineering, electronic engineering, information engineering; Embedding; 020201 artificial intelligence & image processing; Artificial intelligence; business; Classifier (UML); computer; 0105 earth and related environmental sciences
- Abstract
To improve the discriminative power of attribute representations, in this paper we propose to extend the traditional attribute representations by embedding the latent high-order structure between attributes. Specifically, our aim is to construct the Latent Extended Attribute Features (LEAF) for visual classification. Since only weak labels exist for each attribute, we first propose a feature selection method to explore the common feature structures across categories. After that, the attribute classifiers are trained based on the selected features. Then, the category-specific graph is introduced, which is composed of single attributes and their co-occurring attribute pairs. This attribute graph is used as the initial representation of each image. Given our aim, we need to discover the discriminative latent structure between attributes and train robust category classifiers. To that end, we develop a joint learning objective function which is composed of the high-order representation mining term and the classifier training term. The mining term can both preserve category-specific information and discover the common structure between categories. Based on the discovered representation, robust visual classifiers can be trained by the classifier term. Finally, an alternating optimization method is designed to seek the optimal solution of our objective function. Experimental results on challenging datasets demonstrate the advantages of our proposed model over existing work.
- Published
- 2017
20. MatchDR
- Author
-
Rui Wang, Xiaochun Cao, Dong Liang, and Wei Zhang
- Subjects
business.industry; Computer science; Perspective (graphical); ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 020207 software engineering; 02 engineering and technology; Image (mathematics); Constraint (information theory); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Computer vision; Artificial intelligence; business; Spatial analysis; Algorithm; ComputingMethodologies_COMPUTERGRAPHICS
- Abstract
Image correspondence aims to establish the connections between coherent images, which can be quite challenging due to visual and geometric deformations. This paper proposes a robust image correspondence technique from the perspective of spatial regularity. Specifically, the visual deformation is addressed by introducing spatial information through enforcing the distance ratio constraint. At the same time, the geometric deformation is tolerated by adopting a smoothness term. Subsequently, image correspondence is formulated as a permutation problem, for which we propose a Gradient Guided Simulated Annealing method for robust optimization. Furthermore, our method is much more memory efficient: the storage complexity is reduced from O(n^4) to O(n^2). The experiments on several datasets indicate that our proposed formulation and optimization significantly improve on the baselines for both visually-similar and semantically-similar images, where both visual and geometric deformations are present.
- Published
- 2016
21. Beauty eMakeup
- Author
-
Si Liu, Xinyu Ou, Hefei Ling, and Xiaochun Cao
- Subjects
Computer science; business.industry; media_common.quotation_subject; 020207 software engineering; 02 engineering and technology; Transfer system; Cosmetics; Beauty; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Computer vision; Artificial intelligence; business; media_common
- Abstract
In this demo, we present a Beauty eMakeup System to automatically recommend the most suitable makeup for a female and synthesize the makeup on her face. Given a before-makeup face, her most suitable makeup is determined automatically. Then, both the before-makeup and the reference faces are fed into the proposed Deep Transfer Network to generate the after-makeup face. Our end-to-end makeup transfer network has several nice properties: (1) complete functions: foundation, lip gloss, and eye shadow transfer; (2) cosmetic specific: different cosmetics are transferred in different manners; (3) localized: different cosmetics are applied on different facial regions; (4) natural-looking results without obvious artifacts; (5) controllable makeup lightness: various results from light makeup to heavy makeup can be generated. Extensive experimental evaluations and analysis on testing images demonstrate the effectiveness of the proposed system.
- Published
- 2016
22. Deep People Counting in Extremely Dense Crowds
- Author
-
Hua Zhang, Chuan Wang, Liang Yang, Xiaochun Cao, and Si Liu
- Subjects
Crowds; Training set; Robustness (computer science); Computer science; business.industry; Crowd analysis; Computer vision; Artificial intelligence; business; Convolutional neural network; Crowd counting
- Abstract
People counting in extremely dense crowds is an important step for video surveillance and anomaly warning. The problem becomes especially challenging due to the lack of training samples, severe occlusions, cluttered scenes, and variation of perspective. Existing methods either resort to auxiliary human and face detectors or use the estimated density of crowds as a surrogate. Most of them rely on hand-crafted features, such as SIFT and HOG, and thus are prone to fail when density grows or training samples are scarce. In this paper we propose an end-to-end deep convolutional neural network (CNN) regression model for counting people in images of extremely dense crowds. Our method has the following characteristics. Firstly, it is a deep model built on CNN to automatically learn effective features for counting. Besides, to weaken the influence of background such as buildings and trees, we purposely enrich the training data with expanded negative samples whose ground truth count is set to zero. With these negative samples, the robustness can be enhanced. Extensive experimental results show that our method achieves performance superior to the state of the art in terms of the mean and variance of absolute difference.
- Published
- 2015
23. Multi-cue Augmented Face Clustering
- Author
-
Huazhu Fu, Changqing Zhang, Xiaochun Cao, Rui Wang, and Chengju Zhou
- Subjects
Facial expression; Fuzzy clustering; Computer science; business.industry; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Pattern recognition; Sparse approximation; Task (project management); ComputingMethodologies_PATTERNRECOGNITION; Face (geometry); Computer vision; Artificial intelligence; business; Cluster analysis; Face detection
- Abstract
Face clustering is an important but challenging task, since facial images always have huge variation due to changes in facial expressions, head poses, partial occlusions, etc. Moreover, face clustering is an unsupervised problem, which makes it more difficult to reach an accurate result. Fortunately, there are some cues that can be used to improve clustering performance. In this paper, two types of cues are employed. The first is pairwise constraints: must-link and cannot-link constraints, which can be extracted from the temporal and spatial knowledge of the data. The other is that each face is associated with a series of attributes (e.g., gender), which can contribute to discrimination among faces. To take advantage of the above cues, we propose a new algorithm, Multi-cue Augmented Face Clustering (McAFC), which effectively incorporates the cues via a graph-guided sparse subspace clustering technique. Specifically, facial images from the same individual are encouraged to be connected, while faces from different persons are restrained from being connected. Experiments on three face datasets from real-world videos show the improvements of our algorithm over the state-of-the-art methods.
- Published
- 2015
24. Co-Saliency Detection via Base Reconstruction
- Author
-
Xiaochun Cao, Zhiqiang Tao, Huazhu Fu, and Yupeng Cheng
- Subjects
business.industry; Computer science; Feature vector; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Pattern recognition; Base (topology); Image (mathematics); Set (abstract data type); Benchmark (computing); Relevance (information retrieval); Computer vision; Artificial intelligence; business; ComputingMethodologies_COMPUTERGRAPHICS
- Abstract
Co-saliency detection aims at detecting common saliency in a series of images, which is useful for a variety of multimedia applications. In this paper, we cast co-saliency detection as a reconstruction problem: the foreground can be well reconstructed using the reconstruction bases, which are extracted from each image and have similar appearances in the feature space. We first obtain a candidate set by measuring the saliency prior of each image. Relevance information among the multiple images is then used to remove inaccurate reconstruction bases. Finally, with the updated reconstruction bases, we rebuild the images and treat the reconstruction error as a negatively correlated value in the co-saliency measurement. Satisfactory quantitative and qualitative experimental results on two benchmark datasets demonstrate the efficiency and effectiveness of our method.
- Published
- 2014
25. Augmented Image Retrieval using Multi-order Object Layout with Attributes
- Author
-
Xiaojie Guo, Jinhui Tang, Xingxing Wei, Xiaochun Cao, and Yahong Han
- Subjects
Matching (statistics) ,Spatial relation ,Information retrieval ,Computer science ,Region of interest ,Concept map ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Object (computer science) ,Image retrieval ,Semantic gap ,Image (mathematics) - Abstract
In image retrieval, users' search intention is usually specified by textual queries, exemplar images, concept maps, or even sketches, which can express the search intention only partially. These query strategies cannot indicate the Regions Of Interest (ROIs) or represent the spatial or semantic correlations among the ROIs, which results in the so-called semantic gap between users' search intention and images' low-level visual content. In this paper, we propose a novel image search method that allows users to indicate any number of ROIs within the query and to use various semantic concepts and spatial relations to search images. Specifically, we first propose a structured descriptor to jointly represent the categories, attributes, and spatial relations among objects. Then, based on the defined descriptor, our method ranks the images in the database according to the matching scores w.r.t. the category, attribute, and spatial relations. We conduct experiments on the aPascal and aYahoo datasets, and the experimental results show the advantage of the proposed method over the state of the art.
- Published
- 2014
26. Beautifying Fisheye Images using Orientation and Shape Cues
- Author
-
Xiaojie Guo, Xiaobo Wang, Xiaochun Cao, and Zhanjie Song
- Subjects
Range (mathematics) ,Seam carving ,Computer science ,Orientation (computer vision) ,business.industry ,Computer graphics (images) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Computer vision ,Artificial intelligence ,business ,Rotation (mathematics) ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Fisheye images, due to their wide range of vision, are becoming more and more popular in daily life. However, fisheye images usually suffer from misalignment that reduces their visual appeal. In this paper, we develop a computational method for enhancing the aesthetics of such images by exploiting orientation and shape cues. More specifically, the orientation cue is based on the observation that cameras are often oriented when taking photos so that their up-vectors are parallel to vertical linear structures in the scene. The shape cue requires that the circular shape be preserved after the fisheye image is repositioned. By employing these two rules as our basic aesthetic guidelines, our method can correct the rotation angle between the camera coordinate system and the world coordinate system to make the virtual camera oriented, and complete the missing part. Experimental results on a number of challenging indoor and outdoor fisheye images show the effectiveness of our approach and demonstrate the superior aesthetics of the proposed method compared to the state of the art.
- Published
- 2014
27. Human Skin Detection via Semantic Constraint
- Author
-
Binbin Ma, Changqing Zhang, Jiangjian Xiao, Xiaochun Cao, Ri Qu, and Jingjing Chen
- Subjects
integumentary system ,Pixel ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Human skin ,Human body ,Constraint (information theory) ,Improved performance ,Preprocessor ,Computer vision ,Artificial intelligence ,business ,ComputingMethodologies_COMPUTERGRAPHICS - Abstract
Human skin detection is a fundamental preprocessing step in many content-based image processing applications and has received much attention. In this paper, we propose a novel Semantically Constrained Skin Detection (SCSD) method. Unlike traditional skin detection algorithms, our method introduces a semantic constraint, based on the dependence between skin pixels and human body parts (skin pixels should overlap with body parts), to refine the detection. By employing the semantic constraint, environmental skin-like pixels are removed effectively, while the true skin pixels constituting body parts are retained. Experimental results on two public datasets demonstrate the significantly improved performance of our method over the state-of-the-art algorithms.
- Published
- 2014
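The overlap rule stated in the abstract above (skin pixels should overlap with body parts) can be illustrated with a minimal sketch. This is not the SCSD method itself: it assumes two hypothetical binary masks of equal size, `skin_like` (output of any colour-based skin detector) and `body_parts` (output of any body-part detector), and simply keeps the pixels set in both.

```python
def apply_semantic_constraint(skin_like, body_parts):
    """Keep a skin-like pixel only where it overlaps a detected body part.

    Both arguments are hypothetical 2D masks (lists of rows) of equal shape;
    any truthy value marks a positive pixel.
    """
    if len(skin_like) != len(body_parts):
        raise ValueError("masks must have the same number of rows")
    refined = []
    for skin_row, body_row in zip(skin_like, body_parts):
        if len(skin_row) != len(body_row):
            raise ValueError("masks must have the same row length")
        # A pixel survives only if both detectors agree on it.
        refined.append([bool(s) and bool(b) for s, b in zip(skin_row, body_row)])
    return refined


# Toy 2x3 example: the skin-like pixels at (0, 1) and (1, 2) lie outside
# any body part, so the constraint removes them.
skin_like = [[1, 1, 0],
             [0, 1, 1]]
body_parts = [[1, 0, 0],
              [0, 1, 0]]
refined = apply_semantic_constraint(skin_like, body_parts)
```

A real system would likely use soft scores rather than a hard intersection; the hard AND is chosen here only to keep the sketch short.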
28. Depth Enhanced Saliency Detection Method
- Author
-
Huazhu Fu, Xiaochun Cao, Xingxing Wei, Jiangjian Xiao, and Yupeng Cheng
- Subjects
Computer science ,Machine vision ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Salient objects ,Image (mathematics) ,Kadir–Brady saliency detector ,Salient ,3d image ,Depth map ,Computer vision ,3d perception ,Artificial intelligence ,business - Abstract
The human vision system understands the environment through 3D perception. However, most existing saliency detection algorithms detect the salient foreground based on 2D image information. In this paper, we propose a saliency detection method that uses additional depth information. In our method, saliency cues are designed to follow the laws of visually salient stimuli in both color and depth spaces. The 'center bias' is also extended to a 'spatial' bias to represent the natural advantage of 3D images. In addition, we build a dataset to test our method, and the experiments demonstrate that depth information is useful for extracting the salient object from complex scenes.
- Published
- 2014
29. The (un)supervised detection of overlapping communities as well as hubs and outliers via (bayesian) NMF
- Author
-
Di Jin, Yixin Cao, Dongxiao He, Xiao Wang, and Xiaochun Cao
- Subjects
Normalization (statistics) ,Computer science ,business.industry ,Bayesian probability ,Outlier ,Pattern recognition ,Artificial intelligence ,business ,Machine learning ,computer.software_genre ,computer ,Non-negative matrix factorization - Abstract
The detection of communities in various networks has been considered by many researchers. Moreover, it is preferable for a community detection method to detect hubs and outliers as well. This becomes even more interesting and challenging under the unsupervised assumption, that is, when we do not assume prior knowledge of the number K of communities. In this poster, we define a novel model to identify overlapping communities as well as hubs and outliers. When K is given, we propose a normalized symmetric nonnegative matrix factorization algorithm to learn the parameters of the model. Otherwise, we introduce a Bayesian symmetric nonnegative matrix factorization to learn the parameters of the model while determining K. Our experiments indicate its superior performance on various networks.
- Published
- 2014
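For readers unfamiliar with symmetric nonnegative matrix factorization, the sketch below shows the basic idea on a toy network: approximate an adjacency matrix A by H Hᵀ with H ≥ 0, so that row i of H gives node i's (possibly overlapping) membership weights. It uses the widely used damped multiplicative update for plain symmetric NMF, not the normalized or Bayesian variants proposed in the poster above; the function names, the damping factor 0.5, and the toy graph are all assumptions for illustration.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def frobenius_error(A, H):
    """Squared Frobenius norm of A - H H^T."""
    R = matmul(H, transpose(H))
    return sum((a - r) ** 2 for row_a, row_r in zip(A, R) for a, r in zip(row_a, row_r))


def symmetric_nmf(A, k, iterations=200, seed=0, eps=1e-9):
    """Plain symmetric NMF via the damped multiplicative update
    H <- H * (0.5 + 0.5 * (A H) / (H H^T H)), elementwise."""
    rng = random.Random(seed)
    n = len(A)
    H = [[rng.random() for _ in range(k)] for _ in range(n)]
    for _ in range(iterations):
        AH = matmul(A, H)
        HHtH = matmul(H, matmul(transpose(H), H))
        # eps guards against division by zero; it is not part of the update rule.
        H = [[H[i][j] * (0.5 + 0.5 * AH[i][j] / (HHtH[i][j] + eps))
              for j in range(k)] for i in range(n)]
    return H


# Toy network: two fully connected groups of three nodes (self-loops included).
A = [[1, 1, 1, 0, 0, 0],
     [1, 1, 1, 0, 0, 0],
     [1, 1, 1, 0, 0, 0],
     [0, 0, 0, 1, 1, 1],
     [0, 0, 0, 1, 1, 1],
     [0, 0, 0, 1, 1, 1]]
H = symmetric_nmf(A, k=2)
```

Here K is fixed at 2 by hand; determining K automatically, as the Bayesian variant in the abstract does, is outside the scope of this sketch.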
30. Object coding on the semantic graph for scene classification
- Author
-
Xiaochun Cao, Jingjing Chen, Yahong Han, and Qi Tian
- Subjects
Graph database ,Computer science ,business.industry ,Pattern recognition ,computer.software_genre ,Semantics ,Graph ,Semantic similarity ,Semantic computing ,Graph (abstract data type) ,Semantic memory ,Artificial intelligence ,business ,computer ,MathematicsofComputing_DISCRETEMATHEMATICS ,Coding (social sciences) - Abstract
In scene classification, a scene can be considered as a set of object cliques. Objects inside each clique have semantic correlations with each other, while two objects from different cliques are relatively independent. To utilize these correlations for better recognition performance, we propose a new method, Object Coding on the Semantic Graph, to address the scene classification problem. We first exploit prior knowledge by collecting statistics on a large number of labeled images and calculating the dependency degree between objects. Then, a graph is built to model the semantic correlations between objects. This semantic graph captures semantics by treating the objects as vertices and the object affinities as the edge weights. With this semantic knowledge encoded in the semantic graph, object coding is conducted to automatically select a set of object cliques that have strong semantic correlations to represent a specific scene. The experimental results show that Object Coding on the Semantic Graph can improve classification accuracy.
- Published
- 2013
31. Visual saliency detection based on photographic composition
- Author
-
Handong Zhao, Xiaochun Cao, Yahong Han, and Jingjing Chen
- Subjects
Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,Image processing ,GrabCut ,Kadir–Brady saliency detector ,Salient ,Prior probability ,Benchmark (computing) ,Segmentation ,Computer vision ,Artificial intelligence ,business ,Spatial analysis - Abstract
Visual saliency detection and segmentation are widely used in many image processing and computer vision applications. However, existing saliency detection methods have not fully taken the spatial information of salient regions into account. Inspired by basic photographic composition rules, we present a novel saliency detection method that uses knowledge of photographic composition as priors to improve the saliency detection results. Moreover, an online parameter selection method is proposed for use with GrabCut to obtain the saliency segmentation result. We test our method on the 1000 benchmark test images and the MSRA dataset. Extensive experimental results show the applicability and effectiveness of our method.
- Published
- 2013
32. Video object segmentation with shortest path
- Author
-
Handong Zhao, Xiaochun Cao, and Bao Zhang
- Subjects
Computer science ,business.industry ,Segmentation-based object categorization ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Scale-space segmentation ,Pattern recognition ,Image segmentation ,Object (computer science) ,Video tracking ,Shortest path problem ,Computer vision ,Segmentation ,Artificial intelligence ,business ,Dijkstra's algorithm - Abstract
Unsupervised video object segmentation aims to automatically segment the foreground object in a video without any prior knowledge. This paper proposes an object-level method to segment the foreground object, whereas existing methods are normally based on low-level information. We first find all the object-like regions. Then, based on the correspondence map between successive frames, the video segmentation problem is converted into a graph model problem. Rather than adopting TRW-S, which might result in a locally optimal solution, a shortest path algorithm is used to obtain a globally optimal solution. Compared with the state-of-the-art object-level method, our method not only guarantees the continuity of the segmentation result but also works well even under heavy disturbance from fast-moving objects in the background. The experimental results on two open datasets (SegTrack and Berkeley Motion Segmentation Dataset) and on video sequences we captured ourselves demonstrate the effectiveness of our method.
- Published
- 2012
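The shortest-path formulation in the abstract above can be sketched as follows. The paper's actual graph construction and cost terms are not given in this listing, so everything here is an assumption: `unary[t][i]` is a hypothetical non-negative cost of choosing candidate region i in frame t, and `pairwise(t, i, j)` is a hypothetical non-negative cost of linking region i in frame t to region j in frame t+1. Dijkstra's algorithm over this layered graph then selects one region per frame with globally minimal total cost.

```python
import heapq


def shortest_region_path(unary, pairwise):
    """Pick one candidate region per frame with minimal total cost.

    unary: list of per-frame lists of non-negative region costs (hypothetical).
    pairwise: function (t, i, j) -> non-negative linking cost (hypothetical).
    Returns (total_cost, [chosen region index for each frame]).
    """
    if not unary or any(not frame for frame in unary):
        raise ValueError("every frame needs at least one candidate region")
    last = len(unary) - 1
    # Heap entries: (distance so far, frame index, region index, path so far).
    heap = [(cost, 0, i, (i,)) for i, cost in enumerate(unary[0])]
    heapq.heapify(heap)
    settled = set()
    while heap:
        dist, t, i, path = heapq.heappop(heap)
        if (t, i) in settled:
            continue
        settled.add((t, i))
        if t == last:
            # With non-negative costs, the first last-frame node popped is optimal.
            return dist, list(path)
        for j, cost_j in enumerate(unary[t + 1]):
            if (t + 1, j) not in settled:
                step = pairwise(t, i, j) + cost_j
                heapq.heappush(heap, (dist + step, t + 1, j, path + (j,)))
    raise ValueError("no path found")


# Toy example: three frames, two candidate regions each; switching between
# region indices in consecutive frames costs 1.0, staying costs 0.0.
unary = [[1.0, 5.0],
         [4.0, 1.0],
         [1.0, 3.0]]
total, path = shortest_region_path(unary, lambda t, i, j: 0.0 if i == j else 1.0)
# total == 5.0 and path == [0, 1, 0]
```

Because the graph is layered by frame, dynamic programming would give the same answer; Dijkstra is used here only to mirror the "shortest path" wording of the abstract.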