434 results for "Yan, Shuicheng"
Search Results
2. Position-Guided Text Prompt for Vision-Language Pre-Training
- Author
- Wang, Jinpeng, primary, Zhou, Pan, additional, Shou, Mike Zheng, additional, and Yan, Shuicheng, additional
- Published
- 2023
3. Exploring Incompatible Knowledge Transfer in Few-shot Image Generation
- Author
- Zhao, Yunqing, primary, Du, Chao, additional, Abdollahzadeh, Milad, additional, Pang, Tianyu, additional, Lin, Min, additional, Yan, Shuicheng, additional, and Cheung, Ngai-Man, additional
- Published
- 2023
4. Towards Feature Distribution Alignment and Diversity Enhancement for Data-Free Quantization
- Author
- Gao, Yangcheng, primary, Zhang, Zhao, additional, Hong, Richang, additional, Zhang, Haijun, additional, Fan, Jicong, additional, and Yan, Shuicheng, additional
- Published
- 2022
5. MetaFormer is Actually What You Need for Vision
- Author
- Yu, Weihao, primary, Luo, Mi, additional, Zhou, Pan, additional, Si, Chenyang, additional, Zhou, Yichen, additional, Wang, Xinchao, additional, Feng, Jiashi, additional, and Yan, Shuicheng, additional
- Published
- 2022
6. Deep Color Consistent Network for Low-Light Image Enhancement
- Author
- Zhang, Zhao, primary, Zheng, Huan, additional, Hong, Richang, additional, Xu, Mingliang, additional, Yan, Shuicheng, additional, and Wang, Meng, additional
- Published
- 2022
7. Triplet Deep Subspace Clustering via Self-Supervised Data Augmentation
- Author
- Zhang, Zhao, primary, Li, Xianzhen, additional, Zhang, Haijun, additional, Yang, Yi, additional, Yan, Shuicheng, additional, and Wang, Meng, additional
- Published
- 2021
8. PnP-DETR: Towards Efficient Visual Analysis with Transformers
- Author
- Wang, Tao, primary, Yuan, Li, additional, Chen, Yunpeng, additional, Feng, Jiashi, additional, and Yan, Shuicheng, additional
- Published
- 2021
9. Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
- Author
- Yuan, Li, primary, Chen, Yunpeng, additional, Wang, Tao, additional, Yu, Weihao, additional, Shi, Yujun, additional, Jiang, Zihang, additional, Tay, Francis E. H., additional, Feng, Jiashi, additional, and Yan, Shuicheng, additional
- Published
- 2021
10. PSGAN++: Robust Detail-Preserving Makeup Transfer and Removal.
- Author
- Liu, Si, Jiang, Wentao, Gao, Chen, He, Ran, Feng, Jiashi, Li, Bo, and Yan, Shuicheng
- Subjects
- GENERATIVE adversarial networks, REFERENCE sources
- Abstract
In this paper, we address the makeup transfer and removal tasks simultaneously, which aim to transfer the makeup from a reference image to a source image and to remove the makeup from a with-makeup image, respectively. Existing methods have achieved considerable progress in constrained scenarios, but it is still very challenging for them to transfer makeup between images with large pose and expression differences, or to handle makeup details like blush on the cheeks or highlight on the nose. In addition, they are hardly able to control the degree of makeup during transfer or to transfer a specified part of the input face. These defects limit the application of previous makeup transfer methods in real-world scenarios. In this work, we propose a Pose and expression robust Spatial-aware GAN (abbreviated as PSGAN++). PSGAN++ is capable of performing both detail-preserving makeup transfer and effective makeup removal. For makeup transfer, PSGAN++ uses a Makeup Distill Network (MDNet) to extract makeup information, which is embedded into spatial-aware makeup matrices. We also devise an Attentive Makeup Morphing (AMM) module that specifies how the makeup in the source image is morphed from the reference image, and a makeup detail loss to supervise the model within the selected makeup detail area. For makeup removal, PSGAN++ applies an Identity Distill Network (IDNet) to embed the identity information from with-makeup images into identity matrices. Finally, the obtained makeup/identity matrices are fed to a Style Transfer Network (STNet) that edits the feature maps to achieve makeup transfer or removal. To evaluate the effectiveness of PSGAN++, we collect a Makeup Transfer In the Wild (MT-Wild) dataset that contains images with diverse poses and expressions, and a Makeup Transfer High-Resolution (MT-HR) dataset that contains high-resolution images.
Experiments demonstrate that PSGAN++ not only achieves state-of-the-art results with fine makeup details even in cases of large pose/expression differences but also can perform partial or degree-controllable makeup transfer. Both the code and the newly collected datasets will be released at https://github.com/wtjiang98/PSGAN. [ABSTRACT FROM AUTHOR]
- Published
- 2022
11. Human-Centric Relation Segmentation: Dataset and Solution.
- Author
- Liu, Si, Wang, Zitian, Gao, Yulu, Ren, Lejian, Liao, Yue, Ren, Guanghui, Li, Bo, and Yan, Shuicheng
- Subjects
- INTERPERSONAL relations, IMAGE segmentation
- Abstract
Vision and language understanding techniques have achieved remarkable progress, but currently it is still difficult to handle problems involving very fine-grained details well. For example, when a robot is told to “bring me the book in the girl’s left hand”, most existing methods would fail if the girl holds one book in each hand. In this work, we introduce a new task named human-centric relation segmentation (HRS), as a fine-grained case of HOI-det. HRS aims to predict the relations between the human and surrounding entities and to identify the relation-correlated human parts, which are represented as pixel-level masks. For the above exemplar case, our HRS task produces results in the form of the relation triplet ⟨girl [left hand], hold, book⟩ and extracts segmentation masks of the book, with which the robot can easily accomplish the grabbing task. Correspondingly, we collect a new Person In Context (PIC) dataset for this new task, which contains 17,122 high-resolution images with densely annotated entity segmentation and relations, including 141 object categories, 23 relation categories, and 25 semantic human parts. We also propose a Simultaneous Matching and Segmentation (SMS) framework as a solution to the HRS task. It contains three parallel branches for entity segmentation, subject-object matching, and human parsing, respectively. Specifically, the entity segmentation branch obtains entity masks by dynamically generated conditional convolutions; the subject-object matching branch detects the existence of any relations, links the corresponding subjects and objects by displacement estimation, and classifies the interacted human parts; and the human parsing branch generates the pixel-wise human part labels. Outputs of the three branches are fused to produce the final HRS results. Extensive experiments on the PIC and V-COCO datasets show that the proposed SMS method outperforms baselines at a 36 FPS inference speed.
Notably, SMS outperforms the best-performing baseline m-KERN with only 17.6 percent of its time cost. The dataset and code will be released at http://picdataset.com/challenge/index/. [ABSTRACT FROM AUTHOR]
- Published
- 2022
12. DRNet: Double Recalibration Network for Few-Shot Semantic Segmentation.
- Author
- Gao, Guangyu, Fang, Zhiyuan, Han, Cen, Wei, Yunchao, Liu, Chi Harold, and Yan, Shuicheng
- Subjects
- IMAGE color analysis
- Abstract
Few-shot segmentation aims at learning to segment query images guided by only a few annotated images from the support set. Previous methods rely on mining the feature embedding similarity across the query and the support images to achieve successful segmentation. However, these models tend to perform badly in cases where the query instances have a large variance from the support ones. To enhance model robustness against such intra-class variance, we propose a Double Recalibration Network (DRNet) with two recalibration modules, i.e., the Self-adapted Recalibration (SR) module and the Cross-attended Recalibration (CR) module. In particular, beyond learning robust feature embedding for pixel-wise comparison between support and query as in conventional methods, the DRNet further exploits semantic-aware knowledge embedded in the query image to help segment itself, which we call ‘self-adapted recalibration’. More specifically, DRNet first employs guidance from the support set to roughly predict an incomplete but correct initial object region for the query image, and then reversely uses the feature embedding extracted from the incomplete object region to segment the query image. Also, we devise a CR module to refine the feature representation of the query image by propagating the underlying knowledge embedded in the support image’s foreground to the query. Instead of foreground global pooling, we refine the response at each pixel in the query feature map by attending to all foreground pixels in the support feature map and taking the weighted average by their similarity; meanwhile, feature maps of the query image are also added back to the weighted feature maps as a residual connection. With these two recalibration modules, our DRNet can effectively address the intra-class variance under the few-shot setting and mine more accurate target regions for query images. We conduct extensive experiments on the popular benchmarks PASCAL-$5^{i}$ and COCO-$20^{i}$.
The DRNet with the best configuration achieves mIoU of 63.6% and 64.9% on PASCAL-$5^{i}$, and 44.7% and 49.6% on COCO-$20^{i}$, for the 1-shot and 5-shot settings respectively, significantly outperforming the state-of-the-art methods without any bells and whistles. Code is available at: https://github.com/fangzy97/drnet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
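The CR module described in the DRNet abstract above refines each query pixel by attending to all support foreground pixels, averaging them by similarity, and adding the result back as a residual. A minimal NumPy sketch of that mechanism (illustrative only; the function name, cosine similarity, and softmax temperature are assumptions, not the authors' implementation):

```python
import numpy as np

def cross_attended_recalibration(query_feat, support_fg, tau=1.0):
    """Illustrative sketch of attention-based recalibration: every query
    pixel attends to all support foreground pixels and receives their
    similarity-weighted average, added back as a residual.

    query_feat : (C, H, W) query feature map
    support_fg : (C, N) features of N support foreground pixels
    """
    C, H, W = query_feat.shape
    q = query_feat.reshape(C, H * W)                  # (C, HW)
    # cosine similarity between each query pixel and each support pixel
    qn = q / (np.linalg.norm(q, axis=0, keepdims=True) + 1e-8)
    sn = support_fg / (np.linalg.norm(support_fg, axis=0, keepdims=True) + 1e-8)
    sim = qn.T @ sn                                   # (HW, N)
    attn = np.exp(sim / tau)
    attn /= attn.sum(axis=1, keepdims=True)           # softmax over support pixels
    recal = support_fg @ attn.T                       # (C, HW) weighted averages
    out = q + recal                                   # residual connection
    return out.reshape(C, H, W)
```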
13. Fine-Grained Human-Centric Tracklet Segmentation with Single Frame Supervision.
- Author
- Liu, Si, Ren, Guanghui, Sun, Yao, Wang, Jinqiao, Wang, Changhu, Li, Bo, and Yan, Shuicheng
- Subjects
- OPTICAL flow, FACE, LEG, IMAGE segmentation, SUPERVISION
- Abstract
In this paper, we target the Fine-grAined human-Centric Tracklet Segmentation (FACTS) problem, where 12 human parts, e.g., face, pants, and left leg, are segmented. To reduce the heavy and tedious labeling effort, FACTS requires only one labeled frame per video during training. The small size of human parts and the labeling scarcity make FACTS very challenging. Considering that adjacent frames of a video are continuous and that humans usually do not change clothes in a short time, we explicitly consider the pixel-level and frame-level context in the proposed Temporal Context segmentation Network (TCNet). On the one hand, optical flow is calculated online to propagate the pixel-level segmentation results to neighboring frames. On the other hand, frame-level classification likelihood vectors are also propagated to nearby frames. By fully exploiting the pixel-level and frame-level context, TCNet indirectly uses the large amount of unlabeled frames during training and produces smooth segmentation results during inference. Experimental results on four video datasets show the superiority of TCNet over the state-of-the-art methods. The newly annotated datasets can be downloaded via http://liusi-group.com/projects/FACTS for further study. [ABSTRACT FROM AUTHOR]
- Published
- 2022
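The pixel-level context propagation in the TCNet abstract above, where optical flow carries segmentation results to neighboring frames, can be sketched as backward warping of a label map (an assumed nearest-neighbor simplification; TCNet's actual propagation is learned end-to-end):

```python
import numpy as np

def propagate_labels(seg, flow):
    """Warp a segmentation map to a neighboring frame with backward
    optical flow and nearest-neighbor sampling.

    seg  : (H, W) integer part labels of the labeled frame
    flow : (H, W, 2) flow from the target frame back to the labeled frame
    """
    H, W = seg.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # follow the flow back to the source pixel, clamped to the image
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    return seg[src_y, src_x]
```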
14. Towards Age-Invariant Face Recognition.
- Author
- Zhao, Jian, Yan, Shuicheng, and Feng, Jiashi
- Subjects
- FEATURE extraction, HUMAN facial recognition software, GENERATIVE adversarial networks
- Abstract
Despite the remarkable progress in face recognition related technologies, reliably recognizing faces across ages remains a big challenge. The appearance of a human face changes substantially over time, resulting in significant intra-class variations. As opposed to current techniques for age-invariant face recognition, which either directly extract age-invariant features for recognition, or first synthesize a face that matches target age before feature extraction, we argue that it is more desirable to perform both tasks jointly so that they can leverage each other. To this end, we propose a deep Age-Invariant Model (AIM) for face recognition in the wild with three distinct novelties. First, AIM presents a novel unified deep architecture jointly performing cross-age face synthesis and recognition in a mutual boosting way. Second, AIM achieves continuous face rejuvenation/aging with remarkable photorealistic and identity-preserving properties, avoiding the requirement of paired data and the true age of testing samples. Third, effective and novel training strategies are developed for end-to-end learning of the whole deep architecture, which generates powerful age-invariant face representations explicitly disentangled from the age variation. Moreover, we construct a new large-scale Cross-Age Face Recognition (CAFR) benchmark dataset to facilitate existing efforts and push the frontiers of age-invariant face recognition research. Extensive experiments on both our CAFR dataset and several other cross-age datasets (MORPH, CACD, and FG-NET) demonstrate the superiority of the proposed AIM model over the state-of-the-arts. Benchmarking our model on the popular unconstrained face recognition datasets YTF and IJB-C additionally verifies its promising generalization ability in recognizing faces in the wild. [ABSTRACT FROM AUTHOR]
- Published
- 2022
15. AdversarialNAS: Adversarial Neural Architecture Search for GANs
- Author
- Gao, Chen, primary, Chen, Yunpeng, additional, Liu, Si, additional, Tan, Zhenxiong, additional, and Yan, Shuicheng, additional
- Published
- 2020
16. PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
- Author
- Jiang, Wentao, primary, Liu, Si, additional, Gao, Chen, additional, Cao, Jie, additional, He, Ran, additional, Feng, Jiashi, additional, and Yan, Shuicheng, additional
- Published
- 2020
17. Very Long Natural Scenery Image Prediction by Outpainting
- Author
- Yang, Zongxin, primary, Dong, Jian, additional, Liu, Ping, additional, Yang, Yi, additional, and Yan, Shuicheng, additional
- Published
- 2019
18. Single-Stage Multi-Person Pose Machines
- Author
- Nie, Xuecheng, primary, Feng, Jiashi, additional, Zhang, Jianfeng, additional, and Yan, Shuicheng, additional
- Published
- 2019
19. Task Relation Networks
- Author
- Li, Jianshu, Feng, Jiashi, Sim, Terence, Roy, Sujoy, Zhou, Pan, Chen, Yunpeng, Zhao, Jian, and Yan, Shuicheng
- Subjects
- Boosting (machine learning), Exploit, Computer science, Deep learning, Multi-task learning, Mutual information, Covariance, Task analysis, Leverage (statistics), Artificial intelligence
- Abstract
Multi-task learning is popular in machine learning and computer vision. In multi-task learning, properly modeling task relations is important for boosting the performance of the jointly learned tasks. Task covariance modeling has been successfully used to model the relations of tasks, but is limited to homogeneous multi-task learning. In this paper, we propose a feature-based task relation modeling approach suitable for both homogeneous and heterogeneous multi-task learning. First, we propose a new metric to quantify the relations between tasks. Based on this quantitative metric, we then develop the task relation layer, which can be combined with any deep learning architecture to form task relation networks that fully exploit the relations of different tasks in an online fashion. Benefiting from the task relation layer, the task relation networks can better leverage the mutual information in the data. We demonstrate through extensive experiments on computer vision tasks that the proposed task relation networks are effective in improving performance in both homogeneous and heterogeneous multi-task learning settings.
- Published
- 2019
20. DerainCycleGAN: Rain Attentive CycleGAN for Single Image Deraining and Rainmaking.
- Author
- Wei, Yanyan, Zhang, Zhao, Wang, Yang, Xu, Mingliang, Yang, Yi, Yan, Shuicheng, and Wang, Meng
- Subjects
- RAIN-making, RAINFALL, GENERATIVE adversarial networks
- Abstract
Single Image Deraining (SID) is a relatively new and still challenging topic in emerging vision applications, and most recently proposed deraining methods are supervised, depending on ground truth (i.e., paired data). In practice, however, it is common to encounter unpaired images in real deraining tasks. In such cases, removing rain streaks in an unsupervised way is challenging due to the lack of constraints between images, which often leads to low-quality restoration results. In this paper, we therefore explore the unsupervised SID problem using unpaired data, and propose a new unsupervised framework termed DerainCycleGAN for single-image rain removal and generation, which can fully utilize the constrained transfer learning ability and circulatory structures of CycleGAN. In addition, we design an unsupervised rain attentive detector (UARD) to enhance rain information detection by paying attention to both rainy and rain-free images. Besides, we contribute a new way of synthesizing rain streak information that differs from previous ones. Specifically, since the generated rain streaks have diverse shapes and directions, existing deraining methods trained on rainy images generated this way perform much better on real rainy images. Extensive experimental results on synthetic and real datasets show that our DerainCycleGAN is superior to current unsupervised and semi-supervised methods, and is also highly competitive with fully-supervised ones. [ABSTRACT FROM AUTHOR]
- Published
- 2021
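DerainCycleGAN builds on CycleGAN's circulatory structure, whose core constraint is cycle consistency: an image mapped to the other domain and back should reconstruct itself. A hedged sketch with hypothetical generator callables `derain` and `add_rain` (not the paper's networks), using an L1 penalty:

```python
import numpy as np

def cycle_consistency_loss(x_rainy, x_clean, derain, add_rain):
    """Cycle-consistency idea underlying CycleGAN-style deraining:
    rainy -> clean -> rainy and clean -> rainy -> clean should both
    reconstruct the input; mismatch is measured with mean absolute error.
    """
    rec_rainy = add_rain(derain(x_rainy))   # rainy -> clean -> rainy
    rec_clean = derain(add_rain(x_clean))   # clean -> rainy -> clean
    return (np.mean(np.abs(rec_rainy - x_rainy))
            + np.mean(np.abs(rec_clean - x_clean)))
```

With identity generators the loss is exactly zero, which is the fixed point the cycle constraint pushes the two generators toward.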
21. Heterogeneous Domain Adaptation via Covariance Structured Feature Translators.
- Author
- Ren, Chuan-Xian, Feng, Jiashi, Dai, Dao-Qing, and Yan, Shuicheng
- Abstract
Domain adaptation (DA) and transfer learning with statistical property description is very important in image analysis and data classification. This article studies the domain adaptive feature representation problem for the heterogeneous data, of which both the feature dimensions and the sample distributions across domains are so different that their features cannot be matched directly. To transfer the discriminant information efficiently from the source domain to the target domain, and then enhance the classification performance for the target data, we first introduce two projection matrices specified for different domains to transform the heterogeneous features into a shared space. We then propose a joint kernel regression model to learn the regression variable, which is called feature translator in this article. The novelty focuses on the exploration of optimal experimental design (OED) to deal with the heterogeneous and nonlinear DA by seeking the covariance structured feature translators (CSFTs). An approximate and efficient method is proposed to compute the optimal data projections. Comprehensive experiments are conducted to validate the effectiveness and efficacy of the proposed model. The results show the state-of-the-art performance of our method in heterogeneous DA. [ABSTRACT FROM AUTHOR]
- Published
- 2021
22. Flexible Auto-Weighted Local-Coordinate Concept Factorization: A Robust Framework for Unsupervised Clustering.
- Author
- Zhang, Zhao, Zhang, Yan, Li, Sheng, Liu, Guangcan, Zeng, Dan, Yan, Shuicheng, and Wang, Meng
- Subjects
- FACTORIZATION, DATA scrubbing, DATA recovery, CONCEPTS, MATRIX decomposition
- Abstract
Concept Factorization (CF) and its variants may produce inaccurate representation and clustering results due to the sensitivity to noise, hard constraint on the reconstruction error, and pre-obtained approximate similarities. To improve the representation ability, a novel unsupervised Robust Flexible Auto-weighted Local-coordinate Concept Factorization (RFA-LCF) framework is proposed for clustering high-dimensional data. Specifically, RFA-LCF integrates the robust flexible CF by clean data space recovery, robust sparse local-coordinate coding, and adaptive weighting into a unified model. RFA-LCF improves the representations by enhancing the robustness of CF to noise and errors, providing a flexible constraint on the reconstruction error and optimizing the locality jointly. For robust learning, RFA-LCF clearly learns a sparse projection to recover the underlying clean data space, and then the flexible CF is performed in the projected feature space. RFA-LCF also uses a L2,1-norm based flexible residue to encode the mismatch between the recovered data and its reconstruction, and uses the robust sparse local-coordinate coding to represent data using a few nearby basis concepts. For auto-weighting, RFA-LCF jointly preserves the manifold structures in the basis concept space and new coordinate space in an adaptive manner by minimizing the reconstruction errors on clean data, anchor points and coordinates. By updating the local-coordinate preserving data, basis concepts and new coordinates alternately, the representation abilities can be potentially improved. Extensive results on public databases show that RFA-LCF delivers the state-of-the-art clustering results compared with other related methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
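The L2,1-norm based flexible residue used by RFA-LCF above (and by the dictionary learning methods later in this list) is the sum of the row-wise L2 norms of an error matrix; minimizing it drives entire rows of the residual to zero, which is what makes it robust to sample-wise noise. A one-function sketch of the norm itself:

```python
import numpy as np

def l21_norm(E):
    """L2,1 norm: sum of the L2 norms of the rows of E.
    Unlike the Frobenius norm, it encourages whole rows of the
    residual to vanish rather than spreading error over all entries."""
    return np.linalg.norm(E, axis=1).sum()
```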
23. Robust Unsupervised Flexible Auto-weighted Local-coordinate Concept Factorization for Image Clustering
- Author
- Zhang, Zhao, primary, Zhang, Yan, additional, Li, Sheng, additional, Liu, Guangcan, additional, Wang, Meng, additional, and Yan, Shuicheng, additional
- Published
- 2019
24. Faster First-Order Methods for Stochastic Non-Convex Optimization on Riemannian Manifolds.
- Author
- Zhou, Pan, Yuan, Xiao-Tong, Yan, Shuicheng, and Feng, Jiashi
- Subjects
- RIEMANNIAN manifolds, PRINCIPAL components analysis, LOW-rank matrices, LEARNING problems, PROCESS optimization, MACHINE learning
- Abstract
First-order non-convex Riemannian optimization algorithms have gained recent popularity in structured machine learning problems including principal component analysis and low-rank matrix completion. The current paper presents an efficient Riemannian Stochastic Path Integrated Differential EstimatoR (R-SPIDER) algorithm to solve the finite-sum and online Riemannian non-convex minimization problems. At the core of R-SPIDER is a recursive semi-stochastic gradient estimator that can accurately estimate the Riemannian gradient under not only exponential mapping and parallel transport, but also general retraction and vector transport operations. Compared with prior Riemannian algorithms, such a recursive gradient estimation mechanism endows R-SPIDER with lower computational cost in first-order oracle complexity. Specifically, for finite-sum problems with $n$ components, R-SPIDER is proved to converge to an $\epsilon$-approximate stationary point within $\mathcal{O}\big(\min\big(n+\frac{\sqrt{n}}{\epsilon^{2}},\frac{1}{\epsilon^{3}}\big)\big)$ stochastic gradient evaluations, beating the best-known complexity $\mathcal{O}\big(n+\frac{1}{\epsilon^{4}}\big)$; for online optimization, R-SPIDER is shown to converge with $\mathcal{O}\big(\frac{1}{\epsilon^{3}}\big)$ complexity which is, to the best of our knowledge, the first non-asymptotic result for online Riemannian optimization. For the special case of gradient-dominated functions, we further develop a variant of R-SPIDER with an improved linear rate of convergence. Extensive experimental results demonstrate the advantage of the proposed algorithms over the state-of-the-art Riemannian non-convex optimization methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
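The recursive semi-stochastic gradient estimator at the core of R-SPIDER can be illustrated in its Euclidean simplification (retraction and vector transport, which the paper handles on the manifold, are deliberately omitted here, and the function names are assumptions):

```python
import numpy as np

def spider_estimator(grad_fn, xs, v0):
    """Euclidean simplification of the recursive SPIDER estimator:
        v_t = grad_B(x_t) - grad_B(x_{t-1}) + v_{t-1},
    where grad_fn evaluates a (mini-batch) stochastic gradient.
    xs is the iterate sequence [x_0, x_1, ...] and v0 estimates the
    gradient at x_0; the recursion tracks the gradient along the path.
    """
    v = v0
    for x_prev, x_curr in zip(xs, xs[1:]):
        v = grad_fn(x_curr) - grad_fn(x_prev) + v
    return v
```

When `grad_fn` is the exact gradient and `v0` is exact at `x_0`, the recursion telescopes to the exact gradient at the last iterate; with mini-batch gradients it trades a small bias for far fewer gradient evaluations, which is where the improved oracle complexity comes from.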
25. Learning Target-Domain-Specific Classifier for Partial Domain Adaptation.
- Author
- Ren, Chuan-Xian, Ge, Pengfei, Yang, Peiyi, and Yan, Shuicheng
- Subjects
- KNOWLEDGE transfer, POCKET computers, FEATURE extraction, TASK analysis, ACTIVE learning
- Abstract
Unsupervised domain adaptation (UDA) aims at reducing the distribution discrepancy when transferring knowledge from a labeled source domain to an unlabeled target domain. Previous UDA methods assume that the source and target domains share an identical label space, which is unrealistic in practice since the label information of the target domain is agnostic. This article focuses on a more realistic UDA scenario, i.e., partial domain adaptation (PDA), where the target label space is subsumed in the source label space. In the PDA scenario, the source outliers that are absent in the target domain may be wrongly matched to the target domain (technically named negative transfer), leading to performance degradation of UDA methods. This article proposes a novel target-domain-specific classifier learning-based domain adaptation (TSCDA) method. TSCDA presents a soft-weighed maximum mean discrepancy criterion to partially align feature distributions and alleviate negative transfer. Also, it learns a target-specific classifier for the target domain with pseudo-labels and multiple auxiliary classifiers to further address the classifier shift. A module named peers-assisted learning is used to minimize the prediction difference between multiple target-specific classifiers, which makes the classifiers more discriminant for the target domain. Extensive experiments conducted on three PDA benchmark data sets show that TSCDA outperforms other state-of-the-art methods by a large margin, e.g., by 4% and 5.6% on average on Office-31 and Office-Home, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2021
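The soft-weighed maximum mean discrepancy criterion in the TSCDA abstract above can be illustrated by a weighted MMD in which source samples are down-weighted so that outlier source classes contribute less to the discrepancy. This is an assumption-laden sketch with a Gaussian kernel, not TSCDA's exact criterion:

```python
import numpy as np

def weighted_mmd(src, tgt, src_weights, gamma=1.0):
    """Weighted squared MMD between source rows `src` (n, d) and target
    rows `tgt` (m, d). Source samples carry weights `src_weights` (n,),
    normalized to sum to one; the kernel is k(a,b) = exp(-gamma||a-b||^2).
    """
    w = src_weights / src_weights.sum()

    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    m = len(tgt)
    return (w @ k(src, src) @ w                       # source-source term
            - 2.0 * (w @ k(src, tgt)).sum() / m       # cross term
            + k(tgt, tgt).sum() / m ** 2)             # target-target term
```

When the two samples coincide and the weights are uniform, the three terms cancel and the discrepancy is zero, matching the intuition that aligned distributions incur no penalty.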
26. Joint Subspace Recovery and Enhanced Locality Driven Robust Flexible Discriminative Dictionary Learning.
- Author
- Zhang, Zhao, Ren, Jiahuan, Jiang, Weiming, Zhang, Zheng, Hong, Richang, Yan, Shuicheng, and Wang, Meng
- Subjects
- LAPLACIAN matrices, DATA scrubbing, SPARSE matrices
- Abstract
We propose a joint subspace recovery and enhanced locality-based robust flexible label consistent dictionary learning method called Robust Flexible Discriminative Dictionary Learning (RFDDL). The RFDDL mainly improves the data representation and classification abilities by enhancing the robust property to sparse errors and encoding the locality, reconstruction error, and label consistency more accurately. First, for the robustness to noise and sparse errors in data and atoms, the RFDDL aims at recovering the underlying clean data and clean atom subspaces jointly, and then performs DL and encodes the locality in the recovered subspaces. Second, to enable the data sampled from a nonlinear manifold to be handled potentially and obtain the accurate reconstruction by avoiding the overfitting, the RFDDL minimizes the reconstruction error in a flexible manner. Third, to encode the label consistency accurately, the RFDDL involves a discriminative flexible sparse code error to encourage the coefficients to be soft. Fourth, to encode the locality well, the RFDDL defines the Laplacian matrix over recovered atoms, includes label information of atoms in terms of intra-class compactness and inter-class separation, and associates with group sparse codes and classifier to obtain the accurate discriminative locality-constrained coefficients and classifier. The extensive results on public databases show the effectiveness of our RFDDL. [ABSTRACT FROM AUTHOR]
- Published
- 2020
27. ORDNet: Capturing Omni-Range Dependencies for Scene Parsing.
- Author
- Huang, Shaofei, Liu, Si, Hui, Tianrui, Han, Jizhong, Li, Bo, Feng, Jiashi, and Yan, Shuicheng
- Subjects
- FEATURE extraction, TASK analysis
- Abstract
Learning to capture dependencies between spatial positions is essential to many visual tasks, especially dense labeling problems like scene parsing. Existing methods can effectively capture long-range dependencies with the self-attention mechanism, and short-range ones by local convolution. However, a large gap remains between long-range and short-range dependencies, which largely reduces the models’ flexibility in application to the diverse spatial scales and relationships in complicated natural scene images. To fill this gap, we develop a Middle-Range (MR) branch to capture middle-range dependencies by restricting self-attention to local patches. We also observe that spatial regions which have large correlations with others can be emphasized to exploit long-range dependencies more accurately, and thus propose a Reweighed Long-Range (RLR) branch. Based on the proposed MR and RLR branches, we build an Omni-Range Dependencies Network (ORDNet) which can effectively capture short-, middle- and long-range dependencies. Our ORDNet is able to extract more comprehensive context information and adapt well to the complex spatial variance in scene images. Extensive experiments show that our proposed ORDNet outperforms previous state-of-the-art methods on three scene parsing benchmarks, including PASCAL Context, COCO Stuff and ADE20K, demonstrating the superiority of capturing omni-range dependencies in deep models for the scene parsing task. [ABSTRACT FROM AUTHOR]
- Published
- 2020
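The MR branch idea from the ORDNet abstract above, restricting self-attention to local patches, can be sketched as follows (an assumed simplification with non-overlapping windows and plain dot-product attention; not the paper's exact branch):

```python
import numpy as np

def patchwise_self_attention(feat, patch=4):
    """Self-attention computed only within non-overlapping local patches,
    so each position aggregates context from a middle-range window
    rather than the whole feature map.

    feat : (C, H, W) feature map, H and W divisible by `patch`
    """
    C, H, W = feat.shape
    out = np.empty_like(feat)
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            win = feat[:, y:y + patch, x:x + patch].reshape(C, -1)  # (C, P)
            scores = win.T @ win                                     # (P, P)
            scores -= scores.max(axis=1, keepdims=True)              # stable softmax
            attn = np.exp(scores)
            attn /= attn.sum(axis=1, keepdims=True)
            out[:, y:y + patch, x:x + patch] = (win @ attn.T).reshape(C, patch, patch)
    return out
```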
28. Learning Hybrid Representation by Robust Dictionary Learning in Factorized Compressed Space.
- Author
- Ren, Jiahuan, Zhang, Zhao, Li, Sheng, Wang, Yang, Liu, Guangcan, Yan, Shuicheng, and Wang, Meng
- Subjects
- BLENDED learning, IMAGE reconstruction algorithms, FACTORIZATION, IMAGE compression
- Abstract
In this paper, we investigate the robust dictionary learning (DL) to discover the hybrid salient low-rank and sparse representation in a factorized compressed space. A Joint Robust Factorization and Projective Dictionary Learning (J-RFDL) model is presented. The setting of J-RFDL aims at improving the data representations by enhancing the robustness to outliers and noise in data, encoding the reconstruction error more accurately and obtaining hybrid salient coefficients with accurate reconstruction ability. Specifically, J-RFDL performs the robust representation by DL in a factorized compressed space to eliminate the negative effects of noise and outliers on the results, which can also make the DL process efficient. To make the encoding process robust to noise in data, J-RFDL clearly uses sparse L2, 1-norm that can potentially minimize the factorization and reconstruction errors jointly by forcing rows of the reconstruction errors to be zeros. To deliver salient coefficients with good structures to reconstruct given data well, J-RFDL imposes the joint low-rank and sparse constraints on the embedded coefficients with a synthesis dictionary. Based on the hybrid salient coefficients, we also extend J-RFDL for the joint classification and propose a discriminative J-RFDL model, which can improve the discriminating abilities of learnt coefficients by minimizing the classification error jointly. Extensive experiments on public datasets demonstrate that our formulations can deliver superior performance over other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
29. Joint Rain Detection and Removal from a Single Image with Contextualized Deep Networks.
- Author
-
Yang, Wenhan, Tan, Robby T., Feng, Jiashi, Guo, Zongming, Yan, Shuicheng, and Liu, Jiaying
- Subjects
RAINFALL, DEEP learning, COMPUTER vision, COMPUTER algorithms, IMAGE reconstruction, FOG - Abstract
Rain streaks, particularly in heavy rain, not only degrade visibility but also make many computer vision algorithms fail to function properly. In this paper, we address this visibility problem by focusing on single-image rain removal, even in the presence of dense rain streaks and rain-streak accumulation, which is visually similar to mist or fog. To achieve this, we introduce a new rain model and a deep learning architecture. Our rain model incorporates a binary rain map indicating rain-streak regions, and accommodates various shapes, directions, and sizes of overlapping rain streaks, as well as rain accumulation, to model heavy rain. Based on this model, we construct a multi-task deep network, which jointly learns three targets: the binary rain-streak map, rain streak layers, and the clean background, which is our ultimate output. To generate features that can be invariant to rain streaks, we introduce a contextual dilated network, which is able to exploit regional contextual information. To handle various shapes and directions of overlapping rain streaks, our strategy is to utilize a recurrent process that progressively removes rain streaks. Our binary map provides a constraint and thus additional information to train our network. Extensive evaluation on real images, particularly in heavy rain, shows the effectiveness of our model and architecture. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
30. Joint Label Prediction Based Semi-Supervised Adaptive Concept Factorization for Robust Data Representation.
- Author
-
Zhang, Zhao, Zhang, Yan, Liu, Guangcan, Tang, Jinhui, Yan, Shuicheng, and Wang, Meng
- Subjects
LABELS, FORECASTING, TAGS (Metadata) - Abstract
Constrained Concept Factorization (CCF) yields the enhanced representation ability over CF by incorporating label information as additional constraints, but it cannot classify and group unlabeled data appropriately. Minimizing the difference between the original data and its reconstruction directly can enable CCF to model a small noisy perturbation, but is not robust to gross sparse errors. Besides, CCF cannot preserve the manifold structures in new representation space explicitly, especially in an adaptive manner. In this paper, we propose a joint label prediction based Robust Semi-Supervised Adaptive Concept Factorization (RS2ACF) framework. To obtain robust representation, RS2ACF relaxes the factorization to make it simultaneously stable to small entrywise noise and robust to sparse errors. To enrich prior knowledge to enhance the discrimination, RS2ACF clearly uses class information of labeled data and more importantly propagates it to unlabeled data by jointly learning an explicit label indicator for unlabeled data. By the label indicator, RS2ACF can ensure the unlabeled data of the same predicted label to be mapped into the same class in feature space. Besides, RS2ACF incorporates the joint neighborhood reconstruction error over the new representations and predicted labels of both labeled and unlabeled data, so the manifold structures can be preserved explicitly and adaptively in the representation space and label space at the same time. Owing to the adaptive manner, the tricky process of determining the neighborhood size or kernel width can be avoided. Extensive results on public databases verify that our RS2ACF can deliver state-of-the-art data representation, compared with other related methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
31. Perspective-Adaptive Convolutions for Scene Parsing.
- Author
-
Zhang, Rui, Tang, Sheng, Zhang, Yongdong, Li, Jintao, and Yan, Shuicheng
- Subjects
CONVOLUTIONAL neural networks, MATHEMATICAL convolutions, FORECASTING - Abstract
Many existing scene parsing methods adopt Convolutional Neural Networks with receptive fields of fixed sizes and shapes, which frequently results in inconsistent predictions of large objects and invisibility of small objects. To tackle this issue, we propose perspective-adaptive convolutions to acquire receptive fields of flexible sizes and shapes during scene parsing. By adding a new perspective regression layer, we can dynamically infer the position-adaptive perspective coefficient vectors utilized to reshape the convolutional patches. Consequently, the receptive fields can be adjusted automatically according to the various sizes and perspective deformations of the objects in scene images. Our proposed convolutions are differentiable, so the convolutional parameters and perspective coefficients can be learned in an end-to-end way without any extra training supervision of object sizes. Furthermore, considering that standard convolutions lack contextual information and spatial dependencies, we propose a context-adaptive bias to capture both local and global contextual information through average pooling on the local feature patches and global feature maps, followed by flexible attentive summing to the convolutional results. The attentive weights are position-adaptive and context-aware, and can be learned through an additional context regression layer. Experiments on the Cityscapes and ADE20K datasets demonstrate the effectiveness of the proposed methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
32. Tensor Robust Principal Component Analysis with a New Tensor Nuclear Norm.
- Author
-
Lu, Canyi, Feng, Jiashi, Chen, Yudong, Liu, Wei, Lin, Zhouchen, and Yan, Shuicheng
- Subjects
PRINCIPAL components analysis, CALCULUS of tensors, SINGULAR value decomposition, UNIT ball (Mathematics), MATRIX decomposition - Abstract
In this paper, we consider the Tensor Robust Principal Component Analysis (TRPCA) problem, which aims to exactly recover the low-rank and sparse components from their sum. Our model is based on the recently proposed tensor-tensor product (or t-product). Induced by the t-product, we first rigorously deduce the tensor spectral norm, tensor nuclear norm, and tensor average rank, and show that the tensor nuclear norm is the convex envelope of the tensor average rank within the unit ball of the tensor spectral norm. These definitions, their relationships and properties are consistent with matrix cases. Equipped with the new tensor nuclear norm, we then solve the TRPCA problem by solving a convex program and provide the theoretical guarantee for the exact recovery. Our TRPCA model and recovery guarantee include matrix RPCA as a special case. Numerical experiments verify our results, and the applications to image recovery and background modeling problems demonstrate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
33. Asymmetric GAN for Unpaired Image-to-Image Translation.
- Author
-
Li, Yu, Tang, Sheng, Zhang, Rui, Zhang, Yongdong, Li, Jintao, and Yan, Shuicheng
- Subjects
TRANSLATIONS, CYCLING training, KNOWLEDGE transfer, GALLIUM nitride - Abstract
The unpaired image-to-image translation problem aims to model the mapping from one domain to another with unpaired training data. Current works like the well-acknowledged CycleGAN provide a general solution for any two domains by modeling injective mappings with a symmetric structure. However, when the two domains are asymmetric in complexity, i.e., the amount of information differs between them, these approaches suffer from poor generation quality, mapping ambiguity, and model sensitivity. To address these issues, we propose Asymmetric GAN (AsymGAN) to adapt to asymmetric domains by introducing an auxiliary variable (aux) that learns the extra information needed for transferring from the information-poor domain to the information-rich domain, which improves on state-of-the-art approaches in the following ways. First, aux better balances the information between the two domains, which benefits generation quality. Second, the imbalance of information commonly leads to mapping ambiguity, whereas we are able to model one-to-many mappings by tuning aux, and furthermore, our aux is controllable. Third, the training of CycleGAN can easily make the generator pair sensitive to small disturbances and variations, while our model decouples the ill-conditioned relevance of the generators by injecting aux during training. We verify the effectiveness of our proposed method both qualitatively and quantitatively on an asymmetric task, label-to-photo translation, on the Cityscapes and Helen datasets, and show many applications of asymmetric image translation. In conclusion, our AsymGAN provides a better solution for unpaired image-to-image translation between asymmetric domains. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
34. Unsupervised Nonnegative Adaptive Feature Extraction for Data Representation.
- Author
-
Zhang, Yan, Zhang, Zhao, Li, Sheng, Qin, Jie, Liu, Guangcan, Wang, Meng, and Yan, Shuicheng
- Subjects
FEATURE extraction, DATA extraction, NONNEGATIVE matrices, MATRIX decomposition, APPROXIMATION error, LINEAR programming - Abstract
In this paper, we propose a novel unsupervised Nonnegative Adaptive Feature Extraction (NAFE) algorithm for data representation and classification. The formulation of NAFE integrates the sparsity constrained nonnegative matrix factorization (NMF), representation learning, and adaptive reconstruction weight learning into a unified model. Specifically, NAFE performs feature and weight learning over the new robust representations of NMF for more accurate measure and representation. For nonnegative adaptive feature extraction, our NAFE first utilizes the sparsity constrained NMF to obtain the new and robust representations of the original data. To preserve the manifold structures of the learnt new representations, we also incorporate a neighborhood reconstruction error over the weight matrix for joint minimization. Note that to further improve the representation power, the weights are jointly shared in the new low-dimensional nonnegative representation space, low-dimensional nonlinear manifold space, and low-dimensional projective subspace, i.e., local neighborhood information is clearly preserved in different feature spaces so that informative representations and features can be jointly obtained. To enable NAFE to extract features from new data, we also include a feature approximation error by a linear projection so that the learnt extractor can obtain features from new data efficiently. Extensive simulations show that our formulation can deliver state-of-the-art results on several public databases for feature extraction and classification, compared with several related methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
35. 3D-Aided Dual-Agent GANs for Unconstrained Face Recognition.
- Author
-
Zhao, Jian, Xiong, Lin, Li, Jianshu, Xing, Junliang, Yan, Shuicheng, and Feng, Jiashi
- Subjects
HUMAN facial recognition software, DEPERSONALIZATION, LEARNING problems, GALLIUM nitride - Abstract
Synthesizing realistic profile faces is beneficial for more efficiently training deep pose-invariant models for large-scale unconstrained face recognition, by augmenting the number of samples with extreme poses and avoiding costly annotation work. However, learning from synthetic faces may not achieve the desired performance due to the discrepancy between the distributions of the synthetic and real face images. To narrow this gap, we propose a Dual-Agent Generative Adversarial Network (DA-GAN) model, which can improve the realism of a face simulator's output using unlabeled real faces while preserving the identity information during the realism refinement. The dual agents are specially designed for distinguishing real versus fake and identities simultaneously. In particular, we employ an off-the-shelf 3D face model as a simulator to generate profile face images with varying poses. DA-GAN leverages a fully convolutional network as the generator to generate high-resolution images and an auto-encoder as the discriminator with the dual agents. Besides the novel architecture, we make several key modifications to the standard GAN to preserve pose, texture as well as identity, and to stabilize the training process: (i) a pose perception loss; (ii) an identity perception loss; (iii) an adversarial loss with a boundary equilibrium regularization term. Experimental results show that DA-GAN not only achieves outstanding perceptual results but also significantly outperforms state-of-the-art methods on the large-scale and challenging NIST IJB-A and CFP unconstrained face recognition benchmarks. In addition, the proposed DA-GAN is also a promising new approach for solving generic transfer learning problems more effectively. DA-GAN is the foundation of our winning entry to the NIST IJB-A face recognition competition, in which we secured first place on the verification and identification tracks. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
36. Robust Locality-Constrained Label Consistent K-SVD by Joint Sparse Embedding
- Author
-
Zhang, Zhao, primary, Jiang, Weiming, additional, Li, Sheng, additional, Qin, Jie, additional, Liu, Guangcan, additional, and Yan, Shuicheng, additional
- Published
- 2018
- Full Text
- View/download PDF
37. Robust Projective Low-Rank and Sparse Representation by Robust Dictionary Learning
- Author
-
Ren, Jiahuan, primary, Zhang, Zhao, additional, Li, Sheng, additional, Liu, Guangcan, additional, Wang, Meng, additional, and Yan, Shuicheng, additional
- Published
- 2018
- Full Text
- View/download PDF
38. Robust Discriminative Projective Dictionary Pair Learning by Adaptive Representations
- Author
-
Sun, Yulin, primary, Zhang, Zhao, additional, Jiang, Weiming, additional, Liu, Guangcan, additional, Wang, Meng, additional, and Yan, Shuicheng, additional
- Published
- 2018
- Full Text
- View/download PDF
39. Neural Style Transfer via Meta Networks
- Author
-
Shen, Falong, primary, Yan, Shuicheng, additional, and Zeng, Gang, additional
- Published
- 2018
- Full Text
- View/download PDF
40. Human Pose Estimation with Parsing Induced Learner
- Author
-
Nie, Xuecheng, primary, Feng, Jiashi, additional, Zuo, Yiming, additional, and Yan, Shuicheng, additional
- Published
- 2018
- Full Text
- View/download PDF
41. Multi-oriented Scene Text Detection via Corner Localization and Region Segmentation
- Author
-
Lyu, Pengyuan, primary, Yao, Cong, additional, Wu, Wenhao, additional, Yan, Shuicheng, additional, and Bai, Xiang, additional
- Published
- 2018
- Full Text
- View/download PDF
42. Towards Pose Invariant Face Recognition in the Wild
- Author
-
Zhao, Jian, primary, Cheng, Yu, additional, Xu, Yan, additional, Xiong, Lin, additional, Li, Jianshu, additional, Zhao, Fang, additional, Jayashree, Karlekar, additional, Pranata, Sugiri, additional, Shen, Shengmei, additional, Xing, Junliang, additional, Yan, Shuicheng, additional, and Feng, Jiashi, additional
- Published
- 2018
- Full Text
- View/download PDF
43. Robust Projective Dictionary Learning by Joint Label Embedding and Classification
- Author
-
Jiang, Weiming, primary, Zhang, Zhao, additional, Qin, Jie, additional, Zhao, Mingbo, additional, Li, Fanzhang, additional, and Yan, Shuicheng, additional
- Published
- 2017
- Full Text
- View/download PDF
44. Neural Person Search Machines
- Author
-
Liu, Hao, primary, Feng, Jiashi, additional, Jie, Zequn, additional, Jayashree, Karlekar, additional, Zhao, Bo, additional, Qi, Meibin, additional, Jiang, Jianguo, additional, and Yan, Shuicheng, additional
- Published
- 2017
- Full Text
- View/download PDF
45. Recurrent 3D-2D Dual Learning for Large-Pose Facial Landmark Detection
- Author
-
Xiao, Shengtao, primary, Feng, Jiashi, additional, Liu, Luoqi, additional, Nie, Xuecheng, additional, Wang, Wei, additional, Yan, Shuicheng, additional, and Kassim, Ashraf, additional
- Published
- 2017
- Full Text
- View/download PDF
46. FoveaNet: Perspective-Aware Urban Scene Parsing
- Author
-
Li, Xin, primary, Jie, Zequn, additional, Wang, Wei, additional, Liu, Changsong, additional, Yang, Jimei, additional, Shen, Xiaohui, additional, Lin, Zhe, additional, Chen, Qiang, additional, Yan, Shuicheng, additional, and Feng, Jiashi, additional
- Published
- 2017
- Full Text
- View/download PDF
47. Scale-Adaptive Convolutions for Scene Parsing
- Author
-
Zhang, Rui, primary, Tang, Sheng, additional, Zhang, Yongdong, additional, Li, Jintao, additional, and Yan, Shuicheng, additional
- Published
- 2017
- Full Text
- View/download PDF
48. Video Scene Parsing with Predictive Feature Learning
- Author
-
Jin, Xiaojie, primary, Li, Xin, additional, Xiao, Huaxin, additional, Shen, Xiaohui, additional, Lin, Zhe, additional, Yang, Jimei, additional, Chen, Yunpeng, additional, Dong, Jian, additional, Liu, Luoqi, additional, Jie, Zequn, additional, Feng, Jiashi, additional, and Yan, Shuicheng, additional
- Published
- 2017
- Full Text
- View/download PDF
49. Deep Subspace Clustering.
- Author
-
Peng, Xi, Feng, Jiashi, Zhou, Joey Tianyi, Lei, Yingjie, and Yan, Shuicheng
- Subjects
DEEP learning, DEVELOPING countries, SUBSPACES (Mathematics), SPARSE matrices, IMAGE reconstruction - Abstract
In this article, we propose a deep extension of sparse subspace clustering, termed deep subspace clustering with L1-norm (DSC-L1). Regularized by the unit sphere distribution assumption for the learned deep features, DSC-L1 can infer a new data affinity matrix by simultaneously satisfying the sparsity principle of SSC and the nonlinearity given by neural networks. One of the appealing advantages brought by DSC-L1 is that when original real-world data do not meet the class-specific linear subspace distribution assumption, DSC-L1 can employ neural networks to make the assumption valid with its nonlinear transformations. Moreover, we prove that our neural network could sufficiently approximate the minimizer under mild conditions. To the best of our knowledge, this could be one of the first deep-learning-based subspace clustering methods. Extensive experiments are conducted on four real-world data sets to show that the proposed method is significantly superior to 17 existing methods for subspace clustering on handcrafted features and raw data. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
50. Discriminative Local Sparse Representation by Robust Adaptive Dictionary Pair Learning.
- Author
-
Sun, Yulin, Zhang, Zhao, Jiang, Weiming, Zhang, Zheng, Zhang, Li, Yan, Shuicheng, and Wang, Meng
- Subjects
COMPUTER science, IMAGE recognition (Computer vision), IMAGE representation, NEURAL codes - Abstract
In this article, we propose a structured robust adaptive dictionary pair learning (RA-DPL) framework for discriminative sparse representation (SR) learning. To achieve powerful representation ability of the available samples, the setting of RA-DPL seamlessly integrates the robust projective DPL, locality-adaptive SRs, and discriminative coding coefficient learning into a unified learning framework. Specifically, RA-DPL improves existing projective DPL in four perspectives. First, it applies a sparse L2,1-norm-based metric to encode the reconstruction error to deliver the robust projective dictionary pairs, and the L2,1-norm has the potential to minimize the error. Second, it imposes the robust L2,1-norm directly on the analysis dictionary to ensure the sparse property of the coding coefficients rather than using the costly L0/L1-norm. As such, the robustness of the data representation and the efficiency of the learning process are jointly considered to guarantee the efficacy of our RA-DPL. Third, RA-DPL conceives a structured reconstruction weight learning paradigm to preserve the local structures of the coding coefficients within each class in an adaptive manner, which encourages the production of locality-preserving representations. Fourth, it also considers improving the discriminating ability of the coding coefficients and dictionary by incorporating a discriminating function, which can ensure high intraclass compactness and interclass separation in the code space. Extensive experiments show that our RA-DPL obtains superior performance over other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF