Author: "Xiang, Tao" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Xiang, Tao"' showing total 4,831 results

201. Evaluating effects of climate change on the spatial distribution of an atypical cavefish Onychostoma macrolepis

Author: Dong, Xianghong, Ju, Tao, Shi, Lei, Luo, Chao, Gan, Lei, Wang, Zhenlu, Wang, Weiwei, He, Haoyu, Zhang, Shuhai, Zhou, Yuebing, An, Miao, Jiang, Haibo, Shao, Jian, and Xiang, Tao
Published: 2024
Full Text: View/download PDF

202. Enhancing the bioactivity and ductility of bulk metallic glass by introducing Fe to construct semi-degradable biomaterial

Author: Zuo, Kun, Du, Peng, Yang, Xinxin, Li, Kun, Xiang, Tao, Zhang, Liang, and Xie, Guoqiang
Published: 2024
Full Text: View/download PDF

203. An experimental study on wave transmission by engineered plain and enhanced oyster reefs

Author: Xiang, Tao, Bryski, Ephraim, and Farhadzadeh, Ali
Published: 2024
Full Text: View/download PDF

204. Zeolite enhanced iron-modified biocarrier drives Fe(II)/Fe(III) cycle to achieve nitrogen removal from eutrophic water

Author: Cheng, Lang, Liang, Hong, Yang, Wenbo, Xiang, Tao, Chen, Tao, and Gao, Dawen
Published: 2024
Full Text: View/download PDF

205. StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

Author: Sain, Aneeshan, Bhunia, Ayan Kumar, Yang, Yongxin, Xiang, Tao, and Song, Yi-Zhe
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Sketch-based image retrieval (SBIR) is a cross-modal matching problem which is typically solved by learning a joint embedding space where the semantic content shared between photo and sketch modalities is preserved. However, a fundamental challenge in SBIR has been largely ignored so far, that is, sketches are drawn by humans and considerable style variations exist amongst different users. An effective SBIR model needs to explicitly account for this style diversity, crucially, to generalise to unseen user styles. To this end, a novel style-agnostic SBIR model is proposed. Different from existing models, a cross-modal variational autoencoder (VAE) is employed to explicitly disentangle each sketch into a semantic content part shared with the corresponding photo, and a style part unique to the sketcher. Importantly, to make our model dynamically adaptable to any unseen user styles, we propose to meta-train our cross-modal VAE by adding two style-adaptive components: a set of feature transformation layers to its encoder and a regulariser to the disentangled semantic content latent code. With this meta-learning framework, our model can not only disentangle the cross-modal shared semantic content for SBIR, but can adapt the disentanglement to any unseen user style as well, making the SBIR model truly style-agnostic. Extensive experiments show that our style-agnostic model yields state-of-the-art performance for both category-level and instance-level SBIR., Comment: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021
Published: 2021

206. Cloud2Curve: Generation and Vectorization of Parametric Sketches

Author: Das, Ayan, Yang, Yongxin, Hospedales, Timothy, Xiang, Tao, and Song, Yi-Zhe
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations. We further aim to model sketches as a sequence of low-dimensional parametric curves. To this end, we propose an inverse graphics framework capable of approximating a raster or waypoint based stroke encoded as a point-cloud with a variable-degree Bézier curve. Building on this module, we present Cloud2Curve, a generative model for scalable high-resolution vector sketches that can be trained end-to-end using point-cloud data alone. As a consequence, our model is also capable of deterministic vectorization which can map novel raster or waypoint based sketches to their corresponding high-resolution scalable Bézier equivalent. We evaluate the generation and vectorization capabilities of our model on Quick, Draw! and K-MNIST datasets., Comment: Accepted at CVPR 2021 (Poster)
Published: 2021
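
Note on record 206: the abstract refers to variable-degree Bézier curves. As a minimal illustration of what such a curve is, the sketch below evaluates a Bézier curve of any degree with De Casteljau's algorithm. It is not the Cloud2Curve model; the function names `bezier_point` and `sample_curve` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bezier_point(control_points: List[Point], t: float) -> Point:
    """Evaluate a Bezier curve of any degree at parameter t in [0, 1].

    Uses De Casteljau's algorithm: repeatedly interpolate between
    neighbouring points until a single point remains.
    """
    pts = list(control_points)
    while len(pts) > 1:
        pts = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(pts, pts[1:])
        ]
    return pts[0]


def sample_curve(control_points: List[Point], n: int = 5) -> List[Point]:
    """Sample n >= 2 evenly spaced parameter values along the curve."""
    return [bezier_point(control_points, i / (n - 1)) for i in range(n)]


if __name__ == "__main__":
    # A degree-2 curve (three control points); the degree is simply
    # len(control_points) - 1, so longer lists give higher-degree curves.
    print(sample_curve([(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]))
```

The degree is set by the number of control points, which is the sense in which a stroke could be fitted with curves of varying degree.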

207. More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

Author: Bhunia, Ayan Kumar, Chowdhury, Pinaki Nath, Sain, Aneeshan, Yang, Yongxin, Xiang, Tao, and Song, Yi-Zhe
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs. Whilst the number of photos can be easily scaled, each corresponding sketch still needs to be individually produced. In this paper, we aim to mitigate such an upper-bound on sketch data, and study whether unlabelled photos alone (of which there are many) can be cultivated for performance gain. In particular, we introduce a novel semi-supervised framework for cross-modal retrieval that can additionally leverage large-scale unlabelled photos to account for data scarcity. At the centre of our semi-supervision design is a sequential photo-to-sketch generation model that aims to generate paired sketches for unlabelled photos. Importantly, we further introduce a discriminator guided mechanism to guide against unfaithful generation, together with a distillation loss based regularizer to provide tolerance against noisy training samples. Last but not least, we treat generation and retrieval as two conjugate problems, where a joint learning procedure is devised for each module to mutually benefit from each other. Extensive experiments show that our semi-supervised model yields significant performance boost over the state-of-the-art supervised alternatives, as well as existing methods that can exploit unlabelled photos for FG-SBIR., Comment: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021 Code : https://github.com/AyanKumarBhunia/semisupervised-FGSBIR
Published: 2021

208. Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting

Author: Bhunia, Ayan Kumar, Chowdhury, Pinaki Nath, Yang, Yongxin, Hospedales, Timothy M., Xiang, Tao, and Song, Yi-Zhe
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Self-supervised learning has gained prominence due to its efficacy at learning powerful representations from unlabelled data that achieve excellent performance on many challenging downstream tasks. However, supervision-free pre-text tasks are challenging to design and usually modality specific. Although there is a rich literature of self-supervised methods for either spatial (such as images) or temporal data (sound or text) modalities, a common pre-text task that benefits both modalities is largely missing. In this paper, we are interested in defining a self-supervised pre-text task for sketches and handwriting data. This data is uniquely characterised by its existence in dual modalities of rasterized images and vector coordinate sequences. We address and exploit this dual representation by proposing two novel cross-modal translation pre-text tasks for self-supervised feature learning: Vectorization and Rasterization. Vectorization learns to map image space to vector coordinates and rasterization maps vector coordinates to image space. We show that our learned encoder modules benefit both raster-based and vector-based downstream approaches to analysing hand-drawn data. Empirical evidence shows that our novel pre-text tasks surpass existing single and multi-modal self-supervision methods., Comment: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021 Code : https://github.com/AyanKumarBhunia/Self-Supervised-Learning-for-Sketch
Published: 2021

209. Context-Aware Layout to Image Generation with Enhanced Object Appearance

Author: He, Sen, Liao, Wentong, Yang, Michael Ying, Yang, Yongxin, Song, Yi-Zhe, Rosenhahn, Bodo, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: A layout to image (L2I) generation model aims to generate a complicated image containing multiple objects (things) against natural background (stuff), conditioned on a given layout. Built upon the recent advances in generative adversarial networks (GANs), existing L2I models have made great progress. However, a close inspection of their generated images reveals two major limitations: (1) the object-to-object as well as object-to-stuff relations are often broken and (2) each object's appearance is typically distorted lacking the key defining characteristics associated with the object class. We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and location-sensitive appearance representation in their discriminators. To address these limitations, two new modules are proposed in this work. First, a context-aware feature transformation module is introduced in the generator to ensure that the generated feature encoding of either object or stuff is aware of other co-existing objects/stuff in the scene. Second, instead of feeding location-insensitive image features to the discriminator, we use the Gram matrix computed from the feature maps of the generated object images to preserve location-sensitive information, resulting in much enhanced object appearance. Extensive experiments show that the proposed method achieves state-of-the-art performance on the COCO-Thing-Stuff and Visual Genome benchmarks., Comment: CVPR 2021
Published: 2021

210. Universal scaling of the critical temperature and the strange-metal scattering rate in unconventional superconductors

Author: Yuan, Jie, Chen, Qihong, Jiang, Kun, Feng, Zhongpei, Lin, Zefeng, Yu, Heshan, He, Ge, Zhang, Jinsong, Jiang, Xingyu, Zhang, Xu, Shi, Yujun, Zhang, Yanmin, Cheng, Zhi Gang, Tamura, Nobumichi, Yang, Yifeng, Xiang, Tao, Hu, Jiangping, Takeuchi, Ichiro, Jin, Kui, and Zhao, Zhongxian
Subjects: Condensed Matter - Superconductivity
Abstract: Dramatic evolution of properties with minute change in the doping level is a hallmark of the complex chemistry which governs cuprate superconductivity as manifested in the celebrated superconducting domes as well as quantum criticality taking place at precise compositions. The strange metal state, where the resistivity varies linearly with temperature, has emerged as a central feature in the normal state of cuprate superconductors. The ubiquity of this behavior signals an intimate link between the scattering mechanism and superconductivity. However, a clear quantitative picture of the correlation has been lacking. Here, we report observation of quantitative scaling laws between the superconducting transition temperature $T_{\rm c}$ and the scattering rate associated with the strange metal state in electron-doped cuprate $\rm La_{2-x}Ce_xCuO_4$ (LCCO) as a precise function of the doping level. High-resolution characterization of epitaxial composition-spread films, which encompass the entire overdoped range of LCCO has allowed us to systematically map its structural and transport properties with unprecedented accuracy and increment of $\Delta x = 0.0015$. We have uncovered the relations $T_{\rm c}\sim(x_{\rm c}-x)^{0.5}\sim(A_1^\square)^{0.5}$, where $x_c$ is the critical doping where superconductivity disappears on the overdoped side and $A_1^\square$ is the scattering rate of perfect $T$-linear resistivity per CuO$_2$ plane. We argue that the striking similarity of the $T_{\rm c}$ vs $A_1^\square$ relation among cuprates, iron-based and organic superconductors is an indication of a common mechanism of the strange metal behavior and unconventional superconductivity in these systems., Comment: 15 pages, 3 figures
Published: 2021
Full Text: View/download PDF

211. Enhancement of Superconductivity Linked with Linear-in-Temperature/Field Resistivity in Ion-Gated FeSe Films

Author: Jiang, Xingyu, Qin, Mingyang, Wei, Xinjian, Feng, Zhongpei, Ke, Jiezun, Zhu, Haipeng, Chen, Fucong, Zhang, Liping, Xu, Li, Zhang, Xu, Zhang, Ruozhou, Wei, Zhongxu, Xiong, Peiyu, Liang, Qimei, Xi, Chuanying, Wang, Zhaosheng, Yuan, Jie, Zhu, Beiyi, Jiang, Kun, Yang, Ming, Wang, Junfeng, Hu, Jiangping, Xiang, Tao, Leridon, Brigitte, Yu, Rong, Chen, Qihong, Jin, Kui, and Zhao, Zhongxian
Subjects: Condensed Matter - Superconductivity
Abstract: Iron selenide (FeSe) - the structurally simplest iron-based superconductor, has attracted tremendous interest in the past years. While the transition temperature (Tc) of bulk FeSe is $\sim$ 8 K, it can be significantly enhanced to 40 - 50 K by various ways of electron doping. However, the underlying physics for such great enhancement of Tc and so the Cooper pairing mechanism still remain puzzles. Here, we report a systematic study of the superconducting- and normal-state properties of FeSe films via ionic liquid gating. With fine tuning, Tc evolves continuously from below 10 K to above 40 K; in situ two-coil mutual inductance measurements unambiguously confirm the gating is a uniform bulk effect. Close to Tc, the normal-state resistivity shows a linear dependence on temperature and the linearity extends to lower temperatures with the superconductivity suppressed by high magnetic fields. At high fields, the normal-state magnetoresistance exhibits a linear-in-field dependence and obeys a simple scaling relation between applied field and temperature. Consistent behaviors are observed for different-Tc states throughout the gating process, suggesting the pairing mechanism very likely remains the same from low- to high-Tc state. Importantly, the coefficient of the linear-in-temperature resistivity is positively correlated with Tc, similarly to the observations in cuprates, Bechgaard salts and iron pnictide superconductors. Our study points to a short-range antiferromagnetic exchange interaction mediated pairing mechanism in FeSe., Comment: 21 pages, 5 figures, SI not included
Published: 2021

212. Domain Generalization: A Survey

Author: Zhou, Kaiyang, Liu, Ziwei, Qiao, Yu, Xiang, Tao, and Loy, Chen Change
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce. This is because most learning algorithms strongly rely on the i.i.d. assumption on source/target data, which is often violated in practice due to domain shift. Domain generalization (DG) aims to achieve OOD generalization by using only source data for model learning. Over the last ten years, research in DG has made great progress, leading to a broad spectrum of methodologies, e.g., those based on domain alignment, meta-learning, data augmentation, or ensemble learning, to name a few; DG has also been studied in various application areas including computer vision, speech recognition, natural language processing, medical imaging, and reinforcement learning. In this paper, for the first time a comprehensive literature review in DG is provided to summarize the developments over the past decade. Specifically, we first cover the background by formally defining DG and relating it to other relevant fields like domain adaptation and transfer learning. Then, we conduct a thorough review into existing methods and theories. Finally, we conclude this survey with insights and discussions on future research directions., Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Published: 2021
Full Text: View/download PDF

213. Momentum-Resolved Visualization of Electronic Evolution in Doping a Mott Insulator

Author: Hu, Cheng, Zhao, Jianfa, Gao, Qiang, Yan, Hongtao, Rong, Hongtao, Huang, Jianwei, Liu, Jing, Cai, Yongqing, Li, Cong, Chen, Hao, Zhao, Lin, Liu, Guodong, Jin, Changqing, Xu, Zuyan, Xiang, Tao, and Zhou, X. J.
Subjects: Condensed Matter - Strongly Correlated Electrons, Condensed Matter - Materials Science, Condensed Matter - Superconductivity
Abstract: High temperature superconductivity in cuprates arises from doping a parent Mott insulator by electrons or holes. A central issue is how the Mott gap evolves and the low-energy states emerge with doping. Here we report angle-resolved photoemission spectroscopy measurements on a cuprate parent compound by sequential in situ electron doping. The chemical potential jumps to the bottom of the upper Hubbard band upon a slight electron doping, making it possible to directly visualize the charge transfer band and the full Mott gap region. With increasing doping, the Mott gap rapidly collapses due to the spectral weight transfer from the charge transfer band to the gapped region and the induced low-energy states emerge in a wide energy range inside the Mott gap. These results provide key information on the electronic evolution in doping a Mott insulator and establish a basis for developing microscopic theories for cuprate superconductivity., Comment: 23 pages, 5 figures
Published: 2021
Full Text: View/download PDF

214. Contrastive Prototype Learning with Augmented Embeddings for Few-Shot Learning

Author: Gao, Yizhao, Fei, Nanyi, Liu, Guangzhen, Lu, Zhiwu, Xiang, Tao, and Huang, Songfang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Most recent few-shot learning (FSL) methods are based on meta-learning with episodic training. In each meta-training episode, a discriminative feature embedding and/or classifier are first constructed from a support set in an inner loop, and then evaluated in an outer loop using a query set for model updating. This query set sample centered learning objective is however intrinsically limited in addressing the lack of training data problem in the support set. In this paper, a novel contrastive prototype learning with augmented embeddings (CPLAE) model is proposed to overcome this limitation. First, data augmentations are introduced to both the support and query sets with each sample now being represented as an augmented embedding (AE) composed of concatenated embeddings of both the original and augmented versions. Second, a novel support set class prototype centered contrastive loss is proposed for contrastive prototype learning (CPL). With a class prototype as an anchor, CPL aims to pull the query samples of the same class closer and those of different classes further away. This support set sample centered loss is highly complementary to the existing query centered loss, fully exploiting the limited training data in each episode. Extensive experiments on several benchmarks demonstrate that our proposed CPLAE achieves new state-of-the-art.
Published: 2021
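
Note on record 214: the abstract describes a loss that uses each class prototype as an anchor, pulling same-class query samples closer and pushing other classes away. The sketch below shows one plausible form of such a loss, assuming prototypes are support-set means, cosine similarity, and an InfoNCE-style softmax with a temperature. These choices and the function names are assumptions for illustration, not the paper's formulation.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Class prototype: the element-wise mean of the support embeddings."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prototype_contrastive_loss(
    support: Dict[str, List[Vector]],
    query: List[Tuple[Vector, str]],
    temperature: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Average, over prototypes, of -log(positive mass / total mass)."""
    losses = []
    for label, vectors in support.items():
        proto = mean_vector(vectors)
        weights = [
            (math.exp(cosine(proto, q) / temperature), q_label)
            for q, q_label in query
        ]
        positive = sum(w for w, q_label in weights if q_label == label)
        total = sum(w for w, _ in weights)
        if positive > 0:  # skip prototypes with no same-class query
            losses.append(-math.log(positive / total))
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    support = {"cat": [[1.0, 0.0], [0.9, 0.1]], "dog": [[0.0, 1.0]]}
    query = [([1.0, 0.1], "cat"), ([0.1, 1.0], "dog")]
    print(prototype_contrastive_loss(support, query))
```

The loss is small when each prototype is most similar to queries of its own class and grows as same-class queries drift toward other prototypes.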

215. Few-shot Action Recognition with Prototype-centered Attentive Learning

Author: Zhu, Xiatian, Toisoul, Antoine, Perez-Rua, Juan-Manuel, Zhang, Li, Martinez, Brais, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Few-shot action recognition aims to recognize action classes with few training samples. Most existing methods adopt a meta-learning approach with episodic training. In each episode, the few samples in a meta-training task are split into support and query sets. The former is used to build a classifier, which is then evaluated on the latter using a query-centered loss for model updating. There are however two major limitations: lack of data efficiency due to the query-centered only loss design and inability to deal with the support set outlying samples and inter-class distribution overlapping problems. In this paper, we overcome both limitations by proposing a new Prototype-centered Attentive Learning (PAL) model composed of two novel components. First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective, in order to make full use of the limited training samples in each episode. Second, PAL further integrates a hybrid attentive learning mechanism that can minimize the negative impacts of outliers and promote class separation. Extensive experiments on four standard few-shot action benchmarks show that our method clearly outperforms previous state-of-the-art methods, with the improvement particularly significant (10+%) on the most challenging fine-grained action recognition benchmark., Comment: 10 pages, 4 figures
Published: 2021

216. Magnetic field-tuned quantum criticality in optimally electron-doped cuprate thin films

Author: Zhang, Xu, Yu, Heshan, Chen, Qihong, Yang, Runqiu, He, Ge, Lin, Ziquan, Li, Qian, Yuan, Jie, Zhu, Beiyi, Li, Liang, Yang, Yi-feng, Xiang, Tao, Cai, Rong-Gen, Kusmartseva, Anna, Kusmartsev, F. V., Wang, Jun-Feng, and Jin, Kui
Subjects: Condensed Matter - Superconductivity
Abstract: Antiferromagnetic (AF) spin fluctuations are commonly believed to play a key role in electron pairing of cuprate superconductors. In electron-doped cuprates, the interplay among different electronic states in quantum perturbations, especially between superconducting and magnetic states, remains puzzling. Here, we report a systematic transport study on cation-optimized La2-xCexCuO4 (x = 0.10) thin films in high magnetic fields. We find an AF quantum phase transition near 60 T, where the Hall number jumps from nH = -x to nH = 1-x, resembling the change of nH at the AF boundary (xAF = 0.14) tuned by Ce doping. In the AF region a spin dependent state manifesting anomalous positive magnetoresistance is observed, which is closely related to superconductivity. Once the AF state is suppressed by magnetic field, a polarized ferromagnetic state is predicted, reminiscent of the recently reported ferromagnetic state at the quantum endpoint of the superconducting dome by Ce doping. The magnetic field that drives phase transitions in a similar but distinct manner to doping thereby provides a unique perspective to understand the quantum criticality of electron-doped cuprates.
Published: 2021
Full Text: View/download PDF

217. Local Black-box Adversarial Attacks: A Query Efficient Approach

Author: Xiang, Tao, Liu, Hangcheng, Guo, Shangwei, Zhang, Tianwei, and Liao, Xiaofeng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios. Most existing black-box attacks fool the target model by interacting with it many times and producing global perturbations. However, global perturbations change the smooth and insignificant background, which not only makes the perturbation more easily be perceived but also increases the query overhead. In this paper, we propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks. Our framework is constructed based on two types of transferability. The first one is the transferability of model interpretations. Based on this property, we identify the discriminative areas of a given clean example easily for local perturbations. The second is the transferability of adversarial examples. It helps us to produce a local pre-perturbation for improving query efficiency. After identifying the discriminative areas and pre-perturbing, we generate the final adversarial examples from the pre-perturbed example by querying the targeted model with two kinds of black-box attack techniques, i.e., gradient estimation and random search. We conduct extensive experiments to show that our framework can significantly improve the query efficiency during black-box perturbing with a high attack success rate. Experimental results show that our attacks outperform state-of-the-art black-box attacks under various system settings., Comment: This work has been submitted to the IEEE for possible publication
Published: 2021

218. Density Matrix and Tensor Network Renormalization

Author: Xiang, Tao
Published: 2023
Full Text: View/download PDF

219. Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Author: Zheng, Sixiao, Lu, Jiachen, Zhao, Hengshuang, Zhu, Xiatian, Luo, Zekun, Wang, Yabiao, Fu, Yanwei, Feng, Jianfeng, Xiang, Tao, Torr, Philip H. S., and Zhang, Li
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Most recent semantic segmentation methods adopt a fully-convolutional network (FCN) with an encoder-decoder architecture. The encoder progressively reduces the spatial resolution and learns more abstract/semantic visual concepts with larger receptive fields. Since context modeling is critical for segmentation, the latest efforts have been focused on increasing the receptive field, through either dilated/atrous convolutions or inserting attention modules. However, the encoder-decoder based FCN architecture remains unchanged. In this paper, we aim to provide an alternative perspective by treating semantic segmentation as a sequence-to-sequence prediction task. Specifically, we deploy a pure transformer (i.e., without convolution and resolution reduction) to encode an image as a sequence of patches. With the global context modeled in every layer of the transformer, this encoder can be combined with a simple decoder to provide a powerful segmentation model, termed SEgmentation TRansformer (SETR). Extensive experiments show that SETR achieves new state of the art on ADE20K (50.28% mIoU), Pascal Context (55.83% mIoU) and competitive results on Cityscapes. Particularly, we achieve the first position in the highly competitive ADE20K test server leaderboard on the day of submission., Comment: CVPR 2021. Project page at https://fudan-zvg.github.io/SETR/
Published: 2020
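
Note on record 219: the abstract describes encoding an image as a sequence of patches before feeding it to a transformer. The sketch below shows only that first step on a single-channel image stored as nested lists. The function name `image_to_patch_sequence` and the row-major patch order are assumptions for illustration; the actual SETR pipeline (linear projection, position embeddings, transformer layers) is not reproduced here.

```python
from typing import List


def image_to_patch_sequence(
    image: List[List[float]], patch_size: int
) -> List[List[float]]:
    """Split an H x W image into non-overlapping square patches.

    Returns a sequence of flattened patches in row-major order, so an
    H x W image yields (H / patch_size) * (W / patch_size) tokens.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches.append(
                [
                    image[r][c]
                    for r in range(top, top + patch_size)
                    for c in range(left, left + patch_size)
                ]
            )
    return patches


if __name__ == "__main__":
    # A 4 x 4 image whose pixel value encodes its position.
    image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    for token in image_to_patch_sequence(image, 2):
        print(token)
```

Each flattened patch plays the role of one token in the sequence; the sequence length depends only on the image size and the patch size.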

220. Universal quantum transition from superconducting to insulating states in pressurized Bi2Sr2CaCu2O8+δ superconductors

Author: Zhou, Yazhou, Guo, Jing, Cai, Shu, Gu, Genda, Lin, Chengtian, Yan, Hongtao, Huang, Cheng, Yang, Chongli, Long, Sijin, Gong, Yu, Li, Yanchun, Li, Xiaodong, Wu, Qi, Hu, Jiangping, Zhou, Xingjiang, Xiang, Tao, and Sun, Liling
Subjects: Condensed Matter - Superconductivity
Abstract: Copper oxide superconductors have continually fascinated the communities of condensed matter physics and material sciences because they host the highest ambient-pressure superconducting transition temperature (Tc) and mysterious physics. Searching for the universal correlation between the superconducting state and its normal state or neighboring ground state is believed to be an effective way for finding clues to elucidate the underlying mechanism of the superconductivity. One of the common pictures for the copper oxide superconductors is that a well-behaved metallic phase will present after the superconductivity is entirely suppressed by chemical doping or application of the magnetic field. Here, we report a different observation of universal quantum transition from superconducting state to insulating-like state under pressure in the under-, optimally- and over-doped Bi2212 superconductors with two CuO2 planes in a unit cell. The same phenomenon has been also found in the Bi2201 superconductor with one CuO2 plane and the Bi2223 superconductor with three CuO2 planes in a unit cell. These results not only provide fresh information but also pose a new challenge for achieving a unified understanding on the underlying physics of the high-Tc superconductivity., Comment: 17 pages, 4 figures
Published: 2020

221. Margin-Based Transfer Bounds for Meta Learning with Deep Feature Embedding

Author: Guan, Jiechao, Lu, Zhiwu, Xiang, Tao, and Hospedales, Timothy
Subjects: Computer Science - Machine Learning
Abstract: By transferring knowledge learned from seen/previous tasks, meta learning aims to generalize well to unseen/future tasks. Existing meta-learning approaches have shown promising empirical performance on various multiclass classification problems, but few provide theoretical analysis on the classifiers' generalization ability on future tasks. In this paper, under the assumption that all classification tasks are sampled from the same meta-distribution, we leverage margin theory and statistical learning theory to establish three margin-based transfer bounds for meta-learning based multiclass classification (MLMC). These bounds reveal that the expected error of a given classification algorithm for a future task can be estimated with the average empirical error on a finite number of previous tasks, uniformly over a class of preprocessing feature maps/deep neural networks (i.e. deep feature embeddings). To validate these bounds, instead of the commonly-used cross-entropy loss, a multi-margin loss is employed to train a number of representative MLMC models. Experiments on three benchmarks show that these margin-based models still achieve competitive performance, validating the practical value of our margin-based theoretical analysis., Comment: 14 pages, 2 figures
Published: 2020
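
Note on record 221: the abstract mentions training with a multi-margin loss in place of cross-entropy but does not spell the loss out. The sketch below uses a common multi-class hinge definition, with a margin of 1 and normalisation by the number of classes; both are assumptions for this example and may differ from the paper's exact choice.

```python
from typing import Sequence


def multi_margin_loss(
    scores: Sequence[float], target: int, margin: float = 1.0
) -> float:
    """Multi-class hinge loss for one sample.

    Every wrong class whose score comes within `margin` of the target
    class score contributes a penalty; the sum is divided by the number
    of classes (one common convention, assumed here).
    """
    penalties = [
        max(0.0, margin - scores[target] + score)
        for index, score in enumerate(scores)
        if index != target
    ]
    return sum(penalties) / len(scores)


if __name__ == "__main__":
    # Class 0 is correct; class 2 is within the margin and is penalised.
    print(multi_margin_loss([2.0, 0.5, 1.5], target=0))
```

Unlike cross-entropy, this loss is exactly zero once the target score exceeds every other score by at least the margin.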

222. Boundary-sensitive Pre-training for Temporal Localization in Videos

Author: Xu, Mengmeng, Perez-Rua, Juan-Manuel, Escorcia, Victor, Martinez, Brais, Zhu, Xiatian, Zhang, Li, Ghanem, Bernard, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Many video analysis tasks require temporal localization thus detection of content changes. However, most existing models developed for these tasks are pre-trained on general video action classification tasks. This is because large scale annotation of temporal boundaries in untrimmed videos is expensive. Therefore no suitable datasets exist for temporal boundary-sensitive pre-training. In this paper for the first time, we investigate model pre-training for temporal localization by introducing a novel boundary-sensitive pretext (BSP) task. Instead of relying on costly manual annotations of temporal boundaries, we propose to synthesize temporal boundaries in existing video action classification datasets. With the synthesized boundaries, BSP can be simply conducted via classifying the boundary types. This enables the learning of video representations that are much more transferable to downstream temporal localization tasks. Extensive experiments show that the proposed BSP is superior and complementary to the existing action classification based pre-training counterpart, and achieves new state-of-the-art performance on several temporal localization tasks., Comment: 11 pages, 4 figures
Published: 2020
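
Note on record 222: the abstract proposes synthesizing temporal boundaries in existing action classification videos rather than annotating them by hand. The sketch below shows one simple way such a boundary could be created, by splicing the start of one clip onto the rest of another and recording where the cut falls. The function name `synthesize_boundary` and this single splice scheme are hypothetical; the abstract does not list the paper's boundary types.

```python
import random
from typing import List, Sequence, Tuple


def synthesize_boundary(
    clip_a: Sequence[str],
    clip_b: Sequence[str],
    length: int,
    rng: random.Random,
) -> Tuple[List[str], int]:
    """Build a `length`-frame sample with one artificial boundary.

    Frames before the cut come from clip_a and frames from the cut
    onward come from clip_b. The returned index is the first frame
    after the boundary and can serve as a free training label.
    """
    if length < 2 or min(len(clip_a), len(clip_b)) < length:
        raise ValueError("clips must each contain at least `length` >= 2 frames")
    cut = rng.randint(1, length - 1)  # keep at least one frame per side
    frames = list(clip_a[:cut]) + list(clip_b[cut:length])
    return frames, cut


if __name__ == "__main__":
    rng = random.Random(0)
    running = [f"run_{i}" for i in range(8)]
    jumping = [f"jump_{i}" for i in range(8)]
    frames, cut = synthesize_boundary(running, jumping, 6, rng)
    print(cut, frames)
```

Because the cut position is chosen by the program, the boundary label comes at no annotation cost, which is the property the abstract highlights.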

223. Interplay between superconductivity and the strange-metal state in FeSe

Author: Jiang, Xingyu, Qin, Mingyang, Wei, Xinjian, Xu, Li, Ke, Jiezun, Zhu, Haipeng, Zhang, Ruozhou, Zhao, Zhanyi, Liang, Qimei, Wei, Zhongxu, Lin, Zefeng, Feng, Zhongpei, Chen, Fucong, Xiong, Peiyu, Yuan, Jie, Zhu, Beiyi, Li, Yangmu, Xi, Chuanying, Wang, Zhaosheng, Yang, Ming, Wang, Junfeng, Xiang, Tao, Hu, Jiangping, Jiang, Kun, Chen, Qihong, Jin, Kui, and Zhao, Zhongxian
Published: 2023
Full Text: View/download PDF

224. Triggering heteroatomic interdiffusion in one-pot-oxidation synthesized NiO/CuFeO2 heterojunction photocathodes for efficient solar hydrogen production from water splitting

Author: Han, Fei, Xu, Wei, Jia, Chun-Xu, Chen, Xiang-Tao, Xie, Ying-Peng, Zhen, Chao, and Liu, Gang
Published: 2023
Full Text: View/download PDF

225. Generative adversarial networks with adaptive learning strategy for noise-to-image synthesis

Author: Gan, Yan, Xiang, Tao, Liu, Hangcheng, Ye, Mao, and Zhou, Mingliang
Published: 2023
Full Text: View/download PDF

226. Genetic deletion of phosphodiesterase 4D in the liver improves kidney damage in high-fat fed mice: liver-kidney crosstalk

Author: Xiang Tao, Can Chen, Zheng Huang, Yu Lei, Muru Wang, Shuhui Wang, and Dean Tian
Subjects: Cytology, QH573-671
Abstract: A growing body of epidemiological evidence suggests that nonalcoholic fatty liver disease (NAFLD) is an independent risk factor for chronic kidney disease (CKD), but the regulatory mechanism linking NAFLD and CKD remains unclear. Our previous studies have shown that overexpression of PDE4D in mouse liver is sufficient for NAFLD, but little is known about its role in kidney injury. Here, liver-specific PDE4D conditional knockout (LKO) mice, adeno-associated virus 8 (AAV8)-mediated gene transfer of PDE4D and the PDE4 inhibitor roflumilast were used to assess the involvement of hepatic PDE4D in NAFLD-associated renal injury. We found that mice fed a high-fat diet (HFD) for 16 weeks developed hepatic steatosis and kidney injury, with an associated increase in hepatic PDE4D but no changes in renal PDE4D. Furthermore, liver-specific knockout of PDE4D or pharmacological inhibition of PDE4 with roflumilast ameliorated hepatic steatosis and kidney injury in HFD-fed diabetic mice. Correspondingly, overexpression of hepatic PDE4D resulted in significant renal damage. Mechanistically, highly expressed PDE4D in fatty liver promoted the production and secretion of TGF-β1 into blood, which triggered kidney injury by activating SMADs and subsequent collagen deposition. Our findings revealed PDE4D might act as a critical mediator between NAFLD and associated kidney injury and indicated PDE4 inhibitor roflumilast as a potential therapeutic strategy for NAFLD-associated CKD.
Published: 2023
Full Text: View/download PDF

227. Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models

Author: Guo, Shangwei, Zhang, Tianwei, Qiu, Han, Zeng, Yi, Xiang, Tao, and Liu, Yang
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Watermarking has become the tendency in protecting the intellectual property of DNN models. Recent works, from the adversary's perspective, attempted to subvert watermarking mechanisms by designing watermark removal attacks. However, these attacks mainly adopted sophisticated fine-tuning techniques, which have certain fatal drawbacks or unrealistic assumptions. In this paper, we propose a novel watermark removal attack from a different perspective. Instead of just fine-tuning the watermarked models, we design a simple yet powerful transformation algorithm by combining imperceptible pattern embedding and spatial-level transformations, which can effectively and blindly destroy the memorization of watermarked models to the watermark samples. We also introduce a lightweight fine-tuning strategy to preserve the model performance. Our solution requires much less resource or knowledge about the watermarking scheme than prior works. Extensive experimental results indicate that our attack can bypass state-of-the-art watermarking solutions with very high success rates. Based on our attack, we propose watermark augmentation techniques to enhance the robustness of existing watermarks., Comment: 7 pages, 4 figures, accepted by IJCAI 2021
Published: 2020

228. Resonating valence bond realization of spin-1 non-Abelian chiral spin liquid on the torus

Author: Zhang, Hua-Chen, Wu, Ying-Hai, Tu, Hong-Hao, and Xiang, Tao
Subjects: Condensed Matter - Strongly Correlated Electrons
Abstract: We propose resonating valence bond wave functions for a spin-1 system on the torus that realize a non-Abelian chiral spin liquid. The wave functions take the form of infinite dimensional matrix product states constructed from conformal blocks of the $\mathrm{SO}(3)_{1}$ Wess-Zumino-Witten model. This means that they are lattice analogues of the bosonic Moore-Read state introduced in fractional quantum Hall systems. The topological order of this system is revealed by explicit construction of three-fold degenerate ground states and analytical computation of the modular S and T matrices., Comment: 11 pages, 2 figures
Published: 2020
Full Text: View/download PDF

229. Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

Author: Sain, Aneeshan, Bhunia, Ayan Kumar, Yang, Yongxin, Xiang, Tao, and Song, Yi-Zhe
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval
Abstract: Sketch as an image search query is an ideal alternative to text in capturing the fine-grained visual details. Prior successes on fine-grained sketch-based image retrieval (FG-SBIR) have demonstrated the importance of tackling the unique traits of sketches as opposed to photos, e.g., temporal vs. static, strokes vs. pixels, and abstract vs. pixel-perfect. In this paper, we study a further trait of sketches that has been overlooked to date, that is, they are hierarchical in terms of the levels of detail -- a person typically sketches up to various extents of detail to depict an object. This hierarchical structure is often visually distinct. In this paper, we design a novel network that is capable of cultivating sketch-specific hierarchies and exploiting them to match sketch with photo at corresponding hierarchical levels. In particular, features from a sketch and a photo are enriched using cross-modal co-attention, coupled with hierarchical node fusion at every level to form a better embedding space to conduct retrieval. Experiments on common benchmarks show our method to outperform state-of-the-arts by a significant margin., Comment: Accepted for ORAL presentation in BMVC 2020
Published: 2020

230. On Learning Semantic Representations for Million-Scale Free-Hand Sketches

Author: Xu, Peng, Huang, Yongye, Yuan, Tongtong, Xiang, Tao, Hospedales, Timothy M., Song, Yi-Zhe, and Wang, Liang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this paper, we study learning semantic representations for million-scale free-hand sketches. This is highly challenging due to the domain-unique traits of sketches, e.g., diverse, sparse, abstract, noisy. We propose a dual-branch CNNRNN network architecture to represent sketches, which simultaneously encodes both the static and temporal patterns of sketch strokes. Based on this architecture, we further explore learning the sketch-oriented semantic representations in two challenging yet practical settings, i.e., hashing retrieval and zero-shot recognition on million-scale sketches. Specifically, we use our dual-branch architecture as a universal representation framework to design two sketch-specific deep models: (i) We propose a deep hashing model for sketch retrieval, where a novel hashing loss is specifically designed to accommodate both the abstract and messy traits of sketches. (ii) We propose a deep embedding model for sketch zero-shot recognition, via collecting a large-scale edge-map dataset and proposing to extract a set of semantic vectors from edge-maps as the semantic knowledge for sketch zero-shot domain alignment. Both deep models are evaluated by comprehensive experiments on million-scale sketches and outperform the state-of-the-art competitors., Comment: arXiv admin note: substantial text overlap with arXiv:1804.01401
Published: 2020

231. Learning to Generate Novel Domains for Domain Generalization

Author: Zhou, Kaiyang, Yang, Yongxin, Hospedales, Timothy, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper focuses on domain generalization (DG), the task of learning from multiple source domains a model that generalizes well to unseen domains. A main challenge for DG is that the available source domains often exhibit limited diversity, hampering the model's ability to learn to generalize. We therefore employ a data generator to synthesize data from pseudo-novel domains to augment the source domains. This explicitly increases the diversity of available training domains and leads to a more generalizable model. To train the generator, we model the distribution divergence between source and synthesized pseudo-novel domains using optimal transport, and maximize the divergence. To ensure that semantics are preserved in the synthesized data, we further impose cycle-consistency and classification losses on the generator. Our method, L2A-OT (Learning to Augment by Optimal Transport) outperforms current state-of-the-art DG methods on four benchmark datasets., Comment: ECCV'20
Published: 2020

232. BézierSketch: A generative model for scalable vector sketches

Author: Das, Ayan, Yang, Yongxin, Hospedales, Timothy, Xiang, Tao, and Song, Yi-Zhe
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints. However, this leads to low-resolution image generation, and failure to model long sketches. In this paper we present BézierSketch, a novel generative model for fully vector sketches that are automatically scalable and high-resolution. To this end, we first introduce a novel inverse graphics approach to stroke embedding that trains an encoder to embed each stroke to its best fit Bézier curve. This enables us to treat sketches as short sequences of parameterized strokes and thus train a recurrent sketch generator with greater capacity for longer sketches, while producing scalable high-resolution results. We report qualitative and quantitative results on the Quick, Draw! benchmark., Comment: Accepted as poster at ECCV 2020
Published: 2020

233. Egocentric Action Recognition by Video Attention and Temporal Context

Author: Perez-Rua, Juan-Manuel, Toisoul, Antoine, Martinez, Brais, Escorcia, Victor, Zhang, Li, Zhu, Xiatian, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present the submission of Samsung AI Centre Cambridge to the CVPR2020 EPIC-Kitchens Action Recognition Challenge. In this challenge, action recognition is posed as the problem of simultaneously predicting a single 'verb' and 'noun' class label given an input trimmed video clip. That is, a 'verb' and a 'noun' together define a compositional 'action' class. The challenging aspects of this real-life action recognition task include small fast moving objects, complex hand-object interactions, and occlusions. At the core of our submission is a recently-proposed spatial-temporal video attention model, called 'W3' ('What-Where-When') attention [perez2020knowing]. We further introduce a simple yet effective contextual learning mechanism to model 'action' class scores directly from long-term temporal behaviour based on the 'verb' and 'noun' prediction scores. Our solution achieves strong performance on the challenge metrics without using object-specific reasoning or extra training data. In particular, our best solution with multimodal ensemble achieves the 2nd best position for 'verb', and 3rd best for 'noun' and 'action' on the Seen Kitchens test set., Comment: EPIC-Kitchens challenges@CVPR 2020
Published: 2020

234. Topology-aware Differential Privacy for Decentralized Image Classification

Author: Guo, Shangwei, Zhang, Tianwei, Xu, Guowen, Yu, Han, Xiang, Tao, and Liu, Yang
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: In this paper, we design Top-DP, a novel solution to optimize the differential privacy protection of decentralized image classification systems. The key insight of our solution is to leverage the unique features of decentralized communication topologies to reduce the noise scale and improve the model usability. (1) We enhance the DP-SGD algorithm with this topology-aware noise reduction strategy, and integrate the time-aware noise decay technique. (2) We design two novel learning protocols (synchronous and asynchronous) to protect systems with different network connectivities and topologies. We formally analyze and prove the DP requirement of our proposed solutions. Experimental evaluations demonstrate that our solution achieves a better trade-off between usability and privacy than prior works. To the best of our knowledge, this is the first DP optimization work from the perspective of network topologies., Comment: Accepted by TCSVT
Published: 2020

235. Long-Term Cloth-Changing Person Re-identification

Author: Qian, Xuelin, Wang, Wenxuan, Zhang, Li, Zhu, Fangrui, Fu, Yanwei, Xiang, Tao, Jiang, Yu-Gang, and Xue, Xiangyang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Person re-identification (Re-ID) aims to match a target person across camera views at different locations and times. Existing Re-ID studies focus on the short-term cloth-consistent setting, under which a person re-appears in different camera views with the same outfit. A discriminative feature representation learned by existing deep Re-ID models is thus dominated by the visual appearance of clothing. In this work, we focus on a much more difficult yet practical setting where person matching is conducted over long-duration, e.g., over days and months and therefore inevitably under the new challenge of changing clothes. This problem, termed Long-Term Cloth-Changing (LTCC) Re-ID is much understudied due to the lack of large scale datasets. The first contribution of this work is a new LTCC dataset containing people captured over a long period of time with frequent clothing changes. As a second contribution, we propose a novel Re-ID method specifically designed to address the cloth-changing challenge. Specifically, we consider that under cloth-changes, soft-biometrics such as body shape would be more reliable. We, therefore, introduce a shape embedding module as well as a cloth-elimination shape-distillation module aiming to eliminate the now unreliable clothing appearance features and focus on the body shape information. Extensive experiments show that superior performance is achieved by the proposed model on the new LTCC dataset. The code and dataset will be available at https://naiq.github.io/LTCC_Perosn_ReID.html., Comment: ACCV 2020 Oral
Published: 2020

236. Tunable giant magnetoresistance in a single-molecule junction

Author: Yang, Kai, Chen, Hui, Pope, Thomas, Hu, Yibin, Liu, Liwei, Wang, Dongfei, Tao, Lei, Xiao, Wende, Fei, Xiangmin, Zhang, Yu-Yang, Luo, Hong-Gang, Du, Shixuan, Xiang, Tao, Hofer, Werner A., and Gao, Hong-Jun
Subjects: Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Controlling electronic transport through a single-molecule junction is crucial for molecular electronics or spintronics. In magnetic molecular devices, the spin degree-of-freedom can be used to this end since the magnetic properties of the magnetic ion centers fundamentally impact the transport through the molecules. Here we demonstrate that the electron pathway in a single-molecule device can be selected between two molecular orbitals by varying a magnetic field, giving rise to a tunable anisotropic magnetoresistance up to 93%. The unique tunability of the electron pathways is due to the magnetic reorientation of the transition metal center, resulting in a re-hybridization of molecular orbitals. We obtain the tunneling electron pathways by Kondo effect, which manifests either as a peak or a dip line shape. The energy changes of these spin-reorientations are remarkably low and less than one millielectronvolt. The large tunable anisotropic magnetoresistance could be used to control electronic transport in molecular spintronics.
Published: 2020
Full Text: View/download PDF

237. Correlation between Fermi surface reconstruction and superconductivity in pressurized FeTe0.55Se0.45

Author: Lin, Gongchang, Zhu, Jing Guo Yanglin, Cai, Shu, Zhou, Yazhou, Huang, Cheng, Yang, Chongli, Long, Sijin, Wu, Qi, Mao, Zhiqiang, Xiang, Tao, and Sun, Liling
Subjects: Condensed Matter - Superconductivity
Abstract: Here we report the first results of the high-pressure Hall coefficient (RH) measurements, combined with the high-pressure resistance measurements, at different temperatures on the putative topological superconductor FeTe0.55Se0.45. We find an intimate correlation of the sign change of RH, a fingerprint of the reconstruction of the Fermi surface, with structural phase transition and superconductivity. Below the critical pressure (PC) of 2.7 GPa, our data reveal that the hole-electron carriers are thermally balanced (RH=0) at a critical temperature (T*), where RH changes its sign from positive to negative, and concurrently a tetragonal-orthorhombic phase transition takes place. Within the pressure range from 1 bar to PC, T* is continuously suppressed by pressure, while TC increases monotonically. At about PC, T* is indistinguishable and TC reaches a maximum value. Moreover, a pressure-induced sign change of RH is found at ~PC where the orthorhombic-monoclinic phase transition occurs. With further compression, TC decreases and disappears at ~12 GPa. The correlation among the electron-hole balance, crystal structure and superconductivity found in the pressurized FeTe0.55Se0.45 implies that its nontrivial superconductivity is closely associated with its exotic normal state resulting from the interplay between the reconstruction of the Fermi surface and the change of the structural lattice., Comment: 15 pages and 3 figures
Published: 2020
Full Text: View/download PDF

238. Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention

Author: Perez-Rua, Juan-Manuel, Martinez, Brais, Zhu, Xiatian, Toisoul, Antoine, Escorcia, Victor, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Attentive video modeling is essential for action recognition in unconstrained videos due to their rich yet redundant information over space and time. However, introducing attention in a deep neural network for action recognition is challenging for two reasons. First, an effective attention module needs to learn what (objects and their local motion patterns), where (spatially), and when (temporally) to focus on. Second, a video attention module must be efficient because existing action recognition models already suffer from high computational cost. To address both challenges, a novel What-Where-When (W3) video attention module is proposed. Departing from existing alternatives, our W3 module models all three facets of video attention jointly. Crucially, it is extremely efficient by factorizing the high-dimensional video feature data into low-dimensional meaningful spaces (1D channel vector for 'what' and 2D spatial tensors for 'where'), followed by lightweight temporal attention reasoning. Extensive experiments show that our attention model brings significant improvements to existing action recognition models, achieving new state-of-the-art performance on a number of benchmarks.
Published: 2020

239. Domain-Adaptive Few-Shot Learning

Author: Zhao, An, Ding, Mingyu, Lu, Zhiwu, Xiang, Tao, Niu, Yulei, Guan, Jiechao, Wen, Ji-Rong, and Luo, Ping
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Existing few-shot learning (FSL) methods make the implicit assumption that the few target class samples are from the same domain as the source class samples. However, in practice this assumption is often invalid -- the target classes could come from a different domain. This poses an additional challenge of domain adaptation (DA) with few training samples. In this paper, the problem of domain-adaptive few-shot learning (DA-FSL) is tackled, which requires solving FSL and DA in a unified framework. To this end, we propose a novel domain-adversarial prototypical network (DAPN) model. It is designed to address a specific challenge in DA-FSL: the DA objective means that the source and target data distributions need to be aligned, typically through a shared domain-adaptive feature embedding space; but the FSL objective dictates that the target domain per class distribution must be different from that of any source domain class, meaning aligning the distributions across domains may harm the FSL performance. How to achieve global domain distribution alignment whilst maintaining source/target per-class discriminativeness thus becomes the key. Our solution is to explicitly enhance the source/target per-class separation before domain-adaptive feature embedding learning in the DAPN, in order to alleviate the negative effect of domain alignment on FSL. Extensive experiments show that our DAPN outperforms the state-of-the-art FSL and DA models, as well as their naïve combinations. The code is available at https://github.com/dingmyu/DAPN.
Published: 2020

240. Domain Adaptive Ensemble Learning

Author: Zhou, Kaiyang, Yang, Yongxin, Qiao, Yu, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The problem of generalizing deep neural networks from multiple source domains to a target one is studied under two settings: When unlabeled target data is available, it is a multi-source unsupervised domain adaptation (UDA) problem, otherwise a domain generalization (DG) problem. We propose a unified framework termed domain adaptive ensemble learning (DAEL) to address both problems. A DAEL model is composed of a CNN feature extractor shared across domains and multiple classifier heads each trained to specialize in a particular source domain. Each such classifier is an expert to its own domain and a non-expert to others. DAEL aims to learn these experts collaboratively so that when forming an ensemble, they can leverage complementary information from each other to be more effective for an unseen target domain. To this end, each source domain is used in turn as a pseudo-target-domain with its own expert providing supervisory signal to the ensemble of non-experts learned from the other sources. For unlabeled target data under the UDA setting where real expert does not exist, DAEL uses pseudo-label to supervise the ensemble learning. Extensive experiments on three multi-source UDA datasets and two DG datasets show that DAEL improves the state of the art on both problems, often by significant margins. The code is released at https://github.com/KaiyangZhou/Dassl.pytorch., Comment: Accepted at TIP
Published: 2020
Full Text: View/download PDF

241. Deep Domain-Adversarial Image Generation for Domain Generalisation

Author: Zhou, Kaiyang, Yang, Yongxin, Hospedales, Timothy, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Machine learning models typically suffer from the domain shift problem when trained on a source dataset and evaluated on a target dataset of different distribution. To overcome this problem, domain generalisation (DG) methods aim to leverage data from multiple source domains so that a trained model can generalise to unseen domains. In this paper, we propose a novel DG approach based on Deep Domain-Adversarial Image Generation (DDAIG). Specifically, DDAIG consists of three components, namely a label classifier, a domain classifier and a domain transformation network (DoTNet). The goal for DoTNet is to map the source training data to unseen domains. This is achieved by having a learning objective formulated to ensure that the generated data can be correctly classified by the label classifier while fooling the domain classifier. By augmenting the source training data with the generated unseen domain data, we can make the label classifier more robust to unknown domain changes. Extensive experiments on four DG datasets demonstrate the effectiveness of our approach., Comment: 8 pages
Published: 2020

242. Incremental Few-Shot Object Detection

Author: Perez-Rua, Juan-Manuel, Zhu, Xiatian, Hospedales, Timothy, and Xiang, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Most existing object detection methods rely on the availability of abundant labelled training samples per class and offline model training in a batch mode. These requirements substantially limit their scalability to open-ended accommodation of novel classes with limited labelled training data. We present a study aiming to go beyond these limitations by considering the Incremental Few-Shot Detection (iFSD) problem setting, where new classes must be registered incrementally (without revisiting base classes) and with few examples. To this end we propose OpeN-ended Centre nEt (ONCE), a detector designed for incrementally learning to detect novel class objects with few examples. This is achieved by an elegant adaptation of the CentreNet detector to the few-shot learning scenario, and meta-learning a class-specific code generator model for registering novel classes. ONCE fully respects the incremental learning paradigm, with novel class registration requiring only a single forward pass of few-shot training samples, and no access to base classes -- thus making it suitable for deployment on embedded devices. Extensive experiments conducted on both the standard object detection and fashion landmark detection tasks show the feasibility of iFSD for the first time, opening an interesting and very important line of research., Comment: CVPR 2020
Published: 2020

243. AdarGCN: Adaptive Aggregation GCN for Few-Shot Learning

Author: Zhang, Jianhong, Zhang, Manli, Lu, Zhiwu, Xiang, Tao, and Wen, Jirong
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Existing few-shot learning (FSL) methods assume that there exist sufficient training samples from source classes for knowledge transfer to target classes with few training samples. However, this assumption is often invalid, especially when it comes to fine-grained recognition. In this work, we define a new FSL setting termed few-shot few-shot learning (FSFSL), under which both the source and target classes have limited training samples. To overcome the source class data scarcity problem, a natural option is to crawl images from the web with class names as search keywords. However, the crawled images are inevitably corrupted by a large amount of noise (irrelevant images) and thus may harm the performance. To address this problem, we propose a graph convolutional network (GCN)-based label denoising (LDN) method to remove the irrelevant images. Further, with the cleaned web images as well as the original clean training images, we propose a GCN-based FSL method. For both the LDN and FSL tasks, a novel adaptive aggregation GCN (AdarGCN) model is proposed, which differs from existing GCN models in that adaptive aggregation is performed based on a multi-head multi-level aggregation module. With AdarGCN, how much and how far information carried by each graph node is propagated in the graph structure can be determined automatically, therefore alleviating the effects of both noisy and outlying training samples. Extensive experiments show the superior performance of our AdarGCN under both the new FSFSL and the conventional FSL settings., Comment: The code is at github - https://github.com/RiceZJH/AdarGCN
Published: 2020

244. Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval

Author: Bhunia, Ayan Kumar, Yang, Yongxin, Hospedales, Timothy M., Xiang, Tao, and Song, Yi-Zhe
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch. Its widespread applicability is however hindered by the fact that drawing a sketch takes time, and most people struggle to draw a complete and faithful sketch. In this paper, we reformulate the conventional FG-SBIR framework to tackle these challenges, with the ultimate goal of retrieving the target photo with the least number of strokes possible. We further propose an on-the-fly design that starts retrieving as soon as the user starts drawing. To accomplish this, we devise a reinforcement learning-based cross-modal retrieval framework that directly optimizes rank of the ground-truth photo over a complete sketch drawing episode. Additionally, we introduce a novel reward scheme that circumvents the problems related to irrelevant sketch strokes, and thus provides us with a more consistent rank list during the retrieval. We achieve superior early-retrieval efficiency over state-of-the-art methods and alternative baselines on two publicly available fine-grained sketch retrieval datasets., Comment: IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2020 [Oral Presentation] Code: https://github.com/AyanKumarBhunia/on-the-fly-FGSBIR
Published: 2020

245. Fine-Grained Instance-Level Sketch-Based Video Retrieval

Author: Xu, Peng, Liu, Kun, Xiang, Tao, Hospedales, Timothy M., Ma, Zhanyu, Guo, Jun, and Song, Yi-Zhe
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Existing sketch-analysis work studies sketches depicting static objects or scenes. In this work, we propose a novel cross-modal retrieval problem of fine-grained instance-level sketch-based video retrieval (FG-SBVR), where a sketch sequence is used as a query to retrieve a specific target video instance. Compared with sketch-based still image retrieval, and coarse-grained category-level video retrieval, this is more challenging as both visual appearance and motion need to be simultaneously matched at a fine-grained level. We contribute the first FG-SBVR dataset with rich annotations. We then introduce a novel multi-stream multi-modality deep network to perform FG-SBVR under both strong and weakly supervised settings. The key component of the network is a relation module, designed to prevent model over-fitting given scarce training data. We show that this model significantly outperforms a number of existing state-of-the-art models designed for video analysis.
Published: 2020

246. Byzantine-resilient Decentralized Stochastic Gradient Descent

Author: Guo, Shangwei, Zhang, Tianwei, Yu, Han, Xie, Xiaofei, Ma, Lei, Xiang, Tao, and Liu, Yang
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Decentralized learning has gained great popularity to improve learning efficiency and preserve data privacy. Each computing node makes equal contribution to collaboratively learn a Deep Learning model. The elimination of centralized Parameter Servers (PS) can effectively address many issues such as privacy, performance bottleneck and single-point-failure. However, how to achieve Byzantine Fault Tolerance in decentralized learning systems is rarely explored, although this problem has been extensively studied in centralized systems. In this paper, we present an in-depth study towards the Byzantine resilience of decentralized learning systems with two contributions. First, from the adversarial perspective, we theoretically illustrate that Byzantine attacks are more dangerous and feasible in decentralized learning systems: even one malicious participant can arbitrarily alter the models of other participants by sending carefully crafted updates to its neighbors. Second, from the defense perspective, we propose UBAR, a novel algorithm to enhance decentralized learning with Byzantine Fault Tolerance. Specifically, UBAR provides a Uniform Byzantine-resilient Aggregation Rule for benign nodes to select the useful parameter updates and filter out the malicious ones in each training iteration. It guarantees that each benign node in a decentralized system can train a correct model under very strong Byzantine attacks with an arbitrary number of faulty nodes. We conduct extensive experiments on standard image classification tasks and the results indicate that UBAR can effectively defeat both simple and sophisticated Byzantine attacks with higher performance efficiency than existing solutions.
Published: 2020

247. Meta-Learning across Meta-Tasks for Few-Shot Learning

Author: Fei, Nanyi, Lu, Zhiwu, Gao, Yizhao, Tian, Jia, Xiang, Tao, and Wen, Ji-Rong
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Existing meta-learning based few-shot learning (FSL) methods typically adopt an episodic training strategy whereby each episode contains a meta-task. Across episodes, these tasks are sampled randomly and their relationships are ignored. In this paper, we argue that inter-meta-task relationships should be exploited and that tasks should be sampled strategically to assist in meta-learning. Specifically, we consider the relationships defined over two types of meta-task pairs and propose different strategies to exploit them. (1) Two meta-tasks with disjoint sets of classes: this pair is interesting because it is reminiscent of the relationship between the source seen classes and target unseen classes, characterized by a domain gap caused by class differences. A novel learning objective termed meta-domain adaptation (MDA) is proposed to make the meta-learned model more robust to the domain gap. (2) Two meta-tasks with identical sets of classes: this pair is useful because it can be employed to learn models that are robust against poorly sampled few-shots. To that end, a novel meta-knowledge distillation (MKD) objective is formulated. Comment: There are some mistakes in the experiments. We thus choose to withdraw this paper.
Published: 2020

248. Few-Shot Learning as Domain Adaptation: Algorithm and Analysis

Author: Guan, Jiechao, Lu, Zhiwu, Xiang, Tao, and Wen, Ji-Rong
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: To recognize unseen classes with only a few samples, few-shot learning (FSL) uses prior knowledge learned from the seen classes. A major challenge for FSL is that the distribution of the unseen classes differs from that of the seen classes, resulting in poor generalization even when a model is meta-trained on the seen classes. This class-difference-caused distribution shift can be considered a special case of domain shift. In this paper, for the first time, we propose a domain adaptation prototypical network with attention (DAPNA) to explicitly tackle such a domain shift problem in a meta-learning framework. Specifically, armed with a set transformer based attention module, we construct each episode with two sub-episodes without class overlap on the seen classes to simulate the domain shift between the seen and unseen classes. To align the feature distributions of the two sub-episodes with limited training samples, a feature transfer network is employed together with a margin disparity discrepancy (MDD) loss. Importantly, theoretical analysis is provided to give the learning bound of our DAPNA. Extensive experiments show that our DAPNA outperforms the state-of-the-art FSL alternatives, often by significant margins. Comment: There exist some mistakes in the experiments.
Published: 2020

249. Deep Learning for Person Re-identification: A Survey and Outlook

Author: Ye, Mang, Shen, Jianbing, Lin, Gaojie, Xiang, Tao, Shao, Ling, and Hoi, Steven C. H.
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras. With the advancement of deep neural networks and the increasing demand for intelligent video surveillance, it has gained significantly increased interest in the computer vision community. By dissecting the components involved in developing a person Re-ID system, we categorize it into the closed-world and open-world settings. The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets. We first conduct a comprehensive overview with in-depth analysis for closed-world person Re-ID from three different perspectives: deep feature representation learning, deep metric learning and ranking optimization. With the performance saturation under the closed-world setting, the research focus for person Re-ID has recently shifted to the open-world setting, which faces more challenging issues. This setting is closer to practical applications under specific scenarios. We summarize open-world Re-ID in terms of five different aspects. By analyzing the advantages of existing methods, we design a powerful AGW baseline, achieving state-of-the-art or at least comparable performance on twelve datasets for four different Re-ID tasks. Meanwhile, we introduce a new evaluation metric (mINP) for person Re-ID, indicating the cost of finding all the correct matches, which provides an additional criterion to evaluate the Re-ID system for real applications. Finally, some important yet under-investigated open issues are discussed. Comment: 20 pages, 8 figures. Accepted by IEEE TPAMI
Published: 2020

250. Deep Learning for Free-Hand Sketch: A Survey

Author: Xu, Peng, Hospedales, Timothy M., Yin, Qiyue, Song, Yi-Zhe, Xiang, Tao, and Wang, Liang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics, Computer Science - Machine Learning
Abstract: Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present. The recent prevalence of touchscreen devices has made sketch creation a much easier task than ever and consequently made sketch-oriented applications increasingly popular. The progress of deep learning has immensely benefited free-hand sketch research and applications. This paper presents a comprehensive survey of the deep learning techniques oriented at free-hand sketch data, and the applications that they enable. The main contents of this survey include: (i) A discussion of the intrinsic traits and unique challenges of free-hand sketch, to highlight the essential differences between sketch data and other data modalities, e.g., natural photos. (ii) A review of the developments of free-hand sketch research in the deep learning era, by surveying existing datasets, research topics, and the state-of-the-art methods through a detailed taxonomy and experimental evaluation. (iii) Promotion of future work via a discussion of bottlenecks, open problems, and potential research directions for the community. Comment: This paper is accepted by IEEE TPAMI
Published: 2020
