26 results on '"Zhang Liwen"'
Search Results
2. Technique Report of CVPR 2024 PBDL Challenges
- Author
Fu, Ying, Li, Yu, You, Shaodi, Shi, Boxin, Chen, Linwei, Zou, Yunhao, Wang, Zichun, Li, Yichen, Han, Yuze, Zhang, Yingkai, Wang, Jianan, Liu, Qinglin, Yu, Wei, Lv, Xiaoqian, Li, Jianing, Zhang, Shengping, Ji, Xiangyang, Chen, Yuanpei, Zhang, Yuhan, Peng, Weihang, Zhang, Liwen, Xu, Zhe, Gou, Dingyong, Li, Cong, Xu, Senyan, Zhang, Yunkang, Jiang, Siyuan, Lu, Xiaoqiang, Jiao, Licheng, Liu, Fang, Liu, Xu, Li, Lingling, Ma, Wenping, Yang, Shuyuan, Xie, Haiyang, Zhao, Jian, Huang, Shihua, Cheng, Peng, Shen, Xi, Wang, Zheng, An, Shuai, Zhu, Caizhi, Li, Xuelong, Zhang, Tao, Li, Liang, Liu, Yu, Yan, Chenggang, Zhang, Gengchen, Jiang, Linyan, Song, Bingyi, An, Zhuoyu, Lei, Haibo, Luo, Qing, Song, Jie, Liu, Yuan, Li, Qihang, Zhang, Haoyuan, Wang, Lingfeng, Chen, Wei, Luo, Aling, Li, Cheng, Cao, Jun, Chen, Shu, Dou, Zifei, Liu, Xinyu, Zhang, Jing, Zhang, Kexin, Yang, Yuting, Gou, Xuejian, Wang, Qinliang, Liu, Yang, Zhao, Shizhan, Zhang, Yanzhao, Yan, Libo, Guo, Yuwei, Li, Guoxin, Gao, Qiong, Che, Chenyue, Sun, Long, Chen, Xiang, Li, Hao, Pan, Jinshan, Xie, Chuanlong, Chen, Hongming, Li, Mingrui, Deng, Tianchen, Huang, Jingwei, Li, Yufeng, Wan, Fei, Xu, Bingxin, Cheng, Jian, Liu, Hongzhe, Xu, Cheng, Zou, Yuxiang, Pan, Weiguo, Dai, Songyin, Jia, Sen, Zhang, Junpei, and Chen, Puhua
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held at the CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.
- Comment
CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html
- Published
- 2024
3. TripleSurv: Triplet Time-adaptive Coordinate Loss for Survival Analysis
- Author
Zhang, Liwen, Zhong, Lianzhen, Yang, Fan, Dong, Di, Hui, Hui, and Tian, Jie
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
- Abstract
A core challenge in survival analysis is to model the distribution of censored time-to-event data, where the event of interest may be a death, failure, or occurrence of a specific event. Previous studies have shown that ranking and maximum likelihood estimation (MLE) loss functions are widely used for survival analysis. However, ranking loss only focuses on the ranking of survival time and does not consider the potential effect of samples for exact survival time values. Furthermore, the MLE is unbounded and easily subject to outliers (e.g., censored data), which may cause poor modeling performance. To handle the complexities of the learning process and exploit valuable survival time values, we propose a time-adaptive coordinate loss function, TripleSurv, to achieve adaptive adjustments by introducing the differences in the survival time between sample pairs into the ranking, which can encourage the model to quantitatively rank the relative risk of pairs, ultimately enhancing the accuracy of predictions. Most importantly, TripleSurv is proficient in quantifying the relative risk between samples by ranking the ordering of pairs, and considers the time interval as a trade-off to calibrate the robustness of the model over the sample distribution. Our TripleSurv is evaluated on three real-world survival datasets and a public synthetic dataset. The results show that our method outperforms the state-of-the-art methods and exhibits good model performance and robustness on modeling various sophisticated data distributions with different censor rates. Our code will be available upon acceptance.
- Comment
9 pages, 6 figures
- Published
- 2024
4. YAYI 2: Multilingual Open-Source Large Language Models
- Author
Luo, Yin, Kong, Qingchao, Xu, Nan, Cao, Jia, Hao, Bao, Qu, Baoyu, Chen, Bo, Zhu, Chao, Zhao, Chenyang, Zhang, Donglei, Feng, Fan, Zhao, Feifei, Sun, Hailong, Yang, Hanxuan, Pan, Haojun, Liu, Hongyu, Guo, Jianbin, Du, Jiangtao, Wang, Jingyi, Li, Junfeng, Sun, Lei, Liu, Liduo, Dong, Lifeng, Liu, Lili, Wang, Lin, Zhang, Liwen, Wang, Minzheng, Wang, Pin, Yu, Ping, Li, Qingxiao, Yan, Rui, Zou, Rui, Li, Ruiqun, Huang, Taiwen, Wang, Xiaodong, Wu, Xiaofei, Peng, Xin, Zhang, Xina, Fang, Xing, Xiao, Xinglin, Hao, Yanni, Dong, Yao, Wang, Yigang, Liu, Ying, Jiang, Yongyu, Wang, Yungan, Wang, Yuqi, Wang, Zhangsheng, Yu, Zhaoxin, Luo, Zhen, Mao, Wenji, Wang, Lei, and Zeng, Dajun
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
As the latest advancements in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and have even been regarded as a potential path to artificial general intelligence. To better facilitate research on LLMs, many open-source LLMs, such as Llama 2 and Falcon, have recently been proposed and have achieved performance comparable to proprietary models. However, these models are primarily designed for English scenarios and exhibit poor performance in Chinese contexts. In this technical report, we propose YAYI 2, including both base and chat models, with 30 billion parameters. YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline. The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback. Extensive experiments on multiple benchmarks, such as MMLU and CMMLU, consistently demonstrate that the proposed YAYI 2 outperforms other similarly sized open-source models.
- Published
- 2023
5. FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models
- Author
Zhang, Liwen, Cai, Weige, Liu, Zhaowei, Yang, Zhi, Dai, Wei, Liao, Yujie, Qin, Qianru, Li, Yifei, Liu, Xingyu, Liu, Zhiqiang, Zhu, Zhoufan, Wu, Anbo, Guo, Xin, and Chen, Yun
- Subjects
Computer Science - Computation and Language
- Abstract
Large language models (LLMs) have demonstrated exceptional performance in various natural language processing tasks, yet their efficacy in more challenging and domain-specific tasks remains largely unexplored. This paper presents FinEval, a benchmark specifically designed for evaluating financial domain knowledge in LLMs. FinEval is a collection of high-quality multiple-choice questions covering Finance, Economy, Accounting, and Certificate. It includes 4,661 questions spanning 34 different academic subjects. To ensure a comprehensive model performance evaluation, FinEval employs a range of prompt types, including zero-shot and few-shot prompts, as well as answer-only and chain-of-thought prompts. Evaluating state-of-the-art Chinese and English LLMs on FinEval, the results show that only GPT-4 achieved an accuracy close to 70% in different prompt settings, indicating significant growth potential for LLMs in financial domain knowledge. Our work offers a more comprehensive financial knowledge evaluation benchmark, utilizing data from mock exams and covering a wide range of evaluated LLMs.
- Published
- 2023
6. Membrane Potential Batch Normalization for Spiking Neural Networks
- Author
Guo, Yufei, Zhang, Yuhan, Chen, Yuanpei, Peng, Weihang, Liu, Xiaode, Zhang, Liwen, Huang, Xuhui, and Ma, Zhe
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
As one of the energy-efficient alternatives to conventional neural networks (CNNs), spiking neural networks (SNNs) have gained more and more interest recently. To train deep models, some effective batch normalization (BN) techniques have been proposed for SNNs. All these BNs are suggested to be used after the convolution layer, as is usually done in CNNs. However, the spiking neuron is much more complex, with spatio-temporal dynamics. The regulated data flow after the BN layer will be disturbed again by the membrane potential updating operation before the firing function, i.e., the nonlinear activation. Therefore, we advocate adding another BN layer before the firing function to normalize the membrane potential again, called MPBN. To eliminate the induced time cost of MPBN, we also propose a training-inference-decoupled re-parameterization technique to fold the trained MPBN into the firing threshold. With the re-parameterization technique, the MPBN will not introduce any extra time burden at inference. Furthermore, the MPBN can also adopt the element-wise form, while the BNs after the convolution layer can only use the channel-wise form. Experimental results show that the proposed MPBN performs well on both popular non-spiking static and neuromorphic datasets. Our code is open-sourced at https://github.com/yfguo91/MPBN.
- Comment
Accepted by ICCV 2023
- Published
- 2023
7. RMP-Loss: Regularizing Membrane Potential Distribution for Spiking Neural Networks
- Author
Guo, Yufei, Liu, Xiaode, Chen, Yuanpei, Zhang, Liwen, Peng, Weihang, Zhang, Yuhan, Huang, Xuhui, and Ma, Zhe
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Spiking Neural Networks (SNNs), as one of the biology-inspired models, have received much attention recently. They can significantly reduce energy consumption since they quantize the real-valued membrane potentials to 0/1 spikes to transmit information, so the multiplications of activations and weights can be replaced by additions when implemented on hardware. However, this quantization mechanism will inevitably introduce quantization error, thus causing catastrophic information loss. To address the quantization error problem, we propose a regularizing membrane potential loss (RMP-Loss) to adjust the distribution, which is directly related to quantization error, to a range close to the spikes. Our method is extremely simple to implement and straightforward to train an SNN. Furthermore, it is shown to consistently outperform previous state-of-the-art methods over different network architectures and datasets.
- Comment
Accepted by ICCV 2023
- Published
- 2023
8. Variance Control for Distributional Reinforcement Learning
- Author
Kuang, Qi, Zhu, Zhoufan, Zhang, Liwen, and Zhou, Fan
- Subjects
Computer Science - Machine Learning
- Abstract
Although distributional reinforcement learning (DRL) has been widely examined in the past few years, very few studies investigate the validity of the obtained Q-function estimator in the distributional setting. To fully understand how the approximation errors of the Q-function affect the whole training process, we perform an error analysis and theoretically show how to reduce both the bias and the variance of the error terms. With this new understanding, we construct a new estimator, Quantiled Expansion Mean (QEM), and introduce a new DRL algorithm (QEMRL) from the statistical perspective. We extensively evaluate our QEMRL algorithm on a variety of Atari and MuJoCo benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance.
- Comment
ICML 2023
- Published
- 2023
9. InfLoR-SNN: Reducing Information Loss for Spiking Neural Networks
- Author
Guo, Yufei, Chen, Yuanpei, Zhang, Liwen, Liu, Xiaode, Tong, Xinyi, Ou, Yuanyuan, Huang, Xuhui, and Ma, Zhe
- Subjects
Computer Science - Neural and Evolutionary Computing, Computer Science - Computer Vision and Pattern Recognition
- Abstract
The Spiking Neural Network (SNN) has attracted more and more attention recently. It adopts binary spike signals to transmit information. Benefitting from the information passing paradigm of SNNs, the multiplications of activations and weights can be replaced by additions, which are more energy-efficient. However, its "Hard Reset" mechanism for the firing activity would ignore the difference among membrane potentials when the membrane potential is above the firing threshold, causing information loss. Meanwhile, quantizing the membrane potential to 0/1 spikes at the firing instants will inevitably introduce quantization error, thus bringing about information loss too. To address these problems, we propose to use the "Soft Reset" mechanism for supervised training-based SNNs, which will drive the membrane potential to a dynamic reset potential according to its magnitude, and a Membrane Potential Rectifier (MPR) to reduce the quantization error by redistributing the membrane potential to a range close to the spikes. Results show that the SNNs with the "Soft Reset" mechanism and MPR outperform their vanilla counterparts on both static and dynamic datasets.
- Comment
Accepted by ECCV 2022
- Published
- 2023
10. Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization
- Author
Gao, Pengzhi, Zhang, Liwen, He, Zhongjun, Wu, Hua, and Wang, Haifeng
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Multilingual sentence representations are the foundation for similarity-based bitext mining, which is crucial for scaling multilingual neural machine translation (NMT) systems to more languages. In this paper, we introduce MuSR: a one-for-all Multilingual Sentence Representation model that supports more than 220 languages. Leveraging billions of English-centric parallel corpora, we train a multilingual Transformer encoder, coupled with an auxiliary Transformer decoder, by adopting a multilingual NMT framework with CrossConST, a cross-lingual consistency regularization technique proposed in Gao et al. (2023). Experimental results on multilingual similarity search and bitext mining tasks show the effectiveness of our approach. Specifically, MuSR achieves superior performance over LASER3 (Heffernan et al., 2022), which consists of 148 independent multilingual sentence encoders.
- Published
- 2023
11. Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency Regularization
- Author
Gao, Pengzhi, Zhang, Liwen, He, Zhongjun, Wu, Hua, and Wang, Haifeng
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
The multilingual neural machine translation (NMT) model has a promising capability of zero-shot translation, where it could directly translate between language pairs unseen during training. For good transfer performance from supervised directions to zero-shot directions, the multilingual NMT model is expected to learn universal representations across different languages. This paper introduces a cross-lingual consistency regularization, CrossConST, to bridge the representation gap among different languages and boost zero-shot translation performance. The theoretical analysis shows that CrossConST implicitly maximizes the probability distribution for zero-shot translation, and the experimental results on both low-resource and high-resource benchmarks show that CrossConST consistently improves the translation performance. The experimental analysis also proves that CrossConST could close the sentence representation gap and better align the representation space. Given the universality and simplicity of CrossConST, we believe it can serve as a strong baseline for future multilingual NMT research.
- Comment
Accepted to Findings of ACL 2023
- Published
- 2023
12. Joint A-SNN: Joint Training of Artificial and Spiking Neural Networks via Self-Distillation and Weight Factorization
- Author
Guo, Yufei, Peng, Weihang, Chen, Yuanpei, Zhang, Liwen, Liu, Xiaode, Huang, Xuhui, and Ma, Zhe
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Having emerged as a biology-inspired method, Spiking Neural Networks (SNNs) mimic the spiking nature of brain neurons and have received a lot of research attention. SNNs deal with binary spikes as their activation and therefore achieve extreme energy efficiency on hardware. However, this also leads to an intrinsic obstacle: training SNNs from scratch requires a re-definition of the firing function for computing the gradient. Artificial Neural Networks (ANNs), however, are fully differentiable and can be trained with gradient descent. In this paper, we propose a joint training framework of ANN and SNN, in which the ANN can guide the SNN's optimization. This joint framework contains two parts: First, the knowledge inside the ANN is distilled to the SNN by using multiple branches from the networks. Second, we restrict the parameters of the ANN and SNN, where they share partial parameters and learn different singular weights. Extensive experiments over several widely used network structures show that our method consistently outperforms many other state-of-the-art training methods. For example, on the CIFAR100 classification task, the spiking ResNet-18 model trained by our method can reach 77.39% top-1 accuracy with only 4 time steps.
- Comment
Accepted by Pattern Recognition
- Published
- 2023
13. Real Spike: Learning Real-valued Spikes for Spiking Neural Networks
- Author
Guo, Yufei, Zhang, Liwen, Chen, Yuanpei, Tong, Xinyi, Liu, Xiaode, Wang, YingLei, Huang, Xuhui, and Ma, Zhe
- Subjects
Computer Science - Neural and Evolutionary Computing, Computer Science - Machine Learning
- Abstract
Brain-inspired spiking neural networks (SNNs) have recently drawn more and more attention due to their event-driven and energy-efficient characteristics. The integrated storage-and-computation paradigm on neuromorphic hardware makes SNNs much different from Deep Neural Networks (DNNs). In this paper, we argue that SNNs may not benefit from the weight-sharing mechanism, which can effectively reduce parameters and improve inference efficiency in DNNs, on some hardware, and assume that an SNN with unshared convolution kernels could perform better. Motivated by this assumption, a training-inference decoupling method for SNNs named Real Spike is proposed, which not only enjoys both unshared convolution kernels and binary spikes at inference time but also maintains both shared convolution kernels and Real-valued Spikes during training. This decoupling mechanism of SNN is realized by a re-parameterization technique. Furthermore, based on the training-inference-decoupled idea, a series of different forms for implementing Real Spike on different levels are presented, which also enjoy shared convolutions in the inference and are friendly to both neuromorphic and non-neuromorphic hardware platforms. A theoretical proof is given to clarify that the Real Spike-based SNN network is superior to its vanilla counterpart. Experimental results show that all different Real Spike versions can consistently improve the SNN performance. Moreover, the proposed method outperforms the state-of-the-art models on both non-spiking static and neuromorphic datasets.
- Comment
Accepted by ECCV 2022
- Published
- 2022
14. Scene Clustering Based Pseudo-labeling Strategy for Multi-modal Aerial View Object Classification
- Author
Yu, Jun, Chang, Hao, Lu, Keda, Zhang, Liwen, Du, Shenshen, and Zhang, Zhong
- Subjects
Computer Science - Computer Vision and Pattern Recognition
- Abstract
Multi-modal aerial view object classification (MAVOC) in Automatic target recognition (ATR), although an important and challenging problem, has been understudied. This paper first finds that fine-grained data, class imbalance and various shooting conditions preclude the representational ability of general image classification. Moreover, the MAVOC dataset has scene aggregation characteristics. By exploiting these properties, we propose Scene Clustering Based Pseudo-labeling Strategy (SCP-Label), a simple yet effective method to employ in post-processing. The SCP-Label brings greater accuracy by assigning the same label to objects within the same scene while also mitigating bias and confusion with model ensembles. Its performance surpasses the official baseline by a large margin of +20.57% Accuracy on Track 1 (SAR), and +31.86% Accuracy on Track 2 (SAR+EO), demonstrating the potential of SCP-Label as post-processing. Finally, we win the championship on both Track 1 and Track 2 in the CVPR 2022 Perception Beyond the Visible Spectrum (PBVS) Workshop MAVOC Challenge. Our code is available at https://github.com/HowieChangchn/SCP-Label.
- Published
- 2022
15. Ultra High Dimensional Change Point Detection
- Author
Liu, Xin, Zhang, Liwen, and Zhang, Zhen
- Subjects
Statistics - Methodology, Mathematics - Statistics Theory
- Abstract
Structural breaks are commonly seen in applications. Specifically for the detection of change points in time, a research gap still remains in the ultra high dimensional setting, where the covariates may bear spurious correlations. In this paper, we propose a two-stage approach to detect change points in ultra high dimension, by first proposing the dynamic titled current correlation screening method to reduce the input dimension, and then detecting possible change points in the framework of group variable selection. Not only is the spurious correlation between ultra-high dimensional covariates taken into consideration in variable screening, but non-convex penalties are also studied in change point detection in ultra high dimension. Asymptotic properties are derived to guarantee the asymptotic consistency of the selection procedure, and the numerical investigations show the promising performance of the proposed approach.
- Published
- 2021
16. Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning
- Author
Zhou, Fan, Zhu, Zhoufan, Kuang, Qi, and Zhang, Liwen
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Although distributional reinforcement learning (DRL) has been widely examined in the past few years, there are two open questions people are still trying to address. One is how to ensure the validity of the learned quantile function; the other is how to efficiently utilize the distribution information. This paper attempts to provide some new perspectives to encourage future in-depth studies in these two fields. We first propose a non-decreasing quantile function network (NDQFN) to guarantee the monotonicity of the obtained quantile estimates and then design a general exploration framework called distributional prediction error (DPE) for DRL which utilizes the entire distribution of the quantile function. In this paper, we not only discuss the theoretical necessity of our method but also show the performance gain it achieves in practice by comparing with some competitors on Atari 2600 games, especially in some hard-to-explore games.
- Published
- 2021
17. Constrained Text Generation with Global Guidance -- Case Study on CommonGen
- Author
Liu, Yixian, Zhang, Liwen, Han, Wenjuan, Zhang, Yue, and Tu, Kewei
- Subjects
Computer Science - Computation and Language
- Abstract
This paper studies constrained text generation, which is to generate sentences under certain pre-conditions. We focus on CommonGen, the task of generating text based on a set of concepts, as a representative task of constrained text generation. Traditional methods mainly rely on supervised training to maximize the likelihood of target sentences. However, global constraints such as common sense and coverage cannot be incorporated into the likelihood objective of the autoregressive decoding process. In this paper, we consider using reinforcement learning to address the limitation, measuring global constraints including fluency, common sense and concept coverage with a comprehensive score, which serves as the reward for reinforcement learning. Besides, we design a guided decoding method at the word, fragment and sentence levels. Experiments demonstrate that our method significantly increases the concept coverage and outperforms existing models in various automatic evaluations.
- Published
- 2021
18. Adversarial Attack and Defense of Structured Prediction Models
- Author
Han, Wenjuan, Zhang, Liwen, Jiang, Yong, and Tu, Kewei
- Subjects
Computer Science - Computation and Language
- Abstract
Building an effective adversarial attacker and elaborating on countermeasures for adversarial attacks for natural language processing (NLP) have attracted a lot of research in recent years. However, most of the existing approaches focus on classification problems. In this paper, we investigate attacks and defenses for structured prediction tasks in NLP. Besides the difficulty of perturbing discrete words and the sentence fluency problem faced by attackers in any NLP task, there is a specific challenge to attackers of structured prediction models: the structured output of structured prediction models is sensitive to small perturbations in the input. To address these problems, we propose a novel and unified framework that learns to attack a structured prediction model using a sequence-to-sequence model with feedback from multiple reference models of the same structured prediction task. Based on the proposed attack, we further reinforce the victim model with adversarial training, making its prediction more robust and accurate. We evaluate the proposed framework in dependency parsing and part-of-speech tagging. Automatic and human evaluations show that our proposed framework succeeds in both attacking state-of-the-art structured prediction models and boosting them with adversarial training.
- Comment
Accepted to EMNLP 2020
- Published
- 2020
19. Latent Variable Sentiment Grammar
- Author
Zhang, Liwen, Tu, Kewei, and Zhang, Yue
- Subjects
Computer Science - Computation and Language
- Abstract
Neural models have been investigated for sentiment classification over constituent trees. They learn phrase composition automatically by encoding tree structures but do not explicitly model sentiment composition, which requires encoding sentiment class labels. To this end, we investigate two formalisms with deep sentiment representations that capture sentiment subtype expressions by latent variables and Gaussian mixture vectors, respectively. Experiments on Stanford Sentiment Treebank (SST) show the effectiveness of sentiment grammar over vanilla neural encoders. Using ELMo embeddings, our method gives the best results on this benchmark.
- Comment
Accepted at ACL 2019
- Published
- 2019
20. Acoustic scene classification using multi-layer temporal pooling based on convolutional neural network
- Author
Zhang, Liwen and Han, Jiqing
- Subjects
Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
The performance of an Acoustic Scene Classification (ASC) system depends heavily on the latent temporal dynamics of the audio signal. In this paper, we propose a multi-layer temporal pooling method using a CNN feature sequence as input, which can effectively capture the temporal dynamics of an entire audio signal with arbitrary duration by building direct connections between the sequence and its time indexes. We applied our novel framework to DCASE 2018 task 1, ASC. For evaluation, we trained a Support Vector Machine (SVM) with the features learned by the proposed Multi-Layered Temporal Pooling (MLTP). Experimental results on the development dataset show that using the MLTP features significantly improved the ASC performance. The best performance, with 75.28% accuracy, was achieved by using the optimal setting found in our experiments.
- Comment
(0) the title for this version is inappropriate; (1) the discussion of the handcrafted methods in the introduction is not precise; (2) Fig. 1 in section 2 is not correct; (3) the experiments on the CNN part are insufficient
- Published
- 2019
21. FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks
- Author
Zhang, Liwen, Shi, Ziqiang, Han, Jiqing, Shi, Anyan, and Ma, Ding
- Subjects
Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Deep dilated temporal convolutional networks (TCN) have been proved to be very effective in sequence modeling. In this paper we propose several improvements of TCN for an end-to-end approach to monaural speech separation, which consist of 1) a multi-scale dynamic weighted gated dilated convolutional pyramids network (FurcaPy), 2) a gated TCN with intra-parallel convolutional components (FurcaPa), 3) a weight-shared multi-scale gated TCN (FurcaSh), and 4) a dilated TCN with a gated difference-convolutional component (FurcaSu). All these networks take the mixed utterance of two speakers and map it to two separated utterances, where each utterance contains only one speaker's voice. For the objective, we propose to train the network by directly optimizing utterance-level signal-to-distortion ratio (SDR) in a permutation invariant training (PIT) style. Our experiments on the public WSJ0-2mix data corpus result in an 18.4 dB SDR improvement, which shows that our proposed networks can lead to performance improvement on the speaker separation task.
- Comment
arXiv only allows figures with a small resolution. If you need to see large-resolution figures, please contact us. arXiv admin note: substantial text overlap with arXiv:1902.00651
- Published
- 2019
22. Tropical Geometry of Deep Neural Networks
- Author
Zhang, Liwen, Naitzat, Gregory, and Lim, Lek-Heng
- Subjects
Computer Science - Learning, Mathematics - Algebraic Geometry, Statistics - Machine Learning, 14T05, 62M45, 68T01
- Abstract
We establish, for the first time, connections between feedforward neural networks with ReLU activation and tropical geometry: we show that the family of such neural networks is equivalent to the family of tropical rational maps. Among other things, we deduce that feedforward ReLU neural networks with one hidden layer can be characterized by zonotopes, which serve as building blocks for deeper networks; we relate decision boundaries of such neural networks to tropical hypersurfaces, a major object of study in tropical geometry; and we prove that linear regions of such neural networks correspond to vertices of polytopes associated with tropical rational functions. An insight from our tropical formulation is that a deeper network is exponentially more expressive than a shallow network.
- Comment
18 pages, 6 figures
- Published
- 2018
23. Gaussian Mixture Latent Vector Grammars
- Author
Zhao, Yanpeng, Zhang, Liwen, and Tu, Kewei
- Subjects
Computer Science - Computation and Language, Computer Science - Learning
- Abstract
We introduce Latent Vector Grammars (LVeGs), a new framework that extends latent variable grammars such that each nonterminal symbol is associated with a continuous vector space representing the set of (infinitely many) subtypes of the nonterminal. We show that previous models such as latent variable grammars and compositional vector grammars can be interpreted as special cases of LVeGs. We then present Gaussian Mixture LVeGs (GM-LVeGs), a new special case of LVeGs that uses Gaussian mixtures to formulate the weights of production rules over subtypes of nonterminals. A major advantage of using Gaussian mixtures is that the partition function and the expectations of subtype rules can be computed using an extension of the inside-outside algorithm, which enables efficient inference and learning. We apply GM-LVeGs to part-of-speech tagging and constituency parsing and show that GM-LVeGs can achieve competitive accuracies. Our code is available at https://github.com/zhaoyanpeng/lveg.
- Comment
Accepted to ACL 2018
- Published
- 2018
24. Unidirectional zero reflection as gauged parity-time symmetry
- Author
Gear, James, Sun, Yong, Xiao, Shiyi, Zhang, Liwen, Fitzgerald, Richard, Rotter, Stefan, Chen, Hong, and Li, Jensen
- Subjects
Physics - Classical Physics
- Abstract
We introduce the concept of establishing parity-time (PT) symmetry through a gauge transformation involving a shift of the mirror plane for the parity operation. The corresponding unitary transformation on the system's constitutive matrix allows us to generate and explore a family of equivalent PT-symmetric systems. We further derive that unidirectional zero reflection can always be associated with a gauged PT symmetry and demonstrate this experimentally using a microstrip transmission line with magnetoelectric coupling. This study allows us to use bianisotropy as a simple route to realize and explore exceptional-point behaviour of PT-symmetric or, more generally, non-Hermitian systems.
- Comment
13 pages total, 5 figures, 1 appendix
- Published
- 2017
25. Gaussian Attention Model and Its Application to Knowledge Base Embedding and Question Answering
- Author
Zhang, Liwen, Winn, John, and Tomioka, Ryota
- Subjects
Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Learning
- Abstract
We propose the Gaussian attention model for content-based neural memory access. With the proposed attention model, a neural network has an additional degree of freedom to control the focus of its attention, from laser-sharp to broad. It is applicable whenever we can assume that distance in the latent space reflects some notion of semantics. We use the proposed attention model as a scoring function for embedding a knowledge base into a continuous vector space, and then train a model that performs question answering about the entities in the knowledge base. The proposed attention model can handle both the propagation of uncertainty when following a series of relations and the conjunction of conditions in a natural way. On a dataset of soccer players who participated in the FIFA World Cup 2014, we demonstrate that our model can handle both path queries and conjunctive queries well.
- Comment
16 pages, 4 figures
- Published
- 2016
26. Jointly Learning Multiple Measures of Similarities from Triplet Comparisons
- Author
Zhang, Liwen, Maji, Subhransu, and Tomioka, Ryota
- Subjects
Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Learning
- Abstract
Similarity between objects is multi-faceted, and it can be easier for human annotators to measure it when the focus is on a specific aspect. We consider the problem of mapping objects into view-specific embeddings in which the distances between them are consistent with similarity comparisons of the form "from the t-th view, object A is more similar to B than to C". Our framework jointly learns view-specific embeddings by exploiting correlations between views. Experiments on a number of datasets, including one of multi-view crowdsourced comparisons on bird images, show that the proposed method achieves lower triplet generalization error than both learning embeddings independently for each view and pooling all views into one. Our method can also be used to learn multiple measures of similarity over input features, taking class labels into account, and compares favorably to existing approaches for multi-task metric learning on the ISOLET dataset.
- Published
- 2015
Discovery Service for Jio Institute Digital Library