Author: "Wong, Derek F." / Database: Supplemental Index - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wong, Derek F."' showing total 12 results

1. Obscurity-Quantified Curriculum Learning for Machine Translation Evaluation

Author: Zhang, Cuilian; Wong, Derek F.; Lei, Eddy S. K.; Zhan, Runzhe; and Chao, Lidia S.
Abstract: The pre-trained language model has been developed for evaluating the quality of machine translation. It achieves state-of-the-art results. However, building a model for the evaluation of machine translation still faces the following challenges: 1) the large scale of the training data affects the speed of the optimization; 2) the varied quality of the training data makes the optimization process unstable. To alleviate the issues of data learning, curriculum learning is proposed to rearrange the training sequence following an “easy-to-hard” process. However, the definition of difficulty cannot be directly applied to the training data used in the machine translation evaluation. Hence, we propose an obscurity-quantified curriculum learning framework for this task. Specifically, the obscurity of each training example can be measured from multiple perspectives, including the difficulty of ranking, the fuzziness of reference, the complexity of text, and the unreliability of judgement. To incorporate the obscurity measurements, we also design a dynamic learning strategy to guide the training process from instances with low obscurity to those with high obscurity. Experimental results show that our proposed methods yield remarkable improvements on the segment-level WMT2019 and WMT2020 Metrics Shared Tasks compared to other baseline methods.
Published: 2023

2. Multi-Level Curriculum Learning for Multi-Turn Dialogue Generation

Author: Chen, Guanhua; Zhan, Runzhe; Wong, Derek F.; and Chao, Lidia S.
Abstract: Since deep learning is the dominant paradigm in the multi-turn dialogue generation task, large-scale training data is the key factor affecting the model performance. To make full use of the training data, existing work directly applied curriculum learning to the multi-turn dialogue generation task, training the model in an “easy-to-hard” way. But the design of the current methodology does not consider dialogue-specific features. To close this gap, we propose a Multi-Level Curriculum Learning (MLCL) method for multi-turn dialogue generation by considering the word-level linguistic feature and utterance-level semantic relation in a dialogue. The motivation is that word-level knowledge is beneficial to understanding the complex utterance-level dependency of dialogue. Thus, we design two difficulty measurements and a self-adaptive curriculum scheduler, making the model gradually shift the learning focus from word-level to utterance-level information during the training process. We also verify the independence and complementarity of the two measurements at different levels. We evaluate the performance on two widely used multi-turn dialogue datasets, and the results demonstrate that our proposed method outperforms the strong baselines and existing CL methods in terms of automated metrics and human evaluation. We will release the code files upon acceptance.
Published: 2023

3. Towards Energy-Preserving Natural Language Understanding With Spiking Neural Networks

Author: Xiao, Rong; Wan, Yu; Yang, Baosong; Zhang, Haibo; Tang, Huajin; Wong, Derek F.; and Chen, Boxing
Abstract: Artificial neural networks have shown promising results in a variety of natural language understanding (NLU) tasks. Despite their successes, conventional neural-based NLU models are criticized for high energy consumption, making them difficult to apply widely in low-power electronics, such as smartphones and intelligent terminals. In this paper, we introduce a potential direction to alleviate this bottleneck by proposing a spiking encoder. The core of our model is a bi-directional spiking neural network (SNN) which transforms numeric values into discrete spiking signals and replaces massive multiplications with much cheaper additive operations. We examine our model on sentiment classification and machine translation tasks. Experimental results reveal that our model achieves classification and translation accuracy comparable to an advanced Transformer baseline, while significantly reducing the required computational energy to 0.82%.
Published: 2023

4. Exploiting Translation Model for Parallel Corpus Mining

Author: Leong, Chongman; Liu, Xuebo; Wong, Derek F.; and Chao, Lidia S.
Abstract: Parallel corpus mining (PCM) is beneficial for many corpus-based natural language processing tasks, e.g., machine translation and bilingual dictionary induction, especially in low-resource languages and domains. It relies heavily on cross-lingual representations to model the interdependencies between different languages and determine whether sentences are parallel or not. In this paper, we take the first step towards exploiting the multilingual Transformer translation model to produce expressive sentence representations for PCM. Since the traditional Transformer lacks an immediate sentence representation, we pool the output representation of the encoder as the sentence representation, which is further optimized as a part of the training flow of the translation model. Experiments conducted on the BUCC PCM task show that the proposed method improves mining performance over the existing methods with the assistance of the pre-trained multilingual BERT. To further test the usability of the proposed method, we mine parallel sentences from public resources and find that the mined sentences can indeed enhance low-resource machine translation.
Published: 2021

5. Latent Attribute Based Hierarchical Decoder for Neural Machine Translation

Author: Liu, Xuebo; Wong, Derek F.; Chao, Lidia S.; and Liu, Yang
Abstract: Neural machine translation (NMT) has achieved state-of-the-art performance in many translation tasks. However, because the computational cost increases with the size of the search space for predicting the target words, the translation quality of NMT is constrained by the limited vocabulary. To alleviate this problem, we propose a novel dynamic hierarchical decoder for NMT to utilize all of the target words in the training and decoding process. In the proposed model, a target word is represented by two latent attribute vectors rather than a word vector. The model is trained to dynamically put together those words that share similar linguistic attributes. The prediction of a target word is, therefore, turned into the prediction of attribute vectors, where the $\mathrm{softmax}$ functions are performed at the attribute level. This greatly reduces the model size and the decoding time. Our experimental results demonstrate that the proposed model significantly outperforms the NMT baselines in both Chinese-English and English-German translation tasks.
Published: 2019

6. Linguistic Knowledge-Aware Neural Machine Translation

Author: Li, Qiang; Wong, Derek F.; Chao, Lidia S.; Zhu, Muhua; Xiao, Tong; Zhu, Jingbo; and Zhang, Min
Abstract: Recently, researchers have shown an increasing interest in incorporating linguistic knowledge into neural machine translation (NMT). To this end, previous works choose either to alter the architecture of the NMT encoder to incorporate syntactic information into the translation model, or to generalize the embedding layer of the encoder to encode additional linguistic features. The former approach mainly focuses on injecting the syntactic structure of the source sentence into the encoding process, leading to a complicated model that lacks the flexibility to incorporate other types of knowledge. The latter extends word embeddings by considering additional linguistic knowledge as features to enrich the word representation. It thus does not explicitly balance the contribution from word embeddings and the contribution from additional linguistic knowledge. To address these limitations, this paper proposes a knowledge-aware NMT approach that models additional linguistic features in parallel to the word feature. The core idea is that we propose modeling a series of linguistic features at the word level (knowledge block) using a recurrent neural network (RNN). At the sentence level, those word-corresponding feature blocks are further encoded using an RNN encoder. In decoding, we propose a knowledge gate and an attention gate to dynamically control the proportions of information contributing to the generation of target words from different sources. Extensive experiments show that our approach is capable of better accounting for the importance of additional linguistic knowledge, and we observe significant improvements from 1.0 to 2.3 BLEU points on Chinese$\leftrightarrow$English and English$\rightarrow$German translation tasks.
Published: 2018

7. Content-Oriented User Modeling for Personalized Response Ranking in Chatbots

Author: Liu, Bingquan; Xu, Zhen; Sun, Chengjie; Wang, Baoxun; Wang, Xiaolong; Wong, Derek F.; and Zhang, Min
Abstract: Automatic chatbots (also known as chat-agents) have attracted much attention from both research and industry. Generally, the semantic relevance between users' queries and the corresponding responses is considered the essential element for conversation modeling in both generation and ranking based chat systems. By contrast, it is a nontrivial task to incorporate users' information, such as preference, social role, etc., into conversational models reasonably, even though users' profiles play a significant role in conversations by providing the implicit contexts. This paper aims to address the personalized response ranking task by incorporating user profiles into the conversation model. In our approach, users' personalized representations are latently learned from the contents posted by them via a two-branch neural network. After that, a deep neural network architecture is further presented to learn the fusion representation of posts, responses, and personal information. In this way, the proposed model can understand conversations from the users' perspective; hence, more appropriate responses are selected for a specified person. The experimental results on two datasets from social network services demonstrate that our approach shows promise in representing users' personal information implicitly based on user-generated content, and that it could serve as an important component in chatbots for selecting personalized responses for each user.
Published: 2018

8. A Loss-Augmented Approach to Training Syntactic Machine Translation Systems

Author: Xiao, Tong; Wong, Derek F.; and Zhu, Jingbo
Abstract: Current syntactic machine translation (MT) systems implicitly use beam-width unlimited search in learning model parameters (e.g., feature values for each translation rule). However, a limited beam-width has to be adopted in decoding new sentences, and the MT output is in general evaluated by various metrics, such as BLEU and TER. In this paper, we address: 1) the mismatch of adopted beam-widths between training and decoding; and 2) the mismatch of training criteria and MT evaluation metrics. Unlike previous work, we model the two problems in a single training paradigm simultaneously. We design a loss-augmented approach that explicitly considers the limited beam-width and evaluation metric in training, and present a simple but effective method to learn the model. By using beam search and BLEU-related losses, our approach improves a state-of-the-art syntactic MT system by +1.0 BLEU on Chinese-to-English and English-to-Chinese translation tasks. It even outperforms seven previous training approaches by over 0.8 BLEU points. More interestingly, promising improvements are observed when our approach works with TER.
Published: 2016

9. Graph-Based Lexicon Regularization for PCFG With Latent Annotations

Author: Zeng, Xiaodong; Wong, Derek F.; Chao, Lidia S.; and Trancoso, Isabel
Abstract: This paper aims at learning a better probabilistic context-free grammar with latent annotations (PCFG-LA) by using a graph propagation (GP) technique. We propose leveraging the GP to regularize the lexical model of the grammar. The proposed approach constructs k-nearest neighbor (k-NN) similarity graphs over words with identical pre-terminal (part-of-speech) tags, for propagating the probabilities of latent annotations given the words. The graphs demonstrate the relationship between words in syntactic and semantic levels, estimated by using a neural word representation method based on a recursive autoencoder (RAE). We modify the conventional PCFG-LA parameter estimation algorithm, expectation maximization (EM), by incorporating a GP process subsequent to the M-step. The GP encourages the smoothness among the graph vertices, where different words under similar syntactic and semantic environments should have approximate posterior distributions of nonterminal subcategories. The proposed PCFG-LA learning approach was evaluated together with a hierarchical split-and-merge training strategy, on parsing tasks for English, Chinese and Portuguese. The empirical results reveal two crucial findings: 1) regularizing the lexicons with GP results in positive effects on parsing accuracy; and 2) learning with unlabeled data can also expand the PCFG-LA lexicons.
Published: 2015

10. iTagger: Part-of-Speech Tagging Based on SBCB Learning Algorithm

Author: Zeng, Xiao Dong; Chao, Lidia S.; Wong, Derek F.; and He, Liang Ye
Abstract: The problem of part-of-speech (POS) tagging or disambiguation is a practical issue in the natural language processing (NLP) community, especially in the development of a machine translation system. The performance of a POS tagging system may interfere with the subsequent analytical tasks in the translation process, and thereby affect the overall translation quality. This paper presents a novel POS tagging system, iTagger, which is developed based on the Selecting Base Classifiers on Bagging (SBCB) learning algorithm. In this work, the POS tagging task is regarded as a classification problem. Features such as the surrounding context of ambiguous candidates, n-gram information, lexical items and linguistic clues are used and automatically extracted from the annotated corpus. The proposed system has been compared against two state-of-the-art tagging methods, Hidden Markov Model (HMM) and Maximum Entropy. Empirical results on the (English) Brown corpus, the (Portuguese) Tycho Brahe corpus and the Chinese Tree Bank corpus reveal the competitiveness of iTagger. Moreover, iTagger has been developed and released to the public as a library and tool for various development and application purposes.
Published: 2013

11. Annotation System for the Construction of Synchronous Grammar Tree Alignment Relationships

Author: Oliveira, Francisco; Wong, Derek F.; Chao, Lidia S.; and Sun, Fan
Abstract: The construction of syntactic tree structures is vital to different Natural Language Processing applications. Meanwhile, the use of monolingual or bilingual structures directly affects the quality of Machine Translation systems. However, manually annotating syntactic tree structures is not only time-consuming but also very expensive, and automatic construction approaches cannot always guarantee the quality of the syntactic trees. In this paper, a system for annotating and constructing synchronous grammar tree structures in a semi-automatic way is proposed. The system is built in the Web environment with a graphical display for users to review and modify alignment relationships between nodes of the tree and strings or trees of the other language. The core part relies on the parsing of Constraint Synchronous Grammar, and consists of several modules for establishing alignments at different levels between the languages, including relationships between a syntactic tree and strings of the other language, and bilingual tree alignments. Moreover, it provides import functions for obtaining monolingual skeletal bracketing syntactic trees and Translation Corresponding Tree structures for the creation of synchronous rules, in order to have wider applicability.
Published: 2013

12. An Experimental Platform for Cross-Language Document Retrieval

Author: Wang, Long Yue; Wong, Derek F.; and Chao, Lidia S.
Abstract: This paper presents a proposed Cross-Language Document Retrieval experimental platform integrated with preprocessing of training data, document translation, query generation, document retrieval and precision evaluation modules. Given a document in the source language, it is translated into the target language by a statistical machine translation module trained on selected training data. The query generation module then selects the most relevant words in the translated version of the document as the search query. After all the documents in the target language are ranked by the document retrieval module, the system chooses the N-best documents as its target language versions. Finally, the results can be evaluated by the precision evaluator, which can reflect the merits of the strategies. Experimental results showed that this platform was effective and achieved very good performance.
Published: 2013
