Author: "Sankaran, Baskaran" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Sankaran, Baskaran"' showing total 17 results

1. Attention-based Vocabulary Selection for NMT Decoding

Author: Sankaran, Baskaran, Freitag, Markus, and Al-Onaizan, Yaser
Subjects: Computer Science - Computation and Language
Abstract: Neural Machine Translation (NMT) models usually use large target vocabulary sizes to capture most of the words in the target language. The vocabulary size is a big factor when decoding new sentences as the final softmax layer normalizes over all possible target words. To address this problem, it is common to restrict the target vocabulary with candidate lists based on the source sentence. Usually, the candidate lists are a combination of external word-to-word aligner, phrase table entries or most frequent words. In this work, we propose a simple and yet novel approach to learn candidate lists directly from the attention layer during NMT training. The candidate lists are highly optimized for the current NMT model and do not need any external computation of the candidate pool. We show significant decoding speedup compared with using the entire vocabulary, without losing any translation quality for two language pairs. Comment: Submitted to Second Conference on Machine Translation (WMT-17); 7 pages
Published: 2017

2. Ensemble Distillation for Neural Machine Translation

Author: Freitag, Markus, Al-Onaizan, Yaser, and Sankaran, Baskaran
Subjects: Computer Science - Computation and Language
Abstract: Knowledge distillation describes a method for training a student network to perform better by learning from a stronger teacher network. Translating a sentence with a Neural Machine Translation (NMT) engine is time-consuming, and having a smaller model speeds up this process. We demonstrate how to transfer the translation quality of an ensemble and an oracle BLEU teacher network into a single NMT system. Further, we present translation improvements from a teacher network that has the same architecture and dimensions as the student network. As the training of the student model is still expensive, we introduce a data filtering method based on the knowledge of the teacher model that not only speeds up the training, but also leads to better translation quality. Our techniques need no code change and can be easily reproduced with any NMT architecture to speed up the decoding process.
Published: 2017

3. Temporal Attention Model for Neural Machine Translation

Author: Sankaran, Baskaran, Mi, Haitao, Al-Onaizan, Yaser, and Ittycheriah, Abe
Subjects: Computer Science - Computation and Language
Abstract: Attention-based Neural Machine Translation (NMT) models suffer from attention deficiency issues as has been observed in recent research. We propose a novel mechanism to address some of these limitations and improve the NMT attention. Specifically, our approach memorizes the alignments temporally (within each sentence) and modulates the attention with the accumulated temporal memory, as the decoder generates the candidate translation. We compare our approach against the baseline NMT model and two other related approaches that address this issue either explicitly or implicitly. Large-scale experiments on two language pairs show that our approach achieves better and robust gains over the baseline and related NMT approaches. Our model further outperforms strong SMT baselines in some settings even without using ensembles. Comment: 8 pages
Published: 2016

4. Zero-Resource Translation with Multi-Lingual Neural Machine Translation

Author: Firat, Orhan, Sankaran, Baskaran, Al-Onaizan, Yaser, Vural, Fatos T. Yarman, and Cho, Kyunghyun
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we propose a novel finetuning algorithm for the recently introduced multi-way, multilingual neural machine translation model that enables zero-resource machine translation. When used together with novel many-to-one translation strategies, we empirically show that this finetuning algorithm allows the multi-way, multilingual model to translate a zero-resource language pair (1) as well as a single-pair neural translation model trained with up to 1M direct parallel sentences of the same language pair and (2) better than pivot-based translation strategy, while keeping only one additional copy of attention-related parameters.
Published: 2016

5. Coverage Embedding Models for Neural Machine Translation

Author: Mi, Haitao, Sankaran, Baskaran, Wang, Zhiguo, and Ittycheriah, Abe
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we enhance the attention-based neural machine translation (NMT) by adding explicit coverage embedding models to alleviate issues of repeating and dropping translations in NMT. For each source word, our model starts with a full coverage embedding vector to track the coverage status, and then keeps updating it with neural networks as the translation goes. Experiments on the large-scale Chinese-to-English task show that our enhanced model improves the translation quality significantly on various test sets over the strong large vocabulary NMT system. Comment: 6 pages; In Proceedings of EMNLP 2016
Published: 2016

6. Multi-way, multilingual neural machine translation

Author: Firat, Orhan, Cho, Kyunghyun, Sankaran, Baskaran, Yarman Vural, Fatos T., and Bengio, Yoshua
Published: 2017
Full Text: View/download PDF

7. Domain Adaptation Techniques for Machine Translation and Their Evaluation in a Real-World Setting

Author: Sankaran, Baskaran, Razmara, Majid, Farzindar, Atefeh, Khreich, Wael, Popowich, Fred, Sarkar, Anoop, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Goebel, Randy, editor, Siekmann, Jörg, editor, Wahlster, Wolfgang, editor, Kosseim, Leila, editor, and Inkpen, Diana, editor
Published: 2012
Full Text: View/download PDF

8. n-Best Reranking for the Efficient Integration of Word Sense Disambiguation and Statistical Machine Translation

Author: Specia, Lucia, Sankaran, Baskaran, das Graças Volpe Nunes, Maria, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, and Gelbukh, Alexander, editor
Published: 2008
Full Text: View/download PDF

9. Domain Adaptation Techniques for Machine Translation and Their Evaluation in a Real-World Setting

Author: Sankaran, Baskaran, primary, Razmara, Majid, additional, Farzindar, Atefeh, additional, Khreich, Wael, additional, Popowich, Fred, additional, and Sarkar, Anoop, additional
Published: 2012
Full Text: View/download PDF

10. n-Best Reranking for the Efficient Integration of Word Sense Disambiguation and Statistical Machine Translation

Author: Specia, Lucia, primary, Sankaran, Baskaran, additional, and das Graças Volpe Nunes, Maria, additional
Published: 2008
Full Text: View/download PDF

11. Zero-Resource Translation with Multi-Lingual Neural Machine Translation

Author: Firat, Orhan, primary, Sankaran, Baskaran, additional, Al-Onaizan, Yaser, additional, Yarman Vural, Fatos T., additional, and Cho, Kyunghyun, additional
Published: 2016
Full Text: View/download PDF

12. Coverage Embedding Models for Neural Machine Translation

Author: Mi, Haitao, primary, Sankaran, Baskaran, additional, Wang, Zhiguo, additional, and Ittycheriah, Abe, additional
Published: 2016
Full Text: View/download PDF

13. Improvements in hierarchical phrase-based Statistical Machine Translation

Author: Sankaran, Baskaran
Abstract: Hierarchical phrase-based translation (Hiero) is a statistical machine translation (SMT) model that encodes translation as a synchronous context-free grammar derivation between source and target language strings (Chiang, 2005; Chiang, 2007). Hiero models are more powerful than phrase-based models in capturing complex source-target reordering as well as discontiguous phrases, while being easier to estimate and decode with compared to their full syntax-based counterparts. In this thesis, we propose improvements to two broad aspects of the Hiero translation pipeline: i) learning Hiero translation models and estimating their parameters and ii) parameter tuning for discriminative log-linear models that are used to decode with such features. We use our own open-source implementation of Hiero called Kriya (Sankaran et al., 2012b) for all the experiments in this thesis. This thesis contains the following specific contributions: We propose a Bayesian model for learning Hiero grammars as an alternative to the heuristic method usually used in Hiero. Our model learns a peaked distribution of grammars, which consistently performs better than the heuristically extracted grammars across several language pairs (Sankaran et al., 2013a). We propose a novel unified-cascade framework for jointly learning alignments and the Hiero translation rules by removing the disconnect between the alignments and extracted synchronous context-free grammar. This is the first time a joint training framework is being proposed for Hiero, where we iterate the two-step inference so that it learns in alternate iterations the phrase alignments and then the Hiero rules that are consistent with alignments. We extend our Bayesian model for extracting compact Hiero translation rules using arity-1 grammars, resulting in up to 57% reduction in model size while retaining the translation performance (Sankaran et al., 2011; Sankaran et al., 2012a).
We propose several novel approaches for parameter tuning of discriminative log-linear models for SMT which can be used for jointly optimizing towards multiple evaluation metrics. We show that our methods for multi-objective tuning for SMT yield substantial gains in translation quality measured through automatic as well as human evaluations (Sankaran et al., 2013b; Duh et al., 2013).
Published: 2013

14. Mixing multiple translation models in statistical machine translation

Author: Razmara, Majid, Foster, George, Sankaran, Baskaran, and Sarkar, Anoop
Abstract: Statistical machine translation is often faced with the problem of combining training data from many diverse sources into a single translation model which then has to translate sentences in a new domain. We propose a novel approach, ensemble decoding, which combines a number of translation systems dynamically at the decoding step. In this paper, we evaluate performance on a domain adaptation setting where we translate sentences from the medical domain. Our experimental results show that ensemble decoding outperforms various strong baselines including mixture models, the current state-of-the-art for domain adaptation in machine translation. 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012), Jeju Island, Republic of Korea, 8-14 July, 2012
Published: 2012

15. Incremental translation using hierarchichal phrase-based translation system

Author: Siahbani, Maryam, primary, Seraj, Ramtin Mehdizadeh, additional, Sankaran, Baskaran, additional, and Sarkar, Anoop, additional
Published: 2014
Full Text: View/download PDF

16. Kriya - An end-to-end Hierarchical Phrase-based MT System

Author: Sankaran, Baskaran, primary, Razmara, Majid, additional, and Sarkar, Anoop, additional
Published: 2012
Full Text: View/download PDF

17. New Computer Technique for Root Locus Analysis

Author: Sankaran, Baskaran
Published: 1979
