21 results for "Masao Utiyama"
Search Results
2. A Fuzzy Training Framework for Controllable Sequence-to-Sequence Generation
- Author
-
Jiajia Li, Ping Wang, Zuchao Li, Xi Liu, Masao Utiyama, Eiichiro Sumita, Hai Zhao, and Haojun Ai
- Subjects
Music lyrics generation, controllable generation, music understanding, constrained decoding, fuzzy training, Electrical engineering. Electronics. Nuclear engineering, TK1-9971 - Abstract
The generation of music lyrics by artificial intelligence (AI) is frequently modeled as a language-targeted sequence-to-sequence generation task. Formally, if we convert the melody into a word sequence, we can treat lyrics generation as a machine translation task. Traditional machine translation translates between cross-lingual word sequences, whereas music lyrics generation translates between music and natural-language word sequences. In practice, the theme or keywords of the generated lyrics are usually constrained to meet users' needs, and this requirement can be viewed as a restricted translation problem. In this paper, we propose a fuzzy training framework that allows a model to simultaneously support both unrestricted and restricted translation by adopting an additional auxiliary training process without constraining the decoding process. This maintains the benefits of restricted translation but greatly reduces the extra time overhead of constrained decoding, thus improving its practicality. The experimental results show that our framework is well suited to Chinese lyrics generation and restricted machine translation tasks, and that it can generate language sequences conditioned on given restricted words without training multiple models, thereby achieving the goal of green AI. (A toy sketch of this idea follows this entry.)
- Published
- 2022
- Full Text
- View/download PDF
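The auxiliary training process described in the abstract above lends itself to a small data-preparation sketch. This is only a guess at the general shape of fuzzy training (words sampled from the target are sometimes exposed to the model as soft constraints on the source side, so a single model covers both the unrestricted and the restricted case); the function name, the `<sep>` token, and the sampling probabilities are illustrative assumptions, not the authors' exact recipe.

```python
import random

def fuzzify_example(src_tokens, tgt_tokens, constraint_prob=0.5,
                    max_constraints=3, sep_token="<sep>"):
    """Prepare one training example so a single seq2seq model learns both
    unrestricted and restricted generation (illustrative only).

    With probability `constraint_prob`, a few words sampled from the target
    are appended to the source after a separator, teaching the model to
    realize such "restricted words" in its output; otherwise the example is
    left unchanged and trains ordinary unrestricted generation.
    """
    if tgt_tokens and random.random() < constraint_prob:
        k = random.randint(1, min(max_constraints, len(tgt_tokens)))
        constraints = random.sample(tgt_tokens, k)
        src_tokens = src_tokens + [sep_token] + constraints
    return src_tokens, tgt_tokens

# Toy usage: a melody rendered as a word sequence paired with lyric tokens.
melody = ["C4", "E4", "G4", "G4", "E4"]
lyric = ["moon", "light", "over", "the", "river"]
print(fuzzify_example(melody, lyric))
```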
3. Statistical Khmer Name Romanization
- Author
-
Chenchen Ding, Vichet Chea, Masao Utiyama, Eiichiro Sumita, Sethserey Sam, and Sopheap Seng
- Subjects
Computational linguistics. Natural language processing, P98-98.5 - Published
- 2022
- Full Text
- View/download PDF
4. Constituency Parsing by Cross-Lingual Delexicalization
- Author
-
Hour Kaing, Chenchen Ding, Masao Utiyama, Eiichiro Sumita, Katsuhito Sudoh, and Satoshi Nakamura
- Subjects
Cross-lingual, delexicalization, syntactic parsing, natural language processing, Electrical engineering. Electronics. Nuclear engineering, TK1-9971 - Abstract
Cross-lingual transfer is an important technique for low-resource language processing. Currently, most research on cross-lingual syntactic parsing focuses on dependency structures. This work investigates cross-lingual parsing of another important type of syntactic structure, i.e., the constituency structure. We propose a delexicalized approach, where part-of-speech sequences of rich-resource languages are used to train cross-lingual models to parse low-resource languages. We also investigate measures for selecting suitable rich-resource languages for specific low-resource languages. The experiments show that the delexicalized approach outperforms state-of-the-art unsupervised models on six languages by margins of 4.2 to 37.0 sentence-level F1 points. Based on the experimental results, the limitations and future work of the delexicalized approach are discussed. (A minimal sketch of the delexicalization step follows this entry.)
- Published
- 2021
- Full Text
- View/download PDF
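The core preprocessing step of a delexicalized approach is simple enough to sketch: words are replaced by their part-of-speech tags before training and parsing, so a model trained on a rich-resource language can be applied to a low-resource one sharing the same tag set. The tag names and toy sentences below are illustrative; the paper's specific tag sets, parsers, and language-selection measures are not reproduced here.

```python
def delexicalize(tagged_sentence):
    """Keep only the part-of-speech sequence of a tagged sentence.

    In this delexicalized setting, a constituency parser would be trained on
    POS sequences of a rich-resource language and then applied to POS
    sequences of a low-resource language using the same (universal) tag set;
    the word forms themselves are discarded.
    """
    return [pos for _word, pos in tagged_sentence]

# Rich-resource training side (e.g., an English treebank sentence)...
english = [("The", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]
# ...and a low-resource test side sharing the same universal tags.
low_resource = [("w1", "NOUN"), ("w2", "VERB")]

print(delexicalize(english))       # ['DET', 'NOUN', 'VERB']
print(delexicalize(low_resource))  # ['NOUN', 'VERB']
```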
5. Universal Multimodal Representation for Language Understanding
- Author
-
Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, and Hai Zhao
- Subjects
FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computational Theory and Mathematics, Artificial Intelligence, Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Applied Mathematics, Computer Science - Computer Vision and Pattern Recognition, Computer Vision and Pattern Recognition, Computation and Language (cs.CL), Software - Abstract
Representation learning is the foundation of natural language processing (NLP). This work presents new methods to employ visual information as assistant signals for general NLP tasks. For each sentence, we first retrieve a flexible number of images, either from a light topic-image lookup table extracted over existing sentence-image pairs or from a shared cross-modal embedding space pre-trained on off-the-shelf text-image pairs. Then, the text and images are encoded by a Transformer encoder and a convolutional neural network, respectively. The two sequences of representations are further fused by an attention layer for the interaction of the two modalities. In this study, the retrieval process is controllable and flexible. The universal visual representation overcomes the lack of large-scale bilingual sentence-image pairs, and our method can be easily applied to text-only tasks without manually annotated multimodal parallel corpora. We apply the proposed method to a wide range of natural language generation and understanding tasks, including neural machine translation, natural language inference, and semantic similarity. Experimental results show that our method is generally effective for different tasks and languages. Analysis indicates that the visual signals enrich textual representations of content words, provide fine-grained grounding information about the relationship between concepts and events, and potentially aid disambiguation. Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). (A simplified fusion sketch follows this entry.)
- Published
- 2023
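The fusion step in the abstract (text encoded by a Transformer, retrieved images by a CNN, then combined by an attention layer) can be mimicked with a single dot-product attention over image features. The sketch below is a simplified stand-in, not the paper's architecture: the encoders are replaced by random feature matrices, and the residual form of the fusion is an assumption made for illustration.

```python
import numpy as np

def attention_fuse(text_states, image_feats):
    """Fuse token representations with retrieved image features via one
    dot-product attention layer (a simplified illustration).

    text_states : (n_tokens, d) array from a text encoder (e.g., Transformer)
    image_feats : (n_images, d) array from an image encoder (e.g., a CNN)
    Returns fused token representations of shape (n_tokens, d).
    """
    d = text_states.shape[-1]
    scores = text_states @ image_feats.T / np.sqrt(d)           # (n_tokens, n_images)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)               # softmax over images
    visual_context = weights @ image_feats                       # (n_tokens, d)
    return text_states + visual_context                          # residual fusion

fused = attention_fuse(np.random.randn(7, 512), np.random.randn(3, 512))
print(fused.shape)  # (7, 512)
```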
6. Self-Training for Unsupervised Neural Machine Translation in Unbalanced Training Data Scenarios
- Author
-
Eiichiro Sumita, Haipeng Sun, Tiejun Zhao, Rui Wang, Kehai Chen, and Masao Utiyama
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine translation, Training set, Self training, Estonian, Computer science, Artificial intelligence, Natural language processing - Abstract
Unsupervised neural machine translation (UNMT) that relies solely on massive monolingual corpora has achieved remarkable results in several translation tasks. However, in real-world scenarios, massive monolingual corpora do not exist for some extremely low-resource languages such as Estonian, and UNMT systems usually perform poorly when an adequate training corpus is not available for one of the languages. In this paper, we first define and analyze the unbalanced training data scenario for UNMT. Based on this scenario, we propose UNMT self-training mechanisms to train a robust UNMT system and improve its performance in this case. Experimental results on several language pairs show that the proposed methods substantially outperform conventional UNMT systems. Accepted by NAACL 2021. (A toy self-training sketch follows this entry.)
- Published
- 2020
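The general shape of a self-training round for the unbalanced scenario can be sketched as follows. This is only an assumption about how such a round might look (translate the better-resourced side's monolingual data with the current model, treat the output as pseudo-parallel data, and retrain); the paper's specific self-training mechanisms are not reproduced, and `translate_fn` / `train_fn` are placeholders for whatever UNMT toolkit is actually used.

```python
def self_training_round(model, mono_rich, translate_fn, train_fn):
    """One illustrative round of self-training for unbalanced UNMT.

    Monolingual sentences from the better-resourced side are translated into
    the low-resource language by the current model; the resulting synthetic
    pairs serve as pseudo-parallel data to retrain the model.
    """
    synthetic = [translate_fn(model, s) for s in mono_rich]
    pseudo_parallel = list(zip(mono_rich, synthetic))
    return train_fn(model, pseudo_parallel)

# Toy usage with stub functions in place of a real UNMT system.
toy_model = {"updates": 0}
translate = lambda m, s: s[::-1]                          # placeholder "translation"
train = lambda m, data: {"updates": m["updates"] + len(data)}
print(self_training_round(toy_model, ["aaa", "bbb"], translate, train))
```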
7. Reference Language based Unsupervised Neural Machine Translation
- Author
-
Rui Wang, Eiichiro Sumita, Hai Zhao, Zuchao Li, and Masao Utiyama
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine translation, Supervised learning, Pivot language, Agreement, Computer science, Artificial intelligence, Natural language processing - Abstract
Exploiting a common language as an auxiliary for better translation has a long tradition in machine translation: supervised machine translation benefits from a well-chosen pivot language when no source-to-target parallel corpus is available. The rise of unsupervised neural machine translation (UNMT) almost completely relieves the parallel corpus curse, though UNMT is still subject to unsatisfactory performance due to the vagueness of the clues available for its core back-translation training. Further enriching the idea of pivot translation by extending the use of parallel corpora beyond the source-target paradigm, we propose a new reference language-based framework for UNMT, RUNMT, in which the reference language shares a parallel corpus only with the source; this corpus nevertheless provides a signal clear enough to help the reconstruction training of UNMT through a proposed reference agreement mechanism. Experimental results show that our methods improve the quality of UNMT over that of a strong baseline that uses only one auxiliary language, demonstrating the usefulness of the proposed reference language-based UNMT and establishing a good start for the community. Findings of ACL: EMNLP 2020.
- Published
- 2020
8. Modeling Future Cost for Neural Machine Translation
- Author
-
Kehai Chen, Tiejun Zhao, Eiichiro Sumita, Rui Wang, Chaoqun Duan, Conghui Zhu, and Masao Utiyama
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine translation, Artificial neural network, Context model, Decoding methods, Transformer (machine learning model), Machine learning, Speech processing, Acoustics and Ultrasonics, Computational Mathematics, Computer Science (miscellaneous), Electrical and Electronic Engineering, Computer science, Artificial intelligence - Abstract
Existing neural machine translation (NMT) systems use sequence-to-sequence neural networks to generate the target translation word by word and train the model to make each generated word as consistent as possible with its counterpart in the reference. However, the trained translation model tends to focus on ensuring the accuracy of the generated target word at the current time-step and does not consider its future cost, i.e., the expected cost of generating the subsequent target translation (the next target word). To address this issue, in this article we propose a simple and effective method to model the future cost of each target word for NMT systems. In detail, a future cost representation is learned based on the currently generated target word and its contextual information, and it is used to compute an additional loss that guides the training of the NMT model. Furthermore, the learned future cost representation at the current time-step is used to help the generation of the next target word during decoding. Experimental results on three widely used translation datasets, including WMT14 English-to-German, WMT14 English-to-French, and WMT17 Chinese-to-English, show that the proposed approach achieves significant improvements over strong Transformer-based NMT baselines. (A simplified version of such an auxiliary loss follows this entry.)
- Published
- 2020
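The additional loss mentioned in the abstract can be illustrated with a much-simplified scalar version: at each time-step a predicted "future cost" is regressed toward the loss that remains to be paid for the rest of the target sentence. This is a simplified reading for illustration only; the paper learns a future cost representation inside the NMT model and also feeds it into decoding, neither of which is shown here.

```python
import numpy as np

def future_cost_loss(step_nll, predicted_future_cost):
    """Toy training loss combining per-step NLL with a future-cost term.

    step_nll              : (T,) per-step NLL of the reference target words
    predicted_future_cost : (T,) scalar predictions from a future-cost layer
    The true future cost at step t is taken to be the NLL still to be paid
    after step t; the prediction is pulled toward it with a squared error.
    """
    true_future = np.cumsum(step_nll[::-1])[::-1] - step_nll
    auxiliary = np.mean((predicted_future_cost - true_future) ** 2)
    return step_nll.sum() + auxiliary

print(future_cost_loss(np.array([1.2, 0.7, 0.4]), np.array([1.0, 0.5, 0.0])))
```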
9. Explicit Sentence Compression for Neural Machine Translation
- Author
-
Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Hai Zhao, Eiichiro Sumita, and Zhuosheng Zhang
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL), Sentence compression, Machine translation, Sentence, Transformer (machine learning model), Computer science, Artificial intelligence, Natural language processing - Abstract
State-of-the-art Transformer-based neural machine translation (NMT) systems still follow the standard encoder-decoder framework, in which the source sentence representation is produced by an encoder with a self-attention mechanism. Though the Transformer-based encoder may effectively capture general information in its resulting source sentence representation, the backbone information, which stands for the gist of the sentence, is not specifically focused on. In this paper, we propose an explicit sentence compression method to enhance the source sentence representation for NMT. In practice, an explicit sentence compression objective is used to learn the backbone information of a sentence. We propose three ways, namely backbone source-side fusion, target-side fusion, and both-side fusion, to integrate the compressed sentence into NMT. Our empirical tests on the WMT English-to-French and English-to-German translation tasks show that the proposed sentence compression method significantly improves translation performance over strong baselines. Work in progress; part of this work was accepted at AAAI-2020. (A toy fusion sketch follows this entry.)
- Published
- 2019
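One way to picture "source-side fusion" of a compressed backbone sentence is a gate that mixes each full-sentence encoder state with a pooled representation of the compressed sentence. The gating form below is purely an assumption for illustration; the paper describes its own source-side, target-side, and both-side fusion variants, which are not reproduced here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_backbone_fusion(src_states, backbone_states, W_g):
    """Mix full-source encoder states with the compressed-sentence encoding.

    src_states      : (n, d) encoder states of the full source sentence
    backbone_states : (m, d) encoder states of the compressed sentence
    W_g             : (2*d, d) gate parameters (randomly initialized here)
    """
    backbone = backbone_states.mean(axis=0)                      # (d,) pooled backbone
    tiled = np.broadcast_to(backbone, src_states.shape)          # (n, d)
    gate = sigmoid(np.concatenate([src_states, tiled], axis=-1) @ W_g)
    return gate * src_states + (1.0 - gate) * tiled

d = 8
out = gated_backbone_fusion(np.random.randn(5, d), np.random.randn(3, d),
                            np.random.randn(2 * d, d) * 0.1)
print(out.shape)  # (5, 8)
```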
10. Document-level Neural Machine Translation with Associated Memory Network
- Author
-
Hai Zhao, Kehai Chen, Masao Utiyama, Shu Jiang, Bao-liang Lu, Rui Wang, Eiichiro Sumita, and Zuchao Li
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine translation, Document level, Artificial Intelligence, Hardware and Architecture, Computer Vision and Pattern Recognition, Electrical and Electronic Engineering, Software, Computer science, Natural language processing - Abstract
Standard neural machine translation (NMT) is built on the assumption that sentences can be translated independently of their document-level context. Most existing document-level NMT approaches settle for a coarse sense of global document-level information, while this work focuses on exploiting detailed document-level context by means of a memory network. The memory network's capacity to detect the parts of memory most relevant to the current sentence offers a natural solution for modeling the rich document-level context. In this work, the proposed document-aware memory network is implemented to enhance a Transformer NMT baseline. Experiments on several tasks show that the proposed method significantly improves NMT performance over strong Transformer baselines and other related studies. (A simplified memory-addressing sketch follows this entry.)
- Published
- 2019
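Memory-network addressing of document context can be approximated by a similarity lookup: encode the surrounding sentences as memory slots and pick the slots most relevant to the current sentence. The cosine top-k lookup below is only a simplified stand-in for the document-aware memory network in the abstract; vector sizes and the retrieval rule are illustrative assumptions.

```python
import numpy as np

def retrieve_relevant_memory(query_vec, memory, top_k=2):
    """Return the indices and scores of the memory slots (one vector per
    context sentence) most similar to the current sentence's vector."""
    q = query_vec / np.linalg.norm(query_vec)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    scores = m @ q
    top = np.argsort(-scores)[:top_k]
    return top, scores[top]

doc_memory = np.random.randn(6, 16)   # six context sentences, toy vectors
current = np.random.randn(16)
print(retrieve_relevant_memory(current, doc_memory))
```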
11. Graph-Based Bilingual Word Embedding for Statistical Machine Translation
- Author
-
Masao Utiyama, Rui Wang, Sabine Ploux, Eiichiro Sumita, Bao-Liang Lu, and Hai Zhao (Centre d'Analyse et de Mathématique sociales (CAMS), École des hautes études en sciences sociales (EHESS)-Centre National de la Recherche Scientifique (CNRS))
- Subjects
Word embedding, Phrase, Context (language use), Pointwise mutual information, Graph (abstract data type), Embedding, Machine translation, General Computer Science, Computer science, Artificial intelligence, Natural language processing, [SCCO.NEUR] Cognitive science/Neuroscience, [SHS] Humanities and Social Sciences - Abstract
Bilingual word embedding has been shown to be helpful for Statistical Machine Translation (SMT). However, most existing methods suffer from two obvious drawbacks. First, they only focus on simple contexts such as an entire document or a fixed-sized sliding window to build word embeddings and ignore latent useful information from the selected context. Second, the word sense, not the word itself, should be the minimal semantic unit; however, most existing methods still use word representations. To overcome these drawbacks, this article presents a novel Graph-Based Bilingual Word Embedding (GBWE) method that projects bilingual word senses into a multidimensional semantic space. First, a bilingual word co-occurrence graph is constructed using the co-occurrence and pointwise mutual information between the words. Then, maximum complete subgraphs (cliques), which play the role of a minimal unit for bilingual sense representation, are dynamically extracted according to the contextual information. Finally, correspondence analysis, principal component analysis, and neural networks are used to summarize the clique-word matrix into lower dimensions to build the embedding model. Without contextual information, the proposed GBWE can be applied to lexical translation. In addition, given contextual information, GBWE is able to give a dynamic solution for bilingual word representations, which can be applied to phrase translation and generation. Empirical results show that GBWE can enhance the performance of lexical translation, as well as Chinese/French-to-English and Chinese-to-Japanese phrase-based SMT tasks (IWSLT, NTCIR, NIST, and WAT). (A toy graph-construction sketch follows this entry.)
- Published
- 2018
- Full Text
- View/download PDF
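The first two steps of the abstract above, a PMI-weighted co-occurrence graph and maximal cliques as candidate sense units, can be sketched on toy monolingual data. The bilingual construction, correspondence analysis, and dimensionality reduction are omitted, the sentence-level co-occurrence counting is an assumption that may differ from the paper's scheme, and the sketch relies on the networkx package being available.

```python
import math
from collections import Counter
from itertools import combinations

import networkx as nx

def build_pmi_graph(sentences, threshold=0.0):
    """Build a word co-occurrence graph weighted by pointwise mutual
    information, then extract maximal cliques as candidate sense units."""
    word_counts = Counter()
    pair_counts = Counter()
    for sent in sentences:
        words = set(sent)
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    n = len(sentences)
    graph = nx.Graph()
    for pair, c in pair_counts.items():
        w1, w2 = tuple(pair)
        pmi = math.log((c / n) / ((word_counts[w1] / n) * (word_counts[w2] / n)))
        if pmi > threshold:
            graph.add_edge(w1, w2, weight=pmi)
    return graph, list(nx.find_cliques(graph))

toy = [["bank", "river", "water"], ["bank", "money", "loan"], ["river", "water", "fish"]]
graph, cliques = build_pmi_graph(toy)
print(cliques)  # e.g., the "money" sense and the "river" sense of "bank" separate
```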
12. Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation
- Author
-
Rui Wang, Eiichiro Sumita, and Masao Utiyama
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine translation, Sampling (statistics), Sample (statistics), Sentence, NIST, Computer science - Abstract
Traditional neural machine translation (NMT) training follows a fixed procedure in which each sentence is sampled once per epoch. In reality, some sentences are well learned during the first few epochs; under this approach, however, the well-learned sentences continue to be trained alongside those that are not yet well learned for 10-30 epochs, which wastes time. Here, we propose an efficient method to dynamically sample the sentences in order to accelerate NMT training. In this approach, a weight is assigned to each sentence based on the measured difference between its training costs in two iterations, and in each epoch a certain percentage of sentences are dynamically sampled according to their weights. Empirical results on the NIST Chinese-to-English and WMT English-to-German tasks show that the proposed method can significantly accelerate NMT training and improve NMT performance. Revised version of the ACL-2018 paper. (A minimal sampling sketch follows this entry.)
- Published
- 2018
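The sampling rule described above is concrete enough for a small sketch: weight each sentence by how much its training cost still changed between the two most recent passes, then draw a fraction of the corpus for the next epoch in proportion to those weights. The absolute-difference weighting and the keep ratio below are illustrative choices, not the paper's exact settings.

```python
import numpy as np

def dynamic_sample(prev_costs, curr_costs, keep_ratio=0.8):
    """Choose which training sentences to use in the next epoch.

    Sentences whose cost has stopped moving are considered "well learned"
    and are sampled less often than sentences whose cost is still changing.
    """
    weights = np.abs(prev_costs - curr_costs)
    probs = weights / weights.sum()
    n_keep = max(1, int(keep_ratio * len(curr_costs)))
    return np.random.choice(len(curr_costs), size=n_keep, replace=False, p=probs)

prev = np.array([2.1, 1.9, 0.40, 0.39, 3.0])
curr = np.array([1.5, 1.6, 0.39, 0.38, 2.2])
print(dynamic_sample(prev, curr, keep_ratio=0.6))  # indices of sampled sentences
```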
13. Guiding Neural Machine Translation with Retrieved Translation Pieces
- Author
-
Graham Neubig, Masao Utiyama, Satoshi Nakamura, Eiichiro Sumita, and Jingyi Zhang
- Subjects
FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL), Machine translation, Information storage and retrieval, Process (computing), Domain (software engineering), Sentence, BLEU, Computer science, Artificial intelligence, Natural language processing - Abstract
One of the difficulties of neural machine translation (NMT) is the recall and appropriate translation of low-frequency words or phrases. In this paper, we propose a simple, fast, and effective method for recalling previously seen translation examples and incorporating them into the NMT decoding process. Specifically, for an input sentence, we use a search engine to retrieve sentence pairs whose source sides are similar to the input sentence, and then collect n-grams that appear in the retrieved target sentences and are aligned with words that match in the source sentences, which we call "translation pieces". We compute pseudo-probabilities for each retrieved sentence based on similarities between the input sentence and the retrieved source sentences, and use these to weight the retrieved translation pieces. Finally, an existing NMT model is used to translate the input sentence, with an additional bonus given to outputs that contain the collected translation pieces. We show that our method improves NMT translation results by up to 6 BLEU points on three narrow-domain translation tasks where repetitiveness of the target sentences is particularly salient. It also causes little increase in translation time and compares favorably to another retrieval-based method with respect to accuracy, speed, and simplicity of implementation. NAACL 2018. (A toy piece-collection sketch follows this entry.)
- Published
- 2018
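The collect-and-reward mechanism above can be illustrated with plain dictionaries: gather weighted n-grams from retrieved target sentences, then add a bonus to any hypothesis containing them. The word-alignment filtering used in the paper is omitted for brevity, and the similarity values, weights, and function names are illustrative assumptions.

```python
def collect_translation_pieces(retrieved, max_n=4):
    """Collect weighted "translation pieces" from retrieved sentence pairs.

    `retrieved` is a list of (similarity, target_tokens) pairs returned by
    some search engine for the current input sentence; every n-gram of each
    retrieved target keeps the highest similarity seen for it.
    """
    pieces = {}
    for sim, tgt in retrieved:
        for n in range(1, max_n + 1):
            for i in range(len(tgt) - n + 1):
                gram = tuple(tgt[i:i + n])
                pieces[gram] = max(pieces.get(gram, 0.0), sim)
    return pieces

def piece_bonus(hypothesis, pieces, weight=1.0, max_n=4):
    """Additive reward for a candidate translation that contains pieces."""
    bonus = 0.0
    for n in range(1, max_n + 1):
        for i in range(len(hypothesis) - n + 1):
            bonus += pieces.get(tuple(hypothesis[i:i + n]), 0.0)
    return weight * bonus

retrieved = [(0.9, ["the", "patent", "is", "granted"]),
             (0.4, ["the", "claim", "is", "rejected"])]
pieces = collect_translation_pieces(retrieved)
print(piece_bonus(["the", "patent", "is", "rejected"], pieces))
```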
14. Agreement on Target-Bidirectional Recurrent Neural Networks for Sequence-to-Sequence Learning.
- Author
-
Lemao Liu, Andrew Finch, Masao Utiyama, and Eiichiro Sumita
- Subjects
ARTIFICIAL neural networks, LEARNING, MACHINE translating, TRANSLITERATION - Abstract
Recurrent neural networks are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, and thus performance suffers when dealing with long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional RNNs, which generates more balanced targets. In addition, we develop two efficient approximate search methods for agreement that are empirically shown to be almost optimal in terms of either sequence-level or non-sequence-level metrics. Extensive experiments were performed on three standard sequence-to-sequence transduction tasks: machine transliteration, grapheme-to-phoneme transformation, and machine translation. The results show that the proposed approach achieves consistent and substantial improvements, compared to many state-of-the-art systems. [ABSTRACT FROM AUTHOR] (A toy agreement-rescoring sketch follows this entry.)
- Published
- 2020
- Full Text
- View/download PDF
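The agreement idea can be illustrated with a simple re-ranking step: score each candidate with both a left-to-right and a right-to-left model and prefer candidates both directions like, which penalizes outputs whose bad suffix is only visible to one direction. Summing the two scores is an illustration only; the paper develops approximate joint search procedures rather than plain rescoring, and the stub scorers below are hypothetical.

```python
def agreement_rescore(candidates, score_l2r, score_r2l):
    """Re-rank candidate target sequences by the summed scores of a
    left-to-right scorer and a right-to-left scorer (applied to the
    reversed sequence)."""
    def joint(seq):
        return score_l2r(seq) + score_r2l(list(reversed(seq)))
    return sorted(candidates, key=joint, reverse=True)

# Toy usage: stub scorers stand in for trained directional RNNs; the second
# candidate ends badly, which both scorers notice from their own direction.
l2r = lambda seq: -0.1 * len(seq) - (0.5 if seq[-1] == "x" else 0.0)
r2l = lambda seq: -0.1 * len(seq) - (0.5 if seq[0] == "x" else 0.0)
cands = [["k", "a", "t", "o"], ["k", "a", "t", "x"]]
print(agreement_rescore(cands, l2r, r2l))
```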
15. Chapter 2: Mining Patents for Parallel Corpora: 2.6 MT Experiments.
- Author
-
Masao Utiyama and Hitoshi Isahara
- Published
- 2008
16. Chapter 2: Mining Patents for Parallel Corpora: 2.5 Statistics of the Patent Parallel Corpus.
- Author
-
Masao Utiyama and Hitoshi Isahara
- Published
- 2008
17. Chapter 2: Mining Patents for Parallel Corpora: 2.4 Alignment Procedure.
- Author
-
Masao Utiyama and Hitoshi Isahara
- Published
- 2008
18. Distortion Model Based on Word Sequence Labeling for Statistical Machine Translation.
- Author
-
Isao Goto, Masao Utiyama, Eiichiro Sumita, Akihiro Tamura, and Sadao Kurohashi
- Published
- 2014
- Full Text
- View/download PDF
19. Chapter 2: Mining Patents for Parallel Corpora: 2.7 Conclusion.
- Author
-
Masao Utiyama and Hitoshi Isahara
- Published
- 2008
20. Chapter 2: Mining Patents for Parallel Corpora: 2.2 Related Work.
- Author
-
Masao Utiyama and Hitoshi Isahara
- Published
- 2008
21. Chapter 2: Mining Patents for Parallel Corpora: 2.1 Introduction.
- Author
-
Masao Utiyama and Hitoshi Isahara
- Published
- 2008