1. BioNumQA-BERT
- Author
Ruibang Luo, Hing-Fung Ting, Tak-Wah Lam, and Ye Wu
- Subjects
Question answering, Language model, Language representation, Natural language processing, Artificial intelligence, Computer science, Source code, Encoding, Generalization
- Abstract
Biomedical question answering (QA) is playing an increasingly significant role in medical knowledge translation. However, current biomedical QA datasets and methods have limited capacity, as they commonly neglect the role of numerical facts in biomedical QA. In this paper, we constructed BioNumQA, a novel biomedical QA dataset in which research questions are answered using relevant numerical facts, for biomedical QA model training and testing. To leverage the new dataset, we designed a new method called BioNumQA-BERT by introducing a novel numerical encoding scheme into the popular biomedical language model BioBERT to represent the numerical values in the input text. Our experiments show that BioNumQA-BERT significantly outperformed other state-of-the-art models, including DrQA, BERT, and BioBERT (39.0% vs 29.5%, 31.3%, and 33.2%, respectively, in strict accuracy). To improve the generalization ability of BioNumQA-BERT, we further pretrained it on a large biomedical text corpus and achieved 41.5% strict accuracy. BioNumQA and BioNumQA-BERT establish a new baseline for biomedical QA. The dataset, source code, and pretrained model of BioNumQA-BERT are available at https://github.com/LeaveYeah/BioNumQA-BERT.
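The abstract states that BioNumQA-BERT introduces a numerical encoding scheme into BioBERT to represent numeric values in the input text, but does not describe the scheme itself. As a purely illustrative sketch of the general idea, one common approach is to normalize each numeric literal into explicit sign, digit, and exponent tokens before subword tokenization, so that magnitudes are not fragmented by the tokenizer. The function name `encode_numbers` and the `[NUM]`/`[EXP]` markers below are assumptions for illustration, not the authors' actual design:

```python
import math
import re

def encode_numbers(text):
    """Rewrite each numeric literal as sign + mantissa digits + exponent.

    Illustrative preprocessing only; BioNumQA-BERT's actual encoding
    scheme may differ. E.g. "2.5" -> "[NUM] + 2 5 0 0 [EXP] 0".
    """
    def repl(match):
        value = float(match.group())
        if value == 0:
            return "[NUM] + 0 [EXP] 0"
        # Decompose into mantissa in [1, 10) and integer exponent.
        exp = math.floor(math.log10(abs(value)))
        mantissa = abs(value) / (10 ** exp)
        # Keep four mantissa digits, emitted as separate space-joined tokens.
        digits = " ".join(f"{mantissa:.3f}".replace(".", ""))
        sign = "-" if value < 0 else "+"
        return f"[NUM] {sign} {digits} [EXP] {exp}"

    # Match integers and decimals, with optional leading minus sign.
    return re.sub(r"-?\d+(?:\.\d+)?", repl, text)
```

Applied to a sentence such as "300 patients received 2.5 mg", every number becomes a uniform token sequence whose exponent makes its magnitude directly visible to the model.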
- Published
2021