94 results for "Language Representation"
Search Results
2. Label-Embedding Bi-directional Attentive Model for Multi-label Text Classification
- Author
-
Jiangtao Ren, Naiyin Liu, and Qianlong Wang
- Subjects
0209 industrial biotechnology ,Language representation ,Computer Networks and Communications ,business.industry ,Computer science ,General Neuroscience ,Computational intelligence ,02 engineering and technology ,computer.software_genre ,Security token ,Field (computer science) ,Task (project management) ,ComputingMethodologies_PATTERNRECOGNITION ,020901 industrial engineering & automation ,Artificial Intelligence ,0202 electrical engineering, electronic engineering, information engineering ,Embedding ,020201 artificial intelligence & image processing ,Artificial intelligence ,State (computer science) ,business ,Representation (mathematics) ,computer ,Software ,Natural language processing - Abstract
Multi-label text classification is a critical task in the natural language processing field. As the latest language representation model, BERT obtains new state-of-the-art results on the classification task. Nevertheless, BERT’s text classification framework neglects to make full use of token-level text representation and label embedding, since it only utilizes the final hidden state corresponding to the [CLS] token as the sequence-level text representation for classification. We assume that finer-grained token-level text representation and label embedding contribute to classification. Consequently, in this paper, we propose a Label-Embedding Bi-directional Attentive model to improve the performance of BERT’s text classification framework. In particular, we extend BERT’s text classification framework with label embedding and bi-directional attention. Experimental results on five datasets indicate that our model yields notable improvements over both baselines and state-of-the-art models.
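As a rough illustration of the idea in this abstract, the minimal PyTorch module below scores every label against all token-level BERT states rather than only the [CLS] vector. This is a hypothetical, single-direction sketch, not the authors' architecture; the module name, shapes, and the plain dot-product attention are assumptions.

```python
import torch
import torch.nn as nn

class LabelAttentiveHead(nn.Module):
    """Scores each label against all token-level states (illustrative sketch only)."""
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels, hidden_size)  # learnable label embeddings
        self.classifier = nn.Linear(hidden_size, 1)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden) -- all BERT token states, not just [CLS]
        labels = self.label_emb.weight                                 # (num_labels, hidden)
        scores = torch.einsum("bsh,lh->bls", token_states, labels)    # token-to-label attention scores
        attn = scores.softmax(dim=-1)                                  # attend over tokens, per label
        label_ctx = torch.einsum("bls,bsh->blh", attn, token_states)  # label-specific text representation
        return self.classifier(label_ctx).squeeze(-1)                 # (batch, num_labels) logits

head = LabelAttentiveHead(hidden_size=768, num_labels=5)
dummy_states = torch.randn(2, 128, 768)        # stand-in for BERT's final hidden states
print(head(dummy_states).shape)                # torch.Size([2, 5])
```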
- Published
- 2021
- Full Text
- View/download PDF
3. Means of Language Representation of the Space Category in the English Biotechnology Terminology
- Author
-
O Syrotina
- Subjects
Language representation ,Computer science ,Process (engineering) ,Linguistics ,Terminology - Published
- 2020
- Full Text
- View/download PDF
4. Zero‐anaphora resolution in Korean based on deep language representation model: BERT
- Author
-
Young Tae Kim, Soojong Lim, and Dong-Yul Ra
- Subjects
Language representation ,General Computer Science ,TK7800-8360 ,Computer science ,business.industry ,Deep learning ,Resolution (electron density) ,deep learning ,language representation model ,TK5101-6720 ,computer.software_genre ,Electronic, Optical and Magnetic Materials ,Zero (linguistics) ,attention ,Telecommunication ,bidirectional encoder representations from transformers (bert) ,Artificial intelligence ,Electrical and Electronic Engineering ,Electronics ,business ,computer ,zero‐anaphora resolution (zar) ,Natural language processing ,Anaphora (rhetoric) - Abstract
It is necessary to achieve high performance on the task of zero-anaphora resolution (ZAR) to completely understand texts in Korean, Japanese, Chinese, and various other languages. Deep-learning-based models are being employed for building ZAR systems, owing to the success of deep learning in recent years. However, the objective of building a high-quality ZAR system is far from being achieved even with these models. To enhance current ZAR techniques, we fine-tuned a pre-trained bidirectional encoder representations from transformers (BERT) model. Notably, BERT is a general language representation model that enables systems to utilize deep bidirectional contextual information in natural language text. It extensively exploits the attention mechanism based upon the sequence-transduction model Transformer. In our model, classification is performed simultaneously for all words in the input word sequence to decide whether each word can be an antecedent. We seek end-to-end learning by disallowing any use of hand-crafted or dependency-parsing features. Experimental results show that, compared with other models, our approach can significantly improve the performance of ZAR.
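A minimal sketch of the per-token antecedent decision described above, assuming a generic multilingual BERT checkpoint and a binary label scheme; the paper's actual Korean data, features, and training procedure are not shown:

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Placeholder checkpoint; the classification head starts untrained and would be fine-tuned on ZAR data.
tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)   # 0 = not an antecedent, 1 = antecedent

inputs = tok("A placeholder sentence standing in for Korean input.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                 # (1, seq_len, 2): one decision per token
predictions = logits.argmax(dim=-1)                 # simultaneous classification of all tokens
```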
- Published
- 2020
5. K-BERT: Enabling Language Representation with Knowledge Graph
- Author
-
Zhe Zhao, Ping Wang, Haotang Deng, Peng Zhou, Weijie Liu, Zhiruo Wang, and Qi Ju
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer science ,media_common.quotation_subject ,02 engineering and technology ,computer.software_genre ,Machine Learning (cs.LG) ,030507 speech-language pathology & audiology ,03 medical and health sciences ,Limit (category theory) ,Reading (process) ,0202 electrical engineering, electronic engineering, information engineering ,media_common ,Language representation ,Computer Science - Computation and Language ,business.industry ,Matrix (music) ,General Medicine ,Knowledge graph ,Domain knowledge ,020201 artificial intelligence & image processing ,Artificial intelligence ,Noise (video) ,0305 other medical science ,business ,Computation and Language (cs.CL) ,computer ,Sentence ,Natural language processing ,Meaning (linguistics) - Abstract
Pre-trained language representation models, such as BERT, capture a general language representation from large-scale corpora but lack domain-specific knowledge. When reading a domain text, experts make inferences with relevant knowledge. For machines to achieve this capability, we propose a knowledge-enabled language representation model (K-BERT) with knowledge graphs (KGs), in which triples are injected into the sentences as domain knowledge. However, too much knowledge incorporation may divert a sentence from its correct meaning, which is called the knowledge noise (KN) issue. To overcome KN, K-BERT introduces soft positions and a visible matrix to limit the impact of the injected knowledge. Because K-BERT can load model parameters from pre-trained BERT, it can easily inject domain knowledge by being equipped with a KG, without pre-training from scratch. Our investigation reveals promising results on twelve NLP tasks. Especially on domain-specific tasks (including finance, law, and medicine), K-BERT significantly outperforms BERT, which demonstrates that K-BERT is an excellent choice for knowledge-driven problems that require expert knowledge.
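A toy sketch of the visible-matrix idea mentioned above, under the assumption that tokens injected from a KG triple should attend only to the entity they hang off (soft positions are omitted); the sentence and the triple are made up:

```python
import torch

sentence = ["Tim", "Cook", "visited", "Beijing"]
injected = {"Cook": ["CEO", "Apple"]}            # triple (Cook, CEO, Apple) attached to "Cook"

tokens, anchor_of = [], []
for word in sentence:
    pos = len(tokens)
    tokens.append(word); anchor_of.append(-1)        # ordinary sentence token
    for extra in injected.get(word, []):
        tokens.append(extra); anchor_of.append(pos)  # injected KG token, anchored at its entity

n = len(tokens)
visible = torch.zeros(n, n, dtype=torch.bool)
for i in range(n):
    for j in range(n):
        both_sentence = anchor_of[i] == -1 and anchor_of[j] == -1
        same_branch = (anchor_of[i] == j or anchor_of[j] == i
                       or (anchor_of[i] == anchor_of[j] != -1))
        visible[i, j] = both_sentence or same_branch or i == j

# Additive attention bias: -inf blocks attention between mutually invisible positions.
attn_bias = torch.zeros(n, n).masked_fill(~visible, float("-inf"))
print(tokens)          # ['Tim', 'Cook', 'CEO', 'Apple', 'visited', 'Beijing']
print(visible.int())   # KG tokens see only 'Cook' and each other
```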
- Published
- 2020
- Full Text
- View/download PDF
6. Named Entity Recognition in Spanish Biomedical Literature: Short Review and Bert Model
- Author
-
Liliya Akhtyamova
- Subjects
Normalization (statistics) ,Language representation ,Computer science ,business.industry ,Deep learning ,named entity recognition ,computer.software_genre ,Knowledge acquisition ,lcsh:Telecommunication ,Spanish Clinical Case Corpus (SPACCC) ,knowledge acquisition ,Named-entity recognition ,Error analysis ,lcsh:TK5101-6720 ,biomedical natural language processing ,Artificial intelligence ,Clinical case ,Computer Engineering ,Recognition (NER) ,business ,contextualized word embeddings ,computer ,Natural language processing - Abstract
Named Entity Recognition (NER) is the first step for knowledge acquisition when we deal with an unknown corpus of texts. Having extracted these entities, we can form a parameter space and solve text mining problems such as concept normalization, speech recognition, etc. Recent advances in NER are related to contextualized word embeddings, which transform text into a form that is effective for deep learning. In this paper, we show how a NER model detects pharmacological substances, compounds, and proteins in a dataset obtained from the Spanish Clinical Case Corpus (SPACCC). To achieve this goal, we train the BERT language representation model from scratch and fine-tune it for our problem. As expected, this model shows better results than a NER model trained over standard word embeddings. We further conduct an error analysis showing the origins of the model’s errors and proposing strategies to further improve the model’s quality.
- Published
- 2020
7. LANGUAGE REPRESENTATION OF ELITISM IN THE IMAGE CORPORATE DISCOURSE
- Author
-
Tatiana Vladimirovna Kovalkova
- Subjects
Language representation ,Computer science ,Linguistics ,Image (mathematics) ,Elitism - Published
- 2020
- Full Text
- View/download PDF
8. Extensive Feature Analysis and Baseline Model for Stance Detection Task
- Author
-
Deepti Mehrotra, Avantika Singh, Kumar Shaswat, and Parul Kalra
- Subjects
Language representation ,Process (engineering) ,Computer science ,business.industry ,Fact checking ,Baseline model ,computer.software_genre ,Task (project management) ,Similarity (psychology) ,Fake news ,Artificial intelligence ,business ,computer ,Natural language processing ,Stance detection - Abstract
Identifying and curtailing the spread of fake news is a complex and challenging task. Automated stance detection can be an important first step in alleviating the tedious process of fact checking. In this paper, we evaluate various language representation and statistical similarity techniques to discern the best possible mathematical modeling of sentences for the task of stance evaluation. Our work is based on the dataset of the Fake News Challenge (FNC-1). The paper details our implementation, the features, and the models that work best for the given task.
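One concrete example of the statistical similarity features referred to above might be the TF-IDF cosine similarity between a headline and an article body; the texts below are invented FNC-1-style inputs, not the challenge data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

headline = "Robot dogs deployed to deliver parcels"
body = "The company announced that wheeled robots, not robot dogs, will deliver parcels."

vec = TfidfVectorizer().fit([headline, body])
h, b = vec.transform([headline]), vec.transform([body])
print(cosine_similarity(h, b)[0, 0])   # similarity score fed to a downstream stance classifier
```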
- Published
- 2021
- Full Text
- View/download PDF
9. BioNumQA-BERT
- Author
-
Ruibang Luo, Hing-Fung Ting, Tak-Wah Lam, and Ye Wu
- Subjects
Scheme (programming language) ,Language representation ,Source code ,Computer science ,Generalization ,business.industry ,media_common.quotation_subject ,computer.software_genre ,Encoding (memory) ,Question answering ,Leverage (statistics) ,Language model ,Artificial intelligence ,business ,computer ,Natural language processing ,media_common ,computer.programming_language - Abstract
Biomedical question answering (QA) is playing an increasingly significant role in medical knowledge translation. However, current biomedical QA datasets and methods have limited capacity, as they commonly neglect the role of numerical facts in biomedical QA. In this paper, we constructed BioNumQA, a novel biomedical QA dataset that answers research questions using relevant numerical facts, for biomedical QA model training and testing. To leverage the new dataset, we designed a new method called BioNumQA-BERT by introducing a novel numerical encoding scheme into the popular biomedical language model BioBERT to represent the numerical values in the input text. Our experiments show that BioNumQA-BERT significantly outperformed other state-of-the-art models, including DrQA, BERT and BioBERT (39.0% vs 29.5%, 31.3% and 33.2%, respectively, in strict accuracy). To improve the generalization ability of BioNumQA-BERT, we further pretrained it on a large biomedical text corpus and achieved 41.5% strict accuracy. BioNumQA and BioNumQA-BERT establish a new baseline for biomedical QA. The dataset, source code and pretrained model of BioNumQA-BERT are available at https://github.com/LeaveYeah/BioNumQA-BERT.
- Published
- 2021
- Full Text
- View/download PDF
10. Rich Visual and Language Representation with Complementary Semantics for Video Captioning
- Author
-
Pengjie Tang, Hanli Wang, and Qinyu Li
- Subjects
Closed captioning ,Language representation ,Computer Networks and Communications ,business.industry ,Computer science ,020207 software engineering ,02 engineering and technology ,Coherence (statistics) ,Residual ,computer.software_genre ,Semantics ,Convolutional neural network ,Term (time) ,Hardware and Architecture ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,Natural language processing - Abstract
It is interesting and challenging to translate a video into natural description sentences based on the video content. In this work, an advanced framework is built to generate sentences with coherence and rich semantic expressions for video captioning. A long short-term memory (LSTM) network with an improved factored way is first developed, drawing on both the LSTM with a conventional factored way and the common practice of feeding multi-modal features into the LSTM at the first time step for visual description. Then, the combination of the LSTM network with the proposed improved factored way and the un-factored way is exploited, and a voting strategy is utilized to predict candidate words. In addition, for robust and abstract visual and language representation, residuals are employed to enhance the gradient signals learned from the residual network (ResNet), and a deeper LSTM network is constructed. Furthermore, three convolutional neural network based features, extracted from GoogLeNet, ResNet101, and ResNet152, are fused to capture more comprehensive and complementary visual information. Experiments are conducted on two benchmark datasets, MSVD and MSR-VTT2016, and competitive performance is obtained by the proposed techniques as compared to other state-of-the-art methods.
- Published
- 2019
- Full Text
- View/download PDF
11. REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training
- Author
-
Feng Ji, Feng-Lin Li, Yilin Niu, Fangkai Jiao, Yangyang Guo, and Liqiang Nie
- Subjects
FOS: Computer and information sciences ,Language representation ,Computer Science - Computation and Language ,business.industry ,Computer science ,computer.software_genre ,Bridge (nautical) ,Bridging (programming) ,Comprehension ,Model architecture ,Artificial intelligence ,Language model ,business ,Computation and Language (cs.CL) ,Machine reading ,computer ,Natural language processing - Abstract
Pre-trained Language Models (PLMs) have achieved great success on Machine Reading Comprehension (MRC) over the past few years. Although the general language representation learned from large-scale corpora does benefit MRC, the poor support for evidence extraction, which requires reasoning across multiple sentences, hinders PLMs from further advancing MRC. To bridge the gap between general PLMs and MRC, we present REPT, a REtrieval-based Pre-Training approach. In particular, we introduce two self-supervised tasks to strengthen evidence extraction during pre-training, which are further inherited by downstream MRC tasks through the consistent retrieval operation and model architecture. To evaluate our proposed method, we conduct extensive experiments on five MRC datasets that require collecting evidence from and reasoning across multiple sentences. Experimental results demonstrate the effectiveness of our pre-training approach. Moreover, further analysis shows that our approach is able to enhance the capacity of evidence extraction without explicit supervision.
- Published
- 2021
12. ADVISor: Automatic Visualization Answer for Natural-Language Question on Tabular Data
- Author
-
Ruike Jiang, Yun Han, Can Liu, and Xiaoru Yuan
- Subjects
Language representation ,Information retrieval ,Artificial neural network ,Computer science ,business.industry ,05 social sciences ,020207 software engineering ,02 engineering and technology ,Type (model theory) ,Pipeline (software) ,Visualization ,Data visualization ,0202 electrical engineering, electronic engineering, information engineering ,0501 psychology and cognitive sciences ,business ,050107 human factors ,Natural language - Abstract
We propose an automatic pipeline to generate visualizations with annotations that answer natural-language questions raised by the public on tabular data. With a pre-trained language representation model, the input natural-language question and the table headers are first encoded into vectors. From these vectors, a multi-task end-to-end deep neural network extracts the related data areas and the corresponding aggregation type. We present the result with carefully designed visualizations and annotations for different attribute types and tasks. We conducted a comparison experiment with state-of-the-art works and the best commercial tools. The results show that our method outperforms them with higher accuracy and more effective visualization.
- Published
- 2021
- Full Text
- View/download PDF
13. Fine-tuning Pretrained Multilingual BERT Model for Indonesian Aspect-based Sentiment Analysis
- Author
-
Masayu Leylia Khodra and Annisa Nurul Azhar
- Subjects
FOS: Computer and information sciences ,Language representation ,Fine-tuning ,Computer Science - Computation and Language ,Computer science ,business.industry ,Sentiment analysis ,computer.software_genre ,language.human_language ,Domain (software engineering) ,Task (project management) ,Indonesian ,Transformation (function) ,language ,Artificial intelligence ,business ,Computation and Language (cs.CL) ,computer ,Natural language processing ,Test data - Abstract
Although previous research on Aspect-based Sentiment Analysis (ABSA) for Indonesian reviews in the hotel domain has been conducted using CNN and XGBoost, the resulting model did not generalize well on test data, and a high number of OOV words contributed to misclassification cases. Nowadays, most state-of-the-art results for a wide array of NLP tasks are achieved by utilizing pretrained language representations. In this paper, we incorporate one of the foremost language representation models, BERT, to perform ABSA on an Indonesian review dataset. By combining multilingual BERT (m-BERT) with a task transformation method, we achieve a significant improvement of 8% in F1-score compared to the result from our previous study.
- Published
- 2021
14. Learning Deep and Wide Contextual Representations Using BERT for Statistical Parametric Speech Synthesis
- Author
-
Zhen-Hua Ling and Ya-Jie Zhang
- Subjects
Language representation ,Computer science ,Speech recognition ,Feature (machine learning) ,Context (language use) ,Speech synthesis ,Prosody ,computer.software_genre ,Encoder ,computer ,Parametric statistics - Abstract
In this paper, we propose a method of learning deep and wide contextual representations for statistical parametric speech synthesis (SPSS) using BERT, a pre-trained language representation model. Traditional acoustic models in SPSS utilize phoneme sequences and prosody labels as input and cannot make full use of the deep linguistic representations of the current and surrounding sentences. Therefore, this paper designs two context encoders, i.e., a sentence-window context encoder and a paragraph-level context encoder, to integrate the contextual representations extracted from multiple sentences by BERT into Tacotron2 via an extra attention module. The parameters of BERT are pre-trained and then fine-tuned together with the other components of the model. Experimental results on the Blizzard Challenge 2019 dataset show that both context encoders can reduce the errors of acoustic feature prediction and improve the subjective performance of synthetic speech compared with the baseline Tacotron2 model.
- Published
- 2021
- Full Text
- View/download PDF
15. Language Representation Models for Music Genre Classification Using Lyrics
- Author
-
Hasan Akalp, Seyma Yilmaz, Necva Bolucu, Enes Furkan Cigdem, and Burcu Can
- Subjects
Language representation ,Language understanding ,business.industry ,Computer science ,Deep learning ,Universal language ,Lyrics ,language.human_language ,Linguistics ,Field (computer science) ,language ,Artificial intelligence ,business ,Set (psychology) ,Period (music) - Abstract
There are various genres of music in every period and field of human life, and every music genre represents a set of shared conventions. Today people can listen to any genre of music they want on various music platforms. However, with the increasing number of music genres, the management of these platforms becomes difficult. Language representation models such as BERT and DistilBERT have been proven useful in learning universal language representations and have achieved impressive results in many language understanding tasks. In this study, we apply language representation models to music genre classification using song lyrics. We examine whether language representation models are better than traditional deep learning models for music genre classification by comparing results and computation times. Experimental results show that BERT outperforms the other models on one-label and multi-label classification, with accuracies of 77.63% and 71.29%, respectively. On the other hand, considering the time taken for one epoch, BERT runs 4 times faster than DistilBERT.
- Published
- 2021
- Full Text
- View/download PDF
16. Tuning Language Representation Models for Classification of Turkish News
- Author
-
Fatmanur Turhan, Burcu Can, Necva Bolucu, and Meltem Tokgoz
- Subjects
Language representation ,Language understanding ,business.industry ,Computer science ,Turkish ,Lexical analysis ,Representation (systemics) ,English language ,computer.software_genre ,language.human_language ,Task (project management) ,Quantitative analysis (finance) ,language ,Artificial intelligence ,business ,computer ,Natural language processing - Abstract
Pre-trained language representation models are very efficient at learning language representations independently of the particular natural language processing task to be performed. Language representation models such as BERT and DistilBERT have achieved impressive results in many language understanding tasks. Studies on text classification problems in the literature are generally carried out for the English language. This study aims to classify news in the Turkish language using pre-trained language representation models. We utilize BERT and DistilBERT by tuning both models for the text classification task to learn the categories of Turkish news with different tokenization methods. We provide a quantitative analysis of the performance of BERT and DistilBERT on the Turkish news dataset by comparing the models in terms of their representation capability in the text classification task. The highest performance is obtained with DistilBERT, with an accuracy of 97.4%.
- Published
- 2021
- Full Text
- View/download PDF
17. Analyzing DistilBERT for Sentiment Classification of Banking Financial News
- Author
-
M. N. Talib, Varun Dogra, Sahil Verma, Kavita, Aman Singh, and Nz Jhanjhi
- Subjects
Language representation ,Computer science ,business.industry ,Financial news ,Decision tree ,Machine learning ,computer.software_genre ,Logistic regression ,Random forest ,Artificial intelligence ,tf–idf ,business ,Baseline (configuration management) ,computer - Abstract
In this paper, sentiment classification approaches are introduced for Indian banking, governmental, and global news. The study assesses a state-of-the-art deep contextual language representation, DistilBERT, and a traditional context-independent representation, TF-IDF, on multiclass (positive, negative, and neutral) sentiment classification of news events. The DistilBERT model is fine-tuned and its representations are fed into four supervised machine learning classifiers: Random Forest, Decision Tree, Logistic Regression, and Linear SVC; the same classifiers are also trained on the baseline TF-IDF features. The findings indicate that DistilBERT can transfer basic semantic understanding to further domains and leads to greater accuracy than the baseline TF-IDF. The results also suggest that Random Forest with DistilBERT leads to higher accuracy than the other ML classifiers: Random Forest with DistilBERT achieves 78% accuracy, which is 7% more than with TF-IDF.
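A minimal sketch of the pipeline described in this abstract, assuming mean-pooled DistilBERT states as features and a generic English checkpoint; the toy texts and labels are placeholders, not the study's data:

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.ensemble import RandomForestClassifier

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
enc = AutoModel.from_pretrained("distilbert-base-uncased")

texts = ["Bank posts record quarterly profit", "Regulator fines lender over compliance failures"]
labels = [1, 0]   # 1 = positive, 0 = negative (neutral omitted for brevity)

with torch.no_grad():
    out = enc(**tok(texts, padding=True, truncation=True, return_tensors="pt"))
features = out.last_hidden_state.mean(dim=1).numpy()   # mean-pooled token states as features

clf = RandomForestClassifier().fit(features, labels)    # contextual embeddings -> Random Forest
print(clf.predict(features))
```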
- Published
- 2021
- Full Text
- View/download PDF
18. Pre-training a BERT with Curriculum Learning by Increasing Block-Size of Input Text
- Author
-
Koichi Nagatsuka, Clifford Broni-Bediako, and Masayasu Atsumi
- Subjects
Language representation ,business.industry ,Computer science ,Training (meteorology) ,computer.software_genre ,Machine learning ,Range (mathematics) ,Convergence (routing) ,Artificial intelligence ,business ,Baseline (configuration management) ,computer ,Block size ,Curriculum ,Natural language processing - Abstract
Recently, pre-trained language representation models such as BERT and RoBERTa have achieved significant results in a wide range of natural language processing (NLP) tasks; however, they require extremely high computational cost. Curriculum Learning (CL) is one potential solution to alleviate this problem. CL is a training strategy in which training samples are given to models in a meaningful order instead of by random sampling. In this work, we propose a new CL method which gradually increases the block size of the input text used to train the self-attention mechanism of BERT and its variants, using the maximum available batch size. Experiments in low-resource settings show that our approach outperforms the baseline in terms of convergence speed and final performance on downstream tasks.
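A small sketch of the curriculum idea described above, assuming block sizes that double at each stage up to the model maximum; the numbers are illustrative, not the paper's settings:

```python
def block_size_schedule(stage: int, start: int = 64, limit: int = 512) -> int:
    """Double the input block size at each curriculum stage, capped at the model maximum."""
    return min(start * (2 ** stage), limit)

def make_blocks(token_ids, block_size):
    """Split a long token stream into training examples of the current block size."""
    return [token_ids[i:i + block_size] for i in range(0, len(token_ids), block_size)]

corpus = list(range(4096))                      # stand-in for a tokenized corpus
for stage in range(4):
    size = block_size_schedule(stage)           # 64, 128, 256, 512
    blocks = make_blocks(corpus, size)
    print(f"stage {stage}: block_size={size}, {len(blocks)} training examples")
    # ... run one pre-training phase on `blocks` with the maximum batch size that fits ...
```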
- Published
- 2021
- Full Text
- View/download PDF
19. Identifying Topics of Scientific Articles with BERT-Based Approaches and Topic Modeling
- Author
-
Anna Glazkova
- Subjects
Topic model ,Language representation ,Information retrieval ,Scope (project management) ,Artificial neural network ,Computer science ,media_common.quotation_subject ,05 social sciences ,02 engineering and technology ,Task (project management) ,Voting ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,0509 other social sciences ,050904 information & library sciences ,media_common - Abstract
This paper describes neural models developed for the First Workshop on Scope Detection of the Peer Review Articles shared task, co-located with PAKDD 2021. The aim of the task is to identify the topic or category of scientific abstracts. We investigate the use of several fine-tuned language representation models pretrained on different large-scale corpora. In addition, we conduct experiments on combining BERT-based models and document topic vectors for scientific text classification. The topic vectors are obtained using LDA topic modeling. The topic-informed soft-voting ensemble of neural networks achieved an F1-score of 93.82%.
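The soft-voting step can be illustrated with a tiny sketch: class probabilities from the BERT-based models and a classifier over LDA topic vectors are averaged and the argmax is taken. The probabilities and model names below are invented for illustration.

```python
import numpy as np

# per-model predicted class probabilities for one abstract (3 candidate topics)
p_bert      = np.array([0.70, 0.20, 0.10])
p_scibert   = np.array([0.55, 0.35, 0.10])
p_topic_clf = np.array([0.60, 0.10, 0.30])   # classifier fed with LDA topic vectors

soft_vote = np.mean([p_bert, p_scibert, p_topic_clf], axis=0)
print(soft_vote, soft_vote.argmax())          # ensemble probabilities and the final label
```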
- Published
- 2021
- Full Text
- View/download PDF
20. BERT for Sequence-to-Sequence Multi-label Text Classification
- Author
-
Ramil Yarullin and Pavel Serdyukov
- Subjects
Sequence ,Language representation ,Computer science ,business.industry ,Artificial intelligence ,computer.software_genre ,business ,computer ,Encoder ,Natural language processing - Abstract
We study the BERT language representation model and the sequence generation model with BERT encoder for the multi-label text classification task. We show that the Sequence Generating BERT model achieves decent results in significantly fewer training epochs compared to the standard BERT. We also introduce and experimentally examine a mixed model, an ensemble of BERT and Sequence Generating BERT models. Our experiments demonstrate that the proposed model outperforms current baselines in several metrics on three well-studied multi-label classification datasets with English texts and two private Yandex Taxi datasets with Russian texts.
- Published
- 2021
- Full Text
- View/download PDF
21. Czert – Czech BERT-like Model for Language Representation
- Author
-
Ondřej Pražák, Jan Pašek, Michal Seják, Jakub Sido, Miloslav Konopík, and Pavel Přibáň
- Subjects
FOS: Computer and information sciences ,Czech ,language modeling ,Language representation ,Computer Science - Computation and Language ,Czech language ,Process (engineering) ,Computer science ,business.industry ,předtrénovaný model ,computer.software_genre ,language.human_language ,Research community ,jazykový mode ,language ,Artificial intelligence ,český jazyk ,business ,Computation and Language (cs.CL) ,pre-trained model ,Publication ,computer ,Natural language processing ,BERT - Abstract
This paper describes the training process of the first Czech monolingual language representation models, based on the BERT and ALBERT architectures. We pre-train our models on more than 340K sentences, which is 50 times more Czech data than is included in the multilingual models. We outperform the multilingual models on 9 out of 11 datasets. In addition, we establish new state-of-the-art results on nine datasets. Finally, we discuss properties of monolingual and multilingual models based upon our results. We publish all the pre-trained and fine-tuned models freely for the research community.
- Published
- 2021
- Full Text
- View/download PDF
22. eMLM: A New Pre-training Objective for Emotion Related Tasks
- Author
-
Tiberiu Sosea and Cornelia Caragea
- Subjects
Language representation ,Computer science ,business.industry ,Sentiment analysis ,Emotion detection ,computer.software_genre ,Variety (linguistics) ,Variation (linguistics) ,Robustness (computer science) ,Language modelling ,Artificial intelligence ,business ,computer ,Natural language processing - Abstract
BERT has been shown to be extremely effective on a wide variety of natural language processing tasks, including sentiment analysis and emotion detection. However, the proposed pre-training objectives of BERT do not induce any sentiment- or emotion-specific biases into the model. In this paper, we present Emotion Masked Language Modelling, a variation of Masked Language Modelling aimed at improving the BERT language representation model for emotion detection and sentiment analysis tasks. Using the same pre-training corpora as the original model, Wikipedia and BookCorpus, our BERT variation manages to improve the downstream performance on 4 tasks from emotion detection and sentiment analysis by an average of 1.2% F1. Moreover, our approach shows increased performance in our task-specific robustness tests.
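A toy sketch of emotion-weighted masking, assuming a small emotion lexicon and masking rates chosen purely for illustration (the paper's actual lexicon and probabilities are not given in this abstract):

```python
import random

EMOTION_WORDS = {"furious", "delighted", "terrified", "joyful"}

def emotion_masking(tokens, p_base=0.15, p_emotion=0.5, mask_token="[MASK]"):
    """Mask emotion-lexicon words more aggressively than ordinary tokens."""
    masked, targets = [], []
    for t in tokens:
        p = p_emotion if t.lower() in EMOTION_WORDS else p_base
        if random.random() < p:
            masked.append(mask_token); targets.append(t)   # the model must recover the original word
        else:
            masked.append(t); targets.append(None)
    return masked, targets

print(emotion_masking("She was delighted with the furious pace".split()))
```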
- Published
- 2021
- Full Text
- View/download PDF
23. Detecting Anatomical and Functional Connectivity Relations in Biomedical Literature via Language Representation Models
- Author
-
Anita Bandrowski, Joseph Menke, Ibrahim Burak Ozyurt, and Maryann E. Martone
- Subjects
Language representation ,Relation (database) ,business.industry ,Computer science ,Active learning (machine learning) ,Functional connectivity ,Wiring diagram ,Machine learning ,computer.software_genre ,Relationship extraction ,Anatomical connectivity ,Artificial intelligence ,business ,computer - Abstract
Understanding nerve-organ interactions is crucial to facilitate the development of effective bioelectronic treatments. Toward the goal of developing a systematized and computable wiring diagram of the autonomic nervous system (ANS), we introduce a curated ANS connectivity corpus together with several connectivity relation extraction systems based on neural language representation models. We also show that active-learning-guided curation for labeled corpus expansion significantly outperforms randomly selecting connectivity relation candidates, minimizing curation effort. Our final relation extraction system achieves F1 = 72.8% on anatomical connectivity and F1 = 74.6% on functional connectivity relation extraction.
- Published
- 2021
- Full Text
- View/download PDF
24. Initial Experiments on Question Answering from the Intrinsic Structure of Oral History Archives
- Author
-
Adam Chýlek, Jan Švec, and Luboš Šmídl
- Subjects
Structure (mathematical logic) ,Questions and answers ,Language representation ,odpovídání na otázky ,Information retrieval ,archiv MALACH ,Computer science ,Natural language question answering ,The MALACH archive ,Oral history ,Transformers ,Question answering ,Datasets ,Natural (music) ,Active listening ,datové sady ,transfromery - Abstract
Large audio archives with spoken content are natural candidates for question answering systems. Oral history archives generally contain many facts and stories that would otherwise be hard to obtain without listening to hours of recordings. We strive to make the archive more accessible by allowing natural-language question answering. In this paper, we present the challenges our dataset poses. We propose an initial approach that uses questions and answers mined from the archive itself, and we evaluate its performance in experiments with pretrained language representation and question answering models.
- Published
- 2021
- Full Text
- View/download PDF
25. Inherent Discriminability of BERT Towards Racial Minority Associated Data
- Author
-
Ziheng Chi, Hongmei Chi, Maryam Ramezanzadehmoghadam, and Edward L. Jones
- Subjects
Language representation ,Tokenization (data security) ,business.industry ,Mechanism (biology) ,Computer science ,Artificial intelligence ,Machine learning ,computer.software_genre ,Human resources ,business ,computer ,Masking (Electronic Health Record) ,Encoder - Abstract
AI and BERT (Bidirectional Encoder Representations from Transformers) have been increasingly adopted in the human resources (HR) industry for recruitment. The expected gains in efficiency and fairness could help remove biases in machine learning, help organizations find qualified candidates, and reduce bias in the labor market. BERT has further improved the performance of language representation models by using an auto-encoding model that incorporates larger bidirectional contexts. However, BERT’s underlying mechanisms that enhance its effectiveness, such as tokenization, masking, and leveraging the attention mechanism to compute vector scores, are not well understood.
- Published
- 2021
- Full Text
- View/download PDF
26. Advances and Challenges in Unsupervised Neural Machine Translation
- Author
-
Rui Wang and Hai Zhao
- Subjects
Language representation ,Machine translation ,Computer science ,business.industry ,Initialization ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,computer - Abstract
Unsupervised cross-lingual language representation initialization methods, together with mechanisms such as denoising and back-translation, have advanced unsupervised neural machine translation (UNMT), which has achieved impressive results. Meanwhile, there are still several challenges for UNMT. This tutorial first introduces the background and the latest progress of UNMT. We then examine a number of challenges to UNMT and give empirical results on how well the technology currently holds up.
- Published
- 2021
- Full Text
- View/download PDF
27. Temporal-aware Language Representation Learning From Crowdsourced Labels
- Author
-
Wenbiao Ding, Zitao Liu, Xiao Zhai, and Yang Hao
- Subjects
FOS: Computer and information sciences ,Language representation ,Computer Science - Computation and Language ,Source lines of code ,Computer Science - Artificial Intelligence ,Heuristic ,Computer science ,business.industry ,media_common.quotation_subject ,Machine learning ,computer.software_genre ,Memorization ,Range (mathematics) ,Artificial Intelligence (cs.AI) ,Code (cryptography) ,Deep neural networks ,Quality (business) ,Artificial intelligence ,business ,computer ,Computation and Language (cs.CL) ,media_common - Abstract
Learning effective language representations from crowdsourced labels is crucial for many real-world machine learning tasks. A challenging aspect of this problem is that the quality of crowdsourced labels suffers from high intra- and inter-observer variability. Since high-capacity deep neural networks can easily memorize all disagreements among crowdsourced labels, directly applying existing supervised language representation learning algorithms may yield suboptimal solutions. In this paper, we propose TACMA, a temporal-aware language representation learning heuristic for crowdsourced labels with multiple annotators. The proposed approach (1) explicitly models the intra-observer variability with an attention mechanism; (2) computes and aggregates per-sample confidence scores from multiple workers to address the inter-observer disagreements. The proposed heuristic is extremely easy to implement, in around 5 lines of code. The proposed heuristic is evaluated on four synthetic and four real-world data sets. The results show that our approach outperforms a wide range of state-of-the-art baselines in terms of prediction accuracy and AUC. To encourage reproducible results, we make our code publicly available at https://github.com/CrowdsourcingMining/TACMA.
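The per-sample aggregation step can be sketched in a few lines, in the spirit of the "around 5 lines of code" claim above; this is not the authors' implementation, and the votes and confidence scores are invented:

```python
import numpy as np

# rows = annotators, columns = samples; entries = binary label given by each worker
votes = np.array([[1, 0, 1],
                  [1, 1, 1],
                  [0, 0, 1]])
confidence = np.array([0.9, 0.6, 0.3])                 # per-annotator confidence (illustrative)

# confidence-weighted soft labels used as training targets instead of hard majority votes
soft_label = (confidence[:, None] * votes).sum(axis=0) / confidence.sum()
print(soft_label)                                      # approx. [0.83, 0.33, 1.0]
```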
- Published
- 2021
- Full Text
- View/download PDF
28. RobeCzech: Czech RoBERTa, a Monolingual Contextualized Language Representation Model
- Author
-
Milan Straka, Jakub Náplava, David Samuel, and Jana Straková
- Subjects
Czech ,Language representation ,business.industry ,Computer science ,language ,Artificial intelligence ,State (computer science) ,business ,computer.software_genre ,computer ,Natural language processing ,language.human_language ,Transformer (machine learning model) - Abstract
We present RobeCzech, a monolingual RoBERTa language representation model trained on Czech data. RoBERTa is a robustly optimized Transformer-based pretraining approach. We show that RobeCzech considerably outperforms equally-sized multilingual and Czech-trained contextualized language representation models, surpasses current state of the art in all five evaluated NLP tasks and reaches state-of-the-art results in four of them. The RobeCzech model is released publicly at https://hdl.handle.net/11234/1-3691 and https://huggingface.co/ufal/robeczech-base.
- Published
- 2021
- Full Text
- View/download PDF
29. NLyticsFKIE at SemEval-2021 Task 6: Detection of Persuasion Techniques In Texts And Images
- Author
-
Albert Pritzkau
- Subjects
Persuasion ,Sequence ,Language representation ,Computer science ,business.industry ,media_common.quotation_subject ,Security token ,computer.software_genre ,Class (biology) ,SemEval ,Task (project management) ,Model architecture ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial intelligence ,business ,computer ,Natural language processing ,media_common - Abstract
The following system description presents our approach to the detection of persuasion techniques in texts and images. The given task has been framed as a multi-label classification problem with the different techniques serving as class labels. In multi-label classification, a list of target variables, such as our class labels, is associated with every input chunk, and a document can simultaneously and independently be assigned to multiple labels or classes. In order to assign class labels to the given memes, we opted for RoBERTa (A Robustly Optimized BERT Pretraining Approach) as the neural network architecture for token and sequence classification. Starting from a pre-trained model for language representation, we fine-tuned this model on the given classification task with the provided annotated data in supervised training steps. To incorporate image features in the multi-modal setting, we rely on the pre-trained VGG-16 model architecture.
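A minimal sketch of the multi-label RoBERTa setup described above, using independent sigmoid outputs per technique; the label names, checkpoint, and 0.5 threshold are placeholders, and the classification head is untrained until fine-tuned on the shared-task data:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["loaded_language", "name_calling", "appeal_to_fear"]     # illustrative subset of techniques
tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=len(LABELS),
    problem_type="multi_label_classification")

inputs = tok("They will destroy everything we hold dear!", return_tensors="pt")
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)[0]   # independent probability per technique
print([label for label, p in zip(LABELS, probs) if p > 0.5])
```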
- Published
- 2021
- Full Text
- View/download PDF
30. Towards Similar User Utterance Augmentation for Out-of-Domain Detection
- Author
-
Arantza del Pozo, Andoni Azpeitia, Manex Serras, Mikel D. Fernández-Bhogal, and Laura García-Sardiña
- Subjects
Language representation ,Computer science ,business.industry ,Artificial intelligence ,computer.software_genre ,business ,Chatbot ,computer ,Utterance ,Natural language processing ,Domain (software engineering) - Abstract
Data scarcity is a common issue in the development of Dialogue Systems from scratch, where it is difficult to find dialogue data. This scenario is more likely to happen when the system’s language differs from English. This paper proposes a first text augmentation approach that selects samples similar to annotated user utterances from existing corpora, even if they differ in style, domain or content, in order to improve the detection of Out-of-Domain (OOD) user inputs. Three different sampling methods based on word-vectors extracted from BERT language representation model are compared. The evaluation is carried out using a Spanish chatbot corpus for OOD utterances detection, which has been artificially reduced to simulate various scenarios with different amounts of data. The presented approach is shown to be capable of enhancing the detection of OOD user utterances, achieving greater improvements when less annotated data is available.
- Published
- 2020
- Full Text
- View/download PDF
31. EFSG: Evolutionary Fooling Sentences Generator
- Author
-
Marco Brambilla, Marco Di Giovanni, Di Giovanni M., and Brambilla M.
- Subjects
COLA (software architecture) ,FOS: Computer and information sciences ,Language representation ,Computer Science - Computation and Language ,Computer science ,business.industry ,Generalization ,02 engineering and technology ,010501 environmental sciences ,01 natural sciences ,Adversarial system ,Binary classification ,Robustness (computer science) ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,Adversarial Attacks, Language Models, Evolutionary Generation ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Computation and Language (cs.CL) ,0105 earth and related environmental sciences ,Generator (mathematics) - Abstract
Large pre-trained language representation models (LMs) have recently achieved a great number of successes in many NLP tasks. In 2018 BERT, and later its successors (e.g. RoBERTa), obtained state-of-the-art results in classical benchmark tasks, such as the GLUE benchmark. Subsequently, works on adversarial attacks have been published to test their generalization properties and robustness. In this work, we design the Evolutionary Fooling Sentences Generator (EFSG), a model- and task-agnostic adversarial attack algorithm built using an evolutionary approach to generate false-positive sentences for binary classification tasks. We successfully apply EFSG to the CoLA and MRPC tasks, on BERT and RoBERTa, comparing performances. The results prove the presence of weak spots in state-of-the-art LMs. We finally test adversarial training as a data augmentation defence against EFSG, obtaining stronger models with no loss of accuracy when tested on the original datasets.
- Published
- 2020
32. Language representation learning models
- Author
-
El Habib Nfaoui and Sanae Achsas
- Subjects
Language representation ,Computer science ,business.industry ,02 engineering and technology ,Learning models ,computer.software_genre ,Task (project management) ,03 medical and health sciences ,Vector graphics ,0302 clinical medicine ,Text mining ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Word2vec ,Artificial intelligence ,business ,computer ,030217 neurology & neurosurgery ,Natural language processing - Abstract
Recently, Natural Language Processing has shown significant development, especially in text mining and analysis. An important task in this area is learning vector-space representations of text, since various machine learning algorithms require their inputs to be represented in a vector format. In this paper, we highlight the most important language representation learning models used in the literature, ranging from context-free approaches like word2vec and GloVe to recent contextualized approaches such as ELMo, BERT, and XLNet. We present and discuss their main architectures along with their main strengths and limitations.
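The contrast between context-free and contextualized representations can be seen in a short sketch: a contextualized model assigns different vectors to the same word in different sentences, whereas word2vec or GloVe would return a single fixed vector per word. The checkpoint and sentences below are illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def vector_of(sentence: str, word: str) -> torch.Tensor:
    """Return the contextualized vector of `word` inside `sentence`."""
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        states = model(**enc).last_hidden_state[0]
    idx = enc["input_ids"][0].tolist().index(tok.convert_tokens_to_ids(word))
    return states[idx]

v1 = vector_of("she sat by the river bank", "bank")
v2 = vector_of("he deposited cash at the bank", "bank")
print(torch.cosine_similarity(v1, v2, dim=0).item())   # < 1.0: the context changes the vector
```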
- Published
- 2020
- Full Text
- View/download PDF
33. Achieving Reliable Sentiment Analysis in the Software Engineering Domain using BERT
- Author
-
K. Vijay-Shanker, Eeshita Biswas, Mehmet Efruz Karabulut, and Lori Pollock
- Subjects
Language representation ,Software artifacts ,Computer science ,business.industry ,Sentiment analysis ,020207 software engineering ,02 engineering and technology ,010501 environmental sciences ,Recommender system ,01 natural sciences ,Software ,0202 electrical engineering, electronic engineering, information engineering ,Stack overflow ,Software engineering ,business ,Classifier (UML) ,0105 earth and related environmental sciences - Abstract
Researchers have shown that sentiment analysis of software artifacts can potentially improve various software engineering tools, including API and library recommendation systems, code suggestion tools, and tools for improving communication among software developers. However, sentiment analysis techniques applied to software artifacts have not yet yielded very high accuracy. Recent adaptations of sentiment analysis tools to the software domain have reported some improvements, but the f-measures for the positive and negative sentences still remain in the 0.4-0.64 range, which deters their practical usefulness for software engineering tools. In this paper, we explore the potential effectiveness of customizing BERT, a language representation model which has recently achieved very good results on various Natural Language Processing tasks on English texts, for the task of sentiment analysis of software artifacts. We describe our application of BERT to analyzing sentiments of sentences in Stack Overflow posts and compare the impact of a BERT sentiment classifier to state-of-the-art sentiment analysis techniques when used on a domain-specific data set created from Stack Overflow posts. We also investigate how the performance of sentiment analysis changes when using a much (3 times) larger data set than previous studies. Our results show that the BERT classifier achieves reliable performance for sentiment analysis of software engineering texts. BERT combined with the larger data set achieves an overall f-measure of 0.87, with the f-measures for the negative and positive sentences reaching 0.91 and 0.78 respectively, a significant improvement over the state-of-the-art.
- Published
- 2020
- Full Text
- View/download PDF
34. BUT-FIT at SemEval-2020 Task 5: Automatic detection of counterfactual statements with deep pre-trained language representation models
- Author
-
Martin Docekal, Martin Fajcik, Pavel Smrz, and Josef Jon
- Subjects
Statement (computer science) ,Counterfactual thinking ,FOS: Computer and information sciences ,Language representation ,Computer Science - Machine Learning ,Counterfactual conditional ,Computer Science - Computation and Language ,Antecedent (logic) ,business.industry ,Computer science ,Machine Learning (stat.ML) ,computer.software_genre ,SemEval ,Task (project management) ,Machine Learning (cs.LG) ,Statistics - Machine Learning ,Causal reasoning ,Artificial intelligence ,business ,computer ,Computation and Language (cs.CL) ,Natural language processing - Abstract
This paper describes BUT-FIT’s submission at SemEval-2020 Task 5: Modelling Causal Reasoning in Language: Detecting Counterfactuals. The challenge focused on detecting whether a given statement contains a counterfactual (Subtask 1) and extracting both the antecedent and consequent parts of the counterfactual from the text (Subtask 2). We experimented with various state-of-the-art language representation models (LRMs). We found the RoBERTa LRM to perform best in both subtasks. We achieved first place in both exact match and F1 for Subtask 2 and ranked second for Subtask 1.
- Published
- 2020
35. Enhancing Pre-trained Language Representation for Multi-Task Learning of Scientific Summarization
- Author
-
Jinpeng Li, Pengfei Yin, Yanbing Liu, Ruipeng Jia, Fang Fang, and Yannan Cao
- Subjects
Language representation ,Computer science ,business.industry ,05 social sciences ,Multi-task learning ,010501 environmental sciences ,computer.software_genre ,01 natural sciences ,Automatic summarization ,Data modeling ,0502 economics and business ,Task analysis ,Language model ,Artificial intelligence ,050207 economics ,business ,computer ,Natural language ,Natural language processing ,Sentence ,0105 earth and related environmental sciences - Abstract
This paper aims to extract summaries and keywords from scientific articles simultaneously, treating abstract extraction (AE) and keyword extraction (KE) as auxiliary tasks for each other. To address the data scarcity of scientific AE and KE tasks, we propose a multi-task learning framework which uses huge unlabeled data to learn a scientific language representation (pre-training) and uses smaller annotated data to transfer the learned representation to AE and KE (fine-tuning). Although the pre-trained language model performs well on universal natural language tasks, its capacity still has a margin of improvement for specific tasks. Inspired by this intuition, we use two additional tasks, keyword masking and key sentence prediction, before the fine-tuning phase to enhance the language representation for AE and KE. This language-representation-enhancing stage uses the same labeled data but different optimization objectives from the fine-tuning phase. In order to evaluate our model, we develop and release a high-quality annotated corpus of scientific papers with keywords and abstracts. We conduct comparative experiments on this dataset, and the experimental results show that our multi-task learning framework achieves state-of-the-art performance, proving the effectiveness of the language model enhancing mechanism.
- Published
- 2020
- Full Text
- View/download PDF
36. Editorial: Language Representation and Learning in Cognitive and Artificial Intelligence Systems
- Author
-
Giovanni Luca Masala, Bruno Golosio, Angelo Cangelosi, and Massimo Esposito
- Subjects
robotics ,Language representation ,Cognitive systems ,Computer science ,business.industry ,lcsh:Mechanical engineering and machinery ,Deep learning ,deep learning ,Robotics ,Cognition ,Natural Language Processing (NLP) ,lcsh:QA75.5-76.95 ,Computer Science Applications ,machine learning ,Artificial Intelligence ,lcsh:TJ1-1570 ,cognitive systems ,lcsh:Electronic computers. Computer science ,Artificial intelligence ,business - Published
- 2020
- Full Text
- View/download PDF
37. On the effectiveness of small, discriminatively pre-trained language representation models for biomedical text mining
- Author
-
Ibrahim Burak Ozyurt
- Subjects
0303 health sciences ,Language representation ,business.industry ,Computer science ,010501 environmental sciences ,Machine learning ,computer.software_genre ,01 natural sciences ,Relationship extraction ,Biomedical text mining ,03 medical and health sciences ,Ranking ,Named-entity recognition ,Question answering ,Artificial intelligence ,business ,computer ,Natural language processing ,030304 developmental biology ,0105 earth and related environmental sciences ,Transformer (machine learning model) - Abstract
Neural language representation models such as BERT have recently shown state-of-the-art performance on downstream NLP tasks, and the biomedical domain adaptation of BERT (Bio-BERT) has shown the same behavior on biomedical text mining tasks. However, due to their large model size and resulting increased computational need, practical application of models such as BERT is challenging, making smaller models with comparable performance desirable for real-world applications. Recently, a new Transformer-based language representation model named ELECTRA was introduced that makes efficient use of training data in a generative-discriminative neural model setting and shows performance gains over BERT. These gains are especially impressive for smaller models. Here, we introduce two small ELECTRA-based models named Bio-ELECTRA and Bio-ELECTRA++ that are eight times smaller than BERT Base and Bio-BERT and achieve comparable or better performance on biomedical question answering, yes/no question answer classification, question answer candidate ranking and relation extraction tasks. Bio-ELECTRA is pre-trained from scratch on PubMed abstracts using a consumer-grade GPU with only 8GB of memory. Bio-ELECTRA++ is the further pre-trained version of Bio-ELECTRA, trained on a corpus of open access full papers from PubMed Central. While the large BERT Base model outperforms Bio-ELECTRA++, Bio-ELECTRA and ELECTRA-Small++ on biomedical named entity recognition, with hyperparameter tuning Bio-ELECTRA++ achieves results comparable to BERT.
- Published
- 2020
- Full Text
- View/download PDF
38. A comparison of language representation models on small text corpora of scientific and technical documents
- Author
-
Thadeous A. Goodwyn, Peter F. David, Tavish M. McDonald, and Michael T. Gorczyca
- Subjects
Text corpus ,Language representation ,Computer science ,business.industry ,Artificial intelligence ,Technical documentation ,business ,computer.software_genre ,computer ,Natural language processing - Published
- 2020
- Full Text
- View/download PDF
39. Boosting Recommender Systems with Advanced Embedding Models
- Author
-
Gjorgjina Cenikj and Sonja Gievska
- Subjects
Language representation ,Information retrieval ,Boosting (machine learning) ,Computer science ,Exploratory research ,Embedding ,Inference ,Social media ,Recommender system ,Content filtering - Abstract
Recommender systems are paramount in providing personalized content and intelligent content filtering on any social media platform, web portal, and online application. In line with current trends in the field toward adopting problem and data encoding representations from other fields, this research investigates the feasibility of augmenting a graph-based recommender system for Amazon products with two state-of-the-art representation models. In particular, we investigated the potential benefits of using the language representation model BERT and GraphSage-based representations of nodes and edges for improving the quality of the recommendations. Link prediction and link attribute inference were used to identify the products that users will buy and to predict the rating they will give to a product, respectively. The initial results of our exploratory study are encouraging and point to potential directions for future research.
- Published
- 2020
- Full Text
- View/download PDF
40. Solving Arithmetic Word Problems with a Templatebased Multi-Task Deep Neural Network (T-MTDNN)
- Author
-
Donggeon Lee and Gahgene Gweon
- Subjects
020203 distributed computing ,Language representation ,Artificial neural network ,Computer science ,02 engineering and technology ,Word problem (mathematics education) ,Template ,Robustness (computer science) ,0202 electrical engineering, electronic engineering, information engineering ,Bit error rate ,Task analysis ,020201 artificial intelligence & image processing ,Arithmetic ,Bitwise operation - Abstract
Automatically solving arithmetic word problems has been a challenge both in terms of attaining robustness to unseen problems and achieving high problem-solving accuracy. In this paper, we propose a Template-based Multi-Task Deep Neural Network (T-MTDNN) framework, which utilizes two types of techniques. First, by generating normalized equation templates, we achieve robustness by enabling a more general language representation of a given linguistic task. Second, by applying MTDNN [1], which uses BERT with number and operator classification as multi-tasks, we gain higher problem-solving accuracy compared to T-RNN [2], the state-of-the-art model. Specifically, on the MAWPS dataset, the accuracy of T-MTDNN is 78.88% compared to 66.8% for T-RNN. On the Math23K dataset, the accuracy of T-MTDNN is 72.6% compared to 66.9% for T-RNN.
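A rough sketch of the equation-template normalization step, assuming numbers are simply replaced by indexed slots and the predicted template is then filled back in; the problem text and template are invented, and `eval` is used only for illustration:

```python
import re

problem = "Tom has 3 apples and buys 5 more. How many apples does he have in total?"
numbers = re.findall(r"\d+(?:\.\d+)?", problem)        # ['3', '5']

counter = iter(range(len(numbers)))
template_text = re.sub(r"\d+(?:\.\d+)?", lambda m: f"n{next(counter)}", problem)

equation_template = "x = n0 + n1"                      # the label a template classifier would predict
expression = equation_template.split("=")[1]
for slot, value in zip(["n0", "n1"], numbers):
    expression = expression.replace(slot, value)       # fill the slots with the original numbers

print(template_text)   # "Tom has n0 apples and buys n1 more. ..."
print(eval(expression))  # 8
```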
- Published
- 2020
- Full Text
- View/download PDF
41. SLM: Learning a Discourse Language Representation with Sentence Unshuffling
- Author
-
Christopher D. Manning, Drew A. Hudson, Kangwook Lee, and Haejun Lee
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Language representation ,Computer Science - Computation and Language ,Computer science ,business.industry ,02 engineering and technology ,010501 environmental sciences ,computer.software_genre ,01 natural sciences ,Machine Learning (cs.LG) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Language model ,Artificial intelligence ,business ,Computation and Language (cs.CL) ,computer ,Natural language ,Natural language processing ,Sentence ,0105 earth and related environmental sciences ,Transformer (machine learning model) - Abstract
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner. Recent pre-training methods in NLP focus on learning either bottom or top-level language representations: contextualized word representations derived from language model objectives at one extreme and a whole sequence representation learned by order classification of two given textual segments at the other. However, these models are not directly encouraged to capture representations of intermediate-size structures that exist in natural languages such as sentences and the relationships among them. To that end, we propose a new approach to encourage learning of a contextualized sentence-level representation by shuffling the sequence of input sentences and training a hierarchical transformer model to reconstruct the original ordering. Through experiments on downstream tasks such as GLUE, SQuAD, and DiscoEval, we show that this feature of our model improves the performance of the original BERT by large margins. (EMNLP 2020)
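A minimal sketch of how sentence-unshuffling training pairs can be constructed; the hierarchical transformer that consumes them is omitted, and the function name is illustrative rather than taken from the paper.

```python
import random

def make_unshuffling_example(sentences, seed=None):
    """Shuffle a document's sentences and return (shuffled, target),
    where target[i] is the original position of shuffled sentence i."""
    rng = random.Random(seed)
    order = list(range(len(sentences)))
    rng.shuffle(order)
    shuffled = [sentences[i] for i in order]
    return shuffled, order

doc = ["The cat sat.", "It was warm.", "Then it slept."]
shuffled, target = make_unshuffling_example(doc, seed=0)
# The model is trained to predict `target` (the original ordering)
# from the shuffled sentence sequence.
print(shuffled, target)
```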
- Published
- 2020
- Full Text
- View/download PDF
42. Pre-trained Models for Natural Language Processing: A Survey
- Author
-
Xuanjing Huang, Yunfan Shao, Ning Dai, Tianxiang Sun, Yige Xu, and Xipeng Qiu
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Word embedding ,Computer science ,02 engineering and technology ,010402 general chemistry ,computer.software_genre ,01 natural sciences ,Distributed representation ,Machine Learning (cs.LG) ,Taxonomy (general) ,General Materials Science ,Language representation ,Computer Science - Computation and Language ,Self supervised learning ,business.industry ,Deep learning ,General Engineering ,021001 nanoscience & nanotechnology ,0104 chemical sciences ,Categorization ,Language modelling ,Artificial intelligence ,0210 nano-technology ,business ,computer ,Computation and Language (cs.CL) ,Natural language processing - Abstract
Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy with four perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is intended to serve as a hands-on guide for understanding, using, and developing PTMs for various NLP tasks. (Invited Review of Science China Technological Sciences)
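As a hedged illustration of the adaptation step the survey discusses, the sketch below fine-tunes a generic pre-trained encoder on a two-class downstream task with the Hugging Face transformers API; the checkpoint name, example texts, and labels are placeholders, not anything prescribed by the survey.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative: adapt a pre-trained encoder to a 2-class downstream task.
name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

batch = tokenizer(["great product", "terrible product"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

outputs = model(**batch, labels=labels)   # fine-tuning loss + logits
outputs.loss.backward()                   # one gradient step of adaptation
print(outputs.logits.shape)               # torch.Size([2, 2])
```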
- Published
- 2020
- Full Text
- View/download PDF
43. Multimodal Sentiment Analysis with Multi-perspective Fusion Network Focusing on Sense Attentive Language
- Author
-
Xia Li and Minping Chen
- Subjects
Language representation ,Fusion ,Modalities ,Modality (human–computer interaction) ,Computer science ,business.industry ,Sentiment analysis ,Representation (systemics) ,computer.software_genre ,Multi perspective ,Word representation ,Artificial intelligence ,business ,computer ,Natural language processing - Abstract
Multimodal sentiment analysis aims to learn a joint representation of multiple features. As previous studies have demonstrated, the language modality may contain more semantic information than the other modalities. Based on this observation, we propose a Multi-perspective Fusion Network (MPFN) focusing on Sense Attentive Language for multimodal sentiment analysis. Different from previous studies, we use the language modality as the main part of the final joint representation, and propose a multi-stage and uni-stage fusion strategy to obtain a fused representation of the multiple modalities that assists the final language-dominated multimodal representation. In our model, a Sense-Level Attention Network is proposed to dynamically learn the word representation, guided by the fusion of the multiple modalities. In turn, the learned language representation can also help the multi-stage and uni-stage fusion of the different modalities. In this way, the model can jointly learn a well-integrated final representation focusing on the language and on the interactions between the multiple modalities at both the multi-stage and uni-stage levels. Several experiments are carried out on the CMU-MOSI, CMU-MOSEI, and YouTube public datasets. The experiments show that our model achieves better or competitive results compared with the baseline models.
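One way a language-dominated fusion of this kind could look is sketched below: a language vector attends over audio/visual features and is concatenated with the attended context. This is a deliberately simplified assumption, not the MPFN architecture; all module names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class LanguageGuidedFusion(nn.Module):
    """Lets a language vector attend over another modality's features and
    concatenates the attended context back onto it (illustrative only)."""
    def __init__(self, lang_dim=300, other_dim=64):
        super().__init__()
        self.proj = nn.Linear(other_dim, lang_dim)
        self.out = nn.Linear(2 * lang_dim, lang_dim)

    def forward(self, lang, others):          # lang: (B, D), others: (B, T, d)
        keys = self.proj(others)               # (B, T, D)
        attn = torch.softmax(keys @ lang.unsqueeze(-1), dim=1)  # (B, T, 1)
        context = (attn * keys).sum(dim=1)     # (B, D) attended context
        return self.out(torch.cat([lang, context], dim=-1))

fusion = LanguageGuidedFusion()
fused = fusion(torch.randn(8, 300), torch.randn(8, 20, 64))
print(fused.shape)  # torch.Size([8, 300])
```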
- Published
- 2020
- Full Text
- View/download PDF
44. Labeling of Multilingual Breast MRI Reports
- Author
-
Arnaldo Mayer, Chen-Han Tsai, Eli Konen, Miri Sklair-Levy, and Nahum Kiryati
- Subjects
Language representation ,medicine.diagnostic_test ,business.industry ,Computer science ,Clinical settings ,Machine learning ,computer.software_genre ,Clinical trial ,Improved performance ,medicine ,Breast MRI ,Artificial intelligence ,Transfer of learning ,business ,computer ,Classifier (UML) - Abstract
Medical reports are an essential medium for recording a patient’s condition throughout a clinical trial. They contain valuable information that can be extracted to generate the large labeled datasets needed for the development of clinical tools. However, the majority of medical reports are stored in a non-standardized format, and a trained human annotator (typically a doctor) must manually assess and label each case, resulting in an expensive and time-consuming procedure. In this work, we present a framework for developing a multilingual breast MRI report classifier using a custom-built language representation called LAMBR. Our proposed method overcomes practical challenges faced in clinical settings, and we demonstrate improved performance in extracting labels from medical reports compared with conventional approaches.
- Published
- 2020
- Full Text
- View/download PDF
45. BERT Feature Based Model for Predicting the Helpfulness Scores of Online Customers Reviews
- Author
-
Shuzhe Xu, Don Hong, and Salvador E. Barbosa
- Subjects
Computational model ,Language representation ,Information retrieval ,Artificial neural network ,Computer science ,business.industry ,Deep learning ,Helpfulness ,Feature based ,Feature selection ,Artificial intelligence ,business ,Encoder - Abstract
Online product reviews help consumers make purchase decisions when shopping online. As such, many computational models have been constructed to automatically evaluate the helpfulness of customer product reviews. However, many existing models are based on simple explanatory variables, including ones extracted from low-quality reviews that can be misleading and lead to confusion. Quality feature selection is essential for predicting the helpfulness of online customer reviews. Bidirectional Encoder Representations from Transformers (BERT) is a recently developed language representation model that attains state-of-the-art results on many natural language processing tasks. In this study, a predictive model for determining helpfulness scores of customer reviews, based on combining BERT features with deep learning techniques, is proposed. The application analyzes the Amazon product reviews dataset and uses a BERT-feature-based algorithm expected to help consumers make better purchase decisions.
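A minimal sketch of the feature-extraction idea: frozen BERT [CLS] vectors feeding a small regression head that predicts a helpfulness score. The head architecture, checkpoint name, and example reviews are illustrative, not the authors' exact model.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

# Illustrative: frozen BERT features + a small regression head.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
head = nn.Sequential(nn.Linear(768, 128), nn.ReLU(), nn.Linear(128, 1))

reviews = ["Works as described, battery lasts two days.", "Bad."]
batch = tokenizer(reviews, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    cls = encoder(**batch).last_hidden_state[:, 0]  # [CLS] token features
scores = head(cls)                                   # predicted helpfulness
print(scores.shape)  # torch.Size([2, 1])
```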
- Published
- 2020
- Full Text
- View/download PDF
46. CN-HIT-IT.NLP at SemEval-2020 Task 4: Enhanced Language Representation with Multiple Knowledge Triples
- Author
-
Peng Jin, Yice Zhang, Yang Fan, Bingquan Liu, Jiaxuan Lin, and Yuanchao Liu
- Subjects
Language representation ,business.industry ,Computer science ,computer.software_genre ,SemEval ,Focus (linguistics) ,Ranking (information retrieval) ,Task (project management) ,Knowledge graph ,Artificial intelligence ,business ,computer ,Natural language ,Natural language processing - Abstract
This paper describes our system that participated in SemEval-2020 Task 4: Commonsense Validation and Explanation. For this task, external knowledge, such as a knowledge graph, can clearly help the model understand commonsense in natural language statements. However, how to select the right triples for statements remains unsolved, so reducing the interference of irrelevant triples on model performance is a research focus. This paper adopts a modified K-BERT as the language encoder to enhance language representation through triples from knowledge graphs. Experiments show that our method is better than models without external knowledge, and is slightly better than the original K-BERT. We obtained an accuracy score of 0.97 in subtask A, ranking 1/45, and an accuracy score of 0.948, ranking 2/35.
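The sketch below shows the general idea of enriching a statement with knowledge-graph triples before encoding. It simply appends triple text to the statement, whereas K-BERT actually builds a sentence tree with soft positions and a visibility matrix, so this is a deliberate simplification with illustrative names and example triples.

```python
def inject_triples(statement, triples):
    """Append selected knowledge-graph triples to a statement so a
    standard encoder can see them (a simplification of K-BERT, which
    instead uses a sentence tree, soft positions, and a visibility
    matrix to limit the triples' influence)."""
    triple_text = " ".join(f"[{h} {r} {t}]" for h, r, t in triples)
    return f"{statement} {triple_text}".strip()

statement = "You can use a stove to heat soup."
triples = [("stove", "used_for", "heating"), ("soup", "is_a", "food")]
print(inject_triples(statement, triples))
# You can use a stove to heat soup. [stove used_for heating] [soup is_a food]
```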
- Published
- 2020
- Full Text
- View/download PDF
47. The Integrated Language Representation System of Illustrations Based on the Streaming Media Technology
- Author
-
Di Lu and Yi Yuan
- Subjects
Language representation ,Scope (project management) ,Cover (telecommunications) ,Multimedia ,Computer science ,Line (text file) ,computer.software_genre ,computer ,Connotation - Abstract
Illustration is the use of graphic images which, following the principle of the unity of aesthetics and practice, aims to make lines and shapes clear, lucid, and easy to reproduce. Illustration is a universal language; in commercial applications its designs are usually divided into characters, animals, and commodity images. In recent years, with the popularization of computers and the network, modern illustration creation in China has developed greatly, and illustration has gradually come into wide use in various fields of life and work. The application scope of traditional illustration can no longer cover the diversified characteristics of modern illustration. With the development of media technology, the application of streaming media technology in the integrated language representation system of illustrations has become part of the technical substance of illustration art display.
- Published
- 2020
- Full Text
- View/download PDF
48. ALBERT-Based Chinese Named Entity Recognition
- Author
-
Yishuang Ning, Haifeng Lv, and Ke Ning
- Subjects
Conditional random field ,Language representation ,Computer science ,business.industry ,Deep learning ,Model parameters ,computer.software_genre ,Method comparison ,Named-entity recognition ,Leverage (statistics) ,Artificial intelligence ,business ,computer ,Natural language processing - Abstract
Chinese named entity recognition (NER) is an important problem in the natural language processing (NLP) field. Most existing methods use traditional deep learning models that cannot fully leverage contextual dependencies, which are very important for capturing the relations between words or characters. To address this problem, language representation methods such as BERT have been proposed to learn global context information. Although these methods achieve good results, their large number of parameters limits their efficiency and their application in real-world scenarios. To improve both performance and efficiency, this paper proposes an ALBERT-based Chinese NER method that uses ALBERT, a Lite version of BERT, as the pre-trained model to reduce model parameters and improve performance through cross-layer parameter sharing. In addition, it uses a conditional random field (CRF) to capture sentence-level correlation information between words or characters and to alleviate tagging-inconsistency problems. Experimental results demonstrate that our method outperforms the comparison methods by 4.23–11.17% in terms of relative F1-measure with only 4% of BERT’s parameters.
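A minimal sketch of an ALBERT encoder with a CRF tagging head, assuming the Hugging Face transformers and pytorch-crf packages are available; the checkpoint, tag set, dummy gold tags, and English example sentence are placeholders rather than the paper's configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel
from torchcrf import CRF  # assumes the pytorch-crf package is installed

# Illustrative ALBERT encoder + CRF tagging head for NER.
NUM_TAGS = 7  # e.g. O, B-PER, I-PER, B-LOC, I-LOC, B-ORG, I-ORG

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
encoder = AutoModel.from_pretrained("albert-base-v2")
emit = nn.Linear(encoder.config.hidden_size, NUM_TAGS)
crf = CRF(NUM_TAGS, batch_first=True)

batch = tokenizer(["Alice works in Paris"], return_tensors="pt")
hidden = encoder(**batch).last_hidden_state        # (B, T, H)
emissions = emit(hidden)                           # per-token tag scores
mask = batch["attention_mask"].bool()

tags = torch.zeros(emissions.shape[:2], dtype=torch.long)  # dummy gold tags
loss = -crf(emissions, tags, mask=mask)            # negative log-likelihood
pred = crf.decode(emissions, mask=mask)            # best tag sequence per item
print(loss.item(), pred)
```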
- Published
- 2020
- Full Text
- View/download PDF
49. STRUCTURAL FEATURES OF THE LANGUAGE REPRESENTATION OF THE CONCEPT OF LONELINESS IN LEXICOGRAPHIC SOURCES
- Author
-
Svetlana A. Fetter, Olga N. Prokhorova, and Igor V. Chekulai
- Subjects
Language representation ,Computer science ,medicine ,Loneliness ,medicine.symptom ,Lexicographical order ,Linguistics - Published
- 2018
- Full Text
- View/download PDF
50. LANGUAGE REPRESENTATION OF SUBJECT AND OBJECT RELATIONS BETWEEN THE ELEMENTS OF SCIENTIFIC KNOWLEDGE (BASED ON DEFINITIONS OF TERMS OF DEVELOPING PROFESSIONAL FIELDS)
- Author
-
A.G. Monogarova and M.N. Latu
- Subjects
Sociology of scientific knowledge ,Language representation ,Computer science ,Object relations theory ,Subject (philosophy) ,Linguistics - Abstract
The paper presents the most common patterns of representation of subject and object relations in the definitions of terms, and identifies the ways subject relations between elements of scientific knowledge are realized in active and passive structures that form part of applied models of the organization of scientific knowledge. In addition, the article raises the question of the potential of various grammatical structures for conveying subject and object relations. The results of the study show that the system relation S can be represented by both lexical and grammatical means: its lexical verbalizers are the key words of blocks of subject relations, while among the grammatical means the category of case stands out.
- Published
- 2018
- Full Text
- View/download PDF