1. Language modeling and bidirectional coders representations: an overview of key technologies
- Author
- D. I. Kachkou
- Subjects
Computer science, transformer architecture, model BERT, Text processing, information technology, language models, informatics, natural language processing, Transformer (machine learning model), Class (computer programming), QA75.5-76.95, Electronic computers. Computer science, Artificial intelligence, attention mechanism, Knowledge transfer, Strengths and weaknesses, Natural language
- Abstract
The article is an essay on the development of the natural language processing technologies that formed the basis of BERT (Bidirectional Encoder Representations from Transformers), a language model from Google that achieves strong results on a whole class of natural language understanding problems. Two key ideas implemented in BERT are knowledge transfer and the attention mechanism. The model is first trained to solve two problems on a large unlabeled data set and can then reuse the language patterns it has identified to learn a specific text processing problem efficiently. The Transformer architecture is based on the attention mechanism, i.e. it evaluates the relationships between input tokens. The article also notes the strengths and weaknesses of BERT and directions for further improvement of the model.
- Published
- 2021