Transformer-Based Composite Language Models for Text Evaluation and Classification.

Authors:
Škorić, Mihailo
Utvić, Miloš
Stanković, Ranka
Source:
Mathematics (2227-7390). Nov 2023, Vol. 11, Issue 22, p4660. 25p.
Publication Year:
2023

Abstract

Parallel natural language processing systems were previously successfully tested on the tasks of part-of-speech tagging and authorship attribution through mini-language modeling, achieving significantly better results than independent methods for seven European languages. The aim of this paper is to present the advantages of using composite language models in the processing and evaluation of texts written in an arbitrary highly inflective, morphologically rich natural language, particularly Serbian. A perplexity-based dataset, the main asset for the methodology assessment, was created using a series of generative pre-trained transformers trained on different representations of the Serbian language corpus, together with a set of sentences classified into three groups (expert translations, corrupted translations, and machine translations). The paper describes a comparative analysis of the calculated perplexities in order to measure the classification capability of different models on two binary classification tasks. In the course of the experiment, we tested three standalone language models (the baseline) and two composite language models (based on the perplexities output by all three standalone models). The presented results single out a complex stacked classifier, which uses a multitude of features extracted from the perplexity vectors, as the optimal composite language model architecture for both tasks. [ABSTRACT FROM AUTHOR]
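To make the pipeline the abstract describes more concrete, the Python sketch below illustrates its two stages: scoring a sentence's perplexity under a causal language model, and feeding per-model perplexity features into a stacked classifier. This is a minimal sketch under stated assumptions, not the authors' implementation: the HuggingFace-style API, the synthetic feature values, and the particular base estimators are all illustrative choices.

```python
# Minimal sketch of perplexity-based sentence classification. The model API
# (HuggingFace transformers), the synthetic data, and the classifier choices
# are illustrative assumptions, not the paper's exact configuration.
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC


def sentence_perplexity(model, tokenizer, sentence: str) -> float:
    """Perplexity = exp(mean negative log-likelihood of the tokens)."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # For causal LMs, passing labels=input_ids makes outputs.loss the
        # average next-token cross-entropy over the sentence.
        outputs = model(**inputs, labels=inputs["input_ids"])
    return torch.exp(outputs.loss).item()


# Each sentence is represented by one perplexity value per standalone model.
# The feature values below are synthetic: fluent text tends to score low
# under all three models, while corrupted or machine-translated text scores
# high. Real features would come from sentence_perplexity() above.
rng = np.random.default_rng(0)
X_fluent = rng.normal(loc=[40.0, 55.0, 50.0], scale=8.0, size=(40, 3))
X_suspect = rng.normal(loc=[180.0, 220.0, 200.0], scale=40.0, size=(40, 3))
X = np.vstack([X_fluent, X_suspect])
y = np.array([0] * 40 + [1] * 40)  # 0 = expert translation, 1 = other

# A stacked classifier over the perplexity features, in the spirit of the
# composite model the abstract singles out as optimal.
stack = StackingClassifier(
    estimators=[
        ("forest", RandomForestClassifier(random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
)
stack.fit(X, y)
print(stack.predict([[42.0, 57.0, 49.0]]))  # low perplexities -> class 0
```

Note that the paper's classifier reportedly draws on "a multitude of features extracted from perplexity vectors" rather than a single scalar per model; the stacking principle, however, is the same.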

Details

Language:
English
ISSN:
2227-7390
Volume:
11
Issue:
22
Database:
Academic Search Index
Journal:
Mathematics (2227-7390)
Publication Type:
Academic Journal
Accession Number:
173862855
Full Text:
https://doi.org/10.3390/math11224660