
Methodology for experimental verification of software that implements the algorithm for graphematic analysis and preprocessing of text resources.

Authors :
Petrushevskaya, Anastasia
Rabin, Alexey
Source :
AIP Conference Proceedings; 2021, Vol. 2402 Issue 1, p1-8, 8p
Publication Year :
2021

Abstract

A technique has been developed for experimentally verifying software that implements an algorithm for graphematic analysis and preprocessing of text resources. The software addresses automated mining of poorly structured data: it extracts semantically significant constructs from poorly structured resources by transforming the text with a number of auxiliary algorithms and classifying the resulting data with machine learning algorithms in Python. In the first stage of the algorithm, the text file is searched for abbreviations and acronyms using templates. A search is then performed for graphematic descriptors, namely names, email addresses, URLs, phone numbers, and dates. In the final stage, the boundaries of sentences and direct speech are identified. [ABSTRACT FROM AUTHOR]
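
The three stages named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the actual templates, so every pattern and name below (ABBREVIATION, DESCRIPTORS, DIRECT_SPEECH, SENTENCE_END, analyze) is a hypothetical stand-in built on the standard re module. Name detection is omitted because it would likely need a dictionary or a trained model, and the machine learning classification step is not shown.

```python
import re

# Hypothetical templates; the paper's real patterns are not given in the abstract.
# Stage 1: abbreviations and acronyms (e.g. "SPB", "i.e.").
ABBREVIATION = re.compile(r"\b(?:[A-Z]{2,}\b|(?:[A-Za-z]\.){2,})")

# Stage 2: graphematic descriptors (names are deliberately left out).
DESCRIPTORS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "url": re.compile(r"https?://[^\s<>\"]+"),
    # Assumes an international prefix so that dates are not mistaken for phones.
    "phone": re.compile(r"\+\d[\d\s()-]{7,}\d"),
    "date": re.compile(
        r"\b\d{1,2}[./-]\d{1,2}[./-]\d{2,4}\b|\b\d{4}-\d{2}-\d{2}\b"
    ),
}

# Stage 3: direct speech (straight double quotes only) and sentence ends.
DIRECT_SPEECH = re.compile(r'"([^"]*)"')
SENTENCE_END = re.compile(r'(?<=[.!?])\s+|(?<=[.!?]")\s+')

MASK = "\x00"  # placeholder for periods that must not end a sentence


def analyze(text):
    """Run the three stages on `text` and return the findings as a dict."""
    abbreviations = [m.group() for m in ABBREVIATION.finditer(text)]
    descriptors = {
        name: [m.group() for m in pattern.finditer(text)]
        for name, pattern in DESCRIPTORS.items()
    }

    # Mask periods inside abbreviations and descriptors so the sentence
    # splitter does not treat them as sentence boundaries.
    chars = list(text)
    for pattern in (ABBREVIATION, *DESCRIPTORS.values()):
        for match in pattern.finditer(text):
            for i in range(match.start(), match.end()):
                if chars[i] == ".":
                    chars[i] = MASK
    protected = "".join(chars)

    sentences = [
        part.replace(MASK, ".").strip()
        for part in SENTENCE_END.split(protected)
        if part.strip()
    ]
    direct_speech = [m.group(1) for m in DIRECT_SPEECH.finditer(text)]

    return {
        "abbreviations": abbreviations,
        "descriptors": descriptors,
        "sentences": sentences,
        "direct_speech": direct_speech,
    }
```

Running the descriptor search before sentence splitting matters in this sketch because periods inside dates, email addresses, and abbreviations would otherwise be read as sentence ends. One known limitation is that an abbreviation that genuinely ends a sentence keeps its period masked, so that boundary is missed.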

Details

Language :
English
ISSN :
0094243X
Volume :
2402
Issue :
1
Database :
Complementary Index
Journal :
AIP Conference Proceedings
Publication Type :
Conference
Accession number :
153597831
Full Text :
https://doi.org/10.1063/5.0073144