Methodology for experimental verification of software that implements the algorithm for graphematic analysis and preprocessing of text resources.
- Source :
- AIP Conference Proceedings; 2021, Vol. 2402 Issue 1, p1-8, 8p
- Publication Year :
- 2021
Abstract
- A technique has been developed for experimental verification of software that implements an algorithm for graphematic analysis and preprocessing of text resources. The software addresses the problem of automated mining of poorly structured data. It is designed to extract semantically significant constructs from poorly structured resources by transforming the text with a number of auxiliary algorithms and classifying the resulting data with machine learning algorithms in Python. In the first stage of the algorithm, the text file is searched for abbreviations and acronyms. After this template-based search, the algorithm looks for graphematic descriptors, namely names, emails and URLs, phone numbers, and dates. In the final stage, the boundaries of sentences and direct speech are identified. [ABSTRACT FROM AUTHOR]
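The stages named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the templates, so every regular expression and function name below (`find_descriptors`, `split_sentences`, and the pattern constants) is an assumption chosen for illustration. Name detection and the machine learning classification step are omitted.

```python
import re

# Hypothetical templates; the paper's actual patterns are not given in the abstract.
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")
ABBREVIATION = re.compile(r"\b(?:e\.g|i\.e|etc|Dr|Mr|Mrs|Prof)\.")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
URL = re.compile(r"https?://[^\s<>\"']+")
PHONE = re.compile(r"\+\d{1,3}[ -]\d{3}[ -]\d{3}[ -]\d{4}")
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
DIRECT_SPEECH = re.compile(r'"[^"]*"')


def find_descriptors(text):
    """Stages 1 and 2: collect abbreviations, acronyms and graphematic descriptors."""
    patterns = {
        "abbreviation": ABBREVIATION,
        "acronym": ACRONYM,
        "email": EMAIL,
        "url": URL,
        "phone": PHONE,
        "date": DATE,
        "direct_speech": DIRECT_SPEECH,
    }
    return {name: pattern.findall(text) for name, pattern in patterns.items()}


def split_sentences(text):
    """Final stage: split on sentence-ending punctuation, skipping known abbreviations."""
    # Temporarily mask the period of each abbreviation so it is not taken as a sentence end.
    masked = ABBREVIATION.sub(lambda m: m.group(0)[:-1] + "\x00", text)
    parts = re.split(r'(?<=[.!?])\s+(?=[A-Z"])', masked)
    return [p.replace("\x00", ".").strip() for p in parts if p.strip()]


sample = (
    "Dr. Smith works at NASA. Write to smith@example.org or see "
    "https://example.org/info on 2021-11-03. Call +1 555 123 4567."
)
found = find_descriptors(sample)
sentences = split_sentences(sample)
```

Running the abbreviation search first matters for the last stage: the period in "Dr." would otherwise be mistaken for a sentence boundary, which is one plausible reason for the stage ordering the abstract describes.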
Details
- Language :
- English
- ISSN :
- 0094243X
- Volume :
- 2402
- Issue :
- 1
- Database :
- Complementary Index
- Journal :
- AIP Conference Proceedings
- Publication Type :
- Conference
- Accession number :
- 153597831
- Full Text :
- https://doi.org/10.1063/5.0073144