1. Técnicas em software livre para exploração de corpora do português livremente disponíveis na WWW [Free software techniques for exploring Portuguese corpora freely available on the WWW].
- Author
- de Alencar, Leonel Figueiredo
- Subjects
- *LANGUAGE & languages, *CORPORA, *INFORMATION science, *ELECTRONIC systems, *SCRIPTS, *COMMAND languages (Computer science), *COMPUTATIONAL linguistics, *COMPUTER operating systems, *FEASIBILITY studies
- Abstract
- This paper approaches corpus linguistics as a subfield of applied informatics, one of whose main focuses is automatic data extraction from corpora. For this purpose, we develop commands and scripts in the UNIX bash command language and illustrate their applicability by investigating the -vel suffix and iterations of letters and words in two of the main corpora of Portuguese. We argue that free software tools with a textual interface, mastery of which, together with programming skills, is a necessity in computational linguistics, are more advantageous in corpus linguistics than commercial, proprietary programs with a graphical interface. [ABSTRACT FROM AUTHOR]
- Published
- 2009