Vocabulary size prediction of Croatian texts

Authors :
Tuđman, Miroslav
Mikelić, Nives
Boras, Damir
Budin, Leo
Lužar-Stiffler, Vesna
Bekić, Zoran
Hljuz Dobrić, Vesna
Publication Year :
2003

Abstract

Preliminary research on the vocabulary size of Croatian lexical corpora shows that the distribution of types is regular and that deviations of the calculated values fall within theoretically acceptable limits. The research also led to the conclusion that Zipf's law is not applicable to the Croatian language, because lexical density differs between languages: the proportion of types to tokens varies from language to language, so the parameters of that proportion must be calculated separately for each language.
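
For illustration only, the proportion of types to tokens referred to in the abstract can be sketched in Python. This is not the authors' procedure: the function name `type_token_stats`, the lowercasing step, and the simple word-character tokenization are all assumptions made for the example.

```python
import re
from collections import Counter


def type_token_stats(text: str) -> dict:
    """Count tokens (running words) and types (distinct words) in a text.

    Hypothetical helper: tokenization is a plain Unicode word match,
    which is an assumption and not the method used in the paper.
    """
    # \w matches Unicode letters in Python 3, so č, ć, đ, š, ž are kept.
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    # Type/token ratio; defined as 0.0 for an empty text.
    ratio = n_types / n_tokens if n_tokens else 0.0
    return {"tokens": n_tokens, "types": n_types, "ttr": ratio}


stats = type_token_stats("Pas je pas, a mačka je mačka.")
print(stats["tokens"], stats["types"])
```

Because this ratio depends on text length and on the language, values from different corpora are only comparable under matching conditions, which is in line with the abstract's point that the parameters must be estimated for each language separately.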

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.57a035e5b1ae..1778fee671afbb7ebe3ef1d682c52f8c