Augmenting Statistical Data Dissemination by Short Quantified Sentences of Natural Language

Authors :
Miroslav Hudec
Andreas Holzinger
Erika Bednárová
Source :
Journal of Official Statistics, Vol 34, Iss 4, Pp 981-1010 (2018)
Publication Year :
2018
Publisher :
Walter de Gruyter GmbH, 2018.

Abstract

Data from National Statistical Institutes is generally considered an important source of credible evidence for a variety of users. Summarization and dissemination via traditional methods is a convenient approach for providing this evidence. However, the results are usually comprehensible only to users with a considerable level of statistical literacy. A promising alternative lies in augmenting the summarization linguistically. Less statistically literate users (e.g., domain experts and the general public), as well as people with disabilities, can benefit from such a summarization. This article studies the potential of summaries expressed in short quantified sentences. Summaries such as “most visits from remote countries are of a short duration” can be immediately understood by diverse users. Linguistic summaries are not intended to replace existing dissemination approaches, but can augment them by providing alternatives for the benefit of diverse users of official statistics. Linguistic summarization can be achieved by formalizing linguistic terms and relative quantifiers mathematically as fuzzy sets. To avoid summaries based on outliers or on data with low coverage, a quality criterion is applied. The concept is demonstrated on test interfaces that interpret summaries from real municipal statistical data. The article identifies a number of further research opportunities and demonstrates ways to explore them.
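
As an illustration of how a sentence such as “most visits from remote countries are of a short duration” might be evaluated with fuzzy sets, the following minimal Python sketch computes a validity degree and a simple coverage measure. It assumes a common Zadeh/Yager-style formulation with the minimum t-norm. The membership functions, their breakpoints, the coverage definition, and the sample records are all hypothetical and are not taken from the article, whose exact formulas may differ.

```python
def ramp_up(x, a, b):
    """Membership rising linearly from 0 at a to 1 at b (hypothetical shape)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def ramp_down(x, a, b):
    """Membership falling linearly from 1 at a to 0 at b."""
    return 1.0 - ramp_up(x, a, b)


def evaluate_summary(records, mu_r, mu_s, mu_q):
    """Evaluate 'Q records that are R are S'.

    Returns (validity, coverage). Coverage here is the average membership
    in R, an assumed stand-in for the article's quality criterion.
    """
    r = [mu_r(rec) for rec in records]
    # Minimum t-norm combines the restriction R and the summarizer S.
    both = [min(ri, mu_s(rec)) for ri, rec in zip(r, records)]
    support = sum(r)
    if support == 0:
        return 0.0, 0.0
    proportion = sum(both) / support
    return mu_q(proportion), support / len(records)


# Hypothetical visit records: (distance in km, nights stayed).
visits = [(4000, 1), (3500, 2), (5000, 1), (200, 7), (2000, 6)]

validity, coverage = evaluate_summary(
    visits,
    mu_r=lambda v: ramp_up(v[0], 1000, 3000),   # "remote country"
    mu_s=lambda v: ramp_down(v[1], 2, 5),       # "short duration"
    mu_q=lambda p: ramp_up(p, 0.5, 0.85),       # relative quantifier "most"
)
print(validity, coverage)
```

A summary would then be reported only when both the validity and the coverage exceed chosen thresholds, which is one way to keep sentences supported by only a few outlying records from being disseminated.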

Details

ISSN :
2001-7367
Volume :
34
Database :
OpenAIRE
Journal :
Journal of Official Statistics
Accession number :
edsair.doi.dedup.....55b864a2bc449c3ce9895c65671bdea8
Full Text :
https://doi.org/10.2478/jos-2018-0048