Augmenting Statistical Data Dissemination by Short Quantified Sentences of Natural Language
- Source :
- Journal of Official Statistics, Vol 34, Iss 4, Pp 981-1010 (2018)
- Publication Year :
- 2018
- Publisher :
- Walter de Gruyter GmbH, 2018.
- Abstract :
- Data from National Statistical Institutes is generally considered an important source of credible evidence for a variety of users. Summarization and dissemination via traditional methods is a convenient approach for providing this evidence. However, such output is usually comprehensible only to users with a considerable level of statistical literacy. A promising alternative lies in augmenting the summarization linguistically. Less statistically literate users (e.g., domain experts and the general public), as well as disabled people, can benefit from such a summarization. This article studies the potential of summaries expressed in short quantified sentences. Summaries such as “most visits from remote countries are of a short duration” can be immediately understood by diverse users. Linguistic summaries are not intended to replace existing dissemination approaches, but can augment them by providing alternatives for the benefit of diverse users of official statistics. Linguistic summarization can be achieved by formalizing linguistic terms and relative quantifiers mathematically as fuzzy sets. To avoid summaries based on outliers or data with low coverage, a quality criterion is applied. The concept is demonstrated on test interfaces that interpret summaries from real municipal statistical data. The article identifies a number of further research opportunities and demonstrates ways to explore them.
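As a rough illustration of how a sentence such as “most visits from remote countries are of a short duration” could be evaluated with fuzzy sets, the sketch below computes a truth value for a relative quantifier in the style of Zadeh's calculus. The membership functions, their breakpoints (kilometres, days, proportions), and the function names are all assumptions made for this example and are not taken from the article. The coverage ratio returned alongside the truth value is only a simple stand-in for a quality check, not the article's quality criterion.

```python
def remote(distance_km):
    """Membership in the fuzzy set 'remote' (assumed: 0 up to 1000 km, 1 from 3000 km)."""
    if distance_km <= 1000:
        return 0.0
    if distance_km >= 3000:
        return 1.0
    return (distance_km - 1000) / 2000


def short_duration(days):
    """Membership in the fuzzy set 'short duration' (assumed: 1 up to 2 days, 0 from 5 days)."""
    if days <= 2:
        return 1.0
    if days >= 5:
        return 0.0
    return (5 - days) / 3


def most(proportion):
    """Relative quantifier 'most' (assumed: 0 up to 0.5, 1 from 0.85, linear in between)."""
    if proportion <= 0.5:
        return 0.0
    if proportion >= 0.85:
        return 1.0
    return (proportion - 0.5) / 0.35


def summarize(records):
    """Evaluate 'most remote visits are short' over (distance_km, days) pairs.

    Returns (truth, coverage): truth is the degree to which the sentence holds,
    coverage is the share of records that belong to the restriction 'remote'.
    """
    if not records:
        return 0.0, 0.0
    r = [remote(d) for d, _ in records]
    s = [short_duration(t) for _, t in records]
    restriction_mass = sum(r)
    if restriction_mass == 0:
        # No record is remote to any degree, so the sentence has no support.
        return 0.0, 0.0
    # Proportion of 'remote' records that are also 'short' (min as the t-norm).
    proportion = sum(min(a, b) for a, b in zip(r, s)) / restriction_mass
    coverage = restriction_mass / len(records)
    return most(proportion), coverage


# Hypothetical data: three remote short visits and one nearby long visit.
visits = [(4000, 1), (5000, 2), (3500, 1), (200, 10)]
truth, coverage = summarize(visits)
print(truth, coverage)
```

A summary would typically be shown to users only when both the truth value and the coverage exceed chosen thresholds, so that sentences supported by a handful of outliers are suppressed.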
- Subjects :
- 0209 industrial biotechnology
media_common.quotation_subject
Fuzzy set
02 engineering and technology
Statistical literacy
computer.software_genre
linguistic quantifiers
020901 industrial engineering & automation
0202 electrical engineering, electronic engineering, information engineering
Quality (business)
Dissemination
linguistic summaries
media_common
business.industry
Statistics
Automatic summarization
HA1-4737
Variety (cybernetics)
fuzzy sets
user interface
020201 artificial intelligence & image processing
Artificial intelligence
database queries
User interface
business
computer
Natural language
Natural language processing
- ISSN :
- 2001-7367
- Volume :
- 34
- Database :
- OpenAIRE
- Journal :
- Journal of Official Statistics
- Accession number :
- edsair.doi.dedup.....55b864a2bc449c3ce9895c65671bdea8
- Full Text :
- https://doi.org/10.2478/jos-2018-0048