
The Game Walkthrough Corpus (GWTC) – A Resource for the Analysis of Textual Game Descriptions

Authors :
Jochen Tiepmar
Manuel Burghardt
Source :
Journal of Open Humanities Data, Vol 7 (2021); 14
Publication Year :
2021
Publisher :
Ubiquity Press, 2021.

Abstract

We present the Game Walkthrough Corpus (GWTC), which contains 12,295 unique walkthrough documents covering 6,117 games. For each game walkthrough, we provide frequencies of unigrams and bigrams, treating the walkthrough document as a Bag of Words. In addition, we provide word frequencies at the sentence level. Furthermore, the GWTC contains game-related metadata, including title, publisher, developer, year, and genre. All the language statistics and metadata are stored in separate plain text files and can be referenced through uniform resource names (URNs). These URNs can also be used to derive any combination of statistics and metadata. Researchers, for instance, can investigate the most frequent unigrams for games in the “Adventure” genre. This way, the GWTC can be reused for different kinds of research questions on gaming language.
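
To illustrate the kinds of statistics the abstract describes, the following is a minimal Python sketch of document-level unigram and bigram counts and sentence-level word counts. It is not the authors' pipeline: the tokenizer, the sentence splitter, the sample text, and the function names (`tokenize`, `ngram_counts`, `sentence_counts`) are all assumptions made for illustration, and the corpus's actual file layout and URN scheme are not modeled here.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; the GWTC tokenizer may differ.
    return re.findall(r"[a-z0-9']+", text.lower())

def ngram_counts(tokens, n):
    # Count contiguous n-grams over the whole token sequence,
    # i.e. the document is treated as one bag of n-grams.
    return Counter(zip(*(tokens[i:] for i in range(n))))

def sentence_counts(text):
    # Assumption: naive sentence split on ., ! and ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [Counter(tokenize(s)) for s in sentences]

# Hypothetical walkthrough snippet, not taken from the corpus.
sample = "Open the door. Take the key and open the chest."

tokens = tokenize(sample)
unigrams = Counter(tokens)             # document-level word frequencies
bigrams = ngram_counts(tokens, 2)      # document-level bigram frequencies
per_sentence = sentence_counts(sample) # one Counter per sentence

print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

Because the document-level counts ignore sentence boundaries, a bigram such as ("door", "take") is counted even though it spans two sentences; the per-sentence counters keep that distinction.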

Details

Language :
English
ISSN :
2059-481X
Volume :
7
Database :
OpenAIRE
Journal :
Journal of Open Humanities Data
Accession number :
edsair.doi.dedup.....59b6a21d052623dc1b0371508f7cf778