The importance of sampling frames in representative historical corpora : a case study of Parisian theater
- Source :
- CogniTextes, Vol 19 (2019)
- Publication Year :
- 2019
- Publisher :
- Association Française de Linguistique Cognitive, 2019.
Abstract
- Cognitive linguistics makes specific claims about language use, and corpora are our most powerful tool to test those claims. Representative sampling (Laplace 1814) is a technique that allows us to study smaller, more manageable corpora, and generalize our results to a broader sampling frame. For a sampled corpus to be relevant to our research questions, its sampling frame must have an understandable connection to the subject of our research question. In my dissertation study (Grieve-Smith 2009) I tested the type frequency hypothesis of analogical extension (Bybee 1995) using the FRANTEXT corpus (CNRTL 2018). In this study I test the theatrical texts in FRANTEXT from 1800-1815 against the new Digital Parisian Stage corpus, sampled from Wicks (1950 et seq.), a catalog of every play that premiered in Paris in the nineteenth century. Declarative sentence negations in the Digital Parisian Stage corpus occurred with ne … pas in 73.9 % of tokens, while in FRANTEXT they only occurred with ne … pas in 50.5 % of tokens. This shows that FRANTEXT is biased in favor of elite literary language. To properly test usage-based theories of language change we will need a representative corpus covering a century or more.
Details
- Language :
- English, French
- ISSN :
- 1958-5322
- Volume :
- 19
- Database :
- Directory of Open Access Journals
- Journal :
- CogniTextes
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.9362880801bd4c26b690e8657f393058
- Document Type :
- article
- Full Text :
- https://doi.org/10.4000/cognitextes.1671