The Children's Picture Books Lexicon (CPB-Lex): A large-scale lexical database from children's picture books.

Authors :
Green, Clarence
Keogh, Kathleen
Sun, He
O'Brien, Beth
Source :
Behavior Research Methods. Aug 2024, Vol. 56 Issue 5, p4504-4521. 18p.
Publication Year :
2024

Abstract

This article presents CPB-Lex, a large-scale database of lexical statistics derived from children's picture books (age range 0–8 years). Such a database is essential for research in psychology, education and computational modelling, where rich details on the vocabulary of early print exposure are required. CPB-Lex was built through an innovative method of computationally extracting lexical information from automatic speech-to-text captions and subtitle tracks generated from social media channels dedicated to reading picture books aloud. It consists of approximately 25,585 types (wordforms) and their frequency norms (raw and Zipf-transformed), a lexicon of bigrams (two-word sequences and their transitional probabilities) and a document-term matrix (which shows the importance of each word in the corpus in each book). Several immediate contributions of CPB-Lex to behavioural science research are reported, including that the new CPB-Lex frequency norms strongly predict age of acquisition and outperform comparable child-input lexical databases. The database allows researchers and practitioners to extract lexical statistics for high-frequency words which can be used to develop word lists. The paper concludes with an investigation of how CPB-Lex can be used to extend recent modelling research on the lexical diversity children receive from picture books in addition to child-directed speech. Our model shows that the vocabulary input from a relatively small number of picture books can dramatically enrich vocabulary exposure from child-directed speech and potentially assist children with vocabulary input deficits. The database is freely available from the Open Science Framework repository: https://tinyurl.com/4este73c. [ABSTRACT FROM AUTHOR]
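
For readers unfamiliar with the statistics named in the abstract, the following is a minimal sketch of how Zipf-transformed frequencies and bigram transitional probabilities are commonly computed. It is not the authors' pipeline: the function names and the toy token list are hypothetical, the Zipf value uses the widely used definition log10(frequency per million words) + 3 without any smoothing, and the transitional probability is taken as the forward probability count(w1 w2) / count(w1). The abstract does not state which tokenization or smoothing choices CPB-Lex uses.

```python
import math
from collections import Counter


def zipf_scores(tokens):
    """Zipf value per wordform: log10(frequency per million words) + 3.

    Assumption: no smoothing is applied; published norms may differ.
    """
    counts = Counter(tokens)
    total = len(tokens)
    return {w: math.log10(c / total * 1e6) + 3 for w, c in counts.items()}


def transitional_probabilities(tokens):
    """Forward transitional probability P(w2 | w1) for each adjacent pair."""
    # Count each word only where it can start a bigram (all but the last token).
    first_counts = Counter(tokens[:-1])
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    return {(a, b): n / first_counts[a] for (a, b), n in bigram_counts.items()}


# Hypothetical toy corpus, not taken from the database.
toy_tokens = "the cat sat on the mat the cat ran".split()
zipf = zipf_scores(toy_tokens)
tp = transitional_probabilities(toy_tokens)
```

In this toy corpus, "the" is followed by "cat" in two of its three occurrences, so `tp[("the", "cat")]` is 2/3. On a corpus this small the Zipf values are inflated and serve only to illustrate the transformation.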

Details

Language :
English
ISSN :
1554-351X
Volume :
56
Issue :
5
Database :
Academic Search Index
Journal :
Behavior Research Methods
Publication Type :
Academic Journal
Accession number :
178775348
Full Text :
https://doi.org/10.3758/s13428-023-02198-y