86 results on '"keyness"'
Search Results
2. Evaluation of keyness metrics: performance and reliability.
- Author
Sönning, Lukas
- Subjects
JOB applications ,TASK analysis ,ACADEMIC discourse
- Abstract
The methodological debates surrounding keyword analysis have given rise to a wide range of keyness metrics. The present paper delineates four dimensions of keyness, which distinguish between frequency- and dispersion-related perspectives. Existing measures are then organized according to these dimensions and evaluated with regard to their performance on a specific keyword analysis task: The identification of key verbs in academic writing. To this end, the rankings produced by 32 different metrics are evaluated against an established academic word list. Further, the reliability of measures is assessed, to determine whether they produce stable rankings across repeated studies on the same pair of text varieties. We observe notable differences among metrics with regard to these criteria. Our findings provide further support for the superiority of the Wilcoxon rank sum test and text-dispersion–based measures, and allow us to identify, within each dimension of keyness, metrics that may be given preference in applied work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
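As a rough illustration of the text-dispersion perspective mentioned in the abstract of result 2, the sketch below counts how many texts in each corpus contain a word at least once, rather than how often the word occurs overall. The function name `text_dispersion` and the toy corpora are hypothetical; this is a minimal sketch under those assumptions, not the procedure evaluated in the paper.

```python
def text_dispersion(word, texts):
    """Proportion of texts that contain `word` at least once."""
    if not texts:
        raise ValueError("corpus must contain at least one text")
    hits = sum(1 for tokens in texts if word in tokens)
    return hits / len(texts)


# Toy corpora (made-up data): each text is a list of lowercased tokens.
target = [
    ["we", "analyse", "data"],
    ["results", "indicate", "growth"],
    ["we", "indicate", "limits"],
]
reference = [
    ["i", "think", "so"],
    ["they", "indicate", "nothing"],
    ["we", "went", "home"],
    ["ok", "then"],
]

dp_target = text_dispersion("indicate", target)        # 2 of 3 texts
dp_reference = text_dispersion("indicate", reference)  # 1 of 4 texts
difference = dp_target - dp_reference                  # positive: wider spread in target
```

A dispersion-based score of this kind is insensitive to a single text that repeats a word many times, which is one reason such measures are contrasted with plain frequency comparisons.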
3. Du Fu's conspicuous negativity and Li Bai's hidden positivity: a sentiment comparison and exploration.
- Author
Meng, Yingying, Wan, Yuwei, and Kit, Chunyu
- Subjects
*POETRY collections , *CHINESE poetry , *OPTIMISM , *SENTIMENT analysis , *ENCYCLOPEDIAS & dictionaries
- Abstract
In the studies of classical Chinese poetry, the comparison between Li Bai and Du Fu is an everlasting topic, yielding many qualitative interpretations, among which a widely known but disputable one is Li's positivity versus Du's negativity. With the development of digital means, distant reading has become possible, and the sentiment issue can be further explored in quantitative ways. This research conducts a corpus-based sentiment comparison of Li and Du with a self-constructed sentiment dictionary. The Complete Collection of Tang Poems is used as a representative of Tang poets, and sentiment comparisons are made at the levels of poems, verses, and characters, as well as key characters extracted with the log-likelihood measure. Analyses show that (1) among Tang poets, Du is more negative at all of the above textual levels, while Li is only more positive at the key character level, proving the importance of key characters in readers' perception of sentiment; (2) Li and Du both stand out among Tang poets with a negative depiction of the dark reality and a positive expression of grand ideals; and (3) Li's positivity is largely embodied in his depictions of color, light, and temperature, while Du's negativity is closely related to his psychological description. To conclude, this research has not only determined the sentiment difference between Li and Du but also located its sources in texts with a novel key character-based sentiment analysis approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
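The abstract of result 3 mentions extracting key characters with the log-likelihood measure. The sketch below shows the two-term log-likelihood (G2) form that is widely used for keyness in corpus tools. The function name, argument names, and the example counts are hypothetical, and nothing here is taken from the paper's own implementation.

```python
import math


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-term log-likelihood (G2) for one item across two corpora.

    freq_* are the item's frequencies; size_* are the corpus sizes in tokens.
    """
    total = size_target + size_ref
    combined = freq_target + freq_ref
    # Expected frequencies if the item were equally common in both corpora.
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Made-up counts: an item seen 20 times in the target and never in the reference.
score = log_likelihood(20, 1000, 0, 1000)
```

The score is zero when the item has the same relative frequency in both corpora and grows as the two relative frequencies diverge; it does not by itself say which corpus the item is typical of, so the direction has to be read from the raw frequencies.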
4. Key n-Grams in EU Directives and in the UK National Legislation on Consumer Contracts.
- Author
Giampieri, Patrizia
- Abstract
Key n-grams are useful in the analysis of legal discourse as they help bring recurrent key expressions to the fore and understand the patterning of legal language. This paper aims to generate, analyse and compare the key n-grams of two legal corpora: a corpus of European directives on distance consumer contracts and a UK national legislation corpus on the same subject-matter. The corpora are considered, alternatively, as both focus and reference corpora. In this way, keyness, i.e., the terminology that makes each corpus unique, is revealed from both corpora. The paper findings mostly bring to the fore five different patterns: differences in the key n-grams due to institutional or country-related factors; legalese influences; typical n-grams of Eurolect; dichotomy in the terminology used (albeit applying the same legal principles), and polysemy (i.e., similar words with different applications in various genres). This analysis confirms the usefulness and insightfulness of key n-grams in understanding the impact of disciplinary conventions in legal language. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. U.S. administration’s press communications on Tunisia after the July 25, 2021 ‘state of exception’: The shaping of urgency discourse
- Author
Boutheina Ben Ghozlen
- Subjects
u.s. press communications ,tunisia ,july 25 ,crisis ,urgency ,cads ,framing ,keyness ,collocation network ,Philology. Linguistics ,P1-1091
- Abstract
July 25, 2021 was an exceptional day in Tunisia, ushering in a new chapter in its contemporary political history and triggering a wave of global responses to the declared ‘state of exception’. This research examines the U.S. administration’s press communications on Tunisia following this event. Specifically, it explores (i) the dominant crisis frames permeating these communications and their underlying political agenda and (ii) the extent to which the discourse they imparted signals a change in U.S. foreign policy towards Tunisia in crisis situations. To meet these research objectives, a corpus-based investigation was undertaken using an integrative framework combining qualitative (frames) and quantitative (keyness and collocations) approaches. Results revealed changing discourses around the theme of crisis, moving from a sense of togetherness to urgency. This may echo the cautious attitude of the American government and its heightened concern about Kais Saied’s transitional measures. In broad terms, the exploration offered a glimpse of how the dynamics of global politics unfold discursively. Importantly, the Biden administration’s construction of Tunisia’s political-democratic crisis in terms of urgency can have real-life consequences for international perceptions of the country’s future. Theoretically, the study’s implications touch primarily upon Corpus-Assisted Discourse Studies (CADS, henceforth), particularly the evolving corpus linguistics concepts of keyness and collocation networks.
- Published
- 2023
- Full Text
- View/download PDF
6. Incorporating structural topic modeling into short text analysis.
- Author
Po-Ya Angela Wang and Shu-Kai Hsieh
- Subjects
STRUCTURAL models ,SONG lyrics ,CORPORA ,PRONOUNS (Grammar)
- Abstract
The past few decades have seen the rapid development of topic modeling. So far, research has been more concerned with determining the ideal number of topics or meaningful topic clustering words than with applying topic modeling techniques to evaluate linguistic theories. This study proposes the Structural Topic Model (STM)-led framework to facilitate the interpretation of topic modeling results and standardize text analysis. STM encompasses various model training mechanisms, thereby requiring systematic designs to properly combine language studies. “Structural” in STM refers to the inclusion of metadata structure. Unlike the corpus-based keyness approach, STM can capture contextual cues and meta-information for the interpretation of topical results. Besides, STM can make cross-corpora comparisons via topical contrast, a challenging task for corpus-driven related models such as the Biterm Topic Model (BTM). Stylistic variations in song lyrics are taken as an illustration to show how to use the suggested framework to delve into the linguistic theory proposed by Pennebaker (2013). The topical model and iterable model in the proposed paradigm can clarify how pronouns affect style distinction. We believe the proposed STM-led framework can shed light on text analysis by conducting a reproducible cross-corpora comparison on short texts. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
7. Key feature analysis: a simple, yet powerful method for comparing text varieties.
- Author
Egbert, Jesse and Biber, Douglas
- Subjects
SKEWNESS (Probability theory) ,CAMPAIGN debates ,KEYWORDS ,UNITED States presidential elections
- Abstract
To date, corpus-based methods for comparing language varieties have fallen into one of two camps: (1) MD analysis – a complicated multi-variate approach based on analysis of functionally motivated linguistic features in each text of a corpus, or (2) keyword/key POS analysis – simple, univariate techniques to identify any feature with a statistically skewed distribution in a corpus. In this paper, we introduce a complementary technique – key feature analysis – which is a simple quantitative approach to compare the texts in two varieties with respect to a set of functionally motivated lexico-grammatical features. We introduce the methods of key feature analysis, contrast them with other approaches for comparing text varieties, and present case studies from the domains of online registers and US presidential debates. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
8. How can we communicate (visually) what we (usually) mean by collocation and keyness?: A visual response to Gries (2022a).
- Author
Jeaco, Stephen
- Subjects
COLLOCATION (Linguistics) ,CORPORA ,TEACHER-student relationships ,ODDS ratio ,LINGUISTICS
- Abstract
Corpus linguistic methods can now be easily employed in a wide range of studies within sub-disciplines of linguistics and well beyond. In a two-part paper, Gries (2022a, 2022b) challenges some of the most widely used 'association measures' of what many might feel to be powerful aspects of text patterning: collocation and key words. While the additional association measure offers some new possibilities, this paper highlights the strong influence of another frequency parameter on odds ratio and Gries's suggested association measure, and questions the applicability of his cautions for many different kinds of corpus research. Nevertheless, having been inspired to look at different aspects of association and dispersion more carefully, the author presents some new visualizations which were designed to communicate some of the important lessons to be learned from Gries's papers, especially for learners and teachers using corpus tools in Second Language classrooms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
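Result 8 discusses the odds ratio as an association measure for collocation and key words. As a hedged illustration of what that statistic computes in a keyness setting, the sketch below derives it from a 2x2 table of item versus non-item tokens in two corpora. The function name, the default 0.5 continuity correction (a common convention for avoiding division by zero), and the example counts are assumptions made for this sketch, not details from the paper.

```python
def odds_ratio(freq_target, size_target, freq_ref, size_ref, correction=0.5):
    """Odds ratio of an item occurring in the target versus the reference corpus.

    `correction` is added to every cell of the 2x2 table so that zero counts
    do not cause a division by zero; pass 0 to get the uncorrected ratio.
    """
    a = freq_target + correction                # item tokens in target
    b = size_target - freq_target + correction  # other tokens in target
    c = freq_ref + correction                   # item tokens in reference
    d = size_ref - freq_ref + correction        # other tokens in reference
    return (a / b) / (c / d)


# Made-up counts: 20 hits per 100 tokens versus 10 hits per 100 tokens.
uncorrected = odds_ratio(20, 100, 10, 100, correction=0)
corrected = odds_ratio(20, 100, 10, 100)
```

A value above 1 means the item is relatively more common in the target corpus. Because the ratio depends only on the relative odds, a rare item and a frequent item can receive the same score, which is one reason effect-size measures are usually read alongside raw frequencies.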
9. Frequency or Keyness?
- Author
Đurović, Zorica
- Subjects
MARINE engineering ,VOCABULARY ,FREQUENCY (Linguistics) ,BILINGUALISM ,ENCYCLOPEDIAS & dictionaries
- Abstract
Copyright of Lexikos is the property of Bureau of the Woordeboek van die Afrikaanse Taal and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
10. Constructing Persuasion in Tourism Promotion Websites: A Corpus-Assisted Study of Hyphenated Adjectives in English
- Abstract
The discourse of promotional tourism employs a rich array of adjectives. However, to date, there exists a dearth of comprehensive studies exploring the usage and characteristics of hyphenated adjectives within promotional tourism discourse. This paper focuses on the lexical examination of such adjectives and elucidates their persuasive role as interpersonal markers that shape the author’s voice/stance within the metadiscourse framework. Two main objectives are pursued: (1) to determine the keyness of hyphenated adjectives within the study corpus (PROMTOUR) in comparison to their occurrence in the enTenTen20 reference corpus, and (2) to identify and classify morphological patterns and clusters associated with hyphenated adjectives. A specialized corpus comprising over 760,000 words from 33 original English promotional tourism websites is analysed using Sketch Engine. The findings indicate that hyphenated adjectives account for approximately 30% of the adjectival lemmas, displaying a remarkably high occurrence in PROMTOUR. Consequently, these adjectives emerge as a pivotal lexical characteristic employed by the authors to fulfil readers’ expectations within this particular genre. Furthermore, qualitative analyses reveal the recurrent occurrence of specific morphological patterns, notably those involving past and present participles. The implications of the study for the teaching of tourism English and translation are also discussed.
- Published
- 2024
11. Predictive keywords: Using machine learning to explain document characteristics
- Author
Aki-Juhani Kyröläinen and Veronika Laippala
- Subjects
keyness ,corpus linguistics ,support vector machines ,machine learning ,Electronic computers. Computer science ,QA75.5-76.95
- Abstract
When exploring the characteristics of a discourse domain associated with texts, keyword analysis is widely used in corpus linguistics. However, one of the challenges facing this method is the evaluation of the quality of the keywords. Here, we propose casting keyword analysis as a prediction problem with the goal of discriminating the texts associated with the target corpus from the reference corpus. We demonstrate that, when using linear support vector machines, this approach can be used not only to quantify the discrimination between the two corpora, but also extract keywords. To evaluate the keywords, we develop a systematic and rigorous approach anchored to the concepts of usefulness and relevance used in machine learning. The extracted keywords are compared with the recently proposed text dispersion keyness measure. We demonstrate that our approach extracts keywords that are highly useful and linguistically relevant, capturing the characteristics of their discourse domain.
- Published
- 2023
- Full Text
- View/download PDF
12. ‘Return to the International Family of Democracies’: Keyness Factor in the International Speeches of the Baltic Presidents
- Author
Līga Romāne-Kalniņa
- Subjects
presidential speeches ,Baltic States ,corpus linguistics ,critical discourse analysis ,identity ,keyness ,Literature (General) ,PN1-6790 ,Philology. Linguistics ,P1-1091
- Abstract
Presidential speeches as a type of political discourse are aimed not only at the negotiation and construction of the national identity of a nation-state at a local level but also at the representation and shaping of the national identity internationally. The presidents of the Baltic States have represented their individual, collective and regional identities in the international gatherings of world leaders since the restoration of independence of Estonia, Latvia, and Lithuania from the Soviet Union. The current study displays an analysis of how the keyness factor of particular lexical items used in 142 speeches given by the presidents of the Baltic States internationally from 1991 until 2021 helps to identify the tendencies of identity construction and representation, which can then be investigated in detail via a critical analysis of the discursive strategies and linguistic means applied in the speeches. Moreover, the analysis of keyword tendencies across speeches marked by different criteria shows how the process of identity construction as marked by lexical change varies across time and states. The keyness factor points to multiple identities being constructed in the international speeches, where the national identities are constructed most frequently, followed by the common European identity, Baltic regional identity, and global identity. It is also concluded that a common political past is one of the main elements of national and Baltic identities, while shared values such as democracy and cooperation are the main elements of supra-national identities.
- Published
- 2022
- Full Text
- View/download PDF
13. 'RETURN TO THE INTERNATIONAL FAMILY OF DEMOCRACIES': KEYNESS FACTOR IN THE INTERNATIONAL SPEECHES OF THE BALTIC PRESIDENTS.
- Author
ROMĀNE-KALNIŅA, LĪGA
- Subjects
IDENTITY (Psychology) ,INTELLIGIBILITY of speech ,GROUP identity ,NATIONAL character ,POLITICAL oratory ,LINGUISTIC identity ,CRITICAL discourse analysis
- Abstract
Presidential speeches as a type of political discourse are aimed not only at the negotiation and construction of the national identity of a nation-state at a local level but also at the representation and shaping of the national identity internationally. The presidents of the Baltic States have represented their individual, collective and regional identities in the international gatherings of world leaders since the restoration of independence of Estonia, Latvia, and Lithuania from the Soviet Union. The current study displays an analysis of how the keyness factor of particular lexical items used in 142 speeches given by the presidents of the Baltic States internationally from 1991 until 2021 helps to identify the tendencies of identity construction and representation, which can then be investigated in detail via a critical analysis of the discursive strategies and linguistic means applied in the speeches. Moreover, the analysis of keyword tendencies across speeches marked by different criteria shows how the process of identity construction as marked by lexical change varies across time and states. The keyness factor points to multiple identities being constructed in the international speeches, where the national identities are constructed most frequently, followed by the common European identity, Baltic regional identity, and global identity. It is also concluded that a common political past is one of the main elements of national and Baltic identities, while shared values such as democracy and cooperation are the main elements of supra-national identities. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
14. Identifying Learning Activity Sequences that Are Associated with High Intention-Fulfillment in MOOCs
- Author
Rabin, Eyal, Silber-Varod, Vered, Kalman, Yoram M., Kalz, Marco, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Scheffel, Maren, editor, Broisin, Julien, editor, Pammer-Schindler, Viktoria, editor, Ioannou, Andri, editor, and Schneider, Jan, editor
- Published
- 2019
- Full Text
- View/download PDF
15. The Europe of Brexit: a corpus-assisted discourse study of identities in the press.
- Author
Pena-Díaz, Carmen and Sánchez Ramos, Maria del Mar
- Subjects
BREXIT Referendum, 2016 ,BRITISH withdrawal from the European Union, 2016-2020 ,IDENTITY (Psychology) ,DISCOURSE ,LINGUISTIC identity ,PRESS
- Abstract
Copyright of CIRCULO de Linguistica Aplicada a la Comunicacion is the property of Universidad Complutense de Madrid and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2021
- Full Text
- View/download PDF
16. La selección temática del vocabulario para fines didácticos: evaluación de un acercamiento cuantitativo [Thematic vocabulary selection for teaching purposes: evaluation of a quantitative approach]
- Author
Jasper Degraeuwe and Patrick Goethals
- Subjects
corpus linguistics ,vocabulary learning ,automatic vocabulary selection ,thematic vocabulary selection ,absolute frequency ,keyness ,dispersion ,Spanish as a foreign language (ELE) ,Philology. Linguistics ,P1-1091
- Abstract
The present study aims to evaluate the results of a quantitative approach to the thematic selection of vocabulary for teaching purposes. We describe in detail how three quantitative measures (absolute frequency, keyness, and dispersion) are configured and combined in order to automate the selection of the specific vocabulary of a specialized corpus. We then evaluate whether the automatic selection is confirmed by the judgment of teachers of Spanish as a foreign language (ELE). We were indeed able to verify that in more than 85% of cases the result of the quantitative method is confirmed by at least half of the teachers. This observation is also borne out statistically, with an inter-rater reliability test showing substantial agreement (Cohen’s kappa = 0.61) between the teachers’ judgment and the automatic selection.
- Published
- 2020
- Full Text
- View/download PDF
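The abstract of result 16 reports agreement between teacher judgments and an automatic selection as a Cohen's kappa value. For readers unfamiliar with the statistic, the sketch below computes kappa for two raters over the same items using only the standard library. The function name and the toy label sequences are hypothetical; the data do not come from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items on which the raters agree.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's label proportions, summed.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy judgments (1 = keep the word, 0 = drop it) for four candidate words.
automatic = [1, 1, 0, 0]
teacher = [1, 0, 0, 0]
kappa = cohens_kappa(automatic, teacher)
```

Kappa corrects raw agreement for the agreement expected by chance, so two raters who agree on three of four items score lower than the raw 75% would suggest.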
17. A key to understanding why a text is difficult to process. Lexical uniqueness of academic English texts
- Author
Natalia Borza
- Subjects
academic english ,english for specific purposes (esp) ,keyness ,lexical uniqueness ,register analysis ,Philology. Linguistics ,P1-1091
- Abstract
While the register of English language tertiary textbooks has been investigated substantially, comparatively little has been explored about the register analytical features of secondary textbooks. The purpose of the present pedagogically-driven study is to analyse the register of biology textbooks for secondary students from the point of view of English as a second language (ESL) teaching by describing the lexical uniqueness of the register of the biology corpus (BIOCOR) 10th-grade students need to process during their studies at a bilingual secondary school. The BIOCOR (consisting of 7,021 words) was compared to a reference corpus (REFCOR) of general English texts at a CEFR B2 level (comprising 7,098 words) by exploring its high-value positive and negative keyness lexical items. The results of the investigation disclose that the lack of specialised uniqueness is prevalent in the BIOCOR with regard to academic English and specific biology terminology. The lexical plainness of the biology textbook can be regarded as one of the linguistic features revealing the non-academic but popularizing nature of the secondary textbook register.
- Published
- 2020
18. The Europe of Brexit: a corpus-assisted discourse study of identities in the press
- Author
Carmen Pena Díaz and María del Mar Sánchez Ramos
- Subjects
corpus-assisted discourse analysis ,media discourse ,identities ,Brexit ,keyness ,Philology. Linguistics ,P1-1091
- Abstract
Drawing on what is known as a corpus-assisted discourse study (CADS) approach (Baker et al., 2008), this article will research the construction of different identities by means of the language used in two newspaper articles on Brexit from the Spanish El País and the British The Guardian, to examine how these identities are constructed through media discourse at the time following the Brexit referendum (2016-2018). Media discourse surrounding Brexit is examined under the consideration of media power. A comparable corpus made up of original newspaper articles about Brexit was used to carry out the analysis, identifying statistically significant keywords compared with a reference corpus with the aim of providing an example of how the British and Spanish press construct identity.
- Published
- 2021
- Full Text
- View/download PDF
19. Problematising characteristicness: A biomedical association case study.
- Author
Prentice, Sheryl, Knight, Jo, Rayson, Paul, Haj, Mahmoud El, and Rutherford, Nathan
- Subjects
*BIOMEDICAL organizations , *CORPORA
- Abstract
Keyness is a commonly used method in corpus linguistics and is assumed to identify key items that are characteristic of one corpus when compared to another. This paper puts this assumption to the test by comparing case study corpora in the fields of genetic, immunological and psychiatric biomedical association studies, using what we refer to as a 'K-FLUX' analysis to produce a set of key items. Experts from within these fields are asked to evaluate the extent to which identified key items are characteristic of their discipline. The paper concludes that less than 50% of the items identified by the method are rated as highly characteristic by experts and that this ranges between types of association study. Further, there is difficulty in reaching a consensus over what is deemed to be 'characteristic', thus posing a challenge to the ultimate aim of the keyness method. The paper demonstrates the value of supporting corpus linguistic studies with expert assessments to evaluate whether (and which) items can be said to be indicative of a particular field. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
20. Evolution of Concept "Black" in the US Media Discourse.
- Author
Melnichuk, Tatiana and Saburova, Natalia
- Subjects
BLACK people in mass media ,RACIAL inequality
- Abstract
Media discourse is an effective tool for projecting and shaping the public perception of a certain idea or image. The article focuses on the linguistic and semantic representation of the concept "Black" in the American media discourse with particular attention to how the concept representation has evolved from the 1990s to 2010s. The study employed corpus methodology (keyness, frequency, concordances) to analyze news articles from "The New York Times" and "The Los Angeles Times", which were arranged into three corpora according to the publication date (1990s, 2000s, 2010s). The corpus analysis established a number of changes in the concept "Black" representation manifested primarily through the high relevance keywords and high frequency collocations. Dominant semantic components were identified in the concept representation in each corpus, as well as notable shifts in core and peripheral aspects within these semantic components. The analysis showed that although the semantic components 'racial / ethnic inequality' and 'economic issues' remain at the core of the concept in each corpus, they are expressed through connections with other semantic components which may vary throughout three decades, such as 'culture' in the 1990s, 'education' and 'politics' in the 2000s and 'police brutality and profiling' and 'appearance' in the 2010s. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
21. From Keyness to Distinctiveness – Triangulation and Evaluation in Computational Literary Studies.
- Author
Schröter, Julian, Du, Keli, Dudar, Julia, Rok, Cora, and Schöch, Christof
- Subjects
LITERARY criticism ,SUPERVISED learning ,LITERARY form ,COMPUTATIONAL linguistics ,TRIANGULATION ,MIXED methods research
- Abstract
There is a set of statistical measures developed mostly in corpus and computational linguistics and information retrieval, known as keyness measures, which are generally expected to detect textual features that account for differences between two texts or groups of texts. These measures are based on the frequency, distribution, or dispersion of words (or other features). Searching for relevant differences or similarities between two text groups is also an activity that is characteristic of traditional literary studies, whenever two authors, two periods in the work of one author, two historical periods or two literary genres are to be compared. Therefore, applying quantitative procedures in order to search for differences seems to be promising in the field of computational literary studies as it allows to analyze large corpora and to base historical hypotheses on differences between authors, genres and periods on larger empirical evidence. However, applying quantitative procedures in order to answer questions relevant to literary studies in many cases raises methodological problems, which have been discussed on a more general level in the context of integrating or triangulating quantitative and qualitative methods in mixed methods research of the social sciences. This paper aims to solve these methodological issues concretely for the concept of distinctiveness and thus to lay the methodological foundation permitting to operationalize quantitative procedures in order to use them not only as rough exploratory tools, but in a hermeneutically meaningful way for research in literary studies. Based on a structural definition of potential candidate measures for analyzing distinctiveness in the first section, we offer a systematic description of the issue of integrating quantitative procedures into a hermeneutically meaningful understanding of distinctiveness by distinguishing its epistemological from the methodological perspective. 
The second section develops a systematic strategy to solve the methodological side of this issue based on a critical reconstruction of the widespread non-integrative strategy in research on keyness measures that can be traced back to Rudolf Carnap's model of explication. We demonstrate that it is, in the first instance, mandatory to gain a comprehensive qualitative understanding of the actual task. We show that Carnap's model of explication suffers from a shortcoming that consists in ignoring the need for a systematic comparison of what he calls the explicatum and the explicandum. Only if there is a method of systematic comparison, the next task, namely that of evaluation can be addressed, which verifies whether the output of a quantitative procedure corresponds to the qualitative expectation that must be clarified in advance. We claim that evaluation is necessary for integrating quantitative procedures to a qualitative understanding of distinctiveness. Our reconstruction shows that both steps are usually skipped in empirical research on keyness measures that are the most important point of reference for the development of a measure of distinctiveness. Evaluation, which in turn requires thorough explication and conceptual clarification, needs to be employed to verify this relation. In the third section we offer a qualitative clarification of the concept of distinctiveness by spanning a three-dimensional conceptual space. This flexible framework takes into account that there is no single and proper concept of distinctiveness but rather a field of possible meanings depending on research interest, theoretical framework, and access to the perceptibility or salience of textual features. Therefore, we shall, instead of stipulating any narrow and strict definition, take into account that each of these aspects – interest, theoretical framework, and access to perceptibility – represents one dimension of the heuristic space of possible uses of the concept of distinctiveness. 
The fourth section discusses two possible strategies of operationalization and evaluation that we consider to be complementary to the previously provided clarification, and that complete the task of establishing a candidate measure successfully as a measure of distinctiveness in a qualitatively ambitious sense. We demonstrate that two different general strategies are worth considering, depending on the respective notion of distinctiveness and the interest as elaborated in the third section. If the interest is merely taxonomic, classification tasks based on multi-class supervised machine learning are sufficient. If the interest is aesthetic, more complex and intricate evaluation strategies are required, which have to rely on a thorough conceptual clarification of the concept of distinctiveness, in particular on the idea of salience or perceptibility. The challenge here is to correlate perceivable complex features of texts such as plot, theme (aboutness), style, form, or roles and constellation of fictional characters with the unperceived frequency and distribution of word features that are calculated by candidate measures of distinctiveness. Existing research did not clarify, so far, how to correlate such complex features with individual word features. The paper concludes with a general reflection on the possibility of mixed methods research for computational literary studies in terms of explanatory power and exploratory use. As our strategy of combining explication and evaluation shows, integration should be understood as a strategy of combining two different perspectives on the object area: in our evaluation scenarios, that of empirical reader response and that of a specific quantitative procedure. This does not imply that measures of distinctiveness, which proved to reach explanatory power in one qualitative aspect, should be supposed to be successful in all fields of research. 
As long as evaluation is omitted, candidate measures of distinctiveness lack explanatory power and are limited to exploratory use. In contrast with the skepticism that literary scholars have sometimes expressed about the relevance of computational literary studies to genuine issues of the humanities, we believe that integrating computational methods into hermeneutic literary studies can be achieved in a way that reaches higher explanatory power than the usual exploratory use of keyness measures, but it can only be achieved individually for concrete tasks and not once and for all based on a general theoretical demonstration. [ABSTRACT FROM AUTHOR]
- Published
- 2021
22. USING CORPORA TO AID QUALITATIVE TEXT ANALYSIS
- Author
-
Jędrzej Olejniczak
- Subjects
Corpora ,text analysis ,concordance ,wordlist ,keyness ,dispersion plot ,corpus building ,Education (General) ,L7-991 ,Social sciences (General) ,H1-99 - Abstract
Aim. The aim of this paper is to present and exemplify a number of basic uses of corpus-based text analysis tools that can supplement and provide additional insight for an otherwise qualitative analysis of a text. I attempt to show that nowadays certain corpus tools are easily accessible to any researcher and can be used to enrich the results of studies concerned with texts. Methods. This paper comprises the basics of corpus building, the main types of data that can be drawn from a simple corpus and a detailed description of four methods that can aid text analysis: wordlists, concordances, dispersion plots and keywords. Each of those four methods is thoroughly described, with a number of examples of its applications and an indication of its possible limitations. Results. The examples provided suggest that even performing a very simple corpus analysis of a text might unveil certain trends and phenomena not noticeable through the classic qualitative text analysis methods (e.g. close reading). The paper argues that corpus research can hence work as an extension of a quantitative analysis (or be its starting point) by examining themes and keywords present in a given text and enrich the results of a qualitative study with a fresh perspective. Finally, the paper claims that basic corpus analysis can, in fact, be successfully employed by researchers who do not have any prior experience with statistics or corpora.
- Published
- 2018
23. Calculating and displaying key labels: the texts, sections, authors and neighbourhoods where words and collocations are likely to be prominent.
- Author
-
Jeaco, Stephen
- Subjects
NEIGHBORHOODS ,FOREIGN language education ,LABELS ,ENGLISH language ,MACHINE tools - Abstract
Corpora are usually not only made up of words, sentences and plain texts; they usually also have metadata, background information and structural features which can be used to filter searches or provide additional information about the context of specific concordance lines. This paper presents a new approach which uses the information about the texts in which words and collocations occur, generating clouds and tables of what are called Key Labels. The procedure can be likened to looking at key words (Scott, 1997; and Scott and Tribble, 2006) from the opposite starting point: beginning with a word of interest and exploring the features of texts and the parts of text in which it occurs. The paper explains the background to the procedure, how it is carried out, and how these Key Labels are integrated into The Prime Machine corpus tool for English language learning. [ABSTRACT FROM AUTHOR]
- Published
- 2020
24. Key words when text forms the unit of study: Sizing up the effects of different measures.
- Author
-
Jeaco, Stephen
- Subjects
- *
KEYWORDS , *CURRICULUM , *CORPORA , *DEVIANT behavior , *QUANTITATIVE research - Abstract
Throughout the social sciences, there has been growing pressure to present effect sizes when publishing empirical data (see American Psychological Association, 2001; Parsons & Nelson, 2004). While it seems indisputable that for the majority of quantitative research foci, effect size is an essential element of statistical analysis, this paper argues that specifically for key word analysis in corpus linguistics, the means of reporting effect size must depend on the level of the unit of study of each investigation (single text, collection or large corpus). After exploring some main criticisms of the log-likelihood measure, this paper unpacks the parameters of different measures for keyness and how they might address underlying concerns. It maintains that for the exploration of foregrounded/deviant/salient/marked features in text, the use of log-likelihood scores to rank the results is still fit for purpose and coupled with Bayes Factors is a solid approach for key word analyses. [ABSTRACT FROM AUTHOR]
- Published
- 2020
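The pairing of log-likelihood ranking with Bayes Factors described in the abstract above can be illustrated with a minimal sketch. It assumes the two-cell form of the log-likelihood statistic that is common in keyword tools, and it assumes a BIC-style approximation (the statistic minus the natural log of the combined corpus size) as a stand-in for a Bayes Factor; neither formula is given in the abstract, and the function names are hypothetical.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-cell log-likelihood (G2) for one word in a target vs. a reference corpus."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_target, expected_target), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2

def bic_approximation(g2, size_target, size_ref):
    """Assumed BIC-style approximation: G2 minus ln(N), one degree of freedom."""
    return g2 - math.log(size_target + size_ref)

if __name__ == "__main__":
    g2 = log_likelihood(20, 1000, 10, 1000)
    print(round(g2, 3), round(bic_approximation(g2, 1000, 1000), 3))
```

Under this approximation, a word can have a positive log-likelihood score and still yield a negative BIC value in small corpora, which is one reason the two figures may be worth reporting together.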
25. A key to understanding why a text is difficult to process: Lexical uniqueness of academic English texts.
- Author
-
Borza, Natalia
- Subjects
ENGLISH as a foreign language ,ENGLISH language - Abstract
While the register of English language tertiary textbooks has been investigated substantially, comparatively little has been explored about the register analytical features of secondary textbooks. The purpose of the present pedagogically-driven study is to analyse the register of biology textbooks for secondary students from the point of view of English as a second language (ESL) teaching by describing the lexical uniqueness of the register of the biology corpus (BIOCOR) 10th-grade students need to process during their studies at a bilingual secondary school. The BIOCOR (consisting of 7,021 words) was compared to a reference corpus (REFCOR) of general English texts at a CEFR B2 level (comprising 7,098 words) by exploring its high-value positive and negative keyness lexical items. The results of the investigation disclose that the lack of specialised uniqueness is prevalent in the BIOCOR with regard to academic English and specific biology terminology. The lexical plainness of the biology textbook can be regarded as one of the linguistic features revealing the non-academic but popularizing nature of the secondary textbook register. [ABSTRACT FROM AUTHOR]
- Published
- 2020
26. LA SELECCIÓN TEMÁTICA DEL VOCABULARIO PARA FINES DIDÁCTICOS: EVALUACIÓN DE UN ACERCAMIENTO CUANTITATIVO [Thematic vocabulary selection for teaching purposes: evaluation of a quantitative approach].
- Author
-
Degraeuwe, Jasper and Goethals, Patrick
- Published
- 2020
27. Students’ use of academic vocabulary in comparison to that of published writers: a corpus-driven analysis
- Author
-
Trish Cooper
- Subjects
academic vocabulary ,first- and additional-language speakers ,student writing ,corpus analysis ,qualitative study ,keyness ,Language and Literature ,Philology. Linguistics ,P1-1091 ,African languages and literature ,PL8000-8844 - Abstract
An aspect of vocabulary research that tends to be somewhat neglected is that based on qualitative investigation. While a number of studies have considered the differences in vocabulary size between first-language (L1) and additional language (AL) speakers of English, there has been relatively little in-depth investigation into the nature of the vocabulary differences between these groups. The aim of this paper is to shed light on some of the vocabulary features of both L1 and AL student writing in relation to published writing as a benchmark. This study is based on the results of a qualitative investigation conducted using a corpus-driven approach which focused on differences in the use of academic vocabulary by both L1 and AL groups across first-, second- and third-year psychology students. The method used to identify vocabulary differences was keyness analysis, in which vocabulary items are compared on the basis of significantly different frequencies. One of the patterns that emerged serves to support the assumption that L1 students have a better grasp of academic vocabulary than AL students, as there are a greater number of grammatical, semantic and collocational idiosyncrasies in AL writing. The analysis also confirms that high achievers tend to use a broader range of academic words than low achievers. Given the evidence that a good knowledge of academic vocabulary in particular is essential for success at the level of tertiary education, the results of this study contribute to the question of what the specific vocabulary needs of undergraduate students are within the university context.
- Published
- 2017
28. Adjectives and their keyness: a corpus-based analysis of tourism discourse in English.
- Author
-
Durán-Muñoz, Isabel
- Subjects
DISCOURSE analysis ,ADVENTURE tourism ,TOURISM websites ,CORPORA ,SOCIOLINGUISTICS - Abstract
This paper attempts to shed some light on the importance of adjectives in the linguistic characterisation of tourism discourse in English in general and in adventure tourism in particular as well as to prove how significant the difference in usage is compared to the general language. It seeks to understand the role that adjectives play in this specific subdomain and to contribute to the linguistic characterisation of tourism discourse in this respect. It also aims to confirm or reject previous assumptions regarding the use, and frequency of use, of adjectives and adjectival patterns in this specialised domain and, in general, to promote the study of adjectivisation in domain-specific discourses. To do so, it proposes a corpus-based study that measures the keyness of adjectives in promotional texts of the adventure tourism domain in English by comparing their usage in the compiled corpus to the two most relevant reference corpora of English (COCA and the BNC). [ABSTRACT FROM AUTHOR]
- Published
- 2019
29. Evaluation of keyness metrics: performance and reliability
- Author
-
Sönning, Lukas
- Subjects
Linguistics and Language ,COCA ,vocabulary lists ,corpus linguistics ,corpus ,methodology ,keywords ,Language and Linguistics ,keyness ,English ,frequency ,dispersion measures ,key word analysis ,dispersion ,keyword analysis ,lexical dispersion ,word importance ,word frequency lists ,Corpus of Contemporary American English - Abstract
The methodological debates surrounding keyword analysis have given rise to a wide range of keyness metrics. The present paper delineates four dimensions of keyness, which distinguish between frequency- and dispersion-related perspectives. Existing measures are then organized according to these dimensions and evaluated with regard to their performance on a specific keyword analysis task: The identification of key verbs in academic writing. To this end, the rankings produced by 32 different metrics are evaluated against an established academic word list. Further, the reliability of measures is assessed, to determine whether they produce stable rankings across repeated studies on the same pair of text varieties. We observe notable differences among metrics with regard to these criteria. Our findings provide further support for the superiority of the Wilcoxon rank sum test and text-dispersion–based measures, and allow us to identify, within each dimension of keyness, metrics that may be given preference in applied work.
- Published
- 2023
30. Using Keyword Features to Automatically Classify Genre of Song Ci Poem
- Author
-
Mu, Yong, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, Lu, Qin, editor, and Gao, Helena Hong, editor
- Published
- 2015
31. Let’s Talk About Sex: The Academic Discourse on Sex in Counseling Journals
- Author
-
Dykeman, Cass and Ferrese, Shauna
- Subjects
academic discourse ,collocation ,counseling ,corpus linguistics ,sex ,Social and Behavioral Sciences ,sexuality ,keyness - Abstract
This study addresses gaps in the research by examining the academic discourse on sex and human sexuality in counseling and draws comparisons between how researchers discuss these crucial developmental topics in the counseling field versus the discourse in other mental health professions.
- Published
- 2023
32. Capturing Distinctiveness: Transparent Procedures to Escape a Pervasive Black-Box Propensity
- Author
-
Trevisani, Matilde and Tuzzi, Arjuna
- Subjects
statistical learning ,machine learning ,text classification ,distinctiveness ,keyness - Published
- 2023
33. Evaluation of Measures of Distinctiveness. Classification of Literary Texts on the Basis of Distinctive Words
- Author
-
Du, Keli, Dudar, Julia, and Schöch, Christof
- Subjects
literary texts ,evaluation ,Distinctiveness ,keyness - Abstract
This paper concerns an empirical evaluation of nine different measures of distinctiveness or ‘keyness’ in the context of Computational Literary Studies. We use nine different sets of literary texts (specifically, novels) written in seven different languages as a basis for this evaluation. The evaluation is performed as a downstream classification task, where segments of the novels need to be classified by subgenre or period of first publication. The classifier receives different numbers of features identified using different measures of distinctiveness. The main contribution of our paper is that we can show that across a wide variety of parameters, but especially when only a small number of features is used, (more recent) dispersion-based measures very often outperform other (more established) frequency-based measures by significant margins. Our findings support an emerging trend to consider dispersion as an important property of words in addition to frequency. , Journal of Computational Literary Studies
- Published
- 2023
34. A Corpus Linguistic Analysis of the Child and Therapist Relationship in Play Therapy
- Author
-
Dykeman, Cass and Marquez, Oriana
- Subjects
play therapy ,collocation ,non directive play therapy ,counselor education ,children ,Keyness ,LSM ,Social and Behavioral Sciences ,directive play therapy - Abstract
This project is composed of two separate manuscripts that aim to achieve two research objectives. The first objective is to fill a gap in the counseling literature regarding play therapy approaches using corpus linguistics to analyze child-therapist interactions. The second objective is to deepen counselors' understanding of play therapy approaches by investigating language patterns in play therapy. Despite the extensive research on play therapy, corpus linguistics methodology has not been used to examine the two leading schools of play therapy approaches, limiting understanding of their mechanisms. Language style matching (LSM) research in play therapy has implications for the counseling field. A strong therapeutic relationship is crucial for effective play therapy, and LSM may aid in building this relationship (Landreth & Bratton, 2006). Analyzing the language patterns between the child and the therapist during play sessions can provide insights into the dynamics of the therapeutic relationship, support the understanding and improvement of that relationship, and lead to more successful outcomes.
- Published
- 2023
35. Computational Genre Analysis (in: The Dragonfly's Gaze)
- Author
-
Schöch, Christof
- Subjects
Topic Modeling ,computational literary studies ,Keyness ,Arthur Conan Doyle ,literary genre - Abstract
Genre is, like authorship or time period, one of a number of fundamental categories allowing authors and readers as well as literary scholars to endow the vast field of literary production with some internal structure. Genre is not specific to literature, of course: whether we consider painting, music or cinema, genre as an intermediary category situated between individual works and an entire artform is always a relevant category. Unlike theme or style, literary genre is not in itself a level of analysis; rather, different levels of analysis such as theme or style can be used to describe and distinguish genres. The aim of this chapter is to introduce the analysis of genre with quantitative, computational methods. First, the chapter will discuss what literary genres are and which aspects of genre may be studied quantitatively. Second, some issues relevant to building text collections for genre analysis are presented. Then, several example analyses are discussed, all based on the idea that computational genre analysis can be conducted from a contrastive perspective – that is, by comparing texts belonging to a genre of interest to texts from other, related genres.
- Published
- 2022
36. Rogers, Perls, and Ellis in Three Approaches to Psychotherapy: A Corpus-based Study
- Author
-
Miranda, Roberta and Dykeman, Cass
- Subjects
Counseling ,The Gloria Tapes ,Corpus linguistics ,Counseling Psychology ,Carl Rogers ,Linguistics ,Fritz Perls ,Social and Behavioral Sciences ,Computational Linguistics ,FOS: Psychology ,Albert Ellis ,Keyness ,FOS: Languages and literature ,Psychology - Abstract
Published research focused on the analysis of specific words used by counselors during counseling sessions is limited. Applying keyness analysis within a corpus linguistic framework can help researchers examine transcribed verbal interactions between counselors and clients in counseling sessions. Additionally, keyness studies can uncover specific word choices that align with specific theoretical orientations, and study potential implications made by both new counselors and counseling predecessors. This study employed a corpus linguistic design analyzing words within the three unique counseling transcriptions in the popular training film, Three Approaches to Psychotherapy. Outcomes determined that keywords identified with the language spoken by Carl Rogers, Albert Ellis, and Fritz Perls did align with their associated theories of client-centered, Rational Emotive Behavioral Therapy, and Gestalt theories, respectively. As such, counseling educators can use this film to illustrate specific word choices reflective of three unique theoretical models.
- Published
- 2022
37. Collects of the Missale Romanum: A Corpus Based Research Project
- Author
-
Dykeman, Cass
- Subjects
collocation ,Latin ,Collects ,Missale Romanum ,Keyness ,Missal ,Vatican II ,n-grams - Abstract
The aim of this project is to produce a series of empirical studies examining the linguistic, psychological, and theological differences between the pre and post Vatican II collects. This project will conduct these examinations using methods from corpus linguistics.
- Published
- 2022
38. The creative use of absences.
- Author
-
Montoro, Rocío
- Subjects
- *
CORPORA , *PRESENCE (Philosophy) , *TRIANGULATION (Psychology) - Abstract
In an article published in this journal, Partington (2014) addresses the criticism often made against corpus linguistics that it is apparently unable to cope with absences. He convincingly argues that corpus linguistics is better suited to account for absences than has been claimed. I resume the debate by discussing a type of absence not fully addressed in Partington (2014) which I have termed 'creative absences'. With a focus on corpus stylistics, I consider the way in which the author Henry Green dispenses with a compulsory element in the grammatical structure of Standard English, i.e. the determiner (mainly, the definite article). By means of a manual analysis as well as two corpus stylistic analyses (keyness and text-type analysis) of the novel Living (Green 1929), I explore the effects of such an unorthodox use and argue, alongside Partington (2014), for the usefulness of corpus approaches to account for at least certain types of absences. [ABSTRACT FROM AUTHOR]
- Published
- 2018
39. USING CORPORA TO AID QUALITATIVE TEXT ANALYSIS. AN INTERDISCIPLINARY APPROACH.
- Author
-
OLEJNICZAK, JĘDRZEJ
- Subjects
- *
CORPORA , *CONCORDANCES , *KEYWORDS , *DATA analysis , *STATISTICS - Abstract
Aim. The aim of this paper is to present and exemplify a number of basic uses of corpus-based text analysis tools that can supplement and provide additional insight for an otherwise qualitative analysis of a text. I attempt to show that nowadays certain corpus tools are easily accessible to any researcher and can be used to enrich the results of studies concerned with texts. Methods. This paper comprises the basics of corpus building, the main types of data that can be drawn from a simple corpus and a detailed description of four methods that can aid text analysis: wordlists, concordances, dispersion plots, and keywords. Each of those four methods is thoroughly described, with a number of examples of its applications and an indication of its possible limitations. Results. The examples provided suggest that even performing a very simple corpus analysis of a text might unveil certain trends and phenomena not noticeable through the classic qualitative text analysis methods (e.g. close reading). The paper argues that corpus research can hence work as an extension of a quantitative analysis (or be its starting point) by examining themes and keywords present in a given text and enrich the results of a qualitative study with a fresh perspective. Finally, the paper claims that basic corpus analysis can, in fact, be successfully employed by researchers who do not have any prior experience with statistics or corpora. [ABSTRACT FROM AUTHOR]
- Published
- 2018
40. Smart Cities: A Review and Analysis of Stakeholders’ Literature.
- Author
-
Marrone, Mauricio and Hammerle, Mara
- Abstract
Recent literature on smart cities stresses the role of digitization in tackling urban issues such as environmental degradation and poverty. The wicked nature of these issues gives rise to the need to understand the diverse perspectives of relevant stakeholder groups on smart cities. However, existing research that compares these perspectives tends to exclude the beliefs of those living in smart cities. Integrating these beliefs in smart city discourses is paramount to increase the likelihood that these systems will be accepted. With the view that the literature consumed by an audience will influence that audience’s perspectives, the main aim of this study is to compare and contrast the pertinent topics found in various types of literature on smart cities. Using an innovative approach of literature comparison, based on a semantic entity annotator and keyword analysis, this article extracts and compares topics in news media (for citizens), trade publications (for businesses), academic articles (for research organizations) and government reports (for governments). The findings suggest that citizens tend to be under-represented in discussions on smart cities and highlight those topics considered relevant only by smart city citizens. Increased understanding in this area can help guide discussions and policies that are relevant for all stakeholders. [ABSTRACT FROM AUTHOR]
- Published
- 2018
41. Log-likelihood and odds ratio: Keyness statistics for different purposes of keyword analysis.
- Author
-
Pojanapunya, Punjaporn and Watson Todd, Richard
- Subjects
CORPORA ,DISCOURSE analysis ,ENGLISH language ,ACADEMIC discourse ,COMPUTATIONAL linguistics - Abstract
Keyword analysis is used in a range of sub-disciplines of applied linguistics from genre analyses to critically-oriented studies for different purposes ranging from producing a general characterization of a genre to identifying text-specific ideological issues. This study compares the use of log-likelihood (LL), a probability statistic, and odds ratio (OR), an effect size statistic, for keyword identification and argues that the two methods produce different keywords applicable to research focusing on different purposes. Through two case studies, keyword analyses of advance fee scams against the British National Corpus and research articles in applied linguistics against research articles from other academic disciplines, we show that both the LL and OR keywords concern the aboutness of the corpus, but differ in their specificity and pervasiveness through the corpus. LL highlights words which are relatively common in general use serving genre purposes, whereas OR highlights more specialized words serving critically-oriented purposes. Methodological and practical contributions to keyword analysis are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2018
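The effect-size side of the comparison in the abstract above can be sketched as follows. This is a minimal illustration of an odds ratio computed from a 2x2 table of word counts; the 0.5 continuity correction for empty cells is an assumption added here, and the counts in the usage lines are invented for illustration only.

```python
def odds_ratio(freq_target, size_target, freq_ref, size_ref, correction=0.5):
    """Odds ratio of a word occurring in the target corpus vs. the reference corpus."""
    a = freq_target                 # occurrences of the word in the target corpus
    b = size_target - freq_target   # all other tokens in the target corpus
    c = freq_ref                    # occurrences of the word in the reference corpus
    d = size_ref - freq_ref         # all other tokens in the reference corpus
    if 0 in (a, b, c, d):
        # Assumed continuity correction so that empty cells do not divide by zero.
        a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)

if __name__ == "__main__":
    # Invented counts: a common word with a modest difference,
    # and a rare word that is almost absent from the reference corpus.
    print(odds_ratio(500, 100_000, 300, 100_000))
    print(odds_ratio(30, 100_000, 1, 100_000))
```

With counts like these, the rarer word receives the much larger odds ratio, which is in line with the abstract's observation that OR tends to favour more specialized vocabulary while LL rewards words that are frequent overall.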
42. A Case Study on Some Frequent Concepts in Works of Poetry.
- Author
-
Pace-Sigge, Michael Thomas
- Subjects
CORPORA ,SYNTAX (Grammar) ,LINGUISTICS research ,POETRY collections ,TERMS & phrases - Abstract
This paper looks at a corpus of British and US poetry, uncovering phraseological units which, through their frequency, are indicators of key concepts. Multi-word-units (MWUs) have been discussed extensively with reference to corpus-based research, for example by Sinclair (1996) [2004], Biber and Conrad (1999), or, referred to as formulaicity by Wray (2002); O'Keefe et al. (2007), Greaves and Warren (2010) and Pace-Sigge (2015) describe MWUs preferred in different spoken and written genres. So far, however, there has been very little research in how far MWUs appear in the genre of poetry. A commonly held view is that poetry by definition should not be yielding patterns - it subverts every pattern (linguistically speaking) that it can. Through focus on the main themes surfacing in multiword units, this research looks at usages found in poetic texts in-depth and compares sets of words found with their occurrence patterns in prose literature. Key issues will be highlighted through a number of theme-based case studies, looking at themes of world and sky. Results show that there are common clusters found in poetry and prose corpora: it is depth of usage that marks their divergence. [ABSTRACT FROM AUTHOR]
- Published
- 2018
43. A quantitative assessment of codeswitching in Setswana.
- Author
-
OTLOGETSWE, T. J.
- Subjects
BILINGUALISM ,ENGLISH language ,CORPORA ,EDUCATION ,LINGUISTS - Abstract
Measuring the number of foreign words used daily in conversation or in general language discourse is a challenge to linguists. This is because all instances of a language's use cannot be collected and measured reliably for the occurrence of specific language elements. This is in part because the overall size of a language population, that is, all words and expressions used in a language cannot be known (Kilgarriff & Grefenstette, 2003). They are dynamic in nature. There is, however, value in the enquiry of determining how many foreign words exist in any given discourse. This paper proposes an approach of measuring the amount of foreign terms in a language. It uses a 13-million-word corpus to measure the amount of English in a Setswana corpus. The data is analysed using Oxford Wordsmith Tools software. In the study, we use keyword analysis. The different components of the corpus are measured to extract terms which are typical or descriptive of a specific text. The key terms are then listed collectively on the basis of their keyness measures. The English words are then counted in the extracted list at 100 word intervals and then tabulated. An average is subsequently computed to characterise the amount of English words found in the keyword list. One of the key findings of the study is that about 25% of Batswana's language use is English. This is attributed to Batswana's bilingualism and the Botswana education system's systematic promotion of English dating back to the late 70s. The article recommends that for the preservation and promotion of Setswana, the language must be used beyond language classes in social functional domains such as medicine, law and farming. [ABSTRACT FROM AUTHOR]
- Published
- 2017
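The counting step described in the abstract above (English words counted in the ranked keyword list at 100-word intervals, then averaged) might look roughly like the sketch below. It is one reading of that procedure, not the author's implementation: the `is_english` predicate is a hypothetical stand-in for whatever identification method the study used, and the toy keyword list is invented.

```python
def english_counts_by_interval(keywords, is_english, interval=100):
    """Count English words in each consecutive block of `interval` ranked keywords."""
    counts = []
    for start in range(0, len(keywords), interval):
        block = keywords[start:start + interval]
        counts.append(sum(1 for word in block if is_english(word)))
    return counts

def mean_english_share(keywords, is_english, interval=100):
    """Average, over all blocks, of the proportion of English words in the block."""
    shares = []
    for start in range(0, len(keywords), interval):
        block = keywords[start:start + interval]
        shares.append(sum(1 for word in block if is_english(word)) / len(block))
    return sum(shares) / len(shares) if shares else 0.0

if __name__ == "__main__":
    # Invented toy list: "e" marks an English keyword, "s" a Setswana one.
    ranked = ["e"] * 25 + ["s"] * 75 + ["e"] * 50 + ["s"] * 50
    english = lambda word: word == "e"
    print(english_counts_by_interval(ranked, english))
    print(mean_english_share(ranked, english))
```

Tabulating per-block counts before averaging keeps visible whether English items cluster near the top of the keyness ranking or are spread evenly through it.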
44. Representations of immigrants and refugees in US K-12 school-to-home correspondence: an exploratory corpus-assisted discourse study.
- Author
-
Berger, Cynthia, Friginal, Eric, and Roberts, Jennifer
- Subjects
CORPORA ,PUBLIC schools ,SCHOOL children ,IMMIGRANTS ,REFUGEES ,MONOLINGUALISM ,POSTSECONDARY education ,PUBLIC education - Abstract
This study details a comparative, corpus-based discourse analysis of corpora containing educational documents distributed to parents and guardians of K-12 children in public schools in the United States (US). The exploratory local corpus (n=152,934) contains parent-directed educational documents collected from four public schools in a city located in the south-eastern US with an unusually high percentage of foreign-born residents. The comparison corpus (n=147,796) contains parent-directed documents collected from a sampling of K-12 schools across the US. Following Baker et al. (2008), keyness and collocations were utilised as central theoretical notions and tools of analysis, in addition to a lexical sophistication comparison, in order to investigate text simplification across corpora. Results show that while the first corpus used labels for students that were superficially inclusive, English language learners themselves were discursively represented as outsiders facing barriers to inclusion that native-English speaking monolingual students do not face. Furthermore, the first corpus revealed an emphasis on identifying and categorising language learners so as to provide them with immediate services, while the non-geographically specific corpus focussed more on the long-term development of learners and on preparation for post-secondary education. We discuss the implications for language policy in public education and for policies related to K-12 school-to-home correspondence. [ABSTRACT FROM AUTHOR]
- Published
- 2017
45. Keyness in maritime institutional law texts.
- Author
-
Wenyu Lu, Sung-Min Lee, and Se-Eun Jhang
- Subjects
CORPORA ,PUBLIC law ,CIVIL law ,LEGAL language ,LINGUISTIC analysis - Abstract
This study describes some characteristics of maritime institutional legal texts in terms of corpus methodology. We self-built two study corpora: a public maritime institutional corpus and a private maritime institutional corpus. The differences between the two corpora can be distinguished by identifying typical linguistic features from the keyness aspects of key words, key clusters, and key semantic domains. Specific words and phrases in complementary distribution are offered to distinguish public maritime legal characteristics from private maritime legal characteristics by comparing the self-built specialized corpora with the more general British National Corpus (BNC informative genre). Linguistic features are discussed from the view of keyness, thus enabling non-legal practitioners as well as non-specialist readers to discover and describe underlying parameters that best depict the differences between legal registers or genres. [ABSTRACT FROM AUTHOR]
- Published
- 2017
46. The Europe of Brexit: a corpus-assisted discourse study of identities in the press
- Abstract
Drawing on what is known as a corpus-assisted discourse study (CADS) approach (Baker et al., 2008), this article will research the construction of different identities by means of the language used in two newspaper articles on Brexit from the Spanish El País and the British The Guardian, to examine how these identities are constructed through media discourse at the time following the Brexit referendum (2016-2018). Media discourse surrounding Brexit is examined under the consideration of media power. A comparable corpus made up of original newspaper articles about Brexit was used to carry out the analysis, identifying statistically significant keywords compared with a reference corpus with the aim of providing an example of how the British and Spanish press construct identity.
- Published
- 2021
47. Predictive keywords: Using machine learning to explain document characteristics.
- Author
-
Kyröläinen AJ and Laippala V
- Abstract
When exploring the characteristics of a discourse domain associated with texts, keyword analysis is widely used in corpus linguistics. However, one of the challenges facing this method is the evaluation of the quality of the keywords. Here, we propose casting keyword analysis as a prediction problem with the goal of discriminating the texts associated with the target corpus from the reference corpus. We demonstrate that, when using linear support vector machines, this approach can be used not only to quantify the discrimination between the two corpora, but also to extract keywords. To evaluate the keywords, we develop a systematic and rigorous approach anchored to the concepts of usefulness and relevance used in machine learning. The extracted keywords are compared with the recently proposed text dispersion keyness measure. We demonstrate that our approach extracts keywords that are highly useful and linguistically relevant, capturing the characteristics of their discourse domain. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2023 Kyröläinen and Laippala.)
- Published
- 2023
48. Contrastive analysis of adolescent learner interlanguage in asynchronous online communication: A keyness approach.
- Author
-
Lin, Yen-Liang
- Subjects
- *
CONTRASTIVE linguistics , *ADOLESCENT psychology , *INTERLANGUAGE (Language learning) , *TELEMATICS , *SEMANTICS - Abstract
Online communication provides learners of English with opportunities to interact with native speakers across geographical boundaries. While there is a burgeoning field of research which looks at computer-mediated communication (CMC), few studies have employed a keyness approach to the analysis of interlanguage of adolescent learners. This study reports on a corpus analysis of samples of asynchronous online discourse between a group of British and Taiwanese adolescents, with the aim of exploring the significant differences in the use of grammatical categories between the two groups of participants. Keyness analysis (Rayson, 2008) at the part-of-speech level highlights the linguistic features which deserve particular attention. Specifically, it reveals the grammatical categories that occur unusually frequently or unusually infrequently in the English learners' discourse when compared with the language used by the native speakers of English in the same sample. The research findings demonstrate the pedagogical merit of keyness analysis and thus help in the design of courses for adolescent online interaction. [ABSTRACT FROM AUTHOR]
- Published
- 2015
49. The Europe of Brexit: a corpus-assisted discourse study of identities in the press
- Author
-
María del Mar Sánchez Ramos, Carmen Pena Díaz, and Universidad de Alcalá. Departamento de Filología Moderna
- Subjects
media discourse ,Linguistics and Language ,Media studies ,análisis del discurso asistido por corpus ,Language and Linguistics ,Newspaper ,identities ,keyness ,Power (social and political) ,discurso mediático ,Brexit ,Identity (philosophy) ,Political science ,Referendum ,Guardian ,corpus-assisted discourse analysis ,Philology ,Construct (philosophy) ,identidad ,Filología - Abstract
Drawing on what is known as a corpus-assisted discourse study (CADS) approach (Baker et al., 2008), this article will research the construction of different identities by means of the language used in two newspaper articles on Brexit from the Spanish El País and the British The Guardian, to examine how these identities are constructed through media discourse at the time following the Brexit referendum (2016-2018). Media discourse surrounding Brexit is examined under the consideration of media power. A comparable corpus made up of original newspaper articles about Brexit was used to carry out the analysis, identifying statistically significant keywords compared with a reference corpus with the aim of providing an example of how the British and Spanish press construct identity. Keywords: corpus-assisted discourse analysis, media discourse, identities, Brexit, keyness., Agencia Estatal de Investigación (AEI)
- Published
- 2021
50. 'La princesa guerrera enarcó una ceja, mientras el macarra bamboleaba sus atributos bajo el pantalón'. La co-construcción narrativa del homoerotismo en internet
- Author
-
Giovanni Garofalo
- Subjects
Queer studies ,heteronormativity ,social construction of desire ,keyness ,Linguistics and Language ,Estudios queer ,heteronormatividad ,construcción social del deseo ,Settore L-LIN/07 - Lingua e Traduzione - Lingua Spagnola ,Language and Linguistics - Abstract
Combining some core concepts from sociology and Freudian psychoanalysis with the methodology of corpus linguistics, this study follows the line of quantitatively oriented queer studies (Baker, 2004, 2005, 2014, 2018; King, 2015; Milani, 2013) on the production and social regulation of sexuality and gender, in order to explore the role of language in constructing the erotic imaginary of a Spanish-speaking gay community on the internet. With the help of Sketch Engine, the keywords of a corpus of amateur gay and lesbian erotic stories are analyzed to shed light on the symbolic representations and discursive mechanisms that guide the co-construction of homosexual erotic desire, male and female, on the internet. Adding significant nuances for the Hispanic cultural sphere, the results of the analysis confirm the thesis maintained by Baker (2005), according to which, in narrating strongly idealized sexual encounters, the homosexual community continues to reproduce and perpetuate ancestral heteronormative archetypes.
- Published
- 2021