Corpus ID: 232427899

No Keyword is an Island: In search of covert associations

Václav Cvrček and Masako Ueda Fidler
This paper describes how corpus-assisted discourse analysis based on keyword (KW) identification and interpretation can benefit from applying market basket analysis (MBA) after KW extraction. MBA is a data mining technique, originally used in marketing, that reveals consistent associations between items in a shopping cart; it can equally reveal associations between keywords in a corpus of many texts. By identifying recurring associations between KWs, we can compensate for the lack of wider context, which is a major…
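To make the idea of associations between keywords concrete, here is a minimal sketch of pairwise market basket analysis in Python. It is not the authors' implementation: the toy keyword sets, the function name `pair_associations`, and the `min_support` threshold are all assumptions made for illustration. Each text is treated as a "basket" of its keywords, and the standard MBA measures (support, confidence, lift) are computed for every keyword pair.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set holds the keywords extracted from one text.
texts_keywords = [
    {"migration", "border", "crisis"},
    {"migration", "border", "security"},
    {"economy", "tax", "crisis"},
    {"migration", "border"},
    {"economy", "tax"},
]


def pair_associations(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence, lift) tuples
    for keyword pairs that co-occur in at least min_support of the texts."""
    n = len(baskets)
    # Number of texts in which each keyword occurs.
    item_counts = Counter(kw for basket in baskets for kw in basket)
    # Number of texts in which each unordered keyword pair co-occurs.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each pair yields two directed rules: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            lift = confidence / (item_counts[consequent] / n)
            rules.append((antecedent, consequent, support, confidence, lift))
    # Strongest associations (highest lift) first.
    return sorted(rules, key=lambda r: r[4], reverse=True)


for antecedent, consequent, support, confidence, lift in pair_associations(texts_keywords):
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 indicates that two keywords appear together in the same texts more often than their individual frequencies would predict, which is the kind of recurring association the abstract refers to. Real analyses would typically consider larger itemsets and a dedicated frequent-itemset algorithm rather than this pairwise count.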


Incorporating text dispersion into keyword analyses
Keyword analysis has become an indispensable tool for discourse analysts, being applied to identify the words that are especially characteristic of the texts in a target discourse domain. But,…
Going Beyond “Aboutness”: A Quantitative Analysis of Sputnik Czech Republic
This paper is an attempt to unpack the “alternativeness” of Sputnik Czech Republic, an online news-opinion portal that targets the Czech-speaking audience. The overarching principle used in the…
A Data-Driven Analysis of Reader Viewpoints: Reconstructing the Historical Reader Using Keyword Analysis
This study uses corpus-linguistic methods to examine the relationship between language usage patterns and divergence in text interpretation. Our target of analysis is a set of texts (Czechoslovak…
‘Keywords Method’ versus ‘Calcul des Spécificités’: A comparison of tools and methods
Major similarities and differences between the tools mainly concern the most typical keywords, whereas the differences concern the total number of significant keywords extracted, the granularity of both probability value and typicality coefficient and the type of the reference corpus.
Keyness: Appropriate metrics and practical issues
The extent of overlap in the keyword rankings resulting from the adoption of different metrics is looked at, and the implications of ranking-based analysis adopting one metric or another are discussed.
Problems in investigating keyness, or clearing the undergrowth and marking out trails…
The chapter focuses on a number of specific issues: the amount of text to take as a unit when computing keyness, statistical problems in making claims of different kinds about the keyness of words and phrases, the choice of an appropriate reference corpus, and the types of repetition which characterize key words.
Keywords and frequent phrases of Jane Austen's "Pride and Prejudice": a corpus-stylistic analysis
Corpus linguistic analyses reveal meanings and structural features of data that cannot be detected intuitively. This has been amply demonstrated with regard to non-fiction data, but fiction texts…
Clusters, key clusters and local textual functions in Dickens
It is argued that corpus linguistics can make useful contributions to the descriptive inventory of literary stylistics by suggesting that clusters, i.e. repeated sequences of words, can be interpreted as pointers to local textual functions.
Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications
This comprehensive professional reference brings together the information, tools and methods a professional will need to use text mining applications and statistical analysis efficiently, and presents a how-to reference that shows the user how to conduct text mining and statistically analyze the results.
Text and Corpus Analysis: Computer-Assisted Studies of Language and Culture
List of Figures, Concordances and Tables. Acknowledgements. Data Conventions and Terminology. Notes on Corpus Data and Software. Part I: Concepts and History. 1. Texts and Text Types. 2. British…