Representativeness in Corpora of Literary Texts: Introducing the C18P Project

  • Iris Gemeinböck, Department of English and American Studies, University of Vienna


Very few specialised corpora of literary texts are currently tailored to the needs of literary critics interested in corpus stylistic analyses of prose fiction. Many existing corpora that include literary texts were compiled for linguistic research and are often unsuitable for corpus stylistic purposes. This paper addresses three of the main problems: the lack of labelling of the texts for literary genre, the use of extracts rather than complete texts, and the prevalence of linguistic periodisation schemes. C18P is a corpus of prose fiction designed specifically to address these issues. It traces the early development of the novel from 1700 up to the Victorian era and can be used, for instance, to analyse the characteristic linguistic features of individual literary genres and forms. The paper introduces the design of the corpus as well as some of its potential uses.


Author Biography

Iris Gemeinböck is a PhD candidate at the Department of English and American Studies, University of Vienna.


GEMEINBÖCK, Iris. Representativeness in Corpora of Literary Texts: Introducing the C18P Project. MATLIT: Materialities of Literature, [S.l.], v. 4, n. 2, p. 29-48, July 2016.
Keywords: corpus analysis; corpus stylistics; corpus building; eighteenth century; prose fiction; representativeness