Corpus of Modern Scottish Writing (CMSW)

  • Access this resource: www.scottishcorpus.ac.uk/cmsw/
  • Principal Investigator: Professor John Corbett
  • Co-Investigator: Professor Jeremy Smith
  • Researcher: Dr Wendy Anderson
  • Computing Manager: Mr David Beavan
  • Project Student: Dorian Grieve

Following the successful completion of the Scottish Corpus of Texts and Speech, the project team are now concentrating on a new linguistic resource. The Corpus of Modern Scottish Writing (1700-1945) project (CMSW) will provide an evidence-based platform for a new account of the development of Modern Scots and Scottish English. It will create a major research resource: a publicly available, digitised archive of texts in language varieties ranging from Broad Scots to Scottish Standard English. The corpus will provide the ‘missing link’ between the Helsinki Corpus of Older Scots and its related projects (1375-1700) and the Scottish Corpus of Texts and Speech (1945-present day; http://www.scottishcorpus.ac.uk/).

The content of CMSW will mainly be written texts, though early 20th-century audio recordings also exist. Texts will be selected on the basis of genre and region. Some texts will be chosen as ‘written records of speech’, e.g. minutes of meetings and transcripts of court proceedings. As far as is possible for historical texts, sociolinguistic variables will be recorded in the searchable metadata.

One novel aim is to include, alongside the corpus of texts, commentaries on Modern Scots, from Alexander Hume’s 17th-century ‘Of the orthographie and congruitie of the Britan tongue’ to William Grant’s ‘Introduction’ to the Scottish National Dictionary in the early 20th century. The availability of a series of observations on the Scots language alongside samples of its usage will be a powerful aid to orthographic research. The commentaries will be supplemented by relevant excerpts from Education Acts and reports, tracking the attitudes to Broad Scots in education during the period.

The project also aims to advance the implementation of automatic searching for spelling and dialectal variants. The creation of the corpus and the particular requirements of variationist and historical material entail advances in the computing strategies developed for the Scottish Corpus of Texts and Speech (SCOTS).
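
To give a rough sense of what automatic searching for spelling variants can involve, the sketch below uses approximate string matching from Python's standard library (difflib). It is an illustration only, not the method used by CMSW or SCOTS: the function names, the similarity threshold and the sample sentence are all assumptions invented for this example.

```python
import re
from difflib import SequenceMatcher

# Invented sample sentence for illustration; it is not taken from the corpus.
SAMPLE = "It was a guid nicht, a gude night for a good hoose."


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def find_variants(query, text, threshold=0.5):
    """Return the distinct tokens of `text`, in order of first appearance,
    whose similarity to `query` is at least `threshold`.

    The threshold is an arbitrary assumption and would need tuning
    against real data.
    """
    query = query.lower()
    hits = []
    for token in tokenize(text):
        if token in hits:
            continue  # report each spelling only once
        similarity = SequenceMatcher(None, query, token).ratio()
        if similarity >= threshold:
            hits.append(token)
    return hits


if __name__ == "__main__":
    # Search for spellings that resemble the standard form "good".
    print(find_variants("good", SAMPLE))
```

Pure string similarity of this kind is a blunt instrument: it can miss variants that differ greatly in spelling and can match unrelated words that happen to look alike. That limitation is one reason why variationist and historical material may call for more specialised strategies.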