Integrating the Old Bailey Corpus into the CLARIN-D infrastructure (discipline-specific working group 2)

Project content

The Old Bailey Corpus 1720-1913 (OBC) is a corpus of 18th- and 19th-century spoken English consisting of selected Proceedings of the Old Bailey, London's central criminal court. The OBC currently contains c. 750,000 words of direct speech per decade for the period 1720-1913, amounting to about 13.9 million words of spoken English. Every speaker turn is annotated for sociobiographical variables (gender, social class, age), pragmatic variables (role in the court proceedings) and textual variables (the shorthand scribe, printer and publisher of the individual Proceedings). The aim of this project was to integrate the OBC into the German section of CLARIN in order to ensure the sustainability of this resource through persistent storage and access. The OBC will be hosted at the CLARIN-D Service Centre of Saarland University.

Project duration

  • 01.01.2015 – 31.12.2015


Responsible institution

Project management

  • Magnus Nissel

  • Karin Puga

Project website / references