Alexander von Humboldt’s famous ‘Kosmos-Lecture’ at the Berlin Sing-Akademie (1827/28): From Digital to Print Edition

A blog post by Christian Thomas (CLARIN-D, BBAW)

Book: Alexander von Humboldt, Henriette Kohlrausch: Die Kosmos-Vorlesung an der Berliner Sing-Akademie. Edited by Christian Kassung and Christian Thomas. Berlin: Insel Verlag, 2019.
(insel taschenbuch 4719, ISBN 978-3-458-36419-1) Publisher’s landing page:

Fig.1: Cover insel taschenbuch 4719, © Insel Verlag Berlin.


Background: Alexander von Humboldt’s legendary ‘Kosmos-Lectures’

Alexander von Humboldt’s legendary ‘Kosmos-Lectures’ in 1827/28 at the Berlin Sing-Akademie – then the city’s largest lecture hall, today the seat of the Maxim Gorki Theater – are regarded as a decisive moment in the history of scientific popularization. In the winter of 1827/28, approximately one thousand Berliners and guests from abroad attended the 16 consecutive lectures. Humboldt gave a vast overview of the state of scientific knowledge of his time, spanning astronomical, geographical, geological and biological topics, but also the cultural and social spheres, in an encompassing ‘portrait of nature’ (“Naturgemälde”). The audience represented a broad spectrum of learned society and interested laypeople including – following Humboldt’s explicit invitation – women, who remained excluded from Prussia’s universities until the end of the 19th century. Since Humboldt himself never published the lectures, the elaborate notebooks kept by several of his auditors, preserved in archival and private holdings in Germany, Poland, Turkey and Norway, are all the more valuable as authentic documents of this important moment in the history of science.

The recently published volume Die Kosmos-Vorlesung an der Berliner Sing-Akademie, edited by Christian Kassung and Christian Thomas, presents, for the first time in a printed edition, the reliable and complete text of this lecture series. The edited primary text was corrected on the basis of the unique manuscript held at the Berlin State Library. A detailed foreword by the editors explains the background and the current state of research on the lectures and their significance from today’s perspective. Selected facsimiles from the manuscript itself and from Humboldt’s legacy collection give an impression of the historical sources.


Fig. 2: Sample pages from insel taschenbuch 4719, © Insel Verlag Berlin.


By comparing it with other handwritten documents, the editors were able to attribute this fair copy, formerly assumed to be by an anonymous (male) writer, to Henriette Kohlrausch, a self-educated, confident woman who was an active participant in Berlin’s cultural and social life at the time, including salons and gatherings that Alexander von Humboldt, his brother Wilhelm and Wilhelm’s wife Caroline also attended.

Project context: The Humboldt University’s Hidden Kosmos project in cooperation with the Berlin CLARIN-D centre

The image digitisation, full-text acquisition and collation of Kohlrausch’s manuscript and several other transcripts by individual attendees of the ‘Kosmos Lectures’ were funded by the Excellence Initiative at Berlin’s Humboldt University as part of the project Hidden Kosmos: Reconstructing A. v. Humboldt’s »Kosmos-Lectures«. In cooperation between the Humboldt University and the CLARIN-D centre at the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW), the full-text transcriptions were prepared and published as scholarly digital editions in the German Text Archive (Deutsches Textarchiv, DTA). By the end of 2016, not only Henriette Kohlrausch’s manuscript but all currently known auditors' transcripts of Alexander von Humboldt’s world-famous ‘Kosmos Lectures’ had been made freely available under a Creative Commons license (CC-BY 4.0) in the German Text Archive.[1] All data has been captured and published in adherence to the FAIR principles, ensuring its findability, accessibility, interoperability and reusability.

Fig. 3: [Kohlrausch, Henriette]: Physikalische Geographie. Vorgetragen von Alexander von Humboldt. [Berlin], [1828]. [= Nachschrift der ‚Kosmos-Vorträge‛ Alexander von Humboldts in der Sing-Akademie zu Berlin, 6.12.1827–27.3.1828.] In: Deutsches Textarchiv, <>, last access February 14, 2020. 


Preparation, Publication, Analysis and Preservation: Using the CLARIN infrastructure following the whole Research Data Lifecycle

This digital edition, and consequently the print volume featured here, made heavy use of and benefitted greatly from components of the CLARIN infrastructure, using several of its tools and services throughout the research data lifecycle: from gathering and structuring data in a standardised and interoperable manner to its publication, computer-based analysis, and long-term archiving. The transcription guidelines and the annotation schema, which uses Extensible Markup Language (XML) to structure and annotate the transcribed text, were based upon the German Text Archive ‘base format’ (DTA-Basisformat, DTABf), which complies with the de facto standard of the international Text Encoding Initiative (TEI). The XML documents were created using the DTABf’s RelaxNG schema and its set of Schematron rules for validation, ensuring compliance with the standard on the formal level.

Fig. 4: DTA-Basisformat > Spezial > Auszeichnung von Manuskripten > Editorisches Inventar > Verantwortlichkeiten.[2]

The publication of the full-text XML transcriptions alongside facsimiles of the manuscripts enabled the use of DTA’s web-based quality assurance platform DTAQ. Here, intensive proofreading and deeper annotation of the documents were carried out collaboratively. In DTAQ, all names of persons mentioned in the course of the lectures were tagged with tei:persName elements and referenced using unique identifiers from the Integrated Authority File (GND), resulting in a general index encompassing nearly 9000 historical persons. Candidates for this semantic tagging were identified using the tool chain for Named Entity Recognition (NER) provided by the DWDS and DTA projects at the BBAW, and were subsequently checked and amended manually.

Fig. 5: Personenverzeichnis “Hidden Kosmos”: register of all person names mentioned in the course of Humboldt’s lectures in all of the attendees' transcripts.
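
To illustrate how such a person register can be derived from the annotated XML, here is a minimal sketch in Python. It is not the project's actual tooling. It assumes that each tei:persName element carries a ref attribute pointing to a GND URI; the sample fragment, the placeholder identifiers and the function name build_person_index are made up for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical TEI fragment; the GND identifiers are placeholders, not real records.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>
      <persName ref="http://d-nb.info/gnd/EXAMPLE-1">Laplace</persName>,
      <persName ref="http://d-nb.info/gnd/EXAMPLE-2">Herschel</persName>,
      <persName ref="http://d-nb.info/gnd/EXAMPLE-1">La Place</persName>
    </p>
  </body></text>
</TEI>"""


def build_person_index(tei_xml):
    """Group the name forms found in the text by their authority-file reference."""
    root = ET.fromstring(tei_xml)
    index = defaultdict(set)
    for element in root.iterfind(".//tei:persName", TEI_NS):
        ref = element.get("ref")
        if not ref:
            continue  # untagged candidates would still need manual checking
        # Collapse line breaks and repeated spaces inside the name
        name = " ".join("".join(element.itertext()).split())
        index[ref].add(name)
    return {ref: sorted(names) for ref, names in index.items()}


index = build_person_index(SAMPLE)
for ref, names in sorted(index.items()):
    print(ref, names)
```

Keying the index on the identifier rather than on the spelling is what allows variant forms of one name, such as the two spellings in the sample, to end up in a single register entry.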


As a further benefit of publication via the DTA/CLARIN-D infrastructure, the historical spelling variants found in the documents are automatically normalised and mapped onto their modern forms using the Cascaded Analysis Broker DTA::CAB. The advanced query options provided by the linguistic search engine DDC, including part-of-speech tags and lemmatisation, can also be used immediately. Furthermore, collocation analyses using CLARIN-D’s DiaCollo (collocation analysis in diachronic perspective) can be carried out seamlessly for every single document in the corpus, for the corpus as a whole, or in comparison to the DTA and DWDS corpora. External tools like Voyant and a whole set of applications from CLARIN’s Language Resource Switchboard are linked from the DTA resources as well. The metadata is stored in the teiHeader of the respective documents and is converted into the simpler but still widely used Dublin Core format, as well as into the more specific CMDI format developed in the CLARIN context. The CMDI records are automatically handed over to the Virtual Language Observatory (VLO) via an application programming interface (API). All data is stored for long-term preservation in the certified CLARIN repository at the BBAW.


Fig. 6: DiaCollo collocation analysis, view: Cloud, query term: “Natur”, corpus: Works by A. v. Humboldt and transcripts of his ‘Kosmos-Lectures’ in the German Text Archive.
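
The metadata conversion from the teiHeader to Dublin Core mentioned above can be pictured roughly as follows. This is a hedged sketch in Python, not the DTA's actual conversion routine. The sample header, the choice of mapped fields and the function name tei_header_to_dc are assumptions made for illustration, and a real teiHeader in the DTA ‘base format’ is considerably richer.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical, heavily shortened teiHeader used only for this example.
SAMPLE_HEADER = """<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt>
      <title>Physikalische Geographie</title>
      <author>Kohlrausch, Henriette</author>
    </titleStmt>
    <publicationStmt>
      <publisher>Deutsches Textarchiv</publisher>
      <availability>
        <licence target="https://creativecommons.org/licenses/by/4.0/"/>
      </availability>
    </publicationStmt>
    <sourceDesc><p>placeholder</p></sourceDesc>
  </fileDesc>
</teiHeader>"""

# Assumed mapping from TEI paths to Dublin Core element names.
TEXT_MAPPING = [
    ("title", ".//tei:titleStmt/tei:title"),
    ("creator", ".//tei:titleStmt/tei:author"),
    ("publisher", ".//tei:publicationStmt/tei:publisher"),
]


def tei_header_to_dc(header_xml):
    """Return an XML element holding a flat Dublin Core view of the header."""
    header = ET.fromstring(header_xml)
    record = ET.Element("record")
    for dc_name, path in TEXT_MAPPING:
        for element in header.iterfind(path, TEI_NS):
            text = " ".join("".join(element.itertext()).split())
            if text:
                ET.SubElement(record, f"{{{DC}}}{dc_name}").text = text
    # The licence is carried by an attribute rather than by element text
    for licence in header.iterfind(".//tei:licence", TEI_NS):
        target = licence.get("target")
        if target:
            ET.SubElement(record, f"{{{DC}}}rights").text = target
    return record


ET.register_namespace("dc", DC)  # serialise with the conventional dc: prefix
record = tei_header_to_dc(SAMPLE_HEADER)
print(ET.tostring(record, encoding="unicode"))
```

The flat result shows why Dublin Core is the simpler format: the nesting of the teiHeader is reduced to a short list of generic fields, whereas a CMDI record can retain more of the structure.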

The Best of Both Worlds: Digital and analogue text editions

The print volume announced here, published by the renowned Insel Verlag Berlin, was produced directly from the XML source, encoded following the DTA ‘base format’. The volume’s person register was derived directly from the XML sources as well. In the eBook version, each page of the historical document is directly linked to the corresponding electronic edition of the manuscript in the German Text Archive, inviting readers to explore the semantically richer digital version in parallel.

Between the digital and the analogue, each publication format has its own strengths and merits: In many respects, the print publication is better suited for a consecutive ‘close reading’ of the historical source, which is presented in the context of other archival documents closely related to the central event, the famous ‘Kosmos-Lectures’. The foreword by the editors summarises the current state of research on the topic and reflects on the authorial status of this outstanding, singular document, by which Henriette Kohlrausch ingeniously managed to capture an ephemeral moment in time, i.e. an original publication by Alexander von Humboldt presented only orally. The digital edition, on the other hand, allows for computer-assisted ‘distant’, or rather ‘scalable’, reading based on highly structured, semantically annotated and interlinked text resources, providing an explorative overview encompassing a total of some 3,500 manuscript pages covering Humboldt’s lectures.

In the many ways described above, the digital infrastructure set up by the CLARIN ERIC and CLARIN-D initiatives served as a godfather to this newly born, analogue offspring, accompanying it on its way into the world from the very beginning. The digital edition of the manuscript is freely accessible and reusable via the German Text Archive in various formats, including the native TEI-XML source. The book based on this electronic edition, Alexander von Humboldt und Henriette Kohlrausch: Die Kosmos-Vorlesung an der Berliner Sing-Akademie, was published in August 2019 by Insel Verlag. It is available as hardcover in print and also as eBook in the EPUB format.

[1.] Cf.; Further reading:

  • Christian Thomas: “You Can’t Put Your Arms Around a Memory: The Multiple Versions of Alexander von Humboldt’s ‘Kosmos-Lectures’ (1827/28).” In: Versioning Cultural Objects. Edited by Roman Bleier and Sean M. Winslow. Schriften des Instituts für Dokumentologie und Editorik, 13. Norderstedt: Books on Demand, 2019, pp. 77–99. OpenAccess via Kölner UniversitätsPublikationsServer (KUPS),;
  • Christian Thomas, Benjamin Fiechter and Marius Hug: “Methoden und Ziele der Erschließung handschriftlicher Quellen zu Alexander von Humboldts Kosmos-Vorträgen: Das Projekt Hidden Kosmos der Humboldt-Universität zu Berlin.” In: Horizonte der Humboldtforschung. Edited by Ottmar Ette and Julian Drews. Series Pointe (Potsdamer inter- und transkulturelle Texte), edited by Ottmar Ette and Gesine Müller. Hildesheim: Olms-Weidmann, 2016, pp. 287–318.

[2.] For an introduction to the DTA ‘base format’ for manuscripts see Susanne Haaf, Christian Thomas: “Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format.” In: Journal of the Text Encoding Initiative (jTEI), Issue 10 | December 2016 – July 2019, online since 08 August 2017, accessed 15 February 2020. URL: DOI: 10.4000/jtei.1650.



Written by: Melanie Grumt Suárez
