Dataset Open Access
Quang-Vinh Dang;
Hussein Mohammed
This is the complete Computational Visual Catalogue (CVC) that we generated for Rilke's notebooks (56 notebooks in total). The images show notebook pages of Rainer Maria Rilke, from the Deutsche Literaturarchiv Marbach (DLA), A:Rilke-Archiv Gernsbach. The JSON files were generated computationally using several AI models and contain information automatically extracted from the images about visual properties of the text, such as word location, colour, orientation, and the writing implement used.
The structure of the JSON files is as follows:
Root (object)
├─ info (object)
│  ├─ description : string
│  ├─ contributor : string
│  ├─ version : string
│  ├─ year : integer
│  └─ date_created : string # "YYYY-MM-DD"
│
├─ images (array of object)
│  └─ [image] (object)
│     ├─ id : integer
│     ├─ file_name : string
│     ├─ width : integer
│     └─ height : integer
│
└─ annotations (array of object)
   └─ [annotation] (object)
      ├─ id : integer
      ├─ image_id : integer
      ├─ category_id : integer
      ├─ bbox : array of 4 numbers # [x, y, width, height]
      ├─ area : number # float
      ├─ segmentation : array of array of number # [[x1, y1, x2, y2, …]]
      ├─ iscrowd : integer # 0 or 1
      ├─ score : number # float
      ├─ color_name : string
      ├─ color_code : string # e.g. "145-144-122"
      ├─ orientation : string # e.g. "hor" or "ver"
      └─ writing_tool : string # e.g. "pcl"
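As a rough illustration of how this structure might be read, the following Python sketch loads one catalogue file and groups its annotations by page image. It is not part of the dataset's tooling. The file path passed to `load_catalogue`, the helper names, and every value in `sample` (including the file name `page_001.jpg` and the colour name) are assumptions made for the example. Only the field names and the `"145-144-122"`, `"hor"`, and `"pcl"` formats come from the structure above. The meaning of the three numbers in `color_code` is not stated in the description, so the helper simply returns them as integers.

```python
import json
from collections import defaultdict


def load_catalogue(path):
    """Read one CVC JSON file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def parse_color_code(code):
    """Turn a colour code such as "145-144-122" into a tuple of ints."""
    return tuple(int(part) for part in code.split("-"))


def bbox_to_corners(bbox):
    """Convert [x, y, width, height] into (x_min, y_min, x_max, y_max)."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


def index_annotations(catalogue):
    """Group annotations by the file name of the image they refer to."""
    file_names = {img["id"]: img["file_name"] for img in catalogue["images"]}
    by_file = defaultdict(list)
    for ann in catalogue["annotations"]:
        by_file[file_names[ann["image_id"]]].append(ann)
    return dict(by_file)


# Made-up miniature catalogue that follows the documented structure.
sample = {
    "images": [
        {"id": 1, "file_name": "page_001.jpg", "width": 2000, "height": 3000}
    ],
    "annotations": [
        {
            "id": 10,
            "image_id": 1,
            "category_id": 1,  # hypothetical value
            "bbox": [100.0, 200.0, 50.0, 20.0],
            "area": 1000.0,
            "segmentation": [
                [100.0, 200.0, 150.0, 200.0, 150.0, 220.0, 100.0, 220.0]
            ],
            "iscrowd": 0,
            "score": 0.93,
            "color_name": "grey",  # hypothetical value
            "color_code": "145-144-122",
            "orientation": "hor",
            "writing_tool": "pcl",
        }
    ],
}

for file_name, annotations in index_annotations(sample).items():
    for ann in annotations:
        print(
            file_name,
            bbox_to_corners(ann["bbox"]),
            parse_color_code(ann["color_code"]),
        )
```

With a real file, `index_annotations(load_catalogue(path))` would give the same kind of mapping from page image to word-level annotations.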
See the ScriptSight tool for examples of how this Computational Visual Catalogue can be used.
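Independently of ScriptSight, a simple query over the annotation fields can be sketched in a few lines of Python. The example below counts annotations by one property (such as `orientation` or `writing_tool`) and optionally ignores detections below a confidence `score`. The function name `count_by`, the threshold, and the sample records are assumptions for illustration. In particular, `"other_tool"` is a placeholder and not a value known to occur in the dataset.

```python
from collections import Counter


def count_by(annotations, key, min_score=0.0):
    """Count annotations per value of `key`, keeping scores >= min_score."""
    return Counter(
        ann[key] for ann in annotations if ann["score"] >= min_score
    )


# Made-up annotation records reduced to the fields used here.
annotations = [
    {"orientation": "hor", "writing_tool": "pcl", "score": 0.95},
    {"orientation": "hor", "writing_tool": "other_tool", "score": 0.40},
    {"orientation": "ver", "writing_tool": "pcl", "score": 0.80},
]

print(count_by(annotations, "orientation"))
print(count_by(annotations, "writing_tool", min_score=0.5))
```

Applied to the `annotations` array of a catalogue file, the same function would summarise, for instance, how many detected words per notebook were written with each implement.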
Acknowledgements:
The research for this work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures’, project no. 390893796. The research was conducted within the scope of the Centre for the Study of Manuscript Cultures (CSMC) at Universität Hamburg.
The images are provided by the Deutsche Literaturarchiv Marbach (DLA) as part of its collaboration with the CSMC.
Name | MD5 checksum | Size
---|---|---
Computational Visual Catalogue.png | 6d80f39e6e59025e535e6bf95ea4b834 | 54.6 kB
CVC_Rilke_full_12.08.2025.zip | a063341e6e2b779e743e09eb89b3bc2e | 195.8 MB
images.zip | 9ac0464cfbb5c92b73ed2117011c8516 | 16.4 GB
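The MD5 checksums listed for the files can be used to check a download, which is worthwhile for the 16.4 GB image archive in particular. The following is a minimal sketch using Python's standard `hashlib` module. The helper names and the assumption that the files sit in the current working directory are illustrative only.

```python
import hashlib

# File names and checksums as listed for this dataset.
EXPECTED_MD5 = {
    "Computational Visual Catalogue.png": "6d80f39e6e59025e535e6bf95ea4b834",
    "CVC_Rilke_full_12.08.2025.zip": "a063341e6e2b779e743e09eb89b3bc2e",
    "images.zip": "9ac0464cfbb5c92b73ed2117011c8516",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify("images.zip", EXPECTED_MD5["images.zip"])` would return `True` for an intact download located in the current directory. Reading in chunks keeps memory use low even for the largest archive.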