Dataset Open Access

Computational Visual Catalogue (CVC) - Rilke's notebooks - Full CVCs

Quang-Vinh Dang; Hussein Mohammed

This is the complete set of Computational Visual Catalogues (CVCs) that we generated for Rilke's notebooks (56 notebooks in total). The images are pages from the notebooks of Rainer Maria Rilke, from the Deutsches Literaturarchiv Marbach (DLA), A:Rilke-Archiv Gernsbach. The JSON files are computationally generated using several AI models and contain information automatically extracted from the images about various visual properties of the text, such as word location, colour, orientation, and the writing implement used.

The structure of the JSON files is as follows:

Root (object)
├─ info (object)
│   ├─ description   : string
│   ├─ contributor   : string
│   ├─ version       : string
│   ├─ year          : integer
│   └─ date_created  : string    # "YYYY-MM-DD"

├─ images (array of object)
│   └─ [image] (object)
│       ├─ id        : integer
│       ├─ file_name : string
│       ├─ width     : integer
│       └─ height    : integer

└─ annotations (array of object)
    └─ [annotation] (object)
        ├─ id            : integer
        ├─ image_id      : integer
        ├─ category_id   : integer
        ├─ bbox          : array of 4 numbers      # [x, y, width, height]
        ├─ area          : number                 # float
        ├─ segmentation  : array of array of number  # [[x1, y1, x2, y2, …]]
        ├─ iscrowd       : integer                # 0 or 1
        ├─ score         : number                 # float
        ├─ color_name    : string
        ├─ color_code    : string                 # e.g. "145-144-122"
        ├─ orientation   : string                 # e.g. "hor" or "ver"
        └─ writing_tool  : string                 # e.g. "pcl"
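
The structure above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset: the function names are illustrative, the files are assumed to be UTF-8 encoded, and the three components of color_code are treated only as integers because the record does not state which colour space they belong to.

```python
import json
from collections import defaultdict


def load_cvc(path):
    """Load one CVC JSON file and return the parsed dictionary."""
    # Assumption: the JSON files are UTF-8 encoded.
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def annotations_by_image(cvc):
    """Group annotations under the file name of the image they refer to."""
    # Map each image id to its file name, as given in the "images" array.
    id_to_name = {img["id"]: img["file_name"] for img in cvc["images"]}
    grouped = defaultdict(list)
    for ann in cvc["annotations"]:
        grouped[id_to_name[ann["image_id"]]].append(ann)
    return dict(grouped)


def parse_color_code(code):
    """Split a color_code such as "145-144-122" into a tuple of integers."""
    return tuple(int(part) for part in code.split("-"))


def bbox_corners(bbox):
    """Convert [x, y, width, height] into (x_min, y_min, x_max, y_max)."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)
```

For instance, `annotations_by_image(load_cvc("notebook.json"))` would return a dictionary keyed by page image file name, where `notebook.json` is a hypothetical file name standing in for one of the JSON files in the archive.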
 

See the ScriptSight tool for examples of how this computational visual catalogue can be used.

Acknowledgements: 
The research for this work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures’, project no. 390893796. The research was conducted within the scope of the Centre for the Study of Manuscript Cultures (CSMC) at Universität Hamburg.

The images are provided by the Deutsches Literaturarchiv Marbach (DLA) as part of its collaboration with the CSMC.

Files (16.6 GB)
Name                                  Size       md5
Computational Visual Catalogue.png    54.6 kB    6d80f39e6e59025e535e6bf95ea4b834
CVC_Rilke_full_12.08.2025.zip         195.8 MB   a063341e6e2b779e743e09eb89b3bc2e
images.zip                            16.4 GB    9ac0464cfbb5c92b73ed2117011c8516
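
The md5 values listed for the files can be used to check a download for corruption. The following is a minimal sketch under the assumption that the files were saved locally under the names shown in the listing; the function names and the chunk size are illustrative choices, not part of the record.

```python
import hashlib

# md5 checksums as listed on the record page.
EXPECTED_MD5 = {
    "Computational Visual Catalogue.png": "6d80f39e6e59025e535e6bf95ea4b834",
    "CVC_Rilke_full_12.08.2025.zip": "a063341e6e2b779e743e09eb89b3bc2e",
    "images.zip": "9ac0464cfbb5c92b73ed2117011c8516",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Chunked reading keeps memory use low for the 16.4 GB archive.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Return True if the file at path has the expected md5 checksum."""
    return md5_of_file(path) == expected_md5.lower()
```

A call such as `matches_expected("images.zip", EXPECTED_MD5["images.zip"])` returns `True` only if the local copy is byte-identical to the published file.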
