Dataset Open Access

Synthetic MSI Images of Georgian Palimpsests (SGP Dataset)

Mahdi Jampour; Hussein Mohammed; Jost Gippert

This is a dataset of Synthetic MSI Images of Georgian Palimpsests (SGP Dataset). It was created to train inpainting models that remove the overtext and reconstruct the undertext. It consists of three subsets: train, test, and validation. Each synthetic palimpsest has its own mask and ground truth images, as follows:

  • Ground Truth Image: ImageName_a
  • Mask Image: ImageName_b
  • Synthetic Palimpsest: ImageName_c
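The naming convention above can be used to pair each synthetic palimpsest with its mask and ground truth image. The following is a minimal sketch, not part of the dataset itself: the function name group_triplets and the role labels are hypothetical, and it assumes that file names follow the ImageName_a / ImageName_b / ImageName_c pattern with a single file extension such as .png.

```python
from collections import defaultdict
from pathlib import Path

# Suffix-to-role mapping taken from the naming convention in the description.
# The role labels themselves are illustrative names, not defined by the dataset.
SUFFIX_ROLES = {"a": "ground_truth", "b": "mask", "c": "palimpsest"}


def group_triplets(filenames):
    """Group file names into {ImageName: {role: filename}} dictionaries.

    Only image names for which all three roles are present are returned.
    """
    groups = defaultdict(dict)
    for name in filenames:
        stem = Path(name).stem                 # e.g. "Image_0000_a"
        base, sep, suffix = stem.rpartition("_")
        if sep and suffix in SUFFIX_ROLES:
            groups[base][SUFFIX_ROLES[suffix]] = name
    # Keep only complete triplets (ground truth, mask, and palimpsest).
    return {
        base: roles
        for base, roles in groups.items()
        if len(roles) == len(SUFFIX_ROLES)
    }
```

For a training loop, the palimpsest would typically serve as the model input and the ground truth image as the target, with the mask marking the overtext region; how the mask is actually consumed depends on the inpainting model in use.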

The typeface used to generate the synthetic training samples in this dataset is for a very particular and unknown script. The first draft of the typeface was created by Jost Gippert in 2005, while the final version was prepared by Andreas Stötzner in 2007.

 

The research for this work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures’, project no. 390893796. The research was conducted within the scope of the Centre for the Study of Manuscript Cultures (CSMC) at Universität Hamburg.

Files (13.9 GB)

Name                   Size       MD5
Image_0000_a.png       5.6 MB     b7495684fea4d6f1bb7bac972937b880
Image_0000_b.png       391.7 kB   3f281d7b7155e71065d7bc9ace48d15c
Image_0000_c.png       8.0 MB     09e4c7f0b649d912ae12710972b9ea19
The SGP Dataset.zip    13.9 GB    33f0d7cc51c0e21cae390dd1fdfc45ab
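Because the archive is large, it may be worth checking a download against the MD5 checksum published for each file. The following is a minimal sketch using only the Python standard library; the function names md5_of_file and matches_listed_md5 are hypothetical, and the local file path is whatever the file was saved as.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks.

    Reading in chunks keeps memory use constant even for multi-gigabyte files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's MD5 digest with a listed checksum.

    The listed value may be given with or without a leading "md5:" prefix.
    """
    expected = listed.strip().split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listed_md5("Image_0000_b.png", "3f281d7b7155e71065d7bc9ace48d15c") should return True for an intact copy of that file, assuming it was saved under that name in the current directory.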
