Dataset Open Access

NER Model 20 of the Dehmel Digital Project

Flüh, Marie

This dataset contains two types of resources. First, three Named Entity Recognition (NER) models developed in the project "Dehmel digital" for the automatic annotation of persons, places, artworks and organisations in letters from the period around 1900. The training corpus for model 20 consists of 400,000 manually annotated tokens.
Second, a table that breaks down the results of the performance test in detail. Performance was calculated on eight different test texts, each consisting of 10,000 manually annotated tokens.
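The record does not state which metric the performance test used. As an illustration only, the following sketch assumes token-level precision, recall and F1, computed per test text and then averaged over all test texts. The label names (PER, LOC, ORG, O) and both function names are hypothetical and are not taken from the models or the results table.

```python
def precision_recall_f1(gold, predicted, outside_label="O"):
    """Token-level scores for one test text (assumed metric, see lead-in)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        if p != outside_label and p == g:
            tp += 1                      # entity token labelled correctly
        else:
            if p != outside_label:
                fp += 1                  # predicted an entity label that is wrong
            if g != outside_label:
                fn += 1                  # gold entity token missed or mislabelled
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def macro_f1(test_texts):
    """Mean F1 over several (gold, predicted) label sequences."""
    scores = [precision_recall_f1(g, p)[2] for g, p in test_texts]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical labels for a five-token snippet, not real project data.
    gold = ["PER", "PER", "O", "LOC", "O"]
    pred = ["PER", "O", "O", "ORG", "LOC"]
    print(precision_recall_f1(gold, pred))
```

Averaging per text rather than pooling all tokens gives every test text the same weight, which seems a reasonable reading when each text has the same length of 10,000 tokens.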
 

Files (158.4 MB)
Name                                    Size       Checksum
ner-modell_20_mit_Liste_clean.ser.gz    54.3 MB    md5:0ca4558f13f1e8ad730d9fbcdd657123
ner-modell_20_mit_Liste_sloppy.ser.gz   52.5 MB    md5:30f7cd4be1b4a5eaba5005115f7ab738
ner-modell_20_ohne_liste.ser.gz         51.3 MB    md5:b9ae49296886dddcf9f31f0790adbd7f
Testseries_M20.numbers                  228.1 kB   md5:c9e35404e1393c6991521fb412f38d1e
Testseries_M20.xlsx                     8.5 kB     md5:799de9105e6c27453cbb7665817523dc
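
The MD5 checksums listed above can be used to check that a download is complete and uncorrupted. The following is a minimal sketch using Python's standard `hashlib` module. It assumes the files were saved under their original names in a single directory. The function names `md5_of_file` and `verify_downloads` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# File names and MD5 values copied from the file list of this record.
EXPECTED_MD5 = {
    "ner-modell_20_mit_Liste_clean.ser.gz": "0ca4558f13f1e8ad730d9fbcdd657123",
    "ner-modell_20_mit_Liste_sloppy.ser.gz": "30f7cd4be1b4a5eaba5005115f7ab738",
    "ner-modell_20_ohne_liste.ser.gz": "b9ae49296886dddcf9f31f0790adbd7f",
    "Testseries_M20.numbers": "c9e35404e1393c6991521fb412f38d1e",
    "Testseries_M20.xlsx": "799de9105e6c27453cbb7665817523dc",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to 'ok', 'mismatch' or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    for name, status in verify_downloads().items():
        print(f"{status:9} {name}")
```

A file reported as `mismatch` should be downloaded again before use.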
