Presentation Open Access

Data Linking Workshop 2023: Computer Vision and Natural Language Processing – Challenges in the Humanities

Sylvia Melzer

The humanities meet computer science to create new synergies using computer vision and natural language processing.

Aim & Scope

Historians increasingly use technologies that make digitised texts machine-readable, together with techniques from natural language processing (NLP), to analyse the content and context of language in written artefacts. These techniques can be applied to large corpora to identify patterns. However, the underlying methods are often trained on contemporary rather than historical data. Applying them uncritically can introduce biases into the historical record and risks false inferences about history. The methods used should therefore be investigated thoroughly so that such biases can be accounted for. This Data Linking (DL) workshop presents the challenges of applying computer vision and NLP techniques in the humanities, together with first solutions to them.

This entry includes the following presentations from the first Data Linking Workshop 2023: Computer Vision and Natural Language Processing – Challenges in the Humanities:

  • Pepper: Welcome
  • Eva Wilden, Charles Li: Tamilex -- Digital Lexicography
  • Stefan Baums, Stephen White: Computer Vision and Kharoṣṭhī Paleography
  • Oskar von Hinüber, Haiyan Hu-von Hinüber, Sylvia Melzer: What the Buddhological Epigraphy can expect from the AI: The Information System "Buddhist Bronzes Inscriptions"
  • Kathrin Holz: The Proto-Śāradā Project: Towards the edition of a new collection of administrative letters and documents from pre-modern South Asia
  • Ines Konczak-Nagel, Erik Radisch: The Kucha Mural Information System: Taxonomy and Semi-Automated Image Recognition
  • Ralf Möller: Aligned AI and the role of the humanities: Training AI systems using human feedback
  • Isabelle Marthot-Santaniello: The application of NLP in combination with Computer Vision for analysing ancient Greek handwritings on papyri
  • Olga Serbaeva: Some features of the 17th century Newārī script: READ-based statistical approach to palaeography
  • Lena Hinrichsen: OCR technologies in research practice
  • Oliver Hellwig: Web-based information systems for Indian scripts and texts
  • Simon Schiff, Ralf Möller: Persistent Data, Sustainable Information
  • Hamid Reza Hakimi, Lisa Mischer, Tariq Yousef, Maxim Romanov: Finding and Linking Information in Arabic Historical Texts
  • Sylvia Melzer: Building Information Systems on Demand with ChatGPT?
  • Martin Braun, Hannes Fellner, Bernhard Koller: A Digital Paleography of Tarim Brahmi
  • Hussein Mohammed: Computer vision beyond OCR: potentials and challenges for the study of written artefact

This upload includes those submitted presentations for which permission to publish has been granted.

The KI2021 workshop – Humanities-Centred AI was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures’, project no. 390893796.

Files (188.3 MB)
  • 00_DataLinkingWorkshop2023-Agenda.pdf (186.7 kB), md5:9b2cbccce86bb7bed1e5f93fc262a05c
  • 01-Welcome-text_Pepper.pdf (63.0 kB), md5:32e214d14b5602f30badc5ed8bae53e8
  • 01-Welcome_Pepper.mp4 (64.2 MB), md5:5d7ca1f4605f1623b4fb7f9d054529d5
  • 02-Keynote1_Tamilex_Wilden_Li.pdf (1.9 MB), md5:aacc5e192f98a57639333fc64b68e968
  • 03-READ_Baums-White.pdf (4.7 MB), md5:0557c41bc4df7a4c0d47047720a8e1f4
  • 04-BuBronIn_Hinueber-Hu-Melzer.pdf (2.8 MB), md5:06ac40309baa4e7a3e48fbc19b39b101
  • 05-Proto-Śāradā_Holz.pdf (5.5 MB), md5:62a97359daef637ec982e3ed055de5ec
  • 06a-Taxonomy_Konczak-Nagel.pdf (12.2 MB), md5:4c39db6a6bf827fcc6ebca81f5194137
  • 06b_Deep_Learning _KMIS-Radisch.pdf (7.5 MB), md5:105180fc33beee0338509b26241dc608
  • 07-Keynote2_Aligned-AI_Moeller.pdf (2.8 MB), md5:94429cfa868499a0010629a9eee5f339
  • 08-Greek-papyri_Marthot-Santaniello.pdf (7.9 MB), md5:f38535cc6c028d30c11524ccfa3a77f4
  • 09-Newari_Serbaeva.pdf (13.1 MB), md5:d4414e58a55a1f7d0e2a3853469375f8
  • 10-OCR_Hinrichsen.pdf (1.2 MB), md5:c38b112b17c7112e14e0329cdbab6477
  • 12-PersistentData_Schiff_Moeller.pdf (1.4 MB), md5:6c6b0e940636123f37fe36221a77bbc7
  • 13-FindingAndLinking_HMYR.pdf (3.2 MB), md5:d04dcf7b78fbabd807aa94f631535efd
  • 14-InfSoD_Melzer.pdf (1.3 MB), md5:577bacc4dad16005f58fc7aedbaf1b22
  • 15-Digital_Paleography_Koller.pdf (3.3 MB), md5:7412e864ac81abd6ed56d3153f25a6f6
  • 16-ComputerVision_Mohammed.pdf (1.6 MB), md5:91866e50669f14c941755640e1ec8b74
  • 17-Bye_Pepper.mp4 (53.3 MB), md5:10e61f25edecc9c00daa32606a66efee
  • 17-Bye_Pepper.pdf (59.4 kB), md5:b26693a3ee7a4cf024b8eb2482c680ea
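
The md5 values listed with each file can be used to check that a download arrived intact. The following is a minimal sketch of such a check in Python, using only the standard library. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of this record, and the local file path is assumed to be wherever the file was saved. An MD5 match indicates that the file was not corrupted in transfer; it is not a guarantee against deliberate tampering.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large videos do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Assumed local path; the checksum is copied from the file list above.
    agenda = Path("00_DataLinkingWorkshop2023-Agenda.pdf")
    if agenda.exists():
        ok = matches_listed_checksum(agenda, "md5:9b2cbccce86bb7bed1e5f93fc262a05c")
        print("checksum matches" if ok else "checksum differs")
```

A mismatch usually means an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.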
