INEL Selkup Corpus

Brykina, Maria; Orlova, Svetlana; Wagner-Nagy, Beáta

doi:10.25592/uhhfdm.9753

June 30, 2020 Dataset Open Access

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:contributor>Wagner-Nagy, Beáta</dc:contributor>
  <dc:contributor>Arkhipov, Alexandre</dc:contributor>
  <dc:contributor>Brykina, Maria</dc:contributor>
  <dc:contributor>Orlova, Svetlana</dc:contributor>
  <dc:contributor>Ferger, Anne</dc:contributor>
  <dc:contributor>Jettka, Daniel</dc:contributor>
  <dc:contributor>Lehmberg, Timm</dc:contributor>
  <dc:creator>Brykina, Maria</dc:creator>
  <dc:creator>Orlova, Svetlana</dc:creator>
  <dc:creator>Wagner-Nagy, Beáta</dc:creator>
  <dc:date>2020-06-30</dc:date>
  <dc:description>Corpus Citation

Brykina, Maria; Orlova, Svetlana; Wagner-Nagy, Beáta. 2020. INEL Selkup Corpus. Version 1.0. Publication date 2020-06-30. Archived in Hamburger Zentrum für Sprachkorpora. http://hdl.handle.net/11022/0000-0007-E1D5-A. In: Wagner-Nagy, Beáta; Arkhipov, Alexandre; Ferger, Anne; Jettka, Daniel; Lehmberg, Timm (eds.). The INEL corpora of indigenous Northern Eurasian languages.

Corpus Description

The INEL Selkup corpus has been created within the long-term INEL project (“Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages”), 2016–2033. The corpus enables typologically aware, corpus-based grammatical research on the Selkup language and expands the documentation of the lesser-described indigenous languages of Northern Eurasia.

The INEL Selkup corpus is composed of texts from the archive of Angelina Ivanovna Kuzmina (1924–2002), who, between 1962 and 1977, gathered a large amount of Selkup material in almost all regions where the Selkup people lived. The archive was transferred by A.I. Kuzmina to Eugen Helimski and acquired by the Universität Hamburg in 2001. Most texts in the corpus originate from the handwritten part of the archive; the others come from sound recordings made by A.I. Kuzmina, which were transcribed and translated within the INEL project.

Funding

The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies’ Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies’ Programme is coordinated by the Union of the German Academies of Sciences and Humanities.

Contributions/Acknowledgements

Audio recordings made by Angelina Kuzmina were transcribed and translated by native speakers of Selkup:


	Irina Anatolyevna Korobejnikova, written transcription and Russian translation of audio in Central and Southern dialects
	Natalya Platonovna Izhenbina, written transcription and Russian translation of audio in Southern dialects
	Svetlana Nikitichna Sankevich (Kunina), oral transcription and Russian translation of audio in Northern dialects
	Evgeniya Sergeevna Smorgunova (Irikova), oral and written transcription and Russian translation of audio in Northern dialects
	Valentina Vladimirovna Tamelkina, oral transcription and Russian translation of audio in Northern dialects


For individual contributions to the collection, transcription and analysis of individual texts, please refer to the user documentation and to the corpus metadata.

The web-based search interface uses the Tsakorpus platform developed by Dr. Timofey Arkhangelskiy, Humboldt Research Fellow at IFUU, Hamburg University.

New in release 1.0


	The corpus now contains 264 texts from 74 speakers, representing the dialects of Middle Taz, Upper Tolka, Baikha (Northern), Narym and Tym (Central), Upper and Middle Ob, Chaya, Upper and Middle Ket (Southern). These contain 7887 sentences and 42466 words in total.
	Many texts have been provided with annotations for syntactic functions and semantic roles.
	Corrections to audio transcriptions, glossing and other annotations.
</dc:description>
  <dc:identifier>https://www.fdr.uni-hamburg.de/record/9753</dc:identifier>
  <dc:identifier>10.25592/uhhfdm.9753</dc:identifier>
  <dc:identifier>oai:fdr.uni-hamburg.de:9753</dc:identifier>
  <dc:language>sel</dc:language>
  <dc:relation>handle:11022/0000-0007-E1D5-A</dc:relation>
  <dc:relation>doi:10.25592/uhhfdm.9721</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode</dc:rights>
  <dc:subject>endangered language</dc:subject>
  <dc:subject>indigenous language</dc:subject>
  <dc:subject>L1 data</dc:subject>
  <dc:subject>language contact</dc:subject>
  <dc:subject>language documentation</dc:subject>
  <dc:subject>INEL</dc:subject>
  <dc:subject>folklore</dc:subject>
  <dc:subject>narrative</dc:subject>
  <dc:subject>monologue</dc:subject>
  <dc:subject>annotated</dc:subject>
  <dc:subject>morphological glossing</dc:subject>
  <dc:subject>borrowings</dc:subject>
  <dc:subject>code-switching</dc:subject>
  <dc:subject>semantic roles</dc:subject>
  <dc:subject>syntactic functions</dc:subject>
  <dc:subject>information status</dc:subject>
  <dc:subject>English translation</dc:subject>
  <dc:subject>German translation</dc:subject>
  <dc:subject>Russian translation</dc:subject>
  <dc:title>INEL Selkup Corpus</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
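
Because the export above is plain OAI Dublin Core, its fields can be read with any namespace-aware XML parser. The following is a minimal sketch using Python's standard library. The `SAMPLE` string is a trimmed excerpt of the record above, and the function name `summarize` and the rule "the DOI is the identifier that starts with `10.`" are illustrative assumptions, not part of the repository's tooling.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the export above.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# Trimmed excerpt of the export, kept short for illustration only.
SAMPLE = """<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/">
  <dc:creator>Brykina, Maria</dc:creator>
  <dc:creator>Orlova, Svetlana</dc:creator>
  <dc:creator>Wagner-Nagy, Beáta</dc:creator>
  <dc:date>2020-06-30</dc:date>
  <dc:identifier>https://www.fdr.uni-hamburg.de/record/9753</dc:identifier>
  <dc:identifier>10.25592/uhhfdm.9753</dc:identifier>
  <dc:language>sel</dc:language>
  <dc:subject>endangered language</dc:subject>
  <dc:subject>L1 data</dc:subject>
  <dc:title>INEL Selkup Corpus</dc:title>
</oai_dc:dc>"""


def summarize(xml_text):
    """Collect a few Dublin Core fields from an oai_dc record (hypothetical helper)."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))

    def texts(tag):
        # Repeatable elements such as dc:creator come back as a list.
        return [el.text for el in root.findall(f"dc:{tag}", NS)]

    identifiers = texts("identifier")
    # Assumption: the bare DOI is the identifier beginning with "10.".
    doi = next((i for i in identifiers if i.startswith("10.")), None)
    return {
        "title": texts("title")[0],
        "creators": texts("creator"),
        "date": texts("date")[0],
        "language": texts("language")[0],
        "doi": doi,
        "subjects": texts("subject"),
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

Keeping each `dc:subject` as a separate list entry preserves multi-word keywords such as "L1 data", which are hard to tell apart once the values are joined with spaces.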

Publication date:

June 30, 2020

DOI:

10.25592/uhhfdm.9753

Keyword(s):

endangered language; indigenous language; L1 data; language contact; language documentation; INEL; folklore; narrative; monologue; annotated; morphological glossing; borrowings; code-switching; semantic roles; syntactic functions; information status; English translation; German translation; Russian translation

Related identifiers:

Cited by:
11022/0000-0007-E1D5-A

License (for files):

Creative Commons Attribution Non Commercial Share Alike 4.0 International

Versions

Version 2.0 10.25592/uhhfdm.9754	Dec 31, 2021
Version 1.0 10.25592/uhhfdm.9753	Jun 30, 2020
Version 0.1 10.25592/uhhfdm.9722	Dec 31, 2018

Cite all versions? You can cite all versions by using the DOI 10.25592/uhhfdm.9721. This DOI represents all versions, and will always resolve to the latest one.

DOI Badge

Markdown

[![DOI](https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.9753.svg)](https://doi.org/10.25592/uhhfdm.9753)

reStructuredText

.. image:: https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.9753.svg
   :target: https://doi.org/10.25592/uhhfdm.9753

HTML

<a href="https://doi.org/10.25592/uhhfdm.9753"><img src="https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.9753.svg" alt="DOI"></a>

Image URL

https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.9753.svg

Target URL

https://doi.org/10.25592/uhhfdm.9753
