INEL Kamas Corpus

Gusev, Valentin; Klooster, Tiina; Wagner-Nagy, Beáta

doi:10.25592/uhhfdm.13882

December 29, 2023 Dataset Open Access

Citation Style Language JSON Export

{"DOI":"10.25592/uhhfdm.13882","abstract":"<p><strong>Corpus Citation</strong></p>\n\n<p><em>Gusev, Valentin; Klooster, Tiina; Wagner-Nagy, Be&aacute;ta.</em> 2023. &ldquo;INEL Kamas Corpus.&rdquo; Version 2.0. Publication date 2023-12-31. <a href=\"http://hdl.handle.net/11022/0000-0007-FC25-4\">http://hdl.handle.net/11022/0000-0007-FC25-4</a>. Archived at Universit&auml;t Hamburg. In: The INEL corpora of indigenous Northern Eurasian languages.<a href=\"https://hdl.handle.net/11022/0000-0007-F45A-1\">https://hdl.handle.net/11022/0000-0007-F45A-1</a>.</p>\n\n<p><strong>Corpus Description</strong></p>\n\n<p>The INEL Kamas corpus has been created within the long-term INEL project (&quot;Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages&quot;), 2016&ndash;2033. The corpus makes possible typologically aware corpus-based grammatical research on the Kamas language and expands the documentation of the lesser described indigenous languages of Northern Eurasia.</p>\n\n<p>The INEL Kamas corpus consists of two parts: folklore texts collected by Kai Donner in 1912&ndash;1914, and transcribed audio recordings of the last speaker of Kamas, Klavdiya Plotnikova, made between 1964 and 1970.</p>\n\n<p>Each text in the corpus is provided with morphological glossing, translation into English, Russian and German, as well as annotation of syntactic functions, semantic roles, Russian borrowings and code-switching. Some texts also have annotations for information status.</p>\n\n<p><strong>New in release 2.0</strong></p>\n\n<ul>\n\t<li>In texts from Donner&rsquo;s collection, phonetic transcription according to Klumpp&#39;s edition of Donner&rsquo;s manuscripts has been added&nbsp;(as stl tier)</li>\n\t<li>Five texts which were originally split between different tapes have been merged, as well as respective parts of recordings. 
Sentences in each resulting text are numbered throughout\n\t<ul>\n\t\t<li>PKZ_196X_Alenushka_flk + PKZ_196X_Alenushka_continuation_flk &gt; PKZ_196X_Alenushka_flk</li>\n\t\t<li>End of PKZ_196X_SU0226 starting from PKZ_196X_SU0226.203 (210) + PKZ_196X_Alenushka2_continuation_flk &gt; PKZ_196X_Alenushka2_flk</li>\n\t\t<li>PKZ_196X_BlacksmithAndMerchant_flk + PKZ_196X_BlacksmithAndMerchant_cont_flk &gt; PKZ_196X_BlacksmithAndMerchant_flk</li>\n\t\t<li>PKZ_196X_Finist_flk + PKZ_196X_Finist_continuation_flk&nbsp;&gt;&nbsp;PKZ_196X_Finist_flk</li>\n\t\t<li>PKZ_196X_StupidWolf_flk + PKZ_196X_StupidWolf_continuation_flk &gt; PKZ_196X_StupidWolf_flk</li>\n\t</ul>\n\t</li>\n\t<li>Part of the texts are now annotated for existential, locative and possessive predication (ExLocPoss tier, by C.L.&nbsp;D&auml;britz)</li>\n\t<li>Numerous corrections in glosses, other annotations and transcriptions, including:\n\t<ul>\n\t\t<li>Fuller and more consistent transcription, glossing and annotations of borrowings</li>\n\t\t<li>Vowel length is marked in mp tier in <em>ba\u02d0zo\u0294</em> &lsquo;again&rsquo;, <em>b&uuml;\u02d0z\u02bce</em> &lsquo;man&rsquo; and <em>sa\u02d0g\u0259r</em> &lsquo;black&rsquo;</li>\n\t\t<li>Corrections in disambiguation of polysemous or homonymous morphemes:&nbsp;<br>\n\t\t-zi\u0294&nbsp;&quot;INS&quot;/&quot;COM&quot;, -d\u0259 &quot;LAT&quot;/&quot;3SG&quot;, mo- &quot;can/become/want | \u043c\u043e\u0447\u044c/\u0441\u0442\u0430\u0442\u044c/\u0445\u043e\u0442\u0435\u0442\u044c&quot;</li>\n\t\t<li>Possessive suffix unmarked for case: &quot;NOM/GEN/ACC&quot; &gt; &quot;POSS&quot;</li>\n\t\t<li>Glosses for personal pronouns were changed to uniform labels: &quot;I | \u044f&quot; &gt; &quot;PRO1SG&quot;, &quot;we | \u043c\u044b&quot; &gt; &quot;PRO1PL&quot;, &quot;you | \u0442\u044b&quot;&nbsp;&gt;&nbsp;&quot;PRO2SG&quot;, &quot;you.PL | \u0432\u044b&quot; &gt; &quot;PRO2PL&quot;</li>\n\t\t<li>Fuller annotations of code-switching and calques (CS 
tier)</li>\n\t</ul>\n\t</li>\n\t<li>Added ELAN *.eaf as a supplementary end-user file format for all transcripts</li>\n</ul>\n\n<p><strong>Funding</strong></p>\n\n<p>The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies&rsquo; Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies&rsquo; Programme is coordinated by the Union of the German Academies of Sciences and Humanities.</p>\n\n<p><strong>Contributions/Acknowledgements</strong></p>\n\n<ul>\n\t<li>\n\t<p>Recordings of Kamas speech made by Ago K&uuml;nnap in Abalakovo and by Tiit-Rein Viitso in Tartu provided by the Archive of Estonian Dialects and Kindred Languages of the University of Tartu, Estonia (AEDKL, or T&Uuml;EMSA).</p>\n\t</li>\n\t<li>\n\t<p>Recordings of Klavdiya Plotnikova made by Jaakko Yli-Paavola in Tallinn in 1970 provided by the Institute for the Languages of Finland archive, Helsinki (KOTUS).</p>\n\t</li>\n\t<li>\n\t<p>Scanned pages from the Kai Donners Kamassisches W&ouml;rterbuch (Joki 1944) containing texts collected by Kai Donner published online courtesy of the Finno-Ugrian Society.</p>\n\t</li>\n\t<li>\n\t<p>The web-based search interface is using the Tsakonian Corpus platform developed by Dr. Timofey Arkhangelskiy.</p>\n\t</li>\n</ul>","author":[{"family":"Gusev, Valentin"},{"family":"Klooster, Tiina"},{"family":"Wagner-Nagy, Be\u00e1ta"}],"id":"13882","issued":{"date-parts":[[2023,12,29]]},"language":"xas","title":"INEL Kamas Corpus","type":"dataset","version":"2.0"}
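
The export above stores the whole record description as HTML inside the `abstract` field, and it puts each author's full name in `family`. A minimal sketch of reading such a record with the Python standard library follows. The `raw` string is a trimmed stand-in for the export (same keys, shortened abstract), and the helper names `abstract_to_text` and `_TextExtractor` are illustrative assumptions, not part of any INEL or repository tooling.

```python
import json
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops all tags (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &aacute; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def abstract_to_text(html_abstract: str) -> str:
    """Return the abstract as plain text with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html_abstract)
    parser.close()  # flush any buffered text
    return " ".join("".join(parser.parts).split())


# Trimmed stand-in for the export above; the real abstract is much longer.
raw = r'''{"DOI":"10.25592/uhhfdm.13882",
"abstract":"<p><strong>Corpus Citation</strong></p>\n\n<p><em>Gusev, Valentin; Klooster, Tiina; Wagner-Nagy, Be&aacute;ta.</em> 2023.</p>",
"author":[{"family":"Gusev, Valentin"},{"family":"Klooster, Tiina"},{"family":"Wagner-Nagy, Be\u00e1ta"}],
"language":"xas","title":"INEL Kamas Corpus","type":"dataset","version":"2.0"}'''

record = json.loads(raw)

# The export keeps "Family, Given" in the family field, so split it here.
authors = [tuple(a["family"].split(", ", 1)) for a in record["author"]]
plain_abstract = abstract_to_text(record["abstract"])

print(record["title"], record["version"], record["language"])
print(authors)
print(plain_abstract)
```

Using `HTMLParser` rather than a regular expression keeps entity decoding and tag stripping in one pass; non-breaking spaces from `&nbsp;` are collapsed by the final `split()`.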

Publication date:

December 29, 2023

DOI:

10.25592/uhhfdm.13882

Keyword(s):

endangered language; indigenous language; L1 data; language contact; language documentation; INEL; folklore; narrative; monologue; annotated; morphological glossing; borrowings; code-switching; semantic roles; syntactic functions; information status; English translation; German translation; Russian translation

Related identifiers:

Cited by:
11022/0000-0007-FC25-4

License (for files):

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International

Versions

Version 2.0 10.25592/uhhfdm.13882	Dec 29, 2023
Version 1.0 10.25592/uhhfdm.9752	Dec 15, 2019
Version 0.1 10.25592/uhhfdm.9741	Dec 31, 2018

Cite all versions? You can cite all versions by using the DOI 10.25592/uhhfdm.9740. This DOI represents all versions, and will always resolve to the latest one.
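
The choice between a version-specific DOI and the all-versions DOI can be sketched as a small lookup. The DOIs and the `https://doi.org/` resolver prefix are taken from this record; the names `VERSION_DOIS`, `ALL_VERSIONS_DOI` and `doi_url` are illustrative assumptions.

```python
from typing import Optional

# Version-specific DOIs as listed in the Versions table of this record.
VERSION_DOIS = {
    "2.0": "10.25592/uhhfdm.13882",
    "1.0": "10.25592/uhhfdm.9752",
    "0.1": "10.25592/uhhfdm.9741",
}

# DOI that represents all versions and resolves to the latest one.
ALL_VERSIONS_DOI = "10.25592/uhhfdm.9740"


def doi_url(version: Optional[str] = None) -> str:
    """Build a resolver URL; with no version, use the all-versions DOI."""
    doi = ALL_VERSIONS_DOI if version is None else VERSION_DOIS[version]
    return f"https://doi.org/{doi}"


print(doi_url())       # all versions
print(doi_url("1.0"))  # pinned to release 1.0
```

Citing a pinned version keeps a reference reproducible, while the all-versions DOI suits general pointers to the corpus.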

Zentrum für Nachhaltiges Forschungsdatenmanagement

DOI Badge

Markdown

[![DOI](https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.13882.svg)](https://doi.org/10.25592/uhhfdm.13882)

reStructuredText

.. image:: https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.13882.svg
   :target: https://doi.org/10.25592/uhhfdm.13882

HTML

<a href="https://doi.org/10.25592/uhhfdm.13882"><img src="https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.13882.svg" alt="DOI"></a>

Image URL

https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.13882.svg

Target URL

https://doi.org/10.25592/uhhfdm.13882
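
All badge variants above are built from the same two URLs, so they can be generated from a DOI. The sketch below assumes the badge URL pattern seen in this record holds for other DOIs on the same host, which is not confirmed here; `BADGE_HOST` and `badge_snippets` are hypothetical names.

```python
# Badge host as it appears in this record; assumed to follow the same
# pattern for other DOIs on this repository.
BADGE_HOST = "https://www.fdr.uni-hamburg.de/badge/DOI"


def badge_snippets(doi: str) -> dict:
    """Return Markdown, reStructuredText and HTML badge snippets for a DOI."""
    image = f"{BADGE_HOST}/{doi}.svg"
    target = f"https://doi.org/{doi}"
    return {
        "image_url": image,
        "target_url": target,
        "markdown": f"[![DOI]({image})]({target})",
        # The :target: option must be indented under the image directive.
        "rst": f".. image:: {image}\n   :target: {target}",
        "html": f'<a href="{target}"><img src="{image}" alt="DOI"></a>',
    }


snippets = badge_snippets("10.25592/uhhfdm.13882")
for name, text in snippets.items():
    print(f"{name}:\n{text}\n")
```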
