INEL Kalmyk Corpus

Baranova, Vlada

doi:10.25592/uhhfdm.17676

July 17, 2025 | Dataset | Open Access

Corpus citation

Baranova, Vlada. 2025. INEL Kalmyk Corpus. Archived at Universität Hamburg. Version 1.0. Publication date 2025-07-17. https://hdl.handle.net/11022/0000-0007-FFB1-2. Archived at Universität Hamburg. In: The INEL Corpora of Indigenous Northern Eurasian Languages. https://hdl.handle.net/11022/0000-0007-F45A-1.

Corpus Description

The INEL Kalmyk Corpus has been created within the long-term INEL project ("Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages"), 2016–2033.

The corpus consists of transcribed audio recordings collected in the Republic of Kalmykia between 2007 and 2018 in the Ketchenerovsky District (Derbet and Torgut dialects).

All texts in the corpus are provided with interlinear morpheme-by-morpheme glosses and translations into English and Russian. All texts for which the audio recordings were accessible are time-aligned with them.

Corpus Size

The corpus contains 55 texts, 2,076 sentences, and 19,742 tokens. The total duration of the audio recordings is 4 hours and 23 minutes.

Funding

The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies' Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies' Programme is coordinated by the Union of the German Academies of Sciences and Humanities.

Contributions / Acknowledgements

Native speakers generously shared their knowledge of Kalmyk, making the creation of this corpus possible. Zamira Xejchieva and Galina Cabdy`rova assisted with oral transcription and the Russian translation of the audio materials.

Part of the materials were recorded during joint expeditions of St. Petersburg University and the Institute for Linguistic Studies of the Russian Academy of Sciences in 2007–2008, under the direction of Elena Perekhvalskaya and Sergey Say.

This corpus primarily follows the transcription system and partially adopts the glossing conventions developed by a research team led by Sergey Say, with input from other expedition participants.

Searching the corpus

The corpus can be downloaded from the ZFDM Repository and browsed or searched locally using the EXMARaLDA software (https://exmaralda.org/) or, alternatively, ELAN (https://archive.mpi.nl/tla/elan).

Online search with the Tsakorpus platform is available at https://inel.corpora.uni-hamburg.de/KalmykCorpus/search.

Remote search with EXMARaLDA is also possible without downloading all the files (see https://inel.corpora.uni-hamburg.de/portal/help/en/index.php).

See the user documentation (section 3) for details on transcription, annotation tiers and annotation tags.

Find further information and links on the Kalmyk Corpus page at the INEL Resources portal: https://inel.corpora.uni-hamburg.de/portal/corpora/kalmyk/.

Language:

xal
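The record says the downloaded corpus can be browsed or searched locally with EXMARaLDA. For readers who prefer a script, the following is a minimal sketch of reading an EXMARaLDA basic transcription with the Python standard library. It assumes the usual EXMARaLDA layout (a common timeline of `tli` elements and `tier` elements containing `event` elements). The miniature sample document, its tier categories (`tx`, `fe`), and its contents are invented for illustration and are not taken from the corpus; consult the user documentation for the actual tier names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature transcription: tier categories and text are
# invented for illustration, not copied from the INEL Kalmyk Corpus.
SAMPLE_EXB = """
<basic-transcription>
  <head/>
  <basic-body>
    <common-timeline>
      <tli id="T0" time="0.0"/>
      <tli id="T1" time="1.5"/>
      <tli id="T2" time="3.2"/>
    </common-timeline>
    <tier id="tx" speaker="SPK0" category="tx" type="t">
      <event start="T0" end="T1">word1 </event>
      <event start="T1" end="T2">word2</event>
    </tier>
    <tier id="fe" speaker="SPK0" category="fe" type="a">
      <event start="T0" end="T2">Free translation.</event>
    </tier>
  </basic-body>
</basic-transcription>
"""


def timeline(root):
    """Map timeline item ids to their time offsets in seconds."""
    return {
        tli.get("id"): float(tli.get("time"))
        for tli in root.iter("tli")
        if tli.get("time") is not None
    }


def summarize_exb(xml_text):
    """Return (events per tier category, time span covered by the timeline)."""
    root = ET.fromstring(xml_text)
    times = timeline(root)
    counts = Counter()
    for tier in root.iter("tier"):
        counts[tier.get("category")] += len(tier.findall("event"))
    span = max(times.values()) - min(times.values()) if times else 0.0
    return dict(counts), span


def search_tier(xml_text, category, needle):
    """Find events in tiers of one category whose text contains `needle`."""
    root = ET.fromstring(xml_text)
    times = timeline(root)
    hits = []
    for tier in root.iter("tier"):
        if tier.get("category") != category:
            continue
        for event in tier.findall("event"):
            text = (event.text or "").strip()
            if needle in text:
                hits.append((times.get(event.get("start")),
                             times.get(event.get("end")), text))
    return hits
```

To apply the sketch to a downloaded file, read its contents and pass the text to `summarize_exb` or `search_tier`. For anything beyond simple substring matching, the search tools named in the record remain the better choice.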

Publication date:

July 17, 2025

DOI:

10.25592/uhhfdm.17676

Keyword(s):

endangered language; indigenous language; language contact; language documentation; INEL; folklore; narrative; monologue; morphological glossing; English translation; Russian translation; EXMARaLDA; ELAN; XML; ISO/TEI; Mongolic languages; annotated corpus

Alternate identifiers:

11022/0000-0007-FFB1-2

License (for files):

Creative Commons Attribution Non Commercial Share Alike 4.0 International

Versions

Version 1.0 10.25592/uhhfdm.17676

Jul 17, 2025

Cite all versions? You can cite all versions by using the DOI 10.25592/uhhfdm.17675. This DOI represents all versions, and will always resolve to the latest one.

DOI Badge

Markdown

[![DOI](https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.17676.svg)](https://doi.org/10.25592/uhhfdm.17676)

reStructuredText

.. image:: https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.17676.svg
   :target: https://doi.org/10.25592/uhhfdm.17676

HTML

<a href="https://doi.org/10.25592/uhhfdm.17676"><img src="https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.17676.svg" alt="DOI"></a>

Image URL

https://www.fdr.uni-hamburg.de/badge/DOI/10.25592/uhhfdm.17676.svg

Target URL

https://doi.org/10.25592/uhhfdm.17676
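The badge snippets above all combine the same two URLs, so they can be generated from the DOI alone. The following is a small sketch of that construction. The function name `doi_badge_snippets` is hypothetical, and the badge base URL is simply copied from the image URL shown in this record; whether the same pattern holds for other records is an assumption.

```python
def doi_badge_snippets(doi, badge_base="https://www.fdr.uni-hamburg.de/badge/DOI"):
    """Build the Markdown, reStructuredText and HTML badge snippets for a DOI.

    `badge_base` is taken from the image URL shown in this record; its
    validity for other DOIs is an assumption.
    """
    image_url = f"{badge_base}/{doi}.svg"
    target_url = f"https://doi.org/{doi}"
    return {
        "image_url": image_url,
        "target_url": target_url,
        "markdown": f"[![DOI]({image_url})]({target_url})",
        # The :target: option goes on its own indented line in reStructuredText.
        "rst": f".. image:: {image_url}\n   :target: {target_url}",
        "html": f'<a href="{target_url}"><img src="{image_url}" alt="DOI"></a>',
    }


snippets = doi_badge_snippets("10.25592/uhhfdm.17676")
```

Passing the all-versions DOI given in the Versions section (10.25592/uhhfdm.17675) yields a target URL that, per this record, always resolves to the latest version. Whether a badge image exists for that DOI is not confirmed by the source.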
