Dataset Open Access

INEL Kalmyk Corpus

Baranova, Vlada


JSON Export

{"conceptdoi":"10.25592/uhhfdm.17675","conceptrecid":"17675","created":"2025-07-18T10:55:48.819346+00:00","doi":"10.25592/uhhfdm.17676","id":17676,"links":{"badge":"https://www.fdr.uni-hamburg.de/badge/doi/10.25592/uhhfdm.17676.svg","conceptbadge":"https://www.fdr.uni-hamburg.de/badge/doi/10.25592/uhhfdm.17675.svg","conceptdoi":"http://doi.org/10.25592/uhhfdm.17675","doi":"http://doi.org/10.25592/uhhfdm.17676"},"metadata":{"access_right":"open","access_right_category":"success","alternate_identifiers":[{"identifier":"11022/0000-0007-FFB1-2","scheme":"handle"}],"communities":[{"id":"adwhh"},{"id":"inel"},{"id":"uhh"}],"contributors":[{"affiliation":"Universit\u00e4t Hamburg","name":"Lazarenko, Elena","type":"DataManager"},{"affiliation":"Universit\u00e4t Hamburg","name":"Riaposov, Aleksandr","type":"DataManager"},{"affiliation":"Universit\u00e4t Hamburg","name":"Arkhipov, Alexandre","type":"Editor"}],"creators":[{"affiliation":"Universit\u00e4t Hamburg","name":"Baranova, Vlada","orcid":"0000-0003-1642-4003"}],"description":"<p><strong>Corpus citation</strong></p>\n\n<p><em>Baranova, Vlada</em>. 2025. INEL Kalmyk Corpus. Version 1.0. Publication date 2025-07-17.&nbsp;<a href=\"https://hdl.handle.net/11022/0000-0007-FFB1-2\">https://hdl.handle.net/11022/0000-0007-FFB1-2</a>. Archived at Universit&auml;t Hamburg. In: <em>The INEL Corpora of Indigenous Northern Eurasian Languages</em>.&nbsp;<a href=\"https://hdl.handle.net/11022/0000-0007-F45A-1\">https://hdl.handle.net/11022/0000-0007-F45A-1</a>.</p>\n\n<p><strong>Corpus Description</strong></p>\n\n<p>The INEL Kalmyk Corpus has been created within the long-term INEL project (&quot;Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages&quot;), 2016&ndash;2033.</p>\n\n<p>The corpus consists of transcribed audio recordings collected in the Republic of Kalmykia between 2007 and 2018 in the Ketchenerovsky District (Derbet and Torgut dialects).</p>\n\n<p>All texts in the corpus are provided with interlinear morpheme-by-morpheme glosses and translation into English and Russian. All texts for which the audio recordings were accessible are time-aligned with them.&nbsp;</p>\n\n<p><strong>Corpus Size</strong></p>\n\n<p>The corpus contains <strong>55 </strong>texts, <strong>2,076 </strong>sentences, and <strong>19,742&nbsp;</strong>tokens. The total duration of the audio recordings is <strong>4 </strong>hours and <strong>23 </strong>minutes.</p>\n\n<p><strong>Funding</strong></p>\n\n<p>The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies&rsquo; Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies&rsquo; Programme is coordinated by the Union of the German Academies of Sciences and Humanities.</p>\n\n<p><strong>Contributions / Acknowledgements</strong></p>\n\n<p>Native speakers generously shared their knowledge of Kalmyk, making the creation of this corpus possible. Zamira Xejchieva and Galina Cabdy`rova assisted with oral transcription and the Russian translation of the audio materials.</p>\n\n<p>Part of the materials were recorded during joint expeditions of St. Petersburg University and the Institute for Linguistic Studies of the Russian Academy of Sciences in 2007&ndash;2008, under the direction of Elena Perekhvalskaya and Sergey Say.</p>\n\n<p>This corpus primarily follows the transcription system and partially adopts the glossing conventions developed by a research team led by Sergey Say, with input from other expedition participants.</p>\n\n<p><strong>Searching the corpus</strong></p>\n\n<p>The corpus can be downloaded from the ZFDM Repository using the links provided below and browsed or searched locally using the&nbsp;<a href=\"https://exmaralda.org/\">EXMARaLDA</a>&nbsp;software or, alternatively,&nbsp;<a href=\"https://archive.mpi.nl/tla/elan\">ELAN</a>.</p>\n\n<p>Online search with the Tsakorpus platform is available at&nbsp;<a href=\"https://inel.corpora.uni-hamburg.de/KalmykCorpus/search\">https://inel.corpora.uni-hamburg.de/KalmykCorpus/search</a>.</p>\n\n<p>Remote search with EXMARaLDA is also possible without downloading all the files (see&nbsp;<a href=\"https://inel.corpora.uni-hamburg.de/portal/help/en/index.php\">https://inel.corpora.uni-hamburg.de/portal/help/en/index.php</a>).</p>\n\n<p>See the user documentation&nbsp;(section 3) for details on transcription, annotation tiers and annotation tags.<br>\nFind further information and links on the Kalmyk Corpus page at the INEL Resources portal:&nbsp;<a href=\"https://inel.corpora.uni-hamburg.de/portal/corpora/kalmyk/\">https://inel.corpora.uni-hamburg.de/portal/corpora/kalmyk/</a>.</p>","doi":"10.25592/uhhfdm.17676","keywords":["endangered language","indigenous language","language contact","language documentation","INEL","folklore","narrative","monologue","morphological glossing","English translation","Russian translation","EXMARaLDA","ELAN","XML","ISO/TEI","Mongolic languages","annotated corpus"],"language":"xal","license":{"id":"CC-BY-NC-SA-4.0"},"publication_date":"2025-07-17","related_identifiers":[{"identifier":"10.25592/uhhfdm.17675","relation":"isVersionOf","scheme":"doi"}],"relations":{"version":[{"count":1,"index":0,"is_last":true,"last_child":{"pid_type":"recid","pid_value":"17676"},"parent":{"pid_type":"recid","pid_value":"17675"}}]},"resource_type":{"title":"Dataset","type":"dataset"},"title":"INEL Kalmyk Corpus","version":"1.0"},"owners":[350],"revision":7,"updated":"2025-07-22T10:59:13.650905+00:00"}
