Dataset Open Access

INEL Enets Corpus

Shluinsky, Andrey; Khanina, Olesya; Wagner-Nagy, Beáta


JSON Export

{"conceptdoi":"10.25592/uhhfdm.16181","conceptrecid":"16181","created":"2025-12-22T10:31:51.336806+00:00","doi":"10.25592/uhhfdm.18195","id":18195,"links":{"badge":"https://www.fdr.uni-hamburg.de/badge/doi/10.25592/uhhfdm.18195.svg","conceptbadge":"https://www.fdr.uni-hamburg.de/badge/doi/10.25592/uhhfdm.16181.svg","conceptdoi":"http://doi.org/10.25592/uhhfdm.16181","doi":"http://doi.org/10.25592/uhhfdm.18195"},"metadata":{"access_right":"open","access_right_category":"success","communities":[{"id":"adwhh"},{"id":"inel"},{"id":"uhh"}],"contributors":[{"affiliation":"Universit\u00e4t Hamburg","name":"Arkhipov, Alexandre","type":"Editor"},{"affiliation":"Universit\u00e4t Hamburg","name":"Wagner-Nagy, Be\u00e1ta","type":"Editor"},{"affiliation":"Universit\u00e4t Hamburg","name":"Lazarenko, Elena","type":"DataManager"},{"affiliation":"Universit\u00e4t Hamburg","name":"Riaposov, Aleksandr","type":"DataManager"},{"affiliation":"Universit\u00e4t Hamburg","name":"Lehmberg, Timm","type":"DataManager"}],"creators":[{"affiliation":"Universit\u00e4t Hamburg","name":"Shluinsky, Andrey","orcid":"0000-0002-2553-7213"},{"affiliation":"University of Helsinki","name":"Khanina, Olesya","orcid":"0000-0001-5930-4656"},{"affiliation":"Universit\u00e4t Hamburg","name":"Wagner-Nagy, Be\u00e1ta","orcid":"0000-0002-6801-1895"}],"description":"<p><strong>Corpus Citation</strong></p>\n\n<p><em>Shluinsky, Andrey; Khanina, Olesya; Wagner-Nagy, Be&aacute;ta</em>. 2025. INEL Enets Corpus. Version 1.1. Publication date 2025-12-31. <a href=\"https://hdl.handle.net/11022/0000-0008-005C-1\">https://hdl.handle.net/11022/0000-0008-005C-1</a>. Archived at Universit&auml;t Hamburg. In: The INEL corpora of indigenous Northern Eurasian languages. 
<a href=\"https://hdl.handle.net/11022/0000-0007-F45A-1\">https://hdl.handle.net/11022/0000-0007-F45A-1</a></p>\n\n<p><strong>Corpus Description</strong></p>\n\n<p>The INEL Enets corpus has been created within the long-term INEL project (&quot;Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages&quot;), 2016&ndash;2033.</p>\n\n<p>The corpus includes texts recorded between 1962&ndash;2017 in both Enets lects &ndash; Forest Enets and Tundra Enets. The sources of the corpus (see more details in the user documentation, section 2.2) are:</p>\n\n<ul>\n\t<li>Audio recordings done by Olesya Khanina, Maria Ovsjannikova, Andrey Shluinsky, Natalia Stoynova and Sergey Trubetskoy,</li>\n\t<li>Legacy audio recordings done by Vera Bettu, Nina N. Bolina, Dar`ya S. Bolina, Zoya N. Bolina, Oksana E. Dobzhanskaya, Valentin Gusev, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Larisa Leisi&ouml;, Marina Lyublinskaya, Kaur M&auml;gi, Viktor N. Pal`chin, Marina N. Pal`china, Irina P. Sorokina&dagger;, Anna Urmanchieva, Be&aacute;ta Wagner-Nagy and possibly other people,</li>\n\t<li>Published audio recordings,</li>\n\t<li>Texts published by Dar`ya S. Bolina, Yaroslav A. Gluxij&dagger; and Vasilij A. Susekov&dagger;, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Tibor Mikola&dagger;, J&aacute;nos Pusztay, Irina P. Sorokina&dagger;, Anna Urmanchieva,</li>\n\t<li>Legacy manuscript transcriptions and self-transcriptions done and/or edited by Dar`ya S. Bolina, Galina S. Bolina, Zoya N. Bolina, Valentin Gusev, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Larisa Leisi&ouml;, Marina Lyublinskaya, Vasilij F. Ly`rmin&dagger;, Anton N. Pal`chin, Viktor N. Pal`chin, Ivan I. Silkin&dagger;, Irina P. Sorokina&dagger;, Natal`ya M. 
Tere&scaron;\u010denko&dagger;, Anna Urmanchieva and possibly other people.</li>\n</ul>\n\n<p>All texts in the corpus are provided with interlinear morpheme-by-morpheme glosses and translation into English and Russian. All texts for which the audio recordings were accessible are time-aligned with them. Video recordings are also included into the corpus if available.</p>\n\n<p><br>\n<strong>New in release 1.1</strong></p>\n\n<ul>\n\t<li>Annotation of syntactic functions (tier category &quot;SyF&quot;) is now available for 55 additional texts, of which 52 are folklore and 3 <em>&ndash; </em> narrative;</li>\n\t<li>For texts originating from published and archival sources, as well as manuscripts, detailed references were added to the &quot;Citation&quot; section of the documentation and the respective field in the corpus metadata file.</li>\n</ul>\n\n<p><strong>Corpus size</strong></p>\n\n<ul>\n\t<li>Forest Enets: <strong>541</strong> texts, <strong>41,396</strong> sentences, <strong>173,380</strong>&nbsp;tokens</li>\n\t<li>Tundra Enets: <strong>137</strong> texts, <strong>12,737</strong> sentences, <strong>45,331</strong> tokens</li>\n\t<li>Total: <strong>678</strong> texts, <strong>54,133</strong> sentences, <strong>218,711</strong>&nbsp;tokens</li>\n\t<li>Total duration of audio: <strong>43 </strong>hours <strong>26 </strong>minutes</li>\n</ul>\n\n<p><strong>Funding</strong></p>\n\n<p>The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies&rsquo; Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies&rsquo; Programme is coordinated by the Union of the German Academies of Sciences and Humanities.</p>\n\n<p>Preliminary glossing work included into this corpus was supported by Endangered Languages Documentation Programme (ELDP) and by Max Planck Institute for Evolutionary Anthropology (MPI-EVA). 
See more details on financial support in the documentation&nbsp;file below, section 1.6.</p>\n\n<p><strong>Contributions/Acknowledgements</strong></p>\n\n<p>Dozens of people and many institutions contributed to the corpus (see more details in the documentation&nbsp;file below, section 1.6). We are especially grateful to:</p>\n\n<ul>\n\t<li>Enets speakers who generously shared their knowledge, especially those who spent many days working with us: Aleksandr S. Bolin&dagger;, Leonid D. Bolin&dagger;, Viktor N. Bolin, Nadezhda K. Bolina, Nina N. Bolina, Ekaterina S. Glibchenko, Gennadij A. Ivanov&dagger;, Irina P. Koshkaryova&dagger;, Valentina P. Nader, Lyudmila P. Novosyolova, Svetlana A. Roslyakova&dagger;, Ivan I. Silkin&dagger;, Nikolaj I. Silkin, Alevtina S. Silkina, Zoya A. Turutina, Tat`yana Ch. Yar,</li>\n\t<li>In particular, Zoya N. Bolina and Viktor N. Pal`chin who also collaborated in ELDP project and extensively transcribed Enets recordings,</li>\n\t<li>Natalia Stoynova, Sergey Trubetskoy and foremostly Maria Ovsjannikova who did recordings and transcriptions of Enets texts,</li>\n\t<li>Institutions and private individuals who shared legacy data: the Institute for Linguistic Studies RAS, the Taymyr House of National Arts, the Dudinka branch of GTRK &ldquo;Norilsk&rdquo;; Dar`ya S. Bolina, Oksana E. Dobzhanskaya, Valentin Gusev, Larisa Leisi&ouml;, Viktor N. Pal`chin, Irina P. Sorokina&dagger;, Anna Urmanchieva,</li>\n\t<li>Marina Lyublinskaya and Anna Urmanchieva who kindly permitted to include texts processed by them into the corpus,</li>\n\t<li>Dar`ya S. 
Bolina who consulted a lot in the process of compilation of the corpus.</li>\n</ul>\n\n<p><strong>Searching the corpus</strong></p>\n\n<p>The corpus can be downloaded from the ZFDM Repository using the links provided below and browsed or searched locally using the <a href=\"https://exmaralda.org/\">EXMARaLDA</a> software or, alternatively, <a href=\"https://archive.mpi.nl/tla/elan\">ELAN</a>.</p>\n\n<p>Online search with Tsakorpus platform is available at <a href=\"https://inel.corpora.uni-hamburg.de/EnetsCorpus/search\">https://inel.corpora.uni-hamburg.de/EnetsCorpus/search</a>.</p>\n\n<p>Remote search with EXMARaLDA is also possible without downloading all the files (see <a href=\"https://inel.corpora.uni-hamburg.de/portal/help/en/index.php#search\">https://inel.corpora.uni-hamburg.de/portal/help/en/index.php#search</a>).</p>\n\n<p>See the user documentation&nbsp;(section 3) for details on transcription, annotation tiers and annotation tags.<br>\nFind further information and links on the Enets Corpus page at the INEL Resources portal: <a href=\"https://inel.corpora.uni-hamburg.de/portal/corpora/enets/\">https://inel.corpora.uni-hamburg.de/portal/corpora/enets/</a>.</p>","doi":"10.25592/uhhfdm.18195","keywords":["Uralic","Samoyedic","Enets","Forest Enets","Tundra Enets","endangered language","language contact","language documentation","legacy data","INEL","AdWHH","text corpus","speech corpus","parallel texts","folklore","tales","narrative","dialogue","song","transcription","time-aligned","audio","video","morphological glossing","part-of-speech","borrowings","code-switching","English translation","Russian 
translation","EXMARaLDA","ELAN","XML","ISO/TEI"],"license":{"id":"CC-BY-NC-SA-4.0"},"publication_date":"2025-12-31","related_identifiers":[{"identifier":"11022/0000-0008-005C-1","relation":"isCitedBy","scheme":"handle"},{"identifier":"10.25592/uhhfdm.16181","relation":"isVersionOf","scheme":"doi"}],"relations":{"version":[{"count":2,"index":1,"is_last":true,"last_child":{"pid_type":"recid","pid_value":"18195"},"parent":{"pid_type":"recid","pid_value":"16181"}}]},"resource_type":{"title":"Dataset","type":"dataset"},"title":"INEL Enets Corpus","version":"1.1"},"owners":[350],"revision":1,"updated":"2025-12-22T10:31:51.594275+00:00"}
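
The export above is a single JSON object with record-level fields (`doi`, `conceptdoi`) and a nested `metadata` object. As a minimal sketch, the following Python reads a trimmed excerpt of that record and pulls out the fields needed to cite this version. The helper names `summarize_record` and `format_citation` are hypothetical, and the citation layout is only an illustration; it is not the repository's official citation format.

```python
import json

# Trimmed excerpt of the record above; only the fields used below are kept.
# Values are copied from the export, nothing is added.
RECORD_JSON = r"""
{
  "conceptdoi": "10.25592/uhhfdm.16181",
  "doi": "10.25592/uhhfdm.18195",
  "metadata": {
    "creators": [
      {"name": "Shluinsky, Andrey"},
      {"name": "Khanina, Olesya"},
      {"name": "Wagner-Nagy, Be\u00e1ta"}
    ],
    "license": {"id": "CC-BY-NC-SA-4.0"},
    "publication_date": "2025-12-31",
    "title": "INEL Enets Corpus",
    "version": "1.1"
  }
}
"""


def summarize_record(record: dict) -> dict:
    """Collect the citation-relevant fields of a record (hypothetical helper)."""
    meta = record["metadata"]
    return {
        "authors": "; ".join(creator["name"] for creator in meta["creators"]),
        "year": meta["publication_date"][:4],  # ISO date, so the year comes first
        "title": meta["title"],
        "version": meta["version"],
        "doi": record["doi"],                  # DOI of this specific version
        "license": meta["license"]["id"],
    }


def format_citation(summary: dict) -> str:
    """Build a one-line citation; the layout is illustrative only."""
    return (
        f"{summary['authors']}. {summary['year']}. {summary['title']}. "
        f"Version {summary['version']}. https://doi.org/{summary['doi']}"
    )


record = json.loads(RECORD_JSON)
summary = summarize_record(record)
print(format_citation(summary))
```

The version DOI (`doi`) identifies release 1.1 specifically, while `conceptdoi` refers to the corpus across all of its versions, so which one to cite depends on whether a fixed release or the corpus in general is meant.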
