Shluinsky, Andrey;
Khanina, Olesya;
Wagner-Nagy, Beáta
<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
<leader>00000nmm##2200000uu#4500</leader>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Shluinsky, Andrey</subfield>
<subfield code="u">Universität Hamburg</subfield>
<subfield code="0">(orcid)0000-0002-2553-7213</subfield>
</datafield>
<datafield tag="540" ind1=" " ind2=" ">
<subfield code="u">https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode</subfield>
<subfield code="a">Creative Commons Attribution Non Commercial Share Alike 4.0 International</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a"><p><strong>Corpus Citation</strong></p>
<p><em>Shluinsky, Andrey; Khanina, Olesya; Wagner-Nagy, Be&aacute;ta</em>. 2025. INEL Enets Corpus. Version 1.1. Publication date 2025-12-31. <a href="https://hdl.handle.net/11022/0000-0008-005C-1">https://hdl.handle.net/11022/0000-0008-005C-1</a>. Archived at Universit&auml;t Hamburg. In: The INEL corpora of indigenous Northern Eurasian languages. <a href="https://hdl.handle.net/11022/0000-0007-F45A-1">https://hdl.handle.net/11022/0000-0007-F45A-1</a></p>
<p><strong>Corpus Description</strong></p>
<p>The INEL Enets corpus has been created within the long-term INEL project (&quot;Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages&quot;), 2016&ndash;2033.</p>
<p>The corpus includes texts recorded between 1962 and 2017 in both Enets lects, Forest Enets and Tundra Enets. The sources of the corpus (for details, see the user documentation, section 2.2) are:</p>
<ul>
<li>Audio recordings made by Olesya Khanina, Maria Ovsjannikova, Andrey Shluinsky, Natalia Stoynova and Sergey Trubetskoy,</li>
<li>Legacy audio recordings made by Vera Bettu, Nina N. Bolina, Dar`ya S. Bolina, Zoya N. Bolina, Oksana E. Dobzhanskaya, Valentin Gusev, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Larisa Leisi&ouml;, Marina Lyublinskaya, Kaur M&auml;gi, Viktor N. Pal`chin, Marina N. Pal`china, Irina P. Sorokina&dagger;, Anna Urmanchieva, Be&aacute;ta Wagner-Nagy and possibly other people,</li>
<li>Published audio recordings,</li>
<li>Texts published by Dar`ya S. Bolina, Yaroslav A. Gluxij&dagger; and Vasilij A. Susekov&dagger;, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Tibor Mikola&dagger;, J&aacute;nos Pusztay, Irina P. Sorokina&dagger;, Anna Urmanchieva,</li>
<li>Legacy manuscript transcriptions and self-transcriptions made and/or edited by Dar`ya S. Bolina, Galina S. Bolina, Zoya N. Bolina, Valentin Gusev, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Larisa Leisi&ouml;, Marina Lyublinskaya, Vasilij F. Ly`rmin&dagger;, Anton N. Pal`chin, Viktor N. Pal`chin, Ivan I. Silkin&dagger;, Irina P. Sorokina&dagger;, Natal`ya M. Tere&scaron;čenko&dagger;, Anna Urmanchieva and possibly other people.</li>
</ul>
<p>All texts in the corpus are provided with interlinear morpheme-by-morpheme glosses and translations into English and Russian. All texts for which audio recordings were accessible are time-aligned with them. Video recordings are also included in the corpus where available.</p>
<p><strong>New in release 1.1</strong></p>
<ul>
<li>Annotation of syntactic functions (tier category &quot;SyF&quot;) is now available for 55 additional texts, of which 52 are folklore and 3 are narrative;</li>
<li>For texts originating from published and archival sources, as well as manuscripts, detailed references were added to the &quot;Citation&quot; section of the documentation and the respective field in the corpus metadata file.</li>
</ul>
<p><strong>Corpus size</strong></p>
<ul>
<li>Forest Enets: <strong>541</strong> texts, <strong>41,396</strong> sentences, <strong>173,380</strong>&nbsp;tokens</li>
<li>Tundra Enets: <strong>137</strong> texts, <strong>12,737</strong> sentences, <strong>45,331</strong> tokens</li>
<li>Total: <strong>678</strong> texts, <strong>54,133</strong> sentences, <strong>218,711</strong>&nbsp;tokens</li>
<li>Total duration of audio: <strong>43</strong> hours <strong>26</strong> minutes</li>
</ul>
<p><strong>Funding</strong></p>
<p>The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies&rsquo; Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies&rsquo; Programme is coordinated by the Union of the German Academies of Sciences and Humanities.</p>
<p>Preliminary glossing work included in this corpus was supported by the Endangered Languages Documentation Programme (ELDP) and by the Max Planck Institute for Evolutionary Anthropology (MPI-EVA). For more details on financial support, see the documentation&nbsp;file below, section 1.6.</p>
<p><strong>Contributions/Acknowledgements</strong></p>
<p>Dozens of people and many institutions contributed to the corpus (see more details in the documentation&nbsp;file below, section 1.6). We are especially grateful to:</p>
<ul>
<li>Enets speakers who generously shared their knowledge, especially those who spent many days working with us: Aleksandr S. Bolin&dagger;, Leonid D. Bolin&dagger;, Viktor N. Bolin, Nadezhda K. Bolina, Nina N. Bolina, Ekaterina S. Glibchenko, Gennadij A. Ivanov&dagger;, Irina P. Koshkaryova&dagger;, Valentina P. Nader, Lyudmila P. Novosyolova, Svetlana A. Roslyakova&dagger;, Ivan I. Silkin&dagger;, Nikolaj I. Silkin, Alevtina S. Silkina, Zoya A. Turutina, Tat`yana Ch. Yar,</li>
<li>In particular, Zoya N. Bolina and Viktor N. Pal`chin, who also collaborated in the ELDP project and extensively transcribed Enets recordings,</li>
<li>Natalia Stoynova, Sergey Trubetskoy and, above all, Maria Ovsjannikova, who recorded and transcribed Enets texts,</li>
<li>Institutions and private individuals who shared legacy data: the Institute for Linguistic Studies RAS, the Taymyr House of National Arts, the Dudinka branch of GTRK &ldquo;Norilsk&rdquo;; Dar`ya S. Bolina, Oksana E. Dobzhanskaya, Valentin Gusev, Larisa Leisi&ouml;, Viktor N. Pal`chin, Irina P. Sorokina&dagger;, Anna Urmanchieva,</li>
<li>Marina Lyublinskaya and Anna Urmanchieva, who kindly permitted us to include in the corpus texts that they had processed,</li>
<li>Dar`ya S. Bolina, who gave extensive advice during the compilation of the corpus.</li>
</ul>
<p><strong>Searching the corpus</strong></p>
<p>The corpus can be downloaded from the ZFDM Repository using the links provided below and browsed or searched locally using the <a href="https://exmaralda.org/">EXMARaLDA</a> software or, alternatively, <a href="https://archive.mpi.nl/tla/elan">ELAN</a>.</p>
<p>Online search with the Tsakorpus platform is available at <a href="https://inel.corpora.uni-hamburg.de/EnetsCorpus/search">https://inel.corpora.uni-hamburg.de/EnetsCorpus/search</a>.</p>
<p>Remote search with EXMARaLDA is also possible without downloading all the files (see <a href="https://inel.corpora.uni-hamburg.de/portal/help/en/index.php#search">https://inel.corpora.uni-hamburg.de/portal/help/en/index.php#search</a>).</p>
<p>See the user documentation&nbsp;(section 3) for details on transcription, annotation tiers and annotation tags.<br>
Find further information and links on the Enets Corpus page at the INEL Resources portal: <a href="https://inel.corpora.uni-hamburg.de/portal/corpora/enets/">https://inel.corpora.uni-hamburg.de/portal/corpora/enets/</a>.</p></subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
<subfield code="a">cc-by</subfield>
<subfield code="2">opendefinition.org</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">11022/0000-0008-005C-1</subfield>
<subfield code="i">isCitedBy</subfield>
<subfield code="n">handle</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">10.25592/uhhfdm.16181</subfield>
<subfield code="i">isVersionOf</subfield>
<subfield code="n">doi</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">1999513</subfield>
<subfield code="u">https://www.fdr.uni-hamburg.de/record/18195/files/enets-1.1-documentation.pdf</subfield>
<subfield code="z">md5:d1e52f222542b135aa6d40ad9f46aa65</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">110506005</subfield>
<subfield code="u">https://www.fdr.uni-hamburg.de/record/18195/files/enets-1.1-lite.zip</subfield>
<subfield code="z">md5:0fc6a0d4de91d80974845d2f93c2b42f</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">4344751116</subfield>
<subfield code="u">https://www.fdr.uni-hamburg.de/record/18195/files/enets-1.1-mp3.zip</subfield>
<subfield code="z">md5:4d9470f39ec34b467040b5391b733a33</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">13559891278</subfield>
<subfield code="u">https://www.fdr.uni-hamburg.de/record/18195/files/enets-1.1-standard.zip</subfield>
<subfield code="z">md5:2448ca59f925de2509acb583b4c079be</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">27973111395</subfield>
<subfield code="u">https://www.fdr.uni-hamburg.de/record/18195/files/enets-1.1-video.zip</subfield>
<subfield code="z">md5:89221ac52c896680696d713064e60522</subfield>
</datafield>
<datafield tag="542" ind1=" " ind2=" ">
<subfield code="l">open</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Khanina, Olesya</subfield>
<subfield code="u">University of Helsinki</subfield>
<subfield code="0">(orcid)0000-0001-5930-4656</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Wagner-Nagy, Beáta</subfield>
<subfield code="u">Universität Hamburg</subfield>
<subfield code="0">(orcid)0000-0002-6801-1895</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Arkhipov, Alexandre</subfield>
<subfield code="u">Universität Hamburg</subfield>
<subfield code="4">edt</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Wagner-Nagy, Beáta</subfield>
<subfield code="u">Universität Hamburg</subfield>
<subfield code="4">edt</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Lazarenko, Elena</subfield>
<subfield code="u">Universität Hamburg</subfield>
<subfield code="4">dtm</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Riaposov, Aleksandr</subfield>
<subfield code="u">Universität Hamburg</subfield>
<subfield code="4">dtm</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Lehmberg, Timm</subfield>
<subfield code="u">Universität Hamburg</subfield>
<subfield code="4">dtm</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">INEL Enets Corpus</subfield>
</datafield>
<datafield tag="024" ind1=" " ind2=" ">
<subfield code="a">10.25592/uhhfdm.18195</subfield>
<subfield code="2">doi</subfield>
</datafield>
<controlfield tag="005">20251222103151.0</controlfield>
<controlfield tag="001">18195</controlfield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">user-adwhh</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">user-inel</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">user-uhh</subfield>
</datafield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="c">2025-12-31</subfield>
</datafield>
<datafield tag="909" ind1="C" ind2="O">
<subfield code="o">oai:fdr.uni-hamburg.de:18195</subfield>
<subfield code="p">user-adwhh</subfield>
<subfield code="p">user-uhh</subfield>
<subfield code="p">user-inel</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Uralic</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Samoyedic</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Enets</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Forest Enets</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Tundra Enets</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">endangered language</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">language contact</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">language documentation</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">legacy data</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">INEL</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">AdWHH</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">text corpus</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">speech corpus</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">parallel texts</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">folklore</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">tales</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">narrative</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">dialogue</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">song</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">transcription</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">time-aligned</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">audio</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">video</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">morphological glossing</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">part-of-speech</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">borrowings</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">code-switching</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">English translation</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Russian translation</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">EXMARaLDA</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">ELAN</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">XML</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">ISO/TEI</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">dataset</subfield>
</datafield>
</record>