<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-3" xsi:schemaLocation="http://datacite.org/schema/kernel-3 http://schema.datacite.org/meta/kernel-3/metadata.xsd">
<identifier identifierType="DOI">10.25592/uhhfdm.18195</identifier>
<creators>
<creator>
<creatorName>Shluinsky, Andrey</creatorName>
<nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-2553-7213</nameIdentifier>
<affiliation>Universität Hamburg</affiliation>
</creator>
<creator>
<creatorName>Khanina, Olesya</creatorName>
<nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0001-5930-4656</nameIdentifier>
<affiliation>University of Helsinki</affiliation>
</creator>
<creator>
<creatorName>Wagner-Nagy, Beáta</creatorName>
<nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-6801-1895</nameIdentifier>
<affiliation>Universität Hamburg</affiliation>
</creator>
</creators>
<titles>
<title>INEL Enets Corpus</title>
</titles>
<publisher>Universität Hamburg</publisher>
<publicationYear>2025</publicationYear>
<subjects>
<subject>Uralic</subject>
<subject>Samoyedic</subject>
<subject>Enets</subject>
<subject>Forest Enets</subject>
<subject>Tundra Enets</subject>
<subject>endangered language</subject>
<subject>language contact</subject>
<subject>language documentation</subject>
<subject>legacy data</subject>
<subject>INEL</subject>
<subject>AdWHH</subject>
<subject>text corpus</subject>
<subject>speech corpus</subject>
<subject>parallel texts</subject>
<subject>folklore</subject>
<subject>tales</subject>
<subject>narrative</subject>
<subject>dialogue</subject>
<subject>song</subject>
<subject>transcription</subject>
<subject>time-aligned</subject>
<subject>audio</subject>
<subject>video</subject>
<subject>morphological glossing</subject>
<subject>part-of-speech</subject>
<subject>borrowings</subject>
<subject>code-switching</subject>
<subject>English translation</subject>
<subject>Russian translation</subject>
<subject>EXMARaLDA</subject>
<subject>ELAN</subject>
<subject>XML</subject>
<subject>ISO/TEI</subject>
</subjects>
<contributors>
<contributor contributorType="Editor">
<contributorName>Arkhipov, Alexandre</contributorName>
<affiliation>Universität Hamburg</affiliation>
</contributor>
<contributor contributorType="Editor">
<contributorName>Wagner-Nagy, Beáta</contributorName>
<affiliation>Universität Hamburg</affiliation>
</contributor>
<contributor contributorType="DataManager">
<contributorName>Lazarenko, Elena</contributorName>
<affiliation>Universität Hamburg</affiliation>
</contributor>
<contributor contributorType="DataManager">
<contributorName>Riaposov, Aleksandr</contributorName>
<affiliation>Universität Hamburg</affiliation>
</contributor>
<contributor contributorType="DataManager">
<contributorName>Lehmberg, Timm</contributorName>
<affiliation>Universität Hamburg</affiliation>
</contributor>
</contributors>
<dates>
<date dateType="Issued">2025-12-31</date>
</dates>
<resourceType resourceTypeGeneral="Dataset"/>
<alternateIdentifiers>
<alternateIdentifier alternateIdentifierType="url">https://www.fdr.uni-hamburg.de/record/18195</alternateIdentifier>
</alternateIdentifiers>
<relatedIdentifiers>
<relatedIdentifier relatedIdentifierType="Handle" relationType="IsCitedBy">11022/0000-0008-005C-1</relatedIdentifier>
<relatedIdentifier relatedIdentifierType="DOI" relationType="IsPartOf">10.25592/uhhfdm.16181</relatedIdentifier>
</relatedIdentifiers>
<version>1.1</version>
<rightsList>
<rights rightsURI="https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode">Creative Commons Attribution Non Commercial Share Alike 4.0 International</rights>
<rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
</rightsList>
<descriptions>
<description descriptionType="Abstract"><p><strong>Corpus Citation</strong></p>
<p><em>Shluinsky, Andrey; Khanina, Olesya; Wagner-Nagy, Be&aacute;ta</em>. 2025. INEL Enets Corpus. Version 1.1. Publication date 2025-12-31. <a href="https://hdl.handle.net/11022/0000-0008-005C-1">https://hdl.handle.net/11022/0000-0008-005C-1</a>. Archived at Universit&auml;t Hamburg. In: The INEL corpora of indigenous Northern Eurasian languages. <a href="https://hdl.handle.net/11022/0000-0007-F45A-1">https://hdl.handle.net/11022/0000-0007-F45A-1</a></p>
<p><strong>Corpus Description</strong></p>
<p>The INEL Enets corpus has been created within the long-term INEL project (&quot;Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages&quot;), 2016&ndash;2033.</p>
<p>The corpus includes texts recorded between 1962 and 2017 in both Enets lects &ndash; Forest Enets and Tundra Enets. The sources of the corpus (for details, see the user documentation, section 2.2) are:</p>
<ul>
<li>Audio recordings made by Olesya Khanina, Maria Ovsjannikova, Andrey Shluinsky, Natalia Stoynova and Sergey Trubetskoy,</li>
<li>Legacy audio recordings made by Vera Bettu, Nina N. Bolina, Dar`ya S. Bolina, Zoya N. Bolina, Oksana E. Dobzhanskaya, Valentin Gusev, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Larisa Leisi&ouml;, Marina Lyublinskaya, Kaur M&auml;gi, Viktor N. Pal`chin, Marina N. Pal`china, Irina P. Sorokina&dagger;, Anna Urmanchieva, Be&aacute;ta Wagner-Nagy and possibly other people,</li>
<li>Published audio recordings,</li>
<li>Texts published by Dar`ya S. Bolina, Yaroslav A. Gluxij&dagger; and Vasilij A. Susekov&dagger;, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Tibor Mikola&dagger;, J&aacute;nos Pusztay, Irina P. Sorokina&dagger;, Anna Urmanchieva,</li>
<li>Legacy manuscript transcriptions and self-transcriptions done and/or edited by Dar`ya S. Bolina, Galina S. Bolina, Zoya N. Bolina, Valentin Gusev, Eugene Helimski&dagger;, Kazimir I. Labanauskas&dagger;, Larisa Leisi&ouml;, Marina Lyublinskaya, Vasilij F. Ly`rmin&dagger;, Anton N. Pal`chin, Viktor N. Pal`chin, Ivan I. Silkin&dagger;, Irina P. Sorokina&dagger;, Natal`ya M. Tere&scaron;čenko&dagger;, Anna Urmanchieva and possibly other people.</li>
</ul>
<p>All texts in the corpus are provided with interlinear morpheme-by-morpheme glosses and translations into English and Russian. All texts for which audio recordings were accessible are time-aligned with them. Video recordings are also included in the corpus where available.</p>
<p><strong>New in release 1.1</strong></p>
<ul>
<li>Annotation of syntactic functions (tier category &quot;SyF&quot;) is now available for 55 additional texts, of which 52 are folklore and 3 are narrative texts;</li>
<li>For texts originating from published and archival sources, as well as manuscripts, detailed references were added to the &quot;Citation&quot; section of the documentation and the respective field in the corpus metadata file.</li>
</ul>
<p><strong>Corpus size</strong></p>
<ul>
<li>Forest Enets: <strong>541</strong> texts, <strong>41,396</strong> sentences, <strong>173,380</strong>&nbsp;tokens</li>
<li>Tundra Enets: <strong>137</strong> texts, <strong>12,737</strong> sentences, <strong>45,331</strong> tokens</li>
<li>Total: <strong>678</strong> texts, <strong>54,133</strong> sentences, <strong>218,711</strong>&nbsp;tokens</li>
<li>Total duration of audio: <strong>43</strong> hours <strong>26</strong> minutes</li>
</ul>
<p><strong>Funding</strong></p>
<p>The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies&rsquo; Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies&rsquo; Programme is coordinated by the Union of the German Academies of Sciences and Humanities.</p>
<p>Preliminary glossing work included in this corpus was supported by the Endangered Languages Documentation Programme (ELDP) and by the Max Planck Institute for Evolutionary Anthropology (MPI-EVA). For details on financial support, see the documentation&nbsp;file below, section 1.6.</p>
<p><strong>Contributions/Acknowledgements</strong></p>
<p>Dozens of people and many institutions contributed to the corpus (for details, see the documentation&nbsp;file below, section 1.6). We are especially grateful to:</p>
<ul>
<li>Enets speakers who generously shared their knowledge, especially those who spent many days working with us: Aleksandr S. Bolin&dagger;, Leonid D. Bolin&dagger;, Viktor N. Bolin, Nadezhda K. Bolina, Nina N. Bolina, Ekaterina S. Glibchenko, Gennadij A. Ivanov&dagger;, Irina P. Koshkaryova&dagger;, Valentina P. Nader, Lyudmila P. Novosyolova, Svetlana A. Roslyakova&dagger;, Ivan I. Silkin&dagger;, Nikolaj I. Silkin, Alevtina S. Silkina, Zoya A. Turutina, Tat`yana Ch. Yar,</li>
<li>In particular, Zoya N. Bolina and Viktor N. Pal`chin, who also collaborated in the ELDP project and extensively transcribed Enets recordings,</li>
<li>Natalia Stoynova, Sergey Trubetskoy and, above all, Maria Ovsjannikova, who made recordings and transcriptions of Enets texts,</li>
<li>Institutions and private individuals who shared legacy data: the Institute for Linguistic Studies RAS, the Taymyr House of National Arts, the Dudinka branch of GTRK &ldquo;Norilsk&rdquo;; Dar`ya S. Bolina, Oksana E. Dobzhanskaya, Valentin Gusev, Larisa Leisi&ouml;, Viktor N. Pal`chin, Irina P. Sorokina&dagger;, Anna Urmanchieva,</li>
<li>Marina Lyublinskaya and Anna Urmanchieva, who kindly permitted the inclusion in the corpus of texts they had processed,</li>
<li>Dar`ya S. Bolina, who provided extensive consultation during the compilation of the corpus.</li>
</ul>
<p><strong>Searching the corpus</strong></p>
<p>The corpus can be downloaded from the ZFDM Repository using the links provided below and browsed or searched locally using the <a href="https://exmaralda.org/">EXMARaLDA</a> software or, alternatively, <a href="https://archive.mpi.nl/tla/elan">ELAN</a>.</p>
<p>Online search with the Tsakorpus platform is available at <a href="https://inel.corpora.uni-hamburg.de/EnetsCorpus/search">https://inel.corpora.uni-hamburg.de/EnetsCorpus/search</a>.</p>
<p>Remote search with EXMARaLDA is also possible without downloading all the files (see <a href="https://inel.corpora.uni-hamburg.de/portal/help/en/index.php#search">https://inel.corpora.uni-hamburg.de/portal/help/en/index.php#search</a>).</p>
<p>See the user documentation&nbsp;(section 3) for details on transcription, annotation tiers and annotation tags.<br>
Find further information and links on the Enets Corpus page at the INEL Resources portal: <a href="https://inel.corpora.uni-hamburg.de/portal/corpora/enets/">https://inel.corpora.uni-hamburg.de/portal/corpora/enets/</a>.</p></description>
</descriptions>
</resource>