Dataset Open Access

INEL Nenets Corpus

Budzisch, Josefina; Wagner-Nagy, Beáta


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Budzisch, Josefina</subfield>
    <subfield code="u">Universität Hamburg</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution Non Commercial Share Alike 4.0 International</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;&lt;strong&gt;Corpus Citation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Budzisch, Josefina; Wagner-Nagy, Be&amp;aacute;ta. 2024. INEL Nenets Corpus. Version 1.0. Publication date 2024-12-31. &lt;/em&gt;&lt;a href="https://hdl.handle.net/11022/0000-0007-FE37-E"&gt;https://hdl.handle.net/11022/0000-0007-FE37-E&lt;/a&gt;&lt;em&gt;. Archived at Universit&amp;auml;t Hamburg. In: The INEL corpora of indigenous Northern Eurasian languages.&amp;nbsp;&lt;/em&gt;&lt;a href="https://hdl.handle.net/11022/0000-0007-F45A-1"&gt;https://hdl.handle.net/11022/0000-0007-F45A-1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Corpus Description&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The INEL Nenets corpus has been created within the long-term INEL project (&amp;quot;Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages&amp;quot;), 2016&amp;ndash;2033.&lt;/p&gt;

&lt;p&gt;The corpus includes texts recorded between 1940 and 2011 in both Nenets lects &amp;ndash; Forest Nenets and Tundra Nenets. The majority of texts in this corpus originate from published works, which are cited in the relevant sections of the metadata. In particular, the following publications were used (full bibliographic details are given in the reference section of the documentation):&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Barmich 2018&lt;/li&gt;
	&lt;li&gt;Burkova 2008&lt;/li&gt;
	&lt;li&gt;Burkova 2012&lt;/li&gt;
	&lt;li&gt;Burkova et al. 2003&lt;/li&gt;
	&lt;li&gt;Hajd&amp;uacute; 1968&lt;/li&gt;
	&lt;li&gt;Koshkareva et al. 2007&lt;/li&gt;
	&lt;li&gt;Labanauskas 2001&lt;/li&gt;
	&lt;li&gt;Logany &amp;amp; Logany 2016&lt;/li&gt;
	&lt;li&gt;Lyubinskaya 2022&lt;/li&gt;
	&lt;li&gt;Pusztay 1976&lt;/li&gt;
	&lt;li&gt;Tereshchenko 1956&lt;/li&gt;
	&lt;li&gt;Tereshchenko 1990&lt;/li&gt;
	&lt;li&gt;Turutina 2003&lt;/li&gt;
	&lt;li&gt;Yangasova 2018&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Svetlana Burkova kindly shared a collection of her Forest Nenets data, including an original sound recording (Agan dialect), transcripts and glosses as Toolbox files and Word documents (Agan and Pur dialects), as well as published texts in the Pur (Turutina 2003) and Numto (Logany &amp;amp; Logany 2016) dialects.&lt;/p&gt;

&lt;p&gt;All texts in the corpus are provided with interlinear morpheme-by-morpheme glosses and translations into English, German and Russian. An audio recording is also provided for one text.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Corpus size&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Forest Nenets: &lt;strong&gt;80&lt;/strong&gt; texts, &lt;strong&gt;3,709&lt;/strong&gt; sentences, &lt;strong&gt;23,597&lt;/strong&gt; tokens&lt;/li&gt;
	&lt;li&gt;Tundra Nenets: &lt;strong&gt;56&lt;/strong&gt; texts, &lt;strong&gt;6,545&lt;/strong&gt; sentences, &lt;strong&gt;37,681&lt;/strong&gt; tokens&lt;/li&gt;
	&lt;li&gt;Total: &lt;strong&gt;136&lt;/strong&gt; texts, &lt;strong&gt;10,254&lt;/strong&gt; sentences, &lt;strong&gt;61,278&lt;/strong&gt; tokens&lt;/li&gt;
	&lt;li&gt;Total duration of audio: &lt;strong&gt;44&lt;/strong&gt; minutes &lt;strong&gt;45&lt;/strong&gt; seconds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Funding&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies&amp;rsquo; Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies&amp;rsquo; Programme is coordinated by the Union of the German Academies of Sciences and Humanities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Searching the corpus&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The corpus can be downloaded from the ZFDM Repository using the links provided below and browsed or searched locally using the &lt;a href="https://exmaralda.org/de/"&gt;EXMARaLDA&lt;/a&gt; software or, alternatively, &lt;a href="https://archive.mpi.nl/tla/elan"&gt;ELAN&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Online search with the Tsakorpus platform is available at &lt;a href="https://inel.corpora.uni-hamburg.de/NenetsCorpus/search"&gt;https://inel.corpora.uni-hamburg.de/NenetsCorpus/search&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Remote search with EXMARaLDA is also possible without downloading all the files (see &lt;a href="https://inel.corpora.uni-hamburg.de/portal/help/en/index.php#search"&gt;https://inel.corpora.uni-hamburg.de/portal/help/en/index.php&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;See the user documentation (section 3) for details on transcription, annotation tiers and annotation tags. Find further information and links on the Nenets Corpus page at the INEL Resources portal: &lt;a href="https://inel.corpora.uni-hamburg.de/portal/corpora/nenets/"&gt;https://inel.corpora.uni-hamburg.de/portal/corpora/nenets/&lt;/a&gt;.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="a">11022/0000-0007-FE37-E</subfield>
    <subfield code="i">isCitedBy</subfield>
    <subfield code="n">handle</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="a">10.25592/uhhfdm.16517</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="n">doi</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1296074</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/16518/files/nenets-1.0-documentation.pdf</subfield>
    <subfield code="z">md5:0981dc44c598669ec643eb37981644d0</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">34648270</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/16518/files/nenets-1.0-lite.zip</subfield>
    <subfield code="z">md5:3c425493d61d8d05c1ffd9ebbc5361e7</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">215061786</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/16518/files/nenets-1.0-mp3.zip</subfield>
    <subfield code="z">md5:c3b7d76ed299e3b9d9b36c6d4fd31927</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">248132887</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/16518/files/nenets-1.0-standard.zip</subfield>
    <subfield code="z">md5:beeba88cbec55aec6db8bc6b1affd786</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Wagner-Nagy, Beáta</subfield>
    <subfield code="u">Universität Hamburg</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Wagner-Nagy, Beáta</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">edt</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Arkhipov, Alexandre</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">edt</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Lazarenko, Elena</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">dtm</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Riaposov, Aleksandr</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">dtm</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Lehmberg, Timm</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">dtm</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">INEL Nenets Corpus</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.25592/uhhfdm.16518</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <controlfield tag="005">20241219104803.0</controlfield>
  <controlfield tag="001">16518</controlfield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-adwhh</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-inel</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-uhh</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2024-12-31</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:fdr.uni-hamburg.de:16518</subfield>
    <subfield code="p">user-uhh</subfield>
    <subfield code="p">user-adwhh</subfield>
    <subfield code="p">user-inel</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Uralic</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Samoyedic</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Nenets</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Forest Nenets</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Tundra Nenets</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">endangered language</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">language contact</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">language documentation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">legacy data</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">INEL</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">AdWHH</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">text corpus</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">speech corpus</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">parallel texts</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">folklore</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">tales</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">narrative</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">elicitation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">song</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">transcription</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">time-aligned</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">audio</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">morphological glossing</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">part-of-speech</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">borrowings</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">code-switching</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">existential predication</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">locative predication</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">possessive predication</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">English translation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">German translation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Russian translation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">EXMARaLDA</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">ELAN</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">XML</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">ISO/TEI</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">yrk</subfield>
  </datafield>
</record>
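
The 856 fields in the export above pair each downloadable file's URL (subfield u) with its size in bytes (subfield s) and an MD5 checksum (subfield z). As a minimal sketch of how those values could be read and used to check a download, the Python below parses an abbreviated copy of one 856 field. The names `list_files`, `md5_matches` and `SAMPLE` are hypothetical helpers introduced for illustration, not part of the repository or of any INEL tooling.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace declared on the <record> element of the MARC21 export.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Abbreviated copy of one 856 field from the record (illustration only).
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1296074</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/16518/files/nenets-1.0-documentation.pdf</subfield>
    <subfield code="z">md5:0981dc44c598669ec643eb37981644d0</subfield>
  </datafield>
</record>"""


def list_files(marc_xml):
    """Return one dict (url, size, md5) per 856 field in a MARC21 XML string."""
    root = ET.fromstring(marc_xml)
    files = []
    for field in root.findall("marc:datafield[@tag='856']", MARC_NS):
        # Map subfield codes ("s", "u", "z") to their text content.
        sub = {
            s.get("code"): (s.text or "")
            for s in field.findall("marc:subfield", MARC_NS)
        }
        files.append({
            "url": sub.get("u", ""),
            "size": int(sub.get("s", "0")),
            # Drop the "md5:" prefix used in subfield z.
            "md5": sub.get("z", "").split(":", 1)[-1],
        })
    return files


def md5_matches(path, expected_md5, chunk_size=1 << 20):
    """Hash a local file in chunks and compare with the expected hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()


if __name__ == "__main__":
    for entry in list_files(SAMPLE):
        print(entry["url"], entry["size"], entry["md5"])
```

After downloading a file, calling `md5_matches` with the local path and the `md5` value from the matching entry would indicate whether the copy is intact. The file is read in chunks so that the larger archives do not have to fit in memory.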
