Dataset Open Access

INEL Mansi Corpus

Wagner-Nagy, Beáta; Sipőcz, Katalin


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;&lt;strong&gt;Corpus Citation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sipőcz, Katalin &amp;amp; Wagner-Nagy, Be&amp;aacute;ta. 2026. INEL Mansi Corpus. Version 1.0. Publication date 2026-06-05. &lt;a href="https://hdl.handle.net/11022/0000-0008-00F5-3"&gt;https://hdl.handle.net/11022/0000-0008-00F5-3&lt;/a&gt;. Archived at Universit&amp;auml;t Hamburg. In: &lt;em&gt;The INEL corpora of indigenous Northern Eurasian languages. &lt;/em&gt;&lt;a href="https://hdl.handle.net/11022/0000-0007-F45A-1"&gt;https://hdl.handle.net/11022/0000-0007-F45A-1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Corpus Description&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The INEL Mansi Corpus has been created as part of the long-term project INEL (&amp;ldquo;Grammatical Descriptions, Corpora, and Language Technology for Indigenous Northern Eurasian Languages&amp;rdquo;) in the context of the Academies&amp;rsquo; Programme, coordinated by the Union of the German Academies of Sciences and Humanities.&lt;/p&gt;

&lt;p&gt;Mansi is a relatively well-documented language, with numerous grammatical descriptions and an existing corpus. However, not all varieties have been represented in previously available corpora. The present corpus addresses this gap by incorporating materials from the Tavda variety, alongside a number of texts from the Western dialect group. Most of the corpus data originate from the Northern dialect group.&lt;/p&gt;

&lt;p&gt;The INEL Mansi Corpus comprises texts drawn from the following sources:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Dolovai, Dorottya and Katalin Sipőcz 1996. Szoszvai vogul sz&amp;ouml;vegek. In: M&amp;eacute;sz&amp;aacute;ros, Edit (ed.): &lt;em&gt;&amp;Uuml;nnepi k&amp;ouml;nyv Mikola Tibor tisztelet&amp;eacute;re&lt;/em&gt;: 73-75. Szeged.&lt;/li&gt;
	&lt;li&gt;K. Sal, &amp;Eacute;va 1980. Szigvai vogul mes&amp;eacute;k. &lt;em&gt;Nyelvtudom&amp;aacute;nyi K&amp;ouml;zlem&amp;eacute;nyek&lt;/em&gt; 82: 289-298.&lt;/li&gt;
	&lt;li&gt;K&amp;aacute;lm&amp;aacute;n, B&amp;eacute;la 1959. Vogulin tutkimassa. &lt;em&gt;Viritt&amp;auml;j&amp;auml;&lt;/em&gt; 63/3: 411-416.&lt;/li&gt;
	&lt;li&gt;K&amp;aacute;lm&amp;aacute;n, B&amp;eacute;la 1960. Manysi sz&amp;ouml;vegmutatv&amp;aacute;nyok. &lt;em&gt;Nyelvtudom&amp;aacute;nyi K&amp;ouml;zlem&amp;eacute;nyek&lt;/em&gt; 62: 23-32.&lt;/li&gt;
	&lt;li&gt;K&amp;aacute;lm&amp;aacute;n, B&amp;eacute;la 1976a. &lt;em&gt;Wogulische Texte mit einem Glossar&lt;/em&gt;. Budapest: Akad&amp;eacute;mia Kiad&amp;oacute;.&lt;/li&gt;
	&lt;li&gt;K&amp;aacute;lm&amp;aacute;n, B&amp;eacute;la 1976b. &lt;em&gt;Chrestomathia Vogulica&lt;/em&gt;. Budapest: Tank&amp;ouml;nyvkiad&amp;oacute;.&lt;/li&gt;
	&lt;li&gt;Kannisto, Artturi and Matti Liimola 1951. &lt;em&gt;Wogulische Volksdichtung&lt;/em&gt; gesammelt und &amp;uuml;bersetzt von Artturi Kannisto, bearbeitet und herausgegeben von Matti Liimola Volume I. &lt;em&gt;Texte mythischen Inhalts&lt;/em&gt;. [M&amp;eacute;moires de la Soci&amp;eacute;t&amp;eacute; Finno-Ougrienne 101]. Helsinki: Suomalais-Ugrilainen Seura.&lt;/li&gt;
	&lt;li&gt;Kannisto, Artturi and Matti Liimola 1955. &lt;em&gt;Wogulische Volksdichtung&lt;/em&gt; gesammelt und &amp;uuml;bersetzt von Artturi Kannisto, bearbeitet und herausgegeben von Matti Liimola Volume II. &lt;em&gt;Kriegs- und Heldensagen&lt;/em&gt;. [M&amp;eacute;moires de la Soci&amp;eacute;t&amp;eacute; Finno-Ougrienne 109]. Helsinki: Suomalais-Ugrilainen Seura.&lt;/li&gt;
	&lt;li&gt;Kannisto, Artturi and Matti Liimola 1956. &lt;em&gt;Wogulische Volksdichtung&lt;/em&gt; gesammelt und &amp;uuml;bersetzt von Artturi Kannisto, bearbeitet und herausgegeben von Matti Liimola Volume III. &lt;em&gt;M&amp;auml;rchen&lt;/em&gt;. [M&amp;eacute;moires de la Soci&amp;eacute;t&amp;eacute; Finno-Ougrienne 111]. Helsinki: Suomalais-Ugrilainen Seura.&lt;/li&gt;
	&lt;li&gt;Kannisto, Artturi and Matti Liimola 1958. &lt;em&gt;Wogulische Volksdichtung&lt;/em&gt; gesammelt und &amp;uuml;bersetzt von Artturi Kannisto, bearbeitet und herausgegeben von Matti Liimola Volume IV. &lt;em&gt;B&amp;auml;renlieder&lt;/em&gt;. [M&amp;eacute;moires de la Soci&amp;eacute;t&amp;eacute; Finno-Ougrienne 114]. Helsinki: Suomalais-Ugrilainen Seura.&lt;/li&gt;
	&lt;li&gt;Kannisto, Artturi and Matti Liimola 1963. &lt;em&gt;Wogulische Volksdichtung&lt;/em&gt; gesammelt und &amp;uuml;bersetzt von Artturi Kannisto, bearbeitet und herausgegeben von Matti Liimola Volume VI. &lt;em&gt;Schicksalslieder, Klagelieder, Kinderreime, R&amp;auml;tsel, Verschiedenes&lt;/em&gt;. [M&amp;eacute;moires de la Soci&amp;eacute;t&amp;eacute; Finno-Ougrienne 134]. Helsinki: Suomalais-Ugrilainen Seura.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Lūimā sēripos&lt;/em&gt;, a Northern Mansi language newspaper published in the Khanty-Mansi Autonomous Okrug&amp;ndash;Yugra, No. 29, 2012.&lt;/li&gt;
	&lt;li&gt;Munk&amp;aacute;csi, Bern&amp;aacute;t 1887. A vogul nyelvj&amp;aacute;r&amp;aacute;sok (Sz&amp;oacute;ragoz&amp;aacute;s &amp;eacute;s nyelvmutatv&amp;aacute;nyok). &lt;em&gt;Nyelvtudom&amp;aacute;nyi K&amp;ouml;zlem&amp;eacute;nyek&lt;/em&gt; 21: 321-455.&lt;/li&gt;
	&lt;li&gt;Munk&amp;aacute;csi, Bern&amp;aacute;t 1892. &lt;em&gt;Vogul n&amp;eacute;pk&amp;ouml;lt&amp;eacute;si gyűjtem&amp;eacute;ny&lt;/em&gt; II/1. Istenek hősi &amp;eacute;nekei, reg&amp;eacute;i &amp;eacute;s id&amp;eacute;ző ig&amp;eacute;i. Budapest: Magyar Tudom&amp;aacute;nyos Akad&amp;eacute;mia.&lt;/li&gt;
	&lt;li&gt;Munk&amp;aacute;csi, Bern&amp;aacute;t 1893. &lt;em&gt;Vogul n&amp;eacute;pk&amp;ouml;lt&amp;eacute;si gyűjtem&amp;eacute;ny&lt;/em&gt; III/1. Medve&amp;eacute;nekek. Budapest: Magyar Tudom&amp;aacute;nyos Akad&amp;eacute;mia.&lt;/li&gt;
	&lt;li&gt;Munk&amp;aacute;csi, Bern&amp;aacute;t 1896. &lt;em&gt;Vogul n&amp;eacute;pk&amp;ouml;lt&amp;eacute;si gyűjtem&amp;eacute;ny&lt;/em&gt; IV/1. &amp;Eacute;letk&amp;eacute;pek. Budapest: Magyar Tudom&amp;aacute;nyos Akad&amp;eacute;mia.&lt;/li&gt;
	&lt;li&gt;Munk&amp;aacute;csi, Bern&amp;aacute;t 1902. &lt;em&gt;Vogul n&amp;eacute;pk&amp;ouml;lt&amp;eacute;si gyűjtem&amp;eacute;ny&lt;/em&gt; I/1-2. Reg&amp;eacute;k &amp;eacute;s &amp;eacute;nekek a vil&amp;aacute;g teremt&amp;eacute;s&amp;eacute;ről : vogul sz&amp;ouml;vegek &amp;eacute;s ford&amp;iacute;t&amp;aacute;saik t&amp;aacute;rgyi &amp;eacute;s nyelvi magyar&amp;aacute;zatokkal : bevezet&amp;eacute;s&amp;uuml;l a vogulok n&amp;eacute;pk&amp;ouml;lt&amp;eacute;se &amp;eacute;s ősi hitvil&amp;aacute;ga. Budapest: Magyar Tudom&amp;aacute;nyos Akad&amp;eacute;mia.&lt;/li&gt;
	&lt;li&gt;Rombandeeva, Evdokiya 1956. &lt;em&gt;Manki latnguv: Lovin&amp;#39;tan kniga man&amp;#39;si nachal&amp;#39;nyi shkola kitit klass magys&lt;/em&gt; [Nasha rech&amp;#39;: Kniga dlia chteniia dlia 2-go klassa mansiiskoi nachal&amp;#39;noi shkoly / Our Speech: A Reader for the Second Grade of Mansi Primary Schools]. Leningrad: Uchpedgiz.&lt;/li&gt;
	&lt;li&gt;Sipőcz, Katalin 2014. A manysi evidenci&amp;aacute;lisr&amp;oacute;l. &lt;em&gt;Folia Uralica Debreceniensia&lt;/em&gt; 21: 121-140.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All texts in the corpus are provided with interlinear morpheme-by-morpheme glosses. All texts for which audio recordings are available have been time-aligned with the corresponding recordings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Corpus size&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The corpus contains 196 texts from 47 speakers, 6,179 sentences and 48,145 tokens. The total duration of the audio recordings is 1 hour 36 minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Funding&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The corpus has been produced in the context of the joint research funding of the German Federal Government and Federal States in the Academies&amp;rsquo; Programme, with funding from the Federal Ministry of Education and Research and the Free and Hanseatic City of Hamburg. The Academies&amp;rsquo; Programme is coordinated by the Union of the German Academies of Sciences and Humanities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Searching the corpus&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The corpus can be downloaded from the ZFDM Repository using the links provided below and browsed or searched locally using the &lt;a href="https://exmaralda.org/"&gt;EXMARaLDA&lt;/a&gt; software or, alternatively, &lt;a href="https://archive.mpi.nl/tla/elan"&gt;ELAN&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Online search with the Tsakorpus platform is available at &lt;a href="https://inel.corpora.uni-hamburg.de/MansiCorpus/search"&gt;https://inel.corpora.uni-hamburg.de/MansiCorpus/search&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Remote search with EXMARaLDA is also possible without downloading all the files (see &lt;a href="https://inel.corpora.uni-hamburg.de/portal/help/en/index.php"&gt;https://inel.corpora.uni-hamburg.de/portal/help/en/index.php&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;See the user documentation (section 3) for details on transcription, annotation tiers and annotation tags.&lt;br&gt;
Find further information and links on the Mansi Corpus page at the INEL Resources portal: &lt;a href="https://inel.corpora.uni-hamburg.de/portal/corpora/mansi/"&gt;https://inel.corpora.uni-hamburg.de/portal/corpora/mansi/&lt;/a&gt;.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Wagner-Nagy, Beáta</subfield>
    <subfield code="u">Universität Hamburg</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution Non Commercial Share Alike 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
  <controlfield tag="001">18715</controlfield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1051138</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/18715/files/mansi-1.0-documentation.pdf</subfield>
    <subfield code="z">md5:63473a38445166005158f577875e5c8a</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">24257130</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/18715/files/mansi-1.0-lite.zip</subfield>
    <subfield code="z">md5:546e9937795b291f81a50e96c69637ee</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">390266239</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/18715/files/mansi-1.0-mp3.zip</subfield>
    <subfield code="z">md5:77e33f69cdc3234abf96a8f0975072b4</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1149023186</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/18715/files/mansi-1.0-standard.zip</subfield>
    <subfield code="z">md5:f56840f9086b6cdfa85a986026f26303</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="a">10.25592/uhhfdm.18714</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="n">doi</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Uralic</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Mansi</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">endangered language</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">language contact</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Ob-Ugric</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">language documentation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">legacy data</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">INEL</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">AdWHH</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">text corpus</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">parallel texts</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">folklore</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">tales</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">narrative</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">dialogue</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">song</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">transcription</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">morphological glossing</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">part-of-speech</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">borrowings</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">syntactic function</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">semantic role</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">English translation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Russian translation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Hungarian translation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">German translation</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">EXMARaLDA</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">ELAN</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">XML</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">ISO/TEI</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">existential predication</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">locative predication</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">possessive predication</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.25592/uhhfdm.18715</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">11022/0000-0008-00F5-3</subfield>
    <subfield code="2">handle</subfield>
    <subfield code="q">alternateidentifier</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:fdr.uni-hamburg.de:18715</subfield>
    <subfield code="p">user-adwhh</subfield>
    <subfield code="p">user-inel</subfield>
    <subfield code="p">user-uhh</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-adwhh</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-inel</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-uhh</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">INEL Mansi Corpus</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2026-06-05</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Sipőcz, Katalin</subfield>
    <subfield code="u">University of Szeged</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Wagner-Nagy, Beáta</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">edt</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Brykina, Maria</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">edt</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Lazarenko, Elena</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">dtm</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Riaposov, Aleksandr</subfield>
    <subfield code="u">Universität Hamburg</subfield>
    <subfield code="4">dtm</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <controlfield tag="005">20260605193355.0</controlfield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">mns</subfield>
  </datafield>
</record>
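
The 856 fields in the record above each carry a file URL (subfield `u`), a size in bytes (subfield `s`) and an MD5 checksum (subfield `z`). As a minimal sketch, assuming Python 3.9+ and only the standard library, the following shows one way to read those fields from the export and check a downloaded file against its listed checksum. The function names `list_files` and `md5_matches` are hypothetical helpers chosen for illustration, not part of any repository tooling.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace declared on the <record> element of the export.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def list_files(marcxml):
    """Return url, size and md5 for every 856 field in a MARC21 XML string."""
    root = ET.fromstring(marcxml)
    files = []
    # ".//" also matches when the record is wrapped in a <collection> element.
    for field in root.findall(".//marc:datafield[@tag='856']", MARC_NS):
        sub = {
            s.get("code"): (s.text or "")
            for s in field.findall("marc:subfield", MARC_NS)
        }
        files.append({
            "url": sub.get("u", ""),
            "size": int(sub.get("s", "0")),
            # Checksums are stored as "md5:<hex digest>"; drop the prefix.
            "md5": sub.get("z", "").removeprefix("md5:"),
        })
    return files


def md5_matches(path, expected_md5, chunk_size=1 << 20):
    """Hash a local file in chunks and compare with the expected hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()
```

Reading in chunks keeps memory use flat for the larger archives; the local file path passed to `md5_matches` is whatever location the archive was saved to.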
