<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
<leader>00000nmm##2200000uu#4500</leader>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a"><p>WikiEvents is a knowledge-graph-based dataset for NLP and event-related machine learning tasks.</p>
<p>This dataset includes RDF data in JSON-LD about events between January 2020 and December 2022. It was extracted from the Wikipedia Current events portal, Wikidata, OpenStreetMap Nominatim and Falcon 2.0. The extractor is available on GitHub under <a href="https://github.com/semantic-systems/current-events-to-kg">semantic-systems/current-events-to-kg</a>.</p>
<p>The RDF data for each month is split into four graph modules:</p>
<ul>
<li>The <strong>base</strong> graph module contains events and event summaries, with references from named entities to Wikipedia articles.</li>
<li>The <strong>ohg</strong> graph module contains all one-hop graphs (ohg) around the referenced Wikidata entities.</li>
<li>The <strong>osm</strong> graph module contains spatial data from OpenStreetMap (OSM).</li>
<li>The <strong>raw</strong> graph module containing the raw HTML objects of events and article infoboxes.</li>
</ul>
<p>This repository additionally includes two JSON files with training samples used for entity linking and event-related location extraction. They were created by querying the WikiEvents dataset uploaded to this repository.</p></subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Michaelis, Lars</subfield>
</datafield>
<datafield tag="540" ind1=" " ind2=" ">
<subfield code="u">https://creativecommons.org/licenses/by-sa/4.0/legalcode</subfield>
<subfield code="a">Creative Commons Attribution Share Alike 4.0 International</subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
<subfield code="a">cc-by</subfield>
<subfield code="2">opendefinition.org</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">dataset</subfield>
</datafield>
<controlfield tag="001">11447</controlfield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="s">1970505942</subfield>
<subfield code="u">https://www.fdr.uni-hamburg.de/record/11447/files/WikiEvents_2020_2022.zip</subfield>
<subfield code="z">md5:5999b870f092922ea1c307effb727374</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="a">10.25592/uhhfdm.11446</subfield>
<subfield code="i">isVersionOf</subfield>
<subfield code="n">doi</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Knowledge Graph</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Events</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Location Extraction</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">Entity Linking</subfield>
</datafield>
<datafield tag="653" ind1=" " ind2=" ">
<subfield code="a">NLP</subfield>
</datafield>
<datafield tag="024" ind1=" " ind2=" ">
<subfield code="a">10.25592/uhhfdm.11447</subfield>
<subfield code="2">doi</subfield>
</datafield>
<datafield tag="909" ind1="C" ind2="O">
<subfield code="o">oai:fdr.uni-hamburg.de:11447</subfield>
<subfield code="p">user-uhh</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">user-uhh</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">WikiEvents Dataset from January 2020 to December 2022</subfield>
</datafield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="c">2023-02-07</subfield>
</datafield>
<datafield tag="542" ind1=" " ind2=" ">
<subfield code="l">open</subfield>
</datafield>
<controlfield tag="005">20231229151440.0</controlfield>
<datafield tag="041" ind1=" " ind2=" ">
<subfield code="a">eng</subfield>
</datafield>
</record>