Dataset Open Access

WikiEvents Dataset from January 2020 to December 2022

Michaelis, Lars


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;WikiEvents is a knowledge graph based dataset for NLP and event-related machine learning tasks.&lt;/p&gt;

&lt;p&gt;This dataset includes RDF data serialized as JSON-LD about events between January 2020 and December 2022. It was extracted from the Wikipedia Current events portal, Wikidata, OpenStreetMap Nominatim and Falcon 2.0. The extractor is available on GitHub at &lt;a href="https://github.com/semantic-systems/current-events-to-kg"&gt;semantic-systems/current-events-to-kg&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The RDF data for each month is split into four graph modules:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;The &lt;strong&gt;base&lt;/strong&gt; graph module contains events and event summaries, with references from named entities to Wikipedia articles.&lt;/li&gt;
	&lt;li&gt;The &lt;strong&gt;ohg&lt;/strong&gt; graph module contains all one-hop graphs (ohg) around the referenced Wikidata entities.&lt;/li&gt;
	&lt;li&gt;The &lt;strong&gt;osm&lt;/strong&gt; graph module contains spatial data from OpenStreetMap (OSM).&lt;/li&gt;
	&lt;li&gt;The &lt;strong&gt;raw&lt;/strong&gt; graph module contains the raw HTML objects of events and article infoboxes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This repository additionally includes two JSON files with training samples used for entity linking and event-related location extraction. They were created by querying the WikiEvents dataset uploaded to this repository.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Michaelis, Lars</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by-sa/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution Share Alike 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
  <controlfield tag="001">11447</controlfield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1970505942</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/11447/files/WikiEvents_2020_2022.zip</subfield>
    <subfield code="z">md5:5999b870f092922ea1c307effb727374</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="a">10.25592/uhhfdm.11446</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="n">doi</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Knowledge Graph</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Events</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Location Extraction</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Entity Linking</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">NLP</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.25592/uhhfdm.11447</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:fdr.uni-hamburg.de:11447</subfield>
    <subfield code="p">user-uhh</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-uhh</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">WikiEvents Dataset from January 2020 to December 2022</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2023-02-07</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <controlfield tag="005">20231229151440.0</controlfield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
</record>
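
The record above carries the download URL and an MD5 checksum in field 856, plus the title and DOI in fields 245 and 024. As a minimal sketch (not part of the repository's own tooling), these values could be read with Python's standard library. The helper names `subfield` and `md5_of` are hypothetical, and `SAMPLE` is an abbreviated copy of the export, not the full record.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace used by the MARC21 "slim" schema, as declared on <record>.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Abbreviated copy of the export above; only the fields read below are kept.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1970505942</subfield>
    <subfield code="u">https://www.fdr.uni-hamburg.de/record/11447/files/WikiEvents_2020_2022.zip</subfield>
    <subfield code="z">md5:5999b870f092922ea1c307effb727374</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.25592/uhhfdm.11447</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">WikiEvents Dataset from January 2020 to December 2022</subfield>
  </datafield>
</record>"""


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return node.text if node is not None else None


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


record = ET.fromstring(SAMPLE)
title = subfield(record, "245", "a")
doi = subfield(record, "024", "a")
file_url = subfield(record, "856", "u")
# The checksum is stored as "<algorithm>:<hex digest>".
algorithm, expected_md5 = subfield(record, "856", "z").split(":", 1)

print(title)
print(doi)
print(file_url)
print(algorithm, expected_md5)
```

After downloading the archive, comparing `md5_of("WikiEvents_2020_2022.zip")` with `expected_md5` would indicate whether the file arrived intact; the local file name here is an assumption taken from the last segment of the URL.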

Cite record as