<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-3" xsi:schemaLocation="http://datacite.org/schema/kernel-3 http://schema.datacite.org/meta/kernel-3/metadata.xsd">
<identifier identifierType="DOI">10.25592/uhhfdm.11447</identifier>
<creators>
<creator>
<creatorName>Michaelis, Lars</creatorName>
</creator>
</creators>
<titles>
<title>WikiEvents Dataset from January 2020 to December 2022</title>
</titles>
<publisher>Universität Hamburg</publisher>
<publicationYear>2023</publicationYear>
<subjects>
<subject>Knowledge Graph</subject>
<subject>Events</subject>
<subject>Location Extraction</subject>
<subject>Entity Linking</subject>
<subject>NLP</subject>
</subjects>
<dates>
<date dateType="Issued">2023-02-07</date>
</dates>
<language>en</language>
<resourceType resourceTypeGeneral="Dataset"/>
<alternateIdentifiers>
<alternateIdentifier alternateIdentifierType="url">https://www.fdr.uni-hamburg.de/record/11447</alternateIdentifier>
</alternateIdentifiers>
<relatedIdentifiers>
<relatedIdentifier relatedIdentifierType="DOI" relationType="IsPartOf">10.25592/uhhfdm.11446</relatedIdentifier>
</relatedIdentifiers>
<rightsList>
<rights rightsURI="https://creativecommons.org/licenses/by-sa/4.0/legalcode">Creative Commons Attribution Share Alike 4.0 International</rights>
<rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
</rightsList>
<descriptions>
<description descriptionType="Abstract"><p>WikiEvents is a knowledge-graph-based dataset for NLP and event-related machine learning tasks.</p>
<p>This dataset contains RDF data, serialized as JSON-LD, about events between January 2020 and December 2022. It was extracted from the Wikipedia Current events portal, Wikidata, OpenStreetMap Nominatim and Falcon 2.0. The extractor is available on GitHub under <a href="https://github.com/semantic-systems/current-events-to-kg">semantic-systems/current-events-to-kg</a>.</p>
<p>The RDF data for each month is split into four graph modules:</p>
<ul>
<li>The <strong>base</strong> graph module contains events and event summaries, with references from named entities to Wikipedia articles.</li>
<li>The <strong>ohg</strong> graph module contains all one-hop graphs (ohg) around the referenced Wikidata entities.</li>
<li>The <strong>osm</strong> graph module contains spatial data from OpenStreetMap (OSM).</li>
<li>The <strong>raw</strong> graph module containing the raw HTML objects of events and article infoboxes.</li>
</ul>
<p>This repository additionally includes two JSON files with training samples for entity linking and event-related location extraction. They were created by querying the WikiEvents dataset uploaded to this repository.</p></description>
</descriptions>
</resource>