Finnish News Corpus for Named Entity Recognition

Description

The corpus consists of 953 articles (193,742 word tokens) annotated with six named entity classes (organization, location, person, product, event, and date). The articles are extracted from the archives of Digitoday, a Finnish online technology news source. The data sets are available at https://github.com/mpsilfve/finer-data and will be available in the download service korp.csc.fi/download in Kielipankki – the Language Bank of Finland. The FiNER system and its technical documentation are available at http://urn.fi/urn:nbn:fi:lb-2018091301

Publication year

2019

Type of data

Creators

University of Helsinki - Curator, Creator

Project

Other information

Fields of science

Linguistics

Language

Finnish

Open access

Open

License

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

Keywords

Subject headings

Temporal coverage

Related to this research data