Finnish Dark Web Marketplace Corpus

Description

The resource is available for restricted use via Kielipankki, the Language Bank of Finland. Instructions on applying for access are available on the resource group page (see Documentation). This Finnish-language dataset consists of 3 104 515 messages posted between 2017 and 2020 on Torilauta, a discussion board operating on the dark web. The data were collected and submitted by the site administrator to be archived for research use, and were received by the ENNCODE project at the University of Tampere. In addition to the message title and text, each post carries the following metadata: timestamps of sending and deletion, the sender's nickname, the subject area, and the message and thread identifiers. The data were provided as a JSON Lines text file in which each line corresponds to one message and its metadata in JSON format. For data protection reasons, identifying information, including a number of individual messages, has been removed from the data.
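
As a rough illustration of the one-message-per-line layout described above, the following Python sketch reads such a stream and counts messages per thread. The field names (`id`, `thread_id`, `nickname`, `title`, `text`) and the sample records are assumptions made for the example; the actual keys in the corpus may differ and should be checked against the resource documentation.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_messages(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


def messages_per_thread(lines: Iterable[str], key: str = "thread_id") -> Counter:
    """Count messages per thread; `key` is a hypothetical field name."""
    return Counter(msg.get(key) for msg in iter_messages(lines))


# Hypothetical sample records; the real field names may differ.
SAMPLE = [
    '{"id": 1, "thread_id": 10, "nickname": "a", "title": "t", "text": "x"}',
    '',
    '{"id": 2, "thread_id": 10, "nickname": "b", "title": "", "text": "y"}',
    '{"id": 3, "thread_id": 11, "nickname": "a", "title": "u", "text": "z"}',
]

if __name__ == "__main__":
    print(messages_per_thread(SAMPLE))
```

Because both functions accept any iterable of strings, an open file handle for the licensed file can be passed in place of `SAMPLE`, so the roughly three million lines are processed one at a time rather than loaded into memory at once.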

Year of publication

2022

Type of data

Creators

Tampere University - Creator

Tuomas Harviainen - Curator

Project

Other information

Fields of science

Linguistics

Language

Finnish

Open access

Restricted access

License

CLARIN RES (Restricted) End User License 1.0

Keywords

Subject headings

Temporal coverage

Related to this research data