Uralic, Turkic, Indo-Iranian and Mongol languages; languages of Siberia and Caucasia (UHLCS)

Description

The corpus is available in Kielipankki - the Language Bank of Finland (puhti.csc.fi, access rights instructions: http://www.kielipankki.fi/access).

Location: appl/data/kielipankki/mrc-uhlcs/multilingual-language-archive/

Contains texts in Chukchi, Koryak, Kurdish, Ossete, Tajik, Avar, Lak, Tabassaran, Kalmyk, Even, Evenki, Nanay, as well as in various Uralic and Turkic languages.

Here is a list of all the languages included in alphabetical order (with information about the location subfolders):

Avar (location: north-east-caucasian-lgs/avar-andi-tsez-lgs/avar/)
Azerbaijani (location: turkic-lgs/south-west-turkic-lgs/azerbaijani/)
Balkar (location: turkic-lgs/north-west-turkic-lgs/balkar/)
Bashkir (location: turkic-lgs/north-west-turkic-lgs/bashkir/)
Chukchi (location: chukotko-kamchatkan-lgs/chukchi/)
Chuvash (location: turkic-lgs/bolgar-group/chuvash/)
Crimean-Turkish (location: turkic-lgs/north-west-turkic-lgs/crimean-turkish/)
Dvina-Karelian (location: uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/dvina-karelian/)
Eastern & Meadow Mari (location: uralic-lgs/finno-ugric-lgs/mari-lgs/eastern-mari/)
Enets (location: uralic-lgs/samoyedic-lgs/enets/)
Erzya (location: uralic-lgs/finno-ugric-lgs/mordvin-lgs/erzya/)
Even (location: tungusic-lgs/north-tungusic-lgs/even/)
Evenki (location: tungusic-lgs/north-tungusic-lgs/evenki/)
Gagauz (location: turkic-lgs/south-west-turkic-lgs/gagauz/)
Hill Mari (location: uralic-lgs/finno-ugric-lgs/mari-lgs/western-mari/)
Kalmyk (location: mongolic-lgs/east-mongolic-lgs/kalmyk/)
Kamas (location: uralic-lgs/samoyedic-lgs/kamas/)
Khakas (location: turkic-lgs/north-east-turkic-lgs/khakas/)
Kildin-Saami (location: uralic-lgs/finno-ugric-lgs/saami-lgs/kildin-saami/)
Kirghiz (location: turkic-lgs/north-west-turkic-lgs/kirghiz/)
Komi-Permyak (location: uralic-lgs/finno-ugric-lgs/permic-lgs/komi/permyak/)
Koryak (location: chukotko-kamchatkan-lgs/koryak/)
Kurdish (location: indo-european-lgs/iranian-lgs/west-iranian-lgs/kurdish/)
Lak (location: north-east-caucasian-lgs/lak-dargva-lgs/lak/)
Livonian (location: uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/livonian/)
Mansi (location: uralic-lgs/finno-ugric-lgs/ugric-lgs/mansi/)
Moksha (location: uralic-lgs/finno-ugric-lgs/mordvin-lgs/moksha/)
Nanay (location: tungusic-lgs/south-tungusic-lgs/nanay/)
Olonets-Karelian aka Livvi (location: uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/livvi/)
Ossete (location: indo-european-lgs/iranian-lgs/east-iranian-lgs/ossete/)
Selkup (location: uralic-lgs/samoyedic-lgs/selkup/)
Tabassaran (location: north-east-caucasian-lgs/lezgian-lgs/tabassaran/)
Tajik (location: indo-european-lgs/iranian-lgs/west-iranian-lgs/tajik/)
Tatar (location: turkic-lgs/north-west-turkic-lgs/tatar/)
Turkmen (location: turkic-lgs/south-west-turkic-lgs/turkmen/)
Tuvin (location: turkic-lgs/north-east-turkic-lgs/tuvin/)
Udmurt (location: uralic-lgs/finno-ugric-lgs/permic-lgs/udmurt/)
Uigur (location: turkic-lgs/south-east-turkic-lgs/uighur/)
Uzbek (location: turkic-lgs/south-east-turkic-lgs/uzbek/)
Veps (location: uralic-lgs/finno-ugric-lgs/baltic-finnic-lgs/veps/)
Yakut aka Sakha (location: turkic-lgs/north-east-turkic-lgs/yakut/)

The corpus is a part of the Multilingual Resource Collection of the UHLCS. UHLCS has many different IPR holders. Should you have any questions regarding the collection, please contact Pirkko Suihkonen (suihkonen.pirkko@gmail.com). The purpose of the resource use must be outlined in a research plan.

Publication year

2025

Type of data

Creators

User support at CSC - IT Center for Science Ltd. The Language Bank of Finland - Curator

Pirkko Suihkonen - Creator, Rights holder

Multiple publishers, check distribution rights holders in original metadata by following its persistent identifier - Publisher

Project

Other information

Fields of science

Linguistics

Language

Avar, Chukchi, Even, Evenki, Nanai, Koryak, Kurdish, Lak, Ossetian, Tabasaran, Tajik, Kalmyk

Open access

Restricted access

License

CLARIN RES (Restricted) End User License 1.0

Keywords

Subject headings

Temporal coverage


Related to this research data