Lahjoita puhetta semisupervised baseline Kaldi ASR model

Description

Lahjoita puhetta semisupervised baseline speech recognition model, built with the Kaldi toolkit. It was trained on 100 hours of supervised and approximately 1600 hours of untranscribed Finnish speech. The model is described in more detail in the paper "Lahjoita puhetta – a large-scale corpus of spoken Finnish with some benchmarks" (https://arxiv.org/abs/2203.12906). For details on the training method, see https://github.com/aalto-speech/lahjoita-puhetta-baseline-kaldi. The title and description of this software/code reflect the state of the metadata at the time it was imported to ACRIS. The most recent version of the metadata is available in the original repository.

Publication year

2022

Type of data

Authors

Department of Signal Processing and Acoustics

Tamás Grósz - Author

Zenodo - Publisher

Project

Other information

Fields of science

Computer and information sciences

Language

Open access

Open

License

Creative Commons Attribution 4.0 International (CC BY 4.0)

Keywords

Subject headings

Temporal coverage


Related to this research data