Dataset of Audiovisual Speech for AR Telepresence Studies (Speech Recordings)

Description

Dataset of speech recordings made in the anechoic chamber "Lampio" at Aalto University, with 21 participants ("P1" - "P21").
Four parts are included:
1) Conversations: Ten different scripted three-part conversations ("C1" - "C10"). Each participant is in two of them. All three parts are played by all three participants ("S1" - "S3"). See assignment_conversations.xlsx.
2) Harvard_Sets: Sets 25 and 36 of the Harvard sentence lists.
3) Sentence 1 from List 25 at five different voice levels (from "barely not whispering" to "screaming as loud as you can").
4) Native_Language: List 25 translated into the native languages of 12 of the participants (French, Finnish, Hebrew, Hindi, Spanish (Mexico), Spanish (Chile), Catalan, Latvian, Italian, Polish, Romanian, German).
Each file contains data from three receivers:
Ch 1: GRAS 40 HF 1" low-noise measurement microphone, 1.5 m away from the subject.
Ch 2: RØDE NT1 large-diaphragm condenser microphone, 2 m away from the subject.
Ch 3: DPA 4060, attached to the subject's clothes.
Calibration.wav: Calibration data for Ch 1 (recorded using a B&K 4231 calibrator, 1 kHz, 94 dB).
Accompanying video data can be obtained by personal request from nils.meyer-kahlen@aalto.fi
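The channel layout and the calibration tone described above can be used to express Ch 1 recordings as sound pressure levels. The following is a minimal sketch, not part of the dataset. It assumes that Calibration.wav holds a steady tone recorded with the same gain as Ch 1, that the 94 dB figure is dB SPL re 20 µPa, and that the samples have already been loaded as interleaved three-channel frames. All function names are hypothetical.

```python
import math

# Channel order as stated in the description (Ch 1..3 map to indices 0..2).
CHANNELS = ("GRAS 40 HF", "RØDE NT1", "DPA 4060")
CAL_LEVEL_DB_SPL = 94.0  # calibrator level from the description (assumed dB SPL)
P_REF = 20e-6            # standard reference pressure in pascals


def rms(samples):
    """Root-mean-square of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def pascal_per_unit(cal_samples):
    """Scale factor mapping Ch 1 digital units to pascals (assumes equal gain)."""
    cal_pressure_rms = P_REF * 10 ** (CAL_LEVEL_DB_SPL / 20)  # roughly 1 Pa
    return cal_pressure_rms / rms(cal_samples)


def level_db_spl(samples, scale):
    """Level of Ch 1 samples in dB SPL, given the scale from pascal_per_unit."""
    return 20 * math.log10(rms(samples) * scale / P_REF)


def split_channels(frames):
    """Split interleaved (ch1, ch2, ch3) frames into one list per channel."""
    return [list(channel) for channel in zip(*frames)]
```

Only Ch 1 has calibration data, so the scale factor should not be applied to Ch 2 or Ch 3.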

Year of publication

2025

Type of data

Creators

Aalto University

Department of Information and Communications Engineering

Anja Hofmann - Creator

Nils Meyer-Kahlen - Creator

Sebastian Schlecht - Creator

Tapio Lokki - Creator

Friedrich-Alexander-Universität Erlangen-Nürnberg - Contributor

Zenodo - Publisher

Project

Other information

Fields of science

Electrical, automation and telecommunications engineering, electronics

Language

Open access

Restricted access

License

Other

Keywords

Subject headings

Temporal coverage
