BinMov2023: Binaural Dataset for Source Position Estimation with Head Rotation and Moving Listeners

Dataset

Description

DESCRIPTION

BinMov2023: Binaural Dataset for Source Position Estimation with Head Rotation and Moving Listeners is a binaural dataset of synthetic single-source speech signals reverberated with simulated room impulse responses. The data supports experiments on sound source localization and sound distance estimation.

The dataset consists of three subsets, one per scenario:

- static: a static sound source and a static listener
- rotation: a static sound source and a static listener whose head rotates in the azimuth plane
- walking: a static sound source and a listener moving in space

Each sound file contains a unique combination of a simulated room and source and receiver positions. The walking scenario contains simulations of 2500 different rooms, whereas the static and rotation scenarios contain 5000 rooms.

REPORT AND REFERENCE

A detailed description of the dataset and the data generation process can be found in:

D. A. Krause, G. García-Barrios, A. Politis and A. Mesaros, "Binaural Sound Source Distance Estimation and Localization for a Moving Listener," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, doi: 10.1109/TASLP.2023.3346297.

The supplementary material describing the data simulation is available separately. If you use the dataset, please consider citing the above-mentioned paper.

METADATA

Each sound file has its own metadata file. The metadata is given per frame in the following format:

[nb_frame (int)], [x (float)], [y (float)], [z (float)], [a (float)], [b (float)], [c (float)], [d (float)]

where nb_frame is the frame number, {x, y, z} are the unnormalized Cartesian coordinates of the sound source, and {a, b, c, d} are the quaternion values describing the rotation of the listener's head.

LICENSE

The database is published under a custom **open non-commercial with attribution** license. It can be found in the `LICENSE.txt` file that accompanies the data.
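
The per-frame metadata format described under METADATA could be read along the following lines. This is a minimal sketch, not the authors' loader: it assumes each metadata file is plain comma-separated text with one frame per line and no header row, and the names `Frame`, `parse_metadata` and `source_distance` are hypothetical. The distance helper further assumes that {x, y, z} are expressed relative to the listener, which the description does not state. The quaternion components are kept as stored, because the description does not say which of {a, b, c, d} is the scalar part.

```python
import csv
import io
import math
from typing import List, NamedTuple


class Frame(NamedTuple):
    """One metadata row: frame number, source position, head quaternion."""
    nb_frame: int
    x: float
    y: float
    z: float
    a: float
    b: float
    c: float
    d: float


def parse_metadata(text: str) -> List[Frame]:
    """Parse rows of 'nb_frame, x, y, z, a, b, c, d' (assumed layout)."""
    frames = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        if len(row) != 8:
            raise ValueError(f"expected 8 fields, got {len(row)}: {row}")
        frames.append(Frame(int(row[0]), *(float(v) for v in row[1:])))
    return frames


def source_distance(frame: Frame) -> float:
    """Euclidean norm of (x, y, z); equals the source-listener distance
    only under the assumption that coordinates are listener-relative."""
    return math.sqrt(frame.x ** 2 + frame.y ** 2 + frame.z ** 2)


# Made-up sample rows for illustration; these are not taken from the dataset.
sample = (
    "0, 1.0, 2.0, 2.0, 1.0, 0.0, 0.0, 0.0\n"
    "1, 3.0, 0.0, 4.0, 1.0, 0.0, 0.0, 0.0\n"
)
frames = parse_metadata(sample)
distances = [source_distance(f) for f in frames]
```

For a real file, the text would be read from disk first and passed to `parse_metadata`. Before converting {a, b, c, d} to angles, check the quaternion ordering against the paper's supplementary material.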
Date made available: 1 Mar 2023
Publisher: Zenodo

Field of science, Statistics Finland

  • 113 Computer and information sciences
