Abstract
In this paper, we investigate the tasks of binaural source distance estimation (SDE) and direction-of-arrival estimation (DOAE) using motion-based cues in a scenario with a walking listener. In addition to treating both tasks as separate problems, we study two methods of solving the joint task of simultaneous source distance estimation and localization (SDEL) with a single model. Experiments are conducted for three different scenarios: a static receiver; a static receiver with a rotating head; and a freely moving listener inside a room. The study proposes rotation and translation features that include information about the receiver's motion during model training, and studies their effects on the final performance. The work includes extended simulation of three datasets containing numerous testing scenarios for sound sources, covering a wide range of DOAs and source-to-receiver distances of up to 15 m. Results are further analyzed with respect to room reverberation, walking speed, and source-to-receiver distance. The presented outcomes show large improvements in both DOA and distance estimation for a model that uses motion-based cues as compared with a static scenario. These include a decrease of 9.50° in DOA error and 1.56 m in distance error for a joint model, followed by 16.17° and 0.17 m for separate models.
Original language | English |
---|---|
Pages | 996-1011 |
Number of pages | 16 |
Journal | IEEE/ACM Transactions on Audio Speech and Language Processing |
Volume | 32 |
Early online date | 22 Dec 2023 |
DOIs | |
Status | Published - 2024 |
OKM publication type | A1 Original article in a scientific journal |
Publication Forum level
- Jufo level 3
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- Acoustics and Ultrasonics
- Computational Mathematics
- Electrical and Electronic Engineering
Datasets
- BinMov2023: Binaural Dataset for Source Position Estimation with Head Rotation and Moving Listeners
  Politis, A. (Creator), Barrios, G. G. (Creator), Krause, D. A. (Creator) & Mesaros, A. (Creator), Zenodo, 1 Mar 2023
  DOI: 10.5281/zenodo.7689063
  Type: Dataset