Abstract
This paper investigates the joint localization, detection, and tracking of sound events using a convolutional recurrent neural network (CRNN). We use a CRNN previously proposed for the localization and detection of stationary sources, and show that its recurrent layers enable the spatial tracking of moving sources when the network is trained with dynamic scenes. The tracking performance of the CRNN is compared with that of a stand-alone tracking method that combines a multi-source direction-of-arrival (DOA) estimator and a particle filter. Their respective performance is evaluated in various acoustic conditions: anechoic and reverberant scenarios, stationary and moving sources at several angular velocities, and a varying number of overlapping sources. The results show that the CRNN tracks multiple sources more consistently than the parametric method across acoustic scenarios, but at the cost of a higher localization error.
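The parametric baseline mentioned above couples a multi-source DOA estimator with a particle filter that smooths the frame-wise direction estimates over time. As a rough illustration of the tracking stage only, the sketch below runs a minimal single-source particle filter over azimuth in NumPy; the constant-angular-velocity motion model, noise levels, and Gaussian likelihood are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def wrap(a):
    """Wrap angles to [-180, 180) degrees."""
    return (a + 180.0) % 360.0 - 180.0

def particle_filter_doa(obs, n_particles=500, proc_std=2.0, meas_std=5.0):
    """Track a single source azimuth (degrees) from noisy per-frame DOA estimates.

    Each particle carries an azimuth and an angular velocity (hypothetical
    constant-velocity motion model). Prediction adds the velocity plus process
    noise, the update weights particles by a Gaussian likelihood of the wrapped
    angular error, and systematic resampling keeps the particle set healthy.
    """
    az = rng.uniform(-180, 180, n_particles)   # particle azimuths
    vel = rng.normal(0, 5, n_particles)        # angular velocities (deg/frame)
    estimates = []
    for z in obs:
        # Predict: constant angular velocity plus process noise
        az = wrap(az + vel + rng.normal(0, proc_std, n_particles))
        # Update: Gaussian likelihood of the wrapped innovation
        err = wrap(z - az)
        w = np.exp(-0.5 * (err / meas_std) ** 2)
        w /= w.sum()
        # Point estimate: circular weighted mean of the particle azimuths
        est = np.degrees(np.arctan2(
            np.sum(w * np.sin(np.radians(az))),
            np.sum(w * np.cos(np.radians(az)))))
        estimates.append(est)
        # Systematic resampling
        idx = np.searchsorted(np.cumsum(w),
                              (rng.random() + np.arange(n_particles)) / n_particles)
        az, vel = az[idx], vel[idx]
        vel += rng.normal(0, 0.5, n_particles)  # roughen velocities after resampling
    return np.array(estimates)

# Simulated source moving at 1 degree per frame, observed with noisy DOA estimates
true_az = wrap(np.arange(0.0, 100.0))
obs = wrap(true_az + rng.normal(0, 5, true_az.size))
est = particle_filter_doa(obs)
# After a short burn-in the estimate should follow the true trajectory
print(np.abs(wrap(est[20:] - true_az[20:])).mean())
```

Extending this to the multi-source case evaluated in the paper additionally requires data association between DOA estimates and tracks, which is where the stand-alone method tends to lose consistency relative to the CRNN.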
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019) |
| Pages | 20-24 |
| ISBN (Electronic) | 978-0-578-59596-2 |
| Publication status | Published - Oct 2019 |
| Publication type | A4 Article in conference proceedings |
| Event | Workshop on Detection and Classification of Acoustic Scenes and Events, New York, United States, 25 Oct 2019 → 26 Oct 2019 |
Workshop
| Workshop | Workshop on Detection and Classification of Acoustic Scenes and Events |
|---|---|
| Abbreviated title | DCASE |
| Country/Territory | United States |
| City | New York |
| Period | 25/10/19 → 26/10/19 |
Publication forum classification
- No publication forum level
Datasets
- TAU-NIGENS Spatial Sound Events 2020. Politis, A. (Creator), Adavanne, S. (Creator) & Virtanen, T. (Creator), Zenodo, 31 May 2020. Dataset.