TY - GEN
T1 - Sound Event Envelope Estimation in Polyphonic Mixtures
AU - Martín-Morató, Irene
AU - Mesaros, Annamaria
AU - Heittola, Toni
AU - Virtanen, Tuomas
AU - Cobos, Maximo
AU - Ferri, Francesc J.
PY - 2019/4/17
Y1 - 2019/4/17
N2 - Sound event detection is the task of automatically identifying the presence and temporal boundaries of sound events within an input audio stream. In recent years, deep learning methods have established themselves as the state-of-the-art approach for the task, using binary indicators during training to denote whether an event is active or inactive. However, such binary activity indicators do not fully describe the events, and estimating the envelope of the sounds could provide more precise modeling of their activity. This paper proposes to estimate the amplitude envelopes of target sound event classes in polyphonic mixtures. For training, we use the amplitude envelopes of the target sounds, calculated from mixture signals and, for comparison, from their isolated counterparts. The model is then used to perform envelope estimation and sound event detection. Results show that envelope estimation allows good modeling of the sounds' activity, with detection results comparable to the current state of the art.
KW - acoustic signal detection
KW - acoustic signal processing
KW - learning (artificial intelligence)
KW - sound event envelope estimation
KW - polyphonic mixtures
KW - sound event detection
KW - input audio stream
KW - deep learning methods
KW - binary activity indicators
KW - amplitude envelopes
KW - target sound event classes
KW - sounds activity
KW - Training
KW - Estimation
KW - Event detection
KW - Acoustics
KW - Signal to noise ratio
KW - Automobiles
KW - Dogs
KW - Envelope estimation
KW - Deep Neural Networks
U2 - 10.1109/ICASSP.2019.8682858
DO - 10.1109/ICASSP.2019.8682858
M3 - Conference contribution
SN - 978-1-4799-8132-8
T3 - IEEE International Conference on Acoustics, Speech and Signal Processing
SP - 935
EP - 939
BT - ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
PB - IEEE
T2 - IEEE International Conference on Acoustics, Speech and Signal Processing
ER -