Abstract
Methods for detecting overlapping sound events in audio often involve matrix factorization, assigning the separated components to event classes. We present a method that bypasses the supervised construction of class models. The method learns the components as a non-negative dictionary in a coupled matrix factorization problem, where the spectral representation and the class activity annotation of the audio signal share the activation matrix. In testing, the dictionaries are used to directly estimate the class activations. To handle large amounts of training data, two methods are proposed for reducing the size of the dictionary. The methods were tested on a database of real-life recordings and outperformed previous approaches by over 10%.
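The coupled factorization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Euclidean cost with standard Lee-Seung multiplicative updates (the paper may use a different divergence and weighting), and the names `coupled_nmf`, `estimate_class_activity`, `W_s`, and `W_c` are hypothetical. The spectrogram `V` and the binary class annotation `A` are stacked so that both are approximated with one shared activation matrix `H`; at test time only the spectral dictionary is used to find `H`, and the class dictionary maps it to class activations.

```python
import numpy as np


def coupled_nmf(V, A, n_components, n_iter=200, seed=0, eps=1e-9):
    """Learn spectral and class dictionaries that share one activation matrix.

    V: non-negative spectrogram, shape (n_freq, n_frames)
    A: non-negative class annotation, shape (n_classes, n_frames)
    Assumption: Euclidean cost, multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    X = np.vstack([V, A])  # stacked data: (n_freq + n_classes, n_frames)
    W = rng.random((X.shape[0], n_components)) + eps
    H = rng.random((n_components, X.shape[1])) + eps
    for _ in range(n_iter):
        # Shared activations are updated from both spectra and annotations.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    n_freq = V.shape[0]
    return W[:n_freq], W[n_freq:]  # spectral dictionary, class dictionary


def estimate_class_activity(V_test, W_s, W_c, n_iter=200, seed=0, eps=1e-9):
    """Estimate class activations for unseen audio with fixed dictionaries."""
    rng = np.random.default_rng(seed)
    H = rng.random((W_s.shape[1], V_test.shape[1])) + eps
    for _ in range(n_iter):
        # Only the spectral dictionary is available to explain test data.
        H *= (W_s.T @ V_test) / (W_s.T @ W_s @ H + eps)
    return W_c @ H  # shape (n_classes, n_frames), thresholded downstream
```

In this sketch the returned activity matrix would still need a threshold (or similar decision rule) to yield binary event detections; that step, and the dictionary-size reduction mentioned in the abstract, are left out.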
Original language | English |
---|---|
Title of host publication | 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Publisher | IEEE |
Pages | 151-155 |
Number of pages | 5 |
ISBN (Print) | 9781467369978 |
Publication status | Published - 2015 |
Publication type | A4 Article in conference proceedings |
Event | IEEE International Conference on Acoustics, Speech and Signal Processing |
Publication forum classification
- Publication forum level 1