Active Learning for Sound Event Detection

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

This article proposes an active learning system for sound event detection (SED). It aims to maximize the accuracy of a learned SED model with limited annotation effort. The proposed system analyzes an initially unlabeled audio dataset, from which it selects sound segments for manual annotation. The candidate segments are generated with a proposed change point detection approach, and the selection follows the principle of mismatch-first farthest-traversal. During the training of SED models, recordings are used as training inputs, preserving the long-term context of the annotated segments. The proposed system clearly outperforms reference methods on the two datasets used for evaluation (TUT Rare Sound 2017 and TAU Spatial Sound 2019). Training with recordings as context outperforms training with only the annotated segments, and mismatch-first farthest-traversal outperforms reference sample selection methods based on random sampling and uncertainty sampling. Remarkably, the required annotation effort can be greatly reduced on the dataset where target sound events are rare: by annotating only 2% of the training data, the system achieves SED performance similar to that obtained by annotating all of the training data.
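As an illustration of the selection principle named in the abstract, the sketch below shows one plausible mismatch-first farthest-traversal selector. It is not the paper's implementation: the function names, the fixed-size segment embeddings, the binary per-segment mismatch flags, and the use of Euclidean distance are all assumptions made for this example.

```python
# Hedged sketch of mismatch-first farthest-traversal segment selection.
# Assumptions (illustrative, not taken from the paper): each candidate segment
# has a fixed-size embedding, and a boolean "mismatch" flag marks segments
# where the current model disagrees with a reference prediction.
import numpy as np


def farthest_traversal(pool, features, selected, k):
    """Greedily pick up to k indices from `pool`, each time taking the segment
    farthest (in Euclidean distance) from everything selected so far."""
    pool = list(pool)
    chosen = []
    for _ in range(min(k, len(pool))):
        anchors = list(selected) + chosen
        if anchors:
            # Distance from every remaining candidate to its nearest anchor.
            d = np.linalg.norm(features[pool][:, None] - features[anchors][None], axis=-1)
            score = d.min(axis=1)
        else:
            score = np.ones(len(pool))  # arbitrary first pick when nothing is selected yet
        chosen.append(pool.pop(int(np.argmax(score))))
    return chosen


def mismatch_first_farthest_traversal(features, mismatched, budget, labeled=()):
    """Select `budget` segments for annotation: mismatched candidates are
    exhausted first, and each group is ordered by farthest-first traversal
    so the selected segments stay diverse."""
    labeled = list(labeled)
    pool_hi = [i for i in range(len(features)) if mismatched[i] and i not in labeled]
    pool_lo = [i for i in range(len(features)) if not mismatched[i] and i not in labeled]
    picks = farthest_traversal(pool_hi, features, labeled, budget)
    if len(picks) < budget:
        picks += farthest_traversal(pool_lo, features, labeled + picks, budget - len(picks))
    return picks


# Illustrative usage with random data.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 16))           # 200 candidate segments, 16-dim embeddings
mismatch_flags = rng.random(200) < 0.1       # e.g. model vs. reference disagreement
print(mismatch_first_farthest_traversal(feats, mismatch_flags, budget=5))
```

In this reading of the principle, disagreement decides which pool is annotated first, while farthest-first traversal within each pool spreads the annotation budget across dissimilar segments instead of clustering it on near-duplicates.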

Original language: English
Pages (from-to): 2895-2905
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Volume: 28
DOIs
Publication status: Published - 2020
Publication type: A1 Journal article-refereed

Keywords

  • Active learning
  • change point detection
  • mismatch-first farthest-traversal
  • sound event detection
  • weakly supervised learning

Publication forum classification

  • Publication forum level 3

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
