Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge

Annamaria Mesaros, Toni Heittola, Emmanouil Benetos, Peter Foster, Mathieu Lagrange, Tuomas Virtanen, Mark D. Plumbley

    Research output: Contribution to journal › Article › Scientific › peer-review

    Abstract

    Public evaluation campaigns and datasets promote active development in target research areas, allowing direct comparison of algorithms. The second edition of the challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2016) has offered such an opportunity for development of state-of-the-art methods, and succeeded in drawing together a large number of participants from academic and industrial backgrounds. In this paper, we report on the tasks and outcomes of the DCASE 2016 challenge. The challenge comprised four tasks: acoustic scene classification, sound event detection in synthetic audio, sound event detection in real-life audio, and domestic audio tagging. We present in detail each task and analyse the submitted systems in terms of design and performance. We observe the emergence of deep learning as the most popular classification method, replacing the traditional approaches based on Gaussian mixture models and support vector machines. By contrast, feature representations have not changed substantially throughout the years, as mel frequency-based representations predominate in all tasks. The datasets created for and used in DCASE 2016 are publicly available and are a valuable resource for further research.

    Original language: English
    Pages (from-to): 379-393
    Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
    Volume: 26
    Issue number: 2
    Early online date: 28 Nov 2017
    Publication status: Published - Feb 2018
    Publication type: A1 Journal article-refereed

    Keywords

    • Acoustic scene classification
    • Acoustics
    • Audio datasets
    • Event detection
    • Hidden Markov models
    • Pattern recognition
    • Sound event detection
    • Speech
    • Speech processing
    • Tagging

    Publication forum classification

    • Publication forum level 2

    ASJC Scopus subject areas

    • Signal Processing
    • Media Technology
    • Instrumentation
    • Acoustics and Ultrasonics
    • Linguistics and Language
    • Electrical and Electronic Engineering
    • Speech and Hearing
