Exemplar-based speech enhancement for deep neural network based automatic speech recognition

Deepak Baby, Jort F. Gemmeke, Tuomas Virtanen, Hugo Van Hamme

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

    16 Citations (Scopus)

    Abstract

    Deep neural network (DNN) based acoustic modelling has been successfully used for a variety of automatic speech recognition (ASR) tasks, thanks to its ability to learn higher-level information using multiple hidden layers. This paper investigates the recently proposed exemplar-based speech enhancement technique using coupled dictionaries as a pre-processing stage for DNN-based systems. In this setting, the noisy speech is decomposed as a weighted sum of atoms in an input dictionary containing exemplars sampled from a domain of choice, and the resulting weights are applied to a coupled output dictionary containing exemplars sampled in the short-time Fourier transform (STFT) domain to directly obtain the speech and noise estimates for speech enhancement. In this work, settings using input dictionaries of exemplars sampled from the STFT, Mel-integrated magnitude STFT and modulation envelope spectra are evaluated. Experiments performed on the AURORA-4 database revealed that these pre-processing stages can improve the performance of DNN-HMM-based ASR systems with both clean and multi-condition training.
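The coupled-dictionary idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the cost function or solver, so the KL-divergence multiplicative updates, the single-frame exemplars, the Wiener-like mask, and all function and variable names (`nmf_activations`, `enhance`, `D_in_speech`, and so on) are assumptions made for illustration.

```python
import numpy as np


def nmf_activations(V, D, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative weights X with V ~= D @ X, keeping the dictionary D fixed.

    Assumption: KL-divergence multiplicative updates, a common choice for
    exemplar-based NMF; the abstract does not specify the solver.
    """
    rng = np.random.default_rng(seed)
    X = rng.random((D.shape[1], V.shape[1])) + eps
    col_sums = D.sum(axis=0)[:, None] + eps  # D^T 1, one value per atom
    for _ in range(n_iter):
        X *= (D.T @ (V / (D @ X + eps))) / col_sums
    return X


def enhance(V_in, Y_mag, D_in_speech, D_in_noise,
            D_out_speech, D_out_noise, n_iter=200, eps=1e-12):
    """Enhance a noisy magnitude STFT using coupled input/output dictionaries.

    V_in  : noisy features in the input domain (e.g. Mel), shape (F_in, T)
    Y_mag : noisy magnitude STFT, shape (F_out, T)
    D_in_*  : input-domain exemplars, one atom per column
    D_out_* : the same exemplars sampled in the STFT domain (coupled atoms)
    """
    # 1) Decompose the noisy input as a weighted sum of speech and noise atoms.
    D_in = np.hstack([D_in_speech, D_in_noise])
    X = nmf_activations(V_in, D_in, n_iter=n_iter, eps=eps)

    # 2) Apply the same weights to the coupled STFT-domain dictionaries.
    n_speech = D_in_speech.shape[1]
    S_hat = D_out_speech @ X[:n_speech]
    N_hat = D_out_noise @ X[n_speech:]

    # 3) Assumed Wiener-like mask built from the speech and noise estimates.
    mask = S_hat / (S_hat + N_hat + eps)
    return mask * Y_mag, S_hat, N_hat
```

In this sketch the input dictionary may live in any non-negative feature domain, while the output dictionary is always in the STFT domain, so the enhanced magnitude spectrum can be combined with the noisy phase for resynthesis before recognition.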

    Original language: English
    Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
    Publisher: IEEE
    Pages: 4485-4489
    Number of pages: 5
    ISBN (Print): 9781467369978
    DOIs
    Publication status: Published - 4 Aug 2015
    Publication type: A4 Article in conference proceedings
    Event: IEEE International Conference on Acoustics, Speech and Signal Processing
    Duration: 1 Jan 1900 → 1 Jan 2000

    Conference

    Conference: IEEE International Conference on Acoustics, Speech and Signal Processing
    Period: 1/01/00 → 1/01/00

    Keywords

    • coupled dictionaries
    • deep neural networks
    • modulation envelope
    • non-negative matrix factorisation
    • speech enhancement

    Publication forum classification

    • Publication forum level 1

    ASJC Scopus subject areas

    • Signal Processing
    • Software
    • Electrical and Electronic Engineering
