Speech Detection on Broadcast Audio

Unal Zubari, Ezgi Can Ozan, Banu Oskay Acar, Tolga Ciloglu, Ersin Esen, Tugrul K. Ates, Duygu Oskay Onur

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

3 Citations (Scopus)

Abstract

Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study works on broadcast audio as a pre-processor module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, audio data is segmented into homogeneous regions in an unsupervised manner. In the second, an ACTIVITY/NON-ACTIVITY decision is made for each region. In the third, ACTIVITY regions are classified as Speech/Non-speech via Gaussian Mixture Model (GMM) based classification. The GMMs are trained using a novel feature, Spectral Flow Direction (SFD), and an improved multi-band harmonicity feature, in addition to the widely used Mel Frequency Cepstral Coefficients (MFCCs).
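
The final step of the abstract, GMM-based Speech/Non-speech classification of a region, can be illustrated with a minimal sketch. It assumes diagonal-covariance GMMs and a decision by average per-frame log-likelihood, neither of which the abstract specifies. The function names, the two-dimensional feature vectors, and all model parameters are hypothetical placeholders, not the paper's SFD, harmonicity, or MFCC features or its trained models.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_region(frames, speech_gmm, nonspeech_gmm):
    """Label a region by comparing average per-frame log-likelihoods."""
    speech = sum(gmm_log_likelihood(f, speech_gmm) for f in frames) / len(frames)
    other = sum(gmm_log_likelihood(f, nonspeech_gmm) for f in frames) / len(frames)
    return "Speech" if speech > other else "Non-speech"


# Hypothetical toy models: illustrative values, not trained parameters.
SPEECH_GMM = [(0.6, [1.0, 0.5], [0.5, 0.5]), (0.4, [2.0, 1.5], [0.5, 0.5])]
NONSPEECH_GMM = [(1.0, [-1.0, -1.0], [1.0, 1.0])]

# One ACTIVITY region represented as a list of per-frame feature vectors.
region = [[1.2, 0.4], [1.8, 1.6], [0.9, 0.7]]
label = classify_region(region, SPEECH_GMM, NONSPEECH_GMM)
```

In a real system the frames would come from the segmentation and activity stages, and the two models would be trained on labelled speech and non-speech data; averaging over frames is one simple way to turn frame-level scores into a region-level decision.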

Original language: English
Title of host publication: 18th European Signal Processing Conference (EUSIPCO-2010)
Editors: B Kleijn, J Larsen
Place of Publication: KESSARIANI
Publisher: EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP
Pages: 85-89
Number of pages: 5
Publication status: Published - 2010
Externally published: Yes
Publication type: A4 Article in conference proceedings
Event: 18th European Signal Processing Conference (EUSIPCO) - Aalborg, Denmark
Duration: 23 Aug 2010 to 27 Aug 2010

Publication series

Name: European Signal Processing Conference
Publisher: EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP
Volume: 18
ISSN (Print): 2076-1465

Conference

Conference: 18th European Signal Processing Conference (EUSIPCO)
Country/Territory: Denmark
Period: 23/08/10 to 27/08/10

Keywords

  • CLASSIFICATION
  • RETRIEVAL
  • MUSIC
