Reference Channel Selection by Multi-Channel Masking for End-to-End Multi-Channel Speech Enhancement

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Abstract

In end-to-end multi-channel speech enhancement, the traditional approach of designating one microphone signal as the reference for processing may not always yield optimal results. This limitation is particularly evident in scenarios with large distributed microphone arrays, where speaker-to-microphone distances vary, and with compact, highly directional microphone arrays, where speaker or microphone positions change over time. Current mask-based methods often fix the reference channel during training, which makes it impossible to adaptively select the reference channel for optimal performance. To address this problem, we introduce an adaptive approach for selecting the optimal reference channel. Our method leverages a multi-channel masking-based scheme, where multiple masked signals are combined to generate a single-channel output signal. This enhanced signal is then used for loss calculation, while the reference clean speech is adjusted based on the highest scale-invariant signal-to-distortion ratio (SI-SDR). Experimental results on the Spear challenge simulated dataset D4 demonstrate the superiority of our proposed method over the conventional approach of using a fixed reference channel with single-channel masking.
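The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`si_sdr`, `combine_masked_channels`, `select_reference`) are hypothetical, the masks are applied sample-wise in the time domain purely for brevity (the abstract does not state the masking domain), and the SI-SDR formula used is the standard projection-based definition.

```python
import numpy as np


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D signals (standard definition)."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to find the optimal scaling.
    alpha = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = alpha * target
    residual = estimate - projection
    ratio = (np.dot(projection, projection) + eps) / (np.dot(residual, residual) + eps)
    return 10.0 * np.log10(ratio)


def combine_masked_channels(noisy, masks):
    """Apply one mask per channel and sum to a single-channel output.

    Assumption: noisy and masks both have shape (channels, samples).
    """
    return (masks * noisy).sum(axis=0)


def select_reference(enhanced, clean):
    """Pick the clean channel with the highest SI-SDR against the enhanced signal.

    Assumption: clean has shape (channels, samples), enhanced has shape (samples,).
    """
    scores = np.array([si_sdr(enhanced, channel) for channel in clean])
    return int(np.argmax(scores)), scores
```

In a training loop built along these lines, the loss would then be computed against `clean[index]` for the returned index, so the target follows the best-matching channel instead of a fixed one.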

Original language: English
Title of host publication: 2024 32nd European Signal Processing Conference (EUSIPCO)
Publisher: IEEE
Pages: 241-245
Number of pages: 5
ISBN (Electronic): 9789464593617
Publication status: Published - 2024
Publication type: A4 Article in conference proceedings
Event: European Signal Processing Conference - Lyon, France
Duration: 26 Aug 2024 → 30 Aug 2024

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491

Conference

Conference: European Signal Processing Conference
Country/Territory: France
City: Lyon
Period: 26/08/24 → 30/08/24

Keywords

  • end-to-end multi-channel speech enhancement
  • multi-channel masking
  • reference channel selection

Publication forum classification

  • Publication forum level 1

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
