Time Difference of Arrival Estimation of Multiple Simultaneous Speakers Using Deep Clustering Neural Networks

  • Mikko Parviainen
  • Pasi Pertilä

Research output: Conference article (scientific, peer-reviewed)

Abstract

A novel approach to multiple acoustic source localization is presented that provides spatial information about concurrently active speakers from a mixture signal captured by a microphone array. The proposed method first separates the observed array mixture signal into single-speaker array signals using deep clustering (DC), a deep neural network (DNN) based method that maps source signal information into an embedding space in which a clustering algorithm can then be used to separate the sources. Spatial information in terms of time difference of arrival (TDoA) can then be extracted from each separated signal. This approach is novel for TDoA estimation of multiple sources, since the state-of-the-art method first localizes multiple sources and then performs the separation. The inherent advantage of the proposed approach is that there is no need for data association between the measurements and the sources. Results with data from an actual room show that the proposed approach outperforms the current state of the art in extracting spatial information from a mixture signal of two concurrent speakers.
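
The abstract does not say which estimator is used to extract the TDoA from each separated signal. Purely as an illustration of that final step, the sketch below applies GCC-PHAT (generalized cross-correlation with phase transform), a widely used TDoA estimator for a microphone pair, to two channels of one already-separated speaker. The function name `gcc_phat_tdoa`, its parameters, and the choice of GCC-PHAT itself are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def gcc_phat_tdoa(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1 in seconds.

    Hypothetical helper for illustration: a positive result means
    the signal reaches microphone 2 later than microphone 1.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-spectrum with PHAT weighting: keep only the phase.
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Optionally restrict the search to physically plausible delays.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs

# Usage: a synthetic "separated" source that reaches mic 2 five samples late.
rng = np.random.default_rng(0)
fs = 16000
mic1 = rng.standard_normal(4096)
mic2 = np.concatenate((np.zeros(5), mic1[:-5]))
tdoa = gcc_phat_tdoa(mic1, mic2, fs, max_tau=0.001)
print(f"estimated TDoA: {tdoa * 1e3:.4f} ms")
```

Because each separated signal is assumed to contain a single speaker, one correlation peak per microphone pair is enough, which is why no measurement-to-source association step is needed in this setting.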
Original language: English
Title of host publication: IEEE MMSP 2021 - 23rd Workshop on Multimedia Signal Processing
Publisher: IEEE
Number of pages: 6
ISBN (electronic): 978-1-6654-3288-7
Status: Published - 2022
OKM publication type: A4 Article in conference proceedings
Event: IEEE International Workshop on Multimedia Signal Processing - Tampere, Finland
Duration: 6 Oct 2021 to 8 Oct 2021
https://attend.ieee.org/mmsp-2021/

Publication series

Name: IEEE International Workshop on Multimedia Signal Processing
ISSN (electronic): 2473-3628

Conference

Conference: IEEE International Workshop on Multimedia Signal Processing
Abbreviated title: IEEE MMSP 2021
Country/Territory: Finland
City: Tampere
Period: 6/10/21 to 8/10/21

Publication Forum level

  • Jufo level 1
