A curated dataset of urban scenes for audio-visual scene analysis

Research output: Conference contribution, scientific, peer-reviewed

Abstract

This paper introduces a curated dataset of urban scenes for audio-visual scene analysis, consisting of carefully selected and recorded material. The data was recorded in multiple European cities, using the same equipment, in multiple locations for each scene, and is openly available. We also present a case study for audio-visual scene recognition and show that joint modeling of the audio and visual modalities brings a significant performance gain compared to state-of-the-art uni-modal systems. Our approach obtained an accuracy of 84.8%, compared to 75.8% for the audio-only and 68.4% for the video-only equivalent systems.

Original language: English
Title: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: IEEE
Pages: 626-630
Number of pages: 5
ISBN (electronic): 978-1-7281-7605-5
Status: Published - 2021
OKM publication type: A4 Article in a conference publication
Event: IEEE International Conference on Acoustics, Speech and Signal Processing - Metro Toronto Convention Centre, Toronto, Canada
Duration: 6 June 2021 to 11 June 2021
https://2021.ieeeicassp.org

Publication series

Name: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
ISSN (print): 1520-6149

Conference

Conference: IEEE International Conference on Acoustics, Speech and Signal Processing
Country/Territory: Canada
City: Toronto
Period: 6/06/21 to 11/06/21

Publication Forum level

  • Jufo level 1

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
