Neural network-based acoustic vehicle counting

Slobodan Djukanović, Yash Patel, Jiri Matas, T. Virtanen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review


This paper addresses acoustic vehicle counting using one-channel audio. We predict the pass-by instants of vehicles from local minima of the clipped vehicle-to-microphone distance. This distance is predicted from audio using a two-stage (coarse-fine) regression, with both stages realised via neural networks (NNs). Experiments show that the NN-based distance regression far outperforms the previously proposed support vector regression. The 95% confidence interval for the mean of vehicle counting error is within [0.28%, −0.55%]. Besides the minima-based counting, we propose a deep learning counting approach that operates on the predicted distance without detecting local minima. Although outperformed in accuracy by the former approach, deep counting has a significant advantage in that it does not depend on minima detection parameters. Results also show that removing low frequencies from the features improves the counting performance.
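The minima-based counting idea in the abstract can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' implementation: the distance curve is synthetic instead of being predicted by a neural network, and the function names, the `threshold` and `min_gap` parameters, and all numeric values are hypothetical stand-ins for the minima detection parameters the abstract mentions.

```python
# Toy illustration only: the distance below is synthetic, not an NN prediction.
# All names, parameters, and values are hypothetical.

def clipped_distance(times, pass_instants, clip):
    """Distance to the nearest pass-by instant, clipped at `clip`."""
    return [min(min(abs(t - p) for p in pass_instants), clip) for t in times]


def find_pass_by_indices(distance, threshold, min_gap):
    """Return indices of local minima below `threshold`,
    keeping only minima at least `min_gap` samples apart."""
    picks = []
    for i in range(1, len(distance) - 1):
        # A valley: not higher than the left neighbour, strictly lower than the right.
        is_valley = distance[i] <= distance[i - 1] and distance[i] < distance[i + 1]
        far_enough = not picks or i - picks[-1] >= min_gap
        if is_valley and distance[i] < threshold and far_enough:
            picks.append(i)
    return picks


times = [i / 10 for i in range(301)]   # 30 s sampled every 0.1 s
true_instants = [5.0, 12.0, 20.5]      # hypothetical pass-by instants
distance = clipped_distance(times, true_instants, clip=2.0)

picks = find_pass_by_indices(distance, threshold=0.5, min_gap=10)
detected = [times[i] for i in picks]   # estimated pass-by instants
print(len(picks), detected)            # vehicle count and instants
```

The vehicle count is the number of detected minima. The result depends on `threshold` and `min_gap`, which is the kind of parameter dependence the deep counting alternative is said to avoid.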
Original language: English
Title of host publication: 2021 29th European Signal Processing Conference (EUSIPCO)
Number of pages: 5
ISBN (Electronic): 978-9-0827-9706-0
Publication status: Published - 2021
Publication type: A4 Article in conference proceedings
Event: European Signal Processing Conference - Dublin, Ireland
Duration: 23 Aug 2021 → 27 Aug 2021

Publication series

Name: European Signal Processing Conference
ISSN (Electronic): 2076-1465


Conference: European Signal Processing Conference
Abbreviated title: EUSIPCO 2021


  • Support vector machines
  • Deep learning
  • Europe
  • Artificial neural networks
  • Signal processing
  • Acoustics
  • Vehicle counting
  • log-mel spectrogram
  • neural network
  • peak detection
  • deep learning

Publication forum classification

  • Publication forum level 1

