
Pre-trained weights for the baseline DNN system of DCASE 2021 automated audio captioning task

  • Konstantinos Drossos (Creator)
  • Samuel Lipping (Creator)
  • Tuomas Virtanen (Creator)

Dataset

Description

This repository contains the pre-trained weights for the baseline deep neural network (DNN) used in the baseline system of the automated audio captioning task at the DCASE 2021 Challenge. The pre-trained weights can be used with the baseline DNN to reproduce the reported results on the evaluation split (the development-testing set, in DCASE terminology) of the Clotho dataset.

The description of the automated audio captioning task and the reported results are available on the webpage of the task: http://dcase.community/challenge2021/task-automatic-audio-captioning

The Clotho dataset can be found at: https://zenodo.org/record/3490684

The GitHub repositories for audio captioning can be found at: https://github.com/audio-captioning

If you use the baseline system, please consider citing the Clotho paper: K. Drossos, S. Lipping, and T. Virtanen, "Clotho: An Audio Captioning Dataset," to be presented in the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 4-8, 2020, available online at: https://arxiv.org/abs/1910.09387
Date made available: 28 May 2021
Publisher: Zenodo

Field of science, Statistics Finland

  • 113 Computer and information sciences
  • Clotho: an Audio Captioning Dataset

    Drossos, K., Lipping, S. & Virtanen, T., 2020, IEEE 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). IEEE, p. 736-740, 5 p. (Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing).

    Research output: Conference article, scientific, peer-reviewed

    Open access
    324 Citations (Scopus)
