Clotho dataset

Dataset

Description

Clotho is a novel audio captioning dataset consisting of 4981 audio samples, each with five captions (24 905 captions in total). The audio samples are 15 to 30 s long, and the captions are 8 to 20 words long.
Date made available: 15 Oct 2019
Publisher: Tampere University of Technology
Date of data production: 2019 -
  • Clotho: an Audio Captioning Dataset

    Drossos, K., Lipping, S. & Virtanen, T., 2020, IEEE 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). IEEE, p. 736-740, 5 p. (Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing).

    Research output: Conference contribution, Scientific, peer-reviewed

    Open access
    4 Citations (Scopus)
  • Crowdsourcing a Dataset of Audio Captions

    Lipping, S., Drossos, K. & Virtanen, T., 26 Oct 2019, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019).

    Research output: Conference contribution, Scientific, peer-reviewed

    Open access
