Speech from people with Parkinson's disease (PD) is likely to be degraded in phonation, articulation, and prosody. To describe articulation deficits comprehensively, we investigated 1) universal phonological features that model the manner and place of articulation, also known as speech attributes, and 2) glottal features that capture phonation characteristics. These were supplemented by, and compared with, prosodic features from a popular compact feature set and standard MFCCs. The temporal characteristics of these features were modeled with convolutional neural networks. Besides the features, we were also interested in which speech tasks are best suited for collecting data for automatic PD speech assessment: sustained vowels, text reading, or spontaneous monologue. For this, we used a recently collected Finnish PD corpus (PDSTU) as well as a Spanish database (PC-GITA). The experiments were formulated as regression problems against expert ratings of PD-related symptoms: ratings of speech intelligibility, voice impairment, and overall severity of communication disorder on PDSTU, and the Unified Parkinson's Disease Rating Scale (UPDRS) on PC-GITA. The experimental results show that 1) speech attribute features are good indicators of the severity of pathologies in parkinsonian speech; 2) combining phonation features with articulatory features improves PD assessment performance, but requires high-quality recordings to be applicable; 3) read speech leads to more accurate automatic ratings than sustained vowels, but not when the amount of read speech is limited to match the duration of the sustained vowels; and 4) jointly using data from several speech tasks can further improve automatic PD assessment performance.
|IEEE/ACM Transactions on Audio Speech and Language Processing
|Early online date
|17 Nov 2022
|DOI - permanent links
|Published - 2023
|A1 Original article in a scientific journal
- Publication Forum (Jufo) level 3