Source Separation and Reconstruction of Spatial Audio Using Spectrogram Factorization

Joonas Nikunen, Tuomas Virtanen

    Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › peer-review

    1 Citation (Scopus)


    This chapter introduces methods for factorizing the spectrogram of multichannel audio into repetitive spectral objects and applies the introduced models to the analysis of spatial audio and to the modification of spatial sound through source separation. The purpose of decomposing an audio spectrogram using spectral templates is to learn the underlying structures (audio objects) from the observed data. The chapter discusses two main scenarios: parameterization of multichannel surround sound and parameterization of microphone array signals. It explains the principles of source separation by time-frequency filtering using separation masks constructed from the spectrogram models. The chapter introduces a spatial covariance matrix model based on the directions of arrival of sound events and spectral templates, and discusses its relationship to conventional spatial audio signal processing. Source separation using spectrogram factorization models is achieved via time-frequency filtering of the original observation short-time Fourier transform (STFT) by a generalized Wiener filter obtained from the spectrogram model parameters.
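
The separation principle described in the abstract can be illustrated with a minimal sketch. It is a deliberately simplified, single-channel stand-in and not the chapter's multichannel spatial covariance model: a power spectrogram is factorized with plain non-negative matrix factorization (KL-divergence multiplicative updates, a technique swapped in here for illustration), and Wiener-style masks built from the model parameters filter a complex STFT. The function names `nmf_kl` and `wiener_separate`, the random toy "STFT", and the grouping of components into sources are all assumptions made for this example.

```python
import numpy as np

def nmf_kl(V, n_components, n_iter=200, eps=1e-12, seed=0):
    """Factorize a non-negative matrix V (freq x time) as W @ H using
    multiplicative updates for the KL divergence (illustrative choice)."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, n_components)) + eps   # spectral templates
    H = rng.random((n_components, T)) + eps   # time-varying gains
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

def wiener_separate(X, W, H, groups, eps=1e-12):
    """Filter the complex STFT X with one mask per source.
    Each mask is (modelled power of the source) / (modelled total power).
    `groups` assigns component indices to sources (hypothetical assignment)."""
    parts = [W[:, idx] @ H[idx, :] for idx in groups]
    total = np.maximum(sum(parts), eps)       # guard against division by zero
    return [(part / total) * X for part in parts]

# Toy data standing in for a mixture STFT (16 frequency bins, 40 frames).
rng = np.random.default_rng(1)
X = rng.standard_normal((16, 40)) + 1j * rng.standard_normal((16, 40))
V = np.abs(X) ** 2                            # observed power spectrogram

W, H = nmf_kl(V, n_components=4)
groups = [[0, 1], [2, 3]]                     # assumed: two sources, two components each
S = wiener_separate(X, W, H, groups)

# The masks sum to one, so the source estimates add back up to the mixture.
residual = np.max(np.abs(S[0] + S[1] - X))
```

Because the masks are ratios of modelled source power to modelled total power, they lie between 0 and 1 and sum to one across sources, so the separated STFTs reconstruct the mixture when added together. In the multichannel setting the abstract describes, the scalar masks would be replaced by filters derived from the spatial covariance model.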
    Original language: English
    Title of host publication: Parametric time-frequency-domain spatial audio
    Editors: Ville Pulkki, Symeon Delikaris-Manias, Archontis Politis
    Publisher: John Wiley & Sons
    ISBN (Electronic): 978-1-119-25263-4
    ISBN (Print): 978-1-119-25259-7
    Publication status: Published - 13 Oct 2017
    Publication type: A3 Book chapter

    Publication forum classification

    • Publication forum level 2

