Single channel speech music separation using nonnegative matrix factorization with sliding windows and spectral masks

Grais, Emad Mounir and Erdoğan, Hakan (2011) Single channel speech music separation using nonnegative matrix factorization with sliding windows and spectral masks. In: 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy

Abstract

A single channel speech-music separation algorithm based on nonnegative matrix factorization (NMF) with sliding windows and spectral masks is proposed in this work. We train a set of basis vectors for each source signal using NMF in the magnitude spectral domain. Rather than forming each column of the matrices to be decomposed by NMF from a single spectral frame, we build each column by stacking multiple spectral frames. After observing the mixed signal, NMF is used to decompose its magnitude spectrogram into a weighted linear combination of the trained basis vectors for both sources. An initial spectrogram estimate for each source is found, and a spectral mask is built using these initial estimates. This mask is used to weight the mixed signal spectrogram to find the contribution of each source signal in the mixed signal. The method is shown to perform better than the conventional NMF approach.
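
The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names (`stack_frames`, `unstack_frames`, `nmf`, `separate`) are hypothetical, the NMF uses Euclidean-cost multiplicative updates for brevity (the paper may use a different cost function), overlapping stacked frames are averaged to obtain the initial estimates (an assumption), and the mask is a Wiener-like ratio with an assumed exponent `p`.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def stack_frames(V, L):
    """Stack L neighbouring frames of V (F x T) into each column: (F*L) x (T-L+1)."""
    T = V.shape[1]
    return np.concatenate([V[:, i:T - L + 1 + i] for i in range(L)], axis=0)


def unstack_frames(X, F, L):
    """Average the overlapping stacked columns back into an F x T spectrogram (assumed scheme)."""
    n_cols = X.shape[1]
    T = n_cols + L - 1
    out = np.zeros((F, T))
    count = np.zeros(T)
    for i in range(L):
        out[:, i:i + n_cols] += X[i * F:(i + 1) * F, :]
        count[i:i + n_cols] += 1
    return out / count


def nmf(V, rank, n_iter=200, B=None, seed=0):
    """Euclidean-cost multiplicative updates; if B is given it stays fixed and rank is ignored."""
    rng = np.random.default_rng(seed)
    update_B = B is None
    if update_B:
        B = rng.random((V.shape[0], rank)) + EPS
    G = rng.random((B.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        G *= (B.T @ V) / (B.T @ B @ G + EPS)
        if update_B:
            B *= (V @ G.T) / (B @ G @ G.T + EPS)
    return B, G


def separate(V_mix, B_speech, B_music, L, p=2.0, n_iter=200):
    """Decompose the stacked mixture on the trained bases, then apply a spectral mask."""
    F = V_mix.shape[0]
    B = np.hstack([B_speech, B_music])
    _, G = nmf(stack_frames(V_mix, L), B.shape[1], n_iter=n_iter, B=B)
    k = B_speech.shape[1]
    S0 = unstack_frames(B_speech @ G[:k], F, L)  # initial speech estimate
    M0 = unstack_frames(B_music @ G[k:], F, L)   # initial music estimate
    mask = S0**p / (S0**p + M0**p + EPS)         # Wiener-like mask when p = 2
    return mask * V_mix, (1.0 - mask) * V_mix
```

In this sketch, `B_speech` and `B_music` would come from calling `nmf` on the stacked magnitude spectrograms of speech-only and music-only training data. Because the two masks sum to one, the separated magnitude spectrograms always add back up to the mixture spectrogram.
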
Item Type: Papers in Conference Proceedings
Uncontrolled Keywords: Single channel source separation, source separation, semi-blind source separation, speech music separation, speech processing, nonnegative matrix factorization, Wiener filter
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Electronics
Faculty of Engineering and Natural Sciences
Depositing User: Emad Mounir Grais Girgis
Date Deposited: 25 Nov 2011 14:11
Last Modified: 26 Apr 2022 09:02
URI: https://research.sabanciuniv.edu/id/eprint/17516
