Single channel speech music separation using nonnegative matrix factorization and spectral masks

Grais, Emad Mounir and Erdoğan, Hakan (2011) Single channel speech music separation using nonnegative matrix factorization and spectral masks. In: 17th International Conference on Digital Signal Processing (DSP 2011), Corfu, Greece

Abstract

This work proposes a single-channel speech-music separation algorithm based on nonnegative matrix factorization (NMF) with spectral masks. The algorithm uses training data of speech and music signals with NMF, followed by masking, to separate the mixed signal. In the training stage, NMF learns a set of basis vectors for each source from that source's training data, working in the magnitude spectrum domain. Once the mixed signal is observed, NMF decomposes its magnitude spectra into a linear combination of the trained bases of both sources. The decomposition results are then used to build a mask that describes the contribution of each source to the mixed signal. Experimental results show that applying masks after NMF improves the separation, even when NMF is run with fewer iterations, which yields a faster separation process.
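
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, a Euclidean-cost NMF with multiplicative updates (the paper may use a different cost function), and a Wiener-like soft mask with a hypothetical exponent `p`. The function names `train_bases`, `fit_gains`, and `separate` are made up for this example, and the inputs are assumed to be magnitude spectrograms of shape (frequency bins, frames).

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in the updates


def train_bases(V, n_bases, n_iter=200, seed=0):
    """Training stage: learn nonnegative bases B so that V ~= B @ G."""
    rng = np.random.default_rng(seed)
    B = rng.random((V.shape[0], n_bases)) + EPS
    G = rng.random((n_bases, V.shape[1])) + EPS
    for _ in range(n_iter):
        # Multiplicative updates for the Euclidean cost (an assumption here).
        G *= (B.T @ V) / (B.T @ B @ G + EPS)
        B *= (V @ G.T) / (B @ G @ G.T + EPS)
    # Normalize each basis vector to unit sum.
    return B / (B.sum(axis=0, keepdims=True) + EPS)


def fit_gains(V, B, n_iter=50, seed=0):
    """Separation stage: keep the trained bases fixed, update only the gains."""
    rng = np.random.default_rng(seed)
    G = rng.random((B.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        G *= (B.T @ V) / (B.T @ B @ G + EPS)
    return G


def separate(V_mix, B_speech, B_music, p=2.0, n_iter=50):
    """Decompose the mixture on both basis sets, then apply a soft mask."""
    B = np.hstack([B_speech, B_music])
    G = fit_gains(V_mix, B, n_iter=n_iter)
    k = B_speech.shape[1]
    S = B_speech @ G[:k]   # initial speech magnitude estimate
    M = B_music @ G[k:]    # initial music magnitude estimate
    # Wiener-like mask; the exponent p is an illustrative parameter.
    mask = S**p / (S**p + M**p + EPS)
    return mask * V_mix, (1.0 - mask) * V_mix
```

Because the two masked estimates always sum back to the mixture magnitude, the mask redistributes the observed energy between the sources instead of relying on the NMF reconstruction alone. That is one plausible reading of why fewer NMF iterations can suffice.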
Item Type: Papers in Conference Proceedings
Uncontrolled Keywords: Source separation, Wiener filter, nonnegative matrix factorization, semi-blind source separation, single channel source separation, speech music separation, speech processing
Subjects: T Technology > T Technology (General)
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Electronics
Faculty of Engineering and Natural Sciences
Depositing User: Emad Mounir Grais Girgis
Date Deposited: 25 Nov 2011 14:17
Last Modified: 26 Apr 2022 09:02
URI: https://research.sabanciuniv.edu/id/eprint/17514
