Grais, Emad Mounir and Erdoğan, Hakan (2011) Adaptation of speaker-specific bases in non-negative matrix factorization for single channel speech-music separation. In: 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy
PDF
Adaptation.pdf
Download (138kB)
Abstract
This paper introduces a speaker adaptation algorithm for nonnegative matrix factorization (NMF) models. The proposed algorithm combines Bayesian and subspace model adaptation. The adapted model is used to separate a speech signal from a background music signal in a single-channel recording. Training speech data from multiple speakers is used with NMF to train a set of basis vectors as a general model for speech signals. The probabilistic interpretation of NMF is used to perform Bayesian adaptation, which adjusts the general model to the actual properties of the speech signal observed in the mixed signal. The Bayesian adapted model is then adapted again by a linear transform, which changes the subspace spanned by the Bayesian adapted model to better match the speech signal in the mixed signal. The experimental results show that combining Bayesian adaptation with linear transform adaptation improves the separation results.
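To make the NMF baseline in the abstract concrete, the following is a minimal sketch of supervised NMF separation: bases are trained per source, held fixed, and only the gains are estimated for the mixture. It is not the paper's method. It assumes magnitude spectrograms as input, a KL-divergence cost with the standard Lee and Seung multiplicative updates, and a ratio mask for reconstruction, and it omits the Bayesian and linear-transform adaptation steps entirely. The names `nmf_kl` and `separate` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf_kl(V, n_bases, n_iter=200, B=None, seed=0):
    """Factor a non-negative matrix V (freq x frames) as B @ G.

    Uses multiplicative updates for the KL divergence (an assumption,
    not taken from the paper). If B is supplied it is held fixed and
    only the gains G are estimated; n_bases is then ignored.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    update_bases = B is None
    if update_bases:
        B = rng.random((n_freq, n_bases)) + EPS
    G = rng.random((B.shape[1], n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Gain update with the current bases.
        R = V / (B @ G + EPS)
        G *= (B.T @ R) / (B.T @ ones + EPS)
        if update_bases:
            # Basis update with the current gains.
            R = V / (B @ G + EPS)
            B *= (R @ G.T) / (ones @ G.T + EPS)
            # Normalize each basis vector to unit sum and move the
            # scale into the gains so that B @ G is unchanged.
            scale = B.sum(axis=0, keepdims=True) + EPS
            B /= scale
            G *= scale.T
    return B, G


def separate(V_mix, B_speech, B_music, n_iter=200):
    """Split a mixture spectrogram using fixed, pre-trained bases."""
    B = np.hstack([B_speech, B_music])
    _, G = nmf_kl(V_mix, B.shape[1], n_iter=n_iter, B=B)
    k = B_speech.shape[1]
    S = B_speech @ G[:k]   # speech part of the model
    M = B_music @ G[k:]    # music part of the model
    mask = S / (S + M + EPS)  # ratio mask, values in [0, 1]
    return mask * V_mix, (1.0 - mask) * V_mix
```

In this sketch, `B_speech` would come from `nmf_kl` run on multi-speaker training spectrograms and `B_music` from music training data. The adaptation described in the abstract would further adjust `B_speech` toward the speaker in the mixture, which this code does not attempt.
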
Item Type: | Papers in Conference Proceedings |
---|---|
Uncontrolled Keywords: | Model adaptation, single channel source separation, source separation, speech music separation, nonnegative matrix factorization |
Subjects: | T Technology > T Technology (General) |
Divisions: | Faculty of Engineering and Natural Sciences > Academic programs > Electronics; Faculty of Engineering and Natural Sciences |
Depositing User: | Emad Mounir Grais Girgis |
Date Deposited: | 25 Nov 2011 12:54 |
Last Modified: | 26 Apr 2022 09:02 |
URI: | https://research.sabanciuniv.edu/id/eprint/17517 |