Regularized nonnegative matrix factorization using Gaussian mixture priors for supervised single channel source separation
Grais, Emad Mounir and Erdoğan, Hakan (2013) Regularized nonnegative matrix factorization using Gaussian mixture priors for supervised single channel source separation. Computer Speech and Language (Sl), 27 (3). pp. 746-762. ISSN 0885-2308
This is the latest version of this item.
Full text not available from this repository.
Official URL: http://dx.doi.org/10.1016/j.csl.2012.09.002
We introduce a new regularized nonnegative matrix factorization (NMF) method for supervised single-channel source separation (SCSS). We propose a new multi-objective cost function which includes the conventional divergence term for the NMF together with a prior likelihood term. The first term measures the divergence between the observed data and the multiplication of basis and gains matrices. The novel second term encourages the log-normalized gain vectors of the NMF solution to increase their likelihood under a prior Gaussian mixture model (GMM) which is used to encourage the gains to follow certain patterns. In this model, the parameters to be estimated are the basis vectors, the gain vectors and the parameters of the GMM prior. We introduce two different ways to train the model parameters, sequential training and joint training. In sequential training, after finding the basis and gains matrices, the gains matrix is then used to train the prior GMM in a separate step. In joint training, within each NMF iteration the basis matrix, the gains matrix and the prior GMM parameters are updated jointly using the proposed regularized NMF. The normalization of the gains makes the prior models energy independent, which is an advantage as compared to earlier proposals. In addition, GMM is a much richer prior than the previously considered alternatives such as conjugate priors which may not represent the distribution of the gains in the best possible way. In the separation stage after observing the mixed signal, we use the proposed regularized cost function with a combined basis and the GMM priors for all sources that were learned from training data for each source. Only the gain vectors are estimated from the mixed data by minimizing the joint cost function. We introduce novel update rules that solve the optimization problem efficiently for the new regularized NMF problem. This optimization is challenging due to using energy normalization and GMM for prior modeling, which makes the problem highly nonlinear and non-convex. The experimental results show that the introduced methods improve the performance of single channel source separation for speech separation and speech music separation with different NMF divergence functions. The experimental results also show that, using the GMM prior gives better separation results than using the conjugate prior.
Available Versions of this Item
Repository Staff Only: item control page