Bayesian allocation model: marginal likelihood-based model selection for count tensors

Yıldırım, Sinan and Kurutmaz, M. Burak and Barsbey, Melih and Şimşekli, Umut and Cemgil, A. Taylan (2021) Bayesian allocation model: marginal likelihood-based model selection for count tensors. IEEE Journal of Selected Topics in Signal Processing, 15 (3). pp. 560-573. ISSN 1932-4553 (Print) 1941-0484 (Online)

Abstract

In this article, we introduce a dynamic generative model, the Bayesian allocation model (BAM), for modeling count data. BAM covers various probabilistic nonnegative tensor factorization (NTF) and topic models under one general framework. In BAM, allocations are made using a Bayesian network whose conditional probability tables can be integrated out analytically. We show that, when allocations are viewed as sequential, the resulting marginal process is a special type of Pólya urn process, an integer-valued self-reinforcing process that we name the Pólya-Bayes process. Exploiting the Pólya urn construction, we develop a novel sequential Monte Carlo (SMC) algorithm for marginal likelihood estimation in BAM, leading to a unified scoring method for discrete-variable Bayesian networks with hidden nodes, including various NTF and topic models. The SMC estimator of the marginal likelihood has the remarkable property of being unbiased, in contrast to variational algorithms, which are generally biased. We also demonstrate how our SMC-based likelihood estimation can be integrated within a Markov chain Monte Carlo algorithm for principled and correct (in terms of respecting the true posterior distribution) Bayesian model selection and hyperparameter estimation for BAM. We provide several numerical examples, on both artificial and real datasets, that demonstrate the performance of the algorithms in various data regimes.
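
To illustrate the link the abstract draws between integrating out probability tables and a Pólya urn, here is a minimal sketch for the simplest possible case: a single categorical variable with a Dirichlet prior. This is not the BAM model or the paper's SMC algorithm; the function names and the example values of `alpha` and `seq` are hypothetical and chosen only for illustration. It checks that the sequential urn probabilities multiply to the standard closed-form Dirichlet-categorical marginal likelihood.

```python
import math


def polya_urn_log_prob(sequence, alpha):
    """Log-probability of an ordered sequence of category indices under a
    Polya urn whose initial (possibly fractional) counts are `alpha`."""
    counts = [float(a) for a in alpha]
    total = sum(counts)
    log_p = 0.0
    for k in sequence:
        # Predictive probability of category k given the draws so far.
        log_p += math.log(counts[k] / total)
        # Self-reinforcement: the drawn category gains one unit of mass.
        counts[k] += 1.0
        total += 1.0
    return log_p


def dirichlet_categorical_log_marginal(sequence, alpha):
    """Closed-form log marginal likelihood of the same ordered sequence,
    with the categorical parameter integrated out under Dirichlet(alpha)."""
    n = [0] * len(alpha)
    for k in sequence:
        n[k] += 1
    a_sum = sum(alpha)
    out = math.lgamma(a_sum) - math.lgamma(a_sum + len(sequence))
    for a, c in zip(alpha, n):
        out += math.lgamma(a + c) - math.lgamma(a)
    return out


# Hypothetical example values, for illustration only.
alpha = [0.5, 1.0, 1.5]
seq = [0, 2, 2, 1, 2]
urn = polya_urn_log_prob(seq, alpha)
closed = dirichlet_categorical_log_marginal(seq, alpha)
```

In this toy case `urn` and `closed` agree up to floating-point error, and the urn probability does not depend on the order of the draws, which is the exchangeability property that makes a sequential view of allocations possible.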
Item Type: Article
Uncontrolled Keywords: Data models; Tensors; Bayes methods; Resource management; Estimation; Probabilistic logic; Numerical models; Bayesian model selection; Bayesian network; Markov chain Monte Carlo; model scoring; nonnegative tensor factorization; sequential Monte Carlo; topic models; Pólya urns
Subjects: Q Science > QA Mathematics > QA273-280 Probabilities. Mathematical statistics
Q Science > QA Mathematics > QA075 Electronic computers. Computer science
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Computer Science & Eng.
Faculty of Engineering and Natural Sciences
Depositing User: Sinan Yıldırım
Date Deposited: 06 May 2021 13:31
Last Modified: 19 Aug 2022 11:11
URI: https://research.sabanciuniv.edu/id/eprint/41492