Scalable Monte Carlo inference in regression models with missing data

Koçhan, Didem (2018) Scalable Monte Carlo inference in regression models with missing data. [Thesis]


Official URL: http://risc01.sabanciuniv.edu/record=b1817043 (Table of Contents)


Markov chain Monte Carlo (MCMC) and Stochastic Gradient Langevin Dynamics (SGLD) algorithms form the basis of this thesis. These methods are studied in detail and combined to handle large, incomplete datasets. Two algorithms, based on Metropolis-Hastings (MH) and SGLD, are proposed to improve the performance of regression with missing data. We introduce an SGLD algorithm for large datasets with missing portions. The algorithm approximates the gradient of the log-likelihood of a subset of the data with respect to the unknown parameter, using samples of the missing components obtained with MH moves. We implemented these methods for a logistic regression model to obtain parameter estimates. We applied the methods to two different datasets with missing features and compared their performance. The first dataset is artificially generated from a logistic regression model in which the features are normally distributed, whereas the second is a real categorical dataset.
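The combination described in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm, only one plausible instance under stated assumptions: features are independent standard normal, missing entries are marked with NaN, the parameter has a zero-mean Gaussian prior, and the MH moves use an independence proposal drawn from the feature distribution. The names `mh_impute` and `sgld_missing`, the step size, the batch size and the number of MH moves are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def log_lik(theta, X, y):
    # Per-row Bernoulli log-likelihood of a logistic regression model.
    z = X @ theta
    return y * z - np.logaddexp(0.0, z)

def mh_impute(theta, X, y, miss, rng, n_moves=5):
    # Independence MH on the missing entries of each row. Proposals come
    # from the ASSUMED N(0, 1) feature distribution, so the acceptance
    # ratio reduces to a likelihood ratio. Observed entries never change.
    X = X.copy()
    for _ in range(n_moves):
        prop = np.where(miss, rng.standard_normal(X.shape), X)
        log_ratio = log_lik(theta, prop, y) - log_lik(theta, X, y)
        accept = np.log(rng.uniform(size=len(y))) < log_ratio
        X[accept] = prop[accept]
    return X

def sgld_missing(X_nan, y, n_iter=2000, batch=50, eps=1e-3,
                 prior_var=10.0, seed=0):
    rng = np.random.default_rng(seed)
    N, d = X_nan.shape
    miss = np.isnan(X_nan)
    # Initialise the missing entries with draws from the feature distribution.
    X_state = np.where(miss, rng.standard_normal((N, d)), X_nan)
    theta = np.zeros(d)
    samples = np.empty((n_iter, d))
    for t in range(n_iter):
        idx = rng.choice(N, size=batch, replace=False)
        # Refresh the missing components of the minibatch with MH moves.
        X_state[idx] = mh_impute(theta, X_state[idx], y[idx], miss[idx], rng)
        Xb = X_state[idx]
        # Minibatch gradient of the log-likelihood at the completed data.
        grad_lik = Xb.T @ (y[idx] - sigmoid(Xb @ theta))
        grad = -theta / prior_var + (N / batch) * grad_lik
        # SGLD step: half-step along the gradient plus N(0, eps) noise.
        theta = theta + 0.5 * eps * grad + np.sqrt(eps) * rng.standard_normal(d)
        samples[t] = theta
    return samples

# Synthetic example: normally distributed features, 20% of entries missing.
rng = np.random.default_rng(1)
theta_true = np.array([1.0, -2.0, 0.5])
X = rng.standard_normal((1000, 3))
y = (rng.uniform(size=1000) < sigmoid(X @ theta_true)).astype(float)
X_nan = np.where(rng.uniform(size=X.shape) < 0.2, np.nan, X)
samples = sgld_missing(X_nan, y)
print(samples[1000:].mean(axis=0))  # posterior-mean estimate after burn-in
```

Because the assumed feature distribution does not depend on the parameter, the gradient of the complete-data log-likelihood is just the usual logistic regression gradient evaluated at the imputed rows, which is what makes the MH samples usable inside the SGLD update. A categorical dataset would need a different proposal for the missing components.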

Item Type: Thesis
Uncontrolled Keywords: Industrial and Industrial Engineering. -- Endüstri ve Endüstri Mühendisliği.
Subjects: T Technology > T Technology (General) > T055.4-60.8 Industrial engineering. Management engineering
ID Code: 36605
Deposited By: IC-Cataloging
Deposited On: 05 Oct 2018 11:33
Last Modified: 30 Apr 2020 12:27
