Scalable Monte Carlo inference in regression models with missing data

Koçhan, Didem (2018) Scalable Monte Carlo inference in regression models with missing data. [Thesis]


Markov chain Monte Carlo (MCMC) and Stochastic Gradient Langevin Dynamics (SGLD) algorithms form the basis of this thesis. These methods are studied in detail and combined to handle large, incomplete datasets. Two algorithms, based on Metropolis-Hastings (MH) and SGLD, are proposed to improve the performance of regression with missing data. We introduce an SGLD algorithm for large datasets with missing portions. The algorithm approximates the gradient of the log-likelihood of a subset of the data with respect to the unknown parameter, using samples of the missing components obtained with MH moves. We implemented these methods for a logistic regression model to obtain parameter estimates. We applied the methods to two datasets with missing features and compared their performance. The first dataset is artificially generated from a logistic regression model with normally distributed features, whereas the second is a real categorical dataset.
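
The SGLD-with-MH idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation; it only shows one plausible way the pieces fit together for logistic regression. Everything specific here is an assumption made for illustration: the function names (sgld_missing, mh_impute, log_target), the encoding of missing features as None, independent standard normal features, a zero-mean Gaussian prior on the parameter, a random-walk MH proposal, and all step sizes and iteration counts.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_sigmoid(z):
    # Numerically stable log of the logistic function.
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_target(x, missing, y, theta):
    # Log density (up to a constant) of the missing features given the
    # observed ones, the label y and theta. ASSUMPTION: features are
    # independent N(0, 1), so only the missing entries contribute a prior term.
    z = sum(t * v for t, v in zip(theta, x))
    log_lik = log_sigmoid(z) if y == 1 else log_sigmoid(-z)
    return log_lik - 0.5 * sum(x[j] ** 2 for j in missing)

def mh_impute(x, missing, y, theta, rng, n_moves=5, step=1.0):
    # A few random-walk Metropolis-Hastings moves on the missing entries of x.
    x = list(x)
    cur = log_target(x, missing, y, theta)
    for _ in range(n_moves):
        prop = list(x)
        for j in missing:
            prop[j] = x[j] + step * rng.gauss(0.0, 1.0)
        new = log_target(prop, missing, y, theta)
        # Symmetric proposal: accept with probability min(1, exp(new - cur)).
        if rng.random() < math.exp(min(0.0, new - cur)):
            x, cur = prop, new
    return x

def sgld_missing(X, Y, n_iter=2000, batch=20, eps=0.002, prior_var=10.0, seed=0):
    # X: list of feature rows, with None marking a missing entry (assumed encoding).
    # Y: list of 0/1 labels. Returns the list of sampled parameter vectors.
    rng = random.Random(seed)
    N, d = len(X), len(X[0])
    missing = [[j for j, v in enumerate(row) if v is None] for row in X]
    # Start each missing entry at the mean of the assumed feature model.
    filled = [[0.0 if v is None else v for v in row] for row in X]
    theta = [0.0] * d
    samples = []
    for _ in range(n_iter):
        idx = rng.sample(range(N), batch)
        # Gradient of the log prior, assumed N(0, prior_var * I).
        grad = [-t / prior_var for t in theta]
        for i in idx:
            if missing[i]:
                # Refresh the imputed values with MH moves at the current theta.
                filled[i] = mh_impute(filled[i], missing[i], Y[i], theta, rng)
            z = sum(t * v for t, v in zip(theta, filled[i]))
            resid = Y[i] - sigmoid(z)
            for j in range(d):
                # Minibatch log-likelihood gradient, rescaled to the full dataset.
                grad[j] += (N / batch) * resid * filled[i][j]
        # Langevin step: half-step along the gradient plus N(0, eps) noise.
        theta = [t + 0.5 * eps * g + math.sqrt(eps) * rng.gauss(0.0, 1.0)
                 for t, g in zip(theta, grad)]
        samples.append(theta)
    return samples
```

Calling sgld_missing(X, Y) returns one parameter vector per iteration; averaging the later samples gives a rough posterior-mean estimate. The imputed values persist between iterations, so each data point carries its own short MH chain that is advanced only when the point appears in a minibatch. This is a design choice of the sketch, not something the abstract specifies.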
Item Type: Thesis
Uncontrolled Keywords: Industrial and Industrial Engineering. -- Endüstri ve Endüstri Mühendisliği.
Subjects: T Technology > T Technology (General) > T055.4-60.8 Industrial engineering. Management engineering
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Industrial Engineering
Faculty of Engineering and Natural Sciences
Depositing User: IC-Cataloging
Date Deposited: 05 Oct 2018 11:33
Last Modified: 26 Apr 2022 10:26
