Analyzing crowd workers' learning behavior to obtain more reliable labels
Rabiger, Stefan (2018) Analyzing crowd workers' learning behavior to obtain more reliable labels. [Thesis]
Official URL: http://risc01.sabanciuniv.edu/record=b1817259 (Table of Contents)
Crowdsourcing is a popular means to obtain high-quality labels for datasets at moderate cost. These crowdsourced datasets are then used for training supervised or semi-supervised predictors. This implies that the performance of the resulting predictors depends on the quality/reliability of the labels assigned by crowd workers; low reliability usually leads to poorly performing predictors. In practice, label reliability in crowdsourced datasets varies substantially depending on multiple factors, such as the difficulty of the labeling task at hand, the characteristics and motivation of the participating crowd workers, or the difficulty of the documents to be labeled. Different approaches exist to mitigate the effects of the aforementioned factors, for example by identifying spammers based on their annotation times and removing their submitted labels. To complement existing approaches for improving label reliability in crowdsourcing, this thesis explores label reliability from two perspectives: first, how the label reliability of crowd workers develops over time during an actual labeling task, and second, how it is affected by the difficulty of the documents to be labeled. We find that the label reliability of crowd workers increases once they have labeled a certain number of documents. Motivated by our finding that label reliability is lower for more difficult documents, we propose a new crowdsourcing methodology to improve label reliability: given an unlabeled dataset to be crowdsourced, we first train a difficulty predictor on a small seed set; the predictor then estimates the difficulty of the remaining unlabeled documents. This procedure may be repeated multiple times until the performance of the difficulty predictor is sufficient. Ultimately, difficult documents are separated from the rest, so that only the easier documents are crowdsourced. Our experiments demonstrate the feasibility of this method.
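The proposed iterative procedure can be sketched in code. The following is a minimal illustration, not the thesis's actual implementation: all function names are hypothetical, and the "difficulty predictor" is replaced by a toy stand-in that learns a document-length threshold from the seed set.

```python
# Hedged sketch of the abstract's methodology: train a difficulty predictor
# on a small seed set, estimate difficulty of the remaining unlabeled
# documents, optionally iterate, then crowdsource only the easier ones.
# All names and the toy length-based predictor are illustrative assumptions.

def train_difficulty_predictor(seed):
    """Toy stand-in for the difficulty predictor: learns a length
    threshold from (text, is_difficult) pairs in the seed set."""
    hard_lens = [len(t) for t, d in seed if d]
    easy_lens = [len(t) for t, d in seed if not d]
    # Midpoint between mean lengths of easy and difficult seed documents.
    threshold = (sum(hard_lens) / len(hard_lens)
                 + sum(easy_lens) / len(easy_lens)) / 2
    return lambda text: len(text) >= threshold

def split_for_crowdsourcing(seed, unlabeled, rounds=2, grow=1):
    """Iteratively (re)train the predictor, moving a few predicted-difficult
    documents into the seed set each round (simulating the repeated
    procedure), then separate the remaining documents: only those predicted
    easy would be crowdsourced."""
    seed, pool = list(seed), list(unlabeled)
    for _ in range(rounds):
        predict = train_difficulty_predictor(seed)
        moved = [t for t in pool if predict(t)][:grow]
        seed += [(t, True) for t in moved]
        pool = [t for t in pool if t not in moved]
    predict = train_difficulty_predictor(seed)
    easy = [t for t in pool if not predict(t)]
    hard = [t for t in pool if predict(t)]
    return easy, hard
```

In practice the length heuristic would be replaced by a trained classifier over document features, but the control flow (seed, predict, iterate, filter) mirrors the methodology described above.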