Quantifying behavioral complexities of human and bot accounts using data compression

Bayık, Davut (2022) Quantifying behavioral complexities of human and bot accounts using data compression. [Thesis]

As the number of automated accounts has grown rapidly alongside the worldwide growth in social media users, there is a growing need to understand the nature of bot accounts in order to prevent their manipulative and misleading effects on ordinary users. This study focuses on a complexity analysis of users on the Twitter platform to reveal the hidden, differentiating patterns between human and bot accounts, using 14 publicly available datasets collected through the Twitter API and labelled with different annotation methods. The analysis consists of two parts: quantifying the complexity of account behavior and reducing the dimensionality of profile information. We assess account complexity by encoding account activities into a sequence of codes and compressing the repetitions and patterns in that sequence. For the profile information, we developed a heuristic method that uses variational autoencoders to determine how much of an account’s profile features can be compressed with minimal loss of information. The results of the two parts of our analysis are largely consistent with each other when comparing complexity across datasets and between human and bot accounts. We validated and corroborated our findings by predicting the next activity of accounts with discrete-time Markov chains and calculating the accuracy of the predictions. As a result, we analyzed the complexity of bot and human accounts and obtained complexity levels for each bot dataset we used. We hope this study will lead to the development of measures that quantify the robustness of bot detection systems.
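The two behavioral measurements described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the three-letter activity alphabet, the use of zlib as the compressor, the first-order chain, the half/half train-test split, and the synthetic sequences. The function names `compression_ratio` and `markov_accuracy` are hypothetical.

```python
import random
import zlib
from collections import Counter, defaultdict

# Hypothetical activity alphabet (an assumption, not from the thesis):
# T = tweet, R = retweet, P = reply.


def compression_ratio(sequence: str) -> float:
    """Compressed size divided by raw size; lower means more repetitive."""
    raw = sequence.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


def markov_accuracy(sequence: str) -> float:
    """Fit a first-order Markov chain on the first half of the sequence and
    return next-activity prediction accuracy on the second half.
    Assumes the sequence has at least four activities."""
    split = len(sequence) // 2
    train, test = sequence[:split], sequence[split:]

    # Count observed transitions current -> following in the training half.
    transitions = defaultdict(Counter)
    for current, following in zip(train, train[1:]):
        transitions[current][following] += 1

    # For a state never seen in training, guess the most frequent activity.
    fallback = Counter(train).most_common(1)[0][0]

    hits = 0
    for current, following in zip(test, test[1:]):
        counts = transitions.get(current)
        guess = counts.most_common(1)[0][0] if counts else fallback
        hits += guess == following
    return hits / (len(test) - 1)


# Synthetic accounts: one strictly periodic, one uniformly random.
rng = random.Random(0)
bot_like = "TRP" * 200
human_like = "".join(rng.choice("TRP") for _ in range(600))

for name, seq in [("bot-like", bot_like), ("human-like", human_like)]:
    print(name, round(compression_ratio(seq), 3), round(markov_accuracy(seq), 3))
```

In this toy setup the periodic sequence compresses far better and is perfectly predictable, while the random sequence does neither. Real accounts would fall between these extremes, and the thesis may use a different encoding, compressor, or chain order.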
Item Type: Thesis
Uncontrolled Keywords: Data Compression. -- Variational Autoencoders. -- Twitter. -- Bot Accounts. -- Dimensionality Reduction. -- Digital DNA. -- Complexity Analysis. -- Markov Chains.
Subjects: T Technology > T Technology (General) > T055.4-60.8 Industrial engineering. Management engineering > T58.5 Information technology
Divisions: Faculty of Engineering and Natural Sciences
Depositing User: Dila Günay
Date Deposited: 25 Apr 2023 14:20
Last Modified: 13 Nov 2023 14:11
URI: https://research.sabanciuniv.edu/id/eprint/47160
