Boosting large-scale graph embedding with multi-level graph coarsening

Akyıldız, Taha Atahan (2020) Boosting large-scale graph embedding with multi-level graph coarsening. [Thesis]

[thumbnail of 10356113_Akyıldız_Taha_Atahan.pdf] PDF
10356113_Akyıldız_Taha_Atahan.pdf

Download (1MB)

Abstract

Graphs can be found anywhere from protein interaction networks to social networks. However, the irregular structure of graph data constitutes an obstacle for running machine learning tasks such as link prediction, node classification, and anomaly detection. Graph embedding is the process of representing graphs in a multidimensional space, which enables machine learning tasks to be run on graphs. Although, embedding is proven to be advantageous by a series of works, it is computeintensive. Current embedding approaches either can not scale to large graphs or they require expensive hardware for such purposes. In this work we propose a novel, parallel multi-level coarsening method to boost the performance of graph embedding both in terms of speed, and accuracy. We integrate the proposed coarsening approach into a GPU-based graph embedding tool called Gosh, which is able to embed large-scale graphs with a single GPU at a fraction of the time compared to the state-of-the-art. When coarsening is introduced, the run-time of Gosh improves by 14× while scoring greater AUCROC for the majority of medium-scale graphs. For the largest graph in our data-set with 66 million vertices, and 1.8 billion edges, embedding takes under an hour, and 93.4% AUCROC is achieved. Moreover, we investigate the relationship between quality of the coarsening on the quality of the embeddings. Our preliminary experiments show that the coarsening decisions must be balanced and the proposed coarsening strategy novel performs well for graph embedding
Item Type: Thesis
Uncontrolled Keywords: Graph coarsening. -- graph embedding. -- GPU. -- parallel graph algorithms. -- link prediction. -- çizge irileştirme. -- çizge katıştırma. -- ekran kartı. -- bağlantı tahmini. -- paralel algoritmalar.
Subjects: T Technology > TK Electrical engineering. Electronics Nuclear engineering > TK7800-8360 Electronics > TK7885-7895 Computer engineering. Computer hardware
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Computer Science & Eng.
Faculty of Engineering and Natural Sciences
Depositing User: IC-Cataloging
Date Deposited: 24 Oct 2020 12:00
Last Modified: 26 Apr 2022 10:34
URI: https://research.sabanciuniv.edu/id/eprint/41179

Actions (login required)

View Item
View Item