A CPU-GPU hybrid algorithm for embedding large graphs
||The system is temporarily closed to updates for reporting purpose.
Alabsi Aljundi, Amro (2020) A CPU-GPU hybrid algorithm for embedding large graphs. [Thesis]
Official URL: https://risc01.sabanciuniv.edu/record=b2486356 _ (Table of contents)
Graphs have become ubiquitous in this day and age, and their sizes are only becoming larger and harder to deal with. Graph embedding is the process of transforming graphs into a d-dimensional vector space to carry out machine learning tasks on them. However, time- and memory-wise, it is a very expensive task. Many approaches have been proposed to optimize the process of graph embedding using distributed systems and GPUs, however, state-of-the-art GPU implementations fail to embed graphs unless the total memory of the available GPUs satisfies the cost of embedding. We propose a hybrid CPU-GPU graph embedding algorithm that enables arbitrarily large graphs to be embedded using a single GPU even when the GPU’s memory capabilities fall short. The embedding is partitioned into smaller embeddings and the GPU carries out embedding updates on embedding portions that fit the GPU’s memory. The system generates samples on the CPU and sends them to the GPU as they become needed without any global synchronization across the system. The system adopts a generalizable DAG execution model to minimize the dependencies between its sub-tasks. We embed a graph with 60 million vertices and 1.8 billion edges in 17 minutes and report a link prediction AUC ROC score of 97.84% making us 67× faster than the state-of-the-art GPU implementation
Repository Staff Only: item control page