Learning word representations for Turkish (Türkçe için kelime temsillerinin öğrenimi)

Şen, Mehmet Umut and Erdoğan, Hakan (2014) Learning word representations for Turkish (Türkçe için kelime temsillerinin öğrenimi). In: 22nd Signal Processing and Communications Applications Conference (SIU 2014), Trabzon, Turkey

Full text not available from this repository.

Abstract

High-quality word representations have been very successful in recent years at improving performance across a variety of NLP tasks. A word representation maps each word in the vocabulary to a real-valued vector in Euclidean space. Besides achieving high performance on specific tasks, learned word representations have been shown to capture linear relationships among words. The recently introduced skip-gram model improved the unsupervised learning of word embeddings that contain rich syntactic and semantic word relations, in terms of both accuracy and speed. Word embeddings have been used frequently for English but have not yet been applied to Turkish. In this paper, we apply the skip-gram model to a large Turkish text corpus and measure the performance of the resulting embeddings quantitatively with "question" sets that we generated. The learned word embeddings and the question sets are publicly available at our website.
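
The linear relationships mentioned in the abstract are commonly tested with analogy-style questions ("a is to b as c is to ?"), answered by vector arithmetic and cosine similarity. The following is a minimal sketch of that idea and not the authors' evaluation code. It assumes the "question" sets are analogy quadruples, which the abstract does not confirm. The three-dimensional vectors in `TOY_EMBEDDINGS` are invented for illustration, and the names `cosine`, `answer_analogy`, and `analogy_accuracy` are hypothetical.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# They are NOT the embeddings learned in the paper.
TOY_EMBEDDINGS = {
    "kral": [0.9, 0.8, 0.1],     # king
    "kraliçe": [0.9, 0.1, 0.8],  # queen
    "erkek": [0.1, 0.9, 0.1],    # man
    "kadın": [0.1, 0.1, 0.9],    # woman
    "çocuk": [0.2, 0.5, 0.5],    # child
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def answer_analogy(emb, a, b, c):
    """Answer "a is to b as c is to ?" with the word closest to b - a + c.

    The three question words are excluded from the candidates.
    """
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))


def analogy_accuracy(emb, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(answer_analogy(emb, a, b, c) == d for a, b, c, d in questions)
    return correct / len(questions)


# Hypothetical question: erkek is to kral as kadın is to ?
questions = [("erkek", "kral", "kadın", "kraliçe")]
print(answer_analogy(TOY_EMBEDDINGS, "erkek", "kral", "kadın"))
print(analogy_accuracy(TOY_EMBEDDINGS, questions))
```

With real embeddings the candidate set would be the whole vocabulary, so the nearest-neighbour search is usually vectorized rather than looped as it is here.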
Item Type: Papers in Conference Proceedings
Subjects: T Technology > TK Electrical engineering. Electronics Nuclear engineering
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Telecommunications
Faculty of Engineering and Natural Sciences > Academic programs > Electronics
Faculty of Engineering and Natural Sciences
Depositing User: Hakan Erdoğan
Date Deposited: 21 Jan 2015 16:19
Last Modified: 26 Apr 2022 09:18
URI: https://research.sabanciuniv.edu/id/eprint/26655
