Bulut, Ahmet Emin (2014) Diarization of telephone conversations using probabilistic linear discriminant analysis. [Thesis]
PDF
AhmetEminBulut_10060168.pdf
Download (581kB)
Abstract
Speaker diarization can be summarized as the process of partitioning an audio recording into homogeneous segments according to speaker identity. This thesis investigates the application of probabilistic linear discriminant analysis (PLDA) to speaker diarization of telephone conversations. We introduce a variational Bayes (VB) approach for inference under a PLDA model of segmental i-vectors. A deterministic annealing (DA) algorithm is employed to avoid locally optimal solutions in the VB iterations. We compare the proposed system with a well-known baseline that applies k-means clustering to the principal component analysis coefficients of segmental i-vectors. To evaluate the proposed system, we use summed-channel telephone data from the National Institute of Standards and Technology 2008 Speaker Recognition Evaluation as the test set. We achieve about 20% relative improvement in diarization error rate compared to the baseline system.
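The abstract reports its result as a relative improvement in diarization error rate (DER). As a rough illustration of how such a figure is typically computed, the sketch below uses the standard definition of DER (missed speech, false-alarm speech, and speaker-confusion time divided by total reference speech time). The function names and all durations are hypothetical and are not taken from the thesis, which does not state its absolute DER values in this record.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction: erroneous time over total reference speech time."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def relative_improvement(baseline_der, proposed_der):
    """Return the fraction by which the proposed DER is lower than the baseline DER."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - proposed_der) / baseline_der


# Hypothetical durations in seconds, chosen only for illustration.
baseline = diarization_error_rate(missed=6.0, false_alarm=4.0,
                                  confusion=20.0, total_speech=300.0)
proposed = diarization_error_rate(missed=6.0, false_alarm=4.0,
                                  confusion=14.0, total_speech=300.0)
gain = relative_improvement(baseline, proposed)

print(f"baseline DER = {baseline:.1%}, proposed DER = {proposed:.1%}, "
      f"relative improvement = {gain:.0%}")
```

With these made-up numbers the absolute difference is two percentage points of DER, while the relative improvement is one fifth of the baseline error, which is the kind of quantity the abstract quotes.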
Item Type: | Thesis |
---|---|
Uncontrolled Keywords: | Speaker diarization. -- i-vector. -- PLDA. -- Deterministic annealing. -- Variational Bayes. -- Speaker. -- Speaker verification system. -- Segmentation. -- Konuşmacı. -- Bölütleme. -- i-vektör. -- ODAA. -- Belirleyici tavlama. -- Değişkenli Bayes. -- Konuşmacı onaylama sistemi. |
Subjects: | T Technology > TK Electrical engineering. Electronics Nuclear engineering > TK7800-8360 Electronics > TK7885-7895 Computer engineering. Computer hardware |
Divisions: | Faculty of Engineering and Natural Sciences > Academic programs > Computer Science & Eng.; Faculty of Engineering and Natural Sciences |
Depositing User: | IC-Cataloging |
Date Deposited: | 09 Mar 2016 16:35 |
Last Modified: | 26 Apr 2022 10:06 |
URI: | https://research.sabanciuniv.edu/id/eprint/29203 |