Local representations and random sampling for speaker verification
Işık, Yusuf Ziya (2010) Local representations and random sampling for speaker verification. [Thesis]
Official URL: http://192.168.1.20/record=b1302877 (Table of Contents)
In text-independent speaker verification, studies focused on compensating intra-speaker variabilities at the modeling stage through the last decade. Intra-speaker variabilities may be due to channel effects, phonetic content or the speaker himself in the form of speaking style, emotional state, health or other similar factors. Joint Factor Analysis, Total Variability Space compensation, Nuisance Attribute Projection are some of the most successful approaches for inter-session variability compensation in the literature. In this thesis, we criticize the assumptions of low dimensionality of channel space in these methods and propose to partition the acoustic space into local regions. Intra-speaker variability compensation may be done in each local space separately. Two architectures are proposed depending on whether the subsequent modeling and scoring steps will also be done locally or globally. We have also focused on a particular component of intra-speaker variability, namely within-session variability. The main source of within-session variability is the differences in the phonetic content of speech segments in a single utterance. The variabilities in phonetic content may be either due to across acoustic event variabilities or due to differences in the actual realizations of the acoustic events. We propose a method to combat these variabilities through random sampling of training utterance. The method is shown to be effective both in short and long test utterances.
Repository Staff Only: item control page