YFCC-CelebA face attributes datasets [YFCC-CelebA yüz özellikleri veri setleri]

Yavuz, Mehmet Can and Ali Ahmed, Sara Atito and Kısaağa, Mehmet Efe and Ocak, Hasan and Yanıkoğlu, Berrin (2021) YFCC-CelebA face attributes datasets [YFCC-CelebA yüz özellikleri veri setleri]. In: 29th Signal Processing and Communications Applications Conference (SIU), Istanbul, Turkey

Full text not available from this repository.


The scale of the data accessible through internet search engines can reach hundreds of millions, or even billions, of items. The existence of such large weakly labeled databases has gained importance in the training of face recognition algorithms. Starting with the publicly available YFCC100M, we propose a weakly labeled subset for multi-label face recognition with self-supervised methods. A subset of 392K images from YFCC100M, at 128x128 resolution, was obtained by querying for the 40 facial attributes. We made this dataset publicly available for other face recognition studies by sharing the IDs, the links, and the bounding boxes. To reduce outliers with respect to CelebA, we apply the Elliptic Envelope algorithm in the latent feature space learned over CelebA, obtaining 353K face images. The MixMatch algorithm is then applied to this filtered set to obtain pseudo-labels. Pretraining with these pseudo-labels, followed by fine-tuning on CelebA, improves the Area Under the ROC Curve (AUC) score by 0.4 percentage points over the system trained only on CelebA.
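
The outlier-filtering step described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn's `EllipticEnvelope` as the implementation and uses synthetic vectors in place of real latent features. The feature dimension, sample counts, and `contamination` value are illustrative assumptions, not values reported by the authors.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.default_rng(0)

# Hypothetical stand-ins for latent features. In the setting of the abstract,
# these would come from a network trained on CelebA; here they are synthetic.
reference_feats = rng.normal(0.0, 1.0, size=(500, 8))   # "CelebA-like" features
candidate_feats = np.vstack([
    rng.normal(0.0, 1.0, size=(90, 8)),    # candidates resembling the reference
    rng.normal(8.0, 1.0, size=(10, 8)),    # candidates far from the reference
])

# Fit a robust Gaussian envelope on the reference features only.
# contamination=0.05 is an assumed value, not one taken from the paper.
envelope = EllipticEnvelope(contamination=0.05, random_state=0)
envelope.fit(reference_feats)

# predict() returns +1 for inliers and -1 for outliers.
labels = envelope.predict(candidate_feats)
kept = candidate_feats[labels == 1]
print(f"kept {len(kept)} of {len(candidate_feats)} candidates")
```

In this sketch the envelope is fitted on the reference features and only then applied to the candidate set, which mirrors the abstract's description of filtering web images with respect to CelebA. How the authors configured the algorithm is not stated in this record.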
Item Type: Papers in Conference Proceedings
Uncontrolled Keywords: Biometrics; Face Attributes; Semi Supervised Learning; Webly Supervised Learning
Divisions: Faculty of Engineering and Natural Sciences
Depositing User: Berrin Yanıkoğlu
Date Deposited: 01 Sep 2022 12:34
Last Modified: 01 Sep 2022 12:34
URI: https://research.sabanciuniv.edu/id/eprint/43555
