Yavuz, Mehmet Can and Ali Ahmed, Sara Atito and Kısaağa, Mehmet Efe and Ocak, Hasan and Yanıkoğlu, Berrin (2021) YFCC-CelebA face attributes datasets [YFCC-CelebA yüz özellikleri veri setleri]. In: 29th Signal Processing and Communications Applications Conference (SIU), Istanbul, Turkey
Full text not available from this repository.
Official URL: https://dx.doi.org/10.1109/SIU53274.2021.9477959
Abstract
The scale of data accessible through internet search engines can reach hundreds of millions, or even billions, of items. Such large, weakly labeled databases have become increasingly important for training face recognition algorithms. Starting with the publicly available YFCC100M, we propose a weakly labeled subset for multi-label face recognition for self-supervised methods. A 392K-image subset of YFCC100M, consisting of 128x128 images, was obtained by querying for the 40 facial attributes. We made this dataset publicly available for other face recognition studies by sharing the IDs, the links and the bounding boxes. To reduce outliers with respect to CelebA, we apply the Elliptic Envelope algorithm in the latent feature space learned over CelebA, obtaining 353K face images. The MixMatch algorithm is then applied to this filtered set to obtain pseudo-labels. Pretraining with these pseudo-labels and final fine-tuning with CelebA brings an improvement of 0.4 percentage points in the Area Under the ROC Curve (AUC) score over the system trained only with CelebA.
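The outlier-reduction step described in the abstract can be illustrated with a minimal sketch using scikit-learn's `EllipticEnvelope`. This is not the authors' code. The feature dimensionality, the synthetic Gaussian features standing in for CelebA and web-collected embeddings, the `contamination` value of 0.1, and the helper name `filter_outliers` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope


def filter_outliers(reference_feats, candidate_feats, contamination=0.1, seed=0):
    """Fit a robust Gaussian envelope on reference features and return a
    boolean mask marking the candidate rows that fall inside it.

    `contamination` is an assumed, illustrative value.
    """
    envelope = EllipticEnvelope(contamination=contamination, random_state=seed)
    envelope.fit(reference_feats)
    # predict() returns +1 for inliers and -1 for outliers.
    return envelope.predict(candidate_feats) == 1


# Synthetic stand-ins for latent features (hypothetical 8-dimensional space).
rng = np.random.default_rng(0)
celeba_like = rng.normal(0.0, 1.0, size=(500, 8))   # reference distribution
web_inliers = rng.normal(0.0, 1.0, size=(200, 8))   # resemble the reference
web_outliers = rng.normal(8.0, 1.0, size=(20, 8))   # far from the reference
web_like = np.vstack([web_inliers, web_outliers])

keep_mask = filter_outliers(celeba_like, web_like)
filtered = web_like[keep_mask]
print(f"kept {filtered.shape[0]} of {web_like.shape[0]} candidates")
```

Fitting the envelope on the reference features and only predicting on the candidates mirrors the idea of removing outliers "with respect to CelebA": the candidates never influence the estimated covariance.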
Item Type: | Papers in Conference Proceedings |
---|---|
Uncontrolled Keywords: | Biometrics; Face Attributes; Semi Supervised Learning; Webly Supervised Learning |
Divisions: | Faculty of Engineering and Natural Sciences |
Depositing User: | Berrin Yanıkoğlu |
Date Deposited: | 01 Sep 2022 12:34 |
Last Modified: | 01 Sep 2022 12:34 |
URI: | https://research.sabanciuniv.edu/id/eprint/43555 |