Gül, Furkan (2023) A Distance Transform-Based Loss Function for Semantic Image Segmentation with Deep Neural Networks. [Thesis]
PDF: 10613180.Gül.pdf — Download (9MB)
Official URL: https://risc01.sabanciuniv.edu/record=b3400645
Abstract
In recent decades, semantic segmentation datasets have expanded tremendously across diverse and complex domains, including autonomous driving, satellite imaging, and medical imaging. Despite numerous advances in solving complex semantic segmentation problems on these datasets, certain challenges persist, such as the precise segmentation of object boundaries in complexly structured objects. Traditional loss functions such as Cross-Entropy and Intersection over Union (IoU), which are typically based on integrals over segmentation regions, often fall short in these scenarios. These functions treat objects as regions rather than contours, assigning equal importance to all parts of an object, boundaries and interior alike, and thereby overlook the fact that segmentation at object boundaries is both more challenging and more critical. To address this, this thesis introduces a distance transform-based loss function specifically designed to enhance the alignment between predicted and ground-truth boundaries during training, a property not explicitly enforced by commonly used image segmentation losses. The proposed loss function is model-agnostic and can be integrated into the training of any segmentation model to improve boundary detail. Our loss was evaluated on two segmentation datasets: CelebAMask-HQ for single-class segmentation and Cityscapes for multi-class segmentation. Experiments were conducted with two models, U-Net and DeepLabv3+, and three encoders, ResNet-34, ResNet-50, and MobileNetV2, to demonstrate the adaptability and effectiveness of our loss across network architectures. Our evaluations and comparisons of different loss functions showed that our loss surpassed other commonly used loss functions by 0.0561 in boundary IoU, a metric specifically designed to assess the boundary quality of objects in images, on the Cityscapes dataset with U-Net models. Furthermore, our loss used 2.4% less GPU memory, a significant factor when training larger neural networks on large datasets.
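To make the idea concrete, below is a minimal sketch of a distance-transform-weighted loss in the spirit described in the abstract, using PyTorch and SciPy. The function names (`signed_distance_map`, `distance_map_loss`), the signed-distance formulation, and the binary single-class setting are illustrative assumptions, not the thesis's exact definition, which may differ in weighting and in its multi-class extension.

```python
import numpy as np
import torch
from scipy.ndimage import distance_transform_edt


def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed Euclidean distance to the boundary of a binary {0, 1} mask:
    negative inside the object, positive outside, ~zero on the boundary."""
    if not mask.any():  # no foreground: fall back to plain outside distance
        return distance_transform_edt(1 - mask)
    inside = distance_transform_edt(mask)        # fg pixels: distance to nearest bg
    outside = distance_transform_edt(1 - mask)   # bg pixels: distance to nearest fg
    return outside - inside


def distance_map_loss(probs: torch.Tensor, dist_maps: torch.Tensor) -> torch.Tensor:
    """Distance-map-weighted loss (illustrative): foreground probability placed
    far outside the ground-truth object is penalized in proportion to its
    distance from the boundary, while probability inside is rewarded, pulling
    predicted contours toward ground-truth contours.

    probs:     (N, H, W) predicted foreground probabilities in [0, 1]
    dist_maps: (N, H, W) precomputed signed distance maps of the GT masks
    """
    return (probs * dist_maps).mean()


# Usage sketch: precompute one distance map per ground-truth mask.
# mask = (labels == 1) as a {0, 1} numpy array of shape (H, W)
# dist = torch.from_numpy(signed_distance_map(mask)).float()
```

In practice, a distance-map term of this kind is typically combined with a regional loss such as Dice or Cross-Entropy in a weighted sum, since the distance term alone can drive predictions toward empty foreground.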
Item Type | Thesis
---|---
Uncontrolled Keywords | Deep learning, computer vision, image segmentation, semantic segmentation, boundary loss, distance transform, distance map, loss function.
Subjects | T Technology > TJ Mechanical engineering and machinery > TJ163.12 Mechatronics
Divisions | Faculty of Engineering and Natural Sciences > Academic programs > Mechatronics; Faculty of Engineering and Natural Sciences
Depositing User | Dila Günay
Date Deposited | 03 Sep 2024 15:11
Last Modified | 03 Sep 2024 15:11
URI | https://research.sabanciuniv.edu/id/eprint/49874