Code underlying the PhD thesis: Label Alchemy: Transforming Noisy Data into Precious Insights in Deep Learning
DOI: 10.4121/b00277a6-9431-47dc-9369-e9a477031e66
Dataset
Labels are essential for training Deep Neural Networks (DNNs): they provide the ground truth that guides learning. Label quality directly affects DNN performance and generalization. Accurate labels foster robust predictions and aid convergence towards an accurate representation of the data distribution, while noisy labels introduce errors that hinder learning. Ensuring label accuracy is therefore vital for effective learning, generalization, and real-world performance. It is, however, also demanding, often requiring considerable time and cost. As datasets grow, methods such as crowdsourcing have gained traction to expedite the labeling process, but this approach is inherently susceptible to errors and inaccuracies. For example, the accuracy of AlexNet in classifying CIFAR-10 images was observed to plummet from 77% to a mere 10% when labels were subjected to random flips. This stark drop exemplifies the influence that corrupted or erroneous labels can exert on DNN performance, and underscores the critical relationship between accurate labels and a DNN's ability to understand and effectively leverage data.
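The random label flips mentioned above (symmetric label noise) can be sketched as follows; this is an illustrative snippet, not the thesis code, and the function name `flip_labels` is our own:

```python
import numpy as np

def flip_labels(labels, noise_rate, num_classes, seed=0):
    """Replace a fraction `noise_rate` of labels with a different class,
    chosen uniformly at random (symmetric label noise). Sketch only."""
    rng = np.random.default_rng(seed)
    noisy = np.asarray(labels).copy()
    n = len(noisy)
    # pick which samples to corrupt, without replacement
    flip_idx = rng.choice(n, size=int(noise_rate * n), replace=False)
    for i in flip_idx:
        # draw the new label from the *other* classes, so it is always wrong
        choices = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(choices)
    return noisy
```

At `noise_rate=1.0` every label is wrong, which corresponds to the regime where CIFAR-10 accuracy collapses to near chance level.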
Ensuring DNN robustness to label noise is vital and involves strategies such as identifying and filtering noisy labels, and integrating noise patterns into training to obtain resilient models. Architectural and loss-function design also combat label-related challenges, enhancing DNN adaptability across applications. This thesis investigates the pivotal role of labels in DNN training and the impact of label quality on model performance. Strategies spanning noise recovery, robust learning frameworks, and multi-label solutions contribute to DNN resilience against noisy labels, advancing both understanding and practical applications.
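One common filtering heuristic alluded to above is small-loss selection: samples with noisy labels tend to incur higher loss early in training, so keeping only the lowest-loss fraction of a batch filters out likely-corrupted labels. A minimal sketch, assuming per-sample losses are already computed (the function name `select_small_loss` is our own):

```python
import numpy as np

def select_small_loss(losses, forget_rate):
    """Return the indices of the (1 - forget_rate) fraction of samples
    with the smallest loss, in ascending index order. Sketch only."""
    losses = np.asarray(losses)
    num_keep = int((1.0 - forget_rate) * len(losses))
    # argsort ascending: smallest losses first
    keep_idx = np.argsort(losses)[:num_keep]
    return np.sort(keep_idx)
```

In practice the kept indices would be used to mask the batch loss before backpropagation, and `forget_rate` is usually ramped up over the first epochs as the estimated noise level.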
**This is the code repository for each chapter of the thesis.**
History
- 2024-04-22 first online, published, posted
Publisher
4TU.ResearchData
Format
compressed files for each chapter (.zip), Python code files (.py)
Organizations
TU Delft, Faculty of Engineering, Mathematics and Computer Science, Distributed Systems Group
DATA
Files (6)
- README.md (1,173 bytes) MD5: f36998e92cf48b454f02875619fda9f4
- chapter2.zip (21,402 bytes) MD5: 9af8d13048c5a11508048fddeb12ad87
- chapter3.zip (8,026,652 bytes) MD5: d440083f9abcc98b5099f6a8c1638356
- chapter4.zip (37,887 bytes) MD5: b34c256fc4d4caeffe9d7a727defbb56
- chapter5.zip (54,891 bytes) MD5: 18d8e26512f6c1790bff33832734da4d
- chapter6.zip (126,943 bytes) MD5: 895d410acfbbc8f494deccdfd4d13fbf
Total: 8,268,948 bytes