Title

Mitigating class imbalance in long-tailed visual recognition through the use of intrinsic dimensionality

Abstract

Natural image datasets used in visual recognition are often imbalanced: the number of samples differs widely between class categories. This problem, commonly called class imbalance, causes deep learning models trained on such datasets to perform sub-optimally on the under-represented classes. Attempts to remedy it include re-sampling, loss re-weighting and other calibration methods, which generally use the number of samples per class as the primary factor in their mitigation strategy and ignore other factors. In this thesis, we argue that model performance on a dataset depends on the difficulty of the individual class categories as well as on the number of samples available for each. We use the concept of intrinsic dimensionality (ID) to express this notion of difficulty, and we explore the different definitions of ID and the strategies for estimating it within a dataset. We further investigate the relationship between ID and class imbalance. Lastly, we report our results on using class-level ID estimates for class imbalance mitigation on long-tailed variations of natural image datasets: MNIST-LT, CIFAR-10-LT and CIFAR-100-LT.

Supervisor(s)

CAGRI ESER

Date and Location

2024-01-11 10:30:00

Category

MSc_Thesis