diff --git a/src/experimentalresults.tex b/src/experimentalresults.tex
index 968b70e..08fbebf 100644
--- a/src/experimentalresults.tex
+++ b/src/experimentalresults.tex
@@ -2,8 +2,8 @@

 \subsection{Does Active-Learning benefit the learning process?}\label{subsec:does-active-learning-benefit-the-learning-process?}

-A test series was performed inside a Jupyter notebook.
-The active learning loop starts with a untrained RESNet-18 model and a random selection of samples.
+A test series was performed with a Jupyter notebook.
+The active learning loop starts with an untrained RESNet-18 model and a random selection of samples.
 The muffin and chihuahua dataset was used for this binary classification task.
 The dataset is split into training and test set which contains $\sim4750$ train- and $\sim1250$ test-images.\cite{muffinsvschiuahuakaggle}
 (see subsection~\ref{subsubsec:muffinvschihuahua} for more infos)
diff --git a/src/implementation.tex b/src/implementation.tex
index e809455..ec42e99 100644
--- a/src/implementation.tex
+++ b/src/implementation.tex
@@ -33,13 +33,13 @@ match predict_mode:
         pass
 \end{lstlisting}

-Moreover, the Dataset was manually imported and preprocessed with random augmentations.
+Moreover, the Dataset was manually imported with the help of a custom torch dataloader and preprocessed with random augmentations.
 After each loop iteration the Area Under the Curve (AUC) was calculated over the validation set to get a performance measure.
 All those AUC were visualized in a line plot, see section~\ref{sec:experimental-results} for the results.

 \subsection{Balanced sample selection}

-To avoid the model to learn only from one class, the sample selection process was balanced as mentioned in paragraph~\ref{par:furtherimprovements}.
+To avoid the model to learn only from one class, the sample selection process was balanced to avoid imbalanced learning as mentioned in paragraph~\ref{par:furtherimprovements}.
 Simply sort by predicted class first and then select the $\mathcal{B}/2$ lowest certain samples per class.
 This should help to balance the sample selection process.

diff --git a/src/materialandmethods.tex b/src/materialandmethods.tex
index 875beca..fe041c5 100644
--- a/src/materialandmethods.tex
+++ b/src/materialandmethods.tex
@@ -5,10 +5,10 @@
 \subsubsection{Muffin vs chihuahua}\label{subsubsec:muffinvschihuahua}
 Muffin vs chihuahua is a free dataset available on Kaggle.
 It consists of $\sim6000$ images of the two classes muffins and chihuahuas.
-The source data is scraped from google images and is split into a training and validation set.
-The trainings set contains $\sim4750$ and test set $\sim1250$ images, overall the two classes are almost balanced.
+The source data is scraped from Google images and is split into a training and validation set.
+The trainings set contains $\sim4750$ and test set $\sim1250$ images, overall the two classes are almost balanced.\cite{muffinsvschiuahuakaggle}
 This is expected to be a relatively hard classification task because the eyes of chihuahuas and chocolate parts of muffins look very similar.
-It is used in this practical work as a binary classification task to evaluate the performance of active learning.\cite{muffinsvschiuahuakaggle}
+It is used in this practical work as a binary classification task to evaluate the performance of active learning.

 \begin{figure}
     \centering
@@ -52,14 +52,14 @@ A Jupyter notebook is a shareable document which combines code and its output, t
 The notebook along with the editor provides a environment for fast prototyping and data analysis.
 It is widely used in the data science, mathematics and machine learning community.

-In the case of this practical work it can be used to test and evaluate the active learning loop before implementing it in a Dagster pipeline. \cite{jupyter}
+In the context of this practical work it can be used to test and evaluate the active learning loop before implementing it in a Dagster pipeline. \cite{jupyter}

 \subsubsection{Active-Learning}

 Active learning is a subfield of supervised learning.
 The key idea is if the algorithm is allowed to choose the data it learns from, it can perform better with less data.
 A supervised classifier requires hundreds or even thousands of labeled samples to perform well.
-Those labeled samples must be manually labeled by an oracle (human expert).\cite{RubensRecSysHB2010}
+Those labeled samples must be manually labeled by an oracle\footnote{Human annotator}.\cite{RubensRecSysHB2010}
 Clearly this results in a huge bottleneck for the training procedure.
 Active learning aims to overcome this bottleneck by selecting the most informative samples to be labeled.\cite{settles.tr09}

@@ -91,7 +91,7 @@ The active learning process can be modeled as a loop as shown in Figure~\ref{fig
 \end{figure}

 The active learning loop starts with the model inference on $\mathcal{S}$ samples.
-The most uncertain samples of size $\mathcal{B}$ are selected and given to the oracle\footnote{Human annotator} for labeling.
+The most uncertain samples of size $\mathcal{B}$ are selected and given to the oracle for labeling.
 Those labeled samples are then used to train the model.
 The loop starts again with the new model and draws new samples from the unlabeled sample set $\mathcal{X}_U$.

@@ -108,36 +108,6 @@ Those two techniques combined yield semi-supervised learning.
 Some of the labels are known, but for most of the data we have only the raw datapoints.
 The basic idea is that the unlabeled data can significantly improve the model performance when used in combination with the labeled data.\cite{Xu_2022_CVPR}

-\subsubsection{ROC and AUC}
-
-A receiver operating characteristic curve can be used to measure the performance of a classifier of a binary classification task.
-When using the accuracy as the performance metric it doesn't reveal much about the balance of the predictions.
-There might be many true-positives and rarely any true-negatives and the accuracy is still good.
-The ROC curve helps with this problem and visualizes the true-positives and false-positives on a line plot.
-The more the curve ascents the upper-left or bottom-right corner the better the classifier gets.
-Figure~\ref{fig:roc-example} shows an example of a ROC curve with differently performing classifiers.
-
-\begin{figure}
-    \centering
-    \includegraphics[width=\linewidth/2]{../rsc/Roc_curve.svg}
-    \caption{ROC curve comparision of two classifiers. \cite{ROCWikipedia}}
-    \label{fig:roc-example}
-\end{figure}
-
-Furthermore, the area under this curve is called AUR curve and a useful metric to measure the performance of a binary classifier. \cite{suptechniques}
-
-\subsubsection{RESNet}
-
-Residual neural networks are a special type of neural network architecture.
-They are especially good for deep learning and have been used in many state-of-the-art computer vision tasks.
-The main idea behind ResNet is the skip connection.
-The skip connection is a direct connection from one layer to another layer which is not the next layer.
-This helps to avoid the vanishing gradient problem and helps with the training of very deep networks.
-ResNet has proven to be very successful in many computer vision tasks and is used in this practical work for the classification task.
-There are several different ResNet architectures, the most common are ResNet-18, ResNet-34, ResNet-50, ResNet-101 and ResNet-152. \cite{resnet}
-
-Since the dataset is relatively small and the two class classification task is relatively easy the ResNet-18 architecture is used in this practical work.
-
 \subsubsection{CNN}
 Convolutional neural networks are especially good model architectures for processing images, speech and audio signals.
 A CNN typically consists of Convolutional layers, pooling layers and fully connected layers.
@@ -159,6 +129,36 @@ Figure~\ref{fig:cnn-architecture} shows a typical binary classification task.
     \label{fig:cnn-architecture}
 \end{figure}

+\subsubsection{RESNet}
+
+Residual neural networks are a special type of neural network architecture.
+They are especially good for deep learning and have been used in many state-of-the-art computer vision tasks.
+The main idea behind ResNet is the skip connection.
+The skip connection is a direct connection from one layer to another layer which is not the next layer.
+This helps to avoid the vanishing gradient problem and helps with the training of very deep networks.
+ResNet has proven to be very successful in many computer vision tasks and is used in this practical work for the classification task.
+There are several different ResNet architectures, the most common are ResNet-18, ResNet-34, ResNet-50, ResNet-101 and ResNet-152. \cite{resnet}
+
+Since the dataset is relatively small and the two class classification task is relatively easy (for such a large model) the ResNet-18 architecture is used in this practical work.
+
+\subsubsection{ROC and AUC}
+
+A receiver operating characteristic curve can be used to measure the performance of a classifier of a binary classification task.
+When using the accuracy as the performance metric it doesn't reveal much about the balance of the predictions.
+There might be many true-positives and rarely any true-negatives and the accuracy is still good.
+The ROC curve helps with this problem and visualizes the true-positives and false-positives on a line plot.
+The more the curve ascents the upper-left or bottom-right corner the better the classifier gets.
+Figure~\ref{fig:roc-example} shows an example of a ROC curve with differently performing classifiers.
+
+\begin{figure}
+    \centering
+    \includegraphics[width=\linewidth/2]{../rsc/Roc_curve.svg}
+    \caption{ROC curve comparision of two classifiers. \cite{ROCWikipedia}}
+    \label{fig:roc-example}
+\end{figure}
+
+Furthermore, the area under this curve is called AUR curve and a useful metric to measure the performance of a binary classifier. \cite{suptechniques}
+
 \subsubsection{Softmax}

 The Softmax function~\eqref{eq:softmax}\cite{liang2017soft} converts $n$ numbers of a vector into a probability distribution.