diff --git a/src/experimentalresults.tex b/src/experimentalresults.tex
index c2a4f1e..a49454d 100644
--- a/src/experimentalresults.tex
+++ b/src/experimentalresults.tex
@@ -92,6 +92,15 @@ Label-Studio provides a great api which can be used to update the predictions of
 
 \subsection{Does balancing the learning samples improve performance?}\label{subsec:does-balancing-the-learning-samples-improve-performance?}
 
+The previous process was modified to balance the classes of the samples handed to the oracle for labelling.
+The idea is that the low-certainty samples might always come from a single class and thus lead to an imbalanced learning process.
+The sample selection was modified as described in~\ref{par:furtherimprovements}.
+
+Unfortunately, this did not improve the convergence speed and seems to make no difference compared to the unbalanced selection.
+This might be because the uncertainty sampling process already balances the draws reasonably well on its own.
+
+% todo insert imgs
+
 Not really.
 % todo add img and add stuff
\ No newline at end of file
diff --git a/src/implementation.tex b/src/implementation.tex
index 7457d97..ac3538d 100644
--- a/src/implementation.tex
+++ b/src/implementation.tex
@@ -35,6 +35,8 @@ match predict_mode:
 Moreover, the dataset was imported manually and preprocessed with random augmentations.
 
+\subsection{Balanced sample selection}
+
 \subsection{Dagster with Label-Studio}\label{subsec:dagster-with-label-studio}
 
 The main goal is to implement an active learning loop with the help of Dagster and Label-Studio.
diff --git a/src/materialandmethods.tex b/src/materialandmethods.tex
index c70ca71..20f681f 100644
--- a/src/materialandmethods.tex
+++ b/src/materialandmethods.tex
@@ -260,7 +260,7 @@ So now we have defined the samples we want to label with $\mathcal{X}_t$ and the
 After labelling, the model $g(\pmb{x};\pmb{w})$ is trained on the newly labeled samples $\mathcal{X}_t$ and the weights $\pmb{w}$ are updated.
 The loop starts again with the new model and draws new unlabeled samples from $\mathcal{X}_U$ as in~\eqref{eq:batchdef}.
 
-\paragraph{Further improvement by class balancing}
+\paragraph{Further improvement by class balancing}\label{par:furtherimprovements}
 An intuitive improvement is to balance the selected samples across the predicted classes.
 The samples selected into $\mathcal{X}_t$ in the active learning step above might all belong to a single class.
 This harms the learning process, because the model might overfit to one class if the same class is selected over and over.
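
For reference, a minimal sketch of the uncertainty-based batch selection that the materialandmethods.tex hunk refers to (drawing $\mathcal{X}_t$ from $\mathcal{X}_U$ as in~\eqref{eq:batchdef}): pick the pool samples whose top softmax probability is lowest. This is an illustration only, assuming softmax outputs from $g(\pmb{x};\pmb{w})$; the names `select_batch`, `probs`, and `model.predict_proba` are hypothetical and not taken from the repository.

```python
import numpy as np

def select_batch(probs: np.ndarray, batch_size: int) -> np.ndarray:
    """Return the pool indices of the `batch_size` least certain samples.

    `probs` holds one softmax row per sample of the unlabeled pool X_U,
    as produced by the model g(x; w).  Certainty is taken as the top
    class probability; low values mean the model is unsure.
    """
    certainty = probs.max(axis=1)               # confidence of the top class
    return np.argsort(certainty)[:batch_size]   # least certain first

# Hypothetical usage; `predict_proba` stands in for evaluating g(x; w):
# probs = model.predict_proba(X_U)
# X_t = X_U[select_batch(probs, batch_size=32)]
```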
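
A sketch of the class-balanced variant described in the "Further improvement by class balancing" paragraph, under one plausible reading: group the pool samples by their predicted class, order each group from least to most certain, and drain the groups round-robin until the batch is full. This is not the repository's implementation, just an illustration of the balancing step.

```python
import numpy as np
from collections import deque

def select_balanced_batch(probs: np.ndarray, batch_size: int) -> np.ndarray:
    """Pick low-certainty samples while balancing the predicted classes.

    Pool samples are grouped by their predicted class, each group is
    ordered from least to most certain, and the groups are drained
    round-robin so no single class dominates the batch handed to the
    oracle for labelling.
    """
    certainty = probs.max(axis=1)
    predicted = probs.argmax(axis=1)

    # One queue per predicted class, least certain samples first.
    queues = []
    for cls in np.unique(predicted):
        idx = np.flatnonzero(predicted == cls)
        queues.append(deque(idx[np.argsort(certainty[idx])]))

    selected = []
    while queues and len(selected) < batch_size:
        for queue in list(queues):              # round-robin over classes
            if len(selected) == batch_size:
                break
            selected.append(int(queue.popleft()))
            if not queue:
                queues.remove(queue)
    return np.asarray(selected)
```

If uncertainty sampling already spreads its draws across the classes, as the experimentalresults.tex hunk observes, both selectors return very similar batches, which would explain why the balancing made no measurable difference.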