add balanced stuff
This commit is contained in:
parent
841c8deb6d
commit
2ff58491b0
@@ -92,6 +92,15 @@ Label-Studio provides a great api which can be used to update the predictions of
\subsection{Does balancing the learning samples improve performance?}\label{subsec:does-balancing-the-learning-samples-improve-performance?}
The previous process was modified by balancing the classes of the samples handed to the oracle for labelling.
The idea is that the low-certainty samples might always belong to a single class and thus lead to an imbalanced learning process.
The sample selection was modified as described in~\ref{par:furtherimprovements}.
Unfortunately, it did not improve the convergence speed and seems to make no difference compared to not balancing.
This might be because the uncertainty sampling process already balances the draws across the classes fairly well on its own.
% todo insert imgs
In short, the answer to the question of this subsection is: not really.
% todo add img and add stuff
@@ -35,6 +35,8 @@ match predict_mode:
Moreover, the dataset was imported manually and preprocessed with random augmentations.
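As a rough, hedged sketch of what such a preprocessing step could look like, assuming a torchvision-style setup (the path and the chosen augmentations are placeholders, not the project's actual values):
\begin{verbatim}
# Sketch of the manual dataset import with random augmentations,
# assuming torchvision; path and augmentation choices are placeholders.
from torchvision import datasets, transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),   # random flip
    transforms.RandomRotation(degrees=10),    # random small rotation
    transforms.ColorJitter(brightness=0.2),   # random brightness change
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("data/train", transform=augment)
\end{verbatim}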
\subsection{Balanced sample selection}
\subsection{Dagster with Label-Studio}\label{subsec:dagster-with-label-studio}
The main goal is to implement an active learning loop with the help of Dagster and Label-Studio.
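As a rough sketch of how such a loop can be wired up (not the project's actual implementation; the dummy pool, the certainty scores and the batch size are placeholder assumptions), Dagster ops can be chained into a job:
\begin{verbatim}
# Sketch of the active learning loop as a Dagster job.
# The pool, scores and batch size are dummy placeholders for the
# real dataset, model predictions and configuration.
from dagster import job, op


@op
def select_uncertain_batch():
    """Pick the unlabeled samples the model is least certain about."""
    pool = list(range(100))                              # placeholder pool
    certainty = [((i * 37) % 100) / 100 for i in pool]   # dummy scores
    ranked = sorted(zip(pool, certainty), key=lambda p: p[1])
    return [sample for sample, _ in ranked[:16]]


@op
def send_to_oracle(batch):
    """Would create Label-Studio tasks (with predictions) for the batch."""
    return batch


@op
def update_model(batch):
    """Would retrain g(x; w) on the newly labelled samples."""
    return len(batch)


@job
def active_learning_loop():
    update_model(send_to_oracle(select_uncertain_batch()))
\end{verbatim}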
@@ -260,7 +260,7 @@ So now we have defined the samples we want to label with $\mathcal{X}_t$ and the
After labelling, the model $g(\pmb{x};\pmb{w})$ is trained on the newly labelled samples $\mathcal{X}_t$ and its weights $\pmb{w}$ are updated.
The loop starts again with the new model and draws new unlabeled samples from $\mathcal{X}_U$ as in~\eqref{eq:batchdef}.
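Stripped of the orchestration, the selection step of one such iteration can be sketched as follows; the dummy pool and the Dirichlet-sampled probabilities are placeholders for the real $\mathcal{X}_U$ and the class probabilities produced by $g(\pmb{x};\pmb{w})$:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Dummy stand-ins: "pool" plays the role of X_U and "probs" the class
# probabilities that g(x; w) would output for each sample.
pool = np.arange(100)
probs = rng.dirichlet(np.ones(3), size=pool.size)

# Least-confidence uncertainty: 1 minus the highest class probability.
uncertainty = 1.0 - probs.max(axis=1)

# Draw X_t: the batch of samples with the highest uncertainty.
batch_size = 16
batch = pool[np.argsort(uncertainty)[-batch_size:]]

# X_t is sent to the oracle, g(x; w) is retrained on it, and the loop
# continues on the remaining unlabeled pool.
remaining_pool = np.setdiff1d(pool, batch)
\end{verbatim}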
\paragraph{Further improvement by class balancing}
\paragraph{Further improvement by class balancing} \label{par:furtherimprovements}
An intuitive improvement is to balance the predicted classes of the selected samples.
The samples $\mathcal{X}_t$ selected in the active learning step above might all belong to one class.
This harms the learning process because the model might overfit to a single class if the same class is selected over and over.
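A minimal sketch of such a balanced selection (a round-robin over the predicted classes of the most uncertain samples; the dummy pool, the class count and the batch size are placeholder assumptions) could look like this:
\begin{verbatim}
from collections import defaultdict

import numpy as np

rng = np.random.default_rng(0)

# Dummy pool and predictions; in the real pipeline the probabilities
# come from g(x; w) and the pool is X_U.
pool = np.arange(200)
probs = rng.dirichlet(np.ones(4), size=pool.size)
pred_class = probs.argmax(axis=1)
certainty = probs.max(axis=1)

# Group the pool by predicted class, least certain samples first.
by_class = defaultdict(list)
for idx in np.argsort(certainty):
    by_class[pred_class[idx]].append(int(pool[idx]))

# Round-robin over the classes so each class contributes to X_t.
batch_size, batch = 16, []
while len(batch) < batch_size and any(by_class.values()):
    for cls in sorted(by_class):
        if by_class[cls] and len(batch) < batch_size:
            batch.append(by_class[cls].pop(0))
\end{verbatim}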