Commit 2bc8c45f9d (parent ef23935c93): add proper section references on every ref
A test series was performed inside a Jupyter notebook.
The active learning loop starts with an untrained ResNet-18 model and a random selection of samples.
The muffin and chihuahua dataset was used for this binary classification task.
The dataset is split into a training and a test set, containing $\sim4750$ training and $\sim1250$ test images (see subsection~\ref{subsec:material-and-methods} for more information).
As a loss function, CrossEntropyLoss was used together with the Adam optimizer and a learning rate of $0.0001$.
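The training itself relies on PyTorch's built-in CrossEntropyLoss and Adam. As a hedged illustration of what the Adam optimizer does with this learning rate, the following pure-Python sketch applies the Adam update rule to a single scalar parameter; the function being minimized is a stand-in, not the actual network loss.

```python
def adam_step(theta, grad, m, v, t, lr=0.0001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter."""
    m = b1 * m + (1 - b1) * grad          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)             # bias correction for the warm-up phase
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (v_hat ** 0.5 + eps)
    return theta, m, v

# Minimize the stand-in objective f(theta) = theta^2 (gradient 2*theta).
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 1001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
print(theta)  # the parameter has moved from 1.0 toward the minimum at 0
```

Because Adam normalizes the gradient by its running second moment, the effective step size stays close to the learning rate, which is why such a small value still makes steady progress.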
Moreover, when increasing the sample-space $\mathcal{S}$ from which the pre-predictions are drawn, the performance improves.
This is because the selected subset $\pmb{x} \sim \mathcal{X}_U$ has a higher chance of containing relevant elements corresponding to the selected metric.
Keep in mind, however, that this improvement comes with a performance penalty, because more model evaluations are required to predict the ranking scores.

Figures~\ref{fig:auc_normal_lowcer_2_10}, \ref{fig:auc_normal_lowcer_2_20} and~\ref{fig:auc_normal_lowcer_2_50} show the AUC curves with a batch size of 2 and a sample size of 10, 20 and 50 respectively.
In all three graphs the active learning curve outperforms the passive learning curve in all four scenarios.
Generally, the higher the sample space $\mathcal{S}$, the better the performance.
Figures~\ref{fig:auc_normal_lowcer_4_16} and~\ref{fig:auc_normal_lowcer_4_24} show the AUC curves with a batch size of 4 and a sample size of 16 and 24 respectively.
The performance is already much worse compared to the results above with a batch size of 2.
Only the low-certainty-first approach outperforms passive learning in both cases.
The other methods are as good as or worse than the passive learning curve.

Figures~\ref{fig:auc_normal_lowcer_8_16} and~\ref{fig:auc_normal_lowcer_8_32} show the AUC curves with a batch size of 8 and a sample size of 16 and 32 respectively.
The performance is even worse compared to the results with a batch size of 4.
This might be because the model already converges to a good AUC value before the same number of pre-predictions is reached.
For smaller projects, a simpler solution inside a notebook or as a plain Python script may be sufficient.

The previous process was improved by balancing the classes given to the oracle for labelling.
The idea is that the low-certainty samples might always be of one class and thus lead to an imbalanced learning process.
The sample selection was modified as described in paragraph~\ref{par:furtherimprovements}.
\begin{figure}
    \centering
    % figure content elided in this excerpt
\end{figure}
Unfortunately, it did not improve the convergence speed: it seems to make no difference compared to not balancing, and is mostly even worse.
This might be because the uncertainty sampling process already balances the draws fairly well by itself.
Figure~\ref{fig:balancedauc} shows the AUC curve with a batch size $\mathcal{B}=4$ and a sample size $\mathcal{S}=24$ for both balanced and unbalanced low-certainty sampling.
The results look similar for the other batch sizes and sample sizes.
Moreover, the dataset was manually imported and preprocessed with random augmentations.
After each loop iteration the Area Under the Curve (AUC) was calculated over the validation set to get a performance measure.
All those AUC values were visualized in a line plot; see section~\ref{sec:experimental-results} for the results.
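The AUC itself was presumably computed with a library routine; as a sketch of what the metric measures, here is a small pure-Python version based on the equivalent Mann-Whitney statistic (the probability that a random positive sample is scored above a random negative one). The example labels and scores are made up.

```python
def roc_auc(labels, scores):
    """ROC AUC for binary labels via the Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs where the positive wins (ties count half).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of $0.5$ corresponds to random guessing and $1.0$ to a perfect ranking, which is why the AUC curves over the loop iterations are a convenient convergence measure.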
\subsection{Balanced sample selection}
To avoid the model learning only from one class, the sample selection process was balanced as mentioned in paragraph~\ref{par:furtherimprovements}.
The samples are simply sorted by predicted class first, and then the $\mathcal{B}/2$ least certain samples per class are selected.
This should help to balance the sample selection process.
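The balanced selection described above can be sketched as follows; the tuple layout and the toy pool are assumptions for illustration, not the actual data structures used in the notebook.

```python
def balanced_selection(samples, batch_size):
    """samples: list of (sample_id, predicted_class, certainty).
    Pick the batch_size/2 least certain samples per class (binary case)."""
    picked = []
    for cls in (0, 1):
        in_class = [s for s in samples if s[1] == cls]   # sort by class first
        in_class.sort(key=lambda s: s[2])                # lowest certainty first
        picked += in_class[: batch_size // 2]            # B/2 per class
    return picked

pool = [("a", 0, 0.9), ("b", 0, 0.2), ("c", 1, 0.4), ("d", 1, 0.8), ("e", 0, 0.5)]
print([s[0] for s in balanced_selection(pool, 4)])  # ['b', 'e', 'c', 'd']
```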
Most of the Python routines implemented in section~\ref{subsec:jupyter} were reused.

\begin{figure}
    % figure content elided in this excerpt
    \label{fig:dagster_assets}
\end{figure}
Figure~\ref{fig:dagster_assets} shows the implemented assets into which the task is split.
Whenever an asset materializes, it is stored in the Dagster database.
This helps to keep track of the data and to rerun the pipeline with the same data.
\textit{train\_sup\_model} is the main asset that trains the model with the labeled samples.
This way, Label-Studio always has samples with scores produced by the current model.

\begin{figure}
    % figure content elided in this excerpt
    \caption{Dagster graph assets}
\end{figure}
Figure~\ref{fig:train_model} shows the train model asset in detail.
It loads the data for training, trains the model and saves the model automatically thanks to the Dagster asset system.
Moreover, testing data is loaded and the model is evaluated on it to get a performance measure.
Figure~\ref{fig:predict_scores} shows the predict scores asset in detail.
It draws $\mathcal{S}$ samples from the unlabeled samples $\mathcal{X}_U$ and predicts their scores.
Then it connects to the Label-Studio API with an API key and updates the scores of the samples.
Does balancing this distribution help the model performance?

In section~\ref{sec:material-and-methods} we talk about the general methods and materials used.
First the problem is modeled mathematically in subsubsection~\ref{subsubsec:mathematicalmodeling} and then implemented and benchmarked in a Jupyter notebook in subsubsection~\ref{subsubsec:jupyternb}.
Section~\ref{sec:implementation} gives deeper insights into the implementation for the interested reader, with some code snippets.
The experimental results in section~\ref{sec:experimental-results} are presented with clear figures illustrating the performance of active learning across different sample sizes and batch sizes.
The conclusion in subsection~\ref{subsec:conclusion} provides an overview of the findings, highlighting the benefits of active learning.
Additionally, the outlook in subsection~\ref{subsec:outlook} suggests avenues for future research which are not covered in this work.
Those labeled samples must be manually labeled by an oracle (human expert).\cite{settles.tr09}
Clearly this results in a huge bottleneck for the training procedure.
Active learning aims to overcome this bottleneck by selecting the most informative samples to be labeled.\cite{settles.tr09}

The active learning process can be modeled as a loop, as shown in Figure~\ref{fig:active-learning-workflow}.
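As a hedged sketch of this loop in code: the scoring and the oracle below are stand-ins (random scores, a toy labeling rule), not the actual ResNet-18 pipeline, but the control flow mirrors the workflow in the figure.

```python
import random

def active_learning_loop(unlabeled, oracle, batch_size=2, sample_size=10, rounds=3):
    """Toy active learning loop: score a sample of the pool, have the oracle
    label the lowest-certainty batch, move it to the labeled set, repeat."""
    labeled = []
    for _ in range(rounds):
        pool = random.sample(unlabeled, min(sample_size, len(unlabeled)))
        # Stand-in certainty scores; in reality these come from the model.
        scored = sorted(pool, key=lambda x: random.random())
        batch = scored[:batch_size]              # low certainty first
        for x in batch:
            labeled.append((x, oracle(x)))       # oracle provides the label
            unlabeled.remove(x)
        # ... retrain the model on `labeled` here ...
    return labeled, unlabeled

random.seed(0)
labeled, unlabeled = active_learning_loop(list(range(20)), oracle=lambda x: x % 2)
print(len(labeled), len(unlabeled))  # 6 14
```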
\begin{figure}
    \centering
    \begin{tikzpicture}[node distance=2cm]
        % tikz diagram elided in this excerpt
    \end{tikzpicture}
\end{figure}

Pooling layers sample down the feature maps created by the convolutional layers.
This helps reduce the computational complexity of the overall network and mitigates overfitting.
Common pooling layers include average and max pooling.
Finally, after some convolution layers, the feature map is flattened and passed to a network of fully connected layers to perform a classification or regression task.
Figure~\ref{fig:cnn-architecture} shows a typical binary classification task.
\cite{cnnintro}
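To make the pooling operation concrete, here is a pure-Python sketch of $2\times2$ max pooling with stride 2 on a small feature map; this illustrates the principle only and is not the library implementation used in the network.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 over a 2D feature map (list of lists).
    Each output cell keeps only the strongest activation of its 2x2 window."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 3]]
print(max_pool_2x2(fmap))  # [[4, 2], [2, 7]]
```

Each dimension is halved, which is where the reduction in computational cost comes from.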
\begin{figure}
    % figure content elided in this excerpt
\end{figure}

That means taking the absolute value of the prediction minus the class center.
\cite{activelearning}
With the help of this metric the pseudo predictions can be sorted by the score $S(z)$.
We define $\text{min}_n(S)$ and $\text{max}_n(S)$ in equation~\ref{eq:minnot} and equation~\ref{eq:maxnot} respectively, as a short form for taking a subset of the minimum or maximum of a set.
\begin{equation}\label{eq:minnot}
    \text{min}_n(S) \coloneqq a \subset S \mid \text{where } a \text{ are the } n \text{ smallest numbers of } S
\end{equation}

\begin{equation}\label{eq:maxnot}
    \text{max}_n(S) \coloneqq a \subset S \mid \text{where } a \text{ are the } n \text{ largest numbers of } S
\end{equation}
This notation helps to define which subsets of samples to give the user for labeling.
There are different ways in which this subset can be chosen.
In this PW we do the obvious experiments with low-certainty first (paragraph~\ref{par:low-certainty-first}) and high-certainty first (paragraph~\ref{par:high-certainty-first}).
Furthermore, we test the two mixtures between them: half high- and half low-certainty samples, and only the middle section of the sorted certainty scores.
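A small pure-Python sketch may make the score and the $\text{min}_n$/$\text{max}_n$ notation concrete; the class center of $0.5$ for a binary output in $[0,1]$ and the toy predictions are assumptions for illustration.

```python
def score(pred, center=0.5):
    """Certainty score S(z): distance of the prediction from the class center."""
    return abs(pred - center)

def min_n(scores, n):
    """The n smallest elements of a score set (cf. eq. minnot)."""
    return sorted(scores)[:n]

def max_n(scores, n):
    """The n largest elements of a score set (cf. eq. maxnot)."""
    return sorted(scores)[-n:]

preds = [0.1, 0.5, 0.75, 0.9]
scores = [score(p) for p in preds]
print(min_n(scores, 2))  # [0.0, 0.25] -> least certain: low-certainty first
print(max_n(scores, 2))  # [0.4, 0.4]  -> most certain: high-certainty first
```

Low-certainty first then corresponds to labeling the samples behind $\text{min}_n(S)$, and high-certainty first to those behind $\text{max}_n(S)$.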
\paragraph{Low certainty first}\label{par:low-certainty-first}