fix some typos and add some remaining sources
parent 419d06e6b9
commit 5d6e8177da
@ -2,7 +2,7 @@
\subsection{Does Active-Learning benefit the learning process?}\label{subsec:does-active-learning-benefit-the-learning-process?}
With the test setup described in~\ref{sec:implementation} a test series was performed.
With the test setup described in section~\ref{sec:implementation}, a test series was performed.
Several different batch sizes $\mathcal{B} = \left\{ 2,4,6,8 \right\}$ were tested, each combined with sample sizes $\mathcal{S} = \left\{ 2\mathcal{B}_i,4\mathcal{B}_i,5\mathcal{B}_i,10\mathcal{B}_i \right\}$
that depend on the chosen batch size.
We define the baseline (passive learning) AUC curve as the curve obtained from the supervised learning process without any active learning.
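For illustration, the evaluated configurations can be enumerated as in the following minimal Python sketch (plain Python, not the actual test harness from section~\ref{sec:implementation}):

\begin{verbatim}
# Enumerate the evaluated configurations: every batch size in B is combined
# with sample sizes that are fixed multiples of that batch size.
batch_sizes = [2, 4, 6, 8]
sample_factors = [2, 4, 5, 10]

configs = [(b, f * b) for b in batch_sizes for f in sample_factors]
# e.g. batch size 4 is paired with the sample sizes 8, 16, 20 and 40
print(configs)
\end{verbatim}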
@ -89,7 +89,7 @@ Dagster provides a clean way to build pipelines and to keep track of the data in
Label-Studio provides a convenient API which can be used to update the model predictions from the Dagster pipeline.
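As an illustration, a minimal sketch of how such an update could look with the Label-Studio Python SDK (the URL, API key, project id and result format are placeholders and depend on the actual labeling configuration):

\begin{verbatim}
from label_studio_sdk import Client

# Placeholder connection details -- adjust to the actual Label-Studio instance.
ls = Client(url="http://localhost:8080", api_key="<API_KEY>")
project = ls.get_project(1)

for task in project.get_unlabeled_tasks():
    score = 0.42  # placeholder for the score predicted by the model for this task
    project.create_prediction(
        task_id=task["id"],
        result=[{"from_name": "label", "to_name": "image", "type": "choices",
                 "value": {"choices": ["positive"]}}],
        score=score,
    )
\end{verbatim}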
Overall, this option should only be chosen if the solution needs to be scalable and deployed in the cloud.
For smaller projects a simpler solution just in an notebook or as a simple python script might be more appropriate.
For smaller projects, a simpler solution in a notebook or as a simple Python script might be more appropriate.
\subsection{Does balancing the learning samples improve performance?}\label{subsec:does-balancing-the-learning-samples-improve-performance?}
@ -76,7 +76,7 @@ Most of the python routines implemented in section~\ref{subsec:jupyter} were reu
\end{figure}
Figure~\ref{fig:dagster_assets} shows the implemented assets into which the task is split.
Whenever a asset materializes it is stored in the Dagster database.
Whenever an asset materializes it is stored in the Dagster database.
This helps to keep track of the data and to rerun the pipeline with the same data.
\textit{train\_sup\_model} is the main asset that trains the model with the labeled samples.
\textit{inference\_unlabeled\_samples} is the asset that predicts the scores for the unlabeled samples and updates them with the Label-Studio API.
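A condensed sketch of how these two assets could be wired up with Dagster's \texttt{@asset} decorator (the model setup and the Label-Studio helper are placeholders, not the actual implementation):

\begin{verbatim}
from dagster import asset

@asset
def train_sup_model(labeled_samples):
    # Train the supervised model on the currently labeled samples.
    model = build_model()  # placeholder for the actual model setup
    model.fit(labeled_samples)
    return model

@asset
def inference_unlabeled_samples(train_sup_model, unlabeled_samples):
    # Predict scores for the unlabeled samples and push them to Label-Studio.
    scores = train_sup_model.predict(unlabeled_samples)
    push_predictions_to_label_studio(unlabeled_samples, scores)  # placeholder helper
    return scores
\end{verbatim}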
@ -22,8 +22,8 @@ Does balancing this distribution help the model performance?
\subsection{Outline}\label{subsec:outline}
In section~\ref{sec:material-and-methods} we discuss the general methods and materials used.
First the problem is modeled mathematically in~\ref{subsubsec:mathematicalmodeling} and then implemented and benchmarked in a Jupyter notebook~\ref{subsubsec:jupyternb}.
First the problem is modeled mathematically in section~\ref{subsubsec:mathematicalmodeling} and then implemented and benchmarked in a Jupyter notebook (section~\ref{subsubsec:jupyternb}).
Section~\ref{sec:implementation} gives deeper insight into the implementation for the interested reader, with some code snippets.
The experimental results~\ref{sec:experimental-results} are well-presented with clear figures illustrating the performance of active learning across different sample sizes and batch sizes.
The experimental results in section~\ref{sec:experimental-results} are presented with clear figures illustrating the performance of active learning across different sample sizes and batch sizes.
The conclusion in section~\ref{subsec:conclusion} provides an overview of the findings, highlighting the benefits of active learning.
Additionally, the outlook section~\ref{subsec:outlook} suggests avenues for future research which are not covered in this work.
@ -139,6 +139,7 @@ This helps reducing the computational complexity of the overall network and help
Common pooling layers include average- and max pooling.
Finally, after some convolution layers the feature map is flattened and passed to a network of fully connected layers to perform a classification or regression task.
Figure~\ref{fig:cnn-architecture} shows a typical binary classification task.
\cite{cnnintro}
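For illustration, a minimal PyTorch sketch of such an architecture (the layer sizes are arbitrary, assume $3\times32\times32$ input images, and are not the ones used in this work):

\begin{verbatim}
import torch.nn as nn

# Minimal CNN for binary classification: convolution + pooling blocks,
# followed by flattening and a small fully connected classifier head.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 64), nn.ReLU(),   # 32 x 8 x 8 feature map for 32 x 32 inputs
    nn.Linear(64, 1), nn.Sigmoid(),         # single probability for the binary task
)
\end{verbatim}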
\begin{figure}
\centering
@ -156,6 +157,8 @@ Its a generalization of the Sigmoid function and often used as an Activation Lay
\end{equation}
The softmax function closely resembles the Boltzmann distribution, which was first introduced in the 19$^{\textrm{th}}$ century~\cite{Boltzmann}.
\subsubsection{Cross Entropy Loss}
Cross Entropy Loss is a well-established loss function in machine learning.
Equation~\eqref{eq:crelformal} shows the formal general definition of the Cross Entropy Loss.
@ -167,7 +170,7 @@ And~\eqref{eq:crelbinary} is the special case of the general Cross Entropy Loss
\mathcal{L}(p,q) &= - \frac{1}{\mathcal{B}} \sum_{i=1}^{\mathcal{B}} (p_i \log q_i + (1-p_i) \log(1-q_i))\label{eq:crelbinarybatch}
\end{align}
$\mathcal{L}(p,q)$~\eqref{eq:crelbinarybatch} is the Binary Cross Entropy Loss for a batch of size $\mathcal{B}$ and used for model training in this PW.
$\mathcal{L}(p,q)$~\eqref{eq:crelbinarybatch} is the Binary Cross Entropy Loss for a batch of size $\mathcal{B}$ and is used for model training in this PW~\cite{crossentropy}.
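As a small numerical illustration of~\eqref{eq:crelbinarybatch}, a NumPy sketch (not the training code used in this PW):

\begin{verbatim}
import numpy as np

def binary_cross_entropy(p, q):
    # Binary Cross Entropy Loss averaged over a batch of labels p and
    # predicted probabilities q.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return -np.mean(p * np.log(q) + (1 - p) * np.log(1 - q))

# Example: batch of size 4 with labels p and predicted probabilities q
print(binary_cross_entropy([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]))  # approx. 0.30
\end{verbatim}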

\subsubsection{Mathematical modeling of the problem}\label{subsubsec:mathematicalmodeling}
@ -142,4 +142,28 @@ doi = {10.1007/978-0-387-85820-3_23}
howpublished = "\url{https://cointelegraph.com/explained/what-are-convolutional-neural-networks}",
year = {2024},
note = "[Online; accessed 12-April-2024]"
}
}

@misc{cnnintro,
  title = {An Introduction to Convolutional Neural Networks},
  author = {Keiron O'Shea and Ryan Nash},
  year = {2015},
  eprint = {1511.08458},
  archivePrefix = {arXiv},
  primaryClass = {cs.NE}
}

@article{crossentropy,
  ISSN = {00359246},
  URL = {http://www.jstor.org/stable/2984087},
  abstract = {This paper deals first with the relationship between the theory of probability and the theory of rational behaviour. A method is then suggested for encouraging people to make accurate probability estimates, a connection with the theory of information being mentioned. Finally Wald's theory of statistical decision functions is summarised and generalised and its relation to the theory of rational behaviour is discussed.},
  author = {I. J. Good},
  journal = {Journal of the Royal Statistical Society. Series B (Methodological)},
  number = {1},
  pages = {107--114},
  publisher = {[Royal Statistical Society, Wiley]},
  title = {Rational Decisions},
  urldate = {2024-05-23},
  volume = {14},
  year = {1952}
}