fix some typos and add some remaining sources

lukas-heilgenbrunner 2024-05-23 22:28:31 +02:00
parent 419d06e6b9
commit 5d6e8177da
5 changed files with 34 additions and 7 deletions

@@ -2,7 +2,7 @@
 \subsection{Does Active-Learning benefit the learning process?}\label{subsec:does-active-learning-benefit-the-learning-process?}
-With the test setup described in~\ref{sec:implementation} a test series was performed.
+With the test setup described in section~\ref{sec:implementation} a test series was performed.
 Several different batch sizes $\mathcal{B} = \left\{ 2,4,6,8 \right\}$ and sample sizes $\mathcal{S} = \left\{ 2\mathcal{B}_i,4\mathcal{B}_i,5\mathcal{B}_i,10\mathcal{B}_i \right\}$
 dependent on the selected batch size were selected.
 We define the baseline (passive learning) AUC curve as the supervised learning process without any active learning.
@@ -89,7 +89,7 @@ Dagster provides a clean way to build pipelines and to keep track of the data in
 Label-Studio provides a great api which can be used to update the predictions of the model from the dagster pipeline.
 Overall this option should just be chosen if the solution needs to be scalable and deployed in the cloud.
-For smaller projects a simpler solution just in an notebook or as a simple python script might be more appropriate.
+For smaller projects a simpler solution just in a notebook or as a simple python script might be more appropriate.
 \subsection{Does balancing the learning samples improve performance?}\label{subsec:does-balancing-the-learning-samples-improve-performance?}

@@ -76,7 +76,7 @@ Most of the python routines implemented in section~\ref{subsec:jupyter} were reu
 \end{figure}
 \ref{fig:dagster_assets} shows the implemented assets in which the task is split.
-Whenever a asset materializes it is stored in the Dagster database.
+Whenever an asset materializes it is stored in the Dagster database.
 This helps to keep track of the data and to rerun the pipeline with the same data.
 \textit{train\_sup\_model} is the main asset that trains the model with the labeled samples.
 \textit{inference\_unlabeled\_samples} is the asset that predicts the scores for the unlabeled samples und updates them with the Label-Studio API.
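
As a purely illustrative aside to the asset names mentioned in this hunk: a minimal Dagster asset pair along these lines could look as follows. This is a sketch under assumptions, not the code changed in this commit; the data, model and scores are stubbed placeholders, and in the real pipeline the scores are pushed to Label-Studio via its API rather than just returned.

# Illustrative sketch only -- not part of this diff. Assumes the dagster package;
# data, model and scoring are stubbed placeholders.
from dagster import asset, materialize

@asset
def train_sup_model():
    """Train the supervised model on the currently labeled samples (stubbed here)."""
    labeled_samples = [(0.1, 0), (0.9, 1)]   # placeholder for the real labeled data
    return {"threshold": 0.5}                # placeholder for a trained model

@asset
def inference_unlabeled_samples(train_sup_model):
    """Score the unlabeled samples with the trained model.

    In the real pipeline these scores would be sent to Label-Studio via its API;
    here they are only returned so the sketch stays self-contained.
    """
    unlabeled_samples = [0.2, 0.7, 0.5]      # placeholder for the real unlabeled data
    t = train_sup_model["threshold"]
    return [abs(x - t) for x in unlabeled_samples]  # placeholder confidence score

if __name__ == "__main__":
    # Materializing the assets records each run in Dagster's asset catalog.
    materialize([train_sup_model, inference_unlabeled_samples])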

@@ -22,8 +22,8 @@ Does balancing this distribution help the model performance?
 \subsection{Outline}\label{subsec:outline}
 In section~\ref{sec:material-and-methods} we talk about general methods and materials used.
-First the problem is modeled mathematically in~\ref{subsubsec:mathematicalmodeling} and then implemented and benchmarked in a Jupyter notebook~\ref{subsubsec:jupyternb}.
+First the problem is modeled mathematically in section~\ref{subsubsec:mathematicalmodeling} and then implemented and benchmarked in a Jupyter notebook~\ref{subsubsec:jupyternb}.
 Section~\ref{sec:implementation} gives deeper insights to the implementation for the interested reader with some code snippets.
-The experimental results~\ref{sec:experimental-results} are well-presented with clear figures illustrating the performance of active learning across different sample sizes and batch sizes.
+The experimental results in~\ref{sec:experimental-results} are well-presented with clear figures illustrating the performance of active learning across different sample sizes and batch sizes.
 The conclusion~\ref{subsec:conclusion} provides an overview of the findings, highlighting the benefits of active learning.
 Additionally the outlook section~\ref{subsec:outlook} suggests avenues for future research which are not covered in this work.

@@ -139,6 +139,7 @@ This helps reducing the computational complexity of the overall network and help
 Common pooling layers include average- and max pooling.
 Finally, after some convolution layers the feature map is flattened and passed to a network of fully connected layers to perform a classification or regression task.
 \ref{fig:cnn-architecture} shows a typical binary classification task.
+\cite{cnnintro}
 \begin{figure}
     \centering
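
The lines above summarize the typical CNN layout (convolution, pooling, flattening, fully connected classifier head) that \cite{cnnintro} is now cited for. A minimal sketch of such a binary classifier, assuming PyTorch and a 28x28 single-channel input (neither is confirmed by the diff; this is not the model used in the thesis), could look like this:

# Illustrative sketch of the described CNN layout -- assumption for illustration only.
import torch
import torch.nn as nn

class TinyBinaryCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # convolution: feature extraction
            nn.ReLU(),
            nn.MaxPool2d(2),                              # max pooling: halves spatial size
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                 # flatten the feature map
            nn.Linear(16 * 7 * 7, 2),                     # fully connected head, two classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))

logits = TinyBinaryCNN()(torch.randn(4, 1, 28, 28))       # batch of four 28x28 grayscale images
print(logits.shape)                                        # torch.Size([4, 2])
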
@@ -156,6 +157,8 @@ Its a generalization of the Sigmoid function and often used as an Activation Lay
 \end{equation}
 The softmax function has high similarities with the Boltzmann distribution and was first introduced in the 19$^{\textrm{th}}$ century~\cite{Boltzmann}.
+\subsubsection{Cross Entropy Loss}
 Cross Entropy Loss is a well established loss function in machine learning.
 \eqref{eq:crelformal} shows the formal general definition of the Cross Entropy Loss.
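
A brief note on the "generalization of the Sigmoid function" statement in the hunk header above (added here for illustration, not part of the diff): for two classes the softmax reduces exactly to the sigmoid,

\[
\operatorname{softmax}(x)_1 = \frac{e^{x_1}}{e^{x_1} + e^{x_2}} = \frac{1}{1 + e^{-(x_1 - x_2)}} = \sigma(x_1 - x_2).
\]
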
@@ -167,7 +170,7 @@ And~\eqref{eq:crelbinary} is the special case of the general Cross Entropy Loss
     \mathcal{L}(p,q) &= - \frac1N \sum_{i=1}^{\mathcal{B}} (p_i \log q_i + (1-p_i) \log(1-q_i))\label{eq:crelbinarybatch}
 \end{align}
-$\mathcal{L}(p,q)$~\eqref{eq:crelbinarybatch} is the Binary Cross Entropy Loss for a batch of size $\mathcal{B}$ and used for model training in this PW.
+$\mathcal{L}(p,q)$~\eqref{eq:crelbinarybatch} is the Binary Cross Entropy Loss for a batch of size $\mathcal{B}$ and used for model training in this PW.\cite{crossentropy}
 \subsubsection{Mathematical modeling of problem}\label{subsubsec:mathematicalmodeling}
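
As a quick numerical illustration of the batched Binary Cross Entropy Loss \eqref{eq:crelbinarybatch} shown in the hunk above, taking $N$ as the batch size (a sketch only, not part of the commit; labels and predictions are made up):

# Numeric sketch of eq:crelbinarybatch, averaging over the batch:
# L(p, q) = -(1/N) * sum_i [ p_i*log(q_i) + (1 - p_i)*log(1 - q_i) ]
import math

def binary_cross_entropy(p, q):
    """p: true labels in {0, 1}; q: predicted probabilities in (0, 1)."""
    return -sum(pi * math.log(qi) + (1 - pi) * math.log(1 - qi)
                for pi, qi in zip(p, q)) / len(p)

labels = [1, 0, 1, 1]               # p_i (made-up example)
predictions = [0.9, 0.2, 0.7, 0.6]  # q_i (made-up example)
print(round(binary_cross_entropy(labels, predictions), 3))  # 0.299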

@@ -143,3 +143,27 @@ doi = {10.1007/978-0-387-85820-3_23}
 year = {2024},
 note = "[Online; accessed 12-April-2024]"
 }
+
+@misc{cnnintro,
+title = {An Introduction to Convolutional Neural Networks},
+author = {Keiron O'Shea and Ryan Nash},
+year = {2015},
+eprint = {1511.08458},
+archivePrefix = {arXiv},
+primaryClass = {cs.NE}
+}
+
+@article{crossentropy,
+ISSN = {00359246},
+URL = {http://www.jstor.org/stable/2984087},
+abstract = {This paper deals first with the relationship between the theory of probability and the theory of rational behaviour. A method is then suggested for encouraging people to make accurate probability estimates, a connection with the theory of information being mentioned. Finally Wald's theory of statistical decision functions is summarised and generalised and its relation to the theory of rational behaviour is discussed.},
+author = {I. J. Good},
+journal = {Journal of the Royal Statistical Society. Series B (Methodological)},
+number = {1},
+pages = {107--114},
+publisher = {[Royal Statistical Society, Wiley]},
+title = {Rational Decisions},
+urldate = {2024-05-23},
+volume = {14},
+year = {1952}
+}