\section{Material and Methods}\label{sec:material-and-methods}
\subsection{Material}\label{subsec:material}
\subsubsection{Dagster}
\subsubsection{Label Studio}
\subsubsection{PyTorch}
\subsubsection{MVTec}
\subsubsection{ImageNet}
\subsubsection{Anomalib}
% todo maybe remove?
\subsection{Methods}\label{subsec:methods}
\subsubsection{Active Learning}
\subsubsection{Semi-Supervised Learning}
In traditional supervised learning we have a labeled dataset.
Each datapoint is associated with a corresponding target label.
The goal is to fit a model that predicts the labels from the datapoints.

In traditional unsupervised learning there are also datapoints, but no labels are known.
The goal is to find patterns or structures in the data, which can be used for example for clustering or dimensionality reduction.

Combining these two settings yields semi-supervised learning.
Some of the labels are known, but for most of the data only the raw datapoints are available.
The basic idea is that the unlabeled data, used in combination with the labeled data, can significantly improve the model performance.
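A common flavour of semi-supervised learning is self-training, where a model is repeatedly retrained on its own confident predictions for the unlabeled datapoints.
A minimal sketch of this idea is given below; it assumes scikit-learn as an additional dependency, marks unlabeled datapoints with the label $-1$, and uses made-up data for illustration only.

\begin{verbatim}
# Minimal self-training sketch with scikit-learn (assumed dependency).
# Unlabeled datapoints carry the label -1; the data is illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = np.array([[0.0], [0.1], [0.3], [0.6], [0.8], [1.0]])
y = np.array([0, 0, -1, -1, -1, 1])  # only three datapoints are labeled

# The base classifier is retrained on its own confident pseudo-labels.
model = SelfTrainingClassifier(LogisticRegression(), threshold=0.6)
model.fit(X, y)

print(model.predict(X))  # predictions for all datapoints, labeled or not
\end{verbatim}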
\subsubsection{ROC and AUC}
A receiver operating characteristic (ROC) curve can be used to measure the performance of a classifier on a binary classification task.
Accuracy alone reveals little about the balance of the predictions:
a classifier may produce many true positives but hardly any true negatives and still reach a high accuracy.
The ROC curve addresses this problem by plotting the true-positive rate against the false-positive rate for varying classification thresholds.
The closer the curve approaches the upper-left corner, the better the classifier; a curve along the diagonal corresponds to random guessing.
An example is shown in Figure~\ref{fig:roc-example}.
\begin{figure}
\centering
\includegraphics[width=\linewidth]{../rsc/Roc_curve.svg}
\caption{Example of a receiver operating characteristic (ROC) curve.}
\label{fig:roc-example}
\end{figure}

Furthermore, the area under this curve, called AUC, is a useful scalar metric to measure the performance of a binary classifier.
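A minimal sketch of how both the ROC curve and the AUC can be computed is given below; it assumes scikit-learn as an additional dependency and uses made-up labels and scores for illustration only.

\begin{verbatim}
# Minimal sketch: ROC curve and AUC with scikit-learn (assumed dependency).
# The ground-truth labels and predicted scores are illustrative only.
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.55]

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # points of the ROC curve
auc = roc_auc_score(y_true, y_score)               # area under that curve

print(f"AUC = {auc:.3f}")
\end{verbatim}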
\subsubsection{ResNet}
\subsubsection{CNN}
Convolutional neural networks (CNNs) are model architectures that are especially well suited for processing images, speech and audio signals.
A CNN typically consists of convolutional layers, pooling layers and fully connected layers.
A convolutional layer contains a set of learnable kernels (filters).
Each filter performs a convolution operation by sliding a window over the image.
At every position, the dot product between the filter and the underlying image patch yields one entry of the resulting feature map.
Convolutional layers capture features like edges, textures or shapes.
Pooling layers downsample the feature maps created by the convolutional layers.
This reduces the computational complexity of the overall network and helps to prevent overfitting.
Common pooling operations include average and max pooling.
Finally, after several convolutional layers the feature maps are flattened and passed to a network of fully connected layers that performs the classification or regression task.
Figure~\ref{fig:cnn-architecture} shows such an architecture for a typical binary classification task.
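A minimal sketch of such a CNN in PyTorch is given below; the input size of $32\times32$ RGB images and the two output classes are illustrative assumptions and not taken from the experiments in this work.

\begin{verbatim}
# Minimal CNN sketch in PyTorch; layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learnable filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # downsample feature maps
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                # flatten the feature maps
            nn.Linear(32 * 8 * 8, num_classes),          # fully connected head
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = SmallCNN()
logits = model(torch.randn(1, 3, 32, 32))  # one random 32x32 RGB image
print(logits.shape)  # torch.Size([1, 2])
\end{verbatim}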
\begin{figure}
\centering
\includegraphics[width=\linewidth]{../rsc/cnn_architecture}
\caption{Architecture of a convolutional neural network. Image by \href{https://cointelegraph.com/explained/what-are-convolutional-neural-networks}{SKY ENGINE AI}}
\label{fig:cnn-architecture}
\end{figure}
\subsubsection{Softmax}
The softmax function converts a vector of $K$ real numbers into a probability distribution.
It is a generalization of the sigmoid function and is often used as an activation layer in neural networks.
\begin{equation}\label{eq:softmax}
\sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \quad \text{for } j \in \{1,\dots,K\}
\end{equation}

The softmax function is closely related to the Boltzmann distribution, which was first introduced in the 19$^{\textrm{th}}$ century~\cite{Boltzmann}.
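A small numerical illustration of Equation~\ref{eq:softmax} is given below, written as a numerically stabilized variant; NumPy is an assumed dependency and the input vector is made up for illustration.

\begin{verbatim}
# Minimal sketch: numerically stable softmax with NumPy (assumed dependency).
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max()             # subtract the maximum for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()  # normalize so the entries sum to 1

print(softmax(np.array([2.0, 1.0, 0.1])))  # approx. [0.659  0.242  0.099]
\end{verbatim}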
\subsubsection{Cross Entropy Loss}
% todo maybe remove this
\subsubsection{Adam}