
\section{Material and Methods}\label{sec:material-and-methods}

\subsection{Material}\label{subsec:material}

\subsubsection{MVTec AD}\label{subsubsec:mvtecad}
MVTec AD is a dataset for benchmarking anomaly detection methods with a focus on industrial inspection.
It contains over 5000 high-resolution images divided into fifteen different object and texture categories.
Each category comprises a set of defect-free training images and a test set of images with various kinds of defects as well as images without defects.

% todo source for https://www.mvtec.com/company/research/datasets/mvtec-ad

% todo example image
%\begin{figure}
%    \centering
%    \includegraphics[width=\linewidth/2]{../rsc/muffin_chiauaua_poster}
%    \caption{Sample images from dataset. \cite{muffinsvschiuahuakaggle_poster}}
%    \label{fig:roc-example}
%\end{figure}


\subsection{Methods}\label{subsec:methods}

\subsubsection{Dagster}
\subsubsection{Label-Studio}

\subsubsection{Jupyter Notebook}\label{subsubsec:jupyternb}

A Jupyter notebook is a shareable document that combines code and its output with text and visualizations.
Together with its editor, the notebook provides an environment for fast prototyping and data analysis.
It is widely used in the data science, mathematics and machine learning communities.

In the context of this practical work, it can be used to test and evaluate the active learning loop before implementing it in a Dagster pipeline~\cite{jupyter}.

\subsubsection{CNN}
Convolutional neural networks (CNNs) are model architectures that are particularly well suited to processing images, speech and audio signals.
A CNN typically consists of convolutional layers, pooling layers and fully connected layers.
A convolutional layer consists of a set of learnable kernels (filters).
Each filter performs a convolution operation by sliding a small window across the image.
At each position, the dot product between the filter weights and the underlying image patch yields one entry of the feature map.
Convolutional layers capture features such as edges, textures or shapes.
Pooling layers downsample the feature maps created by the convolutional layers.
This reduces the computational complexity of the overall network and helps to mitigate overfitting.
Common pooling layers include average pooling and max pooling.
Finally, after several convolutional layers, the feature map is flattened and passed to a network of fully connected layers to perform a classification or regression task.
Figure~\ref{fig:cnn-architecture} shows a typical CNN architecture for a binary classification task~\cite{cnnintro}.

\begin{figure}
    \centering
    \includegraphics[width=\linewidth]{../rsc/cnn_architecture}
    \caption{Architecture of a convolutional neural network~\cite{cnnarchitectureimg}.}
    \label{fig:cnn-architecture}
\end{figure}
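
The sliding-window operation described above can be written out as a short sketch.
The symbols below are introduced here only for illustration: a single-channel input $X$ of size $H \times W$, one filter $F$ of size $k \times k$, stride 1 and no padding are assumed.
Each entry of the feature map $Y$ is then the dot product between the filter and the image patch beneath it:
\begin{equation*}
    Y_{i,j} = \sum_{m=1}^{k} \sum_{n=1}^{k} F_{m,n} \, X_{i+m-1,\, j+n-1}
    \quad \text{for } i \in \{1,\dots,H-k+1\},\; j \in \{1,\dots,W-k+1\}
\end{equation*}
Under these assumptions, a $3 \times 3$ filter applied to a $224 \times 224$ image yields a $222 \times 222$ feature map.
Strictly speaking, this formula is a cross-correlation, which is how deep learning frameworks commonly implement convolutional layers.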

\subsubsection{ResNet}

Residual neural networks (ResNets) are a neural network architecture designed for training very deep models.
They have been used in many state-of-the-art computer vision tasks.
The main idea behind ResNet is the skip connection.
A skip connection bypasses one or more layers by adding the input of a block directly to its output.
This mitigates the vanishing gradient problem and eases the training of very deep networks.
ResNet has proven to be very successful in many computer vision tasks and is used in this practical work for the classification task.
There are several ResNet architectures; the most common are ResNet-18, ResNet-34, ResNet-50, ResNet-101 and ResNet-152~\cite{resnet}.
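
The effect of a skip connection can be made concrete with a short sketch.
In the usual formulation of a residual block, writing $\mathbf{x}$ for the input of the block and $\mathcal{F}$ for the layers that the connection bypasses (notation introduced here for illustration), the block output is
\begin{equation*}
    \mathbf{y} = \mathcal{F}(\mathbf{x}) + \mathbf{x}
\end{equation*}
Differentiating with respect to the input gives
\begin{equation*}
    \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \frac{\partial \mathcal{F}(\mathbf{x})}{\partial \mathbf{x}} + I
\end{equation*}
where $I$ is the identity matrix.
The identity term gives the gradient a direct path through the block even when the contribution of $\mathcal{F}$ becomes very small, which is one common explanation for why such networks remain trainable at large depth.
This sketch assumes that $\mathbf{x}$ and $\mathcal{F}(\mathbf{x})$ have the same dimensions; otherwise the skip path needs a projection.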

Since the dataset is relatively small and the two-class classification task is relatively easy, ResNet-18, the smallest of these architectures, is used in this practical work.

\subsubsection{Softmax}

The softmax function~\eqref{eq:softmax}~\cite{liang2017soft} converts a vector $\mathbf{z}$ of $K$ real numbers into a probability distribution.
It is a generalization of the sigmoid function and is often used as an activation layer in neural networks.
\begin{equation}\label{eq:softmax}
\sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^K e^{z_k}} \quad \text{for } j \in \{1,\dots,K\}
\end{equation}

The softmax function closely resembles the Boltzmann distribution, which was first introduced in the 19$^{\textrm{th}}$ century~\cite{Boltzmann}.
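
Two short worked examples may help to illustrate equation~\eqref{eq:softmax}; the numbers are chosen purely for illustration.
For $K = 2$, dividing numerator and denominator by $e^{z_1}$ recovers the sigmoid function applied to the difference of the two inputs:
\begin{equation*}
    \sigma(\mathbf{z})_1 = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}}
\end{equation*}
For $K = 3$ and $\mathbf{z} = (1, 2, 3)$, the exponentials are approximately $2.718$, $7.389$ and $20.086$ with sum $30.193$, so that
\begin{equation*}
    \sigma(\mathbf{z}) \approx (0.090,\; 0.245,\; 0.665)
\end{equation*}
The entries are positive and sum to one, and the largest input receives the largest probability.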


\subsubsection{Cross Entropy Loss}
Cross Entropy Loss is a well-established loss function in machine learning.
Equation~\eqref{eq:crelformal}~\cite{crossentropy} shows the general definition of the Cross Entropy Loss between a target distribution $p$ and a predicted distribution $q$.
Equation~\eqref{eq:crelbinary} is the special case for binary classification tasks.

\begin{align}
    H(p,q) &= -\sum_{x\in\mathcal{X}} p(x)\, \log q(x)\label{eq:crelformal}\\
    H(p,q) &= - (p \log q + (1-p) \log(1-q))\label{eq:crelbinary}\\
    \mathcal{L}(p,q) &= - \frac{1}{\mathcal{B}} \sum_{i=1}^{\mathcal{B}} (p_i \log q_i + (1-p_i) \log(1-q_i))\label{eq:crelbinarybatch}
\end{align}

Equation~\eqref{eq:crelbinarybatch}~\cite{handsonaiI} defines the Binary Cross Entropy Loss $\mathcal{L}(p,q)$ for a batch of size $\mathcal{B}$, which is used for model training in this practical work.
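
The step from equation~\eqref{eq:crelformal} to equation~\eqref{eq:crelbinary} and a small numeric example can be sketched as follows; the natural logarithm and the example values are assumptions made here for illustration.
In the binary case $\mathcal{X} = \{0, 1\}$, and writing $p(1) = p$, $p(0) = 1 - p$, $q(1) = q$ and $q(0) = 1 - q$, the sum in equation~\eqref{eq:crelformal} has only two terms:
\begin{equation*}
    H(p,q) = -\bigl(p(1) \log q(1) + p(0) \log q(0)\bigr) = -\bigl(p \log q + (1-p) \log(1-q)\bigr)
\end{equation*}
For a batch of two samples with labels $p_1 = 1$, $p_2 = 0$ and predictions $q_1 = 0.9$, $q_2 = 0.2$, averaging the two per-sample losses gives
\begin{equation*}
    -\frac{1}{2} \bigl(\log 0.9 + \log 0.8\bigr) \approx -\frac{1}{2} (-0.105 - 0.223) = 0.164
\end{equation*}
A confident wrong prediction is penalized far more strongly: for $p = 1$ and $q = 0.1$, the per-sample loss is $-\log 0.1 \approx 2.303$.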

\subsubsection{Mathematical modeling of the problem}\label{subsubsec:mathematicalmodeling}