setup skeleton for work

2024-09-30 15:39:19 +02:00
parent 7d81c43e7c
commit 75f5756e07
8 changed files with 1319 additions and 1 deletions
@@ -26,7 +26,7 @@ jobs:
        run: |
          cd src
          pdflatex -interaction=nonstopmode -halt-on-error -file-line-error main.tex
-          bibtex main
+          bibtex sources
          pdflatex -interaction=nonstopmode -halt-on-error -file-line-error main.tex
          pdflatex -interaction=nonstopmode -halt-on-error -file-line-error main.tex

@@ -0,0 +1,5 @@
+\section{Conclusion and Outlook}\label{sec:conclusion-and-outlook}
+
+\subsection{Conclusion}\label{subsec:conclusion}
+
+\subsection{Outlook}\label{subsec:outlook}
@@ -0,0 +1,16 @@
+\section{Experimental Results}\label{sec:experimental-results}
+
+\subsubsection{Is Few-Shot learning a suitable fit for anomaly detection?}
+
+Should Few-Shot learning be used for anomaly detection tasks?
+How does it compare to well-established algorithms such as PatchCore or EfficientAD?
+
+\subsubsection{How does an imbalanced number of shots affect performance?}
+Does giving the Few-Shot learner more good samples than bad samples improve the model performance?
+
+\subsubsection{How do the three methods (ResNet, CAML, \pmf) perform in detecting only the anomaly class?}
+How much does the performance improve if the task is reduced to deciding whether an anomaly is present or not?
+How does it compare to PatchCore and EfficientAD?
+
+\subsubsection{Extra: How does Euclidean distance compare to cosine similarity when using ResNet as a feature extractor?}
+Different distance measures were tried, but the results are nearly identical.
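+
+For reference, and with $\mathbf{a}$ and $\mathbf{b}$ denoting two feature vectors (notation introduced here for illustration), the two measures are
+\begin{align}
+    d(\mathbf{a},\mathbf{b}) &= \lVert \mathbf{a}-\mathbf{b} \rVert_2\\
+    \cos(\mathbf{a},\mathbf{b}) &= \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert_2\,\lVert\mathbf{b}\rVert_2}
+\end{align}
+If the feature vectors are normalized to unit length, which is an assumption here, both measures induce the same ranking because $\lVert \mathbf{a}-\mathbf{b} \rVert_2^2 = 2 - 2\cos(\mathbf{a},\mathbf{b})$.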
@@ -0,0 +1,12 @@
+\section{Implementation}\label{sec:implementation}
+
+\subsection{Jupyter}\label{subsec:jupyter}
+
+To get accurate performance measures, the active-learning process was first implemented in a Jupyter notebook.
+This helps to determine which of the methods performs best and which one to use in the final Dagster pipeline.
+A straightforward machine-learning pipeline was implemented with the help of PyTorch and ResNet-18.
+
+Moreover, the dataset was manually imported with the help of a custom PyTorch dataloader and preprocessed with random augmentations.
+After each loop iteration, the Area Under the Curve (AUC) was calculated over the validation set to get a performance measure.
+The AUC values were visualized in a line plot; see Section~\ref{sec:experimental-results} for the results.
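+
+As a sketch of what this measure expresses, and assuming that the curve in question is the ROC curve and that anomalies are treated as the positive class, the AUC over a validation set with positive samples $P$ and negative samples $N$ can be written as
+\begin{equation}
+    \mathrm{AUC} = \frac{1}{|P|\,|N|} \sum_{x^{+}\in P} \sum_{x^{-}\in N} \mathbf{1}\left[s(x^{+}) > s(x^{-})\right],
+\end{equation}
+where $s(\cdot)$ is the score assigned by the model and ties are usually counted as $\tfrac{1}{2}$.
+The symbols $P$, $N$ and $s$ are introduced here for illustration only.
+A value of $0.5$ corresponds to a random ranking and $1.0$ to a perfect separation of both classes.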
+
@@ -140,6 +140,11 @@
 \maketitle
 \fi
 \input{introduction}
+\input{materialandmethods}
+\input{implementation}
+\input{experimentalresults}
+\input{conclusionandoutlook}
+
 %% The next two lines define the bibliography style to be used, and
 %% the bibliography file.
 \bibliographystyle{ACM-Reference-Format}
@@ -0,0 +1,91 @@
+\section{Material and Methods}\label{sec:material-and-methods}
+
+\subsection{Material}\label{subsec:material}
+
+\subsubsection{MVTec AD}\label{subsubsec:mvtecad}
+MVTec AD is a dataset for benchmarking anomaly detection methods with a focus on industrial inspection.
+It contains over 5000 high-resolution images divided into fifteen different object and texture categories.
+Each category comprises a set of defect-free training images and a test set of images with various kinds of defects as well as images without defects.
+
+% todo source for https://www.mvtec.com/company/research/datasets/mvtec-ad
+
+% todo example image
+%\begin{figure}
+%    \centering
+%    \includegraphics[width=\linewidth/2]{../rsc/muffin_chiauaua_poster}
+%    \caption{Sample images from dataset. \cite{muffinsvschiuahuakaggle_poster}}
+%    \label{fig:roc-example}
+%\end{figure}
+
+
+\subsection{Methods}\label{subsec:methods}
+
+\subsubsection{Dagster}
+\subsubsection{Label-Studio}
+
+\subsubsection{Jupyter Notebook}\label{subsubsec:jupyternb}
+
+A Jupyter notebook is a shareable document which combines code and its output, text and visualizations.
+The notebook, along with the editor, provides an environment for fast prototyping and data analysis.
+It is widely used in the data science, mathematics and machine learning community.
+
+In the context of this practical work, it can be used to test and evaluate the active learning loop before implementing it in a Dagster pipeline~\cite{jupyter}.
+
+\subsubsection{CNN}
+Convolutional neural networks (CNNs) are particularly well suited for processing images, speech and audio signals.
+A CNN typically consists of convolutional layers, pooling layers and fully connected layers.
+A convolutional layer consists of a set of learnable kernels (filters).
+Each filter performs a convolution operation by sliding a window over the image.
+At each position, the dot product between the filter and the underlying image patch yields one entry of a feature map.
+Convolutional layers capture features like edges, textures or shapes.
+Pooling layers downsample the feature maps created by the convolutional layers.
+This reduces the computational complexity of the overall network and helps to mitigate overfitting.
+Common pooling layers include average pooling and max pooling.
+Finally, after several convolutional layers the feature map is flattened and passed to a network of fully connected layers to perform a classification or regression task.
+Figure~\ref{fig:cnn-architecture} shows a typical architecture for a binary classification task~\cite{cnnintro}.
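+
+As a sketch with symbols introduced here for illustration ($I$ for the input image, $W$ for a $k\times k$ kernel and $F$ for the resulting feature map, bias omitted), one entry of the feature map is computed as
+\begin{equation}
+    F(i,j) = \sum_{u=0}^{k-1}\sum_{v=0}^{k-1} W(u,v)\, I(i+u, j+v).
+\end{equation}
+Assuming a stride of one and no padding, an $H\times H$ input yields an $(H-k+1)\times(H-k+1)$ feature map.
+For example, a $32\times 32$ image convolved with a $3\times 3$ kernel gives a $30\times 30$ feature map, which a $2\times 2$ max pooling layer with stride two reduces to $15\times 15$.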
+
+\begin{figure}
+    \centering
+    \includegraphics[width=\linewidth]{../rsc/cnn_architecture}
+    \caption{Architecture of a convolutional neural network~\cite{cnnarchitectureimg}.}
+    \label{fig:cnn-architecture}
+\end{figure}
+
+\subsubsection{ResNet}
+
+Residual neural networks (ResNets) are a special type of neural network architecture.
+They make very deep networks trainable and have been used in many state-of-the-art computer vision tasks.
+The main idea behind ResNet is the skip connection.
+A skip connection is a direct connection from one layer to a later layer, bypassing the layers in between.
+This mitigates the vanishing gradient problem and helps with the training of very deep networks.
+ResNet has proven to be very successful in many computer vision tasks and is used in this practical work for the classification task.
+There are several different ResNet architectures; the most common are ResNet-18, ResNet-34, ResNet-50, ResNet-101 and ResNet-152~\cite{resnet}.
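+
+As a sketch, with $\mathbf{x}$ denoting the input of a residual block and $\mathcal{F}$ the layers that are bypassed (notation introduced here for illustration), the block computes
+\begin{equation}
+    \mathbf{y} = \mathcal{F}(\mathbf{x}) + \mathbf{x},
+\end{equation}
+so that, assuming input and output have the same dimensions, the Jacobian is
+\begin{equation}
+    \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \frac{\partial \mathcal{F}(\mathbf{x})}{\partial \mathbf{x}} + \mathbf{I}.
+\end{equation}
+The identity term $\mathbf{I}$ lets the gradient pass through the block even when the gradient of $\mathcal{F}$ becomes very small.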
+
+Since the dataset is relatively small and the two-class classification task is relatively easy (for such a large model), the ResNet-18 architecture is used in this practical work.
+
+\subsubsection{Softmax}
+
+The softmax function~\eqref{eq:softmax}~\cite{liang2017soft} converts a vector of $K$ real numbers into a probability distribution.
+It is a generalization of the sigmoid function and is often used as an activation function in neural networks.
+\begin{equation}\label{eq:softmax}
+\sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^K e^{z_k}} \quad \text{for } j \in \{1,\dots,K\}
+\end{equation}
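+
+As a small worked example, with values chosen here for illustration only, the vector $\mathbf{z} = (1, 2, 3)$ is mapped to
+\begin{equation}
+    \sigma(\mathbf{z}) = \frac{(e^{1}, e^{2}, e^{3})}{e^{1}+e^{2}+e^{3}} \approx (0.090, 0.245, 0.665),
+\end{equation}
+whose entries are positive and sum to one.
+For $K=2$ the first component reduces to the sigmoid function of the difference of both inputs:
+\begin{equation}
+    \sigma(\mathbf{z})_1 = \frac{e^{z_1}}{e^{z_1}+e^{z_2}} = \frac{1}{1+e^{-(z_1-z_2)}}.
+\end{equation}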
+
+The softmax function is closely related to the Boltzmann distribution, which was first introduced in the 19$^{\textrm{th}}$ century~\cite{Boltzmann}.
+
+
+\subsubsection{Cross Entropy Loss}
+Cross Entropy Loss is a well-established loss function in machine learning.
+Equation~\eqref{eq:crelformal}~\cite{crossentropy} shows the general formal definition of the Cross Entropy Loss.
+Equation~\eqref{eq:crelbinary} is its special case for binary classification tasks.
+
+\begin{align}
+    H(p,q) &= -\sum_{x\in\mathcal{X}} p(x)\, \log q(x)\label{eq:crelformal}\\
+    H(p,q) &= - (p \log q + (1-p) \log(1-q))\label{eq:crelbinary}\\
+    \mathcal{L}(p,q) &= - \frac{1}{\mathcal{B}} \sum_{i=1}^{\mathcal{B}} (p_i \log q_i + (1-p_i) \log(1-q_i))\label{eq:crelbinarybatch}
+\end{align}
+
+Equation~\eqref{eq:crelbinarybatch}~\cite{handsonaiI} gives the Binary Cross Entropy Loss $\mathcal{L}(p,q)$ averaged over a batch of size $\mathcal{B}$; it is used for model training in this practical work.
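+
+As a worked example, with values chosen for illustration only and assuming the natural logarithm, consider a batch of two samples with labels $p_1 = 1$, $p_2 = 0$ and predictions $q_1 = 0.9$, $q_2 = 0.2$:
+\begin{equation}
+    \mathcal{L}(p,q) = -\frac{1}{2}\left(\log 0.9 + \log 0.8\right) \approx \frac{1}{2}\left(0.105 + 0.223\right) \approx 0.164.
+\end{equation}
+A confident wrong prediction is penalized much more strongly: for $p = 1$ and $q = 0.1$, equation~\eqref{eq:crelbinary} gives $-\log 0.1 \approx 2.303$.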
+
+\subsubsection{Mathematical modeling of problem}\label{subsubsec:mathematicalmodeling}