= Material and Methods

== Material

=== MVTec AD

MVTec AD is a dataset for benchmarking anomaly detection methods with a focus on industrial inspection.
It contains over 5000 high-resolution images divided into fifteen different object and texture categories.
Each category comprises a set of defect-free training images and a test set of images with various kinds of defects as well as images without defects.

#figure(
  image("rsc/dataset_overview_large.png", width: 80%),
  caption: [Sample images from the MVTec AD dataset. #cite(<datasetsampleimg>)],
) <datasetoverview>

// todo
Todo: describe which categories are used in this bachelor thesis and how many samples there are.

== Methods

=== Few-Shot Learning

Few-shot learning is a subfield of machine learning that aims to train a classification model with only a few samples, or even none at all.
This is in contrast to traditional supervised learning, where a large amount of labeled data is required to generalize well to unseen data.
As a consequence, the model is prone to overfitting to the few training samples.

Typically, a few-shot learning task consists of a support set and a query set.
The support set contains the training data, while the query set contains the evaluation data that mimics the real-world use case.
A common way to describe a few-shot learning problem is the n-way k-shot notation.
For example, a task with 3 target classes and 5 training samples per class is a 3-way 5-shot classification problem.
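
To make the episode structure concrete, the following is a minimal sketch (assuming PyTorch tensors; the image size and class count are purely illustrative) of how the support and query sets of a single 3-way 5-shot episode could be arranged:

```python
import torch

# Illustrative 3-way 5-shot episode with 10 query images per class.
n_way, k_shot, n_query = 3, 5, 10

# Support set: k_shot labeled images for each of the n_way classes.
support_images = torch.randn(n_way * k_shot, 3, 224, 224)         # 15 images
support_labels = torch.arange(n_way).repeat_interleave(k_shot)    # [0,0,0,0,0,1,1,...]

# Query set: the images the model has to classify during evaluation.
query_images = torch.randn(n_way * n_query, 3, 224, 224)          # 30 images
```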

A classical example of how such a model might work is a prototypical network.
These models learn a representation of each class and classify new examples based on proximity to these representations in an embedding space.

#figure(
  image("rsc/prototype_fewshot_v3.png", width: 60%),
  caption: [Prototypical network for few-shot learning. #cite(<snell2017prototypicalnetworksfewshotlearning>)],
) <prototypefewshot>

The first and simplest method of this bachelor thesis uses a plain ResNet to calculate those embeddings and is essentially a simple prototypical network.
See //%todo link to this section
// todo proper source
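
To illustrate the idea, the following is a minimal sketch of such a prototype-based classifier (the pretrained ResNet-50 backbone from torchvision and the Euclidean distance are assumptions made for this example, not necessarily the exact setup used later):

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights

# Pretrained ResNet-50 without its classification head, used as embedding network.
backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def embed(images):                                    # images: (N, 3, H, W)
    return backbone(images)                           # -> (N, 2048) embeddings

@torch.no_grad()
def classify(support_images, support_labels, query_images, n_way):
    z_support = embed(support_images)
    z_query = embed(query_images)
    # Prototype of a class = mean embedding of its support samples.
    prototypes = torch.stack([z_support[support_labels == c].mean(dim=0)
                              for c in range(n_way)])
    # Assign every query image to the class with the nearest prototype.
    distances = torch.cdist(z_query, prototypes)       # (n_query, n_way)
    return distances.argmin(dim=1)
```

In a full prototypical network the embedding network would additionally be trained episodically; the sketch is only meant to show the classification rule.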
=== Generalisation from few samples

An especially hard task is to generalize from such few samples.
In typical supervised learning the model sees thousands or millions of samples of the corresponding domain during training.
This helps the model to learn the underlying patterns and to generalize well to unseen data.
In few-shot learning the model has to generalize from just a few samples.

=== PatchCore

// todo: also show values for how they perform on MVTec AD
todo stuff #cite(<patchcorepaper>)
// https://arxiv.org/pdf/2106.08265

=== EfficientAD

todo stuff #cite(<efficientADpaper>)
// https://arxiv.org/pdf/2303.14535

=== Jupyter Notebook

A Jupyter notebook is a shareable document that combines code and its output, text and visualizations.
Together with its editor, the notebook provides an environment for fast prototyping and data analysis.
It is widely used in the data science, mathematics and machine learning community.

In the context of this bachelor thesis it was used to test, evaluate and compare the three few-shot learning methods. #cite(<jupyter>)

=== CNN

Convolutional neural networks are model architectures that are especially well suited for processing images, speech and audio signals.
A CNN typically consists of convolutional layers, pooling layers and fully connected layers.
A convolutional layer consists of a set of learnable kernels (filters).
Each filter performs a convolution operation by sliding a window over the image.
At each position, the dot product between the filter and the underlying image patch produces one entry of the resulting feature map.
Convolutional layers capture features like edges, textures or shapes.
Pooling layers downsample the feature maps created by the convolutional layers.
This reduces the computational complexity of the overall network and helps against overfitting.
Common pooling layers include average and max pooling.
Finally, after several convolutional layers the feature maps are flattened and passed to a network of fully connected layers to perform a classification or regression task.
@cnnarchitecture shows a typical CNN architecture for a binary classification task.
#cite(<cnnintro>)

#figure(
  image("rsc/cnn_architecture.png", width: 80%),
  caption: [Architecture of a convolutional neural network. #cite(<cnnarchitectureimg>)],
) <cnnarchitecture>

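As a minimal sketch of how these building blocks fit together (assuming PyTorch; the layer sizes are illustrative and not the architecture used in this thesis), a small CNN for binary classification could look like this:

```python
import torch.nn as nn

# Two convolution/pooling stages followed by fully connected layers,
# producing a single logit for binary classification of 224x224 RGB images.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # learnable filters -> 16 feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample feature maps by factor 2
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),                                 # flatten feature maps for the dense part
    nn.Linear(32 * 56 * 56, 64),                  # 224 -> 112 -> 56 after two poolings
    nn.ReLU(),
    nn.Linear(64, 1),                             # single logit for the binary decision
)
```
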
=== ResNet

Residual neural networks are a special type of neural network architecture.
They are especially well suited for training deep networks and have been used in many state-of-the-art computer vision tasks.
The main idea behind ResNet is the skip connection.
A skip connection is a direct connection from one layer to a later layer, bypassing the layers in between.
This helps to avoid the vanishing gradient problem and eases the training of very deep networks.
ResNet has proven to be very successful in many computer vision tasks and is used in this practical work for the classification task.
There are several different ResNet architectures, the most common being ResNet-18, ResNet-34, ResNet-50, ResNet-101 and ResNet-152. #cite(<resnet>)
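
The skip connection can be illustrated with a simplified residual block (a sketch only; the actual ResNet blocks additionally use batch normalization and, where needed, a projection on the shortcut):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Simplified residual block: output = ReLU(F(x) + x)."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)   # skip connection: the input bypasses both conv layers
```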

For this bachelor thesis the ResNet-50 architecture was used to compute the corresponding embeddings for the few-shot learning methods.

=== CAML
Todo

=== P$>$M$>$F

Todo

=== Softmax

The softmax function @softmax #cite(<liang2017soft>) converts a vector of $n$ real numbers into a probability distribution.
It is a generalization of the sigmoid function and is often used as an activation layer in neural networks.

$
sigma(bold(z))_j = (e^(z_j)) / (sum_(k=1)^n e^(z_k)) "for" j in {1, ..., n}
$ <softmax>

The softmax function is closely related to the Boltzmann distribution and was first introduced in the 19th century #cite(<Boltzmann>).

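A short illustration of @softmax in NumPy (using the common numerically stable variant that subtracts the maximum before exponentiating):

```python
import numpy as np

def softmax(z):
    """Convert a vector of n real numbers into a probability distribution."""
    z = z - np.max(z)           # shift for numerical stability, does not change the result
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

print(softmax(np.array([1.0, 2.0, 3.0])))   # approx. [0.09, 0.24, 0.67], sums to 1
```
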
=== Cross Entropy Loss

Cross Entropy Loss is a well established loss function in machine learning.
@crel #cite(<crossentropy>) shows the formal general definition of the Cross Entropy Loss.
The second line is the special case of the general Cross Entropy Loss for binary classification tasks.

$
H(p,q) &= -sum_(x in cal(X)) p(x) log q(x)\
H(p,q) &= -(p log(q) + (1-p) log(1-q))\
cal(L)(p,q) &= -1/cal(B) sum_(i=1)^(cal(B)) (p_i log(q_i) + (1-p_i) log(1-q_i))
$ <crel>

The last line, $cal(L)(p,q)$ #cite(<handsonaiI>), is the Binary Cross Entropy Loss for a batch of size $cal(B)$ and is used for model training in this practical work.

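A small sketch of the batched binary form in NumPy (in practice a framework implementation such as PyTorch's `BCELoss` would be used instead):

```python
import numpy as np

def binary_cross_entropy(p, q, eps=1e-12):
    """Binary Cross Entropy averaged over a batch.

    p: true labels in {0, 1}, q: predicted probabilities in (0, 1).
    """
    q = np.clip(q, eps, 1 - eps)   # avoid log(0)
    return -np.mean(p * np.log(q) + (1 - p) * np.log(1 - q))

labels = np.array([1.0, 0.0, 1.0])
predictions = np.array([0.9, 0.2, 0.6])
print(binary_cross_entropy(labels, predictions))   # approx. 0.28
```
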
=== Mathematical modeling of the problem
== Alternative Methods

There are several alternative methods for few-shot learning which are not used in this bachelor thesis.