fix stuff from prof

This commit is contained in:
lukas-heiligenbrunner 2025-02-04 20:07:59 +01:00
parent 7681b4afce
commit c8ac00571b
6 changed files with 27 additions and 27 deletions

View File

@@ -1,7 +1,7 @@
= Conclusion and Outlook <sectionconclusionandoutlook>
== Conclusion
In conclusion one can say that Few-Shot learning is not the best choice for anomaly detection tasks.
-It is hugely outperformed by state of the art algorithms like PatchCore or EfficientAD.
+It is hugely outperformed by state of the art algorithms like PatchCore@patchcorepaper or EfficientAD@efficientADpaper.
The only benefit of Few-Shot learning is that it can be used in environments where only a limited number of good samples are available.
But this should not be the case in most scenarios.
Most of the time plenty of good samples are available and in this case PatchCore or EfficientAD should perform great.
@@ -10,8 +10,8 @@ The only case where Few-Shot learning could be used is in a scenarios where one
PatchCore and EfficientAD can only detect if an anomaly is present or not but not what type of anomaly it actually is.
So chaining a Few-Shot learner after PatchCore or EfficientAD could be a good idea to use the best of both worlds.
-In most of the tests P>M>F performed the best.
-But also the simple ResNet50 method performed better than expected in most cases and can be considered if the computational resources are limited and if a simple architecture is enough.
+In most of the tests P>M>F@pmfpaper performed the best.
+But also the simple ResNet50@resnet method performed better than expected in most cases and can be considered if the computational resources are limited and if a simple architecture is enough.
== Outlook
In the future, when new Few-Shot learning methods evolve, it could be interesting to test again how they perform in anomaly detection tasks.

View File

@@ -9,7 +9,7 @@ How does it compare to well established algorithms such as PatchCore or Efficien
@comparison2waybottle shows the performance of the 2-way classification (anomaly or not) on the bottle class and @comparison2waycable the same on the cable class.
The performance values are the same as in @experiments but just merged together into one graph.
-As a reference PatchCore reaches an AUROC score of 99.6% and EfficientAD reaches 99.8% averaged over all classes provided by the MVTec AD dataset.
+As a reference PatchCore@patchcorepaper reaches an AUROC score of 99.6% and EfficientAD@efficientADpaper reaches 99.8% averaged over all classes provided by the MVTec AD dataset.
Both are trained with samples from the 'good' class only.
So there is a clear performance gap between Few-Shot learning and the state of the art anomaly detection algorithms.
In @comparison2way PatchCore and EfficientAD are not included as they aren't directly comparable in the same fashion.
@@ -29,7 +29,7 @@ That means if the goal is just to detect anomalies, Few-Shot learning is not the
)
== How does disbalancing the Shot number affect performance?
-_Does giving the Few-Shot learner more good than bad samples improve the model performance?_
+_Does giving the Few-Shot learner a higher proportion of normal (non-anomalous) samples compared to anomalous samples improve the model's performance?_
As the results of all three methods in @experiments show, the performance of the Few-Shot learner decreases with an increasing number of good samples.
This result is unexpected (since one might think that more samples always perform better) but aligns with the idea that all classes should always be as balanced as possible.
@@ -64,11 +64,11 @@ Which is an result that is unexpected (since one can think more samples perform
Clearly all four graphs show that the performance decreases with an increasing number of good samples.
So the conclusion is that the Few-Shot learner should always be trained with as balanced classes as possible.
-== How do the 3 (ResNet, CAML, P>M>F) methods perform in only detecting the anomaly class?
-_How much does the performance improve by only detecting the presence of an anomaly?
+== How do the 3 (ResNet, CAML, P>M>F) methods perform in distinguishing between different anomaly types?
+_And how much does the performance improve by only detecting the presence of an anomaly?
How does it compare to PatchCore and EfficientAD#todo[Maybe remove comparison?]?_
-@comparisonnormal shows graphs comparing the performance of the ResNet, CAML and P>M>F methods in detecting the anomaly class only including the good class as well as excluding the good class.
+@comparisonnormal shows graphs comparing the performance of the ResNet@resnet, CAML@caml_paper and P>M>F@pmfpaper methods in detecting the anomaly class only including the good class as well as excluding the good class.
P>M>F performs in almost all cases better than ResNet and CAML.
P>M>F reaches up to 78% accuracy in the bottle class (@comparisonnormalbottle) and 46% in the cable class (@comparisonnormalcable) when detecting all classes including good ones
and 84% in the bottle class (@comparisonfaultyonlybottle) and 51% in the cable class (@comparisonfaultyonlycable) when excluding the good class.

View File

@@ -4,7 +4,7 @@
#import "@preview/subpar:0.1.1"
= Implementation <sectionimplementation>
-The three methods described (ResNet50, CAML, P>M>F) were implemented in a Jupyter notebook and compared to each other.
+The three methods described (ResNet50@resnet, CAML@caml_paper, P>M>F@pmfpaper) were implemented in a Jupyter notebook and compared to each other.
== Experiments <experiments>
For all of the three methods we test the following use-cases:
@@ -29,7 +29,7 @@ The rest of the images was used to test the model and measure the accuracy.
== ResNet50 <resnet50impl>
=== Approach
-The simplest approach is to use a pretrained ResNet50 model as a feature extractor.
+The simplest approach is to use a pretrained ResNet50@resnet model as a feature extractor.
From both the support and query set the features are extracted to get a downprojected representation of the images.
After downprojection the support set embeddings are compared to the query set embeddings.
To predict the class of a query, the class with the smallest distance to the support embedding is chosen.
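The nearest-class-center approach this hunk describes can be sketched as follows. This is a minimal sketch assuming the support and query embeddings have already been extracted (in the thesis with a pretrained ResNet50); the function names are illustrative, not from the thesis code:

```python
import numpy as np

def class_prototypes(support_emb, support_labels):
    """Mean embedding (class center) per class, prototypical-network style."""
    classes = np.unique(support_labels)
    protos = np.stack([support_emb[support_labels == c].mean(axis=0) for c in classes])
    return classes, protos

def predict(query_emb, classes, protos):
    """Assign each query to the class whose center is closest (Euclidean distance)."""
    dists = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]
```

With two well-separated toy clusters, queries near a cluster are assigned to that cluster's class.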
@@ -136,7 +136,7 @@ but this is expected as the cable class consists of 8 faulty classes.
== P>M>F
=== Approach
-For P>M>F, I used the pretrained model weights from the original paper.
+For P>M>F@pmfpaper, I used the pretrained model weights from the original paper.
As backbone feature extractor a DINO model is used, which was pretrained by Facebook.
This is a vision transformer with a patch size of 16 and 12 attention heads learned in a self-supervised fashion.
This feature extractor was meta-trained with 10 public image datasets #footnote[ImageNet-1k, Omniglot, FGVC-
@@ -182,7 +182,7 @@ So it is clearly a bad idea to add more good shots to the support set.
== CAML
=== Approach
-For the CAML implementation I used the pretrained model weights from the original paper.
+For the CAML@caml_paper implementation I used the pretrained model weights from the original paper.
The non-causal sequence model (transformer) is pretrained with every class having the same number of shots.
This brings the limitation that it can only process default few-shot learning tasks in the n-way k-shot fashion, since it expects the input sequence to contain the same number of shots for every class.
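The balanced-episode constraint described here can be sketched as follows: an imbalanced pool of samples is subsampled so every class contributes exactly k shots, as CAML's sequence model expects. The helper and the class pools are hypothetical illustrations, not the thesis code:

```python
import random

def balanced_episode(samples_by_class, k, seed=0):
    """Build an n-way k-shot support set: every class must contribute exactly
    k shots, so larger classes are randomly subsampled down to k."""
    rng = random.Random(seed)
    episode = {}
    for cls, samples in samples_by_class.items():
        if len(samples) < k:
            raise ValueError(f"class {cls!r} has only {len(samples)} samples, need {k}")
        episode[cls] = rng.sample(samples, k)
    return episode
```

Classes with fewer than k samples cannot be balanced this way, which is exactly the limitation that rules out disbalanced shot experiments for CAML.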

View File

@@ -8,12 +8,12 @@ Machine learning helped the field to advance a lot in the past.
Most of the time the error rate is sub $0.1%$ and therefore plenty of good data and almost no faulty data is available.
So the train data is heavily unbalanced.~#cite(<parnami2022learningexamplessummaryapproaches>)
-PatchCore and EfficientAD are state of the art algorithms trained only on good data and then detect anomalies within unseen (but similar) data.
+PatchCore@patchcorepaper and EfficientAD@efficientADpaper are state of the art algorithms trained only on good data and then detect anomalies within unseen (but similar) data.
One of their problems is the need for large amounts of training data and long training times.
Moreover, a slight change of the camera position or the lighting conditions can lead to a mandatory complete retraining of the model.
Few-Shot learning might be a suitable alternative with hugely lowered train times and fast adaption to new conditions.~#cite(<efficientADpaper>)#cite(<patchcorepaper>)#cite(<parnami2022learningexamplessummaryapproaches>)
-In this thesis the performance of 3 Few-Shot learning algorithms (ResNet50, P>M>F, CAML) will be compared in the field of anomaly detection.
+In this thesis the performance of 3 Few-Shot learning algorithms (ResNet50@resnet, P>M>F@pmfpaper, CAML@caml_paper) will be compared in the field of anomaly detection.
Moreover, few-shot learning might be able not only to detect anomalies but also to detect the anomaly class.
== Research Questions <sectionresearchquestions>
@@ -23,10 +23,10 @@ _Should Few-Shot learning be used for anomaly detection tasks?
How does it compare to well established algorithms such as PatchCore or EfficientAD?_
=== How does disbalancing the Shot number affect performance?
-_Does giving the Few-Shot learner more good than bad samples improve the model performance?_
+_Does giving the Few-Shot learner a higher proportion of normal (non-anomalous) samples compared to anomalous samples improve the model's performance?_
-=== How do the 3 (ResNet, CAML, P>M>F) methods perform in only detecting the anomaly class?
-_How much does the performance improve by only detecting the presence of an anomaly?
+=== How do the 3 (ResNet, CAML, P>M>F) methods perform in distinguishing between different anomaly types?
+_And how much does the performance improve by only detecting the presence of an anomaly?
How does it compare to PatchCore and EfficientAD?_
/*#if inwriting [
@@ -38,7 +38,7 @@ How does it compare to PatchCore and EfficientAD?_
This thesis is structured to provide a comprehensive exploration of Few-Shot Learning in anomaly detection.
@sectionmaterialandmethods introduces the datasets and methodologies used in this research.
The MVTec AD dataset is discussed in detail as the primary source for benchmarking, along with an overview of the Few-Shot Learning paradigm.
-The section elaborates on the three selected methods—ResNet50, P>M>F, and CAML—while also touching upon well established anomaly detection algorithms such as PatchCore and EfficientAD.
+The section elaborates on the three selected methods—ResNet50@resnet, P>M>F@pmfpaper, and CAML@caml_paper—while also touching upon well established anomaly detection algorithms such as PatchCore and EfficientAD.
@sectionimplementation focuses on the practical realization of the methods described in the previous chapter.
It outlines the experimental setup, including the use of Jupyter Notebook for prototyping and testing, and provides a detailed account of how each method was implemented and evaluated.

View File

@@ -52,10 +52,10 @@
title: "Few-Shot Learning for Anomaly Detection",
abstract-en: [//max. 250 words
This thesis explores the application of Few-Shot Learning (FSL) in anomaly detection, a critical area in industrial and automotive domains requiring robust and efficient algorithms for identifying defects.
-Traditional methods, such as PatchCore and EfficientAD, achieve high accuracy but often demand extensive training data and are sensitive to environmental changes, necessitating frequent retraining.
+Traditional methods for anomaly detection, such as PatchCore@patchcorepaper and EfficientAD@efficientADpaper, achieve high accuracy but often demand extensive training data and are sensitive to environmental changes, necessitating frequent retraining.
FSL offers a promising alternative by enabling models to generalize effectively from minimal samples, thus reducing training time and adaptation overhead.
-The study evaluates three FSL methods—ResNet50, P>M>F, and CAML—using the MVTec AD dataset.
+The study evaluates three FSL methods—ResNet50@resnet, P>M>F@pmfpaper, and CAML@caml_paper—using the MVTec AD dataset.
Experiments focus on tasks such as anomaly detection, class imbalance handling, //and comparison of distance metrics.
and anomaly type classification.
Results indicate that while FSL methods trail behind state-of-the-art algorithms in detecting anomalies, they excel in classifying anomaly types, showcasing potential in scenarios requiring detailed defect identification.

View File

@@ -89,7 +89,7 @@ These models learn a representation of each class in a reduced dimensionality an
caption: [Prototypical network for 3-ways and 5-shots. #cite(<snell2017prototypicalnetworksfewshotlearning>)],
) <prototypefewshot>
-The first and easiest method of this bachelor thesis uses a simple ResNet50 to calculate those embeddings and clusters the shots together by calculating the class center.
+The first and easiest method of this bachelor thesis uses a simple ResNet50@resnet to calculate those embeddings and clusters the shots together by calculating the class center.
This is basically a simple prototypical network.
See @resnet50impl.~@chowdhury2021fewshotimageclassificationjust
@@ -152,12 +152,12 @@ $ <euclideannorm>
=== PatchCore
// https://arxiv.org/pdf/2106.08265
-PatchCore is an advanced method designed for cold-start anomaly detection and localization, primarily focused on industrial image data.
+PatchCore@patchcorepaper is an advanced method designed for cold-start anomaly detection and localization, primarily focused on industrial image data.
It operates on the principle that an image is anomalous if any of its patches is anomalous.
The method achieves state-of-the-art performance on benchmarks like MVTec AD with high accuracy, low computational cost, and competitive inference times. #cite(<patchcorepaper>)
#todo[Reformulate and simplify this paragraph]
-The PatchCore framework leverages a pretrained convolutional neural network (e.g., WideResNet50) to extract mid-level features from image patches.
+The PatchCore framework leverages a pretrained convolutional neural network (e.g., WideResNet50@resnet) to extract mid-level features from image patches.
By focusing on intermediate layers, PatchCore balances the retention of localized information with a reduction in bias associated with high-level features pre-trained on ImageNet.
To enhance robustness to spatial variations, the method aggregates features from local neighborhoods using adaptive pooling, which increases the receptive field without sacrificing spatial resolution. #cite(<patchcorepaper>)
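The core principle described in this hunk ("an image is anomalous if any of its patches is anomalous") reduces to a nearest-neighbour computation. A minimal sketch, assuming the patch features and the memory bank of nominal ('good') patch features have already been extracted; this is an illustration of the scoring idea, not the paper's full pipeline (no coreset subsampling, no re-weighting):

```python
import numpy as np

def image_anomaly_score(patch_features, memory_bank):
    """PatchCore-style scoring: each patch is scored by the distance to its
    nearest neighbour in the memory bank of nominal patch features; the image
    score is the score of its most anomalous patch."""
    dists = np.linalg.norm(patch_features[:, None, :] - memory_bank[None, :, :], axis=-1)
    return float(dists.min(axis=1).max())
```

A single far-off patch is enough to raise the whole image's score, which is exactly the intended behaviour for small localized defects.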
@@ -183,7 +183,7 @@ This lowers computational costs while maintaining detection accuracy.~#cite(<pat
=== EfficientAD
// https://arxiv.org/pdf/2303.14535
-EfficientAD is another state of the art method for anomaly detection.
+EfficientAD@efficientADpaper is another state of the art method for anomaly detection.
It focuses on maintaining performance as well as high computational efficiency.
At its core, EfficientAD uses a lightweight feature extractor, the Patch Description Network (PDN), which processes images in less than a millisecond on modern hardware.
In comparison to PatchCore, which relies on a deeper, more computationally heavy WideResNet-101 network, the PDN uses only four convolutional layers and two pooling layers.
@@ -249,7 +249,7 @@ For this bachelor thesis the ResNet-50 architecture was used to predict the corr
=== P$>$M$>$F
// https://arxiv.org/pdf/2204.07305
-P>M>F (Pre-training > Meta-training > Fine-tuning) is a three-stage pipeline designed for few-shot learning.
+P>M>F@pmfpaper (Pre-training > Meta-training > Fine-tuning) is a three-stage pipeline designed for few-shot learning.
It focuses on simplicity but still achieves competitive performance.
The three stages convert a general feature extractor into a task-specific model through fine-tuned optimization.
#cite(<pmfpaper>)
@@ -296,7 +296,7 @@ For a query image the feature extractor extracts its embedding in lower dimensio
The query image is then assigned to the class with the closest prototype.~#cite(<pmfpaper>)
*Performance:*
-P>M>F performs well across several few-shot learning benchmarks.
+P>M>F@pmfpaper performs well across several few-shot learning benchmarks.
The combination of pre-training on large datasets and meta-training with episodic tasks helps the model to generalize well.
The inclusion of fine-tuning enhances adaptability to unseen domains, ensuring robust and efficient learning.~#cite(<pmfpaper>)
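The "closest prototype" assignment mentioned above can be sketched as follows. Cosine similarity is assumed here as the comparison measure of the ProtoNet-style head (the hunk itself only says "closest prototype"), and the embeddings are assumed to be precomputed by the backbone:

```python
import numpy as np

def assign_to_prototypes(query_emb, prototypes):
    """Nearest-prototype classification with cosine similarity: normalize
    queries and prototypes to unit length, then pick the most similar
    prototype index for each query."""
    q = query_emb / np.linalg.norm(query_emb, axis=-1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=-1, keepdims=True)
    return (q @ p.T).argmax(axis=1)
```

Unlike the plain Euclidean variant, cosine similarity ignores embedding magnitude, which tends to be more stable across domains after fine-tuning.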
@@ -309,7 +309,7 @@ Future research could focus on exploring faster and more efficient methods for f
=== CAML <CAML>
// https://arxiv.org/pdf/2310.10971v2
-CAML (Context-Aware Meta-Learning) is one of the state-of-the-art methods for few-shot learning.
+CAML (Context-Aware Meta-Learning)@caml_paper is one of the state-of-the-art methods for few-shot learning.
It consists of three different components: a frozen pretrained image encoder, a fixed Equal Length and Maximally Equiangular Set (ELMES) class encoder and a non-causal sequence model.
This is a universal meta-learning approach.
That means no fine-tuning or meta-training is applied for specific domains.~#cite(<caml_paper>)