add images for final results
All checks were successful
Build Typst document / build_typst_documents (push) Successful in 12s

This commit is contained in:
lukas-heiligenbrunner 2025-01-03 21:48:48 +01:00
parent 2690a3d0f2
commit 2eeed2c31e
4 changed files with 50 additions and 9 deletions


@@ -1,15 +1,41 @@
+#import "utils.typ": todo
+#import "@preview/subpar:0.1.1"
 = Experimental Results
 == Is Few-Shot learning a suitable fit for anomaly detection?
+_Should Few-Shot learning be used for anomaly detection tasks?
+How does it compare to well-established algorithms such as PatchCore or EfficientAD?_
-Should Few-Shot learning be used for anomaly detection tasks?
-How does it compare to well established algorithms such as Patchcore or EfficientAD?
+@comparison2waybottle shows the performance of the 2-way classification (anomaly or not) on the bottle class and @comparison2waycable shows the same for the cable class.
+The performance values are the same as in @experiments, merged into one graph.
+As a reference, PatchCore reaches an AUROC of 99.6% and EfficientAD reaches 99.8%, averaged over all classes of the MVTec AD dataset.
+Both are trained with samples from the 'good' class only.
+So there is a clear performance gap between Few-Shot learning and the state-of-the-art anomaly detection algorithms.
+That means if the goal is just to detect anomalies, Few-Shot learning is not the best choice and PatchCore or EfficientAD should be used.
+#subpar.grid(
+figure(image("rsc/comparison-2way-bottle.png"), caption: [
+Bottle class
+]), <comparison2waybottle>,
+figure(image("rsc/comparison-2way-cable.png"), caption: [
+Cable class
+]), <comparison2waycable>,
+columns: (1fr, 1fr),
+caption: [2-Way classification performance],
+label: <comparison2way>,
+)
 == How does an imbalanced number of shots affect performance?
-Does giving the Few-Shot learner more good than bad samples improve the model performance?
+_Does giving the Few-Shot learner more good than bad samples improve the model performance?_
+As the results of all three methods in @experiments show, the performance of the Few-Shot learner decreases with an increasing number of good samples.
+This result is unexpected.
+#todo[Image of imbalanced shots]
 == How do the three methods (ResNet, CAML, \pmf) perform in only detecting the anomaly class?
-How much does the performance improve if only detecting an anomaly or not?
-How does it compare to PatchCore and EfficientAD?
+_How much does the performance improve if only detecting an anomaly or not?
+How does it compare to PatchCore and EfficientAD?_
 == Extra: How does Euclidean distance compare to cosine similarity when using ResNet as a feature extractor?


@@ -6,7 +6,7 @@
 = Implementation
 The three methods described (ResNet50, CAML, P>M>F) were implemented in a Jupyter notebook and compared to each other.
-== Experiments
+== Experiments <experiments>
 For all of the three methods we test the following use-cases:#todo[maybe write more to each test]
 - Detection of anomaly class (1,3,5 shots)
 - 2 Way classification (1,3,5 shots)
@@ -20,7 +20,7 @@ Those experiments were conducted on the MVTEC AD dataset on the bottle and cable
 == Experiment Setup
 All the experiments were done on the bottle and cable classes of the MVTec AD dataset.
 The corresponding number of shots was randomly selected from the dataset.
-The rest of the images were used to test the model and measure the accuracy.
+The rest of the images was used to test the model and measure the accuracy.
 #todo[Maybe add real number of samples per classes]
 == ResNet50
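
The experiment setup in the hunk above (a fixed number of shots drawn at random per class, with the remaining images used for testing) could be sketched roughly as follows. This is only an illustration under assumptions: the function name `split_support_query`, the dictionary layout of `images_by_class` and the `seed` parameter are hypothetical and are not taken from the notebook.

```python
import random


def split_support_query(images_by_class, shots, seed=None):
    """Randomly pick `shots` images per class as support set; the rest form the query set.

    `images_by_class` maps a class label to a list of image identifiers
    (hypothetical layout, chosen for this sketch only).
    """
    rng = random.Random(seed)
    support, query = {}, {}
    for label, images in images_by_class.items():
        if shots > len(images):
            raise ValueError(f"class {label!r} has fewer than {shots} images")
        # Sample indices without replacement so support and query never overlap.
        chosen = set(rng.sample(range(len(images)), shots))
        support[label] = [img for i, img in enumerate(images) if i in chosen]
        query[label] = [img for i, img in enumerate(images) if i not in chosen]
    return support, query
```

Passing a fixed `seed` makes a single test iteration reproducible, while leaving it as `None` draws a new random support set on every call.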
@@ -86,7 +86,7 @@ In the following diagram the ResNet50 architecture is visualized and the cut-poi
 After creating the embeddings for the support and query set, the Euclidean distance is calculated.
 The class with the smallest distance is chosen as the predicted class.
-=== Results
+=== Results <resnet50perf>
 This method performed better than expected for such a simple approach.
 As shown in @resnet50bottleperfa, with a normal 5 shot / 4 way classification the model achieved an accuracy of 75%.
 When only detecting whether an anomaly occurred, the performance is significantly better and peaks at 81% with 5 shots / 2 ways.
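
The distance-based prediction step described in the hunk above can be sketched as follows. This is a minimal illustration, not the notebook's code: it assumes that each query is assigned the label of its single closest support embedding, whereas the thesis does not state whether distances are taken per sample or per class average. The function name `predict_nearest_class` and the plain-list embedding format are hypothetical.

```python
import math


def predict_nearest_class(support_embeddings, support_labels, query_embeddings):
    """Label each query with the class of its closest support embedding.

    Assumption for this sketch: "smallest distance" means the nearest
    individual support sample, measured with the Euclidean distance.
    """
    predictions = []
    for query in query_embeddings:
        # Euclidean distance from this query to every support embedding.
        distances = [math.dist(query, support) for support in support_embeddings]
        nearest_index = distances.index(min(distances))
        predictions.append(support_labels[nearest_index])
    return predictions
```

In practice the embeddings would come from the truncated ResNet50 backbone; here they are just sequences of floats of equal length.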
@@ -131,10 +131,25 @@ but this is expected as the cable class consists of 8 faulty classes.
 == P>M>F
 === Approach
+For P>M>F the pretrained model weights from the original paper were used.
+As backbone feature extractor a DINO model is used, which is pre-trained by Facebook.
+This is a vision transformer with a patch size of 16 and 12 attention heads learned in a self-supervised fashion.
+This feature extractor was meta-trained with 10 public image datasets #footnote[ImageNet-1k, Omniglot, FGVC-Aircraft, CUB-200-2011, Describable Textures, QuickDraw,
+FGVCx Fungi, VGG Flower, Traffic Signs and MSCOCO~#cite(<pmfpaper>)]
+of diverse domains by the authors of the original paper.#cite(<pmfpaper>)
+Finally, this model is finetuned with the support set of every test iteration.
+Every time the support set changes, we need to finetune the model again.
+In a real-world scenario this should not be the case because the support set is fixed and only the query set changes.
 === Results
 The results of P>M>F look very promising and improve by a large margin over the ResNet50 method.
 In @pmfbottleperfa the model reached an accuracy of 79% with 5 shots / 4 way classification.
-#todo[write bit more here]
+The 2 way classification (faulty or not) performed even better and peaked at 94% accuracy with 5 shots.#todo[Add somehow that all classes are stacked]
+Similar to the ResNet50 method in @resnet50perf, the tests with an imbalanced class distribution performed worse than with balanced classes.
+So it is clearly a bad idea to add more good shots to the support set.
 #subpar.grid(
 figure(image("rsc/pmf/P>M>F-bottle.png"), caption: [

Binary file not shown.


Binary file not shown.
