fix more comma errors

This commit is contained in:
lukas-heilgenbrunner 2025-01-29 12:08:23 +01:00
parent af58cda976
commit fbdb9b166b
4 changed files with 10 additions and 10 deletions


@ -14,7 +14,7 @@ In most of the tests P>M>F performed the best.
The simple ResNet50 method also performed better than expected in most cases and can be considered if the computational resources are limited and if a simple architecture is enough.
== Outlook
In the future, when new Few-Shot learning methods evolve, it could be interesting to test again how they perform in anomaly detection tasks.
There might be a lack of research in the area where the classes to detect are very similar to each other,
and building a few-shot learning algorithm tailored specifically for very similar classes could boost the performance by a large margin.


@ -14,7 +14,7 @@ Both are trained with samples from the 'good' class only.
So there is a clear performance gap between Few-Shot learning and the state-of-the-art anomaly detection algorithms.
In @comparison2way, Patchcore and EfficientAD are not included as they aren't directly comparable in the same fashion.
That means if the goal is just to detect anomalies, Few-Shot learning is not the best choice, and Patchcore or EfficientAD should be used.
#subpar.grid(
figure(image("rsc/comparison-2way-bottle.png"), caption: [
@ -97,7 +97,7 @@ One could use a well established algorithm like PatchCore or EfficientAD for det
8-Way - Cable class
]), <comparisonfaultyonlycable>,
columns: (1fr, 1fr),
caption: [Anomaly class only classification performance],
label: <comparisonnormal>,
)


@ -92,7 +92,7 @@ After creating the embeddings for the support and query set the euclidean distan
The class with the smallest distance is chosen as the predicted class.
=== Results <resnet50perf>
This method performed better than expected for such a simple approach.
As in @resnet50bottleperfa, with a normal 5 shot / 4 way classification the model achieved an accuracy of 75%.
When only detecting whether an anomaly occurred or not, the performance is significantly better and peaks at 81% with 5 shots / 2 ways.
Interestingly, the model performed slightly better with fewer shots in this case.
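The nearest-prototype classification described above can be sketched as follows. This is a minimal NumPy sketch, not the thesis implementation: the function names and toy embeddings are illustrative, and the ResNet-50 feature extraction is assumed to have already produced the embedding vectors.

```python
# Illustrative sketch: classify query images by the euclidean distance of
# their embeddings to per-class mean prototypes of the support set.
import numpy as np

def prototypes(support_emb, support_labels, n_way):
    # One prototype per class: the mean of that class's support embeddings.
    return np.stack([support_emb[support_labels == c].mean(axis=0)
                     for c in range(n_way)])

def predict(query_emb, protos):
    # Pairwise euclidean distances, shape (n_query, n_way); smallest wins.
    dists = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    return dists.argmin(axis=1)
```

For a 5 shot / 4 way episode, `support_emb` would hold 20 embedding vectors and `n_way` would be 4.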
@ -136,7 +136,7 @@ but this is expected as the cable class consists of 8 faulty classes.
== P>M>F
=== Approach
For P>M>F, I used the pretrained model weights from the original paper.
As backbone feature extractor, a DINO model is used, which is pre-trained by Facebook.
This is a vision transformer with a patch size of 16 and 12 attention heads, learned in a self-supervised fashion.
This feature extractor was meta-trained with 10 public image datasets #footnote[ImageNet-1k, Omniglot, FGVC-
@ -144,7 +144,7 @@ Aircraft, CUB-200-2011, Describable Textures, QuickDraw,
FGVCx Fungi, VGG Flower, Traffic Signs and MSCOCO~@pmfpaper]
of diverse domains by the authors of the original paper.~@pmfpaper
Finally, this model is fine-tuned with the support set of every test iteration.
Every time the support set changes, we need to fine-tune the model again.
In a real-world scenario this should not be the case, because the support set is fixed and only the query set changes.
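The per-episode fine-tuning step can be sketched as follows. This is a simplified stand-in under stated assumptions: it trains only a hypothetical linear head on frozen embeddings with plain gradient descent, whereas P>M>F fine-tunes the backbone itself, so the names and setup are illustrative only.

```python
# Illustrative sketch of fine-tuning on the support set of one episode.
# Unlike real P>M>F (which fine-tunes the DINO backbone), this trains only a
# hypothetical linear head on frozen embeddings via softmax cross-entropy.
import numpy as np

def finetune_head(support_emb, support_labels, n_way, lr=0.5, steps=200):
    d = support_emb.shape[1]
    W = np.zeros((d, n_way))           # linear head weights
    b = np.zeros(n_way)                # linear head bias
    onehot = np.eye(n_way)[support_labels]
    for _ in range(steps):
        logits = support_emb @ W + b
        logits -= logits.max(axis=1, keepdims=True)    # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / len(support_labels)  # cross-entropy gradient
        W -= lr * support_emb.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b
```

Whenever the support set changes, this routine would have to run again before classifying the query set with `(query_emb @ W + b).argmax(axis=1)`, which mirrors why fine-tuning per test iteration is costly.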
@ -196,12 +196,12 @@ This transformer was trained on a huge number of images as described in @CAML.
=== Results
The results were not as good as expected.
This might be because the model was not fine-tuned for any industrial dataset domain.
The model was trained on a large number of general purpose images and is not fine-tuned at all.
Moreover, it was not fine-tuned on the support set, unlike the P>M>F method, which could have a huge impact on performance.
It might also not handle very similar images well.
Compared to the other two methods, CAML performed poorly in almost all experiments.
The normal few-shot classification reached only 40% accuracy in @camlperfa at best.
The only test in which it did surprisingly well was the detection of the anomaly class for the cable class in @camlperfb, where it reached almost 60% accuracy.


@ -249,7 +249,7 @@ For this bachelor thesis the ResNet-50 architecture was used to predict the corr
=== P$>$M$>$F
// https://arxiv.org/pdf/2204.07305
P>M>F (Pre-training > Meta-training > Fine-tuning) is a three-stage pipeline designed for few-shot learning.
It focuses on simplicity but still achieves competitive performance.
The three stages convert a general feature extractor into a task-specific model through fine-tuned optimization.
#cite(<pmfpaper>)
@ -309,7 +309,7 @@ Future research could focus on exploring faster and more efficient methods for f
=== CAML <CAML>
// https://arxiv.org/pdf/2310.10971v2
CAML (Context-Aware Meta-Learning) is one of the state-of-the-art methods for few-shot learning.
It consists of three different components: a frozen pre-trained image encoder, a fixed Equal Length and Maximally Equiangular Set (ELMES) class encoder and a non-causal sequence model.
This is a universal meta-learning approach.
That means no fine-tuning or meta-training is applied for specific domains.~#cite(<caml_paper>)