From fbdb9b166ba5a06fd4c3d25663050087ae8ac6d6 Mon Sep 17 00:00:00 2001
From: lukas-heilgenbrunner
Date: Wed, 29 Jan 2025 12:08:23 +0100
Subject: [PATCH] fix more comma errors

---
 conclusionandoutlook.typ |  2 +-
 experimentalresults.typ  |  4 ++--
 implementation.typ       | 10 +++++-----
 materialandmethods.typ   |  4 ++--
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/conclusionandoutlook.typ b/conclusionandoutlook.typ
index 4fcb530..11a3cd2 100644
--- a/conclusionandoutlook.typ
+++ b/conclusionandoutlook.typ
@@ -14,7 +14,7 @@ In most of the tests P>M>F performed the best.
 But the simple ResNet50 method also performed better than expected in most cases and can be considered if the computational resources are limited and a simple architecture is enough.
 
 == Outlook
-In the future when new Few-Shot learning methods evolve it could be interesting to test again how they perform in anomaly detection tasks.
+In the future, when new Few-Shot learning methods evolve, it could be interesting to test again how they perform in anomaly detection tasks.
 There might be a lack of research in the area where the classes to detect are very similar to each other, and a few-shot learning algorithm tailored specifically for very similar classes could boost the performance by a large margin.
diff --git a/experimentalresults.typ b/experimentalresults.typ
index 2fd4e29..1efc1e2 100644
--- a/experimentalresults.typ
+++ b/experimentalresults.typ
@@ -14,7 +14,7 @@ Both are trained with samples from the 'good' class only.
 So there is a clear performance gap between Few-Shot learning and the state-of-the-art anomaly detection algorithms.
 In @comparison2way, Patchcore and EfficientAD are not included as they aren't directly comparable in the same fashion.
-That means if the goal is just to detect anomalies, Few-Shot learning is not the best choice and Patchcore or EfficientAD should be used.
+That means if the goal is just to detect anomalies, Few-Shot learning is not the best choice, and Patchcore or EfficientAD should be used.
 
 #subpar.grid(
   figure(image("rsc/comparison-2way-bottle.png"), caption: [
@@ -97,7 +97,7 @@ One could use a well established algorithm like PatchCore or EfficientAD for det
     8-Way - Cable class
   ]), ,
   columns: (1fr, 1fr),
-  caption: [Nomaly class only classification performance],
+  caption: [Anomaly class only classification performance],
   label: ,
 )
diff --git a/implementation.typ b/implementation.typ
index 223c5d8..dafdc26 100644
--- a/implementation.typ
+++ b/implementation.typ
@@ -92,7 +92,7 @@ After creating the embeddings for the support and query set the euclidean distan
 The class with the smallest distance is chosen as the predicted class.
 
 === Results
-This method performed better than expected wich such a simple method.
+This method performed better than expected with such a simple method.
 As in @resnet50bottleperfa, with a normal 5 shot / 4 way classification the model achieved an accuracy of 75%.
 When only detecting whether an anomaly occurred or not, the performance is significantly better and peaks at 81% with 5 shots / 2 ways.
 Interestingly, the model performed slightly better with fewer shots in this case.
@@ -136,7 +136,7 @@ but this is expected as the cable class consists of 8 faulty classes.
 == P>M>F
 === Approach
-For P>M>F I used the pretrained model weights from the original paper.
+For P>M>F, I used the pretrained model weights from the original paper.
 As backbone feature extractor, a DINO model is used, which is pre-trained by Facebook.
 This is a vision transformer with a patch size of 16 and 12 attention heads learned in a self-supervised fashion.
 This feature extractor was meta-trained with 10 public image datasets #footnote[ImageNet-1k, Omniglot, FGVC-
@@ -144,7 +144,7 @@ Aircraft, CUB-200-2011, Describable Textures, QuickDraw, FGVCx
 Fungi, VGG Flower, Traffic Signs and MSCOCO~@pmfpaper]
 of diverse domains by the authors of the original paper.~@pmfpaper
-Finally, this model is finetuned with the support set of every test iteration.
+Finally, this model is fine-tuned with the support set of every test iteration.
 Every time the support set changes, we need to fine-tune the model again.
 In a real world scenario this should not be the case because the support set is fixed and only the query set changes.
@@ -196,12 +196,12 @@ This transformer was trained on a huge number of images as described in @CAML.
 === Results
 The results were not as good as expected.
-This might be caused by the fact that the model was not fine-tuned for any industrial dataset domain.
+This might be because the model was not fine-tuned for any industrial dataset domain.
 The model was trained on a large number of general purpose images and is not fine-tuned at all.
 Moreover, it was not fine-tuned on the support set similar to the P>M>F method, which could have a huge impact on performance.
 It might also not handle very similar images well.
-Compared the the other two methods, CAML performed poorly in almost all experiments.
+Compared to the other two methods, CAML performed poorly in almost all experiments.
 The normal few-shot classification reached only 40% accuracy in @camlperfa at best.
 The only test it did surprisingly well on was the detection of the anomaly class for the cable class in @camlperfb, where it reached almost 60% accuracy.
diff --git a/materialandmethods.typ b/materialandmethods.typ
index 17c0983..e253c0a 100644
--- a/materialandmethods.typ
+++ b/materialandmethods.typ
@@ -249,7 +249,7 @@ For this bachelor thesis the ResNet-50 architecture was used to predict the corr
 === P$>$M$>$F
 // https://arxiv.org/pdf/2204.07305
-P>P>F (Pre-training > Meta-training > Fine-tuning) is a three-stage pipeline designed for few-shot learning.
+P>M>F (Pre-training > Meta-training > Fine-tuning) is a three-stage pipeline designed for few-shot learning.
 It focuses on simplicity but still achieves competitive performance.
 The three stages convert a general feature extractor into a task-specific model through fine-tuned optimization. #cite()
@@ -309,7 +309,7 @@ Future research could focus on exploring faster and more efficient methods for f
 === CAML
 // https://arxiv.org/pdf/2310.10971v2
-CAML (Context aware meta learning) is one of the state-of-the-art methods for few-shot learning.
+CAML (Context-Aware Meta-Learning) is one of the state-of-the-art methods for few-shot learning.
 It consists of three different components: a frozen pre-trained image encoder, a fixed Equal Length and Maximally Equiangular Set (ELMES) class encoder and a non-causal sequence model.
 This is a universal meta-learning approach.
 That means no fine-tuning or meta-training is applied for specific domains.~#cite()
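
Aside (illustrative, not part of the patch): the hunk in implementation.typ above describes the ResNet50 baseline as computing embeddings for the support and query sets and predicting the class with the smallest euclidean distance. A minimal sketch of one plausible reading of that procedure, assuming pre-computed embeddings and class-mean prototypes (the prototype step and all names here are assumptions, not taken from the thesis):

    # Hypothetical sketch: nearest-prototype classification over pre-computed
    # ResNet-50 embeddings. Averaging each class's support embeddings into a
    # prototype is an assumption; the thesis text only states that the class
    # with the smallest euclidean distance to the query embedding is chosen.
    import numpy as np

    def predict_classes(support_emb, support_labels, query_emb):
        # support_emb: (n_support, d), support_labels: (n_support,), query_emb: (n_query, d)
        classes = np.unique(support_labels)
        # One prototype per class: the mean of that class's support embeddings.
        prototypes = np.stack(
            [support_emb[support_labels == c].mean(axis=0) for c in classes]
        )
        # Euclidean distance from every query embedding to every prototype.
        dists = np.linalg.norm(query_emb[:, None, :] - prototypes[None, :, :], axis=-1)
        # Predict the class whose prototype is closest.
        return classes[dists.argmin(axis=1)]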