From 30d09a67d2e40d44d6f70e4f27dcd025a61c2a45 Mon Sep 17 00:00:00 2001
From: lukas-heilgenbrunner
Date: Tue, 14 Jan 2025 20:05:11 +0100
Subject: [PATCH] fix caml stuff and add things to last sec

---
 conclusionandoutlook.typ | 9 ++++++---
 implementation.typ       | 4 +++-
 materialandmethods.typ   | 4 ++--
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/conclusionandoutlook.typ b/conclusionandoutlook.typ
index 81f9c31..4fcb530 100644
--- a/conclusionandoutlook.typ
+++ b/conclusionandoutlook.typ
@@ -6,14 +6,17 @@ The only benefit of Few-Shot learning is that it can be used in environments whe
 But this should not be the case in most scenarios.
 Most of the time plenty of good samples are available and in this case Patchcore or EfficientAD should perform great.
-The only case where Few-Shot learning could be used is in a scenario where one wants to detect the anomaly class itself.
-Patchcore and EfficientAD can only detect if an anomaly is present or not but not what the anomaly is.
+The only case where Few-Shot learning could be used is in a scenario where one wants to detect the anomaly class itself.
+Patchcore and EfficientAD can only detect whether an anomaly is present or not, but not what type of anomaly it actually is.
 So chaining a Few-Shot learner after Patchcore or EfficientAD could be a good idea to use the best of both worlds.
 
-In most of the tests performed P>M>F performed the best.
+In most of the tests P>M>F performed the best.
 But also the simple ResNet50 method performed better than expected in most cases and can be considered if the computational resources are limited and if a simple architecture is enough.
 
 == Outlook
 In the future when new Few-Shot learning methods evolve it could be interesting to test again how they perform in anomaly detection tasks.
 There might be a lack of research in the area where the classes to detect are very similar to each other and when building a few-shot learning algorithm tailored specifically for very similar classes this could boost the performance by a large margin.
+
+It might be interesting to test the SOT method (see @SOT) with a ResNet50 feature extractor similar to the one proposed in this thesis, but with SOT for the embedding comparison.
+Moreover, TRIDENT (see @TRIDENT) could achieve promising results in an anomaly detection scenario.
diff --git a/implementation.typ b/implementation.typ
index 24b046a..936d56c 100644
--- a/implementation.typ
+++ b/implementation.typ
@@ -183,7 +183,9 @@ So it is clearly a bad idea to add more good shots to the support set.
 == CAML
 === Approach
 For the CAML implementation the pretrained model weights from the original paper were used.
-This brings the limitation of a maximum squence length to the non-causal sequence model.
+The non-causal sequence model (transformer) is pretrained with every class having the same number of shots.
+This brings the limitation that it can only process standard few-shot learning tasks in the n-way k-shot fashion,
+since it expects the input sequence to contain the same number of shots per class.
 This is the reason why for this method the two imbalanced test cases couldn't be conducted.
 
 As a feture extractor a ViT-B/16 model was used, which is a Vision Transformer with a patch size of 16.
diff --git a/materialandmethods.typ b/materialandmethods.typ
index 57e6f76..2735799 100644
--- a/materialandmethods.typ
+++ b/materialandmethods.typ
@@ -392,7 +392,7 @@ If the pre-trained model lacks relevant information for the task, SgVA-CLIP migh
 This might be a no-go for anomaly detection tasks because the images in such tasks are often very task-specific and not covered by general pre-trained models.
 Also, fine-tuning the model can require considerable computational resources, which might be a limitation in some cases.~#cite()
 
-=== TRIDENT (Transductive Decoupled Variational Inference for Few-Shot Classification)
+=== TRIDENT (Transductive Decoupled Variational Inference for Few-Shot Classification)
 // https://arxiv.org/pdf/2208.10559v1
 // https://arxiv.org/abs/2208.10559v1
 
@@ -406,7 +406,7 @@ This feature extractor dynamically aligns features from both the support and the
 This model is specifically designed for few-shot classification tasks but might also work well for anomaly detection.
 Its ability to isolate critical features while droping irellevant context aligns with requirements needed for anomaly detection.
 
-=== SOT (Self-Optimal-Transport Feature Transform)
+=== SOT (Self-Optimal-Transport Feature Transform)
 // https://arxiv.org/pdf/2204.03065v1
 // https://arxiv.org/abs/2204.03065v1
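The conclusion added by this patch proposes chaining a few-shot classifier after Patchcore or EfficientAD: the anomaly detector decides *whether* something is anomalous, and the few-shot learner then decides *which* anomaly class it is. The sketch below illustrates that two-stage idea only; it is not the pipeline from the thesis or the patch. The threshold, the nearest-prototype classifier, and the random embeddings are all illustrative assumptions.

```python
import numpy as np


def classify_anomaly(feature, prototypes, class_names):
    """Stage 2 stand-in: nearest-prototype few-shot classification
    via cosine similarity (a hypothetical, minimal classifier)."""
    f = feature / np.linalg.norm(feature)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return class_names[int(np.argmax(p @ f))]


def chained_pipeline(anomaly_score, feature, prototypes, class_names, threshold=0.5):
    """Stage 1: binary anomaly decision from a score as produced by a
    detector like Patchcore/EfficientAD (score assumed in [0, 1]).
    Stage 2: only anomalous samples are passed to the few-shot classifier."""
    if anomaly_score < threshold:
        return "good"
    return classify_anomaly(feature, prototypes, class_names)


# Toy demo with random embeddings (one prototype per assumed anomaly class).
rng = np.random.default_rng(0)
protos = rng.normal(size=(3, 128))
names = ["scratch", "dent", "contamination"]
sample = protos[1] + 0.1 * rng.normal(size=128)  # close to the "dent" prototype

print(chained_pipeline(0.9, sample, protos, names))  # -> dent
print(chained_pipeline(0.1, sample, protos, names))  # -> good
```

The design point of the chaining is that the detector keeps its strength (deciding anomalous vs. good from many nominal samples) while the few-shot learner only ever sees the rare anomalous samples it is suited for.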