From 1805bc2d789bc41e9de7672c8cc061b8e5d60fea Mon Sep 17 00:00:00 2001
From: lukas-heiligenbrunner
Date: Sat, 21 Dec 2024 18:42:59 +0100
Subject: [PATCH] add stuff for CAML

---
 materialandmethods.typ | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/materialandmethods.typ b/materialandmethods.typ
index e8ec933..62f1a54 100644
--- a/materialandmethods.typ
+++ b/materialandmethods.typ
@@ -195,20 +195,44 @@
 There are several different ResNet architectures, the most common are ResNet-18,
 For this bachelor theis the ResNet-50 architecture was used to predict the corresponding embeddings for the few-shot learning methods.
 
+=== P$>$M$>$F
+Todo
 === CAML
 // https://arxiv.org/pdf/2310.10971v2
 CAML (Context aware meta learning) is one of the state-of-the-art methods for few-shot learning.
-#todo[Here we should describe in detail how caml works]
+It consists of three components: a frozen pre-trained image encoder, a fixed Equal Length and Maximally Equiangular Set (ELMES) class encoder, and a non-causal sequence model.
+
+*Architecture:* CAML first encodes the query and support set images using the frozen pre-trained feature extractor, as shown in @camlarchitecture.
+This step maps the images into a low-dimensional space in which similar images receive similar embeddings.
+The class labels are encoded with the ELMES class encoder.
+Since the class of the query image is unknown at this stage, we add a special learnable "unknown token" to the encoder.
+This embedding is learned during pre-training.
+Afterwards, each image embedding is concatenated with the corresponding class embedding.
+
+#todo[We should add stuff here why we have a max amount of shots bc. of pretrained model]
+
+*ELMES Encoder:* The ELMES (Equal Length and Maximally Equiangular Set) encoder maps the class labels to vectors of equal length.
+The encoder is a bijective mapping between the labels and a set of vectors that are of equal length and maximally equiangular.
+A mapping is bijective if every label corresponds to exactly one vector and vice versa; a set of vectors is maximally equiangular if the angle between any two of them is the same and as large as possible.
+This is similar to one-hot encoding, but the symmetric structure of the vectors treats every class equally.
+
+*Non-causal sequence model:*
+The concatenated query and support-set embeddings are treated as one unordered sequence and processed by a Transformer-style encoder; since no causal mask is applied, every element can attend to every other element, and the output at the query position is used to predict the query class.
+
+*Large-Scale Pre-Training:*
+CAML is pre-trained episodically on few-shot classification tasks sampled from several large image datasets; the image encoder stays frozen, so only the sequence model (together with the unknown token) is updated.
+
+*Theoretical Analysis:*
+#todo[Maybe not that important?]
+
+*Results:*
 
 #figure(
   image("rsc/caml_architecture.png", width: 80%),
   caption: [Architecture of CAML. #cite()],
 )
 
-=== P$>$M$>$F
-Todo
-
 === Softmax
 #todo[Maybe remove this section]
 The Softmax function @softmax #cite() converts $n$ numbers of a vector into a probability distribution.
 
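
To make the ELMES property concrete: one way to obtain $d$ equal-length, maximally equiangular vectors is to center the $d$ one-hot label vectors and normalize them, which gives pairwise cosine similarity $-1/(d-1)$, the most negative value possible for $d$ vectors of equal length. The following numpy sketch is illustrative only; the function name `elmes` and this particular construction are assumptions for the illustration, not the paper's exact parameterization, which additionally carries the learnable unknown token.

```python
import numpy as np

def elmes(d: int) -> np.ndarray:
    """d equal-length, maximally equiangular label vectors:
    centered one-hot vectors, rescaled to unit length."""
    e = np.eye(d)                # one-hot label vectors
    v = e - e.mean(axis=0)       # center: e_i - (1/d) * 1
    return v / np.linalg.norm(v, axis=1, keepdims=True)

E = elmes(5)
print(np.round(E @ E.T, 3))      # 1 on the diagonal, -1/(d-1) = -0.25 elsewhere
```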
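
How the three components fit together can be sketched in PyTorch under assumed sizes; all dimensions, the variable names, and the stock `nn.TransformerEncoder` standing in for the paper's exact sequence model are hypothetical choices for this sketch.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; CAML uses a frozen pre-trained image encoder.
img_dim, lbl_dim, n_way, n_shot = 512, 64, 5, 1
d_model = img_dim + lbl_dim

seq_model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True),
    num_layers=4,
)
classifier = nn.Linear(d_model, n_way)

support_img = torch.randn(1, n_way * n_shot, img_dim)  # frozen-encoder features
query_img   = torch.randn(1, 1, img_dim)
support_lbl = torch.randn(1, n_way * n_shot, lbl_dim)  # ELMES class embeddings
unknown_tok = torch.randn(1, 1, lbl_dim)               # learnable "unknown token"

# Concatenate each image embedding with its class embedding, then join
# query and support into one sequence.
support = torch.cat([support_img, support_lbl], dim=-1)
query   = torch.cat([query_img, unknown_tok], dim=-1)
seq     = torch.cat([query, support], dim=1)

out    = seq_model(seq)          # no attention mask: every element sees all others
logits = classifier(out[:, 0])   # prediction read off at the query position
```

Because no attention mask is passed, the sequence model is non-causal: the prediction for the query can condition on the entire support set at once, independent of the order in which the support examples appear.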