add stuff for CAML

2024-12-21 18:42:59 +01:00
parent a358401ffb
commit 1805bc2d78
1 changed files with 28 additions and 4 deletions
--- a/materialandmethods.typ
+++ b/materialandmethods.typ
@@ -195,20 +195,44 @@ There are several different ResNet architectures, the most common are ResNet-18,
 For this bachelor thesis, the ResNet-50 architecture was used to compute the embeddings for the few-shot learning methods.
 === P$>$M$>$F
 Todo
 === CAML
 // https://arxiv.org/pdf/2310.10971v2
 CAML (Context-Aware Meta-Learning) is one of the state-of-the-art methods for few-shot learning.
-#todo[Here we should describe in detail how caml works]
+It consists of three different components: a frozen pre-trained image encoder, a fixed Equal Length and Maximally Equiangular Set (ELMES) class encoder and a non-causal sequence model.
 *Architecture:* CAML first encodes the query and support set images using the frozen pre-trained feature extractor, as shown in @camlarchitecture.
 This step maps the images into a low-dimensional space where similar images are encoded into similar embeddings.
 The class labels are encoded with the ELMES class encoder.
 Since the class of the query image is unknown at this stage, a special learnable "unknown token" is added to the encoder.
 This embedding is learned during pre-training.
 Afterwards, each image embedding is concatenated with the corresponding class embedding.
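 The concatenation step described above can be illustrated with a minimal sketch in plain Python. All names and values here (`build_caml_sequence`, the toy embeddings, the one-hot stand-ins for the class vectors, the zero vector standing in for the learned unknown token) are hypothetical and only serve the illustration. Placing the query first in the sequence is also an assumption of this sketch, not something stated here.

```python
def build_caml_sequence(query_embedding, support_embeddings, support_labels,
                        class_vectors, unknown_token):
    """Concatenate every image embedding with its label embedding.

    Hypothetical helper for illustration only; embeddings are plain lists.
    """
    # The query class is unknown, so the query is paired with the unknown token.
    sequence = [query_embedding + unknown_token]
    # Each support image is paired with the vector of its known class.
    for embedding, label in zip(support_embeddings, support_labels):
        sequence.append(embedding + class_vectors[label])
    return sequence


# Toy values: 2-dimensional image embeddings and two classes.
query_embedding = [0.1, 0.2]
support_embeddings = [[0.3, 0.4], [0.5, 0.6]]
support_labels = [0, 1]
class_vectors = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for the class encoder output
unknown_token = [0.0, 0.0]                # placeholder for the learned embedding

sequence = build_caml_sequence(query_embedding, support_embeddings,
                               support_labels, class_vectors, unknown_token)
```

 Each element of `sequence` has the combined length of an image embedding and a class embedding, which is the form the sequence model would then consume.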
 #todo[We should add stuff here why we have a max amount of shots bc. of pretrained model]
 *ELMES Encoder:* The ELMES (Equal Length and Maximally Equiangular Set) encoder maps the class labels to vectors of equal length.
 The encoder is a bijective mapping between the labels and a set of vectors that have equal length and are maximally equiangular.
 #todo[Describe what equiangular and bijective means]
 This is similar to one-hot encoding but with some advantages.
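 To make "equal length and maximally equiangular" more concrete, the following sketch builds one well-known set with these properties, a centred and rescaled one-hot encoding (a regular simplex). It is an assumption of this sketch that such a construction is a valid example; it is not necessarily the exact construction used by CAML. The function names `simplex_class_vectors` and `dot` are hypothetical.

```python
import math


def simplex_class_vectors(num_classes):
    """Return num_classes unit vectors whose pairwise dot products are all equal.

    Construction (illustrative assumption): subtract the mean from each
    one-hot vector and rescale to unit length.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    d = num_classes
    scale = math.sqrt(d / (d - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / d) for j in range(d)]
        for i in range(d)
    ]


def dot(u, v):
    """Dot product of two equally long lists of floats."""
    return sum(a * b for a, b in zip(u, v))


num_classes = 5
vectors = simplex_class_vectors(num_classes)

# Equal length: every vector has norm 1.
norms = [math.sqrt(dot(v, v)) for v in vectors]

# Equiangular: every pair of distinct vectors has the same dot product,
# namely -1 / (num_classes - 1).
pairwise = [
    dot(vectors[i], vectors[j])
    for i in range(num_classes)
    for j in range(i + 1, num_classes)
]
```

 Because every label index corresponds to exactly one vector and vice versa, the mapping is bijective, just like one-hot encoding.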
 *Non-causal sequence model:*
 #todo[Desc. what this is]
 *Large-Scale Pre-Training:*
 #todo[Desc. what this is]
 *Theoretical Analysis:*
 #todo[Maybe not that important?]
 *Results:*
 #figure(
  image("rsc/caml_architecture.png", width: 80%),
  caption: [Architecture of CAML. #cite(<caml_paper>)],
 ) <camlarchitecture>
 === P$>$M$>$F
 Todo
 === Softmax
 #todo[Maybe remove this section]
 The Softmax function @softmax #cite(<liang2017soft>) converts a vector of $n$ real numbers into a probability distribution.