From 1805bc2d789bc41e9de7672c8cc061b8e5d60fea Mon Sep 17 00:00:00 2001
From: lukas-heiligenbrunner
Date: Sat, 21 Dec 2024 18:42:59 +0100
Subject: [PATCH] add stuff for CAML

---
 materialandmethods.typ | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/materialandmethods.typ b/materialandmethods.typ
index e8ec933..62f1a54 100644
--- a/materialandmethods.typ
+++ b/materialandmethods.typ
@@ -195,20 +195,44 @@
 There are several different ResNet architectures, the most common are ResNet-18,
 For this bachelor theis the ResNet-50 architecture was used to predict the corresponding embeddings for the few-shot learning methods.
 
+=== P$>$M$>$F
+Todo
 === CAML
 // https://arxiv.org/pdf/2310.10971v2
 CAML (Context aware meta learning) is one of the state-of-the-art methods for few-shot learning.
-#todo[Here we should describe in detail how caml works]
+It consists of three components: a frozen pre-trained image encoder, a fixed Equal Length and Maximally Equiangular Set (ELMES) class encoder, and a non-causal sequence model.
+
+*Architecture:* CAML first encodes the query and support set images using the frozen pre-trained feature extractor, as shown in @camlarchitecture.
+This step maps the images into a low-dimensional space in which similar images receive similar embeddings.
+The class labels are encoded with the ELMES class encoder.
+Since the class of the query image is unknown at this stage, we add a special learnable "unknown token" to the encoder.
+This embedding is learned during pre-training.
+Afterwards, each image embedding is concatenated with the corresponding class embedding.
+
+#todo[We should add stuff here why we have a max amount of shots bc. of pretrained model]
+
+*ELMES Encoder:* The ELMES (Equal Length and Maximally Equiangular Set) encoder maps the class labels to vectors of equal length.
+The encoder is a bijective mapping between the labels and a set of vectors that are of equal length and maximally equiangular.
+A mapping is bijective if every label corresponds to exactly one vector and vice versa; a set of vectors is maximally equiangular if the angle between any two of them is the same and as large as possible.
+This is similar to one-hot encoding, but the symmetric structure of the vectors treats every class equally.
+
+*Non-causal sequence model:*
+The concatenated query and support-set embeddings are treated as one unordered sequence and processed by a Transformer-style encoder; since no causal mask is applied, every element can attend to every other element, and the output at the query position is used to predict the query class.
+
+*Large-Scale Pre-Training:*
+CAML is pre-trained episodically on few-shot classification tasks sampled from several large image datasets; the image encoder stays frozen, so only the sequence model (together with the unknown token) is updated.
+
+*Theoretical Analysis:*
+#todo[Maybe not that important?]
+
+*Results:*
 
 #figure(
   image("rsc/caml_architecture.png", width: 80%),
   caption: [Architecture of CAML. #cite()],
 )
 
-=== P$>$M$>$F
-Todo
-
 === Softmax
 #todo[Maybe remove this section]
 The Softmax function @softmax #cite() converts $n$ numbers of a vector into a probability distribution.
 
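
To make the ELMES property concrete: one way to obtain $d$ equal-length, maximally equiangular vectors is to center the $d$ one-hot label vectors and normalize them, which gives pairwise cosine similarity $-1/(d-1)$, the most negative value possible for $d$ vectors of equal length. The following numpy sketch is illustrative only; the function name `elmes` and this particular construction are assumptions for the illustration, not the paper's exact parameterization, which additionally carries the learnable unknown token.

```python
import numpy as np

def elmes(d: int) -> np.ndarray:
    """d equal-length, maximally equiangular label vectors:
    centered one-hot vectors, rescaled to unit length."""
    e = np.eye(d)                # one-hot label vectors
    v = e - e.mean(axis=0)       # center: e_i - (1/d) * 1
    return v / np.linalg.norm(v, axis=1, keepdims=True)

E = elmes(5)
print(np.round(E @ E.T, 3))      # 1 on the diagonal, -1/(d-1) = -0.25 elsewhere
```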
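
How the three components fit together can be sketched in PyTorch under assumed sizes; all dimensions, the variable names, and the stock `nn.TransformerEncoder` standing in for the paper's exact sequence model are hypothetical choices for this sketch.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; CAML uses a frozen pre-trained image encoder.
img_dim, lbl_dim, n_way, n_shot = 512, 64, 5, 1
d_model = img_dim + lbl_dim

seq_model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True),
    num_layers=4,
)
classifier = nn.Linear(d_model, n_way)

support_img = torch.randn(1, n_way * n_shot, img_dim)  # frozen-encoder features
query_img   = torch.randn(1, 1, img_dim)
support_lbl = torch.randn(1, n_way * n_shot, lbl_dim)  # ELMES class embeddings
unknown_tok = torch.randn(1, 1, lbl_dim)               # learnable "unknown token"

# Concatenate each image embedding with its class embedding, then join
# query and support into one sequence.
support = torch.cat([support_img, support_lbl], dim=-1)
query   = torch.cat([query_img, unknown_tok], dim=-1)
seq     = torch.cat([query, support], dim=1)

out    = seq_model(seq)          # no attention mask: every element sees all others
logits = classifier(out[:, 0])   # prediction read off at the query position
```

Because no attention mask is passed, the sequence model is non-causal: the prediction for the query can condition on the entire support set at once, independent of the order in which the support examples appear.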