add stuff for CAML
@@ -15,6 +15,10 @@ For all of the three methods we test the following use-cases: #todo[maybe write m
Those experiments were conducted on the MVTec AD dataset, using the bottle and cable classes.

== Experiment Setup

#todo[Setup of experiments, which classes used, nr of samples]

== ResNet50

=== Approach

The simplest approach is to use a pre-trained ResNet50 model as a feature extractor.

@@ -79,23 +83,27 @@ After creating the embeddings for the support and query set the euclidean distance
The class with the smallest distance is chosen as the predicted class.
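
A minimal sketch of this baseline, assuming torchvision's pre-trained ResNet50 with its classification head replaced by an identity mapping. Whether distances are taken to per-class mean embeddings (prototypes) or to individual support embeddings is an implementation detail, and all names here are illustrative:

```python
# Minimal sketch of the ResNet50 nearest-prototype baseline (illustrative names,
# assuming per-class mean embeddings as prototypes).
import torch
from torchvision import models

backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()  # drop the classifier -> 2048-d embeddings
backbone.eval()

@torch.no_grad()
def predict(support: torch.Tensor, support_labels: torch.Tensor,
            query: torch.Tensor) -> torch.Tensor:
    """support/query: normalized image batches of shape (N, 3, 224, 224)."""
    s, q = backbone(support), backbone(query)
    classes = support_labels.unique()
    # One prototype per class: the mean of its support embeddings.
    prototypes = torch.stack([s[support_labels == c].mean(dim=0) for c in classes])
    # Euclidean distances between query embeddings and prototypes.
    dists = torch.cdist(q, prototypes)       # (n_query, n_classes)
    return classes[dists.argmin(dim=1)]      # smallest distance wins
```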

=== Results

This method performed better than expected for such a simple approach.

#todo[Add images of graphs with ResNet50 stuff only]

== P>M>F

=== Approach

=== Results

== CAML

=== Approach

For the CAML implementation the pretrained model weights from the original paper were used.
As a feature extractor a ViT-B/16 model was used, which is a Vision Transformer with a patch size of 16.
This feature extractor had already been pretrained when used by the authors of the original paper.
For the non-causal sequence model a transformer model was used.
It consists of 24 layers with 16 attention heads, a hidden dimension of 1024 and an output MLP size of 4096.
This transformer was trained on a huge number of images as described in @CAML.
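
As a rough sketch of how these pieces fit together, the dimensions below (24 layers, 16 heads, hidden dimension 1024, MLP size 4096) follow the text above; the projection layer, the "unknown" label slot for the query, and the stand-in class encoder are simplified assumptions, not the reference implementation:

```python
# Sketch of the CAML-style components; the transformer dims follow the text,
# everything else is a simplified assumption.
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

class CamlSketch(nn.Module):
    def __init__(self, n_classes: int, d_model: int = 1024):
        super().__init__()
        self.encoder = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
        self.encoder.heads = nn.Identity()              # frozen ViT-B/16 -> 768-d
        for p in self.encoder.parameters():
            p.requires_grad = False
        # Fixed (non-trainable) class embeddings; index n_classes marks the query.
        self.class_emb = nn.Embedding(n_classes + 1, 256)
        self.class_emb.weight.requires_grad = False
        self.proj = nn.Linear(768 + 256, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=16,
                                           dim_feedforward=4096, batch_first=True)
        self.seq_model = nn.TransformerEncoder(layer, num_layers=24)  # non-causal
        self.head = nn.Linear(d_model, n_classes)       # output MLP

    def forward(self, support, support_labels, query):
        """support: (S, 3, 224, 224), support_labels: (S,) long, query: (1, 3, 224, 224)."""
        feats = self.encoder(torch.cat([query, support]))          # (1+S, 768)
        query_label = torch.tensor([self.head.out_features])       # "unknown" slot
        labels = self.class_emb(torch.cat([query_label, support_labels]))
        seq = self.proj(torch.cat([feats, labels], dim=-1)).unsqueeze(0)
        out = self.seq_model(seq)                                  # joint attention
        return self.head(out[0, 0])                                # query logits
```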

=== Results

The results were not as good as expected.
This might be caused by the fact that the model was trained on a large number of general-purpose images and was not fine-tuned for any industrial dataset domain.
It might therefore not handle very similar images well.

#todo[Add images of graphs with CAML stuff only]

== Jupyter

To get accurate performance measures the active-learning process was implemented in a Jupyter notebook first.
This helps to choose which of the methods performs best and which one to use in the final Dagster pipeline.
A straightforward machine-learning pipeline was implemented with the help of PyTorch and ResNet-18.

Moreover, the dataset was manually imported with a custom torch dataloader and preprocessed with random augmentations.
After each loop iteration the Area Under the Curve (AUC) was calculated over the validation set to get a performance measure.
All those AUC values were visualized in a line plot; see @sec:experimental-results for the results.
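
A condensed sketch of that bookkeeping, assuming binary anomaly labels and scikit-learn's ROC-AUC; the `step` callable is a placeholder for one labeling-training-scoring round of the notebook, not its actual function name:

```python
# Sketch of the per-iteration AUC tracking; `step` stands in for one
# active-learning round (label, train, score) from the notebook.
from typing import Callable, Sequence
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score

def track_auc(step: Callable[[int], tuple[Sequence[int], Sequence[float]]],
              n_iterations: int) -> list[float]:
    history = []
    for i in range(n_iterations):
        y_true, y_score = step(i)            # validation labels and anomaly scores
        history.append(roc_auc_score(y_true, y_score))
    plt.plot(history)                        # AUC per iteration as a line plot
    plt.xlabel("active-learning iteration")
    plt.ylabel("validation AUC")
    plt.show()
    return history
```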

@@ -1,3 +1,5 @@
#import "utils.typ": todo

= Introduction
== Motivation

Anomaly detection is of essential importance, especially in the industrial and automotive fields.

@@ -31,4 +33,4 @@ How does it compare to PatchCore and EfficientAD?
// I've tried different distance measures $->$ but results are pretty much the same.
== Outline

#todo[Todo]

@@ -197,9 +197,11 @@ There are several different ResNet architectures, the most common are ResNet-18,
For this bachelor thesis the ResNet-50 architecture was used to compute the corresponding embeddings for the few-shot learning methods.

=== P$>$M$>$F

#todo[Todo]#cite(<pmfpaper>)
// https://arxiv.org/pdf/2204.07305

=== CAML <CAML>
// https://arxiv.org/pdf/2310.10971v2
CAML (Context-Aware Meta-Learning) is one of the state-of-the-art methods for few-shot learning.
It consists of three different components: a frozen pre-trained image encoder, a fixed Equal Length and Maximally Equiangular Set (ELMES) class encoder and a non-causal sequence model.
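
To make the ELMES class encoder concrete: an Equal Length and Maximally Equiangular Set can be realized as a simplex equiangular tight frame, in which all vectors share the same norm and every pair has the same, maximally negative cosine of $-1/(k-1)$. The construction below is my reading of that definition, not code from the paper:

```python
# Sketch: a k-class ELMES as a simplex equiangular tight frame (my reading
# of "Equal Length and Maximally Equiangular Set", not code from the paper).
import torch

def elmes(k: int) -> torch.Tensor:
    """k unit-length vectors in R^k with pairwise cosine -1/(k-1)."""
    u = torch.eye(k) - torch.full((k, k), 1.0 / k)   # center the standard basis
    return u / u.norm(dim=1, keepdim=True)           # equal (unit) length

e = elmes(5)
cos = e @ e.T                        # Gram matrix: 1 on the diagonal,
print(cos.round(decimals=3))         # -1/(k-1) = -0.25 everywhere else
```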

@@ -237,12 +239,9 @@ Afterwards it is passed through a simple MLP network to predict the class of the
*Large-Scale Pre-Training:*
CAML is pre-trained on a huge number of images from ImageNet-1k, Fungi, MSCOCO, and WikiArt datasets.
Those datasets span different domains and help the model to detect new visual concepts during inference.
Only the non-causal sequence model is trained; the weights of the image encoder and ELMES encoder are kept frozen.
~#cite(<caml_paper>)

*Theoretical Analysis:*
#todo[Maybe not that important?]

*Inference:*
During inference, CAML processes the following (a code sketch follows the list):
- Encodes the support set images and labels with the pre-trained feature and class encoders.
- Concatenates the query and support representations into one joint sequence.
- Passes the sequence through the non-causal sequence model, enabling dynamic interaction between query and support set representations.
- Extracts the transformed query embedding and classifies it using a Multi-Layer Perceptron (MLP).~#cite(<caml_paper>)
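
Restating those steps as tensor operations; `feature_encoder`, `class_encoder`, `seq_model` and `mlp` are hypothetical stand-ins for CAML's pre-trained components, not the actual API:

```python
# The inference steps above as tensor operations; the four module arguments
# stand in for CAML's pre-trained components (illustrative only).
import torch

def caml_inference(feature_encoder, class_encoder, seq_model, mlp,
                   support_images, support_labels, query_image):
    # 1. Encode support (and query) images; ELMES-encode the support labels.
    feats = feature_encoder(torch.cat([query_image, support_images]))
    class_feats = class_encoder(support_labels)
    unknown = torch.zeros_like(class_feats[:1])       # query position has no label
    # 2. Build one joint sequence, query first.
    seq = torch.cat([feats, torch.cat([unknown, class_feats])], dim=-1).unsqueeze(0)
    # 3. Non-causal sequence model: query and support attend to each other.
    out = seq_model(seq)
    # 4. Classify the transformed query embedding with the MLP.
    return mlp(out[0, 0])
```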
*Performance:*
CAML achieves state-of-the-art performance in universal meta-learning across 11 few-shot classification benchmarks,
including generic object recognition (e.g., MiniImageNet), fine-grained classification (e.g., CUB, Aircraft),
and cross-domain tasks (e.g., Pascal+Paintings).

sources.bib
@@ -127,3 +127,13 @@
year = {2021},
  publisher = {Johannes Kepler Universität Linz}
}

@misc{pmfpaper,
  title = {Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference},
  author = {Shell Xu Hu and Da Li and Jan Stühmer and Minyoung Kim and Timothy M. Hospedales},
  year = {2022},
  eprint = {2204.07305},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV},
  url = {https://arxiv.org/abs/2204.07305},
}