add abstract, finish the alternative methods and fix some todos and improve sources

This commit is contained in:
2025-01-14 19:22:15 +01:00
parent 7c54e11238
commit 49d5e97417
6 changed files with 127 additions and 21 deletions


@ -7,15 +7,19 @@
The three methods described (ResNet50, CAML, P>M>F) were implemented in a Jupyter notebook and compared to each other.
== Experiments <experiments>
For all of the three methods we test the following use-cases:
- Detection of anomaly class (1,3,5 shots)
- Every faulty class and the good class are detected.
- 2 Way classification (1,3,5 shots)
- Only faulty or not faulty is detected. All the samples of the faulty classes are treated as a single class.
- Detect only anomaly classes (1,3,5 shots)
- Similar to the first test but without the good class. Only faulty classes are detected.
- Imbalanced 2 Way classification (5,10,15,30 good shots, 5 bad shots)
- Similar to the 2 way classification but with an imbalanced number of good shots.
- Imbalanced target class prediction (5,10,15,30 good shots, 5 bad shots)#todo[Avoid bullet points and write flow text?]
- Detect only the faulty classes without the good class with an imbalanced number of shots.
All those experiments were conducted on the MVTEC AD dataset on the bottle and cable classes.
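The 2 Way classification case only requires collapsing every faulty class into a single label. A minimal sketch of that remapping (the concrete MVTEC AD label strings used here are assumptions for illustration):

```python
def to_two_way(label, good_label="good"):
    """Collapse all faulty classes into one label for 2-way classification.

    Any label other than the good class (e.g. MVTEC defect names such as
    "broken_large" or "contamination") is treated as the single class "faulty".
    """
    return "good" if label == good_label else "faulty"
```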
== Experiment Setup
All the experiments were done on the bottle and cable classes of the MVTEC AD dataset.
@ -23,20 +27,21 @@ The corresponding number of shots was randomly selected from the dataset.
The rest of the images were used to test the model and measure the accuracy.
#todo[Maybe add real number of samples per classes]
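The random support/query split described above can be sketched as follows (a hypothetical helper, not the thesis code; `images_by_class` maps each class name to its list of image paths):

```python
import random

def split_support_query(images_by_class, n_shots, seed=0):
    """Randomly pick n_shots support images per class; the rest form the test set.

    images_by_class: dict mapping a class name to a list of image paths.
    Returns two dicts with the same keys: support set and query/test set.
    """
    rng = random.Random(seed)  # fixed seed for reproducible splits
    support, query = {}, {}
    for cls, images in images_by_class.items():
        shuffled = images[:]
        rng.shuffle(shuffled)
        support[cls] = shuffled[:n_shots]
        query[cls] = shuffled[n_shots:]
    return support, query
```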
== ResNet50 <resnet50impl>
=== Approach
The simplest approach is to use a pre-trained ResNet50 model as a feature extractor.
From both the support and the query set the features are extracted to obtain a down-projected representation of the images.
The support set embeddings are compared to the query set embeddings.
To predict the class of a query, the class whose support embedding has the smallest distance to the query embedding is chosen.
If there is more than one support embedding for the same class, the mean of those embeddings is used (class center).
This approach is similar to a prototypical network @snell2017prototypicalnetworksfewshotlearning and the work of _Just Use a Library of Pre-trained Feature
Extractors and a Simple Classifier_ @chowdhury2021fewshotimageclassificationjust, but with a simple distance metric instead of a neural network.
In this bachelor thesis a pre-trained ResNet50 (IMAGENET1K_V2) PyTorch model was used.
It is pretrained on the ImageNet dataset and has 50 residual layers.
To get the embeddings the last layer of the model was removed and the output of the second last layer was used as embedding output.
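The nearest-class-center rule described above can be sketched in plain NumPy (a minimal illustration, assuming the embeddings have already been extracted with the truncated ResNet50; function and variable names are hypothetical):

```python
import numpy as np

def classify_nearest_center(support_emb, support_labels, query_emb):
    """Nearest-class-center classification on pre-extracted embeddings.

    support_emb: (n_support, d) array, query_emb: (n_query, d) array.
    Support embeddings sharing a label are averaged into a class center;
    each query gets the label of the closest center (Euclidean distance).
    """
    classes = sorted(set(support_labels))
    centers = np.stack([
        np.mean([e for e, l in zip(support_emb, support_labels) if l == c], axis=0)
        for c in classes
    ])
    # distance of every query to every center: shape (n_query, n_classes)
    dists = np.linalg.norm(query_emb[:, None, :] - centers[None, :, :], axis=-1)
    return [classes[i] for i in dists.argmin(axis=1)]
```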
In the following diagram the ResNet50 architecture is visualized and the cut-point is marked.~@chowdhury2021fewshotimageclassificationjust
#diagram(
spacing: (5mm, 5mm),