fix more comma errors

All checks were successful
Build Typst document / build_typst_documents (push) Successful in 17s

@@ -14,7 +14,7 @@ In most of the tests P>M>F performed the best.
 But also the simple ResNet50 method performed better than expected in most cases and can be considered if the computational resources are limited and if a simple architecture is enough.
 
 == Outlook
-In the future when new Few-Shot learning methods evolve it could be interesting to test again how they perform in anomaly detection tasks.
+In the future, when new Few-Shot learning methods evolve, it could be interesting to test again how they perform in anomaly detection tasks.
 There might be a lack of research in the area where the classes to detect are very similar to each other
 and when building a few-shot learning algorithm tailored specifically for very similar classes this could boost the performance by a large margin.
 

@@ -14,7 +14,7 @@ Both are trained with samples from the 'good' class only.
 So there is a clear performance gap between Few-Shot learning and the state of the art anomaly detection algorithms.
 In the @comparison2way Patchcore and EfficientAD are not included as they aren't directly compareable in the same fashion.
 
-That means if the goal is just to detect anomalies, Few-Shot learning is not the best choice and Patchcore or EfficientAD should be used.
+That means if the goal is just to detect anomalies, Few-Shot learning is not the best choice, and Patchcore or EfficientAD should be used.
 
 #subpar.grid(
   figure(image("rsc/comparison-2way-bottle.png"), caption: [
@@ -97,7 +97,7 @@ One could use a well established algorithm like PatchCore or EfficientAD for det
     8-Way - Cable class
   ]), <comparisonfaultyonlycable>,
   columns: (1fr, 1fr),
-  caption: [Nomaly class only  classification performance],
+  caption: [Anomaly class only  classification performance],
   label: <comparisonnormal>,
 )
 

@@ -92,7 +92,7 @@ After creating the embeddings for the support and query set the euclidean distan
 The class with the smallest distance is chosen as the predicted class.
 
 === Results <resnet50perf>
-This method performed better than expected wich such a simple method.
+This method performed better than expected with such a simple method.
 As in @resnet50bottleperfa with a normal 5 shot / 4 way classification the model achieved an accuracy of 75%.
 When detecting if there occured an anomaly or not only the performance is significantly better and peaks at 81% with 5 shots / 2 ways.
 Interestintly the model performed slightly better with fewer shots in this case.
@@ -136,7 +136,7 @@ but this is expected as the cable class consists of 8 faulty classes.
 
 == P>M>F
 === Approach
-For P>M>F I used the pretrained model weights from the original paper.
+For P>M>F, I used the pretrained model weights from the original paper.
 As backbone feature extractor a DINO model is used, which is pre-trained by facebook.
 This is a vision transformer with a patch size of 16 and 12 attention heads learned in a self-supervised fashion.
 This feature extractor was meta-trained with 10 public image dasets #footnote[ImageNet-1k, Omniglot, FGVC-
@@ -144,7 +144,7 @@ Aircraft, CUB-200-2011, Describable Textures, QuickDraw,
 FGVCx Fungi, VGG Flower, Traffic Signs and MSCOCO~@pmfpaper]
  of diverse domains  by the authors of the original paper.~@pmfpaper
 
-Finally, this model is finetuned with the support set of every test iteration.
+Finally, this model is fine-tuned with the support set of every test iteration.
 Every time the support set changes, we need to finetune the model again.
 In a real world scenario this should not be the case because the support set is fixed and only the query set changes.
 
@@ -196,12 +196,12 @@ This transformer was trained on a huge number of images as described in @CAML.
 
 === Results
 The results were not as good as expeced.
-This might be caused by the fact that the model was not fine-tuned for any industrial dataset domain.
+This might be because the model was not  fine-tuned for any industrial dataset domain.
 The model was trained on a large number of general purpose images and is not fine-tuned at all.
 Moreover, it was not fine-tuned on the support set similar to the P>M>F method, which could have a huge impact on performance.
 It might also not handle very similar images well.
 
-Compared the the other two methods, CAML performed poorly in almost all experiments.
+Compared to the other two methods, CAML performed poorly in almost all experiments.
 The normal few-shot classification reached only 40% accuracy in @camlperfa at best.
 The only test it did surprisingly well was the detection of the anomaly class for the cable class in @camlperfb were it reached almost 60% accuracy.
 

@@ -249,7 +249,7 @@ For this bachelor thesis the ResNet-50 architecture was used to predict the corr
 
 === P$>$M$>$F
 // https://arxiv.org/pdf/2204.07305
-P>P>F (Pre-training > Meta-training > Fine-tuning) is a three-stage pipeline designed for few-shot learning.
+P>M>F (Pre-training > Meta-training > Fine-tuning) is a three-stage pipeline designed for few-shot learning.
 It focuses on simplicity but still achieves competitive performance.
 The three stages convert a general feature extractor into a task-specific model through fine-tuned optimization.
 #cite(<pmfpaper>)
@@ -309,7 +309,7 @@ Future research could focus on exploring faster and more efficient methods for f
 
 === CAML <CAML>
 // https://arxiv.org/pdf/2310.10971v2
-CAML (Context aware meta learning) is one of the state-of-the-art methods for few-shot learning.
+CAML (Context-Aware Meta-Learning) is one of the state-of-the-art methods for few-shot learning.
 It consists of three different components: a frozen pre-trained image encoder, a fixed Equal Length and Maximally Equiangular Set (ELMES) class encoder and a non-causal sequence model.
 This is a universal meta-learning approach.
 That means no fine-tuning or meta-training is applied for specific domains.~#cite(<caml_paper>)