make suggested typo changes

2025-01-25 11:31:50 +01:00
parent 0da616107f
commit af58cda976
4 changed files with 11 additions and 11 deletions

@@ -76,7 +76,7 @@ In contrast to traditional supervised learning, where a huge amount of labeled d
here we only have 1-10 samples per class (so-called shots).
So the model is prone to overfitting to the few training samples, which means they should represent the whole sample distribution as well as possible.~#cite(<parnami2022learningexamplessummaryapproaches>)
-Typically a few-shot leaning task consists of a support and query set.
+Typically, a few-shot learning task consists of a support set and a query set.
The support set contains the training data, while the query set contains the evaluation data for real-world evaluation.
A common way to describe a few-shot learning problem is the n-way k-shot notation.
For example, 3 target classes with 5 samples per class for training constitutes a 3-way 5-shot classification problem.~@snell2017prototypicalnetworksfewshotlearning @patchcorepaper
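To make the notation concrete, here is a minimal sketch (not part of the thesis code) of how such an episode could be sampled; the `dataset` mapping from class labels to sample lists is an assumed structure:

```python
import random

def sample_episode(dataset, n_way=3, k_shot=5, n_query=5):
    # dataset: dict mapping each class label to a list of samples (assumed layout)
    classes = random.sample(list(dataset), n_way)          # pick the n target classes
    support, query = [], []
    for label in classes:
        samples = random.sample(dataset[label], k_shot + n_query)
        support += [(x, label) for x in samples[:k_shot]]  # k shots for training
        query += [(x, label) for x in samples[k_shot:]]    # held out for evaluation
    return support, query
```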
@@ -89,7 +89,7 @@ These models learn a representation of each class in a reduced dimensionality an
caption: [Prototypical network for 3-way 5-shot classification. #cite(<snell2017prototypicalnetworksfewshotlearning>)],
) <prototypefewshot>
-The first and easiest method of this bachelor thesis uses a simple ResNet50 to calucalte those embeddings and clusters the shots together by calculating the class center.
+The first and simplest method of this bachelor thesis uses a plain ResNet50 to calculate those embeddings and clusters the shots by computing the class centers.
This is essentially a simple prototypical network.
See @resnet50impl.~@chowdhury2021fewshotimageclassificationjust
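The following minimal sketch illustrates this baseline, assuming a frozen ImageNet-pretrained ResNet50 as the feature extractor; it is an illustration, not the implementation referenced in @resnet50impl:

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights

backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()   # drop the classifier head, keep 2048-d embeddings
backbone.eval()

@torch.no_grad()
def class_prototypes(support_x, support_y, n_way):
    emb = backbone(support_x)                             # (N, 2048) embeddings
    return torch.stack([emb[support_y == c].mean(dim=0)   # class center = mean of shots
                        for c in range(n_way)])

@torch.no_grad()
def classify(query_x, protos):
    dists = torch.cdist(backbone(query_x), protos)   # Euclidean distance to each center
    return dists.argmin(dim=1)                       # nearest prototype wins
```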
@@ -186,7 +186,7 @@ This lowers computational costs while maintaining detection accuracy.~#cite(<pat
EfficientAD is another state-of-the-art method for anomaly detection.
It focuses on maintaining detection performance as well as high computational efficiency.
At its core, EfficientAD uses a lightweight feature extractor, the Patch Description Network (PDN), which processes images in less than a millisecond on modern hardware.
-In comparison to Patchcore, which relies on a deeper, more computationaly heavy WideResNet-101 network, the PDN uses only four convulutional layers and two pooling layers.
+In comparison to Patchcore, which relies on a deeper, more computationally heavy WideResNet-101 network, the PDN uses only four convolutional layers and two pooling layers.
This results in reduced latency while retaining the ability to generate patch-level features.~#cite(<efficientADpaper>)
#todo[reference to image below]
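As a rough sketch, a PDN-style extractor with four convolutional and two pooling layers could look as follows; the channel widths and kernel sizes are illustrative assumptions, not the exact values from the EfficientAD paper:

```python
import torch.nn as nn

# PDN-style patch descriptor: four conv layers, two pooling layers.
pdn = nn.Sequential(
    nn.Conv2d(3, 128, kernel_size=4), nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(128, 256, kernel_size=4), nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(256, 256, kernel_size=3), nn.ReLU(),
    nn.Conv2d(256, 384, kernel_size=4),   # dense patch-level feature map
)
```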
@@ -283,7 +283,7 @@ If a novel task is drawn from an unseen domain the model may fail to generalize
To overcome this, the model is optionally fine-tuned on the support set for a few gradient steps.
Data augmentation is used to generate a pseudo query set.
With the support set, the class prototypes are calculated and compared against the model's predictions for the pseudo query set.
-With the loss of this step the whole model is fine-tuned to the new domain.~#cite(<pmfpaper>)
+During this step, the entire model is fine-tuned to the new domain.~#cite(<pmfpaper>)
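A condensed sketch of this fine-tuning loop, assuming a generic `backbone`, an `augment` helper that produces the pseudo query set, and cross-entropy over negative prototype distances; this illustrates the described procedure and is not the reference implementation:

```python
import torch
import torch.nn.functional as F

def finetune_on_support(backbone, support_x, support_y, n_way, steps=50, lr=1e-4):
    opt = torch.optim.Adam(backbone.parameters(), lr=lr)
    for _ in range(steps):
        pseudo_x = augment(support_x)   # pseudo query set via augmentation (assumed helper)
        protos = torch.stack([backbone(support_x[support_y == c]).mean(dim=0)
                              for c in range(n_way)])       # class prototypes
        logits = -torch.cdist(backbone(pseudo_x), protos)   # closer center = higher score
        loss = F.cross_entropy(logits, support_y)           # pseudo queries keep their labels
        opt.zero_grad(); loss.backward(); opt.step()
```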
#figure(
image("rsc/pmfarchitecture.png", width: 100%),
@@ -400,7 +400,7 @@ Also, fine-tuning the model can require considerable computational resources, wh
// https://arxiv.org/pdf/2208.10559v1
// https://arxiv.org/abs/2208.10559v1
-TRIDENT, a variational infernce network, is a few-shot learning approach which decouples image representation into semantic and label-specific latent variables.
+TRIDENT, a variational inference network, is a few-shot learning approach which decouples image representation into semantic and label-specific latent variables.
Semantic attributes contain context or stylistic information, while label-specific attributes focus on the characteristics crucial for classification.
By decoupling these parts, TRIDENT enhances the network's ability to generalize effectively to unseen data.~#cite(<singh2022transductivedecoupledvariationalinference>)
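A highly simplified sketch of this decoupling, assuming Gaussian latents over shared backbone features; TRIDENT's actual inference networks are more elaborate, so this only illustrates the idea:

```python
import torch
import torch.nn as nn

class DecoupledEncoder(nn.Module):
    # Two variational heads over shared features: one for semantic context,
    # one for label-specific attributes.
    def __init__(self, feat_dim=2048, z_dim=64):
        super().__init__()
        self.semantic_head = nn.Linear(feat_dim, 2 * z_dim)   # outputs mu and log-variance
        self.label_head = nn.Linear(feat_dim, 2 * z_dim)

    def forward(self, feats):
        def sample(head):
            mu, logvar = head(feats).chunk(2, dim=-1)
            return mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        z_semantic = sample(self.semantic_head)   # context / style information
        z_label = sample(self.label_head)         # classification-relevant information
        return z_semantic, z_label                # only z_label feeds the classifier
```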
@@ -427,7 +427,7 @@ The transform features parameterless-ness, which makes it easy to integrate into
It is differentiable, which allows for end-to-end training, for example (re-)training the hosting network to adapt to SOT.
SOT is permutation-equivariant: reordering the input features reorders the transformed features in the same way.~#cite(<shalam2022selfoptimaltransportfeaturetransform>)
-The improvements of SOT over traditional feature transforms dpeend on the used backbone network and the task.
+The improvements of SOT over traditional feature transforms depend on the used backbone network and the task.
In most cases, however, it outperforms state-of-the-art methods and can be used as a drop-in replacement for existing feature transforms.~#cite(<shalam2022selfoptimaltransportfeaturetransform>)
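A compact sketch of the transform's core computation: a Sinkhorn-normalized self-similarity matrix whose rows serve as the transformed features. The iteration count, temperature, and diagonal handling below are illustrative assumptions:

```python
import torch

def sot(features, n_iters=10, temperature=0.1):
    # features: (N, d) embeddings of all items in an episode.
    x = torch.nn.functional.normalize(features, dim=-1)
    cost = 1.0 - x @ x.T                 # pairwise cosine distances
    cost.fill_diagonal_(1e3)             # discourage trivial self-matching
    log_k = -cost / temperature          # transport kernel in log space
    # Sinkhorn iterations: alternate row/column normalization so the matrix
    # approaches a doubly stochastic optimal-transport plan.
    for _ in range(n_iters):
        log_k = log_k - log_k.logsumexp(dim=1, keepdim=True)
        log_k = log_k - log_k.logsumexp(dim=0, keepdim=True)
    return log_k.exp()                   # row i is the transformed feature of item i
```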
// anomaly detect