make suggested typo changes
@@ -76,7 +76,7 @@ In contrast to traditional supervised learning, where a huge amount of labeled d
here we only have 1-10 samples per class (so-called shots).
So the model is prone to overfitting to the few training samples, which means they should represent the whole sample distribution as well as possible.~#cite(<parnami2022learningexamplessummaryapproaches>)

-Typically a few-shot learning task consists of a support and query set.
+Typically, a few-shot learning task consists of a support and query set.
The support set contains the training data and the query set the evaluation data for real-world evaluation.
A common way to describe a few-shot learning problem is the n-way k-shot notation.
For example, a task with 3 target classes and 5 training samples per class is a 3-way 5-shot classification problem.~@snell2017prototypicalnetworksfewshotlearning @patchcorepaper
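To make the notation concrete, here is a minimal sketch of how such an episode could be assembled (plain Python; `samples_by_class`, a mapping from class name to a list of samples, is a hypothetical input):

```python
import random

def sample_episode(samples_by_class, n_way=3, k_shot=5, n_query=5):
    """Draw an n-way k-shot episode: a support set for adaptation and a
    disjoint query set for evaluation."""
    classes = random.sample(sorted(samples_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        shots = random.sample(samples_by_class[cls], k_shot + n_query)
        support += [(x, label) for x in shots[:k_shot]]
        query += [(x, label) for x in shots[k_shot:]]
    return support, query
```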
@@ -89,7 +89,7 @@ These models learn a representation of each class in a reduced dimensionality an
caption: [Prototypical network for a 3-way 5-shot task. #cite(<snell2017prototypicalnetworksfewshotlearning>)],
) <prototypefewshot>

-The first and easiest method of this bachelor thesis uses a simple ResNet50 to calucalte those embeddings and clusters the shots together by calculating the class center.
+The first and easiest method of this bachelor thesis uses a simple ResNet50 to calculate those embeddings and clusters the shots together by calculating the class center.
This is basically a simple prototypical network.
See @resnet50impl.~@chowdhury2021fewshotimageclassificationjust
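As a rough sketch of this idea (not the exact code behind @resnet50impl), the class centers and the nearest-prototype classification could look like this in PyTorch:

```python
import torch
import torchvision

# Frozen ResNet50 as embedding function: drop the classification head.
backbone = torchvision.models.resnet50(weights="IMAGENET1K_V2")
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def classify(support_x, support_y, query_x, n_way):
    emb_s = backbone(support_x)   # (n_way * k_shot, 2048)
    emb_q = backbone(query_x)     # (n_query, 2048)
    # Class center = mean embedding over the k shots of each class.
    protos = torch.stack([emb_s[support_y == c].mean(0) for c in range(n_way)])
    # Assign each query to the nearest prototype (Euclidean distance).
    return torch.cdist(emb_q, protos).argmin(dim=1)
```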
@@ -186,7 +186,7 @@ This lowers computational costs while maintaining detection accuracy.~#cite(<pat
EfficientAD is another state-of-the-art method for anomaly detection.
It focuses on maintaining detection performance while achieving high computational efficiency.
At its core, EfficientAD uses a lightweight feature extractor, the Patch Description Network (PDN), which processes images in less than a millisecond on modern hardware.
-In comparison to Patchcore, which relies on a deeper, more computationally heavy WideResNet-101 network, the PDN uses only four convulutional layers and two pooling layers.
+In comparison to Patchcore, which relies on a deeper, more computationally heavy WideResNet-101 network, the PDN uses only four convolutional layers and two pooling layers.
This results in reduced latency while retaining the ability to generate patch-level features.~#cite(<efficientADpaper>)
#todo[reference to image below]
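As an illustration, a PDN-style extractor with four convolutional and two pooling layers could be sketched in PyTorch as follows; the channel counts and kernel sizes are assumptions, not the exact configuration from the paper:

```python
import torch.nn as nn

# Illustrative PDN-style feature extractor: four convolutions, two poolings.
pdn = nn.Sequential(
    nn.Conv2d(3, 128, kernel_size=4), nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(128, 256, kernel_size=4), nn.ReLU(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(256, 256, kernel_size=3), nn.ReLU(),
    nn.Conv2d(256, 384, kernel_size=4),  # patch-level feature map
)
```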
@@ -283,7 +283,7 @@ If a novel task is drawn from an unseen domain the model may fail to generalize
To overcome this, the model is optionally fine-tuned with the support set for a few gradient steps.
Data augmentation is used to generate a pseudo query set.
With the support set the class prototypes are calculated and compared against the model's predictions for the pseudo query set.
-With the loss of this step the whole model is fine-tuned to the new domain.~#cite(<pmfpaper>)
+During this step, the entire model is fine-tuned to the new domain.~#cite(<pmfpaper>)
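A rough sketch of that adaptation loop in PyTorch, assuming a `backbone` embedding network and a hypothetical `augment` callable that generates the pseudo query set:

```python
import torch
import torch.nn.functional as F

def prototypes(emb, labels, n_way):
    # Class centers, as in the prototypical-network sketch above.
    return torch.stack([emb[labels == c].mean(0) for c in range(n_way)])

def finetune(backbone, augment, support_x, support_y, n_way, steps=50, lr=1e-4):
    """Sketch of the optional fine-tuning: build a pseudo query set via data
    augmentation and fine-tune the whole backbone on the prototype loss."""
    opt = torch.optim.Adam(backbone.parameters(), lr=lr)
    for _ in range(steps):
        pseudo_q = augment(support_x)  # pseudo query set, labels stay the same
        protos = prototypes(backbone(support_x), support_y, n_way)
        logits = -torch.cdist(backbone(pseudo_q), protos)
        loss = F.cross_entropy(logits, support_y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```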
#figure(
image("rsc/pmfarchitecture.png", width: 100%),
@@ -400,7 +400,7 @@ Also, fine-tuning the model can require considerable computational resources, wh
-// https://arxiv.org/pdf/2208.10559v1
+// https://arxiv.org/abs/2208.10559v1

-TRIDENT, a variational infernce network, is a few-shot learning approach which decouples image representation into semantic and label-specific latent variables.
+TRIDENT, a variational inference network, is a few-shot learning approach which decouples image representation into semantic and label-specific latent variables.
Semantic attributes contain context or stylistic information, while label-specific attributes focus on the characteristics crucial for classification.
By decoupling these parts, TRIDENT enhances the network's ability to generalize effectively to unseen data.~#cite(<singh2022transductivedecoupledvariationalinference>)
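To illustrate the decoupling (an illustrative sketch, not TRIDENT's actual architecture; all layer sizes are assumptions), a shared feature extractor could feed two separate variational heads, where only the label-specific latent reaches the classifier:

```python
import torch
import torch.nn as nn

class DecoupledEncoder(nn.Module):
    def __init__(self, feat_dim=512, z_dim=64, n_way=5):
        super().__init__()
        self.features = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())
        self.semantic_head = nn.Linear(feat_dim, 2 * z_dim)  # mean + log-variance
        self.label_head = nn.Linear(feat_dim, 2 * z_dim)     # mean + log-variance
        self.classifier = nn.Linear(z_dim, n_way)

    def forward(self, x):
        h = self.features(x.flatten(1))
        mu_s, logv_s = self.semantic_head(h).chunk(2, dim=-1)  # semantic latent
        mu_l, logv_l = self.label_head(h).chunk(2, dim=-1)     # label-specific latent
        # Reparameterization trick: only the label-specific latent is classified.
        z_l = mu_l + (0.5 * logv_l).exp() * torch.randn_like(mu_l)
        return self.classifier(z_l), (mu_s, logv_s), (mu_l, logv_l)
```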
@@ -427,7 +427,7 @@ The transform features parameterless-ness, which makes it easy to integrate into
It is differentiable, which allows for end-to-end training, for example (re-)training the hosting network to adapt to SOT.
SOT is equivariant, which means that permuting the order of the input features permutes the transformed features in the same way.~#cite(<shalam2022selfoptimaltransportfeaturetransform>)

-The improvements of SOT over traditional feature transforms dpeend on the used backbone network and the task.
+The improvements of SOT over traditional feature transforms depend on the used backbone network and the task.
In most cases, however, it outperforms state-of-the-art methods and can be used as a drop-in replacement for existing feature transforms.~#cite(<shalam2022selfoptimaltransportfeaturetransform>)
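A simplified sketch of the core mechanism in PyTorch: pairwise feature similarities are re-normalized into a doubly stochastic transport plan via Sinkhorn iterations, and the rows of that plan serve as the transformed features (the regularization strength, iteration count, and diagonal handling are assumptions):

```python
import torch
import torch.nn.functional as F

def sot(features, n_iter=10, reg=0.1):
    """Re-embed each sample by its optimal-transport plan to the rest of
    the batch; permuting the inputs permutes the rows/columns alike."""
    f = F.normalize(features, dim=1)
    cost = 1.0 - f @ f.T          # pairwise cosine distance
    cost.fill_diagonal_(1e3)      # discourage trivial self-matching
    plan = torch.exp(-cost / reg)
    for _ in range(n_iter):       # Sinkhorn: alternate row/column normalization
        plan = plan / plan.sum(dim=1, keepdim=True)
        plan = plan / plan.sum(dim=0, keepdim=True)
    return plan                   # rows are the transformed features
```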
// anomaly detect