\documentclass[sigconf]{acmart}
\usepackage{amsmath}
\usepackage{bbm}
\usepackage{mathtools}

%%
%% \BibTeX command to typeset BibTeX logo in the docs
\AtBeginDocument{%
  \providecommand\BibTeX{{%
    \normalfont B\kern-0.5em{\scshape i\kern-0.25em b}\kern-0.8em\TeX}}}

%% Rights management information. This information is sent to you
%% when you complete the rights form. These commands have SAMPLE
%% values in them; it is your responsibility as an author to replace
%% the commands and values with those provided to you when you
%% complete the rights form.
\setcopyright{acmcopyright}
\copyrightyear{2018}
\acmYear{2018}
\acmDOI{XXXXXXX.XXXXXXX}
%% These commands are for a PROCEEDINGS abstract or paper.
\acmConference[Conference acronym 'XX]{Make sure to enter the correct
  conference title from your rights confirmation email}{June 03--05,
  2018}{Woodstock, NY}
%
% Uncomment \acmBooktitle if the title of the proceedings is different
% from ``Proceedings of ...''!
%
%\acmBooktitle{Woodstock '18: ACM Symposium on Neural Gaze Detection,
%  June 03--05, 2018, Woodstock, NY}
\acmPrice{15.00}
\acmISBN{978-1-4503-XXXX-X/18/06}
%%
%% end of the preamble, start of the body of the document source.
\begin{document}

%%
%% The "title" command has an optional parameter,
%% allowing the author to define a "short title" to be used in page headers.
\title{Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition}

%%
%% The "author" command and its associated commands are used to define
%% the authors and their affiliations.
\author{Lukas Heiligenbrunner}
\email{k12104785@students.jku.at}
\affiliation{%
  \institution{Johannes Kepler University Linz}
  \city{Linz}
  \state{Upper Austria}
  \country{Austria}
  \postcode{4020}
}
%%
%% By default, the full list of authors will be used in the page
%% headers. Often, this list is too long, and will overlap
%% other information printed in the page headers. This command allows
%% the author to define a more concise list
%% of authors' names for this purpose.
\renewcommand{\shortauthors}{Heiligenbrunner}
%%
%% The abstract is a short summary of the work to be presented in the
%% article.
\begin{abstract}
  Cross-Model Pseudo-Labeling is a framework for generating pseudo-labels
  in learning tasks where only a subset of the true labels is known.
  It builds on the existing FixMatch approach and improves it further by
  using two models of different sizes that complement each other.
\end{abstract}
%%
%% Keywords. The author(s) should pick words that accurately describe
%% the work being presented. Separate the keywords with commas.
\keywords{neural networks, videos, pseudo-labeling, action recognition}

\received{20 February 2007}
\received[revised]{12 March 2009}
\received[accepted]{5 June 2009}

%%
%% This command processes the author and affiliation and title
%% information and builds the first part of the formatted document.
\maketitle
\section{Introduction}
Most supervised learning tasks require large amounts of training samples.
With too little training data the model generalizes poorly and does not transfer to real-world tasks.
Labeling datasets is commonly regarded as expensive, so it should be avoided as far as possible.
This is the motivation for the machine-learning field of semi-supervised learning.
The general approach is to train a model that predicts pseudo-labels, which can then be used to train the main model.
\section{Semi-Supervised Learning}
In traditional supervised learning we have a labeled dataset.
Each datapoint is associated with a corresponding target label.
The goal is to fit a model that predicts the labels from the datapoints.

In traditional unsupervised learning no labels are known.
The goal is to find patterns and structures in the data.

Combining these two techniques yields semi-supervised learning.
Some of the labels are known, but for most of the data only the raw datapoints are available.
The basic idea is that the unlabeled data can significantly improve model performance when used in combination with the labeled data.
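A common way to combine the two signals is a single training objective that adds a supervised loss $\mathcal{L}_s$ on the labeled samples and a weighted unsupervised loss $\mathcal{L}_u$ on the unlabeled samples. As a sketch (the trade-off weight $\lambda$ is a generic hyperparameter, not a value from this work):
\begin{equation}
    \mathcal{L} = \mathcal{L}_s + \lambda \mathcal{L}_u
\end{equation}
The next section describes how FixMatch constructs $\mathcal{L}_u$ from pseudo-labels.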
\section{FixMatch}\label{sec:fixmatch}
An existing approach is FixMatch, introduced in a Google Research paper from 2020~\cite{fixmatch}.
The key idea of FixMatch is to leverage the unlabeled data by predicting pseudo-labels with a model trained on the known labels.
Both the known labels and the predicted ones are then used side by side to train the model.
The labeled samples guide the learning process, while the unlabeled samples contribute additional information.

Not every pseudo prediction is kept for further training.
A confidence threshold is defined to evaluate how ``confident'' the model is in its prediction.
The prediction is dropped if the model is not confident enough.
The quantity and quality of the obtained labels are crucial, and they have a significant impact on the overall accuracy.
This makes it important to improve the pseudo-labeling framework as much as possible.
\subsection{Math of FixMatch}\label{subsec:math-of-fixmatch}
Equation~\eqref{eq:equation2} defines the unsupervised loss $\mathcal{L}_u$ that trains the model.
The sum over a batch of size $B_u$, divided by $B_u$, averages the loss over the batch.
The input data is augmented in two different ways.
First, there is a weak augmentation $\mathcal{T}_{\text{weak}}(\cdot)$ which applies only basic transformations such as filtering and blurring.
Second, there is a strong augmentation $\mathcal{T}_{\text{strong}}(\cdot)$ which applies cutouts and edge detection.
The interesting part is the indicator function $\mathbbm{1}(\cdot)$, which implements a principle called ``confidence-based masking''.
It retains a label only if its largest predicted probability is above a threshold $\tau$.
Here $p_i \coloneqq F(\mathcal{T}_{\text{weak}}(u_i))$ is the model prediction for a weakly augmented unlabeled input $u_i$, and the pseudo-label $\hat{y}_i \coloneqq \arg\max(p_i)$ is its most likely class.
The second part, $\mathcal{H}(\cdot, \cdot)$, is a standard cross-entropy loss which takes two inputs:
$\hat{y}_i$, the obtained pseudo-label, and $F(\mathcal{T}_{\text{strong}}(u_i))$, the model prediction for the strongly augmented input.
The indicator function evaluates to $0$ if the pseudo prediction is not confident, and the corresponding loss term is dropped.
Otherwise the term is kept and contributes to training.

\begin{equation}
    \label{eq:equation2}
    \mathcal{L}_u = \frac{1}{B_u} \sum_{i=1}^{B_u} \mathbbm{1}(\max(p_i) \geq \tau) \mathcal{H}(\hat{y}_i,F(\mathcal{T}_{\text{strong}}(u_i)))
\end{equation}
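As a small numeric illustration (values chosen for exposition, not taken from any experiment): with a threshold of $\tau = 0.95$ and a three-class prediction $p_i = (0.97, 0.02, 0.01)$, we have $\max(p_i) = 0.97 \geq \tau$, so the indicator evaluates to $1$ and the sample is kept with pseudo-label $\hat{y}_i = \arg\max(p_i)$, the first class. A less certain prediction such as $p_i = (0.60, 0.30, 0.10)$ would be masked out, since $0.60 < \tau$.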
\section{Cross-Model Pseudo-Labeling}
Cross-Model Pseudo-Labeling (CMPL)~\cite{Xu_2022_CVPR} builds on the FixMatch idea from Section~\ref{sec:fixmatch} by training two models of different sizes side by side.
Instead of each model generating its own pseudo-labels, the two models supervise each other: the smaller auxiliary model produces the pseudo-labels used to train the larger primary model, and vice versa.
The underlying intuition is that models of different capacity learn complementary representations of the same data, so the exchanged pseudo-labels carry information the receiving model would not produce on its own.
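The resulting objective can be sketched as follows, assuming FixMatch-style losses as in Section~\ref{subsec:math-of-fixmatch} and writing $F$ for the primary and $A$ for the auxiliary model (the notation here is illustrative, not taken verbatim from the cited paper):
\begin{equation}
    \mathcal{L} = \mathcal{L}_s^{F} + \mathcal{L}_s^{A} + \lambda \left( \mathcal{L}_u^{A \rightarrow F} + \mathcal{L}_u^{F \rightarrow A} \right)
\end{equation}
Here $\mathcal{L}_u^{A \rightarrow F}$ denotes the unsupervised loss of Equation~\eqref{eq:equation2} applied to the primary model $F$, but with the pseudo-labels $\hat{y}_i$ produced by the auxiliary model $A$, and vice versa for $\mathcal{L}_u^{F \rightarrow A}$.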
\section{Performance}
Figure~\ref{fig:results} compares the performance of CMPL, FixMatch, and purely supervised training.

\begin{figure}[h]
    \centering
    \includegraphics[width=\linewidth]{../presentation/rsc/results}
    \caption{Performance comparison of CMPL, FixMatch, and supervised-only learning}
    \Description{A chart comparing the performance of CMPL, FixMatch, and supervised-only training.}
    \label{fig:results}
\end{figure}
%%
%% The next two lines define the bibliography style to be used, and
%% the bibliography file.
\bibliographystyle{ACM-Reference-Format}
\bibliography{sources}
\end{document}
\endinput