add conclusion
		@@ -85,14 +85,13 @@ The goal of this paper is video action recognition.
 | 
				
			|||||||
The videos to be classified are each approximately 10 seconds long.
 | 
					The videos to be classified are each approximately 10 seconds long.
 | 
				
			||||||
In this paper datasets with 400 and 101 different classes are used.
 | 
					In this paper datasets with 400 and 101 different classes are used.
 | 
				
			||||||
The proposed approach is tested with 1\% and 10\% of known labels of all data points.
 | 
					The proposed approach is tested with 1\% and 10\% of known labels of all data points.
 | 
				
			||||||
The used model depends on the exact usecase but in this case a 3D-ResNet50 and 3D-ResNet18 are used.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
\section{Semi-Supervised learning}\label{sec:semi-supervised-learning}
 | 
					\section{Semi-Supervised learning}\label{sec:semi-supervised-learning}
 | 
				
			||||||
In traditional supervised learning we have a labeled dataset.
 | 
					In traditional supervised learning we have a labeled dataset.
 | 
				
			||||||
Each datapoint is associated with a corresponding target label.
 | 
					Each datapoint is associated with a corresponding target label.
 | 
				
			||||||
The goal is to fit a model to predict the labels from datapoints.
 | 
					The goal is to fit a model to predict the labels from datapoints.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
In traditional unsupervised learning no labels are known.
 | 
					In traditional unsupervised learning there are also datapoints but no labels are known.
 | 
				
			||||||
The goal is to find patterns or structures in the data.
 | 
					The goal is to find patterns or structures in the data.
 | 
				
			||||||
Moreover, it can be used for clustering or downprojection.
 | 
					Moreover, it can be used for clustering or downprojection.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
@@ -118,8 +117,8 @@ It relies on a single model for generating pseudo-labels which can introduce err
 | 
				
			|||||||
Incorrect pseudo-labels may affect the learning process negatively.
 | 
					Incorrect pseudo-labels may affect the learning process negatively.
 | 
				
			||||||
Furthermore, FixMatch uses a comparably small model for label prediction, which has limited capacity.
 | 
					Furthermore, FixMatch uses a comparably small model for label prediction, which has limited capacity.
 | 
				
			||||||
This can negatively affect the learning process as well.
 | 
					This can negatively affect the learning process as well.
 | 
				
			||||||
There is no measure defined how certain the model is about its prediction.
 | 
					%There is no measure defined how certain the model is about its prediction.
 | 
				
			||||||
Such a measure improves overall performance by filtering noisy and unsure predictions.
 | 
					%Such a measure improves overall performance by filtering noisy and unsure predictions.
 | 
				
			||||||
Cross-Model Pseudo-Labeling tries to address all of those limitations.
 | 
					Cross-Model Pseudo-Labeling tries to address all of those limitations.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
\subsection{Math of FixMatch}\label{subsec:math-of-fixmatch}
 | 
					\subsection{Math of FixMatch}\label{subsec:math-of-fixmatch}
 | 
				
			||||||
@@ -137,6 +136,12 @@ Moreover, there is the strong augmentation $\mathcal{T}_{\text{strong}}(\cdot)$
 | 
				
			|||||||
The indicator function $\mathbbm{1}(\cdot)$ applies a principle called `confidence-based masking'.
 | 
					The indicator function $\mathbbm{1}(\cdot)$ applies a principle called `confidence-based masking'.
 | 
				
			||||||
It retains a label only if its largest probability is above a threshold $\tau$.
 | 
					It retains a label only if its largest probability is above a threshold $\tau$.
 | 
				
			||||||
Here, $p_i \coloneqq F(\mathcal{T}_{\text{weak}}(u_i))$ is the model evaluation on a weakly augmented input.
 | 
					Here, $p_i \coloneqq F(\mathcal{T}_{\text{weak}}(u_i))$ is the model evaluation on a weakly augmented input.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					\begin{equation}
 | 
				
			||||||
 | 
					  \label{eq:crossentropy}
 | 
				
			||||||
 | 
					  \mathcal{H}(\hat{y}_i, y_i) = -\sum_{c=1}^{C} y_{i,c} \cdot \log(\hat{y}_{i,c})
 | 
				
			||||||
 | 
					\end{equation}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
The second part $\mathcal{H}(\cdot, \cdot)$ is a standard cross-entropy loss function which takes two inputs, the predicted and the true label.
 | 
					The second part $\mathcal{H}(\cdot, \cdot)$ is a standard cross-entropy loss function which takes two inputs, the predicted and the true label.
 | 
				
			||||||
Here these inputs are $\hat{y}_i$, the obtained pseudo-label, and $F(\mathcal{T}_{\text{strong}}(u_i))$, the model evaluation under strong augmentation.
 | 
					Here these inputs are $\hat{y}_i$, the obtained pseudo-label, and $F(\mathcal{T}_{\text{strong}}(u_i))$, the model evaluation under strong augmentation.
 | 
				
			||||||
The indicator function evaluates to $0$ if the pseudo-label prediction is not confident, and the corresponding loss term is dropped.
 | 
					The indicator function evaluates to $0$ if the pseudo-label prediction is not confident, and the corresponding loss term is dropped.
 | 
				
			||||||
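As an illustrative sketch of confidence-based masking, assume three classes and a threshold of $\tau = 0.95$ (all numbers here are made up for illustration and are not taken from the paper).
For a weak prediction $p_i = (0.97, 0.02, 0.01)$ the largest probability exceeds $\tau$, so the indicator is $1$ and class 1 becomes the pseudo-label $\hat{y}_i$.
If the prediction on the strongly augmented input is $F(\mathcal{T}_{\text{strong}}(u_i)) = (0.6, 0.3, 0.1)$, the retained loss term, using the natural logarithm, is
\[
  \mathbbm{1}(\max(p_i) \geq \tau) \cdot \mathcal{H}(\hat{y}_i, F(\mathcal{T}_{\text{strong}}(u_i))) = 1 \cdot (-\log 0.6) \approx 0.51 .
\]
For a less confident prediction such as $p_j = (0.5, 0.3, 0.2)$ the largest probability stays below $\tau$, so the indicator is $0$ and the term contributes nothing to the loss.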
@@ -145,7 +150,10 @@ Otherwise it evaluates to 1 and it will be kept and trains the model further.
 | 
				
			|||||||
\section{Cross-Model Pseudo-Labeling}\label{sec:cross-model-pseudo-labeling}
 | 
					\section{Cross-Model Pseudo-Labeling}\label{sec:cross-model-pseudo-labeling}
 | 
				
			||||||
The approach proposed in this paper is called Cross-Model Pseudo-Labeling (CMPL)~\cite{Xu_2022_CVPR}.
 | 
					The approach proposed in this paper is called Cross-Model Pseudo-Labeling (CMPL)~\cite{Xu_2022_CVPR}.
 | 
				
			||||||
Figure~\ref{fig:cmpl-structure} visualizes the structure of CMPL\@.
 | 
					Figure~\ref{fig:cmpl-structure} visualizes the structure of CMPL\@.
 | 
				
			||||||
We define two different models, a smaller auxiliary model and a larger model.
 | 
					Two different models, a smaller auxiliary model and a larger model, are defined.
 | 
				
			||||||
 | 
					They provide pseudo-labels for each other.
 | 
				
			||||||
 | 
					The two different models have a different structural bias which leads to complementary representations.
 | 
				
			||||||
 | 
					This symmetric design yields a boost in performance.
 | 
				
			||||||
The SG label means stop gradient.
 | 
					The SG label means stop gradient.
 | 
				
			||||||
The loss function evaluations are fed into the opposite model as loss.
 | 
					The loss function evaluations are fed into the opposite model as loss.
 | 
				
			||||||
The two models train each other.
 | 
					The two models train each other.
 | 
				
			||||||
@@ -225,6 +233,14 @@ For example:
 | 
				
			|||||||
 | 
					
 | 
				
			||||||
Those are just other approaches one can keep in mind.
 | 
					Those are just other approaches one can keep in mind.
 | 
				
			||||||
This doesn't mean they are better; in fact, they performed even worse in this study.
 | 
					This doesn't mean they are better; in fact, they performed even worse in this study.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					\section{Conclusion}\label{sec:conclusion}
 | 
				
			||||||
 | 
					In conclusion, Cross-Model Pseudo-Labeling demonstrates the potential to significantly advance the field of semi-supervised action recognition.
 | 
				
			||||||
 | 
					Across several experiments, CMPL outperforms the supervised-only approach by a multiple.
 | 
				
			||||||
 | 
					It surpasses most of the other existing pseudo-labeling frameworks.
 | 
				
			||||||
 | 
					Through the integration of main and auxiliary models, consistency regularization, and uncertainty estimation, CMPL offers a powerful framework for leveraging unlabeled data and improving model performance.
 | 
				
			||||||
 | 
					It paves the way for more accurate and efficient action recognition systems.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
%%
 | 
					%%
 | 
				
			||||||
%% The next two lines define the bibliography style to be used, and
 | 
					%% The next two lines define the bibliography style to be used, and
 | 
				
			||||||
%% the bibliography file.
 | 
					%% the bibliography file.
 | 
				
			||||||
 
 | 
				
			|||||||