\section{Introduction}\label{sec:introduction}
For most supervised learning tasks, large amounts of training data are essential.
With too little training data, the model will not generalize well and will not fit the real-world task.
Labeling datasets is commonly seen as an expensive task, one to be avoided as much as possible.
That is why there is a field of machine learning called semi-supervised learning.
The general approach is to train a model that predicts pseudo-labels, which can then be used to train the main model.

The goal of this paper is video action recognition.
Some of the labels are known, but for most of the data we have only the raw videos.
The basic idea is that the unlabeled data can significantly improve the model performance when used in combination with the labeled data.

\section{FixMatch}\label{sec:fixmatch}
An existing approach to this is FixMatch, introduced in a Google Research paper from 2020~\cite{fixmatch}.
The key idea of FixMatch is to leverage the unlabeled data by predicting pseudo-labels for it with a model trained on the known labels.
Both the known labels and the predicted ones are then used side by side to train the model.
The labeled samples guide the learning process, and the unlabeled samples contribute additional information.
Not every pseudo prediction is kept to train the model further.
A confidence threshold is defined to evaluate how \glqq confident\grqq{} the model is about its prediction.
The prediction is dropped if the model is not confident enough.
The quantity and quality of the obtained labels are crucial, and they have a significant impact on the overall accuracy.
This means it is essential to improve the pseudo-label framework as much as possible.
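As a minimal sketch in PyTorch (the names \texttt{model}, \texttt{u\_weak}, \texttt{u\_strong} and the threshold \texttt{tau} are illustrative, not taken from the paper), this filtering step looks as follows:

\begin{verbatim}
import torch
import torch.nn.functional as nnf

def fixmatch_unsup_loss(model, u_weak, u_strong, tau=0.95):
    # Pseudo-label the weakly augmented batch; no gradients flow here.
    with torch.no_grad():
        probs = nnf.softmax(model(u_weak), dim=-1)
        conf, pseudo = probs.max(dim=-1)
    mask = (conf >= tau).float()  # keep only confident predictions
    # Train on the strongly augmented views against the kept pseudo-labels.
    loss = nnf.cross_entropy(model(u_strong), pseudo, reduction="none")
    return (mask * loss).mean()
\end{verbatim}

Raising \texttt{tau} trades label quantity for quality: fewer pseudo-labels survive the mask, but those that do are more reliable.
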
FixMatch, however, suffers from some major limitations.

Cross-Model Pseudo-Labeling (CMPL) addresses these.
Two different models, a smaller auxiliary model and a larger one, are defined.
They provide pseudo-labels for each other.
The two models have different structural biases, which leads to complementary representations.
This symmetric design yields a boost in performance.
The label SG means \glqq stop gradient\grqq: the pseudo-labels are used as fixed targets, so no gradient flows back into the model that produced them.
The evaluation of each loss function is fed into the opposite model as its training loss.
In this way, the two models train each other.

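In code, a stop gradient corresponds to detaching a tensor from the computation graph. A minimal PyTorch sketch, with stand-in linear models purely for illustration:

\begin{verbatim}
import torch
import torch.nn as nn

model_a = nn.Linear(16, 10)  # stand-in for the auxiliary model A
model_f = nn.Linear(16, 10)  # stand-in for the primary model F
clip = torch.randn(4, 16)    # dummy batch of four inputs

# SG in code: detach() stops the gradient, so each model's prediction
# can serve as a target for the other without training the producer.
target_for_a = model_f(clip).detach()
target_for_f = model_a(clip).detach()
\end{verbatim}
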
\subsection{Math of CMPL}\label{subsec:math}
The loss function of CMPL is similar to the one explained above.
However, we have to distinguish between the supervised loss, which is computed on samples whose labels are known, and the unsupervised loss, for which no labels are available.

Equations~\ref{eq:cmpl-losses1} and~\ref{eq:cmpl-losses2} are standard cross-entropy loss functions, computed with the supervised labels for each of the two separate models.
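A standard form, assuming $B_l$ labeled samples $(x_i, y_i)$ and a weak augmentation $\mathcal{T}_{\text{weak}}$, is:

\begin{align}
  \mathcal{L}_s^F &= \frac{1}{B_l} \sum_{i=1}^{B_l} \mathcal{H}\bigl(y_i, F(\mathcal{T}_{\text{weak}}(x_i))\bigr)\label{eq:cmpl-losses1}\\
  \mathcal{L}_s^A &= \frac{1}{B_l} \sum_{i=1}^{B_l} \mathcal{H}\bigl(y_i, A(\mathcal{T}_{\text{weak}}(x_i))\bigr)\label{eq:cmpl-losses2}
\end{align}
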
The unsupervised losses are very similar to those of FixMatch, but it is important to note that the confidence, and thus the pseudo-label, comes from the opposite model:

\begin{align}
  \mathcal{L}_u^F &= \frac{1}{B_u} \sum_{i=1}^{B_u} \mathbbm{1}(\max(p_i^A) \geq \tau)\, \mathcal{H}(\hat{y}_i^A, F(\mathcal{T}_{\text{strong}}(u_i)))\\
  \mathcal{L}_u^A &= \frac{1}{B_u} \sum_{i=1}^{B_u} \mathbbm{1}(\max(p_i^F) \geq \tau)\, \mathcal{H}(\hat{y}_i^F, A(\mathcal{T}_{\text{strong}}(u_i)))
\end{align}

Here $B_u$ is the unlabeled batch size, $\tau$ the confidence threshold, $\mathcal{T}_{\text{strong}}$ a strong augmentation of the unlabeled sample $u_i$, and $p_i^A$, $\hat{y}_i^A$ (respectively $p_i^F$, $\hat{y}_i^F$) the prediction and pseudo-label of the opposite model.

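A minimal PyTorch sketch of one of these terms (the function and argument names are illustrative assumptions):

\begin{verbatim}
import torch
import torch.nn.functional as nnf

def unsup_cross_loss(student_logits, teacher_logits, tau=0.95):
    # The teacher side is the *opposite* model; no_grad acts as the SG.
    with torch.no_grad():
        p = nnf.softmax(teacher_logits, dim=-1)  # p_i of the opposite model
        conf, pseudo = p.max(dim=-1)             # max(p_i) and pseudo-label
    mask = (conf >= tau).float()                 # indicator 1(max(p_i) >= tau)
    loss = nnf.cross_entropy(student_logits, pseudo, reduction="none")
    return (mask * loss).mean()                  # 1/B_u * sum over the batch
\end{verbatim}

Here \texttt{student\_logits} would come from the strongly augmented clips, matching $\mathcal{T}_{\text{strong}}$ in the equations.
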
Finally, to train the main objective, an overall loss is calculated by simply summing all the losses.
The total loss is regulated by a hyperparameter $\lambda$ that balances the supervised against the unsupervised loss.

\begin{equation}
  \mathcal{L} = \mathcal{L}_s^A + \mathcal{L}_s^F + \lambda \left( \mathcal{L}_u^A + \mathcal{L}_u^F \right)
\end{equation}
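Assembled in code, matching the sum above (the names are reused from the previous sketches):

\begin{verbatim}
def total_loss(ls_a, ls_f, lu_a, lu_f, lam=1.0):
    # Supervised losses plus the lambda-weighted unsupervised
    # cross-model losses.
    return ls_a + ls_f + lam * (lu_a + lu_f)
\end{verbatim}
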
Even when only 1\% of the true labels are known for the UCF-101 dataset, 25.1\% of the test videos are still classified correctly.

\section{Further schemes}\label{sec:further-schemes}
How the pseudo-labels are generated may impact the overall performance.
In this paper, the pseudo-labels are obtained by the cross-model approach.
But there might be other strategies as well.
For example:
\begin{enumerate*}
  \item Self-First: Each network uses just its own prediction if it is confident enough.