Semi-supervised machine learning is a technique that uses both labeled and unlabeled data for training. It begins by taking a large unlabeled dataset and labeling a small portion of it. Then it uses unsupervised learning to cluster the unlabeled data and supervised learning on the labeled data to classify the remaining unlabeled data. Some applications of semi-supervised learning include internet content classification, where it is not feasible to label all webpages, and audio/video analysis, where the volume of files makes complete labeling impossible.
2. Spotle.ai Study Material
Spotle.ai/Learn
Recap - Supervised Machine Learning
Supervised learning is learning in which
we train the machine using data
that is correctly labeled.
[Diagram: labeled training data -> machine learning algorithm -> predictive model -> model evaluation, with a feedback loop]
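As a rough illustration of the loop in the diagram (train on labeled data, build a predictive model, evaluate it), here is a minimal sketch in plain Python. The toy data, the function names `fit_centroids` and `predict`, and the choice of a nearest-centroid classifier are all assumptions made for this example; the slides do not name a specific algorithm.

```python
# Toy labeled dataset (hypothetical values): (feature vector, label)
train = [((1.0, 1.2), "a"), ((0.8, 1.0), "a"), ((1.1, 0.9), "a"),
         ((4.0, 4.2), "b"), ((4.1, 3.9), "b"), ((3.8, 4.0), "b")]
# Held-out labeled examples used only for evaluation
test = [((0.9, 1.1), "a"), ((4.2, 4.1), "b")]


def fit_centroids(samples):
    """Training: compute the mean feature vector of each labeled class."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(s / counts[y] for s in acc) for y, acc in sums.items()}


def predict(centroids, x):
    """Prediction: return the label whose centroid is closest to x."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


model = fit_centroids(train)  # the "predictive model"
# Model evaluation: fraction of held-out examples classified correctly
accuracy = sum(predict(model, x) == y for x, y in test) / len(test)
print("accuracy:", accuracy)
```

Every training example here carries a label, which is exactly the requirement that makes supervised learning expensive on large datasets.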
3. Recap - Unsupervised Machine Learning
[Diagram: unlabeled input data -> machine learning algorithm -> output]
Unsupervised learning trains the machine
on information that is neither classified
nor labeled, allowing the algorithm to act
on that information without guidance.
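To make "acting on information without guidance" concrete, here is a minimal sketch of clustering unlabeled points in plain Python. The slides do not say which algorithm to use; k-means is an assumption for this example, as are the toy data, the helper names, and the naive choice of the first `k` points as starting centroids.

```python
# Unlabeled toy data (hypothetical values): feature vectors only, no labels
points = [(1.0, 1.2), (4.0, 4.2), (0.8, 1.0),
          (4.1, 3.9), (1.1, 0.9), (3.8, 4.0)]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    # Naive initialisation (assumption): start from the first k points
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


assignments, centroids = kmeans(points, k=2)
print(assignments)
```

The output is only a cluster index per point. Nothing says what each cluster means, which is one reason the range of applications for purely unsupervised methods is limited.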
4. The biggest drawbacks
A supervised machine learning algorithm needs labeled data from which to
learn. An enormous amount of data is available in the world, including
text, images, audio, video, time series and more, but only a small
fraction of that data is actually labeled. Labeling data, whether
algorithmically or by hand, is very costly, especially when we are
dealing with large volumes of data.
On the other hand, the biggest disadvantage of unsupervised machine
learning is that its range of applications is limited.
5. Problems! OK. How to get rid of them?
How about using the benefits of unsupervised learning to build clusters
from the dataset? Unsupervised learning does not need labeled data, so
the cost of labeling the data is saved. And then using the benefits of supervised
learning, with a small labeled subset, to classify the dataset as
accurately as possible? We can do that. This technique is called semi-supervised
machine learning.
[Diagram: Supervised Learning, Unsupervised Learning]
6. Semi-supervised Machine Learning
A. Pick up the large unlabeled dataset.
B. Label a small portion of the dataset.
C. Put the unlabeled data into clusters using an unsupervised machine learning
algorithm.
D. Build your model to use the labeled data to label and classify the rest of the
unlabeled data.
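The four steps above can be sketched in plain Python as a "cluster, then label" pipeline. This is only one possible reading of the steps: the toy data, the two hand-assigned labels, the use of k-means for clustering, and the majority-vote rule for spreading labels through each cluster are all assumptions made for illustration.

```python
from collections import Counter

# Step 1: start from a pool of data points (hypothetical toy values)
points = [(1.0, 1.2), (4.0, 4.2), (0.8, 1.0), (4.1, 3.9),
          (1.1, 0.9), (3.8, 4.0), (0.9, 1.3), (4.3, 4.1)]

# Step 2: label a small portion by hand (index of point -> label)
known_labels = {0: "low", 1: "high"}


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Plain k-means with a naive start from the first k points (assumption)."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Step 3: cluster all points without looking at the labels
assignments, centroids = kmeans(points, k=2)

# Step 4: each cluster takes the majority label of its labeled members,
# and every other point in that cluster inherits it
votes = {}
for index, label in known_labels.items():
    votes.setdefault(assignments[index], Counter())[label] += 1
cluster_label = {c: counter.most_common(1)[0][0] for c, counter in votes.items()}

# A cluster with no labeled member stays unlabeled (None)
predicted = [cluster_label.get(a) for a in assignments]
print(predicted)
```

Only two of the eight points were labeled by hand; the remaining six received their labels from the cluster they fell into. In practice the quality of the result depends on how well the clusters line up with the real classes.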
7. Some applications
Internet content classification: There are millions and millions of webpages. It
is practically impossible to label all of them by hand. Semi-supervised
machine learning helps here by classifying webpages from a small labeled sample.
Audio/video analysis: There is an overwhelming amount of audio and video
files. Labeling them all is a massive task, if not infeasible. Semi-supervised
learning comes in handy in audio/video analysis.