Introduction to Unsupervised Learning

Unsupervised learning is a branch of machine learning. It is used to find underlying patterns in data and is often used in exploratory data analysis. The major difference between supervised and unsupervised learning is that the former learns from labelled data and the latter from unlabelled data.

With unsupervised learning, there are no target outputs to predict: the goal of the algorithm is to find relationships within the data and to group data points based on the input data alone. The model works on its own to discover patterns and information that were previously undetected.

The main task of unsupervised learning is to find patterns in the data.

Algorithms

Unsupervised learning uses machine learning algorithms to analyze and cluster unlabelled datasets. These algorithms discover hidden patterns or data groupings. They allow users to perform more complex processing tasks than supervised learning does, but their results can be more unpredictable than those of other learning methods.

Common techniques include clustering, anomaly detection, and neural networks. Because unsupervised learning can discover similarities and differences in information, it is well suited to exploratory data analysis, cross-selling strategies, customer segmentation, and image recognition.
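To make the idea of clustering concrete, here is a minimal sketch of k-means, one common clustering algorithm, written in plain Python. The article does not name a specific algorithm, so the choice of k-means, the function name `kmeans`, the sample points, and the decision to seed the centroids with the first k points are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=10):
    """Group 2-D points into k clusters with plain k-means."""
    # Assumption for this sketch: the first k points seed the centroids,
    # which keeps the example deterministic.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical unlabelled data: two loose groups of points.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.8, 8.2), (0.9, 1.1), (8.1, 7.9)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

With this sample data the points near (1, 1) receive label 0 and the points near (8, 8) receive label 1, even though no labels were supplied. Real projects would normally pick starting centroids more carefully and choose k by experiment.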

Use Cases

  1. News sections: Google News uses unsupervised learning to categorize articles on the same story from various online news outlets.
  2. Computer vision: Unsupervised learning algorithms are used for visual perception tasks, such as object recognition.
  3. Medical imaging: Unsupervised machine learning provides medical imaging devices with features such as image detection, classification, and segmentation. These are used in radiology and pathology to diagnose patients quickly and accurately.
  4. Anomaly detection: It is used to detect fraudulent transactions.
  5. Recommendation engines: By discovering trends in past purchase behaviour data, unsupervised learning helps online retailers develop more effective cross-selling strategies, such as making relevant add-on recommendations to customers during checkout.
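As a rough illustration of the anomaly detection use case, the sketch below flags unusual transaction amounts without any labelled examples of fraud. It uses the modified z-score, which is based on the median absolute deviation. This technique, the function name `flag_anomalies`, the 3.5 threshold, and the sample amounts are assumptions chosen for the example, not a method described in this article.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    center = median(values)
    # Median absolute deviation: a measure of spread that a single
    # extreme value cannot inflate the way it inflates the mean.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread to measure against in this simple sketch
    return [v for v in values if 0.6745 * abs(v - center) / mad > threshold]


# Hypothetical transaction amounts; one is far larger than the rest.
amounts = [20, 25, 22, 19, 24, 21, 23, 500]
print(flag_anomalies(amounts))
```

With the sample amounts, only 500 is flagged. A production fraud system would look at many more features than the amount alone.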

Features

  • Clustering automatically splits the dataset into groups based on their similarities.
  • Anomaly detection identifies unusual data points in your dataset and is commonly used to find fraudulent transactions.
  • Association mining identifies sets of items that often occur together in your dataset.
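Association mining can be sketched in a few lines by counting how often pairs of items appear in the same basket. The example below is a simplified illustration, not a full algorithm such as Apriori. The function name `frequent_pairs`, the `min_support` parameter, and the shopping baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support=0.5):
    """Return item pairs found together in at least min_support of the baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    total = len(baskets)
    # Support is the fraction of baskets that contain both items.
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


# Hypothetical purchase data.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["butter", "milk"],
]
print(frequent_pairs(baskets))
```

Here the pairs (bread, butter) and (butter, milk) each appear in half of the baskets, so both meet the 0.5 support threshold. Pairs like these are the raw material for the add-on recommendations mentioned under Use Cases.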

Disadvantages

  • It does not give precise information about how the data is sorted, because there are no known outputs to compare against.
  • Results tend to be less accurate because the input data is not labelled in advance, so the machine has to work out the groupings itself.
  • In image classification, the spectral properties of classes can change over time, so the same class information cannot be carried over from one image to another.
  • The user has to spend time interpreting and labelling the classes that the classification produces.

Unsupervised learning is a machine learning technique in which the model does not need to be supervised, unlike in supervised learning. It helps us find all kinds of unknown patterns in data. It has two main types, clustering and association. Association rules allow you to establish associations amongst data objects inside large databases. Its biggest drawback is that we cannot get precise information regarding data sorting.