OPTICS CLUSTERING (Intro)

Rupak (Bob) Roy - II
3 min read · Jun 25, 2021


Ordering Points To Identify the Clustering Structure (OPTICS) is a density-based clustering technique that partitions data into groups with similar characteristics (clusters).

It addresses one of DBSCAN’s major weaknesses: the problem of detecting meaningful clusters in data of varying density.

In density-based clustering, clusters are defined as dense regions of data points separated by low-density regions.

It adds two more terms to the concepts of DBSCAN clustering. They are:

1. Core Distance: the minimum radius required to classify a given point as a core point. If the given point is not a core point, its core distance is undefined.

2. Reachability Distance: defined for a point ‘p’ with respect to another data point ‘q’. The reachability distance of ‘p’ from ‘q’ is the larger of the core distance of ‘q’ and the distance between ‘p’ and ‘q’. Note that the reachability distance is not defined if ‘q’ is not a core point.
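The two definitions above can be sketched in a few lines of plain Python. This is only an illustration, not how any library implements OPTICS: the function names, the `eps` cut-off argument, and the toy points are all made up for the example, and it assumes that the minimum number of points counts the point itself.

```python
import math

def core_distance(points, p, min_pts, eps=math.inf):
    """Smallest radius around points[p] that encloses min_pts points.

    Assumption: the point itself counts toward min_pts.
    Returns None (undefined) when points[p] is not a core point within eps.
    """
    dists = sorted(math.dist(points[p], other) for other in points)
    if len(dists) < min_pts:
        return None
    radius = dists[min_pts - 1]
    return radius if radius <= eps else None

def reachability_distance(points, p, q, min_pts, eps=math.inf):
    """Reachability distance of points[p] with respect to points[q].

    Returns None (undefined) when points[q] is not a core point.
    """
    cd = core_distance(points, q, min_pts, eps)
    if cd is None:
        return None
    return max(cd, math.dist(points[p], points[q]))

# Hypothetical toy data: four nearby points and one far away.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 2.0), (2.0, 0.0), (9.0, 9.0)]

print(core_distance(points, 0, min_pts=3))             # radius that reaches the 3rd point
print(reachability_distance(points, 1, 0, min_pts=3))  # core distance of q dominates
print(reachability_distance(points, 4, 0, min_pts=3))  # actual distance dominates
```

For a point very close to ‘q’, the reachability distance is lifted up to the core distance of ‘q’; for a far-away point it is simply the distance between the two.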

A few advantages:

· OPTICS clustering doesn’t require the number of clusters to be specified in advance.

· Clusters can be of any shape, including non-spherical ones.

· It is able to identify outliers (noise data).

Disadvantages:

· It fails if there are no density drops between clusters.

· It is also sensitive to the parameters that define density (the radius and the minimum number of points), and proper parameter settings require domain knowledge.

Now let’s understand the coding part.

First, we will load our dataset and perform a few data pre-processing steps like scaling.

OPTICS Clustering: Data Preprocessing
Data set (X) optics clustering
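A minimal sketch of the loading and scaling step, assuming scikit-learn is installed. The synthetic blobs below are only a stand-in for the dataset used in this article; the centers, spreads, and sample counts are made-up values chosen to give clusters of different density.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Assumption: a synthetic dataset with three blobs of different density
# replaces the article's own data.
X_raw, _ = make_blobs(
    n_samples=[200, 200, 200],
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=[0.3, 1.0, 2.0],
    random_state=42,
)

# Scale every feature to zero mean and unit variance so that no single
# feature dominates the distance computations.
scaler = StandardScaler()
X = scaler.fit_transform(X_raw)

print(X.shape)
print(X[:5])
```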

Now we will build the model using min_samples = 10, xi = 0.05, and min_cluster_size = 0.05 (a fractional min_cluster_size is read as a share of the total number of samples).

OPTICS Clustering: Modeling

Here we are: our OPTICS model is complete. We can print the cluster labels using optics_model.labels_
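The modeling step might look roughly like this with scikit-learn’s OPTICS class, using the parameter values given above. The data is again a synthetic stand-in (an assumption), so the labels it produces will differ from the ones obtained on the article’s dataset.

```python
from sklearn.cluster import OPTICS
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in data, scaled as in the preprocessing step.
X_raw, _ = make_blobs(
    n_samples=[200, 200, 200],
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=[0.3, 1.0, 2.0],
    random_state=42,
)
X = StandardScaler().fit_transform(X_raw)

# Build and fit the model with the parameters named in the text.
optics_model = OPTICS(min_samples=10, xi=0.05, min_cluster_size=0.05)
optics_model.fit(X)

# labels_ holds one cluster id per sample; -1 marks noise points.
labels = optics_model.labels_
n_clusters = len(set(labels.tolist()) - {-1})
n_noise = int((labels == -1).sum())
print("clusters found:", n_clusters)
print("noise points:", n_noise)

# reachability_ read in the order given by ordering_ is the reachability plot;
# valleys in that sequence correspond to clusters.
reachability_in_order = optics_model.reachability_[optics_model.ordering_]
```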

Now we will go one step further and compare the performance of OPTICS and DBSCAN, to confirm that DBSCAN struggles to cluster data of differing densities.

OPTICS vs DBSCAN
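One possible way to set up such a comparison is to run both algorithms on the same data and count clusters and noise points. This is a sketch under assumptions: the data is a synthetic stand-in, and the two DBSCAN eps values are arbitrary picks meant to show that a single radius has to compromise between dense and sparse regions.

```python
import numpy as np
from sklearn.cluster import DBSCAN, OPTICS
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in data with three blobs of different density.
X_raw, _ = make_blobs(
    n_samples=[200, 200, 200],
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=[0.3, 1.0, 2.0],
    random_state=42,
)
X = StandardScaler().fit_transform(X_raw)

def summarize(labels):
    """Return (number of clusters, number of noise points) for a labeling."""
    n_clusters = len(set(labels.tolist()) - {-1})
    n_noise = int(np.sum(labels == -1))
    return n_clusters, n_noise

results = {}

# OPTICS with the parameters used for the model above.
optics_labels = OPTICS(min_samples=10, xi=0.05, min_cluster_size=0.05).fit_predict(X)
results["OPTICS"] = summarize(optics_labels)

# DBSCAN needs one global radius; these eps values are illustrative only.
for eps in (0.1, 0.3):
    dbscan_labels = DBSCAN(eps=eps, min_samples=10).fit_predict(X)
    results[f"DBSCAN eps={eps}"] = summarize(dbscan_labels)

for name, (n_clusters, n_noise) in results.items():
    print(f"{name}: {n_clusters} clusters, {n_noise} noise points")
```

A small eps tends to mark the sparse blob as noise, while a larger eps rescues it at the risk of merging neighbouring clusters; comparing the printed counts across rows makes that trade-off visible.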

Next, we have another interesting and advanced technique, used especially for data exploration and for visualizing high-dimensional data, called t-SNE.

If you would like to know more about advanced types of clustering, follow my other article, A-Z Clustering.

Some of my alternative internet presences are Facebook, Instagram, Udemy, Blogger, Issuu, and more.

Also available on Quora @ https://www.quora.com/profile/Rupak-Bob-Roy


Have a good day.
