MACHINE LEARNING RECIPES
DATA CLEANING PYTHON
DATA MUNGING
PANDAS CHEATSHEET
ALL TAGS
# What is cosine similarity and how to calculate it?

# What is cosine similarity and how to calculate it?

This recipe explains what is cosine similarity and how to calculate it

Cosine similarity gives us the sense of cos angle between vectors. When vector are in same direction, cosine similarity is 1 while in case of perpendicular, it is 0. It is given by (1- cosine distance).

So this recipe is a short example on what cosine similarity is and how to calculate it. Let's get started.

```
from scipy import spatial
```

Let's pause and look at these imports. We have imported spatial library from scipy class Scipy contains bunch of scientific routies like solving differential equations.

```
x=[1,2,3]
y=[-1,-2,-3]
```

Let us create two vectors list.

```
z=1-spatial.distance.cosine(x,y)
```

We have first calucated cosine distance and the subtracting it from 1 has given us cosine similarity

```
print(z)
```

Simply use print function to print new appended list.

Once we run the above code snippet, we will see:

-1.0

In this human activity recognition project, we use multiclass classification machine learning techniques to analyse fitness dataset from a smartphone tracker.

In this machine learning project, you will use the video clip of an IPL match played between CSK and RCB to forecast key performance indicators like the number of appearances of a brand logo, the frames, and the shortest and longest area percentage in the video.

In this machine learning churn project, we implement a churn prediction model in python using ensemble techniques.

Use cluster analysis to identify the groups of characteristically similar schools in the College Scorecard dataset. Considerations: Clustering Algorithm Data Preparation How will you deal with missing values? Categorical variables? Feature intercorrelations? Feature normalization or scaling? Dimensionality reduction? Hyperparameters How will you set the parameters -- the algorithm's knobs and dials, so to speak -- in order to achieve valid and useful output? Interpretation Is it possible to explain what each cluster represents? Did you retain or prepare a set of features that enables a meaningful interpretation of the clusters? Do the compositions of the clusters seem to make sense? Validation How will you measure the validity of your clustering process? Which metrics will you use and how will you apply them?

We all at some point in time wished to create our own language as a child! But what if certain words always cooccur with another in a corpus? Thus you can make your own model which will understand which word goes with which one, which words are often coming together etc. This all can be done by building a custom embeddings model which we create in this project

In this deep learning project, you will build a convolutional neural network using MNIST dataset for handwritten digit recognition.

In this project, we will use time-series forecasting to predict the values of a sensor using multiple dependent variables. A variety of machine learning models are applied in this task of time series forecasting. We will see a comparison between the LSTM, ARIMA and Regression models. Classical forecasting methods like ARIMA are still popular and powerful but they lack the overall generalizability that memory-based models like LSTM offer. Every model has its own advantages and disadvantages and that will be discussed. The main objective of this article is to lead you through building a working LSTM model and it's different variants such as Vanilla, Stacked, Bidirectional, etc. There will be special focus on customized data preparation for LSTM.

In this ML Project, you will use the Avocado dataset to build a machine learning model to predict the average price of avocado which is continuous in nature based on region and varieties of avocado.

Given big data at taxi service (ride-hailing) i.e. OLA, you will learn multi-step time series forecasting and clustering with Mini-Batch K-means Algorithm on geospatial data to predict future ride requests for a particular region at a given time.

In this Deep Learning Project on Image Segmentation Python, you will learn how to implement the Mask R-CNN model for early fire detection.