
# How to perform a Chi-Square test in Python?

This recipe helps you perform a Chi-Square test in Python.

## Recipe Objective.

The Chi-Squared test is a statistical hypothesis test that assumes (the null hypothesis) that the observed frequencies for a categorical variable match the expected frequencies for that variable. The test calculates a statistic that follows a chi-squared distribution.
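For reference, the statistic is usually written as Pearson's chi-squared statistic. The symbols below are not from the recipe: $O_i$ stands for the observed frequency in cell $i$ and $E_i$ for the expected frequency in the same cell.

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```

Large values of $\chi^2$ mean the observed counts are far from what the null hypothesis predicts.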

The number of observations in each category may or may not be equal. Still, we can calculate the expected frequency of observations in each group and check whether the observed frequencies differ from them.
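As a rough sketch of how the expected frequencies can be obtained by hand, the snippet below uses NumPy on the same 2x4 table that this recipe uses in Step 2. It assumes the usual independence model, where each expected count is (row total × column total) / grand total. The variable names are illustrative only.

```python
import numpy as np

# Observed counts: the same table used in Step 2 of this recipe
observed = np.array([[37, 73, 102, 400],
                     [10, 45, 200, 300]])

# Marginal totals, kept 2-D so they broadcast against each other
row_totals = observed.sum(axis=1, keepdims=True)   # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 4)
grand_total = observed.sum()

# Expected count per cell under independence
expected = row_totals * col_totals / grand_total

# Chi-squared statistic: sum of (observed - expected)^2 / expected
stat = ((observed - expected) ** 2 / expected).sum()

print(expected.round(2))
print(round(float(stat), 2))
```

The expected table keeps the same row and column totals as the observed table, which is a quick sanity check on the calculation.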

If Statistic >= Critical Value: reject the null hypothesis (Ho); the variables are dependent. If Statistic < Critical Value: fail to reject the null hypothesis (Ho); there is no evidence that the variables are dependent.

## Step 1- Importing Libraries.

```
# chi-squared test with similar proportions
from scipy.stats import chi2_contingency
from scipy.stats import chi2
import pandas as pd
```

## Step 2- Creating Table.

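Create a sample 2-D contingency table and pass it to `chi2_contingency`, which returns the test statistic (`stat`), the p-value (`p`), the degrees of freedom (`dof`), and the expected frequencies (`expected`). We set `prob` to 0.90, which corresponds to a significance level of 0.10, and use it to look up the critical chi-squared value.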

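```
# contingency table
data = [[37, 73, 102, 400], [10, 45, 200, 300]]
print(data)

# run the test: statistic, p-value, degrees of freedom, expected frequencies
stat, p, dof, expected = chi2_contingency(data)

# interpret test-statistic: critical value at the chosen probability
prob = 0.90
chi = chi2.ppf(prob, dof)
print(chi)
```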
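To make the critical value less of a black box, here is a small sketch of where `dof` and the critical value come from. It assumes the standard rule for an r x c contingency table, dof = (r - 1)(c - 1). The names `rows`, `cols`, `alpha` and `critical` are illustrative and not part of the recipe.

```python
from scipy.stats import chi2

# The table in this recipe has 2 rows and 4 columns
rows, cols = 2, 4
dof = (rows - 1) * (cols - 1)

# prob = 0.90 means we accept a 10% chance of a false rejection
prob = 0.90
alpha = 1 - prob

# Critical value: the point below which 90% of the chi-squared
# distribution with `dof` degrees of freedom lies
critical = chi2.ppf(prob, dof)

print(f"dof={dof}, alpha={alpha:.2f}, critical value={critical:.3f}")
```

With 3 degrees of freedom and `prob = 0.90`, the critical value is roughly 6.25, so any test statistic at or above that leads to rejecting Ho at this level.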

## Step 3- Printing Result.

Now we compare the test statistic (`stat`) with the critical value (`chi`) to decide whether to reject or fail to reject the null hypothesis.

```
if abs(stat) >= chi:
    print('reject Ho')
else:
    print('fail to reject Ho')
```
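An equivalent way to read the result is to compare the p-value with a significance level instead of comparing the statistic with a critical value. The sketch below is self-contained and assumes `alpha = 0.10`, which matches `prob = 0.90` used above. The name `alpha` is an illustrative choice.

```python
from scipy.stats import chi2_contingency

# Same contingency table as in Step 2
data = [[37, 73, 102, 400], [10, 45, 200, 300]]
stat, p, dof, expected = chi2_contingency(data)

# Significance level matching prob = 0.90
alpha = 0.10

# A p-value at or below alpha leads to the same decision as
# a statistic at or above the critical value
if p <= alpha:
    print('reject Ho')
else:
    print('fail to reject Ho')
```

Both approaches lead to the same decision for a given significance level. The p-value form is often easier to report because it does not depend on looking up a critical value.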
