HANDS-ON-LAB

Flight Price Prediction using Machine Learning

Problem Statement

Predict flight ticket prices using the Linear Regression algorithm.

Dataset

The dataset contains 12 variables, of which Price is the target variable. The complete data dictionary can be found here.

 

Kindly download the data from here.

 

Tasks

  1. Hypothesis-based EDA:

  • Does the price vary across airlines for the same source_city to destination_city route?

  • How is the price affected when tickets are bought just 1 or 2 days before departure?

  • Does the ticket price change based on the departure time and arrival time?

  2. Check the distribution of the Price variable and remove outliers to create a new dataset.

  3. Build a Linear Regression model with all the features (m1), select only the top 5 features using the model coefficients, and rebuild the regression model (m2) using Statsmodels. Note any differences between the two models in R² and adjusted R².

  4. Build a Linear Regression model using scikit-learn and Statsmodels, and compare the results.
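The outlier-removal task above does not name a method, so the following is only a minimal sketch of one common choice, the 1.5 × IQR rule. The column name `price` and the toy values are assumptions made up for illustration; they are not taken from the lab dataset.

```python
import pandas as pd

# Toy prices invented for illustration (one obvious outlier at the end).
# The column name "price" is an assumption about the real dataset.
df = pd.DataFrame({"price": [4000, 4200, 4500, 4700, 5000, 5200, 5500, 60000]})

# Inspect the distribution first: summary statistics and skewness.
print(df["price"].describe())
print("skew before:", df["price"].skew())

# 1.5 * IQR fences (one possible outlier rule, not the only one).
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only the rows inside the fences and store them as a new dataset.
df_clean = df[df["price"].between(lower, upper)].copy()
print("rows kept:", len(df_clean), "of", len(df))
print("skew after:", df_clean["price"].skew())
```

On the real data, a histogram or box plot of the price column before and after filtering is a quick way to check whether the chosen fences are reasonable.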

Analyze the impact of airlines, time of purchase, and departure/arrival time on ticket prices.

 

FAQs

Q1. Does the price vary across airlines for the same source_city to destination_city route?

Yes. Different airlines operating between the same source and destination cities can charge different prices, so compare prices per airline within each route rather than across the whole dataset.
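One way to check this in the data is to group by route and airline. The sketch below uses a few invented rows; `source_city` and `destination_city` appear in the task text, while `airline` and `price` are assumed column names.

```python
import pandas as pd

# Toy rows invented for illustration; "airline" and "price" are assumed column names.
df = pd.DataFrame({
    "airline": ["A", "A", "B", "B", "A", "B"],
    "source_city": ["Delhi", "Delhi", "Delhi", "Delhi", "Mumbai", "Mumbai"],
    "destination_city": ["Mumbai", "Mumbai", "Mumbai", "Mumbai", "Delhi", "Delhi"],
    "price": [5000, 5400, 7000, 7400, 4800, 6900],
})

# Mean price per airline within each route (one row per route, one column per airline).
route_airline = (
    df.groupby(["source_city", "destination_city", "airline"])["price"]
    .mean()
    .unstack("airline")
)

# Gap between the most and least expensive airline on each route.
route_airline["spread"] = route_airline.max(axis=1) - route_airline.min(axis=1)
print(route_airline)
```

A large spread on many routes would support the hypothesis; a box plot of price by airline for a single route shows the same comparison visually.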

 

Q2. How is the price affected when tickets are bought just 1 or 2 days before departure?

Ticket prices tend to be higher when purchased close to the departure date, because demand is higher and fewer seats remain.
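A simple way to test this is to split bookings into "1 to 2 days before departure" and "earlier" and compare average prices. The column names `days_left` and `price` and the toy values below are assumptions for illustration.

```python
import pandas as pd

# Toy rows invented for illustration; "days_left" and "price" are assumed column names.
df = pd.DataFrame({
    "days_left": [1, 2, 2, 10, 20, 30],
    "price": [9000, 8000, 8500, 5000, 4500, 4000],
})

# Label each booking as last-minute (1 or 2 days left) or earlier.
df["window"] = df["days_left"].le(2).map({True: "last_minute", False: "earlier"})

# Average price in each booking window and the ratio between them.
mean_by_window = df.groupby("window")["price"].mean()
ratio = mean_by_window["last_minute"] / mean_by_window["earlier"]

print(mean_by_window)
print("last-minute / earlier price ratio:", round(ratio, 2))
```

Plotting the mean price against every value of `days_left` gives a fuller picture than the two-bucket split.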

 

Q3. Does the ticket price change based on the departure time and arrival time?

Yes. Departure and arrival times can affect ticket prices, because some time slots are in higher demand than others.
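To examine this, prices can be summarized by departure slot and arrival slot. The sketch below assumes categorical columns named `departure_time` and `arrival_time` and uses invented rows and slot labels.

```python
import pandas as pd

# Toy rows invented for illustration; column names and slot labels are assumptions.
df = pd.DataFrame({
    "departure_time": ["Morning", "Morning", "Night", "Night", "Morning", "Night"],
    "arrival_time": ["Evening", "Evening", "Morning", "Morning", "Night", "Evening"],
    "price": [6000, 6400, 4000, 4200, 5000, 4500],
})

# Mean price for every departure/arrival slot combination (NaN where no flights exist).
slot_prices = df.pivot_table(
    index="departure_time", columns="arrival_time", values="price", aggfunc="mean"
)
print(slot_prices)

# Median price by departure slot alone, which is less sensitive to extreme fares.
dep_median = df.groupby("departure_time")["price"].median()
print(dep_median)
```

On the full dataset, a heatmap of `slot_prices` makes the expensive and cheap slot combinations easy to spot.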