How to skip rows while reading pandas dataframe?

This recipe helps you skip rows while reading a CSV file into a pandas dataframe

Recipe Objective

While working with dataframes, importing can be a tedious task if we know that some redundant rows already exist in our dataset. To handle them, the skiprows argument of read_csv can come in quite handy.

So this recipe is a short example of how to skip rows while reading a file into a pandas dataframe. Let's get started.

Step 1 - Import the library

import pandas as pd
import seaborn as sb

Let's pause and look at these imports. Pandas is a library for loading and manipulating tabular data. Seaborn is a plotting library; here it is used only to load a sample dataset.

Step 2 - Setup the Data

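# Load the sample dataset and save it as a CSV file
df = sb.load_dataset('tips')
df.to_csv('tips.csv', index=False)  # index=False avoids writing an extra index column

# Read the whole file back for reference
df1 = pd.read_csv('tips.csv')
print(df1.head())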

Here we have simply loaded the tips dataset from the seaborn library and saved it as a csv file in the current working directory. Then, in the last two lines, we have read that file back into the df1 variable and printed its first five rows.
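One detail worth knowing when saving and re-reading a csv file: by default, to_csv also writes the dataframe index, and read_csv then loads it back as an extra column named 'Unnamed: 0'. The sketch below illustrates this with a tiny made-up dataframe and an in-memory buffer (both are assumptions for illustration, not part of the recipe's data).

```python
import io
import pandas as pd

# Tiny made-up dataframe standing in for the tips data (illustrative values only)
df = pd.DataFrame({"total_bill": [10.0, 20.0], "tip": [1.5, 3.0]})

# Default behaviour: the index is written as an unnamed first column
buf_with_index = io.StringIO()
df.to_csv(buf_with_index)
buf_with_index.seek(0)
with_index = pd.read_csv(buf_with_index)

# index=False: only the real columns are written
buf_no_index = io.StringIO()
df.to_csv(buf_no_index, index=False)
buf_no_index.seek(0)
no_index = pd.read_csv(buf_no_index)

print(list(with_index.columns))  # ['Unnamed: 0', 'total_bill', 'tip']
print(list(no_index.columns))    # ['total_bill', 'tip']
```

Passing index=False when saving keeps the re-read dataframe identical in shape to the original.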

Step 3 - Performing skip rows operation while importing.

df2 = pd.read_csv('tips.csv', skiprows=[1, 2, 4])
print(df2.head())

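Here, the skiprows argument tells read_csv which lines of the file to ignore. When it is given a list, the values are 0-based line numbers in the file, and line 0 is the header. So skiprows=[1,2,4] keeps the header and skips the 1st, 2nd and 4th data rows while reading our dataset.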
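To see the line numbering in isolation, here is a minimal sketch using a small made-up csv held in memory instead of tips.csv (the column names and values are assumptions for illustration only).

```python
import io
import pandas as pd

# Made-up csv text: file line 0 is the header, lines 1..6 are data rows
csv_text = "id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n6,f\n"

# Reference read with nothing skipped
full = pd.read_csv(io.StringIO(csv_text))

# Skip file lines 1, 2 and 4 (0-based); the header on line 0 is kept
skipped = pd.read_csv(io.StringIO(csv_text), skiprows=[1, 2, 4])

print(full["id"].tolist())     # [1, 2, 3, 4, 5, 6]
print(skipped["id"].tolist())  # [3, 5, 6]
```

The rows with id 1, 2 and 4 are gone, and the remaining rows are re-indexed from 0.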

Step 4 - Let's look at our dataset now

Once we run the above code snippets, df1.head() prints the first five rows of the full dataset, while df2.head() starts from what was the third data row, because the 1st, 2nd and 4th data rows were skipped during reading.
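Besides a list, skiprows also accepts an integer or a callable, and these behave a little differently. The sketch below uses a small made-up csv held in memory (an assumption for illustration) to compare them. Note that an integer counts lines from the very top of the file, so the header line is skipped as well.

```python
import io
import pandas as pd

# Made-up csv text: file line 0 is the header, lines 1..6 are data rows
csv_text = "id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n6,f\n"

# Integer: skips the first 2 lines of the file, header included,
# so the line "2,b" is then treated as the header
as_int = pd.read_csv(io.StringIO(csv_text), skiprows=2)

# Range starting at 1: skips the first 2 data rows but keeps the header
as_range = pd.read_csv(io.StringIO(csv_text), skiprows=range(1, 3))

# Callable: called with each 0-based line number, returns True to skip.
# Here every even-numbered line except the header is skipped.
as_callable = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=lambda i: i > 0 and i % 2 == 0,
)

print(list(as_int.columns))        # ['2', 'b']
print(as_range["id"].tolist())     # [3, 4, 5, 6]
print(as_callable["id"].tolist())  # [1, 3, 5]
```

A range starting at 1 is usually the safer choice when the goal is to drop leading data rows while keeping the column names.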
