How to skip rows while reading a pandas dataframe?

This recipe helps you skip rows while reading a pandas dataframe.

Recipe Objective

While working with dataframes, importing can become a tedious task if we know that some redundant rows exist in our dataset. To handle them, the skiprows argument of read_csv can come in quite handy.

So this recipe is a short example of how to skip rows while reading a pandas dataframe. Let's get started.


Step 1 - Import the library

import pandas as pd
import seaborn as sb

Let's pause and look at these imports. Pandas is generally used for performing mathematical operations on tabular data and arrays. Seaborn is used here only to load a sample dataset.

Step 2 - Setup the Data

df = sb.load_dataset('tips')
df.to_csv('tips.csv')
df1 = pd.read_csv('tips.csv')
print(df1.head())

Here we have simply imported the tips dataset from the seaborn library and saved it as a csv file in the current directory. Then (in the 3rd line), we have read the csv back into the df1 variable.
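As a side note, to_csv writes the dataframe's index as an extra unnamed column, which read_csv then picks up as a regular column. A minimal sketch (not part of the original recipe) of two common ways to handle that, assuming the same tips.csv file:

# Option 1: don't write the index column at all
df.to_csv('tips.csv', index=False)

# Option 2: keep the file as-is, but treat its first column as the index when reading
df1 = pd.read_csv('tips.csv', index_col=0)
print(df1.head())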

Step 3 - Performing the skip rows operation while importing

df2 = pd.read_csv('tips.csv', skiprows=[1, 2, 4])
print(df2.head())

Here, we are trying to understand the importance of the skiprows argument. By passing skiprows=[1, 2, 4], we are ignoring the 1st, 2nd and 4th rows of the file while reading our dataset (row 0 is the header line, so these positions refer to lines of the csv file, not to dataframe row labels).
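Besides a list of positions, skiprows also accepts an integer or a callable. A small sketch of both variants, again assuming the same tips.csv file:

# Skip the first 3 lines of the file entirely; note this also skips the original header
# line, so the first remaining line is used for column names unless header/names are set
df3 = pd.read_csv('tips.csv', skiprows=3)
print(df3.head())

# Skip rows with a callable: keep the header (line 0) and drop every even-numbered data line
df4 = pd.read_csv('tips.csv', skiprows=lambda i: i > 0 and i % 2 == 0)
print(df4.head())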

Step 4 - Let's look at our dataset now

Once we run the above code snippet, we will see:

Scroll down to the ipython file below to see the output of the present operations.
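As a quick sanity check (a small sketch, not part of the original recipe), the shapes of the two dataframes should differ by exactly the three skipped rows:

# df1 was read without skipping, df2 with skiprows=[1, 2, 4]
print(df1.shape)
print(df2.shape)  # expected to have 3 fewer rows than df1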


Relevant Projects

Build Portfolio Optimization Machine Learning Models in R
Machine Learning Project for Financial Risk Modelling and Portfolio Optimization with R - Build a machine learning model in R to develop a strategy for building a portfolio for maximized returns.

NLP Project for Multi Class Text Classification using BERT Model
In this NLP Project, you will learn how to build a multi-class text classification model using the pre-trained BERT model.

GCP MLOps Project to Deploy ARIMA Model using uWSGI Flask
Build an end-to-end MLOps Pipeline to deploy a Time Series ARIMA Model on GCP using uWSGI and Flask

Azure Deep Learning-Deploy RNN CNN models for TimeSeries
In this Azure MLOps Project, you will learn to perform docker-based deployment of RNN and CNN Models for Time Series Forecasting on Azure Cloud.

Build a Multi Class Image Classification Model Python using CNN
This project explains how to build a Sequential Model that can perform Multi Class Image Classification in Python using CNN.

Word2Vec and FastText Word Embedding with Gensim in Python
In this NLP Project, you will learn how to use the popular topic modelling library Gensim for implementing two state-of-the-art word embedding methods, Word2Vec and FastText.

Azure Text Analytics for Medical Search Engine Deployment
Microsoft Azure Project - Use Azure text analytics cognitive service to deploy a machine learning model into Azure Databricks

MLOps Project on GCP using Kubeflow for Model Deployment
MLOps using Kubeflow on GCP - Build and deploy a deep learning model on Google Cloud Platform using Kubeflow pipelines in Python

Langchain Project for Customer Support App in Python
In this LLM Project, you will learn how to enhance customer support interactions through Large Language Models (LLMs), enabling intelligent, context-aware responses. This Langchain project aims to seamlessly integrate LLM technology with databases, PDF knowledge bases, and audio processing agents to create a comprehensive customer support application.

End-to-End ML Model Monitoring using Airflow and Docker
In this MLOps Project, you will learn to build an end-to-end pipeline to monitor any changes in the predictive power of the model or degradation of the data.