How to connect to an API endpoint and query data using Python?

This recipe helps you connect to an API endpoint and query data using Python.


Recipe Objective

In big data scenarios, we often need to connect to multiple API endpoints and retrieve data from them. This is the very first step of data extraction, after which we apply processing such as parsing, cleaning, and transformation to derive business value from the data.

System requirements:

  • Install the Python module below if it is not already available:
  • pip install requests
  • The code below can be run in a Jupyter notebook or any Python console.
  • In this scenario we use a public open API (the Studio Ghibli API at https://ghibliapi.herokuapp.com/) to make requests.
  • Many other public APIs are available as well.

Step 1: Import the module

The most common library for making requests and working with APIs is the requests library. You’ll need to import it. Let’s start with that important step:

import requests

Step 2: Making an HTTP request

To interact with an API, you can make different types of HTTP requests, such as GET and POST. GET is the most commonly used; it retrieves data from the API. When a GET request succeeds, the response carries status code 200. To make a GET request we use the get() method.

Sample code to make a request:

import requests

response = requests.get('https://ghibliapi.herokuapp.com/films/')
if response.status_code == 200:
    print("Successful connection with API.")
    print('-------------------------------')
    data = response.json()
    print(data)
elif response.status_code == 404:
    print("Unable to reach URL.")
else:
    print("Unable to connect API or retrieve data.")

In the code above, if the response status code is 200, the script prints "Successful connection with API." along with the data as a JSON object; if the status code is 404 it prints "Unable to reach URL."; otherwise it reports that it could not connect to the API or retrieve data.
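If you would rather not branch on individual status codes by hand, the requests library also offers raise_for_status() and a timeout argument. The sketch below wraps the GET call so that connection failures, timeouts, and error statuses are all handled in one place (the helper name fetch_json is our own, not part of the library):

```python
import requests

def fetch_json(url, timeout=10):
    """Fetch JSON from an API endpoint; return None on any failure.

    Illustrative helper (our own name), not part of requests itself.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # raises HTTPError for 4xx/5xx responses
        return response.json()
    except requests.exceptions.RequestException as err:
        # Covers connection errors, timeouts, and HTTP error statuses alike
        print("Request failed:", err)
        return None
```

The timeout keeps the script from hanging indefinitely on an unresponsive server, which matters when you poll many endpoints in a pipeline.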

Step 3: Query the data by sending params

To query specific data from the API, we pass a "params" dictionary as an argument to the get() method.

import requests

response = requests.get(
    'https://ghibliapi.herokuapp.com/films/',
    params={'id': "4e236f34-b981-41c3-8c65-f8c9000b94e7"}
)
if response.status_code == 200:
    print("Successful connection with API.")
    print('-------------------------------')
    data = response.json()
    for record in data:
        print("Title: {},\n Release_Date: {},\n Director: {},\n".format(
            record['title'], record['release_date'], record['director']))
elif response.status_code == 404:
    print("Unable to reach URL.")
else:
    print("Unable to connect API or retrieve data.")

In the code above we query specific data from the API and print it in a structured format; the result is the title, release date, and director of the film matching the given id.
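To see exactly how requests encodes the params dictionary into the final request URL, you can prepare the request locally without sending anything over the network. This is a small illustrative sketch; requests.Request and prepare() are part of the library's public API:

```python
import requests

# Build the request object locally; nothing is sent over the network here.
request = requests.Request(
    "GET",
    "https://ghibliapi.herokuapp.com/films/",
    params={"id": "4e236f34-b981-41c3-8c65-f8c9000b94e7"},
)
prepared = request.prepare()

# The params dict is URL-encoded into the query string of the final URL
print(prepared.url)
# → https://ghibliapi.herokuapp.com/films/?id=4e236f34-b981-41c3-8c65-f8c9000b94e7
```

Inspecting the prepared URL this way is a handy debugging step when an API returns unexpected results and you want to confirm what was actually requested.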
