How to query JSON data from a table in Snowflake


Recipe Objective: How to query JSON data from a table in Snowflake?

Snowflake is one of the few enterprise-ready cloud data warehouses that brings simplicity without sacrificing features. It scales automatically, both up and down, to strike the right balance between performance and cost. Snowflake's claim to fame is that it separates compute from storage. This is significant because almost every other database, Redshift included, combines the two, meaning you must size for your largest workload and incur the cost that comes with it. In this scenario, we will learn how to load JSON data into a target table in Snowflake and query it.



Step 1: Log in to the account

We need to log in to the Snowflake account. Go to snowflake.com and log in with your credentials.

Step 2: Select Database

To select the database you created earlier, we will use the "use" statement.

Syntax of the statement:

Use database [database-name];

Example of the statement:

use database demo_db;

The output of the statement:

bigdata_1.jpg

Step 3: Create File Format for JSON

Creates a named file format that describes a set of staged data to access or load into Snowflake tables.

Syntax of the statement:

CREATE [ OR REPLACE ] FILE FORMAT [ IF NOT EXISTS ] <name> TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] [ COMMENT = '<string_literal>' ]

Example of the statement:

create or replace file format my_json_format type = 'json' strip_outer_array = true;

The output of the statement:

bigdata_2.jpg
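To see what strip_outer_array does, consider a sketch: if the staged file holds a single JSON array of objects, this option strips the enclosing brackets so that each element of the array is loaded as its own row. The file content shown in the comment is a hypothetical illustration, not part of the recipe's data.

```sql
-- Hypothetical staged file content (one JSON array of two objects):
--   [ { "name": "A" }, { "name": "B" } ]
--
-- With strip_outer_array = true, the outer [ ] is removed, so a
-- COPY INTO using this file format loads two rows into the VARIANT
-- column instead of one row containing the whole array.
create or replace file format my_json_format
  type = 'json'
  strip_outer_array = true;
```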

Step 4: Create an Internal stage

Here we are going to create a temporary internal stage named custome_temp_int_stage that uses the JSON file format created above.

Syntax of the statement:

-- Internal stage
CREATE [ OR REPLACE ] [ TEMPORARY ] STAGE [ IF NOT EXISTS ] <internal_stage_name> [ FILE_FORMAT = ( { FORMAT_NAME = '<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ] [ COPY_OPTIONS = ( copyOptions ) ] [ COMMENT = '<string_literal>' ]

Example of the statement:

create temporary stage custome_temp_int_stage file_format = my_json_format;

The output of the statement:

bigdata_3.jpg

Step 5: Create Table in Snowflake using Create Statement

Here we are going to create a temporary table using the Create statement as shown below. It creates a new table in the current/specified schema or replaces an existing table.

Syntax of the statement:

CREATE [ OR REPLACE ] TABLE <table_name> ( <col_name> <col_type> [ , <col_name> <col_type> , ... ] );

Example of the statement:

create or replace temporary table dezyre_customer_table (dezyre_customer_data variant );

The output of the statement:

bigdata_4.jpg
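As a quick sanity check (not part of the original recipe), you can describe the table to confirm it was created with a single VARIANT column before loading any data into it.

```sql
-- Lists the table's columns; the output should show
-- DEZYRE_CUSTOMER_DATA with data type VARIANT.
desc table dezyre_customer_table;
```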

Step 6: Load JSON file to internal stage

Here we will upload the JSON data file from the local system into the internal stage using the put command, as shown below.

Example of the statement:

put file://D:\customer.json @custome_temp_int_stage;

The output of the statement:

bigdata_5.jpg
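The recipe does not show the contents of customer.json, but based on the attributes queried in Step 9 (address.city, address.state, address.streetAddress), a record in the file would look roughly like the hypothetical sample below. Since the file format uses strip_outer_array = true, the file as a whole would be a JSON array of such objects.

```json
{
  "name": "John Doe",
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY"
  }
}
```

After the put command completes, running list @custome_temp_int_stage; shows the uploaded (and automatically compressed) file in the stage.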

Step 7: Copy the data into Target Table

Here we will copy the JSON data, which we uploaded to the internal stage earlier, into the target table, as shown below.

Example of the statement:

copy into dezyre_customer_table from @custome_temp_int_stage/customer.json on_error = 'skip_file';

The output of the statement:

bigdata_6.jpg

Step 8: Querying the data directly

Here we will verify that the data was loaded into the target table by running a select query, as shown below.

Example of the statement:

SELECT * from DEZYRE_CUSTOMER_TABLE;

The output of the statement:

bigdata_7.jpg

Step 9: Querying the JSON object

Here we are going to query the JSON object using the select statement as shown below.

Example of the statement:

SELECT DEZYRE_CUSTOMER_DATA,
       DEZYRE_CUSTOMER_DATA:address.city::string as City,
       DEZYRE_CUSTOMER_DATA:address.state::string as State,
       DEZYRE_CUSTOMER_DATA:address.streetAddress::string as StreetNo
FROM DEZYRE_CUSTOMER_TABLE;

The output of the query: As you can see in the image below, we select individual attributes from the JSON object and return them as columns.

bigdata_8.jpg
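The path-and-cast syntax above handles nested objects. If the JSON also contained an array, its elements could be expanded into separate rows with LATERAL FLATTEN. The sketch below assumes a hypothetical phoneNumbers array that is not in the recipe's data.

```sql
-- Hypothetical: expand a phoneNumbers array into one row per element.
-- p.value is the VARIANT holding the current array element.
select d.DEZYRE_CUSTOMER_DATA:address.city::string as city,
       p.value:type::string   as phone_type,
       p.value:number::string as phone_number
from DEZYRE_CUSTOMER_TABLE d,
     lateral flatten(input => d.DEZYRE_CUSTOMER_DATA:phoneNumbers) p;
```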

Conclusion

Here we learned to query JSON data from the table in Snowflake.

