Load JSON data from a local machine to an internal stage in Snowflake

This recipe shows you how to load JSON data from a local machine to an internal stage in Snowflake.

Recipe Objective: How to load JSON data from a local machine to an internal stage in Snowflake?

Snowflake is one of the few enterprise-ready cloud data warehouses that brings simplicity without sacrificing features. It automatically scales, both up and down, to strike the right balance of performance vs. cost. Snowflake's claim to fame is that it separates compute from storage. This is significant because almost every other database, Redshift included, combines the two, meaning you must size for your largest workload and incur the cost that comes with it. In this recipe, we will learn how to load a JSON file from a local machine to an internal stage in Snowflake.

System requirements: a Snowflake account.

Step 1: Log in to the account

Log in to your Snowflake account: go to snowflake.com and sign in with your credentials.

Step 2: Select Database

To select the database that you created earlier, we will use the "use" statement.

Syntax of the statement:

Use database [database-name];

Example of the statement:

use database demo_db;

The output of the statement:

[Screenshot: statement output]
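
As an optional check that is not part of the original recipe, you can confirm which database the session is now using:

select current_database();

This should return DEMO_DB.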

Step 3: Create File Format for JSON

This statement creates a named file format that describes a set of staged data to access or load into Snowflake tables.

Syntax of the statement:

CREATE [ OR REPLACE ] FILE FORMAT [ IF NOT EXISTS ] <name> TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] [ COMMENT = '<string_literal>' ]

Example of the statement:

create or replace file format my_json_format type = 'json' strip_outer_array = true;

The output of the statement:

[Screenshot: statement output]
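
As an optional check beyond the original recipe, you can describe the file format to verify its options:

desc file format my_json_format;

With strip_outer_array = true, a file whose top level is a JSON array, such as [ {...}, {...} ], is split so that each array element loads as a separate row rather than as one large array value.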

Step 4: Create an Internal stage

Here we are going to create an internal stage named json_temp_int_stage that uses the JSON file format created above.

Syntax of the statement:

-- Internal stage
CREATE [ OR REPLACE ] [ TEMPORARY ] STAGE [ IF NOT EXISTS ] <internal_stage_name>
[ FILE_FORMAT = ( { FORMAT_NAME = '<file_format_name>' | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]
[ COPY_OPTIONS = ( copyOptions ) ]
[ COMMENT = '<string_literal>' ]

Example of the statement:

create temporary stage json_temp_int_stage file_format = my_json_format;

The output of the statement:

[Screenshot: statement output]
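
If you want to double-check the stage definition and its attached file format, a DESCRIBE statement works here as well (an optional step, not in the original recipe):

desc stage json_temp_int_stage;

Because the stage is temporary, it exists only for the duration of the current session and is dropped automatically when the session ends.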

Step 5: Create Table in Snowflake using Create Statement

Here we are going to create a temporary table using the CREATE statement, as shown below. It creates a new table in the current/specified schema or replaces an existing table.

Syntax of the statement:

CREATE [ OR REPLACE ] [ TEMPORARY ] TABLE <table_name> ( <col1_name> <col1_type> , <col2_name> <col2_type> , ... );

Example of the statement:

create or replace temporary table dezyre_emp_table (dezyre_json_data variant);

The output of the statement:

[Screenshot: statement output]
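
To verify the table structure, you can optionally describe it (not part of the original recipe):

desc table dezyre_emp_table;

A single VARIANT column can store each JSON document as-is, so no fixed schema is needed up front.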

Step 6: Load JSON file to internal stage

Here we load the JSON data file from your local system into the Snowflake internal stage using the PUT command, as shown below. Note that PUT cannot be executed from the web interface worksheets; run it from a client such as SnowSQL. The file name used here must match the one referenced by the COPY command in the next step.

Example of the statement:

put file://D:\employees.json @json_temp_int_stage;

The output of the statement:

[Screenshot: statement output]
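
To confirm the upload succeeded, you can list the stage contents (an optional check beyond the original recipe):

list @json_temp_int_stage;

Because PUT compresses files with gzip by default (auto_compress = true), the staged file will appear with a .gz suffix; COPY INTO decompresses it automatically during loading.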

Step 7: Copy the data into Target Table

Here we copy the JSON data that we loaded into the internal stage earlier into the target table, as shown below.

Example of the statement:

copy into dezyre_emp_table from @json_temp_int_stage/employees.json on_error = 'skip_file';

The output of the statement:

[Screenshot: statement output]
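
If the load reports errors, one way to inspect them without loading any data is to re-run the COPY statement in validation mode; this is an optional sketch beyond the original recipe:

copy into dezyre_emp_table from @json_temp_int_stage/employees.json validation_mode = 'return_errors';

This returns the problem rows and their error messages instead of loading anything.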

Here we will verify the data loaded into the target table by running a select query as shown below.

Example of the statement:

SELECT * from DEZYRE_EMP_TABLE;

The output of the statement:

[Screenshot: statement output]
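
Beyond SELECT *, you can pull individual attributes out of the VARIANT column using path notation and a cast. The key name used here (name) is hypothetical and depends on the structure of your JSON file:

select dezyre_json_data:name::string as employee_name from dezyre_emp_table;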

Conclusion

In this recipe, we learned how to load JSON data from a local machine to an internal stage in Snowflake and copy it into a target table.

