Display free space and sizes of files in HDFS

This recipe helps you display the free space in HDFS and the sizes of the files and directories contained in a given directory.
Last Updated: 25 Aug 2022

Recipe Objective: How to display free space and sizes of files and directories contained in the given directory in HDFS?

It is essential to keep track of the free space available in HDFS and the sizes of the files and directories stored in it. In this recipe, we learn how to find these values for a given directory in HDFS.

Prerequisites:

Before proceeding with the recipe, make sure single-node Hadoop is installed on your EC2 instance.

Steps to set up an environment:

In AWS, create an EC2 instance. Log in to the instance through PuTTY or a terminal and check whether Hadoop is installed. If it is not, install it before continuing.
Type "<your public IP>:7180" in the web browser, using the public IP shown for the EC2 instance, and log in to Cloudera Manager, where you can also check whether Hadoop is installed.
If the required services are not visible in the Cloudera cluster, add them by clicking "Add Services" in the cluster.
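
The terminal check described above can also be scripted. The following is a minimal sketch, assuming the Hadoop client is exposed as an `hdfs` executable on the PATH; the helper names `hadoop_cli_available` and `hadoop_version` are hypothetical and chosen only for illustration.

```python
import shutil
import subprocess
from typing import Optional


def hadoop_cli_available(executable: str = "hdfs") -> bool:
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(executable) is not None


def hadoop_version(executable: str = "hdfs") -> Optional[str]:
    """Return the first line printed by `hdfs version`, or None if the CLI is missing."""
    if not hadoop_cli_available(executable):
        return None
    result = subprocess.run(
        [executable, "version"],
        capture_output=True,
        text=True,
        check=True,  # raise if the command exits with a non-zero status
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None
```

Calling `hadoop_version()` on the instance returns `None` when the `hdfs` command is not found, which is a hint that Hadoop still needs to be installed or added to the PATH.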

Displaying free space & sizes of files and directories contained in the given directory:

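The "-df" option of the "hdfs dfs" command displays the total capacity, the used space, and the available free space of the file system that holds the given path. It does not list the sizes of individual files or directories. Adding "-h" prints the sizes in human-readable units. The syntax for the same is:

hdfs dfs -df [-h] [<path> ...]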

The image below shows the result of this command. It lists the file system, its total size, the space used, the space available, and the percentage of space used.

bigdata_1
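
To use these values in a script, the output of "hdfs dfs -df" can be parsed into numbers. The sketch below assumes the command is run without "-h", so every size is a plain byte count, and that the output has one header line followed by five whitespace-separated columns. The names `DfRow`, `parse_df`, and `run_df` are hypothetical, and `SAMPLE_DF_OUTPUT` holds made-up values for illustration, not results from a real cluster.

```python
import subprocess
from typing import List, NamedTuple


class DfRow(NamedTuple):
    filesystem: str
    size: int          # total capacity in bytes
    used: int          # bytes in use
    available: int     # bytes still free
    use_percent: float


def parse_df(output: str) -> List[DfRow]:
    """Parse the text printed by `hdfs dfs -df` (without -h) into rows."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore anything that does not look like a data row
        filesystem, size, used, available, percent = fields
        rows.append(
            DfRow(filesystem, int(size), int(used), int(available),
                  float(percent.rstrip("%")))
        )
    return rows


def run_df(path: str = "/") -> List[DfRow]:
    """Run `hdfs dfs -df <path>` and parse its output (requires the hdfs CLI)."""
    result = subprocess.run(
        ["hdfs", "dfs", "-df", path],
        capture_output=True, text=True, check=True,
    )
    return parse_df(result.stdout)


# Made-up sample in the assumed layout, for illustration only.
SAMPLE_DF_OUTPUT = """\
Filesystem                   Size    Used  Available  Use%
hdfs://example-host:8020  1000000  250000     750000   25%
"""
```

On a machine with Hadoop installed, `run_df("/")` returns one `DfRow` per file system, and a script can, for example, warn when `use_percent` crosses a chosen threshold.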

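Using "-count": We can pass the paths of the required files or directories to this command. For each path, it returns a line with the columns "DIR_COUNT," "FILE_COUNT," "CONTENT_SIZE," and "PATHNAME," which are the number of directories, the number of files, the total size in bytes, and the path itself. The command for the same is:

hdfs dfs -count <path>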

bigdata_2
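
The "-count" output can be parsed in the same way. This is a minimal sketch, assuming the plain form of the command (no extra options such as "-q"), so that each line holds exactly the four columns named above. The names `CountEntry`, `parse_count`, and `run_count` are hypothetical, and `SAMPLE_COUNT_OUTPUT` contains made-up values for illustration.

```python
import subprocess
from typing import List, NamedTuple


class CountEntry(NamedTuple):
    dir_count: int
    file_count: int
    content_size: int  # total size in bytes
    pathname: str


def parse_count(output: str) -> List[CountEntry]:
    """Parse the text printed by `hdfs dfs -count <path>` into entries."""
    entries = []
    for line in output.splitlines():
        # Split into at most four fields so that paths with spaces stay intact.
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # skip blank or unexpected lines
        dirs, files, size, path = parts
        entries.append(CountEntry(int(dirs), int(files), int(size), path.rstrip()))
    return entries


def run_count(*paths: str) -> List[CountEntry]:
    """Run `hdfs dfs -count` on the given paths (requires the hdfs CLI)."""
    result = subprocess.run(
        ["hdfs", "dfs", "-count", *paths],
        capture_output=True, text=True, check=True,
    )
    return parse_count(result.stdout)


# Made-up sample in the assumed layout, for illustration only.
SAMPLE_COUNT_OUTPUT = (
    "           3            5             123456 /user/example\n"
    "           1            2               2048 /user/example/my dir\n"
)
```

With the entries parsed, a script can sort directories by `content_size` to see which ones take up the most space.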

This is how we display the space and size-related information about the files and directories in a given directory in HDFS.
