Hadoop HDFS is a Java-based distributed file system for storing large unstructured data sets. It is designed to provide high-throughput access to data across large Hadoop clusters of commodity servers. HDFS is sometimes referred to as the “secret sauce” of Apache Hadoop because data can sit in blocks on the file system until the organization wants to leverage it for big data analytics.
Hadoop HDFS tolerates disk and server failures by storing multiple copies of each data block on different servers in the cluster. Each file is split into large fixed-size blocks (128 MB by default), and each block is replicated across multiple servers (three copies by default), so the loss of any single machine does not make the data unavailable.
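Both the block size and the replication factor are configurable per cluster. As a minimal sketch, an hdfs-site.xml fragment setting the two standard Hadoop properties (the values shown are the common defaults, not requirements) might look like:

```xml
<configuration>
  <!-- Number of copies kept for each block; 3 is the usual default -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Block size in bytes; 134217728 = 128 MB, the default in Hadoop 2+ -->
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
  </property>
</configuration>
```

With a replication factor of 3, losing any single disk or server still leaves at least two live copies of every block, and the NameNode schedules re-replication to restore the target count.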