Hadoop is an open-source framework for storing, processing, and managing big data. Its core components are HDFS (the Hadoop Distributed File System), MapReduce, and YARN; related engines such as Apache Spark commonly run on top of it.
HDFS provides distributed data storage, while MapReduce enables parallel batch processing of large datasets. Spark supports fast, in-memory processing and, through Spark Streaming, near-real-time workloads. YARN manages resources across the nodes of the Hadoop cluster.
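To illustrate the MapReduce model described above, here is a minimal word-count sketch in Python. The map step emits a (word, 1) pair for each word, and the reduce step sums the counts per word; the sort in between stands in for Hadoop's shuffle phase. The sample input lines and the local driver are assumptions for demonstration, not a real Hadoop job.

```python
from itertools import groupby

def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.strip().split():
            yield (word.lower(), 1)

def reducer(pairs):
    """Reduce phase: sum the counts for each word.
    Hadoop sorts map output by key before reducing (the shuffle),
    so we sort here to simulate that step locally."""
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield (word, sum(count for _, count in group))

if __name__ == "__main__":
    sample = ["big data needs big tools", "hadoop stores big data"]
    print(dict(reducer(mapper(sample))))
```

In a real cluster the mapper runs in parallel on many input splits and the framework handles the shuffle; the logic per record, however, is exactly this small.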
Hadoop nodes are classified as master nodes and slave (worker) nodes. In classic MapReduce (Hadoop 1.x), the master daemon is called the JobTracker and each worker runs a TaskTracker.
The JobTracker controls the TaskTrackers: it schedules MapReduce tasks, assigns them to the workers, and monitors their progress.
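The scheduling idea above can be sketched as follows. This is a toy round-robin assignment of input splits to workers, meant only to show how a JobTracker-style master spreads map tasks across TaskTrackers; the function and the split/worker names are illustrative assumptions, not Hadoop's actual API.

```python
def assign_tasks(splits, tasktrackers):
    """Toy JobTracker-style scheduler (illustrative, not Hadoop's real API):
    assign each input split to a TaskTracker in round-robin order."""
    assignments = {t: [] for t in tasktrackers}
    for i, split in enumerate(splits):
        worker = tasktrackers[i % len(tasktrackers)]
        assignments[worker].append(split)
    return assignments

if __name__ == "__main__":
    splits = ["split-0", "split-1", "split-2", "split-3", "split-4"]
    workers = ["tt-1", "tt-2"]
    print(assign_tasks(splits, workers))
```

A real JobTracker is considerably smarter: it prefers to place a task on a node that already holds the split's HDFS block (data locality) and reschedules tasks when a TaskTracker fails.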
The main reason for using the Hadoop framework is that data that is large in volume, continuously growing, and varied in format cannot be handled efficiently by traditional relational database management systems (RDBMS).