Hadoop is an open-source framework for data processing, written in Java. First, let's dig into the history of Hadoop and where its name came from. Kids are very good at inventing new names. The word "googol", for example, was coined by a child, and the company name Google was later derived from it. Doug Cutting is the creator of Hadoop, and the idea for the name came to him through his son, who had named his yellow stuffed toy elephant Hadoop. The framework took its name from that toy. Hadoop is a framework that allows us to process very large data sets (i.e., data sets measured in petabytes and even zettabytes).
In this blog we are going to go through the entire Hadoop Ecosystem. The Hadoop Ecosystem comprises many different analytical tools; Big Data is not limited to the use of only a few tools for analysis. Today, we live in a world where enormous volumes of data are generated every second. Even a click on a social networking site generates data. To analyze this data, which keeps growing in geometric progression, many different tools have been developed over the years. IT giants like Yahoo and Facebook, as well as the Apache Software Foundation, have developed several analytical tools for Big Data computing. Let us start by discussing the tools available in the Hadoop Ecosystem for data analysis. The Hadoop Ecosystem comprises the following tools:
1) HDFS - Hadoop Distributed File System
2) YARN - Yet Another Resource Negotiator
3) MapReduce - Big Data processing using programming (e.g., in Java, R, etc.)
4) Spark - a framework for real-time data analytics
5) Apache