What are the most popular Hadoop tools/projects?

Posted January 31, 2019 1.8k views
Big Data

I have a question: what are the most popular Hadoop tools/projects?

edited by MattIPv4

2 answers

Hive provides an SQL-like query language (HiveQL) for data processing; each query gets converted into a MapReduce job behind the scenes. Hive is popular because queries are written in familiar SQL-like syntax. This is often confusing, because Hive doesn’t have all the controls of a relational database, but the query language is familiar.
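To give a rough feel for what "converted into a MapReduce job" means, here is a minimal plain-Python sketch of how a query like `SELECT page, COUNT(*) FROM page_views GROUP BY page` could be broken into map, shuffle, and reduce steps. It does not use Hive or Hadoop at all; the `page_views` table, its rows, and every function name are made up for illustration, and the plan Hive actually generates is more involved.

```python
from collections import defaultdict

# Hypothetical rows of a table page_views(user, page); illustrative data only.
ROWS = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/docs"),
    ("carol", "/home"),
]


def map_phase(rows):
    """Emit one (page, 1) pair per row, as a mapper for GROUP BY page might."""
    for _user, page in rows:
        yield page, 1


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, which corresponds to COUNT(*)."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_page(rows):
    """Run the three phases in order on an in-memory list of rows."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_page(ROWS))
```

On a real cluster the map and reduce steps run in parallel across many machines and the shuffle moves data over the network; the sketch only shows the shape of the computation that the SQL-like query stands in for.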

Spark is popular, especially for data processing, analytics and distributed machine learning. Spark jobs can be written in Scala or Python. The latter makes Spark especially popular with data scientists and those from a statistics background.

With the speed of open source software development, this answer may be very different in a year’s time.

The most popular Hadoop tools are:

  1. Hadoop Distributed File System (HDFS)
  2. HBase
  3. Hive
  4. Sqoop
  5. Pig
  6. NoSQL
  7. GIS Tools
  8. Spark: Apache Spark is an open-source distributed general-purpose cluster-computing framework. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
  9. Flume
  10. Ambari