# Some Hadoop-Related Projects

A few commonly used Hadoop-related projects include:

* **HBase**: A scalable, distributed database that supports structured **data storage** for large tables
* **Hive**: A data warehouse infrastructure that provides **data summarization** and **ad hoc querying**
* **Pig**: A high-level data-flow **language** and **execution framework for parallel computation**
* **Spark**: A fast and general compute engine for Hadoop data. Spark provides a simple and expressive programming model that supports a wide range of applications, including **ETL**, **machine learning**, **stream processing**, and **graph computation**
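
To make "data summarization" concrete, here is a minimal sketch in plain Python of the kind of group-and-aggregate step that tools such as Hive or Pig run in parallel across a cluster. It does not use any Hive, Pig, or Spark API; the `records` data and the `summarize` helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (department, amount) pairs.
records = [
    ("sales", 100),
    ("sales", 200),
    ("eng", 300),
]


def summarize(rows):
    """Group rows by their first field and compute a count and total per group."""
    totals = defaultdict(lambda: {"count": 0, "total": 0})
    for dept, amount in rows:
        totals[dept]["count"] += 1
        totals[dept]["total"] += amount
    return dict(totals)


summary = summarize(records)
print(summary)
```

On a real cluster the same idea would typically be written as a SQL-like `GROUP BY` query in Hive or as a grouping step in a Pig data-flow script, with the framework distributing the work across machines.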

**The next section lists the steps involved in solving a typical large data problem.**
