Hadoop: Hive untuk Query SQL
Revision as of 05:21, 14 November 2015
Source: http://hortonworks.com/hadoop/hive/
What Hive Does
Hadoop was built to organize and store massive amounts of data of all shapes, sizes and formats. Because of Hadoop’s “schema on read” architecture, a Hadoop cluster is a perfect reservoir of heterogeneous data—structured and unstructured—from a multitude of sources.
Data analysts use Hive to explore, structure and analyze that data, then turn it into actionable business insight.
Advantages of using Hive for enterprise SQL in Hadoop:
Feature | Description
Familiar | Query data with a SQL-based language
Fast | Interactive response times, even over huge datasets
Scalable and Extensible | As data variety and volume grow, more commodity machines can be added without a corresponding reduction in performance
How Hive Works
The tables in Hive are similar to tables in a relational database, and data units are organized in a taxonomy from larger to more granular units. Databases are composed of tables, which are made up of partitions. Data can be accessed via a simple SQL-like query language (HiveQL), and Hive supports overwriting or appending data.
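The difference between appending and overwriting can be illustrated with a small sketch. The Python helper below only builds HiveQL statement strings and does not connect to Hive. The table name `sales`, the staging table `staging_sales`, its columns, and the partition column `dt` are hypothetical names chosen for illustration. The statements follow Hive's `INSERT INTO TABLE` (append) and `INSERT OVERWRITE TABLE` (replace) forms.

```python
def build_insert(table, select_sql, partition, overwrite=False):
    """Return a HiveQL INSERT statement as a string (nothing is executed).

    overwrite=False -> INSERT INTO      (appends rows to existing data)
    overwrite=True  -> INSERT OVERWRITE (replaces existing data in the target)
    """
    mode = "OVERWRITE" if overwrite else "INTO"
    # Static partition spec, e.g. PARTITION (dt='2015-11-14')
    pairs = ", ".join(f"{key}='{value}'" for key, value in partition.items())
    return f"INSERT {mode} TABLE {table} PARTITION ({pairs}) {select_sql}"


# Hypothetical table and column names, for illustration only.
select_sql = "SELECT id, amount FROM staging_sales"
target_partition = {"dt": "2015-11-14"}

append_stmt = build_insert("sales", select_sql, target_partition)
overwrite_stmt = build_insert("sales", select_sql, target_partition, overwrite=True)

print(append_stmt)
print(overwrite_stmt)
```

In this sketch the two statements differ only in the `INTO` / `OVERWRITE` keyword. The helper does no quoting or escaping, so it is suitable for illustration only.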
Within a particular database, data in the tables is serialized and each table has a corresponding Hadoop Distributed File System (HDFS) directory. Each table can be sub-divided into partitions that determine how data is distributed within sub-directories of the table directory. Data within partitions can be further broken down into buckets.
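The directory layout described above can be sketched in Python. This is a simplified model and not Hive's own code. It assumes the default warehouse location `/user/hive/warehouse` (configurable in a real deployment), the usual `<database>.db` directory naming, and `key=value` partition sub-directories. The database, table, and column names are hypothetical. The bucket function is a deliberate simplification for integer keys; Hive's actual hashing depends on the column type and Hive version.

```python
import posixpath

# Assumed default warehouse location; real clusters may configure another path.
WAREHOUSE_DIR = "/user/hive/warehouse"


def table_dir(database, table):
    """Directory that holds a table's data in this simplified model."""
    if database == "default":
        # Tables in the default database sit directly under the warehouse dir.
        return posixpath.join(WAREHOUSE_DIR, table)
    return posixpath.join(WAREHOUSE_DIR, f"{database}.db", table)


def partition_dir(database, table, partition):
    """Sub-directory for one partition: one key=value level per partition column."""
    levels = [f"{key}={value}" for key, value in partition.items()]
    return posixpath.join(table_dir(database, table), *levels)


def bucket_for(int_key, num_buckets):
    """Simplified bucket choice for an integer key: non-negative value mod bucket count."""
    return (int_key & 0x7FFFFFFF) % num_buckets


# Hypothetical names, for illustration only.
path = partition_dir("web", "page_views", {"dt": "2015-11-14", "country": "ID"})
print(path)
print(bucket_for(42, 4))
```

Each partition column adds one directory level under the table directory, and rows within a partition are then spread across a fixed number of buckets.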
Hive supports all the common primitive data types, such as BIGINT, BINARY, BOOLEAN, CHAR, DECIMAL, DOUBLE, FLOAT, INT, SMALLINT, STRING, TIMESTAMP, and TINYINT. In addition, analysts can combine primitive data types to form complex data types, such as structs, maps and arrays.
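As a rough illustration of how primitive types combine into complex ones, the Python sketch below composes HiveQL type strings and a `CREATE TABLE` statement. It only builds text and does not talk to Hive. The helper functions, the table name `users`, and the column names are hypothetical; the `ARRAY<...>`, `MAP<..., ...>`, and `STRUCT<name:type, ...>` spellings follow HiveQL's type syntax.

```python
def array_of(element_type):
    """HiveQL array type holding elements of one type."""
    return f"ARRAY<{element_type}>"


def map_of(key_type, value_type):
    """HiveQL map type; the key is expected to be a primitive type."""
    return f"MAP<{key_type}, {value_type}>"


def struct_of(**fields):
    """HiveQL struct type built from field_name=type pairs."""
    body = ", ".join(f"{name}:{hive_type}" for name, hive_type in fields.items())
    return f"STRUCT<{body}>"


# Hypothetical columns mixing primitive and complex types.
columns = {
    "user_id": "BIGINT",
    "tags": array_of("STRING"),
    "attributes": map_of("STRING", "INT"),
    "address": struct_of(street="STRING", city="STRING"),
}

column_lines = ",\n".join(f"  {name} {hive_type}" for name, hive_type in columns.items())
ddl = f"CREATE TABLE users (\n{column_lines}\n)"
print(ddl)
```

Because each helper returns a plain type string, complex types can be nested, for example `array_of(struct_of(name="STRING", score="DOUBLE"))`.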