
The Complete Hive CREATE TABLE Statement
Full CREATE TABLE syntax
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name    -- (Note: TEMPORARY available in Hive 0.14.0 and later)
  [(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [SKEWED BY (col_name, col_name, ...)                  -- (Note: Available in Hive 0.10.0 and later)
     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
     [STORED AS DIRECTORIES]]
  [
   [ROW FORMAT row_format]
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  -- (Note: Available in Hive 0.6.0 and later)
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]   -- (Note: Available in Hive 0.6.0 and later)
  [AS select_statement];   -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
Explanation of each clause:
[TEMPORARY]: creates a temporary table. The table is dropped automatically when the session is closed.
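As a minimal sketch of the TEMPORARY clause (the table and column names are made up for illustration, and Hive 0.14.0 or later is assumed):

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TEMPORARY TABLE tmp_page_views (
  user_id BIGINT,
  url     STRING
);
-- No explicit DROP is needed: the table goes away when the session is closed.
```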
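[EXTERNAL]: creates an external table. It is normally used together with [LOCATION hdfs_path], which gives the directory that holds the table's data. Dropping an external table removes only its metadata; the files are not deleted from HDFS.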
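The behaviour of EXTERNAL with LOCATION might look like this in practice. The table name, columns, and HDFS path are all assumptions chosen for the example:

```sql
-- Hypothetical table; '/data/logs/raw' is an assumed HDFS directory.
CREATE EXTERNAL TABLE IF NOT EXISTS ext_logs (
  ts      STRING,
  level   STRING,
  message STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/logs/raw';

-- Removes only the table metadata; the files under /data/logs/raw stay on HDFS.
DROP TABLE ext_logs;
```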
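[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]: partitions the table by the listed columns. Each partition corresponds to a subdirectory under the table's directory on HDFS. Note that a partition column must not be a column that is already declared in the table's column list.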
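A small sketch of a partitioned table, with hypothetical names. Note that dt appears only in PARTITIONED BY, not in the column list:

```sql
-- Hypothetical table; dt is the partition column and is not repeated among the regular columns.
CREATE TABLE page_views (
  user_id BIGINT,
  url     STRING
)
PARTITIONED BY (dt STRING COMMENT 'view date');

-- Registers a partition; its data lives in a dt=2024-01-01 subdirectory of the table directory.
ALTER TABLE page_views ADD PARTITION (dt = '2024-01-01');
```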
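[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]: bucketing. Rows are distributed into num_buckets buckets according to the CLUSTERED BY columns, and SORTED BY optionally orders the rows within each bucket. This plays the same role as the partitioner in MapReduce and makes data sampling convenient.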
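Bucketing and bucket-based sampling could be sketched as follows; the table name, columns, and the bucket count of 8 are arbitrary choices for the example:

```sql
-- Hypothetical table split into 8 buckets by user_id, sorted within each bucket.
CREATE TABLE users_bucketed (
  user_id BIGINT,
  name    STRING
)
CLUSTERED BY (user_id) SORTED BY (user_id ASC) INTO 8 BUCKETS;

-- Sample roughly one eighth of the rows by reading a single bucket.
SELECT * FROM users_bucketed TABLESAMPLE (BUCKET 1 OUT OF 8 ON user_id);
```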
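[SKEWED BY (col_name, col_name, ...) ON ((col_value, col_value, ...), (col_value, col_value, ...), ...) [STORED AS DIRECTORIES]] -- (Note: Available in Hive 0.10.0 and later): marks the table as a skewed table. (col_name, col_name, ...) names the skewed columns, and ON is followed by the skewed values. The optional STORED AS DIRECTORIES enables list bucketing, which stores the skewed values in their own subdirectories.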
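For a single skewed column the clause might be written like this. The table, the column, and the values 'US' and 'CN' are assumptions for illustration:

```sql
-- Hypothetical table where most rows carry country = 'US' or 'CN'.
CREATE TABLE clicks_skewed (
  user_id BIGINT,
  country STRING
)
SKEWED BY (country) ON ('US', 'CN')
STORED AS DIRECTORIES;  -- optional: keep the skewed values in separate subdirectories
```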
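[
[ROW FORMAT row_format]
[STORED AS file_format]
| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)] -- (Note: Available in Hive 0.6.0 and later)
]: ROW FORMAT row_format defines the delimiters that separate the data in each row, and [STORED AS file_format] specifies the file storage format. STORED BY is the alternative to these two clauses: it names a storage handler class for tables whose data is kept in another storage system.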
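A sketch of ROW FORMAT and STORED AS on two hypothetical tables, one delimited text table and one using the ORC format (ORC assumes Hive 0.11.0 or later):

```sql
-- Hypothetical text table: comma-separated fields, '|' between array elements.
CREATE TABLE csv_orders (
  order_id BIGINT,
  items    ARRAY<STRING>,
  amount   DOUBLE
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  COLLECTION ITEMS TERMINATED BY '|'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

-- Hypothetical columnar table; no delimiters are needed for ORC.
CREATE TABLE orc_orders (
  order_id BIGINT,
  amount   DOUBLE
)
STORED AS ORC;
```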
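[TBLPROPERTIES (property_name=property_value, ...)]: attaches key/value properties to the table. Hive has a number of built-in table properties, and a table's attributes can be changed later by modifying these properties.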
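Setting and later changing table properties could look like the sketch below. The table name and the 'creator' key are made up; 'skip.header.line.count' is one of Hive's predefined properties (assuming Hive 0.13.0 or later):

```sql
-- Hypothetical table; 'creator' is a user-defined key, 'skip.header.line.count' is predefined by Hive.
CREATE TABLE events (
  id      BIGINT,
  payload STRING
)
TBLPROPERTIES ('creator' = 'etl_team', 'skip.header.line.count' = '1');

-- Change a property after creation, then list the current properties.
ALTER TABLE events SET TBLPROPERTIES ('creator' = 'data_team');
SHOW TBLPROPERTIES events;
```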
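[AS select_statement]: loads data into the table at creation time, using the result of the query (CREATE TABLE AS SELECT). It is not supported for external tables.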
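A minimal CREATE TABLE AS SELECT sketch; both table names are hypothetical, and the source table is assumed to exist already:

```sql
-- top_urls takes its column names and types from the query result.
CREATE TABLE top_urls
STORED AS ORC
AS
SELECT url, COUNT(*) AS views
FROM page_views          -- hypothetical existing source table
GROUP BY url;
```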
