Hive Basic Operations

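Hive is used mainly for queries rather than for deleting or modifying data. By default it does not support row-level deletes or updates, it does not support transactions, and it does not fully support standard SQL.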

I. A First Look at HQL

1. Create a table

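# Create a table. row format delimited fields terminated by ',' sets the field delimiter to a comma,
# so the text file you load must also be comma-delimited; otherwise the columns that cannot be parsed come back as NULL.
# With row format delimited, Hive accepts only a single-character field delimiter; the default delimiter is \001 (Ctrl-A).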
hive> CREATE TABLE student(classNo string, stuNo string, score int) row format delimited fields terminated by ',';
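
To see why a mismatched delimiter leads to NULLs, here is a minimal Python sketch that models how one delimited text row is split and cast. It only illustrates the behavior described above and is not Hive's actual SerDe code; `parse_row` and `STUDENT_SCHEMA` are hypothetical names introduced for this example.

```python
# Simplified model of reading one delimited text row (not Hive's real implementation).
STUDENT_SCHEMA = [("classNo", str), ("stuNo", str), ("score", int)]

def parse_row(line, delimiter=",", schema=STUDENT_SCHEMA):
    """Split a line on a single-character delimiter and cast each field.

    Missing or unparseable fields become None, which stands in for NULL.
    """
    parts = line.rstrip("\n").split(delimiter)
    row = {}
    for i, (name, cast) in enumerate(schema):
        if i >= len(parts):
            row[name] = None          # fewer fields than columns -> NULL
            continue
        try:
            row[name] = cast(parts[i])
        except ValueError:
            row[name] = None          # e.g. non-numeric text in an int column -> NULL
    return row

good = parse_row("C01,N0101,82")
bad = parse_row("C01\tN0101\t82")     # tab-separated line read with ',' as the delimiter
print(good)
print(bad)
```

In this model the tab-separated line is never split, so the whole line lands in the first string column and the remaining columns are None.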

2. Load data into the table
Create the following text file on the local file system: /root/student.txt

C01,N0101,82
C01,N0102,59
C01,N0103,65
C02,N0201,81
C02,N0202,82
C02,N0203,79
C03,N0301,56
C03,N0302,92
C03,N0306,72
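
# Copy student.txt into Hive's warehouse directory. The directory is set by hive.metastore.warehouse.dir,
# which is /usr/local/apache-hive-2.3.6/warehouse in this installation (Hive's stock default is /user/hive/warehouse).
# The OVERWRITE option makes Hive delete all existing files under the student directory before loading,
# after which the file contents are mapped onto the table.
hive> load data local inpath '/root/student.txt' overwrite into table student;
# Alternatively, put the file into the table directory directly from outside Hive
# hadoop fs -put /root/student.txt /usr/local/apache-hive-2.3.6/warehouse/student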

3. Query the data in the table

select * from student;

II. Managed and External Tables

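Managed (internal) table: data is stored by default under the path set by hive.metastore.warehouse.dir; dropping the table deletes both the metadata and the stored data; changes to the table are synchronized directly to the metadata.
External table: the data is managed in HDFS rather than by Hive and can live at any HDFS location; dropping the table deletes only the metadata; if the table structure or partitions are changed, the metadata must be repaired (MSCK REPAIR TABLE table_name;).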

# Create an external table
hive> CREATE EXTERNAL TABLE student2 (classNo string, stuNo string, score int) row format delimited fields terminated by ',' location '/tmp/student';
# Load data
hadoop fs -put /root/student.txt /tmp/student
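
The difference in drop behavior between the two table types can be sketched with a toy model in Python. This is only an illustration under stated assumptions: `ToyMetastore` is a hypothetical class, local temporary directories stand in for HDFS paths, and none of this is Hive code.

```python
import os
import shutil
import tempfile

class ToyMetastore:
    """Toy model of DROP TABLE for managed vs. external tables (not Hive code)."""

    def __init__(self):
        self.tables = {}  # table name -> (data directory, is_external)

    def create_table(self, name, location, external=False):
        os.makedirs(location, exist_ok=True)
        self.tables[name] = (location, external)

    def drop_table(self, name):
        location, external = self.tables.pop(name)   # metadata is always removed
        if not external:
            shutil.rmtree(location)                  # managed: the data is removed too

root = tempfile.mkdtemp()                            # stand-in for the HDFS root
ms = ToyMetastore()
ms.create_table("student", os.path.join(root, "warehouse", "student"))
ms.create_table("student2", os.path.join(root, "tmp", "student"), external=True)

ms.drop_table("student")
ms.drop_table("student2")

managed_exists = os.path.exists(os.path.join(root, "warehouse", "student"))
external_exists = os.path.exists(os.path.join(root, "tmp", "student"))
print(managed_exists, external_exists)               # prints: False True
shutil.rmtree(root)                                  # clean up the temporary directory
```

After both drops, the managed table's directory is gone while the external table's directory is still in place, which matches the description above.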

III. Partitioned Tables

# Employee data (saved as /root/employee.txt)
tom,4300
jerry,12000
mike,13000
jake,11000
rob,10000
# Create an employee table named employee, partitioned by the column date2
hive> create table employee (name string,salary bigint) partitioned by (date2 string) row format delimited fields terminated by ',' lines terminated by '\n' stored as textfile;
# Add partitions
hive> alter table employee add if not exists partition(date2='2018-12-01');
hive> alter table employee add if not exists partition(date2='2018-12-02');
hive> alter table employee add if not exists partition(date2='2018-12-03');
# Load data into the partitions
hive> load data local inpath '/root/employee.txt' into table employee partition(date2='2018-12-01');
hive> load data local inpath '/root/employee.txt' into table employee partition(date2='2018-12-02');
hive> load data local inpath '/root/employee.txt' into table employee partition(date2='2018-12-03');
# List the table's partitions
hive> show partitions employee;
# Query the data in a specific partition
hive> select * from employee where date2='2018-12-01';

Check how the data is stored in HDFS:
hadoop fs -ls /usr/local/apache-hive-2.3.6/warehouse/employee
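
Each partition is stored as its own subdirectory named key=value under the table directory, so the listing above should show one entry per partition. A minimal Python sketch of that naming follows; `partition_path` is a hypothetical helper introduced here, and it does not handle the escaping Hive applies to special characters in partition values.

```python
import posixpath

def partition_path(warehouse_dir, table, partition_spec):
    """Build the directory for one partition: <warehouse>/<table>/<key>=<value>/..."""
    parts = [f"{key}={value}" for key, value in partition_spec.items()]
    return posixpath.join(warehouse_dir, table, *parts)

WAREHOUSE = "/usr/local/apache-hive-2.3.6/warehouse"  # warehouse path used in this article
paths = [partition_path(WAREHOUSE, "employee", {"date2": day})
         for day in ("2018-12-01", "2018-12-02", "2018-12-03")]
for p in paths:
    print(p)
```

Because a partition is just a directory, a query that filters on date2 only needs to read the files under the matching subdirectory.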

Loading a file with the same name again does not raise an error; Hive automatically creates a copy named *_copy_1.txt.

hive> load data local inpath '/home/hadoop/tmp/employee.txt' into table employee partition(date2='2018-12-01');

# Check the partition's files in HDFS
hadoop fs -ls /usr/local/apache-hive-2.3.6/warehouse/employee/date2=2018-12-01
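
The renaming on a repeated load can be modeled with a short Python sketch. `next_copy_name` is a hypothetical helper, and the continuation to _copy_2 and beyond is an assumption extrapolated from the *_copy_1.txt behavior noted above, not something confirmed here.

```python
import os

def next_copy_name(existing, filename):
    """Return a file name that does not collide with names already in the directory."""
    if filename not in existing:
        return filename
    base, ext = os.path.splitext(filename)
    n = 1
    while f"{base}_copy_{n}{ext}" in existing:   # assumed numbering scheme
        n += 1
    return f"{base}_copy_{n}{ext}"

files = set()
for _ in range(3):                               # load employee.txt three times
    files.add(next_copy_name(files, "employee.txt"))
print(sorted(files))
```

Since every load adds another file, loading the same data repeatedly without OVERWRITE duplicates the rows in that partition.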
