4
[Hive 基础]-- 使用 Map 和 Array 数据结构
source link: https://blog.51cto.com/u_13966077/5820199
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Hive支持的数据类型分为基础数据类型和复杂类型,这是关系性数据库不具备的特性。
- 基础类型主要包括:tinyint,smalint,int,bigint,boolean,float,double,string,timestamp 等
- 复杂类型主要包括:array,map,struct等
参考: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types
1、map
1.1 可以存储如下数据
"geo": {
"country": "中国",
"subdivision": "四川",
"city": "成都",
"latitude": "31.261",
"longitude": "127.22562",
"isp": "电信"
}
"country": "中国",
"subdivision": "四川",
"city": "成都",
"latitude": "31.261",
"longitude": "127.22562",
"isp": "电信"
}
说明:geo 代表hive 表字段名称,冒号后的是对应的值。
1.2 创建 hive 表
create table fdp.map_test(
id string,
geo map<string,string>
);
id string,
geo map<string,string>
);
1.3 准备数据文件 map_test
1 country:中国,subdivision:四川,city:德阳,latitude:31.86141,longitude:117.27562,isp:移动
2 country:中国,subdivision:四川,city:绵阳,latitude:31.86141,longitude:117.27562,isp:电信
3 country:中国,subdivision:安徽,city:淮南,latitude:31.86141,longitude:117.27562,isp:电信
2 country:中国,subdivision:四川,city:绵阳,latitude:31.86141,longitude:117.27562,isp:电信
3 country:中国,subdivision:安徽,city:淮南,latitude:31.86141,longitude:117.27562,isp:电信
1.4 load 到hive 表中,然后查询
load data local inpath "/data/map_test" into table fdp.map_test;
select * from fdp.map_test where geo["country"] ='中国' ;
select * from fdp.map_test where geo["country"] ='中国' ;
2、array
2.1 可以存储如下数据
"business": ["it","os","abc"]
说明:business 代表hive 表里的字段名称,冒号后的是对应的值。
2.2 创建hive 表
create table fdp.array_test(
id string,
business array<string>
);
id string,
business array<string>
);
2.3 准备数据文件 array_test
1 iqwt,st,ows,dds
2 idw,iss,dcf
3 idw,swd,dv
2 idw,iss,dcf
3 idw,swd,dv
2.4 load 到 hive 表中,然后查询
注意: sql 中需要使用视图,spark sql 和 hive sql 都支持 视图。
#load data local inpath "/data/array_test" into table fdp.array_test;
#select * from fdp.array_test LATERAL VIEW explode(business) rTable AS realid where realid='it' limit 2;
#select * from fdp.array_test LATERAL VIEW explode(business) rTable AS realid where realid='it' limit 2;
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK