hive> desc t1;
OK
id      int
name    string
p_id    int
Time taken: 0.118 seconds, Fetched: 3 row(s)

hive> desc t2;
OK
id      int
name    string
Time taken: 0.051 seconds, Fetched: 2 row(s)

hive> select * from t1;
OK
1       aaa     2
2       bbb     2
3       ccc     3
4       ddd     4
5       fff     3
6       ooo     23
Time taken: 0.418 seconds, Fetched: 6 row(s)

hive> select * from t2;
OK
4       jjj
4       jjj
4       jjj
2       abc
3       hhh
4       jjj
3       ii
2       fuck
7       shit
Time taken: 0.068 seconds, Fetched: 9 row(s)

hive> select * from t1 left outer join t2 on (t1.p_id=t2.id) where t2.name='abc';
OK
1       aaa     2       2       abc
2       bbb     2       2       abc
Time taken: 21.53 seconds, Fetched: 2 row(s)

hive> select * from t1 left outer join t2 on (t1.p_id=t2.id and t2.name='abc');
OK
1       aaa     2       2       abc
2       bbb     2       2       abc
3       ccc     3       NULL    NULL
4       ddd     4       NULL    NULL
5       fff     3       NULL    NULL
6       ooo     23      NULL    NULL
Time taken: 17.676 seconds, Fetched: 6 row(s)

In Hive, to filter the right table's rows in a left outer join, use the second form and put the condition in the ON clause. The first form is the style commonly written for MySQL, but it causes a problem in Hive: the WHERE condition is applied after the join, so it also discards the t1 rows that have no match (their t2 columns are NULL), and only 2 of the 6 t1 rows come back.
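
The difference between the two queries can be reproduced without a Hive cluster. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Hive (an assumption, chosen because it runs anywhere), and it loads a trimmed copy of t2 rather than all nine rows. The table and column names are the ones from the session above; the variable names where_rows and on_rows are made up for this example.

```python
import sqlite3

# SQLite stands in for Hive here (an assumption); the placement of the
# filter in ON versus WHERE is what this sketch is meant to show.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, name TEXT, p_id INTEGER);
    CREATE TABLE t2 (id INTEGER, name TEXT);
    INSERT INTO t1 VALUES (1,'aaa',2),(2,'bbb',2),(3,'ccc',3),
                          (4,'ddd',4),(5,'fff',3),(6,'ooo',23);
    -- trimmed copy of t2: duplicates and some rows are left out
    INSERT INTO t2 VALUES (4,'jjj'),(2,'abc'),(3,'hhh'),(3,'ii');
""")

COLUMNS = "t1.id, t1.name, t1.p_id, t2.id, t2.name"

# Form 1: filter in WHERE. It runs after the join, so the NULL-padded
# rows produced for unmatched t1 rows are thrown away.
where_rows = conn.execute(
    f"SELECT {COLUMNS} FROM t1 LEFT OUTER JOIN t2 ON t1.p_id = t2.id "
    "WHERE t2.name = 'abc' ORDER BY t1.id"
).fetchall()

# Form 2: filter in ON. It only restricts which t2 rows may match,
# so every t1 row is kept.
on_rows = conn.execute(
    f"SELECT {COLUMNS} FROM t1 LEFT OUTER JOIN t2 "
    "ON t1.p_id = t2.id AND t2.name = 'abc' ORDER BY t1.id"
).fetchall()

print(len(where_rows), "rows with the filter in WHERE")
print(len(on_rows), "rows with the filter in ON")
for row in on_rows:
    print(row)
```

With the filter in WHERE, only the two t1 rows that matched 'abc' remain; with the filter in ON, all six t1 rows are returned and the unmatched ones carry None (NULL) in the t2 columns, which matches the Hive output above.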