Author: 哗锅_348 | Source: Internet | 2023-09-12 18:50
1. Creating a Hive external table over data on HDFS
Querying the table fails with the following error:
hive (solar)> select * from solar.ori_mysql_sqoop_open_third_party_user_da limit 10;
OK
Failed with exception java.io.IOException:java.io.IOException: Not a file: hdfs://f04/sqoop/open/third_party_user/dt=2016-12-12
Time taken: 0.043 seconds
Now look at the table definition:
hive (solar)> show create table solar.ori_mysql_sqoop_open_third_party_user_da;
OK
CREATE EXTERNAL TABLE `solar.ori_mysql_sqoop_open_third_party_user_da`(
`id` string COMMENT 'from deserializer',
`md5` string COMMENT 'from deserializer',
`appid` string COMMENT 'from deserializer',
`createdtime` string COMMENT 'from deserializer')
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
'escapeChar'='\\',
'quoteChar'='\'',
'separatorChar'=',')
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://f04/sqoop/open/third_party_user'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='false',
'last_modified_by'='maintain',
'last_modified_time'='1481608526',
'numFiles'='0',
'numRows'='-1',
'rawDataSize'='-1',
'totalSize'='0',
'transient_lastDdlTime'='1481608526')
Time taken: 0.024 seconds, Fetched: 26 row(s)
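The table is defined without partitions, but its location on HDFS contains a partition-style directory (dt=2016-12-12) next to the data files: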
hive (solar)> dfs -ls hdfs://f04/sqoop/open/third_party_user;
Found 4 items
-rw-r--r-- 3 maintain supergroup 0 2016-12-13 05:00 hdfs://f04/sqoop/open/third_party_user/_SUCCESS
drwxr-xr-x - maintain supergroup 0 2016-12-13 11:39 hdfs://f04/sqoop/open/third_party_user/dt=2016-12-12
-rw-r--r-- 3 maintain supergroup 194 2016-12-13 05:00 hdfs://f04/sqoop/open/third_party_user/part-m-00000
-rw-r--r-- 3 maintain supergroup 350 2016-12-13 05:00 hdfs://f04/sqoop/open/third_party_user/part-m-00001
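Because the table is not partitioned, Hive expects only files under its location, and the dt=2016-12-12 subdirectory triggers the "Not a file" error. The fix is to delete that partition directory (make sure its contents are not needed first), after which the query reads the data:
hive (solar)> dfs -rm -r hdfs://f04/sqoop/open/third_party_user/dt=2016-12-12;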
hive (solar)> select * from solar.ori_mysql_sqoop_open_third_party_user_da limit 10;
OK
2	5086043868858874977	1	1481011995823
4	-7240682656551536811	1	1481011997002
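If the subdirectory has to stay in place, a commonly cited alternative is to let the input format descend into subdirectories instead of deleting anything. The sketch below is untested against this cluster, and whether it takes effect depends on the Hive and Hadoop versions in use. Note that with these settings the files under dt=2016-12-12 would be read as ordinary rows of the unpartitioned table.

```sql
-- Assumption: session-level settings; behavior varies by Hive/Hadoop version.
SET hive.mapred.supports.subdirectories=true;
SET mapred.input.dir.recursive=true;
-- On newer Hadoop releases the second property is named
-- mapreduce.input.fileinputformat.input.dir.recursive.

-- Re-run the query that previously failed.
SELECT * FROM solar.ori_mysql_sqoop_open_third_party_user_da LIMIT 10;
```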
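2. A partitioned external table in Hive returns no data
This is usually because the partition has not been added to the table's metadata. Add the partition, then query the data again.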
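As a minimal sketch of adding the partition: the table name solar.example_partitioned_da, its columns, the field delimiter, and the path hdfs://f04/path/to/example_partitioned are all hypothetical placeholders, not taken from the original setup. Creating an external table only records its schema and location; each dt=... directory has to be registered before Hive will read it.

```sql
-- Hypothetical partitioned external table; names, delimiter, and path are illustrative.
CREATE EXTERNAL TABLE IF NOT EXISTS solar.example_partitioned_da (
  id          STRING,
  md5         STRING,
  appid       STRING,
  createdtime STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'hdfs://f04/path/to/example_partitioned';

-- Option 1: register a single partition explicitly.
ALTER TABLE solar.example_partitioned_da
  ADD IF NOT EXISTS PARTITION (dt='2016-12-12')
  LOCATION 'hdfs://f04/path/to/example_partitioned/dt=2016-12-12';

-- Option 2: let Hive discover every dt=... directory under the table location.
MSCK REPAIR TABLE solar.example_partitioned_da;

-- Confirm the partition is registered, then query it.
SHOW PARTITIONS solar.example_partitioned_da;
SELECT * FROM solar.example_partitioned_da WHERE dt='2016-12-12' LIMIT 10;
```

Option 1 suits a scheduled job that loads one day at a time; Option 2 is convenient when many partition directories already exist.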