Hive | MSCK Repair table fails with "Unexpected component partitionvalue"

asked Aug 30, 2017 in Hadoop by admin (4,410 points)
Summary

Symptoms

The Hive command "MSCK REPAIR TABLE" fails with:

10781 [main] WARN hive.ql.exec.DDLTask - Failed to run metacheck:
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Unexpected component partitionvalue)
at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1729)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:373)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
...
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1714)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: MetaException(message:Unexpected component partitionvalue)
at org.apache.hadoop.hive.metastore.Warehouse.makeValsFromName(Warehouse.java:411)
at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1727)
... 34 more
10794 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Applies To
Cause

If you list the contents of the table's HDFS directory, you may see that the partition directories are named by value only.
MSCK expects each partition directory to be named in the format "partitioncolumn=value". For example, for a table partitioned by year and month, the directory
/data/table1/2017/03
should instead be
/data/table1/year=2017/month=03

The expected format was the same in some older versions; the difference is that the MSCK command did not fail there but silently did nothing.
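
To make the expected layout concrete, here is a minimal Python sketch that maps a value-only partition path to the "partitioncolumn=value" form. The function name to_hive_layout is hypothetical, and the sketch assumes the partition columns are known and listed in directory order. It only computes the target path; the actual rename on HDFS is not shown.

```python
import posixpath


def to_hive_layout(table_root, partition_columns, path):
    """Map a value-only partition path to the column=value layout MSCK expects."""
    # Partition directory relative to the table root, e.g. "2017/03".
    relative = posixpath.relpath(path, table_root)
    values = relative.split("/")
    if len(values) != len(partition_columns):
        raise ValueError(
            f"expected {len(partition_columns)} partition levels, "
            f"got {len(values)}: {path}"
        )
    # Pair each directory level with its partition column name.
    parts = [f"{col}={val}" for col, val in zip(partition_columns, values)]
    return posixpath.join(table_root, *parts)


print(to_hive_layout("/data/table1", ["year", "month"], "/data/table1/2017/03"))
# prints /data/table1/year=2017/month=03
```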
 

Instructions

The solution is to either:
1. Rename the directories to conform to the "partitioncolumn=value" format
or
2. Create the partitions manually by running "ALTER TABLE ... ADD PARTITION" commands.
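
For option 2, the statements can be generated rather than typed by hand. The following Python sketch builds one "ALTER TABLE ... ADD PARTITION" statement that points at an existing value-only directory. The helper name add_partition_statement and the table name table1 are assumptions for illustration. Values are quoted as strings and not escaped, so this suits simple values such as the year and month in the example only.

```python
def add_partition_statement(table, partition_columns, values, location):
    """Build a Hive ALTER TABLE ... ADD PARTITION statement for one directory."""
    if len(partition_columns) != len(values):
        raise ValueError("each partition column needs exactly one value")
    # Partition spec such as: year='2017', month='03'
    spec = ", ".join(
        f"{col}='{val}'" for col, val in zip(partition_columns, values)
    )
    # LOCATION lets the partition use the existing directory as-is.
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}';"
    )


print(add_partition_statement(
    "table1", ["year", "month"], ["2017", "03"], "/data/table1/2017/03"
))
```

With an explicit LOCATION, the directory does not need to be renamed, because the partition is registered in the metastore directly instead of being discovered by MSCK.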

Using MSCK as part of a data load, in place of "ALTER TABLE ... ADD PARTITION", is not advised.
MSCK is not designed for this usage and does not scale well, because it has to crawl through the HDFS directories. When loading a new partition, use "ALTER TABLE ... ADD PARTITION" instead.
