Recovering HDFS Data After Deleting With the -skipTrash Option

asked Aug 25, 2017 in Hadoop by admin (4,410 points)
Summary
If the -skipTrash option was used to delete files, HDFS begins deleting the blocks for those files immediately, and data will be lost. If the amount of data being deleted is large and you act quickly, it may be possible to recover some of it.
Applies To
Symptoms

If a large amount of data was removed with the -skipTrash option and you move quickly, it may be possible to recover some of the data.

IMPORTANT: Stop the NameNodes or put them into safe mode as quickly as possible to minimize data loss. You may have as little as a few minutes, depending on the size of the data deleted, the size of your cluster, and the speed of your network and drives.

 

Cause

A delete from HDFS was accidentally performed with the -skipTrash flag. Example:

hdfs dfs -rm -r -skipTrash <pathname>

NOTE: This is not a bug; HDFS is working as it should.

Instructions
IMPORTANT: Data will be lost by using -skipTrash. Per the Apache documentation for the rm command, the -skipTrash option bypasses trash, if enabled, and deletes the specified file(s) immediately. This is working as expected.

If a small amount of data was accidentally deleted with -skipTrash, there is no way to recover the data from HDFS.

If a large amount of data was accidentally deleted with -skipTrash, immediately:
  • Stop the HDFS service
    or
  • Put the NameNode into safe mode
    and
  • Contact Cloudera Support
Important:
If HDFS is stopped or put into safe mode long after the delete was issued, the blocks have most likely already been removed and there is no way to recover them.
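Entering safe mode from the command line can be sketched as follows. This is a minimal example, assuming the hdfs client is on the PATH and the caller is the HDFS superuser; the function name enter_safemode is only illustrative.

```shell
# Sketch: put the NameNode into safe mode and then report the resulting state.
# Assumes the hdfs client is configured and the caller is the HDFS superuser.
enter_safemode() {
  # Ask the NameNode to enter safe mode; stop here if the request fails.
  hdfs dfsadmin -safemode enter || return 1
  # Print the current safe mode state so the operator can confirm it.
  hdfs dfsadmin -safemode get
}
```

Call it as enter_safemode. On a cluster with more than one NameNode, check that the reported state covers each of them.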
In safe mode, to confirm whether any blocks can still be recovered:
Log in to CM, go to Charts > Chart Builder, and run: select pending_deletion_blocks
OR
In the NameNode UI, check "Number of Blocks Pending Deletion".
      If this shows 0 blocks/replicas, the blocks have already been removed and nothing can be recovered. There is no point in doing anything further.
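The same counter can also be read without a browser. The sketch below assumes the NameNode JMX servlet is reachable without authentication and that NN_HTTP is a placeholder holding the active NameNode's web address (by default port 9870 on Hadoop 3 and 50070 on Hadoop 2). The function names are illustrative.

```shell
# Sketch: read the pending-deletion block count from the NameNode JMX servlet.
# NN_HTTP is a placeholder, e.g. namenode.example.com:9870 (an assumption).

# Extract the PendingDeletionBlocks value from JMX JSON supplied on stdin.
parse_pending_deletion_blocks() {
  sed -n 's/.*"PendingDeletionBlocks"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Fetch the FSNamesystem bean and print the counter.
pending_deletion_blocks() {
  curl -s "http://${NN_HTTP}/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem" \
    | parse_pending_deletion_blocks
}
```

With NN_HTTP set, run pending_deletion_blocks. A result of 0 means there is nothing left to recover.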

If HDFS snapshots were taken on the removed folder, then that could be an option for recovery.
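If a snapshot does exist, copying the data back out of it might look like the sketch below. The three arguments are placeholders supplied by the operator and the function name is illustrative. Note that HDFS rejects writes while in safe mode, so a copy like this can only run once the NameNode has left safe mode.

```shell
# Sketch: copy a deleted path back out of an HDFS snapshot.
#   $1  snapshottable directory (placeholder, e.g. /data)
#   $2  snapshot name, as listed by: hdfs dfs -ls /data/.snapshot
#   $3  deleted path relative to the snapshottable directory
restore_from_snapshot() {
  # -p preserves timestamps, ownership, and permissions on the restored copy.
  hdfs dfs -cp -p "$1/.snapshot/$2/$3" "$1/$3"
}
```

Example call, with hypothetical names: restore_from_snapshot /data s20170824 logs/2017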

If it has been more than 2 hours since the command with -skipTrash was issued, consider the data completely unrecoverable. The more time has passed since the delete, the lower the chance that any data can be recovered. Even if HDFS is stopped immediately, some data will be lost.

When files are deleted with -skipTrash, the blocks associated with the files are queued for deletion from the DataNodes. For a small amount of data, this queue is processed quickly and the blocks are deleted. For a large amount of data, the queue is processed gradually to avoid overloading the DataNodes' I/O with delete requests, so it may take a while. How long it takes depends on the size of the cluster and the number of blocks (the size of the data) being deleted. Large files with many blocks are likely to end up corrupt, because blocks are queued for deletion independently of the files they belong to, so a file can lose some of its blocks while others survive.
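Once any recovery has been attempted, files that lost some of their blocks can be listed with fsck. This is a sketch only, assuming the hdfs client is configured; the function name is illustrative and the path argument is a placeholder.

```shell
# Sketch: list files under a path that have corrupt or missing blocks.
#   $1  HDFS path to check (placeholder); defaults to / when omitted
list_corrupt_files() {
  hdfs fsck "${1:-/}" -list-corruptfileblocks
}
```

Example call, with a hypothetical path: list_corrupt_files /data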
