Creating HDFS Current and Historical Disk Usage Reports | fsimage | Lucene

asked Oct 22, 2017 in Hadoop by anonymous
edited Nov 6, 2017 by admin

1 Answer

answered Nov 6, 2017 by admin (4,410 points)

Summary

  • Cloudera Manager
  • HDFS Disk Usage Reports
  • RMAN Database

Question:

How are the HDFS Current and Historical Disk Usage reports created?

Answer:

  • The reports are derived from data extracted from the NameNode's fsimage, as follows:
  • The Reports Manager periodically fetches the fsimage from the NameNode. To find the frequency in Cloudera Manager:
      • Navigate to Management Services > Configuration
      • Enter "update frequency" in the Search field
      • View the value of Reports Manager Update Frequency (the default is one hour, and a lower setting is not recommended)
  • Each time the fsimage is fetched, the following happens:
      • A new Lucene index is created from the data in the fsimage
      • The results from the Lucene index are inserted into the Cloudera Manager RMAN database
      • A new Current Disk Usage report is generated
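
To illustrate the kind of roll-up a disk usage report involves, here is a minimal Python sketch that sums file sizes per directory. It is not the Reports Manager implementation: the record layout (path, size, replication), the `RECORDS` data, and the `usage_by_directory` helper are all hypothetical and only stand in for data extracted from an fsimage.

```python
from collections import defaultdict

# Hypothetical file records, standing in for data extracted from an fsimage:
# (path, file size in bytes, replication factor). Not the real fsimage schema.
RECORDS = [
    ("/user/alice/a.csv", 100, 3),
    ("/user/alice/b.csv", 50, 3),
    ("/user/bob/c.parquet", 200, 2),
    ("/tmp/scratch.dat", 10, 1),
]


def usage_by_directory(records, depth=2):
    """Sum file count, logical bytes, and raw (replicated) bytes per directory.

    Directories are truncated to at most `depth` path components.
    """
    totals = defaultdict(lambda: {"files": 0, "bytes": 0, "raw_bytes": 0})
    for path, size, replication in records:
        parts = path.strip("/").split("/")
        # Keep directory components only (drop the file name), capped at `depth`.
        prefix = "/" + "/".join(parts[:-1][:depth])
        entry = totals[prefix]
        entry["files"] += 1
        entry["bytes"] += size
        entry["raw_bytes"] += size * replication
    return dict(totals)


report = usage_by_directory(RECORDS)
for directory, stats in sorted(report.items()):
    print(directory, stats)
```

The sketch tracks raw bytes separately from logical bytes because replicated blocks consume more physical disk than the file size alone suggests.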

Notes:

  • The Current Disk Usage reports are derived from the most recent sample in the RMAN database
  • The Historical Disk Usage reports are derived from all samples in the RMAN database and are dynamically aggregated by hour, day, week, month, or year
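
The difference between the two report types can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual RMAN schema or query logic: the `SAMPLES` data, the helper names, and the choice to average samples within each bucket are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical samples: (time the fsimage was fetched, bytes used).
SAMPLES = [
    (datetime(2017, 11, 5, 22, 0), 1000),
    (datetime(2017, 11, 5, 23, 0), 1200),
    (datetime(2017, 11, 6, 0, 0), 1100),
    (datetime(2017, 11, 6, 1, 0), 1500),
]


def bucket_key(ts, period):
    """Map a timestamp to a label for the requested aggregation period."""
    if period == "hour":
        return ts.strftime("%Y-%m-%d %H:00")
    if period == "day":
        return ts.strftime("%Y-%m-%d")
    if period == "week":
        iso = ts.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if period == "month":
        return ts.strftime("%Y-%m")
    if period == "year":
        return ts.strftime("%Y")
    raise ValueError(f"unknown period: {period}")


def current_usage(samples):
    """Current report: use only the most recent sample."""
    return max(samples, key=lambda s: s[0])[1]


def historical_usage(samples, period):
    """Historical report: group all samples by period.

    Averaging within each bucket is an assumption made for this sketch.
    """
    buckets = defaultdict(list)
    for ts, used in samples:
        buckets[bucket_key(ts, period)].append(used)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


print(current_usage(SAMPLES))
print(historical_usage(SAMPLES, "day"))
```

Because the aggregation happens at query time over stored samples, the same data can be viewed at any of the five granularities without being stored once per period.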
...