Parquet is a columnar file format that buffers data in memory and writes it out in batches. Each open Parquet file has a pre-allocated buffer that is flushed to HDFS only when it fills up. By default, Hive allocates a 128MB buffer for each open Parquet file; the size is controlled by the configuration setting 'parquet.block.size'. For best performance, the Parquet buffer size should be aligned to the HDFS block size, so that each file fits within a single HDFS block and each I/O request can read an entire data file without having to reach across the network for subsequent blocks.
Inserting data into a partitioned table can therefore be a memory-intensive operation, because each data file, in each partition, requires a memory buffer to hold the data before it is written. Such inserts can also exceed HDFS limits on simultaneous open files, because each Mapper/Reducer could potentially write to a separate data file for each partition, all at the same time.
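As a sketch of the kind of statement this describes, consider a dynamic-partition insert (the sales and staging_sales tables and their columns below are hypothetical): every distinct (year, month) value a task encounters opens a new Parquet file, and therefore a new buffer.

-- hypothetical dynamic-partition insert; each (year, month) value seen by a task
-- opens a separate Parquet file with its own write buffer
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE sales PARTITION (year, month)
SELECT id, amount, year, month
FROM staging_sales;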
Each of those open files holds a full write buffer, so check that parquet.block.size matches the HDFS block size of the destination table. For example, to set the Parquet buffer size to 256MB before running the insert:

-- set Parquet buffer size to 256MB (in bytes)
SET parquet.block.size=268435456;
Often, the INSERT ... SELECT statement is translated into a Mapper-only job when there are no joins or aggregations: each Mapper simply reads records and writes them to the destination Parquet files. In this scenario, a Mapper must open a new Parquet file (and buffer) for each partition it discovers in the data it reads. A Mapper that encounters many partitions therefore needs many Parquet files open at the same time, which often causes it to crash with an OutOfMemoryError.
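One common way to reduce the number of files each task holds open is to force a reduce stage and route all rows for a given partition to the same Reducer, for example with a DISTRIBUTE BY clause on the partition columns. The rewrite below reuses the hypothetical tables from the earlier example and is only a sketch of this approach:

-- route all rows for a given (year, month) to the same Reducer, so each task
-- writes far fewer Parquet files (and holds far fewer buffers) at once
INSERT OVERWRITE TABLE sales PARTITION (year, month)
SELECT id, amount, year, month
FROM staging_sales
DISTRIBUTE BY year, month;

The trade-off is that the job now includes a shuffle, but each Reducer only opens Parquet files for the partitions it actually receives.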