Extensive logging in Spark driver with Dynamic allocation turned on

asked Aug 30, 2017 in Hadoop by admin (4,410 points)
Summary

Dynamic allocation allows Spark to dynamically scale the cluster resources allocated to your application based on the workload. When dynamic allocation is enabled and a Spark application has a backlog of pending tasks, it can request additional executors. When the application becomes idle, its executors are released and can be acquired by other applications.
Applies To
  • Spark on YARN
  • Dynamic Allocation
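For reference, dynamic allocation on YARN is typically turned on with a configuration along these lines (the external shuffle service is required so executors can be removed without losing shuffle data); the values shown are illustrative, not recommendations:

```properties
# spark-defaults.conf (illustrative values)
spark.dynamicAllocation.enabled          true
spark.shuffle.service.enabled            true
spark.dynamicAllocation.minExecutors     0
spark.dynamicAllocation.initialExecutors 1
```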
Symptoms

The following messages appear in the driver's and the YARN ApplicationMaster's container logs.

17/02/23 09:59:29 INFO spark.ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 1)
17/02/23 09:59:29 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
17/02/23 09:59:30 INFO spark.ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 3)
17/02/23 09:59:30 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
17/02/23 09:59:30 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
(... other iterations)

17/02/23 09:59:50 INFO spark.ExecutorAllocationManager: Requesting 64 new executors because tasks are backlogged (new desired total will be 127)
17/02/23 09:59:50 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)
(... other 62 YarnAllocator container requests)
17/02/23 09:59:50 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:1408, vCores:1>)

If a large enough job runs in a cluster with limited resources, Spark will request thousands of executor containers, and every one of these requests will be logged.
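The request counts in the log lines above ramp up exponentially: while tasks stay backlogged, each round the allocation manager asks for double the number of executors it asked for in the previous round (1, 2, 4, ... 64). A minimal sketch of that pattern (a model for illustration, not Spark's actual code):

```python
def ramp_up(rounds):
    """Model the ExecutorAllocationManager's exponential ramp-up:
    each round requests twice as many new executors as the last,
    so the desired totals grow as 1, 3, 7, 15, ...
    """
    totals = []
    to_add, total = 1, 0
    for _ in range(rounds):
        total += to_add      # "new desired total" printed in the log
        totals.append(total)
        to_add *= 2          # next round requests double
    return totals

print(ramp_up(7))  # → [1, 3, 7, 15, 31, 63, 127], matching the log excerpt
```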

Cause

With dynamic allocation enabled, the ExecutorAllocationManager keeps requesting executors while tasks remain backlogged, doubling the request size each round, and every container request is logged at INFO level by both the ExecutorAllocationManager and the YarnAllocator. For a large job this produces thousands of such log lines.

Instructions

The recommended workaround is to cap the number of executors an application can request with the spark.dynamicAllocation.maxExecutors property. A typical value is the maximum number of executor containers the cluster can host, which can be calculated with the following formula:

( Number of NodeManager hosts * yarn.nodemanager.resource.cpu-vcores / spark.executor.cores )

where yarn.nodemanager.resource.cpu-vcores is the number of vcores that can be allocated for containers on a NodeManager host, and spark.executor.cores is the number of cores to use on each executor.

Example calculation:
Cluster setup:

  • Cluster has 12 worker hosts with NodeManager services
  • Worker hosts have 2 x 8-core CPU with hyperthreading, 24 vcores are configured for the NodeManagers (yarn.nodemanager.resource.cpu-vcores=24)
  • The spark.executor.cores is not overridden, so the default configuration is used (spark.executor.cores=1)

With the described setup, the maximum number of executors can be set to 12 * 24 / 1 = 288.
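The calculation above can be wrapped in a small helper; `max_executors` is a hypothetical name, the formula itself is the one from this article:

```python
def max_executors(nodemanager_hosts, vcores_per_host, executor_cores=1):
    """Upper bound on executor containers a YARN cluster can host:
    (NodeManager hosts * yarn.nodemanager.resource.cpu-vcores)
        / spark.executor.cores
    """
    return (nodemanager_hosts * vcores_per_host) // executor_cores

# Example cluster: 12 NodeManagers, 24 vcores each, 1 core per executor
print(max_executors(12, 24))  # → 288
```

The result can then be applied with --conf spark.dynamicAllocation.maxExecutors=288 on spark-submit, or in spark-defaults.conf.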
