Hive on Spark Jobs Failed With Error: Client closed before SASL negotiation finished

asked Aug 28, 2017 in Hadoop by admin (4,410 points)
Symptoms

A Hive on Spark job failed with the below error message; this happens only when the cluster is busy:

17/04/29 09:19:25 ERROR yarn.ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. 
Please check earlier log output for errors. Failing the application.
The job also failed with the below error:
17/04/29 09:19:15 ERROR yarn.ApplicationMaster: User class threw exception: java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: 
Client closed before SASL negotiation finished.
java.util.concurrent.ExecutionException: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37)
at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:156)
at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:556)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
Caused by: javax.security.sasl.SaslException: Client closed before SASL negotiation finished.
at org.apache.hive.spark.client.rpc.Rpc$SaslClientHandler.dispose(Rpc.java:453)
at org.apache.hive.spark.client.rpc.SaslHandler.channelInactive(SaslHandler.java:90)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at org.apache.hive.spark.client.rpc.KryoMessageCodec.channelInactive(KryoMessageCodec.java:127)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194)
at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:828)
at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:621)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
17/04/29 09:19:15 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.util.concurrent.ExecutionException: 
javax.security.sasl.SaslException: Client closed before SASL negotiation finished.)
17/04/29 09:19:25 ERROR yarn

Cause

This is caused by no resources being available for the job to proceed; the job keeps waiting for resources until it times out. It should therefore only happen when the cluster is busy and running out of resources. You can confirm this by checking the YARN ResourceManager UI for applications stuck in the ACCEPTED state while the query is waiting.

Instructions

There are two options.

Option 1:

Increase the below timeouts before executing the problematic query (in the Beeline session). First, check their current values:

SET hive.spark.client.connect.timeout;
SET hive.spark.client.server.connect.timeout;

Then increase their values, for example:

SET hive.spark.client.connect.timeout=30000ms;
SET hive.spark.client.server.connect.timeout=300000ms;
<...then submit HoS query...>

Option 2:

Also consider decreasing the executor count or memory requirements of the Spark application (in the Beeline session):

SET spark.dynamicAllocation.maxExecutors=<number of maximum executors>;

OR:

SET spark.executor.memory=<memory size, e.g. 4g>;
SET spark.driver.memory=<memory size, e.g. 2g>;
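
For example, a session-level sketch combining both approaches, with illustrative values only (these are not recommendations; size them to your workload and cluster):

SET spark.dynamicAllocation.maxExecutors=20;
SET spark.executor.memory=4g;
SET spark.driver.memory=2g;
<...then submit HoS query...>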

For more details, refer to the Hive on Spark tuning documentation.

For Option 1, if you also want to apply the same settings for other queries and users, consider setting them globally:

In Cloudera Manager, go to Hive > Configuration > Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml, and enter the following XML in that field:

<property>
    <name>hive.spark.client.connect.timeout</name>
    <value>30000ms</value>
</property>
<property>
    <name>hive.spark.client.server.connect.timeout</name>
    <value>300000ms</value>
</property>

Restart the Hive service for the changes to take effect.
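
After the restart, you can verify the new values from any Beeline session; running SET with just the property name prints its current value:

SET hive.spark.client.connect.timeout;
SET hive.spark.client.server.connect.timeout;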

