Capturing a Thread Dump from a Live Impala Daemon

asked Aug 19, 2017 in Hadoop by admin (4,410 points)
Summary
To troubleshoot some issues with Impala, it is useful to capture debugging information about the currently active threads in an Impala daemon while it is running. The video linked here walks through that process. A transcript of the video narration is also included, with a couple of important added notes. The video runs for approximately seven minutes. An embedded copy of the video is included at the end as an experiment; it may not display correctly.
Instructions


Transcript: 

In this video, we're going to review how to capture a stack trace and core dump from a currently running Impala daemon. 

First, you'll need to identify the host where the Impala daemon you're interested in is running.  For this demo, I'm going to get information about the impalad on this node listed here in Cloudera Manager. I'll open an SSH connection as the root user. 

Cloudera support will provide the URL for the symbol package you'll need to download. The URL is different for each version of CDH, so make sure you're using the specific URL sent to you. I have the URL for the version I'm using in a text document here.

You'll want to download the package file to an appropriate location. Here I'm going to change to the tmp directory and use wget to download the file. (That's going to take a few minutes, so let's skip ahead.)

OK, that's finishing up and now that we've got the symbol file downloaded, I'm going to create a directory to unpack it into and cd into that new directory. 

Next, I'll use the rpm2cpio command to convert the format of the package from the parent directory, and pipe that to cpio -idmv to extract it to the current directory. (That will take a couple of minutes, so let's skip forward.)
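The download and extraction steps above could be sketched as a small shell helper. This is only a sketch under stated assumptions: the function name unpack_symbols, the variable SYMBOL_URL, and the directory /tmp/impala-symbols are hypothetical names chosen for illustration, and the real package URL is whatever Cloudera support sent you.

```shell
#!/bin/sh
# Sketch only: unpack_symbols is a hypothetical helper, not a Cloudera tool.
# $1 = path to the downloaded symbol package (.rpm)
# $2 = directory to unpack into (created if missing)
# The body runs in a subshell, so the caller's working directory is unchanged.
unpack_symbols() (
    rpm=$1
    dest=$2
    # Make a relative package path absolute before changing directory.
    case $rpm in
        /*) ;;
        *)  rpm=$PWD/$rpm ;;
    esac
    [ -f "$rpm" ] || { echo "no such package: $rpm" >&2; exit 1; }
    mkdir -p "$dest" || exit 1
    cd "$dest" || exit 1
    # Convert the RPM payload to a cpio stream and extract it here.
    rpm2cpio "$rpm" | cpio -idmv
)

# Example usage (SYMBOL_URL is a placeholder for the URL from Cloudera support):
#   cd /tmp && wget "$SYMBOL_URL"
#   unpack_symbols "/tmp/$(basename "$SYMBOL_URL")" /tmp/impala-symbols
```

Checking that the package file exists before creating the target directory keeps a mistyped path from leaving an empty directory behind.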

Now that the archive is extracted, I'll search in this directory for impalad.debug, which is the name of the file we're going to be using.

It turns out there are two impalad.debug files, one each for the debug and release builds. For now, I'll copy both paths and paste them into my text document.  I don't have the debug build enabled in this environment, so we should be using sbin-retail, but we'll confirm that in a moment. 
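The search for the symbol files might look like the following sketch. The function name find_impalad_debug is an assumption made for illustration; the only thing taken from the walkthrough is the file name impalad.debug.

```shell
#!/bin/sh
# Sketch only: find_impalad_debug is a hypothetical helper name.
# Lists every impalad.debug file under directory $1 (default: current
# directory), one path per line.
find_impalad_debug() {
    find "${1:-.}" -type f -name impalad.debug
}

# Example usage, run from the directory the package was extracted into:
#   find_impalad_debug .
```

Each path printed belongs to one build of impalad; note them all and pick the matching one once you know which build is running.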

Before starting up the debugger, I'm going to create a directory to collect its output and cd into that new directory. 

[IMPORTANT: Ensure that the filesystem for the current working directory has enough free space for a core dump file. The size of the file will depend on the Impala daemon's memory usage, which can be hundreds of gigabytes on some systems. If there isn't enough free disk space, use a different location or skip the core dump.]
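One way to check the free space before starting is sketched below. The helper names free_kb, rss_kb, and has_room are hypothetical, and using the daemon's resident memory as an estimate of the core file size is an assumption: the actual file can be larger or smaller, so treat the comparison as a rough guide only.

```shell
#!/bin/sh
# Sketch only: all three helper names are hypothetical.

# Free space, in KiB, on the filesystem holding directory $1
# (default: current directory).
free_kb() {
    df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

# Resident memory, in KiB, of process $1. Used here as a rough
# estimate of the core file size (an assumption, not a guarantee).
rss_kb() {
    ps -o rss= -p "$1" | tr -d ' '
}

# Succeeds when $1 (free KiB) is larger than $2 (needed KiB).
has_room() {
    [ "$1" -gt "$2" ]
}

# Example usage, with the impalad process ID in $pid:
#   if has_room "$(free_kb .)" "$(rss_kb "$pid")"; then
#       echo "probably enough room for a core dump"
#   else
#       echo "use a different directory or skip the core dump"
#   fi
```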

Then I'll run ps ax and grep through the output for impala. You can see that the path to the running impalad executable is the release (sbin-retail) build, just as I thought. That means from that text document, we're going to use the path to the symbols for the release build.

Next I'll grab the process ID for the impalad process. I'll use that to start up the debugger with gdb --pid followed by that process ID number. 
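The check of which build is running could be sketched like this. The helper name build_flavor is hypothetical, the directory name sbin-debug for the debug build is an assumption (only sbin-retail appears in the walkthrough), and reading /proc assumes a Linux host.

```shell
#!/bin/sh
# Sketch only: build_flavor is a hypothetical helper name.
# Maps the path of the running impalad executable to the build it
# belongs to. "sbin-debug" is an assumed name for the debug build.
build_flavor() {
    case $1 in
        */sbin-retail/*) echo release ;;
        */sbin-debug/*)  echo debug ;;
        *)               echo unknown ;;
    esac
}

# Example usage as root on the impalad host:
#   ps ax | grep '[i]mpalad'        # shows the PID and executable path
#   pid=$(pgrep -x impalad)         # assumes a single impalad process
#   build_flavor "$(readlink "/proc/$pid/exe")"
#   gdb --pid "$pid"
```

The bracket in '[i]mpalad' keeps the grep command itself out of the ps output.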

After some startup output, that leaves us at the gdb debugger prompt.

The next step is to start capturing the output from the debugger with set logging on. You can see that it says Copying output to gdb.txt.

Next, I'll enter set height 0 to avoid hitting the return key after every page of output. That's not strictly required, but it makes the process a lot quicker and easier.

Then I'll grab the path to the symbol file from my text document and load that path with the symbol-file command. You'll see it says reading symbols from the symbol file path and then ...done.

Now we're ready to get backtraces for all threads. The command for that is thread apply all backtrace (or bt for short). After hitting enter, you can see all the backtrace info scroll out continuously. 

Once that stops, I'll enter gcore to generate a core dump. [As noted above, this step may fail if there isn't enough free space on the working directory's filesystem.]

(That will take several minutes, so let's skip forward again.)

The debugger will eventually say that it saved the corefile, so now we're ready to quit, which will detach gdb from the impalad process, as it warns.
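For reference, the debugger commands from this session can be collected into a script. The sketch below is not what the video does (the video types each command at the gdb prompt); it is a non-interactive variant that uses gdb's -batch and -x options. The helper name gdb_commands and the file name impalad.gdb are hypothetical. Some newer gdb releases prefer set logging enabled on over set logging on.

```shell
#!/bin/sh
# Sketch only: gdb_commands is a hypothetical helper name.
# Prints the gdb commands from the walkthrough, one per line.
# $1 = path to the impalad.debug file matching the running build
gdb_commands() {
    printf '%s\n' \
        'set logging on' \
        'set height 0' \
        "symbol-file $1" \
        'thread apply all backtrace' \
        'gcore' \
        'quit'
}

# Example usage, with the impalad process ID in $pid:
#   gdb_commands /path/to/impalad.debug > impalad.gdb
#   gdb --pid "$pid" -batch -x impalad.gdb
# To skip the core dump (for example, when disk space is short),
# remove the 'gcore' line from impalad.gdb before running gdb.
```

Keep in mind that impalad is paused for as long as gdb is attached, so a scripted run mainly helps by avoiding idle time at the prompt.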

So now we have two output files, the gdb.txt thread output, and the core file. Upload one or both to the case as instructed by Cloudera support.
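Before uploading, a quick sanity check of the thread output may be worthwhile. The sketch below counts backtrace headers in the log; the helper name count_threads is hypothetical, and it assumes gdb's usual header lines of the form "Thread N (...)" at the start of each thread's backtrace.

```shell
#!/bin/sh
# Sketch only: count_threads is a hypothetical helper name.
# Counts the per-thread header lines in a gdb log file
# ($1, default: gdb.txt in the current directory).
count_threads() {
    grep -c '^Thread [0-9]' "${1:-gdb.txt}"
}

# Example usage in the output directory:
#   count_threads gdb.txt
#   ls -lh gdb.txt core.*    # gcore with no argument normally writes core.<pid>
```

A count of zero suggests that logging was not enabled before the backtrace command ran.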

We've reviewed how to attach to a live Impala daemon process and generate a stack trace and core dump. Thanks for watching, and as always, please let us know if you have any questions.



 
