how to kill hadoop jobs

HadoopKillJobs

Hadoop Problem Overview


I want to kill all my hadoop jobs automatically when my code encounters an unhandled exception. I am wondering what is the best practice to do it?

Thanks

Hadoop Solutions


Solution 1 - Hadoop

Depending on the version, do:

version <2.3.0

Kill a hadoop job:

hadoop job -kill $jobId

You can get a list of all jobId's doing:

hadoop job -list

version >=2.3.0

Kill a hadoop job:

yarn application -kill $ApplicationId

You can get a list of all ApplicationId's doing:

yarn application -list

Solution 2 - Hadoop

Use of folloing command is depreciated

hadoop job -list
hadoop job -kill $jobId

consider using

mapred job -list
mapred job -kill $jobId

Solution 3 - Hadoop

Run list to show all the jobs, then use the jobID/applicationID in the appropriate command.

Kill mapred jobs:

mapred job -list
mapred job -kill <jobId>

Kill yarn jobs:

yarn application -list
yarn application -kill <ApplicationId>

Solution 4 - Hadoop

An unhandled exception will (assuming it's repeatable like bad data as opposed to read errors from a particular data node) eventually fail the job anyway.

You can configure the maximum number of times a particular map or reduce task can fail before the entire job fails through the following properties:

  • mapred.map.max.attempts - The maximum number of attempts per map task. In other words, framework will try to execute a map task these many number of times before giving up on it.
  • mapred.reduce.max.attempts - Same as above, but for reduce tasks

If you want to fail the job out at the first failure, set this value from its default of 4 to 1.

Solution 5 - Hadoop

Simply forcefully kill the process ID, the hadoop job will also be killed automatically . Use this command:

kill -9 <process_id> 

>eg: process ID no: 4040 namenode

username@hostname:~$ kill -9 4040

Solution 6 - Hadoop

Use below command to kill all jobs running on yarn.

For accepted jobs use below command.

for x in $(yarn application -list -appStates ACCEPTED | awk 'NR > 2 { print $1 }'); do yarn application -kill $x; done

For running, jobs use the below command.

for x in $(yarn application -list -appStates RUNNING | awk 'NR > 2 { print $1 }'); do yarn application -kill $x; done

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionFrankView Question on Stackoverflow
Solution 1 - Hadoopdr0iView Answer on Stackoverflow
Solution 2 - HadoopPrabhuView Answer on Stackoverflow
Solution 3 - HadoopAni MenonView Answer on Stackoverflow
Solution 4 - HadoopChris WhiteView Answer on Stackoverflow
Solution 5 - HadoopVenu A PositiveView Answer on Stackoverflow
Solution 6 - HadoopdilshadView Answer on Stackoverflow