java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. Spark in Eclipse on Windows 7

Tags: Eclipse, Scala, Apache Spark

Eclipse Problem Overview


I'm not able to run a simple Spark job in Scala IDE (a Maven Spark project) installed on Windows 7.

The Spark core dependency has been added.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
val sc = new SparkContext(conf)
val logData = sc.textFile("File.txt")
logData.count()

Error:

16/02/26 18:29:33 INFO SparkContext: Created broadcast 0 from textFile at FrameDemo.scala:13
16/02/26 18:29:34 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
	at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
	at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
	at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
	at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
	at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
	at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
	at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
	at scala.Option.map(Option.scala:145)
	at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
	at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
	at com.org.SparkDF.FrameDemo$.main(FrameDemo.scala:14)
	at com.org.SparkDF.FrameDemo.main(FrameDemo.scala)

Eclipse Solutions


Solution 1 - Eclipse

Here is a good explanation of your problem, together with the solution.

  1. Download the winutils.exe binary that matches your Hadoop version from https://github.com/steveloughran/winutils.

  2. Set up your HADOOP_HOME environment variable at the OS level, or set it programmatically (a full sketch follows these steps):

    System.setProperty("hadoop.home.dir", "full path to the folder with winutils");

  3. Enjoy
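
Putting these steps together as a minimal, self-contained Scala sketch (the C:\winutils path is a placeholder for your own winutils folder):

    import org.apache.spark.{SparkConf, SparkContext}

    object DemoDF extends App {
      // Placeholder path: the folder that CONTAINS bin\winutils.exe, not bin itself.
      // Must run before the first Hadoop class is loaded.
      System.setProperty("hadoop.home.dir", "C:\\winutils")

      val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
      val sc = new SparkContext(conf)
      println(sc.textFile("File.txt").count())
      sc.stop()
    }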

Solution 2 - Eclipse

  1. Download winutils.exe.
  2. Create a folder, say C:\winutils\bin.
  3. Copy winutils.exe into C:\winutils\bin.
  4. Set the environment variable HADOOP_HOME to C:\winutils (a quick sanity check follows these steps).
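
A quick way to sanity-check this setup from Scala, sketched under the assumption of the C:\winutils layout above:

    import java.nio.file.{Files, Paths}

    object CheckWinutils extends App {
      // HADOOP_HOME must point at the folder containing bin, not at bin itself.
      sys.env.get("HADOOP_HOME") match {
        case Some(home) =>
          val exe = Paths.get(home, "bin", "winutils.exe")
          println(s"HADOOP_HOME=$home, winutils.exe found: ${Files.exists(exe)}")
        case None =>
          println("HADOOP_HOME is not set for this process (restart Eclipse after setting it)")
      }
    }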

Solution 3 - Eclipse

Follow this:

  1. Create a bin folder in any directory (to be used in step 3).

  2. Download winutils.exe and place it in the bin directory.

  3. Now add System.setProperty("hadoop.home.dir", "PATH/TO/THE/DIR"); in your code, where PATH/TO/THE/DIR is the directory that contains the bin folder (not the bin folder itself).

Solution 4 - Eclipse

1) Download winutils.exe from https://github.com/steveloughran/winutils.
2) Create a directory in Windows: C:\winutils\bin.
3) Copy winutils.exe into the above bin folder.
4) Set the property in code (a plain filesystem path, not a file:/// URI):
  System.setProperty("hadoop.home.dir", "C:\\winutils");
5) Create a folder C:\temp and give it 777 permissions (e.g. winutils.exe chmod 777 C:\temp).
6) Add the config property to the Spark session: .config("spark.sql.warehouse.dir", "file:///C:/temp") (a full sketch follows these steps).
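
Here is how steps 4 and 6 fit together when building the session, as a sketch for Spark 2.x using the placeholder paths from the steps above:

    import org.apache.spark.sql.SparkSession

    object WarehouseDemo extends App {
      // Plain filesystem path (no file:/// prefix), set before any Hadoop class loads.
      System.setProperty("hadoop.home.dir", "C:\\winutils")

      val spark = SparkSession.builder()
        .appName("DemoDF")
        .master("local")
        .config("spark.sql.warehouse.dir", "file:///C:/temp") // the folder from step 5
        .getOrCreate()

      spark.stop()
    }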

Solution 5 - Eclipse

You can alternatively download winutils.exe from GITHub:

https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin

Replace hadoop-2.7.1 with the version you want and place the file in D:\hadoop\bin.

> If you do not have access rights to the environment variable settings on your machine, simply add the below line to your code:

System.setProperty("hadoop.home.dir", "D:\\hadoop");

Solution 6 - Eclipse

On Windows 10 you need to add two different entries:

(1) Add a new variable HADOOP_HOME with the value set to your Hadoop folder (i.e. C:\Hadoop) under System variables.

(2) Add/append a new entry C:\Hadoop\bin to the "Path" variable.

The above worked for me.

Solution 7 - Eclipse

If you see the below issue:

> ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
>
> java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

then do the following steps:

  1. Download winutils.exe from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe.
  2. Keep it under the bin folder of any folder you created, e.g. C:\Hadoop\bin.
  3. In the program, add the following line before creating the SparkContext or SparkConf: System.setProperty("hadoop.home.dir", "C:\\Hadoop");

Solution 8 - Eclipse

I got the same problem while running unit tests. The following workaround allowed me to get rid of the message:

    import java.io.File;

    // Point hadoop.home.dir at the working directory and fake an (empty) winutils.exe there.
    File workaround = new File(".");
    System.getProperties().put("hadoop.home.dir", workaround.getAbsolutePath());
    new File("./bin").mkdirs();
    new File("./bin/winutils.exe").createNewFile(); // throws IOException; declare or catch it

from: https://issues.cloudera.org/browse/DISTRO-544
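
The same trick ported to Scala, e.g. called from a test suite's setup (a sketch; the empty stub only satisfies Hadoop's file-existence check and won't support operations that actually execute winutils):

    import java.io.File

    object WinutilsStub {
      def install(): Unit = {
        // Hadoop resolves %hadoop.home.dir%\bin\winutils.exe; an empty file passes the check.
        System.setProperty("hadoop.home.dir", new File(".").getAbsolutePath)
        new File("./bin").mkdirs()
        new File("./bin/winutils.exe").createNewFile()
      }
    }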

Solution 9 - Eclipse

Setting the HADOOP_HOME environment variable in system properties didn't work for me. But this did:

  • Set HADOOP_HOME in the Environment tab of the Eclipse run configuration.

  • Follow the 'Windows Environment Setup' from here.

Solution 10 - Eclipse

  • Download winutils.exe and hadoop.dll to your Windows machine.
  • Create the folder C:\hadoop\bin.
  • Copy winutils.exe and hadoop.dll into the newly created C:\hadoop\bin folder.
  • Set the environment variable HADOOP_HOME=C:\hadoop (see the note on hadoop.dll below).
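
One wrinkle with hadoop.dll: the JVM resolves it through java.library.path (which on Windows includes PATH), so C:\hadoop\bin should also be added to PATH; HADOOP_HOME alone does not make the native library visible. A small Scala check, as a sketch:

    object CheckHadoopDll extends App {
      try {
        // "hadoop" maps to hadoop.dll on Windows; resolved via java.library.path / PATH.
        System.loadLibrary("hadoop")
        println("hadoop.dll loaded")
      } catch {
        case e: UnsatisfiedLinkError => println(s"hadoop.dll not found: ${e.getMessage}")
      }
    }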

Solution 11 - Eclipse

On top of setting your HADOOP_HOME environment variable to C:\winutils on Windows, you also need to make sure you are an administrator of the machine. If you are not, and adding environment variables prompts you for admin credentials (even under User variables), then these variables will only take effect once you start your command prompt as administrator.

Solution 12 - Eclipse

I have also faced a similar problem, with the following details: Java 1.8.0_121, Spark spark-1.6.1-bin-hadoop2.6, Windows 10, and Eclipse Oxygen. When I ran my WordCount.java in Eclipse using HADOOP_HOME as a system variable, as mentioned in the previous post, it did not work. What worked for me is:

System.setProperty("hadoop.home.dir", "PATH/TO/THE/DIR");

where PATH/TO/THE/DIR contains bin\winutils.exe. This works whether you run within Eclipse as a Java application or via spark-submit from cmd using:

spark-submit --class groupid.artifactid.classname --master local[2] /path/to/the/jar/created/using/maven /path/to/a/demo/test/file /path/to/the/output/directory

Example: go to the bin location of your Spark installation and execute spark-submit as shown:

D:\BigData\spark-2.3.0-bin-hadoop2.7\bin>spark-submit --class com.bigdata.abdus.sparkdemo.WordCount --master local[1] D:\BigData\spark-quickstart\target\spark-quickstart-0.0.1-SNAPSHOT.jar D:\BigData\spark-quickstart\wordcount.txt

Solution 13 - Eclipse

That's a tricky one... Your drive letter must be capital. For example "C:\...", not "c:\...".

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Elvish_Blade | View Question on Stackoverflow
Solution 1 - Eclipse | Taky | View Answer on Stackoverflow
Solution 2 - Eclipse | Deokant Gupta | View Answer on Stackoverflow
Solution 3 - Eclipse | Ani Menon | View Answer on Stackoverflow
Solution 4 - Eclipse | Sampat Kumar | View Answer on Stackoverflow
Solution 5 - Eclipse | Saurabh | View Answer on Stackoverflow
Solution 6 - Eclipse | user1023627 | View Answer on Stackoverflow
Solution 7 - Eclipse | Prem S | View Answer on Stackoverflow
Solution 8 - Eclipse | Joabe Lucena | View Answer on Stackoverflow
Solution 9 - Eclipse | Ramya | View Answer on Stackoverflow
Solution 10 - Eclipse | Swapnil | View Answer on Stackoverflow
Solution 11 - Eclipse | Abhishek Sakhuja | View Answer on Stackoverflow
Solution 12 - Eclipse | Abdus Mondal | View Answer on Stackoverflow
Solution 13 - Eclipse | Achilles | View Answer on Stackoverflow