In Java 8, why were Arrays not given the forEach method of Iterable?


Java Problem Overview


I must be missing something here.

In Java 5, the "for-each loop" statement (also called the enhanced for loop) was introduced. It appears to have been introduced mainly for iterating through collections. Any collection (or container) class that implements the Iterable interface is eligible for iteration using the for-each loop. Perhaps for historical reasons, Java arrays did not implement the Iterable interface. But since arrays were (and are) ubiquitous, javac accepts the for-each loop on arrays, generating bytecode equivalent to a traditional for loop.

In Java 8, the forEach method was added to the Iterable interface as a default method. This made passing lambda expressions to collections (while iterating) possible (e.g. list.forEach(System.out::println)). But again, arrays don't enjoy this treatment. (I understand that there are workarounds).

Are there technical reasons why javac couldn't be enhanced to accept arrays in forEach, just like it accepts them in the enhanced for loop? It appears that code generation would be possible without requiring that arrays implement Iterable. Am I being naive?

This is especially important for a newcomer to the language who rather naturally uses arrays because of their syntactical ease. It's hardly natural to switch to Lists and use Arrays.asList(1, 2, 3).
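For illustration, here is a minimal sketch of the asymmetry described above. The class name ForEachOnArrayDemo and the helper collectViaAsList are made up for this example; only Arrays.asList and Iterable.forEach come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForEachOnArrayDemo {
    // Hypothetical helper: reaches Iterable.forEach by first viewing the array as a List.
    static List<String> collectViaAsList(String[] names) {
        List<String> seen = new ArrayList<>();
        Arrays.asList(names).forEach(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "c"};
        List<String> list = Arrays.asList(names);

        list.forEach(System.out::println);      // compiles: List is an Iterable
        // names.forEach(System.out::println);  // does not compile: arrays have no forEach method

        for (String s : names) {                // the enhanced for loop accepts the array directly
            System.out.println(s);
        }
        System.out.println(collectViaAsList(names));
    }
}
```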

Java Solutions


Solution 1 - Java

There are a bunch of special cases in the Java language and in the JVM for arrays. Arrays have an API, but it's barely visible. It is as if arrays are declared to have:

  • implements Cloneable, Serializable
  • public final int length
  • public T[] clone() where T is the array's component type

However, these declarations aren't visible in any source code anywhere. See JLS 4.10.3 and JLS 10.7 for explanations. Cloneable and Serializable are visible via reflection, and are returned by a call to

Object[].class.getInterfaces()

Perhaps surprisingly, the length field and the clone() method aren't visible reflectively. The length field isn't a field at all; using it turns into a special arraylength bytecode. A call to clone() results in an actual virtual method call, but if the receiver is an array type, this is handled specially by the JVM.
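The reflective behavior described above can be checked with a small program. This is only a sketch; the class name ArrayReflectionDemo and the helper hasPublicMethod are invented for the example.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

class ArrayReflectionDemo {
    // Hypothetical helper: true if the class reports a public method with this name.
    static boolean hasPublicMethod(Class<?> type, String name) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Class<?> arrayClass = Object[].class;

        // The two interfaces are reported reflectively.
        System.out.println(Arrays.toString(arrayClass.getInterfaces()));

        // No "length" field is reported: getFields() returns an empty array for array types.
        System.out.println(arrayClass.getFields().length);

        // clone() is not reported, although methods inherited from Object (such as hashCode) are.
        System.out.println(hasPublicMethod(arrayClass, "clone"));
        System.out.println(hasPublicMethod(arrayClass, "hashCode"));
    }
}
```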

Notably, though, array classes do not implement the Iterable interface.

When the enhanced-for loop ("for-each") was added in Java SE 5, it supported two different cases for the right-hand-side expression: an Iterable or an array type (JLS 14.14.2). The reason is that Iterable instances and arrays are handled completely differently by the enhanced-for statement. That section of the JLS gives the full treatment, but put more simply, the situation is as follows.

For an Iterable<T> iterable, the code

for (T t : iterable) {
    <loop body>
}

is syntactic sugar for

for (Iterator<T> iterator = iterable.iterator(); iterator.hasNext(); ) {
    T t = iterator.next();
    <loop body>
}

For an array T[], the code

for (T t : array) {
    <loop body>
}

is syntactic sugar for

int len = array.length;
for (int i = 0; i < len; i++) {
    T t = array[i];
    <loop body>
}
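To make the two expansions concrete, here is a runnable sketch that writes both out by hand for String elements. The class and method names are invented for the example, and the real compiler uses synthetic variable names rather than it, len, and i.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class EnhancedForExpansionDemo {
    // Hand-written form of: for (String s : iterable) { sb.append(s); }
    static String joinIterable(Iterable<String> iterable) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<String> it = iterable.iterator(); it.hasNext(); ) {
            String s = it.next();
            sb.append(s);
        }
        return sb.toString();
    }

    // Hand-written form of: for (String s : array) { sb.append(s); }
    static String joinArray(String[] array) {
        StringBuilder sb = new StringBuilder();
        int len = array.length;
        for (int i = 0; i < len; i++) {
            String s = array[i];
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] array = {"x", "y", "z"};
        List<String> list = Arrays.asList(array);
        System.out.println(joinIterable(list)); // prints "xyz"
        System.out.println(joinArray(array));   // prints "xyz"
    }
}
```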

Now, why was it done this way? It would certainly be possible for arrays to implement Iterable, since they implement other interfaces already. It would also be possible for the compiler to synthesize an Iterator implementation that's backed by an array. (There is precedent for this. The compiler already synthesizes the static values() and valueOf() methods that are automatically added to every enum class, as described in JLS 8.9.3.)

But arrays are a very low-level construct, and accessing an array element by an int index is expected to be an extremely inexpensive operation. It's quite idiomatic to run a loop index from 0 to an array's length, incrementing by one each time. The enhanced-for loop on an array does exactly that. If the enhanced-for loop over an array were implemented using the Iterable protocol, I think most people would be unpleasantly surprised to discover that looping over an array involved an initial method call and memory allocation (creating the Iterator), followed by two method calls per loop iteration.
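To show where those costs would come from, here is a sketch of what an array-backed Iterable might look like. ArrayIterable is a hypothetical class written for this example; javac generates nothing like it, and a JIT compiler may well optimize some of this overhead away at run time.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical wrapper that adapts an array to the Iterable protocol.
final class ArrayIterable<T> implements Iterable<T> {
    private final T[] array;

    ArrayIterable(T[] array) {
        this.array = array;
    }

    @Override
    public Iterator<T> iterator() {
        // One method call and one object allocation before the loop starts.
        return new Iterator<T>() {
            private int index = 0;

            @Override
            public boolean hasNext() {   // first method call per iteration
                return index < array.length;
            }

            @Override
            public T next() {            // second method call per iteration
                if (index >= array.length) {
                    throw new NoSuchElementException();
                }
                return array[index++];
            }
        };
    }
}

class ArrayIterableDemo {
    static int totalLength(String[] words) {
        int total = 0;
        for (String w : new ArrayIterable<>(words)) {
            total += w.length();
        }
        return total;
    }

    public static void main(String[] args) {
        String[] words = {"for", "each"};
        // The wrapper inherits the Java 8 default method Iterable.forEach.
        new ArrayIterable<>(words).forEach(System.out::println);
        System.out.println(totalLength(words)); // prints 7
    }
}
```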

So when default methods were added to Iterable in Java 8, this didn't affect arrays at all.

As others have noted, if you have an array of int, long, double, or of reference type, it's possible to turn this into a stream using one of the Arrays.stream() calls. This provides access to map(), filter(), forEach(), etc.
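A short sketch of the Arrays.stream() route follows; the class name ArrayStreamDemo and the helper sumOfEvenSquares are invented for the example.

```java
import java.util.Arrays;

class ArrayStreamDemo {
    // Hypothetical helper showing filter() and map() on an IntStream built from an int[].
    static int sumOfEvenSquares(int[] values) {
        return Arrays.stream(values)
                .filter(v -> v % 2 == 0)
                .map(v -> v * v)
                .sum();
    }

    public static void main(String[] args) {
        int[] ints = {1, 2, 3, 4};
        String[] names = {"a", "b"};

        Arrays.stream(ints).forEach(System.out::println);   // IntStream.forEach takes an IntConsumer
        Arrays.stream(names).forEach(System.out::println);  // Stream<String>.forEach takes a Consumer
        System.out.println(sumOfEvenSquares(ints));         // prints 20
    }
}
```

Note that there are Arrays.stream() overloads only for int[], long[], double[], and reference-type arrays, which matches the list of element types given above.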

It would be nice, though, if the special cases in the Java language and JVM for arrays were replaced by real constructs (along with fixing a bunch of other array-related problems, such as poor handling of 2+ dimensional arrays, the 2^31 length limitation, and so forth). This is the subject of the "Arrays 2.0" investigation being led by John Rose. See John's talk at JVMLS 2012 (video, slides). The ideas relevant to this discussion include introduction of an actual interface for arrays, to allow libraries to interpose element access, to support additional operations such as slicing and copying, and so forth.

Note that all of this is investigation and future work. There is nothing from these array enhancements that is committed in the Java roadmap for any release, as of this writing (2016-02-23).

Solution 2 - Java

Suppose special code were added to the Java compiler to handle forEach on arrays. Then many similar questions could be asked. Why can't we write myArray.fill(0)? Or myArray.copyOfRange(from, to)? Or myArray.sort()? myArray.binarySearch()? myArray.stream()? Practically every static method in the Arrays class could be converted into a corresponding method of the "array class". Why should the JDK developers stop at myArray.forEach()?

Note, however, that every such method would have to be added not only to the class library specification but also to the Java Language Specification, which is far more stable and conservative. Not only would the behavior of such methods become part of the language specification, but classes like java.util.function.Consumer (the argument type of the proposed forEach method) would have to be mentioned explicitly in the JLS. New consumer interfaces such as FloatConsumer and ByteConsumer would also have to be added to the standard library for the corresponding array types. Currently the JLS rarely refers to types outside the java.lang package (with some notable exceptions such as java.util.Iterator). This provides a layer of stability. The proposed change is too drastic for the Java language.
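For comparison, here is how the hypothetical instance-method calls listed above are written today with the static methods of java.util.Arrays. This is a sketch only; the class name ArraysUtilityDemo and the helper indexAfterSort are invented for the example.

```java
import java.util.Arrays;

class ArraysUtilityDemo {
    // Hypothetical helper: sorts a copy, then looks up the key with binary search.
    static int indexAfterSort(int[] values, int key) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);                        // instead of copy.sort()
        return Arrays.binarySearch(copy, key);    // instead of copy.binarySearch(key)
    }

    public static void main(String[] args) {
        int[] myArray = {5, 3, 9, 1};

        int[] middle = Arrays.copyOfRange(myArray, 1, 3);  // instead of myArray.copyOfRange(1, 3)
        System.out.println(Arrays.toString(middle));       // prints [3, 9]

        System.out.println(indexAfterSort(myArray, 5));    // prints 2

        Arrays.fill(myArray, 0);                           // instead of myArray.fill(0)
        System.out.println(Arrays.toString(myArray));      // prints [0, 0, 0, 0]
    }
}
```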

Also note that there is currently one method that can be called on arrays directly and whose implementation differs from that of java.lang.Object: the clone() method. It adds some messy special-casing to javac and even to the JVM, since it must be handled specially everywhere. This causes bugs (e.g. method references were handled incorrectly in Java 8; see JDK-8056051). Adding more complexity of this kind to javac could introduce more bugs of the same sort.
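As a small illustration of how clone() on an array differs from Object.clone(), consider the sketch below. The class name ArrayCloneDemo and the helper copyOf are invented for the example.

```java
import java.util.Arrays;

class ArrayCloneDemo {
    // Hypothetical helper. For arrays, clone() is public, returns the array type,
    // and declares no checked exception, so no cast or try/catch is needed.
    static int[] copyOf(int[] source) {
        return source.clone();
    }

    public static void main(String[] args) {
        int[] original = {1, 2, 3};
        int[] copy = copyOf(original);

        copy[0] = 42;                                   // changes the copy only
        System.out.println(Arrays.toString(original)); // prints [1, 2, 3]
        System.out.println(Arrays.toString(copy));     // prints [42, 2, 3]
    }
}
```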

Such a feature will probably be implemented in the not-so-near future as part of the Arrays 2.0 initiative. The idea is to introduce a superclass for arrays that is located in the class library, so that new methods could be added just by writing normal Java code without tweaking javac or the JVM. However, this is also a very hard feature to implement, since arrays have always been treated specially in Java, and as far as I know it is not yet known whether or when it will be implemented.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Kedar Mhaswade | View Question on Stackoverflow
Solution 1 - Java | Stuart Marks | View Answer on Stackoverflow
Solution 2 - Java | Tagir Valeev | View Answer on Stackoverflow