[L array notation - where does it come from?

Java, Arrays

Java Problem Overview


I've often seen messages that use [L then a type to denote an array, for instance:

[Ljava.lang.Object; cannot be cast to [Ljava.lang.String;

(The above being an arbitrary example I just pulled out.) I know this signifies an array, but where does the syntax come from? Why the beginning [ but no closing square bracket? And why the L? Is it purely arbitrary or is there some other historical/technical reason behind it?

Java Solutions


Solution 1 - Java

[ stands for an array, and Lsome.type.Here; names the element type. This is similar to the type descriptors used internally in the bytecode, described in §4.3 of the Java Virtual Machine Specification, which were picked to be as brief as possible. The only difference is that the real descriptors use / rather than . to separate package names.

For primitives, for instance, an array of int is [I, and a two-dimensional array of int is [[I.

Since a class can have any name, a bare class name would be hard to tell apart from the single-letter primitive codes, hence the leading L. The class name is terminated by a ;
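To see these names for yourself, here is a minimal sketch that prints what Class.getName() reports for a few array classes. The class name ArrayNameDemo and the helper nameOf are made up for illustration; only Class.getName() is the real API.

```java
// Prints the names the runtime reports for a few array classes.
class ArrayNameDemo {

    // Hypothetical helper: returns whatever Class.getName() reports for the given class.
    static String nameOf(Class<?> type) {
        return type.getName();
    }

    public static void main(String[] args) {
        System.out.println(nameOf(int[].class));     // [I
        System.out.println(nameOf(int[][].class));   // [[I
        System.out.println(nameOf(String[].class));  // [Ljava.lang.String;
        System.out.println(nameOf(String.class));    // java.lang.String
    }
}
```

Note that the L prefix and the trailing ; only show up for array classes; a plain class such as String keeps its ordinary name.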

Descriptors are also used to represent the types of fields and methods.

For instance:

(IDLjava/lang/Thread;)Ljava/lang/Object;

... corresponds to a method whose parameters are an int, a double, and a Thread, and whose return type is Object.
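If you want to produce such a descriptor without compiling anything, java.lang.invoke.MethodType (Java 7 and later) can build one. This is only a sketch; DescriptorDemo and descriptorOf are names invented for the example.

```java
import java.lang.invoke.MethodType;

// Builds JVM method descriptors from Class objects.
class DescriptorDemo {

    // Hypothetical helper: returns the descriptor for a method with the
    // given return type and parameter types.
    static String descriptorOf(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Same shape as the descriptor discussed above: (int, double, Thread) -> Object
        System.out.println(descriptorOf(Object.class, int.class, double.class, Thread.class));
        // (IDLjava/lang/Thread;)Ljava/lang/Object;

        // A method with no parameters that returns void
        System.out.println(descriptorOf(void.class));
        // ()V
    }
}
```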

Edit:

You can also see this in .class files using the Java disassembler, javap:

C:>more > S.java
class S {
  Object hello(int i, double d, long j, Thread t) {
    return new Object();
  }
}
^C
C:>javac S.java

C:>javap -verbose S
class S extends java.lang.Object
  SourceFile: "S.java"
  minor version: 0
  major version: 50
  Constant pool:
const #1 = Method       #2.#12; //  java/lang/Object."<init>":()V
const #2 = class        #13;    //  java/lang/Object
const #3 = class        #14;    //  S
const #4 = Asciz        <init>;
const #5 = Asciz        ()V;
const #6 = Asciz        Code;
const #7 = Asciz        LineNumberTable;
const #8 = Asciz        hello;
const #9 = Asciz        (IDJLjava/lang/Thread;)Ljava/lang/Object;;
const #10 = Asciz       SourceFile;
const #11 = Asciz       S.java;
const #12 = NameAndType #4:#5;//  "<init>":()V
const #13 = Asciz       java/lang/Object;
const #14 = Asciz       S;

{
S();
  Code:
   Stack=1, Locals=1, Args_size=1
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return
  LineNumberTable:
   line 1: 0


java.lang.Object hello(int, double, long, java.lang.Thread);
  Code:
   Stack=2, Locals=7, Args_size=5
   0:   new     #2; //class java/lang/Object
   3:   dup
   4:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   7:   areturn
  LineNumberTable:
   line 3: 0


}

And in the raw class file the same descriptor string is visible (line 5 of the hex dump shown in the original answer; the image is not reproduced here).

Reference: Field descriptors in the JVM specification

Solution 2 - Java

JVM array descriptors.

[Z = boolean
[B = byte
[S = short
[I = int
[J = long
[F = float
[D = double
[C = char
[L<class name>; = any non-primitive (reference) type, e.g. [Ljava.lang.Object;

To get the element type of an array, you need:

anyObject.getClass().getComponentType();

It will return null if the object is not an array. To determine whether it is an array, just call:

anyObject.getClass().isArray()

or, if you already have a Class object:

someClass.isArray();
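Putting those two calls together, a small sketch might look like this. ComponentTypeDemo and describe are illustrative names, not part of any library; isArray(), getComponentType() and getName() are the real Class methods.

```java
// Reports whether an object is an array and, if so, what its element type is.
class ComponentTypeDemo {

    // Hypothetical helper: describes the runtime class of the given object.
    static String describe(Object value) {
        Class<?> type = value.getClass();
        if (!type.isArray()) {
            // getComponentType() would return null here
            return type.getName() + " is not an array";
        }
        return type.getName() + " is an array of " + type.getComponentType().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe(new int[0]));        // [I is an array of int
        System.out.println(describe(new String[0][0]));  // [[Ljava.lang.String; is an array of [Ljava.lang.String;
        System.out.println(describe("text"));            // java.lang.String is not an array
    }
}
```

For a multi-dimensional array, getComponentType() peels off one dimension at a time, so the component type of String[][] is String[].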

Solution 3 - Java

This is used in the JNI (and the JVM internally in general) to indicate a type. Primitives are denoted with a single letter (Z for boolean, I for int, etc), [ indicates an array, and L is used for a class (terminated by a ;).

See here: JNI Types

EDIT: To elaborate on why there is no terminating ]: this encoding exists to let the JNI/JVM quickly identify a method and its signature. It's intended to be as compact as possible (as few characters as possible) to make parsing fast, so a lone [ is used for an array, which is pretty straightforward (what better symbol to use?). I for int is equally obvious.

Solution 4 - Java

> [L array notation - where does it come from?

From the JVM spec. This is the representation of type names that is specified in the class file format and other places.

  • The '[' denotes an array. In fact, the array type name is [<typename> where <typename> is the name of the base type of the array.
  • 'L' is actually part of the base type name; e.g. String is "Ljava.lang.String;". Note the trailing ';'!!

And yes, the notation is documented in other places as well.

> Why?

There is no doubt that this internal type name representation was chosen because it is:

  • compact,
  • self-delimiting (this is important for representations of method signatures, and it's why the 'L' and the trailing ';' are there), and
  • made up of printable characters (for legibility ... if not readability).
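To illustrate the "self-delimiting" point, here is a rough sketch of how a parser can split the parameter list of a method descriptor with no separators at all. It assumes a well-formed descriptor and does no validation; DescriptorParser and parameterDescriptors are names invented for this example, not a JDK API.

```java
import java.util.ArrayList;
import java.util.List;

// Splits the parameter part of a method descriptor into individual type descriptors.
class DescriptorParser {

    // Hypothetical helper: assumes the input is a well-formed method descriptor
    // such as "(IDJLjava/lang/Thread;)Ljava/lang/Object;".
    static List<String> parameterDescriptors(String methodDescriptor) {
        List<String> result = new ArrayList<>();
        int end = methodDescriptor.indexOf(')');
        int i = 1; // skip the opening '('
        while (i < end) {
            int start = i;
            // Each '[' adds one array dimension; no closing bracket is needed
            while (methodDescriptor.charAt(i) == '[') {
                i++;
            }
            // 'L' starts a class name, which runs up to the next ';'
            if (methodDescriptor.charAt(i) == 'L') {
                i = methodDescriptor.indexOf(';', i);
            }
            i++; // consume the primitive letter or the ';'
            result.add(methodDescriptor.substring(start, i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parameterDescriptors("(IDJLjava/lang/Thread;)Ljava/lang/Object;"));
        // [I, D, J, Ljava/lang/Thread;]
    }
}
```

Without the L and the ;, the parser could not tell where a class name such as java/lang/Thread starts or stops, since its letters would look like primitive codes.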

But it is unclear why they decided to expose the internal type names of array types via the Class.getName() method. I think they could have mapped the internal names to something more "human friendly". My best guess is that it was just one of those things that they didn't get around to fixing until it was too late. (Nobody is perfect ... not even the hypothetical "intelligent designer".)

Solution 5 - Java

I think it's because C was already taken by char, so the next letter in "class", L, was used instead.

Solution 6 - Java

Another source for this would be the documentation of Class.getName(). Of course, all these specifications are congruent, since they are made to fit each other.
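As a quick check that the names fit together, the strings returned by Class.getName() for array classes can be handed back to Class.forName(). A minimal sketch, with NameRoundTrip and load being names made up for the example:

```java
// Shows that array class names from Class.getName() can be fed back to Class.forName().
class NameRoundTrip {

    // Hypothetical helper: wraps Class.forName() so callers need not handle the checked exception.
    static Class<?> load(String name) {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("No such class: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(load("[Ljava.lang.String;") == String[].class);               // true
        System.out.println(load("[I") == int[].class);                                   // true
        System.out.println(load(Object[][].class.getName()) == Object[][].class);        // true
    }
}
```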

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Michael Berry | View Question on Stackoverflow
Solution 1 - Java | OscarRyz | View Answer on Stackoverflow
Solution 2 - Java | Dino Rico Bendanillo | View Answer on Stackoverflow
Solution 3 - Java | EboMike | View Answer on Stackoverflow
Solution 4 - Java | Stephen C | View Answer on Stackoverflow
Solution 5 - Java | Enerccio | View Answer on Stackoverflow
Solution 6 - Java | Paŭlo Ebermann | View Answer on Stackoverflow