Why is System.arraycopy native in Java?

Java Problem Overview

I was surprised to see in the Java source that System.arraycopy is a native method.

Presumably the reason is speed. But what native tricks can the code employ to make it faster?

Why not just loop over the original array and copy each pointer to the new array - surely this isn't that slow and cumbersome?

Java Solutions

Solution 1 - Java

In native code, it can be done with a single memcpy / memmove, as opposed to n distinct copy operations. The difference in performance is substantial.
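As a rough sketch of the two approaches being compared, the following puts an element-by-element loop next to a single System.arraycopy call. The class and method names are made up for illustration; whether the bulk call really ends up as a memcpy/memmove depends on the JVM.

```java
import java.util.Arrays;

class ArrayCopyComparison {
    // Element-by-element copy: n separate loads and stores at the source level.
    static int[] copyWithLoop(int[] src) {
        int[] dest = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dest[i] = src[i];
        }
        return dest;
    }

    // Bulk copy: a single call that the runtime is free to implement as a block move.
    static int[] copyWithArraycopy(int[] src) {
        int[] dest = new int[src.length];
        System.arraycopy(src, 0, dest, 0, src.length);
        return dest;
    }

    public static void main(String[] args) {
        int[] src = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(copyWithLoop(src)));      // [1, 2, 3, 4, 5]
        System.out.println(Arrays.toString(copyWithArraycopy(src))); // [1, 2, 3, 4, 5]
    }
}
```

Both methods produce the same result; the difference is only in how much freedom the runtime has to move the data in bulk.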

Solution 2 - Java

It can't be written in Java. Native code is able to ignore or elide the difference between arrays of Object and arrays of primitives. Java can't do that, at least not efficiently.

And it can't be written with a single memcpy(), because of the semantics required by overlapping arrays.
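To illustrate the overlap point: System.arraycopy is specified to behave as if the source range were first copied to a temporary array, so copying within one array is safe. A naive forward loop, like a plain memcpy, has no such guarantee. A minimal sketch, with hypothetical class and method names:

```java
import java.util.Arrays;

class OverlapDemo {
    // Naive forward loop: shifting right by one inside the same array
    // overwrites each element before it has been read.
    static int[] shiftRightNaive(int[] input) {
        int[] data = input.clone();
        for (int i = 0; i < data.length - 1; i++) {
            data[i + 1] = data[i];
        }
        return data;
    }

    // System.arraycopy handles overlapping source and destination ranges correctly.
    static int[] shiftRightArraycopy(int[] input) {
        int[] data = input.clone();
        System.arraycopy(data, 0, data, 1, data.length - 1);
        return data;
    }

    public static void main(String[] args) {
        int[] input = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(shiftRightNaive(input)));     // [1, 1, 1, 1, 1]
        System.out.println(Arrays.toString(shiftRightArraycopy(input))); // [1, 1, 2, 3, 4]
    }
}
```

This is the same distinction as memcpy versus memmove in C: only the latter is defined for overlapping regions.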

Solution 3 - Java

It is, of course, implementation dependent.

HotSpot treats it as an "intrinsic" and inserts code directly at the call site. That code is machine code, not slow old C code. This also means the problems with the method's signature (it takes plain Object arguments) largely go away.

A simple copy loop is simple enough that obvious optimisations, such as loop unrolling, can be applied to it. Exactly what happens is again implementation dependent.
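For readers who have not seen loop unrolling, here is a hand-written illustration of the transformation, with four copies per iteration. It only shows the idea; it is not what HotSpot actually emits, and the class and method names are invented for this sketch.

```java
import java.util.Arrays;

class UnrolledCopy {
    // Copies the first n elements of src into dest, four per loop iteration.
    static void copyUnrolled(int[] src, int[] dest, int n) {
        int i = 0;
        int limit = n - (n % 4); // largest multiple of 4 not exceeding n
        for (; i < limit; i += 4) {
            dest[i] = src[i];
            dest[i + 1] = src[i + 1];
            dest[i + 2] = src[i + 2];
            dest[i + 3] = src[i + 3];
        }
        // Handle the remaining 0 to 3 elements one at a time.
        for (; i < n; i++) {
            dest[i] = src[i];
        }
    }

    public static void main(String[] args) {
        int[] src = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        int[] dest = new int[src.length];
        copyUnrolled(src, dest, src.length);
        System.out.println(Arrays.toString(dest));
    }
}
```

Unrolling trades code size for fewer loop-condition checks and branches per element copied.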

Solution 4 - Java

There are a few reasons:

The JIT is unlikely to generate low-level code as efficient as hand-written C. Low-level C enables many optimizations that are close to impossible for a generic JIT compiler.

For some tricks and speed comparisons of hand-written C implementations (of memcpy, but the principle is the same), see "Optimizing Memcpy improves speed".

The C version is largely independent of the type and size of the array elements. The same is not possible in Java, since there is no way to access the array contents as a raw block of memory (e.g. through a pointer).
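The type-independence point can be seen from the caller's side: one Object-typed System.arraycopy signature serves every array type, and mismatched component types are rejected at run time. The sketch below uses a hypothetical helper, copyAny, built on java.lang.reflect.Array; it is an illustration, not how the JDK itself is written.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

class TypeAgnosticCopy {
    // Hypothetical helper: copies any array, primitive or reference,
    // through the single Object-typed signature of System.arraycopy.
    static Object copyAny(Object src) {
        int length = Array.getLength(src); // IllegalArgumentException if src is not an array
        Object dest = Array.newInstance(src.getClass().getComponentType(), length);
        System.arraycopy(src, 0, dest, 0, length);
        return dest;
    }

    // Returns true if copying int[] into double[] is rejected at run time.
    static boolean rejectsMismatchedTypes() {
        try {
            System.arraycopy(new int[] {1, 2}, 0, new double[2], 0, 2);
            return false;
        } catch (ArrayStoreException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] ints = (int[]) copyAny(new int[] {1, 2, 3});
        double[] doubles = (double[]) copyAny(new double[] {1.5, 2.5});
        String[] strings = (String[]) copyAny(new String[] {"a", "b"});
        System.out.println(Arrays.toString(ints));
        System.out.println(Arrays.toString(doubles));
        System.out.println(Arrays.toString(strings));
        System.out.println(rejectsMismatchedTypes());
    }
}
```

A plain Java loop cannot be written once for all of these: it would need a separate version for each primitive type plus one for references, or slow reflective element access.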

Solution 5 - Java

In my own tests, System.arraycopy() is 10 to 20 times faster than nested for loops when copying multidimensional arrays:

float[][] foo = mLoadMillionsOfPoints(); // result is a float[1200000][9]
float[][] fooCpy = new float[foo.length][foo[0].length];

// Time the System.arraycopy() version.
// Note: for a float[][] this copies only the row references (a shallow copy).
long lTime = System.currentTimeMillis();
System.arraycopy(foo, 0, fooCpy, 0, foo.length);
System.out.println("System.arraycopy() duration: " + (System.currentTimeMillis() - lTime) + " ms");

// Time the nested-loop version, which copies every element.
lTime = System.currentTimeMillis();
for (int i = 0; i < foo.length; i++)
{
    for (int j = 0; j < foo[0].length; j++)
    {
        fooCpy[i][j] = foo[i][j];
    }
}
System.out.println("loop duration: " + (System.currentTimeMillis() - lTime) + " ms");

// Verify that the copy matches the original.
for (int i = 0; i < foo.length; i++)
{
    for (int j = 0; j < foo[0].length; j++)
    {
        if (fooCpy[i][j] != foo[i][j])
        {
            System.err.println("ERROR at " + i + ", " + j);
        }
    }
}

This prints:

System.arraycopy() duration: 1 ms
loop duration: 16 ms
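One caveat when reading these numbers: on a two-dimensional array, a single System.arraycopy call copies only the references to the rows, not the float values inside them, so the two timings measure different amounts of work. A small sketch of the difference, with hypothetical names:

```java
class ShallowVsDeep {
    // One arraycopy on the outer array: the copy shares its rows with the source.
    static float[][] shallowCopy(float[][] src) {
        float[][] dest = new float[src.length][];
        System.arraycopy(src, 0, dest, 0, src.length);
        return dest;
    }

    // One arraycopy per row: every float is duplicated.
    static float[][] deepCopy(float[][] src) {
        float[][] dest = new float[src.length][];
        for (int i = 0; i < src.length; i++) {
            dest[i] = new float[src[i].length];
            System.arraycopy(src[i], 0, dest[i], 0, src[i].length);
        }
        return dest;
    }

    public static void main(String[] args) {
        float[][] src = {{1f, 2f}, {3f, 4f}};
        float[][] shallow = shallowCopy(src);
        float[][] deep = deepCopy(src);

        src[0][0] = 99f; // modify the original after copying

        System.out.println(shallow[0] == src[0]); // true: same row object
        System.out.println(shallow[0][0]);        // 99.0: change is visible
        System.out.println(deep[0] == src[0]);    // false: separate row
        System.out.println(deep[0][0]);           // 1.0: change is not visible
    }
}
```

For a like-for-like comparison with the nested loops, the per-row version is the one to time.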

Content Type	Original Author	Original Content on Stackoverflow
Question	James B	View Question on Stackoverflow
Solution 1 - Java	Péter Török	View Answer on Stackoverflow
Solution 2 - Java	user207421	View Answer on Stackoverflow
Solution 3 - Java	Tom Hawtin - tackline	View Answer on Stackoverflow
Solution 4 - Java	Hrvoje Prgeša	View Answer on Stackoverflow
Solution 5 - Java	jumar	View Answer on Stackoverflow