Which loop has better performance? Why?

JavaPerformanceStringGarbage Collection

Java Problem Overview


String s = "";
for(i=0;i<....){
    s = some Assignment;
}

or

for(i=0;i<..){
    String s = some Assignment;
}

I don't need to use 's' outside the loop ever again. The first option is perhaps better since a new String is not initialized each time. The second however would result in the scope of the variable being limited to the loop itself.

EDIT: In response to Milhous's answer. It'd be pointless to assign the String to a constant within a loop wouldn't it? No, here 'some Assignment' means a changing value got from the list being iterated through.

Also, the question isn't because I'm worried about memory management. Just want to know which is better.

Java Solutions


Solution 1 - Java

Limited Scope is Best

Use your second option:

for ( ... ) {
  String s = ...;
}
Scope Doesn't Affect Performance

If you disassemble code the compiled from each (with the JDK's javap tool), you will see that the loop compiles to the exact same JVM instructions in both cases. Note also that Brian R. Bondy's "Option #3" is identical to Option #1. Nothing extra is added or removed from the stack when using the tighter scope, and same data are used on the stack in both cases.

Avoid Premature Initialization

The only difference between the two cases is that, in the first example, the variable s is unnecessarily initialized. This is a separate issue from the location of the variable declaration. This adds two wasted instructions (to load a string constant and store it in a stack frame slot). A good static analysis tool will warn you that you are never reading the value you assign to s, and a good JIT compiler will probably elide it at runtime.

You could fix this simply by using an empty declaration (i.e., String s;), but this is considered bad practice and has another side-effect discussed below.

Often a bogus value like null is assigned to a variable simply to hush a compiler error that a variable is read without being initialized. This error can be taken as a hint that the variable scope is too large, and that it is being declared before it is needed to receive a valid value. Empty declarations force you to consider every code path; don't ignore this valuable warning by assigning a bogus value.

Conserve Stack Slots

As mentioned, while the JVM instructions are the same in both cases, there is a subtle side-effect that makes it best, at a JVM level, to use the most limited scope possible. This is visible in the "local variable table" for the method. Consider what happens if you have multiple loops, with the variables declared in unnecessarily large scope:

void x(String[] strings, Integer[] integers) {
  String s;
  for (int i = 0; i < strings.length; ++i) {
    s = strings[0];
    ...
  }
  Integer n;
  for (int i = 0; i < integers.length; ++i) {
    n = integers[i];
    ...
  }
}

The variables s and n could be declared inside their respective loops, but since they are not, the compiler uses two "slots" in the stack frame. If they were declared inside the loop, the compiler can reuse the same slot, making the stack frame smaller.

What Really Matters

However, most of these issues are immaterial. A good JIT compiler will see that it is not possible to read the initial value you are wastefully assigning, and optimize the assignment away. Saving a slot here or there isn't going to make or break your application.

The important thing is to make your code readable and easy to maintain, and in that respect, using a limited scope is clearly better. The smaller scope a variable has, the easier it is to comprehend how it is used and what impact any changes to the code will have.

Solution 2 - Java

In theory, it's a waste of resources to declare the string inside the loop. In practice, however, both of the snippets you presented will compile down to the same code (declaration outside the loop).

So, if your compiler does any amount of optimization, there's no difference.

Solution 3 - Java

In general I would choose the second one, because the scope of the 's' variable is limited to the loop. Benefits:

  • This is better for the programmer because you don't have to worry about 's' being used again somewhere later in the function
  • This is better for the compiler because the scope of the variable is smaller, and so it can potentially do more analysis and optimisation
  • This is better for future readers because they won't wonder why the 's' variable is declared outside the loop if it's never used later

Solution 4 - Java

If you want to speed up for loops, I prefer declaring a max variable next to the counter so that no repeated lookups for the condidtion are needed:

instead of

for (int i = 0; i < array.length; i++) {
  Object next = array[i];
}

I prefer

for (int i = 0, max = array.lenth; i < max; i++) {
  Object next = array[i];
}

Any other things that should be considered have already been mentioned, so just my two cents (see ericksons post)

Greetz, GHad

Solution 5 - Java

To add on a bit to @Esteban Araya's answer, they will both require the creation of a new string each time through the loop (as the return value of the some Assignment expression). Those strings need to be garbage collected either way.

Solution 6 - Java

I know this is an old question, but I thought I'd add a bit that is slightly related.

I've noticed while browsing the Java source code that some methods, like String.contentEquals (duplicated below) makes redundant local variables that are merely copies of class variables. I believe that there was a comment somewhere, that implied that accessing local variables is faster than accessing class variables.

In this case "v1" and "v2" are seemingly unnecessary and could be eliminated to simplify the code, but were added to improve performance.

public boolean contentEquals(StringBuffer sb) {
    synchronized(sb) {
        if (count != sb.length())
            return false;
        char v1[] = value;
        char v2[] = sb.getValue();
        int i = offset;
        int j = 0;
        int n = count;
        while (n-- != 0) {
            if (v1[i++] != v2[j++])
                return false;
        }
    }
    return true;
}

Solution 7 - Java

It seems to me that we need more specification of the problem.

The

s = some Assignment;

is not specified as to what kind of assignment this is. If the assignment is

s = "" + i + "";

then a new sting needs to be allocated.

but if it is

s = some Constant;

s will merely point to the constants memory location, and thus the first version would be more memory efficient.

Seems i little silly to worry about to much optimization of a for loop for an interpreted lang IMHO.

Solution 8 - Java

When I'm using multiple threads (50+) then i found this to be a very effective way of handling ghost thread issues with not being able to close a process correctly ....if I'm wrong, please let me know why I'm wrong:

Process one;
BufferedInputStream two;
try{
one = Runtime.getRuntime().exec(command);
two = new BufferedInputStream(one.getInputStream());
}
}catch(e){
e.printstacktrace
}
finally{
//null to ensure they are erased
one = null;
two = null;
//nudge the gc
System.gc();
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionVivek KodiraView Question on Stackoverflow
Solution 1 - JavaericksonView Answer on Stackoverflow
Solution 2 - JavaEsteban ArayaView Answer on Stackoverflow
Solution 3 - JavaGreg HewgillView Answer on Stackoverflow
Solution 4 - JavaGHadView Answer on Stackoverflow
Solution 5 - Java1800 INFORMATIONView Answer on Stackoverflow
Solution 6 - JavaRandy StegbauerView Answer on Stackoverflow
Solution 7 - JavaMilhousView Answer on Stackoverflow
Solution 8 - JavaPetroView Answer on Stackoverflow