Scala vs Java: performance and memory?

Java Problem Overview

I am keen to look into Scala, and have one basic question I can't seem to find an answer to: in general, is there a difference in performance and memory usage between Scala and Java?

Java Solutions

Solution 1 - Java

Scala makes it very easy to use enormous amounts of memory without realizing it. This is usually very powerful, but occasionally can be annoying. For example, suppose you have an array of strings (called array), and a map from those strings to files (called mapping). Suppose you want to get all files that are in the map and come from strings of length greater than two. In Java, you might write:

int n = 0;
for (String s: array) {
  if (s.length() > 2 && mapping.containsKey(s)) n++;
}
String[] bigEnough = new String[n];
n = 0;
for (String s: array) {
  if (s.length() <= 2) continue;
  bigEnough[n++] = mapping.get(s);
}

Whew! Hard work. In Scala, the most compact way to do the same thing is:

val bigEnough = array.filter(_.length > 2).flatMap(mapping.get)

Easy! But, unless you're fairly familiar with how the collections work, what you might not realize is that this approach creates an extra intermediate array (with filter), and an extra object for every element of the array (with mapping.get, which returns an Option). It also creates two function objects (one for the filter and one for the flatMap), though that is rarely a major issue since function objects are small.
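To make those hidden allocations concrete, here is a rough Java sketch that spells out what the one-liner does step by step. It is only an illustration: the class and method names are made up, java.util.Optional stands in for Scala's Option, and the real collection library is implemented differently.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: spells out the allocations that the one-liner hides.
class HiddenAllocations {
    static List<File> filterThenFlatMap(String[] array, Map<String, File> mapping) {
        // filter: a complete intermediate collection is built before any lookup starts.
        List<String> longEnough = new ArrayList<>();
        for (String s : array) {
            if (s.length() > 2) {
                longEnough.add(s);
            }
        }
        // flatMap(mapping.get): each successful lookup is wrapped in a short-lived
        // Optional (standing in for Scala's Option) and unpacked again right away.
        List<File> result = new ArrayList<>();
        for (String s : longEnough) {
            Optional<File> hit = Optional.ofNullable(mapping.get(s));
            hit.ifPresent(result::add);
        }
        return result;
    }
}
```

None of these temporaries outlive the call, which is why the garbage collector usually copes well with them.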

So basically, the memory usage is, at a primitive level, the same. But Scala's libraries have many powerful methods that let you create enormous numbers of (usually short-lived) objects very easily. The garbage collector is usually pretty good with that kind of garbage, but if you go in completely oblivious to what memory is being used, you'll probably run into trouble sooner in Scala than Java.

Note that the Computer Language Benchmarks Game Scala code is written in a rather Java-like style in order to get Java-like performance, and thus has Java-like memory usage. You can do this in Scala: if you write your code to look like high-performance Java code, it will be high-performance Scala code. (You may be able to write it in a more idiomatic Scala style and still get good performance, but it depends on the specifics.)

I should add that per amount of time spent programming, my Scala code is usually faster than my Java code since in Scala I can get the tedious not-performance-critical parts done with less effort, and spend more of my attention optimizing the algorithms and code for the performance-critical parts.

Solution 2 - Java

I'm a new user, so I'm not able to add a comment to Rex Kerr's answer above (allowing new users to "answer" but not "comment" is a very odd rule btw).

I signed up simply to respond to the "phew, Java is so verbose and such hard work" insinuation of Rex's popular answer above. While you can of course write more concise Scala code, the Java example given is clearly bloated. Most Java developers would code something like this:

List<String> bigEnough = new ArrayList<String>();
for(String s : array) {
  if(s.length() > 2 && mapping.get(s) != null) {
    bigEnough.add(mapping.get(s));
  }
}

And of course, if we are going to pretend that Eclipse doesn't do most of the actual typing for you and that every character saved really makes you a better programmer, then you could code this:

List b=new ArrayList();
for(String s:array)
  if(s.length()>2 && mapping.get(s) != null) b.add(mapping.get(s));

Now not only did I save the time it took me to type full variable names and curly braces (freeing me to spend 5 more seconds to think deep algorithmic thoughts), but I can also enter my code in obfuscation contests and potentially earn extra cash for the holidays.

Solution 3 - Java

Write your Scala like Java, and you can expect almost identical bytecode to be emitted - with almost identical metrics.

Write it more "idiomatically", with immutable objects and higher-order functions, and it'll be a bit slower and a bit larger. The one exception to this rule of thumb is generic code whose type parameters carry the @specialized annotation: this creates even larger bytecode, but it can outpace Java's performance by avoiding boxing/unboxing.
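As a small illustration of the boxing cost mentioned above, here is a Java sketch with made-up class and method names. It contrasts a generic container, which stores Integer objects, with a primitive array, which is roughly what a hand-specialized version looks like. It is not a benchmark, just a picture of where the extra objects come from.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of the boxing that specialization is meant to avoid.
class BoxingDemo {
    // Generic container: every int is boxed into an Integer on insertion
    // (small values may come from a cache) and unboxed again when read.
    static long sumBoxed(int n) {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            values.add(i);
        }
        long total = 0;
        for (Integer v : values) {
            total += v;
        }
        return total;
    }

    // Primitive array: the same computation with no per-element objects.
    static long sumPrimitive(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }
}
```

Both methods return the same sum; only the amount of allocation differs.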

Also worth mentioning is the fact that more memory / less speed is an inevitable trade-off when writing code that can be run in parallel. Idiomatic Scala code is far more declarative in nature than typical Java code, and is often a mere 4 characters (.par) away from being fully parallel.

So if

Scala code takes 1.25x longer than Java code in a single thread
It can be easily split across 4 cores (now common even in laptops)
for a parallel run time of (1.25 / 4 =) 0.3125x the original Java

Would you then say that the Scala code is now comparatively 25% slower, or 3x faster?

The correct answer depends on exactly how you define "performance" :)
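For readers who think in Java, the same idea can be sketched with streams (Java 8 or later), where .parallel() plays roughly the role of Scala's .par. The class and method names here are made up, and the run-time figure assumes perfect linear scaling across cores, which real workloads rarely achieve.

```java
import java.util.stream.IntStream;

// Hypothetical sketch: a declarative pipeline that can be switched to parallel.
class ParallelSketch {
    // Relative run time under the idealized assumption of perfect linear scaling.
    static double idealRelativeRuntime(double singleThreadRatio, int cores) {
        return singleThreadRatio / cores;
    }

    // The pipeline itself does not change; only the parallel() call is added.
    static long sumOfSquares(int n, boolean parallel) {
        IntStream range = IntStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();
        }
        return range.mapToLong(i -> (long) i * i).sum();
    }
}
```

With the numbers from the list, idealRelativeRuntime(1.25, 4) gives 0.3125, that is, about 3.2x faster than the single-threaded Java baseline.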

Solution 4 - Java

Computer Language Benchmarks Game:

Speed test java/scala 1.71/2.25

Memory test java/scala 66.55/80.81

So, these benchmarks say that Java takes about 24% less time (equivalently, Scala takes about 32% longer) and Scala uses about 21% more memory.
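The percentages depend on which figure is taken as the baseline. A tiny Java sketch of the arithmetic, with made-up helper names:

```java
// Hypothetical helpers for turning the two benchmark figures into percentages.
class BenchmarkRatios {
    // Percentage by which value exceeds baseline.
    static double percentMore(double value, double baseline) {
        return (value / baseline - 1.0) * 100.0;
    }

    // Percentage by which value falls short of baseline.
    static double percentLess(double value, double baseline) {
        return (1.0 - value / baseline) * 100.0;
    }
}
```

Here percentLess(1.71, 2.25) is about 24 (Java's time relative to Scala's), percentMore(2.25, 1.71) is about 31.6 (Scala's time relative to Java's), and percentMore(80.81, 66.55) is about 21.4 (Scala's memory relative to Java's).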

All-in-all it's no big deal and should not matter in real world apps, where most of the time is consumed by database and network.

Bottom line: If Scala makes you and your team (and people taking project over when you leave) more productive, then you should go for it.

Solution 5 - Java

Others have answered this question with respect to tight loops although there seems to be an obvious performance difference between Rex Kerr's examples that I have commented on.

This answer is really targeted at people who might investigate a need for tight-loop optimisation as a design flaw.

I am relatively new to Scala (about a year or so) but the feel of it, thus far, is that it allows you to defer many aspects of design, implementation and execution relatively easily (with enough background reading and experimentation :)

Deferred Design Features:

Deferred Implementation Features:

Deferred Execution Features: (sorry, no links)

Thread-safe lazy values
Call-by-name
Monadic stuff

These features, to me, are the ones that help us to tread the path to fast, tight applications.

Rex Kerr's examples differ in what aspects of execution are deferred. In the Java example, allocation of memory is deferred until its size is calculated, whereas the Scala example defers the mapping lookup. To me, they seem like completely different algorithms.

Here's what I think is more of an apples to apples equivalent for his Java example:

val bigEnough = array.collect({
    case k: String if k.length > 2 && mapping.contains(k) => mapping(k)
})

No intermediary collections, no Option instances etc. This also preserves the collection type so bigEnough's type is Array[File] - Array's collect implementation will probably be doing something along the lines of what Mr Kerr's Java code does.

The deferred design features I listed above would also allow Scala's collection API developers to implement that fast Array-specific collect implementation in future releases without breaking the API. This is what I'm referring to with treading the path to speed.

Also:

val bigEnough = array.withFilter(_.length > 2).flatMap(mapping.get)

The withFilter method that I've used here instead of filter fixes the intermediate collection problem but there is still the Option instance issue.
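A rough Java analogue, assuming Java 8+ streams (the class and method names are made up): the stream's filter is lazy in much the same way as withFilter, so no intermediate array is built, and because Map.get returns null for a miss there is no wrapper object per lookup either.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper: single lazy pass over the array, no intermediate collection.
class LazyPipeline {
    static File[] bigEnough(String[] array, Map<String, File> mapping) {
        return Arrays.stream(array)
                .filter(s -> s.length() > 2)   // lazy: elements flow through one at a time
                .map(mapping::get)             // null for a miss, no wrapper object
                .filter(Objects::nonNull)      // drop the misses
                .toArray(File[]::new);
    }
}
```

The stream machinery still allocates a few pipeline objects, but their number does not grow with the size of the array.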

One example of simple execution speed in Scala is with logging.

In Java we might write something like:

if (logger.isDebugEnabled())
    logger.debug("trace");

In Scala, this is just:

logger.debug("trace")

because the message parameter to debug in Scala has the type "=> String", which I think of as a parameter-less function that executes when it is evaluated, but which the documentation calls call-by-name.

EDIT { Functions in Scala are objects so there is an extra object here. For my work, the weight of a trivial object is worth removing the possibility of a log message getting needlessly evaluated. }

This doesn't make the code faster but it does make it more likely to be faster and we're less likely to have the experience of going through and cleaning up other people's code en masse.
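In Java, a similar effect can be had by passing a Supplier<String> instead of a String. The logger below is a made-up minimal class, not the API of any particular logging library; it only shows that the message is never built when debug output is switched off.

```java
import java.util.function.Supplier;

// Hypothetical minimal logger that evaluates its message lazily.
class LazyLogger {
    private final boolean debugEnabled;

    LazyLogger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    // The supplier is only invoked when debug logging is enabled, so an
    // expensive message is never computed needlessly.
    void debug(Supplier<String> message) {
        if (debugEnabled) {
            System.out.println(message.get());
        }
    }
}
```

A call site then looks like logger.debug(() -> "trace"). As with the Scala version, the price is one small function object per call.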

To me, this is a consistent theme within Scala.

Hard code fails to capture why Scala is faster though it does hint a bit.

I feel that it's a combination of code re-use and the ceiling of code quality in Scala.

In Java, awesome code is often forced to become an incomprehensible mess and so isn't really viable within production quality APIs as most programmers wouldn't be able to use it.

I have high hopes that Scala could allow the Einsteins among us to implement far more competent APIs, potentially expressed through DSLs. The core APIs in Scala are already far along this path.

Solution 6 - Java

@higherkinded's presentation on the subject, Scala Performance Considerations, which does some Java/Scala comparisons.

Tools:

Great blogpost:

Nanotrusting the Nanotime

Solution 7 - Java

Java and Scala both compile down to JVM bytecode, so the difference isn't that big. The best comparison you can get is probably on the Computer Language Benchmarks Game, which essentially says that Java and Scala both have the same memory usage. Scala is only slightly slower than Java on some of the benchmarks listed, but that could simply be because the implementations of the programs are different.

Really though, they're both so close it's not worth worrying about. The productivity increase you get by using a more expressive language like Scala is worth so much more than the minimal (if any) performance hit.

Solution 8 - Java

The Java example is really not an idiom for typical application programs. Such optimized code might be found in a system library method. But then it would use an array of the right type, i.e. File[], and would not throw an IndexOutOfBoundsException (the filter conditions for counting and adding are different). My version would be (always (!) with curly braces, because I don't like to spend an hour searching for a bug that was introduced by saving the 2 seconds it takes to hit a single key in Eclipse):

List<File> bigEnough = new ArrayList<File>();
for(String s : array) {
  if(s.length() > 2) {
    File file = mapping.get(s);
    if (file != null) {
      bigEnough.add(file);
    }
  }
}

But I could bring you a lot of other ugly Java code examples from my current project. I tried to avoid the common copy&modify style of coding by factoring out common structures and behaviour.

In my abstract DAO base class I have an abstract inner class for the common caching mechanism. For every concrete model object type there is a subclass of the abstract DAO base class, in which the inner class is subclassed to provide an implementation for the method which creates the business object when it is loaded from the database. (We cannot use an ORM tool because we access another system via a proprietary API.)

This subclassing and instantiation code is not at all clear in Java and would be very readable in Scala.

Content Type	Original Author	Original Content on Stackoverflow
Question	JohnSmith	View Question on Stackoverflow
Solution 1 - Java	Rex Kerr	View Answer on Stackoverflow
Solution 2 - Java	Not Sleeping	View Answer on Stackoverflow
Solution 3 - Java	Kevin Wright	View Answer on Stackoverflow
Solution 4 - Java	Peter Knego	View Answer on Stackoverflow
Solution 5 - Java	Seth	View Answer on Stackoverflow
Solution 6 - Java	oluies	View Answer on Stackoverflow
Solution 7 - Java	ryeguy	View Answer on Stackoverflow
Solution 8 - Java	MickH	View Answer on Stackoverflow