How to profile methods in Scala?

Tags: Function, Scala, Profiling, Aspect

Function Problem Overview


What is a standard way of profiling Scala method calls?

What I need are hooks around a method that I can use to start and stop timers.

In Java I use aspect-oriented programming (AspectJ) to define the methods to be profiled and inject bytecode to achieve the same.

Is there a more natural way in Scala, where I can define a bunch of functions to be called before and after a function without losing any static typing in the process?
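
For what it's worth, the "hooks around a function" idea can be expressed directly as a higher-order function in plain Scala while keeping full static typing; the solutions below flesh this out. A small sketch (the around helper is illustrative, not from the question):

def around[R](before: () => Unit, after: () => Unit)(body: => R): R = {
  before()
  try body
  finally after()
}

// e.g. hanging a timer (or any other hook) around an arbitrary expression
val sum = around(() => println("start"), () => println("stop")) {
  (1 to 1000).sum
}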

Function Solutions


Solution 1 - Function

Do you want to do this without changing the code that you want to measure timings for? If you don't mind changing the code, then you could do something like this:

def time[R](block: => R): R = {
    val t0 = System.nanoTime()
    val result = block    // call-by-name
    val t1 = System.nanoTime()
    println("Elapsed time: " + (t1 - t0) + "ns")
    result
}

// Now wrap your method calls, for example change this...
val result = (1 to 1000).sum

// ... into this
val result = time { (1 to 1000).sum }

Solution 2 - Function

In addition to Jesper's answer, you can automatically wrap method invocations in the REPL:

scala> def time[R](block: => R): R = {
   | val t0 = System.nanoTime()
   | val result = block
   | println("Elapsed time: " + (System.nanoTime - t0) + "ns")
   | result
   | }
time: [R](block: => R)R

Now - let's wrap anything in this

scala> :wrap time
wrap: no such command.  Type :help for help.

OK - we need to be in power mode

scala> :power
** Power User mode enabled - BEEP BOOP SPIZ **
** :phase has been set to 'typer'.          **
** scala.tools.nsc._ has been imported      **
** global._ and definitions._ also imported **
** Try  :help,  vals.<tab>,  power.<tab>    **

Wrap away

scala> :wrap time
Set wrapper to 'time'

scala> BigDecimal("1.456")
Elapsed time: 950874ns
Elapsed time: 870589ns
Elapsed time: 902654ns
Elapsed time: 898372ns
Elapsed time: 1690250ns
res0: scala.math.BigDecimal = 1.456

I have no idea why that printed stuff out 5 times

Update as of 2.12.2:

scala> :pa
// Entering paste mode (ctrl-D to finish)

package wrappers { object wrap { def apply[A](a: => A): A = { println("running...") ; a } }}

// Exiting paste mode, now interpreting.


scala> $intp.setExecutionWrapper("wrappers.wrap")

scala> 42
running...
res2: Int = 42
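
To get timings rather than just a "running..." marker, the same trick works with a timing wrapper: paste something like the following via :pa and then call $intp.setExecutionWrapper("wrappers.timed"). This is a sketch (the wrappers.timed name is mine), not part of the original answer.

package wrappers {
  object timed {
    def apply[A](a: => A): A = {
      val t0 = System.nanoTime()
      try a finally println("Elapsed time: " + (System.nanoTime() - t0) + "ns")
    }
  }
}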

Solution 3 - Function

This is what I use:

import System.nanoTime
// the default argument captures the start time when profile is invoked; `code` is only forced inside the method body
def profile[R](code: => R, t: Long = nanoTime) = (code, nanoTime - t)

// usage:
val (result, time) = profile { 
  /* block of code to be profiled*/ 
}

val (result2, time2) = profile(methodToBeProfiled(foo))

Solution 4 - Function

There are three benchmarking libraries for Scala that you can make use of.

Since the URLs on the linked site are likely to change, I am pasting the relevant content below.

  1. SPerformance - Performance Testing framework aimed at automagically comparing performance tests and working inside Simple Build Tool.

  2. scala-benchmarking-template - SBT template project for creating Scala (micro-)benchmarks based on Caliper.

  3. Metrics - Capturing JVM- and application-level metrics, so you know what's going on (a minimal usage sketch follows this list).
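
For item 3, here is a minimal sketch of timing a block with the Metrics (Dropwizard) library, assuming a metrics-core dependency is on the classpath; the timed wrapper below is illustrative, not part of the library.

import com.codahale.metrics.MetricRegistry

object MetricsTimingExample {
  private val registry = new MetricRegistry()
  private val timer    = registry.timer("myMethod")

  def timed[R](block: => R): R = {
    val ctx = timer.time()   // start the clock
    try block
    finally ctx.stop()       // record the elapsed time in the timer's histogram
  }

  def main(args: Array[String]): Unit = {
    (1 to 5).foreach(_ => timed(Thread.sleep(100)))
    // metrics-core reports snapshot values in nanoseconds
    println(s"calls: ${timer.getCount}, mean: ${timer.getSnapshot.getMean / 1e6} ms")
  }
}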

Solution 5 - Function

testing.Benchmark might be useful.

scala> def testMethod {Thread.sleep(100)}
testMethod: Unit

scala> object Test extends testing.Benchmark {
     |   def run = testMethod
     | }
defined module Test

scala> Test.main(Array("5"))
$line16.$read$$iw$$iw$Test$     100     100     100     100     100

Solution 6 - Function

I use a technique that's easy to move around in code blocks. The crux is that the same exact line starts and ends the timer - so it is really a simple copy and paste. The other nice thing is that you get to define what the timing means to you as a string, all in that same line.

Example usage:

Timelog("timer name/description")
//code to time
Timelog("timer name/description")

The code:

object Timelog {

  val timers = scala.collection.mutable.Map.empty[String, Long]

  //
  // Usage: call once to start the timer, and once to stop it, using the same timer name parameter
  //
  def apply(timerName: String): Unit = {
    if (timers contains timerName) {
      val output = s"$timerName took ${(System.nanoTime() - timers(timerName)) / 1000 / 1000} milliseconds"
      println(output) // or log, or send off to some performance db for analytics
    }
    else timers(timerName) = System.nanoTime()
  }
}

Pros:

  • no need to wrap code as a block or manipulate within lines
  • can easily move the start and end of the timer among code lines when being exploratory

Cons:

  • less shiny for utterly functional code
  • obviously this object leaks map entries if you do not "close" timers, e.g. if your code never reaches the second invocation for a given timer start (a possible mitigation is sketched below).
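
A possible mitigation for the leak, sketched here rather than taken from the original answer (TimelogOnce is a hypothetical name): remove the map entry when the timer stops, so only timers that are never stopped can accumulate.

object TimelogOnce {

  private val timers = scala.collection.mutable.Map.empty[String, Long]

  // Same call-twice protocol as above, but the entry is removed when the timer stops.
  def apply(timerName: String): Unit =
    timers.remove(timerName) match {
      case Some(start) =>
        println(s"$timerName took ${(System.nanoTime() - start) / 1000 / 1000} milliseconds")
      case None =>
        timers(timerName) = System.nanoTime()
    }
}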

Solution 7 - Function

ScalaMeter is a nice library for benchmarking in Scala.

Below is a simple example

import org.scalameter._

def sumSegment(i: Long, j: Long): Long = (i to j).sum

val (a, b) = (1, 1000000000)

val execution_time = measure { sumSegment(a, b) }

If you execute the above code snippet in a Scala worksheet, you get the running time in milliseconds:

execution_time: org.scalameter.Quantity[Double] = 0.260325 ms
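
For more stable numbers, ScalaMeter can also be configured inline with warm-up runs and repeated measurements. A sketch based on ScalaMeter's inline configuration API (exact keys and types may vary slightly between versions):

import org.scalameter._

def sumSegment(i: Long, j: Long): Long = (i to j).sum

// average over 20 measured runs after a default JIT warm-up
val stableTime = config(
  Key.exec.benchRuns -> 20,
  Key.verbose -> false
) withWarmer {
  new Warmer.Default
} withMeasurer {
  new Measurer.Default
} measure {
  sumSegment(1, 1000000000)
}

println(s"Time: $stableTime")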

Solution 8 - Function

I took the solution from Jesper and added some aggregation to it over multiple runs of the same code:

def time[R](block: => R) = {
    def print_result(s: String, ns: Long) = {
      val formatter = java.text.NumberFormat.getIntegerInstance
      println("%-16s".format(s) + formatter.format(ns) + " ns")
    }

    var t0 = System.nanoTime()
    var result = block    // call-by-name
    var t1 = System.nanoTime()

    print_result("First Run", (t1 - t0))

    var lst = for (i <- 1 to 10) yield {
      t0 = System.nanoTime()
      result = block    // call-by-name
      t1 = System.nanoTime()
      print_result("Run #" + i, (t1 - t0))
      (t1 - t0).toLong
    }

    print_result("Max", lst.max)
    print_result("Min", lst.min)
    print_result("Avg", (lst.sum / lst.length))
}

Suppose you want to time two functions, counter_new and counter_old; the usage is as follows:

scala> time {counter_new(lst)}
First Run       2,963,261,456 ns
Run #1          1,486,928,576 ns
Run #2          1,321,499,030 ns
Run #3          1,461,277,950 ns
Run #4          1,299,298,316 ns
Run #5          1,459,163,587 ns
Run #6          1,318,305,378 ns
Run #7          1,473,063,405 ns
Run #8          1,482,330,042 ns
Run #9          1,318,320,459 ns
Run #10         1,453,722,468 ns
Max             1,486,928,576 ns
Min             1,299,298,316 ns
Avg             1,407,390,921 ns

scala> time {counter_old(lst)}
First Run       444,795,051 ns
Run #1          1,455,528,106 ns
Run #2          586,305,699 ns
Run #3          2,085,802,554 ns
Run #4          579,028,408 ns
Run #5          582,701,806 ns
Run #6          403,933,518 ns
Run #7          562,429,973 ns
Run #8          572,927,876 ns
Run #9          570,280,691 ns
Run #10         580,869,246 ns
Max             2,085,802,554 ns
Min             403,933,518 ns
Avg             797,980,787 ns

Hopefully this is helpful

Solution 9 - Function

The recommended approach to benchmarking Scala code is via sbt-jmh:

> "Trust no one, bench everything." - sbt plugin for JMH (Java > Microbenchmark Harness)

This approach is taken by many of the major Scala projects, for example,

  • Scala programming language itself
  • Dotty (Scala 3)
  • cats library for functional programming
  • Metals language server for IDEs

A simple wrapper timer based on System.nanoTime is not a reliable method of benchmarking:

> System.nanoTime is as bad as String.intern now: you can use it, but use it wisely. The latency, granularity, and scalability effects introduced by timers may and will affect your measurements if done without proper rigor. This is one of the many reasons why System.nanoTime should be abstracted from the users by benchmarking frameworks

Furthermore, considerations such as JIT warmup, garbage collection, system-wide events, etc. might introduce unpredictability into measurements:

> Tons of effects need to be mitigated, including warmup, dead code elimination, forking, etc. Luckily, JMH already takes care of many things, and has bindings for both Java and Scala.

Based on Travis Brown's answer, here is an example of how to set up a JMH benchmark for Scala:

  1. Add jmh to project/plugins.sbt
    addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.3.7")
    
  2. Enable jmh plugin in build.sbt
    enablePlugins(JmhPlugin)
    
  3. Add to src/main/scala/bench/VectorAppendVsListPreppendAndReverse.scala
    package bench
    
    import org.openjdk.jmh.annotations._
    
    @State(Scope.Benchmark)
    @BenchmarkMode(Array(Mode.AverageTime))
    class VectorAppendVsListPreppendAndReverse {
      val size = 1_000_000
      val input = 1 to size
    
      @Benchmark def vectorAppend: Vector[Int] = 
        input.foldLeft(Vector.empty[Int])({ case (acc, next) => acc.appended(next)})
    
      @Benchmark def listPrependAndReverse: List[Int] = 
        input.foldLeft(List.empty[Int])({ case (acc, next) => acc.prepended(next)}).reverse
    }
    
  4. Execute benchmark with
    sbt "jmh:run -i 10 -wi 10 -f 2 -t 1 bench.VectorAppendVsListPreppendAndReverse"
    

The results are

Benchmark                                                   Mode  Cnt  Score   Error  Units
VectorAppendVsListPreppendAndReverse.listPrependAndReverse  avgt   20  0.024 ± 0.001   s/op
VectorAppendVsListPreppendAndReverse.vectorAppend           avgt   20  0.130 ± 0.003   s/op

which seems to indicate that prepending to a List and then reversing it at the end is roughly five times faster than repeatedly appending to a Vector.

Solution 10 - Function

I like the simplicity of @wrick's answer, but also wanted:

  • the profiler handles looping (for consistency and convenience)

  • more accurate timing (using nanoTime)

  • time per iteration (not total time of all iterations)

  • just return ns/iteration - not a tuple

This is achieved here:

def profile[R](repeat: Int)(code: => R, t: Long = System.nanoTime) = {
  (1 to repeat).foreach(_ => code)
  (System.nanoTime - t) / repeat
}

For even more accuracy, a simple modification allows a JVM Hotspot warmup loop (not timed) for timing small snippets:

def profile[R](repeat: Int)(code: => R) = {
  (1 to 10000).foreach(_ => code)   // warmup
  val start = System.nanoTime
  (1 to repeat).foreach(_ => code)
  (System.nanoTime - start) / repeat
}
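
A usage sketch (the summed range is just a stand-in for the code being measured):

val nsPerIteration = profile(1000) { (1 to 1000).sum }
println(s"$nsPerIteration ns per iteration")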

Solution 11 - Function

While standing on the shoulders of giants...

A solid 3rd-party library would be more ideal, but if you need something quick and standard-library based, the following variant provides:

  • Repetitions
  • Last result wins for multiple repetitions
  • Total time and average time for multiple repetitions
  • Removes the need for time/instant provider as a param


import scala.concurrent.duration._
import scala.language.{postfixOps, implicitConversions}

package object profile {

  def profile[R](code: => R): R = profileR(1)(code)

  def profileR[R](repeat: Int)(code: => R): R = {
    require(repeat > 0, "Profile: at least 1 repetition required")

    val start = Deadline.now

    val result = (1 until repeat).foldLeft(code) { (_: R, _: Int) => code }

    val end = Deadline.now

    val elapsed = ((end - start) / repeat)

    if (repeat > 1) {
      println(s"Elapsed time: $elapsed averaged over $repeat repetitions; Total elapsed time")

      val totalElapsed = (end - start)

      println(s"Total elapsed time: $totalElapsed")
    }
    else println(s"Elapsed time: $elapsed")

    result
  }
}
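
A usage sketch, assuming the package object above is on the classpath (the profiled expression is illustrative):

import profile._

val total = profileR(5) { (1 to 1000000).sum }
// prints the average elapsed time over 5 repetitions, followed by the total elapsed time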

It is also worth noting that you can use the Duration.toCoarsest method to convert to the coarsest time unit possible, although I am not sure how well this works with minor time differences between runs, e.g.

Welcome to Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import scala.concurrent.duration._
import scala.concurrent.duration._

scala> import scala.language.{postfixOps, implicitConversions}
import scala.language.{postfixOps, implicitConversions}

scala> 1000.millis
res0: scala.concurrent.duration.FiniteDuration = 1000 milliseconds

scala> 1000.millis.toCoarsest
res1: scala.concurrent.duration.Duration = 1 second

scala> 1001.millis.toCoarsest
res2: scala.concurrent.duration.Duration = 1001 milliseconds

scala> 

Solution 12 - Function

You can use System.currentTimeMillis:

def time[R](block: => R): R = {
	val t0 = System.currentTimeMillis()
	val result = block    // call-by-name
	val t1 = System.currentTimeMillis()
	println("Elapsed time: " + (t1 - t0) + "ms")
	result
}

Usage:

time {
    // execute something here: methods, or some block of code
}

nanoTime reports nanoseconds, which can be hard to read, so I suggest using currentTimeMillis instead.

Solution 13 - Function

Adding on: a variant of the method that also takes a method name and reports the elapsed time in milliseconds.

import java.util.concurrent.TimeUnit

def profile[R](block: => R, methodName: String): R = {
  val n = System.nanoTime()
  val result = block
  val n1 = System.nanoTime()
  println(s"Elapsed time: ${TimeUnit.MILLISECONDS.convert(n1 - n, TimeUnit.NANOSECONDS)}ms - MethodName: ${methodName}")
  result
}
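
A usage sketch (the sorted list is just a placeholder workload):

import scala.util.Random

val sorted = profile(List.fill(1000000)(Random.nextInt()).sorted, "sortRandomInts")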

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type           | Original Author | Original Content on Stackoverflow
Question               | sheki           | View Question on Stackoverflow
Solution 1 - Function  | Jesper          | View Answer on Stackoverflow
Solution 2 - Function  | oxbow_lakes     | View Answer on Stackoverflow
Solution 3 - Function  | pathikrit       | View Answer on Stackoverflow
Solution 4 - Function  | missingfaktor   | View Answer on Stackoverflow
Solution 5 - Function  | Luigi Plinge    | View Answer on Stackoverflow
Solution 6 - Function  | matanster       | View Answer on Stackoverflow
Solution 7 - Function  | Dharmesh        | View Answer on Stackoverflow
Solution 8 - Function  | John Zhu        | View Answer on Stackoverflow
Solution 9 - Function  | Mario Galic     | View Answer on Stackoverflow
Solution 10 - Function | Brent Faust     | View Answer on Stackoverflow
Solution 11 - Function | Darren Bishop   | View Answer on Stackoverflow
Solution 12 - Function | haiyang         | View Answer on Stackoverflow
Solution 13 - Function | Zach            | View Answer on Stackoverflow