Pure Danger Tech



Running rings around Scala

05 Jan 2009

Recently I did a post on an actor benchmark with Erlang. I’ve ported that to Scala just to get my feet wet. So far, I’d say the learning curve for Scala has been a bit steeper for me. I’m not sure if that’s just because I’ve approached learning it in a haphazard way (which I have) or whether it’s just harder to grok.

It’s hard not to feel the intensity of thought that went into the design of Scala. At first glance there appears to be a lot of syntax to deal with, although on closer inspection most of it falls out of operators and other extensibility mechanisms built into the language. That’s both very cool and also quite daunting. I’m just a few days in, so I don’t really trust my own impressions yet; I need way more experience with it first.

Anyhow, to the code… I’m not going to explain it much as it’s a pretty straight port from the Erlang version in the previous post. I suspect seasoned Scala programmers will laugh at this but I’m ok with that. Feel free to leave constructive (or non-constructive but funny) comments if you have any suggestions.

I created one object (in Scala objects are true singletons and classes are like Java classes) named Ring. It has a main (so I can run it) and a method to start up the ring. The main takes one arg, which is the number of nodes in the ring. First I construct the TimerActor that the NodeActors use later to time the ring. Then I create all the nodes. And finally I connect each node to the next in the ring. Some syntax notes for the Scala newbs: semicolons are optional, Unit is like void in Java, array access is with () not [], and ! is the Erlang send operator ported into the actor library (it’s just a method call on a method named “!”).
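Before the full listing, here is a tiny standalone sketch of two of those syntax notes. It does not use the actor library at all: `Mailbox` and `SyntaxNotes` are hypothetical names made up for illustration, and the point is only that `!` can be defined and called like any other method, and that `xs(0)` is how array elements are read and written.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical class for illustration only; NOT part of scala.actors.
class Mailbox {
  private val received = ListBuffer[Any]()

  // "!" is an ordinary method name in Scala.
  def !(msg: Any): Unit = { received += msg; () }

  def messages: List[Any] = received.toList
}

object SyntaxNotes {
  def demo(): (List[Any], Int) = {
    val box = new Mailbox
    box ! "hello"     // operator-style call
    box.!("world")    // the same method, called with dot syntax

    val xs = new Array[Int](3)
    xs(0) = 42        // array write uses (), not []
    (box.messages, xs(0))
  }
}
```

Both calls on `box` go through the same method, which is why the actor library can offer an Erlang-looking send without any special language support.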

[source:scala]
import scala.actors._
import scala.actors.Actor._

object Ring {
  def main(args: Array[String]): Unit = {
    val node = startRing(args(0).toInt)
    node ! StartMessage
  }

  def startRing(n:Int): NodeActor = {
    val nodes = spawnNodes(n, startTimer())
    connectNodes(n, nodes)
    return nodes(0)
  }

  def startTimer(): TimerActor = {
    val timer = new TimerActor
    timer.start
    return timer
  }

  def spawnNodes(n:Int, timer:TimerActor): Array[NodeActor] = {
    println("constructing nodes")
    val startConstructing = System.currentTimeMillis
    // One extra slot so that nodes(n) can alias nodes(0) and close the ring
    val nodes = new Array[NodeActor](n+1)
    for(i <- 0 until n) {
      nodes(i) = new NodeActor(i, timer, null)
      nodes(i).start
    }
    val endConstructing = System.currentTimeMillis
    println("Took " + (endConstructing-startConstructing) + " ms to construct " + n + " nodes")
    return nodes
  }

  def connectNodes(n:Int, nodes: Array[NodeActor]) = {
    println("connecting nodes")
    nodes(n) = nodes(0)
    for(i <- 0 until n)
      nodes(i).connect(nodes(i+1))
  }
}
[/source]

And then we have the actors. Actors send messages (with !), which are put into the receiving actor’s mailbox; inside an actor, you pattern match messages from the mailbox and act on them. So, basically very close to the Erlang model. There are two forms of actors however - you can use “react” to make lightweight actors that are not tied to a real thread, which is very similar to Erlang. Or you can use “receive” to get an actor backed by a real (Java) thread. It’s actually kind of nice to have this flexibility. I kind of wish it didn’t require you to choose different methods though; some way of using the same code but specifying on actor creation which model to use seems like it would be cleaner.

Messages are best represented as Scala case classes - these are basically just immutable objects we can pass around. Here I define some case objects (true singletons) and a case class for the message token:

[source:scala]
case object StartMessage
case object StopMessage
case object CancelMessage
case class TokenMessage(id:Int, value:Int)
[/source]

And here’s the NodeActor, which deals with three message types: Start / Stop / Token. Start causes messages to be sent around the ring with an initial token value of 0 and this node’s nodeId as the source. The Token is just passed on unless we hit the 1 millionth time around, in which case we start sending Stop messages around to kill off the ring.

[source:scala]
class NodeActor(id:Int, timer:TimerActor, var nextNode:NodeActor) extends Actor {
  val nodeId: Int = id

  def connect(node:NodeActor) = nextNode = node

  def act() {
    loop {
      react {
        case StartMessage => {
          log("Starting messages")
          timer ! StartMessage
          nextNode ! TokenMessage(nodeId, 0)
        }
        case StopMessage => {
          log("Stopping")
          nextNode ! StopMessage
          exit
        }
        // The token has come back to the node that started it: one full lap
        case TokenMessage(id,value) if id == nodeId => {
          val nextValue = value+1
          if(nextValue % 10000 == 0)
            log("Around ring " + nextValue + " times")
          if(nextValue == 1000000) {
            timer ! StopMessage
            timer ! CancelMessage
            nextNode ! StopMessage
            exit
          } else {
            nextNode ! TokenMessage(id, nextValue)
          }
        }
        // Any other node just forwards the token unchanged
        case TokenMessage(id,value) => {
          nextNode ! TokenMessage(id,value)
        }
      }
    }
  }

  def log(msg: String) {
    println(System.currentTimeMillis() + " " + nodeId + ": " + msg)
  }
}
[/source]
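To make the message flow easier to follow, here is a rough single-threaded sketch of the same protocol that runs without any actor library. Everything in it is an assumption for illustration: the names `Msg`, `Start`, `Stop`, `Token` and `RingSim` are made up, a plain queue stands in for the mailboxes, and node 0 is assumed to be the starting node. It measures nothing; it only counts how many token messages get delivered.

```scala
import scala.collection.mutable

// Hypothetical message types mirroring the case classes in the post.
sealed trait Msg
case object Start extends Msg
case object Stop extends Msg
final case class Token(id: Int, value: Int) extends Msg

object RingSim {
  // Simulates a ring of n nodes started at node 0 and returns the number
  // of Token messages delivered before the ring shuts down.
  def run(n: Int, laps: Int): Long = {
    // Each queue entry is (destination node index, message).
    val queue = mutable.Queue[(Int, Msg)]((0, Start))
    var tokensDelivered = 0L
    while (queue.nonEmpty) {
      val (node, msg) = queue.dequeue()
      val next = (node + 1) % n
      msg match {
        case Start =>
          queue.enqueue((next, Token(node, 0)))
        case Token(id, value) =>
          tokensDelivered += 1
          if (id == node) {
            // Back at the originating node: one lap completed.
            val nextValue = value + 1
            if (nextValue == laps) queue.enqueue((next, Stop))
            else queue.enqueue((next, Token(id, nextValue)))
          } else {
            // Every other node forwards the token unchanged.
            queue.enqueue((next, Token(id, value)))
          }
        case Stop =>
          // Forward Stop until it would reach node 0, which has already exited.
          if (next != 0) queue.enqueue((next, Stop))
      }
    }
    tokensDelivered
  }
}
```

Under these assumptions `RingSim.run(n, laps)` delivers `n * laps` token messages, which is where the figure of 100 million messages for 100 nodes and 1,000,000 laps comes from.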

Also, here’s the TimerActor:

[source:scala]
class TimerActor() extends Actor {
  private var timing: Boolean = false
  private var startTime: Long = 0

  def act() {
    loop {
      receive {
        case StartMessage if !timing => {
          startTime = System.currentTimeMillis()
          timing = true
        }
        case StopMessage if timing => {
          val end = System.currentTimeMillis()
          println("Start=" + startTime + " Stop=" + end + " Elapsed=" + (end-startTime))
          timing = false
        }
        case CancelMessage => {
          exit
        }
      }
    }
  }
}
[/source]
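The guards on those cases (`if !timing`, `if timing`) make the timer a small two-state machine. Here is a minimal sketch of just that logic as a pure function, with no actors involved. The names `TimerMsg`, `Begin`, `End`, `TimerState` and `TimerLogic` are hypothetical, and the clock value is passed in so the behaviour is repeatable. One deliberate simplification: a message that matches no case is simply ignored here, whereas with a real mailbox, as I understand it, an unmatched message stays queued.

```scala
// Hypothetical types for illustration; they only mirror the TimerActor's state.
sealed trait TimerMsg
case object Begin extends TimerMsg
case object End extends TimerMsg

final case class TimerState(timing: Boolean, startTime: Long, lastElapsed: Option[Long])

object TimerLogic {
  val initial: TimerState = TimerState(timing = false, startTime = 0L, lastElapsed = None)

  // One transition of the state machine; `now` is the current time in ms.
  def step(s: TimerState, msg: TimerMsg, now: Long): TimerState = msg match {
    case Begin if !s.timing => s.copy(timing = true, startTime = now)
    case End if s.timing    => s.copy(timing = false, lastElapsed = Some(now - s.startTime))
    case _                  => s // guard failed: this sketch ignores the message
  }
}
```

For example, stepping from `TimerLogic.initial` with `Begin` at time 100 and then `End` at time 400 leaves `lastElapsed` at `Some(300)`, and a second `Begin` sent while already timing changes nothing.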

And finally, let’s run it and get some timings:

[source:scala]
$ scalac Ring.scala
$ scala -cp . Ring 100
constructing nodes
Took 10 ms to construct 100 nodes
connecting nodes
1231219721752 0: Starting messages
1231219723493 0: Around ring 10000 times
1231219724818 0: Around ring 20000 times
..etc
1231219871405 0: Around ring 990000 times
1231219873267 0: Around ring 1000000 times
Start=1231219721758 Stop=1231219873269 Elapsed=151511
[/source]

It took 10 ms to construct 100 nodes (compared to 0.2 ms in Erlang). It took 152 seconds to send a total of 100 million messages around the ring, or about 660,000 messages per second (compared with about 1.3 million messages per second in Erlang). I also tested constructing 20000 nodes in Scala, as in Erlang. That took 345 milliseconds in Scala (it was 120 milliseconds in Erlang).
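For anyone who wants to check the arithmetic, here is a small sketch of how those rates fall out of the elapsed times. The `Throughput` object and its method name are made up for this example; the inputs are the 100 nodes, 1,000,000 laps, 151511 ms (Scala) and 77354 ms (Erlang) reported in this post.

```scala
object Throughput {
  // Messages per second, given a message count and an elapsed time in ms.
  def perSecond(messages: Long, elapsedMs: Long): Double =
    messages.toDouble * 1000.0 / elapsedMs

  def main(args: Array[String]): Unit = {
    val total = 100L * 1000000L                   // 100 nodes x 1M laps
    val scalaRate  = perSecond(total, 151511L)    // roughly 660 thousand
    val erlangRate = perSecond(total, 77354L)     // roughly 1.29 million
    println(f"Scala:  $scalaRate%.0f msgs/sec")
    println(f"Erlang: $erlangRate%.0f msgs/sec")
    println(f"Erlang/Scala: ${erlangRate / scalaRate}%.2f")
  }
}
```

The ratio comes out just under 2, which matches the "about 2x" comparison for message sending.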

With the caveat that this benchmark is a bunch of crap, I’ll say Erlang was about 3x faster at spawning processes and about 2x faster at sending messages than Scala. I can also tell you anecdotally that the cpu ran noticeably hotter with Scala than Erlang. Between the two, I enjoyed writing the Erlang one more although Scala definitely had better error messages, especially when using the Eclipse plugin as an IDE.

UPDATE: The numbers above were from Scala 2.7.2. At Philipp’s suggestion, I re-ran with Scala 2.7.3 and did see a noticeable bump in performance. Here are all the #s:

| Language    | JDK         | Spawn 100 | Send 100M messages | Spawn 20k |
|-------------|-------------|-----------|--------------------|-----------|
| Erlang R12B |             | 0.2 ms    | 77354 ms           | 120 ms    |
| Scala 2.7.2 | jdk 1.6 (1) | 10 ms     | 151511 ms          | 345 ms    |
| Scala 2.7.2 | jdk 1.6 (2) | 8 ms      | 306866 ms          | 356 ms    |
| Scala 2.7.3 | jdk 1.6 (1) | 10 ms     | 121712 ms          | 315 ms    |
| Scala 2.7.3 | jdk 1.6 (2) | 13 ms     | 334774 ms          | 410 ms    |
| Scala 2.7.3 | jdk 1.5 (3) | 9 ms      | 578093 ms          | 124 ms    |

JDK detail:

  1. jdk 1.6 (SoyLatte 1.0.2): build 1.6.0_03-p3-landonf_03_feb_2008_02_12-b00
  2. jdk 1.6 (Apple): build 1.6.0_07-b06-153
  3. jdk 1.5 (Apple): build 1.5.0_16-133

So, some interesting numbers there. Surprisingly, JDK 1.5 is the fastest at creating 20k actors and the latest Apple JDK 1.6 is the slowest. That’s a little puzzling. I’m not sure what to make of the other numbers either. SoyLatte 1.6 definitely seems to run substantially faster than the Apple 1.6 at sending messages and shows some improvement with Scala 2.7.3. The Apple 1.6 numbers are actually worse with Scala 2.7.3 though. I suspect this has a lot to do with me just soaking the CPUs on this box. So, I wouldn’t put much faith in any of these.