bitbashing

Thu, 31 Jul 2008

Impressions of Scala

I've been looking at Scala for the last couple of days. This is actually the first JVM language I've learned - I have not used Java since my freshman year of college, and never in anything like production code. What follows is basically a set of disconnected notes and thoughts about Scala that I've had so far:

I was surprised to find you could only have tuples of up to 22 elements: "error: value Tuple23 is not a member of package scala". Another language that implements the tuple type as a library instead of something built into the language is C++, using the Boost tuple template (also now included in TR1/C++0x), and it also limits the number of elements, having specific cases in its implementation for each arity. But with variadic templates (added in C++0x) one can easily and cleanly implement arbitrary arity tuples as a library, and I was surprised Scala could not express something in a similar manner.

The emacs mode was weirdly hard to find, I eventually found a copy in the Scala SVN tree, but attempting to use it with GNU Emacs 22.2 gave me nonfunctional indentation and many error messages about a missing function named comment-beginning. I finally discovered a bug entry that had the fix (add (require 'newcomment) at the top of scala-mode.el) (I could not find a reference to it in the elisp manual, I should have known to grep the sources instead!)

I wish Scala had gone the Python route of BigInt everywhere. Instead BigInt isn't even included as part of the numeric stack. Having to worry about overflow is a pain, and signed overflow as in Java is almost never what you want (it seems like you would always want either unsigned overflow with a specific word size, or never overflow and use the memory as needed - I would guess that most of the time using an extra word of memory is less of a danger than an undetected integer overflow, though there would be the danger of a remote attacker causing you to use up very large amounts of memory via a counter of some kind. Allowing the programmer to specify using a BigInt with a maximum bit size (with either an exception or back to zero on overflow) might satisfy those cases.

I am still trying to figure out how to build high performance network code in Scala. I checked out the code for scala.actors.remote.TcpService (which unfortunately is not general enough to use outside of actors anyway), which uses one thread to listen for new connections and then one thread per client socket. But the whole point of using event-driven actors is to avoid using many (heavier) JVM threads, and as seen by Twisted, event-based actors can scale very well in this area. I'm not sure how to write a fast select/poll-style network loop across many sockets, with an actor (but not a thread) per socket connection.

Terracotta (which can be used to cluster Scala) seems worth more investigation, though I am pretty cautious about distributed shared memory as a viable clustering mechanism, especially when latency is higher than LAN speeds.

How difficult would it be to implement Pluribus in Scala? Could the E-on-Java implementation be used?

I would like to investigate if it is possible to implement facets. That the actors library is able to provide such a clean syntax gives me hope this is doable, it will require me understanding much more about Scala's type system than I do now. Perhaps code and/or ideas from Joe-E could be used.

Tony Morris posed a set of Scala exercises for beginners, here are my solutions. I should go back and do these again using pattern matching.

object Exercises {
  def succ(n: Int) = n + 1
  def pred(n: Int) = n - 1

  def add(x: Int, y: Int): Int =
    if (y == 0) x
    else if (y < 0) add(pred(x),succ(y))
    else add(succ(x),pred(y))

  def sum(x: List[Int]): Int =
    x.foldLeft(0)(add)

  def length[A](x: List[A]): Int =
    x.foldLeft(0) { (x, y) => add(x, 1) }

  def map[A, B](x: List[A], f: A => B): List[B] =
    x.foldRight(List[B]()) { (e, lst) => f(e) :: lst }

  def filter[A](x: List[A], f: A => Boolean): List[A] =
    x.foldRight(List[A]()) { (e, lst) => if (f(e)) e :: lst else lst }

  def append[A](x: List[A], y: List[A]): List[A] =
    x.foldRight(y) { (e, lst) => e :: lst }

  def concat[A](x: List[List[A]]): List[A] =
    x.foldRight(List[A]()) { (e, lst) => append(lst, e) }

  def concatMap[A, B](x: List[A], f: A => List[B]): List[B] =
     return concat(map(x, f))

  def maximum(x: List[Int]): Int =
     if (x.length == 0) error("No maximum of a zero length list")
     else x.tail.foldLeft (x.head) { (x,y) => if (x > y) x else y }

  def reverse[A](x: List[A]): List[A] =
     x.foldLeft(List[A]()) { (lst, e) => e :: lst }
}

In Scala By Example, I noticed the use of POLA in the the Auction actor. The only message sent to any of the bidders that contains a reference to the seller is AuctionConcluded, is only sent to the winner of the auction (and the seller). Here, the auction object is acting as a trusted introducer between the winner of the auction and the seller. Scala's actors implementation seems to allow sending any message to any actor that one has a reference to. That means someone wishing to fake an action could actually spam his reachable graph of objects with AuctionConcluded messages, in the chance that it has some other reference to the seller. This can be worked around by the seller by comparing the sender of the AuctionConcluded message with the seller actor that it previously spawned.

I ran some simple benchmarks on a quad processor machine, passing messages along a ring of actors. With identical workloads, the single threaded reactor took 95 seconds to complete, while the FJTaskScheduler2 (weird name, but seemingly the fastest scheduler included in the actors library) took 31 seconds wall time and 119 seconds total CPU time.

It seems strange to me that List.foldLeft and List.foldRight enforce their curriedness. But x.foldLeft(0, add) is not valid, you have to used x.foldLeft(0)(add).

What happens when an actor fails with an uncaught exception? (I will write code to test this later)

posted 2008/07/31 00:23 [category: bitbashing / programming]

< Powerset | Daily Links Are Gone >