Bay Area Clojure Meetup

While I was out at JavaOne, I attended the Bay Area Clojure User Group meetup that was held at 3VR. Here in St. Louis, the Lambda Lounge and the St. Louis Java User Group have had some Clojure talks, but there are hardly enough people to form a regular group. I was excited to see people talk about doing real stuff with Clojure and also hear from Rich Hickey, the creator of Clojure.

It turned out to be a great meeting (even if beset by the canonical “how do I hook up a computer to a projector” blues – why is this not a solved problem yet?) and I think there were about 60 people there, with a bunch wandering over from JavaOne like me.

There was a series of talks at the beginning and then Rich took the floor for a while. I’m going to write about them in reverse order here though.

Rich did a short talk about his most recent work on chunked sequences. I know there is a lot of history in the current sequence stuff and some speculative work that was done on streams but I don’t know the details. Rich talked about the sequence model, the importance of laziness, and how the model’s uniform appearance across the data structures is a really key aspect of Clojure. The problem is that it’s fundamentally built on a Lisp cons cell and that forces a single-step model where each step requires three levels of overhead.

The work on chunked sequences is designed to create sequences n steps at a time, thus amortizing the overhead over n evaluations. Also, it seems that in some of the persistent data structures (like maps), the structure of the data structure itself can be leveraged to make this even better.

It struck me how many times in software I have seen data flow problems start with either “all data” or “one at a time” approaches only to ultimately settle on a batched (or chunked or paged) approach that balances the memory issues inherent in “all at once” vs the overhead issues of “one at a time”.
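
To make the amortization concrete, here is a rough sketch of the idea in Python (not Clojure, and certainly not Rich’s actual implementation). The names lazy_map and chunked_map, the log list used to count per-step overhead, and the chunk size of 32 are all my own assumptions for illustration.

```python
from itertools import islice

# Illustrative only: these helpers and the chunk size of 32 are assumptions,
# not Clojure's implementation.

def lazy_map(fn, items, log):
    """Realize one element per step; every element pays the step overhead."""
    for item in items:
        log.append("step")  # stand-in for the per-step bookkeeping cost
        yield fn(item)

def chunked_map(fn, items, log, chunk_size=32):
    """Realize up to chunk_size elements per step, amortizing the overhead."""
    it = iter(items)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        log.append("step")  # one unit of overhead for the whole chunk
        results = [fn(x) for x in chunk]  # the whole chunk is computed eagerly
        yield from results

single_log, chunk_log = [], []
one_at_a_time = list(lazy_map(lambda x: x + 1, range(100), single_log))
in_chunks = list(chunked_map(lambda x: x + 1, range(100), chunk_log))

print(one_at_a_time == in_chunks)  # True: same values either way
print(len(single_log))             # 100 steps
print(len(chunk_log))              # 4 steps (32 + 32 + 32 + 4 elements)

# Laziness trade-off: asking for one element realizes a full chunk.
seen = []
first = next(chunked_map(lambda x: seen.append(x) or x, range(100), []))
print(first, len(seen))            # 0 32
```

The last few lines show the other side of the bargain: asking for a single element realizes a whole chunk, so chunking gives up a little laziness in exchange for less per-element overhead.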

A lot of the chunked work is already in trunk and so far it does seem to be yielding worthwhile gains. After the chunked sequences overview, Rich just took questions for a while. I took some notes because in many ways this was my favorite part of the meeting.

On how he got into creating Clojure – originally Rich owned a recording studio and started developing software in C++ to do stuff at the studio. He later did some Java and discovered Lisp and fell in love. He made several unsuccessful attempts to create ways to call between Lisp and Java so he could use one from the other. Ultimately, he ended up taking a 2-year full-time sabbatical to really try to put things together into what ultimately became Clojure.
On promoting Clojure – Rich mentioned that he has only ever posted one note on a mailing list to promote Clojure and interest grew like crazy without any explicit promotion after that. Certainly lots of other folks (and Rich himself) have done a great job communicating about the language since.
On Git and community help – Rich has heard the desire in the community for a distributed source repo and is actively exploring both Git and Mercurial. He wants to make sure that the workflow that results is something that works for him and that sufficient tooling exists. When asked about community help, he said one of the biggest ways people can help is by supplying patches for issues in the bug tracker. Having a good distributed version control system should be a big enabler in that respect as well.
On the Clojure community – someone asked how the Clojure community came to be such a helpful and friendly one and Rich humbly said he had nothing to do with it. That’s obviously NOT true from my perspective. The attitude and actions of the creator have an enormous impact on how people approach the language and what kind of reception they get and I think Rich has done a tremendous job of being extremely helpful and welcoming to first questions on the mailing list and elsewhere. In particular he tends to focus on solving real problems, making Clojure useful for real work, and making Clojure the best language it can be but not at the expense of tearing down other languages or work. The community has followed his lead.
On testing – Rich somewhat famously is not big on writing a lot of tests and someone asked about that. This is one of those issues that can be religious and I’ll try to fairly re-state his comments here but I apologize in advance if I misrepresent his views – if so, that’s my fault, not his. Rich said that he believes his greatest value is in spending a lot of time thinking about the issues and doing design work and a smaller amount of time doing implementation. For him, the large amount of up-front thinking and design work means that the implementation is comparatively straightforward and that adding exhaustive tests for it is not the best use of his time. In other words, while tests have value, they aren’t the way to maximize his value. He greatly appreciates the contributions that have been made to the test libraries and he does run the regression tests before commits to the language. He mentioned that if he spent all his time writing tests, he would probably not be a programmer. :)

I think the other interesting point he made was that the “testing and TDD” culture may be more a product of having large Java OO systems where testing is used as a crutch to tell you whether things far apart stop working together. The whole idea of Clojure is that these systems can be made far simpler to the point where some of those kinds of tests are no longer even needed. I have mixed feelings about some of that but I think it’s an interesting conjecture.
On JSR 292 – I asked whether JSR 292 and invokedynamic would open the doors to greater performance gains (as they are with JRuby) and he said actually he didn’t think so. It’s a big benefit in dynamic languages like Ruby or Python but much less so in Clojure, which already does a great deal of that work statically. Things like fixnums (small fixed-precision integers that can be represented without boxing) and tail calls would be very useful in improving performance, but that shouldn’t be surprising since many of the most popular JVM languages want those too.
On Clojure in Clojure – Rich would like to move the bulk of the Clojure implementation out of Java and into Clojure itself but there are still a few important things that are needed before that is feasible. In particular, there were some things around proxies (although I don’t recall the details). He mentioned that this should make things like Clojure development and ports to other VMs much, much easier. Someone is working on a .NET version of Clojure but Rich called it a “heroic” effort at the moment – moving to Clojure in Clojure would be a big step forward for efforts like that.

Some notes from the early talks:

Emacs integration with Clojure – the first talk was apparently a follow-up from a previous talk showing more about Clojure integration with Emacs. Most importantly, it showed how you could debug your Clojure code in Emacs by connecting to jdb and showing Clojure code as you walked. I’m not an Emacs guy (some day, I’ll take the time to really grok it, I swear) but it seemed pretty useful. The speaker (unfortunately, I didn’t catch his name) looked specifically at what Clojure is producing in byte code when compiling Clojure code. He decompiled the byte code to show the Java pseudo-equivalent. That was actually interesting because the decompiler choked on it (as they often do) since the code was actually nulling the local fields before making a tail call to avoid retaining a reference. That’s actually not possible to represent in Java hence the decompiler chokage.

Rich also pointed out that the symbol lookups were being done in static constants, which is a key to some of Clojure’s performance.
Swarmiji – the next speaker went over a distributed service framework being developed at Runa, which is using Clojure. The actual distributed communication was being handled by a RabbitMQ messaging backbone (which is Erlang-based).
Clojure Object Explorer – this speaker demo’ed some improvements to an explorer for a Clojure object graph, with the ability to expand to variable depth and breadth, explore interactively, and display in multiple formats. Cool stuff.
IntelliJ Clojure support – Alexei from IntelliJ went over some of the latest improvements in their Clojure support and there are definitely a lot of goodies there for things like code completion and function rename replacement.
Flightcaster – the final talk was in some ways the most interesting to me as it was about this startup’s use of Clojure to do actual stuff. The idea of the company is to use public and private data sources to predict flight delays. It uses some machine learning techniques like Bayes’ law and conditional probability now, with more stuff planned for support vector machines and other techniques in the future. The code is all in Clojure and from what we looked at, it was pretty compact. He had built a DSL within Clojure that got decently close to what he needed for representing some of the probability stuff. They’re also using Hadoop for some map/reduce stuff.
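
As a reminder of what Bayes’ law looks like in this setting, here is a minimal Python sketch. It is not the startup’s code or DSL (which I only saw briefly), and every probability below is a made-up number; the function name and the idea of a single piece of “evidence” (say, a storm at the departure airport) are hypothetical.

```python
def delay_given_evidence(p_delay, p_evidence_given_delay, p_evidence_given_on_time):
    """Bayes' law: P(delay | evidence) = P(evidence | delay) * P(delay) / P(evidence)."""
    p_on_time = 1.0 - p_delay
    # Total probability of seeing the evidence at all (delayed or on time).
    p_evidence = (p_evidence_given_delay * p_delay
                  + p_evidence_given_on_time * p_on_time)
    return p_evidence_given_delay * p_delay / p_evidence

# Hypothetical inputs, invented purely for illustration:
# 20% of flights are delayed; the evidence shows up for 50% of delayed
# flights and 10% of on-time flights.
p = delay_given_evidence(p_delay=0.2,
                         p_evidence_given_delay=0.5,
                         p_evidence_given_on_time=0.1)
print(round(p, 3))  # 0.556
```

With these invented numbers, seeing the evidence raises the estimated chance of a delay from 20% to roughly 56%. A real predictor would combine many such signals, which is presumably where the DSL earns its keep.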

In all, a great meeting and well worth my time – I’m glad I made the extra trip.

Pure Danger Tech
