JSR 166 Concurrency Updates Hit JDK 7

Doug Lea posted a note today on the concurrency-interest list that the bulk of the JDK 7 changes from JSR 166y (the second maintenance update) have been pushed in the latest JDK 7 M5 snapshots. You can find the API for these changes here.

The major updates are:

Phasers

A Phaser is a bit like a CyclicBarrier but more flexible. A CyclicBarrier lets threads repeatedly meet at a synchronization point, and it is something I use all the time to control multi-threaded tests or applications.

Phaser expands on CyclicBarrier with several additional features:

- Allows party count to change over time – this is something I would like to do all the time
- Phases are numbered and threads know the phase count
- More flexible task support – CyclicBarrier allows a single synchronization action to be registered, but Phaser allows for more flexible and dynamic actions
- Termination – explicit support for termination phase
- Tiered trees of phasers

Phaser will prove to be one of our most important day-to-day Java concurrency primitives in the future, I suspect.
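To make a few of those features concrete, here is a minimal sketch of a Phaser in use. It shows dynamic registration, numbered phases, an `onAdvance` override standing in for a barrier action, and termination; it does not touch tiering. The class name `PhaserSketch` and the method `runPhases` are my own illustrative names, not part of the JDK.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Phaser;

// Hypothetical example class; only Phaser itself comes from the JDK.
class PhaserSketch {

    // Runs `workers` threads through `phases` phases and returns the phase
    // numbers seen by onAdvance, in the order the phases completed.
    static List<Integer> runPhases(int workers, final int phases) {
        final List<Integer> completed = new CopyOnWriteArrayList<Integer>();

        final Phaser phaser = new Phaser() {
            @Override
            protected boolean onAdvance(int phase, int registeredParties) {
                completed.add(phase); // runs once per phase, like a barrier action
                // Returning true terminates the phaser.
                return phase >= phases - 1 || registeredParties == 0;
            }
        };

        phaser.register(); // the calling thread holds phase 0 open during setup
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            phaser.register(); // party count grows as workers are added
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    while (!phaser.isTerminated()) {
                        // ... per-phase work would go here ...
                        phaser.arriveAndAwaitAdvance();
                    }
                }
            });
            threads[i].start();
        }
        phaser.arriveAndDeregister(); // caller drops out; party count shrinks

        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(runPhases(3, 3)); // prints [0, 1, 2]
    }
}
```

With a CyclicBarrier the party count is fixed at construction, so the "register as you go, then deregister the coordinator" pattern above would need extra bookkeeping.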

TransferQueue

The new TransferQueue interface and its new implementation, LinkedTransferQueue, clean up a bunch of things in the collection queues and provide an interface for a queue where the producer can block until a consumer arrives and consumes an item. We already have one queue with this behavior in the JDK in SynchronousQueue, which is effectively a 0-length queue: it requires that both producer and consumer arrive to achieve a hand-off. This turns out to be an extremely useful collection for handing a result from one thread to another thread that will continue processing.
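As a rough illustration of that blocking hand-off, the sketch below has a producer call `transfer()`, which does not return until a consumer has taken the element, and also shows that `tryTransfer()` returns false when no consumer is waiting. `TransferQueueSketch`, `handOff`, and `tryWithoutConsumer` are names I made up for the example.

```java
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.TransferQueue;

// Hypothetical example class; TransferQueue and LinkedTransferQueue are the JDK types.
class TransferQueueSketch {

    // Hands `message` from the calling (producer) thread to a consumer thread
    // and returns what the consumer received.
    static String handOff(String message) {
        final TransferQueue<String> queue = new LinkedTransferQueue<String>();
        final String[] received = new String[1];

        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    received[0] = queue.take(); // blocks until an element arrives
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        consumer.start();

        try {
            // Unlike put(), transfer() waits until a consumer has taken the element.
            queue.transfer(message);
            consumer.join(); // join makes the write to received[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received[0];
    }

    // With no consumer waiting, tryTransfer() gives up immediately and
    // does not enqueue the element.
    static boolean tryWithoutConsumer() {
        TransferQueue<String> queue = new LinkedTransferQueue<String>();
        return queue.tryTransfer("nobody is listening");
    }

    public static void main(String[] args) {
        System.out.println(handOff("result"));
        System.out.println(tryWithoutConsumer());
    }
}
```

The same queue still supports plain `put()`/`offer()`, so a producer can choose per call whether to wait for a consumer or just enqueue and move on.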

I’ve blogged at some length about the uses for TransferQueue here so follow that if you’re interested. One of the big potential uses for this kind of queue is at the heart of work hand-off in an ExecutorService, so this should allow greater performance and more thread creation flexibility in that area.

Fork-join

The fork-join library provides support for fine-grained, divide-and-conquer style parallelism. You can think of many of the classes added in JDK 5 for queues, executors, etc. as excellent building blocks for coarse-grained parallelism (on the level of a “task” or “transaction”). Fork-join works at a lower level, letting you process subsets of a large data set in parallel. You can examine the interface in ForkJoinPool.

As we start to hit a higher number of cores in our systems, we can take advantage of those cores with algorithms that take a large data set, break it into chunks, work on the small chunks in parallel, and then recursively combine the results from small to big. Many sorting, filtering, and searching algorithms can be structured in this way.
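Here is a small sketch of that split-and-combine shape, summing an array with a `RecursiveTask` run on a `ForkJoinPool`. The class names `ForkJoinSumSketch` and `SumTask`, and the threshold of 1000 elements, are arbitrary choices of mine for illustration rather than anything prescribed by the library.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example class; ForkJoinPool and RecursiveTask are the JDK types.
class ForkJoinSumSketch {

    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1000; // arbitrary cut-off for this sketch
        private final long[] data;
        private final int lo;
        private final int hi;

        SumTask(long[] data, int lo, int hi) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= THRESHOLD) {
                // Small enough: just do the work sequentially.
                long sum = 0;
                for (int i = lo; i < hi; i++) {
                    sum += data[i];
                }
                return sum;
            }
            // Otherwise split in half, fork one side, compute the other here.
            int mid = (lo + hi) >>> 1;
            SumTask left = new SumTask(data, lo, mid);
            SumTask right = new SumTask(data, mid, hi);
            left.fork();
            long rightSum = right.compute();
            return left.join() + rightSum;
        }
    }

    static long parallelSum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool(); // defaults to one worker per available core
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[10000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data)); // prints 50005000
    }
}
```

Forking one half and computing the other in the current thread keeps the worker busy instead of parking it while both children run elsewhere.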

Fork-join has already become a critical library for other languages even before it is officially available in the JDK. Languages like Scala, Clojure, Fortress, and Groovy (GPars) are taking advantage of it as the core of various parallel libraries or core language constructs. While I don’t think most people will be using fork-join explicitly, I suspect lots of people in the coming years will be using it under other APIs that rely on it.

Some other candidates that appear not to have made it in so far are:

- Fences – The Fences API is an attempt to allow for a low-level ability to specify memory orderings. It doesn’t fall nicely into an API as it’s really trying to surface a lower level of abstraction that ideally you should never have to think about, but for high-performance concurrency work it is actually helpful. Due to a lot of discussion and feedback, the Fences API is currently being reworked as just a set of updates to the java.util.concurrent.atomic.Atomic* classes. So this work is still planned but pending.
- ConcurrentReferenceHashMap – The ConcurrentReferenceHashMap class arises from the need to have some class that a) is concurrent, b) supports either identity or equality semantics on keys and c) has options for strong/soft/weak keys and values. Existing classes like ConcurrentHashMap, IdentityHashMap, and WeakHashMap each serve some of those requirements but nothing in the JDK serves them all. Jason Greene submitted a candidate for this but progress has been a bit slow so far, maybe because the JSR 166 group has been wanting to use the ephemeron concept to implement it, and that would require more changes (and maybe JVM support?).
- Concurrent LRU – Various people talked about adding to the JDK a concurrent approximate-LRU map that would be suitable for basic caching. It appears that this work will not be far enough along to make it into JDK 7 but it sounds like it’s still useful enough that work on it will continue. We’ve done some of this inside Terracotta’s various cache products (in distributed form) and it sounds like the Google Collections team is also considering adding support to MapMaker for bounded maps.

  Brief Doug Lea note here on why CRHM and concurrent LRU aren’t (yet) in the JDK 7 plans.
- ParallelArray – The ParallelArray library builds on top of fork-join and provides a functional style API for mapping, filtering, reducing, etc. over an array of Java objects. Without closure support, the API is not particularly pretty, but I think it’s eminently useful. The decision was made not to standardize it in the JDK yet but you can still download it from the JSR 166 site directly.

Pure Danger Tech
