Pure Danger Tech


JavaOne: Collections Connection

03 Jun 2009

For my last serious BOF of the night, I went to the Collections Connection with Josh Bloch, Martin Bucholz, and Kevin Bourillion. The under-stated goal of said BOF was to “Achieve world domination with the Java Collections Framework”. I believe they got 42.9% of the way there. In all, it was an extremely useful roundup of what’s coming down the pike in Java 7 collections.

Collection literals and array-like access – I covered this in the previous Project Coin blog – these are currently under consideration for Coin, so cross your fingers. Examples:

List<String> list = ["a", "b", "c"];
		String firstElement = list[0];
		Map<Integer, String> map = {1 : "One"};
		List<String> list = ...;
		String firstElement = list[0];
		Map<Integer, String> map = ...;
		map[1] = "one";

fork/join from JSR 166y – it’s likely that the fork/join framework (but not the ParallelArray lib) will be included in Java 7.

This library provides fine-grained parallelism by recursively breaking down tasks and accumulating results.

TransferQueue – also part of JSR 166y, TransferQueue and it’s interface LinkedTransferQueue are basically extensions to BlockingQueue that will provide an additional method tryTransfer(Task) that adds a non-blocking way to transfer a task to a thread that might be waiting on a queue. If no thread is waiting, the method returns immediately.

The big win here is that this allows ThreadPoolExecutor to be improved – currently it will spin up new threads even if there is a thread idle that could be used. There are good reasons for this but TransferQueue will eliminate those reasons.

I blogged about this some more a while back.

ConcurrentLinkedQueue improvements – Martin talked about several improvements being made to ConcurrentLinkedQueue:

  • When adding a new node to the end of the queue, the current implementation requires two volatile writes. With some tricky and the verboten use of sun.misc.Unsafe, they can trim that down to a single volatile write.
  • The current implementation can happen to leak nodes when the queue is being modified and there are some issues that can occur where the object structure is bad for GC (I blogged about this a while back).
  • Head and tail nodes can lag behind their actual head and tail in the current impl which is kind of a feature. I think Martin said that they would update every other modification to cut down on these moves but maybe I’m not sure that’s right now.

Improved Collections/Arrays.sort() – Josh has been working on taking advantage of the really awesome sort work done by Tim Peters in Python. This improvement has some wonderful properties. It is a stable, adaptive, iterative mergesort and it is written to take advantage of any partial order in the list being sorted – the order doesn’t have to be complete; any order is helpful. In a partially sorted list, you can see dramatic speed improvements. In the case of random lists, the performance is approximately the same as the existing sort implementation.

As a bonus, the memory usage is also reduced. In the case of mostly sorted lists, only a few extra elements are needed and in worst case n/2 elements are needed (as compared to current n).

Phasers – I’ve written about these in the past – basically improved CyclicBarriers.

ConcurrentReferenceMap – I asked about this in the Q&A and it sounds promising that Bob Lee and Doug Lea will have something put together for the ConcurrentReferenceMap based on the Google Collections MapMaker builder.

Google collections – Kevin Bourillion did a very high level overview of some nice stuff in the Google collections library, which is 1.0 RC2 as of today. Hopefully many of these improvements can be moved into the JDK in the future. Highlights:

  • Immutable collections
  • Multisets, multimaps
  • MapMaker – for creating maps that are optionally concurrent, have soft/weak/strong keys/values, timed expiration, and lazy value computation

Great BOF – I love this stuff…