Pure Danger Tech


Why YOU should use Integer.valueOf(int)

01 Feb 2007

In particular, why you should use Integer.valueOf(int) instead of new Integer(int): CACHING.

This variant of valueOf was added in JDK 5 to Byte, Short, Integer, and Long (it already existed in the trivial case in Boolean since JDK 1.4). All of these are, of course, immutable objects in Java. Used to be that if you needed an Integer object from an int, you’d construct a new Integer. But in JDK 5+, you should really use valueOf because Integer now caches Integer objects between -128 and 127 and can hand you back the same exact Integer(0) object every time instead of wasting an object construction on a brand new identical Integer object.

private static class IntegerCache {
    private IntegerCache(){}

    static final Integer cache[] = new Integer[-(-128) + 127 + 1];

    static {
        for(int i = 0; i < cache.length; i++)
            cache[i] = new Integer(i - 128);
    }
}

public static Integer valueOf(int i) {
    final int offset = 128;
    if (i >= -128 && i <= 127) { // must cache
        return IntegerCache.cache[i + offset];
    }
    return new Integer(i);
}
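To see the caching from the outside, here is a minimal sketch (the class name ValueOfIdentityDemo is made up for illustration) that compares the references returned by two calls to valueOf. Caching is only documented for -128 to 127; outside that range the result of == is unspecified, so the sketch prints it without assuming either answer.

```java
// Illustrative only: the class and method names are hypothetical.
class ValueOfIdentityDemo {
    // True when two valueOf calls for the same int hand back the very same object.
    static boolean sameInstance(int i) {
        return Integer.valueOf(i) == Integer.valueOf(i);
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(0));    // true: 0 is in the cached range
        System.out.println(sameInstance(-128)); // true: lower bound of the cache
        System.out.println(sameInstance(127));  // true: upper bound of the cache
        // Outside -128..127 nothing is promised; this may print true or false.
        System.out.println(sameInstance(1000));
    }
}
```

Comparing wrappers with == is only done here to expose the cache; ordinary code should still use equals().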

A side note is that the cache is contained in a static nested class which will not be initialized until the first time it is needed, so if valueOf() is never called, the cache is never created. And conversely, the first time it’s called, it’s going to suck (as it will really create 256 objects). Also of interest is that because of the synchronization guarantees the JVM makes during class initialization, there is NO need for explicit synchronization when creating the cache! This is the same basic trick mentioned in Bob Lee’s recent post on lazy-loading singletons. Hot stuff.
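The lazy-loading trick described above can be sketched roughly like this. Everything here (LazyHolderDemo, Holder, the table of squares, the initCount counter) is invented for illustration and is not the JDK code; the counter exists only so you can watch when the nested class gets initialized.

```java
// Illustrative sketch of a lazily initialized static holder; all names are hypothetical.
class LazyHolderDemo {
    // Counts how many times the holder's static initializer has run (demo only).
    static int initCount = 0;

    private static class Holder {
        // Not a compile-time constant, so touching it triggers class initialization.
        static final int[] SQUARES = new int[256];

        static {
            for (int i = 0; i < SQUARES.length; i++) {
                int v = i - 128;
                SQUARES[i] = v * v;
            }
            initCount++;
        }
    }

    // Looks up v*v for v in -128..127; the first call initializes Holder.
    static int square(int v) {
        return Holder.SQUARES[v + 128];
    }

    public static void main(String[] args) {
        System.out.println("before first lookup: initCount = " + initCount);
        System.out.println(square(-3));
        System.out.println(square(127));
        System.out.println("after lookups: initCount = " + initCount);
    }
}
```

No synchronized block appears anywhere: the JVM runs the static initializer at most once, even if several threads hit square() at the same time.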

Seems like a textbook case for why you’d want to hide object construction behind a factory method - it allows you to later on decide to add a cache for commonly constructed immutable objects.
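As a sketch of that idea, assume a made-up immutable class Percent whose constructor is private. Callers only ever see the factory method of(), so a cache for the common values 0 to 100 could be slipped in later without touching any calling code. The class, its range, and its method names are all hypothetical.

```java
// Hypothetical immutable value class with construction hidden behind a factory method.
final class Percent {
    // Cache for the commonly used values 0..100 (the range is an assumption of this sketch).
    private static final Percent[] CACHE = new Percent[101];

    static {
        for (int i = 0; i < CACHE.length; i++) {
            CACHE[i] = new Percent(i);
        }
    }

    private final int value;

    private Percent(int value) {
        this.value = value;
    }

    // Factory method: returns a shared instance for 0..100, a fresh one otherwise.
    static Percent of(int value) {
        if (value >= 0 && value <= 100) {
            return CACHE[value];
        }
        return new Percent(value);
    }

    int value() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(Percent.of(50) == Percent.of(50));   // true: shared instance
        System.out.println(Percent.of(250) == Percent.of(250)); // false: built on each call
    }
}
```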

I have not read this anywhere, but I wonder if the addition of autoboxing in JDK 5 (which automatically creates Integer objects from ints in some cases) prompted the change, as these wrapper objects were suddenly being created much more frequently.

Actually, if you want to see a whole bunch of really complicated optimized code, take a look at the Integer.java source. Interesting stuff.

Update 1: I intended to mention that the awesome and wonderful FindBugs tool will find uses of the Integer constructor in your code for you. I’m not sure whether other analysis tools like PMD and Checkstyle will do so but it certainly seems that they could.

Update 2: I also wanted to mention that you can look, in comparison, at the GNU Classpath implementation:

private static final int MIN_CACHE = -128;
private static final int MAX_CACHE = 127;
private static Integer[] intCache = new Integer[MAX_CACHE - MIN_CACHE + 1];

public static Integer valueOf(int val)
{
  if (val < MIN_CACHE || val > MAX_CACHE)
    return new Integer(val);
  synchronized (intCache)
    {
      if (intCache[val - MIN_CACHE] == null)
        intCache[val - MIN_CACHE] = new Integer(val);
      return intCache[val - MIN_CACHE];
    }
}

And here’s the Apache Harmony implementation:

private static final Integer[] CACHE = new Integer[256];

public static Integer valueOf(int i) {
    if (i < -128 || i > 127) {
        return new Integer(i);
    }
    synchronized (CACHE) {
        int idx = 128 + i; // 128 matches a cache size of 256
        Integer result = CACHE[idx];
        return (result == null ? CACHE[idx] = new Integer(i) : result);
    }
}

You’ll note in both cases that the static nested class trick is not used and thus synchronization is required, making these implementations possibly worse in multi-threaded environments (although likely not in a way that you’d notice). You’ll also notice, however, that GNU Classpath (which gcj uses) and Harmony cache on demand instead of pre-populating the cache, as the Sun JDK version is forced to do when using the static class trick to avoid the synchronization.

I also find the use of magic numbers interesting across these examples. Here we’ve got several magic numbers (-128, 127, and 256). Conventional wisdom (from something like Code Complete) is that magic numbers in the code should be lifted into constants and named for clarity. Also, of these three constants, any one (take your pick) can be derived from the others, so could be defined as a static calculation. The Sun and Harmony versions eschew any pretense of doing this and simply use them all as literals, which is at least consistent. In Sun’s case it’s actually a little better as the related code blocks using the constants are next to each other whereas in Harmony’s the static definition is at the top of the class, far from the usage of the cache. gcj did replace all magic numbers with constants and calculated the cache size (256) at the point of need. Seems like the val - MIN_CACHE calculation for the offset could have been pulled out into a local variable though, if not for performance (as the compiler would presumably optimize this), at least for maintenance (less copied code to change).
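Here is roughly what that might look like with every suggestion applied: two named bounds, a cache size derived from them, and the offset computed once into a local variable. This is a sketch, not any of the three real implementations. The Box class stands in for Integer, and all names are made up.

```java
// Illustrative on-demand cache with named constants; all names are hypothetical.
class NamedConstantCache {
    // Tiny immutable stand-in for Integer so the sketch does not depend on JDK internals.
    static final class Box {
        final int value;

        Box(int value) {
            this.value = value;
        }
    }

    private static final int MIN_CACHE = -128;
    private static final int MAX_CACHE = 127;
    // Derived rather than hard-coded: 127 - (-128) + 1 = 256.
    private static final int CACHE_SIZE = MAX_CACHE - MIN_CACHE + 1;
    private static final Box[] CACHE = new Box[CACHE_SIZE];

    static Box valueOf(int val) {
        if (val < MIN_CACHE || val > MAX_CACHE) {
            return new Box(val);
        }
        final int idx = val - MIN_CACHE; // offset computed in exactly one place
        synchronized (CACHE) {
            Box result = CACHE[idx];
            if (result == null) {
                result = new Box(val); // populate on demand
                CACHE[idx] = result;
            }
            return result;
        }
    }

    public static void main(String[] args) {
        System.out.println(valueOf(-128) == valueOf(-128)); // true: cached on first use
        System.out.println(valueOf(128) == valueOf(128));   // false: outside the range
    }
}
```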

Update 3: To satisfy Mr. John Smith in the comments, I ran a small performance test. It prints the time (in nanoseconds) to get the first Integer, the second Integer, and the total and average while creating 1,000,000,000 Integers using either new Integer() or Integer.valueOf().

Here are the results:

> java PerfTest n
new Integer
first = 34921 ns
second = 2514 ns
all = 15778328252 ns
avg = 15 ns

> java PerfTest v
valueof
first = 729981 ns
second = 2514 ns
all = 7729155801 ns
avg = 7 ns

So, as expected, valueOf sucks on the first call. The second call is a tie, which seems odd. But really, this is most likely just a fluke of the resolution of the clock vs how fast it is to construct a single object. Seems exceedingly odd that the numbers are the same - I’d guess it’s most likely you’re seeing the smallest “tick” of the nanos clock, not a real value (esp in light of the final averages). On the totals and averages we see the full-term story: valueOf takes an average of 7 ns vs 15 ns for constructing a new Integer, so about half the time. If I were less lazy, I’d write another program to calculate the break-even point but it would probably be way too biased by my environment. Here’s the code if you want to use it: PerfTest.java.
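For a back-of-the-envelope break-even estimate, you can just plug the numbers reported above into a little arithmetic: the extra cost of the first valueOf call divided by the per-call saving. This is a rough sketch under big assumptions (the averages are treated as steady-state per-call costs, and JIT warm-up and clock resolution are ignored), and it says nothing about any machine other than the one that produced those numbers. The class and method names are made up.

```java
// Rough break-even estimate from the figures reported in this post; names are hypothetical.
class BreakEvenEstimate {
    // Number of calls after which the one-time cache setup has paid for itself.
    static long breakEvenCalls(long firstValueOfNs, long firstNewNs,
                               long avgNewNs, long avgValueOfNs) {
        long extraStartupCost = firstValueOfNs - firstNewNs; // 729981 - 34921 = 695060 ns
        long savingPerCall = avgNewNs - avgValueOfNs;        // 15 - 7 = 8 ns
        return extraStartupCost / savingPerCall;             // integer division, rounds down
    }

    public static void main(String[] args) {
        System.out.println(breakEvenCalls(729981, 34921, 15, 7)); // prints 86882
    }
}
```

So on these numbers the cache would pay for itself somewhere in the neighborhood of 87,000 calls, give or take a lot.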

Update 4: If you’re interested in more of the history, check out section 5.1.7 of the Java Language Specification, which covers boxing conversions and describes the rule that boxing (which the compiler implements by calling valueOf()) must return the identical instance for integers in this range. Ideally, they would like all boxing conversions of a given value to return identical instances, but that’s not practical, so this is part of a compromise.
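A quick way to see the boxing rule is to box the same small literal twice and compare references. This is a minimal sketch with made-up names; it sticks to literals inside -128 to 127, since that is the only case where the spec promises identical instances.

```java
// Illustrative only: the class and method names are hypothetical.
class BoxingIdentityDemo {
    // Boxes the literal 127 twice and reports whether both boxes are the same object.
    static boolean smallLiteralsShareInstance() {
        Integer a = 127; // boxing conversion
        Integer b = 127; // boxing conversion of the same value
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(smallLiteralsShareInstance()); // true
    }
}
```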

Update 5: Joe Darcy, the author of the Sun code in question, posted a response to this blog, which is worth the read for the alternatives that were considered and the performance benchmarking.