Pure Danger Tech



Subclassing in Clojure

12 Aug 2011

In general, dealing with Java from Clojure is quite painless and much easier than calling Java from Java. :) However, I ran into (not for the first time) some issues around subclassing an abstract class and properly overriding a method from the parent class. For my own future sanity (and perhaps yours, dear reader), I thought I’d collect a few links and an example or two here.

The three tools we have in our toolbox for extending/implementing Java classes in Clojure are gen-class, proxy, and reify. gen-class allows you to generate a brand new Java class. proxy and reify each let you create a new anonymous instance of a class. If you just need to implement an interface (or multiple) in an anonymous instance, reify is the best choice. If you need to create an anonymous instance that subclasses another class, reify can’t help you, but proxy can. If proxy doesn’t work for you, you might need to fall back on actually creating a new class and instantiating it and that’s where gen-class comes in.
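To make the reify/proxy split concrete, here is a minimal sketch. The names by-length and squares are made up for illustration: the first implements an interface (java.util.Comparator) with reify, the second subclasses an abstract class (java.util.AbstractList) with proxy.

```clojure
;; reify: anonymous implementation of an interface.
;; by-length is a made-up name for this illustration.
(def by-length
  (reify java.util.Comparator
    (compare [_ a b]
      ;; order two collections by their element count
      (Integer/compare (count a) (count b)))))

;; proxy: anonymous subclass of an abstract class.
;; AbstractList leaves get(int) and size() abstract, so we supply both.
(def squares
  (proxy [java.util.AbstractList] []
    (get [i] (* i i))
    (size [] 3)))

(sort by-length ["ccc" "a" "bb"])  ; shortest string first
(into [] squares)                  ; walks the list through size() and get()
```

Neither method here is overloaded in its parent type, which is why this simple case behaves; the trouble described next starts when the parent class has several overloads of the same method name.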

The specific situation where I got into trouble was subclassing an abstract class where the method I needed to implement had multiple arities in the super class. In particular I was trying to subclass java.io.InputStream. InputStream is an abstract class with a single abstract method read(), but also two non-abstract overloads read(byte[]) and read(byte[], int, int).

For the purposes of this post, we’ll say we want to subclass InputStream and create a variant that takes a sequence of bytes which will be doled out by the stream (and yes, I know ByteArrayInputStream exists – this is an example). The stream is stateful and holds its byte-seq in a ref.

(defn byte-input-stream [byte-seq]
  (let [byte-state (ref byte-seq)]
    (proxy [java.io.InputStream] []
      (read []
        (dosync
          ;; peel off one byte to return and save the rest
          (let [[b & more-bytes] @byte-state]
            (ref-set byte-state more-bytes)
            (if b b -1)))))))

If you try using this read function directly, it works fine:

user> (.read (byte-input-stream [1 2 3]))
1

But if you try using the other arity versions, you’ll find some issues:

user> (.read (byte-input-stream [1 2 3]) (byte-array 3) 0 3)

Wrong number of args (4) passed to: buffer$byte-input-stream$fn
  [Thrown class java.lang.IllegalArgumentException]
  0: clojure.lang.AFn.throwArity(AFn.java:437)
  1: clojure.lang.AFn.invoke(AFn.java:51)
  2: example.proxy$java.io.InputStream$0.read(Unknown Source)

Basically, a proxy maps each method name to a single Clojure function, so once we define read, that one function receives the calls for every overload of read, and ours only accepts the no-arg form. But you can support multiple arities in a proxy method definition, and there is also a macro called proxy-super that lets you call superclass methods. That looks like this:

(defn byte-input-stream [byte-seq]
  (let [byte-state (ref byte-seq)]
    (proxy [java.io.InputStream] []
      (read
        ([] (dosync
             (let [[b & more-bytes] @byte-state]
               (ref-set byte-state more-bytes)
               (if b b -1))))
        ([byte-arr]
           (proxy-super read byte-arr))
        ([byte-arr off len]
           (proxy-super read byte-arr off len))))))

Testing this doesn’t go so well, though:

user> (.read (byte-input-stream [1 2 3]))
1
user> (.read (byte-input-stream [1 2 3]) (byte-array 3) 0 3)
java.io.InputStream.read()I
  [Thrown class java.lang.AbstractMethodError]
  0: revelytix.federator.buffer.buffer.proxy$java.io.InputStream$0.read(Unknown Source)
  1: java.io.InputStream.read(InputStream.java:154)

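The inherited read(byte[], int, int) ultimately calls the no-arg read() and can’t find it. As I understand it, proxy-super temporarily takes the proxy’s own read function out of the picture while the superclass method runs, so that inner call lands on InputStream’s abstract read() and fails. It turns out this problem is already documented (with deeper explanation) on Meikel Brandmeyer’s excellent proxy post but no solution is given.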
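One way to stay with proxy, offered only as a sketch and not something from the experiments above, is to skip proxy-super entirely and implement every arity of read by hand. The names full-byte-input-stream, read-one and read-into are hypothetical.

```clojure
(defn full-byte-input-stream
  "Hypothetical variant of byte-input-stream that implements all three
  read overloads itself instead of delegating with proxy-super."
  [byte-seq]
  (let [byte-state (ref byte-seq)
        ;; read-one: pop a single byte off the ref, or -1 when empty
        read-one (fn []
                   (dosync
                     (let [[b & more-bytes] @byte-state]
                       (ref-set byte-state more-bytes)
                       (if b b -1))))
        ;; read-into: fill byte-arr from off with up to len bytes and
        ;; return the count, or -1 if the stream was already exhausted
        read-into (fn [^bytes byte-arr off len]
                    (loop [n 0]
                      (if (< n len)
                        (let [b (read-one)]
                          (if (neg? b)
                            (if (zero? n) -1 n)
                            (do (aset-byte byte-arr (+ off n) (unchecked-byte b))
                                (recur (inc n)))))
                        n)))]
    (proxy [java.io.InputStream] []
      (read
        ([] (read-one))
        ([byte-arr] (read-into byte-arr 0 (alength ^bytes byte-arr)))
        ([byte-arr off len] (read-into byte-arr off len))))))

(def s (full-byte-input-stream [1 2 3]))
(.read s)                   ; first byte
(.read s (byte-array 2) 0 2) ; remaining two bytes, returns the count
```

The obvious cost is that the superclass overloads are reimplemented by hand rather than inherited, which is most of what subclassing was supposed to buy us.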

To keep InputStream’s own overloads working, I dove into gen-class. Our goal with gen-class is to generate a class that is a subclass of InputStream, which we can then instantiate and call. I will first mention that I again went back to the well of Brandmeyer to learn more about gen-class and that I also found this old Clojure group post and related blog helpful in putting everything together.

gen-class will actually override and implement all interface and superclass methods by default, with implementations that try to forward to specially named functions in the implementing namespace (if they exist) and otherwise fall back to the superclass method. For overloaded methods, these functions can be named with the method name plus the types of their args.
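As a rough illustration of that naming rule, the helper below builds the function name for a given overload. overload-fn-name is a hypothetical function written for this post, not part of gen-class’s API; it only mirrors the convention as I understand it (simple class names joined by dashes, "[]" written as "<>", and "void" for no args).

```clojure
(require '[clojure.string :as str])

(defn overload-fn-name
  "Hypothetical helper: the name gen-class would look for when forwarding
  the overload of method-name that takes param-classes."
  [prefix method-name param-classes]
  (if (seq param-classes)
    (str prefix method-name "-"
         (str/join "-"
                   (map #(str/replace (.getSimpleName ^Class %) "[]" "<>")
                        param-classes)))
    (str prefix method-name "-void")))

(overload-fn-name "-" "read" [])
;; => "-read-void"
(overload-fn-name "-" "read" [(Class/forName "[B") Integer/TYPE Integer/TYPE])
;; => "-read-byte<>-int-int"
```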

Refer to Brandmeyer’s post or the docs for details on how gen-class works. I created a class that keeps the ref containing the byte sequence in its “state”, plus an initializer function that takes the incoming byte-seq and wraps it in that ref. I then created the specific read() function that I needed to override. It takes no args, so it is called “read-void”, and with the default prefix of “-” that becomes -read-void:

(ns example.ByteInputStream
  (:gen-class
   :extends java.io.InputStream
   :state byteseq
   :init init
   :constructors {[clojure.lang.Seqable] []}
   :main false))

(defn -init [byte-seq]
  [[] (ref byte-seq)])

(defn -read-void [this]
  (dosync
   (let [byte-seq (.byteseq this)
         [b & more-bytes] @byte-seq]
     (ref-set byte-seq more-bytes)
     (if b b -1))))

And if we compile this we can test it successfully:

example.ByteInputStream> (compile 'example.ByteInputStream)
example.ByteInputStream
example.ByteInputStream> (def bis (example.ByteInputStream. [1 2 3]))
#'example.ByteInputStream/bis
example.ByteInputStream> (.read bis)
1
example.ByteInputStream> (.read bis (byte-array 2) 0 2)
2   ;; this returns the number of bytes read into the array

Or perhaps a more useful usage is to just leverage Clojure’s slurp function:

example.ByteInputStream> (def bis (example.ByteInputStream. [67 108 111 106 117 114 101]))
#'example.ByteInputStream/bis
example.ByteInputStream> (slurp bis)
"Clojure"

If you wanted to override the 3-arg version of read as well, you can do that by overriding the function -read-byte<>-int-int. Obviously.

Caveat: It could very well be that I am abusing gen-class and how it names its own versions of super-class methods, and that there is a cleaner way to do this. I’ve reached my exploration limit on this, however, and I need to get back to work! For example, it may make more sense to leverage the :exposes-methods option of gen-class. Comments welcome…