Wednesday, November 2, 2011

Lazy Mandelbrots in Clojure

When learning a new language it is often beneficial to select a small fun task that one has previously applied to other languages.

Rendering the Mandelbrot set is such a small task i find to be fun, i am a sucker for results that are visually appealing! Such rendering, while simple to implement, is computationally expensive. Rendering the set as fast as possible can be a useful exercise to understand the performance capabilities of the language.

My experience so far with Clojure has been a good one. The runtime has a bijouxesque quality to it, the Clojure 1.3 runtime jar is about 3.4 MB. I thought the syntax would be an issue but one quickly sees through the brackets and braces when using a good editor. The learning curve for the Clojure syntax is not much different to that of the JSON syntax.

All code for rendering the Mandelbrot set can be found here on github.

First, a function needs to be defined that determines if a point, c say, in the complex plane is a member of the Mandelbrot set. Basically a point c is a member of the set if iterations of zn+1 = zn2 + c remain bounded i.e. no matter how large n gets zn never goes beyond a certain value. In addition, if c is not a member then a value can be assigned that reflects the rate at which the sequence of z tends towards infinity, this value is commonly used to assign colours to points outside of the set.

Using Clojure 1.3 the following function is the fastest i could implement:


If the function returns zero then c is a member of the set, otherwise a value is returned that represents how fast the sequence tends towards infinity.

Notice the primitive type hints (see here for more details). This is not ideal, i would prefer to utilize complex number types from Clojure contrib rather than having to split out the real and imaginary parts, for example:


which is similar to the implementation here. I especially like the use of the iterate function to create the lazy sequence of z values. However, this approach proved to be slow. I don't recall the exact numbers but there was about a 15x difference between using the more appealing, but slower, function on Clojure 1.2 and using less appealing, but faster, function on Clojure 1.3. Note that Clojure 1.3 does have some significant performance improvements for arithmetic operations, however using the more appealing function on Clojure 1.3 still proved to be too slow.

Regardless of the two approaches i like Clojure's approach to tail recursion using loop and recur.

Next, lets define a lazy sequence of values of mandelbrot function that range over an area of the complex plane:


Since this is a lazy sequence no calculations will occur until values are extracted from that sequence. This function applies a linear transformation from co-ordinates in some space, say that for a graphical image, to that in the complex space.

The sequence returned will be of the form ([x y limit] [x y limit] ...).

Then, lets define a function that returns a lazy sequence of sequences given a bounding area in complex space, a and b, and the number of values, size, that should be calculated in either the real or imaginary axis:


The bound represented by a and b is normalized into a lower and upper bound. Given the size, lower and upper bound a distance d is calculated, and the width, w, and height, h (or number of values to be calculated on the real and imaginary axis respectively, thus the total number of values will be the product of the width and height). If w=size then h<=size, otherwise if h=size then w<=size.

The sequence returned will be of the form:
[w h
  ([x y w h ([x y limit] [x y limit] ...)]
   [x y w h ([x y limit] [x y limit] ...)]
 ...)]
Where n defines the number of sub-sequences. In this particular case the function is splitting up the bounded area into n regions each represented by a lazy sequence, and for each sub-sequence the sub-bounds are declared. The reasons for this will become apparently later on. Again note that when this function is called no calculations will be performed until values are extracted from the sequences.

As a slight aside notice the ranges function:


I started writing this function then realized partition did exactly what i wanted, which is, given a right-open interval of [0, l) return the sequence of non-intersecting right-open sub-intervals of size s, and optionally ending in an right-open interval whose size is less than s and whose upper endpoint is l, for example:
mandel.mandelbrot=> (ranges 10 3)
((0 3) (3 6) (6 9) (9 10))
The lesson being, "there is probably a function for that".

Given the mandelbrot-seqs function we can start rendering Mandelbrots. Here is a function to render as text:


Which can produce output like the following:
mandel.mandelbrot=> (text-mandel 40 [-2 -1] [0.5 1] 10)
(888887777666666666666655554431   5555666
 888877766666666666665555554421  24555566
 8888776666666666666555555443    03445556
 8887766666666666665555544331     2444556
 8887766666666666655555433321     1234455
 8877666666666665555544 0 1         23205
 88766666666666555544430                4
 88766666666665544444320                3
 8866666666655444444330                03
 876666666553233433321                 02
 866666555443 12112211                  
 866655554432      00                  
 8655555444320                          
 855555444321                           1
 8555543321                             3
 8443233220                            13
 8                                     23
 8443233220                            13
 8555543321                             3
 855555444321                           1
 8655555444320                          
 866655554432      00                  
 866666555443 12112211                  
 876666666553233433321                 02
 8866666666655444444330                03
 88766666666665544444320                3
 88766666666666555544430                4
 8877666666666665555544 0 1         23205
 8887766666666666655555433321     1234455
 8887766666666666665555544331     2444556
 8888776666666666666555555443    03445556
 888877766666666666665555554421  24555566)
nil
In this case we don't need to split into multiple sub-sequences and we are just interested in the width, w, or number of characters on the line so the sequence of strings can be partitioned by that value.

Alternatively we can render as an image, which gets more interesting. An image will contain many more pixels than characters in textual output, so will be more expensive to create. The rendering algorithm is easier to parallelize. This is the reason why multiple sub-sequences are supported.

If i have a machine with 4 CPUs i can map the problem to four lazy sequences, let each core process a sequence to produce an image, and reduce those images to one larger image.

Here is the code to produce a BufferedImage:


For the case where the problem is split into n parts the pmap function is used to map the function image-mandel-doseq in parallel to the lazy sequences. This produces a sequence of images which are then reduced into one large image by a further sequential map operation.

The image-mandel-doseq function is where the work is performed to set pixels in an image:


The doseq repeatedly invokes WritableRaster.setPixel. Notice the type hint ^BufferedImage and the functional calls to int and ints. To ensure that Clojure knows which setPixel method to call it is necessary to convert values, otherwise a costly call by reflection will result.

When interacting with Java libraries i recommend setting the *warn-on-reflection* flag to true, then the compiler will warn when reflection is used to make calls:
(set! *warn-on-reflection* true)
So how does the parallel execution work out on my Mac Book Pro with an Intel Core i7?
user=> (defn t [n f]
  (for [_ (range 0 n)]
    (let [start (. System (nanoTime))
          ret (f)]
      (/ (double (- (. System (nanoTime)) start)) 1000000.0))))
#'user/t
user=> (for [i (range 1 8)]
  (/ (apply + (t 10 #(mandel.mandelbrot/image-mandel i 1024 [-2 -1] [0.5 1] 256))) 10))
(679.6881000000001 467.689 451.8355 366.8658 361.23620000000005 400.1187 535.1956)
The system profiler says my machine has 2 cores, but sysctl -a says there are 4 CPUs. The results are not quite what i was expecting. Nearly a 2x increase when the number of parallel tasks is four or five. Hmm... i don't quite understand this result. Need to dig deeper to work out what is going on.

I would love a graphically aware repl that could render the image that is returned from the image-mandel function, kind of like a poor man's Mathematica repl :-) An alternative is to develop a web service and render the image in a browser. This is really simple with compojure and ring. The image can be viewed by executing lein ring server.

The Renderable protocol of compojure was extended to support rendering of images:


Ideally i would like to directly stream out the image rather that buffer to a byte array, but this requires some tweaks to ring, which may make it after the 1.0 release, see the discussion here.

Finally the pièce de résistance is deployment of the application to Cloudbees by executing lein cloudbees deploy. Browse to here (if it takes a while to load that is because the application has hibernated and it takes some time to wake up). Developers forking the project can easily deploy to CloudBees if they have an account by tweaking the project.clj. For more detailed instructions on how to use lein with CloudBees see my previous blog entry.

Friday, October 28, 2011

Using lein to deploy Clojure ring applications to CloudBees

If you have a Clojure ring application and are using lein, then it is really easy to deploy that application to CloudBees using the lein CloudBees plugin, which is available on clojars.

Signup for CloudBees to create an account.

Download the CloudBees SDK and install. (See later for the case where the CloudBees SDK is not installed.)

Verify that you can execute bees app:list. The first time a bees command is executed it will prompt for your user name and password so that your API key and API secret can be downloaded and cached in the ~/.bees/bees.config properties file. This key and secret will be used to authenticate when using the CloudBees SDK or the lein CloudBees plugin.

Modify the project.clj file to identify the application:
:cloudbees-app-id "<account>/<appname>"
Where account is the name of your account and appname is the name of your application.

Modify the project.clj file to include the following development dependency:
[lein-cloudbees "1.0.1"]
for example:
(defproject mandel"1.0.0-SNAPSHOT"
  :description "A Mandelbrot web app"
  :cloudbees-app-id "sandoz/mandel"
  :dependencies [[org.clojure/clojure "1.3.0"]
                 [org.clojure/clojure-contrib "1.2.0"]
                 [compojure "0.6.4"]]
  :dev-dependencies [[lein-ring "0.4.6"]
                     [lein-cloudbees "1.0.1"]]
  :ring {:handler mandel.core/app})

Verify that lein cloudbees works, you should see something like the following:

$ lein cloudbees
Manage a ring-based application on Cloudbees.
Subtasks available:
list-apps   List the current applications deployed to CloudBees.
deploy      Deploy the ring application to CloudBees.
tail        Tail the runtime log of the deployed application.
restart     Restart the deployed application.
stop        Stop the deployed application.
start       Start the deployed application.
Arguments: ([list-apps deploy tail restart stop start])

Deploy the application:

$ lein cloudbees deploy
Created /Users/sandoz/Projects/clojure/mandel/.project.zip
Deploying app to CloudBees, please wait....
http://mandel.sandoz.cloudbees.net
Applcation deployed.
The deployment creates an uber war and then deploys that war to CloudBees. The deployment process is smart, only changes will be sent and furthermore any jars in WEB-INF/lib will be checked, securely, against a global cache, before sending. So even if the uber war is rather big the actual stuff sent across the wire may be much less than expected, even on the first deployment.

Eh Voila! your application is deployed and the state can be verified:
$ lein cloudbees list-apps
sandoz/mandel  -  active
If you don't have the CloudBees SDK installed then you can reference the API key and secret key in the project.clj, for example:
:cloudbees-api-key ~(.trim (slurp "/Users/sandoz/cloudbees/sandoz.apikey"))
:cloudbees-api-secret ~(.trim (slurp "/Users/sandoz/cloudbees/sandoz.secret"))
Such declarations will take precedence over any API key and secret key declared in the ~/.bees/bees.config properties file.

It's a bad idea to reference the key and secret directly in the project.clj and instead it is better to refer to that information in a file. Slurp them in from a file to a string, then trim to remove any white space or line-feeds (the plugin needs to be modified to trim those values).

Monday, February 14, 2011

Buzzzzzz

On the 1st of March I will be joining my illustrious new colleagues at CloudBees. A heady mixture of ex-Sun, ex-JBoss and ex-Red Hat employees. So some of those new colleagues are actually old colleagues too, namely Harpreet, Kohsuke and Vivek.

CloudBees has a very compelling vision to push the full development-to-production cycle (build, test, deploy) into the cloud. They have been on a roll cranking out stuff at a rapid rate.

The CloudBees platform consists of two main pillars: DEV@cloud; and RUN@cloud. General availability was announced recently.

DEV@cloud enables developers to develop and build software in the cloud using the best Continuous Integration software available, namely Jenkins, as a service.

RUN@cloud enables developers to deploy software to the cloud.

There is an enormous amount of value to gained by integrating these two pillars together into one integrated platform.
 
My focus will be on Jenkins, Nectar and DEV@cloud. Exciting times ahead.

I have been watching, well lurking, as the s/Hudson/Jenkins events unrolled. For the moment lets just say that we are living in "interesting times" and given the outcome of the events I feel far more comfortable with Jenkins than I would be if involved with the alternative.

Between now and the 1st March i will be taking some time off, but the temptation to do a little hacking will be hard to resist! On verra...

Farewell Sun and Oracle

Today is my last day at Sun and Oracle. 13 years 5 months at Sun and 8 months at Oracle or a ratio of ~20:1.

Like Tor i looked at my badge I am gonna turn in today:
The picture was taken when I first joined Sun in 1997. It's been in my back pocket for ages, it's almost like loosing a limb!

To my work friends and colleagues I wish you luck. "Fair Winds and Following Seas" to Java SE 7, Java EE 7 and GlassFish 7 (i made that last version up :-) , but a 777 release is rather appealing numerologically speaking).

I've "had a blast", "kicked butt", and "had fun", it's onwards to new adventures... (more on that later today). 

Many thanks to all those sending public and private communications of support. 

Signing off with my favorite Scott McNealy quote:
The only value you can add to a banana is a bruise

Monday, February 7, 2011

Sundowners @ Grenoble, The Shannon Pub, Fri 18th Feb from 9pm

To celebrate my new employment, to get nostalgic about Sun [*], to say farewell to Oracle, to philosophize about Life, the Universe and Everything, if one needs an excuse this it to meet up and imbibe a few Sundowners (in the British sense not the Aussie/NZ sense, which has something to do with sheep).

We will meet up at The Shannon Pub, Fri 18th Feb from 9pm.

[*] If i knew of a unicode character for the Sun logo i would use it!