mdub@DogBiscuit.org
... mmm, crunchy!

Rake profiling

Where's the bottleneck in your Rake build? Let's find out. Drop (or include) this in your Rakefile:

module Rake
  class Task
    def execute_with_timestamps(*args)
      start = Time.now
      execute_without_timestamps(*args)
      execution_time_in_seconds = Time.now - start
      printf("** %s took %.1f seconds\n", name, execution_time_in_seconds)
    end
    
    alias :execute_without_timestamps :execute
    alias :execute :execute_with_timestamps 
  end
end
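
On Ruby 2.0 and later, Module#prepend offers another way to wrap a method without alias chaining. Here's a minimal sketch of that idea. DemoTask is a hypothetical stand-in for Rake::Task (so the snippet runs without Rake), and TimedExecution is a made-up name for illustration only:

```ruby
# Hypothetical stand-in for Rake::Task, so this sketch runs without Rake.
class DemoTask
  attr_reader :name

  def initialize(name, &action)
    @name = name
    @action = action
  end

  def execute(*args)
    @action.call(*args)
  end
end

# Wraps #execute with timing. Because the module is prepended,
# `super` calls the original DemoTask#execute.
module TimedExecution
  def execute(*args)
    start = Time.now
    result = super
    elapsed = Time.now - start
    printf("** %s took %.1f seconds\n", name, elapsed)
    result # hand back whatever the wrapped method returned
  end
end

class DemoTask
  prepend TimedExecution
end

task = DemoTask.new("compile") { sleep 0.1; :done }
task.execute # prints a "** compile took ... seconds" line
```

Whether this would behave the same when applied to the real Rake::Task is an assumption I haven't verified here; the alias approach above is the one known to work.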

How I Learned to Stop Worrying and Love the Mac

In my new job, a Mac is the preferred tool of the trade. So now I'm learning to use a nice shiny MacBook Pro, and after years developing on Windoze, it's a very pleasant experience. Here are some of the things that are making my life just that little bit more delightful:

  • It's Unix. On Windoze, Cygwin helped a little, but this is soooo much better.
  • Launchbar - an application launcher and more. I bought this almost immediately after getting my Mac, based on some random recommendation somewhere, and haven't regretted it - I probably use it once every 5 minutes, on average. I've subsequently tried Quicksilver, but it didn't feel immediately "right" in the same way Launchbar does.
  • Textmate - "the missing editor". I've been a dedicated Emacs user for about 20 years now, but had to give Textmate a try, given all the hype. It's noice! Many of the features I know and love from Emacs are there (albeit bound to different cryptic key-combinations), and the UI is clean and Mac-savvy.
  • 1Password - a password manager. I got this handy little utility as part of a bundle from macheist.com. It stores all your passwords in a (secure) keychain, indexed by website, making it really easy to log back in next time you visit. Best of all, it integrates with most browsers, meaning you only need to store passwords once. It can store multiple sets of login details per site, too, which is very useful when testing web-apps.
  • Safari is a nice little web browser, and once you turn on the debug menu, it's even better. Normally, I'd reach for Firebug for this sort of functionality, but the built-in Safari equivalent is almost as good.

Diving (thoughtlessly) back into the workforce

As of last week, I'm developing software for money again, after a nine month break.

At the same time, I'm saying goodbye to ThoughtWorks, which is not easy, since I've really enjoyed my time there. I joined TW to hang out with good people, and wasn't disappointed: the faces have changed a little over the years (as people come and go), but TW still employs some of the most talented and passionate people I've ever had the opportunity to work with.

In the end, though, I figured it was time to try something new. I'm now working for Cogent Consulting. I've known Steve and Marty, who run the company, since the early days of the Melbourne eXtreme Programming group, and have a lot of respect for them both. They've started to assemble a very interesting, talented bunch of individuals, and I'm looking forward to the ride.

Among other things, I'm going to be doing a whole lot more Ruby and Rails work than I have to-date. Which feels good, since I've been blathering about Ruby for a few years now.

Nine months at home

Since Easter last year, I've been a house-husband. My beautiful wife (Tanya) was keen to get back to work, and we didn't want our kids (Ngara and Jonah) in full-time care, so I've been home playing Dad. And it's been great.

Lots of people have said how jealous they were of "my time off", like it was some kind of holiday. I have to say, it hasn't felt like a holiday, by any stretch. Looking after kids can be hard work, 13+ hours/day, 7 days/week. Being an introverted control-freak, I found it challenging. Developing software is easier and more relaxing.

I imagined it would be extremely difficult at first. But after a few months, I figured, I'll get good at it, and then it will be relatively easy. Well, perhaps not easy, but manageable. As it turned out, I actually found it pretty manageable from the outset - not quite as hard as I'd imagined, and the kids were surprisingly accepting, once they'd got their heads around it. On the other hand, it hasn't got any easier whatsoever over time; I still struggle with exactly the same challenges now, on a daily basis: how to keep them fed, entertained, safe, and below 100dB.

What a great experience, though. I'm particularly thankful for the time I've been able to spend with Jonah (almost 2). When I stopped work he'd just turned one, and for his first year of life I hardly saw him, given that I was away for most of his waking hours. So it's been really special to be around him this last year, as he's transformed from a carpet-bound nappy-filler into a real little boy. My little boy. First steps, first words, first trip to the emergency department ... all that good stuff.

It was a different story with Ngara (almost 5), though. Previously, I was her "fun" Dad, who got to read her books at bedtime, and play with her in the weekends. Instead, once the novelty of me being home wore off, I became the responsible parent, too busy cleaning the kitchen or changing nappies to be much fun, much of the time. I think it's been good for us both to spend more time together, but I miss being the "fun" one.

Anyway, this little interlude in our lives is about to come to an end, with me about to return to work. I'm very thankful that it's been possible for me to spend this time at home. But in many ways, I'm also glad it's over :-)

Jonahisms

My son Jonah is just starting to talk in earnest. I've forgotten most of his elder sister's cuter sayings, so am making a point of jotting down some of his:

  • noh - no (he's nearing 2, so starting to use this fluently)
  • noooooOOOhohoho - definitely not
  • mmmmm - well, okay then
  • moh! - more!
  • Mum-mah - his mum
  • Dad-dee - that's me
  • Rah-rah - Ngara (his elder sister)
  • tar - car
  • tuck - truck
  • dut - duck
  • Dut! Dut! Dut! - Quack! Quack! Quack!
  • neigh - horse
  • baap - sheep
  • bup - bug
  • ay o - elephant (from "Hey De Ho", a children's song)
  • rahh - tiger, lion
  • mao - cat
  • olly daaay - holiday
  • Oh oh ooooh - "Ho ho ho" (used for anything related to Christmas)
  • bubbish tuck! - rubbish truck!
  • moh bubbish tuck? - where did the rubbish truck go?
  • bubba dut - baby duck (always difficult to distinguish from a rubbish truck)
  • ut si - outside (his favourite place)
  • i sti - swing me "high in the sky"
  • Moh i sti! - C'mon, higher!
  • bah bye - bye bye (usually delivered with a heartbreaking little wave)

He's saying a lot of other things too, but most of them are completely unintelligible at this point. :-)

Ant build tips

During my past few Java projects, I've developed some guidelines which I find make builds faster, more reliable and easier to maintain. The details are specific to Ant, but hopefully the principles are transferable to other software build systems.

These ideas may seem blindingly obvious to some readers, but I suspect they'll appear new-and-strange, and perhaps even bad-and-wrong, to others. In any event, I hope to trigger some thought/discussion.

Principles

My build approach is based on two simple principles:

  • Efficiency - don't rebuild up-to-date outputs
  • Safety - do rebuild out-of-date outputs

(By "output", I mean some artifact produced by the build. I'm avoiding the word "target" here, since it has specific meaning in Ant.)

Efficiency - DON'T rebuild up-to-date outputs

Quick builds, and rapid feedback, are important for developer productivity. Using a build system that recreates everything from scratch after even a minor change is a great way to kill productivity.

Re-executing a single build step is typically not the end of the world, but many outputs are also inputs to other build steps, so unnecessarily rebuilding an output early on during the build can trigger rework all the way through.

Safety - DO rebuild out-of-date outputs

On the flip side, when a key input DOES change, you need to ensure that all the derived outputs are rebuilt, or at least revalidated. Otherwise, your build becomes "flaky" and unpredictable.

A flaky build forces developers to compensate somehow, e.g. by explicitly running "clean" builds every time, which impacts productivity.

Tips

Explicitly declare dependencies between your targets

Some people are reluctant to declare dependencies, because declaring them introduces overhead. But not doing so is unsafe, because it opens the door to build steps being executed with stale inputs, resulting in confusing, frustrating, non-deterministic build behaviour.

If you've followed the "Don't rebuild up-to-date outputs" rule, then dependencies should be safe/cheap, i.e. there's minimal overhead, and no reason not to declare them.

Targets should be Nouns, not Verbs

Typically, programmers name Ant targets by what they do, e.g. "compile", "test". However, this tends to produce very procedural builds.

So instead, I recommend choosing names describing what the target produces, e.g. "classes", "test/report". Perhaps it's just because I spent so many years automating builds using make, but I find that such noun-ish targets help in various ways:

  • it's easier to understand what outputs each target produces (for obvious reasons)
  • intermediate targets tend to become useful in their own right
  • dependencies become clearer, as it makes more sense to depend on a concrete input, rather than a process

If you've read this far, go read Martin Fowler's "OutputBuildTarget" article; he explores the subject more eloquently than I'm capable of.

Some targets might not produce a concrete artifact (or the artifact might not be the main point of the target). In such cases, I'll sometimes name them based on the condition they produce, or ensure. For example, a target using Simian to check for duplication might be called "minimal-duplication" (as opposed to "simian").

Use <uptodate> to avoid unnecessary rework

Most Ant tasks include dependency-checking based on file timestamps, and will avoid rework. But some tasks aren't so clever. For instance, the <junit> task will happily re-run all your tests, even if they all passed last time, and neither code nor tests have changed.

The <uptodate> task can help fill the gap. It compares the timestamps of specified input and output files, and sets a property indicating that work can be avoided.

Here's an example where <uptodate> is used to avoid unnecessary re-generation of XML-mapping code:

<target name="xml-module/check"
        depends="properties">
    <uptodate property="xml-module.uptodate"
              targetfile="${xml-module.jar}">
        <srcfiles dir="spec" includes="**/*.xsd"/>
    </uptodate>
</target>

<target name="xml-module"
        depends="xml-module/check, xmlbean/taskdef"
        unless="xml-module.uptodate">
    <xmlbean destfile="${xml-module.jar}"
             classpathref="xmlbeans.classpath">
        <fileset dir="spec" includes="**/*.xsd"/>
    </xmlbean>
</target>
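
For readers who don't live in Ant, the check that <uptodate> performs boils down to a timestamp comparison. The Ruby sketch below shows the rough idea only: up_to_date? is a hypothetical helper (not part of Ant or Rake), and it ignores details such as Ant's timestamp granularity.

```ruby
# Returns true when `target` exists and is at least as new as every source.
# Raises Errno::ENOENT if a source file is missing.
def up_to_date?(target, sources)
  return false unless File.exist?(target)

  target_time = File.mtime(target)
  sources.all? { |source| File.mtime(source) <= target_time }
end

# Usage (paths and rebuild_jar are illustrative):
#   rebuild_jar unless up_to_date?("build/xml-module.jar", Dir["spec/**/*.xsd"])
```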

Use <touch> to record a completed task

Although it's unusual, some build steps have no output: they are simply processes that must be executed, e.g. validating the format of a file, or verifying adherence to coding standards (Checkstyle, Simian). Other build steps can produce many outputs, e.g. code-generation tools.

In these cases, where there's no identifiable primary output, it can be useful to invent a placeholder output-file using Ant's <touch> task. The resulting file is empty, but its timestamp can be used for dependency-checking, to determine if/when the build step needs to be re-run.

<touch> is most useful in conjunction with <uptodate>, as in the following example:

<target name="libs/check">
    <uptodate property="libs.uptodate">
        <srcfiles dir="." includes="ivy.xml"/>
        <mapper type="merge" to="lib/.done"/>
    </uptodate>
</target>

<target name="libs" description="retrieve dependencies with ivy"
        depends="libs/check" unless="libs.uptodate">
    <ivy:retrieve pattern="lib/[conf]/[artifact].[ext]" />
    <touch file="lib/.done" />
</target>    

Here we're using Ivy to download third-party libraries. After download, we create a touch-file to mark the job as done. On subsequent runs, the library resolution and download process will be skipped, unless the "ivy.xml" control-file has been changed.
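
The touch-file pattern isn't specific to Ant. As a rough Ruby sketch (the run_unless_done helper and the file names in the usage comment are hypothetical, chosen just for illustration): run a step only when its marker file is missing or older than one of its inputs, then touch the marker.

```ruby
require "fileutils"

# Runs the block only if `marker` is missing or older than any input,
# then touches `marker` to record completion.
# Returns true if the block ran, false if it was skipped.
# If the block raises, the marker is not touched, so the step re-runs next time.
def run_unless_done(marker, inputs)
  done = File.exist?(marker) &&
         inputs.all? { |input| File.mtime(input) <= File.mtime(marker) }
  return false if done

  yield
  FileUtils.mkdir_p(File.dirname(marker))
  FileUtils.touch(marker)
  true
end

# Usage (illustrative):
#   run_unless_done("lib/.done", ["ivy.xml"]) { retrieve_libraries }
```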

As I alluded to earlier, I have also used the combination of <touch> and <uptodate> to:

  • skip code-style checks when code hasn't changed
  • skip tests when neither code nor tests have changed

Use <dependset> to remove out-of-date outputs

When Ant is not clever enough to determine when something needs re-doing, the <dependset> task is useful for mopping up stale outputs.
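
As a rough illustration of the idea (in Ruby rather than Ant XML), here is a sketch of mopping up stale outputs. It's a simplification: remove_stale_outputs is a hypothetical helper, and the rule it applies (if any source is newer than the oldest existing output, delete all the outputs) only approximates what the Ant task does.

```ruby
require "fileutils"

# Deletes every existing target if any source is newer than the oldest target.
# Returns the list of files removed (empty if nothing was stale).
def remove_stale_outputs(sources, targets)
  existing = targets.select { |target| File.exist?(target) }
  return [] if existing.empty? || sources.empty?

  newest_source = sources.map { |source| File.mtime(source) }.max
  oldest_target = existing.map { |target| File.mtime(target) }.min
  return [] unless newest_source > oldest_target

  FileUtils.rm_f(existing)
  existing
end
```

With the stale files gone, the usual timestamp checks in later build steps will regenerate them.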

Pitfalls

Avoid "private" targets

Many builds include "private" or "hidden" targets, that are unsafe to call directly. A common convention in the Ant world is to name these targets starting with '-', since that makes them inaccessible from the command-line.

I think private targets are a smell: they indicate that implicit dependencies are present in the build. Hiding the unsafe targets makes sense, in a way ... but I much prefer to make the dependencies explicit, as described above, at which point it's safe to let every target be called directly (which often comes in handy when testing some aspect of the build process).

Avoid targets depending on "clean"

Having popular targets depend on "clean" is a bad smell. You DO need to avoid using artifacts from previous builds which have passed their use-by date, but starting the whole build from scratch is overkill, when proper dependencies and careful timestamp-checking can ensure that just the stale stuff is rebuilt.

Avoid <copy overwrite="true">

An anti-pattern I often encounter (and a pet peeve) is:

<copy overwrite="true" ...>
    ...
    <filterset>
        <filter token="PASSWORD" value="${db.password}"/>
        ...
    </filterset>
</copy>

The "overwrite" attribute causes Ant to copy files every time, ignoring the usual timestamp-checking that prevents re-generation of up-to-date files. Using "overwrite" can easily cause most of your jars/wars/ears/etc to be updated with every build.

Instead, use <dependset> to invalidate the outputs in the case that ${db.password} has changed.

method_missing magic - emulating Groovy's "it" in Ruby

Inspired variously by:

I've cooked up a shortcut for generating simple blocks, meaning that rather than

people.select { |x| x.name.length > 10 }

I can write such things as:

people.select(&its.name.length > 10)

Disclaimer: I think this is more "cool hack" than useful tool; it's probably too much of an alien artifact to be useful in real life. And it's not generally applicable, like "it" in Groovy. And really, it's not that much more verbose to use a block. Aaaaaanyway ...

The trick is that the above is parsed as

people.select(&(its.name.length.>(10)))

The "its" method creates a MessageBuffer object, which records the messages (method invocations) sent its way:

irb(main):001:0> require 'message_buffer'
=> true
irb(main):002:0> its
=> #<MessageBuffer:0x6b40b44 @messages=[]>
irb(main):003:0> its.name.length < 10
=> #<MessageBuffer:0x6b3e678 @messages=[[:name], [:length], [:<, 10]]>

Now, the "&" operator coerces its argument to a Proc, and MessageBuffer#to_proc generates a Proc that replays all the recorded messages. Q.E.D.

The full source-code is fairly short, so I'll include it inline:

class MessageBuffer 

  instance_methods.each do |m|
    undef_method m unless m =~ /^(__|respond_to|inspect)/ 
  end
  
  def initialize
    @messages = []
  end

  def method_missing(*message)
    @messages << message        # record the message
    self                        # return self so we can keep recording
  end
  
  def __replay_all_messages__(obj)
    # send each recorded message to the result of the previous one
    @messages.inject(obj) do |receiver, message|
      receiver.__send__(*message)
    end
  end
  
  def to_proc
    proc { |x| __replay_all_messages__(x) }
  end

end

def its
  MessageBuffer.new
end


Update: Florian Gross suggested a better way to replay recorded messages, using inject, and I've updated the code accordingly.
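
On Ruby 1.9 and later, the same recording trick can be sketched on top of BasicObject, which starts out nearly method-free and so avoids the undef_method loop. This is a variation under stated assumptions, not a drop-in replacement: RecordingProxy and Person are names invented for the example, "its" is redefined here so the snippet stands alone, and messages carrying blocks are not recorded. Note too that BasicObject still defines ==, != and !, so those comparisons would not be captured.

```ruby
# A recorder built on BasicObject: almost every call falls into method_missing.
class RecordingProxy < BasicObject
  def initialize
    @messages = []
  end

  def method_missing(name, *args)
    @messages << [name, *args] # record the message
    self                       # return self so we can keep recording
  end

  def to_proc
    messages = @messages
    # replay each recorded message against the result of the previous one
    ::Proc.new do |target|
      messages.inject(target) { |receiver, message| receiver.__send__(*message) }
    end
  end
end

def its
  RecordingProxy.new
end

Person = Struct.new(:name)
people = [Person.new("Al"), Person.new("Bartholomew Jones")]

long_names = people.select(&its.name.length > 10).map(&:name)
# => ["Bartholomew Jones"]
```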

Selenium Core 0.8.0

The Selenium Core team (of which I'm a sometime member) released version 0.8.0 last week.

Highlights include:

  • a "multiWindow" option which places the application-under-test in a separate window, allowing testing of "frame-busting" apps;
  • more reliable page-load detection for popup windows;
  • new cookie-management actions;
  • a run-speed slider and "Pause" button which replace the old Run/Walk/Step radio-buttons;
  • many bug-fixes and stability improvements;
  • tested against latest versions of Firefox, IE6, Opera, Konqueror, Safari and WebKit.

The multi-window layout option is a great step forward, since it was a limitation that prevented many people from using Selenium.

You can download the new version at:

http://release.openqa.org/selenium-core/0.8.0/

(Yes, the documentation and website still suck. Sorry.)

Presentation on Ruby/Rails at EJA

A couple of months ago I gave a presentation on Ruby and Rails to a local Java user-group. My slides are now online:

They contain a few examples showing how expressive Ruby can be when compared to Java.

I hate "frameworks"

Give me a "toolkit" or "library" over a "framework" any hour of the day.

A software framework offers to solve 80% of my problem, but usually without understanding what my problem actually is.

A toolkit is a collection of tools. I can pick them up and use them as I see fit. I can use individual tools/components, without needing to adopt them all. I can use them in conjunction with other tools I have, without voiding any warranties.

Grumble.