TDD – That’s Design Done

For many years now there have been internet flame wars about TDD in
the software development community.

Most people I talk to involved in software development seem to have
strong opinions for or against Test Driven Development and some of the
heated discussions on forums have become as passionate as arguments
over politics or religion.

I am going to take a tongue in cheek look at some of the
characterisations of the different sides of this long running argument
so I’m guaranteed to offend everyone!

On one side are the proponents of TDD as a way of life for developers.

Test Driven Disciples

The strict adherence to the process and disciplines of TDD
conjures up images of the developer equivalent of the warrior monks of
China. Developers rising before the sun to practice their code kata,
lines of practitioners hunched over keyboards applying the steps of TDD
as if driven by a metronome, each successful invocation of the ‘green’
genie of a passing test accompanied by a Kiai of triumph.

What they think of as the studied concentration of practised
discipline is often perceived by non-TDDers as the slavish adherence to
an unthinking cult. Mindless automatons relentlessly ‘mocking’ real
design to produce a disconnected, disassociated collection of classes
that clog the understanding of the problem space with unnecessary
layers of abstraction.

On the other are the haters of TDD. More difficult to categorise, they
form their own little bands of rebels.

Anarchy of ‘Testless’ Development

The Code Warriors

This post apocalyptic band of ‘code’ warriors hack code as fast as they can to meet delivery ‘dead’-lines set by the cruel warlords of the project management cult. They have no time for testing of any kind. Testing is for ‘wimps’!

Many don’t survive this brutal and unforgiving landscape, brought to an untimely end by premature release schedules. The hardened veterans carry scars, missing the figurative ‘limbs’ of code quality, professional pride bleeding into the sand of the unforgiving desert of the ‘immaculate schedule’.

The Afterthoughts

Usually made up of outcasts and refugees from the hardened ‘Code
Warriors’, the Afterthoughts do write tests, if their evil project
management overlords allow. Their tests are added after the code has
been implemented and sometimes after it’s been shipped.

This tribe of zombie-like figures are characterised by the harassed
and haunted 1000 yard stares as they struggle to reverse engineer
tests over code that is frequently tightly coupled, riddled with
multiple responsibilities, mutated parameters and static
references. To them tests are seen as a necessary evil to clear the
razor wire fence of the arbitrary unit test coverage percentage.

The Imagineers

These magic fae folk are hard to see and even harder to catch. They
are almost ephemeral and mythical. If you do see them it will be out
of the corner of your eye when they are lounging in a hammock thinking
and when you look again they will have vanished in a puff of magic
logic. They have the mystical ability to visualise complex designs
almost fully formed from thin air.

Although they may write tests, these are most likely to be end to end,
system or integration tests. They have no need for unit tests as a
means of developing code as their astonishing brains and supernatural
vision enable them to see the most complex code paths and data
structures as if they were ghostly apparitions realised in the
swirling mists of an architectural seance.

So to TDD or not to TDD?

So putting aside the sarcasm and the terrible similes, should you use
Test Driven Development or not? Does it fulfil the promise of
evolutionary design claimed by its adherents or is it the ‘snake oil’
claimed by its opponents?

Well, firstly a caveat, the following are the opinions of the author
based on 16 years of software design before using TDD and 12 years
since using it. Most of my opinions have been formed by the
experiences I have had and, more relevantly, by the way my brain
discovers and visualises problems and abstract concepts, YMMV.

What has TDD ever done for me?

Most of my professional programming life has been spent writing in either
procedural or object-oriented languages. In addition, they have been
statically ‘typed’ (typing is a lower level concept in COBOL and
Fortran but you need to declare primitive variable structures up front
so I’ll count it). Given these environments TDD gives me a number of
benefits:

Test coverage

Although I used to write unit tests fairly thoroughly as a COBOL
programmer, and these tests were even written in advance, they
were not automated and therefore hard and expensive to reproduce. TDD
is not synonymous with automated testing and although you can have the latter
without the former my personal experience is that ‘test after’ will
always be sacrificed to the pressure of delivery.

Therefore, TDD, or at least ‘test first’, tends, in real world projects,
to produce better automated test coverage at a fairly fine grained
level (more on that later).


Feedback

Again this opinion is definitely flavoured by the OOP languages I’ve
written in (mainly Java, a bit of C++ and C#) but, unlike the
‘Imagineers’, I have trouble visualising detailed designs fully formed
without a bit of exploration of the concepts and the data involved.

In a language like Java it’s quite hard to ‘try out’ and visualise how
data needs to be transformed or what low level abstractions are
present in the problem space. As I tend to explore problems from both
the top down and bottom up I find it useful to probe at data for

Without TDD the usual option to discover emergent design in ‘the micro’
is to run code liberally peppered with print statements or add break
points and step through with a debugger.

I find TDD gives me quick feedback but has the advantage over print
statements or break points in that the tests don’t need to be unpicked
from the production implementation and that they act as
institutionalised memory and capture the abstractions realised so far
so that you don’t have to use cognitive load in holding a mental model
of every low level data structure and its current state.


This one is definitely open to challenge and is almost the same point
as in ‘Test coverage’.

Having a network of tests written up front against the low level
design criteria means that you have some level of confidence that the
lower level design satisfies your understanding of the problem and
therefore you have some measure of the ‘completeness’ of your
solution. If you’ve fulfilled the tests for that small part of the
design you have completed it to your current understanding.

Of course this presupposes that your tests accurately reflect the
problem and that your understanding is accurate but this is a problem
regardless of TDD.


One advantage of test first is that in order to write tests up front
you have to drive the low level design to be ‘testable’. This means
that if you’re lazy and impatient, like me, you will want to get
your test to have as little set up as possible and only verify one
thing per test.

This inherently tends to lead to a design that favours
small methods and objects with few dependencies and one responsibility.

‘Change net’

Another side-effect of writing a lot of automated tests is that when
you want to make changes to one part of your system the tests for the
part of the system you have not changed will inform you if you’ve made
any breaking changes. I call this the ‘change net’, a bit like the
concept of a safety net.

This kind of failure on change of a different part of the system will flag coupling that you may not have been aware of and give a trigger to reconsider the design.

Although this effect can be achieved with good test coverage written
after the implementation, as discussed, it is not typical in practice to
get high levels of test coverage in the real world if tests are
written after the fact.

In addition, the practice of writing new tests or modifying existing
tests to reflect the change about to be made can give a guide as to
when the change has been completed as the tests will fail until this is so.


A few years ago I employed an experienced developer with a lot of TDD
experience. She came into a team where the first version of the
software had been released and we were working on version 2. I was the
lone web developer on the team and had written all the code to serve
the dynamic web application.

As, at the time, I was the CTO of the consultancy concerned, I was
constantly involved in short term consultancy work and a lot of
presales activity and therefore out of the office on client sites a
lot of the time.

I had about 3 hours to explain the project on Monday
morning before I had to leave and I wasn’t back in the office until
Wednesday afternoon.

I got back on Wednesday and asked said developer how she was getting
on (apologising profusely for leaving her with such little
support). She replied:

Initially I was a bit lost, then I started to read your tests and
I’ve already implemented the first enhancement for version 2

Again, this is not the exclusive territory of TDD as any well written
test harness can provide this ‘living’ document, and BDD style tests
are even more effective, but in the real world TDD grown tests provide
a useful starting point.

Communication through tests – nuff said.

So TDD is a ‘no brainer’ then?

So TDD is a ‘no brainer’ then? Well, yes, sometimes it is implemented
with no brains.

Let’s look at the potential negatives:

You’re mocking me

A frequent criticism of TDD is that it leads to a lot of the objects
that interact with the ‘system under test’ (probably a class or method
in OOP, more on that later) being replaced with mocks that are
‘primed’ with a canned response for each test case. The argument is
that this can lead to:

  1. False positives as mocked ‘objects’ behave as expected but actual
    ones don’t.
  2. Designs that introduce layers and/or abstractions that are simply
    there to support mocking.

Tackling the second point first, I’ve never seen this in the wild
myself but I’ve seen people cite examples. My own feelings are that
this is rare and that the abstractions introduced in most cases are
improvements that reduce coupling rather than introduce more.

The first point I have more sympathy for. I’ve definitely seen this
‘mocking masking behaviour’ issue in codebases including my own
code. I think the antidote to this is:

  1. Try and use the TDD ‘classicist’ approach and only mock external
    interactions with your system if possible.
  2. Try to design your code to use fewer side effects [1] and pass most
    arguments as immutable values as this makes each method/function
    testable without using mocks at all.
  3. Pick your ‘unit to test’ carefully. Try and test a single
    behaviour from end to end rather than defaulting to testing methods
    and classes as units.
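
As a minimal sketch of point 2, assuming Clojure and clojure.test rather than the Java most of this post has in mind: a function that takes immutable values and returns a new value can be tested with no mocks at all. The function `apply-discount` and the shape of the order map are hypothetical, invented purely for illustration.

```clojure
(ns example.pricing-test
  (:require [clojure.test :refer [deftest is]]))

;; Hypothetical pure function: no I/O and no mutation, so the result
;; depends only on the arguments passed in.
(defn apply-discount
  "Returns a new order map with :total reduced by amount, never below zero."
  [order amount]
  (update order :total #(max 0 (- % amount))))

;; One behaviour per test, with almost no set up.
(deftest discount-reduces-total
  (is (= {:id 1 :total 90} (apply-discount {:id 1 :total 100} 10))))

(deftest discount-never-goes-negative
  (is (= 0 (:total (apply-discount {:id 1 :total 5} 10)))))

;; The input is an immutable value, so the caller's order is untouched.
(deftest original-order-is-untouched
  (let [order {:id 1 :total 100}]
    (apply-discount order 10)
    (is (= 100 (:total order)))))
```

Evaluating `(clojure.test/run-tests)` in this namespace runs all three tests. Nothing here needs a mock because there is no collaborator to replace.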

[1] A side effect is when a method, function or procedure acts upon
something other than its inputs and outputs. For example, it may
write to some output device, read input from some external source or
change some stateful value.

Tests, tests everywhere and not a drop to think

TDD, in its purest form, focuses mainly on lower level abstractions and
the ‘fine detail’ of your application design. This tends to drive a
“can’t see the wood for the trees” perspective where attention to
detail means you lose sight of the bigger picture and therefore miss
opportunities to employ alternative approaches.

An antidote to this is to practice TDD ‘in the large’ as well,
otherwise known as behaviour driven development (BDD), a specialisation
of specification driven development [2]. By considering the
acceptance criteria of each ‘feature’, ‘property’ or ‘story’ first and
codifying them in, preferably automated, tests you will be drawn back
up to the level of considering the larger components and the system as
a whole whenever you run and amend these BDD tests.

[2] I’m thinking of trademarking Geek Driven Development.

Your tests are a big (re)factor

Again, this is not exclusive to TDD, but a side effect of using TDD is
that you frequently have a lot of tests. Many of these tests may be
covered by higher level acceptance, system and integration tests.

This is fine until you have to introduce a change to the design that
involves refactoring. This kind of change may involve rewriting tests
at unit (traditionally where TDD sits), component and system level
resulting in three times the effort.

There are several ways to mitigate this.

  • Following the techniques mentioned in “You’re mocking me” will reduce the
    amount of ‘white box’ testing you do and therefore reduce the amount
    of low level changes required.
  • Using your test coverage tool (you do instrument your tests with a
    test coverage tool right?) will help identify when your code is
    already exercised by higher level tests. Given this information it may
    then be possible to use TDD at the fine grained level to act as
    feedback when writing methods/functions/procedures but delete some of
    those tests once the implementation has been developed as they’re
    already covered at the higher level.


So where does that leave TDD?

In my opinion, TDD is a useful tool for providing feedback and a level
of confidence when developing low level design constructs.

It’s a fine grained tool and is useful in the micro.

However, it’s of limited use in the design of high level concepts, like:

  • what are your components?
  • what are the public APIs?
  • how should you delineate responsibilities within your components
    (i.e. packaging, namespaces, etc)

All of these can use TDD in a supporting role but they need other
tools like thought about architecture/design and context (possibly
documented in the form of diagrams and documents, wikis, etc.).
TDD is useful in some languages to provide a feedback loop as
described in ‘Feedback’ but there are other tools that can
provide this in certain languages, for example a REPL [3].

TDD has its place and should be used where useful. I think of it as a
watchmaker’s screwdriver, a fine grained tool, and as such should be
used where appropriate but it’s not a religion.

[3] Read Eval Print Loop


What state are your properties in?

My last blog ‘Prop’ up your tests with test.check walked through an example of using Property Based Testing (PBT) on a very simple RESTful API.

However, this approach had limitations.

I could test the properties of each call on the API in isolation but what if I wanted to generate tests and assert about properties that spanned calls to the API?

I want to be able to generate random get, post and delete methods for randomly generated resources and be able to assert on these randomly generated commands. For example, if my PBTs generate a post followed by a get for the same resource I should find the resource but if the tests generate a post, delete, get sequence for the same resource I should get a 404 (not found) on the get.

This gives me a couple of problems.

  1. How to generate random streams of commands with appropriate arguments.
  2. How to assert on properties based on the expected state of the resources on the server.
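
One way to picture the second problem, before reaching for any library, is to keep a plain model of the server state and fold a sequence of commands over it. The sketch below is ordinary Clojure and is not stateful.check's API. The command shape `{:method ... :id ...}` and all the function names are hypothetical, and it assumes a successful get returns 200.

```clojure
;; Model of the server: the set of customer ids that currently exist.
(defn next-state
  "Returns the model after applying one command."
  [state {:keys [method id]}]
  (case method
    :post   (conj state id)
    :delete (disj state id)
    :get    state))

(defn expected-status
  "HTTP status the model predicts for a get of id in the given state."
  [state id]
  (if (contains? state id) 200 404))

(defn expected-get-statuses
  "Replays commands against the model, collecting the predicted status
  of every :get in order."
  [commands]
  (second
   (reduce (fn [[state statuses] {:keys [method id] :as cmd}]
             [(next-state state cmd)
              (if (= method :get)
                (conj statuses (expected-status state id))
                statuses)])
           [#{} []]
           commands)))

(expected-get-statuses [{:method :post :id 1}
                        {:method :get :id 1}])
;; => [200]

(expected-get-statuses [{:method :post :id 1}
                        {:method :delete :id 1}
                        {:method :get :id 1}])
;; => [404]
```

A real test would run the same commands against the API and compare the actual statuses with the ones the model predicts.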


Fortunately there is a Clojure library for that.

Stateful.check allows me to model the state of any system and to generate random streams of commands with their associated pre and post conditions.

Due to a change in the implementation of the rose tree in the latest version of test.check I needed to use Michael Drogalis’ fork of stateful.check.

Stateful.check uses specification maps to model state, define how to generate arguments, check pre and post conditions, etc.

Stateful.check specifications have two parts: an abstract model (state) and a real execution. It took me some trial and error to work out how the two fit together. The abstract model stores the functions you define to operate on either the model or the real values as symbols and records the order of commands. Only during the execution phase does it replay this stack of functions, this time actually evaluating them against the real values.

The model (state) is defined as a hash map and you supply the initial value of the model in the :init-state value of the overall specification.

The state map is used to model the state of the actual implementation. In my case the state map models the state of Customer resources on the server for my simple Customer RESTful API.

To use stateful.check I need to include it in my project.clj file:

  {:dev {:dependencies [[javax.servlet/servlet-api "2.5"]
                        [ring/ring-mock "0.3.0"]
                        [cheshire "5.5.0"]
                        [org.clojure/test.check "0.9.0"]
                        [com.gfredericks/test.chuck "0.2.6"]
                        ; add stateful-check in dev profile
                        [mdrogalis/stateful-check "0.3.2"]]}})


‘Prop’ up your tests with test.check

I’ve been experimenting for a few months on and off with property based testing, otherwise known as generative testing. This blog is an attempt to show property based testing applied to the kind of business problems I deal with most days.

The principle of property based testing is to look for invariant ‘properties’ of a function under test, generate random test data for the function and verify that the ‘property’ of the function holds true for every generated test case. This contrasts with traditional testing that takes an ‘example’ based approach, i.e. explicitly coding each input and asserting on the expected output.
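
To make the contrast concrete, here is a minimal sketch using test.check (`prop/for-all`, `gen/vector`, `gen/int` and `tc/quick-check`). The property chosen, that sorting yields an ordered permutation of its input, is only an illustration and is not taken from the API discussed later. The namespace name is hypothetical.

```clojure
(ns example.sort-props
  (:require [clojure.test.check :as tc]
            [clojure.test.check.generators :as gen]
            [clojure.test.check.properties :as prop]))

;; Example based: one hand-picked input and its expected output.
(assert (= [1 2 3] (sort [3 1 2])))

;; Property based: for any generated vector of ints, sorting keeps the
;; same elements and leaves every adjacent pair in order.
(def sort-is-ordered-permutation
  (prop/for-all [v (gen/vector gen/int)]
    (let [s (sort v)]
      (and (= (frequencies v) (frequencies s))
           (every? (fn [[a b]] (<= a b)) (partition 2 1 s))))))

;; Run 100 randomly generated trials; the returned map's :result
;; is true when every trial satisfied the property.
(def result (tc/quick-check 100 sort-is-ordered-permutation))
```

The example based assertion checks one case I thought of. The property is checked against 100 cases I did not have to think of, including the empty vector and vectors with duplicates.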

Property based testing (PBT) is a powerful technique that discovers edge cases more thoroughly than traditional ‘example’ based testing as randomly generated input tends to discover test cases that no human would think of. Also PBT can generate hundreds or even thousands of tests.

PBT is exemplified in John Hughes’s work on Haskell’s QuickCheck and the subsequent Erlang implementation of QuickCheck. QuickCheck has also been implemented in other languages, notably FsCheck for F#, ScalaCheck for Scala and, unsurprisingly, there’s an implementation for Clojure called test.check by Reid Draper. I am not going to go through a detailed description of the power of PBT in this blog but if you’re interested these talks by Reid (Reid Draper – Powerful Testing with test.check) and John Hughes (John Hughes – Testing the Hard Stuff and Staying Sane) are well worth checking out.

However, I found PBT was not a substitute for example based tests but more a supplement. Example based tests can provide ‘developer readable’ documentation in a way that PBT doesn’t (or at least doesn’t for me). Also thinking of properties to test is hard. It takes quite a lot of thought and sometimes quite complex code in its own right to generate and verify randomly generated test data. I personally found it hard to come up with generic properties of a function before I’d started implementing.

I approached this by using a combination of example based (usually TDD) tests, the REPL for explorative testing and PBT.

My main issue with most of the examples of PBT I’ve been through (and I must have tried at least 6 tutorials!) is that they are simple and algorithmic. By that I mean the functions were pure and tended to have properties that were easily verifiable and inputs that were easily generated. For example, testing sort on a vector or the behaviour of a queue.

I live in the world of business where most of my problems are about moving and transforming data. I don’t think I’ve implemented a sort or a queue like structure since I left university. My problems are messier and tend to involve inconvenient things like state.

Therefore I thought I would try and put together a simple but slightly more real world example to use PBT on that involved a RESTful API. I hope to show how and when I used PBT in combination with traditional REPL and example based tests.

My imaginary API is really simple. It consists of a ‘customers’ resource that will allow CRUD operations.


I don’t give a fig(wheel) – part 2

In the first part of this blog I showed a way of loading figwheel from an nREPL while keeping the figwheel code separate from the generic build used to build for production.

One of the things I wanted to ensure in my lein project was that any Clojurescript test code is kept separate from the production code to be shipped.

I decided to use the doo library to run my cljs tests. I experimented with using the cljsbuild plugin with a js script and phantomjs to run cljs tests as described here but I found this approach didn’t always report failures in async tests correctly.

To set up doo you need to add the following entries to the project.clj file:

 :profiles {
   :dev {
     :source-paths ["dev/src" "src-cljs" "test-cljs"] ; add test dir
     :dependencies [...
                    [lein-doo "0.1.6"]]
     :plugins [[lein-doo "0.1.6"]]}}

As well as adding lein-doo to the dependencies and the plugins for only the :dev profile, you need to add the directory containing the test code to the source paths. In my case I am keeping my test code in a directory called ‘test-cljs’.

In addition lein-doo needs a test runner to bootstrap it. Therefore I added a test-runner namespace that calls the doo-tests function to run the test namespaces (in this case only core-test).

(ns blogcljsfigwheel.test-runner
  (:require [doo.runner :refer-macros [doo-tests]]
            ;; the test namespace must be required so that it is compiled in
            [blogcljsfigwheel.core-test]))

(doo-tests 'blogcljsfigwheel.core-test)

I also needed to add the cljsbuild config for the test namespaces to the project file to ensure the test code is transpiled to JavaScript.

:cljsbuild {:builds {...
                     :test {:source-paths ["src-cljs" "test-cljs"]
                            :compiler {:main "blogcljsfigwheel.test-runner"
                                       :output-to "resources/private/js/unit-test.js"
                                       :optimizations :whitespace
                                       :pretty-print true}}}}

If you have phantomjs installed on your path you can then run the tests using lein doo phantom test.

As you can see the test build outputs to the resources/private directory whereas I configured the app build to output to the target directory:

 :cljsbuild {:builds
             {:app {:source-paths ["src-cljs"]
                    :compiler {:main "blogcljsfigwheel.core"
                               :output-to "target/cljsbuild/public/js/compiled/blogcljsfigwheel.js"
                               :output-dir "target/cljsbuild/public/js/compiled/out"
                               :asset-path "js/compiled/out"}}}}

In order to package the Clojurescript in the uberjar I also needed to add:

 :profiles {:uberjar {:aot :all
                      :cljsbuild {:builds {:app
                                           {:jar true
                                            :compiler {:optimizations :advanced}}}}
                      :prep-tasks ["compile" ["cljsbuild" "once" "app"]]}}

The prep-tasks ensure that the cljsbuild is run for the app build when the uberjar is built using lein uberjar.



I don’t give a fig(wheel)

This blog is actually more of a collection of notes to myself for how to configure Leiningen to build a Clojure server and Clojurescript client.

Over the last few weeks I’ve been trying to work out how to build a Clojurescript client and make sure test code is not included in the production build and that Figwheel is included only in the dev build.

In the past the approved way to connect to the Figwheel REPL using nREPL was to add the :nrepl-port configuration parameter to your project.clj. You would then start Figwheel using the lein figwheel plugin, which would then allow your favourite editor to connect to Figwheel using nREPL.

The new way to connect is detailed here. However, I have struggled with configuring the project.clj file to share some cljsbuild parameters across profiles but still inject figwheel into the build for development only using this new approach.

What follows is a guide to how I accomplished this although I am sure there are other approaches.

Setting up common cljsbuild

I wanted one place to configure the common parts of the cljsbuild so I started by adding an ‘app’ build.

 (defproject blogcljsfigwheel "0.1.0-SNAPSHOT"
   :plugins [[lein-cljsbuild "1.1.1"]
             [lein-figwheel "0.5.1"]] ;; Note figwheel plugin
   :resource-paths ["resources" "target/cljsbuild"]
   :cljsbuild {
     :builds
     {:app {:source-paths ["src-cljs"]
            :compiler
            {:main "blogcljsfigwheel.core"
             :output-to "target/cljsbuild/public/js/compiled/blogcljsfigwheel.js"
             :output-dir "target/cljsbuild/public/js/compiled/out"
             :asset-path "js/compiled/out"}}}}
   ...
   :profiles {:dev
              {:dependencies [[figwheel-sidecar "0.5.1"] ; Deps for sidecar
                              [com.cemerick/piggieback "0.2.1"]] ; piggieback
               :repl-options {:nrepl-middleware [cemerick.piggieback/wrap-cljs-repl]}}})

I initially thought that I could add the figwheel true option to the :dev profile. Like so:

:profiles {
   :dev {:cljsbuild {:builds
                     {:app {:figwheel true}}}
         :dependencies ...}}

And then launch figwheel by starting a REPL (either a lein repl or a CIDER repl) and then connect by starting Figwheel from the repl. Like so:

$ lein repl
user=> (use 'figwheel-sidecar.repl-api)
user=> (do (start-figwheel!) nil)
Figwheel: Starting server at http://localhost:3449
Figwheel: Watching build - app
Compiling "target/cljsbuild/public/js/compiled/blogcljsfigwheel.js" from ["src-cljs"]...
Successfully compiled "target/cljsbuild/public/js/compiled/blogcljsfigwheel.js" in 1.057 seconds.
user=> (cljs-repl)
Launching ClojureScript REPL for build: app
Figwheel Controls:
    Docs: (doc function-name-here)
    Exit: Control+C or :cljs/quit
 Results: Stored in vars *1, *2, *3, *e holds last exception object
Prompt will show when Figwheel connects to your application

However (start-figwheel!), although it pulls in the cljsbuild config from project.clj, doesn’t merge in profiles.

So to get around this I created a user.clj file that I added some helper functions and a custom figwheel config to.

(ns user
  (:require [figwheel-sidecar.repl-api :as ra]))

;; As figwheel doesn't seem to properly merge cljsbuild config from profiles
;; we have to define our figwheel config here and use some helper fns to start
;; and stop figwheel.

(def figwheel-config
  {:figwheel-options {}
   :build-ids ["dev"]
   :all-builds
   [{:id "dev"
     :source-paths ["src-cljs"]
     :figwheel true
     :compiler {:main "blogcljsfigwheel.core"
                :output-to "target/cljsbuild/public/js/compiled/blogcljsfigwheel.js"
                :output-dir "target/cljsbuild/public/js/compiled/out"
                :asset-path "js/compiled/out"}}]})

(defn start-figwheel! [] (do (ra/start-figwheel! figwheel-config) nil))

(defn stop-figwheel! [] (do (ra/stop-figwheel!) nil))

(defn cljs-repl [] (ra/cljs-repl))

I put this user.clj file in a dev/src directory and then added this directory to the :dev source paths.

:profiles {
   :dev {...
         :source-paths ["dev/src" "src-cljs"]

I also added user as the initial namespace for the repl-options.

:repl-options {:init-ns user ...}

Once this is in place you can start a REPL and issue the commands:

user=> (start-figwheel!)
Figwheel: Starting server at http://localhost:3449
Figwheel: Watching build - dev
Compiling "target/cljsbuild/public/js/compiled/blogcljsfigwheel.js" from ["src-cljs"]...
Successfully compiled "target/cljsbuild/public/js/compiled/blogcljsfigwheel.js" in 2.578 seconds.
user=> (cljs-repl)
Launching ClojureScript REPL for build: dev
Figwheel Controls:
    Docs: (doc function-name-here)
    Exit: Control+C or :cljs/quit
 Results: Stored in vars *1, *2, *3, *e holds last exception object
Prompt will show when Figwheel connects to your application
To quit, type: :cljs/quit

Note: The figwheel-sidecar.repl-api is not directly required and referred through the ‘use’ macro so the (start-figwheel!) and (cljs-repl) commands are those defined in the user.clj file that use the custom figwheel-config.

Server side

Obviously, you need the server running to serve the client .js file. In my case I served the compiled JavaScript from hiccup mapped to the root of the server using compojure.

(ns blogcljsfigwheel.http
  (:require [com.stuartsierra.component :as component]
            [compojure.core :refer [defroutes GET POST]]
            [compojure.route :as route]
            [hiccup.core :refer [html]]
            [hiccup.page :refer [include-js]]
            [ring.middleware.defaults :refer [wrap-defaults api-defaults]]
            [ring.adapter.jetty :refer [run-jetty]]))

;; The html/head/body/div wrappers are reconstructed from the closing
;; brackets; any id on the container div is not shown in the original.
(def home-page
  (html
   [:html
    [:head
     [:meta {:charset "utf-8"}]
     [:meta {:name "viewport"
             :content "width=device-width, initial-scale=1"}]
     [:link {:rel "stylesheet" :href "" :integrity "sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" :crossorigin "anonymous"}]]
    [:body
     [:div
      [:h3 "Loading...."]
      [:p "Loading application. Please wait...."]]
     (include-js "js/compiled/blogcljsfigwheel.js")
     [:script {:type "text/javascript"} "addEventListener(\"load\", blogcljsfigwheel.core.main, false);"]]]))

(defroutes app-routes
  (route/resources "/")
  (GET "/" [] home-page)
  (POST "/echo/:echo" [echo] (str echo " has been to the server and back."))
  (route/not-found "Not Found"))

(def app
  (wrap-defaults app-routes api-defaults))
... Not shown - code to run server. I'm using Jetty Adapter and Stuart Sierra's components ...

I am using an exported ‘main’ function in a different (core) namespace to start the client and you can see the ‘addEventListener’ code bootstrapping it in the script tag in home-page.

I then define helper functions in the user.clj to allow me to start, restart and stop the server using Stuart Sierra’s component reload workflow.

See my next blog for more detail or clone the repo here

Blog to follow, how to add cljs tests without polluting the uberjar build.

WhIP Scrum into shape

I’ve been using Scrum or similar ‘timeboxed’ project management methods in Software Development for 19 years (starting with DSDM in 1997). During this period I have had some spectacular successes and some more average results but rarely any spectacular failures.

So having got that out of the way I want to talk about why I think that imposing artificial ‘timeboxes’ causes almost as many issues as it resolves.

I’m going to take examples from a number of projects I’ve worked on over the last few years with the names and places changed to protect the not-so innocent.

The idea of setting a timeboxed period for delivery of software has been around for more years than I have (and that’s a lot!). All ‘projects’ have a deadline, an end point. This constraint is usually dictated by two things, a business imperative and/or a finite amount of money. All too often I see the second being the main driver for delivery, see my talk ‘Projects kill Agile Development’ on why this is a lousy way to constrain software delivery and the bad behaviours it drives.

However, what I’m discussing here is the imposition of an artificial ‘timebox’ on a delivery team as a way of introducing ‘urgency’ and measuring progress. This principle of an ‘x week long’ timebox is most obviously demonstrated in Scrum but is present in many agile software development methods including XP [1], DSDM and Crystal.

The intent of this timebox is actually to focus the team on building small, well tested, deliverables that are released to production to provide early ROI and to learn what customers actually find useful. The learning is as, if not more, important than the money made or saved.

However, in almost every ‘project’ I’ve ever worked on in large organisations this ‘release regularly’ style of iteration doesn’t happen. Instead an artificial time period is established (usually management read a book on Scrum and establish two weeks as if there were something magical about that frequency – BTW the team should decide the timebox size!). This artificial timebox is punctuated by some kind of review meeting and a retrospective, usually focusing on what went badly rather than positives that can be expanded upon.

The issues with this artificial timeboxed period are many:

  • People overestimate what can be achieved, so work gets compressed towards the end of the period, which leads to quality issues.
  • Artificial pressure is imposed by management, leading to shortcuts on the work compressed towards the end of the timebox. These are most often tasks such as testing (particularly non-functional testing), support materials, user guides, etc. These tasks are then either rushed or dropped altogether.
  • Often this timebox is perceived through the established lens of the ‘build phase’ (or if you are lucky ‘build and test phase’). This leads to tasks required prior or subsequent to ‘build’ being given less focus or being discarded altogether.
  • Timeboxes are seen, not as a way to release frequently to learn if the correct thing is being built, but as a measurement of progress towards a fixed and agreed set of ‘requirements’. This is predicated on a fallacy that the senior stakeholders fully represent the user population, are prescient and can see how needs and technology will change in the future. This leads to projects that have regular, but not particularly useful, checkpoints, disguised as reviews, at which management measures progress against an artificial and unvalidated understanding of the needs of the actual customers. These reviews tend to focus on functional and demonstrable features, and any non-functional or technical concerns take a back seat.

I have seen this ‘iterative waterfall’ approach go on for many months, or even years, before software is released into the ‘wild’ for true validation.

There have been many attempts to address the deficiencies of the ‘artificial timebox’. The most frequent is the ‘Definition of Done’ (DoD), which defines what tasks need to be complete and verified before a unit of functionality is declared ‘done’. The DoD is a useful construct but all too frequently I see it constrained by artificial organisational ‘silos’. For example, the following activities or areas are excluded from the DoD:

  • ‘copy’ and ‘legal’ are almost never integrated into the team so don’t form part of the DoD,
  • the route to live involves handover points to other teams that restrict the frequency of delivery and are not accounted for in the DoD,
  • UX is ‘outside’ the development team and is not in DoD,
  • non functional requirements testing is outside the development team and not in DoD,
  • user guide/help/tutorial materials and handover to a ‘dedicated support’[2] team are not included in DoD,
  • ‘ivory tower’ architecture imposes artificial constraints on the development team without real justification or understanding of the team’s needs or environment,
  • technical constraints, particularly around the provision of development/test environments, mean the team cannot include some elements of testing in the DoD (again frequently NFR testing).

Another interesting symptom of this artificial timebox is the exclusion of some necessary activities.

Probably the most common of these is the ‘sprint zero’. ‘Sprint zero’ is a timebox that is introduced at the start of a ‘project’ to provide a place to carry out activities that define the ‘scope’ of the delivery and to allow less ‘agile’ parts of the organisation to provide input to the project. There are a number of issues with ‘sprint zero’ but most fall into the category of assuming that all requirements are known, well understood and immutable. I’ve never worked on a project where this has held true.

Another strange symptom I’ve seen is to move early activities that are pre-requisites of the build and test phase to their own timeboxes that precede the build and test timebox. Most frequently I’ve seen functional tests and UX work moved to their own t-1 or t-2 ‘sprints’ with a ‘Definition of Ready’ (DoR) introduced.

I find this practice extremely disturbing and unhelpful. It splits the development team into ‘them and us’ (UX/Testers and Devs), and it treats development and test execution as more important than UX and test/environment preparation. It also makes these pre-development activities less visible. I have even seen it lead to a blame culture between the UX/testers and the developers. This is just ‘waterfall’ in miniature.

There are more obvious symptoms like:

  • No attempt at a DoD so no real idea of whether a feature/story is ready to ship,
  • A fixation on a story/feature being ‘complete’ when it’s been reviewed and passed its definition of done (a feature is apt to change even, or more especially, when it’s in production so how can you ever tell it’s ‘complete’? It may be ready to ship/deliver but that’s it!),
  • Moving activities that are needed for release outside the timebox to make work ‘fit’ in the artificial time constraint,
  • Timeboxes appearing on Gantt charts to placate managers used to controlling time and resource who don’t understand the concept of scope management and feedback from your customers.

Frequently these artificial timeboxes can go on for months before code gets anywhere close to an environment representative of ‘live’, let alone actually released. The apparent feedback and progress lull the development team and stakeholders into a feeling of complacency that everything is as expected, fulfils customers’ needs and will scale and perform correctly in production.

So how do we address these concerns?

Most of the symptoms above come from one of two causes:

  • Organisational silos preventing the team incorporating all the skills required to ship (including timely input from the legal department, marketing, etc.)
  • Considering a ‘feature’ or ‘story’ that is too large

The first takes a cultural change that is difficult, if not impossible, to drive from the perspective of an individual project or initiative and is beyond the scope of this post.

The second is something that comes with education and practice. Some people advocate making iterations smaller to try and force the team to focus on very small amounts of functionality. I find this rarely works as the ‘features’ or ‘stories’ still rarely conform to whatever timebox size has been chosen.

This is where an agile coach can help in educating the business stakeholders that releasing something very small can still be useful as a learning experience.

If we don’t use timeboxes, how do we ensure that the team are focusing on what’s required, that progress is being made and that we are regularly releasing to gain feedback from the only audience that really matters, our customers?

The reason for timeboxing is to constrain what the team is working on, to ensure that the team is not losing time in constant context switching between tasks and that they are seeing ‘features’ through to release. The secondary reason, for which I find the timebox is much less useful, is to provide a guide to future planning.

Another way of reducing context switching is to adopt Kanban practices and to enforce work in progress limits. If you rigidly enforce limits to the work that can be in progress at any one time and make the flow of ‘features’ through activities visible this can be used to educate both stakeholders and the team about sizing features.
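As a rough illustration of what ‘rigidly enforcing’ a limit could look like, here is a minimal sketch in Python. The `KanbanBoard` and `WipLimitExceeded` names, the column names and the limits are all made up for this example; a real team would more likely get this from a physical board or a tracking tool than from code.

```python
class WipLimitExceeded(Exception):
    """Raised when a column is already at its work-in-progress limit."""


class KanbanBoard:
    """Hypothetical board: each column holds features up to a fixed limit."""

    def __init__(self, wip_limits):
        # wip_limits maps a column name to the maximum number of features
        # allowed in that column at any one time.
        self.wip_limits = dict(wip_limits)
        self.columns = {name: [] for name in self.wip_limits}

    def _check_capacity(self, column):
        if len(self.columns[column]) >= self.wip_limits[column]:
            raise WipLimitExceeded(
                f"'{column}' is at its limit of {self.wip_limits[column]}"
            )

    def add(self, feature, column):
        """Start a new feature in a column, refusing if the column is full."""
        self._check_capacity(column)
        self.columns[column].append(feature)

    def move(self, feature, source, target):
        """Move a feature along the flow, refusing if the target is full."""
        # Check the target first so a refused move leaves the board unchanged.
        self._check_capacity(target)
        self.columns[source].remove(feature)
        self.columns[target].append(feature)


board = KanbanBoard({"build": 2, "test": 1})
board.add("login page", "build")
board.add("password reset", "build")
board.move("login page", "build", "test")
```

With ‘test’ limited to one item, ‘password reset’ cannot move on until ‘login page’ has left that column, which is exactly the kind of visible blockage that prompts a conversation about feature size.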

If ‘done’ means ‘released to our customers’ we can use very small, incremental changes to test the waters of whether our latest products or approaches have a market.

We should measure the value of re-engineering our software. Re-architecting our platform to scale out under peak load may not be sexy but neither are thousands of customer complaints about our site being unavailable. Isn’t there as much value in reducing the complaints of customers and not losing reputation as gaining market share with a new offer?

Isn’t it easier to plan for future change by measuring average ‘cycle’ time and throughput to identify how long a feature actually takes to be delivered to the customer? This also has the added benefit that it educates stakeholders in how to break up ‘features’ or ‘stories’ into smaller deliveries to lower cycle time and smooth out throughput. Kanban techniques ‘done right’ also ensure that the team maintains visibility of, and focus on, the activities required to release, not just on an artificially imposed deadline of the end of a timebox.
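To make the arithmetic concrete, here is a small Python sketch. It assumes cycle time is counted in calendar days from starting work on a feature to releasing it to customers, and throughput as features released per week; the function names and the dates are invented purely for illustration.

```python
from datetime import date
from statistics import mean


def cycle_times(features):
    """Days from starting each feature to releasing it to customers."""
    return [(released - started).days for started, released in features]


def throughput_per_week(features, period_start, period_end):
    """Features released per week within an inclusive date range."""
    released = sum(
        1 for _, released_on in features
        if period_start <= released_on <= period_end
    )
    weeks = ((period_end - period_start).days + 1) / 7
    return released / weeks


# Made-up (started, released) dates for three small features.
features = [
    (date(2016, 3, 1), date(2016, 3, 4)),
    (date(2016, 3, 2), date(2016, 3, 9)),
    (date(2016, 3, 7), date(2016, 3, 12)),
]

average_cycle_time = mean(cycle_times(features))
weekly_throughput = throughput_per_week(
    features, date(2016, 3, 1), date(2016, 3, 14)
)
```

Tracked over time, a rising average cycle time is a prompt to look for a bottleneck or an oversized feature rather than a reason to push the team harder.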

However, I want to add a note of warning. Kanban requires discipline in both a development team and the organisational stakeholders. To do well, it requires infrastructure to support rapid change in parallel. It requires high levels of automation of testing, continuous integration and delivery. Most of all it needs the organisation as a whole to understand what the real value of their software platform is.

For all of the above reasons I most often start with a ‘timeboxed’ approach to delivery in an organisation with no exposure to agile or lean practices. I would, however, adopt some of the measures that Kanban uses to identify bottlenecks and enforce WIP limits.

I think the takeaway from this rather rambling post is: look out for the symptoms of artificial ‘timeboxing’, and keep in mind that nothing is of value until it’s released and that ‘testing’ doesn’t stop in production. We need to gather customer feedback and monitor how design decisions impact on our system, then design mechanisms to assign value to this feedback to identify what we need to feed into our development teams.

1. eXtreme Programming actually advocates regular delivery to production but I frequently see ‘regular delivery’ being redefined to mean ‘internal delivery’ to Stakeholders rather than an actual release.
2. See my blog Lean Software Support or Lazy Developers are Good Developers for my perspective on dedicated support teams.

The Physics of Software or Why I am not a ‘Software Engineer’

About ten years ago I had the job title ‘Software Engineer’ and I occasionally still describe myself as a ‘Software Architect’. However, I am not entirely comfortable with being described as either. I have taken to using the title ‘Technical Navigator’ because it suggests steering through treacherous waters, and no one really knows what to make of it! 

The reasons I am uncomfortable with all analogies to engineering and construction are many but the primary reason is ‘physics’. Civil engineering is constrained by the laws of physics. With some notable exceptions that I will come to later, most business led software development is not constrained in the same way.

Those of you who know me personally will have heard my oft repeated phrase “unlike software ‘engineering’ when they were constructing the Tyne Bridge, someone didn’t widen the Tyne by 300 miles (and simultaneously relocate it to the South Pole)”.

During software development it is the norm for the sponsors to add and remove requirements. Even more frequently, they don’t really know what to ask for, or how to ask for it. Even when they do, they don’t really know if the customer wants or needs what they are suggesting until it’s built and shipped. We can’t blame business sponsors for this. The world moves at a tremendous pace and we Software Developers are one of the primary reasons why!

The same sponsors believe that software development is as well understood as civil engineering, and expect the same predictability. However, they overestimate this predictability. How many civil engineering projects overrun time and budget?

In reality, software development is probably at the stage of architecture and construction in Western Europe in the early Middle Ages. We sometimes construct some spectacular Cathedrals, but for every one of those there are hundreds of wonky, subsiding, crumbling shacks that keep the worst of the weather out, but leak and lose thatch in a strong breeze. To (over) stretch the analogy, all of these buildings need maintenance to stay fit for purpose and we keep adding outbuildings and extra chimneys as the families living in them grow.

There are plenty of examples in history of civil engineering following similar patterns to large software code bases. As an example, consider one of my favourite buildings, Ely Cathedral (just up the road from where I grew up).

Ely Cathedral is a splendid edifice, with a unique Octagonal Tower. However, it didn’t spring fully constructed from the Cambridgeshire Fens overnight, and it didn’t (still doesn’t) survive without constant maintenance.

Its construction began in the 11th Century (1083AD) and it didn’t reach its current form until the 16th Century. Since its initial construction, arches and buttresses have been ‘tacked on’ to shore up parts of the building, and two transepts and the Norman Central Tower have collapsed.

What has changed between the work of the medieval stonemasons and modern civil engineering is the growth of the fields of mathematics, physics, material science, etc. Civil Engineers now understand the laws and constraints within which they need to work.

In Software Engineering we are still finding and pushing our limits. With the exception of software that actually is constrained by physical limitations, like guidance systems, ultra low latency distributed systems, etc., we are still unclear about the limits of the materials we are using or the environments we are using them in.

I have high hopes of the current movement towards the pragmatic application of functional programming, persistent immutable data structures (and databases) and the promise of fields of mathematics, like linear logic, to help solve some of the problems of distributed systems.

However, I think we are still in the early middle ages of computer science. As Dr Philip Wadler says, “you don’t put science in your name if you’re a real science!”, and I don’t put ‘Engineer’ in my job title as I’m not a real ‘Engineer’.

Chris Howe-Jones, Technical Navigator

Lean Software Support OR “Lazy Developers are good Developers”

Recently an ex-colleague wrote a blog about support teams and their role in the world of Agile Software Development that used some rather emotive and denigrating language about Software Developers. He focussed on what he perceived as Software Developers’ lack of care and attention to what happens to their software after it’s released to production.

He added some emotive and unhelpful language stating that, in his opinion, Software Developers in an agile development are:

  • focused only on coding, implying they don’t care about anything beyond ‘code’.
  • lazy
  • malicious
  • only interested in the next cool technology

However, under the distracting emotional language he was trying to make the point that he felt Agile teams adopt a ‘throw it over the wall’ attitude to production support.

His suggested antidote was more documentation, knowledge articles, embedding support team members in development ‘projects’ and, above all else, good old fashioned ITIL with a dedicated support team to look after software in production.

I would wholeheartedly agree with this… ten to fifteen years ago, when this was a valid model for software support.

But it’s not 2000, or even 2005, anymore. Modern Lean and Agile development and support rely on a characteristic that he highlighted in developers and that I agree with and am proud of:

We software developers are lazy! And this is good!


Risk, What is it good for?

Firstly, let’s clear up any ambiguity: this blog is about the concept of risk in software systems, not about the excellent board game [half the readership has now left].

I’ve spent large parts of my career working in large corporate organisations. In these companies I frequently hear the phrase “We won’t try that as [insert major financial institution or Government dept here] are too risk averse”.

I just want to examine that statement for a moment.

In some cases this response has come from the suggestion of introducing a new technology that has been trialled by ‘bleeding edge’ companies for more than two years. In three organisations I’ve heard this response to the suggestion that the code base be refactored as it has become too unwieldy to change and is not well tested.

The risk that is being referred to here is the risk of spending money on something that may break when shipped to production. It’s fair to highlight this risk but this glib response doesn’t consider the risk of doing nothing.

What is the risk to the organisation if the competition delivers on that new technology or platform ahead of it?

What is the risk to the organisation if the code base becomes so large and difficult to reason about that even superficial changes to it become a mammoth, expensive project (with its own risks of miscommunication and error due to the number of people and resources involved)?

Or that changes are so difficult to test that code is changed in ‘edit and pray’ mode and deployment involves more finger crossing and sleepless nights than celebration of success?

I would argue that it’s better to have small, focussed and measured initiatives to trial new technology fast and, if necessary, fail early before too much money is burnt.

Wouldn’t investment in refactoring or even rewriting parts of the code base pay off in future with cheaper and safer changes?

Doesn’t it seem like common sense to try these incremental approaches to change?

However, the cynical part of me sees the real drive for this ‘risk aversion’ reaction as being the next bonus or the next promotion for the manager quoting it. They are rewarded for delivering over the next 2-3 month horizon and penalised for missed deadlines. No one loses their job over following the established technologies and patterns rubber stamped by the ‘architecture department’.

You can’t blame the manager for reacting in a way determined by the way they are incentivised.

You can blame the organisational leadership for measuring them this way!