If you find that you are stepping outside of your product development process on a regular basis, you need to do one of two things:

1. You need to enforce the process more stringently and educate all team members in how to follow the process;

2. You need a different process, my friend.

Welcome to the world of impedance mismatch. Over the years, I have seen a *colossal* amount of effort poured into smoothing out the mismatch with #1. If you’re sitting there with a full Programme Management office and a three-page Change Request document and are still having to subvert the process to deliver results, you’ve reached the end of the road for the current process. But the problem comes when you take the plunge with #2 and, after a while, find yourself unexpectedly back at #1. There is a point of diminishing marginal returns: a point when it’s no longer a matter of continuing to educate everyone in the new process (they get it) or locking down workflows (the workflows are tuned). Instead, it might be the case that the new process isn’t the right one.

Crucially, this doesn’t mean that the new process wasn’t the right one at the start. Analyse the reasons for the turbulence: at the beginning these were likely fear-of-the-new issues; however, it’s entirely possible that the organisation has developed new issues as it becomes used to working in a new way. Moving from Waterfall to Scrum might have solved 90% of your problems, but it’s not the end of the road: Scrum may be only a staging post to something else.

In *Agile at Westfield*, I covered how Westfield adopted the Scrum process, noting that a key component is the ability to inspect and change the development process, moving from the vanilla ‘out-of-the-book’ Scrum implementation to one that better interlocks with the organisation. I commented then that ‘no battle plan survives contact with the enemy’, and this has proved to be true.

Since writing about the second anniversary of the Scrum process at Westfield, we entered a crunch period in the run-up to Christmas, fitting in, among other things, the launch of a mobile site, the addition of three new payment methods, and the re-engineering of the checkout experience to allow payment to multiple retailers with several payment types on a single page. This was a high-risk strategy, and we were forced to compromise our product development process to hit the deadlines. Some bugs slipped through, but mostly it all held together, and we spent a couple of sprints patching up the holes. There was a conscious decision to pile up technical debt and dial back test coverage (for reasons covered below), effectively taking calculated risks to meet the dates.

Before settling back into a comfortable Scrum rhythm of three-weekly iterations, as if the last four sprints were a bad dream, one question needs to be asked: was the recent period really an aberration, or were we witnessing the evolution of the organisation to a new ‘normal’? If the latter, how do we cope? What could we learn and put in place before finding ourselves back there again? Here are my takeouts…

1. The piper must always be paid: you cannot run development at 150% and not expect to rack up tech debt. In software development especially, there is no such thing as a free lunch; it was acknowledged across the business up-front that the short-term race to the finish would need to be followed up by sprints with very little business benefit in them.

2. Our test code dwarfs the production codebase by a 3:1 ratio. It is insane to think that tech debt piling up in your test codebase isn’t real technical debt that needs to be paid down. Thanks to that hidden tech debt, our checkout refactor fell foul of brittle, badly coded tests, leaving us with no recourse but to drop test coverage by disabling tests.

3. In special circumstances, it is possible to release without certainty that the code is production-ready. If you have tested the important pathways (e.g. payment) thoroughly, you can be reasonably sure that any bugs that slip through will be in less critical areas. You can then watch the system in production and patch as necessary, with low impact to customers.

The last point turned out to be Pandora’s box: traditionally, skimping on QA releases all the software evils into the world, but there is also some good in there. You can scale back from ‘perfect’ to ‘good enough’ in your QA process only if you still have rigour around key processes and are able to release changes quickly. This means embracing Continuous Deployment techniques within a Scrum framework, and the two are odd bedfellows.

Scrum vs Continuous Deployment

At the heart of the Agile development methodology is the idea that you need to release a little, and release often. For teams coming at Agile off the back of nine-month-phased Waterfall projects, cutting down to monthly iterations feels blisteringly fast, but it allows a lot of the features of Waterfall to remain in the background, such as long UAT and release processes; it may be the case that development can start and finish on a feature within the same month, but that feature may face a tortuous journey to production from then on. If you then have pressure from the business owner to deliver on a fortnightly basis, you soon realise that you have an Agile process with a cumbersome Waterfall tail, where the set-up and tear-down costs for a fortnightly sprint are a large percentage of the total costs of the sprint. The time has come to optimise the workload downstream of Dev so that each feature can flow efficiently and continuously into production.
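
To put some (entirely invented) numbers on that Waterfall tail: if the downstream set-up and tear-down work is a fixed cost, then the shorter the sprint, the bigger the slice of capacity it eats. A back-of-the-envelope sketch in Python, with hypothetical figures:

```python
# Back-of-the-envelope: fixed downstream overhead as a share of sprint capacity.
# The 4-day overhead figure is invented for illustration, not a real measurement.
FIXED_OVERHEAD_DAYS = 4  # UAT, release paperwork, deployment wrangling, etc.

for sprint_days in (60, 20, 10):  # roughly quarterly, monthly and fortnightly (working days)
    overhead_pct = FIXED_OVERHEAD_DAYS / sprint_days * 100
    print(f"{sprint_days:>2}-day sprint: {overhead_pct:.0f}% of capacity lost to overhead")
```

At a fortnightly cadence the fixed overhead starts to swamp the sprint, which is exactly the pressure that pushes the downstream work towards automation.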

Continuous deployment requires the following:

* Continuous integration of the codebase via a farm of build servers, plus alerting on red builds and a philosophy that whoever breaks the build fixes it, quickly

* A clean source main line, so that if the builds are all green the codebase can be tagged for a deployment at any time

* A source management system that allows cheap feature branching so that all non-trivial work is kept away from the main source line until it has passed feature QA and is ready to be merged into the main line

* Automated QA to allow the deployment to be stress and regression tested, releasing human QAs to range freely across the application looking for trouble spots

* Automated, reliable, auditable deployment scripts

It’s not strictly necessary that each merge into the main line triggers a release process that deposits the code into production (after all, would you want code that was checked in during Friday drinks going straight out?), but rather that there is the capability to start a release process at any time and have it ready for production by (at most) close of business the same day. With this capability, fortnightly iterations become low-cost and low-burden, which seems to solve the problem.
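
As a rough sketch of what ‘release on demand’ might look like, here it is in Python. The helper functions are invented stand-ins for your build farm, automated QA suite and deployment scripts, not a real API:

```python
# A minimal sketch of the release-on-demand capability described above.
# Every helper below is a stub standing in for real CI/QA/deployment tooling.
from datetime import datetime

def builds_are_green() -> bool:
    """Ask the build farm whether every main-line build is currently green (stub)."""
    return True

def tag_mainline(tag: str) -> None:
    """Tag the main source line so the release is reproducible and auditable (stub)."""
    print(f"tagged main line as {tag}")

def automated_qa_passes(tag: str) -> bool:
    """Run the automated regression and stress suites against the tagged build (stub)."""
    return True

def deploy(tag: str) -> None:
    """Run the automated, auditable deployment scripts (stub)."""
    print(f"deployed {tag}")

def release_on_demand() -> None:
    if not builds_are_green():
        raise RuntimeError("Red build on the main line: fix it before releasing")
    tag = datetime.now().strftime("release-%Y%m%d-%H%M")
    tag_mainline(tag)
    if not automated_qa_passes(tag):
        raise RuntimeError(f"Automated QA failed for {tag}")
    deploy(tag)  # kicked off at any time, ready for production by close of business

release_on_demand()
```

The point is that the decision to release becomes a business one; the machinery is always ready.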

Or does it? Scrum requires that the stories are lined up from an ordered backlog at kick-off and committed to as an ensemble. Stories must fit within the sprint and allow an increment of business value to be delivered. There is a point where shortening the sprint cycle to increase delivery responsiveness runs into the software management equivalent of the Planck length: the minimum size of a unit of business value. If your average per-developer velocity is 10 points for a three-week sprint, dropping to a weekly sprint cycle means that every story must be three points or fewer. If you have an atomic five-point story, it won’t fit.
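
The arithmetic, using the figures quoted above:

```python
# The "Planck length" arithmetic: capacity per sprint shrinks with the sprint length.
points_per_sprint = 10   # average per-developer velocity for a three-week sprint
sprint_weeks = 3

weekly_capacity = points_per_sprint / sprint_weeks   # about 3.3 points per week
print(f"Weekly capacity per developer: {weekly_capacity:.1f} points")

atomic_story = 5
print("Fits in a weekly sprint?", atomic_story <= weekly_capacity)  # False
```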

An ideal scenario would allow stories to be injected into the sprint while still allowing velocity to be calculated and team commitment to be tracked. In this world, the team commitment is a set of empty story slots that are filled, as the sprint progresses, by the top-prioritised stories from the backlog. The Scrum concept of being able to visualise the entire commitment before starting work is still mostly possible by looking at the top stories in the backlog, with the understanding that priorities may change over time. While this may be a little less efficient from the developers’ point of view (they want to line up similar stories to gain pipelining efficiencies), it means that the team is never engaged in wasted or lower-priority work.
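
A minimal sketch of the slot-based commitment, assuming an already-prioritised backlog (the stories, points and slot count are invented for illustration):

```python
# Slot-based commitment: the team commits to capacity (empty slots), not named stories.
from collections import deque

def fill_next_slot(commitment, slots, backlog):
    """Pull the current top-of-backlog story into the next free slot, if one exists."""
    if len(commitment) < slots and backlog:
        story = backlog.popleft()   # backlog is kept in priority order
        commitment.append(story)
        return story
    return None

backlog = deque([("Mobile checkout tweak", 3), ("New payment method", 5), ("Copy change", 1)])
commitment, slots = [], 4

# Priorities can change mid-sprint: re-ordering the backlog is cheap, because a
# story is only pulled into the commitment when a developer is free to start it.
while fill_next_slot(commitment, slots, backlog):
    pass

velocity = sum(points for _, points in commitment)  # still measurable after the fact
print(commitment, velocity)
```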

Trivial stories and bug fixes can be committed directly to the main line, while more complex stories (and even entire features) sit on their own feature branches, waiting to be pulled back into the main line by the QA team, who now act as quality gatekeepers, keeping changes off the main line until they are deemed stable and accepted by the business owner. Developer, BA, UX and QA resources are responsible as a team for taking a feature from concept, through visual design, development and acceptance, and finally to release in one single sweep. This also has the advantage that the team can follow a story from cradle to grave (and into production) while the knowledge is still fresh (as covered by @mootpointer on Continuous Deployment).
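
The gatekeeping rule itself is simple enough to write down. A sketch, with invented fields standing in for whatever your tooling actually records:

```python
# A sketch of the QA gatekeeper rule for letting a feature branch back onto the
# main line. The FeatureBranch fields and example branch are invented for illustration.
from dataclasses import dataclass

@dataclass
class FeatureBranch:
    name: str
    feature_qa_passed: bool        # automated + exploratory QA done on the branch
    business_owner_accepted: bool  # the business owner has signed off the feature
    builds_green: bool             # the branch is up to date and building cleanly

def ready_to_merge(branch: FeatureBranch) -> bool:
    """Only stable, accepted work is allowed onto the main line."""
    return branch.feature_qa_passed and branch.business_owner_accepted and branch.builds_green

checkout = FeatureBranch("multi-retailer-checkout", True, True, True)
print(ready_to_merge(checkout))  # True: QA pulls it back into the main line
```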

This sounds closer to Kanban than Scrum. The pipeline of story production that sits above the story development process in Scrum is extended down into the iteration and all the way out the other end, into a tight DevOps integration that flows the stories out into the production environment. Scrum process components such as velocity can still be calculated retroactively at the end of the iteration period, to allow capacity planning going forward. The all-important Retro is still held at the end of each iteration. Regular backlog grooming meetings still occur. However, if the business needs to react swiftly to a new opportunity, there is no longer the harsh grinding of gears as stories are pushed out of the commitment, new stories are brought in, and sprint cycles are potentially lengthened or shortened.
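
Velocity, for example, simply falls out of whatever actually crossed the line during the iteration. A sketch with invented stories, points and dates:

```python
# Velocity computed retroactively from whatever flowed through the iteration,
# rather than from an up-front commitment. All values below are invented.
from datetime import date

completed = [
    ("Mobile site tweak", 3, date(2012, 1, 4)),
    ("Payment method fix", 5, date(2012, 1, 10)),
    ("Checkout copy change", 1, date(2012, 1, 13)),
]

iteration_start, iteration_end = date(2012, 1, 2), date(2012, 1, 15)

velocity = sum(points for _, points, done in completed
               if iteration_start <= done <= iteration_end)
print(f"Iteration velocity: {velocity} points")  # feeds capacity planning going forward
```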

Should Westfield move to full Kanban? Not in the short term, due to the massive education and process re-engineering overhead across the business. But the journey of a thousand miles begins with a single step. Continuous delivery is that first step.