Measures of productivity do not lead to improvement in productivity. There is in the United States, any day, a conference on productivity, usually more than one. There is in fact a permanent conference on productivity, and there is now the President’s Committee on Productivity. The aim of these conferences is to construct measures of productivity. It is important to have measures of productivity for meaningful comparisons of productivity in the United States year by year, and for meaningful comparisons between countries. Unfortunately, however, figures on productivity in the United States do not help to improve productivity in the United States. Measures of productivity are like statistics on accidents: they tell you all about the number of accidents in the home, on the road, and at the work place, but they do not tell you how to reduce the frequency of accidents.
Deming, W. Edwards. Out of the Crisis (The MIT Press), p. 15.
THE AIM of this newsletter is to share with you some thoughts I had after reading a blog post by long-time agile software development practitioner, author, and now newly-minted VP of Engineering for OpenSesame, James Shore, titled A Useful Productivity Measure? I came across it while in the throes of editing Episode 2 of The New Economics Companion, and was tempted to drop that work to write this analysis instead, but managed to keep my head and, as the Kanban folks say, "stop starting, start finishing".
The reason I wanted to write this post right away is that James's tale dovetails well with the lessons Dr. Deming imparts in the lecture I was writing, based on his identification of the nine faulty practices of the prevailing style of management; i.e., without good theory to guide us, we create a negative reinforcing feedback loop that worsens the very situation we set out to improve. While James does eventually land in a better place than where he began, he and his SLT (senior leadership team) expended a lot of time and energy getting there, a journey that could have been dramatically shortened had they all first done some learning of outside knowledge together.
tl;dr
Well-known agile practitioner turned VP of Engineering at a fully remote company goes on an epic six-month journey to find a way to meet his CEO's demand to measure the productivity of their software engineering teams, a demand born of the lack of a cohesive theory of management. Several examples of faulty practices ensue, including Management by Objective, Setting Numerical Goals, and Management by Results, all of which could have been avoided had the journey begun with learning some new theory first.
The Productivity Trap in Software Development
The entirety of James's post is about his reluctant search for a way to provide his CEO and SLT with a number describing the productivity of his engineering teams, which he knows is not only impossible, owing to the nature of software development, but also sets a potential trap in the form of Goodhart's Law: when a measure becomes a target, it ceases to be a good measure. This phenomenon combines two of the faulty practices Dr. Deming identifies in The New Economics: Setting Numerical Goals and Management by Results (MBR).
Ergo, James was being sent on a bit of a wild goose chase, looking for answers where they wouldn't be found and where the search itself could actually make things worse. And to his credit, as an experienced agile practitioner, he knew this:
I can’t fault the question. I mean, sure, I’d rather it be phrased about how I’m improving productivity, rather than how I’m measuring it, but fair enough. I need to be accountable for engineering productivity. There are real problems in the org, I do need to fix them, and I need to demonstrate that I’m doing so.
Just one little problem: software productivity is famously unmeasurable. Martin Fowler: “Cannot Measure Productivity.” From 2003. 2003!
More recently, Kent Beck and Gergely Orosz tackled the same question. Kent concluded: “Measure developer productivity? Not possible.”1
1. Kent and Gergely's two-part article is excellent and worth reading. Part one. Part two. And a later followup.
What complicates measuring productivity in software development is that there isn't a 1:1 relationship between inputs like time, effort, or people and outputs like value or impact. It's quite possible to expend a lot of time, effort, and money and create little customer value, even when you think you're building the right things, because the medium, software code and the thinking behind it, is infinitely tractable. Because of this, the gains you think you'd get from parallelizing the work across more people can actually decrease your apparent productivity, a phenomenon that Fred Brooks Jr. wrote about in The Mythical Man-Month (1975!) and captured in his eponymous law: "Adding manpower to a late software project makes it later." (p. 25).
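Part of Brooks's argument is that intercommunication effort grows roughly as n(n-1)/2 with team size, so coordination overhead piles up far faster than headcount. Here's a minimal sketch of that arithmetic (my own illustration, not code from Shore's or Brooks's text):

```python
# Minimal sketch: Brooks's observation that pairwise communication paths
# grow as n(n-1)/2, one reason adding people doesn't scale output linearly.

def communication_paths(team_size: int) -> int:
    """Number of pairwise communication channels in a team of a given size."""
    return team_size * (team_size - 1) // 2

for n in (3, 5, 10, 20):
    print(f"{n:2d} people -> {communication_paths(n):3d} communication paths")
```

Doubling a team of ten to twenty roughly quadruples the number of conversations that have to happen for everyone to stay coordinated.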
Welcome to the weird world of nailing jello to a wall in software development.
What Was Tried
In the article, James describes several waypoints in his journey to measure the unmeasurable and keep his leadership happy, something he clearly states his job depended on. These included:
Brainstorming a set of “indicators” with the CEO, CTO, and Chief Product Officer to show how close they were to building the best product engineering organization in the world
Navigating the introduction of company-wide productivity OKRs by leadership
Experimenting with having leadership prioritize initiatives as “product bets” (think: PDSAs without the structure or learning)
Measuring “actual” and “estimated” ROI
Finally, measuring the percentage of engineering time spent on value-added activities (which sounds an awful lot like Process Cycle Efficiency for the lean folks in the room…)
The Faulty Practices of Management Revealed
Each solution in Shore's journey revealed problems residing in the system of management, starting with whatever was motivating the CEO to want this figure in the first place. Shore doesn't explain it, but we can presume it stems from the typical "original sin" of the fourth faulty practice of management Dr. Deming describes, Failure to Manage the Organization as a System. In this context, the CEO has a hunch that engineering is underperforming, and if a number can be affixed to engineering, then engineering can be managed.
This is confirmed when Shore later describes how leadership sets company-wide productivity OKRs (Objectives and Key Results, a form of the fifth faulty practice, Management by Objective) for each department: the intent is to manage the parts rather than their interactions. Of course, this seeds a future problem in which optimizations are made locally instead of system-wide, which in turn prompts a reaction, usually in the form of the seventh faulty practice, Management by Results, as mentioned earlier with reference to Goodhart's Law. Without knowledge of variation, leadership will tend to react to every fluctuation in the productivity measure that goes against their expectations, and ironically inject more variation as a result.
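To see why reacting to every fluctuation backfires, here's a small, purely illustrative simulation in the spirit of Deming's funnel experiment (rule 2: compensate for the last deviation). Nothing below comes from Shore's article; the scenario and numbers are invented to show the effect:

```python
import random

# Illustrative only: a stable metric left alone vs. the same metric
# "managed" by compensating for every deviation (funnel experiment, rule 2).
random.seed(1)

def stable_process(n, sigma=1.0):
    """A stable process: common-cause noise around a fixed target of 0."""
    return [random.gauss(0, sigma) for _ in range(n)]

def tampered_process(n, sigma=1.0):
    """Same noise, but the target is adjusted after every result to
    cancel out the last deviation."""
    target, results = 0.0, []
    for _ in range(n):
        value = target + random.gauss(0, sigma)
        results.append(value)
        target -= value  # react to the fluctuation
    return results

def spread(xs):
    mean = sum(xs) / len(xs)
    return (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5

n = 10_000
print("std dev, left alone:", round(spread(stable_process(n)), 2))
print("std dev, tampered:  ", round(spread(tampered_process(n)), 2))
```

Left alone, the process shows its natural spread; "managed" deviation by deviation, the spread grows by roughly 40%.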
Moreover, once we begin down the path of measuring parts in comparison to one another, each vying for a share of leadership attention and budget, adversarial competition between the components begins. Deming observed that, left unmanaged, they become "selfish profit centres". In the words of Agile Manifesto signatory Kent Beck: "As soon as the metrics have consequences, the gaming begins."
The Solutions
The one "success" Shore has comes when, after realizing they don't have the data to measure ROI effectively as a productivity proxy (another exercise in jello nailing…), he alights on the lean concept of measuring the percentage of time engineering actively spends adding value to a product for a customer versus "wasteful" activities like bug-fixing, maintenance, and other necessities, what in lean thinking circles is known as muda. The aim is to work on the sources of muda to improve the value-added ratio.
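As a concrete, entirely hypothetical illustration of what such a value-added ratio might look like when computed from time-tracking categories, here's a minimal sketch; the categories and hours are invented, not Shore's data:

```python
# Minimal sketch (hypothetical numbers): a value-added ratio computed from
# hours spent per activity category over a period.

time_spent_hours = {
    "new feature work": 45,        # value-added
    "bug fixing": 25,              # non-value-added, but currently necessary
    "maintenance/upgrades": 20,
    "meetings/coordination": 30,
}

value_added_categories = {"new feature work"}

total = sum(time_spent_hours.values())
added = sum(hours for activity, hours in time_spent_hours.items()
            if activity in value_added_categories)

print(f"Value-added ratio: {added / total:.0%} of {total} engineering hours")
```

The point isn't the precision of the number; it's that the ratio points at the sources of waste worth working on.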
This is a well-known concept in agile and lean software development, and I'm a little surprised that Shore didn't go here first, considering his experience. For example, back in 2014(!), Dave Nicolette blogged about how he used a status board made of LEGO to capture the data his team needed to measure PCE. In the picture below, green bricks are value-added time, red bricks are other activities, and each row corresponds to a feature or product:
In agile coaching, we favour radiating aggregated data with visual tools like this so we can understand our system, because much of the "work" within it is nearly invisible to outside observers. Moreover, we understand that radiating data helps change the conversation away from blaming individuals and toward the system creating the effects we're observing.
In the article, Shore describes how he similarly presented his value-added time measurements as a series of stacked bar graphs to show how much engineering time was being "wasted" on non-value-added activities, and why, viewed from on high, their apparent productivity seemed so low. This proved to be the icebreaker that finally shifted the conversation with leadership away from productivity and towards what we'd call quality.
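For readers who want to try the same kind of visual at home, here's a rough sketch of a stacked bar chart along those lines; it assumes matplotlib and uses invented percentages rather than Shore's actual figures:

```python
# Sketch only: a stacked bar chart of where engineering time goes each month.
# The data below is made up for illustration.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
value_added = [35, 38, 42, 50]     # % of time on new customer value
bug_fixing = [30, 28, 25, 20]      # % of time on defects and rework
other = [35, 34, 33, 30]           # % of time on maintenance and everything else

fig, ax = plt.subplots()
ax.bar(months, value_added, label="Value-added")
ax.bar(months, bug_fixing, bottom=value_added, label="Bug fixing")
ax.bar(months, other,
       bottom=[v + b for v, b in zip(value_added, bug_fixing)],
       label="Maintenance & other")
ax.set_ylabel("% of engineering time")
ax.set_title("Where engineering time goes")
ax.legend()
plt.show()
```

The chart does the talking: the conversation naturally turns to why so much of the bar isn't green, rather than to who is or isn't "productive".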
Had Shore and his CEO had a conversation about studying Deming six months earlier, they might have gotten to this point sooner, but better late than never?
Productivity Viewed Through a Deming Lens
Dr. Deming was quite critical of the drive to measure productivity because it is a lagging indicator: it tells you about the past, not what to do to improve your situation now or in the future. More importantly, as he taught the Japanese with his chain reaction, productivity reflects how well you're delivering a product or service after accounting for complications, mistakes, delays, defects, etc.:
Thus, if you want to improve productivity you need to improve quality, and this begins with the system of management. Had James taught this to his CEO and SLT first, he might have reached his final destination more quickly, avoiding all the side quests into ROI and OKRs.
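To make the quality-to-productivity chain concrete, here's a back-of-the-envelope illustration with invented numbers (not Deming's or Shore's): the same team, the same hours, but less of the week lost to rework after a quality improvement:

```python
# Hypothetical illustration of Deming's chain reaction: improving quality
# frees capacity, which shows up as higher productivity without adding
# people or hours. All numbers are made up.

team_hours_per_week = 10 * 40          # 10 engineers, 40 hours each
rework_fraction_before = 0.30          # share of effort lost to defects/rework
rework_fraction_after = 0.10           # after improving the system of work

def productive_hours(total_hours, rework_fraction):
    """Hours left for value-added work once rework is accounted for."""
    return total_hours * (1 - rework_fraction)

before = productive_hours(team_hours_per_week, rework_fraction_before)
after = productive_hours(team_hours_per_week, rework_fraction_after)

print(f"Value-added hours before: {before:.0f}")
print(f"Value-added hours after:  {after:.0f}")
print(f"Apparent productivity gain: {after / before - 1:.0%}")
```

Nobody worked harder or longer; quality improved, so more of the existing effort turned into output.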
Rx? Pull on the Quality Lever
The single most important source of leverage you have to improve productivity is quality. Each time you make a successful improvement, you effectively move an invisible fulcrum along a plane, increasing the "mechanical advantage" of your organization's efforts. Each time you double down on a faulty practice, you move the fulcrum back, decreasing that advantage. In some organizations, the faulty practices in use can overwhelm any good improvement to the point of neutralizing it, which is what happens to a lot of "agile transformations" that aren't preceded by leadership learning new theory first.
In the case of Shore's engineering teams, working to reduce the time spent on "wasteful" activities that impair quality might move the fulcrum as in the diagram below:
Of course, this depends on leadership not pushing the fulcrum back in the other direction by indulging in one or more of the faulty practices of management, and instead starting to manage the company as a system. From the leadership perspective, moving the fulcrum to make engineering's efforts more productive might look like this:
As Dr. Deming teaches us, since 94% of the problems and issues that come up originate in the system, from common causes of variation, leadership, as the owners of the system, hold the largest lever to effect change. This requires outside knowledge, and Dr. Deming has already done the heavy lifting of uncovering it for us: all we have to do is dedicate some time to learning and applying it.
Ironically, Shore isn't entirely unaware of Deming, as he shows when he bristles at leadership introducing OKRs:
“OKRs” are “Objectives and Key Results.” They’re a way of setting and tracking goals. Similar to Management by Objectives, about which Deming said: “Eliminate management by objective. Eliminate management by numbers, numerical goals. Substitute leadership.” But that’s a rant for another day.
I wonder why he didn't dig deeper and channel this into a more substantive conversation with his leadership.
Summary and Reflection Questions
Dr. Deming was adamant in his belief that quality and productivity begin with management, and that any possible contribution workers could make would be minimal, "1/5 to 1/7 the contribution good management can make"[1]. In his time, the fad was Quality Circles (QC); today you could substitute any flavour of transformation that aims to change what people do instead of the system that directs their work. He viewed trying to measure productivity as akin to gathering accident statistics: they tell you there's a problem but not what to do about it.
In Shore's story we see this played out in his journey to find a productivity measurement to appease his leadership, who were still firmly stuck in the prevailing system of management. What worked for him to change the conversation was revealing how his previously opaque part of the system was actually working. That's a big step toward shifting the SLT's mindset and moving their fulcrum along the quality plane to their "mechanical" advantage.
Consider the journey Shore relates in the article and the highlights above, particularly the metaphor of the quality lever. Where is your fulcrum located? At a mechanical "advantage" or "disadvantage" to the goals you want to achieve? How have you attempted to move the fulcrum along the "quality" plane? What resisted or enhanced your efforts? How could you initiate a conversation about quality in your organization that would influence leadership thinking?
The Miro Board
As always, access to the Miro board I use to collate my notes and the thinking behind posts like this is available to paid-tier subscribers. I'll be adding more material to it post-publication, as there's a lot more I was drawing upon that's been left out here for brevity.