New Competency #2: Understanding Variation
The Least Appreciated, Highest Leverage Management Skill
Life is variation. Variation there will always be, between people, in output, in service, in product. What is the variation trying to tell us about a process, and about the people that work in it? …
The teacher failed to observe that roughly half of her pupils will be above the average on any test, and the other half below. Half of the people in any area will be above average for that area in a test of cholesterol. There is not much that anyone can do about it.
When do data indicate that a process is stable, that the distribution of the output is predictable? Once a process has been brought into a state of statistical control, it has a definable capability. A process that is not in statistical control has not a definable capability: its performance is not predictable.
- Deming, Dr. W.E. The New Economics, 3rd Ed. (pp. 67-68)
THE AIM for this entry is to introduce the second of six new leadership competencies from Peter Scholtes’ 1998 text, The Leader’s Handbook: The ability to understand the variability of work in planning and problem solving. This is perhaps the highest-leverage yet least appreciated or understood competency in contemporary management, so powerful that it can give novices a significant advantage over peers who have yet to learn its lessons. This competency pairs well with the first one we explored, The Ability to Think in Terms of Systems and Knowing How to Lead Systems, as all systems and their processes exhibit the phenomenon of variation.
We begin where we should, with an operational definition…
What is Variation?
Dr. Deming observes that variation is life, or life is variation. There will always be differences between people, processes, and machines that manifest in the products and services their interactions produce. Knowledge about variation is the gateway to a deeper, more profound understanding of how our organizations work as a system.
The discovery of the phenomenon of variation is credited to Dr. Walter Shewhart, who observed two types in his work at Bell Labs in the 1920s: 1) variation from the system itself (“chance” causes); and 2) variation from particular circumstances (“assignable” causes). Dr. Deming would later reinterpret these as “common” and “special” causes, the former being the responsibility of management to improve and the latter serving as an operational definition for management action, i.e. investigate and/or eliminate the cause so as to stabilize the system and get back to the business of improving common causes.
Shewhart did add a caveat, however: even when you know about the two types of variation, you will sometimes mistake one for the other, i.e. attribute a cause to the wrong source. Still, these mistakes are manageable and far less costly than those made without this knowledge. Nobody will get it right 100% of the time because things… vary.
The Illusion of Knowledge
It naturally follows that if we don’t understand the two types of variation systems exhibit, we will be susceptible to mistaking coincidences for causes and effects, which also happens to be a good diagnosis of the prevailing style of management Deming warned us about. Worse, we’ll take action on this confusion and invoke the faulty practice of Management by Results, which can, and frequently does, make things worse.
In The Leader’s Handbook, Scholtes calls this fumbling around the illusion of knowledge and illustrates it with a short story about a psychologist attached to the Israeli air force who investigated reports of flight instructors heaping public abuse and reprimands on their student pilots. Through a series of interviews, the psychologist learned the instructors had tried being more complimentary with their students but found it led to deteriorated performance; reprimands, however, always seemed to boost it.
Visualized, it looked something like the chart below:
Without an understanding of systems or variation, the instructors reasonably inferred that their feedback was driving the pilots’ “good” or “bad” days, making one of the two classic mistakes Shewhart (and Deming) warned about: treating common causes of variation as special causes. In reality, the instructors had created an illusory connection between their actions and the students’ performance, a false sense of control over the chaos that distracted attention from what could be done to improve student learning outcomes without yelling at and demoralizing them.
What the instructors had overlooked were all the system variables that interacted to produce a variety of “good” or “bad” days irrespective of whether they praised or reprimanded their students, e.g. the variation between instructors, students, flight plans, training exercises, planes, ground crews, weather conditions, and many other unknowns and unknowables.
Had the instructors had even a little understanding of systems and variation, by, say, participating in or observing a run of Dr. Deming’s Red Bead Experiment, they likely would have realized that reprimanding a student for a bad day wouldn’t change anything: it was all a matter of luck, and the system carries on, up and down, regardless.
Rx? Gather and Visualize the Data
The only way to know what type of variation is present in a system is to visualize it using a Process Behaviour Chart, which we’ve looked at in this newsletter many times. To refresh, this is a standard time series run chart that has been augmented with three lines representing the upper and lower limits of common cause variation, calculated from the data itself, and the mean around which that variation oscillates. Plotting the student pilot performance data from above would look like the chart below which you can also download here:
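The limit calculation described above can be sketched in a few lines. This is a minimal illustration of how the natural process limits of an XmR-style Process Behaviour Chart are conventionally computed from the data itself (mean ± 2.66 times the average moving range); the sample scores below are made up for illustration, not the actual student pilot data:

```python
# Sketch: compute Process Behaviour Chart (XmR) limits from a series.
# 2.66 is the standard constant for converting the average moving range
# of individual values into natural process limits.

def pbc_limits(values):
    mean = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    avg_mr = sum(moving_ranges) / len(moving_ranges)
    return mean, mean - 2.66 * avg_mr, mean + 2.66 * avg_mr

# Hypothetical student-pilot scores (invented for illustration):
scores = [72, 65, 80, 58, 75, 69, 83, 61, 70, 77]
mean, lower, upper = pbc_limits(scores)
print(f"mean={mean:.1f}, limits=({lower:.1f}, {upper:.1f})")
```

Real charting tools add the plotting on top, but the three lines themselves require nothing more than this arithmetic.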
When all of the performance data points fall within the red process limit lines, the system they represent is said to be stable and predictable, meaning we have a rational basis for assuming future data points will follow the same pattern. The air force instructors may wish for a different pattern around a higher mean, or perhaps with less variation around the existing mean, but this would require changing the system. However, were a data point to fall above the upper limit, this would be a signal of a special cause and a rational basis for management intervention: investigate and eliminate the cause to restore the system to a state of stability.
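Reduced to code, the basic decision rule is a simple comparison. The limits and points below are made up for illustration, and limits are assumed to have already been calculated from the data:

```python
# Sketch of the basic detection rule: a point outside the natural process
# limits signals a special cause; points within reflect common causes.

def classify(values, lower, upper):
    return ["special" if v < lower or v > upper else "common" for v in values]

# Hypothetical points against hypothetical limits; the last one spikes:
print(classify([70, 74, 68, 71, 95], lower=55.0, upper=88.0))
```

(Practitioners often layer further run rules on top of this one, but the single-point-outside-the-limits rule is the foundation.)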
Without Data, Opinion Prevails
Scholtes explains that the advantage of visualizing system variation data is the ability to better challenge baseless opinions and conjecture with good theory. For example:
- We may hear that costs are “out of control”, implying a chaotic upward trend that demands immediate action, but when viewed in the context of historical data they fall within the range of normal, common-cause variation that may well reflect past over-reactions;
- Conversely, we may believe that our defect rates are within acceptable bounds with no real problems, but when plotted they show multiple signals of special causes that have gone ignored or unnoticed;
- We may also, like the instructors, confound coincidence with cause-and-effect by establishing reward systems that are no more effective than lotteries at assigning credit when a figure moves up or down.
When opinion prevails over data, superstitious and illusory learning follows, making the work of transformation and improvement almost impossible.
What Goes Up…
Scholtes also makes a good observation about common cause variation that should be more commonly known, especially among practitioners of the Two Data Point Comparison Theatre troupes in the media and those who emulate them in organizations:
In a system of common-cause variation, when performance is high [ie. data points above the mean], it is far more likely to go down than up. When performance is low [ie. data points below the mean], it is unlikely to stay low.
(p. 27)
In other words, before getting anxious over the latest data point going up or down compared to a previous one, consider it within the context of the whole data set. If the system is stable, the points will randomly move above and below the mean because that is the nature of variation.
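Scholtes’ observation can be checked with a tiny simulation, assuming nothing except a stable system (independent draws around a fixed mean). Roughly three times out of four, a point above the mean is followed by a lower one, with no intervention at all:

```python
import random

# Simulate a stable system: independent draws around a fixed mean.
random.seed(42)
data = [random.gauss(100, 10) for _ in range(10_000)]
mean = sum(data) / len(data)

# Count how often an above-average point is followed by a lower one.
above_total = above_then_down = 0
for prev, curr in zip(data, data[1:]):
    if prev > mean:
        above_total += 1
        if curr < prev:
            above_then_down += 1

print(f"P(next point lower | current above mean) ≈ {above_then_down / above_total:.2f}")
```

The result hovers around 0.75, which is exactly what “reprimand a bad day, see improvement” looks like when no reprimand is doing anything.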
Once you are attuned to visualizing data to understand the underlying patterns of variation, you’ll react less and become more curious. For example, for the past several months news reports here in Canada have been obsessing over record inflation, as evidenced by the year-over-year changes in the Consumer Price Index (CPI), typically presented as a single percentage up or down. When we “plot the dots” going back twenty years, we get some insights about how CPI has varied normally and, in 2022, extraordinarily, something entirely missed by the media:
Variation in Agile Software Development
In my May 11/23 newsletter I described a run of the Red Bead Experiment I did for my local agile meetup, where I changed my aim in the debrief from focusing solely on quality to prioritizing system stability before estimating. This hasn’t really been on the radar for my professional community, which has largely leaned toward probabilistic forecasting without knowledge of variation. However, this appears to be changing, as shown by a post I recently came across on Twitter/X by an agile coach demonstrating the test of a hypothesis about system performance by plotting his team’s aggregate work item ages (how long things have been in progress) on a Process Behaviour Chart:
It appears the change they made to how the daily work huddle is organized produced a clear, commensurate effect on WIP age, with the last three data points showing signals of special causes. To confirm this, the experiment will need to run a while longer to see whether the team’s system is settling into a new period of common cause variation, and whether it holds from here on out.
The big take-away here is the systematic approach this coach has taken to building real learning and knowledge with properly analyzed and presented data that clearly show how the system behaved prior to and after their proposed change, and whether the effect was significant and to what extent. Brilliant work by Mr. Brown!
Summary
In this newsletter we covered some aspects of Scholtes’ second New Leadership Competency, The ability to understand the variability of work in planning and problem solving. We’ve looked at the two main types of variation systems exhibit, common and special cause, and how this knowledge changes how we think about daily phenomena in our organizations. We’ve learned that when performance is high it will likely go down, and vice versa, regardless of our interventions, and that the best way to know what kind of variation this represents is to plot the data on a Process Behaviour Chart. Further, we’ve learned how this provides us with a rational basis for predicting future performance and deciding whether an intervention is warranted, with an example from a real-world software team.
This is just a sampler, however: the topic of variation goes much deeper, which we will explore in future newsletters.
Next Up: Understanding How We Learn, Develop, and Improve
In the next chapter of this series, we’ll look at Scholtes’ third new leadership competency, which dives into what Dr. Deming called the Theory of Knowledge: how we build a cumulative understanding of the world around us through a scientific approach to learning, what this means for an organization, and how it can be implemented.
Reflection Questions
Consider what we’ve learned about Scholtes’ Leadership Competency #2: The Ability to Understand the Variability of Work in Planning and Problem Solving in the context of your own organization. What sources of variation can you identify? How are they currently managed? How is their respective performance measured? What impressions do different people have about the predictability of the systems? Are they reliably good, bad, or indifferent? How are changes made: rationally with the aid of data, or without?
Consider the flight instructors’ illusion of knowledge that led them to confuse coincidence with cause-and-effect in the performance of their students. What examples can you identify in your organization of similar behaviours? What is the corresponding effect on people and their morale, teamwork, quality of product and service?