THE AIM of this post is to announce full access to the revised and updated version of my Red Bead Experiment simulator, which I shared in my previous post and have now unlocked for all subscribers. You can download the beta version from my GitHub repository here.
Getting Started
Decompress the .zip package into a folder that you can easily access via your Terminal (Mac) or Windows Terminal - this will be necessary to run the script.
In order to run the simulator, you need to install Python on your computer. Follow the corresponding links on my README page here for directions on Windows and MacOS systems.
Next, you will need to open your Terminal app and navigate to the folder where you decompressed the files. On Mac, just right-click on the folder and select New Terminal At Folder. On Windows, the option is called Open in Terminal.
Next, you need to install some libraries to make the script work. Type the following at the command prompt:
> pip install -r requirements.txt
This should automagically fetch them from the net and seamlessly install them. Now you’re ready to run The Red Bead Experiment Simulator!
Wait, what is this, again?
For the full story, see my last post and the README on my GitHub repository. tl;dr: What began as a way to test an assertion Dr. Deming made in Out of the Crisis about predicting the cumulative average of red beads from successive experiments has evolved into an educational toy for understanding variation and how it is visualized from time series data using a Process Behaviour Chart.
NB: As I note in the README and my last post, there’s a significant difference between running a sim with random numbers and the mechanical sampling in the physical experiment. Draw conclusions from what you see with this in mind.
Running the Standard Simulation
The benefit of running simulations in silico is that we get to try out things that would be tedious in real life, like, say, ten Red Bead Experiments in a row, which is the default configuration. At the command prompt, enter:
> python3 RedBeadSim.py
After a brief pause, a new browser window should open up and you’ll see something like this:
This is a standard Process Behaviour or “XmR” Chart showing the results of six willing workers sifting beads for forty days straight! Alternatively, if you want to give the poor workers a break, you could run a single experiment, by entering:
> python3 RedBeadSim.py --experimentCycles 1
The top chart shows the red beads per worker with three guidelines for the mean, upper, and lower process limits. The lower chart shows the differences between successive pairs of data points, called the Moving Range or mR. It tells us about the variation in the data itself.
Process Behaviour Charts are designed to quickly test the homogeneity or sameness of time series data, typically generated by a process, pointing out signals of special causes of variation where this sameness breaks down. Wherever a point goes outside a red process limit line, that’s a signal. There’s one on the X chart and two on the mR chart, showing us where this has occurred. NB: We tend to see mR signals where there is either a sharp rise or drop from one data point to the next on the X chart.
At the top of the chart is a summary of the simulation parameters and results:
Some are self-explanatory, e.g. Experiments, Data Points, Total Red Beads, Mean, UPL, and LPL. Paddle size indicates the number of beads sampled per worker per day; Method is the algorithm used to simulate drawing the samples in the experiment; Baseline Sample Period is the number of data points used to calculate the mean and process limits, which here is the entire data set.
NEW! Rule 2 Signals
Looking closer at the X chart, you will see a new feature for this release that highlights “Rule 2” signals of special cause variation, which occur whenever there are eight or more successive data points above or below the mean:
While not indicative of anything in particular with respect to the simulation, I’ve included this feature as a teaching aid to help learn how to identify this weaker signal. NB: It is possible for a data point to be outside the limits and part of a Rule 2 signal. Repeated runs can reveal this…
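For readers who want to check a Rule 2 signal by hand, here is a minimal sketch of the run test described above. It is not taken from the simulator: the function name `rule_2_runs` is hypothetical, and treating a point that sits exactly on the mean as breaking the run is an assumption, since the post does not say how such ties are handled.

```python
def rule_2_runs(values, mean, min_run=8):
    """Return (start, end) index pairs for every run of `min_run` or more
    successive points that are all above, or all below, the mean.

    Assumption: a point exactly equal to the mean ends the current run.
    """
    runs = []
    start = 0   # index where the current run began
    side = 0    # +1 above the mean, -1 below it, 0 exactly on it

    for i, value in enumerate(values):
        current = (value > mean) - (value < mean)
        if current != side:
            # The run that just ended qualifies if it was long enough.
            if side != 0 and i - start >= min_run:
                runs.append((start, i - 1))
            start, side = i, current

    # Check the run that is still open at the end of the data.
    if side != 0 and len(values) - start >= min_run:
        runs.append((start, len(values) - 1))
    return runs


if __name__ == "__main__":
    data = [11] * 8 + [9, 8, 8, 8] + [12] * 9
    print(rule_2_runs(data, mean=10))
```

The indices returned are zero-based positions in the list, so they can be used to highlight the matching points on a chart or in a spreadsheet.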
Change the Baseline Sample Period
Let’s layer in a change to calculate the limits of our chart using the first two experiments’ worth of red bead data, or 48 data points:
> python3 RedBeadSim.py --baselineSamplePeriod 48
What this visual aid shows is the range of data points that was used to predict the variation of the following 192 in the simulation. By changing the Baseline Sample Period up or down, you can see the effect it has on setting the limits and consequently the sensitivity to special cause signals. For example, using the first ten can produce a lot of potentially false signals:
New! Highlight Sigma Units
The limits on a Process Behaviour Chart are three sigma units of dispersion above and below the mean. They are estimated from the data using a simple formula:
The Average of the Baseline Sample Period +/- 3 * The Corresponding Average of the Moving Range / 1.128
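The formula above can be sketched in a few lines of Python. This is an illustration, not the simulator's own code: the function name `xmr_limits` and its `baseline` parameter are hypothetical. The upper limit for the mR chart (3.268 times the average moving range) is the standard XmR constant and is not stated in this post.

```python
from statistics import mean


def xmr_limits(values, baseline=None):
    """Estimate XmR chart limits from the first `baseline` points
    (or from all points when `baseline` is None)."""
    base = list(values) if baseline is None else list(values)[:baseline]

    # Moving ranges: absolute differences between successive points.
    moving_ranges = [abs(b - a) for a, b in zip(base, base[1:])]

    centre = mean(base)
    mr_bar = mean(moving_ranges)
    sigma = mr_bar / 1.128  # one estimated sigma unit of dispersion

    return {
        "mean": centre,
        "upl": centre + 3 * sigma,   # upper process limit
        "lpl": centre - 3 * sigma,   # lower process limit (not clipped at zero)
        "mr_bar": mr_bar,
        "url": 3.268 * mr_bar,       # upper range limit (standard constant)
    }


if __name__ == "__main__":
    limits = xmr_limits([10, 12, 9, 11, 10])
    for name, value in limits.items():
        print(f"{name}: {value:.3f}")
```

Passing `baseline=48` would estimate the limits from the first 48 values only, which mirrors the idea behind the Baseline Sample Period.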
Why three? Because after many experiments across many different types of applications and analyses, it was found that limits set at three sigma were conservative enough to avoid false signals. Through these observations, The Empirical Rule was created, which states that for most homogeneous data sets:
Roughly 60%-75% of the data will be located within one sigma unit around the average.
Usually 90%-98% of the data will be located within two sigma units around the average.
Approximately 99%-100% of the data will be located within three sigma units around the average.
We can visualize the rule in our data by adding aids to see how many data points fall within 1, 2, or 3 sigma units of dispersion around the mean. For example, to see the data points that fall within one and two sigma units, enter:
> python3 RedBeadSim.py --baselineSamplePeriod 48 --showSigmaUnitHighlights 2
So visualized, it becomes easy to understand the relationship between the variation in the baseline sample period and the estimated sigma units of dispersion in the process limits: each can be thought of as a progressively finer filter with only the most significant signals escaping the upper or lower boundaries.
In this example, note that our zones just so happen to land within the boundaries of the rule…
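To check the Empirical Rule against your own numbers (for example, data exported from a run), a rough sketch like the following counts the share of points within one, two, and three sigma units of the mean. The function name `sigma_coverage` is hypothetical, and sigma is estimated here from the average moving range of the whole data set, which may differ from what the simulator does when a shorter baseline is used.

```python
from statistics import mean


def sigma_coverage(values):
    """Return the fraction of points within 1, 2 and 3 sigma units of the mean,
    with sigma estimated as (average moving range) / 1.128."""
    centre = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma = mean(moving_ranges) / 1.128

    coverage = {}
    for k in (1, 2, 3):
        inside = sum(1 for v in values if abs(v - centre) <= k * sigma)
        coverage[k] = inside / len(values)
    return coverage


if __name__ == "__main__":
    data = [10, 14, 8, 11, 9, 10, 12, 6]
    for k, share in sigma_coverage(data).items():
        print(f"within {k} sigma: {share:.0%}")
```

With only a handful of points the shares will be coarse; the comparison with the rule's ranges becomes more meaningful on a full run of data.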
New! Show Red Beads as a Distribution
Suppose you want to see the shape of the distribution of the red beads the simulation is generating, as I did. Use the --showDistribution option to add a distribution histogram with the mean and process limits in a second browser tab:
> python3 RedBeadSim.py --baselineSamplePeriod 48 --showSigmaUnitHighlights 2 --showDistribution
It’s worth noting here that Deming’s colleague and expert statistician, Dr. Donald Wheeler, teaches that Process Behaviour Charts do not require the data to follow a normal distribution, as we see here, to be effective.
Speaking of Beads…
In The New Economics, Dr. Deming describes using an 80/20 mix of white to red beads in the experiment: 3200 white, 800 red. In his earlier book, Out of the Crisis, he uses 3000 white to 750 red, also an 80/20 distribution. Does this have any appreciable effect on the experiment? With the new --beads option, you can try different mixes:
> python3 RedBeadSim.py --baselineSamplePeriod 48 --showSigmaUnitHighlights 2 --beads 3000 750
Maybe you want to see what effect a paddle with more or fewer indentations has on the number of red beads sampled. Use the --paddleLotSize option:
> python3 RedBeadSim.py --baselineSamplePeriod 48 --showSigmaUnitHighlights 2 --beads 3000 750 --paddleLotSize 60
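To get a feel for what the bead mix and paddle size do, here is a minimal sketch of one way to simulate a draw: take a random sample, without replacement, from a list of beads. This is an assumption for illustration only and not necessarily the Method the simulator uses. The names `draw_paddle` and `run_experiment` are hypothetical, the 50-bead default paddle follows Deming's physical paddle, and six workers over four days matches the 24 data points per experiment described earlier.

```python
import random


def draw_paddle(rng, white=3200, red=800, paddle_lot_size=50):
    """Draw one paddle of beads without replacement and count the red ones."""
    beads = [1] * red + [0] * white   # 1 marks a red bead, 0 a white bead
    return sum(rng.sample(beads, paddle_lot_size))


def run_experiment(rng, workers=6, days=4, **bead_options):
    """One experiment: each worker draws one paddle on each day."""
    return [draw_paddle(rng, **bead_options) for _ in range(workers * days)]


if __name__ == "__main__":
    rng = random.Random(2024)   # fixed seed so a run can be repeated
    for white, red in ((3200, 800), (3000, 750)):
        counts = run_experiment(rng, white=white, red=red)
        average = sum(counts) / len(counts)
        print(f"{white} white / {red} red: average {average:.2f} red beads per paddle")
```

Both mixes contain 20% red beads, so with a 50-bead paddle each should average close to ten red beads per draw; any difference between them in a short run is down to chance.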
Export to Excel
Sometimes you want to get your hands on the data and test it out for yourself, or maybe just archive the results of various simulations. Add the --exportToExcel option and a timestamped Excel workbook containing the core data will be created in the script folder:
> python3 RedBeadSim.py --baselineSamplePeriod 48 --showSigmaUnitHighlights 2 --exportToExcel
Data points captured under Rule 2 are highlighted in orange, while points outside the process limits are highlighted in light red/rose. From here, you can create a chart or change the average with your own calculations. Your data, your rules!
Degrees of Freedom
Earlier, I mentioned that the Baseline Sample Period used to calculate the limits can affect the sensitivity of the chart’s three lines (upper limit, lower limit, and mean) and increase the probability of false signals. This is because there is an inverse relationship between the “uncertainty” in the limits and the number of data points you use to capture the variation, also known as the “degrees of freedom”. (See my Sept. 18/23 newsletter for more on this.)
This can be expressed using the equation: 1 / √(2 × Degrees of Freedom [data points])
Use the --showDegreesOfFreedom option to show this relationship in a handy reference chart:
> python3 RedBeadSim.py --showDegreesOfFreedom
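The equation above is easy to tabulate yourself. The sketch below follows this post's shorthand of treating the number of data points as the degrees of freedom; the function name `limit_uncertainty` is hypothetical and the sample sizes are only examples.

```python
import math


def limit_uncertainty(degrees_of_freedom):
    """Relative uncertainty in the limits: 1 / sqrt(2 * degrees of freedom)."""
    return 1 / math.sqrt(2 * degrees_of_freedom)


if __name__ == "__main__":
    for n in (10, 24, 48, 240):
        print(f"{n:>4} data points: roughly {limit_uncertainty(n):.1%} uncertainty")
```

The curve flattens quickly: going from 10 to 48 data points roughly halves the uncertainty, while each further gain needs many more points.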
Summing it Up
As I’ve written here before, variation is one of my favourite aspects of Deming’s System of Profound Knowledge, and this educational toy is an extension of this interest. I create and use Process Behaviour Charts almost every day to analyze data and share the results with others. My aim in putting this out was to create something that might spark a similar interest in others by making the process of experimentation and what-if? simulation quick, easy, and fun. I invite you to try it out for yourself, play with the options and see what trouble you can get into, and let me know if it breaks in any weird and wonderful ways.
Just an out-there thought: instead of generating random numbers, why not use Deming’s records of numbers of beads as a sort of lookup table… if one can find them.
You should take a look at our Deming Red Beads in Augmented Reality ... search the app stores.
You also might like to download the Deming Funnel in Interactive 3D Augmented Reality. Search "Deming Funnel". The latter is FREE.
Sadly we don't provide multicolored belts of stupidity, so despite the app being free, offering great learning, being widely promoted, and being the coolest thing you have ever seen, the uptake has been low.