Tom Keelin's Metalog Distributions

Mathematical Elegance Coupled to Computational Efficiency

by Sam Savage


After receiving his PhD in Decision Analysis from Stanford, Tom spent 40 years in analytical consulting, including an 18-year stint at the prestigious Strategic Decisions Group, where he was Worldwide Managing Director. Tom was struck by general management’s inability to compute uncertainties, and has developed a flexible family of continuous data-driven probability distributions based on pragmatic consulting experience. The Metalog distributions, as he calls them, combine mathematical elegance with computational efficiency [i], [ii], [iii].

To put Metalogs in perspective, I remind the reader that the theory of probability and statistics is powerful and elegant. But so is the steam locomotive, and they were developed around the same time. By the 1970s, computational approaches to statistics such as bootstrapping arose. These were based on the brute force of computer simulation instead of 19th-century calculus. Although Metalogs are also based on simple mathematical principles, they are intended to be fit to data sets, not adjusted by parameters such as mean and standard deviation. And they output the ideal food for simulations: inverse cumulative functions. These functions are the most common way to generate random variates in simulations. The Excel function NORMINV(RAND(),Mean,Sigma), for example, will produce Normal random variables with the specified mean and standard deviation with every press of the Calculate key.
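For readers outside Excel, the same inverse cumulative trick can be sketched in Python. This is only an illustration: it uses the standard library's NormalDist.inv_cdf in place of NORMINV, and the function name norm_inv_sample, the mean of 100, and the sigma of 15 are placeholders chosen for the example rather than anything taken from the Metalog or SIPmath tools.

```python
import random
from statistics import NormalDist


def norm_inv_sample(mean, sigma, rng=random):
    """Draw one Normal variate by passing a uniform number through the inverse CDF."""
    u = rng.random()          # uniform on [0, 1), playing the role of Excel's RAND()
    while u == 0.0:           # the inverse CDF is undefined at exactly 0, so redraw
        u = rng.random()
    return NormalDist(mean, sigma).inv_cdf(u)


random.seed(1)                # fixed seed so the run is repeatable
samples = [norm_inv_sample(100.0, 15.0) for _ in range(10_000)]
sample_mean = sum(samples) / len(samples)
print(f"sample mean is roughly {sample_mean:.1f}")
```

Any distribution that can hand you its inverse cumulative function can be sampled the same way, which is why that function is such convenient food for simulations.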

The informative Metalog Distributions website contains extensive documentation and implementations in numerous environments, including Excel and R. We have already implemented some of the Metalogs in the SIPmath™ Modeler Tools as described below. You can also download any of the Excel templates from the Metalog website and use them with the tools. Just be sure to replace the “random” cells in the templates with either RAND() or HDR generators from the SIPmath tools.

It is still early innings for Metalogs. For example, last year Tom and I discovered how to generalize the concept to solve a vexing problem in simulation. Suppose you are modeling an uncertain number of risk events, such as transformer failures. Each failure will cause a fire with a skewed, lognormally distributed adverse consequence. On a given simulation trial you may get 3, 5, 8 or some other number of failures, and need to add up 3, 5, 8 or some other number of lognormal distributions. But you don’t know in advance how many you will have, so you don’t know how many you need to generate. Until our approach with the Generalized Metalog, there was apparently no closed-form solution for expressing sums of lognormals. We (mostly Tom) wrote this up for publication, and with his help we built sums of lognormal and triangular distributions into the Enterprise SIPmath™ Modeler Tools. Tom is now Chair of Data-Driven Distributions at, and we will keep you apprised of future Metalog developments, several of which are in progress.

Using Metalogs in the SIPmath Tools

All the latest versions of the tools support the SPT (symmetric percentile triplet) Metalog, which can produce a wide range of distribution shapes as shown below [iv].



Furthermore, Tom has written a nice tutorial on their use in the SIPmath Tools.
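To make the symmetric percentile triplet idea concrete, here is a minimal Python sketch. It assumes the unbounded three-term metalog quantile function from [ii], M(y) = a1 + a2 ln(y/(1-y)) + a3 (y - 0.5) ln(y/(1-y)), and solves for the three coefficients so the curve passes through the quantiles at alpha, 0.5 and 1 - alpha. The function names and the triplet 10, 20, 40 are hypothetical, the sketch does not check whether a given triplet yields a valid (increasing) quantile function, and it is not the code used inside the SIPmath Tools.

```python
import math


def spt_metalog_coeffs(q_low, q_mid, q_high, alpha=0.1):
    """Coefficients (a1, a2, a3) matching quantiles at alpha, 0.5 and 1 - alpha."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    if not q_low < q_mid < q_high:
        raise ValueError("quantiles must be strictly increasing")
    k = math.log((1.0 - alpha) / alpha)       # logit of the upper probability
    a1 = q_mid                                # the logit term vanishes at y = 0.5
    a2 = (q_high - q_low) / (2.0 * k)
    a3 = (q_high + q_low - 2.0 * q_mid) / ((1.0 - 2.0 * alpha) * k)
    return a1, a2, a3


def metalog_quantile(y, coeffs):
    """Evaluate the three-term metalog quantile function at probability y in (0, 1)."""
    a1, a2, a3 = coeffs
    logit = math.log(y / (1.0 - y))
    return a1 + a2 * logit + a3 * (y - 0.5) * logit


# Hypothetical triplet: 10th, 50th and 90th percentiles of 10, 20 and 40.
coeffs = spt_metalog_coeffs(10.0, 20.0, 40.0, alpha=0.1)
for y in (0.1, 0.5, 0.9):
    print(y, round(metalog_quantile(y, coeffs), 6))
```

Because the result is an inverse cumulative function, feeding it uniform random numbers strictly between 0 and 1 produces random variates directly.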

Sums of identical triangular and lognormal distributions are implemented in the Enterprise version of the tools, as described below.

Sum of Lognormal Risk Consequences in the SIPmath Tools

Suppose your organization is subject to a risk characterized by an average of 5 adverse events per year, each with a consequence that is lognormally distributed with a 50th percentile of $1 million and a 90th percentile of $3 million.

The steps below show how to model this situation in the Enterprise SIPmath Tools:

1. Poisson number of events
After initializing the file, we model the number of events per year as a Poisson variable.



2. Creating a sum of IID lognormals based on the Poisson number of events
We then create a lognormal in cell E5, checking the Sum multiple IIDs box (IID stands for independent, identically distributed). The number of lognormals to sum will be the number of events generated in cell C5, which varies with each simulation trial.


3. Specifying Risk as Output
We now specify E5 as an output of the simulation named “Risk” (cell E4) and denote cells F4 through G7 for a sparkline histogram.



4. Querying Statistics
Once the output is specified, you may specify any statistics, such as percentiles as shown below.


Now if you change any of the inputs (C3, E3, F3) the model will instantly update. And like all models created with the SIPmath modeler tools, the file is pure Excel, and uses no macros or add-ins, so you may share it with 1 billion of your closest friends.
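For comparison, the same risk can be approximated outside Excel by brute force. The Python sketch below is plain Monte Carlo, not the Generalized Metalog closed form that the Enterprise tools use: on each trial it draws a Poisson number of events and then literally adds up that many lognormal consequences. The rate of 5 and the $1 million and $3 million percentiles come from the example above; the function names, the 10,000 trials, the seed, and the simple percentile rule are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def lognormal_params(p50, p90):
    """Return (mu, sigma) of ln(X) for a lognormal with the given 50th and 90th percentiles."""
    z90 = NormalDist().inv_cdf(0.90)          # standard Normal 90th percentile
    mu = math.log(p50)                        # the median of a lognormal is exp(mu)
    sigma = (math.log(p90) - mu) / z90
    return mu, sigma


def poisson_draw(rate, rng):
    """Poisson variate by Knuth's multiplication method; fine for small rates such as 5."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_annual_risk(trials=10_000, rate=5.0, p50=1e6, p90=3e6, seed=42):
    """One total consequence per trial: a Poisson count of summed lognormal draws."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(p50, p90)
    totals = []
    for _ in range(trials):
        n_events = poisson_draw(rate, rng)    # varies from trial to trial
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_events)))
    return totals


def percentile(sorted_values, pct):
    """Simple rank-based percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(pct / 100.0 * len(sorted_values)))
    return sorted_values[index]


totals = sorted(simulate_annual_risk())
p50_total = percentile(totals, 50)
p90_total = percentile(totals, 90)
print(f"median annual risk: ${p50_total:,.0f}")
print(f"90th percentile:    ${p90_total:,.0f}")
```

Note the inner loop whose length changes on every trial. That is exactly the nuisance the sum of IIDs feature removes in a spreadsheet, where you cannot easily generate an unknown number of cells.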


[i] Keelin, T.W. and Powley, B.W., 2011. Quantile-parameterized distributions. Decision Analysis, 8(3), pp.206-219.

[ii] Keelin, T.W., 2016. The Metalog Distributions. Decision Analysis, 13(4), pp.243-277.


[iv] From

© Copyright 2018 Sam Savage

The Five-Step Process of Donald Knuth

by Sam Savage

Worthless Clichés

Life is full of helpful sounding procedures for improving your memory, losing weight, landing that perfect job, etc. I have found most of these to be worthless clichés with one notable exception: the five-step process of the renowned computer scientist Donald Knuth. In fact, it is the only thing I am religious about.

When I was studying computational complexity in graduate school in the early 1970s, I was exposed to Knuth’s multi-volume set on computer science, much of which went over my head. But early in one of the volumes he lays out the five steps of writing a computer program, which I have found invaluable in many settings. I state these in the context of analytical modeling, which I do more of these days than programming.

The Steps

  1. Decide what you want to do.
    What is the purpose of the analysis? Who is the audience?

  2. Decide how to do it.
    Is a spreadsheet adequate for the analysis or will I need a more powerful tool? Will I model time discretely or continuously?

  3. Do it.
    Put fingers to keyboard and press appropriately.

  4. Debug it.
    Of course it didn’t work as planned. Who do you think you are, Einstein?

  5. Trash steps 1 through 4 now that you know what you really wanted in the first place.
    The power of recursion!

Get to Step Five Fast

I’ll bet your organization spends a lot of time in steps 1 and 2 and calls it planning. I say, get to step 3 with a primitive prototype as quickly as possible. You will then be at step 4 before you know it, which qualifies you for the true enlightenment of step 5.

I consider myself a black belt in the Step Five Process. When I start a new modeling project, I am completely confident that I don’t know what I want, so I only spend 3 seconds on step 1. I give myself much longer on step 2, 30 seconds. If it takes longer than that, I quit. Step 3 is where the time comes in. I put on headphones, switch to my Eagles Channel on Pandora (as much as I love classical music, it does not work here), and typically work for 15 to 30 minutes before finding the fatal flaw, which I must debug. I don’t spend a lot of time debugging at step 4, maybe 5 minutes, because I know that step 5 is inevitable, and I can’t wait to start again on what I now think I wanted in the first place.

When do I terminate the Step Five Process cycle? When my model is dead! A living model is always evolving in this manner.


Standardized Risk

by Sam Savage

According to George Bernard Shaw, “The single biggest problem in communication is the illusion that it has taken place.” The poster child for this conundrum is the word risk.

You worry about the risk of XYZ stock going down, but I’ve shorted it, so I worry about XYZ going up. I offer you $200,000 in cash or a coin toss worth $0 or $1 million and you take the cash. Bill Gates risks the coin flip.

Bottom line: risk is in the eye of the beholder.

Yet most risk management techniques serve up “risk” as a single number, or worse, a color on a heat map, which is blind to risk attitude. No wonder Doug Hubbard, author of The Failure of Risk Management: Why It’s Broken and How to Fix It, argues that most current methodologies “are no better than astrology.” Both Doug and I agree that a promising approach is the computer simulation of uncertainty, but most risk simulations are siloed, and cannot be networked together into integrated systems. This will take a degree of standardization which is just emerging, and which will need to take place on multiple levels.

As an analogy, consider the standardization of financial statements.

Uniform Formatting defines how things look (let's go with the green stripe).


Formatting helps organize information visually, but does little else. The risk management version is a heat map. Don’t get me started!


Uniform Calculations define what things mean.


Risk calculations are typically done three ways:

  1. Not at all
    Doug Hubbard and co-author Richard Seiersen deride calculations with heat maps as “Orange times fish plus purple times birds equals Pee Wee Herman.”

  2. With averages
    The good news: averages are easy to calculate with. The bad news: they lead to the Flaw of Averages. I will address this in a future post, but you’d better use an incognito browser when you read it.

  3. With simulations
    Simulations preserve the uncertainty of risky situations, allowing results to be viewed according to the beholder’s individual risk attitude.

Here an important calculation is:

Risk = Likelihood of Failure x Consequence of Failure

If you treat Likelihood and Consequence as single numbers in this expression, stay tuned for my upcoming diatribe. If you use it in a simulation which conveys uncertainty, then game on!
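Here is a minimal Python sketch of the difference. All of the numbers are hypothetical: a 10% likelihood of failure in a given period and an exponentially distributed consequence averaging $1 million, chosen only because they are easy to read. The single-number version multiplies the two averages; the simulated version keeps every trial so that each beholder can ask their own question of the results.

```python
import random


def simulate_risk(likelihood, consequence_sampler, trials, rng):
    """Per trial, the failure either occurs (loss = sampled consequence) or not (loss = 0)."""
    return [
        consequence_sampler(rng) if rng.random() < likelihood else 0.0
        for _ in range(trials)
    ]


def exponential_consequence(rng, average=1_000_000.0):
    """Hypothetical consequence model: exponential with the given average."""
    return rng.expovariate(1.0 / average)


rng = random.Random(7)
likelihood = 0.10                                   # assumed chance of failure
losses = simulate_risk(likelihood, exponential_consequence, 20_000, rng)

single_number = likelihood * 1_000_000.0            # average times average
simulated_average = sum(losses) / len(losses)
chance_of_no_loss = sum(1 for x in losses if x == 0.0) / len(losses)
chance_over_1m = sum(1 for x in losses if x > 1_000_000.0) / len(losses)

print(f"single number:       ${single_number:,.0f}")
print(f"simulated average:   ${simulated_average:,.0f}")
print(f"chance of no loss:   {chance_of_no_loss:.1%}")
print(f"chance loss > $1M:   {chance_over_1m:.1%}")
```

The two averages should roughly agree, but only the simulation shows that most trials lose nothing while a few lose more than ten times the single number.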

Uniform Representations define how things are communicated. 

Hah! You thought we were done. But if you can’t communicate the results in an actionable way you are stuck in your own silo.


Hindu-Arabic numerals are so entrenched that we don’t even realize we have a choice. At we are working with others on an analogy for conveying uncertainty between simulations.

Our open SIPmath Standard represents uncertainties as arrays of simulated trials called Stochastic Information Packets (SIPs). Doug Hubbard is developing a family of portable random number generators that have already been adopted by the SIPmath Tools available on our website, and may eventually enable massive networked simulations that communicate across the economy. Tom Keelin's new representation for probability distributions, called Metalogs, also appears in our tools. I will devote a future blog to Tom’s mathematically elegant and practical invention.  
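As a rough picture of what representing an uncertainty as an array of trials buys you, consider the Python sketch below. It is not the SIPmath Standard's file format or the tools' generator; the helper names, the two Normal cost distributions, and the 1,000 trials are all assumptions for illustration. The point is that once uncertainties are arrays of equal length, combining them is ordinary trial-by-trial arithmetic, and any statistic can be read off the result afterwards.

```python
import random


def make_sip(sampler, trials, seed):
    """Build a SIP-like list: one simulated outcome per trial."""
    rng = random.Random(seed)
    return [sampler(rng) for _ in range(trials)]


def add_sips(sip_a, sip_b):
    """Add two SIPs trial by trial; trial i of the sum uses trial i of each input."""
    if len(sip_a) != len(sip_b):
        raise ValueError("SIPs must have the same number of trials")
    return [a + b for a, b in zip(sip_a, sip_b)]


def chance_exceeds(sip, threshold):
    """Fraction of trials above the threshold."""
    return sum(1 for value in sip if value > threshold) / len(sip)


TRIALS = 1_000
# Hypothetical uncertain costs for two projects.
project_a = make_sip(lambda rng: rng.gauss(100.0, 20.0), TRIALS, seed=1)
project_b = make_sip(lambda rng: rng.gauss(50.0, 10.0), TRIALS, seed=2)

total = add_sips(project_a, project_b)
average_total = sum(total) / len(total)
print(f"average total cost: {average_total:.1f}")
print(f"chance total exceeds 180: {chance_exceeds(total, 180.0):.1%}")
```

Whoever receives the total array can compute an average, a percentile, or a chance of exceeding a budget without knowing how the inputs were generated.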

The theme of our Annual Conference in San Jose on March 27th and 28th is Standardized Risk and Doug, Tom, and I will be presenting. I hope you can join us.

Do you have ideas of your own that you want to share with the world? Send us an email.

Sam L. Savage
Executive Director


Connecting the Seat of the Intellect to the Seat of the Pants


by Sam Savage

In his 2011 book, Thinking, Fast and Slow, Daniel Kahneman divides the human thought process into a fast, intuitive component, System 1, and a slower analytical side, System 2. This is a useful dichotomy, and I refer to these systems as "the seat of the pants" and the "seat of the intellect," as sensitively portrayed by Jeff Danziger in my 2009 book, The Flaw of Averages.

Kahneman claims that System 1 is bad at understanding statistics because it can only focus on one thing at a time, and System 2, which can handle many things at once, is slow and lazy and may not be consulted in the heat of decision making. 

But when you put System 1 and System 2 on what Steve Jobs called a Bicycle for the Mind (a computer), you can connect the seat of the intellect to the seat of the pants and fundamentally change your thought process.

This is one of the advantages of SIPmath for performing probabilistic analysis. First, the SIP itself, as an array of thousands of potential outcomes, is "one" thing that contains "many" things. Second, because SIPmath simulations in Excel yield results in real time by evaluating thousands of possibilities per keystroke, they can tap into our limbic system, with its tens of millions of years of evolution.
