The Multivariate Metalog

By Sam L. Savage

(Free webinar with Tom Keelin on the Multivariate Metalog Distribution, May 17, 2023, 8:00 AM PT)

Over the years I have blogged numerous times about the Metalog quantile functions, described here in Wikipedia. A quantile function is a formula used in simulations to turn a uniform random number (RAND(), for example, in Excel) into a random variate with a desired distribution shape. This post provides background and context for understanding a more complex version, the Multivariate Metalog, which its inventor, Tom Keelin, will be presenting in a webinar later this month.
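
To make the idea of a quantile function concrete, here is a minimal Python sketch of the unbounded Metalog quantile function, limited to its first six terms and driven by uniform random numbers. The coefficient values and the helper names (`metalog_quantile`, `open_uniform`) are assumptions chosen for illustration, not a published fit.

```python
import math
import random


def metalog_quantile(y, a):
    """Evaluate an unbounded metalog quantile function with 2 to 6 terms.

    y is a probability strictly between 0 and 1; a is the coefficient list.
    """
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    if not 2 <= len(a) <= 6:
        raise ValueError("this sketch supports 2 to 6 coefficients")
    logit = math.log(y / (1.0 - y))
    d = y - 0.5
    # Basis terms in the standard metalog ordering.
    basis = [1.0, logit, d * logit, d, d * d, d * d * logit]
    return sum(c * t for c, t in zip(a, basis))


def open_uniform(rng):
    """Draw a uniform number in the open interval (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


# Hypothetical coefficients, for illustration only.
coeffs = [10.0, 2.0, 0.5]

rng = random.Random(42)
samples = [metalog_quantile(open_uniform(rng), coeffs) for _ in range(10_000)]
sample_median = sorted(samples)[len(samples) // 2]
```

At y = 0.5 every term except the first vanishes, so the first coefficient is the median of the distribution.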

 

Tom views the Metalog as an extremely flexible family of probability distributions, which can be easily fit to data using the ordinary least squares method.
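
Because the Metalog quantile function is linear in its coefficients, fitting it to data reduces to a linear regression of the sorted data on the basis terms. The sketch below assumes plotting positions of (i + 0.5)/n for the sorted observations and solves the normal equations in plain Python; the function names are hypothetical, and a production fit would more likely use a numerical library.

```python
import math


def metalog_basis(y, k):
    """Return the first k metalog basis terms (k between 2 and 6) at probability y."""
    logit = math.log(y / (1.0 - y))
    d = y - 0.5
    return [1.0, logit, d * logit, d, d * d, d * d * logit][:k]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_metalog(data, k=3):
    """Least-squares fit of k metalog coefficients to a list of observations."""
    xs = sorted(data)
    n = len(xs)
    # Assumed plotting positions: midpoints of n equal probability bins.
    ys = [(i + 0.5) / n for i in range(n)]
    rows = [metalog_basis(y, k) for y in ys]
    # Normal equations: (B^T B) a = B^T x
    BtB = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Btx = [sum(r[i] * x for r, x in zip(rows, xs)) for i in range(k)]
    return solve_linear(BtB, Btx)


# Round-trip check on synthetic data built from hypothetical coefficients.
true_coeffs = [10.0, 2.0, 0.5]
n = 200
synthetic = [
    sum(c * t for c, t in zip(true_coeffs, metalog_basis((i + 0.5) / n, 3)))
    for i in range(n)
]
fitted = fit_metalog(synthetic, k=3)
```

Since the synthetic data lie exactly on a three-term Metalog, the fit should recover the coefficients up to floating-point error.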

Although I agree with Tom’s assessment, I have a very different perspective. I view the Metalog as a practical way to encode up to 100 million Monte Carlo trials into 22 parameters (88 bytes), when coupled with Doug Hubbard’s HDR uniform random number generator. Now, before you protest that packing millions of numbers into 88 bytes of memory violates Shannon’s theory of information, look again. I didn’t say pack, I said encode. That is, the bare information needed to regenerate the numbers fits into 88 bytes. By practical, I mean that the data may be interpreted with a single formula each for the Metalog and HDR, and these are easy to implement in Excel, Python, R, or virtually any other programming environment.
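
The difference between packing and encoding can be sketched in a few lines of Python. The example stores three Metalog coefficients and one seed as 4-byte values (the same size per value implied by 22 parameters in 88 bytes) and regenerates identical trials on demand. Python's built-in generator is used here purely as a stand-in for HDR, and the byte layout and function names are assumptions made for this illustration, not the layout of any standard.

```python
import math
import random
import struct


def metalog3(y, a1, a2, a3):
    """Three-term unbounded metalog quantile function."""
    logit = math.log(y / (1.0 - y))
    return a1 + a2 * logit + a3 * (y - 0.5) * logit


def encode(coeffs, seed):
    """Store three coefficients and a seed as four 4-byte values."""
    return struct.pack("<3fI", *coeffs, seed)


def decode_trials(blob, n_trials):
    """Regenerate the trials from the stored coefficients and seed."""
    a1, a2, a3, seed = struct.unpack("<3fI", blob)
    rng = random.Random(seed)  # stand-in generator, NOT the HDR formula
    trials = []
    while len(trials) < n_trials:
        u = rng.random()
        if u > 0.0:  # the quantile function needs 0 < u < 1
            trials.append(metalog3(u, a1, a2, a3))
    return trials


# Hypothetical coefficients and seed.
blob = encode((10.0, 2.0, 0.5), seed=2023)
first_run = decode_trials(blob, 10_000)
second_run = decode_trials(blob, 10_000)
```

The blob never grows with the number of trials: the same few bytes regenerate 10,000 trials or 100 million, and every run reproduces the same sequence.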

Recall that the definition of probability management is the storage of uncertainty as data, which obey both the laws of arithmetic and the laws of probability, while maintaining statistical coherence [WIKI]. Generating random variates is only part of the problem. A separate and potentially more difficult issue is to maintain coherence with respect to the underlying statistical interrelationships. This is usually performed by correlating the uniform random numbers driving the quantile functions in a process known as the Copula Method.
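
As a rough illustration of the Copula Method, the sketch below uses a Gaussian copula, one common choice among several: it draws two correlated standard normals, maps each through the normal CDF to get correlated uniforms, and feeds those into two different quantile functions. The correlation value, the two marginals, and all function names are assumptions for this example.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def clamp_open(u, eps=1e-12):
    """Keep a probability strictly inside (0, 1)."""
    return min(max(u, eps), 1.0 - eps)


def correlated_uniforms(rho, n, seed):
    """Gaussian copula: n pairs of uniforms linked by correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((clamp_open(normal_cdf(z1)), clamp_open(normal_cdf(z2))))
    return pairs


def logistic_quantile(u, location, scale):
    """Two-term metalog, which is the logistic quantile function."""
    return location + scale * math.log(u / (1.0 - u))


def exponential_quantile(u, mean):
    """Quantile function of an exponential distribution."""
    return -mean * math.log(1.0 - u)


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


pairs = correlated_uniforms(rho=0.8, n=5_000, seed=7)
first = [logistic_quantile(u1, 100.0, 5.0) for u1, _ in pairs]
second = [exponential_quantile(u2, 3.0) for _, u2 in pairs]
output_correlation = pearson(first, second)
```

Each output keeps its own marginal shape, while the dependence between them comes entirely from the correlated uniforms.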

To put all this in the context of the Open SIPmath™ 3.0 Standard, the 88 bytes representing the Metalog and HDR are embedded in JSON files along with what is called a Copula Layer and Metadata. This makes the 3.0 Standard a sort of USB port for simulation distributions.

The Multivariate Metalog, in which the input parameters of one Metalog are driven by the outputs of another Metalog, presents an interesting and potentially revolutionary alternative to the Copula Method. The rotating image above was created with a tri-variate Metalog to represent the length, girth and weight of steelhead trout. The red dots represent the original data set, while the blue dots represent synthetic results generated by the Metalog.
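
The general idea of one Metalog driving the parameters of another can be sketched as follows. This is only a toy illustration of the concept as described here, not Tom Keelin's actual formulation: it uses two-term Metalogs, two independent uniforms, and made-up linear rules that set the weight distribution's location and scale from the sampled length. All numbers and names are hypothetical.

```python
import math
import random


def metalog2(u, location, scale):
    """Two-term unbounded metalog quantile function."""
    return location + scale * math.log(u / (1.0 - u))


def open_uniform(rng):
    """Draw a uniform number in the open interval (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def sample_pair(rng):
    """Sample (length, weight), with weight's parameters driven by length."""
    u1, u2 = open_uniform(rng), open_uniform(rng)  # independent uniforms
    length = metalog2(u1, 60.0, 4.0)               # hypothetical marginal
    # Hypothetical rules: longer fish are heavier and more variable.
    weight_location = 0.1 * length - 3.0
    weight_scale = 0.005 * length
    weight = metalog2(u2, weight_location, weight_scale)
    return length, weight


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(11)
fish = [sample_pair(rng) for _ in range(5_000)]
lengths = [f[0] for f in fish]
weights = [f[1] for f in fish]
length_weight_correlation = pearson(lengths, weights)
```

Note that no correlated uniforms are needed: the dependence between length and weight arises because the second distribution's parameters are functions of the first variable's output.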

Tom Keelin offers a chance to learn more about this exciting approach in this webinar.

Copyright © 2023 Sam L. Savage