DiscreteDistribution.frink

Download or view DiscreteDistribution.frink in plain text format

/** This class allows you to randomly *and efficiently* select between items based on a discrete probability distribution that you specify. To use this class, first create an instance of DiscreteDistribution, passing in an array of [outcome, probability] pairs, where the outcome can be any Frink expression that you want to receive from calls to the "random" method. Probabilities should sum to 1 but the code will attempt to silently normalize them if they don't. For example, the following will flip a loaded coin 100 times and print a series of "heads", "tails", and "SIDE?!" outcomes. events = [ ["heads", 49/100], ["tails", 50/100], ["SIDE?!", 1/100] ] r = new DiscreteDistribution[events] for a = 1 to 100 println[r.random[]] This uses an implementation of the alias method using Vose's algorithm. The alias method allows for efficient sampling of random values from a discrete probability distribution (i.e. rolling a loaded die) in O(1) time each after O(n) preprocessing time. The code is based on an implementation by Keith Schwarz. For a complete writeup on the alias method, including the intuition and important proofs, please see the excellent article "Darts, Dice, and Coins: Sampling from a Discrete Distribution" by Keith Schwarz at: http://www.keithschwarz.com/darts-dice-coins/ http://www.keithschwarz.com/interesting/code/?dir=alias-method You can also use this code to efficiently index into a distribution by calling the pickOutcome[p] method where p is a number between 0 and 1 (inclusive). Keep in mind that outcomes are not contiguous! */ class DiscreteDistribution { /** This is an array that maps from index to index. */ var aliasMap /** This is an array of renormalized probabilities. */ var probability /** This is an array of outcomes. */ var outcomes /** Construct a new probability distribution from a list of [outcome, probability] pairs. (See the example above.) Each outcome can be any Frink expression and is what will be returned by each call to the random[] method. */ new[pairs] := { // Make a copy of the pairs array, as we're going to modify it pairs = deepCopy[toArray[pairs]] size = length[pairs] if size == 0 { println("DiscreteDistribution.new: Probability vector is empty"); return; } outcomes = new array[size] probability = new array[size] aliasMap = new array[size] average = 1/size // These stacks will act as worklists as we populate the tables. // They contain integer indices. small = new array large = new array for i=0 to size-1 { [outcome, prob] = pairs@i if prob >= average large.push[i] else small.push[i] outcomes.push[outcome] } /* As a note: in the mathematical specification of the algorithm, we will always exhaust the small list before the big list. However, due to floating point inaccuracies, this may not necessarily be true. Consequently, this inner loop (which tries to pair small and large elements) will have to check that both lists aren't empty. */ while length[small] > 0 and length[large] > 0 { less = small.pop[] more = large.pop[] /* These pairs have not yet been scaled up to be such that 1/n is given weight 1. We do this here instead. */ probability@less = pairs@less@1 * size aliasMap@less = more /* Decrease the probability of the larger one by the appropriate amount. */ pairs@more@1 = pairs@more@1 + pairs@less@1 - average /* If the new probability is less than the average, add it into the small list; otherwise add it to the large list. */ if pairs@more@1 >= 1 / size large.push[more] else small.push[more] } /* At this point, everything is in one list, which means that the remaining probabilities should all be 1/n. Based on this, set them appropriately. Due to potential numerical issues if floating-point numbers are used, we can't be sure which stack will hold the entries, so we empty both. */ while length[small] > 0 probability@(small.pop[]) = 1 while length[large] > 0 probability@(large.pop[]) = 1 } /** Successive calls to the random[] method will return a random outcome with the probability distribution specified in the constructor. */ random[] := { /* Generate a fair die roll to determine which column to inspect. */ column = random[length[probability]] /* Generate a biased coin toss to determine which option to pick. */ coinToss = randomFloat[0,1] < probability@column /* Based on the outcome, return either the column or its alias. */ index = coinToss ? column : aliasMap@column return outcomes@index } /** This is a deterministic routine to pick an outcome with a user-generated "random" number from 0 to 1. */ pickOutcome[p] := { size = length[probability] scaled = size * p /** Find the column to inspect */ column = floor[scaled] if column == size // This happens when p = 1 column = column - 1 /** Find the location in the column to determine which option to pick */ coinToss = (scaled - column) < probability@column /* Based on the outcome, return either the column or its alias. */ index = coinToss ? column : aliasMap@column return outcomes@index } }

Download or view DiscreteDistribution.frink in plain text format

This is a program written in the programming language Frink.
For more information, view the Frink Documentation or see More Sample Frink Programs.

Alan Eliasen, eliasen@mindspring.com