The Alias Method

Some days ago, there was some discussion in #lisp about how to program a random variable with nontrivial probability distribution. That is, given values x1,…,xn, and probabilities p1,…,pn, how to write a function that randomly returns one of those values with the corresponding probability each time it is called.

It turns out this problem has a beautiful, simple, and efficient solution called the alias method (R. A. Kronmal and A. V. Peterson. On the alias method for generating random variables from a discrete distribution. The American Statistician, 33:214–218, 1979.), that requires only one call to the random number generator. While I haven’t read any complete explanation of it (only the one in R. A. Kronmal, A. V. Peterson, The alias and alias-rejection-mixture methods for generating random variables from probability distributions. Proc. 1979 Winter Simulation Conference (ed. H.J. Highland et al.) 269–280, 1979, which isn’t complete), I think it is highly unlikely that it is different from the method I describe below.

An Common Lisp implementation can be found here.

The alias method

If you choose one real number between 0 and 1 at random with a uniform distribution, the probability that it lies between a and b, with
0≤a<b≤1, is b-a. So the obvious solution to our problem is to put intervals Ii in [0,1) of length pi next to each other. Each time we want a value of our random variable, we choose a value v in [0,1) at random with a uniform distribution, and see in which interval it falls. If it falls in Ik, the value is xk. The problem is that finding the interval is costly.

The fundamental idea behind the alias method is that this approach still works if each Ii was subdivided in nonoverlaping, not necessarily contiguous intervals Ji1, Ji2,…,Jiki in [0,1) such that the sum of their lengths is pi. This can be used to rearange the intervals in such a way that it is easy to know in what interval a given point lies, and to which value the interval corresponds. To do this, one divides [0,1) in n intervals Lk of equal length, and then packs each with at most two intervals belonging to different xk. Thus, finding out in what interval a point falls amounts to finding out, first, in which Lk it falls, which can be obtained by division and truncation, and then finding out which one of the two possibilities that are left holds.

To get this redistribution right, pack L1 with an interval whose length is less or equal than 1/n, and, if necessary, take away what is needed to fill up L1 from one whose length is greater than 1/n. Now, filling up the rest of the Li is very close to the initial problem, but with one value (and one interval) less.

About these ads
  1. #1 by Jens Axel Søgaard on April 17, 2006 - 7:42 pm

    An interesting problem. Is this alias method the same as the alias method in “The Art of Computer Programming” by Knuth vol 2 section 3.2.1 ? It seems to solve the same problem, but is attributed to A.J.Walker (1977).

    A Scheme implementation can be found at the bottom of

  2. #3 by prxq on April 17, 2006 - 10:55 pm

    In the second reference I gave above, the authors say that their version is a slight modification of the alias method by Walker. Not having the Knuth at hand, I can’t really say.

  3. #4 by Jens Axel Søgaard on April 17, 2006 - 11:31 pm

    Based on the title “On the alias method…” I think you are right it is a refinement of the Walker method. I was unable to find an online copy, so I hope my local library can get it for me.

  1. Far away from Ankaa » More on the alias method

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: