# Refactoring probability distributions, part 2: Random sampling

In Part 1, we cloned PFP, a library for computing with probability distributions. PFP represents a distribution as a list of possible values, each with an associated probability.

But in the real world, things aren’t always so easy. What if we wanted to pick a random number between 0 and 1? Our previous implementation would break, because there’s an infinite number of values between 0 and 1—they don’t exactly fit in a list.

As it turns out, Sungwoo Park and colleagues found an elegant solution to this problem. They represented probability distributions as sampling functions, resulting in something called the λ_{◯} calculus. (I have no idea how to pronounce this!)

With a little bit of hacking, we can use their sampling functions as a drop-in replacement for PFP.

### A common interface

Since we will soon have two ways to represent probability distributions, we need to define a common interface.

```
type Weight = Float
class (Functor d, Monad d) => Dist d where
weighted :: [(a, Weight)] -> d a
uniform :: Dist d => [a] -> d a
uniform = weighted . map (\x -> (x, 1))
```

The function `uniform`

will create an equally-weighted distribution from a list of values. Using this API, we can represent a two-child family as follows:

```
data Child = Girl | Boy
deriving (Show, Eq, Ord)
child :: Dist d => d Child
child = uniform [Girl, Boy]
family :: Dist d => d [Child]
family = do
child1 <- child
child2 <- child
return [child1, child2]
```

Now, we need to implement this API two different ways: Once with lists, and a second time with sampling functions.

### Finite distibutions, revisted

The monad from part 1 now becomes:

```
type FDist = PerhapsT ([])
instance Dist FDist where
weighted [] = error "Empty distribution"
weighted xws = PerhapsT (map weight xws)
where weight (x,w) = Perhaps x (P (w / sum))
sum = foldl' (+) 0 (map snd xws)
exact :: FDist a -> [Perhaps a]
exact = runPerhapsT
```

We can run this as follows:

```
> exact family
[Perhaps [Girl,Girl] 25.0%,
Perhaps [Girl,Boy] 25.0%,
Perhaps [Boy,Girl] 25.0%,
Perhaps [Boy,Boy] 25.0%]
```

### Sampling functions

This second part is also easy. First, we define a type `Rand a`

, which represents a value of type `a`

computed using random numbers.

We implement `Rand`

by reduction to Haskell’s IO monad. (There’s any number of other ways to do this, of course.)

```
newtype Rand a = Rand { runRand :: IO a }
randomFloat :: Rand Float
randomFloat = Rand (getStdRandom (random))
```

Next, we make `Rand`

a trivial, sequential monad:

```
instance Functor Rand where
fmap = liftM
instance Monad Rand where
return x = Rand (return x)
r >>= f = Rand (do x <- runRand r
runRand (f x))
```

We can turn any `FDist`

into a `Rand`

by picking an element at random:

```
instance Dist Rand where
weighted = liftF . weighted
liftF :: FDist a -> Rand a
liftF fdist = do
n <- randomFloat
pick (P n) (runPerhapsT fdist)
pick :: Monad m => Prob -> [Perhaps a] -> m a
pick _ [] = fail "No values to pick from"
pick n ((Perhaps x p):ps)
| n <= p = return x
| otherwise = pick (n-p) ps
```

Of course, a single random sample won’t do us much good. We need functions for repeated sampling and for displaying the results:

```
sample :: Rand a -> Int -> Rand [a]
sample r n = sequence (replicate n r)
sampleIO r n = runRand (sample r n)
histogram :: Ord a => [a] -> [Int]
histogram = map length . group . sort
```

And sure enough, we get a nice, even distribution:

```
> liftM histogram (sampleIO family 1000)
[243,242,254,261]
```

That’s it! Two monads for the price of one.

Part 3: Coming soon.

Want to contact me about this article? Or if you're looking for something else to read, here's a list of popular posts.

`sequence (replicate f xs)`

is just`replicateM f xs`

. Great articles, by the way!