# HW4: Approximate Inference in Bayesian Networks

## Due 11:59pm, Friday, April 5, 2013

In this homework, you will implement rejection sampling and Gibbs sampling to perform approximate inference on Bayesian networks (BNs). More specifically, your task is to compute conditional probabilities on small BNs (~10 variables) containing binary random variables. First, carefully read Bishop 8.1 and 8.2 and Bishop 11.1-11.3. Next, write code to represent BNs and to compute the joint probability of an assignment of values to the variables. Then implement the two inference algorithms. Your experiments will roughly work as follows: generate a BN (either by hand or randomly), generate a random query (i.e., a conditional probability), answer the query with both inference algorithms, and compare the answers you get. You should try many queries. As the number of samples grows, the two algorithms should converge on the same answer. As always, for full credit you should think of a sensible way of presenting your results and run some extra experiment(s) to impress the grader.
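
As an illustration of the "represent BNs and compute the joint probability" step, here is a minimal sketch of one possible representation. Everything in it is an assumption made for the example, not a required design: the dictionary layout, the three-variable network with its made-up probabilities, and the function names `joint_probability` and `total_probability` are all hypothetical. The joint probability is the product, over all variables, of each variable's probability given its parents.

```python
import itertools

# Hypothetical representation: each variable maps to (parents, cpt), where cpt
# maps a tuple of parent values to P(variable = 1 | parents).
# Variables are listed so that parents come before their children.
# The network and its numbers are made up for illustration.
network = {
    "rain": ((), {(): 0.2}),
    "sprinkler": (("rain",), {(0,): 0.4, (1,): 0.01}),
    "wet": (
        ("rain", "sprinkler"),
        {(0, 0): 0.05, (0, 1): 0.9, (1, 0): 0.8, (1, 1): 0.99},
    ),
}


def joint_probability(network, assignment):
    """Return P(assignment) as the product of P(x_i | parents(x_i))."""
    p = 1.0
    for var, (parents, cpt) in network.items():
        p_one = cpt[tuple(assignment[q] for q in parents)]
        p *= p_one if assignment[var] == 1 else 1.0 - p_one
    return p


def total_probability(network):
    """Sanity check: sum the joint over every binary assignment."""
    names = list(network)
    return sum(
        joint_probability(network, dict(zip(names, values)))
        for values in itertools.product((0, 1), repeat=len(names))
    )


if __name__ == "__main__":
    print(joint_probability(network, {"rain": 1, "sprinkler": 0, "wet": 1}))
    print(total_probability(network))  # should be 1 up to rounding error
```

Summing the joint over all assignments, as `total_probability` does, is a cheap way to catch mistakes in hand-written probability tables. It is only feasible because the networks here are small.
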

The chapter on sampling in Bishop describes sampling algorithms in a general way that applies to any probabilistic model, not just BNs. Here are some hints on how to implement these algorithms for BNs specifically.

Rejection sampling: First, generate lots of samples from the prior distribution represented by the BN. This can be done using ancestral sampling (described on page 365). Then reject the samples that don't agree with the evidence (i.e., those that assign a different value to any of the variables you are conditioning on), and keep the ones that do. The samples that agree with the evidence are samples from the conditional distribution. Rejection sampling is a very inefficient method, but it works for very small networks. Its main purpose in this assignment is to check the correctness of your Gibbs sampling implementation.
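
The rejection sampling hint above can be sketched as follows. This is only one way to do it, under stated assumptions: the `(parents, cpt)` dictionary layout, the toy network and its probabilities, and the names `ancestral_sample` and `rejection_sampling` are all hypothetical and chosen for the example.

```python
import random

# Hypothetical representation: variable -> (parents, cpt), where cpt maps a
# tuple of parent values to P(variable = 1 | parents). Parents are listed
# before their children. The numbers are made up for illustration.
network = {
    "rain": ((), {(): 0.2}),
    "sprinkler": (("rain",), {(0,): 0.4, (1,): 0.01}),
    "wet": (
        ("rain", "sprinkler"),
        {(0, 0): 0.05, (0, 1): 0.9, (1, 0): 0.8, (1, 1): 0.99},
    ),
}


def ancestral_sample(network, rng):
    """Draw one sample from the prior by sampling parents before children."""
    sample = {}
    for var, (parents, cpt) in network.items():
        p_one = cpt[tuple(sample[q] for q in parents)]
        sample[var] = 1 if rng.random() < p_one else 0
    return sample


def rejection_sampling(network, query, evidence, n_samples, rng):
    """Estimate P(query = 1 | evidence); also return how many samples were kept."""
    kept = 0
    hits = 0
    for _ in range(n_samples):
        sample = ancestral_sample(network, rng)
        # Discard samples that disagree with the evidence on any variable.
        if all(sample[var] == value for var, value in evidence.items()):
            kept += 1
            hits += sample[query]
    if kept == 0:
        return None, 0  # no sample matched the evidence, so no estimate
    return hits / kept, kept


if __name__ == "__main__":
    rng = random.Random(0)
    estimate, kept = rejection_sampling(network, "rain", {"wet": 1}, 20000, rng)
    print(estimate, kept)
```

Returning the number of kept samples alongside the estimate makes it easy to report how many of the generated samples actually contributed to each answer.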

Gibbs sampling: Implement Gibbs sampling as described on Bishop p. 543, modified to compute P(z_i | e), where z_i is a *single variable* and e is an assignment to some set of variables. Note that the algorithm in the book samples from the joint distribution. To sample from the conditional distribution P(z_i | e) instead, make two changes. First, change the initialization (step 1 of the algorithm) so that the variables in e are set to the values given in e. Second, in step 2 of the algorithm, do not resample values for the variables in e; leave them alone. This can be seen as "clamping" the variables in the evidence set e. You can choose how many samples to use and the length of the burn-in period. Part of the algorithm involves computing the probability distribution of a variable given an assignment of values to all the other variables, P(z_i | {z_\i}). This probability can be computed using the equation on p. 382, replacing the integration with summation since we are dealing with discrete random variables. Below the equation, Bishop discusses a more efficient way of doing this that uses the concept of a Markov blanket, but you can ignore that for the homework and just do it the "brute force" way, by directly implementing the equation.
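
A minimal sketch of the clamped Gibbs sampler described above, using the brute-force conditional: for a binary variable, P(z_i = 1 | all other variables) is the joint with z_i = 1 divided by the sum of the joints with z_i = 0 and z_i = 1. The representation, the toy network and its probabilities, and the names `joint_probability`, `conditional_p_one`, and `gibbs_sampling` are assumptions made for this example. The sketch also assumes every table entry is strictly between 0 and 1, so the normalizing sum is never zero.

```python
import random

# Hypothetical representation: variable -> (parents, cpt), where cpt maps a
# tuple of parent values to P(variable = 1 | parents). The numbers are made up,
# and all are strictly between 0 and 1 so every full assignment is possible.
network = {
    "rain": ((), {(): 0.2}),
    "sprinkler": (("rain",), {(0,): 0.4, (1,): 0.01}),
    "wet": (
        ("rain", "sprinkler"),
        {(0, 0): 0.05, (0, 1): 0.9, (1, 0): 0.8, (1, 1): 0.99},
    ),
}


def joint_probability(network, assignment):
    """Return P(assignment) as the product of P(x_i | parents(x_i))."""
    p = 1.0
    for var, (parents, cpt) in network.items():
        p_one = cpt[tuple(assignment[q] for q in parents)]
        p *= p_one if assignment[var] == 1 else 1.0 - p_one
    return p


def conditional_p_one(network, state, var):
    """Brute force P(var = 1 | all other variables): normalize the joint over var."""
    joints = []
    for value in (0, 1):
        trial = dict(state)
        trial[var] = value
        joints.append(joint_probability(network, trial))
    return joints[1] / (joints[0] + joints[1])


def gibbs_sampling(network, query, evidence, n_samples, burn_in, rng):
    """Estimate P(query = 1 | evidence) with evidence variables clamped."""
    # Initialization: random values everywhere, then clamp the evidence.
    state = {var: rng.randint(0, 1) for var in network}
    state.update(evidence)
    free = [var for var in network if var not in evidence]
    hits = 0
    for sweep in range(burn_in + n_samples):
        for var in free:  # evidence variables are never resampled
            p_one = conditional_p_one(network, state, var)
            state[var] = 1 if rng.random() < p_one else 0
        if sweep >= burn_in:  # count only sweeps after the burn-in period
            hits += state[query]
    return hits / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    print(gibbs_sampling(network, "rain", {"wet": 1}, 20000, 1000, rng))
```

Here one "sample" is taken after each full sweep over the non-evidence variables. That is a design choice of this sketch, and counting after every single-variable update would be another reasonable option.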

In your report, make sure to answer the following question: why is rejection sampling less efficient (in general) than Gibbs sampling?

Submit your code and a report. The command to submit will be something like: `turnin --submit lewfish hw4`.