## CS329 Assignment 4

Due: Monday, February 11, by the end of the day.

### The Assignment

Below are three different "languages," where each language is the sequence of outcomes produced by a series of "experiments" in one of the following scenarios:
• You have an unbalanced coin that is twice as likely to land heads as tails.

• You have a six-sided die (one of a pair of dice) that is lopsided. Rolls of 1 and 2 are equally likely; so are rolls of 3 and 4; and so are rolls of 5 and 6. However, the die rolls a 1 twice as often as a 3, and rolls a 5 twice as often as a 1.

• You work as a programmer for the Goldilocks Porridge Polling Company. Company pollsters ask the local bear population about porridge preferences, both in variety (sweetened or unsweetened) and serving temperature (hot, cold, or just-right). Thus, any respondent can have one of six preference profiles. Surprisingly, preferences in variety and preferences in temperature are entirely independent of one another. But sweetened porridge is twice as popular as unsweetened. Hot and cold are equally popular, but just-right is twice as popular as either of the others.
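
Each scenario above describes its probabilities only through ratios ("twice as likely") and, in the last case, through independence. As a minimal sketch of how such descriptions turn into probabilities, the Python below normalizes relative weights and multiplies independent marginals. The function names `normalize` and `joint`, and the color/size example, are hypothetical illustrations and are deliberately not one of the assignment's scenarios.

```python
from fractions import Fraction
from itertools import product


def normalize(weights):
    """Turn relative integer weights into exact probabilities that sum to 1."""
    total = sum(weights.values())
    return {symbol: Fraction(w, total) for symbol, w in weights.items()}


def joint(dist_a, dist_b):
    """Joint distribution of two independent attributes: multiply the marginals."""
    return {
        (a, b): pa * pb
        for (a, pa), (b, pb) in product(dist_a.items(), dist_b.items())
    }


# Hypothetical example: red is three times as likely as blue,
# and the two sizes are equally likely and independent of color.
color = normalize({"red": 3, "blue": 1})
size = normalize({"small": 1, "large": 1})
pairs = joint(color, size)

for symbol, p in pairs.items():
    print(symbol, p)
```

Using `Fraction` keeps the probabilities exact, which makes it easy to confirm that they sum to 1 before plugging them into a formula.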

The goal is to come up with encodings for each of these languages which are streaming, lossless and uniquely decodable. For each language, answer the following six questions:

1. What are the symbols in the language?
2. Compute the entropy of this language. (Please write down the entropy formula with this language's probabilities filled in; you don't have to compute a final numeric answer.)
3. What is the naive encoding for this language?
4. Devise an encoding that has the required properties and does better on average than the naive encoding.
5. Can your encoding ever be worse than the naive encoding? Explain your answer.
6. Provide a rigorous argument that your encoding is better on average than the naive encoding.
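
To check answers to questions 2 through 6 numerically, a sketch along the following lines may help. It assumes the standard Shannon entropy (in bits) and measures "better on average" as expected code length per symbol. The helper names and the three-symbol distribution are hypothetical and are not one of the assignment's languages. The prefix-free test is a sufficient condition for an encoding to be uniquely decodable and decodable as a stream, not a necessary one.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits: -sum of p * log2(p) over all symbols."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)


def average_length(probs, code):
    """Expected number of bits per symbol under the given encoding."""
    return sum(p * len(code[symbol]) for symbol, p in probs.items())


def is_prefix_free(code):
    """True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(
        i != j and longer.startswith(shorter)
        for i, shorter in enumerate(words)
        for j, longer in enumerate(words)
    )


# Hypothetical three-symbol language, not taken from the assignment.
probs = {"a": 0.5, "b": 0.25, "c": 0.25}
naive = {"a": "00", "b": "01", "c": "10"}   # fixed-length encoding
better = {"a": "0", "b": "10", "c": "11"}   # variable-length encoding

print(entropy(probs))                 # 1.5
print(average_length(probs, naive))   # 2.0
print(average_length(probs, better))  # 1.5
print(is_prefix_free(better))         # True
```

A numeric check like this does not replace the rigorous argument asked for in question 6, but it can catch an encoding whose average length is not actually shorter, or whose codewords collide.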

Please type your answers, but don't worry too much about formatting, as long as they are readable.