CS 314 Specification 4 - Evil Hangman - Using Data Structures

Programming Assignment 4: Individual Assignment. You must complete this assignment on your own. You may not acquire from any source (e.g.  another student, a web site, a LLM such as chatGPT) a partial or complete solution to a problem or project that has been assigned. You may not show another student your solution to an assignment. You may not have another person (current student, former student, tutor, friend, anyone, LLM) “walk you through” how to solve an assignment. You may get help from the instructional staff. You may discuss general ideas and approaches with other students but you may not develop code together. Review the class policy on collaboration from the syllabus

The purpose of this assignment is to implement a program that uses data structures including lists, maps, and sets to create a very challenging version of Hangman.

Many thanks to Keith Schwarz of Stanford University for introducing me to this programming project and to Stuart Reges for his explanation of how the algorithm used to cheat.

Files: File Responsibility
Dictionaries Provided by me
Java Documentation Provided by Oracle
Sample output Sample outputs of the Evil Hangman Program
  • sample log 1 (using the small dictionary with debugging and hardest difficulty)
  • sample log 2 (using the small dictionary with debugging and easiest difficulty)
  • sample log 3 (using the small dictionary without debugging, various difficulties)
  • sample log 4 (using the large dictionary with debugging and hardest difficulty)
  • sample log 5 (using the large dictionary with debugging and easiest difficulty)
  • sample log 6 (using the large dictionary without debugging and hardest difficulty)
  • sample log 7 (using the large dictionary without debugging and medium difficulty)
  • sample log 8 (using the large dictionary without debugging and easiest difficulty)
Provided by me. Your programs output must match the sample runs with debugging off exactly. You can copy and paste into a diff program or website to compare.

The only allowable difference is when a player loses. The word printed is random and so your output may not match in some cases.

Implementation HangmanMain.java  - The driver program for the game. Provided by me.
Implementation HangmanDifficulty.java - An Enumerated Type with the three difficulty levels. Used in HangmanMain and HangmanManager. Provided by me. Do not alter.
Implementation
and Submission
HangmanManager.java - A class that keeps track of the state of the Hangman game. I provide the shell, you complete the class.
Testing In the past students visual inspected their results and compared them to the sample output logs. Students often missed or didn't test for errors and then failed 50 or more of our automated tests. Now you have access to the kinds of tests we run and a version of our testing program. (The actual tests we use will be different of course.)

Once you are sure your program is correct, you can and should generate your own tests and share them with other students.

EvilHangmanAutoTester.java - Uses your HangmanManager and the tests encoded in evilTests_STUDENT_VERSION.eht to automatically test. As new test files get generated you can alter the name of the input file.

evilTest_STUDENT_VERSION.eht - Data file for EvilHangmanAutoTester.java. This file is not human readable. The file was generated by printing out objects that implement the Serializable interface (such as TreeMap or HashMap) that allow an object to be printed out and read in with a single line of code. (And is generally more efficient that printing out a String version of the object.)

GenerateTests.java - Use this if you want to generate your own tests to share with other students. Replace the test methods or add new ones and change the name of the output file. This class relies on a data file and your HangmanManager to generate tests. Here is an example data file, studentTestData.txt, which was used to generate the evilTest_STUDENT_VERSION.eht above.

Provided by me. You are not required to turn in any tests.

Problem Description: Hangman in CS312 was easy. (Well, at least for the person playing the game.) Time to ramp up the difficulty with our Hangman program. This is Evil Hangman. In this version of the game the program AI cheats, but in a very subtle and clever way.

If you are not familiar with the game read the Wikipedia entry. You can also find many versions of the game online such as this one.

Our version of the game will NOT have any graphics or drawings of the gallows.

Evil Hangman is, well, evil, because the computer puts off actually picking a word for as long as possible. The program starts off with a dictionary (list) of words. As the user guesses letters the computer narrows the list of words that are active, meaning words that could still be chosen. The words (really just Strings) in the active list all  have the same pattern as the "word" displayed to the user.

In other words, the program does not pick the secret word until it has to. Instead, it keeps a list of possible words based on user guesses.

A pattern consist of the letters that have been guessed correctly and the letters that have not been guessed yet. For example, if the current pattern is "-e-e-" then the words "beget", "beret", "bevel", "defer", "deter" and so forth would be in the list of active words. Those words "match" the pattern. The "-" characters are wildcard characters and represent any character that has not been guessed yet.

When the user guesses a letter the computer splits the current list of words into different families (really just various lists of words) based on the letter the user just guessed and the resulting pattern of that letter in words from the current list. The computer then picks a list of word from the various lists that have been created. If the computer is being evil, it picks the hardest list of words as the new active list. In general the longest list of words is considered the hardest, but there are tiebreakers explained below. The program also allows the difficulty to vary so the picked list may not always be the list with the largest number of words.

I am providing a driver program that handles the input and output. You are writing a class, HangmanManager that keeps track of the state of the game and narrows the list of words as the user makes guesses. Your HangmanManager class must have the following methods:

Method Name Description
public HangmanManager(Set<String> words,
                                 boolean debug)

public HangmanManager(Set<String> words)
Constructors that creates the HangmanManager. words will be a nonempty Set of Strings. You can assume the Strings will contain only lower case letters a through z.

For the constructor with only a Set parameter debugging shall be false.

The constructors copy the dictionary and possibly some other things depending on your implementation. They don't a lot of other prep work. Most of that is done in prepForRound.

public void prepForRound(int wordLen, int numGuesses, HangmanDifficulty diff) This method must be called before each game, including the first one.

It is an important method. The HangmanMain program allows the user to play the game. Instead of creating a new HangmanManager object each time a person plays a game, call this method to prepare (or reset) the HangmanManager object with the new word length, number of wrong guesses allowed, and the difficulty. It is important to realize this method must also reset the list of eligible active words based on the original dictionary, the pattern, and the number of wrong guesses.

public TreeMap<String, Integer> makeGuess(char guess) This is the method does the heavy lifting. The precondition is !alreadyGuessed(guess). If the character has already been guessed throw an IllegalStateException .

Using this guess, the HangmnaManager decides what set of words to use going forward. The method returns a map of the patterns it generated and how many words matched each pattern. (Note, the HangmanManager does not keep track of this map as an instance variable.) After this method the active list of words may be changed. It is important to realize there is a lot to do in this method including updating the active pattern, the letters that have been guessed so far, and the number of wrong guesses if the pattern does not change.

You must create and use a local variable of type Map<String, ArrayList<String>> to break the active word list up into new families of words. (Recall, one of the purposes of the assignment is to learn and practice using map data structures.)

The String in the Map returned represent the possible word patterns based on the guess and the list of active words. The values are the number of words that fit this pattern. The sample output with debugging on shows exactly what this map should be for the given conditions. I suggest putting off building this second map returned by the method until the end of the method. This return value is used for debugging and testing purposes. If we were writing the program outside of an academic class this method would likely be void.

public boolean alreadyGuessed(char guess)

This method returns true if the user has already guessed the given character, false otherwise.

public int getGuessesLeft() This method returns the number of wrong guesses the user has left.
public String getGuessesMade() This method returns the letters the user has guessed so far in sorted order.
public String getPattern() This method returns the current pattern to be displayed for the hangman game taking into account guesses that have been made.  Letters that have not yet been guessed should be displayed as a dashes.  There shall only be lower case letters and dashes in the returned String.
public String getSecretWord() { This method gets the secret word the computer finally settles on. If the active list of words has more than one element, then pick a word at random from the active list. If the list has been narrowed to one word and the user has guessed it, return that word. This method throws an IllegalStateException if the list of active words is size 0. (This should never occur if you implement the program correctly.)
public int numWords(int length) This method returns the number of words in the original dictionary of the given length. It is used at the beginning of each game to ensure the user picks a valid word length.
public int numWordsCurrent() This method returns the number of words in the active list of words. (not the original dictionary.) It is used for debugging purposes to show how many words are left in the active list.

Add the instance variables you need to HangmanManager class. Add helper methods as necessary to keep methods small and easy to understand. My version of the makeGuess method uses five private helper methods in addition to the public methods listed above. For reference my solution has 8 (That is a lot and I am trying to figure out a way to reduce it.)  instance variables and is 315 lines long. That 315 lines includes a lot of blank lines, debugging code, instance variable declarations, method headers, import statements, comments, and lines with a single closing brace. The actual number of lines of code is on the order of 170.


Explanation of How the Program Cheats:  As noted earlier, this version of hangman cheats.  The program doesn’t actually pick a single word until it is forced to. Suppose that the user has asked for a 5-letter word.  Instead of picking a specific 5-letter word, it considers all 5-letter words from the original dictionary as potential choices.  Then as the user makes guesses, the program can’t completely lie.  It has to somehow fool the user into thinking that it isn’t cheating.  In other words, it has to cover its tracks.  Your HangmanManager object shall do this in a very specific way every time the makeGuess method is called.  Let’s look at a small example.

Suppose that the dictionary contains just the following 9 words:

[ally, beta, cool, deal, else, flew, good, hope, ibex]

Now, suppose that the user guesses the letter ‘e’.  You now need to indicate which letters in the word you've “picked” are e's.  Of course, you haven't picked a word, and so you have multiple options about where you reveal the e's. Every word in the set falls into one of five “word families:”

·         “- - - -”: which is the pattern for [ally, cool, good]

·         “- e - -”: which is the pattern for [beta, deal]

·         “- - e -”: which is the pattern for [flew, ibex]

·         “e - - e”: which is the pattern for [else]

·         “- - - e”: which is the pattern for [hope]

Since the letters you reveal have to correspond to some word in your word list, you can choose to reveal any one of the above five families. There are many ways to pick which family to reveal – perhaps you want to steer your opponent toward a smaller family with more obscure words, or toward a larger family in the hopes of keeping your options open.  In this assignment, in the interests of simplicity, we'll adopt the latter approach and always choose the largest of the remaining word families. In this case, it means that you should pick the family “- - - -”. This reduces your set of words to:

[ally, cool, good]

Since you didn't reveal any letters, count this as a wrong guess.

Let's see a few more examples of this strategy. Given this three-word set, if the user guesses the letter 'o;, then you would break your word list down into two families:

·         “- o o -”: containing [cool, good]

·         “- - - -”: containing [ally]

The first of these families is larger than the second, and so you choose it, revealing two 'o's in the word and reducing your set of words to

[cool, good]

In this case the user has correctly guessed a letter because there are two occurrences of 'o' in the new pattern.

But what happens if your opponent guesses a letter that doesn't appear anywhere in your word list? For example, what happens if your opponent now guesses 't'? This isn't a problem. If you try splitting these words apart into word families, you'll find that there's only one family – the family “- o o -” in which t appears nowhere and which contains both “cool” and “good”. Since there is only one word family here, it's trivially the largest family, and by picking it you have maintain the word list you already had and you would count this as an incorrect answer.

Here is a useful diagram that shows the winnowing process based on user guesses. Here is a text based version of that image showing the winnowing process. And here is another, more detailed, explanation for how to break the current word list up into families based on the current pattern and current guess from former TA Claire.

To implement this strategy, I require you to use a map on the assignment The keys will be the different patterns for each word family.  (You are free to use String, StringBuilder, arrays, or other classes to determine what patterns exist based on the active word list, the previous pattern, and the user's guess.) Those keys shall map to a list of words that have that pattern. 

For each call to makeGuess, find all of the word families based on the current list of active words (not the original dictionary, except for the first guess) and pick the one that is the hardest

The hardest word list / family is defined as the one with the largest number of words with certain tiebreakers. (This family becomes the new list of words for the next round of the game.)  If there is a tie for the family with the most words (two of the word families are of equal size), pick the family that reveals the fewest total characters. If there is still a tie, pick the family that has the pattern that is "smallest" based on the ASCIIbetical (lexicographical order based on ASCII codes) ordering of the patterns. Look here for more info on ASCIIbetical order, lexicographical order, and Alphabetical vs. ASCIIbetical order. (The last link is to a very interesting blog post about Alphabetical vs. ASCIIbetical order and how most built in String sorts are ASCIIbetical, but when displaying info to real users, like a list of files, true alphabetical is preferred.)

Summary of hardest word list / family:

  1. Family with the maximum elements

  2. If there are two or more families with the maximum number of elements break the tie by picking the one that reveals the fewest characters

  3. If there are two or more families with the maximum number of elements and reveal the same number of characters  break the tie by picking the pattern that is "smallest" based on the lexicographical ordering of Strings.

    Toy example of lexicographical ordering.

    String pat1 = "---a";
    String pat2 = "a---";
    System.out.println(pat1.compareTo(pat2)); // prints out a negative integer. "---a" is lexicographically smaller (or before) "a---"
    System.out.println(pat2.compareTo(pat1)); // prints out a positive integer. "a---" is lexicographically larger (or after) "---a"

Your program shall  be reasonably efficient way in terms of time and space.

Difficulty: Worry about this last. In order to moderate the evilness of the program the user is allowed to enter a difficulty level, an Enum from HangmanDifficulty. The strategy described above assumes the difficulty is HangmanDifficulty.HARD, the hardest.

Initially ignore the user's choice for difficulty and implement the algorithm above. When you have that working then alter the makeGuess method to take into account the difficulty.

If the difficulty is HangmanDifficulty.HARD  then the program picks the word list as descried above.  HangmanDifficulty.HARD  always picks the hardest word list / family.

If the difficulty is  HangmanDifficulty.MEDIUM  the program picks the hardest word list / family 3 times in a row, then the second  hardest word list, then the hardest word list / family 3 times in a row, then the second hardest word list / family. This pattern repeats until the game is over. The game should be a little less difficult if the program doesn't always pick the hardest word list / family. If their is only one word list / family and it is time to pick the second hardest , just pick the single word list / family. Then go back to picking the hardest. In other words the behavior for picking word lists at this difficulty level is:

Medium behavior for picking new list of words in makeGuess method:
Guess 1, PICK HARDEST WORD LIST
Guess 2, PICK HARDEST WORD LIST
Guess 3, PICK HARDEST WORD LIST
Guess 4, PICK SECOND HARDEST WORD LIST IF IT EXISTS, OTHERWISE PICK HARDEST
Guess 5, PICK HARDEST WORD LIST
Guess 6, PICK HARDEST WORD LIST
Guess 7, PICK HARDEST WORD LIST
Guess 8, PICK SECOND HARDEST WORD LIST IF IT EXISTS, OTHERWISE PICK HARDEST
and so forth ...

If the difficulty is  HangmanDifficulty.EASY the program alternates between picking the hardest word list and the second hardest word list if it exists:

Easy behavior for picking new list of words in makeGuess method:
Guess 1, PICK HARDEST WORD LIST
Guess 2, PICK SECOND HARDEST WORD LIST IF IT EXISTS, OTHERWISE PICK HARDEST
Guess 3, PICK HARDEST WORD LIST
Guess 4, PICK SECOND HARDEST WORD LIST IF IT EXISTS, OTHERWISE PICK HARDEST
Guess 5, PICK HARDEST WORD LIST
Guess 6,  PICK SECOND HARDEST WORD LIST IF IT EXISTS, OTHERWISE PICK HARDEST
Guess 7, PICK HARDEST WORD LIST
Guess 8, PICK SECOND HARDEST WORD LIST IF IT EXISTS, OTHERWISE PICK HARDEST
and so forth ...

If it is time to pick the second hardest word list, but there is only one word list, then that single word list is the pick. The second hardest does not "roll over" to the next round in this case.

Note, your solution to this problem, not necessarily picking the hardest pattern should scale easily. Meaning, what if we change the requirement that on easy mode we pick the 4th hardest pattern every other round and on medium difficulty we pick the 3rd hardest pattern every fourth round?

Hint: There are many ways to find the hardest word list / family and to make the difficulty portion easier. You may find it useful to create new, nested classes that implement the Comparable or Comparator interfaces. The official Java tutorial section on Comparable and Comparators and another tutorial.


Submission: Fill in the header for HangmanManager.java. Replace <NAME> with your name. Note, you are stating, on your honor, that you did the assignment on your own. Fair warning: this is one of the assignments I typically catch several students who copy solutions from the web or other students.

Create a zip file name a4.zip with your HangmanManager.java  file. (Yes, a zip with one file.) The zip file must not contain any directory structure, just the required file.

See this page for instructions on how to create a zip via Eclipse.

Turn in a4.zip via your Canvas account to programming assignment 4.


Checklist. Did you remember to:

  1. review and follow the general assignment requirements?
  2. work on the assignment by yourself?
  3. fill in the header in your HangmanManager.java?
  4. ensure your program does not suffer a compile error or runtime error?
  5. test your program and ensure it passes the provided tests?
  6. turn in your file named HangmanManager.java in a zip named a4.zip with no internal directory structure?
  7. turn in your zip named a4.zip to Programming Assignment 4 via Canvas no later than 11 pm on Thursday, February 22?

  Back to the CS 314 homepage.