CS 314 - Specification 8 - Anagrams

Programming Assignment 8: Individual Assignment. You must complete this assignment on your own. You may not acquire from any source (e.g.  another student or an internet site) a partial or complete solution to a problem or project that has been assigned. You may not show another student your solution to an assignment. You may not have another person (current student, former student, tutor, friend, anyone) “walk you through” how to solve an assignment. You may get help from the instructional staff. You may discuss general ideas and approaches with other students but you may not develop code together. Review the class policy on collaboration from the syllabus.

The purposes of this assignment are:

  1. to use various data structures
  2. to implement a program that uses multiple classes
  3. to implement a recursive backtracking algorithm

Thanks to Stuart Reges for sharing this assignment with me.

In this assignment you will implement several classes that allow a user to find anagrams of words and phrases they type in.


Summary: An anagram is formed by taking all of the letters in one word or phrase and scrambling them to form a new word or phrase. All of the letters from the original word / phrase must be used and no letters can be added. The new word or words must be valid based on some dictionary. For example, an anagram of "Isabelle Scott" is "stile obstacle".

There are many online anagram solvers. One of the better ones can be found at http://wordsmith.org/anagram/index.html. PBS has a video with examples of people obsessed with anagrams.

See the guide below for suggestions on how to complete the assignment.

Files:

Source Code AnagramMain.java The driver program for the completed anagram solver. Provided by me. Do not alter except for printing out time results if you would like.
Source Code LetterInventory.java A class that represents a collection of letters from the English alphabet. Provided by you.
Source Code AnagramFinderTester.java. A class with tests for the LetterInventory and your AnagramSolver classes. Add at least 2 tests per public method and the constructor in LetterInventory.java. (Delete the provided tests after you are sure your code passes them.)

The tester requires the d3.txt file for the dictionary and this file with the expected results.

Here is the expected output of the tester without the anagrams shown and with the anagrams shown. Your times will vary. Any of my runs that complete in less than 0.2 seconds should take you no more than 1 second. Any of my runs that take 0.2 seconds or more should take you no more than 5 times my time.

Provided by me.
Source Code AnagramSolver.java. A class that solves and returns a list of anagrams for a given phrase and maximum number of words. You must add the standard header with your information and the academic honesty statement. Failure to do so will cause you to lose points on the assignment. Provided by you.
Source Code Stopwatch.java. A class for calculating elapsed time when running other code. You may use this to see how long it takes to find anagrams and compare results on Piazza. Provided by me.
Dictionary Files d1.txt  A small dictionary file with 56 words.
d2.txt  A medium size dictionary file with 3927 words.
d3.txt  A large dictionary file with 19911 words.

Note, when we test your program, we will use other dictionaries that contain one and two letter words. Do not assume words in the dictionary will have a length of 3 or greater.

Provided by me.
Sample Output Sample run of anagram solver. Other than the timing results, your program must match this output given the same input. I have trimmed many of the anagrams for clarity, but when you run AnagramMain it prints out all the results. The timeouts for correctness will be roughly 5 times the given times. The output also shows the timing data for the sample solution and the tests in AnagramFinderTester.

Here is a run of the d3.txt tests without the output trimmed.

Provided by me.
Submission Turn in these three files: AnagramFinderTester.java, AnagramSolver.java, LetterInventory.java. No jar, three separate files. Provided by you.

Checklist: Did you remember to:

  1. review and follow the general assignment requirements?
  2. follow the specific requirements and restrictions for this assignment?
  3. work on the assignment by yourself?
  4. fill in the header in AnagramFinderTester and copy it to LetterInventory.java and AnagramSolver.java?
  5. ensure your LetterInventory and AnagramSolver classes pass the tests in AnagramFinderTester? Add at least 2 tests per public method and constructor for LetterInventory?
  6. complete the AnagramSolver and implement a recursive backtracking algorithm with the required efficiency improvements?
  7. ensure your program matches the sample output file when run?
  8. turn in your Java source code files LetterInventory.java, AnagramSolver.java, and AnagramFinderTester.java by 11 pm on Thursday, July 24?

Specification:

The LetterInventory class: Implement a class to model a letter inventory. A letter inventory object stores the number of times each English letter, 'a' through 'z', occurs in a word or phrase. So, for example, the letter inventory of the word "tall" is

1 a, 2 l's, and 1 t

The letter inventory for "Isabelle M. Scott!!" is

1 a, 1 b, 1 c, 2 e's, 1 i, 2 l's, 1 m, 1 o, 2 s's, and 2 t's

Notice the letter inventory ignores characters that are not English letters. Also notice that the letter inventory is case insensitive. "Isabelle M. Scott!!" has 2 s's.

Implement the LetterInventory class. You must use an array of ints to store the number of times each English letter occurs. The inventory is case insensitive so the length of the array will equal the length of the alphabet being used, in this case 26. Use a class constant for the value 26. The class keeps track of the total number of letters in the inventory so this value can be returned quickly without adding up all the individual counters.
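To make the storage scheme above concrete, here is a minimal sketch of the counting idea only. The class name LetterCountsSketch and its methods count and size are hypothetical names chosen for illustration; they are not the required LetterInventory API, and the required constructors and methods are not shown here.

```java
// Hypothetical sketch: names are illustrative, not the required LetterInventory API.
final class LetterCountsSketch {
    // Class constant for the size of the alphabet.
    private static final int ALPHABET_SIZE = 26;

    // counts[0] holds the number of a's, counts[25] the number of z's.
    private final int[] counts = new int[ALPHABET_SIZE];
    // Total number of letters, stored so it never has to be recomputed.
    private final int total;

    LetterCountsSketch(String text) {
        int running = 0;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            // Fold upper case to lower case so the counts are case insensitive.
            if (ch >= 'A' && ch <= 'Z') {
                ch = (char) (ch - 'A' + 'a');
            }
            // Ignore anything that is not an English letter.
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++;
                running++;
            }
        }
        total = running;
    }

    // Number of times the given letter occurs, ignoring case.
    int count(char letter) {
        char ch = letter;
        if (ch >= 'A' && ch <= 'Z') {
            ch = (char) (ch - 'A' + 'a');
        }
        if (ch < 'a' || ch > 'z') {
            throw new IllegalArgumentException("not an English letter: " + letter);
        }
        return counts[ch - 'a'];
    }

    // Total number of letters, returned without a loop.
    int size() {
        return total;
    }
}
```

With this sketch, new LetterCountsSketch("Isabelle M. Scott!!") reports a size of 14 and a count of 2 for 's', which agrees with the inventory listed above.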

Provide the following constructors and methods:

You may add other private methods and constructors if you wish. Note, LetterInventory objects are immutable. There are no methods that alter a given LetterInventory object after it is created.

When you implement the anagram solver you will not use all of these methods, but you are required to complete the class. Your completed LetterInventory class must pass all of the tests in AnagramFinderTester. You must add at least 2 tests per constructor and public method in LetterInventory to the AnagramFinderTester class.


The Anagram Solver class: Now that you have a LetterInventory class to handle the low level details, implement a class to find all possible anagrams for a given phrase. (Realize that the legal anagrams for a phrase will vary based on what dictionary is used, the maximum number of words allowed in the anagram, and whether words can be repeated in the anagram or not.) For example the anagrams of "Isabelle Scott" using the dictionary in file d3.txt and a limit of no more than 2 words per anagram are:

best, oscillate
bestial, closet
bleat, solstice
blest, societal
closet, stabile
obstacle, stile
solstice, table

Note, there are two special cases you do not have to worry about.

Provide the following for the AnagramSolver class:

You can and should add private helper methods as necessary to break the problem up into manageable parts. By way of comparison my AnagramSolver class, including a nested class for a Comparator, is a total of 128 lines including comments, blank lines and lines with a single brace.

Finding the anagrams of a given String is a classic recursive backtracking problem similar to the N queens problem, the phone mnemonic problem, and the fair teams problem. 

You shall complete some preprocessing each time your anagram finding method is called. In most cases, many of the words in the dictionary cannot be part of an anagram for the given word or phrase. For example, the word "queen" will not be in any anagram of "Isabelle Scott" because there is no 'q' in "Isabelle Scott". Likewise the word "casa" cannot be part of an anagram of "Isabelle Scott" because there are two 'a's in "casa" but only one 'a' in "Isabelle Scott". You will, of course, use the LetterInventory class to simplify the task of determining if a given word could possibly be part of an anagram. (The subtract method is VERY useful.)
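The filtering idea can be sketched without the LetterInventory class at all. The version below is an illustration under stated assumptions: it uses plain int arrays in place of LetterInventory objects, and the names DictionaryFilterSketch, countLetters, fitsWithin, and candidates are hypothetical. In your own solution the LetterInventory class would do this work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: plain int[] counts stand in for LetterInventory objects.
final class DictionaryFilterSketch {
    private static final int ALPHABET_SIZE = 26;

    // Count the English letters in text, ignoring case and all other characters.
    static int[] countLetters(String text) {
        int[] counts = new int[ALPHABET_SIZE];
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch >= 'A' && ch <= 'Z') {
                ch = (char) (ch - 'A' + 'a');
            }
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++;
            }
        }
        return counts;
    }

    // True if every letter of the word is available at least as often in the phrase.
    static boolean fitsWithin(int[] word, int[] phrase) {
        for (int i = 0; i < ALPHABET_SIZE; i++) {
            if (word[i] > phrase[i]) {
                return false;
            }
        }
        return true;
    }

    // Build a new, smaller list of words; the dictionary passed in is not modified.
    static List<String> candidates(List<String> dictionary, String phrase) {
        int[] phraseCounts = countLetters(phrase);
        List<String> result = new ArrayList<String>();
        for (String word : dictionary) {
            if (fitsWithin(countLetters(word), phraseCounts)) {
                result.add(word);
            }
        }
        return result;
    }
}
```

Given the words "queen", "casa", "stile", and "obstacle" and the phrase "Isabelle Scott", this sketch keeps only "stile" and "obstacle", for the reasons described above.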

Creating this smaller set of words from the dictionary and their corresponding LetterInventory objects shall not alter the AnagramSolver's main dictionary. Once this smaller set of words is created you are ready to carry out the recursive backtracking method to find all anagrams of the given String. I recommend using the smaller list of words as your "choices" for the recursive step.
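The choose, recurse, and un-choose pattern of the recursive step might look roughly like the sketch below. It is an illustration, not the sample solution. It makes several assumptions that you should check against the actual requirements: maxWords is treated as a positive limit, a word may be reused within an anagram, each combination of words is generated once by never looking back before the current word's position in the list, and results are returned unsorted. All names are hypothetical, and plain int arrays stand in for LetterInventory objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and signatures are illustrative, not the required API.
final class BacktrackingSketch {
    private static final int ALPHABET_SIZE = 26;

    // Count the English letters in text, ignoring case and all other characters.
    static int[] countLetters(String text) {
        int[] counts = new int[ALPHABET_SIZE];
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch >= 'A' && ch <= 'Z') {
                ch = (char) (ch - 'A' + 'a');
            }
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++;
            }
        }
        return counts;
    }

    // Assumes maxWords > 0 and that candidates has already been filtered.
    static List<List<String>> findAnagrams(String phrase, List<String> candidates, int maxWords) {
        List<List<String>> results = new ArrayList<List<String>>();
        int[] remaining = countLetters(phrase);
        search(candidates, 0, remaining, sum(remaining), maxWords,
                new ArrayList<String>(), results);
        return results;
    }

    private static void search(List<String> candidates, int start, int[] remaining,
            int lettersLeft, int wordsLeft, List<String> chosen, List<List<String>> results) {
        if (lettersLeft == 0) {
            // Base case: every letter is used, so the chosen words form an anagram.
            if (!chosen.isEmpty()) {
                results.add(new ArrayList<String>(chosen));
            }
            return;
        }
        if (wordsLeft == 0) {
            return; // letters remain but no more words are allowed: dead end
        }
        for (int i = start; i < candidates.size(); i++) {
            String word = candidates.get(i);
            int[] wordCounts = countLetters(word);
            int wordLetters = sum(wordCounts);
            if (wordLetters == 0 || !fits(wordCounts, remaining)) {
                continue;
            }
            adjust(remaining, wordCounts, -1);   // choose
            chosen.add(word);
            search(candidates, i, remaining, lettersLeft - wordLetters,
                    wordsLeft - 1, chosen, results);
            chosen.remove(chosen.size() - 1);    // un-choose
            adjust(remaining, wordCounts, 1);
        }
    }

    private static boolean fits(int[] word, int[] remaining) {
        for (int i = 0; i < ALPHABET_SIZE; i++) {
            if (word[i] > remaining[i]) {
                return false;
            }
        }
        return true;
    }

    private static int sum(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        return total;
    }

    // Adds (sign = 1) or removes (sign = -1) a word's letters from remaining.
    private static void adjust(int[] remaining, int[] word, int sign) {
        for (int i = 0; i < ALPHABET_SIZE; i++) {
            remaining[i] += sign * word[i];
        }
    }
}
```

The sketch recounts each word's letters on every call to keep the code short. Storing the counts once per candidate word, as the preprocessing step suggests, avoids that repeated work.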

Note that the returned List<List<String>> must have the following properties:

Accomplishing all that sorting can be a chore. You might try to do it as you go or simply wait until you have found all the anagrams and then process them. The second option may be a cleaner solution, because it separates the recursive backtracking from the sorting. You are free to use methods from the Collections class, as well as other data structures, to make this easier. (Note that this is the Collections class, with an s, not the Collection interface. The Collections class contains a number of static methods that perform common tasks on various kinds of Collection objects.)

Initially there is no way to sort a list of anagrams. Lists and ArrayLists are not Comparable. One approach would be to create a class that implements the Comparator interface. The Comparator interface allows you to define how two objects should be compared if they are not Comparable. You will need a separate class, but don't put it in a separate file. Instead, you can make your class that implements the Comparator interface a nested class, similar to the iterators you have implemented, except this class will be declared static because it doesn't need to know about the outer class.

import java.util.Comparator;
import java.util.List;

public class AnagramSolver {
    // code for AnagramSolver

    // example of a nested class
    private static class AnagramComparator implements Comparator<List<String>> {
        @Override
        public int compare(List<String> a1, List<String> a2) {
            // code for compare: must return a negative int, 0, or a positive int
        }
    } // end of AnagramComparator

} // end of AnagramSolver class

The compare method for the Comparator interface is similar to the compareTo method for Comparable. If a1 is "less than" a2 return a negative int. If a1 "equals" a2 return 0. If a1 is "greater than" a2 return a positive int.

When you create a class that implements the Comparator interface correctly you can call the sort method in the Collections class that takes in a list and a comparator to sort the List of Lists of Strings.
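As an illustration of how a nested Comparator plugs into Collections.sort, here is a small sketch. The ordering it uses (fewer words first, then word by word using String's compareTo) is an assumption made purely for the example and is not necessarily the ordering this assignment requires. The names ComparatorSketch, BySizeThenWords, and sortAnagrams are hypothetical.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: the ordering shown is illustrative, not the required one.
final class ComparatorSketch {

    // Static nested class: it does not need access to the outer class.
    private static class BySizeThenWords implements Comparator<List<String>> {
        @Override
        public int compare(List<String> a1, List<String> a2) {
            // Assumed rule 1: the list with fewer words comes first.
            if (a1.size() != a2.size()) {
                return Integer.compare(a1.size(), a2.size());
            }
            // Assumed rule 2: same size, so the first differing word decides.
            for (int i = 0; i < a1.size(); i++) {
                int diff = a1.get(i).compareTo(a2.get(i));
                if (diff != 0) {
                    return diff;
                }
            }
            return 0; // the two lists hold the same words in the same order
        }
    }

    // Sorts the given list of anagrams in place.
    static void sortAnagrams(List<List<String>> anagrams) {
        Collections.sort(anagrams, new BySizeThenWords());
    }
}
```

Collections.sort changes the list it is given, so the list of anagrams must be modifiable, for example an ArrayList.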

