REPORT - Programming Assignment #5 Vincent Dominguez --OVERVIEW-- In this assignment, students were tasked with fully implementing a game of Boggle. I was required, using only the information provided in the assignment handout and the two provided interfaces, to write a program that includes all features and functionality of Boggle. The game is supposed to, through a user interface, take word input and search for the word on a four-by-four grid of letters. There are various rules for constructing words, and the program must take all of these rules into consideration. The word must also be found in the dictionary, which students were required to implement. Finally, when the user is finished entering words, the program must find all remaining valid words on the board using two different search tactics. I identified three different areas for which this program was distributed to emphasize. The first is writing recursive methods. The complexity of searching the board for words is significatly reduced through use of recursion. However, as a concept, recursion is very difficult to master. As a result, this program was most likely assigned primarily to measure students' proficiency with designing and implementing recursive algorithms. Second, The [GameDictionary] class provided a good measure of student's abilities to both use and create data structures. It is likely that this was an intended effect. Finally, the manner in which this assignment was distributed indicates another likely focus area. Students were only provided with a short handout and two interfaces, yet were asked to create the largest program thus far in the course. This suggests a great focus on abstraction. Students' components of the assignment ([GameManager], [GameDictionary], and [Boggle] user interface) should be interchangeable with other correct solutions without losing any functionality, the primary purpose of abstraction. My major goal in completing this assignment was to implement recursive algorithms that were not only correct but also clear. It is often the case that recursion can be accomplished correctly while the implementor does not understand how his/her code works. I constantly went through my code in order to understand what it does. A good measure of my understanding was my ability to write clear comments on the recursive methods that I wrote. A secondary goal of mine was to design and implement a sufficient user interface. I have little experience with user interfaces, so I viewed this assignment as an opportunity to learn about them. I wanted mine to be clear, (somewhat) elegant, and robust. --SOLUTION DESIGN-- Because both of the search tactics for finding words on the board require a functioning dictionary, I began with the [GameDictionary] class. I methodically completed it method by method, skipping around only when another method was required to test a previous one. Upon its completion, I tested it and fixed bugs until I was sure it was correct (further detail can be found in the testing section). Next I worked on the [GameManager] class. To complete this class, I first wrote the several shorter methods that could be completed without dependence on other methods in the class. This meant I saved the most difficult methods, [addWord] and [getAllWords] for last. After further testing and debugging (again, more details found in the testing section), I created the user interface class [Boggle]. This user interface class was fairly simple to implement, as for the most part it uses methods already defined by the [GameManager] and [GameDictionary] classes. The [loadDictionary] method of the [GameDictionary] class in effect initializes the dictionary into a functioning state. An important assumption that I made is that an invalid input file renders the whole instance of the [GameDictionary] useless, so invalid input causes the program to terminate following an appropriate error message. To read the words from the input file, I utilized a [BufferedReader] object, as this object provides a convenient [readLine] method. Because the [BoggleDictionary] interface does not specify that the dictionary should be loaded upon its construction, I added a boolean to the [GameDictionary] class indicating whether it had been loaded yet. Thus, simply checking the status of this boolean variable at the beginning of the other methods in the class prevents any issues that may arise from trying to operate on an uninitialized dictionary. To actually create a dictionary data structure, I created two separate classes in the [GameDictionary.java]. One is a tree class, [DictTree], and the other is a class for the nodes of the tree, [DictTreeNode]. The letters of each word in the dictionary are stored in individual nodes on a [DictTree] object. If the letters on the path from the root of a tree to a particular node form a word in the dictionary, the node's corresponding boolean value is set to true. This way, words that share prefixes also share nodes in the tree. To insert a word on the tree, the correct path is followed until it no longer exists. At this point, a new path is created that branches off of the existing path until the end of the new word is reached. To search for a word or prefix, the path is simply followed (if it exists) to the end of the query. If searching for a word, the last node in the searched path must indicate that it forms a word. These operations will always take O(L) time, where L is the length of the given word. Because only the corresponding path of the letters on the tree is checked (or created), at most L nodes are checked (or created). This is very efficient because this means the running time of inserting or searching for a word in the dictionary is not dependent on the number of words in the dictionary, only on the length of the word. To iterate through the tree, I created a separate [DTIterator] class, also in the [GameDictionary.java] file. A [DTIterator] objects stores the current node position, as well as the current word in the dictionary. To find the next word, the iterator simply traverses down the path of first children (if such a path exists). This is correct because, since words are inserted on the tree in sorted order, each node's children are sorted in its [children] [ArrayList], as 'a' is always added before 'b', 'b' before 'c', etc. If the current node has no more children, the iterator must traverse backwards up the path through parents until a node is found to have a sibling after the current node (the order of nodes is stored in each node's childNumber [int]). Iterating through the dictionary is also not dependent on the number of words it contains. Rather, it is often executed in constant time, since the next word is located a few nodes down the current path. When traversal must be reversed, iteration is still efficient because if is at worst O(T), where T is the length of the longest word in the dictionary (If the longest word is entirely on its own tree, it will be traversed in reverse until the forest is reached. Then a path of equal or less length must be traversed down the next tree to find the next word). The [GameManager] class was slightly more complicated. The letters on the board are stored in a two-dimensional array which represents the desired grid structure. The letters of the cubes are stored in a one-dimensional, as are the [BogglePlayer] objects representing each player (see below). Similarly to the dictionary, I used a [BufferedReader] object to read the cube input file. Also similarly to the dictionary implementation is the assumption I made about invalid input from the cube input file. The program cannot function without correct cubes, so if the cube input file is found to be invalid, a detailed error message is displayed and the program is terminated. Otherwise, a single letter from each cube is randomly selected and placed on a spot on the board to start a new game. An extremely important design decision I made is the creation of three separate classes inside of the [GameManager.java] file. The first of these is the [BogglePlayer] class. A player in Boggle is defined by his/her score AND his/her list of the words found on the board. Thus, a simple [array] of scores would not be sufficient to represent the players. I designed a [BogglePlayer] class with accessors and mutators to represent each player in the current game. This made it easier to retrieve all the players' scores. Also, the fact that each player's used words were stored in a [HashSet] made it significatly simpler to compare newly proposed words to those that have already been used. The other two classes I defined in the [GameManager.java] file involve the searching of the board for words which are discussed in the searching algorithms design section. Upon completion of the searching methods, I was ready to move on to the user interface. To create my [Boggle] user interface class, I first planned out on paper the procedure of the game. To do this, I created a sort of flowchart that indicated which operations should be performed at each juncture. Then, I simply wrote a [run] method that reproduces this flowchart in Java syntax (method calls following other method calls in [while] loops, etc.). As stated previously, the user interface relies on the methods defined in the [BoggleGame] interface, so its implementation was not overly complex. Input is taken for the number of players for each game, identified words, and whether the user wants to continue playing or not. To display the board, the letters are printed in a grid. If a word has been added, the coordinates for the letters of this word are obtained and the appropriate letters are set to upper case. After each added word, the board is redisplayed, along with the scores of each player. When all players are done, the computer searches for the remaining words using the [BoggleGame]'s [getAllWords] method and these are displayed along with the computer's score. This process is repeated until the user indicates the desire to end the program. --SEARCHING ALGORITHM DESIGN AND ANALYSIS-- In the program, there are three features which require the searching of the board for words. I implemented these features using two different recursive algorithms (two of the features can share the same algorithm). The first of these features is taking word input from the user and checking the board to see if the word can be formed while adhering to the given rules of Boggle. I implemented this feature through the [traceWord], [findLetter], and [findLetterHelper] methods. To begin, the driver method of this recursive algorithm, [traceWord], is called. It finds all instances of the first letter on the board, if any exist. It does not find only one instance because one may not lead to the formation of the input word while another may. For each instance found, the location of the letter (stored in a [Point] object in an [ArrayList]), the length of the original input word, and the substring of remaining letters to be found are passed to the main recursive method, [findLetter]. This method checks all adjacent letters on the board in an attempt to find the next letter in the sequence. If it is found (and has not already been used), the method calls itself (through the [findLetterHelper] method) with the update list of points and the remaining letters. In the [findLetter] method, a [HashSet] of [ArrayList]s is created to store all found routes starting from the first letter in the input word to the last found letter of the input word. After checking all adjacent letters, the route with the same number of [Points] as the number of letters in the original word is returned, if it exists. This is one of the possible routes that can be traced to find the word on the board. Since only one route's existence needs to be confirmed to prove the word can be formed from the board, any correct route will suffice. I decided to return an [ArrayList] of [Point]s so that I may reuse this method in my implementation of the [getLastAddedWord] method. This method simply traces the route of the last added word on the board and converts the route (from an [ArrayList] of [Point]s) to a two-dimensional array of [int]s as specified. The second feature is a dictionary-based search of all valid words on the board. To implement this in the [dictionarySearch] method, I simply iterated through all of the words in the dictionary and called the [traceWord] method on each of them. If [traceWord] returns [null], the word cannot be found on the board, otherwise it can. To speed this search up, I observed that not all words in the dictionary will be the required length (in this case 4 letters). Those that were not long enough were ignored; [traceWord] was not called on any of them. The same was done for words that had already been used by any of the players. Each word is checked against each player's [HashSet] of words to make sure it had not already been used. If it had, it is ignored. The use of [HashSet]s for the player's used words allows each dictionary word to be compared against the set of used words in constant time. The last feature is a board-based search of all valid words on the board. Implementation of this feature was significantly more complicated than the dictionary-based search. I decided to split the implementation into two parts: creation of a tree and traversal of the tree. To do this, I created separate [BoggleTree] and [BoggleTreeNode] classes inside of the [GameManager.java] file. A [BoggleTree]'s root is a letter on the board. Its children are letters that, when combined with the root, form a prefix that can be found in the dictionary. To create the tree, I implemented private [boardSearch] and [makeTree] (and [makeTreeHelper]) methods. For each spot on the board, a new [BoggleTree] is constructed. The prefix (which at this point is only one letter) and the location of the letter (again in a [Point] object) is sent to the [makeTree] method. This method checks all adjacent letters and, if when combined with the previous prefix forms another valid prefix, is added as a child of the root. [makeTree] then calls itself (through the [makeTreeHelper] method) sending the newly formed child as the root and the updated prefix and list of [Point]s. Once the tree is constructed, it is traversed to find all words that can be formed. To make traversal easier, I implemented [BoggleTreeNode]s with a first child/next sibling structure. Each root's children can be accessed through a [while] loop. When there are no more children, the last child's [getSibling] method will return [null] this is an easy way to traverse all nodes while at the same time making the [while] loop finite. To traverse the tree and find all valid words, [makeWords] is called, which calls [listWords] on each of the root's children. [listWords] makes a prefix with the current node's character value. If the prefix is found in the dictionary, [listWords] is called recursively on all of the current node's children. If the prefix is also a word in the dictionary, it is added to a [HashSet] of words. I chose the [HashSet] data structure because it contains only unique words, so adding a word to the set that has already been used in the set is not allowed. Each tree's set of words is added to the board's set of words, which is finally converted to an [array] and returned. The comparative efficiency of the two search tactics depends on the size of the dictionary used. Because the dictionary-based search iterates through literally every single word in the dictionary, its simplicity can be offset by its inefficiency when iterating through an enormous list of words. Also, because the words are checked for validity before checking if they are on the board, a somewhat large minimum word length could result in many of the words in the dictionary being ignored completely. On the other hand, the board-based search depends less on the size of the dictionary and more on the size of the grid. In this version of Boggle, which uses a four-by-four grid, the number of paths that can be formed is logarithmically smaller than the number of paths that might be formed in a grid of any greater size (A larger grid means more letters which means more trees). Therefore, the dictionary-based search is the better choice when using a small dictionary or a large grid or when the rules stipulate longer minimum word lengths. The board-based search is best for very large dictionaries and small grids. --LIMITATIONS-- The first big problem I had involved searching the [DictTree]. Both my [contains] methods returned [true] for every query, regardless of whether it was actually a valid word. Upon searching my code, I found that at the beginning of [findSeQ], the method which actually performs the search, the length of the remaining sequence of letters is checked. If it was zero, my method returned true, as I assumed that if the search had reached this point, the path existed and the query existed in the dictionary. This was correct when searching for a prefix, but for words, the last node in the traversed path must indicate that it forms a word. I separated the cases (one for searching for words and another for searching for prefixes) and returned true for a word search only if the last node in the query's [formsWord] indicates that it is the last node in a word in the dictionary. A major issue I encountered was in my algorithm for searching the board for a given word. Only words that followed certain paths on the board (such as completely diagonal) were indicated as found, even though flags I set indicated that the program had actually found the correct paths. Unfortunately, those correct paths were being thrown away and my program produced incorrect paths. To solve this problem, I realized the following: because my algorithm involves returning an [ArrayList] of [Points] that show where the letters are on the board, only [ArrayList]s whose sizes are the same as the number of letters in the given word contain correct paths. To make certain that I was returning the path with the correct size, I created the [routes] [HashSet] in the [findLetter] method. All possible routes are added to this [HashSet], and only a route that has the correct number of [Point]s is returned. After I included this test, my [addWord] method behaved correctly. Another problem I faced involved the [getAllWords] method, specifically when using the board-based tactic. Upon completion of my user interface, I noticed that, occasionally, in the list of words that the computer found, two or three words would be repeated in the list. It is logical that the program found more than one possible route to form these words, but the game is supposed to only take one of these routes into account when generating a list of all words. I realized that I had been using [HashSet]s to generate unique words from each tree (i.e. from each spot on the board), then adding them to an [ArrayList] of all words. The [ArrayList] does allow duplicate values, so I changed my implementation to add elements from [HashSet]s to a larger [HashSet]. The property of sets that disallows duplicate values means that duplicates will not be added to the set. This eliminated the problem. --TESTING-- To test my solution, I employed a variety of strategies, including developing my own (albeit simple) test harness, utilizing the [setGame] method, and controlling input. I recognized that there would be no way to test the correctness of my word search algorithms without a correctly functioning dictionary, so that is what I tested first. To do this, I wrote my own [BoggleTest] file that executed the functions of the dictionary. Using first words from a created [test_words.txt] file, then words from the provided [words.txt], I looked up countless different words that were both in the dictionary and not. I also experimented with different unique cases, such as blank lines in an input file, duplicate words, and one-letter words. I did the same with prefixes. To test the iterating methods, I wrote code that iterated through the entire dictionary, then compared it to the file [words.txt]. I also performed a test that iterated through only part of the dictionary, reset the iterator, then kept iterating. Once I had confirmed the correctness of my dictionary implementation, I wrote the [GameManager] class and tested it with the same [BoggleTest] harness. Before I included the parts of the word search algorithms that check ALL adjacent letters, I wrote code that checks only adjacent letters on one direction. I combined this with use of the [setGame] method to control what words should be found on the board and the exact path by which they could be found. For example, in one test, I searched for the (made up) word 'osty' in a board which I defined to have the letters in 'osty' moving from the bottom right to the top left. All other letters were 'x's. This meant that there was only one word to be found and only one place it could be found on the board. When I made sure my program passed this test, I added the parts that check all adjacent letters, then finally tested the method with a fully randomized board. To test the board- and dictionary-based word search algorithms, I used a similar process. However, when I confirmed that they both seemed to work, I realized that to actually confirm their correctness, I had to run another test. Although the word searches could produce lists of words that looked correct, I had not yet confirmed the completeness of the lists. To do this, I let the computer find all words on a random board using the board-based tactic. I kept the window open and ran the game again, using the [setGame] method to use the same board. This time, however, I let the computer find all words using the dictionary-based tactic. Upon comparison, the lists turned out to be exactly the same. After several more instances of running this test, I concluded that both methods were correct. Finally, to test the user interface for completeness and robustness, I distributed the game to several of my friends and family. I reasoned that I may be making assumptions that others who were detached from the assignment would not, so they might catch errors that I was blind to. I gathered all feedback and used it to fine-tune my interface. After fixing the minor errors caught by others, I concluded that my interface was complete and my Boggle game was correct.