CS 388 Natural Language Processing Homework 3 FAQ

  1. What does "-mx1500m" mean?

    It sets the maximum heap size of the Java VM to 1500 MB (about 1.5 GB); -Xmx1500m is the more common spelling of the same option. Lower the value if your machine has less memory, or raise it if your VM runs out of memory.
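
    If you are unsure whether the flag took effect, a small check like the following can help. It is only a sketch: the class name HeapCheck is made up for illustration, and the value the JVM reports may differ slightly from the number you passed.

```java
// Hypothetical helper (not part of the assignment or the Stanford parser):
// prints the maximum heap size the running JVM will try to use.
class HeapCheck {

    /** Returns the JVM's maximum heap size in megabytes, as reported by the runtime. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

    Run it with the same flag you plan to use for the parser, for example "java -mx1500m HeapCheck", and compare the printed figure with the one you asked for.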

  2. How can I add Trees to a Treebank?

    The Treebank class doesn't permit this. You can use MemoryTreebank, which allows modifications.

  3. What do you mean by "develop a simple command line interface to the LexicalizedParser class"?

    Your code should compile against LexicalizedParser and create an instance of it directly, rather than making a system call that launches a separate java interpreter. Your class should have a main method that takes its arguments from the command line. Please remember to list the actual commands you ran in your README file.
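
    One possible shape for such a class is sketched below. Everything here is an assumption made for illustration: the class name ParserCli, the two-argument layout (a training path and a test path), and the usage message are not prescribed by the assignment. The calls into the parser itself are left as a comment, since the exact constructors and methods should be taken from the LexicalizedParser JavaDoc.

```java
// Hypothetical command-line front end. The class name, the argument order,
// and the usage text are illustrative assumptions, not assignment requirements.
class ParserCli {

    /** Holds the two paths this sketch expects on the command line. */
    static final class Options {
        final String trainPath;
        final String testPath;

        Options(String trainPath, String testPath) {
            this.trainPath = trainPath;
            this.testPath = testPath;
        }
    }

    /** Validates the raw arguments and packages them into an Options object. */
    static Options parseArgs(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException(
                "usage: java -mx1500m ParserCli <trainPath> <testPath>");
        }
        return new Options(args[0], args[1]);
    }

    public static void main(String[] args) {
        Options opts;
        try {
            opts = parseArgs(args);
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            System.exit(1);
            return;
        }

        // The real program would build Treebanks from these paths and then train
        // and evaluate a LexicalizedParser inside this same JVM, instead of
        // starting a second "java" process with Runtime.exec or ProcessBuilder.
        System.out.println("train=" + opts.trainPath + " test=" + opts.testPath);
    }
}
```

    Keeping the argument checks in their own method makes it easy to record the exact command line in your README and to extend the options later without touching the parsing logic.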

  4. Do I need to use FileFilter in my code?

    Probably not. You can construct your Treebanks directly from a file, adapting the makeTreebank() code as needed when you copy it into your class.

  5. The DemoParser reads in a serialized parser (englishPCFG.ser.gz). What do I do with that?

    Nothing. Get rid of it. You're training your own parsers in this assignment. If necessary, you can make your own serialized parser to save some training time.

CS 388 Natural Language Processing Homework 3 Tips

For Homework 3, you will likely need to make very few, if any, modifications to the Stanford parser. Instead, you will create a separate set of classes that interact with the parser, primarily through the LexicalizedParser class. Begin by reading the JavaDoc for that class and the package overview JavaDoc for edu.stanford.nlp.parser.lexparser. Both are included in the Stanford parser distribution under the javadoc/ directory.

Here are some tips to help you get started: