Setting up the Java Environment for CS 371R:
Intelligent Information Retrieval and Web Search



[A] Running the code from the existing Linux installation on the department fileserver:

1. Setting up the classpath for Java

On tcsh or csh shells in Unix: setenv CLASSPATH '.:/u/mooney/ir-code'
On bash shell in Unix: export CLASSPATH='.:/u/mooney/ir-code'

Instead of typing these commands every time you run the code, you can add them to your .cshrc file (for tcsh or csh), or to your .bashrc and .profile files (for bash).
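
To confirm that the classpath setting took effect, a small check like the following sketch may help. It is not part of the course code; the class name ClasspathCheck and its helper method are made up for illustration. It prints each entry the JVM sees on its classpath and whether that entry exists on disk.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the ir code: lists the classpath entries seen by the JVM.
class ClasspathCheck {

    // Split a classpath string on the given separator, skipping empty segments.
    static List<String> entries(String classpath, String separator) {
        List<String> result = new ArrayList<>();
        for (String part : classpath.split(Pattern.quote(separator))) {
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the running JVM was started with.
        String classpath = System.getProperty("java.class.path", "");
        for (String entry : entries(classpath, File.pathSeparator)) {
            boolean exists = new File(entry).exists();
            System.out.println(entry + (exists ? "  [found]" : "  [missing]"));
        }
    }
}
```

Save this as ClasspathCheck.java in the current directory, compile it with javac ClasspathCheck.java, and run it with java ClasspathCheck. If the setting above is active, the list should include /u/mooney/ir-code.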

2. Running the code

At the command prompt, type:
java ir.vsr.InvertedIndex -html /u/mooney/ir-code/corpora/yahoo-science/

Follow the trace at www.cs.utexas.edu/users/mooney/ir-course/sample-trace.txt for a list of possible commands to try out. Open a Firefox browser before you run the code in order to have selected documents displayed in the browser.

[B] Making your own copy of the code and running your own Linux installation on the department fileserver (necessary for projects):

1. Copy the ir sub-directory from the ir-code directory into your HOME directory

At the command prompt, type: cp -r /u/mooney/ir-code/ir $HOME

2. Setting up the classpath for Java

On tcsh or csh shells in Unix: setenv CLASSPATH '.:/u/[your-login-name]'
On bash shell in Unix: export CLASSPATH='.:/u/[your-login-name]'

where [your-login-name] is your Unix login name.

Instead of typing these commands every time you run the code, you can add them to your .cshrc file (for tcsh or csh), or to your .bashrc and .profile files (for bash).

3. Running the code

At the command prompt, type: java ir.vsr.InvertedIndex -html /u/mooney/ir-code/corpora/yahoo-science/

Follow the trace at www.cs.utexas.edu/users/mooney/ir-course/sample-trace.txt for a list of possible commands to try out. Open a Firefox browser before you run the code in order to have selected documents displayed in the browser.

4. Recompiling after modifying code in the ir directory

If you modify the file ABC.java, you can recompile it using the command: javac ABC.java

[C] Running the code under Windows

We do not directly support running the code under Windows, but if you wish to do so, a former student found that the following three changes allowed him to run it under Windows. It is up to you to determine any additional changes that are needed to get it to run in your own Windows environment.

1. Location of stopwords.txt

In Document.java in the ir.vsr package, you need to set the location of the stopwords.txt file. Download the file, store it locally, and change the declaration to something like the following:
protected static final String stopWordsFile = "C:/cs371r/ir/utilities/stopwords.txt";
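
If you would rather not hard-code the path, one possible alternative is to read it from a Java system property and fall back to a default. The sketch below is only an illustration of that idea, not how the course code works: the class name StopwordsLocation and the property name ir.stopwords are made up, and the default path is the example path shown above.

```java
import java.io.File;

// Hypothetical helper, not part of the ir code: resolves the stopwords file location.
class StopwordsLocation {

    // Assumed property name, chosen for this example only.
    static final String PROPERTY = "ir.stopwords";

    // Example Windows location taken from the text above.
    static final String DEFAULT_PATH = "C:/cs371r/ir/utilities/stopwords.txt";

    // Return the value of the system property if set, otherwise the default path.
    static String resolve() {
        return System.getProperty(PROPERTY, DEFAULT_PATH);
    }

    public static void main(String[] args) {
        String path = resolve();
        System.out.println("Stopwords file: " + path);
        System.out.println(new File(path).isFile()
                ? "File found."
                : "File not found; check the path.");
    }
}
```

The property can be set on the command line, for example: java -Dir.stopwords=D:/data/stopwords.txt StopwordsLocation (the D:/data path is also just an example).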

2. Using the Browser

Next, you need to change the way the program opens a URL. To do so, change the relevant line in Browser.java in the ir.utilities package to something like the following:

For Internet Explorer:
Runtime.getRuntime().exec("C:/Program Files/Internet Explorer/IEXPLORE.EXE "+url);

For Firefox:
Runtime.getRuntime().exec("C:/Program Files/Mozilla Firefox/firefox.exe "+url);
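
Note that Runtime.exec(String) splits its argument on whitespace, which can be fragile when the browser path contains spaces, as "C:/Program Files/..." does. If the lines above do not work on your machine, passing the program and the URL as separate arguments may be more robust. The following is a minimal sketch under that assumption; the class BrowserLaunch and its methods are hypothetical and not part of the course code, and the browser path is the example path from above.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the ir code: launches a browser on a URL.
class BrowserLaunch {

    // Keep the executable path and the URL as separate arguments,
    // so spaces inside the path are not treated as separators.
    static List<String> command(String browserPath, String url) {
        return Arrays.asList(browserPath, url);
    }

    // Start the browser process; throws IOException if the executable cannot be started.
    static Process open(String browserPath, String url) throws IOException {
        return new ProcessBuilder(command(browserPath, url)).start();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java BrowserLaunch <url>");
            return;
        }
        // Example install location; adjust to match your own machine.
        open("C:/Program Files/Mozilla Firefox/firefox.exe", args[0]);
    }
}
```

The same two-argument idea can be applied inside Browser.java with Runtime.getRuntime().exec(String[]), which also takes the program and its arguments as separate strings.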

3. Location of the corpus

When you run InvertedIndex, you need to change the corpus path given as the argument. Download the corpus directory to a local folder, and use that local path as the argument.
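
Before passing the local corpus path to InvertedIndex, it can be useful to check that the folder exists and actually contains files. The following sketch does only that; the class CorpusCheck is hypothetical and not part of the course code.

```java
import java.io.File;

// Hypothetical helper, not part of the ir code: sanity-checks a local corpus folder.
class CorpusCheck {

    // Count regular files directly inside dir.
    // Returns -1 if dir does not exist, is not a directory, or cannot be read.
    static int countFiles(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return -1;
        }
        int count = 0;
        for (File child : children) {
            if (child.isFile()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Usage: java CorpusCheck <corpus-directory>");
            return;
        }
        int n = countFiles(new File(args[0]));
        if (n < 0) {
            System.out.println("Not a readable directory: " + args[0]);
        } else {
            System.out.println(n + " files found in " + args[0]);
        }
    }
}
```

For example, if you copied the corpus to C:/cs371r/corpora/yahoo-science/ (an example location only), run java CorpusCheck C:/cs371r/corpora/yahoo-science/ and then pass the same path to InvertedIndex.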