Running SPEC
============

This directory contains the code and data for training and testing the
SPEC sentence processing system. After you have successfully installed
the system, say e.g. "spec -test simu inputdata/EXAMPLE" to see what it
looks like. SPEC will read simulation parameters and the network weights
from the file "simu". A graphics window will come up with a number of
buttons. Click on "Run" to start the simulation: SPEC will run through
the sentences in the file "inputdata/EXAMPLE". The graphics display
shows activity propagation through the networks, and in the end,
detailed performance statistics are generated in the standard output.

More generally, SPEC is run with
  spec [options] [simulation file] [input file]
where the options are
  -help                 Prints this message
  -test                 Testing mode
  -train                Training mode (default)
  -chain                Networks are connected in a chain (default in testing)
  -nochain              Networks are run in isolation
  -stacknouns           Stack data specified by the nounlist
  -nostacknouns         Stack data specified by sentences (default in testing)
  -includeall           Include all modules (default in testing)
  -noincludeall         Include only those specified in the simufile
  -graphics             Bring up graphics display
  -nographics           Text output only
  -weights              Display weights
  -noweights            Display only activations
  -owncmap              Use a separate colormap
  -noowncmap            Use the existing colormap
  -delay <sec>          Delay in updating the screen (in seconds)

SPEC will come up in training mode by default. You can override this by
giving the -test option on the command line. The meanings of the other
options are described in detail below in conjunction with the different
training and testing strategies and the graphics display parameters.

If no filenames are specified on the command line, the file "simu" is
used for the simulation specifications, and the inputfile specified in
"simu" is taken to contain the input sentences. The first parameter is
taken to be the simufilename, and the second (if given) the inputfilename.


Training SPEC
=============

In training mode, SPEC reads the simulation specifications (such as the
number of units, which networks are trained, how many epochs, learning
rates, snapshot epochs...) from the simulation file, and the set of
training sentences from the inputfile. The initial weights and word
representations for all words occurring in the sentences are set up
randomly using the seed specified in the simufile. The networks are then
trained for a number of epochs with vanilla on-line backpropagation, with no
bias units or momentum (if you want, putting in your own favorite
backprop tricks should be fairly easy).
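The update rule is ordinary per-pattern (on-line) backpropagation. A minimal Python sketch of one such step is given below; this is only an illustration of the rule (SPEC itself is written in C), with hypothetical layer sizes and learning rate:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_pattern(W1, W2, x, target, eta):
    """One on-line backprop step: no bias units, no momentum.

    W1: hidden-by-input weight matrix, W2: output-by-hidden.
    Weights are updated immediately after each pattern.
    """
    # forward pass
    h = [sigmoid(sum(W1[j][i] * x[i] for i in range(len(x))))
         for j in range(len(W1))]
    o = [sigmoid(sum(W2[k][j] * h[j] for j in range(len(h))))
         for k in range(len(W2))]
    # output deltas: (t - o) * o * (1 - o)
    do = [(target[k] - o[k]) * o[k] * (1 - o[k]) for k in range(len(o))]
    # hidden deltas, backpropagated through W2
    dh = [h[j] * (1 - h[j]) * sum(do[k] * W2[k][j] for k in range(len(o)))
          for j in range(len(h))]
    # immediate (on-line) weight updates
    for k in range(len(W2)):
        for j in range(len(h)):
            W2[k][j] += eta * do[k] * h[j]
    for j in range(len(W1)):
        for i in range(len(x)):
            W1[j][i] += eta * dh[j] * x[i]
    return o
```

Adding your own tricks (bias units, momentum, etc.) would amount to extending the two update loops above.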

Strategies
----------
The simulation run from epoch 1 to "simulationendepoch" (specified in
the simufile) is divided into several phases with possibly different
learning rates and different networks to be run. The idea is that you
can train certain networks first and once they have converged, throw in
others, or you can train them all at once. Or you can simply train the
slower-learning networks longer. The different learning rates allow you
e.g. to fine-tune the weights by gradually reducing the learning rate,
which is often a good strategy.

During training, the networks may either be connected together in a
chain (with "chain 1" specified in the simufile), with the output of one
network forming the input of another, or they can be trained in
isolation ("chain 0") with the "correct" input patterns (this is of
course within the limits of the architecture and the
"phase-network-running" specifications; e.g. the segmenter always has to
get its input through the current parser). It is often faster to train
them in isolation: once they have converged, you can connect them
together and they know what to do with each other's output.

The stack can be trained in two ways: With "stacknouns 0" the training
embeddings are obtained from the actual training sentences, and the
network is trained (and statistics are collected) at each push. With
"stacknouns 1", it is trained with the complete set of embeddings of the
type "the <noun1>" and "the <noun1> who the <noun2>", which are obtained
from the list of nouns given in the inputfile (the actual patterns are
obtained through the current parser). The fastest training strategy is
to train the parser and segmenter first, and after that, train the stack
with the complete set of embeddings. In other systems, different
training schemes might be more appropriate.
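With "stacknouns 1", the set of embedding phrases is simply enumerated from the noun list. Conceptually (an illustrative Python sketch with a hypothetical noun list, not SPEC's actual generator; the actual input patterns come through the parser):

```python
def stack_embeddings(nouns):
    """Enumerate the two embedding templates used to train the stack:
    "the <noun1>" and "the <noun1> who the <noun2>"."""
    phrases = ["the %s" % n1 for n1 in nouns]
    phrases += ["the %s who the %s" % (n1, n2)
                for n1 in nouns for n2 in nouns]
    return phrases
```

For N nouns this yields N simple phrases plus N*N center-embedded ones, so the stack sees every combination rather than only those in the training sentences.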

SPEC presents the training sentences in a different order in each epoch,
unless you specify "shuffling 0" in the simulation file. This is useful
because artificially generated data often contains all variations of,
say, a particular sentence structure before any of the other structures,
and the network would be liable to forget some of the early structures
in the epoch.
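The shuffling behavior amounts to drawing a fresh permutation of the sentences each epoch; a sketch (illustrative Python, not SPEC's C code):

```python
import random

def epoch_order(sentences, shuffling, rng):
    """Return the presentation order for one epoch: a fresh random
    permutation if shuffling is on, the file order otherwise."""
    order = list(sentences)
    if shuffling:
        rng.shuffle(order)
    return order
```

With "shuffling 0" every epoch presents the sentences in file order, so any ordering bias in the generated data is felt on every pass.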

SPEC develops representations (using the FGREP method) for the input
words in the input layer of each network, unless you turn FGREP off by
"network-fgrepping 0 0 0" in the simulation file (you can turn it on and
off separately for each network; the stack does not actually have word
input but it is included in the fgrep list for completeness). Because
the network with FGREP is shooting at a moving target, it is sometimes
useful to slow down that target by specifying a smaller learning rate
for the word representations than for the weights. You can do that
separately for each epoch with the "phase-wordetas" parameter. FGREP
frees you from having to worry about encoding the input/output
representations. However, if your data has words with identical usage
(or symmetric, such as "good" always with "big" and "bad" with "small"),
it will notice this and make their representations the same. If you use
FGREP, make sure each word in your data has a unique usage.
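The core of an FGREP step is that the word representation is treated as adaptable input: the error signal backpropagated all the way to the input assembly modifies the representation itself, with its own learning rate. A minimal sketch of that final step (illustrative only):

```python
def fgrep_update(rep, input_delta, wordeta):
    """FGREP step: adjust a word representation by the error signal
    backpropagated to the input assembly, using the representations'
    own learning rate (set per phase with phase-wordetas)."""
    return [r + wordeta * d for r, d in zip(rep, input_delta)]
```

Making wordeta smaller than the weight learning rate slows the "moving target" down, as described above.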

Architecture
------------
The above inputfile, simulationendepoch, phase-lastepoch, eta, wordeta,
running, chain, stacknouns, shuffling, seed, and network-fgrepping
parameters allow you to specify various training strategies for general
modular connectionist NLP systems.  Your model is defined mostly by the
code. The original code was written so that it should be fairly modular
and easy to modify. When you make modifications, it is a good idea to
use the graphics display to debug your model. Also after each
modification, run a performance check on some training and testing data
that you know well and know what the output should be. This way you can
catch many errors early on and keep the debugging manageable.

In the simulation file, you can specify that you want to run parser only
(see below), and you can change the size of the word representations and
the hidden layers by changing the corresponding values in the simufile.
If you accidentally exceed the table dimensions specified in the code,
it should catch the problem and give you a warning, and you can then
change the table limits in the code.

Snapshots
---------
The training results are saved in the simufile. Snapshots of word
representations and weights are appended at the end of the simulation
file after each epoch that you specify in the "snapshotepochs" list.
All weights are saved at each snapshot even if the network is not
currently running (unless you specify "parseronly 1", see below). In the
beginning of each snapshot definition, the epoch and the average error
per output unit for parser, segmenter, control, and stack are
given.

If you interrupt the simulation before its endepoch, you can continue
from the last saved snapshot. Given the simufile, SPEC reads all
snapshots, restores the state of the simulation, and continues from
there.

Running the simulation
----------------------
Usually (at least for the first few times) it is a good idea to check on
screen that your simulation setup really works. Then you can turn the
display off and do the training in the background. While it is running,
the snapshots are saved in the simufile, and you can test them as they
come in. You don't necessarily need the graphics for this, and SPEC
allows you to train and test even when the display is unspecified (you
can e.g. do it remotely from a non-X terminal).

The idea is that the simufile should be a complete record of the
training simulation. It does not have to, and should not, be changed for
testing.  Although you can override some of the simulation parameters
(chain, stacknouns, running) by command-line options during training, it
is not a good idea to do so because no record of it remains in the
simulation file and it may be difficult to remember how the file in
front of you was actually created. If you doctor the simufile after it
was created, it is a good idea to include comments about it in the
simufile.

As the simulation is running, the average error per output unit for each
module in the system (parser, segmenter, control output of the
segmenter, and stack) is printed in the standard output at each epoch so
you can follow how the learning is progressing. The simulation switches
in use are also printed out in the beginning, so that the output
contains a record of possible options as well.


Testing SPEC
============

In test mode (specified by -test option), the system runs through the
same procedures as in training, except the weights and the
representations are not changed, and more detailed statistics about the
system performance are collected and output.

Some of the simufile parameters are overridden in test mode. Unless you
specify -nochain in the command line, the networks are connected in a
chain.  The stack input comes from the test sentences by default, unless
you specify -nostacknouns. All modules are included in the simulation by
default, unless you give the -noincludeall option. In effect, these settings
make the full SPEC performance system the testing default. Sometimes
it is interesting to see e.g. how each module performs in isolation, and
you can use the options to specify such tests.

In a test simulation, SPEC reads each snapshot one at a time, runs all
sentences in the input file through the network (without shuffling), and
collects statistics about the performance of the system. It prints the
following information for each snapshot in the standard output:

- average error per segmenter output unit (except control output)
- average error per segmenter's control output unit
- percentage of correct control signals generated (within a given range
  of the correct value, 0.3 by default, as specified in stats.c)
- average error per stack output unit (including only the parser hidden
  layer representation part) 
- average error per parser output unit
- percentage of parser output units within a given range of the correct
  value (e.g. 0.15 as specified in stats.c)
- percentage of correctly identifiable words at the parser's output
  (the output is considered correct if the representation of the correct
  word in the lexicon has the smallest Euclidean distance to the output
  representation over all words in the lexicon)
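The word-level measure is nearest-neighbor decoding in the lexicon. Sketched in Python (hypothetical two-dimensional representations for illustration; SPEC's representations live in the lexicon built during training):

```python
import math

def closest_word(output, lexicon):
    """Return the lexicon word whose representation has the smallest
    Euclidean distance to the output pattern."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(lexicon, key=lambda w: dist(lexicon[w], output))
```

The output is counted as correct when this decoding returns the intended word, even if the unit activations are off by more than the fixed tolerance.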

In addition, one example input sequence and the corresponding output
case-role representation sequence are printed, and the above information
is given for each point in the sequence. These statistics only make
sense when the input file consists of sentences of a single type.
Usually I test SPEC with multiple input files, each consisting of
sentences of only one type, and can then see where SPEC has trouble
processing these sentences.

In the beginning, the file names and the simulation switches are printed
in the standard output, so that the test simulation can always be
recreated from the output information.


Graphics Display
================

Although in principle it is enough to train and test the networks
without graphics, the graphics display is often indispensable for
debugging the model and understanding what is actually going on. If your
model does not appear to be learning, put it on screen; you may be
surprised what you can find out that way.

Whether the graphics display is brought up depends on what you have
defined in the application defaults for "Spec*bringupDisplay". If it is
"true", SPEC will come up with graphics, otherwise without it. If no
application defaults file can be found, SPEC will come up with
graphics. You can override the application defaults by specifying
-graphics or -nographics in the command line. If you don't have X11, you
can still run SPEC (without graphics) remotely on another machine that
does have it (which is sometimes useful for starting simulations remotely).

The display shows the unit activations, target patterns, weights (if you
so desire), and the current output error for each network, and also
displays the word sequence read so far and the name of the current
simulation file. On top of the display there are a number of buttons
that control the run:

Buttons
-------
"Run": click here and SPEC will start a simulation run, reading input
 from the input file listed at right. If the training has been completed,
 "Run" has no effect; if testing a snapshotfile has been completed,
 hitting "Run" will run the same tests again. While the simulation is
 running, the "Run" button changes into a "Stop" button.

"Stop": you can interrupt the run at any time by clicking on the "Stop"
 button, and it changes to the "Run" button. Click "Run" again and the
 simulation continues.

"Step" is a toggle switch; when on, it causes SPEC to pause after
 every major propagation in the network. Click "Run" to continue.

"Clear" interrupts the currently running simulation and clears the
 network activations. After hitting "Run", the simulation file is read in
 again and the simulation continues from the current state of the
 simulation file.

"Quit" terminates the program.

Network displays
----------------
In all network displays, the input layer is on top, the hidden layer in
the middle, and the output layer and the target pattern at the bottom.
The previous hidden layer of the parser is shown right below the input
layer if the weight display is off, otherwise it is shown to the right
of the input word (to make better use of the available space).  At the
output of the segmenter, the 3 control units are shown at right. At the
input and output of the stack, the representation pushed or popped is at
left and the stack representation at right. During training of the
stack, the full target is shown; during testing, only the target for the
popped representation is displayed (because performance statistics are
collected only for that part).

The weight displays are matrices attached to one of the layers. If the
weights are above the layer, they are the input weights of the units. If
they are below, they are the output weights. All the (input or output)
weights of each unit appear in the column attached to that unit. The
y-index runs from 0 to N from the top down, corresponding to units in
the other layer from left to right. Get it?

The word representations at the input and output assemblies are
labeled. Each label indicates the word in the lexicon that is the
closest to the current pattern. If this word is not what it should be,
the correct word is given in parentheses (e.g. "*boy(girl)*"). "_"
indicates the blank, or all-0, representation; this symbol is only used
when the word is incorrect. If the label does not fit the box, it is
truncated at right.

Resources
---------
The display interacts with the X system in the normal manner. You can
iconize the display, resize it, change the default parameters in the
app-defaults or .Xdefaults file, use the standard X Toolkit options,
abbreviate the options, etc. A few comments on the more important
resources (look in the file Spec.ad for more details):

The weight display can be turned on or off by toggling the
"Spec*weightDisplay" resource in the app-defaults (or .Xdefaults) file.
You can override it by specifying -weights or -noweights in the command
line.

Since there are lots of weights, the app-defaults file specifies a
different default size for the display with weights:
"Spec*weightnetheight". It should be quite a bit larger than the display
without weights. You can make the weights appear flatter relative to the
units
by changing the "Spec*weightsPerUnit" resource in app-defaults. It is 3
by default, indicating that the colorful boxes representing units are 3
times as tall as the boxes for weights.

The negative weights are displayed in green and positive in red,
intensity corresponding to the magnitude of the weight (black=0.0).  You
can control the display range by changing the "Spec*weightDisplayRange"
resource in app-defaults (it is -1.0 to 1.0 by default; weights outside
the range are displayed at maximum intensity). The unit activations are
shown in the scale of black (0.0) - red (0.333) - yellow (0.667) - white
(1.0). In B&W displays, the gray scale from black to white is used
instead.

Unfortunately the fonts are not resizable in X11R5. If the display size
changes a lot, you may have to specify other fonts in the app-defaults
file. Also, if your display has few available colors, it may be a good
idea to set "Spec*owncmap" to "true"; in this case, SPEC will come up
with a complete, private colormap instead of using whatever colors it
can get (this resource may be overridden with the -owncmap and -noowncmap
options). Lastly, on some fast machines you may actually want to slow
down the display. You can do that with "Spec*delay" with a given number
of seconds, or using the -delay command-line option.


Running the parser alone
========================

You can also run SPEC without the segmenter and stack modules, as a
single simple recurrent parser. This is the "strawman" against which
many more advanced architectures could be compared. Just specify
"parseronly 1" in the simulation specification file. The rest of the
parameters in the simulation file and the entire input file (including
the now-unused control signals) can be just the same.  However, the
weights will be saved and statistics reported only for the parser
network.


Credits etc.
============

Copyright (C) 1994 Risto Miikkulainen

This software can be copied, modified and distributed freely for
educational and research purposes, provided that this notice is included
in the code, and the author is acknowledged in any materials and reports
that result from its use. It may not be used for commercial purposes
without expressed permission from the author.

We hope that this software will be a useful starting point for your own
explorations in connectionist NLP. The software is provided as is;
however, we will do our best to maintain it and accommodate
suggestions. If you want to be notified of future releases of the
software or have questions, comments, bug reports or suggestions, send
email to discern@cs.utexas.edu.
