The SPEC software package
=========================

This directory contains the training and testing code and data for the
SPEC sentence processing system. The software consists of three main
components: (1) the C code for the simulation management, networks, and
graphics, (2) the sentence data files, and (3) example simulation
specification and output files.  This file contains an overview of all
of these.  For more detailed explanations, see the comments in the
individual files themselves.


Code
----
defs.h
Definitions of the global data types and constants. The network weights
and activities and the input data are stored in fixed-size tables whose
maximum dimensions are hardcoded as compile-time constants. If your own
data exceeds the maximum table sizes, SPEC should catch it and issue a
warning; you can then raise the table limits in this file to fit your
data.
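
As a rough illustration of the fixed-size table scheme described above,
the sketch below shows a table with a hardcoded maximum size and a check
that warns when the limit is exceeded. The names MAXWORDS, MAXCHARS,
word_table, and add_word are hypothetical and are not taken from defs.h;
the actual constants and checks in SPEC may look quite different.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical limits; the real constants are defined in defs.h. */
#define MAXWORDS 4    /* max number of words the table can hold */
#define MAXCHARS 32   /* max length of a word, including the terminator */

static char word_table[MAXWORDS][MAXCHARS];
static int nwords = 0;

/* Store a word in the table and return its index.
   If the table is already full, print a warning and return -1. */
int add_word(const char *word)
{
    if (nwords >= MAXWORDS) {
        fprintf(stderr, "Warning: more than %d words; increase MAXWORDS\n",
                MAXWORDS);
        return -1;
    }
    strncpy(word_table[nwords], word, MAXCHARS - 1);
    word_table[nwords][MAXCHARS - 1] = '\0';
    return nwords++;
}
```

With these assumed limits, the fifth call to add_word is rejected with a
warning, and raising MAXWORDS and recompiling makes room for more data.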

prototypes.h
Prototype definitions for global functions.

globals.c
Definitions of the global variables, including network dimensions,
weights, and activities, input data, file names, simulation parameters,
and the major graphics structures.

gwin.c, Gwin.h, GwinP.h
Definition of a simple window widget for graphics output.

main.c
Initialization of the X interface, networks, and input data. The main
training and testing loops. All the input sentences are given as text;
however, when the input is read in, it is converted into an index
representation, where every word is represented by an index into a table
of representations.
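
The conversion from text to index representation can be sketched as
follows. This is only an illustration under assumed names: the lexicon
contents and the functions word_index and sentence_to_indices are
hypothetical and do not come from main.c.

```c
#include <string.h>

/* Hypothetical lexicon; index i would select row i of the table of
   word representations. */
static const char *lexicon[] = {
    "the", "girl", "who", "liked", "dog", "saw", "boy"
};
#define NLEX ((int)(sizeof lexicon / sizeof lexicon[0]))

/* Return the index of a word in the lexicon, or -1 if it is unknown. */
int word_index(const char *word)
{
    int i;
    for (i = 0; i < NLEX; i++)
        if (strcmp(lexicon[i], word) == 0)
            return i;
    return -1;
}

/* Convert a whitespace-separated sentence into word indices.
   Returns the number of words stored, or -1 if a word is unknown
   or the sentence has more than maxlen words. */
int sentence_to_indices(const char *sentence, int indices[], int maxlen)
{
    char buf[256];
    char *tok;
    int n = 0;

    /* strtok modifies its argument, so work on a local copy. */
    strncpy(buf, sentence, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (tok = strtok(buf, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
        int idx = word_index(tok);
        if (idx < 0 || n >= maxlen)
            return -1;
        indices[n++] = idx;
    }
    return n;
}
```

After the conversion, the networks only need the integer indices; the
text itself is no longer consulted.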

nets.c
Simulation code for the three networks of SPEC. Sentence processing
loop, propagation, input and target initialization, weight initialization.

stats.c
Performance statistics initialization, collecting, and printing routines.

graph.c
X window graphics: initialization, event handling loop, callbacks for
simulation control buttons, colormap allocation, displaying network
activities and weights, resizing.


Input data files (in directory "inputdata")
-------------------------------------------
EXAMPLE
Explains the format of the input files in more detail.

training.40-06.100
The training set for SPEC described in the reports, consisting of 100
randomly selected sentences from templates 40 and 06.

23.ctr-tail-a (and the other files named this way)
Each of the other files in this directory contains the full set of
sentences for a particular template. The number gives the template
number and the rest of the name describes the template. For example,
ctr-tail-a means that the sentence has a center embedding, which has a
tail embedding with "who" as the agent. Test SPEC with these files one
at a time, so that the sentence-position-specific statistics make sense.
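
The naming convention above (template number, a period, then a
description) can be taken apart with a few lines of C. The function
parse_template_name and the DESCLEN limit are hypothetical and are shown
only to make the convention concrete; SPEC itself may never need to
parse these names.

```c
#include <stdio.h>

#define DESCLEN 64   /* hypothetical max description length, with terminator */

/* Split a name such as "23.ctr-tail-a" into the template number and
   the description. Returns 1 on success and 0 if the name does not
   start with a number followed by a period and a description. */
int parse_template_name(const char *name, int *number, char desc[DESCLEN])
{
    /* %63s leaves room for the terminating '\0' in desc. */
    return sscanf(name, "%d.%63s", number, desc) == 2;
}
```

A name that does not begin with a template number, such as
training.40-06.100, is rejected by this sketch.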


Example simulation files
------------------------
simu
Example simulation specification. The simulation parameters are defined
here, and the results (snapshots of representations and weights) are
stored at the end. The file specifies, for example, which input file to
use, how large the networks and word representations are, which modules
to train and for how long, when to save snapshots, and what learning
rates to use. Each snapshot consists of the snapshot epoch, the current
error for each network, and the current word representations and weights
for all networks. The idea is that the simufile is a complete record of
one training session. The resulting network can then be tested in
different ways without having to edit the simulation file. In this
example, only one snapshot was saved, at the end of the simulation. You
can delete the snapshot and change the parameters to specify your own
experiments.

Spec.ad
This is the X application defaults file. It defines a number of
parameters for the X display, such as the graphics dimensions, colors,
and fonts. This file should be placed in the app-defaults directory or
the user's home directory (in which case it should be renamed "Spec"),
or included in the user's .Xdefaults file.

output-training
Output of SPEC while training it with "simu" (created with 
"spec simu >output-training").

test-all-templates
A shell script that tests the performance of SPEC in processing each of
the different types of input sentences. To run it, give the name of the
simulation file as the parameter, like "test-all-templates simu".

output-test-all-templates
Created by "test-all-templates simu >output-test-all-templates", it has
the results of testing SPEC with the different templates.
