Semantic Flipping Demo


This demo illustrates the role that context frequency can play in the disambiguation process, as described in more detail in

Marshall R. Mayberry, III, and Risto Miikkulainen (1994). Lexical Disambiguation Based on Distributed Representations of Context Frequency. In Proceedings of the 16th Annual Conference of the Cognitive Science Society, Atlanta, GA.

Here is the abstract:

A model for lexical disambiguation is presented that is based on combining the frequencies of past contexts of ambiguous words. The frequencies are encoded in the word representations and define the words' semantics. A Simple Recurrent Network (SRN) parser combines the context frequencies one word at a time, always producing the most likely interpretation of the current sentence at its output. This disambiguation process is most striking when the interpretation involves semantic flipping, that is, an alternation between two opposing meanings as more words are read in. The sense of ``throwing a ball'' alternates between ``dance'' and ``baseball'' as indicators such as the agent, location, and recipient are input. The SRN parser demonstrates how the context frequencies are dynamically combined to determine the interpretation of such sentences. We hypothesize that several other aspects of ambiguity resolution are based on similar mechanisms, and can be naturally approached from the distributed connectionist viewpoint.
The SRN has been trained on a corpus of 125 active and 125 passive sentences, each with three parameters: an agent, a location, and a recipient. Each of these parameters can take on one of five values, graded from values strongly associated with the baseball sense of the word ``ball'', through values that are neutral with respect to the sense, to values strongly associated with the dance sense of ``ball''. Thus, for example, if the passive sentence
The ball was thrown in the clubroom for the fans by the emcee.
is analyzed word by word, a hearer could be expected to anticipate the dance sense of ``ball'' upon hearing ``clubroom'', which is moderately associated with that sense in the training corpus, then to switch over to the baseball sense upon encountering ``fans'', which is more strongly associated with baseball, and finally back to the dance sense when ``emcee'' is processed, because of that word's strong association with the dance sense of ``ball''.
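
As a rough illustration of how such a graded corpus might be generated, here is a minimal sketch in Python; the filler words, their sense values, and the simple averaging used to stand in for context frequency are illustrative assumptions, not the actual lexicon or statistics from the paper.

    import itertools

    # Illustrative fillers, graded from the baseball sense (0.0) to the dance
    # sense (1.0); these values are assumptions, not the paper's lexicon.
    AGENTS     = {"pitcher": 0.00, "boy": 0.25, "visitor": 0.50, "hostess": 0.75, "emcee": 1.00}
    LOCATIONS  = {"stadium": 0.00, "park": 0.25, "yard": 0.50, "clubroom": 0.75, "ballroom": 1.00}
    RECIPIENTS = {"catcher": 0.00, "fans": 0.25, "crowd": 0.50, "guests": 0.75, "debutantes": 1.00}

    def corpus():
        """Yield all 125 agent/location/recipient combinations, once in the
        active voice and once in the passive voice (250 sentences in total)."""
        for (a, av), (l, lv), (r, rv) in itertools.product(
                AGENTS.items(), LOCATIONS.items(), RECIPIENTS.items()):
            sense = (av + lv + rv) / 3.0   # crude stand-in for context frequency
            yield f"The {a} threw the ball in the {l} for the {r} .", sense
            yield f"The ball was thrown in the {l} for the {r} by the {a} .", sense

    for sentence, sense in list(corpus())[:4]:
        print(f"{sense:.2f}  {sentence}")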

For the sake of explicitness, the words in the lexicon have been hand-coded (see the paper for details) so that the activation of the last unit of the output word labeled ``Ball'' reveals its sense. This value is displayed under the ``Computed'' label at the bottom of the demo, together with the ``Predicted'' value based on the actual frequency of this context in the training corpus.
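
As a concrete, purely illustrative example of the values reported in the status line described below, here is a small sketch; the 12-unit ``Ball'' assembly and the particular numbers are hypothetical:

    def computed_vs_predicted(ball_output, predicted):
        """Read the sense off the last unit of the output ``Ball'' assembly and
        compare it with the context-frequency prediction, as the status line does."""
        computed = ball_output[-1]   # last unit encodes the sense of ``ball''
        return computed, predicted, abs(predicted - computed)

    # Hypothetical 12-unit output assembly and a predicted sense of 0.75.
    computed, predicted, error = computed_vs_predicted(
        [0.1, 0.8, 0.3, 0.0, 0.5, 0.2, 0.9, 0.4, 0.1, 0.6, 0.3, 0.72], 0.75)
    print(f"Computed {computed:.2f}  Predicted {predicted:.2f}  Error {error:.2f}")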

Given this background, we can understand the components of the demo:

  1. A menubar.
  2. The Network: the input word (at top) feeds into the hidden layer (shown together with the context layer), which feeds into the output layer, which is compared against the target layer. The output layer gives the net's current sense of the word ``ball'' as well as the activations for the agent, location, and recipient. These can then be compared against the values in the target layer (with the caveat that the target Verb and Ball always show ``tossed baseball'' as a baseline for determining the error). A minimal sketch of this architecture appears after this list.
  3. A graph that plots the network's current activation against the predicted value.
  4. A panel of radio buttons that lets the user try out different sentences from the corpus.
  5. A status line showing the computed, predicted, and error values from the network.
All you need to do is select the agent, location, recipient, and sentence voice (active or passive) with the radio buttons at the bottom of the screen and step through the sentence.
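
For readers who want the architecture in item 2 in concrete terms, here is a minimal sketch of an SRN forward pass; the layer sizes, the random untrained weights, and the numpy formulation are illustrative assumptions, not the demo's actual implementation.

    import numpy as np

    class SRN:
        """Simple Recurrent Network: the input word and a copy of the previous
        hidden layer (the context layer) feed the hidden layer, which feeds the
        output layer of case-role assemblies."""
        def __init__(self, word_dim=12, hidden_dim=50, out_dim=6 * 12, seed=0):
            rng = np.random.default_rng(seed)
            self.W_in  = rng.normal(0.0, 0.1, (hidden_dim, word_dim))    # input   -> hidden
            self.W_ctx = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))  # context -> hidden
            self.W_out = rng.normal(0.0, 0.1, (out_dim, hidden_dim))     # hidden  -> output
            self.context = np.zeros(hidden_dim)

        def step(self, word_vec):
            """Read one word and return the output layer, i.e. the network's
            current interpretation of the sentence so far."""
            hidden = np.tanh(self.W_in @ word_vec + self.W_ctx @ self.context)
            self.context = hidden                                 # context = previous hidden
            return 1.0 / (1.0 + np.exp(-(self.W_out @ hidden)))   # sigmoid output assemblies

    net = SRN()
    for word_vec in np.random.default_rng(1).random((9, 12)):  # nine stand-in word vectors
        output = net.step(word_vec)                             # updated after every word
    print(output.shape)                                         # one value per output unit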

The blue line gives the predicted activation (and, therefore, the sense of the word ``ball'') based on its frequency in the given context during training. The red line shows the activation the network actually learned. The activations of the units themselves are spread across the color spectrum, with a rough breakdown (depending on how many colors are actually allocated on your monitor) as follows; a sketch of this mapping appears after the list:

  1. 0.00-0.25: black through dark blue to light blue
  2. 0.25-0.50: light blue to olive
  3. 0.50-0.75: olive to dark violet
  4. 0.75-1.00: dark violet to white
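
As an illustration only (the actual palette depends on the colormap the demo can allocate under X11), here is a sketch of how a unit activation in [0, 1] would be binned into the ranges listed above:

    def activation_color(a):
        """Map a unit activation in [0, 1] to the rough color band used by the
        display; the band boundaries follow the list above, the names are approximate."""
        if a < 0.25:
            return "black / dark blue / light blue"
        elif a < 0.50:
            return "light blue / olive"
        elif a < 0.75:
            return "olive / dark violet"
        return "dark violet / white"

    for a in (0.10, 0.30, 0.60, 0.90):
        print(f"{a:.2f} -> {activation_color(a)}")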
For the demo to work, you must be running X11 (not Windows, Macintosh, or NeXTStep), and your firewall must not block remote access to your display. The demo program runs on net.cs.utexas.edu, with the display sent over the Internet to your X11 screen. Click here to set up the demo. Send comments, bug reports, etc. to martym@cs.utexas.edu.

Back to UTCS Neural Networks home page

martym@cs.utexas.edu
Last update: 1.8 2000/06/24 04:27:45 jbednar