
Storing to Memory

With M discrete memory storage slots, the question arises of how a specific training example should be generalized. Training examples are represented here as $E_{\phi,a,r}$, consisting of an angle $\phi$, an action a, and a result r, where $\phi$ is the initial position of the defender, a is "s" or "p" for "shoot" or "pass," and r is "1" or "-1" for "goal" or "miss," respectively. For instance, $E_{\phi,p,1}$ represents a pass resulting in a goal for which the defender started at position $\phi$ on its circle.

The most straightforward technique would be to store the result at the single memory slot whose index is closest to the defender angle $\phi$, i.e., round $\phi$ to the nearest $\theta$ for which Mem[$\theta$] is defined, and then set $Mem[\theta] = r$. However, this technique does not provide for the case in which we have two training examples $E_{\phi_1,a,r_1}$ and $E_{\phi_2,a,r_2}$, where $\phi_1$ and $\phi_2$ both round to the same $\theta$. In particular, there is no way of scaling how indicative $E_{\phi,a,r}$ is of Mem[$\theta$].
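The collision described above can be illustrated with a minimal sketch. The function names `nearest_slot` and `store_naive`, and the use of a dictionary keyed by slot angle, are assumptions made for illustration and do not come from the original text. Angles are in degrees and slots are assumed to sit at multiples of 360/M.

```python
def nearest_slot(phi, M):
    """Round angle phi (degrees) to the nearest memory index.

    Assumes slots sit at multiples of 360/M; 360 wraps around to 0.
    """
    spacing = 360.0 / M
    return (round(phi / spacing) * spacing) % 360.0


def store_naive(mem, phi, r, M):
    """Naive storage: overwrite the nearest slot with the raw result r."""
    mem[nearest_slot(phi, M)] = r


# Two examples that round to the same slot: the second silently
# replaces the first, regardless of how close each one was to the slot.
mem = {}
store_naive(mem, 116.5, 1, 18)
store_naive(mem, 123.0, -1, 18)
print(mem)
```

With M = 18, both 116.5 and 123.0 round to slot 120, so only the later result survives and nothing records how far either example was from the slot.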

To combat this problem, we scale the value stored to Mem[$\theta$] by the inverse of the distance between the defender angle $\phi$ and the memory index $\theta$ relative to the distance between memory indices. A result r at a given $\phi$ is multiplied by $1 - \frac{|\phi - \theta|}{360/M}$ before being stored to Mem[$\theta$]. In this way, training examples with values of $\phi$ that are closer to $\theta$ can affect Mem[$\theta$] more strongly. For example, with M = 18 (so Mem[$\theta$] is defined for $\theta = 0, 20, 40, \ldots, 340$), $E_{116.5,a,r}$ causes Mem[120] to be updated by a value of $r \cdot (1 - 3.5/20) = 0.825\,r$. Call this our basic memory storage technique:

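Basic memory storage technique, for a training example with defender angle $\phi$, action a, and result r:

  1. Find the two memory indices $\theta$ nearest to $\phi$, one on each side.
  2. For each such $\theta$, compute the scaled result $r \cdot \left(1 - \frac{|\phi - \theta|}{360/M}\right)$.
  3. Store the scaled result in Mem[$\theta$] only if its magnitude exceeds that of the value already stored there.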

Using this generalization function, the "update" of Mem[120] in the M = 18 example would only have an effect at all if $|Mem[120]| < 0.825$ prior to this training example. Consequently, only the past training example for action a with defender angle $\phi$ closest to the memory index $\theta$ is reflected in Mem[$\theta$]: presumably, this training example is the one most likely to accurately predict $E_{\theta,a,r}$. Notice that this basic memory storage technique is appropriate when the defender's motion is deterministic. To handle variations in the defender's speed, we later introduce a more complex memory storage technique. The method of scaling a result based on the difference between $\phi$ and $\theta$ will remain unchanged.
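The storage rule discussed above might be sketched as follows. This is a hedged illustration, not the authors' implementation: the name `store_basic`, the dictionary representation (one dictionary per action, keyed by slot angle, with missing slots treated as 0), and the comparison by absolute value are all assumptions inferred from the surrounding description. The angle is assumed to satisfy 0 <= phi < 360.

```python
import math


def store_basic(mem, phi, r, M):
    """Store result r (+1 or -1) observed at defender angle phi (degrees).

    mem is assumed to hold the values for a single action, keyed by
    slot angle theta; a missing key is treated as a stored value of 0.
    """
    spacing = 360.0 / M
    lower = math.floor(phi / spacing) * spacing
    # The two nearest slots bracket phi: one at or below, one above.
    for theta in (lower, lower + spacing):
        # Weight falls linearly from 1 (phi on the slot) to 0 (one slot away).
        weight = 1.0 - abs(phi - theta) / spacing
        value = r * weight
        key = theta % 360.0  # the slot at 360 is the slot at 0
        # Keep only the example with the largest magnitude, i.e. the
        # one whose angle was closest to this slot.
        if abs(value) > abs(mem.get(key, 0.0)):
            mem[key] = value


mem = {}
store_basic(mem, 116.5, 1, 18)   # Mem[120] gets about 0.825, Mem[100] about 0.175
store_basic(mem, 110.0, -1, 18)  # weight 0.5 on both slots
print(mem)
```

In this run the second example overrides Mem[100], where the stored magnitude was only about 0.175, but leaves Mem[120] untouched, because 0.5 does not exceed 0.825.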

In our example above, $E_{116.5,a,r}$ would not only affect Mem[120]: as long as $|Mem[100]|$ was not already larger, the value of Mem[100] would be set to $r \cdot (1 - 16.5/20) = 0.175\,r$. Notice that any training example $E_{\phi,a,r}$ with $|\phi - 100| < 16.5$ could override this value. Since 116.5 is so much closer to 120 than it is to 100, it makes sense that $E_{116.5,a,r}$ affects Mem[120] more strongly than it affects Mem[100]. However, $E_{110,a,r}$ would affect both memory values equally. This memory storage technique is similar to the kNN and kernel regression function approximation techniques, which estimate $f(\phi)$ based on $f(\theta)$, possibly scaled by the distance from $\theta$ to $\phi$, for the k nearest values of $\theta$. In our linear continuum of defender position, our memory generalizes training examples to the 2 nearest memory locations.
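The arithmetic behind this example can be written out explicitly. The weight symbol $w(\phi,\theta)$ is introduced here only for illustration and is not notation from the original text; it assumes the linear scaling factor with M = 18, so that adjacent memory indices are 20 degrees apart.

```latex
\begin{align*}
w(\phi,\theta) &= 1 - \frac{|\phi - \theta|}{360/M}, \qquad \frac{360}{M} = 20 \text{ for } M = 18 \\
w(116.5,\,120) &= 1 - \frac{3.5}{20} = 0.825 \\
w(116.5,\,100) &= 1 - \frac{16.5}{20} = 0.175 \\
w(110,\,100) &= w(110,\,120) = 1 - \frac{10}{20} = 0.5 \\
w(\phi,\,100) > 0.175 &\iff \frac{|\phi - 100|}{20} < 0.825 \iff |\phi - 100| < 16.5
\end{align*}
```

The last line is the override condition: under these assumptions, a later example replaces the value stored at index 100 exactly when its angle lies within 16.5 degrees of 100.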






Peter Stone
Mon Dec 11 15:42:40 EST 1995