Testing



After training, the neural network could be used very cheaply, with just a single forward pass at each decision point, to decide when to begin accelerating. Notice that within a single trial, only one input to the neural network varied: the Ball Distance decreased as the ball approached the Contact Point. Thus, the output of the neural network tended to vary fairly regularly. As the ball began approaching, the output increased, first slowly and then sharply. After reaching its peak, the output decreased, slowly at first and then sharply. The optimal time for the shooter to begin accelerating was at the peak of this function. However, since the function peaked at different values on different trials, no single fixed output value could identify the peak, so we used the following 3-input neural network shooting policy:
Begin accelerating when Output ≥ .6 AND Output < Previous output - .01.
Requiring that Output ≥ .6 ensured that the shooter would only start moving if it "believed" it was more likely to score than to miss. The condition Output < Previous output - .01 became true when the output of the neural network was just past its peak. Requiring that the output decrease by at least .01 from the previous output ensured that the decrease was not due simply to sensor noise.
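The shooting policy can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the original implementation: the function names (should_accelerate, first_trigger) and the idea of feeding in a precomputed list of network outputs are hypothetical, and the comparison against .6 is taken as inclusive. Only the two constants, .6 and .01, come from the text.

```python
def should_accelerate(output, previous_output, threshold=0.6, min_drop=0.01):
    """Return True when the output is confident and just past its peak.

    threshold and min_drop are the .6 and .01 constants from the text;
    the function itself is a hypothetical sketch.
    """
    if previous_output is None:
        # No earlier output to compare against at the first decision point.
        return False
    confident = output >= threshold
    past_peak = output < previous_output - min_drop
    return confident and past_peak


def first_trigger(outputs, threshold=0.6, min_drop=0.01):
    """Return the index of the first decision point at which the policy
    fires, or None if it never fires during the trial."""
    previous = None
    for step, output in enumerate(outputs):
        if should_accelerate(output, previous, threshold, min_drop):
            return step
        previous = output
    return None


# Made-up outputs for one trial: a slow rise, a sharp rise, then a decline.
trial = [0.10, 0.30, 0.70, 0.90, 0.85, 0.50]
print(first_trigger(trial))  # 4: the first output more than .01 below its predecessor
```

With these made-up values the policy fires one decision point after the peak, which is the earliest moment a peak can be detected from the previous output alone. A dip smaller than .01 (for example .90 followed by .895) does not trigger it, and a trial whose outputs never reach .6 never triggers it.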

Using this learned 3-input neural network shooting policy, the shooter scored 96.5% of the time. The results reported in this section are summarized in Table 2.

Table 2: Results before and after learning for fixed ball motion.

Even more important than the high success rate achieved when using the learned shooting policy was the fact that the shooter achieved the same success rate in each of the three symmetrical reflections of the training situation (which, together with the training situation itself, make up the four action quadrants). With no further training, the shooter was able to score from either side of the goal on either side of the field. Figure 4(a) illustrates one of the three symmetrical scenarios. The world description used as input to the neural network contained no information specific to the location on the field, but instead captured only information about the relative positions of the shooter, the ball, and the goal. Thanks to these flexible inputs, training in one situation was applicable to several other situations.
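The reason relative inputs transfer across quadrants can be illustrated with a small sketch. Everything here is assumed for illustration: the three distance features are hypothetical stand-ins rather than the network's actual inputs, the coordinates are made up, and the field is taken to be centered at the origin so that reflecting it amounts to negating a coordinate.

```python
import math


def relative_inputs(shooter, ball, contact_point):
    """Hypothetical stand-in features: distances between objects only,
    with no absolute field position."""
    return (
        math.dist(ball, contact_point),
        math.dist(shooter, contact_point),
        math.dist(shooter, ball),
    )


def reflect(point, flip_x=False, flip_y=False):
    """Mirror a point across the field axes (field assumed centered at the origin)."""
    x, y = point
    return (-x if flip_x else x, -y if flip_y else y)


# Made-up positions for one training situation.
shooter, ball, contact = (10.0, 5.0), (30.0, 20.0), (25.0, 5.0)
original = relative_inputs(shooter, ball, contact)

# The three reflections of the training situation.
for flip_x, flip_y in [(True, False), (False, True), (True, True)]:
    mirrored = relative_inputs(
        reflect(shooter, flip_x, flip_y),
        reflect(ball, flip_x, flip_y),
        reflect(contact, flip_x, flip_y),
    )
    assert all(math.isclose(a, b) for a, b in zip(original, mirrored))
```

Because a reflection preserves distances, a network fed such features sees identical inputs in all four quadrants, so nothing learned in one quadrant has to be relearned in another.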


Peter Stone
Thu Aug 22 12:51:13 EDT 1996