Next: Reasoning about action execution Up: Using the Learned Behaviors Previous: Using the Learned Behaviors

Receiver Choice Functions


Recall that the DT estimates the likelihood that a pass to a specific player will succeed. Thus, for a client to use the DT in a game, several additional aspects of its behavior must be defined. First, the DT must be incorporated into a full Receiver Choice Function (RCF). We define the RCF to be the function that determines what the client should do when it has possession of the ball, that is, when the ball is within kicking distance (2 m). The input of an RCF is the client's perception of the current state of the world. This perceived state includes both the agent's latest sensory perception and remembered past positions of currently unseen objects [2]. The output of an RCF is an action from among the options dribble, kick, or pass, together with a direction, given either in terms of a player (e.g. towards teammate number 4) or in terms of a part of the field (e.g. towards the goal). Because the DT only evaluates a pass to one specific player, the RCF must choose a set of candidate receivers before using the DT. Then, using the output of the DT for each of these receivers, the RCF can choose its receiver or else decide to dribble or kick the ball. Table 1 defines three RCFs, one of which uses the DT, while the other two are defined for the purposes of comparison.

Table 1: Specification of the RCFs.

As indicated in Table 1, the set of candidate receivers is determined by the players' positions. Each player is assigned a particular position on the field, or an area to which it goes by default. The approximate locations of these positions are indicated by the locations of the players on the black team in Figure 2.

Figure 2: Player positions used by the behaviors in this paper. The black team, moving from left to right, has a goalie, a sweeper, and one defender, midfielder, and forward on the left, center, and right of the field. The arrows emanating from the players indicate the positions to which each player considers passing when using the RCFs. The players on the left of the field (top of the diagram) consider symmetrical options to their counterparts on the right of the field. The goalie has the same options as the sweeper. The white team has the same positions as the black, except that it has no players on its left side of the field, but rather two in each position on its right.

The formation used by all of the tested functions includes--from the back (left)--a goalie, a sweeper, three defenders, three midfielders, and three forwards. When a player is near its default position, it periodically announces its position to teammates; when a player leaves its position to chase the ball, it announces this fact and is no longer considered ``in position'' (see Table 1, Step 3). The arrows emanating from the players in Figure 2 indicate the positions to which each player considers passing. The clients determine which players are in which positions by listening to their teammates' announcements.
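The announcement scheme above can be sketched as a small bookkeeping structure. This is only an illustration under stated assumptions: the class `PositionTracker`, its method names, and the position labels are hypothetical, and the actual message format used by the clients is not given in this section.

```python
from typing import Dict, Iterable, List


class PositionTracker:
    """Hypothetical record of which teammates are currently 'in position'."""

    def __init__(self) -> None:
        # Maps a teammate's number to the position it last announced.
        self._position_of: Dict[int, str] = {}

    def hear_in_position(self, player: int, position: str) -> None:
        # A teammate near its default position announces that position.
        self._position_of[player] = position

    def hear_left_position(self, player: int) -> None:
        # A teammate that leaves to chase the ball is no longer in position.
        self._position_of.pop(player, None)

    def candidate_receivers(self, target_positions: Iterable[str]) -> List[int]:
        # Teammates in position at any of the positions this player
        # considers passing to (the arrows in Figure 2).
        targets = set(target_positions)
        return sorted(p for p, pos in self._position_of.items() if pos in targets)


tracker = PositionTracker()
tracker.hear_in_position(4, "right_midfielder")
tracker.hear_in_position(7, "center_forward")
tracker.hear_left_position(4)  # player 4 goes after the ball
print(tracker.candidate_receivers(["right_midfielder", "center_forward"]))
```

Keying the record by player rather than by position allows two teammates to hold the same position, as on the white team in Figure 2.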

The RCFs defined and used by this paper are laid out in Table 1. As suggested by its name, the DT--Decision Tree--RCF uses the DT described in Section 2 to choose from among the candidate receivers. In particular, as long as one of the receivers' success confidences is positive, the DT RCF indicates that the passer should pass to the receiver with the highest success confidence, breaking ties randomly. If no receiver has a positive success confidence, the player with the ball should dribble or kick the ball forwards (towards the opponent goal or towards one of the forward corners). This use of the DT confidence factor is, to our knowledge, a novel approach to agent control. The RAND--Random--RCF is the same as the DT RCF except that it chooses randomly from among the candidate receivers.
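The selection rule of the DT and RAND RCFs might be sketched as follows. The function names, the `confidence` callable standing in for the DT, and the `("dribble_or_kick", None)` return value are assumptions made for illustration; the choice between dribbling and kicking, and the direction, are left out. The RAND fallback for an empty candidate set is likewise an assumption, since Table 1 is not reproduced here.

```python
import random
from typing import Callable, Optional, Sequence, Tuple

Action = Tuple[str, Optional[int]]
FALLBACK: Action = ("dribble_or_kick", None)  # hypothetical label


def dt_rcf(candidates: Sequence[int],
           confidence: Callable[[int], float],
           rng: random.Random = random.Random()) -> Action:
    """Pass to the candidate with the highest positive success confidence."""
    scored = [(confidence(c), c) for c in candidates]
    positive = [(s, c) for s, c in scored if s > 0]
    if not positive:
        # No receiver has a positive confidence: move the ball forwards.
        return FALLBACK
    best = max(s for s, _ in positive)
    tied = [c for s, c in positive if s == best]
    return ("pass", rng.choice(tied))  # ties are broken randomly


def rand_rcf(candidates: Sequence[int],
             rng: random.Random = random.Random()) -> Action:
    """Pass to a uniformly random candidate (assumed fallback if none)."""
    if not candidates:
        return FALLBACK
    return ("pass", rng.choice(list(candidates)))


# Made-up confidences for three candidate receivers.
scores = {4: -0.2, 7: 0.6, 9: 0.3}
print(dt_rcf([4, 7, 9], scores.__getitem__))
```

With these made-up confidences the DT RCF selects teammate 7, while RAND would pick any of the three with equal probability.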

The PRW--Prefer Right Wing--RCF uses a fixed ordering on the candidate receivers for each of the positions on the field. In general, defenders prefer to pass to the wings rather than forward, midfielders prefer to pass forward rather than sideways, and forwards tend to shoot. As indicated by the name, all players in the center of the field prefer passing to the right rather than passing to the left. The RCF simply returns the most preferable candidate receiver according to this fixed ordering. Again, if no receivers are eligible, the RCF returns ``dribble'' or ``kick.'' This RCF was the initial hand-coded behavior for use in games.
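A fixed-ordering RCF of this kind could be sketched as below. The preference lists are hypothetical placeholders that merely respect the tendencies described above (forward before sideways for midfielders, right before left for center players); the actual orderings are those of Table 1 and Figure 2. The sketch also omits shooting by forwards.

```python
from typing import Dict, List, Optional, Tuple

Action = Tuple[str, Optional[int]]

# Hypothetical orderings, most preferred first; not taken from Table 1.
PREFERENCES: Dict[str, List[str]] = {
    "center_defender": ["right_defender", "left_defender", "center_midfielder"],
    "center_midfielder": ["center_forward", "right_forward", "left_forward",
                          "right_midfielder", "left_midfielder"],
}


def prw_rcf(my_position: str,
            in_position: Dict[str, int],
            preferences: Dict[str, List[str]] = PREFERENCES) -> Action:
    """Return the most preferred receiver that is currently in position."""
    for target in preferences.get(my_position, []):
        if target in in_position:
            return ("pass", in_position[target])
    # No eligible receiver: fall back to dribbling or kicking.
    return ("dribble_or_kick", None)


# Positions currently announced as occupied, mapped to player numbers.
occupied = {"right_forward": 11, "left_midfielder": 6}
print(prw_rcf("center_midfielder", occupied))
```

Because the ordering is fixed, this RCF needs no estimate of pass success, which is what makes it a useful baseline against the DT RCF.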


Peter Stone
Sun Dec 7 06:59:19 EST 1997