Learning Objectives

The objectives of the RoboCup Learning Challenge are to solicit comprehensive learning schemes that can be applied to multiagent systems which need to adapt to their situation, and to evaluate the merits and demerits of the proposed approaches using standard tasks. Learning is an essential aspect of intelligent systems. In the RoboCup Learning Challenge, the task is to create a learning and training method for a group of agents. The learning opportunities in this domain can be broken down into several types, which are described under Technical Issues.

Technical Issues

The technical issue anticipated in meeting this challenge is the development of a novel learning scheme that can effectively train individual agents and their teamwork, both off-line and on-line. One example of a possible learning scheme for meeting this challenge is as follows:

Off-line skill learning by individual agents: learning to intercept the ball, or learning to kick the ball with the appropriate power when passing. Since such skills are challenging to hand-code, learning can be useful during a skill development phase. However, since the skills are invariant from game to game, there is no need to relearn them at the beginning of each new game.

Off-line collaborative learning by teams of agents: learning to pass and receive the ball. This type of skill is qualitatively different from the individual skills in that the behaviors of multiple agents must be coordinated. A "good" pass is only good if it is appropriate for the receiver's receiving action, and vice versa. For example, if the passer passes the ball to the receiver's left, then the receiver must at the same time move to the left in order to complete the pass. As above, such coordination can carry over from game to game, thus allowing off-line learning techniques to be used.

On-line skill and collaborative learning: learning to play positions. Although off-line learning methods can be useful in the above cases, there may also be advantages to learning incrementally. For example, particular aspects of an opposing team's behavior may render a fixed passing or shooting behavior ineffective. In that case, the ability to adaptively change collaborative or individual behaviors during the course of a game could contribute to a team's success. At a higher level, team issues such as role (position) playing on the field might be best handled with adaptive techniques. Against one opponent it might be best to use 3 defenders and 8 forwards, whereas another opponent might warrant a different configuration of players on the field. The best teams should have the ability to change configurations in response to events that occur during the course of a game.

On-line adversarial learning: learning to react to predicted opponent actions. If a player can identify patterns in the opponents' behaviors, it should be able to proactively counteract them. For example, if the opponent's player number 4 always passes to its teammate number 6, then player 6 should always be guarded when player 4 gets the ball.

Evaluation

For challenge responses that address the machine learning issue (particularly the on-line learning issue), evaluation should be both against the publicly available teams and against at least one previously unseen team.

First, teams will play games against other teams and publicly available teams under normal circumstances. This evaluates the team's general performance and involves both AI-based and non-AI-based teams.

Next, teams will play a set of defined benchmarks. For example, after fixing their programs, challengers must play a part of the game, starting from the defined player positions, with the movement of the opponents pre-defined but not disclosed to the challengers. After several sequences of the game, the performance will be evaluated to see whether the team was able to improve with experience. The movements of the opponents are not coded using absolute coordinate positions, but as a set of algorithms that generate motion sequences. The opponent algorithms will be provided by the organizers of the challenge by withholding at least one successful team from being publicly accessible.

Other benchmarks that clearly evaluate learning performance will be announced after discussion with challenge participants.
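
The benchmark evaluation above asks whether a team improves with experience over several sequences of the game, but it does not say how that improvement would be measured. As an illustration only, the sketch below assumes that each sequence produces a single numeric score (here a hypothetical goal difference) and fits a least-squares trend line through the scores. The function names, the choice of score, and the `min_slope` threshold are assumptions made for this example and are not part of the challenge definition.

```python
from statistics import mean


def improvement_slope(scores):
    """Least-squares slope of score against sequence index (0, 1, 2, ...).

    A positive slope means later sequences scored better on average.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two benchmark sequences")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(scores)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def improved_with_experience(scores, min_slope=0.0):
    """Hypothetical pass/fail rule: the trend must exceed min_slope."""
    return improvement_slope(scores) > min_slope


# Hypothetical goal differences over six repeated benchmark sequences.
goal_differences = [-3, -2, -2, 0, 1, 1]
print(improved_with_experience(goal_differences))
```

A trend line is only one possible reading of "improve with experience". Comparing the mean of the early sequences with the mean of the late ones would be an equally simple alternative, and a real evaluation would also need to account for game-to-game variance.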
Copyright (C) 1998 The RoboCup Federation. All Rights Reserved.