Review: Evolving Robot Tank Controllers

From: Danny Wyatt (danny_at_cs.washington.edu)
Date: Sun Oct 19 2003 - 17:55:39 PDT

  • Next message: mkbsh_at_cs.washington.edu: "Review of Paper 1"

    Evolving Robot Tank Controllers
    Jacob Eisenstein

    Summary:

    Eisenstein designed a representation of Robocode controllers that allows
    the controller program to be learned with genetic programming. He
    evolved and evaluated his controllers against a set of hand-coded
    opponents, and he discusses the results.

    Important Ideas:

    Finding the right fitness function is very important. As a scored game,
    Robocode already provides an easy measure of success, but most human
    observers are not content with the raw score. Eisenstein adjusted his
    fitness function to reflect a more nuanced judgment of what makes a
    good Robocode battle.
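    To make the idea concrete, here is a sketch of a composite fitness
    function. The specific terms and weights are my own illustrative
    assumptions, not Eisenstein's actual function: the point is only that
    raw score can be blended with other qualities an observer values, such
    as survival and damage dealt.

```java
// Hypothetical composite fitness for an evolved Robocode controller.
// The terms and weights are illustrative assumptions, not taken from
// Eisenstein's paper.
public class FitnessSketch {
    /**
     * Blend raw game score with qualities a human observer cares about:
     * surviving rounds and dealing damage, not just racking up points.
     *
     * @param rawScore       total Robocode score across all rounds
     * @param roundsSurvived rounds in which the controller outlived its opponent
     * @param totalRounds    rounds played
     * @param damageDealt    cumulative damage inflicted on opponents
     */
    public static double fitness(double rawScore, int roundsSurvived,
                                 int totalRounds, double damageDealt) {
        double survivalRate = totalRounds > 0
                ? (double) roundsSurvived / totalRounds : 0.0;
        // Weighted sum; the weights here are arbitrary for the sketch.
        return 0.5 * rawScore + 0.3 * (100.0 * survivalRate) + 0.2 * damageDealt;
    }
}
```

    A controller that merely farms points while dying every round scores
    worse under such a blend than one that also survives.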

    Learned behavior is hard to predict, but beneficial to understand. A
    controller that wins hugely in one game and loses the rest still
    succeeds when scored on total points. A controller that never shoots
    can win against opponents that (perhaps irrationally) choose to shoot.
    What can these degenerate strategies teach us about the environment
    and our assumptions about it?

    Flaws:

    It's all about the representation. As in many genetic programming
    systems, the representation of the problem and of the genome builds in
    assumptions that no amount of learning can overcome. For example,
    Eisenstein did not give the onHitByBullet or onHitRobot events control
    of the gun "since they don't know where the opponent is". Yet these
    events certainly carry enough information to infer something about the
    opponent's location, and onHitRobot in particular can predict the
    opponent's location more accurately than onScannedRobot can. But
    because Eisenstein excluded that possibility from his representation a
    priori, his controllers can never learn it. (Indeed, he says his
    controllers could never beat Tracker, and that may be the fault of
    this representational choice.)
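    The claim that onHitRobot carries enough information to aim can be made
    concrete. In the Robocode API, HitRobotEvent.getBearing() returns the
    opponent's bearing relative to the robot's own heading; combined with
    getHeading() and getGunHeading(), that pins down exactly how far to
    turn the gun. The helper below is my own illustration (the class and
    method names are not part of Robocode); only the angle arithmetic is
    shown, so it stands alone.

```java
// Sketch: aiming the gun from an onHitRobot event. In Robocode,
// HitRobotEvent.getBearing() gives the opponent's bearing relative to
// the robot's heading. Inside a controller one would call:
//   turnGunRight(GunAim.gunTurn(getHeading(), e.getBearing(), getGunHeading()));
// Class and method names here are illustrative, not Robocode API.
public class GunAim {
    /** Normalize an angle in degrees to the range (-180, 180]. */
    public static double normalize(double degrees) {
        double a = degrees % 360.0;
        if (a <= -180.0) a += 360.0;
        if (a > 180.0) a -= 360.0;
        return a;
    }

    /** Degrees of rightward gun turn needed to face the robot we just hit. */
    public static double gunTurn(double heading, double bearing, double gunHeading) {
        return normalize(heading + bearing - gunHeading);
    }
}
```

    Nothing in this computation requires a radar scan, which is what makes
    the a priori exclusion of these events from gun control a real loss.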

    Uncontrolled experiments. Eisenstein seems to suggest that he changed
    his fitness function while his experiments were running, and he may
    not have re-run earlier trials to account for the change. More
    broadly, there are so many variables at play that it is hard to
    determine which changes led to success---if success itself can even be
    well-defined.

    Open Questions:

    How to generalize? Eisenstein says that his evolved controllers don't
    adjust to random start positions and can't handle all opponents evenly.

    How to avoid catatonics? In an environment where penalties discourage
    experimentation, the most rational choice can easily become "do
    nothing". How can learning get off the ground before the agent is
    playing for keeps?
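    One standard answer (mine, not the paper's) is fitness shaping: add a
    small bonus for merely acting, and decay it over generations so that
    early populations are rewarded for moving and firing even when those
    actions cost points, while later selection is driven by score alone.
    A sketch, with all weights and the decay schedule assumed:

```java
// Hypothetical shaping term to keep early generations from converging
// on "do nothing": reward activity itself, and decay the bonus over
// generations so later selection is driven by raw fitness alone.
public class ShapedFitness {
    public static double shaped(double rawFitness, double distanceMoved,
                                int shotsFired, int generation) {
        // Decay factor: the activity bonus fades as evolution proceeds
        // (an assumed schedule, not from the paper).
        double decay = 1.0 / (1.0 + generation);
        double activityBonus = 0.01 * distanceMoved + 0.5 * shotsFired;
        return rawFitness + decay * activityBonus;
    }
}
```

    At generation 0 a tank that moves and shoots beats a catatonic one
    even if neither scores; by late generations the bonus is negligible.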



    This archive was generated by hypermail 2.1.6 : Sun Oct 19 2003 - 17:55:21 PDT