Sarsa(lambda) agent learning the acrobot task

Task: The two sticks represent a simplified version of an acrobat's upper and lower body, hanging from a bar. The acrobat agent must get its feet above the green line.

Possible agent actions: The agent can add clockwise or counter-clockwise force (torque) to the midpoint of its body. It chooses between constant force in either direction and a third option of adding no force. The action chosen at each step is shown by the red block.
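
As a rough illustration of the three-way choice described above, the action set could be encoded as a small lookup table. This is only a sketch: the names ACTIONS and torque_for, the torque magnitude of 1.0, and the sign convention (negative for counter-clockwise) are assumptions made for the example, not details taken from the demo.

```python
# Hypothetical encoding of the three actions described above.
# TORQUE_MAGNITUDE and the sign convention are assumptions for illustration.
TORQUE_MAGNITUDE = 1.0

ACTIONS = {
    0: -TORQUE_MAGNITUDE,  # constant torque in one direction (assumed counter-clockwise)
    1: 0.0,                # no torque
    2: +TORQUE_MAGNITUDE,  # constant torque in the other direction (assumed clockwise)
}


def torque_for(action: int) -> float:
    """Return the torque applied at the midpoint of the body for an action index."""
    return ACTIONS[action]
```

With this encoding the agent picks one of the indices 0, 1, or 2 at every step, and the two non-zero torques have equal size and opposite sign.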

Reward signal: The agent receives a reward of -1 on every step until its feet pass the green line, which encourages it to get there as quickly as possible.
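
The effect of this reward can be seen in a small sketch that sums the reward over one episode. The function names, the use of a single feet-height number per step, and the reward of 0 on the step where the line is passed are assumptions made for the example.

```python
def step_reward(feet_height: float, line_height: float) -> float:
    """Return -1 while the feet are not above the line.

    Returning 0 on the step that passes the line is an assumption.
    """
    return 0.0 if feet_height > line_height else -1.0


def episode_return(feet_heights, line_height: float) -> float:
    """Sum the per-step rewards, stopping once the feet pass the line."""
    total = 0.0
    for height in feet_heights:
        reward = step_reward(height, line_height)
        total += reward
        if reward == 0.0:  # the feet are above the line, so the episode ends
            break
    return total
```

Under these assumptions, an episode that needs two steps below the line before passing it has a return of -2, while one that needs ten such steps has a return of -10, so shorter episodes are always better for the agent.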



To adjust the timing, first click on the task window.