ALE v0.10
In v0.10, ALE has its own dedicated website, https://ale.farama.org/, and the Atari documentation has been moved there from Gymnasium.
We have moved the project's main code from src into src/ale to help incorporate ALE into C++ projects. In the Python API, we have updated get_keys_to_action to work with gymnasium.utils.play by changing the key for no-op from None to the e key.
Furthermore, @jjshoots and @psc-g have updated the API to support continuous actions; see https://arxiv.org/pdf/2410.23810 for the impact.
Previously, users could interact with the ALE interface with only discrete actions linked to joystick controls, i.e.:
- All left actions (LEFTDOWN, LEFTUP, LEFT...) -> paddle left max
- All right actions (RIGHTDOWN, RIGHTUP, RIGHT...) -> paddle right max
- Up... etc.
- Down... etc.
However, for games using paddles, this loses the ability to specify non-max values for moving left or right. Therefore, this release adds the ability to use continuous actions to both the Python and C++ interfaces. This only affects environments with paddles; other environments cannot make use of the change.
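To illustrate the difference, here is a minimal sketch of discrete versus continuous paddle control. The function name paddle_command and the normalised [-1, 1] deflection scale are assumptions made for this example only; they are not part of ALE.

```python
def paddle_command(direction: int, strength: float = 1.0) -> float:
    """Return a signed paddle deflection as a fraction of the maximum.

    direction: -1 for left, +1 for right, 0 for no movement.
    strength:  fraction of the maximum deflection, clamped to [0, 1].
    The name and the normalised scale are illustrative assumptions,
    not part of ALE.
    """
    if direction not in (-1, 0, 1):
        raise ValueError("direction must be -1, 0 or 1")
    clamped = min(max(strength, 0.0), 1.0)
    return direction * clamped


# Discrete-only control: every left action is a full deflection.
discrete_left = paddle_command(-1)
# Continuous control: the same direction at a quarter of the maximum.
gentle_left = paddle_command(-1, strength=0.25)
```

With discrete actions only the first call is expressible; the strength argument is what makes the second one possible.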
C++ interface changes
Old Discrete ALE interface
reward_t ALEInterface::act(Action action)
New Mixed Discrete-Continuous ALE interface
reward_t ALEInterface::act(Action action, float paddle_strength = 1.0)
Games where the paddle is not used simply have the paddle_strength parameter ignored.
This mirrors the real-world scenario where you have a paddle connected, but the game doesn't react to it when the paddle is turned.
This maintains backwards compatibility.
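The backwards-compatibility argument can be sketched in Python with a stand-in class. FakeInterface and its uses_paddles flag are hypothetical and exist only for this illustration; the real interface is the C++ ALEInterface shown above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeInterface:
    """Hypothetical stand-in for an emulator interface; not ALE's API."""

    uses_paddles: bool
    applied: list = field(default_factory=list)

    def act(self, action: int, paddle_strength: float = 1.0) -> int:
        # Joystick-only games drop the strength, like a console that
        # ignores a connected but unused paddle.
        effective = paddle_strength if self.uses_paddles else None
        self.applied.append((action, effective))
        return 0  # placeholder reward


paddle_game = FakeInterface(uses_paddles=True)
joystick_game = FakeInterface(uses_paddles=False)

paddle_game.act(3)         # old-style call: strength defaults to 1.0
paddle_game.act(3, 0.4)    # new-style call with a partial strength
joystick_game.act(3, 0.4)  # strength is accepted but ignored
```

Because the new parameter has a default of 1.0, existing single-argument calls behave as they did before the change.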
Python interface changes
Old Discrete ALE Python Interface
ale.act(action: int)
New Mixed Discrete-Continuous ALE Python Interface
ale.act(action: int, strength: float = 1.0)
The continuous action space is implemented at the Python level within the Gymnasium environment.
if continuous:
    # action is expected to be a [3,] array of floats: (radius, theta, fire)
    # convert the polar joystick position to Cartesian coordinates
    x, y = action[0] * np.cos(action[1]), action[0] * np.sin(action[1])
    # each axis is -1, 0 or +1 depending on which side of the dead zone it falls
    action_idx = self.map_action_idx(
        left_center_right=(
            -int(x < -self.continuous_action_threshold)
            + int(x > self.continuous_action_threshold)
        ),
        down_center_up=(
            -int(y < -self.continuous_action_threshold)
            + int(y > self.continuous_action_threshold)
        ),
        fire=(action[-1] > self.continuous_action_threshold),
    )
    # the radius (action[0]) is passed as the paddle strength
    ale.act(action_idx, action[0])
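The thresholding step can be tried in isolation with the sketch below. It rests on several assumptions: the action is a (radius, theta, fire) triple, the dead zone is symmetric (left and down trigger below the negated threshold), the radius is used as the strength, and the default threshold is 0.5. The function polar_to_joystick is a hypothetical helper, not part of ale_py.

```python
import math


def polar_to_joystick(radius, theta, fire, threshold=0.5):
    """Map a polar joystick action to discrete axes plus a strength.

    Returns (left_center_right, down_center_up, fire_pressed, strength),
    where each axis value is -1, 0 or +1. Hypothetical helper written
    for illustration; the default threshold of 0.5 is an assumption.
    """
    x = radius * math.cos(theta)
    y = radius * math.sin(theta)
    left_center_right = -int(x < -threshold) + int(x > threshold)
    down_center_up = -int(y < -threshold) + int(y > threshold)
    return left_center_right, down_center_up, fire > threshold, radius


full_right = polar_to_joystick(1.0, 0.0, 0.0)
left_and_fire = polar_to_joystick(1.0, math.pi, 1.0)
inside_dead_zone = polar_to_joystick(0.3, 0.0, 0.0)
up_right = polar_to_joystick(1.0, math.pi / 4, 0.0)
```

A radius below the threshold stays inside the dead zone and maps to the centre on both axes, while the radius itself is still available as the strength value.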
Full Changelog: v0.9.1...v0.10.0