2048
Usage
Alternatively, you can load the Play2048 class directly.
Description
2048 ...
Specs
| Name | Value |
|---|---|
| Version | v2 |
| Number of players | 1 |
| Number of actions | 4 |
| Observation shape | (4, 4, 31) |
| Observation type | bool |
| Rewards | {0, 2, 4, ...} |
Observation
Our observation design largely follows [Antonoglou+22]:
In our 2048 experiments we used a binary representation of the observation as an input to our model. Specifically, the 4 × 4 board was flattened into a single vector of size 16, and a binary representation of 31 bits for each number was obtained, for a total size of 496 numbers.
However, instead of a 496-dimensional flat vector, we use an array of shape (4, 4, 31).
| Index | Description |
|---|---|
| [i, j, b] | represents that square (i, j) has a tile of 2 ^ b if b > 0 |
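The indexing above can be illustrated with a minimal sketch in plain Python (not the library's JAX implementation). It assumes the board is stored as a 4 × 4 grid of exponents b, that 0 marks an empty square, and that an empty square sets index 0 of its vector; the helper name `encode_observation` is hypothetical.

```python
# Sketch only: one-hot encode a 4x4 board of exponents into a (4, 4, 31) boolean array.
# Assumptions (not confirmed by the documentation): board[i][j] holds the exponent b
# of the tile 2 ** b, 0 means "empty", and an empty square sets index 0.

NUM_BITS = 31  # size of the last observation axis


def encode_observation(board):
    """Return nested lists obs where obs[i][j][b] is True iff square (i, j) has exponent b."""
    obs = [[[False] * NUM_BITS for _ in range(4)] for _ in range(4)]
    for i in range(4):
        for j in range(4):
            obs[i][j][board[i][j]] = True
    return obs


board = [
    [1, 0, 0, 0],   # a 2 tile (2 ** 1) in the top-left corner
    [0, 2, 0, 0],   # a 4 tile (2 ** 2)
    [0, 0, 0, 0],
    [0, 0, 0, 11],  # a 2048 tile (2 ** 11)
]
obs = encode_observation(board)
```

Here `obs[3][3][11]` is `True` because square (3, 3) holds the tile 2 ^ 11, and exactly one entry is set per square.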
Action
Actions are encoded as integers: 0 (left), 1 (up), 2 (right), 3 (down).
Rewards
Sum of merged tiles.
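As a rough illustration of how the action encoding and the reward fit together, the sketch below applies one move to a board of tile values (not exponents) and returns the sum of the tiles created by merges. It is plain Python under stated assumptions, not the library's code: the helper names are hypothetical, the standard 2048 merge rule is assumed, and the random tile spawned after each move is omitted.

```python
# Sketch only: apply one 2048 move and compute the reward.
# Assumptions: board holds tile values (0 = empty), each tile merges at most once
# per move, and the reward is the total value of the tiles produced by merges.


def merge_row_left(row):
    """Slide one row to the left; return (new_row, reward)."""
    tiles = [v for v in row if v != 0]  # drop empty squares
    merged, reward, k = [], 0, 0
    while k < len(tiles):
        if k + 1 < len(tiles) and tiles[k] == tiles[k + 1]:
            merged.append(tiles[k] * 2)  # two equal neighbours become one tile
            reward += tiles[k] * 2       # the merged pair contributes its sum
            k += 2
        else:
            merged.append(tiles[k])
            k += 1
    return merged + [0] * (len(row) - len(merged)), reward


def transpose(board):
    return [list(col) for col in zip(*board)]


def apply_action(board, action):
    """action: 0 (left), 1 (up), 2 (right), 3 (down). Returns (new_board, reward)."""
    if action in (1, 3):  # vertical moves: work on columns
        board = transpose(board)
    if action in (2, 3):  # right/down: flip so the move becomes a left slide
        board = [row[::-1] for row in board]
    rows, total = [], 0
    for row in board:
        new_row, reward = merge_row_left(row)
        rows.append(new_row)
        total += reward
    if action in (2, 3):  # undo the flip
        rows = [row[::-1] for row in rows]
    if action in (1, 3):  # undo the transpose
        rows = transpose(rows)
    return rows, total


board = [
    [2, 2, 4, 4],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 0],
]
left_board, left_reward = apply_action(board, 0)
up_board, up_reward = apply_action(board, 1)
```

Moving left merges the two 2s and the two 4s in the top row, so `left_reward` is 4 + 8 = 12; moving up merges only the two 2s in the first column, so `up_reward` is 4.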
Termination
If all squares are filled with tiles and no legal actions are available, the game terminates.
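The termination condition can be checked without simulating every move. The following is a minimal plain-Python sketch, assuming a board of tile values with 0 for empty squares; the function name `is_terminal` is hypothetical. It relies on the observation that a full board still has a legal move exactly when two equal tiles are horizontally or vertically adjacent.

```python
# Sketch only: detect the end of a 2048 game.
# Assumption: board holds tile values and 0 marks an empty square.


def is_terminal(board):
    """Return True iff the board is full and no two adjacent tiles are equal."""
    for i in range(4):
        for j in range(4):
            if board[i][j] == 0:
                return False  # an empty square means some move is still legal
            if j + 1 < 4 and board[i][j] == board[i][j + 1]:
                return False  # horizontal merge available
            if i + 1 < 4 and board[i][j] == board[i + 1][j]:
                return False  # vertical merge available
    return True


stuck = [
    [2, 4, 2, 4],
    [4, 2, 4, 2],
    [2, 4, 2, 4],
    [4, 2, 4, 2],
]
mergeable = [
    [2, 4, 2, 4],
    [4, 2, 4, 2],
    [2, 4, 2, 4],
    [4, 2, 4, 4],  # the last two tiles can still merge
]
```

With these boards, `is_terminal(stuck)` is `True` and `is_terminal(mergeable)` is `False`.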
Version History
- v2: Two updates (v2.0.0)
  - Fix legal_action_mask by @sotetsuk in #1049
  - Specify rng key explicitly (API v2) by @sotetsuk in #1058
- v1: Fix reward overflow bug by @sotetsuk in #1034 (v1.4.0)
- v0: Initial release (v1.0.0)
Reference
[Antonoglou+22] "Planning in Stochastic Environments with a Learned Model", ICLR