2048

darklight

Usage

import pgx

env = pgx.make("2048")

or you can directly load Play2048 class

from pgx.paly2048 import Play2048

env = Play2048()

Description

2048 ...

Wikipedia

Specs

Name	Value
Version	`v0`
Number of players	`1`
Number of actions	`4`
Observation shape	`(4, 4, 31)`
Observation type	`bool`
Rewards	`{0, 2, 4, ...}`

Observation

Our obseervation design basically follows [Antonoglou+22]:

In our 2048 experiments we used a binary representation of the observation as an input to our model. Specifically, the 4 × 4 board was flattened into a single vector of size 16, and a binary representation of 31 bits for each number was obtained, for a total size of 496 numbers.

However, instaead of 496-d flat vector, we employ (4, 4, 31) vector.

Index	Description
`[i, j, b]`	represents that square `(i, j)` has a tile of `2 ^ b` if `b > 0`

Action

Each action corresnponds to 0 (left), 1 (up), 2 (right), 3 (down).

Rewards

Sum of merged tiles.

Termination

If all squares are filled with tiles and no legal actions are available, the game terminates.

Version History

v2: Two updates (v2.0.0)
Fix legal_action_mask @sotetsuk in #1049
Specify rng key explicitly (API v2) by @sotetsuk in #1058
v1 : Fix reward overflow bug by @sotetsuk in #1034 (v1.4.0)
v0 : Initial release (v1.0.0)

Reference

[Antonoglou+22] "Planning in Stochastic Environments with a Learned Modell", ICLR