cff-version: 1.2.0
abstract: "
Chess Recognition Dataset (ChessReD)
The Chess Recognition Dataset (ChessReD) comprises a diverse collection of images of chess formations captured using smartphone cameras, a sensor choice made to ensure real-world applicability.
It was collected by capturing images of chessboards with various chess piece configurations. Chess opening theory was used to guarantee the variability of those configurations. In particular, the Encyclopaedia of Chess Openings (ECO) classifies opening sequences into five volumes with 100 subcategories each, and each subcategory is uniquely identified by an ECO code. Twenty ECO codes were selected from each volume. Each code in this set was then randomly matched to a previously played chess game that followed the opening sequence denoted by that code, creating a set of 100 chess games. Finally, using the move-by-move information in the Portable Game Notation (PGN) records of these games, the selected games were played out on a physical chessboard, with an image captured after each move.
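The selection of ECO codes described above can be sketched as follows. This is only an illustration: the description does not state how the 20 codes per volume were chosen, so uniform random sampling, the function name sample_eco_codes, and the seed parameter are all assumptions.

```python
import random


def sample_eco_codes(per_volume=20, seed=0):
    # Hypothetical helper: draw `per_volume` ECO codes from each volume.
    # Uniform random sampling is an assumption, not the documented method.
    rng = random.Random(seed)
    selected = []
    for volume in 'ABCDE':  # the five ECO volumes
        # Each volume has 100 subcategories, e.g. A00 to A99.
        codes = [volume + format(i, '02d') for i in range(100)]
        selected += sorted(rng.sample(codes, per_volume))
    return selected


eco_codes = sample_eco_codes()
```

With the default arguments this returns 100 distinct codes, 20 per volume, one for each game in the dataset.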
Three distinct smartphone models were used to capture the images. Each model has different camera specifications, such as resolution and sensor type, that introduce further variability in the dataset. The images were also taken from diverse angles, ranging from top-view to oblique angles, and from different perspectives (e.g., the white player's perspective or a side view). These conditions simulate real-world scenarios where chessboards can be captured from a bystander's arbitrary point of view. Additionally, the dataset includes images captured under different lighting conditions, with both natural and artificial light sources introducing these variations.
The dataset is accompanied by detailed annotations describing the chess piece formation in each image. Each annotation corresponds to one piece, so the number of annotations for an image depends on the number of chess pieces depicted in it. There are 12 category ids in total (i.e., 6 piece types per colour) and the chessboard coordinates are in the form of algebraic notation strings (e.g., "a8"). These annotations were automatically extracted from the Forsyth-Edwards Notation (FEN) strings that were available from the games' PGNs. Each FEN string describes the state of the chessboard after a move, using algebraic notation for the piece types (e.g., "N" is knight), capitalization for the piece colours (i.e., white pieces are denoted with uppercase letters, while black pieces with lowercase letters), and digits to denote the number of empty squares. Thus, by matching the captured images to the corresponding FENs, the state of the chessboard in each image was already known and annotations could be extracted. To further facilitate research in the chess recognition domain, bounding-box and chessboard corner annotations are also provided for a subset of 20 chess games. The corners are annotated based on their location on the chessboard (e.g., "bottom-left") with respect to the white player's view. Distinguishing between the corner types provides information about the orientation of the chessboard that can be leveraged to determine the image's perspective and viewing angle.
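As an illustration of how piece annotations can be derived from a FEN string, the sketch below expands the piece-placement field into one record per piece. The function name fen_to_annotations, the record keys, and the category labels such as 'white-knight' are hypothetical and do not reflect the dataset's actual annotation schema or category ids.

```python
# Illustrative piece names; the dataset's real category ids may differ.
PIECE_NAMES = {'p': 'pawn', 'n': 'knight', 'b': 'bishop',
               'r': 'rook', 'q': 'queen', 'k': 'king'}


def fen_to_annotations(fen):
    # The first space-separated FEN field holds the piece placement.
    placement = fen.split()[0]
    annotations = []
    for rank_index, row in enumerate(placement.split('/')):
        rank = 8 - rank_index  # FEN lists rank 8 first
        file_index = 0
        for symbol in row:
            if symbol.isdigit():
                file_index += int(symbol)  # skip a run of empty squares
                continue
            # Uppercase letters are white pieces, lowercase are black.
            colour = 'white' if symbol.isupper() else 'black'
            square = 'abcdefgh'[file_index] + str(rank)
            annotations.append({
                'square': square,
                'category': colour + '-' + PIECE_NAMES[symbol.lower()],
            })
            file_index += 1
    return annotations


start = 'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1'
pieces = fen_to_annotations(start)
```

For the standard starting position this yields 32 records, one per piece, which matches the statement that the number of annotations depends on the number of pieces on the board.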
Dataset specifications
The dataset consists of 100 chess games, each with an arbitrary number of moves and therefore of images, for a total of 10,800 images. It was split into training, validation, and test sets following a 60/20/20 split, which led to a total of 6,479 training images, 2,192 validation images, and 2,129 test images. Since two consecutive images of a chess game differ only by one move, the split was performed at the game level to ensure that very similar images would not end up in different sets. The split was also stratified over the three distinct smartphone cameras (Apple iPhone 12, Huawei P40 Pro, Samsung Galaxy S8) that were used to capture the images. The three smartphone cameras introduced variations to the dataset based on the distinct characteristics of their sensors. For instance, while the image resolution for the Huawei phone was 3072x3072, the resolution for the remaining two models was 3024x3024.
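A game-level split stratified by camera can be sketched as below. This is a minimal example under stated assumptions: the function name split_games, the input mapping from game id to camera label, and the rounding rule are hypothetical, and the sketch assumes each game was photographed with a single camera, which the description does not state.

```python
import random
from collections import defaultdict


def split_games(game_cameras, fractions=(0.6, 0.2, 0.2), seed=0):
    # game_cameras: hypothetical mapping of game id -> camera label.
    rng = random.Random(seed)
    by_camera = defaultdict(list)
    for game_id, camera in sorted(game_cameras.items()):
        by_camera[camera].append(game_id)

    splits = {'train': [], 'val': [], 'test': []}
    for camera in sorted(by_camera):
        games = by_camera[camera]
        rng.shuffle(games)
        # Split whole games within each camera group, so that no game
        # contributes images to more than one set.
        n_train = round(fractions[0] * len(games))
        n_val = round(fractions[1] * len(games))
        splits['train'] += games[:n_train]
        splits['val'] += games[n_train:n_train + n_val]
        splits['test'] += games[n_train + n_val:]
    return splits


# Toy input: 30 games, 10 per camera.
cameras = ['iphone12', 'p40pro', 'galaxys8']
toy = {game: cameras[game % 3] for game in range(30)}
toy_splits = split_games(toy)
```

Because whole games are assigned to a set and games differ in length, the image counts only approximately follow the game-level ratio, which is consistent with the 6,479/2,192/2,129 image counts reported above.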
While annotations of the piece positions in algebraic notation are available for every image in the dataset, bounding-box and chessboard corner annotations are provided only for a subset of 20 games randomly selected from the training, validation, and test sets. For this subset, a 70/15/15 split stratified over the smartphone cameras was followed, which led to a total of 14 training games (1,442 images), 3 validation games (330 images), and 3 test games (306 images) being annotated. This subset of ChessReD is denoted as ChessReD2K.
"
authors:
  - family-names: Masouris
    given-names: Athanasios
title: "Chess Recognition Dataset (ChessReD)"
keywords:
version: 2
identifiers:
  - type: doi
    value: 10.4121/99b5c721-280b-450b-b058-b2900b69a90f.v2
license: CC BY-NC-SA 4.0
date-released: 2023-09-04