On anti-cheating in chess, science, reproducibility, and variability

Mathieu Acher

The chess fairy tale:
● The Queen's Gambit (Netflix)
● streaming/YouTube
● investments (chess.com, chess24, Play Magnus, lichess, etc.)
● AlphaZero didn’t kill the interest

Cheating threatens the chess fairy tale. It was almost
forgotten (or hidden), but it is now coming back…

Cheating threatens the chess fairy tale. How does it work?
By using chess engines during play.

Chess engines like Stockfish or AlphaZero-like variants can
give the cheater a decisive advantage, since almost
perfect moves can be played.
Elo rating system: a method for calculating
the relative skill levels of players
Stockfish is rated 3500+
Magnus Carlsen (world champion) is rated 2834
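
To give a feel for what such a rating gap means, here is a minimal sketch of the standard Elo expected-score formula applied to the two ratings quoted above. The function name is illustrative. Engine ratings and human FIDE ratings come from different rating pools, so treat the result as an order of magnitude only.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model.

    A win counts 1, a draw 0.5, a loss 0.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


carlsen = 2834    # rating quoted on the slide
stockfish = 3500  # lower bound quoted on the slide ("3500+")

# Roughly 0.02: about two points expected out of a hundred games.
print(round(expected_score(carlsen, stockfish), 3))
```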

As early as 2006, Topalov accused Kramnik of cheating during the
world championship match.

The Sébastien Feller case at the 2010 Chess Olympiad
https://en.wikipedia.org/wiki/S%C3%A9bastien_Feller
In October 2010, Feller scored 6/9 (+5 =2 -2) during the 39th Chess Olympiad and won the Gold
medal for best individual performance on board 5. However, the FFE accused Feller, along with
French players GM Arnaud Hauchard and IM Cyril Marzolo, of cheating during the Olympiad. While
Feller was in the playing hall, Marzolo was in France where he checked the best moves with a
chess computer. Marzolo then allegedly sent the move in coded pairs of numbers by text message
to Hauchard. Once Hauchard had the suggested move, he would position himself in the hall behind
one of the other players’ tables in a predefined coded system, where each table represented a
move to play. The FFE claims, in all, 200 text messages were sent during the tournament. The
scam was supposedly uncovered by FFE vice-president Joanna Pomian.

Not guilty ;-)

With the rise of online games and $$$, cheating is even
more problematic.

A few weeks ago, Magnus Carlsen
https://en.wikipedia.org/wiki/Magnus_Carlsen accused
Hans Niemann (HN) of cheating over the board and
refused to ever play him again.
During the Sinquefield Cup in September 2022, a controversy arose between
chess grandmasters Magnus Carlsen, the current world champion, and Hans
Niemann. Carlsen, after surprisingly losing in their matchup, dropped out of the
tournament. Many interpreted his withdrawal as an insinuation of an accusation
that Niemann cheated. In their next tournament meetup, Carlsen abruptly resigned
after one move, perplexing observers again. It became the most serious scandal
about cheating allegations for international chess in years and garnered significant
attention in the news media worldwide.

You may have seen headlines about "anal beads" that were supposedly
used to help HN.

My aim is not to talk about chess (and cheating) as such.
Rather, I want to discuss how science has been (and will
be) at the heart of the anti-cheating problem in chess. It’s an
interesting case.

How to detect cheaters?
The basic idea is to compare the moves played by humans (players) with those of
computer engines. The more you play like a computer…

Rigorous methods with backed-up claims
vs
Fancy analyses that can ruin people's lives
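
The basic idea of comparing human moves with engine moves can be sketched with two simple measures: the share of moves that match the engine's first choice, and the average evaluation loss in centipawns. This is only an illustration. The function names and the data are made up, and neither measure is Regan's ROI or chess.com's score. In practice the engine moves and evaluations would come from running an engine on each position.

```python
from statistics import mean


def engine_match_rate(played, engine_best):
    """Fraction of positions where the played move equals the engine's first choice."""
    if len(played) != len(engine_best):
        raise ValueError("one engine move is needed per played move")
    if not played:
        return 0.0
    return sum(p == e for p, e in zip(played, engine_best)) / len(played)


def average_centipawn_loss(best_evals, played_evals):
    """Mean evaluation drop (centipawns) between the engine's best move and the played move."""
    losses = [max(0, best - actual) for best, actual in zip(best_evals, played_evals)]
    return mean(losses) if losses else 0.0


# Made-up illustration data in UCI move notation, not taken from any real game.
played = ["e2e4", "g1f3", "f1b5", "e1g1", "d2d3"]
engine_best = ["e2e4", "g1f3", "f1c4", "e1g1", "d2d4"]

print(engine_match_rate(played, engine_best))  # 3 matching moves out of 5
```

A high match rate alone proves nothing: strong players often find the engine's first choice, and forced moves match trivially.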

I will first argue that many people (chess
hobbyists/experts, data nerds, etc.), most of them
non-scientists, have actually done science in trying to
prove or refute the cheating case.

With the sharing of data (analyses of chess games such as
those played by HN), scripts, and methods, numerous
results and conclusions have emerged and been popularized
through social media (Twitch, YouTube, Twitter, etc.)

On the one hand, I've been quite excited to see all this
energy spent on advancing our understanding and
proposing interesting ideas/analyses.
On the other hand, some analyses have been of poor quality, or have relied on closed
systems to compute unclear metrics:
● Let’s Check Analysis is based on a proprietary system: undefined, opaque, non-reproducible
● Dubious combination of probabilities
● More comparisons with other players are needed... it is too easy to find a "pattern" in HN's games, and then
compare it with a single player and a single data point!
● Analyses are sensitive to the engine and its search depth
In between, there has been a report by chess.com and the analysis of computer scientist Ken Regan
https://cse.buffalo.edu/~regan/ the world-renowned specialist.
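
One reason it is too easy to find a "pattern": if many metrics, game subsets, or engine settings are tried, some of them will look suspicious by chance alone. A minimal sketch, assuming independent tests that each have the same false-positive rate (a simplification; real analyses are usually correlated):

```python
def chance_of_spurious_pattern(alpha: float, n_tests: int) -> float:
    """Probability that at least one of n_tests independent tests
    raises a false alarm, when each has false-positive rate alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# With a 5% false-positive rate per test, trying 20 different analyses
# gives roughly a 64% chance of at least one "suspicious" result.
print(round(chance_of_spurious_pattern(0.05, 20), 2))
```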

I still think the problem is open (e.g., Regan's method is too conservative and
misses many cases; chess.com's methodology, though unclear and opaque at
some points, is certainly effective for online cheating, but not for over-the-board
detection).
I will present a variability model of the space of experiments/methods that can be
considered to address the problem.

Dataset
● only HN games
● a few grandmasters for comparison (Carlsen, Nakamura, …)
● rising stars
● cheaters
● all games on the planet? OTB? online?

Dimensions so far: Dataset, Preprocess
Preprocess:
● consider all moves
● ignore opening moves: ignore the first N moves (e.g., N=8) or use ECO (Encyclopaedia of Chess Openings)
● ignore endgames
● “critical” positions only (note: define critical)
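
The "ignore the first N moves" option could look like the sketch below. It assumes that N counts full moves (one move by White and one by Black), which the slide does not specify, and that a game is given as a flat list of half-moves (plies). The function name is illustrative.

```python
def ignore_first_moves(plies, n=8):
    """Drop the first n full moves, i.e. the first 2 * n plies of the game.

    Assumption: n counts full moves (White + Black), not single plies.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return plies[2 * n:]


# A made-up 20-move game (40 plies), represented here by ply indices only.
game = list(range(40))
print(len(ignore_first_moves(game, n=8)))  # 40 - 16 = 24 plies remain
```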

Dimensions so far: Dataset, Preprocess, Metrics
Metrics:
● ROI from Regan
● Let’s Check
● Compare with strength “profile” (e.g., Elo rating at that time)
● Strength score (chess.com)

Dimensions: Dataset, Preprocess, Metrics, Engine
Engine:
● depth
● version

This model can be used to
● steer the collaborative effort,
● reproduce, replicate or reject some experiments,
● and gain confidence in the robustness of some conclusions.
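
As a rough sketch, the space of experiments described by the model can be enumerated as the Cartesian product of its dimensions. The option names below paraphrase the slides. The engine versions and depth values are placeholders, not values from the talk. A real variability model would also carry constraints and allow combined options; this sketch picks exactly one option per dimension.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Experiment:
    dataset: str
    preprocess: str
    metric: str
    engine_version: str
    depth: int


DATASETS = ["HN only", "HN + a few grandmasters", "rising stars",
            "known cheaters", "all games"]
PREPROCESS = ["all moves", "ignore first N moves", "ignore ECO openings",
              "ignore endgames", "critical positions only"]
METRICS = ["ROI (Regan)", "Let's Check", "strength profile vs Elo",
           "strength score (chess.com)"]
ENGINE_VERSIONS = ["engine-A", "engine-B"]  # placeholders, not from the talk
DEPTHS = [10, 20, 30]                       # illustrative values only


def all_experiments():
    """Enumerate every combination of one option per dimension."""
    return [Experiment(*combo)
            for combo in product(DATASETS, PREPROCESS, METRICS,
                                 ENGINE_VERSIONS, DEPTHS)]


experiments = all_experiments()
print(len(experiments))  # 5 * 5 * 4 * 2 * 3 = 600 configurations
```

Even with these few options the space has hundreds of configurations, so a conclusion that holds for only one of them deserves little confidence.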

Different methods, different assumptions, different analyses, different data

The obvious: OPEN, reproducible
But open in a security context?
Haematocrit in cycling (EPO; threshold <= 50% in the 2000s)
Goodhart's law: "When a measure becomes a target, it ceases to be a good
measure"
ROI from Regan is unclear/not well-defined… why? Perhaps because ROI could
then be systematically computed, monitored, and thus optimized by cheaters

A last point I want to discuss: most probably, the anti-cheating chess
problem cannot be resolved solely with retrospective computational analysis.
It's just too uncertain, especially if cheaters are "smart".
(Cyber-)security experts, psychologists, chess players, and of course computer
science nerds/professionals can all contribute to addressing this multidisciplinary
problem.
