Algorithm That Mastered 'Pong' Now Excellent at 'Flappy Bird', Still Single

All you need for a super-human score are smart rewards and iteration.

by Ben Guarino

Improving on a deep-learning method pioneered for Pong, Space Invaders, and other Atari games, Stanford University computer science student Kevin Chen has created an algorithm that's quite good at the classic 2014 side-scroller Flappy Bird. Chen leveraged a concept known as "Q-learning," in which an agent aims to improve its reward score with each iteration of playing, to master a nearly impossible and impossibly addictive game.
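The Q-learning idea is simple to state: after each action, the agent nudges its estimate of that action's value toward the reward it just received plus the discounted value of the best action available next. Below is a minimal tabular sketch of that update rule; the states, actions, and constants are illustrative assumptions, not details from Chen's work, which learned from raw pixels with a neural network rather than a table.

```python
# Minimal tabular Q-learning sketch (illustrative; not Chen's actual code,
# which used a deep network over pixels rather than a lookup table).

ALPHA = 0.1   # learning rate (assumed value)
GAMMA = 0.99  # discount factor (assumed value)

def q_update(q_table, state, action, reward, next_state, actions):
    """One Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

# Example: a single update after clearing a pipe (reward +1) from an
# initially empty table. Hypothetical state/action names.
q = {}
q_update(q, state="gap_ahead", action="flap", reward=1.0,
         next_state="gap_ahead", actions=["flap", "idle"])
print(q[("gap_ahead", "flap")])  # 0.1
```

Repeated over many episodes, these small nudges propagate reward information backward through time, so the agent learns which flaps lead, eventually, to passing pipes.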

Chen created a system wherein his algorithm was optimized to seek three rewards: a small positive reward for each frame it stayed alive, a large reward for passing through a pipe, and an equally large (but negative) reward for dying. Thus motivated, the so-called deep Q-network can outplay humans, according to the report Chen authored: "We were able to successfully play the game Flappy Bird by learning straight from the pixels and the score, achieving super-human results."
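That three-part reward scheme can be sketched in a few lines. The structure below follows the article's description; the exact magnitudes are assumptions for illustration, not values from Chen's report.

```python
# Illustrative reward scheme matching the article's description.
# The specific magnitudes are assumed, not taken from Chen's report.

ALIVE_REWARD = 0.1   # small positive reward per frame survived (assumed)
PIPE_REWARD = 1.0    # large reward for clearing a pipe (assumed)
DEATH_REWARD = -1.0  # equally large negative reward for dying

def frame_reward(passed_pipe: bool, died: bool) -> float:
    """Return the reward the agent receives for one frame of play."""
    if died:
        return DEATH_REWARD
    if passed_pipe:
        return PIPE_REWARD
    return ALIVE_REWARD

print(frame_reward(passed_pipe=False, died=False))  # 0.1
print(frame_reward(passed_pipe=True, died=False))   # 1.0
print(frame_reward(passed_pipe=False, died=True))   # -1.0
```

The per-frame survival bonus matters: without it, the agent would see no feedback at all until its first pipe or its first crash, making early learning far slower.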

The original Atari paper, published in 2015 in Nature, came from the Google-owned DeepMind company (now famous for its mastery of the ancient Chinese board game Go). The DeepMind accomplishment was a breakthrough in that it took visual — or at least pixel — information and, with minimal input, was able to maximize rewards. Such a reward system has been likened to a simplified version of the brain's dopaminergic response.

It's not the first time an algorithm has conquered the flapping bird: an earlier class of Stanford University computer science students created a program that, after training overnight, improved its score from zero pipes passed to 1,600.
