Software Engineer Trains AI to Play Pokemon Red with 40,000+ Hours of Reinforcement Learning

Software engineer Peter Whidden trained reinforcement learning agents to play Pokemon Red over more than 40,000 hours of simulated game time. The reward scheme gave the AI points for leveling up its Pokemon, exploring new areas and battling gym leaders.
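To make the reward idea concrete, here is a minimal Python sketch of how points for levels, exploration and gym progress could be combined into a per-step reward. It is not Whidden's code: the `GameState` fields, the three weights, and the use of badges as a stand-in for gym leader battles are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameState:
    """Hypothetical snapshot of progress; field names are illustrative."""
    total_levels: int  # sum of levels across the party
    areas_seen: int    # number of distinct areas visited so far
    badges: int        # gym badges earned (assumed proxy for gym battles)


# Illustrative weights, not values from the original project.
LEVEL_WEIGHT = 1.0
EXPLORE_WEIGHT = 0.1
BADGE_WEIGHT = 5.0


def progress_score(state: GameState) -> float:
    """Combine the three progress signals into a single number."""
    return (
        LEVEL_WEIGHT * state.total_levels
        + EXPLORE_WEIGHT * state.areas_seen
        + BADGE_WEIGHT * state.badges
    )


def step_reward(prev: GameState, curr: GameState) -> float:
    """Reward for one step is the increase in the progress score."""
    return progress_score(curr) - progress_score(prev)
```

Rewarding the change in score rather than the score itself means the agent is only paid when something new happens, such as a level gained or an unvisited area reached.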

The agents start with no knowledge of the game and can only press random buttons. Over roughly five years of simulated game time, they learn from past experience which actions pay off. Eventually they play well enough that defeating gym leaders becomes a regular occurrence. You can also run the pre-trained model interactively on your own machine with Python 3.10.

Peter Whidden Training AI to Play Pokemon Red

“By default the game will terminate after 32K steps, or ~1 hour. You can increase this by adjusting the ep_length variable, but it will also use more memory,” said Whidden.
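As a rough illustration of what a step limit like `ep_length` does, the Python sketch below runs an untrained agent that presses random buttons until the limit is reached. Only the `ep_length` name comes from the quote; the `32 * 1024` reading of "32K", the button list and the `run_episode` helper are assumptions, and this is not the project's actual environment code.

```python
import random

EP_LENGTH = 32 * 1024  # assumed reading of "32K steps"
BUTTONS = ["up", "down", "left", "right", "a", "b", "start"]  # illustrative set


def run_episode(ep_length: int = EP_LENGTH, seed: int = 0) -> list[str]:
    """Press random buttons until the step limit ends the episode."""
    rng = random.Random(seed)
    presses = []
    for _ in range(ep_length):
        # An untrained agent has no policy yet, so it picks at random.
        presses.append(rng.choice(BUTTONS))
    return presses
```

In this sketch the recorded history grows linearly with `ep_length`, which is one plausible way a longer episode could use more memory; the real cause in Whidden's setup may differ.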
