Random Bingo Game

This random bingo solver is a project I built to explore reinforcement learning principles. I came across a minigame played on a 5x5 grid, where the player tries to complete as many "bingo" lines as possible with a limited number of moves. On each turn the player picks a tile to flip, and then a random unflipped tile on the board is flipped as well. After 8 player turns (16 flipped tiles in total) the game ends, and the number of bingos determines the score.
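The rules above can be sketched as a small simulation. This is a minimal illustration rather than the project's actual game code: the names `count_bingos`, `play`, and `choose_tile` are hypothetical, and it assumes that the two main diagonals count as bingo lines alongside rows and columns, which the description does not specify.

```python
import random

SIZE = 5          # 5x5 board, as described above
PLAYER_MOVES = 8  # eight player picks, each followed by one random flip


def count_bingos(flipped):
    """Count fully flipped rows, columns, and (by assumption) both diagonals."""
    lines = [[(r, c) for c in range(SIZE)] for r in range(SIZE)]   # rows
    lines += [[(r, c) for r in range(SIZE)] for c in range(SIZE)]  # columns
    lines.append([(i, i) for i in range(SIZE)])                    # main diagonal
    lines.append([(i, SIZE - 1 - i) for i in range(SIZE)])         # anti-diagonal
    return sum(all(cell in flipped for cell in line) for line in lines)


def play(choose_tile, rng):
    """Play one game; choose_tile(flipped) must return an unflipped (row, col)."""
    flipped = set()
    for _ in range(PLAYER_MOVES):
        tile = choose_tile(flipped)
        if tile in flipped:
            raise ValueError(f"tile {tile} is already flipped")
        flipped.add(tile)
        # After the player's pick, one random unflipped tile is flipped too.
        remaining = [(r, c) for r in range(SIZE) for c in range(SIZE)
                     if (r, c) not in flipped]
        flipped.add(rng.choice(remaining))
    return count_bingos(flipped), flipped
```

Passing a seeded `random.Random` instance as `rng` makes individual games reproducible, which helps when comparing strategies.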

To maximize my score in the minigame, I thought it would be fun to implement a reinforcement learning algorithm. I had not had good results with reinforcement learning in the past, and this task seemed simple enough that I could design a reward structure that would be easy to optimize. I knew the game was also simple enough to solve with a dynamic programming algorithm, so I wanted to see whether the reinforcement learning algorithm could find a similar solution.
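For reference, the dynamic programming baseline mentioned above could look roughly like the following expectimax recursion with memoization. This is a sketch under assumptions, not the project's solver: `make_solver` and `value` are hypothetical names, the board is stored as an integer bitmask, diagonals are assumed to count as bingo lines, and the number of moves is assumed to be small enough that an unflipped tile always remains for the random flip.

```python
from functools import lru_cache


def make_solver(size):
    """Build an expected-score function for a size x size board (hypothetical helper)."""
    n = size * size
    lines = []
    for r in range(size):
        lines.append(sum(1 << (r * size + c) for c in range(size)))         # row r
    for c in range(size):
        lines.append(sum(1 << (r * size + c) for r in range(size)))         # column c
    lines.append(sum(1 << (i * size + i) for i in range(size)))             # main diagonal
    lines.append(sum(1 << (i * size + size - 1 - i) for i in range(size)))  # anti-diagonal

    @lru_cache(maxsize=None)
    def value(board, moves_left):
        """Expected final bingo count under optimal play from this state."""
        if moves_left == 0:
            return float(sum((board & m) == m for m in lines))
        best = 0.0
        for pick in range(n):
            if (board >> pick) & 1:
                continue  # already flipped, so not a legal pick
            after_pick = board | (1 << pick)
            free = [j for j in range(n) if not (after_pick >> j) & 1]
            # The random flip is uniform over the remaining unflipped tiles.
            expected = sum(value(after_pick | (1 << j), moves_left - 1)
                           for j in free) / len(free)
            best = max(best, expected)
        return best

    return value
```

On a 2x2 board, `make_solver(2)(0, 1)` evaluates a single player move from an empty board. The full 5x5 game with 8 moves has millions of reachable states, so a pure Python recursion like this one is slow without further optimization.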

Project Goals


Tech Stack


Key Features


Playable Example

[Interactive bingo board: a 5x5 grid showing the flips remaining (8 at the start), the current score, and a per-tile suggestion value from both the RL model and the DP model.]


Challenges

Initially, I struggled to get the algorithm to select unflipped tiles. I had applied a mild penalty for selecting tiles that were already flipped, but it was not enough to change the algorithm's behavior. Increasing the penalty and adding a reward for selecting unflipped tiles allowed the algorithm to learn the optimal strategy.
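A reward function with this shape might look like the sketch below. The constants and the name `shaped_reward` are illustrative assumptions; the actual penalty and bonus values used in the project are not stated here.

```python
# Hypothetical values, chosen only to illustrate the shape of the reward.
INVALID_PENALTY = -5.0  # strong penalty for picking an already-flipped tile
VALID_BONUS = 0.1       # small reward for picking an unflipped tile
BINGO_REWARD = 1.0      # reward per newly completed bingo line


def shaped_reward(tile_was_flipped, bingos_before, bingos_after):
    """Return the reward for one player pick."""
    if tile_was_flipped:
        return INVALID_PENALTY
    return VALID_BONUS + BINGO_REWARD * (bingos_after - bingos_before)
```

The penalty is kept much larger in magnitude than the bonus so that an invalid pick is never worth making, even when a valid pick completes no line.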

My initial implementation of the game in Python was written in a way that wasn't as efficient as it could be. This led to the reinforcement learning model taking a long time to train. I didn't realize this until I started implementing the dynamic programming solution, which also ran extremely slowly. I refactored the game logic to be more efficient and reduced the dynamic programming solution time by a factor of 10. I have not retrained the reinforcement learning model yet, but I expect it to be much faster as well.
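The details of the refactor are not described here. As one possible approach, and purely as an assumption, board logic like this is often sped up by storing the board as a single integer bitmask and precomputing a mask for each line, so that checking a bingo becomes one bitwise operation. The names below are hypothetical, and diagonals are again assumed to count.

```python
SIZE = 5


def _build_line_masks():
    """Precompute one bitmask per row, column, and diagonal."""
    masks = []
    for r in range(SIZE):
        masks.append(sum(1 << (r * SIZE + c) for c in range(SIZE)))         # row r
    for c in range(SIZE):
        masks.append(sum(1 << (r * SIZE + c) for r in range(SIZE)))         # column c
    masks.append(sum(1 << (i * SIZE + i) for i in range(SIZE)))             # main diagonal
    masks.append(sum(1 << (i * SIZE + SIZE - 1 - i) for i in range(SIZE)))  # anti-diagonal
    return tuple(masks)


LINE_MASKS = _build_line_masks()


def flip(board, index):
    """Return a new board with tile `index` (0..24, row-major) flipped."""
    return board | (1 << index)


def is_flipped(board, index):
    """Check whether tile `index` is flipped."""
    return (board >> index) & 1 == 1


def count_bingos(board):
    """A line is complete when every bit of its mask is set in the board."""
    return sum((board & mask) == mask for mask in LINE_MASKS)
```

Because the board is a plain integer, it is hashable and cheap to copy, which also makes it convenient as a key for memoizing a dynamic programming solver.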


Lessons Learned

Through this project, I learned how to integrate machine learning concepts, such as reinforcement learning and multi-objective Bayesian optimization, into a real-world application. The project also improved my frontend skills with React and Chakra UI.


Future Improvements