Solving twisty puzzles using parallel q-learning

Kavish Hukmani, Sucheta Kolekar, Sreekumar Vobugari

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)


There has been a recent trend, brought about by the success of AlphaGo, of teaching agents to solve puzzles and play games using Deep Reinforcement Learning (DRL). While this method has produced some truly groundbreaking results, it is very computationally intensive. This paper evaluates the feasibility of solving Combinatorial Optimization Problems such as Twisty Puzzles using Parallel Q-Learning (PQL). We propose a method using Constant Share-Reinforcement Learning (CSRL) as a more resource-efficient approach and measure the impact of sub-goals built from human knowledge. We attempt to solve three puzzles, the 2x2x2 Pocket Rubik's Cube, the Skewb and the Pyraminx, with and without sub-goals based on popular human solving methods, and compare the results. Our agents solve these puzzles with a 100% success rate after just a few hours of training, far less than the computational time required by previous DRL-based agents. Further, the proposed approach is compared with a Deep Learning-based solution for the 2x2x2 Rubik's Cube and achieves a higher success rate.
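To make the core idea concrete, here is a minimal sketch of tabular Q-learning with a human-inspired sub-goal reward, applied to a toy permutation puzzle. This is purely illustrative and is not the paper's PQL/CSRL implementation: the puzzle (4 pieces, two 3-cycle moves), the sub-goal ("first piece in place"), and all reward values are assumptions chosen for brevity.

```python
import random
from collections import defaultdict

# Toy 'twisty puzzle': states are permutations of 4 pieces; two moves
# (both 3-cycles) generate the group. Illustrative only.
SOLVED = (0, 1, 2, 3)
MOVES = {
    "a": (1, 2, 0, 3),   # cycles positions 0, 1, 2
    "b": (0, 2, 3, 1),   # cycles positions 1, 2, 3
}

def apply(state, move):
    perm = MOVES[move]
    return tuple(state[perm[i]] for i in range(4))

def subgoal(state):
    # Hypothetical human-inspired sub-goal: first piece is in place.
    return state[0] == 0

def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        state = SOLVED
        for _ in range(rng.randint(1, 6)):        # random scramble
            state = apply(state, rng.choice(list(MOVES)))
        for _ in range(20):                       # episode rollout
            if state == SOLVED:
                break
            if rng.random() < eps:                # epsilon-greedy exploration
                move = rng.choice(list(MOVES))
            else:
                move = max(MOVES, key=lambda m: Q[(state, m)])
            nxt = apply(state, move)
            # Solved: +1; sub-goal reached: small shaped bonus; else step cost.
            reward = 1.0 if nxt == SOLVED else (0.01 if subgoal(nxt) else -0.01)
            best_next = max(Q[(nxt, m)] for m in MOVES)
            Q[(state, move)] += alpha * (reward + gamma * best_next - Q[(state, move)])
            state = nxt
    return Q

def solve(Q, state, max_steps=20):
    # Greedy rollout under the learned Q-table.
    path = []
    for _ in range(max_steps):
        if state == SOLVED:
            return path
        move = max(MOVES, key=lambda m: Q[(state, m)])
        path.append(move)
        state = apply(state, move)
    return None
```

In the paper's parallel setting, multiple such learners would share Q-value updates; here a single learner suffices to show how the sub-goal bonus shapes the reward signal.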

Original language: English
Pages (from-to): 1535-1543
Number of pages: 9
Journal: Engineering Letters
Issue number: 4
Publication status: Published - 2021

All Science Journal Classification (ASJC) codes

  • Engineering (all)


