Dang, you are absolutely heartless and have no idea how to train an RL algorithm. I feel really bad for Rocky right now. He was just trying to help and find the best trajectory, and you never appreciated any of his attempts at stepping towards an optimal policy. After not receiving any rewards for matching up any sexual panda partners, he gave up all hope and has stopped learning any new tricks, no matter what sort of data you feed him! Wow, you suck at this!
If you enjoyed this choose-your-own-adventure about designing and training an ML algorithm to increase the domestic production of panda bears and would like to read more of our hard-hitting scientific articles, please like, share, and subscribe with your email, our Twitter handle (@JABDE6), our Facebook group here, or the Journal of Immaterial Science subreddit for weekly content. For real, you should follow just to know when our book comes out! ALSO BOOK COMING SOON!