You Chose to Use a Dense Reward for Rocky

Rocky has been training on data for a while and still hasn’t managed to match up any sexually compatible pandas. It’s a very complicated search space, and Rocky isn’t exactly the fastest learner. Maybe it was a bad initialization. Matching up pandas, sexually speaking, isn’t the easiest problem in the world. Their tastes are incredibly niche, and after years of Tinder I can really relate to that one… What do you do?

Keep training Rocky

Give Rocky treats if the pandas at least get to the second date

Published by B McGraw

B McGraw has lived a long and successful professional life as a software developer and researcher. After completing his BS in spaghetti coding at the department of the dark arts at Cranberry Lemon in 2005, he wasted no time in getting a master's in debugging by print statement in 2008 and obtaining his PhD with research in screwing up repos on GitHub in 2014. That's when he could finally get paid. In 2018, B McGraw finally took the big step of defaulting on his student loans and began advancing his career by adding his name to other people's research papers after finding one grammatical mistake during peer review.
