Wow, good thing you decided to clean all that out. You found some really problematic artifacts in there. Some doofus put the timestamps for the panda birthdays in Excel, and you began optimizing on hundred-year-old pandas before you found the mistake! There’s nothing wrong with just using a regression model, especially with well-cleaned data. It’s fitting to something, you’ve got a score now, and you can run some optimization metrics to figure out which pandas should have Netflix-and-chill dates with which other pandas. It is all very explainable, and you can even track your uncertainty. Maybe there were some more complicated interactions going on, or maybe this was all you needed. Your feature space was selected with some careful pruning processes you didn’t want to think about too much. You even did some k-fold cross-validation and found that the fit did not change much. What do you do?
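For the curious, here is roughly what that workflow could look like. This is a minimal sketch on made-up data, not the actual panda pipeline: the feature names (`ages`, `weight`), the `max_age` cutoff of 40 years, the synthetic `score`, and every helper function are assumptions invented for illustration. It drops the implausibly old pandas, fits an ordinary least squares regression, and checks that the fit is stable across k folds.

```python
import numpy as np

def drop_implausible_ages(ages_years, max_age=40.0):
    """Boolean mask of rows with a plausible age (max_age is an assumed cutoff)."""
    ages = np.asarray(ages_years, dtype=float)
    return (ages >= 0.0) & (ages <= max_age)

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def r_squared(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def k_fold_scores(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and return one held-out R^2 per fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = fit_linear(X[train], y[train])
        scores.append(r_squared(y[test], predict(coef, X[test])))
    return np.array(scores)

# Synthetic stand-in data (entirely made up for this sketch).
rng = np.random.default_rng(42)
n = 200
ages = rng.uniform(2, 25, size=n)
ages[:5] = 118.0  # simulate the spreadsheet birthday bug
weight = rng.normal(100, 10, size=n)
score = 3.0 - 0.1 * ages + 0.02 * weight + rng.normal(0, 0.2, size=n)

mask = drop_implausible_ages(ages)          # clean first
X = np.column_stack([ages[mask], weight[mask]])
y = score[mask]

scores = k_fold_scores(X, y, k=5)           # then check fit stability
print("mean R^2:", scores.mean(), "spread across folds:", scores.std())
```

A small spread across folds is the "fit did not change much" signal. It says the model is stable on this data, not that the model is right.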
You cleaned that data before doin' a little of that regression

Published by B McGraw
B McGraw has lived a long and successful professional life as a software developer and researcher. After completing his BS in spaghetti coding at the department of the dark arts at Cranberry Lemon in 2005, he wasted no time in getting a masters in debugging by print statement in 2008 and obtaining his PhD with research in screwing up repos on GitHub in 2014. That's when he could finally get paid. In 2018 B McGraw finally made the big step of defaulting on his student loans and began advancing his career by adding his name to other people's research papers after finding one grammatical mistake in the peer review process.