You Use Most of the Data

You pare down the data to avoid redundant features and prune your dataset until your clustering algorithms start producing some splotchy plots. All your panda data is falling into categories. Well, you’re still not quite sure what those categories are, but the points do look like they’re clustering, apart from a handful of oddballs sitting in between whatever panda sexual compatibility spectrums the clustering was fitting to. What do you do?

Try Another Algorithm: this unsupervised stuff seems sketchy

Try More Data!

Try Only Relevant Data

Evaluate the results first. Maybe they’re good, but you should check

Trust the algorithm, it’s statistical distance, how are ya even gonna evaluate that thing?

Published by B McGraw

B McGraw has lived a long and successful professional life as a software developer and researcher. After completing his BS in spaghetti coding at the department of the dark arts at Cranberry Lemon in 2005, he wasted no time in getting a master's in debugging by print statement in 2008 and obtaining his PhD with research in screwing up repos on GitHub in 2014. That's when he could finally get paid. In 2018 B McGraw finally made the big step of defaulting on his student loans and began advancing his career by adding his name to other people's research papers after finding one grammatical mistake in the peer review process.
