The Engineers Without Borders (EWB) organization has a long history of building infrastructure, including roads, bridges, electric utilities, and many other kinds of infrastructure common in the developed world. This has bred a high demand for civil engineers. Even most mechanical and electrical engineers won’t know what to do without good survey data, some CAD schematics, and an interface control document. Even more engineers have been cross-training into the fields of data science and data engineering. These specialists still want to help!
With such a powerful hammer, it wasn’t long before the organization found a nail. Most third-world countries have a huge data cleaning problem. A new study showed that 94% of the developing world has no access to cleaned public health data. There are already plenty of infrastructure engineers and Bill Gates pet projects cleaning water, but nobody’s cleaning third-world country data! Year after year, infographics are made in which all of these undeveloped countries show up with a dismal grey “No Data”. How on earth will anyone know where to electrify communities unless they know where all of the RuneScape players per thousand are mining valuable international currency?
Engineers Without Borders is beginning a new effort to send data engineers from country to country to create all manner of parsers and data cleaning scripts. “840,000 people die from a lack of access to clean drinking water a year,” the EWB lead for the new Data Scientists Without Borders said. “But that’s nothing compared to the amount of data that goes unused because of all of the mislabeled timestamps and missing NaN values. Have you ever seen an Excel spreadsheet from Uganda? It’s some of the dirtiest data I have ever seen in my professional career. I don’t know how the locals survive!”
The job consists of digging into government databases and cleaning data until it’s half usable by international institutions and infographic makers. “It’s definitely not the most complicated work I’ve ever done, but it’s a humbling experience,” said Chad Broman, Cranberry Lemon Applied Psychological Machine Learning Professor and EWB data volunteer. “Usually, I’m working on some overcomplicated machine learning model. Out here in Zambia, they just need someone who can translate XML files into a single CSV. It’s practically a vacation, except I get to help people!”
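For readers wondering what that vacation looks like in practice, here is a minimal sketch of merging several XML documents into one CSV with Python’s standard library. The `<record>` element, the field names, and the sample values are all invented for illustration; nothing here reflects a real government schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample documents; the tag names and values are assumptions.
XML_DOCUMENTS = [
    "<records><record><district>A</district><year>2019</year>"
    "<cases>12</cases></record></records>",
    "<records><record><district>B</district><year>2019</year>"
    "</record></records>",
]

def xml_to_rows(xml_text, record_tag="record"):
    """Turn each <record> element into a dict of child tag -> text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.iter(record_tag)
    ]

def rows_to_csv(rows):
    """Write rows to CSV text, using the union of keys as the header."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval="" leaves a blank cell where a record lacks a field.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

all_rows = []
for document in XML_DOCUMENTS:
    all_rows.extend(xml_to_rows(document))

csv_text = rows_to_csv(all_rows)
print(csv_text)
```

Missing fields are left blank rather than guessed, so whoever makes the infographic can still paint them a dismal grey.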
A big part of the effort is to instruct locals on maintaining databases and cleaning their own data. There are millions of villages across Africa with nobody who even knows how to remove duplicate data. Most Americans take it for granted, but over 98% of African communities have no access to SQL database developers! Many only have the skill set to use Microsoft Access.
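Removing duplicate data is the sort of lesson a volunteer might start with. The sketch below is one possible approach, assuming rows arrive as dictionaries and that two rows count as duplicates when a chosen set of fields matches after trimming whitespace and ignoring case. The function name, field names, and sample rows are hypothetical.

```python
def remove_duplicates(rows, key_fields):
    """Keep the first row seen for each normalized combination of key_fields."""
    seen = set()
    unique = []
    for row in rows:
        # Normalize so "Example A" and "example a " compare as equal.
        key = tuple(str(row.get(field, "")).strip().lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Hypothetical sample rows, invented for illustration.
rows = [
    {"village": "Example A", "year": "2020"},
    {"village": "example a ", "year": "2020"},
    {"village": "Example B", "year": "2020"},
]

deduped = remove_duplicates(rows, ["village", "year"])
print(len(deduped))
```

Keeping the first occurrence is an arbitrary choice; a real cleanup would need a rule for which copy to trust.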
While most think the worst ‘No Data’ country to get sent to is Greenland, it’s actually North Korea. North Korea is labeled ‘No Data’ more than any other country in the world. Not only is it diplomatically difficult to get in and out of the country, but the data is no better. “I’m just trying to stay alive while I’m over here helping out,” said the EWB lead North Korea data scientist. “I’m trying to get all their data in a neat and well-structured database we can send to international organizations. But so many things are blacked out or missing. Most of the time, when I ask what happened, the guards assigned to me start yelling and pointing their guns at me! There’s going to be a lot of data I’m gonna have to throw out.”
“The important thing is to volunteer and help out where you can,” one EWB spokesman said. “That high-paying job back in the States analyzing financial or healthcare data will always be there. It doesn’t matter if you’re a frequentist, a data bro, or a pretend Bayesian, you have the skills to help out by writing data cleaning or even just parsing scripts.”
The effort to bring modern databases to third-world countries will finally help show the world, through glitzy infographics, exactly what it’s like to live there. Most people think that developing a modern society is all about roads, bridges, electricity, and the hard infrastructure. It’s about maintaining detailed records too! Pretty soon, data scientists will bring their skills around the world and the third world will begin to enjoy the benefits of clean data.
If you enjoyed this fake news story about filtering unwanted outliers and restructuring databases in sub-Saharan Africa, please like, share, and subscribe with your email, our Twitter handle (@JABDE6), our Facebook group here, or the Journal of Immaterial Science Subreddit for weekly content.