Monday, June 15, 2015

Naive naive Bayes? by Gursimran

Today we had our very first ‘kids data science camp’ at Aspiring Minds. It was an awesome experience, from the tiring late-night preparations and pizza parties to content generation and execution. Though it may seem simple at first glance, content generation turned out to be a fairly challenging problem. We iteratively designed the exercise with student volunteers from the same age group, so that it would be neither too simple nor too advanced for our participants. Given that we accepted applications from a rather wide range of kids, from 5th grade to 10th grade, one interesting point of contention was whether to expect them to know concepts like the mean.

Keeping in mind the broader philosophy of keeping things simple, our aim was to teach the basics of how kids can use data science in their day-to-day lives without bogging them down with too many details. On the other hand, it was important not to over-simplify things and detract from their learning. It would be a disaster if kids came away thinking the whole exercise was obvious and a waste of their time!

In the end, we settled on first asking them to analyze a given person's preferences for making friends and, if time permitted, to attempt a simple classifier which could predict the outcomes on previously collected data. We generated cards bearing faces, hobbies and names. Kids were supposed to rate how likely they were to befriend the ‘potential friend’ on the card. To avoid overwhelming the kids with too many features (which we data scientists are usually tempted to extract from faces, names, etc.), we decided to keep these features really simple. For faces, we chose very basic clip art of a boy and a girl with happy and serious expressions. Similarly simple dichotomous features were embedded in the names (whether a new or a traditional name) and hobbies (whether an indoor or an outdoor activity).

The event day turned out to be quite fun. We assigned a mentor to each team of two kids (thanks to all my peers for volunteering). The kids did an awesome job, gelled with each other and cracked jokes with the mentors. To our pleasant surprise, some kids came up with creative ways of visualizing the data apart from the simple bar charts we had expected. Some of them went further, drawing pie charts and working with pivot tables to stratify and visualize the data.

Encouraged by their progress, Varun explained the ‘big picture’ of data science to them: how we can analyse and extract the biases (if any) we have while making friend choices. Some of us are biased towards choosing male friends over female ones, while others are indifferent to these factors. Some are open to making friends with people having wide-ranging characteristics; others are picky. He further described how we can make a ‘naiver’ naive Bayes-like classifier and predict on unseen data, given a face, a name and a hobby. At this point, the kids' minds were ships sailing smoothly in the sea of data science. Pretty neat.
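For readers curious what such a classifier might look like, here is a minimal sketch in Python. It is not what we did at the camp: the rows in `RATINGS` are made up for illustration, the function names `train` and `predict` are hypothetical, and the kids' ratings are assumed to be reduced to a plain yes/no. Each card is described by four two-valued features (gender, expression, name style, hobby type), and the classifier simply counts how often each value goes with each answer.

```python
from collections import Counter, defaultdict

# Hypothetical toy ratings, NOT data from the camp.
# Each row: ((gender, expression, name style, hobby type), would befriend?)
RATINGS = [
    (("boy", "happy", "new", "outdoor"), "yes"),
    (("girl", "happy", "traditional", "outdoor"), "yes"),
    (("girl", "happy", "new", "indoor"), "yes"),
    (("boy", "serious", "new", "indoor"), "no"),
    (("girl", "serious", "traditional", "indoor"), "no"),
    (("boy", "serious", "traditional", "outdoor"), "no"),
]


def train(rows):
    """Count how often each label and each (label, feature value) pair occurs."""
    label_counts = Counter(label for _, label in rows)
    # (label, feature position) -> Counter of the values seen at that position
    feature_counts = defaultdict(Counter)
    for features, label in rows:
        for pos, value in enumerate(features):
            feature_counts[(label, pos)][value] += 1
    return label_counts, feature_counts


def predict(model, features):
    """Return a normalized score per label for an unseen card."""
    label_counts, feature_counts = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = n / total  # prior: how often this answer was given overall
        for pos, value in enumerate(features):
            # Add-one smoothing; every feature here has exactly two values,
            # hence the "+ 2" in the denominator.
            score *= (feature_counts[(label, pos)][value] + 1) / (n + 2)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


if __name__ == "__main__":
    model = train(RATINGS)
    print(predict(model, ("girl", "happy", "new", "outdoor")))
```

The ‘naive’ part is the assumption that each feature influences the decision independently of the others, which is what lets us multiply the per-feature fractions together.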

However, we decided to stick to our original plan of introducing just analytical insights and not actually go into the process of building a classifier. We realized the kids had learned much from the simple exercise and that we would need quite a bit more time to do justice to classifier building. Nevertheless, all of the kids seemed to internalize the importance of recording their data and of basing their decisions on analysing data rather than making visceral choices.

One thing we want to take up as a group is to analyze all the kids' responses together and gather some insight into how these kids chose friends. We will probably train a classifier too. More on this later. Stay tuned.

Gursimran Singh
