Monday, June 15, 2015

Naive naive Bayes? by Gursimran


Today we had our very first ‘kids data science camp’ at Aspiring Minds. It was an awesome experience, from the tiring late-night preparations and pizza parties to content generation and execution. Though it may seem simple at first glance, content generation turned out to be a fairly challenging problem. We designed the exercise iteratively with student volunteers from the same age group, so that it would be neither too simple nor too advanced for our participants. Given that we accepted applications from a rather wide range of kids, from 5th grade to 10th grade, one interesting point of contention was whether to expect them to know concepts like the mean.

Keeping in mind the broader philosophy of keeping things simple, our aim was to teach the basics of how kids can use data science in their day-to-day lives without bogging them down with too many details. On the other hand, it was important not to oversimplify things and detract from their learning. It would be a disaster if kids came away thinking the whole exercise was obvious and a waste of their time!




In the end, we settled on first asking them to analyze a given person's preferences for making friends and, if time permitted, to attempt a simple classifier which could predict the outcomes on previously collected data. We generated cards bearing a face, a hobby and a name. Kids were supposed to rate how likely they were to befriend the ‘potential friend’ on the card. To avoid overwhelming the kids with too many features (which we data scientists usually want to extract from faces, names etc.), we decided to keep these features really simple. For faces, we chose very basic clip art of a boy and a girl with happy and serious expressions. Similarly simple dichotomous features were embedded in the names (new or traditional) and hobbies (indoor or outdoor activity).

The event day turned out to be quite fun. We assigned a mentor to each team of two kids (thanks to all my peers for volunteering). The kids did an awesome job, gelled with each other and cracked jokes with the mentors. To our pleasant surprise, some kids came up with creative ways of visualizing the data beyond the simple bar charts we expected. Some of them went on to draw pie charts and worked with pivot tables to stratify and visualize the data.
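The kind of stratification the kids did with pivot tables can be sketched in a few lines of Python. This is only an illustration: the ratings below are made up, the 1-to-5 scale is an assumption (the cards' actual rating scale is not stated here), and `mean_rating_by` is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up ratings from one imaginary child. Each card carries the four
# dichotomous features from the exercise plus an assumed 1-5 rating.
cards = [
    {"gender": "girl", "expression": "happy", "name": "new", "hobby": "outdoor", "rating": 5},
    {"gender": "boy", "expression": "happy", "name": "traditional", "hobby": "indoor", "rating": 4},
    {"gender": "girl", "expression": "serious", "name": "traditional", "hobby": "outdoor", "rating": 3},
    {"gender": "boy", "expression": "serious", "name": "new", "hobby": "indoor", "rating": 1},
    {"gender": "girl", "expression": "happy", "name": "traditional", "hobby": "indoor", "rating": 4},
    {"gender": "boy", "expression": "serious", "name": "traditional", "hobby": "outdoor", "rating": 2},
]


def mean_rating_by(cards, feature):
    """Average rating for each value of one feature (a one-way pivot)."""
    totals = defaultdict(lambda: [0, 0])  # feature value -> [sum, count]
    for card in cards:
        entry = totals[card[feature]]
        entry[0] += card["rating"]
        entry[1] += 1
    return {value: total / count for value, (total, count) in totals.items()}


for feature in ("gender", "expression", "name", "hobby"):
    print(feature, mean_rating_by(cards, feature))
```

A large gap between the two averages for a feature (for example, happy versus serious faces) hints at a preference worth plotting as a bar or pie chart.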


Encouraged by their progress, Varun explained to them the ‘big picture’ of data science: how we can analyse and extract the biases (if any) we have while making friend choices. Some of us are biased towards choosing male friends over female, while others are indifferent to such factors. Some are open to making friends with people having wide-ranging characteristics; still others are picky. He further described how we can make a ‘naiver’ naive Bayes-like classifier and predict on unseen data, given a face, a name and a hobby. At this point, the kids' minds were ships sailing smoothly in the sea of data science. Pretty neat.
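For readers curious what such a classifier might look like, here is a minimal sketch of an ordinary naive Bayes over the four binary card features. It is not necessarily the simplified version described in the session. The training cards are made up, the ratings are assumed to have been collapsed into a yes/no "would befriend" label, add-one (Laplace) smoothing is an added choice, and the names `fit` and `predict_proba` are hypothetical.

```python
from collections import Counter, defaultdict

FEATURES = ("gender", "expression", "name", "hobby")

# Made-up training cards; the label True means "would befriend".
train = [
    ({"gender": "girl", "expression": "happy", "name": "new", "hobby": "outdoor"}, True),
    ({"gender": "girl", "expression": "happy", "name": "traditional", "hobby": "indoor"}, True),
    ({"gender": "boy", "expression": "happy", "name": "new", "hobby": "outdoor"}, True),
    ({"gender": "boy", "expression": "serious", "name": "traditional", "hobby": "indoor"}, False),
    ({"gender": "girl", "expression": "serious", "name": "traditional", "hobby": "indoor"}, False),
    ({"gender": "boy", "expression": "serious", "name": "new", "hobby": "outdoor"}, False),
]


def fit(examples):
    """Count labels and, per (feature, label), how often each value occurs."""
    label_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)
    for card, label in examples:
        for feature in FEATURES:
            value_counts[(feature, label)][card[feature]] += 1
    return label_counts, value_counts


def predict_proba(model, card):
    """Return P(label | card), treating features as independent given the label."""
    label_counts, value_counts = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = n / total  # prior P(label)
        for feature in FEATURES:
            # Add-one smoothing; every feature here has exactly 2 possible values.
            seen = value_counts[(feature, label)][card[feature]]
            score *= (seen + 1) / (n + 2)
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}


model = fit(train)
new_card = {"gender": "girl", "expression": "happy", "name": "new", "hobby": "outdoor"}
print(predict_proba(model, new_card))
```

The "naive" part is the independence assumption: each feature's contribution is multiplied in separately, so the model needs only simple counts, which is what makes it teachable with pencil and paper.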

However, we decided to stick to our original plan of introducing just analytical insights and not actually going into the process of building a classifier. We realized the kids had learned much from the simple exercise and that we would need quite a bit more time to do justice to classifier building. Nevertheless, all of the kids seemed to internalize the importance of recording their data and of basing their decisions on analysing data rather than making visceral choices.

One thing we want to take up as a group is to analyze all the kids' responses together and perhaps gather some insight into how these kids choose friends. We will probably train a classifier too. More on this later. Stay tuned.

Gursimran Singh
