Thursday, June 11, 2015

Our work's reviews at EAAI 2016

After we completed our three camps, it dawned on us that the framework we'd created to teach kids data science had a lot of nuances: we'd made many decisions to ensure that the framework was accessible to a high school audience, while also ensuring that the material was engaging and could be conveyed in a half-day setting.

We decided to write up a paper on it and submit it to an AI-education/pedagogy conference.
We submitted it to EAAI 2016, a co-located conference at AAAI 2016.

Todd Neller and his group put on a great conference every year at AAAI, where they focus on discussing interesting ways of teaching AI/data science to high schoolers and undergraduates. A regular feature of the conference is the Model AI Assignments track, which showcases some fun ways to design assignments in AI.

Results were out in the first week of November, and we didn't make the cut.
However, we thought the reviewers' comments, which touch on many different aspects of the paper, showed just how nascent this area is. We decided to post the reviews here and get a discussion going on them.

Here are the reviews - 

TITLE: Teaching Data Science to School Kids
AUTHORS: Shashank Srikant and Varun Aggarwal

REVIEW 1
The paper describes a half-day “camp” for elementary and middle school students on the full supervised learning pipeline, which assumes very basic background only.

In addition to the factors listed, the design decisions for the exercise are also (presumably) driven by a desire to fit this exercise into a half-day. This hardly satisfies my intuitions about what a camp is! Rather, it appears that you are designing an exercise that can be (presumably) easily adapted by teachers and camp leaders for inclusion as a lesson in a larger curriculum. This is good, and in my mind is the best framing of your project. Add “fit into one half day” as an explicit design constraint with this new framing.

Give an example of a simple balanced dataset to illustrate your intentions.

Why are you using Naive Bayes — I don’t imagine that school kids will view this as intuitive at all. Why not decision trees or decision rules, which began as computational models of concept formation in human subjects (Hunt, Marin, Stone 1966 Concept Learning System, CLS, if I recall correctly). Isn’t that basis in psychological modeling a compelling reason for selecting decision rules and trees?

A domain was ruled out (movie prediction), because of missing and noisy data (p. 4). Isn’t this something to cover in a longer camp, or perhaps another module? Again, I would be thinking about what you are doing as creating modules for adoption by instructors, rather than a standalone camp (but if you want to think of this as a camp, then one-half day does not satisfy the length criterion for a camp).

There are clearly opportunities for discussion of ethics here — a 61% average accuracy on unseen data is compelling caution for the exuberance of one student “I learned how to predict a stranger’s choices”, but moreover, the naive Bayes form may limit the ability to talk about overfitting as well (under fitting is the real “danger”)

I’m glad that students were asked to sign consent forms (for reasons related to their own learning and maturation), but was the process vetted by an IRB? I think it should be, even more particularly because it involves minors.

Overall, this is an interesting exercise, and while I question the particular  predictive modeling language, consent procedures, and some other particulars, it is within EAAI scope, I believe, to design lessons (again, I don’t think this is a camp) that can be adapted by middle and high school teachers.


REVIEW 2
This paper presents a case study where the authors taught a selection of data science topics to 5th-9th graders. The authors probably would have been better served by making this a "Model AI Assignment." As it is, their paper introduces some hypotheses and guidelines, but does not test any of them. They provide a brief qualitative assessment of whether students liked their module or if the students thought they learned.

There were minor typos, but overall the paper was well written.

Overall, I'm not sure how much people will learn from reading this paper / watching the authors present. However, it could spark some interesting discussion:

Is it reasonable to teach this age group data science? Would it be better to teach older students this material? Would students this age be better served by learning some programming rather than using MS Excel?

REVIEW 3
The paper presents the results of a data science camp for 5th through 9th graders.  It explains the goals of the camp and also tried to provide an analysis of the outcomes.

While this paper has a good idea in mind, there are some issues with the paper as written that would result in a stronger paper if they were addressed.  Those issues include:

* The authors state that their hypothesis is that a student learns best by problem solving themselves.  However, this hypothesis is never directly tested (for example, comparing a group of students who did hands on projects to those that learned from instructional videos on a common post-test).

* While I can understand an appreciation for manual data collection, there is a good deal of data that is collected automatically these days (or the people who collect the data are not the same as those who analyze it).  Why should 5th-9th graders be spending their time on data entry rather than analysis?  This needs more justification with respect to the actual learning outcomes students achieve by doing manual data entry.

* The sample space of the data analysis tasks is very small (8-16 instances).  Why even do inference (rather than just building a lookup table) for such a small problem?

* In several places you refer to reducing cognitive load, but this is something that is not actually measured at all.  You need more precision/assessment here to make claims about cognitive load.

* There is a significant problem in how probabilities are dealt with in the pseudo-Naive Bayes model.  By having students sum (rather than multiply) probabilities, you give students an erroneous sense of how probabilities work.  Having taught probability for many years, this is actually a big problem for students, and the task you give them further reinforces this incorrect interpretation.  That seems very problematic to me as an educator in this area.  If you are using addition for interpretability, then why use a Bayesian classifier at all?  Why not just use a decision tree, which is more readily interpretable?

* Similar to the point about data entry, having each student rate 50 images seems like a lot of effort whose educational outcome is unclear.  This needs more explanation (and perhaps some assessment) to determine if it's actually a worthwhile use of time by the students.

* The average validation accuracy is low (62%) for a binary classification task.  Is that due to a low Bayes rate for the task or is it that the models built were just not very good?  This really needs some exploration.

* It would be very useful to provide quantitative results from either the students or mentors or both as to their thoughts about the utility of the camp and the tasks they were involved in.

* The paper would benefit from a round of general English editing.
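
To make Reviewer 3's point about summing versus multiplying probabilities concrete, here is a minimal sketch in Python. The likelihood values, the class names, and the exact form of the summed score are all invented for illustration and are not taken from our camp data or our paper; the only aim is to show that the two scoring rules can disagree on the same example.

```python
from math import prod

# Hypothetical values of P(feature value | class) for one test example
# with two features. These numbers are made up for illustration only.
likelihoods = {
    "like": [0.9, 0.1],
    "dislike": [0.45, 0.45],
}

# Assumed equal class priors, so the prior does not affect the ranking.
priors = {"like": 0.5, "dislike": 0.5}


def score_by_product(cls):
    """Standard Naive Bayes score: prior times the product of likelihoods."""
    return priors[cls] * prod(likelihoods[cls])


def score_by_sum(cls):
    """Simplified 'pseudo' score: add the likelihoods instead of multiplying."""
    return sum(likelihoods[cls])


def predict(score):
    """Return the class with the highest score under the given scoring rule."""
    return max(likelihoods, key=score)


product_choice = predict(score_by_product)
sum_choice = predict(score_by_sum)

print("product rule picks:", product_choice)
print("sum rule picks:", sum_choice)
```

With these made-up numbers the product rule prefers "dislike", because one very unlikely feature value drags the "like" score down, while the sum rule prefers "like". The summed score is also not a probability, since it is not bounded by 1 in general.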
