As I’m sure you all have noticed, the data business is booming right now. (Are you tired of the term “big data” yet?) The fact that 90% of the data in the world today has been created in the last two years is a great example of the growth trajectory of data. All of this data provides new opportunities for discovery for those who are willing to analyze it. Enter the data scientist.
“Data Scientist” isn’t even listed as a career by the US Government’s Bureau of Labor Statistics yet, but it’s already been named the sexiest job of the 21st century by Harvard Business Review. With a growth pattern similar to that of data itself, it’s safe to say that data scientists are going to be in high demand. Among other skills, being a practitioner of data science requires analytical thinking, mathematical/statistical ability, a knack for communicating results to non-data people, and creativity. This combination of business acumen and technical skill isn’t easy to come by, and new graduate programs with an emphasis on data science seem to be cropping up daily to fill the gaps. One article from the New York Times recently asserted that the United States will need to increase the number of graduates with data science skills by as much as 60% to keep up with demand. So, when you’re looking for new data scientists, where do you turn? To a generation who’s grown up with data science all around them – through Netflix recommendations, Google search results, and even at the movie theater à la Moneyball.
I was recently asked to participate in a “Job Hop Day” for a local elementary school. The idea was to expose fourth through sixth graders to different jobs that are available in the Mount Washington Valley in NH. It was a good opportunity to spend a fun day with elementary school students while exposing them to the world of data science (and the idea that people actually get paid for doing it!). In preparing for our session, I realized that as thrilling as an hour-long lecture on data science might be for some, 10-year-olds probably wouldn’t be so interested. After ruling out a product demo and a slideshow, my coworkers and I thought about other ways to engage them. We decided the best way for them to learn about being a data scientist was to do it themselves (in the guise of a game).
When creating the game, we thought about some of the skills we wanted to reinforce, which were things like data mining, basic math, and the ability to make predictions. From there, we got creative – we wanted to pick a subject that kids would be interested in, and since vampires are on the brink of cliché, we settled on werewolves. The game we came up with was a variation of a Family Feud board that involved an initial data-mining phase to glean the characteristics of a werewolf.
To start, I gave the kids ten descriptions of people on color-coded index cards, five of which were designated as “werewolves” and five of which were “non-werewolves”. (Coming up with the descriptions was a good exercise for us as well: we tried to make sure the clues weren’t too obvious, and we had to plan them so that some characteristics were more common than others. For example, three of the werewolves were vacationing in London this summer, but all five of them played some kind of sport.) Each data scientist had a whiteboard to write down the characteristics they spotted as they went, and we stopped the “data mining” portion of the game once they all felt like they had come up with as many characteristics as they could. The Family Feud board I mentioned earlier had the ten characteristics listed in order of the number of times they came up, and the kids took turns guessing what was on the board.
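For anyone who wants to build their own board, the tallying step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the card contents are invented (apart from the London and sports counts mentioned above), and the names `werewolf_cards` and `rank_traits` are hypothetical, not part of the original game materials.

```python
from collections import Counter

# Hypothetical werewolf cards: each card is the set of traits in one description.
# Only the counts for "plays a sport" (5 of 5) and "vacationing in London" (3 of 5)
# come from the post; "owns a dog" is an invented filler trait.
werewolf_cards = [
    {"plays a sport", "vacationing in London", "owns a dog"},
    {"plays a sport", "vacationing in London"},
    {"plays a sport", "vacationing in London"},
    {"plays a sport"},
    {"plays a sport", "owns a dog"},
]


def rank_traits(cards):
    """Count how many cards mention each trait, most common first."""
    counts = Counter(trait for card in cards for trait in card)
    # Sort by descending count, then alphabetically so ties are stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


board = rank_traits(werewolf_cards)
for trait, count in board:
    print(f"{count} of {len(werewolf_cards)} werewolves: {trait}")
```

Traits that appear on every werewolf card land at the top of the board, which is the same ordering the kids were trying to guess.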
Over the course of the day, three groups of students played the game, and all three groups seemed to really enjoy it. After we finished the game, we talked about the different uses of data and predictive modeling, covering examples ranging from test scores to baseball. The kids were knee-deep in baseball season and pretty excited when I told them about a baseball scout’s presentation I saw at DRIVE, and how scouts use statistics to predict what might happen in each game. It was evident from our conversations that the kids had some sense of the amount of data around them and were interested in examining the world from a data-driven viewpoint. (I should probably mention here that the kids who chose to attend our session knew it would be math-related, so our sample was a bit biased.) Most of them had never heard of a data scientist or a statistical analyst before, but they were interested in the type of thinking we’d done. A few days later, a student’s mom told me that her son “loved the game” and “was so excited that it was an actual job that he could shoot for”.
Overall, our ad hoc approach to the data scientist experience seemed to go over well, but there’s always room for improvement. I’m interested in any ideas or experiences you might have regarding young data scientists, and would love to hear about them in the comments below. In the meantime, if you’ve had a sneaking suspicion about a certain neighbor around a full moon, or just want to have a little fun, I’d recommend trying out your own version of the game.
-Caitlin Garrett is a Statistical Analyst at Rapid Insight