As I’m sure you all have noticed, the data business is
booming right now. (Are you tired of the term “big data” yet?) The fact that
90% of the data in the world today has been created in the last two years is a
great example of the growth trajectory of data. All of this data provides new
opportunities for discovery for those who are willing to analyze it. Enter the
data scientist.
“Data Scientist” isn’t
even listed as a career by the US Government’s Bureau
of Labor Statistics yet, but it’s already been named the sexiest job of the
21st century by Harvard
Business Review. With a growth pattern similar to that of data itself, it’s
safe to say that data scientists are going to be in high demand. Among other
skills, being a practitioner of data science requires analytical thinking, mathematical/statistical
ability, a knack for communicating results to non-data people, and creativity. This
combination of business acumen and technical skill isn’t easy to come by, and
new graduate programs with an emphasis on data science seem to be cropping up
daily to fill the gaps. One article
from the New York Times recently asserted that the United States will need to
increase the number of graduates with data science skills by as much as 60% to
keep up with demand. So, when you’re
looking for new data scientists, where do you turn? To a generation that has grown
up with data science all around them – through Netflix recommendations, Google
search results, and even at the movie theater à la Moneyball.
I was recently asked to participate in a “Job Hop Day” for a
local elementary school. The idea was to expose fourth through sixth graders to different jobs
that are available in the Mount Washington Valley in NH. It was a good
opportunity to spend a fun day with elementary school students while exposing
them to the world of data science (and the idea that people actually get paid for
doing it!). In preparing for our session, I realized that as thrilling as an
hour-long lecture on data science might be for some, 10-year-olds probably
wouldn’t be so interested. After ruling out a product demo and a slideshow, my
coworkers and I thought about other ways to engage them. We decided the best
approach for them to learn about being a data scientist was to do it themselves
(in the guise of a game).
When creating the game, we thought about some of the skills
we wanted to reinforce, which were things like data mining, basic math, and the
ability to make predictions. From there, we got creative – we wanted to pick a
subject that kids would be interested in, and since vampires are on the brink
of cliché, we settled on werewolves. The game we came up with was a variation
of a Family Feud board that involved an initial data-mining phase to glean the
characteristics of a werewolf.
To start, I gave the kids ten descriptions of people on color-coded
index cards, five of which were designated as “werewolves” and five of which
were “non-werewolves”. (Coming up with the descriptions was a good exercise for
us as well: we tried to make sure the clues weren’t too obvious, and had to
plan them so that some characteristics were more popular than others. An
example: three of the werewolves were vacationing in London this summer, but
all five of them played some kind of sport.) Each data
scientist had a whiteboard to write down their descriptions as they went, and
we stopped the “data mining” portion of the game once they all felt like they
had come up with as many characteristics as they could. The Family Feud board I
mentioned earlier had the ten characteristics listed in order of the number of
times they came up, and the kids took turns guessing what was on the board.
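For anyone who wants to build their own board, here is a rough Python sketch of the tallying step: count how often each characteristic shows up on the werewolf cards, then list them from most to least common. The cards and traits below are made up for illustration (apart from the London and sports counts mentioned above), and the `build_board` helper is a hypothetical name, not something we actually used on the day.

```python
from collections import Counter

# Hypothetical cards: (is_werewolf, traits). Only the counts for
# "plays a sport" (5) and "vacationing in London" (3) follow the post;
# every other trait here is invented for illustration.
CARDS = [
    (True, {"plays a sport", "vacationing in London", "owns a dog"}),
    (True, {"plays a sport", "vacationing in London", "owns a dog"}),
    (True, {"plays a sport", "vacationing in London"}),
    (True, {"plays a sport", "stays up late"}),
    (True, {"plays a sport"}),
    (False, {"plays piano", "vacationing in London"}),
    (False, {"owns a cat"}),
    (False, {"plays piano"}),
    (False, {"likes gardening"}),
    (False, {"owns a cat", "likes gardening"}),
]


def build_board(cards, top_n=10):
    """Rank traits by how many werewolf cards they appear on."""
    counts = Counter()
    for is_werewolf, traits in cards:
        if is_werewolf:
            counts.update(traits)
    # Sort by count (highest first), then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


board = build_board(CARDS)
for rank, (trait, count) in enumerate(board, start=1):
    print(f"{rank}. {trait} ({count} of 5 werewolves)")
```

With these made-up cards, sports land in the top slot and London comes second, which is the kind of ordering the kids were trying to guess.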
Over the course of the day, three groups of students played
the game, and all three groups seemed to really enjoy it. After we finished the
game, we talked about the different uses of data and predictive modeling,
covering examples spanning test scores to baseball. They were knee-deep in
baseball season and pretty excited when I told them about a baseball scout’s
presentation I saw at DRIVE, and how the scout used statistics to predict what might
happen in each game. It was evident from our conversations that the kids had
some knowledge of the amount of data around them and were interested in
examining the world from a data-driven viewpoint. (I should probably mention
here that the kids who chose to attend our session knew it would be
math-related, so our sample was a bit biased.) Most of them had never heard of
a data scientist or a statistical analyst before, but they were interested in the
type of thinking we’d done. A few days later, a student’s mom told me that her
son “loved the game” and “was so excited that it was an actual job that he
could shoot for”.
Overall, our ad hoc approach to the data scientist
experience seemed to go over well, but there’s always room for improvement. I’m
interested in any ideas or experiences you might have regarding young data
scientists, and would love to hear about them in the comments below. In the
meantime, if you’ve had a sneaking suspicion about a certain neighbor around a
full moon, or just want to have a little fun, I’d recommend trying out your own
version of the game.
-Caitlin Garrett is a Statistical Analyst at Rapid Insight