Photo credit: Jonny Goldstein
As a full-time analytics professional, I have a hard time
conceiving of people who have not fully embraced the power of predictive
analytics, but I know they’re out there and I think it’s important to address
their concerns. In doing so, I’m not here to argue that predictive analytics is
a perfect fit for every organization. Predictive analytics requires investment:
in your data, in infrastructure and technology, and in your time. It’s also an
investment in your company, your internal knowledge base, and your future. I’m
here to argue that the investment is worth it.
To do so, I’ve presented a few
clarifications to address predictive modeling concerns that I’ve heard from
skeptics. If you have anything to add, or if there are any big concerns I’ve
missed, let me know in the comments.
You don’t need to be a PhD statistician to build predictive models
A working knowledge of statistics will help you interpret the results of
predictive models, but you don’t need ten years’ experience or a doctoral
degree to glean insight from a model or put its output to use. There are
software packages out there with diagnostics that can help you understand
which variables are important, which are not, and why. Knowing your data is just as important as statistical knowledge, and both will serve you well in the long run.
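To make the idea of a variable-importance diagnostic a little more concrete, here is a minimal Python sketch. It is only an illustration, not how any particular software package works: it ranks variables by the strength of their simple correlation with an outcome. The data, the variable names, and the `rank_variables` helper are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)

def rank_variables(columns, outcome):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, outcome) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical, made-up customer records: 1 = renewed, 0 = did not renew
columns = {
    "visits_last_month": [1, 4, 2, 8, 7, 3, 9, 6],
    "years_as_customer": [5, 1, 6, 2, 5, 3, 4, 4],
    "support_tickets":   [3, 2, 3, 0, 1, 3, 0, 1],
}
renewed = [0, 0, 0, 1, 1, 0, 1, 1]

for name, r in rank_variables(columns, renewed):
    print(f"{name:20s} r = {r:+.2f}")
```

A one-variable-at-a-time correlation is a crude stand-in for real model diagnostics, which also account for how variables interact, but it shows the kind of question those diagnostics answer.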
A predictive model shouldn’t be a black box
There are plenty of companies and consultants whose
predictive models could fall into the “black box” category. In this case, the model-building process involves sending your data to an
outside party who analyzes it and returns a series of scores. On the
surface, this may not seem like a bad thing, but once you’ve built your first
model, you’ll understand why this is not nearly as valuable as doing it
yourself. While the output scores are important, you also want to know which
variables were used and how the model handled any missing or outlying values,
and to glean insight beyond a single set of scores so that you can change or
monitor specific behaviors going forward.
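As a small illustration of the visibility you give up with a black box, here is a hedged Python sketch of checking one column for missing and outlying values yourself. It assumes missing values are stored as `None`, flags outliers with the common 1.5 × IQR rule (one convention among several), and fills gaps with the median. The column, the figures, and the helper names are hypothetical.

```python
from statistics import median, quantiles

def summarize_column(values):
    """Count missing values and flag outliers using the 1.5 * IQR rule."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "missing": len(values) - len(present),
        "median": median(present),
        "outliers": [v for v in present if v < low or v > high],
    }

def fill_missing(values, fill_value):
    """Replace each None with fill_value, leaving other values untouched."""
    return [fill_value if v is None else v for v in values]

# Hypothetical column of purchase amounts with two gaps and one extreme value
amounts = [25, 40, None, 35, 30, 5000, None, 45, 50, 20]

report = summarize_column(amounts)
cleaned = fill_missing(amounts, report["median"])

print(report)
print(cleaned)
```

Whether an extreme value should be kept, capped, or dropped is a judgment call that depends on knowing your data, which is exactly the decision a black-box process makes for you without telling you.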
Even if you know your data, modeling can help
A finished predictive model will do at least one of two things: confirm what you’ve always believed, or bring new insights to light. In our office, we refer to this idea as “turn or confirm” – a model will either turn or confirm the things you’ve thought to be true. Most of the time, models will do both. This lets you validate any anecdotal evidence you might have (or realize that correlations are not as strong as you thought) and take a look at new variables or connections that you may not have picked up on before.
Predictive models can be implemented quickly
I've heard some horror stories about a model taking months, or even years, to implement. If this is the case at your institution, you're doing it wrong. At this point, predictive modeling software has become incredibly efficient - usually able to turn out models within seconds or minutes. The bulk of time spent working on a model is typically spent on the data clean-up, which will vary from company to company. In any case, this is time well spent. Clean data is just as good for reporting, dashboarding, and visualizing as it is for predictive modeling.
Predictive models enhance human judgment rather than replace it
If models were meant to replace human judgment, I too would
be uncomfortable with and suspicious of the idea. However, 99% of the time, the aim
of predictive modeling is to enhance and expand human expertise, allowing us
(the end users) to be better informed and more data-driven in our
decision-making.
-Caitlin Garrett, Statistical Analyst at Rapid Insight