Our next customer webinar, "Strategic Enrollment Management: St. Michael's College and Predictive Analytics," will be given by Bill Anderson, CIO of Saint Michael's College, today at 2pm EDT and will be re-broadcast on Tuesday, March 26th, and Thursday, May 2nd.
I got the chance to ask him a couple of questions about his session, which will describe the ways in which Veera and Analytics are utilized on campus to produce predictions and other analyses for the scoring team.
What types of models have you been building?
Almost entirely enrollment management - mostly apply-to-enroll. We've been building them on and off for about five years now. I have someone on campus whom I collaborate with, and when we first started she was using SPSS for the statistical analysis, but we've since abandoned that.
How has model building changed your Enrollment and/or Financial Aid practices?
There have been a number of ways that we've used the models: one, as a sort of verification of what our consultant has been doing; two, to be able to do some sensitivity and what-if analysis (and suggest different practices or emphases on where the aid awards should go); and three, to help confirm in-semester and in-process predictions of where the class is going to end up.
On some occasions, this has impacted the size of the waiting list or the way we thought about awarding wait-list spots, including the total number of admits. This last year, our model suggested that we could be more selective than we had been in the past.
What do you hope attendees will learn from your presentation?
One thing is that you can do it on your own - it's not that hard. You have to have a background that supports responsible interpretation of the results, but you can sit down and do it. That's one element: just do it. I think there's another element that says once you start thinking this way, it can become infectious. In our enrollment management meetings, we have the opportunity to appeal to the data or look at a Veera job that identifies the applicants we could avoid accepting. This changes the internal conversation - from a culture of anecdote, you can change the conversation with data. The use of the products has been fabulous in terms of making the data accessible to people.
Tuesday, March 12, 2013
Rapid Insight's 5th Annual User Conference
Let the countdown to the 5th annual Rapid Insight User Conference begin! Here’s what you need to know about this fun and informative event:
We are making one big change this year: we've outgrown our space here in NH and are hosting the conference on the campus of Yale University in New Haven, Connecticut. It will kick off at 9am on Thursday, June 27th and wrap up by 4pm on Friday, June 28th. The cost of the conference is $150 per attendee. In addition to the presentations and hands-on labs, we'll be providing a continental breakfast and an evening reception to all registrants.
For User Conference lodging, we recommend the Omni New Haven Hotel at Yale. We have arranged a special rate of $169/night + tax, available through 5/26. You'll find the dedicated Conference link to guarantee this rate, along with additional travel information, on the official User Conference webpage.
Be sure to check the Conference webpage frequently for updates on specific sessions and activities as the date draws near. We look forward to seeing you there!
Labels: customers, networking, Rapid Insight, User Conference
Tuesday, March 5, 2013
Six Predictive Modeling Mistakes
As we mentioned in our post on Data Preparation Mistakes, we've built many predictive models in the Rapid Insight office. During the predictive modeling process, there are many places where it's easy to make mistakes. Luckily, we've compiled a few here so you can learn from our mistakes and avoid them in your own analyses:
Failing to consider enough variables
When deciding which variables to audition for a model, you want to include anything you have on-hand that you think could possibly be predictive. Weeding out the extra variables is something that your modeling program will do, so don't be afraid to throw the kitchen sink at it for your first pass.
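For readers working outside of Veera or Analytics (which handle variable selection for you), here is a minimal sketch of the same idea in Python: hand everything to an L1-penalized logistic regression and let the penalty zero out uninformative variables. The file and column names are hypothetical.

```python
# Sketch only (not the Veera/Analytics workflow): a "kitchen sink" first pass
# where an L1 penalty weeds out variables that carry no signal.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("applicants.csv")                      # hypothetical file
y = df["enrolled"]                                      # 1 = enrolled, 0 = did not
X = df.drop(columns=["enrolled"]).select_dtypes("number").fillna(0)

model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
)
model.fit(X, y)

coefs = pd.Series(model.named_steps["logisticregression"].coef_[0], index=X.columns)
print("Variables kept:   ", list(coefs[coefs != 0].index))
print("Variables dropped:", list(coefs[coefs == 0].index))
```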
Not hand-crafting some additional variables
Any guide-list of variables should be used as just that – a guide – enriched by other variables that may be unique to your institution. If there are few unique variables to be had, consider creating some to augment your dataset. Try adding new fields like "distance from institution" or creating riffs and derivations of variables you already have.
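As one possible illustration of hand-crafting fields with pandas, the sketch below derives a "distance from institution" variable plus a couple of riffs on existing columns. The column names (zip_lat, zip_lon, sat_math, sat_verbal, app_date) and the campus coordinates are placeholders, not real data.

```python
# Sketch of deriving new variables; all names and coordinates are hypothetical.
import numpy as np
import pandas as pd

df = pd.read_csv("applicants.csv")

CAMPUS_LAT, CAMPUS_LON = 44.49, -73.16  # example campus coordinates

def haversine_miles(lat, lon, lat0, lon0):
    """Great-circle distance in miles between two latitude/longitude points."""
    lat, lon, lat0, lon0 = map(np.radians, (lat, lon, lat0, lon0))
    a = np.sin((lat - lat0) / 2) ** 2 + np.cos(lat) * np.cos(lat0) * np.sin((lon - lon0) / 2) ** 2
    return 3959 * 2 * np.arcsin(np.sqrt(a))

# "Distance from institution," measured from the applicant's home ZIP centroid
df["distance_from_institution"] = haversine_miles(
    df["zip_lat"], df["zip_lon"], CAMPUS_LAT, CAMPUS_LON
)

# Riffs and derivations of variables you already have
df["sat_total"] = df["sat_math"] + df["sat_verbal"]
df["applied_in_fall"] = pd.to_datetime(df["app_date"]).dt.month.isin([9, 10, 11]).astype(int)
```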
Selecting the wrong Y-variable
When building your dataset for a logistic regression model, you'll want to select the response with the smaller number of data points as your y-variable. A great example of this from the higher ed world would come from building a retention model. In most cases, you'll actually want to model attrition, identifying those students who are likely to leave (hopefully the smaller group!) rather than those who are likely to stay.
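In code, the difference is simply how you flag the response. A small sketch, with hypothetical column names, of coding the rarer outcome (attrition) as the "1" so that predicted probabilities rank students by risk of leaving:

```python
# Sketch: make the smaller group (students who left) the modeled response.
import pandas as pd

df = pd.read_csv("first_year_students.csv")   # hypothetical file

# Model attrition (left = 1), not retention (returned = 1)
df["left"] = (df["returned_fall"] == 0).astype(int)

print(df["left"].value_counts())  # confirm "left" is indeed the smaller group
```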
Not enough Y-variable responses
Along with making sure that your model population is large enough (1,000 records minimum) and spans enough time (3 years is good), you'll want to make sure that there are enough Y-variable responses to model. Generally, you'll want to shoot for at least 100 instances of the response you'd like to model.
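A quick pre-modeling sanity check based on those rules of thumb might look like the following sketch; the "left" column is the hypothetical attrition flag from the previous example.

```python
# Sanity check before modeling: enough records, enough responses.
import pandas as pd

df = pd.read_csv("model_population.csv")   # hypothetical file

n_records = len(df)
n_responses = int(df["left"].sum())

if n_records < 1000:
    print(f"Only {n_records} records; consider pooling additional years of data.")
if n_responses < 100:
    print(f"Only {n_responses} responses; the model may be unstable.")
```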
Building a model on the wrong population
To borrow an example from the world of fundraising, a model built to predict future giving will look a lot different for someone with a giving history than for someone who has never given before. Consider which population you'd eventually like to use the model to score, and build the model tailored to that population; or consider building two models, one for each sub-group.
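One way to sketch the two-model approach for the fundraising example: fit a separate model for prior donors and for never-givers, then score each record with the model built for its own sub-group. The column names and the choice of logistic regression here are illustrative assumptions, and the features are assumed to be numeric with no missing values.

```python
# Sketch: one model per sub-population (prior donors vs. never-givers).
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("constituents.csv")                       # hypothetical file
features = ["age", "years_since_graduation", "events_attended"]  # hypothetical

models = {}
for has_history, group in df.groupby(df["lifetime_giving"] > 0):
    m = LogisticRegression(max_iter=1000)
    m.fit(group[features], group["gave_this_year"])
    models[has_history] = m

# Score new records with the model matched to their sub-group
new = pd.read_csv("to_score.csv")
for has_history, model in models.items():
    mask = (new["lifetime_giving"] > 0) == has_history
    if mask.any():
        new.loc[mask, "score"] = model.predict_proba(new.loc[mask, features])[:, 1]
```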
Judging the quality of a model using one measure
It's difficult to capture the quality of a model in a single number, which is why modeling outputs provide so many model fit measures. Beyond the numbers, graphic outputs like decile analysis and lift analysis can provide visual insight into how well the model is fitting your data and what the gains from using a model are likely to be.
If you're not sure which model measures to focus on, ask around. If you know someone building models similar to yours, see which ones they rely on and what ranges they shoot for. The take-home point is that with all of the information available on a model output, you'll want to consider multiple gauges before deciding whether your model is worth moving forward with.
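For readers outside our tools, a rough sketch of looking at more than one gauge at once: an ROC AUC on held-out data plus a simple decile/lift table built from the predicted probabilities. The data, features, and response column are hypothetical.

```python
# Sketch: judge a model with several measures, not a single number.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("model_population.csv")                            # hypothetical
features = ["distance_from_institution", "sat_total", "visits"]     # hypothetical
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["enrolled"], test_size=0.3, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]

print("ROC AUC:", round(roc_auc_score(y_test, probs), 3))

# Decile analysis: rank by predicted probability, split into 10 groups, and
# compare each decile's actual response rate to the overall rate (lift).
results = pd.DataFrame({"prob": probs, "actual": y_test.to_numpy()})
results["decile"] = pd.qcut(results["prob"].rank(method="first"), 10, labels=False) + 1
table = results.groupby("decile")["actual"].agg(["count", "mean"])
table["lift"] = table["mean"] / results["actual"].mean()
print(table.sort_index(ascending=False))   # top decile first
```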
-Caitlin Garrett, Statistical Analyst at Rapid Insight
Photo Credit: http://www.flickr.com/photos/mattimattila/
Have you made any of the above mistakes? Tell us about it (and how you found it!) in the comments.
Labels: predictive analytics, predictive modeling, variables