Continuing with the Forgotten Tabs series, the next tab we'll be focusing on is
the Means Analysis tab. The Means Analysis tab provides the mean, number of
observations, maximum value, and minimum value for any of the variables in your
dataset. You also have the option to use "means by" and "subclass by" to view
the means of subcategories across combinations of variables.
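For readers who want to see the idea outside the tool, here is a minimal sketch of a "means by" summary in plain Python. It is not how the Means Analysis tab is implemented; the `means_by` helper, the field names, and the sample records are all hypothetical and exist only to illustrate grouping a variable by a combination of other variables.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; the field names are illustrative only.
students = [
    {"year": 2010, "gender": "F", "gpa": 3.2},
    {"year": 2010, "gender": "M", "gpa": 2.8},
    {"year": 2011, "gender": "F", "gpa": 3.6},
    {"year": 2011, "gender": "F", "gpa": 3.0},
    {"year": 2011, "gender": "M", "gpa": 2.5},
]

def means_by(rows, value, by):
    """Summarize `value` (mean, n, min, max) within each combination of the `by` fields."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in by)
        groups[key].append(row[value])
    return {
        key: {"mean": mean(vals), "n": len(vals), "min": min(vals), "max": max(vals)}
        for key, vals in sorted(groups.items())
    }

# Group GPA by year, then by gender within each year.
summary = means_by(students, "gpa", by=["year", "gender"])
for key, stats in summary.items():
    print(key, stats)
```

Adding or removing a field in the `by` list changes how finely the data is split, which is roughly the role that "means by" and "subclass by" play together.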
The Means Analysis tab can also be useful for comparing data from different
cohorts or years in order to spot trends. In the example below, we're comparing
attrition rates by year, which lets us pick up on any changes that occur from
year to year. If one year had a much higher or lower attrition rate than the
others, we would see it here and could investigate why.
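The same year-by-year comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are made up, attrition is assumed to be stored as a 0/1 flag, and the 0.20 cutoff for calling a year "unusual" is an arbitrary choice for the example, not a recommended value.

```python
from collections import defaultdict

# Hypothetical data: (cohort year, attrition flag), 1 = student left, 0 = retained.
records = [
    (2009, 0), (2009, 1), (2009, 0), (2009, 0),
    (2010, 0), (2010, 0), (2010, 1), (2010, 0),
    (2011, 1), (2011, 1), (2011, 1), (2011, 0),
]

flags_by_year = defaultdict(list)
for year, left in records:
    flags_by_year[year].append(left)

# The mean of a 0/1 flag is the attrition rate for that year.
rates = {year: sum(flags) / len(flags) for year, flags in sorted(flags_by_year.items())}
overall = sum(left for _, left in records) / len(records)

# Assumed rule of thumb: flag years more than 0.20 away from the overall rate.
THRESHOLD = 0.20
unusual = [year for year, rate in rates.items() if abs(rate - overall) > THRESHOLD]

for year, rate in rates.items():
    print(year, f"{rate:.2f}")
print("Years worth a closer look:", unusual)
```

A flagged year is only a prompt to investigate; the summary does not say why the rate moved.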
You might also note that beyond looking at 'Attrition', we are also looking at
a variable called 'Predicted Attrition'. This variable holds the predicted
attrition probability that we've assigned to each student. Here, we've grouped
these values by year to see how well the average predicted probability for each
year matches that year's actual attrition rate. Comparing predicted values to
actual values shows us where our predictive model is weaker from year to year.
If we do find a weakness in predictive ability, we can go back and fine-tune
the model to incorporate our findings.
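As a rough sketch of that comparison, the snippet below averages a 0/1 attrition flag and a predicted probability within each year and reports the gap between them. The rows are invented for illustration, and the layout (year, actual, predicted) is an assumption rather than the tool's format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (year, actual attrition 0/1, predicted attrition probability).
rows = [
    (2010, 1, 0.70), (2010, 0, 0.20), (2010, 0, 0.30), (2010, 0, 0.10),
    (2011, 1, 0.40), (2011, 1, 0.30), (2011, 0, 0.20), (2011, 0, 0.10),
]

by_year = defaultdict(lambda: {"actual": [], "predicted": []})
for year, actual, predicted in rows:
    by_year[year]["actual"].append(actual)
    by_year[year]["predicted"].append(predicted)

comparison = {}
for year in sorted(by_year):
    actual_rate = mean(by_year[year]["actual"])
    predicted_rate = mean(by_year[year]["predicted"])
    comparison[year] = {
        "actual": actual_rate,
        "predicted": predicted_rate,
        # Positive gap: the model over-predicts attrition; negative: it under-predicts.
        "gap": predicted_rate - actual_rate,
    }

for year, stats in comparison.items():
    print(year, {name: round(value, 3) for name, value in stats.items()})
```

In this made-up data the model runs slightly high for 2010 and clearly low for 2011, which is the kind of year-specific weakness worth examining before refitting.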
-Caitlin Garrett, Statistical Analyst at Rapid Insight