Rapid Insight: Data Analytics: October 2012

Wednesday, October 31, 2012

Customer Tips From... Brian Johnson (DonorBureau)

Today's installment in our customer tips series comes from Brian Johnson, the VP of Product and Operations at DonorBureau. Here are his tips:

1. You can have multiple select statements in the Query node, as long as every statement except the last one writes to a temp table. This has allowed me to automate the complicated SQL queries behind my reports so that they run every week.

2. If you need to create a new version of a scheduled report, start from the old version; the new one will inherit the original report's schedule and run times.

3. You can save output to Dropbox or Google Drive to distribute data to team members without filling up their inboxes.
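
To illustrate the first tip outside of Veera, here is a rough sketch of the same pattern using Python's built-in sqlite3 module. The table names, columns, and threshold are made up for illustration, and the exact temp table syntax your database expects in a Query node may differ from SQLite's.

```python
import sqlite3

# In-memory database with a small, hypothetical gifts table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gifts (donor_id INTEGER, amount REAL);
    INSERT INTO gifts VALUES (1, 50.0), (1, 25.0), (2, 100.0), (3, 10.0);
""")

# Intermediate steps each write their result to a temp table...
conn.execute("""
    CREATE TEMP TABLE donor_totals AS
    SELECT donor_id, SUM(amount) AS total
    FROM gifts
    GROUP BY donor_id
""")
conn.execute("""
    CREATE TEMP TABLE big_donors AS
    SELECT donor_id, total
    FROM donor_totals
    WHERE total >= 75
""")

# ...and only the final statement is a plain SELECT that returns rows.
report_rows = conn.execute(
    "SELECT donor_id, total FROM big_donors ORDER BY donor_id"
).fetchall()
print(report_rows)
```

Each intermediate result stays available to the statements that follow it, so a long report query can be broken into readable steps.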

Wednesday, October 24, 2012

How to Score a Dataset Using Veera

After you’ve ‘memorized’ the predictive model you’d like to use, you’re ready to start the scoring process. There are actually two ways to score a dataset using the Rapid Insight software suite. In this post, we’ll talk about how to import your scoring model into Veera and quickly score your dataset.

We’ll start at the point where you save your scoring model within Analytics. After memorizing your model in the Model tab, you’ll want to move down to the Compare Models tab. This tab allows you to compare any two models side-by-side. Once you’ve decided which model you like better, you’re ready to save it by selecting the model and clicking the “Save Scoring Model As” button, as shown below.

Analytics will prompt you to navigate to where you’d like the file to be saved, and will save it with a .rism (Rapid Insight Scoring Model) extension. Once you’ve saved the file, you’re ready to move into Veera to score your dataset.

In Veera, you’ll want to create a new job for scoring. In that job, bring in your input file (the file you’d like to score), and connect it to an output file. When configuring the output file, you can choose to write your scores to a file, a spreadsheet, or back to a database table. Once the input and output files are connected, you’ll import the scoring model between them. To do so, right-click on the line connecting the two files and select Wizard -> Import Scoring Model, as shown below:

You will need to navigate to where you saved your scoring (.rism) file and select it to finish the import. Once you’ve done so, you’ll see four or five new nodes populate on the line between your input and output file. These nodes, shown below, are Analytics’ way of communicating the scoring process to Veera.

One very important thing about this scoring process is that your model is not a black-box model – you can explore each step to see how your data is scored. Feel free to open each of the nodes and see what they are accomplishing. If you open the “Create New Variables” node, you’ll be able to see any of the transformations used in your predictive model; you can also access the model formula itself by opening the “Calculate Probability” node. To get the probability scores, go ahead and run your job. The scores will be written to a new column called “Probability” in your output file or database.

-Caitlin Garrett, Statistical Analyst at Rapid Insight

Wednesday, October 17, 2012

Customer Tips From... Dr. Nelle Moffett (California State University - Channel Islands)

The next installment in our customer tips series comes from Dr. Nelle Moffett, Director of Institutional Research at Cal State - Channel Islands. Nelle is an avid Veera user and loves building analytic processes. Here are her tips:

Use a Rename node before an output to select only the fields that you want and re-sort them in the desired sequence.
Use the Cleanse node liberally before any Transform node to remove or replace missing data. This will eliminate unexpected results when the Transform node encounters missing data.
To create your own variable labels, create a look-up table in Excel with the original values and the new value labels. Then merge this file with the original file using the field labeled "original" and keep the new field with the desired value labels.

Example of value file:

To calculate the percent of a certain characteristic in the dataset, first use a Transform node to set a flag for that characteristic, where 1 = has the characteristic and 0 = does not have the characteristic. Then use an Aggregate node and select the mean of that flag.
To update a job to the current form of a dataset, first make the connection to the updated dataset. Then double-click on the data node in the job. At the bottom of the window where it says "connection", click on the name of the data file and select the new version. Make sure all of the data fields that should be checked are checked, and save the changes. If you give the data node a generic (rather than dated) name, then it will still be appropriate as the data continues to be updated to the current date.
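
The flag-and-mean tip works because the average of a 0/1 flag is the share of rows that have the characteristic. Here is a small sketch of that arithmetic in plain Python; the records and the "in-state" characteristic are invented for illustration, and in Veera the same steps would be done with the Transform and Aggregate nodes.

```python
# Hypothetical student records; only the "state" field matters here.
records = [
    {"id": 1, "state": "CA"},
    {"id": 2, "state": "CA"},
    {"id": 3, "state": "NV"},
    {"id": 4, "state": "CA"},
]

# Step 1 (Transform): flag is 1 if the row has the characteristic, else 0.
flags = [1 if r["state"] == "CA" else 0 for r in records]

# Step 2 (Aggregate): the mean of the flag is the proportion with the characteristic.
share_in_state = sum(flags) / len(flags)
print(f"{share_in_state:.0%} of records are in-state")
```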

...Have tips of your own? Email them to caitlin.garrett@rapidinsightinc.com!

Tuesday, October 16, 2012

Fall Pricing Promotion

For both new and existing customers - now until November 16th!

Not a customer yet? Rapid Insight is offering 10% off your software purchase through November 16th, 2012.

Already feeling the love? We're offering existing customers the "add a license" promotion: add a license of either Rapid Insight Analytics or Veera for only $3,000 each.

We hope this is what you've been waiting for!

For more information, please contact Sheryl Kovalik at sheryl.kovalik@rapidinsightinc.com or (603) 447-0240 ext. 7568

Wednesday, October 10, 2012

Creating Variables: Fiscal Year

For those of you whose fiscal year is different from the calendar year, having a Fiscal Year variable can be a huge timesaver, which is why we've chosen it as the next entry in our Creating Variables series. Filtering and sorting on this variable make it easy to compare things like gifts on a fiscal year to fiscal year basis, as well as easily focus on one or more years of interest. Fortunately, creating this variable is easy by following these steps in Veera:

The first step is to hook your dataset to a transform node:

Once in the transform, select the date variable (in the form DD/MM/YY) you’d like to extract fiscal year from. Next, click on the “IF” button (the top button on the right-hand side) to generate an ‘if’ equation. In the Enter A Formula window, we’ll want to edit the auto-generated equation so it reads:

IF(month(A)>=7,year(A)+1,year(A))

where A is the date field that you’re extracting fiscal year from. This example is assuming a July 1 fiscal year start, which is why we used the number 7 (feel free to edit accordingly). Be sure to name the new variable and select “text” from the Result Type list before saving.

The formula is saying that if the month of the date field falls on or after the start of the new fiscal year, then set the fiscal year to the newer fiscal year (which is the year after the year of A, because the fiscal year will end in that next year). Otherwise, we’re setting the fiscal year equal to the year of the date field (because the fiscal year ends in that year).
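
If you'd like to check the logic outside of Veera, the same rule can be sketched in Python. The function name and the start_month parameter are our own for this example; the default of 7 matches the July 1 start assumed above.

```python
from datetime import date


def fiscal_year(d: date, start_month: int = 7) -> int:
    """Return the fiscal year a date falls in, labeled by the year it ends."""
    # Dates on or after the fiscal year start belong to the next year's label.
    return d.year + 1 if d.month >= start_month else d.year


# June 30, 2012 is still in FY2012; July 1, 2012 begins FY2013.
print(fiscal_year(date(2012, 6, 30)))
print(fiscal_year(date(2012, 7, 1)))
```

This returns a number; wrap the result in str() if you want it stored as text, as in the Veera example.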

Finito!

-Caitlin Garrett, Statistical Analyst at Rapid Insight

Friday, October 5, 2012

Fundraising: The Science

Now that we’ve discussed the art of fundraising, I think it’s only right that we focus a little bit on the science. After all, knowing which prospects are statistically most likely to give makes a gift officer’s contribution to the art of fundraising that much more successful. As I’ve mentioned before, my function in the fundraising spectrum is as an analyst, helping customers build models identifying which prospects are most likely to donate.

One of the most important things we do during the predictive modeling process is data preparation, which often means creating new variables from the data our customers have on-hand. I’d like to discuss some of these variables, as well as how and why to include them in a fundraising or advancement model. For the purposes of this blog entry, I’ll use a higher education example. Typically a higher education institution might have some extra variables, but these can be tailored to fit other institutions or excluded when not relevant.

Demographic Information

It’s always smart to have an idea of what each donor looks like at a demographic level. Variables to include here are things like age, gender, marital status, and any occupational data you might have available to you. In a higher education context, this would also include things like the constituent’s class year, major, whether or not their spouse is an alumnus, and whether they are a legacy alumnus (meaning a parent or grandparent also attended the institution). Additionally, we often create a “reunion year flag” indicating if the analysis year is a reunion year for that person, as donors are often more likely to give (and give larger gifts) during a reunion year.
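
As a rough illustration, a reunion year flag might be computed like this in Python. The five-year reunion cycle is purely an assumption for the example (institutions define reunion years differently), and the function and parameter names are hypothetical.

```python
def reunion_year_flag(class_year: int, analysis_year: int, interval: int = 5) -> int:
    """Return 1 if analysis_year is a reunion year for this class, else 0.

    Assumes reunions fall every `interval` years after graduation.
    """
    years_out = analysis_year - class_year
    return 1 if years_out > 0 and years_out % interval == 0 else 0


print(reunion_year_flag(2002, 2012))  # ten years out
print(reunion_year_flag(2003, 2012))  # nine years out
```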

Location Information

General information about each donor’s location like ZIP code, city, and county can be useful as categorical variables (treating people that live in each one as a group). Once we have a ZIP code, we always calculate a “distance from institution” variable using one of Veera’s pre-programmed functions. This new variable, which is measured in miles, gives you a solid idea of the relationship between location and giving. If you have access to census data, we recommend appending variables relating to neighborhood or housing type. Creating flag variables for wealthy neighborhood ZIP codes can also be useful; constituents coming from these areas may be more likely to give. Although this can be created at a more local level, we often start with Forbes’ list of the top 500 wealthiest ZIP codes in the US, which is available online at http://www.forbes.com/lists/2011/7/zip-codes-11_rank.html.
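
Veera handles the distance calculation for you, and we aren't showing its internals here. For readers curious about what such a calculation can look like, this is a sketch of the standard haversine (great-circle) formula in Python. It assumes you already have a latitude and longitude for each ZIP code and for your institution; the function name and the Earth radius constant are our choices for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # approximate mean radius of the Earth


def distance_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = radians(lat2 - lat1)
    d_lambda = radians(lon2 - lon1)
    # Haversine formula
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


# One degree of longitude along the equator is roughly 69 miles.
print(round(distance_miles(0.0, 0.0, 0.0, 1.0), 1))
```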

Contact History

The ways in which a donor engages with you can tell you a lot about their likelihood of giving. For starters, include variables pertaining to their event history. How many events have they attended? Which types of events are they attending? How many days since their last event? Answers to questions like these can sometimes turn out to be predictive of giving. This is also where your social media variables come into play; create flags for whether a constituent is following you on LinkedIn, Facebook, Twitter, Pinterest, etc. A donor following you on one or more of these sites is an indication that they want to be connected, and therefore they may be more likely to give. Conversely, if a constituent has indicated that they do not want to be contacted, you’ll want to include this information as well, as it can be very predictive.

Gift History

This brings us to our last and most predictive set of variables: giving history. These variables should answer all kinds of questions about what a giver looks like historically, like:

How many gifts have they given in their lifetime?
What was their last gift?
How many days since their first gift? How many days since their last gift?
Have they given in the past 12 months? If so, how much?
What is the velocity of the gifts - are they increasing, decreasing, or staying the same?

One thing to note here is that gift dates themselves aren’t useful in a predictive model, but their translations – like the number of days since an event – allow us to use the insight they provide.
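
To make the idea of translating gift dates concrete, here is a small Python sketch that turns a list of gifts into a few of the variables listed above. The field names are hypothetical, "the past 12 months" is approximated as 365 days, and the function expects at least one gift per donor.

```python
from datetime import date


def gift_features(gifts, as_of):
    """Summarize a donor's gifts as of a given analysis date.

    gifts: non-empty list of (gift_date, amount) tuples.
    """
    ordered = sorted(gifts)  # oldest gift first
    first_date, _ = ordered[0]
    last_date, last_amount = ordered[-1]
    # Gifts made within the 365 days up to and including the analysis date.
    recent = [amt for d, amt in ordered if 0 <= (as_of - d).days <= 365]
    return {
        "gift_count": len(ordered),
        "last_gift_amount": last_amount,
        "days_since_first_gift": (as_of - first_date).days,
        "days_since_last_gift": (as_of - last_date).days,
        "gave_last_12_months": 1 if recent else 0,
        "amount_last_12_months": sum(recent),
    }


example = gift_features(
    [(date(2010, 5, 1), 50.0), (date(2012, 9, 1), 100.0)],
    as_of=date(2012, 10, 5),
)
print(example)
```

The dates themselves never enter the model; only the counts, amounts, and day differences derived from them do.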

In building a predictive model, some of these variables may be predictive, while others might turn out not to be. It’s a good idea to include some combination of these variables, plus anything you have on-hand that you think could possibly be predictive.

-Caitlin Garrett, Statistical Analyst at Rapid Insight