Wednesday, October 24, 2012

How to Score a Dataset Using Veera


After you’ve ‘memorized’ the predictive model you’d like to use, you’re ready to start the scoring process. There are actually two ways to score a dataset using the Rapid Insight software suite. In this post, we’ll talk about how to import your scoring model into Veera and quickly score your dataset.

We’ll start at the point where you save your scoring model within Analytics. After memorizing your model in the Model tab, you’ll want to move down to the Compare Models tab. This tab allows you to compare any two models side-by-side. Once you’ve decided which model you like better, you’re ready to save it by selecting the model and clicking the “Save Scoring Model As” button, as shown below.


Analytics will prompt you to navigate to where you’d like the file to be saved, and will save it with a .rism (Rapid Insight Scoring Model) extension. Once you’ve saved the file, you’re ready to move into Veera to score your dataset.

In Veera, you’ll want to create a new job for scoring. In that job, bring in your input file (the file you’d like to score), and connect it to an output file. When configuring the output file, you can choose to write your scores to a file, a spreadsheet, or back to a database table. Once the input and output files are connected, you’ll import the scoring model between them. To do so, right-click on the line connecting the two files and select Wizard -> Import Scoring Model, as shown below:


You will need to navigate to where you saved your scoring (.rism) file and select it to finish the import. Once you’ve done so, you’ll see four or five new nodes appear on the line between your input and output files. These nodes, shown below, are Analytics’ way of communicating the scoring process to Veera.


One very important thing about this scoring process is that your model is not a black box: you can explore each step to see how your data is scored. Feel free to open each of the nodes and see what it does. If you open the “Create New Variables” node, you’ll be able to see any of the transformations used in your predictive model; you can also access the model formula itself by opening the “Calculate Probability” node. To get the probability scores, go ahead and run your job. The scores will be written to a new column called “Probability” in your output file or database.
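If it helps to picture what those nodes are doing, here is a rough Python sketch of the same idea: derive the transformed variables, apply a model formula, and write the result to a “Probability” column. It assumes a logistic-regression-style formula, which this post doesn’t confirm for your model, and the field names (gpa, distance_miles), the log transformation, and the coefficients are all made up for illustration. The real transformations and formula are whatever you find inside your own “Create New Variables” and “Calculate Probability” nodes.

```python
import math

# Hypothetical model: these coefficients and field names are invented for
# illustration. The real formula lives in the "Calculate Probability" node.
INTERCEPT = -1.5
COEFFICIENTS = {"gpa": 0.8, "log_distance": -0.4}


def create_new_variables(row):
    """Add derived inputs, analogous to a "Create New Variables" step."""
    derived = dict(row)
    # Assumed transformation: log of distance (plus 1 to avoid log(0)).
    derived["log_distance"] = math.log(row["distance_miles"] + 1.0)
    return derived


def calculate_probability(row):
    """Apply an assumed logistic formula to the transformed row."""
    linear = INTERCEPT + sum(
        coef * row[name] for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def score_dataset(rows):
    """Return copies of the input rows with a new "Probability" column."""
    scored = []
    for row in rows:
        derived = create_new_variables(row)
        output_row = dict(row)
        output_row["Probability"] = calculate_probability(derived)
        scored.append(output_row)
    return scored


# Two made-up input records standing in for the file you'd like to score.
records = [
    {"gpa": 3.6, "distance_miles": 12.0},
    {"gpa": 2.1, "distance_miles": 250.0},
]
scored = score_dataset(records)
for row in scored:
    print(row)
```

Each function stands in for one step on the line between the input and output, so you can check any step on its own, which is the same thing opening the individual nodes in Veera lets you do.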

-Caitlin Garrett, Statistical Analyst at Rapid Insight
