Friday, January 27, 2012

Is this thing on?

Hi everyone, Caitlin here on the Rapid Insight blog. I've been reading some analytics articles this week, and one term I've seen thrown around a lot is "big data". For those of you who aren't familiar with this term, it is used to describe the massive amounts of data that have been created over the past few years in response to technological advances and an increased effort to link data from disparate sources. Those of you who are familiar with this term know that there are positive and negative implications of using such a large expanse of data to create predictive models. Lots of data can mean the ability to predict broader outcomes with more accuracy, but it also leaves more room for overconfidence and error. Regardless of the merits and flaws in utilizing it, big data is here to stay.

One article in particular, Fast Company's "Why Big Data Won't Make You Smart, Rich, Or Pretty" (written by Daniel Rasmus), provides a more in-depth look at some of the challenges that big data poses and is worth a read. One such challenge that I'd like to focus on is what Rasmus headlines as "Complexity":
          
   "I was sitting with the CIO of a large insurance company in Portland. We were talking about generational hand-offs when he raised the issue of an Excel spreadsheet used to evaluate commercial property underwriting. He said one of the older members of the organization owned that spreadsheet and he was the only one who knew how it worked. The hand-off issue was not one of getting the older employee to collaborate with the younger employee, but one of complexity. That spreadsheet was complex and tightly woven into the employee's worldview. Although the transfer could theoretically take place, it is unknowable how long it would take, if the new employee would stay, or how the process would change as multiple worldviews collided. Combining models full of nuance and obscurity increases complexity. Organizations that plan complex uses of Big Data and the algorithms that analyze the data need to think about continuity and succession planning in order to maintain the accuracy and relevance of their models over time, and they need to be very cautious about the time it will take to integrate, and the value of results achieved, from data and models that border on the cryptic."

The amount of time and energy that goes into creating spreadsheets like the one mentioned here is incalculable. As times change and datasets grow, these spreadsheets are tasked with reconciling the information and formulas they already contain with a constant flow of new variables and considerations. As you can imagine, the rise of big data is only increasing the complexity of combining and augmenting various data and models. One thing I've learned from our customers is that the point of a hand-off is particularly tricky, especially when each spreadsheet can be full of its own nuances and quirks. The days when a spreadsheet was meant for only one person to understand, use, and manipulate seem to be fading and giving way to more collaborative and transparent efforts. This is the kind of environment in which I see Rapid Insight's data intelligence tool, Veera, being a big help.

Veera is able to accommodate large amounts of data because it runs data through user-created jobs without actually storing any of that data within the program. It has the same abilities that Excel does for manipulating variables and much more, which means there is no need to further obscure or compromise the accuracy of data with constant formula editing. Not only is the Veera platform easier to use for data manipulation, it is also transparent: you can open any node at any time if you want to take a look under the hood to see what changes are occurring, and how. Passing projects on is easy - you can drag and drop Veera jobs in and out of emails, and attach explanatory notes to every node in the job if you want. I find myself using the notes feature all the time, if only to help jog my memory after taking some time away from a particular job or data path. The ability to save jobs allows you to keep data processes on hand, so rather than recreating a job for each dataset you'd like to use, you can reuse a saved job as many times as needed, and of course edit it whenever necessary.

Although big data does come with some extra considerations, Veera is well equipped to deal with an increasing amount of data while accommodating needs for transparency and ease of explanation. With spreadsheets constantly being updated and sent from person to person, I feel that the less information one has to remember about the data, and the more that can be noted or observed during the analysis process, the better. Veera helps us adapt to the influx of big data, keeping past work intact and relevant while allowing plenty of room for change and growth as new challenges present themselves.

-Caitlin Garrett, Statistical Analyst at Rapid Insight