It turns out I was wrong… which happens at an alarmingly increasing rate these days, though I chalk that up to a thirst to challenge myself… errr, my story!
So, for a while now, I had convinced myself that I knew the most important thing about doing predictive analytics successfully: the accuracy of the data and the model (separating the signal from the noise). Veracity, as they say. After working with a few clients lately, though, I no longer think that’s the case. It seems the most important thing is actually the first thing: What is the thing you want to know? The Question.
As technologists, we often tend to over-complicate and possibly over-engineer. And it’s easy to make predictive analytics all about the how: the myriad ways to integrate large volumes and exotic varieties of data, the many statistical models to evaluate for fit, the integration of the technology components, the visualization techniques that best surface results, and so on. All of that has its place. But first, and most importantly, we need to articulate the business problem and the question we want answered.
What do we want to achieve from the analytics? How will the results help us make a decision?
Easy as that sounds, in practice it is not particularly easy to articulate the business question. It requires a real understanding of the business, its underlying operations, its data and analytics, and what would really move the needle. We need to pair the subject matter expert (say, the line of business owner) with a quant or a data scientist and facilitate the conversation. This is where we figure out the general shape and size of the result and why it would matter, and also what data (internal and external) feeds into it.
Articulating The Question engages the rest of the machinery. Answers are the outcome we care about. The process and the machinery (see below for how we do it) give us repeatability and ways to experiment with both asking questions and getting answers.