As anyone pursuing a big data initiative knows, every big data strategy really has two components: the technology and the people. The technology part is actually very simple to solve, relative to the people. As long as you're not trying to crack big data problems with relational database technology from 2004, this piece of the equation shouldn't be a big scary beast.
The first thing you should do is capture all the structured and unstructured information you can, even if you don't know what's going to be useful. Why? Because too many companies get so wrapped up in putting together a big data plan that six months can go by, and all the data they could have collected during that time is lost. So even if capturing all this data is an ugly path, I'd advise you to capture everything you can.
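To make "capture everything" a little more concrete, here is a minimal sketch, assuming nothing fancier than an append-only file of JSON lines. The function names `capture_event` and `read_events`, the `source` label, and the record fields are all hypothetical choices for illustration, not a recommendation of any particular platform. The point is only that raw data can be stored as-is now and interpreted later.

```python
import json
import time
from pathlib import Path


def capture_event(log_path, payload, source="unknown"):
    """Append one raw event to a JSON Lines file without interpreting it."""
    record = {
        "captured_at": time.time(),  # when we saw it, seconds since the epoch
        "source": source,            # hypothetical label for where it came from
        "payload": payload,          # stored as-is; structure is decided later
    }
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        # default=str keeps odd values (dates, decimals) from blocking capture
        f.write(json.dumps(record, default=str) + "\n")
    return record


def read_events(log_path):
    """Yield every captured record back, oldest first."""
    with Path(log_path).open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

Structured and unstructured inputs go through the same call, for example `capture_event("raw/events.jsonl", {"temp_f": 71}, source="sensor")` and `capture_event("raw/events.jsonl", "free-form text", source="notes")`. A real deployment would likely swap the file for one of the data stores discussed next, but the capture-first habit is the same.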
Once you have the data, you have to figure out what you are going to do with it and how you're going to report on it. This activity requires technical decisions about what kind of storage you'll use and which data platforms you'll run on. Are you going to capture it and deal with it out of a Cassandra database? Are you going to do it with Hadoop? Are you going to do it with another NoSQL data store like Riak?
The heart (and art) of data science
All of those decisions are driven by the people side, which is the biggest challenge I see with the big data world. Often this challenge comes down to a few uncomfortable questions: Who are the data scientists within your company? Where do they sit? What are they focused on? And, in fact, do you even have any? Just because you like that guy who used to run financial reports out of the PeopleSoft system doesn't mean you can take him and say, "Hey, now you're a data scientist. Here are 15 petabytes of data. Go find some insight."
Data science absolutely is an art. It should be called data art, actually, not data science. What do you need to be a data scientist? Obviously, you need some amazing analytical math skills and some computer science background so you can write a query in R. But you have to be an investigator as well, with an innate sense of curiosity. You don't make a data scientist just by taking an engineer and saying, "Hey, go write some queries." Most engineers would much prefer to hear, "Here are the requirements. Go execute against them." And that doesn't always instill curiosity.
So for me, the biggest gap in big data is data science. Who in your organization is going to do something with all this data? Do they really have the skills to find those insights and get those nuggets? These questions suggest other people-based challenges you must face for a big data initiative to stay on the rails:
- Are your big data efforts centralized in one organization or decentralized within the business units?
- If they're decentralized, how do you ensure that 14 teams aren't trying to find the same thing?
- Who funds the big data initiatives?
- How do you measure success?
The complications that arise from attempting to answer these questions say a lot about why most organizations haven't really solved the big data problem. As IT professionals, we can stand up all the infrastructure and technology and data platforms in the world, but if we don't have the right human drivers for these tools, then we're not going to go anywhere.
Bryson Koehler is executive vice president and CIO of The Weather Company. He is responsible for setting the strategic vision, financial planning, technical operations, direction and execution of strategic technology initiatives for the company. In the past, Bryson has worked as an operating partner in private equity and as SVP of global revenue and guest technology at InterContinental Hotels Group.