Getting started with big data can be overwhelming, especially when you have a lot of diverse information available. We recently tackled a big data project at the Jet Propulsion Laboratory (JPL) to help our departments solve some of their biggest problems. This internal project was successful, in part, because of the way we approached it.
Here are some takeaways that could help you tackle your next big data project.
1. Identify the low-hanging fruit. When we went around and asked JPL employees what problems they wished they could solve, they didn’t hold back. We soon had a list of 70 problems. Next we did some prototyping to see whether it made sense to reach for the harder problems higher up the tree, and then we selected seven that counted as low-hanging because they could be addressed within our project time frame. By identifying low-hanging fruit, you’re setting your big data project up for success. You’ll be able to start your project knowing that your team has the time and data necessary to complete it.
2. Create a startup environment. We didn’t know whether the results of our data mining would be valuable, so we opted not to put these big data projects in the operations pipeline. Instead we chose to create our own internal startup, a move that kept policies and procedures from slowing us down. We even ripped out cubicles so our startup team could face each other instead of walls. A great way to get quick results is to make sure your team is set up to collaborate, which is something many startups do well.
3. Take it to the cloud. Rapid prototyping was important to our internal big data project. We didn’t want to affect our existing environment, nor did we want to lean on our already-busy IT team, so we worked in the cloud. We were also able to try out some cloud-based tools to help us visualize our data. Since you don’t have to install cloud-based tools across the enterprise, you can try them, and if they don’t work, throw them away and try something different.
4. Get visual early and often. Each time we had results to share with the teams, we found a way to present our findings visually and quickly so that they could iterate on them. By showing the data visually, we found that our sponsors would immediately analyze it and come back to us with questions that prompted us to dig even deeper. They’d say things like “That’s interesting. Why is that spike there?” or “Why is that dip there?” And often that turned out to be the valuable question.
5. Let teams analyze their own data. Whenever we tried to analyze the data for our sponsors, we made mistakes. If you make a mistake too early, you lose your credibility. Instead, take a consultative approach and say: “Here’s your data. I visualized it for you. What do you think?” That’s a very different approach from saying: “Here’s your data. Here’s what it means.”
While these takeaways may not apply to every big data endeavor you take on, hopefully they can help you save time and dig deeper in your projects.
Tom Soderstrom is Chief Technology and Innovation Officer, Office of the CIO at the Jet Propulsion Laboratory (JPL) in Los Angeles, CA, and a member of the Enterprisers Editorial Board. JPL is the lead U.S. center for robotic exploration of the solar system and conducts major programs in space-based Earth sciences, including the Mars Science Laboratory mission with the Curiosity rover. JPL currently has several dozen spacecraft and instruments conducting active missions in and outside of our solar system.