As organizations became engulfed in big data – high-volume, high-velocity, and/or high-variety information assets – the question quickly became how to effectively derive insight and business value from it.
“Big data naturally leads to advanced analytics. When we can capture a lot of information about a business topic that you can improve, you don’t want just to scratch the surface. You want to discover the unknown, find out the root cause, predict what will happen, address issues with extreme precision,” says Jean-Michel Franco, senior director of product at Talend. “This is more than what humans can do alone, without the help of the machine.”
Artificial intelligence (AI) for the enterprise has emerged as both a way to make sense of all that information and a discipline that in fact demands large data sets in order to perform.
It’s only natural, then, that big data and AI are often associated with each other today. “There is certainly a strong relationship between big data and AI,” says JP Baritugo, director at business transformation and outsourcing consultancy Pace Harmon. “Big data is the fuel, and AI is the means.”
However, some misunderstandings about AI and big data have emerged along the way, leading to potential confusion that IT leaders should clarify as organizations pursue data-driven strategies:
1. Some flavors of AI may not require big data
“The ‘garbage in, garbage out’ philosophy applies, as you need a sufficient amount of good data to drive meaningful value from your AI efforts,” Baritugo says. But how much data you need may vary. “Big data – where it means large data sets of both structured and unstructured data – feeds some applications of AI, [such as] when you need a lot of data to train AI, to analyze information to spot patterns and use probability to come up with answers to your questions,” explains Sarah Burnett, executive vice president and distinguished analyst at Everest Group. “Not all AI needs a lot of data.”
Some off-the-shelf chatbots, for example, may learn from more minimal input.
“AI, by design, typically requires large, normalized data sets (i.e., a ‘cleaned up’ subset of big data) to meaningfully discern patterns and generate the requisite outputs,” Baritugo says. “The volume of data required (including training and evaluation data sets) is chiefly driven by the complexity of the problem, the number of input features that need to be evaluated, and the algorithm used.”
Traditional machine learning (ML), for example, generally requires less training data than deep learning (itself a subset of machine learning).
2. Not all big data demands the application of AI
AI may help to drive analysis, but you don’t necessarily need it to extract value from big data. “Advanced analytics has been a concept most organizations have been taking advantage of for years. It really depends on the data set size and number of different data sets you need to analyze,” says Wayne Butterfield, director of cognitive automation and innovation at ISG. “With the greatest minds in the world it is impossible to find insightful patterns in huge datasets in anywhere like an adequate amount of time, so the ability for machine learning to do the heavy lifting is advantageous, but not all data sets are huge and varied, so you don’t always need ML in order to gain insight from it.”
IT organizations can also use business intelligence, analytics, and data warehousing solutions to analyze data and visualize insights.
3. Advanced analytics and AI are not the same
People often use the term “big data” more broadly to describe the advanced analysis of these information assets. That’s fine. But they may also assume that advanced analytics and AI are interchangeable terms. That’s incorrect.
“AI and advanced analytics are closely linked, but there are key differences,” says Burnett. “For example, AI can try out assumptions, self-learn, and enhance its analysis. Analytics, while it can analyze data, cannot self-learn and relies on people to set its parameters.”
4. Big data can skew AI models
“Big data brings the foundation for AI and ML. The more data you get, the better the models can be,” Franco says. “But data can also introduce bias into AI and ML when it is not in control.”
Too much focus on the quantity of data rather than its quality is often to blame. “AI and ML will inevitably fail when people can’t control the underlying data,” Franco says. “Collecting massive amounts of data into a data lake doesn’t bring sufficient foundation to success with AI and ML.”
Let’s examine three more common misunderstandings: