Few technologies in recent years have been as pervasive as artificial intelligence (AI). From self-driving cars to robotic surgery to the applications we use every day, AI is all around us in both our personal and professional lives.
Worldwide revenues for AI, including software, hardware, and services, are forecast to grow 16.4 percent year over year in 2021 to $327.5 billion, according to IDC.
Every major technological breakthrough brings new challenges, and these must be addressed before AI can deliver on its full potential. To better understand the barriers to AI adoption, a new O’Reilly survey explores what those challenges are and how organizations can overcome them. Not surprisingly, lack of skills and difficulty hiring are the two biggest roadblocks.
[ Check out our primer on 10 key artificial intelligence terms for IT and business leaders: Cheat sheet: AI glossary. ]
This marks a significant shift from O'Reilly's data last year, when company culture presented the largest bottleneck to AI adoption. The trend implies a greater overall acceptance of AI – but also a real and persistent AI talent gap. While it’s not surprising that demand for AI expertise has exceeded supply, it’s important to understand which specific skills and professional titles are most critical to AI adoption. To find out, we asked the technologists actually using AI in production. Here are the three areas technology professionals need to develop in the era of AI.
1. ML modeling and data science skills
The percentage of respondents with AI products in production over the last year is flat when compared with 2020, and even 2019, which may reflect the AI skills gap. According to O’Reilly’s survey, respondents feel the skills shortage most acutely in the areas of machine learning (ML) modeling and data science (52 percent), understanding business use cases (49 percent), and data engineering (42 percent). The emphasis on understanding use cases also indicates that respondents realize AI is not a one-size-fits-all solution, which is important for industries such as healthcare, where domain expertise is vitally important.
On the other end of the spectrum, hyperparameter tuning (2 percent) wasn’t considered a problem, which may reflect the success of automated tools for building models. What’s more concerning is that workflow reproducibility (3 percent) was such a low priority. Being able to reproduce experimental results is critical to any science, and it’s a well-known problem in AI. Putting models into production that work differently than intended can present real problems for businesses. While technologists should focus on building the skills that put them in the best position to be hired for AI-focused roles, they should also understand the challenges they may face on the job.
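Reproducibility is also one of the easier habits to start building. As a minimal sketch of one common practice (the seed value here is illustrative, and the same idea extends to NumPy, TensorFlow, and PyTorch seeds):

```python
# A minimal sketch of one reproducibility practice: fixing random seeds
# so a training workflow can be repeated exactly. Only the standard
# library is used here; real pipelines would also seed NumPy/TensorFlow.
import random

SEED = 42  # illustrative value; any fixed, recorded integer works

random.seed(SEED)

# With the seed fixed, "random" operations such as shuffling a dataset
# produce the same order on every run, so results can be reproduced.
data = list(range(10))
random.shuffle(data)
print(data)  # identical output on every run with the same SEED
```

Recording the seed alongside the model's other metadata is what makes an experiment repeatable later.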
[ Want best practices for AI workloads? Get the eBook: Top considerations for building a production-ready AI/ML environment. ]
2. Tools of the trade: scikit-learn and TensorFlow
Respondents with mature practices clearly have their favorite tools. Scikit-learn and TensorFlow top the list (both used by 65 percent of respondents), with PyTorch not far behind. When asked which tools they planned to incorporate over the next year, roughly half of the respondents cited model monitoring and model visualization. Models require constant tweaking for reasons ranging from changing human behavior to changing datasets, and without updating, can become ineffective and even harmful. The ability to monitor performance will be increasingly important as businesses grow more reliant on AI.
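The core of model monitoring can be sketched in a few lines: track a rolling accuracy over recent predictions and flag the model when it drops. This is a hedged, library-free illustration; the window size and alert threshold are invented values, not figures from the survey.

```python
# Illustrative sketch of production model monitoring: keep a rolling
# window of prediction outcomes and raise a flag when accuracy drops
# below a threshold. Window size and threshold are arbitrary examples.
from collections import deque


class AccuracyMonitor:
    def __init__(self, window_size=100, alert_threshold=0.8):
        self.window = deque(maxlen=window_size)
        self.alert_threshold = alert_threshold

    def record(self, prediction, actual):
        """Log whether a single prediction matched the observed outcome."""
        self.window.append(prediction == actual)

    @property
    def accuracy(self):
        return sum(self.window) / len(self.window) if self.window else 1.0

    def needs_attention(self):
        # A full window with low accuracy suggests the model has drifted.
        return (len(self.window) == self.window.maxlen
                and self.accuracy < self.alert_threshold)


monitor = AccuracyMonitor(window_size=4, alert_threshold=0.8)
for pred, actual in [(1, 1), (0, 0), (1, 0), (0, 1)]:
    monitor.record(pred, actual)
print(monitor.accuracy)          # 0.5
print(monitor.needs_attention())  # True
```

Real deployments would feed alerts like this into dashboards or retraining pipelines, but the principle is the same: compare live behavior against expectations continuously, not just at release time.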
As questions of ethics and bias move to the forefront, monitoring model behavior will become even more important for addressing these problems swiftly. While earlier-stage respondents’ answers were similar, there was one big difference: a greater reliance on partnering with vendors that incorporate AutoML. This makes a lot of sense. For those developing AI skills, learning to use tools with automated components is a good jumping-off point.
[ Read also: 6 misconceptions about AIOps, explained. ]
3. Data preparation and collection
While it is encouraging that 18 percent of respondents reported concern with data quality, this is nowhere near high enough. A high number of errors exist in publicly available data sets, affecting the accuracy of AI results.
The title of a recent paper says it all: “Everyone wants to do the model work, not the data work.” This paper discusses how errors in data cascade into bad results, particularly in high-risk systems, such as medical systems and autonomous vehicles. It laments that academic programs teach students how to build models but ignore the problem of quality data, reinforcing the idea that model building is more fun and glamorous.
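The "data work" the paper describes is often unglamorous but simple in principle. As a toy sketch (the records and field names are invented for illustration), even basic cleaning means deciding what to do with missing values and duplicates before any model sees the data:

```python
# Toy sketch of basic data cleaning on tabular records: discard rows
# with missing fields and remove exact duplicates. The records and
# field names here are invented for illustration.
records = [
    {"id": 1, "age": 34, "label": "yes"},
    {"id": 2, "age": None, "label": "no"},  # missing value
    {"id": 1, "age": 34, "label": "yes"},   # duplicate of the first row
]


def clean(rows):
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            continue  # discard incomplete records
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # discard exact duplicates
        seen.add(key)
        cleaned.append(row)
    return cleaned


print(clean(records))  # [{'id': 1, 'age': 34, 'label': 'yes'}]
```

Production pipelines make these same decisions at scale, and each one (drop, impute, flag) changes what the model ultimately learns, which is exactly how data errors cascade into bad results.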
It’s no coincidence that the O'Reilly survey showed that 46 percent of the respondents weren’t using version control tools for data; the ability to track changes to data and reconstruct the data used to train any model is essential to maintaining data quality. Almost 10 years ago, former Chief Data Scientist of the United States Office of Science and Technology Policy DJ Patil said, “80 percent of the work in any data project is cleaning the data.” That’s still true, and the biggest shortage the industry faces may be people willing to clean the data.
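Even without a dedicated data version control tool, the underlying idea can be sketched simply: fingerprint a training file with a content hash so you can later verify exactly which data a model was trained on. This is a hedged, minimal illustration, not a substitute for a real data versioning system.

```python
# Minimal sketch of one data-versioning idea: compute a SHA-256
# fingerprint of a dataset file and store it with the model's metadata,
# so the exact training data can later be verified or reconstructed.
import hashlib


def dataset_fingerprint(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large datasets don't need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the fingerprint recorded at training time no longer matches the file, the data has changed since the model was built; tools like DVC and Git LFS build on this same principle.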
The AI skills gap has existed for some time, but now that it is the most significant barrier to wider adoption of the technology, it’s time to get serious about solving the problem. Education is power, and arming tech professionals with the competencies they need to secure AI jobs is the only way to overcome this hurdle. By focusing on the most in-demand skills, most popular tools, and relevant techniques, we can start to close the gap and truly begin to shape the next generation of AI.
[ Get exercises and approaches that make disparate teams stronger. Read the digital transformation ebook: Transformation Takes Practice. ]