As the use of artificial intelligence and machine learning applications grows within businesses, government, educational institutions, and other organizations, so does the likelihood of bias.
Researchers have found significant racial bias in facial recognition technology, for example, particularly in the underlying algorithms. That alone is a massive problem.
When you consider more broadly the role AI and ML will play in societal and business contexts, the problem of AI bias seems nearly limitless, and IT leaders and others need to pay close attention to it as they ramp up AI and ML implementations.
AI bias often begins with people, which runs counter to the popular narrative that we’ll all soon be controlled by AI robot overlords. Along with people, data becomes a key issue.
“Bias in AI is really a reflection of bias in the training data,” says Rick McFarland, chief data officer at LexisNexis Legal & Professional. “The best place to look for bias in your AI is in your training data.”
Guess who usually has their hands on that training data? People.
In a TED Talk, “How I’m fighting bias in algorithms,” MIT researcher Joy Buolamwini said of training data and bias in AI and ML technologies: “Why isn’t my face being detected? We have to look at how we give machines sight. … You create a training set with examples of faces. However, if the training sets aren’t really that diverse, any face that deviates too much from the established norm will be harder to detect.”
We asked a range of experts to weigh in on the questions IT leaders should be asking to identify and root out potential biases in their own AI systems. These could range from issues such as racial or gender bias to matters of bad analytics and confirmation bias. There’s not much business value in simply training your AI to tell you what you want to hear.
We also asked these experts for their insights on how organizations can think through and formulate their own answers to these questions. Here’s their advice, grouped into three overlapping categories: people, data, and management.
People questions to ask about AI bias
1. Who is building the algorithms?
Perhaps someday the sci-fi stories about AI replacing humans will come true. For now, though, AI bias exposes a fallacy in that narrative: AI is highly dependent on human input, especially in its early phases.
Phani Nagarjuna, chief analytics officer at Sutherland, recommends that IT leaders looking to reduce bias start by examining the teams that work most closely with the company’s AI applications.
“Often times, AI becomes a direct reflection of the people who assemble it,” Nagarjuna says. “An AI system will not only adapt to the same behaviors of its developers but reinforce them. The best way to prevent this is by making sure that the designers and developers who program AI systems incorporate cultural and inclusive dimensions of diversity from the start.”
Consider it another competitive edge for diverse and inclusive IT teams.
“Business and IT leaders should ask themselves: Does my team embody enough diversity in skills, background, and approach?” Nagarjuna says. “If this is something the team is lacking, it’s always best to bring in more team members with different experiences who can help represent a more balanced, comprehensive approach.”
2. Do your AI & ML teams take responsibility for how their work will be used?
Harry Glaser, co-founder and CEO of Periscope Data, notes that bias is less likely to occur when the people programming your AI and ML have a real stake in the outcomes. It’s somewhat like developers taking longer-term ownership of their code in DevOps culture.
Glaser adds a couple of follow-up questions to ask here:
“Do they have personal ownership over the short and long-term impact of their technical work? Are they empowered to be business owners for their projects – and not just technicians?”
3. Who should lead an organization’s effort to identify bias in its AI systems?
Rooting out AI bias is not an ad hoc task; it requires people who are actively seeking to find and eliminate it, according to Tod Northman, a partner at the law firm Tucker Ellis.
“It takes a unique background and skill set to identify potential bias in an AI system; even the question may be foreign to AI developers,” Northman says. “An organization that deploys an AI system must identify a lead for evaluating potential AI bias – one who has sufficient gravitas to command respect but who also has the right training and temperament to identify possible bias.”
4. How is my training data constructed?
“Some training data can be made without human involvement, such as data collected from devices, computers, or machines – think of your cell phone,” McFarland at LexisNexis says. “However, much of the AI training data used today is constructed by humans or has some sort of human involvement. Any AI built from any source of training data will amplify any biases in the training data – no matter the source.”
McFarland notes that as AI and ML platforms become easier for a wider range of people across an organization to use, the possibility for bias – or simply a lack of awareness of bias – increases.
“Developers are not really thinking about data requirements. Instead, they’re thinking about building the model and the product,” McFarland says. “That means untrained developers are not performing the critical, time-consuming training data tests. The consequence is that the model that uses the biased training data will amplify the biases 1,000-fold.”
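One of the simplest training data tests is to compare how often each group receives the favorable label before any model is trained. The Python sketch below is only an illustration of that idea: the function name, the `group` and `approved` fields, and the toy rows are all hypothetical, and a large gap is a prompt for human review, not proof of bias.

```python
from collections import defaultdict


def positive_rate_by_group(rows, group_key, label_key):
    """Return {group: share of that group's rows with a truthy label}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row[label_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical toy training set: one row per historical decision.
training_rows = [
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 0},
    {"group": "A", "approved": 1},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 1},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 0},
]

rates = positive_rate_by_group(training_rows, "group", "approved")
# Difference between the most and least favored group's label rate.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A model trained on rows like these would likely learn the same gap, so a check of this kind belongs before model building rather than after deployment.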
Data questions to ask about AI bias
5. Is the data set comprehensive?
Nagarjuna, the chief analytics officer at Sutherland, cites a Gartner prediction that an estimated 85 percent of all AI projects during the next several years will deliver flawed outcomes because of initial bias in data and algorithms.
This again reveals the fundamental relationship between people and data.
“An AI model is only as good as the data used to train it; it’s vital that decision-makers take a hard look at the external data and ensure it is both comprehensive and representative of all variables,” Nagarjuna says. “Environmental data is going to be instrumental in enabling outputs to be contextually sensitive beyond being just accurate in predictions.”
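One way to make "representative" concrete is to compare each group's share of the data set with its share of the population the system is meant to serve. The following Python sketch assumes you already have trustworthy reference shares; the function names, the 5-point tolerance, and the numbers are illustrative assumptions, not a standard.

```python
from collections import Counter


def representation_gaps(samples, reference_shares):
    """Return {group: dataset share minus reference share}."""
    counts = Counter(samples)
    total = len(samples)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


def underrepresented(samples, reference_shares, tolerance=0.05):
    """List groups whose dataset share falls short by more than `tolerance`."""
    gaps = representation_gaps(samples, reference_shares)
    return sorted(group for group, gap in gaps.items() if gap < -tolerance)


# Hypothetical data: one group label per training example.
samples = ["A"] * 70 + ["B"] * 25 + ["C"] * 5
# Hypothetical shares of each group in the population being served.
reference_shares = {"A": 0.60, "B": 0.25, "C": 0.15}

gaps = representation_gaps(samples, reference_shares)
flagged = underrepresented(samples, reference_shares)
print(gaps, flagged)
```

Here group C makes up 5 percent of the data but 15 percent of the reference population, so it is the one flagged for follow-up.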
6. Do you have multiple sources of data?
This has been a hot-button issue in data science and analytics in general: You’re only as good as your data sources. AI is not much different in this sense.
“Have you thought holistically about the biases in your data, and worked to minimize them?” Glaser asks. “You can never eliminate bias in data, but you can be aware of it and work to mitigate it. And you can reduce bias by incorporating multiple sources of data.”
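To see what a second source can do, it helps to look at the group mix of each source on its own and then of the pooled data. This Python sketch is a toy illustration: the source names, the `urban` and `rural` labels, and the counts are invented, and pooling changes the mix without guaranteeing that the result is unbiased.

```python
from collections import Counter


def group_shares(records):
    """Return {group: share of records belonging to that group}."""
    counts = Counter(records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# Hypothetical sources, each a list of one group label per record.
sources = {
    "web_forms": ["urban"] * 90 + ["rural"] * 10,
    "field_surveys": ["urban"] * 40 + ["rural"] * 60,
}

# Group mix of each source on its own.
per_source = {name: group_shares(records) for name, records in sources.items()}
# Group mix after combining every source into one data set.
pooled = group_shares([r for records in sources.values() for r in records])

print(per_source)
print(pooled)
```

In this example the web forms alone are 90 percent urban, while the pooled data is 65 percent urban, which is the kind of shift worth checking against the population you actually serve.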