Data scientist is one of the most sought-after roles in today's technology organizations, and salaries for this important role are rising. In some IT organizations, CIOs may only have the budget to hire one data scientist. So when it comes time to hire a data scientist, you probably want that person to be a rockstar.
But when you're interviewing, there's one key skill that you should home in on that may not be obvious on the resumes you're vetting. It isn’t Python or R or Spark or some other new technology or platform. It isn’t the latest machine learning methods or algorithms. It isn’t being able to write AI algorithms from scratch or analyze terabytes of data in minutes.
While those are important – very important – they aren’t THE skill. In fact, the one skill that makes a data science rockstar isn't a technical skill at all – it's a so-called “soft skill”: the ability to communicate.
The candidates you're interviewing could be the smartest people in the world when it comes to creating some wild machine learning systems to build recommendation engines, but if they can’t communicate the “strategy” behind the system or their approach, they're going to have a hard time, and their potential is going to be unrealized.
[ Read more from Eric Brown: 4 big data myths, busted ]
What do I mean by “strategy”? When data scientists communicate their output or results, they need to be able to discuss more than the standard information (error rates, metrics, etc.). They also need to be able to hit the key "W" points: what, why, when, where, and who. They must be able to clearly define what they did, why they did it, when their approach works (and doesn’t work), where their data came from, and who will be affected by what they’ve done. If they can’t answer these questions succinctly and in a manner that a layperson can understand, they’re failing as a data scientist.
Two real-world examples – one rockstar, one not-rockstar
I have two recent examples to help highlight the difference between a data science rockstar (i.e., someone who communicates well) and someone who is not so much of a rockstar. I’ll give you the background on both and let you make up your own mind on which person you’d hire as your next data scientist. Both of these people work at the same organization.
Person One has been a data scientist for four years. She’s got a wide swath of experience in data exploration, feature engineering, machine learning, and data management. She’s had multiple projects over her career that required a deep dive into large datasets, and she’s had to use different systems, platforms and languages during her analysis.
For each project she works on, she keeps a running notebook with commentary, ideas, changes and reasons for doing what she’s doing – she is a scientist after all. When she provides updates to team members and management, she doesn’t just focus on the data, she focuses on what the data is able to communicate. She provides a thorough writeup of all her work with detailed notes about why things are being done the way they are done and how potential changes might affect the outcome of her work.
For project “wrap-up” documentation, she delivers an executive summary with many visualizations that succinctly describes the project, the work she did, why she did what she did, and what she thinks could be done to improve on it. In addition to the executive summary, she provides a thorough write-up that describes the entire process, with multiple appendices and explanatory statements for those who want to dive deeply into the project. When people are choosing team members for their projects, her name is the first to come out of their mouths.
Person Two has been a data scientist for four years (about one month longer than Person One). His background is very technical, and he is the “go-to” person for algorithms and programming languages within the team. He’s well thought of and can do just about anything that is thrown over the wall at him. He’s quite successful and is sought after for advice by people all over the company.
When he works on projects, he sort of “wings it” (his words) and keeps few notes about what he’s done and why he’s chosen the things he has chosen. For example, if you ask him why he chose Random Forests instead of Support Vector Machines on a project, he’ll tell you “because it worked better,” but he can’t explain what “better” means. Now, there aren’t many people who would argue against his choices on projects, and his work is rarely questioned. He’s good at what he does and nobody at the company questions his technical skills, but they always ask “what is he doing?” and “what did he do?” during and after projects.
For documentation and presentation of results, he puts together the basic report that is expected with the appropriate information, but people always have questions and are always “bothering him” (again … his words). When new projects are being considered, he’s usually last in line for inclusion because there's “just something about working with him” (actual words from his co-workers).
Who would you choose?
I’m assuming you know which of the two is the data science rockstar. While Person Two is technically more advanced than Person One, his communication skills are a bit behind Person One's. Person One is who everyone goes to for delivering the “best” data science outcomes in their organization. Communication is the difference. Person One is not only able to do the technical work, but also share the outcomes in a way that the organization can easily understand.
When you’re looking to hire a data science rockstar, look for one who can communicate or has the ability to improve communication skills. Additionally, as an organization, if you want to be a great data science and analytics company, you must have a great communications culture.
Agreed. Now it's just a matter of convincing recruiters.
I think this one skill should be present not only in data scientists but in any type of worker. Without the ability to communicate with both peers and customers, a worker could end up as a liability rather than an asset to the company.