4 steps to build a strong data science community across the enterprise

How can CIOs move past pockets of AI success to enterprise-wide advantage? Consider these four steps to accelerate results from data science, AI, and ML work.
In the race to quickly harness the power of data science, Artificial Intelligence (AI), and Machine Learning (ML), organizations have often kick-started initiatives by hiring data scientists and engineers within functional and regional areas of the business. While this approach offers speed, agility, and tight alignment to business requirements, it often comes at the expense of consistency and efficiency, and without consideration of best practices. The reality is that many organizations’ efforts in this space are falling short, with a majority of companies only piloting AI or using it in a single business process – and thus gaining only incremental benefits, according to a recent McKinsey study.

CIOs are often well positioned to play a proactive role in pulling together the collective power of disparate teams while elevating the entire practice, all without org changes.

The four steps below connect teams across people, processes, and technologies in order to accelerate data science, AI, and ML efforts and build competitive advantage.

1. Create better visibility between projects

This sounds simple, and the good news is that it really is. The reality, however, is that organizational boundaries often create friction between technical teams that inhibits understanding of project overlap and redundancies. One mechanism to overcome this challenge is to adopt a shared project management tool (such as Jira) so that teams can search across projects and easily find commonalities.

Establishing a searchable feature and model catalog represents another powerful mechanism to create better visibility across the enterprise. Often, valuable data features created by a given data science team can be reused across multiple business functions, saving both time and processing power. Asking teams to check the catalog before creating anything brand new just makes sense – and creates a more efficient organization.
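To make the idea of a searchable catalog concrete, here is a minimal sketch in Python. It is an illustration only, not a reference to any particular product: the class names, fields, team names, and example features are all hypothetical, and a real catalog would likely sit on a shared database or a dedicated feature store rather than in memory.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One reusable feature or model. All field names are illustrative."""
    name: str
    owner_team: str
    description: str
    tags: tuple[str, ...] = ()


class FeatureCatalog:
    """A tiny in-memory catalog that teams check before building something new."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Refuse duplicates so each name points at exactly one owner.
        if entry.name in self._entries:
            raise ValueError(f"'{entry.name}' is already registered")
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> list[CatalogEntry]:
        # Case-insensitive match against name, description, and tags.
        needle = keyword.lower()
        hits = [
            entry
            for entry in self._entries.values()
            if needle in entry.name.lower()
            or needle in entry.description.lower()
            or any(needle in tag.lower() for tag in entry.tags)
        ]
        return sorted(hits, key=lambda entry: entry.name)


if __name__ == "__main__":
    catalog = FeatureCatalog()
    catalog.register(
        CatalogEntry(
            name="customer_churn_score",
            owner_team="marketing-ds",  # hypothetical team name
            description="Monthly churn probability per customer",
            tags=("customer", "churn"),
        )
    )
    for hit in catalog.search("churn"):
        print(hit.name, "owned by", hit.owner_team)
```

The point of the sketch is the workflow rather than the storage: a team searches first, finds who already owns a similar feature, and only registers something new when nothing suitable exists.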

2. Spotlight a select group of cross-functional projects

While centralized resources may be scarce, creating one or two focused initiatives around key business areas that would have a cross-functional impact frequently translates into the big value gains often promised by data science, AI, and ML. These projects tend to naturally have the “gravitational pull” that interests folks all across the organization. Example projects include top-level predictive KPIs, customer initiatives, or next best product outputs.

One approach to kick-start these types of projects while encouraging collaboration comes in the form of hackathons, which often produce the seeds from which larger initiatives can grow. At a minimum, the IT org should be heavily engaged to support data access and the analysis environment. Oftentimes, though, it makes sense for IT to play a leadership role in coordinating these types of initiatives, given its unique bird’s-eye view of the business.

3. Strive for a single data science production environment

These days, it’s incredibly easy to stand up workspaces using one of the many cloud providers. However, establishing a single environment that meets the needs of data scientists and is adopted across the entire enterprise represents a completely different challenge. Meet this challenge and you can realize substantial benefits that can dramatically reduce friction between disparate organizations.

Wrapping a set of DevOps processes around the platform creates even greater efficiencies while helping the teams “speak the same language.” Here are a few examples:

  • Peer reviews of code and methodology
  • Version control using GitLab (or a similar tool)
  • Consistent model validation and monitoring standards
  • Consistent model metrics and measurements
  • Governance around data sourcing to ensure models are built on data from an agreed-upon source of truth

Deploying these processes and standards on a single platform establishes a common framework for teams to work from – and ultimately collaborate more effectively.
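As one possible illustration of how such standards could be enforced automatically, the Python sketch below checks a model's metadata against a shared rule set before it is promoted. Everything specific here is an assumption made for the example: the required metric names, the approved source names, and the shape of the `card` dictionary are hypothetical and would be defined by your own governance process.

```python
# Hypothetical enterprise-wide standards; real values come from your governance group.
REQUIRED_METRICS = {"auc", "precision", "recall"}
APPROVED_SOURCES = {"warehouse.customers", "warehouse.sales"}


def check_model_card(card: dict) -> list[str]:
    """Return a list of standards violations; an empty list means the card passes."""
    problems: list[str] = []

    # Consistent model metrics: every model reports the same agreed-upon set.
    missing = REQUIRED_METRICS - set(card.get("metrics", {}))
    if missing:
        problems.append("missing required metrics: " + ", ".join(sorted(missing)))

    # Data-sourcing governance: only agreed-upon sources of truth are allowed.
    unapproved = set(card.get("data_sources", [])) - APPROVED_SOURCES
    if unapproved:
        problems.append("unapproved data sources: " + ", ".join(sorted(unapproved)))

    # Peer review: someone other than the author must have signed off.
    if not card.get("reviewed_by"):
        problems.append("no peer reviewer recorded")

    return problems


if __name__ == "__main__":
    card = {
        "name": "churn_model_v2",  # hypothetical model
        "metrics": {"auc": 0.91},
        "data_sources": ["warehouse.customers", "laptop_export.csv"],
    }
    for problem in check_model_card(card):
        print("FAIL:", problem)
```

A check like this could run in the same pipeline that handles version control and deployment, so that every team is held to the same bar regardless of where it sits in the organization.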

4. Create new connection points internally and externally

Nobody wants to feel like they are working on an island. Yet, many data scientists find themselves alone within business units or functional teams without the support of folks with similar skill sets. Establishing some type of regular forum for data scientists and engineers to connect, share challenges, and showcase good work can make a huge difference for this group. That’s especially true for more junior team members. Instant messaging channels (such as Slack) can prove to be a lifeline for team members needing some quick guidance.

Additionally, establishing a formal relationship with a local university's statistics department can provide new perspectives on particularly challenging issues. These relationships also encourage continued learning, which most data science teams regard as a significant benefit.

Speeding up AI results

The key to building a strong data science community lies in understanding where your organization sits today and defining where you want to go. Ultimately, it’s a journey. Accelerating your progress on this journey can yield significant results, and these four steps aim to help you align the people, processes, and technologies to get there.

Vincent Stuntebeck
Vincent Stuntebeck is Sr. Director of Enterprise Data & Analytics at Red Hat. With more than 20 years of experience in a variety of data, technology, AI, and ML roles, Vincent is passionate about creating value from data in ways that transform organizations and products to be more insightful and impactful.