When you start running an application workload, everything seems simple: You run test data, everyone can see it, and it doesn’t matter where it runs. On-premises or in the cloud, it’s all the same. But once you start deploying real workloads, with real data and real processes, things change: Some of that data, and some of those processes, will be sensitive. So how should you decide where to place workloads, and how should you protect them once they’re there?
This question of how to find the right home for a workload is one you hear IT leaders voice, over and over again.
Let’s answer it by asking five related questions that will help you choose that home:
- What are sensitive data and sensitive processes?
- Who should have access, who shouldn’t?
- Who and what do I trust?
- What placement is appropriate?
- How can I control workload placement?
1. What are sensitive data and sensitive processes?
This question deserves a whole article on its own, which is why, when I started writing this section, I stopped and wrote that article. Read Sensitive data: Time to rethink your definition. But in case you don’t have time to read it in full, the long and short of it is that almost any data has the potential to be sensitive, depending on the context. Once you’ve identified what data you need to protect, and which of its properties you need to protect - whether that’s confidentiality, integrity, availability, correctness, or something else - it’s time to think about how to protect it.
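To make that a little more concrete, here is a minimal sketch of one way you might record, per piece of data, which properties need protecting. The class, names, and labels are all hypothetical, purely for illustration:

```python
# Minimal sketch: tagging data with the properties that need protecting.
# The class and the example assets are hypothetical.
from dataclasses import dataclass, field

@dataclass
class DataAsset:
    name: str
    # Which properties matter for this data: e.g. "confidentiality",
    # "integrity", "availability", "correctness".
    protected_properties: set[str] = field(default_factory=set)

payroll = DataAsset("payroll", {"confidentiality", "integrity"})
results = DataAsset("published-results", {"integrity", "availability"})
```

Note that the two assets need quite different protections: nobody minds who reads published results, but everyone minds if they are tampered with.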
2. Who should have access, who shouldn’t?
One of the things you will have done when working out what data and processes are sensitive is to understand in what contexts they are sensitive. That will give you some good indicators about which set of people should have access. Be aware that this set of people often changes over time: Let’s say that Alice is promoted and now has access to new data, or financial data that was confidential becomes public knowledge when company results are published.
The standard way to address this set of changes is by labelling data and assigning people roles, which can change as they move around the organisation: It is then fairly simple to restrict which roles have access to which data. This is often referred to as RBAC (Role-Based Access Control). In some contexts this is insufficient, so other attributes of the data or the people can be used instead, leading to alternative schemes such as ABAC (Attribute-Based Access Control).
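As a rough illustration, here is a minimal Python sketch of an RBAC check alongside an ABAC-style extension. The roles, labels, and rules are all hypothetical, chosen only to mirror the examples above:

```python
# Minimal sketch of RBAC, plus an ABAC-style extension.
# Roles, labels, and rules are hypothetical.
RBAC_POLICY = {
    "hr": {"payroll", "personnel"},
    "finance": {"payroll", "results"},
    "engineering": {"source-code"},
}

def rbac_allows(role: str, data_label: str) -> bool:
    """RBAC: access depends only on the subject's role."""
    return data_label in RBAC_POLICY.get(role, set())

def abac_allows(user: dict, data: dict) -> bool:
    """ABAC: access also depends on attributes of the data itself.
    Here, confidential results stop being restricted once published."""
    if data["label"] == "results" and data["published"]:
        return True  # public knowledge: anyone may read it
    return rbac_allows(user["role"], data["label"])

# Alice's promotion is modelled as a role change, not a policy rewrite.
alice = {"name": "Alice", "role": "finance"}
results = {"label": "results", "published": False}
print(abac_allows(alice, results))  # True: the finance role covers results
```

The point is not the specific rules but the shape: roles and attributes keep the people-to-data mapping manageable as both sides change over time.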
Eagle-eyed readers will have noticed that I changed the referent above from “data and processes” to just “data”. That’s because processes can be … awkward. Processes can change their sensitivity frequently, and sometimes unexpectedly - or maliciously. That browser that was displaying cat videos? Well, it’s now been hijacked by ransomware and is currently encrypting your drive in the background.
Processes are typically difficult to characterise in quite the same way as data, and a good rule of thumb is therefore to restrict them based on the worst that could happen if something went wrong.
And workloads, don’t forget, are processes, so you are going to need to think about worst-case scenarios for any particular workload-data combination.
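Here is a minimal sketch of that rule of thumb in Python, with hypothetical sensitivity levels: a workload is treated as being as sensitive as the most sensitive data it could reach, not the data it usually handles.

```python
# Minimal sketch of the "worst case" rule of thumb.
# Sensitivity levels and the example workload are hypothetical.
LEVELS = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}

def worst_case_level(reachable_data_levels: list[str]) -> str:
    """A workload inherits the highest sensitivity among the data
    it can reach, not the sensitivity of what it usually touches."""
    return max(reachable_data_levels, key=LEVELS.get)

# A browser "usually" shows public cat videos, but if it can reach
# restricted data on your drive, you plan for the worst, not the usual.
browser_reachable = ["public", "internal", "restricted"]
print(worst_case_level(browser_reachable))  # restricted
```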
[ Are you speaking the wrong language? See How to talk to normal people about security. ]
3. Who and what do I trust?
I’m from a security background, so my go-to answer is “nobody” – but even I realise that this is unrealistic. It is simpler to say that how much you trust people depends on context.
As with the data we discussed above, context is king when considering trust. Do you trust your fishmonger to manage your tax affairs? Would you trust your accountant to de-bone a fish? The same goes for the people operating and administering your systems: You trust them to do different things.
Specifically, we are talking about two sets of systems: the workloads that you run, and the hosts on which they run. You may already have controls in place to ensure that only your finance and HR teams can access the payroll workloads, but what about the hosts that the payroll workloads run on?
We don’t always realise it, but when a workload is running on a host - whether “bare metal” (directly on the host), in a container, or in a VM (virtual machine) - anyone, or any process, with administrative access to that machine has full control over that workload. It is not only that they can stop it from running: They can look inside it or even change the data that it contains. This is alarming because it means, for our payroll workload example, that we need to trust not only the finance and HR teams with the data, but also any administrators who have access to the host that is running the workload.
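To see how literal that control is, here is a minimal sketch (Linux only, run as root) of a host administrator reading live memory out of a workload process via /proc. The PID is hypothetical; the point is that nothing inside the workload can prevent this:

```python
# Minimal sketch: a host admin (root) reading the live memory of a
# workload process through /proc. The PID below is hypothetical.
pid = 1234  # hypothetical workload process on this host

# Find the first readable memory region mapped into the process.
with open(f"/proc/{pid}/maps") as maps:
    for line in maps:
        addr_range, perms = line.split()[:2]
        if perms.startswith("r"):
            start, end = (int(a, 16) for a in addr_range.split("-"))
            break

# Read that region straight out of the running process's address space.
with open(f"/proc/{pid}/mem", "rb") as mem:
    mem.seek(start)
    data = mem.read(min(4096, end - start))

print(f"Read {len(data)} bytes of the workload's memory as host admin")
```

Changing the data is no harder: the same /proc file can be opened for writing, which is exactly why “look inside it or even change the data” is not an exaggeration.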
You also, of course, need to be sure that the hosts themselves are adequately secured, because if an attacker manages to get into one of those hosts, then they now have control over your sensitive workloads.
[ Read also: Public cloud security: 4 myths and realities. ]
The combination of these two points - that you need to trust the people, but you also need to be assured that the hosts themselves are well-managed - allows us to start thinking about where we might put workloads.