How your cloud security practice can support teams working at wildly different maturity levels

Mark Nunnikhoven - Distinguished Cloud Strategist · September 16, 2022

One of the first steps in determining your cloud security strategy should be to understand your business needs.

We’ll dive into the “should” bit shortly. What’s important for the framing of this post is that one of the hardest things to deal with is the varying rates of change across different teams within the business.

The cloud enables more and more teams to build solutions. Not all of these teams work at the same rate, with the same tools, or with the same level of understanding.

Your security practice needs to support all of them. At the same time.

And—if only for your sense of stability—with minimal effort.

Pace matters

If you’re a runner, you know that pacing matters. If you try to sprint a marathon, you are not going to get very far.

Building technology used to always be a marathon. A project would start by gathering requirements, making sure those were locked in, and then working for weeks or months. The typical pace from start to finish was about a year.

Then, a year after gathering the requirements, a finished product would be presented to the customers, the intended users, with an expectation of success.

It rarely worked out.

This methodology is commonly referred to as “waterfall.” It’s not an approach that is used very often anymore, and for good reason: it was a marathon pace when a sprint was needed.

This pace simply doesn’t match the business need.

Smaller batches of work

Even before the cloud, teams were moving towards a working style that focused on smaller batches of work. These “sprints” would last a week or two and focus on delivering specific, smaller results.

After the sprint, the teams would check in with each other and the customer to make sure things were on track. If they weren’t, they adjusted as needed and started the next sprint.

This is the fundamental principle of how technology is built today.

As the cloud has matured, variations on the agile approach have developed, and they all adhere to that simple principle: plan, work, check in, adjust.

Is it perfect? No.

But the process is designed with that in mind. It works…most of the time.

Measuring development

Measuring the quality of development efforts is hard. Building technology is a complex endeavor and no single metric will tell you if you’re doing a good job.

The overall analysis shows that teams who work in smaller batches tend to produce more dependable results.

Furthermore, teams that implement consistent testing practices and strong monitoring and observability practices will meet their goals more often.

Teams in your organization that are working at different speeds will deliver differing results. The question is: how different?

The State of DevOps 2021 report provides some insight here.

Comparing the elite performers with the low-tier performers, the failure rate for any given change roughly doubles (0–15% vs. 16–30%). Where things get really eye-opening is when that statistic is combined with the lead time for changes and the time to restore a service after failure.

Low performers need more than six months’ lead time for changes and more than six months to restore a service after failure. The elite performing teams typically need less than an hour to implement a change or restore a service.

The elite performing teams are 6570 times faster than the low performers.

This is summarized on page 9 of the 2021 report. To dive deeper into this type of analysis, “Accelerate” by Nicole Forsgren, PhD, Jez Humble, and Gene Kim is the book on the subject.
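If you want to see where one of your own teams lands on these measures, two of them are straightforward to compute from deployment records. Here is a minimal sketch in Python. The `Deployment` record and its field names are assumptions made up for illustration (the report does not define a data format), so map them onto whatever your CI/CD system actually logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Sequence


@dataclass
class Deployment:
    """One production change. Field names are hypothetical."""
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when it was running in production
    failed: bool            # True if it degraded service and needed remediation


def change_failure_rate(deployments: Sequence[Deployment]) -> float:
    """Fraction of deployments that caused a failure, from 0.0 to 1.0."""
    if not deployments:
        return 0.0
    return sum(d.failed for d in deployments) / len(deployments)


def median_lead_time(deployments: Sequence[Deployment]) -> timedelta:
    """Median time from commit to running in production.

    Raises statistics.StatisticsError if there are no deployments.
    """
    return median(d.deployed_at - d.committed_at for d in deployments)
```

Running both functions over each team’s recent deployments gives you a rough, comparable picture of how far apart your teams really are.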

There are many ways to get there, but it’s clear that smaller batches of work lead to better outcomes.

How does this impact security?

There is a direct parallel here to your security practice. Beyond the availability aspects of security—yes, everyone always forgets about that part of the security practice—smaller batches of work are easier to evaluate for risk.

If a team is only changing one configuration setting during a deployment, it’s a lot easier to track down a problem than it is with a big batch of work that changes hundreds of things at once.

This causal relationship between security outcomes and DevOps maturity was called out as early as the 2016 State of DevOps report. In that report, on page six, the team highlights that high performers spend fifty percent less time remediating security issues than other teams.

Why? Because they are more consistent and faster in their delivery, and because the higher quality of their work creates fewer security issues in the first place.

Remember, the CSA’s Egregious Eleven calls out misconfigurations as the second most prevalent threat in the cloud. Teams that aren’t moving quickly up the maturity ladder are going to make more mistakes like misconfigurations and take longer to catch those mistakes.

Adjusting your tooling to match

The first step for your security practice was understanding this mismatch in cadence among the teams that your security practice supports. Now that you have a deeper understanding of the impacts of the different paces of these teams, you have to make sure that your security practice has multiple outputs.

The information and work you do with an elite performing team cannot be the same as the work you do with a team taking their first steps in cloud.

At the same time, you can’t afford to roll out and maintain multiple sets of security tools to accommodate these differences.

Now you need to look for tooling and processes that will help you reduce the amount of time spent customizing the approach for these teams.

Your goal should be to have a system and process for consuming all of the data generated from your cloud environments, analyzing it, and then generating security insights from that data.

But it doesn’t stop there. With the map of your business processes from your cloud provider’s Cloud Adoption Framework (AWS, Azure, and Google Cloud each publish one) and your understanding of each team’s cloud maturity, you can customize the output for each team.

The right information at the right time

A team early in their cloud journey will need information that’s focused on your biggest security priorities. Misconfiguration data should be at the top of your list here.

When providing this information, you should take steps to enrich it with everything that a team with that knowledge and skill level would need to solve the problem.

This is where automation can play a big part in your security practice.

Let’s use the classic—if frustrating—example of a public Amazon S3 bucket. This scenario is commonly flagged by security tools (specifically tools in the CSPM and CNAPP categories). The assumption is that it’s bad.

Honestly, most of the time, exposing a storage bucket and all the files in it is a misconfiguration. Teams can easily misconfigure the access policy and unnecessarily expose your organization’s data.

But it’s not always bad. In fact, AWS has a configuration mode that allows you to host a website from an S3 bucket. There are other use cases for public access too.

Piping the output of your security platform into a custom function adds more context and teaches the team about the issues around access control for this type of cloud resource. The result is better security outcomes.

This type of automation sounds fancy, but it’s really twenty lines of code that match up the event generated from your security platform to the CSP’s documentation and some specific guidance that matches your organization’s risk appetite.
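To make that concrete, here is a rough sketch of what such a function might look like in Python. Everything about the event shape is an assumption: the `finding_type` key, the `s3_bucket_public` value, and the guidance text are invented for illustration, and your security platform will have its own schema. The documentation links point at the AWS S3 user guide and should be checked against the current docs before you rely on them.

```python
# Hypothetical lookup table: finding type -> CSP docs + your organization's guidance.
GUIDANCE = {
    "s3_bucket_public": {
        "docs": [
            "https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html",
            "https://docs.aws.amazon.com/AmazonS3/latest/userguide/WebsiteHosting.html",
        ],
        "guidance": (
            "Public buckets are only acceptable for approved use cases such as "
            "static website hosting. If this bucket is not one of them, turn on "
            "Block Public Access and review the bucket policy."
        ),
    },
}

DEFAULT_GUIDANCE = "No guidance on file for this finding. Contact the security team."


def enrich_finding(event: dict) -> dict:
    """Return a copy of the event with documentation and guidance attached.

    The input event is left unmodified so other consumers still see the
    raw output of the security platform.
    """
    enriched = dict(event)
    entry = GUIDANCE.get(event.get("finding_type"))
    if entry is None:
        enriched["docs"] = []
        enriched["guidance"] = DEFAULT_GUIDANCE
    else:
        enriched["docs"] = list(entry["docs"])
        enriched["guidance"] = entry["guidance"]
    return enriched
```

Keeping the guidance in a plain lookup table means the security team can add a new finding type, or adjust the wording to reflect a change in risk appetite, without touching the function itself.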

Now, do those elite level teams need that level of educational guidance? Nope.

It doesn’t hurt, but it’s probably more efficient to pipe the event directly from the security platform to their CI/CD workflow. They already know the issues and how to address them given their use case.
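One way to picture that split is a small router that sits behind your security platform and decides where each finding goes based on the owning team’s maturity. The sketch below is in Python and is only an illustration: the team names, the `team` key on the event, the two-level `Maturity` scale, and the two delivery callbacks are all assumptions, standing in for whatever ticketing, chat, or CI/CD integrations you actually use.

```python
from enum import Enum
from typing import Callable


class Maturity(Enum):
    EARLY = "early"
    ELITE = "elite"


# Hypothetical mapping maintained by the security team.
TEAM_MATURITY = {
    "payments": Maturity.ELITE,
    "marketing-site": Maturity.EARLY,
}


def route_finding(
    event: dict,
    send_to_pipeline: Callable[[dict], None],
    send_with_guidance: Callable[[dict], None],
) -> str:
    """Deliver a finding to the channel that suits the owning team.

    Returns the name of the route taken. Teams that are not in the
    mapping are treated as early in their journey, which is the safer default.
    """
    maturity = TEAM_MATURITY.get(event.get("team"), Maturity.EARLY)
    if maturity is Maturity.ELITE:
        # Elite teams get the raw event in their CI/CD workflow.
        send_to_pipeline(event)
        return "pipeline"
    # Everyone else gets the event along with educational context.
    send_with_guidance(event)
    return "guidance"
```

The same finding comes out of one security platform either way. Only the last hop changes, which is what keeps you from maintaining multiple sets of tools.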

A spectrum of support

The philosophy of DevOps has taken over the building community for good reason. When done well, it leads to better, more consistent outcomes. This is why your security practice needs to adopt the same mentality and adjust to the realities of your organization’s cloud usage.

Working through these issues, it becomes apparent that your security practice needs to be able to provide a spectrum of support that matches your organization’s varying needs.

Start by setting up your security tooling and processes so that the right information reaches the right team in a timely manner. That will help those teams take direct action to resolve the issue.

From there, it’s a matter of putting that into action through automation. Ideally, you’d be able to solve it all in one fell swoop. But we all know that isn’t possible.

It may sound counterintuitive, but start with the teams furthest along in their cloud journey.

That will force your security practice to make sure your tooling can keep up with those advanced needs. More important for the success of your program, those elite teams are already in a position to act on the information you’re providing. They are going to be able to generate a lot of small improvements.

Then move to the teams earliest in their cloud journey. Here’s where you’ll see the biggest gains (vs. the small tweaks of the elite teams) by focusing on education. This is where your teamwork will produce big improvements to your security practice.

Just remember, like everything in cybersecurity, this is a continuous process. There are always improvements to be made and teams you can help.