AWS Security Best Practices for 2020 You Need to Implement
October 31, 2019
The AWS Security Landscape
Every security professional dreads that email: “We found your customers’ private data on the internet, and it appears to have been stolen from your systems.” Capital One got that email this summer. Over 30 GB of data involving 106 million people was compromised. The issue had been floating around on social media for months, and no one at Capital One had any idea.
The company’s PR department spun up damage control operations immediately, making statements about how much data wasn’t stolen, which was faint comfort to those 106 million people. Yet Capital One is to be commended: it responded quickly and largely transparently to the situation. The attack was carried out by a somewhat unstable former Amazon employee going by the apt handle “Erratic.” Looking over the attacker’s Slack channel, which was still up after her arrest, it seems likely that many major corporations had been compromised. Do they know? Do they plan to disclose? It is not unreasonable to presume they are as in the dark as Capital One was the day before the email. Even now, law enforcement is still working to identify and notify victims and contain the damage.
It is important to keep in mind that, while the Capital One incident is probably the biggest to date, it is far from isolated. Uber had an AWS-related breach, whose disclosure almost a year later was heavily criticized. Even more damaging from a branding perspective, OneLogin, a company focused on keeping passwords safe, had its AWS passwords stolen, leading to a breach of its customers’ passwords. Two airline companies, Thai Lion Air and Malindo Air, also experienced a breach from a poorly configured Amazon S3 bucket.
One of the pitfalls for companies moving to the cloud is that deploying operating systems or containers has become so easy that software developers, rather than system architects and administrators, are frequently put in charge of migration projects. Many software developers do not have experience with security principles such as least privilege, file system rights, process separation, high availability, or network access controls. As a result, they tend to produce brilliant functionality that solves their company’s problems but is all too easy for a malicious actor to penetrate.
It is easy, in hindsight, to be overly critical of the corporations involved. Capital One has a large security budget and many respected professionals working for it, yet a lone actor was able to inflict substantial financial damage and drop its stock value almost 6% in a day. Could it be that, for all the security budget the company had, it was simply missing the right tools?
Capital One was the poster child for AWS efficiency and the power of the cloud in general. The harsh reality that this breach and others like it have laid bare is that the cloud poses unique challenges that need to be considered as we go into 2020. One of these is visibility. System administrators who are trained in on-prem infrastructure are accustomed to aggregation points in the network that can be monitored for intrusion detection, limited private IP address spaces for servers, administrative access restricted at the network level, and a reasonably static environment. The cloud shatters all these paradigms. Being able to determine what your systems are “doing” and who is accessing them is a new challenge. Network edges are considerably fuzzier, if there are any edges at all. With capacity-on-demand and containerization, it can be a challenge to even know what systems to defend. An administrator can be left to wonder, like a young Luke Skywalker, how do you fight an enemy you can’t even see? What is needed is an Obi-Wan that can teach us a new way of seeing. This is where Lacework comes in.
Defining your Cloud Security Strategy
In most cases, the first step to securing your cloud infrastructure is to define a strategy. This is a process of identifying the unique risks and threats your business faces and planning a set of controls that will be used to defend it. This process may seem daunting at first, but the truth is that while every company tends to have a few unique elements, most companies are quite similar at their core. Many companies opt for a formalized framework such as NIST CSF, ISO 27001, COBIT, CIS 20, or others. These frameworks define a minimum set of controls and act as a sort of pre-written strategy. Some, like NIST CSF, are more tailored to the cloud, but all have the goal of being a comprehensive approach. However you choose to define your strategy, the main point is to know what the end result should look like before you begin building. This will give you confidence in your goals and help you avoid distractions from the latest buzzwords floating in the news or offered by vendors.
Your cloud security strategy is almost certain to include controls such as:
- Monitoring users and permissions.
- Monitoring anomalous network traffic.
- Monitoring system (or container) processes, files, and user access.
- Checking configuration hardening and standardization from the container to the IaaS level.
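To make the first control in this list more concrete, here is a minimal Python sketch of a least-privilege check on an IAM policy document. It assumes the policy has already been loaded as a dictionary in the standard IAM JSON layout (`Statement`, `Effect`, `Action`, `Resource`). The function names and the example policy are hypothetical and only for illustration, and a real review would also need to consider conditions, `NotAction`, and resource-level permissions.

```python
from __future__ import annotations

from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts a single value or a list for these fields; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_overly_broad_statements(policy: dict) -> list[dict]:
    """Return Allow statements that pair a wildcard action with a wildcard resource."""
    findings = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            findings.append(stmt)
    return findings


# Hypothetical policy document, used only to exercise the check.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
    ],
}

for statement in find_overly_broad_statements(example_policy):
    print("Overly broad statement:", statement)
```

Only the first statement is reported, because the second one is scoped to a single action on a single bucket.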
The AWS Shared Responsibility Model and Everyone in Your Cloud
In designing your cloud security strategy you should keep in mind the AWS Shared Responsibility Model. Other cloud providers have similar models. It defines a demarcation line between the infrastructure and the application. Having a secure cloud does not mean you have secure data in the cloud. It does mean there is a foundation to build on and some traditional security responsibilities can be offloaded to the cloud provider. This model should guide your strategy.
AWS responsibility “Security of the Cloud”
AWS is responsible for protecting the infrastructure that runs all of the services offered in the AWS Cloud. This infrastructure is composed of the hardware, software, networking, and facilities that run AWS Cloud services.
Customer responsibility “Security in the Cloud”
Customer responsibility will be determined by the AWS Cloud services that a customer selects. This determines the amount of configuration work the customer must perform as part of their security responsibilities. For example, a service such as Amazon Elastic Compute Cloud (Amazon EC2) is categorized as Infrastructure as a Service (IaaS) and, as such, requires the customer to perform all of the necessary security configuration and management tasks.
Customers that deploy an Amazon EC2 instance are responsible for the management of the guest operating system (including updates and security patches), any application software or utilities installed by the customer on the instances, and the configuration of the AWS-provided firewall (called a security group) on each instance. For abstracted services, such as Amazon S3 and Amazon DynamoDB, AWS operates the infrastructure layer, the operating system, and platforms, and customers access the endpoints to store and retrieve data. Customers are responsible for managing their data (including encryption options), classifying their assets, and using IAM tools to apply the appropriate permissions.
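Because the security group is the customer’s responsibility, a simple audit of its rules is a reasonable place to start. The sketch below assumes a security group dictionary shaped like the entries returned by the EC2 `DescribeSecurityGroups` API (`IpPermissions`, `IpProtocol`, `FromPort`, `ToPort`, `IpRanges`, `Ipv6Ranges`). The choice of SSH and RDP as “sensitive” ports, the function name, and the example group are assumptions made for illustration.

```python
from __future__ import annotations

# Ports treated as sensitive here are an assumption (SSH and RDP).
SENSITIVE_PORTS = {22, 3389}
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_sensitive_ingress(security_group: dict) -> list[tuple[int, str]]:
    """Return (port, cidr) pairs where a sensitive port is reachable from anywhere."""
    findings = []
    for perm in security_group.get("IpPermissions", []):
        protocol = perm.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports.
            exposed = sorted(SENSITIVE_PORTS)
        elif protocol == "tcp":
            low, high = perm.get("FromPort"), perm.get("ToPort")
            if low is None or high is None:
                continue
            exposed = sorted(p for p in SENSITIVE_PORTS if low <= p <= high)
        else:
            continue
        cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
        for cidr in cidrs:
            if cidr in OPEN_CIDRS:
                findings.extend((port, cidr) for port in exposed)
    return findings


# Hypothetical security group, used only to exercise the check.
example_group = {
    "GroupId": "sg-0123456789abcdef0",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 3389, "ToPort": 3389,
         "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
    ],
}

print(open_sensitive_ingress(example_group))
```

In this example only the SSH rule is reported: HTTPS is open to the world but is not on the sensitive list, and RDP is restricted to a private range.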
This customer/AWS shared responsibility model also extends to IT controls. Just as the responsibility to operate the IT environment is shared between AWS and its customers, so is the management, operation, and verification of IT controls shared.
AWS Security Best Practices You Need to Implement
Because the cloud operates with agility and facilitates an ever-changing landscape, organizations have to construct their approach in a way that is comprehensive across all cloud operations but that also remains flexible enough to support changing technology demands and business needs.
That means first knowing what is going on in their environment, but then also having a way to measure the importance or severity of those activities. The approach mandates an understanding of the continuous state of three critical aspects of AWS security management:
1. Being Compliant and Managing Compliance in the Cloud
Ensure you are aware of all changes and updates to your configuration that could affect your adherence to regulations and established best practices such as the CIS benchmark for AWS.
The typical approach to managing compliance is to apply a checklist formula. Standards and governance frameworks state controls that the IT infrastructure must adhere to, and security and compliance teams implement corresponding controls and settings.
Such a process works in a static environment but is not effective in the cloud. As a highly dynamic system, the configurations and settings of accounts and resources change constantly, and workloads are regularly spun up and brought down. This agility is what sets the cloud apart and provides its major advantages. It also, however, means that the compliance process demands automation and continuous insight into controls, settings, and configurations.
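One simple way to picture continuous insight into configurations is to compare periodic snapshots and report what drifted. The sketch below assumes each snapshot is a flat dictionary of setting names to values. The snapshot keys and the function name are hypothetical, and how the snapshots are collected is left out.

```python
from __future__ import annotations


def diff_snapshots(before: dict, after: dict) -> dict[str, dict]:
    """Report settings that were added, removed, or changed between two snapshots."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken an hour apart.
snapshot_0900 = {"s3:example-bucket:public": False, "iam:root:mfa": True}
snapshot_1000 = {
    "s3:example-bucket:public": True,
    "iam:root:mfa": True,
    "ec2:sg-0123456789abcdef0:port22_open": True,
}

drift = diff_snapshots(snapshot_0900, snapshot_1000)
for kind, items in drift.items():
    print(kind, items)
```

Running a comparison like this on a schedule, or on every change event, is what turns a one-time checklist into an ongoing control.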
Awareness of cloud events and behaviors is the key factor in maintaining compliance. Since configurations change dynamically in order to allow for user groups or connections to new data sources, there’s really no constant state of the environment.
To be compliant, an organization must ensure continuous awareness of every action that might affect configurations. These actions are not one-size-fits-all, either; they happen at the application, ID, workload, and host layers of the cloud. This is where organizational and user data is being transacted, and because of the AWS Shared Responsibility Model, these layers are the domain of the customer.
A logical starting point is meeting the demands of the CIS Foundations Benchmark best practices. These are the guidelines from the Center for Internet Security (CIS) that outline the application of configurations to the layers within the AWS infrastructure. Used in conjunction with a continuous monitoring tool that delivers insights into configuration changes and anomalous activity, a security team can identify where issues exist that would prevent them from being compliant.
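As a rough sketch of how benchmark-style checks can be automated, the example below evaluates a small set of rules against an account snapshot. The three rules are loosely modeled on well-known CIS AWS Foundations recommendations (root MFA, multi-region CloudTrail, password length). The snapshot keys, the `CHECKS` table, and the function names are assumptions for illustration and are not the benchmark’s own identifiers.

```python
from __future__ import annotations

from typing import Callable

# Each entry maps a short description to a predicate over an account snapshot.
# The snapshot keys are hypothetical; a real collector would fill them in.
CHECKS: dict[str, Callable[[dict], bool]] = {
    "Root account has MFA enabled":
        lambda s: s.get("root_mfa_enabled") is True,
    "CloudTrail is enabled in all regions":
        lambda s: s.get("cloudtrail_multi_region") is True,
    "Password policy requires at least 14 characters":
        lambda s: s.get("password_min_length", 0) >= 14,
}


def evaluate(snapshot: dict) -> dict[str, bool]:
    """Run every check and return a pass/fail result per description."""
    return {name: check(snapshot) for name, check in CHECKS.items()}


def failed_checks(snapshot: dict) -> list[str]:
    """Return only the descriptions of checks that did not pass."""
    return [name for name, passed in evaluate(snapshot).items() if not passed]


# Hypothetical account snapshot.
example_snapshot = {
    "root_mfa_enabled": True,
    "cloudtrail_multi_region": False,
    "password_min_length": 8,
}

for name in failed_checks(example_snapshot):
    print("FAILED:", name)
```

Keeping the rules in a table makes it easy to add checks as a benchmark evolves without touching the evaluation logic.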
Lacework applies a continuous auditing approach, so security and compliance teams are immediately aware of any issues. The identification and analysis that Lacework applies is done at the velocity and pace of the cloud and generates reports with detailed information about where issues occur, who is responsible, and how they impact other events occurring in AWS.
Keep in mind that audits don’t examine only the present state; auditors look at the historical impact of an organization’s security posture and the measures used to ensure ongoing adherence to policies. Once you are out of compliance, the issue can be remediated, but if the offending setting goes unnoticed, you remain out of compliance until the audit reveals it. If auditors determine your cloud isn’t compliant with standards like PCI, SOC-2, or other compliance frameworks that are related to your business, you could lose your ability to operate. Most organizations understand this but still don’t have an organized approach for awareness. Automating continuous awareness of the activity in your cloud will provide a framework that can enable you to operate in compliance and securely.
2. AWS Account Security and CloudTrail Analysis
It’s critical to monitor the activities of your AWS accounts, including who is using what and which API calls are made to various AWS resources, because this monitoring helps detect anomalous activity.
As we’ve seen, AWS is specific in how security responsibility is distributed. That should make the job of AWS customers easier because it’s more defined. However, maintaining awareness and an applicable and actionable security posture over their organization’s data, users, and resources demands an effort that goes beyond just oversight.
AWS provides a variety of security-related tools that collect data about events and activities. These tools capture data, but they do not provide analysis, nor do they compare actions to normalized behaviors in order to assess the severity of issues.
One of these tools is AWS CloudTrail, a service that collects important data about the activities of your AWS accounts. CloudTrail logs provide an overview of changes and updates, but they are not necessarily relevant to your actual environment until you can view them through the context of how these events impact your configurations and settings.
To use CloudTrail effectively, you first need to frame the data that’s most relevant to you. Many teams use CloudTrail logs as a storehouse to reference when something goes awry, which requires forensic analysis of where and how issues happened. There is nothing wrong with that, but it is mostly after-the-fact data and won’t necessarily help you get smarter about the security and compliance of your organization. Threats cannot be averted unless you identify issues before they become incidents, and CloudTrail on its own isn’t designed to deliver that.
What’s really needed is analysis of CloudTrail logs. Lacework applies visibility, insight, and analysis capabilities to those logs, so users get a continuous and automated view into their environment.
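To illustrate what framing the relevant CloudTrail data might look like, here is a minimal Python sketch that reads a CloudTrail log file (a JSON document with a top-level `Records` array) and keeps only events on a watch list. The record fields used (`eventName`, `eventTime`, `awsRegion`, `userIdentity.arn`, and `additionalEventData.MFAUsed` on `ConsoleLogin` events) are standard CloudTrail fields. The watch list itself, the function name, and the sample records are assumptions for illustration.

```python
from __future__ import annotations

import json

# An assumed watch list of API calls worth a second look; tune it to your environment.
HIGH_VALUE_EVENTS = {
    "CreateBucket",
    "PutBucketPolicy",
    "AuthorizeSecurityGroupIngress",
    "DeactivateMFADevice",
    "CreateAccessKey",
}


def flag_events(log_file_text: str) -> list[dict]:
    """Return a short summary of each watch-listed event in a CloudTrail log file."""
    records = json.loads(log_file_text).get("Records", [])
    flagged = []
    for record in records:
        name = record.get("eventName")
        login_without_mfa = (
            name == "ConsoleLogin"
            and record.get("additionalEventData", {}).get("MFAUsed") == "No"
        )
        if name in HIGH_VALUE_EVENTS or login_without_mfa:
            flagged.append({
                "time": record.get("eventTime"),
                "event": name,
                "who": record.get("userIdentity", {}).get("arn", "unknown"),
                "region": record.get("awsRegion"),
            })
    return flagged


# Hypothetical, heavily trimmed log file content.
sample_log = json.dumps({"Records": [
    {"eventTime": "2019-10-31T12:00:00Z", "eventName": "DescribeInstances",
     "awsRegion": "us-east-1",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
    {"eventTime": "2019-10-31T12:05:00Z", "eventName": "CreateBucket",
     "awsRegion": "us-east-1",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
    {"eventTime": "2019-10-31T12:07:00Z", "eventName": "ConsoleLogin",
     "awsRegion": "us-east-1", "additionalEventData": {"MFAUsed": "No"},
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/bob"}},
]})

for finding in flag_events(sample_log):
    print(finding)
```

A filter like this only narrows the data; it does not judge whether a flagged event is normal for that user, which is the analysis step the surrounding text argues for.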
Lacework can act on account anomalies that are critically relevant to AWS account security. By integrating with AWS CloudTrail and analyzing CloudTrail data, Lacework can detect issues within AWS accounts, including:
- Irregular activity across AWS resources. This can be done in regions and/or accounts and can identify when new S3 buckets are launched and when there are changes within those resources.
- Unusual changes to users, roles, and any other type of access to apps and resources. This includes changes to security groups and cases where multi-factor authentication (MFA) is bypassed. Lacework uses machine learning to understand the normalized behaviors of users, accounts, services, and API calls, and alerts when there is an anomaly. Additionally, it is always monitoring defined, high-value events like S3 bucket creation and security group changes.
- All changes to AWS infrastructure services, which includes changes to access master keys, route table modifications, and anything related to network interfaces and services. High-risk anomalies are presented with insights so that the security team can rapidly investigate and fix potential incidents.
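The idea of comparing activity to normalized behavior can be sketched with a deliberately simple stand-in for machine learning: remember which API calls each principal has made before, and flag calls that principal has never made. This is not Lacework’s method, which the article does not detail. The function names and sample records are hypothetical, while the record fields (`userIdentity.arn`, `eventSource`, `eventName`) are standard CloudTrail fields.

```python
from __future__ import annotations

from collections import defaultdict


def principal(record: dict) -> str:
    """Identify who made the call, falling back to a placeholder."""
    return record.get("userIdentity", {}).get("arn", "unknown")


def call_signature(record: dict) -> tuple:
    """Describe the call as a (service, API name) pair."""
    return (record.get("eventSource"), record.get("eventName"))


def build_baseline(records: list[dict]) -> dict[str, set]:
    """Collect the set of calls each principal made during a learning period."""
    baseline = defaultdict(set)
    for record in records:
        baseline[principal(record)].add(call_signature(record))
    return baseline


def new_behavior(baseline: dict[str, set], records: list[dict]) -> list[dict]:
    """Return records whose call was never seen before for that principal."""
    return [
        r for r in records
        if call_signature(r) not in baseline.get(principal(r), set())
    ]


ALICE = {"arn": "arn:aws:iam::111122223333:user/alice"}

# Hypothetical learning-period and current records.
history = [
    {"userIdentity": ALICE, "eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"userIdentity": ALICE, "eventSource": "s3.amazonaws.com", "eventName": "PutObject"},
]
today = [
    {"userIdentity": ALICE, "eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"userIdentity": ALICE, "eventSource": "iam.amazonaws.com", "eventName": "CreateAccessKey"},
]

for record in new_behavior(build_baseline(history), today):
    print("New for this principal:", call_signature(record))
```

A set-membership baseline like this is noisy for new users and says nothing about frequency or timing, which is where statistical or machine learning models earn their keep.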
3. Host Monitoring and Intrusion Detection: Immediately Discover Anomalies and Instances of Forced Intrusions that Warrant Critical Alerts
Even after applying best practices and creating an organizational mindset around security, you can only really know what’s happening if issues are identified at the point at which data is collected. That requires an agent to be operating in workloads or containers, so insights can be discovered at the host-based level rather than the network level.
Lacework’s host-based intrusion detection system (HIDS) uses anomaly detection algorithms and machine learning to analyze every application and user behavior inside a workload. The coverage includes all issues on SSH, parent hierarchy, user privilege change, process communication, machine communication, internal and external data transfers, and other cloud events.
Whether changes were intentional or the result of an attack, configuration changes open the cloud environment to potential bad actors and threats. Intrusion detection done at the host-level, like with Lacework, detects anomalies across all layers, leaving no hidden space at the application and data layer for bad actors to hide.
One of the key differentiators for Lacework’s approach is that events are analyzed against normalized behavior, which eliminates unending alerts and instead only surfaces those issues that are truly problematic. Lacework addresses issues with behavior baselining. Rather than looking at every machine, user, and application individually, our approach is to cluster these together based on historical behavior analysis, and alert when behavior is abnormal. The result is that, rather than being alerted multiple times for activities on multiple machines that all operate according to the same behaviors, we can alert you on the few issues that deviate from the norm.
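The clustering idea can be illustrated with a toy example. The sketch below groups hosts whose historical behavior is identical and then raises one alert per unexpected behavior per group, rather than one alert per machine. It is a simplified illustration under stated assumptions, not Lacework’s algorithm: behaviors are modeled as hypothetical `(process, destination port)` pairs, and clusters are formed by exact match rather than by a learned similarity measure.

```python
from __future__ import annotations

from collections import defaultdict


def cluster_hosts(history: dict[str, set]) -> dict[frozenset, list[str]]:
    """Group hosts that showed exactly the same set of behaviors in the past."""
    clusters = defaultdict(list)
    for host, behaviors in history.items():
        clusters[frozenset(behaviors)].append(host)
    return dict(clusters)


def deviations(clusters: dict[frozenset, list[str]],
               observed: dict[str, set]) -> list[tuple]:
    """Return one (behavior, hosts) alert per behavior outside a cluster's baseline."""
    alerts = []
    for baseline, hosts in clusters.items():
        unexpected = defaultdict(list)
        for host in hosts:
            for behavior in observed.get(host, set()) - baseline:
                unexpected[behavior].append(host)
        for behavior, offenders in sorted(unexpected.items()):
            alerts.append((behavior, sorted(offenders)))
    return alerts


# Hypothetical history: three web servers behave alike, one database differs.
history = {
    "web-1": {("nginx", 443), ("nginx", 5432)},
    "web-2": {("nginx", 443), ("nginx", 5432)},
    "web-3": {("nginx", 443), ("nginx", 5432)},
    "db-1": {("postgres", 5432)},
}
# Hypothetical current observations: web-2 starts an unexpected shell connection.
observed = {
    "web-1": {("nginx", 443)},
    "web-2": {("nginx", 443), ("bash", 4444)},
    "web-3": {("nginx", 5432)},
    "db-1": {("postgres", 5432)},
}

for behavior, hosts in deviations(cluster_hosts(history), observed):
    print("Abnormal behavior", behavior, "on", hosts)
```

Normal traffic on all four hosts produces no alerts, and the single deviation on web-2 produces exactly one.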
Organizations that apply effective security measures have continuous awareness of cloud events and understand who has access (and what they have access to), the configurations and settings of cloud resources, and the connections among and between applications and data conductors like APIs.
By applying the right tools and skill sets within their security teams, organizations can gain control over their cloud environment through the lens of the security elements outlined above.
Next Steps on Securing Your AWS Environment
Because of the shared responsibility model of the cloud, security controls tend to move to a more granular approach at the server or container level rather than more traditional network aggregation points. Authentication must be stronger, and systems hardened, rather than relying on edge controls to protect a classic DMZ. Lacework is a great way to fill this gap with its powerful Host Intrusion Detection System (HIDS). Lacework can operate at the system or container level to monitor access as well as looking for concerns in files, running processes, or network traffic. Modern cloud environments are in constant flux. Being able to deploy your security platform automatically for full coverage and continuous security is a critical feature. Lacework has designed one of the industry’s easiest deployment models. Lacework also has tight integration with major cloud providers including AWS. It leverages this integration to look for common errors in configuration and other security events that can reveal problems aligning with many of the most popular frameworks. This can present administrators with an accurate punch list of configuration settings that need review.
With any modern security strategy, you will also need a tool to focus the data from all of the controls into a single view. There will undoubtedly be more data than any human can sort through, so organizing and searching this data must be sophisticated and automatic. It is in these higher-level features of sorting data and surfacing important events that the best security solutions differentiate themselves. Even if a system’s logs record every administrative action, it does little good until the logs are collected and suspicious activity is identified. Likewise, an Intrusion Detection System is more powerful when its alerts are viewed as trends over time and correlated with system and CloudTrail logs to see the entire story.
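Correlation of this kind can be pictured as a time-window join. The sketch below pairs each host alert with the CloudTrail-style events that touched the same instance in the preceding 30 minutes. The record layout (`time`, `instance_id`, `event`, `rule`), the window size, and the function names are assumptions for illustration; real CloudTrail records would need to be flattened into this shape first.

```python
from __future__ import annotations

from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # the ISO 8601 UTC form CloudTrail uses for eventTime


def parse_time(value: str) -> datetime:
    return datetime.strptime(value, TIME_FORMAT)


def correlate(alerts: list[dict], trail_events: list[dict],
              window: timedelta = timedelta(minutes=30)) -> list[dict]:
    """Attach to each alert the events on the same instance within the window before it."""
    timeline = []
    for alert in alerts:
        alert_time = parse_time(alert["time"])
        related = [
            e for e in trail_events
            if e["instance_id"] == alert["instance_id"]
            and timedelta(0) <= alert_time - parse_time(e["time"]) <= window
        ]
        # Fixed-width UTC timestamps sort correctly as plain strings.
        related.sort(key=lambda e: e["time"])
        timeline.append({"alert": alert, "preceding_events": related})
    return timeline


# Hypothetical, simplified records.
alerts = [
    {"time": "2019-10-31T12:20:00Z", "instance_id": "i-0abc", "rule": "unexpected outbound connection"},
]
trail_events = [
    {"time": "2019-10-31T09:00:00Z", "instance_id": "i-0abc", "event": "StopInstances"},
    {"time": "2019-10-31T12:05:00Z", "instance_id": "i-0abc", "event": "ModifyInstanceAttribute"},
    {"time": "2019-10-31T12:10:00Z", "instance_id": "i-0def", "event": "ModifyInstanceAttribute"},
]

for entry in correlate(alerts, trail_events):
    print(entry["alert"]["rule"], "<-", [e["event"] for e in entry["preceding_events"]])
```

Only the 12:05 change on the same instance is attached to the alert; the earlier event falls outside the window and the other belongs to a different instance.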
This process is also aided by data enrichment, such as threat intelligence that can identify security events that match known bad actors and current events. Powerful machine learning can crunch through gigabytes of data in minutes, detecting patterns and relationships few humans would be able to find. Machine learning can profile the unique processes of your application to understand what “normal” looks like for your environment and use this to target suspicious events that ordinary signature-based rules could not detect. All of this power is rolled up into an intuitive interface with one-click investigations. Having a robust threat intelligence research team, well written alerting rules, and an accurate and thorough machine learning system are the key things to look for in a comprehensive solution. They are the very features Lacework brings to the table.
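Enrichment with threat intelligence can be as simple as tagging events whose source address falls inside a known-bad network. The sketch below uses Python’s standard `ipaddress` module and the `sourceIPAddress` field found in CloudTrail records. The network list uses reserved documentation ranges as placeholders, and the function name and the added `known_bad_source` field are hypothetical.

```python
from __future__ import annotations

import ipaddress

# Placeholder feed using reserved documentation ranges, not real threat data.
KNOWN_BAD_NETWORKS = ["203.0.113.0/24", "198.51.100.0/24"]


def enrich(records: list[dict], bad_networks: list[str]) -> list[dict]:
    """Return copies of the records with a known_bad_source flag added."""
    networks = [ipaddress.ip_network(n) for n in bad_networks]
    enriched = []
    for record in records:
        try:
            address = ipaddress.ip_address(record.get("sourceIPAddress", ""))
        except ValueError:
            # The field can hold a service name such as "ec2.amazonaws.com".
            matched = False
        else:
            matched = any(address in network for network in networks)
        enriched.append({**record, "known_bad_source": matched})
    return enriched


# Hypothetical, trimmed records.
events = [
    {"eventName": "GetObject", "sourceIPAddress": "203.0.113.25"},
    {"eventName": "GetObject", "sourceIPAddress": "192.0.2.10"},
    {"eventName": "AssumeRole", "sourceIPAddress": "ec2.amazonaws.com"},
]

for event in enrich(events, KNOWN_BAD_NETWORKS):
    print(event["eventName"], event["sourceIPAddress"], event["known_bad_source"])
```

The quality of the result depends entirely on the quality and freshness of the feed, which is why the research behind the intelligence matters as much as the matching logic.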
Photo by Deep Pixel on Shutterstock.