To understand Lacework Polygraphs™, let’s first consider the difference between data and insights.
Imagine a user who likes to go out for an afternoon coffee at her favorite coffee shop, Starbucks. She drives there from her office, using a maps app for directions. The app usually sends her along the same route, but occasionally traffic forces a change. Sometimes she works from her second office location; in that case, she goes to another Starbucks, and if that isn’t convenient, perhaps another local coffee shop.
Let’s say we want to do anomaly detection on the above example. Can we figure out when her behavior differs? How will we go about doing it?
Our first approach might be to collect GPS data points from her cell phone. We can look at the history of GPS values and use machine learning to find bounds on which values are normal. But this approach has serious problems. It is prone to false positives: if an accident forces our user onto a new route to the coffee shop, the detour gets flagged. It also can’t avoid false negatives, such as when the user visits a dentist located next door to Starbucks and the GPS values still look normal.
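This naive, bounds-based approach can be sketched in a few lines. Everything below is hypothetical and purely for illustration: the coordinates, the straight-line "route," and the 3-sigma threshold are made up.

```python
from statistics import mean, stdev

def fit_bounds(points, k=3.0):
    """Learn per-dimension mean +/- k*sigma bounds from historical GPS fixes."""
    lats, lons = zip(*points)
    return ((mean(lats) - k * stdev(lats), mean(lats) + k * stdev(lats)),
            (mean(lons) - k * stdev(lons), mean(lons) + k * stdev(lons)))

def is_anomalous(point, bounds):
    (lat_lo, lat_hi), (lon_lo, lon_hi) = bounds
    lat, lon = point
    return not (lat_lo <= lat <= lat_hi and lon_lo <= lon <= lon_hi)

# Hypothetical history: fixes along the usual office-to-Starbucks route.
history = [(37.7749 + i * 0.001, -122.4194 + i * 0.001) for i in range(20)]
bounds = fit_bounds(history)

# A detour around an accident strays outside the learned bounds: false positive.
print(is_anomalous((37.8100, -122.4000), bounds))   # True  (benign detour flagged)
# The dentist next door sits inside the bounds: false negative.
print(is_anomalous((37.7800, -122.4140), bounds))   # False (real anomaly missed)
```

The sketch shows both failure modes at once: a harmless detour trips the detector, while a genuinely unusual stop inside the usual envelope sails through.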
A better approach might be to compute routes. Based on how fast the GPS values change, we can figure out the starting and ending points of a route. This gives us better insight than raw GPS data: a route change will no longer throw us off. However, if our user visits a new Starbucks location, we will still flag an anomaly, and a dentist visit will still confuse our model.
Our next approach works even better: we can try to understand ‘shops,’ making our model logical rather than tied to physical locations. We give labels to different locations, like ‘Home,’ ‘Office,’ ‘Starbucks’ and ‘Coffee Shop.’ Then we can really begin to understand behavior like ‘going for coffee at Starbucks from the office.’ This new model tolerates the user going to different Starbucks shops via multiple, diverse routes, even in different cities! At this point, we have understood our user’s real behavior. We can also detect anomalies such as the user visiting the dentist’s office next to Starbucks; different shops in close proximity no longer confuse us.
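A minimal sketch of this label-based model. The place registry, coordinates, and 30-meter matching radius below are all hypothetical assumptions for illustration:

```python
import math

# Hypothetical registry of labeled places: (label, latitude, longitude).
PLACES = [
    ("Office",    37.7749, -122.4194),
    ("Starbucks", 37.7790, -122.4140),   # downtown Starbucks
    ("Starbucks", 40.7484,  -73.9857),   # same brand, different city
    ("Dentist",   37.7791, -122.4141),   # right next door to Starbucks
]

def label(lat, lon, radius_m=30):
    """Return the label of the nearest registered place within radius_m."""
    best, best_d = "Unknown", float("inf")
    for name, plat, plon in PLACES:
        # Rough equirectangular distance in meters; fine at city scale.
        d = math.hypot((lat - plat) * 111_000,
                       (lon - plon) * 111_000 * math.cos(math.radians(plat)))
        if d < best_d:
            best, best_d = name, d
    return best if best_d <= radius_m else "Unknown"

def behavior(start, end):
    """Collapse a physical trip into a logical (from-label, to-label) pair."""
    return (label(*start), label(*end))

# A trip to either Starbucks collapses to the same logical behavior...
print(behavior((37.7749, -122.4194), (37.7790, -122.4140)))  # ('Office', 'Starbucks')
# ...while the dentist next door, meters away, is a different behavior.
print(behavior((37.7749, -122.4194), (37.7791, -122.4141)))  # ('Office', 'Dentist')
```

Because the model compares labels rather than coordinates or routes, two shops a few meters apart stay distinguishable while two branches of the same chain in different cities collapse into one behavior.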
Another advantage of this new model is that it lets us quantify small differences in behavior. For example, we can determine and score whether going to Peet’s (another coffee shop) instead of Starbucks is really as big a behavioral departure as visiting a dentist.
In a data center, the above maps directly onto getting NetFlow records (the GPS data), creating routes (a graph), and understanding behavior (a Polygraph).
To understand this more deeply, let’s consider an example within an Amazon Web Services (AWS) S3 environment.
S3 is an AWS service served by hundreds of thousands of IP addresses. NetFlow records capture a 5-tuple: Src IP, Src Port, Dst IP, Dst Port, and Protocol. The source port changes every time a new connection is made, and each connection transfers a different number of bytes, which makes it difficult for a busy cloud application to understand how many bytes are transferred, and by whom. This is the equivalent of looking at raw data.
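A small sketch of why raw 5-tuples are hard to reason about. The IPs, ports, and byte counts below are invented: one hypothetical application uploading to S3 over three connections, each with a fresh ephemeral port landing on a different S3 front-end IP.

```python
from collections import Counter, namedtuple

# A NetFlow record is essentially a 5-tuple plus a byte count.
Flow = namedtuple("Flow", "src_ip src_port dst_ip dst_port proto bytes")

# Hypothetical flows from a single uploading application.
flows = [
    Flow("10.0.0.5", 49152, "52.216.1.10", 443, "TCP", 40_000_000),
    Flow("10.0.0.5", 49153, "52.216.7.88", 443, "TCP", 35_000_000),
    Flow("10.0.0.5", 49154, "52.216.3.21", 443, "TCP", 25_000_000),
]

# At the raw level, byte counts fragment across ephemeral ports and S3 IPs.
by_tuple = Counter()
for f in flows:
    by_tuple[(f.src_ip, f.src_port, f.dst_ip)] += f.bytes

print(len(by_tuple))   # 3 -- three "different" conversations for one behavior
```

One logical transfer of 100 MB shows up as three unrelated conversations; nothing at this level says they belong together.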
Most security tools stay at the NetFlow level. They process the data with sophisticated algorithms, but never understand who is making the connections, or to whom. Like the raw GPS data above, they are limited in how effective they can be: they have no idea who the caller is, or what they are trying to do. If your security tool can’t give you any deeper insight, you are not really observing behavior.
Similar to the routes above, by combining DNS data with source-process data, we can convert IPs and source ports into a graph that tells us which process is talking to which DNS-named service.
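A minimal sketch of that conversion, with hypothetical DNS and process lookup tables standing in for what would really come from DNS logs and per-host process accounting:

```python
# Hypothetical lookup tables (all names and addresses are invented).
DNS = {"52.216.1.10": "s3.amazonaws.com", "52.216.7.88": "s3.amazonaws.com"}
PROC = {("10.0.0.5", 49152): "backup-agent", ("10.0.0.5", 49153): "backup-agent"}

def edges(flows):
    """Collapse raw 5-tuples into (process -> DNS name) graph edges."""
    g = set()
    for src_ip, src_port, dst_ip, *_ in flows:
        g.add((PROC.get((src_ip, src_port), "unknown"),
               DNS.get(dst_ip, dst_ip)))
    return g

flows = [("10.0.0.5", 49152, "52.216.1.10", 443, "TCP"),
         ("10.0.0.5", 49153, "52.216.7.88", 443, "TCP")]

print(edges(flows))   # {('backup-agent', 's3.amazonaws.com')} -- one edge
```

Two raw flows that looked unrelated collapse into a single edge: one process talking to one service.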
But, as with the routes above, we can do better still and create a Polygraph. S3, for example, has multiple buckets, and using them gives us logical labels on destinations. Processes are part of applications, and understanding the applications lets us label the source. We can also aggregate bytes between a source application and a bucket to gain insight into the data transferred. Now we can start observing behavior like ‘an application called Backup transferred 100GB to backup-bucket at AWS S3.’ We call this model a Polygraph.
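A sketch of that aggregation step, assuming a hypothetical process-to-application mapping and events that already carry a bucket name:

```python
from collections import defaultdict

# Hypothetical mapping: multiple processes roll up to one application.
APP_OF_PROCESS = {"backup-agent": "Backup", "backup-agent-2": "Backup"}

def polygraph(events):
    """Aggregate bytes between (application, bucket) pairs.
    events: iterable of (process, bucket, bytes) tuples."""
    model = defaultdict(int)
    for process, bucket, nbytes in events:
        model[(APP_OF_PROCESS.get(process, process), bucket)] += nbytes
    return dict(model)

events = [
    ("backup-agent",   "backup-bucket", 60_000_000_000),
    ("backup-agent-2", "backup-bucket", 40_000_000_000),
]

print(polygraph(events))
# {('Backup', 'backup-bucket'): 100000000000}
# i.e. "Application 'Backup' transferred 100GB to backup-bucket"
```

Two processes on possibly different hosts roll up into one behavioral statement about the Backup application.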
As with our coffee shop behavior, many physical details no longer matter. VMs or processes can die or get restarted, or we can do an A-B failover, and none of it changes the observed behavior. We can even bring up an entire application in a new AWS account, and the behavior of the backup application won’t change.
And as in our coffee shop example, we can detect subtle proximity anomalies. We can tell when a different application (running on the same VM, reusing the same client port, talking to the same S3 IP address) talks to the backup-bucket. And we will know if the Backup application decides to talk to a Finance-bucket (again, even if everything at the NetFlow level looks the same).
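At this level of abstraction, the check reduces to a set-membership test against the learned model. The baseline and edge names below are hypothetical:

```python
# Hypothetical baseline learned over time: the set of (application, bucket)
# edges we have previously observed.
baseline = {("Backup", "backup-bucket")}

def check(app, bucket):
    """Flag any (application, bucket) edge absent from the baseline."""
    return "anomaly" if (app, bucket) not in baseline else "normal"

print(check("Backup",  "backup-bucket"))   # normal
print(check("Finance", "backup-bucket"))   # anomaly: new app, same bucket
print(check("Backup",  "finance-bucket"))  # anomaly: same app, new bucket
```

Both proximity cases fall out for free: a different application touching the backup-bucket, and the Backup application touching a new bucket, are each simply edges the model has never seen.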
Lacework Polygraphs work by understanding application behaviors. We understand all your users, applications, services, containers, images, pods, and more. Over time, we build behavior models of what activity is really occurring. As an example, the following picture shows which applications and containers talk to AWS S3 in our demo environment.
It does not matter how many of these processes or containers run across different VMs or data centers, or how many different S3 IP addresses they talk to over different client ports: the behavior model remains stable. This gives us a low false-positive rate with excellent detection capability. Even when one of these connections comes from a new application (a dentist next door), we will find it.
Everything I’ve just described is done by Lacework Polygraphs, not humans. Do you find yourself querying logs, searching for data, writing rules, and struggling to understand how things are working in your cloud? That’s all toil that Lacework Polygraphs could take on for you (assuming you have the data). If you or your team are toiling in cloud security data, it’s time to talk to Lacework.