How to use audit logs and Lacework Query Language (LQL) to simplify your Kubernetes API migration

Kubernetes releases new versions three times each year. Users of managed Kubernetes services, such as Amazon’s EKS and Google’s GKE, need to ensure they update to these newer Kubernetes versions roughly every six to 12 months. With each migration, it is important to avoid using any deprecated or removed APIs, which can be challenging because it requires tracking all API calls made by applications and users. However, Kubernetes audit logs capture all API calls, making them an excellent place to hunt for removed or deprecated API calls — if you have the right tool.

Kubernetes API versioning

Kubernetes introduces new APIs regularly. Typically, an API is added as v1alpha1 (experimental), matures to v1beta1 (pre-release), and finally reaches v1 (generally available, or GA).

The API version is part of the URL. For example, depending on the API version, the call to manage cronjobs is one of the following:

  • /apis/batch/v1alpha1/namespaces/{namespace}/cronjobs
  • /apis/batch/v1beta1/namespaces/{namespace}/cronjobs
  • /apis/batch/v1/namespaces/{namespace}/cronjobs
    

There are many variations of the same API for different actions:

  1. /apis/batch/v1/namespaces/{namespace}/cronjobs
  2. /apis/batch/v1/namespaces/{namespace}/cronjobs/{name}
  3. /apis/batch/v1/cronjobs
  4. /apis/batch/v1/watch/namespaces/{namespace}/cronjobs/{name}
  5. /apis/batch/v1/watch/namespaces/{namespace}/cronjobs
  6. /apis/batch/v1/watch/cronjobs
  7. /apis/batch/v1/namespaces/{namespace}/cronjobs/{name}/status

With seven URL variants in each of the three versions, this adds up to 21 possible API calls for just one object.
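The count can be checked with a short sketch that expands the seven URL patterns listed above across the three versions. It assumes, as the count in the text does, that every pattern exists under every version; the `VERSION` placeholder and the function name are only illustrative.

```python
# The three API versions and the seven cronjobs URL patterns listed above.
# "VERSION" is an illustrative placeholder that gets swapped for each version.
VERSIONS = ["v1alpha1", "v1beta1", "v1"]
PATTERNS = [
    "/apis/batch/VERSION/namespaces/{namespace}/cronjobs",
    "/apis/batch/VERSION/namespaces/{namespace}/cronjobs/{name}",
    "/apis/batch/VERSION/cronjobs",
    "/apis/batch/VERSION/watch/namespaces/{namespace}/cronjobs/{name}",
    "/apis/batch/VERSION/watch/namespaces/{namespace}/cronjobs",
    "/apis/batch/VERSION/watch/cronjobs",
    "/apis/batch/VERSION/namespaces/{namespace}/cronjobs/{name}/status",
]


def cronjob_paths():
    """Return every version/pattern combination as a URL template."""
    return [p.replace("VERSION", v) for v in VERSIONS for p in PATTERNS]


paths = cronjob_paths()
print(len(paths))  # 3 versions x 7 patterns = 21
```

A migration check has to cover every one of these templates, which is why hand-written searches tend to miss calls.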

In addition to maturing to GA, some APIs are renamed or removed along the way.

APIs are first deprecated (they work, but they should not be used anymore), then they are removed. Finding a complete list of the exact APIs removed can be difficult. For this blog, I checked both the release notes and the GitHub change log and cross-checked the information with the Kubernetes documentation for each version.

K8s audit logs and APIs

The Kubernetes audit logs are huge, making it difficult for companies to process and analyze them. Compounding the complexity is that various Kubernetes providers use different log formats. For example, GKE uses a different schema than the standard audit logs and returns different URLs for the API calls. 
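To make the raw material concrete, here is a minimal sketch of scanning audit events by hand. It assumes the standard Kubernetes audit format (one JSON event per line with `requestURI`, `verb`, and `user.username` fields), so it would not work unchanged on GKE's schema. The two prefixes are an illustrative subset of the APIs removed in 1.22, and the sample events are made up.

```python
import json

# Illustrative subset of API path prefixes removed in Kubernetes 1.22.
REMOVED_PREFIXES = (
    "/apis/admissionregistration.k8s.io/v1beta1/validatingwebhookconfigurations",
    "/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions",
)


def find_removed_calls(lines):
    """Return (username, verb, requestURI) for events that hit a removed API."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        uri = event.get("requestURI", "")
        # str.startswith accepts a tuple and matches any of the prefixes.
        if uri.startswith(REMOVED_PREFIXES):
            user = event.get("user", {}).get("username", "unknown")
            hits.append((user, event.get("verb"), uri))
    return hits


# Hypothetical sample events in the standard audit log format.
SAMPLE_LOG = [
    '{"kind":"Event","verb":"list","requestURI":"/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions?limit=500","user":{"username":"ci-deployer"}}',
    '{"kind":"Event","verb":"get","requestURI":"/apis/batch/v1/namespaces/default/cronjobs/report","user":{"username":"alice"}}',
]

for user, verb, uri in find_removed_calls(SAMPLE_LOG):
    print(user, verb, uri)
```

This works for a single file in a single format. Doing the same across clusters, providers, and weeks of logs is the harder part.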

Lacework normalizes the GKE audit logs so users don’t have to identify the differences between cloud providers. With Lacework, users can automatically ingest Kubernetes audit logs from EKS and GKE. We provide various mechanisms to search the audit logs:

  • A Polygraph to visualize the API calls and users
  • A searchable list of API calls
  • Anomalies and policies to detect important events
  • Lacework Query Language (LQL), which is a powerful yet simple language to query large datasets

LQL for the K8s audit logs

Lacework has released LQL policies that detect removed and deprecated API calls for each Kubernetes version in the lacework/lql-queries GitHub repository.

Use the Lacework CLI to create and run your own LQL policies

First, download the Lacework CLI and install it by following the documentation at https://docs.lacework.com/cli/. Create an API key in the Lacework tenant that receives the Kubernetes audit logs.

Then, check out the LQL repository locally:

$ git clone https://github.com/lacework/lql-queries

You’ll find the LQL policies labeled as follows:

  • K8sActivityRemoved{version}.lql: APIs removed in versions 1.22 through 1.29
  • K8sActivityDeprecated{version}.lql: APIs deprecated in versions 1.22 through 1.29

If you are migrating from Kubernetes 1.21 to 1.24, you need to run the query for APIs removed in 1.22 (K8sActivityRemoved122.lql), 1.23 (K8sActivityRemoved123.lql) and 1.24 (K8sActivityRemoved124.lql).
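For longer upgrade paths, the list of files can be derived rather than typed out. The sketch below assumes the naming convention seen in the three file names above (major and minor version concatenated, such as 122 for 1.22); the helper name is hypothetical.

```python
def removed_query_files(current, target):
    """List the 'removed API' query files for each minor version
    after `current`, up to and including `target`."""
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major or tgt_minor <= cur_minor:
        raise ValueError("target must be a later minor version of the same major version")
    return [
        f"K8sActivityRemoved{cur_major}{minor}.lql"
        for minor in range(cur_minor + 1, tgt_minor + 1)
    ]


print(removed_query_files("1.21", "1.24"))
# ['K8sActivityRemoved122.lql', 'K8sActivityRemoved123.lql', 'K8sActivityRemoved124.lql']
```

The starting version is excluded because the cluster already runs it, so only the versions you pass through or land on matter.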

There are two steps to run a query. 

1. First, save the LQL query with Lacework:

$ lacework query create -f K8sActivityRemoved122.lql

The query K8sActivityApiRemoved122 was created.

2. Once a query has been created within Lacework, you can run it by its query ID with the CLI:

$ lacework query run K8sActivityApiRemoved123 --start "-30d"
[
  {
    "CLOUD_USER": "system:kubestore-collector",
    "CLUSTER_ID": "gke-1",
    "CLUSTER_TYPE": "GKE",
    "EVENT_NAME": "WatchHorizontalpodautoscalers",
    "EVENT_OBJECT": "/horizontalpodautoscalers",
    "EVENT_SOURCE": "horizontalpodautoscalers",
    "EVENT_URI": "/apis/autoscaling/v2beta2/horizontalpodautoscalers",
    "USER_GROUPS": null,
    "USER_ID": null,
    "USER_NAME": "system:kubestore-collector"
  }
]

Sample output

The queries return each matching API call along with the Kubernetes user who initiated it and, if the call was triggered manually, the actual cloud user.
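When a query returns many records, grouping them by caller shows who needs to change what. The following sketch assumes the results are available as a JSON array of records with the field names from the sample output; the function name and the duplicated sample record are illustrative.

```python
import json
from collections import Counter


def calls_by_user(results_json):
    """Count matching API calls per (Kubernetes user, cluster, URI)."""
    records = json.loads(results_json)
    counts = Counter()
    for record in records:
        key = (record["USER_NAME"], record["CLUSTER_ID"], record["EVENT_URI"])
        counts[key] += 1
    return counts


# Illustrative input: the record from the sample output, repeated twice.
SAMPLE = json.dumps([
    {
        "USER_NAME": "system:kubestore-collector",
        "CLUSTER_ID": "gke-1",
        "EVENT_URI": "/apis/autoscaling/v2beta2/horizontalpodautoscalers",
    },
    {
        "USER_NAME": "system:kubestore-collector",
        "CLUSTER_ID": "gke-1",
        "EVENT_URI": "/apis/autoscaling/v2beta2/horizontalpodautoscalers",
    },
])

for (user, cluster, uri), count in calls_by_user(SAMPLE).most_common():
    print(f"{count:4d}  {user}  {cluster}  {uri}")
```

Sorting by count puts the noisiest callers first, which is usually where migration work should start.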

LQL details

You’ll find a full guide for LQL at https://docs.lacework.com/lql/lql-overview. Let’s look at the most important information from one of the queries provided: K8sActivityRemoved122.lql, which defines the query K8sActivityApiRemoved122.

Each LQL query must indicate the source(s) to query. For the Kubernetes audit logs, the source is LW_ACT_K8S_AUDIT. The query filters the data source on the requestURI (the API URL) and lists all the APIs removed in Kubernetes 1.22 in their different forms: validatingwebhookconfigurations (v1beta1), customresourcedefinitions (v1beta1), etc.

The last part, the return fields, indicates what information to output:

  • USER_NAME: the Kubernetes username
  • USER_GROUPS: the Kubernetes groups
  • CLUSTER_TYPE: EKS, GKE, etc.
  • EVENT_URI: the API call URL
  • Etc.

To see the list of fields available in the LW_ACT_K8S_AUDIT source, use the Lacework CLI:

$ lacework query show-source LW_ACT_K8S_AUDIT
      FIELD NAME        DATA TYPE            DESCRIPTION            
----------------------+-----------+---------------------------------
  RECORD_CREATED_TIME   Timestamp   Record creation time            
  REQUEST_TIMESTAMP     Timestamp   Timestamp of the Kubernetes control plane request
  AUDIT_ID              String      Kubernetes internal event ID
  ...

A preview of the source is available using:

$ lacework query preview-source LW_ACT_K8S_AUDIT

You’ll notice that some of the LQL queries filter out the Kubernetes user system:kubestore-collector. This is a default GKE component that is updated during GKE upgrades and automatically switches to the new version of a deprecated API.

Now you know how to use the LQL queries to make your Kubernetes API migration easier. Customize them as needed, and don’t hesitate to submit changes directly to GitHub to share with everyone.