How to dynamically correlate Google Cloud Compute Engine instance network traffic using Chronicle
Following up from last week’s blog post on why network security telemetry matters today, our guest author Matt Svensson, a Senior Security Engineer at BetterCloud, discusses how you can use Chronicle to dynamically correlate IP addresses in network traffic logs — like Zeek — to events on Compute Engine instances.
By Matthew Svensson, Senior Security Engineer at BetterCloud
Cloud infrastructure has made it operationally simple to scale your application and run rolling updates to instances with no downtime.
It has also made network visibility a seemingly unsolvable headache. Even if you mirror traffic and generate Zeek logs, cloud instances don’t use DHCP the way a traditional network does, so you have no DHCP logs to correlate against.
This creates two problems.
Problem 1: Manual correlation race condition
Imagine you get an alert at 11:30am that 10.10.10.7 reached out to a new domain, something-evil-looking.com. The logging and alerting pipeline is slow and the traffic actually occurred at 11:05am.
When you go into the cloud console, you see no instance with that IP address because it has already been spun down. The cloud console also has no logs showing which host previously held that IP.
Problem 2: SIEM “lookup tables” don’t scale
Now, imagine this real-world scenario. Today, you have 3 nginx instances in a managed instance group.
- nginx-production-2j3l (10.10.10.1)
- nginx-production-6kj1 (10.10.10.2)
- nginx-production-93k3 (10.10.10.3)
You add those IP-to-hostname mappings to your SIEM lookup table.
The following week, nginx is updated and a rolling update occurs. Now those instances have changed.
- nginx-production-9gka (10.10.10.4)
- nginx-production-1j48 (10.10.10.5)
- nginx-production-84j3 (10.10.10.6)
Your SIEM lookup table is updated. Problem solved.
Nope. Over time, IP addresses will be reused. You will end up with 2 hostnames for the same IP and lose the historical context of which hostname actually held the IP address at a given time.
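To make the problem concrete, here is a minimal sketch of what a time-aware lookup needs that a flat table lacks. The dates and the second hostname are hypothetical, made up for illustration; this is not how Chronicle stores the data internally.

```python
from bisect import bisect_right
from datetime import datetime

# Each IP maps to a time-ordered list of (first_seen, hostname) assignments.
# A flat {ip: hostname} table would keep only the latest entry.
history = {
    "10.10.10.1": [
        (datetime(2021, 1, 4), "nginx-production-2j3l"),
        (datetime(2021, 2, 1), "nginx-production-zz99"),  # hypothetical later reuse of the IP
    ],
}

def hostname_at(ip, when):
    """Return the hostname that held `ip` at time `when`, or None if unknown."""
    assignments = history.get(ip, [])
    starts = [start for start, _ in assignments]
    # Find the last assignment that started at or before `when`.
    idx = bisect_right(starts, when) - 1
    return assignments[idx][1] if idx >= 0 else None

print(hostname_at("10.10.10.1", datetime(2021, 1, 15)))
print(hostname_at("10.10.10.1", datetime(2021, 2, 2)))
```

The same IP resolves to different hostnames depending on when the traffic occurred, which is exactly the context a single-value lookup table throws away.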
Solution: Chronicle’s automatic correlation
Chronicle can take DHCP and Zeek logs and auto-correlate them to the end-hostname, providing you a look-back over time at traffic with the hostname that was historically associated with the IP address.
Security Engineer bangs head against the wall… “Remember, these Zeek logs don’t have the dhcp.log records to do this correlation.”
“Hacking” the solution:
Chronicle will also accept CSV logs in the format below and treat them like DHCP logs.
DATETIME, TYPE, IP, HOSTNAME, MAC ADDRESS
Using Google Cloud’s native gcloud command-line tool, you can get a list of all of your compute instances by running gcloud compute instances list. Below is a simplified example of the output.
NAME, ZONE, MACHINE_TYPE, INTERNAL_IP, EXTERNAL_IP, STATUS
Jump1, us-central1-a, n1-standard-2, 10.128.0.27, , RUNNING
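For scripting, the JSON output of gcloud is easier to parse than the table. Below is a minimal sketch, assuming gcloud is installed and authenticated; the function names are hypothetical and the sample JSON is trimmed to the two fields used here.

```python
import json
import subprocess

def parse_instances(gcloud_json):
    """Return (hostname, internal_ip) pairs from `--format=json` output."""
    pairs = []
    for instance in json.loads(gcloud_json):
        # An instance can have several network interfaces, each with its own internal IP.
        for nic in instance.get("networkInterfaces", []):
            ip = nic.get("networkIP")
            if ip:
                pairs.append((instance["name"], ip))
    return pairs

def list_instances(project):
    """Run gcloud for one project and return its (hostname, internal_ip) pairs."""
    result = subprocess.run(
        ["gcloud", "compute", "instances", "list",
         "--project", project, "--format=json"],
        capture_output=True, text=True, check=True,
    )
    return parse_instances(result.stdout)

# Parsing can be tried without gcloud by feeding in a trimmed sample.
sample = '[{"name": "Jump1", "networkInterfaces": [{"networkIP": "10.128.0.27"}]}]'
print(parse_instances(sample))
```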
One problem: there are no MAC addresses.
Well… there’s no need to do anything with the MAC address other than use it for correlation. So, you can randomly generate MAC addresses for each unique hostname.
With the IP, hostname, and MAC address, you can generate the necessary CSV log!
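A minimal sketch of those two pieces follows. Several details are assumptions rather than anything Chronicle documents here: the timestamp format, the "ACK" value for the TYPE column, and the choice to set the locally administered bit so a generated MAC cannot collide with a real vendor-assigned one. The field order and ", " separator mirror the header shown above.

```python
import random
from datetime import datetime, timezone

def random_mac(rng=random):
    """Generate a random unicast, locally administered MAC address."""
    # Set the locally administered bit (0x02) and clear the multicast bit (0x01).
    first = (rng.randrange(256) | 0x02) & 0xFE
    rest = [rng.randrange(256) for _ in range(5)]
    return ":".join(f"{octet:02x}" for octet in [first, *rest])

def csv_line(ip, hostname, mac, when=None, event_type="ACK"):
    """Build one DATETIME, TYPE, IP, HOSTNAME, MAC ADDRESS record.

    The timestamp format and the default event_type are assumptions.
    """
    when = when or datetime.now(timezone.utc)
    timestamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    return ", ".join([timestamp, event_type, ip, hostname, mac])

print(csv_line("10.128.0.27", "Jump1", random_mac()))
```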
Solution in action:
Combining all of this into a script — like this open source tool we made available on GitHub — and running it every minute or two, we can, for each Google Cloud project:
1. Run gcloud to get a list of all current hostnames and IP addresses
2. For every IP, check if we’ve seen the IP before:
   - If yes, ensure you have an updated hostname and keep the same MAC address
   - If not, add the new IP and hostname, with a random MAC as a new host
3. Only send Chronicle hosts that are new or have an updated hostname
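The steps above can be sketched as a small state update. This is an illustration, not the code of the open source tool: the in-memory dict, the function names, and the tuple layout are all hypothetical, and a real script would persist the state between runs.

```python
import random

def random_mac():
    """Random unicast, locally administered MAC (a design choice, not a requirement)."""
    octets = [(random.randrange(256) | 0x02) & 0xFE]
    octets += [random.randrange(256) for _ in range(5)]
    return ":".join(f"{octet:02x}" for octet in octets)

def update_state(state, current):
    """Apply one polling cycle.

    state:   dict mapping ip -> {"hostname": ..., "mac": ...}, mutated in place
    current: list of (hostname, ip) pairs from the latest gcloud run
    Returns the (ip, hostname, mac) records that are new or changed.
    """
    changed = []
    for hostname, ip in current:
        entry = state.get(ip)
        if entry is None:
            # New IP: record it as a new host with a fresh random MAC.
            entry = {"hostname": hostname, "mac": random_mac()}
            state[ip] = entry
            changed.append((ip, hostname, entry["mac"]))
        elif entry["hostname"] != hostname:
            # Known IP with a new hostname: update it, keep the same MAC.
            entry["hostname"] = hostname
            changed.append((ip, hostname, entry["mac"]))
    return changed

state = {}
print(update_state(state, [("nginx-production-2j3l", "10.10.10.1")]))  # new host
print(update_state(state, [("nginx-production-2j3l", "10.10.10.1")]))  # nothing to send
```

Only the returned records would be turned into CSV lines and sent, which keeps the per-cycle volume small.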
In addition to sending new device logs to Chronicle every 1–2 minutes, it is important to send logs for ALL hosts to Chronicle every day or two. Because a DHCP lease is temporary, Chronicle assumes that if a device hasn’t been seen for 5 days, the IP address is no longer associated with that hostname. (Note: Don’t send ALL logs to Chronicle every 2 minutes. DHCP activity that frequent would be unexpected and unusual behavior, and it will likely cause correlation issues.)
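One way to combine the frequent incremental sends with the occasional full refresh is sketched below. The one-day interval, the function name, and the state layout (ip mapped to hostname and MAC) are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Assumed refresh interval: comfortably inside the 5-day window described above.
FULL_REFRESH_INTERVAL = timedelta(days=1)

def hosts_to_send(state, changed, last_full_send, now):
    """Pick the records for this cycle and return (records, new_last_full_send).

    state:   dict mapping ip -> {"hostname": ..., "mac": ...}
    changed: (ip, hostname, mac) records that are new or updated this cycle
    """
    if last_full_send is None or now - last_full_send >= FULL_REFRESH_INTERVAL:
        # Full refresh: resend every known host so none of them ages out.
        everything = [(ip, e["hostname"], e["mac"]) for ip, e in sorted(state.items())]
        return everything, now
    # Normal cycle: send only what changed.
    return changed, last_full_send

state = {"10.128.0.27": {"hostname": "Jump1", "mac": "02:00:00:00:00:01"}}
records, last_full = hosts_to_send(state, [], None, datetime(2021, 1, 1, 12, 0))
print(records)
```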
That’s it! Now you have an automated mechanism to correlate all of your network traffic down to the Compute Engine instance that initiated the traffic.
Stay tuned to this blog series to learn more about how you can use Chronicle for cloud network traffic use cases. To learn more about Chronicle, complete our Contact Sales form.