Entropy Trap, Part 2: Real-World Failures and Better Alternatives

TL;DR

Entropy detection fails outright against three common attack patterns: TLS C2, living-off-the-land obfuscation, and cloud exfiltration.
Better signals exist in all three cases. They require understanding protocol semantics, not just string randomness.
The shift is from "this string looks random" to "this session behaves anomalously."

Part 1 covered why entropy-based detection is structurally weak as a primary signal. This part gets specific with three real attack patterns where entropy either fires on everything or misses the attack entirely – and what actually works instead.

Failure case 1: TLS C2

Cobalt Strike, Brute Ratel, and most modern post-exploitation frameworks run C2 over TLS. The traffic looks like normal HTTPS. Valid certificates. Plausible SNI.

Entropy on the SNI or certificate common name tells you almost nothing, because operators choose hostnames that read like ordinary infrastructure. "update.microsoft.com" and a plausibly named C2 domain both score low. "api.cloudflare.com" looks identical to "cdn.attackdomain.net" on string randomness.
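To make that concrete, here is a minimal sketch that computes per-character Shannon entropy for the hostnames mentioned above. The fourth hostname is a made-up random-looking label added only for contrast. It is not from any real campaign.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    total = len(s)
    counts = Counter(s)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


hostnames = [
    "update.microsoft.com",
    "api.cloudflare.com",
    "cdn.attackdomain.net",
    "x7f3k9q2zp1mv8wb4.example",  # hypothetical random-looking label
]

for name in hostnames:
    print(f"{name:28s} {shannon_entropy(name):.2f} bits/char")
```

The three readable names score close together, and the C2-style name does not score higher than the legitimate ones. Only the random-looking label stands out, and that is exactly the kind of name a careful operator avoids.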

The signals that actually matter are:

JA4 fingerprinting. The TLS ClientHello carries a characteristic combination of TLS version, cipher suites, extensions, ALPN, and signature algorithms, which JA4 condenses into a fingerprint. Legitimate browsers and applications produce consistent JA4 fingerprints. Default C2 framework profiles often don't match Chrome or Edge. This is pure wire data – with the JA4 package loaded, Zeek's ssl.log records it for each handshake.

Certificate anomalies. Real applications rarely use brand-new or very short-lived certificates. Self-signed certs on port 443 coming from major cloud providers are suspicious. A cert issued the same day the domain was registered is almost never a legitimate CDN.

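Connection interval analysis. Implants beaconing every 60 seconds with tight jitter create measurable patterns in conn.log: near-constant gaps between connections to the same destination. Interactive user traffic is bursty and doesn't look like this, though legitimate pollers (update checks, telemetry agents) do and have to be baselined out. This is statistical baselining, not string math.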
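One way to sketch the interval analysis: group connections by source and destination, then measure how regular the gaps are using the coefficient of variation (standard deviation divided by mean). The tuple layout, the function name, and both thresholds below are assumptions for illustration. Real conn.log parsing and tuning are left out.

```python
from collections import defaultdict
from statistics import mean, pstdev


def beacon_candidates(conns, min_events=10, max_cv=0.1):
    """Return (src, dst, mean_gap, cv) for pairs with very regular timing.

    conns: iterable of (ts, src, dst) tuples, ts in epoch seconds.
    min_events and max_cv are illustrative defaults, not tuned values.
    """
    by_pair = defaultdict(list)
    for ts, src, dst in conns:
        by_pair[(src, dst)].append(ts)

    hits = []
    for (src, dst), stamps in by_pair.items():
        if len(stamps) < min_events:
            continue  # too few connections to say anything about timing
        stamps = sorted(stamps)
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        if avg <= 0:
            continue
        cv = pstdev(gaps) / avg  # low cv means near-constant intervals
        if cv <= max_cv:
            hits.append((src, dst, avg, cv))
    return hits


# Synthetic data: a 60-second beacon with +/-1s jitter, plus bursty browsing.
beacon = [(i * 60 + (i % 3 - 1), "10.0.0.5", "203.0.113.7") for i in range(30)]
browsing = [(t, "10.0.0.5", "198.51.100.20")
            for t in (0, 1, 2, 40, 41, 300, 302, 900, 905, 906, 2000, 2001)]

for src, dst, avg, cv in beacon_candidates(beacon + browsing):
    print(f"{src} -> {dst}: mean gap {avg:.1f}s, cv {cv:.3f}")
```

A low coefficient of variation alone will also catch legitimate pollers, so in practice this would feed a baseline or allowlist and would not page directly.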

YAML

title: Suspicious TLS Fingerprint - Zeek ssl.log
logsource:
  category: network
  product: zeek
  service: ssl
detection:
  selection:
    ja4: 't13d191000_9dc949149365_97f8aa674fd9'  # Example value, given here as a Cobalt Strike default; verify against your own captures before deploying
  condition: selection
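The certificate anomaly checks described earlier can be sketched the same way. This assumes you already have the certificate's not-before time, subject, and issuer (for example from Zeek's x509.log) and a domain registration date from some WHOIS or RDAP enrichment. The function name, flag names, and thresholds are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cert_age_flags(not_before, domain_registered, first_seen,
                   subject=None, issuer=None,
                   max_cert_age=timedelta(days=7),
                   max_reg_delta=timedelta(days=1)):
    """Return a list of reasons a certificate looks freshly staged.

    All datetimes are expected to be timezone-aware. Thresholds are
    illustrative defaults, not recommendations.
    """
    flags = []
    if first_seen - not_before <= max_cert_age:
        flags.append("cert_recently_issued")
    if abs(not_before - domain_registered) <= max_reg_delta:
        flags.append("cert_issued_at_domain_registration")
    if subject is not None and subject == issuer:
        flags.append("self_signed")
    return flags


seen = datetime(2024, 5, 10, 12, 0, tzinfo=timezone.utc)
print(cert_age_flags(
    not_before=datetime(2024, 5, 9, 8, 0, tzinfo=timezone.utc),
    domain_registered=datetime(2024, 5, 9, 6, 0, tzinfo=timezone.utc),
    first_seen=seen,
))
```

None of these flags is conclusive alone, since legitimate short-lived certificates exist. They are more useful stacked with the JA4 and timing signals.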

Failure case 2: Living-off-the-land obfuscation

PowerShell encoded commands have high entropy. So do legitimate compressed scripts and any script handling binary data.

Entropy on the command line quickly becomes noise. Raise the threshold and you miss real attacks. Lower it and you page on normal admin work.

What you actually want to detect is process lineage and execution context.

Office apps spawning PowerShell is not normal. svchost.exe running cmd.exe /c whoami is not normal. A scheduled task created via WMI that then runs encoded PowerShell is not normal.

These are process tree patterns visible in Sysmon process creation events (Event ID 1), which record both Image and ParentImage – no entropy math required.

YAML

title: Office Application Spawning PowerShell
logsource:
  category: process_creation
  product: windows
detection:
  selection:
    ParentImage|endswith:
      - '\WINWORD.EXE'
      - '\EXCEL.EXE'
      - '\OUTLOOK.EXE'
    Image|endswith: '\powershell.exe'
  condition: selection

Entropy on the PowerShell command line might be useful as supporting evidence. It doesn't find the behavior.
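A rough sketch of that division of labor: lineage decides whether there is an alert at all, and entropy only adjusts its severity. The event is assumed to be a dict with Sysmon-style field names. The rule name, severity labels, and the 4.5 bits/char threshold are hypothetical.

```python
import math
from collections import Counter

OFFICE_PARENTS = ("\\winword.exe", "\\excel.exe", "\\outlook.exe")
SHELL_CHILDREN = ("\\powershell.exe",)


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    total = len(s)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(s).values())


def score_process_event(event, entropy_threshold=4.5):
    """Return an alert dict if lineage matches, else None.

    Entropy never creates an alert; it only raises severity on a
    lineage match.
    """
    parent = event.get("ParentImage", "").lower()
    image = event.get("Image", "").lower()
    if not (parent.endswith(OFFICE_PARENTS) and image.endswith(SHELL_CHILDREN)):
        return None
    cmdline = event.get("CommandLine", "")
    high = shannon_entropy(cmdline) >= entropy_threshold
    return {"rule": "office_spawns_powershell",
            "severity": "high" if high else "medium"}


event = {
    "ParentImage": r"C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE",
    "Image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
    "CommandLine": "powershell.exe -nop -w hidden",
}
print(score_process_event(event))
```

A high-entropy command line from an unremarkable parent returns None here, which is the point: the anchor is the process tree.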

Failure case 3: Cloud exfiltration

High-entropy S3 object keys are completely normal. UUIDs and content-addressed hashes are everywhere in modern storage.

A rule based purely on entropy in S3 GetObject keys will page on your own legitimate infrastructure.

The real signal lives in behavior: who is calling GetObject, from where, how fast, and against which buckets.

A brand-new IAM role suddenly pulling objects, an EC2 instance accessing buckets outside its normal scope, a principal authenticating from a new IP range and immediately doing bulk retrieval – these are the patterns worth investigating.

None of this is entropy. It's all baseline deviation.
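As a minimal sketch of baseline deviation, the code below learns which buckets and source IPs each principal has used historically, then flags departures in a later window. It assumes S3 data events are being collected in CloudTrail and have already been flattened into dicts. The `principal` and `bucket` keys, the reason labels, and the bulk threshold are simplifications made up for this example. Note that the object key never appears.

```python
from collections import defaultdict


def build_baseline(events):
    """Map each principal to the buckets and source IPs seen historically."""
    baseline = defaultdict(lambda: {"buckets": set(), "ips": set()})
    for e in events:
        if e["eventName"] != "GetObject":
            continue
        entry = baseline[e["principal"]]
        entry["buckets"].add(e["bucket"])
        entry["ips"].add(e["sourceIPAddress"])
    return baseline


def deviations(baseline, window_events, bulk_threshold=100):
    """Return {principal: [reasons]} for GetObject activity outside baseline."""
    per_principal = defaultdict(list)
    for e in window_events:
        if e["eventName"] == "GetObject":
            per_principal[e["principal"]].append(e)

    findings = {}
    for principal, evs in per_principal.items():
        reasons = []
        known = baseline.get(principal)  # .get avoids creating an entry
        if known is None:
            reasons.append("new_principal")
        else:
            if {e["bucket"] for e in evs} - known["buckets"]:
                reasons.append("new_bucket")
            if {e["sourceIPAddress"] for e in evs} - known["ips"]:
                reasons.append("new_source_ip")
        if len(evs) >= bulk_threshold:
            reasons.append("bulk_retrieval")
        if reasons:
            findings[principal] = reasons
    return findings
```

A production version would need time-decayed baselines and per-role thresholds, but the shape stays the same: the inputs are who, where, and how much.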

Entropy signal	Limitation	Better signal
High-entropy DNS subdomain	Fires on CDNs, JWTs, session tokens	Query rate + query type distribution + ASN
High-entropy TLS SNI/CN	Can't distinguish C2 from CDN	JA4 + certificate age + domain registration delta
High-entropy PowerShell cmd	Fires on compressed scripts	Parent process + execution context
High-entropy S3 object keys	UUID keys are normal	Principal behavior baseline + rate + source IP

What these cases have in common

In every case, entropy detects a property of a string. The malicious behavior is a property of context – who did it, what the session looks like over time, and how it deviates from this host or principal's normal pattern.

Protocol semantics tell you what a packet should look like. Behavioral baselines tell you what this host normally does. Combining those two is where real detection coverage comes from.

Entropy can be a supporting signal in that stack. It just can't be the anchor.

Part 3 covers the implementation: how to actually build these kinds of detections using protocol semantics and behavioral baselines – the same approach behind my sigma-to-spl and detection-notes repos.