Show HN: A Mutating Webhook to automatically strip PII from K8s logs

(github.com)

18 points | by aragoss 4 hours ago

3 comments

aragoss 4 hours ago
Hey HN,
About 3 months ago I posted the first version of Pii-shield here. It is a tool that sanitizes logs: it detects API keys using Shannon entropy, credit card numbers using the Luhn algorithm, and custom PII using regexes.
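For readers unfamiliar with the two detection techniques named above, here is a minimal Go sketch of each. It is not taken from Pii-shield's code: the function names shannonEntropy and luhnValid are hypothetical, and how the real tool picks an entropy threshold or tokenizes a log line is not covered here.

```go
package main

import (
	"fmt"
	"math"
)

// shannonEntropy returns the Shannon entropy of s in bits per byte.
// Random-looking secrets tend to score higher than ordinary words.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// luhnValid reports whether a string of ASCII digits passes the Luhn checksum.
func luhnValid(digits string) bool {
	if len(digits) < 2 {
		return false
	}
	sum := 0
	double := false // every second digit, counting from the right, is doubled
	for i := len(digits) - 1; i >= 0; i-- {
		c := digits[i]
		if c < '0' || c > '9' {
			return false
		}
		d := int(c - '0')
		if double {
			d *= 2
			if d > 9 {
				d -= 9
			}
		}
		sum += d
		double = !double
	}
	return sum%10 == 0
}

func main() {
	// Illustrative inputs only; the token below is made up.
	for _, s := range []string{"password", "q8Zr2LxV0pTf7KsA"} {
		fmt.Printf("entropy(%q) = %.2f bits/byte\n", s, shannonEntropy(s))
	}
	// 4111111111111111 is a widely used test card number.
	fmt.Println("luhn ok:", luhnValid("4111111111111111"))
}
```

A real sanitizer would presumably apply the entropy check per token and only above some minimum length, since short strings give noisy scores.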
The tool itself worked well, but manually injecting sidecars into huge clusters was too complicated, so I rewrote the delivery mechanism and turned the project into a Kubernetes Operator (Mutating Webhook).
In the process I resolved the following issues:
1. Replaced the old tail -f | pii-shield pipe with a native Go mechanism that waits for the log files to be created, which avoids CrashLoopBackOff.
2. If the main container finished its work, the sidecar kept running and trying to read the log files. To fix this, the Operator injects the agent into the initContainers array with RestartPolicy: Always, so Kubernetes now knows how to handle it and terminates the sidecar gracefully.
3. If the main container runs as root with umask 0077, the nonroot sidecar can't read the file because of a Permission Denied error. Instead of requiring changes to users' manifests, the webhook handles this automatically: it checks the pod's SecurityContext and injects fsGroup: 65532.
Now everything is packaged into one Helm chart. You just add one simple label, pii-shield.io/inject: "true", and the Operator does the rest, with no code changes.
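As a rough picture of what such a mutation might produce, here is a Go sketch that builds a JSON Patch for a labeled pod. Only the label key, RestartPolicy: Always on an init container, and fsGroup: 65532 come from the description above. The trimmed-down pod types, the buildPatch function, the container name, and the image are hypothetical, and leaving an already-set fsGroup untouched is an assumption of this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	injectLabel    = "pii-shield.io/inject" // label key from the post
	sidecarFSGroup = 65532                  // fsGroup value from the post
)

// Minimal stand-ins for the pod fields this sketch needs (not the k8s API types).
type container struct {
	Name          string `json:"name"`
	Image         string `json:"image"`
	RestartPolicy string `json:"restartPolicy,omitempty"`
}

type podSecurityContext struct {
	FSGroup *int64 `json:"fsGroup,omitempty"`
}

type pod struct {
	Metadata struct {
		Labels map[string]string `json:"labels"`
	} `json:"metadata"`
	Spec struct {
		InitContainers  []container         `json:"initContainers"`
		SecurityContext *podSecurityContext `json:"securityContext"`
	} `json:"spec"`
}

// patchOp is one JSON Patch (RFC 6902) operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// buildPatch returns the patch for an opted-in pod, or nil otherwise.
func buildPatch(p pod) []patchOp {
	if p.Metadata.Labels[injectLabel] != "true" {
		return nil
	}
	// Hypothetical name and image; restartPolicy Always marks it as a sidecar.
	sidecar := container{Name: "pii-shield", Image: "example.invalid/pii-shield:dev", RestartPolicy: "Always"}

	var ops []patchOp
	if len(p.Spec.InitContainers) == 0 {
		// The array may not exist yet, so add it whole.
		ops = append(ops, patchOp{Op: "add", Path: "/spec/initContainers", Value: []container{sidecar}})
	} else {
		// "-" appends to an existing array.
		ops = append(ops, patchOp{Op: "add", Path: "/spec/initContainers/-", Value: sidecar})
	}
	switch {
	case p.Spec.SecurityContext == nil:
		ops = append(ops, patchOp{Op: "add", Path: "/spec/securityContext", Value: map[string]int64{"fsGroup": sidecarFSGroup}})
	case p.Spec.SecurityContext.FSGroup == nil:
		ops = append(ops, patchOp{Op: "add", Path: "/spec/securityContext/fsGroup", Value: sidecarFSGroup})
	}
	return ops
}

func main() {
	raw := []byte(`{"metadata":{"labels":{"pii-shield.io/inject":"true"}},"spec":{}}`)
	var p pod
	if err := json.Unmarshal(raw, &p); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	out, err := json.MarshalIndent(buildPatch(p), "", "  ")
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(out))
}
```

A real webhook would wrap a patch like this in an AdmissionReview response and would also need volume mounts for the log directory, which this sketch leaves out.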
Would be happy to hear your thoughts about it.
deferredgrant 2 hours ago
The hard part is not only catching PII. It is doing that without destroying the debugging signal people needed from the logs in the first place.
dlcarrier 3 hours ago
I saw PII and K8 and thought this was talking about early 2000's processors from Intel (Pentium II) and AMD (K8 is the 1st-gen Athlon 64), respectively.