Kube-prometheus provides a great collection of components and alerts that help us monitor our Kubernetes cluster. I’ve used it in our production cluster for several months. Although the project exposes a number of options via
_config+:: to make it partially configurable, the set of parameters we can modify is still limited.
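As a rough illustration of how the `_config+::` hook behaves, here is a minimal, self-contained sketch. The `base` object is a stand-in written for this example, not the real library, and the `scrapeInterval` key is hypothetical.

```jsonnet
// Stand-in for the library object (assumption: the real one is far larger).
local base = {
  // Hidden (`::`) config that other fields read from.
  _config:: {
    namespace: 'default',
    scrapeInterval: '30s',  // hypothetical key, for illustration only
  },
  // A visible field derived from the hidden config.
  summary: {
    namespace: $._config.namespace,
    scrapeInterval: $._config.scrapeInterval,
  },
};

// `+::` merges into the hidden `_config` object instead of replacing it,
// so `scrapeInterval` keeps its default while `namespace` is overridden.
base {
  _config+:: {
    namespace: 'monitoring',
  },
}
```

Because `$` is late-bound, `summary` picks up the overridden namespace. The limitation is that only values the library actually reads from `_config` can be changed this way.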
For example, we currently deploy our workloads on Google Kubernetes Engine (a.k.a. GKE) on Google Cloud Platform. GKE manages the control plane of the cluster, which means components such as the scheduler and controller manager are “invisible” to users.
Therefore, alert rule groups such as
kubernetes-system-controller-manager are unnecessary for us, as are some Grafana dashboards. I personally want to remove them to avoid potential confusion.
Another example would be editing the
for field of alert rules. The default duration for some alerts is
15m, which is a bit too long for our SLA. We want a shorter duration that we can tolerate.
The good news is that, thanks to Jsonnet's powerful syntax, we can customize and tinker with the project without forking or copy-pasting.
My idea was to define a bunch of “manipulators” in an array (
prometheusRuleManipulators). This is like middleware in web app development, where every HTTP request passes through the middleware serially and can be changed before it reaches the app: I want every alert to pass through the manipulators in order, and the output of the last manipulator to be saved as the final result.
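A minimal sketch of what such an array could look like, assuming each manipulator is a function that takes one rule object and returns a (possibly modified) rule. The alert name `ExampleAlert`, the `team` label and the sample rule are hypothetical; only the name `prometheusRuleManipulators` comes from the text above.

```jsonnet
// Each manipulator takes one rule and returns a (possibly modified) rule.
local prometheusRuleManipulators = [
  // Hypothetical: shorten the `for` duration of a single alert.
  // `for` is a Jsonnet keyword, so the field name must be quoted.
  function(rule)
    if std.objectHas(rule, 'alert') && rule.alert == 'ExampleAlert'
    then rule { 'for': '5m' }
    else rule,
  // Hypothetical: add a label to every alerting rule.
  function(rule)
    if std.objectHas(rule, 'alert')
    then rule { labels+: { team: 'platform' } }
    else rule,
];

// Sample rule, made up for this example.
local sampleRule = {
  alert: 'ExampleAlert',
  expr: 'up == 0',
  'for': '15m',
  labels: { severity: 'warning' },
};

// Chain the two manipulators by hand: the output of the first
// becomes the input of the second, like middleware.
prometheusRuleManipulators[1](prometheusRuleManipulators[0](sampleRule))
```

Chaining by hand like this does not scale past a couple of entries, which is why a generic way to run the whole array is needed.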
I initially intended to implement this using
// Won't work
However, Jsonnet did not seem to allow that, so I ended up using recursion in
applyRuleManipulators. It calls the manipulator at index
idx, then calls itself with idx increased by one, and stops once
idx >= std.length(prometheusRuleManipulators).
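The recursion described here could be sketched roughly as follows. The function signature (a rule plus an `idx` argument defaulting to 0) is an assumption, and the two manipulators and the sample rule are hypothetical placeholders.

```jsonnet
// Hypothetical manipulators, kept trivial for the example.
local prometheusRuleManipulators = [
  function(rule) rule { labels+: { cluster: 'gke' } },
  function(rule) rule { 'for': '5m' },
];

// Apply manipulator number `idx`, then recurse with `idx + 1`.
// Once `idx` runs past the end of the array, return the rule as it is.
local applyRuleManipulators(rule, idx=0) =
  if idx >= std.length(prometheusRuleManipulators)
  then rule
  else applyRuleManipulators(prometheusRuleManipulators[idx](rule), idx + 1);

// Sample rule, made up for this example.
applyRuleManipulators({
  alert: 'ExampleAlert',
  expr: 'up == 0',
  'for': '15m',
  labels: { severity: 'warning' },
})
```

The standard library's `std.foldl` is another way to express this kind of chaining, for instance `std.foldl(function(r, m) m(r), prometheusRuleManipulators, rule)`, though the explicit recursion keeps the stopping condition visible.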
I wrote two functions,
manipulatePrometheusGroups and manipulatePrometheusRules, which traverse the groups and rules respectively and optionally filter out the ones we don’t use with a Python-style list comprehension.
manipulatePrometheusRules also calls
applyRuleManipulators, mentioned above, to apply the manipulators.
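A hedged sketch of how the two traversal functions might fit together. The ignore lists, the sample groups and the one-line `applyRuleManipulators` stand-in are assumptions made so the example evaluates on its own.

```jsonnet
// Names to drop. The alert name is hypothetical.
local ignoredGroups = ['kubernetes-system-controller-manager'];
local ignoredAlerts = ['ExampleNoisyAlert'];

// Stand-in for the recursive helper: here it only shortens `for`.
local applyRuleManipulators(rule) = rule { 'for': '5m' };

// Walk the rules of one group, skipping ignored alerts and
// sending every remaining rule through the manipulators.
local manipulatePrometheusRules(rules) = [
  applyRuleManipulators(rule)
  for rule in rules
  if !(std.objectHas(rule, 'alert') && std.count(ignoredAlerts, rule.alert) > 0)
];

// Walk the groups, skipping ignored ones and rewriting the rest.
local manipulatePrometheusGroups(groups) = [
  group { rules: manipulatePrometheusRules(group.rules) }
  for group in groups
  if std.count(ignoredGroups, group.name) == 0
];

// Sample input, made up for this example.
manipulatePrometheusGroups([
  { name: 'kubernetes-system-controller-manager', rules: [] },
  {
    name: 'example-group',
    rules: [
      { alert: 'ExampleAlert', expr: 'up == 0', 'for': '15m' },
      { alert: 'ExampleNoisyAlert', expr: 'vector(1)', 'for': '15m' },
    ],
  },
])
```

Only `example-group` survives in the result, and it contains only `ExampleAlert`, with its `for` field rewritten by the stand-in manipulator.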
Finally, we can override the alerts by calling
manipulatePrometheusGroups(super.groups) at the end.
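To show where `super.groups` comes from, here is a small sketch of the final override. It assumes the alert groups live under a hidden `prometheusAlerts` field; the `base` object, the `rendered` field and the simplified `manipulatePrometheusGroups` are stand-ins written for this example.

```jsonnet
// Stand-in for the alert mixin shipped by the library (assumed layout).
local base = {
  prometheusAlerts:: {
    groups: [
      { name: 'kubernetes-system-controller-manager', rules: [] },
      {
        name: 'example-group',
        rules: [{ alert: 'ExampleAlert', expr: 'up == 0', 'for': '15m' }],
      },
    ],
  },
  // Visible field so that evaluating this file prints the final groups.
  rendered: $.prometheusAlerts.groups,
};

// Simplified stand-in: it only drops the group we do not want.
local manipulatePrometheusGroups(groups) = [
  group
  for group in groups
  if group.name != 'kubernetes-system-controller-manager'
];

base {
  prometheusAlerts+:: {
    // `super.groups` is the list defined by `base`, before this override.
    groups: manipulatePrometheusGroups(super.groups),
  },
}
```

Because the override uses `groups:` rather than `groups+:`, the manipulated list replaces the original instead of being appended to it.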
After writing this, I also found a way to edit the alerts using
std.map: https://github.com/prometheus-operator/kube-prometheus/discussions/607. There is also a great quick-start Jsonnet tutorial in Chinese: https://archive.li/IWlZG, https://archive.li/L4k1L.