Introduction to kubernetes pt. 4: Monitoring

This is the fourth and final part of this little series introducing kubernetes to system operators. In the first post I gave an overview of the general structure of a kubernetes cluster. The second post dealt with managing ingress networking, as exposing services to the outside world is something you need to do in pretty much every setup. Finally, in the previous post we looked at the components involved in getting storage attached to your pods.

In this post I will present a solution to the whole monitoring and logging issue from the perspective of an operator running a few services on the platform. To clarify, this is the level of logging system administrators want to see. Application developers and customers have different needs, such as tracing or nice fancy dashboards displaying business data.

And the solution is simple: the Prometheus operator for monitoring and Loki for logs, both displayed via the same Grafana.

Benefits of Prometheus operator with Grafana

The list of benefits is pretty long and I think most of them do not need further explanation:

  • architectural superiority of Prometheus
  • if needed: extendable for long-term storage with Thanos
  • excellent query language for derived metrics
  • good helm chart available from the Prometheus community
  • kubernetes labels end up as logging and monitoring labels
  • dashboards can be injected via config maps, so versioning in git is possible
  • exporting metrics is very simple
  • Prometheus metrics are ubiquitous
  • debugging with colleagues has minimal friction

I want to emphasize the benefit of a good query language for derived metrics. Often you only get some raw number, like disk usage, which changes over time, but you also want to know the rate of change. In the past you either needed a disk space plugin that already calculated this, or you had to hack an additional exporter into your systems. With the Prometheus query language, instead of only having a metric called disk_space_used, you can also display rate(disk_space_used[1h]), which shows the rate of change and is transparent to the operator, i.e. you do not have to figure out which magical computation your colleague hid in some bash file on the monitoring server.
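Such a derived metric can even be precomputed and shipped declaratively as a recording rule. A minimal sketch, reusing the illustrative disk_space_used metric from above (the rule and metric names are made up for this example):

```yaml
# A PrometheusRule CRD picked up automatically by the Prometheus operator.
# disk_space_used is the illustrative metric name from the text above.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: disk-usage-rules
  labels:
    release: prometheus          # must match the operator's ruleSelector labels
spec:
  groups:
    - name: disk.rules
      rules:
        - record: instance:disk_space_used:rate1h
          expr: rate(disk_space_used[1h])
```

Since the rule lives in a plain Kubernetes object, it goes into git next to the rest of your deployment, which is exactly the transparency argued for above.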

Prometheus and Grafana also make debugging sessions so much easier. Gone are the days when you had to be directed by your colleagues to the right file or magical command invocation to observe something. Now you can just copy and paste specific queries and everyone can see them. Sometimes you can even inherit whole dashboards colleagues have built for a specific problem.

Of course Prometheus and Grafana are also superior on more traditional hardware setups, but there you need additional tools like Ansible to get a nice and reproducible deployment.

Some tips for running the Prometheus operator with Grafana

The first tip is about deploying it and the values file. You might notice that the Prometheus operator chart uses several subcharts, roughly one per component, and the values file does not expose all configuration parameters of those subcharts. So sometimes you just need to dig into the inherited helm charts, like the Grafana one, to find configuration options, e.g. when you want to inject alerting channels as secrets. In these cases a look at the Grafana provisioning documentation is also helpful to get some ideas.
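For illustration: anything under the subchart's key in the umbrella values file is passed straight through to that subchart, so the valid options come from the Grafana chart's documentation, not the operator's. A sketch for the secrets case, where the secret name is hypothetical:

```yaml
# values.yaml for the Prometheus operator umbrella chart.
# The `grafana:` block is forwarded verbatim to the Grafana subchart.
grafana:
  # hypothetical secret containing e.g. GF_SMTP_HOST, GF_SMTP_PASSWORD;
  # Grafana reads GF_<SECTION>_<KEY> environment variables as config overrides,
  # so credentials for alerting channels never end up in the values file itself
  envFromSecret: grafana-alerting-secrets
```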

If you are new to Prometheus, the myriad of metrics scraped can make it difficult to find out what is already exported. You can see the whole list of scraped targets on the /targets path of the Prometheus pod.
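To get your own service onto that targets list, the operator way is a ServiceMonitor. A minimal sketch, assuming a Service labelled app: my-service that exposes a port named metrics (both names are placeholders):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-service
  labels:
    release: prometheus        # must match the serviceMonitorSelector of your Prometheus
spec:
  selector:
    matchLabels:
      app: my-service          # picks up the Service carrying this label
  endpoints:
    - port: metrics            # name of the port in the Service definition
      interval: 30s
```

Once applied, the new target shows up on /targets, including its scrape status, which makes it easy to see whether your endpoint is actually reachable.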

For many components there are already nice dashboards available, which you can explore on the Grafana website. If you choose to keep them, always remember to save the JSON to your git repository, so you can deploy them as config maps. Creating your own dashboards can be a little cumbersome, as you cannot edit dashboards that have been provisioned via config maps. If you have multiple clusters or a testing cluster this is no problem though: you can just modify them in one cluster and then roll out the dashboards via the usual means to your remaining clusters.
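As a sketch of the config map route: the Grafana sidecar shipped with the operator chart watches for config maps carrying a marker label (grafana_dashboard by default) and loads the contained JSON. The dashboard body here is a trivial placeholder:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-dashboard
  labels:
    grafana_dashboard: "1"     # default marker label the Grafana sidecar watches for
data:
  my-dashboard.json: |
    { "title": "My dashboard", "panels": [] }
```

The JSON you export from the Grafana UI (or download from the Grafana website) goes into the data field verbatim, so the whole dashboard is versioned in git.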

About Loki and alternatives

So far I have mainly talked about Prometheus, chiefly because there is just not much to say about Loki. It collects logs and displays them; you can match labels, query with regular expressions, and do the equivalent of grep and grep -v. There is not much to write home about, as it simply does what you want and it works.
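Wiring Loki into the same Grafana is equally unspectacular. A sketch using the additionalDataSources key of the chart's Grafana subchart; the URL is the usual in-cluster service address and may differ in your setup:

```yaml
grafana:
  additionalDataSources:
    - name: Loki
      type: loki
      access: proxy
      url: http://loki:3100    # in-cluster Loki service; adjust to your deployment
```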

One should note though that "it just works" is not a particularly low standard. There are bad ideas one can have, such as ELK and other proprietary stacks, which come with several downsides that make them harder to work with.

On the more general side, as a system administrator you notice when projects are not community driven but have the taint of opinionated open source, aka "we want to sell you something". Deployment is just not as nice and polished, sometimes documentation is paywalled (thank you for nothing, RedHat), weird bugs stay in forever; the little annoyances just add up. And ElasticSearch in particular is an aristocratic brat when it comes to running smoothly. This might get better with the AWS fork though.

More specific problems arise simply because it was made for a different job. When you write business logging pipelines, you fit your output to the logging stack, so everything works nicely. Application logs, on the other hand, are the least standardized thing on earth, and if you run a sufficient number of different applications you will produce inputs that make Elastic just barf. It's like running a fuzzer on something that really does not like to be fuzzed. Game over.

The query language is also just not as nice for our purposes. It is more about correlating higher-level events and less about just reading the damn error messages some application produces. So it is simply not the right tool for the job.
