Deploying TOBS (The Observability Stack) on LKE
TOBS, short for The Observability Stack, is a pre-packaged distribution of monitoring tools and dashboard interfaces that can be installed on any existing Kubernetes cluster. It bundles many of the most popular open-source observability tools, including Prometheus, Grafana, PromLens, and TimescaleDB. Together, these provide a maintainable way to collect and analyze metrics and traces from your cluster and to identify potential problems with a deployment. This guide covers deploying TOBS on LKE (Linode Kubernetes Engine) using Helm, then using the kubectl port-forward command for local access to your monitoring interfaces.
TOBS includes the following components:
- OpenTelemetry collector is deployed to collect traces.
- Alertmanager is deployed alongside Prometheus and forms the alerting layer of the stack, handling the alerts that Prometheus generates.
- Grafana is a data visualization and analytics tool that allows you to build dashboards and graphs for your metrics data.
- PromLens is a PromQL query builder that helps you build, understand, and fix your queries more effectively.
- TimescaleDB provides long-term storage of metric data, which makes post-hoc analysis over long periods of time possible. Such analysis can be used for capacity planning, identifying slow-moving regressions, trend analysis, auditing, and more. For information about connecting to the database from the cluster, see the TimescaleDB Documentation.
- Promscale provides the translation layer between Prometheus and the database. It allows the Prometheus server to store and retrieve metrics from TimescaleDB, and allows users to use PromQL on Promscale and Prometheus.
- Prometheus is an open-source systems monitoring and alerting toolkit. It has become the de facto standard in metric monitoring and is the basis of standards such as OpenMetrics. It allows you to monitor and understand how your infrastructure and applications are performing. Service discovery allows Prometheus to automatically discover components within your Kubernetes cluster that are already emitting metrics.
- kube-state-metrics exports metrics about Kubernetes resources, such as their status and count, giving visibility into the desired and current state of those resources as well as trends in your cluster.
- Node-Exporter is deployed to export node-related metrics, such as CPU and memory usage, from the Kubernetes cluster.
Before You Begin
Deploy an LKE Cluster. This guide was written using an example node pool with three 4 GB Shared CPU Compute Instances. Depending on the workloads you plan to deploy on your cluster, you may consider using other plans with more available resources.
Install Helm 3 to your local environment.
Install kubectl to your local environment and connect to your cluster.
Create the monitoring namespace on your LKE cluster:
kubectl create namespace monitoring
Add the stable Helm charts repository to your Helm repos:
helm repo add stable https://charts.helm.sh/stable
Update your Helm repositories:
helm repo update
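The prerequisites above assume that both helm and kubectl are on your PATH. A minimal sketch for confirming that before you continue is shown here; check_tool is a hypothetical helper name introduced for illustration and is not part of either tool.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check the two tools this guide relies on and summarize the result.
missing=0
for tool in helm kubectl; do
  check_tool "$tool" || missing=1
done
[ "$missing" -eq 0 ] || echo "Install the missing tools before continuing." >&2
```

The script only inspects your local environment. It does not verify that kubectl is connected to your LKE cluster.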
TOBS Minimal Deployment
In this section, learn to deploy TOBS for individual, local access with kubectl port-forward.
Deploy The Observability Stack
Install a certificate manager for your LKE cluster:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.yaml
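The cert-manager components take a short time to start, and a later install step may fail if they are not yet available. One way to pause until they are ready is sketched here. It assumes the manifest above created its deployments in the cert-manager namespace; wait_for_cert_manager is a hypothetical helper name, and the 180-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: block until every deployment in the cert-manager
# namespace reports the Available condition, or give up after 180 seconds.
# Assumes kubectl is already connected to your LKE cluster.
wait_for_cert_manager() {
  kubectl -n cert-manager wait \
    --for=condition=Available deployment --all \
    --timeout=180s
}
```

Run wait_for_cert_manager after applying the manifest. It returns a non-zero status if the timeout expires first.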
Using Helm, deploy the TOBS release labeled lke-monitor in the monitoring namespace on your LKE cluster:
helm repo add timescale https://charts.timescale.com/
helm repo update
helm install --wait lke-monitor timescale/tobs --namespace monitoring
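If you are unsure whether the release was created, for example after an interrupted install, you can ask Helm to list it by name. The following is a sketch only; release_exists is a hypothetical helper name, and it relies on helm list printing nothing when no release matches the filter.

```shell
# Hypothetical helper: succeed if a Helm release with the given name is
# listed in the given namespace.
#   $1 = release name, $2 = namespace
release_exists() {
  # --filter takes a regular expression; anchor it to match the exact name.
  # --short prints release names only, so empty output means no match.
  [ -n "$(helm list --namespace "$2" --filter "^$1\$" --short)" ]
}
```

For example, `release_exists lke-monitor monitoring && echo "release found"` prints a message only when the release is listed.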
Verify that the Prometheus Operator has been deployed to your LKE cluster and its components are running and ready by checking the pods in the monitoring namespace:
kubectl -n monitoring get pods
You should see output similar to the following:
NAME                                                        READY   STATUS      RESTARTS      AGE
alertmanager-tobs-kube-prometheus-alertmanager-0            2/2     Running     0             2m13s
lke-monitor-connection-secret-j4sdh                         0/1     Completed   0             2m35s
lke-monitor-grafana-54d979dcf5-tkkgj                        3/3     Running     2 (65s ago)   2m32s
lke-monitor-grafana-db-swm8g                                0/1     Completed   3             2m35s
lke-monitor-kube-state-metrics-6bc5c44b9-g8r5g              1/1     Running     0             2m27s
lke-monitor-prometheus-node-exporter-b4vvg                  1/1     Running     0             2m33s
lke-monitor-prometheus-node-exporter-bbcnd                  1/1     Running     0             2m34s
lke-monitor-prometheus-node-exporter-frrfp                  1/1     Running     0             2m26s
lke-monitor-promlens-569cfbd586-bkhrr                       1/1     Running     0             2m34s
lke-monitor-promscale-86d574986c-9wj2z                      1/1     Running     4 (64s ago)   2m27s
lke-monitor-timescaledb-0                                   1/1     Running     0             2m30s
opentelemetry-operator-controller-manager-8cf5c85c8-krdj5   2/2     Running     0             2m27s
prometheus-tobs-kube-prometheus-prometheus-0                2/2     Running     0             2m13s
tobs-kube-prometheus-operator-5b4f674986-55r4k              1/1     Running     0             2m34s
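On a busy cluster it can be easier to filter the pod listing than to read it line by line. The sketch below prints only the pods whose STATUS is neither Running nor Completed; unhealthy_pods is a hypothetical helper name, and it assumes the default column layout of kubectl get pods, where STATUS is the third column.

```shell
# Hypothetical helper: read the table printed by `kubectl get pods` on
# standard input and print the names of pods that are neither Running
# nor Completed. The header row (line 1) is skipped.
unhealthy_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Against a live cluster (requires kubectl access):
#   kubectl -n monitoring get pods | unhealthy_pods
```

Empty output means every listed pod is in one of the two expected states. It does not confirm that all containers in a pod are ready, so the READY column is still worth a glance.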
Access Monitoring Interfaces with Port-Forward
List the services running in the monitoring namespace and review their respective ports:
kubectl -n monitoring get svc
You should see output similar to the following:
NAME                                                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
alertmanager-operated                                       ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   3m41s
lke-monitor                                                 ClusterIP   10.128.40.142    <none>        5432/TCP                     4m3s
lke-monitor-config                                          ClusterIP   None             <none>        8008/TCP                     4m3s
lke-monitor-grafana                                         ClusterIP   10.128.102.243   <none>        80/TCP                       4m3s
lke-monitor-kube-state-metrics                              ClusterIP   10.128.208.39    <none>        8080/TCP                     4m3s
lke-monitor-prometheus-node-exporter                        ClusterIP   10.128.170.88    <none>        9100/TCP                     4m3s
lke-monitor-promlens                                        ClusterIP   10.128.45.92     <none>        80/TCP                       4m3s
lke-monitor-promscale-connector                             ClusterIP   10.128.198.88    <none>        9201/TCP,9202/TCP            4m3s
lke-monitor-replica                                         ClusterIP   10.128.137.189   <none>        5432/TCP                     4m3s
opentelemetry-operator-controller-manager-metrics-service   ClusterIP   10.128.45.42     <none>        8443/TCP                     4m3s
opentelemetry-operator-webhook-service                      ClusterIP   10.128.12.89     <none>        443/TCP                      4m3s
prometheus-operated                                         ClusterIP   None             <none>        9090/TCP                     3m41s
lke-monitor-kube-prometheus-alertmanager                    ClusterIP   10.128.33.44     <none>        9093/TCP                     4m3s
lke-monitor-kube-prometheus-operator                        ClusterIP   10.128.175.39    <none>        443/TCP                      4m3s
lke-monitor-kube-prometheus-prometheus                      ClusterIP   10.128.106.173   <none>        9090/TCP                     4m3s
From the above output, the services you will access use the following ports:
Resource       Service Name                               Port
Prometheus     lke-monitor-kube-prometheus-prometheus     9090
Alertmanager   lke-monitor-kube-prometheus-alertmanager   9093
Grafana        lke-monitor-grafana                        80
Use kubectl port-forward to open a connection to a service, then access the service's interface by entering the corresponding address in your web browser.
Note: Press Control+C on your keyboard to terminate a port-forward process after entering any of the following commands.
To provide access to the Prometheus interface at the address 127.0.0.1:9090 in your web browser, enter:
kubectl -n monitoring \
  port-forward \
  svc/lke-monitor-kube-prometheus-prometheus \
  9090
To provide access to the Alertmanager interface at the address 127.0.0.1:9093 in your web browser, enter:
kubectl -n monitoring \
  port-forward \
  svc/lke-monitor-kube-prometheus-alertmanager \
  9093
To provide access to the Grafana interface at the address 127.0.0.1:8081 in your web browser, enter:
kubectl -n monitoring \
  port-forward \
  svc/lke-monitor-grafana \
  8081:80
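Each port-forward command above occupies its own terminal. If you would rather open all three connections from one terminal, the commands can be run in the background from a small wrapper. This is a sketch under the assumption that the service names match the ones listed earlier; forward_all is a hypothetical function name.

```shell
# Hypothetical helper: start the three port-forwards from this guide in the
# background, then wait. Pressing Control+C stops all of them.
forward_all() {
  ns=monitoring
  pids=""
  kubectl -n "$ns" port-forward svc/lke-monitor-kube-prometheus-prometheus 9090 &
  pids="$pids $!"
  kubectl -n "$ns" port-forward svc/lke-monitor-kube-prometheus-alertmanager 9093 &
  pids="$pids $!"
  kubectl -n "$ns" port-forward svc/lke-monitor-grafana 8081:80 &
  pids="$pids $!"
  # Background jobs may not receive the interrupt themselves,
  # so terminate them explicitly when the script is interrupted.
  trap 'kill $pids 2>/dev/null' INT TERM
  wait
}
```

Calling forward_all keeps the terminal occupied until you interrupt it. The interfaces are then reachable at the same local addresses as in the individual commands.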
When accessing the Grafana interface, log in as admin. You can get the password using:
kubectl get secret --namespace monitoring lke-monitor-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
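The base64 --decode step in the command above is needed because Kubernetes stores the values under a secret's data field in base64-encoded form. The round trip can be illustrated locally without touching the cluster; the value used here is a made-up placeholder, not a real credential.

```shell
# Encode a placeholder value the way it would appear in a secret's data field.
# 'example-password' is a made-up sample, not a real credential.
encoded=$(printf '%s' 'example-password' | base64)
echo "stored form:  $encoded"

# Decode it again, as the pipeline in the guide does; the trailing echo
# adds the newline that the decoded value itself lacks.
printf 'decoded form: '
printf '%s' "$encoded" | base64 --decode ; echo
```

The trailing echo plays the same role as in the guide's command: it keeps your shell prompt from appearing on the same line as the password.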
The Grafana dashboards are accessible at Dashboards > Manage from the left navigation bar.
TOBS eliminates the need to maintain configuration details for each of the applications, while providing standardized monitoring for the applications running on your cluster.
More Information
You may wish to consult the following resources for additional information on this topic. While these are provided in the hope that they will be useful, please note that we cannot vouch for the accuracy or timeliness of externally hosted materials.
- TOBS Helm Chart on GitHub: Useful for reviewing configuration parameters and troubleshooting.
- Prometheus Documentation
- Alertmanager Documentation
- Grafana Tutorials
- TimescaleDB Documentation