Kubernetes’ extraordinary resilience tends to change the emphasis of monitoring from alerting to resource and performance management.
Monitoring Kubernetes with Opsview: Part 2
In Part 1 of this series, we discussed some of the issues that complicate monitoring of advanced container orchestration environments like Kubernetes. We advanced the idea that infrastructure monitoring played a vital role in monitoring complex container stacks and their underlying infrastructure -- a role that complements use of application performance monitoring (APM) to observe workloads. In this second installment, we elaborate our infrastructure monitoring strategy around Kubernetes, provide means for building a test cluster, and present a new tool -- the Opsview Kubernetes Opspack -- enabling Opsview to observe Kubernetes cluster metrics directly.
Opsview’s Kubernetes Monitoring Strategy
Opsview’s Kubernetes Opspack completes a full-stack infrastructure monitoring solution that helps you keep Kubernetes (and, other things being equal, its workloads) available and performant. Implementing this monitoring stack can be done in minutes, with very little customization or specialized knowledge required. It works well on its own, or as a complement to an Application Performance Monitoring strategy focused on container or higher-order platform metrics (e.g., metrics produced by a Functions-as-a-Service platform like OpenFaaS, hosted on Kubernetes -- see below). Opsview lets you monitor:
- The Linux host OS and hardware on each node - This can be done several ways, but is best accomplished by installing the Opsview Agent on each Kubernetes node and implementing Linux Basic and Advanced monitoring Opspacks, which come preloaded with Opsview. It’s also important to monitor Linux and hardware on nodes of any separate database or volume-storage cluster used to persist container state and/or store critical application data, since these represent single points of failure for application performance and availability.
- The Docker Engines on each node working as container runtimes - This can be done by implementing the Application - Docker Opspack (also pre-loaded). Opsview Docker monitoring requires a Perl library to be installed on each monitored node (see below).
- Any external database or storage system used to persist data or container state - Opsview provides preloaded Opspacks for insightful monitoring of MySQL and many other popular databases, and compatible monitoring plugins for distributed storage engines such as Ceph are readily available.
- The Kubernetes cluster itself - The Opsview Kubernetes Opspack provides seven service checks populated by interrogating the Kubernetes cluster’s metrics API -- checks that, in several cases, map closely to Google’s four golden signals.
Additionally, it can make sense to aggregate all these metrics as a Business Service, using Opsview's BSM feature. This helps you visualize the stack cleanly separated from other monitored infrastructure, so you can see at a glance the status of any conditions that may affect application availability.
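To give a sense of the kind of cluster-level data the Kubernetes API exposes, here is a small Python sketch that reduces a node list (the JSON returned by the standard /api/v1/nodes endpoint) to a ready/not-ready flag per node. It is illustrative only and is not how the Opspack itself is implemented; the function name and the hand-written sample document, trimmed to the fields the function reads, are our own assumptions.

```python
def node_ready_states(node_list: dict) -> dict:
    """Map each node name to True if its 'Ready' condition has status 'True'."""
    states = {}
    for item in node_list.get("items", []):
        name = item["metadata"]["name"]
        conditions = item.get("status", {}).get("conditions", [])
        # Kubernetes reports condition status as the strings "True", "False" or "Unknown".
        states[name] = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
    return states


# Hand-written sample, trimmed to the fields used above (not real cluster output).
SAMPLE_NODE_LIST = {
    "kind": "NodeList",
    "items": [
        {
            "metadata": {"name": "k8smaster"},
            "status": {"conditions": [{"type": "Ready", "status": "True"}]},
        },
        {
            "metadata": {"name": "k8sworker1"},
            "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]},
        },
    ],
}

print(node_ready_states(SAMPLE_NODE_LIST))
# prints {'k8smaster': True, 'k8sworker1': False}
```

A node whose Ready condition is "Unknown" or missing is treated as not ready here, which is a deliberately conservative choice for an availability check.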
In laboratory work over the past several months, we’ve had the opportunity to test this monitoring schema in the course of researching OpenFaaS -- an open source Functions-as-a-Service (aka “Serverless Computing”) framework that lives on top of a Kubernetes (or Docker Swarm) cluster and leverages its high availability, autoscaling, and other features. OpenFaaS has let us stress the underlying infrastructure in a range of ways, and judge the overall resource consumption profile and performance impacts of different kinds of scale-out function workloads.
Helpfully, OpenFaaS includes its own monitoring API, and ships with containerized Prometheus for visualizing these metrics. Using Prometheus in tandem with Opsview has been interesting, to say the least. One thing we’ve determined is that -- logically enough -- APM-level metrics like OpenFaaS gateway invocations (a measurement of traffic to functions) map quite closely to time-series metrics collected by Opsview from Kubernetes for HTTP latency, and to other reactive infrastructure metrics from lower levels of the stack. Our conclusion is that monitoring the stack (provided this is done thoughtfully) yields real insight into application performance and end-user experience, as well as information about resource utilization, thresholds, and other operations-relevant matters. Ultimately, the two kinds of monitoring -- APM and infrastructure -- complement one another.
Rolling Your Own Kubernetes Cluster
Opsview’s Kubernetes Monitoring Opspack will work with any contemporary Kubernetes cluster -- on-premises or hosted -- whose master node entrypoint can be made accessible by proxy on the node local network (kubectl proxy is generally used for this). For monitoring a hosted cluster, we recommend either placing an Opsview instance on a VM sharing the same local network as your Kubernetes, or setting up a VPN connection to the local network from a remote location.
If you’d like to quickly set up your own cluster for testing Opsview's Kubernetes Opspack (also quite useful for development or small-scale work, and lots of fun as a host for OpenFaaS), we’ve made life a little easier for you by open sourcing the Ansible playbooks that automate setup of our lab environment. They are on GitHub, in an Opsview repo called kubernetes-ansible-example. If you supply three Linux VMs (or bare-metal machines) running Ubuntu 16.04 LTS, put them on a VLAN with internet access, and add a VM to host the Ansible deployer, our scripts will build you a best-practice Kubernetes 1.10.2 cluster (the latest release at the time of writing) using the Kubernetes project’s official kubeadm deployer. The playbooks can easily be customized to deploy a single-node/master-only Kubernetes, a two-node cluster, or prep and join an arbitrary number of additional workers.
The Ansible playbooks will also prep your nodes with dependencies, install Opsview Agent, and install the Perl modules required for Opsview’s Application - Docker Opspack.
We’ve provided a complete installation and customization tutorial in the GitHub repo’s readme.md file. Just clone the repo to the machine on which you’ve installed Ansible, and follow the recipe.
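Because Ansible drives the nodes over SSH, a quick reachability check from the deployer machine can save a failed first run. The following Python sketch is our own illustration and is not part of the kubernetes-ansible-example repo; the function names are hypothetical, and the hostnames in the usage note are simply the ones used in our test cluster.

```python
import socket


def port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unreachable(hosts, port: int = 22, timeout: float = 3.0):
    """Return the subset of hosts that do not accept TCP connections on the port."""
    return [h for h in hosts if not port_open(h, port, timeout)]
```

For example, calling unreachable(["k8smaster", "k8sworker1", "k8sworker2"]) from the deployer should return an empty list before you start the playbooks. This only confirms that the SSH port answers; it does not verify credentials or sudo rights.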
Installing and Configuring the Kubernetes Monitoring Opspack
- Download the binary release of the Kubernetes Opspack (application-kubernetes.opspack) to your desktop from https://github.com/opsview/application-kubernetes/releases. (Image 1)
- Log into Opsview's webUI, click the menu icon (upper right), click the Settings tab, and select Host Template Settings. (Image 2)
- Click Import Opspack (upper left). Browse to the Kubernetes Opspack file, select, and upload it. (Image 3) (Image 4)
- Click the Import button to import the new Opspack into Opsview. The Application - Kubernetes host template will appear in the Host Templates list, highlighted in yellow. (Image 5) (Image 6)
- To enable Opsview to access the Kubernetes API, two methods are available. You can set up certificates and provide the Kubernetes host template with keys to authenticate (much more secure) or, for a PoC cluster on the same local network as your Opsview instance, you can simply SSH into your Kubernetes master node as the administrative user (our cluster calls the admin user ‘k8suser’ and the master node’s hostname is ‘k8smaster’) and run the kubectl proxy command:
kubectl proxy --port=8080 --address='0.0.0.0' --accept-hosts='^*$' &
The trailing & backgrounds the process, letting you close the terminal without halting the proxy. The command above uses a regular expression to permit all local hosts -- you can make this more restrictive, if you prefer. (Image 7)
- Next, in Opsview, select Menu>Hosts and click Add New. (Image 8)
- Select the Application - Kubernetes host template from the list on the left, and use the arrow to move it to the right-hand column. Give this host group an appropriate title (e.g., ‘Kubernetes’) and provide the IP address of the master node. Do not click Submit Changes yet. (Image 9)
- Under the Variables tab, start typing the word ‘KUBERNETES’ -- you’ll see a selection of Kubernetes-specific variables pop up. You need to add and define a KUBERNETES NODE variable for each node in your cluster, designating its hostname (which is the name Kubernetes recognizes it under). The nodes in our test cluster are called ‘k8smaster,’ ‘k8sworker1,’ and ‘k8sworker2.’ You also need to define a KUBERNETES PORT variable, which we’ve set to 8080 here -- the same as the port we’ve proxied, above. Click Submit Changes. Then pull down the Opsview menu again and select Reload, then Apply Changes, then Acknowledge to implement the new settings. (Image 10) (Image 11)
- Return to Opsview's Host Groups, Hosts, and Services main page, and you should see that your Kubernetes cluster is now being monitored. Within five minutes, any pending or unknown service checks should clear. Click to open the hosts tree (on the left), then click the checkbox next to your cluster’s name to view the current state of service checks: a wealth of information about critical services, node health, memory, and cluster-wide CPU utilization. (Image 12)
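If checks stay in an unknown state, it can help to confirm from the Opsview host that the proxied API answers, and to compare the node names Kubernetes reports with the values you entered as KUBERNETES NODE variables. Here is a minimal Python sketch of that check. It is not part of the Opspack: it assumes the unauthenticated kubectl proxy from the PoC setup above, listening on port 8080 on a master called ‘k8smaster’, and it reads the standard /api/v1/nodes endpoint. The function names are our own.

```python
import json
import urllib.request


def node_names(node_list: dict) -> list:
    """Extract node names from a Kubernetes NodeList document."""
    return [item["metadata"]["name"] for item in node_list.get("items", [])]


def fetch_node_list(host: str, port: int = 8080, timeout: float = 5.0) -> dict:
    """Fetch the NodeList through a plain-HTTP kubectl proxy (PoC setup assumed)."""
    url = f"http://{host}:{port}/api/v1/nodes"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Running print(node_names(fetch_node_list("k8smaster"))) on the Opsview host should list the same hostnames you defined in the Variables tab. A connection error at this point suggests the proxy is not running or is not reachable from the Opsview instance.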
In upcoming blogs, we’ll be expanding on this material -- using the Kubernetes Opspack in tandem with Opsview host templates for Docker, Linux, and MySQL, to demonstrate monitoring the whole stack as Kubernetes dynamically scales out containers under control of the OpenFaaS serverless computing framework. Stay tuned!