Opsview Ansible Automation for Monitoring
Opsview provides a complete suite of Ansible modules and an underlying Python package that makes short work of automating routine monitoring tasks at scale. Here’s a shallow dive into how to start using these tools.
I’ve been building out a set of Opsview demo videos and, as part of this project, puzzling out some ways to make live product demonstrations easier, more realistic, and potentially less expensive to maintain.
A big part of the problem is that, because Opsview can work at enterprise scales, demos also need to be “big.” That is, they need to map to realistic enterprise use-cases, meaning lots of hosts, components, service tiers, and relatively complicated monitoring configurations. The natural tendency is to build these things out manually (e.g., on an internal or public cloud), configure monitoring for them (also manually) and then leave them running, which is, shall we say, cost- and/or resource-inefficient. Meanwhile, as IT folk will confirm, maintaining a diversity of large, complex systems over time is difficult: update a few things, or switch to a new release of Opsview, and there you are, manually tweaking or reconfiguring the monitoring again.
Enter automation. The real answer, here, is to not maintain static demo environments, but deploy them on demand, and tear them down when demos are complete. That means creating and maintaining infrastructure-as-code, including code that implements (and then un-implements) the monitoring on the systems you’re deploying.
Happily, Opsview provides many ways of doing that. At the simplest level, you can use Opsview's built-in REST API CLI (opsview_rest), though that’s more for one-off administrative efforts, or develop standard functions in any language to access the REST API directly. I recently did the latter using Node.js, in the context of a project that built out infrastructure on AWS using Amazon’s amazingly comprehensive SDK for Node.
Much easier, however, is to use standard deployment tools like Ansible, Puppet, or Chef, along with Opsview-provided modules that let you configure and manage Opsview inline with the code you’re using for deployment, scaling, in-service updates, and other lifecycle management (including teardown).
For my demo project, I decided to use the Opsview Ansible modules, since I was already using Ansible to deploy and tear down complex demo target systems on AWS. (Pro tip: if you do this, use Ansible dynamic inventory, via ec2.py, instead of a conventional hosts file. It’s amazing. See Working with Dynamic Inventory.)
To get you started, here are the basics of automating Opsview configuration with Ansible. Huge shout-out to Opsview’s Joshua Griffiths, DevOps Jedi, who wrote and maintains the modules and the underlying Python package.
Minimal documentation can be found in the GitHub repository, https://github.com/opsview/opsview-ansible-modules. The individual modules (there are many) are documented internally — their docs viewable with ansible-doc. We’ll explain that more in a moment.
We’re assuming you have Ansible 2.0 or greater installed. If not, see the Ansible installation documentation for instructions.
Opsview Ansible automation makes use of the Python package pyopsview, which is installable with the Python package manager: pip for Python 2.7, or pip3 for Python 3.5+. Use whichever matches the Python version your Ansible installation runs on.
$ sudo pip install 'pyopsview>=5.3.3'
Next, check whether you already have two folders called /etc/ansible/library and /etc/ansible/module_utils and, if not, create them. This is where you’ll put the Opsview Ansible modules and their subsidiary shared utility modules, respectively. These are simply Python source files (.py files).
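The check-then-create step is easy to script. What follows is only a minimal sketch, not part of Opsview's tooling: the function name ensure_module_dirs is made up for illustration, and it assumes the default /etc/ansible location named above unless you pass a different root.

```shell
# Hypothetical helper (not shipped with the Opsview modules): make sure the
# two directories that will hold the modules exist under a given root.
ensure_module_dirs() {
    root="${1:-/etc/ansible}"   # default location used in this article
    for sub in library module_utils; do
        if [ -d "$root/$sub" ]; then
            echo "exists:  $root/$sub"
        else
            mkdir -p "$root/$sub" && echo "created: $root/$sub"
        fi
    done
}
```

Writing under /etc normally needs root, so in practice the one-line equivalent, sudo mkdir -p /etc/ansible/library /etc/ansible/module_utils, may be all you need; mkdir -p leaves directories that already exist untouched.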
Finally, clone the opsview-ansible-modules repo to a convenient local directory:
$ git clone https://github.com/opsview/opsview-ansible-modules.git
Then copy the modules and module_utils from their respective directories to the ones you located or created above:
$ sudo cp -p opsview-ansible-modules/library/*.py /etc/ansible/library
$ sudo cp -p opsview-ansible-modules/module_utils/*.py /etc/ansible/module_utils
The files will end up owned by you, and unlike regular .py files, will not be marked as executable. That’s fine: Ansible will just include them transparently as needed.
Nine Ansible modules are (currently) provided. They let you log into a remote Opsview instance and give you very fine control of hosts, host groups, hashtags, netflows, and BSM components and service definitions, as well as the ability to trigger ‘Apply Changes’ (aka ‘reload’).
The modules (stored in /etc/ansible/library) are extensively documented in Ansible standard format, but this documentation is internal. To read the docs, use the ansible-doc command with the name of a module, e.g.:
$ ansible-doc opsview_monitoring_server
Docs specify all required and optional parameter names and parameter formats, and provide examples. They are well worth studying and maybe printing out for reference.
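If you don't yet know the module names, you can derive them from the files you copied, since Ansible names a module after its file, minus the .py suffix. Here is a rough sketch; the helper name list_opsview_modules is hypothetical, and it assumes the module files all start with opsview_, as the ones named in this post do.

```shell
# Hypothetical helper: print the module name for every opsview_*.py file in
# a library directory (module name = file name without the .py suffix).
list_opsview_modules() {
    dir="${1:-/etc/ansible/library}"   # directory used in this article
    for f in "$dir"/opsview_*.py; do
        [ -e "$f" ] || continue        # the glob matched nothing
        basename "$f" .py
    done
}
```

Piping the result into ansible-doc, for instance list_opsview_modules | while read -r m; do ansible-doc -s "$m"; done, should print a short playbook snippet for each module (-s is ansible-doc's snippet option), which makes a handy cheat sheet.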
The modules can be used several ways, but (currently, at least) I find it simplest to create roles that perform a series of monitoring tasks in proper order for a group of similarly configured hosts. This approach is ideal for configuring monitoring on scale-out environments, where you might deploy (for example) a failover cluster of load balancers, a tier of redundant webservers, and multiple database servers in a cluster. In this case, you might compose one role to configure monitoring for each host type (and a companion role to tear down monitoring; more on this in a moment). Additional roles could be added to create (and destroy) BSM models around the deployed system, for example.
These roles won’t typically be very complicated. For example, here’s an outline for a generic role that will happily configure monitoring for all kinds of host collections:
- Log into Opsview and obtain auth token
- Create a required host group (since these must exist before hosts can be placed in them). This step can be expanded to create additional, or nested host groups, if needed.
- Configure monitoring for the hosts
- Trigger Apply Changes - you’re done!
A companion ‘teardown’ role just changes the order of steps to remove hosts (because you can’t delete a host group without removing hosts it contains) before deleting host groups (in reverse order of creation, if multiple nested host groups are involved).
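On disk, each such role is just a directory with a tasks/main.yml file in it, which is the standard Ansible role layout. As a sketch only, the hypothetical helper below creates that skeleton for a named role; nothing in it is specific to Opsview, and the task file it writes is an empty placeholder for you to fill in.

```shell
# Hypothetical helper: create the minimal skeleton Ansible expects for a
# role, i.e. <roles_dir>/<role>/tasks/main.yml.
scaffold_role() {
    roles_dir="$1"
    role="$2"
    mkdir -p "$roles_dir/$role/tasks" || return 1
    main="$roles_dir/$role/tasks/main.yml"
    # Never overwrite a task file that already exists.
    [ -e "$main" ] || printf '%s\n' '---' "# tasks for $role go here" > "$main"
}
```

Calling scaffold_role roles monitor-everything, then scaffold_role roles unmonitor-everything, would give you a setup role and its teardown companion side by side. The first name is the one used for the sample role in this post; the second is an invented placeholder.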
Each role opens with an invocation of the opsview_login module, which takes user/password credentials, logs into your Opsview instance, and obtains a REST authentication token that can be used instead of your password on subsequent REST calls.
The code for the role below is kept simple for the sake of example, but the template can be expanded to any required level of complexity. The idea is to populate the role with variables set at the top level, giving you vast power to apply completely custom, intricate configurations to any host group at which you aim the role.
Here’s the sample role:
Example of a semi-generic "monitor pretty much anything in Opsview" role in Ansible, using the Opsview Ansible modules.
Note that each of the modules that create objects in Opsview includes a ‘state:’ variable, which is given a value of ‘present’ for object creation or updating, and can be given a value of ‘absent’ to permit object deletion. In some quick-and-dirty circumstances, you might want to create objects and then delete them, using the same role/task — simply changing the requested state to ‘absent.’ More commonly (and neatly) however, it makes sense to deliberately ‘unmake’ your monitoring configurations, using simpler tasks that identify the objects minimalistically and request their absence.
Here’s a top-level playbook that uses the above role (called ‘monitor-everything’) to apply simple monitoring to an arbitrarily-large Ansible host inventory.
Top-level play that uses the above generic "monitor-everything" role to monitor an arbitrarily large number of AWS-hosted VMs, identified by tags in the inventory auto-generated by AWS' ec2.py dynamic inventory script.
Names of host templates (Opspacks) are important to get right: capitalization and spaces are significant. The best way to get them correct is to CTRL-C copy them right out of Opsview's host monitoring configuration dialog box, where they’re all listed.
As you can see, again by way of example, this call provides some configuration detail beyond simply listing Opspacks to apply (to wit: it names some specific service checks to remove). The opsview_host module (and other modules) provide much, much more fine-grained control than even this: you can configure variables for hosts and provide authentication details, keys, and pointers to preconfigured Opsview variables (e.g., for ServiceNow credentials).
In coming weeks, I’ll be blogging more on Opsview Ansible Modules and other Opsview automation tooling. By the end of the series, you’ll know how to dovetail Opsview monitoring into fairly complicated infra-as-code setups, and build (and tear down) sophisticated monitoring setups, complete with BSM.