Opsview Monitor 5.3 And Using New Time Series Provider: InfluxDB
Time Series Providers In Opsview Monitor
As mentioned in our previous blog on "Processing Time Series in Opsview 5.2", a new architecture for processing performance metrics allows us to seamlessly swap from one provider to another. RRD file-based databases are the de facto industry standard: they are stable and highly configurable, and the project is actively maintained. However, on large systems the disk I/O required to keep the metrics up to date often exceeds the capabilities of the hardware. Since the release of Opsview Monitor 5.0, we have been looking for a sensible alternative that would meet the following requirements:
- Easy installation
- Ability to store all datapoints, without enforced downsampling
- Minimal resource usage for small systems
- HTTP-based API for writing and querying
- High performance and scalability features
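To illustrate the HTTP-based API requirement, here is a minimal sketch of how a datapoint could be prepared for InfluxDB 1.x, which accepts its text "line protocol" on the `/write` endpoint and InfluxQL on the `/query` endpoint. The measurement, tag and database names below (`load`, `host`, `opsview`) are hypothetical and are not the names Opsview Monitor uses internally; the sketch only builds the request line and URLs and sends nothing over the network.

```python
from urllib.parse import urlencode


def escape_key(text, chars=",= "):
    """Backslash-escape the characters that line protocol treats as separators."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value):
    """Render a field value: booleans, integers (suffix 'i'), floats, quoted strings."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, timestamp_s):
    """Build one line: measurement,tag=value field=value timestamp."""
    head = escape_key(measurement, ", ")  # measurements escape commas and spaces only
    tag_part = "".join(
        f",{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_key(k)}={format_field(v)}" for k, v in fields.items()
    )
    return f"{head}{tag_part} {field_part} {timestamp_s}"


def write_url(base, database):
    """URL to POST lines to; precision=s means timestamps are in seconds."""
    return f"{base}/write?" + urlencode({"db": database, "precision": "s"})


def query_url(base, database, influxql):
    """URL to GET query results from, with the InfluxQL statement URL-encoded."""
    return f"{base}/query?" + urlencode({"db": database, "q": influxql})


# Hypothetical datapoint for a host called "web 01"
line = to_line(
    "load",
    {"host": "web 01", "service": "Unix Load Average"},
    {"value": 0.42},
    1500000000,
)
```

The resulting `line` would be sent as the body of a POST request to `write_url("http://localhost:8086", "opsview")`, and the stored points could later be read back with a GET request to the URL returned by `query_url`.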
The projects that we considered at that time were OpenTSDB, ElasticSearch and InfluxDB. Of those, only InfluxDB met all the requirements. Or so we thought... Unfortunately, InfluxDB version 0.9, which had just been released at that time, failed to survive our internal benchmarks and stress tests. The project was put on hold, but we continued to track the progress InfluxData was making with its flagship product. So when version 1.2 was released last year, we were excited to resurrect the project.
At Opsview, we pride ourselves on providing not only an easy-to-use web interface but, more importantly, the flexibility required of an enterprise monitoring solution:
- Our dedicated Performance Graphs in Dashboards allow the combination of multiple metrics from different hosts
- The Graphs in the Investigate window for monitored services allow you to view how a metric has changed over time with a single click
- The dedicated Graph Center can quickly draw numerous graphs, using easy-to-use configuration wizards
In Opsview Monitor 5.3, all of these features have been updated to work with a new time series provider, InfluxDB. So rather than installing separate software for drawing metrics (which would require creating queries manually), all metrics can be viewed without leaving the Opsview Monitor UI - this tight integration is enabled by the Opsview Monitor TimeSeries architecture.
If you have not yet used these new features of Opsview Monitor 5.3, please get in touch with our Customer Success team, who would be happy to help you discover increased value from your Opsview Monitor system.
System Layout For High Availability
One of the most important features of the time series architecture, which you may recall from our previous blog, is its ability to move the processing of massive amounts of data onto dedicated servers. This is even more important with InfluxDB, for which hardware requirements grow with the number of time series stored. Additionally, our built-in sharding allows you to redistribute the data between multiple servers, keeping both CPU and memory usage within reasonable limits.
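To make the idea of sharding more concrete, the sketch below shows one common way of assigning each time series to a node: hash a stable key and take the result modulo the number of nodes. This is an illustration only and is not a description of the algorithm Opsview Monitor uses; the key format and the node names are assumptions made for the example.

```python
import hashlib


def node_for_series(host, service, metric, nodes):
    """Pick a node for a series by hashing its identity (illustrative scheme only)."""
    key = f"{host}::{service}::{metric}".encode("utf-8")
    digest = hashlib.sha1(key).digest()
    # Use the first 8 bytes of the digest as an integer, then map it onto a node.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical time series nodes
NODES = ["timeseries-1", "timeseries-2", "timeseries-3"]
chosen = node_for_series("web01", "Unix Load Average", "load1", NODES)
```

Because the key is hashed rather than assigned in arrival order, the same series always lands on the same node, and a large set of series spreads roughly evenly. Note that with a plain modulo scheme like this one, changing the number of nodes reassigns most series, which is why redistributing data is a deliberate operation.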
So if you manage a large system and RRDs are causing you a lot of headaches due to slow performance or losing data resolution over time, then the recommended approach would be to replace RRDs on your distributed system with InfluxDB on each time series node.
As each node holds its own subset of data, it is highly recommended to keep regular backups of the system. While with RRDs this is as simple as an rsync, InfluxDB presents more challenges - please refer to the InfluxDB documentation page on Backup & Restore. You might also be interested in using Relay to keep a copy of a node's data on a separate system.
Migrate Or Not Migrate?
We will continue to use RRD as the default time series provider because, on the majority of systems, spreading the load across multiple servers will resolve any performance issues with RRD processing. However, if real-time tracking is important to you (and Opsview Monitor fully supports sub-minute check intervals), then InfluxDB could be just what you need. As part of Opsview Monitor 5.3, we have provided an easy-to-follow data migration guide in our Knowledge Center.
Last but not least is our new Opspack: Application - InfluxDB, a set of tailored service checks that allow you to monitor your InfluxDB servers using a highly flexible monitoring plugin. The key check monitors the number of series in the database, as this is what drives CPU and memory usage.
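As a rough illustration of what a series-count check involves, the sketch below parses the JSON returned by the InfluxDB 1.x `/query` endpoint and maps the result onto the usual plugin statuses (0 OK, 1 WARNING, 2 CRITICAL). It is not the Opspack's plugin: the query against the `_internal` statistics database, the thresholds and the sample response are assumptions chosen for the example.

```python
import json
from urllib.parse import urlencode

# Assumed query: InfluxDB 1.x records per-database statistics in "_internal".
SERIES_QUERY = (
    'SELECT last("numSeries") FROM "_internal".."database" GROUP BY "database"'
)


def series_query_url(base):
    """URL that would return the latest series count for each database."""
    return f"{base}/query?" + urlencode({"q": SERIES_QUERY})


def series_per_database(response_text):
    """Extract {database: series_count} from a /query JSON response."""
    payload = json.loads(response_text)
    counts = {}
    for result in payload.get("results", []):
        for series in result.get("series", []):
            database = series.get("tags", {}).get("database")
            column = series["columns"].index("last")
            counts[database] = series["values"][0][column]
    return counts


def check_series(counts, warning, critical):
    """Return (exit_code, message) based on the busiest database."""
    worst = max(counts.values(), default=0)
    if worst >= critical:
        return 2, f"CRITICAL - {worst} series"
    if worst >= warning:
        return 1, f"WARNING - {worst} series"
    return 0, f"OK - {worst} series"


# Hand-written sample response in the shape the /query endpoint returns
SAMPLE = json.dumps({
    "results": [{
        "statement_id": 0,
        "series": [{
            "name": "database",
            "tags": {"database": "opsview"},
            "columns": ["time", "last"],
            "values": [["2017-08-01T00:00:00Z", 120000]],
        }],
    }]
})

status, message = check_series(series_per_database(SAMPLE), 100000, 500000)
```

With the hypothetical thresholds above, the sample response would produce a WARNING, since 120000 series is over the warning level but under the critical one.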
Notes from Author
The Time Series architecture and the InfluxDB provider are both open source, which makes it easy to add support for other technologies - if there is a particular one you are using, please take a look at my GitHub repository as an example of how to implement a provider.