Free Monitoring Solutions: Limited Insight

In our previous blog post, we discussed some of the challenges free monitoring solutions present in basic implementation. Lack of CMDB and ITOM integration can slow you down by preventing easy ingestion of pre-existing, authoritative infrastructure configuration data. Lack of automation and unknown plugin quality can add further friction -- requiring more coding, more expertise, more care in solution validation, and more manual steps before you can feel confident that basic monitoring is working well.

Assuming your technical prowess is such that you can help your organization vault these barriers, you now have visibility into the health of individual hosts, VMs, and application instances. But modern enterprise IT systems and applications -- particularly the mission-critical kind -- don’t usually live on single hosts. Instead, they achieve resilience through clustering: distributing hosts across multiple fault domains and using clever networked software to enable failover. They achieve performance by load balancing and parallelization: scaling out monolithic application servers or individual microservices across available infrastructure, sometimes even doing so dynamically. And they share dependencies: for example, many enterprise applications may share a single authentication mechanism, access a centralized enterprise database cluster, or run on a single cloud or container orchestration framework, like OpenStack or Kubernetes.

Free monitoring solutions often can’t model resilient clusters and complex business services with many interdependent elements. Part of the purpose of distributed application architectures is that they avoid single points of failure. Your enterprise CRM solution doesn’t keel over when one web server in a multi-server, load-balanced tier goes south, or when one server in a database cluster dies.

Free monitoring software tends to focus on individual components, and may not be easily configurable to model system-level resiliency and service interdependence, or to estimate the performance impacts of partial cluster failures. That leaves you choosing between two unsatisfactory options. You can configure the monitoring platform to alert on the failure of any individual cluster component, risking over-alerting, operator exhaustion, and the associated costs and risks. Or you can suppress alerts on clustered elements and trust that the clustering technology will keep critical services available 100% of the time. Do you feel lucky?
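
To make the gap concrete, here is a minimal sketch of what alerting on a tier as a whole, rather than on each member, might look like. It assumes a hypothetical policy expressed as minimum healthy member counts; the names `ClusterPolicy` and `cluster_status` are illustrative only and are not taken from any particular monitoring product.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OK = "OK"
    WARNING = "WARNING"
    CRITICAL = "CRITICAL"


@dataclass(frozen=True)
class ClusterPolicy:
    """Hypothetical policy: how many healthy members a tier needs."""
    min_healthy_ok: int        # at or above this count, the tier is considered healthy
    min_healthy_degraded: int  # at or above this count, the tier is degraded but serving


def cluster_status(member_up: dict[str, bool], policy: ClusterPolicy) -> Status:
    """Roll individual member states up into one tier-level status."""
    healthy = sum(1 for up in member_up.values() if up)
    if healthy >= policy.min_healthy_ok:
        return Status.OK
    if healthy >= policy.min_healthy_degraded:
        return Status.WARNING
    return Status.CRITICAL


# Example: a four-server web tier that tolerates one failure silently,
# warns on two, and pages someone only when fewer than two servers remain.
web_policy = ClusterPolicy(min_healthy_ok=3, min_healthy_degraded=2)
web_tier = {"web1": True, "web2": False, "web3": True, "web4": True}
print(cluster_status(web_tier, web_policy).value)
```

With a rollup like this, a single dead web server is recorded but does not wake anyone up, while a tier that is running out of headroom still does. The thresholds themselves are assumptions that would need to reflect real capacity planning.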

Free monitoring platforms often can’t be configured to estimate performance impacts of infrastructure-level failures on business service availability. Though distributed applications are designed to minimize single points of failure, any infrastructure-level issue will typically have performance impacts. Sometimes those will be enough to affect service levels in meaningful ways. But free monitoring platforms are seldom configurable to measure these impacts, determine whether SLAs or SLOs are in jeopardy, or provide an unambiguous record of success or failure in meeting service-level obligations over particular spans of time. That makes it harder to assert compliance or limit the reputational or financial impacts of availability issues, particularly if these don’t result in applications being completely down.
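
As an illustration of the kind of record-keeping described above, the following sketch computes availability against an SLO target and the share of the error budget still unspent. It assumes service health is sampled at regular intervals and that each interval is classified as good or bad by some criterion you choose (for example, a latency threshold, so that slow-but-up periods count against the budget). The function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    availability: float      # fraction of good intervals, 0.0 to 1.0
    budget_remaining: float  # fraction of error budget left; negative means overspent
    met: bool                # True if availability is at or above the target


def slo_report(good: int, total: int, target: float) -> SloReport:
    """Summarize SLO compliance for one reporting window.

    good   -- number of sampled intervals that met the service criterion
    total  -- number of sampled intervals in the window
    target -- SLO target as a fraction, e.g. 0.99 for 99%
    """
    if total <= 0 or not 0 <= good <= total:
        raise ValueError("need 0 <= good <= total and total > 0")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")

    availability = good / total
    allowed_bad = (1.0 - target) * total  # bad intervals the SLO permits
    actual_bad = total - good
    budget_remaining = 1.0 - actual_bad / allowed_bad
    return SloReport(availability, budget_remaining, availability >= target)


# Example: 1,000 samples, 5 of them bad, against a 99% target.
report = slo_report(good=995, total=1000, target=0.99)
print(f"availability={report.availability:.3%} "
      f"budget_remaining={report.budget_remaining:.1%} met={report.met}")
```

Counting degraded intervals as bad, rather than only total outages, is the design choice that matters here: it is what lets the report capture availability issues that never take the application completely down.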

Read Ten Challenges with Free Monitoring Solutions

by John Jainschigg,
Technical Content Marketing Manager
John is an open cloud computing and infrastructure-as-code/DevOps advocate. Before joining Opsview, John was Technical Marketing Engineer at OpenStack solutions provider, Mirantis. John lives in New York City with his family, a pariah dog named Lenny, and several cats. In his free time, John enjoys making kimchi, sauerkraut, pickles, and other fermented foods, and riding around town on a self-balancing electric unicycle.
