A DevOp's Story: How to Monitor Linux Hosts Across AWS, Azure and GCP

Whether you prefer Ubuntu, CentOS, or RHEL, monitoring the performance of your Linux hosts in the public cloud is not as straightforward as it might seem. Each cloud provider has its own monitoring system, and these systems vary considerably; many leave out the detailed kernel-level statistics that you need. Each provider also offers a different API layer and a different set of available metrics, making homogeneous data access problematic at best. I find that a host-level, cloud-agnostic monitoring solution is essential to ensure that I'm getting the most out of my virtual machines and meeting service level agreements.

Amazon's CloudWatch will tell you the CPU utilization of your machines and alert you when one fails a status check, but out of the box it provides no insight into RAM usage, per-process resource consumption, or disk space usage. The performance reporting tools in Microsoft Azure are great for Windows instances, but the metrics available for Linux hosts are considerably less detailed. A cloud-agnostic monitoring solution like Opsview picks up where these tools leave off, giving you the detailed process- and kernel-level information that you need to ensure you're using your servers efficiently.
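To make the gap concrete, here is a minimal Python sketch of the kind of host-level numbers an agent can collect directly from the machine. It is an illustration only, not how Opsview gathers its data: the function names are hypothetical, and it assumes a Linux kernel recent enough (3.14 or later) to report `MemAvailable` in `/proc/meminfo`.

```python
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; a few (such as HugePages_Total)
    are plain counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_used_percent(meminfo):
    """Percentage of RAM in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def disk_used_percent(path="/"):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def read_host_memory(meminfo_path="/proc/meminfo"):
    """Read and parse the live memory statistics of this host."""
    with open(meminfo_path) as handle:
        return parse_meminfo(handle.read())
```

On a Linux host, `memory_used_percent(read_host_memory())` and `disk_used_percent("/")` give the two figures that the default cloud metrics leave out. A real monitoring agent would sample these on a schedule and ship them to a central server.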

CloudWatch Monitoring Details

If a server is being overwhelmed with traffic, the simple solution is to spin up another copy of it behind the load balancer. In a pinch, that's a good solution at 2 AM and, like most DevOps specialists, I've been in this situation before. In one case, multiple sites were hot-linking a dynamically created graphic on a customer’s site, costing them tens of thousands of dollars in extra bandwidth and server costs. By analyzing historical network traffic patterns, web server requests, and process traces, I was able to pinpoint the source of that cost. You can't conduct this kind of analysis with the tools that cloud providers supply, and performing it by hand on multiple web servers is tedious at best.
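The web server request part of that analysis can be sketched in a few lines of Python. This is not the tooling used in the incident described above; it is a hypothetical example that assumes access logs in the common Apache/nginx "combined" format and totals the bytes served for a given path prefix, grouped by the external site named in the `Referer` header. The function name and its parameters are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "[^"]*"'
)


def hotlink_bytes_by_referer(lines, own_host, path_prefix):
    """Total bytes served under `path_prefix`, keyed by external referring host."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or not match["path"].startswith(path_prefix):
            continue
        referer_host = urlsplit(match["referer"]).hostname
        # Skip requests with no referer and requests from our own pages.
        if referer_host is None or referer_host == own_host:
            continue
        size = match["size"]
        totals[referer_host] += 0 if size == "-" else int(size)
    return totals
```

Calling `hotlink_bytes_by_referer(open("access.log"), "www.example.com", "/img/").most_common(10)` would list the ten external sites responsible for the most image bandwidth; the log path and host name here are placeholders.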

Adding more capacity in the short term seems like a decent solution, but spending money on extra instances and unneeded hardware can quickly get expensive. Wouldn't it be nice to know which processes are consuming the most CPU and RAM, or which network hosts are making the most connections? Blocking abusive hosts and fixing misbehaving scripts is a lot less costly than blindly throwing hardware at the problem.
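As a rough illustration of the "which hosts are making the most connections" question, the sketch below counts established TCP connections per remote address. It assumes text in the column layout printed by `ss -tn` (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port); the function name is hypothetical, and a monitoring product would gather this differently.

```python
from collections import Counter


def connections_per_peer(ss_output):
    """Count ESTAB connections per peer address in `ss -tn` style output."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row and any connection that is not established.
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue
        # The peer column is "address:port"; IPv6 addresses are bracketed.
        peer_host, _, _port = fields[4].rpartition(":")
        counts[peer_host.strip("[]")] += 1
    return counts
```

On a live host, the input could come from `subprocess.run(["ss", "-tn"], capture_output=True, text=True).stdout`, and `connections_per_peer(...).most_common(5)` would then show the five busiest peers, which are the first candidates to inspect before blocking anything.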

Linux Host Check

Many organizations use a mix of cloud and on-premises IT infrastructure, sometimes known as 'hybrid cloud'. You could create your own monitoring system with thousands of hours of custom programming, both up front and through years of maintenance as vendor APIs change, but this approach is expensive and pulls valuable development staff away from revenue-generating operations. In smaller start-up shops, the development hours are often not available for custom solutions, yet your customers demand constant availability and performance.

This is why a cloud-agnostic, on-premises monitoring solution like Opsview is ideal. You get one standard interface that gives deep, unified insight into a wide variety of data sets that you control, without having to constantly track vendor API changelogs and adapt to them. You can take advantage of the strengths of each cloud provider, knowing that your monitoring solution will adapt to meet your technical and business needs.

Get started monitoring your Linux hosts using our free trial

Get unified insight into your IT operations with Opsview Monitor

By Robert Oliver, Technical Writer
Robert is a technical writer and an expert Linux server admin, and has been using Amazon Web Services since their early beta period.
