A DevOp's Story: How to Monitor Linux Hosts Across AWS, Azure and GCP
Whether you prefer Ubuntu, CentOS or RHEL, monitoring the performance of your Linux hosts in the public cloud is not as straightforward as it might seem. Each cloud provider has its own monitoring system that varies considerably, and many leave out the more detailed kernel-level statistics that you need. Additionally, each cloud provider offers a different API layer and set of available metrics, making homogeneous data access problematic at best. I find that a host-level, cloud-agnostic monitoring solution is essential to ensure that I'm getting the most out of my virtual machines and to meet service level agreements.
Amazon's CloudWatch will tell you the CPU load on your machines and alert you when one fails a status check, but it provides no insight into RAM usage, individual process resource consumption, or disk space usage. The performance reporting tools in Microsoft Azure are great for Windows instances, but the metrics available for Linux hosts are considerably less detailed. A cloud-agnostic monitoring solution like Opsview picks up where these solutions leave off, giving you the more detailed process- and kernel-level information that you need to ensure you're using your servers efficiently.
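For instance, the kernel-level numbers that CloudWatch omits are all available directly on the host with standard Linux tools (a sketch, assuming a typical distribution with the procps utilities installed):

```shell
# RAM usage in megabytes -- CloudWatch's default metrics do not report this
free -m

# Disk space used and available per mounted filesystem
df -h

# Top five processes by resident memory, highest first
ps aux --sort=-rss | head -n 6
```

Running these by hand on one box is trivial; collecting, storing, and alerting on them across a fleet is exactly the tedium a monitoring agent removes.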
If a server is being overwhelmed with traffic, the simple solution is to spin up another copy of it behind the load balancer. In a pinch, at 2 AM, it's a reasonable fix, and, like most DevOps specialists, I've been in that situation before. Multiple sites were hot-linking a dynamically generated graphic on a customer's site, costing them tens of thousands of dollars in extra bandwidth and server costs. By analyzing historical network traffic patterns, web server requests, and process traces, I was able to pinpoint the cause. You can't conduct this kind of analysis with the tools provided by the cloud vendors, and performing it by hand on multiple web servers is tedious at best.
Adding more capacity in the short term seems like a decent solution, but spending money on extra instances and unneeded hardware can quickly get expensive. Wouldn't it be nice to know which processes are consuming the most CPU and RAM, or what network hosts are making the most connections? Blocking abusive hosts and fixing misbehaving scripts is a lot less costly than blindly throwing hardware at the problem.
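As a rough illustration of that kind of triage, both questions can be answered at the shell (a sketch, assuming procps, iproute2's `ss`, and `iptables` are available; the blocked IP is a documentation placeholder, not a real offender):

```shell
# Top five processes by CPU usage, highest first
ps aux --sort=-%cpu | head -n 6

# Count established TCP connections per remote address, busiest first
# (with ss -tn, the peer address:port is the fifth column)
ss -tn | awk 'NR>1 {split($5, a, ":"); print a[1]}' | sort | uniq -c | sort -rn | head

# Once an abusive host is identified, blocking it is one line:
# iptables -A INPUT -s 203.0.113.10 -j DROP
```

The counting pipeline is the interesting part: `uniq -c` only collapses adjacent duplicates, so the first `sort` is required before it, and the second `sort -rn` orders the counts numerically.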
Many organizations use a mix of cloud and on-premises IT infrastructure, sometimes known as 'hybrid cloud'. You could build your own monitoring system with thousands of hours of custom programming, both up front and in maintenance over the years as vendor APIs change, but this approach is expensive and pulls valuable development staff away from revenue-generating work. In smaller start-up shops the development hours are often simply not available for custom solutions, yet your customers still demand constant availability and performance.
This is why a cloud-agnostic, on-premises monitoring solution like Opsview is ideal. You get one standard interface giving deep, unified insight into a wide variety of data sets that you control, without having to constantly monitor and adapt to vendor API changelogs. You can take advantage of the strengths of each cloud provider, knowing that your monitoring solution will adapt to meet your technical and business needs.
Get started monitoring your Linux hosts using our free trial.