Provisioning Enough IOPS for Your Opsview Environment
Whether you're an enterprise with 2,500 hosts and 10,000 service checks or a Pro customer with only 100 hosts, determining the number of IOPS needed to have a seamless experience with Opsview is a question that comes up often. Hopefully after reading this, you can walk away educated on the options available and know how to calculate the IOPS needed for your specific environment.
IOPS on the Master Server
Most environments are distributed; you'll have the Master Server, Slave Servers, and a Database Server. In this blog, we'll be focusing exclusively on the Master Server.
The theoretical IOPS needed on the Master can be calculated with a fairly simple equation. It is based on the number of hosts and service checks in your environment, so be sure to have those numbers handy.
Below are all of the operations for each service check that runs:
1. Metadata write (checkresult file create)
2. Data write (checkresult data)
3. Metadata read (checkresult ctime check)
4. Data read (checkresult data)
5. Metadata write (checkresult atime update; only if the filesystem is not mounted with noatime)
6. Metadata write (checkresult .ok file)
7. Metadata write (NDO file creation)
8. Data write (NDO data)
9. Metadata read (NDO open for DB update)
10. Data read (NDO data read for DB update)
11. Metadata write (checkresult deleted)
12. Metadata write (checkresult .ok deleted)
13. Metadata write (NDO log size update)
14. Data write (NDO log entry)
15. Metadata write (NDO file delete)
This gives us 11 write operations and 4 read operations (10 writes if the filesystem is mounted with noatime), roughly a 1:3 read:write ratio. Having disks organized in a configuration that optimizes writes is preferred. RAID configurations of 0 (no fault tolerance), 5, 6, 10, and 60 are ideal. When designing for larger systems, aim for RAID 10 for the best performance and RAID 60 for the best balance of performance and storage capacity. For disk sizing requirements, stay tuned for our upcoming blog on Sizing Opsview!
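As a quick sanity check, the list above can be tallied by operation type. This is only a sketch: the read/write labels are copied from the list, and the `noatime` parameter is an illustrative name for skipping the atime update in step 5.

```python
from collections import Counter

# The 15 per-check operations from the list above, as (step, kind) pairs.
OPERATIONS = [
    (1, "write"),   # checkresult file create (metadata)
    (2, "write"),   # checkresult data
    (3, "read"),    # checkresult ctime check (metadata)
    (4, "read"),    # checkresult data
    (5, "write"),   # checkresult atime update (skipped with noatime)
    (6, "write"),   # checkresult .ok file (metadata)
    (7, "write"),   # NDO file creation (metadata)
    (8, "write"),   # NDO data
    (9, "read"),    # NDO open for DB update (metadata)
    (10, "read"),   # NDO data read for DB update
    (11, "write"),  # checkresult deleted (metadata)
    (12, "write"),  # checkresult .ok deleted (metadata)
    (13, "write"),  # NDO log size update (metadata)
    (14, "write"),  # NDO log entry
    (15, "write"),  # NDO file delete (metadata)
]


def tally(noatime: bool = False) -> Counter:
    """Count reads and writes, dropping step 5 when noatime is set."""
    return Counter(
        kind for step, kind in OPERATIONS if not (noatime and step == 5)
    )


counts = tally()
print(f"reads={counts['read']} writes={counts['write']}")
```

With the atime update included, the tally is 4 reads and 11 writes; mounting with noatime removes one write per check.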
Now that we know the transactions for each check, we can use the following equation, where Check-Interval is in seconds:
((Hosts + Service-Checks)*15)/Check-Interval = IO/Sec
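The equation translates directly into a small helper. This is a minimal sketch: the function and parameter names are made up for illustration, and the 300-second default interval is an assumption, chosen because it reproduces the 35,000 checks to 1,750 IOPS figure quoted in this post.

```python
def required_iops(hosts: int, service_checks: int,
                  check_interval_s: float = 300.0,
                  ops_per_check: int = 15) -> float:
    """Theoretical Master Server IOPS: ((hosts + checks) * 15) / interval."""
    if check_interval_s <= 0:
        raise ValueError("check_interval_s must be positive")
    return (hosts + service_checks) * ops_per_check / check_interval_s


# 100 hosts with an assumed 5 service checks each.
print(required_iops(hosts=100, service_checks=500))

# 2,500 hosts with 10,000 service checks.
print(required_iops(hosts=2500, service_checks=10000))
```

Shortening the check interval raises the requirement proportionally, so halving the interval doubles the IOPS needed.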
For a quick example (assuming 5 checks per host), we can see what the IOPS requirements would be like for some differently sized systems that are represented in the chart below:
For a system that has a total of around 35,000 checks (hosts plus service checks) on a 5-minute check interval, you can expect to require around 1,750 IOPS. Using some generic performance specs for storage, we can get an idea of how many drives of a given type we'd need, assuming we can have them in a RAID 10 configuration:
|Device||Type||IOPS||Interface|
|7,200 rpm SATA drives||HDD||~75-100 IOPS||SATA 3 Gbit/s|
|10,000 rpm SAS drives||HDD||~140 IOPS||SAS|
|15,000 rpm SAS drives||HDD||~175-210 IOPS||SAS|
Table 1: Source: https://en.wikipedia.org/wiki/IOPS
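The usable IOPS of a RAID 10 array is calculated simply as (Disks*IOPS)/2. On a 5-minute (300-second) check interval, 10,000 total checks work out to (10,000*15)/300 = 500 IOPS, so using the upper end of each range above we would need roughly: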
|Drive type||Disks needed|
|7,200 RPM SATA||10|
|10,000 RPM SAS||7|
|15,000 RPM SAS||5|
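The disk counts can be reproduced by solving (Disks*IOPS)/2 for Disks. The sketch below is an illustration with hypothetical names, assuming the upper-end per-disk figures from Table 1 (100, 140, and 210 IOPS). Unlike the rounded figures above, it rounds up to the next even number, since RAID 10 mirrors disks in pairs, so it comes out slightly more conservative for the SAS drives.

```python
import math


def raid10_disks(required_iops: float, iops_per_disk: float) -> int:
    """Smallest even disk count where (disks * iops_per_disk) / 2 >= required."""
    if iops_per_disk <= 0:
        raise ValueError("iops_per_disk must be positive")
    disks = math.ceil(2 * required_iops / iops_per_disk)
    # RAID 10 is built from mirrored pairs, so round odd counts up.
    if disks % 2:
        disks += 1
    return disks


# 10,000 total checks on a 300-second interval -> 500 IOPS required.
required = 10_000 * 15 / 300

# Assumed per-disk IOPS: upper end of each range in Table 1.
for name, per_disk in [("7,200 RPM SATA", 100),
                       ("10,000 RPM SAS", 140),
                       ("15,000 RPM SAS", 210)]:
    raw = 2 * required / per_disk
    print(f"{name}: raw {raw:.2f}, provision {raid10_disks(required, per_disk)}")
```

The raw values (10.00, 7.14, and 4.76) line up with the table; the provisioned counts are what you would actually buy under the even-pair assumption.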
If you can provision your system with even inexpensive SSDs, you'll find performance greatly increased, and you'll also have additional room for growth with minimal changes. Many consumer-grade SSDs can give you 3,000 IOPS on a single disk, so this can be an excellent choice for smaller companies looking to expand down the road. For enterprise users, RAID configurations are still ideal as they offer some redundancy in the event of a disk failure.
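To get a feel for that headroom, the sizing equation can be inverted to estimate how many total checks (hosts plus service checks) a given IOPS budget supports. This is a rough sketch with hypothetical names, assuming the same 15 operations per check and a 300-second check interval; real-world throughput will depend on the workload and the drive.

```python
def max_total_checks(available_iops: float,
                     check_interval_s: float = 300.0,
                     ops_per_check: int = 15) -> int:
    """Invert the sizing equation: checks = IOPS * interval / ops_per_check."""
    if available_iops < 0 or check_interval_s <= 0:
        raise ValueError("IOPS must be >= 0 and the interval must be positive")
    return int(available_iops * check_interval_s // ops_per_check)


# A single SSD delivering 3,000 IOPS, as in the example above.
print(max_total_checks(3000))
```

Under these assumptions, a single 3,000 IOPS SSD covers the 35,000-check example with room to spare.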
I hope you've found this information useful in helping size your Opsview system! Again, stay tuned for more details on sizing your Database and Slave systems. If you're an existing customer, the Customer Success team will be happy to answer any questions you may have.