Elasticsearch Performance Testing Tips
We recently covered the importance of Elasticsearch monitoring tools and how they enhance Elasticsearch functionality. Once your Elasticsearch environment is set up and the proper plugins are in place, the next step is continuous performance testing, which makes the most of your monitoring efforts. Knowing which areas to test can be challenging, however, because Elasticsearch is such a large platform. Here are a few performance testing recommendations to help confirm that your configuration is efficient and that Elasticsearch is operating as expected.
Have a look at our Elasticsearch Opspack.
Control cluster growth
The ability to easily create indices and shards is one of the most appealing elements of Elasticsearch. However, cluster performance has to come first: too many indices or shards can overload the cluster and put the usability of Elasticsearch at risk. Maintaining good index and search performance requires paying close attention to the cluster state, which is exposed through an API and contains the mappings and settings of all indices along with shard routing information. The cluster state grows with every index, shard, and mapped field, so once you have enough accurate data to measure its size, you'll have better control over your Elasticsearch environment and enough insight to plan for future growth.
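As a rough illustration of tracking cluster growth, the Python sketch below summarizes a cluster state document. It assumes the JSON layout returned by `GET /_cluster/state` (a `metadata.indices` map and a `routing_table.indices.<name>.shards` map of shard copies). The helper names `fetch_json` and `summarize_cluster_state` are hypothetical, and the byte count is only an approximation based on re-serialized JSON.

```python
import json
import urllib.request


def fetch_json(base_url, path):
    """GET a JSON document from an Elasticsearch HTTP endpoint."""
    with urllib.request.urlopen(base_url.rstrip("/") + path, timeout=10) as resp:
        return json.load(resp)


def summarize_cluster_state(state):
    """Count indices and shard copies in a cluster state document.

    Assumes the layout of GET /_cluster/state; missing keys count as empty.
    """
    indices = state.get("metadata", {}).get("indices", {})
    routing = state.get("routing_table", {}).get("indices", {})
    # Each shard number maps to a list holding the primary and its replicas.
    shard_copies = sum(
        len(copies)
        for index in routing.values()
        for copies in index.get("shards", {}).values()
    )
    return {
        "indices": len(indices),
        "shard_copies": shard_copies,
        # Approximate size: the state re-serialized as compact JSON.
        "approx_state_bytes": len(json.dumps(state).encode("utf-8")),
    }
```

Against a local node, something like `summarize_cluster_state(fetch_json("http://localhost:9200", "/_cluster/state"))` could be recorded on a schedule to see how quickly the state grows. The URL is an assumption about your setup.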
Prevent swapping
Linux users should be quite familiar with swapping, the process of copying pages of memory out to disk to free up physical RAM. The major downside is that reading data back from disk takes far longer than reading it from memory, so every swap adds latency. In Elasticsearch, the memory-lock setting (bootstrap.mlockall in older releases, renamed bootstrap.memory_lock in 5.0) can be enabled so that nodes lock their memory and do not swap it out. This quick adjustment removes a common source of slowdowns, and by regularly testing memory performance you can confirm that Elasticsearch is not being held back by swapping or other preventable issues.
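One way to verify that the memory lock actually took effect is to inspect the node info API. The sketch below assumes the response shape of `GET /_nodes/process`, where each node reports `process.mlockall` as true or false. The function name `nodes_without_memory_lock` is hypothetical.

```python
def nodes_without_memory_lock(nodes_info):
    """Return the names of nodes whose process did not lock its memory.

    Expects the parsed JSON of GET /_nodes/process. A node that does not
    report the flag at all is treated as unlocked.
    """
    unlocked = []
    for node_id, node in nodes_info.get("nodes", {}).items():
        if not node.get("process", {}).get("mlockall", False):
            # Fall back to the node id if the name was filtered out.
            unlocked.append(node.get("name", node_id))
    return sorted(unlocked)
```

An empty list means every reporting node has its memory locked. Any name in the list is a node that can still be swapped out.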
Utilize doc values
Standard fields in Elasticsearch are common, but sorting or aggregating on them is costly because their fielddata first has to be loaded into the JVM heap. Doc values have an advantage over these ‘normal fields’ because they use an on-disk data structure that is built at document index time and served through the filesystem cache rather than the heap. They come at the cost of a larger index and slightly slower access than in-memory fielddata, but reduced heap pressure, faster garbage collections and faster initialization are well worth these barely noticeable shortcomings. Because doc values depend on the filesystem cache, test how much memory is left for it; the more cache space is available, the more likely doc values are to deliver a net gain in speed and an enhanced Elasticsearch experience.
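To check which fields would miss out on doc values, a mapping can be scanned for fields that disable them explicitly. The sketch below assumes a `properties` dictionary as found in an index mapping (for example from `GET /<index>/_mapping`), with nested object fields under their own `properties` key. The function name and the sample field names are hypothetical. Note that in releases from 2.0 onward doc values are on by default for most field types, so only an explicit `"doc_values": false` is reported.

```python
def fields_without_doc_values(properties, prefix=""):
    """List dotted field paths that explicitly set "doc_values": false."""
    disabled = []
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("doc_values") is False:
            disabled.append(path)
        # Object fields keep their children under "properties"; recurse.
        disabled += fields_without_doc_values(spec.get("properties", {}), path + ".")
    return sorted(disabled)


# Hypothetical mapping used only for illustration.
example_properties = {
    "timestamp": {"type": "date"},
    "status": {"type": "keyword", "doc_values": False},
    "user": {
        "properties": {
            "id": {"type": "keyword", "doc_values": False},
            "age": {"type": "integer"},
        }
    },
}
print(fields_without_doc_values(example_properties))
```

Fields reported here cannot be sorted or aggregated on from disk, so they are the first candidates to review before a performance test.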
Leverage Elasticsearch’s benchmarking tool
Benchmarking is crucial for catching lackluster system performance early. Elastic recognized the need for an Elasticsearch-specific benchmarking tool and released Rally for users who want to measure the impact of system changes during the development phase. Rally has several features, including allocation profiling and inspecting hot classes, that help reproduce and explain performance data. It is easy to run your first benchmark against Elasticsearch with Rally, so be sure to use it throughout the performance testing process.
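A first Rally run might be scripted along these lines. This is only a sketch: it assumes the `esrally race` subcommand and the `--track`, `--target-hosts`, `--pipeline=benchmark-only` and `--distribution-version` flags of recent Rally releases (older releases used a different command layout), and the helper name `build_rally_command` is hypothetical. The function only builds the argument list; it does not start a benchmark.

```python
def build_rally_command(track="geonames", target_hosts=None, distribution_version=None):
    """Build an argument list for a Rally benchmark run.

    Pass target_hosts to benchmark an existing cluster, or
    distribution_version to let Rally provision a cluster itself.
    """
    cmd = ["esrally", "race", f"--track={track}"]
    if target_hosts:
        # benchmark-only: Rally does not install or start Elasticsearch.
        cmd.append("--target-hosts=" + ",".join(target_hosts))
        cmd.append("--pipeline=benchmark-only")
    elif distribution_version:
        cmd.append(f"--distribution-version={distribution_version}")
    else:
        raise ValueError("provide target_hosts or distribution_version")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where Rally is installed. Point it at a test cluster rather than production, since a benchmark generates heavy indexing and query load.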