A Breakdown Of Our Python Project Structure
One of the development challenges when building a microservices architecture is managing the project code structure for a variety of interconnected services. With a language such as Python, it is not obvious how large projects should achieve this, especially since many Python projects have a single root package.
At Opsview, we have created a structure that provides a "place for everything" and the flexibility to work efficiently both for development and production builds.
Some of the criteria are as follows:
- Multiple services, provided by multiple packages
- Namespaced structure
- Implementation patterns
- Resource abstraction
- Virtual environments
- Flexible YAML configuration
Multiple services, provided by multiple packages
This is the essence of a flexible, microservices installation. Using the concept of a service per package, we can build and deploy a system of any size. Packages allow dependency resolution between local services. Meta-packages define units or roles for servers (e.g. slave collector) and, combined with a deployment tool such as Chef, Puppet or Ansible, they let us create our monitoring infrastructure at scale.
Namespaced structure
With a large Python codebase, careful namespacing is crucial to keeping the code organised. The implementation is fairly straightforward in Python using packages, libraries, modules and classes (even if the actual nomenclature isn't). However, Python also has the ability to merge packages together, which means that branches within a namespace hierarchy can be provided by different components when they are imported. We make use of this feature to provide a shared set of modules for use within each service, all of which fit into the common namespace tree.
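As a rough illustration of package merging, the sketch below uses Python 3 implicit namespace packages (PEP 420). It is not our actual layout: the namespace name `demo_ns`, the `common` and `collector` sub-packages and the temporary directories are all hypothetical, and older codebases may rely on `pkgutil.extend_path` instead.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: two separately installed components that both
# contribute to one top-level namespace called "demo_ns".
#   <shared_root>/demo_ns/common/__init__.py
#   <service_root>/demo_ns/collector/__init__.py
# Neither "demo_ns" directory contains an __init__.py, so Python 3 treats
# it as an implicit namespace package and merges the two directories.


def write_component(root, subpackage, source):
    """Create <root>/demo_ns/<subpackage>/__init__.py with the given source."""
    package_dir = Path(root) / "demo_ns" / subpackage
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(source)


shared_root = tempfile.mkdtemp()
service_root = tempfile.mkdtemp()

# The shared component provides a helper used by every service.
write_component(
    shared_root, "common",
    "def greet():\n    return 'hello from common'\n",
)
# The service component imports the helper through the common namespace.
write_component(
    service_root, "collector",
    "from demo_ns.common import greet\n\ndef run():\n    return greet()\n",
)

# Make both install locations importable, as two installed packages would be.
sys.path[:0] = [shared_root, service_root]
importlib.invalidate_caches()

collector = importlib.import_module("demo_ns.collector")
namespace = importlib.import_module("demo_ns")

print(collector.run())
print(list(namespace.__path__))  # one entry per contributing directory
```

The service code only ever refers to `demo_ns.common`, so it does not need to know which installed component supplied that branch of the tree.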
Implementation patterns
Where possible, we attempt to identify common patterns and then provide a common abstraction that can be used within individual services. For example, we may want to have a managed pool of auto-restarting child processes for isolation and resilience purposes. This pattern can be captured in code through a simple interface and included in a shared library of modules.
The end result is a number of services that share a similar structure and contain minimal boilerplate code. This improves the readability and maintainability of each service.
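To give a flavour of what such a shared abstraction might look like, here is a minimal sketch of a pool of auto-restarting child processes built on the standard `subprocess` module. The class name `ManagedPool` and its methods are hypothetical and not taken from our codebase; a production version would also need logging, back-off and signal handling.

```python
import subprocess
import sys


class ManagedPool:
    """Keep a fixed number of child processes alive (illustrative only)."""

    def __init__(self, command, size):
        self.command = list(command)  # argv used to launch each child
        self.size = size
        self.children = []
        self.restarts = 0

    def start(self):
        """Launch the initial set of children."""
        self.children = [subprocess.Popen(self.command) for _ in range(self.size)]

    def check(self):
        """Replace every child that has exited; return how many were restarted."""
        restarted = 0
        for index, child in enumerate(self.children):
            if child.poll() is not None:  # a return code means the child exited
                self.children[index] = subprocess.Popen(self.command)
                restarted += 1
        self.restarts += restarted
        return restarted

    def stop(self):
        """Terminate any children still running and reap all of them."""
        for child in self.children:
            if child.poll() is None:
                child.terminate()
        for child in self.children:
            child.wait()


# Usage: children that exit immediately are replaced on the next check.
pool = ManagedPool([sys.executable, "-c", "pass"], size=2)
pool.start()
for child in pool.children:
    child.wait()  # let the short-lived children finish
print("restarted:", pool.check())
pool.stop()
```

A service would typically call `check()` from its main loop or a timer, so the restart policy lives in one shared place rather than in every service.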
Resource abstraction
Python has a very large ecosystem when it comes to third-party modules and a good slice of time can be spent choosing the best module to interface with a particular external resource. Sometimes, later down the line, a problem arises which means that the chosen module is no longer suitable and another has to be used in its place.
Additionally, we might want to switch out the resource provider itself to a different product (e.g. changing one type of NoSQL database to another). To cope with this, we use the abstract-factory software pattern to provide an abstract view of each resource, allowing us to switch module or resource purely via configuration.
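The idea can be sketched as a configuration-driven factory behind an abstract interface, which is a simplified relative of the full abstract-factory pattern. Everything named here (`KeyValueStore`, `InMemoryStore`, `PrefixedStore`, `create_store`, the `provider` key) is hypothetical and chosen purely for illustration; real providers would wrap actual database client modules.

```python
from abc import ABC, abstractmethod


class KeyValueStore(ABC):
    """Abstract view of a key-value resource, as seen by service code."""

    @abstractmethod
    def put(self, key, value):
        """Store a value under the given key."""

    @abstractmethod
    def get(self, key):
        """Return the stored value, or None if the key is missing."""


class InMemoryStore(KeyValueStore):
    """Hypothetical provider backed by a plain dictionary."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class PrefixedStore(KeyValueStore):
    """Second hypothetical provider that namespaces every key with a prefix."""

    def __init__(self, prefix=""):
        self._prefix = prefix
        self._data = {}

    def put(self, key, value):
        self._data[self._prefix + key] = value

    def get(self, key):
        return self._data.get(self._prefix + key)


# Registry mapping configuration names to concrete providers.
PROVIDERS = {"memory": InMemoryStore, "prefixed": PrefixedStore}


def create_store(config):
    """Build the store named by config['provider'], passing other keys as options."""
    settings = dict(config)
    provider = settings.pop("provider")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown store provider: {provider!r}")
    return PROVIDERS[provider](**settings)


# Service code depends only on KeyValueStore; the provider comes from config.
store = create_store({"provider": "prefixed", "prefix": "svc:"})
store.put("status", "ok")
print(store.get("status"))
```

Swapping the provider then means changing one configuration value, with no edits to the service code that calls `put` and `get`.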
Virtual environments
Although in theory each service may have a different set of dependencies, in practice, most of the dependencies are common across services, which means we can test and ship a Python package with a pre-configured fixed set of third-party modules.
However, with a large number of services, it is possible that a feature, performance or bug-fix update may require a newer version of a specific module. Therefore, we would also like the flexibility to ship an update to one service without affecting other services (until such time as the updated module can be fully tested across the board).
One of the ways to achieve this flexibility in Python is to use virtual environments. At Opsview, we use these both for development and production deployments. This means that we can ship the main Python package with the set of well-tested modules, but then update specific ones on a per-service basis, if required.
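One way this could be arranged, shown here only as a sketch, is to give each service a virtual environment that can still see the shared, pre-tested modules of the base interpreter. The example uses the standard-library `venv` module; the helper name `create_service_env`, the directory layout and the service name are assumptions made for illustration.

```python
import tempfile
import venv
from pathlib import Path


def create_service_env(base_dir, service_name):
    """Create a per-service virtual environment and return its path.

    system_site_packages=True lets the environment fall back to the modules
    installed with the base interpreter, while anything installed into the
    environment itself takes precedence for this service only.
    """
    env_dir = Path(base_dir) / service_name / "venv"
    # with_pip=False keeps the sketch fast; a real deployment would usually
    # want pip (or another installer) available inside the environment.
    venv.create(env_dir, system_site_packages=True, with_pip=False)
    return env_dir


base = tempfile.mkdtemp()
env_dir = create_service_env(base, "collector")  # hypothetical service name
print((env_dir / "pyvenv.cfg").read_text())
```

Installing a newer module version into one such environment then affects only that service, leaving the others on the shared, tested set.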
Flexible YAML configuration
With a microservices architecture, configuration is crucial to ensure all components can talk to each other. Although we started with simple text-file configuration, we quickly discovered that YAML gave us the hierarchy and flexibility we needed. YAML is also easier to read, understand and modify correctly than a flat series of interconnected sections in a standard text file. We also merge a series of configuration files together, which allows the user to concentrate on the basic configuration (e.g. component and services interconnection) without worrying about advanced tuning.
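The merging step might look something like the sketch below. To keep it dependency-free, each dictionary stands in for the result of parsing one YAML file (for example with PyYAML's `yaml.safe_load`); the function names and all configuration keys are hypothetical, and the exact merge rules we use may differ.

```python
import copy


def deep_merge(base, override):
    """Return a new mapping where values in override win over those in base.

    Nested mappings are merged key by key; any other value (including a
    list) in override replaces the one in base outright.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def merge_all(*layers):
    """Merge configuration layers in order; later layers take precedence."""
    result = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Shipped defaults, including advanced tuning the user rarely touches.
defaults = {
    "messagequeue": {
        "host": "localhost",
        "port": 5672,
        "tuning": {"prefetch": 50},
    }
}
# User-facing file: only the basic interconnection settings.
site = {"messagequeue": {"host": "mq.example.internal"}}

config = merge_all(defaults, site)
print(config)
```

Here the user's file overrides only the host, while the port and tuning values continue to come from the defaults.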
There's a fair amount of discussion regarding the use of object-oriented programming practices, especially with regard to dynamic languages like Python. At Opsview, we have found that a certain level of object-orientation helps keep structure within the projects, aids readability through model abstraction and provides additional namespacing for coding assistance. As projects get larger, it is very important to ensure that the correct abstractions are chosen and that they are refactored frequently as model usage becomes clearer.
Constructing a full software architecture is never an easy task, but Python has provided us with clear and concise modelling of our domain, flexibility of implementation, as well as speedy development and debugging. It has been quite a challenge to get to where we are now, but we feel confident that we have a language tool-set capable of meeting the stringent criteria required by Opsview products.