Deployment Environment

In terms of reliability, Prometheus is intended to be treated as a stateless service, like a webserver. You can move a webserver, restart it, and bring up extra copies as much as you like without causing issues. This isn't fully true for Prometheus, but it is something to strive for. Prometheus storage can be considered ephemeral, in the sense that you should aim to have your most critical alerts working within an hour of starting from a clean storage directory. That way, if the worst happens and you have to blow away storage, the impact will be limited.
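
One rough way to check the "working within an hour" goal is to estimate how much history each critical alert needs before it can fire. The sketch below is an illustration only: it assumes the needed history is roughly the largest range selector in the expression plus the rule's `for` duration, it ignores `offset` modifiers, and it can be fooled by square brackets inside label values. The function names and sample expressions are hypothetical.

```python
import re

# Seconds per PromQL duration unit.
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600,
                 "d": 86400, "w": 604800, "y": 31536000}
# Matches range selectors such as [5m] and subqueries such as [1h:5m],
# capturing only the range part before any colon.
_RANGE = re.compile(r"\[([0-9a-z]+)(?::[0-9a-z]*)?\]")
_PART = re.compile(r"(\d+)(ms|s|m|h|d|w|y)")


def duration_seconds(text):
    """Convert a duration such as '5m' or '1h30m' to seconds."""
    parts = _PART.findall(text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"not a duration: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def max_lookback_seconds(expr):
    """Largest range selector in the expression, or 0 if there is none."""
    return max((duration_seconds(d) for d in _RANGE.findall(expr)), default=0)


def needs_more_than(expr, for_duration="0s", limit_seconds=3600):
    """True if the alert plausibly needs more history than the limit."""
    needed = max_lookback_seconds(expr) + duration_seconds(for_duration)
    return needed > limit_seconds
```

For example, `needs_more_than("rate(http_requests_total[5m]) > 100", "10m")` is false, while an alert built on `avg_over_time(up[6h])` would be flagged, since it cannot be evaluated meaningfully until several hours of data exist.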

In terms of the compute resources used and the setup required, Prometheus is a stateful service in practice. It requires a non-trivial amount of fast local disk space, like a database would. Thus you should approach deploying Prometheus in a similar way to how you would approach deploying PostgreSQL or MySQL.

While storage is ephemeral in principle, that doesn't mean it absolutely has to be local and lost on machine failure. The storage is more a cache than a data store, and you want to avoid cache flushes where practical. Many users successfully use services such as Amazon's EBS for storage. Note that only one Prometheus at a time can use a given storage directory. Also, NFS (including Amazon's EFS) is strongly discouraged, as Prometheus needs a POSIX-compliant filesystem and NFS is not known for being fully POSIX compliant.

Prometheus presumes that you have a reasonably sophisticated configuration management setup that uses a tool such as Chef, Puppet, Ansible, Salt, Nix or something like Ksonnet (or the 20 or so other tools that serve a similar role in the Kubernetes ecosystem). It is presumed that you can do things like template and automatically generate configuration files, have some infrastructure to manage secrets, and have the ability to wrap applications like Prometheus in order to tie all of that together. For example, you are expected to have the capability to interpolate secrets in configuration files. If this is something you cannot already do, you should develop this capability, as you will need it for more things than Prometheus.
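
To make "interpolate secrets in configuration files" concrete, here is a minimal sketch of the kind of templating step a configuration management tool would perform before Prometheus starts. It is not how any particular tool works. The job name, placeholder names, and the `PROM_SCRAPE_PASSWORD` environment variable are hypothetical, and the environment stands in for whatever secret store you actually use.

```python
import json
import os
from string import Template

# Hypothetical template for a prometheus.yml with one authenticated scrape job.
PROMETHEUS_YML_TEMPLATE = Template("""\
global:
  scrape_interval: ${scrape_interval}
scrape_configs:
  - job_name: example
    basic_auth:
      username: ${scrape_user}
      password: ${scrape_password}
    static_configs:
      - targets: [${target}]
""")


def load_secret(name):
    """Fetch a secret; the environment is a stand-in for a real secret store."""
    try:
        return os.environ[name]
    except KeyError:
        raise RuntimeError(f"secret {name} is not set") from None


def render_config(values):
    """Fill in the template, quoting each value as a YAML string.

    A JSON string is also a valid YAML double-quoted scalar, so json.dumps
    keeps characters such as ':' or '#' in a secret from breaking the file.
    substitute() raises KeyError if any placeholder has no value.
    """
    quoted = {key: json.dumps(value) for key, value in values.items()}
    return PROMETHEUS_YML_TEMPLATE.substitute(quoted)
```

A wrapper script might call `render_config` with `load_secret("PROM_SCRAPE_PASSWORD")` as the password, write the result to a file readable only by the Prometheus user, and then start Prometheus against that file. Failing loudly on a missing value is deliberate: a half-rendered configuration is harder to debug than a refusal to start.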

Note that Docker, Docker Compose and Kubernetes are not configuration management tools. They are things that a configuration management tool outputs to, in the same way that systemd service files are written out to.

In terms of acquiring Prometheus binaries, you can either download the official released versions (generally recommended) or compile them yourself. It is unwise to use whatever the latest version happens to be when Prometheus starts up. Instead, choose the versions you deploy and update on your own schedule. No one wants to be woken up in the middle of the night because an unexpected upgrade broke their monitoring.
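
As an illustration of deploying a chosen version rather than "latest", the sketch below builds a download URL for one pinned release and checks the downloaded bytes against a checksum you recorded when you picked that version. The version number is only an example, and the URL pattern is an assumption based on how the official releases are laid out on GitHub at the time of writing, so verify it before relying on it.

```python
import hashlib

# Example pin only; pick the version yourself and change it deliberately.
PINNED_VERSION = "2.45.0"


def release_url(version, os_name="linux", arch="amd64"):
    """Build the download URL for one specific release (assumed URL layout)."""
    return (
        "https://github.com/prometheus/prometheus/releases/download/"
        f"v{version}/prometheus-{version}.{os_name}-{arch}.tar.gz"
    )


def verify_sha256(data, expected_hex):
    """Return True if the SHA-256 of the downloaded bytes matches the pin."""
    return hashlib.sha256(data).hexdigest() == expected_hex.strip().lower()
```

Storing the expected checksum next to the pinned version in your configuration management means an upgrade is always an explicit change that can be reviewed and rolled back.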