Monitoring your container-based infrastructure is crucial to ensure good performance, identify issues early and gain the insight necessary to maximize its efficiency.
When you are dealing with a large number of often short-lived containers spread across multiple hosts and even data centers, understanding the operational health of your infrastructure requires aggregating performance data from both the physical hosts and the container cluster running on top of them.
Ideally, you want to capture and correlate application performance with the underlying infrastructure to troubleshoot and identify bottlenecks. Implementing a monitoring system that satisfies these requirements can be a complex endeavor.
A previous blog post compared a number of monitoring options that integrate with Docker. One of the options evaluated is the SaaS-based monitoring platform from Datadog (www.datadoghq.com). Datadog uses an agent-based deployment that allows you to capture system resource metrics as well as key Docker metrics and visualize them in highly customizable graphs and dashboards.
The agent is available as a Docker image, which is a huge win in terms of ease of deployment. The recent release of the Datadog Agent introduced the Service Discovery feature, which facilitates polling of application-level metrics in dynamic, container-based environments.
In this article, I will walk you through the steps of setting up environment-wide monitoring across all layers of your stack (i.e., host, Docker engine and application) using the Datadog template from Rancher’s application catalog.