PuppyGraph Monitoring Metrics
PuppyGraph exposes metrics in the Prometheus / OpenMetrics format.
Specification
Endpoint
Once enabled (see Configuration), the metrics endpoint is served on port 8081 at the path /metrics.
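A quick way to confirm that the endpoint is reachable is to fetch it with curl. The sketch below assumes the endpoint is served over plain HTTP and that basic auth is enabled. The helper names metrics_url and fetch_metrics are illustrative and not part of PuppyGraph.

```shell
# Build the metrics URL for a given host.
# Port 8081 and the /metrics path come from this page; http:// is an assumption.
metrics_url() {
  printf 'http://%s:8081/metrics\n' "$1"
}

# Fetch the metrics with basic auth.
# -f: exit non-zero on HTTP errors (for example 401); -sS: quiet, but still print errors.
# PUPPYGRAPH_USERNAME and PUPPYGRAPH_PASSWORD must be set in the environment.
fetch_metrics() {
  curl -fsS -u "${PUPPYGRAPH_USERNAME}:${PUPPYGRAPH_PASSWORD}" "$(metrics_url "$1")"
}
```

For example, `fetch_metrics localhost | head` prints the first few lines of the response. If METRICS_AUTH_ENABLED is false, the -u option can be dropped.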
Available metrics
| Metric Name | Description |
|---|---|
| puppy_gremlin_errors_count | Number of Gremlin server errors |
| puppy_gremlin_errors_fifteenminuterate | Mean Gremlin server error rate over the last 15 minutes |
| puppy_gremlin_errors_fiveminuterate | Mean Gremlin server error rate over the last 5 minutes |
| puppy_gremlin_errors_meanrate | Mean Gremlin server error rate |
| puppy_gremlin_errors_oneminuterate | Mean Gremlin server error rate over the last minute |
| puppy_gremlin_op_eval_count | Number of Gremlin server eval operations |
| puppy_gremlin_op_eval_fifteenminuterate | Mean rate of Gremlin server eval operations over the last 15 minutes |
| puppy_gremlin_op_eval_fiveminuterate | Mean rate of Gremlin server eval operations over the last 5 minutes |
| puppy_gremlin_op_eval_max | Maximum time taken by a Gremlin server eval operation |
| puppy_gremlin_op_eval_mean | Mean time taken by a Gremlin server eval operation |
| puppy_gremlin_op_eval_meanrate | Mean rate of Gremlin server eval operations |
| puppy_gremlin_op_eval_min | Minimum time taken by a Gremlin server eval operation |
| puppy_gremlin_op_eval_oneminuterate | Mean rate of Gremlin server eval operations over the last minute |
| puppygraph_client_status | Liveness of the client services (gotty, Bolt, and notebook) |
| puppygraph_gremlin_server_status | Liveness of the Gremlin server |
| puppygraph_node_alive | Liveness of the nodes in the PuppyGraph cluster |
Metrics whose names start with puppy_gremlin are available when metrics collection for the Gremlin server is enabled (see GREMLINSERVER_METRICS_ENABLED under Configuration).
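To check whether Gremlin server metrics are actually being exported, one option is to filter the endpoint output by the puppy_gremlin prefix. This is a minimal sketch, assuming the endpoint returns the standard Prometheus text format (one sample per line, comment lines starting with #). The helper names gremlin_metrics and gremlin_metric_count are hypothetical.

```shell
# Read Prometheus text-format metrics on stdin and print only the samples
# whose metric name starts with puppy_gremlin. Comment lines (# HELP, # TYPE)
# start with '#', so the anchored pattern skips them.
gremlin_metrics() {
  grep '^puppy_gremlin' || true   # '|| true': an empty result is not treated as an error
}

# Count the matching samples. A result of 0 suggests that
# GREMLINSERVER_METRICS_ENABLED is not set to true.
gremlin_metric_count() {
  gremlin_metrics | wc -l | tr -d ' '
}
```

For instance, assuming PuppyGraph runs on localhost with auth enabled, `curl -fsS -u "$PUPPYGRAPH_USERNAME:$PUPPYGRAPH_PASSWORD" http://localhost:8081/metrics | gremlin_metric_count` prints the number of Gremlin server samples.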
Configuration
| Environment Variable | Default Value | Description |
|---|---|---|
| METRICS_ENABLED | false | Enables the metrics endpoint. |
| METRICS_AUTH_ENABLED | true | Enables basic auth (using PuppyGraph credentials) for the metrics endpoint. |
| GREMLINSERVER_METRICS_ENABLED | false | Enables metrics collection for the Gremlin server. |
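These variables are set in the environment of the PuppyGraph container. One way to keep them together is a Docker env file, sketched below. The file name puppygraph-metrics.env and the <PUPPYGRAPH_IMAGE> placeholder are illustrative, and the literal values true and false are assumed from the defaults listed in the table.

```shell
# Write the metrics settings to an env file (the file name is arbitrary).
cat > puppygraph-metrics.env <<'EOF'
METRICS_ENABLED=true
METRICS_AUTH_ENABLED=true
GREMLINSERVER_METRICS_ENABLED=true
EOF

# Pass the file to the container and publish the metrics port, for example:
#   docker run -d --env-file puppygraph-metrics.env -p 8081:8081 <PUPPYGRAPH_IMAGE>
```

Keeping METRICS_AUTH_ENABLED at its default of true means scrapers must supply PuppyGraph credentials, as the integrations on this page do.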
Prometheus Integration
Use the following Prometheus scrape configuration as a template for collecting PuppyGraph metrics. Replace the placeholder values to match your environment.
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: puppygraph
    metrics_path: /metrics
    static_configs:
      - targets: ["<PUPPYGRAPH_HOST>:8081"]
    basic_auth:
      username: <YOUR_PUPPYGRAPH_USERNAME>
      password: <YOUR_PUPPYGRAPH_PASSWORD>
- Set <PUPPYGRAPH_HOST> to the address where PuppyGraph exposes /metrics.
- Replace <YOUR_PUPPYGRAPH_USERNAME> and <YOUR_PUPPYGRAPH_PASSWORD> with valid PuppyGraph credentials.
- Adjust the scrape intervals to match your monitoring requirements.
Datadog Integration
Here's an example of how to integrate PuppyGraph's Prometheus endpoint with Datadog for monitoring.
Run Datadog agent
Follow the Datadog guide "Start the Datadog Agent with Docker" to start the Datadog Agent.
If metrics authentication is enabled, you need to set PUPPYGRAPH_USERNAME and PUPPYGRAPH_PASSWORD as environment variables in the Datadog agent container:
docker run -d --name dd-agent \
-e PUPPYGRAPH_USERNAME="<YOUR_PUPPYGRAPH_USERNAME>" \
-e PUPPYGRAPH_PASSWORD="<YOUR_PUPPYGRAPH_PASSWORD>" \
-e DD_API_KEY="<DATADOG_API_KEY>" \
-e DD_SITE="<YOUR_DATADOG_SITE>" \
-e DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
-v /proc/:/host/proc/:ro \
-v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
-v /var/lib/docker/containers:/var/lib/docker/containers:ro \
gcr.io/datadoghq/agent:latest
Run PuppyGraph with Datadog labels
Add labels to the PuppyGraph container so that Datadog can discover it and scrape its metrics.
For example, if you are running PuppyGraph with Docker, you can add the following arguments to the docker run command:
With metric authentication enabled:
-l com.datadoghq.ad.check_names='["openmetrics"]' \
-l com.datadoghq.ad.init_configs='[{}]' \
-l com.datadoghq.ad.instances="[{\"openmetrics_endpoint\":\"http://%%host%%:8081/metrics\",\"username\":\"%%env_PUPPYGRAPH_USERNAME%%\",\"password\":\"%%env_PUPPYGRAPH_PASSWORD%%\",\"namespace\":\"puppy\",\"metrics\":[\".*\"]}]" \
With metric authentication disabled: