Prometheus in Action

Power your metrics and alerting with a leading open-source monitoring solution.

Architecture Overview


Configuring Prometheus to monitor itself

prometheus.yml

Starting Prometheus

By default, Prometheus stores its database in ./data (flag --storage.tsdb.path).

./prometheus --config.file=prometheus.yml

Gather metrics

localhost:9090/metrics
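The /metrics endpoint returns plain text, one sample per line, with comment lines starting with #. As a rough illustration of how that output is structured, here is a minimal Python sketch of a parser. The SAMPLE text and its values are made up for the example, and the parser deliberately ignores optional timestamps and other details of the full format.

```python
import re

# Hypothetical excerpt of the text format; real output from
# localhost:9090/metrics is much longer and the values will differ.
SAMPLE = """\
# HELP prometheus_http_requests_total Counter of HTTP requests.
# TYPE prometheus_http_requests_total counter
prometheus_http_requests_total{code="200",handler="/metrics"} 42
go_goroutines 31
"""

# metric_name, optional {labels}, then the sample value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair such as code="200".
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a sample line this simplified parser understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

Each parsed tuple corresponds to one time series sample: the metric name, its label set, and the current value.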

Using the expression browser

localhost:9090/graph
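Expressions typed into the expression browser can also be evaluated through the Prometheus HTTP API, which exposes instant queries at /api/v1/query with a query parameter. The sketch below only builds such a URL and does not contact a server. The helper name instant_query_url and the example expression are assumptions made for illustration, not something this walkthrough prescribes.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def instant_query_url(expr, base="http://localhost:9090"):
    """Build an instant-query URL for the given expression (hypothetical helper)."""
    query_string = urlencode({"query": expr})  # percent-encodes (), [], spaces
    return f"{base}/api/v1/query?{query_string}"


# Illustrative expression; any valid expression can be used instead.
EXPR = "rate(prometheus_http_requests_total[1m])"

url = instant_query_url(EXPR)
print(url)

# Decoding the query string gives back the original expression.
recovered = parse_qs(urlsplit(url).query)["query"][0]
print(recovered)
```

With a running Prometheus, fetching that URL should return a JSON document containing the query result.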

Expression language

Using the graphing interface

Starting up some sample targets

Download the Go client library for Prometheus, build its example program, and run three instances of it:

# Fetch the client library code and compile example.
git clone https://github.com/prometheus/client_golang.git
cd client_golang/examples/random
go get -d
go build

# Start 3 example targets in separate terminals:
./random -listen-address=:8080
./random -listen-address=:8081
./random -listen-address=:8082

You should now have example targets listening on

http://localhost:8080/metrics
http://localhost:8081/metrics
http://localhost:8082/metrics

Configuring Prometheus to monitor the sample targets

The sample targets expose metrics such as rpc_durations_seconds. Add a job for them under scrape_configs in prometheus.yml:

scrape_configs:
  - job_name: 'example-random'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:8080', 'localhost:8081']
        labels:
          group: 'production'

      - targets: ['localhost:8082']
        labels:
          group: 'canary'
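To make the effect of this job definition concrete: every series scraped from a target gets a job label (the job name) and an instance label (the target address), plus any labels listed under static_configs. The Python sketch below mirrors the configuration above as a plain dictionary and computes those label sets. The function name target_labels is hypothetical, and the sketch ignores relabeling.

```python
# The scrape job from the configuration above, written as a Python dict.
scrape_config = {
    "job_name": "example-random",
    "static_configs": [
        {
            "targets": ["localhost:8080", "localhost:8081"],
            "labels": {"group": "production"},
        },
        {
            "targets": ["localhost:8082"],
            "labels": {"group": "canary"},
        },
    ],
}


def target_labels(config):
    """Return the label set attached to each target of one scrape job."""
    result = []
    for static in config["static_configs"]:
        for target in static["targets"]:
            # job and instance are attached to every scraped series.
            labels = {"job": config["job_name"], "instance": target}
            # Extra labels from the static config, e.g. group.
            labels.update(static.get("labels", {}))
            result.append(labels)
    return result


for labels in target_labels(scrape_config):
    print(labels)
```

The group label is what later lets queries separate the two production targets from the canary target.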

Configure rules for aggregating scraped data into new time series

expression

avg(rate(rpc_durations_seconds_count[5m])) by (job, service)
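As a rough model of what this expression computes, the Python sketch below takes a per-second rate for each counter series and then averages the rates of all series that share the same job and service labels. It is a simplification: the real rate() function also handles counter resets and extrapolates to the window boundaries, which simple_rate does not. The sample data, label values, and function names are all hypothetical.

```python
from collections import defaultdict


def simple_rate(samples):
    """Per-second increase between the first and last sample in the window.

    Simplified: no counter-reset handling and no extrapolation.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def avg_rate_by(series, keys):
    """Average the per-series rates, grouped by the given label names."""
    groups = defaultdict(list)
    for labels, samples in series:
        group = tuple(labels.get(key, "") for key in keys)
        groups[group].append(simple_rate(samples))
    return {group: sum(rates) / len(rates) for group, rates in groups.items()}


# Hypothetical counter samples as (timestamp_seconds, value) over 5 minutes.
series = [
    ({"job": "example-random", "service": "exponential",
      "instance": "localhost:8080"}, [(0, 0), (300, 600)]),
    ({"job": "example-random", "service": "exponential",
      "instance": "localhost:8081"}, [(0, 0), (300, 1200)]),
    ({"job": "example-random", "service": "uniform",
      "instance": "localhost:8080"}, [(0, 100), (300, 400)]),
]

result = avg_rate_by(series, ("job", "service"))
for group, value in result.items():
    print(group, value)
```

The instance label disappears from the result because it is not listed in the by clause, so each (job, service) pair yields a single averaged series.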

prometheus.rules.yml

groups:
- name: example
  rules:
  - record: job_service:rpc_durations_seconds_count:avg_rate5m
    expr: avg(rate(rpc_durations_seconds_count[5m])) by (job, service)

Complete Prometheus Config

global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds.

  # Attach these extra labels to all timeseries collected by this Prometheus instance.
  external_labels:
    monitor: 'codelab-monitor'

rule_files:
  - 'prometheus.rules.yml'

scrape_configs:
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'example-random'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:8080', 'localhost:8081']
        labels:
          group: 'production'

      - targets: ['localhost:8082']
        labels:
          group: 'canary'

Metric name exposed by the recording rule

job_service:rpc_durations_seconds_count:avg_rate5m

