How to Set Up Prometheus


Nov 6, 2025 - 10:30


Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud in 2012 and now maintained by the Cloud Native Computing Foundation (CNCF). It has become one of the most widely adopted monitoring solutions in modern cloud-native environments, particularly in Kubernetes clusters, microservices architectures, and DevOps pipelines. Unlike traditional monitoring tools that mix pull and push models inconsistently, Prometheus uses a consistent pull-based model backed by a powerful query language (PromQL), a time-series database, and flexible alerting mechanisms, all designed for reliability, scalability, and real-time observability.

Setting up Prometheus correctly is essential for gaining deep insights into system performance, application health, and infrastructure metrics. Whether you're monitoring a single server, a containerized application, or a large-scale distributed system, Prometheus provides the tools to collect, store, visualize, and alert on metrics with precision. This guide walks you through every step of setting up Prometheus: from installation and configuration to integration with exporters, visualization with Grafana, and best practices for production-grade monitoring.

By the end of this tutorial, you'll have a fully functional Prometheus instance capable of scraping metrics from multiple targets, triggering alerts based on custom thresholds, and delivering actionable insights through dashboards. You'll also understand how to maintain, scale, and secure your monitoring stack for long-term reliability.

Step-by-Step Guide

Prerequisites

Before beginning the setup process, ensure your environment meets the following minimum requirements:

  • A Linux-based system (Ubuntu 20.04/22.04, CentOS 7/8, or Debian 11 recommended)
  • At least 2 GB of RAM (4 GB recommended for production)
  • At least 20 GB of available disk space (depending on retention period and metric volume)
  • Root or sudo privileges
  • Basic familiarity with the command line and YAML configuration
  • Network access to the targets you intend to monitor (firewall rules permitting traffic on port 9090 and exporter ports)

If you're monitoring applications running in containers or Kubernetes, ensure Docker or Podman is installed, and if using Kubernetes, have kubectl configured with cluster access.
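The prerequisites above can be sanity-checked with a short shell snippet before you start. The thresholds mirror the list; the commands are standard GNU/Linux tools:

```shell
# Preflight check: report RAM, free disk on /, and any missing tools.
mem_gb=$(awk '/MemTotal/ {printf "%d", $2/1024/1024}' /proc/meminfo)
disk_gb=$(df -BG --output=avail / | tail -1 | tr -dc '0-9')
echo "RAM: ${mem_gb} GB (want >= 2), free disk: ${disk_gb} GB (want >= 20)"
for cmd in wget tar curl; do
    command -v "$cmd" >/dev/null || echo "missing required tool: $cmd"
done
```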

Step 1: Download and Install Prometheus

Prometheus is distributed as a standalone binary. Downloading and installing it manually gives you full control over configuration and versioning.

First, navigate to the official Prometheus releases page and identify the latest stable version. As of this writing, the latest version is 2.51.x. Use wget to download the binary:

wget https://github.com/prometheus/prometheus/releases/download/v2.51.2/prometheus-2.51.2.linux-amd64.tar.gz
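Before extracting, it's worth verifying the download. Each Prometheus release publishes a sha256sums.txt file alongside the tarballs; a quick check, run in the download directory:

```shell
# Optional but recommended: verify the tarball against the release checksums.
# A mismatch prints FAILED instead of OK.
if [ -f sha256sums.txt ]; then
    grep 'prometheus-2.51.2.linux-amd64.tar.gz' sha256sums.txt | sha256sum --check -
else
    echo "sha256sums.txt not found; download it from the releases page first"
fi
```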

Extract the archive:

tar xvfz prometheus-2.51.2.linux-amd64.tar.gz

Move the extracted files to a standard location:

sudo mv prometheus-2.51.2.linux-amd64 /opt/prometheus

cd /opt/prometheus

Verify the installation by checking the version:

./prometheus --version

You should see output similar to:

prometheus, version 2.51.2 (branch: HEAD, revision: 1234567890abcdef)

Step 2: Create a Prometheus User and Directory Structure

For security and organization, create a dedicated system user and directory structure to run Prometheus:

sudo useradd --no-create-home --shell /bin/false prometheus

Create directories for configuration, rules, and data storage:

sudo mkdir -p /etc/prometheus/rules /etc/prometheus/alerts

sudo mkdir -p /var/lib/prometheus

sudo chown prometheus:prometheus /var/lib/prometheus

Copy the configuration file and binaries to their appropriate locations:

sudo cp /opt/prometheus/prometheus /usr/local/bin/

sudo chown prometheus:prometheus /usr/local/bin/prometheus

sudo cp /opt/prometheus/promtool /usr/local/bin/

sudo chown prometheus:prometheus /usr/local/bin/promtool

sudo cp /opt/prometheus/prometheus.yml /etc/prometheus/

sudo cp -r /opt/prometheus/consoles /opt/prometheus/console_libraries /etc/prometheus/

sudo chown -R prometheus:prometheus /etc/prometheus

sudo chmod 755 /usr/local/bin/prometheus

sudo chmod 755 /usr/local/bin/promtool

Step 3: Configure Prometheus

The core configuration file for Prometheus is prometheus.yml. This YAML file defines scrape targets, job configurations, alerting rules, and global settings.

Open the configuration file:

sudo nano /etc/prometheus/prometheus.yml

Replace the default content with the following minimal but functional configuration:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093

rule_files:
  - "/etc/prometheus/rules/*.rules"
  - "/etc/prometheus/alerts/*.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']

Let's break this down:

  • scrape_interval: How often Prometheus pulls metrics from targets (15 seconds is standard).
  • evaluation_interval: How often alerting and recording rules are evaluated.
  • alerting: Points Prometheus to Alertmanager for alert routing (configured later).
  • rule_files: Specifies where custom alert and recording rules are stored.
  • scrape_configs: Defines the targets to monitor. The first job scrapes Prometheus itself; the second scrapes the Node Exporter (explained next).

Save and exit the file. You can validate the syntax at any time with promtool check config /etc/prometheus/prometheus.yml.

Step 4: Install and Configure Node Exporter

To monitor system-level metrics such as CPU, memory, disk I/O, and network usage, Prometheus needs an exporter. The Node Exporter is the most commonly used exporter for Linux systems.

Download the Node Exporter binary:

wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz

Extract and move the binary:

tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz

sudo mv node_exporter-1.7.0.linux-amd64/node_exporter /usr/local/bin/

sudo chown prometheus:prometheus /usr/local/bin/node_exporter

Create a systemd service file for Node Exporter:

sudo nano /etc/systemd/system/node_exporter.service

Add the following content:

[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target

Reload systemd and start the service:

sudo systemctl daemon-reload

sudo systemctl enable node_exporter

sudo systemctl start node_exporter

sudo systemctl status node_exporter

Verify Node Exporter is running on port 9100:

curl http://localhost:9100/metrics

You should see a long list of system metrics in plain text format.

Step 5: Configure Prometheus as a Systemd Service

To ensure Prometheus starts automatically on boot and runs in the background, create a systemd service file:

sudo nano /etc/systemd/system/prometheus.service

Add the following content:

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus/ \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.enable-admin-api \
  --web.enable-lifecycle \
  --storage.tsdb.retention.time=15d \
  --enable-feature=remote-write-receiver
Restart=always

[Install]
WantedBy=multi-user.target

Important flags explained:

  • --config.file: Path to your configuration file.
  • --storage.tsdb.path: Where time-series data is stored.
  • --web.listen-address: Listen on all interfaces (0.0.0.0) on port 9090.
  • --web.enable-admin-api: Enables administrative APIs (use cautiously in production).
  • --web.enable-lifecycle: Allows reloading config via HTTP POST.
  • --storage.tsdb.retention.time: How long to retain data (15 days is a good default).
  • --enable-feature=remote-write-receiver: Enables receiving remote writes (useful for HA setups).

Reload systemd and start Prometheus:

sudo systemctl daemon-reload

sudo systemctl enable prometheus

sudo systemctl start prometheus

sudo systemctl status prometheus

Step 6: Access the Prometheus Web Interface

Once Prometheus is running, access the web UI by opening your browser and navigating to:

http://your-server-ip:9090

You should see the Prometheus homepage with a search bar and navigation menu. Click on Status > Targets to verify that both the Prometheus job and the Node Exporter job are showing as UP.

If either shows DOWN, check:

  • Firewall settings (ensure ports 9090 and 9100 are open)
  • Service status: sudo systemctl status prometheus and sudo systemctl status node_exporter
  • Configuration syntax: promtool check config /etc/prometheus/prometheus.yml

Step 7: Install and Configure Alertmanager (Optional but Recommended)

Alertmanager handles alerts sent by Prometheus and routes them to notification channels like email, Slack, PagerDuty, or Microsoft Teams.

Download Alertmanager:

wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz

Extract and move:

tar xvfz alertmanager-0.27.0.linux-amd64.tar.gz

sudo mv alertmanager-0.27.0.linux-amd64/alertmanager /usr/local/bin/

sudo mv alertmanager-0.27.0.linux-amd64/amtool /usr/local/bin/

sudo chown prometheus:prometheus /usr/local/bin/alertmanager

sudo chown prometheus:prometheus /usr/local/bin/amtool

Create a configuration file:

sudo nano /etc/prometheus/alertmanager.yml

Add a basic configuration:

global:
  resolve_timeout: 5m

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'email-notifications'

receivers:
  - name: 'email-notifications'
    email_configs:
      - to: 'alerts@example.com'
        from: 'prometheus@example.com'
        smarthost: 'smtp.example.com:587'
        auth_username: 'prometheus@example.com'
        auth_password: 'your-smtp-password'
        html: '{{ template "email.default.html" . }}'
        headers:
          subject: '[Prometheus Alert] {{ .CommonLabels.alertname }}'

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

Create a data directory for Alertmanager (it stores silences and notification state), then a systemd service:

sudo mkdir -p /var/lib/alertmanager

sudo chown prometheus:prometheus /var/lib/alertmanager

sudo nano /etc/systemd/system/alertmanager.service

Add:

[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/prometheus/alertmanager.yml \
  --storage.path=/var/lib/alertmanager \
  --web.listen-address=0.0.0.0:9093
Restart=always

[Install]
WantedBy=multi-user.target

Reload and start:

sudo systemctl daemon-reload

sudo systemctl enable alertmanager

sudo systemctl start alertmanager

sudo systemctl status alertmanager

Update your Prometheus configuration to point to Alertmanager:

In /etc/prometheus/prometheus.yml, ensure the alerting section points to localhost:9093 (as shown earlier). Then reload Prometheus:

curl -X POST http://localhost:9090/-/reload

Step 8: Set Up Grafana for Visualization

While Prometheus provides a basic UI, Grafana is the industry standard for creating rich, customizable dashboards.

Install Grafana:

sudo apt-get install -y apt-transport-https software-properties-common wget

sudo mkdir -p /etc/apt/keyrings

wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null

echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list

sudo apt-get update

sudo apt-get install -y grafana

Start and enable Grafana:

sudo systemctl daemon-reload

sudo systemctl enable grafana-server

sudo systemctl start grafana-server

Access Grafana at http://your-server-ip:3000. Default login: admin/admin (change password immediately).

Add Prometheus as a data source:

  1. Click Configuration > Data Sources > Add data source
  2. Select Prometheus
  3. Set URL to http://localhost:9090
  4. Click Save & Test

Import a pre-built dashboard:

  • Click Create > Import
  • Enter dashboard ID 1860 (Node Exporter Full) and click Load
  • Select Prometheus as the data source
  • Click Import

You now have a live dashboard showing CPU, memory, disk, and network usage metrics from your server.
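If you rebuild the server often, the data source can also be defined declaratively through Grafana's file-based provisioning instead of the UI. A minimal sketch, assuming Grafana's default provisioning directory:

```yaml
# /etc/grafana/provisioning/datasources/prometheus.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
```

Grafana loads this file at startup, so the data source survives container or VM rebuilds without manual clicks.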

Best Practices

Use Meaningful Job Names and Labels

Always use descriptive job names in your prometheus.yml file. Instead of job_name: 'app', use job_name: 'web-api-production'. Labels should be consistent across services to enable powerful grouping and filtering in PromQL queries.

Example:

- job_name: 'web-api-production'
  static_configs:
    - targets: ['10.0.1.10:9101']
      labels:
        environment: 'production'
        service: 'web-api'
        team: 'backend'

Implement Proper Retention Policies

By default, Prometheus retains data for 15 days. For production systems with high metric volume, adjust retention based on storage capacity and compliance needs:

  • Short-term: 7-14 days (development/testing)
  • Medium-term: 30-60 days (production monitoring)
  • Long-term: Use remote storage (Thanos, Cortex, Mimir) for years of data
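Before picking a retention period, it helps to estimate storage. Per the Prometheus storage documentation, needed disk is roughly retention_time × ingested_samples_per_second × bytes_per_sample, at about 1-2 bytes per sample after compression. A sketch with illustrative numbers:

```shell
# Rough TSDB disk estimate; all inputs are example values.
retention_days=60
targets=50
series_per_target=1000
scrape_interval=15      # seconds
bytes_per_sample=2      # conservative post-compression estimate

samples_per_second=$(( targets * series_per_target / scrape_interval ))
retention_seconds=$(( retention_days * 24 * 3600 ))
bytes=$(( retention_seconds * samples_per_second * bytes_per_sample ))
echo "~$(( bytes / 1024 / 1024 / 1024 )) GiB for ${retention_days}d retention"
```

With these inputs the estimate comes to roughly 32 GiB; plug in your own target and series counts.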

Set retention via the --storage.tsdb.retention.time flag in your systemd unit:

--storage.tsdb.retention.time=60d

Separate Alerting and Recording Rules

Keep alerting rules (conditions that trigger notifications) separate from recording rules (precomputed expressions to improve query performance). Store them in dedicated directories:

  • /etc/prometheus/alerts/ for alerting rules
  • /etc/prometheus/rules/ for recording rules

Example recording rule (/etc/prometheus/rules/cpu_usage.rules):

groups:
  - name: cpu_usage
    rules:
      - record: instance:cpu_usage:avg5m
        expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

Example alerting rule (/etc/prometheus/alerts/high_cpu_alert.rules):

groups:
  - name: high_cpu_alert
    rules:
      - alert: HighCPUUsage
        expr: instance:cpu_usage:avg5m > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage has been above 80% for 5 minutes."
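Alerting rules like this can be unit-tested offline with promtool. A minimal test file (the filename and series values are illustrative), run with promtool test rules high_cpu_test.yml:

```yaml
# high_cpu_test.yml - exercises the HighCPUUsage alert above.
rule_files:
  - /etc/prometheus/alerts/high_cpu_alert.rules
evaluation_interval: 1m
tests:
  - interval: 1m
    input_series:
      - series: 'instance:cpu_usage:avg5m{instance="web-1"}'
        values: '0.9+0x10'   # constant 0.9 for 11 samples
    alert_rule_test:
      - eval_time: 6m
        alertname: HighCPUUsage
        exp_alerts:
          - exp_labels:
              severity: warning
              instance: web-1
```

At eval_time 6m the expression has been true for more than the 5m hold period, so the test expects the alert to be firing with the given labels.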

Enable Remote Write for Scalability

For high-availability or long-term storage needs, configure Prometheus to send metrics to remote storage like Thanos, Cortex, or Mimir. This decouples storage from the Prometheus server, enabling horizontal scaling and data federation.

remote_write:
  - url: "http://thanos-query.example.com/api/v1/write"
    queue_config:
      max_samples_per_send: 1000
      max_retries: 10
      min_backoff: 30ms
      max_backoff: 100ms

Secure Your Prometheus Instance

By default, Prometheus exposes its web interface and APIs without authentication. In production, secure it using:

  • Reverse proxy with TLS: Use Nginx or Caddy to terminate HTTPS and add basic auth.
  • Network restrictions: Allow access only from internal networks or monitoring VLANs.
  • Disable admin API: Remove --web.enable-admin-api unless absolutely necessary.
  • Use OAuth2 or SAML: Integrate with enterprise identity providers via proxy.

Example Nginx config for basic auth (bind Prometheus to 127.0.0.1:9090 via --web.listen-address so the proxy cannot be bypassed):

server {
    listen 80;
    server_name prometheus.example.com;

    auth_basic "Prometheus Admin";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass http://localhost:9090;
        proxy_http_version 1.1;
    }
}
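The .htpasswd file referenced above can be generated without installing apache2-utils; openssl's apr1 hash format is understood by Nginx. The credentials here are placeholders:

```shell
# Create a basic-auth entry; replace 'admin'/'changeme' with real credentials.
entry="admin:$(openssl passwd -apr1 'changeme')"
echo "$entry"
# then append it: echo "$entry" | sudo tee /etc/nginx/.htpasswd
```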

Monitor Prometheus Itself

Prometheus should monitor its own health. Use the per-target up and scrape_duration_seconds series, along with Prometheus's own metrics such as prometheus_build_info and prometheus_rule_evaluation_failures_total, to detect scraping failures, memory growth, or slow rule evaluation.

Set up alerts for:

  • Prometheus target down (itself)
  • Scrape duration exceeding threshold
  • TSDB head chunks growing too large
  • Rule evaluation failures

Use Labels Consistently Across Services

Standardize labels like environment, region, service, and team across all exporters and applications. This enables cross-service queries like:

sum(rate(http_requests_total{environment="production"}[5m])) by (service)

Regularly Audit and Clean Up Unused Metrics

Over time, unused or noisy metrics can bloat your TSDB. Use the Prometheus UI's metrics explorer to identify high-cardinality or rarely queried metrics, then use metric_relabel_configs (under the relevant scrape job) to drop them at scrape time:

metric_relabel_configs:
  - source_labels: [__name__]
    regex: 'old_metric_.*'
    action: drop
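To find candidates for dropping, a cardinality query in the Prometheus UI lists the ten metric names with the most series:

```
topk(10, count by (__name__) ({__name__=~".+"}))
```

Metric names that dominate this list but never appear in dashboards or alerts are the first things to relabel away.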

Tools and Resources

Official Prometheus Tools

  • Promtool: Command-line utility for validating configuration files, testing rules, and querying metrics. Use promtool check config prometheus.yml to validate syntax before restarting.
  • Prometheus Web UI: Built-in interface for querying metrics and viewing targets. Useful for quick debugging.
  • Alertmanager: Handles alert deduplication, grouping, and routing. Integrates with Slack, PagerDuty, Email, and more.

Exporters

Exporters are essential for exposing metrics from third-party systems. Key exporters include:

  • Node Exporter: System-level metrics (CPU, memory, disk, network)
  • Blackbox Exporter: HTTP, TCP, ICMP probe monitoring (for uptime checks)
  • MySQL Exporter: Database performance metrics
  • Redis Exporter: Redis instance metrics
  • PostgreSQL Exporter: Query performance and connection stats
  • Pushgateway: For batch jobs and ephemeral tasks that cannot be scraped
  • App Exporters: Custom exporters for Java (Micrometer), Python (Prometheus Client), Go (Prometheus Client Library)

Visualization

  • Grafana: The de facto standard for dashboarding. Offers hundreds of community-built dashboards.
  • PromLens: A visual PromQL editor with autocomplete and query explanation.

Remote Storage

  • Thanos: Adds long-term storage, global querying, and high availability to Prometheus.
  • Cortex: Multi-tenant, horizontally scalable Prometheus-compatible backend.
  • Mimir: Grafana Labs' next-generation Prometheus backend with advanced features like sharding and compression.
  • VictoriaMetrics: A high-performance, scalable Prometheus-compatible time-series database.


Community and Support

Join the Prometheus community for real-time help:

  • Slack: #prometheus channel on CNCF Slack

  • Forum: https://discuss.prometheus.io
  • GitHub Issues: Report bugs or request features

Real Examples

Example 1: Monitoring a Web Application with cURL and Custom Metrics

Suppose you have a simple web API that returns a JSON status. You want to monitor its response time and success rate.

Create a custom script (web_monitor.sh) to expose metrics:

#!/bin/bash
# Probe the health endpoint every 10 seconds and write the result
# in Prometheus exposition format to a file served over HTTP.
while true; do
    start=$(date +%s.%N)
    response=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health)
    end=$(date +%s.%N)
    duration=$(echo "$end - $start" | bc -l)
    {
        echo "# HELP web_api_response_time_seconds Time taken to respond to health check"
        echo "# TYPE web_api_response_time_seconds gauge"
        echo "web_api_response_time_seconds{status=\"$response\"} $duration"
    } > metrics
    sleep 10
done

Run the script from an otherwise empty directory, then serve that directory on port 9101 so Prometheus can fetch the metrics file at the default /metrics path:

python3 -m http.server 9101

Then add to Prometheus config:

- job_name: 'web-api-custom'
  static_configs:
    - targets: ['localhost:9101']

Now you can query the average response time (avg_over_time, since this metric is a gauge):

avg_over_time(web_api_response_time_seconds[5m])

Example 2: Alerting on High HTTP Error Rates

Assume you're monitoring a web server with a metric http_requests_total{code="500"}.

Create an alert rule:

groups:
  - name: web_errors
    rules:
      - alert: High5xxErrors
        expr: rate(http_requests_total{code=~"5.."}[5m]) > 0.1
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "High 5xx errors detected on {{ $labels.instance }}"
          description: "HTTP 5xx error rate has exceeded 0.1 per second for 10 minutes."

This fires when the 5xx rate, averaged over a 5-minute window, stays above 0.1 errors per second (roughly one error every 10 seconds) for 10 consecutive minutes.
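A fixed per-second threshold can misfire on low-traffic services. A ratio-based variant of the same alert (assuming http_requests_total carries a code label for all status classes) pages on the error percentage instead:

```yaml
- alert: High5xxErrorRatio
  expr: |
    sum(rate(http_requests_total{code=~"5.."}[5m]))
      / sum(rate(http_requests_total[5m])) > 0.05
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: "More than 5% of requests are failing with 5xx"
```

The ratio form scales with traffic, so one stray error on a quiet endpoint no longer pages anyone.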

Example 3: Monitoring Kubernetes with kube-state-metrics

In a Kubernetes cluster, install kube-state-metrics:

kubectl apply -f https://github.com/kubernetes/kube-state-metrics/releases/download/v2.12.0/kube-state-metrics.yaml

Add to Prometheus config:

- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: kubernetes_pod_name

Now you can monitor pod restarts, resource requests, and container statuses directly in Prometheus.

FAQs

What is Prometheus used for?

Prometheus is used for monitoring and alerting on time-series metrics from systems, applications, and services. It excels at collecting metrics like CPU usage, request rates, error counts, and latency, enabling teams to detect anomalies, troubleshoot performance issues, and ensure system reliability.

Can Prometheus monitor Windows servers?

Yes. Use the Windows Exporter (https://github.com/prometheus-community/windows_exporter) to collect metrics such as disk usage, network interfaces, and service states on Windows systems.

Does Prometheus support log monitoring?

No. Prometheus is designed for metrics, not logs. For log aggregation, use Loki (also from Grafana Labs) or the ELK stack. Prometheus and Loki are often used together for full observability.

How much disk space does Prometheus need?

It depends on the number of metrics and the retention period. A typical server with 1000 time series and 15-day retention uses roughly 5-10 GB. High-cardinality metrics (e.g., per-request IDs) can consume hundreds of GBs quickly. Use remote storage for large-scale deployments.

Is Prometheus suitable for production?

Yes. Prometheus is used by major organizations including Google, GitHub, and Netflix. However, for high availability and long-term storage, pair it with Thanos, Cortex, or Mimir.

How do I update Prometheus?

Download the new binary, stop the service, replace the executable, and restart. Always test new versions in staging first. Use version control for your config files to roll back if needed.

Can Prometheus scrape metrics over HTTPS?

Yes. Configure TLS in the scrape config:

scrape_configs:
  - job_name: 'secure-app'
    scheme: https
    tls_config:
      ca_file: /etc/prometheus/ca.crt
      cert_file: /etc/prometheus/cert.crt
      key_file: /etc/prometheus/key.key
    static_configs:
      - targets: ['app.example.com:443']
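For a test environment, the certificate files referenced above can be stand-ins generated with openssl (the CN and filenames are placeholders; use certificates from your real CA in production):

```shell
# Self-signed certificate for TLS scrape testing only.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout key.key -out cert.crt -subj "/CN=app.example.com"
openssl x509 -in cert.crt -noout -subject
```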

What is the difference between Prometheus and Zabbix?

Prometheus is pull-based, cloud-native, and designed for dynamic environments like Kubernetes. Zabbix relies on agent-based checks (both push and pull), is traditionally deployed on static infrastructure, and ships a heavier GUI. Prometheus scales more naturally and integrates better with modern DevOps toolchains.

How do I backup Prometheus data?

Back up the /var/lib/prometheus directory. For a consistent copy of a running server, trigger a snapshot through the admin API (curl -XPOST http://localhost:9090/api/v1/admin/tsdb/snapshot, which requires --web.enable-admin-api) and archive the directory it creates under /var/lib/prometheus/snapshots/. If you copy the data directory directly instead, stop Prometheus first to avoid corruption.

Why are my targets showing as DOWN?

Common causes:

  • Network connectivity issues
  • Firewall blocking port
  • Exporter not running
  • Incorrect target URL or port
  • Authentication required but not configured

Check the Prometheus UI under Status > Targets for detailed error messages.

Conclusion

Setting up Prometheus is a foundational skill for modern DevOps and SRE teams. From installing the binary and configuring scrape targets to integrating with exporters, Alertmanager, and Grafana, this guide has provided a comprehensive, production-ready roadmap for deploying Prometheus successfully.

Prometheus is not just a tool; it embodies a philosophy of observability: collect meaningful metrics, alert on what matters, and visualize trends to drive informed decisions. When paired with best practices like consistent labeling, proper retention policies, and remote storage, Prometheus becomes a powerful engine for system reliability.

Remember: Monitoring is not a one-time setup. It's an ongoing discipline. Regularly review your alerts, prune unused metrics, and refine your dashboards as your infrastructure evolves. The goal is not to collect every possible metric, but to understand the health of your systems at a glance and act before users are impacted.

With Prometheus, you now have the tools to build a resilient, transparent, and proactive monitoring culture. Start small, validate your setup, and scale gradually. The insights you gain will transform how you operate and maintain your systems, today and into the future.