How to Set Up the ELK Stack
The ELK Stack, an acronym for Elasticsearch, Logstash, and Kibana, is one of the most powerful and widely adopted open-source platforms for log management, real-time analytics, and observability. Originally developed by Elastic, the ELK Stack has become the de facto standard for centralized logging across enterprises, DevOps teams, and cloud-native environments. Whether you're monitoring application performance, troubleshooting infrastructure issues, or analyzing security events, the ELK Stack provides the tools to collect, process, store, and visualize structured and unstructured data at scale.
With the exponential growth of digital systems, logs are no longer just an afterthought; they are critical assets for operational visibility. The ELK Stack transforms raw log data into actionable insights, enabling teams to detect anomalies, predict failures, and optimize performance before users are impacted. This tutorial provides a comprehensive, step-by-step guide to setting up the ELK Stack from scratch on a Linux-based system, along with best practices, real-world examples, and essential resources to ensure a robust, scalable, and secure deployment.
Step-by-Step Guide
Prerequisites
Before beginning the setup, ensure your environment meets the following minimum requirements:
- A server running Ubuntu 22.04 LTS or CentOS 8+/RHEL 8+
- At least 4 GB of RAM (8 GB recommended for production)
- At least 2 CPU cores
- At least 20 GB of free disk space (scalable based on log volume)
- Root or sudo access
- Java 11 or Java 17 installed (Elasticsearch requires Java)
- Internet access to download packages
For production environments, consider deploying each component on separate servers to optimize resource allocation and improve fault tolerance. For learning or small-scale use, a single-node setup is acceptable.
Step 1: Install Java
Elasticsearch runs on the Java Virtual Machine (JVM), so Java must be installed before proceeding. We'll install OpenJDK 17, which is fully supported by the latest Elasticsearch versions.
On Ubuntu:
sudo apt update
sudo apt install openjdk-17-jdk -y
On CentOS/RHEL:
sudo dnf install java-17-openjdk-devel -y
Verify the installation:
java -version
You should see output similar to:
openjdk version "17.0.10"
OpenJDK Runtime Environment (build 17.0.10+7)
OpenJDK 64-Bit Server VM (build 17.0.10+7, mixed mode, sharing)
Step 2: Install Elasticsearch
Elasticsearch is the distributed search and analytics engine at the core of the ELK Stack. It stores and indexes data, enabling fast full-text searches and complex aggregations.
First, import the Elastic GPG key to verify package authenticity:
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
Add the Elasticsearch repository:
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-8.x.list
Update the package list and install Elasticsearch:
sudo apt update
sudo apt install elasticsearch -y
For CentOS/RHEL:
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
echo "[elasticsearch-8.x]
name=Elasticsearch repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md" | sudo tee /etc/yum.repos.d/elasticsearch.repo
sudo dnf install elasticsearch -y
Configure Elasticsearch by editing its main configuration file:
sudo nano /etc/elasticsearch/elasticsearch.yml
Update the following settings for a single-node development setup:
cluster.name: my-elk-cluster
node.name: node-1
network.host: 0.0.0.0
discovery.type: single-node
xpack.security.enabled: false
Important: In production, always enable security (xpack.security.enabled: true) and configure TLS/SSL certificates. For now, we disable security for simplicity during setup.
Start and enable Elasticsearch:
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
sudo systemctl start elasticsearch
Verify Elasticsearch is running:
curl -X GET "localhost:9200"
You should receive a JSON response with cluster details, including version and cluster name. If you see an error, check logs with:
sudo journalctl -u elasticsearch -f
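To sanity-check the response in a script rather than by eye, you can parse the JSON body that the root endpoint returns. A minimal Python sketch, where the sample body below is illustrative (field names match Elasticsearch's root endpoint; the values are examples, not a real response):

```python
import json

# Illustrative sample of the JSON body returned by GET localhost:9200
# (field names follow Elasticsearch's root endpoint; values are examples).
sample_response = '''
{
  "name": "node-1",
  "cluster_name": "my-elk-cluster",
  "version": {"number": "8.12.0", "lucene_version": "9.9.1"},
  "tagline": "You Know, for Search"
}
'''

def summarize(body: str) -> str:
    """Extract the cluster name and version from the root-endpoint JSON."""
    info = json.loads(body)
    return f"{info['cluster_name']} running {info['version']['number']}"

print(summarize(sample_response))
```

If the body fails to parse, you are likely getting an error page or an empty reply rather than a healthy Elasticsearch response.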
Step 3: Install Logstash
Logstash is the data processing pipeline that ingests data from multiple sources, transforms it, and sends it to Elasticsearch. It supports plugins for input, filter, and output stages, making it highly flexible.
Install Logstash using the same repository:
sudo apt install logstash -y
Or on CentOS/RHEL:
sudo dnf install logstash -y
Logstash configuration files are stored in /etc/logstash/conf.d/. Create a configuration file for a basic syslog input and Elasticsearch output:
sudo nano /etc/logstash/conf.d/01-syslog-input.conf
Add the following configuration:
input {
  beats {
    port => 5044
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}
This configuration listens for Beats input (e.g., Filebeat) on port 5044, parses syslog-style messages using Grok patterns, and forwards them to Elasticsearch.
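To see what fields the Grok filter pulls out of a syslog line, the pattern can be approximated with a plain regular expression. A minimal Python sketch; the regex below is a simplification of Grok's SYSLOGTIMESTAMP/SYSLOGHOST/DATA/POSINT/GREEDYDATA building blocks, not the exact patterns Logstash ships:

```python
import re

# Rough Python equivalent of the Grok pattern in the Logstash filter above,
# useful for checking what fields a syslog-style line would yield.
SYSLOG_RE = re.compile(
    r"(?P<syslog_timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<syslog_hostname>\S+) "
    r"(?P<syslog_program>[^\[:]+)(?:\[(?P<syslog_pid>\d+)\])?: "
    r"(?P<syslog_message>.*)"
)

line = "Mar  4 10:15:32 web01 sshd[1234]: Failed password for root from 10.0.0.5"
m = SYSLOG_RE.match(line)
fields = m.groupdict()
print(fields["syslog_program"], fields["syslog_pid"])
```

Running a few real lines from /var/log/syslog through a check like this before deploying the pipeline saves a round of Logstash restarts.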
Start and enable Logstash:
sudo systemctl daemon-reload
sudo systemctl enable logstash
sudo systemctl start logstash
Check its status:
sudo systemctl status logstash
If Logstash fails to start, inspect logs for syntax errors:
sudo tail -f /var/log/logstash/logstash-plain.log
Step 4: Install Kibana
Kibana is the visualization layer of the ELK Stack. It provides a web interface to explore data in Elasticsearch, create dashboards, and monitor system health.
Install Kibana:
sudo apt install kibana -y
Or on CentOS/RHEL:
sudo dnf install kibana -y
Edit the Kibana configuration file:
sudo nano /etc/kibana/kibana.yml
Update the following settings:
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://localhost:9200"]
i18n.locale: "en"
Start and enable Kibana:
sudo systemctl daemon-reload
sudo systemctl enable kibana
sudo systemctl start kibana
Verify Kibana is running:
curl http://localhost:5601
You should see HTML output. If you're accessing Kibana remotely, ensure your firewall allows traffic on port 5601:
sudo ufw allow 5601
Open your browser and navigate to http://your-server-ip:5601. You should see the Kibana welcome screen.
Step 5: Install Filebeat (Optional but Recommended)
While Logstash can ingest data from many sources, Filebeat is a lightweight, resource-efficient log shipper designed specifically for forwarding logs to Logstash or Elasticsearch. It's ideal for collecting logs from application servers, web servers, and containers.
Install Filebeat:
sudo apt install filebeat -y
Or on CentOS/RHEL:
sudo dnf install filebeat -y
Configure Filebeat to send logs to Logstash. Edit the configuration:
sudo nano /etc/filebeat/filebeat.yml
Uncomment and update the following sections:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/*.log
    - /var/log/syslog

output.logstash:
  hosts: ["localhost:5044"]
Enable the system module (to collect OS-level logs):
sudo filebeat modules enable system
Load the template into Elasticsearch:
sudo filebeat setup -e
This command loads the default index template and Kibana dashboards into Elasticsearch.
Start Filebeat:
sudo systemctl enable filebeat
sudo systemctl start filebeat
Step 6: Verify the Full Stack
Now that all components are installed, verify data is flowing end-to-end:
- Check that Filebeat is sending logs:
sudo journalctl -u filebeat -f
- Check Logstash is processing:
sudo tail -f /var/log/logstash/logstash-plain.log
- Check Elasticsearch has indexed data:
curl -X GET "localhost:9200/_cat/indices?v"
You should see indices like filebeat-* or syslog-* listed.
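If you want to script this verification, the tabular output of _cat/indices can be parsed directly. A minimal Python sketch; the sample text below mimics the column layout of _cat/indices?v with illustrative values:

```python
# Minimal sketch parsing the tabular output of GET _cat/indices?v
# (the sample mimics the column layout; names and counts are illustrative).
sample = """\
health status index                    uuid   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   filebeat-2024.03.04      abc123   1   1      15432            0     12.4mb         12.4mb
green  open   .kibana_8.12.0_001       def456   1   0         312            4      2.1mb          2.1mb
"""

def list_indices(cat_output: str, prefix: str = "filebeat-"):
    """Return index names matching a prefix from _cat/indices output."""
    rows = cat_output.strip().splitlines()[1:]  # skip the header row
    names = [row.split()[2] for row in rows]    # third column is the index name
    return [n for n in names if n.startswith(prefix)]

print(list_indices(sample))
```

An empty result here means Filebeat data has not reached Elasticsearch yet, so check the Filebeat and Logstash logs before looking at Kibana.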
In Kibana, navigate to Stack Management > Index Patterns, then click Create index pattern. Enter filebeat-* as the pattern and select @timestamp as the time field. Click Create.
Now go to Discover to view live log entries. You should see logs from your system's syslog, auth.log, and other files.
Next, create a dashboard: Go to Dashboard > Create dashboard, add visualizations like log rate over time, top source IPs, or error counts. Save the dashboard for future monitoring.
Best Practices
Security Configuration
Never run the ELK Stack in production without security enabled. Elasticsearch, Kibana, and Logstash all support authentication, role-based access control (RBAC), and TLS encryption.
Enable built-in security in Elasticsearch:
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
Generate certificates:
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil ca
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
Place certificates in /etc/elasticsearch/certs/ and reference them in elasticsearch.yml.
Set passwords for built-in users:
sudo /usr/share/elasticsearch/bin/elasticsearch-setup-passwords auto
Store the generated passwords securely. Use them in Kibana's kibana.yml:
elasticsearch.username: "kibana_system"
elasticsearch.password: "your-generated-password"
Resource Allocation
Elasticsearch is memory-intensive. Allocate roughly 50% of the system's RAM to the JVM heap (via jvm.options), and never exceed 32 GB. For example, on an 8 GB machine, use a 4 GB heap:
-Xms4g
-Xmx4g
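The sizing rule above can be expressed as a tiny helper, useful when provisioning nodes of different sizes (the 32 GB cap keeps the JVM under the compressed-oops threshold):

```python
def jvm_heap_gb(total_ram_gb: int) -> int:
    """Heap sizing rule from the text: half of RAM, capped at 32 GB
    (to stay below the JVM's compressed-oops threshold)."""
    return min(total_ram_gb // 2, 32)

# An 8 GB machine gets a 4 GB heap; a 128 GB machine is capped at 32 GB.
for ram in (8, 16, 128):
    print(f"-Xms{jvm_heap_gb(ram)}g -Xmx{jvm_heap_gb(ram)}g")
```

Always set -Xms and -Xmx to the same value so the heap never resizes at runtime.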
Disable swap entirely on Elasticsearch nodes:
sudo swapoff -a
To make this permanent, comment out (or remove) any swap entries in /etc/fstab.
Index Management and Retention
Logs accumulate quickly. Implement Index Lifecycle Management (ILM) to automate rollover, deletion, and archiving.
Create an ILM policy in Kibana: Stack Management > Index Lifecycle Policies. Define phases:
- Hot: Indexing and searching (retain for 7 days)
- Warm: Read-only, on less powerful hardware (retain for 30 days)
- Cold: Archived to cheaper storage (retain for 90 days)
- Delete: Remove after 365 days
Apply the policy to your index patterns via index templates.
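The same policy can also be created via the ILM API instead of the Kibana UI. A sketch of a policy body matching the phases above; the field names follow Elasticsearch's ILM policy format, but the ages, sizes, and policy name are assumptions to tune for your log volume:

```python
import json

# Sketch of an ILM policy body mirroring the hot/warm/cold/delete phases
# described above; ages and the rollover size are illustrative defaults.
policy = {
    "policy": {
        "phases": {
            "hot": {"actions": {"rollover": {"max_age": "7d",
                                             "max_primary_shard_size": "50gb"}}},
            "warm": {"min_age": "7d", "actions": {"set_priority": {"priority": 50}}},
            "cold": {"min_age": "30d", "actions": {"set_priority": {"priority": 0}}},
            "delete": {"min_age": "365d", "actions": {"delete": {}}},
        }
    }
}

# JSON body for: PUT _ilm/policy/logs-policy  (policy name is hypothetical)
print(json.dumps(policy, indent=2))
```

Keeping the policy in version control as JSON makes it reproducible across clusters, which the Kibana UI alone does not give you.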
Monitoring and Alerting
Use Kibana's Stack Monitoring feature, or Prometheus + Grafana for external open-source monitoring.
Set up alerts for critical events:
- High CPU usage on log servers
- Log ingestion rate drops below threshold
- Repeated authentication failures
- Unusual spike in error logs
Use Kibana's Alerting > Create Alert to define conditions based on Elasticsearch queries.
Scalability and High Availability
For production environments:
- Deploy Elasticsearch as a cluster with at least 3 master-eligible nodes
- Separate data nodes from ingest and coordinating nodes
- Use dedicated Logstash nodes for heavy filtering
- Deploy Kibana behind a reverse proxy (Nginx) with HTTPS
- Use load balancers for Kibana and Elasticsearch HTTP endpoints
Configure node discovery with discovery.seed_hosts and cluster.initial_master_nodes (Elasticsearch 7.x and later) using static IPs or DNS names.
Backup and Disaster Recovery
Regularly snapshot your Elasticsearch indices to S3, NFS, or object storage:
PUT _snapshot/my_backup
{
"type": "s3",
"settings": {
"bucket": "my-elk-backups",
"region": "us-east-1"
}
}
Take snapshots:
PUT _snapshot/my_backup/snapshot_1
{
"indices": "filebeat-*",
"ignore_unavailable": true,
"include_global_state": false
}
Tools and Resources
Official Documentation
Community and Support
- Elastic Discuss Forum: Active community for troubleshooting
- Elastic GitHub Repositories: Open-source code and issue tracking
- Elastic Learn Platform: Free training modules and certifications
Useful Plugins and Integrations
- Filebeat Modules: Pre-built configurations for Apache, Nginx, MySQL, PostgreSQL, and more
- Logstash Filters: Grok, GeoIP, UserAgent, Mutate, and Ruby for advanced parsing
- Kibana Canvas: Create pixel-perfect visual reports
- Kibana Lens: Drag-and-drop visualization builder
- Elastic APM: Application Performance Monitoring (separate installation)
- Prometheus Exporter for Elasticsearch: For monitoring with Prometheus/Grafana
Containerized Deployments
For modern infrastructure, consider deploying the ELK Stack using Docker Compose or Kubernetes:
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
    volumes:
      - esdata:/usr/share/elasticsearch/data
  kibana:
    image: docker.elastic.co/kibana/kibana:8.12.0
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
  logstash:
    image: docker.elastic.co/logstash/logstash:8.12.0
    ports:
      - "5044:5044"
    volumes:
      - ./logstash-config:/usr/share/logstash/pipeline
    depends_on:
      - elasticsearch
volumes:
  esdata:
Run with:
docker-compose up -d
Cloud Alternatives
If managing infrastructure is not a priority, consider Elastic Cloud (hosted ELK):
- Managed Elasticsearch and Kibana
- Automatic scaling and backups
- Integrated monitoring and alerting
- Pay-as-you-go pricing
Visit elastic.co/cloud to get started.
Real Examples
Example 1: Monitoring a Web Server
Scenario: You manage a fleet of Nginx web servers and need to monitor request rates, error codes, and response times.
Steps:
- Install Filebeat on each Nginx server
- Enable the Nginx module: sudo filebeat modules enable nginx
- Configure Filebeat to point to /var/log/nginx/access.log and /var/log/nginx/error.log
- Send logs to Logstash or directly to Elasticsearch
- In Kibana, create a dashboard with:
- Top 10 client IPs by request count
- HTTP status code distribution (4xx, 5xx)
- Response time percentiles
- Geolocation map of traffic
Result: You detect a sudden spike in 500 errors from a specific region, triggering an investigation into a misconfigured backend service.
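The "status code distribution" panel above is, at its core, a tally of status classes over access-log lines. A minimal Python sketch of that aggregation over nginx combined-format logs (the sample lines are illustrative):

```python
import re
from collections import Counter

# Sketch: tally HTTP status classes from nginx combined-format access logs,
# mirroring the status-code-distribution dashboard panel described above.
LOG_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) ')

lines = [
    '10.0.0.5 - - [04/Mar/2024:10:15:32 +0000] "GET / HTTP/1.1" 200 612 "-" "curl"',
    '10.0.0.6 - - [04/Mar/2024:10:15:33 +0000] "GET /api HTTP/1.1" 500 74 "-" "curl"',
    '10.0.0.5 - - [04/Mar/2024:10:15:34 +0000] "GET /api HTTP/1.1" 500 74 "-" "curl"',
]

classes = Counter(m.group("status")[0] + "xx"
                  for m in map(LOG_RE.match, lines) if m)
print(classes)
```

In the real stack, Elasticsearch's terms aggregation does this at scale; the sketch just shows what the panel is computing.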
Example 2: Security Log Analysis
Scenario: You want to detect brute-force SSH attacks across 50 Linux servers.
Steps:
- Install Filebeat and enable the system module on all servers
- Ensure /var/log/auth.log is being monitored
- In Kibana, create a visualization for failed SSH attempts by source IP
- Create an alert: Trigger if >10 failed login attempts from one IP in 5 minutes
- Use Kibana's Machine Learning to detect anomalies in login patterns
Result: An alert fires for IP 192.168.1.100 with 23 failed attempts in 3 minutes. You block the IP at the firewall and investigate further.
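The alert condition ("more than 10 failed logins from one IP in 5 minutes") is a sliding-window count per source IP. A minimal Python sketch of that logic, with illustrative synthetic events:

```python
from datetime import datetime, timedelta
from collections import defaultdict

# Sketch of the brute-force alert condition above: flag any IP with more
# than `threshold` failed logins inside a sliding time window.
def brute_force_ips(events, threshold=10, window=timedelta(minutes=5)):
    by_ip = defaultdict(list)
    for ts, ip in events:
        by_ip[ip].append(ts)
    flagged = set()
    for ip, times in by_ip.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            while times[end] - times[start] > window:
                start += 1          # shrink window from the left
            if end - start + 1 > threshold:
                flagged.add(ip)
    return flagged

# Synthetic data matching the scenario: 23 failures in ~3 minutes from one IP.
base = datetime(2024, 3, 4, 10, 0)
events = [(base + timedelta(seconds=8 * i), "192.168.1.100") for i in range(23)]
events += [(base, "10.0.0.1"), (base + timedelta(minutes=4), "10.0.0.1")]
print(brute_force_ips(events))
```

Kibana's alerting evaluates an equivalent condition as an Elasticsearch query on a schedule; the sketch only illustrates the detection logic.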
Example 3: Container Log Aggregation
Scenario: You run Docker containers on multiple hosts and need centralized logging.
Steps:
- Install Filebeat on each Docker host
- Configure Filebeat to read from Docker's JSON log files: /var/lib/docker/containers/*/*.log
- Use the add_docker_metadata processor to attach container metadata (name, image, ID)
- In Kibana, create a dashboard showing container logs by service (e.g., web, api, db)
- Set up alerts for container restarts or high memory usage logs
Result: You identify a misbehaving API container that restarts every 10 minutes due to an unhandled exception, and fix it before users are impacted.
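Docker's default json-file logging driver writes one JSON object per line with log, stream, and time fields. A minimal Python sketch of parsing those lines and filtering for stderr output, as the pipeline above would (the sample lines are illustrative):

```python
import json

# Sketch: parse Docker json-file log lines (the format written under
# /var/lib/docker/containers/) and pick out stderr entries.
raw_lines = [
    '{"log":"listening on :8080\\n","stream":"stdout","time":"2024-03-04T10:15:32.123Z"}',
    '{"log":"unhandled exception: KeyError\\n","stream":"stderr","time":"2024-03-04T10:16:01.456Z"}',
]

def stderr_messages(lines):
    """Return the messages written to stderr, trailing newlines stripped."""
    entries = [json.loads(l) for l in lines]
    return [e["log"].rstrip("\n") for e in entries if e["stream"] == "stderr"]

print(stderr_messages(raw_lines))
```

Filebeat's container input does this decoding for you; knowing the underlying format helps when debugging why a field is missing in Kibana.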
FAQs
What is the difference between ELK and EKL?
There is no such thing as EKL. The correct acronym is ELK: Elasticsearch, Logstash, Kibana. With Beats shippers such as Filebeat added, Elastic calls the suite the Elastic Stack; EFK (Elasticsearch, Fluentd, Kibana) refers to a variant in which Fluentd replaces Logstash.
Can I use the ELK Stack without Filebeat?
Yes. Logstash can ingest logs directly from syslog, TCP, UDP, or APIs. However, Filebeat is lightweight, reliable, and optimized for log shipping. It's the recommended choice for most use cases.
How much disk space does ELK Stack require?
It depends on log volume. A small setup (10 GB/day) needs 50-100 GB. Enterprise deployments (100+ GB/day) may require multiple TBs. Use ILM and compression to manage storage.
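A rough capacity estimate follows from daily volume, retention, and replica count. A back-of-envelope helper; the 20% indexing-overhead factor is an assumption, and real ratios depend on mappings and compression:

```python
def storage_gb(daily_gb: float, retention_days: int, replicas: int = 1,
               overhead: float = 1.2) -> float:
    """Rough capacity estimate: daily volume x retention x (primary + replicas),
    plus ~20% indexing overhead (the overhead factor is an assumption)."""
    return daily_gb * retention_days * (1 + replicas) * overhead

# e.g. 10 GB/day kept for 7 days with one replica
print(round(storage_gb(10, 7), 1))
```

Run the numbers against your ILM retention phases, since warm and cold data usually dominate the total.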
Is the ELK Stack free to use?
Yes, the core components (Elasticsearch, Logstash, Kibana, Filebeat) are free to use under the SSPL or Elastic License (source-available rather than OSI-approved open source). However, advanced features like machine learning, alerting, and SAML authentication require a paid subscription (Elastic Platinum or Enterprise).
Can I run ELK Stack on Windows?
Yes. Elastic provides Windows installers for all components. However, Linux is preferred for production due to better performance, stability, and tooling support.
Why is my Kibana dashboard empty?
Common causes:
- Elasticsearch is not running or unreachable
- Index pattern is misconfigured or doesn't match any indices
- Logstash/Filebeat is not sending data
- Time range filter is set too narrowly
Check the Discover tab, verify index pattern, and inspect logs from Filebeat and Logstash.
How do I upgrade the ELK Stack?
Always upgrade one component at a time, following Elastic's upgrade guide. Back up indices first. Ensure compatibility between versions: Elasticsearch 8.x requires Kibana 8.x, and so on.
Can I use ELK Stack with cloud providers like AWS or Azure?
Absolutely. Many organizations deploy ELK on EC2, Azure VMs, or Google Compute Engine. Use cloud storage (S3, Blob Storage) for snapshots and enable VPC peering for secure communication.
What are common performance bottlenecks?
Common issues:
- Insufficient RAM or heap size for Elasticsearch
- Too many shards per index
- Slow disk I/O (use SSDs)
- Overloaded Logstash pipelines with complex filters
- Network latency between components
Monitor using Kibana's Monitoring UI or Prometheus.
Conclusion
Setting up the ELK Stack is a transformative step for any organization serious about observability, security, and operational efficiency. From collecting logs across hundreds of servers to visualizing real-time metrics and detecting anomalies before they escalate, the ELK Stack provides unmatched flexibility and power.
This guide walked you through a complete installation, from Java setup to Kibana dashboards, and emphasized critical best practices around security, scalability, and maintenance. Whether you're monitoring a single application or managing a global infrastructure, the ELK Stack is a foundational tool that scales with your needs.
Remember: A well-configured ELK Stack is not a one-time setup. It requires ongoing tuning, monitoring, and refinement. Start small, validate your data flow, and expand incrementally. Leverage community resources, automate with scripts, and never underestimate the value of clean, structured logs.
With the ELK Stack in place, you're no longer flying blind. You're empowered with visibility, insight, and control, turning chaos into clarity, one log at a time.