How to Configure Fluentd

Nov 6, 2025 - 10:38

Fluentd is an open-source data collector designed to unify logging and monitoring across diverse systems. With its lightweight architecture, plugin-based extensibility, and support for over 800 data sources and destinations, Fluentd has become a cornerstone in modern cloud-native and hybrid infrastructure environments. Whether you're managing microservices on Kubernetes, scaling applications across hybrid clouds, or centralizing logs from legacy systems, Fluentd provides a reliable, scalable, and flexible solution for log aggregation and forwarding.

Configuring Fluentd correctly is critical to ensuring data integrity, minimizing latency, and maintaining system performance. A misconfigured Fluentd instance can lead to log loss, excessive resource consumption, or even service outages. This comprehensive guide walks you through every step of configuring Fluentd, from initial installation to advanced tuning, equipping you with the knowledge to deploy Fluentd confidently in production environments.

Step-by-Step Guide

Step 1: Understand Fluentd's Architecture

Before configuring Fluentd, it's essential to understand its core components. Fluentd operates on a simple yet powerful model: input → filter → output. Data flows through these stages:

  • Input: Sources that collect data (e.g., files, syslog, HTTP, Docker containers).
  • Filter: Optional transformations applied to log records (e.g., parsing JSON, masking sensitive fields, adding metadata).
  • Output: Destinations where data is sent (e.g., Elasticsearch, S3, Kafka, CloudWatch).

Fluentd also supports buffering, which temporarily stores logs during network outages or destination unavailability. This feature ensures no data is lost during transient failures.

Fluentd's configuration file, typically named fluentd.conf, defines how these components are chained together. Understanding this flow is the foundation of effective configuration.
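The input → filter → output flow can be sketched as a toy pipeline. This is purely a conceptual model with hypothetical stage functions, not Fluentd's actual API or implementation:

```python
# Conceptual model of Fluentd's event flow: input -> filter -> output.
# Illustrative only; Fluentd implements this in Ruby with tag-based
# routing, plugins, and buffering between the stages.

def input_stage():
    """Collect raw events (a real input would tail files, read syslog, ...)."""
    return [{"tag": "app.log", "message": "User logged in", "level": "INFO"}]

def filter_stage(event):
    """Transform a record, e.g. enrich it with metadata."""
    return dict(event, hostname="web-01")

def output_stage(event):
    """Deliver the record (a real output would write to ES, S3, ...)."""
    return f'[{event["tag"]}] {event["level"]}: {event["message"]} ({event["hostname"]})'

for ev in input_stage():
    print(output_stage(filter_stage(ev)))
# [app.log] INFO: User logged in (web-01)
```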

Step 2: Install Fluentd

Fluentd can be installed on Linux, macOS, Windows, and within containerized environments. Below are the most common installation methods.

On Ubuntu/Debian

Use the official Fluentd repository to ensure you receive the latest stable version with security updates.

curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-focal-td-agent4.sh | sh

This script installs td-agent, the official Fluentd distribution maintained by Treasure Data, which includes bundled plugins and system service integration.

After installation, verify it's working:

sudo systemctl status td-agent

On CentOS/RHEL

curl -L https://toolbelt.treasuredata.com/sh/install-redhat-8-td-agent4.sh | sh

sudo systemctl status td-agent

Using Docker

For containerized deployments, use the official Fluentd image:

docker run -d --name fluentd -p 24224:24224 -v $(pwd)/fluentd.conf:/etc/fluent/fluent.conf fluent/fluentd:latest

Ensure your configuration file (fluentd.conf) is mounted correctly. This method is ideal for Kubernetes and Docker Compose environments.

Using Ruby Gem (Advanced)

If you need full control over plugin versions or are developing custom plugins, install Fluentd via RubyGems:

gem install fluentd

Then start Fluentd manually:

fluentd -c /path/to/fluentd.conf

Use this method only if you're experienced with Ruby environments and dependency management.

Step 3: Create a Basic Configuration File

Fluentd's configuration file uses a simple, human-readable syntax. Below is a minimal working configuration that reads from a file and outputs to stdout.

<source>
  @type tail
  path /var/log/app.log
  pos_file /var/log/fluentd-app.pos
  tag app.log
  format none
</source>

<match **>
  @type stdout
</match>

Let's break this down:

  • <source> defines the input. @type tail monitors a file for new lines, similar to the Unix tail -f command.
  • path specifies the log file to monitor.
  • pos_file tracks the last read position to avoid duplicate logs after restarts.
  • tag labels the data stream. Tags are used for routing in Fluentd.
  • format none means no parsing is applied; each line is treated as raw text.
  • <match **> captures all tagged data and sends it to @type stdout, which prints to the console.

Save this as fluentd.conf and start Fluentd:

sudo systemctl restart td-agent

Generate test log entries:

echo "2024-06-10T10:00:00Z INFO User logged in" >> /var/log/app.log

Check the Fluentd logs to confirm output:

sudo tail -f /var/log/td-agent/td-agent.log

You should see the log line printed in the Fluentd log output.

Step 4: Parse Structured Logs

Most modern applications output logs in structured formats like JSON. Fluentd can parse these to extract fields for better querying and analysis.

Update your source block to parse JSON:

<source>
  @type tail
  path /var/log/app.log
  pos_file /var/log/fluentd-app.pos
  tag app.log
  format json
  time_key timestamp
  time_format %Y-%m-%dT%H:%M:%S.%NZ
</source>

Now, if your log file contains:

{"timestamp":"2024-06-10T10:00:00.123Z","level":"INFO","message":"User logged in","user_id":12345}

Fluentd will extract timestamp, level, message, and user_id as individual fields. These become available for filtering and routing.

Important: Ensure your JSON logs are valid and consistent. Invalid JSON will cause Fluentd to drop the record. Use tools like jq to validate logs before ingestion.
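Since one malformed line means a dropped record, it pays to check log files before pointing Fluentd at them. Besides jq, a short script works too; this sketch reports which lines of a newline-delimited JSON file fail to parse:

```python
import json

def invalid_json_lines(lines):
    """Return the 1-based line numbers that are not valid JSON records."""
    bad = []
    for n, line in enumerate(lines, start=1):
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad.append(n)
    return bad

# Example with one broken record (in practice, read /var/log/app.log):
sample = [
    '{"timestamp":"2024-06-10T10:00:00.123Z","level":"INFO","message":"ok"}',
    'not json at all',
]
print(invalid_json_lines(sample))  # [2]
```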

Step 5: Use Filters to Transform Data

Filters modify log records before they reach output. Common use cases include adding hostnames, redacting sensitive data, or enriching logs with metadata.

Example: Add server hostname and mask email addresses.

<filter app.log>
  @type record_transformer
  <record>
    hostname ${HOSTNAME}
  </record>
</filter>

<filter app.log>
  @type grep
  <exclude>
    key message
    pattern \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
  </exclude>
</filter>

The first filter adds a hostname field using the system's hostname. The second uses grep to drop any record whose message contains an email address. Note: The grep filter excludes whole records; it cannot rewrite a field. For masking rather than dropping, use record_transformer with regex substitution.

For masking emails safely:

<filter app.log>
  @type record_transformer
  enable_ruby true
  <record>
    message ${record["message"].gsub(/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/, "[REDACTED_EMAIL]")}
  </record>
</filter>

Filters are processed in order. Place them logically: parse first, then enrich, then sanitize.
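The masking substitution above runs Ruby inside Fluentd, so it is worth sanity-checking the regex before deploying it. The same pattern can be exercised outside Fluentd, for example in Python (illustrative only; Fluentd itself evaluates the Ruby gsub):

```python
import re

# Same email pattern as the record_transformer filter, translated to
# Python regex syntax for offline testing.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")

def mask_emails(message: str) -> str:
    """Replace every email address in the message with a placeholder."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", message)

print(mask_emails("User alice@example.com logged in"))
# User [REDACTED_EMAIL] logged in
```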

Step 6: Configure Multiple Outputs

Fluentd can send the same log data to multiple destinations simultaneously. This is useful for redundancy, compliance, or analytics.

<match app.log>
  @type copy
  <store>
    @type elasticsearch
    host localhost
    port 9200
    index_name fluentd-app
    type_name _doc
    flush_interval 5s
  </store>
  <store>
    @type s3
    aws_key_id YOUR_AWS_KEY
    aws_sec_key YOUR_AWS_SECRET
    s3_bucket your-logs-bucket
    path logs/app/
    s3_region us-east-1
    buffer_path /var/log/fluentd-s3
    time_slice_format %Y%m%d
    time_slice_wait 10m
    buffer_chunk_limit 256m
  </store>
</match>

Here, logs are sent to both Elasticsearch (for real-time search) and S3 (for long-term archival). The @type copy directive enables multi-output routing.

For high availability, use @type forward to send logs to multiple Fluentd instances:

<match app.log>
  @type forward
  <server>
    host fluentd-primary.example.com
    port 24224
  </server>
  <server>
    host fluentd-backup.example.com
    port 24224
  </server>
  heartbeat_type tcp
  heartbeat_interval 10s
</match>

Fluentd will automatically fail over if the primary server becomes unreachable.

Step 7: Configure Buffering for Reliability

Buffering is Fluentd's safety net. It ensures logs aren't lost during network issues or destination downtime.

Every output plugin supports buffering. Here's a robust buffer configuration for production:

<match app.log>
  @type elasticsearch
  host elasticsearch.example.com
  port 9200
  index_name fluentd-app-${tag}
  flush_interval 10s
  buffer_type file
  buffer_path /var/log/fluentd-buffers/app
  buffer_queue_limit 256
  buffer_chunk_limit 8m
  flush_thread_count 4
  retry_max_times 10
  retry_wait 10s
  max_retry_wait 60s
  disable_retry_limit false
</match>

Key parameters:

  • buffer_type file: Stores data on disk (recommended for production).
  • buffer_queue_limit: Maximum number of chunks allowed in the queue; when exceeded, Fluentd blocks or rejects new data.
  • buffer_chunk_limit: Max size per chunk (8MB is safe for most systems).
  • flush_thread_count: Number of threads to flush buffers (increase for high throughput).
  • retry_max_times and retry_wait: Control how often Fluentd retries failed deliveries.

Monitor buffer usage:

curl http://localhost:24220/api/plugins.json

This API endpoint returns real-time buffer metrics, including queue depth and retry counts.
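The JSON this endpoint returns can be post-processed to watch queue depth and retries over time. The sketch below assumes the monitor_agent field names buffer_queue_length, buffer_total_queued_size, and retry_count; verify them against your version's actual output before alerting on them:

```python
import json

def buffer_summary(payload: dict) -> list:
    """Extract buffer metrics from a monitor_agent plugins.json payload.
    Field names assumed from monitor_agent output; adjust if yours differ."""
    rows = []
    for plugin in payload.get("plugins", []):
        if "buffer_queue_length" in plugin:
            rows.append({
                "plugin": plugin.get("type"),
                "queued_chunks": plugin["buffer_queue_length"],
                "queued_bytes": plugin.get("buffer_total_queued_size", 0),
                "retries": plugin.get("retry_count", 0),
            })
    return rows

# Abbreviated example payload with hypothetical values:
sample = json.loads("""
{"plugins": [
  {"plugin_id": "out_es", "type": "elasticsearch",
   "buffer_queue_length": 3, "buffer_total_queued_size": 120000,
   "retry_count": 1}
]}
""")
for row in buffer_summary(sample):
    print(row)
# {'plugin': 'elasticsearch', 'queued_chunks': 3, 'queued_bytes': 120000, 'retries': 1}
```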

Step 8: Secure Fluentd with Authentication and TLS

Never expose Fluentd to the public internet. Use TLS and authentication for internal communication.

Enable TLS for Forward Input

Configure Fluentd to accept encrypted connections:

<source>
  @type forward
  port 24224
  bind 0.0.0.0
  <transport tls>
    cert_path /etc/fluent/cert.pem
    private_key_path /etc/fluent/key.pem
    ca_cert_path /etc/fluent/ca-cert.pem
    verify_mode peer
  </transport>
</source>

Generate certificates using OpenSSL:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout key.pem -out cert.pem

On the client side (e.g., another Fluentd instance), configure the output to use TLS:

<match app.log>
  @type forward
  <server>
    host fluentd-server.example.com
    port 24224
    <transport tls>
      cert_path /etc/fluent/client-cert.pem
      private_key_path /etc/fluent/client-key.pem
      ca_cert_path /etc/fluent/ca-cert.pem
    </transport>
  </server>
</match>

Use Authentication (Optional)

For added security, enable Fluentd's built-in shared-key authentication for the forward protocol:

<source>
  @type forward
  port 24224
  <transport tls>
    cert_path /etc/fluent/cert.pem
    private_key_path /etc/fluent/key.pem
  </transport>
  <security>
    self_hostname fluentd-server.example.com
    shared_key your-super-secret-password
  </security>
</source>

The client must present the same shared key:

<match app.log>
  @type forward
  <server>
    host fluentd-server.example.com
    port 24224
    <transport tls>
      cert_path /etc/fluent/client-cert.pem
      private_key_path /etc/fluent/client-key.pem
    </transport>
    <security>
      shared_key your-super-secret-password
    </security>
  </server>
</match>

Step 9: Monitor and Log Fluentd's Own Health

Fluentd should monitor itself. Enable internal metrics and expose them via HTTP.

<system>
  log_level info
  root_dir /var/lib/td-agent
</system>

<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
</source>

Access metrics at:

http://your-fluentd-host:24220/api/plugins.json

This endpoint returns JSON with buffer usage, throughput, error rates, and plugin status. Integrate this into your monitoring stack (e.g., Prometheus + Grafana) using the fluentd-plugin-prometheus plugin.

Step 10: Restart and Validate Configuration

After making changes, always validate the configuration before restarting:

sudo td-agent -c /etc/fluent/fluent.conf --dry-run

If the output says Configuration is valid, proceed to restart:

sudo systemctl restart td-agent

Monitor logs for errors:

sudo journalctl -u td-agent -f

Test data flow with real logs and verify output destinations are receiving data.

Best Practices

1. Use Tags to Organize Log Streams

Tags are Fluentd's routing keys. Structure them hierarchically: app.service.component. For example:

  • web.nginx.access
  • api.auth.service
  • db.postgresql.log

This enables precise filtering, routing, and indexing in downstream systems like Elasticsearch or BigQuery.

2. Avoid Using Wildcard Matches in Output

While <match **> captures everything, it makes debugging and routing difficult. Always use explicit tags or regex patterns like <match app.*> to ensure predictable behavior.
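Fluentd's match patterns treat * as exactly one dot-separated tag part and ** as zero or more parts, which is why app.* matches app.log but not app.log.extra. A rough sketch of that matching logic (illustrative only, not Fluentd's actual matcher):

```python
def tag_matches(pattern: str, tag: str) -> bool:
    """Rough sketch of Fluentd tag matching: '*' matches exactly one
    dot-separated part, '**' matches zero or more parts."""
    def match(p, t):
        if not p:
            return not t
        head, rest = p[0], p[1:]
        if head == "**":
            # '**' absorbs zero or more tag parts
            return any(match(rest, t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        if head == "*" or head == t[0]:
            return match(rest, t[1:])
        return False
    return match(pattern.split("."), tag.split("."))

print(tag_matches("app.*", "app.log"))         # True
print(tag_matches("app.*", "app.log.extra"))   # False
print(tag_matches("app.**", "app.log.extra"))  # True
```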

3. Separate Logs by Severity or Type

Route error logs to a high-priority destination (e.g., PagerDuty-integrated system), and info/debug logs to archival storage. Use filters to classify logs by level:

<filter app.log>
  @type record_transformer
  enable_ruby true
  <record>
    severity ${record["level"].upcase}
  </record>
</filter>

<match app.log>
  @type rewrite_tag_filter
  <rule>
    key severity
    pattern /^ERROR$/
    tag app.errors
  </rule>
  <rule>
    key severity
    pattern /.+/
    tag app.other
  </rule>
</match>

<match app.errors>
  @type elasticsearch
  index_name fluentd-errors
  <buffer>
    @type file
    path /var/log/fluentd-buffers/errors
  </buffer>
</match>

<match app.other>
  @type s3
  s3_bucket your-logs-bucket
  path logs/info/
</match>

Note: <match> blocks route by tag, not by record fields, so field-based routing requires retagging. The fluent-plugin-rewrite_tag_filter plugin re-emits each record under a new tag derived from a field value; the downstream <match> blocks then route errors and everything else to separate destinations.

4. Optimize Buffer Settings for Your Workload

High-throughput environments (e.g., 10K+ logs/sec) require larger buffers and more flush threads. Monitor buffer queue depth and adjust:

  • Set buffer_chunk_limit to 8–16MB.
  • Use buffer_type file (not memory) for persistence.
  • Set flush_thread_count to 4–8 on multi-core systems.
  • Use retry_wait with exponential backoff (e.g., 10s, 20s, 40s).
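Fluentd doubles the retry interval on each failed flush, capped at max_retry_wait. Computing the schedule ahead of time tells you how long an outage your buffers must absorb before retries are exhausted; this sketch uses the retry_wait 10s / max_retry_wait 60s values from the buffering example above:

```python
def retry_schedule(retry_wait, max_retry_wait, retries):
    """Exponential backoff schedule: each wait doubles the previous one,
    capped at max_retry_wait (mirrors Fluentd's default retry behavior)."""
    waits = []
    wait = retry_wait
    for _ in range(retries):
        waits.append(min(wait, max_retry_wait))
        wait *= 2
    return waits

# With retry_wait 10s, max_retry_wait 60s, and retry_max_times 10:
print(retry_schedule(10, 60, 10))
# [10, 20, 40, 60, 60, 60, 60, 60, 60, 60]
print(sum(retry_schedule(10, 60, 10)))  # 490 seconds until retries are exhausted
```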

5. Use External Configuration Management

Manage Fluentd configurations via tools like Ansible, Puppet, or GitOps (FluxCD). Store templates in version control and deploy using automated pipelines. This ensures consistency across hundreds of nodes.

6. Limit Plugin Usage to What's Necessary

Each plugin consumes memory and CPU. Avoid installing plugins you don't use. For example, if you're not sending logs to Splunk, don't install the fluent-plugin-splunk gem.

7. Regularly Rotate and Clean Buffer Files

Buffer files grow over time, especially while a destination is unreachable. Set up rotation for /var/log/fluentd-buffers/ using logrotate, taking care not to touch chunks Fluentd is still actively writing:

/var/log/fluentd-buffers/* {
  daily
  rotate 7
  compress
  missingok
  notifempty
  create 0644 td-agent td-agent
}

8. Test Configuration Changes in Staging First

Always validate configuration changes in a non-production environment. Use tools like fluentd -c config.conf --dry-run and simulate traffic with curl or fluent-cat:

echo '{"message":"test"}' | fluent-cat app.log

9. Document Your Fluentd Setup

Create a runbook including:

  • Configuration file structure
  • Tagging conventions
  • Buffer thresholds and alerting rules
  • How to restart Fluentd without downtime
  • Common error codes and resolutions

10. Integrate with Observability Tools

Connect Fluentd to Prometheus for metrics, Grafana for dashboards, and alerting systems like Alertmanager. Use the fluentd-plugin-prometheus plugin to expose internal metrics:

<source>
  @type prometheus
  port 24231
</source>

<source>
  @type prometheus_output_monitor
</source>

Then scrape metrics from http://fluentd-host:24231/metrics.

Tools and Resources

Official Documentation

The official Fluentd documentation at https://docs.fluentd.org is the most authoritative source for configuration syntax, plugin references, and architecture guides.

Fluentd Plugin Registry

Explore over 800 plugins at https://rubygems.org/search?query=fluentd. Popular plugins include:

  • fluent-plugin-elasticsearch: Send logs to Elasticsearch/OpenSearch
  • fluent-plugin-s3: Archive logs to AWS S3
  • fluent-plugin-kafka: Stream logs to Apache Kafka
  • fluent-plugin-docker_metadata_filter: Extract Docker container metadata
  • fluent-plugin-prometheus: Expose metrics for monitoring
  • fluent-plugin-aws-cloudwatch-logs: Send logs to AWS CloudWatch

Fluent Bit (Lightweight Alternative)

For resource-constrained environments (e.g., edge devices, IoT), consider Fluent Bit, a faster, memory-efficient cousin of Fluentd. It covers the most common inputs and outputs with its own lightweight plugin set and integrates seamlessly with Fluentd via the forward protocol.

Containerized Deployments

For Kubernetes, use the official Fluentd DaemonSet template. It automatically collects logs from Docker and containerd runtimes.

Validation and Debugging Tools

  • fluent-cat: Send test messages to Fluentd
  • fluentd -c config.conf --dry-run: Validate syntax
  • curl http://localhost:24220/api/plugins.json: Monitor buffer and plugin status
  • jq: Parse and validate JSON logs
  • tail -f /var/log/td-agent/td-agent.log: Monitor Fluentd's own logs

Community and Support

Join the Fluentd Slack community and GitHub discussions. The Fluentd project is actively maintained by the Cloud Native Computing Foundation (CNCF) and has a vibrant contributor base.

Monitoring and Alerting

Integrate Fluentd with:

  • Prometheus + Grafana: For metrics visualization
  • ELK Stack: For log search and analysis
  • Datadog: For unified observability
  • Sumo Logic: For cloud-native log analytics

Real Examples

Example 1: Centralized Logging for a Microservice Architecture

Scenario: You have 15 microservices running in Kubernetes, each outputting JSON logs to stdout. You want to collect, parse, enrich, and send them to Elasticsearch and S3.

Configuration:

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  format json
  time_key time
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  read_from_head true
</source>

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  <record>
    service_name ${record["kubernetes"]["labels"]["app"]}
    namespace ${record["kubernetes"]["namespace_name"]}
  </record>
</filter>

<match kubernetes.**>
  @type copy
  <store>
    @type elasticsearch
    host elasticsearch.logging.svc.cluster.local
    port 9200
    index_name k8s-logs-${record["service_name"]}
    type_name _doc
    flush_interval 10s
    buffer_type file
    buffer_path /var/log/fluentd-buffers/k8s
    buffer_chunk_limit 8m
    buffer_queue_limit 128
    retry_max_times 10
    retry_wait 10s
  </store>
  <store>
    @type s3
    aws_key_id YOUR_KEY
    aws_sec_key YOUR_SECRET
    s3_bucket your-k8s-logs-bucket
    path logs/k8s/${record["namespace_name"]}/${record["service_name"]}/
    s3_region us-east-1
    buffer_path /var/log/fluentd-buffers/s3
    time_slice_format %Y/%m/%d/%H
    time_slice_wait 10m
    buffer_chunk_limit 256m
  </store>
</match>

This configuration automatically detects container logs, enriches them with Kubernetes metadata, tags them by service and namespace, and routes them to both Elasticsearch (for real-time search) and S3 (for compliance).

Example 2: Legacy System Log Forwarding

Scenario: You have an old Linux server running a proprietary application that writes logs to /var/log/legacy/app.log in a custom format: [TIMESTAMP] [LEVEL] MESSAGE.

Configuration:

<source>
  @type tail
  path /var/log/legacy/app.log
  pos_file /var/log/fluentd-legacy.pos
  tag legacy.app
  format /^\[(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] \[(?<level>[A-Z]+)\] (?<message>.*)$/
  time_format %Y-%m-%d %H:%M:%S
</source>

<filter legacy.app>
  @type record_transformer
  <record>
    source "legacy-server-01"
  </record>
</filter>

<match legacy.app>
  @type forward
  <server>
    host fluentd-central.example.com
    port 24224
    <transport tls>
      cert_path /etc/fluent/client-cert.pem
      private_key_path /etc/fluent/client-key.pem
      ca_cert_path /etc/fluent/ca-cert.pem
    </transport>
  </server>
</match>

This uses a regex parser to extract timestamp, level, and message from unstructured logs, adds source metadata, and forwards securely to a central Fluentd collector.
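Before wiring a pattern like this into Fluentd, test it against sample lines. A regex for the [TIMESTAMP] [LEVEL] MESSAGE layout described above can be exercised in Python (note that Python spells named groups (?P<name>...) where Ruby uses (?<name>...)):

```python
import re

# Named capture groups time, level, message, mirroring the tail source's
# format regex for the [TIMESTAMP] [LEVEL] MESSAGE layout.
LEGACY_RE = re.compile(
    r"^\[(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"\[(?P<level>[A-Z]+)\] (?P<message>.*)$"
)

m = LEGACY_RE.match("[2024-06-10 10:00:00] [ERROR] Disk quota exceeded")
print(m.groupdict())
# {'time': '2024-06-10 10:00:00', 'level': 'ERROR', 'message': 'Disk quota exceeded'}
```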

Example 3: Docker Container Logging with Fluentd

Scenario: You're running Docker containers and want to collect logs using Fluentd instead of Docker's default JSON-file driver.

Run containers with Fluentd log driver:

docker run -d \
  --name myapp \
  --log-driver=fluentd \
  --log-opt fluentd-address=localhost:24224 \
  --log-opt tag=docker.myapp \
  my-image

Fluentd configuration:

<source>
  @type forward
  port 24224
</source>

<match docker.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  index_name docker-logs
  type_name _doc
  flush_interval 5s
</match>

Fluentd automatically receives logs from Docker and forwards them to Elasticsearch. The tag docker.myapp enables routing by container name.

FAQs

1. What's the difference between Fluentd and Fluent Bit?

Fluentd is a full-featured, Ruby-based log collector with extensive plugin support and complex routing. Fluent Bit is a lightweight, C-based alternative optimized for performance and low memory usage. Use Fluent Bit for edge devices or Kubernetes nodes; use Fluentd for centralized aggregation and advanced processing.

2. How do I prevent log loss in Fluentd?

Use file-based buffering, set appropriate buffer_queue_limit and buffer_chunk_limit, enable retry logic, and monitor buffer metrics. Never use memory-only buffering in production.

3. Can Fluentd handle high-throughput logging (10K+ logs/sec)?

Yes. With proper tuning (multiple flush threads, larger buffer chunks, and optimized output plugins), Fluentd can handle tens of thousands of events per second on modern hardware.

4. How do I parse non-JSON logs in Fluentd?

Use the regexp format type with a custom pattern built from named capture groups. For example: format /^(?<time>[^ ]+) (?<level>[A-Z]+) (?<message>.*)$/ extracts a timestamp, level, and message from space-delimited logs.

5. Why are my logs not appearing in Elasticsearch?

Check: (1) Fluentd's own logs for errors, (2) Elasticsearch connectivity, (3) buffer status via curl http://localhost:24220/api/plugins.json, (4) index permissions, and (5) whether the tag matches your <match> directive.

6. How do I update Fluentd plugins without downtime?

Fluentd does not support hot-reloading. Plan maintenance windows. Use a rolling update strategy in Kubernetes: deploy new Fluentd pods with updated configs, drain old ones, then terminate.

7. Is Fluentd secure by default?

No. Fluentd listens on unencrypted ports by default. Always enable TLS for network inputs and use authentication in multi-tenant environments.

8. How do I test my Fluentd configuration without affecting production?

Use fluentd -c config.conf --dry-run to validate syntax. Use fluent-cat to inject test logs. Deploy to a staging environment with identical infrastructure before rolling out.

9. Can Fluentd forward logs to multiple cloud providers?

Yes. Use the @type copy directive to send the same logs to AWS CloudWatch, Google Cloud Logging, and Azure Monitor simultaneously.

10. What should I do if Fluentd consumes too much memory?

Reduce buffer_queue_limit, decrease flush_thread_count, disable unused plugins, and monitor buffer usage. Consider switching to Fluent Bit for high-density deployments.

Conclusion

Configuring Fluentd is not merely a technical task; it's a strategic decision that impacts the reliability, scalability, and observability of your entire infrastructure. From parsing unstructured logs to securely forwarding data across hybrid clouds, Fluentd provides the flexibility to meet virtually any logging requirement.

This guide has walked you through the full lifecycle of Fluentd configuration: from installation and basic syntax to advanced buffering, security, and real-world use cases. You've learned how to structure logs with tags, transform data with filters, ensure durability with buffers, and integrate with modern observability tools.

Remember: Fluentd's power lies in its simplicity and extensibility. Start small: collect logs from one service, validate the flow, then scale. Document every change. Monitor relentlessly. Test before you deploy.

As cloud-native architectures continue to evolve, Fluentd remains a foundational tool for centralized logging. Whether you're managing a dozen containers or thousands of microservices, a well-configured Fluentd instance is your key to visibility, control, and resilience.

Now that you understand how to configure Fluentd, take the next step: automate your deployment, integrate it with your CI/CD pipeline, and make logging a first-class citizen in your DevOps workflow.