How to Use Elasticsearch Query

Nov 6, 2025 - 10:44

Elasticsearch is a powerful, distributed search and analytics engine built on Apache Lucene. It enables real-time search and analysis of large volumes of data with remarkable speed and scalability. Whether you're building a product search system, log analytics platform, or monitoring dashboard, mastering Elasticsearch queries is essential to unlocking its full potential. Unlike traditional relational databases that rely on structured SQL queries, Elasticsearch uses a flexible, JSON-based query language that supports full-text search, filtering, aggregations, and complex boolean logic, all optimized for modern data-driven applications.

The ability to construct effective Elasticsearch queries allows developers and data engineers to retrieve precise results from massive datasets with minimal latency. From simple keyword searches to multi-layered nested aggregations, Elasticsearch queries provide granular control over how data is indexed, searched, and analyzed. This tutorial provides a comprehensive, step-by-step guide to understanding and implementing Elasticsearch queries, covering best practices, real-world examples, essential tools, and common pitfalls to avoid.

Step-by-Step Guide

Setting Up Your Elasticsearch Environment

Before writing queries, you need a running Elasticsearch instance. The easiest way to get started is by using Docker. Run the following command to launch a single-node cluster (xpack.security.enabled=false allows plain HTTP access for local experimentation; keep security enabled in production):

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.12.0

Once Elasticsearch is running, verify its status by accessing http://localhost:9200 in your browser or via curl:

curl -X GET "localhost:9200"

You should receive a JSON response containing cluster name, version, and node information. This confirms your environment is ready.

Creating an Index and Mapping

In Elasticsearch, data is stored in indices, similar to tables in relational databases. However, unlike SQL tables, Elasticsearch indices are schema-flexible by default. Still, defining explicit mappings improves performance and ensures data consistency.

Let's create an index named products with a structured mapping:

PUT /products
{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "description": { "type": "text" },
      "price": { "type": "float" },
      "category": { "type": "keyword" },
      "in_stock": { "type": "boolean" },
      "created_at": { "type": "date", "format": "yyyy-MM-dd HH:mm:ss" }
    }
  }
}

Here, text fields are analyzed for full-text search, while keyword fields are used for exact matches and aggregations. The date type ensures proper temporal sorting and filtering.

Indexing Sample Data

Now, insert some sample documents into the products index:

POST /products/_bulk
{"index":{"_id":"1"}}
{"name":"Wireless Headphones","description":"Noise-cancelling over-ear headphones with 30-hour battery","price":199.99,"category":"Electronics","in_stock":true,"created_at":"2024-01-15 10:30:00"}
{"index":{"_id":"2"}}
{"name":"Organic Cotton T-Shirt","description":"100% organic cotton, unisex fit","price":29.99,"category":"Clothing","in_stock":true,"created_at":"2024-01-16 14:22:00"}
{"index":{"_id":"3"}}
{"name":"Smart Watch","description":"Heart rate monitor, GPS, water resistant","price":249.99,"category":"Electronics","in_stock":false,"created_at":"2024-01-14 09:15:00"}
{"index":{"_id":"4"}}
{"name":"Yoga Mat","description":"Non-slip, eco-friendly, 6mm thickness","price":45.50,"category":"Sports","in_stock":true,"created_at":"2024-01-17 11:05:00"}
{"index":{"_id":"5"}}
{"name":"Coffee Grinder","description":"Burr grinder with 15 grind settings","price":89.99,"category":"Kitchen","in_stock":true,"created_at":"2024-01-12 16:40:00"}

Using the _bulk endpoint is efficient for loading multiple documents. Each document is indexed with a unique ID, allowing for targeted retrieval and updates later.
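The bulk format is easy to generate programmatically. Below is an illustrative Python sketch that builds the NDJSON body from a list of documents (the build_bulk_body helper is hypothetical, and actually sending the payload would require an HTTP client or the official elasticsearch package):

```python
import json

def build_bulk_body(index, docs):
    """Build an NDJSON _bulk body: one action line followed by one
    source line per document; the body must end with a newline."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

body = build_bulk_body("products", [
    ("1", {"name": "Wireless Headphones", "price": 199.99}),
    ("2", {"name": "Yoga Mat", "price": 45.50}),
])
print(body.count("\n"))  # 4: two action lines plus two source lines
```

Note the trailing newline; the _bulk endpoint rejects bodies that omit it.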

Basic Search Queries

The most common Elasticsearch query is the match query, used for full-text search across analyzed fields:

GET /products/_search
{
  "query": {
    "match": {
      "name": "headphones"
    }
  }
}

This returns all documents where the name field contains the term headphones, regardless of case or word order. Elasticsearch uses the standard analyzer to tokenize and lowercase text, making searches case-insensitive; stemming requires a language-specific analyzer such as english.

To search across multiple fields, use multi_match:

GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "organic cotton",
      "fields": ["name", "description"]
    }
  }
}

This finds documents where either the name or description contains organic or cotton.

Filtering with Term and Range Queries

While match is great for text, use term for exact matches on keyword fields:

GET /products/_search
{
  "query": {
    "term": {
      "category": "Electronics"
    }
  }
}

Unlike match, term does not analyze the input; it looks for the exact term as stored. This makes it ideal for filtering by categories, tags, or IDs.
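To see why the two behave differently, here is a rough Python approximation of what the standard analyzer does to input text (a simplification; the real analyzer uses Unicode segmentation rules):

```python
import re

def analyze_standard(text):
    """Very rough sketch of the standard analyzer: split on
    non-word characters and lowercase each token (no stemming)."""
    return [t.lower() for t in re.findall(r"\w+", text)]

# A match query compares analyzed input against analyzed field tokens:
print(analyze_standard("Electronics"))  # ['electronics']
# A term query skips this step entirely and compares the literal string
# "Electronics" against the stored keyword value.
```

This is why a term query for "Electronics" fails against a text field: the index holds the lowercased token, not the original string.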

To filter by numeric or date ranges, use range:

GET /products/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 50,
        "lte": 200
      }
    }
  }
}

This returns products priced between $50 and $200. You can also use gt (greater than), lt (less than), and combine with bool queries for complex logic.
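Conceptually, the range filter behaves like this Python comprehension applied to the sample documents indexed earlier (a client-side illustration only; the server evaluates the range against the index):

```python
docs = [
    {"name": "Wireless Headphones", "price": 199.99},
    {"name": "Organic Cotton T-Shirt", "price": 29.99},
    {"name": "Smart Watch", "price": 249.99},
    {"name": "Yoga Mat", "price": 45.50},
    {"name": "Coffee Grinder", "price": 89.99},
]

# gte/lte translate to inclusive bounds:
in_range = [d["name"] for d in docs if 50 <= d["price"] <= 200]
print(in_range)  # ['Wireless Headphones', 'Coffee Grinder']
```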

Combining Queries with Bool Queries

The bool query allows you to combine multiple queries using must, should, must_not, and filter clauses:

GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "cotton" } }
      ],
      "filter": [
        { "term": { "category": "Clothing" } },
        { "range": { "price": { "lt": 50 } } }
      ],
      "must_not": [
        { "term": { "in_stock": false } }
      ]
    }
  }
}

In this example:

  • must: The product name must contain cotton (relevance scoring applies).
  • filter: The category must be Clothing and the price less than $50 (no scoring, used for performance).
  • must_not: Exclude out-of-stock items.

Using filter instead of must for non-scoring conditions improves performance because Elasticsearch caches filtered results.
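When building bool queries from application code, a small helper that assembles the clause lists keeps call sites readable. A Python sketch (the bool_query helper is hypothetical, not part of any client library):

```python
def bool_query(must=None, filter_=None, must_not=None, should=None):
    """Assemble a bool query body, omitting empty clause lists."""
    clauses = {"must": must, "filter": filter_,
               "must_not": must_not, "should": should}
    return {"query": {"bool": {k: v for k, v in clauses.items() if v}}}

q = bool_query(
    must=[{"match": {"name": "cotton"}}],
    filter_=[{"term": {"category": "Clothing"}},
             {"range": {"price": {"lt": 50}}}],
    must_not=[{"term": {"in_stock": False}}],
)
print(sorted(q["query"]["bool"]))  # ['filter', 'must', 'must_not']
```

Omitting empty clause lists keeps the generated JSON minimal and avoids sending no-op clauses to the cluster.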

Sorting and Pagination

Elasticsearch allows sorting by any field, including nested or computed values:

GET /products/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "price": { "order": "asc" } }
  ],
  "from": 0,
  "size": 5
}

This returns the 5 cheapest products. The from and size parameters control pagination. For deep pagination (e.g., page 1000), consider using search_after instead of from for better performance:

GET /products/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "price": { "order": "asc" } }
  ],
  "size": 5,
  "search_after": [45.5]
}

search_after uses the last sort value from the previous page to fetch the next set, avoiding the performance penalty of skipping thousands of results.
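The search_after loop is mechanical enough to wrap in a generator. Below is a Python sketch with a stubbed-out search function standing in for a real client call (fake_search and paginate are both hypothetical):

```python
def paginate(search_fn, size):
    """Yield every hit by repeatedly passing the last sort value
    back as search_after (search_fn stands in for a real client)."""
    cursor = None
    while True:
        hits = search_fn(size=size, search_after=cursor)
        if not hits:
            return
        yield from hits
        cursor = hits[-1]["sort"]

# A stub "index" already sorted by price, mimicking sorted responses.
DOCS = [{"id": i, "sort": [p]}
        for i, p in enumerate([29.99, 45.50, 89.99, 199.99, 249.99])]

def fake_search(size, search_after):
    start = 0
    if search_after is not None:
        start = next(i for i, d in enumerate(DOCS)
                     if d["sort"] == search_after) + 1
    return DOCS[start:start + size]

all_ids = [h["id"] for h in paginate(fake_search, size=2)]
print(all_ids)  # [0, 1, 2, 3, 4]
```

For stable cursoring in practice, include a unique tiebreaker field (such as _id or a dedicated sequence number) in the sort.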

Aggregations for Data Analysis

Aggregations are Elasticsearch's most powerful feature for analytics. They allow you to group data and compute metrics like counts, averages, and percentiles.

Let's group products by category and count them:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": { "field": "category" }
    }
  }
}

Setting size to 0 suppresses the document hits, returning only the aggregation. The output shows each category and the number of products in it.

To calculate average price per category:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": { "field": "category" },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}

This creates a nested aggregation: first group by category, then compute the average price within each group.
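To make the semantics concrete, the following Python snippet computes the same result client-side for a handful of documents (the server does this far more efficiently over doc_values):

```python
from collections import defaultdict

docs = [
    {"category": "Electronics", "price": 199.99},
    {"category": "Electronics", "price": 249.99},
    {"category": "Clothing", "price": 29.99},
]

# terms bucket: group by category; avg sub-aggregation: mean per bucket.
buckets = defaultdict(list)
for d in docs:
    buckets[d["category"]].append(d["price"])
avg_price = {cat: round(sum(p) / len(p), 2) for cat, p in buckets.items()}
print(avg_price)  # {'Electronics': 224.99, 'Clothing': 29.99}
```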

You can also use bucket aggregations like date_histogram for time-based analysis:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "products_by_month": {
      "date_histogram": {
        "field": "created_at",
        "calendar_interval": "month"
      }
    }
  }
}

This returns the number of products added each month, ideal for trend analysis.

Using Highlighting for Search Results

When users perform searches, highlighting matched terms improves UX. Use the highlight parameter:

GET /products/_search
{
  "query": {
    "match": {
      "description": "noise-cancelling"
    }
  },
  "highlight": {
    "fields": {
      "description": {}
    }
  }
}

The response includes a highlight section with <em> tags around matched terms:

"highlight": {
  "description": [
    "Noise-<em>cancelling</em> over-ear headphones with 30-hour battery"
  ]
}

You can customize the pre/post highlight tags and the fragment size for better integration with your frontend.

Using Script Fields for Dynamic Calculations

Script fields allow you to compute values on the fly during query execution:

GET /products/_search
{
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "price_with_tax": {
      "script": {
        "source": "doc['price'].value * 1.08"
      }
    }
  }
}

This adds a computed field price_with_tax that multiplies each product's price by 1.08 (8% tax). Scripts are written in Painless, Elasticsearch's secure scripting language.
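For reference, the Painless expression is doing no more than this Python equivalent per document (the 8% rate is just the example's constant):

```python
TAX_RATE = 1.08  # the 1.08 multiplier from the Painless script (8% tax)

def price_with_tax(doc):
    """Client-side equivalent of doc['price'].value * 1.08."""
    return round(doc["price"] * TAX_RATE, 2)

print(price_with_tax({"price": 100.0}))  # 108.0
```

If the computed value is needed in every query, consider indexing it at write time instead; scripts run per document per search and add latency.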

Best Practices

Use Keyword Fields for Exact Matching

Always use the keyword type for fields used in filters, aggregations, or sorts. Text fields are analyzed and split into tokens, making them unsuitable for exact matches. For example, filtering by category: "Electronics" will fail if category is mapped as text, because the analyzer may convert it to lowercase or split it.

Prefer Filter Context Over Query Context

Use filter clauses in bool queries for conditions that don't affect relevance scoring. Filters are cached and executed faster than scoring queries. For example, filtering by date range or status should always go in the filter section, not must.

Limit Result Size and Use Pagination Wisely

Avoid using from and size for deep pagination. For large datasets, use the search_after or scroll APIs. Also, always set a reasonable size limit (e.g., 10 to 100) unless you need all results.

Optimize Index Mapping

Define mappings explicitly instead of relying on dynamic mapping. Disable dynamic fields if possible:

"dynamic": "strict"

This prevents accidental field creation and improves cluster stability.

Use Index Templates for Consistency

Create index templates to automatically apply mappings, settings, and aliases to new indices:

PUT _index_template/products_template
{
  "index_patterns": ["products-*"],
  "template": {
    "mappings": {
      "properties": {
        "name": { "type": "text" },
        "category": { "type": "keyword" },
        "price": { "type": "float" }
      }
    }
  }
}

This ensures all future indices matching products-* have consistent structure.

Monitor Query Performance with Profile API

To debug slow queries, use the profile parameter:

GET /products/_search
{
  "profile": true,
  "query": {
    "match": {
      "name": "headphones"
    }
  }
}

The response includes detailed timing for each query phase, helping you identify bottlenecks.

Use Aliases for Zero-Downtime Index Management

When reindexing data, use index aliases to switch between versions without changing application code:

PUT /products_v2
{ ... }

POST /_aliases
{
  "actions": [
    { "remove": { "index": "products", "alias": "products_current" } },
    { "add": { "index": "products_v2", "alias": "products_current" } }
  ]
}

Applications query products_current; you can swap the underlying index without disruption.

Avoid Wildcard Queries in Production

Queries like *term* or term* are expensive because they require scanning all terms in the inverted index. Use n-gram analyzers or edge n-gram tokens for prefix searches instead.

Enable Caching Strategically

Elasticsearch caches filters, segments, and field data. Use index.refresh_interval to reduce refresh frequency for write-heavy indices. For read-heavy aggregation workloads, rely on doc_values (enabled by default on keyword and numeric fields) rather than enabling fielddata on text fields.

Tools and Resources

Elasticsearch Dev Tools (Kibana)

Kibana's Dev Tools console is the most effective environment for writing, testing, and debugging Elasticsearch queries. It provides syntax highlighting, auto-completion, and real-time response visualization. Access it via Kibana > Dev Tools.

Postman and cURL

For API testing outside Kibana, use Postman or cURL. Save common queries as collections in Postman for reuse. Example cURL request:

curl -X GET "localhost:9200/products/_search" -H 'Content-Type: application/json' -d '{
  "query": {
    "match": {
      "name": "coffee"
    }
  }
}'

Elasticsearch Query DSL Reference

The official Elasticsearch Query DSL documentation is indispensable. Bookmark it: Elasticsearch Query DSL Guide. It includes examples for every query type, from prefix to script_score.

Searchable Sample Datasets

Practice against public datasets; Kibana, for example, ships with one-click sample data sets (e-commerce orders, flights, and web logs) that are ideal for experimenting with queries and aggregations.

Query Validation Tools

Use tools like Elasticsearch Query Builder to visually construct complex queries without writing JSON manually. These tools are excellent for learning and prototyping.

Monitoring and Profiling

Use Elasticsearch's built-in monitoring features or integrate with Prometheus and Grafana to track query latency, cache hit ratios, and node health. Enable the search slow log by setting per-index thresholds:

index.search.slowlog.threshold.query.warn: 5s
index.search.slowlog.threshold.query.info: 2s

Community and Forums

Engage with the Elasticsearch community on the official Elastic discussion forums and on Stack Overflow. These platforms offer real-world solutions to complex problems and updates on new features.

Real Examples

Example 1: E-Commerce Product Search

Scenario: A user searches for wireless headphones under $150 and wants results sorted by price.

GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "wireless headphones",
            "fields": ["name^3", "description"]
          }
        }
      ],
      "filter": [
        { "range": { "price": { "lte": 150 } } },
        { "term": { "in_stock": true } }
      ]
    }
  },
  "sort": [
    { "price": { "order": "asc" } }
  ],
  "highlight": {
    "fields": {
      "name": {},
      "description": {}
    }
  },
  "size": 10
}

Key features:

  • Boosting: name^3 gives higher relevance to matches in the name field.
  • Filtering: Only in-stock items under $150 are returned.
  • Highlighting: Matched terms are emphasized for UX.
  • Sorting: Results ordered by ascending price.

Example 2: Log Analysis for Error Patterns

Scenario: Find all error logs from the last 24 hours grouped by error type and count occurrences.

GET /logs-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "ERROR" } }
      ],
      "filter": [
        { "range": { "timestamp": { "gte": "now-24h" } } }
      ]
    }
  },
  "aggs": {
    "error_types": {
      "terms": {
        "field": "error_type.keyword",
        "size": 10
      }
    }
  }
}

This returns the top 10 error types in the last day, helping teams prioritize fixes.

Example 3: User Behavior Analytics

Scenario: Analyze how many users viewed products in each category over the past week.

GET /user_events/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        { "match": { "event_type": "product_view" } }
      ],
      "filter": [
        { "range": { "event_time": { "gte": "now-7d" } } }
      ]
    }
  },
  "aggs": {
    "products_by_category": {
      "terms": { "field": "product_category.keyword" },
      "aggs": {
        "unique_users": {
          "cardinality": { "field": "user_id.keyword" }
        }
      }
    }
  }
}

This reveals which categories attract the most unique users, informing marketing and inventory decisions.
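Client-side, the terms-plus-cardinality combination corresponds to grouping events into sets of user IDs, as this Python sketch shows (at scale the server approximates the count with HyperLogLog++ rather than materializing sets):

```python
from collections import defaultdict

events = [
    {"product_category": "Electronics", "user_id": "u1"},
    {"product_category": "Electronics", "user_id": "u1"},  # repeat view
    {"product_category": "Electronics", "user_id": "u2"},
    {"product_category": "Clothing", "user_id": "u3"},
]

# One set of distinct user IDs per category bucket.
unique_users = defaultdict(set)
for e in events:
    unique_users[e["product_category"]].add(e["user_id"])
counts = {cat: len(users) for cat, users in unique_users.items()}
print(counts)  # {'Electronics': 2, 'Clothing': 1}
```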

Example 4: Autocomplete with Edge N-Grams

Scenario: Implement a search-as-you-type feature for product names.

First, define a custom analyzer with edge n-grams:

PUT /products_autocomplete
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}

Now, search for hea to match headphones:

GET /products_autocomplete/_search
{
  "query": {
    "match": {
      "name": "hea"
    }
  }
}

This returns results even before the user finishes typing.
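To see what the index actually stores, here is a Python sketch of the tokens an edge_ngram tokenizer emits for a single term (simplified: the real tokenizer also splits input on token_chars boundaries before emitting grams):

```python
def edge_ngrams(term, min_gram=1, max_gram=20):
    """Emit the prefixes an edge_ngram tokenizer would index for one
    lowercased term."""
    term = term.lower()
    return [term[:n] for n in range(min_gram, min(max_gram, len(term)) + 1)]

print(edge_ngrams("Headphones", max_gram=4))  # ['h', 'he', 'hea', 'head']
```

Because every prefix is indexed, the query-time analyzer should stay standard (as in the mapping above); otherwise the query "hea" would itself expand into grams and over-match.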

FAQs

What is the difference between a match query and a term query?

A match query analyzes the input text and searches across analyzed fields (like text), making it ideal for full-text search. A term query looks for exact, unanalyzed values and should be used with keyword fields for filtering and exact matching.

Why is my Elasticsearch query slow?

Slow queries often result from: using wildcard patterns, querying unoptimized mappings, deep pagination (from > 10,000), large result sets, or insufficient hardware. Use the profile API to identify bottlenecks and optimize filters, mappings, and index structure.

Can I use SQL with Elasticsearch?

Yes, Elasticsearch supports SQL via the SQL REST API or Kibana's SQL console. However, it's translated internally into Query DSL and may not perform as well as native queries. Use SQL for quick ad-hoc analysis, but rely on Query DSL for production applications.

How do I handle accents and special characters in search?

Use the asciifolding filter in your analyzer to normalize accented characters (e.g., café becomes cafe). Example:

"filter": ["lowercase", "asciifolding"]

What's the maximum size for a single Elasticsearch query?

By default, Elasticsearch limits from + size pagination to 10,000 documents. You can raise this via the index.max_result_window setting, but avoid doing so; use the search_after or scroll APIs instead for large result sets.

How do I update documents in Elasticsearch?

Use the _update endpoint:

POST /products/_update/1
{
  "doc": {
    "in_stock": false
  }
}

Or use update_by_query to update multiple documents matching a condition.

Can Elasticsearch handle real-time data?

Yes. Elasticsearch refreshes indices every second by default, making data searchable almost immediately. For higher throughput, increase refresh_interval to 30s or disable it during bulk indexing.

How do I delete an index or document?

To delete an index:

DELETE /products

To delete a single document:

DELETE /products/_doc/1

What's the best way to back up Elasticsearch data?

Use snapshots. Configure a repository (e.g., S3, NFS) and take periodic snapshots:

PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/mnt/backups"
  }
}

PUT /_snapshot/my_backup/snapshot_1
{
  "indices": "products",
  "ignore_unavailable": true,
  "include_global_state": false
}

Conclusion

Mastery of Elasticsearch queries transforms raw data into actionable insights. From basic keyword searches to complex aggregations and real-time analytics, the Query DSL offers unparalleled flexibility and performance. This guide has walked you through setting up your environment, constructing precise queries, applying best practices, leveraging powerful tools, and implementing real-world use cases.

Remember: the key to efficient Elasticsearch usage lies in thoughtful mapping design, strategic use of filters over queries, and avoiding common pitfalls like deep pagination and wildcard searches. Always test your queries with the profile API and monitor performance in production.

As data volumes grow and user expectations rise, Elasticsearch remains one of the most scalable and responsive search engines available. By applying the principles outlined here, you'll build faster, smarter, and more reliable search experiences that scale with your business.

Continue exploring the official documentation, experiment with sample datasets, and contribute to the community. The deeper your understanding of Elasticsearch queries, the more value you'll unlock from your data.