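Elasticsearch Query DSL: An In-Depth Guide

1. Introduction

1.1 Purpose of This Article

This article gives a comprehensive overview of **Query DSL (Domain Specific Language)**, the most fundamental and most powerful query language in Elasticsearch. Query DSL sits at the heart of Elasticsearch: it is the gateway to full-text search, structured-data filtering, aggregations, geospatial search, semantic search, and every other search feature Elasticsearch offers.

By the end of this article you will understand:

The architecture and design philosophy of Query DSL
Every query category and when to use each
The difference between query context and filter context, and its effect on performance
Concrete examples of query patterns that come up often in practice
How queries integrate with aggregations
Best practices for performance optimization

1.2 Where Query DSL Fits

Elasticsearch supports six query languages. Of these, Query DSL has the longest history and the richest feature set.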

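Query language	Characteristics	Main uses
Query DSL	JSON-based; the most powerful and flexible	Embedding in applications; access to all features
ES|QL	Pipeline-style, intuitive	Ad hoc analysis, data exploration
EQL	Specialized for event correlation	Security threat hunting
Elasticsearch SQL	SQL syntax	BI tool integration; users with RDB experience
KQL	Text-based, simple	Filtering in the Kibana UI
Lucene Query Syntax	Low-level, string-based syntax	Regular expressions, fuzzy search

Query DSL is the de facto standard for talking to Elasticsearch programmatically through the REST API.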

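1.3 Environment

The content of this article was verified against the following versions.

Elasticsearch: 8.x (8.11 or later recommended)
Kibana: 8.x (using Dev Tools)
API: REST API (_search endpoint)

2. Query DSL Architecture

2.1 Query DSL as an Abstract Syntax Tree (AST)

Query DSL is a domain-specific language that defines queries in JSON. Internally, a query is structured as an **abstract syntax tree (AST)**. Because of this tree structure, queries nest recursively and can express anything from a simple condition to a very complex logical expression.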

               bool (root)
              /    |     \
           must  filter  should
            |      |       |
         match   range   match

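Elasticsearch converts this tree into Lucene queries and executes them in parallel on each shard.

2.2 Two Kinds of Query Clauses

Every query clause in Query DSL falls into one of two categories.

Leaf Query Clauses

A leaf query clause is the smallest unit of a query, a "leaf" of the AST. It looks for a particular value in a particular field and needs no other query clause.

Typical leaf query clauses:

Full-text queries: match, match_phrase, multi_match, query_string
Term-level queries: term, terms, range, exists, prefix, wildcard, regexp, fuzzy, ids
Geo queries: geo_bounding_box, geo_distance, geo_shape
Specialized queries: more_like_this, percolate, rank_feature

Compound Query Clauses

A compound query clause combines leaf clauses or other compound clauses to build more complex query logic.

bool: combines multiple queries logically (AND/OR/NOT/FILTER)
dis_max: takes the score of the highest-scoring query among several
constant_score: assigns a constant score to documents matching a filter
boosting: weights results with a positive and a negative query
function_score: applies custom scoring functions

2.3 Basic Structure of the Search API

Query DSL queries are executed through the _search endpoint.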

GET /my-index/_search
{
  "query": {
    // Write the Query DSL query here
  },
  "from": 0,
  "size": 10,
  "sort": [
    { "@timestamp": "desc" }
  ],
  "_source": ["field1", "field2"],
  "aggs": {
    // Define aggregations here
  }
}

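Main request parameters:

Parameter	Type	Description	Default
`query`	object	The Query DSL query itself	`match_all`
`from`	number	Offset into the results (pagination)	0
`size`	number	Number of documents to return	10
`sort`	array	Sort criteria	`_score` descending
`_source`	boolean/array	Which fields to return	All fields
`track_total_hits`	boolean/number	Accurate tracking of the total hit count	10000
`explain`	boolean	Show details of the score calculation	false
`highlight`	object	Highlighting settings for matched text	None
`aggs`	object	Aggregation definitions	None
`search_after`	array	Cursor-based pagination	None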
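2.4 Search Execution Flow

Internally, a Query DSL query is executed as follows.

Query parsing: the JSON request is converted into an AST (abstract syntax tree)
Query optimization: the query is rewritten, for example by checking cached filter-context results and dropping unnecessary clauses
Routing: the shards of the target indices are identified
Distributed execution (query phase): the query runs in parallel on each shard, and each shard returns the IDs and scores of its top documents
Gathering (fetch phase): the coordinating node merges the results and fetches the bodies of the documents it needs
Response construction: the final result is returned to the client

This query-then-fetch model keeps search efficient even on large data sets. Specifying search_type=dfs_query_then_fetch adds a DFS phase that gathers document frequencies across shards for more accurate scoring.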
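
The merge step of the query-then-fetch model can be sketched in plain Python. This is only an illustration of the idea, not Elasticsearch's actual implementation: the functions `shard_query_phase` and `coordinate` and the sample hits are hypothetical, and each shard is modeled as a list of (score, document ID) pairs.

```python
import heapq
from typing import List, Tuple

Hit = Tuple[float, str]  # (score, document ID)


def shard_query_phase(shard_hits: List[Hit], from_: int, size: int) -> List[Hit]:
    """Query phase on one shard: return only its top from_ + size hits."""
    return heapq.nlargest(from_ + size, shard_hits)


def coordinate(shards: List[List[Hit]], from_: int = 0, size: int = 10) -> List[str]:
    """Merge the per-shard top hits and pick the IDs the fetch phase would load."""
    candidates = [
        hit for shard in shards for hit in shard_query_phase(shard, from_, size)
    ]
    merged = heapq.nlargest(from_ + size, candidates)
    return [doc_id for _, doc_id in merged[from_:from_ + size]]


shards = [
    [(1.2, "a"), (0.4, "b")],
    [(2.5, "c"), (0.9, "d")],
]
print(coordinate(shards, from_=0, size=3))  # ['c', 'a', 'd']
```

Each shard has to return from_ + size candidates because the coordinating node cannot know in advance which shard holds the globally best hits. This is also why deep from/size pagination becomes expensive.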

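3. Query Context and Filter Context

3.1 Overview of the Two Contexts

In Query DSL, how a query clause is evaluated depends on the context in which it is used. Understanding these two contexts is the single most important concept for using Query DSL effectively.

Aspect	Query context	Filter context
Question asked	"How well does this document match the condition?"	"Does this document match the condition?"
Score calculation	Yes (reflected in `_score`)	No (the score is 0)
Caching	No	Yes (frequently used filters are cached automatically)
Performance	Relatively lower (scoring overhead)	Higher (binary decision only)
Main uses	Full-text search, relevance ranking	Filtering structured data

3.2 Query Context

In query context, Elasticsearch evaluates how well each document matches the search condition and expresses that degree as a number, the _score (relevance score).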

GET /articles/_search
{
  "query": {
    "match": {
      "content": "Elasticsearch 分散検索エンジン"
    }
  }
}

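In this query, the more of the tokens 「Elasticsearch」, 「分散」 (distributed), 「検索」 (search), and 「エンジン」 (engine) a document's content field contains, the higher its score (this assumes a Japanese analyzer that splits the text into these tokens). In Elasticsearch 8.x the score is computed with BM25, the default similarity, which refines the classic TF-IDF (Term Frequency-Inverse Document Frequency) idea.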
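
To make the scoring idea concrete, here is a minimal sketch of one common formulation of the BM25 score for a single term. The function name `bm25_term_score` and its argument names are hypothetical; `k1=1.2` and `b=0.75` are the default BM25 parameters in Elasticsearch. The exact value Elasticsearch reports may differ in detail, so treat this as an approximation for building intuition.

```python
import math


def bm25_term_score(
    freq: float,          # occurrences of the term in the field
    doc_len: float,       # number of tokens in the field
    avg_doc_len: float,   # average field length across the index
    doc_count: int,       # documents that have the field
    docs_with_term: int,  # documents that contain the term
    k1: float = 1.2,
    b: float = 0.75,
) -> float:
    # Rare terms get a larger inverse document frequency.
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Longer-than-average fields are penalized through the length normalization.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates: each extra occurrence adds less than the last.
    return idf * freq / (freq + length_norm)


print(bm25_term_score(freq=2, doc_len=100, avg_doc_len=100, doc_count=1000, docs_with_term=10))
```

The score of a multi-term match query is, roughly, the sum of such per-term scores over the query tokens that the document contains.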

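3.3 Filter Context

In filter context, a document is judged only on whether it matches the condition: yes or no. Because no score is computed, evaluation is fast, and Elasticsearch automatically caches the results of frequently used filters.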

GET /logs/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "error" } },
        { "range": { "@timestamp": { "gte": "2025-01-01" } } }
      ]
    }
  }
}

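3.4 Choosing Between Them in Practice

The basic principles for getting the best performance are as follows.

When to use query context:

Full-text search (match, match_phrase, and so on)
When results must be ranked by relevance
When the results closest to the user's search intent should appear first

When to use filter context:

Filtering by date range
Exact matches on status, category, tags, and the like
Numeric range conditions
Existence checks (exists)
Geospatial conditions

A pattern used very often in practice is to combine both contexts in a bool query.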

GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "ワイヤレスイヤホン"
          }
        }
      ],
      "filter": [
        { "term": { "category": "electronics" } },
        { "range": { "price": { "gte": 3000, "lte": 30000 } } },
        { "term": { "in_stock": true } }
      ],
      "should": [
        {
          "match": {
            "description": "ノイズキャンセリング"
          }
        }
      ]
    }
  }
}

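In this example:

must → query context: full-text search on the product name (affects the score)
filter → filter context: filtering by category, price range, and stock (does not affect the score; eligible for caching)
should → query context: an additional score boost (a match raises the score but is not required)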
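
When such a request is assembled in application code, it helps to keep scoring clauses and filter clauses visibly apart. The following is a minimal sketch that builds the same request body as a Python dict. The function `build_product_query` and its parameters are hypothetical, and the field names are taken from the example above. Sending the body to Elasticsearch is left out.

```python
import json
from typing import Any, Dict, Optional


def build_product_query(
    text: str,
    category: str,
    min_price: int,
    max_price: int,
    boost_text: Optional[str] = None,
) -> Dict[str, Any]:
    bool_query: Dict[str, Any] = {
        # Query context: contributes to _score.
        "must": [{"match": {"name": text}}],
        # Filter context: yes/no only, cacheable, no effect on _score.
        "filter": [
            {"term": {"category": category}},
            {"range": {"price": {"gte": min_price, "lte": max_price}}},
            {"term": {"in_stock": True}},
        ],
    }
    if boost_text:
        # Optional clause: raises the score when it matches.
        bool_query["should"] = [{"match": {"description": boost_text}}]
    return {"query": {"bool": bool_query}}


body = build_product_query("ワイヤレスイヤホン", "electronics", 3000, 30000, "ノイズキャンセリング")
print(json.dumps(body, ensure_ascii=False, indent=2))
```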

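4. Full-Text Queries

Full-text queries analyze (tokenize) the search text with the configured analyzer before running the search. They are suited to search terms that people type in natural language.

4.1 match Query

The match query is the most frequently used full-text query in Elasticsearch. It analyzes the input text and searches with the resulting tokens.

Basic usage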

GET /articles/_search
{
  "query": {
    "match": {
      "content": "Elasticsearch 全文検索"
    }
  }
}

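By default, the text is tokenized and the tokens are combined with OR: a document matches if it contains 「Elasticsearch」 or 「全文検索」 (full-text search). Depending on the analyzer, 「全文検索」 may itself be split into smaller tokens, each of which is OR-ed as well.

Main parameters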

GET /articles/_search
{
  "query": {
    "match": {
      "content": {
        "query": "Elasticsearch 全文検索エンジン",
        "operator": "and",
        "analyzer": "kuromoji",
        "fuzziness": "AUTO",
        "max_expansions": 50,
        "prefix_length": 2,
        "minimum_should_match": "75%",
        "zero_terms_query": "none",
        "lenient": true,
        "boost": 1.5
      }
    }
  }
}

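Parameter details (the kuromoji analyzer used above requires the analysis-kuromoji plugin):

Parameter	Default	Description
`query`	(required)	The search text
`operator`	`OR`	Logical operator between tokens (`OR` / `AND`)
`analyzer`	The field's analyzer	Analyzer used to analyze the text
`fuzziness`	None	Allowed edit distance (`AUTO`, `0`, `1`, `2`)
`max_expansions`	50	Maximum number of term expansions for fuzzy matching
`prefix_length`	0	Number of leading characters left unchanged in fuzzy matching
`minimum_should_match`	-	Minimum number of tokens that must match (number or percentage)
`zero_terms_query`	`none`	Behavior when the analyzer removes every token
`lenient`	false	Whether to ignore type-mismatch errors
`boost`	1.0	Weighting factor for the score
`auto_generate_synonyms_phrase_query`	true	Automatically create phrase queries for multi-term synonyms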

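The AUTO setting for fuzziness

With fuzziness: "AUTO", the allowed edit distance is chosen automatically from the length of each token.

Token length	Allowed edit distance
0 to 2 characters	0 (exact match only)
3 to 5 characters	1
6 or more characters	2

The thresholds can be customized, as in AUTO:3,6.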
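
The table above translates directly into a small function. This is a sketch of the rule only; the function name `auto_fuzziness` is hypothetical, and `low` and `high` stand for the two numbers in `AUTO:low,high` (3 and 6 by default). Token length is counted here with Python's `len`, which is an assumption about how characters are counted.

```python
def auto_fuzziness(term: str, low: int = 3, high: int = 6) -> int:
    """Return the edit distance that AUTO:low,high allows for a token."""
    length = len(term)
    if length < low:
        return 0  # short tokens must match exactly
    if length < high:
        return 1  # medium tokens tolerate one edit
    return 2      # long tokens tolerate two edits


for token in ["es", "kiban", "kibana", "elasticsearch"]:
    print(token, auto_fuzziness(token))
```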

4.2 match_phrase Query

The match_phrase query finds documents that match the given phrase, including the order of its words.

GET /articles/_search
{
  "query": {
    "match_phrase": {
      "content": {
        "query": "分散検索エンジン",
        "slop": 2,
        "analyzer": "kuromoji"
      }
    }
  }
}

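The slop parameter sets how far the matching tokens may deviate from their positions in the phrase. slop: 0 requires an exact phrase match; slop: 2 tolerates up to two position moves, for example up to two other tokens in between.

4.3 multi_match Query

Runs a match query against several fields at once.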

GET /articles/_search
{
  "query": {
    "multi_match": {
      "query": "Elasticsearch チュートリアル",
      "fields": ["title^3", "content", "tags^2"],
      "type": "best_fields",
      "tie_breaker": 0.3
    }
  }
}

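The type parameter selects the search strategy.

Type	Description
`best_fields`	Uses the score of the best-matching field (default)
`most_fields`	Adds up the scores of all matching fields
`cross_fields`	Treats all fields as one combined field
`phrase`	Runs `match_phrase` on each field
`phrase_prefix`	Runs `match_phrase_prefix` on each field
`bool_prefix`	Runs `match_bool_prefix` on each field

The ^3 in fields is a boost factor: it weights the score of the title field by a factor of 3.

4.4 query_string Query

An advanced full-text query that uses Lucene query syntax. It supports AND/OR/NOT operators, wildcards, regular expressions, field selection, and more.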

GET /articles/_search
{
  "query": {
    "query_string": {
      "query": "(Elasticsearch OR Opensearch) AND version:8.* AND NOT status:draft",
      "default_field": "content",
      "default_operator": "AND",
      "allow_leading_wildcard": false,
      "analyze_wildcard": true,
      "boost": 1.0
    }
  }
}

Note: when query_string takes user input directly, it carries the risk of syntax errors and query injection. For end-user-facing search, simple_query_string is recommended.
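
If query_string cannot be avoided, one defensive measure is to escape the reserved characters before inserting user input into the query. The sketch below is a hypothetical helper, not part of any Elasticsearch client. It backslash-escapes the reserved characters of the query_string syntax and removes `<` and `>`, which cannot be escaped. It does not neutralize the uppercase keywords AND, OR, and NOT, so it is a partial mitigation at best.

```python
import re

# < and > cannot be escaped in query_string syntax, so they are removed.
_UNESCAPABLE = re.compile(r"[<>]")
# Reserved characters that can be escaped with a backslash.
_RESERVED = re.compile(r'([+\-=!(){}\[\]^"~*?:\\/&|])')


def escape_query_string(user_input: str) -> str:
    """Escape reserved query_string characters in untrusted input."""
    without_angles = _UNESCAPABLE.sub("", user_input)
    return _RESERVED.sub(r"\\\1", without_angles)


print(escape_query_string("status:draft OR (a+b)"))
```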

4.5 simple_query_string Query

A more robust version of query_string. A syntax error does not make the whole query fail; the invalid parts are ignored and the rest is executed.

GET /articles/_search
{
  "query": {
    "simple_query_string": {
      "query": "Elasticsearch + チュートリアル | ガイド -古い",
      "fields": ["title^2", "content"],
      "default_operator": "and"
    }
  }
}

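Supported operators:

Operator	Meaning
`+`	AND
`|`	OR
`-`	NOT
`"..."`	Phrase search
`*`	Wildcard at the end of a term (prefix match)
`(...)`	Grouping
`~N`	Fuzziness after a word (`word~2`) or slop after a phrase (`"phrase"~3`)

4.6 combined_fields Query

Searches several fields as if they were one combined field. It resembles the cross_fields type of multi_match but is a simpler and better-optimized implementation.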

GET /articles/_search
{
  "query": {
    "combined_fields": {
      "query": "Elasticsearch distributed search engine",
      "fields": ["title", "abstract", "body"],
      "operator": "and"
    }
  }
}

4.7 intervals Query

An advanced query that gives fine-grained control over the order and proximity of tokens in the text.

GET /articles/_search
{
  "query": {
    "intervals": {
      "content": {
        "all_of": {
          "ordered": true,
          "max_gaps": 5,
          "intervals": [
            { "match": { "query": "Elasticsearch" } },
            { "match": { "query": "search engine" } }
          ]
        }
      }
    }
  }
}

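5. Term-Level Queries

Unlike full-text queries, term-level queries do not analyze the search term. They compare it against the exact values stored in the field, which makes them suited to searching structured data.

5.1 term Query

Finds documents whose field contains exactly the given value.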

GET /products/_search
{
  "query": {
    "term": {
      "status": {
        "value": "published",
        "boost": 1.0
      }
    }
  }
}

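Important: do not use the term query on text fields. A text field is analyzed before it is indexed, so the original text and the stored tokens may not match. Use the match query to search text fields.

5.2 terms Query

Finds documents that match any of several values (an OR condition).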

GET /products/_search
{
  "query": {
    "terms": {
      "status": ["published", "pending_review"],
      "boost": 1.0
    }
  }
}

Cross-document references with terms lookup

The values to search for can also be taken from another document.

GET /orders/_search
{
  "query": {
    "terms": {
      "product_id": {
        "index": "featured_products",
        "id": "1",
        "path": "product_ids"
      }
    }
  }
}

5.3 terms_set Query

Finds documents that match at least a minimum number of the given values.

GET /jobs/_search
{
  "query": {
    "terms_set": {
      "required_skills": {
        "terms": ["java", "elasticsearch", "docker", "kubernetes"],
        "minimum_should_match_field": "required_skill_count"
      }
    }
  }
}

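The value of the field named in minimum_should_match_field is the minimum number of terms that must match for that document. The minimum can also be computed dynamically with a script.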

GET /jobs/_search
{
  "query": {
    "terms_set": {
      "required_skills": {
        "terms": ["java", "elasticsearch", "docker"],
        "minimum_should_match_script": {
          "source": "Math.min(params.num_terms, doc['required_skill_count'].value)"
        }
      }
    }
  }
}
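
The matching rule of terms_set can be simulated in a few lines. This is a sketch of the semantics only; the function `terms_set_matches` and the sample data are hypothetical. The second call mirrors the script above, where the requirement is capped at the number of query terms.

```python
from typing import Iterable


def terms_set_matches(
    doc_terms: Iterable[str], query_terms: Iterable[str], required_matches: int
) -> bool:
    """True when enough of the query terms occur in the document's field."""
    matched = len(set(doc_terms) & set(query_terms))
    return matched >= required_matches


job = {"required_skills": ["java", "docker", "aws"], "required_skill_count": 2}
query_terms = ["java", "elasticsearch", "docker"]

# minimum_should_match_field: the document itself states how many must match.
print(terms_set_matches(job["required_skills"], query_terms, job["required_skill_count"]))

# minimum_should_match_script: Math.min(params.num_terms, required_skill_count)
required = min(len(query_terms), job["required_skill_count"])
print(terms_set_matches(job["required_skills"], query_terms, required))
```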

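5.4 range Query

Finds documents by a range condition on numbers, dates, IP addresses, and so on. For dates, time_zone is used to convert values that carry no offset of their own; a value with an explicit offset (such as a trailing Z) is interpreted with that offset.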

GET /logs/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2025-01-01T00:00:00Z",
        "lt": "2025-02-01T00:00:00Z",
        "format": "strict_date_optional_time",
        "time_zone": "+09:00"
      }
    }
  }
}

Range operators

Operator	Description
`gt`	Greater than
`gte`	Greater than or equal to
`lt`	Less than
`lte`	Less than or equal to

Date Math

GET /logs/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1d/d",
        "lt": "now/d"
      }
    }
  }
}

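Expression	Description
`now`	The current time
`now-1d`	One day ago
`now-1h`	One hour ago
`now/d`	Start of today (rounded down)
`now/M`	Start of this month
`2025-01-01||+1M`	February 1, 2025

5.5 exists Query

Finds documents that have an indexed value for the given field.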

GET /users/_search
{
  "query": {
    "exists": {
      "field": "email"
    }
  }
}

To find documents in which the field does not exist:

GET /users/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "email"
        }
      }
    }
  }
}

5.6 prefix Query

Finds documents whose field value starts with the given prefix.

GET /products/_search
{
  "query": {
    "prefix": {
      "product_code": {
        "value": "ES-",
        "case_insensitive": true
      }
    }
  }
}

5.7 wildcard Query

Finds documents that match a wildcard pattern.

GET /files/_search
{
  "query": {
    "wildcard": {
      "file_path": {
        "value": "/var/log/*.log",
        "case_insensitive": false
      }
    }
  }
}

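Pattern	Meaning
`*`	Any string of zero or more characters
`?`	Any single character

Note: wildcard queries (especially with a leading *) can have a large performance impact. They cannot be run when search.allow_expensive_queries is false.

5.8 regexp Query

Finds documents that match a regular-expression pattern.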

GET /logs/_search
{
  "query": {
    "regexp": {
      "error_code": {
        "value": "E[0-9]{4}",
        "flags": "ALL",
        "case_insensitive": true,
        "max_determinized_states": 10000
      }
    }
  }
}

5.9 fuzzy Query

Runs a fuzzy search based on Levenshtein edit distance.

GET /products/_search
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "elastisearch",
        "fuzziness": "AUTO",
        "max_expansions": 50,
        "prefix_length": 3,
        "transpositions": true
      }
    }
  }
}

5.10 ids Query

Finds documents by their document IDs.

GET /products/_search
{
  "query": {
    "ids": {
      "values": ["1", "42", "100"]
    }
  }
}

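6. Compound Queries

Compound queries combine other query clauses to build more complex search logic.

6.1 bool Query

The bool query is the core of Query DSL and the most frequently used compound query. It combines multiple queries logically through four clauses: must, filter, should, and must_not.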

GET /articles/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" } },
        { "match": { "content": "Query DSL" } }
      ],
      "filter": [
        { "term": { "status": "published" } },
        { "range": { "publish_date": { "gte": "2024-01-01" } } }
      ],
      "should": [
        { "match": { "tags": "tutorial" } },
        { "match": { "tags": "beginner" } }
      ],
      "must_not": [
        { "term": { "language": "deprecated" } }
      ],
      "minimum_should_match": 1
    }
  }
}

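Details of each clause

Clause	Context	Effect on score	Required/optional	Description
`must`	Query	Yes (added)	Required (AND)	Every condition must match
`filter`	Filter	None	Required (AND)	Every condition must match (eligible for caching)
`should`	Query	Yes (added)	Optional (OR)	A match raises the score
`must_not`	Filter	None	Exclusion (NOT)	Excludes matching documents

How minimum_should_match behaves

When must or filter is present: should defaults to 0 (documents are included even if no should clause matches)
When neither must nor filter is present: should defaults to 1 (at least one should clause must match)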
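
The default described above fits in a few lines of Python. This is a sketch of the rule, not of Elasticsearch internals; the function name `default_minimum_should_match` is hypothetical, and the bool query is represented as a plain dict.

```python
def default_minimum_should_match(bool_query: dict) -> int:
    """Default number of should clauses that must match in a bool query."""
    has_should = bool(bool_query.get("should"))
    has_required = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    # should clauses become mandatory only when nothing else restricts the results.
    return 1 if has_should and not has_required else 0


only_should = {"should": [{"match": {"tags": "tutorial"}}]}
with_filter = {
    "filter": [{"term": {"status": "published"}}],
    "should": [{"match": {"tags": "tutorial"}}],
}
print(default_minimum_should_match(only_should))  # 1
print(default_minimum_should_match(with_filter))  # 0
```

Because of this default, adding a filter clause to a query that previously had only should clauses silently turns those clauses from required into optional. Setting minimum_should_match explicitly avoids the surprise.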

Nested bool queries

GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              { "match": { "category": "electronics" } },
              { "match": { "category": "computers" } }
            ]
          }
        }
      ],
      "filter": [
        { "range": { "price": { "lte": 100000 } } }
      ]
    }
  }
}

Best practice: keep nesting as shallow as possible to improve performance.

Named Queries

GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": { "query": "laptop", "_name": "name_match" } } }
      ],
      "filter": [
        { "range": { "price": { "lte": 200000, "_name": "price_filter" } } }
      ]
    }
  }
}

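When _name is set, the matched_queries field of each hit in the response shows which queries matched.

6.2 dis_max Query

Uses the score of whichever of several queries produces the highest score. With the tie_breaker parameter, the scores of the other queries can be partially added in.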

GET /articles/_search
{
  "query": {
    "dis_max": {
      "queries": [
        { "match": { "title": "Elasticsearch" } },
        { "match": { "body": "Elasticsearch" } }
      ],
      "tie_breaker": 0.7
    }
  }
}

Final score: highest score + (tie_breaker × sum of the scores of the other matching queries)
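
As a worked example of this formula, the following sketch computes the dis_max score from a list of per-query scores. The function name `dis_max_score` is hypothetical, and the scores are made-up numbers.

```python
from typing import Sequence


def dis_max_score(scores: Sequence[float], tie_breaker: float = 0.0) -> float:
    """Score of a dis_max query given the scores of its matching sub-queries."""
    if not scores:
        return 0.0
    best = max(scores)
    others = sum(scores) - best
    return best + tie_breaker * others


# title matched with score 2.0, body with score 1.0
print(dis_max_score([2.0, 1.0], tie_breaker=0.0))  # 2.0: only the best field counts
print(dis_max_score([2.0, 1.0], tie_breaker=1.0))  # 3.0: behaves like a plain sum
```

A tie_breaker of 0 ignores every query but the best one, while 1.0 reduces dis_max to a sum. Values in between reward documents that match in several fields without letting weak matches dominate.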

6.3 constant_score Query

Assigns a constant score to every document that matches the filter condition.

GET /products/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "status": "active" }
      },
      "boost": 1.2
    }
  }
}

6.4 boosting Query

Returns documents that match the positive query and lowers the score of those that also match the negative query.

GET /articles/_search
{
  "query": {
    "boosting": {
      "positive": {
        "match": { "content": "Elasticsearch" }
      },
      "negative": {
        "term": { "status": "outdated" }
      },
      "negative_boost": 0.5
    }
  }
}

Unlike must_not, it does not exclude documents that match the negative condition; it only lowers their score.

6.5 function_score Query

Customizes scores freely by applying custom scoring functions.

GET /articles/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": { "content": "Elasticsearch" }
      },
      "functions": [
        {
          "gauss": {
            "publish_date": {
              "origin": "now",
              "scale": "30d",
              "decay": 0.5
            }
          }
        },
        {
          "field_value_factor": {
            "field": "likes",
            "factor": 1.2,
            "modifier": "sqrt",
            "missing": 1
          },
          "weight": 2
        },
        {
          "filter": { "term": { "featured": true } },
          "weight": 10
        }
      ],
      "score_mode": "sum",
      "boost_mode": "multiply",
      "max_boost": 42
    }
  }
}

Available functions:

Function	Description
`script_score`	Custom score computed by a script
`weight`	Fixed weight
`random_score`	Random score generation
`field_value_factor`	Score computed from a field value
Decay functions (`linear`/`exp`/`gauss`)	Score decay based on distance from an origin

score_mode options: multiply, sum, avg, first, max, min

boost_mode options: multiply, replace, sum, avg, max, min
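
The gauss decay used in the example above can be sketched as follows. The function name `gauss_decay` is hypothetical; the parameters `origin`, `scale`, `decay`, and `offset` correspond to the options of the decay functions. Distances are plain numbers here (for a date field, think of them as days). The key property is that a document exactly `scale` away from `origin` receives the score `decay`.

```python
import math


def gauss_decay(
    value: float, origin: float, scale: float, decay: float = 0.5, offset: float = 0.0
) -> float:
    """Gaussian decay: 1.0 at the origin, `decay` at a distance of `scale`."""
    distance = max(0.0, abs(value - origin) - offset)
    # Variance chosen so that the curve passes through `decay` at distance `scale`.
    sigma_squared = -(scale ** 2) / (2 * math.log(decay))
    return math.exp(-(distance ** 2) / (2 * sigma_squared))


for days_old in [0, 15, 30, 60]:
    print(days_old, round(gauss_decay(days_old, origin=0, scale=30), 3))
```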

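7. Joining Queries

Because Elasticsearch is a distributed system, full SQL-style JOINs as in an RDB are not practical. Instead, it provides the following two mechanisms.

7.1 nested Query

Searches fields of the nested type. Each element of an object array is treated as an independent document, so relationships between fields within the same object of the array are evaluated correctly.

// Mapping definition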
PUT /blog_posts
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "comments": {
        "type": "nested",
        "properties": {
          "author": { "type": "keyword" },
          "text": { "type": "text" },
          "date": { "type": "date" }
        }
      }
    }
  }
}

// nested query
GET /blog_posts/_search
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "bool": {
          "must": [
            { "match": { "comments.author": "Alice" } },
            { "match": { "comments.text": "great article" } }
          ]
        }
      },
      "score_mode": "avg",
      "inner_hits": {
        "size": 3,
        "highlight": {
          "fields": {
            "comments.text": {}
          }
        }
      }
    }
  }
}

score_mode options: avg, max, min, sum, none

When inner_hits is set, the response includes the details of the nested objects that matched.
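
Why the nested type matters is easiest to see by comparing it with a plain object array, where field values from all array elements are pooled together. The sketch below simulates both behaviors in Python with hypothetical sample data; it illustrates the matching semantics and says nothing about how Lucene stores the documents.

```python
comments = [
    {"author": "Alice", "text": "needs work"},
    {"author": "Bob", "text": "great article"},
]


def matches_flattened(objects, author, phrase):
    """Plain object array: values of each field are pooled across all objects."""
    authors = {o["author"] for o in objects}
    texts = [o["text"] for o in objects]
    return author in authors and any(phrase in text for text in texts)


def matches_nested(objects, author, phrase):
    """nested type: both conditions must hold within the same object."""
    return any(o["author"] == author and phrase in o["text"] for o in objects)


print(matches_flattened(comments, "Alice", "great article"))  # True (false positive)
print(matches_nested(comments, "Alice", "great article"))     # False
```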

7.2 has_child / has_parent Queries

These queries search across parent-child relationships between documents, defined with the join field type.

// Mapping definition
PUT /company
{
  "mappings": {
    "properties": {
      "relation": {
        "type": "join",
        "relations": {
          "department": "employee"
        }
      },
      "name": { "type": "text" },
      "skill": { "type": "keyword" }
    }
  }
}

// Find departments that have employees with a given skill
GET /company/_search
{
  "query": {
    "has_child": {
      "type": "employee",
      "query": {
        "term": { "skill": "elasticsearch" }
      },
      "score_mode": "max",
      "min_children": 2,
      "max_children": 100,
      "inner_hits": {}
    }
  }
}

// Find employees that belong to a given department
GET /company/_search
{
  "query": {
    "has_parent": {
      "parent_type": "department",
      "query": {
        "match": { "name": "Engineering" }
      },
      "score": true
    }
  }
}

Performance note: joining queries are expensive. They cannot be run when search.allow_expensive_queries is false. Wherever possible, consider denormalization or joining on the application side instead.

8. Geo Queries

Geo queries search geo_point and geo_shape fields by spatial conditions.

8.1 geo_distance Query

Finds documents within a given distance of a center point.

GET /shops/_search
{
  "query": {
    "geo_distance": {
      "distance": "5km",
      "location": {
        "lat": 35.6812,
        "lon": 139.7671
      }
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": {
          "lat": 35.6812,
          "lon": 139.7671
        },
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}

8.2 geo_bounding_box Query

Finds documents inside a rectangular area (bounding box).

GET /shops/_search
{
  "query": {
    "geo_bounding_box": {
      "location": {
        "top_left": {
          "lat": 35.70,
          "lon": 139.70
        },
        "bottom_right": {
          "lat": 35.65,
          "lon": 139.80
        }
      }
    }
  }
}

8.3 geo_shape Query

Searches by spatial relationship to an arbitrary shape (polygons, lines, envelopes, and so on).

GET /regions/_search
{
  "query": {
    "geo_shape": {
      "boundary": {
        "shape": {
          "type": "polygon",
          "coordinates": [
            [
              [139.7, 35.65],
              [139.8, 35.65],
              [139.8, 35.70],
              [139.7, 35.70],
              [139.7, 35.65]
            ]
          ]
        },
        "relation": "within"
      }
    }
  }
}

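relation options:

Value	Description
`intersects`	The document's shape intersects the query shape (default)
`within`	The document's shape lies entirely inside the query shape
`contains`	The document's shape entirely contains the query shape
`disjoint`	The document's shape does not intersect the query shape

9. Specialized Queries

9.1 more_like_this Query

Finds documents similar to the given text or documents.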

GET /articles/_search
{
  "query": {
    "more_like_this": {
      "fields": ["title", "content"],
      "like": [
        {
          "_index": "articles",
          "_id": "1"
        },
        "Elasticsearch is a distributed search and analytics engine"
      ],
      "min_term_freq": 1,
      "min_doc_freq": 3,
      "max_query_terms": 25
    }
  }
}

9.2 script_score Query

Computes a custom score for each document with a script.

GET /products/_search
{
  "query": {
    "script_score": {
      "query": {
        "match": { "description": "wireless headphones" }
      },
      "script": {
        "source": "_score * doc['popularity'].value * Math.log(2 + doc['reviews_count'].value)"
      }
    }
  }
}

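9.3 percolate Query

Whereas a normal query "runs a query against documents", the percolate query does the reverse: it "runs a document against queries". It finds which of the previously registered queries match a given document.

// Create the percolator index (fields referenced by the stored queries, here message, must be mapped too)
PUT /alerts
{
  "mappings": {
    "properties": {
      "query": { "type": "percolator" },
      "message": { "type": "text" },
      "severity": { "type": "keyword" }
    }
  }
}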

// Register an alert rule (a query)
PUT /alerts/_doc/1
{
  "query": {
    "match": { "message": "critical error" }
  },
  "severity": "high"
}

// Find the alerts that match a document
GET /alerts/_search
{
  "query": {
    "percolate": {
      "field": "query",
      "document": {
        "message": "A critical error occurred in the payment service"
      }
    }
  }
}

9.4 pinned Query

Pins specific documents to the top of the search results.

GET /products/_search
{
  "query": {
    "pinned": {
      "ids": ["42", "100", "7"],
      "organic": {
        "match": {
          "name": "laptop"
        }
      }
    }
  }
}

9.5 rank_feature Query

Computes a score from fields of type rank_feature or rank_features.

GET /articles/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "content": "Elasticsearch" } }
      ],
      "should": [
        {
          "rank_feature": {
            "field": "pagerank",
            "boost": 2
          }
        },
        {
          "rank_feature": {
            "field": "url_length",
            "boost": 0.1,
            "log": {
              "scaling_factor": 4
            }
          }
        }
      ]
    }
  }
}

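9.6 knn (k-Nearest Neighbor) Query

A k-nearest-neighbor search based on similarity in vector space. It is the foundation of semantic search. Note that the knn query inside Query DSL was added in Elasticsearch 8.12, so it needs a newer release than the 8.11 baseline; earlier 8.x releases expose kNN search through the top-level knn option of the search API instead.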

GET /embeddings/_search
{
  "query": {
    "knn": {
      "field": "content_vector",
      "query_vector": [0.1, 0.2, 0.3, ...],
      "k": 10,
      "num_candidates": 100
    }
  }
}

Example of hybrid search (keyword search + vector search):

GET /articles/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "content": "Elasticsearch 検索エンジン" } }
      ],
      "should": [
        {
          "knn": {
            "field": "content_vector",
            "query_vector": [0.1, 0.2, 0.3, ...],
            "k": 10,
            "num_candidates": 100,
            "boost": 0.5
          }
        }
      ]
    }
  }
}

10. Aggregations

Aggregations compute statistics and analytics over the results of a Query DSL query. Being able to run search and aggregations together in a single request is one of Elasticsearch's major strengths.

10.1 The Three Categories of Aggregations

Category	Description	Examples
Metric	Computes statistics from field values	`avg`, `sum`, `min`, `max`, `stats`, `cardinality`, `percentiles`
Bucket	Sorts documents into groups	`terms`, `date_histogram`, `histogram`, `range`, `filter`, `filters`
Pipeline	Computes over the results of other aggregations	`avg_bucket`, `max_bucket`, `derivative`, `cumulative_sum`, `moving_fn`

10.2 Metric Aggregations

GET /orders/_search
{
  "size": 0,
  "query": {
    "range": {
      "order_date": {
        "gte": "2025-01-01",
        "lt": "2025-04-01"
      }
    }
  },
  "aggs": {
    "total_revenue": { "sum": { "field": "amount" } },
    "avg_order_value": { "avg": { "field": "amount" } },
    "max_order": { "max": { "field": "amount" } },
    "min_order": { "min": { "field": "amount" } },
    "order_stats": { "stats": { "field": "amount" } },
    "unique_customers": { "cardinality": { "field": "customer_id" } },
    "amount_percentiles": {
      "percentiles": {
        "field": "amount",
        "percents": [25, 50, 75, 90, 95, 99]
      }
    }
  }
}

10.3 Bucket Aggregations

terms aggregation

GET /logs/_search
{
  "size": 0,
  "aggs": {
    "status_codes": {
      "terms": {
        "field": "status_code",
        "size": 10,
        "order": { "_count": "desc" },
        "min_doc_count": 1
      }
    }
  }
}

date_histogram aggregation

GET /logs/_search
{
  "size": 0,
  "aggs": {
    "requests_over_time": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "1h",
        "time_zone": "Asia/Tokyo",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "2025-01-01T00:00:00",
          "max": "2025-01-01T23:59:59"
        }
      }
    }
  }
}

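Difference between calendar_interval and fixed_interval:

Parameter	Description	Examples
`calendar_interval`	Calendar-aware interval (lengths vary, for example months have different numbers of days)	`1M`, `1w`, `1d`, `1h`
`fixed_interval`	Interval of a fixed number of milliseconds	`30m`, `1h`, `7d`

Nested aggregations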

GET /orders/_search
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": {
        "field": "category",
        "size": 10
      },
      "aggs": {
        "monthly_sales": {
          "date_histogram": {
            "field": "order_date",
            "calendar_interval": "1M"
          },
          "aggs": {
            "revenue": { "sum": { "field": "amount" } },
            "avg_amount": { "avg": { "field": "amount" } }
          }
        },
        "total_revenue": { "sum": { "field": "amount" } }
      }
    }
  }
}

filter / filters aggregations

GET /logs/_search
{
  "size": 0,
  "aggs": {
    "error_types": {
      "filters": {
        "filters": {
          "client_errors": { "range": { "status_code": { "gte": 400, "lt": 500 } } },
          "server_errors": { "range": { "status_code": { "gte": 500, "lt": 600 } } },
          "success": { "range": { "status_code": { "gte": 200, "lt": 300 } } }
        }
      },
      "aggs": {
        "avg_response_time": { "avg": { "field": "response_time_ms" } }
      }
    }
  }
}

10.4 Pipeline Aggregations

GET /orders/_search
{
  "size": 0,
  "aggs": {
    "monthly_sales": {
      "date_histogram": {
        "field": "order_date",
        "calendar_interval": "1M"
      },
      "aggs": {
        "revenue": { "sum": { "field": "amount" } }
      }
    },
    "max_monthly_revenue": {
      "max_bucket": {
        "buckets_path": "monthly_sales>revenue"
      }
    },
    "avg_monthly_revenue": {
      "avg_bucket": {
        "buckets_path": "monthly_sales>revenue"
      }
    }
  }
}

derivative aggregation

GET /metrics/_search
{
  "size": 0,
  "aggs": {
    "daily_metrics": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "1d"
      },
      "aggs": {
        "total_requests": { "sum": { "field": "request_count" } },
        "request_growth": {
          "derivative": {
            "buckets_path": "total_requests"
          }
        }
      }
    }
  }
}
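
What the derivative aggregation computes can be shown on a plain list of bucket values. This is a sketch with made-up numbers, and the function name `derivative` is hypothetical. It assumes every bucket has a value (no gaps); the first bucket has no predecessor, so it gets no derivative.

```python
from typing import List, Optional


def derivative(bucket_values: List[float]) -> List[Optional[float]]:
    """First-order difference between consecutive histogram buckets."""
    if not bucket_values:
        return []
    result: List[Optional[float]] = [None]  # the first bucket has nothing to compare to
    for previous, current in zip(bucket_values, bucket_values[1:]):
        result.append(current - previous)
    return result


daily_total_requests = [100, 130, 120, 180]
print(derivative(daily_total_requests))  # [None, 30, -10, 60]
```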

11. Controlling Search Results

11.1 Sorting

GET /products/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "price": { "order": "asc", "missing": "_last" } },
    { "_score": { "order": "desc" } },
    { "created_at": { "order": "desc", "format": "strict_date_optional_time" } }
  ]
}

11.2 Pagination

from/size (shallow pagination)

GET /products/_search
{
  "from": 20,
  "size": 10,
  "query": { "match_all": {} }
}

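Note: from + size is capped at 10,000 by default (configurable through index.max_result_window). It is not suited to deep pagination.

search_after (deep pagination)

// First page (Elasticsearch discourages sorting on _id; a dedicated keyword field with doc values is a safer tiebreaker)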
GET /products/_search
{
  "size": 10,
  "query": { "match_all": {} },
  "sort": [
    { "created_at": "desc" },
    { "_id": "asc" }
  ]
}

// Next page (pass the sort values of the last document on the previous page)
GET /products/_search
{
  "size": 10,
  "query": { "match_all": {} },
  "sort": [
    { "created_at": "desc" },
    { "_id": "asc" }
  ],
  "search_after": ["2025-01-15T10:30:00Z", "abc123"]
}

Point in Time (PIT) + search_after

For consistent pagination, combine search_after with a PIT (Point in Time).

// Create the PIT
POST /products/_pit?keep_alive=5m

// Search with the PIT
GET /_search
{
  "size": 10,
  "query": { "match_all": {} },
  "pit": {
    "id": "<PIT ID returned by the request above>",
    "keep_alive": "5m"
  },
  "sort": [
    { "created_at": "desc" },
    { "_shard_doc": "asc" }
  ]
}
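
On the client side, search_after pagination is a simple loop: run the search, remember the sort values of the last hit, and pass them to the next request. The sketch below shows that loop under explicit assumptions: `search` is a hypothetical callable that sends a request body to the _search endpoint and returns the parsed JSON response, and PIT handling is left out for brevity. The demo at the bottom uses an in-memory stand-in instead of a real cluster.

```python
from typing import Any, Callable, Dict, Iterator, List

SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iterate_hits(
    search: SearchFn,
    query: Dict[str, Any],
    sort: List[Dict[str, Any]],
    page_size: int = 10,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit by following search_after from page to page."""
    body: Dict[str, Any] = {"size": page_size, "query": query, "sort": sort}
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # an empty page means the result set is exhausted
        yield from hits
        # The sort values of the last hit become the cursor for the next page.
        body = {**body, "search_after": hits[-1]["sort"]}


# In-memory stand-in for Elasticsearch, for demonstration only.
_docs = [{"_id": str(i), "sort": [i]} for i in range(5)]


def fake_search(body: Dict[str, Any]) -> Dict[str, Any]:
    after = body.get("search_after", [-1])[0]
    page = [d for d in _docs if d["sort"][0] > after][: body["size"]]
    return {"hits": {"hits": page}}


ids = [h["_id"] for h in iterate_hits(fake_search, {"match_all": {}}, [{"n": "asc"}], page_size=2)]
print(ids)  # ['0', '1', '2', '3', '4']
```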

11.3 Source Filtering

GET /products/_search
{
  "query": { "match_all": {} },
  "_source": {
    "includes": ["name", "price", "category"],
    "excludes": ["description", "internal_*"]
  }
}

11.4 Highlighting

GET /articles/_search
{
  "query": {
    "match": { "content": "Elasticsearch" }
  },
  "highlight": {
    "pre_tags": ["<mark>"],
    "post_tags": ["</mark>"],
    "fields": {
      "content": {
        "fragment_size": 150,
        "number_of_fragments": 3
      },
      "title": {}
    }
  }
}

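12. Performance Optimization

12.1 Managing Expensive Queries

The following queries are classified as "expensive" and call for care on large clusters.

Category	Query type	Reason
Linear scan	`script`	A script runs for every document
High up-front cost	`fuzzy`, `regexp`, `prefix`, `wildcard`	Automaton construction and scanning of all terms
Joining	`nested`, `has_child`, `has_parent`	Join processing between related documents
Per-document cost	`script_score`, `percolate`	Computation happens for each document
Range on text/keyword	`range` on text/keyword	No efficient index structure for it

// Disable expensive queries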
PUT /_cluster/settings
{
  "persistent": {
    "search.allow_expensive_queries": false
  }
}

12.2 Making the Most of Filter Context

// Bad: everything runs in query context
GET /logs/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "status": "error" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } },
        { "match": { "message": "timeout" } }
      ]
    }
  }
}

// Good: conditions that need no score run in filter context
GET /logs/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "message": "timeout" } }
      ],
      "filter": [
        { "term": { "status": "error" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ]
    }
  }
}
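
The rewrite from the bad example to the good one can be automated for simple cases. The sketch below is a hypothetical helper that moves term, terms, range, and exists clauses from must to filter. It assumes must and filter are given as lists, and it is only appropriate when those clauses are not meant to influence ranking, because a clause in must contributes to the score while a clause in filter does not.

```python
from typing import Any, Dict, List

NON_SCORING_TYPES = {"term", "terms", "range", "exists"}


def move_exact_clauses_to_filter(bool_query: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a bool query with yes/no clauses moved from must to filter."""
    scoring: List[Dict[str, Any]] = []
    filters: List[Dict[str, Any]] = list(bool_query.get("filter", []))
    for clause in bool_query.get("must", []):
        query_type = next(iter(clause))  # e.g. "term", "range", "match"
        target = filters if query_type in NON_SCORING_TYPES else scoring
        target.append(clause)
    result = {k: v for k, v in bool_query.items() if k not in ("must", "filter")}
    if scoring:
        result["must"] = scoring
    if filters:
        result["filter"] = filters
    return result


bad = {
    "must": [
        {"term": {"status": "error"}},
        {"range": {"@timestamp": {"gte": "now-1h"}}},
        {"match": {"message": "timeout"}},
    ]
}
print(move_exact_clauses_to_filter(bad))
```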

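12.3 Other Optimization Techniques

Technique	Description
`size: 0`	Return no documents when only aggregations are needed
`_source` filtering	Return only the fields that are needed
`search_after`	Use `search_after` instead of `from/size` for deep pagination
`terminate_after`	Stop searching a shard once it has collected the given number of documents
`track_total_hits: false`	Disable accurate total-hit tracking when it is not needed
Routing	Limit the shards searched with the `routing` parameter
Preference	Improve cache efficiency with the `preference` parameter
Index sorting	Specify a sort order at index creation to improve query performance

13. Practical Use Cases

13.1 Product Search for an E-Commerce Site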

GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "ワイヤレス ノイズキャンセリング イヤホン",
            "fields": ["name^3", "description", "brand^2"],
            "type": "best_fields",
            "fuzziness": "AUTO"
          }
        }
      ],
      "filter": [
        { "term": { "category": "audio" } },
        { "range": { "price": { "gte": 5000, "lte": 50000 } } },
        { "term": { "in_stock": true } },
        { "terms": { "color": ["black", "white"] } }
      ],
      "should": [
        { "term": { "featured": { "value": true, "boost": 5 } } },
        { "range": { "rating": { "gte": 4.0, "boost": 2 } } }
      ]
    }
  },
  "aggs": {
    "by_brand": {
      "terms": { "field": "brand", "size": 20 }
    },
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 5000 },
          { "from": 5000, "to": 10000 },
          { "from": 10000, "to": 30000 },
          { "from": 30000 }
        ]
      }
    },
    "avg_rating": { "avg": { "field": "rating" } }
  },
  "highlight": {
    "fields": {
      "name": {},
      "description": { "fragment_size": 200 }
    }
  },
  "sort": [
    "_score",
    { "rating": "desc" }
  ],
  "size": 20
}

13.2 Log Monitoring Dashboard

GET /logs-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "range": { "@timestamp": { "gte": "now-24h" } } }
      ]
    }
  },
  "aggs": {
    "error_timeline": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "15m"
      },
      "aggs": {
        "by_severity": {
          "terms": { "field": "log.level" }
        },
        "error_rate": {
          "filter": {
            "terms": { "log.level": ["error", "critical"] }
          }
        }
      }
    },
    "top_error_services": {
      "filter": {
        "terms": { "log.level": ["error", "critical"] }
      },
      "aggs": {
        "services": {
          "terms": { "field": "service.name", "size": 10 }
        }
      }
    },
    "response_time_percentiles": {
      "percentiles": {
        "field": "http.response.time_ms",
        "percents": [50, 90, 95, 99]
      }
    }
  }
}

13.3 Security Auditing

GET /security-events-*/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              { "match": { "event.action": "login_failed" } },
              { "match": { "event.action": "privilege_escalation" } },
              { "match": { "event.action": "unauthorized_access" } }
            ]
          }
        }
      ],
      "filter": [
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ]
    }
  },
  "aggs": {
    "by_source_ip": {
      "terms": { "field": "source.ip", "size": 20 },
      "aggs": {
        "event_types": {
          "terms": { "field": "event.action" }
        },
        "targeted_users": {
          "terms": { "field": "user.name" }
        }
      }
    }
  },
  "sort": [{ "@timestamp": "desc" }],
  "size": 100
}

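14. Summary

14.1 Strengths of Query DSL

Strength	Description
Comprehensive features	Covers full-text search, structured data, geospatial search, and vector search
Flexible composition	The AST structure supports anything from simple queries to very complex ones
Performance control	Fine-grained performance tuning through query and filter context
Unified search and analytics	Search and aggregations run together in a single request
Programmatic	JSON-based, so it can be used from any programming language

14.2 Learning Roadmap

Basics: fully understand the four queries match, term, range, and bool
Intermediate: learn when to use query context versus filter context, plus multi_match and function_score
Advanced: aggregations (Metric → Bucket → Pipeline), nested, knn, and performance optimization
In practice: design compound queries around real use cases, then benchmark and monitor them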