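OpenSearch: A Comprehensive Overview Guide

1. Introduction

OpenSearch is a distributed search and analytics engine built on Apache Lucene. It is an open-source project that covers a wide range of use cases, including log analytics, full-text search, security analytics, and observability. Amazon Web Services (AWS) started development in 2021 as a fork of Elasticsearch 7.10.2 and Kibana 7.10.2, and the project is developed in a community-driven way under the Apache License 2.0.

This article systematically covers the overall OpenSearch architecture, its main features, concrete configuration examples, and operational know-how. The intended readers are engineers evaluating OpenSearch, teams planning a migration from an existing Elasticsearch environment, and SREs and infrastructure engineers who operate and manage OpenSearch clusters.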

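1.1 Where OpenSearch Fits

Property	Description
License	Apache License 2.0
Underlying technology	Apache Lucene
Forked from	Elasticsearch 7.10.2 / Kibana 7.10.2
Main components	OpenSearch (engine), OpenSearch Dashboards (UI)
Development body	OpenSearch Project (started by AWS with community contributions; hosted by the OpenSearch Software Foundation under the Linux Foundation since 2024)
Supported interfaces	REST API over HTTP/HTTPS (including the Bulk API), SQL and PPL via plugin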

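1.2 Main Use Cases

OpenSearch is widely used for the following use cases:

Log analytics and aggregation: collecting, analyzing, and visualizing application and infrastructure logs
Full-text search: website search, document search, product catalog search
Security analytics (SIEM): correlation of security events, threat detection
Observability: unified management of metrics, traces, and logs
Vector search: semantic search with machine learning embeddings, RAG pipelines
Business intelligence: real-time dashboards, KPI monitoring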

2. History and Background

2.1 The Fork from Elasticsearch

In January 2021, Elastic announced that it would change the license of Elasticsearch and Kibana from the Apache License 2.0 to a dual license under the Server Side Public License (SSPL) and the Elastic License. In response, AWS forked Elasticsearch 7.10.2 (the last release under Apache 2.0) and launched the OpenSearch project.

2.2 Version History

Version	Release date	Key changes
1.0	July 2021	Initial release, based on Elasticsearch 7.10.2
1.1	October 2021	Cross-cluster replication, bucket-level alerting
1.2	November 2021	Transforms, observability improvements
1.3	March 2022	ML Commons, observability enhancements
2.0	May 2022	Lucene 9.1, document-level alerting
2.3	September 2022	Segment replication and remote-backed storage (experimental)
2.4	November 2022	Point-in-time search, searchable snapshots
2.9	July 2023	Conversational search, AI connectors
2.11	October 2023	Vector search improvements, remote model support
2.12	February 2024	Batch ingestion, disk-based k-NN
2.13	April 2024	Concurrent segment search GA
2.17	September 2024	Star-tree index, derived fields
2.19	February 2025	Pull-based ingestion, Workload Management GA

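2.3 Compatibility with Elasticsearch

OpenSearch maintains a high degree of compatibility with Elasticsearch 7.10.2:

REST API: compatible with most Elasticsearch 7.x APIs
Client libraries: in addition to the dedicated OpenSearch clients, Elasticsearch 7.x clients work in many cases (client versions 7.14 and later perform a product check and may refuse to connect to OpenSearch)
Plugins: Elasticsearch plugins are not directly compatible, but equivalent functionality is provided as OpenSearch plugins
Data migration: migration is possible with the reindex API or with snapshot/restore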

# opensearch.yml - Elasticsearch compatibility setting (OpenSearch 1.x only; removed in 2.0)
compatibility:
  override_main_response_version: true  # Report the version as 7.10.2 in API responses

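3. Architecture Overview

OpenSearch uses a distributed architecture in which multiple nodes form a cluster that stores and searches data. This section gives an overall picture of that architecture.

3.1 Overall Architecture Diagram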

                    ┌──────────────────────────────────────────────────┐
                    │              OpenSearch Cluster                   │
                    │                                                  │
  Client Request    │  ┌─────────────┐  ┌─────────────┐              │
  ───────────────►  │  │  Master     │  │  Master     │  Master      │
                    │  │  Node 1     │◄─┤  Node 2     │  eligible    │
  Load Balancer     │  │  (active)   │  │  (standby)  │  nodes       │
  ───────────────►  │  └──────┬──────┘  └─────────────┘              │
                    │         │                                       │
                    │  ┌──────┴──────────────────────────────┐       │
                    │  │         Coordinating Layer           │       │
                    │  │  ┌──────────┐  ┌──────────┐        │       │
                    │  │  │ Coord    │  │ Coord    │        │       │
                    │  │  │ Node 1   │  │ Node 2   │        │       │
                    │  │  └──────────┘  └──────────┘        │       │
                    │  └──────┬──────────────────────────────┘       │
                    │         │                                       │
                    │  ┌──────┴──────────────────────────────┐       │
                    │  │            Data Layer                │       │
                    │  │  ┌────────┐ ┌────────┐ ┌────────┐  │       │
                    │  │  │ Data   │ │ Data   │ │ Data   │  │       │
                    │  │  │ Node 1 │ │ Node 2 │ │ Node 3 │  │       │
                    │  │  │ P0 R1  │ │ P1 R2  │ │ P2 R0  │  │       │
                    │  │  └────────┘ └────────┘ └────────┘  │       │
                    │  └─────────────────────────────────────┘       │
                    │                                                  │
                    │  ┌─────────────────────────────────────┐       │
                    │  │          Ingest Layer                │       │
                    │  │  ┌────────┐ ┌────────┐              │       │
                    │  │  │ Ingest │ │ Ingest │              │       │
                    │  │  │ Node 1 │ │ Node 2 │              │       │
                    │  │  └────────┘ └────────┘              │       │
                    │  └─────────────────────────────────────┘       │
                    └──────────────────────────────────────────────────┘

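3.2 Node Types

An OpenSearch cluster is made up of several node types with different roles:

Node type	Role name	Description
Cluster manager (formerly master)	`cluster_manager`	Manages cluster state, creates and deletes indices, allocates shards
Data node	`data`	Stores data, executes search queries, performs indexing
Ingest node	`ingest`	Pre-processes documents (runs ingest pipelines)
Coordinating node	(no dedicated role; empty `node.roles`)	Routes requests and merges results
Remote store	(no dedicated role; enabled through `node.attr.remote_store.*` attributes)	Persists data to remote storage
ML node	`ml`	Runs machine learning models
Search node	`search`	Dedicated node that serves searchable snapshots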

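# opensearch.yml - node role examples
# Each block below belongs in the opensearch.yml of a different node.

# Dedicated cluster manager node
node.name: master-node-1
node.roles: [ cluster_manager ]

# Data node (also runs ingest pipelines)
node.name: data-node-1
node.roles: [ data, ingest ]

# Coordinating-only node (empty role list disables all roles)
node.name: coord-node-1
node.roles: [ ]

# Dedicated ML node
node.name: ml-node-1
node.roles: [ ml ]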

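3.3 Cluster State Management

The cluster manager node manages the following information:

Cluster metadata: index settings, mappings, aliases
Node information: the list of nodes that have joined the cluster
Shard allocation: which shards are placed on which nodes
Routing table: routing information for requests

# Check the cluster state (admin:admin and --insecure are for local demos only)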
curl -X GET "https://localhost:9200/_cluster/state?pretty" \
  -u admin:admin --insecure

# Check cluster health
curl -X GET "https://localhost:9200/_cluster/health?pretty" \
  -u admin:admin --insecure

# Example response
{
  "cluster_name": "opensearch-cluster",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 6,
  "number_of_data_nodes": 3,
  "discovered_master": true,
  "discovered_cluster_manager": true,
  "active_primary_shards": 15,
  "active_shards": 30,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100.0
}

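3.4 Cluster Manager Election

OpenSearch elects the cluster manager with the quorum-based cluster coordination mechanism inherited from Elasticsearch 7.x. It resembles Raft in that every decision requires votes from a majority (quorum) of the cluster-manager-eligible nodes. Because a majority is always required, two halves of a partitioned cluster cannot both elect a manager, which prevents split-brain. An odd number of cluster-manager-eligible nodes (typically three) is recommended, since adding a fourth node does not increase the number of failures the cluster can tolerate.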
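The majority arithmetic behind the odd-node recommendation can be sketched as follows. This is a minimal illustration only: the function names `quorum_size` and `tolerated_failures` are hypothetical helpers, not part of any OpenSearch API, and the sketch ignores how the real voting configuration is adjusted at runtime.

```python
def quorum_size(voting_nodes: int) -> int:
    """Smallest majority of the cluster-manager-eligible (voting) nodes."""
    if voting_nodes < 1:
        raise ValueError("at least one cluster-manager-eligible node is required")
    return voting_nodes // 2 + 1


def tolerated_failures(voting_nodes: int) -> int:
    """How many voting nodes can be lost while a majority still exists."""
    return voting_nodes - quorum_size(voting_nodes)


# Compare cluster sizes: note that 4 nodes tolerate no more failures than 3.
for n in (1, 2, 3, 4, 5):
    print(f"nodes={n} quorum={quorum_size(n)} tolerated_failures={tolerated_failures(n)}")
```

Going from three to four eligible nodes raises the quorum from two to three, so the cluster still survives only one failure; the extra node adds cost without adding fault tolerance.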

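# opensearch.yml - cluster manager election settings
# cluster.initial_cluster_manager_nodes is only used when bootstrapping a brand-new cluster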
cluster.name: production-cluster
cluster.initial_cluster_manager_nodes:
  - master-node-1
  - master-node-2
  - master-node-3

# Discovery settings
discovery.seed_hosts:
  - 10.0.1.10:9300
  - 10.0.1.11:9300
  - 10.0.1.12:9300
discovery.type: zen

# Election settings (the defaults are usually sufficient)
cluster.election.strategy: default
cluster.election.max_timeout: 10s

3.5 Network Communication

OpenSearch uses two network layers:

Layer	Default port	Purpose
HTTP (REST)	9200	API requests from clients
Transport	9300	Node-to-node communication inside the cluster

# opensearch.yml - network settings
network.host: 0.0.0.0   # Binds to all interfaces; restrict this in production
http.port: 9200
transport.port: 9300

# HTTP settings
http.max_content_length: 100mb
http.max_initial_line_length: 4kb
http.max_header_size: 8kb
http.compression: true

# Transport settings
transport.compress: true        # transport.tcp.compress is the deprecated legacy name
transport.tcp.keep_alive: true

4. Indices and Shards

4.1 Index Basics

An index is the logical unit of data storage in OpenSearch. It is roughly analogous to a table in an RDBMS and holds a collection of documents (records in JSON format).

# Create an index
curl -X PUT "https://localhost:9200/my-application-logs" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "index": {
        "number_of_shards": 3,
        "number_of_replicas": 1,
        "refresh_interval": "5s",
        "codec": "best_compression",
        "max_result_window": 10000
      }
    },
    "mappings": {
      "properties": {
        "timestamp": { "type": "date" },
        "level": { "type": "keyword" },
        "message": { "type": "text" },
        "service": { "type": "keyword" },
        "host": { "type": "keyword" },
        "response_time_ms": { "type": "integer" }
      }
    }
  }'

4.2 Sharding

Each index is split into one or more primary shards. Each shard is a complete Lucene index in its own right and is the physical unit of data storage.

Index: my-application-logs (3 primary shards, 1 replica)

  Node 1           Node 2           Node 3
  ┌──────────┐    ┌──────────┐    ┌──────────┐
  │ P0       │    │ P1       │    │ P2       │
  │ (Primary)│    │ (Primary)│    │ (Primary)│
  ├──────────┤    ├──────────┤    ├──────────┤
  │ R1       │    │ R2       │    │ R0       │
  │ (Replica)│    │ (Replica)│    │ (Replica)│
  └──────────┘    └──────────┘    └──────────┘

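  P = primary shard, R = replica shard
  R0 is the replica of P0, R1 is the replica of P1, and so on. A replica is never placed on the same node as its primary.

Shard Count Design Guidelines

Consideration	Recommendation
Shard size	10 GB to 50 GB per shard
Shard count limit	1,000 or fewer per data node (the default of `cluster.max_shards_per_node`)
Relation to heap	Roughly 20 to 25 shards or fewer per 1 GB of JVM heap (a rule of thumb)
Too few shards	Indexing throughput is limited
Too many shards	Bloated cluster state and memory pressure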
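As a rough illustration of how the guidelines above translate into numbers, the sketch below estimates a primary shard count from an expected data volume and checks it against a per-heap shard budget. The helper names, the 30 GB target shard size, and the default ratio of 20 shards per GB of heap are assumptions chosen for the example, not values prescribed by OpenSearch.

```python
import math


def estimate_primary_shards(total_data_gb: float, target_shard_gb: float = 30.0) -> int:
    """Number of primary shards needed to keep each shard near the target size."""
    if total_data_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(total_data_gb / target_shard_gb))


def max_shards_for_heap(heap_gb: float, shards_per_heap_gb: int = 20) -> int:
    """Upper bound on shards per node derived from the heap rule of thumb."""
    return int(heap_gb * shards_per_heap_gb)


# Example: 300 GB of primary data, data nodes with 16 GB of JVM heap.
primaries = estimate_primary_shards(300)
budget = max_shards_for_heap(16)
print(f"primary shards: {primaries}, per-node shard budget: {budget}")
```

Replica shards count toward the per-node budget as well, so the total to compare is primaries multiplied by (1 + number_of_replicas), summed over all indices on the node.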

# Check shard information
curl -X GET "https://localhost:9200/_cat/shards/my-application-logs?v" \
  -u admin:admin --insecure

# Example output
# index                shard prirep state    docs   store ip        node
# my-application-logs  0     p      STARTED  15234  25mb  10.0.1.10 data-node-1
# my-application-logs  0     r      STARTED  15234  25mb  10.0.1.12 data-node-3
# my-application-logs  1     p      STARTED  15189  24mb  10.0.1.11 data-node-2
# my-application-logs  1     r      STARTED  15189  24mb  10.0.1.10 data-node-1
# my-application-logs  2     p      STARTED  15077  24mb  10.0.1.12 data-node-3
# my-application-logs  2     r      STARTED  15077  24mb  10.0.1.11 data-node-2

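4.3 Segments

Each shard consists of multiple immutable Lucene segments. New documents are first written to an in-memory buffer and appended to the translog; each refresh then turns the buffer contents into a new searchable segment.

Internal structure of shard P0: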

  ┌──────────────────────────────────────┐
  │            Shard P0                   │
  │                                      │
  │  ┌──────────┐ ┌──────────┐          │
  │  │ Segment 0│ │ Segment 1│  ...     │
  │  │ (sealed) │ │ (sealed) │          │
  │  └──────────┘ └──────────┘          │
  │                                      │
  │  ┌──────────────────┐               │
  │  │ In-memory buffer │ ← new documents│
  │  └──────────────────┘               │
  │                                      │
  │  ┌──────────────────┐               │
  │  │  Translog        │ ← durability   │
  │  └──────────────────┘               │
  └──────────────────────────────────────┘

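4.4 Refresh and Flush

Operation	Default interval	Description
Refresh	1 second	Converts the contents of the in-memory buffer into a searchable segment
Flush	Conditional (for example, translog size)	Performs a Lucene commit that persists segments to disk, then trims the translog
Merge	Automatic	Combines small segments into larger ones and purges deleted documents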

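# Index settings - refresh/flush
index:
  refresh_interval: "5s"       # 5 to 30 seconds is common for logging workloads
  translog:
    durability: "request"      # fsync after every request (default)
    # durability: "async"      # fsync in the background (favors performance; recent writes can be lost on a crash)
    sync_interval: "5s"        # background fsync interval, relevant when durability is async
    flush_threshold_size: "512mb"

# Manual refresh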
curl -X POST "https://localhost:9200/my-application-logs/_refresh" \
  -u admin:admin --insecure

# Manual flush
curl -X POST "https://localhost:9200/my-application-logs/_flush" \
  -u admin:admin --insecure

# Force merge (consolidates segments; intended for indices that no longer receive writes)
curl -X POST "https://localhost:9200/my-application-logs/_forcemerge?max_num_segments=1" \
  -u admin:admin --insecure

4.5 Index Lifecycle Management (ISM)

Index State Management (ISM) in OpenSearch automates the lifecycle of an index.

# Create an ISM policy
curl -X PUT "https://localhost:9200/_plugins/_ism/policies/log-retention-policy" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "description": "Retention policy for log indices",
      "default_state": "hot",
      "states": [
        {
          "name": "hot",
          "actions": [
            {
              "rollover": {
                "min_index_age": "1d",
                "min_primary_shard_size": "30gb"
              }
            }
          ],
          "transitions": [
            {
              "state_name": "warm",
              "conditions": {
                "min_index_age": "3d"
              }
            }
          ]
        },
        {
          "name": "warm",
          "actions": [
            {
              "replica_count": {
                "number_of_replicas": 0
              }
            },
            {
              "force_merge": {
                "max_num_segments": 1
              }
            },
            {
              "read_only": {}
            }
          ],
          "transitions": [
            {
              "state_name": "cold",
              "conditions": {
                "min_index_age": "30d"
              }
            }
          ]
        },
        {
          "name": "cold",
          "actions": [
            {
              "snapshot": {
                "repository": "s3-backup-repo",
                "snapshot": "{{ctx.index}}-snapshot"
              }
            }
          ],
          "transitions": [
            {
              "state_name": "delete",
              "conditions": {
                "min_index_age": "90d"
              }
            }
          ]
        },
        {
          "name": "delete",
          "actions": [
            {
              "delete": {}
            }
          ],
          "transitions": []
        }
      ],
      "ism_template": [
        {
          "index_patterns": ["logs-*"],
          "priority": 100
        }
      ]
    }
  }'
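To make the state transitions of the policy above easier to follow, here is a small sketch that maps an index age to the state the policy would eventually place it in. It is a simplification under stated assumptions: only the `min_index_age` transition conditions (3d, 30d, 90d) are modeled, while rollover, action execution time, and the ISM job interval are ignored. The names `TRANSITIONS` and `state_for_age` are hypothetical.

```python
# state -> (next state, minimum index age in days), taken from the policy's transitions
TRANSITIONS = {
    "hot": ("warm", 3),
    "warm": ("cold", 30),
    "cold": ("delete", 90),
}


def state_for_age(age_days: float, default_state: str = "hot") -> str:
    """Follow transitions from the default state while their age condition is met."""
    state = default_state
    while state in TRANSITIONS:
        next_state, min_age_days = TRANSITIONS[state]
        if age_days < min_age_days:
            break
        state = next_state
    return state


for age in (0, 3, 29, 30, 90):
    print(f"age={age}d -> {state_for_age(age)}")
```

In a real cluster each transition is evaluated only after the actions of the current state have completed, so an index can remain in a state somewhat longer than the age thresholds suggest.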

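4.6 Index Templates

# Create a composable index template
# Prerequisite: the component template "common-settings" referenced by composed_of must already exist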
curl -X PUT "https://localhost:9200/_index_template/logs-template" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index_patterns": ["logs-*"],
    "priority": 200,
    "template": {
      "settings": {
        "index": {
          "number_of_shards": 3,
          "number_of_replicas": 1,
          "refresh_interval": "10s",
          "codec": "best_compression",
          "plugins": {
            "index_state_management": {
              "policy_id": "log-retention-policy",
              "rollover_alias": "logs"
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "level": { "type": "keyword" },
          "logger": { "type": "keyword" },
          "message": { "type": "text", "analyzer": "standard" },
          "service": { "type": "keyword" },
          "environment": { "type": "keyword" },
          "host": {
            "properties": {
              "name": { "type": "keyword" },
              "ip": { "type": "ip" }
            }
          },
          "trace_id": { "type": "keyword" },
          "span_id": { "type": "keyword" }
        },
        "dynamic_templates": [
          {
            "strings_as_keywords": {
              "match_mapping_type": "string",
              "mapping": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        ]
      },
      "aliases": {
        "logs": {}
      }
    },
    "composed_of": ["common-settings"]
  }'

4.7 Aliases and Rollover

# Create an alias and mark the index as its write target (the index must already exist)
curl -X POST "https://localhost:9200/_aliases" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "actions": [
      {
        "add": {
          "index": "logs-2024-01-01-000001",
          "alias": "logs",
          "is_write_index": true
        }
      }
    ]
  }'

# Trigger a rollover manually
curl -X POST "https://localhost:9200/logs/_rollover" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "conditions": {
      "max_age": "1d",
      "max_primary_shard_size": "30gb",
      "max_docs": 10000000
    }
  }'

5. Data Model and Mappings

5.1 Document Structure

An OpenSearch document is a JSON object, which can be flat or contain nested structures.

{
  "_index": "orders",
  "_id": "ord-20240115-001",
  "_source": {
    "order_id": "ord-20240115-001",
    "customer": {
      "id": "cust-1234",
      "name": "田中太郎",
      "email": "tanaka@example.com"
    },
    "items": [
      {
        "product_id": "prod-001",
        "name": "OpenSearchガイドブック",
        "quantity": 2,
        "price": 3500
      },
      {
        "product_id": "prod-002",
        "name": "データ分析入門",
        "quantity": 1,
        "price": 2800
      }
    ],
    "total_amount": 9800,
    "status": "shipped",
    "created_at": "2024-01-15T10:30:00Z",
    "tags": ["book", "technical"]
  }
}

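5.2 Field Types

Category	Type	Description
Text	`text`	For full-text search; processed by an analyzer
Text	`keyword`	For exact matching, aggregations, and sorting
Text	`match_only_text`	Lighter than `text` (does not store position information)
Numeric	`integer`, `long`	Integer types
Numeric	`float`, `double`	Floating-point types
Numeric	`half_float`, `scaled_float`	Reduced-precision floating point
Date	`date`	Date/time field (ISO 8601 and other formats)
Date	`date_nanos`	Date/time with nanosecond precision
Boolean	`boolean`	true/false
Binary	`binary`	Base64-encoded binary
Range	`integer_range`, `date_range`	Range values
IP	`ip`	IPv4/IPv6 addresses
Geo	`geo_point`	Latitude/longitude coordinates
Geo	`geo_shape`	Arbitrary geographic shapes
Structured	`object`	JSON object
Structured	`nested`	Array of objects, each indexed as a separate hidden document
Structured	`join`	Parent-child relationships
Structured	`flat_object`	Schemaless JSON object
Vector	`knn_vector`	For k-NN vector search
Token	`token_count`	Number of tokens
Other	`alias`	Alternate name for a field
Other	`rank_feature`	Numeric feature used for ranking scores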

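5.3 A Practical Mapping Example

# Index mapping for e-commerce
# Assumes the analysis-kuromoji plugin (kuromoji_* components) and the k-NN plugin are installed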
curl -X PUT "https://localhost:9200/products" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "index": { "knn": true },
      "analysis": {
        "analyzer": {
          "ja_analyzer": {
            "type": "custom",
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "kuromoji_baseform",
              "kuromoji_part_of_speech",
              "ja_stop",
              "kuromoji_stemmer",
              "lowercase"
            ]
          },
          "autocomplete_analyzer": {
            "type": "custom",
            "tokenizer": "autocomplete_tokenizer",
            "filter": ["lowercase"]
          },
          "autocomplete_search": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase"]
          }
        },
        "tokenizer": {
          "autocomplete_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 10,
            "token_chars": ["letter", "digit"]
          }
        },
        "filter": {
          "ja_stop": {
            "type": "ja_stop",
            "stopwords": "_japanese_"
          }
        }
      }
    },
    "mappings": {
      "properties": {
        "product_id": { "type": "keyword" },
        "name": {
          "type": "text",
          "analyzer": "ja_analyzer",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 },
            "autocomplete": {
              "type": "text",
              "analyzer": "autocomplete_analyzer",
              "search_analyzer": "autocomplete_search"
            }
          }
        },
        "description": {
          "type": "text",
          "analyzer": "ja_analyzer"
        },
        "category": {
          "type": "keyword",
          "fields": {
            "text": { "type": "text" }
          }
        },
        "price": { "type": "scaled_float", "scaling_factor": 100 },
        "stock": { "type": "integer" },
        "brand": { "type": "keyword" },
        "tags": { "type": "keyword" },
        "rating": { "type": "half_float" },
        "review_count": { "type": "integer" },
        "location": { "type": "geo_point" },
        "attributes": { "type": "flat_object" },
        "created_at": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        },
        "updated_at": { "type": "date" },
        "embedding": {
          "type": "knn_vector",
          "dimension": 768,
          "method": {
            "name": "hnsw",
            "space_type": "cosinesimil",
            "engine": "nmslib",
            "parameters": {
              "ef_construction": 128,
              "m": 16
            }
          }
        }
      }
    }
  }'
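The `autocomplete_tokenizer` in the mapping above uses `edge_ngram` with `min_gram` 2, `max_gram` 10, and `token_chars` limited to letters and digits, followed by a `lowercase` filter. The sketch below approximates what such an analyzer emits so the effect on index size and prefix matching is easier to picture. It is an approximation written for illustration, not the Lucene implementation, and `edge_ngrams` is a hypothetical helper name.

```python
import re


def edge_ngrams(text: str, min_gram: int = 2, max_gram: int = 10) -> list[str]:
    """Approximate edge_ngram tokenization followed by a lowercase filter."""
    tokens = []
    # Split on anything that is not a letter or digit (mirrors token_chars).
    for word in re.findall(r"[^\W_]+", text):
        # Emit every prefix whose length lies between min_gram and max_gram.
        for length in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:length].lower())
    return tokens


print(edge_ngrams("Open Search"))
```

Because the index stores every prefix while the search side uses the plain `autocomplete_search` analyzer, a partially typed query such as "sear" matches without needing a wildcard or prefix query at search time.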

5.4 Dynamic Mapping

# Controlling dynamic mapping
curl -X PUT "https://localhost:9200/dynamic-example" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "known_field": { "type": "keyword" },
        "metadata": {
          "type": "object",
          "dynamic": true
        },
        "payload": {
          "type": "object",
          "dynamic": false
        }
      },
      "dynamic_templates": [
        {
          "longs_as_integers": {
            "match_mapping_type": "long",
            "mapping": { "type": "integer" }
          }
        },
        {
          "string_fields": {
            "match": "*_name",
            "mapping": {
              "type": "text",
              "fields": {
                "keyword": { "type": "keyword", "ignore_above": 256 }
              }
            }
          }
        },
        {
          "unmatched_strings": {
            "match_mapping_type": "string",
            "mapping": { "type": "keyword" }
          }
        }
      ]
    }
  }'

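5.5 Multi-fields

Multi-fields index a single source field in several ways, for example with different analyzers. The `ja_analyzer` and `ngram_analyzer` referenced below are assumed to be defined in the analysis settings of the index: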

{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ja_analyzer",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          },
          "english": {
            "type": "text",
            "analyzer": "english"
          },
          "ngram": {
            "type": "text",
            "analyzer": "ngram_analyzer"
          }
        }
      }
    }
  }
}

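5.6 Nested vs. Object

# Object type (flattened internally)
# {"items": [{"name": "A", "qty": 1}, {"name": "B", "qty": 2}]}
# → items.name: ["A", "B"], items.qty: [1, 2]
# → Cross-matching problem: a query for name=A AND qty=2 matches, although no single item has both values

# Nested type (each array element is stored as a separate hidden document)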
curl -X PUT "https://localhost:9200/orders" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "mappings": {
      "properties": {
        "order_id": { "type": "keyword" },
        "items": {
          "type": "nested",
          "properties": {
            "name": { "type": "text" },
            "quantity": { "type": "integer" },
            "price": { "type": "float" }
          }
        }
      }
    }
  }'

# Nested query
curl -X GET "https://localhost:9200/orders/_search" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "nested": {
        "path": "items",
        "query": {
          "bool": {
            "must": [
              { "match": { "items.name": "OpenSearch" } },
              { "range": { "items.price": { "gte": 3000 } } }
            ]
          }
        }
      }
    }
  }'
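The cross-matching problem described for the object type can be reproduced in a few lines. The sketch below models the flattening with plain Python dictionaries; `flatten`, `object_match`, and `nested_match` are hypothetical names used only for this illustration and do not correspond to OpenSearch internals.

```python
ITEMS = [{"name": "A", "qty": 1}, {"name": "B", "qty": 2}]


def flatten(objs: list[dict], prefix: str = "items") -> dict:
    """Model the object type: values of each sub-field are merged into one array."""
    flat: dict[str, list] = {}
    for obj in objs:
        for key, value in obj.items():
            flat.setdefault(f"{prefix}.{key}", []).append(value)
    return flat


def object_match(objs: list[dict], name: str, qty: int) -> bool:
    """Conditions are checked against the merged arrays, independent of each other."""
    flat = flatten(objs)
    return name in flat.get("items.name", []) and qty in flat.get("items.qty", [])


def nested_match(objs: list[dict], name: str, qty: int) -> bool:
    """Model the nested type: both conditions must hold within the same element."""
    return any(obj["name"] == name and obj["qty"] == qty for obj in objs)


print(flatten(ITEMS))
print("object:", object_match(ITEMS, "A", 2))
print("nested:", nested_match(ITEMS, "A", 2))
```

The object-style check reports a match for name=A and qty=2 because the association between the two values is lost once the arrays are merged, which is exactly what the `nested` type and `nested` query avoid.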

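6. How Indexing and Search Work

6.1 Indexing Flow

The internal flow when a document is indexed is as follows:

1. The client sends the document with POST/PUT
         │
2. A coordinating node receives the request
   (if an ingest pipeline applies, it runs on an ingest node at this point)
         │
3. The routing value (by default the document ID, supplied or auto-generated) determines the target shard
   shard_num = hash(_routing) % num_primary_shards
         │
4. The request is forwarded to the data node that holds that primary shard
         │
5. Processing on the primary shard:
   a. Validation (mapping check)
   b. Write to the Lucene index
   c. Write to the translog (durability guarantee)
         │
6. The operation is forwarded to the replica shards in parallel
         │
7. The primary waits for acknowledgements from the in-sync replicas
         │
8. The response is returned to the client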
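The routing formula in step 3 can be sketched as below. OpenSearch uses a Murmur3 hash internally; this example substitutes `zlib.crc32` from the Python standard library purely as a stand-in, so the shard numbers it prints will not match a real cluster. `shard_for` is a hypothetical helper name.

```python
import zlib


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Pick a primary shard: hash(_routing) % num_primary_shards.

    crc32 is only a stand-in for the Murmur3 hash used by OpenSearch.
    """
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


# The same routing value always maps to the same shard...
for doc_id in ("doc-1", "doc-2", "doc-3"):
    print(doc_id, "->", shard_for(doc_id, 3))

# ...but only as long as the number of primary shards stays the same.
moved = sum(shard_for(f"doc-{i}", 3) != shard_for(f"doc-{i}", 4) for i in range(1000))
print(f"{moved} of 1000 documents would map to a different shard with 4 primaries")
```

This dependence on the shard count is why the number of primary shards is fixed when an index is created and can only be changed by reindexing or by the split and shrink operations.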

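6.2 Batch Indexing with the Bulk API

# Batch ingestion with the Bulk API
# --data-binary sends the NDJSON body unmodified; the body must end with a newline
curl -X POST "https://localhost:9200/_bulk" \
  -u admin:admin --insecure \
  -H "Content-Type: application/x-ndjson" \
  --data-binary '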
{ "index": { "_index": "logs-2024-01-15", "_id": "1" } }
{ "@timestamp": "2024-01-15T10:00:00Z", "level": "INFO", "message": "Application started", "service": "api-gateway" }
{ "index": { "_index": "logs-2024-01-15", "_id": "2" } }
{ "@timestamp": "2024-01-15T10:00:01Z", "level": "ERROR", "message": "Connection timeout", "service": "payment-service" }
{ "index": { "_index": "logs-2024-01-15", "_id": "3" } }
{ "@timestamp": "2024-01-15T10:00:02Z", "level": "WARN", "message": "High memory usage", "service": "recommendation-engine" }
'

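6.3 Tuning Settings for Bulk Indexing

# Temporary tuning before a large bulk load (zero replicas removes redundancy, so use it only when the data can be reloaded)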
curl -X PUT "https://localhost:9200/logs-2024-01-15/_settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index": {
      "refresh_interval": "-1",
      "number_of_replicas": 0,
      "translog.durability": "async",
      "translog.sync_interval": "30s"
    }
  }'

# After the bulk load finishes, restore the settings
curl -X PUT "https://localhost:9200/logs-2024-01-15/_settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index": {
      "refresh_interval": "5s",
      "number_of_replicas": 1,
      "translog.durability": "request"
    }
  }'

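6.4 Internal Search Flow

1. The client sends a search query
         │
2. A coordinating node receives the request
         │
3. Query phase (scatter):
   The query is sent in parallel to one copy (primary or replica)
   of every shard of the target indices
         │
4. Processing on each shard:
   a. Search the Lucene index
   b. Score locally
   c. Return the document IDs and scores of the top N hits
         │
5. Reduction on the coordinating node:
   Merge and sort the results from each shard
   Determine the global top N
         │
6. Fetch phase (gather):
   Retrieve the requested fields (by default _source) of the top N documents
   from the shards that hold them
         │
7. The response is returned to the client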
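The query phase and the reduction on the coordinating node (steps 3 to 5) amount to a per-shard top-N followed by a global top-N. The sketch below shows that idea with in-memory lists; the shard contents, scores, and the function names `query_phase` and `merge_top_n` are made up for illustration.

```python
import heapq

# Hypothetical per-shard hits as (doc_id, score) pairs.
SHARD_HITS = [
    [("a", 1.2), ("b", 0.4), ("f", 0.1)],
    [("c", 2.5), ("d", 0.9)],
    [("e", 1.7)],
]


def query_phase(shard_hits, size):
    """Each shard returns only its own top `size` hits (IDs and scores, no documents)."""
    return [heapq.nlargest(size, hits, key=lambda hit: hit[1]) for hits in shard_hits]


def merge_top_n(per_shard_top, size):
    """The coordinating node merges the shard results into the global top `size`."""
    merged = [hit for hits in per_shard_top for hit in hits]
    return heapq.nlargest(size, merged, key=lambda hit: hit[1])


top = merge_top_n(query_phase(SHARD_HITS, 2), 2)
print(top)
```

Each shard has to return `from + size` candidates because any of them could end up in the global result, which is one reason deep pagination with large `from` values becomes expensive.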

6.5 Search Types and Preference

# Search preference settings
# Prefer shard copies on the local node
curl -X GET "https://localhost:9200/logs-*/_search?preference=_local" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{ "query": { "match_all": {} } }'

# Restrict the search to specific nodes
curl -X GET "https://localhost:9200/logs-*/_search?preference=_only_nodes:data-node-1,data-node-2" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{ "query": { "match_all": {} } }'

# Search with custom routing (only useful if the documents were indexed with the same routing value)
curl -X GET "https://localhost:9200/logs-*/_search?routing=service-a" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{ "query": { "match": { "service": "service-a" } } }'

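6.6 Segment Replication

Segment replication, introduced in OpenSearch 2.x (experimental in 2.3, generally available from 2.7), is an alternative to traditional document replication. Instead of indexing every document again on each replica, the segment files of the primary shard are copied directly to the replicas.

# opensearch.yml - make segment replication the cluster-wide default for newly created indices
cluster.indices.replication.strategy: SEGMENT

# Index-level setting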
curl -X PUT "https://localhost:9200/my-index" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "index": {
        "replication.type": "SEGMENT",
        "number_of_shards": 3,
        "number_of_replicas": 2
      }
    }
  }'

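Benefits of segment replication:

Lower CPU and memory usage on replica shards, because replicas no longer index each document themselves
Higher indexing throughput (no re-indexing on replicas)
Replicas serve exactly the segments built by the primary, although they can briefly lag behind it until the latest segments are copied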

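6.7 Remote-Backed Storage

# opensearch.yml - remote store settings
node.attr.remote_store.segment.repository: my-s3-repo
node.attr.remote_store.translog.repository: my-s3-repo
node.attr.remote_store.state.repository: my-s3-repo

# The repository itself is also declared through node attributes
# (assumes the repository-s3 plugin is installed)
node.attr.remote_store.repository.my-s3-repo.type: s3
node.attr.remote_store.repository.my-s3-repo.settings.bucket: opensearch-remote-store
node.attr.remote_store.repository.my-s3-repo.settings.region: us-west-2
node.attr.remote_store.repository.my-s3-repo.settings.base_path: remote-store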

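7. Query DSL

The OpenSearch Query DSL (Domain Specific Language) is a JSON-based query language that covers a wide range of search requirements, from full-text search to structured queries.

7.1 Query Context and Filter Context

Context	Scoring	Caching	Use
Query	Yes	No	Full-text search; when a relevance score is needed
Filter	No	Yes	Exact matches, range filters, yes/no decisions

# Combining query and filter clauses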
curl -X GET "https://localhost:9200/products/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "match": { "name": "OpenSearch 入門" } }
        ],
        "filter": [
          { "term": { "category": "技術書" } },
          { "range": { "price": { "gte": 1000, "lte": 5000 } } },
          { "range": { "stock": { "gte": 1 } } }
        ],
        "should": [
          { "match": { "description": "初心者向け" } }
        ],
        "minimum_should_match": 0,
        "must_not": [
          { "term": { "status": "discontinued" } }
        ]
      }
    },
    "from": 0,
    "size": 20,
    "sort": [
      { "_score": { "order": "desc" } },
      { "created_at": { "order": "desc" } }
    ],
    "_source": ["name", "price", "category", "rating"],
    "highlight": {
      "fields": {
        "name": { "number_of_fragments": 1 },
        "description": { "number_of_fragments": 3, "fragment_size": 150 }
      },
      "pre_tags": ["<em>"],
      "post_tags": ["</em>"]
    }
  }'

7.2 Full-Text Queries

# match query (standard full-text search)
{ "match": { "message": { "query": "connection timeout error", "operator": "and" } } }

# match_phrase query (phrase search)
{ "match_phrase": { "message": { "query": "connection timeout", "slop": 2 } } }

# match_phrase_prefix query (phrase search with a prefix match on the last term)
{ "match_phrase_prefix": { "name": { "query": "open sear", "max_expansions": 50 } } }

# multi_match query (search across multiple fields)
{
  "multi_match": {
    "query": "OpenSearch チュートリアル",
    "fields": ["name^3", "description^2", "tags"],
    "type": "best_fields",
    "tie_breaker": 0.3,
    "fuzziness": "AUTO"
  }
}

# query_string query (Lucene syntax; invalid syntax returns an error)
{
  "query_string": {
    "query": "(level:ERROR OR level:WARN) AND service:payment*",
    "default_field": "message",
    "analyze_wildcard": true
  }
}

# simple_query_string query (tolerates invalid syntax, suited to raw user input)
{
  "simple_query_string": {
    "query": "\"connection timeout\" + error -warning",
    "fields": ["message", "exception"],
    "default_operator": "AND"
  }
}

7.3 Term-Level Queries

# term query (exact match)
{ "term": { "status": "active" } }

# terms query (OR over multiple values)
{ "terms": { "level": ["ERROR", "CRITICAL", "FATAL"] } }

# range query
{
  "range": {
    "@timestamp": {
      "gte": "2024-01-01T00:00:00Z",
      "lt": "2024-02-01T00:00:00Z",
      "format": "strict_date_optional_time",
      "time_zone": "+09:00"
    }
  }
}

# prefix query
{ "prefix": { "service": { "value": "payment-" } } }

# wildcard query
{ "wildcard": { "host": { "value": "web-server-*" } } }

# regexp query
{ "regexp": { "error_code": { "value": "E[0-9]{4}", "flags": "ALL" } } }

# exists query (checks that a field has a value)
{ "exists": { "field": "error_stack_trace" } }

# fuzzy query (approximate matching)
{ "fuzzy": { "name": { "value": "opensarch", "fuzziness": "AUTO" } } }

# ids query
{ "ids": { "values": ["doc-1", "doc-2", "doc-3"] } }

7.4 Compound Queries

# A practical bool query
curl -X GET "https://localhost:9200/logs-*/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "bool": {
        "must": [
          {
            "match": {
              "message": {
                "query": "timeout exception",
                "operator": "and"
              }
            }
          }
        ],
        "filter": [
          { "term": { "level": "ERROR" } },
          {
            "range": {
              "@timestamp": {
                "gte": "now-24h",
                "lte": "now"
              }
            }
          },
          { "terms": { "service": ["payment-service", "order-service"] } }
        ],
        "should": [
          { "term": { "environment": { "value": "production", "boost": 2.0 } } },
          { "match": { "exception": "java.net.SocketTimeoutException" } }
        ],
        "must_not": [
          { "term": { "host": "canary-server" } }
        ],
        "minimum_should_match": 1
      }
    }
  }'

# boosting query (negative boost)
{
  "boosting": {
    "positive": { "match": { "name": "OpenSearch guide" } },
    "negative": { "term": { "status": "draft" } },
    "negative_boost": 0.5
  }
}

# function_score query (custom scoring)
{
  "function_score": {
    "query": { "match": { "name": "OpenSearch" } },
    "functions": [
      {
        "field_value_factor": {
          "field": "rating",
          "modifier": "log1p",
          "factor": 2
        }
      },
      {
        "gauss": {
          "created_at": {
            "origin": "now",
            "scale": "30d",
            "offset": "7d",
            "decay": 0.5
          }
        }
      },
      {
        "script_score": {
          "script": {
            "source": "_score * doc[\"review_count\"].value / 100"
          }
        }
      }
    ],
    "score_mode": "multiply",
    "boost_mode": "multiply",
    "max_boost": 10
  }
}
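To show what the `gauss` function in the example above does to a score, the sketch below evaluates the Gaussian decay formula for the configured `scale` (30d), `offset` (7d), and `decay` (0.5), with the distance from `origin` expressed in days. It is a numerical illustration of the documented formula; `gauss_decay` is a hypothetical helper and is not how the function is invoked in OpenSearch.

```python
import math


def gauss_decay(distance_days: float, scale_days: float = 30.0,
                offset_days: float = 7.0, decay: float = 0.5) -> float:
    """Gaussian decay multiplier for a document `distance_days` away from the origin."""
    # sigma^2 is chosen so that the multiplier equals `decay` at offset + scale.
    sigma_sq = -(scale_days ** 2) / (2.0 * math.log(decay))
    # Documents within the offset are not penalized at all.
    effective = max(0.0, abs(distance_days) - offset_days)
    return math.exp(-(effective ** 2) / (2.0 * sigma_sq))


for days in (0, 7, 37, 67):
    print(f"{days:>3} days old -> multiplier {gauss_decay(days):.4f}")
```

With these parameters a product created within the last 7 days keeps its full score, and one created 37 days ago (offset plus scale) has its score halved before the other functions are combined according to `score_mode`.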

7.5 Searching with SQL

OpenSearch also supports searching with SQL syntax through the SQL plugin.

# Run an SQL query
curl -X POST "https://localhost:9200/_plugins/_sql" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT service, level, COUNT(*) as error_count FROM logs-* WHERE level = '\''ERROR'\'' AND @timestamp > DATE_SUB(NOW(), INTERVAL 1 HOUR) GROUP BY service, level ORDER BY error_count DESC LIMIT 20"
  }'

# Show how an SQL query is translated into a DSL query (explain)
curl -X POST "https://localhost:9200/_plugins/_sql/_explain" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT * FROM products WHERE price > 1000 AND category = '\''技術書'\'' ORDER BY rating DESC"
  }'

# PPL (Piped Processing Language)
curl -X POST "https://localhost:9200/_plugins/_ppl" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "query": "source=logs-* | where level='\''ERROR'\'' | stats count() by service | sort - count()"
  }'

7.6 Point in Time (PIT) search

# Create a PIT
curl -X POST "https://localhost:9200/logs-*/_search/point_in_time?keep_alive=5m" \
  -u admin:admin --insecure

# Search with the PIT (paging); search_after carries the sort values of the last hit on the previous page
curl -X GET "https://localhost:9200/_search" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 100,
    "query": { "match": { "level": "ERROR" } },
    "pit": {
      "id": "xxxxPITIDxxxx",
      "keep_alive": "5m"
    },
    "sort": [
      { "@timestamp": { "order": "desc" } },
      { "_id": "asc" }
    ],
    "search_after": ["2024-01-15T10:00:00Z", "doc-100"]
  }'
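The paging loop implied by the PIT request above can be sketched as follows. This is a minimal sketch: `search` stands for any callable that sends the body to `_search` and returns the parsed JSON response (a hypothetical transport, not a specific client API), and the query and sort mirror the example.

```python
def build_pit_page(pit_id, search_after=None, size=100, keep_alive="5m"):
    """Build the request body for one page of a PIT search."""
    body = {
        "size": size,
        "query": {"match": {"level": "ERROR"}},
        "pit": {"id": pit_id, "keep_alive": keep_alive},
        "sort": [{"@timestamp": {"order": "desc"}}, {"_id": "asc"}],
    }
    if search_after is not None:
        body["search_after"] = search_after
    return body


def iterate_hits(search, pit_id, size=100):
    """Yield every hit, feeding the last hit's sort values into the next page."""
    search_after = None
    while True:
        response = search(build_pit_page(pit_id, search_after, size))
        hits = response["hits"]["hits"]
        if not hits:
            return
        yield from hits
        search_after = hits[-1]["sort"]
        # Carry the PIT id forward if the response returns one.
        pit_id = response.get("pit_id", pit_id)
```

Taking `search_after` from the `sort` array of the last hit, rather than building it by hand, keeps the values in exactly the form the cluster expects.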

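8. Aggregations

Aggregations compute statistics and analytics over search results, and they form the foundation of BI dashboards and analytical reports.

8.1 Types of aggregations

Category	Description	Examples
Metric	Numeric calculations	avg, sum, min, max, cardinality, percentiles
Bucket	Grouping documents	terms, date_histogram, range, filters
Pipeline	Secondary processing of aggregation results	derivative, moving_avg, cumulative_sum
Matrix	Statistics across multiple fields	matrix_stats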

8.2 Metric aggregations

curl -X GET "https://localhost:9200/orders/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 0,
    "query": {
      "range": {
        "created_at": { "gte": "now-30d" }
      }
    },
    "aggs": {
      "total_revenue": { "sum": { "field": "total_amount" } },
      "average_order_value": { "avg": { "field": "total_amount" } },
      "max_order": { "max": { "field": "total_amount" } },
      "min_order": { "min": { "field": "total_amount" } },
      "order_count": { "value_count": { "field": "order_id" } },
      "unique_customers": { "cardinality": { "field": "customer.id", "precision_threshold": 1000 } },
      "response_time_percentiles": {
        "percentiles": {
          "field": "response_time_ms",
          "percents": [50, 75, 90, 95, 99, 99.9]
        }
      },
      "response_time_stats": {
        "extended_stats": { "field": "response_time_ms" }
      },
      "revenue_median_absolute_deviation": {
        "median_absolute_deviation": { "field": "total_amount" }
      }
    }
  }'

8.3 Bucket aggregations

# Combined bucket aggregations
curl -X GET "https://localhost:9200/logs-*/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 0,
    "aggs": {
      "errors_by_service": {
        "terms": {
          "field": "service",
          "size": 20,
          "order": { "error_count": "desc" }
        },
        "aggs": {
          "error_count": {
            "filter": { "term": { "level": "ERROR" } },
            "aggs": {
              "count": { "value_count": { "field": "_id" } }
            }
          },
          "error_rate_over_time": {
            "date_histogram": {
              "field": "@timestamp",
              "fixed_interval": "1h",
              "min_doc_count": 0,
              "extended_bounds": {
                "min": "now-24h",
                "max": "now"
              }
            },
            "aggs": {
              "errors": {
                "filter": { "term": { "level": "ERROR" } }
              },
              "total": { "value_count": { "field": "_id" } },
              "error_percentage": {
                "bucket_script": {
                  "buckets_path": {
                    "errors": "errors._count",
                    "total": "total"
                  },
                  "script": "params.total > 0 ? (params.errors / params.total) * 100 : 0"
                }
              }
            }
          },
          "top_error_messages": {
            "terms": {
              "field": "message.keyword",
              "size": 5
            }
          }
        }
      },
      "errors_by_time": {
        "date_histogram": {
          "field": "@timestamp",
          "calendar_interval": "1d",
          "time_zone": "Asia/Tokyo",
          "format": "yyyy-MM-dd"
        },
        "aggs": {
          "level_breakdown": {
            "terms": { "field": "level" }
          }
        }
      },
      "response_time_ranges": {
        "range": {
          "field": "response_time_ms",
          "ranges": [
            { "key": "fast", "to": 100 },
            { "key": "normal", "from": 100, "to": 500 },
            { "key": "slow", "from": 500, "to": 1000 },
            { "key": "very_slow", "from": 1000 }
          ]
        }
      }
    }
  }'
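Reading the nested result of the bucket aggregation above takes a little care, so here is a minimal sketch of walking the response. The aggregation names are the ones used in the request; the helper name is hypothetical, and the response is assumed to be the already-parsed JSON body.

```python
def hourly_error_percentages(response):
    """Flatten service -> hour -> error_percentage into a list of tuples."""
    rows = []
    services = response["aggregations"]["errors_by_service"]["buckets"]
    for service in services:
        for hour in service["error_rate_over_time"]["buckets"]:
            # bucket_script results appear as {"value": ...}; the key can be
            # missing for a bucket where the script was skipped.
            pct = hour.get("error_percentage", {}).get("value")
            rows.append((service["key"], hour["key_as_string"], pct))
    return rows
```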

8.4 Pipeline aggregations

# Pipeline aggregations (moving average, derivative)
curl -X GET "https://localhost:9200/metrics-*/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 0,
    "aggs": {
      "requests_over_time": {
        "date_histogram": {
          "field": "@timestamp",
          "fixed_interval": "5m"
        },
        "aggs": {
          "avg_latency": { "avg": { "field": "latency_ms" } },
          "request_count": { "value_count": { "field": "_id" } },
          "latency_moving_avg": {
            "moving_avg": {
              "buckets_path": "avg_latency",
              "window": 24,
              "model": "holt_winters",
              "settings": {
                "type": "add",
                "alpha": 0.3,
                "beta": 0.1,
                "gamma": 0.3,
                "period": 12
              }
            }
          },
          "request_rate_derivative": {
            "derivative": {
              "buckets_path": "request_count"
            }
          },
          "cumulative_requests": {
            "cumulative_sum": {
              "buckets_path": "request_count"
            }
          },
          "p99_latency": {
            "percentiles": {
              "field": "latency_ms",
              "percents": [99]
            }
          }
        }
      },
      "max_latency_bucket": {
        "max_bucket": {
          "buckets_path": "requests_over_time>avg_latency"
        }
      }
    }
  }'
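What `derivative` and `cumulative_sum` compute over the per-bucket `request_count` values can be sketched in plain Python. This is an illustration of the semantics only, assuming a gap-free series; the function names are hypothetical.

```python
from itertools import accumulate


def derivative(values):
    """Difference between consecutive buckets; the first bucket has no value."""
    if not values:
        return []
    return [None] + [curr - prev for prev, curr in zip(values, values[1:])]


def cumulative_sum(values):
    """Running total across the buckets, in histogram order."""
    return list(accumulate(values))
```

For request counts of 10, 15 and 12 in three consecutive 5-minute buckets, the derivative is undefined for the first bucket and then 5 and -3, while the cumulative sum is 10, 25 and 37.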

8.5 Composite aggregations (with pagination)

# Fetch a large number of buckets page by page
curl -X GET "https://localhost:9200/logs-*/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 0,
    "aggs": {
      "service_level_combo": {
        "composite": {
          "size": 100,
          "sources": [
            { "service": { "terms": { "field": "service" } } },
            { "level": { "terms": { "field": "level" } } },
            { "date": { "date_histogram": { "field": "@timestamp", "calendar_interval": "1d" } } }
          ]
        },
        "aggs": {
          "avg_response_time": { "avg": { "field": "response_time_ms" } }
        }
      }
    }
  }'

# Fetch the next page: copy after_key from the previous response into "after" inside "composite"
# "after": { "service": "payment-service", "level": "INFO", "date": 1705276800000 }
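The `after_key` loop can be sketched as a generator. As a minimal sketch, `search` is again a hypothetical callable that posts the body to `_search` and returns parsed JSON; the aggregation name and sources are supplied by the caller.

```python
def iterate_composite(search, agg_name, sources, size=100):
    """Yield all composite buckets, following after_key until it disappears."""
    after = None
    while True:
        composite = {"size": size, "sources": sources}
        if after is not None:
            composite["after"] = after
        body = {"size": 0, "aggs": {agg_name: {"composite": composite}}}
        result = search(body)["aggregations"][agg_name]
        yield from result["buckets"]
        after = result.get("after_key")
        # An empty page (or a missing after_key) means everything was read.
        if after is None or not result["buckets"]:
            return
```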

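9. Text analysis (analyzers)

9.1 Anatomy of an analyzer

An analyzer is a processing pipeline that converts text into searchable tokens. It consists of three components:

Input text → Character Filter → Tokenizer → Token Filter → Token stream
              (zero or more)     (exactly 1)  (zero or more)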

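9.2 Built-in analyzers

Analyzer	Description	Example: "The Quick Brown Fox"
standard	Default. Splits on Unicode Text Segmentation word boundaries and lowercases	[the, quick, brown, fox]
simple	Splits on non-letter characters and lowercases	[the, quick, brown, fox]
whitespace	Splits on whitespace only	[The, Quick, Brown, Fox]
stop	simple + stop word removal	[quick, brown, fox]
keyword	Treats the entire input as a single token	[The Quick Brown Fox]
pattern	Splits on a regular expression	(depends on the pattern)
language analyzers	Language-specific analysis	english, cjk, etc. (Japanese morphological analysis needs the kuromoji plugin)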
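The three-stage pipeline and the example outputs in the table can be reproduced with a toy model. This is a minimal sketch, not how Lucene implements analysis: the stop word set is a tiny illustrative subset and all names are hypothetical.

```python
import re

STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}  # illustrative subset


def analyze(text, char_filters=(), tokenizer=str.split, token_filters=()):
    """Run char filters, then exactly one tokenizer, then token filters."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return list(tokens)


def letter_tokenizer(text):
    """Split on anything that is not a letter."""
    return re.findall(r"[^\W\d_]+", text)


def lowercase(tokens):
    return [token.lower() for token in tokens]


def remove_stop_words(tokens):
    return [token for token in tokens if token not in STOP_WORDS]


def whitespace_analyzer(text):
    return analyze(text)


def simple_analyzer(text):
    return analyze(text, tokenizer=letter_tokenizer, token_filters=[lowercase])


def stop_analyzer(text):
    return analyze(text, tokenizer=letter_tokenizer,
                   token_filters=[lowercase, remove_stop_words])


def keyword_analyzer(text):
    return analyze(text, tokenizer=lambda value: [value])
```

The `stop` model reuses the `simple` pipeline and only adds one token filter, which is why the two differ by nothing except the removed stop words.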

9.3 Building a custom analyzer

# Custom analyzer for Japanese full-text search (requires the analysis-kuromoji and analysis-icu plugins)
curl -X PUT "https://localhost:9200/japanese-content" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "analysis": {
        "char_filter": {
          "normalize_chars": {
            "type": "icu_normalizer",
            "name": "nfkc_cf"
          },
          "html_strip_filter": {
            "type": "html_strip"
          }
        },
        "tokenizer": {
          "ja_tokenizer": {
            "type": "kuromoji_tokenizer",
            "mode": "search",
            "user_dictionary_rules": [
              "東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞",
              "OpenSearch,OpenSearch,オープンサーチ,カスタム名詞"
            ]
          }
        },
        "filter": {
          "ja_baseform": { "type": "kuromoji_baseform" },
          "ja_part_of_speech": {
            "type": "kuromoji_part_of_speech",
            "stoptags": [
              "助詞-格助詞-一般",
              "助詞-終助詞"
            ]
          },
          "ja_readingform": {
            "type": "kuromoji_readingform",
            "use_romaji": false
          },
          "ja_stemmer": { "type": "kuromoji_stemmer" },
          "ja_stop": {
            "type": "ja_stop",
            "stopwords": "_japanese_"
          },
          "ja_synonym": {
            "type": "synonym_graph",
            "synonyms": [
              "サーバー,サーバ",
              "データベース,DB",
              "検索エンジン,サーチエンジン"
            ]
          }
        },
        "analyzer": {
          "ja_full_text_analyzer": {
            "type": "custom",
            "char_filter": ["normalize_chars", "html_strip_filter"],
            "tokenizer": "ja_tokenizer",
            "filter": [
              "ja_baseform",
              "ja_part_of_speech",
              "ja_stop",
              "ja_stemmer",
              "lowercase"
            ]
          },
          "ja_search_analyzer": {
            "type": "custom",
            "char_filter": ["normalize_chars"],
            "tokenizer": "ja_tokenizer",
            "filter": [
              "ja_baseform",
              "ja_part_of_speech",
              "ja_stop",
              "ja_stemmer",
              "ja_synonym",
              "lowercase"
            ]
          }
        }
      }
    },
    "mappings": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ja_full_text_analyzer",
          "search_analyzer": "ja_search_analyzer"
        },
        "body": {
          "type": "text",
          "analyzer": "ja_full_text_analyzer",
          "search_analyzer": "ja_search_analyzer"
        }
      }
    }
  }'

9.4 Testing an analyzer

# Check how the analyzer tokenizes a text
curl -X POST "https://localhost:9200/japanese-content/_analyze?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "analyzer": "ja_full_text_analyzer",
    "text": "OpenSearchは分散型の検索・分析エンジンです"
  }'

# Ad hoc test of a tokenizer and filter chain (no index required)
curl -X POST "https://localhost:9200/_analyze?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "tokenizer": "standard",
    "filter": ["lowercase", "asciifolding"],
    "text": "Café résumé naïve"
  }'

9.5 NGram and Edge NGram

# Edge NGram settings for autocomplete
curl -X PUT "https://localhost:9200/search-suggestions" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "analysis": {
        "tokenizer": {
          "edge_ngram_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 1,
            "max_gram": 20,
            "token_chars": ["letter", "digit", "custom"],
            "custom_token_chars": "-_"
          }
        },
        "analyzer": {
          "autocomplete_index": {
            "type": "custom",
            "tokenizer": "edge_ngram_tokenizer",
            "filter": ["lowercase"]
          },
          "autocomplete_search": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase"]
          }
        }
      },
      "max_ngram_diff": 20
    },
    "mappings": {
      "properties": {
        "suggest": {
          "type": "text",
          "analyzer": "autocomplete_index",
          "search_analyzer": "autocomplete_search"
        }
      }
    }
  }'
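Why the index side uses edge n-grams while the search side uses a plain tokenizer becomes clearer with a small model. This is a minimal sketch of the idea, assuming the text has already been split into tokens; the helper names are hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """All prefixes of the token between min_gram and max_gram characters."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, min_gram=1, max_gram=20):
    """Index-time terms: lowercased prefixes of every whitespace token."""
    terms = set()
    for token in text.lower().split():
        terms.update(edge_ngrams(token, min_gram, max_gram))
    return terms


def matches(query, indexed_text):
    """Search-time: the lowercased query tokens are looked up as-is."""
    terms = index_terms(indexed_text)
    return all(token in terms for token in query.lower().split())
```

Because prefixes are generated only at index time, a partial query such as "ope" matches "OpenSearch guide" without the query itself being expanded into n-grams.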

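10. OpenSearch Dashboards

OpenSearch Dashboards is the web UI for visualizing data stored in OpenSearch. It is a fork of Kibana 7.10.2.

10.1 Key features

Feature	Description
Discover	Interactive data exploration
Visualizations	Building graphs, charts, and tables
Dashboards	Combined view of multiple visualizations
Dev Tools	REST API console
Alerting	Alert management
Anomaly Detection	Anomaly detection management
Observability	Unified view of traces, metrics, and logs
Security	Management of users, roles, and tenants
Notebooks	Data analysis notebooks
Reports	PDF/PNG report generation
Maps	Geospatial data visualization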

10.2 Installation and configuration

# opensearch_dashboards.yml

server.host: "0.0.0.0"
server.port: 5601
server.name: "opensearch-dashboards"

# Connection to OpenSearch
opensearch.hosts: ["https://opensearch-node1:9200", "https://opensearch-node2:9200"]
opensearch.ssl.verificationMode: full
opensearch.ssl.certificateAuthorities: ["/etc/opensearch-dashboards/certs/root-ca.pem"]
opensearch.username: "kibanaserver"
opensearch.password: "${DASHBOARDS_PASSWORD}"

# TLS settings
server.ssl.enabled: true
server.ssl.certificate: "/etc/opensearch-dashboards/certs/dashboards.pem"
server.ssl.key: "/etc/opensearch-dashboards/certs/dashboards-key.pem"

# Multi-tenancy
opensearch_security.multitenancy.enabled: true
opensearch_security.multitenancy.tenants.preferred: ["Private", "Global"]

# Logging
logging.dest: /var/log/opensearch-dashboards/dashboards.log
logging.verbose: false

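# Time zone ("dateFormat:tz" is an Advanced Setting rather than a top-level YAML key, so it is pinned through uiSettings.overrides)
uiSettings.overrides:
  "dateFormat:tz": "Asia/Tokyo"

# CSP settings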
csp.strict: true
csp.warnLegacyBrowsers: true

10.3 Exporting and importing saved objects

# Export dashboards
curl -X POST "https://localhost:5601/api/saved_objects/_export" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -H "osd-xsrf: true" \
  -d '{
    "type": ["dashboard", "visualization", "index-pattern", "search"],
    "includeReferencesDeep": true
  }' > dashboards-export.ndjson

# Import
curl -X POST "https://localhost:5601/api/saved_objects/_import?overwrite=true" \
  -u admin:admin --insecure \
  -H "osd-xsrf: true" \
  --form file=@dashboards-export.ndjson

11. Security

OpenSearch security features are provided by the Security plugin, which covers authentication, authorization, encryption, and auditing.

11.1 Security architecture

  Client ───── TLS ────► OpenSearch
                              │
                    ┌─────────┴──────────┐
                    │ Authentication     │
                    │ (AuthN)            │
                    │ - Internal DB      │
                    │ - LDAP/AD          │
                    │ - SAML             │
                    │ - OpenID Connect   │
                    │ - Client cert      │
                    └─────────┬──────────┘
                              │
                    ┌─────────┴──────────┐
                    │ Authorization      │
                    │ (AuthZ)            │
                    │ - Role-based       │
                    │ - Field-level      │
                    │ - Document-level   │
                    └─────────┬──────────┘
                              │
                    ┌─────────┴──────────┐
                    │ Audit log          │
                    └────────────────────┘

11.2 TLS/SSL configuration

# opensearch.yml - TLS settings

# Transport layer TLS (node-to-node communication)
plugins.security.ssl.transport.pemcert_filepath: /etc/opensearch/certs/node1.pem
plugins.security.ssl.transport.pemkey_filepath: /etc/opensearch/certs/node1-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: /etc/opensearch/certs/root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: true

# REST layer TLS (client communication)
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: /etc/opensearch/certs/node1-http.pem
plugins.security.ssl.http.pemkey_filepath: /etc/opensearch/certs/node1-http-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: /etc/opensearch/certs/root-ca.pem

# Admin certificate DN (used when running securityadmin.sh)
plugins.security.authcz.admin_dn:
  - "CN=admin,OU=IT,O=MyCompany,L=Tokyo,C=JP"

# DNs that identify node certificates
plugins.security.nodes_dn:
  - "CN=node*,OU=IT,O=MyCompany,L=Tokyo,C=JP"

11.3 Internal users and roles

# internal_users.yml
_meta:
  type: "internalusers"
  config_version: 2

admin:
  hash: "$2y$12$..."  # bcrypt hash
  reserved: true
  backend_roles:
    - "admin"
  description: "Administrator user"

log_reader:
  hash: "$2y$12$..."
  reserved: false
  backend_roles:
    - "readall"
  attributes:
    department: "SRE"
  description: "Read-only user for logs"

dashboard_user:
  hash: "$2y$12$..."
  reserved: false
  opendistro_security_roles:
    - "kibana_user"
    - "logs_read_role"

# roles.yml
_meta:
  type: "roles"
  config_version: 2

logs_read_role:
  cluster_permissions:
    - "cluster_composite_ops_ro"
  index_permissions:
    - index_patterns:
        - "logs-*"
      allowed_actions:
        - "read"
        - "search"
      dls: '{ "bool": { "must": { "term": { "environment": "production" } } } }'
      fls:
        - "-password"
        - "-credit_card"
      masked_fields:
        - "email"

logs_write_role:
  cluster_permissions:
    - "cluster_composite_ops"
    - "indices:data/write/bulk"
  index_permissions:
    - index_patterns:
        - "logs-*"
      allowed_actions:
        - "crud"
        - "create_index"

sre_admin_role:
  cluster_permissions:
    - "cluster_all"
  index_permissions:
    - index_patterns:
        - "*"
      allowed_actions:
        - "indices_all"
  tenant_permissions:
    - tenant_patterns:
        - "sre_tenant"
      allowed_actions:
        - "kibana_all_write"

# roles_mapping.yml
_meta:
  type: "rolesmapping"
  config_version: 2

all_access:
  reserved: false
  backend_roles:
    - "admin"
  users:
    - "admin"

logs_read_role:
  reserved: false
  backend_roles:
    - "readall"
  users:
    - "log_reader"
    - "dashboard_user"

logs_write_role:
  reserved: false
  backend_roles:
    - "log_writer"

sre_admin_role:
  reserved: false
  backend_roles:
    - "sre_team"

11.4 LDAP/Active Directory integration

# config.yml - LDAP authentication settings
config:
  dynamic:
    authc:
      ldap_auth:
        http_enabled: true
        transport_enabled: true
        order: 1
        http_authenticator:
          type: basic
          challenge: true
        authentication_backend:
          type: ldap
          config:
            enable_ssl: true
            enable_start_tls: false
            enable_ssl_client_auth: false
            verify_hostnames: true
            hosts:
              - "ldaps://ldap.example.com:636"
            bind_dn: "cn=opensearch,ou=ServiceAccounts,dc=example,dc=com"
            password: "${LDAP_BIND_PASSWORD}"
            userbase: "ou=Users,dc=example,dc=com"
            usersearch: "(sAMAccountName={0})"
            username_attribute: "sAMAccountName"
    authz:
      ldap_authz:
        http_enabled: true
        transport_enabled: true
        authorization_backend:
          type: ldap
          config:
            hosts:
              - "ldaps://ldap.example.com:636"
            bind_dn: "cn=opensearch,ou=ServiceAccounts,dc=example,dc=com"
            password: "${LDAP_BIND_PASSWORD}"
            rolebase: "ou=Groups,dc=example,dc=com"
            rolesearch: "(member={0})"
            rolename: "cn"
            resolve_nested_roles: true

11.5 Audit logging

# opensearch.yml - audit log settings
plugins.security.audit.type: internal_opensearch
plugins.security.audit.config.index: ".opendistro-audit-log"
plugins.security.audit.config.enable_rest: true
plugins.security.audit.config.enable_transport: true
plugins.security.audit.config.disabled_rest_categories:
  - AUTHENTICATED
  - GRANTED_PRIVILEGES
plugins.security.audit.config.disabled_transport_categories:
  - AUTHENTICATED
  - GRANTED_PRIVILEGES
plugins.security.compliance.enabled: true
plugins.security.compliance.history.write.log_diffs: true
plugins.security.compliance.history.read.watched_fields:
  - "sensitive-data,ssn,credit_card"
plugins.security.compliance.history.write.watched_indices:
  - "sensitive-data"

12. Alerting and monitoring

12.1 Alerting plugin

The OpenSearch Alerting plugin monitors the state of your data and sends notifications when defined conditions are met.

# Create a monitor (query-level)
curl -X POST "https://localhost:9200/_plugins/_alerting/monitors" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "type": "monitor",
    "name": "High Error Rate Monitor",
    "monitor_type": "query_level_monitor",
    "enabled": true,
    "schedule": {
      "period": {
        "interval": 5,
        "unit": "MINUTES"
      }
    },
    "inputs": [
      {
        "search": {
          "indices": ["logs-*"],
          "query": {
            "size": 0,
            "query": {
              "bool": {
                "filter": [
                  { "range": { "@timestamp": { "gte": "{{period_end}}||-5m", "lte": "{{period_end}}" } } },
                  { "term": { "level": "ERROR" } }
                ]
              }
            },
            "aggs": {
              "error_count": { "value_count": { "field": "_id" } }
            }
          }
        }
      }
    ],
    "triggers": [
      {
        "query_level_trigger": {
          "name": "High Error Count",
          "severity": "1",
          "condition": {
            "script": {
              "source": "ctx.results[0].aggregations.error_count.value > 100",
              "lang": "painless"
            }
          },
          "actions": [
            {
              "name": "Slack Notification",
              "destination_id": "slack-dest-id",
              "message_template": {
                "source": "Monitor {{ctx.monitor.name}} triggered alert {{ctx.trigger.name}}.\nError count: {{ctx.results.0.aggregations.error_count.value}}\nPeriod: {{ctx.periodStart}} - {{ctx.periodEnd}}\nSeverity: {{ctx.trigger.severity}}"
              },
              "throttle_enabled": true,
              "throttle": {
                "value": 15,
                "unit": "MINUTES"
              }
            },
            {
              "name": "PagerDuty Alert",
              "destination_id": "pagerduty-dest-id",
              "message_template": {
                "source": "{\"routing_key\": \"service-key\", \"event_action\": \"trigger\", \"payload\": {\"summary\": \"High error rate detected: {{ctx.results.0.aggregations.error_count.value}} errors in 5 minutes\", \"severity\": \"critical\", \"source\": \"OpenSearch Alerting\"}}"
              }
            }
          ]
        }
      }
    ]
  }'

12.2 Bucket-level monitors

# Monitor the error rate per service
curl -X POST "https://localhost:9200/_plugins/_alerting/monitors" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "type": "monitor",
    "name": "Per-Service Error Rate",
    "monitor_type": "bucket_level_monitor",
    "enabled": true,
    "schedule": {
      "period": { "interval": 5, "unit": "MINUTES" }
    },
    "inputs": [
      {
        "search": {
          "indices": ["logs-*"],
          "query": {
            "size": 0,
            "query": {
              "range": {
                "@timestamp": { "gte": "now-5m", "lte": "now" }
              }
            },
            "aggs": {
              "by_service": {
                "terms": { "field": "service", "size": 50 },
                "aggs": {
                  "error_count": {
                    "filter": { "term": { "level": "ERROR" } }
                  },
                  "total_count": {
                    "value_count": { "field": "_id" }
                  },
                  "error_rate": {
                    "bucket_script": {
                      "buckets_path": {
                        "errors": "error_count._count",
                        "total": "total_count"
                      },
                      "script": "params.total > 0 ? params.errors / params.total * 100 : 0"
                    }
                  }
                }
              }
            }
          }
        }
      }
    ],
    "triggers": [
      {
        "bucket_level_trigger": {
          "name": "Service Error Rate > 5%",
          "severity": "2",
          "condition": {
            "buckets_path": { "error_rate": "error_rate" },
            "parent_bucket_path": "by_service",
            "script": {
              "source": "params.error_rate > 5.0",
              "lang": "painless"
            }
          },
          "actions": [
            {
              "name": "Notify Service Owner",
              "destination_id": "slack-dest-id",
              "action_execution_policy": {
                "action_execution_scope": {
                  "per_alert": {
                    "actionable_alerts": ["DEDUPED", "NEW"]
                  }
                }
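              },
              "message_template": {
                "source": "Service {{#ctx.newAlerts}}{{bucket_keys}}{{/ctx.newAlerts}}{{#ctx.dedupedAlerts}}{{bucket_keys}}{{/ctx.dedupedAlerts}} exceeded the 5% error rate threshold in the last 5 minutes."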
              }
            }
          ]
        }
      }
    ]
  }'
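The per-bucket evaluation performed by the bucket-level trigger above can be sketched locally, which is handy for checking a threshold against a saved response. This is a minimal sketch assuming the parsed `by_service` buckets of that query; the function names are hypothetical.

```python
def error_rate(errors, total):
    """Same arithmetic as the bucket_script in the monitor query."""
    return errors / total * 100 if total > 0 else 0


def triggered_services(buckets, threshold=5.0):
    """Return the keys of buckets whose error rate exceeds the threshold."""
    triggered = []
    for bucket in buckets:
        rate = error_rate(bucket["error_count"]["doc_count"],
                          bucket["total_count"]["value"])
        if rate > threshold:
            triggered.append(bucket["key"])
    return triggered
```

Each key returned here corresponds to one alert, which is what lets the `per_alert` action scope notify once per affected service.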

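12.3 Configuring notification destinations

# Slack webhook (in OpenSearch 2.x the destinations API is deprecated in favor of the Notifications plugin)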
curl -X POST "https://localhost:9200/_plugins/_alerting/destinations" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Slack - SRE Channel",
    "type": "slack",
    "slack": {
      "url": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
    }
  }'

# Webhook (generic)
curl -X POST "https://localhost:9200/_plugins/_alerting/destinations" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "name": "PagerDuty",
    "type": "custom_webhook",
    "custom_webhook": {
      "scheme": "HTTPS",
      "host": "events.pagerduty.com",
      "port": 443,
      "path": "/v2/enqueue",
      "header_params": {
        "Content-Type": "application/json"
      }
    }
  }'

# Email
curl -X POST "https://localhost:9200/_plugins/_alerting/destinations" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "name": "SRE Email",
    "type": "email",
    "email": {
      "email_account_id": "email-account-id",
      "recipients": [
        { "type": "email_group", "email_group_id": "sre-group-id" },
        { "type": "email", "email": "oncall@example.com" }
      ]
    }
  }'

12.4 Anomaly Detection

# Create an anomaly detector
curl -X POST "https://localhost:9200/_plugins/_anomaly_detection/detectors" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "name": "API Latency Anomaly Detector",
    "description": "Detects anomalies in API latency",
    "time_field": "@timestamp",
    "indices": ["metrics-*"],
    "feature_attributes": [
      {
        "feature_name": "avg_latency",
        "feature_enabled": true,
        "aggregation_query": {
          "avg_latency": { "avg": { "field": "latency_ms" } }
        }
      },
      {
        "feature_name": "p99_latency",
        "feature_enabled": true,
        "aggregation_query": {
          "p99_latency": {
            "percentiles": {
              "field": "latency_ms",
              "percents": [99]
            }
          }
        }
      }
    ],
    "category_field": ["service"],
    "detection_interval": {
      "period": { "interval": 5, "unit": "MINUTES" }
    },
    "window_delay": {
      "period": { "interval": 1, "unit": "MINUTES" }
    },
    "filter_query": {
      "bool": {
        "filter": [
          { "term": { "environment": "production" } }
        ]
      }
    }
  }'

# Start the detector
curl -X POST "https://localhost:9200/_plugins/_anomaly_detection/detectors/<detector_id>/_start" \
  -u admin:admin --insecure

13. Performance tuning

13.1 JVM settings

# jvm.options
# Heap size (at most 50% of physical memory, and below about 32 GB so compressed object pointers stay enabled)
-Xms16g
-Xmx16g

# GC settings (G1GC is the default in OpenSearch 2.x)
-XX:+UseG1GC
-XX:G1HeapRegionSize=16m
-XX:InitiatingHeapOccupancyPercent=40
-XX:G1ReservePercent=25
-XX:MaxGCPauseMillis=200

# GC logging
-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/opensearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m

# Heap dump on OutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/opensearch/heapdump.hprof
-XX:ErrorFile=/var/log/opensearch/hs_err_pid%p.log

# Other optimizations
-XX:+AlwaysPreTouch
-Djava.awt.headless=true
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=${OPENSEARCH_TMPDIR}
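The heap sizing rule in the comment above can be written as a small helper. This is a minimal sketch: the 31 GB cap is an assumption chosen to stay safely under the 32 GB limit mentioned above, and the function names are hypothetical.

```python
def recommended_heap_gb(physical_ram_gb, cap_gb=31):
    """Half of physical memory, capped below the ~32 GB limit (cap is an assumption)."""
    return max(1, min(physical_ram_gb // 2, cap_gb))


def jvm_heap_options(physical_ram_gb):
    """Render matching -Xms/-Xmx lines; both are set to the same value."""
    heap = recommended_heap_gb(physical_ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]
```

A 32 GB host yields the `-Xms16g` / `-Xmx16g` pair used in the example, while a 128 GB host is capped rather than given a 64 GB heap.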

13.2 OS-level settings

# /etc/sysctl.conf - kernel parameters
vm.max_map_count = 262144
vm.swappiness = 1
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15

# /etc/security/limits.conf - resource limits
opensearch  soft  nofile  65536
opensearch  hard  nofile  65536
opensearch  soft  nproc   4096
opensearch  hard  nproc   4096
opensearch  soft  memlock unlimited
opensearch  hard  memlock unlimited

# Disable swap
sudo swapoff -a
# Alternatively, lock the process memory so the heap cannot be swapped out (opensearch.yml):
# bootstrap.memory_lock: true

13.3 Performance settings in opensearch.yml

# opensearch.yml - performance-related settings

# Memory lock
bootstrap.memory_lock: true

# Thread pool settings
thread_pool:
  write:
    size: 8          # same as the number of CPU cores on the data node
    queue_size: 10000
  search:
    size: 13         # int((# of available_processors * 3) / 2) + 1
    queue_size: 1000
  get:
    size: 8
    queue_size: 1000

# Circuit breakers
indices.breaker.total.use_real_memory: true
indices.breaker.total.limit: 95%
indices.breaker.fielddata.limit: 40%
indices.breaker.request.limit: 60%
network.breaker.inflight_requests.limit: 100%

# Field data cache
indices.fielddata.cache.size: 20%

# Query cache
indices.queries.cache.size: 10%
# index.queries.cache.enabled is an index-level setting and cannot be placed in opensearch.yml;
# set it when creating the index or through an index template (default: true)

# Request cache
indices.requests.cache.size: 2%
# index.requests.cache.enable is also index-level; set it through the index settings API or a template (default: true)

# Indexing buffer
indices.memory.index_buffer_size: 10%
indices.memory.min_index_buffer_size: 48mb

# Concurrent recovery settings
cluster.routing.allocation.node_concurrent_recoveries: 4
indices.recovery.max_bytes_per_sec: 200mb
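The thread pool sizes in the listing above follow directly from the processor count. A minimal sketch of that arithmetic, using the formula quoted in the comment; the function names are hypothetical.

```python
def search_pool_size(available_processors):
    """int((available_processors * 3) / 2) + 1, as quoted in the comment above."""
    return (available_processors * 3) // 2 + 1


def write_pool_size(available_processors):
    """One write thread per CPU core on the data node."""
    return available_processors
```

For an 8-core data node this gives 13 search threads and 8 write threads, matching the values in the example configuration.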

13.4 Optimizing search performance

# Enable the search and indexing slow logs
curl -X PUT "https://localhost:9200/logs-*/_settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index.search.slowlog.threshold.query.warn": "10s",
    "index.search.slowlog.threshold.query.info": "5s",
    "index.search.slowlog.threshold.query.debug": "2s",
    "index.search.slowlog.threshold.fetch.warn": "1s",
    "index.search.slowlog.threshold.fetch.info": "500ms",
    "index.search.slowlog.level": "info",
    "index.indexing.slowlog.threshold.index.warn": "10s",
    "index.indexing.slowlog.threshold.index.info": "5s"
  }'

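Search optimization best practices

Optimization	Recommendation
Filter vs. query	Put conditions that do not need scoring in `filter` context
Source filtering	Use `_source` to fetch only the fields you need
Shard size	Aim for 10-50 GB per shard; shards that are too small or too large both hurt performance
Paging	Use `search_after` with a PIT (avoids deep pagination)
Cache usage	Use rounded timestamps instead of a bare `now` to raise the cache hit rate
Pre-warming	The `_warmers` API no longer exists; use `eager_global_ordinals` in the mapping or `index.store.preload` to load data before the first search
Routing	When the search scope is well defined, use routing to narrow down the shards searched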
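The "rounded timestamps" recommendation can be sketched as a helper that snaps the time window to a fixed step, so repeated requests within the same step send an identical filter. This is a minimal sketch; the 5-minute step, the field name and the function names are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def round_down(ts, minutes=5):
    """Snap a timestamp down to the previous multiple of `minutes` within the hour."""
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)


def cache_friendly_range(field, now, window_minutes=60, step_minutes=5):
    """Build a range filter whose bounds only change every `step_minutes`."""
    end = round_down(now, step_minutes)
    start = end - timedelta(minutes=window_minutes)
    return {"range": {field: {"gte": start.isoformat(), "lt": end.isoformat()}}}
```

Two dashboards refreshing at 10:06 and 10:09 produce the same request body, so the second one can be answered from cache. Date math rounding such as `now-1h/m` achieves a similar effect on the server side.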

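13.5 Optimizing indexing performance

# Tune the bulk size (roughly 5-15 MB per request)
# For parallelism, roughly 2-3 times the number of data nodes

# Check indexing statistics
curl -X GET "https://localhost:9200/_nodes/stats/indices/indexing?pretty" \
  -u admin:admin --insecure

# Adjust the refresh interval for write-heavy workloads
# (async translog durability trades safety for throughput: acknowledged writes from the last sync_interval can be lost on a crash)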
curl -X PUT "https://localhost:9200/bulk-import-index/_settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index": {
      "refresh_interval": "30s",
      "translog.durability": "async",
      "translog.sync_interval": "30s",
      "translog.flush_threshold_size": "1gb"
    }
  }'
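Keeping each bulk request within the size guideline above is easiest when batching by encoded bytes rather than by document count. A minimal sketch that builds NDJSON bodies for the Bulk API; the 10 MB default and the function name are assumptions, and sending the batches is left to the caller.

```python
import json


def bulk_batches(docs, index, max_bytes=10 * 1024 * 1024):
    """Yield NDJSON bulk bodies, each at most max_bytes when possible."""
    batch, size = [], 0
    action = json.dumps({"index": {"_index": index}})
    for doc in docs:
        entry = f"{action}\n{json.dumps(doc, ensure_ascii=False)}\n"
        entry_size = len(entry.encode("utf-8"))
        # Start a new batch when adding this entry would exceed the limit.
        if batch and size + entry_size > max_bytes:
            yield "".join(batch)
            batch, size = [], 0
        batch.append(entry)
        size += entry_size
    if batch:
        yield "".join(batch)
```

A single document larger than `max_bytes` is still emitted on its own, since splitting a document is not possible.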

13.6 Concurrent Segment Search

Concurrent segment search, which became GA in OpenSearch 2.12, searches multiple segments within a shard in parallel.

# opensearch.yml
search.concurrent_segment_search.enabled: true
search.concurrent.max_slice_count: 0  # 0 = automatic (Lucene decides the slice count)

# Index-level settings
curl -X PUT "https://localhost:9200/my-index/_settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index.search.concurrent_segment_search.enabled": true,
    "index.search.concurrent_segment_search.mode": "auto"
  }'

14. Cluster management and operations

14.1 Cluster health and status

# Cluster health
curl -X GET "https://localhost:9200/_cluster/health?pretty" \
  -u admin:admin --insecure

# Node information
curl -X GET "https://localhost:9200/_cat/nodes?v&h=name,ip,node.role,heap.percent,ram.percent,cpu,load_1m,disk.used_percent" \
  -u admin:admin --insecure

# Index list
curl -X GET "https://localhost:9200/_cat/indices?v&s=store.size:desc&h=index,health,status,pri,rep,docs.count,store.size" \
  -u admin:admin --insecure

# Shard allocation per node
curl -X GET "https://localhost:9200/_cat/allocation?v&h=shards,disk.indices,disk.used,disk.avail,disk.percent,node" \
  -u admin:admin --insecure

# Explain why a shard is unassigned
curl -X GET "https://localhost:9200/_cluster/allocation/explain?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index": "logs-2024-01-15",
    "shard": 0,
    "primary": true
  }'

# Check pending cluster tasks
curl -X GET "https://localhost:9200/_cat/pending_tasks?v" \
  -u admin:admin --insecure

# List running search tasks
curl -X GET "https://localhost:9200/_tasks?detailed=true&actions=*search*" \
  -u admin:admin --insecure

14.2 Node Statistics

# Statistics for all nodes
curl -X GET "https://localhost:9200/_nodes/stats?pretty" \
  -u admin:admin --insecure

# Selected statistics only
curl -X GET "https://localhost:9200/_nodes/stats/jvm,os,process,indices?pretty" \
  -u admin:admin --insecure

# Check hot threads (to diagnose performance problems)
curl -X GET "https://localhost:9200/_nodes/hot_threads?threads=3&interval=500ms" \
  -u admin:admin --insecure

14.3 Shard Allocation Strategy

# opensearch.yml - shard allocation settings

# Disk-based shard allocation
cluster.routing.allocation.disk.threshold_enabled: true
cluster.routing.allocation.disk.watermark.low: "85%"
cluster.routing.allocation.disk.watermark.high: "90%"
cluster.routing.allocation.disk.watermark.flood_stage: "95%"
cluster.info.update.interval: "30s"

# Allocation awareness
cluster.routing.allocation.awareness.attributes: rack_id,zone
cluster.routing.allocation.awareness.force.zone.values: zone-a,zone-b,zone-c

# Rebalancing settings
cluster.routing.rebalance.enable: all
cluster.routing.allocation.cluster_concurrent_rebalance: 4
cluster.routing.allocation.type: balanced
cluster.routing.allocation.balance.shard: 0.45
cluster.routing.allocation.balance.index: 0.55
cluster.routing.allocation.balance.threshold: 1.0

# Node attributes (opensearch.yml)
# node.attr.rack_id: rack1
# node.attr.zone: zone-a
# node.attr.temp: hot

# Allocation filter for a hot-warm-cold architecture
curl -X PUT "https://localhost:9200/logs-2024-01-15/_settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index.routing.allocation.require.temp": "warm"
  }'

# Move shards off a specific node
curl -X PUT "https://localhost:9200/_cluster/settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "transient": {
      "cluster.routing.allocation.exclude._name": "data-node-3"
    }
  }'
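
The disk watermarks configured in this section act as three escalating thresholds. The following is a minimal Python sketch of how a node's disk usage maps onto them, assuming the 85% / 90% / 95% values shown above. The function name `disk_watermark_state` and the returned labels are illustrative only and are not part of any OpenSearch API.

```python
def disk_watermark_state(used_percent, low=85.0, high=90.0, flood_stage=95.0):
    """Classify a node's disk usage against the three disk watermarks.

    The default thresholds mirror the opensearch.yml values in this section.
    """
    if not 0.0 <= used_percent <= 100.0:
        raise ValueError("used_percent must be between 0 and 100")
    if used_percent > flood_stage:
        # Indexes with a shard on this node receive a read-only block.
        return "flood_stage"
    if used_percent > high:
        # OpenSearch tries to relocate shards away from this node.
        return "high"
    if used_percent > low:
        # OpenSearch stops allocating new shards to this node.
        return "low"
    return "ok"


for pct in (70.0, 86.5, 92.0, 97.3):
    print(f"{pct:5.1f}% -> {disk_watermark_state(pct)}")
```

A check like this can be fed from the `disk.percent` column of `_cat/allocation` to flag nodes before they reach the high watermark.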

14.4 Rolling Restart Procedure

# 1. Restrict shard allocation to primaries
curl -X PUT "https://localhost:9200/_cluster/settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "persistent": {
      "cluster.routing.allocation.enable": "primaries"
    }
  }'

# 2. Flush (the synced flush API, _flush/synced, was removed in OpenSearch 2.0; use a regular flush)
curl -X POST "https://localhost:9200/_flush" \
  -u admin:admin --insecure

# 3. Stop the node, then apply the upgrade or configuration change
sudo systemctl stop opensearch

# 4. Start the node
sudo systemctl start opensearch

# 5. Confirm that the node has rejoined the cluster
curl -X GET "https://localhost:9200/_cat/nodes?v" \
  -u admin:admin --insecure

# 6. Re-enable shard allocation
curl -X PUT "https://localhost:9200/_cluster/settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "persistent": {
      "cluster.routing.allocation.enable": null
    }
  }'

# 7. Wait until the cluster is green
curl -X GET "https://localhost:9200/_cluster/health?wait_for_status=green&timeout=5m" \
  -u admin:admin --insecure

# 8. Repeat steps 1-7 for the next node (allocation was re-enabled in step 6, so it must be restricted again)
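
The procedure above can be scripted as a loop that runs the full cycle once per node. The sketch below only fixes the ordering of the steps. The four callables (`set_allocation`, `flush`, `restart_node`, `wait_for_green`) are hypothetical hooks that a real script would implement with the curl requests and systemctl commands shown above.

```python
def rolling_restart(nodes, set_allocation, flush, restart_node, wait_for_green):
    """Restart nodes one at a time, restricting allocation around each restart."""
    for node in nodes:
        set_allocation("primaries")  # step 1: allocate primaries only
        flush()                      # step 2: flush before stopping the node
        restart_node(node)           # steps 3-5: stop, start, confirm rejoin
        set_allocation(None)         # step 6: reset the setting to its default
        wait_for_green()             # step 7: block until the cluster is green


# Dry run with stand-in callables that only record what would happen.
calls = []
rolling_restart(
    ["data-node-1", "data-node-2"],
    set_allocation=lambda value: calls.append(f"allocation={value}"),
    flush=lambda: calls.append("flush"),
    restart_node=lambda node: calls.append(f"restart {node}"),
    wait_for_green=lambda: calls.append("wait green"),
)
print("\n".join(calls))
```

Passing the operations in as callables keeps the ordering logic testable without a live cluster.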

14.5 Snapshots and Backups

# Register an S3 repository
curl -X PUT "https://localhost:9200/_snapshot/s3-backup" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "type": "s3",
    "settings": {
      "bucket": "opensearch-snapshots",
      "region": "us-west-2",
      "base_path": "backups",
      "server_side_encryption": true,
      "max_restore_bytes_per_sec": "200mb",
      "max_snapshot_bytes_per_sec": "200mb",
      "chunk_size": "1gb",
      "compress": true
    }
  }'

# Create a snapshot
curl -X PUT "https://localhost:9200/_snapshot/s3-backup/snapshot-2024-01-15?wait_for_completion=false" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "indices": "logs-*,metrics-*",
    "ignore_unavailable": true,
    "include_global_state": false,
    "partial": false,
    "metadata": {
      "taken_by": "admin",
      "reason": "Daily backup"
    }
  }'

# Check the snapshot
curl -X GET "https://localhost:9200/_snapshot/s3-backup/snapshot-2024-01-15?pretty" \
  -u admin:admin --insecure

# Restore
curl -X POST "https://localhost:9200/_snapshot/s3-backup/snapshot-2024-01-15/_restore" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "indices": "logs-2024-01-15",
    "ignore_unavailable": true,
    "include_global_state": false,
    "rename_pattern": "(.+)",
    "rename_replacement": "restored-$1",
    "index_settings": {
      "index.number_of_replicas": 0
    }
  }'

# Snapshot Management (SM) policy
curl -X POST "https://localhost:9200/_plugins/_sm/policies/daily-backup" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Daily backup policy",
    "creation": {
      "schedule": {
        "cron": {
          "expression": "0 2 * * *",
          "timezone": "Asia/Tokyo"
        }
      },
      "time_limit": "1h"
    },
    "deletion": {
      "schedule": {
        "cron": {
          "expression": "0 4 * * *",
          "timezone": "Asia/Tokyo"
        }
      },
      "condition": {
        "max_age": "30d",
        "max_count": 30,
        "min_count": 7
      },
      "time_limit": "1h"
    },
    "snapshot_config": {
      "repository": "s3-backup",
      "indices": "logs-*,metrics-*",
      "ignore_unavailable": true,
      "include_global_state": false,
      "partial": true,
      "date_format": "yyyy-MM-dd",
      "metadata": {
        "policy": "daily-backup"
      }
    },
    "notification": {
      "channel": {
        "id": "notification-channel-id"
      },
      "conditions": {
        "creation": true,
        "deletion": true,
        "failure": true
      }
    }
  }'
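
The deletion condition in the policy above combines three limits. The sketch below shows one plausible reading of how they interact: the newest `min_count` snapshots are always kept, and beyond that a snapshot is deleted if it is older than `max_age` or falls outside the newest `max_count`. This interpretation and the function name `snapshots_to_delete` are assumptions for illustration. The plugin's exact behavior should be confirmed against the Snapshot Management documentation.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(created_at, now, max_age=timedelta(days=30),
                        max_count=30, min_count=7):
    """Return creation times of snapshots to delete, oldest first."""
    newest_first = sorted(created_at, reverse=True)
    keep = []
    for index, ts in enumerate(newest_first):
        within_min = index < min_count   # always keep the newest min_count
        too_old = now - ts > max_age     # exceeds max_age
        over_max = index >= max_count    # beyond the newest max_count
        if within_min or not (too_old or over_max):
            keep.append(ts)
    return sorted(set(created_at) - set(keep))


# 40 daily snapshots, evaluated at the time of the newest one.
now = datetime(2024, 3, 1, 2, 0)
daily = [now - timedelta(days=i) for i in range(40)]
doomed = snapshots_to_delete(daily, now)
print(len(doomed), "snapshots selected for deletion")
```

With daily snapshots, `max_age: 30d` and `max_count: 30` select nearly the same set, so `min_count` mainly matters when snapshot creation has been failing for a while.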

15. Ingest Pipelines

15.1 Overview

An ingest pipeline transforms and enriches data before a document is indexed.

# Create an ingest pipeline
curl -X PUT "https://localhost:9200/_ingest/pipeline/log-processing-pipeline" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Preprocessing pipeline for log data",
    "processors": [
      {
        "date": {
          "field": "timestamp_string",
          "formats": ["ISO8601", "yyyy-MM-dd HH:mm:ss,SSS", "UNIX_MS"],
          "target_field": "@timestamp",
          "timezone": "Asia/Tokyo"
        }
      },
      {
        "grok": {
          "field": "message",
          "patterns": [
            "%{IP:client_ip} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:status:int} %{NUMBER:bytes:long} %{NUMBER:duration:float}",
            "%{LOGLEVEL:level} \\[%{DATA:thread}\\] %{JAVACLASS:logger} - %{GREEDYDATA:log_message}"
          ],
          "ignore_failure": true
        }
      },
      {
        "geoip": {
          "field": "client_ip",
          "target_field": "geo",
          "ignore_missing": true
        }
      },
      {
        "user_agent": {
          "field": "user_agent_string",
          "target_field": "user_agent",
          "ignore_missing": true
        }
      },
      {
        "set": {
          "field": "environment",
          "value": "production",
          "override": false
        }
      },
      {
        "lowercase": {
          "field": "level",
          "ignore_missing": true
        }
      },
      {
        "rename": {
          "field": "hostname",
          "target_field": "host.name",
          "ignore_missing": true
        }
      },
      {
        "remove": {
          "field": ["timestamp_string", "user_agent_string"],
          "ignore_missing": true
        }
      },
      {
        "trim": {
          "field": "message"
        }
      },
      {
        "script": {
          "lang": "painless",
          "source": "if (ctx.containsKey(\"duration\") && ctx.duration != null) { ctx.duration_category = ctx.duration < 100 ? \"fast\" : ctx.duration < 500 ? \"normal\" : \"slow\"; }"
        }
      },
      {
        "pipeline": {
          "name": "enrichment-pipeline",
          "ignore_failure": true
        }
      }
    ],
    "on_failure": [
      {
        "set": {
          "field": "_index",
          "value": "failed-logs"
        }
      },
      {
        "set": {
          "field": "error.message",
          "value": "{{_ingest.on_failure_message}}"
        }
      },
      {
        "set": {
          "field": "error.processor",
          "value": "{{_ingest.on_failure_processor_type}}"
        }
      }
    ]
  }'

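15.2 Main Processors

Processor	Description
`grok`	Converts text into structured data using regular-expression patterns
`date`	Parses a date string into a date field
`geoip`	Adds geographic information based on an IP address
`user_agent`	Parses a User-Agent string
`set`	Sets a field value
`rename`	Renames a field
`remove`	Removes a field
`convert`	Converts a field's data type
`trim`	Strips leading and trailing whitespace
`lowercase` / `uppercase`	Converts to lowercase or uppercase
`split`	Splits a string into an array
`join`	Joins an array into a string
`json`	Parses a JSON string into an object
`kv`	Parses key-value pairs
`dissect`	Splits text based on a pattern
`csv`	Parses CSV fields
`script`	Runs arbitrary logic in a Painless script
`pipeline`	Calls another pipeline
`foreach`	Applies a processor to each element of an array
`drop`	Discards documents that match a condition
`text_embedding`	Vectorizes text with an ML model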

15.3 Testing and Simulating Pipelines

# Simulate the pipeline
curl -X POST "https://localhost:9200/_ingest/pipeline/log-processing-pipeline/_simulate?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "docs": [
      {
        "_source": {
          "timestamp_string": "2024-01-15T10:30:00.000+09:00",
          "message": "192.168.1.100 GET /api/v1/users 200 1234 45.2",
          "hostname": "web-server-01"
        }
      },
      {
        "_source": {
          "timestamp_string": "2024-01-15T10:30:01.000+09:00",
          "message": "ERROR [http-thread-1] com.example.UserService - User not found: id=12345",
          "hostname": "app-server-02"
        }
      }
    ]
  }'
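
To make the simulation above easier to reason about, the following Python sketch approximates what the first grok pattern and the Painless `duration_category` script would extract from the first sample message. The regular expression is a simplified stand-in for the grok patterns `IP`, `WORD`, `URIPATHPARAM`, and `NUMBER` (IPv4 only), not the definitions OpenSearch ships. The function name `parse_access_line` is hypothetical.

```python
import re

# Simplified equivalent of:
# %{IP:client_ip} %{WORD:method} %{URIPATHPARAM:request}
# %{NUMBER:status:int} %{NUMBER:bytes:long} %{NUMBER:duration:float}
ACCESS_LINE = re.compile(
    r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<method>\w+) "
    r"(?P<request>\S+) "
    r"(?P<status>\d+) "
    r"(?P<bytes>\d+) "
    r"(?P<duration>\d+(?:\.\d+)?)$"
)


def parse_access_line(message):
    """Return the extracted fields, or None if the line does not match."""
    match = ACCESS_LINE.match(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["bytes"] = int(fields["bytes"])
    fields["duration"] = float(fields["duration"])
    # Mirrors the Painless script: <100 fast, <500 normal, otherwise slow.
    duration = fields["duration"]
    fields["duration_category"] = (
        "fast" if duration < 100 else "normal" if duration < 500 else "slow"
    )
    return fields


print(parse_access_line("192.168.1.100 GET /api/v1/users 200 1234 45.2"))
print(parse_access_line("ERROR [http-thread-1] com.example.UserService - User not found: id=12345"))
```

The second sample does not match the access-log pattern, which is why the pipeline lists a second grok pattern for application log lines.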

16. Integration with Logstash, Data Prepper, and Fluent Bit

16.1 Data Prepper (Recommended)

Data Prepper is a data collection and transformation tool developed as part of the OpenSearch project.

# data-prepper-config.yaml
ssl: false
peer_forwarder:
  discovery_mode: static
  static_endpoints:
    - "data-prepper-node1"
    - "data-prepper-node2"

# pipelines.yaml
log-pipeline:
  source:
    http:
      port: 2021
      health_check_service: true
      ssl: false
  processor:
    - grok:
        match:
          log:
            - "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}"
    - date:
        from_time_received: false
        match:
          - key: "timestamp"
            patterns: ["ISO8601"]
        destination: "@timestamp"
    - add_entries:
        entries:
          - key: "pipeline"
            value: "log-pipeline"
  sink:
    - opensearch:
        hosts: ["https://opensearch-node1:9200"]
        index: "logs-%{yyyy.MM.dd}"
        username: "admin"
        password: "admin"
        insecure: true
        bulk_size: 4

otel-trace-pipeline:
  source:
    otel_trace_source:
      port: 21890
      ssl: false
  processor:
    - trace_peer_forwarder:
    - otel_trace_raw:
  sink:
    - opensearch:
        hosts: ["https://opensearch-node1:9200"]
        index_type: trace-analytics-raw
        username: "admin"
        password: "admin"

otel-metrics-pipeline:
  source:
    otel_metrics_source:
      port: 21891
      ssl: false
  processor:
    - otel_metrics:
  sink:
    - opensearch:
        hosts: ["https://opensearch-node1:9200"]
        index_type: custom
        index: "otel-metrics-%{yyyy.MM.dd}"
        username: "admin"
        password: "admin"

16.2 Logstash Integration

# logstash.conf
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.key"
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
  
  mutate {
    remove_field => ["agent", "ecs", "input"]
  }
}

output {
  opensearch {
    hosts => ["https://opensearch-node1:9200", "https://opensearch-node2:9200"]
    index => "logs-%{+YYYY.MM.dd}"
    user => "logstash_writer"
    password => "${LOGSTASH_PASSWORD}"
    ssl => true
    ssl_certificate_verification => true
    cacert => "/etc/logstash/certs/root-ca.pem"
    template_name => "logs"
    manage_template => false
  }
}

16.3 Fluent Bit Integration

# fluent-bit.conf
[SERVICE]
    Flush         5
    Daemon        Off
    Log_Level     info
    Parsers_File  parsers.conf

[INPUT]
    Name          tail
    Path          /var/log/app/*.log
    Parser        json
    Tag           app.*
    Refresh_Interval 10
    Mem_Buf_Limit 5MB
    Skip_Long_Lines On

[INPUT]
    Name          systemd
    Tag           host.*
    Systemd_Filter _SYSTEMD_UNIT=opensearch.service

[FILTER]
    Name          modify
    Match         app.*
    Add           cluster production-cluster
    Add           environment production

[FILTER]
    Name          lua
    Match         app.*
    script        enrich.lua
    call          add_metadata

[OUTPUT]
    Name          opensearch
    Match         *
    Host          opensearch-node1
    Port          9200
    HTTP_User     fluent_writer
    HTTP_Passwd   ${FLUENT_PASSWORD}
    Index         logs
    Type          _doc
    tls           On
    tls.verify    On
    tls.ca_file   /etc/fluent-bit/certs/root-ca.pem
    Logstash_Format On
    Logstash_Prefix logs
    Retry_Limit   5
    Buffer_Size   512KB
    Suppress_Type_Name On
    Trace_Output  Off

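17. Observability

OpenSearch provides an observability platform that manages the three pillars (logs, traces, and metrics) in one place.

17.1 Trace Analytics

OpenSearch integrates with OpenTelemetry to collect and analyze distributed traces.

# Search trace data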
curl -X GET "https://localhost:9200/otel-v1-apm-span-*/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 10,
    "query": {
      "bool": {
        "must": [
          { "term": { "serviceName": "payment-service" } },
          { "range": { "durationInNanos": { "gte": 1000000000 } } },
          { "range": { "startTime": { "gte": "now-1h" } } }
        ]
      }
    },
    "sort": [{ "durationInNanos": { "order": "desc" } }]
  }'

# Query for building a service map
curl -X GET "https://localhost:9200/otel-v1-apm-service-map*/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 0,
    "aggs": {
      "service_pairs": {
        "composite": {
          "sources": [
            { "source": { "terms": { "field": "serviceName" } } },
            { "target": { "terms": { "field": "destination.resource" } } }
          ]
        }
      }
    }
  }'
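
The span search in this section filters on `durationInNanos`, so thresholds have to be expressed in nanoseconds (the value 1000000000 above is one second). The helper below is a small sketch that builds the same request body from a threshold given in seconds. The function name `slow_span_query` and its parameters are illustrative; the field names are the ones used in the query above.

```python
import json


def slow_span_query(service_name, min_duration_seconds, window="now-1h", size=10):
    """Build a search body for spans of one service slower than a threshold."""
    min_nanos = round(min_duration_seconds * 1_000_000_000)  # seconds -> nanoseconds
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {"term": {"serviceName": service_name}},
                    {"range": {"durationInNanos": {"gte": min_nanos}}},
                    {"range": {"startTime": {"gte": window}}},
                ]
            }
        },
        "sort": [{"durationInNanos": {"order": "desc"}}],
    }


body = slow_span_query("payment-service", 1)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of the `_search` request shown above.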

17.2 Managing Metrics

# Index template for ingesting Prometheus metrics
curl -X PUT "https://localhost:9200/_index_template/metrics-template" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index_patterns": ["metrics-*"],
    "priority": 100,
    "template": {
      "settings": {
        "index": {
          "number_of_shards": 2,
          "number_of_replicas": 1,
          "refresh_interval": "60s",
          "codec": "best_compression"
        }
      },
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "metric_name": { "type": "keyword" },
          "metric_value": { "type": "double" },
          "labels": { "type": "flat_object" },
          "host": { "type": "keyword" },
          "service": { "type": "keyword" },
          "unit": { "type": "keyword" }
        }
      }
    }
  }'

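17.3 Notebooks (Data Analysis Notebooks)

The notebook feature in OpenSearch Dashboards supports interactive data exploration and documentation.

// Example queries that can be run in a notebook (PPL and SQL)
// Paragraph 1: time-series analysis of error counts with PPL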
{
  "query": "source=logs-* | where level='ERROR' | stats count() as errors by span(@timestamp, 1h) as hour | sort hour"
}

// Paragraph 2: per-service statistics with SQL
{
  "query": "SELECT service, COUNT(*) as total, SUM(CASE WHEN level='ERROR' THEN 1 ELSE 0 END) as errors, AVG(response_time_ms) as avg_latency FROM logs-* WHERE @timestamp > DATE_SUB(NOW(), INTERVAL 24 HOUR) GROUP BY service ORDER BY errors DESC"
}

18. Vector Search and Machine Learning

18.1 k-NN (k-Nearest Neighbors) Search

# Create an index for vector search
curl -X PUT "https://localhost:9200/semantic-search" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "index": {
        "knn": true,
        "knn.algo_param.ef_search": 100,
        "number_of_shards": 3,
        "number_of_replicas": 1
      }
    },
    "mappings": {
      "properties": {
        "title": { "type": "text" },
        "content": { "type": "text" },
        "category": { "type": "keyword" },
        "content_embedding": {
          "type": "knn_vector",
          "dimension": 768,
          "method": {
            "name": "hnsw",
            "space_type": "l2",
            "engine": "faiss",
            "parameters": {
              "ef_construction": 256,
              "m": 16
            }
          }
        }
      }
    }
  }'
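
The index above uses `"space_type": "l2"`. For this space type the k-NN plugin documents the distance as the squared Euclidean distance d and the search score as 1 / (1 + d), so closer vectors score higher and an exact match scores 1.0. The sketch below reproduces that arithmetic in plain Python. The function name `l2_score` is illustrative.

```python
def l2_score(query, doc):
    """Score for the l2 space type: 1 / (1 + squared Euclidean distance)."""
    if len(query) != len(doc):
        raise ValueError("vectors must have the same dimension")
    squared_distance = sum((q - d) ** 2 for q, d in zip(query, doc))
    return 1.0 / (1.0 + squared_distance)


print(l2_score([1.0, 2.0], [1.0, 2.0]))  # identical vectors
print(l2_score([1.0, 2.0], [1.0, 4.0]))  # squared distance 4
print(l2_score([0.0, 0.0], [3.0, 4.0]))  # squared distance 25
```

Because the score depends on absolute distance, `l2` scores are not directly comparable with scores from other space types such as `cosinesimil`.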

18.2 Vector Search Queries

# Basic k-NN search
curl -X GET "https://localhost:9200/semantic-search/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 10,
    "query": {
      "knn": {
        "content_embedding": {
          "vector": [0.1, 0.2, 0.3, ...],
          "k": 10
        }
      }
    }
  }'

# Hybrid search (text + vector)
curl -X GET "https://localhost:9200/semantic-search/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 10,
    "query": {
      "hybrid": {
        "queries": [
          {
            "match": {
              "content": {
                "query": "OpenSearchのベクトル検索機能",
                "boost": 0.3
              }
            }
          },
          {
            "knn": {
              "content_embedding": {
                "vector": [0.1, 0.2, 0.3, ...],
                "k": 10
              }
            }
          }
        ]
      }
    },
    "search_pipeline": {
      "phase_results_processors": [
        {
          "normalization-processor": {
            "normalization": { "technique": "min_max" },
            "combination": {
              "technique": "arithmetic_mean",
              "parameters": { "weights": [0.3, 0.7] }
            }
          }
        }
      ]
    }
  }'

# Vector search with a filter
curl -X GET "https://localhost:9200/semantic-search/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "size": 10,
    "query": {
      "knn": {
        "content_embedding": {
          "vector": [0.1, 0.2, 0.3, ...],
          "k": 10,
          "filter": {
            "bool": {
              "must": [
                { "term": { "category": "技術" } }
              ]
            }
          }
        }
      }
    }
  }'
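
The hybrid query in this section relies on a normalization processor with `min_max` normalization and a weighted `arithmetic_mean` combination. The sketch below illustrates the idea under simplifying assumptions: each subquery's scores are rescaled to the range 0 to 1, a document missing from a subquery contributes 0, and the combined score is the weighted mean. The plugin's handling of edge cases may differ. The function names and the sample scores are made up for illustration.

```python
def min_max(scores):
    """Rescale one subquery's scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine(subquery_scores, weights):
    """Weighted arithmetic mean of normalized scores, highest first."""
    normalized = [min_max(scores) for scores in subquery_scores]
    total_weight = sum(weights)
    docs = set().union(*normalized)
    combined = {
        doc: sum(w * n.get(doc, 0.0) for w, n in zip(weights, normalized)) / total_weight
        for doc in docs
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


text_scores = {"a": 12.0, "b": 8.0, "c": 4.0}  # BM25-style scores
knn_scores = {"b": 0.9, "c": 0.5, "d": 0.1}    # k-NN scores
for doc, score in combine([text_scores, knn_scores], weights=[0.3, 0.7]):
    print(doc, round(score, 3))
```

With weights of 0.3 and 0.7, a document that ranks well in the vector subquery can outrank the top text match, which is the intended effect of weighting the k-NN subquery more heavily.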

18.3 ML Commons Framework

# Create an ML model group
curl -X POST "https://localhost:9200/_plugins/_ml/model_groups/_register" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "name": "embedding_models",
    "description": "Model group for text embeddings"
  }'

# Create a remote model connector (example: Amazon Bedrock)
# Titan Text Embeddings V2 accepts only 256, 512, or 1024 for "dimensions";
# the target knn_vector field must be created with the same dimension.
curl -X POST "https://localhost:9200/_plugins/_ml/connectors/_create" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Bedrock Titan Embedding Connector",
    "description": "Amazon Bedrock Titan embedding model",
    "version": "1",
    "protocol": "aws_sigv4",
    "parameters": {
      "region": "us-west-2",
      "service_name": "bedrock",
      "model": "amazon.titan-embed-text-v2:0"
    },
    "credential": {
      "access_key": "${AWS_ACCESS_KEY}",
      "secret_key": "${AWS_SECRET_KEY}"
    },
    "actions": [
      {
        "action_type": "predict",
        "method": "POST",
        "url": "https://bedrock-runtime.us-west-2.amazonaws.com/model/amazon.titan-embed-text-v2:0/invoke",
        "headers": { "content-type": "application/json" },
        "request_body": "{ \"inputText\": \"${parameters.inputText}\", \"dimensions\": 1024 }",
        "pre_process_function": "connector.pre_process.bedrock.embedding",
        "post_process_function": "connector.post_process.bedrock.embedding"
      }
    ]
  }'

# Register the model
curl -X POST "https://localhost:9200/_plugins/_ml/models/_register" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Titan Embedding V2",
    "function_name": "remote",
    "model_group_id": "<model_group_id>",
    "connector_id": "<connector_id>"
  }'

# Deploy the model
curl -X POST "https://localhost:9200/_plugins/_ml/models/<model_id>/_deploy" \
  -u admin:admin --insecure

# Neural search pipeline (sets the default model used to vectorize query text)
curl -X PUT "https://localhost:9200/_search/pipeline/neural-search-pipeline" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Neural search pipeline for semantic search",
    "request_processors": [
      {
        "neural_query_enricher": {
          "default_model_id": "<model_id>",
          "neural_field_default_id": {
            "content_embedding": "<model_id>"
          }
        }
      }
    ]
  }'

# Ingest pipeline that vectorizes text automatically at index time
curl -X PUT "https://localhost:9200/_ingest/pipeline/neural-ingest-pipeline" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Auto-embed text during ingestion",
    "processors": [
      {
        "text_embedding": {
          "model_id": "<model_id>",
          "field_map": {
            "content": "content_embedding"
          }
        }
      }
    ]
  }'

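19. Plugins and Ecosystem

19.1 Core Plugins

Plugin	Description
Security	Authentication, authorization, encryption, and auditing
Alerting	Monitoring and notifications
Anomaly Detection	ML-based anomaly detection
Index Management	ISM, rollover, and snapshot management
SQL	SQL/PPL query support
k-NN	Nearest-neighbor vector search
ML Commons	Machine learning framework
Observability	Integrated management of traces, metrics, and logs
Notifications	Multi-channel notifications
Cross-Cluster Replication	Replication between clusters
Searchable Snapshots	Searching directly from snapshots
Reporting	PDF/CSV report generation
Neural Search	Semantic search
Search Pipelines	Customization of search requests and responses
Flow Framework	AI/ML workflow automation
Conversational Search	Conversational search
Security Analytics	SIEM features (threat detection rules, correlation)
Geospatial	Geospatial data analysis
Custom Codecs	Custom compression codecs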

19.2 Managing Plugins

# List installed plugins
curl -X GET "https://localhost:9200/_cat/plugins?v" \
  -u admin:admin --insecure

# Install plugins (from the CLI)
bin/opensearch-plugin install analysis-kuromoji
bin/opensearch-plugin install analysis-icu
bin/opensearch-plugin install repository-s3

# Remove a plugin
bin/opensearch-plugin remove analysis-kuromoji

# List plugins
bin/opensearch-plugin list

19.3 Search Pipelines

# Create a search pipeline
curl -X PUT "https://localhost:9200/_search/pipeline/my-search-pipeline" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Pipeline that customizes search requests and results",
    "request_processors": [
      {
        "filter_query": {
          "query": {
            "term": { "status": "published" }
          }
        }
      },
      {
        "script": {
          "source": "if (ctx._source[\"size\"] > 100) { ctx._source[\"size\"] = 100; }",
          "lang": "painless"
        }
      }
    ],
    "response_processors": [
      {
        "rename_field": {
          "field": "internal_id",
          "target_field": "id"
        }
      },
      {
        "collapse": {
          "field": "category"
        }
      },
      {
        "truncate_hits": {
          "target_size": 50
        }
      }
    ]
  }'

# Set the pipeline as the default for an index
curl -X PUT "https://localhost:9200/my-index/_settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "index.search.default_pipeline": "my-search-pipeline"
  }'

19.4 Cross-Cluster Replication (CCR)

# Configure the connection to the remote (leader) cluster
curl -X PUT "https://localhost:9200/_cluster/settings" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "persistent": {
      "cluster.remote": {
        "leader-cluster": {
          "seeds": ["leader-node1:9300", "leader-node2:9300"]
        }
      }
    }
  }'

# Start replication of an index (run on the follower cluster)
curl -X PUT "https://localhost:9200/_plugins/_replication/logs-2024-01-15/_start" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "leader_alias": "leader-cluster",
    "leader_index": "logs-2024-01-15",
    "use_roles": {
      "leader_cluster_role": "cross_cluster_replication_leader_full_access",
      "follower_cluster_role": "cross_cluster_replication_follower_full_access"
    }
  }'

# Create an auto-follow rule
curl -X POST "https://localhost:9200/_plugins/_replication/_autofollow" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "leader_alias": "leader-cluster",
    "name": "auto-follow-logs",
    "pattern": "logs-*",
    "use_roles": {
      "leader_cluster_role": "cross_cluster_replication_leader_full_access",
      "follower_cluster_role": "cross_cluster_replication_follower_full_access"
    }
  }'

20. Security Analytics (SIEM)

# Create a detection rule (Sigma format)
curl -X POST "https://localhost:9200/_plugins/_security_analytics/rules" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "category": "windows",
    "log_source": {
      "product": "windows",
      "category": "process_creation"
    },
    "title": "Suspicious PowerShell Execution",
    "description": "Detects execution of suspicious PowerShell commands",
    "level": "high",
    "rule": "title: Suspicious PowerShell\nstatus: experimental\nlogsource:\n  product: windows\n  category: process_creation\ndetection:\n  selection:\n    CommandLine|contains:\n      - \"-encodedcommand\"\n      - \"-enc\"\n      - \"bypass\"\n      - \"hidden\"\n  condition: selection\nlevel: high"
  }'

# Create a detector
curl -X POST "https://localhost:9200/_plugins/_security_analytics/detectors" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Windows Threat Detector",
    "enabled": true,
    "detector_type": "windows",
    "schedule": {
      "period": { "interval": 1, "unit": "MINUTES" }
    },
    "inputs": [
      {
        "detector_input": {
          "description": "Windows event logs analysis",
          "indices": ["windows-events-*"],
          "pre_packaged_rules": [
            { "id": "rule-id-1" }
          ],
          "custom_rules": [
            { "id": "custom-rule-id" }
          ]
        }
      }
    ],
    "triggers": [
      {
        "name": "High severity alert",
        "severity": "1",
        "ids": ["rule-id-1"],
        "actions": [
          {
            "name": "Send to SIEM",
            "destination_id": "siem-webhook-dest",
            "message_template": {
              "source": "Security Alert: {{ctx.detector.name}} - {{ctx.trigger.name}}"
            }
          }
        ]
      }
    ]
  }'

21. Docker / Kubernetes Deployment

21.1 Docker Compose

# docker-compose.yml
version: '3.8'

services:
  opensearch-node1:
    image: opensearchproject/opensearch:2.19.0
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms4g -Xmx4g"
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_ADMIN_PASSWORD}
      - node.roles=cluster_manager,data,ingest
      - plugins.security.ssl.transport.pemcert_filepath=certs/node1.pem
      - plugins.security.ssl.transport.pemkey_filepath=certs/node1-key.pem
      - plugins.security.ssl.transport.pemtrustedcas_filepath=certs/root-ca.pem
      - plugins.security.ssl.http.enabled=true
      - plugins.security.ssl.http.pemcert_filepath=certs/node1-http.pem
      - plugins.security.ssl.http.pemkey_filepath=certs/node1-http-key.pem
      - plugins.security.ssl.http.pemtrustedcas_filepath=certs/root-ca.pem
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data
      - ./certs:/usr/share/opensearch/config/certs:ro
    ports:
      - "9200:9200"
      - "9600:9600"
    networks:
      - opensearch-net
    healthcheck:
      test: ["CMD-SHELL", "curl -sf --insecure -u \"admin:$${OPENSEARCH_INITIAL_ADMIN_PASSWORD}\" https://localhost:9200/_cluster/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5

  opensearch-node2:
    image: opensearchproject/opensearch:2.19.0
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms4g -Xmx4g"
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_ADMIN_PASSWORD}
      - node.roles=cluster_manager,data,ingest
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data2:/usr/share/opensearch/data
      - ./certs:/usr/share/opensearch/config/certs:ro
    networks:
      - opensearch-net

  opensearch-node3:
    image: opensearchproject/opensearch:2.19.0
    container_name: opensearch-node3
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node3
      - discovery.seed_hosts=opensearch-node1,opensearch-node2,opensearch-node3
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2,opensearch-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms4g -Xmx4g"
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_ADMIN_PASSWORD}
      - node.roles=cluster_manager,data,ingest
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data3:/usr/share/opensearch/data
      - ./certs:/usr/share/opensearch/config/certs:ro
    networks:
      - opensearch-net

  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:2.19.0
    container_name: opensearch-dashboards
    environment:
      - OPENSEARCH_HOSTS=["https://opensearch-node1:9200","https://opensearch-node2:9200","https://opensearch-node3:9200"]
      - DISABLE_SECURITY_DASHBOARDS_PLUGIN=false
    ports:
      - "5601:5601"
    networks:
      - opensearch-net
    depends_on:
      opensearch-node1:
        condition: service_healthy

volumes:
  opensearch-data1:
  opensearch-data2:
  opensearch-data3:

networks:
  opensearch-net:
    driver: bridge

21.2 Kubernetes (Helm Chart)

# Add the Helm repository
helm repo add opensearch https://opensearch-project.github.io/helm-charts/
helm repo update

# values.yaml (OpenSearch Helm chart)
clusterName: "production-opensearch"

nodeGroup: "master"
masterService: "opensearch-cluster-master"

roles:
  - master
  - data
  - ingest

replicas: 3

opensearchJavaOpts: "-Xms8g -Xmx8g"

resources:
  requests:
    cpu: "2000m"
    memory: "16Gi"
  limits:
    cpu: "4000m"
    memory: "16Gi"

persistence:
  enabled: true
  enableInitChown: true
  storageClass: "gp3"
  accessModes:
    - ReadWriteOnce
  size: 100Gi

config:
  opensearch.yml: |
    cluster.name: production-opensearch
    network.host: 0.0.0.0
    
    plugins.security.ssl.transport.pemcert_filepath: certs/tls.crt
    plugins.security.ssl.transport.pemkey_filepath: certs/tls.key
    plugins.security.ssl.transport.pemtrustedcas_filepath: certs/ca.crt
    plugins.security.ssl.http.enabled: true
    plugins.security.ssl.http.pemcert_filepath: certs/tls.crt
    plugins.security.ssl.http.pemkey_filepath: certs/tls.key
    plugins.security.ssl.http.pemtrustedcas_filepath: certs/ca.crt
    
    plugins.security.allow_default_init_securityindex: true
    plugins.security.audit.type: internal_opensearch
    
    cluster.routing.allocation.awareness.attributes: zone
    cluster.routing.allocation.awareness.force.zone.values: a,b,c

  jvm.options: |
    -Xms8g
    -Xmx8g
    -XX:+UseG1GC
    -XX:G1HeapRegionSize=16m

extraEnvs:
  - name: DISABLE_INSTALL_DEMO_CONFIG
    value: "true"

securityConfig:
  enabled: true
  path: "/usr/share/opensearch/config/opensearch-security"
  config:
    dataComplete: true

antiAffinity: "hard"

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app.kubernetes.io/instance: opensearch

podDisruptionBudget:
  enabled: true
  minAvailable: 2

readinessProbe:
  initialDelaySeconds: 60
  periodSeconds: 10
  failureThreshold: 10

livenessProbe:
  initialDelaySeconds: 120
  periodSeconds: 20
  failureThreshold: 5

sysctlInit:
  enabled: true

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  hosts:
    - host: opensearch.internal.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: opensearch-tls
      hosts:
        - opensearch.internal.example.com

# Deploy with Helm
helm install opensearch opensearch/opensearch \
  -f values.yaml \
  -n opensearch \
  --create-namespace

# Deploy OpenSearch Dashboards
helm install opensearch-dashboards opensearch/opensearch-dashboards \
  -n opensearch \
  --set opensearchHosts="https://opensearch-cluster-master:9200"
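
The values above pair an 8 GB JVM heap with a 16Gi container memory limit, which matches the common guidance of giving the heap roughly half of the available memory and leaving the rest to the filesystem cache. The following is a minimal sketch of that arithmetic; the function names are hypothetical helpers for illustration and are not part of the Helm chart.

```python
import re

# Binary multipliers for JVM suffixes (k/m/g) and Kubernetes suffixes (Ki/Mi/Gi).
_JVM_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}
_K8S_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_jvm_heap(java_opts: str) -> int:
    """Return the -Xmx value in bytes from a JVM options string."""
    match = re.search(r"-Xmx(\d+)([kmgKMG])", java_opts)
    if match is None:
        raise ValueError("no -Xmx flag found")
    return int(match.group(1)) * _JVM_UNITS[match.group(2).lower()]


def parse_k8s_memory(quantity: str) -> int:
    """Return a Kubernetes memory quantity such as '16Gi' in bytes."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi)", quantity)
    if match is None:
        raise ValueError(f"unsupported quantity: {quantity}")
    return int(match.group(1)) * _K8S_UNITS[match.group(2)]


def heap_ratio(java_opts: str, memory_limit: str) -> float:
    """Fraction of the container memory limit taken by the JVM heap."""
    return parse_jvm_heap(java_opts) / parse_k8s_memory(memory_limit)


ratio = heap_ratio("-Xms8g -Xmx8g", "16Gi")
print(f"heap uses {ratio:.0%} of the container memory limit")
```

A check like this could run in CI against values.yaml so that a later change to `resources.limits.memory` does not silently leave the heap oversized.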

22. Amazon OpenSearch Service

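22.1 Overview

Amazon OpenSearch Service is the managed OpenSearch service provided by AWS. Infrastructure management is delegated to AWS while the OpenSearch APIs remain available; some operations, such as installing arbitrary plugins or changing certain cluster settings, are restricted by the service.

22.2 Deployment options

Option	Description
Managed cluster	Traditional model based on EC2 instances
Serverless	Automatic scaling, usage-based billing

22.3 Managed cluster configuration

// AWS CloudFormation template (excerpt)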
{
  "OpenSearchDomain": {
    "Type": "AWS::OpenSearchService::Domain",
    "Properties": {
      "DomainName": "production-search",
      "EngineVersion": "OpenSearch_2.13",
      "ClusterConfig": {
        "InstanceType": "r6g.2xlarge.search",
        "InstanceCount": 6,
        "DedicatedMasterEnabled": true,
        "DedicatedMasterType": "m6g.large.search",
        "DedicatedMasterCount": 3,
        "ZoneAwarenessEnabled": true,
        "ZoneAwarenessConfig": {
          "AvailabilityZoneCount": 3
        },
        "WarmEnabled": true,
        "WarmType": "ultrawarm1.medium.search",
        "WarmCount": 3,
        "ColdStorageOptions": {
          "Enabled": true
        },
        "MultiAZWithStandbyEnabled": true
      },
      "EBSOptions": {
        "EBSEnabled": true,
        "VolumeType": "gp3",
        "VolumeSize": 500,
        "Iops": 3000,
        "Throughput": 125
      },
      "VPCOptions": {
        "SubnetIds": ["subnet-xxx", "subnet-yyy", "subnet-zzz"],
        "SecurityGroupIds": ["sg-xxx"]
      },
      "EncryptionAtRestOptions": {
        "Enabled": true,
        "KmsKeyId": "arn:aws:kms:us-west-2:123456789012:key/xxx"
      },
      "NodeToNodeEncryptionOptions": {
        "Enabled": true
      },
      "DomainEndpointOptions": {
        "EnforceHTTPS": true,
        "TLSSecurityPolicy": "Policy-Min-TLS-1-2-PFS-2023-10"
      },
      "AdvancedSecurityOptions": {
        "Enabled": true,
        "InternalUserDatabaseEnabled": false,
        "MasterUserOptions": {
          "MasterUserARN": "arn:aws:iam::123456789012:role/OpenSearchAdmin"
        }
      },
      "LogPublishingOptions": {
        "INDEX_SLOW_LOGS": {
          "CloudWatchLogsLogGroupArn": "arn:aws:logs:us-west-2:123456789012:log-group:/aws/opensearch/domains/production-search/index-slow-logs",
          "Enabled": true
        },
        "SEARCH_SLOW_LOGS": {
          "CloudWatchLogsLogGroupArn": "arn:aws:logs:us-west-2:123456789012:log-group:/aws/opensearch/domains/production-search/search-slow-logs",
          "Enabled": true
        },
        "ES_APPLICATION_LOGS": {
          "CloudWatchLogsLogGroupArn": "arn:aws:logs:us-west-2:123456789012:log-group:/aws/opensearch/domains/production-search/app-logs",
          "Enabled": true
        },
        "AUDIT_LOGS": {
          "CloudWatchLogsLogGroupArn": "arn:aws:logs:us-west-2:123456789012:log-group:/aws/opensearch/domains/production-search/audit-logs",
          "Enabled": true
        }
      },
      "AdvancedOptions": {
        "rest.action.multi.allow_explicit_index": "true",
        "indices.fielddata.cache.size": "20",
        "indices.query.bool.max_clause_count": "1024",
        "override_main_response_version": "false"
      },
      "AutoTuneOptions": {
        "DesiredState": "ENABLED",
        "MaintenanceSchedules": [
          {
            "StartAt": "2024-01-15T02:00:00Z",
            "Duration": { "Value": 2, "Unit": "HOURS" },
            "CronExpressionForRecurrence": "cron(0 2 ? * SUN *)"
          }
        ]
      },
      "OffPeakWindowOptions": {
        "Enabled": true,
        "OffPeakWindow": {
          "WindowStartTime": {
            "Hours": 2,
            "Minutes": 0
          }
        }
      }
    }
  }
}
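
Before deploying a template like the one above, it can help to sanity-check the `ClusterConfig` values against a few commonly cited sizing conventions. The sketch below is only an illustration: the rules it encodes (data node count divisible by the AZ count, an odd number of at least three dedicated master nodes, cold storage depending on UltraWarm) are assumptions drawn from general guidance, and `check_cluster_config` is a hypothetical helper, not an AWS API.

```python
def check_cluster_config(cfg: dict) -> list[str]:
    """Return human-readable warnings for a ClusterConfig dictionary."""
    problems = []

    # Even shard distribution is easier when data nodes divide evenly across AZs.
    az_count = cfg.get("ZoneAwarenessConfig", {}).get("AvailabilityZoneCount", 1)
    if cfg.get("ZoneAwarenessEnabled") and cfg.get("InstanceCount", 1) % az_count != 0:
        problems.append("InstanceCount is not a multiple of AvailabilityZoneCount")

    # An odd count of three or more keeps a majority after losing one node.
    if cfg.get("DedicatedMasterEnabled"):
        masters = cfg.get("DedicatedMasterCount", 0)
        if masters < 3 or masters % 2 == 0:
            problems.append("DedicatedMasterCount should be an odd number >= 3")

    # Indices reach cold storage by way of UltraWarm.
    cold_enabled = cfg.get("ColdStorageOptions", {}).get("Enabled", False)
    if cold_enabled and not cfg.get("WarmEnabled"):
        problems.append("cold storage requires UltraWarm (WarmEnabled)")

    return problems


template_config = {
    "InstanceCount": 6,
    "DedicatedMasterEnabled": True,
    "DedicatedMasterCount": 3,
    "ZoneAwarenessEnabled": True,
    "ZoneAwarenessConfig": {"AvailabilityZoneCount": 3},
    "WarmEnabled": True,
    "ColdStorageOptions": {"Enabled": True},
}
print(check_cluster_config(template_config) or "no issues found")
```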

22.4 OpenSearch Serverless

// Creating a serverless collection (AWS CLI)
// aws opensearchserverless create-collection
{
  "name": "log-analytics",
  "type": "TIMESERIES",
  "description": "Serverless collection for log analytics"
}

// Encryption policy
{
  "Rules": [
    {
      "ResourceType": "collection",
      "Resource": ["collection/log-analytics"]
    }
  ],
  "AWSOwnedKey": true
}

// Network policy (the policy document is a JSON array of rule sets)
[
  {
    "Rules": [
      {
        "ResourceType": "collection",
        "Resource": ["collection/log-analytics"]
      },
      {
        "ResourceType": "dashboard",
        "Resource": ["collection/log-analytics"]
      }
    ],
    "AllowFromPublic": false,
    "SourceVPCEs": ["vpce-xxx"]
  }
]

// Data access policy (the policy document is a JSON array of rule sets)
[
  {
    "Rules": [
      {
        "ResourceType": "index",
        "Resource": ["index/log-analytics/*"],
        "Permission": [
          "aoss:CreateIndex",
          "aoss:UpdateIndex",
          "aoss:DescribeIndex",
          "aoss:ReadDocument",
          "aoss:WriteDocument"
        ]
      },
      {
        "ResourceType": "collection",
        "Resource": ["collection/log-analytics"],
        "Permission": [
          "aoss:CreateCollectionItems",
          "aoss:DescribeCollectionItems"
        ]
      }
    ],
    "Principal": [
      "arn:aws:iam::123456789012:role/LogIngestionRole",
      "arn:aws:iam::123456789012:role/AnalystRole"
    ]
  }
]
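
When several collections share the same permission layout, the data access policy document can be generated rather than hand-edited. The following is a minimal sketch under stated assumptions: `build_data_access_policy` is a hypothetical helper, and it emits the document as a JSON array of rule sets, which is the shape the service is understood to expect when the string is passed to `aws opensearchserverless create-access-policy --type data --policy`.

```python
import json

INDEX_PERMISSIONS = [
    "aoss:CreateIndex",
    "aoss:UpdateIndex",
    "aoss:DescribeIndex",
    "aoss:ReadDocument",
    "aoss:WriteDocument",
]
COLLECTION_PERMISSIONS = [
    "aoss:CreateCollectionItems",
    "aoss:DescribeCollectionItems",
]


def build_data_access_policy(collection: str, principals: list[str]) -> str:
    """Serialize a data access policy for one collection as compact JSON."""
    policy = [
        {
            "Rules": [
                {
                    "ResourceType": "index",
                    "Resource": [f"index/{collection}/*"],
                    "Permission": INDEX_PERMISSIONS,
                },
                {
                    "ResourceType": "collection",
                    "Resource": [f"collection/{collection}"],
                    "Permission": COLLECTION_PERMISSIONS,
                },
            ],
            "Principal": principals,
        }
    ]
    return json.dumps(policy, separators=(",", ":"))


print(
    build_data_access_policy(
        "log-analytics",
        ["arn:aws:iam::123456789012:role/LogIngestionRole"],
    )
)
```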

22.5 UltraWarm and cold storage

# Migrate an index to UltraWarm
curl -X POST "https://vpc-domain.us-west-2.es.amazonaws.com/_ultrawarm/migration/logs-2024-01-01/_warm" \
  -H "Content-Type: application/json"

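# Migrate an index to cold storage (the index must already be in UltraWarm;
# the request goes through the _ultrawarm migration endpoint)
curl -X POST "https://vpc-domain.us-west-2.es.amazonaws.com/_ultrawarm/migration/logs-2023-12-01/_cold" \
  -H "Content-Type: application/json"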

# Automatic tiering with an ISM policy
# (default_state is required; timestamp_field assumes documents carry an @timestamp field)
{
  "policy": {
    "description": "Hot -> UltraWarm -> cold storage -> delete",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "transitions": [{ "state_name": "warm", "conditions": { "min_index_age": "7d" } }]
      },
      {
        "name": "warm",
        "actions": [{ "warm_migration": {} }],
        "transitions": [{ "state_name": "cold", "conditions": { "min_index_age": "30d" } }]
      },
      {
        "name": "cold",
        "actions": [{ "cold_migration": { "timestamp_field": "@timestamp" } }],
        "transitions": [{ "state_name": "delete", "conditions": { "min_index_age": "365d" } }]
      },
      {
        "name": "delete",
        "actions": [{ "cold_delete": {} }]
      }
    ]
  }
}
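
The `min_index_age` thresholds in the policy above (7d, 30d, 365d) define which tier an index should be in at a given age. A minimal sketch of that mapping follows; `expected_state` is a hypothetical helper, and because ISM evaluates policies periodically, the real transition can lag the threshold.

```python
# (state, age in days at which the index leaves that state), in policy order.
TRANSITIONS = [("hot", 7), ("warm", 30), ("cold", 365)]


def expected_state(age_days: float) -> str:
    """Return the ISM state an index of the given age should have reached."""
    for state, leave_after_days in TRANSITIONS:
        if age_days < leave_after_days:
            return state
    return "delete"


for age in (0, 7, 45, 400):
    print(f"{age:>4} days -> {expected_state(age)}")
```

Comparing this expectation with the state reported by the ISM explain API is one way to spot indices whose migration has stalled.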

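23. Troubleshooting

23.1 Common problems and resolutions

Problem	Cause	Resolution
Cluster is red	A primary shard is unassigned	Identify the cause with `_cluster/allocation/explain`; check disk space
Cluster is yellow	A replica shard is unassigned	Check the node count and the allocation settings
Slow searches	Too many shards, inefficient queries	Check the search slow log; right-size the shards
Slow indexing	Refresh interval too short, too many replicas	Tune bulk requests; lengthen the refresh interval
OOM	Insufficient heap, oversized fielddata	Adjust the heap size; configure circuit breakers
Disk full	Data growth, log bloat	Set up ISM policies; delete old indices
Split brain	Network partition	Run an odd number (three or more) of cluster-manager-eligible nodes; the quorum is managed automatically and the legacy `minimum_master_nodes` setting is no longer used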
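
The advice to run an odd number of cluster-manager-eligible nodes follows from majority arithmetic: a cluster manager can only be elected by more than half of the voting nodes, so an even count adds cost without tolerating any extra failures. A minimal sketch of that calculation (the function names are illustrative only):

```python
def quorum(voting_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return voting_nodes // 2 + 1


def tolerated_failures(voting_nodes: int) -> int:
    """How many voting nodes can be lost while a majority still exists."""
    return voting_nodes - quorum(voting_nodes)


for n in range(1, 6):
    print(f"{n} voting nodes: quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

Three and four voting nodes both tolerate a single failure, which is why three (or five) is the usual choice.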

23.2 Diagnostic commands

# Cluster-wide diagnostics
curl -X GET "https://localhost:9200/_cluster/health?pretty" -u admin:admin --insecure
curl -X GET "https://localhost:9200/_cat/nodes?v&h=name,heap.percent,ram.percent,cpu,load_1m,disk.used_percent,node.role" -u admin:admin --insecure
curl -X GET "https://localhost:9200/_cat/indices?v&health=red" -u admin:admin --insecure
curl -X GET "https://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason" -u admin:admin --insecure

# Unassigned shards in detail (sorted by state)
curl -X GET "https://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state" -u admin:admin --insecure

# JVM statistics per node
curl -X GET "https://localhost:9200/_nodes/stats/jvm?pretty" -u admin:admin --insecure

# Thread pool status
curl -X GET "https://localhost:9200/_cat/thread_pool?v&h=name,node_name,active,queue,rejected,completed" -u admin:admin --insecure

# Circuit breaker status
curl -X GET "https://localhost:9200/_nodes/stats/breaker?pretty" -u admin:admin --insecure

# Index statistics
curl -X GET "https://localhost:9200/logs-*/_stats?pretty" -u admin:admin --insecure

# Segment information
curl -X GET "https://localhost:9200/_cat/segments/logs-*?v&h=index,shard,segment,generation,docs.count,size,size.memory" -u admin:admin --insecure

# Recovery status
curl -X GET "https://localhost:9200/_cat/recovery?v&active_only=true" -u admin:admin --insecure

# Running tasks (the running time in the detailed output shows which ones are long-running)
curl -X GET "https://localhost:9200/_tasks?detailed=true&timeout=30s" -u admin:admin --insecure

# Cancel a task
curl -X POST "https://localhost:9200/_tasks/<task_id>/_cancel" -u admin:admin --insecure
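
The `_cat/thread_pool` output is plain whitespace-separated text, so a few lines of code are enough to pick out the pools that are rejecting work. The sketch below parses a hypothetical sample of that output (the node names and numbers are invented for illustration) using the same column list as the command above.

```python
# Hypothetical sample of `_cat/thread_pool?v&h=name,node_name,active,queue,rejected,completed`.
SAMPLE_OUTPUT = """\
name   node_name active queue rejected completed
search data-1    3      0     0        120934
write  data-1    8      200   57       983411
write  data-2    2      0     0        975002
"""


def pools_with_rejections(cat_output: str) -> list[dict]:
    """Return one record per thread pool row whose rejected count is non-zero."""
    lines = [line.split() for line in cat_output.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    records = [dict(zip(header, row)) for row in rows]
    return [record for record in records if int(record["rejected"]) > 0]


for record in pools_with_rejections(SAMPLE_OUTPUT):
    print(f"{record['node_name']}: {record['name']} pool rejected {record['rejected']} requests")
```

A growing `rejected` count on the write pool usually means clients are sending bulk requests faster than the node can index them.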

23.3 Performance profiling

# Profile a search query
curl -X GET "https://localhost:9200/logs-*/_search?pretty" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "profile": true,
    "query": {
      "bool": {
        "must": [
          { "match": { "message": "timeout error" } }
        ],
        "filter": [
          { "term": { "level": "ERROR" } },
          { "range": { "@timestamp": { "gte": "now-1h" } } }
        ]
      }
    }
  }'
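
A profiled response nests per-shard timings under `profile.shards[].searches[].query[]`, with child queries under `children`. The following sketch flattens that tree and lists the slowest components; the sample response is hypothetical and heavily trimmed, and a parent's time is understood to include the time of its children.

```python
def slowest_query_components(response: dict, top: int = 3) -> list[tuple[str, str, float]]:
    """Flatten the profile tree and return the slowest (shard, type, ms) rows."""
    rows = []
    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            stack = list(search["query"])
            while stack:
                node = stack.pop()
                rows.append((shard["id"], node["type"], node["time_in_nanos"] / 1_000_000))
                stack.extend(node.get("children", []))
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]


# Hypothetical, trimmed response for illustration only.
SAMPLE_RESPONSE = {
    "profile": {
        "shards": [
            {
                "id": "[node-1][logs-2024-01-15][0]",
                "searches": [
                    {
                        "query": [
                            {
                                "type": "BooleanQuery",
                                "time_in_nanos": 5_000_000,
                                "children": [
                                    {"type": "TermQuery", "time_in_nanos": 1_000_000},
                                    {"type": "IndexOrDocValuesQuery", "time_in_nanos": 3_500_000},
                                ],
                            }
                        ]
                    }
                ],
            }
        ]
    }
}

for shard_id, query_type, millis in slowest_query_components(SAMPLE_RESPONSE):
    print(f"{millis:8.2f} ms  {query_type}  {shard_id}")
```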

23.4 Manually repairing unassigned shards

# Find out why a shard is unassigned
curl -X GET "https://localhost:9200/_cluster/allocation/explain?pretty" \
  -u admin:admin --insecure

# Last resort: promote a stale shard copy to primary; writes missing from that copy are lost (accept_data_loss)
curl -X POST "https://localhost:9200/_cluster/reroute" \
  -u admin:admin --insecure \
  -H "Content-Type: application/json" \
  -d '{
    "commands": [
      {
        "allocate_stale_primary": {
          "index": "logs-2024-01-15",
          "shard": 0,
          "node": "data-node-1",
          "accept_data_loss": true
        }
      }
    ]
  }'

# Retry allocation of shards that exhausted their allocation attempts
curl -X POST "https://localhost:9200/_cluster/reroute?retry_failed=true" \
  -u admin:admin --insecure

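24. Best practices

24.1 Design best practices

Category	Best practice
Shard design	Target a shard size of 10-50 GB; keep each data node at or below 1,000 shards
Replicas	At least 1 replica in production; 2 or more where high availability is required
Mappings	Define mappings explicitly; use `dynamic: strict`
Templates	Use composable index templates
Aliases	Write through an alias combined with rollover
ISM	Automate lifecycle policies
Naming convention	`<purpose>-<date>-<sequence>` format (e.g. `logs-2024.01.15-000001`)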
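
The 10-50 GB shard size target translates directly into a primary shard count once the expected index size is known. A minimal sketch of that calculation, with a 30 GB default target chosen here purely as an assumption within the recommended range:

```python
import math


def primary_shard_count(expected_index_gb: float, target_shard_gb: float = 30.0) -> int:
    """Number of primary shards that keeps each shard near the target size."""
    if not 10 <= target_shard_gb <= 50:
        raise ValueError("target shard size should stay within 10-50 GB")
    return max(1, math.ceil(expected_index_gb / target_shard_gb))


def total_shards(primaries: int, replicas: int) -> int:
    """Primaries plus all replica copies, which is what counts against node limits."""
    return primaries * (1 + replicas)


primaries = primary_shard_count(300)
print(f"300 GB index: {primaries} primaries, {total_shards(primaries, replicas=1)} shards in total")
```

For rollover-managed indices, the expected size is the rollover threshold rather than the total retained data volume.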

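24.2 Operational best practices

Category	Best practice
Monitoring	Continuously monitor cluster health, JVM heap, and disk usage
Backup	Daily snapshots; cross-region replication
Upgrades	Zero-downtime upgrades through rolling restarts
Security	Require TLS; apply the principle of least privilege; enable audit logging
Capacity	Set the alert threshold at 75% disk usage
Hot-Warm-Cold	Use tiered storage according to data freshness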

24.3 Security best practices

# Summary of recommended production settings
# opensearch.yml

# TLS
plugins.security.ssl.transport.enforce_hostname_verification: true
plugins.security.ssl.http.enabled: true

# Restrict administrative access
plugins.security.restapi.roles_enabled: ["all_access"]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices:
  - ".plugins-ml-*"
  - ".opendistro-*"
  - ".opensearch-*"

# Enable audit logging
plugins.security.audit.type: internal_opensearch
plugins.security.audit.config.enable_rest: true
plugins.security.audit.config.enable_transport: true
plugins.security.compliance.enabled: true
plugins.security.compliance.write_log_diffs: true

# HTTP bind address and CORS (http.host sets the bind address; it does not validate Host headers)
http.host: 0.0.0.0
http.cors.enabled: false

# Memory lock
bootstrap.memory_lock: true

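24.4 Indexing best practices

Recommendation	Description
Use the Bulk API	Much faster than single-document APIs
Batch size	Target 5-15 MB per request
Parallelism	Parallel workers numbering 2-3 times the data node count
Refresh interval	Lengthen to 30s or more for write-heavy workloads
Replicas	Temporarily set to 0 during large loads and restore afterwards
Auto-generated IDs	Let `_id` be auto-generated where possible (avoids the version check)
Routing	Use custom routing when access patterns are well defined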
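
The batch size recommendation can be enforced on the client by splitting documents into Bulk API payloads that stay under a byte budget. The sketch below only builds the NDJSON bodies and leaves sending them to whichever HTTP client is in use; `bulk_batches` is a hypothetical helper, and omitting `_id` from the action line lets OpenSearch generate IDs as recommended above.

```python
import json
from typing import Iterable, Iterator


def bulk_batches(index: str, docs: Iterable[dict], max_bytes: int = 10 * 1024 * 1024) -> Iterator[str]:
    """Yield NDJSON bulk bodies whose UTF-8 size stays at or below max_bytes.

    A single document larger than max_bytes is still emitted, alone in its own batch.
    """
    action = json.dumps({"index": {"_index": index}})
    lines: list[str] = []
    size = 0
    for doc in docs:
        pair = f"{action}\n{json.dumps(doc, ensure_ascii=False)}\n"
        pair_size = len(pair.encode("utf-8"))
        # Flush the current batch before it would exceed the budget.
        if lines and size + pair_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(pair)
        size += pair_size
    if lines:
        yield "".join(lines)


sample_docs = [{"message": "x" * 100} for _ in range(10)]
batches = list(bulk_batches("logs-test", sample_docs, max_bytes=500))
print(f"{len(batches)} batches, largest is {max(len(b.encode('utf-8')) for b in batches)} bytes")
```

Each yielded string ends with a newline, which the Bulk API requires, and can be posted to `_bulk` with `Content-Type: application/x-ndjson`.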

25. Configuration reference

25.1 Complete opensearch.yml example

# ============================================================
# OpenSearch production configuration example
# ============================================================

# ---- Cluster settings ----
cluster.name: production-opensearch
node.name: ${HOSTNAME}
node.roles: [ data, ingest ]

# ---- Path settings ----
path.data: /var/lib/opensearch
path.logs: /var/log/opensearch
path.repo: ["/mnt/snapshots"]

# ---- Network settings ----
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
http.max_content_length: 100mb
http.compression: true

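# ---- Discovery settings ----
# cluster.initial_cluster_manager_nodes is only needed on cluster-manager-eligible nodes when bootstrapping a new cluster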
discovery.seed_hosts:
  - master-1.opensearch.internal:9300
  - master-2.opensearch.internal:9300
  - master-3.opensearch.internal:9300
cluster.initial_cluster_manager_nodes:
  - master-1
  - master-2
  - master-3

# ---- Memory settings ----
bootstrap.memory_lock: true

# ---- Security plugin ----
plugins.security.ssl.transport.pemcert_filepath: certs/node.pem
plugins.security.ssl.transport.pemkey_filepath: certs/node-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: certs/root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: true

plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: certs/node-http.pem
plugins.security.ssl.http.pemkey_filepath: certs/node-http-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: certs/root-ca.pem

plugins.security.authcz.admin_dn:
  - "CN=admin,OU=IT,O=MyCompany,L=Tokyo,C=JP"
plugins.security.nodes_dn:
  - "CN=*.opensearch.internal,OU=IT,O=MyCompany,L=Tokyo,C=JP"

plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
plugins.security.system_indices.enabled: true

# ---- Shard allocation ----
cluster.routing.allocation.disk.threshold_enabled: true
cluster.routing.allocation.disk.watermark.low: "85%"
cluster.routing.allocation.disk.watermark.high: "90%"
cluster.routing.allocation.disk.watermark.flood_stage: "95%"
cluster.routing.allocation.awareness.attributes: zone
node.attr.zone: ${AVAILABILITY_ZONE}
node.attr.temp: hot

# ---- Thread pools ----
thread_pool:
  write:
    queue_size: 10000
  search:
    queue_size: 1000

# ---- Circuit breakers ----
indices.breaker.total.use_real_memory: true
indices.breaker.total.limit: 95%
indices.breaker.fielddata.limit: 40%
indices.breaker.request.limit: 60%

# ---- Caches ----
indices.fielddata.cache.size: 20%
indices.queries.cache.size: 10%

# ---- Recovery ----
cluster.routing.allocation.node_concurrent_recoveries: 4
indices.recovery.max_bytes_per_sec: 200mb

# ---- Slow logs ----
# Configuring these at the index level is recommended

# ---- Miscellaneous ----
action.destructive_requires_name: true
action.auto_create_index: "+logs-*,+metrics-*,-*"
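
The `action.auto_create_index` value above is an ordered list of patterns: a `+` prefix allows automatic creation, a `-` prefix denies it, and the first pattern that matches the index name decides the outcome. The sketch below models that evaluation under stated assumptions: it treats a name that matches no pattern as denied, and it uses Python's `fnmatchcase`, which also interprets `?` and `[...]`, whereas OpenSearch only treats `*` as a wildcard here.

```python
from fnmatch import fnmatchcase


def may_auto_create(index: str, setting: str = "+logs-*,+metrics-*,-*") -> bool:
    """Return True if the first matching pattern allows auto-creating the index."""
    for raw in setting.split(","):
        pattern = raw.strip()
        if not pattern:
            continue
        allow = not pattern.startswith("-")
        # Drop a single leading +/- marker; an unprefixed pattern counts as allow.
        if pattern[0] in "+-":
            pattern = pattern[1:]
        if fnmatchcase(index, pattern):
            return allow
    return False


for name in ("logs-2024.01.15", "metrics-cpu", "app-data"):
    print(f"{name}: {'allowed' if may_auto_create(name) else 'denied'}")
```

With this setting, a typo in an index name outside the `logs-*` and `metrics-*` families fails fast instead of silently creating a new index with dynamic mappings.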

25.2 Key REST API reference

Category	API	Description
Cluster	`GET _cluster/health`	Cluster health
Cluster	`GET _cluster/state`	Cluster state
Cluster	`GET _cluster/stats`	Cluster statistics
Cluster	`PUT _cluster/settings`	Change cluster settings
Nodes	`GET _cat/nodes?v`	List nodes
Nodes	`GET _nodes/stats`	Node statistics
Nodes	`GET _nodes/hot_threads`	Hot threads
Index	`PUT /<index>`	Create an index
Index	`GET _cat/indices?v`	List indices
Index	`GET /<index>/_stats`	Index statistics
Index	`POST /<index>/_forcemerge`	Force merge
Index	`POST /<index>/_refresh`	Refresh
Document	`POST /<index>/_doc`	Index a document
Document	`POST _bulk`	Bulk operations
Search	`GET /<index>/_search`	Search
Search	`POST _msearch`	Multi-search
Search	`POST _search/scroll`	Scroll search
Templates	`PUT _index_template/<name>`	Create a template
Snapshots	`PUT _snapshot/<repo>`	Register a repository
Snapshots	`PUT _snapshot/<repo>/<snap>`	Create a snapshot
ISM	`PUT _plugins/_ism/policies/<name>`	Create an ISM policy
Security	`GET _plugins/_security/api/roles`	List roles
Alerting	`POST _plugins/_alerting/monitors`	Create a monitor
Anomaly detection	`POST _plugins/_anomaly_detection/detectors`	Create a detector
ML	`POST _plugins/_ml/models/_register`	Register a model
SQL	`POST _plugins/_sql`	Run SQL

26. Summary

OpenSearch began as an open-source fork of Elasticsearch and has since evolved independently into a comprehensive search and analytics platform.

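26.1 Strengths of OpenSearch

Fully open source: Apache License 2.0 permits free use, including commercial use
Rich feature set: full-text search, log analytics, SIEM, observability, and vector search on a single platform
Scalability: from small clusters of a few nodes to large clusters of hundreds of nodes
Ecosystem: Data Prepper, OpenSearch Dashboards, and a broad range of client libraries
Managed service: Amazon OpenSearch Service reduces the operational burden
AI/ML integration: recent AI features such as vector search, Neural Search, and Conversational Search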

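26.2 Considerations when selecting OpenSearch

Consideration	Details
Comparison with Elasticsearch	Differences in licensing, feature roadmap, and ecosystem
Self-hosted vs. managed	Trade-offs among operational burden, cost, and customizability
Scale planning	Capacity planning based on data volume, query load, and retention period
Security requirements	Authentication methods, encryption, audit logging, compliance
Availability requirements	Multi-AZ, cross-region, RPO/RTO design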

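Backed by an active community and by AWS, OpenSearch continues to gain features. As a platform that handles log analytics, full-text search, security analytics, and AI/ML together, its presence is expected to keep growing.

This article is based on OpenSearch 2.19. For the latest features and settings, refer to the official OpenSearch documentation.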