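Deploying an Elasticsearch Cluster on Kubernetes

Elasticsearch is a distributed search and analytics engine widely used for log analysis, full-text search, and security analytics. This article walks through a complete setup for deploying an Elasticsearch cluster on Kubernetes with StatefulSets.

Architecture Overview

Elasticsearch Cluster Architecture

An Elasticsearch cluster consists of multiple nodes, each of which can take on different roles: master nodes handle cluster management and metadata operations; data nodes store data and serve queries; ingest nodes preprocess documents; coordinating nodes route requests and merge results.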

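On Kubernetes, a role-separated architecture (dedicated master, ingest, and data nodes) is recommended, so the number of nodes per role can be tuned to the workload. A typical production setup has 3 master nodes (an odd number, so that a majority can always be formed), several data nodes (sized to the data volume), and optional ingest nodes (if documents need preprocessing).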
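
The odd-number recommendation follows from majority voting: a master election needs more than half of the master-eligible nodes, so a fourth node does not let the cluster survive more failures than three. A minimal sketch of that arithmetic is shown here; the function names `quorum` and `tolerated_failures` are illustrative and not part of any Elasticsearch tooling.

```bash
#!/usr/bin/env bash
# Majority needed among n master-eligible nodes: floor(n/2) + 1.
quorum() {
  local n="$1"
  echo $(( n / 2 + 1 ))
}

# Master-eligible nodes that can be lost while a majority still remains.
tolerated_failures() {
  local n="$1"
  echo $(( n - (n / 2 + 1) ))
}

for n in 3 4 5; do
  echo "masters=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 3 and 4 masters the tolerated failure count is the same (1), which is why the extra even-numbered node adds cost without adding resilience.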

Architecture Topology

┌───────────────────────────────────────────────────────────┐
│                    Kubernetes Cluster                     │
│                                                           │
│  ┌─────────────────────────────────────────────────┐      │
│  │         Elasticsearch Master Nodes (3)          │      │
│  │      es-master-0  es-master-1  es-master-2      │      │
│  └─────────────────────────────────────────────────┘      │
│                           │                               │
│  ┌─────────────────────────────────────────────────┐      │
│  │          Elasticsearch Data Nodes (N)           │      │
│  │      es-data-0  es-data-1  ...  es-data-N       │      │
│  │         (PVC-backed persistent storage)         │      │
│  └─────────────────────────────────────────────────┘      │
│                           │                               │
│  ┌─────────────────────────────────────────────────┐      │
│  │      Elasticsearch Ingest Nodes (optional)      │      │
│  │          es-ingest-0  es-ingest-1  ...          │      │
│  └─────────────────────────────────────────────────┘      │
│                                                           │
│  ┌──────────────────────┐   ┌──────────────────────┐      │
│  │ elasticsearch-master │   │    elasticsearch     │      │
│  │     (ClusterIP)      │   │     (ClusterIP)      │      │
│  └──────────────────────┘   └──────────────────────┘      │
└───────────────────────────────────────────────────────────┘

Deployment Manifests

1. Namespace

yaml
apiVersion: v1
kind: Namespace
metadata:
  name: elasticsearch
  labels:
    name: elasticsearch
    environment: production

2. ServiceAccount and RBAC

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: elasticsearch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elasticsearch
  namespace: elasticsearch
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: elasticsearch
  namespace: elasticsearch
subjects:
- kind: ServiceAccount
  name: elasticsearch
  namespace: elasticsearch
roleRef:
  kind: Role
  name: elasticsearch
  apiGroup: rbac.authorization.k8s.io

3. ConfigMap

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-config
  namespace: elasticsearch
data:
  elasticsearch.yml: |
    # Cluster name
    cluster.name: es-cluster

    # Node roles: set `node.roles` per node type (for example through a
    # `node.roles` environment variable on each StatefulSet). The 7.x settings
    # node.master / node.data / node.ingest no longer exist in 8.x.

    # Network
    network.host: 0.0.0.0
    http.port: 9200
    transport.port: 9300

    # Discovery
    discovery.seed_hosts: ${DISCOVERY_SEEDS}
    cluster.initial_master_nodes: ${INITIAL_MASTER_NODES}

    # Memory
    bootstrap.memory_lock: true
    indices.memory.index_buffer_size: 20%

    # Query cache
    indices.queries.cache.size: 20%

    # Thread pools
    thread_pool.write.queue_size: 500
    thread_pool.search.queue_size: 1000

    # Shard and replica counts are index-level settings and cannot be set
    # here; configure them per index or in an index template.

    # Storage paths
    path.data: /usr/share/elasticsearch/data
    path.logs: /usr/share/elasticsearch/logs

    # JVM heap and GC flags belong in jvm.options or the ES_JAVA_OPTS
    # environment variable, not in elasticsearch.yml.
    
  jvm.options: |
    -Xms2g
    -Xmx2g
    -XX:+UseG1GC
    -XX:G1ReservePercent=25
    -XX:InitiatingHeapOccupancyPercent=30
    -XX:MaxGCPauseMillis=500
    -XX:+UseCompressedOops
    -Djava.io.tmpdir=${ES_TMPDIR}
    -XX:HeapDumpPath=/usr/share/elasticsearch/logs
    -XX:ErrorLog=/usr/share/elasticsearch/logs/hs_err_pid.log
  
  log4j2.properties: |
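    status = error
    appender.console.type = Console
    appender.console.name = console
    appender.console.layout.type = PatternLayout
    appender.console.layout.pattern = [%d{ISO8601}][%-5p][%c{1.}] %m%n
    rootLogger.level = info
    rootLogger.appenderRef.console.ref = console
    logger.elasticsearch.name = org.elasticsearch
    logger.elasticsearch.level = info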

4. Discovery ConfigMap

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-discovery
  namespace: elasticsearch
data:
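  # discovery.seed_hosts expects a comma-separated list when it is filled in
  # from an environment variable, so keep the value on a single line.
  seeds: "es-master-0.es-master-headless.elasticsearch.svc.cluster.local,es-master-1.es-master-headless.elasticsearch.svc.cluster.local,es-master-2.es-master-headless.elasticsearch.svc.cluster.local"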

5. Elasticsearch Master Nodes

yaml
apiVersion: v1
kind: Service
metadata:
  name: es-master-headless
  namespace: elasticsearch
  labels:
    app: elasticsearch
spec:
  clusterIP: None
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
  selector:
    app: elasticsearch
    role: master
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-master
  namespace: elasticsearch
spec:
  serviceName: es-master-headless
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
      role: master
  template:
    metadata:
      labels:
        app: elasticsearch
        role: master
    spec:
      terminationGracePeriodSeconds: 60
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: elasticsearch
                role: master
            topologyKey: kubernetes.io/hostname
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
        imagePullPolicy: IfNotPresent
        env:
        # Elasticsearch 8.x uses node.roles instead of node.master / node.data / node.ingest
        - name: node.roles
          value: "master"
        - name: DISCOVERY_SEEDS
          valueFrom:
            configMapKeyRef:
              name: elasticsearch-discovery
              key: seeds
        - name: INITIAL_MASTER_NODES
          value: "es-master-0,es-master-1,es-master-2"
        - name: ES_JAVA_OPTS
          value: "-Xms1g -Xmx1g"
        - name: bootstrap.memory_lock
          value: "true"
        - name: xpack.security.enabled
          value: "true"
        - name: ELASTIC_PASSWORD
          valueFrom:
            secretKeyRef:
              name: elasticsearch-secrets
              key: password
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
          limits:
            cpu: 2000m
            memory: 4Gi
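        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        # Mount individual files with subPath so the rest of the image's
        # config directory stays intact and writable.
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/jvm.options
          subPath: jvm.options
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/log4j2.properties
          subPath: log4j2.properties
        - name: elasticsearch-logs
          mountPath: /usr/share/elasticsearch/logs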
        # With security enabled, an unauthenticated GET returns 401 and would
        # fail an httpGet probe, so only check that the port accepts connections.
        livenessProbe:
          tcpSocket:
            port: 9200
          initialDelaySeconds: 60
          periodSeconds: 30
        readinessProbe:
          tcpSocket:
            port: 9200
          initialDelaySeconds: 30
          periodSeconds: 10
        startupProbe:
          tcpSocket:
            port: 9200
          initialDelaySeconds: 10
          periodSeconds: 10
          failureThreshold: 30
      volumes:
      # The "data" volume is provided by volumeClaimTemplates below (one PVC per
      # pod); its name must match the volumeMounts entry.
      - name: elasticsearch-config
        configMap:
          name: elasticsearch-config
      - name: elasticsearch-logs
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast
      resources:
        requests:
          storage: 10Gi

6. Elasticsearch Data Nodes

yaml
apiVersion: v1
kind: Service
metadata:
  name: es-data-headless
  namespace: elasticsearch
  labels:
    app: elasticsearch
spec:
  clusterIP: None
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
  selector:
    app: elasticsearch
    role: data
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-data
  namespace: elasticsearch
spec:
  serviceName: es-data-headless
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
      role: data
  template:
    metadata:
      labels:
        app: elasticsearch
        role: data
    spec:
      terminationGracePeriodSeconds: 60
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: elasticsearch
                role: data
            topologyKey: kubernetes.io/hostname
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
        imagePullPolicy: IfNotPresent
        env:
        # Elasticsearch 8.x uses node.roles instead of node.master / node.data / node.ingest
        - name: node.roles
          value: "data"
        - name: DISCOVERY_SEEDS
          valueFrom:
            configMapKeyRef:
              name: elasticsearch-discovery
              key: seeds
        - name: INITIAL_MASTER_NODES
          value: "es-master-0,es-master-1,es-master-2"
        - name: ES_JAVA_OPTS
          value: "-Xmx4g -Xms4g -XX:+UseG1GC"
        - name: bootstrap.memory_lock
          value: "true"
        - name: xpack.security.enabled
          value: "true"
        - name: ELASTIC_PASSWORD
          valueFrom:
            secretKeyRef:
              name: elasticsearch-secrets
              key: password
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          requests:
            cpu: 1000m
            memory: 6Gi
          limits:
            cpu: 4000m
            memory: 8Gi
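        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        # Mount individual files with subPath so the rest of the image's
        # config directory stays intact and writable.
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/jvm.options
          subPath: jvm.options
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/log4j2.properties
          subPath: log4j2.properties
        - name: elasticsearch-logs
          mountPath: /usr/share/elasticsearch/logs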
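        # With security enabled, an unauthenticated GET returns 401 and would
        # fail an httpGet probe, so only check that the port accepts connections.
        livenessProbe:
          tcpSocket:
            port: 9200
          initialDelaySeconds: 60
          periodSeconds: 30
        readinessProbe:
          tcpSocket:
            port: 9200
          initialDelaySeconds: 30
          periodSeconds: 10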
      volumes:
      # The "data" volume is provided by volumeClaimTemplates below (one PVC per
      # pod); its name must match the volumeMounts entry.
      - name: elasticsearch-config
        configMap:
          name: elasticsearch-config
      - name: elasticsearch-logs
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast
      resources:
        requests:
          storage: 100Gi

7. Elasticsearch Ingest Nodes (Optional)

yaml
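# Headless Service referenced by serviceName in the StatefulSet below
apiVersion: v1
kind: Service
metadata:
  name: es-ingest-headless
  namespace: elasticsearch
  labels:
    app: elasticsearch
spec:
  clusterIP: None
  ports:
  - name: http
    port: 9200
  - name: transport
    port: 9300
  selector:
    app: elasticsearch
    role: ingest
---
apiVersion: apps/v1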
kind: StatefulSet
metadata:
  name: es-ingest
  namespace: elasticsearch
spec:
  serviceName: es-ingest-headless
  replicas: 2
  selector:
    matchLabels:
      app: elasticsearch
      role: ingest
  template:
    metadata:
      labels:
        app: elasticsearch
        role: ingest
    spec:
      terminationGracePeriodSeconds: 30
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
        imagePullPolicy: IfNotPresent
        env:
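        # Elasticsearch 8.x uses node.roles instead of node.master / node.data / node.ingest
        - name: node.roles
          value: "ingest"
        # elasticsearch.yml references ${INITIAL_MASTER_NODES}, so it must be defined here too
        - name: INITIAL_MASTER_NODES
          value: "es-master-0,es-master-1,es-master-2"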
        - name: DISCOVERY_SEEDS
          valueFrom:
            configMapKeyRef:
              name: elasticsearch-discovery
              key: seeds
        - name: ES_JAVA_OPTS
          value: "-Xmx1g -Xms1g"
        - name: xpack.security.enabled
          value: "true"
        - name: ELASTIC_PASSWORD
          valueFrom:
            secretKeyRef:
              name: elasticsearch-secrets
              key: password
        ports:
        - name: http
          containerPort: 9200
        - name: transport
          containerPort: 9300
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
          limits:
            cpu: 2000m
            memory: 4Gi
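        volumeMounts:
        # Mount individual files with subPath so the rest of the image's
        # config directory stays intact and writable.
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/jvm.options
          subPath: jvm.options
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/log4j2.properties
          subPath: log4j2.properties
      # The volume referenced above must be declared; ingest nodes need no PVC.
      volumes:
      - name: elasticsearch-config
        configMap:
          name: elasticsearch-config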

8. Client-Facing Services

yaml
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: elasticsearch
  labels:
    app: elasticsearch
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 9200
    targetPort: 9200
  selector:
    app: elasticsearch
    role: data
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-master
  namespace: elasticsearch
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 9200
    targetPort: 9200
  selector:
    app: elasticsearch
    role: master

9. Secret

yaml
apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-secrets
  namespace: elasticsearch
type: Opaque
stringData:
  password: Elastic@2024

10. StorageClass

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
allowVolumeExpansion: true
reclaimPolicy: Retain

11. Pod Disruption Budgets

yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: es-master-pdb
  namespace: elasticsearch
spec:
  minAvailable: 2
  selector:
    matchLabels:
      role: master
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: es-data-pdb
  namespace: elasticsearch
spec:
  minAvailable: 2
  selector:
    matchLabels:
      role: data

Deployment Steps

1. Create the namespace

bash
kubectl apply -f 00-namespace.yaml

2. Create the configuration, Secret, and StorageClass

bash
kubectl apply -f 01-configmap.yaml
kubectl apply -f 02-secrets.yaml
kubectl apply -f 03-storageclass.yaml

3. Create the Services

bash
kubectl apply -f 04-services.yaml

4. Create the StatefulSets

bash
kubectl apply -f 05-es-master.yaml
kubectl apply -f 06-es-data.yaml
# Only if ingest nodes are needed
kubectl apply -f 07-es-ingest.yaml

5. Verify the deployment

bash
# Check pod status
kubectl get pods -n elasticsearch

# Check cluster health
kubectl exec -it es-master-0 -n elasticsearch -- \
  curl -u elastic:Elastic@2024 -k https://localhost:9200/_cluster/health

# List the nodes that joined the cluster
kubectl exec -it es-master-0 -n elasticsearch -- \
  curl -u elastic:Elastic@2024 -k https://localhost:9200/_cat/nodes
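
Right after a rollout the pods may be running before the cluster has formed, so a single health check can be misleading. The sketch below polls `_cluster/health` until the status is `green` or `yellow`. The helper names `health_status` and `wait_for_cluster` are illustrative, and the `ELASTIC_PASSWORD` shell variable is an assumption (set it to the password stored in the Secret).

```bash
#!/usr/bin/env bash
# Extract the "status" field from a _cluster/health JSON response on stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Poll cluster health every 10 seconds; give up after the given number of attempts.
wait_for_cluster() {
  local attempts="${1:-30}" status
  for _ in $(seq 1 "$attempts"); do
    status="$(kubectl exec es-master-0 -n elasticsearch -- \
      curl -s -k -u "elastic:${ELASTIC_PASSWORD}" \
      https://localhost:9200/_cluster/health | health_status)"
    case "$status" in
      green|yellow) echo "cluster status: $status"; return 0 ;;
    esac
    sleep 10
  done
  echo "cluster did not become healthy in time" >&2
  return 1
}
```

Calling `wait_for_cluster 60` would wait up to roughly ten minutes before reporting a failure.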

Usage Examples

Creating an Index

bash
# Create a log index for today's date
curl -X PUT "http://elasticsearch.elasticsearch:9200/logs-$(date +%Y.%m.%d)" \
  -u elastic:Elastic@2024 \
  -H 'Content-Type: application/json' \
  -d '{
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "timestamp": {"type": "date"},
        "message": {"type": "text"},
        "level": {"type": "keyword"},
        "service": {"type": "keyword"}
      }
    }
  }'
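
Because a new index is created every day, repeating the shard and replica settings on each request is easy to forget. One option is an index template that applies them to every index matching `logs-*`. This is a sketch under assumptions: the template name `daily-logs`, the helper names, and the `ES_URL` and `ELASTIC_PASSWORD` variables are illustrative and do not come from the manifests above.

```bash
#!/usr/bin/env bash
# Print an index template body for the daily log indices.
# Arguments: primary shard count (default 3), replica count (default 1).
logs_template_body() {
  local shards="${1:-3}" replicas="${2:-1}"
  cat <<EOF
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": ${shards},
      "number_of_replicas": ${replicas}
    }
  }
}
EOF
}

# Register the template through the composable index template API.
# ES_URL is assumed to look like http://elasticsearch.elasticsearch:9200.
put_logs_template() {
  logs_template_body "$@" | curl -s -X PUT "${ES_URL}/_index_template/daily-logs" \
    -u "elastic:${ELASTIC_PASSWORD}" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}
```

The template name deliberately avoids `logs`, since Elasticsearch 8.x ships a built-in template with that name.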

Log Collection with Filebeat

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: elasticsearch
data:
  filebeat.yml: |
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
            - condition:
                contains:
                  kubernetes.container.name: app
              config:
                - type: container
                  paths:
                    - /var/log/containers/*${data.kubernetes.container.id}.log
                  processors:
                    - add_kubernetes_metadata:
                        in_cluster: true
                    - drop_event:
                        when:
                          contains:
                            log.level: debug
    output.elasticsearch:
      hosts: ['elasticsearch.elasticsearch:9200']
      username: elastic
      password: Elastic@2024
    setup.kibana:
      host: 'kibana:5601'

Monitoring and Alerting

Prometheus Exporter Configuration

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: es-monitoring
  namespace: elasticsearch
data:
  es-exporter.yaml: |
    - job_name: elasticsearch
      static_configs:
      - targets: ['es-data-0.es-data-headless.elasticsearch:9114']

Alert Rules

yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: es-alerts
  namespace: elasticsearch
spec:
  groups:
  - name: elasticsearch.rules
    rules:
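    # Metric names assume the prometheus-community elasticsearch_exporter.
    - alert: ESNodesDown
      expr: up{job="elasticsearch"} == 0
      for: 2m
    - alert: ESClusterRed
      expr: elasticsearch_cluster_health_status{color="red"} == 1
      for: 2m
    - alert: ESDiskSpaceLow
      expr: elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes < 0.1
      for: 5m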

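Operations

Backup (Snapshots)

bash
# Register a snapshot repository. An "fs" repository requires the location to be
# listed in path.repo and backed by a shared filesystem mounted at that path on
# every master and data node.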
curl -X PUT "http://elasticsearch.elasticsearch:9200/_snapshot/backup" \
  -u elastic:Elastic@2024 \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "fs",
    "settings": {
      "location": "/var/backups/elasticsearch"
    }
  }'

# Take a snapshot
curl -X PUT "http://elasticsearch.elasticsearch:9200/_snapshot/backup/snapshot_$(date +%Y%m%d)" \
  -u elastic:Elastic@2024

# Restore a snapshot (replace snapshot_name with the snapshot to restore)
curl -X POST "http://elasticsearch.elasticsearch:9200/_snapshot/backup/snapshot_name/_restore" \
  -u elastic:Elastic@2024

Scaling

bash
# Scale out the data nodes
kubectl scale statefulset es-data -n elasticsearch --replicas=5

# Trigger a rolling restart manually
kubectl rollout restart statefulset es-data -n elasticsearch
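
Scaling in needs more care than scaling out: a StatefulSet removes the pod with the highest ordinal, and any shards still on it have to be rebuilt from replicas. A common approach is to exclude that node from shard allocation first and scale down only once it holds no shards. The sketch below assumes the node name equals the pod name (the default when `node.name` is not set); the helper names and the `ES_URL` and `ELASTIC_PASSWORD` variables are illustrative.

```bash
#!/usr/bin/env bash
# Build the cluster-settings body that moves shards off the named node(s).
# Argument: comma-separated node names, for example "es-data-4".
exclude_body() {
  printf '{"persistent":{"cluster.routing.allocation.exclude._name":"%s"}}\n' "$1"
}

# Build the body that removes the exclusion again after the scale-down.
clear_exclude_body() {
  printf '{"persistent":{"cluster.routing.allocation.exclude._name":null}}\n'
}

# Send a settings body read from stdin to the cluster settings API.
put_cluster_settings() {
  curl -s -X PUT "${ES_URL}/_cluster/settings" \
    -u "elastic:${ELASTIC_PASSWORD}" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}

# Example sequence for going from 5 to 4 data nodes:
#   exclude_body es-data-4 | put_cluster_settings
#   (wait until _cat/shards no longer lists es-data-4)
#   kubectl scale statefulset es-data -n elasticsearch --replicas=4
#   clear_exclude_body | put_cluster_settings
```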

Troubleshooting

Cluster Is Unhealthy

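Possible causes include nodes that failed to start, unassigned shards, and network partitions. Check which nodes have joined the cluster and how shards are allocated.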
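
The checks above map onto three standard APIs: `_cat/nodes`, `_cat/shards`, and `_cluster/allocation/explain`. This is a minimal sketch; `unassigned_only` and `diagnose_cluster` are illustrative helper names, and `ES_URL` and `ELASTIC_PASSWORD` are assumed to be set by the caller.

```bash
#!/usr/bin/env bash
# Keep only unassigned shards from `_cat/shards` output read on stdin.
# Assumes the column order requested below:
#   index shard prirep state unassigned.reason
unassigned_only() {
  awk '$4 == "UNASSIGNED"'
}

diagnose_cluster() {
  local auth="elastic:${ELASTIC_PASSWORD}"
  # Which nodes have joined, and with which roles?
  curl -s -u "$auth" "${ES_URL}/_cat/nodes?v"
  # Which shards are unassigned, and why?
  curl -s -u "$auth" \
    "${ES_URL}/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | unassigned_only
  # Detailed explanation for the first unassigned shard
  # (the API returns an error if every shard is assigned).
  curl -s -u "$auth" "${ES_URL}/_cluster/allocation/explain?pretty"
}
```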

Out of Memory

Possible causes include a JVM heap that is too small and heavy aggregation queries. Adjust the JVM heap settings and optimize the queries.
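
When adjusting the heap, a widely used rule of thumb is to give the JVM about half of the container memory limit, leaving the rest to the filesystem cache, and to stay below roughly 31 GiB so compressed object pointers remain enabled. The data node manifest above follows this rule (8Gi limit, 4g heap). A small sketch of the calculation, with `recommend_heap_mb` as an illustrative helper name:

```bash
#!/usr/bin/env bash
# Suggest a heap size in MiB: half of the container memory limit,
# capped at 31 GiB to stay under the compressed-oops threshold.
recommend_heap_mb() {
  local limit_mb="$1" cap_mb=$(( 31 * 1024 )) half
  half=$(( limit_mb / 2 ))
  if (( half > cap_mb )); then
    half=$cap_mb
  fi
  echo "$half"
}

# Example: an 8Gi memory limit yields a matching ES_JAVA_OPTS value.
heap="$(recommend_heap_mb 8192)"
echo "ES_JAVA_OPTS=-Xms${heap}m -Xmx${heap}m"
```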

Slow Writes

Possible causes include insufficient disk I/O and poorly tuned thread pools. Add data nodes or tune the configuration.
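
Rejections in the `write` thread pool are a useful first signal: a growing `rejected` count means indexing requests arrive faster than the node can process them. The sketch below lists the nodes that have rejected writes. `rejecting_nodes` and `check_write_pool` are illustrative helper names, and `ES_URL` and `ELASTIC_PASSWORD` are assumed to be set by the caller.

```bash
#!/usr/bin/env bash
# Print the names of nodes whose write pool has rejected requests.
# Expects `_cat/thread_pool/write` output on stdin with the columns
#   node_name active queue rejected
rejecting_nodes() {
  awk '$4 > 0 { print $1 }'
}

check_write_pool() {
  curl -s -u "elastic:${ELASTIC_PASSWORD}" \
    "${ES_URL}/_cat/thread_pool/write?h=node_name,active,queue,rejected" | rejecting_nodes
}
```

If only one or two nodes show up, uneven shard distribution or a slow disk on those nodes is a more likely cause than overall capacity.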

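Summary

This article presented a complete setup for deploying an Elasticsearch cluster on Kubernetes. In production, pair it with Kibana for visualization and monitoring, schedule regular snapshots for backup, and add data nodes as the data volume grows.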