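Deploying an Elasticsearch Cluster on Kubernetes

Elasticsearch is a distributed search and analytics engine widely used for log analytics, full-text search, and security analytics. This article walks through a complete approach to deploying an Elasticsearch cluster on Kubernetes with StatefulSets.

Architecture

Elasticsearch cluster architecture

An Elasticsearch cluster consists of multiple nodes, and each node can take on different roles: master nodes handle cluster management and metadata operations; data nodes store data and serve queries; ingest nodes pre-process documents; coordinating nodes route requests and merge the results.

On Kubernetes, a layout that separates the master, ingest, and data roles is recommended, so the number of nodes per role can be sized to the workload. A typical production setup has 3 master nodes (an odd number, so that a majority can always be formed), several data nodes (sized to the data volume), and optional ingest nodes (when documents need pre-processing).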
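
The reason for an odd number of master nodes is that electing a master needs a strict majority of the master-eligible nodes, so a fourth node raises the majority without tolerating any additional failure. The arithmetic can be sketched in shell; the function names `quorum` and `tolerated_failures` are illustrative and not part of any Elasticsearch tooling:

```bash
#!/usr/bin/env bash
# Majority (quorum) size for a given number of master-eligible nodes.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# How many master-eligible nodes can fail while a majority still remains.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few cluster sizes.
for n in 2 3 4 5; do
  echo "masters=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Three and four masters both tolerate a single failure, which is why the extra even-numbered node adds cost but no resilience.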

Topology

┌─────────────────────────────────────────────────────────────┐
│                     Kubernetes Cluster                      │
│                                                             │
│    ┌─────────────────────────────────────────────────┐      │
│    │         Elasticsearch Master Nodes (3)          │      │
│    │     es-master-0   es-master-1   es-master-2     │      │
│    └─────────────────────────────────────────────────┘      │
│                             │                               │
│    ┌─────────────────────────────────────────────────┐      │
│    │          Elasticsearch Data Nodes (N)           │      │
│    │     es-data-0   es-data-1   ...   es-data-N     │      │
│    │          (persistent storage via PVC)           │      │
│    └─────────────────────────────────────────────────┘      │
│                             │                               │
│    ┌─────────────────────────────────────────────────┐      │
│    │      Elasticsearch Ingest Nodes (optional)      │      │
│    │         es-ingest-0   es-ingest-1   ...         │      │
│    └─────────────────────────────────────────────────┘      │
│                                                             │
│    ┌──────────────────┐       ┌──────────────────────┐      │
│    │  es-master-svc   │       │     es-data-svc      │      │
│    │   (ClusterIP)    │       │     (ClusterIP)      │      │
│    └──────────────────┘       └──────────────────────┘      │
└─────────────────────────────────────────────────────────────┘

Deployment manifests
1. Namespace
yaml
apiVersion: v1
kind: Namespace
metadata:
  name: elasticsearch
  labels:
    name: elasticsearch
    environment: production

2. ServiceAccount and RBAC
yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch
  namespace: elasticsearch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: elasticsearch
  namespace: elasticsearch
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: elasticsearch
  namespace: elasticsearch
subjects:
  - kind: ServiceAccount
    name: elasticsearch
    namespace: elasticsearch
roleRef:
  kind: Role
  name: elasticsearch
  apiGroup: rbac.authorization.k8s.io

3. ConfigMap
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-config
  namespace: elasticsearch
data:
  elasticsearch.yml: |
    # Cluster name
    cluster.name: es-cluster
    # Node roles: node.master, node.data and node.ingest were removed in Elasticsearch 8.0.
    # Each StatefulSet sets node.roles through its container environment instead.
    # Network
    network.host: 0.0.0.0
    http.port: 9200
    # transport.tcp.port was renamed to transport.port
    transport.port: 9300
    # Discovery
    discovery.seed_hosts: ${DISCOVERY_SEEDS}
    cluster.initial_master_nodes: ${INITIAL_MASTER_NODES}
    # Memory
    bootstrap.memory_lock: true
    indices.memory.index_buffer_size: 20%
    # Query cache (indices.queries.cache.expire is not a recognized setting and is omitted)
    indices.queries.cache.size: 20%
    # Thread pools
    thread_pool.write.queue_size: 500
    thread_pool.search.queue_size: 1000
    # Storage paths
    path.data: /usr/share/elasticsearch/data
    path.logs: /usr/share/elasticsearch/logs
    # number_of_shards and number_of_replicas are index-level settings and are rejected
    # in elasticsearch.yml; set them per index or in an index template.
    # GC flags are JVM options, not Elasticsearch settings: see jvm.options below.
  jvm.options: |
    # Heap size; ES_JAVA_OPTS set on a StatefulSet is applied last and overrides these values
    -Xms2g
    -Xmx2g
    -XX:+UseG1GC
    -XX:G1ReservePercent=25
    -XX:InitiatingHeapOccupancyPercent=30
    -XX:MaxGCPauseMillis=500
    -XX:+UseCompressedOops
    -Djava.io.tmpdir=${ES_TMPDIR}
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:HeapDumpPath=/usr/share/elasticsearch/logs
    -XX:ErrorLog=/usr/share/elasticsearch/logs/hs_err_pid%p.log
  log4j2.properties: |
    status = error
    appender.console.type = Console
    appender.console.name = console
    appender.console.layout.type = PatternLayout
    appender.console.layout.pattern = [%d{ISO8601}][%-5p][%c{1.}] %m%n
    rootLogger.level = info
    rootLogger.appenderRef.console.ref = console
    logger.elasticsearch.name = org.elasticsearch
    logger.elasticsearch.level = info

4. Discovery ConfigMap
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-discovery
  namespace: elasticsearch
data:
  # A list setting read from an environment variable must be one comma-separated string
  seeds: "es-master-0.es-master-headless.elasticsearch.svc.cluster.local,es-master-1.es-master-headless.elasticsearch.svc.cluster.local,es-master-2.es-master-headless.elasticsearch.svc.cluster.local"

5. Elasticsearch Master nodes
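
The seed hosts used for discovery follow the stable DNS pattern that a StatefulSet with a headless Service gives its Pods: `<statefulset>-<ordinal>.<service>.<namespace>.svc.cluster.local`. As a rough sketch, the list for any replica count could be generated like this; `seed_hosts` is a hypothetical helper, and the `svc.cluster.local` suffix assumes the default cluster domain:

```bash
#!/usr/bin/env bash
# Build a comma-separated seed host list for the Pods of a StatefulSet.
# Usage: seed_hosts <statefulset> <headless-service> <namespace> <replicas>
seed_hosts() {
  sts=$1 svc=$2 ns=$3 replicas=$4
  out=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Assumes the default cluster domain "cluster.local".
    host="${sts}-${i}.${svc}.${ns}.svc.cluster.local"
    if [ -z "$out" ]; then out=$host; else out="$out,$host"; fi
    i=$((i + 1))
  done
  echo "$out"
}

# Names used in this article: StatefulSet es-master, Service es-master-headless.
seed_hosts es-master es-master-headless elasticsearch 3
```

The manifests that define this StatefulSet and its headless Service follow.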
yaml
apiVersion: v1
kind: Service
metadata:
  name: es-master-headless
  namespace: elasticsearch
  labels:
    app: elasticsearch
spec:
  clusterIP: None
  # Publish DNS records before the Pods are ready so the masters can discover each other
  publishNotReadyAddresses: true
  ports:
    - name: http
      port: 9200
    - name: transport
      port: 9300
  selector:
    app: elasticsearch
    role: master
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-master
  namespace: elasticsearch
spec:
  serviceName: es-master-headless
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
      role: master
  template:
    metadata:
      labels:
        app: elasticsearch
        role: master
    spec:
      terminationGracePeriodSeconds: 60
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: elasticsearch
                  role: master
              topologyKey: kubernetes.io/hostname
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
          imagePullPolicy: IfNotPresent
          env:
            # Elasticsearch 8.x role list; replaces the removed node.master/node.data/node.ingest flags
            - name: node.roles
              value: "master"
            - name: DISCOVERY_SEEDS
              valueFrom:
                configMapKeyRef:
                  name: elasticsearch-discovery
                  key: seeds
            - name: INITIAL_MASTER_NODES
              value: "es-master-0,es-master-1,es-master-2"
            - name: ES_JAVA_OPTS
              value: "-Xms1g -Xmx1g"
            - name: bootstrap.memory_lock
              value: "true"
            # A multi-node cluster with security enabled also needs transport TLS
            # (xpack.security.transport.ssl.*); certificate setup is not covered here.
            - name: xpack.security.enabled
              value: "true"
            - name: ELASTIC_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: elasticsearch-secrets
                  key: password
          ports:
            - name: http
              containerPort: 9200
            - name: transport
              containerPort: 9300
          resources:
            requests:
              cpu: 500m
              memory: 2Gi
            limits:
              cpu: 2000m
              memory: 4Gi
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
            # Mount single files with subPath so the rest of the config directory stays intact
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              subPath: elasticsearch.yml
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/jvm.options
              subPath: jvm.options
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/log4j2.properties
              subPath: log4j2.properties
            - name: elasticsearch-logs
              mountPath: /usr/share/elasticsearch/logs
          # With security enabled, an unauthenticated GET /_cluster/health returns 401,
          # so the probes check that the HTTP port accepts connections instead.
          livenessProbe:
            tcpSocket:
              port: 9200
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            tcpSocket:
              port: 9200
            initialDelaySeconds: 30
            periodSeconds: 10
          startupProbe:
            tcpSocket:
              port: 9200
            initialDelaySeconds: 10
            periodSeconds: 10
            failureThreshold: 30
      volumes:
        # The "data" volume is provided by volumeClaimTemplates below
        - name: elasticsearch-config
          configMap:
            name: elasticsearch-config
        - name: elasticsearch-logs
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        # Must match the volumeMounts name; PVCs are created as data-es-master-<ordinal>
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast
        resources:
          requests:
            storage: 10Gi

6. Elasticsearch Data nodes
yaml
apiVersion: v1
kind: Service
metadata:
  name: es-data-headless
  namespace: elasticsearch
  labels:
    app: elasticsearch
spec:
  clusterIP: None
  ports:
    - name: http
      port: 9200
    - name: transport
      port: 9300
  selector:
    app: elasticsearch
    role: data
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-data
  namespace: elasticsearch
spec:
  serviceName: es-data-headless
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
      role: data
  template:
    metadata:
      labels:
        app: elasticsearch
        role: data
    spec:
      terminationGracePeriodSeconds: 60
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                # The field is matchLabels; "labelPairs" is not a valid selector key
                matchLabels:
                  app: elasticsearch
                  role: data
              topologyKey: kubernetes.io/hostname
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
          imagePullPolicy: IfNotPresent
          env:
            # Elasticsearch 8.x role list; replaces the removed node.master/node.data/node.ingest flags
            - name: node.roles
              value: "data"
            - name: DISCOVERY_SEEDS
              valueFrom:
                configMapKeyRef:
                  name: elasticsearch-discovery
                  key: seeds
            - name: INITIAL_MASTER_NODES
              value: "es-master-0,es-master-1,es-master-2"
            - name: ES_JAVA_OPTS
              value: "-Xmx4g -Xms4g -XX:+UseG1GC"
            - name: bootstrap.memory_lock
              value: "true"
            - name: xpack.security.enabled
              value: "true"
            - name: ELASTIC_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: elasticsearch-secrets
                  key: password
          ports:
            - name: http
              containerPort: 9200
            - name: transport
              containerPort: 9300
          resources:
            requests:
              cpu: 1000m
              memory: 6Gi
            limits:
              cpu: 4000m
              memory: 8Gi
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
            # Mount single files with subPath so the rest of the config directory stays intact
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              subPath: elasticsearch.yml
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/jvm.options
              subPath: jvm.options
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/log4j2.properties
              subPath: log4j2.properties
            - name: elasticsearch-logs
              mountPath: /usr/share/elasticsearch/logs
          # With security enabled, an unauthenticated GET /_cluster/health returns 401,
          # so the probes check that the HTTP port accepts connections instead.
          livenessProbe:
            tcpSocket:
              port: 9200
            initialDelaySeconds: 60
            periodSeconds: 30
          readinessProbe:
            tcpSocket:
              port: 9200
            initialDelaySeconds: 30
            periodSeconds: 10
      volumes:
        # The "data" volume is provided by volumeClaimTemplates below
        - name: elasticsearch-config
          configMap:
            name: elasticsearch-config
        - name: elasticsearch-logs
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        # Must match the volumeMounts name; PVCs are created as data-es-data-<ordinal>
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast
        resources:
          requests:
            storage: 100Gi

7. Elasticsearch Ingest nodes (optional)
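yaml
# Headless Service referenced by serviceName in the StatefulSet below
apiVersion: v1
kind: Service
metadata:
  name: es-ingest-headless
  namespace: elasticsearch
  labels:
    app: elasticsearch
spec:
  clusterIP: None
  ports:
    - name: http
      port: 9200
    - name: transport
      port: 9300
  selector:
    app: elasticsearch
    role: ingest
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-ingest
  namespace: elasticsearch
spec:
  serviceName: es-ingest-headless
  replicas: 2
  selector:
    matchLabels:
      app: elasticsearch
      role: ingest
  template:
    metadata:
      labels:
        app: elasticsearch
        role: ingest
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
          imagePullPolicy: IfNotPresent
          env:
            # Elasticsearch 8.x role list; replaces the removed node.master/node.data/node.ingest flags
            - name: node.roles
              value: "ingest"
            - name: DISCOVERY_SEEDS
              valueFrom:
                configMapKeyRef:
                  name: elasticsearch-discovery
                  key: seeds
            # elasticsearch.yml references ${INITIAL_MASTER_NODES}, so it must resolve on every node
            - name: INITIAL_MASTER_NODES
              value: "es-master-0,es-master-1,es-master-2"
            - name: ES_JAVA_OPTS
              value: "-Xmx1g -Xms1g"
            - name: xpack.security.enabled
              value: "true"
            - name: ELASTIC_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: elasticsearch-secrets
                  key: password
          ports:
            - name: http
              containerPort: 9200
            - name: transport
              containerPort: 9300
          resources:
            requests:
              cpu: 500m
              memory: 2Gi
            limits:
              cpu: 2000m
              memory: 4Gi
          volumeMounts:
            # Mount single files with subPath so the rest of the config directory stays intact
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              subPath: elasticsearch.yml
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/jvm.options
              subPath: jvm.options
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/log4j2.properties
              subPath: log4j2.properties
      volumes:
        # The volume that volumeMounts refers to was missing
        - name: elasticsearch-config
          configMap:
            name: elasticsearch-config
  volumeClaimTemplates: []

8. External Services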
yaml
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: elasticsearch
  labels:
    app: elasticsearch
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 9200
      targetPort: 9200
  selector:
    app: elasticsearch
    role: data
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-master
  namespace: elasticsearch
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 9200
      targetPort: 9200
  selector:
    app: elasticsearch
    role: master

9. Secret
yaml
apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-secrets
  namespace: elasticsearch
type: Opaque
stringData:
  password: "Elastic@2024"

10. StorageClass
yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
allowVolumeExpansion: true
reclaimPolicy: Retain

11. Pod disruption budgets
yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: es-master-pdb
  namespace: elasticsearch
spec:
  minAvailable: 2
  selector:
    matchLabels:
      role: master
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: es-data-pdb
  namespace: elasticsearch
spec:
  minAvailable: 2
  selector:
    matchLabels:
      role: data

Deployment steps
1. Create the namespace
bash
kubectl apply -f 00-namespace.yaml

2. Create the configuration and Secret
bash
kubectl apply -f 01-configmap.yaml
kubectl apply -f 02-secrets.yaml
kubectl apply -f 03-storageclass.yaml

3. Create the Services
bash
kubectl apply -f 04-services.yaml

4. Create the StatefulSets
bash
kubectl apply -f 05-es-master.yaml
kubectl apply -f 06-es-data.yaml
# Only if ingest nodes are needed
kubectl apply -f 07-es-ingest.yaml

5. Verify the deployment
bash
# Check Pod status
kubectl get pods -n elasticsearch
# Check cluster health (plain HTTP, since these manifests do not enable TLS on the HTTP layer)
kubectl exec es-master-0 -n elasticsearch -- \
  curl -s -u elastic:Elastic@2024 http://localhost:9200/_cluster/health
# List the nodes and their roles
kubectl exec es-master-0 -n elasticsearch -- \
  curl -s -u elastic:Elastic@2024 http://localhost:9200/_cat/nodes

Usage examples
Creating an index
bash
# Create a daily log index
curl -X PUT "http://elasticsearch.elasticsearch:9200/logs-$(date +%Y.%m.%d)" \
  -u elastic:Elastic@2024 \
  -H 'Content-Type: application/json' \
  -d '{
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1
    },
    "mappings": {
      "properties": {
        "timestamp": {"type": "date"},
        "message": {"type": "text"},
        "level": {"type": "keyword"},
        "service": {"type": "keyword"}
      }
    }
  }'

Log collection (Filebeat)
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: elasticsearch
data:
  filebeat.yml: |
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
            - condition:
                contains:
                  # The kubernetes provider exposes container fields under kubernetes.container.*
                  kubernetes.container.name: app
              config:
                - type: container
                  paths:
                    # Read only the log file of the matched container
                    - /var/log/containers/*${data.kubernetes.container.id}.log
    processors:
      - add_kubernetes_metadata:
          in_cluster: true
      - drop_event:
          when:
            contains:
              log.level: debug
    output.elasticsearch:
      hosts: ['elasticsearch.elasticsearch:9200']
      username: elastic
      password: Elastic@2024
    setup.kibana:
      hosts: ['kibana:5601']

Monitoring and alerting
Prometheus exporter configuration
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: es-monitoring
  namespace: elasticsearch
data:
  es-exporter.yaml: |
    # Scrape job; the target assumes an Elasticsearch exporter is reachable at this address on port 9114
    - job_name: elasticsearch
      static_configs:
        - targets: ['es-data-0.es-data-headless.elasticsearch:9114']

Alert rules
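yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: es-alerts
  namespace: elasticsearch
spec:
  groups:
    - name: elasticsearch.rules
      rules:
        # Metric names below are those of the prometheus-community elasticsearch_exporter
        # (default port 9114); adjust them if a different exporter is used.
        - alert: ESNodesDown
          expr: up{job="elasticsearch"} == 0
          for: 2m
        - alert: ESClusterRed
          # PromQL cannot compare a sample with a string; the color is a label
          expr: elasticsearch_cluster_health_status{color="red"} == 1
          for: 2m
        - alert: ESDiskSpaceLow
          expr: elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes < 0.1
          for: 5m

Operations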
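Backups (snapshots)
bash
# Register a snapshot repository.
# An fs repository needs its location listed under path.repo in elasticsearch.yml
# and backed by storage shared by all master and data nodes.
curl -X PUT "http://elasticsearch.elasticsearch:9200/_snapshot/backup" \
  -u elastic:Elastic@2024 \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "fs",
    "settings": {
      "location": "/var/backups/elasticsearch"
    }
  }'
# Create a snapshot
curl -X PUT "http://elasticsearch.elasticsearch:9200/_snapshot/backup/snapshot_$(date +%Y%m%d)" \
  -u elastic:Elastic@2024
# Restore a snapshot (replace snapshot_name with the snapshot to restore)
curl -X POST "http://elasticsearch.elasticsearch:9200/_snapshot/backup/snapshot_name/_restore" \
  -u elastic:Elastic@2024

Scaling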
bash
# Scale out the data nodes
kubectl scale statefulset es-data -n elasticsearch --replicas=5
# Trigger a rolling restart manually
kubectl rollout restart statefulset es-data -n elasticsearch

Troubleshooting
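Cluster is unhealthy
Possible causes include nodes that failed to start, unassigned shards, and network partitions. Check the node status and the shard allocation.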
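
One way to check shard allocation is to summarize the output of the `_cat/shards` API by shard state. The following is a minimal sketch; `shard_states` is a hypothetical helper, and the sample rows are made up for illustration rather than taken from a real cluster:

```bash
#!/usr/bin/env bash
# Count shards per state.
# Expects `_cat/shards?h=index,shard,prirep,state` output on stdin.
shard_states() {
  awk 'NF >= 4 { count[$4]++ } END { for (s in count) print s, count[s] }' | sort
}

# Hypothetical sample output, for illustration only.
sample='logs-2024.01.01 0 p STARTED
logs-2024.01.01 0 r UNASSIGNED
logs-2024.01.01 1 p STARTED
logs-2024.01.01 1 r UNASSIGNED'

printf '%s\n' "$sample" | shard_states
```

Against a live cluster the same function could be fed from `curl -s -u elastic:Elastic@2024 "http://elasticsearch.elasticsearch:9200/_cat/shards?h=index,shard,prirep,state"`. For any shard reported as UNASSIGNED, the `_cluster/allocation/explain` API reports why it could not be allocated.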
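
Out of memory
Possible causes include a JVM heap that is too small and heavy aggregation queries. Adjust the JVM settings and optimize the queries.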
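
When adjusting the heap, a common rule of thumb is to give the JVM about half of the container memory limit and leave the rest to the filesystem cache, while staying below the compressed object pointer threshold. The sketch below encodes that rule; `heap_mb` is a hypothetical helper, and the 26 GiB cap is a conservative assumption, since the exact threshold depends on the JVM and platform:

```bash
#!/usr/bin/env bash
# Suggest a JVM heap size in MiB for a container memory limit given in MiB:
# half of the limit, capped at 26 GiB (assumed safe compressed-oops bound).
heap_mb() {
  limit_mb=$1
  cap_mb=$((26 * 1024))
  heap=$((limit_mb / 2))
  if [ "$heap" -gt "$cap_mb" ]; then heap=$cap_mb; fi
  echo "$heap"
}

# Print matching -Xms/-Xmx flags for a few memory limits.
for limit in 4096 8192 131072; do
  size=$(heap_mb "$limit")
  echo "limit=${limit}Mi -> -Xms${size}m -Xmx${size}m"
done
```

For the 8Gi limit of the data nodes in this article the rule gives a 4 GiB heap, which matches the `-Xms4g -Xmx4g` set in `ES_JAVA_OPTS`.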

Slow writes
Possible causes include insufficient disk I/O and poorly tuned thread pools. Add data nodes or tune the configuration.
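
A quick way to tell whether the write thread pool is the bottleneck is to look for rejected tasks in the `_cat/thread_pool` output. This is a sketch under assumptions: `write_rejections` is a hypothetical helper, and the sample rows are invented for illustration:

```bash
#!/usr/bin/env bash
# Print the nodes whose write thread pool has rejected tasks.
# Expects `_cat/thread_pool/write?h=node_name,name,active,queue,rejected` output on stdin.
write_rejections() {
  awk '$5 > 0 { print $1, "rejected=" $5 }'
}

# Hypothetical sample output, for illustration only.
sample='es-data-0 write 2 0 0
es-data-1 write 8 431 57
es-data-2 write 1 0 0'

printf '%s\n' "$sample" | write_rejections
```

A rejection count that keeps growing on the same nodes suggests that those nodes cannot keep up with indexing, which points to adding data nodes or spreading shards more evenly rather than only enlarging the queue.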
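
Summary
This article presented a complete approach to deploying an Elasticsearch cluster on Kubernetes. In production, pair the cluster with Kibana for visual monitoring, schedule regular snapshots for backups, and add data nodes as the data volume grows.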
