Skip to content

Kubernetes 部署 MySQL 双主集群

MySQL 双主架构是指两台 MySQL 服务器互相作为主从,彼此之间相互复制数据。这种架构提供了读写分离能力的同时保证了高可用性,任一节点故障不影响服务继续运行。本文详细介绍在 Kubernetes 环境中使用 StatefulSet 部署 MySQL 双主集群的完整方案。

架构介绍

双主架构概述

MySQL 双主架构(Master-Master Replication)是数据库高可用的一种经典方案。在双主模式下,两台 MySQL 服务器互为主从,任何一台服务器上的写入操作都会同步到另一台服务器。这种架构的优势在于:两台服务器都可以接受写入请求,实现了真正的读写分离;任一节点故障时,应用程序可以自动切换到另一个节点继续服务;进行维护时可以通过切换保证服务不中断。

在 Kubernetes 环境中,我们通过以下方式实现双主部署:使用 StatefulSet 管理两个 MySQL Pod,每个 Pod 拥有独立的持久化存储;通过 Service 分别暴露主节点和从节点;配置 MySQL 主从复制参数实现数据同步;使用健康检查探针确保服务可用性。

架构拓扑

┌─────────────────────────────────────────────────────────────┐
│                    Kubernetes Cluster                   │
│                                                      │
│  ┌──────────────────┐      ┌──────────────────┐   │
│  │   mysql-master-0 │◄────►│   mysql-master-1 │   │
│  │   (Pod/StatefulSet)│      │  (Pod/StatefulSet) │   │
│  └────────┬─────────┘      └────────┬─────────┘   │
│           │                        │                │
│    ┌──────┴──────┐        ┌──────┴──────┐      │
│    │ PVC (数据)   │        │ PVC (数据)   │      │
│    └─────────────┘        └─────────────┘      │
└─────────────────────────────────────────────────┘
          │                        │
┌────────┴────────┐      ┌────────┴────────┐
│  mysql-master   │      │ mysql-slave    │
│  Service       │      │ Service       │
│  (读写)        │      │  (只读)       │
└────────┬────────┘      └────────┬────────┘
         │                        │
    ┌────┴────┐            ┌────┴────┐
    │  App1   │            │  App2   │
    └─────────┘            └─────────┘

技术要点

双主复制的核心原理是利用 MySQL 的 binlog 日志进行数据同步。每个 MySQL 实例既作为主服务器向对方发送 binlog,又作为从服务器接收并执行对方的 binlog。这种复制模式是异步的,存在轻微的数据延迟,但对于大多数应用场景来说是可以接受的。

关键配置包括:server-id 必须唯一,每个实例拥有不同的 ID;auto-increment-increment 和 auto-increment-offset 设置避免主键冲突;log-bin 开启二进制日志用于复制;relay-log 配置中继日志。

完整部署资源清单

命名空间

yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mysql-cluster
  labels:
    name: mysql-cluster
    environment: production

ConfigMap 配置

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-config
  namespace: mysql-cluster
data:
  my.cnf: |
    [mysqld]
    # 基础配置
    default-storage-engine=InnoDB
    character-set-server=utf8mb4
    collation-server=utf8mb4_unicode_ci
    max_connections=1000
    max_connect_errors=1000
    explicit_defaults_for_timestamp=true

    # 绑定地址
    bind-address=0.0.0.0

    # 二进制日志配置(用于复制)
    log-bin=mysql-bin
    binlog_format=ROW
    binlog_row_image=FULL
    expire_logs_days=7
    max_binlog_size=256M
    sync_binlog=1

    # 中继日志配置
    relay_log=relay-bin
    relay_log_recovery=1

    # 只读配置(从节点)
    read_only=1
    super_read_only=1

    # 复制配置
    server-id=1
    log_replica_updates=1
    replica_skip_errors=1062

    # 性能优化
    innodb_buffer_pool_size=512M
    innodb_log_file_size=128M
    innodb_flush_log_at_trx_commit=1
    innodb_flush_method=O_DIRECT
    table_open_cache=2000
    table_definition_cache=2000
    query_cache_type=0
    query_cache_size=0

    # 日志配置
    log_error=/var/log/mysql/error.log
    slow_query_log=1
    slow_query_log_file=/var/log/mysql/slow.log
    long_query_time=2

    [client]
    default-character-set=utf8mb4

    [mysql]
    default-character-set=utf8mb4
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-init scripts
  namespace: mysql-cluster
data:
  init-master.sql: |
    -- 设置主从复制账户和权限
    CREATE USER IF NOT EXISTS 'repl'@'%' IDENTIFIED BY 'Repl@2024';
    GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
    GRANT SELECT ON *.* TO 'repl'@'%';
    FLUSH PRIVILEGES;

    -- 创建健康检查账户
    CREATE USER IF NOT EXISTS 'healthcheck'@'%' IDENTIFIED BY 'Health@2024';
    GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'healthcheck'@'%';
    FLUSH PRIVILEGES;

  init-replica.sql: |
    -- 停止复制线程
    STOP SLAVE;
    RESET SLAVE ALL;

    -- 配置复制源
    CHANGE REPLICATION SOURCE TO 
        SOURCE_HOST='mysql-master-0.mysql-cluster-headless',
        SOURCE_PORT=3306,
        SOURCE_USER='repl',
        SOURCE_PASSWORD='Repl@2024',
        SOURCE_AUTO_POSITION=1,
        GET_SOURCE_PUBLIC_KEY=1;

    -- 启动复制线程
    START REPLICA;

    -- 验证复制状态
    SHOW REPLICA STATUS\G

Headless Service

yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql-cluster-headless
  namespace: mysql-cluster
  labels:
    app: mysql
spec:
  clusterIP: None
  ports:
    - name: mysql
      port: 3306
  selector:
    app: mysql

MySQL StatefulSet

yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-master
  namespace: mysql-cluster
spec:
  serviceName: mysql-cluster-headless
  replicas: 2
  selector:
    matchLabels:
      app: mysql
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: mysql
    spec:
      terminationGracePeriodSeconds: 30
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: mysql
                topologyKey: kubernetes.io/hostname
      containers:
        - name: mysql
          image: mysql:8.0.36
          imagePullPolicy: IfNotPresent
          ports:
            - name: mysql
              containerPort: 3306
            - name: mysql-replica
              containerPort: 33060
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secrets
                  key: root-password
            - name: MYSQL_DATABASE
              value: "appdb"
            - name: MYSQL_REPLICATION_USER
              value: "repl"
            - name: MYSQL_REPLICATION_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secrets
                  key: repl-password
            - name: SERVER_ID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "2000m"
              memory: "4Gi"
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
            - name: mysql-config
              mountPath: /etc/mysql/conf.d
            - name: mysql-logs
              mountPath: /var/log/mysql
          livenessProbe:
            exec:
              command:
                - mysqladmin
                - ping
                - -h
                - localhost
                - -u
                - root
                - -pHealth@2024
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            exec:
              command:
                - mysql
                - -h
                - localhost
                - -u
                - root
                - -pHealth@2024
                - -e
                - SELECT 1
            initialDelaySeconds: 30
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          startupProbe:
            exec:
              command:
                - mysql
                - -h
                - localhost
                - -u
                - root
                - -pHealth@2024
                - -e
                - SELECT 1
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 30
      volumes:
        - name: mysql-data
          persistentVolumeClaim:
            claimName: mysql-data
        - name: mysql-config
          configMap:
            name: mysql-config
        - name: mysql-logs
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: mysql-data
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: fast
          resources:
            requests:
              storage: 50Gi

MySQL Services

yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql-master
  namespace: mysql-cluster
  labels:
    app: mysql
spec:
  type: ClusterIP
  ports:
    - name: mysql
      port: 3306
      targetPort: 3306
  selector:
    app: mysql
    statefulset.kubernetes.io/pod-name: mysql-master-0
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-replica
  namespace: mysql-cluster
  labels:
    app: mysql
spec:
  type: ClusterIP
  ports:
    - name: mysql
      port: 3306
      targetPort: 3306
  selector:
    app: mysql
    statefulset.kubernetes.io/pod-name: mysql-master-1

Secret

yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secrets
  namespace: mysql-cluster
type: Opaque
stringData:
  root-password: Root@2024
  repl-password: Repl@2024
  healthcheck-password: Health@2024

ServiceAccount 和 RBAC

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mysql
  namespace: mysql-cluster
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: mysql
  namespace: mysql-cluster
rules:
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: mysql
  namespace: mysql-cluster
subjects:
  - kind: ServiceAccount
    name: mysql
    namespace: mysql-cluster
roleRef:
  kind: Role
  name: mysql
  apiGroup: rbac.authorization.k8s.io

PDB(Pod 中断预算)

yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
  namespace: mysql-cluster
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: mysql

StorageClass

yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd
allowVolumeExpansion: true
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer

初始化脚本

由于 StatefulSet 启动时需要配置主从复制,我们需要添加初始化容器或使用生命周期钩子。以下是完整的初始化配置:

yaml
# 在 StatefulSet 的 initContainers 中添加
initContainers:
  - name: init-mysql
    image: mysql:8.0.36
    command:
      - bash
      - -c
      - |
        #!/bin/bash
        set -ex

        # 获取 Pod 序号
        ordinal=$(echo ${HOSTNAME} | sed 's/.*-//')

        # 第一个节点:初始化数据库
        if [[ ordinal -eq 0 ]]; then
          echo "Initializing first MySQL node..."
          # 等待 MySQL 启动
          until mysqladmin ping -h localhost -u root -p${MYSQL_ROOT_PASSWORD} --silent; do
            echo "Waiting for MySQL to be ready..."
            sleep 2
          done
          
          # 执行初始化 SQL
          mysql -u root -p${MYSQL_ROOT_PASSWORD} < /docker-entrypoint-initdb.d/init-master.sql
          echo "First MySQL node initialized successfully"
        else
          # 等待主节点就绪
          echo "Waiting for master node to be ready..."
          until mysql -h mysql-master-0.mysql-cluster-headless -u root -p${MYSQL_ROOT_PASSWORD} -e "SELECT 1" --silent; do
            echo "Waiting for master node..."
            sleep 2
          done
          
          # 等待复制账户创建完成
          sleep 10
          
          # 配置复制
          echo "Configuring replication from master node..."
          mysql -u root -p${MYSQL_ROOT_PASSWORD} -h mysql-master-0.mysql-cluster-headless <<EOF
            STOP REPLICA;
            RESET REPLICA ALL;
            CHANGE REPLICATION SOURCE TO 
                SOURCE_HOST='mysql-master-0.mysql-cluster-headless',
                SOURCE_PORT=3306,
                SOURCE_USER='${MYSQL_REPLICATION_USER}',
                SOURCE_PASSWORD='${MYSQL_REPLICATION_PASSWORD}',
                SOURCE_AUTO_POSITION=1,
                GET_SOURCE_PUBLIC_KEY=1;
            START REPLICA;
          EOF
          
          echo "Replication configured successfully"
        fi
    env:
      - name: MYSQL_ROOT_PASSWORD
        valueFrom:
          secretKeyRef:
            name: mysql-secrets
            key: root-password
      - name: MYSQL_REPLICATION_USER
        value: "repl"
      - name: MYSQL_REPLICATION_PASSWORD
        valueFrom:
          secretKeyRef:
            name: mysql-secrets
            key: repl-password
    volumeMounts:
      - name: mysql-data
        mountPath: /var/lib/mysql
      - name: mysql-init
        mountPath: /docker-entrypoint-initdb.d

部署步骤

1. 创建命名空间

bash
kubectl apply -f 00-namespace.yaml

2. 创建 ConfigMap 和 Secret

bash
kubectl apply -f 01-configmap.yaml
kubectl apply -f 02-secrets.yaml

3. 创建 StorageClass

bash
kubectl apply -f 03-storageclass.yaml

4. 创建 Service

bash
kubectl apply -f 04-services.yaml

5. 创建 StatefulSet

bash
kubectl apply -f 05-statefulset.yaml

6. 验证部署

bash
# 查看 Pod 状态
kubectl get pods -n mysql-cluster -l app=mysql

# 查看服务
kubectl get svc -n mysql-cluster

# 查看 PVC
kubectl get pvc -n mysql-cluster

# 查看日志
kubectl logs mysql-master-0 -n mysql-cluster -c mysql
kubectl logs mysql-master-1 -n mysql-cluster -c mysql

7. 验证复制状态

bash
# 进入 MySQL 容器
kubectl exec -it mysql-master-0 -n mysql-cluster -- mysql -u root -pRoot@2024

# 执行验证SQL
SHOW REPLICA STATUS\G

应用配置示例

Java 应用配置

yaml
# Spring Boot 配置示例
spring:
  datasource:
    url: jdbc:mysql:replication://source=mysql-master.mysql-cluster.svc.cluster.local:3306,target=mysql-replica.mysql-cluster.svc.cluster.local:3306/appdb?useUnicode=true&characterEncoding=utf-8&useSSL=false&serverTimezone=Asia/Shanghai
    username: appuser
    password: App@2024
    driver-class-name: com.mysql.cj.jdbc.Driver
    hikari:
      maximum-pool-size: 20
      minimum-idle: 5
      connection-timeout: 30000
      idle-timeout: 600000
      max-lifetime: 1800000

Python 应用配置

python
# Python SQLAlchemy 配置示例
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

DATABASE_URL = (
    "mysql+pymysql://appuser:App@2024@"
    "mysql-master.mysql-cluster.svc.cluster.local:3306/appdb"
    "?retry_on_failure=3"
)

engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,
)

# 读写分离配置可以使用 ReadReplica 连接
READ_REPLICA_URL = (
    "mysql+pymysql://appuser:App@2024@"
    "mysql-replica.mysql-cluster.svc.cluster.local:3306/appdb"
)

监控和告警

Prometheus 监控配置

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-monitoring
  namespace: mysql-cluster
data:
  mysqld-exporter-job.yaml: |
    - job_name: mysql_exporter
      static_configs:
      - targets:
        - mysql-master-0.mysql-cluster:9104
        - mysql-master-1.mysql-cluster:9104

告警规则

yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mysql-alerts
  namespace: mysql-cluster
spec:
  groups:
    - name: mysql.rules
      rules:
        - alert: MySQLDown
          expr: up{job="mysql_exporter"} == 0
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "MySQL {{ $labels.instance }} is down"
        - alert: MySQLReplicationLag
          expr: mysql_replica_seconds_behind_master > 30
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "MySQL replication lag is {{ $value }} seconds"

运维操作

备份和恢复

bash
# 备份
kubectl exec mysql-master-0 -n mysql-cluster -- \
  mysqldump -u root -pRoot@2024 --all-databases > backup.sql

# 恢复
kubectl exec -i mysql-master-0 -n mysql-cluster -- \
  mysql -u root -pRoot@2024 < backup.sql

扩容

bash
# 扩容(需要手动配置新节点的复制)
kubectl scale statefulset mysql-master -n mysql-cluster --replicas=3

主节点切换

bash
# 手动切换主节点(停止应用后执行)
kubectl exec mysql-master-1 -n mysql-cluster -- \
  mysql -u root -pRoot@2024 -e "STOP REPLICA; START REPLICA;"

常见问题排查

复制延迟过高

可能原因包括网络延迟、事务过大、硬件性能不足。处理方法:优化慢查询、减少大事务、增加硬件资源。

数据不一致

可能原因包括复制中断后未及时发现。处理方法:重新建立复制、使用 pt-table-checksum 检查。

连接超时

可能原因包括健康检查配置不当。处理方法:调整探针配置、增加超时时间。

总结

本文提供了在 Kubernetes 环境中部署 MySQL 双主集群的完整方案。生产环境中建议配合使用监控告警系统,定期检查复制状态,并设置合理的备份策略。双主架构虽然提供了高可用性,但真正的零数据丢失还需要配合延迟副本和定期备份来实现。