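Kubernetes Autoscaling (HPA)

The Kubernetes Horizontal Pod Autoscaler (HPA) is the core component for automatically scaling workloads in Kubernetes. It adjusts the number of Pods to match application load, which keeps applications available while using resources efficiently. This article covers how HPA works, how to configure it, best practices, and advanced production scenarios.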

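HPA core concepts and how it works

Horizontal versus vertical scaling

  • Horizontal scaling: handles load changes by adding or removing Pods. This is the preferred approach for cloud-native applications because it gives better elasticity and fault isolation
  • Vertical scaling: handles load changes by adjusting the resources (CPU/memory) of a single Pod. It suits applications that cannot be scaled horizontally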

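How HPA works

The HPA controller runs the following loop at a fixed interval (15 seconds by default):

  1. Collect metrics: query the Metrics API for the current metrics of the target workload (for example a Deployment)
  2. Decide: compute the ratio of the current metric value to the target value
  3. Scale: adjust the number of Pod replicas according to that ratio

The core formula is:

text
desiredReplicas = ceil(currentReplicas × (currentMetricValue / targetMetricValue))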

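Refinements to the algorithm

  • Tolerance: 0.1 by default, so no scaling happens while the ratio stays between 0.9 and 1.1
  • Not-yet-ready Pods: metrics from Pods that are still starting are set aside
  • Missing metrics: Pods without metrics are estimated conservatively (treated as using 100% of the target when scaling down and 0% when scaling up)
  • Scale-down stabilization: a default 5-minute window smooths scale-down decisions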
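
The formula and the tolerance band can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only, not the controller's implementation; the function name `desired_replicas` and the sample numbers are made up for illustration, and the handling of unready Pods and missing metrics is left out.

```python
import math

def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the core HPA formula."""
    ratio = current_value / target_value
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

print(desired_replicas(4, 90, 60))  # ratio 1.5, scales out
print(desired_replicas(4, 63, 60))  # ratio 1.05, inside the tolerance band
print(desired_replicas(4, 30, 60))  # ratio 0.5, scales in
```

With 4 replicas and a target of 60, a measured value of 90 asks for 6 replicas, 63 leaves the count at 4, and 30 asks for 2.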

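HPA API versions and metric types

HPA API versions

  • HPA v1 (autoscaling/v1): supports CPU utilization only
  • HPA v2beta1 (autoscaling/v2beta1): adds memory and custom metrics and allows several metrics to be combined
  • HPA v2 (autoscaling/v2): the stable API (GA since Kubernetes 1.23), covering Resource, Pods, Object, and External metrics together with configurable scaling behavior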

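Supported metric types

Metric type | Data source | Use case | Example config
Resource | Metrics Server | CPU/memory utilization | type: Utilization
Pods | Custom metrics (e.g. Prometheus) | Business metrics (QPS, queue length) | name: requests_per_second
Object | Kubernetes object (e.g. Ingress) | Metrics of a related object (e.g. request latency) | describedObject: Ingress
External | External system (e.g. cloud monitoring) | Cross-cluster or hybrid-cloud metrics | metric: queue_messages_ready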

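Resource monitoring with Metrics Server

Metrics Server provides the resource metrics (CPU/memory) and is the cluster-wide aggregator of core monitoring data. Put simply, it collects resource usage from the kubelet on every node, keeps the most recent values in memory, and exposes them through the metrics.k8s.io API for consumers such as HPA and kubectl top.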

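bash
# Option 1: apply the upstream manifest directly
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.3/components.yaml

# Option 2: download the manifest first so it can be edited
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.3/components.yaml

# Edit components.yaml before applying:
#   - add the container argument --kubelet-insecure-tls (skips kubelet certificate verification)
#   - point the image at a mirror:
#     image: registry.aliyuncs.com/google_containers/metrics-server:v0.6.3

# Pre-pull the image on the nodes:
docker pull registry.aliyuncs.com/google_containers/metrics-server:v0.6.3

components.yaml

The manifest uses imagePullPolicy: IfNotPresent, so a locally cached image is used as-is and a pull only happens when the image is missing. Pre-pulling the image avoids depending on registry access at startup.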

yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      hostNetwork: true
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls  # skip kubelet TLS certificate verification
        #image: registry.k8s.io/metrics-server/metrics-server:v0.6.3
        image: registry.aliyuncs.com/google_containers/metrics-server:v0.6.3
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 4443
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

Deploy

bash
kubectl apply -f components.yaml

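Kubernetes HPA parameter reference

This section walks through the core HPA parameters, required and optional, and how each one affects scaling behavior.

Basic parameters

Basic resource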

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
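  name: my-app-hpa      # name of the HPA resource
  namespace: production # namespace (same as the target workload)
spec:
  scaleTargetRef:
    apiVersion: apps/v1  # API version of the target resource
    kind: Deployment     # resource kind (Deployment, StatefulSet, etc.)
    name: my-app         # name of the target resource
  
  minReplicas: 2        # minimum number of Pods (defaults to 1 when omitted)
  maxReplicas: 10       # maximum number of Pods (must be >= minReplicas)
  
  metrics:              # metrics that drive autoscaling
  - type: Resource      # resource metric
    resource:
      name: cpu         # scale on CPU utilization
      target:
        type: Utilization       # target type (Utilization/Value/AverageValue)
        averageUtilization: 70  # target CPU utilization, percent of requests (70%)
  
  - type: Resource
    resource:
      name: memory      # scale on memory usage
      target:
        type: AverageValue
        averageValue: 500Mi     # target average memory usage per Pod
  
  # Optional behavior configuration (available in autoscaling/v2)
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # scale-down stabilization window (prevents flapping)
      policies:
      - type: Percent
        value: 10             # remove at most 10% of the Pods...
        periodSeconds: 60     # ...within any 60-second period
    scaleUp:
      stabilizationWindowSeconds: 60  # scale-up stabilization window (default is 0)
      policies:
      - type: Pods
        value: 4              # add at most 4 Pods...
        periodSeconds: 15     # ...within any 15-second period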

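scaleTargetRef (required)

Identifies the workload to scale automatically

yaml
scaleTargetRef:
  apiVersion: apps/v1  # API version of the target resource
  kind: Deployment     # resource kind (Deployment, StatefulSet, etc.)
  name: my-app         # name of the target resource

Notes

  • Supported workload types: Deployment, StatefulSet, ReplicaSet, and any other resource that exposes the scale subresource
  • The target resource must exist and be running normally
  • Scaling writes directly to the replicas field of the target resource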

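minReplicas (optional, defaults to 1)

The minimum replica count, acting as a lower bound

yaml
minReplicas: 2

Best practices

  • Use at least 2 in production for high availability
  • Scaling to zero (minReplicas: 0) is not available by default: it needs the alpha HPAScaleToZero feature gate with an Object or External metric, or a tool such as KEDA

maxReplicas (required)

The maximum replica count, acting as an upper bound

yaml
maxReplicas: 10

Notes

  • Prevents unbounded scale-out from exhausting cluster resources
  • Set it according to the capacity of the cluster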

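Metric configuration (metrics)

Resource metrics (CPU/memory)

yaml
metrics:
- type: Resource
  resource:
    name: cpu  # or memory
    target:
      type: Utilization  # or AverageValue
      averageUtilization: 70  # percentage
      # averageValue: 500m  # absolute value (e.g. 500 millicores)

Parameter notes

  • type: Utilization: a percentage of the Pods' resource requests
  • type: AverageValue: an absolute usage amount averaged across Pods
  • Utilization can only be computed when the Pods set resources.requests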
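
To make the dependency on resources.requests concrete, here is a small sketch of how a utilization percentage comes out of usage and request values. It is an illustration under simplifying assumptions, not the controller's code: the helper names `cpu_millicores` and `average_utilization` are invented, and only the plain-number and `m` (millicore) CPU notations are handled.

```python
def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '200m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)

def average_utilization(usages, request: str) -> float:
    """Total CPU usage as a percentage of the total requested CPU.

    usages: per-Pod CPU usage strings; request: the per-Pod CPU request.
    """
    total_usage = sum(cpu_millicores(u) for u in usages)
    total_request = len(usages) * cpu_millicores(request)
    return 100 * total_usage / total_request

# Two Pods requesting 200m each, using 150m and 250m.
print(average_utilization(["150m", "250m"], "200m"))
```

Without a request there is no denominator, which is why a Utilization target reports an unknown value for Pods that omit resources.requests.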

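Pods metrics (custom metrics)

yaml
metrics:
- type: Pods
  pods:
    metric:
      name: requests_per_second  # metric name
      selector:  # optional metric selector
        matchLabels:
          app: frontend
    target:
      type: AverageValue  # the only target type allowed for Pods metrics
      averageValue: 100   # target average per Pod

Data source

  • Requires a custom metrics adapter (Prometheus Adapter or similar)
  • The metric name must match a rule defined in the adapter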

Object metrics (tied to a Kubernetes object)

yaml
metrics:
- type: Object
  object:
    describedObject:  # the Kubernetes object the metric describes
      apiVersion: networking.k8s.io/v1
      kind: Ingress
      name: main-ingress
    metric:
      name: latency_ms  # metric name
    target:
      type: Value  # or AverageValue
      value: 200   # target value (milliseconds)

Typical scenarios

  • Ingress request latency
  • Service error rate
  • Persistent volume usage

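External metrics (external systems)

yaml
metrics:
- type: External
  external:
    metric:
      name: sqs_queue_length  # external metric name
      selector:  # optional metric selector
        matchLabels:
          queue: orders
    target:
      type: Value  # or AverageValue
      value: 1000  # target queue length

Common external metrics

  • Message queue backlog (SQS/Kafka/RabbitMQ)
  • Database connection count
  • Cloud service metrics (such as AWS ALB request count)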

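Behavior control (behavior)

Per-direction control

yaml
behavior:
  scaleDown:  # scale-down policy
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 10
      periodSeconds: 60
  scaleUp:    # scale-up policy
    stabilizationWindowSeconds: 60
    policies:
    - type: Pods
      value: 4
      periodSeconds: 15

stabilizationWindowSeconds

The stabilization window in seconds, used to dampen flapping: before changing the replica count, the controller looks back at the recommendations it computed during this window.

Recommended values

  • scaleDown: 300-600 seconds (5-10 minutes)
  • scaleUp: 30-60 seconds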
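
The effect of the scale-down window can be sketched in a few lines: when scaling down, the highest recommendation seen inside the window wins, so a brief dip in load does not remove Pods immediately. This is a simplified model with an invented function name and invented sample data, not the controller's code.

```python
def stabilized_replicas(history, now, window_seconds=300):
    """Pick the replica count to apply when scaling down.

    history: list of (timestamp_seconds, recommended_replicas),
    including the most recent recommendation.
    """
    recent = [replicas for ts, replicas in history if now - ts <= window_seconds]
    return max(recent)

history = [(0, 10), (120, 6), (290, 4)]
print(stabilized_replicas(history, now=300))                     # 10
print(stabilized_replicas(history, now=300, window_seconds=60))  # 4
```

With the 300-second window the recommendation of 10 from t=0 still counts, so nothing is removed yet; with a 60-second window only the latest recommendation of 4 remains.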

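Policy types (policies)

Type | Description | Example
Pods | Scale by an absolute number of Pods | value: 4
Percent | Scale by a percentage of the current replicas | value: 10

Combining multiple policies

yaml
policies:
- type: Pods
  value: 4
  periodSeconds: 60  # add at most 4 Pods per minute
- type: Percent
  value: 100
  periodSeconds: 300  # add at most 100% every 5 minutes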
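
When several policies are listed, the default selectPolicy: Max applies whichever policy permits the largest change. The sketch below shows that selection for scale-up under a deliberate simplification: it ignores each policy's own periodSeconds history and just computes the ceiling each policy would allow from a given starting replica count. The function name and dictionary layout are illustrative only.

```python
import math

def scale_up_limit(start_replicas: int, policies, select: str = "Max") -> int:
    """Highest replica count the scale-up policies allow from start_replicas."""
    limits = []
    for policy in policies:
        if policy["type"] == "Pods":
            limits.append(start_replicas + policy["value"])
        else:  # Percent of the starting replica count
            limits.append(start_replicas + math.ceil(start_replicas * policy["value"] / 100))
    return max(limits) if select == "Max" else min(limits)

policies = [
    {"type": "Pods", "value": 4},
    {"type": "Percent", "value": 100},
]
print(scale_up_limit(10, policies))         # Percent wins: 20
print(scale_up_limit(3, policies))          # Pods wins: 7
print(scale_up_limit(10, policies, "Min"))  # most restrictive policy: 14
```

Starting from 10 replicas the Percent policy is the looser one, while from 3 replicas the Pods policy is; setting selectPolicy: Min would instead pick the stricter limit.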

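Status fields

Key status fields shown by kubectl describe hpa:

  1. Reference: the target resource
  2. Metrics: the current value and the target for each metric
  3. Min/Max Replicas: the replica bounds
  4. Replicas: the current replica count
  5. Conditions
    • AbleToScale: whether the HPA is able to scale the target
    • ScalingActive: whether metrics are available
    • ScalingLimited: whether a replica bound is limiting the result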
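
The same conditions are available in machine-readable form from kubectl get hpa <hpa-name> -o json. The sketch below summarizes them from such output; the JSON sample is a trimmed, hypothetical example written for this illustration, and `summarize_conditions` is an invented helper name.

```python
import json

def summarize_conditions(hpa: dict) -> dict:
    """Map each HPA condition type to a boolean taken from its status."""
    conditions = hpa.get("status", {}).get("conditions", [])
    return {c["type"]: c["status"] == "True" for c in conditions}

# Hypothetical, trimmed output of: kubectl get hpa my-app-hpa -o json
sample = json.loads("""
{
  "status": {
    "currentReplicas": 10,
    "desiredReplicas": 10,
    "conditions": [
      {"type": "AbleToScale", "status": "True", "reason": "ReadyForNewScale"},
      {"type": "ScalingActive", "status": "True", "reason": "ValidMetricFound"},
      {"type": "ScalingLimited", "status": "True", "reason": "TooManyReplicas"}
    ]
  }
}
""")

for name, ok in summarize_conditions(sample).items():
    print(f"{name}: {ok}")
```

In this sample ScalingLimited is true because the desired count has reached maxReplicas, which is usually the first thing to check when an HPA stops scaling out.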

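Special parameters

targetCPUUtilizationPercentage (autoscaling/v1)

The CPU-only target field of the v1 API. autoscaling/v1 is still served, but it cannot express memory, custom, or external metrics or the behavior field, so autoscaling/v2 is the better choice for new configurations.

yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
spec:
  targetCPUUtilizationPercentage: 70

Metric selector (selector)

A label selector that narrows a metric down to specific series

yaml
metric:
  name: http_requests
  selector:
    matchLabels:
      route: checkout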

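How the parameters interact

  1. Metric collection interval

    • Metrics Server scrapes every 15 seconds with the --metric-resolution=15s argument used in the manifest above
    • This bounds how quickly scaling can react
  2. Effect of the stabilization window

    mermaid
    graph LR
      A[Metric exceeds threshold] --> B{Inside stabilization window?}
      B -->|No| C[Scale]
      B -->|Yes| D[Wait for the window to end]
  3. Multi-metric decision logic

    • Compute a desired replica count for every metric
    • Use the largest of those counts as the final result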
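
The multi-metric rule can be written out as a short sketch: each metric proposes a replica count through the core formula, and the largest proposal is used. The function name and the sample values are invented for illustration, and tolerance and stabilization are left out to keep the focus on the max step.

```python
import math

def desired_for_all(current_replicas: int, metrics):
    """metrics: list of (current_value, target_value) pairs, one per metric.

    Returns the final replica count and the per-metric proposals.
    """
    proposals = [math.ceil(current_replicas * cur / tgt) for cur, tgt in metrics]
    return max(proposals), proposals

# 4 replicas; CPU at 80% against a 70% target, QPS at 150 against a target of 100.
final, proposals = desired_for_all(4, [(80, 70), (150, 100)])
print(proposals)  # [5, 6]
print(final)      # 6
```

Because the largest proposal wins, adding a metric can only make the HPA scale out earlier or scale in later, never the reverse.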

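Production tuning suggestions

  1. CPU-based scaling

    yaml
    # Fragment: averageUtilization goes under metrics[].resource.target, behavior under spec
    averageUtilization: 65
    behavior:
      scaleUp:
        stabilizationWindowSeconds: 30
        policies:
        - type: Percent
          value: 30
          periodSeconds: 60  # required by the API; 60 is an illustrative value
  2. Business-metric scaling

    yaml
    # Fragment: averageValue goes under metrics[].pods.target, behavior under spec
    averageValue: 50
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 600
        policies:
        - type: Pods
          value: 1
          periodSeconds: 60  # required by the API; 60 is an illustrative value
  3. Mixed-metric scaling

    yaml
    metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: External
      external:
        metric:
          name: kafka_lag
        target:
          type: AverageValue
          averageValue: 100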

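Kubernetes HPA configuration examples by metric type

The following are complete HPA configuration examples for every supported metric type (Resource, Pods, Object, External), including behavior control and multi-metric combinations.

Basic HPA configuration template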

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: <hpa-name>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: <Deployment|StatefulSet>
    name: <workload-name>
  minReplicas: <minimum-replicas>
  maxReplicas: <maximum-replicas>
  behavior:  # optional, controls scaling behavior
    scaleDown|scaleUp:
      stabilizationWindowSeconds: <seconds>
      policies:
      - type: <Pods|Percent>
        value: <number>
        periodSeconds: <seconds>
  metrics:  # metric configuration
  - type: <Resource|Pods|Object|External>
    <metric-type-specific configuration>

Complete examples for each metric type

Resource metrics (CPU/memory)

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu  # or memory
      target:
        type: Utilization  # or AverageValue
        averageUtilization: 70  # percentage, used when type is Utilization
        # averageValue: 500m  # absolute value, used when type is AverageValue

Pods metrics (custom metrics)

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: orders_processed_per_second  # custom metric name
      target:
        type: AverageValue  # must be AverageValue
        averageValue: 50  # target average per Pod

Object metrics (tied to a Kubernetes object)

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ingress-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ingress-nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: main-ingress
      metric:
        name: requests_per_second
      target:
        type: Value  # or AverageValue
        value: 1000  # target value

External metrics (external systems)

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payment-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payment-service
  minReplicas: 2
  maxReplicas: 15
  metrics:
  - type: External
    external:
      metric:
        name: sqs_queue_length  # external metric name
        selector:
          matchLabels:
            queue: payment-queue
      target:
        type: AverageValue  # or Value
        averageValue: 30  # each Pod handles about 30 queued messages
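
With an AverageValue target on an external metric, the total metric value is divided across the Pods, so the desired count is roughly the queue length divided by the per-Pod target, clamped to the replica bounds. The sketch below reuses the numbers from the payment-service example (30 messages per Pod, 2 to 15 replicas); the function name is invented, and tolerance and stabilization are ignored.

```python
import math

def replicas_for_queue(queue_length: int, per_pod_target: int = 30,
                       min_replicas: int = 2, max_replicas: int = 15) -> int:
    """Replica count for a queue backlog under an AverageValue target."""
    wanted = math.ceil(queue_length / per_pod_target)
    # Clamp to the minReplicas/maxReplicas bounds of the HPA.
    return max(min_replicas, min(max_replicas, wanted))

for backlog in (10, 300, 3000):
    print(backlog, replicas_for_queue(backlog))
```

A backlog of 300 messages asks for 10 Pods, while very small or very large backlogs are held at the bounds of 2 and 15.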

Multi-metric configuration example

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ecommerce-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ecommerce-backend
  minReplicas: 4
  maxReplicas: 30
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Pods
        value: 4
        periodSeconds: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 65
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 100
  - type: Object
    object:
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: store-ingress
      metric:
        name: latency_ms
      target:
        type: Value
        value: 200
  - type: External
    external:
      metric:
        name: redis_connections
      target:
        type: AverageValue
        averageValue: 50

Special-case configuration examples

Scaling on several external metrics

yaml
metrics:
- type: External
  external:
    metric:
      name: sqs_queue_length
      selector:
        matchLabels:
          queue: order-queue
    target:
      type: AverageValue
      averageValue: 20
- type: External
  external:
    metric:
      name: dynamodb_throttled_requests
    target:
      type: Value
      value: 0  # goal: no throttled requests

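Scaling on GPU utilization

The Resource metric type only covers cpu and memory, so GPU utilization has to be exposed as a custom metric through an adapter. The metric name below is an assumption: it presumes the NVIDIA DCGM exporter plus an adapter rule that publishes the value per Pod.

yaml
metrics:
- type: Pods
  pods:
    metric:
      name: DCGM_FI_DEV_GPU_UTIL  # assumed name; depends on your exporter and adapter rules
    target:
      type: AverageValue
      averageValue: "70"  # target average GPU utilization (percent)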

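Verifying the HPA configuration

After creating the HPA, verify it with the following commands:

bash
# Show HPA status
kubectl get hpa

# Show the detailed description
kubectl describe hpa <hpa-name>

# Show HPA-related events
kubectl get events --field-selector involvedObject.kind=HorizontalPodAutoscaler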

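Important notes

  1. Resource requests must be set: HPA needs the Pods' resources.requests to compute resource utilization
  2. Metric availability: make sure Metrics Server or the custom metrics adapter is running normally
  3. Stabilization: set stabilizationWindowSeconds sensibly to avoid flapping
  4. Minimum replicas: use at least 2 replicas in production for high availability
  5. Metric latency: different metric sources have different delays, which must stay within what the workload can tolerate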

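HPA worked example

HPA with a Deployment: CPU/memory test

Prepare the images

bash
docker pull alpine
docker pull php:8.2-apache

Create the PHP code

Create the PHP file on the worker nodes, under /data/php, which is the hostPath directory the Deployment below mounts as the web root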

php
<?php
class LoadTester {
    public function handleRequest() {
        header("Content-Type: text/plain");
        
        $action = $_GET['action'] ?? 'basic';
        $params = $_GET;
        
        switch ($action) {
            case 'basic':
                $this->basicTest();
                break;
            case 'cpu':
                $number = $params['number'] ?? 30;
                $this->cpuStressTest((int)$number);
                break;
            case 'memory':
                $this->memoryStressTest();
                break;
            case 'complex':
                $this->complexTest();
                break;
            case 'dynamic':
                $level = $params['level'] ?? 5;
                $this->dynamicLoadTest((int)$level);
                break;
            default:
                echo "Available actions:\n";
                echo "- basic (default)\n";
                echo "- cpu?number=30\n";
                echo "- memory\n";
                echo "- complex\n";
                echo "- dynamic?level=5\n";
        }
    }

    private function basicTest() {
        echo "Hello, this is a lightweight PHP response!\n";
    }

    // Naive recursion is intentional: it burns CPU for the load test
    private function fibonacci($n) {
        if ($n <= 1) return $n;
        return $this->fibonacci($n - 1) + $this->fibonacci($n - 2);
    }

    private function cpuStressTest($number) {
        $start_time = microtime(true);
        $result = $this->fibonacci($number);
        $end_time = microtime(true);

        echo "Result: $result\n";
        echo "Time taken: " . round($end_time - $start_time, 4) . " seconds\n";
    }

    private function memoryStressTest() {
        $memory = str_repeat("A", 100 * 1024 * 1024);
        echo "Allocated 100MB memory\n";
        sleep(1);
        echo "Done.\n";
    }

    // Simulates a database query that takes 100 ms
    private function mockDbQuery() {
        usleep(100000);
        return rand(1, 100);
    }

    private function complexTest() {
        $data = [];
        for ($i = 0; $i < 10; $i++) {
            $data[] = $this->mockDbQuery();
        }

        echo "Simulated database results: " . implode(", ", $data) . "\n";
        echo "Peak memory usage: " . round(memory_get_peak_usage() / 1024 / 1024, 2) . "MB\n";
    }

    private function dynamicLoadTest($level) {
        $level = max(1, min(10, $level));
        $iterations = $level * 1000000;
        $sum = 0;
        
        for ($i = 0; $i < $iterations; $i++) {
            $sum += sqrt($i);
        }

        echo "Load level: $level\n";
        echo "Calculated sum: " . round($sum, 2) . "\n";
    }
}

// Entry point: dispatch the request
$tester = new LoadTester();
$tester->handleRequest();
?>

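Usage

  1. Save the code above as LoadTester.php
  2. Select a test through URL parameters:

Test type | Example URL | Optional parameters
Basic test | /LoadTester.php?action=basic |
CPU stress test | /LoadTester.php?action=cpu&number=30 | number (default 30)
Memory stress test | /LoadTester.php?action=memory |
Combined test | /LoadTester.php?action=complex |
Dynamic load test | /LoadTester.php?action=dynamic&level=5 | level (1-10, default 5)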

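Create the php-apache Deployment

yaml
# ====================== Namespace ======================
apiVersion: v1
kind: Namespace
metadata:
  name: php-apache
  # Namespaces isolate resources logically; one namespace per application is a reasonable default
  # Names must follow DNS label rules (lowercase letters, digits, hyphens)
---
# ====================== Horizontal Pod Autoscaler (HPA) ======================
apiVersion: autoscaling/v2  # v2 API (multiple metrics and flexible scaling behavior)
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: php-apache
spec:
  scaleTargetRef:  # the resource to scale
    apiVersion: apps/v1
    kind: Deployment  # Deployment, StatefulSet, and similar controllers are supported
    name: php-apache  # must match the name of the target Deployment
  minReplicas: 1     # minimum replicas (use at least 2 in production for high availability)
  maxReplicas: 10    # maximum replicas (set from cluster capacity and business needs)
  metrics:
  - type: Resource    # resource metric (Pods/External/Object are the other types)
    resource:
      name: cpu       # scale on CPU (memory is the other Resource metric)
      target:
        type: Utilization  # utilization relative to the Pods' requests
        averageUtilization: 50  # target CPU utilization (percent)
        # Absolute mode (with type: AverageValue):
        # averageValue: 500m  # target value given directly (e.g. 500 millicores)
---
# ====================== Deployment ======================
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
  namespace: php-apache
spec:
  selector:  # must match the labels in the Pod template
    matchLabels:
      app: php-apache
  replicas: 1  # initial replica count (HPA adjusts it afterwards)
  template:
    metadata:
      namespace: php-apache
      labels:
        app: php-apache  # must match the Deployment and Service selectors
    spec:
      volumes: 
        - name: php-path
          hostPath:      # node-local path (prefer a PVC in production)
            path: /data/php  # directory on the node holding LoadTester.php
            type: DirectoryOrCreate  # created automatically if missing
      containers:
      - name: php-apache
        image: php:8.2-apache  # official PHP image with Apache
        imagePullPolicy: IfNotPresent  # use the local image when present
        ports:
        - containerPort: 80  # container port (the Service targetPort must match)
        volumeMounts:
          - name: php-path
            mountPath: /var/www/html/  # Apache default document root
            readOnly: false  # read-write is the default (declare readOnly: true if needed)
        resources:  # requests and limits (HPA computes utilization from requests)
          limits:   # maximum resources the container may use
            cpu: 500m  # 500 millicores (0.5 CPU core)
            memory: "512Mi"  # 512 mebibytes
          requests: # resources guaranteed at scheduling time
            cpu: 200m  
            memory: "256Mi"
        livenessProbe:  # liveness probe (the container is restarted on failure)
          httpGet:
            path: /LoadTester.php  # lightweight default action; the mounted directory has no index page for /
            port: 80
          initialDelaySeconds: 30  # first check 30 seconds after the container starts
          periodSeconds: 10        # check every 10 seconds
        readinessProbe:  # readiness probe (the Pod is removed from the Service on failure)
          httpGet:
            path: /LoadTester.php
            port: 80
          initialDelaySeconds: 5  # shorter initial delay than the liveness probe
          periodSeconds: 5        # checked more frequently
---
# ====================== Service ======================
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  namespace: php-apache
  labels:
    app: php-apache  # used for service discovery and monitoring
spec:
  ports:
  - port: 80        # port exposed by the Service
    targetPort: 80  # must match the container port
    protocol: TCP   # TCP is the default (may be omitted)
    # For a NodePort Service, add:
    # nodePort: 30080  # fixed node port (range 30000-32767)
  selector:
    app: php-apache  # must match the Pod labels
  # Defaults to type ClusterIP (reachable inside the cluster)
  # For external access, change to:
  # type: LoadBalancer  # the cloud provider provisions a load balancer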
Deploy php-apache

bash
kubectl apply -f php-apache.yaml

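Run the load test

bash
# From a Pod in the same namespace, using the short Service name
# (the URL is quoted so the shell does not treat & as a background operator)
kubectl run -it --rm ab-test \
--image=alpine \
-n php-apache \
--restart=Never \
-- sh -c "apk add --no-cache apache2-utils && ab -n 300 -c 50 'http://php-apache/LoadTester.php?action=cpu&number=30'"

# Using the fully qualified Service DNS name
kubectl run -it --rm ab-test --image=alpine -n php-apache --restart=Never -- sh -c "apk add apache2-utils --no-cache && ab -n 1000 -c 30 'http://php-apache.php-apache.svc.cluster.local/LoadTester.php?action=cpu&number=30'"

ab parameters

  • -n: total number of requests (300 and 1000 in the commands above)
  • -c: number of concurrent connections (50 and 30 in the commands above)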

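Suggested tests

Test type | Example URL | Optional parameters
Basic test | /LoadTester.php?action=basic |
CPU stress test | /LoadTester.php?action=cpu&number=30 | number (default 30)
Memory stress test | /LoadTester.php?action=memory |
Combined test | /LoadTester.php?action=complex |
Dynamic load test | /LoadTester.php?action=dynamic&level=5 | level (1-10, default 5)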

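Monitoring the result

bash
# Watch the HPA
watch -n 1 kubectl get hpa -n php-apache

# Watch the Pods (selected by label, since Pod names carry generated suffixes)
watch -n 1 kubectl get pod -l app=php-apache -n php-apache

# Watch node resource usage
watch -n 1 kubectl top node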