Cloud-Native Deployment Options for Redis

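In the cloud-native era, the way Redis is deployed and operated is changing as well. How do you run a highly available Redis cluster on Kubernetes? This article walks through the cloud-native deployment options end to end, from plain Docker to Operators and managed cloud services.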

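1. Cloud-Native Architecture

1.1 Advantages of Cloud-Native

Advantages of cloud-native Redis:

┌─────────────────────────────────────────────────────────┐
│ Elastic scaling     │ Automatic scale-up and scale-down │
│ High availability   │ Automatic failure recovery        │
│ Easy operations     │ Declarative configuration         │
│ Resource efficiency │ Container-level isolation         │
│ Observability       │ Unified monitoring and logging    │
└─────────────────────────────────────────────────────────┘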

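1.2 Comparing Deployment Options

Option          | Complexity | Automation      | Typical use
Docker          |            | Manual          | Development and testing
Docker Compose  |            | Semi-automated  | Small-scale deployments
Kubernetes      |            | Automated       | Production
Operator        |            | Fully automated | Large-scale deployments
Managed service | Lowest     | Fully automated | Fast time to launch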

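2. Docker Deployment

2.1 Single-Node Deployment

# Use the official image. It does not read a REDIS_PASSWORD variable,
# so the password is set with --requirepass instead.
docker run -d \
  --name redis \
  -p 6379:6379 \
  -v /data/redis:/data \
  redis:7.0 \
  redis-server --appendonly yes --requirepass my_password

# Verify (redis-cli warns that -a exposes the password on the command line)
docker exec -it redis redis-cli -a my_password ping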
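
To make the verification step more concrete, the sketch below shows roughly what an authenticated ping looks like on the wire: an AUTH command followed by PING, each encoded as a RESP array of bulk strings. The function names `encode_command` and `ping` are illustrative, and the host, port, and password defaults simply assume the container started above. It is a minimal sketch, not a replacement for a real client library.

```python
import socket


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    chunks = [f"*{len(parts)}\r\n".encode("ascii")]
    for part in parts:
        data = part.encode("utf-8")
        chunks.append(b"$" + str(len(data)).encode("ascii") + b"\r\n" + data + b"\r\n")
    return b"".join(chunks)


def ping(host: str = "127.0.0.1", port: int = 6379,
         password: str = "my_password", timeout: float = 2.0) -> bool:
    """Return True if the server accepts AUTH and answers PING with PONG."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_command("AUTH", password))
        if not sock.recv(1024).startswith(b"+OK"):
            return False
        sock.sendall(encode_command("PING"))
        return sock.recv(1024).startswith(b"+PONG")
```

Calling `ping()` against the running container should return True when the password matches; the encoder itself can be exercised without a server.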

2.2 Docker Compose Deployment

# docker-compose.yml
version: '3.8'

services:
  redis-master:
    image: redis:7.0
    container_name: redis-master
    ports:
      - "6379:6379"
    volumes:
      - ./redis-master/data:/data
      - ./redis-master/redis.conf:/usr/local/etc/redis/redis.conf
    command: redis-server /usr/local/etc/redis/redis.conf
    environment:
      - REDIS_PASSWORD=${REDIS_PASSWORD}
    networks:
      - redis-network
    restart: always
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3

  redis-slave:
    image: redis:7.0
    container_name: redis-slave
    ports:
      - "6380:6379"
    volumes:
      - ./redis-slave/data:/data
      - ./redis-slave/redis.conf:/usr/local/etc/redis/redis.conf
    command: redis-server /usr/local/etc/redis/redis.conf
    environment:
      - REDIS_PASSWORD=${REDIS_PASSWORD}
    depends_on:
      - redis-master
    networks:
      - redis-network
    restart: always

  redis-sentinel:
    image: redis:7.0
    container_name: redis-sentinel
    ports:
      - "26379:26379"
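    volumes:
      # Mount the directory rather than the single file: Sentinel rewrites
      # sentinel.conf at runtime and needs a writable location to do so.
      - ./redis-sentinel:/usr/local/etc/redis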
    command: redis-sentinel /usr/local/etc/redis/sentinel.conf
    depends_on:
      - redis-master
    networks:
      - redis-network
    restart: always

networks:
  redis-network:
    driver: bridge

Redis configuration

# redis-master/redis.conf
bind 0.0.0.0
port 6379
requirepass my_password
masterauth my_password
appendonly yes
appendfilename "appendonly.aof"
dir /data

# redis-slave/redis.conf
bind 0.0.0.0
port 6379
requirepass my_password
masterauth my_password
replicaof redis-master 6379
appendonly yes

# redis-sentinel/sentinel.conf
port 26379
# Redis 6.2+ only accepts a hostname in "sentinel monitor" when hostname
# resolution is enabled.
sentinel resolve-hostnames yes
# A quorum of 1 matches the single Sentinel in this setup; production setups
# normally run at least three Sentinels.
sentinel monitor mymaster redis-master 6379 1
sentinel auth-pass mymaster my_password
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1
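
The last argument of `sentinel monitor` is the quorum. As a rough illustration of how it interacts with the number of Sentinels, the sketch below checks whether a failover can go ahead: the quorum is needed to agree that the master is down, and a majority of all Sentinels is needed to authorize the failover. The function name `can_failover` and its parameters are hypothetical and only model this rule, not Sentinel's full protocol.

```python
def can_failover(total_sentinels: int, quorum: int, reachable: int) -> bool:
    """Return True if the reachable Sentinels can detect and perform a failover.

    Detection needs `quorum` Sentinels to agree that the master is down;
    the failover itself needs a leader elected by a majority of all Sentinels.
    """
    majority = total_sentinels // 2 + 1
    return reachable >= quorum and reachable >= majority
```

With the single Sentinel configured here, `can_failover(1, 1, 1)` holds, but losing that one Sentinel leaves nobody to fail over, which is why three or more Sentinels are the usual choice.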

Startup

# Start
docker-compose up -d

# Check status
docker-compose ps

# Follow logs
docker-compose logs -f redis-master

# Stop
docker-compose down

3. Kubernetes Deployment

3.1 StatefulSet Deployment

# redis-statefulset.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
data:
  redis.conf: |
    # ConfigMap contents are not variable-expanded, so the password is passed
    # on the redis-server command line in the StatefulSet below.
    bind 0.0.0.0
    port 6379
    appendonly yes
    appendfilename "appendonly.aof"
    dir /data
    cluster-enabled yes
    cluster-config-file nodes.conf
    cluster-node-timeout 5000
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
      - name: redis
        image: redis:7.0
        # Kubernetes expands $(REDIS_PASSWORD) in command and args
        # from the container's env.
        command:
          - redis-server
          - /etc/redis/redis.conf
          - --requirepass
          - $(REDIS_PASSWORD)
          - --masterauth
          - $(REDIS_PASSWORD)
        ports:
        - containerPort: 6379
          name: redis
        - containerPort: 16379
          name: bus
        env:
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: redis-secret
              key: password
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis
        - name: redis-data
          mountPath: /data
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
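        # Probe commands are not variable-expanded, so a shell reads the
        # password from the environment; redis-cli picks it up from REDISCLI_AUTH.
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - REDISCLI_AUTH="$REDIS_PASSWORD" redis-cli ping
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - REDISCLI_AUTH="$REDIS_PASSWORD" redis-cli ping
          initialDelaySeconds: 10
          periodSeconds: 5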
      volumes:
      - name: redis-config
        configMap:
          name: redis-config
  volumeClaimTemplates:
  - metadata:
      name: redis-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
      storageClassName: standard
---
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
spec:
  clusterIP: None
  ports:
  - port: 6379
    targetPort: 6379
    name: redis
  - port: 16379
    targetPort: 16379
    name: bus
  selector:
    app: redis-cluster
---
apiVersion: v1
kind: Secret
metadata:
  name: redis-secret
type: Opaque
data:
  password: bXlfcGFzc3dvcmQ=  # base64 encoded "my_password"
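
Values under `data` in a Secret are base64-encoded, not encrypted. The sketch below shows how the value in the manifest above can be produced and read back; the helper names `encode_secret_value` and `decode_secret_value` are illustrative only.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Base64-encode a value for the `data` field of a Kubernetes Secret."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Decode a value read from the `data` field of a Kubernetes Secret."""
    return base64.b64decode(encoded).decode("utf-8")
```

Because decoding is this easy, anyone who can read the manifest can read the password, so a file like this is usually kept out of version control.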

Deployment commands

# Apply the manifests
kubectl apply -f redis-statefulset.yaml

# Check Pod status
kubectl get pods -l app=redis-cluster

# View logs
kubectl logs redis-cluster-0

# Open a redis-cli session in the Pod
kubectl exec -it redis-cluster-0 -- redis-cli -a my_password

# Create the cluster: 6 nodes with 1 replica per master gives 3 masters and 3 replicas.
# If your redis-cli version rejects hostnames here, pass the Pod IPs instead.
kubectl exec -it redis-cluster-0 -- \
  redis-cli --cluster create \
  redis-cluster-0.redis-cluster:6379 \
  redis-cluster-1.redis-cluster:6379 \
  redis-cluster-2.redis-cluster:6379 \
  redis-cluster-3.redis-cluster:6379 \
  redis-cluster-4.redis-cluster:6379 \
  redis-cluster-5.redis-cluster:6379 \
  --cluster-replicas 1 \
  --cluster-yes \
  -a my_password

# Check the cluster state
kubectl exec -it redis-cluster-0 -- \
  redis-cli -a my_password cluster info
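
Once the cluster exists, its 16384 hash slots are divided among the masters, and every key maps to exactly one slot. As a sketch of that mapping, the function below applies CRC16 (XMODEM variant) modulo 16384, including the hash-tag rule where only the text between the first `{` and the following `}` is hashed. The name `key_slot` is illustrative, and using `binascii.crc_hqx` with an initial value of 0 as the CRC16 implementation is this sketch's choice.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for a key."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the braces replaces the full key.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, land in the same slot, which is what makes multi-key operations on them possible in a cluster.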

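3.2 Deploying with Helm

# Add the Bitnami repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Value names differ between chart versions; confirm them with:
#   helm show values bitnami/redis-cluster

# Install Redis Cluster
helm install redis bitnami/redis-cluster \
  --set password=my_password \
  --set cluster.nodes=6 \
  --set cluster.replicas=1 \
  --set persistence.size=10Gi \
  --set metrics.enabled=true \
  --set metrics.serviceMonitor.enabled=true

# Check status
helm status redis

# Scale out from 6 to 9 nodes. The password must be passed again, and the
# chart needs to know the current topology to add nodes.
helm upgrade redis bitnami/redis-cluster \
  --set password=my_password \
  --set cluster.nodes=9 \
  --set cluster.replicas=2 \
  --set cluster.update.addNodes=true \
  --set cluster.update.currentNumberOfNodes=6 \
  --set cluster.update.currentNumberOfReplicas=1

# Uninstall
helm uninstall redis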
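
The node and replica counts are related: each master plus its replicas forms one group, and cluster creation requires at least three masters. The sketch below mirrors that arithmetic as `redis-cli --cluster create` applies it; the function name `master_count` is hypothetical.

```python
def master_count(nodes: int, replicas_per_master: int) -> int:
    """Return how many masters a cluster of `nodes` nodes will have.

    Raises ValueError when fewer than three masters would result, which is
    the minimum that cluster creation accepts.
    """
    if nodes < 1 or replicas_per_master < 0:
        raise ValueError("nodes must be positive and replicas non-negative")
    masters = nodes // (replicas_per_master + 1)
    if masters < 3:
        raise ValueError(
            f"{nodes} nodes with {replicas_per_master} replica(s) per master "
            f"gives only {masters} master(s); at least 3 are required"
        )
    return masters
```

Both settings used above work out to three masters: 6 nodes with 1 replica each, and 9 nodes with 2 replicas each.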

values.yaml configuration

# custom-values.yaml
# Key names follow recent bitnami/redis-cluster charts; confirm them with
# `helm show values bitnami/redis-cluster` for the chart version in use.
password: my_password

cluster:
  nodes: 6
  replicas: 1
  init: true
  update:
    addNodes: false
    currentNumberOfNodes: 6
    currentNumberOfReplicas: 1

persistence:
  enabled: true
  size: 10Gi
  storageClass: standard

metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    namespace: monitoring
    additionalLabels:
      release: prometheus

podSecurityPolicy:
  enabled: false

rbac:
  create: true

serviceAccount:
  create: true
  name: redis-cluster

4. Redis Operator

4.1 Installing the Operator

# Uses the Spotahome Redis Operator (https://github.com/spotahome/redis-operator)

# Install the CRD
kubectl apply -f https://raw.githubusercontent.com/spotahome/redis-operator/master/example/redisfailover/00-operator.yaml

# Verify the installation
kubectl get pods -n redis-operator

4.2 Creating a RedisFailover

# redis-failover.yaml
apiVersion: databases.spotahome.com/v1
kind: RedisFailover
metadata:
  name: my-redis
spec:
  auth:
    secretPath: redis-secret
  
  sentinel:
    replicas: 3
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
      limits:
        cpu: 200m
        memory: 200Mi
    
  redis:
    replicas: 3
    resources:
      requests:
        cpu: 250m
        memory: 500Mi
      limits:
        cpu: 500m
        memory: 1Gi
    persistentVolumeClaim:
      metadata:
        name: redis-pvc
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: standard
    
  exportedMetrics:
    enabled: true
    serviceMonitor:
      enabled: true
      additionalLabels:
        release: prometheus

Deployment

# Create the Secret
kubectl create secret generic redis-secret \
  --from-literal=password=my_password

# Apply the manifest
kubectl apply -f redis-failover.yaml

# Check status
kubectl get redisfailover

# List the Pods
kubectl get pods -l app=rf-my-redis

5. Monitoring and Alerting

5.1 Prometheus Monitoring

# prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: redis-rules
  labels:
    release: prometheus
spec:
  groups:
  - name: redis.rules
    rules:
    - alert: RedisDown
      expr: redis_up == 0
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: "Redis instance down"
        description: "Redis instance {{ $labels.instance }} has been down for more than 1 minute"
    
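    # The max_bytes guard skips instances with no maxmemory limit (reported as 0),
    # where the ratio would otherwise be +Inf and the alert would always fire.
    - alert: RedisMemoryHigh
      expr: (redis_memory_used_bytes / redis_memory_max_bytes > 0.9) and (redis_memory_max_bytes > 0)
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Redis memory usage is high"
        description: "Redis instance {{ $labels.instance }} is using more than 90% of its memory limit"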
    
    - alert: RedisConnectedClientsHigh
      expr: redis_connected_clients > 1000
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Redis client connections are high"
        description: "Redis instance {{ $labels.instance }} has more than 1000 connected clients"
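
The memory rule is a simple ratio check, and it helps to see its edge case spelled out. The sketch below evaluates the same condition for one instance, treating a limit of 0 (no `maxmemory` configured) as "no ratio available" rather than dividing by it. The function names and the 0.9 default are illustrative and only mirror the rule above; they are not part of Prometheus.

```python
from typing import Optional


def memory_usage_ratio(used_bytes: float, max_bytes: float) -> Optional[float]:
    """Return used/max, or None when no memory limit is configured."""
    if max_bytes <= 0:
        return None
    return used_bytes / max_bytes


def memory_alert_condition(used_bytes: float, max_bytes: float,
                           threshold: float = 0.9) -> bool:
    """Return True when usage exceeds the threshold of a configured limit."""
    ratio = memory_usage_ratio(used_bytes, max_bytes)
    return ratio is not None and ratio > threshold
```

In Prometheus the condition additionally has to hold for the whole `for: 5m` window before the alert fires; this sketch only covers a single evaluation.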

5.2 Grafana Dashboard

{
  "dashboard": {
    "title": "Redis Dashboard",
    "panels": [
      {
        "title": "Memory Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "redis_memory_used_bytes",
            "legendFormat": "{{ instance }}"
          }
        ]
      },
      {
        "title": "Connected Clients",
        "type": "graph",
        "targets": [
          {
            "expr": "redis_connected_clients",
            "legendFormat": "{{ instance }}"
          }
        ]
      },
      {
        "title": "Operations per second",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(redis_commands_processed_total[1m])",
            "legendFormat": "{{ instance }}"
          }
        ]
      }
    ]
  }
}
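
The "Operations per second" panel relies on `rate()` over a counter. As a simplified illustration of what that means, the sketch below computes the per-second increase across a window of samples and treats any drop in the value as a counter reset, for example after a Redis restart. It deliberately omits the extrapolation that Prometheus applies at the window edges, and the name `simple_rate` is hypothetical.

```python
from typing import Sequence, Tuple


def simple_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Per-second increase of a counter over (timestamp_seconds, value) samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        # A lower value means the counter restarted from zero.
        increase += current - previous if current >= previous else current
    return increase / elapsed
```

For two samples one minute apart, 100 and then 700 processed commands, the result is 10 commands per second.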

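6. Comparing Managed Cloud Services

6.1 Major Cloud Providers

Provider      | Product            | Highlights
Alibaba Cloud | ApsaraDB for Redis | High availability, read/write splitting
Tencent Cloud | Redis              | Cluster and primary-replica editions
AWS           | ElastiCache        | Supports Redis and Memcached
Azure         | Cache for Redis    | Enterprise-grade SLA
GCP           | Memorystore        | Fully managed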

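6.2 Selection Guidelines

Factors to consider when choosing a cloud service:

1. Performance requirements
   - QPS
   - Latency
   - Bandwidth

2. Availability requirements
   - SLA guarantees
   - Recovery time after a failure
   - Data durability

3. Cost
   - Instance fees
   - Traffic fees
   - Storage fees

4. Operations
   - Automated backups
   - Monitoring and alerting
   - Elastic scaling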

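7. Summary

7.1 Choosing a Deployment Option

Deployment decision tree:

1. Do you need automated operations?
   ├── No  → Docker/Docker Compose
   └── Yes → 2

2. Are you running Kubernetes?
   ├── No  → Managed cloud service
   └── Yes → 3

3. Do you need automatic failure recovery?
   ├── No  → StatefulSet
   └── Yes → Redis Operator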
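
The decision tree above can be written as a small function, which makes the order of the questions explicit. The function and parameter names are illustrative; the returned labels are the four outcomes of the tree.

```python
def choose_deployment(needs_automation: bool,
                      uses_kubernetes: bool,
                      needs_auto_failover: bool) -> str:
    """Walk the three questions of the decision tree in order."""
    if not needs_automation:
        return "Docker/Docker Compose"
    if not uses_kubernetes:
        return "Managed cloud service"
    if not needs_auto_failover:
        return "StatefulSet"
    return "Redis Operator"
```

Later questions only matter once the earlier ones are answered with yes, so a team without automation needs ends at Docker regardless of the other two answers.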

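7.2 Best Practices

1. Containerization
   - Use the official image
   - Configure health checks
   - Set resource limits

2. Persistence
   - Store data on PVCs
   - Enable AOF persistence
   - Back up regularly

3. Monitoring
   - Deploy Prometheus
   - Configure alerting rules
   - Visualize with Grafana

4. Security
   - Manage passwords with Secrets
   - Isolate the network
   - Enforce RBAC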
