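Cloud-Native Deployment Options for Redis
In the cloud-native era, the way Redis is deployed and operated keeps evolving. How do you run a highly available Redis cluster on Kubernetes? This article walks through a complete cloud-native deployment approach.
1. Cloud-Native Architecture
1.1 Advantages of Cloud-Native Redis
What a cloud-native Redis deployment offers: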
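| Advantage | How it is achieved |
|---|---|
| Elastic scaling | Automatic scale-out and scale-in |
| High availability | Automatic failure recovery |
| Easy operations | Declarative configuration |
| Resource efficiency | Container-level isolation |
| Observability | Unified monitoring and logging |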
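1.2 Deployment Options Compared
| Option | Complexity | Automation | Best suited for |
|---|---|---|---|
| Docker | Low | Manual | Development and testing |
| Docker Compose | Low | Semi-automatic | Small deployments |
| Kubernetes | Medium | Automatic | Production |
| Operator | High | Fully automatic | Large-scale deployments |
| Managed cloud service | Lowest | Fully automatic | Getting to production quickly |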
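2. Docker Deployment
2.1 Single-Node Deployment
# Run the official image with AOF persistence and a password.
# The official image does not read a REDIS_PASSWORD variable, so the
# password is passed to redis-server directly via --requirepass.
docker run -d \
  --name redis \
  -p 6379:6379 \
  -v /data/redis:/data \
  redis:7.0 \
  redis-server --appendonly yes --requirepass my_password
# Verify (a healthy server replies PONG)
docker exec -it redis redis-cli -a my_password ping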
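To make the verification step concrete, the sketch below shows roughly what `redis-cli -a my_password ping` puts on the wire: an `AUTH` command followed by a `PING`, each encoded as a RESP array of bulk strings. The helper names `encode_command` and `ping` are hypothetical, and the host, port, and password simply mirror the values used above.

```python
import socket
from typing import Optional


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)


def ping(host: str = "127.0.0.1", port: int = 6379,
         password: Optional[str] = None) -> bytes:
    """Send AUTH (if a password is given) and PING; return the raw reply."""
    with socket.create_connection((host, port), timeout=2) as conn:
        if password is not None:
            conn.sendall(encode_command("AUTH", password))
            conn.recv(1024)  # a successful AUTH replies with +OK
        conn.sendall(encode_command("PING"))
        return conn.recv(1024)  # a healthy server replies with +PONG
```

Calling `ping(password="my_password")` against the container started above should return the raw `+PONG` reply; the encoder itself can be exercised without a running server.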
2.2 Docker Compose Deployment
# docker-compose.yml
version: '3.8'
services:
  redis-master:
    image: redis:7.0
    container_name: redis-master
    ports:
      - "6379:6379"
    volumes:
      - ./redis-master/data:/data
      - ./redis-master/redis.conf:/usr/local/etc/redis/redis.conf
    command: redis-server /usr/local/etc/redis/redis.conf
    environment:
      - REDIS_PASSWORD=${REDIS_PASSWORD}
    networks:
      - redis-network
    restart: always
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
  redis-slave:
    image: redis:7.0
    container_name: redis-slave
    ports:
      - "6380:6379"
    volumes:
      - ./redis-slave/data:/data
      - ./redis-slave/redis.conf:/usr/local/etc/redis/redis.conf
    command: redis-server /usr/local/etc/redis/redis.conf
    environment:
      - REDIS_PASSWORD=${REDIS_PASSWORD}
    depends_on:
      - redis-master
    networks:
      - redis-network
    restart: always
  redis-sentinel:
    image: redis:7.0
    container_name: redis-sentinel
    ports:
      - "26379:26379"
    volumes:
      - ./redis-sentinel/sentinel.conf:/usr/local/etc/redis/sentinel.conf
    command: redis-sentinel /usr/local/etc/redis/sentinel.conf
    depends_on:
      - redis-master
    networks:
      - redis-network
    restart: always
networks:
  redis-network:
    driver: bridge
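Redis configuration files (the password set here must match the REDIS_PASSWORD value used by the Compose health check):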
# redis-master/redis.conf
bind 0.0.0.0
port 6379
requirepass my_password
masterauth my_password
appendonly yes
appendfilename "appendonly.aof"
dir /data
# redis-slave/redis.conf
bind 0.0.0.0
port 6379
requirepass my_password
masterauth my_password
replicaof redis-master 6379
appendonly yes
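# redis-sentinel/sentinel.conf
port 26379
# Needed (Redis 6.2+) so that Sentinel accepts the hostname redis-master below
sentinel resolve-hostnames yes
# A single Sentinel with quorum 1 is for demonstration only;
# run at least three Sentinels in production.
sentinel monitor mymaster redis-master 6379 1
sentinel auth-pass mymaster my_password
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1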
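A quorum of 1 is only workable here because there is a single Sentinel. As a rough illustration of why production setups run three or more, the sketch below models the two conditions Sentinel needs before it fails over: enough Sentinels agreeing that the master is down (the quorum), and a majority of all Sentinels reachable to authorize the failover. The function name `can_fail_over` is hypothetical, and the model ignores timing and election details.

```python
def can_fail_over(total_sentinels: int, reachable: int, quorum: int) -> bool:
    """Return True if a failover could proceed in this simplified model.

    total_sentinels: Sentinels configured to monitor the master
    reachable:       Sentinels that are alive and can talk to each other
    quorum:          the last argument of `sentinel monitor`
    """
    majority = total_sentinels // 2 + 1
    detects_failure = reachable >= quorum      # master can be marked objectively down
    can_elect_leader = reachable >= majority   # a leader can be authorized to fail over
    return detects_failure and can_elect_leader


if __name__ == "__main__":
    # One Sentinel, quorum 1: works only while that Sentinel itself is alive.
    print(can_fail_over(total_sentinels=1, reachable=1, quorum=1))
    # Three Sentinels, quorum 2: still works after losing one Sentinel.
    print(can_fail_over(total_sentinels=3, reachable=2, quorum=2))
    # Three Sentinels, only one left: no majority, so no failover.
    print(can_fail_over(total_sentinels=3, reachable=1, quorum=1))
```

Under this model a low quorum makes failure detection more eager, but the majority requirement still decides whether a failover can actually happen.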
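Start the stack:
# Start in the background
docker-compose up -d
# Check container status
docker-compose ps
# Follow the master's logs
docker-compose logs -f redis-master
# Stop and remove the containers
docker-compose down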
3. Kubernetes Deployment
3.1 StatefulSet Deployment
# redis-statefulset.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
data:
  redis.conf: |
    bind 0.0.0.0
    port 6379
    # requirepass and masterauth are passed as redis-server arguments in the
    # StatefulSet below, because Redis does not expand $(VAR) inside redis.conf.
    appendonly yes
    appendfilename "appendonly.aof"
    dir /data
    cluster-enabled yes
    cluster-config-file nodes.conf
    cluster-node-timeout 5000
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      containers:
        - name: redis
          image: redis:7.0
          # Kubernetes expands $(REDIS_PASSWORD) in the container command
          command:
            - redis-server
            - /etc/redis/redis.conf
            - --requirepass
            - $(REDIS_PASSWORD)
            - --masterauth
            - $(REDIS_PASSWORD)
          ports:
            - containerPort: 6379
              name: redis
            - containerPort: 16379
              name: bus
          env:
            - name: REDIS_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: redis-secret
                  key: password
          volumeMounts:
            - name: redis-config
              mountPath: /etc/redis
            - name: redis-data
              mountPath: /data
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            exec:
              # Probe commands are not variable-expanded, so go through a shell
              command:
                - sh
                - -c
                - redis-cli -a "$REDIS_PASSWORD" ping
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - sh
                - -c
                - redis-cli -a "$REDIS_PASSWORD" ping
            initialDelaySeconds: 10
            periodSeconds: 5
      volumes:
        - name: redis-config
          configMap:
            name: redis-config
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
        storageClassName: standard
---
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
spec:
  clusterIP: None
  ports:
    - port: 6379
      targetPort: 6379
      name: redis
    - port: 16379
      targetPort: 16379
      name: bus
  selector:
    app: redis-cluster
---
apiVersion: v1
kind: Secret
metadata:
  name: redis-secret
type: Opaque
data:
  password: bXlfcGFzc3dvcmQ=  # base64-encoded "my_password"
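The `data` field of a Secret holds base64-encoded values, which is an encoding and not encryption. The sketch below shows how the value in the manifest can be produced and checked; the helper names `encode_secret_value` and `decode_secret_value` are hypothetical.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Base64-encode a value for the `data` field of a Kubernetes Secret."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse of encode_secret_value; handy when inspecting a Secret."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    print(encode_secret_value("my_password"))  # bXlfcGFzc3dvcmQ=
```

Because anyone who can read the Secret can decode it, access to the Secret itself still needs to be restricted.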
Deployment commands:
# Apply the manifests
kubectl apply -f redis-statefulset.yaml
# Check Pod status
kubectl get pods -l app=redis-cluster
# View logs
kubectl logs redis-cluster-0
# Open a redis-cli session inside a Pod
kubectl exec -it redis-cluster-0 -- redis-cli -a my_password
# Create the cluster (3 masters, 1 replica each).
# Note: some redis-cli versions only accept IP addresses here;
# in that case pass the Pod IPs instead of the DNS names.
kubectl exec -it redis-cluster-0 -- \
  redis-cli --cluster create \
  redis-cluster-0.redis-cluster:6379 \
  redis-cluster-1.redis-cluster:6379 \
  redis-cluster-2.redis-cluster:6379 \
  redis-cluster-3.redis-cluster:6379 \
  redis-cluster-4.redis-cluster:6379 \
  redis-cluster-5.redis-cluster:6379 \
  --cluster-replicas 1 \
  --cluster-yes \
  -a my_password
# Check cluster state
kubectl exec -it redis-cluster-0 -- \
  redis-cli -a my_password cluster info
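Once the cluster is created, its 16384 hash slots are divided among the masters and each key is routed by its slot. As a minimal sketch of that mapping, the code below computes the slot for a key using CRC16 (the XMODEM variant described in the Redis Cluster specification) modulo 16384, including the `{hash tag}` rule. The function names are hypothetical; on a live cluster, `CLUSTER KEYSLOT` gives the authoritative answer.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots, honoring {hash tags}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end > start + 1:  # only a non-empty tag is used
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    print(key_slot("foo"))
    # Keys sharing a hash tag land in the same slot, so multi-key
    # operations on them stay on one node.
    print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```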
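3.2 Deploying with Helm
# Add the Bitnami repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# Install Redis Cluster.
# The redis-cluster chart uses a single top-level persistence block;
# master.* and replica.* keys belong to the separate bitnami/redis chart.
helm install redis bitnami/redis-cluster \
  --set password=my_password \
  --set cluster.nodes=6 \
  --set cluster.replicas=1 \
  --set persistence.size=10Gi \
  --set metrics.enabled=true \
  --set metrics.serviceMonitor.enabled=true
# Check release status
helm status redis
# Scale out. When adding nodes the chart expects the existing password and
# the current node count; verify the keys with
# `helm show values bitnami/redis-cluster` for your chart version.
helm upgrade redis bitnami/redis-cluster \
  --set password=my_password \
  --set cluster.nodes=9 \
  --set cluster.replicas=2 \
  --set cluster.update.addNodes=true \
  --set cluster.update.currentNumberOfNodes=6
# Uninstall
helm uninstall redis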
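The `cluster.nodes` and `cluster.replicas` values are related: the node count covers masters and replicas together. The sketch below works out the resulting topology under the assumption that nodes = masters × (1 + replicas per master) and that a Redis Cluster needs at least three masters. The function name `cluster_topology` is hypothetical.

```python
from typing import Tuple


def cluster_topology(nodes: int, replicas_per_master: int) -> Tuple[int, int]:
    """Split a total node count into (masters, replicas).

    Assumes every master gets the same number of replicas.
    """
    group_size = 1 + replicas_per_master
    if nodes % group_size != 0:
        raise ValueError(
            f"{nodes} nodes cannot be split into groups of {group_size}"
        )
    masters = nodes // group_size
    if masters < 3:
        raise ValueError("a Redis Cluster needs at least 3 masters")
    return masters, nodes - masters


if __name__ == "__main__":
    print(cluster_topology(6, 1))  # the install command above
    print(cluster_topology(9, 2))  # the upgrade command above
```

Under these assumptions both commands keep three masters; the upgrade adds a second replica per master and no new shards.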
The same settings as a values file:
# custom-values.yaml
# Key names vary between chart versions; verify them with
# `helm show values bitnami/redis-cluster`.
password: my_password
cluster:
  nodes: 6
  replicas: 1
  init: true
  update:
    addNodes: false
    currentNumberOfNodes: 6
persistence:
  enabled: true
  size: 10Gi
  storageClass: standard
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    namespace: monitoring
    additionalLabels:
      release: prometheus
podSecurityPolicy:
  enabled: false
rbac:
  create: true
serviceAccount:
  create: true
  name: redis-cluster
4. Redis Operator
4.1 Installing the Operator
# Uses the Spotahome Redis Operator (https://github.com/spotahome/redis-operator)
# Install the operator
kubectl apply -f https://raw.githubusercontent.com/spotahome/redis-operator/master/example/redisfailover/00-operator.yaml
# Verify the installation
kubectl get pods -n redis-operator
4.2 Creating a RedisFailover
# redis-failover.yaml
apiVersion: databases.spotahome.com/v1
kind: RedisFailover
metadata:
  name: my-redis
spec:
  auth:
    secretPath: redis-secret
  sentinel:
    replicas: 3
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
      limits:
        cpu: 200m
        memory: 200Mi
  redis:
    replicas: 3
    resources:
      requests:
        cpu: 250m
        memory: 500Mi
      limits:
        cpu: 500m
        memory: 1Gi
    storage:
      persistentVolumeClaim:
        metadata:
          name: redis-pvc
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
          storageClassName: standard
  # Metrics settings; verify these field names against the CRD version you installed
  exportedMetrics:
    enabled: true
  serviceMonitor:
    enabled: true
    additionalLabels:
      release: prometheus
Deploy:
# Create the Secret
kubectl create secret generic redis-secret \
  --from-literal=password=my_password
# Apply the manifest
kubectl apply -f redis-failover.yaml
# Check the RedisFailover resource
kubectl get redisfailover
# List the Pods
kubectl get pods -l app=rf-my-redis
5. Monitoring and Alerting
5.1 Prometheus Monitoring
# prometheus-rules.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: redis-rules
  labels:
    release: prometheus
spec:
  groups:
    - name: redis.rules
      rules:
        - alert: RedisDown
          expr: redis_up == 0
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: "Redis instance down"
            description: "Redis instance {{ $labels.instance }} has been down for more than 1 minute"
        - alert: RedisMemoryHigh
          # The second clause skips instances with no maxmemory limit (max = 0)
          expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.9 and redis_memory_max_bytes > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Redis memory usage is high"
            description: "Redis instance {{ $labels.instance }} is using more than 90% of its memory limit"
        - alert: RedisConnectedClientsHigh
          expr: redis_connected_clients > 1000
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Redis client connections are high"
            description: "Redis instance {{ $labels.instance }} has more than 1000 connected clients"
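The memory rule is a ratio of two gauges, and it helps to be explicit about one edge case: Redis reports a maxmemory of 0 when no limit is configured, so the ratio is only meaningful for a non-zero limit. The sketch below evaluates the same condition for a single sample; the function name `memory_alert` and its default threshold are illustrative and not part of any exporter.

```python
def memory_alert(used_bytes: float, max_bytes: float, threshold: float = 0.9) -> bool:
    """Evaluate the RedisMemoryHigh condition for one sample.

    Returns False when no memory limit is configured (max_bytes == 0),
    since a usage ratio cannot be computed in that case.
    """
    if max_bytes <= 0:
        return False
    return used_bytes / max_bytes > threshold


if __name__ == "__main__":
    gib = 1024 ** 3
    print(memory_alert(1.9 * gib, 2 * gib))  # above 90% of a 2 GiB limit
    print(memory_alert(1.0 * gib, 2 * gib))  # half of the limit
    print(memory_alert(1.0 * gib, 0))        # no limit configured
```

In Prometheus the `for: 5m` clause additionally requires the condition to hold continuously before the alert fires; this sketch covers only the expression itself.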
5.2 Grafana Dashboard
{
  "dashboard": {
    "title": "Redis Dashboard",
    "panels": [
      {
        "title": "Memory Usage",
        "type": "graph",
        "targets": [
          {
            "expr": "redis_memory_used_bytes",
            "legendFormat": "{{ instance }}"
          }
        ]
      },
      {
        "title": "Connected Clients",
        "type": "graph",
        "targets": [
          {
            "expr": "redis_connected_clients",
            "legendFormat": "{{ instance }}"
          }
        ]
      },
      {
        "title": "Operations per second",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(redis_commands_processed_total[1m])",
            "legendFormat": "{{ instance }}"
          }
        ]
      }
    ]
  }
}
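The "Operations per second" panel relies on `rate()` over a counter that only ever increases until Redis restarts. As a simplified sketch of what that calculation does, the function below derives a per-second rate from `(timestamp, value)` samples and treats any drop in value as a counter reset. The name `simple_rate` is hypothetical, and unlike Prometheus it does not extrapolate to the edges of the window.

```python
from typing import List, Tuple


def simple_rate(samples: List[Tuple[float, float]]) -> float:
    """Approximate per-second increase of a counter.

    samples: (unix_timestamp_seconds, counter_value) pairs in time order.
    A drop in value is treated as a counter reset (for example, a restart).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            increase += cur  # counter restarted from zero
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    return increase / elapsed


if __name__ == "__main__":
    # Five scrapes 15 seconds apart, 300 commands processed between each.
    window = [(0, 100), (15, 400), (30, 700), (45, 1000), (60, 1300)]
    print(simple_rate(window))
```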
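6. Managed Cloud Services Compared
6.1 Major Cloud Providers
| Provider | Product | Highlights |
|---|---|---|
| Alibaba Cloud | ApsaraDB for Redis | High availability, read/write splitting |
| Tencent Cloud | TencentDB for Redis | Cluster and primary/replica editions |
| AWS | ElastiCache | Supports Redis and Memcached |
| Azure | Cache for Redis | Enterprise-grade SLA |
| GCP | Memorystore | Fully managed |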
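6.2 How to Choose
Factors to weigh when choosing a managed service:
1. Performance requirements
   - Required QPS
   - Latency targets
   - Bandwidth needs
2. Availability requirements
   - SLA guarantees
   - Failure recovery time
   - Data durability
3. Cost
   - Instance fees
   - Traffic fees
   - Storage fees
4. Operational needs
   - Automatic backups
   - Monitoring and alerting
   - Elastic scaling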
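7. Summary
7.1 Choosing a Deployment Option
Deployment decision tree:
1. Do you need automated operations?
├── No → Docker / Docker Compose
└── Yes → go to 2
2. Are you running Kubernetes?
├── No → Managed cloud service
└── Yes → go to 3
3. Do you need automatic failure recovery?
├── No → StatefulSet
└── Yes → Redis Operator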
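The decision tree can also be written as a small function, which makes the order of the three questions explicit. This is only a sketch of the tree as stated; the function name `choose_deployment` and its parameter names are hypothetical.

```python
def choose_deployment(needs_automation: bool,
                      uses_kubernetes: bool,
                      needs_auto_failover: bool) -> str:
    """Walk the three questions of the decision tree in order."""
    if not needs_automation:
        return "Docker / Docker Compose"
    if not uses_kubernetes:
        return "Managed cloud service"
    if not needs_auto_failover:
        return "StatefulSet"
    return "Redis Operator"


if __name__ == "__main__":
    print(choose_deployment(needs_automation=True,
                            uses_kubernetes=True,
                            needs_auto_failover=True))
```

Later questions are only reached when the earlier answers are "yes", so a team that does not need automation never has to decide on Kubernetes at all.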
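7.2 Best Practices
- Containerization
  - Use the official image
  - Configure health checks
  - Set resource limits
- Persistence
  - Store data on PVCs
  - Enable AOF persistence
  - Back up regularly
- Monitoring
  - Deploy Prometheus
  - Configure alert rules
  - Visualize with Grafana
- Security
  - Manage passwords with Secrets
  - Isolate the network
  - Enforce RBAC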