Introduction
Our RabbitMQ was originally hosted on AWS as a single instance for cost reasons, so the app was affected during every maintenance window. I therefore decided to try building a RabbitMQ cluster on Kubernetes.
RabbitMQ version: v3.13.7 (follows the operator default)
Kubernetes (EKS) version: v1.30
Installing RabbitMQ Cluster Operator in a Kubernetes Cluster [ref]
The Operator is responsible for creating the RabbitMQ cluster. I pinned it to the current latest version, v2.11.0, which creates RabbitMQ v3.13.7 by default (ref).
The RabbitMQ instance version is controlled by the Operator (I haven't found a way to customize it yet; maybe by setting the image?) and can be checked in the release changelog.
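That said, the RabbitmqCluster resource does expose a spec.image field, so pinning a specific image should be possible. A minimal sketch (the tag below is only an example; I haven't verified compatibility with this Operator version):

apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: dev-rabbitmq
spec:
  # Example tag only; without this the Operator uses its default RabbitMQ image.
  image: rabbitmq:3.13.7-management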
kubectl apply -f "https://github.com/rabbitmq/cluster-operator/releases/download/v2.11.0/cluster-operator.yml"
# namespace/rabbitmq-system created
# customresourcedefinition.apiextensions.k8s.io/rabbitmqclusters.rabbitmq.com created
# serviceaccount/rabbitmq-cluster-operator created
# role.rbac.authorization.k8s.io/rabbitmq-cluster-leader-election-role created
# clusterrole.rbac.authorization.k8s.io/rabbitmq-cluster-operator-role created
# rolebinding.rbac.authorization.k8s.io/rabbitmq-cluster-leader-election-rolebinding created
# clusterrolebinding.rbac.authorization.k8s.io/rabbitmq-cluster-operator-rolebinding created
# deployment.apps/rabbitmq-cluster-operator created
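Before creating any clusters, it's worth confirming that the Operator Deployment is actually running, for example:

# The Operator runs as a single Deployment in the rabbitmq-system namespace.
kubectl -n rabbitmq-system get deployment rabbitmq-cluster-operator
kubectl -n rabbitmq-system logs deployment/rabbitmq-cluster-operator --tail=20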
Creating a RabbitMQ Cluster [example]
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: dev-rabbitmq
  labels:
    app.kubernetes.io/name: dev-rabbitmq
spec:
  replicas: 3
  resources:
    requests:
      cpu: 300m
      memory: 256Mi
    limits:
      cpu: 2000m
      memory: 256Mi
  rabbitmq:
    additionalConfig: |
      vm_memory_high_watermark.absolute = 200MB
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - dev-rabbitmq
          topologyKey: kubernetes.io/hostname
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/nodegroup
            operator: In
            values:
            - ARM-t4g-medium-1a
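Assuming the manifest above is saved as dev-rabbitmq.yaml (the file name is my own choice), applying it and watching the cluster come up looks roughly like this:

kubectl apply -f dev-rabbitmq.yaml
# The Operator creates a StatefulSet (dev-rabbitmq-server), services, and a default-user secret.
kubectl get rabbitmqcluster dev-rabbitmq
kubectl get pods -l app.kubernetes.io/name=dev-rabbitmq -o wide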
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: dev-rabbitmq
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: dev-rabbitmq
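With replicas: 3, maxUnavailable: 1 keeps a two-node majority available during voluntary disruptions such as node drains. A quick sanity check that the budget picks up the server pods:

kubectl get pdb dev-rabbitmq
# ALLOWED DISRUPTIONS should show 1 once all three server pods are Ready.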
Stress / connection testing [ref]
Substitute your own cluster name for hello-world (here: dev-rabbitmq).
username="$(kubectl get secret dev-rabbitmq-default-user -o jsonpath='{.data.username}' | base64 --decode)"
password="$(kubectl get secret dev-rabbitmq-default-user -o jsonpath='{.data.password}' | base64 --decode)"
service="$(kubectl get service dev-rabbitmq -o jsonpath='{.spec.clusterIP}')"
kubectl run perf-test --image=pivotalrabbitmq/perf-test -- --uri amqp://$username:$password@$service
## test quorum queue
kubectl run perf-test --image=pivotalrabbitmq/perf-test -- --uri amqp://$username:$password@$service --quorum-queue --queue eric-quorum --metrics-format compact --use-millis --rate 100
## Test client HPA (Fill up the queue)
kubectl run perf-test --image=pivotalrabbitmq/perf-test -- \
--uri amqp://$username:$password@$service \
--quorum-queue --queue eric-quorum \
--producers 1 --consumers 0 --predeclared --routing-key rk \
--metrics-format compact --use-millis --rate 100
# pod/perf-test created
kubectl logs -f perf-test 2>&1 | tee ~/Downloads/rabbitMQ-cluster-perf-test-$(date "+%m-%d-%H-%M")
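One small gotcha: kubectl run reuses the fixed pod name perf-test, so re-running any of the commands above fails with AlreadyExists until the previous pod is removed:

kubectl delete pod perf-test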
By default, RabbitMQ's memory high watermark is triggered at 60% RAM usage; the memory alarm fires and publishing connections are blocked. ref
This can be tuned with vm_memory_high_watermark.relative = 0.8, but the official docs advise against a relative percentage in container environments, so I use vm_memory_high_watermark.absolute = 200MB instead. A running node can also be adjusted dynamically with rabbitmqctl set_vm_memory_high_watermark <fraction>.
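For example, assuming the Operator's default <cluster-name>-server-N pod naming, the watermark can be changed on a live node like this:

# Raise the relative watermark to 80% on one node (pod name follows the Operator's naming).
kubectl exec dev-rabbitmq-server-0 -- rabbitmqctl set_vm_memory_high_watermark 0.8
# Or set an absolute limit instead.
kubectl exec dev-rabbitmq-server-0 -- rabbitmqctl set_vm_memory_high_watermark absolute "220MB"
# Values set with rabbitmqctl only last until the node restarts;
# persist them via rabbitmq.additionalConfig as in the manifest above.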