
Deploying ZooKeeper and etcd as Stateful Services in Practice

jackwang / 3637 reads


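I. Overview

Kubernetes supports stateful applications such as ZooKeeper and etcd through StatefulSet, which provides the following:

A stable, unique network identity for each pod

Stable persistent storage, implemented with PV/PVC

Ordered pod startup and shutdown, for graceful deployment and scaling

This article describes how to deploy ZooKeeper and etcd as stateful services on a k8s cluster, using Ceph for data persistence.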

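II. Summary

A StatefulSet combined with StorageClass, PV, and PVC objects on the k8s side and RBD on the Ceph side is enough to run stateful services such as ZooKeeper and etcd on a Kubernetes cluster.

k8s does not proactively delete PV and PVC objects that have already been created, which guards against accidental deletion.

If you do decide to delete the PV and PVC objects, you also have to delete the RBD image on the Ceph side by hand.

Pitfalls

The Ceph client user referenced in the StorageClass must have mon rw and rbd rwx permissions. Without mon write permission, releasing the RBD lock fails and the RBD image cannot be mounted on another k8s worker node.

ZooKeeper node health is checked with probes. If a node fails its liveness probe, k8s restarts its container, which restarts the ZooKeeper node automatically.

Because ZooKeeper 3.4 loads its cluster configuration statically from zoo.cfg, every node in the ZooKeeper cluster has to be restarted when a ZooKeeper pod's IP changes.

etcd deployment still needs work

This experiment deploys the etcd cluster statically. When etcd members change, the cluster has to be reconfigured by hand with commands such as etcdctl member remove/add, which severely limits automatic failure recovery and scaling. The deployment should be switched to a dynamic method, either DNS or etcd discovery, before etcd can run well on k8s.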

III. ZooKeeper cluster deployment

1. Download the image
docker pull gcr.mirrors.ustc.edu.cn/google_containers/kubernetes-zookeeper:1.0-3.4.10
docker tag gcr.mirrors.ustc.edu.cn/google_containers/kubernetes-zookeeper:1.0-3.4.10 172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10
docker push  172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10
2. Define the Ceph secret
cat << EOF | kubectl create -f -
apiVersion: v1
data:
  key: QVFBYy9ndGFRUno4QlJBQXMxTjR3WnlqN29PK3VrMzI1a05aZ3c9PQo=
kind: Secret
metadata:
  name: ceph-secret
  namespace: default
type: kubernetes.io/rbd
EOF
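
The key field of the Secret holds the Ceph client key in base64. A minimal sketch of producing that value, assuming the key is available as a plain string; the helper name encode_ceph_key is hypothetical and not part of the original setup:

```shell
#!/bin/sh
# encode_ceph_key: base64-encode a Ceph key for the Secret's data.key field.
# printf is used so that no trailing newline is appended before encoding.
encode_ceph_key() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with a placeholder value (not a real credential).
encode_ceph_key 'abc'; echo   # prints YWJj

# On a Ceph admin host the real key could be passed in instead, e.g.:
#   encode_ceph_key "$(ceph auth get-key client.admin)"
```

The output can then be pasted into the key field of the manifest above.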
3. Define the RBD StorageClass
cat << EOF | kubectl create -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph
parameters:
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: default
  fsType: ext4
  imageFormat: "2"
  imageFeatures: layering
  monitors: 172.16.13.223
  pool: k8s
  userId: admin
  userSecretName: ceph-secret
provisioner: kubernetes.io/rbd
reclaimPolicy: Delete
EOF
4. Create the ZooKeeper cluster

Store ZooKeeper node data on RBD

cat << EOF | kubectl create -f -
---
apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  ports:
  - port: 2181
    name: client
  selector:
    app: zk
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1beta2 # for versions before 1.8.0 use apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - zk
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: kubernetes-zookeeper
        imagePullPolicy: Always
        image: "172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        command:
        - sh
        - -c
        - "start-zookeeper 
          --servers=3 
          --data_dir=/var/lib/zookeeper/data 
          --data_log_dir=/var/lib/zookeeper/data/log 
          --conf_dir=/opt/zookeeper/conf 
          --client_port=2181 
          --election_port=3888 
          --server_port=2888 
          --tick_time=2000 
          --init_limit=10 
          --sync_limit=5 
          --heap=512M 
          --max_client_cnxns=60 
          --snap_retain_count=3 
          --purge_interval=12 
          --max_session_timeout=40000 
          --min_session_timeout=4000 
          --log_level=INFO"
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.beta.kubernetes.io/storage-class: ceph
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF

Check the result

[root@172 zookeeper]# kubectl get no
NAME           STATUS    ROLES     AGE       VERSION
172.16.20.10   Ready     <none>    50m       v1.8.2
172.16.20.11   Ready     <none>    2h        v1.8.2
172.16.20.12   Ready     <none>    1h        v1.8.2

[root@172 zookeeper]# kubectl get po -owide 
NAME      READY     STATUS    RESTARTS   AGE       IP              NODE
zk-0      1/1       Running   0          8m        192.168.5.162   172.16.20.10
zk-1      1/1       Running   0          1h        192.168.2.146   172.16.20.11

[root@172 zookeeper]# kubectl get pv,pvc
NAME                                          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                  STORAGECLASS   REASON    AGE
pv/pvc-226cb8f0-d322-11e7-9581-000c29f99475   1Gi        RWO            Delete           Bound     default/datadir-zk-0   ceph                     1h
pv/pvc-22703ece-d322-11e7-9581-000c29f99475   1Gi        RWO            Delete           Bound     default/datadir-zk-1   ceph                     1h

NAME               STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc/datadir-zk-0   Bound     pvc-226cb8f0-d322-11e7-9581-000c29f99475   1Gi        RWO            ceph           1h
pvc/datadir-zk-1   Bound     pvc-22703ece-d322-11e7-9581-000c29f99475   1Gi        RWO            ceph           1h

The RBD lock information for the zk-0 pod:

[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker       ID                              Address                   
client.24146 kubelet_lock_magic_172.16.20.10 172.16.20.10:0/1606152350 
5. Test pod migration

Cordon node 172.16.20.10 (mark it unschedulable) so that zk-0 is rescheduled onto 172.16.20.12

kubectl cordon 172.16.20.10

[root@172 zookeeper]# kubectl get no
NAME           STATUS                     ROLES     AGE       VERSION
172.16.20.10   Ready,SchedulingDisabled   <none>    58m       v1.8.2
172.16.20.11   Ready                      <none>    2h        v1.8.2
172.16.20.12   Ready                      <none>    1h        v1.8.2

kubectl delete po zk-0

Watch zk-0 migrate

[root@172 zookeeper]# kubectl get po -owide -w
NAME      READY     STATUS    RESTARTS   AGE       IP              NODE
zk-0      1/1       Running   0          14m       192.168.5.162   172.16.20.10
zk-1      1/1       Running   0          1h        192.168.2.146   172.16.20.11
zk-0      1/1       Terminating   0         16m       192.168.5.162   172.16.20.10
zk-0      0/1       Terminating   0         16m       <none>    172.16.20.10
zk-0      0/1       Terminating   0         16m       <none>    172.16.20.10
zk-0      0/1       Terminating   0         16m       <none>    172.16.20.10
zk-0      0/1       Terminating   0         16m       <none>    172.16.20.10
zk-0      0/1       Terminating   0         16m       <none>    172.16.20.10
zk-0      0/1       Pending   0         0s        <none>    <none>
zk-0      0/1       Pending   0         0s        <none>    172.16.20.12
zk-0      0/1       ContainerCreating   0         0s        <none>    172.16.20.12
zk-0      0/1       Running   0         3s        192.168.3.4   172.16.20.12

zk-0 has now migrated to 172.16.20.12.
Check the RBD lock information again

[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker       ID                              Address                   
client.24146 kubelet_lock_magic_172.16.20.10 172.16.20.10:0/1606152350 
[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker       ID                              Address                   
client.24154 kubelet_lock_magic_172.16.20.12 172.16.20.12:0/3715989358 

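Earlier, when I tested this zk pod migration on a different Ceph cluster, it always failed because the lock could not be released. Analysis suggested that the Ceph account in use lacked the required permissions, so releasing the lock failed. The recorded errors: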

Nov 27 10:45:55 172 kubelet: W1127 10:45:55.551768   11556 rbd_util.go:471] rbd: no watchers on kubernetes-dynamic-pvc-f35a411e-d317-11e7-90ab-000c29f99475
Nov 27 10:45:55 172 kubelet: I1127 10:45:55.694126   11556 rbd_util.go:181] remove orphaned locker kubelet_lock_magic_172.16.20.12 from client client.171490: err exit status 13, output: 2017-11-27 10:45:55.570483 7fbdbe922d40 -1 did not load config file, using default settings.
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600816 7fbdbe922d40 -1 Errors while parsing config file!
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600824 7fbdbe922d40 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600825 7fbdbe922d40 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600825 7fbdbe922d40 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602492 7fbdbe922d40 -1 Errors while parsing config file!
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602494 7fbdbe922d40 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602495 7fbdbe922d40 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602496 7fbdbe922d40 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.651594 7fbdbe922d40 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.k8s.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: rbd: releasing lock failed: (13) Permission denied
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.682470 7fbdbe922d40 -1 librbd: unable to blacklist client: (13) Permission denied

The k8s rbd volume implementation:

if lock {
            // check if lock is already held for this host by matching lock_id and rbd lock id
            if strings.Contains(output, lock_id) {
                // this host already holds the lock, exit
                glog.V(1).Infof("rbd: lock already held for %s", lock_id)
                return nil
            }
            // clean up orphaned lock if no watcher on the image
            used, statusErr := util.rbdStatus(&b)
            if statusErr == nil && !used {
                re := regexp.MustCompile("client.* " + kubeLockMagic + ".*")
                locks := re.FindAllStringSubmatch(output, -1)
                for _, v := range locks {
                    if len(v) > 0 {
                        lockInfo := strings.Split(v[0], " ")
                        if len(lockInfo) > 2 {
                            args := []string{"lock", "remove", b.Image, lockInfo[1], lockInfo[0], "--pool", b.Pool, "--id", b.Id, "-m", mon}
                            args = append(args, secret_opt...)
                            cmd, err = b.exec.Run("rbd", args...)
                            // the rbd lock remove command returned an error here
                            glog.Infof("remove orphaned locker %s from client %s: err %v, output: %s", lockInfo[1], lockInfo[0], err, string(cmd))
                        }
                    }
                }
            }

            // hold a lock: rbd lock add
            args := []string{"lock", "add", b.Image, lock_id, "--pool", b.Pool, "--id", b.Id, "-m", mon}
            args = append(args, secret_opt...)
            cmd, err = b.exec.Run("rbd", args...)
        } 

As shown, the rbd lock remove operation was refused for lack of permission: rbd: releasing lock failed: (13) Permission denied
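
If a stale lock has to be cleared by hand, the same rbd lock remove call that kubelet issues can be built from the rbd lock list output. The following is a sketch only: the helper name stale_lock_cmds is hypothetical, it assumes the three-column listing format shown earlier, and it prints the commands rather than running them so they can be reviewed first.

```shell
#!/bin/sh
# stale_lock_cmds: read `rbd lock list` output on stdin and print one
# `rbd lock remove` command per lock row. Nothing is executed.
#   $1 = image name, $2 = pool name
stale_lock_cmds() {
  image=$1
  pool=$2
  # Lock rows start with the locker (client.<id>); field 2 is the lock ID.
  # Argument order mirrors the kubelet code: image, lock ID, locker.
  awk -v image="$image" -v pool="$pool" '
    $1 ~ /^client\./ {
      printf "rbd lock remove %s %s %s -p %s\n", image, $2, $1, pool
    }'
}

# Sample input copied from the lock listing earlier in this section.
cat << 'EOF' | stale_lock_cmds kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 k8s
There is 1 exclusive lock on this image.
Locker       ID                              Address
client.24146 kubelet_lock_magic_172.16.20.10 172.16.20.10:0/1606152350
EOF
```

Running the printed command still requires a Ceph user with sufficient permissions, which is exactly what was missing in the failing cluster.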

6. Test scale-up

Scale the ZooKeeper cluster from 2 nodes to 3.
With 2 nodes, zoo.cfg defines two instances

zookeeper@zk-0:/opt/zookeeper/conf$ cat zoo.cfg 
#This file was autogenerated DO NOT EDIT
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/data/log
tickTime=2000
initLimit=10
syncLimit=5
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInteval=12
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888

Use kubectl edit statefulset zk to set replicas=3 and start-zookeeper --servers=3,
then watch the pods change

[root@172 zookeeper]# kubectl get po -owide -w
NAME      READY     STATUS    RESTARTS   AGE       IP              NODE
zk-0      1/1       Running   0          1h        192.168.5.170   172.16.20.10
zk-1      1/1       Running   0          1h        192.168.3.12    172.16.20.12
zk-2      0/1       Pending   0         0s        <none>    <none>
zk-2      0/1       Pending   0         0s        <none>    172.16.20.11
zk-2      0/1       ContainerCreating   0         0s        <none>    172.16.20.11
zk-2      0/1       Running   0         1s        192.168.2.154   172.16.20.11
zk-2      1/1       Running   0         11s       192.168.2.154   172.16.20.11
zk-1      1/1       Terminating   0         1h        192.168.3.12   172.16.20.12
zk-1      0/1       Terminating   0         1h        <none>    172.16.20.12
zk-1      0/1       Terminating   0         1h        <none>    172.16.20.12
zk-1      0/1       Terminating   0         1h        <none>    172.16.20.12
zk-1      0/1       Terminating   0         1h        <none>    172.16.20.12
zk-1      0/1       Pending   0         0s        <none>    <none>
zk-1      0/1       Pending   0         0s        <none>    172.16.20.12
zk-1      0/1       ContainerCreating   0         0s        <none>    172.16.20.12
zk-1      0/1       Running   0         2s        192.168.3.13   172.16.20.12
zk-1      1/1       Running   0         20s       192.168.3.13   172.16.20.12
zk-0      1/1       Terminating   0         1h        192.168.5.170   172.16.20.10
zk-0      0/1       Terminating   0         1h        <none>    172.16.20.10
zk-0      0/1       Terminating   0         1h        <none>    172.16.20.10
zk-0      0/1       Terminating   0         1h        <none>    172.16.20.10
zk-0      0/1       Terminating   0         1h        <none>    172.16.20.10
zk-0      0/1       Pending   0         0s        <none>    <none>
zk-0      0/1       Pending   0         0s        <none>    172.16.20.10
zk-0      0/1       ContainerCreating   0         0s        <none>    172.16.20.10
zk-0      0/1       Running   0         2s        192.168.5.171   172.16.20.10
zk-0      1/1       Running   0         12s       192.168.5.171   172.16.20.10

Both zk-0 and zk-1 restarted, so they load the new zoo.cfg and the cluster is configured correctly.
The new zoo.cfg lists 3 instances:

[root@172 ~]# kubectl exec zk-0 -- cat /opt/zookeeper/conf/zoo.cfg
#This file was autogenerated DO NOT EDIT
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/data/log
tickTime=2000
initLimit=10
syncLimit=5
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInteval=12
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888
server.3=zk-2.zk-hs.default.svc.cluster.local:2888:3888
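
The server.N entries follow directly from the StatefulSet naming scheme: pod ordinals start at 0, ZooKeeper server IDs start at 1, and each pod is reachable through the headless service. The sketch below reproduces that mapping for illustration; zk_server_lines is a hypothetical helper, not the image's actual start-zookeeper script.

```shell
#!/bin/sh
# zk_server_lines: print the server.N lines for a StatefulSet.
#   $1 = StatefulSet name, $2 = headless service, $3 = namespace, $4 = replicas
zk_server_lines() {
  name=$1
  service=$2
  namespace=$3
  replicas=$4
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Pod <name>-<i> becomes server.<i+1>; 2888 is the server port and
    # 3888 the leader-election port, as in the manifest above.
    echo "server.$((i + 1))=${name}-${i}.${service}.${namespace}.svc.cluster.local:2888:3888"
    i=$((i + 1))
  done
}

# Prints the same three server lines as the zoo.cfg shown above.
zk_server_lines zk zk-hs default 3
```

Because these lines are written once at startup, any change in replica count only takes effect after every pod has been restarted with the new value.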
7. Test scale-down

During scale-down the zk cluster also restarted all of its nodes automatically. The process:

[root@172 ~]# kubectl get po -owide -w
NAME      READY     STATUS    RESTARTS   AGE       IP              NODE
zk-0      1/1       Running   0          5m        192.168.5.171   172.16.20.10
zk-1      1/1       Running   0          6m        192.168.3.13    172.16.20.12
zk-2      1/1       Running   0          7m        192.168.2.154   172.16.20.11
zk-2      1/1       Terminating   0         7m        192.168.2.154   172.16.20.11
zk-1      1/1       Terminating   0         7m        192.168.3.13   172.16.20.12
zk-2      0/1       Terminating   0         8m        <none>    172.16.20.11
zk-1      0/1       Terminating   0         7m        <none>    172.16.20.12
zk-2      0/1       Terminating   0         8m        <none>    172.16.20.11
zk-1      0/1       Terminating   0         7m        <none>    172.16.20.12
zk-1      0/1       Terminating   0         7m        <none>    172.16.20.12
zk-1      0/1       Terminating   0         7m        <none>    172.16.20.12
zk-1      0/1       Pending   0         0s        <none>    <none>
zk-1      0/1       Pending   0         0s        <none>    172.16.20.12
zk-1      0/1       ContainerCreating   0         0s        <none>    172.16.20.12
zk-1      0/1       Running   0         2s        192.168.3.14   172.16.20.12
zk-2      0/1       Terminating   0         8m        <none>    172.16.20.11
zk-2      0/1       Terminating   0         8m        <none>    172.16.20.11
zk-1      1/1       Running   0         19s       192.168.3.14   172.16.20.12
zk-0      1/1       Terminating   0         7m        192.168.5.171   172.16.20.10
zk-0      0/1       Terminating   0         7m        <none>    172.16.20.10
zk-0      0/1       Terminating   0         7m        <none>    172.16.20.10
zk-0      0/1       Terminating   0         7m        <none>    172.16.20.10
zk-0      0/1       Pending   0         0s        <none>    <none>
zk-0      0/1       Pending   0         0s        <none>    172.16.20.10
zk-0      0/1       ContainerCreating   0         0s        <none>    172.16.20.10
zk-0      0/1       Running   0         3s        192.168.5.172   172.16.20.10
zk-0      1/1       Running   0         13s       192.168.5.172   172.16.20.10
IV. etcd cluster deployment

1. Create the etcd cluster
cat << 'EOF' | kubectl create -f -
apiVersion: v1
kind: Service
metadata:
  name: "etcd"
  annotations:
    # Create endpoints also if the related pod isn't ready
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  ports:
  - port: 2379
    name: client
  - port: 2380
    name: peer
  clusterIP: None
  selector:
    component: "etcd"
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: "etcd"
  labels:
    component: "etcd"
spec:
  serviceName: "etcd"
  # changing replicas value will require a manual etcdctl member remove/add
  # command (remove before decreasing and add after increasing)
  replicas: 3
  template:
    metadata:
      name: "etcd"
      labels:
        component: "etcd"
    spec:
      containers:
      - name: "etcd"
        image: "172.16.18.100:5000/quay.io/coreos/etcd:v3.2.3"
        ports:
        - containerPort: 2379
          name: client
        - containerPort: 2380
          name: peer
        env:
        - name: CLUSTER_SIZE
          value: "3"
        - name: SET_NAME
          value: "etcd"
        volumeMounts:
        - name: data
          mountPath: /var/run/etcd
        command:
          - "/bin/sh"
          - "-ecx"
          - |
            IP=$(hostname -i)
            for i in $(seq 0 $((${CLUSTER_SIZE} - 1))); do
              while true; do
                echo "Waiting for ${SET_NAME}-${i}.${SET_NAME} to come up"
                ping -W 1 -c 1 ${SET_NAME}-${i}.${SET_NAME}.default.svc.cluster.local > /dev/null && break
                sleep 1s
              done
            done
            PEERS=""
            for i in $(seq 0 $((${CLUSTER_SIZE} - 1))); do
                PEERS="${PEERS}${PEERS:+,}${SET_NAME}-${i}=http://${SET_NAME}-${i}.${SET_NAME}.default.svc.cluster.local:2380"
            done
            # start etcd. If cluster is already initialized the `--initial-*` options will be ignored.
            exec etcd --name ${HOSTNAME} \
              --listen-peer-urls http://${IP}:2380 \
              --listen-client-urls http://${IP}:2379,http://127.0.0.1:2379 \
              --advertise-client-urls http://${HOSTNAME}.${SET_NAME}:2379 \
              --initial-advertise-peer-urls http://${HOSTNAME}.${SET_NAME}:2380 \
              --initial-cluster-token etcd-cluster-1 \
              --initial-cluster ${PEERS} \
              --initial-cluster-state new \
              --data-dir /var/run/etcd/default.etcd
## We are using dynamic pv provisioning using the "standard" storage class so
## this resource can be directly deployed without changes to minikube (since
## minikube defines this class for its minikube hostpath provisioner). In
## production define your own way to use pv claims.
  volumeClaimTemplates:
  - metadata:
      name: data
      annotations:
        volume.beta.kubernetes.io/storage-class: ceph
    spec:
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: 1Gi
EOF

The pods after creation:

[root@172 etcd]# kubectl get po -owide 
NAME      READY     STATUS    RESTARTS   AGE       IP              NODE
etcd-0    1/1       Running   0          15m       192.168.5.174   172.16.20.10
etcd-1    1/1       Running   0          15m       192.168.3.16    172.16.20.12
etcd-2    1/1       Running   0          5s        192.168.5.176   172.16.20.10
2. Test scale-down
kubectl scale statefulset etcd --replicas=2

[root@172 ~]# kubectl get po -owide -w
NAME      READY     STATUS    RESTARTS   AGE       IP              NODE
etcd-0    1/1       Running   0          17m       192.168.5.174   172.16.20.10
etcd-1    1/1       Running   0          17m       192.168.3.16    172.16.20.12
etcd-2    1/1       Running   0          1m        192.168.5.176   172.16.20.10
etcd-2    1/1       Terminating   0         1m        192.168.5.176   172.16.20.10
etcd-2    0/1       Terminating   0         1m        <none>    172.16.20.10

Check cluster health

kubectl exec etcd-0 -- etcdctl cluster-health

failed to check the health of member 42c8b94265b9b79a on http://etcd-2.etcd:2379: Get http://etcd-2.etcd:2379/health: dial tcp: lookup etcd-2.etcd on 10.96.0.10:53: no such host
member 42c8b94265b9b79a is unreachable: [http://etcd-2.etcd:2379] are all unreachable
member 9869f0647883a00d is healthy: got healthy result from http://etcd-1.etcd:2379
member c799a6ef06bc8c14 is healthy: got healthy result from http://etcd-0.etcd:2379
cluster is healthy

After scaling down, etcd-2 was not removed from the etcd cluster automatically, so this etcd image does not support automatic scaling well.
Remove etcd-2 by hand

[root@172 etcd]# kubectl exec etcd-0 -- etcdctl member remove 42c8b94265b9b79a
Removed member 42c8b94265b9b79a from cluster
[root@172 etcd]# kubectl exec etcd-0 -- etcdctl cluster-health                
member 9869f0647883a00d is healthy: got healthy result from http://etcd-1.etcd:2379
member c799a6ef06bc8c14 is healthy: got healthy result from http://etcd-0.etcd:2379
cluster is healthy
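3. Test scale-up

The startup script in etcd.yaml shows that a new etcd pod started during scale-up still gets --initial-cluster-state new, so this etcd image does not support dynamic scale-up. One option is to rewrite the startup script to deploy the etcd cluster dynamically via DNS, which would allow the etcd cluster to scale up dynamically.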
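
Short of a full DNS-based redesign, a smaller change is for the startup script to choose --initial-cluster-state from the pod ordinal. The sketch below is an assumption-laden illustration, not the approach used in this experiment: initial_cluster_state is a hypothetical helper, and it assumes that a pod joining later has already been registered with etcdctl member add before it starts.

```shell
#!/bin/sh
# initial_cluster_state: print "new" for pods in the bootstrap membership
# and "existing" for pods added later by a scale-up.
#   $1 = pod hostname (e.g. etcd-3), $2 = original replica count
initial_cluster_state() {
  hostname_arg=$1
  bootstrap_size=$2
  # The StatefulSet ordinal is the text after the last dash.
  ordinal=${hostname_arg##*-}
  if [ "$ordinal" -ge "$bootstrap_size" ]; then
    echo existing
  else
    echo new
  fi
}

initial_cluster_state etcd-0 3   # prints new
initial_cluster_state etcd-3 3   # prints existing
```

In the manifest above, the result would replace the hard-coded new passed to --initial-cluster-state, with CLUSTER_SIZE serving as the bootstrap size.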
