Building a Highly Available Kubernetes Cluster, Part 04: Deploying a Highly Available etcd Cluster

Creating a highly available etcd cluster

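Kubernetes stores all of its data in etcd. This document walks through deploying a three-node highly available etcd cluster. The three nodes have the IP addresses 192.168.255.194, 192.168.255.195, and 192.168.255.196.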
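
As a rough illustration of why three nodes are used: etcd is built on the Raft consensus algorithm, which needs a majority (quorum) of members to agree on every write, so a three-node cluster keeps working with one node down. The function names below are made up for this sketch.

```shell
#!/bin/bash
# quorum N: smallest majority of an N-member cluster.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: how many members can fail while a majority remains.
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

An even member count adds no tolerance over the next lower odd count, which is why odd cluster sizes are the usual choice.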

TLS certificate files

The etcd cluster needs TLS certificates for encrypted communication. Here we reuse the kubernetes certificate created earlier:

cp ca.pem /etc/kubernetes/ssl
cp kubernetes-key.pem /etc/kubernetes/ssl
cp kubernetes.pem /etc/kubernetes/ssl

The hosts field of the kubernetes certificate must include the IPs of the three machines above; otherwise certificate verification will fail later.
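
One way to check this before starting etcd is to look at the certificate's Subject Alternative Name entries. The sketch below is only an illustration: missing_ips is a hypothetical helper name, and it assumes the SAN entries appear in the "IP Address:<ip>" form that openssl x509 -text prints.

```shell
#!/bin/bash
# missing_ips CERT_TEXT IP...: print every IP that does not appear as an
# exact "IP Address:<ip>" entry in the given certificate text.
missing_ips() {
  cert_text=$1
  shift
  for ip in "$@"; do
    # Split on commas, trim whitespace, then require a whole-line match so
    # that 192.168.255.19 is not mistaken for 192.168.255.194.
    if ! printf '%s\n' "$cert_text" | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | grep -Fxq "IP Address:$ip"; then
      echo "$ip"
    fi
  done
}

# Only runs where the certificate and openssl are actually present.
cert=/etc/kubernetes/ssl/kubernetes.pem
if [ -f "$cert" ] && command -v openssl >/dev/null 2>&1; then
  text=$(openssl x509 -in "$cert" -noout -text)
  missing_ips "$text" 192.168.255.194 192.168.255.195 192.168.255.196
fi
```

If the script prints nothing, all three node IPs are covered; any IP it prints is missing from the certificate.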

Installing etcd on the etcd nodes

Download a binary release from the https://github.com/coreos/etcd/releases page; this walkthrough uses v3.1.5:

wget https://github.com/coreos/etcd/releases/download/v3.1.5/etcd-v3.1.5-linux-amd64.tar.gz
tar -xvf etcd-v3.1.5-linux-amd64.tar.gz
mv etcd-v3.1.5-linux-amd64/etcd* /usr/local/bin

Configuring etcd

  • etcd configuration file
cat > /data/k8s/script/config/etcd.conf << EOF
# [member]
ETCD_NAME="etcd_node1"
ETCD_DATA_DIR="etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.255.194:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.255.194:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.255.194:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.255.194:2379"
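ETCD_CLUSTER_URLS="etcd_node1=https://192.168.255.194:2380,etcd_node2=https://192.168.255.195:2380,etcd_node3=https://192.168.255.196:2380"
EOF

This is the configuration for node 192.168.255.194. On the other two etcd nodes, change the IP address in ETCD_LISTEN_PEER_URLS, ETCD_LISTEN_CLIENT_URLS, ETCD_INITIAL_ADVERTISE_PEER_URLS, and ETCD_ADVERTISE_CLIENT_URLS to that node's own IP, and set ETCD_NAME to etcd_node2 or etcd_node3 respectively. ETCD_CLUSTER_URLS stays identical on all three nodes.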
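
To avoid hand-editing the file three times, the per-node differences can be captured in a small generator. This is a sketch only; render_etcd_conf is a hypothetical helper, and it assumes the same variable names and three-member list used in this post.

```shell
#!/bin/bash
# render_etcd_conf NAME IP: print an etcd.conf for one node.
# Only the node name and the node's own IP vary; the member list is shared.
render_etcd_conf() {
  name=$1
  ip=$2
  cat << EOF
# [member]
ETCD_NAME="$name"
ETCD_DATA_DIR="etcd"
ETCD_LISTEN_PEER_URLS="https://$ip:2380"
ETCD_LISTEN_CLIENT_URLS="https://$ip:2379"

# [cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://$ip:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://$ip:2379"
ETCD_CLUSTER_URLS="etcd_node1=https://192.168.255.194:2380,etcd_node2=https://192.168.255.195:2380,etcd_node3=https://192.168.255.196:2380"
EOF
}

# Example: print the file for the second node.
render_etcd_conf etcd_node2 192.168.255.195
```

On each node, the output would be redirected to /data/k8s/script/config/etcd.conf.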

  • Contents of the env configuration file
cat > /data/k8s/script/config/env << EOF
workdir="/data/k8s/script"
bin_dir="/usr/local/bin"
nproc=65535
EOF
  • etcd start script
cat > /data/k8s/script/etcd_ctl << 'EOF'
#!/bin/bash

source /data/k8s/script/config/env
source $workdir/config/etcd.conf

name="etcd"
pidfile="$workdir/run/$name.pid"
ETCD_DATA_DIR="$workdir/data/$ETCD_DATA_DIR"

test -d $workdir/data || mkdir -p $workdir/data
test -d $workdir/run || mkdir -p $workdir/run
test -d $workdir/log/$name || mkdir -p $workdir/log/$name

display_help(){
  echo "Usage: `basename $0` (start|stop)"
  exit 0
}

if [ $# -ne 1 ];then
  display_help
fi

source $workdir/pid_utils/pid_util.sh
case $1 in 
  start)
      mkdir -p "$ETCD_DATA_DIR"
      cd "$ETCD_DATA_DIR"
      exec $bin_dir/etcd \
          --name ${ETCD_NAME} \
          --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
          --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
          --peer-cert-file=/etc/kubernetes/ssl/kubernetes.pem \
          --peer-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
          --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
          --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
          --initial-advertise-peer-urls ${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
          --listen-peer-urls ${ETCD_LISTEN_PEER_URLS} \
          --listen-client-urls ${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
          --advertise-client-urls ${ETCD_ADVERTISE_CLIENT_URLS} \
          --initial-cluster-token ${ETCD_INITIAL_CLUSTER_TOKEN} \
          --initial-cluster ${ETCD_CLUSTER_URLS} \
          --initial-cluster-state new \
          --data-dir=${ETCD_DATA_DIR} \
          1>>$workdir/log/$name/$name.log 2>&1 &

      for try in $(seq 0 9);do
            sleep $try
            echo "wait $name pid (try: $try)"
            pid=$(lsof -t $bin_dir/etcd)
            if [ -n "$pid" ]; then
                echo "$pid" > $pidfile
                break;
            fi
       done
  ;;

  stop)
    kill_and_wait $pidfile
    ;;
  *)
    display_help
    ;;
esac
EOF

Repeat the same deployment on the other nodes, changing the IPs and the node name in the configuration to match each node.
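
The stop branch of the script calls kill_and_wait, which it loads from $workdir/pid_utils/pid_util.sh; that file is not shown in this post. Below is a minimal sketch of what such a helper might look like. The behavior (send SIGTERM, wait up to a timeout, fall back to SIGKILL, remove the pid file) and the optional timeout argument are assumptions, not the author's actual implementation.

```shell
#!/bin/bash
# Hypothetical stand-in for pid_utils/pid_util.sh.
# kill_and_wait PIDFILE [TIMEOUT_SECONDS]: send SIGTERM to the pid stored in
# PIDFILE, wait until the process is gone, then remove PIDFILE.
kill_and_wait() {
  pidfile=$1
  timeout=${2:-10}
  if [ ! -f "$pidfile" ]; then
    echo "pid file $pidfile not found" >&2
    return 1
  fi
  pid=$(cat "$pidfile")
  kill "$pid" 2>/dev/null
  waited=0
  # kill -0 sends no signal; it only tests whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "process $pid did not exit after ${timeout}s, sending SIGKILL" >&2
      kill -9 "$pid" 2>/dev/null
      break
    fi
    sleep 1
    waited=$((waited + 1))
  done
  rm -f "$pidfile"
}
```

Sending SIGTERM first gives etcd a chance to shut down cleanly before any forced kill.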

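  • To secure communication, specify etcd's certificate and private key (cert-file and key-file), the certificate, private key, and CA certificate for peer communication (peer-cert-file, peer-key-file, peer-trusted-ca-file), and the CA certificate used to verify clients (trusted-ca-file);
  • The hosts field of the kubernetes-csr.json file used to create kubernetes.pem must include the IPs of all etcd nodes; otherwise certificate verification will fail;
  • When --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;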
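
The last point is easy to get wrong when copying the member list between nodes, for example by giving two members the same name. A quick pre-flight check might look like the sketch below; name_in_cluster is a hypothetical helper, and the member list is the one used in this post.

```shell
#!/bin/bash
# name_in_cluster NAME CLUSTER: succeed if NAME is one of the member names
# in a comma-separated "name=peer-url" list such as ETCD_CLUSTER_URLS.
name_in_cluster() {
  printf '%s\n' "$2" | tr ',' '\n' | cut -d= -f1 | grep -Fxq "$1"
}

cluster="etcd_node1=https://192.168.255.194:2380,etcd_node2=https://192.168.255.195:2380,etcd_node3=https://192.168.255.196:2380"
for name in etcd_node1 etcd_node3 etcd_node4; do
  if name_in_cluster "$name" "$cluster"; then
    echo "$name: listed in --initial-cluster"
  else
    echo "$name: NOT listed in --initial-cluster"
  fi
done
```

In practice the check would be run with the ETCD_NAME and ETCD_CLUSTER_URLS values from each node's etcd.conf.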

Starting the etcd service

bash /data/k8s/script/etcd_ctl start

Verifying the etcd service

etcdctl \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
   --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
   cluster-health

# Output like the following means the cluster has been set up successfully
member 1330485cd721a2f5 is healthy: got healthy result from https://192.168.255.196:2379
member 2b60be76e09983c7 is healthy: got healthy result from https://192.168.255.195:2379
member cefbbcb93ec7251d is healthy: got healthy result from https://192.168.255.194:2379
cluster is healthy
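
For scripted checks, the same output can be parsed rather than read by eye. The sketch below assumes the cluster-health output format shown above; check_health is a hypothetical helper name.

```shell
#!/bin/bash
# check_health EXPECTED: read `etcdctl cluster-health` output on stdin and
# succeed only if exactly EXPECTED members are reported healthy and the
# summary line "cluster is healthy" is present.
check_health() {
  expected=$1
  output=$(cat)
  healthy=$(printf '%s\n' "$output" | grep -c '^member [0-9a-f]* is healthy')
  printf '%s\n' "$output" | grep -Fxq 'cluster is healthy' \
    && [ "$healthy" -eq "$expected" ]
}

# Sample taken from the output shown in this post.
sample='member 1330485cd721a2f5 is healthy: got healthy result from https://192.168.255.196:2379
member 2b60be76e09983c7 is healthy: got healthy result from https://192.168.255.195:2379
member cefbbcb93ec7251d is healthy: got healthy result from https://192.168.255.194:2379
cluster is healthy'

if printf '%s\n' "$sample" | check_health 3; then
  echo "all 3 members healthy"
else
  echo "cluster check failed"
fi
```

On a real node, the etcdctl cluster-health command above would be piped into check_health 3 in place of the sample text.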