Docker's built-in container orchestration solution
Competition among the three systems swarm, k8s, and mesos was once fierce; today k8s has won out.
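This document applies to docker versions below 1.12; from 1.12 onward, swarm is built into docker-engine.
At least 5 PC servers are needed, with the following roles (named consul0, manager0, manager1, node0, and node1 in the steps below):
SSH into each server in turn to run the steps, or use a batch deployment tool such as ansible.
Install docker-engine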
curl -sSL https://get.docker.com/ | sh
Start it and make it listen on port 2375
sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
Alternatively, change the configuration so that the setting persists
mkdir /etc/systemd/system/docker.service.d
cat <<EOF >>/etc/systemd/system/docker.service.d/docker.conf
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --dns 180.76.76.76 --insecure-registry registry.cecf.com -g /home/Docker/docker
EOF
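The heredoc above appends with `>>`, so running it twice leaves two copies of the section in the file. A minimal sketch of an idempotent variant follows. The `DROPIN_DIR` variable and the `write_dropin` function are names made up for this illustration, the site-specific options (`--dns`, `--insecure-registry`, `-g`) are left out for brevity, and the default target is a scratch directory so the sketch can be tried without touching systemd.

```shell
#!/bin/sh
# Write the drop-in with '>' (overwrite) so re-running does not append a
# second copy. DROPIN_DIR is a hypothetical variable: it defaults to a
# scratch directory for a safe dry run; point it at
# /etc/systemd/system/docker.service.d to apply the change for real.
DROPIN_DIR="${DROPIN_DIR:-$(mktemp -d)}"

write_dropin() {
    dir="$1"
    mkdir -p "$dir"
    cat > "$dir/docker.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
EOF
}

write_dropin "$DROPIN_DIR"
echo "wrote $DROPIN_DIR/docker.conf"

# After writing to the real directory, systemd has to re-read its unit
# files and the daemon has to be restarted (both need root):
#   systemctl daemon-reload
#   systemctl restart docker
```

The empty `ExecStart=` line is kept on purpose: it clears the command from the packaged unit before the new one is set.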
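Start the consul service on consul0. The managers use it to authenticate node connections and to store node state. The discovery service should really be made highly available; that is skipped here for simplicity.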
docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap
Create the primary manager on manager0, replacing <manager0_ip> and <consul0_ip> with the real IP addresses.
docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise <manager0_ip>:4000 consul://<consul0_ip>:8500
Start the replica manager on manager1
docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise <manager1_ip>:4000 consul://<consul0_ip>:8500
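The --replication flag enables manager replication, so that a replica can take over if the primary manager fails.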
Run the join operation on node0 and node1 respectively
docker run -d swarm join --advertise=<node_ip>:2375 consul://<consul0_ip>:8500
docker -H :4000 info
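Since each host runs a different command depending on its role, it can help to print the whole plan before touching any machine. The sketch below only echoes the commands shown above and runs nothing. The `swarm_cmd` function is a hypothetical helper, and the `192.0.2.x` addresses are placeholders to be replaced with the real IPs.

```shell
#!/bin/sh
# Hypothetical placeholder address; replace with the real IP of consul0.
CONSUL0_IP="${CONSUL0_IP:-192.0.2.10}"

# Print (do not run) the command for one host, given its role and IP.
swarm_cmd() {
    role="$1"
    ip="$2"
    case "$role" in
        consul)
            echo "docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap" ;;
        manager)
            echo "docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise $ip:4000 consul://$CONSUL0_IP:8500" ;;
        node)
            echo "docker run -d swarm join --advertise=$ip:2375 consul://$CONSUL0_IP:8500" ;;
        *)
            echo "unknown role: $role" >&2
            return 1 ;;
    esac
}

# One line per host, in the order the steps above are carried out.
swarm_cmd consul  "$CONSUL0_IP"
swarm_cmd manager 192.0.2.11
swarm_cmd manager 192.0.2.12
swarm_cmd node    192.0.2.21
swarm_cmd node    192.0.2.22
```

The printed lines can then be pasted into each SSH session or turned into ansible tasks.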
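This document applies to docker 1.12 and later, where swarm is built into docker-engine.
Five PC servers are used here as an example, with the following roles:
SSH into each server in turn to run the steps, or use a batch deployment tool such as ansible.
Install docker-engine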
curl -sSL https://get.docker.com/ | sh
Start it and make it listen on port 2375
sudo docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
Alternatively, change the configuration so that the setting persists
mkdir /etc/systemd/system/docker.service.d
cat <<EOF >>/etc/systemd/system/docker.service.d/docker.conf
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --dns 180.76.76.76 --insecure-registry registry.cecf.com -g /home/Docker/docker
EOF
If a firewall is enabled, the following ports need to be opened
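For reference, Docker's swarm mode documentation names 2377/tcp for cluster management, 7946/tcp and 7946/udp for communication among nodes, and 4789/udp for overlay network traffic. The sketch below assumes firewalld is the firewall in use, which these notes do not state, and it only prints the commands rather than running them. `SWARM_PORTS` and `print_firewall_cmds` are names invented for the example.

```shell
#!/bin/sh
# Ports documented for swarm mode:
#   2377/tcp              cluster management (the port in the join command)
#   7946/tcp and 7946/udp communication among nodes
#   4789/udp              overlay network traffic
SWARM_PORTS="2377/tcp 7946/tcp 7946/udp 4789/udp"

# Print the firewalld commands instead of running them (dry run).
# Assumption: the hosts use firewalld; adapt for iptables or ufw.
print_firewall_cmds() {
    for port in $SWARM_PORTS; do
        echo "firewall-cmd --permanent --add-port=$port"
    done
    echo "firewall-cmd --reload"
}

print_firewall_cmds
```

Run the printed commands as root on every node once they look right.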
docker swarm init --advertise-addr <MANAGER-IP>
In my example it looks like this:
[root@manager0 ~]# docker swarm init --advertise-addr 10.42.0.243
Swarm initialized: current node (e5eqi0lue90uidzsfddeqwfl8) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-3iskhw3lsc9pkdtijj1d23lg9tp7duj18f477i5ywgezry7zlt-dfwjbsjleoajcdj13psu702s6 \
10.42.0.243:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
Use --advertise-addr to declare manager0's IP; all other nodes must be able to reach this IP.
Once initialization completes, this node is both a manager and a worker node.
Use docker info to check:
$ docker info
Containers: 2
Running: 0
Paused: 0
Stopped: 2
...snip...
Swarm: active
NodeID: e5eqi0lue90uidzsfddeqwfl8
Is Manager: true
Managers: 1
Nodes: 1
...snip...
Use docker node ls to view the cluster's node information:
[root@manager0 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
e5eqi0lue90uidzsfddeqwfl8 * manager0 Ready Active Leader
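The * marks the node that the docker client is currently connected to.
To add a node, simply run on it the join command that docker swarm init printed on manager0.
If you did not record it at the time, you can look it up again on a manager. To join a node as a worker, run the following command on manager0: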
docker swarm join-token worker
To join a node as a manager, run the following command on manager0:
docker swarm join-token manager
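If the output of either command was saved to a file or pasted into a script, the token can be pulled out of the text and the join command rebuilt from it. This is a small sketch that works on saved text only and never calls docker. `extract_token` and `join_cmd` are hypothetical helper names, and the sample is the worker token printed earlier in these notes.

```shell
#!/bin/sh
# Pull the first SWMTKN-... token out of text on stdin, for example the
# saved output of `docker swarm init` or `docker swarm join-token worker`.
extract_token() {
    grep -o 'SWMTKN-[A-Za-z0-9-]*' | head -n 1
}

# Build (but do not run) the join command for a token and manager address.
join_cmd() {
    echo "docker swarm join --token $1 $2"
}

# Sample text copied from the init output shown earlier.
saved_output='docker swarm join \
    --token SWMTKN-1-3iskhw3lsc9pkdtijj1d23lg9tp7duj18f477i5ywgezry7zlt-dfwjbsjleoajcdj13psu702s6 \
    10.42.0.243:2377'

token=$(printf '%s\n' "$saved_output" | extract_token)
join_cmd "$token" 10.42.0.243:2377
```

The worker and manager tokens differ only in their last segment, so it is worth checking which one was saved before using it.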
For manager high availability, I run the following on manager1:
docker swarm join \
--token SWMTKN-1-3iskhw3lsc9pkdtijj1d23lg9tp7duj18f477i5ywgezry7zlt-86dk7l9usp1yh4uc3rjchf2hu \
10.42.0.243:2377
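How much availability extra managers buy depends on their count, because swarm managers agree on cluster state through Raft and need a majority to keep working. The arithmetic can be sketched as follows; `quorum` and `tolerated_failures` are names chosen for the example.

```shell
#!/bin/sh
# Raft needs a majority of the managers to agree.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Number of managers that can fail while a majority still remains.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 5; do
    echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With only manager0 and manager1, the majority is 2, so losing either manager leaves the swarm unable to accept management operations. A third manager is the smallest setup that survives the loss of one, which is why odd manager counts are usually recommended.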
Here I run the following on node0-2 in turn:
docker swarm join \
--token SWMTKN-1-3iskhw3lsc9pkdtijj1d23lg9tp7duj18f477i5ywgezry7zlt-dfwjbsjleoajcdj13psu702s6 \
10.42.0.243:2377
The nodes then join the swarm cluster created earlier.
Use docker node ls again to view the current state of the cluster; a swarm cluster is made up of nodes:
[root@manager0 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
0tr5fu8ebi27cp2ot210t67fx manager1 Ready Active Reachable
46irkik4idjk8rjy7pqjb84x0 node1 Ready Active
79hlu1m7x9p4cc4npa4xjuax3 node0 Ready Active
9535h8ow82s8mzuw5kud2mwl3 consul0 Ready Active
e5eqi0lue90uidzsfddeqwfl8 * manager0 Ready Active Leader
The MANAGER STATUS column shows each node's role; an empty value means the node is a worker.
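The same rule can be applied mechanically to captured output. The sketch below counts managers and workers from docker node ls text on stdin, assuming the column layout shown above, where the last field of a manager row is Leader, Reachable, or Unreachable. `count_roles` is a hypothetical helper and the sample is the listing from these notes.

```shell
#!/bin/sh
# Count managers and workers in `docker node ls` output read from stdin.
# A row whose last field is a manager status counts as a manager;
# every other non-empty row after the header counts as a worker.
count_roles() {
    awk 'NR > 1 && NF > 0 {
             if ($NF == "Leader" || $NF == "Reachable" || $NF == "Unreachable")
                 managers++
             else
                 workers++
         }
         END { printf "managers=%d workers=%d\n", managers, workers }'
}

# Sample copied from the listing above.
sample='ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
0tr5fu8ebi27cp2ot210t67fx manager1 Ready Active Reachable
46irkik4idjk8rjy7pqjb84x0 node1 Ready Active
79hlu1m7x9p4cc4npa4xjuax3 node0 Ready Active
9535h8ow82s8mzuw5kud2mwl3 consul0 Ready Active
e5eqi0lue90uidzsfddeqwfl8 * manager0 Ready Active Leader'

printf '%s\n' "$sample" | count_roles
```

On a live manager the input would come from `docker node ls | count_roles` instead of the saved sample.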
Usage: docker service COMMAND
Manage Docker services
Options:
--help Print usage
Commands:
create Create a new service
inspect Display detailed information on one or more services
ps List the tasks of a service
ls List services
rm Remove one or more services
scale Scale one or multiple services
update Update a service
A deployment example:
docker service create --replicas 2 --name helloworld alpine ping 300.cn
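docker service ls: lists all services in the swarm cluster
docker service ps helloworld: shows which nodes the service's tasks were deployed to
docker service inspect helloworld: shows details such as the service's resources and state
docker service scale helloworld=5: scales the service to 5 replicas
docker service rm helloworld: removes the service
docker service update: updates the service's attributes, including rolling upgrades.
The attributes that can be updated are as follows: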
Usage: docker service update [OPTIONS] SERVICE
Update a service
Options:
--args string Service command args
--constraint-add value Add or update placement constraints (default [])
--constraint-rm value Remove a constraint (default [])
--container-label-add value Add or update container labels (default [])
--container-label-rm value Remove a container label by its key (default [])
--endpoint-mode string Endpoint mode (vip or dnsrr)
--env-add value Add or update environment variables (default [])
--env-rm value Remove an environment variable (default [])
--help Print usage
--image string Service image tag
--label-add value Add or update service labels (default [])
--label-rm value Remove a label by its key (default [])
--limit-cpu value Limit CPUs (default 0.000)
--limit-memory value Limit Memory (default 0 B)
--log-driver string Logging driver for service
--log-opt value Logging driver options (default [])
--mount-add value Add or update a mount on a service
--mount-rm value Remove a mount by its target path (default [])
--name string Service name
--publish-add value Add or update a published port (default [])
--publish-rm value Remove a published port by its target port (default [])
--replicas value Number of tasks (default none)
--reserve-cpu value Reserve CPUs (default 0.000)
--reserve-memory value Reserve Memory (default 0 B)
--restart-condition string Restart when condition is met (none, on-failure, or any)
--restart-delay value Delay between restart attempts (default none)
--restart-max-attempts value Maximum number of restarts before giving up (default none)
--restart-window value Window used to evaluate the restart policy (default none)
--stop-grace-period value Time to wait before force killing a container (default none)
--update-delay duration Delay between updates
--update-failure-action string Action on update failure (pause|continue) (default "pause")
--update-parallelism uint Maximum number of tasks updated simultaneously (0 to update all at once) (default 1)
-u, --user string Username or UID
--with-registry-auth Send registry authentication details to swarm agents
-w, --workdir string Working directory inside the container
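For rolling upgrades, the two options that matter most are --update-parallelism and --update-delay. A rough way to reason about them is sketched below: the number of batches is the replica count divided by the parallelism, rounded up, and the delay is paid between batches. This is only a lower bound that ignores image pulls and container start time; the function names and the image tag in the comment are made up for the illustration.

```shell
#!/bin/sh
# Number of update batches: ceil(replicas / parallelism).
# A parallelism of 0 means "update all at once", i.e. a single batch.
update_batches() {
    replicas="$1"
    parallelism="$2"
    if [ "$parallelism" -eq 0 ]; then
        echo 1
        return
    fi
    echo $(( ($replicas + $parallelism - 1) / $parallelism ))
}

# Lower bound, in seconds, on the time spent waiting between batches.
min_update_wait() {
    batches=$(update_batches "$1" "$2")
    echo $(( ($batches - 1) * $3 ))
}

# Example: 5 replicas, 2 at a time, 10 seconds between batches.
# The matching command would look like this (hypothetical image tag):
#   docker service update --image alpine:3.4 \
#       --update-parallelism 2 --update-delay 10s helloworld
echo "batches=$(update_batches 5 2) min_wait_seconds=$(min_update_wait 5 2 10)"
```

Raising the parallelism shortens the rollout but takes more replicas out of service at once, so the choice depends on how much spare capacity the service has.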
As described earlier, this assumes the environment is already set up, based on docker 1.12.
[root@manager0 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
0bbmd3r7aphs374qaea4zcieo node2 Ready Active
3qmxzyauc0bz4kjqvld9uogz5 manager1 Ready Active Reachable
5ewbdtvaopj4ltwqx0a4i65nt * manager0 Ready Drain Leader
5oxxpgk69fnwe5w210kovrqi9 node1 Ready Active
7s1ilay2wkjgt09bp2z0743m7 node0 Ready Active
docker network create -d overlay --subnet 10.254.0.0/16 --gateway 10.254.0.1 mynet1
docker service create --name redis --network mynet1 redis
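One thing worth checking before settling on a --subnet value is that the overlay range does not contain the hosts' own addresses, since overlapping ranges tend to cause confusing routing problems. That concern is an assumption added here rather than something these notes state. The sketch below does the check with plain shell arithmetic; `ip_to_int` and `in_cidr` are hypothetical helpers, and 10.42.0.243 is manager0's address from the earlier output.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if the address in $1 lies inside NETWORK/PREFIX given in $2.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    prefix=${2#*/}
    mask=$(( (4294967295 << (32 - $prefix)) & 4294967295 ))
    [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
}

OVERLAY=10.254.0.0/16
for host in 10.42.0.243 10.254.0.1; do
    if in_cidr "$host" "$OVERLAY"; then
        echo "$host is inside $OVERLAY"
    else
        echo "$host is outside $OVERLAY"
    fi
done
```

Here the host address falls outside the overlay range while the gateway falls inside it, which is the expected result for the network created above.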
[root@manager0 ~]# docker service ps redis
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
9avksjfqr2gxm413dfrezrmgr redis.1 redis node1 Running Running 17 seconds ago
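In this example, you can also run docker ps on node1 to check.
The above covers only the most basic use of the cluster to create a service. As it shows, the basic scheduling unit in swarm is the task; there is no pod concept, and a task can be thought of simply as the result of one docker run. swarm does not currently support compose either.
According to docker, VM and pod scheduling units will be supported in the future; no date has been given.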
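Use docker service create to create a service. For choosing which node to deploy on, docker provides three scheduling strategies;
The --replicas parameter sets the number of containers for the service, in order to achieve high availability;
# Create multiple replicas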
docker service update --replicas 4 redis
# Check how the replicas are deployed
[root@manager0 ~]# docker service ps redis
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR
9avksjfqr2gxm413dfrezrmgr redis.1 redis node1 Running Running 13 minutes ago
0olv1sfz6d79wdnorw7jgoyri redis.2 redis manager1 Running Running about a minute ago
f3n6deesjlkxu4k48lzabieus redis.3 redis node2 Running Preparing 3 minutes ago
80bzarvkiytpv1690sla6unt2 redis.4 redis node0 Running Running about a minute ago
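# Verify availability. There are 4 replicas in total, and docker's built-in service discovery spreads connections to the service name across them round-robin by default.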
root@9ed77b4b4432:/data# redis-cli -h redis
redis:6379> set user 1
OK
redis:6379> exit
root@9ed77b4b4432:/data# redis-cli -h redis
redis:6379> get user
(nil)
redis:6379> set user 2
OK
redis:6379> exit
root@9ed77b4b4432:/data# redis-cli -h redis
redis:6379> get user
(nil)
redis:6379> set user 3
OK
redis:6379> exit
root@9ed77b4b4432:/data# redis-cli -h redis
redis:6379> get user
(nil)
redis:6379> set user 4
OK
redis:6379> exit
root@9ed77b4b4432:/data# redis-cli -h redis
redis:6379> get user
"1"
redis:6379>
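The (nil) answers in the session above follow directly from round-robin over four replicas that share no storage: each new connection lands on the next replica, and only the fifth connection returns to the one that stored "1". A toy model in plain shell reproduces the pattern. It stands in for redis with one directory per replica, and every name in it (`rr_connect`, `rr_set`, `rr_get`, `rr_reset`) is invented for the illustration.

```shell
#!/bin/sh
# Each replica is modelled as its own directory: no shared storage,
# just like four independent redis containers.
REPLICAS=4
STORE_DIR=$(mktemp -d)
rr_next=0      # replica the next connection will land on
rr_current=0   # replica used by the current connection

# Forget all stored values and start again from replica 0.
rr_reset() {
    rm -rf "$STORE_DIR"/replica-*
    rr_next=0
}

# Open a new "connection": pick the next replica in round-robin order.
rr_connect() {
    rr_current=$rr_next
    rr_next=$(( ($rr_next + 1) % $REPLICAS ))
    mkdir -p "$STORE_DIR/replica-$rr_current"
}

# Store a value on the replica behind the current connection.
rr_set() {
    printf '%s' "$2" > "$STORE_DIR/replica-$rr_current/$1"
}

# Read a value from the replica behind the current connection.
rr_get() {
    if [ -f "$STORE_DIR/replica-$rr_current/$1" ]; then
        cat "$STORE_DIR/replica-$rr_current/$1"
        echo
    else
        echo "(nil)"
    fi
}

# Replay the pattern of the session: five connections, a get then a set.
rr_reset
n=1
while [ "$n" -le 5 ]; do
    rr_connect
    echo "connection $n -> replica $rr_current: get user = $(rr_get user)"
    rr_set user "$n"
    n=$((n + 1))
done
```

The takeaway is that scaling a stateful service such as redis with --replicas gives four independent stores rather than one highly available store; the data itself would need replication or shared storage, which is outside what these notes cover.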