My current company uses Alibaba's internal version of Dubbo, i.e., EDAS, built on Alibaba's own EDAS services. If you have used Dubbo before, you probably know ZooKeeper, which is widely used in big data and in RPC communication. Whether you have used ZooKeeper or not, this post introduces ZooKeeper and walks through cluster deployment. You really need to do this hands-on to understand it deeply.
(1) Introducing ZooKeeper
ZooKeeper originated in a research group at Yahoo! Research. At the time, the researchers noticed that many large systems inside Yahoo depended on some similar system for distributed coordination, but those systems tended to suffer from single points of failure. So Yahoo's developers set out to build a general-purpose distributed coordination framework with no single point of failure, letting application developers concentrate on business logic.
There is an anecdote behind the project's name. Early on, since many internal projects had been named after animals (the famous Pig project, for example), Yahoo's engineers wanted an animal name for this one too. Raghu Ramakrishnan, then chief scientist of the research lab, joked: "At this rate, this place will turn into a zoo!" Everyone immediately agreed it should be called the zookeeper: with all those animal-named distributed components, Yahoo's distributed systems already looked like one big zoo, and this project was precisely meant to coordinate that distributed environment. Thus the name ZooKeeper was born.
In the 1960s the mainframe was invented. With its enormous computing and I/O capacity, stability, and security, it became the mainstream worldwide. But mainframes had some fatal drawbacks: high cost, operational complexity, single points of failure, and in particular the very high cost of training mainframe specialists. As these problems mounted, they held back the mainframe's development. Meanwhile, as PC performance kept improving and networks spread, people turned to minicomputers and commodity PC servers to build distributed computing systems and cut costs. Distributed computing has been heating up ever since.
Official site: https://zookeeper.apache.org/
Download: https://www-eu.apache.org/dist/zookeeper/
Source code: https://github.com/apache/zookeeper
(2) Cluster Deployment
Clusters come in two flavors: fully distributed and pseudo-distributed.
Distributed: each instance runs on its own independent host, and all instances use the same port.
Pseudo-distributed: multiple instances run on one host, distinguished by port. Pseudo-distributed clusters are rare in real production environments.
Operationally, the pseudo-distributed cluster is actually the trickier one to set up.
Installing Vagrant on macOS: https://idig8.com/2018/07/29/Docker-zhongji-07/
Installing Vagrant on Windows: https://idig8.com/2018/07/29/docker-zhongji-08/
OS       IP address        Node role                 CPU   Memory   Hostname
CentOS7  192.168.69.100    pseudo-distributed        2     2G       zookeeper-virtua
CentOS7  192.168.69.101    distributed - Leader      2     2G       zookeeper-Leader
CentOS7  192.168.69.102    distributed - Follower 1  2     2G       zookeeper-Follower1
CentOS7  192.168.69.103    distributed - Follower 2  2     2G       zookeeper-Follower2
A small tip: marking command listings as src blocks gives them syntax highlighting. I had always ignored this and the plain text strained my eyes; with the colors it reads much better.
As usual I use Vagrant; since getting comfortable with it I have basically never created a virtual machine by hand.
(2.1.1) Basic setup
su                              # password: vagrant
cd ~
vi /etc/ssh/sshd_config
sudo systemctl restart sshd
vi /etc/resolv.conf             # set the nameserver to 8.8.8.8
service network restart
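The two vi steps above do not show the actual edits. As a rough sketch (the exact lines depend on how your box image is provisioned, so treat these as assumptions), the typical changes on a Vagrant CentOS 7 box are:

# /etc/ssh/sshd_config: allow password login over SSH (assumed edit)
PasswordAuthentication yes
PermitRootLogin yes

# /etc/resolv.conf: point DNS at Google's public resolver
nameserver 8.8.8.8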
(2.1.2) JDK installation
The script is in my source repo.
vi pro.sh
sh pro.sh
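The real pro.sh lives in the repo; as a hedged sketch of what a JDK install script for CentOS 7 typically contains (the package and path names below are assumptions; ZooKeeper 3.4.x just needs a Java runtime on the PATH):

#!/bin/sh
# Hypothetical JDK install sketch (the actual pro.sh is in the repo)
yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel
# Make JAVA_HOME and java available in every shell
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' >> /etc/profile
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
source /etc/profile
java -version   # verify the install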
(2.1.3) Download ZooKeeper
A note on versions: the latest release at the time of writing is already 3.5.4, but I am sticking with 3.4.10.
wget https://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
(2.1.4) Extract ZooKeeper
tar zxvf zookeeper-3.4.10.tar.gz
(2.1.5) Go into the conf directory and make three copies of the sample config
cd /root/zookeeper-3.4.10/conf
cp zoo_sample.cfg zoo1.cfg
cp zoo_sample.cfg zoo2.cfg
cp zoo_sample.cfg zoo3.cfg
(2.1.6) Edit the three files zoo1.cfg, zoo2.cfg, zoo3.cfg
(2.1.6.1) Edit zoo1.cfg
vi zoo1.cfg
dataDir=/apps/servers/data/d_1
dataLogDir=/apps/servers/logs/logs_1
clientPort=2181
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/apps/servers/data/d_1
dataLogDir=/apps/servers/logs/logs_1
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=localhost:2187:2887
server.2=localhost:2188:2888
server.3=localhost:2189:2889
(2.1.6.2) Edit zoo2.cfg
vi zoo2.cfg
dataDir=/apps/servers/data/d_2
dataLogDir=/apps/servers/logs/logs_2
clientPort=2182
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/apps/servers/data/d_2
dataLogDir=/apps/servers/logs/logs_2
# the port at which the clients will connect
clientPort=2182
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=localhost:2187:2887
server.2=localhost:2188:2888
server.3=localhost:2189:2889
(2.1.6.3) Edit zoo3.cfg
vi zoo3.cfg
dataDir=/apps/servers/data/d_3
dataLogDir=/apps/servers/logs/logs_3
clientPort=2183
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/apps/servers/data/d_3
dataLogDir=/apps/servers/logs/logs_3
# the port at which the clients will connect
clientPort=2183
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=localhost:2187:2887
server.2=localhost:2188:2888
server.3=localhost:2189:2889
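Since the three files differ only in the instance number (dataDir, dataLogDir, clientPort), they could also be generated in one loop instead of edited by hand. A small sketch, assuming the paths used above:

# Generate zoo1.cfg..zoo3.cfg from the sample config (sketch)
cd /root/zookeeper-3.4.10/conf
for i in 1 2 3; do
  cp zoo_sample.cfg zoo$i.cfg
  sed -i "s|^dataDir=.*|dataDir=/apps/servers/data/d_$i|" zoo$i.cfg
  sed -i "s|^clientPort=.*|clientPort=218$i|" zoo$i.cfg
  echo "dataLogDir=/apps/servers/logs/logs_$i" >> zoo$i.cfg
  printf "server.1=localhost:2187:2887\nserver.2=localhost:2188:2888\nserver.3=localhost:2189:2889\n" >> zoo$i.cfg
done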
(2.1.7) Create the data and log directories
mkdir -p /apps/servers/data/d_1
mkdir -p /apps/servers/data/d_2
mkdir -p /apps/servers/data/d_3
mkdir -p /apps/servers/logs/logs_1
mkdir -p /apps/servers/logs/logs_2
mkdir -p /apps/servers/logs/logs_3
echo "1" > /apps/servers/data/d_1/myid
echo "2" > /apps/servers/data/d_2/myid
echo "3" > /apps/servers/data/d_3/myid
(2.1.8) Go into the bin directory and start each instance
cd /root/zookeeper-3.4.10/bin
sh zkServer.sh start ../conf/zoo1.cfg
sh zkServer.sh start ../conf/zoo2.cfg
sh zkServer.sh start ../conf/zoo3.cfg
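Before connecting clients, it is worth checking which instance won the election. In 3.4.x, zkServer.sh should accept the config file as a second argument for status as well (if not, check zookeeper.out):

# One instance should report "Mode: leader", the other two "Mode: follower"
sh zkServer.sh status ../conf/zoo1.cfg
sh zkServer.sh status ../conf/zoo2.cfg
sh zkServer.sh status ../conf/zoo3.cfg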
(2.1.9) Connect to each instance and check the result
source /etc/profile
sh zkCli.sh -server localhost:2181
sh zkCli.sh -server localhost:2182
sh zkCli.sh -server localhost:2183
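A quick smoke test to confirm the three instances really behave as one cluster: write a znode through one and read it through another (/demo and its value are placeholder names):

# In the zkCli session connected to localhost:2181
create /demo hello
# Then in the session connected to localhost:2182
ls /
get /demo    # should return "hello": the write was replicated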
And that is all it takes to set up the pseudo-distributed cluster. The real focus is the fully distributed setup, which follows.
(2.2.1) Basic setup (needed on all three machines)
su                              # password: vagrant
cd ~
vi /etc/ssh/sshd_config
sudo systemctl restart sshd
vi /etc/resolv.conf             # set the nameserver to 8.8.8.8
service network restart
(2.2.2) JDK installation (needed on all three machines)
The script is in my source repo.
vi pro.sh
sh pro.sh
(2.2.3) Download ZooKeeper (needed on all three machines)
As above: the latest release is already 3.5.4, but I am still using 3.4.10.
Why three machines? ZooKeeper needs a majority (quorum) of servers alive to keep serving. With 3 servers the quorum is 2, so the ensemble tolerates 1 failure; with 4 servers the quorum is 3, which still only tolerates 1 failure. An even-sized ensemble adds cost without adding fault tolerance, which is why odd sizes are preferred.
wget https://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
(2.2.4) Extract ZooKeeper
tar zxvf zookeeper-3.4.10.tar.gz
(2.2.5) Copy the cfg file (needed on all three machines)
cd ~
cd zookeeper-3.4.10/
cd conf
cp zoo_sample.cfg zoo.cfg
(2.2.6) Edit the cfg file. The configuration is identical on all three machines, so I will show it just once below.
vi zoo.cfg

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/tmp/zookeeper
dataLogDir=/tmp/zookeeper/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.0=192.168.69.101:2888:3888
server.1=192.168.69.102:2888:3888
server.2=192.168.69.103:2888:3888
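One pitfall worth hedging against: depending on how the CentOS 7 boxes are provisioned, firewalld may block the client and peer ports, and the three machines will not find each other. Opening the ports (or stopping firewalld for a quick test) looks like this:

# Run on each machine; alternatively: systemctl stop firewalld (test only)
firewall-cmd --permanent --add-port=2181/tcp   # client port
firewall-cmd --permanent --add-port=2888/tcp   # leader-follower port
firewall-cmd --permanent --add-port=3888/tcp   # election port
firewall-cmd --reload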
(2.2.7) Configure myid
The myid file must be created under the dataDir configured above, i.e. /tmp/zookeeper.
cd /
cd tmp
mkdir zookeeper
cd zookeeper
(2.2.7.1) Configure myid on 192.168.69.101
echo '0' > myid
cat myid
(2.2.7.2) Configure myid on 192.168.69.102
echo '1' > myid
cat myid
(2.2.7.3) Configure myid on 192.168.69.103
echo '2' > myid
cat myid
(2.2.8) Start ZooKeeper on all three virtual machines
cd ~/zookeeper-3.4.10/bin
sh zkServer.sh start
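With all three started, each machine can report its role; since zoo.cfg is the default config name here, status needs no argument:

# Run on each of the three machines
sh zkServer.sh status
# Expected: "Mode: leader" on one machine, "Mode: follower" on the other two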
(3) Concepts Review
Parameter        Meaning
tickTime         Basic time unit in milliseconds (2000 = one tick is 2s); heartbeats are sent on this interval
initLimit        Max time for followers to connect and sync with the leader at startup: initLimit * tickTime
syncLimit        Max delay for leader-follower communication (request to acknowledgement): syncLimit * tickTime
dataDir          Data (snapshot) directory
dataLogDir       Transaction log directory
clientPort       Port the server listens on for client connections
server.A=B:C:D   A: server id (matches myid); B: server IP; C: leader-follower communication port; D: leader election port
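Decoding one real line from the distributed config above against that last row:

# server.A=B:C:D, applied to a line from zoo.cfg
server.1=192.168.69.102:2888:3888
# A=1: the server whose myid file contains 1
# B=192.168.69.102: that server's IP address
# C=2888: port for follower <-> leader communication
# D=3888: port used during leader election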
Leader:
As the master node of the ZooKeeper ensemble, the leader handles every request that changes ZooKeeper state. It sequences and numbers each state-update request to guarantee FIFO processing of messages across the cluster; all writes go through the leader.
Follower:
The follower's logic is simpler. Besides serving the read requests sent to it, a follower processes the leader's proposals and commits them locally when the leader commits. Note also that the leader and followers together form the ZooKeeper cluster's quorum: only they take part in electing a new leader and in acknowledging the leader's proposals.
Observer:
If the cluster's read load is very high, or clients are spread across data centers, you can add observer servers to raise read throughput. Observers resemble followers, with a few differences: first, an observer is not part of the quorum, so it neither votes in elections nor acknowledges proposals; second, an observer does not persist transactions to disk, so once an observer is restarted it must resync the entire namespace from the leader.
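For completeness, a sketch of how an observer would be declared, following the ZooKeeper admin docs (the fourth machine below is hypothetical and not part of this walkthrough):

# In the observer machine's own zoo.cfg:
peerType=observer
# And in every server's zoo.cfg, tag the observer's line:
server.3=192.168.69.104:2888:3888:observer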
ZooKeeper is a service composed of multiple servers:
1. The cluster has one leader and multiple followers.
2. Every server keeps a replica of the data.
3. Global data consistency: reads can be served by any server (followers included), while writes are carried out by the leader; update requests are forwarded to the leader, applied strictly in order, and updates from the same client execute in the order they were sent.
4. Atomic updates: a data update either succeeds or fails as a whole.
5. Single system image: whichever server a client connects to, the data view is consistent; and timeliness: within a bounded time window, a client can read the latest data.
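One way to see the atomicity of item 4 in action is zkCli's conditional set, which takes an expected version number (/demo2 is a placeholder path):

# In any zkCli session
create /demo2 v1     # a new znode starts at dataVersion 0
set /demo2 v2 0      # succeeds: expected version 0 matches, version becomes 1
set /demo2 v3 0      # rejected as a whole: version is now 1, nothing is written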
PS: This post focused on ZooKeeper's principles and cluster deployment without going deep into the internals; next time I will talk about actually using ZooKeeper.