Setting up an Apache ZooKeeper cluster
What is ZooKeeper (very rough)
Apache ZooKeeper stores its data in a hierarchical, tree-like structure. Each node in the tree is called a znode.
https://de.slideshare.net/sauravhaloi/introduction-to-apache-zookeeper
Install ZooKeeper
My environment
- Ubuntu 20.04
Pre-installation
Install Java first.
sudo apt update
sudo apt upgrade -y
sudo apt install -y openjdk-11-jdk
Create a ZooKeeper user.
sudo useradd -r -s /bin/bash zk
Install ZooKeeper on nodes
Download link https://zookeeper.apache.org/releases.html
sudo su
cd /opt
curl https://mirror.netcologne.de/apache.org/zookeeper/zookeeper-3.6.2/apache-zookeeper-3.6.2-bin.tar.gz -O
tar xvf apache-zookeeper-3.6.2-bin.tar.gz
mkdir /var/lib/zookeeper
cd apache-zookeeper-3.6.2-bin
cp conf/zoo_sample.cfg conf/zoo.cfg
Edit conf/zoo.cfg as follows.
# Edit dataDir like
dataDir=/var/lib/zookeeper
# Add the following line
4lw.commands.whitelist=mntr,conf,ruok
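If you prefer not to edit the file by hand, the two changes above can be scripted. This is only a sketch: configure_zoo_cfg is a helper name I made up, and it assumes GNU sed (the default on Ubuntu) and that zoo.cfg already contains a dataDir line, as zoo_sample.cfg does.

```shell
# Sketch: set dataDir and whitelist the four-letter-word commands in zoo.cfg.
# configure_zoo_cfg is a hypothetical helper name, not part of ZooKeeper.
configure_zoo_cfg() {
    cfg="$1"        # path to zoo.cfg
    data_dir="$2"   # e.g. /var/lib/zookeeper

    # Replace the existing dataDir line (assumes one is already present).
    sed -i "s|^dataDir=.*|dataDir=${data_dir}|" "$cfg"

    # Append the whitelist only if it is not there yet, so reruns are safe.
    if ! grep -q '^4lw\.commands\.whitelist=' "$cfg"; then
        echo '4lw.commands.whitelist=mntr,conf,ruok' >> "$cfg"
    fi
}
```

Usage would be something like configure_zoo_cfg conf/zoo.cfg /var/lib/zookeeper.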
Open TCP port 2181 for ZooKeeper clients, e.g. SolrCloud, Kafka, etc.
Run ZooKeeper with ./bin/zkServer.sh start.
If it works fine, stop it with ./bin/zkServer.sh stop.
## Start
/opt/apache-zookeeper-3.6.2-bin# ./bin/zkServer.sh start
## Check log
/opt/apache-zookeeper-3.6.2-bin# tail -f logs/zookeeper-root-server-{{ your_servers_hostname }}.out
## Stop
/opt/apache-zookeeper-3.6.2-bin# ./bin/zkServer.sh stop
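Besides reading the log, you can ask a running server whether it is healthy with the ruok four-letter command, since ruok is in the whitelist set in zoo.cfg. A rough sketch, assuming nc (netcat) is installed; zk_is_ok is a hypothetical helper name.

```shell
# Sketch: health check via the "ruok" four-letter command.
# zk_is_ok is a hypothetical helper; it assumes nc (netcat) is installed
# and that ruok is listed in 4lw.commands.whitelist.
zk_is_ok() {
    host="${1:-localhost}"
    port="${2:-2181}"
    # A healthy server answers "ruok" with the string "imok".
    reply=$(echo ruok | nc -w 2 "$host" "$port")
    [ "$reply" = "imok" ]
}
```

For example, zk_is_ok localhost 2181 && echo "ZooKeeper is up" while the server is started.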
Change ownership
chown -R zk:zk /opt/apache-zookeeper-3.6.2-bin/
chown -R zk:zk /var/lib/zookeeper
Add to systemd
# cat /etc/systemd/system/zk.service
[Unit]
Description=Zookeeper Daemon
Documentation=http://zookeeper.apache.org
Requires=network.target
After=network.target
[Service]
Type=forking
WorkingDirectory=/opt/apache-zookeeper-3.6.2-bin
User=zk
Group=zk
ExecStart=/opt/apache-zookeeper-3.6.2-bin/bin/zkServer.sh start /opt/apache-zookeeper-3.6.2-bin/conf/zoo.cfg
ExecStop=/opt/apache-zookeeper-3.6.2-bin/bin/zkServer.sh stop /opt/apache-zookeeper-3.6.2-bin/conf/zoo.cfg
ExecReload=/opt/apache-zookeeper-3.6.2-bin/bin/zkServer.sh restart /opt/apache-zookeeper-3.6.2-bin/conf/zoo.cfg
TimeoutSec=30
Restart=on-failure
SuccessExitStatus=143
[Install]
WantedBy=default.target
Now ZooKeeper can be managed with systemctl, e.g. systemctl start zk.
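After creating the unit file, systemd has to reload its configuration before it sees the new service. The steps I would expect to run (as root) look roughly like this; enable_zk_service is just a hypothetical wrapper around standard systemctl subcommands.

```shell
# Sketch: register and start the zk unit. Run as root.
# enable_zk_service is a hypothetical wrapper name.
enable_zk_service() {
    unit="${1:-zk}"
    # Pick up the new unit file.
    systemctl daemon-reload || return 1
    # Start the service at boot.
    systemctl enable "$unit" || return 1
    # Start the service now.
    systemctl start "$unit" || return 1
    # Prints "active" and returns 0 when the service is running.
    systemctl is-active "$unit"
}
```

Calling enable_zk_service zk runs the four steps in order and stops at the first one that fails.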
ZooKeeper Cluster
Add the following lines to the conf/zoo.cfg file on every node.
maxClientCnxns=60
initLimit=10
syncLimit=5
server.1=your_zookeeper_node_1:2888:3888
server.2=your_zookeeper_node_2:2888:3888
server.3=your_zookeeper_node_3:2888:3888
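Since the server.N lines must be identical on every node, it can help to generate them from a list of hostnames instead of typing them three times. A minimal sketch; append_ensemble is a hypothetical helper, and the hostnames are placeholders like the ones above.

```shell
# Sketch: append server.N entries to zoo.cfg.
# append_ensemble is a hypothetical helper: the first argument is zoo.cfg,
# the remaining arguments are the ensemble hostnames in server-id order.
append_ensemble() {
    cfg="$1"
    shift
    id=1
    for host in "$@"; do
        # 2888: followers connect to the leader, 3888: leader election
        echo "server.${id}=${host}:2888:3888" >> "$cfg"
        id=$((id + 1))
    done
}
```

For example: append_ensemble conf/zoo.cfg your_zookeeper_node_1 your_zookeeper_node_2 your_zookeeper_node_3. Note that it appends blindly, so run it once per file.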
Quote from reference.
ZooKeeper nodes use a pair of ports, :2888 and :3888, for follower nodes to connect to the leader node and for leader election, respectively.
Open TCP ports 2888 and 3888 between all ZooKeeper cluster nodes (both inbound and outbound).
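How you open the ports depends on your firewall. Assuming ufw (my assumption, it is commonly used on Ubuntu but the setup above does not require it), the inbound rules for the peer nodes could be generated like this. print_peer_rules is a hypothetical helper that only prints the commands so you can review them before running them as root; ufw expects IP addresses here rather than hostnames.

```shell
# Sketch: print ufw rules that allow the quorum ports from each peer.
# print_peer_rules is a hypothetical helper; using ufw is an assumption.
# Arguments are the IP addresses of the other ZooKeeper nodes.
print_peer_rules() {
    for peer in "$@"; do
        for port in 2888 3888; do
            echo "ufw allow from ${peer} to any port ${port} proto tcp"
        done
    done
}
```

Outbound traffic is usually allowed by ufw's default policy, so only inbound rules are printed; check your own policy with ufw status verbose.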
Create myid file under dataDir, in my case /var/lib/zookeeper.
$ cat /var/lib/zookeeper/myid
1
echo "1" > /var/lib/zookeeper/myid && chown zk:zk /var/lib/zookeeper/myid
On the second and third nodes, set the value in /var/lib/zookeeper/myid to 2 and 3 respectively. The number must match the N of that node's server.N line in zoo.cfg.
Start ZooKeeper on all nodes.
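A wrong myid is an easy mistake to make when setting up several nodes by hand. As a sketch, the id can be looked up from zoo.cfg by hostname instead. myid_for_host is a hypothetical helper, and it assumes the hostname you pass is spelled exactly as in the server.N line.

```shell
# Sketch: find the server id for a host from the server.N lines in zoo.cfg.
# myid_for_host is a hypothetical helper: prints N for the line
# server.N=<host>:2888:3888, or returns non-zero if no line matches.
myid_for_host() {
    cfg="$1"
    host="$2"
    awk -F= -v h="$host" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, parts, ":")          # parts[1] is the hostname
            if (parts[1] == h) {
                sub(/^server\./, "", $1)   # keep only the numeric id
                print $1
                found = 1
                exit
            }
        }
        END { exit !found }
    ' "$cfg"
}
```

On each node you could then run something like id=$(myid_for_host conf/zoo.cfg "$(hostname)") && echo "$id" > /var/lib/zookeeper/myid, which leaves the file alone when the host is not found.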
systemctl start zk
Check it.
@server3
bin/zkCli.sh -server {{ your_zookeeper_node_1 }}:2181
[zk: node1:2181(CONNECTED) 0] create /zk_znode_1 sample_data
Created /zk_znode_1
[zk: node1:2181(CONNECTED) 1] ls /
[zk_znode_1, zookeeper]
[zk: node1:2181(CONNECTED) 2] get /zk_znode_1
sample_data
The znode is replicated across the ensemble, so you can read it from the other nodes too!
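To see which node ended up as leader, the mntr four-letter command (whitelisted in zoo.cfg above) reports a zk_server_state field. A rough sketch, again assuming nc is installed; zk_server_state as a function name is my own.

```shell
# Sketch: print the role of a ZooKeeper server using the "mntr" command.
# zk_server_state is a hypothetical helper; it assumes nc (netcat) is
# installed and that mntr is listed in 4lw.commands.whitelist.
zk_server_state() {
    host="$1"
    port="${2:-2181}"
    # mntr prints "key<TAB>value" lines; pick the zk_server_state value.
    echo mntr | nc -w 2 "$host" "$port" |
        awk '$1 == "zk_server_state" { print $2 }'
}
```

Run against each of the three nodes, a healthy ensemble should report one leader and two followers. ./bin/zkServer.sh status on each node gives the same information as a Mode: line.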
Notes
Good reference.
memo
https://www.corejavaguru.com/blog/bigdata/why-zookeeper-on-odd-number-nodes
- ZooKeeper uses a client-server model.
- server:client = 1:many
- server: a single ZK -> multiple ZKs (an ensemble)
- One of the servers is the leader.
- The purpose of the leader is to order client requests that change the ZooKeeper state: create, setData, and delete.
- A client sends pings to the server it is connected to -> if it gets no ack, it connects to another server.
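- Clients do not need a connection to the leader: a write sent to a follower is forwarded to the leader. (By default the leader also accepts client connections; setting leaderServes=no turns that off.)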
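On the odd-number question from the link above: an ensemble stays available as long as a majority of its servers is up, so the arithmetic can be sketched as below. The function names are my own.

```shell
# Sketch: majority-quorum arithmetic for an ensemble of N servers.
# quorum_size and tolerated_failures are hypothetical helper names.
quorum_size() {
    # Smallest majority of N servers (integer division).
    echo $(( $1 / 2 + 1 ))
}

tolerated_failures() {
    # Servers that can be down while a majority is still up.
    echo $(( $1 - ($1 / 2 + 1) ))
}
```

With 3 servers one failure is tolerated, with 4 servers still only one, and with 5 servers two. Adding a fourth node buys no extra fault tolerance, which is the usual argument for odd-sized ensembles.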