
Saturday, August 25, 2018

Important points to consider while creating a ZooKeeper and Kafka cluster

Kafka is a highly scalable and extremely fast queuing service that really comes in handy when you have to handle a large volume of messages and build services that work in async mode. It lets you tolerate faults in those services without losing data, while keeping the system scalable enough to meet the ever-growing volume of messages pushed to the cluster.

Following are some of the important points to consider while creating a highly available Kafka and ZooKeeper cluster:

1. If you want to scale your Kafka nodes, consider keeping ZooKeeper on separate nodes. This is particularly useful for environments where Kafka message throughput is extremely large and more brokers will be required after a certain period of time, so that the cluster stays fault tolerant while remaining scalable.

Kafka in itself is a very scalable solution, and if you are not receiving data in TBs you can consider keeping Kafka and ZooKeeper on the same nodes, which helps save on instance costs if you are running in the cloud. So the best approach depends on the end use of Kafka: how many messages it will handle over a period of time and the overall scalability required.

2. The ZooKeeper nodes save the overall state of the Kafka cluster, so your ZooKeeper cluster can run on smaller infrastructure than the Kafka cluster. Use bigger instance sizes and more disk space for the Kafka cluster than for the ZooKeeper cluster.

3. ZooKeeper forms a quorum between the ZooKeeper nodes, and a healthy quorum is an indication that ZooKeeper has been installed correctly. You can verify this with zkCli.sh; if there is an issue in the cluster you will also face errors while accessing zkCli.sh under the binaries directory.
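
For example, a quick health check on each node (this sketch assumes the default client port 2181 and that ZooKeeper's four-letter-word commands are enabled):

    # Ask a ZooKeeper node for its quorum role
    echo stat | nc localhost 2181 | grep Mode
    # "Mode: leader" or "Mode: follower" means the node is part of a healthy
    # quorum; "Mode: standalone" means no quorum has been formed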

4. If you want to remove Kafka cluster, topic, or consumer details from ZooKeeper, you can use zkCli.sh to remove them completely, which is particularly fast. You can use the Kafka CLI for the same task as well, but it takes more time.
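
A minimal sketch of the zkCli.sh approach; "my-topic" and "my-group" are placeholder names, and the paths are Kafka's standard znodes:

    # Open an interactive shell against ZooKeeper
    zkCli.sh -server localhost:2181
    # Then, inside the zkCli shell:
    rmr /brokers/topics/my-topic
    rmr /consumers/my-group
    # Newer ZooKeeper versions replace rmr with deleteall; this removes only
    # the metadata in ZooKeeper, not the log segments on the brokers' disks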

5. While creating the Kafka or ZooKeeper cluster, always remember to assign proper values to the JVM memory -Xms and -Xmx settings.
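
A sketch of heap sizing through the standard KAFKA_HEAP_OPTS variable; 4g is an illustrative value, so size it to your workload:

    # Set -Xms and -Xmx equal to avoid heap resizing pauses
    export KAFKA_HEAP_OPTS="-Xms4g -Xmx4g"
    bin/kafka-server-start.sh config/server.properties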

6. You can optionally use Exhibitor for managing ZooKeeper; it provides a UI and can automatically roll the instances in the ZooKeeper cluster, but it is entirely optional.

7. While configuring Kafka, check for proper connectivity between the ZooKeeper and Kafka nodes, and assign adequate resources (CPU and RAM) to the Kafka cluster to make it work more efficiently.
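
A minimal server.properties sketch for the ZooKeeper connectivity side; the hostnames are placeholders:

    # Point every broker at the full ZooKeeper ensemble
    zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
    # Fail fast if ZooKeeper cannot be reached
    zookeeper.connection.timeout.ms=6000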

8. Do not use a private DNS name in advertised.host.name; this can lead to significant issues when someone outside your network tries to connect to your Kafka cluster through a network tunnel.
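
A sketch of the setting in server.properties; broker1.example.com is a placeholder for a name that external clients can actually resolve:

    # Advertise a name reachable by external clients, not a private DNS entry
    advertised.host.name=broker1.example.com
    # On newer broker versions the equivalent setting is advertised.listeners:
    # advertised.listeners=PLAINTEXT://broker1.example.com:9092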

9. Consider allocating a separate disk for the Kafka logs and topic data that Kafka uses.
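
For example, in server.properties (/data/kafka is a placeholder for a dedicated mount point):

    # Keep Kafka's log segments on their own disk, away from the OS disk
    log.dirs=/data/kafka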

10. By default Kafka does not compress data, which increases the data size. Compression should be enabled for all data generated by the producer; the default compression.type is none, so use gzip (or another supported codec) for compression in Kafka.
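
A sketch of enabling it on the producer side, in the producer's properties:

    # Compress batches with gzip instead of the default "none"
    compression.type=gzip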

11. Kafka is heavily optimized for smaller messages as compared to bigger messages. So try to keep messages small; your Kafka cluster can then take significantly more messages and consumers can process them faster.

12. Adjust the max.message.bytes setting, which controls the largest message size a topic will accept; you will also have to adjust the consumer's fetch size in order to consume such messages.
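
A sketch using the standard kafka-configs.sh tool; "big-topic", the ZooKeeper address, and the 2 MB value are placeholders:

    # Raise the largest accepted message size for one topic to 2 MB
    bin/kafka-configs.sh --zookeeper zk1:2181 --alter \
      --entity-type topics --entity-name big-topic \
      --add-config max.message.bytes=2097152
    # Consumers must raise max.partition.fetch.bytes to at least the same
    # value, or large messages will never be fetched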

13. Choose the number of partitions for a topic carefully. Partitions provide the parallelism in Kafka: the more partitions, the more data can be read from Kafka in parallel by consumers, but this in turn increases the number of threads or consumers required to read the data and demands more processing power.
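
For example, with the standard kafka-topics.sh tool; the topic name and the counts are placeholders to adjust for your throughput:

    # 12 partitions allow up to 12 consumers in one group to read in parallel
    bin/kafka-topics.sh --zookeeper zk1:2181 --create \
      --topic orders --partitions 12 --replication-factor 3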

14. If you enable auto.commit, consider setting a longer interval for auto.commit.interval.ms, which defaults to 5000 ms.
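
A sketch of the consumer configuration; 30000 ms is an illustrative interval:

    # Commit offsets automatically, but less frequently than the default
    enable.auto.commit=true
    auto.commit.interval.ms=30000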

15. Changes to the consumers in a group cause the Kafka topics to rebalance; during that time the consumers don't read from the topics, which can result in increased consumer lag.

16. Consider pushing per-topic lag metrics to your time series database and plotting a graph on them. Further, you can put an alert on the metric so that if its value grows above a certain threshold an alarm is sent, keeping a check on the latency in the Kafka cluster.
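
One way to obtain the lag numbers is the standard kafka-consumer-groups.sh tool; "my-group" and the broker address are placeholders:

    # Print current offset, log end offset and LAG for every partition
    bin/kafka-consumer-groups.sh --bootstrap-server broker1:9092 \
      --describe --group my-group
    # The LAG column can be scraped periodically and pushed to the TSDB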

17. If you restart the Kafka broker service on a node, it will try to revalidate the offsets of its topics on startup, which can take a significant amount of time.

18. Kafka can create a topic automatically if it does not exist and a producer pushes to it. Although this might be useful in some scenarios, it mostly has downsides: due to improper configuration, data can be pushed to the wrong topic, or to a topic that has not been created properly with the right number of partitions and other settings. It is better to turn this off unless you specifically require topics to be created automatically.
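
The switch lives in server.properties:

    # Require topics to be created explicitly with the right partition count
    auto.create.topics.enable=false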

19. Expose the JVM metrics on port 9999 and connect a metrics collector such as Datadog to collect those metrics and plot them in graphs. Kafka exposes a lot of metrics which can really come in handy for the monitoring setup of the Kafka cluster.
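
A sketch of exposing JMX before starting the broker; Kafka's startup scripts read the JMX_PORT variable:

    # Open the JMX endpoint on port 9999 for the metrics collector
    export JMX_PORT=9999
    bin/kafka-server-start.sh config/server.properties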
