In this tutorial, we will do a hands-on basics of Apache Kafka with producer and consumer from command prompt while also learn the concept.
For this tutorial, we will need the following tools :
- Apache Kafka
- Java 1.8
After downloading Apache Kafka, unzip, rename to “kafka” and put the folder in D:\ directory for easy access later on. Open the folder and look for bin folder. This folder contains all the kafka commands that we need, so we better put the bin folder to our Windows Environment Variables PATH for easier execution in command prompt.
Next, look for config directory and open zookeeper.properties. In this file, there is a configuration that we need to modify which is the location of data and log folder. So, go up to the kafka directory and make new folder called data and inside the folder, create new folder again named zookeeper. Modify zookeeper.properties to setup data directory.
Similarly, inside data folder, create kafka folder to store kafka logs and this time, modify server.properties
Kafka comes with Zookeeper and it uses Zookeeper to manage Kafka clusters. For more information on relationship between Zookeeper and Kafka, this blog has a good explanation. For now, keep in mind that Kafka cannot function without Zookeeper even if we only have one Kafka server.
Now, lets open command prompt and start Zookeeper first with command zookeeper-server-start.bat using zookeeper.properties we modified earlier. Zookeeper runs on port 2181, so make sure no application running on that port.
Next, we run Kafka. Open a new command prompt window and start Kafka. Kafka runs in port 9092.
Make sure the two command prompt windows are alive during the tutorial.
Now, we are going to test producing message to Kafka and consuming message from Kafka.
First, we need to create topic. We will use “comments” as topic.
kafka-topics --bootstrap-server localhost:9092 --create --topic comments --partitions 3 --replication-factor 1
Using the above command, we create inside our kafka server, a topic named comments with 3 partitions and the topic replicated to 1 broker/server only.
Because we have more than one partition, writing streams of data to our Kafka broker can be done parallelly, in our case, to 3 partitions. So, instead of waiting for one partition to complete saving our data, we can save to other partition parallelly. Technically, more partition meaning faster data can be saved parallelly. However, there are also drawbacks to many partition which explanation will not be covered in this tutorial.
Also, we have replication-factor to 1, meaning, each data is saved/replicated to 1 broker/server. Since we only have 1 running Kafka broker/server, we can only set replication-factor to 1. (Broker = Server. This is exactly the same. However, in Kafka terminology, we often use broker. Hence, i will use broker.)
Open new command prompt window and run the console-consumer. We can use Java Consumer API or other application to consume Kafka data, but in this case, we use console-consumer first.
kafka-console-consumer --bootstrap-server 127.0.0.1:9092 --topic comments
Nothing happens because we have not produce anything yet. So, open new command prompt window and produce a string “hello kafka beginner” to the comments topic.
kafka-console-producer --bootstrap-server localhost:9092 --topic comments
Each line after the command is representing a data sent to consumer. So, if we look at the console-consumer window, we will get those 3 lines printed in the console. Therefore, our console-producer successfully produce the data which then consumed by console-consumer.
Now that we see Kafka in action, we can learn in-depth of what we just did in next posts.