Saturday, November 22, 2014

Set up Kafka on a single machine running Ubuntu 14.04 LTS

Kafka is a messaging system that can act as a buffer and feeder for messages processed by Storm spouts. It can also be used as an output buffer for Storm bolts. This post shows how to set up and test Kafka on a single machine running Ubuntu.

First, download Kafka from the link below:

Next, extract the kafka_2.8.0- file with "tar -xvzf" and move it to a destination folder (say, the Documents/Works/Kafka folder under the user's home directory):

> tar -xvzf kafka_2.8.0-
> mkdir -p $HOME/Documents/Works/Kafka
> mv kafka_2.8.0- $HOME/Documents/Works/Kafka

Now go back to the user's home folder and open the .bashrc file for editing:

> cd $HOME
> gedit .bashrc

In the .bashrc file, add the following line to the end:

export KAFKA_HOME=$HOME/Documents/Works/Kafka/kafka_2.8.0-
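
A misspelled folder name in this export is easy to miss, so after reloading .bashrc it may help to confirm that the variable resolves to a real directory. The following is a minimal sketch; check_kafka_home is a hypothetical helper name and not something Kafka provides.

```shell
# Sketch: confirm KAFKA_HOME is set and points at an existing directory.
# check_kafka_home is a hypothetical helper, not part of Kafka.
check_kafka_home() {
  if [ -z "${KAFKA_HOME:-}" ]; then
    echo "KAFKA_HOME is not set"
    return 1
  fi
  if [ ! -d "$KAFKA_HOME" ]; then
    echo "KAFKA_HOME is not a directory: $KAFKA_HOME"
    return 1
  fi
  echo "KAFKA_HOME looks fine: $KAFKA_HOME"
}
```

Running check_kafka_home in a fresh terminal should print the resolved path if the export was picked up.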

Save and close the .bashrc and run "source .bashrc" to update the environment variables. Now navigate to the kafka home folder and edit the in its sub-directory "config":

> cd $KAFKA_HOME/config
> gedit

In the file, find the line containing "zookeeper.connect" and change it to the following:


Then find the line containing "log.dirs" and change it to the following:


Save and close the file ( and are the zookeeper nodes). Next, create the folder /var/kafka-logs (which will store the topic and partition data for Kafka) and give it write permissions:

> sudo mkdir /var/kafka-logs
> sudo chmod -R 777 /var/kafka-logs
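
Before starting the broker, it can be worth confirming that the two edited settings were saved and that the log folder is writable. The sketch below assumes the config file uses plain key=value lines with no spaces around the equals sign; get_prop and dir_writable are hypothetical helper names, and the file path is whatever file you edited above.

```shell
# get_prop FILE KEY: print the value of KEY from a key=value properties file.
# Hypothetical helper; matches the key exactly and ignores commented-out lines.
get_prop() {
  awk -F= -v k="$2" '$1 == k { print substr($0, length(k) + 2) }' "$1"
}

# dir_writable DIR: succeed only if DIR exists and the current user can write to it.
dir_writable() {
  [ -d "$1" ] && [ -w "$1" ]
}

# Example usage (replace the placeholder with the path of the file you edited):
#   get_prop /path/to/edited/file log.dirs
#   dir_writable /var/kafka-logs && echo "log folder is writable"
```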

Now set up and run the ZooKeeper cluster by following the instructions in the link. Once this is done, we are ready to start the Kafka messaging system by running the following command from the Kafka home folder ($KAFKA_HOME):

> bin/ config/
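
The broker keeps running in this terminal, so one way to confirm it came up is to probe its port from a second terminal. This is a rough sketch that assumes the broker listens on port 9092 (the port used by the producer command later in this post) and that bash and the coreutils timeout command are available; port_open is a hypothetical helper name.

```shell
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT can be opened.
# Uses bash's /dev/tcp pseudo-device; gives up after 2 seconds.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example usage:
#   port_open localhost 9092 && echo "broker is listening"
```

A failed probe only shows that nothing accepted the connection; the broker's own terminal output is the place to look for the reason.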

To start testing the Kafka setup, press Ctrl+Alt+T to open a new terminal. A topic is a named entity in Kafka that contains one or more partitions; partitions are message queues that can run in parallel, and each is serialized to its own folder under /var/kafka-logs. Run the following command to create a topic named "verification-topic":

> bin/ --create --zookeeper --topic verification-topic --partitions 1 --replication-factor 1

The above command creates a topic named "verification-topic" which contains 1 partition with a replication factor of 1 (that is, no replication).
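
Kafka names each partition folder after the topic and the zero-based partition number. The sketch below prints the folders one would expect to find for a topic; partition_dirs is a hypothetical helper, and the arguments are the log folder, the topic name and the partition count used in this post.

```shell
# partition_dirs LOG_DIR TOPIC PARTITIONS: print the expected folder for each partition.
# Folders follow the <topic>-<partition number> naming, counting from 0.
partition_dirs() {
  i=0
  while [ "$i" -lt "$3" ]; do
    echo "$1/$2-$i"
    i=$((i + 1))
  done
}

partition_dirs /var/kafka-logs verification-topic 1
# prints /var/kafka-logs/verification-topic-0
```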

Now we can check the list of topics in kafka by running the following command:

> bin/ --zookeeper --list
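
To check for one topic without reading the whole list by eye, the output of the list command can be piped through a small filter. This sketch assumes the list command prints one topic name per line; has_topic is a hypothetical helper that reads that output on standard input.

```shell
# has_topic NAME: succeed if standard input contains a line that is exactly NAME.
has_topic() {
  grep -Fxq -- "$1"
}

# Example usage: pipe the output of the list command above into the helper.
#   <list command> | has_topic verification-topic && echo "topic exists"
```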

To test the producer and consumer interaction in Kafka, start the console producer by running:

> bin/ --broker-list localhost:9092 --topic verification-topic

9092 is the default port for a Kafka broker node (which is localhost at the moment). The terminal now enters interactive mode. Open another terminal and run the console consumer:

> bin/ --zookeeper --topic verification-topic

Now enter some data in the console producer terminal and you should see it displayed immediately in the console consumer terminal.