Thursday, November 20, 2014

Set up Storm on a single machine running Ubuntu 14.04 LTS

This post explains how to set up Storm on a single machine running Ubuntu 14.04 LTS. First, download storm-0.9.0.1 from the following link:

https://dl.dropboxusercontent.com/s/tqdpoif32gufapo/storm-0.9.0.1.tar.gz

Next, extract the downloaded gz file with "tar -xvzf", which produces the folder storm-0.9.0.1. Let's suppose its full path is
$HOME/Documents/Works/Storm/storm-0.9.0.1
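
The extraction step can be sketched as a small shell function. This is only a sketch: the function name extract_storm is made up for illustration, and the $HOME/Downloads location in the usage comment is an assumption about where the browser saved the file.

```shell
# Extract a Storm tarball into a target directory (sketch).
# $1: path to the downloaded .tar.gz, $2: directory to extract into.
extract_storm() {
    tarball="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    # -x extract, -z gunzip, -f archive file, -C change into $dest first
    tar -xzf "$tarball" -C "$dest"
}

# Usage (the Downloads path is an assumption, adjust it to your machine):
# extract_storm "$HOME/Downloads/storm-0.9.0.1.tar.gz" "$HOME/Documents/Works/Storm"
```

Extracting with -C avoids having to cd into the destination folder first.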

Now cd to the $HOME directory and open the .bashrc file for editing by running the following commands:
> cd $HOME
> gedit .bashrc

In the .bashrc file, add the following line to the end:
export STORM_HOME=$HOME/Documents/Works/Storm/storm-0.9.0.1

Save and close the .bashrc file, then run the following command to reload it so that the new environment variable takes effect in the current terminal:
> source .bashrc
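
If you prefer not to edit .bashrc by hand, the same change can be made from the terminal. Below is a minimal sketch; the function name add_storm_home is hypothetical, and it only appends the line when it is not already present, so running it twice is harmless.

```shell
# Append the STORM_HOME export to a shell rc file only if it is not there yet.
# $1: rc file (e.g. "$HOME/.bashrc"), $2: Storm install directory.
add_storm_home() {
    rc_file="$1"
    storm_dir="$2"
    line="export STORM_HOME=$storm_dir"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if ! grep -qxF "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}

# Usage (single quotes keep $HOME literal, exactly as in the line above):
# add_storm_home "$HOME/.bashrc" '$HOME/Documents/Works/Storm/storm-0.9.0.1'
```

After sourcing .bashrc, "echo $STORM_HOME" should print the full path of the Storm folder.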

Now navigate to Storm's home directory and edit the storm.yaml file in its conf subdirectory:
> cd $STORM_HOME
> cd conf
> gedit storm.yaml

Replace the content of storm.yaml with the following lines:

storm.zookeeper.servers:
  - "127.0.0.1"
storm.zookeeper.port: 2181
nimbus.host: "127.0.0.1"
java.library.path: "/usr/local/lib"
storm.local.dir: "/tmp/storm-data"
storm.messaging.transport: backtype.storm.messaging.netty.Context
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703

Save and close the storm.yaml file. Below is a brief explanation of the main settings in the storm.yaml file:

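storm.zookeeper.servers: the IP addresses of the ZooKeeper servers that maintain the state of the Storm cluster
storm.zookeeper.port: the port on which the ZooKeeper servers listen for client connections
nimbus.host: the IP address of the Nimbus node in the Storm cluster
storm.local.dir: the local directory in which the Nimbus and supervisor daemons keep their working state
storm.messaging.transport: the transport used for messages between worker processes, here Netty
supervisor.slots.ports: the ports on which the worker processes (each of which is a JVM) on a supervisor node listen and communicate. Each port provides one worker slot, so the four ports above allow up to four workers on this machine.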
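
Since each entry under supervisor.slots.ports is one worker slot, it can be handy to read the list back from the file. Here is a rough sketch using awk; the function name list_slot_ports is made up, and it assumes the simple one-port-per-line layout shown above rather than parsing YAML in general.

```shell
# Print the worker ports listed under supervisor.slots.ports in a storm.yaml.
# $1: path to storm.yaml. Assumes the simple block-list layout used in this post.
list_slot_ports() {
    awk '
        /^supervisor\.slots\.ports:/ { in_list = 1; next }   # the list starts here
        in_list && /^[ \t]*-[ \t]*[0-9]+/ {                  # "  - 6700" items
            gsub(/[^0-9]/, "", $0); print; next
        }
        in_list && /^[^ \t#]/ { in_list = 0 }                # next top-level key ends the list
    ' "$1"
}

# Usage:
# list_slot_ports "$STORM_HOME/conf/storm.yaml"
```

Piping the output to "wc -l" gives the number of worker slots the supervisor will offer.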

Now run the following commands to start the Nimbus daemon. Nimbus needs the ZooKeeper server configured above (127.0.0.1, port 2181) to be running already:
> cd $STORM_HOME
> bin/storm nimbus

Now press Ctrl+Alt+T to open another terminal and start a supervisor daemon:
> cd $STORM_HOME
> bin/storm supervisor
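
As an alternative to keeping one terminal open per daemon, both can be started in the background. The following is a sketch only: the function name start_storm_daemon, the log directory, and the .out/.pid file names are my own choices, not something Storm prescribes.

```shell
# Start one Storm daemon in the background and record its PID (sketch).
# $1: Storm home, $2: daemon name ("nimbus" or "supervisor"), $3: log directory.
start_storm_daemon() {
    storm_home="$1"
    daemon="$2"
    log_dir="$3"
    mkdir -p "$log_dir" || return 1
    # nohup keeps the daemon alive after the terminal closes;
    # stdout and stderr go to one file per daemon.
    nohup "$storm_home/bin/storm" "$daemon" > "$log_dir/$daemon.out" 2>&1 &
    # $! is the PID of the background job that was just started.
    echo "$!" > "$log_dir/$daemon.pid"
}

# Usage (the directory name is an assumption):
# start_storm_daemon "$STORM_HOME" nimbus "$HOME/storm-run"
# start_storm_daemon "$STORM_HOME" supervisor "$HOME/storm-run"
```

The saved PID files make it easy to stop a daemon later with "kill" followed by the recorded PID.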

To submit a jar containing a Storm topology, run the following command:
> cd $STORM_HOME
> bin/storm jar [jarName] [mainClassName] [mainClassArguments]

Here [jarName] is the full path of the jar containing all the classes and dependencies, [mainClassName] is the fully qualified name of the class containing the main() method (which builds and submits the topology), and [mainClassArguments] are the arguments passed to that main() method.
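
To make the three placeholders concrete, here is a small wrapper sketch that checks the jar path before calling "storm jar". The function name submit_topology, the jar name, and the class name in the usage comment are all hypothetical.

```shell
# Submit a topology jar, failing early if the jar path is wrong (sketch).
# $1: Storm home, $2: jar path, $3: main class, remaining args: passed to main().
submit_topology() {
    storm_home="$1"
    jar="$2"
    main_class="$3"
    shift 3
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    # "$@" now holds only the arguments meant for the main() method.
    "$storm_home/bin/storm" jar "$jar" "$main_class" "$@"
}

# Usage (jar, class, and argument are made-up examples):
# submit_topology "$STORM_HOME" "$HOME/wordcount.jar" com.example.WordCountTopology wordcount
```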

To see the running Java processes, which should include the Storm daemons started above:

> jps
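
jps prints one line per JVM in the form "<pid> <name>". A short filter, sketched below, pulls out the PID for a given name. The function name pids_for is made up, and the assumption that the daemons are listed as "nimbus" and "supervisor" should be checked against your own jps output.

```shell
# Read jps-style lines ("<pid> <name>") on stdin and print the PIDs
# whose name equals $1.
pids_for() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Usage (process names are an assumption, check your jps output):
# jps | pids_for nimbus
# jps | pids_for supervisor
```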

To activate or deactivate a topology in Storm:
> cd $STORM_HOME
> bin/storm activate [topologyName]
> bin/storm deactivate [topologyName]

To kill a topology in Storm:
> cd $STORM_HOME
> bin/storm kill [topologyName]
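
The kill command also accepts a -w option giving the number of seconds Storm waits between deactivating the topology and shutting its workers down. A minimal wrapper sketch follows; the function name kill_topology and the 30 second default are my own choices.

```shell
# Kill a topology, giving it a grace period first (sketch).
# $1: Storm home, $2: topology name, $3: wait time in seconds (default 30 here).
kill_topology() {
    storm_home="$1"
    name="$2"
    wait_secs="${3:-30}"
    # -w sets how long Storm waits after deactivating the topology
    # before it shuts the workers down.
    "$storm_home/bin/storm" kill "$name" -w "$wait_secs"
}

# Usage (the topology name is a made-up example):
# kill_topology "$STORM_HOME" wordcount 10
```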



