This post shows a very basic example of how to use the pre-trained Twitter sentiment classifier in Trident-ML to classify the sentiment of a piece of text. The classifier returns true (positive) or false (negative).
First, create a Maven project (e.g. with groupId="com.memeanalytics" and artifactId="trident-sentiment-classifier"). The complete source code of the project can be downloaded from the following link:
https://dl.dropboxusercontent.com/u/113201788/storm/trident-sentiment-classifier.tar.gz
To start, we need to configure the pom.xml file in the project.
Configure pom.xml:
First, add the clojars repository to the repositories section:
<repositories>
  <repository>
    <id>clojars</id>
    <url>http://clojars.org/repo</url>
  </repository>
</repositories>
Next, add the storm dependency to the dependencies section:
<dependency>
  <groupId>storm</groupId>
  <artifactId>storm</artifactId>
  <version>0.9.0.1</version>
  <scope>provided</scope>
</dependency>
Next, add the trident-ml dependency to the dependencies section (for text classification):
<dependency>
  <groupId>com.github.pmerienne</groupId>
  <artifactId>trident-ml</artifactId>
  <version>0.0.4</version>
</dependency>
Next, add the exec-maven-plugin to the build/plugins section (for executing the Maven project):
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.2.1</version>
  <executions>
    <execution>
      <goals>
        <goal>exec</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <includeProjectDependencies>true</includeProjectDependencies>
    <includePluginDependencies>false</includePluginDependencies>
    <executable>java</executable>
    <classpathScope>compile</classpathScope>
    <mainClass>com.memeanalytics.trident_sentiment_classifier.App</mainClass>
  </configuration>
</plugin>
Next, add the maven-assembly-plugin to the build/plugins section (for packaging the Maven project into a jar that can be submitted to a Storm cluster):
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>2.2.1</version>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
    <archive>
      <manifest>
        <mainClass></mainClass>
      </manifest>
    </archive>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
Sentiment Classification in Trident topology using Trident-ML implementation
Once the pom.xml update is completed, we can build a Trident topology that uses Trident-ML's TwitterSentimentClassifier in a DRPCStream to classify the sentiment of a text. This is implemented in the main class shown below:

package com.memeanalytics.trident_sentiment_classifier;

import com.github.pmerienne.trident.ml.nlp.TwitterSentimentClassifier;

import storm.trident.TridentTopology;
import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.LocalDRPC;
import backtype.storm.generated.StormTopology;
import backtype.storm.tuple.Fields;

public class App {
    public static void main(String[] args) {
        LocalDRPC drpc = new LocalDRPC();
        LocalCluster cluster = new LocalCluster();
        Config config = new Config();

        cluster.submitTopology("SentimentClassifierDemo", config, buildTopology(drpc));

        // Give the local topology a moment to start before querying it.
        try {
            Thread.sleep(2000);
        } catch (InterruptedException ex) {
            ex.printStackTrace();
        }

        // Each call sends one text to the "classify" DRPC function and prints the result.
        System.out.println(drpc.execute("classify", "Have a nice day!"));
        System.out.println(drpc.execute("classify", "I feel really bad!"));
        System.out.println(drpc.execute("classify", "Whatever, i don't really care"));
        System.out.println(drpc.execute("classify", "feel sleepy zzzz...."));

        cluster.killTopology("SentimentClassifierDemo");
        cluster.shutdown();
        drpc.shutdown();
    }

    private static StormTopology buildTopology(LocalDRPC drpc) {
        TridentTopology topology = new TridentTopology();
        // Read the DRPC argument from "args" and emit the prediction as "sentiment".
        topology.newDRPCStream("classify", drpc)
                .each(new Fields("args"), new TwitterSentimentClassifier(), new Fields("sentiment"));
        return topology.build();
    }
}

The DRPCStream allows the user to pass a text string to the TwitterSentimentClassifier, which then returns a "sentiment" field containing the predicted label of the input text (true for positive, false for negative).
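Since drpc.execute returns a String, it may be handy to turn that result back into a boolean. The sketch below is only an illustration: it assumes the result is a JSON-style string whose last value is the "sentiment" field, e.g. [["Have a nice day!",true]]. That format is an assumption, so print the raw result first to confirm it. SentimentResult and parse are hypothetical names, not part of Storm or Trident-ML, and the sample strings in main are hand-written inputs, not real classifier output.

```java
public final class SentimentResult {
    private SentimentResult() {}

    /**
     * Extracts the trailing boolean from a DRPC result string.
     * ASSUMPTION: the result looks like [["some text",true]], i.e. the
     * last value before the closing brackets is the "sentiment" field.
     */
    public static boolean parse(String drpcResult) {
        String s = drpcResult.trim();
        // Drop the closing brackets of the nested arrays.
        while (s.endsWith("]")) {
            s = s.substring(0, s.length() - 1).trim();
        }
        if (s.endsWith("true")) {
            return true;
        }
        if (s.endsWith("false")) {
            return false;
        }
        throw new IllegalArgumentException("No trailing boolean in DRPC result: " + drpcResult);
    }

    public static void main(String[] args) {
        // Hand-written sample inputs in the assumed format.
        System.out.println(parse("[[\"Have a nice day!\",true]]"));
        System.out.println(parse("[[\"I feel really bad!\",false]]"));
    }
}
```

In App, this could be used as SentimentResult.parse(drpc.execute("classify", "Have a nice day!")). If the real output format differs, a proper JSON parser would be the safer choice.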
Next, copy the following two files into the "src/main/resources" folder under the project root folder:
twitter-sentiment-classifier-classifier.json:
https://github.com/pmerienne/trident-ml/blob/master/src/main/resources/twitter-sentiment-classifier-classifier.json
twitter-sentiment-classifier-extractor.json:
https://github.com/pmerienne/trident-ml/blob/master/src/main/resources/twitter-sentiment-classifier-extractor.json
This step is important; if the two files are missing, you may get a FileNotFoundException at runtime.
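To catch a missing file early, a small pre-flight check can be run before starting the topology. The sketch below assumes the two JSON files are looked up at the root of the classpath under exactly the names listed above; this is inferred from the copy step, not confirmed from the Trident-ML source. ResourceCheck and findMissing are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.List;

public final class ResourceCheck {
    private ResourceCheck() {}

    /** Returns the resource names that cannot be found through the given class loader. */
    public static List<String> findMissing(ClassLoader loader, String... names) {
        List<String> missing = new ArrayList<String>();
        for (String name : names) {
            // getResource returns null when the resource is not on the classpath.
            if (loader.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // ASSUMPTION: the model files sit at the classpath root under these names.
        List<String> missing = findMissing(
                ResourceCheck.class.getClassLoader(),
                "twitter-sentiment-classifier-classifier.json",
                "twitter-sentiment-classifier-extractor.json");
        if (missing.isEmpty()) {
            System.out.println("Both model files found on the classpath.");
        } else {
            System.out.println("Missing from classpath: " + missing);
        }
    }
}
```

Calling findMissing at the top of App.main and stopping when the list is not empty gives a clearer message than a FileNotFoundException raised from inside the topology.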
Once the coding is completed, we can run the project by navigating to the project root folder and running the following command:
> mvn compile exec:java