Friday, November 13, 2015

Eclipse Error: archive required for library cannot be read

Recently I encountered a strange bug in Eclipse after a colleague included Selenium in the pom file of a Maven project. Eclipse complained "Error: archive required for library cannot be read". The detailed error looked like this:

"Description Resource Path Location Type Archive for required library: [root]/.m2/repository/org/seleniumhq/selenium/selenium-support/2.46.0/selenium-support-2.46.0.jar' in project '[*]' cannot be read or is not a valid ZIP file umom Build path Build Path Problem"

However, selenium-support-2.46.0.jar turned out to be a valid jar that could be opened as a zip archive.

After some searching, the following workaround worked in my case (link: https://bugs.eclipse.org/bugs/show_bug.cgi?id=364653#c3):

Workaround: for each affected project, set 'Incomplete build path' to 'Warning' on the Compiler > Building property page. After that, shut down and restart Eclipse, then update the Maven projects, and the problem is gone.

Thursday, November 12, 2015

Run mvn package with multiple modules containing independent pom files without public repository

Recently I needed to build a Java application with dependencies taken from several different modules, each having its own pom file that packages it into a jar. These jars are not distributed from a public repository such as Maven Central, and I was not using a repository manager such as Nexus in this case. The modules sit in their own folders (with names such as "SK-Utils", "SK-Statistics", "SK-DOM", "OP-Core", "OP-Search", "ML-Core", "ML-Tune", "ML-Clustering", "ML-Trees"). What makes it complicated is that these modules depend on each other; e.g., ML-Core depends on "SK-DOM" and "SK-Utils" to build and run its unit tests. Building these independent modules from the IntelliJ IDE is fine, but they fail to build from the command line with "mvn package". Therefore I wrote a bash script, placed in the directory containing the independent modules, which runs "mvn package" against the pom file in each module and then installs the resulting jar into the local repository. The bash script "mvn-build.sh" looks like the following:

#!/usr/bin/env bash
# Build each module in order and install its jar into the local Maven repository,
# so that later modules can resolve their dependencies on earlier ones.
dirArray=( "SK-Utils" "SK-Statistics" "SK-DOM" "OP-Core" "OP-Search" "ML-Core" "ML-Tune" "ML-Clustering" "ML-Trees" )
for dirName in "${dirArray[@]}"
do
    echo "$dirName"
    cd "$dirName"

    jarPath="target/$dirName-0.0.1-SNAPSHOT-jar-with-dependencies.jar"

    # If a jar is left over from a previous build, make sure it is writable.
    if [ -f "$jarPath" ]; then
        chmod 777 "$jarPath"
    fi

    mvn package

    # Install the packaged jar into the local repository under the com.meme groupId.
    mvn install:install-file -Dfile="$jarPath" -DgroupId=com.meme -DartifactId="$dirName" -Dpackaging=jar -Dversion=0.0.1-SNAPSHOT

    cd ..
done

Just running the above script from its folder with a command such as "sudo ./mvn-build.sh" should build the multi-module project.

Note that each module's pom should declare plugins like the ones below so that the "jar-with-dependencies" jar is generated.

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-install-plugin</artifactId>
                <version>2.5.2</version>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.6</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>

                <executions>
                    <execution>
                        <id>make-assembly</id> 
                        <phase>package</phase> 
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
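
The cross-module dependencies themselves are declared as ordinary Maven dependencies, using the coordinates under which the script installs each jar (groupId com.meme, version 0.0.1-SNAPSHOT). As a sketch (not copied from the actual project), the pom of ML-Core might contain a block roughly like this:

    <dependencies>
        <dependency>
            <groupId>com.meme</groupId>
            <artifactId>SK-DOM</artifactId>
            <version>0.0.1-SNAPSHOT</version>
        </dependency>
        <dependency>
            <groupId>com.meme</groupId>
            <artifactId>SK-Utils</artifactId>
            <version>0.0.1-SNAPSHOT</version>
        </dependency>
    </dependencies>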

One more note: if you are running in an environment such as CentOS Linux and encounter "mvn: command not found" when executing "sudo ./mvn-build.sh", it may be because the PATH environment variable is not preserved under sudo. In that case, just run:

> sudo env "PATH=$PATH" ./mvn-build.sh

Saturday, November 7, 2015

Use HttpURLConnection to send a "GET" command to Elasticsearch with a JSON body (emulating the "-d" option in curl's GET)

Recently I was working on a Java equivalent, using HttpURLConnection, of the following curl query, which sends a "GET" command to Elasticsearch with a JSON body specified in curl's "-d" option:

curl -XGET "http://127.0.0.1:9200/messages/_search?pretty" -d '
{
  "size" : 10,
  "query" : {
    "bool" : {
      "must" : [ {
        "match" : {
          "id" : {
            "query" : "[some id]",
            "type" : "boolean"
          }
        }
      },  {
        "nested" : {
          "query" : {
            "bool" : {
              "must" : {
                "match" : {
                  "agent" : {
                    "query" : "[some agent name]",
                    "type" : "boolean"
                  }
                }
              }
            }
          },
          "path" : "agents"
        }
      } ]
    }
  }
}'

The command requires a JSON body to be sent to Elasticsearch via the "GET" REST call. After some trial and error, I got this to work; below is the method implemented in Java.

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;

public static String httpGet(String urlToRead, String data) {
    StringBuilder result = new StringBuilder();
    try {
        URL url = new URL(urlToRead);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Enable output so that a body can be written to the request.
        // Note: the stock HttpURLConnection will typically send this as a POST once
        // a body is written; Elasticsearch's _search endpoint accepts both GET and POST.
        conn.setDoOutput(true);
        conn.setRequestMethod("GET");

        // Write the JSON body (the equivalent of curl's -d option).
        BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(conn.getOutputStream()));
        writer.write(data);
        writer.flush();
        writer.close();

        // Read back the response from Elasticsearch.
        BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line;
        while ((line = rd.readLine()) != null) {
            result.append(line);
        }
        rd.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
    return result.toString();
}

To call the above method and reproduce the curl query to ES above, just do something like the following:

 
int size = 10;
String ipAddress = "127.0.0.1";
String url = "http://"+ipAddress+":9200/messages/_search?pretty";
String data = " { \"size\" : "+size+", \"query\" : { \"bool\" : { \"must\" : [ { \"match\" : { \"id\" : { \"query\" : \"[some id]\", \"type\" : \"boolean\" } } }, { \"nested\" : { \"query\" : { \"bool\" : { \"must\" : { \"match\" : { \"agent\" : { \"query\" : \"[some agent name]\", \"type\" : \"boolean\" } } } } }, \"path\" : \"agents\" } } ] } } } ";
String response = httpGet(url, data);

Tuesday, November 3, 2015

Create and Run Apache Spark Scala project using Eclipse and SBT

This post shows a simple way to create and run an Apache Spark Scala project using Eclipse and SBT. Follow this post (http://czcodezone.blogspot.sg/2015/11/create-scala-project-using-scala-ide.html) to create an SBT-compatible Scala project in Scala IDE, then open the build.sbt in the project and add the following line towards the end of the file:

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.1"

Furthermore, since Spark runs your application on a cluster from a packaged jar, an sbt plugin must be added for the packaging. Create a file named "assembly.sbt" in the "project" folder under the root directory and put in the following content:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")


Now, after you have put in your Spark Scala code, navigate to the project root directory in the command line and run "sbt compile" and "sbt package", then either "sbt run" or "spark-submit", depending on whether you want to run the application locally or submit it to a Spark cluster.
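
As a quick sanity check before packaging, a minimal Spark application could look like the sketch below (the SparkWordCount object name, the sample data and the "local[*]" master are placeholder choices for a local run; when submitting to a cluster via spark-submit you would normally let the submit command set the master):

import org.apache.spark.{SparkConf, SparkContext}

object SparkWordCount {
  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark inside the current JVM, handy for a quick local test.
    val conf = new SparkConf().setAppName("SparkWordCount").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Count words in a tiny in-memory dataset.
    val lines = sc.parallelize(Seq("hello spark", "hello scala"))
    val counts = lines.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    counts.collect().foreach(println)

    sc.stop()
  }
}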

Create a Scala project using Scala-IDE and SBT on CentOS

This post shows how to create and build a Scala project using Scala IDE and SBT (Simple Build Tool).

Step 1: Create  a Scala project using Scala IDE


Download, unzip and launch the Scala IDE (link: http://scala-ide.org/), then use the IDE to create a Scala project (say, with the name "ScalaHelloWorld").

Step 2: Install SBT

Download and install SBT using the following command:

> curl https://bintray.com/sbt/rpm/rpm | sudo tee /etc/yum.repos.d/bintray-sbt-rpm.repo
> sudo yum install sbt

Step 3: Configure Scala project for SBT


After installing the SBT, navigate to your "ScalaHelloWorld" root directory and create a build.sbt file in the directory with the following content:

name := "File Searcher"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.scalatest" % "scalatest_2.10" % "2.0" % "test"

Next, in the root directory, create a folder named "project" and create a plugins.sbt file in that folder, with the following content:

addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "4.0.0")


Now go back to the root directory and in the command line enter the following command:

> sbt eclipse

This will set up the Eclipse Scala project for sbt to work with. To begin writing the Scala implementation, a simple recommendation is to create the following folder structure in the "src" folder under the root directory:

src/main/scala
src/main/resources
src/test/scala
src/test/resources

Now in the Scala IDE, right-click the folders "src/main/scala" and "src/test/scala", select "Build Path -> Use as Source Folder", and just create your .scala files in these folders.
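
For example, a bare-bones main object placed in "src/main/scala" could look like this (HelloWorld is just a placeholder name):

// src/main/scala/HelloWorld.scala
object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
  }
}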

Step 4: Run SBT

After you have coded your Scala implementation, navigate to the project root directory and type the following command to run the unit tests:

> sbt test

And the following command to launch the app from its main method:

> sbt run
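
To give "sbt test" something to exercise, a minimal ScalaTest suite in "src/test/scala" might look like the sketch below (the class name is made up; it relies on the scalatest dependency declared in build.sbt above):

// src/test/scala/HelloWorldSuite.scala
import org.scalatest.FunSuite

class HelloWorldSuite extends FunSuite {
  test("greeting is not empty") {
    assert("Hello, world!".nonEmpty)
  }
}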