Dockerized Apache Storm
To build a jar and submit it to Storm:
make enter
cd wordcount
mvn clean compile package
exit
make enter_nimbus
cd /usr/src/app/wordcount/target
storm jar WordCount-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux -r -R /topology.yaml
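If the Maven build fails or is skipped, the `storm jar` step only fails once you are already inside the nimbus container. A minimal sketch of a guard for that, assuming a helper named `storm_submit_cmd` (hypothetical, not part of this repo) that prints the submit command only when the jar exists:

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): print the submit command for a
# jar, or fail with a hint if the jar has not been built yet.
storm_submit_cmd() {
  jar="$1"
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar (run 'mvn clean compile package' first)" >&2
    return 1
  fi
  # Flux options: -r submits to the remote cluster; -R treats the YAML path
  # as a classpath resource inside the jar rather than a file on disk.
  echo "storm jar $jar org.apache.storm.flux.Flux -r -R /topology.yaml"
}
```

Run from /usr/src/app/wordcount/target, `storm_submit_cmd WordCount-1.0-SNAPSHOT.jar` prints the same command shown above, or exits non-zero if the jar is missing.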
The backend and nimbus containers both mount the local directory to access
files. The sample wordcount is just Azure-Samples/hdinsight-python-storm-wordcount
with storm.version updated in pom.xml. The project is set up to give you a
full jar that you can submit to Storm.
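Since the submit command loads the topology with `-R /topology.yaml`, the jar needs to carry that file as a resource. A rough way to check, assuming the file sits at the root of the jar (as the leading slash suggests) and using a hypothetical helper `jar_listing_has` that reads a jar listing on stdin:

```shell
#!/bin/sh
# Hypothetical check (not part of the repo): read a jar listing on stdin, one
# entry per line, and succeed only if an entry exactly matches the first argument.
jar_listing_has() {
  # -x matches whole lines only, -F treats the name as a fixed string
  grep -xF -- "$1" >/dev/null
}
```

For example, `jar tf WordCount-1.0-SNAPSHOT.jar | jar_listing_has topology.yaml && echo ok` should print `ok` when the entry is present. An entry such as `resources/topology.yaml` would not count as a match.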
Experimenting with streamparse
I also experimented with creating a streamparse example. The tricky part is that streamparse won't generate a jar containing the topology definition, because it submits the topology information via Thrift. That would mean setting up inter-container communication, so I decided to use the solution above to generate a jar instead. If you want to use streamparse, the notes below may help you get around some issues I ran into.
How did I get here?
I was setting up a sample project with Apache Storm using streamparse. The
wordcount project was created with sparse quickstart wordcount, but it
required some modifications to get running.
https://github.com/Parsely/streamparse/issues/479
I just forked the repo and made a quick update so that I could permanently run in local mode for now.
Updating dependencies
:dependencies [[org.apache.storm/storm-core "2.3.0"]
               [org.apache.storm/flux-core "2.3.0"]]
I updated these versions to match my local Storm version.
https://github.com/Parsely/streamparse/issues/472
(require 'cemerick.pomegranate.aether)
(cemerick.pomegranate.aether/register-wagon-factory!
  "http" #(org.apache.maven.wagon.providers.http.HttpWagon.))
Adding the snippet above to my project.clj seemed to fix this issue.