# Dockerized Apache Storm
To build the sample jar and submit it to Storm:
```
make enter
cd wordcount
mvn clean compile package
exit
make enter_nimbus
cd /usr/src/app/wordcount/target
storm jar WordCount-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux -r -R /topology.yaml
```
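The `-R /topology.yaml` flag tells Flux to load the topology definition from the classpath. The real `topology.yaml` lives in this repo, but as a rough, hypothetical sketch of the Flux format it uses (the class names below are placeholders, not the actual wordcount components):

```yaml
# Hypothetical sketch of a Flux topology definition;
# the real /topology.yaml in the jar defines the actual components.
name: "wordcount-topology"

config:
  topology.workers: 1

spouts:
  - id: "sentence-spout"
    className: "com.example.SentenceSpout"   # placeholder class
    parallelism: 1

bolts:
  - id: "count-bolt"
    className: "com.example.WordCountBolt"   # placeholder class
    parallelism: 1

streams:
  - from: "sentence-spout"
    to: "count-bolt"
    grouping:
      type: SHUFFLE
```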
The backend and nimbus containers both mount the local directory so they can share
files. The sample wordcount is just [Azure-Samples/hdinsight-python-storm-wordcount][1]
with `storm.version` updated in pom.xml. The project is configured to produce a
full jar that you can submit to Storm.
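The shared mount means the jar built in one container is immediately visible in the other. A minimal sketch of what that looks like in a docker-compose file (service names, images, and paths here are assumptions; the repo's own `docker-compose.yml` and Makefile define the real setup):

```yaml
# Hypothetical sketch -- actual service names, images, and paths
# are defined by this repo's docker-compose.yml / Makefile.
services:
  nimbus:
    image: storm
    command: storm nimbus
    volumes:
      - .:/usr/src/app   # nimbus sees the jar built by the backend
  backend:
    image: maven
    volumes:
      - .:/usr/src/app   # same mount, used when running mvn package
```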
## Experimenting with streamparse
I also experimented with creating a streamparse example, but the tricky part is
that streamparse won't generate a jar containing the topology definition,
because it submits the topology information via Thrift. This would mean setting
up inter-container communication, so I decided to use the approach above to
generate a jar instead. If you want to use streamparse, the notes below should
help you work around some issues I ran into.
## How did I get here?
I started by setting up a sample project with Apache Storm using streamparse.
The `wordcount` project was created with `sparse quickstart wordcount`, but it
required some modifications to get running.
## https://github.com/Parsely/streamparse/issues/479
I just forked the repo and made a quick update so that I could permanently
run in local mode for now.
## Updating dependencies
```
:dependencies [[org.apache.storm/storm-core "2.3.0"]
               [org.apache.storm/flux-core "2.3.0"]]
```
I updated these versions to match my local Storm version.
## https://github.com/Parsely/streamparse/issues/472
```
(require 'cemerick.pomegranate.aether)
(cemerick.pomegranate.aether/register-wagon-factory!
  "http" #(org.apache.maven.wagon.providers.http.HttpWagon.))
```
Adding the snippet above to my project.clj seemed to fix this issue.
[1]: https://github.com/Azure-Samples/hdinsight-python-storm-wordcount