Neo4j and Gatling sitting in a tree, Performance T-E-S-T-ing
source link: https://maxdemarzi.com/2013/02/14/neo4j-and-gatling-sitting-in-a-tree-performance-t-e-s-t-ing/
I was introduced to the open-source performance testing tool Gatling a few months ago by Dustin Barnes and fell in love with it. It has an easy-to-use DSL, and even though I don’t know a lick of Scala, I was able to figure out how to use it. It creates pretty awesome graphics and takes care of a lot of work for you behind the scenes. The project has great documentation and a pretty active Google group where newbies and questions are welcome.
Gatling ships with Scala, so all you need to do is create your tests and run them from the command line. I’ll show you a few basic things: first a test that everything is working, then creating nodes and relationships, and finally querying those nodes.
We start things off with the import statements:
import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._
Then we start right off with our simulation. For this first test, we are just going to get the root node via the REST API. We specify our Neo4j server; in this case I am testing on localhost (you’ll want to run your test code and Neo4j server on different machines when doing this for real). Next we specify that we accept JSON in return. For our test scenario, for a duration of 10 seconds, we’ll get “/db/data/node/0” and check that Neo4j returns the HTTP status code 200 (meaning everything is OK). We’ll pause between 0 and 5 milliseconds between calls to simulate actual users, and in our setup we’ll specify that we want 100 users.
class GetRoot extends Simulation {
  val httpConf = httpConfig
    // Neo4j server under test; localhost on Neo4j's default HTTP port is assumed here
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")

  val scn = scenario("Get Root")
    .during(10) {
      exec(
        http("get root node")
          .get("/db/data/node/0")
          .check(status.is(200)))
        .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).protocolConfig(httpConf)
  )
}
We’ll call this file “GetRoot.scala” and put it in the user-files/simulations/neo4j directory:
gatling-charts-highcharts-1.4.0/user-files/simulations/neo4j/
We can run our code with:
~$ bin/gatling.sh
We’ll get a prompt asking us which test we want to run:
GATLING_HOME is set to /Users/maxdemarzi/Projects/gatling-charts-highcharts-1.4.0
Choose a simulation number:
     [0] GetRoot
     [1] advanced.AdvancedExampleSimulation
     [2] basic.BasicExampleSimulation
Choose the number next to GetRoot and press enter.
Next you’ll get prompted for an id, or you can just go with the default by pressing enter again:
Select simulation id (default is 'getroot'). Accepted characters are a-z, A-Z, 0-9, - and _
If you want to add a description, you can:
Select run description (optional)
Finally it starts for real:
================================================================================
2013-02-14 17:18:03                                                  10s elapsed
---- Get Root ------------------------------------------------------------------
Users  : [#################################################################]100%
          waiting:0     / running:0     / done:100
---- Requests ------------------------------------------------------------------
> get root node                                              OK=58457  KO=0
================================================================================

Simulation finished.
Simulation successful.
Generating reports...
Reports generated in 0s.
Please open the following file : /Users/maxdemarzi/Projects/gatling-charts-highcharts-1.4.0/results/getroot-20130214171753/index.html
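The OK and KO counters in that summary already allow a quick sanity check on throughput before opening the report. Here is a minimal sketch of the arithmetic, using the numbers from the run above. The object and method names are made up for illustration, and the result is only a rough average because it ignores startup time:

```scala
object Throughput {
  // Average requests per second over the whole run (passed and failed together).
  def requestsPerSecond(okRequests: Long, koRequests: Long, elapsedSeconds: Double): Double =
    (okRequests + koRequests) / elapsedSeconds

  // Share of failed requests, as a fraction between 0.0 and 1.0.
  def failureRatio(okRequests: Long, koRequests: Long): Double = {
    val total = okRequests + koRequests
    if (total == 0) 0.0 else koRequests.toDouble / total
  }

  def main(args: Array[String]): Unit = {
    // Numbers from the console summary: OK=58457, KO=0, 10s elapsed.
    println(requestsPerSecond(58457, 0, 10.0))
    println(failureRatio(58457, 0))
  }
}
```

For this run that works out to roughly 5,800 requests per second across the 100 users, with no failures.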
The progress bar measures the number of users who have completed their task, not how much of the simulation has run, so don’t worry if it stays at zero for a long while and then jumps quickly to 100%. You can also see the OK (requests that passed their checks) and KO (requests that failed) counts. Lastly, it creates a great HTML-based report for us. Let’s take a look:
Here you can see statistics about the response times as well as the requests per second. So that’s great, we can get the root node, but that’s not very interesting. Let’s create some nodes:
class CreateNodes extends Simulation {
  val httpConf = httpConfig
    // Neo4j server under test; localhost on Neo4j's default HTTP port is assumed here
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")

  // Cypher request body: each call creates one node
  val createNode = """{"query": "create me"}"""

  val scn = scenario("Create Nodes")
    .repeat(1000) {
      exec(
        http("create node")
          .post("/db/data/cypher")
          .body(createNode)
          .asJSON
          .check(status.is(200)))
        .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}
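The size of this workload follows directly from the numbers in the simulation above. The sketch below works it out in plain Scala, assuming the ramp starts users at evenly spaced intervals. That linear model and the helper names are assumptions for illustration, not something taken from Gatling's source:

```scala
object CreateNodesWorkload {
  // Total requests issued once every user has finished all of its repetitions.
  def totalRequests(users: Int, repeatsPerUser: Int): Long =
    users.toLong * repeatsPerUser

  // Start offset in seconds of each user, under the assumed linear ramp.
  def startOffsets(users: Int, rampSeconds: Double): Seq[Double] =
    (0 until users).map(i => i * rampSeconds / users)

  def main(args: Array[String]): Unit = {
    println(totalRequests(100, 1000))        // nodes created by the whole run
    println(startOffsets(100, 10.0).take(3)) // when the first few users start
  }
}
```

With 100 users repeating 1000 times, the run creates 100,000 nodes, which is why the later simulations pick random node ids from a range of 100,000.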
In this case, we are setting up 100 users to create 1000 nodes each, with a ramp time of 10 seconds. We’ll run this simulation just like before, but choose CreateNodes. Once it’s done, take a look at the report, and scroll down a bit to see the chart of the Number of Requests per Second:
You can see the number of users ramp up over the first 10 seconds and fade at the end. Let’s go ahead and connect some of these nodes together:
We’ll add JSONObject to the import statements, and since I want to see which nodes we link together, we’ll print the details of each request. I am randomly choosing two ids and passing them to a Cypher query to create the relationships:
import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._
import util.parsing.json.JSONObject

class CreateRelationships extends Simulation {
  val httpConf = httpConfig
    // Neo4j server under test; localhost on Neo4j's default HTTP port is assumed here
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")
    // Print the body of every request so we can see which ids were sent
    .requestInfoExtractor(request => {
      println(request.getStringData)
      Nil
    })

  val rnd = new scala.util.Random

  // Store two random node ids in the session as a JSON object
  val chooseRandomNodes = exec((session) => {
    session.setAttribute("params",
      JSONObject(Map("id1" -> rnd.nextInt(100000),
                     "id2" -> rnd.nextInt(100000))).toString())
  })

  val createRelationship = """START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2"""
  // ${params} is replaced with the session attribute on every request
  val cypherQuery = """{"query": "%s", "params": %s }""".format(createRelationship, "${params}")

  val scn = scenario("Create Relationships")
    .during(30) {
      exec(chooseRandomNodes)
        .exec(
          http("create relationships")
            .post("/db/data/cypher")
            .header("X-Stream", "true")
            .body(cypherQuery)
            .asJSON
            .check(status.is(200)))
        .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}
When you run this, you’ll see a stream of the parameters we sent to our post request:
{"query": "START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2", "params": {"id1" : 98468, "id2" : 20147} }
{"query": "START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2", "params": {"id1" : 83557, "id2" : 26633} }
{"query": "START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2", "params": {"id1" : 22386, "id2" : 99139} }
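To make it clearer how those request bodies come together, here is a small standalone sketch that builds the same payload without Gatling. The per-request substitution of the session attribute is imitated with a plain String.replace, and the params object is rendered by hand instead of with JSONObject. Both are stand-ins chosen for illustration, and the object and method names are hypothetical:

```scala
object CypherPayload {
  val createRelationship =
    "START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2"

  // Same template as in the simulation: the query text is fixed, while the
  // params slot holds a placeholder that is filled in for every request.
  val template: String =
    """{"query": "%s", "params": %s }""".format(createRelationship, "${params}")

  // Hand-rolled JSON object for the two ids (stand-in for JSONObject.toString).
  def paramsJson(id1: Int, id2: Int): String =
    s"""{"id1" : $id1, "id2" : $id2}"""

  // Stand-in for the session substitution: swap the placeholder for real params.
  def render(id1: Int, id2: Int): String =
    template.replace("${params}", paramsJson(id1, id2))

  def main(args: Array[String]): Unit = {
    val rnd = new scala.util.Random
    // Two random ids below 100000, as in the simulation
    println(render(rnd.nextInt(100000), rnd.nextInt(100000)))
  }
}
```

The Cypher text itself never changes; only the params object differs between requests, so Neo4j sees the same parameterized query every time.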
You can turn this off, but I wanted to make sure the ids were random, and it helps when debugging. Now we can query the graph. For this next simulation, I want to see the answers returned from Neo4j, and I want to see the nodes related to 10 random nodes passed in as a JSON array. Notice the setup is a bit different from before, and we are also checking that we got “data” back in the response.
import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._
import util.parsing.json.JSONArray

class QueryGraph extends Simulation {
  val httpConf = httpConfig
    // Neo4j server under test; localhost on Neo4j's default HTTP port is assumed here
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")
    // Print every response body so we can see what Neo4j returned
    .responseInfoExtractor(response => {
      println(response.getResponseBody)
      Nil
    })
    .disableResponseChunksDiscarding

  val rnd = new scala.util.Random
  val nodeRange = 1 to 100000

  // Store 10 random node ids in the session as a JSON array
  val chooseRandomNodes = exec((session) => {
    session.setAttribute("node_ids",
      JSONArray.apply(List.fill(10)(nodeRange(rnd.nextInt(nodeRange.length)))).toString())
  })

  val getNodes = """START nodes=node({ids}) MATCH nodes -[:KNOWS]-> other_nodes RETURN ID(other_nodes)"""
  // ${node_ids} is replaced with the session attribute on every request
  val cypherQuery = """{"query": "%s", "params": {"ids": %s}}""".format(getNodes, "${node_ids}")

  val scn = scenario("Query Graph")
    .during(30) {
      exec(chooseRandomNodes)
        .exec(
          http("query graph")
            .post("/db/data/cypher")
            .header("X-Stream", "true")
            .body(cypherQuery)
            .asJSON
            .check(status.is(200))
            .check(jsonPath("data")))
        .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}
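The id selection in the simulation above can be tried on its own, outside Gatling. A minimal sketch, assuming a hand-written JSON array rendering in place of JSONArray; the object and method names are made up for illustration:

```scala
import scala.util.Random

object RandomNodeIds {
  val nodeRange = 1 to 100000

  // Pick `count` ids uniformly at random from nodeRange (duplicates are possible).
  def pick(rnd: Random, count: Int): List[Int] =
    List.fill(count)(nodeRange(rnd.nextInt(nodeRange.length)))

  // Render the ids as JSON array text, e.g. List(1, 2, 3) becomes [1, 2, 3].
  def toJsonArray(ids: List[Int]): String =
    ids.mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    val ids = pick(new Random, 10)
    println(toJsonArray(ids)) // the value that would fill the "ids" parameter
  }
}
```

Because the ids are drawn independently, the same node can appear more than once in a single request, and some of the chosen nodes may have no outgoing KNOWS relationships at all.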
If we take a look at the details tab for this simulation we see a small spike in the middle:
This is a telltale sign of a JVM garbage collection taking place, and we may want to look into that. Edit your neo4j/conf/neo4j-wrapper.conf file, uncomment the garbage collection logging line, and add date stamps to gain better visibility into the issue:
# Uncomment the following line to enable garbage collection logging
wrapper.java.additional.4=-Xloggc:data/log/neo4j-gc.log
wrapper.java.additional.5=-XX:+PrintGCDateStamps
Neo4j performance tuning deserves its own blog post, but at least now you have a great way of testing your performance as you tweak JVM, cache, hardware, load balancing, and other parameters. Don’t forget that while testing Neo4j directly is pretty cool, you can also use Gatling to test your whole web application and measure end-to-end performance.