Neo4j and Gatling sitting in a tree, Performance T-E-S-T-ing

 3 years ago
source link: https://maxdemarzi.com/2013/02/14/neo4j-and-gatling-sitting-in-a-tree-performance-t-e-s-t-ing/

I was introduced to the open-source performance testing tool Gatling a few months ago by Dustin Barnes and fell in love with it. It has an easy-to-use DSL, and even though I don’t know a lick of Scala, I was able to figure out how to use it. It creates pretty awesome graphics and takes care of a lot of work for you behind the scenes. They have great documentation and a pretty active Google group where newbies and questions are welcomed.

It ships with Scala, so all you need to do is create your tests and execute them from the command line. I’ll show you how to do a few basic things: first we’ll test that everything is working, then we’ll create nodes and relationships, and then we’ll query those nodes.

We start things off with the import statements:

import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._

Then we jump straight into our simulation. For this first test, we are just going to get the root node via the REST API. We specify our Neo4j server; in this case I am testing on localhost (you’ll want to run your test code and Neo4j server on different machines when doing this for real). Next we specify that we accept JSON in return. For our test scenario, for a duration of 10 seconds, we’ll get “/db/data/node/0” and check that Neo4j returns the HTTP status code 200 (meaning everything is OK). We’ll pause between 0 and 5 milliseconds between calls to simulate actual users, and in our setup we’ll specify that we want 100 users.

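class GetRoot extends Simulation {
  val httpConf = httpConfig
    // Neo4j server under test; 7474 is Neo4j's default HTTP port
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")

  // Each user keeps requesting the root node for 10 seconds
  val scn = scenario("Get Root")
    .during(10) {
      exec(
        http("get root node")
          .get("/db/data/node/0")
          .check(status.is(200)))
        .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).protocolConfig(httpConf)
  )
}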

We’ll call this file “GetRoot.scala” and put it in the user-files/simulations/neo4j directory:

gatling-charts-highcharts-1.4.0/user-files/simulations/neo4j/

We can run our code with:

~$ bin/gatling.sh

We’ll get a prompt asking us which test we want to run:

GATLING_HOME is set to /Users/maxdemarzi/Projects/gatling-charts-highcharts-1.4.0
Choose a simulation number:
[0] GetRoot
[1] advanced.AdvancedExampleSimulation
[2] basic.BasicExampleSimulation

Choose the number next to GetRoot and press enter.

Next you’ll get prompted for an id, or you can just go with the default by pressing enter again:

Select simulation id (default is 'getroot'). Accepted characters are a-z, A-Z, 0-9, - and _

If you want to add a description, you can:

Select run description (optional)

Finally it starts for real:

================================================================================
2013-02-14 17:18:03                                                  10s elapsed
---- Get Root ------------------------------------------------------------------
Users  : [#################################################################]100%
waiting:0     / running:0     / done:100 
---- Requests ------------------------------------------------------------------
> get root node                                              OK=58457  KO=0    
================================================================================
Simulation finished.
Simulation successful.
Generating reports...
Reports generated in 0s.
Please open the following file : /Users/maxdemarzi/Projects/gatling-charts-highcharts-1.4.0/results/getroot-20130214171753/index.html

The progress bar measures the number of users who have completed their task, not how much of the simulation has run, so don’t worry if it stays at zero for a long while and then jumps quickly to 100%. You can also see the counts of OK (passed) and KO (failed) requests. Lastly, it creates a great HTML-based report for us. Let’s take a look:

Here you can see statistics about the response times as well as the requests per second. So that’s great, we can get the root node, but that’s not very interesting. Let’s create some nodes:

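class CreateNodes extends Simulation {
  val httpConf = httpConfig
    // Neo4j server under test; 7474 is Neo4j's default HTTP port
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")

  // Cypher statement that creates a single empty node
  val createNode = """{"query": "create me"}"""

  // Each user posts the create statement 1000 times
  val scn = scenario("Create Nodes")
    .repeat(1000) {
      exec(
        http("create node")
          .post("/db/data/cypher")
          .body(createNode)
          .asJSON
          .check(status.is(200)))
        .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}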

In this case, we are setting 100 users to create 1000 nodes each (100,000 nodes in total), with a ramp time of 10 seconds. We’ll run this simulation just like before, but choose CreateNodes. Once it’s done, take a look at the report, and scroll down a bit to see the chart of the Number of Requests per Second:

You can see the number of users ramp up over the first 10 seconds and fade at the end. Let’s go ahead and connect some of these nodes together:

We’ll add JSONObject to the import statements, and since I want to see which nodes we link together, we’ll print the details of each request. I am randomly choosing two ids and passing them to a Cypher query to create the relationships:

import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._
import util.parsing.json.JSONObject

class CreateRelationships extends Simulation {
  val httpConf = httpConfig
    // Neo4j server under test; 7474 is Neo4j's default HTTP port
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")
    // Print every request body so we can see which ids were chosen
    .requestInfoExtractor(request => {
      println(request.getStringData)
      Nil
    })

  val rnd = new scala.util.Random

  // Store two random node ids in the session as a JSON object
  val chooseRandomNodes = exec((session) => {
    session.setAttribute("params", JSONObject(Map("id1" -> rnd.nextInt(100000),
                                                  "id2" -> rnd.nextInt(100000))).toString())
  })

  val createRelationship = """START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2"""
  // "${params}" is filled in from the session on each request
  val cypherQuery = """{"query": "%s", "params": %s }""".format(createRelationship, "${params}")

  val scn = scenario("Create Relationships")
    .during(30) {
      exec(chooseRandomNodes)
        .exec(
          http("create relationships")
            .post("/db/data/cypher")
            .header("X-Stream", "true")
            .body(cypherQuery)
            .asJSON
            .check(status.is(200)))
        .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}

When you run this, you’ll see a stream of the request bodies we sent in our POST requests:

{"query": "START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2", "params": {"id1" : 98468, "id2" : 20147} }
{"query": "START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2", "params": {"id1" : 83557, "id2" : 26633} }
{"query": "START node1=node({id1}), node2=node({id2}) CREATE UNIQUE node1-[:KNOWS]->node2", "params": {"id1" : 22386, "id2" : 99139} }

You can turn this off, but I just wanted to make sure the ids were random, and it helps when debugging. Now we can query the graph. For this next simulation, I want to see the answers returned from Neo4j, and I want to see the nodes related to 10 random nodes passed in as a JSON array. Notice the setup is a bit different from before: we print the response body rather than the request, and we also check that we got “data” back in the response.

import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._
import akka.util.duration._
import bootstrap._
import util.parsing.json.JSONArray

class QueryGraph extends Simulation {
  val httpConf = httpConfig
    // Neo4j server under test; 7474 is Neo4j's default HTTP port
    .baseURL("http://localhost:7474")
    .acceptHeader("application/json")
    // Print every response body so we can see what Neo4j returned
    .responseInfoExtractor(response => {
      println(response.getResponseBody)
      Nil
    })
    .disableResponseChunksDiscarding

  val rnd = new scala.util.Random
  val nodeRange = 1 to 100000

  // Store 10 random node ids in the session as a JSON array
  val chooseRandomNodes = exec((session) => {
    session.setAttribute("node_ids", JSONArray.apply(List.fill(10)(nodeRange(rnd.nextInt(nodeRange.length)))).toString())
  })

  val getNodes = """START nodes=node({ids}) MATCH nodes -[:KNOWS]-> other_nodes RETURN ID(other_nodes)"""
  // "${node_ids}" is filled in from the session on each request
  val cypherQuery = """{"query": "%s", "params": {"ids": %s}}""".format(getNodes, "${node_ids}")

  val scn = scenario("Query Graph")
    .during(30) {
      exec(chooseRandomNodes)
        .exec(
          http("query graph")
            .post("/db/data/cypher")
            .header("X-Stream", "true")
            .body(cypherQuery)
            .asJSON
            .check(status.is(200))
            .check(jsonPath("data")))
        .pause(0 milliseconds, 5 milliseconds)
    }

  setUp(
    scn.users(100).ramp(10).protocolConfig(httpConf)
  )
}
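
If the two-stage way the request body gets built looks confusing, here is a rough sketch of it in plain Scala with no Gatling involved. The query text and the template are the same as in the simulation above; the object name QueryPayloadSketch and the helpers randomIds, toJsonArray and render are made up for illustration. In particular, render only stands in for the substitution Gatling performs from the session, and toJsonArray stands in for JSONArray(...).toString().

```scala
import scala.util.Random

// Hypothetical stand-alone sketch; none of these helper names come from Gatling.
object QueryPayloadSketch {
  // Same Cypher text as in the QueryGraph simulation
  val getNodes =
    """START nodes=node({ids}) MATCH nodes -[:KNOWS]-> other_nodes RETURN ID(other_nodes)"""

  // Stage 1: built once. "${node_ids}" is plain text here (no s-interpolator),
  // so it survives format() untouched as a placeholder.
  val template: String =
    """{"query": "%s", "params": {"ids": %s}}""".format(getNodes, "${node_ids}")

  // Pick `count` random ids in 1..maxId, like nodeRange(rnd.nextInt(nodeRange.length))
  def randomIds(rnd: Random, count: Int, maxId: Int): List[Int] =
    List.fill(count)(1 + rnd.nextInt(maxId))

  // Hand-rolled JSON array of numbers (stand-in for JSONArray(...).toString())
  def toJsonArray(ids: List[Int]): String = ids.mkString("[", ", ", "]")

  // Stage 2: per request, swap the placeholder for that request's ids
  // (stand-in for Gatling's session substitution)
  def render(template: String, nodeIds: String): String =
    template.replace("${node_ids}", nodeIds)

  def main(args: Array[String]): Unit = {
    val rnd = new Random
    println(render(template, toJsonArray(randomIds(rnd, 10, 100000))))
  }
}
```

The point of the split is that the Cypher text is formatted into the template only once, while the ids change on every request. Because the ids travel in "params" rather than inside the query string, Neo4j sees the same query text each time.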

If we take a look at the details tab for this simulation we see a small spike in the middle:

Screen Shot of Gatling

This is a tell-tale sign of a JVM garbage collection taking place, and we may want to look into that. Edit your neo4j/conf/neo4j-wrapper.conf file and uncomment the garbage collection logging, and add date stamps to gain better visibility into the issue:

# Uncomment the following line to enable garbage collection logging
wrapper.java.additional.4=-Xloggc:data/log/neo4j-gc.log
wrapper.java.additional.5=-XX:+PrintGCDateStamps

Neo4j performance tuning deserves its own blog post, but at least now you have a great way of testing your performance as you tweak JVM, cache, hardware, load balancing, and other parameters. Don’t forget that while testing Neo4j directly is pretty cool, you can also use Gatling to test your whole web application and measure end-to-end performance.
