Neo4j – Relationship Modelling – Performance

In graph data modelling there are situations where ratings need to be modelled as relationships between nodes. In this article, we compare the query performance of two ways of doing that: “Relationship as Types” vs “Relationship as Properties”.

We will use a sample data set with two labels, Person and Car, and one relationship between them. A Person is connected to a Car according to how much he or she likes it.

The sample data-set is generated from https://mockaroo.com/

Relationship as Properties

First we will look at the performance using properties. The schema has Person and Car nodes, and the relationship between them stores the likeness rating as a property.

A person rates a car on a scale from 1 (HATE) to 5 (LOVE).

Download the data set from here.

Copy the data set to your import folder and load it with the Cypher query below.

// Loader with the rating as a property on the relationship
USING PERIODIC COMMIT 100
LOAD CSV WITH HEADERS FROM "file:///car_ratings_as_properties.csv" AS row FIELDTERMINATOR ';'
MERGE (n:person {id: row.id, fullname: row.full_name})
MERGE (c:car {name: row.car})
CREATE (n)-[:RATINGS {rated: toInteger(row.ratings)}]->(c);
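Before timing anything, it can help to confirm that the load produced what you expect. The query below is a minimal sketch and not part of the original measurement; it assumes only the labels, relationship type and property name used by the loader above, and counts the relationships for each rating value.

```cypher
// Sanity check: number of RATINGS relationships per rating value
MATCH (:person)-[r:RATINGS]->(:car)
RETURN r.rated AS rating, count(*) AS relationships
ORDER BY rating;
```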

Schema

For the first run, we restart the Neo4j server to clear the cache and then execute a Cypher query that returns every person who gave a rating above 3.

MATCH (n)-[r:RATINGS]->(m)
WHERE r.rated > 3
RETURN n.fullname AS `Name`, r.rated AS `Ratings`, m.name AS `Car`;

We note the timing and then run the same Cypher query 5 more times without clearing the cache.
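Wall-clock timings vary between machines and runs, so it may also be useful to look at the work the query performs. As a sketch, and not something the original measurement used, the same query can be prefixed with PROFILE, which runs it and reports the executed plan with the database hits for each operator.

```cypher
// Same query as above, prefixed with PROFILE to show the plan and db hits
PROFILE
MATCH (n)-[r:RATINGS]->(m)
WHERE r.rated > 3
RETURN n.fullname AS `Name`, r.rated AS `Ratings`, m.name AS `Car`;
```

The db hits can then be compared with those of the type-based query later in the article.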

Relationship as Types

We have already mapped each rating to a level of likeness, i.e.

1 -> HATE
2 -> DISLIKE
3 -> NEUTRAL
4 -> LIKE
5 -> LOVE

We will create the relationship between Person and Car directly, using the likeness level as the relationship type.

The data set can be found here.

Copy the data set to your import folder and load it with the Cypher query below.
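The mapping above could also be applied inside Neo4j instead of in the CSV file. The following is a minimal sketch under two assumptions: the property-based data set from the first section is already loaded, and the APOC plugin is installed. The alias `likeness` is a name chosen here for illustration.

```cypher
// Derive a typed relationship from each existing RATINGS relationship.
// Assumption: the RATINGS data from the first section is already loaded
// and APOC is available.
MATCH (n:person)-[r:RATINGS]->(c:car)
WITH n, c,
     CASE r.rated
       WHEN 1 THEN 'HATE'
       WHEN 2 THEN 'DISLIKE'
       WHEN 3 THEN 'NEUTRAL'
       WHEN 4 THEN 'LIKE'
       WHEN 5 THEN 'LOVE'
     END AS likeness
WHERE likeness IS NOT NULL   // skip any rating outside 1..5
CALL apoc.create.relationship(n, likeness, {}, c) YIELD rel
RETURN count(rel) AS created;
```

Note that this leaves the original RATINGS relationships in place, so both models would live in the same database. For a clean comparison, a separate database for each model is probably the safer choice.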

// Loader with the rating as the relationship type
USING PERIODIC COMMIT 100
LOAD CSV WITH HEADERS FROM "file:///car_ratings_as_types.csv" AS row FIELDTERMINATOR ';'
MERGE (n:person {id: row.id, fullname: row.full_name})
MERGE (c:car {name: row.car})
WITH n, c, row
CALL apoc.create.relationship(n, row.ratings, {}, c) YIELD rel
RETURN rel;
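To confirm that the loader created the expected relationship types, a quick count per type can be run. This is a sketch that assumes only the labels used above; if the CSV contains all five likeness levels, each should appear as its own row.

```cypher
// Sanity check: number of relationships per relationship type
MATCH (:person)-[r]->(:car)
RETURN type(r) AS rating, count(*) AS relationships
ORDER BY relationships DESC;
```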

After loading, check the schema.


Clear the Neo4j cache and execute the Cypher query below. Record the timings for the first run and for the 5 subsequent runs.

MATCH (n)-[r:LIKE|LOVE]->(m)
RETURN n.fullname AS `Full Name`, type(r) AS `Ratings`, m.name AS `Car`;
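Returning every row to the client adds transfer and rendering time to the measurement. One possible variation, sketched here and not used for the timings in this article, is to aggregate instead of returning the data and to avoid reading node properties. Run each statement separately against the matching data set.

```cypher
// Property model: count matches instead of returning rows
MATCH (n)-[r:RATINGS]->(m)
WHERE r.rated > 3
RETURN count(*) AS matches;

// Type model: count matches instead of returning rows
MATCH (n)-[r:LIKE|LOVE]->(m)
RETURN count(*) AS matches;
```

If both files were generated from the same ratings, the two counts should agree. Depending on the Neo4j version, the planner may answer a plain relationship count from stored statistics rather than by traversing the graph, so it is worth checking the plan with PROFILE before comparing numbers.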

Observation

After collecting the metrics from both sets of runs, I plotted them in an Excel chart. (Note: your timings may differ from mine.)

Conclusion

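As the performance chart shows, “Relationship as Type” performs better than “Relationship as Property” for this query. A likely reason is that filtering on the relationship type does not require reading any relationship properties, whereas the property-based query has to load the rated property of every RATINGS relationship before it can filter.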

One thought on “Neo4j – Relationship Modelling – Performance”

  1. Good article! I think it would be important to also show this with a larger dataset, like 10M rels or so,
    and not to return the data but just aggregate and return the total times.

    Also try not to access node properties, as this affects the measurements.

    WITH n as `Full Name`,type(r) as `Ratings`, m as `Car`
    RETURN count(*);
