Sometimes, during the project lifecycle, there is a need to quickly start a Neo4j Docker container with seeded data for QA or UAT environments. Creating a “vanilla” Neo4j container and then executing all the data loader Cypher queries takes a huge amount of time.
To save time, we can bootstrap (seed) the Docker image with all the required data.
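The Dockerfile “COPY” instruction copies a file from the build context (here, the current working directory) to a folder inside the image.
The Dockerfile “CMD” instruction defines the command that runs when the container starts, so it can execute shell commands. Note: only ONE “CMD” takes effect. If a Dockerfile contains several, Docker uses the last one.
To load the data at startup, the Neo4j initial password must be set before Neo4j is started for the first time.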
For this demo, we will load a countries.csv file into Neo4j. The file is saved in the current directory.
id,name
AF,Afghanistan
AL,Albania
DZ,Algeria
AS,American Samoa
AD,Andorra
AO,Angola
AI,Anguilla
AQ,Antarctica
AG,Antigua And Barbuda
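Because the file is baked into the image at build time, it can help to sanity-check it before building. The following is only a sketch: `check_csv` is a hypothetical helper name, and it assumes the simple two-column layout shown above, with no quoted fields that contain commas.

```shell
#!/bin/sh
# check_csv FILE
# Hypothetical helper: succeeds only if FILE starts with the header "id,name"
# and every data row has exactly two non-empty, comma-separated fields.
check_csv() {
    file=$1
    # The first line must be the expected header.
    [ "$(head -n 1 "$file")" = "id,name" ] || return 1
    # Flag any data row with a wrong field count or an empty field.
    awk -F',' '
        BEGIN { bad = 0 }
        NR > 1 && (NF != 2 || $1 == "" || $2 == "") { bad = 1 }
        END { exit bad }
    ' "$file"
}

# Example: check_csv countries.csv && docker build -t neo4j:seed .
```

A row with an empty id would otherwise be skipped silently by the `WHERE row.id IS NOT NULL` filter in the loader query, so catching it before the build makes the gap visible.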
To load the countries.csv file, we create the corresponding Cypher query and save it as data_loader.cypher in the same directory.
LOAD CSV WITH HEADERS FROM 'file:///countries.csv' AS row
WITH row WHERE row.id IS NOT NULL
MERGE (c:Country {id:row.id,countryName: row.name});
Then we create a Dockerfile that copies the CSV file and the Cypher query into the image and runs the query through cypher-shell when the container starts.
FROM neo4j
ENV NEO4J_HOME="/var/lib/neo4j" \
    NEO4J_PASSWD=neo4j_seed
COPY countries.csv ${NEO4J_HOME}/import/
COPY data_loader.cypher ${NEO4J_HOME}/import/
# set initial-password to start loading the data
# sleep for 10 secs for neo4j to start without any overlapping
CMD bin/neo4j-admin set-initial-password ${NEO4J_PASSWD} && \
    bin/neo4j start && sleep 10 && \
    if [ -f "${NEO4J_HOME}/import/data_loader.cypher" ]; then \
        cat ${NEO4J_HOME}/import/data_loader.cypher | NEO4J_USERNAME=neo4j NEO4J_PASSWORD=${NEO4J_PASSWD} bin/cypher-shell --fail-fast && rm ${NEO4J_HOME}/import/*; \
    fi && /bin/bash
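The fixed 10-second sleep assumes Neo4j is always ready within that time, which may not hold on a slow host. One possible alternative, shown here only as a sketch, is to poll until a command succeeds. `wait_until` is a hypothetical helper, not part of Neo4j or Docker, and it would have to be copied into the image as a script before the CMD could call it.

```shell
#!/bin/sh
# wait_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND repeatedly until it succeeds,
# giving up after MAX_TRIES failed attempts.
wait_until() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        # Stop waiting as soon as the command succeeds.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    # Every attempt failed.
    return 1
}

# Possible use in place of `sleep 10` (assumes the credentials set in the Dockerfile):
# wait_until 30 1 bin/cypher-shell -u neo4j -p "${NEO4J_PASSWD}" "RETURN 1;"
```

With this approach the load starts as soon as cypher-shell can connect, and the container fails visibly if Neo4j never comes up instead of running the loader against a server that is not ready.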
Build the image from the Dockerfile:
docker build -t neo4j:seed .
[+] Building 11.7s (9/9) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 686B 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/neo4j:latest 11.3s
=> [auth] library/neo4j:pull token for registry-1.docker.io 0.0s
=> [1/3] FROM docker.io/library/neo4j@sha256:c7f24de1dc1d2020ab24a884b8a39538937c1b14bc0ca1da3ddb2573b6fc412f 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 230B 0.0s
=> CACHED [2/3] COPY countries.csv /var/lib/neo4j/import/ 0.0s
=> [3/3] COPY data_loader.cypher /var/lib/neo4j/import/ 0.1s
=> exporting to image 0.1s
=> => exporting layers 0.1s
=> => writing image sha256:ac7113b7e0ae6abe7145f2d112dfbbe9b45aa6c6eb4e4147cfffbff691185cde 0.0s
=> => naming to docker.io/library/neo4j:seed 0.0s
Once the build is successful, run the tagged “neo4j:seed” image
docker run -it -d neo4j:seed
6c848fee3c728333deff359ed8ec5ef400c4e063ad610e2ebb42f046d9009561
Verify the data –
PS C:\Users\domin> docker ps
CONTAINER ID   IMAGE        COMMAND                  CREATED         STATUS         PORTS                     NAMES
6c848fee3c72   neo4j:seed   "/sbin/tini -g -- /d…"   7 seconds ago   Up 6 seconds   7473-7474/tcp, 7687/tcp   ecstatic_neumann
PS C:\Users\domin> docker exec -it 6c848fee3c72 cypher-shell
username: neo4j
password: **
Connected to Neo4j 4.2.0 at neo4j://localhost:7687 as user neo4j.
Type :help for a list of available commands or :exit to exit the shell.
Note that Cypher queries must end with a semicolon.
neo4j@neo4j>
neo4j@neo4j>
neo4j@neo4j> match (n) return (n);
+-----------------------------------------------------------+
| n                                                         |
+-----------------------------------------------------------+
| (:Country {id: "AF", countryName: "Afghanistan"})         |
| (:Country {id: "AL", countryName: "Albania"})             |
| (:Country {id: "DZ", countryName: "Algeria"})             |
| (:Country {id: "AS", countryName: "American Samoa"})      |
| (:Country {id: "AD", countryName: "Andorra"})             |
| (:Country {id: "AO", countryName: "Angola"})              |
| (:Country {id: "AI", countryName: "Anguilla"})            |
| (:Country {id: "AQ", countryName: "Antarctica"})          |
| (:Country {id: "AG", countryName: "Antigua And Barbuda"}) |
+-----------------------------------------------------------+
9 rows available after 42 ms, consumed after another 4 ms
neo4j@neo4j>
The query returns all nine Country nodes from countries.csv, which confirms that the Neo4j database in the container was seeded with the data.
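For QA or UAT pipelines, the same check could be scripted instead of typed into an interactive session. The sketch below is one way to do it: `count_countries` is a hypothetical helper name, and it assumes the `neo4j` user and the password set in the Dockerfile. It relies on cypher-shell's `-u`, `-p`, and `--format plain` options and on passing the query as an argument.

```shell
#!/bin/sh
# count_countries CONTAINER PASSWORD
# Hypothetical helper: prints the number of Country nodes in the container.
count_countries() {
    container=$1
    password=$2
    # No -it here: the query is passed as an argument, so no terminal is needed.
    # --format plain prints a header line followed by the value; keep the last line.
    docker exec "$container" cypher-shell -u neo4j -p "$password" --format plain \
        "MATCH (c:Country) RETURN count(c) AS n;" | tail -n 1
}

# Example: count_countries 6c848fee3c72 neo4j_seed
```

A script can then compare the printed count with the number of data rows in countries.csv and fail the deployment if they differ.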
Happy Graphing …..