Accessing the PuppyGraph Web UI requires a browser. However, the tutorial offers alternative instructions for those who wish to exclusively use the CLI.
Deployment
▶️ Create a file docker-compose.yaml with the following content.
▶️ Then run the following command to start Iceberg services and PuppyGraph:
docker compose up -d
[+] Running 6/6
✔ Network puppy-iceberg Created
✔ Container minio Started
✔ Container mc Started
✔ Container iceberg-rest Started
✔ Container spark-iceberg Started
✔ Container puppygraph Started
Data Preparation
This tutorial is designed to be comprehensive and standalone, so it includes steps to populate data in Iceberg. In practical scenarios, PuppyGraph can query data directly from your existing Iceberg tables.
▶️ Run the following command to start a Spark-SQL shell to access Iceberg.
docker exec -it spark-iceberg spark-sql
The shell prompt looks like this:
spark-sql ()>
▶️ Then execute the following SQL statements in the shell to create tables and insert data:
CREATE DATABASE demo.modern;

CREATE EXTERNAL TABLE demo.modern.v_person (
  id string,
  name string,
  age int
) USING iceberg;

INSERT INTO demo.modern.v_person VALUES
  ('v1', 'marko', 29),
  ('v2', 'vadas', 27),
  ('v4', 'josh', 32),
  ('v6', 'peter', 35);

CREATE EXTERNAL TABLE demo.modern.v_software (
  id string,
  name string,
  lang string
) USING iceberg;

INSERT INTO demo.modern.v_software VALUES
  ('v3', 'lop', 'java'),
  ('v5', 'ripple', 'java');

CREATE EXTERNAL TABLE demo.modern.e_created (
  id string,
  from_id string,
  to_id string,
  weight double
) USING iceberg;

INSERT INTO demo.modern.e_created VALUES
  ('e9', 'v1', 'v3', 0.4),
  ('e10', 'v4', 'v5', 1.0),
  ('e11', 'v4', 'v3', 0.4),
  ('e12', 'v6', 'v3', 0.2);

CREATE EXTERNAL TABLE demo.modern.e_knows (
  id string,
  from_id string,
  to_id string,
  weight double
) USING iceberg;

INSERT INTO demo.modern.e_knows VALUES
  ('e7', 'v1', 'v2', 0.5),
  ('e8', 'v1', 'v4', 1.0);
The above SQL creates the following tables:
v_person
id    name     age
v1    marko    29
v2    vadas    27
v4    josh     32
v6    peter    35

v_software
id    name     lang
v3    lop      java
v5    ripple   java

e_knows
id    from_id  to_id  weight
e7    v1       v2     0.5
e8    v1       v4     1.0

e_created
id    from_id  to_id  weight
e9    v1       v3     0.4
e10   v4       v5     1.0
e11   v4       v3     0.4
e12   v6       v3     0.2
Modeling a Graph
Next, we define a graph on top of the data tables we just created. This is the "Modern" graph defined by Apache TinkerPop.
A schema tells PuppyGraph how to map data from Iceberg into a graph. PuppyGraph offers several methods for creating a schema. For this tutorial, we have already prepared one to save time.
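To make the idea of mapping tables to a graph concrete, here is a minimal sketch in plain Python. It is not PuppyGraph's schema format and does not use any PuppyGraph API. It only shows how rows of the `v_*` tables can be read as vertices and rows of the `e_*` tables as edges whose `from_id` and `to_id` point at vertex ids. The labels `person`, `software`, `knows`, and `created` are assumptions inferred from the table names.

```python
# Rows copied from the tables created earlier in this tutorial.
v_person = [("v1", "marko", 29), ("v2", "vadas", 27),
            ("v4", "josh", 32), ("v6", "peter", 35)]
v_software = [("v3", "lop", "java"), ("v5", "ripple", "java")]
e_knows = [("e7", "v1", "v2", 0.5), ("e8", "v1", "v4", 1.0)]
e_created = [("e9", "v1", "v3", 0.4), ("e10", "v4", "v5", 1.0),
             ("e11", "v4", "v3", 0.4), ("e12", "v6", "v3", 0.2)]

# Vertices: one per row of a vertex table, keyed by id.
# The labels are assumed from the table names, not read from a schema.
vertices = {}
for vid, name, age in v_person:
    vertices[vid] = {"label": "person", "name": name, "age": age}
for vid, name, lang in v_software:
    vertices[vid] = {"label": "software", "name": name, "lang": lang}

# Edges: one per row of an edge table; from_id/to_id reference vertex ids.
edges = []
for label, rows in (("knows", e_knows), ("created", e_created)):
    for eid, from_id, to_id, weight in rows:
        edges.append({"id": eid, "label": label,
                      "from": from_id, "to": to_id, "weight": weight})

# Sanity check: every edge endpoint should resolve to a known vertex.
dangling = [e["id"] for e in edges
            if e["from"] not in vertices or e["to"] not in vertices]
print(len(vertices), "vertices,", len(edges), "edges, dangling:", dangling)
```

With the tutorial data this reports 6 vertices, 6 edges, and no dangling edge endpoints, which is a quick way to confirm the tables are consistent before defining a graph over them.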
▶️ Create a PuppyGraph schema file schema.json with the following content:
The response shows that the graph schema has been uploaded successfully:
{"Status":"OK","Message":"Schema uploaded and gremlin server restarted"}
Querying the Graph
In this tutorial we will use the Gremlin query language to query the graph. Gremlin is a graph query language developed by Apache TinkerPop. Prior knowledge of Gremlin is not necessary to follow the tutorial. To learn more about it, visit https://tinkerpop.apache.org/gremlin.html.
▶️ Click on the Query panel on the left side. The Gremlin Query tab offers an interactive environment for querying the graph using Gremlin.
Queries are entered on the left side, and the right side displays the graph visualization.
The first query retrieves the properties of the person named "marko".
▶️ Copy the following query, paste it in the query input, and click on the run button.
g.V().has("name", "marko").valueMap()
The output is plain text like the following:
Rows: 1
age 29
name marko
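For readers new to Gremlin, the following plain-Python sketch loosely mirrors what the query does on the tutorial data: start from all vertices, keep those whose `name` is "marko", and list their properties. It is an illustration only and does not reflect how PuppyGraph or Gremlin execute the query. The helper `value_map` is a hypothetical name chosen for this sketch.

```python
# All vertices from the v_person and v_software tables of this tutorial.
vertices = [
    {"id": "v1", "name": "marko", "age": 29},
    {"id": "v2", "name": "vadas", "age": 27},
    {"id": "v4", "name": "josh", "age": 32},
    {"id": "v6", "name": "peter", "age": 35},
    {"id": "v3", "name": "lop", "lang": "java"},
    {"id": "v5", "name": "ripple", "lang": "java"},
]


def value_map(vertex):
    """Hypothetical helper: return a vertex's properties without its id."""
    return {key: value for key, value in vertex.items() if key != "id"}


# Rough equivalent of g.V().has("name", "marko").valueMap()
matches = [value_map(v) for v in vertices if v.get("name") == "marko"]

print("Rows:", len(matches))
for row in matches:
    for key in sorted(row):
        print(key, row[key])
```

Run against the tutorial data, the sketch prints one row with `age 29` and `name marko`, matching the output shown above.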
Now let's also leverage the visualization. The next query gets all the software created by people known to "marko".
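The two-hop traversal described here (from "marko" to the people he knows, then to the software they created) can be sketched in plain Python over the tutorial's edge tables. This is an illustration under simplifying assumptions, not PuppyGraph's execution model. The helper `out` is a hypothetical name for this sketch, and edge weights are omitted.

```python
# Vertex id -> name, from the v_person and v_software tables.
names = {"v1": "marko", "v2": "vadas", "v4": "josh", "v6": "peter",
         "v3": "lop", "v5": "ripple"}

# (from_id, to_id) pairs from the e_knows and e_created tables.
knows = [("v1", "v2"), ("v1", "v4")]
created = [("v1", "v3"), ("v4", "v5"), ("v4", "v3"), ("v6", "v3")]


def out(edge_rows, sources):
    """Hypothetical helper: follow edges outward from the given vertex ids."""
    return [to_id for from_id, to_id in edge_rows if from_id in sources]


start = [vid for vid, name in names.items() if name == "marko"]
friends = out(knows, start)        # people known to marko
software = out(created, friends)   # software those people created

print(sorted(names[v] for v in software))  # prints ['lop', 'ripple']
```

Note that "lop" is reached through "josh", not through "marko" directly: the traversal only follows `created` edges that start at the people marko knows.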
▶️ Copy the following query, paste it in the query input, and click on the run button.