Querying Trino Data as a Graph
Summary
In this tutorial, you will:
- Set up a Trino query engine with an Apache Iceberg backend and load it with example data;
- Start a PuppyGraph Docker container and query the Trino data as a graph.
Prerequisites
Please ensure that docker compose is available. The installation can be verified by running:
docker compose version
See https://docs.docker.com/compose/install/ for Docker Compose installation instructions and https://www.docker.com/get-started/ for more details on Docker.
Accessing the PuppyGraph Web UI requires a browser. However, the tutorial offers alternative instructions for those who wish to exclusively use the CLI.
Deployment
Create a file docker-compose.yaml with the following content:
services:
  trino:
    image: trinodb/trino:479
    container_name: trino
    networks:
      demo_net:
    ports:
      - 8080:8080
    volumes:
      - ./iceberg.properties:/etc/trino/catalog/iceberg.properties
  puppygraph:
    image: puppygraph/puppygraph:stable
    container_name: puppygraph
    networks:
      demo_net:
    environment:
      - PUPPYGRAPH_USERNAME=puppygraph
      - PUPPYGRAPH_PASSWORD=puppygraph123
    ports:
      - "8081:8081"
      - "8182:8182"
      - "7687:7687"
  rest:
    image: apache/iceberg-rest-fixture:1.10.1
    container_name: iceberg-rest
    networks:
      demo_net:
    ports:
      - 8181:8181
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
      - CATALOG_WAREHOUSE=s3://warehouse/
      - CATALOG_IO__IMPL=org.apache.iceberg.aws.s3.S3FileIO
      - CATALOG_S3_ENDPOINT=http://minio:9000
  minio:
    image: minio/minio:RELEASE.2025-07-23T15-54-02Z
    container_name: minio
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=password
      - MINIO_DOMAIN=minio
    networks:
      demo_net:
        aliases:
          - warehouse.minio
    ports:
      - 9001:9001
      - 9000:9000
    command: ["server", "/data", "--console-address", ":9001"]
  mc:
    depends_on:
      - minio
    image: minio/mc:RELEASE.2025-08-13T08-35-41Z
    container_name: mc
    networks:
      demo_net:
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
    entrypoint: >
      /bin/sh -c "
      until (/usr/bin/mc alias set minio http://minio:9000 admin password) do echo '...waiting...' && sleep 1; done;
      /usr/bin/mc rm -r --force minio/warehouse;
      /usr/bin/mc mb minio/warehouse;
      /usr/bin/mc anonymous set public minio/warehouse;
      tail -f /dev/null
      "
networks:
  demo_net:
Create the Trino Iceberg catalog configuration file iceberg.properties:
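The file needs to point Trino's Iceberg connector at the REST catalog and the MinIO object store defined in docker-compose.yaml. A minimal sketch of such a file follows. The property names are those of Trino's Iceberg REST catalog and native S3 file system support as best understood here, and the values are assumptions derived from the service names, ports, and credentials in the compose file, so check them against the Trino documentation for the version in use.

```properties
# Assumed settings, derived from the services in docker-compose.yaml.
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=http://rest:8181
iceberg.rest-catalog.warehouse=s3://warehouse/
# Native S3 file system pointed at the MinIO container (assumption).
fs.native-s3.enabled=true
s3.endpoint=http://minio:9000
s3.region=us-east-1
s3.path-style-access=true
s3.aws-access-key=admin
s3.aws-secret-key=password
```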
Then run the following command to start Trino and PuppyGraph, together with the supporting Iceberg REST catalog and MinIO services:
docker compose up -d
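Trino can take some time to finish starting after its container comes up, and SQL statements issued too early may fail. The sketch below polls until the coordinator reports that it is ready. It assumes Trino's node info endpoint at /v1/info on the published port 8080 and the "starting" field in its response; the function names and retry settings are illustrative and not part of Trino or PuppyGraph.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_trino_info(url: str = "http://localhost:8080/v1/info") -> dict:
    """Fetch Trino's node info document (assumed endpoint) as a dict."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(fetch: Callable[[], dict] = fetch_trino_info,
                     attempts: int = 60,
                     delay_s: float = 2.0) -> bool:
    """Poll until the info document reports starting == false, or give up."""
    for _ in range(attempts):
        try:
            if fetch().get("starting") is False:
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON reply: not up yet.
            pass
        time.sleep(delay_s)
    return False
```

Calling wait_until_ready() with no arguments polls the local Trino container for up to about two minutes and returns True once it is ready.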
Data Preparation
This tutorial is designed to be comprehensive and standalone, so it includes steps to populate data in Trino. In practical scenarios, PuppyGraph can query data directly from your existing Trino databases.
Run the following command to start a Trino shell to access the database:
docker exec -it trino trino
Then execute the following SQL statements in the shell to create tables and insert data:
CREATE SCHEMA IF NOT EXISTS iceberg.modern;

CREATE TABLE iceberg.modern.person (
  id VARCHAR(10),
  name VARCHAR(50),
  age INTEGER
);
INSERT INTO iceberg.modern.person VALUES
  ('v1', 'marko', 29),
  ('v2', 'vadas', 27),
  ('v4', 'josh', 32),
  ('v6', 'peter', 35);

CREATE TABLE iceberg.modern.software (
  id VARCHAR(10),
  name VARCHAR(50),
  lang VARCHAR(50)
);
INSERT INTO iceberg.modern.software VALUES
  ('v3', 'lop', 'java'),
  ('v5', 'ripple', 'java');

CREATE TABLE iceberg.modern.created (
  id VARCHAR(10),
  from_id VARCHAR(10),
  to_id VARCHAR(10),
  weight DOUBLE PRECISION
);
INSERT INTO iceberg.modern.created VALUES
  ('e9', 'v1', 'v3', 0.4),
  ('e10', 'v4', 'v5', 1.0),
  ('e11', 'v4', 'v3', 0.4),
  ('e12', 'v6', 'v3', 0.2);

CREATE TABLE iceberg.modern.knows (
  id VARCHAR(10),
  from_id VARCHAR(10),
  to_id VARCHAR(10),
  weight DOUBLE PRECISION
);
INSERT INTO iceberg.modern.knows VALUES
  ('e7', 'v1', 'v2', 0.5),
  ('e8', 'v1', 'v4', 1.0);
The above SQL creates the following tables:
person
| id | name | age |
|---|---|---|
| v1 | marko | 29 |
| v2 | vadas | 27 |
| v4 | josh | 32 |
| v6 | peter | 35 |
software
| id | name | lang |
|---|---|---|
| v3 | lop | java |
| v5 | ripple | java |
knows
| id | from_id | to_id | weight |
|---|---|---|---|
| e7 | v1 | v2 | 0.5 |
| e8 | v1 | v4 | 1.0 |
created
| id | from_id | to_id | weight |
|---|---|---|---|
| e9 | v1 | v3 | 0.4 |
| e10 | v4 | v5 | 1.0 |
| e11 | v4 | v3 | 0.4 |
| e12 | v6 | v3 | 0.2 |
Modeling a Graph
We then define a graph on top of the data tables we just created. This is the "Modern" graph defined by Apache TinkerPop.

Modern Graph
A schema instructs PuppyGraph on mapping data from Trino into a graph. PuppyGraph offers various methods for schema creation. For this tutorial, we've already prepared a schema to help save time.
Create a PuppyGraph schema file schema.json with the following content:
{
  "catalogs": [
    {
      "name": "trino_data",
      "type": "trino",
      "jdbc": {
        "username": "trino",
        "jdbcUri": "jdbc:trino://trino:8080/iceberg"
      }
    }
  ],
  "graph": {
    "vertices": [
      {
        "label": "person",
        "oneToOne": {
          "tableSource": { "catalog": "trino_data", "schema": "modern", "table": "person" },
          "id": {
            "fields": [
              { "type": "String", "field": "id", "alias": "id" }
            ]
          },
          "attributes": [
            { "type": "Int", "field": "age", "alias": "age" },
            { "type": "String", "field": "name", "alias": "name" }
          ]
        }
      },
      {
        "label": "software",
        "oneToOne": {
          "tableSource": { "catalog": "trino_data", "schema": "modern", "table": "software" },
          "id": {
            "fields": [
              { "type": "String", "field": "id", "alias": "id" }
            ]
          },
          "attributes": [
            { "type": "String", "field": "lang", "alias": "lang" },
            { "type": "String", "field": "name", "alias": "name" }
          ]
        }
      }
    ],
    "edges": [
      {
        "label": "knows",
        "fromVertex": "person",
        "toVertex": "person",
        "tableSource": { "catalog": "trino_data", "schema": "modern", "table": "knows" },
        "id": {
          "fields": [
            { "type": "String", "field": "id", "alias": "id" }
          ]
        },
        "fromId": {
          "fields": [
            { "type": "String", "field": "from_id", "alias": "from_id" }
          ]
        },
        "toId": {
          "fields": [
            { "type": "String", "field": "to_id", "alias": "to_id" }
          ]
        },
        "attributes": [
          { "type": "Double", "field": "weight", "alias": "weight" }
        ]
      },
      {
        "label": "created",
        "fromVertex": "person",
        "toVertex": "software",
        "tableSource": { "catalog": "trino_data", "schema": "modern", "table": "created" },
        "id": {
          "fields": [
            { "type": "String", "field": "id", "alias": "id" }
          ]
        },
        "fromId": {
          "fields": [
            { "type": "String", "field": "from_id", "alias": "from_id" }
          ]
        },
        "toId": {
          "fields": [
            { "type": "String", "field": "to_id", "alias": "to_id" }
          ]
        },
        "attributes": [
          { "type": "Double", "field": "weight", "alias": "weight" }
        ]
      }
    ]
  }
}
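Before uploading, it can help to check that the schema is internally consistent. The following is a minimal sketch and not a PuppyGraph tool: check_schema is a hypothetical helper that only verifies, based on the structure of the schema.json shown above, that every tableSource names a declared catalog and that every edge's fromVertex and toVertex name a declared vertex label.

```python
def check_schema(schema: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    catalogs = {c["name"] for c in schema.get("catalogs", [])}
    graph = schema.get("graph", {})
    vertices = graph.get("vertices", [])
    labels = {v["label"] for v in vertices}

    # Each vertex table must come from a declared catalog.
    for v in vertices:
        catalog = v["oneToOne"]["tableSource"]["catalog"]
        if catalog not in catalogs:
            problems.append(f"vertex {v['label']}: unknown catalog '{catalog}'")

    # Each edge table must come from a declared catalog, and both
    # endpoints must refer to vertex labels defined in the schema.
    for e in graph.get("edges", []):
        catalog = e["tableSource"]["catalog"]
        if catalog not in catalogs:
            problems.append(f"edge {e['label']}: unknown catalog '{catalog}'")
        for key in ("fromVertex", "toVertex"):
            if e[key] not in labels:
                problems.append(
                    f"edge {e['label']}: {key} '{e[key]}' is not a defined vertex label"
                )
    return problems


# Usage sketch (assumes schema.json is in the current directory):
#   import json
#   with open("schema.json") as f:
#       print(check_schema(json.load(f)) or "no problems found")
```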
Log in to the PuppyGraph Web UI at http://localhost:8081 with username puppygraph and password puppygraph123.

PuppyGraph Login
Upload the schema by selecting the file schema.json in the Upload Graph Schema JSON block and clicking Upload.

Upload Schema Page
Once the schema is uploaded, the schema page shows the visualized graph schema as follows.

Visualized Schema
Alternative: Schema Uploading via CLI
Alternatively, run the following command to upload the schema file:
curl -XPOST -H "content-type: application/json" --data-binary @./schema.json --user "puppygraph:puppygraph123" localhost:8081/schema
The response shows that the graph schema has been uploaded successfully:
{"Status":"OK","Message":"Schema updated and PuppyGraph server restarted","Updated":true,"InplaceUpdate":false}
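The same upload can be scripted. The sketch below mirrors the curl command using only the Python standard library. build_upload_request is a hypothetical helper, and the URL, credentials, and content type are taken from the curl command above.

```python
import base64
import urllib.request


def build_upload_request(schema_bytes: bytes,
                         base_url: str = "http://localhost:8081",
                         username: str = "puppygraph",
                         password: str = "puppygraph123") -> urllib.request.Request:
    """Build (but do not send) a POST /schema request with HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/schema",
        data=schema_bytes,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Usage sketch (requires the PuppyGraph container to be running):
#   with open("schema.json", "rb") as f:
#       request = build_upload_request(f.read())
#   with urllib.request.urlopen(request) as resp:
#       print(resp.read().decode())
```

Separating request construction from sending keeps the helper easy to inspect without a running server.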
Querying the Graph
In this tutorial, we will use the Gremlin and Cypher query languages to interact with the graph. Gremlin, developed by Apache TinkerPop, and Cypher, designed by Neo4j, are both powerful graph query languages. Prior knowledge of either language is not necessary to follow this tutorial. To learn more about Gremlin, visit https://tinkerpop.apache.org/gremlin.html. For information on Cypher, see https://neo4j.com/product/cypher-graph-query-language/.
Click on the Query panel on the left side. The Gremlin Query tab offers an interactive environment for querying the graph using Gremlin.

Interactive Gremlin Query Page
Queries are entered on the left side, and the right side displays the graph visualization.
The first query retrieves the properties of the person named "marko".
Copy the following query, paste it in the query input, and click on the run button.
g.V().has("name", "marko").valueMap()
The output is plain text listing marko's properties: the name "marko" and the age 29.
Now let's also use the visualization. The next query gets all the software created by people known to "marko".
Copy the following query, paste it in the query input, and click on the run button.
g.V().has("name", "marko").out("knows").out("created").path()
The output is as follows. There are two paths in the result as "marko" knows "josh" who created "lop" and "ripple".

Interactive Query with Results
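To make the result concrete, the same two-hop traversal can be reproduced by hand over the rows inserted in the Data Preparation step. This is only an illustration of what the graph query computes, not how PuppyGraph executes it; the dictionaries and helper functions below are hypothetical names introduced for this sketch.

```python
# Rows copied from the person, software, knows, and created tables.
PERSON = {
    "v1": {"name": "marko", "age": 29},
    "v2": {"name": "vadas", "age": 27},
    "v4": {"name": "josh", "age": 32},
    "v6": {"name": "peter", "age": 35},
}
SOFTWARE = {
    "v3": {"name": "lop", "lang": "java"},
    "v5": {"name": "ripple", "lang": "java"},
}
KNOWS = [("v1", "v2"), ("v1", "v4")]                                # (from_id, to_id)
CREATED = [("v1", "v3"), ("v4", "v5"), ("v4", "v3"), ("v6", "v3")]  # (from_id, to_id)


def out(edges, vertex_id):
    """Follow the outgoing edges of one edge table from vertex_id."""
    return [dst for src, dst in edges if src == vertex_id]


def properties_of(person_name):
    """Properties of every person with the given name."""
    return [props for props in PERSON.values() if props["name"] == person_name]


def software_via_friends(person_name):
    """Paths person -knows-> person -created-> software, as name triples."""
    paths = []
    for person_id, props in PERSON.items():
        if props["name"] != person_name:
            continue
        for friend_id in out(KNOWS, person_id):
            for software_id in out(CREATED, friend_id):
                paths.append((props["name"],
                              PERSON[friend_id]["name"],
                              SOFTWARE[software_id]["name"]))
    return paths


print(properties_of("marko"))
for path in software_via_friends("marko"):
    print(" -> ".join(path))
```

Running it prints marko's properties followed by two paths, one ending at ripple and one at lop. Both pass through josh, because vadas created no software.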
Alternative: Querying the graph via CLI
Alternatively, we can query the graph via CLI.
Execute the following command to access the PuppyGraph CLI.
docker exec -it puppygraph ./bin/console
The welcome screen appears as follows:
  ____                                     ____                          _
 |  _ \   _   _   _ __    _ __    _   _   / ___|  _ __    __ _   _ __   | |__
 | |_) | | | | | | '_ \  | '_ \  | | | | | |  _  | '__|  / _` | | '_ \  | '_ \
 |  __/  | |_| | | |_) | | |_) | | |_| | | |_| | | |    | (_| | | |_) | | | | |
 |_|      \__,_| | .__/  | .__/   \__, |  \____| |_|     \__,_| | .__/  |_| |_|
                 |_|     |_|      |___/                         |_|
Welcome to PuppyGraph, type help to see the command list
[PuppyGraph]>
console         access PuppyGraph Gremlin Console
cypher-console  access PuppyGraph Cypher Console
groovy          access console to run complex groovy scripts
exit            exit PuppyGraph
help            show the command list
Access the PuppyGraph Gremlin Console through the console command.
[PuppyGraph]> console
  ____                                     ____                          _
 |  _ \   _   _   _ __    _ __    _   _   / ___|  _ __    __ _   _ __   | |__
 | |_) | | | | | | '_ \  | '_ \  | | | | | |  _  | '__|  / _` | | '_ \  | '_ \
 |  __/  | |_| | | |_) | | |_) | | |_| | | |_| | | |    | (_| | | |_) | | | | |
 |_|      \__,_| | .__/  | .__/   \__, |  \____| |_|     \__,_| | .__/  |_| |_|
                 |_|     |_|      |___/                         |_|
Welcome to PuppyGraph!
version: 0.11
To Learn more about the graph schema:
- Use graph.show() to list all the node (vertex) and edge labels.
- Use graph.show('$FOO') to list all the node (vertex) and edge labels related to $FOO.
- Use graph.describe('$BAR') to list all the attributes of the label $BAR.
See https://tinkerpop.apache.org/gremlin.html to learn more about the Gremlin query language.
Here are some example queries for exploring the graph:
- Use g.V() to list all the nodes (vertices).
- Use g.E() to list all the edges.
- Use g.V().count() to get the total number of nodes (vertices).
- Use g.E().count() to get the total number of edges.
- Use g.V('$ID').out() to find out nodes (vertices) that are reachable in 1-hop from the node (vertex) $ID. For example, g.V('person[v1]').out() will find out 1-hop reachable nodes (vertices) from 'person[v1]'.
- Use g.V('$ID').out().out() similarly to find out 2-hop reachable nodes (vertices) from the node (vertex) $ID.
puppy-gremlin>
Run the following queries in the console to query the graph.
Properties of the person named "marko":
puppy-gremlin> g.V().has("name", "marko").valueMap()
Exit the Gremlin console with :x, then access the Cypher console with the cypher-console command.
Make sure to start each Cypher query with the prefix :>.
puppy-cypher> :> MATCH (v) RETURN count(*)
==>[count(*):6]
puppy-cypher> :> MATCH (v) RETURN v
==>[v:[_type:node,name:peter,_id:person[v6],_label:person,age:35]]
==>[v:[_type:node,name:vadas,_id:person[v2],_label:person,age:27]]
==>[v:[_type:node,name:josh,_id:person[v4],_label:person,age:32]]
==>[v:[_type:node,name:marko,_id:person[v1],_label:person,age:29]]
==>[v:[_type:node,name:ripple,_id:software[v5],lang:java,_label:software]]
==>[v:[_type:node,name:lop,_id:software[v3],lang:java,_label:software]]
Exit the Cypher console with :x, then exit the CLI with the exit command.
Cleaning up
Run the following command to shut down and remove the services:
docker compose down