Querying Nessie and Minio Data as a Graph with TLS
Summary
In this tutorial, you will:
- Create a Nessie-backed Apache Iceberg data lake and load it with example data.
- Deploy TLS-enabled MinIO, Nessie, Spark-Iceberg, and PuppyGraph services using Docker Compose.
- Configure each service to communicate securely via HTTPS.
- Use PuppyGraph to query and visualize Nessie data as a graph.
Note: For the non-TLS version, see Querying Nessie Data as a Graph.
Prerequisites
Docker
Docker is required to run the PuppyGraph server. You can download Docker from here.
Please ensure that docker compose is available. The installation can be verified by running:
docker compose version
OpenSSL
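OpenSSL is needed to generate TLS certificates for secure communications. The certificate script below also calls keytool, which ships with a Java JDK, to build the CA truststore.
For Ubuntu, you can install OpenSSL using the following commands:
sudo apt update
sudo apt install openssl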
Accessing the PuppyGraph Web UI requires a browser. However, the tutorial offers alternative instructions for those who wish to exclusively use the CLI.
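The tools this tutorial relies on can be checked in one pass before continuing. The sketch below is not part of the original setup: missing_tools is a hypothetical helper that prints every named command it cannot find on PATH. keytool is included in the check because the certificate script in the next section calls it.

```shell
#!/usr/bin/env bash
# missing_tools (hypothetical helper): print each named command that is
# not available on PATH, one per line. Prints nothing if all are present.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Tools used in this tutorial: Docker for the services, OpenSSL and
# keytool for the certificate script.
missing=$(missing_tools docker openssl keytool)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

The check only confirms that the commands exist; it does not validate their versions.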
TLS Certificate Setup
To enable secure communications, we generate certificates for MinIO and Nessie.
1. Create a Shell Script
Create a file named generate_crt.sh with the following content.
generate_crt.sh
#!/usr/bin/env bash
set -euo pipefail
# ----------------------------
# Configuration parameters
# ----------------------------
# Base directories
CERT_ROOT="${CERT_ROOT:-./certs}"
CA_DIR="${CA_DIR:-${CERT_ROOT}/ca}"
MINIO_DIR="${MINIO_DIR:-${CERT_ROOT}/minio}"
NESSIE_DIR="${NESSIE_DIR:-${CERT_ROOT}/nessie}"
# Validity periods (in days)
CA_DAYS="${CA_DAYS:-3650}"
SERVER_DAYS="${SERVER_DAYS:-365}"
# Password settings
CA_PASSWORD="${CA_PASSWORD:-ca_password}"
NESSIE_PASSWORD="${NESSIE_PASSWORD:-nessie_password}"
# Common Names
CA_CN="${CA_CN:-MyRootCA}"
MINIO_CN="${MINIO_CN:-minio}"
NESSIE_CN="${NESSIE_CN:-nessie}"
# Subject Alternative Names
MINIO_SANS=( ${MINIO_SANS:-minio localhost 127.0.0.1} )
NESSIE_SANS=( ${NESSIE_SANS:-nessie localhost 127.0.0.1} )
# Helper: build SAN entries for OpenSSL config
build_san() {
  local sans=("$@")
  local idx=1
  for san in "${sans[@]}"; do
    if [[ $san =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
      echo "IP.$idx = $san"
    else
      echo "DNS.$idx = $san"
    fi
    ((idx++))
  done
}
# Clean and prepare directories
rm -rf "$CERT_ROOT"
mkdir -p "$CA_DIR" "$MINIO_DIR" "$NESSIE_DIR"
# 1. Generate Root CA
echo "[1/4] Generating Root CA..."
openssl genrsa -out "$CA_DIR/ca.key" 2048
cat > "$CA_DIR/ca.cnf" <<EOL
[req]
distinguished_name = req_distinguished_name
x509_extensions = v3_ca
prompt = no
[req_distinguished_name]
C = US
ST = State
L = City
O = Organization
OU = Unit
CN = $CA_CN
[v3_ca]
basicConstraints = critical, CA:TRUE
keyUsage = critical, keyCertSign, cRLSign
EOL
openssl req -new -x509 -days "$CA_DAYS" \
  -key "$CA_DIR/ca.key" \
  -out "$CA_DIR/ca.crt" \
  -config "$CA_DIR/ca.cnf"
# 2. Generate MinIO Server Certificate
echo "[2/4] Generating MinIO certificate..."
cat > "$MINIO_DIR/minio.cnf" <<EOL
[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req
prompt = no
[req_distinguished_name]
C = US
ST = State
L = City
O = Organization
OU = Unit
CN = $MINIO_CN
[v3_req]
subjectAltName = @alt_names
[alt_names]
$(build_san "${MINIO_SANS[@]}")
EOL
openssl genrsa -out "$MINIO_DIR/private.key" 2048
openssl req -new -key "$MINIO_DIR/private.key" \
  -out "$MINIO_DIR/minio.csr" \
  -config "$MINIO_DIR/minio.cnf"
openssl x509 -req -days "$SERVER_DAYS" \
  -in "$MINIO_DIR/minio.csr" \
  -CA "$CA_DIR/ca.crt" \
  -CAkey "$CA_DIR/ca.key" \
  -CAcreateserial \
  -out "$MINIO_DIR/public.crt" \
  -extfile "$MINIO_DIR/minio.cnf" \
  -extensions v3_req
# 3. Create CA truststore (PKCS12)
echo "[3/4] Creating CA truststore (PKCS12)..."
keytool -import -trustcacerts -noprompt \
  -alias root-ca \
  -file "$CA_DIR/ca.crt" \
  -keystore "$CA_DIR/ca-truststore.p12" \
  -storetype PKCS12 \
  -storepass "$CA_PASSWORD"
# 4. Generate Nessie Server Certificate
echo "[4/4] Generating Nessie certificate..."
cat > "$NESSIE_DIR/nessie.cnf" <<EOL
[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req
prompt = no
[req_distinguished_name]
C = US
ST = State
L = City
O = Organization
OU = Unit
CN = $NESSIE_CN
[v3_req]
subjectAltName = @alt_names
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
[alt_names]
$(build_san "${NESSIE_SANS[@]}")
EOL
openssl genrsa -out "$NESSIE_DIR/private.key" 2048
openssl req -new -key "$NESSIE_DIR/private.key" \
  -out "$NESSIE_DIR/nessie.csr" \
  -config "$NESSIE_DIR/nessie.cnf"
openssl x509 -req -days "$SERVER_DAYS" \
  -in "$NESSIE_DIR/nessie.csr" \
  -CA "$CA_DIR/ca.crt" \
  -CAkey "$CA_DIR/ca.key" \
  -CAcreateserial \
  -out "$NESSIE_DIR/nessie.crt" \
  -extfile "$NESSIE_DIR/nessie.cnf" \
  -extensions v3_req
# Create Nessie keystore (PKCS12)
echo "Creating Nessie keystore (PKCS12)..."
openssl pkcs12 -export \
  -in "$NESSIE_DIR/nessie.crt" \
  -inkey "$NESSIE_DIR/private.key" \
  -name "nessie-cert" \
  -out "$NESSIE_DIR/nessie.p12" \
  -passout "pass:$NESSIE_PASSWORD"
# Tighten permissions
chmod 600 "$CA_DIR/ca.key" "$MINIO_DIR/private.key" "$NESSIE_DIR/private.key"
chmod 644 "$NESSIE_DIR/nessie.p12"
echo -e "\nCertificate generation complete!"
echo "Generated files:"
echo " CA: $CA_DIR/ca.crt"
echo " Truststore (PKCS12): $CA_DIR/ca-truststore.p12"
echo " MinIO cert: $MINIO_DIR/public.crt"
echo " MinIO key: $MINIO_DIR/private.key"
echo " Nessie cert: $NESSIE_DIR/nessie.crt"
echo " Nessie key: $NESSIE_DIR/private.key"
echo " Nessie keystore: $NESSIE_DIR/nessie.p12"
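To see what the build_san helper contributes to the OpenSSL configs, the function can be run on its own. The sketch below copies it from the script and calls it with the default MinIO SAN list: IPv4 literals become IP entries, everything else becomes DNS entries, and both kinds share one counter.

```shell
#!/usr/bin/env bash
# Copy of build_san from generate_crt.sh, indented for readability.
build_san() {
  local sans=("$@")
  local idx=1
  for san in "${sans[@]}"; do
    if [[ $san =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
      echo "IP.$idx = $san"
    else
      echo "DNS.$idx = $san"
    fi
    ((idx++))
  done
}

build_san minio localhost 127.0.0.1
# Prints:
# DNS.1 = minio
# DNS.2 = localhost
# IP.3 = 127.0.0.1
```

Because the script reads MINIO_SANS and NESSIE_SANS from the environment as space-separated lists, extra names can be added without editing it, for example MINIO_SANS="minio localhost 127.0.0.1 minio.internal" (minio.internal is a made-up hostname used only for illustration).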
2. Generate the Certificate Files
Run the script to generate the server certificates and keys signed by your root CA:
bash generate_crt.sh
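3. Prepare TLS Certificates for Docker Volume Mounts
Copy the generated files into the directories that the Docker Compose file mounts into each container: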
mkdir -p ./minio_certs
cp ./certs/minio/public.crt ./minio_certs/
cp ./certs/minio/private.key ./minio_certs/
mkdir -p ./nessie_certs
cp ./certs/nessie/nessie.p12 ./nessie_certs/
cp ./certs/ca/ca-truststore.p12 ./nessie_certs/
mkdir -p ./spark_certs
cp ./certs/ca/ca-truststore.p12 ./spark_certs/
mkdir -p ./puppygraph
cp ./certs/minio/public.crt ./puppygraph/minio.crt
cp ./certs/nessie/nessie.crt ./puppygraph/nessie.crt
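Before starting the stack, it may help to confirm that each mount directory holds the files copied above. This is an optional sketch, not part of the original steps; report_missing is a hypothetical helper that prints any path that is not a regular file.

```shell
#!/usr/bin/env bash
# report_missing (hypothetical helper): print each path that is not an
# existing regular file; return non-zero if any path is missing.
report_missing() {
  local path
  local status=0
  for path in "$@"; do
    if [ ! -f "$path" ]; then
      echo "missing: $path"
      status=1
    fi
  done
  return "$status"
}

# Files staged by the copy commands in this step.
report_missing \
  ./minio_certs/public.crt ./minio_certs/private.key \
  ./nessie_certs/nessie.p12 ./nessie_certs/ca-truststore.p12 \
  ./spark_certs/ca-truststore.p12 \
  ./puppygraph/minio.crt ./puppygraph/nessie.crt \
  && echo "All certificate files are in place."
```

Run it from the directory that contains the certs folder, since all paths are relative.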
Deployment
1. Create the Docker Compose File
Create a file named docker-compose.yaml with the content below.
Notice that TLS parameters are integrated into the MinIO and Nessie service configurations.
docker-compose.yaml
services:
  spark-iceberg:
    image: tabulario/spark-iceberg
    container_name: spark-iceberg
    networks:
      iceberg_net:
    depends_on:
      - minio
      - nessie
    volumes:
      - ./spark_certs:/spark_certs
    environment:
      - AWS_ACCESS_KEY_ID=admin
      - AWS_SECRET_ACCESS_KEY=password
      - AWS_REGION=us-east-1
      - JAVA_TOOL_OPTIONS=-Djavax.net.ssl.trustStore=/spark_certs/ca-truststore.p12 -Djavax.net.ssl.trustStorePassword=ca_password
    ports:
      - 8888:8888
      - 8080:8080
      - 10000:10000
      - 10001:10001
  minio:
    image: quay.io/minio/minio:latest
    container_name: minio
    networks:
      iceberg_net:
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - ./minio_certs:/minio_certs
    environment:
      - MINIO_ROOT_USER=admin
      - MINIO_ROOT_PASSWORD=password
      - MINIO_REGION=us-east-1
    entrypoint: >
      /bin/sh -c "
      minio server /data --certs-dir /minio_certs --console-address ':9001' &
      sleep 5;
      mc alias set myminio https://localhost:9000 admin password --insecure;
      mc mb myminio/my-bucket --ignore-existing --insecure;
      tail -f /dev/null"
  nessie:
    image: ghcr.io/projectnessie/nessie:0.104.1
    container_name: nessie
    networks:
      iceberg_net:
    ports:
      - 19121:19121
    volumes:
      - ./nessie_certs:/nessie_certs
    environment:
      - nessie.version.store.type=IN_MEMORY
      - nessie.catalog.default-warehouse=warehouse
      - nessie.catalog.warehouses.warehouse.location=s3a://my-bucket/
      - nessie.catalog.service.s3.default-options.region=us-east-1
      - nessie.catalog.service.s3.default-options.endpoint=https://minio:9000
      - nessie.catalog.service.s3.default-options.path-style-access=true
      - nessie.catalog.service.s3.default-options.access-key=urn:nessie-secret:quarkus:nessie.catalog.secrets.access-key
      - nessie.catalog.secrets.access-key.name=admin
      - nessie.catalog.secrets.access-key.secret=password
      - nessie.server.authentication.enabled=false
      - quarkus.http.ssl-port=19121
      - quarkus.http.ssl.certificate.key-store-file=/nessie_certs/nessie.p12
      - quarkus.http.ssl.certificate.key-store-password=nessie_password
      - quarkus.http.ssl.certificate.key-store-type=PKCS12
      - JAVA_TOOL_OPTIONS=-Djavax.net.ssl.trustStore=/nessie_certs/ca-truststore.p12 -Djavax.net.ssl.trustStorePassword=ca_password
  puppygraph:
    image: puppygraph/puppygraph:stable
    container_name: puppygraph
    networks:
      iceberg_net:
    environment:
      - PUPPYGRAPH_USERNAME=puppygraph
      - PUPPYGRAPH_PASSWORD=puppygraph123
      - CERTIFICATE_BASEPATH=/home/ubuntu/tls-certs
    ports:
      - "8081:8081"
      - "8182:8182"
      - "7687:7687"
    volumes:
      - ./puppygraph:/home/ubuntu/tls-certs
    depends_on:
      - spark-iceberg
networks:
  iceberg_net:
2. Start the Services
Then run the following command to start the Nessie-backed Iceberg services and PuppyGraph:
docker compose up -d
The output should look similar to this:
[+] Running 5/5
✔ Network iceberg_net Created
✔ Container nessie Started
✔ Container minio Started
✔ Container spark-iceberg Started
✔ Container puppygraph Started
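A container can report Started before the server inside it accepts TLS connections, so a short wait may avoid confusing errors in the next section. The sketch below defines retry, a hypothetical helper that reruns a command until it succeeds or the attempts run out. The curl probes shown in the comments are assumptions: they rely on curl being installed on the host, on the published ports 9000 and 19121, and on the MinIO liveness endpoint /minio/health/live and the Nessie endpoint /api/v2/config.

```shell
#!/usr/bin/env bash
# retry (hypothetical helper): run a command until it succeeds.
#   $1 = maximum attempts, $2 = seconds between attempts, rest = command
retry() {
  local attempts=$1
  local delay=$2
  local i=1
  shift 2
  while :; do
    "$@" && return 0                       # success: stop retrying
    [ "$i" -ge "$attempts" ] && return 1   # attempts exhausted
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example probes (assumed endpoints), verified against the generated root CA:
# retry 30 2 curl -fsS --cacert ./certs/ca/ca.crt https://localhost:9000/minio/health/live
# retry 30 2 curl -fsS --cacert ./certs/ca/ca.crt -o /dev/null https://localhost:19121/api/v2/config
```

Passing --cacert instead of -k means the probes also confirm that each server presents a certificate signed by the root CA and valid for localhost, which is one of the SANs the script sets by default.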
Data Preparation
This tutorial is designed to be comprehensive and standalone, so it includes steps to populate data in Nessie. In practical scenarios, PuppyGraph can query data directly from your existing Nessie tables.
Run the following command to start a Spark-SQL shell connected to Nessie.
docker exec -it spark-iceberg spark-sql \
  --conf spark.sql.catalog.demo=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.demo.uri=https://nessie:19121/iceberg/ \
  --conf spark.sql.catalog.demo.warehouse=s3a://my-bucket/ \
  --conf spark.sql.catalog.demo.type=rest
Once the session starts, the shell displays a spark-sql prompt.
Then execute the following SQL statements in the shell to create tables and insert data:
CREATE DATABASE demo.modern;
CREATE EXTERNAL TABLE demo.modern.person (
id string,
name string,
age int
) USING iceberg;
INSERT INTO demo.modern.person VALUES
('v1', 'marko', 29),
('v2', 'vadas', 27),
('v4', 'josh', 32),
('v6', 'peter', 35);
CREATE EXTERNAL TABLE demo.modern.software (
id string,
name string,
lang string
) USING iceberg;
INSERT INTO demo.modern.software VALUES
('v3', 'lop', 'java'),
('v5', 'ripple', 'java');
CREATE EXTERNAL TABLE demo.modern.created (
id string,
from_id string,
to_id string,
weight double
) USING iceberg;
INSERT INTO demo.modern.created VALUES
('e9', 'v1', 'v3', 0.4),
('e10', 'v4', 'v5', 1.0),
('e11', 'v4', 'v3', 0.4),
('e12', 'v6', 'v3', 0.2);
CREATE EXTERNAL TABLE demo.modern.knows (
id string,
from_id string,
to_id string,
weight double
) USING iceberg;
INSERT INTO demo.modern.knows VALUES
('e7', 'v1', 'v2', 0.5),
('e8', 'v1', 'v4', 1.0);
The above SQL creates the following tables:

person
id | name | age |
---|---|---|
v1 | marko | 29 |
v2 | vadas | 27 |
v4 | josh | 32 |
v6 | peter | 35 |

software
id | name | lang |
---|---|---|
v3 | lop | java |
v5 | ripple | java |

knows
id | from_id | to_id | weight |
---|---|---|---|
e7 | v1 | v2 | 0.5 |
e8 | v1 | v4 | 1.0 |

created
id | from_id | to_id | weight |
---|---|---|---|
e9 | v1 | v3 | 0.4 |
e10 | v4 | v5 | 1.0 |
e11 | v4 | v3 | 0.4 |
e12 | v6 | v3 | 0.2 |
When finished, exit the Spark-SQL shell by entering:
quit;
Modeling the Graph
Step 1: Connecting to Nessie
Log into the PuppyGraph Web UI at http://localhost:8081 with username puppygraph and password puppygraph123.
Click on Create graph schema to create a new graph schema.
Fill in the fields as follows.

Create Nessie Catalog
Parameter | Value |
---|---|
Catalog type | Apache Iceberg |
Catalog name | Some name for the catalog as you like. |
Metastore Type | Iceberg-Rest |
RestUri | https://nessie:19121/iceberg (note the HTTPS). |
Warehouse | Same as nessie.catalog.warehouses.warehouse.location in docker-compose.yaml. |
Storage type | S3 Compatible |
Endpoint | https://minio:9000 (note the HTTPS). Same as nessie.catalog.service.s3.default-options.endpoint in docker-compose.yaml. |
Access key | Same as AWS_ACCESS_KEY_ID in docker-compose.yaml |
Secret key | Same as AWS_SECRET_ACCESS_KEY in docker-compose.yaml |
Enable SSL | true |
Enable path style access | true |
Click on Save, then click on Submit to connect to Nessie.
Step 2: Building the Graph Schema
In the Schema Builder, add the first node (vertex) to the graph from the table person.
After that, use the Auto Suggestion to create the other nodes and edges.
Select person as the start node (vertex) and add the auto-suggested nodes and edges.
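The graph schema should now contain two node types, person and software, and two edge types, knows (person to person) and created (person to software).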
Submit the schema to create the graph.
Step 3: Querying the Graph
PuppyGraph provides a Dashboard that gives a summary of the graph.
Use the Interactive Query UI to further explore the graph by sending queries.
Cleaning up
Run the following command to shut down and remove the services:
docker compose down --volumes --remove-orphans