PuppyGraph
Search
K

Connecting to PostgreSQL

It is also possible to query existing data on PostgreSQL directly without loading it to PuppyGraph. Here is an demo of the feature.
In the demo, the PostgreSQL data source stores people and referral information. To query the data as a graph, we model people as vertices and the referral relationship between people as edges.

Prerequisites

The demo assumes that PuppyGraph has been deployed at localhost according to the instruction in AMI Version or Docker Deployment.
In this demo, we use the username puppygraph and password 888888.

Data Preparation (Optional)

Person Table
Referral Table
ID
Age
Name
v1
29
marko
v2
27
vadas
RefID
Source
Referred
Weight
e1
v1
v2
0.5
The demo uses people and referral information as shown above.
The following command starts a PostgreSQL container through Docker and writes data to PostgreSQL. We assume that the IP address of the machine running the PostgreSQL container is 172.31.19.123
docker volume rm postgres-data
docker volume create postgres-data
docker run -p 5432:5432 --name postgres-server -v postgres-data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -d --restart=always postgres:15.3
After waiting for the PostgreSQL container to start, connect through PostgreSQL client。
# username = postgres
# passwd = postgres
psql -h 127.0.0.1 -p 5432 -U postgres -W
Now we can create a database and data table, and then write the data to PostgreSQL.
drop table if exists person, referral;
create table if not exists person
(
ID varchar(128) not null,
age int,
name varchar(128),
PRIMARY KEY (ID)
);
insert into person values ('v1', 29, 'marko'), ('v2', 27, 'vadas');
create table if not exists referral
(
refId varchar(128) not null,
source varchar(128),
referred varchar(128),
weight double precision,
PRIMARY KEY (refId)
);
insert into referral values ('e1', 'v1', 'v2', 0.5);

Upload the schema

Now the data are ready in PostgreSQL. We need a PuppyGraph schema before querying it. Let's create a schema file postgres.json:
postgresql.json
{
"catalogs": [
{
"name": "jdbc_postgres",
"type": "postgresql",
"jdbc": {
"username": "postgres",
"password": "postgres",
"jdbcUri": "jdbc:postgresql://172.31.19.123:5432/postgres",
"driverClass": "org.postgresql.Driver",
"driverUrl": "https://repo1.maven.org/maven2/org/postgresql/postgresql/42.3.3/postgresql-42.3.3.jar"
}
}
],
"vertices": [
{
"label": "person",
"mappedTableSource": {
"catalog": "jdbc_postgres",
"schema": "public",
"table": "person",
"metaFields": {
"id": "id"
}
},
"attributes": [
{
"type": "Int",
"name": "age"
},
{
"type": "String",
"name": "name"
}
]
}
],
"edges": [
{
"label": "knows",
"mappedTableSource": {
"catalog": "jdbc_postgres",
"schema": "public",
"table": "referral",
"metaFields": {
"id": "refId",
"from": "source",
"to": "referred"
}
},
"from": "person",
"to": "person",
"attributes": [
{
"type": "Double",
"name": "weight"
}
]
}
]
}
Here are some notes on this schema:
  1. 1.
    A catalog jdbc_postgres is added to specify the remote data source in PostgreSQL.
    • Set type to postgresql.
    • Set driverClass to org.postgresql.Driver.
    • driverUrl: Please provide a URL where PuppyGraph can find the PostgreSQL driver.
  2. 2.
    The label of the vertices and edges do not have to be the same as the names of corresponding tables in PostgreSQL. There is a mappedTableSource field in each of the vertex and edge types specifying the actual schema (public) and table (referral).
  3. 3.
    Additionally, the mappedTableSource marks meta columns in the tables. For example, the fieldsfrom and to describe which columns in the table form the endpoints of edges.
Now we can upload the schema file postgres.json to PuppyGraph with the following shell command, assuming that the PuppyGraph is running on localhost:
curl -XPOST -H "content-type: application/json" --data-binary @./postgres.json --user "puppygraph:888888" localhost:8081/schema

Query the data

Connecting to PuppyGraph at http://localhost:8081 and start gremlin console from the "Query" section:
[PuppyGraph]> console
\,,,/
(o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
Now we have connected to the Gremlin Console. We can query the graph:
gremlin> g.V().hasLabel("person").out("knows").values("name")
==>vadas