Connecting to MySQL

It is also possible to query existing data on MySQL directly without loading it to PuppyGraph. Here is an demo of the feature.

In the demo, the mysql data source stores people and referral information. To query the data as a graph, we model people as vertices and the referral relationship between people as edges.

Prerequisites

The demo assumes that PuppyGraph has been deployed at localhost according to the instruction in Launching PuppyGraph from AWS Marketplace or Launching PuppyGraph in Docker.

In this demo, we use the username puppygraph and password puppygraph123.

Data Preparation (Optional)

IDAgeName

v1

29

marko

v2

27

vadas

The demo uses people and referral information as shown above.

The following command starts a MySQL container through Docker and writes data to MySQL. We assume that the IP address of the machine running the MySQL container is 172.31.19.123

docker volume rm mysql-data
docker volume create mysql-data
docker run -p 3306:3306 -itd --name mysql-server -v mysql-data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=mysql --restart=always mysql:8.0.33

After waiting for the MySQL container to start, connect through MySQL client。

# username = root
# passwd = mysql
mysql -h 127.0.0.1 -P3306 -uroot -p

Now we can create a database and data table, and then write the data to MySQL.

drop database if exists graph;
create database if not exists graph;
use graph;
create table person
(
    ID   varchar(128) not null,
    age  int,
    name varchar(128),
    PRIMARY KEY (ID)
);
insert into person values ('v1', 29, 'marko'), ('v2', 27, 'vadas');
create table referral
(
    refId    varchar(128) not null,
    `source` varchar(128),
    referred varchar(128),
    weight   double,
    PRIMARY KEY (refId)
);
insert into referral values ('e1', 'v1', 'v2', 0.5);

Upload the schema

Now the data are ready in MySQL. We need a PuppyGraph schema before querying it. Let's create a schema file mysql.json:

mysql.json
{
  "catalogs": [
    {
      "name": "jdbc_mysql",
      "type": "mysql",
      "jdbc": {
        "username": "root",
        "password": "mysql",
        "jdbcUri": "jdbc:mysql://172.17.0.1:3306",
        "driverClass": "com.mysql.cj.jdbc.Driver",
        "driverUrl": "https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar"
      }
    }
  ],
  "vertices": [
    {
      "label": "person",
      "mappedTableSource": {
        "catalog": "jdbc_mysql",
        "schema": "graph",
        "table": "person",
        "metaFields": {
          "id": "id"
        }
      },
      "attributes": [
        {
          "type": "Int",
          "name": "age"
        },
        {
          "type": "String",
          "name": "name"
        }
      ]
    }
  ],
  "edges": [
    {
      "label": "knows",
      "mappedTableSource": {
        "catalog": "jdbc_mysql",
        "schema": "graph",
        "table": "referral",
        "metaFields": {
          "id": "refId",
          "from": "source",
          "to": "referred"
        }
      },
      "from": "person",
      "to": "person",
      "attributes": [
        {
          "type": "Double",
          "name": "weight"
        }
      ]
    }
  ]
}

Here are some notes on this schema:

  1. A catalog jdbc_mysql is added to specify the remote data source in MySQL.

    • Set type to mysql.

    • driverClass: The class name of the JDBC driver.

      • com.mysql.jdbc.Driver (MySQL v5.x and earlier)

      • com.mysql.cj.jdbc.Driver (MySQL v6.x and later)

    • driverUrl: Please provide a URL where PuppyGraph can find the MySQL driver. Note the IP address is resolved against the network PuppyGraph container, if you have launched PuppyGraph using Docker in the same host. In this case you might want to use 172.17.0.1 or host.docker.internal.

  2. The label of the vertices and edges do not have to be the same as the names of corresponding tables in MySQL. There is a mappedTableSource field in each of the vertex and edge types specifying the actual schema (graph) and table (referral).

  3. Additionally, the mappedTableSource marks meta columns in the tables. For example, the fieldsfrom and to describe which columns in the table form the endpoints of edges.

Now we can upload the schema file mysql.json to PuppyGraph with the following shell command, assuming that the PuppyGraph is running on localhost:

curl -XPOST -H "content-type: application/json" --data-binary @./mysql.json --user "puppygraph:puppygraph123" localhost:8081/schema

Query the data

Connecting to PuppyGraph at http://localhost:8081 and start gremlin console from the "Query" section:

[PuppyGraph]> console
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph

Now we have connected to the Gremlin Console. We can query the graph:

gremlin> g.V().hasLabel("person").out("knows").values("name")
==>vadas

Last updated