Querying MySQL Data as a Graph

A step-by-step tutorial on using PuppyGraph to query data in MySQL

Summary

In this tutorial, you will:

  • Create a MySQL database and load it with example data;

  • Start a PuppyGraph Docker container and query the MySQL data as a graph.

Prerequisites

Please ensure that docker compose is available. The installation can be verified by running:

docker compose version

See https://docs.docker.com/compose/install/ for Docker Compose installation instructions and https://www.docker.com/get-started/ for more details on Docker.

Accessing the PuppyGraph Web UI requires a browser. However, the tutorial offers alternative instructions for those who wish to exclusively use the CLI.

Deployment

▶️ Create a file docker-compose.yaml with the following content:

docker-compose.yaml
version: "3"

services:
  puppygraph:
    image: puppygraph/puppygraph:stable
    pull_policy: always
    container_name: puppygraph
    environment:
      - PUPPYGRAPH_USERNAME=puppygraph
      - PUPPYGRAPH_PASSWORD=puppygraph123
    networks:
      mysql_net:
    ports:
      - "8081:8081"
      - "8182:8182"
      - "7687:7687"
  mysql:
    image: mysql:8.0.33
    container_name: mysql-server
    environment:
      - MYSQL_ROOT_PASSWORD=mysql123
      - MYSQL_USER=mysqluser
      - MYSQL_PASSWORD=mysqlpassword
      - MYSQL_DATABASE=mydatabase
    networks:
      - mysql_net
    ports:
      - "3306:3306"
networks:
  mysql_net:
    name: puppy-mysql

⚠️ Warning: Change the password environment variables before starting the services, particularly if your machine is publicly accessible.

▶️ Then run the following command to start MySQL and PuppyGraph:

docker compose up -d
[+] Running 3/3
 ✔ Network puppy-mysql     Created                                                                                                 0.1s 
 ✔ Container mysql-server  Started                                                                                                 3.7s 
 ✔ Container puppygraph    Started                                                                                                 3.7s 
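
Docker reports a container as Started once its process launches, so MySQL may need a few more seconds before it accepts connections. The following is a minimal sketch, not part of PuppyGraph or MySQL, of a helper that polls the published port 3306 from the compose file above. The function name wait_for_port and its parameters are hypothetical, and an open port is only a rough readiness signal.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("localhost", 3306) could be called before opening the MySQL shell in the next section.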

Data Preparation

This tutorial is designed to be comprehensive and standalone, so it includes steps to populate data in MySQL. In practical scenarios, PuppyGraph can query data directly from your existing MySQL databases.

▶️ Run the following command to start a MySQL shell as the root user:

docker exec -it mysql-server mysql -uroot -p

It will show a password prompt:

Enter password:

▶️ Enter the root password of mysql-server (default: mysql123) to access the MySQL client shell.

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.33 MySQL Community Server - GPL

Copyright (c) 2000, 2023, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 

▶️ Then execute the following SQL statements in the shell to create tables and insert data.

create schema modern;
create table modern.person (id text, name text, age integer);
insert into modern.person values
                              ('v1', 'marko', 29),
                              ('v2', 'vadas', 27),
                              ('v4', 'josh', 32),
                              ('v6', 'peter', 35);

create table modern.software (id text, name text, lang text);
insert into modern.software values
                                ('v3', 'lop', 'java'),
                                ('v5', 'ripple', 'java');
                                
create table modern.created (id text, from_id text, to_id text, weight double precision);
insert into modern.created values
                               ('e9', 'v1', 'v3', 0.4),
                               ('e10', 'v4', 'v5', 1.0),
                               ('e11', 'v4', 'v3', 0.4),
                               ('e12', 'v6', 'v3', 0.2);

create table modern.knows (id text, from_id text, to_id text, weight double precision);
insert into modern.knows values
                             ('e7', 'v1', 'v2', 0.5),
                             ('e8', 'v1', 'v4', 1.0);

The above SQL creates four tables in the modern schema: person, software, created, and knows. For example, the person table contains:

id   name   age
v1   marko  29
v2   vadas  27
v4   josh   32
v6   peter  35

Modeling a Graph

Next, define a graph on top of the data tables just created. This is the "Modern" graph defined by Apache TinkerPop.

A schema tells PuppyGraph how to map data from MySQL into a graph. PuppyGraph offers various methods for schema creation. For this tutorial, we've already prepared a schema to save time.

▶️ Create a PuppyGraph schema file schema.json with the following content:

schema.json
{
  "catalogs": [
    {
      "name": "mysql_data",
      "type": "mysql",
      "jdbc": {
        "username": "root",
        "password": "mysql123",
        "jdbcUri": "jdbc:mysql://mysql-server:3306",
        "driverClass": "com.mysql.cj.jdbc.Driver",
        "driverUrl": "https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar"
      }
    }
  ],
  "vertices": [
    {
      "label": "person",
      "mappedTableSource": {
        "catalog": "mysql_data",
        "schema": "modern",
        "table": "person",
        "metaFields": {
          "id": "id"
        }
      },
      "attributes": [
        {
          "type": "Int",
          "name": "age"
        },
        {
          "type": "String",
          "name": "name"
        }
      ]
    },
    {
      "label": "software",
      "mappedTableSource": {
        "catalog": "mysql_data",
        "schema": "modern",
        "table": "software",
        "metaFields": {
          "id": "id"
        }
      },
      "attributes": [
        {
          "type": "String",
          "name": "lang"
        },
        {
          "type": "String",
          "name": "name"
        }
      ]
    }
  ],
  "edges": [
    {
      "label": "knows",
      "mappedTableSource": {
        "catalog": "mysql_data",
        "schema": "modern",
        "table": "knows",
        "metaFields": {
          "id": "id",
          "from": "from_id",
          "to": "to_id"
        }
      },
      "from": "person",
      "to": "person",
      "attributes": [
        {
          "type": "Double",
          "name": "weight"
        }
      ]
    },
    {
      "label": "created",
      "mappedTableSource": {
        "catalog": "mysql_data",
        "schema": "modern",
        "table": "created",
        "metaFields": {
          "id": "id",
          "from": "from_id",
          "to": "to_id"
        }
      },
      "from": "person",
      "to": "software",
      "attributes": [
        {
          "type": "Double",
          "name": "weight"
        }
      ]
    }
  ]
}

▶️ If you changed the MySQL root password, replace the password in schema.json with the new one (default root password: mysql123).

▶️ Log into the PuppyGraph Web UI at http://localhost:8081 with the PuppyGraph username and password (default username: puppygraph, default password: puppygraph123).
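
Before uploading, it can help to check that the schema is internally consistent. The sketch below is a hypothetical helper (check_schema is not a PuppyGraph API) that relies only on the field names visible in schema.json above. It verifies that every vertex and edge refers to a declared catalog, and that each edge's "from" and "to" name a declared vertex label. PuppyGraph performs its own validation on upload, so this is only a quick local sanity check.

```python
import json


def check_schema(schema):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    catalogs = {c["name"] for c in schema.get("catalogs", [])}
    vertex_labels = {v["label"] for v in schema.get("vertices", [])}

    # Every mapped table must point at a catalog declared in "catalogs".
    for kind in ("vertices", "edges"):
        for item in schema.get(kind, []):
            catalog = item["mappedTableSource"]["catalog"]
            if catalog not in catalogs:
                problems.append(f"{item['label']}: unknown catalog {catalog!r}")

    # Every edge endpoint must name a vertex label declared in "vertices".
    for edge in schema.get("edges", []):
        for end in ("from", "to"):
            if edge[end] not in vertex_labels:
                problems.append(
                    f"{edge['label']}: {end} label {edge[end]!r} is not a vertex"
                )
    return problems


# A trimmed-down fragment of the tutorial schema, used here only as a demo.
demo = json.loads("""
{
  "catalogs": [{"name": "mysql_data"}],
  "vertices": [
    {"label": "person", "mappedTableSource": {"catalog": "mysql_data"}}
  ],
  "edges": [
    {"label": "created", "from": "person", "to": "software",
     "mappedTableSource": {"catalog": "mysql_data"}}
  ]
}
""")
print(check_schema(demo))
```

The demo fragment omits the software vertex on purpose, so the check reports one problem for the created edge. Running check_schema on the full schema.json loaded with json.load should return an empty list.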

▶️ Upload the schema by choosing the file schema.json in the Upload Graph Schema JSON block and clicking on Upload.

Once the schema is uploaded, the schema page shows a visualization of the graph schema.

Alternative: Schema Uploading via CLI

▶️ Alternatively, run the following command to upload the schema file:

curl -XPOST -H "content-type: application/json" --data-binary @./schema.json --user "puppygraph:puppygraph123" localhost:8081/schema

The response shows that the graph schema has been uploaded successfully:

{"Status":"OK","Message":"Schema uploaded and gremlin server restarted"}
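
For scripted setups, the same request can be sent from Python using only the standard library. This is a sketch under the assumption that the endpoint, credentials, and content type are exactly those in the curl command above. The function names build_upload_request and upload_schema are hypothetical.

```python
import base64
import urllib.request


def build_upload_request(schema_bytes,
                         url="http://localhost:8081/schema",
                         username="puppygraph",
                         password="puppygraph123"):
    """Build a POST request mirroring the curl command above."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=schema_bytes,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",  # equivalent of curl --user
        },
        method="POST",
    )


def upload_schema(path="schema.json"):
    """Send the schema file to PuppyGraph and return the response body as text."""
    with open(path, "rb") as f:
        request = build_upload_request(f.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode()
```

Calling upload_schema() from the directory that contains schema.json should return the same JSON status message that curl prints, provided the PuppyGraph container is running.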

Querying the Graph

In this tutorial, we will use the Gremlin and Cypher query languages to interact with the graph. Gremlin, developed by Apache TinkerPop, and Cypher, designed by Neo4j, are both powerful graph query languages. Prior knowledge of either Gremlin or Cypher is not necessary to follow this tutorial. To learn more about Gremlin, visit https://tinkerpop.apache.org/gremlin.html. For information on Cypher, see https://neo4j.com/product/cypher-graph-query-language/.

▶️ Click on the Query panel on the left side. The Gremlin Query tab offers an interactive environment for querying the graph using Gremlin.

Queries are entered on the left side, and the right side displays the graph visualization.

The first query retrieves the properties of the person named "marko".

▶️ Copy the following query, paste it in the query input, and click on the run button.

g.V().has("name", "marko").valueMap()

The output is plain text like the following:

Rows: 1
age              29
name             marko

Now let's also leverage the visualization. The next query gets all the software created by people known to "marko".

▶️ Copy the following query, paste it in the query input, and click on the run button.

g.V().has("name", "marko")
  .out("knows").out("created").path()

The result contains two paths, because "marko" knows "josh", who created both "lop" and "ripple". "marko" also knows "vadas", but "vadas" created no software and so contributes no path.
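
To see where the two paths come from, the same two-hop traversal can be written as SQL joins over the four tables. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs without a server), drops the modern schema prefix, and returns names rather than full vertices.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables of this tutorial.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table person (id text, name text, age integer);
insert into person values
  ('v1', 'marko', 29), ('v2', 'vadas', 27), ('v4', 'josh', 32), ('v6', 'peter', 35);
create table software (id text, name text, lang text);
insert into software values ('v3', 'lop', 'java'), ('v5', 'ripple', 'java');
create table created (id text, from_id text, to_id text, weight double precision);
insert into created values
  ('e9', 'v1', 'v3', 0.4), ('e10', 'v4', 'v5', 1.0),
  ('e11', 'v4', 'v3', 0.4), ('e12', 'v6', 'v3', 0.2);
create table knows (id text, from_id text, to_id text, weight double precision);
insert into knows values ('e7', 'v1', 'v2', 0.5), ('e8', 'v1', 'v4', 1.0);
""")

# person -knows-> person -created-> software, starting from "marko".
paths = conn.execute("""
select p1.name, p2.name, s.name
from person p1
join knows k    on k.from_id = p1.id
join person p2  on p2.id = k.to_id
join created c  on c.from_id = p2.id
join software s on s.id = c.to_id
where p1.name = 'marko'
order by s.name
""").fetchall()

for path in paths:
    print(" -> ".join(path))
# prints:
# marko -> josh -> lop
# marko -> josh -> ripple
```

Each out() step in the Gremlin query corresponds to one edge-table join plus one vertex-table join here, which is why the graph form is considerably shorter.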

Alternative: Querying the graph via CLI

Alternatively, we can query the graph via CLI.

▶️ Execute the following command to access the PuppyGraph CLI.

docker exec -it puppygraph ./bin/puppygraph

The welcome screen appears as follows:

  ____                                     ____                          _
 |  _ \   _   _   _ __    _ __    _   _   / ___|  _ __    __ _   _ __   | |__
 | |_) | | | | | | '_ \  | '_ \  | | | | | |  _  | '__|  / _` | | '_ \  | '_ \
 |  __/  | |_| | | |_) | | |_) | | |_| | | |_| | | |    | (_| | | |_) | | | | |
 |_|      \__,_| | .__/  | .__/   \__, |  \____| |_|     \__,_| | .__/  |_| |_|
                 |_|     |_|      |___/                         |_|
Welcome to PuppyGraph, type help to see the command list
[PuppyGraph]> 
               console         access PuppyGraph Gremlin Console             
               cypher-console  access PuppyGraph Cypher Console              
               groovy          access console to run complex groovy scripts  
               exit            exit PuppyGraph                               
               help            show the command list                         

▶️ Access the PuppyGraph Gremlin Console through the console command.

[PuppyGraph]> console
  ____                                     ____                          _
 |  _ \   _   _   _ __    _ __    _   _   / ___|  _ __    __ _   _ __   | |__
 | |_) | | | | | | '_ \  | '_ \  | | | | | |  _  | '__|  / _` | | '_ \  | '_ \
 |  __/  | |_| | | |_) | | |_) | | |_| | | |_| | | |    | (_| | | |_) | | | | |
 |_|      \__,_| | .__/  | .__/   \__, |  \____| |_|     \__,_| | .__/  |_| |_|
                 |_|     |_|      |___/                         |_|
Welcome to PuppyGraph!
version: 0.11

To Learn more about the graph schema:
- Use graph.show() to list all the vertex and edge labels.
- Use graph.show('$FOO') to list all the vertex and edge labels related to $FOO.
- Use graph.describe('$BAR') to list all the attributes of the label $BAR.

See https://tinkerpop.apache.org/gremlin.html to learn more about the Gremlin query language.
Here are some example queries for exploring the graph:
- Use g.V() to list all the vertices.
- Use g.E() to list all the edges.
- Use g.V().count() to get the total number of vertices.
- Use g.E().count() to get the total number of edges.
- Use g.V('$ID').out() to find out vertices that are reachable in 1-hop from the vertex $ID. For example, g.V('person:::v1').out() will find out 1-hop reachable vertices from 'person:::v1'.
- Use g.V('$ID').out().out() similarly to find out 2-hop reachable vertices from the vertex $ID.

puppy-gremlin> 

▶️ Run the following query in the console to query the graph.

g.V().has("name", "marko").valueMap()

Properties of the person named "marko":

puppy-gremlin> g.V().has("name", "marko").valueMap()
Done! Elapsed time: 0.048s, rows: 1
==>map[age:29 name:marko]

▶️ Exit the Gremlin console with :x, then open the Cypher console with the cypher-console command.

puppy-gremlin> :x
Bye!               
[PuppyGraph]> cypher-console

▶️ Prefix each Cypher query with :>.

puppy-cypher> :> MATCH (v) RETURN count(*)
==>[count(*):6]
puppy-cypher> :> MATCH (v) RETURN v
==>[v:[_type:node,name:peter,_id:person[v6],_label:person,age:35]]
==>[v:[_type:node,name:vadas,_id:person[v2],_label:person,age:27]]
==>[v:[_type:node,name:josh,_id:person[v4],_label:person,age:32]]
==>[v:[_type:node,name:marko,_id:person[v1],_label:person,age:29]]
==>[v:[_type:node,name:ripple,_id:software[v5],lang:java,_label:software]]
==>[v:[_type:node,name:lop,_id:software[v3],lang:java,_label:software]]

▶️ Exit the Cypher console with :x, then exit the CLI with the exit command.

puppy-cypher> :x
[PuppyGraph]> exit

Cleaning up

▶️ Run the following command to shut down and remove the services:

docker compose down
[+] Running 3/3
 ✔ Container mysql-server  Removed                                                                                                 3.6s 
 ✔ Container puppygraph    Removed                                                                                                10.5s 
 ✔ Network puppy-mysql     Removed                                                                                                 0.2s 
