Querying MongoDB Atlas Data as a Graph

Summary

In this tutorial, you will:

  • Create a MongoDB Atlas database and load it with example data;
  • Start a PuppyGraph Docker container and query the MongoDB Atlas data as a graph.

Prerequisites

  • Docker is installed and running;
  • You have a MongoDB Atlas account with at least one MongoDB cluster;
  • A web browser is available to access the PuppyGraph Web UI.

Deployment

▶ Run the following command to start PuppyGraph:

docker run -p 8081:8081 -p 8182:8182 -p 7687:7687 -d --name puppy --rm --pull=always puppygraph/puppygraph:stable

Data Preparation

This tutorial is designed to be comprehensive and standalone, so it includes steps to populate data in MongoDB. In practical scenarios, PuppyGraph can query data directly from the existing collections in your MongoDB deployment.

▶ Connect to the database in MongoDB with a tool such as MongoDB Compass. This tutorial uses the MongoDB shell to populate the data.

▶ Create the data for the demo graph. We strongly recommend specifying JSON schema validation to avoid errors.

use modern
db.createCollection("person", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "id", "name", "age" ],
         properties: {
            id: { bsonType: "string" },
            name: { bsonType: "string" },
            age: { bsonType: "int"}
         }
      }
   }
})
db.person.insertMany([
  {id: 'v1', name: 'marko', age: 29},
  {id: 'v2', name: 'vadas', age: 27},
  {id: 'v4', name: 'josh', age: 32},
  {id: 'v6', name: 'peter', age: 35}
])
db.createCollection("software", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "id", "name", "lang" ],
         properties: {
            id: { bsonType: "string" },
            name: { bsonType: "string" },
            lang: { bsonType: "string" }
         }
      }
   }
})
db.software.insertMany([
  {id: 'v3', name: 'lop', lang: 'java'},
  {id: 'v5', name: 'ripple', lang: 'java'}
])
db.createCollection("created", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "id", "from_id", "to_id", "weight" ],
         properties: {
            id: { bsonType: "string" },
            from_id: { bsonType: "string" },
            to_id: { bsonType: "string" },
            weight: { bsonType: "double" }
         }
      }
   }
})
db.created.insertMany([
  {id: 'e9', from_id: 'v1', to_id: 'v3', weight: 0.4},
  {id: 'e10', from_id: 'v4', to_id: 'v5', weight: Double(1.1)},
  {id: 'e11', from_id: 'v4', to_id: 'v3', weight: 0.4},
  {id: 'e12', from_id: 'v6', to_id: 'v3', weight: 0.2}
])
db.createCollection("knows", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "id", "from_id", "to_id", "weight" ],
         properties: {
            id: { bsonType: "string" },
            from_id: { bsonType: "string" },
            to_id: { bsonType: "string" },
            weight: { bsonType: "double" }
         }
      }
   }
})
db.knows.insertMany([
  {id: 'e7', from_id: 'v1', to_id: 'v2', weight: 0.5},
  {id: 'e8', from_id: 'v1', to_id: 'v4', weight: Double(1.1)}
])

The commands above create the following collections in the modern database:

person

id  name   age
v1  marko  29
v2  vadas  27
v4  josh   32
v6  peter  35

software

id  name    lang
v3  lop     java
v5  ripple  java

knows

id  from_id  to_id  weight
e7  v1       v2     0.5
e8  v1       v4     1.1

created

id   from_id  to_id  weight
e9   v1       v3     0.4
e10  v4       v5     1.1
e11  v4       v3     0.4
e12  v6       v3     0.2
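
Every edge row points at its endpoints through from_id and to_id, so it can be worth confirming that each reference resolves to an existing vertex row before the graph is modeled. The following is a minimal sketch in plain JavaScript, with the sample rows from this tutorial copied into arrays. The helper name findDanglingEdges is hypothetical and is not part of MongoDB or PuppyGraph.

```javascript
// Sample rows copied from the collections created in this tutorial.
const person = [
  { id: "v1", name: "marko", age: 29 },
  { id: "v2", name: "vadas", age: 27 },
  { id: "v4", name: "josh", age: 32 },
  { id: "v6", name: "peter", age: 35 },
];
const software = [
  { id: "v3", name: "lop", lang: "java" },
  { id: "v5", name: "ripple", lang: "java" },
];
const knows = [
  { id: "e7", from_id: "v1", to_id: "v2", weight: 0.5 },
  { id: "e8", from_id: "v1", to_id: "v4", weight: 1.1 },
];
const created = [
  { id: "e9", from_id: "v1", to_id: "v3", weight: 0.4 },
  { id: "e10", from_id: "v4", to_id: "v5", weight: 1.1 },
  { id: "e11", from_id: "v4", to_id: "v3", weight: 0.4 },
  { id: "e12", from_id: "v6", to_id: "v3", weight: 0.2 },
];

// Hypothetical helper: returns the ids of edges whose source or target
// id does not appear in the corresponding vertex rows.
function findDanglingEdges(edges, fromRows, toRows) {
  const fromIds = new Set(fromRows.map((row) => row.id));
  const toIds = new Set(toRows.map((row) => row.id));
  return edges
    .filter((edge) => !fromIds.has(edge.from_id) || !toIds.has(edge.to_id))
    .map((edge) => edge.id);
}

console.log(findDanglingEdges(knows, person, person));     // prints []
console.log(findDanglingEdges(created, person, software)); // prints []
```

An empty array for both edge collections means every from_id and to_id in the sample data matches a vertex row.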

Create an Atlas SQL Connection and Get the JDBC URL

PuppyGraph uses a JDBC driver to connect to MongoDB Atlas, so you need to create an Atlas SQL connection on the MongoDB Atlas website and obtain its JDBC URL.

Connecting to the cluster using Atlas SQL

Creating the connection instance

Getting the JDBC URL
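
Before pasting the copied URL into the schema, a loose sanity check can catch the common mistake of copying a regular MongoDB connection string instead. The sketch below rests on an assumption that is not stated in this tutorial: that Atlas SQL JDBC URLs begin with jdbc:mongodb://. The helper name looksLikeAtlasSqlJdbcUri and the host example-host.invalid are hypothetical placeholders.

```javascript
// Hypothetical helper: a loose format check on the copied JDBC URL.
// Assumption: Atlas SQL JDBC URLs start with "jdbc:mongodb://".
function looksLikeAtlasSqlJdbcUri(uri) {
  const prefix = "jdbc:mongodb://";
  if (typeof uri !== "string" || !uri.startsWith(prefix)) {
    return false;
  }
  // Require a non-empty host part between the prefix and the first "/" or "?".
  const host = uri.slice(prefix.length).split(/[/?]/)[0];
  return host.length > 0;
}

// "example-host.invalid" is a placeholder, not a real Atlas host.
console.log(looksLikeAtlasSqlJdbcUri("jdbc:mongodb://example-host.invalid/modern")); // prints true
console.log(looksLikeAtlasSqlJdbcUri("mongodb+srv://example-host.invalid/modern"));  // prints false
```

The check only looks at the shape of the string. It does not contact Atlas or verify credentials.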

Modeling a Graph

We then define a graph on top of the collections we just created. This is the "Modern" graph defined by Apache TinkerPop.

Modern Graph

A schema tells PuppyGraph how to map the data in MongoDB into a graph. PuppyGraph offers several ways to create a schema. For this tutorial, a schema has already been prepared to save time.

▶ Create a PuppyGraph schema file schema.json with the following content:

schema.json
{
  "catalogs": [
    {
      "name": "mongodb_data",
      "type": "mongodb",
      "jdbc": {
        "username": "[username]",
        "password": "[password]",
        "jdbcUri": "[jdbcUri]",
        "driverClass": "com.mongodb.jdbc.MongoDriver"
      }
    }
  ],
  "graph": {
    "vertices": [
      {
        "label": "person",
        "oneToOne": {
          "tableSource": {
            "catalog": "mongodb_data",
            "schema": "modern",
            "table": "person"
          },
          "id": {
            "fields": [
              {
                "type": "String",
                "field": "id",
                "alias": "id"
              }
            ]
          },
          "attributes": [
            {
              "type": "Long",
              "field": "age",
              "alias": "age"
            },
            {
              "type": "String",
              "field": "name",
              "alias": "name"
            }
          ]
        }
      },
      {
        "label": "software",
        "oneToOne": {
          "tableSource": {
            "catalog": "mongodb_data",
            "schema": "modern",
            "table": "software"
          },
          "id": {
            "fields": [
              {
                "type": "String",
                "field": "id",
                "alias": "id"
              }
            ]
          },
          "attributes": [
            {
              "type": "String",
              "field": "lang",
              "alias": "lang"
            },
            {
              "type": "String",
              "field": "name",
              "alias": "name"
            }
          ]
        }
      }
    ],
    "edges": [
      {
        "label": "knows",
        "fromVertex": "person",
        "toVertex": "person",
        "tableSource": {
          "catalog": "mongodb_data",
          "schema": "modern",
          "table": "knows"
        },
        "id": {
          "fields": [
            {
              "type": "String",
              "field": "id",
              "alias": "id"
            }
          ]
        },
        "fromId": {
          "fields": [
            {
              "type": "String",
              "field": "from_id",
              "alias": "from_id"
            }
          ]
        },
        "toId": {
          "fields": [
            {
              "type": "String",
              "field": "to_id",
              "alias": "to_id"
            }
          ]
        },
        "attributes": [
          {
            "type": "Double",
            "field": "weight",
            "alias": "weight"
          }
        ]
      },
      {
        "label": "created",
        "fromVertex": "person",
        "toVertex": "software",
        "tableSource": {
          "catalog": "mongodb_data",
          "schema": "modern",
          "table": "created"
        },
        "id": {
          "fields": [
            {
              "type": "String",
              "field": "id",
              "alias": "id"
            }
          ]
        },
        "fromId": {
          "fields": [
            {
              "type": "String",
              "field": "from_id",
              "alias": "from_id"
            }
          ]
        },
        "toId": {
          "fields": [
            {
              "type": "String",
              "field": "to_id",
              "alias": "to_id"
            }
          ]
        },
        "attributes": [
          {
            "type": "Double",
            "field": "weight",
            "alias": "weight"
          }
        ]
      }
    ]
  }
}

Replace the [username], [password], and [jdbcUri] placeholders with your account username, password, and JDBC URL.
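
To keep credentials out of the file you edit by hand, the placeholders can also be filled in programmatically. The following is a minimal sketch for Node.js. The helper name fillJdbcCredentials and the environment variable names ATLAS_USERNAME, ATLAS_PASSWORD, and ATLAS_JDBC_URI are assumptions made for this example and are not defined by PuppyGraph or MongoDB.

```javascript
// Hypothetical helper: returns a copy of the schema in which the JDBC
// settings of every catalog carry the given credentials.
// The input object is left untouched.
function fillJdbcCredentials(schema, { username, password, jdbcUri }) {
  return {
    ...schema,
    catalogs: schema.catalogs.map((catalog) => ({
      ...catalog,
      jdbc: { ...catalog.jdbc, username, password, jdbcUri },
    })),
  };
}

// Trimmed-down version of schema.json, enough to show the substitution.
const template = {
  catalogs: [
    {
      name: "mongodb_data",
      type: "mongodb",
      jdbc: {
        username: "[username]",
        password: "[password]",
        jdbcUri: "[jdbcUri]",
        driverClass: "com.mongodb.jdbc.MongoDriver",
      },
    },
  ],
};

// The environment variable names are assumptions made for this sketch.
// When a variable is unset, the placeholder is kept as-is.
const filled = fillJdbcCredentials(template, {
  username: process.env.ATLAS_USERNAME ?? "[username]",
  password: process.env.ATLAS_PASSWORD ?? "[password]",
  jdbcUri: process.env.ATLAS_JDBC_URI ?? "[jdbcUri]",
});

// Log only a non-secret field to confirm the rest of the catalog survived.
console.log(filled.catalogs[0].jdbc.driverClass); // prints com.mongodb.jdbc.MongoDriver
```

In a real script the template would be the full content of schema.json, and the result could be serialized with JSON.stringify(filled, null, 2) and written back to disk.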

▶ Log into PuppyGraph Web UI at http://localhost:8081 with username puppygraph and password puppygraph123.

PuppyGraph Login

▶ Upload the schema by selecting the file schema.json in the Upload Graph Schema JSON block and clicking on Upload.

Upload Schema Page

Once the schema is uploaded, the schema page shows the visualized graph schema as follows.

Visualized Schema
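
The same overview can be derived as text from the "graph" section of schema.json, which is handy when the Web UI is not at hand. The sketch below is a plain JavaScript illustration. The helper name summarizeGraph is hypothetical, and the object passed to it contains only the fields the helper reads.

```javascript
// Hypothetical helper: lists each edge as "from -[label]-> to" and reports
// the labels of edges whose endpoints are not declared as vertex labels.
function summarizeGraph(graph) {
  const labels = new Set(graph.vertices.map((vertex) => vertex.label));
  const edges = graph.edges.map(
    (edge) => `${edge.fromVertex} -[${edge.label}]-> ${edge.toVertex}`
  );
  const undeclared = graph.edges
    .filter((edge) => !labels.has(edge.fromVertex) || !labels.has(edge.toVertex))
    .map((edge) => edge.label);
  return { vertices: [...labels], edges, undeclared };
}

// Only the fields the helper reads, copied from the "graph" section of schema.json.
const graph = {
  vertices: [{ label: "person" }, { label: "software" }],
  edges: [
    { label: "knows", fromVertex: "person", toVertex: "person" },
    { label: "created", fromVertex: "person", toVertex: "software" },
  ],
};

const summary = summarizeGraph(graph);
console.log(summary.edges.join("\n"));
// prints:
// person -[knows]-> person
// person -[created]-> software
```

An empty undeclared list indicates that every fromVertex and toVertex in the schema refers to a vertex label that is actually defined.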

Alternative: Schema Uploading via CLI

▶ Alternatively, run the following command to upload the schema file:

curl -XPOST -H "content-type: application/json" --data-binary @./schema.json --user "puppygraph:puppygraph123" localhost:8081/schema

The response shows that the graph schema has been uploaded successfully:

{"Status":"OK","Message":"Schema uploaded and gremlin server restarted"}
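
When the upload is scripted, the response body can be checked instead of read by eye. The following is a minimal sketch in JavaScript. The helper name schemaUploadSucceeded is hypothetical, and treating any body other than one with "Status":"OK" as a failure is an assumption, since this tutorial only shows the success response.

```javascript
// Hypothetical helper: interprets the body returned by the schema upload.
// Assumption: anything other than a JSON object with Status "OK" is a failure.
function schemaUploadSucceeded(responseText) {
  try {
    return JSON.parse(responseText).Status === "OK";
  } catch {
    // The body was not valid JSON, or did not parse to an object.
    return false;
  }
}

const sample =
  '{"Status":"OK","Message":"Schema uploaded and gremlin server restarted"}';
console.log(schemaUploadSucceeded(sample));      // prints true
console.log(schemaUploadSucceeded("not json"));  // prints false
```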

Querying the Graph

In this tutorial we use the Gremlin query language to query the graph. Gremlin is a graph query language developed by Apache TinkerPop. Prior knowledge of Gremlin is not necessary to follow the tutorial. To learn more about it, visit https://tinkerpop.apache.org/gremlin.html.

▶ Click on the Query panel on the left side. The Gremlin Query tab offers an interactive environment for querying the graph using Gremlin.

Interactive Gremlin Query Page

Queries are entered on the left side, and the right side displays the graph visualization.

The first query retrieves the properties of the person named "marko".

▶ Copy the following query, paste it in the query input, and click on the run button.

g.V().has("name", "marko").valueMap()

The output is plain text like the following:

Rows: 1
age              29
name             marko

Now let's also leverage the visualization. The next query gets all the software created by people known to "marko".

▶ Copy the following query, paste it in the query input, and click on the run button.

g.V().has("name", "marko")
  .out("knows").out("created").path()

The output is as follows. There are two paths in the result as "marko" knows "josh" who created "lop" and "ripple".

Interactive Query with Results
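
To see why the traversal yields exactly two paths, the two out() steps can be replayed by hand over the sample rows. The sketch below is plain JavaScript and only imitates the traversal for illustration. It is not how PuppyGraph or Gremlin evaluates queries, the helper name out is chosen to echo the Gremlin step, and vertex names stand in for the vertex objects that path() actually returns.

```javascript
// Vertex ids mapped to names, and edge endpoints, copied from the sample data.
const vertices = {
  v1: "marko", v2: "vadas", v3: "lop", v4: "josh", v5: "ripple", v6: "peter",
};
const knows = [
  { from_id: "v1", to_id: "v2" },
  { from_id: "v1", to_id: "v4" },
];
const created = [
  { from_id: "v1", to_id: "v3" },
  { from_id: "v4", to_id: "v5" },
  { from_id: "v4", to_id: "v3" },
  { from_id: "v6", to_id: "v3" },
];

// Extends every path by one hop along the given edge list, which imitates
// a single out(label) step. Paths with no matching edge are dropped.
function out(paths, edges) {
  return paths.flatMap((path) =>
    edges
      .filter((edge) => edge.from_id === path[path.length - 1])
      .map((edge) => [...path, edge.to_id])
  );
}

// has("name", "marko"): start one path at every vertex named "marko".
const start = Object.keys(vertices)
  .filter((id) => vertices[id] === "marko")
  .map((id) => [id]);

// out("knows").out("created").path(), with ids replaced by names for display.
const paths = out(out(start, knows), created).map((path) =>
  path.map((id) => vertices[id])
);
console.log(paths);
// prints [ [ 'marko', 'josh', 'ripple' ], [ 'marko', 'josh', 'lop' ] ]
```

The path through "vadas" disappears at the second step because "vadas" has no outgoing created edge, which leaves only the two paths through "josh". The order of the two paths here follows the order of the sample rows and may differ from the order PuppyGraph returns.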

Alternative: Querying the graph via CLI

Alternatively, we can query the graph via CLI.

▶ Execute the following command to access the PuppyGraph Gremlin CLI:

docker exec -it puppy ./bin/console

The welcome screen appears as follows:

  ____                                     ____                          _
 |  _ \   _   _   _ __    _ __    _   _   / ___|  _ __    __ _   _ __   | |__
 | |_) | | | | | | '_ \  | '_ \  | | | | | |  _  | '__|  / _` | | '_ \  | '_ \
 |  __/  | |_| | | |_) | | |_) | | |_| | | |_| | | |    | (_| | | |_) | | | | |
 |_|      \__,_| | .__/  | .__/   \__, |  \____| |_|     \__,_| | .__/  |_| |_|
                 |_|     |_|      |___/                         |_|
Welcome to PuppyGraph!
version: 0.10

puppy-gremlin> 

▶ Run the following queries in the console to query the graph.

g.V().has("name", "marko").valueMap()

Properties of the person named "marko":

puppy-gremlin> g.V().has("name", "marko").valueMap()
Done! Elapsed time: 0.059s, rows: 1
==>map[age:29 name:marko]

g.V().has("name", "marko").out("knows").out("created").valueMap()

All the software created by the people known to "marko":

puppy-gremlin> g.V().has("name", "marko").out("knows").out("created").valueMap()
Done! Elapsed time: 0.042s, rows: 2
==>map[lang:java name:lop]
==>map[lang:java name:ripple]

▶ To exit the PuppyGraph Gremlin Console, enter the command:

:exit

Cleaning up

▶ Run the following command to stop the container. Because it was started with --rm, stopping it also removes it:

docker stop puppy