Migrating from v0

PuppyGraph 1.0 introduces a new schema format. The Web UI converts legacy 0.x schemas to v1 automatically on upload, but the new shape is also straightforward to write directly. This page covers what changed, how the auto-conversion works, and a field-level mapping you can apply by hand.

The original 0.x schema documentation is preserved under Archived 0.x docs for reference.

What changed

  • Data sources became first-class. Each node, edge, and local table now has a dataSourceGroup that can point at an external catalog table, a local cache, or a union of multiple tables. The v0 single-source mappedTableSource is gone.
  • Many-to-one is no longer a separate concept. v0 distinguished oneToOne from manyToOne node mapping; v1 expresses many-to-one as a unionDataSource with dedupKey[].
  • Local caching is now first-class. The v0 per-node cacheConfig and partitionConfig are gone. Cached data is defined explicitly under the schema's top-level localTable[] and pointed to by localDataSource.
  • metaFields is gone. v0 buried id, from, and to inside a metaFields object on the table source. v1 promotes them to top-level id[], fromKey[], and toKey[] arrays of typed columns.
  • Column shape is unified. v0 had MappedField {name, type, alias}, MappedId {fields[]}, and AttributeSchema {name, type} as separate types. v1 uses a single Column {name, type, auto_increment} everywhere.
  • Source-to-graph column mapping moved. v0 used alias on MappedField for renames. v1 uses an explicit mappedField[] array on each data source, with sourceFieldName and targetFieldName pairs.

Auto-conversion on upload

When you upload a v0 schema in the Web UI, PuppyGraph detects it automatically and offers two decisions:

  • Data source for nodes and edges: use external catalogs directly, or create local tables.
  • After upload (when local tables are chosen): start loading data immediately, or define the structure only and load later.

A progress dialog shows each conversion step. The result is a v1 schema you can download and inspect from the Schema page.

The after-upload choice is request-scoped. For v1 uploads, use the Web UI choice or the postUploadBehavior query parameter on /schema:

| Choice | API value | Effect |
| --- | --- | --- |
| Cache and switch | switch | Create local tables, start loading data, and switch graph elements to read from those local tables when the load completes. |
| Cache data only | load | Create local tables and start loading data, but keep graph elements reading from their external sources. |
| Do not cache data | none | Define the v1 structure only. Load local-table data later from Local Table Management or POST /ui-api/loadLocalTable. |
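As a rough illustration of the query parameter, the sketch below builds an upload URL for one of the three API values in the table. The helper name schema_upload_url is hypothetical, and the host and port are assumed from the curl examples elsewhere on this page; only the parameter name and its values come from the text above.

```python
from urllib.parse import urlencode

# API values taken from the table above.
POST_UPLOAD_BEHAVIORS = {"switch", "load", "none"}


def schema_upload_url(base_url: str, behavior: str) -> str:
    """Return the /schema upload URL with postUploadBehavior set.

    Hypothetical helper: it only assembles the URL and sends nothing.
    """
    if behavior not in POST_UPLOAD_BEHAVIORS:
        raise ValueError(f"unknown postUploadBehavior: {behavior!r}")
    query = urlencode({"postUploadBehavior": behavior})
    return f"{base_url.rstrip('/')}/schema?{query}"


print(schema_upload_url("http://localhost:8081", "load"))
```

Validating the value on the client side is a design choice rather than a requirement; it simply catches typos before the request leaves the script.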

The v0 environment variables DATAACCESS_DATA_CACHE_LOADONSCHEMAUPDATE and DATAACCESS_DATA_CACHE_FALLBACKTODIRECTLOAD do not control v1 uploads. In v1, the upload request controls whether data loading starts, and the active dataSourceGroup inside the schema controls whether a node or edge reads from an external source, a local table, or a union.

If you'd prefer to convert manually, or just want to understand what the auto-conversion produces, the rest of this page maps each v0 field to its v1 equivalent.

Field mapping

Node

| v0 | v1 |
| --- | --- |
| oneToOne.tableSource | dataSourceGroup.externalDataSource |
| manyToOne.sources[].source | dataSourceGroup.unionDataSource.input[] |
| oneToOne.id.fields[], manyToOne.sources[].id.fields[] | id[] |
| oneToOne.attributes[] | attribute[] |
| MappedField.alias | mappedField[].targetFieldName (with sourceFieldName set to the original name) |

Edge

| v0 | v1 |
| --- | --- |
| tableSource | dataSourceGroup.externalDataSource |
| tableSource.metaFields.id | id[] |
| fromVertex | fromNodeLabel |
| toVertex | toNodeLabel |
| fromId.fields[] (or metaFields.from) | fromKey[] |
| toId.fields[] (or metaFields.to) | toKey[] |
| attributes[] | attribute[] |

Local cache to local table

| v0 | v1 |
| --- | --- |
| Per-node cacheConfig.cacheStrategy | Drop the per-node config. Define a localTable and reference it via localDataSource. |
| Per-node partitionConfig.partitionColumns[] | localTable.partitionBy.partitionColumn[] |
| partitionTimeUnit, partitionInterval | Same field names on partitionColumn. |
| Cache state, managed implicitly per node | Managed explicitly on the local table. |

Side-by-side examples

Node

v0:

{
  "label": "Person",
  "oneToOne": {
    "tableSource": {
      "catalog": "postgres_data",
      "schema": "modern",
      "table": "person"
    },
    "id":         { "fields": [{ "name": "id", "type": "STRING" }] },
    "attributes": [
      { "name": "name", "type": "STRING" },
      { "name": "age",  "type": "INT" }
    ]
  }
}

v1:

{
  "label": "Person",
  "dataSourceGroup": {
    "externalDataSource": {
      "enabled": true,
      "catalog": "postgres_data",
      "schema": "modern",
      "table": "person"
    }
  },
  "id":        [{ "name": "id",   "type": "STRING" }],
  "attribute": [
    { "name": "name", "type": "STRING" },
    { "name": "age",  "type": "INT" }
  ]
}
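The node mapping can be sketched as a small function. This is a minimal illustration for the oneToOne case only, not the converter PuppyGraph ships: the function name convert_one_to_one_node is hypothetical, and manyToOne sources and alias renames are deliberately left out.

```python
def convert_one_to_one_node(v0_node: dict) -> dict:
    """Map a v0 oneToOne node onto the v1 shape shown above.

    Hypothetical helper; ignores manyToOne mappings and alias renames.
    """
    mapping = v0_node["oneToOne"]
    source = mapping["tableSource"]
    return {
        "label": v0_node["label"],
        "dataSourceGroup": {
            "externalDataSource": {
                "enabled": True,
                "catalog": source["catalog"],
                "schema": source["schema"],
                "table": source["table"],
            }
        },
        # v0 id.fields[] becomes the top-level id[] array.
        "id": [
            {"name": f["name"], "type": f["type"]}
            for f in mapping["id"]["fields"]
        ],
        # v0 attributes[] becomes attribute[].
        "attribute": [
            {"name": a["name"], "type": a["type"]}
            for a in mapping.get("attributes", [])
        ],
    }
```

Feeding the v0 Person node from this section into the function yields the v1 Person node shown next to it.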

Edge

v0:

{
  "label":      "KNOWS",
  "fromVertex": "Person",
  "toVertex":   "Person",
  "tableSource": {
    "catalog":    "postgres_data",
    "schema":     "modern",
    "table":      "knows",
    "metaFields": { "id": "id", "from": "from_id", "to": "to_id" }
  },
  "attributes": [{ "name": "weight", "type": "DOUBLE" }]
}

v1:

{
  "label":         "KNOWS",
  "fromNodeLabel": "Person",
  "toNodeLabel":   "Person",
  "dataSourceGroup": {
    "externalDataSource": {
      "enabled": true,
      "catalog": "postgres_data",
      "schema": "modern",
      "table": "knows"
    }
  },
  "id":        [{ "name": "id",      "type": "STRING" }],
  "fromKey":   [{ "name": "from_id", "type": "STRING" }],
  "toKey":     [{ "name": "to_id",   "type": "STRING" }],
  "attribute": [{ "name": "weight",  "type": "DOUBLE" }]
}
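The edge mapping follows the same pattern. The sketch below handles the metaFields form only, and it rests on one assumption worth calling out: v0 metaFields carries column names without types, so the key type has to be supplied by the caller (key_type here, defaulting to STRING as in the example). The function name convert_edge is hypothetical.

```python
def convert_edge(v0_edge: dict, key_type: str = "STRING") -> dict:
    """Map a v0 edge that uses metaFields onto the v1 shape shown above.

    Hypothetical helper. key_type is an assumption: v0 metaFields holds
    only column names, so the type must come from elsewhere.
    """
    source = v0_edge["tableSource"]
    meta = source["metaFields"]

    def column(name: str) -> dict:
        return {"name": name, "type": key_type}

    return {
        "label": v0_edge["label"],
        "fromNodeLabel": v0_edge["fromVertex"],
        "toNodeLabel": v0_edge["toVertex"],
        "dataSourceGroup": {
            "externalDataSource": {
                "enabled": True,
                "catalog": source["catalog"],
                "schema": source["schema"],
                "table": source["table"],
            }
        },
        # metaFields.id / from / to are promoted to typed column arrays.
        "id": [column(meta["id"])],
        "fromKey": [column(meta["from"])],
        "toKey": [column(meta["to"])],
        "attribute": [
            {"name": a["name"], "type": a["type"]}
            for a in v0_edge.get("attributes", [])
        ],
    }
```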

Local cache to local table

v0, per-node cache config:

{
  "label": "Event",
  "oneToOne": { "tableSource": { } },
  "cacheConfig":     { "cacheStrategy": "FULL" },
  "partitionConfig": {
    "partitionColumns": [
      { "partitionKey": "ts", "partitionTimeUnit": "DAY" }
    ]
  }
}

v1, explicit local table, referenced from the node:

"localTable": [
  {
    "name": "event_local",
    "dataSourceGroup": {
      "externalDataSource": {
        "enabled": true,
        "catalog": "postgres_data",
        "schema": "modern",
        "table": "event"
      }
    },
    "column": [
      { "name": "id", "type": "STRING" },
      { "name": "ts", "type": "DATETIME" }
    ],
    "partitionBy": {
      "partitionColumn": [
        { "column": "ts", "partitionTimeUnit": "DAY", "partitionInterval": 1 }
      ]
    }
  }
],
"node": [
  {
    "label": "Event",
    "dataSourceGroup": {
      "localDataSource": { "enabled": true, "localTableName": "event_local" }
    },
    "id": [{ "name": "id", "type": "STRING" }]
  }
]
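The partition part of this conversion can be sketched in a few lines. The rename from partitionKey to column is inferred from the two examples above rather than from a documented rule, and the default interval of 1 simply mirrors the v1 example; treat both as assumptions. The function name convert_partition_config is hypothetical.

```python
def convert_partition_config(partition_config: dict, default_interval: int = 1) -> dict:
    """Turn a v0 partitionConfig into a v1 partitionBy object.

    Hypothetical helper. default_interval is an assumption that mirrors
    the example above; it is used only when v0 did not set an interval.
    """
    columns = []
    for col in partition_config.get("partitionColumns", []):
        columns.append(
            {
                # Inferred rename: v0 partitionKey -> v1 column.
                "column": col["partitionKey"],
                "partitionTimeUnit": col["partitionTimeUnit"],
                "partitionInterval": col.get("partitionInterval", default_interval),
            }
        )
    return {"partitionColumn": columns}
```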

Converting via the REST API

If you'd rather convert without uploading, the controlplane exposes a separate endpoint that returns the converted v1 JSON without applying it. This is useful for batch-converting schema files in version control, or for scripts that want to inspect the v1 form before deciding to apply it.

| Field | Value |
| --- | --- |
| Method | POST |
| Endpoint | /ui-api/convertSchema |
| Auth | HTTP Basic, same credentials as /schema |
| Body | The v0 schema JSON, or the envelope form below |

Send the v0 schema JSON directly:

curl -XPOST -H "content-type: application/json" \
  --user "puppygraph:puppygraph123" \
  --data-binary @./v0-schema.json \
  http://localhost:8081/ui-api/convertSchema

Or wrap it to control the conversion options:

{
  "schemaJson": { "...": "v0 schema goes here" },
  "createLocalTable": true,
  "defaultDataSource": "external"
}

| Field | Description |
| --- | --- |
| schemaJson | The v0 schema. |
| createLocalTable | Whether to add localTable[] entries for each node and edge in the schema. Default true. |
| defaultDataSource | "external" or "local". Determines which data source group is enabled on each node and edge. Default "external". |

The response body is the converted v1 schema as JSON. The current schema in the controlplane is not modified. Apply the result by posting it to /schema when ready.

/schema is the v1 upload endpoint and expects v1 JSON; a v0 document posted directly returns 400 Bad Request. Legacy v0 schemas must be converted via /ui-api/convertSchema first, then uploaded as the resulting v1 JSON.
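For scripted batch conversion, the envelope request can be assembled with the Python standard library. The sketch below only builds the request object and sends nothing; the function name build_convert_request is hypothetical, and the host, port, and credentials are the ones used in the curl example above.

```python
import base64
import json
from urllib.request import Request


def build_convert_request(
    v0_schema: dict,
    base_url: str = "http://localhost:8081",
    user: str = "puppygraph",
    password: str = "puppygraph123",
    create_local_table: bool = True,
    default_data_source: str = "external",
) -> Request:
    """Build (but do not send) a POST to /ui-api/convertSchema."""
    if default_data_source not in ("external", "local"):
        raise ValueError("defaultDataSource must be 'external' or 'local'")
    envelope = {
        "schemaJson": v0_schema,
        "createLocalTable": create_local_table,
        "defaultDataSource": default_data_source,
    }
    # HTTP Basic auth: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        f"{base_url.rstrip('/')}/ui-api/convertSchema",
        data=json.dumps(envelope).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it; the response body is then the converted v1 schema, which can be written back to version control or posted to /schema.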

From local cache to local table management

In v0, the local copy of external data was managed as a per-view local cache keyed by node or edge label, and operated through the four /ui-api/*LocalCache* REST endpoints. In v1, the same concern is owned by local tables: standalone schema entities keyed by localTableName, with their own load-task history and partition state.

The four v0 endpoints still exist in v1 as route-compatible stubs so existing automation does not 404. Every call returns 200 with an empty body and performs no operation. v0 automation therefore does not fail loudly, but it does not work either: calls succeed silently and do nothing. Migrating a v0 integration requires updating both the request shape and the response parser for each endpoint, on top of mapping viewId to localTableName.

The rest of this section walks through each of the four endpoint pairs and focuses on the diff: the v0 request and response shapes a parser needs to stop expecting, plus the v1 changes that matter for the migration. The comprehensive v1 reference (full field tables, success/error responses, RBAC) lives in the Local Table Management section of Managing the Graph; use this page when migrating from v0 and that one when writing new clients.

What changed everywhere

A handful of patterns apply to all four endpoints, so it's easier to handle them once across the whole client than to repeat the work per call site.

The first is keying. v0 endpoints were keyed by viewId (a node or edge label). v1 endpoints are keyed by localTableName. A single v0 viewId can map to multiple v1 local tables, so a v0 call against viewIds=["a","b"] becomes one v1 call per local table. The v1 local-table names live in the uploaded v1 schema; if you are migrating an existing v0 schema, send it to POST /ui-api/convertSchema and read the names out of the converted document.

The second is the response envelope. v1 wraps most local-table-management responses in {"ok": <bool>, "errorMessage": <string>, ...payload} with camelCase field names. On the error path you get {"ok": false, "errorMessage": "<reason>"}. Statuses inside the payload are uppercase (PENDING|RUNNING|SUCCESS|FAILED|RETRYING|CANCELLED), and int64 fields serialize as JSON strings to avoid JavaScript precision loss. v0 was less consistent: some endpoints returned a bare JSON string, while others used a {success, error} envelope. Those two keys are renamed to ok and errorMessage in v1. The one exception is GET /ui-api/localTableStatus, which returns an unwrapped per-table status map; treat it as a special case in client code.

The third is the HTTP status convention. v0 endpoints frequently returned 200 even on logical failure and signaled the failure inside the body. For example, dropLocalCachePartition returned 200 with {"success": false, "error": "..."}, and refreshLocalCache returned 200 with a bare apology string. v1 follows the standard convention: 200 only on success, 400 for client errors (validation failures, unknown tables, partition mismatches), and 500 for transport failures. A v0 client that branched on HTTP status alone was already missing those in-body failures; one that always parsed the body will need its envelope keys updated.
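One way to handle the envelope and the status convention together is a single helper that every call site goes through. The sketch below is an assumption about client structure, not part of the PuppyGraph API: the name interpret_v1_response is hypothetical, and it is meant for the wrapped responses only, not for the unwrapped GET /ui-api/localTableStatus map.

```python
import json


def interpret_v1_response(http_status: int, body_text: str) -> tuple[bool, str]:
    """Return (ok, error_message) for a wrapped v1 local-table response.

    Hypothetical helper. Success requires both HTTP 200 and "ok": true,
    so an empty 200 body (as returned by the v0 stub routes) or a bare
    v0-style string is reported as a failure rather than a success.
    """
    try:
        body = json.loads(body_text) if body_text else {}
    except json.JSONDecodeError:
        body = {}
    if not isinstance(body, dict):
        body = {}  # e.g. a bare JSON string in the v0 style
    ok = http_status == 200 and body.get("ok") is True
    message = "" if ok else (body.get("errorMessage") or f"HTTP {http_status}")
    return ok, message
```

Requiring both signals is deliberately strict: it makes a call that still points at a silent v0 stub show up as a failure during migration instead of passing unnoticed.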

refreshLocalCache

The v0 endpoint accepted a list of view labels and triggered a synchronous-looking cache refresh. The v1 equivalent is a per-local-table call that submits an asynchronous load task and returns a scopeKey for tracking.

In v0 the request body looked like the following. viewIds was a list of node or edge labels and the partition values were optional:

{
  "viewIds": ["person", "software"],
  "partitionStartValue": "2024-08-07 00:00:00",
  "partitionEndValue":   "2024-08-08 00:00:00"
}

The success response was, surprisingly, not a JSON object but a bare JSON string with HTTP 200:

"Requested refresh local cache."

Failure also came back as a bare JSON string, this time with HTTP 400, for example "viewId:does_not_exist not exist in server". A v0 parser that did if (response === "Requested refresh local cache.") { ... } will not work against v1.

In v1 you make one request per local table, with viewIds replaced by a single localTableName. The request and response shapes are documented in the Loading data section of Managing the Graph. The diff worth noting on the migration path is that the response is now a JSON object ({ok, errorMessage, scopeKey}) instead of a bare JSON string, and the new scopeKey is used to poll GET /ui-api/getLocalTableLoadStatus for terminal status. v0's response carried no tracking handle.

A v0 call with two viewIds therefore becomes two v1 calls. For example, the v0 payload above translates to:

curl -X POST -u puppygraph:puppygraph123 \
  -H "Content-Type: application/json" \
  -d '{
        "localTableName": "person_local",
        "partitionStartValue": "2024-08-07 00:00:00",
        "partitionEndValue":   "2024-08-08 00:00:00"
      }' \
  http://localhost:8081/ui-api/loadLocalTable

curl -X POST -u puppygraph:puppygraph123 \
  -H "Content-Type: application/json" \
  -d '{
        "localTableName": "software_local",
        "partitionStartValue": "2024-08-07 00:00:00",
        "partitionEndValue":   "2024-08-08 00:00:00"
      }' \
  http://localhost:8081/ui-api/loadLocalTable

The local-table names (person_local, software_local here) come from the v1 schema. Look them up there, or run your v0 schema through POST /ui-api/convertSchema to see the conversion.
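The fan-out itself is mechanical once the mapping from view label to local-table names is known. The sketch below assumes that mapping has already been read out of the v1 schema and is passed in as tables_by_view; that parameter and the function name fan_out_refresh are hypothetical.

```python
def fan_out_refresh(v0_payload: dict, tables_by_view: dict[str, list[str]]) -> list[dict]:
    """Translate one v0 refreshLocalCache body into v1 loadLocalTable bodies.

    Hypothetical helper. tables_by_view maps each v0 viewId to the v1
    local-table names that replace it; the caller builds it from the
    v1 schema.
    """
    # The optional partition bounds carry over unchanged.
    partition_fields = {
        key: v0_payload[key]
        for key in ("partitionStartValue", "partitionEndValue")
        if key in v0_payload
    }
    bodies = []
    for view_id in v0_payload["viewIds"]:
        for table_name in tables_by_view[view_id]:
            bodies.append({"localTableName": table_name, **partition_fields})
    return bodies
```

Each returned body is then posted to /ui-api/loadLocalTable as its own request. A viewId missing from the mapping raises KeyError, which surfaces an incomplete mapping early instead of silently skipping a table.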

getLocalCacheDetail

The v0 endpoint returned a cache-wide summary in a single call. The v1 equivalent is keyed by localTableName and returns a list of per-load-task records, which is a more granular but differently-shaped surface. There is no v1 endpoint that returns the full v0 cache-wide envelope in one call; for the aggregate Web-UI view, use GET /ui-api/localTableStatus.

In v0 the response body looked like this, a list of per-view items plus an aggregate cache status and a cache type tag:

{
  "items": [
    {
      "viewId": "person",
      "name": "person",
      "state": "PENDING",
      "progress": "0%",
      "errorCode": "0",
      "viewType": "VERTEX",
      "analyzeDataStatus": "SUCCESS",
      "analyzeDataFailedReason": ""
    },
    {
      "viewId": "software",
      "name": "software",
      "state": "SUCCESS",
      "progress": "100%",
      "errorCode": "0",
      "viewType": "VERTEX",
      "analyzeDataStatus": "SUCCESS",
      "analyzeDataFailedReason": ""
    }
  ],
  "status": "REFRESHING_DATA",
  "type": "flex-cache"
}

A few v0 quirks are easy to overlook:

  • progress is a string like "0%" or "100%", not a number.
  • errorCode is also a string ("0" on success).
  • viewType is VERTEX or EDGE and reflects the node/edge classification. It has no v1 equivalent because v1 keys by local table, not by graph element.
  • The top-level status is the aggregate cache state, distinct from per-view items[i].state. The v0 vocabulary covers NOOP, INIT, IN_PROGRESS, PENDING_LOAD, REFRESHING_DATA, READY, UNAVAILABLE, STATUS_RETRIEVAL_ERROR, and DATA_LOADING_ERROR.
  • When the cache hasn't loaded anything yet, the whole response is the literal null with HTTP 200, not an empty object. Run JSON.parse, then null-check the result before reading .items.

The v1 equivalent fans out to one call per local table. The full response shape is documented in the Reviewing load history section of Managing the Graph. What matters here is how to translate the v0 parser:

  • Read tasks[] instead of items[]. To recover the v0 "one status per view" view, take the most recent task (tasks[0]); for the aggregate cache state, use GET /ui-api/localTableStatus.
  • The status vocabulary maps onto the v1 enum mostly intuitively:

| v0 status | v1 task status |
| --- | --- |
| READY | SUCCESS |
| PENDING_LOAD / IN_PROGRESS / REFRESHING_DATA | PENDING or RUNNING (depending on whether the underlying task has actually started) |
| DATA_LOADING_ERROR / STATUS_RETRIEVAL_ERROR | FAILED |
| NOOP / INIT / UNAVAILABLE | no v1 task records yet; the response is just tasks: [] |

A few field-level changes are worth pinning when updating a v0 parser:

  • progress is now a number (100), not a string ("100%").
  • id and mostRecentTaskId are JSON strings ("1"), not numbers, because they are proto int64 values and v1 serializes them that way to avoid precision loss on JavaScript clients.
  • viewType is gone, since the local-table model is graph-element-agnostic.
  • An unknown localTableName returns 200 with tasks: [] (not a 404), because the underlying call is a registry lookup that legitimately returns no records for unfamiliar names. Polling code that previously asked "does this view exist?" by looking at whether the cache had it should branch on something else.
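A parser that has to cope with both generations during the transition could normalize the pieces that changed. The sketch below encodes the status table and the progress re-typing described above; the names V0_TO_V1_STATUS, progress_as_number, and most_recent_task are hypothetical, and the assumption that tasks[0] is the most recent task is taken from the text, not verified against the API.

```python
# v0 aggregate status -> candidate v1 task statuses, per the table above.
# An empty list means "no v1 task records yet" (tasks: []).
V0_TO_V1_STATUS = {
    "READY": ["SUCCESS"],
    "PENDING_LOAD": ["PENDING", "RUNNING"],
    "IN_PROGRESS": ["PENDING", "RUNNING"],
    "REFRESHING_DATA": ["PENDING", "RUNNING"],
    "DATA_LOADING_ERROR": ["FAILED"],
    "STATUS_RETRIEVAL_ERROR": ["FAILED"],
    "NOOP": [],
    "INIT": [],
    "UNAVAILABLE": [],
}


def progress_as_number(value) -> float:
    """Accept the v0 string form ("100%") or the v1 numeric form (100)."""
    if isinstance(value, str):
        return float(value.strip().rstrip("%"))
    return float(value)


def most_recent_task(response: dict):
    """Return tasks[0], or None when the table has no task records.

    An empty list is also what an unknown localTableName produces, so
    None does not by itself prove that the table exists.
    """
    tasks = response.get("tasks") or []
    return tasks[0] if tasks else None
```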

getLocalCachePartitionDisplayInfo

The v0 endpoint reported per-partition state for one view, taking the viewId as a query parameter. The v1 equivalent takes localTableName and returns the same kind of per-partition rows, just with renamed fields.

The v0 behavior is worth a paragraph of its own. Against a non-partitioned cache, calling the endpoint returned HTTP 500 with a bare text body (Index 1 out of bounds for length 1, an unhandled exception leaking through as the response). Polling code that classified 5xx as transient and retried would have looped forever; code that treated 5xx as fatal had to special-case this one endpoint. v1 returns a proper 400 with a JSON errorMessage instead, so the fail-fast path is finally available.

The per-partition row was already structured in v0, just with different names. Both shapes carry a name, range, recent state, recent progress, an error code/message pair, and a last-success timestamp:

| v0 field | v1 field |
| --- | --- |
| viewId | (removed; identify the table by the request's localTableName) |
| partitionName | partitionName |
| range | partitionRange |
| recentState | mostRecentLoadStatus |
| recentProgress | mostRecentProgress |
| errorCode | (removed) |
| errorMessage | mostRecentErrorMessage |
| lastSuccessTimestamp | lastSuccessTimestamp |
| (new) | partitionState (NORMAL, etc.) |
| (new) | mostRecentTaskId |

The full v1 response shape is documented in the Reviewing partition state section of Managing the Graph. Two behavioral changes are specifically relevant for migration:

  • partitionRange is the literal string accepted as input to dropLocalTablePartition, so keep it verbatim if you plan to chain those two calls.
  • For a non-partitioned local table, v1 still returns exactly one row, but with partitionName equal to the local-table name and partitionRange: "". The drop endpoint will refuse to operate on it.
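The field renames in the table can be captured as a lookup, which is handy when porting stored v0 fixtures or expected values in tests. The sketch below renames keys only and does not translate values (status vocabulary, progress type); the names V0_TO_V1_PARTITION_FIELDS and rename_partition_row are hypothetical.

```python
# v0 partition-row key -> v1 key, per the table above.
# viewId and errorCode have no v1 counterpart and are dropped.
V0_TO_V1_PARTITION_FIELDS = {
    "partitionName": "partitionName",
    "range": "partitionRange",
    "recentState": "mostRecentLoadStatus",
    "recentProgress": "mostRecentProgress",
    "errorMessage": "mostRecentErrorMessage",
    "lastSuccessTimestamp": "lastSuccessTimestamp",
}


def rename_partition_row(v0_row: dict) -> dict:
    """Rename the keys of a v0 partition row to their v1 names.

    Hypothetical helper: values are copied as-is, and the v1-only
    fields (partitionState, mostRecentTaskId) are not synthesized.
    """
    return {
        v1_key: v0_row[v0_key]
        for v0_key, v1_key in V0_TO_V1_PARTITION_FIELDS.items()
        if v0_key in v0_row
    }
```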

dropLocalCachePartition

The drop endpoint is the smallest of the four in terms of payload, but it has the most surprising parser changes: the envelope keys are renamed, the HTTP status code starts being meaningful, and one more required field appears in the request body.

In v0 you submitted just the view and the partition name:

{
  "viewId": "person",
  "partitionName": "p0"
}

The response used the v0 {success, error} envelope, and (importantly) returned HTTP 200 even when the drop failed. A typical failure body looked like:

{
  "error": "[LSM-11] Failed to drop local cache partition: person p0",
  "success": false
}

In v1 the request body grows by one field: viewId becomes localTableName, and the new required partitionRange field has to match a live partition exactly. The full request and response shapes are documented in the Dropping a partition section of Managing the Graph; the migration-relevant points are:

  • The reliable workflow is to call getLocalTablePartitionDetails first, read partitionName and partitionRange out of the row you want to drop, and pass both verbatim into the drop call. Pre-computed or hard-coded values from v0 will be rejected if they don't match a live partition.
  • The HTTP status code is now load-bearing. Validation errors, table-state errors (e.g. "local table 'X' is not partitioned"), and "partition not found" all return 400, where v0 returned 200 with a body flag. v0 clients that ignored the HTTP status and only checked for success: false in the body will no longer find that key and can silently report failures as successes; flip the check to look at ok or the HTTP status.
  • dropLocalTablePartition only applies to partitioned local tables, and it will also refuse to run while an active load targets the partition or the whole table. Both cases come back through the standard {ok: false, errorMessage} envelope.
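The "read the live row, then drop" workflow can be sketched as a function that builds the drop body from a row returned by getLocalTablePartitionDetails. The function name build_drop_request is hypothetical. The client-side check for an empty partitionRange is an assumption layered on the documented behavior that a non-partitioned table reports partitionRange: "" and that the drop endpoint refuses it.

```python
def build_drop_request(local_table_name: str, partition_row: dict) -> dict:
    """Build a dropLocalTablePartition body from a live partition row.

    Hypothetical helper. partition_row is one row as returned by
    getLocalTablePartitionDetails; both values are passed on verbatim.
    """
    if not partition_row.get("partitionRange"):
        # A non-partitioned table reports an empty range; the server
        # would reject the drop, so fail early on the client instead.
        raise ValueError(f"local table {local_table_name!r} is not partitioned")
    return {
        "localTableName": local_table_name,
        "partitionName": partition_row["partitionName"],
        "partitionRange": partition_row["partitionRange"],
    }
```

Failing early on the client is optional; the server returns the same condition through the {ok: false, errorMessage} envelope with HTTP 400.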

Client migration checklist

If you only have time for one pass through the client code, address the items below in order. Each one is small, but nearly every piece of v0 cache automation touches at least the first three.

  1. Replace every viewId / viewIds with the corresponding localTableName, fanning out to one call per local table.
  2. Rename the response envelope keys: success becomes ok, and error becomes errorMessage. Drop the bare-string handling path used for refreshLocalCache; its v1 replacement, loadLocalTable, returns a JSON object.
  3. Treat HTTP status as authoritative. v1 returns proper 400/500 codes; v0's "always 200, check the body" pattern is gone.
  4. Re-type progress. v0 returned the string "100%"; v1 returns the number 100.
  5. Map the status vocabulary onto the v1 enum (most commonly: READY becomes SUCCESS; PENDING_LOAD and REFRESHING_DATA become PENDING or RUNNING; DATA_LOADING_ERROR becomes FAILED), and use the most recent task's status to stand in for the v0 per-view state.
  6. Parse the int64 fields (id, mostRecentTaskId) as strings, not numbers.
  7. Distinguish the two "unknown table" behaviors: getLocalTableLoadStatus returns 200 with tasks: [], while getLocalTablePartitionDetails returns 400.
  8. Before calling dropLocalTablePartition, fetch the live partitionName and partitionRange from getLocalTablePartitionDetails. Do not hard-code partition keys.

What does not auto-migrate

The Web UI's auto-conversion handles the schema document. Anything that lived outside the document doesn't move automatically:

  • Custom REST automation that posts v0-shaped JSON to /schema must be updated to post v1-shaped JSON. The endpoint rejects v0 documents with 400 Bad Request, so either run them through POST /ui-api/convertSchema first or emit v1 directly, which is the long-term path.
  • Cache-related environment variables and operational tooling changed in 1.0. If you previously relied on DATAACCESS_DATA_CACHE_STRATEGY or per-cache REST endpoints, verify the v1 equivalents before assuming they still apply.
  • Saved schema files in version control aren't rewritten on disk. After the first upload, download the v1 result and replace the v0 file in your repo.