Skip to content

Connected Component Finding

Introduction

Connected Component Finding is an algorithm that identifies all nodes (vertices) reachable from specified starting nodes (vertices) through given edge types. The algorithm calculates the minimum distance (hops) from any start node (vertex) to each reachable node.

All features and configuration options described in this document require PuppyGraph version 0.107 or higher.

Output

The algorithm produces the following fields:

Field Name Description
ID The ID of the node (vertex)
hop The minimum hops from any of the provided nodes (vertices)
path One of the shortest paths (with the fewest hops) from the source, available only if collectPath is enabled

Query Examples

Basic Query

Find all nodes reachable from a starting node:

CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startIds: ['User[2]']
    })
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startIds('User[2]')
    .create()
).submitAndGet().toList()

Using Vertex Filters

Filter nodes during traversal:

CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startIds: ['User[2]'],
    vertexFilters: [
        eq('Company', 'city', 'Boston'),
        eq('isOutdated', false)
        ]
    })
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startIds('User[2]')
    .vertexFilter('Company', 'city', P.eq('Boston'))    // filter for one label
    .vertexFilter('isOutdated', P.eq(false))            // filter for all labels
    .create()
).submitAndGet().toList()

Using Edge Filters

Filter edges during traversal:

CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startIds: ['User[2]'],
    edgeFilters: [
        gte('Knows', 'startYear', 2020),
        eq('isDeleted', false)
        ]
    })
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startIds('User[2]')
    .edgeFilter('Knows', 'startYear', P.gte(2020))
    .edgeFilter('isDeleted', P.eq(false))
    .create()
).submitAndGet().toList()

Using Start Vertex Filters

Filter which nodes can be used as starting points:

CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startVertexFilters: [
        lte('income', 1000)
        ]
    })
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startVertexFilter('income', P.lte(1000))
    .create()
).submitAndGet().toList()

Using Target IDs

Find paths to specific target nodes:

CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startIds: ['User[2]'],
    targetIds: ['User[3]', 'User[4]']
    })
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startIds('User[2]')
    .targetIds('User[3]', 'User[4]')
    .create()
).submitAndGet().toList()

Using Target Vertex Filters

Filter which nodes are included in the results:

CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startIds: ['User[2]'],
    targetVertexFilters: [
        gt('score', 80)
        ]
    })
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startIds('User[2]')
    .targetVertexFilter('score', P.gt(80))
    .create()
).submitAndGet().toList()

Collecting Paths

Collect the shortest paths to reached nodes:

CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startIds: ['User[2]'],
    collectPath: true,
    collectEdgeInPath: true
    })
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startIds('User[2]')
    .collectPath()
    .collectEdgeInPath()
    .create()
).submitAndGet().toList()

Using Monotonic Edge Property

Enforce monotonic edge property values during traversal:

CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startIds: ['User[2]'],
    monotonicEdgeProperty: {
        propertyName: 'start_time',
        increasing: true,
        strict: true
        }
    })
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startIds('User[2]')
    .monotonicEdgeProperty("start_time", true, true)
    .create()
).submitAndGet().toList()

Parameters

Name Description Required Default value
relationshipTypes Relationship types to traverse Yes []
startIds Initial node (vertex) IDs Yes if startVertexFilters is not set []
targetIds Target node (vertex) IDs No []
direction Specify search direction, valid values are 'IN', 'OUT' and 'BOTH' No BOTH
maxIterations Maximum number of iterations No 30
vertexFilters List of filters to filter nodes (vertices) by attribute value No []
startVertexFilters List of filters applied only to starting nodes (vertices) Yes if startIds is not set []
targetVertexFilters List of filters applied only to nodes (vertices) included in the results, does not prevent traversal through non-matching nodes No []
edgeFilters List of filters to filter edges by attribute value No []
stopAnyPassTargetFilter When targetIds or targetVertexFilters is set, the search will terminate upon finding any node that meets the criteria, without exploring further hops No false
maxFanout Restrict the number of outgoing edges per node. This limit is applied separately for each edge label and direction No null
collectPath Collect one of the shortest paths (with the fewest hops) for every node found No false
collectEdgeInPath Include edge IDs in collected paths No false
monotonicEdgeProperty Set an edge attribute to enforce monotonic values during traversal No null

Filter formats: Filters can be specified in two formats: op(labelName::String, attributeName::String, value::Any) and op(attributeName::String, value::Any). The first applies only to a specific label, while the second applies to all labels that have the given attribute.

Supported operators: eq, neq, gt, gte, lt, lte, within, without, isnull, notnull

Monotonic edge property: The monotonicEdgeProperty value must be a map containing three keys: propertyName, increasing, and strict. Both increasing and strict should be boolean values.

Method Description Required
edges(String... names) Edge types to traverse Yes
startIds(String... ids) Initial node (vertex) IDs Yes if startVertexFilter is not set
direction Specify search direction, valid values are IN, OUT and BOTH, default is BOTH No
maxIteration(int n) Sets the maximum number of iterations No
vertexFilter(String labelName, String Attr, P predicate) Filter nodes (vertices) of a specific label by attribute No
vertexFilter(String Attr, P predicate) Filter all nodes (vertices) by attribute (if the node has the given attribute) No
startVertexFilter(String labelName, String Attr, P predicate) Filter applied only to starting nodes (vertices) of a specific label by attribute Yes if startIds is not set
startVertexFilter(String Attr, P predicate) Filter applied only to starting nodes (vertices) of all labels by attribute (if the node has the given attribute) Yes if startIds is not set
targetVertexFilter(String labelName, String Attr, P predicate) Filter applied only to nodes (vertices) included in the results of a specific label by attribute, does not prevent traversal through non-matching nodes No
targetVertexFilter(String Attr, P predicate) Filter applied only to nodes (vertices) included in the results of all labels by attribute (if the node has the given attribute), does not prevent traversal through non-matching nodes No
edgeFilter(String labelName, String Attr, P predicate) Filter edges of a specific label by attribute No
edgeFilter(String Attr, P predicate) Filter all edges by attribute (if the edge has the given attribute) No
stopAnyPassTargetFilter() When targetIds or targetVertexFilters is set, the search will terminate upon finding any node that meets the criteria, without exploring further hops No
maxFanout(int n) Restrict the number of outgoing edges per node. This limit is applied separately for each edge label and direction No
collectPath() Enables collection of one of the shortest paths (with the fewest hops) for every node found No
collectEdgeInPath() Includes edge IDs in collected paths No
monotonicEdgeProperty(String propertyName, boolean increasing, boolean strict) Sets an edge attribute to enforce monotonic values during traversal No

Exporting Query Results

For large datasets, export the results to object storage:

EXPORT TO 's3://my_bucket/target_dir'
PROPERTIES {
  catalog: 'puppygraph_catalog_name'
}
CALL algo.connectedComponentFinding({
    relationshipTypes: ['Knows', 'WorkAt'],
    startIds: ['User[2]']
    }) YIELD ID, hop
RETURN ID, hop 
graph.program(
    ConnectedComponentFindingProgram.builder()
    .edges('Knows', 'WorkAt')
    .startIds('User[2]')
    .vertexFilter('Company', 'city', TextP.startsWith('B'))
    .edgeFilter('Knows', 'startYear', P.gte(2020))
    .vertexFilter('isOutdated', P.eq(false))
    .edgeFilter('isDeleted', P.eq(false))
    .create()
).submitAndSave([
  "exportTo": "s3://my_bucket/target_dir",
  "catalog": "puppygraph_catalog_name"
])