Connecting to Iceberg
Prerequisites
Both the Iceberg Catalog and Storage are accessible over the network from the PuppyGraph instance.
PuppyGraph supports REST, Hive Metastore, and AWS Glue Data Catalog as Iceberg Catalog implementations.
PuppyGraph supports Amazon S3, S3 Compatible Storage (e.g. MinIO), Google Cloud Storage (GCS), Azure, and HDFS as Iceberg storage.
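The first prerequisite can be checked before any configuration is attempted. The following is a minimal sketch, not part of PuppyGraph, that tests whether a TCP connection to an endpoint succeeds. The function name `is_reachable` is hypothetical. Run it from the machine that hosts the PuppyGraph instance and pass it your own catalog and storage endpoints.

```python
import socket
from urllib.parse import urlparse


def is_reachable(endpoint: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parsed = urlparse(endpoint)
    host = parsed.hostname
    # Fall back to the conventional port for http/https when none is given.
    port = parsed.port or {"http": 80, "https": 443}.get(parsed.scheme)
    if host is None or port is None:
        raise ValueError(f"cannot determine host and port from {endpoint!r}")
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

A successful TCP connection only shows that the port is open from that host. It does not verify credentials or that the service behind the port is the expected catalog or storage.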
Configuration
The configuration consists of two parts: Metastore (Catalog) and Data Storage. Configure them according to your Iceberg setup.
Metastore Configuration
Iceberg REST Catalog
Configuration | Explanation |
---|---|
RestUri | The server endpoint URI of the REST Catalog. |
Warehouse | The name of the Tabular warehouse. Required by Tabular metastore. |
Security | Security Schema of the REST catalog. Set it to |
Session | Set it to |
Credential | The Tabular authentication credential. Required by Tabular metastore. |
AWS Glue
Configuration | Explanation |
---|---|
Region | The region of the AWS Glue Data Catalog. Example: |
Use instance profile | Whether to use role-based authentication (explicit IAM roles or an attached instance profile). |
IAM Role ARN | The ARN of the IAM role for accessing the AWS Glue Data Catalog. Required by authentication with IAM roles. |
Access key | The access key of the IAM user for accessing the AWS Glue Data Catalog. Required by authentication with IAM User Access keys. |
Secret key | The secret key of the IAM user for accessing the AWS Glue Data Catalog. Required by authentication with IAM User Access keys. |
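The table implies three ways to authenticate: an attached instance profile, an explicit IAM role, or IAM user access keys. The sketch below shows one way to check that a set of values is complete before entering it. The class and field names are hypothetical and only mirror the table labels; they are not PuppyGraph's schema keys. The rule for how the fields combine is an assumption drawn from the "Required by" notes in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlueSettings:
    # Hypothetical field names that mirror the table labels above.
    region: str
    use_instance_profile: bool = False
    iam_role_arn: Optional[str] = None
    access_key: Optional[str] = None
    secret_key: Optional[str] = None


def auth_mode(settings: GlueSettings) -> str:
    """Return the authentication mode the settings describe, or raise ValueError."""
    if not settings.region:
        raise ValueError("Region is required")
    if settings.use_instance_profile:
        # Assumed rule: role-based auth uses the explicit role when an ARN
        # is given, otherwise the instance profile attached to the host.
        return "iam-role" if settings.iam_role_arn else "instance-profile"
    if settings.access_key and settings.secret_key:
        return "access-keys"
    raise ValueError("Access key and Secret key are both required "
                     "when role-based authentication is not used")
```

Checking the combination up front makes a missing secret key or a forgotten role ARN visible before the catalog connection fails.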
Hive Metastore
Configuration | Explanation |
---|---|
Hive metastore URI | The URI of your Hive metastore. Format: |
Data Storage Configuration
Amazon S3 (Simple Storage Service)
PuppyGraph supports Amazon S3 (Simple Storage Service) for Iceberg.
Configuration | Explanation |
---|---|
Region | The Amazon S3 region. Example: |
Use instance profile | Whether to use role-based authentication (Explicit IAM roles or instance-profile attached). |
IAM Role ARN | The ARN of the IAM role for accessing Amazon S3. Required by authentication with IAM roles. |
Access key | The access key of the IAM user for accessing Amazon S3. Required by authentication with IAM User Access keys. |
Secret key | The secret key of the IAM user for accessing Amazon S3. Required by authentication with IAM User Access keys. |
Amazon S3 Compatible Storage
PuppyGraph supports S3 Compatible Storage (e.g. MinIO) for Iceberg.
Configuration | Explanation |
---|---|
Endpoint | The S3 compatible storage endpoint. |
Access key | The access key of an IAM user for accessing the S3 compatible storage. |
Secret key | The secret key of an IAM user for accessing the S3 compatible storage. |
Enable SSL | Whether to enable SSL connection for accessing the S3 compatible storage. |
Enable path style access | Whether to use path-style access method when accessing the S3 compatible storage. |
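To illustrate what the last two options change, the sketch below builds the URL of an object under each addressing style. With path-style access the bucket name is part of the URL path; with virtual-hosted-style access it is a prefix of the host name. The function `object_url` and its parameters are hypothetical, and mapping Enable SSL to the `https` scheme is an assumption about how the option is applied.

```python
from urllib.parse import urlparse


def object_url(endpoint: str, bucket: str, key: str,
               path_style: bool, ssl: bool) -> str:
    """Build the URL of an object for an S3 compatible endpoint."""
    # Accept endpoints written with or without a scheme.
    parsed = urlparse(endpoint if "://" in endpoint else f"//{endpoint}")
    host = parsed.netloc
    scheme = "https" if ssl else "http"
    if path_style:
        # Path-style: the bucket is the first path segment.
        return f"{scheme}://{host}/{bucket}/{key}"
    # Virtual-hosted-style: the bucket is a subdomain of the endpoint host.
    return f"{scheme}://{bucket}.{host}/{key}"
```

Self-hosted services such as MinIO are commonly reached by a single host name or IP address, where a per-bucket subdomain does not resolve. Path-style access is the usual choice in that situation.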
Get from metastore
There is no need to specify Storage configuration with the following implementations of Iceberg:
HDFS with Hive Metastore.
Tabular (credential vending) with Iceberg REST Catalog.
Select Get from metastore in the Web UI for these implementations.
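The rule above can be expressed as a small lookup. This is a sketch only: the string identifiers for catalog and storage kinds are hypothetical and are not values PuppyGraph accepts.

```python
# Hypothetical identifiers for the two combinations listed above, where the
# metastore supplies what is needed to reach the data.
METASTORE_PROVIDES_STORAGE = {
    ("hive", "hdfs"),
    ("rest", "tabular"),
}


def needs_storage_config(catalog: str, storage: str) -> bool:
    """Return False when "Get from metastore" applies to the combination."""
    return (catalog.lower(), storage.lower()) not in METASTORE_PROVIDES_STORAGE
```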
Demo
See Querying Iceberg Data as a Graph for a complete demo.
Example Configurations
Please refer to Data Lake Catalog for detailed parameters for each type of catalog and storage.
Catalog Type | Storage Type | Example Configuration |
---|---|---|
REST Catalog | Amazon S3 | |
REST Catalog | MinIO | |
AWS Glue | Amazon S3 | |
Hive Metastore | HDFS | |
Hive Metastore | Amazon S3 | |
Hive Metastore | MinIO | |
Hive Metastore | Google GCS | |
Hive Metastore | Azure Blob | |
Hive Metastore | Azure Data Lake Gen2 | |
REST Catalog + Amazon S3
REST Catalog + MinIO
AWS Glue + Amazon S3
Hive Metastore + HDFS
Hive Metastore + MinIO
Hive Metastore + Amazon S3
Hive Metastore + Google GCS
Hive Metastore + Azure Blob
Hive Metastore + Azure Data Lake Gen2