Connecting to Iceberg
Both the Iceberg Catalog and Storage must be accessible over the network from the PuppyGraph instance.
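One way to confirm network accessibility before configuring the connection is a plain TCP check run from the machine that hosts PuppyGraph. The sketch below is only an illustration: the helper name `is_reachable` is made up for this example, and it tests that a port accepts connections, not that the service behind it is a working catalog or object store.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `is_reachable("metastore.example.internal", 9083)` would probe a Hive Metastore on its default Thrift port; the host name here is a placeholder, so substitute the address of your own catalog or storage endpoint.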
PuppyGraph supports REST, Hive Metastore, and AWS Glue Data Catalog as Iceberg Catalog implementations.
PuppyGraph supports Amazon S3, S3 Compatible Storage (e.g. MinIO), Google Cloud Storage (GCS), Azure, and HDFS as Iceberg storage.
The configuration consists of two parts: Metastore (Catalog) and Data Storage. Please configure them according to your Iceberg setup.
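The two-part structure can be pictured as a pair of settings groups. The sketch below is a hypothetical illustration, not PuppyGraph's schema format: the dictionary layout, the `type` values, the host names, and the helper `missing_parts` are all assumptions made for this example, while the field labels are borrowed from the parameter names listed on this page.

```python
# Hypothetical layout: one group for the metastore (catalog), one for data storage.
connection = {
    "metastore": {
        "type": "hive",  # assumed label for a Hive Metastore catalog
        "Hive metastore URI": "thrift://metastore.example.internal:9083",  # placeholder host
    },
    "storage": {
        "type": "s3-compatible",  # assumed label for MinIO-style storage
        "Endpoint": "http://minio.example.internal:9000",  # placeholder endpoint
        "Access key": "<access-key>",
        "Secret key": "<secret-key>",
        "Enable SSL": False,
        "Enable path style access": True,
    },
}


def missing_parts(conn: dict, storage_from_metastore: bool = False) -> list:
    """Return the names of the top-level parts that are absent or empty.

    When storage settings are obtained from the metastore, only the
    metastore part is treated as required.
    """
    required = ["metastore"] if storage_from_metastore else ["metastore", "storage"]
    return [part for part in required if not conn.get(part)]
```

Calling `missing_parts(connection)` on the dictionary above returns an empty list, while a dictionary with only a `metastore` entry returns `["storage"]` unless `storage_from_metastore=True` is passed.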
PuppyGraph supports Amazon S3 (Simple Storage Service) for Iceberg.
PuppyGraph supports S3 Compatible Storage (e.g. MinIO) for Iceberg.
There is no need to specify Storage configuration for the following Iceberg implementations:
- HDFS with Hive Metastore.
- Tabular (credential vending) with Iceberg REST Catalog.

Select `Get from metastore` in the Web UI for these implementations.
See Querying Iceberg Data as a Graph for a complete demo.
Please refer to Data Lake Catalog for detailed parameters for each type of catalog and storage.
Catalog Type | Storage Type | Example Configuration |
---|---|---|
REST Catalog | Amazon S3 | |
REST Catalog | MinIO | |
AWS Glue | Amazon S3 | |
Hive Metastore | HDFS | |
Hive Metastore | Amazon S3 | |
Hive Metastore | MinIO | |
Hive Metastore | Google Cloud Storage (GCS) | |
Hive Metastore | Azure Blob | |
Hive Metastore | Azure Data Lake Gen2 | |

REST Catalog parameters:

Configuration | Explanation |
---|---|
RestUri | The server endpoint URI of the REST Catalog. |
Warehouse | The name of the Tabular warehouse. Required by Tabular metastore. |
Security | Security scheme of the REST catalog. Set it to `oauth2` when using Tabular metastore. |
Session | Set it to `user` when using Tabular metastore. |
Credential | The Tabular authentication credential. Required by Tabular metastore. |

AWS Glue Data Catalog parameters:

Configuration | Explanation |
---|---|
Region | The region of the AWS Glue Data Catalog. Example: `us-east-1`. See AWS Glue endpoints and quotas for more details. |
Use instance profile | Whether to use role-based authentication (explicit IAM role or attached instance profile). |
IAM Role ARN | The ARN of the IAM role for accessing the AWS Glue Data Catalog. Required for authentication with IAM roles. |
Access key | The access key of the IAM user for accessing the AWS Glue Data Catalog. Required for authentication with IAM user access keys. |
Secret key | The secret key of the IAM user for accessing the AWS Glue Data Catalog. Required for authentication with IAM user access keys. |

Hive Metastore parameters:

Configuration | Explanation |
---|---|
Hive metastore URI | The URI of your Hive metastore. Format: `thrift://<metastore_IP_address>:<metastore_port>`. |

Amazon S3 parameters:

Configuration | Explanation |
---|---|
Region | The region of Amazon S3. Example: `us-east-1`. See Amazon Simple Storage Service endpoints and quotas for more details. |
Use instance profile | Whether to use role-based authentication (explicit IAM role or attached instance profile). |
IAM Role ARN | The ARN of the IAM role for accessing Amazon S3. Required for authentication with IAM roles. |
Access key | The access key of the IAM user for accessing Amazon S3. Required for authentication with IAM user access keys. |
Secret key | The secret key of the IAM user for accessing Amazon S3. Required for authentication with IAM user access keys. |

S3 Compatible Storage parameters:

Configuration | Explanation |
---|---|
Endpoint | The S3 compatible storage endpoint. |
Access key | The access key of an IAM user for accessing the S3 compatible storage. |
Secret key | The secret key of an IAM user for accessing the S3 compatible storage. |
Enable SSL | Whether to enable SSL connection for accessing the S3 compatible storage. |
Enable path style access | Whether to use path-style access method when accessing the S3 compatible storage. |
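The "Enable path style access" option for S3 compatible storage decides where the bucket name appears in request URLs. As a rough illustration of the two S3 addressing styles (not of PuppyGraph internals), the sketch below builds both forms. The helper `object_url`, the endpoint, the bucket, and the object key are all hypothetical values chosen for this example.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def object_url(endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Build the URL for an object under the chosen S3 addressing style."""
    parts = urlsplit(endpoint)
    if path_style:
        # Path-style: the bucket is the first path segment.
        netloc = parts.netloc
        path = f"/{bucket}/{quote(key)}"
    else:
        # Virtual-hosted-style: the bucket is a subdomain of the endpoint host.
        netloc = f"{bucket}.{parts.netloc}"
        path = f"/{quote(key)}"
    return urlunsplit((parts.scheme, netloc, path, "", ""))


endpoint = "http://minio.example.internal:9000"  # placeholder endpoint
key = "db/tbl/metadata/v1.metadata.json"         # placeholder object key

print(object_url(endpoint, "warehouse", key, path_style=True))
# http://minio.example.internal:9000/warehouse/db/tbl/metadata/v1.metadata.json
print(object_url(endpoint, "warehouse", key, path_style=False))
# http://warehouse.minio.example.internal:9000/db/tbl/metadata/v1.metadata.json
```

Virtual-hosted-style needs each bucket subdomain to resolve in DNS, so self-hosted stores that are reached through a single host name are often used with path-style access enabled. Check your storage provider's documentation for the style it expects.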