Unity Catalog Catalog Connector
Connect to a Unity Catalog as a catalog provider for federated SQL queries against Delta Lake tables.
Configuration
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    include:
      - "*.my_table"
    dataset_params:
      # delta_lake S3 parameters
      unity_catalog_aws_region: us-west-2
      unity_catalog_aws_access_key_id: ${secrets:aws_access_key_id}
      unity_catalog_aws_secret_access_key: ${secrets:aws_secret_access_key}
      unity_catalog_aws_endpoint: s3.us-west-2.amazonaws.com
from
The from field specifies the catalog provider. For Unity Catalog, use unity_catalog:<catalog_path>, where catalog_path is the URL of the getCatalog endpoint of the Unity Catalog API, formatted as https://<unity_catalog_host>/api/2.1/unity-catalog/catalogs/<catalog_name>.
name
The name field specifies the name of the catalog in Spice. The schema hierarchy of the external catalog is preserved in Spice.
include
Use the include field to specify which tables to include from the catalog. The include field supports glob patterns to match multiple tables. For example, *.my_table_name includes every table named my_table_name from any schema in the catalog. Multiple include patterns can be specified; they are OR'ed together, so a table is included if it matches any of them.
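As a minimal sketch of combining patterns, the fragment below lists two include entries, so a table is included if it matches either one. The table names customers and orders are hypothetical placeholders; the host and catalog name are reused from the configuration example.

```yaml
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    include:
      # Hypothetical table names; each pattern matches that table in any schema.
      - "*.customers"
      - "*.orders"
```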
params
The params field configures the connection to Unity Catalog. The following parameters are supported:
unity_catalog_token: The personal access token used to authenticate against the Unity Catalog API.
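For illustration, a token could be supplied as sketched below. The secret name unity_catalog_token is an assumption; the ${secrets:...} reference form follows the configuration example in this document.

```yaml
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    params:
      # Assumed secret name; reference the token from a secret store rather than writing it inline.
      unity_catalog_token: ${secrets:unity_catalog_token}
```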
dataset_params
The dataset_params field configures dataset-specific parameters for the catalog, such as the object store parameters described below.
Unity Catalog object store parameters
AWS S3
unity_catalog_aws_region: The AWS region for the S3 object store. E.g. us-west-2.
unity_catalog_aws_access_key_id: The access key ID for the S3 object store.
unity_catalog_aws_secret_access_key: The secret access key for the S3 object store.
unity_catalog_aws_endpoint: The endpoint for the S3 object store. E.g. s3.us-west-2.amazonaws.com.
Azure Blob
One of the following auth values must be provided for Azure Blob:
unity_catalog_azure_storage_account_key, unity_catalog_azure_storage_client_id and unity_catalog_azure_storage_client_secret, or unity_catalog_azure_storage_sas_key.
unity_catalog_azure_storage_account_name: The Azure Storage account name.
unity_catalog_azure_storage_account_key: The Azure Storage master key for accessing the storage account.
unity_catalog_azure_storage_client_id: The service principal client ID for accessing the storage account.
unity_catalog_azure_storage_client_secret: The service principal client secret for accessing the storage account.
unity_catalog_azure_storage_sas_key: The shared access signature key for accessing the storage account.
unity_catalog_azure_storage_endpoint: The endpoint for the Azure Blob storage account.
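A minimal sketch of an Azure Blob configuration using account-key authentication might look like the following. The account name mystorageaccount and the secret name azure_storage_account_key are placeholders, not values from this document.

```yaml
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    dataset_params:
      # Azure Blob parameters (account name and secret name are placeholders)
      unity_catalog_azure_storage_account_name: mystorageaccount
      unity_catalog_azure_storage_account_key: ${secrets:azure_storage_account_key}
```

To authenticate with a service principal or a SAS key instead, replace the account key line with the corresponding parameters listed above.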
Google Storage (GCS)
unity_catalog_google_service_account: Filesystem path to the Google service account JSON key file.
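As a sketch, a GCS-backed catalog could point at a key file as shown below. The file path is a hypothetical placeholder and must be readable from wherever the Spice runtime executes.

```yaml
catalogs:
  - from: unity_catalog:https://my_unity_catalog_host.com/api/2.1/unity-catalog/catalogs/my_catalog
    name: uc
    dataset_params:
      # Hypothetical filesystem path to the service account JSON key file
      unity_catalog_google_service_account: /path/to/service-account.json
```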
Limitations
- The Unity Catalog connector does not support reading Delta tables with the V2Checkpoint feature enabled. To use the connector with such tables, drop the V2Checkpoint feature by executing the following command:
  ALTER TABLE <table-name> DROP FEATURE v2Checkpoint [TRUNCATE HISTORY];
  For more details on dropping Delta table features, refer to the official documentation: Drop Delta table features
