In collaboration with our partners at Databricks, Omega Point provides secure access to computational resources, such as Python notebooks, that generate data-intensive insights. For mutual customers with Databricks accounts, this is achieved through Databricks Delta Sharing.
This guide will help you activate the data share, including how to retrieve your Databricks workspace ID and recipient ID to provide to Omega Point.
Prerequisites
Databricks Account: You will need an active Databricks account; please have your Workspace ID ready.
Permissions: Ensure you have the necessary permissions in your Databricks workspace to accept data shares and modify Unity Catalog settings.
Steps to Activate and Access the Data Share
Available Data Shares
Ask your Omega Point customer success manager for a list of available data shares. Each share is scoped to a specific 'release topic' that manages its own versions.
For example, Omega Point's Thematic Beta Package is available as a data share and is versioned separately from other topics.
Provide your Databricks information to Omega Point
Work with your Omega Point customer success manager and provide them with your Databricks Workspace ID.
You can find your Workspace ID in Databricks by navigating to Settings > Workspace settings; it also appears in your workspace URL as the value of the o= parameter. The Workspace ID is a unique numeric identifier.
Look up your Databricks Recipient ID and provide it to your Omega Point customer success manager (see the sketch below).
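If your Recipient ID corresponds to your Delta Sharing identifier (the value Databricks documents as the output of CURRENT_METASTORE()), you can retrieve it from a notebook attached to a Unity Catalog-enabled cluster. This is a minimal sketch; confirm with your customer success manager that this is the identifier they need.

    # Run in a Databricks notebook, where `spark` is predefined.
    # CURRENT_METASTORE() returns the sharing identifier in the form
    # <cloud>:<region>:<metastore-uuid>.
    row = spark.sql("SELECT CURRENT_METASTORE() AS sharing_identifier").first()
    print(row.sharing_identifier)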
Verify Data in Databricks Catalog
After you provide your information to Omega Point, your requested data share will be delivered securely to your Databricks environment.
In Databricks, navigate to your Data section and select Catalog.
Your newly activated data share should appear under Shared Data.
If the data does not appear immediately, allow a few moments for processing and refresh your catalog.
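You can also verify the share from a notebook. A minimal sketch, assuming the shared catalog is named omega_point_share (a hypothetical placeholder; substitute the catalog name your customer success manager gives you):

    # List catalogs visible to this cluster; the shared catalog should appear.
    spark.sql("SHOW CATALOGS").show(truncate=False)
    # Inspect the schemas inside the shared catalog (name is a placeholder).
    spark.sql("SHOW SCHEMAS IN omega_point_share").show(truncate=False)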
Configuring Your Databricks Cluster
Once notebooks are received and cloned into your workspace, you may need to configure your Databricks cluster to access Omega Point's omegapoint-utils library, which is available via a Docker image. These steps walk through setting up a new cluster.
Navigation
On the left-hand side, select the “Compute” option
Within the “Compute” page, click the “Create Compute” button
On this page you will create a new compute cluster; use the configuration in the following section.
Recommended Minimal Configuration
General Settings
Cluster Name: <your_cluster_name>
Policy: Unrestricted
Access Mode
Mode: Single user
User: <user_name>
Performance
Databricks Runtime Version: 14.3 LTS (Scala 2.12, Spark 3.5.0)
Node Type: i3.xlarge (30.5 GB Memory, 4 Cores, 1 Driver)
This is the minimal recommended cluster size
Additional workers and more cores / RAM will improve runtimes (to an extent)
Enable Autoscaling Local Storage: Checked
Terminate After: 120 minutes of inactivity
Recommended to reduce billing for unused resources
Photon Acceleration: Not enabled
Advanced Options
Docker Configuration:
Use Your Own Docker Container: Checked
Docker Image URL: omegapointresearch/omegapoint-utils:latest
Authentication: Default
IAM Role Passthrough: Not enabled
Example Configuration
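If you prefer to script cluster creation, the sketch below submits the configuration above through the Databricks Clusters REST API (POST /api/2.0/clusters/create). HOST and TOKEN are placeholders for your workspace URL and a personal access token; this is an illustration, not an Omega Point-provided script, so verify the field names against your Databricks API version.

    import requests

    HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
    TOKEN = "<personal-access-token>"                       # placeholder

    cluster_spec = {
        "cluster_name": "<your_cluster_name>",
        "spark_version": "14.3.x-scala2.12",  # 14.3 LTS (Scala 2.12, Spark 3.5.0)
        "node_type_id": "i3.xlarge",
        "num_workers": 1,                     # add workers to improve runtimes
        "data_security_mode": "SINGLE_USER",  # Access Mode: Single user
        "single_user_name": "<user_name>",
        "autotermination_minutes": 120,       # terminate after 120 min idle
        "enable_elastic_disk": True,          # autoscaling local storage
        "runtime_engine": "STANDARD",         # Photon not enabled
        "docker_image": {"url": "omegapointresearch/omegapoint-utils:latest"},
    }

    resp = requests.post(
        f"{HOST}/api/2.0/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=cluster_spec,
    )
    resp.raise_for_status()
    print("Created cluster:", resp.json()["cluster_id"])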
Notes
Mutual customers must run the provided notebook code on a Databricks cluster.
Confirm that the cluster is a new cluster enabled with Databricks Unity Catalog.
The cluster must be configured to use the omegapoint_utils Docker image. The omegapoint_utils Docker image requires no authentication to use.
The user must provide a Recipient ID to gain access to the Thematic Analysis delta share release topic (which contains sample notebooks).
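Once the share is activated, the shared tables read like any other catalog table. A minimal sketch, where the catalog, schema, and table names are hypothetical placeholders:

    # Substitute the catalog/schema/table names from your activated share.
    df = spark.table("omega_point_share.thematic_beta.sample_table")
    df.limit(5).show()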
Troubleshooting
I'm getting a ModuleNotFoundError, what can I do?
Confirm that the cluster is configured to use the omegapoint_utils Docker image; the image requires no authentication to use.
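A quick check from the notebook, assuming the library's importable module is named omegapoint_utils (an assumption based on the library name):

    try:
        import omegapoint_utils  # module name assumed from the library name
        print("omegapoint-utils is available on this cluster")
    except ModuleNotFoundError:
        print("Not found: attach this notebook to a cluster running the "
              "omegapointresearch/omegapoint-utils:latest Docker image")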
I'm not seeing the cluster enabled with Unity Catalog, what can I do?
We recommend creating a new cluster using the settings above; a new cluster created this way should be enabled with Unity Catalog rather than the Hive metastore.
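One way to check from a notebook whether the attached cluster is Unity Catalog-enabled: the CURRENT_METASTORE() query below should succeed on a Unity Catalog cluster and fail on a Hive-metastore-only cluster.

    try:
        spark.sql("SELECT CURRENT_METASTORE()").show()
        print("Unity Catalog appears to be enabled on this cluster")
    except Exception as err:  # broad catch: the exact error type varies by runtime
        print("Unity Catalog does not appear to be enabled:", err)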
I'm getting an execution failure: "Failure starting repl. Try detaching and re-attaching the notebook"
The Databricks Runtime should be 14.3 LTS (Scala 2.12, Spark 3.5.0).
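To confirm the runtime from a notebook, a minimal check (Spark 3.5.0 corresponds to Databricks Runtime 14.3 LTS):

    # Should print 3.5.0 on a 14.3 LTS cluster.
    print(spark.version)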
I'm getting a "failed to create catalog" error.
Work with your Databricks administrator to ensure you have the correct permissions in your Databricks environment.
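For reference, mounting a share as a catalog requires the CREATE CATALOG privilege on the metastore. A sketch of the grant a metastore admin could run, with a placeholder user:

    # Run as a metastore admin; the email address is a placeholder.
    spark.sql("GRANT CREATE CATALOG ON METASTORE TO `user@example.com`")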