Generating Data-Intensive Insights | Omega Point Help Center

In collaboration with our partners at Databricks, Omega Point provides secure access to computational resources — such as python notebooks — that generate data-intensive insights. For mutual customers with Databricks accounts, this is achieved through Databricks Delta Sharing.

This guide will help you activate the data share, including how to retrieve your Databricks workspace ID and recipient ID to provide to Omega Point.

Prerequisites

Databricks Account: If you use Databricks, please have your Workspace ID ready.
Permissions: Ensure you have the necessary permissions within your organization in Databricks: to accept data shares and modify Unity catalog settings within your Databricks workspace.

Steps to Activate and Access the Data Share

Available Data Shares

Ask your Omega Point customer success manager for a list of data shares available. Each share is scoped to a specific 'release topic' that will manage its own versions.

For example, Omega Point's Thematic Beta Package is available as a data share, and will be separate from other topics.

Provide your Databricks information to Omega Point

Work with your Omega Point customer success manager and provide them with your Databricks Workspace ID.

You can find your Workspace ID in Databricks by navigating to Settings > Workspace settings. The Workspace ID will appear as a unique alphanumeric identifier.
Look up your Databricks Sharing Identifier and provide this to your Omega Point customer success manager.

You can find the Sharing Identifier by navigating to Catalog > Delta Sharing > Copy Sharing Identifier

Verify Data in Databricks Catalog

After providing your information to OP, your requested data share will be sent securely to your Databricks environment.
In Databricks, navigate to your Data section and select Catalog.
1. Your newly activated data share should appear under Shared Data.
If the data does not appear immediately, allow a few moments for processing and refresh your catalog.

Configuring Your Databricks Cluster

Once notebooks are received and cloned into your workspace, it may be necessary to configure your Databricks Cluster to access Omega Point's omegapoint-utils library, available via a docker image. These steps walk through setting up a new cluster.

Navigation

On the left-hand side, select the “Compute” option

Within the “Compute” page, click the “Create Compute” button

On this page you will create a new compute cluster, use the configuration in the following section.

Recommended Minimal Configuration

General Settings
- Cluster Name: <your_cluster_name>
- Policy: Unrestricted
Access Mode
- Mode: Single user
- User: <user_name>
Performance
- Databricks Runtime Version: 14.3 LTS (Scala 2.12, Spark 3.5.0)
- Node Type: i3.xlarge (30.5 GB Memory, 4 Cores, 1 Driver)
  - This is the minimal recommended cluster size
  - Additional workers and more cores / RAM will improve runtimes (to an extent)
- Enable Autoscaling Local Storage: Checked
- Terminate After: 120 minutes of inactivity
  - Recommended to reduce billing for unused resources
- Photon Acceleration: Not enabled
Advanced Options
- Docker Configuration:
  - Use Your Own Docker Container: Checked
  - Docker Image URL: omegapointresearch/omegapoint-utils:1.0.0
  - Authentication: Default
IAM Role Passthrough: Not enabled

Example Configuration

After changing Simple Form to OFF

Notes

Mutual customers must run provided notebooks code on a databricks cluster
- Confirm that the cluster is a new cluster that is enabled with Databricks Unity catalog
The cluster must be configured to use the omegapoint_utils docker image
- The omegapoint_utils docker image requires no authentication to use
The user must provide a recipient ID to gain access to the Thematic Analysis delta share release topic (which contains sample notebooks)

Troubleshooting

I'm getting a ModuleNotFoundError, what can I do?

The cluster must be configured to use the omegapoint_utils:1 docker image
- The omegapoint_utils:1 docker image requires no authentication to use

I'm not seeing the cluster enabled with Unity Catalog?

Recommend to make a new cluster using the settings above, a new cluster should be enabled with Unity Catalog, not hive metastore.

I'm getting an execution failure. "Failure starting repl. Try detaching and re-attaching the notebook"

The databricks runtime should be 14.3 LTS (Scala 2.12, Spark 3.5.0)
This is a Standard runtime, not an ML runtime. 14.3LTS ML will create an error

I'm getting an error for "failed to create catalog".

Work with your Databricks administrator to ensure you have the right permissions to your databricks environment.