To derive value from data within your organization, you need to solve for data consumption: how data is accessed, used, and understood. The ultimate aim is a state where individuals across the entire organization can work effectively, autonomously, and compliantly with data, regardless of their role or skill level.
Databricks and Harbr are powerful data platforms that, when used together, enable organizations to realize the full potential of data. While both are data platforms, they serve different and complementary purposes. Using them together means you can go beyond what either platform will enable you to achieve on its own.
Databricks is a Unified Analytics Platform designed to process and analyze massive amounts of data in a scalable cloud environment. It’s primarily used by technical teams who need an end-to-end data solution to process, store, analyze, and model data.
Harbr is a data marketplace platform that enables the governed access and use of data, models, and insights. It’s used by technical teams to configure and manage data products, and by business teams to access, adapt, and use those data products. When used together, Databricks and Harbr address two critical challenges for businesses:
This combination allows businesses to solve the problem of data consumption at scale, enabling both technical and non-technical users to derive value from data without compromising on governance or efficiency.
1. Zero-copy data sharing
A significant challenge in enabling access and usage is the movement and replication of data across systems, which can lead to inefficiencies, higher costs, and compliance risks. Databricks’ Delta Sharing feature allows businesses to share data securely without creating multiple copies. Harbr integrates with it and removes the technical complexity, giving business users a highly intuitive experience. Data can then be shared without being moved, which reduces costs, increases speed, and enables governed data access.
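To make the zero-copy idea concrete, here is a minimal sketch of a recipient reading a shared table with the open-source `delta-sharing` Python connector. It illustrates the underlying protocol only, not how Harbr integrates with it. The profile file name and the share, schema, and table names in the comments are hypothetical placeholders.

```python
def shared_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def preview_shared_table(profile_path: str, share: str, schema: str,
                         table: str, limit: int = 10):
    """Read a few rows of a shared table without creating a stored copy of it."""
    # Imported here so the URL helper above works without the package installed.
    import delta_sharing  # pip install delta-sharing

    url = shared_table_url(profile_path, share, schema, table)
    # The rows are read from the provider's storage into a pandas DataFrame.
    return delta_sharing.load_as_pandas(url, limit=limit)


# Hypothetical usage (requires a real profile file issued by the data provider):
# df = preview_shared_table("config.share", "sales_share", "emea", "orders")
```

The recipient only holds a small credentials file (the profile). The table itself stays in the provider's storage, which is why no second copy needs to be maintained.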
2. Federated data architecture
Data is often spread across different systems, databases, and environments. Databricks allows for a federated approach to data, meaning it can pull data from multiple sources without needing to centralize it. Harbr’s architecture complements this by curating and organizing data assets from those sources directly, or via Databricks, making it easier for users to find and access what they need without technical support or complex approval processes.
3. Curated data access
While Databricks focuses on analytics and processing, Harbr focuses on data curation and ease of use. Harbr has a highly flexible “data products” and “data assets” model. Any digital object, whether it’s a table, an API, a notebook, a visualization, or even a PDF, can be registered as a data asset. Any combination of data assets can be turned into, and managed as, a data product. This enables highly specific data curation at scale, which improves the discoverability and usability of data and makes it more accessible to non-technical users.
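The relationship between data assets and data products can be pictured as a simple data structure. The sketch below is purely illustrative: the class names, fields, and example values are assumptions made for this example and do not represent Harbr's actual API or schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataAsset:
    """Any registered digital object (hypothetical model)."""
    name: str
    kind: str      # e.g. "table", "api", "notebook", "visualization", "pdf"
    location: str  # where the object lives; it is referenced, not copied


@dataclass
class DataProduct:
    """A curated combination of data assets (hypothetical model)."""
    name: str
    assets: List[DataAsset] = field(default_factory=list)

    def add(self, asset: DataAsset) -> None:
        # Ignore duplicates so an asset appears at most once per product.
        if asset not in self.assets:
            self.assets.append(asset)

    def kinds(self) -> List[str]:
        # Distinct asset kinds in the product, sorted alphabetically.
        return sorted({a.kind for a in self.assets})


def build_example_product() -> DataProduct:
    product = DataProduct("Customer churn pack")
    product.add(DataAsset("churn_scores", "table", "catalog.analytics.churn_scores"))
    product.add(DataAsset("churn_notebook", "notebook", "/Shared/churn_model"))
    product.add(DataAsset("methodology", "pdf", "docs/churn_methodology.pdf"))
    # Adding the same asset a second time leaves the product unchanged.
    product.add(DataAsset("methodology", "pdf", "docs/churn_methodology.pdf"))
    return product
```

Because a product only holds references to assets, the same asset could belong to several products, which is one way to read "any combination of data assets" in the description above.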
The integration of Databricks and Harbr offers distinct advantages for various users within an organization:
1. Non-technical users
For non-technical users, Harbr has an AI-powered text-to-SQL query capability. This means any user can work autonomously with tabular data assets, without requiring deep technical expertise. Using this capability, they can perform key actions like evaluating, adapting, and analyzing data without intervention from IT or data engineering teams. This feature is also completely governed by the platform owner, so data owners can control which LLMs are used and even prevent data assets from being used with this feature.
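The governance controls described above (choosing which LLMs may be used and excluding specific data assets) amount to a policy check that runs before any query is generated. The following is a minimal sketch of such a check under assumed names. `TextToSqlPolicy`, its fields, and the example model and asset names are hypothetical and are not taken from Harbr's product.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class TextToSqlPolicy:
    """Hypothetical policy: which LLMs are approved, which assets are off limits."""
    allowed_llms: Set[str] = field(default_factory=set)
    excluded_assets: Set[str] = field(default_factory=set)

    def check(self, llm: str, asset: str) -> Tuple[bool, str]:
        # An excluded asset is refused regardless of which LLM is requested.
        if asset in self.excluded_assets:
            return False, f"asset '{asset}' is excluded from text-to-SQL"
        if llm not in self.allowed_llms:
            return False, f"LLM '{llm}' is not approved"
        return True, "allowed"


# Example policy with made-up model and asset names.
example_policy = TextToSqlPolicy(
    allowed_llms={"approved-model-a"},
    excluded_assets={"payroll"},
)
```

Checking the asset exclusion first means a sensitive table stays protected even if the list of approved models later grows.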
2. Technical users
Technical users, such as data engineers and data scientists, benefit from the integration because Harbr enhances the user experience at the database level. They can take advantage of Databricks’ advanced compute capabilities through Harbr’s front end, which provides a seamless way to manage their own infrastructure and allocate compute resources in a cost-effective and scalable way. This allows technical users to focus on data science and model development without worrying about setting up the underlying infrastructure.
3. Platform operators
For platform operators who manage and orchestrate data platforms, Harbr provides visibility and control over how data is shared and accessed within and between organizations and users. This includes managing the isolation of different Databricks accounts and ensuring compliance with internal governance policies. Harbr provides full transparency into how data is used and offers flexibility in managing complex data environments, allowing operators to easily track and manage data access across multiple users and teams.
When both Databricks and Harbr are deployed together, the workflow becomes significantly more streamlined:
Using Databricks and Harbr together provides a comprehensive solution to the data consumption problem. By combining Databricks’ powerful data processing with Harbr’s user-friendly data curation and governance, data access and usage become seamless for technical and non-technical users. This ensures data can be efficiently shared, accessed, and used at scale, both within and between organizations, by removing the challenges around compliance and operational complexity.