The primary role of a data catalog is to create an inventory of all the data within an enterprise, most commonly for data governance and in some cases for distributed queries and access management.
Gartner expands upon this definition as follows:
"A data catalog creates and maintains an inventory of data assets through the discovery, description and organization of distributed datasets. The data catalog provides context to enable data stewards, data/business analysts, data engineers, data scientists and other line of business (LOB) data consumers to find and understand relevant datasets for the purpose of extracting business value. Modern machine-learning-augmented data catalogs automate various tedious tasks involved in data cataloging, including metadata discovery, ingestion, translation, enrichment and the creation of semantic relationships between metadata. These next-generation data catalogs can therefore propel enterprise metadata management projects by allowing business users to participate in understanding, enriching and using metadata to inform and further their data and analytics initiatives." [1]
A data exchange (sometimes called a data marketplace) is a digital platform that manages data as a product to make it easy to find, use, manage and monetize. It connects data suppliers and consumers through a seamless experience.
Data exchanges can be public, in which case the data is typically exchanged as part of a commercial transaction. Or increasingly, they are deployed within an enterprise to provide business stakeholders with seamless access to curated data products. In this scenario, the interactions can be either transactional (find, access, use) or collaborative (bringing stakeholders together to customize data products).
Eckerson Group Report: Rise of Data Exchanges
Data catalogs and data exchanges exist for similar reasons. With the enormous volume of data in various repositories and aggregations inside an enterprise, it’s difficult to understand what data is available, valuable and useful for any particular business use case.
The data catalog addresses this challenge by creating a searchable inventory of all that data. The data exchange enables enterprises to curate the best and/or most in-demand data for easy access and consumption.
If you think of a traditional retail business, a data catalog would be the inventory list of everything in the warehouse. It tells you what you have, where to find each item and some details about it (metadata). However, you do not invite your customers to shop in your warehouse or from your inventory list, because neither makes it easy for them to find and understand what they’re looking for. Items are packaged differently, and some may be broken or expired. Instead, you curate the best items and display them in your storefront, where everything the customer needs to know to make a decision is readily available.
A data exchange, on the other hand, is focused on quickly realizing target outcomes from data products. Instead of listing all the data you have, it converts curated data assets into data products that are easy to find, use, manage and monetize.
Not only can data catalogs and data exchanges co-exist, but in an enterprise environment they absolutely should. Data catalogs are useful for managing the entire data estate and for supporting compliance and governance requirements. Data exchanges curate the subset of that data that is most useful for driving business insights and outcomes. The data catalog would still include the data products that sit within the data exchange.
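To make the relationship concrete, here is a minimal Python sketch of a catalog as a full inventory and an exchange as a curated layer on top of it. It does not reflect any vendor's API: the class names (`DataAsset`, `DataCatalog`, `DataExchange`), the `quality_checked` flag and the sample datasets are all hypothetical, chosen only to illustrate the idea that every data product in the exchange is also an entry in the catalog.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """One catalog entry: a dataset plus descriptive metadata."""
    name: str
    location: str
    owner: str
    tags: set[str] = field(default_factory=set)
    quality_checked: bool = False  # hypothetical curation flag


class DataCatalog:
    """Searchable inventory of every known asset, curated or not."""

    def __init__(self) -> None:
        self._assets: dict[str, DataAsset] = {}

    def register(self, asset: DataAsset) -> None:
        self._assets[asset.name] = asset

    def get(self, name: str) -> DataAsset:
        return self._assets[name]

    def search(self, tag: str) -> list[DataAsset]:
        return [a for a in self._assets.values() if tag in a.tags]

    def all_assets(self) -> list[DataAsset]:
        return list(self._assets.values())


@dataclass
class DataProduct:
    """A curated asset packaged with a consumer-facing description."""
    asset: DataAsset
    description: str


class DataExchange:
    """Storefront that exposes only curated assets as data products."""

    def __init__(self, catalog: DataCatalog) -> None:
        self._catalog = catalog
        self._products: dict[str, DataProduct] = {}

    def publish(self, asset_name: str, description: str) -> DataProduct:
        # Products are always drawn from the catalog, so the catalog
        # remains the complete inventory.
        asset = self._catalog.get(asset_name)
        if not asset.quality_checked:
            raise ValueError(f"{asset_name} has not been curated yet")
        product = DataProduct(asset, description)
        self._products[asset_name] = product
        return product

    def products(self) -> list[DataProduct]:
        return list(self._products.values())


if __name__ == "__main__":
    catalog = DataCatalog()
    catalog.register(DataAsset("customer_orders", "warehouse.sales.orders",
                               "sales-ops", {"sales"}, quality_checked=True))
    catalog.register(DataAsset("raw_clickstream", "lake/raw/clicks",
                               "web-team", {"web"}))

    exchange = DataExchange(catalog)
    exchange.publish("customer_orders", "Cleaned daily order history")

    print(len(catalog.all_assets()), "assets in the catalog")
    print(len(exchange.products()), "product(s) in the exchange")
```

In this sketch the exchange never holds anything the catalog does not know about, which mirrors the warehouse and storefront analogy: the inventory list covers everything, while the storefront shows only what has been checked and packaged.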
In our conversations with one of the world’s biggest banks, the bank indicated that while it is happy with its data catalog (from a well-known provider), the catalog doesn’t go far enough to make the data in it usable and useful within the business. As a result, the bank is launching an enterprise data exchange initiative to help it realize more value from data, and faster.
If you’d like to better understand how a data exchange can add value to your business, explore the Harbr platform.
[1] Gartner, "Augmented Data Catalogs: Now an Enterprise Must-Have for Data and Analytics Leaders," Ehtisham Zaidi and Guido De Simoni, 12 September 2019.