What is Azure Purview and why are we excited about it?
Azure Purview, which is currently in preview, is a long-awaited tool from Microsoft designed to centralize data governance across an organization’s entire environment, including on-premises databases, cloud databases, SaaS data, and virtually any other data source or platform.
What does Azure Purview do?
Azure Purview is designed to improve data governance by making organizational data discoverable, traceable, and searchable. Employees can search for and discover data that already exists, which streamlines data operations and prevents duplicate or redundant projects across multiple teams.
For example, a big-box retailer or e-commerce site may have multiple marketing teams, with individual teams supporting different lines of business, as well as teams dedicated to maintaining the website and managing digital marketing efforts.
If this hypothetical retailer doesn’t have Azure Purview, an employee on a team supporting a given line of business (let’s say PC hardware, for the sake of the example) may want a Power BI dashboard visualizing web traffic metrics for all company webpages relating to PC hardware. They may not know whether such a dashboard already exists or, even if they do, which team or person to contact for access. As a result, the employee would likely ask a BI engineer to build the dashboard.
If the hypothetical retailer does have Azure Purview, then it becomes easy for the employee to search for an existing dashboard with the data they need. In this situation, Azure Purview would also provide the employee with the information for the dashboard owner so that they can reach out directly for access, eliminating the need for a BI engineer to build a duplicate dashboard.
Azure Purview can also provide data traceability and lineage, assuming the environment is configured to allow for it (more on this limitation below). These features allow employees to see exactly where data is coming from. This can help an employee validate that a given dashboard or figure includes all of the data they expect, as well as identify whether it contains data they do not want to include.
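To make the idea of lineage concrete, here is a minimal sketch in Python. It does not use the Azure Purview API; the asset names and the `LINEAGE` map are hypothetical and exist only to show how tracing a dashboard back to its sources can work in principle.

```python
from collections import deque

# Hypothetical lineage map: each asset lists the assets it is built from.
LINEAGE = {
    "pc_hardware_dashboard": ["web_traffic_model"],
    "web_traffic_model": ["clickstream_raw", "product_catalog"],
    "clickstream_raw": [],
    "product_catalog": [],
}


def upstream_sources(asset, lineage):
    """Return every asset that feeds `asset`, directly or indirectly, breadth-first."""
    seen = []
    queue = deque(lineage.get(asset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen


sources = upstream_sources("pc_hardware_dashboard", LINEAGE)
print(sources)
# ['web_traffic_model', 'clickstream_raw', 'product_catalog']
```

An employee looking at the dashboard could use a view like this to confirm that it draws on the clickstream and product data they expect, and nothing else.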
How does it do this?
The secret sauce for making Azure Purview work is metadata. Metadata, as the name may suggest, is data about data. It typically includes markers like what or who created the data, when it was created, what type of data it is, and so on.
An example of metadata would be the detailed information attached to a digital image. If a Windows user right-clicks an image on their computer and selects “Properties”, a window opens where they can view and edit the image’s metadata on the “Details” tab.
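The same kind of information can be read programmatically. The sketch below uses only the Python standard library to read basic file system metadata; it is an illustration of what metadata looks like, not of how Azure Purview scans data sources. The `describe_file` helper and the throwaway sample file are assumptions made for the example.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Collect a few pieces of file system metadata for the file at `path`."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
    }


# Create a throwaway file so the example runs anywhere.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as handle:
    handle.write(b"hello")
    sample_path = handle.name

metadata = describe_file(sample_path)
print(metadata)

os.remove(sample_path)  # clean up the sample file
```

Note that none of these fields describe the contents of the file itself. They describe the file, which is exactly what makes metadata useful for cataloging.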
Azure Purview extracts and organizes metadata from across an organization’s IT environment to discover, tag, and catalog data while enabling data lineage traceability. Metadata tags also allow organizations to restrict access to data according to their own rules.
Why is this such a big deal?
The capabilities provided by Azure Purview stand to make it a vital component of virtually every organization’s Azure cloud environment. The data discovery and traceability capabilities allow teams to operate more efficiently and eliminate the time and cost associated with building redundant dashboards and data pipelines. This, in turn, allows BI and data engineers to spend more time on projects that add to the organization’s bottom line or on requests for visualizations that do not already exist.
At this point, one might be thinking, “That sounds useful, but how do I protect my sensitive data like PII, HIPAA-protected medical records, or cardholder data covered by PCI DSS from being searchable and discoverable?”
One of the key features that make Azure Purview a big deal directly addresses this challenge: users can map and control sensitive data (including during data discovery) by marking the data as sensitive in its metadata. Organizations can then apply security group policies on top of Azure Purview to restrict who can search for that data. This lets organizations realize the benefits of Azure Purview even when sensitive or protected data is involved, while still maintaining regulatory compliance.
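The sketch below illustrates the principle of filtering search results by a sensitivity marker and group membership. It is a simplified model in plain Python, not the Azure Purview API or its policy engine; the `CatalogEntry` class, the sample entries, and the `sensitive-data-readers` group name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str
    owner: str
    tags: set = field(default_factory=set)
    sensitive: bool = False  # hypothetical sensitivity marker stored in metadata


# Hypothetical catalog with one open asset and one sensitive asset.
CATALOG = [
    CatalogEntry(
        "PC hardware web traffic dashboard",
        "web-analytics@example.com",
        {"web traffic", "pc hardware"},
    ),
    CatalogEntry(
        "Customer payment records",
        "finance@example.com",
        {"payments", "pc hardware"},
        sensitive=True,
    ),
]


def search(catalog, term, user_groups):
    """Return entries matching `term` that the user's groups are allowed to see."""
    term = term.lower()
    results = []
    for entry in catalog:
        matches = term in entry.name.lower() or term in entry.tags
        # Sensitive entries are visible only to members of the privileged group.
        allowed = not entry.sensitive or "sensitive-data-readers" in user_groups
        if matches and allowed:
            results.append(entry)
    return results


marketing_hits = search(CATALOG, "pc hardware", user_groups={"marketing"})
compliance_hits = search(CATALOG, "pc hardware", user_groups={"sensitive-data-readers"})

print([entry.name for entry in marketing_hits])
print([entry.name for entry in compliance_hits])
```

In this model, the marketing employee from the earlier example finds the dashboard and its owner, while the payment records stay invisible to anyone outside the privileged group.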
Azure Purview provides fantastic functionality for virtually any organization. However, there is one limitation we feel we should mention.
The data traceability features provided by Azure Purview generally require the use of Azure data orchestration and pipeline tools to be functional. Organizations can still leverage other Azure Purview features without using Azure orchestration tools for their data pipelines, but information on data lineage may be limited.
Implementing Azure Purview
Organizations around the globe that already have mature data environments on Azure may be wondering, “How do I implement Azure Purview in my complex environment?”
Neal Analytics is a Microsoft Gold Partner with over a decade of experience driving highly complex data migration and modernization projects, as well as implementing AI and machine learning solutions across virtually all industries and use cases. Our team of data scientists and consulting experts can help remove complications and provide clarity into implementing Azure Purview.
Additionally, Neal can help organizations that want to re-orchestrate their data pipelines to take full advantage of Azure Purview’s capabilities. Contact us to learn more.