The First Step Towards Responsible AI: Regulatory Compliance and Data Privacy

There is a growing cry from organizations, consumers, and regulators for responsible Artificial Intelligence—and with good reason. Gartner has named responsible AI the foremost trend driving near-term AI innovation. The concept encompasses several of the most important facets of business and societal use of AI, including explainability, ethics, fairness, accountability, and transparency.

Most importantly, responsible AI entails two of the most prominent issues across the data landscape at the moment: data privacy and regulatory compliance. Before organizations can ensure they're deploying fair machine learning models free of bias, they must first satisfy a growing number of data privacy laws and regulatory demands when assembling training data for those models, crafting them, testing them, and putting them into production.

According to Privacera SVP of Customer Experience Nitin Mathur, “What has not happened through the phase of the data creation and the analytics of the Artificial Intelligence part is enabling the privacy and the governance of that data: the responsibility that’s there for an individual or the company or, from this perspective, regulators, too.”

Fortunately, several tools and approaches designed to secure data governance while adhering to any number of regulations—data privacy laws and otherwise—have emerged, so organizations can clear this first hurdle on their way to practicing responsible AI. Mastering this initial facet of responsible AI has positive downstream implications for building and deploying advanced analytics models that ultimately yield the best results.

“Issues of data privacy and data governance are just heating up,” Mathur observed. “As a society, as we learn more and more about the implications of having this amount of petabytes and petabytes of data for all of us, whether it’s personal or business related, we have to answer how do we responsibly use it.”

The Dual Data Science Demand

Most AI applications begin in a data scientist sandbox in which these practitioners attempt to address business problems with advanced analytics solutions based on enterprise data. Their ability to access that data, however, often hinges on the data governance or data privacy policies their organizations have. “If you look at a typical enterprise or a business, there is a set of consumers of that data which is the data stewards, which is where all of your AI/ML things are happening,” Mathur mentioned.

In these cases, data stewards are tasked with enabling or prohibiting end-user access to data—including access for data scientists. However, there are oftentimes formal data governance councils and board-level individuals who, well aware of the slew of data privacy regulations, would rather inherently circumscribe that access to meet these requirements. Consequently, “we’re seeing a dual mandate coming from the board level for many of the bigger enterprise companies where they’re saying, ‘hey look, we need to strive for that right balance between the democratization for the data so that the responsible AI/ML analytics can be run on that, as compared to being the good citizen and not invading the privacy of the consumer data,’” Mathur revealed.

Assuring Privacy and Compliance

Building a foundation for securing regulatory compliance and data privacy begins with some of the core facets of data management related to data governance. Organizations can lay the groundwork for progressing to responsible AI by implementing the following procedures:

  • Data Discovery: The ability to preserve the privacy of consumer data and adhere to regulations is rooted in ascertaining where an organization’s data reside across its various infrastructure. Once firms know what is where, they have to determine “in which part of this data, where is the PII or some confidential information stored,” Mathur commented. “That’s the discoverability of this data.” Data profiling tools can provide this granular insight. Sensitive data catalogs use machine learning to tag such data so organizations know exactly where it is.
  • Governance Controls: Credible governance solutions have a host of mechanisms for obfuscating data (down to individual columns or cells) so data scientists, for example, can only see the parts of the data relevant to machine learning models—not sensitive PII. According to Mathur, competitive options in this space let users “encrypt or mask that information so the data steward is not even aware that that exists.”
  • Policy Creation and Enforcement: The final step involves enforcing policies—partly based on data privacy laws and regulatory requirements—to ensure governed data access. “The final part is now that I’ve secured it once, what kind of governance policies do I put in place so that the right people have access to the right information,” Mathur noted.
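The three steps above can be sketched in miniature. The snippet below is a hypothetical illustration, not any vendor’s implementation: it discovers columns whose values look like PII, masks them, and then enforces a simple role-based policy over which columns each consumer of the data may see unmasked. All column names, patterns, and roles are invented for the example.

```python
import re

# Step 1 (Data Discovery): patterns that flag values as PII.
# Real catalogs use ML classifiers; regexes stand in for them here.
PII_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def discover_pii(rows):
    """Tag every column containing at least one PII-looking value."""
    tagged = set()
    for row in rows:
        for col, value in row.items():
            if any(p.search(str(value)) for p in PII_PATTERNS.values()):
                tagged.add(col)
    return tagged

def mask(rows, hidden_columns):
    """Step 2 (Governance Controls): obfuscate columns cell by cell."""
    return [
        {col: ("***MASKED***" if col in hidden_columns else value)
         for col, value in row.items()}
        for row in rows
    ]

# Step 3 (Policy Creation and Enforcement): which columns each role
# may see unmasked. A data scientist sees model features only.
POLICY = {
    "data_scientist": {"age", "purchase_total"},
    "privacy_officer": {"age", "purchase_total", "email", "ssn"},
}

def read(rows, role):
    """Return rows with every PII column not granted to `role` masked."""
    pii = discover_pii(rows)
    hidden = pii - POLICY.get(role, set())
    return mask(rows, hidden)

customers = [
    {"email": "jane@example.com", "ssn": "123-45-6789",
     "age": 34, "purchase_total": 99.50},
]
print(read(customers, "data_scientist"))
# email and ssn are masked; age and purchase_total pass through
```

The key design point mirrors the article: discovery feeds governance, and the policy—not the data consumer—decides what is visible.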

Source Level Implementation

Creating data governance and security policies formalizes the requirement that data scientists access only the specific data (and parts of those data) that serve their particular predictive or prescriptive models. Top approaches to this and the other procedures Mathur articulated can seamlessly push those policies into individual source systems across cloud and on-premise environments. This granular degree of policy enforcement is pivotal for ensuring data scientists can meet the data privacy and compliance demands underpinning this tenet of responsible AI.
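One common way to push a central policy down into individual source systems is to compile it into native artifacts each system understands, such as governed SQL views. The sketch below is a simplified, hypothetical illustration of that idea—table, column, and source names are invented, and commercial platforms handle this at far greater scale and with finer controls.

```python
# A single central policy: which columns of a table are masked
# versus visible to ordinary consumers.
POLICY = {
    "table": "customers",
    "visible_columns": ["customer_id", "age", "purchase_total"],
    "masked_columns": ["email", "ssn"],
}

def compile_masked_view(policy, source):
    """Compile the central policy into a SQL view for one source system."""
    select_list = ", ".join(
        policy["visible_columns"]
        + [f"'***MASKED***' AS {col}" for col in policy["masked_columns"]]
    )
    return (
        f"-- generated for source: {source}\n"
        f"CREATE OR REPLACE VIEW {policy['table']}_governed AS\n"
        f"SELECT {select_list} FROM {policy['table']};"
    )

# The same policy propagates to every registered source system.
for source in ("postgres_on_prem", "warehouse_cloud"):
    print(compile_masked_view(POLICY, source))
```

Because the policy lives in one place and is compiled outward, updating it once updates enforcement everywhere—the "write once, enforce everywhere" property the source-level approach is after.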


Jelani Harper is an editorial consultant servicing the information technology market. He specializes in data-driven applications focused on semantic technologies, data governance, and analytics.

Opinions expressed by contributors are their own.