Research

At Entropy Data, we're committed to advancing the field of data governance through research and innovation.

Automating Data Governance with Generative AI

This research examines how large language models can support data governance by generating warnings about data access decisions in decentralized systems. The study introduces Governance AI, an LLM-powered tool that evaluates whether data access requests comply with data contracts, company policies, and regulations such as GDPR.

Rather than making final decisions, the system provides "structured warnings and suggestions for correction to guide human experts."

This approach ensures that AI augments human decision-making rather than replacing it, maintaining the necessary human oversight for contextual and legal accuracy in data governance decisions.
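To make the idea of structured warnings concrete, the following is a minimal Python sketch of what one such warning and its human review step could look like. It is not the schema used by Governance AI; the class names, fields, and severity levels are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Severity(Enum):
    # Hypothetical severity levels; the paper does not specify any.
    INFO = "info"
    WARNING = "warning"
    CRITICAL = "critical"


@dataclass
class GovernanceWarning:
    """One structured warning about a data access request (assumed schema)."""
    rule_source: str   # e.g. "data contract", "company policy", "GDPR"
    message: str       # what the model flagged
    suggestion: str    # proposed correction for the human expert
    severity: Severity = Severity.WARNING


@dataclass
class AccessRequestReview:
    """Collects model warnings; the decision field is only set by a human."""
    request_id: str
    warnings: list = field(default_factory=list)
    human_decision: Optional[str] = None

    def needs_attention(self) -> bool:
        # Anything above INFO should be looked at by an expert.
        return any(w.severity is not Severity.INFO for w in self.warnings)

    def decide(self, reviewer: str, approved: bool) -> None:
        # The verdict is recorded against a named reviewer, never the model.
        verdict = "approved" if approved else "rejected"
        self.human_decision = f"{verdict} by {reviewer}"
```

In this sketch the model can only append warnings; the request stays undecided until `decide` is called on behalf of a reviewer, which mirrors the human-in-the-loop design described above.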

Publication Details

Authors:

Conference: AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society 2025

Key Findings

  • Governance AI issued 3.6 times more warnings than human experts while catching all compliance concerns
  • 80% of AI-generated warnings were judged correct after secondary review
  • LLM-generated synthetic test cases effectively simulated real-world governance scenarios
  • Human oversight remains essential for contextual and legal accuracy


Data Product MCP: Chat with your Enterprise Data

This research introduces a Model Context Protocol (MCP) server that enables conversational interaction with enterprise data products. The solution addresses the challenge of integrating diverse data sources into large language model workflows while maintaining data governance standards.

The framework provides a practical pathway for organizations to leverage LLMs in data discovery and analysis without compromising data protection standards.

By bridging enterprise data products and AI systems through a standardized interface, organizations can democratize data access while maintaining security and governance controls.

Publication Details

Authors:

  • Marco Tonnarelli
  • Filippo Scaramuzza
  • Simon Harrer (Entropy Data)
  • Linus W. Dietz (King's College London)

Status: Accepted for publication in the Springer Communications in Computer and Information Science (CCIS) proceedings of the SummerSoC conference (arXiv preprint, May 2026)

Key Contributions

  • MCP server implementation that bridges enterprise data products and AI systems, allowing natural language queries against structured datasets
  • Governance-aware design that maintains compliance with enterprise data governance requirements
  • Data product integration leveraging existing standards to create a standardized interface for LLM access
  • Open architecture that is extensible and allows organizations to adapt it to their specific data infrastructures
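As a rough illustration of the governance-aware design listed above, the sketch below handles two MCP-style JSON-RPC 2.0 methods (`tools/list` and `tools/call`) using only the Python standard library and checks access approval before returning anything. It is not the Data Product MCP implementation: the catalog, the tool name `describe_data_product`, the `caller` argument, and the approval model are all hypothetical, and the initialization handshake and transport layer of a real MCP server are omitted.

```python
import json

# Hypothetical in-memory catalog; a real server would query a data product platform.
DATA_PRODUCTS = {
    "orders": {
        "description": "Order events",
        "approved_consumers": {"analytics-team"},
    },
}

# Tool listing in the shape returned by "tools/list" (name, description, inputSchema).
TOOLS = [{
    "name": "describe_data_product",
    "description": "Return metadata for a data product the caller may access.",
    "inputSchema": {
        "type": "object",
        "properties": {"product_id": {"type": "string"}},
        "required": ["product_id"],
    },
}]


def text_result(text, is_error=False):
    # Tool results carry a list of content items plus an error flag.
    return {"content": [{"type": "text", "text": text}], "isError": is_error}


def handle_request(raw, caller):
    """Dispatch one JSON-RPC request string and return the response string."""
    request = json.loads(raw)
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call":
        params = request.get("params", {})
        product_id = params.get("arguments", {}).get("product_id")
        product = DATA_PRODUCTS.get(product_id)
        if params.get("name") != "describe_data_product" or product is None:
            result = text_result("Unknown tool or data product.", is_error=True)
        elif caller not in product["approved_consumers"]:
            # Governance check runs before any metadata or data is returned.
            result = text_result("Access not approved for this data product.", is_error=True)
        else:
            result = text_result(product["description"])
    else:
        # -32601 is the JSON-RPC 2.0 code for "Method not found".
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "result": result})
```

Placing the approval check inside the tool handler means the language model never sees data from a product its caller has not been granted, which is one simple way to keep governance controls in the access path.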


Data Contracts: Structuring Promises and Expectations in Data Exchange

This research proposes a general concept of data contracts that captures the perspectives of both data providers and data consumers. Most interpretations treat data contracts as one-sided, provider-driven artifacts, but robust data exchange requires expectations and promises from both sides. The paper structures a data contract into four distinct aspects: provider promises, consumer expectations, provider expectations, and consumer promises.

We suggest explicitly implementing provider expectations and consumer-driven contracts, thereby defining data contracts fully from both points of view and allowing consumers to explicitly influence agreements.

By making all four aspects explicit, the approach enables higher levels of data governance automation, such as automatically matching consumers with data products that fulfill their needs, and reduces the risk of data incidents that arise when expectations and promises remain implicit.
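The matching idea can be sketched in a few lines of Python. This is a simplified illustration, not the paper's formalization: each of the four aspects is reduced to a set of plain string labels, and the class and function names are assumptions. Real contracts would express these terms as structured schema, quality, and service-level definitions.

```python
from dataclasses import dataclass, field


@dataclass
class ProviderSide:
    promises: set = field(default_factory=set)       # what the provider guarantees
    expectations: set = field(default_factory=set)   # what it requires of consumers


@dataclass
class ConsumerSide:
    expectations: set = field(default_factory=set)   # what the consumer needs
    promises: set = field(default_factory=set)       # what it commits to in return


def unmet_terms(provider, consumer):
    """Return the terms on each side that the other side does not cover."""
    return {
        "consumer_expectations": consumer.expectations - provider.promises,
        "provider_expectations": provider.expectations - consumer.promises,
    }


def matching_products(consumer, catalog):
    """List data products whose contract is satisfied in both directions."""
    return sorted(
        name
        for name, provider in catalog.items()
        if not any(unmet_terms(provider, consumer).values())
    )


# Hypothetical example data.
catalog = {
    "orders": ProviderSide(
        promises={"daily-refresh", "pii-masked"},
        expectations={"internal-use-only"},
    ),
    "clicks": ProviderSide(promises={"hourly-refresh"}),
}
consumer = ConsumerSide(
    expectations={"pii-masked"},
    promises={"internal-use-only"},
)
print(matching_products(consumer, catalog))  # prints ['orders']
```

Because both directions are checked, a product is only matched when the provider covers the consumer's expectations and the consumer covers the provider's expectations. The `unmet_terms` result also shows which implicit gap would otherwise surface later as a data incident.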

Publication Details

Authors:

Status: Accepted for publication in the IEEE proceedings of the International Conference on Web Services (ICWS) 2026

Key Contributions

  • A general concept of data contracts spanning four aspects: provider promises, consumer expectations, provider expectations, and consumer promises
  • Consumer-driven perspective that lets consumers voice expectations and make promises, not just receive provider guarantees
  • Higher governance automation through formalized aspects that enable automatic matching of data products to consumers
  • Standards analysis of how ODCS and ODPS support the concept, validated through expert interviews across three companies


Impact on Our Products

The findings from this research inform the development of our products, including the AI-powered governance features in Entropy Data. By exploring how AI can support and enhance human decision-making in data governance, we aim to make data more accessible, secure, and compliant across organizations.