Research
At Entropy Data, we're committed to advancing the field of data governance through research and innovation.
Automating Data Governance with Generative AI
This research examines how large language models can support data governance by generating warnings about data access decisions in decentralized systems. The study introduces Governance AI, an LLM-powered tool that evaluates whether data access requests comply with data contracts, company policies, and regulations like GDPR.
Rather than making final decisions, the system provides "structured warnings and suggestions for correction to guide human experts."
This approach ensures that AI augments human decision-making rather than replacing it, maintaining the necessary human oversight for contextual and legal accuracy in data governance decisions.
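To illustrate what a structured warning could look like, here is a minimal Python sketch. It is not the Governance AI implementation: the `AccessRequest` and `GovernanceWarning` types, the `review_request` function, and the two hard-coded checks are hypothetical stand-ins for the LLM evaluation described above. The point it shows is the output shape, a list of warnings with suggested corrections and no approve or deny decision.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical access request: who wants which fields, and why."""
    purpose: str
    requested_fields: List[str]


@dataclass(frozen=True)
class GovernanceWarning:
    """Hypothetical structured warning handed to a human reviewer."""
    source: str      # "contract", "policy", or "regulation"
    message: str     # what looks non-compliant
    suggestion: str  # proposed correction


def review_request(request: AccessRequest,
                   allowed_purposes: List[str],
                   personal_fields: List[str]) -> List[GovernanceWarning]:
    """Rule-based stand-in for the LLM: returns warnings, never a decision."""
    found = []
    if request.purpose not in allowed_purposes:
        found.append(GovernanceWarning(
            source="contract",
            message=f"Purpose '{request.purpose}' is not covered by the data contract.",
            suggestion="Choose a permitted purpose or ask the provider to extend the contract.",
        ))
    for name in request.requested_fields:
        if name in personal_fields:
            found.append(GovernanceWarning(
                source="regulation",
                message=f"Field '{name}' contains personal data.",
                suggestion=f"Drop '{name}' or document a legal basis for using it.",
            ))
    return found


request = AccessRequest(purpose="marketing",
                        requested_fields=["email", "order_total"])
found = review_request(request,
                       allowed_purposes=["fraud detection"],
                       personal_fields=["email"])
for warning in found:
    print(f"[{warning.source}] {warning.message} -> {warning.suggestion}")
```

An empty list means the stand-in checks found nothing to flag. The final decision stays with the human expert either way.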
Publication Details
Authors:
- Linus W. Dietz (King's College London)
- Arif Wider (HTW Berlin)
- Simon Harrer (Entropy Data)
Conference: AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society 2025
Key Findings
- Governance AI issued 3.6 times more warnings than human experts while catching all compliance concerns
- 80% of AI-generated warnings were judged correct after secondary review
- LLM-generated synthetic test cases effectively simulated real-world governance scenarios
- Human oversight remains essential for contextual and legal accuracy
Learn More
Data Product MCP: Chat with your Enterprise Data
This research introduces a Model Context Protocol (MCP) server that enables conversational interaction with enterprise data products. The solution addresses the challenge of integrating diverse data sources into large language model workflows while maintaining data governance standards.
The framework provides a practical pathway for organizations to leverage LLMs in data discovery and analysis without compromising data protection standards.
By bridging enterprise data products and AI systems through a standardized interface, organizations can democratize data access while maintaining security and governance controls.
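The idea of a governance-aware interface can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the server from the paper and not the MCP SDK: the in-memory `CATALOG`, the `APPROVED_ACCESS` set, and the `query_data_product` function are all hypothetical. It shows one possible design in which a tool exposed to an LLM checks for an approved access agreement before returning any rows.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical catalog: data product id -> rows of that product.
CATALOG: Dict[str, List[Dict[str, Any]]] = {
    "orders": [
        {"order_id": 1, "total": 40.0},
        {"order_id": 2, "total": 125.5},
    ],
}

# Hypothetical approved access agreements as (user, data product id) pairs.
APPROVED_ACCESS: Set[Tuple[str, str]] = {("alice", "orders")}


def query_data_product(user: str, product_id: str, min_total: float) -> Dict[str, Any]:
    """Tool an LLM could call; the governance check runs before any data is read."""
    if product_id not in CATALOG:
        return {"status": "error", "reason": f"Unknown data product '{product_id}'."}
    if (user, product_id) not in APPROVED_ACCESS:
        return {"status": "denied",
                "reason": f"No approved access agreement for '{product_id}'."}
    rows = [row for row in CATALOG[product_id] if row["total"] >= min_total]
    return {"status": "ok", "rows": rows}


print(query_data_product("alice", "orders", min_total=100.0))
print(query_data_product("bob", "orders", min_total=100.0))
```

Keeping the access check inside the tool, rather than in the prompt, means the model cannot talk its way around it.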
Publication Details
Authors:
- Marco Tonnarelli
- Filippo Scaramuzza
- Simon Harrer (Entropy Data)
- Linus W. Dietz (King's College London)
Status: Accepted for publication in the Springer Communications in Computer and Information Science (CCIS) proceedings of the SummerSoC conference (arXiv preprint, May 2026)
Key Contributions
- MCP Server Implementation that bridges enterprise data products and AI systems, allowing natural language queries against structured datasets
- Governance-aware design that maintains compliance with enterprise data governance requirements
- Data product integration leveraging existing standards to create a standardized interface for LLM access
- Open architecture that is extensible and allows organizations to adapt it to their specific data infrastructures
Learn More
Data Contracts: Structuring Promises and Expectations in Data Exchange
This research proposes a general concept of data contracts that captures the perspectives of both data providers and data consumers. Most interpretations treat data contracts as one-sided, provider-driven artifacts, but robust data exchange requires expectations and promises from both sides. The paper structures a data contract into four distinct aspects: provider promises, consumer expectations, provider expectations, and consumer promises.
We suggest explicitly implementing provider expectations and consumer-driven contracts, thereby defining data contracts fully from both points of view and allowing consumers to explicitly influence agreements.
By making all four aspects explicit, the approach enables higher levels of data governance automation, such as automatically matching consumers with data products that fulfill their needs, and reduces the risk of data incidents that arise when expectations and promises remain implicit.
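As a rough illustration of how the four aspects could enable automatic matching, consider the Python sketch below. It is a simplification under our own assumptions, not the formalization from the paper: promises and expectations are modeled as plain sets of labels, and the `ProviderSide`, `ConsumerSide`, `is_match`, and `find_matching_products` names are hypothetical. A consumer matches a data product when the provider promises everything the consumer expects and the consumer promises everything the provider expects.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class ProviderSide:
    promises: FrozenSet[str]      # what the provider guarantees
    expectations: FrozenSet[str]  # what the provider requires of consumers


@dataclass(frozen=True)
class ConsumerSide:
    expectations: FrozenSet[str]  # what the consumer needs
    promises: FrozenSet[str]      # what the consumer commits to


def is_match(provider: ProviderSide, consumer: ConsumerSide) -> bool:
    """Both directions must hold: each side's expectations are covered by the other's promises."""
    return (consumer.expectations <= provider.promises
            and provider.expectations <= consumer.promises)


def find_matching_products(products: Dict[str, ProviderSide],
                           consumer: ConsumerSide) -> List[str]:
    """Return the names of all data products that fit the consumer, sorted by name."""
    return sorted(name for name, provider in products.items()
                  if is_match(provider, consumer))


products = {
    "orders_daily": ProviderSide(
        promises=frozenset({"daily refresh", "no null order ids"}),
        expectations=frozenset({"internal use only"}),
    ),
    "orders_weekly": ProviderSide(
        promises=frozenset({"weekly refresh"}),
        expectations=frozenset(),
    ),
}
consumer = ConsumerSide(
    expectations=frozenset({"daily refresh"}),
    promises=frozenset({"internal use only", "delete after 90 days"}),
)
print(find_matching_products(products, consumer))
```

Real contracts would carry typed, measurable terms (for example freshness in hours) instead of labels, so the subset test would become a per-term comparison.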
Publication Details
Authors:
- Laura Schuiki (University of Stuttgart)
- Arif Wider (HTW Berlin)
- Simon Harrer (Entropy Data)
Status: Accepted for publication in the IEEE proceedings of the International Conference on Web Services (ICWS) 2026
Key Contributions
- A general concept of data contracts spanning four aspects: provider promises, consumer expectations, provider expectations, and consumer promises
- Consumer-driven perspective that lets consumers voice expectations and make promises, not just receive provider guarantees
- Higher governance automation through formalized aspects that enable automatic matching of data products to consumers
- Standards analysis of how ODCS and ODPS support the concept, validated through expert interviews across three companies
Learn More
Impact on Our Products
The findings from these studies inform the development of our products, including the AI-powered governance features in Entropy Data. By exploring how AI can support and enhance human decision-making in data governance, we aim to make data more accessible, secure, and compliant across organizations.