A proposal to add a “dataMiningPreference” property to metadata, preserving the digital content’s original intent and respecting creators’ rights.
This EIP proposes a standardized approach to declaring data mining preferences for digital media content on EVM-compatible blockchains. It extends digital media metadata standards like ERC-7053 and NFT metadata standards like ERC-721 and ERC-1155, allowing asset creators to specify how their assets may be used in data mining, AI training, and machine learning workflows.
Motivation
As digital assets are increasingly used in AI and machine learning workflows, it is critical that the rights and preferences of asset creators and license owners are respected, and that AI/ML creators can check and collect data easily and safely. Content owners and creators are looking for more direct control over how their creative works are used, similar to what robots.txt offers for websites.
This proposal introduces a standardized method of declaring these preferences. Adding dataMiningPreference to the content metadata allows creators to state whether the asset may be used as part of a data mining or AI/ML training workflow. This ensures the original intent of the content is maintained.
For AI-focused applications, this information serves as a guideline, facilitating the ethical and efficient use of content while respecting the creator’s rights and building a sustainable data mining and AI/ML environment.
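As a rough illustration of how an AI-focused application might consult this property before collecting an asset, here is a minimal Python sketch. The keys nested under dataMiningPreference ("dataMining", "aiTraining") and the values "allowed" / "notAllowed" are assumptions loosely modeled on C2PA-style vocabulary, not the schema defined by the proposal. The helper name is_use_allowed is hypothetical too, as is the choice to treat a missing declaration as "no consent".

```python
import json

# Hypothetical NFT-style metadata. The layout under "dataMiningPreference"
# is an assumption for illustration only, not the proposal's schema.
METADATA_JSON = """
{
  "name": "Example artwork",
  "dataMiningPreference": {
    "dataMining": "allowed",
    "aiTraining": "notAllowed"
  }
}
"""


def is_use_allowed(metadata: dict, use: str) -> bool:
    """Return True only if the metadata explicitly marks `use` as allowed."""
    prefs = metadata.get("dataMiningPreference")
    if not isinstance(prefs, dict):
        # Design choice of this sketch: no declaration means no consent assumed.
        return False
    return prefs.get(use) == "allowed"


metadata = json.loads(METADATA_JSON)
print(is_use_allowed(metadata, "dataMining"))
print(is_use_allowed(metadata, "aiTraining"))
```

A data collection pipeline could call a check like this on each asset and skip any asset whose creator has not explicitly allowed the intended use.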
The introduction of the dataMiningPreference property in digital asset metadata addresses the following considerations:
- Accessibility: A clear and easily accessible method, both human-readable and machine-readable, for digital asset creators and license owners to express their preferences for how their assets are used in data mining and AI/ML training workflows. AI/ML creators can check and collect data systematically.
- Adoption: As the Coalition for Content Provenance and Authenticity (C2PA) already outlines guidelines for indicating whether an asset may be used in data mining or AI/ML training, it’s crucial that onchain metadata aligns with these standards. This ensures compatibility between in-media metadata and onchain records.
Please see the latest proposal here and provide your comments below. Thanks!