L3 Formal Requirements
This document specifies the formal requirements for achieving Makoto Level 3: Provenance is Unforgeable. Level 3 builds upon Level 2 by adding infrastructure isolation guarantees—attestations are generated by a trusted platform that user code cannot compromise, making provenance unforgeable even by malicious or compromised data producers.
Summary: L3 requires platform-isolated attestation generation, in which signing keys and hash computation are controlled by trusted infrastructure outside the data processing tenant's control. Even a fully compromised producer cannot forge attestations; at worst it can refuse to produce data, or produce garbage data that is accurately attested as its own.
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
| Term | Definition |
|---|---|
| Producer | Entity that creates data attestations. At L3, producers delegate attestation generation to the platform. |
| Platform | Trusted infrastructure that executes data pipelines and generates attestations on behalf of producers. |
| Control Plane | The privileged layer of the platform that manages attestation generation, isolated from tenant code. |
| Data Plane | The layer where tenant data processing code executes, with no access to signing keys. |
| Tenant | A user or organization running data pipelines on the platform. Tenants cannot access control plane resources. |
| HSM | Hardware Security Module—tamper-resistant hardware for secure key storage and cryptographic operations. |
| TEE | Trusted Execution Environment—hardware-isolated execution environment (e.g., Intel SGX, ARM TrustZone). |
| Isolation Boundary | The security perimeter separating tenant code from platform attestation infrastructure. |
Isolation Model
L3 security relies on a clear separation between the data plane (where tenant code runs) and the control plane (where attestations are generated). This isolation ensures that even a fully compromised tenant cannot forge attestations.
┌─────────────────────────────────────────────────────────────────────┐
│                  CONTROL PLANE (Trusted Platform)                   │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐   │
│  │ Signing Service  │  │  Hash Computer   │  │  Audit Logger    │   │
│  │   (HSM-backed)   │  │ (Deterministic)  │  │   (Immutable)    │   │
│  └────────┬─────────┘  └────────┬─────────┘  └────────┬─────────┘   │
│           │                     │                     │             │
│           └──────────┬──────────┴──────────┬──────────┘             │
│                      ▼                     ▼                        │
│              ┌──────────────────────────────────┐                   │
│              │      Attestation Generator       │                   │
│              │      (Platform-controlled)       │                   │
│              └──────────────┬───────────────────┘                   │
├─────────────────────────────┼───────────────────────────────────────┤
│                             │  ISOLATION BOUNDARY                   │
├─────────────────────────────┼───────────────────────────────────────┤
│  DATA PLANE (Tenant Code)   ▼                                       │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │  ┌────────────┐    ┌────────────┐    ┌────────────┐          │   │
│  │  │ Input Data │ -> │ Transform  │ -> │ Output Data│          │   │
│  │  │            │    │ (User Code)│    │            │          │   │
│  │  └────────────┘    └────────────┘    └────────────┘          │   │
│  │                                                              │   │
│  │  No access to: signing keys, attestation generation,         │   │
│  │                hash computation, audit logs                  │   │
│  └──────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
Key principle: The tenant can control what data is processed, but the platform controls how that processing is attested. A compromised tenant can produce garbage data, but cannot claim it came from a legitimate source or went through legitimate transformations.
Producer Requirements
Producers MUST meet all L2 requirements plus the following additional requirements. Note that at L3, many L2 producer responsibilities shift to the platform.
Platform Integration
Data processing pipelines MUST execute on a platform that meets L3 platform requirements. Self-hosted infrastructure does not qualify for L3 unless it implements the required isolation controls and is independently audited.
Verification
Verifiers check that the attestation's executor.platform identifies an L3-compliant platform and that the signing identity belongs to the platform, not the tenant.
Producers MUST NOT generate or sign attestations directly. All attestation generation MUST be delegated to the platform's control plane. Producers MAY provide metadata (such as transform descriptions) but MUST NOT control signing.
Producers MUST declare their pipeline configuration (inputs, transforms, outputs) in a platform-readable format. The platform uses this declaration to generate accurate attestations.
{
  "pipeline": {
    "id": "customer-etl-v2",
    "inputs": [{
      "source": "kafka://broker/raw_customers",
      "expectedOriginLevel": "L2"
    }],
    "transform": {
      "type": "https://makoto.dev/transforms/anonymization",
      "codeRef": "git+https://github.com/example/[email protected]"
    },
    "outputs": [{
      "destination": "kafka://broker/processed_customers"
    }]
  }
}
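A platform would presumably reject malformed declarations before scheduling a pipeline. The sketch below shows one possible shape check in Python. The function name `validate_declaration` and the set of required fields are assumptions inferred only from the example declaration; Makoto does not define this schema here.

```python
import json

# Hypothetical required transform fields, inferred from the example only.
REQUIRED_TRANSFORM_FIELDS = ("type", "codeRef")


def validate_declaration(raw: str) -> list[str]:
    """Return the problems found in a pipeline declaration (empty if none)."""
    document = json.loads(raw)
    pipeline = document.get("pipeline") if isinstance(document, dict) else None
    if not isinstance(pipeline, dict):
        return ["missing 'pipeline' object"]

    problems = []
    if not pipeline.get("id"):
        problems.append("missing pipeline id")

    inputs = pipeline.get("inputs") or []
    if not inputs:
        problems.append("at least one input is required")
    for i, item in enumerate(inputs):
        if "source" not in item:
            problems.append(f"inputs[{i}] has no source")

    transform = pipeline.get("transform") or {}
    for name in REQUIRED_TRANSFORM_FIELDS:
        if name not in transform:
            problems.append(f"transform has no {name}")

    outputs = pipeline.get("outputs") or []
    if not outputs:
        problems.append("at least one output is required")
    for i, item in enumerate(outputs):
        if "destination" not in item:
            problems.append(f"outputs[{i}] has no destination")
    return problems
```

Returning a list of problems rather than raising on the first one lets the platform report every declaration error to the tenant in a single pass.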
Data Handoff
All data ingress and egress MUST pass through platform-controlled channels where the platform can compute hashes and generate attestations. Direct data access that bypasses the platform is not permitted for L3 attestation.
Producers SHOULD use deterministic transformation logic where the same input always produces the same output. This enables verification that the declared transform matches actual behavior. Non-deterministic transforms (e.g., using current time, random values) SHOULD document their non-determinism in the pipeline declaration.
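As a rough illustration of the determinism recommendation, a producer could smoke-test a transform by running it several times on the same input and comparing output digests. This is a minimal sketch: `looks_deterministic` and the two sample transforms are hypothetical names, and identical repeated runs are only evidence of determinism, not proof.

```python
import hashlib
import os
from typing import Callable


def looks_deterministic(transform: Callable[[bytes], bytes],
                        sample: bytes, runs: int = 3) -> bool:
    """Run the transform repeatedly on one input and compare output digests."""
    digests = {hashlib.sha256(transform(sample)).hexdigest() for _ in range(runs)}
    return len(digests) == 1


def uppercase(data: bytes) -> bytes:
    # Same input always yields the same output.
    return data.upper()


def salted(data: bytes) -> bytes:
    # Appends random bytes, so the output differs between runs.
    return data + os.urandom(16)
```

A transform like `salted` is the kind whose non-determinism the pipeline declaration should document.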
Platform Requirements
Platforms supporting L3 attestation MUST implement the following isolation and security controls. These requirements are the core differentiator between L2 and L3.
Attestation Generation
All attestation generation MUST occur in a control plane that is isolated from tenant code. The isolation MUST prevent tenant code from:
- Directly invoking attestation generation with arbitrary content
- Modifying attestations after generation
- Accessing or exfiltrating signing keys
- Tampering with hash computation
Verification
Platform undergoes independent security audit confirming isolation controls. Audit attestation is published and referenced in platform documentation.
Attestations MUST reflect the platform's observation of actual data flow, not tenant-provided claims. The platform MUST independently determine:
- Input data sources and their digests (computed by platform)
- Output data destinations and their digests (computed by platform)
- Transform code that was actually executed
- Timestamps of actual execution
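The list above can be sketched as a control-plane function that derives every digest and the timestamp itself, and accepts only descriptive metadata from the tenant. All field names (`sha256`, `executedAt`, `description`) and the function `build_attestation` are illustrative assumptions, not the Makoto attestation format.

```python
import hashlib
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_attestation(inputs: dict[str, bytes], outputs: dict[str, bytes],
                      transform_code: bytes, tenant_metadata: dict) -> dict:
    """Assemble an attestation from what the platform observed.

    Only the free-text description comes from the tenant. Digests and the
    timestamp are computed here, so tenant-supplied values for them are ignored.
    """
    return {
        "inputs": [{"source": name, "sha256": sha256_hex(data)}
                   for name, data in inputs.items()],
        "outputs": [{"destination": name, "sha256": sha256_hex(data)}
                    for name, data in outputs.items()],
        "transform": {
            "sha256": sha256_hex(transform_code),
            "description": str(tenant_metadata.get("description", "")),
        },
        "executedAt": datetime.now(timezone.utc).isoformat(),
    }
```

Because the function never reads a digest from `tenant_metadata`, a tenant that submits a forged hash simply has it dropped.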
Attestations MUST be signed with a platform-controlled identity, not a tenant identity. The signing identity MUST be verifiable as belonging to the platform (e.g., via X.509 certificate with platform-specific attributes, or Sigstore identity bound to platform OIDC issuer).
Key Isolation
Platform signing keys MUST be stored in a Hardware Security Module (HSM) or equivalent hardware-backed secure storage. Keys MUST NOT be extractable from the HSM. Acceptable options:
- Cloud HSM — AWS CloudHSM, Azure Dedicated HSM, GCP Cloud HSM
- Cloud KMS with HSM backing — AWS KMS, Azure Key Vault HSM, GCP Cloud KMS
- On-premises HSM — FIPS 140-2 Level 3 or higher certified devices
Verification
Platform publishes HSM attestation or provides audit report confirming HSM-backed key storage. Key material provenance is documented.
The platform MUST ensure that tenant code running in the data plane has no mechanism to:
- Read signing key material
- Request signatures on arbitrary data
- Enumerate or discover signing keys
- Influence key selection for signing operations
Platform signing keys SHOULD be rotated at least annually. Key rotation MUST NOT invalidate previously generated attestations. The platform SHOULD maintain a verifiable history of signing keys and their validity periods.
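One way to keep old attestations verifiable after rotation is to publish each key's validity period and have verifiers check the signing time against it. The sketch below assumes a hypothetical `KeyPeriod` record and `key_valid_at` helper; how the history itself is authenticated is out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class KeyPeriod:
    key_id: str
    not_before: datetime
    not_after: Optional[datetime]  # None means the key is still active


def key_valid_at(history: list[KeyPeriod], key_id: str, signed_at: datetime) -> bool:
    """True if key_id was within a published validity period at signed_at."""
    for period in history:
        if period.key_id != key_id:
            continue
        ends = period.not_after
        if period.not_before <= signed_at and (ends is None or signed_at < ends):
            return True
    return False


# Example history: k1 was rotated out on 2024-01-01 and replaced by k2.
HISTORY = [
    KeyPeriod("k1", datetime(2023, 1, 1, tzinfo=timezone.utc),
              datetime(2024, 1, 1, tzinfo=timezone.utc)),
    KeyPeriod("k2", datetime(2024, 1, 1, tzinfo=timezone.utc), None),
]
```

An attestation signed by `k1` in mid-2023 still verifies after rotation, while a signature claiming `k1` at a later date does not.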
Hash Computation
All cryptographic hashes of input and output data MUST be computed by the platform control plane, not by tenant code. This prevents tenants from providing false hash values. Hash computation MUST occur at data ingress/egress points controlled by the platform.
Hash computation MUST be deterministic—the same data MUST always produce the same hash. For structured data, the platform MUST define and document the canonical serialization format used for hashing (e.g., canonical JSON, sorted keys, specific encoding).
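To illustrate why a canonical serialization matters, the sketch below hashes a JSON record after sorting keys and removing insignificant whitespace, so that two logically equal records yield the same digest. This is a simplified stand-in chosen for the example, not a full canonicalization scheme such as RFC 8785 (number formatting, for instance, is not normalized), and `canonical_digest` is a hypothetical name.

```python
import hashlib
import json


def canonical_digest(record: dict) -> str:
    """SHA-256 over a key-sorted, whitespace-free UTF-8 JSON encoding."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this encoding, `{"a": 1, "b": 2}` and `{"b": 2, "a": 1}` hash identically, whereas hashing each record's original byte layout would not guarantee that.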
For streaming data, the platform SHOULD support incremental hash computation (e.g., Merkle trees) that allows verification of individual records or windows without reprocessing entire datasets.
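A minimal sketch of the Merkle-tree idea for one stream window follows: the platform attests a single root, and a verifier checks one record with a short sibling path instead of rehashing the window. The construction here (duplicating the last node on odd levels) is a simplification assumed for illustration and is not the RFC 6962 tree layout used by transparency logs; all function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(record: bytes) -> bytes:
    return _h(b"\x00" + record)  # domain-separate leaves from interior nodes


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def _next_level(level: list[bytes]) -> list[bytes]:
    return [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(records: list[bytes]) -> bytes:
    if not records:
        raise ValueError("window must contain at least one record")
    level = [leaf_hash(r) for r in records]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last node
        level = _next_level(level)
    return level[0]


def inclusion_proof(records: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag is True if the sibling is on the right."""
    level = [leaf_hash(r) for r in records]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(record: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    current = leaf_hash(record)
    for sibling, sibling_is_right in proof:
        current = node_hash(current, sibling) if sibling_is_right else node_hash(sibling, current)
    return current == root
```

The proof length grows with the logarithm of the window size, which is what makes per-record verification practical for large streams.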
Audit Logging
The platform MUST maintain an immutable audit log of all attestation operations. Each log entry MUST include:
- Timestamp of attestation generation
- Attestation digest (hash of the generated attestation)
- Pipeline identifier and version
- Tenant identifier
- Input and output data digests
Verification
Verifiers can request audit log entries corresponding to an attestation to confirm it was generated through legitimate platform operations.
Audit logs MUST be append-only and tamper-evident. Acceptable implementations include:
- Transparency log (e.g., Sigstore Rekor, Trillian)
- Blockchain or distributed ledger
- Cryptographically chained log entries with third-party witnesses
- Write-once storage with integrity verification
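As an illustration of the "cryptographically chained log entries" option, the sketch below stores the required fields in each entry and links every entry to the hash of its predecessor, so editing an earlier entry breaks verification. The class `ChainedAuditLog` and its field names are hypothetical, and a chain like this is only tamper-evident in practice if the latest entry hash is also held by a third-party witness, which is not modeled here.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def _digest(body: dict) -> str:
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass
class ChainedAuditLog:
    entries: list = field(default_factory=list)

    def append(self, *, timestamp: str, attestation_digest: str, pipeline: str,
               tenant: str, input_digests: list, output_digests: list) -> dict:
        body = {
            "timestamp": timestamp,
            "attestationDigest": attestation_digest,
            "pipeline": pipeline,
            "tenant": tenant,
            "inputDigests": input_digests,
            "outputDigests": output_digests,
            "prevHash": self.entries[-1]["entryHash"] if self.entries else GENESIS,
        }
        entry = {**body, "entryHash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every entry hash and check each link to its predecessor."""
        prev = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entryHash"}
            if entry["prevHash"] != prev or entry["entryHash"] != _digest(body):
                return False
            prev = entry["entryHash"]
        return True
```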
Audit logs MUST be retained for a minimum of 18 months. The platform SHOULD provide a mechanism for tenants to request extended retention. Logs MUST remain verifiable throughout the retention period.
Trusted Execution (Optional)
Platforms MAY use Trusted Execution Environments (TEEs) to provide hardware-enforced isolation between tenant code and attestation infrastructure. TEE-based implementations provide additional assurance against privileged software attacks. Supported TEE technologies include:
- Intel SGX/TDX — Enclave-based isolation
- AMD SEV-SNP — Confidential VMs
- ARM TrustZone/CCA — Secure world isolation
- AWS Nitro Enclaves — Cloud-native isolation
If using TEE-based isolation, the platform SHOULD support remote attestation that allows verifiers to confirm the attestation infrastructure is running in a genuine TEE with expected code measurements.
Verification Requirements
Verifiers (data consumers) MUST perform the following checks to confirm L3 compliance, in addition to all L2 verification requirements.
Platform Identity Verification
Verifiers MUST confirm that the attestation was signed by a known L3-compliant platform, not by a tenant identity. This requires:
- Checking the signing certificate belongs to the platform (not tenant)
- Verifying the platform identity against a trust policy of approved L3 platforms
- Confirming the certificate was valid at attestation time
Verifiers MUST confirm the signing platform is L3-compliant. Acceptable evidence includes:
- Platform is listed in a trusted registry of L3-compliant platforms
- Platform provides a valid L3 compliance attestation from an accredited auditor
- Platform publishes machine-readable L3 compliance claims that can be verified
Isolation Verification
Verifiers SHOULD confirm that the platform uses HSM-backed signing. This MAY be verified through:
- HSM attestation certificates included with signatures
- Platform audit reports documenting HSM usage
- Key provenance documentation showing HSM origin
If the platform claims TEE-based isolation, verifiers MAY request and verify remote attestation quotes to confirm the attestation infrastructure is running in a genuine TEE.
Audit Trail Verification
Verifiers SHOULD confirm that the attestation appears in the platform's audit log. For transparency log implementations, this means verifying an inclusion proof. Absence from the audit log indicates potential forgery.
For high-assurance use cases, verifiers SHOULD cross-reference audit log entries with attestation content to confirm consistency (matching digests, timestamps, pipeline identifiers).
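The cross-reference check might look like the following sketch, which compares an attestation with its audit log entry and reports every field that disagrees. The field names (`attestationDigest`, `pipelineId`, `timestamp`, `inputDigests`, `outputDigests`) and the function `cross_reference` are assumptions made for the example; a real platform would document its own log schema.

```python
import hashlib

# Hypothetical fields expected to match between attestation and log entry.
COMPARED_FIELDS = ("pipelineId", "timestamp", "inputDigests", "outputDigests")


def cross_reference(attestation_bytes: bytes, attestation: dict,
                    log_entry: dict) -> list[str]:
    """Return the names of fields that are inconsistent (empty if all match)."""
    mismatches = []
    # The log stores a digest of the serialized attestation as it was signed.
    if hashlib.sha256(attestation_bytes).hexdigest() != log_entry.get("attestationDigest"):
        mismatches.append("attestationDigest")
    for name in COMPARED_FIELDS:
        if attestation.get(name) != log_entry.get(name):
            mismatches.append(name)
    return mismatches
```

Reporting the specific mismatched fields, rather than a single pass/fail flag, gives a verifier something concrete to escalate to the platform.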
Verifiers MUST apply a trust policy that explicitly requires L3 compliance for use cases that need unforgeable provenance. The policy MUST reject attestations that:
- Are signed by tenant identities (not platform identities)
- Come from platforms not recognized as L3-compliant
- Cannot be verified against the platform's audit log
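The three rejection rules above can be sketched as a small policy function. It assumes that signature checking, identity classification, and audit log lookup have already happened elsewhere and are summarized in a hypothetical `VerifiedAttestation` record; the names and the allowlist format are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedAttestation:
    signer_identity: str      # identity taken from the verified signature
    signer_is_platform: bool  # False when the signer is a tenant identity
    audit_log_verified: bool  # True when an audit log entry was confirmed


def evaluate_l3_policy(att: VerifiedAttestation,
                       approved_platforms: frozenset[str]) -> tuple[bool, str]:
    """Apply the L3 rejection rules in order and return (accepted, reason)."""
    if not att.signer_is_platform:
        return False, "signed by a tenant identity"
    if att.signer_identity not in approved_platforms:
        return False, "platform not recognized as L3-compliant"
    if not att.audit_log_verified:
        return False, "not verifiable against the platform audit log"
    return True, "accepted"
```

Keeping the approved platform list as an explicit input makes the trust decision auditable: the verifier's policy, not the attestation, decides which platforms count as L3-compliant.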
Threats Mitigated
L3 compliance mitigates every threat in the Makoto threat model except D7 (Privacy Leakage), including threats that L2 addresses only partially:
| Threat | Description | L3 Mitigation |
|---|---|---|
| D1 | Source Falsification | Platform-verified origin claims. Tenant cannot forge source attestations—only the platform can attest to observed data sources. |
| D2 | Collection Tampering | Platform computes hashes at ingress. Tenant cannot provide false hashes for modified data. |
| D3 | Transform Opacity | Platform observes and attests actual transform execution. Tenant cannot claim different transforms than those executed. |
| D4 | Lineage Forgery | Platform generates chained attestations. Tenant cannot insert fake history—all lineage is platform-attested. |
| D5 | Stream Injection | Platform controls stream ingress. Unauthorized records cannot enter streams without platform attestation. |
| D6 | Aggregation Manipulation | Platform computes and attests aggregates. Tenant cannot claim false statistics—platform observes actual computation. |
| D8 | Time Manipulation | Platform controls timestamps. Combined with L2 trusted timestamping, attestation times are unforgeable. |
Key difference from L2: At L2, a compromised producer can forge attestations using their own signing keys. At L3, even a fully compromised tenant cannot forge attestations—they can only refuse to produce data or produce garbage data that will be accurately attested as coming from them.
Note: L3 does not mitigate threat D7 (Privacy Leakage). Privacy-preserving attestation techniques are orthogonal to the provenance level and should be applied based on data sensitivity requirements. See Privacy Techniques for guidance.
Conformance Checklist
Use this checklist to verify L3 compliance. All L2 requirements must also be met.
Producer Checklist
| Req | Requirement | Level |
|---|---|---|
| L3.P.1 | Executing on L3-compliant platform | MUST |
| L3.P.2 | Using platform-provided attestation mechanisms (no direct signing) | MUST |
| L3.P.3 | Pipeline configuration declared in platform-readable format | MUST |
| L3.P.4 | Data routed through platform-controlled channels | MUST |
| L3.P.5 | Using deterministic transforms where possible | SHOULD |
Platform Checklist
| Req | Requirement | Level |
|---|---|---|
| Attestation Generation | | |
| L3.PL.1 | Attestations generated in isolated control plane | MUST |
| L3.PL.2 | Attestations bound to observed behavior (not tenant claims) | MUST |
| L3.PL.3 | Signed with platform identity (not tenant identity) | MUST |
| Key Isolation | | |
| L3.PL.4 | Signing keys stored in HSM | MUST |
| L3.PL.5 | No key access from data plane | MUST |
| L3.PL.6 | Annual key rotation | SHOULD |
| Hash Computation | | |
| L3.PL.7 | Hashes computed in control plane | MUST |
| L3.PL.8 | Deterministic hashing with documented canonicalization | MUST |
| L3.PL.9 | Streaming hash computation support (Merkle trees) | SHOULD |
| Audit Logging | | |
| L3.PL.10 | All attestation operations logged | MUST |
| L3.PL.11 | Audit log integrity protected (append-only, tamper-evident) | MUST |
| L3.PL.12 | 18-month minimum audit log retention | MUST |
| Trusted Execution (Optional) | | |
| L3.PL.13 | Hardware-based isolation (TEE) | MAY |
| L3.PL.14 | Remote attestation for TEE (if TEE-based isolation is used) | SHOULD |
Verifier Checklist
| Req | Requirement | Level |
|---|---|---|
| L3.V.1 | Platform signing identity verified (not tenant) | MUST |
| L3.V.2 | Platform L3 compliance verified | MUST |
| L3.V.3 | HSM-backed signing confirmed | SHOULD |
| L3.V.4 | TEE attestation verified (if claimed) | MAY |
| L3.V.5 | Audit log inclusion verified | SHOULD |
| L3.V.6 | Audit entries cross-referenced with attestation | SHOULD |
| L3.V.7 | L3 trust policy applied | MUST |