Which Standard Should I Use?
Multiple standards address provenance and lineage in different ways. This guide helps you choose the right standard—or combination of standards—for your use case. The good news: these standards are often complementary, not competing.
TL;DR: Use SLSA for software builds, Makoto for data integrity, OpenLineage for pipeline observability, and W3C PROV for semantic interoperability. Many organizations use multiple standards together.
Quick Decision Matrix
Find your primary need in the left column to identify the best starting point:
| Primary Need | Recommended Standard | Why |
|---|---|---|
| Prove software wasn't tampered with | SLSA | Purpose-built for software supply chain security |
| Prove data origin and transformations | Makoto | Cryptographic attestations for data integrity |
| Track pipeline runs and dependencies | OpenLineage | Runtime metadata for observability and debugging |
| Semantic provenance interchange | W3C PROV | Standard ontology for cross-system interoperability |
| ML model training data governance | Makoto + SLSA | Makoto for data, SLSA for model artifacts |
| Data catalog with lineage visibility | OpenLineage + Makoto | OpenLineage for discovery, Makoto for verification |
Standard Profiles
SLSA — Supply-chain Levels for Software Artifacts
Purpose: Secure the software build process with verifiable provenance.
Focus: Software artifacts (binaries, packages, containers)
Strengths
- Mature ecosystem (GitHub, GitLab support)
- Clear level progression (L1-L3)
- Strong tooling (Sigstore, cosign)
- Industry adoption (OpenSSF)
Limitations
- Software-focused (not data)
- Single build step model
- No streaming support
- No privacy considerations
Best for: CI/CD pipelines, container builds, package publishing, infrastructure-as-code
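Concretely, SLSA provenance is an in-toto Statement whose predicate describes how an artifact was built. A minimal sketch of the structure (field names follow the SLSA v1.0 provenance schema; the artifact, `buildType` URI, and builder id are illustrative placeholders):

```python
import hashlib
import json

# Illustrative digest — in practice you hash the real binary or container image.
artifact_digest = hashlib.sha256(b"example artifact bytes").hexdigest()

# Minimal in-toto Statement carrying a SLSA provenance predicate.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [
        {"name": "myapp-1.0.0.tar.gz", "digest": {"sha256": artifact_digest}}
    ],
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {
        "buildDefinition": {
            "buildType": "https://example.com/ci-build",  # hypothetical buildType URI
            "externalParameters": {"ref": "refs/tags/v1.0.0"},
        },
        "runDetails": {
            "builder": {"id": "https://example.com/builders/ci"},  # hypothetical builder id
        },
    },
}

serialized = json.dumps(statement, indent=2)
```

At L2 and above, this statement would be wrapped in a signed DSSE envelope rather than stored as bare JSON.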
Makoto — Data Integrity Framework
Purpose: Cryptographic attestations for data origin, transformation, and lineage.
Focus: Data artifacts (datasets, streams, ETL outputs)
Strengths
- Multi-stage transform chains
- Streaming data support (Merkle windows)
- Privacy-preserving options
- DBOM output format
- in-toto compatible
Limitations
- Newer standard (less tooling)
- Requires implementation effort
- L3 requires platform support
Best for: ETL pipelines, ML training data, data marketplaces, regulatory compliance
OpenLineage — Open Standard for Data Lineage
Purpose: Runtime metadata collection for pipeline observability and lineage tracking.
Focus: Job/task execution metadata and dataset dependencies
Strengths
- Broad tool integration (Airflow, Spark, dbt)
- Rich runtime metadata
- Active community (LF AI & Data)
- Great for data catalogs
Limitations
- No cryptographic verification
- Observability focus (not security)
- No integrity guarantees
- Metadata only (no data hashing)
Best for: Data catalogs, pipeline debugging, impact analysis, data discovery
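An OpenLineage integration emits JSON run events at job start and completion. A sketch of a minimal COMPLETE event (field names follow the OpenLineage event schema; the namespaces, job name, and producer URI are hypothetical):

```python
import json
import uuid
from datetime import datetime, timezone

# Minimal OpenLineage RunEvent: one job run completing, reading one dataset
# and writing another. Note there are no hashes or signatures — only metadata.
event = {
    "eventType": "COMPLETE",
    "eventTime": datetime.now(timezone.utc).isoformat(),
    "run": {"runId": str(uuid.uuid4())},
    "job": {"namespace": "analytics", "name": "daily_orders_etl"},  # hypothetical job
    "inputs": [{"namespace": "warehouse", "name": "raw.orders"}],
    "outputs": [{"namespace": "warehouse", "name": "marts.daily_orders"}],
    "producer": "https://example.com/etl/v1",  # identifies the emitting integration
}

payload = json.dumps(event)
```

The absence of content digests in this event is the point of the "metadata only" limitation above: lineage is recorded, but nothing proves the datasets themselves are unmodified.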
W3C PROV — Provenance Data Model
Purpose: Standard ontology for representing provenance information across systems.
Focus: Semantic interoperability and provenance interchange
Strengths
- W3C recommendation (stable)
- Rich semantic model
- Cross-domain applicability
- RDF/OWL support
Limitations
- Generic (not data-specific)
- No security levels
- Complex for simple use cases
- Limited modern tooling
Best for: Research data, cross-organization sharing, semantic web applications, archives
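PROV's core model relates entities (data), activities (processes), and agents (people or systems). A minimal sketch in the PROV-JSON serialization, expressing "a cleaned dataset was generated by a transform run by a pipeline agent" (the `ex:` prefix and record names are illustrative):

```python
import json

# PROV-JSON groups records by relation type; keys are qualified names.
prov_doc = {
    "prefix": {"ex": "http://example.org/"},
    "entity": {"ex:cleaned_dataset": {}},
    "activity": {"ex:clean_transform": {}},
    "agent": {"ex:etl_pipeline": {}},
    "wasGeneratedBy": {
        "_:gen1": {
            "prov:entity": "ex:cleaned_dataset",
            "prov:activity": "ex:clean_transform",
        }
    },
    "wasAssociatedWith": {
        "_:assoc1": {
            "prov:activity": "ex:clean_transform",
            "prov:agent": "ex:etl_pipeline",
        }
    },
}

serialized = json.dumps(prov_doc)
```

The same graph can be serialized as PROV-O (RDF) for triple stores, which is where the semantic-web strengths listed above come into play.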
Feature Comparison
| Feature | SLSA | Makoto | OpenLineage | W3C PROV |
|---|---|---|---|---|
| Primary domain | Software | Data | Data | General |
| Cryptographic signing | Yes (L2+) | Yes (L2+) | No | No |
| Content hashing | Yes | Yes | No | Optional |
| Security levels | L1-L3 | L1-L3 | No | No |
| Streaming support | No | Yes (windows) | Limited | No |
| Multi-stage lineage | No | Yes | Yes | Yes |
| Privacy features | No | Yes | No | No |
| Runtime metadata | Basic | Basic | Rich | Basic |
| Tool integrations | Many | Growing | Many | Limited |
| Output format | SBOM | DBOM | Events | RDF/JSON |
| Attestation format | in-toto/DSSE | in-toto/DSSE | JSON | PROV-O/PROV-JSON |
Use Case Scenarios
Scenario 1: ML Training Pipeline
"I'm training ML models and need to prove the provenance of both my training data and model artifacts."
Recommendation: Use Makoto for training data attestations and SLSA for model artifact provenance.
- Makoto tracks data origin, transformations, and feature engineering
- SLSA tracks model training code and resulting artifacts
- Both use in-toto format, enabling unified verification policies
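The shared in-toto format means one policy check can tie the two attestations together: the model attestation's claimed training-data digest should match a subject digest in the data attestation. A hedged sketch (statement shapes are simplified and the `trainingData` predicate field is hypothetical; real verification would also check the DSSE signatures):

```python
import hashlib

def subject_digests(statement: dict) -> set:
    """Collect the sha256 digests claimed for a statement's subjects."""
    return {s["digest"]["sha256"] for s in statement["subject"]}

# Hypothetical attestations: one for the training dataset, one for the model.
data_hash = hashlib.sha256(b"training data").hexdigest()
model_hash = hashlib.sha256(b"model weights").hexdigest()

data_attestation = {
    "subject": [{"name": "train.parquet", "digest": {"sha256": data_hash}}],
}
model_attestation = {
    "subject": [{"name": "model.bin", "digest": {"sha256": model_hash}}],
    # The model's provenance records which dataset digest it was trained on.
    "predicate": {"trainingData": {"sha256": data_hash}},
}

# Policy: the model's claimed training-data digest must appear among the
# dataset attestation's subjects.
claimed = model_attestation["predicate"]["trainingData"]["sha256"]
lineage_verified = claimed in subject_digests(data_attestation)
```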
Scenario 2: Data Warehouse Governance
"I need to understand data lineage for impact analysis and also prove data hasn't been tampered with."
Recommendation: Use OpenLineage for lineage visibility and Makoto for integrity verification.
- OpenLineage integrates with dbt, Airflow, and data catalogs for discovery
- Makoto provides cryptographic proof of data integrity for high-value tables
- Use OpenLineage broadly, Makoto for regulated or sensitive data
Scenario 3: Cross-Organization Data Sharing
"I'm sharing data with external partners and need to provide verifiable provenance claims."
Recommendation: Use Makoto L2 with signed attestations. Consider W3C PROV if partners use semantic web systems.
- Makoto L2 provides signed, tamper-evident attestations
- Partners can independently verify data hasn't been modified
- DBOM documents complete chain of custody
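For the partner, independent verification reduces to recomputing the content hash of the received data and comparing it against the digest claimed in the signed attestation. A minimal sketch (signature checking omitted; only the hash comparison is shown):

```python
import hashlib

def verify_dataset(file_bytes: bytes, attested_sha256: str) -> bool:
    """Recompute the dataset hash and compare it to the attestation's claim."""
    return hashlib.sha256(file_bytes).hexdigest() == attested_sha256

# The partner receives the data and the attested digest separately.
received = b"shared dataset contents"
attested = hashlib.sha256(b"shared dataset contents").hexdigest()  # from the attestation

ok = verify_dataset(received, attested)
tampered = verify_dataset(received + b" extra byte", attested)
```

Because the digest travels inside a signed attestation, the partner needs only the producer's public key — no trust in the transport channel.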
Scenario 4: Real-Time Streaming Analytics
"I'm processing millions of events per second and need to attest to data integrity without impacting throughput."
Recommendation: Use Makoto stream-window attestations.
- Window-based Merkle trees enable high-throughput attestation
- Single signature per window (not per record)
- Hash chaining provides tamper-evident stream history
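The window approach can be illustrated with a plain Merkle tree over a batch of records, with each window root chained to the previous one. This is a generic sketch of the technique, not the Makoto wire format:

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    """Merkle root over record hashes, duplicating the last node on odd levels."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# One window of streaming records (illustrative payloads).
window = [f"event-{i}".encode() for i in range(1000)]
root = merkle_root(window)

# Chain to the previous window's root so the stream history is tamper-evident.
prev_chained = sha256(b"genesis")  # first window chains to a fixed genesis value
chained_root = sha256(prev_chained + root)
# Sign `chained_root` once per window instead of signing every record.
```

Altering any single record changes its leaf hash, the window root, and every chained root after it, which is what makes per-window signatures sufficient.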
Scenario 5: Research Data Publication
"I'm publishing research datasets and need to document provenance for reproducibility."
Recommendation: Use W3C PROV for semantic richness, optionally with Makoto L1 for structured attestations.
- W3C PROV provides rich semantic descriptions understood by research tools
- PROV-O enables integration with research data repositories
- Makoto adds structured, machine-verifiable attestations
Using Multiple Standards Together
These standards are designed to complement each other. Here's how they fit together:
The Full Stack Approach
| Layer | Standard | Purpose |
|---|---|---|
| Observability | OpenLineage | Runtime lineage events for data catalogs and debugging |
| Data Integrity | Makoto | Cryptographic attestations for data provenance |
| Code Integrity | SLSA | Build provenance for pipeline code and tools |
| Interchange | W3C PROV | Semantic export for cross-system interoperability |
Integration Points
- Makoto + SLSA: Both use in-toto/DSSE attestation format—store together, verify together
- Makoto + OpenLineage: OpenLineage events can reference Makoto attestation URIs
- Makoto + W3C PROV: Makoto attestations can be exported to PROV-JSON for semantic queries
- OpenLineage + W3C PROV: OpenLineage lineage can be mapped to PROV entities/activities
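The OpenLineage-to-PROV mapping in the last point is largely mechanical: a run maps to a PROV activity, its input and output datasets to entities, connected by `used` and `wasGeneratedBy` relations. A hedged sketch (the `ol:` prefix and id scheme are illustrative, not a standardized mapping):

```python
def openlineage_to_prov(event: dict) -> dict:
    """Map a minimal OpenLineage RunEvent to PROV-JSON-style records."""
    activity_id = f"ol:run/{event['run']['runId']}"
    prov = {"activity": {activity_id: {}}, "entity": {}, "used": {}, "wasGeneratedBy": {}}
    for i, ds in enumerate(event.get("inputs", [])):
        eid = f"ol:{ds['namespace']}/{ds['name']}"
        prov["entity"][eid] = {}
        prov["used"][f"_:u{i}"] = {"prov:activity": activity_id, "prov:entity": eid}
    for i, ds in enumerate(event.get("outputs", [])):
        eid = f"ol:{ds['namespace']}/{ds['name']}"
        prov["entity"][eid] = {}
        prov["wasGeneratedBy"][f"_:g{i}"] = {"prov:entity": eid, "prov:activity": activity_id}
    return prov

# Illustrative event with one input and one output dataset.
event = {
    "run": {"runId": "abc-123"},
    "inputs": [{"namespace": "warehouse", "name": "raw.orders"}],
    "outputs": [{"namespace": "warehouse", "name": "marts.daily_orders"}],
}
prov = openlineage_to_prov(event)
```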
Decision Flowchart
Answer these questions to find your starting point:
1. What type of artifact are you attesting?
- Software (binaries, packages, containers) → SLSA
- Data (datasets, streams, ETL outputs) → Continue to question 2
2. Do you need cryptographic integrity verification?
- Yes, I need to prove data wasn't tampered with → Makoto
- No, I just need lineage visibility for observability → OpenLineage
- Both → Makoto + OpenLineage
3. Do you need to share provenance across different systems/organizations?
- Yes, with semantic web or research systems → Consider W3C PROV export
- Yes, with modern data systems → Makoto/OpenLineage native formats work well
- No, internal use only → Use native format of your chosen standard
4. What security level do you need?
- Documentation only → Makoto L1 or OpenLineage
- Signed, tamper-evident records → Makoto L2
- Hardware-backed, unforgeable attestations → Makoto L3
Summary
| Standard | Use When You Need | Don't Use When |
|---|---|---|
| SLSA | Software build provenance, supply chain security | Data provenance, streaming, multi-stage transforms |
| Makoto | Data integrity, cryptographic verification, streaming, privacy | Software builds, basic lineage without security needs |
| OpenLineage | Pipeline observability, data catalogs, impact analysis | Cryptographic integrity, security requirements |
| W3C PROV | Semantic interoperability, research data, cross-system sharing | Simple use cases, when modern tooling is needed |
Bottom line: Start with the standard that matches your primary need. Add complementary standards as your requirements grow. Most organizations benefit from using 2-3 standards together for comprehensive provenance coverage.
Compare SLSA and Makoto in depth → | Learn about Makoto Levels →