This document defines the parity contract for ODF validation in openxml-audit.
- .odt, .ods, .odp) validation.
- scripts/odf/run_odf_parity_snapshot.py
- scripts/odf/run_reference_validators.py
- scripts/odf/check_reference_drift.py
- data/odf/reference_corpus/manifest.json
- tests/fixtures/odf/
- run_reference_validators.py materializes deterministic ODF ZIP files from fixture directories.
- mimetype is written first and uncompressed when present.

Sample entry contract:
- id (string, stable identifier)
- profile (string, e.g. valid or invalid)
- category (string, mismatch family grouping seed, e.g. package, schema, semantic, security)
- fixture_dir (string, path relative to fixtures root)
- filename (string, staged output name)
- file_format (string, currently odf1.2 or odf1.3)
- odf_version_marker)

Primary report: JSON output from scripts/odf/run_reference_validators.py.
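The staging rule stated earlier (mimetype written first and uncompressed when present) can be sketched as follows. This is a minimal illustration, not the code in run_reference_validators.py: the helper name `stage_odf_package`, the fixed timestamp, and the sorted entry order are assumptions about what "deterministic" means here.

```python
import zipfile
from pathlib import Path

# Assumption: a fixed timestamp keeps repeated runs byte-identical.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def stage_odf_package(fixture_dir: Path, output_path: Path) -> None:
    """Zip a fixture directory into an ODF package (hypothetical helper)."""
    entries = {
        path.relative_to(fixture_dir).as_posix(): path
        for path in fixture_dir.rglob("*")
        if path.is_file()
    }
    with zipfile.ZipFile(output_path, "w") as archive:
        # mimetype goes first and is stored without compression, when present.
        if "mimetype" in entries:
            info = zipfile.ZipInfo("mimetype", date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_STORED
            archive.writestr(info, entries.pop("mimetype").read_bytes())
        # Remaining entries are written in sorted order for stable output.
        for name in sorted(entries):
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            archive.writestr(info, entries[name].read_bytes())
```

Fixtures without a mimetype file still produce an archive, which is useful for invalid-profile samples that are meant to fail package checks.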
Top-level fields:
- generated_at (ISO timestamp)
- contract_version (odf-reference-v1 for run reports, odf-reference-v2 for compare reports)
- corpus_manifest (resolved path)
- fixtures_root (resolved path)
- strict (boolean, Python validator mode)
- sample_count (integer)
- duration_seconds (float)
- python_issue_categories (category aggregate for python findings)
- runners (object with per-runner status counts + command template metadata)
- samples (array of sample records)

Per-sample fields:
- id, profile, category, fixture_dir, filename, file_format, staged_relpath
- runs:
  - python (always attempted)
  - odf_toolkit (optional; unavailable if command not configured)
  - opf (optional; unavailable if command not configured)

Per-run fields:
- status: one of ok, unavailable, timeout, error
- duration_seconds (when executed)
- exit_code (when executed)
- issues (normalized issue rows)
- reason, stdout_preview, stderr_preview, command)
- status=unavailable or status=error and do not contribute issue rows.

Python rows:
- openxml_audit.parity_normalization.normalize_error_tuple.
- severity from ValidationError.severity
- comparison_key = "<severity>|<normalized_description>"

Reference rows (ODF Toolkit / OPF):
- warn -> warning
- info -> info
- error
- normalize_description.
- comparison_key = "<severity>|<normalized_description>"

scripts/odf/compare_reference_results.py compares Python vs each reference tool independently.
- comparison_key
- Counter) intersection/difference per sample
- total, compared, skipped)
- python, reference, matched, only_python, only_reference)
- only_python, only_reference) sorted by count
- family_group_key for cross-tool family grouping
- only_python, only_reference) grouped by issue category
- cross_tool_families.only_python and cross_tool_families.only_reference
- comparison_key after tool-name/path/file noise reduction
- count and per-tool count breakdown (tools)

Skipped samples are recorded when either run status is not ok.
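The key construction and per-sample Counter comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the code in compare_reference_results.py: `normalize_description_stub` is a stand-in for the project's real normalizer, the fallback of unknown reference levels to error is assumed, and `compare_sample` is a hypothetical name.

```python
from collections import Counter

# Reference-tool severity mapping; treating any other level as "error"
# is an assumption of this sketch.
REFERENCE_SEVERITY = {"warn": "warning", "info": "info"}


def normalize_description_stub(text: str) -> str:
    """Stand-in normalizer: lowercase and collapse whitespace only."""
    return " ".join(text.lower().split())


def comparison_key(severity: str, description: str) -> str:
    """Build "<severity>|<normalized_description>"."""
    return f"{severity}|{normalize_description_stub(description)}"


def compare_sample(python_keys, reference_keys) -> dict:
    """Multiset comparison of two lists of comparison keys for one sample."""
    py, ref = Counter(python_keys), Counter(reference_keys)
    matched = py & ref  # multiset intersection
    return {
        "python": sum(py.values()),
        "reference": sum(ref.values()),
        "matched": sum(matched.values()),
        "only_python": dict(py - ref),  # multiset difference
        "only_reference": dict(ref - py),
    }
```

Using Counter rather than set means that two identical findings on one side match only one finding on the other, so duplicate counts remain visible in only_python and only_reference.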
scripts/odf/check_reference_drift.py compares a current compare report to a pinned baseline and enforces
a threshold policy.
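A threshold check of this kind might look like the sketch below. The threshold names come from the default strict policy in this document, but the flat `baseline` and `current` dictionaries are a simplifying assumption (the real compare report layout may differ), and `check_drift` is a hypothetical name. Only three of the thresholds are shown.

```python
def check_drift(baseline: dict, current: dict, policy: dict) -> list:
    """Return human-readable violations; an empty list means the gate passes."""
    checks = [
        # (policy key, observed change relative to the baseline)
        ("max_only_python_growth",
         current["only_python"] - baseline["only_python"]),
        ("max_only_reference_growth",
         current["only_reference"] - baseline["only_reference"]),
        ("max_compared_sample_drop",
         baseline["compared"] - current["compared"]),
    ]
    violations = []
    for key, observed in checks:
        allowed = policy.get(key, 0)  # missing keys default to strict (0)
        if observed > allowed:
            violations.append(f"{key}: observed {observed}, allowed {allowed}")
    return violations
```

With every threshold at 0, any growth in unmatched findings or any drop in compared samples fails the gate, while improvements (negative changes) pass.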
Policy file:
- data/odf/reference_baseline/2026-03-09/drift_policy.json

Default strict policy:
- max_only_python_growth = 0
- max_only_reference_growth = 0
- max_new_only_python_families = 0
- max_new_only_reference_families = 0
- max_compared_sample_drop = 0
- max_unavailable_samples = 0
- max_timeout_samples = 0
- max_error_samples = 0

Failure conditions:
Gate output:
- reports/odf/reference_drift.json
- reports/odf/reference_drift.md

Waivers are declared in data/odf/reference_baseline/2026-03-09/waivers.json.
Required fields:
- kind
- owner
- reason
- expires (YYYY-MM-DD)

Optional fields:
- tool (tool-scoped waiver)
- target (required for family-targeted waiver kinds)

Allowed waiver kinds:
- only_python_growth
- only_reference_growth
- new_only_python_family (target required)
- new_only_reference_family (target required)
- samples_compared_drop
- reference_unavailable
- reference_timeout
- reference_error

Rules:
Workflow: .github/workflows/odf-reference-calibration.yml
- run_reference_validators.py
- compare_reference_results.py
- build_mismatch_triage.py
- check_reference_drift.py
- drift_policy.json
- waivers.json
- scripts/odf/bootstrap_reference_validators.py
- odf_toolkit_ref
- opf_ref

To regenerate baseline artifacts:
python scripts/odf/run_reference_validators.py \
--corpus-manifest data/odf/reference_corpus/manifest.json \
--output data/odf/reference_baseline/2026-03-09/reference_runs.json
python scripts/odf/compare_reference_results.py \
--input data/odf/reference_baseline/2026-03-09/reference_runs.json \
--output data/odf/reference_baseline/2026-03-09/mismatch_report.json \
--summary data/odf/reference_baseline/2026-03-09/mismatch_summary.md
python scripts/odf/build_mismatch_triage.py \
--compare data/odf/reference_baseline/2026-03-09/mismatch_report.json \
--runs data/odf/reference_baseline/2026-03-09/reference_runs.json \
--output data/odf/reference_baseline/2026-03-09/mismatch_triage.md
python scripts/odf/check_reference_drift.py \
--baseline data/odf/reference_baseline/2026-03-09/mismatch_report.json \
--current data/odf/reference_baseline/2026-03-09/mismatch_report.json \
--policy data/odf/reference_baseline/2026-03-09/drift_policy.json \
--waivers data/odf/reference_baseline/2026-03-09/waivers.json \
--output data/odf/reference_baseline/2026-03-09/drift_report.json \
--summary data/odf/reference_baseline/2026-03-09/drift_summary.md
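
After regenerating reference_runs.json, it can help to confirm that the reference runners actually executed before trusting the baseline. The sketch below counts run statuses per runner and collects the reason for every run that is not ok. It assumes that runs is an object keyed by runner name, as the per-sample field list suggests; `summarize_runs` is a hypothetical helper, not part of the repository.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_runs(report: dict) -> dict:
    """Count run statuses per runner and collect details of non-ok runs."""
    status_counts = {}
    problems = []
    for sample in report.get("samples", []):
        for runner, run in sample.get("runs", {}).items():
            status = run.get("status", "unknown")
            status_counts.setdefault(runner, Counter())[status] += 1
            if status != "ok":
                problems.append(
                    (sample.get("id"), runner, status, run.get("reason"))
                )
    return {"status_counts": status_counts, "problems": problems}


if __name__ == "__main__":
    path = Path("data/odf/reference_baseline/2026-03-09/reference_runs.json")
    if path.exists():
        summary = summarize_runs(json.loads(path.read_text(encoding="utf-8")))
        for runner, counts in summary["status_counts"].items():
            print(runner, dict(counts))
        for sample_id, runner, status, reason in summary["problems"]:
            print(f"{sample_id} [{runner}] {status}: {reason}")
```
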
- run_odf_parity_snapshot.py) requires no external tools.
- scripts/odf/bootstrap_reference_validators.py.
- mvn -version) or Docker is available.
- python scripts/odf/bootstrap_reference_validators.py --maven-mode docker --runtime-mode docker
- unavailable, inspect reason in reference_runs.json.
- stdout_preview / stderr_preview from run reports to verify parser-compatible output.
- {file}, {file_dir}, {file_name}, {file_stem}, {file_suffix}
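
The placeholders listed above could be expanded into an argument list roughly as follows. The placeholder names come from this document, but their exact meaning is an assumption here (full path, parent directory, file name, stem, and suffix of the staged file), and `expand_command` is a hypothetical helper, not the runner's actual implementation.

```python
import shlex
from pathlib import Path


def expand_command(template: str, staged_file: Path) -> list:
    """Expand a command template into argv for one staged sample."""
    values = {
        "file": str(staged_file),
        "file_dir": str(staged_file.parent),
        "file_name": staged_file.name,
        "file_stem": staged_file.stem,
        "file_suffix": staged_file.suffix,
    }
    # Split first, then substitute, so a path containing spaces
    # stays a single argument.
    return [token.format(**values) for token in shlex.split(template)]
```

For example, a template such as `validator --dir {file_dir} {file_name}` (a made-up command line) would yield one argument per token with the placeholders filled in. A template containing other literal braces would raise an error in this sketch, because str.format treats them as fields.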