# SemantiCord Schema Taxonomy

This document defines the canonical classification, naming, versioning and relation metadata conventions for schemas in this repository.

## 1. Kind Vocabulary

Canonical `kind` values (controlled set):

| Kind       | Purpose                                                            | Examples                                               |
| ---------- | ------------------------------------------------------------------ | ------------------------------------------------------ |
| core       | Envelope / base primitives that wrap domain payloads               | memoryUnit                                             |
| domain     | Atomic business / market / descriptive events or entities          | bid, settlement, participant, omniRFQ                  |
| report     | Periodic or aggregated narrative / compliance artifacts            | projectRecap, dataQualityReport, esgConsolidatedReport |
| impact     | Social / environmental impact measurements                         | impactSnapshot                                         |
| analytic   | Model outputs, evaluations, quantitative metrics beyond raw events | forecastEvaluation, (future) rollingMetric             |
| assurance  | Third-party or process attestations about other artifacts          | attestation                                            |
| integrity  | Cryptographic integrity grouping, logs, signature bundles          | signatureBundle, transactionLog                        |
| governance | Access control, policy, revocation actions                         | accessGrant, (future) accessRevocation, policy         |
| reference  | Catalog / registry of actors, agreements, documents                | participant, agreement, documentManifest               |

Notes:

- `analytic` covers derived evaluations. `forecastEvaluation` is currently classified as `domain` and should migrate to `analytic` in a later minor revision.
- `report` vs `analytic`: reports are human-readable, narrative and compliance oriented; analytic outputs are machine-derived scores and metrics.

## 2. Naming Conventions

- File name: `<lowerCamelCase>.v<major>.json` (e.g. `priceIndex.v1.json`).
- `$id`: `https://semanticord.org/schema/<sameFileName>`.
- `title`: `<Readable Title> v<major>` (no "Payload" suffix).
- Avoid ambiguous casing: capitalize every word after the first, so prefer `geoBidAuction` (consistent lowerCamel) over `geobidAuction`.
- No spaces in file names. Hyphens are reserved for multi-token acronyms and only where required; avoid them for now.
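
The naming rules above can be sketched as a pair of small checks. This is a minimal illustration, not the repository's actual tooling: the function names `expected_id` and `title_ok` are hypothetical, and the regular expression is only one plausible reading of "lowerCamelCase" (it cannot tell `geoBidAuction` from `geobidAuction`, which still needs human review).

```python
import re

# Base URL taken from the `$id` rule above.
SCHEMA_BASE = "https://semanticord.org/schema/"

# Assumed reading of the file-name rule: a lowerCamelCase name, then ".v<major>.json".
FILE_NAME_RE = re.compile(r"^[a-z][a-zA-Z0-9]*\.v[0-9]+\.json$")


def expected_id(file_name: str) -> str:
    """Return the `$id` implied by a schema file name, or raise ValueError."""
    if not FILE_NAME_RE.match(file_name):
        raise ValueError(f"not a <lowerCamelCase>.v<major>.json name: {file_name!r}")
    return SCHEMA_BASE + file_name


def title_ok(title: str, major: int) -> bool:
    """Check the `<Readable Title> v<major>` shape and the no-"Payload" rule."""
    suffix = f" v{major}"
    if not title.endswith(suffix):
        return False
    readable = title[: -len(suffix)].strip()
    return bool(readable) and not readable.endswith("Payload")
```

For example, `expected_id("priceIndex.v1.json")` yields `https://semanticord.org/schema/priceIndex.v1.json`, while `title_ok("Price Index Payload v1", 1)` is rejected because of the forbidden suffix.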

## 3. Versioning

- Versions are semantic `major.minor`. Envelope and payload schemas pin their version as a `const` in `properties.version`.
- In the registry, `version` mirrors the schema's top-level `version`, or `properties.version.const` when no top-level `version` is present.
- Hash change without a semantic or behavioral change: keep the version and update the hash (a warning is emitted).
- Semantic change (breaking shape or meaning): bump major.
- Additive, non-breaking change: bump minor (e.g. 1.1) and update the schema `const` accordingly.
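
As a rough sketch, the three cases above map onto a single function. The name `next_version` and the change labels `"hash-only"`, `"additive"` and `"breaking"` are hypothetical, and resetting minor to 0 on a major bump is a conventional assumption that this document does not state.

```python
def next_version(current: str, change: str) -> str:
    """Return the version that follows `current` for a given kind of change."""
    major, minor = (int(part) for part in current.split("."))
    if change == "hash-only":
        # Hash changed but meaning did not: keep the version.
        return current
    if change == "additive":
        # Non-breaking addition: bump minor.
        return f"{major}.{minor + 1}"
    if change == "breaking":
        # Breaking shape or meaning: bump major (minor reset is an assumption).
        return f"{major + 1}.0"
    raise ValueError(f"unknown change type: {change!r}")
```

Note that under the file-name rule in section 2 only the major number appears in the file name, so a minor bump changes the `const` but not the file name.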

## 4. Relations (`x-relations`)

Include an `x-relations` object in every schema (empty `{}` if none). Keys are property paths (dot notation, with `[]` marking arrays); each value is a list of target schema keys or IDs.

Example:

```json
"x-relations": {
  "references.attestations[].hash": ["attestation"],
  "references.memory_units[].hash": ["memoryUnit"]
}
```

Guidelines:

- Always list the most direct relation path (avoid duplicating synonyms).
- For hashes referencing Memory Units, prefer target key `memoryUnit`.
- If a target is given as a full URL, the graph loader attempts to map it to the corresponding manifest key.
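
To make the path notation concrete, the following sketch walks an instance and collects every value that a relation path selects. It is an illustration only: `resolve_path` is a hypothetical helper, not the graph loader's implementation, and it assumes that missing branches are silently skipped.

```python
from typing import Any


def resolve_path(instance: Any, path: str) -> list[Any]:
    """Collect every value selected by a dot path; `[]` fans out over array items."""
    current = [instance]
    for segment in path.split("."):
        is_array = segment.endswith("[]")
        key = segment[:-2] if is_array else segment
        next_level = []
        for node in current:
            if not isinstance(node, dict) or key not in node:
                continue  # missing branch: nothing to collect
            value = node[key]
            if is_array:
                if isinstance(value, list):
                    next_level.extend(value)  # one entry per array item
            else:
                next_level.append(value)
        current = next_level
    return current
```

Applied to an instance whose `references.attestations` array holds two objects with `hash` fields, `resolve_path(instance, "references.attestations[].hash")` returns the two hashes in array order.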

## 5. Prefill (`x-prefill`)

Structure:

```json
"x-prefill": {
  "memoryUnit": {
    "platform": "GeoBid",
    "eventType": "auction",
    "title": "Auction Recap — Example",
    "outcome.summary": "Clear summary",
    "actors.primary": "Example Org",
    "location": "Region"
  },
  "domainPayloadExample": { /* minimal valid domain payload */ }
}
```

Top-level duplicates (`platform`, `eventType`) are tolerated, but the canonical location is inside `memoryUnit`.
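
One way a consumer might normalize a prefill block is sketched below. Two assumptions are made that this document does not confirm: dotted keys such as `outcome.summary` denote nested paths, and a top-level duplicate is used only when `memoryUnit` lacks that key. The helper names are hypothetical.

```python
def expand_dotted(flat: dict) -> dict:
    """Turn {"outcome.summary": "x"} into {"outcome": {"summary": "x"}}."""
    nested: dict = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def canonical_memory_unit(x_prefill: dict) -> dict:
    """Merge tolerated top-level duplicates under the canonical `memoryUnit` values."""
    merged = {k: x_prefill[k] for k in ("platform", "eventType") if k in x_prefill}
    merged.update(x_prefill.get("memoryUnit", {}))  # canonical values win
    return expand_dotted(merged)
```

With this reading, a top-level `platform` that disagrees with `memoryUnit.platform` is simply overridden rather than reported; a stricter tool might warn instead.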

## 6. Registry Consistency Rules

- All schemas must appear in `schemas/manifest.json` and be hash-registered via the hash script.
- `kind` must be one of the vocabulary above.
- `status`: one of `active`, `draft`, `deprecated`.
- `draft` items require a `$comment` describing stabilization criteria.
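
The `kind`, `status` and `$comment` rules can be checked per registry entry, roughly as follows. The layout of `schemas/manifest.json` is not specified here, so this sketch assumes each entry is an object carrying `kind`, `status` and (for drafts) `$comment` fields; `check_entry` is a hypothetical name.

```python
# Controlled vocabularies copied from sections 1 and 6 of this document.
KINDS = {
    "core", "domain", "report", "impact", "analytic",
    "assurance", "integrity", "governance", "reference",
}
STATUSES = {"active", "draft", "deprecated"}


def check_entry(key: str, entry: dict) -> list[str]:
    """Return human-readable problems for one registry entry (empty list if none)."""
    problems = []
    if entry.get("kind") not in KINDS:
        problems.append(f"{key}: kind {entry.get('kind')!r} is not in the vocabulary")
    if entry.get("status") not in STATUSES:
        problems.append(f"{key}: status {entry.get('status')!r} is not allowed")
    if entry.get("status") == "draft" and not entry.get("$comment"):
        problems.append(f"{key}: draft entries need a $comment with stabilization criteria")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violation in a single run.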

## 7. Deprecation & Migration

When renaming or replacing a schema:

- Keep the old entry in the registry with `status: "deprecated"`. Optionally keep an alias file containing only a `$comment` that references the new `$id`.
- Provide a migration doc under `migrations/` describing changed properties and mapping.

## 8. Lint Expectations

The schema lint script will verify:

1. `$id` matches file name pattern.
2. Presence of `title`, `description`, `type`.
3. Presence of either top-level `version` or `properties.version.const`.
4. Existence of an `x-relations` key whose value is an object (it may be empty).
5. `kind` in registry matches vocabulary.
6. No forbidden suffixes (e.g., "Payload").
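
A minimal sketch of checks 2 to 4 on an already-parsed schema is shown below. It is not the actual lint script: `lint_schema` is a hypothetical function, and the checks that need the file name or the registry (1, 5 and 6) are left out.

```python
def lint_schema(schema: dict) -> list[str]:
    """Return lint errors for a parsed schema, covering checks 2-4 only."""
    errors = []
    # Check 2: required descriptive fields.
    for field in ("title", "description", "type"):
        if field not in schema:
            errors.append(f"missing {field}")
    # Check 3: top-level version, or a const pinned in properties.version.
    properties = schema.get("properties")
    version_prop = properties.get("version") if isinstance(properties, dict) else None
    has_const = isinstance(version_prop, dict) and "const" in version_prop
    if "version" not in schema and not has_const:
        errors.append("missing version (top-level or properties.version.const)")
    # Check 4: x-relations must exist and be an object, even if empty.
    if not isinstance(schema.get("x-relations"), dict):
        errors.append("x-relations must be an object (use {} when empty)")
    return errors
```

A schema with `title`, `description`, `type`, a `properties.version.const` and `"x-relations": {}` passes with an empty error list.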

## 9. Future Extensions

Reserved future kinds: `lineage`, `risk`. These may eventually be merged into `analytic` or `governance`, but are kept open pending design.

## 10. Example Entry Checklist

Before committing a new schema:

- [ ] File & `$id` naming aligned
- [ ] `title` has correct capitalized format and version suffix
- [ ] `description` concise and meaningful
- [ ] `required` covers all minimal integrity fields
- [ ] `x-relations` present (even if `{}`)
- [ ] `x-prefill.domainPayloadExample` minimal valid instance included (if domain/report/analytic)
- [ ] Added to `schemas/manifest.json`
- [ ] Ran hash script `node scripts/compute-schema-hashes.mjs --write`

---

Questions / amendments: open an issue or propose a PR updating this document.
