# MemoryUnit Core Specification (Draft 1.0.0)

## Status

Draft – Non-normative until tagged 1.0.0. Feedback welcome.

## 1. Scope

This specification defines the canonical envelope ("Memory Unit") for representing normalized event records and their associated domain payloads with integrity, provenance, and extensibility.

## 2. Conformance

An implementation conforms if it:

- Validates instances against the published JSON Schema for the declared version.
- Computes `artifacts.jsonHash` using the canonicalization profile (Section 5) with no deviations.
- Rejects signatures whose `canonicalHash` does not match the recomputed hash.

Normative keywords (MUST, SHOULD, MAY) follow RFC 2119.

## 3. Data Model Overview

Top-level object fields (all optional unless noted by schema):

- `mu_id` (SHOULD): Stable identifier. Recommended format `MU-<Base32Prefix>-<ISO8601Date>`.
- `objectType` (MUST if present): Const `MemoryUnit`.
- `version` (MUST): Schema semantic version string (const `1.0` in v1 schema).
- `event` (SHOULD): Structured event classification; supersedes legacy `eventType`.
- `eventType` (DEPRECATED): Flat enumeration retained for backward compatibility.
- `title`, `timestamp`, `actors`, `location`, `outcome`: Human-readable and structural descriptors.
- `domainPayload` (MAY): Opaque domain-specific payload validated via `metadata.schemaRef`.
- `artifacts` (MUST): Integrity data (Section 4).
- `metadata` (MAY): Classification, schema linkage, correlation.
- `revision`, `createdAt`, `updatedAt`: Revision tracking and lifecycle timestamps.
- `signatures` (MAY): Detached signature array (excluded from hash scope).
- `links` (MAY): Typed inter-object relationships.
- `anchors` (MAY): External notarization references.
- `dataClassifications`, `personalData`, `retentionPolicy` (MAY): Compliance attributes.
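As a rough illustration of the field list above, the following Python sketch assembles a minimal Memory Unit. The helper names (`make_mu_id`, `minimal_memory_unit`), the 8-character Base32 prefix length, and all field values are assumptions made for this example; the published JSON Schema remains the authority on which fields are required and how they are shaped.

```python
import base64
import os
import re
from datetime import datetime, timezone

# Assumed shape of the recommended format MU-<Base32Prefix>-<ISO8601Date>,
# with an 8-character prefix (the prefix length is not fixed by this spec).
MU_ID_PATTERN = re.compile(r"MU-[A-Z2-7]{8}-\d{4}-\d{2}-\d{2}")


def make_mu_id(now=None):
    """Build an identifier in the recommended mu_id format (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # 5 random bytes encode to exactly 8 Base32 characters with no padding.
    prefix = base64.b32encode(os.urandom(5)).decode("ascii")
    return f"MU-{prefix}-{now.date().isoformat()}"


def minimal_memory_unit(title):
    """Return a small Memory Unit dict with placeholder values."""
    now = datetime.now(timezone.utc)
    stamp = now.isoformat(timespec="seconds")
    return {
        "mu_id": make_mu_id(now),
        "objectType": "MemoryUnit",
        "version": "1.0",
        "title": title,
        "timestamp": stamp,
        "artifacts": {},  # artifacts.jsonHash is computed later (Section 4)
        "createdAt": stamp,
    }
```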

## 4. Integrity & Anchoring

### 4.1 Hash Scope

The preimage for `artifacts.jsonHash` MUST be the canonical serialization (Section 5) of the Memory Unit after applying these transformations:

1. Remove the `signatures` property entirely if present.
2. If `artifacts.jsonHash` exists, remove that property.
3. All other fields remain verbatim.
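The three transformations can be sketched in Python as follows. The function name `hash_preimage_object` is hypothetical, and the sketch stops before canonical serialization, which Section 5 covers.

```python
import copy


def hash_preimage_object(mu):
    """Return a copy of the Memory Unit with the hash-scope exclusions applied."""
    scoped = copy.deepcopy(mu)  # never mutate the caller's object
    scoped.pop("signatures", None)  # step 1: drop signatures entirely
    artifacts = scoped.get("artifacts")
    if isinstance(artifacts, dict):
        artifacts.pop("jsonHash", None)  # step 2: drop only artifacts.jsonHash
    return scoped  # step 3: every other field is left untouched
```

Note that an `artifacts` object left empty by step 2 stays in the preimage as `{}`, since only the `jsonHash` property is removed.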

### 4.2 Hash Algorithm

- Algorithm: SHA-256
- Output: Lowercase hex, 64 characters
- Field: `artifacts.jsonHash`
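A minimal Python sketch of the hashing step, assuming the canonical bytes have already been produced per Section 5. The names `json_hash` and `is_well_formed_json_hash` are illustrative only.

```python
import hashlib
import re

# 64 lowercase hexadecimal characters, as required for artifacts.jsonHash.
JSON_HASH_PATTERN = re.compile(r"[0-9a-f]{64}")


def json_hash(canonical_bytes: bytes) -> str:
    """SHA-256 over the canonical serialization, as lowercase hex."""
    return hashlib.sha256(canonical_bytes).hexdigest()


def is_well_formed_json_hash(value) -> bool:
    """Check the output format only; this does not verify the hash itself."""
    return isinstance(value, str) and JSON_HASH_PATTERN.fullmatch(value) is not None
```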

### 4.3 Anchors

Each anchor entry:

- `type`: Namespaced token (e.g. `ethereum:tx`, `ipfs:cid`).
- `network`: Optional; network identifier (`mainnet`, `sepolia`).
- `value`: Raw reference (tx hash, CID, log seq).
- `timestamp`: Optional ISO 8601.
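The sketch below shows one way an anchor entry might be checked in Python. It rests on assumptions not fixed by this section: that `type` takes the form `<namespace>:<kind>` in lowercase letters, digits, and hyphens; that `type` and `value` are required; and that `value` is a string. The function name `anchor_problems` is hypothetical.

```python
import re
from datetime import datetime

# Assumed grammar for a namespaced token such as "ethereum:tx" or "ipfs:cid".
ANCHOR_TYPE_PATTERN = re.compile(r"[a-z][a-z0-9-]*:[a-z][a-z0-9-]*")


def anchor_problems(anchor):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    type_ = anchor.get("type")
    if not isinstance(type_, str) or not ANCHOR_TYPE_PATTERN.fullmatch(type_):
        problems.append("MU005: invalid anchor type format")
    value = anchor.get("value")
    if not isinstance(value, str) or not value:
        problems.append("value must be a non-empty string")
    network = anchor.get("network")
    if network is not None and not isinstance(network, str):
        problems.append("network must be a string when present")
    ts = anchor.get("timestamp")
    if ts is not None:
        try:
            # fromisoformat accepts a common subset of ISO 8601; "Z" is
            # rewritten because older Python versions reject it.
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except (AttributeError, ValueError):
            problems.append("timestamp is not ISO 8601")
    return problems
```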

## 5. Canonicalization Profile

Name: `MU-JCS-1`.
Based on RFC 8785 JSON Canonicalization Scheme with the following clarifications:

- Object keys sorted by Unicode code point ascending.
- Array order preserved.
- Strings serialized per JSON standard; UTF-8 NFC input expected.
- Numbers: serialized as JavaScript `JSON.stringify` produces them (no superfluous leading zeros; exponent form for extreme magnitudes, as emitted by the JS engine). Implementers SHOULD adopt the full RFC 8785 numeric rules for cross-language parity.
- No trailing spaces or line breaks other than required structural characters.

A pseudocode reference is included in `scripts/canonicalize-hash.js`.
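For readers working outside JavaScript, here is a deliberately restricted Python sketch of the profile. It is not the reference script and is not a complete RFC 8785 implementation: it refuses floats rather than guessing at number formatting, and it rejects non-NFC strings instead of normalizing them. The function name `canonicalize` is an assumption of this example.

```python
import json
import unicodedata


def _check(value):
    """Reject values whose serialization this sketch cannot guarantee."""
    if isinstance(value, float):
        # Python and JavaScript format floats differently (1.0 vs 1).
        raise TypeError("floats need RFC 8785 number formatting, not handled here")
    if isinstance(value, str):
        if not unicodedata.is_normalized("NFC", value):
            raise ValueError("string is not NFC-normalized")
    elif isinstance(value, dict):
        for key, item in value.items():
            if not isinstance(key, str):
                raise TypeError("object keys must be strings")
            _check(key)
            _check(item)
    elif isinstance(value, list):
        for item in value:
            _check(item)


def canonicalize(value) -> bytes:
    """Serialize with sorted keys, preserved array order, and no whitespace."""
    _check(value)
    text = json.dumps(
        value,
        sort_keys=True,          # Python compares strings by code point
        separators=(",", ":"),   # no spaces between structural characters
        ensure_ascii=False,      # emit UTF-8 rather than \uXXXX escapes
        allow_nan=False,
    )
    return text.encode("utf-8")
```

One design point worth checking against the reference script: Python sorts keys by Unicode code point, as listed above, whereas RFC 8785 itself orders keys by UTF-16 code units. The two orderings differ only when keys contain characters outside the Basic Multilingual Plane.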

## 6. Signatures

### 6.1 Scope

Each signature object signs the same canonical hash (`canonicalHash`).

### 6.2 Algorithms

- `ecdsa-p256-sha256`: Public key MUST be provided in PEM SPKI.
- `eip191-hash`: MUST include recovered `address`.

### 6.3 Ordering

Consumers SHOULD treat the signatures array as unordered. Producers SHOULD output signatures sorted lexicographically by `algorithm`, then `keyId`, then `createdAt`, so that serialized output is deterministic and diffs stay minimal.
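A producer-side ordering might look like the Python sketch below. It assumes that a missing field sorts as an empty string and that `createdAt` values share one ISO 8601 format, so lexicographic order matches chronological order. `sort_signatures` is a hypothetical name.

```python
def sort_signatures(signatures):
    """Return a new list ordered by algorithm, then keyId, then createdAt."""

    def sort_key(sig):
        return (
            sig.get("algorithm", ""),
            sig.get("keyId", ""),
            sig.get("createdAt", ""),
        )

    return sorted(signatures, key=sort_key)
```

Because `signatures` is excluded from the hash scope (Section 4.1), reordering does not change `artifacts.jsonHash`.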

### 6.4 Verification Steps

1. Recompute canonical hash.
2. Compare to `artifacts.jsonHash` (MUST match).
3. Compare to each signature `canonicalHash` (MUST match or signature invalid).
4. Verify the cryptographic signature per algorithm profile (out of scope here; to be specified in the Signature Profile Extension draft).
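Steps 1 to 3 can be sketched in Python as shown below. The canonicalization here is a simplified stand-in for `MU-JCS-1` that is only adequate for payloads without floating-point numbers, the function names are hypothetical, and step 4 is intentionally left out because the signature profiles are not yet specified. The error codes are the suggested ones from Section 11.

```python
import copy
import hashlib
import json


def recompute_json_hash(mu):
    """Step 1: apply the hash scope (Section 4.1), canonicalize, and hash."""
    scoped = copy.deepcopy(mu)
    scoped.pop("signatures", None)
    if isinstance(scoped.get("artifacts"), dict):
        scoped["artifacts"].pop("jsonHash", None)
    # Simplified stand-in for MU-JCS-1; see Section 5 for the full profile.
    canonical = json.dumps(
        scoped, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def verify_integrity(mu):
    """Return (code, detail) tuples; an empty list means steps 1-3 passed."""
    errors = []
    recomputed = recompute_json_hash(mu)
    # Step 2: the stored hash must equal the recomputed one.
    if mu.get("artifacts", {}).get("jsonHash") != recomputed:
        errors.append(("MU001", "artifacts.jsonHash does not match recomputed hash"))
    # Step 3: every signature must reference the same hash.
    for index, sig in enumerate(mu.get("signatures", [])):
        if sig.get("canonicalHash") != recomputed:
            errors.append(("MU002", f"signatures[{index}].canonicalHash mismatch"))
    # Step 4 (cryptographic verification) is not attempted in this sketch.
    return errors
```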

## 7. Relationships (`links`)

Each link entry MUST include:

- `rel`: Relation type token (e.g. `derivesFrom`, `supersedes`, `attestsTo`).
- `target`: Reference (MU identifier `mu:<mu_id>`, content hash `hash:<sha256>`, or absolute URI).

Optional field: `note`.

The core specification places no circularity constraints on links (future graph constraints MAY be defined).
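The three accepted `target` forms could be checked along the lines of this Python sketch. It assumes the `hash:` form carries 64 lowercase hex characters (matching Section 4.2), that an `mu_id` contains no whitespace, and that `mu:` and `hash:` prefixes take precedence over generic URI parsing. The names `is_valid_target` and `link_problems` are hypothetical.

```python
import re
from urllib.parse import urlsplit

HASH_TARGET = re.compile(r"hash:[0-9a-f]{64}")
MU_TARGET = re.compile(r"mu:\S+")  # assumption: mu_id has no whitespace


def is_valid_target(target) -> bool:
    """Accept mu:<mu_id>, hash:<sha256>, or an absolute URI."""
    if not isinstance(target, str):
        return False
    if HASH_TARGET.fullmatch(target) or MU_TARGET.fullmatch(target):
        return True
    if target.startswith(("hash:", "mu:")):
        return False  # reserved prefix with a malformed remainder
    try:
        parts = urlsplit(target)
    except ValueError:
        return False
    # An absolute URI has a scheme plus an authority or a path.
    return bool(parts.scheme) and bool(parts.netloc or parts.path)


def link_problems(link):
    """Return a list of problems for one link entry; empty means none found."""
    problems = []
    rel = link.get("rel")
    if not isinstance(rel, str) or not rel:
        problems.append("rel must be a non-empty string")
    if not is_valid_target(link.get("target")):
        problems.append("MU004: invalid relation target format")
    return problems
```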

## 8. Schema Referencing

`metadata.schemaRef.id` MUST resolve (HTTP 200) or be a registered URN. When `schemaRef.hash` is provided it MUST match the SHA-256 of the raw schema content using `sha256:<hex>` prefix format.
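The hash comparison might be sketched in Python as follows. Fetching the schema is left to the caller, and the function names are illustrative. The digest is taken over the raw bytes as retrieved, not over a re-serialized or canonicalized form.

```python
import hashlib


def schema_hash(raw_schema: bytes) -> str:
    """Format the SHA-256 of the raw schema content as sha256:<hex>."""
    return "sha256:" + hashlib.sha256(raw_schema).hexdigest()


def schema_ref_hash_matches(schema_ref: dict, raw_schema: bytes) -> bool:
    """Return True when no hash is declared or the declared hash matches."""
    declared = schema_ref.get("hash")
    if declared is None:
        return True  # schemaRef.hash is optional
    return declared == schema_hash(raw_schema)
```

A `False` result corresponds to the suggested error code `MU003` in Section 11.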

## 9. Versioning Policy

- Additive optional fields: MINOR release.
- Field removal, required additions, semantics change: MAJOR.
- Clarifications / editorial: PATCH.

## 10. Backward Compatibility Guidelines

- Deprecated fields are retained for at least one MAJOR release cycle.
- New fields MUST be optional or have safe defaults.

## 11. Error Codes (Suggested)

| Code  | Condition                                         |
| ----- | ------------------------------------------------- |
| MU001 | Canonical hash mismatch (artifacts vs recomputed) |
| MU002 | Signature canonicalHash mismatch                  |
| MU003 | Unknown schemaRef or schemaRef hash mismatch      |
| MU004 | Invalid relation target format                    |
| MU005 | Invalid anchor type format                        |

## 12. Security Considerations

- Canonicalization attacks are mitigated by the deterministic key ordering and serialization rules of Section 5.
- Replay protection depends on external anchoring (out of scope).
- Privacy classification MUST NOT be treated as enforcement – advisory only.

## 13. IANA / Registry Considerations

None at present (future: anchor type & relation type registries).

## Appendix A. Test Vector (Illustrative)

```
<filled once vectors produced>
```

---

END
