Definition Schema

Complete YAML schema for entities, sources, and schemas. This is the reference for every field and option available in defacto definitions. For a guided introduction, see the definitions guide.

Entity

<entity_name>:
  starts: <state_name>
  identity: { ... }
  properties: { ... }
  states: { ... }
  relationships: [ ... ]
  always: { ... }

| Field | Required | Description |
| --- | --- | --- |
| starts | yes | Initial state for new entities. Must exist in states |
| identity | yes | Identity field configuration |
| properties | no | Typed attributes on the entity |
| states | yes | State machine definition |
| relationships | no | Connections to other entity types |
| always | no | Handlers that fire in any state |
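
The required-field rules in the table above can be checked mechanically. The sketch below is illustrative only: `check_entity` is a hypothetical helper, not part of defacto, and it assumes the entity definition has already been parsed into a plain dict.

```python
REQUIRED_ENTITY_FIELDS = ("starts", "identity", "states")


def check_entity(definition: dict) -> list[str]:
    """Return a list of problems found in one parsed entity definition."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_ENTITY_FIELDS
        if field not in definition
    ]
    states = definition.get("states") or {}
    starts = definition.get("starts")
    # `starts` must name a state that is actually declared under `states`.
    if starts is not None and starts not in states:
        problems.append(f"starts refers to unknown state: {starts}")
    return problems


# Hypothetical entity used only for illustration.
customer = {
    "starts": "lead",
    "identity": {"email": {"match": "exact"}},
    "states": {"lead": {}, "active": {}},
}
```

Calling `check_entity(customer)` returns an empty list; changing `starts` to a name that is not a key of `states` yields one problem entry.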

identity

identity:
  <field_name>:
    match: exact
    normalize: "str::to_lowercase(value)"

| Field | Default | Description |
| --- | --- | --- |
| match | exact | Matching strategy: exact or case_insensitive |
| normalize | none | Expression applied to values before matching. Uses value as the input variable |
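
To make the interaction between normalize and match concrete, here is a minimal sketch. It is not defacto's implementation: `identity_key` is a hypothetical function, a Python callable stands in for the normalize expression, and case-insensitive matching is assumed to mean comparison after case folding.

```python
from typing import Callable, Optional


def identity_key(
    value: str,
    match: str = "exact",
    normalize: Optional[Callable[[str], str]] = None,
) -> str:
    """Build the key that two identity values are compared on."""
    # Normalization runs first, before the matching strategy is applied.
    if normalize is not None:
        value = normalize(value)
    if match == "case_insensitive":
        return value.casefold()
    if match == "exact":
        return value
    raise ValueError(f"unknown match strategy: {match}")
```

Under these assumptions, `A@X.com` and `a@x.com` produce different keys with `match: exact` and no normalize, and the same key once either a lowercasing normalize or `case_insensitive` is configured.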

properties

properties:
  <property_name>:
    type: string
    default: ""
    sensitive: pii
    treatment: mask
    allowed: [free, pro, enterprise]
    min: 0
    max: 10000
    compute: "entity.mrr * 12"

| Field | Required | Description |
| --- | --- | --- |
| type | yes | string, number, integer, boolean, or datetime |
| default | no | Value assigned when the entity is created |
| allowed | no | List of valid values |
| min | no | Minimum numeric value |
| max | no | Maximum numeric value |
| sensitive | no | Sensitivity label: pii, phi, or pci |
| treatment | no | Redaction behavior: hash, mask, or redact |
| compute | no | Expression recalculated after every state change |

Property names cannot collide with system columns. The system columns are {entity_name}_id, {entity_name}_state, valid_from, valid_to, merged_into, last_event_time, state_entered_time, and created_time.
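
Because two of the system columns depend on the entity name, the reserved set differs per entity. A small sketch of the collision check, using hypothetical helper names that are not part of defacto:

```python
def system_columns(entity_name: str) -> set[str]:
    """Reserved column names for one entity type, per the list above."""
    return {
        f"{entity_name}_id",
        f"{entity_name}_state",
        "valid_from",
        "valid_to",
        "merged_into",
        "last_event_time",
        "state_entered_time",
        "created_time",
    }


def colliding_properties(entity_name: str, properties: dict) -> list[str]:
    """Return the property names that clash with a system column."""
    return sorted(set(properties) & system_columns(entity_name))
```

For example, a `customer` entity cannot declare a `customer_id` or `valid_to` property, while `order_id` is fine on `customer` because only `order` reserves it.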

states

states:
  <state_name>:
    when:
      <event_type>:
        guard: "expression"
        effects: [ ... ]
    after:
      - type: inactivity
        threshold: 90d
        effects: [ ... ]

A state with no when and no after is terminal.
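
The terminal-state rule can be expressed as a short check over the parsed states mapping. This is a sketch with a hypothetical helper name; it additionally assumes that an empty when or after counts the same as an absent one.

```python
def terminal_states(states: dict) -> list[str]:
    """List states that declare neither `when` handlers nor `after` rules."""
    terminal = []
    for name, body in states.items():
        body = body or {}  # a bare `churned:` key parses to None in YAML
        if not body.get("when") and not body.get("after"):
            terminal.append(name)
    return terminal


# Hypothetical state machine used only for illustration.
states = {
    "active": {"when": {"cancel": {"effects": [{"transition": {"to": "churned"}}]}}},
    "trial": {
        "after": [
            {
                "type": "state_duration",
                "threshold": "14d",
                "effects": [{"transition": {"to": "churned"}}],
            }
        ]
    },
    "churned": None,
}
```

Here only `churned` is terminal: `active` has an event handler and `trial` has a time rule.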

handler

<event_type>:
  guard: "boolean expression"
  effects:
    - create
    - { transition: { to: <state_name> } }
    - { set: { property: <name>, from: event.<field> } }
    - { set: { property: <name>, value: <literal> } }
    - { set: { property: <name>, compute: "expression" } }
    - { set: { property: <name>, from: event.<field>, condition: "expression" } }
    - { increment: { property: <name>, by: <number> } }
    - { relate: { type: <rel_type>, target: <entity_type>, hints: { <entity>: [<field>] } } }

| Field | Required | Description |
| --- | --- | --- |
| guard | no | Boolean expression. If false, handler is skipped |
| effects | yes | List of effects to apply (at least one) |

effects

| Effect | Fields | Description |
| --- | --- | --- |
| create | (none) | Initialize new entity. Idempotent |
| transition | to | Move to target state |
| set | property, one of from/value/compute, optional condition | Assign property value |
| increment | property, optional by (default 1) | Add to numeric property |
| relate | type, target, optional hints | Create relationship to another entity |

For set, exactly one of from, value, or compute must be specified. from uses dot-path syntax (event.email, entity.mrr). compute evaluates an expression. value is a literal.
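
The "exactly one of" rule for set is easy to get wrong when writing definitions by hand. The following sketch shows one way to check it on a parsed effect body; `check_set_effect` is a hypothetical name and the error messages are illustrative, not defacto's.

```python
SET_SOURCES = ("from", "value", "compute")


def check_set_effect(effect: dict) -> str:
    """Validate the body of a `set` effect and return which source it uses."""
    if "property" not in effect:
        raise ValueError("set effect needs a property")
    present = [key for key in SET_SOURCES if key in effect]
    # Zero sources and two or more sources are both invalid.
    if len(present) != 1:
        raise ValueError(
            f"set effect needs exactly one of {SET_SOURCES}, got {present}"
        )
    return present[0]
```

With this check, `{"property": "plan", "from": "event.plan"}` passes, while a body that supplies both from and value, or none of the three, raises `ValueError`. An optional condition does not affect the count.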

time rules

after:
  - type: inactivity
    threshold: 90d
    effects: [ ... ]

| Field | Required | Description |
| --- | --- | --- |
| type | yes | inactivity, expiration, or state_duration |
| threshold | yes | Duration: 30d, 24h, 90m, 45s |
| effects | yes | List of effects to apply when threshold is met |
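
The threshold examples suggest a whole number followed by a single unit suffix. The sketch below converts such strings to `datetime.timedelta`. It assumes that d, h, m, and s are the only units and that compound values such as `1d12h` are not allowed; the source does not confirm either point, and `parse_threshold` is a hypothetical name.

```python
import re
from datetime import timedelta

# Unit suffix -> timedelta keyword argument (assumed unit set).
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}
_DURATION = re.compile(r"(\d+)([dhms])")


def parse_threshold(text: str) -> timedelta:
    """Convert a threshold such as '90d' or '45s' into a timedelta."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```

Under these assumptions `parse_threshold("24h")` equals `timedelta(days=1)` and `parse_threshold("90m")` equals an hour and a half.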

relationships

relationships:
  - type: placed
    target: order
    cardinality: has_many
    properties:
      total: { type: number }

| Field | Required | Description |
| --- | --- | --- |
| type | yes | Relationship name |
| target | yes | Target entity type (must be defined) |
| cardinality | yes | has_many, has_one, belongs_to, or many_to_many |
| properties | no | Typed attributes on the relationship (same format as entity properties) |

Both sides of a relationship must be declared. Valid cardinality pairs: has_many/belongs_to, has_one/belongs_to, many_to_many/many_to_many.
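
The valid cardinality pairs can be encoded as a small lookup. This sketch treats a pair as unordered, so either entity may hold either side; `is_valid_pair` is a hypothetical helper, not part of defacto.

```python
# Each valid pair as an unordered set; many_to_many pairs with itself.
VALID_PAIRS = {
    frozenset({"has_many", "belongs_to"}),
    frozenset({"has_one", "belongs_to"}),
    frozenset({"many_to_many"}),
}


def is_valid_pair(side_a: str, side_b: str) -> bool:
    """Check the cardinalities declared on the two sides of a relationship."""
    return frozenset({side_a, side_b}) in VALID_PAIRS
```

For instance, if customer declares `placed` as has_many toward order, the matching declaration on order should be belongs_to; has_many on both sides is rejected.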

always

always:
  <event_type>:
    guard: "expression"
    effects: [ ... ]

Uses the same format as state handlers. An always handler fires regardless of the entity's current state. If a state handler and an always handler both match an event, both fire.
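
The lookup implied here can be sketched as follows. `matching_handlers` is a hypothetical function, guards are ignored for brevity, and the order in which the two handlers are returned is an assumption: the text above only states that both fire.

```python
def matching_handlers(entity_def: dict, state: str, event_type: str) -> list[tuple[str, dict]]:
    """Collect the handlers that apply to an event for an entity in `state`."""
    found = []
    state_body = (entity_def.get("states") or {}).get(state) or {}
    state_handlers = state_body.get("when") or {}
    if event_type in state_handlers:
        found.append(("state", state_handlers[event_type]))
    # `always` handlers are consulted no matter which state is current.
    always_handlers = entity_def.get("always") or {}
    if event_type in always_handlers:
        found.append(("always", always_handlers[event_type]))
    return found


# Hypothetical entity used only for illustration.
customer = {
    "states": {
        "lead": {"when": {"signup": {"effects": ["create"]}}},
        "active": {},
    },
    "always": {"signup": {"effects": [{"increment": {"property": "signups"}}]}},
}
```

In this example a `signup` event reaches two handlers while the entity is in `lead`, and only the always handler while it is in `active`.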

Source

<source_name>:
  event_type: <field_name>
  timestamp: <field_name>
  event_id: [<field_name>, ...]

  events:
    <event_name>:
      raw_type: "external.event.name"
      mappings:
        <output_field>:
          from: <source_field>
          compute: "expression"
          type: string
          default: "fallback"
      hints:
        <entity_type>: [<field_name>, ...]

top-level fields

| Field | Required | Description |
| --- | --- | --- |
| event_type | yes | Raw field containing the event type |
| timestamp | yes | Raw field containing the timestamp (ISO 8601 or Unix epoch) |
| event_id | no | Fields to hash for deduplication. Default: event type + all mapped data fields |

event definition

| Field | Required | Description |
| --- | --- | --- |
| raw_type | no | The external event name. Default: use the defacto event name as-is |
| mappings | yes | Field mapping configuration |
| hints | yes | Identity hints connecting events to entities |

field mapping

| Field | Description |
| --- | --- |
| from | Source field. Supports dot-path (user.address.city) and array indexing (items.0.id) |
| compute | Expression using event.* for derived fields |
| type | Type coercion: string, number, integer, boolean, datetime |
| default | Fallback value if the source field is missing or null |

from and compute are mutually exclusive.
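
To illustrate how a from path with dot segments and array indexes might be resolved against a raw event, including the default fallback for missing or null values, here is a minimal sketch. `get_path` is a hypothetical helper; defacto's actual resolution rules may differ in edge cases.

```python
def get_path(raw, path: str, default=None):
    """Follow a dot-path such as 'items.0.id' through nested dicts and lists."""
    current = raw
    for part in path.split("."):
        if isinstance(current, list):
            # Numeric segments index into arrays.
            if not part.isdigit() or int(part) >= len(current):
                return default
            current = current[int(part)]
        elif isinstance(current, dict):
            if part not in current:
                return default
            current = current[part]
        else:
            return default
    # A null at the end of the path also falls back to the default.
    return default if current is None else current


# Hypothetical raw event used only for illustration.
raw_event = {
    "user": {"address": {"city": "Oslo"}, "phone": None},
    "items": [{"id": "sku-1"}],
}
```

With this event, `user.address.city` resolves to `"Oslo"`, `items.0.id` to `"sku-1"`, and both `items.3.id` and `user.phone` fall back to the supplied default.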

hints

hints:
  customer: [email, phone]
  order: [order_id]

Maps entity types to the field names that identify them. A single event can carry hints for multiple entity types.
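
As a rough illustration, the hints block can be read as a recipe for pulling identity values out of a mapped event, grouped by entity type. The helper below is hypothetical, and skipping hint fields that are absent from the event is an assumption rather than documented behavior.

```python
def collect_hints(hints: dict, event: dict) -> dict:
    """Group the identity values found in an event by entity type."""
    return {
        entity_type: {
            field: event[field]
            for field in fields
            if event.get(field) is not None  # skip hint fields the event lacks
        }
        for entity_type, fields in hints.items()
    }


hints = {"customer": ["email", "phone"], "order": ["order_id"]}
event = {"email": "a@example.com", "order_id": "o-1", "total": 40}
```

For this event the result carries one customer hint (`email`) and one order hint (`order_id`); `phone` is omitted because the event does not include it.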

Schema

<event_type>:
  fields:
    <field_name>:
      type: string
      required: true
      allowed: [a, b, c]
      min: 0
      max: 100
      min_length: 1
      regex: "^[a-z]+$"

| Field | Required | Description |
| --- | --- | --- |
| type | yes | string, number, integer, boolean, or datetime |
| required | no | Whether the field must be present and non-null. Default: false |
| allowed | no | List of permitted values |
| min | no | Minimum numeric value (number/integer only) |
| max | no | Maximum numeric value (number/integer only) |
| min_length | no | Minimum string length (string only) |
| regex | no | Regex pattern the string must match (string only) |

Schemas define the contract between sources and entities. Source mappings must produce the fields the schema requires. At registration time, defacto validates that source output matches schema expectations.
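
The field constraints above can be read as a per-field checklist. The following sketch applies them to one mapped event. It is not defacto's validator: `validate_event` and its messages are hypothetical, datetime values are assumed to already be `datetime` objects, and regex is assumed to use search semantics (so anchors such as `^` and `$` must be written explicitly, as in the template).

```python
import re
from datetime import datetime

_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "datetime": datetime,
}


def validate_event(fields: dict, event: dict) -> list[str]:
    """Check one event against a schema's `fields` mapping."""
    errors = []
    for name, rule in fields.items():
        value = event.get(name)
        if value is None:
            if rule.get("required", False):
                errors.append(f"{name}: required field is missing")
            continue
        kind = rule["type"]
        # bool is a subclass of int in Python, so reject it for numeric types.
        wrong_bool = isinstance(value, bool) and kind != "boolean"
        if wrong_bool or not isinstance(value, _TYPES[kind]):
            errors.append(f"{name}: expected {kind}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{name}: value not in allowed list")
        if kind in ("number", "integer"):
            if "min" in rule and value < rule["min"]:
                errors.append(f"{name}: below min")
            if "max" in rule and value > rule["max"]:
                errors.append(f"{name}: above max")
        if kind == "string":
            if "min_length" in rule and len(value) < rule["min_length"]:
                errors.append(f"{name}: shorter than min_length")
            if "regex" in rule and re.search(rule["regex"], value) is None:
                errors.append(f"{name}: does not match regex")
    return errors


# Hypothetical schema fields used only for illustration.
signup_fields = {
    "plan": {"type": "string", "required": True, "allowed": ["free", "pro"]},
    "seats": {"type": "integer", "min": 1, "max": 100},
    "slug": {"type": "string", "min_length": 1, "regex": "^[a-z]+$"},
}
```

An event such as `{"plan": "pro", "seats": 5, "slug": "acme"}` passes with no errors, while one that omits `plan`, sets `seats` to 0, and uses `"Acme"` as the slug reports one error per field.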

File organization

Definitions can be loaded from a directory or passed as a dict.

Directory layout

my-project/
  entities/
    customer.yaml
    order.yaml
  sources/
    app.yaml
    billing.yaml
  schemas/            # optional
    customer_signup.yaml

Each file contains one or more definitions of its type. Directories must use exactly these names (entities/, sources/, schemas/); as noted in the layout, schemas/ is optional.

d = Defacto("my-project/")
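
Before pointing defacto at a directory, it can be useful to confirm that the layout matches. The sketch below uses only the standard library and is not part of defacto. It assumes that entities/ and sources/ are mandatory, that schemas/ is optional, and that definition files use the `.yaml` extension shown in the layout; `find_definition_files` is a hypothetical name.

```python
import tempfile
from pathlib import Path

REQUIRED_DIRS = ("entities", "sources")  # assumed mandatory
OPTIONAL_DIRS = ("schemas",)             # marked optional in the layout


def find_definition_files(root) -> dict[str, list[str]]:
    """Map each definition directory to the sorted YAML file names inside it."""
    root = Path(root)
    missing = [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    if missing:
        raise FileNotFoundError(f"missing directories: {', '.join(missing)}")
    return {
        name: sorted(path.name for path in (root / name).glob("*.yaml"))
        for name in REQUIRED_DIRS + OPTIONAL_DIRS
        if (root / name).is_dir()
    }


# Build a throwaway project that mirrors the layout above, without schemas/.
with tempfile.TemporaryDirectory() as tmp:
    for relative in ("entities/customer.yaml", "entities/order.yaml", "sources/app.yaml"):
        target = Path(tmp, relative)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    found = find_definition_files(tmp)
```

After the block runs, `found` lists two entity files and one source file, and has no `schemas` key because that directory was not created.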

Dict format

d = Defacto({
    "entities": {
        "customer": { "starts": "lead", ... }
    },
    "sources": {
        "app": { "event_type": "type", ... }
    },
    "schemas": {}
})