ingestlayer/docs/v0.4

docs · 02 concepts

Entities.

Events are the history; entities are the state. Every customer you care about, every account, every order — anything keyed by an ID you control, with whatever traits matter to you. Stored once, looked up by any pipeline, by any destination, on the way out the door.

The model

One row per key, open traits.

An entity is a row scoped to your org. Two identifying fields, one open one, and a pair of managed timestamps:

i.
kind
Open string. user, account, order, device — whatever entity types matter in your domain. No catalog to register against.
ii.
external_id
Your own key. The email, the user UUID, the account ID, the order number. Paired with kind, it identifies one row.
iii.
traits
JSONB. Anything you want a pipeline to look up by key. Plan, ARR, account owner, country, signup date, feature flags — the columns are yours.
iv.
timestamps
first_seen and last_seen, maintained for you. last_seen advances on every upsert.

The contract: (org_id, kind, external_id) is unique. Two upserts with the same triple update the same row. Traits deep-merge one level, so a partial update doesn't blank out fields the prior write set.
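To make the contract concrete, here is a minimal in-memory sketch of the upsert semantics. It is a model only, not the actual storage engine: the rows map, rowKey, and upsert names are hypothetical, and it assumes "one level" means top-level trait keys are merged while nested objects are replaced wholesale.

```javascript
// In-memory model of the uniqueness contract; not the real storage engine.
const rows = new Map();

// Build one composite key from the (org_id, kind, external_id) triple.
function rowKey(orgId, kind, externalId) {
  return JSON.stringify([orgId, kind, externalId]);
}

// Upsert: the same triple always lands on the same row.
function upsert(orgId, kind, externalId, traits, now = new Date()) {
  const key = rowKey(orgId, kind, externalId);
  const existing = rows.get(key);
  const row = {
    org_id: orgId,
    kind,
    external_id: externalId,
    // Assumption: "one level" = top-level keys merge, nested objects
    // are replaced wholesale rather than merged recursively.
    traits: { ...(existing ? existing.traits : {}), ...traits },
    first_seen: existing ? existing.first_seen : now,
    last_seen: now, // advances on every upsert
  };
  rows.set(key, row);
  return { created: !existing, row };
}

const first = upsert("org_1", "user", "alice@acme.co", { plan: "growth", arr: 50000 });
const second = upsert("org_1", "user", "alice@acme.co", { plan: "scale" });
// first.created is true, second.created is false
// second.row.traits is { plan: "scale", arr: 50000 }: arr survived the partial update
```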

Populating

Four ways to write entities.

Pick whichever matches where the data lives. They write to the same table; mix them freely.

a · REST API

The standard path. POST /api/entities is upsert: send the canonical state, don't worry whether the row exists. Use it from any backend, any cron, any CRM webhook handler.

curl · upsert one entity
# create or update one entity. POST is upsert by (kind, external_id):
# net-new rows get a 201, existing rows have their traits deep-merged.

$ curl -X POST https://app.ingestlayer.com/api/entities \
    -H "Authorization: Bearer $INGESTLAYER_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "kind":        "user",
      "external_id": "alice@acme.co",
      "traits": {
        "plan":           "growth",
        "arr":            50000,
        "account_owner":  "ben",
        "company_domain": "acme.co"
      }
    }'

Net-new rows return 201; existing rows return 200 with the merged state. Existing traits not mentioned in the request are preserved.
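From application code, the same call might be wrapped in a small helper that tells created rows apart from updated ones. This is a sketch under assumptions: upsertEntity and its options are hypothetical names, it expects Node 18+ for the global fetch, and it assumes both the 200 and the 201 response carry the stored row as JSON.

```javascript
// Sketch of a client helper around POST /api/entities.
// fetchImpl is injectable so the helper can be exercised without a network.
async function upsertEntity(entity, { apiKey, fetchImpl = fetch } = {}) {
  const res = await fetchImpl("https://app.ingestlayer.com/api/entities", {
    method: "POST",
    headers: {
      authorization: `Bearer ${apiKey}`,
      "content-type": "application/json",
    },
    body: JSON.stringify(entity),
  });

  // 201 = net-new row, 200 = existing row merged; anything else is a failure.
  if (res.status !== 200 && res.status !== 201) {
    throw new Error(`entity upsert failed with HTTP ${res.status}`);
  }

  // Assumption: the response body is the stored row in both cases.
  return { created: res.status === 201, row: await res.json() };
}
```

A caller could then branch on the result, for example logging only when created is true.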

b · Bulk sync from your CRM

The API above is the substrate. Run it in a loop from your CRM-of-record once a night and your identity graph stays current with whoever your customers are right now — including the fields your CRM tracks that we don't.

example · nightly sync from CRM
// sync from your CRM in one nightly job
// pseudo-Node, ~30 lines; `crm` stands in for your own CRM client

const customers = await crm.listCustomers();
for (const c of customers) {
  const res = await fetch("https://app.ingestlayer.com/api/entities", {
    method:  "POST",
    headers: {
      authorization: `Bearer ${process.env.INGESTLAYER_KEY}`,
      "content-type": "application/json",
    },
    body: JSON.stringify({
      kind:        "account",
      external_id: c.id,
      traits: {
        name:          c.name,
        plan:          c.plan,
        arr:           c.arr,
        signed_at:     c.created_at,
        account_owner: c.csm_email,
      },
    }),
  });
  // surface failures instead of silently skipping the row
  if (!res.ok) {
    console.error(`upsert failed for ${c.id}: HTTP ${res.status}`);
  }
}
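For larger customer lists, a strictly sequential loop can be slow, and a single thrown error would stop the job partway. One possible variation is sketched below: a fixed number of workers drain the list and failures are collected for a report at the end. syncAll and upsertOne are hypothetical names; upsertOne stands for any async function that writes one customer, such as a wrapper around the fetch call in the loop above. The default of five workers is arbitrary, not a documented rate limit.

```javascript
// Sketch: bounded-concurrency sync that collects failures instead of aborting.
async function syncAll(customers, upsertOne, concurrency = 5) {
  const failures = [];
  let next = 0; // index of the next unclaimed customer

  async function worker() {
    // Each worker claims the next index until the list is drained.
    // JavaScript runs this single-threaded, so next++ cannot race.
    while (next < customers.length) {
      const customer = customers[next++];
      try {
        await upsertOne(customer);
      } catch (err) {
        failures.push({ id: customer.id, error: String(err) });
      }
    }
  }

  const workerCount = Math.min(Math.max(1, concurrency), customers.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return { total: customers.length, failed: failures };
}
```

Because upserts are idempotent by key, re-running the job for only the failed IDs is safe.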

c · Dashboard

The /entities page in the app lists every row, with kind, external_id, trait count, and last-seen. Click a row to see the full traits, copy values, or delete it. Useful for inspection and one-off fixes — programmatic upsert handles the volume.

d · Pipeline action (upsert.entity)

If your events already carry the identity data, write it into the graph as the event passes through. The upsert.entity action maps event fields onto trait names — first signup populates the row, every subsequent event updates it, all without leaving the pipeline.

pipeline.yaml · auto-populate from signups
# example: when a user.signup event arrives, write it into
# entities as a "user" row before the same pipeline fans out.

pipeline: signups
sources:
  - type: sdk.event
    match: user.signup

actions:
  # 01 · populate the identity graph from the event payload
  - upsert.entity:
      kind:        user
      external_id: $event.payload.email
      traits:
        name:         $event.payload.name
        plan:         $event.payload.intended_plan
        utm:          $event.payload.utm
        country:      $event.payload.country
        signed_up_at: now()

  # 02 · use the now-stored traits in the outbound body
  - transform:
      template: |
        {
          email:   $event.payload.email,
          plan:    $entity.user.plan,
          country: $entity.user.country
        }

destinations:
  - webhook.out:
      url: https://webhook.site/...

Note the ordering: actions run in declared order, so place upsert.entity before any enrich.entity that needs to read the just-written traits.
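The effect of declaration order can be illustrated with a toy model. Everything in it (runPipeline, the Map-backed store, and the two stand-in action functions) is hypothetical and only mimics the behavior described here; it is not how the pipeline engine is implemented.

```javascript
// Toy model of "actions run in declared order".
function runPipeline(actions, event, store) {
  const ctx = { event, entity: {} };
  for (const action of actions) action(ctx, store);
  return ctx;
}

// Stand-in for upsert.entity: write traits under a "kind:external_id" key.
const upsertUser = (ctx, store) => {
  const key = `user:${ctx.event.payload.email}`;
  store.set(key, { ...store.get(key), plan: ctx.event.payload.intended_plan });
};

// Stand-in for enrich.entity: copy the stored traits onto ctx.entity.user.
const enrichUser = (ctx, store) => {
  ctx.entity.user = store.get(`user:${ctx.event.payload.email}`);
};

const event = { payload: { email: "alice@acme.co", intended_plan: "growth" } };

// Upsert first: the lookup sees the traits written a moment earlier.
const good = runPipeline([upsertUser, enrichUser], event, new Map());
// Lookup first, for a brand-new user: the row does not exist yet.
const bad = runPipeline([enrichUser, upsertUser], event, new Map());

console.log(good.entity.user); // { plan: 'growth' }
console.log(bad.entity.user);  // undefined
```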

Reading

enrich.entity inside a pipeline.

The mirror of upsert.entity. Resolves a key field against the event, looks up the matching row, merges its traits into $entity.<kind>. From there any later transform, filter, or destination can read them.

pipeline.yaml · enrich + route
# lookup by key, merge traits onto $entity.<kind>
pipeline: routing
actions:
  - enrich.entity:
      kind:        user
      external_id: $event.payload.email

  - transform:
      template: |
        {
          email:   $event.payload.email,
          plan:    $entity.user.plan,
          arr:     $entity.user.arr,
          owner:   $entity.user.account_owner
        }

Missing rows are silent no-ops: downstream templates see undefined for every $entity.<kind> field. Add a filter step after the lookup if you would rather drop unmatched events.
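The same guard matters in any code that consumes the enriched output. As a rough sketch of the logic, assuming a context object shaped like { event, entity } that mirrors $event and $entity (hasEntity and renderOrDrop are hypothetical helpers, and the real filter step has its own syntax):

```javascript
// True only when the lookup for this kind actually matched a row.
function hasEntity(ctx, kind) {
  const traits = ctx.entity?.[kind];
  return traits !== undefined && traits !== null;
}

// Build the outbound body, or return null to signal "drop this event".
function renderOrDrop(ctx) {
  if (!hasEntity(ctx, "user")) return null;
  const { plan, arr, account_owner: owner } = ctx.entity.user;
  return { email: ctx.event.payload.email, plan, arr, owner };
}
```

Without the guard, an unmatched event would still be rendered, just with plan, arr, and owner all undefined.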

Endpoints

The full API surface.

All endpoints are org-scoped. Auth uses the same bearer key as event ingest. Row IDs ({id} in paths) are UUIDs returned from list / create.

Privacy

Where the data lives.

Entities are stored in the same region as your events. No third-party copy, no external enrichment provider sees them. If you need richer fields than your own data carries (job title, employee count, technographics), wire up an enrichment provider separately via enrich.person or enrich.company; your entities stay your own.

