# Cyoda Documentation — full content

Source: https://docs.cyoda.net
Generated: 2026-04-25T01:36:46.515Z

---

## build.md

# Build

Develop Cyoda applications — tier-agnostic patterns that work on any runtime.

Cyoda applications are **digital twins**: the same code runs on every storage tier, from in-memory dev through Cassandra at enterprise scale. Pages in this section cover the patterns — entity modeling, workflows, external processors, testing — independent of where your app runs.

Where to next:

- [Modeling entities](./modeling-entities/)
- [Working with entities](./working-with-entities/)
- [Workflows and processors](./workflows-and-processors/)
- [Client compute nodes](./client-compute-nodes/)
- [Testing with digital twins](./testing-with-digital-twins/)

---

## build/analytics-with-sql.md

# Analytics with SQL

Query entities with Trino SQL — when it's the right surface, how to connect, and where to find the full grammar.

:::caution[Upcoming]
Trino SQL is on the roadmap and not yet available in cyoda-go at this release. This page documents the planned surface; names and shapes may change before release.
:::

Cyoda projects every entity model into a set of virtual SQL tables and exposes them through a Trino connector. Use this surface for cross-entity analytics: joins across entity types, aggregates, reporting, time-series, BI dashboards. For operational read/write, stay on [REST](/build/working-with-entities/); for compute that runs against transitions, use [gRPC compute nodes](/build/client-compute-nodes/).

## When to use SQL

- Ad-hoc analysis against live data in a notebook or BI tool.
- Scheduled reports aggregating entities across a tenant.
- Historical queries using the `point_time` column for as-of reads.
- Cross-entity joins — e.g. orders joined to customers joined to payments.

If the question is *transactional* ("read this one entity", "fire this transition"), it does not belong here. REST is faster, cheaper, and correctly scoped for that.
## Connect The JDBC connection string pattern: ``` jdbc:trino://trino-client-.eu.cyoda.net:443/cyoda/ ``` - `caas_user_id` — your CAAS user ID. - `` — the SQL schema name you configured (create one in the Cyoda UI under **Trino/SQL**, or via `PUT /sql/schema/putDefault/{schemaName}`). Authenticate with a bearer token issued by the platform. For technical-user setup and the OAuth 2.0 client-credentials flow, see [Authentication and identity](/concepts/authentication-and-identity/). ## A first query Given an entity model `orders` with nested `lines`, Cyoda produces one table per nested level plus a JSON reconstruction table: - `orders` — root columns + top-level fields - `orders_lines` — one row per line item, with `index_0` marking position - `orders_json` — the complete JSON document per entity Read a single order and its lines: ```sql SELECT o.entity_id, o.state, o.customer_id, l.index_0 AS line_no, l.sku, l.quantity, l.price FROM orders o JOIN orders_lines l ON l.entity_id = o.entity_id WHERE o.entity_id = '00000000-0000-0000-0000-000000000001'; ``` Query as of last Tuesday: ```sql SELECT * FROM orders WHERE point_time = TIMESTAMP '2026-04-14 00:00:00' AND state = 'submitted'; ``` ## Table-naming rules, at a glance - Root node of an entity → table named after the model (e.g. `orders`). - Array-of-objects node → `_` (e.g. `orders_lines`). - Multi-dimensional arrays → `__d_array` (detached array naming). - JSON reconstruction → `_json`. The full projection rules — node decomposition, detached arrays, type mapping, polymorphic fields — live in the [Trino SQL reference](/reference/trino/). ## Performance notes - When querying a `_json` table, **always include `entity_id` in the WHERE clause**. Without that predicate the query scans the reconstruction table for every entity and gets very slow. - For joins across nested-array tables, use `entity_id` plus matching `index_*` columns as join keys. - Omit `point_time` unless you actually need historical data. 
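For BI tools, the connection string above is all you need; from application code, plain JDBC works too. The sketch below is illustrative, not an official client: the way the CAAS user ID and schema name slot into the URL is an assumption inferred from the pattern above, and `TrinoConnect`, the `analytics` schema, and the `CYODA_TOKEN` variable are made-up names. `accessToken` and `SSL` are standard Trino JDBC driver properties for bearer-token authentication over TLS.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.util.Properties;

public class TrinoConnect {

    // ASSUMPTION: the CAAS user ID fills the host slot and the schema name the
    // trailing slot of the pattern shown above; confirm against your tenant's URL.
    static String buildJdbcUrl(String caasUserId, String schemaName) {
        return "jdbc:trino://trino-client-" + caasUserId
                + ".eu.cyoda.net:443/cyoda/" + schemaName;
    }

    public static void main(String[] args) throws Exception {
        String token = System.getenv("CYODA_TOKEN"); // hypothetical env var
        if (token == null) {
            return; // no credentials available — nothing to do
        }
        Properties props = new Properties();
        props.setProperty("accessToken", token); // Trino JDBC bearer-token property
        props.setProperty("SSL", "true");        // TLS on port 443

        try (Connection conn = DriverManager.getConnection(
                buildJdbcUrl("a1b2c3", "analytics"), props);
             ResultSet rs = conn.createStatement()
                     .executeQuery("SELECT count(*) FROM orders")) {
            while (rs.next()) {
                System.out.println("orders: " + rs.getLong(1));
            }
        }
    }
}
```

The Trino JDBC driver (`io.trino:trino-jdbc`) must be on the classpath for `DriverManager` to resolve the `jdbc:trino:` scheme.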
## Where to go next - [Trino SQL reference](/reference/trino/) — full projection rules, type mapping, polymorphic fields, complete worked example. - [APIs and surfaces](/concepts/apis-and-surfaces/) — when to pick REST vs gRPC vs SQL. - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the entity model whose shape becomes your SQL tables. --- ## build/client-compute-nodes.md Patterns for processor and criteria services — implementation, registration, and lifecycle. # 1. Architecture Overview A **calculation member** is an external gRPC client that participates in entity workflow processing on the Cyoda platform. The platform delegates work to your client over a persistent bidirectional gRPC stream, and your client returns results on the same stream. For the rationale behind preferring gRPC over HTTP for compute nodes, see [APIs and surfaces](/concepts/apis-and-surfaces/). ``` ┌──────────────────────┐ gRPC (bidirectional stream) ┌─────────────────────────┐ │ Cyoda Platform │ ◄──────────────────────────────────────────►│ Your Calculation │ │ │ CloudEvent (Protobuf, JSON payload) │ Member (Client) │ │ ┌────────────────┐ │ │ │ │ │ Workflow Engine│ │ 1. Client opens stream, sends Join │ ┌───────────────────┐ │ │ │ │ │ 2. Server responds with Greet │ │ Business Logic │ │ │ │ - Processors │──┼──3. Server pushes Processing/Criteria reqs──┼──│ │ │ │ │ - Criteria │ │ 4. Client returns responses │ │ - Data transforms │ │ │ │ │ │ 5. 
Keep-alive heartbeats (bidirectional) │ │ - Criteria checks │ │ │ └────────────────┘ │ │ └───────────────────┘ │ └──────────────────────┘ └─────────────────────────┘ ``` Two types of work can be delegated: | Use Case | Description | Request Type | Response Type | |---|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---|---| | **Processing** | Perform actions, such as transforming entity data during a workflow transition, performing CRUD ops on other entities, running reports, interacting with other systems, etc. | `EntityProcessorCalculationRequest` | `EntityProcessorCalculationResponse` | | **Criteria Evaluation** | Evaluate a boolean condition (e.g., "should this transition fire?") | `EntityCriteriaCalculationRequest` | `EntityCriteriaCalculationResponse` | ## 1.1 Protocol Summary - **Transport**: gRPC bidirectional streaming via `CloudEventsService.startStreaming` - **Message format**: [CNCF CloudEvents](https://cloudevents.io/) Protobuf envelope with JSON `text_data` payload - **Authentication**: Bearer JWT token in gRPC `Authorization` metadata header - **Auth context propagation**: The platform attaches [CloudEvents Auth Context extension](https://github.com/cloudevents/spec/blob/main/cloudevents/extensions/authcontext.md) attributes to processor and criteria requests, identifying the principal whose action triggered the workflow (see [Section 8](#8-auth-context-on-incoming-requests)) - **Serialization**: All payloads are JSON-serialized inside CloudEvent `text_data` (not binary protobuf) --- # 2. 
Prerequisites

## 2.1 Proto Definitions

Your client needs the following proto files to generate gRPC stubs:

- **`cloudevents.proto`** — The standard CloudEvents Protobuf message definition (package `io.cloudevents.v1`)
- **`cyoda-cloud-api.proto`** — The Cyoda service definition (package `org.cyoda.cloud.api.grpc`)

The service definition:

```protobuf
service CloudEventsService {
  rpc startStreaming(stream io.cloudevents.v1.CloudEvent)
      returns (stream io.cloudevents.v1.CloudEvent);
}
```

The CloudEvent message:

```protobuf
message CloudEvent {
  string id = 1;           // Unique event ID (UUID recommended)
  string source = 2;       // URI-reference identifying the event source
  string spec_version = 3; // Must be "1.0"
  string type = 4;         // Event type string (see Section 4)
  map<string, CloudEventAttributeValue> attributes = 5;
  oneof data {
    bytes binary_data = 6;
    string text_data = 7;  // ← Used by Cyoda (JSON payload)
    google.protobuf.Any proto_data = 8;
  }
}
```

## 2.2 JWT Authentication Token

Obtain a valid JWT Bearer token from the Cyoda IAM system (OAuth 2.0 client credentials flow). The token must contain:

- A valid `caas_org_id` claim (your legal entity ID)
- Valid user roles

The token is validated on every stream establishment. If the token expires during an active stream, the stream remains valid — re-authentication occurs only when a new stream is opened.

## 2.3 Dependencies (Java/Kotlin Example)

For JVM-based clients, the recommended dependencies are:

- `io.grpc:grpc-stub`, `io.grpc:grpc-protobuf`, `io.grpc:grpc-netty-shaded` — gRPC runtime
- `io.cloudevents:cloudevents-protobuf` — CloudEvents SDK Protobuf format support
- `io.cloudevents:cloudevents-core` — CloudEvents SDK core
- `com.fasterxml.jackson.core:jackson-databind` — JSON serialization

---

# 3.
Connection Setup ## 3.1 Create the gRPC Channel ```java ManagedChannel channel = ManagedChannelBuilder .forAddress("cyoda-host.example.com", 50051) .usePlaintext() // Use .useTransportSecurity() for TLS in production .keepAliveTime(30, TimeUnit.SECONDS) .keepAliveTimeout(10, TimeUnit.SECONDS) .build(); ``` **Production TLS**: In production, always use TLS. Replace `.usePlaintext()` with: ```java .useTransportSecurity() .sslContext(/* your SSL context */) ``` ## 3.2 Attach JWT Credentials Create a `CallCredentials` implementation that injects the `Authorization` header: ```java CallCredentials callCredentials = new CallCredentials() { @Override public void applyRequestMetadata(RequestInfo requestInfo, Executor executor, MetadataApplier applier) { executor.execute(() -> { Metadata headers = new Metadata(); headers.put( Metadata.Key.of("Authorization", Metadata.ASCII_STRING_MARSHALLER), "Bearer " + jwtTokenSupplier.get() // Always fetch a fresh token ); applier.apply(headers); }); } }; ``` ## 3.3 Create the Stub ```java CloudEventsServiceGrpc.CloudEventsServiceStub asyncStub = CloudEventsServiceGrpc .newStub(channel) .withCallCredentials(callCredentials) .withWaitForReady(); // Wait for the channel to become ready before sending ``` --- # 4. CloudEvent Type System Every message on the stream is a CloudEvent with a `type` field that determines how to deserialize the JSON `text_data`. 
Your client must handle the following types: | CloudEvent `type` | Direction | Description | |---|---|---| | `CalculationMemberJoinEvent` | Client → Server | Register as a calculation member | | `CalculationMemberGreetEvent` | Server → Client | Server confirms registration | | `CalculationMemberKeepAliveEvent` | Bidirectional | Heartbeat probe and response | | `EventAckResponse` | Server → Client | Acknowledgment of keep-alive | | `EntityProcessorCalculationRequest` | Server → Client | Process entity data | | `EntityProcessorCalculationResponse` | Client → Server | Return processed entity data | | `EntityCriteriaCalculationRequest` | Server → Client | Evaluate a boolean criterion | | `EntityCriteriaCalculationResponse` | Client → Server | Return criterion result | ## 4.1 Building a CloudEvent To send a CloudEvent on the stream (Java/Kotlin with CloudEvents SDK): ```java // 1. Build the CloudEvents SDK event io.cloudevents.CloudEvent sdkEvent = CloudEventBuilder.v1() .withType("CalculationMemberJoinEvent") // Must match the type table above .withSource(URI.create("my-calculation-member")) .withId(UUID.randomUUID().toString()) .withData(PojoCloudEventData.wrap(event, e -> objectMapper.writeValueAsBytes(e))) .build(); // 2. Serialize to Protobuf EventFormat protobufFormat = EventFormatProvider.getInstance() .resolveFormat("application/cloudevents+protobuf"); byte[] protoBytes = protobufFormat.serialize(sdkEvent); // 3. 
// Parse to the gRPC CloudEvent message
io.cloudevents.v1.proto.CloudEvent grpcEvent =
    io.cloudevents.v1.proto.CloudEvent.parseFrom(protoBytes);
```

## 4.2 Parsing a Received CloudEvent

```java
// From the gRPC StreamObserver.onNext(value):
String eventType = value.getType();
String jsonPayload = value.getTextData();

// Deserialize based on type
switch (eventType) {
    case "CalculationMemberGreetEvent":
        GreetEvent greet = objectMapper.readValue(jsonPayload, GreetEvent.class);
        break;
    case "EntityProcessorCalculationRequest":
        ProcessorRequest req = objectMapper.readValue(jsonPayload, ProcessorRequest.class);
        break;
    // ... etc
}
```

---

# 5. Connection Lifecycle

## 5.1 Open the Stream

```java
StreamObserver<CloudEvent> requestObserver = asyncStub.startStreaming(
    new StreamObserver<CloudEvent>() {
        @Override
        public void onNext(CloudEvent value) {
            // Dispatch based on value.getType() — see Sections 6–8
        }

        @Override
        public void onError(Throwable t) {
            // Connection lost — trigger reconnect (see Section 11)
        }

        @Override
        public void onCompleted() {
            // Server closed the stream — trigger reconnect
        }
    }
);
```

## 5.2 Join Handshake

Immediately after opening the stream, send a `CalculationMemberJoinEvent`:

```json
{
  "id": "",
  "tags": ["my-processor-tag", "production"]
}
```

**Tags** are critical for routing. The platform routes processing/criteria requests to members whose tags are a **superset** of the tags configured on the workflow processor/criterion. Tags are case-insensitive (lowercased server-side).

The server responds with a `CalculationMemberGreetEvent`:

```json
{
  "id": "",
  "success": true,
  "memberId": "",
  "joinedLegalEntityId": ""
}
```

**Store the `memberId`** — you will need it for keep-alive messages. If `success` is `false`, inspect the `error` object for the failure reason (e.g., subscription limit exceeded, invalid token).

## 5.3 Keep-Alive

The platform periodically probes your member with `CalculationMemberKeepAliveEvent` messages to verify liveness.
You **must** respond to each probe with an `EventAckResponse`. **Server-initiated keep-alive probe** (Server → Client): ```json { "id": "", "memberId": "" } ``` **Required response** (Client → Server): ```json { "id": "", "sourceEventId": "", "success": true } ``` You may also send **client-initiated keep-alive** messages to confirm your own liveness. The server will respond with an `EventAckResponse`. **Timing parameters** (server-side defaults): | Parameter | Default | Description | |---|---|---| | Keep-alive probe interval | 1,000 ms | How often the server probes | | Max idle interval | 3,000 ms | How long before a member is marked as not alive | | Keep-alive check timeout | 1,000 ms | How long the server waits for a probe response | A member is marked not alive when a probe times out (keep-alive check timeout, default 1,000 ms) **and** the max idle interval (default 3,000 ms) has been exceeded since the last successful probe response. Both conditions must hold — a single slow probe within the idle window does not mark the member dead. **If your member is marked as not alive, the platform will not route requests to it.** The member remains registered but idle. Responding to a subsequent keep-alive probe restores the alive status. > ⚠️ **Critical**: Failing to respond to keep-alive probes will cause your member to be marked as dead. Ensure your keep-alive response handler is fast and non-blocking. --- # 6. Handling Processor Requests When an entity reaches a workflow transition with an externalized processor configured to match your member's tags, the platform sends an `EntityProcessorCalculationRequest`. 
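Before looking at the wire schemas, it helps to see the shape of a typical client: one member usually serves several processors, dispatching on the `processorName` field carried in each request. A minimal sketch of that registry pattern — `ProcessorRegistry` and the handler names are hypothetical, and the entity payload is reduced to a `String` for brevity:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class ProcessorRegistry {

    // Map processorName -> business-logic handler. A handler receives the
    // entity JSON (a plain String here) and returns the modified JSON.
    private final Map<String, UnaryOperator<String>> handlers = new HashMap<>();

    public void register(String processorName, UnaryOperator<String> handler) {
        handlers.put(processorName, handler);
    }

    // Dispatch a request by its processorName; an unknown name is a
    // configuration error and should surface as a success=false response.
    public String dispatch(String processorName, String entityJson) {
        UnaryOperator<String> handler = handlers.get(processorName);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for processor: " + processorName);
        }
        return handler.apply(entityJson);
    }
}
```

The same pattern extends naturally to criteria handlers, keyed by `criteriaName`.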
## 6.1 Request Schema ```json { "id": "", "requestId": "", "entityId": "", "processorId": "", "processorName": "", "transactionId": "", "workflow": { "id": "", "name": "" }, "transition": { "id": "", "name": "", "stateFrom": "", "stateTo": "" }, "parameters": { /* arbitrary JSON configured on the processor */ }, "payload": { "type": "TREE", "data": { /* entity data as JSON — present only if attachEntity=true */ }, "meta": { /* entity metadata */ } } } ``` **Key fields**: - `requestId` — You **must** echo this back in the response for correlation. - `entityId` — The entity being processed. Echo this back. - `processorName` — Use this to dispatch to different business logic handlers. - `parameters` — Arbitrary JSON configured on the processor in the workflow definition (the `context` field). Use for passing configuration to your handler. - `payload.data` — The entity data. Only present when `attachEntity` is `true` in the workflow configuration. > 💡 **Auth context**: The CloudEvent envelope for this request also carries auth context extension attributes (`authtype`, `authid`, `authclaims`) identifying the principal whose action triggered the workflow. See [Section 8](#8-auth-context-on-incoming-requests) for details on how to extract them. ## 6.2 Response Schema ```json { "id": "", "requestId": "", "entityId": "", "success": true, "payload": { "type": "TREE", "data": { /* modified entity data to write back */ } } } ``` **Rules**: 1. **`requestId`** must exactly match the value from the request. 2. **`entityId`** must exactly match the value from the request. 3. If you set `success: true`, the platform applies your `payload.data` to the entity. 4. If you set `success: false`, the platform treats this as a processing failure. Include an `error` object. 5. The `payload` field is optional. If omitted (or `payload.data` is null), no data modification occurs. 
## 6.3 Error Response ```json { "id": "", "requestId": "", "entityId": "", "success": false, "error": { "code": "BUSINESS_ERROR", "message": "Detailed error description", "retryable": true } } ``` The `error.retryable` flag tells the platform whether it should retry the request on a different member (if a retry policy is configured). Set to `true` for transient failures and `false` for permanent failures. --- # 7. Handling Criteria Requests When a workflow transition has an externalized criterion configured as a `function`, the platform sends an `EntityCriteriaCalculationRequest`. ## 7.1 Request Schema ```json { "id": "", "requestId": "", "entityId": "", "criteriaId": "", "criteriaName": "", "target": "TRANSITION", "transactionId": "", "workflow": { "id": "", "name": "" }, "transition": { "id": "", "name": "", "stateFrom": "", "stateTo": "" }, "processor": { "id": "", "name": "" }, "parameters": { /* arbitrary JSON */ }, "payload": { "type": "TREE", "data": { /* entity data */ } } } ``` **The `target` field** indicates what the criterion is attached to: | Target | Meaning | Available Context | |---|---|---| | `WORKFLOW` | Workflow-level criterion (selects which workflow applies) | `workflow` | | `TRANSITION` | Transition-level criterion (should this transition fire?) | `workflow`, `transition` | | `PROCESSOR` | Processor-level criterion (should this processor run?) | `workflow`, `transition`, `processor` | | `NA` | Reserved for future use | — | > 💡 **Auth context**: Like processor requests, criteria requests also carry auth context extension attributes on the CloudEvent envelope. See [Section 8](#8-auth-context-on-incoming-requests). ## 7.2 Response Schema ```json { "id": "", "requestId": "", "entityId": "", "success": true, "matches": true, "reason": "Entity meets all validation criteria" } ``` **Key fields**: - `requestId` — Must exactly match the request. - `entityId` — Must exactly match the request. 
- `matches` — The boolean result: `true` means the criterion is satisfied (transition fires / processor runs), `false` means it is not. - `reason` — Optional human-readable explanation (useful for debugging). If `success: false`, the platform treats it as a criteria evaluation failure (the criterion evaluates to `false` by default). --- # 8. Auth Context on Incoming Requests The platform attaches [CloudEvents Auth Context extension](https://github.com/cloudevents/spec/blob/main/cloudevents/extensions/authcontext.md) attributes to every `EntityProcessorCalculationRequest` and `EntityCriteriaCalculationRequest`. These attributes identify the authenticated principal whose action triggered the workflow execution (e.g., the user who created or updated the entity). ## 8.1 Extension Attributes The auth context is carried as CloudEvent extension attributes in the Protobuf `attributes` map — **not** inside the JSON `text_data` payload. | Attribute | Type | Required | Description | |---|---|---|---| | `authtype` | String | YES | Principal type. One of: `user`, `service_account`, `system`, `unauthenticated`, `unknown` | | `authid` | String | NO | Unique identifier of the principal (UUID). Absent for `system` or `unauthenticated`. | | `authclaims` | String | NO | JSON string containing claims about the principal (e.g., `legalEntityId`, `roles`). Does not contain credentials. | ## 8.2 Auth Type Values | `authtype` Value | Meaning | |---|---| | `user` | A regular authenticated user (JWT-based login) | | `service_account` | A machine-to-machine (M2M) technical user | | `system` | An internal platform trigger (no user context, e.g., scheduled transitions) | | `unauthenticated` | No authentication context was available | | `unknown` | Reserved for future use | ## 8.3 Extracting Auth Context (Java/Kotlin) The attributes are available in the Protobuf CloudEvent's `attributes` map. 
The keys are the attribute names listed above (no prefix):

```java
// From the gRPC StreamObserver.onNext(value):
String authType = value.getAttributesMap().get("authtype").getCeString();

String authId = value.getAttributesMap().containsKey("authid")
    ? value.getAttributesMap().get("authid").getCeString()
    : null;

String authClaimsJson = value.getAttributesMap().containsKey("authclaims")
    ? value.getAttributesMap().get("authclaims").getCeString()
    : null;

// Parse claims if present
if (authClaimsJson != null) {
    Map<String, Object> claims = objectMapper.readValue(authClaimsJson, Map.class);
    String legalEntityId = (String) claims.get("legalEntityId");
    List<String> roles = (List<String>) claims.get("roles"); // may be null for plain IUser
}
```

The exact accessor depends on your gRPC tooling — in Go, use the generated message's `GetAttributes()` method; in Python, dict-like indexing on `.attributes`. See your language's generated proto bindings.

## 8.4 Example Claims JSON

```json
{
  "legalEntityId": "acme-corp",
  "roles": ["USER", "SUPER_USER"]
}
```

For `service_account` (M2M) users:

```json
{
  "legalEntityId": "acme-corp",
  "roles": ["M2M"]
}
```

## 8.5 Use Cases

- **Audit logging**: Record which user triggered the processing for compliance.
- **Authorization decisions**: Apply different business logic based on the caller's roles or legal entity.
- **Multi-tenant isolation**: Verify the triggering principal belongs to the expected tenant.
- **Debugging**: Trace processing failures back to the originating user action.

> ⚠️ **Note**: The `authclaims` field never contains credentials (passwords, tokens, secrets). It contains only identity and authorization metadata.

---

# 9. Workflow Configuration

Your calculation member does not exist in isolation — it is invoked by workflow configurations on the platform side. This section describes how workflows reference externalized processors and criteria, so you understand the relationship between your member's tags/handlers and the platform configuration.
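Tying the two sides together: a member receives a request only when its join tags cover the processor's `calculationNodesTags` (comma/semicolon-separated, case-insensitive, superset semantics — see the join handshake in Section 5.2). A sketch of that eligibility rule, with `TagRouting` as a hypothetical helper:

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class TagRouting {

    // Parse the comma/semicolon-separated calculationNodesTags string,
    // lowercasing each tag (tags are lowercased server-side).
    static Set<String> parseTags(String tags) {
        return Arrays.stream(tags.split("[,;]"))
                .map(String::trim)
                .filter(t -> !t.isEmpty())
                .map(t -> t.toLowerCase(Locale.ROOT))
                .collect(Collectors.toSet());
    }

    // A member is eligible when its tag set is a superset of the tags
    // configured on the processor/criterion.
    static boolean eligible(Set<String> memberTags, String calculationNodesTags) {
        return memberTags.containsAll(parseTags(calculationNodesTags));
    }
}
```

If a request never arrives, this is the first check to reproduce locally: parse both tag sets and verify the superset relation holds (see also the troubleshooting table in Section 13).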
## 9.1 Externalized Processor in Workflow JSON ```json { "workflows": [{ "version": "1", "name": "my-workflow", "initialState": "start", "states": { "start": { "transitions": [{ "name": "process-data", "next": "processed", "manual": false, "processors": [{ "name": "my-processor-function", "executionMode": "SYNC", "config": { "attachEntity": true, "calculationNodesTags": "my-processor-tag", "responseTimeoutMs": 60000, "retryPolicy": "FIXED", "context": "{\"key\": \"value\"}" } }] }] }, "processed": {} } }] } ``` ## 9.2 Processor Configuration Fields | Field | Type | Default | Description | |---|---|---|---| | `name` | string | — | **Required.** The processor name. Sent as `processorName` in the request. | | `executionMode` | string | — | **Required.** One of `SYNC`, `ASYNC_SAME_TX`, `ASYNC_NEW_TX`. | | `config.attachEntity` | boolean | `true` | Whether to send entity data in the request payload. | | `config.calculationNodesTags` | string | `""` | Comma/semicolon-separated tags. Only members whose tags are a superset are eligible. | | `config.responseTimeoutMs` | long | `60000` | How long the platform waits for your response before timing out. | | `config.retryPolicy` | string | `FIXED` | `NONE` — no retry. `FIXED` — retry with fixed delay (default: 3 retries, 500ms delay). | | `config.context` | string | `null` | Arbitrary string passed as `parameters` in the request. Use for handler-specific configuration. | | `config.asyncResult` | boolean | `false` | Enable async response processing (advanced). | | `config.crossoverToAsyncMs` | long | `5000` | Time before switching from sync to async response handling (advanced). | ## 9.3 Execution Modes | Mode | Behavior | |---|---| | `SYNC` | The workflow engine waits for your response within the same transaction. The transition completes only after your response is applied. | | `ASYNC_SAME_TX` | The engine sends the request and can process other work. Your response is applied within the same entity transaction. 
| | `ASYNC_NEW_TX` | Like `ASYNC_SAME_TX`, but your response is applied in a new transaction. Useful for long-running computations. | > For most use cases, **`SYNC`** is the simplest and recommended starting point. ## 9.4 Externalized Criteria (Function) in Workflow JSON ```json { "transitions": [{ "name": "conditional-transition", "next": "target-state", "manual": false, "criterion": { "type": "function", "function": { "name": "my-criteria-function", "config": { "attachEntity": true, "calculationNodesTags": "my-processor-tag", "responseTimeoutMs": 5000, "retryPolicy": "NONE" } } } }] } ``` Criteria functions use the same `config` fields as processors (except `asyncResult` and `crossoverToAsyncMs`, which are not applicable to criteria). ## 9.5 Retry Policies | Policy | Behavior | |---|---| | `NONE` | No retry. If the member fails or times out, the processing fails. | | `FIXED` | Retries up to N times (default: 3) with a fixed delay (default: 500ms) between retries. Each retry attempts a **different** member if available (the failed member is excluded from selection). | --- # 10. BaseEvent Schema All events on the stream extend the `BaseEvent` schema: ```json { "id": "", "success": true, "error": { "code": "", "message": "", "retryable": false }, "warnings": [""] } ``` - `id` — Every event must have a unique ID (UUID recommended). - `success` — Defaults to `true`. Set to `false` to indicate an error. - `error` — Only relevant when `success` is `false`. The `code` and `message` fields are required within the error object. - `warnings` — Optional array of warning strings. --- # 11. Production Robustness ## 11.1 Reconnection Strategy gRPC streams can be terminated by network issues, server restarts, or load balancer timeouts. Implement automatic reconnection: 1. **Detect disconnection** via `onError` or `onCompleted` on the response observer. 2. **Back off exponentially** — start at 1 second, cap at 60 seconds. 3. 
**Re-join after reconnect** — every new stream requires a fresh `CalculationMemberJoinEvent`. 4. **Refresh the JWT token** before reconnecting if it is near expiry. ``` ┌─────────┐ onError/onCompleted ┌──────────┐ delay ┌──────────────┐ success ┌──────┐ │ Connected│ ──────────────────────► │ Backoff │ ────────► │ Reconnecting │ ────────────► │ Join │ └─────────┘ └──────────┘ └──────────────┘ └──────┘ ▲ │ failure │ │ ▼ │ │ ┌──────────┐ │ │ │ Backoff │ (increase delay) │ │ └──────────┘ │ └────────────────────────────────────────────────────────────────────────────────────────┘ Greet received ``` ## 11.2 Thread Safety The gRPC `StreamObserver` is **not thread-safe**. If your business logic runs on multiple threads, synchronize all calls to `observer.onNext()`: ```java synchronized (requestObserver) { requestObserver.onNext(cloudEvent); } ``` ## 11.3 Response Timeouts Your client must respond within the configured `responseTimeoutMs` (default: 60 seconds). If you exceed this: - The platform considers the request failed. - If retry policy is `FIXED`, the platform retries with a different member. - Late responses are silently discarded. Design your business logic to complete well within the timeout, accounting for network latency. ## 11.4 Idempotency In edge cases (e.g., network partitions, retries), you may receive the same request more than once. Use the `requestId` as an idempotency key to avoid processing the same request twice. ## 11.5 Graceful Shutdown When shutting down your client: 1. Stop accepting new requests (drain in-flight work). 2. Complete any pending responses and send them. 3. Close the gRPC stream via `requestObserver.onCompleted()`. 4. Shut down the `ManagedChannel` with a grace period: ```java channel.shutdown().awaitTermination(10, TimeUnit.SECONDS); ``` The platform will detect the stream closure and broadcast a member-offline event to the cluster. Pending requests that were in-flight will time out and may be retried on other members. 
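The backoff schedule from Section 11.1 (start at 1 second, cap at 60 seconds) can be sketched as a pure function. `ReconnectBackoff` is a hypothetical helper, and doubling between attempts is an assumption — the section fixes only the start and the cap:

```java
public class ReconnectBackoff {

    private static final long INITIAL_MS = 1_000;  // start at 1 second
    private static final long MAX_MS = 60_000;     // cap at 60 seconds

    // Delay before reconnect attempt n (0-based): 1s, 2s, 4s, ..., capped at 60s.
    static long delayMs(int attempt) {
        long delay = INITIAL_MS << Math.min(attempt, 6); // clamp the shift near the cap
        return Math.min(delay, MAX_MS);
    }
}
```

In practice, add random jitter to each delay so that a fleet of members restarting together does not reconnect in lockstep.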
## 11.6 Multiple Members You can run **multiple calculation member instances** (same or different processes) with the same tags for horizontal scaling and high availability. The platform selects one eligible member per request, preferring members connected to the local cluster node. Running at least two members ensures continued processing if one goes down. ## 11.7 Monitoring Track these metrics in your client: - **Request count** by type (processor vs. criteria) and result (success vs. failure) - **Response latency** (time from receiving request to sending response) - **Keep-alive response time** - **Reconnection count and frequency** - **Stream errors** (by gRPC status code) --- # 12. Quick Reference — Message Flow ``` Client Server │ │ │──── startStreaming() ─────────────────────────►│ (open bidirectional stream) │ │ │──── CalculationMemberJoinEvent ───────────────►│ (register with tags) │◄─── CalculationMemberGreetEvent ───────────────│ (server confirms, assigns memberId) │ │ │◄─── CalculationMemberKeepAliveEvent ───────────│ (periodic heartbeat probe) │──── EventAckResponse ─────────────────────────►│ (ack the probe) │ │ │◄─── EntityProcessorCalculationRequest ─────────│ (process this entity) │──── EntityProcessorCalculationResponse ───────►│ (here's the result) │ │ │◄─── EntityCriteriaCalculationRequest ──────────│ (evaluate this criterion) │──── EntityCriteriaCalculationResponse ────────►│ (matches: true/false) │ │ │──── CalculationMemberKeepAliveEvent ──────────►│ (client-initiated heartbeat) │◄─── EventAckResponse ─────────────────────────│ (server acks) │ │ ``` --- # 13. Troubleshooting | Symptom | Likely Cause | Fix | |---|---|---| | `UNAUTHENTICATED` on stream open | Missing/invalid/expired JWT token | Refresh JWT before connecting. Ensure `Authorization: Bearer ` header. | | `NOT_FOUND` after JWT validation | User not found in Cyoda for the given JWT | Verify user enrollment and legal entity configuration. 
| | Greet event has `success: false` | Subscription limit exceeded (max client nodes) | Check your subscription plan limits. | | Member marked as not alive | Keep-alive responses too slow or missing | Ensure non-blocking, fast keep-alive handler. Check network latency. | | Requests not arriving | Tags mismatch | Verify your member's tags are a superset of the workflow processor's `calculationNodesTags`. Tags are case-insensitive. | | Requests not arriving | Member on wrong legal entity | Requests only route to members in the same legal entity as the entity owner. | | Request timeout | Business logic too slow | Optimize processing time or increase `responseTimeoutMs` in workflow config. | | Duplicate requests | Retry policy triggered | Implement idempotency using `requestId`. | | Stream drops unexpectedly | Server restart, network issue, idle timeout | Implement reconnection with exponential backoff (Section 11.1). | | `authtype` is `system` unexpectedly | Workflow triggered by an internal platform action (e.g., scheduled transition) or no user context was available | This is expected for system-initiated workflows. If you expect a user context, verify the originating API call is authenticated. | | `authclaims` is missing | The triggering principal is a plain `IUser` without extended claims, or the auth type is `system`/`unauthenticated` | Only `user` and `service_account` auth types include claims. Check `authtype` before parsing claims. | --- ## build/modeling-entities.md # Modeling entities Design patterns for entity schemas — boundaries, evolution, and validation. Modeling well in Cyoda comes down to drawing the right boundaries between entities, letting the schema grow with the data, and treating validation as the job of the platform rather than the application layer. This page covers the patterns worth knowing before you ship a first model. ## One entity per noun The simplest rule: every domain noun that has an independent lifecycle is its own entity. 
An `Order` has its own states, its own history, and its own audit trail; so does a `Customer`. They relate via references, not by embedding. A useful test: *does this thing change on its own clock?* If yes, it's an entity. Line items on an order often do not — they live inside the order's state transitions — so they stay embedded. Fulfilment events on an order do — they have their own lifecycle — so they become their own entity, referenced by the order. ## Two modes: discover or lock Cyoda gives you two structural contracts for an entity model. The right choice depends on how exposed the model is to outside producers. **Discover (loose).** For a new model you do not write a schema file; you post a representative sample (or a batch) and Cyoda records the fields, their types, and the shape of nested arrays and objects. New samples **widen** the schema — a field seen as `INTEGER` once and as `[INTEGER, STRING]` later becomes polymorphic, and array widths grow to fit observed data. Use discover mode when you are prototyping, exploring a dataset, or have not yet fixed the contract with upstream producers. **Lock (strict).** Once the shape is stable, lock the model. After locking, any incoming entity that does not structurally match the current schema is **rejected**. This is the right default for production systems with external interface contracts — a trading system receiving FpML confirmations, a payments pipeline consuming an agreed message format, a regulated workflow whose processor logic is tailored to a specific shape. In those contexts a silently widened schema is a latent bug at best and a compliance failure at worst: if an upstream does an uncoordinated FpML version upgrade, you want the new-shape messages rejected at the door, not accepted and fed into processors that were built against the old shape. These two modes together cover the spectrum. 
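The widening behaviour in discover mode can be illustrated with a toy merge function — a sketch for intuition only, not the platform's implementation; the type names mirror the model types described here, and the merge logic is an assumption about the general rule (each sample can only add observed types to a field, never remove them):

```python
# Toy model of discover-mode schema widening (illustration only, NOT the
# platform implementation): every sample can only widen a field's
# observed type set; narrowing never happens within a version.
PY_TO_MODEL = {bool: "BOOLEAN", int: "INTEGER", float: "DOUBLE", str: "STRING"}

def widen(schema: dict, sample: dict) -> dict:
    for field, value in sample.items():
        observed = PY_TO_MODEL.get(type(value), "OBJECT")
        schema.setdefault(field, set()).add(observed)
    return schema

schema: dict = {}
widen(schema, {"amount": 100})       # amount first observed as INTEGER
widen(schema, {"amount": "100.00"})  # later observed as STRING -> polymorphic
```

After the second sample, `schema["amount"]` holds the polymorphic pair `{"INTEGER", "STRING"}`; locking the model is what stops this set from growing further.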
Cyoda deliberately does **not** layer a Confluent-style forward/backward/full compatibility taxonomy on top: "compatible" is not a platform-generic concept when the workflow (your app code) is part of the contract. Only the application can judge whether adding an optional field, widening an integer to a string, or dropping a field leaves its transition logic valid. The platform contract is the simpler and stricter pair: loose discovery, or lock-and-reject. ## Evolving a model You evolve during discover mode by sending data: fields appear, types widen, array widths grow. None of this is surprising until you lock. After lock, evolution is **application-controlled**. The model has a `modelVersion` that the application increments when it wants a new structural contract. Each revision of each entity is tagged at write time with the model version in force. Revisions are immutable: old revisions are **not** re-validated, re-cast, or rewritten when a new model version appears. A consumer reading an old revision reads it under its original version; interpretation across versions is application logic. Concretely: - **Add fields (pre-lock).** Send a sample that includes them; the schema widens automatically. - **Widen types (pre-lock).** Cyoda handles polymorphic fields via a type hierarchy (e.g. `BYTE → SHORT → INT → LONG`; `FLOAT → DOUBLE → BIG_DECIMAL`). See the [Trino SQL reference](/reference/trino/) for the complete primitive lattice, including the temporal-type resolution hierarchy. - **Lock.** Freeze evolution once the shape is stable. The default stance for anything with external producers. - **Bump `modelVersion` and register the new schema (post-lock).** A locked model is a frozen contract; to accommodate a changed structure the application bumps `modelVersion` and **registers the new schema** for that version. 
Registration uses the same mechanism as initial discovery: submit a comprehensive set of representative samples that span the intended shape, and Cyoda infers the schema from them. The samples themselves are **not stored** — they exist only to define the shape of the new version. Once registered, lock the new version so it too is a hard contract. If data written under an older version needs to appear under the new shape, migrate it explicitly via app code; the platform takes no stance on whether the new shape is "compatible" with the old — that judgment belongs to the workflow that consumes the data. Things to plan explicitly: - **Renames.** Cyoda does not rename a field for you; if you rename in the source, you get a new field alongside the old one. Migrate existing data deliberately. - **Deletes and deprecations.** Same story — Cyoda will not silently drop or re-interpret a field across a version boundary. The application owns the migration. - **Narrowing types.** Once the schema has observed `STRING` in a field, you cannot narrow it to `INTEGER` within the same version. To narrow, introduce a new `modelVersion` with the stricter type and migrate the data. ## Who validates what Cyoda validates **structure** and **types** against the model: required shapes, element types, array constraints, polymorphic compatibility. That is free; you do not write those validators. Your application is responsible for **semantic** validation that lives inside transitions: "the order total must equal the sum of line items", "the payment currency must match the customer's currency". Those belong in workflow criteria and processors where they can fail a transition and leave the old revision intact. ## Anti-patterns - **The god-entity.** One model that tries to represent everything. Split it along lifecycle boundaries; lifecycles that evolve independently want separate entities. - **Premature generalisation.** A model that tries to anticipate every future field. 
Let the schema discover itself and lock when you are ready. - **Shadow workflows.** Implementing state transitions as boolean flags on the entity. Put states in a workflow; that's what workflows are for. ## Where to go next - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the conceptual model behind an entity. - [Entity model export](/reference/entity-model-export/) — the wire format of a SIMPLE_VIEW export, node descriptors, type descriptors, and the JSON Schema for the response. - [JSON schema reference](/reference/schemas/) — the REST-API message schemas generated from cyoda-go. - [Workflows and events](/concepts/workflows-and-events/) — how state and transitions are configured. --- ## build/searching-entities.md # Searching entities Query sets of entities over REST — direct vs async modes, predicates, pagination, and historical reads. Cyoda exposes search over REST for any query that returns a set of entities. Use it when you need more than one entity back, when the filter goes beyond a single id lookup, or when you want to scope by workflow state. For single-entity reads, stay on the CRUD endpoints in [working with entities](/build/working-with-entities/); for cross-entity analytics, use [SQL](/build/analytics-with-sql/); for event-driven compute, use [gRPC compute nodes](/build/client-compute-nodes/). ## Two query modes Cyoda splits search into **Immediate** and **Background** modes. Pick by expected result size and urgency; the filter grammar is identical. - **Immediate** (API term: `direct`) — synchronous. The request returns matching entities in the response body. Result size is **capped**, so `direct` is the right default only when you know the filter produces a bounded, small set: a UI list, a lookup, a small report. If a query hits the cap, switch it to `async`. - **Background** (API term: `async`) — queued. The request returns a job handle; poll it to retrieve results. Result size is **unbounded** and results are **paged**. 
On the Cassandra-backed tier (Cyoda Cloud, or a licensed Enterprise install), `async` runs distributed across the cluster: for a fixed query shape, throughput scales roughly linearly with the number of nodes.

The decision tree is short:

- Small bounded result, UI-facing → `direct`.
- Might be large, or can tolerate a second or two of queuing (exports, reports, batch jobs) → `async`.
- Hitting the cap or the request timeout on `direct` → `async`.

## A direct search

Filter by a combination of entity fields and workflow state:

```bash
curl -X POST http://localhost:8080/api/search/direct/orders/1 \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  -d '{ "filter": { "state": "submitted", "customerId": "CUST-7" } }'
```

The path is `/api/search/direct/{entityName}/{modelVersion}`. The response is the list of matching entities, each with its current state, revision, and timestamps.

## An async search

Submit the search to `/api/search/async/{entityName}/{modelVersion}` and capture the handle:

```bash
curl -X POST http://localhost:8080/api/search/async/orders/1 \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  -d '{ "filter": { "state": "submitted" } }'
```

Poll the returned `jobId` until the job is ready, then fetch pages (`pageNumber` is zero-indexed; `pageSize` caps the page):

```
GET /api/search/async/{jobId}/status
GET /api/search/async/{jobId}/results?pageNumber=0&pageSize=1000
GET /api/search/async/{jobId}/results?pageNumber=1&pageSize=1000
```

A single `jobId` can be paged repeatedly until the result expires; expiry is controlled per deployment.
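The poll-then-page loop can be sketched as a small client helper. The endpoint paths come from this page; the status field name (`status` with value `READY`) and the empty-page termination condition are assumptions about the response shape — verify them against the API reference before relying on them:

```python
import time

def collect_async_results(fetch, job_id: str, page_size: int = 1000) -> list:
    """Poll an async search job until ready, then drain its pages.

    `fetch(path)` is any callable that performs an authenticated GET
    and returns the parsed JSON body.
    """
    # Poll the job handle until it reports completion.
    # NOTE: the "status"/"READY" shape is an assumption, not confirmed docs.
    while fetch(f"/api/search/async/{job_id}/status").get("status") != "READY":
        time.sleep(0.5)

    # Page through results; pageNumber is zero-indexed, pageSize caps a page.
    results, page = [], 0
    while True:
        batch = fetch(
            f"/api/search/async/{job_id}/results"
            f"?pageNumber={page}&pageSize={page_size}"
        )
        if not batch:  # an empty page means the job is drained
            break
        results.extend(batch)
        page += 1
    return results
```

Because a completed `jobId` is stable for its retention window, the paging loop can safely be restarted from any page if it is interrupted.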
### Cancelling a job

If a job is no longer needed — the user navigated away, a replacement query was submitted, the deployment is shutting down — cancel it rather than letting it run to completion:

```bash
curl -X DELETE http://localhost:8080/api/search/async/{jobId}/cancel \
  -H "Authorization: Bearer $TOKEN"
```

Cancellation is cooperative: in-flight work is stopped at the next safe point and any partial results for that `jobId` are discarded.

## Filter shape

The filter is a JSON document whose fields are entity field paths, metadata (`state`, `createdAt`, …), or workflow labels. The authoritative operator grammar — equality, comparisons, ranges, set membership, AND/OR combinators, nested-field access — lives in the [REST API reference](/reference/api/). The shape used in the simple examples above (flat field→value map) is the equality short form; use the full object form when you need operators:

```json
{
  "and": [
    { "field": "state", "eq": "submitted" },
    { "field": "amount", "gte": 1000 }
  ]
}
```

For the full predicate grammar — every operator, nesting rule, and function — run `cyoda help search` against your binary.

## Historical reads with `pointInTime`

Every search accepts a `pointInTime` parameter to run against the world as it existed at a given timestamp. Each entity maintains a history of revisions; a point-in-time query returns the set of entities that would have matched at that timestamp, evaluated against the revision that was current at the time.
```bash
curl -X POST http://localhost:8080/api/search/direct/orders/1 \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  -d '{ "pointInTime": "2026-03-01T00:00:00Z", "filter": { "state": "submitted", "customerId": "CUST-7" } }'
```

This is the primary way to answer audit and regulatory questions from REST — *what did this customer's open orders look like at quarter close?* The Trino surface exposes the same capability as a column named `point_time` (snake-case, matching SQL convention); for the analytical form, see [`point_time` in analytics](/build/analytics-with-sql/).

## Paging and sort (async)

- `pageSize` and `pageNumber` are query parameters on `/search/async/{jobId}/results`; they apply at result-fetch time, not at job submission. `pageNumber` is zero-indexed.
- Sort is not documented on the REST async surface at this release; results are returned in insertion order.
- A completed `jobId` is stable for its retention window — page reads are idempotent.

## Performance notes

- Scope by `state` or a high-selectivity field first — the workflow state is indexed on every entity and is almost always the right first predicate.
- Prefer `async` as soon as the result set might be thousands of entities; the distributed execution on the Cassandra tier makes it cheaper per entity than a series of `direct` pages.
- Avoid open-ended `pointInTime` scans across every revision — anchor the query at a specific timestamp or a short window.

## Where to go next

- [REST API reference](/reference/api/) — authoritative search payload schema, operator grammar, status and result endpoints.
- [Working with entities](/build/working-with-entities/) — single-entity CRUD and transitions on the same REST surface.
- [Analytics with SQL](/build/analytics-with-sql/) — heavy analytical work, cross-entity joins, historical scans via `point_time`.
- [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the audit/history model behind `pointInTime`. --- ## build/testing-with-digital-twins.md # Testing with digital twins In-memory mode as a test harness; running simulations at volumes exceeding production. In Cyoda, a "digital twin" means the same application code — workflows, criteria, processors — runs identically on every storage tier. Non-functional properties (persistence, latency, concurrency model) differ; business logic does not. cyoda-go's **in-memory mode** is the built-in test harness. It runs the entire platform — entity store, workflow engine, API surfaces — in a single process with no external dependencies and no disk writes. It is the fastest way to exercise workflow behaviour in CI, and the cheapest way to run large scenario simulations against your application logic. ## In-memory mode as a test harness Start cyoda-go with the in-memory profile (or `go run ./cmd/cyoda` against the default in-memory config). Concretely: set `CYODA_STORAGE_BACKEND=memory`, or leave it unconfigured — memory is the application default until `cyoda init` is run. Point your tests at it; tear it down between cases; no database to seed, no files to clean up. Properties that matter for testing: - **Deterministic.** Same inputs, same state, same result. - **Fast.** Sub-millisecond latencies mean you can run thousands of transitions per test. - **Isolated.** Nothing survives the process; tests cannot leak state into each other. - **API-identical.** Your application code calls the same REST and gRPC endpoints it will call in production. This makes in-memory mode suitable for unit-level tests of your processors, integration tests of your whole workflow, and smoke tests in CI. ## Digital-twin simulations Beyond unit tests, in-memory mode is a **digital-twin runtime**: a behavioural clone of production that you can drive at volumes and rates your real system could never sustain. 
Because there is no durable backend, no rate-limited external API, and no shared-state concern, you can: - Replay a year of historical transactions in minutes. - Fan out thousands of concurrent scenario runs across CPUs. - Stress the workflow with injected event streams that exceed production peak by multiples. The application logic — workflows, criteria, processors — is the same application that runs against PostgreSQL or Cyoda Cloud. The **only** things that differ between in-memory and durable tiers are non-functional: persistence, latency profile, and scale ceiling. That's the property that makes the in-memory run a useful twin of the durable one. ## What stays the same, what changes Same: - API contracts (REST, gRPC today; Trino upcoming — see the [Trino reference](/reference/trino/)). - Workflow semantics: states, transitions, criteria, processors. - Event ordering within a transition. - Audit-trail shape. Different: - Durability (none in-memory). - Concurrency model (no multi-node contention). - Performance envelope (faster, lower variance). If your test depends on *durable* behaviour — restart recovery, cross-node consensus, cross-process replay — graduate it to a SQLite or PostgreSQL run for that suite. For everything else, in-memory is the default. ## Examples Worked examples live in [cyoda-go/examples](https://github.com/cyoda-platform/cyoda-go/tree/main/examples), including scenario-simulation runners you can adapt as a template. When a cyoda-go release ships a new example, this page links it. ## Where to go next - [Digital twins and the growth path](/concepts/digital-twins-and-growth-path/) — the concept behind same-app-different-tier. - [Run → overview](/run/) — choosing a tier for the other side of the twin. --- ## build/workflows-and-processors.md # Workflows and processors State-machine design, transitions, and external processors — with a preference for gRPC in compute nodes. > Understanding Cyoda JSON workflow configurations. 
## Overview Cyoda workflows define finite, distributed state machines that govern the lifecycle of business entities in an event-driven environment. Each entity progresses through a sequence of states based on defined transitions, criteria, and processing rules. :::tip[Use gRPC for compute nodes] When implementing processors or criteria services, prefer gRPC over HTTP. gRPC preserves audit hygiene and simplifies authorization. See [APIs and surfaces](/concepts/apis-and-surfaces/) for the decision rationale. ::: The platform supports adaptable entity modeling, allowing business logic to evolve through configuration rather than implementation changes. Workflows declare the set of states, valid transitions, and associated processing steps while preserving immutable persistence for auditability. ## Workflow Architecture ### Core Components 1. **States**: Lifecycle stages of an entity 2. **Transitions**: Directed changes between states 3. **Criteria**: Conditional logic for transition eligibility 4. **Processors**: Executable logic triggered during transitions ## Configuration Schema You can find the workflow schema in the [API reference](/reference/api/). See the workflow import endpoints for complete schema specifications. Here we explain the structure and meaning of each element. ### Workflow Object ```json { "version": "1", "name": "Workflow Name", "desc": "Workflow description", "initialState": "StateName", "active": true, "criterion": {}, "states": {} } ``` #### Attributes - `version`: Workflow schema version - `name`: Identifier for the workflow. Must be unique per entity model. - `desc`: Detailed description of the workflow - `initialState`: Starting point for new entities - `active`: Indicates whether the workflow is active - `criterion`: Optional criterion for selecting which workflow applies to a given entity. Uses the same condition types as transition criteria (simple, group, function). 
When multiple workflows are defined for a model, the platform evaluates each workflow's criterion against the entity to determine which workflow governs it. - `states`: Map of state names to state definitions ### Multiple Workflows per Model An entity model can have multiple workflows, each with its own `criterion` at the workflow level. When an entity is created, the platform evaluates each active workflow's criterion to select the applicable workflow. The platform evaluates active workflows in the order they are defined and uses the first whose criterion matches (or the first with no criterion, which matches unconditionally). This allows different processing paths for different categories of entities within the same model. ## Import and Export Workflows are managed via import and export API endpoints on the entity model. The import request supports three modes that control how existing workflows are reconciled with the payload: - **`MERGE`** (default): Incremental update. Workflows with matching names are updated; unspecified workflows remain unchanged. - **`REPLACE`**: Removes all existing workflows for the entity model and retains only the imported ones. Also deletes all unused processors and criteria. - **`ACTIVATE`**: Similar to REPLACE, but deactivates (rather than deletes) existing workflows and transitions not included in the import. Unused processors and criteria are preserved. See [API reference](/reference/api/) for endpoint details and the full request/response schemas. ## States States describe lifecycle phases for entities. Names must start with a letter and use only alphanumeric characters, underscores, or hyphens. ### Format ```json "StateName": { "transitions": [] } ``` #### Special States - **Initial state**: The initial state of a new entity - **Terminal States**: States with no outgoing transitions ## Transitions Transitions define allowed movements between states, optionally gated by conditions and supported by executable logic. 
### Format ```json { "name": "TransitionName", "next": "TargetState", "manual": true, "disabled": false, "criterion": {}, "processors": [] } ``` #### Attributes - `name`: Name of the transition (required) - `next`: Target state code (required) - `manual`: Determines if the transition is manual or automated (required) - `disabled`: Marks the transition as inactive - `criterion`: Optional condition for eligibility - `processors`: Optional processing steps ### Manual vs Automated Transitions Transitions may be either **manual** or **automated**, and are guarded by criteria that determine their eligibility. When an entity enters a new state, the first eligible automated transition is executed immediately within the same transaction. This continues recursively until no further **automated** transitions are applicable, resulting in a stable state. Each transition may trigger one or more attached processes, which can run synchronously or asynchronously, either within the current transaction or in a separate one. This forms the foundation for event flow automation, where processors may create or mutate entities in response, allowing a single transition to initiate a cascade of events and function executions across the system. `CYODA_MAX_STATE_VISITS` configures the per-state visit limit within a single cascade (default 10). A separate hard-coded safety cap of 100 steps limits total cascade depth across all states, preventing runaway automatic-transition chains. ## Criteria Criteria define logic that determines if a transition is permitted. A criterion can be one of five types: 1. **Simple**: Evaluates a single condition on entity data 2. **Group**: Combines multiple criteria with logical operators 3. **Function**: Calls an external function for evaluation (delegated to a calculation node via gRPC) 4. **Lifecycle**: Evaluates a condition on entity lifecycle properties (state, creation date, previous transition) 5. 
**Array**: Evaluates a condition against an array of values ### Simple Criteria Simple criteria evaluate a single condition directly on entity data using JSONPath expressions. They are executed directly on the processing node, without involving external compute nodes. ```json "criterion": { "type": "simple", "jsonPath": "$.amount", "operation": "GREATER_THAN", "value": 1000 } ``` #### Simple Criteria Attributes - `jsonPath`: JSONPath expression to extract the value from entity data - `operation`: Comparison operator (see [Operator Types](#operator-types) below). Also accepts the alias `operatorType`. - `value`: The value to compare against ### Group Criteria Group criteria combine multiple conditions using logical operators. ```json "criterion": { "type": "group", "operator": "AND", "conditions": [ { "type": "simple", "jsonPath": "$.status", "operation": "EQUALS", "value": "VALIDATED" }, { "type": "simple", "jsonPath": "$.amount", "operation": "GREATER_THAN", "value": 500 } ] } ``` #### Group Criteria Attributes - `operator`: Logical operator combining conditions (`AND`, `OR`, `NOT`) - `conditions`: Array of criteria (can be `simple`, `function`, `group`, `lifecycle`, or `array` types — supports arbitrary nesting) ### Function Criteria Function criteria delegate evaluation to an external compute node via gRPC. The function must return a boolean result. 
```json "criterion": { "type": "function", "function": { "name": "FunctionName", "config": { "attachEntity": true, "calculationNodesTags": "validation,data-quality", "responseTimeoutMs": 3000, "retryPolicy": "FIXED", "context": "optionalContext" }, "criterion": { "type": "simple", "jsonPath": "$.preCheckField", "operation": "EQUALS", "value": true } } } ``` #### Function Attributes - `name`: The name of the function to execute (required) - `config`: Configuration for the function call (optional): - `attachEntity`: Whether to pass the entity data to the function - `calculationNodesTags`: Comma-separated list of tags for routing to specific calculation nodes - `responseTimeoutMs`: Response timeout in milliseconds - `retryPolicy`: Retry policy for the function (e.g., `"FIXED"`) - `context`: Optional string parameter passed to the function for additional context or configuration. The `context` is passed "as is" with the event to the compute node. It can contain any sort of information that is relevant to the function's execution, in any format. The interpretation is up to the function itself. - `criterion`: Optional quick-exit criterion evaluated locally before calling the (potentially expensive) external function. If this local criterion evaluates to false, the function call is skipped entirely. Useful for avoiding unnecessary network round-trips when the result can be confidently determined from entity data. ### Lifecycle Criteria Lifecycle criteria evaluate conditions on entity lifecycle properties rather than entity data. ```json "criterion": { "type": "lifecycle", "field": "state", "operation": "EQUALS", "value": "VALIDATED" } ``` #### Lifecycle Criteria Attributes - `field`: Lifecycle field to evaluate: `state`, `creationDate`, or `previousTransition` - `operation`: Comparison operator - `value`: The value to compare against ### Array Criteria Array criteria evaluate a condition against an array of values. 
```json "criterion": { "type": "array", "jsonPath": "$.category", "operation": "EQUALS", "value": ["electronics", "software", "services"] } ``` #### Array Criteria Attributes - `jsonPath`: JSONPath expression to the field - `operation`: Comparison operator - `value`: Array of string values to match against ### Operator Types The following comparison operators are available for simple, lifecycle, and array criteria: **Basic Comparison:** `EQUALS`, `NOT_EQUAL`, `IS_NULL`, `NOT_NULL`, `GREATER_THAN`, `LESS_THAN`, `GREATER_OR_EQUAL`, `LESS_OR_EQUAL`, `BETWEEN`, `BETWEEN_INCLUSIVE` **String Operations (Case-Sensitive):** `CONTAINS`, `NOT_CONTAINS`, `STARTS_WITH`, `NOT_STARTS_WITH`, `ENDS_WITH`, `NOT_ENDS_WITH`, `MATCHES_PATTERN`, `LIKE` **Case-Insensitive String Operations:** `IEQUALS`, `INOT_EQUAL`, `ICONTAINS`, `INOT_CONTAINS`, `ISTARTS_WITH`, `INOT_STARTS_WITH`, `IENDS_WITH`, `INOT_ENDS_WITH` **State Tracking:** `IS_UNCHANGED`, `IS_CHANGED` ## Processors Processors implement custom logic to run during transitions. There are two types of processors: **externalized** (delegated to calculation nodes) and **scheduled** (delayed transitions). ### Externalized Processors Externalized processors delegate execution to a calculation node via gRPC. This is the most common processor type. ```json { "type": "externalized", "name": "ProcessorName", "executionMode": "SYNC", "config": { "attachEntity": true, "calculationNodesTags": "tag1,tag2", "responseTimeoutMs": 5000, "retryPolicy": "FIXED", "context": "optionalContext" } } ``` #### Externalized Processor Attributes - `type`: `"externalized"` (discriminator) - `name`: Name of the processor (required) - `executionMode`: Execution mode (see below). Default: `ASYNC_NEW_TX`. - `config`: Configuration for the processor call: - `attachEntity`: Whether to attach entity data to the processor call. Set to `true` if the processor needs access to the entity data (this is usually the case). 
- `calculationNodesTags`: Comma-separated list of tags for routing to specific calculation nodes - `responseTimeoutMs`: Response timeout in milliseconds - `retryPolicy`: Retry policy for the processor - `context`: Additional context passed to the processor - `asyncResult`: Whether to await the result asynchronously, outside of the transaction - `crossoverToAsyncMs`: Crossover delay in milliseconds to switch to asynchronous processing (effective only when `asyncResult` is true) #### Execution Modes - `SYNC`: Inline execution within the transaction. Runs immediately and blocks the current processing thread on the same node. - `ASYNC_SAME_TX`: Deferred within the current transaction. Commits or rolls back atomically with the triggering transition. - `ASYNC_NEW_TX`: Deferred execution in a separate, independent transaction. Default mode. Processors should be idempotent; failed ASYNC_NEW_TX processors may be retried. Synchronous executions run immediately and block the current processing thread on the same node, making them local and non-distributed. In contrast, asynchronous executions are scheduled for deferred processing and can be handled by any node in the cluster, enabling horizontal scalability and workload distribution, albeit with possibly somewhat higher latency. ### Scheduled Transition Processors Scheduled processors trigger a delayed transition after a configured time period. 
```json { "type": "scheduled", "name": "schedule_timeout", "config": { "delayMs": 3600000, "transition": "timeout", "timeoutMs": 7200000 } } ``` #### Scheduled Processor Attributes - `type`: `"scheduled"` (discriminator) - `name`: Name of the processor (required) - `config` (required): - `delayMs`: Delay in milliseconds before executing the transition (required) - `transition`: The name of the transition to execute after waiting (required) - `timeoutMs`: Timeout in milliseconds for executing the transition task, after which it will be expired (optional) ### Calculation Nodes Tags As described in the [Architecture](/architecture/cyoda-cloud-architecture/) section, the execution of processors and criteria is delegated to client compute nodes, i.e. your own infrastructure running your business logic. These nodes can be organized into groups and tagged based on their roles or capabilities. By optionally setting the `calculationNodesTags` property in a processor or criterion definition, you can direct execution to specific groups, giving you fine-grained control over workload distribution across your compute environment. ## Example: Payment Request Workflow This workflow models the lifecycle of a payment request, covering validation, matching, approval, and notification handling. It starts in the INVALID state, where the request is either amended or validated. If validation succeeds and a matching order exists, the request advances automatically to the SUBMITTED state. If not, it moves to PENDING, where it awaits a matching order or may be retried manually. Requests in SUBMITTED require an approval decision, leading either to APPROVED, which triggers asynchronous processing like payment message creation and ACK notifications, or to DECLINED, which emits a rejection (NACK) notification. Manual amend and retry transitions at key stages allow users or systems to correct or re-evaluate the request. The following section walks through the configuration step by step. 
![Payment Request Workflow](paymentRequestWorkflow) ### Step 1: Workflow Metadata ```json { "version": "1", "name": "Payment Request Workflow", "desc": "Payment request processing workflow with validation, approval, and notification states", "initialState": "INVALID", "active": true } ``` ### Step 2: Define States and Transitions Start by defining the overall structure of states and transitions. ```json { "version": "1", "name": "Payment Request Workflow", "desc": "Payment request processing workflow with validation, approval, and notification states", "initialState": "INVALID", "active": true, "states": { "INVALID": { "transitions": [ { "name": "VALIDATE", "next": "PENDING", "manual": false, "disabled": false }, { "name": "AMEND", "next": "INVALID", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "PENDING": { "transitions": [ { "name": "MATCH", "next": "SUBMITTED", "manual": false, "disabled": false }, { "name": "RETRY", "next": "PENDING", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "SUBMITTED": { "transitions": [ { "name": "APPROVE", "next": "APPROVED", "manual": true, "disabled": false }, { "name": "DENY", "next": "DECLINED", "manual": true, "disabled": false } ] }, "APPROVED": { "transitions": [] }, "DECLINED": { "transitions": [] }, "CANCELED": { "transitions": [] } } } ``` ### Step 3: Add Criteria We add criteria to the `VALIDATE` and `MATCH` transitions: ```json { "version": "1", "name": "Payment Request Workflow", "desc": "Payment request processing workflow with validation, approval, and notification states", "initialState": "INVALID", "active": true, "states": { "INVALID": { "transitions": [ { "name": "VALIDATE", "next": "PENDING", "manual": false, "disabled": false, "criterion": { "type": "function", "function": { "name": "IsValid", "config": { "attachEntity": true } } } }, { "name": "AMEND", "next": "INVALID", 
"manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "PENDING": { "transitions": [ { "name": "MATCH", "next": "SUBMITTED", "manual": false, "disabled": false, "criterion": { "type": "function", "function": { "name": "HasOrder", "config": { "attachEntity": true } } } }, { "name": "RETRY", "next": "PENDING", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "SUBMITTED": { "transitions": [ { "name": "APPROVE", "next": "APPROVED", "manual": true, "disabled": false }, { "name": "DENY", "next": "DECLINED", "manual": true, "disabled": false } ] }, "APPROVED": { "transitions": [] }, "DECLINED": { "transitions": [] }, "CANCELED": { "transitions": [] } } } ``` ### Step 4: Add Processors We add two processors to the `APPROVE` transition and one to the `DENY` transition in the `SUBMITTED` state to complete the workflow. ```json { "version": "1", "name": "Payment Request Workflow", "desc": "Payment request processing workflow with validation, approval, and notification states", "initialState": "INVALID", "active": true, "states": { "INVALID": { "transitions": [ { "name": "VALIDATE", "next": "PENDING", "manual": false, "disabled": false, "criterion": { "type": "function", "function": { "name": "IsValid", "config": { "attachEntity": true } } } }, { "name": "AMEND", "next": "INVALID", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "PENDING": { "transitions": [ { "name": "MATCH", "next": "SUBMITTED", "manual": false, "disabled": false, "criterion": { "type": "function", "function": { "name": "HasOrder", "config": { "attachEntity": true } } } }, { "name": "RETRY", "next": "PENDING", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "SUBMITTED": { "transitions": [ { "name": "APPROVE", "next": "APPROVED", "manual": true, "disabled": false,
"processors": [ { "type": "externalized", "name": "Create Payment Message", "executionMode": "ASYNC_NEW_TX", "config": { "attachEntity": true } }, { "type": "externalized", "name": "Send ACK Notification", "executionMode": "ASYNC_NEW_TX", "config": { "attachEntity": false } } ] }, { "name": "DENY", "next": "DECLINED", "manual": true, "disabled": false, "processors": [ { "type": "externalized", "name": "Send NACK Notification", "executionMode": "ASYNC_NEW_TX", "config": { "attachEntity": false } } ] } ] }, "APPROVED": { "transitions": [] }, "DECLINED": { "transitions": [] }, "CANCELED": { "transitions": [] } } } ``` ## Best Practices - Use domain-specific state names - Match transition granularity to business needs - Define recovery and cancellation paths - Prefer asynchronous processing for external dependencies - Use self-transitions for triggering workflow automation on exit from the current state ## Platform Integration Cyoda workflows integrate directly with: - **Entity Models**: Determine which workflows apply to which data types - **Execution Engine**: Drives state and transition logic - **External Functions**: Implement validation and custom behavior - **Event System**: Triggers automated transitions on event reception --- ## build/working-with-entities.md # Working with entities Create, read, update, and search entities via the cyoda-go API — worked examples. This page shows the patterns for interacting with entities through the platform API. Examples assume a local cyoda-go instance running on the default port with SQLite persistence; the same requests work against Cyoda Cloud with the cloud endpoint and an issued token. The complete endpoint catalogue — parameters, response shapes, error codes — lives in the [API reference](/reference/api/). Keep that open as you work. ## The shape of the API Cyoda speaks REST for CRUD, search, and workflow invocation, gRPC for external processors, and Trino SQL for analytics. 
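As a rough decision aid, the surface split in the sentence above can be written down as a lookup — an illustrative sketch that simply restates the text, not an API of the platform:

```python
def pick_surface(task: str) -> str:
    """Map a task kind to the API surface suggested above (illustrative only)."""
    surfaces = {
        "crud": "REST",
        "search": "REST",
        "workflow-invocation": "REST",
        "external-processor": "gRPC",
        "analytics": "Trino SQL",
    }
    return surfaces[task]

print(pick_surface("external-processor"))  # gRPC
```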
This page covers REST; see [Build → client compute nodes](/build/client-compute-nodes/) for gRPC and the [APIs and surfaces](/concepts/apis-and-surfaces/) overview for the decision framework. Every request is authenticated with a bearer token. Every response includes the entity's current revision, state, and timestamps. ## Create Post an entity to its model. The first time you post, Cyoda discovers the schema from what you send: ```bash curl -X POST http://localhost:8080/api/entity/JSON/orders/1 \ -H 'Content-Type: application/json' \ -H "Authorization: Bearer $TOKEN" \ -d '{ "orderId": "ORD-42", "customerId": "CUST-7", "amount": 120.00, "currency": "EUR", "lines": [ { "sku": "AX-1", "qty": 2, "price": 60.00 } ] }' ``` The path is `/api/entity/{format}/{entityName}/{modelVersion}` — here `JSON`, `orders`, and version `1`. The response carries an array whose first element contains `entityIds[0]`, the **system-assigned UUID** of the new entity, plus its current state and revision number. Capture the UUID — downstream reads, updates, and transitions address the entity by that UUID, not by the business key `orderId`. ## Read Fetch the current revision by id. The `{entityId}` in these URLs is the UUID returned in `entityIds[0]` from the create response, not a business key like `orderId`: ```bash curl http://localhost:8080/api/entity/${ENTITY_ID} \ -H "Authorization: Bearer $TOKEN" ``` List every entity in a model with `GET /api/entity/{entityName}/{modelVersion}`: ``` GET /api/entity/orders/1 ``` For filtered reads — predicates, pagination, result caps, historical reads — see [searching entities](/build/searching-entities/). The list endpoint does not accept ad-hoc field filters; those belong to search. ## Update Direct updates use `PUT /api/entity/{format}/{entityId}` (loopback update — stores a new revision without a named transition) or `PUT /api/entity/{format}/{entityId}/{transition}` (update with a named transition). 
There is no `PATCH` endpoint — all writes are full-payload PUTs. **Mutations that move the entity between lifecycle states should go through a named transition**, not a bare loopback update. Invoking the `submit` transition records it in the audit trail and runs any attached processors. The transition carries the new entity JSON in the request body (the platform stores the updated entity and records the named transition in one call): ```bash curl -X PUT http://localhost:8080/api/entity/JSON/${ENTITY_ID}/submit \ -H 'Content-Type: application/json' \ -H "Authorization: Bearer $TOKEN" \ -d '{ "orderId": "ORD-42", "status": "submitted" }' ``` See [Build → workflows and processors](/build/workflows-and-processors/) for how to declare transitions. ## Search Cyoda supports two query modes: - **Direct** (synchronous, capped result size) — API term `direct`. Returns right away. Good for UI lookups and short operations. Result size is capped, so `direct` is best for queries that produce a bounded, small result set. - **Async** (background, unbounded, paged) — API term `async`. Queued as a job, returns a handle you can poll. Result size is unbounded; results are paged. Good for large result sets, periodic reports, and exports. On the Cassandra-backed tier (Cyoda Cloud, or a licensed Enterprise install), `async` search runs distributed across the cluster and scales horizontally: query throughput for a fixed shape grows roughly linearly with the number of nodes. Both accept the same filter grammar over entity fields, metadata, and workflow state. Pick `direct` by default; switch to `async` when a query would hit the `direct` result cap, would time out, or would hold resources you need elsewhere. For predicates, pagination, and worked examples, see [searching entities](/build/searching-entities/). ## Temporal queries Every entity's history is queryable. 
Add a `pointInTime` parameter to any read or search request to retrieve the world as of that timestamp: ``` GET /api/entity/{entityId}?pointInTime=2026-03-01T00:00:00Z ``` This is the primary way to answer regulatory and audit questions: *what did this customer's balance look like at quarter close?* For the same parameter applied to searches, see [searching entities → historical reads](/build/searching-entities/#historical-reads-with-pointintime); for analytical reads expressed as SQL, see [analytics with SQL](/build/analytics-with-sql/). ## From a compute node When your code is reacting to a transition — running a processor or evaluating a criterion — talk to the platform over **gRPC**, not REST. The gRPC path preserves the audit association between the transition and the compute call, brokers identity, and supports streaming. See [Build → client compute nodes](/build/client-compute-nodes/) for the implementation pattern. --- ## concepts/apis-and-surfaces.md # APIs and surfaces REST, gRPC, and Trino SQL — when and why to use each. Cyoda exposes three distinct API surfaces. Picking the right one for the job matters; the surfaces are not equivalent, and each carries different guarantees. ## Three surfaces - **REST** — the default surface for user-facing clients and administrative operations. CRUD over entities, search, workflow invocations, schema management, dashboards. - **gRPC** — the surface for compute. External processors and criteria services connect to the platform over gRPC, receive work, and return results. Supports bidirectional streaming. - **Trino SQL** — the surface for analytics. Cross-entity queries, reporting, JDBC connections, BI tools. Queries run against a Trino catalogue that projects entities into virtual SQL tables. ## Which surface, when? - Building a UI, an admin tool, or a sync integration? **REST.** - Writing a processor or criterion that runs against transitions? 
**gRPC.** - Running analytics, reports, or ad-hoc queries across many entities? **Trino SQL.** All three surfaces are backed by the same entity store. A transition recorded via REST is visible to gRPC compute nodes and queryable through Trino, with the same audit trail behind it. ## REST: humans and services that speak to the platform Use REST when the call represents *a user or service interacting with the platform as a whole*: creating an order, searching for customers, reading audit history, managing a workflow definition. This is the surface your front-end, your admin tooling, and most external integrations will use. REST is synchronous, authenticated with OAuth 2.0 bearer tokens, and versioned. The full endpoint catalogue lives in the [API reference](/reference/api/). ## gRPC: external processors that speak for the platform Use gRPC when your code is *a compute node acting on behalf of a workflow transition*. Processors and criteria attached to a transition call out to external services over gRPC; those services stream work units back to the platform. **Prefer gRPC for compute** over implementing processors as REST callbacks. Three reasons: 1. **Audit hygiene.** Every gRPC call is recorded against the transition that invoked it, inside the platform's audit trail. REST callouts cannot reconstruct that association reliably. 2. **Authorization is simpler.** The platform brokers the identity and scopes passed to the compute node; you don't have to manage credentials between the platform and your processor independently. 3. **Bidirectional streaming.** High-throughput ingest and transformation workloads benefit from streaming both ways; REST cannot. For how to implement a compute node, see [Build → client compute nodes](/build/client-compute-nodes/). ## Trino SQL: cross-entity analytics :::caution[Upcoming] Trino SQL is on the roadmap and not yet available in cyoda-go at this release. 
The section below documents the planned surface; names and shapes may change before release. ::: Use Trino when the question is *analytical* — joins across entity types, aggregates, reporting, time-series. Every entity model is projected into a set of virtual SQL tables; nested arrays and objects expand into separate tables so relational queries remain natural. Typical uses: - Ad-hoc analysis against live data in a notebook or BI tool. - Scheduled reports that aggregate entities across a tenant. - Historical queries using the `point_time` column for as-of reads. The table generation rules, data-type mappings, JDBC connection patterns, and handling of polymorphic fields are in the [Trino SQL reference](/reference/trino/). For the Build-side quickstart — connection recipe, first query, performance notes — see [Analytics with SQL](/build/analytics-with-sql/). --- ## concepts/authentication-and-identity.md # Authentication and identity OAuth 2.0 tokens, machine-to-machine credentials, on-behalf-of exchange, and external key trust — conceptually. Cyoda is an OAuth 2.0 authorization server. All traffic to the platform — REST, gRPC, Trino — is authenticated with bearer tokens the platform issues. This page explains the identity concepts; the mechanics of configuring an IdP, rotating keys, and provisioning credentials live under Run. ## The platform issues tokens Every request carries a JWT bearer token. Cyoda both **issues** tokens (as an OAuth 2.0 authorization server) and **validates** them on every API call. The token encodes the subject, the scopes, and the tenant the request belongs to; authorization is evaluated from the token, not from transport-level credentials. Clients obtain a token through an OAuth 2.0 flow appropriate to their role: end-user flows for people, M2M flows for services, on-behalf-of exchange for downstream calls. ## Machine-to-machine credentials Services authenticate to Cyoda using **client credentials** (`client_id` and `client_secret`). 
The platform issues tokens to those credentials and enforces the scopes associated with the service account. Use M2M credentials for any automated integration: ingest pipelines, compute nodes, back-office workers. Rotate credentials like any other secret; the lifetime and rotation cadence are enforced per environment. ## On-behalf-of exchange When one service calls another on a user's behalf — a web app calling an API that calls a processor, for example — Cyoda supports **token exchange**. The calling service presents its own token plus the user's token and receives a new token scoped to the downstream call. This preserves the user identity through the chain without passing the original bearer token around. In practice, the calling service includes the user's JWT as the `subject_token` in a token-exchange request; the issued token carries both identities for downstream authorization. The result: the audit trail records who the original user was at every hop, and each service still only sees a token scoped to what it is allowed to do. ## External key trust Cyoda can be configured to **trust tokens issued by an external IdP** — your corporate Okta, Auth0, or Keycloak, for example. The platform accepts tokens signed with keys it recognises, maps the external subject to an internal identity, and applies the local authorization rules. Users sign in with their organisation's single sign-on and receive entitlements within Cyoda. External key trust is configured per environment; the list of trusted signers and the subject-to-identity mapping are part of the tenant's identity configuration. ## Where this is configured - **Self-hosted (cyoda-go).** Identity configuration — bootstrap credentials, JWT signing keys, external IdP trust — is managed via cyoda-go configuration. See the [cyoda-go authentication reference](https://github.com/cyoda-platform/cyoda-go#authentication) for the authoritative parameter list. 
- **Cyoda Cloud.** Identity is surfaced as a managed service: [Run → Cyoda Cloud → identity and entitlements](/run/cyoda-cloud/identity-and-entitlements/). ## What your application does Applications do not implement OAuth 2.0 flows from scratch; they fetch a token using their client credentials (or accept one from a user session) and attach it to every Cyoda call. See [Build → working with entities](/build/working-with-entities/) for the client patterns. --- ## concepts/design-principles.md # Design principles The mental model behind Cyoda: entities as durable state machines, transitions as the unit of change, and history as a first-class query surface. Cyoda's shape follows from a few connected ideas. Once those click, the rest of the platform — the APIs, the workflows, the audit trail, the deployment tiers — is just what falls out of them. This page is the high-level picture; the pages that follow drill into each idea in depth. ## Everything is an entity In Cyoda, every piece of persisted data is an **entity**: a JSON document that belongs to a typed model and carries a lifecycle state. Entities are not rows to be updated in place and they are not messages passing through a pipe. They are the objects your system reasons about — customers, orders, documents, trades — with identity, state, and history. Models are discovered from the data, not declared up front, and widen as new shapes arrive. Once a model is good enough to rely on, it can be **locked** so new data must conform. ## An entity is a durable state machine Each entity type has a **workflow** — a set of named states, the legal transitions between them, and the criteria and processors that run along each transition. The entity lives inside that machine. Cyoda's state machines are close in spirit to BPM flows but looser: transitions can be **automatic** (fire on entering a state) or **manual** (invoked by an actor), and a workflow need not be linear or terminate. Cycles and branches are first-class. 
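A minimal sketch of that machine shape — hypothetical state and transition names, not the platform API — showing auto transitions cascading on state entry, manual transitions fired by an actor, and a loopback cycle:

```python
# Toy state machine: auto transitions fire on entering a state;
# manual transitions fire only when invoked. Cycles are first-class.
AUTO = {"received": "validated"}              # entering "received" auto-advances
MANUAL = {
    ("validated", "archive"): "archived",
    ("archived", "reactivate"): "validated",  # loopback: no terminal state required
}

def enter(state):
    """Cascade automatic transitions until none applies."""
    while state in AUTO:
        state = AUTO[state]
    return state

def invoke(state, name):
    """Fire a manual transition by name, then cascade autos in the new state."""
    return enter(MANUAL[(state, name)])

s = enter("received")        # auto: received -> validated
s = invoke(s, "archive")     # manual: validated -> archived
s = invoke(s, "reactivate")  # manual loopback: archived -> validated
print(s)  # validated
```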
See [Entities and lifecycle](/concepts/entities-and-lifecycle/) for the full model, including the state-machine diagram. ## Transitions are the unit of change Nothing in Cyoda is overwritten. Every transition produces a new, durable revision of the entity, validated against the model at the moment it happened. Processors run under a defined event contract; criteria gate whether the transition is allowed to fire at all. Because revisions accumulate instead of replacing each other, **history is not an add-on audit layer — it is the storage model**. Point-in-time reads, transition logs, and schema-lineage queries are all native operations. ## Events drive the machine Events — "file uploaded", "payment received", "a manual transition issued by an operator" — trigger transitions. Transitions invoke processors. Processors can be synchronous or asynchronous, and can run inside or alongside the transition's transaction. The platform keeps the whole chain observable and replayable. ## Same semantics, every tier The same entity, workflow, and transition semantics run on every deployment tier Cyoda supports: in-memory for tests, SQLite on a desktop, Docker for a single machine, Kubernetes via Helm, or Cyoda Cloud for a managed fleet. Applications move between tiers without rewriting their domain model. ## TL;DR - All persisted data is an **entity** with a typed model and a lifecycle state. - Each entity follows a **workflow** — a state machine of states, transitions, criteria, and processors. - **Transitions** are the atomic unit of change; each one produces a new revision. - Transitions can be **automatic** (fire on state entry) or **manual** (invoked). - Criteria **guard** transitions; processors **execute** along them, sync or async. - **History** is the storage model, not an add-on; every revision is addressable. - **Events** drive transitions; the whole chain is observable and replayable. - The same semantics apply on every tier, from in-memory tests to the cloud. 
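The "history is the storage model" idea in the summary above can be sketched as an append-only revision list with a point-in-time read — a toy illustration, not the platform's actual storage format:

```python
import bisect

# Each transition appends (timestamp, document); nothing is overwritten.
revisions = [
    (1, {"state": "draft",     "amount": 100}),
    (5, {"state": "submitted", "amount": 100}),
    (9, {"state": "approved",  "amount": 120}),
]

def as_of(ts):
    """Return the revision in force at timestamp ts (point-in-time read)."""
    i = bisect.bisect_right([t for t, _ in revisions], ts)
    return revisions[i - 1][1] if i else None

print(as_of(6)["state"])  # submitted
```

Because revisions accumulate, the as-of read needs no separate audit log — it is just an index lookup over the same data.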
## Where to go next - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — schemas, states, history, and the state-machine shape of an entity. - Workflows and events — how transitions, processors, and criteria are wired together. *(Coming as the Concepts section fills out.)* - History and audit — query patterns for time travel and schema lineage. *(Coming.)* - APIs and surfaces, tiers and deployment options, authentication and identity — each has its own page in this section. For the long-form argument behind these ideas, see [Entity Workflows for Event-Driven Architectures](https://medium.com/@paul_42036/entity-workflows-for-event-driven-architectures-4d491cf898a5) and [What's an Entity Database?](https://medium.com/@paul_42036/whats-an-entity-database-11f8538b631a). --- ## concepts/digital-twins-and-growth-path.md # Digital twins and the growth path Why the same Cyoda app runs on any storage tier, and when to pick each tier. A Cyoda application is **tier-agnostic**. The entity model, the workflows, the REST and gRPC surfaces, and the query semantics are the same on every deployment tier. What changes as you move between tiers is non-functional: durability, consistency guarantees, scale, and operational cost. We call this the **digital twin** property — the same logic, running at a different point on the cost/scale curve. ## The growth path ## When to use each tier - **In-Memory** — a single-process, zero-dependency runtime. Data is lost on restart. Use it for functional tests, fast AI iteration loops, and **digital-twin scenario runs** where you want to drive the same app logic at volumes and rates that would be prohibitively expensive against a durable backend. - **SQLite** — durable, single-file, zero-ops. Use it for edge and IoT deployments, small-team self-hosting, and local development where you want data to survive restarts but do not want to operate a separate database. 
- **PostgreSQL** — durable storage with `SERIALIZABLE` isolation for production workloads. Run 3–10 stateless cyoda-go nodes behind a load balancer for active-active HA, with PostgreSQL as the only external dependency. This is the recommended production path for teams self-hosting. - **Cassandra (via Cyoda Cloud)** — a distributed, horizontally scalable backend available today as a managed service. Use Cyoda Cloud when you need enterprise-grade identity, multi-tenancy, and provisioning, and do not want to run the infrastructure yourself. ## Choosing Most teams make this choice along four axes: - **Durability** — can the data disappear on a restart? If yes, In-Memory is fine. Otherwise you need SQLite or up. - **Write volume and HA** — a single process can go a long way on SQLite, but active-active HA with concurrent write safety wants PostgreSQL or cloud. - **Ops appetite** — PostgreSQL is one dependency; SQLite is zero; Cyoda Cloud is "someone else's problem". - **Scale ceiling** — in-process stores have limits Cassandra does not. You do not have to pick forever. The whole point of the growth path is that the application does not change when you move. A test suite can run on In-Memory today, the early product on SQLite, the growing service on PostgreSQL, and the enterprise fleet on Cyoda Cloud — with the same code. ## Where to go next - [Run → overview](/run/) — practical deployment guidance for each tier. - [Build → testing with digital twins](/build/testing-with-digital-twins/) — using In-Memory mode for scenario tests. --- ## concepts/entities-and-lifecycle.md # Entities and lifecycle Entities are durable state machines — schemas, states, transitions, and history. In Cyoda, an **entity** is a durable state machine. It is a JSON document that belongs to a typed model, sits in a named lifecycle state, and carries a complete audit trail of every transition it has ever undergone. The entity is not a row to be updated in place; the entity *is* the state machine. 
## The entity IS the state machine A Cyoda workflow is close in spirit to a BPM process, but the two are not the same. A BPM flow tends to be linear — it has a start, a sequence of activities, and an end. A Cyoda entity workflow is a state machine in the strict sense: there is no requirement for a terminating state, transitions need not follow a linear path, and an entity can move between states in any topology the model calls for, including cycles and branches. That distinction matters in day-to-day modeling. You do not design "the process" and then attach entities to it; you design the states and transitions of a piece of business reality, and the entity lives inside that machine for as long as the business cares about it. ![Example entity state machine: four states connected by auto and manual transitions. One transition runs a `validate` processor; another runs `notify`; a third is gated by an `age > 30d` criterion. A loopback from `archived` back to `active` shows that workflows are not linear and need not terminate.](/img/entity-state-machine.svg) The picture above sketches the building blocks: - **States** (teal rectangles) — the named stages of the entity's life. - **Transitions** — the atomic units of change. Auto transitions (solid, teal) fire as soon as their source state is entered; manual transitions (dashed, orange) fire only when an actor invokes them. - **Processors** (green pills on a transition) — code that runs as part of the transition, under the platform's event contract. - **Criteria** (purple diamonds on a transition) — predicates that gate whether the transition is allowed to fire. Every transition produces a new, durable revision of the entity. Nothing is overwritten. ## Schema Every entity belongs to a named **model** identified by `modelName` and `modelVersion`. 
Cyoda auto-discovers the model schema from ingested samples rather than requiring it up front: as records flow in, Cyoda observes the fields that appear, the types they take, and the shape of nested arrays and objects. That observed shape is the schema. A model has two structural modes. While **unlocked**, it evolves by merging — new fields appear, types widen, array widths grow. When **locked**, the structural contract is frozen and any incoming entity that does not match is rejected. Lock is the right default for production systems with external producers, where silently accepting a widened shape would be a compliance or correctness failure. `modelVersion` is application-controlled; to change the contract after lock, the application bumps the version and **registers the new schema** for it — by submitting a comprehensive set of representative samples (the same mechanism as initial discovery; the samples themselves are not stored). Old revisions are never re-validated or re-cast; each remains valid under the model version active at write time. See [Modeling entities](/build/modeling-entities/) for when to choose each mode and how to plan evolutions. The wire format and field conventions for exported models (type descriptors, array representations, structural markers) are in the [entity-model export reference](/reference/entity-model-export/). ## History and temporal queries Because transitions produce revisions rather than overwriting state, the full history of an entity is always retrievable. You can ask: - "What did this entity look like at timestamp T?" — point-in-time reconstruction. - "What transitions has this entity been through, and when?" — transition log. - "Which version of the model was this revision validated against?" — schema lineage. This is not an add-on audit layer bolted onto a mutable store; it is the storage model. See the [API reference](/reference/api/) for the temporal-query grammar. --- ## concepts/what-is-cyoda.md # What is Cyoda? 
An Entity Database Management System — a database engine where the first-class abstraction is a stateful entity with schema, lifecycle, history, and transactional integrity. Cyoda is an **Entity Database Management System (EDBMS)**. Unlike a relational database, where the unit is a row in a table, and unlike a document database, where the unit is a JSON blob, Cyoda's first-class unit is an **entity**: a typed document that carries a lifecycle state, a complete history of every change, and transactional integrity. ## An EDBMS, not a database Most databases answer one question well: *what is the state of the world right now?* They leave you to glue in a workflow engine, a message bus, an audit layer, and a schema registry on top. An EDBMS answers a broader question: *how does this thing evolve, under what rules, and how do I ask what it looked like last Tuesday?* It folds the workflow engine, the audit trail, the schema registry, and the event contract into the storage model. Everything the entity does — transitions, rule evaluations, processor invocations, revisions — is a first-class, queryable part of the same store. Out of the box, an entity in Cyoda has: - a **schema** discovered from ingested samples, evolving over time, lockable; - a **lifecycle state** governed by a workflow; - **transactional transitions** that produce durable, addressable revisions; - a **temporal history** you can query at any point in the past; - an **audit trail** of every rule, transition, and processor that touched it. ## Why this shape Cyoda targets domains where state, rules, and data must evolve together: financial ledgers, order management, regulatory compliance, digital twins. These domains share a property — "the row got updated" is not enough information. You need to know *why* it changed, *under what rules*, and what the world looked like before. That information has to be engineered into a normal database as an afterthought; an EDBMS makes it the default. 
## Two forms today Cyoda ships in two closely related forms, with the same semantics: - **cyoda-go** — an open-source Go implementation, run as a local binary or a small cluster. Backends are In-Memory, SQLite, or PostgreSQL, chosen at start. - **Cyoda Cloud** — a managed service backed by the Cassandra-based Cyoda Platform Library. Horizontally scalable, multi-tenant, with enterprise identity, observability, and provisioning. Applications written against one run against the other without rewriting the domain model. ## The growth path You can start on a laptop in minutes and graduate — on your schedule — to a container, to Kubernetes, or to Cyoda Cloud as scale and operational needs demand. The entity, workflow, and API contracts do not change as you move. ## Where to go next - [Design principles](/concepts/design-principles/) — the mental model in one read. - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the state machine shape of an entity, including a worked example diagram. - Digital twins and the growth path — how the same application runs at every tier. *(Coming as Concepts fills out.)* --- ## concepts/workflows-and-events.md # Workflows and events State machines as a first-class concept — triggers, external processors, and audit trails. A **workflow** is the state machine an entity lives inside. It declares the states an entity can be in, the transitions between them, the criteria that gate whether a transition is allowed, and the processors that run along the way. This page explains the concept; the [Build guide for workflows and processors](/build/workflows-and-processors/) covers how to configure one. ## State machines define allowed change Every entity type has a workflow. Nothing changes the entity except a transition defined in that workflow. A transition is atomic: it produces a new, durable revision of the entity, runs any attached processors under the platform's event contract, and either succeeds or has no effect. 
Workflows are general state machines, not pipelines. Transitions can be automatic (fire as soon as their source state is entered and any criteria are satisfied) or manual (fire only when invoked by an actor). A workflow can contain cycles, branches, and multiple terminal-looking states that are actually re-entered later — it does not need to terminate at all. ## Triggers Transitions fire in one of two ways: - **Manual** — an actor (a user, a service, an admin) calls the transition by name through the API. - **Automatic** — on entering a state, the first valid auto transition fires within the same transaction, recursing until no further auto transition applies. ## Processors A **processor** is code that runs as part of a transition. It can read the entity, compute a new field, call an external service, persist a side effect, or reject the transition. Two flavours: - **Internal processors** — shipped with the platform for common work (validation, projection, enrichment) and invoked declaratively. - **External processors** — your code, hosted anywhere, called by Cyoda over gRPC. External processors preserve audit hygiene (every call is logged against the transition), keep authorization simple (the platform brokers identity), and support bidirectional streaming for high-throughput workloads. For why gRPC is preferred and how to implement one, see [Build → client compute nodes](/build/client-compute-nodes/). Processors can run synchronously within the transition's transaction, or asynchronously alongside it. Processors run in one of three modes: **SYNC** (inline, shares the transition's transaction — failure aborts the transition), **ASYNC_SAME_TX** (runs asynchronously but in the same transaction context — failure still aborts), or **ASYNC_NEW_TX** (runs in a separate transaction via savepoint isolation — failure is logged and the transition succeeds). Choose the mode based on how atomically the side-effect must compose with the state change. 
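The failure semantics of the three modes can be sketched as a toy model (illustrative only — not the engine, and processor outcomes are simplified to a boolean):

```python
def run_transition(processors):
    """processors: list of (mode, ok) pairs.

    Returns (transition_ok, logged_failures).
    SYNC and ASYNC_SAME_TX failures abort the transition;
    ASYNC_NEW_TX failures are logged and the transition still succeeds.
    """
    logged = []
    for mode, ok in processors:
        if ok:
            continue
        if mode in ("SYNC", "ASYNC_SAME_TX"):
            return False, logged   # failure aborts the transition
        logged.append(mode)        # ASYNC_NEW_TX: log and carry on
    return True, logged

print(run_transition([("SYNC", True), ("ASYNC_NEW_TX", False)]))  # (True, ['ASYNC_NEW_TX'])
print(run_transition([("ASYNC_SAME_TX", False)]))                 # (False, [])
```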
## Audit trail is the storage model Because every transition produces a revision and every processor invocation is recorded against it, the audit trail is not a separate log — it is a view of the storage. You can query the transitions an entity has been through, the criteria that were evaluated, the processors that ran, and the inputs and outputs of each. Point-in-time queries use the same index. ## Why this matters State machines plus durable transitions plus a queryable audit trail are the native ingredients for **regulated**, **auditable**, and **replayable** systems. Replaying a workflow from history does not require an event-sourcing framework on the side — it is the default behavior of the store. ## Where to go next - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the entity side of the machine, with a state-machine diagram. - [Build → workflows and processors](/build/workflows-and-processors/) — how to declare a workflow in practice. - [Build → client compute nodes](/build/client-compute-nodes/) — how to implement an external processor over gRPC. --- ## getting-started/install-and-first-entity.md # Install cyoda-go and create your first entity Install cyoda-go with SQLite (the default), define an entity model, trigger a workflow, and read state back. This page takes you from nothing installed to a persisted entity you can query, running entirely on your own machine. The default backend is **SQLite** — zero operational overhead, one file on disk, data survives restarts. ## Install cyoda-go ships as a single binary. Pick the flavour that fits your machine; the authoritative list of installers lives in the [cyoda-go README](https://github.com/cyoda-platform/cyoda-go#install). ```bash # macOS / Linux via Homebrew brew install cyoda-platform/cyoda-go/cyoda ``` The Homebrew formula is expected to run `cyoda init` automatically, enabling SQLite persistence with data at `~/.local/share/cyoda/cyoda.db` on macOS/Linux. 
If it didn't (or if you installed another way), run `cyoda init` yourself — see `cyoda help cli init` for what it does. A `curl | sh` installer, Debian packages, and Fedora RPMs are available for other environments. ## Why SQLite is the default SQLite is durable (data survives restarts), zero-ops (no server to run), and single-file (easy to inspect, back up, or delete). It is the right starting point for everyone except two groups: teams running functional tests where latency matters more than durability (use **in-memory** mode — see the callout below), and teams building production services (graduate to **PostgreSQL** when you need active-active HA; see [Run → overview](/run/)). ## Start the server Start the server with the defaults: ```bash cyoda ``` Running `cyoda` (no subcommand) defaults to **mock auth** (`CYODA_IAM_MODE=mock`): every request is authenticated as a configurable default user, no bearer token required. You'll see examples elsewhere in the docs with `-H "Authorization: Bearer $TOKEN"`; those are written for production deployments running in JWT mode. On a fresh local install you can drop the header — or keep it and send any placeholder; mock mode ignores it. Flip to JWT mode by setting `CYODA_IAM_MODE=jwt` plus `CYODA_JWT_SIGNING_KEY` (RSA private key, PEM). For the full auth-mode configuration see [Configuration](/reference/configuration/). ## Discover everything else with `cyoda help` Every flag, environment variable, endpoint, error code, metric, and operation is documented in the binary. Browse the topic tree: ```bash cyoda help ``` Drill in on anything: ```bash cyoda help config # configuration model and precedence cyoda help crud # entity CRUD over REST cyoda help search # query predicates and search modes cyoda help errors # the error catalogue cyoda help telemetry # metrics, health, tracing, logs ``` `cyoda help --format=json` emits a machine-readable shape suitable for tools; `--format=markdown` is the default off a TTY. 
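Returning to auth modes: if you want to run this quickstart against JWT mode instead of the mock default, a minimal sketch — the key filename is illustrative, and the `_FILE` companion variable is assumed from the pattern described in [Configuration](/reference/configuration/):

```shell
# Generate an RSA private key in PEM form for JWT signing (filename illustrative).
openssl genrsa -out cyoda-jwt.pem 2048

# Switch from mock auth to JWT mode; the _FILE companion reads the key from disk.
export CYODA_IAM_MODE=jwt
export CYODA_JWT_SIGNING_KEY_FILE="$PWD/cyoda-jwt.pem"
cyoda   # start the server in JWT mode
```

In this mode the `Authorization: Bearer $TOKEN` headers shown elsewhere in the docs become mandatory.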
## Import a workflow Before creating an entity, import a workflow so the platform knows which state machine to apply. Save this as `workflow.json` — two states (`draft` and `submitted`) with one manual transition. States are keyed by name; each state owns its outgoing transitions: ```json { "workflows": [ { "version": "1", "name": "orders-wf", "initialState": "draft", "active": true, "states": { "draft": { "transitions": [ { "name": "submit", "next": "submitted", "manual": true } ] }, "submitted": {} } } ] } ``` Post it to the import endpoint for the `orders` model at version `1`: ```bash curl -X POST http://localhost:8080/api/model/orders/1/workflow/import \ -H 'Content-Type: application/json' \ -d @workflow.json ``` Without this step, cyoda-go applies its default workflow (`NONE` → `CREATED` → `DELETED` with a single automatic transition), and the `submit` transition used below will not exist. For the full workflow schema — processors, criteria, automatic transitions, nested conditions — see [Build → workflows and processors](/build/workflows-and-processors/). ## Create your first entity Define a minimal model and push an entity. Cyoda discovers the schema from the sample you send, so you do not need to declare it up front: ```bash ENTITY_ID=$(curl -s -X POST http://localhost:8080/api/entity/JSON/orders/1 \ -H 'Content-Type: application/json' \ -d '{ "orderId": "ORD-1", "amount": 42.00, "currency": "EUR" }' \ | jq -r '.[0].entityIds[0]') echo "$ENTITY_ID" ``` The create response is an array; `entityIds[0]` on the first element is the **system-assigned UUID** of the new entity. Subsequent reads and transitions address the entity by that UUID (`${ENTITY_ID}` here), not by the business key `orderId`. Automatic transitions (`manual: false`) fire immediately on creation, cascading the entity through applicable states until it reaches one with no outgoing auto transitions. 
The `orders-wf` workflow you just imported has none, so the entity settles in `draft` and waits for the manual `submit` transition below. ## Invoke a manual transition Trigger the `submit` transition on your entity. `{entityId}` (here `${ENTITY_ID}`) is the system-assigned UUID captured from the create response, not the business key `orderId`: ```bash curl -X PUT http://localhost:8080/api/entity/JSON/${ENTITY_ID}/submit ``` ## Read state back Fetch the current state and the transition log: ```bash curl http://localhost:8080/api/entity/${ENTITY_ID} ``` The response shows the entity in its new `submitted` state plus a record of every revision it has been through. For the full request and response shapes, see the [API reference](/reference/api/). ## Next steps - **Explore the binary.** Every flag, env var, endpoint, and error is described by a `cyoda help <topic>` action. Browse the full tree at [Reference → `cyoda help`](/reference/cyoda-help/). - **Understand the model.** Read [Design principles](/concepts/design-principles/) and [Entities and lifecycle](/concepts/entities-and-lifecycle/) for the mental model behind what you just did. - **Build real applications.** Start with [Build → working with entities](/build/working-with-entities/) for the end-to-end patterns. - **Choose a deployment tier.** See [Run → overview](/run/) when you outgrow local SQLite. If you want fast functional tests without durability, run cyoda-go in **in-memory** mode (`go run ./cmd/cyoda` or the `CYODA_STORAGE_BACKEND=memory` profile). See [Testing with digital twins](/build/testing-with-digital-twins/) for the pattern. --- ## index.md # Cyoda Documentation Build and run Cyoda applications — from local cyoda-go to hosted Cyoda Cloud. ![Illustration of interconnected entity lifecycle components — state, workflows, events, and audit — unified in a single Cyoda runtime.](heroImage) EDBMS — Entity Database Management System # One transactional runtime for the entity lifecycle.
Cyoda is an EDBMS: state machine, processors, and full revision history live inside the record, committed atomically — minimizing the need for sagas. A simpler stack than Postgres + Temporal + Kafka + a CDC audit pipeline. Build complete event-driven backends on one runtime. Open source, single Go binary, Postgres-backed. ## Four storage engines. One application contract. Three open-source engines ship with cyoda-go — in-memory, SQLite, and PostgreSQL — each tuned to a different operational shape. A commercial Cassandra plugin extends the same application code to fully scalable, robust production workloads.
## Where to go next - **New here?** Start with the [install-and-first-entity onramp](/getting-started/install-and-first-entity/). - **Understanding Cyoda?** Read [Concepts](/concepts/what-is-cyoda/). - **Building an app?** [Build](/build/) covers tier-agnostic patterns. - **Running one?** [Run](/run/) covers [desktop](/run/desktop/), [Docker](/run/docker/), [Kubernetes](/run/kubernetes/), and [Cyoda Cloud](/run/cyoda-cloud/). - **Need API specs?** [Reference](/reference/) embeds and ingests from [cyoda-go](https://github.com/Cyoda-platform/cyoda-go). --- ## reference.md # Reference Technical references — mostly generated from cyoda-go at build time. Reference content on this site is a narrative skin over the cyoda-go binary. The binary is self-documenting — every flag, environment variable, endpoint, error code, metric, header, and operation ships with its own help topic. This section points you at the right topics, shows you the REST/gRPC surfaces, and documents the CloudEvent JSON Schemas. The material here was captured against **cyoda-go v{helpIndex.pinnedVersion}**. For whatever version you are running, `cyoda help` on your own binary is the authoritative source. ## Start here - **[cyoda help](./cyoda-help/)** — navigator over the full topic tree. Every top-level topic and its drilldowns, with synopses. The best first stop. ## Surfaces - **[API](./api/)** — REST OpenAPI 3.1 reference, interactive viewer. - **[gRPC](./api/#grpc)** — gRPC CloudEventsService (cross-linked from the API page). - **[JSON Schemas](./schemas/)** — CloudEvent payload schemas, extracted from the pinned binary at build time. - **[Trino SQL](./trino/)** — SQL analytics surface (Cyoda Cloud; upcoming). ## Navigators over specific topics - **[CLI](./cli/)** — command-line entry points and global flags. - **[Configuration](./configuration/)** — configuration model, precedence, profiles, `_FILE` secrets. - **[Helm values](./helm/)** — chart layout, values model, secret provisioning. 
- **[Entity model export](./entity-model-export/)** — SIMPLE_VIEW export shape. Each navigator page carries a "Canonical reference" callout pointing at the corresponding `cyoda help <topic>` for the authoritative contract. --- ## reference/api.md # API reference REST and gRPC surfaces. The REST API reference is rendered in a dedicated viewer so the documentation chrome does not compete with it for horizontal space: [Open the REST API reference](/api-reference/) The viewer works from the OpenAPI spec shipped with this site (`/openapi/openapi.json`), which was extracted from **cyoda-go v{helpIndex.pinnedVersion}**. For the version you are running, `cyoda help openapi` on your own binary is authoritative. The viewer supports the standard operations: browsing endpoints, inspecting request/response shapes, and try-it-out calls against an environment of your choice. ## gRPC gRPC proto documentation is tracked upstream and will appear here once the generated reference is published from cyoda-go. Until then, the `.proto` files in [cyoda-go/api/grpc](https://github.com/cyoda-platform/cyoda-go/tree/main/api/grpc) are the authoritative source. --- ## reference/cli.md # CLI cyoda-go command-line interface — narrative navigator over `cyoda help cli`. export const cliTopics = (() => { const found = helpIndex.topics.filter(t => t.path[0] === 'cli'); if (found.length === 0) { throw new Error(`EmptyNavigator: reference/cli.mdx filtered helpIndex to zero topics under prefix "cli" (pinned v${helpIndex.pinnedVersion}). Likely a topic rename upstream.`); } return found; })(); The `cyoda` binary is the server and its own control surface. It runs as a single long-lived process by default, with a small number of subcommands for bootstrapping, health, and migration. Every flag, subcommand, and env var is documented in-binary via `cyoda help <topic>` and is version-accurate to whatever binary you are running.
## Output formats Every help topic supports three output formats via `--format`: - `text` (default on a TTY) — human reading. - `markdown` (default off-TTY) — paste into docs, PRs, chat. - `json` — machine-readable, stable schema; consumed by tools like cyoda-docs' own build pipeline. ## Drilldowns Topics that naturally subdivide take a multi-word path on the CLI: `cyoda help search async`, `cyoda help grpc compute`, and so on. The same path appears in the JSON surface as an array. ## Related topics The subset of `cyoda help` topics directly relevant to the CLI surface itself is below. {cliTopics.map((t) => (
  • cyoda help {t.path.join(' ')} — {t.title.replace(/^[^—]*—\s*/, '')}
    {t.synopsis}
  • ))} --- ## reference/configuration.md # Configuration cyoda-go configuration model — narrative navigator over `cyoda help config`. export const configTopics = (() => { const found = helpIndex.topics.filter(t => t.path[0] === 'config'); if (found.length === 0) { throw new Error(`EmptyNavigator: reference/configuration.mdx filtered helpIndex to zero topics under prefix "config" (pinned v${helpIndex.pinnedVersion}). Likely a topic rename upstream.`); } return found; })(); cyoda-go reads configuration from `CYODA_*` environment variables and from `.env`-format files. The authoritative key list — every variable, its type, its default — lives in the binary. This page covers the *model*: how sources compose, how profiles work, and how secrets are mounted from files. ## Sources and precedence Values resolve in this order, highest to lowest: 1. Shell environment 2. `.env.{profile}` files (in `CYODA_PROFILES` declaration order; later profiles override earlier ones within their group) 3. `.env` in the project directory 4. User config file 5. System config file 6. Hardcoded defaults Format is `.env` only (godotenv-parsed). No TOML, no YAML, no `--config` flag. Subcommand flags (e.g. `cyoda init --force`) are operation-scoped and do not override server-runtime configuration. **User config path** varies by OS: `~/.config/cyoda/cyoda.env` (Linux, macOS with XDG), `%AppData%\cyoda\cyoda.env` (Windows). **System config** lives at `/etc/cyoda/cyoda.env` on POSIX. ## Profiles `CYODA_PROFILES` is comma-separated and evaluated in declaration order. Within a profile, regular `.env` precedence applies; across profiles, later entries in the list override earlier ones. ## Secrets via `_FILE` suffix Any variable that accepts a credential (Postgres URL, JWT signing key, metrics bearer, gossip HMAC, bootstrap client secret) accepts a companion `*_FILE` variable that reads from a mounted file. Trailing whitespace is stripped. 
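A minimal sketch of the pattern — the variable name `CYODA_PG_URL_FILE` is illustrative (the real credential keys are listed by `cyoda help config`), and the secrets mount is faked with a local temp file:

```shell
# In Kubernetes or Docker, a secrets mount would place this file;
# here we fake one locally for illustration.
secret_file="$(mktemp)"
printf 'postgres://cyoda:s3cret@db:5432/cyoda' > "$secret_file"

# The binary reads the credential from the file at startup.
export CYODA_PG_URL_FILE="$secret_file"
```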
The `_FILE` variant takes precedence when both are set — the pattern designed for Kubernetes Secrets and Docker secrets mounts. ## Related topics {configTopics.map((t) => (
  • cyoda help {t.path.join(' ')} — {t.title.replace(/^[^—]*—\s*/, '')}
    {t.synopsis}
  • ))) --- ## reference/cyoda-help.md # `cyoda help` — topic tree Every flag, env var, endpoint, error, metric, and operation — browsable from the binary. This page is a navigator over the full topic tree. export const byTopLevel = (() => { const groups = new Map(); for (const t of helpIndex.topics) { const top = t.path[0]; if (!groups.has(top)) groups.set(top, []); groups.get(top).push(t); } return Array.from(groups.entries()).map(([top, topics]) => { topics.sort((a, b) => a.path.join('/').localeCompare(b.path.join('/'))); return { top, topics }; }); })(); The cyoda binary is self-documenting. Every flag, environment variable, endpoint, error code, metric, header, and operation is described by a **help topic** that ships with the binary. Topics are structured; many subdivide into drilldowns. The tree is stable across patch releases and evolves with minor releases. ## How to use it ```bash cyoda help # browse the whole tree cyoda help <topic> # read one topic (e.g. `cyoda help search`) cyoda help <topic> <subtopic> # drill down (e.g. `cyoda help search async`) cyoda help --format=json # machine-readable cyoda help --format=markdown # default off-TTY — paste into docs, PRs, chat ``` The binary you run is the authority for the version you run. This page lists the tree as shipped by the cyoda-go release this site was built against — other releases may add or rename topics. ## The tree {byTopLevel.map(({ top, topics }) => ( <>
    cyoda help {top}

    {topics.find(t => t.path.length === 1)?.synopsis ?? ''}

    {topics.filter(t => t.path.length > 1).length > 0 && (
      {topics.filter(t => t.path.length > 1).map(t => (
    • cyoda help {t.path.join(' ')} — {t.synopsis}
    • ))}
    )}
    ))} {` .cyoda-help-tree dt { margin-top: 1.2rem; font-weight: 600; } .cyoda-help-tree dd { margin-left: 1.5rem; margin-top: 0.25rem; } .cyoda-help-tree dd p { margin: 0 0 0.4rem 0; } .cyoda-help-tree dd ul { margin: 0.25rem 0 0 0; padding-left: 1.2rem; } .cyoda-help-tree dd li { margin-block: 0.15rem; } .cyoda-help-tree dd code { font-size: 0.9em; } `} --- ## reference/entity-model-export.md # Entity model export (SIMPLE_VIEW) API specification for the SIMPLE_VIEW entity-model export — response format, node descriptors, type descriptors, and error shapes. The SIMPLE_VIEW export returns the full structural model of an entity type — every node, every field, every observed type — in a compact, round-trippable JSON form. This page is the wire-format specification: endpoint, response envelope, node and type descriptors, JSON Schema, and error shapes. For the conceptual context (what a model is, how discovery/widening/locking work), see [Modeling entities](/build/modeling-entities/) and [Entities and lifecycle](/concepts/entities-and-lifecycle/). ## Endpoint ``` GET /model/export/SIMPLE_VIEW/{entityName}/{modelVersion} ``` | Parameter | Type | Description | |----------------|---------|--------------------------------------| | `entityName` | String | Name of the entity model | | `modelVersion` | Integer | Version number of the entity model | **Response Content-Type:** `application/json` ## Response envelope Every SIMPLE_VIEW export response is a JSON object with exactly two top-level keys: | Key | Type | Description | |----------------|--------|--------------------------------------------------------------| | `currentState` | String | Lifecycle state of the model: `"UNLOCKED"` or `"LOCKED"` | | `model` | Object | The SIMPLE_VIEW body — a map of **node paths** to **node descriptors** | ```json { "currentState": "LOCKED", "model": { ... 
} } ``` ## The `model` object — structure overview The `model` value is a flat JSON object whose **keys are node paths** and whose **values are node descriptors**. Each entry describes one level of the entity's hierarchical structure. The entire nested tree is flattened into this single-depth map. There are three kinds of node descriptors, corresponding to three structural cases: 1. **Object nodes** — JSON objects mapping field keys to type descriptors. 2. **Array nodes** — a single JSON value (primitive or array) representing a detached array specification. 3. **Mixed nodes** — a JSON array of exactly two elements: `[, ]`. ## Node paths Node paths use JSONPath-like syntax rooted at `$`. | Path | Meaning | |---------------------------|-----------------------------------------------------------| | `$` | The root object | | `$.fieldName[*]` | Array elements inside `fieldName` on the root object | | `$.parent[*].child[*]` | Array elements inside `child`, nested under `parent` array elements | | `$.a[*][*]` | Elements of a nested (multi-dimensional) array | The `[*]` marker (called `COLLECTION_MARKER` internally) denotes "all elements of this array." Node paths are always sorted lexicographically in the output. **Depth** is derived from the number of `[*]` segments in the path (i.e., `path.split("[*]").count() - 1`). ## Node descriptor formats ### 1. Object node (most common) A JSON object whose entries fall into two categories: #### a) Data fields — keys starting with `.` (dot) Each key is a dot-prefixed field name. The value is the field's **type descriptor** (see [Type descriptors](#type-descriptors) below). 
```json { ".category": "STRING", ".year": "INTEGER", ".score": "DOUBLE" } ``` Array fields within an object node have keys ending in `[*]`: ```json { ".tags[*]": "(STRING x 3)", ".name": "STRING" } ``` The `(TYPE x WIDTH)` form is an array descriptor, where `TYPE` is the element type and `WIDTH` is the array length; e.g., `(INTEGER x 4)` is a four-element integer array. See [Array type descriptors](#array-type-descriptors) for the full syntax. #### b) Structural fields — keys starting with `#` Structural fields are metadata markers prefixed with `#`. They indicate the role of this node in the overall structure. | Key | Value | Meaning | |----------------|--------------------|-------------------------------------------------| | `#` | `"ARRAY_ELEMENT"` | This node describes elements of its parent array | | `#.fieldName` | `"OBJECT"` | `fieldName` is an object (has its own child node)| Example — a node representing array elements with an object sub-field: ```json { ".firstname": "STRING", ".id": "STRING", "#": "ARRAY_ELEMENT", "#.address": "OBJECT" } ``` Fields are sorted alphabetically within the object (data fields first, then structural fields, both sorted by key). ### 2. Array node (detached array) Arrays of arrays (multidimensional arrays beyond the first dimension) create detached array nodes — each inner array becomes its own node in the export tree, keyed by the path to that inner array (e.g. `$.matrix[*]`). Likewise, arrays of objects create a separate node for the element shape (see Examples 1 and 3). When a node path points to a pure array (no object fields at this level), the value is the array's **type descriptor** directly — either a UniTypeArray string or a MultiTypeArray JSON array. See [Array type descriptors](#array-type-descriptors). ```json { "$.data[*]": "(INTEGER x 5)" } ``` ### 3.
Mixed node (structural polymorphism) When the same path has been observed as both an object and an array (structural polymorphism), the value is a JSON array of exactly two elements: ```json { "$.data[*]": [ { ".nested": "STRING", "#": "ARRAY_ELEMENT" }, "(INTEGER x 2)" ] } ``` - Element `[0]`: the object-node descriptor. - Element `[1]`: the array-node type descriptor. ## Type descriptors Type descriptors appear as values for data fields (`.fieldName` keys) and for array nodes. ### Primitive type descriptor A JSON string containing a single `DataType` name or a polymorphic set. **Monomorphic** (single type): ``` "STRING" ``` **Polymorphic** (multiple observed types for the same field, enclosed in brackets): ``` "[INTEGER, STRING]" ``` ### Supported DataType values | DataType | Description | |--------------------|------------------------------------------------| | `STRING` | Text value | | `BYTE` | 8-bit signed integer | | `SHORT` | 16-bit signed integer | | `INTEGER` | 32-bit signed integer | | `LONG` | 64-bit signed integer | | `BIG_INTEGER` | Arbitrary-precision integer (bounded by Int128) | | `UNBOUND_INTEGER` | Arbitrary-precision integer (unbounded) | | `FLOAT` | 32-bit IEEE 754 floating point | | `DOUBLE` | 64-bit IEEE 754 floating point | | `BIG_DECIMAL` | Arbitrary-precision decimal (bounded, scale ≤ 18) | | `UNBOUND_DECIMAL` | Arbitrary-precision decimal (unbounded) | | `BOOLEAN` | Boolean value | | `CHARACTER` | Single character | | `LOCAL_DATE` | Date without time zone (ISO 8601) | | `LOCAL_DATE_TIME` | Date-time without time zone | | `LOCAL_TIME` | Time without date | | `ZONED_DATE_TIME` | Date-time with time zone | | `YEAR` | Year value | | `YEAR_MONTH` | Year and month | | `UUID_TYPE` | UUID | | `TIME_UUID_TYPE` | Version 1 (time-based) UUID | | `BYTE_ARRAY` | Binary data (base64-encoded) | | `NULL` | Null / no value observed yet | ### Structural DataType values (used only in `#`-prefixed keys) | DataType | Description | 
|--------------------|------------------------------------------------| | `OBJECT` | Marks a field as an object (has its own child node) | | `ARRAY` | Marks a field as an array container | | `ARRAY_ELEMENT` | Marks this node as describing array elements | | `TYPE_REFERENCE` | Internal reference to another type definition | | `POLYMORPHIC` | Internal marker for polymorphic fields | ## Array type descriptors Array fields (keys ending in `[*]`) and detached array nodes use one of two array representations. ### UniTypeArray (homogeneous) All elements have the same type. Serialized as a parenthesized string: ``` (<type> x <width>) ``` - `type`: a DataType name or polymorphic set. - `width`: the maximum observed array length. Examples: ``` "(STRING x 3)" — array of 3 strings "(INTEGER x 10)" — array of 10 integers "([INTEGER, STRING] x 4)" — array of 4 elements, each either integer or string ``` ### MultiTypeArray (heterogeneous) Elements at different positions have different types. Serialized as a JSON array of type strings: ```json ["INTEGER", "STRING", "BOOLEAN"] ``` Each element in the JSON array represents the type at that index position.
Polymorphic elements within a multi-type array use the bracket notation: ```json ["[INTEGER, STRING]", "INTEGER", "[INTEGER, STRING]", "INTEGER"] ``` ## Complete examples ### Example 1: Simple flat object (Nobel Prize) **Input data shape:** ```json { "category": "chemistry", "year": "2020", "laureates": [ { "firstname": "Emmanuelle", "id": "991", "motivation": "...", "share": "2", "surname": "Charpentier" } ] } ``` **SIMPLE_VIEW export:** ```json { "currentState": "LOCKED", "model": { "$": { ".category": "STRING", ".year": "STRING" }, "$.laureates[*]": { ".firstname": "STRING", ".id": "STRING", ".motivation": "STRING", ".share": "STRING", ".surname": "STRING", "#": "ARRAY_ELEMENT" } } } ``` ### Example 2: Nested objects and primitive arrays **Input data shape:** ```json { "name": "Alice", "scores": [95, 87, 92], "address": { "city": "London", "zip": "SW1A" } } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".address.city": "STRING", ".address.zip": "STRING", ".name": "STRING", ".scores[*]": "(BYTE x 3)" } } } ``` Plain nested objects (non-array) are **inlined** into the parent node using dot-path notation (e.g., `.address.city`). They do not produce separate node entries or `#.fieldName` structural markers. 
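These string forms are easy to pick apart mechanically. A small shell sketch — the helper names are ours, and the patterns follow the node-path depth rule and the UniTypeArray syntax described above:

```shell
# Depth of a node path = number of "[*]" segments (see "Node paths").
# Splitting on the marker and counting fields mirrors split("[*]").count() - 1.
path_depth() {
  printf '%s' "$1" | awk -F'\\[\\*\\]' '{print NF - 1}'
}

# Split a UniTypeArray descriptor "(TYPE x WIDTH)" into its two parts.
array_type()  { printf '%s' "$1" | sed -E 's/^\((.*) x [0-9]+\)$/\1/'; }
array_width() { printf '%s' "$1" | sed -E 's/^\(.* x ([0-9]+)\)$/\1/'; }

path_depth '$.parent[*].child[*]'       # → 2
array_type '([INTEGER, STRING] x 4)'    # → [INTEGER, STRING]
array_width '(STRING x 3)'              # → 3
```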
### Example 3: Multi-dimensional array **Input data shape:** ```json { "matrix": [ [1, 2, 3], [4, 5, 6] ] } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".matrix[*]": "(ARRAY_ELEMENT x 2)", "#.matrix": "OBJECT" }, "$.matrix[*]": "(INTEGER x 3)" } } ``` ### Example 4: Polymorphic field **Input data shape** (after ingesting multiple records): ```json { "data": "hello" } { "data": 42 } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".data": "[INTEGER, STRING]" } } } ``` ### Example 5: Mixed node (structural polymorphism) **Input data shape** (after ingesting multiple records with different shapes): ```json { "data": [{"nested": "primitive"}] } { "data": [[123, 321], [456, 654]] } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".data[*]": "(ARRAY_ELEMENT x 2)", "#.data": "OBJECT" }, "$.data[*]": [ { ".nested": "STRING", "#": "ARRAY_ELEMENT" }, "(INTEGER x 2)" ] } } ``` The `$.data[*]` entry is a JSON array of two elements because the system has observed `data` elements as both objects (with `.nested` field) and arrays of integers. ### Example 6: Heterogeneous (multi-type) array **Input data shape:** ```json { "row": [1, null, "three"] } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".row[*]": ["INTEGER", "NULL", "STRING"] } } } ``` ## JSON Schema for the SIMPLE_VIEW response ```json { "$schema": "https://json-schema.org/draft/2020-12/schema", "title": "SIMPLE_VIEW Model Export Response", "description": "Response from GET /model/export/SIMPLE_VIEW/{entityName}/{modelVersion}", "type": "object", "required": ["currentState", "model"], "additionalProperties": false, "properties": { "currentState": { "type": "string", "enum": ["LOCKED", "UNLOCKED"], "description": "Lifecycle state of the entity model." }, "model": { "type": "object", "description": "Map of node paths to node descriptors. 
Keys are JSONPath-like strings (e.g., '$', '$.field[*]'). Always contains at least the root node '$'.", "propertyNames": { "pattern": "^\\$(\\.[\\w][-\\w.]*(\\[\\*\\])*)*$" }, "additionalProperties": { "$ref": "#/$defs/nodeDescriptor" } } }, "$defs": { "dataType": { "type": "string", "description": "A primitive DataType name.", "enum": [ "STRING", "BYTE", "SHORT", "INTEGER", "LONG", "BIG_INTEGER", "UNBOUND_INTEGER", "FLOAT", "DOUBLE", "BIG_DECIMAL", "UNBOUND_DECIMAL", "BOOLEAN", "CHARACTER", "LOCAL_DATE", "LOCAL_DATE_TIME", "LOCAL_TIME", "ZONED_DATE_TIME", "YEAR", "YEAR_MONTH", "UUID_TYPE", "TIME_UUID_TYPE", "BYTE_ARRAY", "NULL" ] }, "structuralDataType": { "type": "string", "description": "DataType used in structural field values.", "enum": ["OBJECT", "ARRAY", "ARRAY_ELEMENT", "TYPE_REFERENCE", "POLYMORPHIC"] }, "typeDescriptor": { "description": "A field type: a single DataType, a polymorphic set '[TYPE1, TYPE2]', or an array spec '(TYPE x N)'.", "type": "string", "examples": [ "STRING", "INTEGER", "[INTEGER, STRING]", "(STRING x 3)", "([INTEGER, STRING] x 4)" ] }, "multiTypeArrayDescriptor": { "description": "A heterogeneous array where each position may have a different type.", "type": "array", "items": { "type": "string", "description": "Type descriptor for the element at this index position." }, "minItems": 1, "examples": [ ["INTEGER", "STRING", "BOOLEAN"], ["[INTEGER, STRING]", "INTEGER"] ] }, "objectNodeDescriptor": { "type": "object", "description": "Describes an object structure node. Keys prefixed with '.' are data fields; keys prefixed with '#' are structural markers.", "propertyNames": { "pattern": "^(\\.[\\w][-\\w.]*(\\[\\*\\])*)|(#\\.?[-\\w.]*)$" }, "additionalProperties": { "oneOf": [ { "$ref": "#/$defs/typeDescriptor" }, { "$ref": "#/$defs/multiTypeArrayDescriptor" }, { "$ref": "#/$defs/structuralDataType" } ] } }, "arrayNodeDescriptor": { "description": "Describes a detached array node. 
Either a UniTypeArray string or a MultiTypeArray JSON array.", "oneOf": [ { "$ref": "#/$defs/typeDescriptor" }, { "$ref": "#/$defs/multiTypeArrayDescriptor" } ] }, "mixedNodeDescriptor": { "description": "A node exhibiting structural polymorphism — observed as both object and array. Element [0] is the object descriptor, element [1] is the array descriptor.", "type": "array", "prefixItems": [ { "$ref": "#/$defs/objectNodeDescriptor" }, { "$ref": "#/$defs/arrayNodeDescriptor" } ], "minItems": 2, "maxItems": 2 }, "nodeDescriptor": { "description": "A node descriptor: object, array, or mixed.", "oneOf": [ { "$ref": "#/$defs/objectNodeDescriptor" }, { "$ref": "#/$defs/arrayNodeDescriptor" }, { "$ref": "#/$defs/mixedNodeDescriptor" } ] } } } ``` ## Error responses | Status | Condition | Response Body | |--------|-------------------------------|--------------------------------------------| | 404 | Model not found | RFC 7807 Problem Detail with `entityName` and `entityVersion` in `properties` | | 400 | Invalid converter value | RFC 7807 Problem Detail with `parameter` and `invalidValue` in `properties` | **404 example:** ```json { "type": "about:blank", "title": "Not Found", "status": 404, "detail": "cannot find model entityName=nobel-prize, version=2", "instance": "/api/model/export/SIMPLE_VIEW/nobel-prize/2", "properties": { "entityName": "nobel-prize", "entityVersion": 2 } } ``` ## Key behaviors for consumers 1. **The root node `$` is always present** in the model. It represents the top-level object of the entity. 2. **Field ordering is deterministic.** Within each object node, data fields (`.` prefix) are sorted alphabetically, followed by structural fields (`#` prefix) also sorted alphabetically. Node paths in the `model` object are sorted lexicographically. 3. **Models evolve via merging.** As new entity instances are ingested, the model grows: new fields appear, types may widen (e.g., `INTEGER` → `[INTEGER, STRING]`), and array widths may increase. 
The SIMPLE_VIEW always reflects the cumulative model. 4. **Polymorphic types** use bracket notation `[TYPE1, TYPE2]` within a single string. Types within the brackets are sorted by the `DataType` enum ordering defined in cyoda-go (see `internal/domain/model/schema/types.go` for the authoritative rule): numeric types first (integer families, then decimal families), then text types (`STRING`, `CHARACTER`), then temporal, identifier, binary, boolean, and `NULL` last. 5. **UniTypeArray vs MultiTypeArray:** If all array elements have the same type, you get `(TYPE x N)`. If different positions have different types, you get a JSON array `["TYPE1", "TYPE2", ...]`. When element types converge through merging, a MultiTypeArray may simplify back to a UniTypeArray. 6. **Structural fields indicate nesting.** A `#.fieldName` entry with value `"OBJECT"` means that field has its own child node in the model map. A `#` entry with value `"ARRAY_ELEMENT"` means this node describes elements of its parent array. 7. **The SIMPLE_VIEW is round-trippable.** It can be exported and re-imported via the import endpoint (`POST /model/import/{dataFormat}/{converter}/{entityName}/{modelVersion}`, e.g., `POST /model/import/JSON/SIMPLE_VIEW/{entityName}/{modelVersion}`) without loss of structural information. --- ## reference/helm.md # Helm values cyoda-go Helm chart — narrative navigator over `cyoda help helm`. The `deploy/helm/cyoda` chart packages cyoda-go for Kubernetes. The chart's own `values.yaml` enumerates every configurable key; this page covers the model that shapes how those values map to Kubernetes objects and how secrets are provisioned.
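As a sketch of that shape, a hypothetical values excerpt — the key names here are invented for illustration; the chart's own `values.yaml` is the authoritative list:

```yaml
# Hypothetical excerpt; key names are illustrative, not the chart's actual schema.
image:
  tag: v0.6.2              # deployment value: shapes the Pod spec, never reaches the binary
replicaCount: 2            # deployment value
config:
  CYODA_HTTP_PORT: "8080"  # configuration value: rendered into the ConfigMap for the binary
existingSecret: cyoda-credentials  # mounted Secret; CYODA_*_FILE vars point into it
```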
## What the chart provisions A standard deployment per release: a `Deployment`, a `Service` for HTTP + gRPC + admin ports, a `ConfigMap` materialising the `.env`-format configuration, and a `Secret` for credential material. An optional `ServiceMonitor` (Prometheus Operator) and `HorizontalPodAutoscaler` are wired off per-value flags. ## Values model Values split into two groups: - **Configuration values** — map one-to-one to the `CYODA_*` env vars documented in the binary. Changing them alters runtime behaviour but not Kubernetes shape. See `cyoda help config` for the list. - **Deployment values** — image tag, replica count, resource requests, service type, ingress glue. These shape the Kubernetes objects the chart renders; they never reach the binary. ## Secret provisioning Credentials follow the same `_FILE` pattern as bare-metal deployments. The chart mounts the `Secret` at a known path and sets `CYODA_*_FILE` env vars accordingly, so the binary reads each secret at startup. You never put raw credentials into the chart's `values.yaml` in a production deployment — wire an existing `Secret` via `existingSecret` or equivalent. ## Related topics Run `cyoda help helm` for the full list of Helm topics and their synopses. See [Run → Kubernetes](/run/kubernetes/) for the deployment pattern and production-sizing guidance. --- ## reference/schemas.md # JSON Schemas Complete reference for all JSON schemas used in Cyoda This section documents the JSON schemas used by the Cyoda platform — CloudEvent payloads exchanged over the gRPC processing stream, plus the entity and model structures that travel over REST and gRPC. The schemas shown here were captured against **cyoda-go v0.6.2**. For the version you are running, `cyoda help cloudevents` (narrative) and `cyoda help cloudevents json` (machine-readable) on your own binary are authoritative — the binary ships its own schema tree and always matches its own code. ## Download Schemas You can download all schemas as a ZIP file: [schemas.zip](/schemas.zip) ## Schema Categories Browse [Common schemas](./common/) Browse [Entity schemas](./entity/) Browse [Model schemas](./model/) Browse [Processing schemas](./processing/) Browse [Search schemas](./search/) ## Using JSON Schemas JSON schemas define the structure and validation rules for data in the Cyoda platform. Each schema includes: - **Property definitions** with types and descriptions - **Required fields** clearly marked - **Validation rules** for data integrity - **References** to related schemas Navigate to any category above to explore the available schemas. --- ## reference/schemas/common.md JSON schemas in the Common category # Common Schemas This section contains JSON schemas for the Common category.
## Available Schemas - [ModelSpec](./model-spec/) - [ModelInfo](./model-info/) - [ModelConverterType](./model-converter-type/) - [ErrorCode](./error-code/) - [EntityMetadata](./entity-metadata/) - [EntityChangeMeta](./entity-change-meta/) - [DataPayload](./data-payload/) - [DataFormat](./data-format/) - [CloudEventType](./cloud-event-type/) - [BaseEvent](./base-event/) --- ## reference/schemas/common/base-event.md # BaseEvent Schema definition for BaseEvent ## Description This schema defines the structure and validation rules for BaseEvent. ## Properties - **error** (object): Error details (if present). - **id** (string, required): Event ID. - **success** (boolean): Flag indicating whether this message relates to a failure. - **warnings** (array): Warnings (if applicable). ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/cloud-event-type.md # CloudEventType Schema definition for CloudEventType ## Description This schema defines the structure and validation rules for CloudEventType. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/data-format.md # DataFormat Specifies the format of the input data (e.g., JSON). ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/data-payload.md # DataPayload Schema definition for DataPayload ## Description This schema defines the structure and validation rules for DataPayload. ## Properties - **data** (any): Payload data. - **meta** (any): Metadata for the payload.
- **type** (string, required): Payload type. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/entity-change-meta.md # EntityChangeMeta Metadata about entity changes, including transaction information and change type. ## Properties - **changeType** (string, required): Type of change that was made to the entity. - **fieldsChangedCount** (integer): Number of fields changed in the entity for this change. - **timeOfChange** (string, required): Timestamp when the change occurred. - **transactionId** (string): UUID of the transaction that made this change. - **user** (string, required): User who made the change. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/entity-metadata.md # EntityMetadata Metadata about an entity. id, modelKey and creationDate are invariant against the point-in-time. All other values are with respect to the as-at point-in-time for which the entity was retrieved. If the point-in-time was not explicitly set, the values correspond to the latest state of the entity.
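To make the as-at semantics concrete, a hypothetical instance — all values are invented for illustration, and the transition name is not from any real workflow:

```json
{
  "id": "00000000-0000-0000-0000-000000000001",
  "modelKey": { "name": "orders", "version": 1 },
  "creationDate": "2026-04-01T09:00:00Z",
  "lastUpdateTime": "2026-04-10T14:30:00Z",
  "state": "submitted",
  "pointInTime": "2026-04-14T00:00:00Z",
  "transactionId": "00000000-0000-0000-0000-0000000000aa",
  "transitionForLatestSave": "submit"
}
```

Here `id`, `modelKey`, and `creationDate` would be identical for any `pointInTime`; the remaining fields describe the entity as it stood at 2026-04-14.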
## Properties - **creationDate** (string, required): The creation date of the entity. - **id** (string, required): ID of the entity. - **lastUpdateTime** (string, required): The last time the entity was updated as-at the given point-in-time. Equals the creation date if the entity has not been updated. - **modelKey** (object): Model of the entity. - **pointInTime** (string): Optional value for the as-at point-in-time for which the entity was retrieved. - **state** (string, required): The state of the entity at the given point-in-time. - **transactionId** (string): The transaction id of the entity when last saved as-at the given point-in-time. - **transitionForLatestSave** (string): The transition applied to the entity when it was last saved as-at the given point-in-time. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/error-code.md # ErrorCode Schema definition for ErrorCode ## Description This schema defines the structure and validation rules for ErrorCode. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/model-converter-type.md # ModelConverterType Defines the type of converter to use when importing the model (e.g., SAMPLE_DATA to use a sample object). ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/model-info.md # ModelInfo Schema definition for ModelInfo ## Description This schema defines the structure and validation rules for ModelInfo.
## Properties - **id** (string, required): Id of the model. - **name** (string, required): Name of the model. - **state** (string, required): Current state of the model. - **version** (integer, required): Version of the model. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/model-spec.md # ModelSpec Schema definition for ModelSpec ## Description This schema defines the structure and validation rules for ModelSpec. ## Properties - **name** (string, required): Name of the model. - **version** (integer, required): Version of the model. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/statemachine.md JSON schemas in the Statemachine category # Statemachine Schemas This section contains JSON schemas for the Statemachine category. ## Available Schemas - [WorkflowInfo](./workflow-info/) - [TransitionInfo](./transition-info/) - [ProcessorInfo](./processor-info/) --- ## reference/schemas/common/statemachine/processor-info.md # ProcessorInfo Schema definition for ProcessorInfo ## Description This schema defines the structure and validation rules for ProcessorInfo. ## Properties - **id** (string, required): Processor ID. - **name** (string, required): Processor name. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/statemachine/transition-info.md # TransitionInfo Schema definition for TransitionInfo ## Description This schema defines the structure and validation rules for TransitionInfo. ## Properties - **id** (string, required): Transition ID. - **name** (string, required): Transition name. - **stateFrom** (string, required): Source state.
- **stateTo** (string, required): Target state. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/statemachine/workflow-info.md # WorkflowInfo Schema definition for WorkflowInfo ## Description This schema defines the structure and validation rules for WorkflowInfo. ## Properties - **id** (string, required): Workflow ID. - **name** (string, required): Workflow name. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/entity.md JSON schemas in the Entity category # Entity Schemas This section contains JSON schemas for the Entity category. ## Available Schemas - [EntityUpdateRequest](./entity-update-request/) - [EntityUpdatePayload](./entity-update-payload/) - [EntityUpdateCollectionRequest](./entity-update-collection-request/) - [EntityTransitionResponse](./entity-transition-response/) - [EntityTransitionRequest](./entity-transition-request/) - [EntityTransactionResponse](./entity-transaction-response/) - [EntityTransactionInfo](./entity-transaction-info/) - [EntityDeleteResponse](./entity-delete-response/) - [EntityDeleteRequest](./entity-delete-request/) - [EntityDeleteAllResponse](./entity-delete-all-response/) - [EntityDeleteAllRequest](./entity-delete-all-request/) - [EntityCreateRequest](./entity-create-request/) - [EntityCreatePayload](./entity-create-payload/) - [EntityCreateCollectionRequest](./entity-create-collection-request/) --- ## reference/schemas/entity/entity-create-collection-request.md # EntityCreateCollectionRequest Schema definition for EntityCreateCollectionRequest ## Description This schema defines the structure and validation rules for EntityCreateCollectionRequest.
## Properties - **dataFormat** (object, required): - **payloads** (array, required): Data payloads containing entities to save. - **transactionTimeoutMs** (integer): Indicates the timeout of the transaction for a transactional save. - **transactionWindow** (integer): The collection will be saved in a single transaction up to a maximum number of entities given by the transactionWindow. Collections exceeding the transactionWindow size will be saved in separate chunked transactions of the transactionWindow size. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-create-payload.md # EntityCreatePayload Schema definition for EntityCreatePayload ## Description This schema defines the structure and validation rules for EntityCreatePayload. ## Properties - **data** (any, required): Payload data. - **model** (object, required): Entity model to use for this payload. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-create-request.md # EntityCreateRequest Schema definition for EntityCreateRequest ## Description This schema defines the structure and validation rules for EntityCreateRequest. ## Properties - **dataFormat** (object, required): - **payload** (object, required): Data payload containing the entity to save. - **transactionTimeoutMs** (integer): Indicates the timeout of the transaction for a transactional save. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category.
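A hypothetical EntityCreateRequest body, assembled from the properties above — all values are illustrative, and the `"JSON"` spelling of `dataFormat` is an assumption (consult the DataFormat schema for the accepted representation):

```json
{
  "dataFormat": "JSON",
  "payload": {
    "data": { "customerId": "c-100", "total": 42.5 },
    "model": { "name": "orders", "version": 1 }
  },
  "transactionTimeoutMs": 30000
}
```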
--- ## reference/schemas/entity/entity-delete-all-request.md # EntityDeleteAllRequest Schema definition for EntityDeleteAllRequest ## Description This schema defines the structure and validation rules for EntityDeleteAllRequest. ## Properties - **model** (object, required): Information about the model. - **pageSize** (integer): Page size. - **pointInTime** (string): Point in time; i.e., delete all entities that existed prior to this point in time. - **transactionSize** (integer): Transaction size. - **verbose** (boolean): Include the list of deleted entity ids in the response. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-delete-all-response.md # EntityDeleteAllResponse Schema definition for EntityDeleteAllResponse ## Description This schema defines the structure and validation rules for EntityDeleteAllResponse. ## Properties - **entityIds** (array, required): IDs of the removed entities. - **errorsById** (object): Collections of errors by id, if any. - **modelId** (string, required): ID of the model. - **numDeleted** (integer, required): Number of deleted entities. - **requestId** (string, required): ID of the original request to get data. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-delete-request.md # EntityDeleteRequest Schema definition for EntityDeleteRequest ## Description This schema defines the structure and validation rules for EntityDeleteRequest. ## Properties - **entityId** (string, required): ID of the entity. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category.
--- ## reference/schemas/entity/entity-delete-response.md # EntityDeleteResponse Schema definition for EntityDeleteResponse ## Description This schema defines the structure and validation rules for EntityDeleteResponse. ## Properties - **entityId** (string, required): ID of the removed entity. - **model** (object, required): Information about the model of the removed entity. - **requestId** (string, required): ID of the original request to get data. - **transactionId** (string, required): ID of the transaction. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-transaction-info.md # EntityTransactionInfo Schema definition for EntityTransactionInfo ## Description This schema defines the structure and validation rules for EntityTransactionInfo. ## Properties - **entityIds** (array, required): IDs of entities in this transaction. - **transactionId** (string): ID of the transaction. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-transaction-response.md # EntityTransactionResponse Schema definition for EntityTransactionResponse ## Description This schema defines the structure and validation rules for EntityTransactionResponse. ## Properties - **requestId** (string, required): ID of the original request to save data. - **transactionInfo** (object, required): Entity transaction info. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category.
--- ## reference/schemas/entity/entity-transition-request.md # EntityTransitionRequest Schema definition for EntityTransitionRequest ## Description This schema defines the structure and validation rules for EntityTransitionRequest. ## Properties - **entityId** (string, required): ID of the entity. - **transition** (string, required): Name of the transition to apply. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-transition-response.md # EntityTransitionResponse Schema definition for EntityTransitionResponse ## Description This schema defines the structure and validation rules for EntityTransitionResponse. ## Properties - **availableTransitions** (array): Available transitions from the current state. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-update-collection-request.md # EntityUpdateCollectionRequest Schema definition for EntityUpdateCollectionRequest ## Description This schema defines the structure and validation rules for EntityUpdateCollectionRequest. ## Properties - **dataFormat** (object, required): - **payloads** (array, required): Data payloads containing entities to update. - **transactionTimeoutMs** (integer): Indicates the timeout of the transaction for a transactional save. - **transactionWindow** (integer): The collection will be saved in a single transaction up to a maximum number of entities given by the transactionWindow. Collections exceeding the transactionWindow size will be saved in separate chunked transactions of the transactionWindow size.
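The transactionWindow chunking rule reduces to simple arithmetic. A minimal sketch of that arithmetic (illustrative only — this is not the server implementation):

```python
def chunk_transactions(num_entities: int, transaction_window: int) -> list[int]:
    """Split a collection save into per-transaction sizes, per the
    documented transactionWindow rule (illustrative sketch)."""
    if transaction_window <= 0:
        # No window configured: the whole collection in one transaction.
        return [num_entities]
    full, rest = divmod(num_entities, transaction_window)
    return [transaction_window] * full + ([rest] if rest else [])

# 250 entities with a window of 100 -> three chunked transactions
print(chunk_transactions(250, 100))  # [100, 100, 50]
```

So a 250-entity collection with `transactionWindow: 100` is persisted as two transactions of 100 and one of 50.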
## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-update-payload.md # EntityUpdatePayload Schema definition for EntityUpdatePayload ## Description This schema defines the structure and validation rules for EntityUpdatePayload. ## Properties - **data** (any, required): Entity payload data. - **entityId** (string, required): ID of the entity. - **transition** (string): Transition to use for the update. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-update-request.md # EntityUpdateRequest Schema definition for EntityUpdateRequest ## Description This schema defines the structure and validation rules for EntityUpdateRequest. ## Properties - **dataFormat** (object, required): - **payload** (object, required): Data payload containing the entity to update. - **transactionTimeoutMs** (integer): Indicates the timeout of the transaction for a transactional save. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/model.md JSON schemas in the Model category # Model Schemas This section contains JSON schemas for the Model category.
## Available Schemas - [EntityModelTransitionResponse](./entity-model-transition-response/) - [EntityModelTransitionRequest](./entity-model-transition-request/) - [EntityModelImportResponse](./entity-model-import-response/) - [EntityModelImportRequest](./entity-model-import-request/) - [EntityModelGetAllResponse](./entity-model-get-all-response/) - [EntityModelGetAllRequest](./entity-model-get-all-request/) - [EntityModelExportResponse](./entity-model-export-response/) - [EntityModelExportRequest](./entity-model-export-request/) - [EntityModelDeleteResponse](./entity-model-delete-response/) - [EntityModelDeleteRequest](./entity-model-delete-request/) --- ## reference/schemas/model/entity-model-delete-request.md # EntityModelDeleteRequest Schema definition for EntityModelDeleteRequest ## Description This schema defines the structure and validation rules for EntityModelDeleteRequest. ## Properties - **model** (object, required): Entity model specification. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-delete-response.md # EntityModelDeleteResponse Schema definition for EntityModelDeleteResponse ## Description This schema defines the structure and validation rules for EntityModelDeleteResponse. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-export-request.md # EntityModelExportRequest Schema definition for EntityModelExportRequest ## Description This schema defines the structure and validation rules for EntityModelExportRequest.
## Properties - **converter** (object, required): - **model** (object, required): Entity model specification. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-export-response.md # EntityModelExportResponse Schema definition for EntityModelExportResponse ## Description This schema defines the structure and validation rules for EntityModelExportResponse. ## Properties - **model** (object, required): Entity model specification. - **modelId** (string): ID of the entity model. - **payload** (any, required): The content format of the exported entity model depends on the selected converter. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-get-all-request.md # EntityModelGetAllRequest Schema definition for EntityModelGetAllRequest ## Description This schema defines the structure and validation rules for EntityModelGetAllRequest. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-get-all-response.md # EntityModelGetAllResponse Schema definition for EntityModelGetAllResponse ## Description This schema defines the structure and validation rules for EntityModelGetAllResponse. ## Properties - **models** (array, required): Information about registered models. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category.
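An illustrative EntityModelGetAllResponse, with each element shaped per the common ModelInfo schema (id, name, state, version) — the IDs and the `"ACTIVE"` state string are invented for illustration:

```json
{
  "models": [
    { "id": "00000000-0000-0000-0000-000000000010", "name": "orders", "state": "ACTIVE", "version": 1 },
    { "id": "00000000-0000-0000-0000-000000000011", "name": "orders", "state": "ACTIVE", "version": 2 }
  ]
}
```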
--- ## reference/schemas/model/entity-model-import-request.md # EntityModelImportRequest Schema definition for EntityModelImportRequest ## Description This schema defines the structure and validation rules for EntityModelImportRequest. ## Properties - **converter** (object, required): - **dataFormat** (object, required): - **model** (object, required): Entity model specification. - **payload** (any, required): The data to be used for importing the model. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-import-response.md # EntityModelImportResponse Schema definition for EntityModelImportResponse ## Description This schema defines the structure and validation rules for EntityModelImportResponse. ## Properties - **modelId** (string, required): ID of the created or updated entity model. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-transition-request.md # EntityModelTransitionRequest Schema definition for EntityModelTransitionRequest ## Description This schema defines the structure and validation rules for EntityModelTransitionRequest. ## Properties - **model** (object, required): Entity model specification. - **transition** (string, required): Specifies the transition to perform. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category.
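A hypothetical EntityModelTransitionRequest body — the model values and the transition name are invented for illustration; use the transitions your deployment actually defines:

```json
{
  "model": { "name": "orders", "version": 2 },
  "transition": "lock"
}
```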
--- ## reference/schemas/model/entity-model-transition-response.md # EntityModelTransitionResponse Schema definition for EntityModelTransitionResponse ## Description This schema defines the structure and validation rules for EntityModelTransitionResponse. ## Properties - **modelId** (string, required): ID of the entity model. - **state** (string, required): State of the entity model. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/processing.md JSON schemas in the Processing category # Processing Schemas This section contains JSON schemas for the Processing category. ## Available Schemas - [EventAckResponse](./event-ack-response/) - [EntityProcessorCalculationResponse](./entity-processor-calculation-response/) - [EntityProcessorCalculationRequest](./entity-processor-calculation-request/) - [EntityCriteriaCalculationResponse](./entity-criteria-calculation-response/) - [EntityCriteriaCalculationRequest](./entity-criteria-calculation-request/) - [CalculationMemberKeepAliveEvent](./calculation-member-keep-alive-event/) - [CalculationMemberJoinEvent](./calculation-member-join-event/) - [CalculationMemberGreetEvent](./calculation-member-greet-event/) --- ## reference/schemas/processing/calculation-member-greet-event.md # CalculationMemberGreetEvent Schema definition for CalculationMemberGreetEvent ## Description This schema defines the structure and validation rules for CalculationMemberGreetEvent. ## Properties - **joinedLegalEntityId** (string, required): ID of the legal entity under which this member has joined. - **memberId** (string, required): Assigned member ID. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category.
--- ## reference/schemas/processing/calculation-member-join-event.md # CalculationMemberJoinEvent Schema definition for CalculationMemberJoinEvent ## Description This schema defines the structure and validation rules for CalculationMemberJoinEvent. ## Properties - **tags** (array): Member tags; can be used to filter applicability. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/calculation-member-keep-alive-event.md # CalculationMemberKeepAliveEvent Schema definition for CalculationMemberKeepAliveEvent ## Description This schema defines the structure and validation rules for CalculationMemberKeepAliveEvent. ## Properties - **memberId** (string, required): Member ID. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/entity-criteria-calculation-request.md # EntityCriteriaCalculationRequest Schema definition for EntityCriteriaCalculationRequest ## Description This schema defines the structure and validation rules for EntityCriteriaCalculationRequest. ## Properties - **criteriaId** (string, required): Criteria ID. - **criteriaName** (string, required): Criteria name. - **entityId** (string, required): Entity ID. - **parameters** (any): Configured parameters, if any. - **payload** (object): - **processor** (object): Processor information, available for target PROCESSOR. - **requestId** (string, required): Request ID. - **target** (string, required): Target to which this condition is attached. NA is reserved for future cases. - **transactionId** (string): Transaction ID.
- **transition** (object): Transition information, available for targets TRANSITION and PROCESSOR. - **workflow** (object): Workflow information, available for targets WORKFLOW, PROCESSOR, TRANSITION. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/entity-criteria-calculation-response.md # EntityCriteriaCalculationResponse Schema definition for EntityCriteriaCalculationResponse ## Description This schema defines the structure and validation rules for EntityCriteriaCalculationResponse. ## Properties - **entityId** (string, required): Entity ID. - **matches** (boolean): Criteria check result. - **reason** (string): Reason for the criteria check result. - **requestId** (string, required): ID of the original criteria calculation request. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/entity-processor-calculation-request.md # EntityProcessorCalculationRequest Schema definition for EntityProcessorCalculationRequest ## Description This schema defines the structure and validation rules for EntityProcessorCalculationRequest. ## Properties - **entityId** (string, required): Entity ID. - **parameters** (any): Configured parameters, if any. - **payload** (object): - **processorId** (string, required): Processor ID. - **processorName** (string, required): Processor name. - **requestId** (string, required): Request ID. - **transactionId** (string): Transaction ID. - **transition** (object): Transition information. - **workflow** (object, required): Workflow information. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category.
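A processor implementation receives an EntityProcessorCalculationRequest and replies with an EntityProcessorCalculationResponse that echoes the request and entity IDs. A minimal sketch of that pairing — the dict shapes mirror the required properties documented here, while the function name and placeholder values are illustrative:

```python
def build_processor_response(request: dict, new_payload: dict) -> dict:
    """Build a response for a processor calculation request: echo the
    requestId and entityId, attach the (possibly modified) payload.
    Sketch only; the helper name is hypothetical."""
    return {
        "requestId": request["requestId"],  # ID of the original calculation request
        "entityId": request["entityId"],
        "payload": new_payload,
    }

# Example request with placeholder values for the required properties
request = {
    "requestId": "req-1",
    "entityId": "entity-1",
    "processorId": "proc-1",
    "processorName": "enrich-order",
    "workflow": {},
    "payload": {"total": 10},
}
response = build_processor_response(request, {"total": 10, "enriched": True})
```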
--- ## reference/schemas/processing/entity-processor-calculation-response.md # EntityProcessorCalculationResponse Schema definition for EntityProcessorCalculationResponse ## Description This schema defines the structure and validation rules for EntityProcessorCalculationResponse. ## Properties - **entityId** (string, required): Entity ID. - **payload** (object): - **requestId** (string, required): ID of the original calculation request. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/event-ack-response.md # EventAckResponse Schema definition for EventAckResponse ## Description This schema defines the structure and validation rules for EventAckResponse. ## Properties - **sourceEventId** (string, required): ID of the original event. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/search.md JSON schemas in the Search category # Search Schemas This section contains JSON schemas for search.
## Available Schemas - [SnapshotGetStatusRequest](./snapshot-get-status-request/) - [SnapshotGetRequest](./snapshot-get-request/) - [SnapshotCancelRequest](./snapshot-cancel-request/) - [SearchSnapshotStatus](./search-snapshot-status/) - [EntityStatsResponse](./entity-stats-response/) - [EntityStatsGetRequest](./entity-stats-get-request/) - [EntityStatsByStateResponse](./entity-stats-by-state-response/) - [EntityStatsByStateGetRequest](./entity-stats-by-state-get-request/) - [EntitySnapshotSearchResponse](./entity-snapshot-search-response/) - [EntitySnapshotSearchRequest](./entity-snapshot-search-request/) - [EntitySearchRequest](./entity-search-request/) - [EntityResponse](./entity-response/) - [EntityGetRequest](./entity-get-request/) - [EntityGetAllRequest](./entity-get-all-request/) - [EntityChangesMetadataResponse](./entity-changes-metadata-response/) - [EntityChangesMetadataGetRequest](./entity-changes-metadata-get-request/) --- ## reference/schemas/search/entity-changes-metadata-get-request.md # EntityChangesMetadataGetRequest Schema definition for EntityChangesMetadataGetRequest ## Description This schema defines the structure and validation rules for EntityChangesMetadataGetRequest. ## Properties - **entityId** (string, required): ID of the entity to retrieve change history for. - **pointInTime** (string): Point in time to retrieve the entity changes. If not provided, retrieves all changes up to the current consistency time. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category.
--- ## reference/schemas/search/entity-changes-metadata-response.md # EntityChangesMetadataResponse Schema definition for EntityChangesMetadataResponse ## Description This schema defines the structure and validation rules for EntityChangesMetadataResponse. ## Properties - **changeMeta** (object, required): Metadata about a single entity change. - **requestId** (string, required): ID of the original request to get entity changes metadata. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-get-all-request.md # EntityGetAllRequest Schema definition for EntityGetAllRequest ## Description This schema defines the structure and validation rules for EntityGetAllRequest. ## Properties - **model** (object, required): Information about the model to search. - **pageNumber** (integer): Page number (from 0). - **pageSize** (integer): Page size. - **pointInTime** (string): Point in time to retrieve the entities. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-get-request.md # EntityGetRequest Schema definition for EntityGetRequest ## Description This schema defines the structure and validation rules for EntityGetRequest. ## Properties - **entityId** (string, required): ID of the entity. - **pointInTime** (string): Point in time to retrieve the entity. If not provided, retrieves the entity at the current consistency time. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category.
--- ## reference/schemas/search/entity-response.md # EntityResponse Schema definition for EntityResponse ## Description This schema defines the structure and validation rules for EntityResponse. ## Properties - **payload** (object, required): Payload with entity data and meta information. - **requestId** (string, required): ID of the original request to get data. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-search-request.md # EntitySearchRequest Schema definition for EntitySearchRequest ## Description This schema defines the structure and validation rules for EntitySearchRequest. ## Properties - **condition** (object, required): Query condition to use for building this snapshot. - **limit** (integer): The maximum number of rows to return. - **model** (object, required): Entity model to use for building this snapshot. - **pointInTime** (string): Point in time for the search. - **timeoutMillis** (integer): The maximum time to wait in milliseconds for the query to complete. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-snapshot-search-request.md # EntitySnapshotSearchRequest Schema definition for EntitySnapshotSearchRequest ## Description This schema defines the structure and validation rules for EntitySnapshotSearchRequest. ## Properties - **condition** (object, required): Query condition to use for building this snapshot. - **model** (object, required): Entity model to use for building this snapshot. - **pointInTime** (string): Point in time for the snapshot. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category.
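The snapshot schemas in this category fit together as a three-step flow: submit an EntitySnapshotSearchRequest, poll progress with a SnapshotGetStatusRequest, then page through results with a SnapshotGetRequest. A sketch of the request payloads — the property names follow the schemas in this section, but the nested `model`/`condition` shapes and all values are placeholders, not confirmed wire formats:

```python
# Step 1: ask the platform to build a snapshot (EntitySnapshotSearchRequest)
snapshot_search = {
    "model": {"name": "orders", "version": 1},        # shape is illustrative
    "condition": {"field": "state", "value": "submitted"},  # shape is illustrative
}

# Step 2: poll until the snapshot status is ready (SnapshotGetStatusRequest)
status_poll = {
    "snapshotId": "snapshot-1",  # placeholder; returned by the platform
}

# Step 3: page through the collected entities (SnapshotGetRequest)
fetch_page = {
    "snapshotId": "snapshot-1",
    "pageNumber": 0,   # pages are numbered from 0
    "pageSize": 100,
}
```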
--- ## reference/schemas/search/entity-snapshot-search-response.md # EntitySnapshotSearchResponse Schema definition for EntitySnapshotSearchResponse ## Description This schema defines the structure and validation rules for EntitySnapshotSearchResponse. ## Properties - **status** (object): Status information for the requested snapshot. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-stats-by-state-get-request.md # EntityStatsByStateGetRequest Schema definition for EntityStatsByStateGetRequest ## Description This schema defines the structure and validation rules for EntityStatsByStateGetRequest. ## Properties - **model** (object): Optional specifier of the Entity model to calculate statistics for. - **pointInTime** (string): The point-in-time for statistics in ISO 8601 format. Defaults to current consistency time if not provided. - **states** (array): Optional list of states for which to calculate statistics. If not provided, statistics will be calculated for all current workflow states. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-stats-by-state-response.md # EntityStatsByStateResponse Schema definition for EntityStatsByStateResponse ## Description This schema defines the structure and validation rules for EntityStatsByStateResponse. ## Properties - **count** (integer, required): Entity count for this model and state. - **modelName** (string, required): Entity model name. - **modelVersion** (integer, required): Entity model version. - **requestId** (string, required): ID of the original request.
- **state** (string, required): Entity state. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-stats-get-request.md # EntityStatsGetRequest Schema definition for EntityStatsGetRequest ## Description This schema defines the structure and validation rules for EntityStatsGetRequest. ## Properties - **model** (object): Optional specifier of the Entity model to calculate statistics for. - **pointInTime** (string): The point-in-time for statistics in ISO 8601 format. Defaults to current consistency time if not provided. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-stats-response.md # EntityStatsResponse Schema definition for EntityStatsResponse ## Description This schema defines the structure and validation rules for EntityStatsResponse. ## Properties - **count** (integer, required): Entity count for this model. - **modelName** (string, required): Entity model name. - **modelVersion** (integer, required): Entity model version. - **requestId** (string, required): ID of the original request. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/search-snapshot-status.md # SearchSnapshotStatus Schema definition for SearchSnapshotStatus ## Description This schema defines the structure and validation rules for SearchSnapshotStatus. ## Properties - **entitiesCount** (integer): Number of entities collected. - **expirationDate** (string): Expiration date of the snapshot. - **snapshotId** (string, required): ID of the snapshot. - **status** (string, required): Status of the snapshot.
## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/snapshot-cancel-request.md # SnapshotCancelRequest Schema definition for SnapshotCancelRequest ## Description This schema defines the structure and validation rules for SnapshotCancelRequest. ## Properties - **snapshotId** (string, required): ID of the snapshot to cancel. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/snapshot-get-request.md # SnapshotGetRequest Schema definition for SnapshotGetRequest ## Description This schema defines the structure and validation rules for SnapshotGetRequest. ## Properties - **clientPointTime** (string): Point in time to retrieve the results. - **pageNumber** (integer): Page number (from 0). - **pageSize** (integer): Page size. - **snapshotId** (string, required): ID of the snapshot to retrieve data. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/snapshot-get-status-request.md # SnapshotGetStatusRequest Schema definition for SnapshotGetStatusRequest ## Description This schema defines the structure and validation rules for SnapshotGetStatusRequest. ## Properties - **snapshotId** (string, required): ID of the snapshot. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/trino.md # Trino SQL surface How entities are projected into virtual SQL tables — naming rules, type mapping, polymorphic fields, and JDBC. Cyoda exposes an analytical SQL surface through a Trino connector.
Every entity model is projected into a set of **virtual SQL tables** so that nested JSON/XML data can be queried with ordinary relational SQL — no pre-flattening required. This page documents the projection rules, the table naming convention, the column categories, the JSON-to-SQL type mapping, and how polymorphic fields are handled. For the conceptual framing of the three Cyoda surfaces (REST, gRPC, Trino SQL), see [APIs and surfaces](/concepts/apis-and-surfaces/). ## Core concepts Cyoda represents data (JSON/XML) as hierarchical tree structures. Internally, these trees are decomposed into a collection of `Nodes`, each capturing a specific branch or subset of the data. This decomposition provides a uniform representation for querying and traversal — whether through API-level path queries or via SQL through the Trino connector. In the Trino view, each node is exposed as a **virtual SQL table**, allowing nested structures to be queried using familiar relational syntax without flattening the original hierarchy. **Key principle**: each `Node` corresponds to exactly **one** SQL table. ### Node model Each Node has: - **path**: a string identifying the node's position in the hierarchy (e.g. `$`, `$.organization.clients`, `$.agreement_data`) - **fields**: a map of field names to values, where values can be: - primitive types (string, number, boolean, date, etc.) - 1-dimensional arrays of primitives The collection of these nodes forms the internal data model that underpins the Trino connector. The `TreeNodeEntity` is the object type that encapsulates this node collection when interacting with the Cyoda API, but it is a byproduct of this broader structural model rather than its defining feature. ### Node creation rules Nodes are derived directly from the structure of the data (such as JSON/XML): 1. **Root node**: always created with path `$` containing top-level fields. 2. **Array of objects**: each array of objects creates a new node. 3. 
**Multidimensional arrays**: each dimension beyond one creates a further node to preserve structural depth. This consistent mapping enables Cyoda to represent, navigate, and query arbitrarily nested data structures in a predictable and composable way. #### Example: node structure Given a JSON file with these paths: ``` $.organization.name $.organization.address[] $.organization.clients[].name $.organization.clients[].address[] $.quarterly_metrics[][] ``` The system creates the following nodes: **Node 1** (Root): - Path: `$` - Fields: `.organization.name`, `.organization.address[]` **Node 2** (Clients array): - Path: `$.organization.clients[*]` - Fields: `.name`, `.address[]` **Node 3** (Quarterly metrics — 2D array, detached): - Path: `$.quarterly_metrics[*]` - Fields: `[*]` (detached array containing the inner array elements) - This is a detached array because `quarterly_metrics` is a 2-dimensional array. For example: ```json { "organization": { "name": "Acme Corp", "address": [ "123 Market Street", "Suite 400", "San Francisco, CA 94105" ], "clients": [ { "name": "Client A", "address": ["10 First Ave", "Seattle, WA 98101"] }, { "name": "Client B", "address": ["200 Second St", "Portland, OR 97204"] } ] }, "quarterly_metrics": [ [1000, 1200, 900, 1100], [1300, 1400, 1250, 1500] ] } ``` ### Tree decomposition Here is a visual representation of the node structure for the example above, where the corresponding SQL tables are labelled ORGANIZATION, CLIENTS, and METRICS: ```mermaid graph TD %% STYLES classDef highlighted fill:none,stroke:#5A18AC,stroke-width:3px,rx:8,ry:8 classDef normal fill:#F5FAF9,stroke:#4FB8B0,stroke-width:2px,rx:8,ry:8 %% Root Node A["$(root node -> ORGANIZATION)"] --> B["organization"] A --> K["quarterly_metrics[][] (detached 2D array node -> METRICS)"] %% Organization branch B --> C["organization.name"] B --> D["organization.address[]"] B --> E["organization.clients[] (clients node -> CLIENTS)"] %% Clients branch E -->
F["organization.clients[].name"] E --> G["organization.clients[].address[]"] %% Quarterly metrics branch K --> L["quarterly_metrics[*] (outer arrays)"] L --> M["quarterly_metrics[*][*] (individual cell values)"] %% Apply highlighting to specific nodes class A,K,E highlighted class B,C,D,F,G,L,M normal ``` ### Embedded arrays and detached arrays When JSON/XML contains one-dimensional arrays of primitives within objects, the system does not create a separate node for such arrays; they stay within the parent node. In the table they appear either as a single column of array type (e.g. `field_array` — ARRAY[STRING]) or as multiple flattened columns (e.g. `field_0` — STRING, `field_1` — STRING, ...), depending on the `flatten_array` flag for this field in the Trino schema settings. That setting also works for system fields, such as `index`. When JSON/XML contains multidimensional arrays (arrays of arrays), the system creates separate nodes for each dimension after the first. This process is called **array detachment**. #### Understanding detached arrays A **detached array** is created when an array contains other arrays as elements. Each additional dimension becomes a separate node with its own table. The primary motivation behind this approach: if a JSON contains a table-like structure — i.e. a 2-dimensional array of primitives — it should be represented as a table in Trino. This logic was then extended and generalized to work for arrays of any number of dimensions larger than 1.
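The node-creation rules above can be sketched in code: nested objects fold into their owning node's field map, while arrays of objects and multidimensional arrays open new nodes. A simplified sketch, assuming only the cases covered so far (arrays of objects and 2-D arrays; deeper and mixed dimensions are omitted) — the function name is illustrative, not a Cyoda API:

```python
def collect_node_paths(doc: dict) -> set:
    """Collect the node paths the decomposition would create for a JSON
    document. Simplified sketch: primitives and 1-D primitive arrays stay
    in the owning node; arrays of objects and 2-D arrays open new nodes."""
    paths = {"$"}  # the root node always exists

    def is_primitive(value):
        return not isinstance(value, (dict, list))

    def visit(value, node_path, rel):
        # node_path: path of the node that owns this field
        # rel: field key relative to that node, e.g. ".organization.name"
        if isinstance(value, dict):
            # nested objects fold into the owning node's field map
            for key, child in value.items():
                visit(child, node_path, f"{rel}.{key}")
        elif isinstance(value, list):
            if all(is_primitive(e) for e in value):
                return  # 1-D primitive array: stays as an array column
            child_path = f"{node_path}{rel}[*]"
            paths.add(child_path)  # array of objects, or detached array node
            for element in value:
                if isinstance(element, dict):
                    for key, child in element.items():
                        visit(child, child_path, f".{key}")

    for key, value in doc.items():
        visit(value, "$", f".{key}")
    return paths

# The Acme Corp example yields the three nodes described earlier:
acme = {
    "organization": {
        "name": "Acme Corp",
        "address": ["123 Market Street", "Suite 400"],
        "clients": [
            {"name": "Client A", "address": ["10 First Ave"]},
            {"name": "Client B", "address": ["200 Second St"]},
        ],
    },
    "quarterly_metrics": [[1000, 1200, 900, 1100], [1300, 1400, 1250, 1500]],
}
nodes = collect_node_paths(acme)
# → {"$", "$.organization.clients[*]", "$.quarterly_metrics[*]"}
```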
#### Example 1: simple 2D array Consider this JSON with a 2-dimensional array: ```json { "matrix": [ [1, 2, 3], [4, 5, 6] ] } ``` **Nodes created:** **Node 1** (Root — `$`): - Path: `$` - Fields: (none — the matrix field is an array of arrays, so it's not stored here) **Node 2** (First dimension — `$.matrix[*]`): - Path: `$.matrix[*]` - Fields: `[1]`, `[2]`, `[3]` (these are detached array fields containing values from the inner arrays, which are represented as rows) - This node is marked as having a detached array **Generated tables:** **Table 1: `mydata`** (from node `$`) - Contains only special and root columns (no data fields in this case) **Table 2: `mydata_matrix_array`** (from node `$.matrix[*]` — detached array) - Table name breakdown: - `mydata` = entity name - `_matrix` = field name from path - `_array` = suffix indicating this is a detached array table - Columns: - `entity_id` (UUID) - `point_time` (DATE) - Root columns (creation_date, last_update_date, state) - `index_0` (INTEGER) — Position in the outer array (0 or 1 in this example) - `element_0`, `element_1`, `element_2` (INTEGER) — The three elements of each inner array #### Example 2: 3D array For a 3-dimensional array: ```json { "cube": [ [[1, 2], [3, 4]], [[5, 6], [7, 8]] ] } ``` **Nodes created:** **Node 1** (Root — `$`): - No data fields **Node 2** (Second dimension — `$.cube[*][*]`, no separate layer for the first dimension in this case): - Fields: `[1]`, `[2]` (matching the two elements of each innermost array) **Generated tables:** **Table 1: `mydata`** (from node `$`) **Table 2: `mydata_cube_2d_array`** (from node `$.cube[*][*]`.
Note that `2d` in this case represents the number of collapsed dimensions, while the 3rd dimension is detached) - Columns include `index_0` and `index_1` for positions in both dimensions - `element_0`, `element_1` for the two primitive values #### Example 3: array of objects with nested arrays ```json { "data": [ { "name": "item1", "values": [10, 20, 30] }, { "name": "item2", "values": [40, 50] } ] } ``` **Nodes created:** **Node 1** (Root — `$`): - Path: `$` **Node 2** (Array of objects — `$.data[*]`): - Path: `$.data[*]` - Fields: `.name`, `.values[]` **Generated tables:** **Table 1: `mydata`** (from node `$`) **Table 2: `mydata_data`** (from node `$.data[*]`) - Columns: - Special and root columns - `index_0` (INTEGER) — Position in the data array - `name` (STRING) — From `.name` field - `values_array` (ARRAY[INTEGER]) or `values_0`, `values_1`, `values_2` (INTEGER) depending on the `flatten_array` flag for this field in the Trino schema settings #### Variable-dimension arrays (mixed depths) The system can handle arrays where elements have different nesting depths — some elements are primitives while others are arrays. 
**Example:** ```json { "data": [ 1, [2, 3], [4, 5, [6, 7], [8, 9]] ] } ``` This is a **polymorphic array** containing: - A primitive value: `1` - A 1-dimensional array: `[2, 3]` - A 2-dimensional array: `[4, 5, [6, 7], [8, 9]]` **Nodes created:** **Node 1** (Root — `$`) **Node 2** (Mixed node — `$.data[*]`) **Node 3** (Mixed node — `$.data[*][*]`) **Generated tables:** **Table 1: `mydata`** (from node `$`) - Columns: - Special and root columns - `data[*]` (ARRAY[INTEGER]) — Field for the top-level array elements (one row with array value `[1]` for the current example) **Table 2: `mydata_data_array`** (from node `$.data[*]`) - Contains rows for second-level array elements (2 rows for the current example) - Columns: - Special and root columns - `index_0` (INTEGER) — Position in the data array - `element_0`, `element_1` — columns for primitive values at the second level (for values `2`, `3`, `4`, `5` presented in 2 rows) **Table 3: `mydata_data_2d_array`** (from array part of `$.data[*][*]`) - Contains the elements from nested arrays - Columns: - Special and root columns - `index_0`, `index_1` (INTEGER) — Positions in the outer and nested arrays - `element_0`, `element_1` — Elements from the third-level arrays (for values `6`, `7`, `8`, `9` presented in 2 rows, with index values `2,2` and `2,3`) --- ## SQL table generation ### Table naming convention **Important**: each node in your `TreeNodeEntity` is mapped to exactly **one** SQL table. Table names are generated using the following rules: 1. **Base name**: Entity model name (e.g., `prizes`, `companies_details`) 2. **Version suffix** (if version > 1): `_<version>` (e.g., `_2`, `_3`) 3. **Path suffix** (for non-root nodes): Derived from the node path with `.` and `#` replaced by `_` - Example: `$.prizes[*]` → `_prizes` - Example: `$.prizes[*].laureates[*]` → `_prizes_laureates` 4.
**Multidimensional suffix**: `_<N>d`, where N is the number of `][` sequences in the *node path* + 1 (Note that this suffix is derived from the node path, not the field path — it represents the number of collapsed dimensions. So if we are dealing with a 3-dimensional array of primitives, the suffix will be `_2d`, as the last dimension is expanded into columns.) - Example: `$.data[*][*]` has one `][` → `_2d` - Example: `$.cube[*][*][*]` has two `][` → `_3d` 5. **Detached array suffix**: `_array` added when the node represents a detached array - This happens for multidimensional arrays where inner dimensions are "detached". #### Table naming examples | Entity Name | Version | Node Path | Is Detached Array? | Table Name | Explanation | |-------------|---------|-----------|-------------------|------------|-------------| | `prizes` | 1 | `$` | No | `prizes` | Root node | | `prizes` | 2 | `$` | No | `prizes_2` | Root node, version 2 | | `prizes` | 1 | `$.prizes[*]` | No | `prizes_prizes` | Array of objects | | `prizes` | 1 | `$.prizes[*].laureates[*]` | No | `prizes_prizes_laureates` | Nested array of objects | | `companies` | 1 | `$.matrix[*]` | Yes | `companies_matrix_array` | 2D array — detached | | `companies` | 1 | `$.cube[*][*]` | Yes | `companies_cube_2d_array` | 3D array — has `][` so gets `_2d`, plus `_array` | This is the default schema-generation naming; any of these names can be changed manually in the schema settings. ### Special JSON table In addition to the structured tables, every entity model gets a special **JSON table** that contains the complete reconstructed JSON for each entity: - **Table name**: `<entity>_json` (e.g., `prizes_json`) - **Purpose**: allows you to retrieve the full original JSON document :::caution[Performance] When querying the JSON table, **always include `entity_id` in your WHERE clause** for optimal performance. Without this predicate, the query may be significantly slower, especially with large datasets.
::: **Good practice:** ```sql SELECT entity FROM prizes_json WHERE entity_id = '<entity-id>'; ``` **Avoid:** ```sql SELECT entity FROM prizes_json; -- This will be slow! ``` --- ## Table columns Every SQL table contains several categories of columns. ### 1. Special columns These are system-generated columns available in all tables: | Column Name | Data Type | Description | |-------------|-----------|-------------| | `entity_id` | UUID | Unique identifier for the entity (the loaded JSON/XML file) | | `point_time` | DATE | Allows querying data as it existed at a specific point in time | The JSON table also includes: - `entity` (STRING): the complete reconstructed JSON document. ### 2. Root columns These columns provide metadata about the entity: | Column Name | Source Field | Data Type | Description | |-------------|--------------|-----------|-------------| | `creation_date` | `creationDate` | DATE | When the entity was created in the system | | `last_update_date` | `lastUpdateTime` | DATE | When the entity was last modified | | `state` | `state` | STRING | Current workflow state of the entity | ### 3. Index columns For tables representing array (object or detached) elements (depth > 0), an `index` column is provided: - **Column name**: `index` - **Purpose**: identifies the position of this row in the array hierarchy - **Structure**: can be flattened into individual columns (`index_0`, `index_1`, etc.) for multidimensional arrays. Flattened by default. **Example**: For `$.prizes[*].laureates[*]`: - `index_0`: Position in the `prizes` array - `index_1`: Position in the `laureates` array within that prize ### 4. Data columns These are the actual fields from your JSON/XML data.
#### Primitive fields Simple fields are mapped directly to columns: | JSON Path | Field Key | Column Name | Data Type | |-----------|-----------|-------------|-----------| | `$.organization.name` | `.organization.name` | `organization_name` | STRING | | `$.prizes[*].year` | `.year` | `year` | STRING | | `$.prizes[*].laureates[*].id` | `.id` | `id` | STRING | **Naming rules:** - Leading `.` is removed from the field key. - Reserved field names (like `index`) are prefixed with `_` (e.g., `_index`). #### Array fields 1-dimensional arrays of primitives are handled in two ways: **Option 1: Array column** (default for homogeneous arrays) - Column name: `<field>_array` - Data type: `ARRAY[<type>]` - Example: `.addresses[]` → `addresses_array` (ARRAY[STRING]) **Option 2: Flattened columns** (for multi-type or ZONED_DATE_TIME arrays) - Multiple columns created: `<field>_0`, `<field>_1`, etc. - Each column represents one position in the array. - Example: `.scores[]` → `scores_0`, `scores_1`, `scores_2` (if array has 3 elements) --- ## Supported data types The system supports the following data types, which are mapped to appropriate SQL types for Trino queries. Understanding how these types are detected from JSON is crucial for working with your data effectively. **Important**: All data is stored internally in Cyoda with full precision. Trino provides a SQL query interface to this data, and some types (like `UNBOUND_DECIMAL` and `UNBOUND_INTEGER`) are represented as strings in Trino due to Trino's numeric limitations, even though they are stored as numbers in Cyoda.
### Data type reference table | System Type | SQL Type | Description | JSON Detection | |-------------|----------|-------------|----------------| | STRING | VARCHAR | Text data (max 1024 characters) | Text values, or values that can't be parsed as other types | | CHAR | CHAR | Single character | Single-character strings | | BYTE | TINYINT | 8-bit integer (-128 to 127) | Integers in byte range | | SHORT | SMALLINT | 16-bit integer (-32,768 to 32,767) | Integers in short range | | INT | INTEGER | 32-bit integer | Integers in int range | | LONG | BIGINT | 64-bit integer | Integers in long range | | FLOAT | REAL | Single-precision floating point | Decimals with ≤6 digits precision and scale ≤31 | | DOUBLE | DOUBLE | Double-precision floating point | Decimals with ≤15 digits precision and scale ≤292 | | BIG_DECIMAL | DECIMAL(38,18) | High-precision decimal (Trino Int128) | Decimals that fit in Int128 with scale ≤18 | | UNBOUND_DECIMAL | VARCHAR | Very large/precise decimals (Trino representation) | Decimals exceeding BIG_DECIMAL limits | | BIG_INTEGER | DECIMAL(38,0) | Large integer (Trino Int128) | Integers that fit in Int128 range | | UNBOUND_INTEGER | VARCHAR | Very large integers (Trino representation) | Integers exceeding BIG_INTEGER limits | | BOOLEAN | BOOLEAN | True/false values | JSON boolean values | | LOCAL_DATE | DATE | Date without time | ISO date strings (e.g., "2024-01-15") | | LOCAL_TIME | TIME | Time without date | ISO time strings (e.g., "14:30:00") | | LOCAL_DATE_TIME | TIMESTAMP | Date and time without timezone | ISO datetime strings | | ZONED_DATE_TIME | TIMESTAMP WITH TIME ZONE | Date and time with timezone | ISO datetime with timezone | | UUID | UUID | Universally unique identifier | Valid UUID strings | | TIME_UUID | UUID | Time-based UUID (version 1) | UUID version 1 strings | | BYTE_ARRAY | VARBINARY | Binary data | Base64-encoded strings | ### How number types are detected from JSON When the system parses JSON numbers, it follows a specific 
detection algorithm to determine the most appropriate type. #### Integer detection For whole numbers (no decimal point), the system checks in this order: 1. **BYTE** — if the value is between -128 and 127 2. **SHORT** — if the value is between -32,768 and 32,767 3. **INT** — if the value is between -2,147,483,648 and 2,147,483,647 4. **LONG** — if the value is between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807 5. **BIG_INTEGER** — if the value fits in Int128 range (see below) 6. **UNBOUND_INTEGER** — for integers larger than Int128 **Example:** ```json { "small": 42, // → BYTE "medium": 1000, // → SHORT "large": 100000, // → INT "veryLarge": 10000000000, // → LONG "huge": 123456789012345678901234567890 // → BIG_INTEGER or UNBOUND_INTEGER } ``` #### Decimal detection For numbers with decimal points, the system checks in this order: 1. **FLOAT** — if precision ≤ 6 digits and scale is between -31 and 31 2. **DOUBLE** — if precision ≤ 15 digits and scale is between -292 and 292 3. **BIG_DECIMAL** — if the value fits in Int128 decimal constraints (see below) 4. **UNBOUND_DECIMAL** — for decimals exceeding BIG_DECIMAL limits **Example:** ```json { "price": 19.99, // → FLOAT (6 digits precision) "precise": 123.456789012345, // → DOUBLE (15 digits precision) "veryPrecise": 123456789012345.123456789012345, // → BIG_DECIMAL "extremelyPrecise": 1.23456789012345678901234567890123456789 // → UNBOUND_DECIMAL } ``` ### Understanding BIG_DECIMAL and UNBOUND_DECIMAL #### BIG_DECIMAL (Trino Int128 decimal) **BIG_DECIMAL** is bounded by Trino's maximum numeric capacity, which uses a 128-bit integer (Int128) with a **fixed scale of 18**. **Constraints:** - **Scale**: must be ≤ 18 decimal places - **Precision**: must be ≤ 38 total digits (with some complexity — see below) - **Range**: approximately ±170,141,183,460,469,231,731.687303715884105727 **Detailed precision rules:** The system uses two precision checks: 1. 
**Strict check**: `precision ≤ 38` AND `exponent ≤ 20` (where exponent = precision - scale)
2. **Loose check**: `precision ≤ 39` AND `exponent ≤ 21` AND the value fits when scaled to 18 decimal places

**Examples:**

```json
{
  "fits": 12345678901234567890.123456789012345678,       // ✓ BIG_DECIMAL (38 digits, scale 18)
  "tooManyDecimals": 123.1234567890123456789,            // ✗ UNBOUND_DECIMAL (scale > 18)
  "tooLarge": 999999999999999999999.999999999999999999   // ✗ UNBOUND_DECIMAL (exceeds Int128)
}
```

#### UNBOUND_DECIMAL

**UNBOUND_DECIMAL** is used for decimal values that exceed Trino's numeric representation limits.

**Storage vs representation:**

- **In Cyoda**: stored as full-precision BigDecimal numbers with all numeric operations available in workflows.
- **In Trino SQL**: represented as VARCHAR (strings) due to Trino's Int128 limitations.

**When a decimal becomes UNBOUND_DECIMAL:**

- Scale > 18 decimal places.
- Total value exceeds Int128 range.
- Precision and exponent exceed the limits.

**Important for SQL queries**: when querying UNBOUND_DECIMAL columns in Trino, treat them as VARCHAR, not as numeric types. Numeric operations are not available in SQL queries for these fields.

```sql
-- Correct: treat as string in Trino SQL
SELECT * FROM mytable WHERE unbound_decimal_field = '123.12345678901234567890123456789';

-- Incorrect: cannot use numeric operations in Trino SQL
SELECT * FROM mytable WHERE unbound_decimal_field > 100; -- This will fail!

-- Note: numeric operations ARE available in Cyoda workflows, just not in Trino SQL queries.
```

### Understanding BIG_INTEGER and UNBOUND_INTEGER

#### BIG_INTEGER (Trino Int128)

**BIG_INTEGER** is bounded by Trino's 128-bit integer capacity.

**Constraints:**

- **Range**: -170,141,183,460,469,231,731,687,303,715,884,105,728 to 170,141,183,460,469,231,731,687,303,715,884,105,727
- This is 2^127 − 1 for the maximum and −2^127 for the minimum.
**Examples:**

```json
{
  "fits": 123456789012345678901234567890,               // ✓ BIG_INTEGER (within Int128)
  "tooLarge": 999999999999999999999999999999999999999   // ✗ UNBOUND_INTEGER (exceeds Int128)
}
```

#### UNBOUND_INTEGER

**UNBOUND_INTEGER** is used for integer values that exceed the Int128 range.

**Storage vs representation:**

- **In Cyoda**: stored as full-precision BigInteger numbers with all numeric operations available in workflows.
- **In Trino SQL**: represented as VARCHAR (strings) due to Trino's Int128 limitations.

**Important for SQL queries**: when querying UNBOUND_INTEGER columns in Trino, treat them as VARCHAR. Numeric operations are not available in SQL queries for these fields.

```sql
-- Correct: treat as string in Trino SQL
SELECT * FROM mytable WHERE unbound_integer_field = '999999999999999999999999999999999999999';

-- Incorrect: cannot use numeric operations in Trino SQL
SELECT * FROM mytable WHERE unbound_integer_field > 1000; -- This will fail!

-- Note: numeric operations ARE available in Cyoda workflows, just not in Trino SQL queries.
```

### Type detection priority

When parsing JSON, the system always tries to use the **most specific type** that fits the value:

1. **Smallest integer type** that can hold the value (BYTE → SHORT → INT → LONG → BIG_INTEGER → UNBOUND_INTEGER).
2. **Smallest decimal type** that can hold the value (FLOAT → DOUBLE → BIG_DECIMAL → UNBOUND_DECIMAL).
3. **STRING** as a fallback for any value that can't be parsed as a more specific type.

### String parsing

When a JSON value is a string, the system attempts to parse it as other types in this priority order:

1. **Temporal types** (dates, times, datetimes).
2. **UUID types**.
3. **Boolean** ("true" or "false").
4. **Numeric types** (if the string contains a valid number).
5. **STRING** (if none of the above match).
**Example:**

```json
{
  "date": "2024-01-15",                            // → LOCAL_DATE
  "uuid": "550e8400-e29b-41d4-a716-446655440000",  // → UUID
  "bool": "true",                                  // → BOOLEAN
  "text": "hello world"                            // → STRING
}
```

---

## Polymorphic fields

### What are polymorphic fields?

A **polymorphic field** occurs when the same field path has different data types across different elements in your JSON/XML data. This commonly happens in arrays of objects where the same field name contains different types of values.

### Example of polymorphic data

```json
{
  "items": [
    { "value": "text string" },
    { "value": 123 },
    { "value": 45.67 }
  ]
}
```

In this example, the field `$.items[*].value` is polymorphic because it contains:

- A STRING in the first element
- An INTEGER in the second element
- A DOUBLE in the third element

### How polymorphic fields are handled

When the system detects polymorphic fields, it automatically determines a **common data type** that can accommodate all the different types encountered. The logic:

1. **Check for compatible types**: the system first checks if all types are compatible and can be converted to a common type.
2. **Find the lowest common denominator**: it selects the most general type that all values can be converted to.
3. **Fall back to STRING**: if no common numeric or date type exists, the field is stored as STRING.

### Type compatibility rules

The system recognizes certain types as compatible and will convert them to a common type using **widening conversions**. It always chooses the type that can represent all values without loss of information.

#### Numeric type hierarchy

**Integer types** (from smallest to largest):

- BYTE → SHORT → INT → LONG → BIG_INTEGER → UNBOUND_INTEGER

**Decimal types** (from smallest to largest):

- FLOAT → DOUBLE → BIG_DECIMAL → UNBOUND_DECIMAL

**Cross-hierarchy conversions:**

- Any integer type can widen to any larger integer type or any decimal type.
- Any decimal type can widen to a larger decimal type.
- UNBOUND_DECIMAL is the widest numeric type (can hold any number).

#### Common type conversion examples

| Types Found | Common Type Used | Explanation |
|-------------|------------------|-------------|
| BYTE, SHORT | SHORT | Wider integer type |
| INT, LONG | LONG | Wider integer type |
| BYTE, DOUBLE | DOUBLE | Integer widens to decimal |
| INT, BIG_DECIMAL | BIG_DECIMAL | Integer widens to decimal |
| LONG, UNBOUND_INTEGER | UNBOUND_INTEGER | Wider integer type |
| FLOAT, DOUBLE | DOUBLE | Wider decimal type |
| DOUBLE, BIG_DECIMAL | UNBOUND_DECIMAL | DOUBLE can't fit in BIG_DECIMAL's fixed scale |
| BIG_INTEGER, BIG_DECIMAL | BIG_DECIMAL | Integer widens to decimal |
| BIG_DECIMAL, UNBOUND_DECIMAL | UNBOUND_DECIMAL | Wider decimal type |
| BIG_INTEGER, UNBOUND_INTEGER | UNBOUND_INTEGER | Wider integer type |
| Any numeric, STRING | STRING | Incompatible — falls back to STRING |
| BOOLEAN, INT | STRING | Incompatible — falls back to STRING |
| UUID, STRING | STRING | Incompatible — falls back to STRING |

#### Temporal type conversions

Temporal types have a resolution hierarchy, where lower-resolution types (like YEAR) can be converted to higher-resolution types (like LOCAL_DATE) by adding default values for the missing components.

**Resolution hierarchy:**

- YEAR → YEAR_MONTH → LOCAL_DATE → LOCAL_DATE_TIME → ZONED_DATE_TIME
- LOCAL_TIME → LOCAL_DATE_TIME → ZONED_DATE_TIME

:::note
The LOCAL_DATE_TIME and ZONED_DATE_TIME types are considered incompatible with each other because of type-detection ambiguity: since any value of these types can be parsed as the other, such polymorphism cannot occur automatically.
:::

When polymorphic temporal fields are detected, the system converts all values to the highest-resolution type found.
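The numeric common-type resolution described above can be sketched in a few lines. This is a hypothetical illustration of the widening rules, not a Cyoda API — the mixed integer/decimal branch is simplified to match the examples in the conversion table:

```python
# Sketch of numeric common-type resolution for polymorphic fields.
# Hypothetical helper — not part of any Cyoda API; simplified per the
# conversion-table examples (e.g. BYTE+DOUBLE → DOUBLE, INT+BIG_DECIMAL → BIG_DECIMAL).

INTEGER_ORDER = ["BYTE", "SHORT", "INT", "LONG", "BIG_INTEGER", "UNBOUND_INTEGER"]
DECIMAL_ORDER = ["FLOAT", "DOUBLE", "BIG_DECIMAL", "UNBOUND_DECIMAL"]


def common_numeric_type(a: str, b: str) -> str:
    """Return the widening common type for two system types, or STRING if incompatible."""
    if a == b:
        return a
    if a in INTEGER_ORDER and b in INTEGER_ORDER:
        # Both integers: pick the wider integer type.
        return INTEGER_ORDER[max(INTEGER_ORDER.index(a), INTEGER_ORDER.index(b))]
    if a in DECIMAL_ORDER and b in DECIMAL_ORDER:
        # Special case from the table: DOUBLE can't fit BIG_DECIMAL's fixed scale of 18.
        if {a, b} == {"DOUBLE", "BIG_DECIMAL"}:
            return "UNBOUND_DECIMAL"
        return DECIMAL_ORDER[max(DECIMAL_ORDER.index(a), DECIMAL_ORDER.index(b))]
    numeric = set(INTEGER_ORDER) | set(DECIMAL_ORDER)
    if a in numeric and b in numeric:
        # One integer, one decimal: the integer widens to the decimal type
        # (simplified; the real rules may widen further to avoid precision loss).
        return a if a in DECIMAL_ORDER else b
    # Non-numeric mixes (BOOLEAN+INT, UUID+STRING, ...) fall back to STRING.
    return "STRING"
```

For example, `common_numeric_type("DOUBLE", "BIG_DECIMAL")` yields `"UNBOUND_DECIMAL"`, while `common_numeric_type("BOOLEAN", "INT")` falls back to `"STRING"`.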
##### Upscaling (low resolution → high resolution)

When converting from a lower-resolution type to a higher-resolution type, the system adds default values for the missing components:

| From Type | To Type | Conversion Rule | Example |
|-----------|---------|-----------------|---------|
| YEAR | YEAR_MONTH | Add month = 1 (January) | `2024` → `2024-01` |
| YEAR_MONTH | LOCAL_DATE | Add day = 1 (first day of month) | `2024-01` → `2024-01-01` |
| LOCAL_DATE | LOCAL_DATE_TIME | Add time = 00:00:00 (midnight) | `2024-01-01` → `2024-01-01T00:00:00` |
| LOCAL_TIME | LOCAL_DATE_TIME | Add date = 1970-01-01 (epoch) | `14:30:00` → `1970-01-01T14:30:00` |

**Multi-step conversions:**

The system can perform multi-step conversions by chaining the rules above:

- **YEAR → LOCAL_DATE**: `2024` → `2024-01` → `2024-01-01`
- **YEAR → LOCAL_DATE_TIME**: `2024` → `2024-01` → `2024-01-01` → `2024-01-01T00:00:00`
- **YEAR_MONTH → LOCAL_DATE_TIME**: `2024-06` → `2024-06-01` → `2024-06-01T00:00:00`

##### Downscaling (high resolution → low resolution)

When converting from a higher-resolution type to a lower-resolution type, the system truncates the extra precision:

| From Type | To Type | Conversion Rule | Example |
|-----------|---------|-----------------|---------|
| YEAR_MONTH | YEAR | Extract year only | `2024-06` → `2024` |
| LOCAL_DATE | YEAR_MONTH | Extract year and month | `2024-06-15` → `2024-06` |
| LOCAL_DATE_TIME | LOCAL_DATE | Extract date part | `2024-06-15T14:30:00` → `2024-06-15` |
| LOCAL_DATE_TIME | LOCAL_TIME | Extract time part | `2024-06-15T14:30:00` → `14:30:00` |

Downscaling is primarily used internally for query optimization (e.g. when processing a query condition against a [YEAR, DATE] polymorphic field, the query condition is downscaled to YEAR for the YEAR part of the target field). In polymorphic fields represented in Trino, the system always upscales to the highest-resolution type.
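The upscaling chain described above can be sketched as follows. This is a hypothetical helper that chains the single-step default-filling rules from the tables; Cyoda's internal implementation is not exposed:

```python
# Sketch of temporal upscaling: fill in defaults for missing components,
# chaining single-step rules for multi-step conversions.
# Hypothetical helper — illustrates the documented rules, not a Cyoda API.

# Single-step conversion rules, as ISO-string transformations.
_STEPS = {
    ("YEAR", "YEAR_MONTH"): lambda v: v + "-01",                 # add month = 1 (January)
    ("YEAR_MONTH", "LOCAL_DATE"): lambda v: v + "-01",           # add day = 1
    ("LOCAL_DATE", "LOCAL_DATE_TIME"): lambda v: v + "T00:00:00",  # add midnight
    ("LOCAL_TIME", "LOCAL_DATE_TIME"): lambda v: "1970-01-01T" + v,  # add epoch date
}

# Resolution hierarchy for date-based types.
_CHAIN = ["YEAR", "YEAR_MONTH", "LOCAL_DATE", "LOCAL_DATE_TIME"]


def upscale(value: str, from_type: str, to_type: str) -> str:
    """Upscale an ISO string from a lower-resolution type to a higher-resolution type."""
    if from_type == "LOCAL_TIME":
        # LOCAL_TIME has a single upscaling target: LOCAL_DATE_TIME.
        return _STEPS[("LOCAL_TIME", "LOCAL_DATE_TIME")](value)
    i, j = _CHAIN.index(from_type), _CHAIN.index(to_type)
    for k in range(i, j):  # chain the single-step rules, one resolution at a time
        value = _STEPS[(_CHAIN[k], _CHAIN[k + 1])](value)
    return value
```

For instance, `upscale("2024", "YEAR", "LOCAL_DATE_TIME")` walks `2024` → `2024-01` → `2024-01-01` → `2024-01-01T00:00:00`, matching the multi-step examples above.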
##### Polymorphic temporal field examples

**Example 1: Mixed date resolutions**

```json
{
  "events": [
    { "date": "2024" },        // YEAR
    { "date": "2024-06" },     // YEAR_MONTH
    { "date": "2024-06-15" }   // LOCAL_DATE
  ]
}
```

The field `$.events[*].date` is polymorphic with types: YEAR, YEAR_MONTH, LOCAL_DATE.

**Common type**: LOCAL_DATE (highest resolution).

**Trino SQL values after conversion:**

- `"2024"` → `2024-01-01` (January 1st, 2024)
- `"2024-06"` → `2024-06-01` (June 1st, 2024)
- `"2024-06-15"` → `2024-06-15` (unchanged)

**Example 2: Date and DateTime mix**

```json
{
  "timestamps": [
    { "when": "2024-01-15" },           // LOCAL_DATE
    { "when": "2024-01-15T14:30:00" }   // LOCAL_DATE_TIME
  ]
}
```

**Common type**: LOCAL_DATE_TIME.

**Trino SQL values after conversion:**

- `"2024-01-15"` → `2024-01-15T00:00:00` (midnight)
- `"2024-01-15T14:30:00"` → `2024-01-15T14:30:00` (unchanged)

**Example 3: Time and DateTime mix**

```json
{
  "schedule": [
    { "time": "14:30:00" },             // LOCAL_TIME
    { "time": "2024-01-15T14:30:00" }   // LOCAL_DATE_TIME
  ]
}
```

**Common type**: LOCAL_DATE_TIME.

**Trino SQL values after conversion:**

- `"14:30:00"` → `1970-01-01T14:30:00` (epoch date + time)
- `"2024-01-15T14:30:00"` → `2024-01-15T14:30:00` (unchanged)

##### Important considerations for temporal polymorphism

1. **Default values matter**: When YEAR is converted to LOCAL_DATE, it becomes January 1st. This means:

   ```sql
   -- If the field contains polymorphic YEAR and LOCAL_DATE values
   SELECT * FROM events WHERE date = '2024-01-01'
   -- This will match both "2024" (converted to 2024-01-01) and "2024-01-01"
   ```

2. **Loss of semantic meaning**: A YEAR value of `"2024"` represents the entire year, but when converted to LOCAL_DATE it becomes `2024-01-01`, which represents a specific day. The original semantic meaning (the entire year) is lost.

3.
**Query implications**: when querying polymorphic temporal fields, be aware of the conversion rules:

   ```sql
   -- Original data: ["2024", "2024-06-15"]
   -- Stored as LOCAL_DATE: [2024-01-01, 2024-06-15]

   -- This query will NOT match the original "2024" value
   SELECT * FROM events WHERE date >= '2024-06-15'
   -- Because "2024" was converted to 2024-01-01, which sorts before 2024-06-15
   ```

4. **Best practice**: if you need to preserve the original resolution, consider using separate fields:

   ```json
   {
     "yearOnly": "2024",
     "exactDate": "2024-06-15"
   }
   ```

##### Polymorphic temporal conversion summary

**Common polymorphic combinations:**

| Types Found | Common Type | Conversion Applied |
|-------------|-------------|-------------------|
| YEAR, YEAR_MONTH | YEAR_MONTH | YEAR → YEAR_MONTH (add month=1) |
| YEAR, LOCAL_DATE | LOCAL_DATE | YEAR → YEAR_MONTH → LOCAL_DATE |
| YEAR_MONTH, LOCAL_DATE | LOCAL_DATE | YEAR_MONTH → LOCAL_DATE (add day=1) |
| LOCAL_DATE, LOCAL_DATE_TIME | LOCAL_DATE_TIME | LOCAL_DATE → LOCAL_DATE_TIME (add time=00:00:00) |
| LOCAL_TIME, LOCAL_DATE_TIME | LOCAL_DATE_TIME | LOCAL_TIME → LOCAL_DATE_TIME (add date=1970-01-01) |

### Important notes

**Type conversion**: when a polymorphic field is stored as a common type (e.g., STRING), all values are converted to that type. This means:

- Numeric values may be stored as strings: `"123"` instead of `123`.
- You may need to cast values in your SQL queries: `CAST(value AS INTEGER)`.

### Best practices for polymorphic data

1. **Consistent typing**: when possible, maintain consistent data types for the same field across all array elements.
2. **Explicit casting**: when querying polymorphic fields that were converted to STRING, use explicit CAST operations with caution — some values may not be castable.
3. **Understand your data**: use the schema-inspection API to see which fields are polymorphic and what their common type is.
4.
**Consider restructuring**: if you have control over the data structure, consider using different field names for different types.

---

## Complete example

### Input JSON saved under model "prizes" version 1

```json
{
  "extraction-date": "2024-01-15",
  "prizes": [
    {
      "year": "2023",
      "category": "Physics",
      "laureates": [
        {
          "id": "1001",
          "firstname": "Anne",
          "surname": "L'Huillier",
          "motivation": "for experimental methods...",
          "share": 3
        },
        {
          "id": "1002",
          "firstname": "Pierre",
          "surname": "Agostini",
          "motivation": "for experimental methods...",
          "share": 3
        }
      ]
    }
  ]
}
```

### Generated tables

#### Table 1: `prizes` (Root node: `$`)

| Column Name | Data Type | Category | Description |
|-------------|-----------|----------|-------------|
| `entity_id` | UUID | SPECIAL | Entity identifier |
| `point_time` | DATE | SPECIAL | Query time point |
| `creation_date` | DATE | ROOT | Entity creation date |
| `last_update_date` | DATE | ROOT | Entity last update |
| `state` | STRING | ROOT | Entity state |
| `extraction_date` | DATE | DATA | From `.extraction-date` |

#### Table 2: `prizes_prizes` (Node: `$.prizes[*]`)

| Column Name | Data Type | Category | Description |
|-------------|-----------|----------|-------------|
| `entity_id` | UUID | SPECIAL | Entity identifier |
| `point_time` | DATE | SPECIAL | Query time point |
| `creation_date` | DATE | ROOT | Entity creation date |
| `last_update_date` | DATE | ROOT | Entity last update |
| `state` | STRING | ROOT | Entity state |
| `index_0` | INTEGER | INDEX | Position in prizes array |
| `year` | STRING | DATA | From `.year` |
| `category` | STRING | DATA | From `.category` |

#### Table 3: `prizes_prizes_laureates` (Node: `$.prizes[*].laureates[*]`)

| Column Name | Data Type | Category | Description |
|-------------|-----------|----------|-------------|
| `entity_id` | UUID | SPECIAL | Entity identifier |
| `point_time` | DATE | SPECIAL | Query time point |
| `creation_date` | DATE | ROOT | Entity creation date |
| `last_update_date` | DATE | ROOT | Entity last update |
| `state` | STRING | ROOT | Entity state |
| `index_0` | INTEGER | INDEX | Position in prizes array |
| `index_1` | INTEGER | INDEX | Position in laureates array |
| `id` | STRING | DATA | From `.id` |
| `firstname` | STRING | DATA | From `.firstname` |
| `surname` | STRING | DATA | From `.surname` |
| `motivation` | STRING | DATA | From `.motivation` |
| `share` | TINYINT | DATA | From `.share` |

#### Table 4: `prizes_json` (Special JSON table)

| Column Name | Data Type | Category | Description |
|-------------|-----------|----------|-------------|
| `entity_id` | UUID | SPECIAL | Entity identifier |
| `point_time` | DATE | SPECIAL | Query time point |
| `creation_date` | DATE | ROOT | Entity creation date |
| `last_update_date` | DATE | ROOT | Entity last update |
| `state` | STRING | ROOT | Entity state |
| `entity` | STRING | SPECIAL | Complete JSON document |

---

## Querying your data

### Example queries

**1. Get all prizes from 2023:**

```sql
SELECT * FROM prizes_prizes WHERE year = '2023';
```

**2. Find all laureates with their prize information:**

```sql
SELECT
  p.year,
  p.category,
  l.firstname,
  l.surname,
  l.motivation
FROM prizes_prizes p
JOIN prizes_prizes_laureates l
  ON p.entity_id = l.entity_id
 AND p.index_0 = l.index_0;
```

**3. Retrieve the full JSON for a specific entity:**

```sql
SELECT entity FROM prizes_json WHERE entity_id = '<entity_id>';
```

**4. Query data as it existed at a specific time:**

```sql
SELECT * FROM prizes_prizes WHERE point_time = TIMESTAMP '2024-01-01 00:00:00';
```

---

## Best practices

1. **Use index columns for joins**: when joining tables from nested arrays, always join on `entity_id` and matching index columns.
2. **Understand your schema**: use the schema-generation API to see exactly what tables and columns are created for your data.
3.
**Leverage the JSON table carefully**: for complex queries or when you need the full document, query the `_json` table, but **always include `entity_id` in the WHERE clause** for performance.
4. **Filter by `entity_id`**: for better performance, include `entity_id` in your WHERE clause when possible, especially when querying the JSON table.
5. **Use `point_time` wisely**: only specify `point_time` when you need historical data; omit it for current data.

---

## Schema management

You can create any number of SQL schemas, each with a different set of tables and columns.

### Via the HTTP API

You can manage and inspect your SQL schemas using the REST API:

- **Create default schema**: `PUT /sql/schema/putDefault/{schemaName}`
- **Get schema**: `GET /sql/schema/{schemaId}`
- **List all schemas**: `GET /sql/schema/listAll`
- **Generate tables from entity model**: `GET /sql/schema/genTables/{entityModelId}`
- **Update tables**: `POST /sql/schema/updateTables/{entityModelId}`

For the full grammar, see the [REST API reference](/reference/api/).

### Using the Cyoda UI

Creating SQL schemas in the Cyoda UI is straightforward. Once logged in, navigate to the Trino/SQL menu — you can create and configure new schemas, or edit existing ones.

## Connecting via JDBC

The JDBC connection string follows this pattern:

```
jdbc:trino://trino-client-<caas_user_id>.eu.cyoda.net:443/cyoda/<your_schema>
```

where `<caas_user_id>` is your CAAS user ID and `<your_schema>` is the schema name you configured. For authentication credentials and technical-user setup, see [Authentication and identity](/concepts/authentication-and-identity/).

## Gaps in this reference

The following are intentionally not yet specified here and are tracked as upstream asks:

- **Supported SQL dialect scope** — which Trino features are guaranteed supported versus best-effort.
- **Push-down matrix** — which predicates, projections, and aggregates execute in the underlying store versus requiring a full scan.
- **Consistency / isolation** of long-running queries relative to concurrent transition writes.
- **Performance envelope** — rows/sec scan rates, partitioning, and per-tenant query limits.

See the [cyoda-go issue tracker](https://github.com/Cyoda-platform/cyoda-go/issues?q=is%3Aissue+label%3Acyoda-docs) for progress.

---

## run.md

# Run

The cyoda-go packaging ladder — Desktop, Docker, Kubernetes, and hosted Cyoda Cloud.

Cyoda runs on the tier that fits the job. Every packaging runs the same application and the same workflow semantics; what changes is durability, consistency guarantees, and operational cost. Pick your packaging by what you need to operate, not by what your app needs to do.

## Pick your packaging

|                  | In-Memory | SQLite        | PostgreSQL       | Cassandra |
|------------------|-----------|---------------|------------------|-----------|
| **Desktop**      | ✓         | ✓ *(default)* |                  |           |
| **Docker**       | ✓         | ✓             | ✓                |           |
| **Kubernetes**   |           |               | ✓ *(production)* |           |
| **Cyoda Cloud**  |           |               |                  | ✓         |

## When to pick what

- **[Desktop](./desktop/)** — a single binary on a laptop or a small server. In-memory for tests; SQLite as the default durable store. Right for development, edge, IoT, and small-team self-hosting.
- **[Docker](./docker/)** — the same binary containerised. Use it for bespoke integrations, composition with other services, local PostgreSQL runs, and CI pipelines.
- **[Kubernetes](./kubernetes/)** — the production packaging for self-hosted clusters. Active-active stateless cyoda-go pods behind a load balancer with PostgreSQL as the only stateful dependency. Helm chart ships from cyoda-go.
- **[Cyoda Cloud](./cyoda-cloud/)** — a managed service backed by Cassandra. Right when you need enterprise-grade identity, multi-tenancy, and provisioning, and you do not want to operate the infrastructure.

## Moving between tiers

The application does not change when you move.
That is the whole point of the growth path: start on Desktop, containerise when integration demands it, cluster when scale demands it, migrate to Cyoda Cloud when operating it no longer pays for itself.

See [Digital twins and the growth path](/concepts/digital-twins-and-growth-path/) for the decision framework.

## License and editions

Cyoda ships in three editions. Pick by how you want to consume it; the application contract is the same across all three.

| Edition | License | Storage tiers | Status |
|---|---|---|---|
| **cyoda-go** | Apache 2.0 (OSS) | In-memory, SQLite, PostgreSQL | Generally available |
| **Cyoda Cloud** | Commercial (hosted service) | Cassandra-backed | Beta — test/demo only; commercial SLAs planned |
| **Enterprise** | Commercial (self-hosted) | Cassandra-backed | Available — contact sales |

The **cyoda-go** binary — everything in [Build](/build/) and [Reference](/reference/) works against it — is Apache 2.0 and free to use for development and production on the Desktop, Docker, and Kubernetes packagings.

The **Cassandra-backed** tier (horizontal scale for write volume and distributed `async` search) ships two ways: as the hosted **Cyoda Cloud** service, and as a self-hosted **Enterprise** distribution under commercial license. Cyoda Cloud is currently a Beta for test and demonstration; production workloads on the Cassandra tier run under the Enterprise license today.

---

## run/cyoda-cloud.md

# Cyoda Cloud

Hosted Cyoda offering — test/demo today, commercial SLA offering planned.

Evolving · Test / demo only

Cyoda Cloud is currently a test and demonstration platform. Use at your own risk. A commercial offering with SLAs is planned.

This page gives an implementation-oriented overview of Cyoda Cloud: its physical architecture (how client compute integrates with the platform) and the current operational characteristics, service boundaries, and considerations that matter when you use it.
## Architecture

Cyoda models entities via configuration and provides persistence, transactional workflow orchestration and processing, and distributed querying via API and SQL in a write-only, horizontally distributed architecture. This enables building scalable event-driven systems with ease, with client-specific logic executed as independent compute.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-compact.svg)

Key points:

- Cyoda calls client compute via **gRPC + CloudEvents**
- Client compute is **client-owned**, **polyglot**, and **independently scalable**
- Data is stored in **Cassandra (bare metal, per-tenant keyspaces)**
- Data is **queried via HTTP API or Trino SQL**, in both cases as horizontally scalable distributed queries
- Each cluster layer (Cyoda, Trino, client compute, Cassandra) **scales horizontally**; nothing is coupled

### Platform overview

Cyoda Cloud is deployed on Hetzner bare‑metal infrastructure in Helsinki, Finland, supplemented by selected internal services running on Hetzner Cloud. Each lifecycle stage (development, staging, production) is deployed as an independent environment with its own Kubernetes and Cassandra clusters.

Key characteristics:

- **Bare metal first**: Cassandra runs directly on bare‑metal servers for predictable latency and I/O performance, and is intentionally not containerised.
- **High‑bandwidth internal network**: Bare‑metal nodes are connected via a private 10 Gbit LAN.
- **Strict network isolation**: External access is fronted by Cloudflare. All ingress to internal services occurs through VPN‑secured channels.
- **Horizontal scalability by design**: Cyoda services, Trino, and client compute nodes scale independently.

### Client compute model

Cyoda delegates all client‑specific business logic to Client Compute Nodes.
These nodes:

- Connect to Cyoda via gRPC and CloudEvents
- Execute processors within workflows
- Evaluate criteria that control gateway transitions
- Are fully owned and implemented by the client

Client compute is a first‑class part of the architecture and can be deployed flexibly depending on latency, isolation, and operational requirements.

Currently supported client runtimes:

- Java/Kotlin
- Python

Additional languages are planned, enabling a fully polyglot execution model.

### Developer mode

In developer mode, client compute nodes typically run on the developer's local machine.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-developer-mode.svg)

Characteristics:

- Client nodes connect to Cyoda Cloud via a **Cloudflare tunnel**
- Developers use their preferred IDE, language, and local tooling
- Hot‑reload and rapid iteration are supported

In this mode:

- **Client applications** interact with Cyoda using the HTTP API or the Trino JDBC driver to perform CRUD operations and queries
- **SQL tooling** can be used directly against entity data via Trino

Despite its simplicity, the architecture remains fully horizontally scalable:

- Cyoda services scale elastically per tenant
- Trino scales independently for analytical workloads
- Client compute nodes scale independently
- Cassandra scales horizontally and is shared across tenants

### Multi-cloud

Cyoda supports multi‑cloud deployments where tenant resources run in a separate cloud or region from the Cyoda control plane.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-multi-cloud.svg)

### Multi-language

Each client compute node can be implemented in a different programming language. This enables:

- Polyglot architectures
- Team autonomy
- Incremental migration between languages

Cyoda treats all client nodes uniformly at the protocol level, regardless of implementation language.
![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-multi-language.svg)

### Single-cloud

An entire Cyoda instance (control plane and tenant workloads) can be deployed into a single cloud environment.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-single-cloud.svg)

### Sharded by client tag

For advanced workloads, client tags can be used to route events to specific subsets of client compute nodes. This enables targeted execution strategies, such as:

- Separating GPU/TPU‑backed compute from CPU‑only workloads
- Isolating high‑throughput, low‑latency processing from batch workloads
- Running specialised processors with different cost or performance characteristics

Routing is explicit and deterministic, making this model suitable for complex or heterogeneous compute requirements.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-tag-sharded.svg)

### Cyoda Cloud layout

The current Cyoda Cloud deployment is a multi‑tenant platform running on bare‑metal infrastructure in Hetzner datacenters (EU).
![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-cyoda-details.svg)

High‑level characteristics:

- Each tenant maps to a dedicated Kubernetes namespace
- Each tenant has:
  - Its own Cyoda pods
  - A dedicated Cassandra keyspace
  - A dedicated Trino deployment for SQL access

Cyoda and Trino can both be elastically scaled per tenant to meet:

- Event processing throughput
- Workflow execution demand
- Complex analytical query workloads

In addition to the core runtime components, the platform includes:

- Cyoda UI SPA for interactive use
- Cyoda Toolbox Server for administrators, exposing a GraphQL API for maintenance and analysis
- Apache Zookeeper for Cyoda cluster state management (multi‑tenant across namespaces)
- Prometheus and Grafana (LGTM stack) for metrics and observability
- Alertmanager for alert routing, with notifications sent to internal Google Chat

Log aggregation and analysis are currently handled by Elasticsearch and Kibana. This is expected to be consolidated into Loki and Grafana LogQL in the future.

The diagrams intentionally omit some infrastructure components (gateways and edge routing, VPN and internal network details, authentication and identity services, AI Studio and Cloud Manager applications, auxiliary load balancers and supporting services) to keep the focus on data flow and execution topology.

## Service details

During Beta, we are still moving fast and may make breaking changes along the way, although we try to keep that to a minimum. Make sure you back up your data, and check before redeploying your Cyoda Cloud environment. It's good practice to set yourself up with some CI/CD pipelines to automate tearing down and rebuilding your data and environment. Contact us on [Discord](https://discord.com/invite/95rdAyBZr2) if you need help.

For current service status and known issues, and for upcoming work, see [Status and roadmap](./status-and-roadmap/).
For tier limits and entitlements, see [Identity and entitlements](./identity-and-entitlements/).

### Service availability

**Uptime and maintenance**

- Service operates 24/7
- Planned maintenance notifications posted on the [Status and roadmap](./status-and-roadmap/) page

**Geographic deployment**

- Single datacenter deployment in Helsinki, Finland
- Latency characteristics suitable for most development and testing use cases
- Contact us via [Discord](https://discord.com/invite/95rdAyBZr2) if latency impacts your use case

### Client integration points

- **HTTP API**: REST endpoints for application integration
- **gRPC**: High-performance interface for compute externalization
- **JDBC**: SQL querying via Trino
- **AI Studio**: Entry point for signing up, creating new applications via chat dialogue, provisioning and controlling your environment(s), and getting help. Available at [https://ai.cyoda.net](https://ai.cyoda.net)
- **Cyoda UI**: Web interface for your Cyoda environment. It's our legacy UI, but with extensive features including entity lifecycle configuration and observability, distributed report configuration and execution, entity model viewing and navigation, Trino SQL schema configuration, and deep processing manager analysis. Available at `https://client-<caas_user_id>.eu.cyoda.net`

### Data management

**Storage characteristics**

- Apache Cassandra backend with replication factor 3
- Write-only entity persistence for complete audit trails
- Point-in-time querying capabilities
- No automatic backups in Free Tier

**Data retention**

- Data persists until explicitly deleted by user or exported via API
- Users responsible for data export and backup
- Data export available via HTTP API endpoints

### API and integration

**Rate limiting**

Rate limits vary by subscription tier. See [Identity and entitlements](./identity-and-entitlements/) for specific limits.
- HTTP 429 responses when limits exceeded
- Burst capacity available within limits

**Authentication**

- Auth0-based human authentication for web interfaces
- OAuth 2.0 client credentials flow for technical users
- Technical user creation via [AI Studio](https://ai.cyoda.net)

For complete authentication details, see [Authentication and identity](/concepts/authentication-and-identity/).

### Current service limitations

**Beta phase considerations**

- Frequent platform updates and changes
- Feature set under active development
- Documentation continuously updated

**Support channels**

- Primary support via [Discord](https://discord.com/invite/95rdAyBZr2)
- Global team coverage across time zones
- Community-driven support model

**API versioning**

- API versioning planned for post-Beta release
- Client libraries available on GitHub, under active development
- In Beta, expect breaking changes. We'll keep people informed on [Status and roadmap](./status-and-roadmap/) and [Discord](https://discord.com/invite/95rdAyBZr2).

### Migration and data portability

- Full data export via HTTP API
- User responsibility for backup and migration

## In this section

- [Provisioning](./provisioning/)
- [Identity and entitlements](./identity-and-entitlements/)
- [Status and roadmap](./status-and-roadmap/)

---

## run/cyoda-cloud/identity-and-entitlements.md

# Identity and entitlements (Cyoda Cloud)

Configure OIDC, manage signing keys, and assign entitlements on the hosted platform.

> This page covers identity operations for the **hosted Cyoda Cloud**
> platform. For self-hosted cyoda-go identity (OAuth 2.0 issuance,
> M2M credentials, external key trust on your own instance), see the
> [cyoda-go identity docs](https://github.com/Cyoda-platform/cyoda-go/blob/main/docs/).

## Overview

Cyoda Cloud authenticates users and technical clients via JWT access tokens.
Tokens can be:

- Issued by Cyoda using internally managed asymmetric signing keys, for technical users (machine-to-machine API clients) and other scenarios where your own infrastructure needs to issue JWTs trusted by Cyoda.
- Issued by an external OpenID Connect (OIDC) provider that Cyoda is configured to trust, allowing you to use your own identity provider for regular and technical users, auto-enroll users from trusted JWT claims, and map IdP roles onto Cyoda authorities.

Once authenticated, what each identity can do on the platform is bounded by the subscription tier and its associated entitlements.

## Signing keys

Cyoda Cloud uses asymmetric JWT signing keys to issue and validate access tokens for technical users and custom integrations.

### JWT signing key attributes

Cyoda Cloud represents each signing key-pair with a set of attributes:

- **Audience (`audience`)**
  - Internal Cyoda concept that describes which consumers use tokens signed with this key. This is not the same as the JWT `aud` claim.
  - Typical values:
    - `human` – tokens issued for regular users
    - `client` – tokens issued for technical users (machine-to-machine access)
- **Algorithm (`algorithm`)**
  - Asymmetric signing algorithms only (for example, `RS256`, `RS512`, `ES256`).
- **Validity window (`validFrom`, `validTo`)**
  - Optional `validFrom` and `validTo` define when a key becomes valid and when it expires.
- **Key ID (`keyId` / `kid`)**
  - Each key-pair has a unique `keyId`.
  - This value is included in the JWT header as the `kid` field.

In standard Cyoda Cloud, JWT signing keys for technical users are managed centrally by Cyoda. In custom or on-premise installations, it is also possible to configure an externally injected key-pair file that IAM uses instead of managing keys entirely through the API.

### Managing JWT signing keys

At a high level, you can do the following operations:

1. Create or rotate signing key-pairs for the relevant audience.
2.
Invalidate the keys with a grace period and then reactivate them.
3. Delete the key-pairs completely.

#### Example: issue a key for technical users

The following example shows a minimal request for creating a key-pair for technical users:

```json
{
  "audience": "client",
  "algorithm": "RS256"
}
```

Your token issuer will receive the corresponding `keyId` and the key material. It should use the private key to sign access tokens for technical users and include the `kid` header in each JWT.

#### Example: rotate a key with grace period

To perform a safe rotation with a grace period for technical users:

1. Issue a new key-pair for the `client` audience if you don't have one.
2. Mark the old key as invalid but allow a grace period during which previously issued tokens remain valid, for example:

```json
{
  "gracePeriodSec": 3600
}
```

After the grace period expires, you can remove the old key or leave it invalidated.

#### End-to-end example: technical user using a rotated key

Putting it together for a typical environment:

1. **Before rotation**: Your technical user obtains access tokens signed with the old key (for example, `keyId = "old-key-id"`).
2. **Issue new key**: You create a new key for `audience = "client"` and obtain `keyId = "new-key-id"`.
3. **Switch issuer**: Token issuance logic starts using `"new-key-id"` to sign tokens while still accepting tokens with `"old-key-id"` during the grace period.
4. **Verify access**: Both old and new tokens can call Cyoda APIs until the grace period expires.
5. **Finalize rotation**: After the grace period, any remaining tokens signed with `"old-key-id"` are rejected, and you can clean up the old key.
**Key rotation flow** ```mermaid flowchart TB A[Old active key for `client` audience] B[Issue new key for `client` audience] C[Sign new tokens with new `keyId`] D[Grace period: accept old and new tokens] E[Invalidate old key] A --> B B --> C C --> D D --> E ``` ### Key rotation recommendations - Rotate keys regularly to limit the potential security risks. - Separate user and technical-user token lifecycles in your application. - In custom installations with externally injected key-pairs, align your rotation schedule with how often you replace the external key-pair files. ## OIDC provider configuration When you register an OIDC provider in Cyoda Cloud, you describe how Cyoda should trust and use tokens from that provider. Key concepts: - **Well-known configuration URI** - The standard OIDC discovery endpoint exposed by your IdP (for example, `https://your-idp/.well-known/openid-configuration`). - Cyoda fetches the JWKS (public keys) and other metadata from this URL. - **Issuers list** - An optional but strongly recommended list of allowed `iss` (issuer) values. - When present and non-empty, Cyoda requires the JWT `iss` claim to match one of these values. - When omitted or empty, issuer validation is skipped. - **Provider state** - Providers can be active or inactive. Inactive providers are ignored during JWT validation, and any keys loaded for them are treated as untrusted. In most environments, Cyoda Cloud comes pre-configured with providers for the supported identity options (for example, Auth0 for the default UI). Custom OIDC providers are typically used for enterprise integrations. ### How Cyoda uses OIDC providers At a high level, Cyoda IAM uses configured OIDC providers to: 1. Validate the JWT signature and basic claims (for example, issuer or expiration). 2. Extract required claims that identify the user and their organization within the Cyoda instance. 3. Apply auto-enrollment logic: - Create a **user** record based on the token. 
- Create a **legal entity** record based on organization-related claims (allowed only in custom installations). 4. Build the authenticated principal with **authorities** derived from role-related claims. The exact auto-enrollment behavior depends on how your environment is configured, but the required claims listed below must be present for the standard Cyoda Cloud flow to work. ### Configuring your custom OIDC provider When configuring your IdP (for example, Auth0, Azure AD, or another OIDC provider) to work with Cyoda Cloud: 1. Ensure that access tokens issued for Cyoda APIs include at least: - `sub` - `org_id` - `caas_org_id` 2. Configure a claim (often via custom rules or mappers) that emits `user_roles` as an array of strings for users that need explicit roles. 3. Verify that the token `iss` (issuer) value matches one of the `issuers` configured for the corresponding OIDC provider in Cyoda. ### Operational tips - Use a dedicated OIDC client/application configuration per environment (dev, test, prod) and reflect that in your `org_id` or `caas_org_id` values as appropriate. - Keep the list of allowed `issuers` small and explicit to reduce the chance of accepting tokens from an unexpected issuer. - When rotating keys at your IdP, you can trigger a reload of provider metadata (including JWKS) in Cyoda Cloud using the appropriate API. ### End-to-end example: custom OIDC provider The following example describes a typical setup using an external OIDC provider as the IdP: 1. **Configure your application in the OIDC provider** representing your Cyoda environment. 2. **Create a rule or action** that adds the following claims to the access token when the audience matches your Cyoda App: - `sub` (OIDC provider user ID) - `org_id` (your external organization identifier) - `caas_org_id` (the Cyoda legal entity identifier for your installation - `your_user_id` for the single-user Cyoda instance) 3. 
**Register the OIDC provider in Cyoda Cloud** using the OIDC `.well-known/openid-configuration` URL and, optionally, the expected `iss` value. 4. **Test login** via the OIDC provider, obtain an access token, and call a Cyoda API endpoint. Cyoda Cloud validates the token and auto-enrolls the user under your legal entity if needed. ```mermaid sequenceDiagram participant User participant OIDC as OIDC Provider participant Cyoda as Cyoda Cloud User->>OIDC: Login with browser OIDC-->>User: Redirect with access token User->>Cyoda: Call API with bearer token OIDC-->>Cyoda: Fetch JWKS to validate signature Cyoda-->>User: Authorized response (user enrolled, roles applied) ``` ## JWT claims → role mapping When integrating a **custom OIDC provider**, its access tokens must carry specific claims so that Cyoda Cloud can: - Identify the user - Identify the organization (legal entity) - Map roles The following table summarizes the claims used by the integration: | Claim name | Type | Required? | Purpose | |-----------|------|-----------|---------| | `sub` | string | Yes | Standard OIDC subject. Used as a stable external user identifier. | | `org_id` | string | Yes | External organization identifier provided by your IdP. Used as the legal entity external key and to build a human-readable name (for example, `"Org. "`). | | `caas_org_id` | string | Yes (for Cyoda-backed tenants) | Cyoda legal entity identifier (`caas_org_id`). Used as the `owner` of both users and legal entities. | | `user_roles` | array of strings | Recommended | List of application or user roles. If absent, the user is treated as having no additional roles. | ## Entitlements Access to Cyoda Cloud is subscription-tier-based. This section provides details about the available subscription tiers and their entitlements. **Important**: The information below is for reference purposes and is not guaranteed to be correct. 
The authoritative source for your account's current subscription details and entitlements is available through the Cyoda Cloud API at the following endpoints:

- **Current account information**: `GET /account` — Retrieve information about the current user's account, including the current subscription.
- **All available subscriptions**: `GET /account/subscriptions` — Retrieve all subscriptions available for the current user's legal entity.

For complete API documentation, refer to the [OpenAPI specification](/api-reference/).

### Subscription tiers overview

| Entitlement | Free¹ | Developer | Pro | Enterprise License² |
| --- | --- | --- | --- | --- |
| **Status** | Available | Draft | Draft | Available |
| **Model Fields (per model)** | 150 | 150 | 500 | Unlimited |
| **Model Fields (cumulative)** | 300 | 300 | 2000 | Unlimited |
| **Models** | 20 | 20 | 100 | Unlimited |
| **Client Nodes** | 1 | 1 | 5 | Unlimited |
| **Payload Size** | 5 MB | 5 MB | 50 MB | Unlimited |
| **Disk Usage** | 2 GB | 2 GB | 1 TB | Unlimited |
| **API Requests** | 300/min | 300/min | 50/sec | Unlimited |
| **External Calls** | 300/min | 300/min | 50/sec | Unlimited |

¹ _Free Tier environments are automatically reset after an expiry period. Contact us for details._

² _Enterprise License is for the Cyoda Cloud system that clients operate themselves (outside of Cyoda Cloud). Contact us for details._

**Status legend:**

- **Available**: Tier is currently available for subscription.
- **Draft (unavailable)**: Tier is in the planning/development phase and not yet available.

### Entitlement definitions

The following table defines each entitlement ID used in the subscription tiers:

| Entitlement ID | Description |
| --- | --- |
| `NUM_MODEL_FIELDS` | Maximum number of fields allowed per individual data model. This controls the complexity of each model you can create. |
| `NUM_MODEL_FIELDS_CUMULATIVE` | Total number of fields allowed across all data models in your account. This is the sum of fields across all your models. |
| `NUM_MODELS` | Maximum number of data models you can create in your account. Each model represents a different data structure or entity type. |
| `NUM_CLIENT_NODES` | Maximum number of client nodes that can connect to your Cyoda Cloud instance simultaneously. This controls concurrent compute capacity. |
| `PAYLOAD_SIZE` | Maximum size in bytes for individual API request payloads. This limits the amount of data you can send in a single API call. |
| `DISK_USAGE` | Maximum disk storage space allocated for your account data in bytes. This includes all stored models, data, and metadata. |
| `API_REQUEST` | Maximum number of API requests allowed per time interval. This controls the rate at which you can make API calls. |
| `EXTERNALIZED_CALL` | Maximum number of external compute calls allowed per time interval. This applies to calls made from your Cyoda Cloud instance to your connected compute nodes. |

---

_This subscription tier information is maintained alongside the platform configuration but may deviate from your actual settings. For the most current and accurate information about your specific account entitlements, please refer to the `/account` API endpoints._

---

## run/cyoda-cloud/provisioning.md

# Provisioning (Cyoda Cloud)

Provision a Cyoda Cloud environment.

Evolving · Cyoda Cloud

Cyoda Cloud is currently a test/demo offering. A commercial SLA offering is coming. Use at your own risk until then.

Getting your first Cyoda Cloud Free Tier environment is straightforward. Simply follow the steps below.

## TL;DR

1. Create an account on [https://ai.cyoda.net](https://ai.cyoda.net)
2. Find your environment URL by prompting in the chat dialogue of the AI Studio with: `What is my environment URL?`
3. Deploy your environment by prompting with: `Deploy my Cyoda environment`
4. Create a technical user by prompting with: `Add machine user`
5.
Access your environment via the Cyoda UI or APIs

## Create an Account

1. **Access the AI Studio**: Navigate to the Cyoda Cloud web-based Single Page Application (SPA) at [https://ai.cyoda.net](https://ai.cyoda.net) and consent to the terms and conditions.

   ![AI Studio Consent](aiAssistantConsent)
   ![AI Studio Greeting Screen](aiAssistantGreet)

2. **Choose Authentication Provider**: Register using one of the supported providers:
   - **Google Auth**: Sign up using your Google account
   - **GitHub**: Sign up using your GitHub account
3. **Complete Registration**: Follow the Auth0 authentication flow to complete your account setup
4. **Free Tier Access**: Upon successful registration, you'll be automatically enrolled in the Free Tier subscription

## Know your Environment

Prompt in the chat dialogue of the AI Studio with: `What is my environment URL?`. Wait a bit.

![What is my environment URL Prompt](whatIsMyEnvironment)

## Deploy your Environment

Prompt with: `Deploy my environment`. Wait for the deployment to complete. It usually takes about 5 minutes.

![Deploy Environment Prompt](deployEnvPrompt)
![Environment Deployed Confirmation](envDeployedConfirmation)

## Create a Technical User

**Create a technical user (M2M client)**: Prompt with: "Add machine user". You will see a button to launch the query against your environment to create the new user. Write down the client ID and secret - you'll need them to access your environment.

![Add Machine User Prompt](createTechnicalUser)

## Access the Environment

Once your environment is deployed and you have a technical user, you can access your environment.

### Via the Cyoda UI

Navigate to your environment URL in your favorite browser at `https://client-.eu.cyoda.net`. You can find your environment URL from the previous steps, or ask in the chat for the URL.

![Login Cyoda UI](loginCyodaUI)

With the Cyoda UI, you log in with your personal account via Auth0.
![Cyoda UI Logged In](loggedIn)

### Via the API

To access the APIs, you need to authenticate with your technical user credentials. See also [Connecting](/guides/authentication-authorization/) for more details.

---

## run/cyoda-cloud/status-and-roadmap.md

# Status and roadmap (Cyoda Cloud)

Current status, known limitations, and upcoming work for the hosted Cyoda Cloud platform.

## Current status

**⚠️ Beta Phase** - Expect frequent changes and some interruptions.

**🔧 Planned Maintenance**: none at the moment.

| Issue | Description | Status |
| --- | --- | --- |
| **Environment Access** | [AI Studio](https://ai.cyoda.net) is currently the only control interface for your environments. When you've logged in, it will give you your environment details if prompted. But it's best to **write down your environment URL**. | We'll be releasing better options soon. |
| **Java Code Generation** | Generating your code-base may take a while (15-30 minutes). Please be patient. In rare cases it might not compile, but it's usually obvious to fix. | Should get better as we improve the agentic workflow and prompts. |
| **Deployments** | In the [AI Studio](https://ai.cyoda.net) you can ask to deploy your environment to pick up the latest Cyoda version. **BEWARE**: This will reset all your data, and it might contain breaking changes in Alpha/Beta, such that your client compute node may not work. | We'll announce breaking changes in our [Discord](https://discord.gg/95rdAyBZr2) channel. This is Beta. You'll usually just need to merge the latest changes from the template projects into your codebase, and maybe adjust a few things. |
| **Auth0 Logouts** | Unexpected session terminations, possibly related to idle times. | We're only monitoring this at the moment. |
| **Transactional Deletions** | Deleting large amounts of data is slow (slightly slower than saving it). | It's in the backlog, but resolution is unlikely before the end of the Alpha Phase. |

**Reach out to us on [Discord](https://discord.com/invite/95rdAyBZr2) if you need help**.

## Roadmap

So many things to work on. We'll be putting our major roadmap items here once we have enough feedback from the community to know what's most important.

---

## run/desktop.md

# Desktop (single binary)

Run cyoda-go from a single binary — dev, low-volume production, in-memory and SQLite modes.

The desktop packaging is cyoda-go as it ships out of the box: a single binary, no orchestrator, no external database. It is the right choice for local development, edge deployments, small-team self-hosting, and any low-volume production workload where a single machine is enough.

## In-memory vs SQLite

The desktop binary supports two storage modes:

- **In-memory** — everything lives in process memory. Sub-millisecond latencies. Data is lost on restart. Use it for tests, demos, and digital-twin scenario runs.
- **SQLite (default after `cyoda init`)** — durable, single-file, zero-ops. Data survives restarts; backup is a file copy. Use it for everyday persistent work. SQLite is single-writer; all writes serialize through the database file, which limits concurrent write throughput.

The SQLite database file is created at `~/.local/share/cyoda/cyoda.db` by `cyoda init`. Back it up by copying the file; migrate it by moving the file.

## Install

Pick the installer that suits your platform; the authoritative list lives in the [cyoda-go README](https://github.com/cyoda-platform/cyoda-go#install).
```bash # macOS / Linux via Homebrew brew install cyoda-platform/cyoda-go/cyoda ``` Debian, RHEL, and `curl | sh` installers are available in the same document. ## Run The Homebrew and packaged installers run `cyoda init` for you, which sets up the SQLite store. To start the server: ```bash cyoda ``` The binary exposes REST on port **8080** and gRPC on **9090** by default. The full CLI reference lives at [Reference → CLI](/reference/cli/). ## Configure cyoda-go reads configuration from environment variables, a config file, or CLI flags. The full list of options lives at [Reference → Configuration](/reference/configuration/); for everyday use the defaults are fine, and you only set a handful of variables (`CYODA_STORAGE_BACKEND`, listen ports, JWT keys) to adapt to your environment. For secrets, cyoda-go supports `*_FILE` suffixes on any credential environment variable so you can mount them from a secrets store rather than pass them on the command line. ## Upgrading Upgrading is a version bump: install the new binary, restart the process. cyoda-go follows semantic versioning; configuration migration policy is documented in the [cyoda-go release notes](https://github.com/cyoda-platform/cyoda-go/releases). --- ## run/docker.md # Docker Run cyoda-go in Docker for bespoke integrations and local compositions. Docker is the right packaging when you need cyoda-go to sit inside a larger composition — with your app, with a PostgreSQL backend, with an observability stack — or when your CI pipeline wants a clean container image per run. ## When Docker fits - **Bespoke integrations.** Deploy cyoda-go alongside your own services on a single host, wire them over a Docker network. - **Composed environments.** Run cyoda-go with PostgreSQL, Prometheus, Grafana, and OpenTelemetry collectors as a complete local stack. - **PostgreSQL for dev and test.** Point cyoda-go at a containerised PostgreSQL to exercise production-mode behaviour locally before deploying. 
- **CI pipelines.** Ephemeral, reproducible, no host state. ## Image cyoda-go publishes container images per release. The authoritative reference and pull instructions live in the [cyoda-go Docker reference](https://github.com/cyoda-platform/cyoda-go/tree/main/deploy/docker). ## Compose example The repository ships a minimal `compose.yaml` for getting a single node and PostgreSQL up, plus a richer [compose-with-observability](https://github.com/cyoda-platform/cyoda-go/tree/main/examples/compose-with-observability) example that wires tracing and metrics. Use these as templates rather than retyping them here — they track the cyoda-go release and we link whichever is current. ## PostgreSQL backend Point cyoda-go at a PostgreSQL instance by setting `CYODA_STORAGE_BACKEND=postgres` and the usual connection variables (or `*_FILE` forms for secrets). The DSN goes in `CYODA_POSTGRES_URL` (or `CYODA_POSTGRES_URL_FILE` for a file-mounted secret per Docker conventions). The Docker compose example wires this up end-to-end; for production you will run PostgreSQL separately and pass only the DSN. ## Observability The container emits structured logs to stdout, exposes a Prometheus scrape endpoint for metrics, and accepts OpenTelemetry configuration for traces. The observability example demonstrates a full loop: - **Logs** — stream from the cyoda-go container. - **Metrics** — Prometheus scrapes the admin port. - **Traces** — OTLP exporter configured via environment. Tune sampling and log level at runtime via the admin endpoints; see the [cyoda-go observability reference](https://github.com/cyoda-platform/cyoda-go#observability). Health probes live on the admin port (default 9091): `/livez` (liveness) and `/readyz` (readiness). Both are unauthenticated. ## Data directory The container pre-stages `/var/lib/cyoda` as the data directory (with the correct ownership for the non-root `65532:65532` user). 
Mount it as a named volume if you want SQLite data or any plugin state to persist across container restarts. --- ## run/kubernetes.md # Kubernetes Deploy cyoda-go with the Helm chart for clustered PostgreSQL-backed production. Kubernetes is the recommended production packaging for self-hosted cyoda-go. The application is designed for active-active stateless deployment: three to ten cyoda-go pods behind a load balancer, with PostgreSQL as the only stateful dependency. ## When Kubernetes fits - **Production workloads** that need high availability. - **Multi-node clustering** with rolling upgrades and blue/green. - **Horizontal scale** up to the PostgreSQL backend's limits (10+ stateless pods serving one PostgreSQL cluster is a comfortable envelope). - **Enterprise operations** — GitOps, secrets management, service meshes. ## Deployment shape ``` ┌─────────────┐ │ Load │ │ Balancer │ └──────┬──────┘ │ ┌────────┼────────┐ │ │ │ ┌────▼─┐ ┌────▼─┐ ┌────▼─┐ │cyoda │ │cyoda │ │cyoda │ (stateless, 3–10 pods) │-go │ │-go │ │-go │ └────┬─┘ └────┬─┘ └────┬─┘ │ │ │ └────────┼────────┘ │ ┌──────▼──────┐ │ PostgreSQL │ (the only stateful dependency) └─────────────┘ ``` Every pod is identical; any pod can serve any request. There is no leader election, no ZooKeeper, no etcd. Coordination happens through PostgreSQL's SERIALIZABLE isolation for writes and a gossip protocol (HMAC-authenticated) for membership, so concurrent writers never silently corrupt data. The stateful backend is pluggable: PostgreSQL (OSS default) or the commercial Cassandra storage engine. The pod topology and the application contract are identical either way — only the storage plugin configuration differs. ## Helm chart cyoda-go ships a Helm chart under [`deploy/helm`](https://github.com/cyoda-platform/cyoda-go/tree/main/deploy/helm). The chart provisions the cyoda-go Deployment, a Service, a ConfigMap for non-sensitive configuration, and Secret references for credentials. 
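As a sketch, a GitOps-friendly install might pin the HMAC secret to a pre-created Kubernetes Secret. Only `cluster.hmacSecret.existingSecret` is a key documented here; the secret name is a placeholder of our choosing, and the chart's own `values.yaml` remains authoritative:

```yaml
# Illustrative values override, assuming a Secret named cyoda-cluster-hmac
# was created out of band (e.g. by your secrets operator).
cluster:
  hmacSecret:
    existingSecret: cyoda-cluster-hmac  # pre-created Secret; keeps inter-node auth stable across reconciles
```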
The authoritative values reference lives under [Reference → Helm values](/reference/helm/); the chart's own `values.yaml` remains the runtime source of truth. The Helm chart auto-generates the HMAC secret unless `cluster.hmacSecret.existingSecret` is provided. GitOps deployments should always set `existingSecret` to avoid Helm rendering a fresh secret on every reconcile, which would cause inter-node auth to drift.

## High availability

- **Load balancer.** Any Kubernetes-native L4 or L7 will do; point its health checks at the readiness probe the chart exposes.
- **Readiness and liveness probes.** Both are wired by default; tune them if your control plane has stricter latency budgets.
- **Pod Disruption Budgets.** Set a minimum available count that matches your replica count minus one, so rolling upgrades and node drains do not take the service below quorum.

## Backup and restore

Backup is standard PostgreSQL tooling: `pg_dump`, continuous WAL archiving, snapshot-based backups from your cloud provider. cyoda-go does not maintain any state outside PostgreSQL, so a PostgreSQL restore brings the platform back to that point in time in full.

## Upgrades and rollback

cyoda-go releases follow semantic versioning. For production:

- **Blue/green or canary.** Run the new version alongside the old, cut traffic over, retire the old.
- **Rolling upgrade.** Fine for minor releases; set `maxUnavailable: 0` so capacity never drops.
- **Schema migration ordering.** Check the release notes for whether a release requires a PostgreSQL schema migration step before the new binary starts serving. The Helm chart runs schema migrations as a pre-install/pre-upgrade hook; pod startup is blocked until migrations complete.

## Sizing

Sizing is driven by write volume more than read volume, because reads scale horizontally across pods while writes are serialized through PostgreSQL. Qualitative guide:

- **Small.** 3 pods, `db.small` (or equivalent), up to a few hundred transitions per second.
- **Medium.** 5–7 pods, dedicated PostgreSQL, low thousands of transitions per second. - **Large.** 10 pods with PostgreSQL scaled up; at this point consider swapping to the commercial Cassandra storage engine (still on Kubernetes), or handing operations to Cyoda Cloud as a SaaS. ## Observability The chart exposes a Prometheus scrape annotation on the pods and surfaces the admin endpoints for log-level and tracing control. Standard OpenTelemetry configuration applies; wire OTLP exporters via environment variables in the Helm values. ## Scaling past PostgreSQL At the upper end of the sizing guide, PostgreSQL's write throughput becomes the bottleneck. Two paths past it — the application contract is identical in both: - **Swap to the commercial Cassandra storage engine, still on Kubernetes.** The licensable plugin replaces the PostgreSQL backend with a horizontally-scaling Cassandra-backed tier. The Helm chart, the pod topology, and the application code are unchanged — only the storage plugin configuration changes. - **Hand operations to Cyoda Cloud.** A SaaS that runs either the PostgreSQL or Cassandra stack for you. Same application contract, different operational model. See [Cyoda Cloud](./cyoda-cloud/) for the SaaS option; contact sales for the commercial Cassandra plugin. ---