# Cyoda Documentation — full content

Source: https://docs.cyoda.net
Generated: 2026-04-25T01:36:46.515Z

---

## build.md

# Build

Develop Cyoda applications — tier-agnostic patterns that work on any runtime.

Cyoda applications are **digital twins**: the same code runs on every storage tier, from in-memory dev through Cassandra at enterprise scale. Pages in this section cover the patterns — entity modeling, workflows, external processors, testing — independent of where your app runs.

Where to next:

- [Modeling entities](./modeling-entities/)
- [Working with entities](./working-with-entities/)
- [Workflows and processors](./workflows-and-processors/)
- [Client compute nodes](./client-compute-nodes/)
- [Testing with digital twins](./testing-with-digital-twins/)

---

## build/analytics-with-sql.md

# Analytics with SQL

Query entities with Trino SQL — when it's the right surface, how to connect, and where to find the full grammar.

:::caution[Upcoming]
Trino SQL is on the roadmap and not yet available in cyoda-go at this release. This page documents the planned surface; names and shapes may change before release.
:::

Cyoda projects every entity model into a set of virtual SQL tables and exposes them through a Trino connector. Use this surface for cross-entity analytics: joins across entity types, aggregates, reporting, time-series, BI dashboards. For operational read/write, stay on [REST](/build/working-with-entities/); for compute that runs against transitions, use [gRPC compute nodes](/build/client-compute-nodes/).

## When to use SQL

- Ad-hoc analysis against live data in a notebook or BI tool.
- Scheduled reports aggregating entities across a tenant.
- Historical queries using the `point_time` column for as-of reads.
- Cross-entity joins — e.g. orders joined to customers joined to payments.

If the question is *transactional* ("read this one entity", "fire this transition"), it does not belong here. REST is faster, cheaper, and correctly scoped for that.
## Connect The JDBC connection string pattern: ``` jdbc:trino://trino-client-.eu.cyoda.net:443/cyoda/ ``` - `caas_user_id` — your CAAS user ID. - `` — the SQL schema name you configured (create one in the Cyoda UI under **Trino/SQL**, or via `PUT /sql/schema/putDefault/{schemaName}`). Authenticate with a bearer token issued by the platform. For technical-user setup and the OAuth 2.0 client-credentials flow, see [Authentication and identity](/concepts/authentication-and-identity/). ## A first query Given an entity model `orders` with nested `lines`, Cyoda produces one table per nested level plus a JSON reconstruction table: - `orders` — root columns + top-level fields - `orders_lines` — one row per line item, with `index_0` marking position - `orders_json` — the complete JSON document per entity Read a single order and its lines: ```sql SELECT o.entity_id, o.state, o.customer_id, l.index_0 AS line_no, l.sku, l.quantity, l.price FROM orders o JOIN orders_lines l ON l.entity_id = o.entity_id WHERE o.entity_id = '00000000-0000-0000-0000-000000000001'; ``` Query as of last Tuesday: ```sql SELECT * FROM orders WHERE point_time = TIMESTAMP '2026-04-14 00:00:00' AND state = 'submitted'; ``` ## Table-naming rules, at a glance - Root node of an entity → table named after the model (e.g. `orders`). - Array-of-objects node → `_` (e.g. `orders_lines`). - Multi-dimensional arrays → `__d_array` (detached array naming). - JSON reconstruction → `_json`. The full projection rules — node decomposition, detached arrays, type mapping, polymorphic fields — live in the [Trino SQL reference](/reference/trino/). ## Performance notes - When querying a `_json` table, **always include `entity_id` in the WHERE clause**. Without that predicate the query scans the reconstruction table for every entity and gets very slow. - For joins across nested-array tables, use `entity_id` plus matching `index_*` columns as join keys. - Omit `point_time` unless you actually need historical data. 
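For BI tools, the connection string above is all you need; from application code, plain JDBC works too. The sketch below is illustrative, not an official client: the way the CAAS user ID and schema name slot into the URL is an assumption inferred from the pattern above, and `TrinoConnect`, the `analytics` schema, and the `CYODA_TOKEN` variable are made-up names. `accessToken` and `SSL` are standard Trino JDBC driver properties for bearer-token authentication over TLS.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.util.Properties;

public class TrinoConnect {

    // ASSUMPTION: the CAAS user ID fills the host slot and the schema name the
    // trailing slot of the pattern shown above; confirm against your tenant's URL.
    static String buildJdbcUrl(String caasUserId, String schemaName) {
        return "jdbc:trino://trino-client-" + caasUserId
                + ".eu.cyoda.net:443/cyoda/" + schemaName;
    }

    public static void main(String[] args) throws Exception {
        String token = System.getenv("CYODA_TOKEN"); // hypothetical env var
        if (token == null) {
            return; // no credentials available — nothing to do
        }
        Properties props = new Properties();
        props.setProperty("accessToken", token); // Trino JDBC bearer-token property
        props.setProperty("SSL", "true");        // TLS on port 443

        try (Connection conn = DriverManager.getConnection(
                buildJdbcUrl("a1b2c3", "analytics"), props);
             ResultSet rs = conn.createStatement()
                     .executeQuery("SELECT count(*) FROM orders")) {
            while (rs.next()) {
                System.out.println("orders: " + rs.getLong(1));
            }
        }
    }
}
```

The Trino JDBC driver (`io.trino:trino-jdbc`) must be on the classpath for `DriverManager` to resolve the `jdbc:trino:` scheme.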
## Where to go next - [Trino SQL reference](/reference/trino/) — full projection rules, type mapping, polymorphic fields, complete worked example. - [APIs and surfaces](/concepts/apis-and-surfaces/) — when to pick REST vs gRPC vs SQL. - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the entity model whose shape becomes your SQL tables. --- ## build/client-compute-nodes.md Patterns for processor and criteria services — implementation, registration, and lifecycle. # 1. Architecture Overview A **calculation member** is an external gRPC client that participates in entity workflow processing on the Cyoda platform. The platform delegates work to your client over a persistent bidirectional gRPC stream, and your client returns results on the same stream. For the rationale behind preferring gRPC over HTTP for compute nodes, see [APIs and surfaces](/concepts/apis-and-surfaces/). ``` ┌──────────────────────┐ gRPC (bidirectional stream) ┌─────────────────────────┐ │ Cyoda Platform │ ◄──────────────────────────────────────────►│ Your Calculation │ │ │ CloudEvent (Protobuf, JSON payload) │ Member (Client) │ │ ┌────────────────┐ │ │ │ │ │ Workflow Engine│ │ 1. Client opens stream, sends Join │ ┌───────────────────┐ │ │ │ │ │ 2. Server responds with Greet │ │ Business Logic │ │ │ │ - Processors │──┼──3. Server pushes Processing/Criteria reqs──┼──│ │ │ │ │ - Criteria │ │ 4. Client returns responses │ │ - Data transforms │ │ │ │ │ │ 5. 
Keep-alive heartbeats (bidirectional) │ │ - Criteria checks │ │ │ └────────────────┘ │ │ └───────────────────┘ │ └──────────────────────┘ └─────────────────────────┘ ``` Two types of work can be delegated: | Use Case | Description | Request Type | Response Type | |---|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---|---| | **Processing** | Perform actions, such as transforming entity data during a workflow transition, performing CRUD ops on other entities, running reports, interacting with other systems, etc. | `EntityProcessorCalculationRequest` | `EntityProcessorCalculationResponse` | | **Criteria Evaluation** | Evaluate a boolean condition (e.g., "should this transition fire?") | `EntityCriteriaCalculationRequest` | `EntityCriteriaCalculationResponse` | ## 1.1 Protocol Summary - **Transport**: gRPC bidirectional streaming via `CloudEventsService.startStreaming` - **Message format**: [CNCF CloudEvents](https://cloudevents.io/) Protobuf envelope with JSON `text_data` payload - **Authentication**: Bearer JWT token in gRPC `Authorization` metadata header - **Auth context propagation**: The platform attaches [CloudEvents Auth Context extension](https://github.com/cloudevents/spec/blob/main/cloudevents/extensions/authcontext.md) attributes to processor and criteria requests, identifying the principal whose action triggered the workflow (see [Section 8](#8-auth-context-on-incoming-requests)) - **Serialization**: All payloads are JSON-serialized inside CloudEvent `text_data` (not binary protobuf) --- # 2. 
Prerequisites

## 2.1 Proto Definitions

Your client needs the following proto files to generate gRPC stubs:

- **`cloudevents.proto`** — The standard CloudEvents Protobuf message definition (package `io.cloudevents.v1`)
- **`cyoda-cloud-api.proto`** — The Cyoda service definition (package `org.cyoda.cloud.api.grpc`)

The service definition:

```protobuf
service CloudEventsService {
  rpc startStreaming(stream io.cloudevents.v1.CloudEvent)
      returns (stream io.cloudevents.v1.CloudEvent);
}
```

The CloudEvent message:

```protobuf
message CloudEvent {
  string id = 1;           // Unique event ID (UUID recommended)
  string source = 2;       // URI-reference identifying the event source
  string spec_version = 3; // Must be "1.0"
  string type = 4;         // Event type string (see Section 4)
  map<string, CloudEventAttributeValue> attributes = 5;
  oneof data {
    bytes binary_data = 6;
    string text_data = 7;  // ← Used by Cyoda (JSON payload)
    google.protobuf.Any proto_data = 8;
  }
}
```

## 2.2 JWT Authentication Token

Obtain a valid JWT Bearer token from the Cyoda IAM system (OAuth 2.0 client credentials flow). The token must contain:

- A valid `caas_org_id` claim (your legal entity ID)
- Valid user roles

The token is validated on every stream establishment. If the token expires during an active stream, the stream remains valid — re-authentication occurs only when a new stream is opened.

## 2.3 Dependencies (Java/Kotlin Example)

For JVM-based clients, the recommended dependencies are:

- `io.grpc:grpc-stub`, `io.grpc:grpc-protobuf`, `io.grpc:grpc-netty-shaded` — gRPC runtime
- `io.cloudevents:cloudevents-protobuf` — CloudEvents SDK Protobuf format support
- `io.cloudevents:cloudevents-core` — CloudEvents SDK core
- `com.fasterxml.jackson.core:jackson-databind` — JSON serialization

---

# 3.
Connection Setup ## 3.1 Create the gRPC Channel ```java ManagedChannel channel = ManagedChannelBuilder .forAddress("cyoda-host.example.com", 50051) .usePlaintext() // Use .useTransportSecurity() for TLS in production .keepAliveTime(30, TimeUnit.SECONDS) .keepAliveTimeout(10, TimeUnit.SECONDS) .build(); ``` **Production TLS**: In production, always use TLS. Replace `.usePlaintext()` with: ```java .useTransportSecurity() .sslContext(/* your SSL context */) ``` ## 3.2 Attach JWT Credentials Create a `CallCredentials` implementation that injects the `Authorization` header: ```java CallCredentials callCredentials = new CallCredentials() { @Override public void applyRequestMetadata(RequestInfo requestInfo, Executor executor, MetadataApplier applier) { executor.execute(() -> { Metadata headers = new Metadata(); headers.put( Metadata.Key.of("Authorization", Metadata.ASCII_STRING_MARSHALLER), "Bearer " + jwtTokenSupplier.get() // Always fetch a fresh token ); applier.apply(headers); }); } }; ``` ## 3.3 Create the Stub ```java CloudEventsServiceGrpc.CloudEventsServiceStub asyncStub = CloudEventsServiceGrpc .newStub(channel) .withCallCredentials(callCredentials) .withWaitForReady(); // Wait for the channel to become ready before sending ``` --- # 4. CloudEvent Type System Every message on the stream is a CloudEvent with a `type` field that determines how to deserialize the JSON `text_data`. 
Your client must handle the following types: | CloudEvent `type` | Direction | Description | |---|---|---| | `CalculationMemberJoinEvent` | Client → Server | Register as a calculation member | | `CalculationMemberGreetEvent` | Server → Client | Server confirms registration | | `CalculationMemberKeepAliveEvent` | Bidirectional | Heartbeat probe and response | | `EventAckResponse` | Server → Client | Acknowledgment of keep-alive | | `EntityProcessorCalculationRequest` | Server → Client | Process entity data | | `EntityProcessorCalculationResponse` | Client → Server | Return processed entity data | | `EntityCriteriaCalculationRequest` | Server → Client | Evaluate a boolean criterion | | `EntityCriteriaCalculationResponse` | Client → Server | Return criterion result | ## 4.1 Building a CloudEvent To send a CloudEvent on the stream (Java/Kotlin with CloudEvents SDK): ```java // 1. Build the CloudEvents SDK event io.cloudevents.CloudEvent sdkEvent = CloudEventBuilder.v1() .withType("CalculationMemberJoinEvent") // Must match the type table above .withSource(URI.create("my-calculation-member")) .withId(UUID.randomUUID().toString()) .withData(PojoCloudEventData.wrap(event, e -> objectMapper.writeValueAsBytes(e))) .build(); // 2. Serialize to Protobuf EventFormat protobufFormat = EventFormatProvider.getInstance() .resolveFormat("application/cloudevents+protobuf"); byte[] protoBytes = protobufFormat.serialize(sdkEvent); // 3. 
// Parse to the gRPC CloudEvent message
io.cloudevents.v1.proto.CloudEvent grpcEvent =
    io.cloudevents.v1.proto.CloudEvent.parseFrom(protoBytes);
```

## 4.2 Parsing a Received CloudEvent

```java
// From the gRPC StreamObserver.onNext(value):
String eventType = value.getType();
String jsonPayload = value.getTextData();

// Deserialize based on type
switch (eventType) {
    case "CalculationMemberGreetEvent":
        GreetEvent greet = objectMapper.readValue(jsonPayload, GreetEvent.class);
        break;
    case "EntityProcessorCalculationRequest":
        ProcessorRequest req = objectMapper.readValue(jsonPayload, ProcessorRequest.class);
        break;
    // ... etc
}
```

---

# 5. Connection Lifecycle

## 5.1 Open the Stream

```java
StreamObserver<CloudEvent> requestObserver = asyncStub.startStreaming(
    new StreamObserver<CloudEvent>() {
        @Override
        public void onNext(CloudEvent value) {
            // Dispatch based on value.getType() — see Sections 6–8
        }

        @Override
        public void onError(Throwable t) {
            // Connection lost — trigger reconnect (see Section 11)
        }

        @Override
        public void onCompleted() {
            // Server closed the stream — trigger reconnect
        }
    }
);
```

## 5.2 Join Handshake

Immediately after opening the stream, send a `CalculationMemberJoinEvent`:

```json
{
  "id": "",
  "tags": ["my-processor-tag", "production"]
}
```

**Tags** are critical for routing. The platform routes processing/criteria requests to members whose tags are a **superset** of the tags configured on the workflow processor/criterion. Tags are case-insensitive (lowercased server-side).

The server responds with a `CalculationMemberGreetEvent`:

```json
{
  "id": "",
  "success": true,
  "memberId": "",
  "joinedLegalEntityId": ""
}
```

**Store the `memberId`** — you will need it for keep-alive messages. If `success` is `false`, inspect the `error` object for the failure reason (e.g., subscription limit exceeded, invalid token).

## 5.3 Keep-Alive

The platform periodically probes your member with `CalculationMemberKeepAliveEvent` messages to verify liveness.
You **must** respond to each probe with an `EventAckResponse`. **Server-initiated keep-alive probe** (Server → Client): ```json { "id": "", "memberId": "" } ``` **Required response** (Client → Server): ```json { "id": "", "sourceEventId": "", "success": true } ``` You may also send **client-initiated keep-alive** messages to confirm your own liveness. The server will respond with an `EventAckResponse`. **Timing parameters** (server-side defaults): | Parameter | Default | Description | |---|---|---| | Keep-alive probe interval | 1,000 ms | How often the server probes | | Max idle interval | 3,000 ms | How long before a member is marked as not alive | | Keep-alive check timeout | 1,000 ms | How long the server waits for a probe response | A member is marked not alive when a probe times out (keep-alive check timeout, default 1,000 ms) **and** the max idle interval (default 3,000 ms) has been exceeded since the last successful probe response. Both conditions must hold — a single slow probe within the idle window does not mark the member dead. **If your member is marked as not alive, the platform will not route requests to it.** The member remains registered but idle. Responding to a subsequent keep-alive probe restores the alive status. > ⚠️ **Critical**: Failing to respond to keep-alive probes will cause your member to be marked as dead. Ensure your keep-alive response handler is fast and non-blocking. --- # 6. Handling Processor Requests When an entity reaches a workflow transition with an externalized processor configured to match your member's tags, the platform sends an `EntityProcessorCalculationRequest`. 
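Before looking at the wire schemas, it helps to see the shape of a typical client: one member usually serves several processors, dispatching on the `processorName` field carried in each request. A minimal sketch of that registry pattern — `ProcessorRegistry` and the handler names are hypothetical, and the entity payload is reduced to a `String` for brevity:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class ProcessorRegistry {

    // Map processorName -> business-logic handler. A handler receives the
    // entity JSON (a plain String here) and returns the modified JSON.
    private final Map<String, UnaryOperator<String>> handlers = new HashMap<>();

    public void register(String processorName, UnaryOperator<String> handler) {
        handlers.put(processorName, handler);
    }

    // Dispatch a request by its processorName; an unknown name is a
    // configuration error and should surface as a success=false response.
    public String dispatch(String processorName, String entityJson) {
        UnaryOperator<String> handler = handlers.get(processorName);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for processor: " + processorName);
        }
        return handler.apply(entityJson);
    }
}
```

The same pattern extends naturally to criteria handlers, keyed by `criteriaName`.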
## 6.1 Request Schema ```json { "id": "", "requestId": "", "entityId": "", "processorId": "", "processorName": "", "transactionId": "", "workflow": { "id": "", "name": "" }, "transition": { "id": "", "name": "", "stateFrom": "", "stateTo": "" }, "parameters": { /* arbitrary JSON configured on the processor */ }, "payload": { "type": "TREE", "data": { /* entity data as JSON — present only if attachEntity=true */ }, "meta": { /* entity metadata */ } } } ``` **Key fields**: - `requestId` — You **must** echo this back in the response for correlation. - `entityId` — The entity being processed. Echo this back. - `processorName` — Use this to dispatch to different business logic handlers. - `parameters` — Arbitrary JSON configured on the processor in the workflow definition (the `context` field). Use for passing configuration to your handler. - `payload.data` — The entity data. Only present when `attachEntity` is `true` in the workflow configuration. > 💡 **Auth context**: The CloudEvent envelope for this request also carries auth context extension attributes (`authtype`, `authid`, `authclaims`) identifying the principal whose action triggered the workflow. See [Section 8](#8-auth-context-on-incoming-requests) for details on how to extract them. ## 6.2 Response Schema ```json { "id": "", "requestId": "", "entityId": "", "success": true, "payload": { "type": "TREE", "data": { /* modified entity data to write back */ } } } ``` **Rules**: 1. **`requestId`** must exactly match the value from the request. 2. **`entityId`** must exactly match the value from the request. 3. If you set `success: true`, the platform applies your `payload.data` to the entity. 4. If you set `success: false`, the platform treats this as a processing failure. Include an `error` object. 5. The `payload` field is optional. If omitted (or `payload.data` is null), no data modification occurs. 
## 6.3 Error Response ```json { "id": "", "requestId": "", "entityId": "", "success": false, "error": { "code": "BUSINESS_ERROR", "message": "Detailed error description", "retryable": true } } ``` The `error.retryable` flag tells the platform whether it should retry the request on a different member (if a retry policy is configured). Set to `true` for transient failures and `false` for permanent failures. --- # 7. Handling Criteria Requests When a workflow transition has an externalized criterion configured as a `function`, the platform sends an `EntityCriteriaCalculationRequest`. ## 7.1 Request Schema ```json { "id": "", "requestId": "", "entityId": "", "criteriaId": "", "criteriaName": "", "target": "TRANSITION", "transactionId": "", "workflow": { "id": "", "name": "" }, "transition": { "id": "", "name": "", "stateFrom": "", "stateTo": "" }, "processor": { "id": "", "name": "" }, "parameters": { /* arbitrary JSON */ }, "payload": { "type": "TREE", "data": { /* entity data */ } } } ``` **The `target` field** indicates what the criterion is attached to: | Target | Meaning | Available Context | |---|---|---| | `WORKFLOW` | Workflow-level criterion (selects which workflow applies) | `workflow` | | `TRANSITION` | Transition-level criterion (should this transition fire?) | `workflow`, `transition` | | `PROCESSOR` | Processor-level criterion (should this processor run?) | `workflow`, `transition`, `processor` | | `NA` | Reserved for future use | — | > 💡 **Auth context**: Like processor requests, criteria requests also carry auth context extension attributes on the CloudEvent envelope. See [Section 8](#8-auth-context-on-incoming-requests). ## 7.2 Response Schema ```json { "id": "", "requestId": "", "entityId": "", "success": true, "matches": true, "reason": "Entity meets all validation criteria" } ``` **Key fields**: - `requestId` — Must exactly match the request. - `entityId` — Must exactly match the request. 
- `matches` — The boolean result: `true` means the criterion is satisfied (transition fires / processor runs), `false` means it is not. - `reason` — Optional human-readable explanation (useful for debugging). If `success: false`, the platform treats it as a criteria evaluation failure (the criterion evaluates to `false` by default). --- # 8. Auth Context on Incoming Requests The platform attaches [CloudEvents Auth Context extension](https://github.com/cloudevents/spec/blob/main/cloudevents/extensions/authcontext.md) attributes to every `EntityProcessorCalculationRequest` and `EntityCriteriaCalculationRequest`. These attributes identify the authenticated principal whose action triggered the workflow execution (e.g., the user who created or updated the entity). ## 8.1 Extension Attributes The auth context is carried as CloudEvent extension attributes in the Protobuf `attributes` map — **not** inside the JSON `text_data` payload. | Attribute | Type | Required | Description | |---|---|---|---| | `authtype` | String | YES | Principal type. One of: `user`, `service_account`, `system`, `unauthenticated`, `unknown` | | `authid` | String | NO | Unique identifier of the principal (UUID). Absent for `system` or `unauthenticated`. | | `authclaims` | String | NO | JSON string containing claims about the principal (e.g., `legalEntityId`, `roles`). Does not contain credentials. | ## 8.2 Auth Type Values | `authtype` Value | Meaning | |---|---| | `user` | A regular authenticated user (JWT-based login) | | `service_account` | A machine-to-machine (M2M) technical user | | `system` | An internal platform trigger (no user context, e.g., scheduled transitions) | | `unauthenticated` | No authentication context was available | | `unknown` | Reserved for future use | ## 8.3 Extracting Auth Context (Java/Kotlin) The attributes are available in the Protobuf CloudEvent's `attributes` map. 
The keys are the attribute names listed above (no prefix):

```java
// From the gRPC StreamObserver.onNext(value):
String authType = value.getAttributesMap().get("authtype").getCeString();

String authId = value.getAttributesMap().containsKey("authid")
    ? value.getAttributesMap().get("authid").getCeString()
    : null;

String authClaimsJson = value.getAttributesMap().containsKey("authclaims")
    ? value.getAttributesMap().get("authclaims").getCeString()
    : null;

// Parse claims if present
if (authClaimsJson != null) {
    Map<String, Object> claims = objectMapper.readValue(authClaimsJson, Map.class);
    String legalEntityId = (String) claims.get("legalEntityId");
    List<String> roles = (List<String>) claims.get("roles"); // may be null for plain IUser
}
```

The exact accessor depends on your gRPC tooling — in Go, use the generated message's `GetAttributes()` method; in Python, dict-like indexing on `.attributes`. See your language's generated proto bindings.

## 8.4 Example Claims JSON

```json
{
  "legalEntityId": "acme-corp",
  "roles": ["USER", "SUPER_USER"]
}
```

For `service_account` (M2M) users:

```json
{
  "legalEntityId": "acme-corp",
  "roles": ["M2M"]
}
```

## 8.5 Use Cases

- **Audit logging**: Record which user triggered the processing for compliance.
- **Authorization decisions**: Apply different business logic based on the caller's roles or legal entity.
- **Multi-tenant isolation**: Verify the triggering principal belongs to the expected tenant.
- **Debugging**: Trace processing failures back to the originating user action.

> ⚠️ **Note**: The `authclaims` field never contains credentials (passwords, tokens, secrets). It contains only identity and authorization metadata.

---

# 9. Workflow Configuration

Your calculation member does not exist in isolation — it is invoked by workflow configurations on the platform side. This section describes how workflows reference externalized processors and criteria, so you understand the relationship between your member's tags/handlers and the platform configuration.
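Tying the two sides together: a member receives a request only when its join tags cover the processor's `calculationNodesTags` (comma/semicolon-separated, case-insensitive, superset semantics — see the join handshake in Section 5.2). A sketch of that eligibility rule, with `TagRouting` as a hypothetical helper:

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class TagRouting {

    // Parse the comma/semicolon-separated calculationNodesTags string,
    // lowercasing each tag (tags are lowercased server-side).
    static Set<String> parseTags(String tags) {
        return Arrays.stream(tags.split("[,;]"))
                .map(String::trim)
                .filter(t -> !t.isEmpty())
                .map(t -> t.toLowerCase(Locale.ROOT))
                .collect(Collectors.toSet());
    }

    // A member is eligible when its tag set is a superset of the tags
    // configured on the processor/criterion.
    static boolean eligible(Set<String> memberTags, String calculationNodesTags) {
        return memberTags.containsAll(parseTags(calculationNodesTags));
    }
}
```

If a request never arrives, this is the first check to reproduce locally: parse both tag sets and verify the superset relation holds (see also the troubleshooting table in Section 13).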
## 9.1 Externalized Processor in Workflow JSON ```json { "workflows": [{ "version": "1", "name": "my-workflow", "initialState": "start", "states": { "start": { "transitions": [{ "name": "process-data", "next": "processed", "manual": false, "processors": [{ "name": "my-processor-function", "executionMode": "SYNC", "config": { "attachEntity": true, "calculationNodesTags": "my-processor-tag", "responseTimeoutMs": 60000, "retryPolicy": "FIXED", "context": "{\"key\": \"value\"}" } }] }] }, "processed": {} } }] } ``` ## 9.2 Processor Configuration Fields | Field | Type | Default | Description | |---|---|---|---| | `name` | string | — | **Required.** The processor name. Sent as `processorName` in the request. | | `executionMode` | string | — | **Required.** One of `SYNC`, `ASYNC_SAME_TX`, `ASYNC_NEW_TX`. | | `config.attachEntity` | boolean | `true` | Whether to send entity data in the request payload. | | `config.calculationNodesTags` | string | `""` | Comma/semicolon-separated tags. Only members whose tags are a superset are eligible. | | `config.responseTimeoutMs` | long | `60000` | How long the platform waits for your response before timing out. | | `config.retryPolicy` | string | `FIXED` | `NONE` — no retry. `FIXED` — retry with fixed delay (default: 3 retries, 500ms delay). | | `config.context` | string | `null` | Arbitrary string passed as `parameters` in the request. Use for handler-specific configuration. | | `config.asyncResult` | boolean | `false` | Enable async response processing (advanced). | | `config.crossoverToAsyncMs` | long | `5000` | Time before switching from sync to async response handling (advanced). | ## 9.3 Execution Modes | Mode | Behavior | |---|---| | `SYNC` | The workflow engine waits for your response within the same transaction. The transition completes only after your response is applied. | | `ASYNC_SAME_TX` | The engine sends the request and can process other work. Your response is applied within the same entity transaction. 
| | `ASYNC_NEW_TX` | Like `ASYNC_SAME_TX`, but your response is applied in a new transaction. Useful for long-running computations. | > For most use cases, **`SYNC`** is the simplest and recommended starting point. ## 9.4 Externalized Criteria (Function) in Workflow JSON ```json { "transitions": [{ "name": "conditional-transition", "next": "target-state", "manual": false, "criterion": { "type": "function", "function": { "name": "my-criteria-function", "config": { "attachEntity": true, "calculationNodesTags": "my-processor-tag", "responseTimeoutMs": 5000, "retryPolicy": "NONE" } } } }] } ``` Criteria functions use the same `config` fields as processors (except `asyncResult` and `crossoverToAsyncMs`, which are not applicable to criteria). ## 9.5 Retry Policies | Policy | Behavior | |---|---| | `NONE` | No retry. If the member fails or times out, the processing fails. | | `FIXED` | Retries up to N times (default: 3) with a fixed delay (default: 500ms) between retries. Each retry attempts a **different** member if available (the failed member is excluded from selection). | --- # 10. BaseEvent Schema All events on the stream extend the `BaseEvent` schema: ```json { "id": "", "success": true, "error": { "code": "", "message": "", "retryable": false }, "warnings": [""] } ``` - `id` — Every event must have a unique ID (UUID recommended). - `success` — Defaults to `true`. Set to `false` to indicate an error. - `error` — Only relevant when `success` is `false`. The `code` and `message` fields are required within the error object. - `warnings` — Optional array of warning strings. --- # 11. Production Robustness ## 11.1 Reconnection Strategy gRPC streams can be terminated by network issues, server restarts, or load balancer timeouts. Implement automatic reconnection: 1. **Detect disconnection** via `onError` or `onCompleted` on the response observer. 2. **Back off exponentially** — start at 1 second, cap at 60 seconds. 3. 
**Re-join after reconnect** — every new stream requires a fresh `CalculationMemberJoinEvent`. 4. **Refresh the JWT token** before reconnecting if it is near expiry. ``` ┌─────────┐ onError/onCompleted ┌──────────┐ delay ┌──────────────┐ success ┌──────┐ │ Connected│ ──────────────────────► │ Backoff │ ────────► │ Reconnecting │ ────────────► │ Join │ └─────────┘ └──────────┘ └──────────────┘ └──────┘ ▲ │ failure │ │ ▼ │ │ ┌──────────┐ │ │ │ Backoff │ (increase delay) │ │ └──────────┘ │ └────────────────────────────────────────────────────────────────────────────────────────┘ Greet received ``` ## 11.2 Thread Safety The gRPC `StreamObserver` is **not thread-safe**. If your business logic runs on multiple threads, synchronize all calls to `observer.onNext()`: ```java synchronized (requestObserver) { requestObserver.onNext(cloudEvent); } ``` ## 11.3 Response Timeouts Your client must respond within the configured `responseTimeoutMs` (default: 60 seconds). If you exceed this: - The platform considers the request failed. - If retry policy is `FIXED`, the platform retries with a different member. - Late responses are silently discarded. Design your business logic to complete well within the timeout, accounting for network latency. ## 11.4 Idempotency In edge cases (e.g., network partitions, retries), you may receive the same request more than once. Use the `requestId` as an idempotency key to avoid processing the same request twice. ## 11.5 Graceful Shutdown When shutting down your client: 1. Stop accepting new requests (drain in-flight work). 2. Complete any pending responses and send them. 3. Close the gRPC stream via `requestObserver.onCompleted()`. 4. Shut down the `ManagedChannel` with a grace period: ```java channel.shutdown().awaitTermination(10, TimeUnit.SECONDS); ``` The platform will detect the stream closure and broadcast a member-offline event to the cluster. Pending requests that were in-flight will time out and may be retried on other members. 
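The backoff schedule from Section 11.1 (start at 1 second, cap at 60 seconds) can be sketched as a pure function. `ReconnectBackoff` is a hypothetical helper, and doubling between attempts is an assumption — the section fixes only the start and the cap:

```java
public class ReconnectBackoff {

    private static final long INITIAL_MS = 1_000;  // start at 1 second
    private static final long MAX_MS = 60_000;     // cap at 60 seconds

    // Delay before reconnect attempt n (0-based): 1s, 2s, 4s, ..., capped at 60s.
    static long delayMs(int attempt) {
        long delay = INITIAL_MS << Math.min(attempt, 6); // clamp the shift near the cap
        return Math.min(delay, MAX_MS);
    }
}
```

In practice, add random jitter to each delay so that a fleet of members restarting together does not reconnect in lockstep.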
## 11.6 Multiple Members You can run **multiple calculation member instances** (same or different processes) with the same tags for horizontal scaling and high availability. The platform selects one eligible member per request, preferring members connected to the local cluster node. Running at least two members ensures continued processing if one goes down. ## 11.7 Monitoring Track these metrics in your client: - **Request count** by type (processor vs. criteria) and result (success vs. failure) - **Response latency** (time from receiving request to sending response) - **Keep-alive response time** - **Reconnection count and frequency** - **Stream errors** (by gRPC status code) --- # 12. Quick Reference — Message Flow ``` Client Server │ │ │──── startStreaming() ─────────────────────────►│ (open bidirectional stream) │ │ │──── CalculationMemberJoinEvent ───────────────►│ (register with tags) │◄─── CalculationMemberGreetEvent ───────────────│ (server confirms, assigns memberId) │ │ │◄─── CalculationMemberKeepAliveEvent ───────────│ (periodic heartbeat probe) │──── EventAckResponse ─────────────────────────►│ (ack the probe) │ │ │◄─── EntityProcessorCalculationRequest ─────────│ (process this entity) │──── EntityProcessorCalculationResponse ───────►│ (here's the result) │ │ │◄─── EntityCriteriaCalculationRequest ──────────│ (evaluate this criterion) │──── EntityCriteriaCalculationResponse ────────►│ (matches: true/false) │ │ │──── CalculationMemberKeepAliveEvent ──────────►│ (client-initiated heartbeat) │◄─── EventAckResponse ─────────────────────────│ (server acks) │ │ ``` --- # 13. Troubleshooting | Symptom | Likely Cause | Fix | |---|---|---| | `UNAUTHENTICATED` on stream open | Missing/invalid/expired JWT token | Refresh JWT before connecting. Ensure `Authorization: Bearer ` header. | | `NOT_FOUND` after JWT validation | User not found in Cyoda for the given JWT | Verify user enrollment and legal entity configuration. 
| | Greet event has `success: false` | Subscription limit exceeded (max client nodes) | Check your subscription plan limits. | | Member marked as not alive | Keep-alive responses too slow or missing | Ensure non-blocking, fast keep-alive handler. Check network latency. | | Requests not arriving | Tags mismatch | Verify your member's tags are a superset of the workflow processor's `calculationNodesTags`. Tags are case-insensitive. | | Requests not arriving | Member on wrong legal entity | Requests only route to members in the same legal entity as the entity owner. | | Request timeout | Business logic too slow | Optimize processing time or increase `responseTimeoutMs` in workflow config. | | Duplicate requests | Retry policy triggered | Implement idempotency using `requestId`. | | Stream drops unexpectedly | Server restart, network issue, idle timeout | Implement reconnection with exponential backoff (Section 11.1). | | `authtype` is `system` unexpectedly | Workflow triggered by an internal platform action (e.g., scheduled transition) or no user context was available | This is expected for system-initiated workflows. If you expect a user context, verify the originating API call is authenticated. | | `authclaims` is missing | The triggering principal is a plain `IUser` without extended claims, or the auth type is `system`/`unauthenticated` | Only `user` and `service_account` auth types include claims. Check `authtype` before parsing claims. | --- ## build/modeling-entities.md # Modeling entities Design patterns for entity schemas — boundaries, evolution, and validation. Modeling well in Cyoda comes down to drawing the right boundaries between entities, letting the schema grow with the data, and treating validation as the job of the platform rather than the application layer. This page covers the patterns worth knowing before you ship a first model. ## One entity per noun The simplest rule: every domain noun that has an independent lifecycle is its own entity. 
An `Order` has its own states, its own history, and its own audit trail; so does a `Customer`. They relate via references, not by embedding. A useful test: *does this thing change on its own clock?* If yes, it's an entity. Line items on an order often do not — they live inside the order's state transitions — so they stay embedded. Fulfilment events on an order do — they have their own lifecycle — so they become their own entity, referenced by the order. ## Two modes: discover or lock Cyoda gives you two structural contracts for an entity model. The right choice depends on how exposed the model is to outside producers. **Discover (loose).** For a new model you do not write a schema file; you post a representative sample (or a batch) and Cyoda records the fields, their types, and the shape of nested arrays and objects. New samples **widen** the schema — a field seen as `INTEGER` once and as `[INTEGER, STRING]` later becomes polymorphic, and array widths grow to fit observed data. Use discover mode when you are prototyping, exploring a dataset, or have not yet fixed the contract with upstream producers. **Lock (strict).** Once the shape is stable, lock the model. After locking, any incoming entity that does not structurally match the current schema is **rejected**. This is the right default for production systems with external interface contracts — a trading system receiving FpML confirmations, a payments pipeline consuming an agreed message format, a regulated workflow whose processor logic is tailored to a specific shape. In those contexts a silently widened schema is a latent bug at best and a compliance failure at worst: if an upstream does an uncoordinated FpML version upgrade, you want the new-shape messages rejected at the door, not accepted and fed into processors that were built against the old shape. These two modes together cover the spectrum. 
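The widening behaviour in discover mode can be illustrated with a toy merge function — a sketch for intuition only, not the platform's implementation; the type names mirror the model types described here, and the merge logic is an assumption about the general rule (each sample can only add observed types to a field, never remove them):

```python
# Toy model of discover-mode schema widening (illustration only, NOT the
# platform implementation): every sample can only widen a field's
# observed type set; narrowing never happens within a version.
PY_TO_MODEL = {bool: "BOOLEAN", int: "INTEGER", float: "DOUBLE", str: "STRING"}

def widen(schema: dict, sample: dict) -> dict:
    for field, value in sample.items():
        observed = PY_TO_MODEL.get(type(value), "OBJECT")
        schema.setdefault(field, set()).add(observed)
    return schema

schema: dict = {}
widen(schema, {"amount": 100})       # amount first observed as INTEGER
widen(schema, {"amount": "100.00"})  # later observed as STRING -> polymorphic
```

After the second sample, `schema["amount"]` holds the polymorphic pair `{"INTEGER", "STRING"}`; locking the model is what stops this set from growing further.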
Cyoda deliberately does **not** layer a Confluent-style forward/backward/full compatibility taxonomy on top: "compatible" is not a platform-generic concept when the workflow (your app code) is part of the contract. Only the application can judge whether adding an optional field, widening an integer to a string, or dropping a field leaves its transition logic valid. The platform contract is the simpler and stricter pair: loose discovery, or lock-and-reject. ## Evolving a model You evolve during discover mode by sending data: fields appear, types widen, array widths grow. None of this is surprising until you lock. After lock, evolution is **application-controlled**. The model has a `modelVersion` that the application increments when it wants a new structural contract. Each revision of each entity is tagged at write time with the model version in force. Revisions are immutable: old revisions are **not** re-validated, re-cast, or rewritten when a new model version appears. A consumer reading an old revision reads it under its original version; interpretation across versions is application logic. Concretely: - **Add fields (pre-lock).** Send a sample that includes them; the schema widens automatically. - **Widen types (pre-lock).** Cyoda handles polymorphic fields via a type hierarchy (e.g. `BYTE → SHORT → INT → LONG`; `FLOAT → DOUBLE → BIG_DECIMAL`). See the [Trino SQL reference](/reference/trino/) for the complete primitive lattice, including the temporal-type resolution hierarchy. - **Lock.** Freeze evolution once the shape is stable. The default stance for anything with external producers. - **Bump `modelVersion` and register the new schema (post-lock).** A locked model is a frozen contract; to accommodate a changed structure the application bumps `modelVersion` and **registers the new schema** for that version. 
Registration uses the same mechanism as initial discovery: submit a comprehensive set of representative samples that span the intended shape, and Cyoda infers the schema from them. The samples themselves are **not stored** — they exist only to define the shape of the new version. Once registered, lock the new version so it too is a hard contract. If data written under an older version needs to appear under the new shape, migrate it explicitly via app code; the platform takes no stance on whether the new shape is "compatible" with the old — that judgment belongs to the workflow that consumes the data. Things to plan explicitly: - **Renames.** Cyoda does not rename a field for you; if you rename in the source, you get a new field alongside the old one. Migrate existing data deliberately. - **Deletes and deprecations.** Same story — Cyoda will not silently drop or re-interpret a field across a version boundary. The application owns the migration. - **Narrowing types.** Once the schema has observed `STRING` in a field, you cannot narrow it to `INTEGER` within the same version. To narrow, introduce a new `modelVersion` with the stricter type and migrate the data. ## Who validates what Cyoda validates **structure** and **types** against the model: required shapes, element types, array constraints, polymorphic compatibility. That is free; you do not write those validators. Your application is responsible for **semantic** validation that lives inside transitions: "the order total must equal the sum of line items", "the payment currency must match the customer's currency". Those belong in workflow criteria and processors where they can fail a transition and leave the old revision intact. ## Anti-patterns - **The god-entity.** One model that tries to represent everything. Split it along lifecycle boundaries; lifecycles that evolve independently want separate entities. - **Premature generalisation.** A model that tries to anticipate every future field. 
Let the schema discover itself and lock when you are ready. - **Shadow workflows.** Implementing state transitions as boolean flags on the entity. Put states in a workflow; that's what workflows are for. ## Where to go next - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the conceptual model behind an entity. - [Entity model export](/reference/entity-model-export/) — the wire format of a SIMPLE_VIEW export, node descriptors, type descriptors, and the JSON Schema for the response. - [JSON schema reference](/reference/schemas/) — the REST-API message schemas generated from cyoda-go. - [Workflows and events](/concepts/workflows-and-events/) — how state and transitions are configured. --- ## build/searching-entities.md # Searching entities Query sets of entities over REST — direct vs async modes, predicates, pagination, and historical reads. Cyoda exposes search over REST for any query that returns a set of entities. Use it when you need more than one entity back, when the filter goes beyond a single id lookup, or when you want to scope by workflow state. For single-entity reads, stay on the CRUD endpoints in [working with entities](/build/working-with-entities/); for cross-entity analytics, use [SQL](/build/analytics-with-sql/); for event-driven compute, use [gRPC compute nodes](/build/client-compute-nodes/). ## Two query modes Cyoda splits search into **Immediate** and **Background** modes. Pick by expected result size and urgency; the filter grammar is identical. - **Immediate** (API term: `direct`) — synchronous. The request returns matching entities in the response body. Result size is **capped**, so `direct` is the right default only when you know the filter produces a bounded, small set: a UI list, a lookup, a small report. If a query hits the cap, switch it to `async`. - **Background** (API term: `async`) — queued. The request returns a job handle; poll it to retrieve results. Result size is **unbounded** and results are **paged**. 
On the Cassandra-backed tier (Cyoda Cloud, or a licensed Enterprise install), `async` runs distributed across the cluster: for a fixed query shape, throughput scales roughly linearly with the number of nodes.

The decision tree is short:

- Small bounded result, UI-facing → `direct`.
- Might be large, or can tolerate a second or two of queuing (exports, reports, batch jobs) → `async`.
- Hitting the cap or the request timeout on `direct` → `async`.

## A direct search

Filter by a combination of entity fields and workflow state:

```bash
curl -X POST http://localhost:8080/api/search/direct/orders/1 \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  -d '{ "filter": { "state": "submitted", "customerId": "CUST-7" } }'
```

The path is `/api/search/direct/{entityName}/{modelVersion}`. The response is the list of matching entities, each with its current state, revision, and timestamps.

## An async search

Submit the search to `/api/search/async/{entityName}/{modelVersion}` and capture the handle:

```bash
curl -X POST http://localhost:8080/api/search/async/orders/1 \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  -d '{ "filter": { "state": "submitted" } }'
```

Poll the returned `jobId` until the job is ready, then fetch pages (`pageNumber` is zero-indexed; `pageSize` caps the page):

```
GET /api/search/async/{jobId}/status
GET /api/search/async/{jobId}/results?pageNumber=0&pageSize=1000
GET /api/search/async/{jobId}/results?pageNumber=1&pageSize=1000
```

A single `jobId` can be paged repeatedly until the result expires; expiry is controlled per deployment.
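The poll-then-page loop can be sketched as a small client helper. The endpoint paths come from this page; the status field name (`status` with value `READY`) and the empty-page termination condition are assumptions about the response shape — verify them against the API reference before relying on them:

```python
import time

def collect_async_results(fetch, job_id: str, page_size: int = 1000) -> list:
    """Poll an async search job until ready, then drain its pages.

    `fetch(path)` is any callable that performs an authenticated GET
    and returns the parsed JSON body.
    """
    # Poll the job handle until it reports completion.
    # NOTE: the "status"/"READY" shape is an assumption, not confirmed docs.
    while fetch(f"/api/search/async/{job_id}/status").get("status") != "READY":
        time.sleep(0.5)

    # Page through results; pageNumber is zero-indexed, pageSize caps a page.
    results, page = [], 0
    while True:
        batch = fetch(
            f"/api/search/async/{job_id}/results"
            f"?pageNumber={page}&pageSize={page_size}"
        )
        if not batch:  # an empty page means the job is drained
            break
        results.extend(batch)
        page += 1
    return results
```

Because a completed `jobId` is stable for its retention window, the paging loop can safely be restarted from any page if it is interrupted.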
### Cancelling a job

If a job is no longer needed — the user navigated away, a replacement query was submitted, the deployment is shutting down — cancel it rather than letting it run to completion:

```bash
curl -X DELETE http://localhost:8080/api/search/async/{jobId}/cancel \
  -H "Authorization: Bearer $TOKEN"
```

Cancellation is cooperative: in-flight work is stopped at the next safe point and any partial results for that `jobId` are discarded.

## Filter shape

The filter is a JSON document whose fields are entity field paths, metadata (`state`, `createdAt`, …), or workflow labels. The authoritative operator grammar — equality, comparisons, ranges, set membership, AND/OR combinators, nested-field access — lives in the [REST API reference](/reference/api/). The shape used in the simple examples above (flat field→value map) is the equality short form; use the full object form when you need operators:

```json
{
  "and": [
    { "field": "state", "eq": "submitted" },
    { "field": "amount", "gte": 1000 }
  ]
}
```

For the full predicate grammar — every operator, nesting rule, and function — run `cyoda help search` against your binary.

## Historical reads with `pointInTime`

Every search accepts a `pointInTime` parameter to run against the world as it existed at a given timestamp. Each entity maintains a history of revisions; a point-in-time query returns the set of entities that would have matched at that timestamp, evaluated against the revision that was current at the time.
```bash
curl -X POST http://localhost:8080/api/search/direct/orders/1 \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOKEN" \
  -d '{ "pointInTime": "2026-03-01T00:00:00Z", "filter": { "state": "submitted", "customerId": "CUST-7" } }'
```

This is the primary way to answer audit and regulatory questions from REST — *what did this customer's open orders look like at quarter close?* The Trino surface exposes the same capability as a column named `point_time` (snake-case, matching SQL convention); for the analytical form, see [`point_time` in analytics](/build/analytics-with-sql/).

## Paging and sort (async)

- `pageSize` and `pageNumber` are query parameters on `/search/async/{jobId}/results`; they apply at result-fetch time, not at job submission. `pageNumber` is zero-indexed.
- Sort is not documented on the REST async surface at this release; results are returned in insertion order.
- A completed `jobId` is stable for its retention window — page reads are idempotent.

## Performance notes

- Scope by `state` or a high-selectivity field first — the workflow state is indexed on every entity and is almost always the right first predicate.
- Prefer `async` as soon as the result set might be thousands of entities; the distributed execution on the Cassandra tier makes it cheaper per entity than a series of `direct` pages.
- Avoid open-ended `pointInTime` scans across every revision — anchor the query at a specific timestamp or a short window.

## Where to go next

- [REST API reference](/reference/api/) — authoritative search payload schema, operator grammar, status and result endpoints.
- [Working with entities](/build/working-with-entities/) — single-entity CRUD and transitions on the same REST surface.
- [Analytics with SQL](/build/analytics-with-sql/) — heavy analytical work, cross-entity joins, historical scans via `point_time`.
- [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the audit/history model behind `pointInTime`. --- ## build/testing-with-digital-twins.md # Testing with digital twins In-memory mode as a test harness; running simulations at volumes exceeding production. In Cyoda, a "digital twin" means the same application code — workflows, criteria, processors — runs identically on every storage tier. Non-functional properties (persistence, latency, concurrency model) differ; business logic does not. cyoda-go's **in-memory mode** is the built-in test harness. It runs the entire platform — entity store, workflow engine, API surfaces — in a single process with no external dependencies and no disk writes. It is the fastest way to exercise workflow behaviour in CI, and the cheapest way to run large scenario simulations against your application logic. ## In-memory mode as a test harness Start cyoda-go with the in-memory profile (or `go run ./cmd/cyoda` against the default in-memory config). Concretely: set `CYODA_STORAGE_BACKEND=memory`, or leave it unconfigured — memory is the application default until `cyoda init` is run. Point your tests at it; tear it down between cases; no database to seed, no files to clean up. Properties that matter for testing: - **Deterministic.** Same inputs, same state, same result. - **Fast.** Sub-millisecond latencies mean you can run thousands of transitions per test. - **Isolated.** Nothing survives the process; tests cannot leak state into each other. - **API-identical.** Your application code calls the same REST and gRPC endpoints it will call in production. This makes in-memory mode suitable for unit-level tests of your processors, integration tests of your whole workflow, and smoke tests in CI. ## Digital-twin simulations Beyond unit tests, in-memory mode is a **digital-twin runtime**: a behavioural clone of production that you can drive at volumes and rates your real system could never sustain. 
Because there is no durable backend, no rate-limited external API, and no shared-state concern, you can: - Replay a year of historical transactions in minutes. - Fan out thousands of concurrent scenario runs across CPUs. - Stress the workflow with injected event streams that exceed production peak by multiples. The application logic — workflows, criteria, processors — is the same application that runs against PostgreSQL or Cyoda Cloud. The **only** things that differ between in-memory and durable tiers are non-functional: persistence, latency profile, and scale ceiling. That's the property that makes the in-memory run a useful twin of the durable one. ## What stays the same, what changes Same: - API contracts (REST, gRPC today; Trino upcoming — see the [Trino reference](/reference/trino/)). - Workflow semantics: states, transitions, criteria, processors. - Event ordering within a transition. - Audit-trail shape. Different: - Durability (none in-memory). - Concurrency model (no multi-node contention). - Performance envelope (faster, lower variance). If your test depends on *durable* behaviour — restart recovery, cross-node consensus, cross-process replay — graduate it to a SQLite or PostgreSQL run for that suite. For everything else, in-memory is the default. ## Examples Worked examples live in [cyoda-go/examples](https://github.com/cyoda-platform/cyoda-go/tree/main/examples), including scenario-simulation runners you can adapt as a template. When a cyoda-go release ships a new example, this page links it. ## Where to go next - [Digital twins and the growth path](/concepts/digital-twins-and-growth-path/) — the concept behind same-app-different-tier. - [Run → overview](/run/) — choosing a tier for the other side of the twin. --- ## build/workflows-and-processors.md # Workflows and processors State-machine design, transitions, and external processors — with a preference for gRPC in compute nodes. > Understanding Cyoda JSON workflow configurations. 
## Overview Cyoda workflows define finite, distributed state machines that govern the lifecycle of business entities in an event-driven environment. Each entity progresses through a sequence of states based on defined transitions, criteria, and processing rules. :::tip[Use gRPC for compute nodes] When implementing processors or criteria services, prefer gRPC over HTTP. gRPC preserves audit hygiene and simplifies authorization. See [APIs and surfaces](/concepts/apis-and-surfaces/) for the decision rationale. ::: The platform supports adaptable entity modeling, allowing business logic to evolve through configuration rather than implementation changes. Workflows declare the set of states, valid transitions, and associated processing steps while preserving immutable persistence for auditability. ## Workflow Architecture ### Core Components 1. **States**: Lifecycle stages of an entity 2. **Transitions**: Directed changes between states 3. **Criteria**: Conditional logic for transition eligibility 4. **Processors**: Executable logic triggered during transitions ## Configuration Schema You can find the workflow schema in the [API reference](/reference/api/). See the workflow import endpoints for complete schema specifications. Here we explain the structure and meaning of each element. ### Workflow Object ```json { "version": "1", "name": "Workflow Name", "desc": "Workflow description", "initialState": "StateName", "active": true, "criterion": {}, "states": {} } ``` #### Attributes - `version`: Workflow schema version - `name`: Identifier for the workflow. Must be unique per entity model. - `desc`: Detailed description of the workflow - `initialState`: Starting point for new entities - `active`: Indicates whether the workflow is active - `criterion`: Optional criterion for selecting which workflow applies to a given entity. Uses the same condition types as transition criteria (simple, group, function). 
When multiple workflows are defined for a model, the platform evaluates each workflow's criterion against the entity to determine which workflow governs it. - `states`: Map of state names to state definitions ### Multiple Workflows per Model An entity model can have multiple workflows, each with its own `criterion` at the workflow level. When an entity is created, the platform evaluates each active workflow's criterion to select the applicable workflow. The platform evaluates active workflows in the order they are defined and uses the first whose criterion matches (or the first with no criterion, which matches unconditionally). This allows different processing paths for different categories of entities within the same model. ## Import and Export Workflows are managed via import and export API endpoints on the entity model. The import request supports three modes that control how existing workflows are reconciled with the payload: - **`MERGE`** (default): Incremental update. Workflows with matching names are updated; unspecified workflows remain unchanged. - **`REPLACE`**: Removes all existing workflows for the entity model and retains only the imported ones. Also deletes all unused processors and criteria. - **`ACTIVATE`**: Similar to REPLACE, but deactivates (rather than deletes) existing workflows and transitions not included in the import. Unused processors and criteria are preserved. See [API reference](/reference/api/) for endpoint details and the full request/response schemas. ## States States describe lifecycle phases for entities. Names must start with a letter and use only alphanumeric characters, underscores, or hyphens. ### Format ```json "StateName": { "transitions": [] } ``` #### Special States - **Initial state**: The initial state of a new entity - **Terminal States**: States with no outgoing transitions ## Transitions Transitions define allowed movements between states, optionally gated by conditions and supported by executable logic. 
### Format ```json { "name": "TransitionName", "next": "TargetState", "manual": true, "disabled": false, "criterion": {}, "processors": [] } ``` #### Attributes - `name`: Name of the transition (required) - `next`: Target state code (required) - `manual`: Determines if the transition is manual or automated (required) - `disabled`: Marks the transition as inactive - `criterion`: Optional condition for eligibility - `processors`: Optional processing steps ### Manual vs Automated Transitions Transitions may be either **manual** or **automated**, and are guarded by criteria that determine their eligibility. When an entity enters a new state, the first eligible automated transition is executed immediately within the same transaction. This continues recursively until no further **automated** transitions are applicable, resulting in a stable state. Each transition may trigger one or more attached processes, which can run synchronously or asynchronously, either within the current transaction or in a separate one. This forms the foundation for event flow automation, where processors may create or mutate entities in response, allowing a single transition to initiate a cascade of events and function executions across the system. `CYODA_MAX_STATE_VISITS` configures the per-state visit limit within a single cascade (default 10). A separate hard-coded safety cap of 100 steps limits total cascade depth across all states, preventing runaway automatic-transition chains. ## Criteria Criteria define logic that determines if a transition is permitted. A criterion can be one of five types: 1. **Simple**: Evaluates a single condition on entity data 2. **Group**: Combines multiple criteria with logical operators 3. **Function**: Calls an external function for evaluation (delegated to a calculation node via gRPC) 4. **Lifecycle**: Evaluates a condition on entity lifecycle properties (state, creation date, previous transition) 5. 
**Array**: Evaluates a condition against an array of values ### Simple Criteria Simple criteria evaluate a single condition directly on entity data using JSONPath expressions. They are executed directly on the processing node, without involving external compute nodes. ```json "criterion": { "type": "simple", "jsonPath": "$.amount", "operation": "GREATER_THAN", "value": 1000 } ``` #### Simple Criteria Attributes - `jsonPath`: JSONPath expression to extract the value from entity data - `operation`: Comparison operator (see [Operator Types](#operator-types) below). Also accepts the alias `operatorType`. - `value`: The value to compare against ### Group Criteria Group criteria combine multiple conditions using logical operators. ```json "criterion": { "type": "group", "operator": "AND", "conditions": [ { "type": "simple", "jsonPath": "$.status", "operation": "EQUALS", "value": "VALIDATED" }, { "type": "simple", "jsonPath": "$.amount", "operation": "GREATER_THAN", "value": 500 } ] } ``` #### Group Criteria Attributes - `operator`: Logical operator combining conditions (`AND`, `OR`, `NOT`) - `conditions`: Array of criteria (can be `simple`, `function`, `group`, `lifecycle`, or `array` types — supports arbitrary nesting) ### Function Criteria Function criteria delegate evaluation to an external compute node via gRPC. The function must return a boolean result. 
```json "criterion": { "type": "function", "function": { "name": "FunctionName", "config": { "attachEntity": true, "calculationNodesTags": "validation,data-quality", "responseTimeoutMs": 3000, "retryPolicy": "FIXED", "context": "optionalContext" }, "criterion": { "type": "simple", "jsonPath": "$.preCheckField", "operation": "EQUALS", "value": true } } } ``` #### Function Attributes - `name`: The name of the function to execute (required) - `config`: Configuration for the function call (optional): - `attachEntity`: Whether to pass the entity data to the function - `calculationNodesTags`: Comma-separated list of tags for routing to specific calculation nodes - `responseTimeoutMs`: Response timeout in milliseconds - `retryPolicy`: Retry policy for the function (e.g., `"FIXED"`) - `context`: Optional string parameter passed to the function for additional context or configuration. The `context` is passed "as is" with the event to the compute node. It can contain any sort of information that is relevant to the function's execution, in any format. The interpretation is up to the function itself. - `criterion`: Optional quick-exit criterion evaluated locally before calling the (potentially expensive) external function. If this local criterion evaluates to false, the function call is skipped entirely. Useful for avoiding unnecessary network round-trips when the result can be confidently determined from entity data. ### Lifecycle Criteria Lifecycle criteria evaluate conditions on entity lifecycle properties rather than entity data. ```json "criterion": { "type": "lifecycle", "field": "state", "operation": "EQUALS", "value": "VALIDATED" } ``` #### Lifecycle Criteria Attributes - `field`: Lifecycle field to evaluate: `state`, `creationDate`, or `previousTransition` - `operation`: Comparison operator - `value`: The value to compare against ### Array Criteria Array criteria evaluate a condition against an array of values. 
```json "criterion": { "type": "array", "jsonPath": "$.category", "operation": "EQUALS", "value": ["electronics", "software", "services"] } ``` #### Array Criteria Attributes - `jsonPath`: JSONPath expression to the field - `operation`: Comparison operator - `value`: Array of string values to match against ### Operator Types The following comparison operators are available for simple, lifecycle, and array criteria: **Basic Comparison:** `EQUALS`, `NOT_EQUAL`, `IS_NULL`, `NOT_NULL`, `GREATER_THAN`, `LESS_THAN`, `GREATER_OR_EQUAL`, `LESS_OR_EQUAL`, `BETWEEN`, `BETWEEN_INCLUSIVE` **String Operations (Case-Sensitive):** `CONTAINS`, `NOT_CONTAINS`, `STARTS_WITH`, `NOT_STARTS_WITH`, `ENDS_WITH`, `NOT_ENDS_WITH`, `MATCHES_PATTERN`, `LIKE` **Case-Insensitive String Operations:** `IEQUALS`, `INOT_EQUAL`, `ICONTAINS`, `INOT_CONTAINS`, `ISTARTS_WITH`, `INOT_STARTS_WITH`, `IENDS_WITH`, `INOT_ENDS_WITH` **State Tracking:** `IS_UNCHANGED`, `IS_CHANGED` ## Processors Processors implement custom logic to run during transitions. There are two types of processors: **externalized** (delegated to calculation nodes) and **scheduled** (delayed transitions). ### Externalized Processors Externalized processors delegate execution to a calculation node via gRPC. This is the most common processor type. ```json { "type": "externalized", "name": "ProcessorName", "executionMode": "SYNC", "config": { "attachEntity": true, "calculationNodesTags": "tag1,tag2", "responseTimeoutMs": 5000, "retryPolicy": "FIXED", "context": "optionalContext" } } ``` #### Externalized Processor Attributes - `type`: `"externalized"` (discriminator) - `name`: Name of the processor (required) - `executionMode`: Execution mode (see below). Default: `ASYNC_NEW_TX`. - `config`: Configuration for the processor call: - `attachEntity`: Whether to attach entity data to the processor call. Set to `true` if the processor needs access to the entity data (this is usually the case). 
- `calculationNodesTags`: Comma-separated list of tags for routing to specific calculation nodes - `responseTimeoutMs`: Response timeout in milliseconds - `retryPolicy`: Retry policy for the processor - `context`: Additional context passed to the processor - `asyncResult`: Whether to await the result asynchronously, outside of the transaction - `crossoverToAsyncMs`: Crossover delay in milliseconds to switch to asynchronous processing (effective only when `asyncResult` is true) #### Execution Modes - `SYNC`: Inline execution within the transaction. Runs immediately and blocks the current processing thread on the same node. - `ASYNC_SAME_TX`: Deferred within the current transaction. Commits or rolls back atomically with the triggering transition. - `ASYNC_NEW_TX`: Deferred execution in a separate, independent transaction. Default mode. Processors should be idempotent; failed ASYNC_NEW_TX processors may be retried. Synchronous executions run immediately and block the current processing thread on the same node, making them local and non-distributed. In contrast, asynchronous executions are scheduled for deferred processing and can be handled by any node in the cluster, enabling horizontal scalability and workload distribution, albeit with possibly somewhat higher latency. ### Scheduled Transition Processors Scheduled processors trigger a delayed transition after a configured time period. 
```json { "type": "scheduled", "name": "schedule_timeout", "config": { "delayMs": 3600000, "transition": "timeout", "timeoutMs": 7200000 } } ``` #### Scheduled Processor Attributes - `type`: `"scheduled"` (discriminator) - `name`: Name of the processor (required) - `config` (required): - `delayMs`: Delay in milliseconds before executing the transition (required) - `transition`: The name of the transition to execute after waiting (required) - `timeoutMs`: Timeout in milliseconds for executing the transition task, after which it will be expired (optional) ### Calculation Nodes Tags As described in the [Architecture](/architecture/cyoda-cloud-architecture/) section, the execution of processors and criteria is delegated to client compute nodes, i.e. your own infrastructure running your business logic. These nodes can be organized into groups and tagged based on their roles or capabilities. By optionally setting the `calculationNodesTags` property in a processor or criterion definition, you can direct execution to specific groups, giving you fine-grained control over workload distribution across your compute environment. ## Example: Payment Request Workflow This workflow models the lifecycle of a payment request, covering validation, matching, approval, and notification handling. It starts in the INVALID state, where the request is either amended or validated. If validation succeeds and a matching order exists, the request advances automatically to the SUBMITTED state. If not, it moves to PENDING, where it awaits a matching order or may be retried manually. Requests in SUBMITTED require an approval decision, leading either to APPROVED, which triggers asynchronous processing like payment message creation and ACK notifications, or to DECLINED, which emits a rejection (NACK) notification. Manual amend and retry transitions at key stages allow users or systems to correct or re-evaluate the request. The following section walks through the configuration step by step. 
![Payment Request Workflow](paymentRequestWorkflow) ### Step 1: Workflow Metadata ```json { "version": "1", "name": "Payment Request Workflow", "desc": "Payment request processing workflow with validation, approval, and notification states", "initialState": "INVALID", "active": true } ``` ### Step 2: Define States and Transitions Start by defining the overall structure of states and transitions. ```json { "version": "1", "name": "Payment Request Workflow", "desc": "Payment request processing workflow with validation, approval, and notification states", "initialState": "INVALID", "active": true, "states": { "INVALID": { "transitions": [ { "name": "VALIDATE", "next": "PENDING", "manual": false, "disabled": false }, { "name": "AMEND", "next": "INVALID", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "PENDING": { "transitions": [ { "name": "MATCH", "next": "SUBMITTED", "manual": false, "disabled": false }, { "name": "RETRY", "next": "PENDING", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "SUBMITTED": { "transitions": [ { "name": "APPROVE", "next": "APPROVED", "manual": true, "disabled": false }, { "name": "DENY", "next": "DECLINED", "manual": true, "disabled": false } ] }, "APPROVED": { "transitions": [] }, "DECLINED": { "transitions": [] }, "CANCELED": { "transitions": [] } } } ``` ### Step 3: Add Criteria We add criteria to the `VALIDATE` and `MATCH` transitions: ```json { "version": "1", "name": "Payment Request Workflow", "desc": "Payment request processing workflow with validation, approval, and notification states", "initialState": "INVALID", "active": true, "states": { "INVALID": { "transitions": [ { "name": "VALIDATE", "next": "PENDING", "manual": false, "disabled": false, "criterion": { "type": "function", "function": { "name": "IsValid", "config": { "attachEntity": true } } } }, { "name": "AMEND", "next": "INVALID", 
"manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "PENDING": { "transitions": [ { "name": "MATCH", "next": "SUBMITTED", "manual": false, "disabled": false, "criterion": { "type": "function", "function": { "name": "HasOrder", "config": { "attachEntity": true } } } }, { "name": "RETRY", "next": "PENDING", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "SUBMITTED": { "transitions": [ { "name": "APPROVE", "next": "APPROVED", "manual": true, "disabled": false }, { "name": "DENY", "next": "DECLINED", "manual": true, "disabled": false } ] }, "APPROVED": { "transitions": [] }, "DECLINED": { "transitions": [] }, "CANCELED": { "transitions": [] } } } ``` ### Step 4: Add Processors We add two processors to the `APPROVE` transition and one to the `DENY` transition in the `SUBMITTED` state to complete the workflow. ```json { "version": "1", "name": "Payment Request Workflow", "desc": "Payment request processing workflow with validation, approval, and notification states", "initialState": "INVALID", "active": true, "states": { "INVALID": { "transitions": [ { "name": "VALIDATE", "next": "PENDING", "manual": false, "disabled": false, "criterion": { "type": "function", "function": { "name": "IsValid", "config": { "attachEntity": true } } } }, { "name": "AMEND", "next": "INVALID", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "PENDING": { "transitions": [ { "name": "MATCH", "next": "SUBMITTED", "manual": false, "disabled": false, "criterion": { "type": "function", "function": { "name": "HasOrder", "config": { "attachEntity": true } } } }, { "name": "RETRY", "next": "PENDING", "manual": true, "disabled": false }, { "name": "CANCEL", "next": "CANCELED", "manual": true, "disabled": false } ] }, "SUBMITTED": { "transitions": [ { "name": "APPROVE", "next": "APPROVED", "manual": true, "disabled": false,
"processors": [ { "type": "externalized", "name": "Create Payment Message", "executionMode": "ASYNC_NEW_TX", "config": { "attachEntity": true } }, { "type": "externalized", "name": "Send ACK Notification", "executionMode": "ASYNC_NEW_TX", "config": { "attachEntity": false } } ] }, { "name": "DENY", "next": "DECLINED", "manual": true, "disabled": false, "processors": [ { "type": "externalized", "name": "Send NACK Notification", "executionMode": "ASYNC_NEW_TX", "config": { "attachEntity": false } } ] } ] }, "APPROVED": { "transitions": [] }, "DECLINED": { "transitions": [] }, "CANCELED": { "transitions": [] } } } ``` ## Best Practices - Use domain-specific state names - Match transition granularity to business needs - Define recovery and cancellation paths - Prefer asynchronous processing for external dependencies - Use self-transitions for triggering workflow automation on exit from the current state ## Platform Integration Cyoda workflows integrate directly with: - **Entity Models**: Determine which workflows apply to which data types - **Execution Engine**: Drives state and transition logic - **External Functions**: Implement validation and custom behavior - **Event System**: Triggers automated transitions on event reception --- ## build/working-with-entities.md # Working with entities Create, read, update, and search entities via the cyoda-go API — worked examples. This page shows the patterns for interacting with entities through the platform API. Examples assume a local cyoda-go instance running on the default port with SQLite persistence; the same requests work against Cyoda Cloud with the cloud endpoint and an issued token. The complete endpoint catalogue — parameters, response shapes, error codes — lives in the [API reference](/reference/api/). Keep that open as you work. ## The shape of the API Cyoda speaks REST for CRUD, search, and workflow invocation, gRPC for external processors, and Trino SQL for analytics. 
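As a rough decision aid, the surface split in the sentence above can be written down as a lookup — an illustrative sketch that simply restates the text, not an API of the platform:

```python
def pick_surface(task: str) -> str:
    """Map a task kind to the API surface suggested above (illustrative only)."""
    surfaces = {
        "crud": "REST",
        "search": "REST",
        "workflow-invocation": "REST",
        "external-processor": "gRPC",
        "analytics": "Trino SQL",
    }
    return surfaces[task]

print(pick_surface("external-processor"))  # gRPC
```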
This page covers REST; see [Build → client compute nodes](/build/client-compute-nodes/) for gRPC and the [APIs and surfaces](/concepts/apis-and-surfaces/) overview for the decision framework. Every request is authenticated with a bearer token. Every response includes the entity's current revision, state, and timestamps. ## Create Post an entity to its model. The first time you post, Cyoda discovers the schema from what you send: ```bash curl -X POST http://localhost:8080/api/entity/JSON/orders/1 \ -H 'Content-Type: application/json' \ -H "Authorization: Bearer $TOKEN" \ -d '{ "orderId": "ORD-42", "customerId": "CUST-7", "amount": 120.00, "currency": "EUR", "lines": [ { "sku": "AX-1", "qty": 2, "price": 60.00 } ] }' ``` The path is `/api/entity/{format}/{entityName}/{modelVersion}` — here `JSON`, `orders`, and version `1`. The response carries an array whose first element contains `entityIds[0]`, the **system-assigned UUID** of the new entity, plus its current state and revision number. Capture the UUID — downstream reads, updates, and transitions address the entity by that UUID, not by the business key `orderId`. ## Read Fetch the current revision by id. The `{entityId}` in these URLs is the UUID returned in `entityIds[0]` from the create response, not a business key like `orderId`: ```bash curl http://localhost:8080/api/entity/${ENTITY_ID} \ -H "Authorization: Bearer $TOKEN" ``` List every entity in a model with `GET /api/entity/{entityName}/{modelVersion}`: ``` GET /api/entity/orders/1 ``` For filtered reads — predicates, pagination, result caps, historical reads — see [searching entities](/build/searching-entities/). The list endpoint does not accept ad-hoc field filters; those belong to search. ## Update Direct updates use `PUT /api/entity/{format}/{entityId}` (loopback update — stores a new revision without a named transition) or `PUT /api/entity/{format}/{entityId}/{transition}` (update with a named transition). 
There is no `PATCH` endpoint — all writes are full-payload PUTs. **Mutations that move the entity between lifecycle states should go through a named transition**, not a bare loopback update. Invoking the `submit` transition records it in the audit trail and runs any attached processors. The transition carries the new entity JSON in the request body (the platform stores the updated entity and records the named transition in one call): ```bash curl -X PUT http://localhost:8080/api/entity/JSON/${ENTITY_ID}/submit \ -H 'Content-Type: application/json' \ -H "Authorization: Bearer $TOKEN" \ -d '{ "orderId": "ORD-42", "status": "submitted" }' ``` See [Build → workflows and processors](/build/workflows-and-processors/) for how to declare transitions. ## Search Cyoda supports two query modes: - **Direct** (synchronous, capped result size) — API term `direct`. Returns right away. Good for UI lookups and short operations. Result size is capped, so `direct` is best for queries that produce a bounded, small result set. - **Async** (background, unbounded, paged) — API term `async`. Queued as a job, returns a handle you can poll. Result size is unbounded; results are paged. Good for large result sets, periodic reports, and exports. On the Cassandra-backed tier (Cyoda Cloud, or a licensed Enterprise install), `async` search runs distributed across the cluster and scales horizontally: query throughput for a fixed shape grows roughly linearly with the number of nodes. Both accept the same filter grammar over entity fields, metadata, and workflow state. Pick `direct` by default; switch to `async` when a query would hit the `direct` result cap, would time out, or would hold resources you need elsewhere. For predicates, pagination, and worked examples, see [searching entities](/build/searching-entities/). ## Temporal queries Every entity's history is queryable. 
Add a `pointInTime` parameter to any read or search request to retrieve the world as of that timestamp: ``` GET /api/entity/{entityId}?pointInTime=2026-03-01T00:00:00Z ``` This is the primary way to answer regulatory and audit questions: *what did this customer's balance look like at quarter close?* For the same parameter applied to searches, see [searching entities → historical reads](/build/searching-entities/#historical-reads-with-pointintime); for analytical reads expressed as SQL, see [analytics with SQL](/build/analytics-with-sql/). ## From a compute node When your code is reacting to a transition — running a processor or evaluating a criterion — talk to the platform over **gRPC**, not REST. The gRPC path preserves the audit association between the transition and the compute call, brokers identity, and supports streaming. See [Build → client compute nodes](/build/client-compute-nodes/) for the implementation pattern. --- ## concepts/apis-and-surfaces.md # APIs and surfaces REST, gRPC, and Trino SQL — when and why to use each. Cyoda exposes three distinct API surfaces. Picking the right one for the job matters; the surfaces are not equivalent, and each carries different guarantees. ## Three surfaces - **REST** — the default surface for user-facing clients and administrative operations. CRUD over entities, search, workflow invocations, schema management, dashboards. - **gRPC** — the surface for compute. External processors and criteria services connect to the platform over gRPC, receive work, and return results. Supports bidirectional streaming. - **Trino SQL** — the surface for analytics. Cross-entity queries, reporting, JDBC connections, BI tools. Queries run against a Trino catalogue that projects entities into virtual SQL tables. ## Which surface, when? - Building a UI, an admin tool, or a sync integration? **REST.** - Writing a processor or criterion that runs against transitions? 
**gRPC.** - Running analytics, reports, or ad-hoc queries across many entities? **Trino SQL.** All three surfaces are backed by the same entity store. A transition recorded via REST is visible to gRPC compute nodes and queryable through Trino, with the same audit trail behind it. ## REST: humans and services that speak to the platform Use REST when the call represents *a user or service interacting with the platform as a whole*: creating an order, searching for customers, reading audit history, managing a workflow definition. This is the surface your front-end, your admin tooling, and most external integrations will use. REST is synchronous, authenticated with OAuth 2.0 bearer tokens, and versioned. The full endpoint catalogue lives in the [API reference](/reference/api/). ## gRPC: external processors that speak for the platform Use gRPC when your code is *a compute node acting on behalf of a workflow transition*. Processors and criteria attached to a transition call out to external services over gRPC; those services stream work units back to the platform. **Prefer gRPC for compute** over implementing processors as REST callbacks. Three reasons: 1. **Audit hygiene.** Every gRPC call is recorded against the transition that invoked it, inside the platform's audit trail. REST callouts cannot reconstruct that association reliably. 2. **Authorization is simpler.** The platform brokers the identity and scopes passed to the compute node; you don't have to manage credentials between the platform and your processor independently. 3. **Bidirectional streaming.** High-throughput ingest and transformation workloads benefit from streaming both ways; REST cannot. For how to implement a compute node, see [Build → client compute nodes](/build/client-compute-nodes/). ## Trino SQL: cross-entity analytics :::caution[Upcoming] Trino SQL is on the roadmap and not yet available in cyoda-go at this release. 
The section below documents the planned surface; names and shapes may change before release. ::: Use Trino when the question is *analytical* — joins across entity types, aggregates, reporting, time-series. Every entity model is projected into a set of virtual SQL tables; nested arrays and objects expand into separate tables so relational queries remain natural. Typical uses: - Ad-hoc analysis against live data in a notebook or BI tool. - Scheduled reports that aggregate entities across a tenant. - Historical queries using the `point_time` column for as-of reads. The table generation rules, data-type mappings, JDBC connection patterns, and handling of polymorphic fields are in the [Trino SQL reference](/reference/trino/). For the Build-side quickstart — connection recipe, first query, performance notes — see [Analytics with SQL](/build/analytics-with-sql/). --- ## concepts/authentication-and-identity.md # Authentication and identity OAuth 2.0 tokens, machine-to-machine credentials, on-behalf-of exchange, and external key trust — conceptually. Cyoda is an OAuth 2.0 authorization server. All traffic to the platform — REST, gRPC, Trino — is authenticated with bearer tokens the platform issues. This page explains the identity concepts; the mechanics of configuring an IdP, rotating keys, and provisioning credentials live under Run. ## The platform issues tokens Every request carries a JWT bearer token. Cyoda both **issues** tokens (as an OAuth 2.0 authorization server) and **validates** them on every API call. The token encodes the subject, the scopes, and the tenant the request belongs to; authorization is evaluated from the token, not from transport-level credentials. Clients obtain a token through an OAuth 2.0 flow appropriate to their role: end-user flows for people, M2M flows for services, on-behalf-of exchange for downstream calls. ## Machine-to-machine credentials Services authenticate to Cyoda using **client credentials** (`client_id` and `client_secret`). 
The platform issues tokens to those credentials and enforces the scopes associated with the service account. Use M2M credentials for any automated integration: ingest pipelines, compute nodes, back-office workers. Rotate credentials like any other secret; the lifetime and rotation cadence are enforced per environment. ## On-behalf-of exchange When one service calls another on a user's behalf — a web app calling an API that calls a processor, for example — Cyoda supports **token exchange**. The calling service presents its own token plus the user's token and receives a new token scoped to the downstream call. This preserves the user identity through the chain without passing the original bearer token around. In practice, the calling service includes the user's JWT as the `subject_token` in a token-exchange request; the issued token carries both identities for downstream authorization. The result: the audit trail records who the original user was at every hop, and each service still only sees a token scoped to what it is allowed to do. ## External key trust Cyoda can be configured to **trust tokens issued by an external IdP** — your corporate Okta, Auth0, or Keycloak, for example. The platform accepts tokens signed with keys it recognises, maps the external subject to an internal identity, and applies the local authorization rules. Users sign in with their organisation's single sign-on and receive entitlements within Cyoda. External key trust is configured per environment; the list of trusted signers and the subject-to-identity mapping are part of the tenant's identity configuration. ## Where this is configured - **Self-hosted (cyoda-go).** Identity configuration — bootstrap credentials, JWT signing keys, external IdP trust — is managed via cyoda-go configuration. See the [cyoda-go authentication reference](https://github.com/cyoda-platform/cyoda-go#authentication) for the authoritative parameter list. 
- **Cyoda Cloud.** Identity is surfaced as a managed service: [Run → Cyoda Cloud → identity and entitlements](/run/cyoda-cloud/identity-and-entitlements/). ## What your application does Applications do not implement OAuth 2.0 flows from scratch; they fetch a token using their client credentials (or accept one from a user session) and attach it to every Cyoda call. See [Build → working with entities](/build/working-with-entities/) for the client patterns. --- ## concepts/design-principles.md # Design principles The mental model behind Cyoda: entities as durable state machines, transitions as the unit of change, and history as a first-class query surface. Cyoda's shape follows from a few connected ideas. Once those click, the rest of the platform — the APIs, the workflows, the audit trail, the deployment tiers — is just what falls out of them. This page is the high-level picture; the pages that follow drill into each idea in depth. ## Everything is an entity In Cyoda, every piece of persisted data is an **entity**: a JSON document that belongs to a typed model and carries a lifecycle state. Entities are not rows to be updated in place and they are not messages passing through a pipe. They are the objects your system reasons about — customers, orders, documents, trades — with identity, state, and history. Models are discovered from the data, not declared up front, and widen as new shapes arrive. Once a model is good enough to rely on, it can be **locked** so new data must conform. ## An entity is a durable state machine Each entity type has a **workflow** — a set of named states, the legal transitions between them, and the criteria and processors that run along each transition. The entity lives inside that machine. Cyoda's state machines are close in spirit to BPM flows but looser: transitions can be **automatic** (fire on entering a state) or **manual** (invoked by an actor), and a workflow need not be linear or terminate. Cycles and branches are first-class. 
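A minimal sketch of that machine shape — hypothetical state and transition names, not the platform API — showing auto transitions cascading on state entry, manual transitions fired by an actor, and a loopback cycle:

```python
# Toy state machine: auto transitions fire on entering a state;
# manual transitions fire only when invoked. Cycles are first-class.
AUTO = {"received": "validated"}              # entering "received" auto-advances
MANUAL = {
    ("validated", "archive"): "archived",
    ("archived", "reactivate"): "validated",  # loopback: no terminal state required
}

def enter(state):
    """Cascade automatic transitions until none applies."""
    while state in AUTO:
        state = AUTO[state]
    return state

def invoke(state, name):
    """Fire a manual transition by name, then cascade autos in the new state."""
    return enter(MANUAL[(state, name)])

s = enter("received")        # auto: received -> validated
s = invoke(s, "archive")     # manual: validated -> archived
s = invoke(s, "reactivate")  # manual loopback: archived -> validated
print(s)  # validated
```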
See [Entities and lifecycle](/concepts/entities-and-lifecycle/) for the full model, including the state-machine diagram. ## Transitions are the unit of change Nothing in Cyoda is overwritten. Every transition produces a new, durable revision of the entity, validated against the model at the moment it happened. Processors run under a defined event contract; criteria gate whether the transition is allowed to fire at all. Because revisions accumulate instead of replacing each other, **history is not an add-on audit layer — it is the storage model**. Point-in-time reads, transition logs, and schema-lineage queries are all native operations. ## Events drive the machine Events — "file uploaded", "payment received", "a manual transition issued by an operator" — trigger transitions. Transitions invoke processors. Processors can be synchronous or asynchronous, and can run inside or alongside the transition's transaction. The platform keeps the whole chain observable and replayable. ## Same semantics, every tier The same entity, workflow, and transition semantics run on every deployment tier Cyoda supports: in-memory for tests, SQLite on a desktop, Docker for a single machine, Kubernetes via Helm, or Cyoda Cloud for a managed fleet. Applications move between tiers without rewriting their domain model. ## TL;DR - All persisted data is an **entity** with a typed model and a lifecycle state. - Each entity follows a **workflow** — a state machine of states, transitions, criteria, and processors. - **Transitions** are the atomic unit of change; each one produces a new revision. - Transitions can be **automatic** (fire on state entry) or **manual** (invoked). - Criteria **guard** transitions; processors **execute** along them, sync or async. - **History** is the storage model, not an add-on; every revision is addressable. - **Events** drive transitions; the whole chain is observable and replayable. - The same semantics apply on every tier, from in-memory tests to the cloud. 
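The "history is the storage model" idea in the summary above can be sketched as an append-only revision list with a point-in-time read — a toy illustration, not the platform's actual storage format:

```python
import bisect

# Each transition appends (timestamp, document); nothing is overwritten.
revisions = [
    (1, {"state": "draft",     "amount": 100}),
    (5, {"state": "submitted", "amount": 100}),
    (9, {"state": "approved",  "amount": 120}),
]

def as_of(ts):
    """Return the revision in force at timestamp ts (point-in-time read)."""
    i = bisect.bisect_right([t for t, _ in revisions], ts)
    return revisions[i - 1][1] if i else None

print(as_of(6)["state"])  # submitted
```

Because revisions accumulate, the as-of read needs no separate audit log — it is just an index lookup over the same data.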
## Where to go next - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — schemas, states, history, and the state-machine shape of an entity. - Workflows and events — how transitions, processors, and criteria are wired together. *(Coming as the Concepts section fills out.)* - History and audit — query patterns for time travel and schema lineage. *(Coming.)* - APIs and surfaces, tiers and deployment options, authentication and identity — each has its own page in this section. For the long-form argument behind these ideas, see [Entity Workflows for Event-Driven Architectures](https://medium.com/@paul_42036/entity-workflows-for-event-driven-architectures-4d491cf898a5) and [What's an Entity Database?](https://medium.com/@paul_42036/whats-an-entity-database-11f8538b631a). --- ## concepts/digital-twins-and-growth-path.md # Digital twins and the growth path Why the same Cyoda app runs on any storage tier, and when to pick each tier. A Cyoda application is **tier-agnostic**. The entity model, the workflows, the REST and gRPC surfaces, and the query semantics are the same on every deployment tier. What changes as you move between tiers is non-functional: durability, consistency guarantees, scale, and operational cost. We call this the **digital twin** property — the same logic, running at a different point on the cost/scale curve. ## The growth path ## When to use each tier - **In-Memory** — a single-process, zero-dependency runtime. Data is lost on restart. Use it for functional tests, fast AI iteration loops, and **digital-twin scenario runs** where you want to drive the same app logic at volumes and rates that would be prohibitively expensive against a durable backend. - **SQLite** — durable, single-file, zero-ops. Use it for edge and IoT deployments, small-team self-hosting, and local development where you want data to survive restarts but do not want to operate a separate database. 
- **PostgreSQL** — durable storage with `SERIALIZABLE` isolation for production workloads. Run 3–10 stateless cyoda-go nodes behind a load balancer for active-active HA, with PostgreSQL as the only external dependency. This is the recommended production path for teams self-hosting. - **Cassandra (via Cyoda Cloud)** — a distributed, horizontally scalable backend available today as a managed service. Use Cyoda Cloud when you need enterprise-grade identity, multi-tenancy, and provisioning, and do not want to run the infrastructure yourself. ## Choosing Most teams make this choice along four axes: - **Durability** — can the data disappear on a restart? If yes, In-Memory is fine. Otherwise you need SQLite or up. - **Write volume and HA** — a single process can go a long way on SQLite, but active-active HA with concurrent write safety wants PostgreSQL or cloud. - **Ops appetite** — PostgreSQL is one dependency; SQLite is zero; Cyoda Cloud is "someone else's problem". - **Scale ceiling** — in-process stores have limits Cassandra does not. You do not have to pick forever. The whole point of the growth path is that the application does not change when you move. A test suite can run on In-Memory today, the early product on SQLite, the growing service on PostgreSQL, and the enterprise fleet on Cyoda Cloud — with the same code. ## Where to go next - [Run → overview](/run/) — practical deployment guidance for each tier. - [Build → testing with digital twins](/build/testing-with-digital-twins/) — using In-Memory mode for scenario tests. --- ## concepts/entities-and-lifecycle.md # Entities and lifecycle Entities are durable state machines — schemas, states, transitions, and history. In Cyoda, an **entity** is a durable state machine. It is a JSON document that belongs to a typed model, sits in a named lifecycle state, and carries a complete audit trail of every transition it has ever undergone. The entity is not a row to be updated in place; the entity *is* the state machine. 
## The entity IS the state machine A Cyoda workflow is close in spirit to a BPM process, but the two are not the same. A BPM flow tends to be linear — it has a start, a sequence of activities, and an end. A Cyoda entity workflow is a state machine in the strict sense: there is no requirement for a terminating state, transitions need not follow a linear path, and an entity can move between states in any topology the model calls for, including cycles and branches. That distinction matters in day-to-day modeling. You do not design "the process" and then attach entities to it; you design the states and transitions of a piece of business reality, and the entity lives inside that machine for as long as the business cares about it. ![Example entity state machine: four states connected by auto and manual transitions. One transition runs a `validate` processor; another runs `notify`; a third is gated by an `age > 30d` criterion. A loopback from `archived` back to `active` shows that workflows are not linear and need not terminate.](/img/entity-state-machine.svg) The picture above sketches the building blocks: - **States** (teal rectangles) — the named stages of the entity's life. - **Transitions** — the atomic units of change. Auto transitions (solid, teal) fire as soon as their source state is entered; manual transitions (dashed, orange) fire only when an actor invokes them. - **Processors** (green pills on a transition) — code that runs as part of the transition, under the platform's event contract. - **Criteria** (purple diamonds on a transition) — predicates that gate whether the transition is allowed to fire. Every transition produces a new, durable revision of the entity. Nothing is overwritten. ## Schema Every entity belongs to a named **model** identified by `modelName` and `modelVersion`. 
Cyoda auto-discovers the model schema from ingested samples rather than requiring it up front: as records flow in, Cyoda observes the fields that appear, the types they take, and the shape of nested arrays and objects. That observed shape is the schema. A model has two structural modes. While **unlocked**, it evolves by merging — new fields appear, types widen, array widths grow. When **locked**, the structural contract is frozen and any incoming entity that does not match is rejected. Lock is the right default for production systems with external producers, where silently accepting a widened shape would be a compliance or correctness failure. `modelVersion` is application-controlled; to change the contract after lock, the application bumps the version and **registers the new schema** for it — by submitting a comprehensive set of representative samples (the same mechanism as initial discovery; the samples themselves are not stored). Old revisions are never re-validated or re-cast; each remains valid under the model version active at write time. See [Modeling entities](/build/modeling-entities/) for when to choose each mode and how to plan evolutions. The wire format and field conventions for exported models (type descriptors, array representations, structural markers) are in the [entity-model export reference](/reference/entity-model-export/). ## History and temporal queries Because transitions produce revisions rather than overwriting state, the full history of an entity is always retrievable. You can ask: - "What did this entity look like at timestamp T?" — point-in-time reconstruction. - "What transitions has this entity been through, and when?" — transition log. - "Which version of the model was this revision validated against?" — schema lineage. This is not an add-on audit layer bolted onto a mutable store; it is the storage model. See the [API reference](/reference/api/) for the temporal-query grammar. --- ## concepts/what-is-cyoda.md # What is Cyoda? 
An Entity Database Management System — a database engine where the first-class abstraction is a stateful entity with schema, lifecycle, history, and transactional integrity. Cyoda is an **Entity Database Management System (EDBMS)**. Unlike a relational database, where the unit is a row in a table, and unlike a document database, where the unit is a JSON blob, Cyoda's first-class unit is an **entity**: a typed document that carries a lifecycle state, a complete history of every change, and transactional integrity. ## An EDBMS, not a database Most databases answer one question well: *what is the state of the world right now?* They leave you to glue in a workflow engine, a message bus, an audit layer, and a schema registry on top. An EDBMS answers a broader question: *how does this thing evolve, under what rules, and how do I ask what it looked like last Tuesday?* It folds the workflow engine, the audit trail, the schema registry, and the event contract into the storage model. Everything the entity does — transitions, rule evaluations, processor invocations, revisions — is a first-class, queryable part of the same store. Out of the box, an entity in Cyoda has: - a **schema** discovered from ingested samples, evolving over time, lockable; - a **lifecycle state** governed by a workflow; - **transactional transitions** that produce durable, addressable revisions; - a **temporal history** you can query at any point in the past; - an **audit trail** of every rule, transition, and processor that touched it. ## Why this shape Cyoda targets domains where state, rules, and data must evolve together: financial ledgers, order management, regulatory compliance, digital twins. These domains share a property — "the row got updated" is not enough information. You need to know *why* it changed, *under what rules*, and what the world looked like before. That information has to be engineered into a normal database as an afterthought; an EDBMS makes it the default. 
## Two forms today Cyoda ships in two closely related forms, with the same semantics: - **cyoda-go** — an open-source Go implementation, run as a local binary or a small cluster. Backends are In-Memory, SQLite, or PostgreSQL, chosen at start. - **Cyoda Cloud** — a managed service backed by the Cassandra-based Cyoda Platform Library. Horizontally scalable, multi-tenant, with enterprise identity, observability, and provisioning. Applications written against one run against the other without rewriting the domain model. ## The growth path You can start on a laptop in minutes and graduate — on your schedule — to a container, to Kubernetes, or to Cyoda Cloud as scale and operational needs demand. The entity, workflow, and API contracts do not change as you move. ## Where to go next - [Design principles](/concepts/design-principles/) — the mental model in one read. - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the state machine shape of an entity, including a worked example diagram. - Digital twins and the growth path — how the same application runs at every tier. *(Coming as Concepts fills out.)* --- ## concepts/workflows-and-events.md # Workflows and events State machines as a first-class concept — triggers, external processors, and audit trails. A **workflow** is the state machine an entity lives inside. It declares the states an entity can be in, the transitions between them, the criteria that gate whether a transition is allowed, and the processors that run along the way. This page explains the concept; the [Build guide for workflows and processors](/build/workflows-and-processors/) covers how to configure one. ## State machines define allowed change Every entity type has a workflow. Nothing changes the entity except a transition defined in that workflow. A transition is atomic: it produces a new, durable revision of the entity, runs any attached processors under the platform's event contract, and either succeeds or has no effect. 
Workflows are general state machines, not pipelines. Transitions can be automatic (fire as soon as their source state is entered and any criteria are satisfied) or manual (fire only when invoked by an actor). A workflow can contain cycles, branches, and multiple terminal-looking states that are actually re-entered later — it does not need to terminate at all. ## Triggers Transitions fire in one of two ways: - **Manual** — an actor (a user, a service, an admin) calls the transition by name through the API. - **Automatic** — on entering a state, the first valid auto transition fires within the same transaction, recursing until no further auto transition applies. ## Processors A **processor** is code that runs as part of a transition. It can read the entity, compute a new field, call an external service, persist a side effect, or reject the transition. Two flavours: - **Internal processors** — shipped with the platform for common work (validation, projection, enrichment) and invoked declaratively. - **External processors** — your code, hosted anywhere, called by Cyoda over gRPC. External processors preserve audit hygiene (every call is logged against the transition), keep authorization simple (the platform brokers identity), and support bidirectional streaming for high-throughput workloads. For why gRPC is preferred and how to implement one, see [Build → client compute nodes](/build/client-compute-nodes/). Processors can run synchronously within the transition's transaction, or asynchronously alongside it. Processors run in one of three modes: **SYNC** (inline, shares the transition's transaction — failure aborts the transition), **ASYNC_SAME_TX** (runs asynchronously but in the same transaction context — failure still aborts), or **ASYNC_NEW_TX** (runs in a separate transaction via savepoint isolation — failure is logged and the transition succeeds). Choose the mode based on how atomically the side-effect must compose with the state change. 
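The failure semantics of the three modes can be sketched as a toy model (illustrative only — not the engine, and processor outcomes are simplified to a boolean):

```python
def run_transition(processors):
    """processors: list of (mode, ok) pairs.

    Returns (transition_ok, logged_failures).
    SYNC and ASYNC_SAME_TX failures abort the transition;
    ASYNC_NEW_TX failures are logged and the transition still succeeds.
    """
    logged = []
    for mode, ok in processors:
        if ok:
            continue
        if mode in ("SYNC", "ASYNC_SAME_TX"):
            return False, logged   # failure aborts the transition
        logged.append(mode)        # ASYNC_NEW_TX: log and carry on
    return True, logged

print(run_transition([("SYNC", True), ("ASYNC_NEW_TX", False)]))  # (True, ['ASYNC_NEW_TX'])
print(run_transition([("ASYNC_SAME_TX", False)]))                 # (False, [])
```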
## Audit trail is the storage model Because every transition produces a revision and every processor invocation is recorded against it, the audit trail is not a separate log — it is a view of the storage. You can query the transitions an entity has been through, the criteria that were evaluated, the processors that ran, and the inputs and outputs of each. Point-in-time queries use the same index. ## Why this matters State machines plus durable transitions plus a queryable audit trail are the native ingredients for **regulated**, **auditable**, and **replayable** systems. Replaying a workflow from history does not require an event-sourcing framework on the side — it is the default behavior of the store. ## Where to go next - [Entities and lifecycle](/concepts/entities-and-lifecycle/) — the entity side of the machine, with a state-machine diagram. - [Build → workflows and processors](/build/workflows-and-processors/) — how to declare a workflow in practice. - [Build → client compute nodes](/build/client-compute-nodes/) — how to implement an external processor over gRPC. --- ## getting-started/install-and-first-entity.md # Install cyoda-go and create your first entity Install cyoda-go with SQLite (the default), define an entity model, trigger a workflow, and read state back. This page takes you from nothing installed to a persisted entity you can query, running entirely on your own machine. The default backend is **SQLite** — zero operational overhead, one file on disk, data survives restarts. ## Install cyoda-go ships as a single binary. Pick the flavour that fits your machine; the authoritative list of installers lives in the [cyoda-go README](https://github.com/cyoda-platform/cyoda-go#install). ```bash # macOS / Linux via Homebrew brew install cyoda-platform/cyoda-go/cyoda ``` The Homebrew formula is expected to run `cyoda init` automatically, enabling SQLite persistence with data at `~/.local/share/cyoda/cyoda.db` on macOS/Linux. 
If it didn't (or if you installed another way), run `cyoda init` yourself — see `cyoda help cli init` for what it does. A `curl | sh` installer, Debian packages, and Fedora RPMs are available for other environments. ## Why SQLite is the default SQLite is durable (data survives restarts), zero-ops (no server to run), and single-file (easy to inspect, back up, or delete). It is the right starting point for everyone except two groups: teams running functional tests where latency matters more than durability (use **in-memory** mode — see the callout below), and teams building production services (graduate to **PostgreSQL** when you need active-active HA; see [Run → overview](/run/)). ## Start the server Start the server with the defaults: ```bash cyoda ``` Running `cyoda` (no subcommand) defaults to **mock auth** (`CYODA_IAM_MODE=mock`): every request is authenticated as a configurable default user, no bearer token required. You'll see examples elsewhere in the docs with `-H "Authorization: Bearer $TOKEN"`; those are written for production deployments running in JWT mode. On a fresh local install you can drop the header — or keep it and send any placeholder; mock mode ignores it. Flip to JWT mode by setting `CYODA_IAM_MODE=jwt` plus `CYODA_JWT_SIGNING_KEY` (RSA private key, PEM). For the full auth-mode configuration see [Configuration](/reference/configuration/). ## Discover everything else with `cyoda help` Every flag, environment variable, endpoint, error code, metric, and operation is documented in the binary. Browse the topic tree: ```bash cyoda help ``` Drill in on anything: ```bash cyoda help config # configuration model and precedence cyoda help crud # entity CRUD over REST cyoda help search # query predicates and search modes cyoda help errors # the error catalogue cyoda help telemetry # metrics, health, tracing, logs ``` `cyoda help --format=json` emits a machine-readable shape suitable for tools; `--format=markdown` is the default off a TTY. 
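Returning to auth modes: if you want to run this quickstart against JWT mode instead of the mock default, a minimal sketch — the key filename is illustrative, and the `_FILE` companion variable is assumed from the pattern described in [Configuration](/reference/configuration/):

```shell
# Generate an RSA private key in PEM form for JWT signing (filename illustrative).
openssl genrsa -out cyoda-jwt.pem 2048

# Switch from mock auth to JWT mode; the _FILE companion reads the key from disk.
export CYODA_IAM_MODE=jwt
export CYODA_JWT_SIGNING_KEY_FILE="$PWD/cyoda-jwt.pem"
cyoda   # start the server in JWT mode
```

In this mode the `Authorization: Bearer $TOKEN` headers shown elsewhere in the docs become mandatory.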
## Import a workflow Before creating an entity, import a workflow so the platform knows which state machine to apply. Save this as `workflow.json` — two states (`draft` and `submitted`) with one manual transition. States are keyed by name; each state owns its outgoing transitions: ```json { "workflows": [ { "version": "1", "name": "orders-wf", "initialState": "draft", "active": true, "states": { "draft": { "transitions": [ { "name": "submit", "next": "submitted", "manual": true } ] }, "submitted": {} } } ] } ``` Post it to the import endpoint for the `orders` model at version `1`: ```bash curl -X POST http://localhost:8080/api/model/orders/1/workflow/import \ -H 'Content-Type: application/json' \ -d @workflow.json ``` Without this step, cyoda-go applies its default workflow (`NONE` → `CREATED` → `DELETED` with a single automatic transition), and the `submit` transition used below will not exist. For the full workflow schema — processors, criteria, automatic transitions, nested conditions — see [Build → workflows and processors](/build/workflows-and-processors/). ## Create your first entity Define a minimal model and push an entity. Cyoda discovers the schema from the sample you send, so you do not need to declare it up front: ```bash ENTITY_ID=$(curl -s -X POST http://localhost:8080/api/entity/JSON/orders/1 \ -H 'Content-Type: application/json' \ -d '{ "orderId": "ORD-1", "amount": 42.00, "currency": "EUR" }' \ | jq -r '.[0].entityIds[0]') echo "$ENTITY_ID" ``` The create response is an array; `entityIds[0]` on the first element is the **system-assigned UUID** of the new entity. Subsequent reads and transitions address the entity by that UUID (`${ENTITY_ID}` here), not by the business key `orderId`. Automatic transitions (`manual: false`) fire immediately on creation, cascading the entity through applicable states until it reaches one with no outgoing auto transitions. 
The `orders-wf` workflow you just imported has none, so the entity settles in `draft` and waits for the manual `submit` transition below. ## Invoke a manual transition Trigger the `submit` transition on your entity. `{entityId}` (here `${ENTITY_ID}`) is the system-assigned UUID captured from the create response, not the business key `orderId`: ```bash curl -X PUT http://localhost:8080/api/entity/JSON/${ENTITY_ID}/submit ``` ## Read state back Fetch the current state and the transition log: ```bash curl http://localhost:8080/api/entity/${ENTITY_ID} ``` The response shows the entity in its new `submitted` state plus a record of every revision it has been through. For the full request and response shapes, see the [API reference](/reference/api/). ## Next steps - **Explore the binary.** Every flag, env var, endpoint, and error is described by a `cyoda help <topic>` action. Browse the full tree at [Reference → `cyoda help`](/reference/cyoda-help/). - **Understand the model.** Read [Design principles](/concepts/design-principles/) and [Entities and lifecycle](/concepts/entities-and-lifecycle/) for the mental model behind what you just did. - **Build real applications.** Start with [Build → working with entities](/build/working-with-entities/) for the end-to-end patterns. - **Choose a deployment tier.** See [Run → overview](/run/) when you outgrow local SQLite. If you want fast functional tests without durability, run cyoda-go in **in-memory** mode (`go run ./cmd/cyoda` or the `CYODA_STORAGE_BACKEND=memory` profile). See [Testing with digital twins](/build/testing-with-digital-twins/) for the pattern. --- ## index.md # Cyoda Documentation Build and run Cyoda applications — from local cyoda-go to hosted Cyoda Cloud. ![Illustration of interconnected entity lifecycle components — state, workflows, events, and audit — unified in a single Cyoda runtime.](heroImage) EDBMS — Entity Database Management System # One transactional runtime for the entity lifecycle.
Cyoda is an EDBMS: state machine, processors, and full revision history live inside the record, committed atomically — minimizing the need for sagas. A simpler stack than Postgres + Temporal + Kafka + a CDC audit pipeline. Build complete event-driven backends on one runtime. Open source, single Go binary, Postgres-backed. ## Four storage engines. One application contract. Three open-source engines ship with cyoda-go — in-memory, SQLite, and PostgreSQL — each tuned to a different operational shape. A commercial Cassandra plugin extends the same application code to fully scalable, robust production workloads.
## Where to go next - **New here?** Start with the [install-and-first-entity onramp](/getting-started/install-and-first-entity/). - **Understanding Cyoda?** Read [Concepts](/concepts/what-is-cyoda/). - **Building an app?** [Build](/build/) covers tier-agnostic patterns. - **Running one?** [Run](/run/) covers [desktop](/run/desktop/), [Docker](/run/docker/), [Kubernetes](/run/kubernetes/), and [Cyoda Cloud](/run/cyoda-cloud/). - **Need API specs?** [Reference](/reference/) embeds and ingests from [cyoda-go](https://github.com/Cyoda-platform/cyoda-go). --- ## reference.md # Reference Technical references — mostly generated from cyoda-go at build time. Reference content on this site is a narrative skin over the cyoda-go binary. The binary is self-documenting — every flag, environment variable, endpoint, error code, metric, header, and operation ships with its own help topic. This section points you at the right topics, shows you the REST/gRPC surfaces, and documents the CloudEvent JSON Schemas. The material here was captured against **cyoda-go v{helpIndex.pinnedVersion}**. For whatever version you are running, `cyoda help` on your own binary is the authoritative source. ## Start here - **[cyoda help](./cyoda-help/)** — navigator over the full topic tree. Every top-level topic and its drilldowns, with synopses. The best first stop. ## Surfaces - **[API](./api/)** — REST OpenAPI 3.1 reference, interactive viewer. - **[gRPC](./api/#grpc)** — gRPC CloudEventsService (cross-linked from the API page). - **[JSON Schemas](./schemas/)** — CloudEvent payload schemas, extracted from the pinned binary at build time. - **[Trino SQL](./trino/)** — SQL analytics surface (Cyoda Cloud; upcoming). ## Navigators over specific topics - **[CLI](./cli/)** — command-line entry points and global flags. - **[Configuration](./configuration/)** — configuration model, precedence, profiles, `_FILE` secrets. - **[Helm values](./helm/)** — chart layout, values model, secret provisioning. 
- **[Entity model export](./entity-model-export/)** — SIMPLE_VIEW export shape. Each navigator page carries a "Canonical reference" callout pointing at the corresponding `cyoda help <topic>` for the authoritative contract. --- ## reference/api.md # API reference REST and gRPC surfaces. The REST API reference is rendered in a dedicated viewer so the documentation chrome does not compete with it for horizontal space: [Open the REST API reference](/api-reference/) The viewer works from the OpenAPI spec shipped with this site (`/openapi/openapi.json`), which was extracted from **cyoda-go v{helpIndex.pinnedVersion}**. For the version you are running, `cyoda help openapi` on your own binary is authoritative. The viewer supports the standard operations: browsing endpoints, inspecting request/response shapes, and try-it-out calls against an environment of your choice. ## gRPC gRPC proto documentation is tracked upstream and will appear here once the generated reference is published from cyoda-go. Until then, the `.proto` files in [cyoda-go/api/grpc](https://github.com/cyoda-platform/cyoda-go/tree/main/api/grpc) are the authoritative source. --- ## reference/cli.md # CLI cyoda-go command-line interface — narrative navigator over `cyoda help cli`. export const cliTopics = (() => { const found = helpIndex.topics.filter(t => t.path[0] === 'cli'); if (found.length === 0) { throw new Error(`EmptyNavigator: reference/cli.mdx filtered helpIndex to zero topics under prefix "cli" (pinned v${helpIndex.pinnedVersion}). Likely a topic rename upstream.`); } return found; })(); The `cyoda` binary is the server and its own control surface. It runs as a single long-lived process by default, with a small number of subcommands for bootstrapping, health, and migration. Every flag, subcommand, and env var is documented in-binary via `cyoda help <topic>` and is version-accurate to whatever binary you are running.
## Output formats Every help topic supports three output formats via `--format`: - `text` (default on a TTY) — human reading. - `markdown` (default off-TTY) — paste into docs, PRs, chat. - `json` — machine-readable, stable schema; consumed by tools like cyoda-docs' own build pipeline. ## Drilldowns Topics that naturally subdivide take a multi-word path on the CLI: `cyoda help search async`, `cyoda help grpc compute`, and so on. The same path appears in the JSON surface as an array. ## Related topics The subset of `cyoda help` topics directly relevant to the CLI surface itself is below. {cliTopics.map((t) => (
  • cyoda help {t.path.join(' ')} — {t.title.replace(/^[^—]*—\s*/, '')}
    {t.synopsis}
  • ))} --- ## reference/configuration.md # Configuration cyoda-go configuration model — narrative navigator over `cyoda help config`. export const configTopics = (() => { const found = helpIndex.topics.filter(t => t.path[0] === 'config'); if (found.length === 0) { throw new Error(`EmptyNavigator: reference/configuration.mdx filtered helpIndex to zero topics under prefix "config" (pinned v${helpIndex.pinnedVersion}). Likely a topic rename upstream.`); } return found; })(); cyoda-go reads configuration from `CYODA_*` environment variables and from `.env`-format files. The authoritative key list — every variable, its type, its default — lives in the binary. This page covers the *model*: how sources compose, how profiles work, and how secrets are mounted from files. ## Sources and precedence Values resolve in this order, highest to lowest: 1. Shell environment 2. `.env.{profile}` files (in `CYODA_PROFILES` declaration order; later profiles override earlier ones within their group) 3. `.env` in the project directory 4. User config file 5. System config file 6. Hardcoded defaults Format is `.env` only (godotenv-parsed). No TOML, no YAML, no `--config` flag. Subcommand flags (e.g. `cyoda init --force`) are operation-scoped and do not override server-runtime configuration. **User config path** varies by OS: `~/.config/cyoda/cyoda.env` (Linux, macOS with XDG), `%AppData%\cyoda\cyoda.env` (Windows). **System config** lives at `/etc/cyoda/cyoda.env` on POSIX. ## Profiles `CYODA_PROFILES` is comma-separated and evaluated in declaration order. Within a profile, regular `.env` precedence applies; across profiles, later entries in the list override earlier ones. ## Secrets via `_FILE` suffix Any variable that accepts a credential (Postgres URL, JWT signing key, metrics bearer, gossip HMAC, bootstrap client secret) accepts a companion `*_FILE` variable that reads from a mounted file. Trailing whitespace is stripped. 
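A minimal sketch of the pattern — the variable name `CYODA_PG_URL_FILE` is illustrative (the real credential keys are listed by `cyoda help config`), and the secrets mount is faked with a local temp file:

```shell
# In Kubernetes or Docker, a secrets mount would place this file;
# here we fake one locally for illustration.
secret_file="$(mktemp)"
printf 'postgres://cyoda:s3cret@db:5432/cyoda' > "$secret_file"

# The binary reads the credential from the file at startup.
export CYODA_PG_URL_FILE="$secret_file"
```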
The `_FILE` variant takes precedence when both are set — the pattern designed for Kubernetes Secrets and Docker secrets mounts. ## Related topics {configTopics.map((t) => (
  • cyoda help {t.path.join(' ')} — {t.title.replace(/^[^—]*—\s*/, '')}
    {t.synopsis}
  • ))) --- ## reference/cyoda-help.md # `cyoda help` — topic tree Every flag, env var, endpoint, error, metric, and operation — browsable from the binary. This page is a navigator over the full topic tree. export const byTopLevel = (() => { const groups = new Map(); for (const t of helpIndex.topics) { const top = t.path[0]; if (!groups.has(top)) groups.set(top, []); groups.get(top).push(t); } return Array.from(groups.entries()).map(([top, topics]) => { topics.sort((a, b) => a.path.join('/').localeCompare(b.path.join('/'))); return { top, topics }; }); })(); The cyoda binary is self-documenting. Every flag, environment variable, endpoint, error code, metric, header, and operation is described by a **help topic** that ships with the binary. Topics are structured; many subdivide into drilldowns. The tree is stable across patch releases and evolves with minor releases. ## How to use it ```bash cyoda help # browse the whole tree cyoda help <topic> # read one topic (e.g. `cyoda help search`) cyoda help <topic> <subtopic> # drill down (e.g. `cyoda help search async`) cyoda help --format=json # machine-readable cyoda help --format=markdown # default off-TTY — paste into docs, PRs, chat ``` The binary you run is the authority for the version you run. This page lists the tree as shipped by the cyoda-go release this site was built against — other releases may add or rename topics. ## The tree {byTopLevel.map(({ top, topics }) => ( <>
    cyoda help {top}

    {topics.find(t => t.path.length === 1)?.synopsis ?? ''}

    {topics.filter(t => t.path.length > 1).length > 0 && (
      {topics.filter(t => t.path.length > 1).map(t => (
    • cyoda help {t.path.join(' ')} — {t.synopsis}
    • ))}
    )}
    ))} {` .cyoda-help-tree dt { margin-top: 1.2rem; font-weight: 600; } .cyoda-help-tree dd { margin-left: 1.5rem; margin-top: 0.25rem; } .cyoda-help-tree dd p { margin: 0 0 0.4rem 0; } .cyoda-help-tree dd ul { margin: 0.25rem 0 0 0; padding-left: 1.2rem; } .cyoda-help-tree dd li { margin-block: 0.15rem; } .cyoda-help-tree dd code { font-size: 0.9em; } `} --- ## reference/entity-model-export.md # Entity model export (SIMPLE_VIEW) API specification for the SIMPLE_VIEW entity-model export — response format, node descriptors, type descriptors, and error shapes. The SIMPLE_VIEW export returns the full structural model of an entity type — every node, every field, every observed type — in a compact, round-trippable JSON form. This page is the wire-format specification: endpoint, response envelope, node and type descriptors, JSON Schema, and error shapes. For the conceptual context (what a model is, how discovery/widening/locking work), see [Modeling entities](/build/modeling-entities/) and [Entities and lifecycle](/concepts/entities-and-lifecycle/). ## Endpoint ``` GET /model/export/SIMPLE_VIEW/{entityName}/{modelVersion} ``` | Parameter | Type | Description | |----------------|---------|--------------------------------------| | `entityName` | String | Name of the entity model | | `modelVersion` | Integer | Version number of the entity model | **Response Content-Type:** `application/json` ## Response envelope Every SIMPLE_VIEW export response is a JSON object with exactly two top-level keys: | Key | Type | Description | |----------------|--------|--------------------------------------------------------------| | `currentState` | String | Lifecycle state of the model: `"UNLOCKED"` or `"LOCKED"` | | `model` | Object | The SIMPLE_VIEW body — a map of **node paths** to **node descriptors** | ```json { "currentState": "LOCKED", "model": { ... 
} } ``` ## The `model` object — structure overview The `model` value is a flat JSON object whose **keys are node paths** and whose **values are node descriptors**. Each entry describes one level of the entity's hierarchical structure. The entire nested tree is flattened into this single-depth map. There are three kinds of node descriptors, corresponding to three structural cases: 1. **Object nodes** — JSON objects mapping field keys to type descriptors. 2. **Array nodes** — a single JSON value (primitive or array) representing a detached array specification. 3. **Mixed nodes** — a JSON array of exactly two elements: `[, ]`. ## Node paths Node paths use JSONPath-like syntax rooted at `$`. | Path | Meaning | |---------------------------|-----------------------------------------------------------| | `$` | The root object | | `$.fieldName[*]` | Array elements inside `fieldName` on the root object | | `$.parent[*].child[*]` | Array elements inside `child`, nested under `parent` array elements | | `$.a[*][*]` | Elements of a nested (multi-dimensional) array | The `[*]` marker (called `COLLECTION_MARKER` internally) denotes "all elements of this array." Node paths are always sorted lexicographically in the output. **Depth** is derived from the number of `[*]` segments in the path (i.e., `path.split("[*]").count() - 1`). ## Node descriptor formats ### 1. Object node (most common) A JSON object whose entries fall into two categories: #### a) Data fields — keys starting with `.` (dot) Each key is a dot-prefixed field name. The value is the field's **type descriptor** (see [Type descriptors](#type-descriptors) below). 
```json { ".category": "STRING", ".year": "INTEGER", ".score": "DOUBLE" } ``` Array fields within an object node have keys ending in `[*]`: ```json { ".tags[*]": "(STRING x 3)", ".name": "STRING" } ``` The `(TYPE x WIDTH)` form is an array descriptor, where `TYPE` is the element type and `WIDTH` is the array length; e.g., `(INTEGER x 4)` is a four-element integer array. See [Array type descriptors](#array-type-descriptors) for the full syntax. #### b) Structural fields — keys starting with `#` Structural fields are metadata markers prefixed with `#`. They indicate the role of this node in the overall structure. | Key | Value | Meaning | |----------------|--------------------|-------------------------------------------------| | `#` | `"ARRAY_ELEMENT"` | This node describes elements of its parent array | | `#.fieldName` | `"OBJECT"` | `fieldName` is an object (has its own child node)| Example — a node representing array elements with an object sub-field: ```json { ".firstname": "STRING", ".id": "STRING", "#": "ARRAY_ELEMENT", "#.address": "OBJECT" } ``` Fields are sorted alphabetically within the object (data fields first, then structural fields, both sorted by key). ### 2. Array node (detached array) Arrays of arrays (multidimensional arrays beyond the first dimension) create detached array nodes — each inner array becomes its own node in the export tree, keyed by the path to that inner array (e.g. `$.matrix[*]`). Likewise, arrays of objects create a separate node for the element shape (see Examples 1 and 3). When a node path points to a pure array (no object fields at this level), the value is the array's **type descriptor** directly — either a UniTypeArray string or a MultiTypeArray JSON array. See [Array type descriptors](#array-type-descriptors). ```json { "$.data[*]": "(INTEGER x 5)" } ``` ### 3.
Mixed node (structural polymorphism) When the same path has been observed as both an object and an array (structural polymorphism), the value is a JSON array of exactly two elements: ```json { "$.data[*]": [ { ".nested": "STRING", "#": "ARRAY_ELEMENT" }, "(INTEGER x 2)" ] } ``` - Element `[0]`: the object-node descriptor. - Element `[1]`: the array-node type descriptor. ## Type descriptors Type descriptors appear as values for data fields (`.fieldName` keys) and for array nodes. ### Primitive type descriptor A JSON string containing a single `DataType` name or a polymorphic set. **Monomorphic** (single type): ``` "STRING" ``` **Polymorphic** (multiple observed types for the same field, enclosed in brackets): ``` "[INTEGER, STRING]" ``` ### Supported DataType values | DataType | Description | |--------------------|------------------------------------------------| | `STRING` | Text value | | `BYTE` | 8-bit signed integer | | `SHORT` | 16-bit signed integer | | `INTEGER` | 32-bit signed integer | | `LONG` | 64-bit signed integer | | `BIG_INTEGER` | Arbitrary-precision integer (bounded by Int128) | | `UNBOUND_INTEGER` | Arbitrary-precision integer (unbounded) | | `FLOAT` | 32-bit IEEE 754 floating point | | `DOUBLE` | 64-bit IEEE 754 floating point | | `BIG_DECIMAL` | Arbitrary-precision decimal (bounded, scale ≤ 18) | | `UNBOUND_DECIMAL` | Arbitrary-precision decimal (unbounded) | | `BOOLEAN` | Boolean value | | `CHARACTER` | Single character | | `LOCAL_DATE` | Date without time zone (ISO 8601) | | `LOCAL_DATE_TIME` | Date-time without time zone | | `LOCAL_TIME` | Time without date | | `ZONED_DATE_TIME` | Date-time with time zone | | `YEAR` | Year value | | `YEAR_MONTH` | Year and month | | `UUID_TYPE` | UUID | | `TIME_UUID_TYPE` | Version 1 (time-based) UUID | | `BYTE_ARRAY` | Binary data (base64-encoded) | | `NULL` | Null / no value observed yet | ### Structural DataType values (used only in `#`-prefixed keys) | DataType | Description | 
|--------------------|------------------------------------------------| | `OBJECT` | Marks a field as an object (has its own child node) | | `ARRAY` | Marks a field as an array container | | `ARRAY_ELEMENT` | Marks this node as describing array elements | | `TYPE_REFERENCE` | Internal reference to another type definition | | `POLYMORPHIC` | Internal marker for polymorphic fields | ## Array type descriptors Array fields (keys ending in `[*]`) and detached array nodes use one of two array representations. ### UniTypeArray (homogeneous) All elements have the same type. Serialized as a parenthesized string: ``` (<type> x <width>) ``` - `type`: a DataType name or polymorphic set. - `width`: the maximum observed array length. Examples: ``` "(STRING x 3)" — array of 3 strings "(INTEGER x 10)" — array of 10 integers "([INTEGER, STRING] x 4)" — array of 4 elements, each either integer or string ``` ### MultiTypeArray (heterogeneous) Elements at different positions have different types. Serialized as a JSON array of type strings: ```json ["INTEGER", "STRING", "BOOLEAN"] ``` Each element in the JSON array represents the type at that index position.
Polymorphic elements within a multi-type array use the bracket notation: ```json ["[INTEGER, STRING]", "INTEGER", "[INTEGER, STRING]", "INTEGER"] ``` ## Complete examples ### Example 1: Simple flat object (Nobel Prize) **Input data shape:** ```json { "category": "chemistry", "year": "2020", "laureates": [ { "firstname": "Emmanuelle", "id": "991", "motivation": "...", "share": "2", "surname": "Charpentier" } ] } ``` **SIMPLE_VIEW export:** ```json { "currentState": "LOCKED", "model": { "$": { ".category": "STRING", ".year": "STRING" }, "$.laureates[*]": { ".firstname": "STRING", ".id": "STRING", ".motivation": "STRING", ".share": "STRING", ".surname": "STRING", "#": "ARRAY_ELEMENT" } } } ``` ### Example 2: Nested objects and primitive arrays **Input data shape:** ```json { "name": "Alice", "scores": [95, 87, 92], "address": { "city": "London", "zip": "SW1A" } } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".address.city": "STRING", ".address.zip": "STRING", ".name": "STRING", ".scores[*]": "(BYTE x 3)" } } } ``` Plain nested objects (non-array) are **inlined** into the parent node using dot-path notation (e.g., `.address.city`). They do not produce separate node entries or `#.fieldName` structural markers. 
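These string forms are easy to pick apart mechanically. A small shell sketch — the helper names are ours, and the patterns follow the node-path depth rule and the UniTypeArray syntax described above:

```shell
# Depth of a node path = number of "[*]" segments (see "Node paths").
# Splitting on the marker and counting fields mirrors split("[*]").count() - 1.
path_depth() {
  printf '%s' "$1" | awk -F'\\[\\*\\]' '{print NF - 1}'
}

# Split a UniTypeArray descriptor "(TYPE x WIDTH)" into its two parts.
array_type()  { printf '%s' "$1" | sed -E 's/^\((.*) x [0-9]+\)$/\1/'; }
array_width() { printf '%s' "$1" | sed -E 's/^\(.* x ([0-9]+)\)$/\1/'; }

path_depth '$.parent[*].child[*]'       # → 2
array_type '([INTEGER, STRING] x 4)'    # → [INTEGER, STRING]
array_width '(STRING x 3)'              # → 3
```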
### Example 3: Multi-dimensional array **Input data shape:** ```json { "matrix": [ [1, 2, 3], [4, 5, 6] ] } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".matrix[*]": "(ARRAY_ELEMENT x 2)", "#.matrix": "OBJECT" }, "$.matrix[*]": "(INTEGER x 3)" } } ``` ### Example 4: Polymorphic field **Input data shape** (after ingesting multiple records): ```json { "data": "hello" } { "data": 42 } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".data": "[INTEGER, STRING]" } } } ``` ### Example 5: Mixed node (structural polymorphism) **Input data shape** (after ingesting multiple records with different shapes): ```json { "data": [{"nested": "primitive"}] } { "data": [[123, 321], [456, 654]] } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".data[*]": "(ARRAY_ELEMENT x 2)", "#.data": "OBJECT" }, "$.data[*]": [ { ".nested": "STRING", "#": "ARRAY_ELEMENT" }, "(INTEGER x 2)" ] } } ``` The `$.data[*]` entry is a JSON array of two elements because the system has observed `data` elements as both objects (with `.nested` field) and arrays of integers. ### Example 6: Heterogeneous (multi-type) array **Input data shape:** ```json { "row": [1, null, "three"] } ``` **SIMPLE_VIEW export:** ```json { "currentState": "UNLOCKED", "model": { "$": { ".row[*]": ["INTEGER", "NULL", "STRING"] } } } ``` ## JSON Schema for the SIMPLE_VIEW response ```json { "$schema": "https://json-schema.org/draft/2020-12/schema", "title": "SIMPLE_VIEW Model Export Response", "description": "Response from GET /model/export/SIMPLE_VIEW/{entityName}/{modelVersion}", "type": "object", "required": ["currentState", "model"], "additionalProperties": false, "properties": { "currentState": { "type": "string", "enum": ["LOCKED", "UNLOCKED"], "description": "Lifecycle state of the entity model." }, "model": { "type": "object", "description": "Map of node paths to node descriptors. 
Keys are JSONPath-like strings (e.g., '$', '$.field[*]'). Always contains at least the root node '$'.", "propertyNames": { "pattern": "^\\$(\\.[\\w][-\\w.]*(\\[\\*\\])*)*$" }, "additionalProperties": { "$ref": "#/$defs/nodeDescriptor" } } }, "$defs": { "dataType": { "type": "string", "description": "A primitive DataType name.", "enum": [ "STRING", "BYTE", "SHORT", "INTEGER", "LONG", "BIG_INTEGER", "UNBOUND_INTEGER", "FLOAT", "DOUBLE", "BIG_DECIMAL", "UNBOUND_DECIMAL", "BOOLEAN", "CHARACTER", "LOCAL_DATE", "LOCAL_DATE_TIME", "LOCAL_TIME", "ZONED_DATE_TIME", "YEAR", "YEAR_MONTH", "UUID_TYPE", "TIME_UUID_TYPE", "BYTE_ARRAY", "NULL" ] }, "structuralDataType": { "type": "string", "description": "DataType used in structural field values.", "enum": ["OBJECT", "ARRAY", "ARRAY_ELEMENT", "TYPE_REFERENCE", "POLYMORPHIC"] }, "typeDescriptor": { "description": "A field type: a single DataType, a polymorphic set '[TYPE1, TYPE2]', or an array spec '(TYPE x N)'.", "type": "string", "examples": [ "STRING", "INTEGER", "[INTEGER, STRING]", "(STRING x 3)", "([INTEGER, STRING] x 4)" ] }, "multiTypeArrayDescriptor": { "description": "A heterogeneous array where each position may have a different type.", "type": "array", "items": { "type": "string", "description": "Type descriptor for the element at this index position." }, "minItems": 1, "examples": [ ["INTEGER", "STRING", "BOOLEAN"], ["[INTEGER, STRING]", "INTEGER"] ] }, "objectNodeDescriptor": { "type": "object", "description": "Describes an object structure node. Keys prefixed with '.' are data fields; keys prefixed with '#' are structural markers.", "propertyNames": { "pattern": "^(\\.[\\w][-\\w.]*(\\[\\*\\])*)|(#\\.?[-\\w.]*)$" }, "additionalProperties": { "oneOf": [ { "$ref": "#/$defs/typeDescriptor" }, { "$ref": "#/$defs/multiTypeArrayDescriptor" }, { "$ref": "#/$defs/structuralDataType" } ] } }, "arrayNodeDescriptor": { "description": "Describes a detached array node. 
Either a UniTypeArray string or a MultiTypeArray JSON array.", "oneOf": [ { "$ref": "#/$defs/typeDescriptor" }, { "$ref": "#/$defs/multiTypeArrayDescriptor" } ] }, "mixedNodeDescriptor": { "description": "A node exhibiting structural polymorphism — observed as both object and array. Element [0] is the object descriptor, element [1] is the array descriptor.", "type": "array", "prefixItems": [ { "$ref": "#/$defs/objectNodeDescriptor" }, { "$ref": "#/$defs/arrayNodeDescriptor" } ], "minItems": 2, "maxItems": 2 }, "nodeDescriptor": { "description": "A node descriptor: object, array, or mixed.", "oneOf": [ { "$ref": "#/$defs/objectNodeDescriptor" }, { "$ref": "#/$defs/arrayNodeDescriptor" }, { "$ref": "#/$defs/mixedNodeDescriptor" } ] } } } ``` ## Error responses | Status | Condition | Response Body | |--------|-------------------------------|--------------------------------------------| | 404 | Model not found | RFC 7807 Problem Detail with `entityName` and `entityVersion` in `properties` | | 400 | Invalid converter value | RFC 7807 Problem Detail with `parameter` and `invalidValue` in `properties` | **404 example:** ```json { "type": "about:blank", "title": "Not Found", "status": 404, "detail": "cannot find model entityName=nobel-prize, version=2", "instance": "/api/model/export/SIMPLE_VIEW/nobel-prize/2", "properties": { "entityName": "nobel-prize", "entityVersion": 2 } } ``` ## Key behaviors for consumers 1. **The root node `$` is always present** in the model. It represents the top-level object of the entity. 2. **Field ordering is deterministic.** Within each object node, data fields (`.` prefix) are sorted alphabetically, followed by structural fields (`#` prefix) also sorted alphabetically. Node paths in the `model` object are sorted lexicographically. 3. **Models evolve via merging.** As new entity instances are ingested, the model grows: new fields appear, types may widen (e.g., `INTEGER` → `[INTEGER, STRING]`), and array widths may increase. 
The SIMPLE_VIEW always reflects the cumulative model. 4. **Polymorphic types** use bracket notation `[TYPE1, TYPE2]` within a single string. Types within the brackets are sorted by the `DataType` enum ordering defined in cyoda-go (see `internal/domain/model/schema/types.go` for the authoritative rule): numeric types first (integer families, then decimal families), then text types (`STRING`, `CHARACTER`), then temporal, identifier, binary, boolean, and `NULL` last. 5. **UniTypeArray vs MultiTypeArray:** If all array elements have the same type, you get `(TYPE x N)`. If different positions have different types, you get a JSON array `["TYPE1", "TYPE2", ...]`. When element types converge through merging, a MultiTypeArray may simplify back to a UniTypeArray. 6. **Structural fields indicate nesting.** A `#.fieldName` entry with value `"OBJECT"` means that field has its own child node in the model map. A `#` entry with value `"ARRAY_ELEMENT"` means this node describes elements of its parent array. 7. **The SIMPLE_VIEW is round-trippable.** It can be exported and re-imported via the import endpoint (`POST /model/import/{dataFormat}/{converter}/{entityName}/{modelVersion}`, e.g., `POST /model/import/JSON/SIMPLE_VIEW/{entityName}/{modelVersion}`) without loss of structural information. --- ## reference/helm.md # Helm values cyoda-go Helm chart — narrative navigator over `cyoda help helm`. The `deploy/helm/cyoda` chart packages cyoda-go for Kubernetes. The chart's own `values.yaml` enumerates every configurable key; this page covers the model that shapes how those values map to Kubernetes objects and how secrets are provisioned.
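As a sketch of that shape, a hypothetical values excerpt — the key names here are invented for illustration; the chart's own `values.yaml` is the authoritative list:

```yaml
# Hypothetical excerpt; key names are illustrative, not the chart's actual schema.
image:
  tag: v0.6.2              # deployment value: shapes the Pod spec, never reaches the binary
replicaCount: 2            # deployment value
config:
  CYODA_HTTP_PORT: "8080"  # configuration value: rendered into the ConfigMap for the binary
existingSecret: cyoda-credentials  # mounted Secret; CYODA_*_FILE vars point into it
```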
## What the chart provisions A standard deployment per release: a `Deployment`, a `Service` for HTTP + gRPC + admin ports, a `ConfigMap` materialising the `.env`-format configuration, and a `Secret` for credential material. An optional `ServiceMonitor` (Prometheus Operator) and `HorizontalPodAutoscaler` are wired off per-value flags. ## Values model Values split into two groups: - **Configuration values** — map one-to-one to the `CYODA_*` env vars documented in the binary. Changing them alters runtime behaviour but not Kubernetes shape. See `cyoda help config` for the list. - **Deployment values** — image tag, replica count, resource requests, service type, ingress glue. These shape the Kubernetes objects the chart renders; they never reach the binary. ## Secret provisioning Credentials follow the same `_FILE` pattern as bare-metal deployments. The chart mounts the `Secret` at a known path and sets `CYODA_*_FILE` env vars accordingly, so the binary reads each secret at startup. You never put raw credentials into the chart's `values.yaml` in a production deployment — wire an existing `Secret` via `existingSecret` or equivalent. ## Related topics Run `cyoda help helm` for the full list of Helm topics and their synopses. See [Run → Kubernetes](/run/kubernetes/) for the deployment pattern and production-sizing guidance. --- ## reference/schemas.md # JSON Schemas Complete reference for all JSON schemas used in Cyoda This section documents the JSON schemas used by the Cyoda platform — CloudEvent payloads exchanged over the gRPC processing stream, plus the entity and model structures that travel over REST and gRPC. The schemas shown here were captured against **cyoda-go v0.6.2**. For the version you are running, `cyoda help cloudevents` (narrative) and `cyoda help cloudevents json` (machine-readable) on your own binary are authoritative — the binary ships its own schema tree and always matches its own code. ## Download Schemas You can download all schemas as a ZIP file: [schemas.zip](/schemas.zip) ## Schema Categories Browse [Common schemas](./common/) Browse [Entity schemas](./entity/) Browse [Model schemas](./model/) Browse [Processing schemas](./processing/) Browse [Search schemas](./search/) ## Using JSON Schemas JSON schemas define the structure and validation rules for data in the Cyoda platform. Each schema includes: - **Property definitions** with types and descriptions - **Required fields** clearly marked - **Validation rules** for data integrity - **References** to related schemas Navigate to any category above to explore the available schemas. --- ## reference/schemas/common.md JSON schemas in the Common category # Common Schemas This section contains JSON schemas for the Common category.
## Available Schemas - [ModelSpec](./model-spec/) - [ModelInfo](./model-info/) - [ModelConverterType](./model-converter-type/) - [ErrorCode](./error-code/) - [EntityMetadata](./entity-metadata/) - [EntityChangeMeta](./entity-change-meta/) - [DataPayload](./data-payload/) - [DataFormat](./data-format/) - [CloudEventType](./cloud-event-type/) - [BaseEvent](./base-event/) --- ## reference/schemas/common/base-event.md # BaseEvent Schema definition for BaseEvent ## Description This schema defines the structure and validation rules for BaseEvent. ## Properties - **error** (object): Error details (if present). - **id** (string, required): Event ID. - **success** (boolean): Flag indicating whether this message relates to a failure. - **warnings** (array): Warnings (if applicable). ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/cloud-event-type.md # CloudEventType Schema definition for CloudEventType ## Description This schema defines the structure and validation rules for CloudEventType. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/data-format.md # DataFormat Specifies the format of the input data (e.g., JSON). ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/data-payload.md # DataPayload Schema definition for DataPayload ## Description This schema defines the structure and validation rules for DataPayload. ## Properties - **data** (any): Payload data. - **meta** (any): Metadata for the payload.
- **type** (string, required): Payload type. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/entity-change-meta.md # EntityChangeMeta Metadata about entity changes, including transaction information and change type. ## Properties - **changeType** (string, required): Type of change that was made to the entity. - **fieldsChangedCount** (integer): Number of fields changed in the entity for this change. - **timeOfChange** (string, required): Timestamp when the change occurred. - **transactionId** (string): UUID of the transaction that made this change. - **user** (string, required): User who made the change. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/entity-metadata.md # EntityMetadata Metadata about an entity. id, modelKey and creationDate are invariant against the point-in-time. All other values are with respect to the as-at point-in-time for which the entity was retrieved. If the point-in-time was not explicitly set, the values correspond to the latest state of the entity.
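To make the as-at semantics concrete, a hypothetical instance — all values are invented for illustration, and the transition name is not from any real workflow:

```json
{
  "id": "00000000-0000-0000-0000-000000000001",
  "modelKey": { "name": "orders", "version": 1 },
  "creationDate": "2026-04-01T09:00:00Z",
  "lastUpdateTime": "2026-04-10T14:30:00Z",
  "state": "submitted",
  "pointInTime": "2026-04-14T00:00:00Z",
  "transactionId": "00000000-0000-0000-0000-0000000000aa",
  "transitionForLatestSave": "submit"
}
```

Here `id`, `modelKey`, and `creationDate` would be identical for any `pointInTime`; the remaining fields describe the entity as it stood at 2026-04-14.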
## Properties - **creationDate** (string, required): The creation date of the entity. - **id** (string, required): ID of the entity. - **lastUpdateTime** (string, required): The last time the entity was updated as-at the given point-in-time. Equals the creation date if the entity has not been updated. - **modelKey** (object): Model of the entity. - **pointInTime** (string): Optional value for the as-at point-in-time for which the entity was retrieved. - **state** (string, required): The state of the entity at the given point-in-time. - **transactionId** (string): The transaction id of the entity when last saved as-at the given point-in-time. - **transitionForLatestSave** (string): The transition applied to the entity when it was last saved as-at the given point-in-time. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/error-code.md # ErrorCode Schema definition for ErrorCode ## Description This schema defines the structure and validation rules for ErrorCode. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/model-converter-type.md # ModelConverterType Defines the type of converter to use when importing the model (e.g., SAMPLE_DATA to use a sample object). ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/model-info.md # ModelInfo Schema definition for ModelInfo ## Description This schema defines the structure and validation rules for ModelInfo.
## Properties - **id** (string, required): Id of the model. - **name** (string, required): Name of the model. - **state** (string, required): Current state of the model. - **version** (integer, required): Version of the model. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/model-spec.md # ModelSpec Schema definition for ModelSpec ## Description This schema defines the structure and validation rules for ModelSpec. ## Properties - **name** (string, required): Name of the model. - **version** (integer, required): Version of the model. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/statemachine.md JSON schemas in the Statemachine category # Statemachine Schemas This section contains JSON schemas for the Statemachine category. ## Available Schemas - [WorkflowInfo](./workflow-info/) - [TransitionInfo](./transition-info/) - [ProcessorInfo](./processor-info/) --- ## reference/schemas/common/statemachine/processor-info.md # ProcessorInfo Schema definition for ProcessorInfo ## Description This schema defines the structure and validation rules for ProcessorInfo. ## Properties - **id** (string, required): Processor ID. - **name** (string, required): Processor name. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/statemachine/transition-info.md # TransitionInfo Schema definition for TransitionInfo ## Description This schema defines the structure and validation rules for TransitionInfo. ## Properties - **id** (string, required): Transition ID. - **name** (string, required): Transition name. - **stateFrom** (string, required): Source state.
- **stateTo** (string, required): Target state. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/common/statemachine/workflow-info.md # WorkflowInfo Schema definition for WorkflowInfo ## Description This schema defines the structure and validation rules for WorkflowInfo. ## Properties - **id** (string, required): Workflow ID. - **name** (string, required): Workflow name. ## Related Schemas See other schemas in the [common](/reference/schemas/common/) category. --- ## reference/schemas/entity.md JSON schemas in the Entity category # Entity Schemas This section contains JSON schemas for the Entity category. ## Available Schemas - [EntityUpdateRequest](./entity-update-request/) - [EntityUpdatePayload](./entity-update-payload/) - [EntityUpdateCollectionRequest](./entity-update-collection-request/) - [EntityTransitionResponse](./entity-transition-response/) - [EntityTransitionRequest](./entity-transition-request/) - [EntityTransactionResponse](./entity-transaction-response/) - [EntityTransactionInfo](./entity-transaction-info/) - [EntityDeleteResponse](./entity-delete-response/) - [EntityDeleteRequest](./entity-delete-request/) - [EntityDeleteAllResponse](./entity-delete-all-response/) - [EntityDeleteAllRequest](./entity-delete-all-request/) - [EntityCreateRequest](./entity-create-request/) - [EntityCreatePayload](./entity-create-payload/) - [EntityCreateCollectionRequest](./entity-create-collection-request/) --- ## reference/schemas/entity/entity-create-collection-request.md # EntityCreateCollectionRequest Schema definition for EntityCreateCollectionRequest ## Description This schema defines the structure and validation rules for EntityCreateCollectionRequest.
## Properties - **dataFormat** (object, required): - **payloads** (array, required): Data payloads containing entities to save. - **transactionTimeoutMs** (integer): Indicates the timeout of the transaction for a transactional save. - **transactionWindow** (integer): The collection will be saved in a single transaction up to a maximum number of entities given by the transactionWindow. Collections exceeding the transactionWindow size will be saved in separate chunked transactions of the transactionWindow size. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-create-payload.md # EntityCreatePayload Schema definition for EntityCreatePayload ## Description This schema defines the structure and validation rules for EntityCreatePayload. ## Properties - **data** (any, required): Payload data. - **model** (object, required): Entity model to use for this payload. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-create-request.md # EntityCreateRequest Schema definition for EntityCreateRequest ## Description This schema defines the structure and validation rules for EntityCreateRequest. ## Properties - **dataFormat** (object, required): - **payload** (object, required): Data payload containing the entity to save. - **transactionTimeoutMs** (integer): Indicates the timeout of the transaction for a transactional save. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category.
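A hypothetical EntityCreateRequest body, assembled from the properties above — all values are illustrative, and the `"JSON"` spelling of `dataFormat` is an assumption (consult the DataFormat schema for the accepted representation):

```json
{
  "dataFormat": "JSON",
  "payload": {
    "data": { "customerId": "c-100", "total": 42.5 },
    "model": { "name": "orders", "version": 1 }
  },
  "transactionTimeoutMs": 30000
}
```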
--- ## reference/schemas/entity/entity-delete-all-request.md # EntityDeleteAllRequest Schema definition for EntityDeleteAllRequest ## Description This schema defines the structure and validation rules for EntityDeleteAllRequest. ## Properties - **model** (object, required): Information about the model. - **pageSize** (integer): Page size. - **pointInTime** (string): Point in time; i.e., delete all entities that existed prior to this point in time. - **transactionSize** (integer): Transaction size. - **verbose** (boolean): Include the list of deleted entity ids in the response. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-delete-all-response.md # EntityDeleteAllResponse Schema definition for EntityDeleteAllResponse ## Description This schema defines the structure and validation rules for EntityDeleteAllResponse. ## Properties - **entityIds** (array, required): IDs of the removed entities. - **errorsById** (object): Collections of errors by id, if any. - **modelId** (string, required): ID of the model. - **numDeleted** (integer, required): Number of deleted entities. - **requestId** (string, required): ID of the original request to get data. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-delete-request.md # EntityDeleteRequest Schema definition for EntityDeleteRequest ## Description This schema defines the structure and validation rules for EntityDeleteRequest. ## Properties - **entityId** (string, required): ID of the entity. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category.
--- ## reference/schemas/entity/entity-delete-response.md # EntityDeleteResponse Schema definition for EntityDeleteResponse ## Description This schema defines the structure and validation rules for EntityDeleteResponse. ## Properties - **entityId** (string, required): ID of the removed entity. - **model** (object, required): Information about the model of the removed entity. - **requestId** (string, required): ID of the original request to get data. - **transactionId** (string, required): ID of the transaction. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-transaction-info.md # EntityTransactionInfo Schema definition for EntityTransactionInfo ## Description This schema defines the structure and validation rules for EntityTransactionInfo. ## Properties - **entityIds** (array, required): IDs of entities in this transaction. - **transactionId** (string): ID of the transaction. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-transaction-response.md # EntityTransactionResponse Schema definition for EntityTransactionResponse ## Description This schema defines the structure and validation rules for EntityTransactionResponse. ## Properties - **requestId** (string, required): ID of the original request to save data. - **transactionInfo** (object, required): Entity transaction info. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category.
--- ## reference/schemas/entity/entity-transition-request.md # EntityTransitionRequest Schema definition for EntityTransitionRequest ## Description This schema defines the structure and validation rules for EntityTransitionRequest. ## Properties - **entityId** (string, required): ID of the entity. - **transition** (string, required): Name of the transition to apply. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-transition-response.md # EntityTransitionResponse Schema definition for EntityTransitionResponse ## Description This schema defines the structure and validation rules for EntityTransitionResponse. ## Properties - **availableTransitions** (array): Available transitions from the current state. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-update-collection-request.md # EntityUpdateCollectionRequest Schema definition for EntityUpdateCollectionRequest ## Description This schema defines the structure and validation rules for EntityUpdateCollectionRequest. ## Properties - **dataFormat** (object, required): - **payloads** (array, required): Data payloads containing entities to update. - **transactionTimeoutMs** (integer): Indicates the timeout of the transaction for a transactional save. - **transactionWindow** (integer): The collection will be saved in a single transaction up to a maximum number of entities given by the transactionWindow. Collections exceeding the transactionWindow size will be saved in separate chunked transactions of the transactionWindow size.
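The transactionWindow chunking rule reduces to simple arithmetic. A minimal sketch of that arithmetic (illustrative only — this is not the server implementation):

```python
def chunk_transactions(num_entities: int, transaction_window: int) -> list[int]:
    """Split a collection save into per-transaction sizes, per the
    documented transactionWindow rule (illustrative sketch)."""
    if transaction_window <= 0:
        # No window configured: the whole collection in one transaction.
        return [num_entities]
    full, rest = divmod(num_entities, transaction_window)
    return [transaction_window] * full + ([rest] if rest else [])

# 250 entities with a window of 100 -> three chunked transactions
print(chunk_transactions(250, 100))  # [100, 100, 50]
```

So a 250-entity collection with `transactionWindow: 100` is persisted as two transactions of 100 and one of 50.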
## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-update-payload.md # EntityUpdatePayload Schema definition for EntityUpdatePayload ## Description This schema defines the structure and validation rules for EntityUpdatePayload. ## Properties - **data** (any, required): Entity payload data. - **entityId** (string, required): ID of the entity. - **transition** (string): Transition to use for the update. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/entity/entity-update-request.md # EntityUpdateRequest Schema definition for EntityUpdateRequest ## Description This schema defines the structure and validation rules for EntityUpdateRequest. ## Properties - **dataFormat** (object, required): - **payload** (object, required): Data payload containing the entity to update. - **transactionTimeoutMs** (integer): Indicates the timeout of the transaction for a transactional save. ## Related Schemas See other schemas in the [entity](/reference/schemas/entity/) category. --- ## reference/schemas/model.md JSON schemas in the Model category # Model Schemas This section contains JSON schemas for the Model category.
## Available Schemas - [EntityModelTransitionResponse](./entity-model-transition-response/) - [EntityModelTransitionRequest](./entity-model-transition-request/) - [EntityModelImportResponse](./entity-model-import-response/) - [EntityModelImportRequest](./entity-model-import-request/) - [EntityModelGetAllResponse](./entity-model-get-all-response/) - [EntityModelGetAllRequest](./entity-model-get-all-request/) - [EntityModelExportResponse](./entity-model-export-response/) - [EntityModelExportRequest](./entity-model-export-request/) - [EntityModelDeleteResponse](./entity-model-delete-response/) - [EntityModelDeleteRequest](./entity-model-delete-request/) --- ## reference/schemas/model/entity-model-delete-request.md # EntityModelDeleteRequest Schema definition for EntityModelDeleteRequest ## Description This schema defines the structure and validation rules for EntityModelDeleteRequest. ## Properties - **model** (object, required): Entity model specification. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-delete-response.md # EntityModelDeleteResponse Schema definition for EntityModelDeleteResponse ## Description This schema defines the structure and validation rules for EntityModelDeleteResponse. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-export-request.md # EntityModelExportRequest Schema definition for EntityModelExportRequest ## Description This schema defines the structure and validation rules for EntityModelExportRequest.
## Properties - **converter** (object, required): - **model** (object, required): Entity model specification. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-export-response.md # EntityModelExportResponse Schema definition for EntityModelExportResponse ## Description This schema defines the structure and validation rules for EntityModelExportResponse. ## Properties - **model** (object, required): Entity model specification. - **modelId** (string): ID of the entity model. - **payload** (any, required): The content format of the exported entity model depends on the selected converter. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-get-all-request.md # EntityModelGetAllRequest Schema definition for EntityModelGetAllRequest ## Description This schema defines the structure and validation rules for EntityModelGetAllRequest. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-get-all-response.md # EntityModelGetAllResponse Schema definition for EntityModelGetAllResponse ## Description This schema defines the structure and validation rules for EntityModelGetAllResponse. ## Properties - **models** (array, required): Information about registered models. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category.
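An illustrative EntityModelGetAllResponse, with each element shaped per the common ModelInfo schema (id, name, state, version) — the IDs and the `"ACTIVE"` state string are invented for illustration:

```json
{
  "models": [
    { "id": "00000000-0000-0000-0000-000000000010", "name": "orders", "state": "ACTIVE", "version": 1 },
    { "id": "00000000-0000-0000-0000-000000000011", "name": "orders", "state": "ACTIVE", "version": 2 }
  ]
}
```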
--- ## reference/schemas/model/entity-model-import-request.md # EntityModelImportRequest Schema definition for EntityModelImportRequest ## Description This schema defines the structure and validation rules for EntityModelImportRequest. ## Properties - **converter** (object, required): - **dataFormat** (object, required): - **model** (object, required): Entity model specification. - **payload** (any, required): The data to be used for importing the model. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-import-response.md # EntityModelImportResponse Schema definition for EntityModelImportResponse ## Description This schema defines the structure and validation rules for EntityModelImportResponse. ## Properties - **modelId** (string, required): ID of the created or updated entity model. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/model/entity-model-transition-request.md # EntityModelTransitionRequest Schema definition for EntityModelTransitionRequest ## Description This schema defines the structure and validation rules for EntityModelTransitionRequest. ## Properties - **model** (object, required): Entity model specification. - **transition** (string, required): Specifies the transition to perform. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category.
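A hypothetical EntityModelTransitionRequest body — the model values and the transition name are invented for illustration; use the transitions your deployment actually defines:

```json
{
  "model": { "name": "orders", "version": 2 },
  "transition": "lock"
}
```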
--- ## reference/schemas/model/entity-model-transition-response.md # EntityModelTransitionResponse Schema definition for EntityModelTransitionResponse ## Description This schema defines the structure and validation rules for EntityModelTransitionResponse. ## Properties - **modelId** (string, required): ID of the entity model. - **state** (string, required): State of the entity model. ## Related Schemas See other schemas in the [model](/reference/schemas/model/) category. --- ## reference/schemas/processing.md JSON schemas in the Processing category # Processing Schemas This section contains JSON schemas for the Processing category. ## Available Schemas - [EventAckResponse](./event-ack-response/) - [EntityProcessorCalculationResponse](./entity-processor-calculation-response/) - [EntityProcessorCalculationRequest](./entity-processor-calculation-request/) - [EntityCriteriaCalculationResponse](./entity-criteria-calculation-response/) - [EntityCriteriaCalculationRequest](./entity-criteria-calculation-request/) - [CalculationMemberKeepAliveEvent](./calculation-member-keep-alive-event/) - [CalculationMemberJoinEvent](./calculation-member-join-event/) - [CalculationMemberGreetEvent](./calculation-member-greet-event/) --- ## reference/schemas/processing/calculation-member-greet-event.md # CalculationMemberGreetEvent Schema definition for CalculationMemberGreetEvent ## Description This schema defines the structure and validation rules for CalculationMemberGreetEvent. ## Properties - **joinedLegalEntityId** (string, required): ID of the legal entity under which this member has joined. - **memberId** (string, required): Assigned member ID. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category.
--- ## reference/schemas/processing/calculation-member-join-event.md # CalculationMemberJoinEvent Schema definition for CalculationMemberJoinEvent ## Description This schema defines the structure and validation rules for CalculationMemberJoinEvent. ## Properties - **tags** (array): Member tags; can be used to filter applicability. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/calculation-member-keep-alive-event.md # CalculationMemberKeepAliveEvent Schema definition for CalculationMemberKeepAliveEvent ## Description This schema defines the structure and validation rules for CalculationMemberKeepAliveEvent. ## Properties - **memberId** (string, required): Member ID. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/entity-criteria-calculation-request.md # EntityCriteriaCalculationRequest Schema definition for EntityCriteriaCalculationRequest ## Description This schema defines the structure and validation rules for EntityCriteriaCalculationRequest. ## Properties - **criteriaId** (string, required): Criteria ID. - **criteriaName** (string, required): Criteria name. - **entityId** (string, required): Entity ID. - **parameters** (any): Configured parameters, if any. - **payload** (object): - **processor** (object): Processor information, available for target PROCESSOR. - **requestId** (string, required): Request ID. - **target** (string, required): Target to which this condition is attached. NA is reserved for future cases. - **transactionId** (string): Transaction ID.
- **transition** (object): Transition information, available for targets TRANSITION and PROCESSOR. - **workflow** (object): Workflow information, available for targets WORKFLOW, PROCESSOR, TRANSITION. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/entity-criteria-calculation-response.md # EntityCriteriaCalculationResponse Schema definition for EntityCriteriaCalculationResponse ## Description This schema defines the structure and validation rules for EntityCriteriaCalculationResponse. ## Properties - **entityId** (string, required): Entity ID. - **matches** (boolean): Criteria check result. - **reason** (string): Reason for the criteria check result. - **requestId** (string, required): ID of the original criteria calculation request. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/entity-processor-calculation-request.md # EntityProcessorCalculationRequest Schema definition for EntityProcessorCalculationRequest ## Description This schema defines the structure and validation rules for EntityProcessorCalculationRequest. ## Properties - **entityId** (string, required): Entity ID. - **parameters** (any): Configured parameters, if any. - **payload** (object): - **processorId** (string, required): Processor ID. - **processorName** (string, required): Processor name. - **requestId** (string, required): Request ID. - **transactionId** (string): Transaction ID. - **transition** (object): Transition information. - **workflow** (object, required): Workflow information. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category.
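A processor implementation receives an EntityProcessorCalculationRequest and replies with an EntityProcessorCalculationResponse that echoes the request and entity IDs. A minimal sketch of that pairing — the dict shapes mirror the required properties documented here, while the function name and placeholder values are illustrative:

```python
def build_processor_response(request: dict, new_payload: dict) -> dict:
    """Build a response for a processor calculation request: echo the
    requestId and entityId, attach the (possibly modified) payload.
    Sketch only; the helper name is hypothetical."""
    return {
        "requestId": request["requestId"],  # ID of the original calculation request
        "entityId": request["entityId"],
        "payload": new_payload,
    }

# Example request with placeholder values for the required properties
request = {
    "requestId": "req-1",
    "entityId": "entity-1",
    "processorId": "proc-1",
    "processorName": "enrich-order",
    "workflow": {},
    "payload": {"total": 10},
}
response = build_processor_response(request, {"total": 10, "enriched": True})
```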
--- ## reference/schemas/processing/entity-processor-calculation-response.md # EntityProcessorCalculationResponse Schema definition for EntityProcessorCalculationResponse ## Description This schema defines the structure and validation rules for EntityProcessorCalculationResponse. ## Properties - **entityId** (string, required): Entity ID. - **payload** (object): - **requestId** (string, required): ID of the original calculation request. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/processing/event-ack-response.md # EventAckResponse Schema definition for EventAckResponse ## Description This schema defines the structure and validation rules for EventAckResponse. ## Properties - **sourceEventId** (string, required): ID of the original event. ## Related Schemas See other schemas in the [processing](/reference/schemas/processing/) category. --- ## reference/schemas/search.md JSON schemas in the Search category # Search Schemas This section contains JSON schemas for search.
## Available Schemas - [SnapshotGetStatusRequest](./snapshot-get-status-request/) - [SnapshotGetRequest](./snapshot-get-request/) - [SnapshotCancelRequest](./snapshot-cancel-request/) - [SearchSnapshotStatus](./search-snapshot-status/) - [EntityStatsResponse](./entity-stats-response/) - [EntityStatsGetRequest](./entity-stats-get-request/) - [EntityStatsByStateResponse](./entity-stats-by-state-response/) - [EntityStatsByStateGetRequest](./entity-stats-by-state-get-request/) - [EntitySnapshotSearchResponse](./entity-snapshot-search-response/) - [EntitySnapshotSearchRequest](./entity-snapshot-search-request/) - [EntitySearchRequest](./entity-search-request/) - [EntityResponse](./entity-response/) - [EntityGetRequest](./entity-get-request/) - [EntityGetAllRequest](./entity-get-all-request/) - [EntityChangesMetadataResponse](./entity-changes-metadata-response/) - [EntityChangesMetadataGetRequest](./entity-changes-metadata-get-request/) --- ## reference/schemas/search/entity-changes-metadata-get-request.md # EntityChangesMetadataGetRequest Schema definition for EntityChangesMetadataGetRequest ## Description This schema defines the structure and validation rules for EntityChangesMetadataGetRequest. ## Properties - **entityId** (string, required): ID of the entity to retrieve change history for. - **pointInTime** (string): Point in time to retrieve the entity changes. If not provided, retrieves all changes up to the current consistency time. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category.
--- ## reference/schemas/search/entity-changes-metadata-response.md # EntityChangesMetadataResponse Schema definition for EntityChangesMetadataResponse ## Description This schema defines the structure and validation rules for EntityChangesMetadataResponse. ## Properties - **changeMeta** (object, required): Metadata about a single entity change. - **requestId** (string, required): ID of the original request to get entity changes metadata. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-get-all-request.md # EntityGetAllRequest Schema definition for EntityGetAllRequest ## Description This schema defines the structure and validation rules for EntityGetAllRequest. ## Properties - **model** (object, required): Information about the model to search. - **pageNumber** (integer): Page number (from 0). - **pageSize** (integer): Page size. - **pointInTime** (string): Point in time to retrieve the entities. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-get-request.md # EntityGetRequest Schema definition for EntityGetRequest ## Description This schema defines the structure and validation rules for EntityGetRequest. ## Properties - **entityId** (string, required): ID of the entity. - **pointInTime** (string): Point in time to retrieve the entity. If not provided, retrieves the entity at the current consistency time. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category.
--- ## reference/schemas/search/entity-response.md # EntityResponse Schema definition for EntityResponse ## Description This schema defines the structure and validation rules for EntityResponse. ## Properties - **payload** (object, required): Payload with entity data and meta information. - **requestId** (string, required): ID of the original request to get data. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-search-request.md # EntitySearchRequest Schema definition for EntitySearchRequest ## Description This schema defines the structure and validation rules for EntitySearchRequest. ## Properties - **condition** (object, required): Query condition to use for building this snapshot. - **limit** (integer): The maximum number of rows to return. - **model** (object, required): Entity model to use for building this snapshot. - **pointInTime** (string): Point in time for the search. - **timeoutMillis** (integer): The maximum time to wait in milliseconds for the query to complete. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-snapshot-search-request.md # EntitySnapshotSearchRequest Schema definition for EntitySnapshotSearchRequest ## Description This schema defines the structure and validation rules for EntitySnapshotSearchRequest. ## Properties - **condition** (object, required): Query condition to use for building this snapshot. - **model** (object, required): Entity model to use for building this snapshot. - **pointInTime** (string): Point in time for the snapshot. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category.
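The snapshot schemas in this category fit together as a three-step flow: submit an EntitySnapshotSearchRequest, poll progress with a SnapshotGetStatusRequest, then page through results with a SnapshotGetRequest. A sketch of the request payloads — the property names follow the schemas in this section, but the nested `model`/`condition` shapes and all values are placeholders, not confirmed wire formats:

```python
# Step 1: ask the platform to build a snapshot (EntitySnapshotSearchRequest)
snapshot_search = {
    "model": {"name": "orders", "version": 1},        # shape is illustrative
    "condition": {"field": "state", "value": "submitted"},  # shape is illustrative
}

# Step 2: poll until the snapshot status is ready (SnapshotGetStatusRequest)
status_poll = {
    "snapshotId": "snapshot-1",  # placeholder; returned by the platform
}

# Step 3: page through the collected entities (SnapshotGetRequest)
fetch_page = {
    "snapshotId": "snapshot-1",
    "pageNumber": 0,   # pages are numbered from 0
    "pageSize": 100,
}
```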
--- ## reference/schemas/search/entity-snapshot-search-response.md # EntitySnapshotSearchResponse Schema definition for EntitySnapshotSearchResponse ## Description This schema defines the structure and validation rules for EntitySnapshotSearchResponse. ## Properties - **status** (object): Status information for the requested snapshot. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-stats-by-state-get-request.md # EntityStatsByStateGetRequest Schema definition for EntityStatsByStateGetRequest ## Description This schema defines the structure and validation rules for EntityStatsByStateGetRequest. ## Properties - **model** (object): Optional specifier of the Entity model to calculate statistics for. - **pointInTime** (string): The point-in-time for statistics in ISO 8601 format. Defaults to current consistency time if not provided. - **states** (array): Optional list of states for which to calculate statistics. If not provided, statistics will be calculated for all current workflow states. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-stats-by-state-response.md # EntityStatsByStateResponse Schema definition for EntityStatsByStateResponse ## Description This schema defines the structure and validation rules for EntityStatsByStateResponse. ## Properties - **count** (integer, required): Entity count for this model and state. - **modelName** (string, required): Entity model name. - **modelVersion** (integer, required): Entity model version. - **requestId** (string, required): ID of the original request.
- **state** (string, required): Entity state. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-stats-get-request.md # EntityStatsGetRequest Schema definition for EntityStatsGetRequest ## Description This schema defines the structure and validation rules for EntityStatsGetRequest. ## Properties - **model** (object): Optional specifier of the Entity model to calculate statistics for. - **pointInTime** (string): The point-in-time for statistics in ISO 8601 format. Defaults to current consistency time if not provided. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/entity-stats-response.md # EntityStatsResponse Schema definition for EntityStatsResponse ## Description This schema defines the structure and validation rules for EntityStatsResponse. ## Properties - **count** (integer, required): Entity count for this model. - **modelName** (string, required): Entity model name. - **modelVersion** (integer, required): Entity model version. - **requestId** (string, required): ID of the original request. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/search-snapshot-status.md # SearchSnapshotStatus Schema definition for SearchSnapshotStatus ## Description This schema defines the structure and validation rules for SearchSnapshotStatus. ## Properties - **entitiesCount** (integer): Number of entities collected. - **expirationDate** (string): Expiration date of the snapshot. - **snapshotId** (string, required): ID of the snapshot. - **status** (string, required): Status of the snapshot.
## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/snapshot-cancel-request.md # SnapshotCancelRequest Schema definition for SnapshotCancelRequest ## Description This schema defines the structure and validation rules for SnapshotCancelRequest. ## Properties - **snapshotId** (string, required): ID of the snapshot to cancel. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/snapshot-get-request.md # SnapshotGetRequest Schema definition for SnapshotGetRequest ## Description This schema defines the structure and validation rules for SnapshotGetRequest. ## Properties - **clientPointTime** (string): Point in time to retrieve the results. - **pageNumber** (integer): Page number (from 0). - **pageSize** (integer): Page size. - **snapshotId** (string, required): ID of the snapshot to retrieve data. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/schemas/search/snapshot-get-status-request.md # SnapshotGetStatusRequest Schema definition for SnapshotGetStatusRequest ## Description This schema defines the structure and validation rules for SnapshotGetStatusRequest. ## Properties - **snapshotId** (string, required): ID of the snapshot. ## Related Schemas See other schemas in the [search](/reference/schemas/search/) category. --- ## reference/trino.md # Trino SQL surface How entities are projected into virtual SQL tables — naming rules, type mapping, polymorphic fields, and JDBC. Cyoda exposes an analytical SQL surface through a Trino connector.
Every entity model is projected into a set of **virtual SQL tables** so that nested JSON/XML data can be queried with ordinary relational SQL — no pre-flattening required. This page documents the projection rules, the table naming convention, the column categories, the JSON-to-SQL type mapping, and how polymorphic fields are handled. For the conceptual framing of the three Cyoda surfaces (REST, gRPC, Trino SQL), see [APIs and surfaces](/concepts/apis-and-surfaces/). ## Core concepts Cyoda represents data (JSON/XML) as hierarchical tree structures. Internally, these trees are decomposed into a collection of `Nodes`, each capturing a specific branch or subset of the data. This decomposition provides a uniform representation for querying and traversal — whether through API-level path queries or via SQL through the Trino connector. In the Trino view, each node is exposed as a **virtual SQL table**, allowing nested structures to be queried using familiar relational syntax without flattening the original hierarchy. **Key principle**: each `Node` corresponds to exactly **one** SQL table. ### Node model Each Node has: - **path**: a string identifying the node's position in the hierarchy (e.g. `$`, `$.organization.clients`, `$.agreement_data`) - **fields**: a map of field names to values, where values can be: - primitive types (string, number, boolean, date, etc.) - 1-dimensional arrays of primitives The collection of these nodes forms the internal data model that underpins the Trino connector. The `TreeNodeEntity` is the object type that encapsulates this node collection when interacting with the Cyoda API, but it is a byproduct of this broader structural model rather than its defining feature. ### Node creation rules Nodes are derived directly from the structure of the data (such as JSON/XML): 1. **Root node**: always created with path `$` containing top-level fields. 2. **Array of objects**: each array of objects creates a new node. 3. 
**Multidimensional arrays**: each dimension beyond one creates a further node to preserve structural depth. This consistent mapping enables Cyoda to represent, navigate, and query arbitrarily nested data structures in a predictable and composable way. #### Example: node structure Given a JSON file with these paths: ``` $.organization.name $.organization.address[] $.organization.clients[].name $.organization.clients[].address[] $.quarterly_metrics[][] ``` The system creates the following nodes: **Node 1** (Root): - Path: `$` - Fields: `.organization.name`, `.organization.address[]` **Node 2** (Clients array): - Path: `$.organization.clients[*]` - Fields: `.name`, `.address[]` **Node 3** (Quarterly metrics — 2D array, detached): - Path: `$.quarterly_metrics[*]` - Fields: `[*]` (detached array containing the inner array elements) - This is a detached array because `quarterly_metrics` is a 2-dimensional array. For example: ```json { "organization": { "name": "Acme Corp", "address": [ "123 Market Street", "Suite 400", "San Francisco, CA 94105" ], "clients": [ { "name": "Client A", "address": ["10 First Ave", "Seattle, WA 98101"] }, { "name": "Client B", "address": ["200 Second St", "Portland, OR 97204"] } ] }, "quarterly_metrics": [ [1000, 1200, 900, 1100], [1300, 1400, 1250, 1500] ] } ``` ### Tree decomposition Here is a visual representation of the node structure for the example above, where the corresponding SQL tables are labelled ORGANIZATION, CLIENTS, and METRICS: ```mermaid graph TD %% STYLES classDef highlighted fill:none,stroke:#5A18AC,stroke-width:3px,rx:8,ry:8 classDef normal fill:#F5FAF9,stroke:#4FB8B0,stroke-width:2px,rx:8,ry:8 %% Root Node A["$(root node -> ORGANIZATION)"] --> B["organization"] A --> K["quarterly_metrics[][] (detached 2D array node -> METRICS)"] %% Organization branch B --> C["organization.name"] B --> D["organization.address[]"] B --> E["organization.clients[] (clients node -> CLIENTS)"] %% Clients branch E -->
F["organization.clients[].name"] E --> G["organization.clients[].address[]"] %% Quarterly metrics branch K --> L["quarterly_metrics[*] (outer arrays)"] L --> M["quarterly_metrics[*][*] (individual cell values)"] %% Apply highlighting to specific nodes class A,K,E highlighted class B,C,D,F,G,L,M normal ``` ### Embedded arrays and detached arrays When JSON/XML contains one-dimensional arrays of primitives within objects, the system does not create a separate node for such arrays; they stay within the parent node. In the table they appear either as a single column of array type (e.g. `field_array` — ARRAY[STRING]) or as multiple flattened columns (e.g. `field_0` — STRING, `field_1` — STRING, ...), depending on the `flatten_array` flag for this field in the Trino schema settings. That setting also works for system fields, such as `index`. When JSON/XML contains multidimensional arrays (arrays of arrays), the system creates separate nodes for each dimension after the first. This process is called **array detachment**. #### Understanding detached arrays A **detached array** is created when an array contains other arrays as elements. Each additional dimension becomes a separate node with its own table. The primary motivation behind this approach: if a JSON contains a table-like structure — i.e. a 2-dimensional array of primitives — it should be represented as a table in Trino. This logic was then extended and generalized to work for arrays of any number of dimensions larger than 1.
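The node-creation rules above can be sketched in code: nested objects fold into their owning node's field map, while arrays of objects and multidimensional arrays open new nodes. A simplified sketch, assuming only the cases covered so far (arrays of objects and 2-D arrays; deeper and mixed dimensions are omitted) — the function name is illustrative, not a Cyoda API:

```python
def collect_node_paths(doc: dict) -> set:
    """Collect the node paths the decomposition would create for a JSON
    document. Simplified sketch: primitives and 1-D primitive arrays stay
    in the owning node; arrays of objects and 2-D arrays open new nodes."""
    paths = {"$"}  # the root node always exists

    def is_primitive(value):
        return not isinstance(value, (dict, list))

    def visit(value, node_path, rel):
        # node_path: path of the node that owns this field
        # rel: field key relative to that node, e.g. ".organization.name"
        if isinstance(value, dict):
            # nested objects fold into the owning node's field map
            for key, child in value.items():
                visit(child, node_path, f"{rel}.{key}")
        elif isinstance(value, list):
            if all(is_primitive(e) for e in value):
                return  # 1-D primitive array: stays as an array column
            child_path = f"{node_path}{rel}[*]"
            paths.add(child_path)  # array of objects, or detached array node
            for element in value:
                if isinstance(element, dict):
                    for key, child in element.items():
                        visit(child, child_path, f".{key}")

    for key, value in doc.items():
        visit(value, "$", f".{key}")
    return paths

# The Acme Corp example yields the three nodes described earlier:
acme = {
    "organization": {
        "name": "Acme Corp",
        "address": ["123 Market Street", "Suite 400"],
        "clients": [
            {"name": "Client A", "address": ["10 First Ave"]},
            {"name": "Client B", "address": ["200 Second St"]},
        ],
    },
    "quarterly_metrics": [[1000, 1200, 900, 1100], [1300, 1400, 1250, 1500]],
}
nodes = collect_node_paths(acme)
# → {"$", "$.organization.clients[*]", "$.quarterly_metrics[*]"}
```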
#### Example 1: simple 2D array Consider this JSON with a 2-dimensional array: ```json { "matrix": [ [1, 2, 3], [4, 5, 6] ] } ``` **Nodes created:** **Node 1** (Root — `$`): - Path: `$` - Fields: (none — the matrix field is an array of arrays, so it's not stored here) **Node 2** (First dimension — `$.matrix[*]`): - Path: `$.matrix[*]` - Fields: `[1]`, `[2]`, `[3]` (these are detached array fields containing values from the inner arrays, which are represented as rows) - This node is marked as having a detached array **Generated tables:** **Table 1: `mydata`** (from node `$`) - Contains only special and root columns (no data fields in this case) **Table 2: `mydata_matrix_array`** (from node `$.matrix[*]` — detached array) - Table name breakdown: - `mydata` = entity name - `_matrix` = field name from path - `_array` = suffix indicating this is a detached array table - Columns: - `entity_id` (UUID) - `point_time` (DATE) - Root columns (creation_date, last_update_date, state) - `index_0` (INTEGER) — Position in the outer array (0 or 1 in this example) - `element_0`, `element_1`, `element_2` (INTEGER) — The three elements of each inner array #### Example 2: 3D array For a 3-dimensional array: ```json { "cube": [ [[1, 2], [3, 4]], [[5, 6], [7, 8]] ] } ``` **Nodes created:** **Node 1** (Root — `$`): - No data fields **Node 2** (Second dimension — `$.cube[*][*]`, no separate layer for the first dimension in this case): - Fields: `[1]`, `[2]` (matching the two elements of each innermost array) **Generated tables:** **Table 1: `mydata`** (from node `$`) **Table 2: `mydata_cube_2d_array`** (from node `$.cube[*][*]`.
Note that `2d` in this case represents the number of collapsed dimensions, while the 3rd dimension is detached) - Columns include `index_0` and `index_1` for positions in both dimensions - `element_0`, `element_1` for the two primitive values #### Example 3: array of objects with nested arrays ```json { "data": [ { "name": "item1", "values": [10, 20, 30] }, { "name": "item2", "values": [40, 50] } ] } ``` **Nodes created:** **Node 1** (Root — `$`): - Path: `$` **Node 2** (Array of objects — `$.data[*]`): - Path: `$.data[*]` - Fields: `.name`, `.values[]` **Generated tables:** **Table 1: `mydata`** (from node `$`) **Table 2: `mydata_data`** (from node `$.data[*]`) - Columns: - Special and root columns - `index_0` (INTEGER) — Position in the data array - `name` (STRING) — From `.name` field - `values_array` (ARRAY[INTEGER]) or `values_0`, `values_1`, `values_2` (INTEGER) depending on the `flatten_array` flag for this field in the Trino schema settings #### Variable-dimension arrays (mixed depths) The system can handle arrays where elements have different nesting depths — some elements are primitives while others are arrays. 
**Example:** ```json { "data": [ 1, [2, 3], [4, 5, [6, 7], [8, 9]] ] } ``` This is a **polymorphic array** containing: - A primitive value: `1` - A 1-dimensional array: `[2, 3]` - A 2-dimensional array: `[4, 5, [6, 7], [8, 9]]` **Nodes created:** **Node 1** (Root — `$`) **Node 2** (Mixed node — `$.data[*]`) **Node 3** (Mixed node — `$.data[*][*]`) **Generated tables:** **Table 1: `mydata`** (from node `$`) - Columns: - Special and root columns - `data[*]` (ARRAY[INTEGER]) — Field for the top-level array elements (one row with array value `[1]` for the current example) **Table 2: `mydata_data_array`** (from node `$.data[*]`) - Contains rows for second-level array elements (2 rows for the current example) - Columns: - Special and root columns - `index_0` (INTEGER) — Position in the data array - `element_0`, `element_1` — columns for primitive values at the second level (for values `2`, `3`, `4`, `5` presented in 2 rows) **Table 3: `mydata_data_2d_array`** (from array part of `$.data[*][*]`) - Contains the elements from nested arrays - Columns: - Special and root columns - `index_0`, `index_1` (INTEGER) — Positions in the outer and nested arrays - `element_0`, `element_1` — Elements from the third-level arrays (for values `6`, `7`, `8`, `9` presented in 2 rows, with index values `2,2` and `2,3`) --- ## SQL table generation ### Table naming convention **Important**: each node in your `TreeNodeEntity` is mapped to exactly **one** SQL table. Table names are generated using the following rules: 1. **Base name**: Entity model name (e.g., `prizes`, `companies_details`) 2. **Version suffix** (if version > 1): `_<version>` (e.g., `_2`, `_3`) 3. **Path suffix** (for non-root nodes): Derived from the node path with `.` and `#` replaced by `_` - Example: `$.prizes[*]` → `_prizes` - Example: `$.prizes[*].laureates[*]` → `_prizes_laureates` 4.
**Multidimensional suffix**: `_<N>d`, where N is the number of `][` sequences in the *node path* + 1 (Note that this suffix is derived from the node path, not the field path — it represents the number of collapsed dimensions. So if we are dealing with a 3-dimensional array of primitives, the suffix will be `_2d`, as the last dimension is expanded into columns.) - Example: `$.data[*][*]` has one `][` → `_2d` - Example: `$.cube[*][*][*]` has two `][` → `_3d` 5. **Detached array suffix**: `_array` added when the node represents a detached array - This happens for multidimensional arrays where inner dimensions are "detached". #### Table naming examples | Entity Name | Version | Node Path | Is Detached Array? | Table Name | Explanation | |-------------|---------|-----------|-------------------|------------|-------------| | `prizes` | 1 | `$` | No | `prizes` | Root node | | `prizes` | 2 | `$` | No | `prizes_2` | Root node, version 2 | | `prizes` | 1 | `$.prizes[*]` | No | `prizes_prizes` | Array of objects | | `prizes` | 1 | `$.prizes[*].laureates[*]` | No | `prizes_prizes_laureates` | Nested array of objects | | `companies` | 1 | `$.matrix[*]` | Yes | `companies_matrix_array` | 2D array — detached | | `companies` | 1 | `$.cube[*][*]` | Yes | `companies_cube_2d_array` | 3D array — has `][` so gets `_2d`, plus `_array` | This is the default schema-generation naming; any of these names can be changed manually in the schema settings. ### Special JSON table In addition to the structured tables, every entity model gets a special **JSON table** that contains the complete reconstructed JSON for each entity: - **Table name**: `<entity>_json` (e.g., `prizes_json`) - **Purpose**: allows you to retrieve the full original JSON document :::caution[Performance] When querying the JSON table, **always include `entity_id` in your WHERE clause** for optimal performance. Without this predicate, the query may be significantly slower, especially with large datasets.
::: **Good practice:** ```sql SELECT entity FROM prizes_json WHERE entity_id = '<entity-id>'; ``` **Avoid:** ```sql SELECT entity FROM prizes_json; -- This will be slow! ``` --- ## Table columns Every SQL table contains several categories of columns. ### 1. Special columns These are system-generated columns available in all tables: | Column Name | Data Type | Description | |-------------|-----------|-------------| | `entity_id` | UUID | Unique identifier for the entity (the loaded JSON/XML file) | | `point_time` | DATE | Allows querying data as it existed at a specific point in time | The JSON table also includes: - `entity` (STRING): the complete reconstructed JSON document. ### 2. Root columns These columns provide metadata about the entity: | Column Name | Source Field | Data Type | Description | |-------------|--------------|-----------|-------------| | `creation_date` | `creationDate` | DATE | When the entity was created in the system | | `last_update_date` | `lastUpdateTime` | DATE | When the entity was last modified | | `state` | `state` | STRING | Current workflow state of the entity | ### 3. Index columns For tables representing array (object or detached) elements (depth > 0), an `index` column is provided: - **Column name**: `index` - **Purpose**: identifies the position of this row in the array hierarchy - **Structure**: can be flattened into individual columns (`index_0`, `index_1`, etc.) for multidimensional arrays. Flattened by default. **Example**: For `$.prizes[*].laureates[*]`: - `index_0`: Position in the `prizes` array - `index_1`: Position in the `laureates` array within that prize ### 4. Data columns These are the actual fields from your JSON/XML data.
#### Primitive fields Simple fields are mapped directly to columns: | JSON Path | Field Key | Column Name | Data Type | |-----------|-----------|-------------|-----------| | `$.organization.name` | `.organization.name` | `organization_name` | STRING | | `$.prizes[*].year` | `.year` | `year` | STRING | | `$.prizes[*].laureates[*].id` | `.id` | `id` | STRING | **Naming rules:** - Leading `.` is removed from the field key. - Reserved field names (like `index`) are prefixed with `_` (e.g., `_index`). #### Array fields 1-dimensional arrays of primitives are handled in two ways: **Option 1: Array column** (default for homogeneous arrays) - Column name: `<field>_array` - Data type: `ARRAY[<type>]` - Example: `.addresses[]` → `addresses_array` (ARRAY[STRING]) **Option 2: Flattened columns** (for multi-type or ZONED_DATE_TIME arrays) - Multiple columns created: `<field>_0`, `<field>_1`, etc. - Each column represents one position in the array. - Example: `.scores[]` → `scores_0`, `scores_1`, `scores_2` (if array has 3 elements) --- ## Supported data types The system supports the following data types, which are mapped to appropriate SQL types for Trino queries. Understanding how these types are detected from JSON is crucial for working with your data effectively. **Important**: All data is stored internally in Cyoda with full precision. Trino provides a SQL query interface to this data, and some types (like `UNBOUND_DECIMAL` and `UNBOUND_INTEGER`) are represented as strings in Trino due to Trino's numeric limitations, even though they are stored as numbers in Cyoda.
### Data type reference table | System Type | SQL Type | Description | JSON Detection | |-------------|----------|-------------|----------------| | STRING | VARCHAR | Text data (max 1024 characters) | Text values, or values that can't be parsed as other types | | CHAR | CHAR | Single character | Single-character strings | | BYTE | TINYINT | 8-bit integer (-128 to 127) | Integers in byte range | | SHORT | SMALLINT | 16-bit integer (-32,768 to 32,767) | Integers in short range | | INT | INTEGER | 32-bit integer | Integers in int range | | LONG | BIGINT | 64-bit integer | Integers in long range | | FLOAT | REAL | Single-precision floating point | Decimals with ≤6 digits precision and scale ≤31 | | DOUBLE | DOUBLE | Double-precision floating point | Decimals with ≤15 digits precision and scale ≤292 | | BIG_DECIMAL | DECIMAL(38,18) | High-precision decimal (Trino Int128) | Decimals that fit in Int128 with scale ≤18 | | UNBOUND_DECIMAL | VARCHAR | Very large/precise decimals (Trino representation) | Decimals exceeding BIG_DECIMAL limits | | BIG_INTEGER | DECIMAL(38,0) | Large integer (Trino Int128) | Integers that fit in Int128 range | | UNBOUND_INTEGER | VARCHAR | Very large integers (Trino representation) | Integers exceeding BIG_INTEGER limits | | BOOLEAN | BOOLEAN | True/false values | JSON boolean values | | LOCAL_DATE | DATE | Date without time | ISO date strings (e.g., "2024-01-15") | | LOCAL_TIME | TIME | Time without date | ISO time strings (e.g., "14:30:00") | | LOCAL_DATE_TIME | TIMESTAMP | Date and time without timezone | ISO datetime strings | | ZONED_DATE_TIME | TIMESTAMP WITH TIME ZONE | Date and time with timezone | ISO datetime with timezone | | UUID | UUID | Universally unique identifier | Valid UUID strings | | TIME_UUID | UUID | Time-based UUID (version 1) | UUID version 1 strings | | BYTE_ARRAY | VARBINARY | Binary data | Base64-encoded strings | ### How number types are detected from JSON When the system parses JSON numbers, it follows a specific 
detection algorithm to determine the most appropriate type. #### Integer detection For whole numbers (no decimal point), the system checks in this order: 1. **BYTE** — if the value is between -128 and 127 2. **SHORT** — if the value is between -32,768 and 32,767 3. **INT** — if the value is between -2,147,483,648 and 2,147,483,647 4. **LONG** — if the value is between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807 5. **BIG_INTEGER** — if the value fits in Int128 range (see below) 6. **UNBOUND_INTEGER** — for integers larger than Int128 **Example:** ```json { "small": 42, // → BYTE "medium": 1000, // → SHORT "large": 100000, // → INT "veryLarge": 10000000000, // → LONG "huge": 123456789012345678901234567890 // → BIG_INTEGER or UNBOUND_INTEGER } ``` #### Decimal detection For numbers with decimal points, the system checks in this order: 1. **FLOAT** — if precision ≤ 6 digits and scale is between -31 and 31 2. **DOUBLE** — if precision ≤ 15 digits and scale is between -292 and 292 3. **BIG_DECIMAL** — if the value fits in Int128 decimal constraints (see below) 4. **UNBOUND_DECIMAL** — for decimals exceeding BIG_DECIMAL limits **Example:** ```json { "price": 19.99, // → FLOAT (6 digits precision) "precise": 123.456789012345, // → DOUBLE (15 digits precision) "veryPrecise": 123456789012345.123456789012345, // → BIG_DECIMAL "extremelyPrecise": 1.23456789012345678901234567890123456789 // → UNBOUND_DECIMAL } ``` ### Understanding BIG_DECIMAL and UNBOUND_DECIMAL #### BIG_DECIMAL (Trino Int128 decimal) **BIG_DECIMAL** is bounded by Trino's maximum numeric capacity, which uses a 128-bit integer (Int128) with a **fixed scale of 18**. **Constraints:** - **Scale**: must be ≤ 18 decimal places - **Precision**: must be ≤ 38 total digits (with some complexity — see below) - **Range**: approximately ±170,141,183,460,469,231,731.687303715884105727 **Detailed precision rules:** The system uses two precision checks: 1. 
**Strict check**: `precision ≤ 38` AND `exponent ≤ 20` (where exponent = precision - scale)
2. **Loose check**: `precision ≤ 39` AND `exponent ≤ 21` AND the value fits when scaled to 18 decimal places

**Examples:**

```json
{
  "fits": 12345678901234567890.123456789012345678,       // ✓ BIG_DECIMAL (38 digits, scale 18)
  "tooManyDecimals": 123.1234567890123456789,            // ✗ UNBOUND_DECIMAL (scale > 18)
  "tooLarge": 999999999999999999999.999999999999999999   // ✗ UNBOUND_DECIMAL (exceeds Int128)
}
```

#### UNBOUND_DECIMAL

**UNBOUND_DECIMAL** is used for decimal values that exceed Trino's numeric representation limits.

**Storage vs representation:**

- **In Cyoda**: stored as full-precision BigDecimal numbers with all numeric operations available in workflows.
- **In Trino SQL**: represented as VARCHAR (strings) due to Trino's Int128 limitations.

**When a decimal becomes UNBOUND_DECIMAL:**

- Scale > 18 decimal places.
- Total value exceeds Int128 range.
- Precision and exponent exceed the limits.

**Important for SQL queries**: when querying UNBOUND_DECIMAL columns in Trino, treat them as VARCHAR, not as numeric types. Numeric operations are not available in SQL queries for these fields.

```sql
-- Correct: treat as string in Trino SQL
SELECT * FROM mytable WHERE unbound_decimal_field = '123.12345678901234567890123456789';

-- Incorrect: cannot use numeric operations in Trino SQL
SELECT * FROM mytable WHERE unbound_decimal_field > 100; -- This will fail!

-- Note: numeric operations ARE available in Cyoda workflows, just not in Trino SQL queries.
```

### Understanding BIG_INTEGER and UNBOUND_INTEGER

#### BIG_INTEGER (Trino Int128)

**BIG_INTEGER** is bounded by Trino's 128-bit integer capacity.

**Constraints:**

- **Range**: -170,141,183,460,469,231,731,687,303,715,884,105,728 to 170,141,183,460,469,231,731,687,303,715,884,105,727
- This is 2^127 − 1 for the maximum and −2^127 for the minimum.
**Examples:**

```json
{
  "fits": 123456789012345678901234567890,               // ✓ BIG_INTEGER (within Int128)
  "tooLarge": 999999999999999999999999999999999999999   // ✗ UNBOUND_INTEGER (exceeds Int128)
}
```

#### UNBOUND_INTEGER

**UNBOUND_INTEGER** is used for integer values that exceed the Int128 range.

**Storage vs representation:**

- **In Cyoda**: stored as full-precision BigInteger numbers with all numeric operations available in workflows.
- **In Trino SQL**: represented as VARCHAR (strings) due to Trino's Int128 limitations.

**Important for SQL queries**: when querying UNBOUND_INTEGER columns in Trino, treat them as VARCHAR. Numeric operations are not available in SQL queries for these fields.

```sql
-- Correct: treat as string in Trino SQL
SELECT * FROM mytable WHERE unbound_integer_field = '999999999999999999999999999999999999999';

-- Incorrect: cannot use numeric operations in Trino SQL
SELECT * FROM mytable WHERE unbound_integer_field > 1000; -- This will fail!

-- Note: numeric operations ARE available in Cyoda workflows, just not in Trino SQL queries.
```

### Type detection priority

When parsing JSON, the system always tries to use the **most specific type** that fits the value:

1. **Smallest integer type** that can hold the value (BYTE → SHORT → INT → LONG → BIG_INTEGER → UNBOUND_INTEGER).
2. **Smallest decimal type** that can hold the value (FLOAT → DOUBLE → BIG_DECIMAL → UNBOUND_DECIMAL).
3. **STRING** as a fallback for any value that can't be parsed as a more specific type.

### String parsing

When a JSON value is a string, the system attempts to parse it as other types in this priority order:

1. **Temporal types** (dates, times, datetimes).
2. **UUID types**.
3. **Boolean** ("true" or "false").
4. **Numeric types** (if the string contains a valid number).
5. **STRING** (if none of the above match).
**Example:**

```json
{
  "date": "2024-01-15",                            // → LOCAL_DATE
  "uuid": "550e8400-e29b-41d4-a716-446655440000",  // → UUID
  "bool": "true",                                  // → BOOLEAN
  "text": "hello world"                            // → STRING
}
```

---

## Polymorphic fields

### What are polymorphic fields?

A **polymorphic field** occurs when the same field path has different data types across different elements in your JSON/XML data. This commonly happens in arrays of objects where the same field name contains different types of values.

### Example of polymorphic data

```json
{
  "items": [
    { "value": "text string" },
    { "value": 123 },
    { "value": 45.67 }
  ]
}
```

In this example, the field `$.items[*].value` is polymorphic because it contains:

- A STRING in the first element
- An INTEGER in the second element
- A DOUBLE in the third element

### How polymorphic fields are handled

When the system detects polymorphic fields, it automatically determines a **common data type** that can accommodate all the different types encountered. The logic:

1. **Check for compatible types**: the system first checks if all types are compatible and can be converted to a common type.
2. **Find the lowest common denominator**: it selects the most general type that all values can be converted to.
3. **Fall back to STRING**: if no common numeric or date type exists, the field is stored as STRING.

### Type compatibility rules

The system recognizes certain types as compatible and will convert them to a common type using **widening conversions**. It always chooses the type that can represent all values without loss of information.

#### Numeric type hierarchy

**Integer types** (from smallest to largest):

- BYTE → SHORT → INT → LONG → BIG_INTEGER → UNBOUND_INTEGER

**Decimal types** (from smallest to largest):

- FLOAT → DOUBLE → BIG_DECIMAL → UNBOUND_DECIMAL

**Cross-hierarchy conversions:**

- Any integer type can widen to any larger integer type or any decimal type.
- Any decimal type can widen to a larger decimal type.
- UNBOUND_DECIMAL is the widest numeric type (can hold any number).

#### Common type conversion examples

| Types Found | Common Type Used | Explanation |
|-------------|------------------|-------------|
| BYTE, SHORT | SHORT | Wider integer type |
| INT, LONG | LONG | Wider integer type |
| BYTE, DOUBLE | DOUBLE | Integer widens to decimal |
| INT, BIG_DECIMAL | BIG_DECIMAL | Integer widens to decimal |
| LONG, UNBOUND_INTEGER | UNBOUND_INTEGER | Wider integer type |
| FLOAT, DOUBLE | DOUBLE | Wider decimal type |
| DOUBLE, BIG_DECIMAL | UNBOUND_DECIMAL | DOUBLE can't fit in BIG_DECIMAL's fixed scale |
| BIG_INTEGER, BIG_DECIMAL | BIG_DECIMAL | Integer widens to decimal |
| BIG_DECIMAL, UNBOUND_DECIMAL | UNBOUND_DECIMAL | Wider decimal type |
| BIG_INTEGER, UNBOUND_INTEGER | UNBOUND_INTEGER | Wider integer type |
| Any numeric, STRING | STRING | Incompatible — falls back to STRING |
| BOOLEAN, INT | STRING | Incompatible — falls back to STRING |
| UUID, STRING | STRING | Incompatible — falls back to STRING |

#### Temporal type conversions

Temporal types have a resolution hierarchy, where lower-resolution types (like YEAR) can be converted to higher-resolution types (like LOCAL_DATE) by adding default values for the missing components.

**Resolution hierarchy:**

- YEAR → YEAR_MONTH → LOCAL_DATE → LOCAL_DATE_TIME → ZONED_DATE_TIME
- LOCAL_TIME → LOCAL_DATE_TIME → ZONED_DATE_TIME

:::note
The LOCAL_DATE_TIME and ZONED_DATE_TIME types are considered incompatible with each other because of type-detection ambiguity: since any value of these types can be parsed as the other, such polymorphism cannot occur automatically.
:::

When polymorphic temporal fields are detected, the system converts all values to the highest-resolution type found.
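The numeric common-type resolution described above can be sketched in a few lines. This is a hypothetical illustration of the widening rules, not a Cyoda API — the mixed integer/decimal branch is simplified to match the examples in the conversion table:

```python
# Sketch of numeric common-type resolution for polymorphic fields.
# Hypothetical helper — not part of any Cyoda API; simplified per the
# conversion-table examples (e.g. BYTE+DOUBLE → DOUBLE, INT+BIG_DECIMAL → BIG_DECIMAL).

INTEGER_ORDER = ["BYTE", "SHORT", "INT", "LONG", "BIG_INTEGER", "UNBOUND_INTEGER"]
DECIMAL_ORDER = ["FLOAT", "DOUBLE", "BIG_DECIMAL", "UNBOUND_DECIMAL"]


def common_numeric_type(a: str, b: str) -> str:
    """Return the widening common type for two system types, or STRING if incompatible."""
    if a == b:
        return a
    if a in INTEGER_ORDER and b in INTEGER_ORDER:
        # Both integers: pick the wider integer type.
        return INTEGER_ORDER[max(INTEGER_ORDER.index(a), INTEGER_ORDER.index(b))]
    if a in DECIMAL_ORDER and b in DECIMAL_ORDER:
        # Special case from the table: DOUBLE can't fit BIG_DECIMAL's fixed scale of 18.
        if {a, b} == {"DOUBLE", "BIG_DECIMAL"}:
            return "UNBOUND_DECIMAL"
        return DECIMAL_ORDER[max(DECIMAL_ORDER.index(a), DECIMAL_ORDER.index(b))]
    numeric = set(INTEGER_ORDER) | set(DECIMAL_ORDER)
    if a in numeric and b in numeric:
        # One integer, one decimal: the integer widens to the decimal type
        # (simplified; the real rules may widen further to avoid precision loss).
        return a if a in DECIMAL_ORDER else b
    # Non-numeric mixes (BOOLEAN+INT, UUID+STRING, ...) fall back to STRING.
    return "STRING"
```

For example, `common_numeric_type("DOUBLE", "BIG_DECIMAL")` yields `"UNBOUND_DECIMAL"`, while `common_numeric_type("BOOLEAN", "INT")` falls back to `"STRING"`.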
##### Upscaling (low resolution → high resolution)

When converting from a lower-resolution type to a higher-resolution type, the system adds default values for the missing components:

| From Type | To Type | Conversion Rule | Example |
|-----------|---------|-----------------|---------|
| YEAR | YEAR_MONTH | Add month = 1 (January) | `2024` → `2024-01` |
| YEAR_MONTH | LOCAL_DATE | Add day = 1 (first day of month) | `2024-01` → `2024-01-01` |
| LOCAL_DATE | LOCAL_DATE_TIME | Add time = 00:00:00 (midnight) | `2024-01-01` → `2024-01-01T00:00:00` |
| LOCAL_TIME | LOCAL_DATE_TIME | Add date = 1970-01-01 (epoch) | `14:30:00` → `1970-01-01T14:30:00` |

**Multi-step conversions:**

The system can perform multi-step conversions by chaining the rules above:

- **YEAR → LOCAL_DATE**: `2024` → `2024-01` → `2024-01-01`
- **YEAR → LOCAL_DATE_TIME**: `2024` → `2024-01` → `2024-01-01` → `2024-01-01T00:00:00`
- **YEAR_MONTH → LOCAL_DATE_TIME**: `2024-06` → `2024-06-01` → `2024-06-01T00:00:00`

##### Downscaling (high resolution → low resolution)

When converting from a higher-resolution type to a lower-resolution type, the system truncates the extra precision:

| From Type | To Type | Conversion Rule | Example |
|-----------|---------|-----------------|---------|
| YEAR_MONTH | YEAR | Extract year only | `2024-06` → `2024` |
| LOCAL_DATE | YEAR_MONTH | Extract year and month | `2024-06-15` → `2024-06` |
| LOCAL_DATE_TIME | LOCAL_DATE | Extract date part | `2024-06-15T14:30:00` → `2024-06-15` |
| LOCAL_DATE_TIME | LOCAL_TIME | Extract time part | `2024-06-15T14:30:00` → `14:30:00` |

Downscaling is primarily used internally for query optimization (e.g. when processing a query condition against a [YEAR, DATE] polymorphic field, the query condition is downscaled to YEAR for the YEAR part of the target field). In polymorphic fields represented in Trino, the system always upscales to the highest-resolution type.
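The upscaling chain described above can be sketched as follows. This is a hypothetical helper that chains the single-step default-filling rules from the tables; Cyoda's internal implementation is not exposed:

```python
# Sketch of temporal upscaling: fill in defaults for missing components,
# chaining single-step rules for multi-step conversions.
# Hypothetical helper — illustrates the documented rules, not a Cyoda API.

# Single-step conversion rules, as ISO-string transformations.
_STEPS = {
    ("YEAR", "YEAR_MONTH"): lambda v: v + "-01",                 # add month = 1 (January)
    ("YEAR_MONTH", "LOCAL_DATE"): lambda v: v + "-01",           # add day = 1
    ("LOCAL_DATE", "LOCAL_DATE_TIME"): lambda v: v + "T00:00:00",  # add midnight
    ("LOCAL_TIME", "LOCAL_DATE_TIME"): lambda v: "1970-01-01T" + v,  # add epoch date
}

# Resolution hierarchy for date-based types.
_CHAIN = ["YEAR", "YEAR_MONTH", "LOCAL_DATE", "LOCAL_DATE_TIME"]


def upscale(value: str, from_type: str, to_type: str) -> str:
    """Upscale an ISO string from a lower-resolution type to a higher-resolution type."""
    if from_type == "LOCAL_TIME":
        # LOCAL_TIME has a single upscaling target: LOCAL_DATE_TIME.
        return _STEPS[("LOCAL_TIME", "LOCAL_DATE_TIME")](value)
    i, j = _CHAIN.index(from_type), _CHAIN.index(to_type)
    for k in range(i, j):  # chain the single-step rules, one resolution at a time
        value = _STEPS[(_CHAIN[k], _CHAIN[k + 1])](value)
    return value
```

For instance, `upscale("2024", "YEAR", "LOCAL_DATE_TIME")` walks `2024` → `2024-01` → `2024-01-01` → `2024-01-01T00:00:00`, matching the multi-step examples above.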
##### Polymorphic temporal field examples

**Example 1: Mixed date resolutions**

```json
{
  "events": [
    { "date": "2024" },        // YEAR
    { "date": "2024-06" },     // YEAR_MONTH
    { "date": "2024-06-15" }   // LOCAL_DATE
  ]
}
```

The field `$.events[*].date` is polymorphic with types: YEAR, YEAR_MONTH, LOCAL_DATE.

**Common type**: LOCAL_DATE (highest resolution).

**Trino SQL values after conversion:**

- `"2024"` → `2024-01-01` (January 1st, 2024)
- `"2024-06"` → `2024-06-01` (June 1st, 2024)
- `"2024-06-15"` → `2024-06-15` (unchanged)

**Example 2: Date and DateTime mix**

```json
{
  "timestamps": [
    { "when": "2024-01-15" },           // LOCAL_DATE
    { "when": "2024-01-15T14:30:00" }   // LOCAL_DATE_TIME
  ]
}
```

**Common type**: LOCAL_DATE_TIME.

**Trino SQL values after conversion:**

- `"2024-01-15"` → `2024-01-15T00:00:00` (midnight)
- `"2024-01-15T14:30:00"` → `2024-01-15T14:30:00` (unchanged)

**Example 3: Time and DateTime mix**

```json
{
  "schedule": [
    { "time": "14:30:00" },             // LOCAL_TIME
    { "time": "2024-01-15T14:30:00" }   // LOCAL_DATE_TIME
  ]
}
```

**Common type**: LOCAL_DATE_TIME.

**Trino SQL values after conversion:**

- `"14:30:00"` → `1970-01-01T14:30:00` (epoch date + time)
- `"2024-01-15T14:30:00"` → `2024-01-15T14:30:00` (unchanged)

##### Important considerations for temporal polymorphism

1. **Default values matter**: When YEAR is converted to LOCAL_DATE, it becomes January 1st. This means:

   ```sql
   -- If the field contains polymorphic YEAR and LOCAL_DATE values
   SELECT * FROM events WHERE date = '2024-01-01'
   -- This will match both "2024" (converted to 2024-01-01) and "2024-01-01"
   ```

2. **Loss of semantic meaning**: A YEAR value of `"2024"` represents the entire year, but when converted to LOCAL_DATE it becomes `2024-01-01`, which represents a specific day. The original semantic meaning (the entire year) is lost.

3.
**Query implications**: when querying polymorphic temporal fields, be aware of the conversion rules:

   ```sql
   -- Original data: ["2024", "2024-06-15"]
   -- Stored as LOCAL_DATE: [2024-01-01, 2024-06-15]

   -- This query will NOT match the original "2024" value
   SELECT * FROM events WHERE date >= '2024-06-15'
   -- Because "2024" was converted to 2024-01-01, which sorts before 2024-06-15
   ```

4. **Best practice**: if you need to preserve the original resolution, consider using separate fields:

   ```json
   {
     "yearOnly": "2024",
     "exactDate": "2024-06-15"
   }
   ```

##### Polymorphic temporal conversion summary

**Common polymorphic combinations:**

| Types Found | Common Type | Conversion Applied |
|-------------|-------------|-------------------|
| YEAR, YEAR_MONTH | YEAR_MONTH | YEAR → YEAR_MONTH (add month=1) |
| YEAR, LOCAL_DATE | LOCAL_DATE | YEAR → YEAR_MONTH → LOCAL_DATE |
| YEAR_MONTH, LOCAL_DATE | LOCAL_DATE | YEAR_MONTH → LOCAL_DATE (add day=1) |
| LOCAL_DATE, LOCAL_DATE_TIME | LOCAL_DATE_TIME | LOCAL_DATE → LOCAL_DATE_TIME (add time=00:00:00) |
| LOCAL_TIME, LOCAL_DATE_TIME | LOCAL_DATE_TIME | LOCAL_TIME → LOCAL_DATE_TIME (add date=1970-01-01) |

### Important notes

**Type conversion**: when a polymorphic field is stored as a common type (e.g., STRING), all values are converted to that type. This means:

- Numeric values may be stored as strings: `"123"` instead of `123`.
- You may need to cast values in your SQL queries: `CAST(value AS INTEGER)`.

### Best practices for polymorphic data

1. **Consistent typing**: when possible, maintain consistent data types for the same field across all array elements.
2. **Explicit casting**: when querying polymorphic fields that were converted to STRING, use explicit CAST operations with caution — some values may not be castable.
3. **Understand your data**: use the schema-inspection API to see which fields are polymorphic and what their common type is.
4.
**Consider restructuring**: if you have control over the data structure, consider using different field names for different types.

---

## Complete example

### Input JSON saved under model "prizes" version 1

```json
{
  "extraction-date": "2024-01-15",
  "prizes": [
    {
      "year": "2023",
      "category": "Physics",
      "laureates": [
        {
          "id": "1001",
          "firstname": "Anne",
          "surname": "L'Huillier",
          "motivation": "for experimental methods...",
          "share": 3
        },
        {
          "id": "1002",
          "firstname": "Pierre",
          "surname": "Agostini",
          "motivation": "for experimental methods...",
          "share": 3
        }
      ]
    }
  ]
}
```

### Generated tables

#### Table 1: `prizes` (Root node: `$`)

| Column Name | Data Type | Category | Description |
|-------------|-----------|----------|-------------|
| `entity_id` | UUID | SPECIAL | Entity identifier |
| `point_time` | DATE | SPECIAL | Query time point |
| `creation_date` | DATE | ROOT | Entity creation date |
| `last_update_date` | DATE | ROOT | Entity last update |
| `state` | STRING | ROOT | Entity state |
| `extraction_date` | DATE | DATA | From `.extraction-date` |

#### Table 2: `prizes_prizes` (Node: `$.prizes[*]`)

| Column Name | Data Type | Category | Description |
|-------------|-----------|----------|-------------|
| `entity_id` | UUID | SPECIAL | Entity identifier |
| `point_time` | DATE | SPECIAL | Query time point |
| `creation_date` | DATE | ROOT | Entity creation date |
| `last_update_date` | DATE | ROOT | Entity last update |
| `state` | STRING | ROOT | Entity state |
| `index_0` | INTEGER | INDEX | Position in prizes array |
| `year` | STRING | DATA | From `.year` |
| `category` | STRING | DATA | From `.category` |

#### Table 3: `prizes_prizes_laureates` (Node: `$.prizes[*].laureates[*]`)

| Column Name | Data Type | Category | Description |
|-------------|-----------|----------|-------------|
| `entity_id` | UUID | SPECIAL | Entity identifier |
| `point_time` | DATE | SPECIAL | Query time point |
| `creation_date` | DATE | ROOT | Entity creation date |
| `last_update_date` | DATE | ROOT | Entity last update |
| `state` | STRING | ROOT | Entity state |
| `index_0` | INTEGER | INDEX | Position in prizes array |
| `index_1` | INTEGER | INDEX | Position in laureates array |
| `id` | STRING | DATA | From `.id` |
| `firstname` | STRING | DATA | From `.firstname` |
| `surname` | STRING | DATA | From `.surname` |
| `motivation` | STRING | DATA | From `.motivation` |
| `share` | TINYINT | DATA | From `.share` |

#### Table 4: `prizes_json` (Special JSON table)

| Column Name | Data Type | Category | Description |
|-------------|-----------|----------|-------------|
| `entity_id` | UUID | SPECIAL | Entity identifier |
| `point_time` | DATE | SPECIAL | Query time point |
| `creation_date` | DATE | ROOT | Entity creation date |
| `last_update_date` | DATE | ROOT | Entity last update |
| `state` | STRING | ROOT | Entity state |
| `entity` | STRING | SPECIAL | Complete JSON document |

---

## Querying your data

### Example queries

**1. Get all prizes from 2023:**

```sql
SELECT * FROM prizes_prizes WHERE year = '2023';
```

**2. Find all laureates with their prize information:**

```sql
SELECT
  p.year,
  p.category,
  l.firstname,
  l.surname,
  l.motivation
FROM prizes_prizes p
JOIN prizes_prizes_laureates l
  ON p.entity_id = l.entity_id
 AND p.index_0 = l.index_0;
```

**3. Retrieve the full JSON for a specific entity:**

```sql
SELECT entity FROM prizes_json WHERE entity_id = '<entity_id>';
```

**4. Query data as it existed at a specific time:**

```sql
SELECT * FROM prizes_prizes WHERE point_time = TIMESTAMP '2024-01-01 00:00:00';
```

---

## Best practices

1. **Use index columns for joins**: when joining tables from nested arrays, always join on `entity_id` and matching index columns.
2. **Understand your schema**: use the schema-generation API to see exactly what tables and columns are created for your data.
3.
**Leverage the JSON table carefully**: for complex queries or when you need the full document, query the `_json` table, but **always include `entity_id` in the WHERE clause** for performance.
4. **Filter by `entity_id`**: for better performance, include `entity_id` in your WHERE clause when possible, especially when querying the JSON table.
5. **Use `point_time` wisely**: only specify `point_time` when you need historical data; omit it for current data.

---

## Schema management

You can create any number of SQL schemas, each with a different set of tables and columns.

### Via the HTTP API

You can manage and inspect your SQL schemas using the REST API:

- **Create default schema**: `PUT /sql/schema/putDefault/{schemaName}`
- **Get schema**: `GET /sql/schema/{schemaId}`
- **List all schemas**: `GET /sql/schema/listAll`
- **Generate tables from entity model**: `GET /sql/schema/genTables/{entityModelId}`
- **Update tables**: `POST /sql/schema/updateTables/{entityModelId}`

For the full grammar, see the [REST API reference](/reference/api/).

### Using the Cyoda UI

Creating SQL schemas in the Cyoda UI is straightforward. Once logged in, navigate to the Trino/SQL menu — you can create and configure new schemas, or edit existing ones.

## Connecting via JDBC

The JDBC connection string follows this pattern:

```
jdbc:trino://trino-client-<caas_user_id>.eu.cyoda.net:443/cyoda/<your_schema>
```

where `<caas_user_id>` is your CAAS user ID and `<your_schema>` is the schema name you configured. For authentication credentials and technical-user setup, see [Authentication and identity](/concepts/authentication-and-identity/).

## Gaps in this reference

The following are intentionally not yet specified here and are tracked as upstream asks:

- **Supported SQL dialect scope** — which Trino features are guaranteed supported versus best-effort.
- **Push-down matrix** — which predicates, projections, and aggregates execute in the underlying store versus requiring a full scan.
- **Consistency / isolation** of long-running queries relative to concurrent transition writes.
- **Performance envelope** — rows/sec scan rates, partitioning, and per-tenant query limits.

See the [cyoda-go issue tracker](https://github.com/Cyoda-platform/cyoda-go/issues?q=is%3Aissue+label%3Acyoda-docs) for progress.

---

## run.md

# Run

The cyoda-go packaging ladder — Desktop, Docker, Kubernetes, and hosted Cyoda Cloud.

Cyoda runs on the tier that fits the job. Every packaging runs the same application and the same workflow semantics; what changes is durability, consistency guarantees, and operational cost. Pick your packaging by what you need to operate, not by what your app needs to do.

## Pick your packaging

|                  | In-Memory | SQLite        | PostgreSQL       | Cassandra |
|------------------|-----------|---------------|------------------|-----------|
| **Desktop**      | ✓         | ✓ *(default)* |                  |           |
| **Docker**       | ✓         | ✓             | ✓                |           |
| **Kubernetes**   |           |               | ✓ *(production)* |           |
| **Cyoda Cloud**  |           |               |                  | ✓         |

## When to pick what

- **[Desktop](./desktop/)** — a single binary on a laptop or a small server. In-memory for tests; SQLite as the default durable store. Right for development, edge, IoT, and small-team self-hosting.
- **[Docker](./docker/)** — the same binary containerised. Use it for bespoke integrations, composition with other services, local PostgreSQL runs, and CI pipelines.
- **[Kubernetes](./kubernetes/)** — the production packaging for self-hosted clusters. Active-active stateless cyoda-go pods behind a load balancer with PostgreSQL as the only stateful dependency. Helm chart ships from cyoda-go.
- **[Cyoda Cloud](./cyoda-cloud/)** — a managed service backed by Cassandra. Right when you need enterprise-grade identity, multi-tenancy, and provisioning, and you do not want to operate the infrastructure.

## Moving between tiers

The application does not change when you move.
That is the whole point of the growth path: start on Desktop, containerise when integration demands it, cluster when scale demands it, migrate to Cyoda Cloud when operating it no longer pays for itself.

See [Digital twins and the growth path](/concepts/digital-twins-and-growth-path/) for the decision framework.

## License and editions

Cyoda ships in three editions. Pick by how you want to consume it; the application contract is the same across all three.

| Edition | License | Storage tiers | Status |
|---|---|---|---|
| **cyoda-go** | Apache 2.0 (OSS) | In-memory, SQLite, PostgreSQL | Generally available |
| **Cyoda Cloud** | Commercial (hosted service) | Cassandra-backed | Beta — test/demo only; commercial SLAs planned |
| **Enterprise** | Commercial (self-hosted) | Cassandra-backed | Available — contact sales |

The **cyoda-go** binary — everything in [Build](/build/) and [Reference](/reference/) works against it — is Apache 2.0 and free to use for development and production on the Desktop, Docker, and Kubernetes packagings.

The **Cassandra-backed** tier (horizontal scale for write volume and distributed `async` search) ships two ways: as the hosted **Cyoda Cloud** service, and as a self-hosted **Enterprise** distribution under commercial license. Cyoda Cloud is currently a Beta for test and demonstration; production workloads on the Cassandra tier run under the Enterprise license today.

---

## run/cyoda-cloud.md

# Cyoda Cloud

Hosted Cyoda offering — test/demo today, commercial SLA offering planned.

Evolving · Test / demo only

Cyoda Cloud is currently a test and demonstration platform. Use at your own risk. A commercial offering with SLAs is planned.

This page gives an implementation-oriented overview of Cyoda Cloud: its physical architecture (how client compute integrates with the platform) and the current operational characteristics, service boundaries, and considerations that matter when you use it.
## Architecture

Cyoda models entities via configuration and provides persistence, transactional workflow orchestration and processing, and distributed querying via API and SQL in a write-only, horizontally distributed architecture. This enables building scalable event-driven systems with ease, with client-specific logic executed as independent compute.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-compact.svg)

Key points:

- Cyoda calls client compute via **gRPC + CloudEvents**
- Client compute is **client-owned**, **polyglot**, and **independently scalable**
- Data is stored in **Cassandra (bare metal, per-tenant keyspaces)**
- Data is **queried via HTTP API or Trino SQL**, in both cases as horizontally scalable distributed queries
- Each cluster layer (Cyoda, Trino, client compute, Cassandra) **scales horizontally**; nothing is coupled

### Platform overview

Cyoda Cloud is deployed on Hetzner bare‑metal infrastructure in Helsinki, Finland, supplemented by selected internal services running on Hetzner Cloud. Each lifecycle stage (development, staging, production) is deployed as an independent environment with its own Kubernetes and Cassandra clusters.

Key characteristics:

- **Bare metal first**: Cassandra runs directly on bare‑metal servers for predictable latency and I/O performance, and is intentionally not containerised.
- **High‑bandwidth internal network**: Bare‑metal nodes are connected via a private 10 Gbit LAN.
- **Strict network isolation**: External access is fronted by Cloudflare. All ingress to internal services occurs through VPN‑secured channels.
- **Horizontal scalability by design**: Cyoda services, Trino, and client compute nodes scale independently.

### Client compute model

Cyoda delegates all client‑specific business logic to Client Compute Nodes.
These nodes:

- Connect to Cyoda via gRPC and CloudEvents
- Execute processors within workflows
- Evaluate criteria that control gateway transitions
- Are fully owned and implemented by the client

Client compute is a first‑class part of the architecture and can be deployed flexibly depending on latency, isolation, and operational requirements.

Currently supported client runtimes:

- Java/Kotlin
- Python

Additional languages are planned, enabling a fully polyglot execution model.

### Developer mode

In developer mode, client compute nodes typically run on the developer's local machine.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-developer-mode.svg)

Characteristics:

- Client nodes connect to Cyoda Cloud via a **Cloudflare tunnel**
- Developers use their preferred IDE, language, and local tooling
- Hot‑reload and rapid iteration are supported

In this mode:

- **Client applications** interact with Cyoda using the HTTP API or the Trino JDBC driver to perform CRUD operations and queries
- **SQL tooling** can be used directly against entity data via Trino

Despite its simplicity, the architecture remains fully horizontally scalable:

- Cyoda services scale elastically per tenant
- Trino scales independently for analytical workloads
- Client compute nodes scale independently
- Cassandra scales horizontally and is shared across tenants

### Multi-cloud

Cyoda supports multi‑cloud deployments where tenant resources run in a separate cloud or region from the Cyoda control plane.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-multi-cloud.svg)

### Multi-language

Each client compute node can be implemented in a different programming language. This enables:

- Polyglot architectures
- Team autonomy
- Incremental migration between languages

Cyoda treats all client nodes uniformly at the protocol level, regardless of implementation language.
![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-multi-language.svg)

### Single-cloud

An entire Cyoda instance (control plane and tenant workloads) can be deployed into a single cloud environment.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-single-cloud.svg)

### Sharded by client tag

For advanced workloads, client tags can be used to route events to specific subsets of client compute nodes. This enables targeted execution strategies, such as:

- Separating GPU/TPU‑backed compute from CPU‑only workloads
- Isolating high‑throughput, low‑latency processing from batch workloads
- Running specialised processors with different cost or performance characteristics

Routing is explicit and deterministic, making this model suitable for complex or heterogeneous compute requirements.

![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-tag-sharded.svg)

### Cyoda Cloud layout

The current Cyoda Cloud deployment is a multi‑tenant platform running on bare‑metal infrastructure in Hetzner datacenters (EU).
![Cyoda Cloud Architecture](../../../../assets/architecture/Cyoda-Cloud-Client-View-cyoda-details.svg)

High‑level characteristics:

- Each tenant maps to a dedicated Kubernetes namespace
- Each tenant has:
  - Its own Cyoda pods
  - A dedicated Cassandra keyspace
  - A dedicated Trino deployment for SQL access

Cyoda and Trino can both be elastically scaled per tenant to meet:

- Event processing throughput
- Workflow execution demand
- Complex analytical query workloads

In addition to the core runtime components, the platform includes:

- Cyoda UI SPA for interactive use
- Cyoda Toolbox Server for administrators, exposing a GraphQL API for maintenance and analysis
- Apache Zookeeper for Cyoda cluster state management (multi‑tenant across namespaces)
- Prometheus and Grafana (LGTM stack) for metrics and observability
- Alertmanager for alert routing, with notifications sent to internal Google Chat

Log aggregation and analysis are currently handled by Elasticsearch and Kibana. This is expected to be consolidated into Loki and Grafana LogQL in the future.

The diagrams intentionally omit some infrastructure components (gateways and edge routing, VPN and internal network details, authentication and identity services, AI Studio and Cloud Manager applications, auxiliary load balancers and supporting services) to keep the focus on data flow and execution topology.

## Service details

During Beta, we are still moving fast and may make breaking changes along the way, although we try to keep that to a minimum. Make sure you back up your data, and check before redeploying your Cyoda Cloud environment. It's good practice to set yourself up with some CI/CD pipelines to automate tearing down and rebuilding your data and environment. Contact us on [Discord](https://discord.com/invite/95rdAyBZr2) if you need help.

For current service status and known issues, and for upcoming work, see [Status and roadmap](./status-and-roadmap/).
For tier limits and entitlements, see [Identity and entitlements](./identity-and-entitlements/).

### Service availability

**Uptime and maintenance**

- Service operates 24/7
- Planned maintenance notifications posted on the [Status and roadmap](./status-and-roadmap/) page

**Geographic deployment**

- Single datacenter deployment in Helsinki, Finland
- Latency characteristics suitable for most development and testing use cases
- Contact us via [Discord](https://discord.com/invite/95rdAyBZr2) if latency impacts your use case

### Client integration points

- **HTTP API**: REST endpoints for application integration
- **gRPC**: High-performance interface for compute externalization
- **JDBC**: SQL querying via Trino
- **AI Studio**: Entry point for signing up, creating new applications via chat dialogue, provisioning and controlling your environment(s), and getting help. Available at [https://ai.cyoda.net](https://ai.cyoda.net)
- **Cyoda UI**: Web interface for your Cyoda environment. It's our legacy UI, but with extensive features including entity lifecycle configuration and observability, distributed report configuration and execution, entity model viewing and navigation, Trino SQL schema configuration, and deep processing manager analysis. Available at `https://client-<caas_user_id>.eu.cyoda.net`

### Data management

**Storage characteristics**

- Apache Cassandra backend with replication factor 3
- Write-only entity persistence for complete audit trails
- Point-in-time querying capabilities
- No automatic backups in Free Tier

**Data retention**

- Data persists until explicitly deleted by user or exported via API
- Users responsible for data export and backup
- Data export available via HTTP API endpoints

### API and integration

**Rate limiting**

Rate limits vary by subscription tier. See [Identity and entitlements](./identity-and-entitlements/) for specific limits.
- HTTP 429 responses when limits exceeded
- Burst capacity available within limits

**Authentication**

- Auth0-based human authentication for web interfaces
- OAuth 2.0 client credentials flow for technical users
- Technical user creation via [AI Studio](https://ai.cyoda.net)

For complete authentication details, see [Authentication and identity](/concepts/authentication-and-identity/).

### Current service limitations

**Beta phase considerations**

- Frequent platform updates and changes
- Feature set under active development
- Documentation continuously updated

**Support channels**

- Primary support via [Discord](https://discord.com/invite/95rdAyBZr2)
- Global team coverage across time zones
- Community-driven support model

**API versioning**

- API versioning planned for post-Beta release
- Client libraries available on GitHub, under active development
- In Beta, expect breaking changes. We'll keep people informed on [Status and roadmap](./status-and-roadmap/) and [Discord](https://discord.com/invite/95rdAyBZr2).

### Migration and data portability

- Full data export via HTTP API
- User responsibility for backup and migration

## In this section

- [Provisioning](./provisioning/)
- [Identity and entitlements](./identity-and-entitlements/)
- [Status and roadmap](./status-and-roadmap/)

---

## run/cyoda-cloud/identity-and-entitlements.md

# Identity and entitlements (Cyoda Cloud)

Configure OIDC, manage signing keys, and assign entitlements on the hosted platform.

> This page covers identity operations for the **hosted Cyoda Cloud**
> platform. For self-hosted cyoda-go identity (OAuth 2.0 issuance,
> M2M credentials, external key trust on your own instance), see the
> [cyoda-go identity docs](https://github.com/Cyoda-platform/cyoda-go/blob/main/docs/).

## Overview

Cyoda Cloud authenticates users and technical clients via JWT access tokens.
Tokens can be:

- Issued by Cyoda using internally managed asymmetric signing keys, for technical users (machine-to-machine API clients) and other scenarios where your own infrastructure needs to issue JWTs trusted by Cyoda.
- Issued by an external OpenID Connect (OIDC) provider that Cyoda is configured to trust, allowing you to use your own identity provider for regular and technical users, auto-enroll users from trusted JWT claims, and map IdP roles onto Cyoda authorities.

Once authenticated, what each identity can do on the platform is bounded by the subscription tier and its associated entitlements.

## Signing keys

Cyoda Cloud uses asymmetric JWT signing keys to issue and validate access tokens for technical users and custom integrations.

### JWT signing key attributes

Cyoda Cloud represents each signing key-pair with a set of attributes:

- **Audience (`audience`)**
  - Internal Cyoda concept that describes which consumers use tokens signed with this key. This is not the same as the JWT `aud` claim.
  - Typical values:
    - `human` – tokens issued for regular users
    - `client` – tokens issued for technical users (machine-to-machine access)
- **Algorithm (`algorithm`)**
  - Asymmetric signing algorithms only (for example, `RS256`, `RS512`, `ES256`).
- **Validity window (`validFrom`, `validTo`)**
  - Optional `validFrom` and `validTo` define when a key becomes valid and when it expires.
- **Key ID (`keyId` / `kid`)**
  - Each key-pair has a unique `keyId`.
  - This value is included in the JWT header as the `kid` field.

In standard Cyoda Cloud, JWT signing keys for technical users are managed centrally by Cyoda. In custom or on-premise installations, it is also possible to configure an externally injected key-pair file that IAM uses instead of managing keys entirely through the API.

### Managing JWT signing keys

At a high level, you can do the following operations:

1. Create or rotate signing key-pairs for the relevant audience.
2.
Invalidate the keys with a grace period and then reactivate them.
3. Delete the key-pairs completely.

#### Example: issue a key for technical users

The following example shows a minimal request for creating a key-pair for technical users:

```json
{
  "audience": "client",
  "algorithm": "RS256"
}
```

Your token issuer will receive the corresponding `keyId` and the key material. It should use the private key to sign access tokens for technical users and include the `kid` header in each JWT.

#### Example: rotate a key with grace period

To perform a safe rotation with a grace period for technical users:

1. Issue a new key-pair for the `client` audience if you don't have one.
2. Mark the old key as invalid but allow a grace period during which previously issued tokens remain valid, for example:

```json
{
  "gracePeriodSec": 3600
}
```

After the grace period expires, you can remove the old key or leave it invalidated.

#### End-to-end example: technical user using a rotated key

Putting it together for a typical environment:

1. **Before rotation**: Your technical user obtains access tokens signed with the old key (for example, `keyId = "old-key-id"`).
2. **Issue new key**: You create a new key for `audience = "client"` and obtain `keyId = "new-key-id"`.
3. **Switch issuer**: Token issuance logic starts using `"new-key-id"` to sign tokens while still accepting tokens with `"old-key-id"` during the grace period.
4. **Verify access**: Both old and new tokens can call Cyoda APIs until the grace period expires.
5. **Finalize rotation**: After the grace period, any remaining tokens signed with `"old-key-id"` are rejected, and you can clean up the old key.
**Key rotation flow** ```mermaid flowchart TB A[Old active key for `client` audience] B[Issue new key for `client` audience] C[Sign new tokens with new `keyId`] D[Grace period: accept old and new tokens] E[Invalidate old key] A --> B B --> C C --> D D --> E ``` ### Key rotation recommendations - Rotate keys regularly to limit the potential security risks. - Separate user and technical-user token lifecycles in your application. - In custom installations with externally injected key-pairs, align your rotation schedule with how often you replace the external key-pair files. ## OIDC provider configuration When you register an OIDC provider in Cyoda Cloud, you describe how Cyoda should trust and use tokens from that provider. Key concepts: - **Well-known configuration URI** - The standard OIDC discovery endpoint exposed by your IdP (for example, `https://your-idp/.well-known/openid-configuration`). - Cyoda fetches the JWKS (public keys) and other metadata from this URL. - **Issuers list** - An optional but strongly recommended list of allowed `iss` (issuer) values. - When present and non-empty, Cyoda requires the JWT `iss` claim to match one of these values. - When omitted or empty, issuer validation is skipped. - **Provider state** - Providers can be active or inactive. Inactive providers are ignored during JWT validation, and any keys loaded for them are treated as untrusted. In most environments, Cyoda Cloud comes pre-configured with providers for the supported identity options (for example, Auth0 for the default UI). Custom OIDC providers are typically used for enterprise integrations. ### How Cyoda uses OIDC providers At a high level, Cyoda IAM uses configured OIDC providers to: 1. Validate the JWT signature and basic claims (for example, issuer or expiration). 2. Extract required claims that identify the user and their organization within the Cyoda instance. 3. Apply auto-enrollment logic: - Create a **user** record based on the token. 
- Create a **legal entity** record based on organization-related claims (allowed only in custom installations). 4. Build the authenticated principal with **authorities** derived from role-related claims. The exact auto-enrollment behavior depends on how your environment is configured, but the required claims listed below must be present for the standard Cyoda Cloud flow to work. ### Configuring your custom OIDC provider When configuring your IdP (for example, Auth0, Azure AD, or another OIDC provider) to work with Cyoda Cloud: 1. Ensure that access tokens issued for Cyoda APIs include at least: - `sub` - `org_id` - `caas_org_id` 2. Configure a claim (often via custom rules or mappers) that emits `user_roles` as an array of strings for users that need explicit roles. 3. Verify that the token `iss` (issuer) value matches one of the `issuers` configured for the corresponding OIDC provider in Cyoda. ### Operational tips - Use a dedicated OIDC client/application configuration per environment (dev, test, prod) and reflect that in your `org_id` or `caas_org_id` values as appropriate. - Keep the list of allowed `issuers` small and explicit to reduce the chance of accepting tokens from an unexpected issuer. - When rotating keys at your IdP, you can trigger a reload of provider metadata (including JWKS) in Cyoda Cloud using the appropriate API. ### End-to-end example: custom OIDC provider The following example describes a typical setup using an external OIDC provider as the IdP: 1. **Configure your application in the OIDC provider** representing your Cyoda environment. 2. **Create a rule or action** that adds the following claims to the access token when the audience matches your Cyoda App: - `sub` (OIDC provider user ID) - `org_id` (your external organization identifier) - `caas_org_id` (the Cyoda legal entity identifier for your installation - `your_user_id` for the single-user Cyoda instance) 3. 
**Register the OIDC provider in Cyoda Cloud** using the OIDC `.well-known/openid-configuration` URL and, optionally, the expected `iss` value. 4. **Test login** via the OIDC provider, obtain an access token, and call a Cyoda API endpoint. Cyoda Cloud validates the token and auto-enrolls the user under your legal entity if needed. ```mermaid sequenceDiagram participant User participant OIDC as OIDC Provider participant Cyoda as Cyoda Cloud User->>OIDC: Login with browser OIDC-->>User: Redirect with access token User->>Cyoda: Call API with bearer token OIDC-->>Cyoda: Fetch JWKS to validate signature Cyoda-->>User: Authorized response (user enrolled, roles applied) ``` ## JWT claims → role mapping When integrating a **custom OIDC provider**, its access tokens must carry specific claims so that Cyoda Cloud can: - Identify the user - Identify the organization (legal entity) - Map roles The following table summarizes the claims used by the integration: | Claim name | Type | Required? | Purpose | |-----------|------|-----------|---------| | `sub` | string | Yes | Standard OIDC subject. Used as a stable external user identifier. | | `org_id` | string | Yes | External organization identifier provided by your IdP. Used as the legal entity external key and to build a human-readable name (for example, `"Org. "`). | | `caas_org_id` | string | Yes (for Cyoda-backed tenants) | Cyoda legal entity identifier (`caas_org_id`). Used as the `owner` of both users and legal entities. | | `user_roles` | array of strings | Recommended | List of application or user roles. If absent, the user is treated as having no additional roles. | ## Entitlements Access to Cyoda Cloud is subscription-tier-based. This section provides details about the available subscription tiers and their entitlements. **Important**: The information below is for reference purposes and is not guaranteed to be correct. 
The authoritative source for your account's current subscription details and entitlements is available through the Cyoda Cloud API at the following endpoints:

- **Current account information**: `GET /account` — Retrieve information about the current user's account, including the current subscription.
- **All available subscriptions**: `GET /account/subscriptions` — Retrieve all subscriptions available for the current user's legal entity.

For complete API documentation, refer to the [OpenAPI specification](/api-reference/).

### Subscription tiers overview

| Entitlement | Free¹ | Developer | Pro | Enterprise License² |
| --- | --- | --- | --- | --- |
| **Status** | Available | Draft | Draft | Available |
| **Model Fields (per model)** | 150 | 150 | 500 | Unlimited |
| **Model Fields (cumulative)** | 300 | 300 | 2000 | Unlimited |
| **Models** | 20 | 20 | 100 | Unlimited |
| **Client Nodes** | 1 | 1 | 5 | Unlimited |
| **Payload Size** | 5 MB | 5 MB | 50 MB | Unlimited |
| **Disk Usage** | 2 GB | 2 GB | 1 TB | Unlimited |
| **API Requests** | 300/min | 300/min | 50/sec | Unlimited |
| **External Calls** | 300/min | 300/min | 50/sec | Unlimited |

¹ _Free Tier environments are automatically reset after an expiry period. Contact us for details._

² _Enterprise License is for the Cyoda Cloud system that clients operate themselves (outside of Cyoda Cloud). Contact us for details._

**Status legend:**

- **Available**: Tier is currently available for subscription.
- **Draft (unavailable)**: Tier is in the planning/development phase and not yet available.

### Entitlement definitions

The following table defines each entitlement ID used in the subscription tiers:

| Entitlement ID | Description |
| --- | --- |
| `NUM_MODEL_FIELDS` | Maximum number of fields allowed per individual data model. This controls the complexity of each model you can create. |
| `NUM_MODEL_FIELDS_CUMULATIVE` | Total number of fields allowed across all data models in your account. This is the sum of fields across all your models. |
| `NUM_MODELS` | Maximum number of data models you can create in your account. Each model represents a different data structure or entity type. |
| `NUM_CLIENT_NODES` | Maximum number of client nodes that can connect to your Cyoda Cloud instance simultaneously. This controls concurrent compute capacity. |
| `PAYLOAD_SIZE` | Maximum size in bytes for individual API request payloads. This limits the amount of data you can send in a single API call. |
| `DISK_USAGE` | Maximum disk storage space allocated for your account data in bytes. This includes all stored models, data, and metadata. |
| `API_REQUEST` | Maximum number of API requests allowed per time interval. This controls the rate at which you can make API calls. |
| `EXTERNALIZED_CALL` | Maximum number of external compute calls allowed per time interval. This applies to calls made from your Cyoda Cloud instance to your connected compute nodes. |

---

_This subscription tier information is maintained alongside the platform configuration but may deviate from your actual settings. For the most current and accurate information about your specific account entitlements, please refer to the `/account` API endpoints._

---

## run/cyoda-cloud/provisioning.md

# Provisioning (Cyoda Cloud)

Provision a Cyoda Cloud environment.

Evolving · Cyoda Cloud

Cyoda Cloud is currently a test/demo offering. A commercial SLA offering is coming. Use at your own risk until then.

Getting your first Cyoda Cloud Free Tier environment is straightforward. Simply follow the steps below.

## TL;DR

1. Create an account on [https://ai.cyoda.net](https://ai.cyoda.net)
2. Find your environment URL by prompting in the chat dialogue of the AI Studio with: `What is my environment URL?`
3. Deploy your environment by prompting with: `Deploy my Cyoda environment`
4. Create a technical user by prompting with: `Add machine user`
5.
Access your environment via the Cyoda UI or APIs

## Create an Account

1. **Access the AI Studio**: Navigate to the Cyoda Cloud web-based Single Page Application (SPA) at [https://ai.cyoda.net](https://ai.cyoda.net) and consent to the terms and conditions.

   ![AI Studio Consent](aiAssistantConsent)
   ![AI Studio Greeting Screen](aiAssistantGreet)

2. **Choose Authentication Provider**: Register using one of the supported providers:
   - **Google Auth**: Sign up using your Google account
   - **GitHub**: Sign up using your GitHub account
3. **Complete Registration**: Follow the Auth0 authentication flow to complete your account setup
4. **Free Tier Access**: Upon successful registration, you'll be automatically enrolled in the Free Tier subscription

## Know your Environment

Prompt in the chat dialogue of the AI Studio with: `What is my environment URL?`. Wait a bit.

![What is my environment URL Prompt](whatIsMyEnvironment)

## Deploy your Environment

Prompt with: `Deploy my environment`. Wait for the deployment to complete. It usually takes about 5 minutes.

![Deploy Environment Prompt](deployEnvPrompt)
![Environment Deployed Confirmation](envDeployedConfirmation)

## Create a Technical User

**Create a technical user (M2M client)**: Prompt with: "Add machine user". You will see a button to launch the query against your environment to create the new user. Write down the client ID and secret - you'll need them to access your environment.

![Add Machine User Prompt](createTechnicalUser)

## Access the Environment

Once your environment is deployed and you have a technical user, you can access your environment.

### Via the Cyoda UI

Navigate to your environment URL in your favorite browser at `https://client-.eu.cyoda.net`. You can find your environment URL from the previous steps, or ask in the chat for the URL.

![Login Cyoda UI](loginCyodaUI)

With the Cyoda UI, you log in with your personal account via Auth0.
![Cyoda UI Logged In](loggedIn)

### Via the API

To access the APIs, you need to authenticate with your technical user credentials. See also [Connecting](/guides/authentication-authorization/) for more details.

---

## run/cyoda-cloud/status-and-roadmap.md

# Status and roadmap (Cyoda Cloud)

Current status, known limitations, and upcoming work for the hosted Cyoda Cloud platform.

## Current status

**⚠️ Beta Phase** - Expect frequent changes and some interruptions.

**🔧 Planned Maintenance**: none at the moment.

| Issue | Description | Status |
| --- | --- | --- |
| **Environment Access** | [AI Studio](https://ai.cyoda.net) is currently the only control interface for your environments. When you've logged in, it will give you your environment details if prompted. But it's best to **write down your environment URL**. | We'll be releasing better options soon. |
| **Java Code Generation** | Generating your code-base may take a while (15-30 minutes). Please be patient. In rare cases it might not compile, but it's usually obvious to fix. | Should get better as we improve the agentic workflow and prompts. |
| **Deployments** | In the [AI Studio](https://ai.cyoda.net) you can ask to deploy your environment to pick up the latest Cyoda version. **BEWARE**: This will reset all your data, and it might contain breaking changes in Alpha/Beta, such that your client compute node may not work. | We'll announce breaking changes in our [Discord](https://discord.gg/95rdAyBZr2) channel. This is Beta. You'll usually just need to merge the latest changes from the template projects into your codebase, and maybe adjust a few things. |
| **Auth0 Logouts** | Unexpected session terminations, possibly related to idle times. | We're only monitoring this at the moment. |
| **Transactional Deletions** | Deleting large amounts of data is slow (slightly slower than saving it). | It's in the backlog, but resolution is unlikely before the end of the Alpha Phase. |

**Reach out to us on [Discord](https://discord.com/invite/95rdAyBZr2) if you need help**.

## Roadmap

So many things to work on. We'll be putting our major roadmap items here once we have enough feedback from the community to know what's most important.

---

## run/desktop.md

# Desktop (single binary)

Run cyoda-go from a single binary — dev, low-volume production, in-memory and SQLite modes.

The desktop packaging is cyoda-go as it ships out of the box: a single binary, no orchestrator, no external database. It is the right choice for local development, edge deployments, small-team self-hosting, and any low-volume production workload where a single machine is enough.

## In-memory vs SQLite

The desktop binary supports two storage modes:

- **In-memory** — everything lives in process memory. Sub-millisecond latencies. Data is lost on restart. Use it for tests, demos, and digital-twin scenario runs.
- **SQLite (default after `cyoda init`)** — durable, single-file, zero-ops. Data survives restarts; backup is a file copy. Use it for everyday persistent work. SQLite is single-writer; all writes serialize through the database file, which limits concurrent write throughput.

The SQLite database file is created at `~/.local/share/cyoda/cyoda.db` by `cyoda init`. Back it up by copying the file; migrate it by moving the file.

## Install

Pick the installer that suits your platform; the authoritative list lives in the [cyoda-go README](https://github.com/cyoda-platform/cyoda-go#install).
```bash # macOS / Linux via Homebrew brew install cyoda-platform/cyoda-go/cyoda ``` Debian, RHEL, and `curl | sh` installers are available in the same document. ## Run The Homebrew and packaged installers run `cyoda init` for you, which sets up the SQLite store. To start the server: ```bash cyoda ``` The binary exposes REST on port **8080** and gRPC on **9090** by default. The full CLI reference lives at [Reference → CLI](/reference/cli/). ## Configure cyoda-go reads configuration from environment variables, a config file, or CLI flags. The full list of options lives at [Reference → Configuration](/reference/configuration/); for everyday use the defaults are fine, and you only set a handful of variables (`CYODA_STORAGE_BACKEND`, listen ports, JWT keys) to adapt to your environment. For secrets, cyoda-go supports `*_FILE` suffixes on any credential environment variable so you can mount them from a secrets store rather than pass them on the command line. ## Upgrading Upgrading is a version bump: install the new binary, restart the process. cyoda-go follows semantic versioning; configuration migration policy is documented in the [cyoda-go release notes](https://github.com/cyoda-platform/cyoda-go/releases). --- ## run/docker.md # Docker Run cyoda-go in Docker for bespoke integrations and local compositions. Docker is the right packaging when you need cyoda-go to sit inside a larger composition — with your app, with a PostgreSQL backend, with an observability stack — or when your CI pipeline wants a clean container image per run. ## When Docker fits - **Bespoke integrations.** Deploy cyoda-go alongside your own services on a single host, wire them over a Docker network. - **Composed environments.** Run cyoda-go with PostgreSQL, Prometheus, Grafana, and OpenTelemetry collectors as a complete local stack. - **PostgreSQL for dev and test.** Point cyoda-go at a containerised PostgreSQL to exercise production-mode behaviour locally before deploying. 
- **CI pipelines.** Ephemeral, reproducible, no host state. ## Image cyoda-go publishes container images per release. The authoritative reference and pull instructions live in the [cyoda-go Docker reference](https://github.com/cyoda-platform/cyoda-go/tree/main/deploy/docker). ## Compose example The repository ships a minimal `compose.yaml` for getting a single node and PostgreSQL up, plus a richer [compose-with-observability](https://github.com/cyoda-platform/cyoda-go/tree/main/examples/compose-with-observability) example that wires tracing and metrics. Use these as templates rather than retyping them here — they track the cyoda-go release and we link whichever is current. ## PostgreSQL backend Point cyoda-go at a PostgreSQL instance by setting `CYODA_STORAGE_BACKEND=postgres` and the usual connection variables (or `*_FILE` forms for secrets). The DSN goes in `CYODA_POSTGRES_URL` (or `CYODA_POSTGRES_URL_FILE` for a file-mounted secret per Docker conventions). The Docker compose example wires this up end-to-end; for production you will run PostgreSQL separately and pass only the DSN. ## Observability The container emits structured logs to stdout, exposes a Prometheus scrape endpoint for metrics, and accepts OpenTelemetry configuration for traces. The observability example demonstrates a full loop: - **Logs** — stream from the cyoda-go container. - **Metrics** — Prometheus scrapes the admin port. - **Traces** — OTLP exporter configured via environment. Tune sampling and log level at runtime via the admin endpoints; see the [cyoda-go observability reference](https://github.com/cyoda-platform/cyoda-go#observability). Health probes live on the admin port (default 9091): `/livez` (liveness) and `/readyz` (readiness). Both are unauthenticated. ## Data directory The container pre-stages `/var/lib/cyoda` as the data directory (with the correct ownership for the non-root `65532:65532` user). 
Mount it as a named volume if you want SQLite data or any plugin state to persist across container restarts. --- ## run/kubernetes.md # Kubernetes Deploy cyoda-go with the Helm chart for clustered PostgreSQL-backed production. Kubernetes is the recommended production packaging for self-hosted cyoda-go. The application is designed for active-active stateless deployment: three to ten cyoda-go pods behind a load balancer, with PostgreSQL as the only stateful dependency. ## When Kubernetes fits - **Production workloads** that need high availability. - **Multi-node clustering** with rolling upgrades and blue/green. - **Horizontal scale** up to the PostgreSQL backend's limits (10+ stateless pods serving one PostgreSQL cluster is a comfortable envelope). - **Enterprise operations** — GitOps, secrets management, service meshes. ## Deployment shape ``` ┌─────────────┐ │ Load │ │ Balancer │ └──────┬──────┘ │ ┌────────┼────────┐ │ │ │ ┌────▼─┐ ┌────▼─┐ ┌────▼─┐ │cyoda │ │cyoda │ │cyoda │ (stateless, 3–10 pods) │-go │ │-go │ │-go │ └────┬─┘ └────┬─┘ └────┬─┘ │ │ │ └────────┼────────┘ │ ┌──────▼──────┐ │ PostgreSQL │ (the only stateful dependency) └─────────────┘ ``` Every pod is identical; any pod can serve any request. There is no leader election, no ZooKeeper, no etcd. Coordination happens through PostgreSQL's SERIALIZABLE isolation for writes and a gossip protocol (HMAC-authenticated) for membership, so concurrent writers never silently corrupt data. The stateful backend is pluggable: PostgreSQL (OSS default) or the commercial Cassandra storage engine. The pod topology and the application contract are identical either way — only the storage plugin configuration differs. ## Helm chart cyoda-go ships a Helm chart under [`deploy/helm`](https://github.com/cyoda-platform/cyoda-go/tree/main/deploy/helm). The chart provisions the cyoda-go Deployment, a Service, a ConfigMap for non-sensitive configuration, and Secret references for credentials. 
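As a sketch, a GitOps-friendly install might pin the HMAC secret to a pre-created Kubernetes Secret. Only `cluster.hmacSecret.existingSecret` is a key documented here; the secret name is a placeholder of our choosing, and the chart's own `values.yaml` remains authoritative:

```yaml
# Illustrative values override, assuming a Secret named cyoda-cluster-hmac
# was created out of band (e.g. by your secrets operator).
cluster:
  hmacSecret:
    existingSecret: cyoda-cluster-hmac  # pre-created Secret; keeps inter-node auth stable across reconciles
```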
The authoritative values reference lives under [Reference → Helm values](/reference/helm/); the chart's own `values.yaml` remains the runtime source of truth. The Helm chart auto-generates the HMAC secret unless `cluster.hmacSecret.existingSecret` is provided. GitOps deployments should always set `existingSecret` to avoid Helm rendering a fresh secret on every reconcile, which would cause inter-node auth to drift.

## High availability

- **Load balancer.** Any Kubernetes-native L4 or L7 will do; point its health checks at the readiness probe the chart exposes.
- **Readiness and liveness probes.** Both are wired by default; tune them if your control plane has stricter latency budgets.
- **Pod Disruption Budgets.** Set a minimum available count that matches your replica count minus one, so rolling upgrades and node drains do not take the service below quorum.

## Backup and restore

Backup is standard PostgreSQL tooling: `pg_dump`, continuous WAL archiving, snapshot-based backups from your cloud provider. cyoda-go does not maintain any state outside PostgreSQL, so a PostgreSQL restore brings the platform back to that point in time in full.

## Upgrades and rollback

cyoda-go releases follow semantic versioning. For production:

- **Blue/green or canary.** Run the new version alongside the old, cut traffic over, retire the old.
- **Rolling upgrade.** Fine for minor releases; set `maxUnavailable: 0` so capacity never drops.
- **Schema migration ordering.** Check the release notes for whether a release requires a PostgreSQL schema migration step before the new binary starts serving. The Helm chart runs schema migrations as a pre-install/pre-upgrade hook; pod startup is blocked until migrations complete.

## Sizing

Sizing is driven by write volume more than read volume, because reads scale horizontally across pods while writes are serialized through PostgreSQL. Qualitative guide:

- **Small.** 3 pods, `db.small` (or equivalent), up to a few hundred transitions per second.
- **Medium.** 5–7 pods, dedicated PostgreSQL, low thousands of transitions per second. - **Large.** 10 pods with PostgreSQL scaled up; at this point consider swapping to the commercial Cassandra storage engine (still on Kubernetes), or handing operations to Cyoda Cloud as a SaaS. ## Observability The chart exposes a Prometheus scrape annotation on the pods and surfaces the admin endpoints for log-level and tracing control. Standard OpenTelemetry configuration applies; wire OTLP exporters via environment variables in the Helm values. ## Scaling past PostgreSQL At the upper end of the sizing guide, PostgreSQL's write throughput becomes the bottleneck. Two paths past it — the application contract is identical in both: - **Swap to the commercial Cassandra storage engine, still on Kubernetes.** The licensable plugin replaces the PostgreSQL backend with a horizontally-scaling Cassandra-backed tier. The Helm chart, the pod topology, and the application code are unchanged — only the storage plugin configuration changes. - **Hand operations to Cyoda Cloud.** A SaaS that runs either the PostgreSQL or Cassandra stack for you. Same application contract, different operational model. See [Cyoda Cloud](./cyoda-cloud/) for the SaaS option; contact sales for the commercial Cassandra plugin. ---