search - entity search API

cyoda-go version 0.6.2

search - entity search API: synchronous direct search, asynchronous snapshot search, and entity statistics.

POST /api/search/direct/{entityName}/{modelVersion}
POST /api/search/async/{entityName}/{modelVersion}
GET /api/search/async/{jobId}
GET /api/search/async/{jobId}/status
PUT /api/search/async/{jobId}/cancel

Context path prefix is CYODA_CONTEXT_PATH (default /api). All endpoints require Authorization: Bearer <token> except when CYODA_IAM_MODE=mock.

Search operates against a specific entity model (entityName, modelVersion). Two modes are supported:

Synchronous (direct) search: POST /search/direct/{entityName}/{modelVersion}. Executes inline within the HTTP request. The response is an NDJSON stream (application/x-ndjson), one entity envelope per line. The default result limit is 1000 entities per request; the maximum is 10000 (values above 10000 are clamped to 10000).
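
The default and clamping rule for limit can be sketched as a small helper. This is an illustrative Python sketch, not the server's code: the name effective_limit is hypothetical, and rejecting negative values is an assumption (the text only states the default and the clamp).

```python
DEFAULT_LIMIT = 1000   # default stated above
MAX_LIMIT = 10000      # values above this are clamped

def effective_limit(raw=None):
    """Return the limit a direct search would apply for a raw query value."""
    if raw is None:
        return DEFAULT_LIMIT
    value = int(raw)  # raises ValueError for non-integer strings
    if value < 0:
        # Assumption: negative limits count as invalid input.
        raise ValueError("limit must not be negative")
    return min(value, MAX_LIMIT)
```

For example, effective_limit("50000") yields 10000 and effective_limit() yields 1000.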

Asynchronous search: POST /search/async/{entityName}/{modelVersion}. Submits a search job and returns a job UUID immediately. The search executes in a background goroutine (or in the plugin’s own executor for SelfExecutingSearchStore plugins). Results are retrieved by polling status and then fetching pages.

Both modes accept the same Condition DSL as the request body. When the storage plugin implements spi.Searcher, the condition is translated to a plugin-level predicate and pushed down to the backend. When translation fails (unsupported condition type) or an active transaction is present, the service falls back to in-memory filtering after a full GetAll scan.
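
The choice between pushdown and in-memory filtering described above can be summarised as a decision function. This is only a sketch of the stated rules; search_strategy and its three boolean parameters are hypothetical names, and treating a plugin without spi.Searcher as in-memory is an assumption inferred from the paragraph.

```python
def search_strategy(implements_searcher, translation_ok, transaction_active):
    """Pick the execution path for a condition, following the rules above."""
    if implements_searcher and translation_ok and not transaction_active:
        return "pushdown"
    # Fallback: full GetAll scan followed by in-memory filtering.
    return "in-memory"
```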

All search requests accept a Condition JSON document as the POST body. Conditions are parsed recursively up to a maximum nesting depth of 50. Body size limit: 10 MiB.
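
A client can check the nesting depth of a condition before sending it. The sketch below is a hedged illustration: condition_depth and within_depth_limit are hypothetical helpers, and counting a non-group condition as depth 1 is an assumption about how the server counts levels.

```python
MAX_DEPTH = 50  # maximum nesting depth stated above

def condition_depth(condition):
    """Depth of a condition tree; a non-group condition counts as depth 1."""
    children = []
    if condition.get("type") == "group":
        children = condition.get("conditions", [])
    return 1 + max((condition_depth(c) for c in children), default=0)

def within_depth_limit(condition):
    """True when the condition tree is no deeper than MAX_DEPTH."""
    return condition_depth(condition) <= MAX_DEPTH
```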

SimpleCondition - match a single JSON path against a scalar value:

{
"type": "simple",
"jsonPath": "$.category",
"operatorType": "EQUALS",
"value": "physics"
}
  • type: "simple"
  • jsonPath: JSONPath string (e.g., "$.year", "$.laureates[0].firstname")
  • operatorType (also accepted as operator or operation): operator string (see valid values below)
  • value: any JSON scalar

Valid operatorType values (exhaustive):

  • EQUALS - exact equality; numeric-aware (JSON number vs string representation)
  • NOT_EQUAL - inequality; inverse of EQUALS
  • GREATER_THAN - numeric or lexicographic greater-than
  • LESS_THAN - numeric or lexicographic less-than
  • GREATER_OR_EQUAL - greater-than or equal
  • LESS_OR_EQUAL - less-than or equal
  • CONTAINS - substring or array-element containment
  • NOT_CONTAINS - inverse of CONTAINS
  • STARTS_WITH - string prefix match
  • NOT_STARTS_WITH - inverse of STARTS_WITH
  • ENDS_WITH - string suffix match
  • NOT_ENDS_WITH - inverse of ENDS_WITH
  • LIKE - SQL-style LIKE pattern (% = any sequence, _ = any single char)
  • IS_NULL - field is absent or JSON null
  • NOT_NULL - field is present and not JSON null
  • BETWEEN - range check (exclusive bounds); value must be a two-element array [low, high]
  • BETWEEN_INCLUSIVE - range check (inclusive bounds); same value shape as BETWEEN
  • MATCHES_PATTERN - regular expression match
  • IEQUALS - case-insensitive EQUALS
  • INOT_EQUAL - case-insensitive NOT_EQUAL
  • ICONTAINS - case-insensitive CONTAINS
  • INOT_CONTAINS - case-insensitive NOT_CONTAINS
  • ISTARTS_WITH - case-insensitive STARTS_WITH
  • INOT_STARTS_WITH - case-insensitive NOT_STARTS_WITH
  • IENDS_WITH - case-insensitive ENDS_WITH
  • INOT_ENDS_WITH - case-insensitive NOT_ENDS_WITH

Operator strings outside this list are rejected with errors.BAD_REQUEST at request time; the error detail includes the canonical list.
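
Checking a SimpleCondition on the client side avoids a round trip that would end in errors.BAD_REQUEST. The following is a minimal sketch under stated assumptions: check_simple_condition is a hypothetical helper, the operator set is copied from the list above, and the only value-shape rule enforced is the two-element array documented for BETWEEN and BETWEEN_INCLUSIVE.

```python
VALID_OPERATORS = frozenset({
    "EQUALS", "NOT_EQUAL", "GREATER_THAN", "LESS_THAN",
    "GREATER_OR_EQUAL", "LESS_OR_EQUAL", "CONTAINS", "NOT_CONTAINS",
    "STARTS_WITH", "NOT_STARTS_WITH", "ENDS_WITH", "NOT_ENDS_WITH",
    "LIKE", "IS_NULL", "NOT_NULL", "BETWEEN", "BETWEEN_INCLUSIVE",
    "MATCHES_PATTERN", "IEQUALS", "INOT_EQUAL", "ICONTAINS",
    "INOT_CONTAINS", "ISTARTS_WITH", "INOT_STARTS_WITH",
    "IENDS_WITH", "INOT_ENDS_WITH",
})
RANGE_OPERATORS = {"BETWEEN", "BETWEEN_INCLUSIVE"}
# The operator may arrive under any of these three keys.
OPERATOR_KEYS = ("operatorType", "operator", "operation")

def check_simple_condition(cond):
    """Return a list of problems found in a SimpleCondition dict."""
    problems = []
    op = next((cond[k] for k in OPERATOR_KEYS if k in cond), None)
    if op not in VALID_OPERATORS:
        problems.append(f"unknown operator: {op!r}")
    elif op in RANGE_OPERATORS:
        value = cond.get("value")
        if not (isinstance(value, list) and len(value) == 2):
            problems.append(f"{op} needs a two-element [low, high] array")
    return problems
```

An empty list means the sketch found nothing to complain about; the server remains the authority on what it accepts.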

LifecycleCondition - match entity lifecycle metadata:

{
"type": "lifecycle",
"field": "state",
"operatorType": "EQUALS",
"value": "APPROVED"
}
  • type: "lifecycle"
  • field: "state", "creationDate", or "previousTransition"
  • operatorType (also accepted as operator or operation): operator string; same valid values as for SimpleCondition
  • value: any JSON scalar

GroupCondition - combine conditions with a logical operator:

{
"type": "group",
"operator": "AND",
"conditions": [
{ "type": "simple", "jsonPath": "$.year", "operatorType": "EQUALS", "value": "2024" },
{ "type": "lifecycle", "field": "state", "operatorType": "EQUALS", "value": "NEW" }
]
}
  • type: "group"
  • operator: "AND" or "OR". These are the only supported values; any other string produces errors.BAD_REQUEST at match time ("unknown group operator")
  • conditions: array of Condition objects (recursive; maximum nesting depth 50)

"NOT" is not supported. An AND group with an empty conditions array evaluates to true (vacuous conjunction). An OR group with an empty conditions array evaluates to false (vacuous disjunction).

EMPTY CONDITION: Submitting an empty body ({}) or a body with no type field as the top-level search condition is rejected with errors.BAD_REQUEST, because the parser requires a valid type field. Submitting a valid AND group with an empty conditions array ({"type":"group","operator":"AND","conditions":[]}) is accepted and matches all entities; this is the correct way to retrieve all entities without filtering.
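
The empty-group behaviour follows the usual vacuous-truth convention, which Python's all() and any() share. The sketch below illustrates the semantics only and is not the server's evaluator; evaluate_group is a hypothetical name and results stands for the already-evaluated child conditions.

```python
def evaluate_group(operator, results):
    """Combine child results the way a GroupCondition is described above."""
    if operator == "AND":
        return all(results)   # all([]) is True: empty AND matches everything
    if operator == "OR":
        return any(results)   # any([]) is False: empty OR matches nothing
    raise ValueError("unknown group operator: " + str(operator))

# The match-all condition from the text above.
MATCH_ALL = {"type": "group", "operator": "AND", "conditions": []}
```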

ArrayCondition - match positional values in a JSON array:

{
"type": "array",
"jsonPath": "$.laureates",
"values": ["John", null, "Hopfield"]
}
  • type: "array"
  • jsonPath: path to the array field
  • values: positional values; null entries match any value at that index
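
Positional matching with null wildcards can be pictured as follows. This is a sketch under assumptions that the text does not settle: array_matches is a hypothetical helper, an array shorter than values is treated as a non-match, and elements beyond the last position in values are ignored.

```python
def array_matches(actual, expected):
    """Positional comparison; None in expected is a wildcard for that index."""
    if not isinstance(actual, list) or len(actual) < len(expected):
        return False  # assumption: too-short arrays do not match
    return all(e is None or a == e for a, e in zip(actual, expected))
```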

FunctionCondition - server-side function predicate dispatched to a compute member:

{
"type": "function",
"function": {
"name": "my-criteria-fn",
"config": {
"calculationNodesTags": "approval-service",
"attachEntity": true,
"responseTimeoutMs": 30000
}
}
}
  • type: "function"
  • function.name: string - identifies the function; becomes criteriaId / criteriaName in the dispatch request; required for routing
  • function.config.calculationNodesTags: string - comma-separated tags used to select a registered compute member; follows the same tag-intersection rules as processor dispatch
  • function.config.attachEntity: boolean (optional, default true) - when true, the full entity payload is included in the dispatch request
  • function.config.responseTimeoutMs: int64 (optional, default 30000) - timeout in milliseconds

The function is dispatched as EntityCriteriaCalculationRequest to the matching compute member; see the grpc topic for the request/response shape. FunctionCondition cannot be translated to a storage-plugin pushdown filter; it always executes as a post-filter with in-memory entity loading.
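
A FunctionCondition body can be assembled with the documented defaults spelled out. The builder below is a hypothetical client-side convenience, not part of any Cyoda SDK; it simply mirrors the field names and defaults listed above and joins tags with commas.

```python
def function_condition(name, tags, attach_entity=True, response_timeout_ms=30000):
    """Build a FunctionCondition dict using the documented field names."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "config": {
                # Comma-separated tag string, as described above.
                "calculationNodesTags": ",".join(tags),
                "attachEntity": attach_entity,
                "responseTimeoutMs": response_timeout_ms,
            },
        },
    }
```

Calling function_condition("my-criteria-fn", ["approval-service"]) reproduces the example document shown earlier.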

POST /api/search/direct/{entityName}/{modelVersion} - Synchronous search

  • entityName (path): string
  • modelVersion (path): int32
  • pointInTime (query, optional): RFC 3339 date-time; search against entity state at this instant
  • limit (query, optional): string-encoded integer, clamped to maximum 10000; default 1000

Request body: Condition JSON document.

Response: 200 OK, Content-Type: application/x-ndjson.

Each line is a complete entity envelope JSON object:

{"type":"ENTITY","data":{"category":"physics","year":"2024"},"meta":{"id":"74807f00-ed0d-11ee-a357-ae468cd3ed16","state":"NEW","creationDate":"2025-08-01T10:00:00.000000000Z","lastUpdateTime":"2025-08-01T10:00:00.000000000Z"}}
{"type":"ENTITY","data":{"category":"chemistry","year":"2023"},"meta":{"id":"89abc100-ed0d-11ee-a357-ae468cd3ed16","state":"APPROVED","creationDate":"2025-07-15T09:00:00.000000000Z","lastUpdateTime":"2025-07-20T14:00:00.000000000Z"}}

The stream is truncated on encode failure after the header has been sent; the client detects truncation via a connection error or incomplete last line.
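
On the client side, an incomplete last line shows up as a line that fails to parse as JSON. The sketch below works on an already-received body string and is an illustration only; parse_ndjson is a hypothetical helper, and a real client would more likely read the response incrementally.

```python
import json

def parse_ndjson(body):
    """Split an NDJSON body into entity envelopes and report truncation."""
    entities = []
    truncated = False
    for line in body.split("\n"):
        if not line.strip():
            continue  # skip blank lines, including the one after a final newline
        try:
            entities.append(json.loads(line))
        except json.JSONDecodeError:
            # An unparsable line is taken as a cut-off stream.
            truncated = True
            break
    return entities, truncated
```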

POST /api/search/async/{entityName}/{modelVersion} - Submit async search job

  • entityName (path): string
  • modelVersion (path): int32
  • pointInTime (query, optional): RFC 3339; if not provided, the current time is captured at submission

Request body: Condition JSON document.

Response: 200 OK, application/json. The body is a bare UUID string (job ID):

"a1b2c3d4-e5f6-11ee-9e63-ae468cd3ed16"

The job is stored with status RUNNING. For non-SelfExecutingSearchStore backends, a goroutine begins the search immediately using a background context derived from the submitting user’s tenant context.

GET /api/search/async/{jobId}/status - Get async job status

  • jobId (path): UUID

Response: 200 OK, application/json:

{
"searchJobStatus": "SUCCESSFUL",
"createTime": "2025-08-01T10:00:00.000000000Z",
"entitiesCount": 42,
"calculationTimeMillis": 145,
"finishTime": "2025-08-01T10:00:00.145000000Z",
"expirationDate": "2025-08-02T10:00:00.000000000Z"
}
  • searchJobStatus: "RUNNING", "SUCCESSFUL", "FAILED", or "CANCELLED"
  • createTime: RFC 3339 with nanoseconds
  • entitiesCount: total matching entities (0 while running)
  • calculationTimeMillis: elapsed search time in milliseconds
  • finishTime: RFC 3339 with nanoseconds; absent when status is RUNNING
  • expirationDate: createTime + 24h; job results expire after this time
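
The timestamp fields carry nanosecond fractions, which Python's datetime cannot represent in full, so a client has to truncate to microseconds before doing arithmetic such as createTime + 24h. The sketch below is one way to do that; parse_timestamp and expiration are hypothetical helpers, and the parser assumes the UTC "Z" suffix shown in the example response.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches timestamps such as 2025-08-01T10:00:00.000000000Z
_TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$")

def parse_timestamp(text):
    """Parse a Z-suffixed RFC 3339 timestamp, truncating to microseconds."""
    m = _TS.match(text)
    if m is None:
        raise ValueError("unsupported timestamp: " + text)
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    base = base.replace(tzinfo=timezone.utc)
    frac = (m.group(2) or "")[:6].ljust(6, "0")  # keep at most 6 digits
    return base + timedelta(microseconds=int(frac))

def expiration(create_time):
    """expirationDate as described above: createTime + 24h."""
    return parse_timestamp(create_time) + timedelta(hours=24)
```

With the example response, the difference between finishTime and createTime is 145 milliseconds, which agrees with calculationTimeMillis.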

GET /api/search/async/{jobId} - Retrieve async job results (paginated)

  • jobId (path): UUID
  • pageSize (query, optional): string-encoded integer, default 1000
  • pageNumber (query, optional): string-encoded integer, default 0; offset = pageNumber * pageSize

The job must be in SUCCESSFUL status. Returns 400 BAD_REQUEST if the job is not yet complete.
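
Because results are only available once the job is SUCCESSFUL, a client typically polls the status endpoint first. The loop below is a sketch with the HTTP call injected as a callable, so it makes no assumption about the HTTP library; wait_for_job, the polling interval, and the polling budget are all hypothetical choices.

```python
import time

TERMINAL = {"SUCCESSFUL", "FAILED", "CANCELLED"}

def wait_for_job(get_status, interval=0.5, max_polls=120, sleep=time.sleep):
    """Poll until the job leaves RUNNING; return the terminal status string.

    get_status is any callable returning the parsed status JSON as a dict.
    """
    for _ in range(max_polls):
        status = get_status()["searchJobStatus"]
        if status in TERMINAL:
            return status
        sleep(interval)
    raise TimeoutError("job still RUNNING after polling budget")
```

Only a return value of "SUCCESSFUL" means the results endpoint can be called; "FAILED" and "CANCELLED" are terminal but carry no result pages.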

Response: 200 OK, application/json:

{
"content": [
{
"type": "ENTITY",
"data": { "category": "physics", "year": "2024" },
"meta": {
"id": "74807f00-ed0d-11ee-a357-ae468cd3ed16",
"state": "NEW",
"creationDate": "2025-08-01T10:00:00.000000000Z",
"lastUpdateTime": "2025-08-01T10:00:00.000000000Z"
}
}
],
"page": {
"number": 0,
"size": 1000,
"totalElements": 42,
"totalPages": 1
}
}

Results are fetched from the stored entity snapshots at the job’s pointInTime. Entities deleted or modified after submission are returned as they existed at submission time.

PUT /api/search/async/{jobId}/cancel - Cancel a running async job

  • jobId (path): UUID

Cancellation succeeds only when the job status is RUNNING. If the job has already reached a terminal state (SUCCESSFUL, FAILED, or CANCELLED), the server returns 400 Bad Request:

{
"detail": "snapshot by id=<jobId> is not running. current status=SUCCESSFUL",
"properties": {
"currentStatus": "SUCCESSFUL",
"snapshotId": "<jobId>"
},
"status": 400,
"title": "Bad Request",
"type": "about:blank"
}

On successful cancellation, response: 200 OK, application/json:

{
"isCancelled": true,
"cancelled": true,
"currentSearchJobStatus": "CANCELLED"
}

Async search results use page-number pagination: pageNumber=0 is the first page, offset = pageNumber * pageSize. pageNumber and pageSize are both string-encoded integers in query parameters.
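
The page arithmetic can be worked through with a short helper. This is a sketch: page_window is a hypothetical name, and computing totalPages as a ceiling division is an assumption that happens to agree with the example response (42 elements at page size 1000 gives 1 page).

```python
import math

def page_window(page_number, page_size, total_elements):
    """Return (offset, items_on_page, total_pages) for page-number paging."""
    offset = page_number * page_size
    # Assumption: totalPages is the ceiling of totalElements / pageSize.
    total_pages = math.ceil(total_elements / page_size) if page_size > 0 else 0
    items = max(0, min(page_size, total_elements - offset))
    return offset, items, total_pages
```

For 1200 matches at pageSize=500, pageNumber=2 starts at offset 1000 and holds the final 200 entities.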

Synchronous search does not paginate; use the limit parameter (max 10000) to bound results. For large datasets, use async search with page retrieval.

  • errors.SEARCH_JOB_NOT_FOUND - 404 - async job UUID does not exist
  • errors.SEARCH_JOB_ALREADY_TERMINAL - 400 - cancel attempted on a job that is already SUCCESSFUL, FAILED, or CANCELLED; error code in response is BAD_REQUEST
  • errors.SEARCH_RESULT_LIMIT - result set exceeds configured limit
  • errors.SEARCH_SHARD_TIMEOUT - per-shard search timeout exceeded (relevant for distributed backends)
  • errors.BAD_REQUEST - 400 - malformed condition JSON, invalid limit/pageSize/pageNumber, result retrieval on non-SUCCESSFUL job, unknown async job ID in result retrieval

Synchronous search - match by field value:

curl -s -X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"type":"simple","jsonPath":"$.category","operatorType":"EQUALS","value":"physics"}' \
"http://localhost:8080/api/search/direct/nobel-prize/1"

Synchronous search - match by lifecycle state:

curl -s -X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"type":"lifecycle","field":"state","operatorType":"EQUALS","value":"APPROVED"}' \
"http://localhost:8080/api/search/direct/nobel-prize/1"

Synchronous search - AND group:

curl -s -X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"type": "group",
"operator": "AND",
"conditions": [
{"type":"simple","jsonPath":"$.year","operatorType":"EQUALS","value":"2024"},
{"type":"lifecycle","field":"state","operatorType":"EQUALS","value":"NEW"}
]
}' \
"http://localhost:8080/api/search/direct/nobel-prize/1"

Synchronous search at point in time with limit:

curl -s -X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"type":"group","operator":"AND","conditions":[]}' \
"http://localhost:8080/api/search/direct/nobel-prize/1?pointInTime=2025-08-01T00:00:00Z&limit=100"

Submit async search:

JOB_ID=$(curl -s -X POST \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"type":"simple","jsonPath":"$.year","operatorType":"EQUALS","value":"2024"}' \
"http://localhost:8080/api/search/async/nobel-prize/1" | tr -d '"')

Poll async job status:

curl -s -H "Authorization: Bearer $TOKEN" \
"http://localhost:8080/api/search/async/$JOB_ID/status"

Retrieve async results (page 0):

curl -s -H "Authorization: Bearer $TOKEN" \
"http://localhost:8080/api/search/async/$JOB_ID?pageNumber=0&pageSize=500"

Cancel an async job:

curl -s -X PUT \
-H "Authorization: Bearer $TOKEN" \
"http://localhost:8080/api/search/async/$JOB_ID/cancel"
  • crud
  • models
  • analytics
  • errors.SEARCH_JOB_NOT_FOUND
  • errors.SEARCH_JOB_ALREADY_TERMINAL
  • errors.SEARCH_RESULT_LIMIT
  • errors.SEARCH_SHARD_TIMEOUT
  • openapi
  • cyoda help crud - Entities are instances of models. Each entity has a UUID, a model reference (entityName, modelVersion), and a lifecycle state managed by the workflow engine. Creating an entity requires the referenced model to be in LOCKED state. All write operations run within a Cyoda transaction and return a transactionId alongside the affected entity IDs.
  • cyoda help models - A model is a named, versioned schema registered per tenant. Every entity in the system is an instance of exactly one model. Models are identified by (entityName, modelVersion). The model ID is a deterministic UUID v5 derived from that key: UUID.newSHA1(NameSpaceURL, "{entityName}.{modelVersion}").
  • cyoda help analytics - Cyoda Cloud exposes entity data as Trino SQL tables through a Trino connector. The connector uses the Schema Management REST API to discover table definitions and the WebSocket (STOMP) messaging API to stream entity rows at query time.
  • cyoda help errors SEARCH_JOB_NOT_FOUND - Polling a search job by ID returns this error when the job ID is unknown or belongs to a different tenant. Jobs are tenant-scoped; a valid job ID from one tenant is not visible to another.
  • cyoda help errors SEARCH_JOB_ALREADY_TERMINAL - Search jobs are long-running asynchronous operations. Once a job reaches a terminal state it cannot be cancelled, resumed, or otherwise modified. This error is returned when such an operation is attempted on a finished job.
  • cyoda help errors SEARCH_RESULT_LIMIT - The server imposes an upper bound on the number of results returned per page and per job to protect cluster resources. Returned when the request exceeds this limit, either by requesting too large a page size or by the matched result count exceeding the cap.
  • cyoda help errors SEARCH_SHARD_TIMEOUT - Distributed search fans out to multiple shards in parallel. If any shard does not return results before the search timeout expires, the job is marked failed and this error is returned. Occurs under high load, during partial cluster degradation, or with expensive queries.
  • cyoda help openapi - cyoda-go generates its OpenAPI 3.1 specification from the embedded api/openapi.yaml file compiled into the binary at build time. The spec is served at /openapi.json with runtime-patched server URLs. The Scalar API Reference UI is served at /docs and loads the spec from /openapi.json.