Concepts
Kyma is opinionated about a small number of things. Almost every load-bearing decision in the engine traces back to one of those opinions. The pages below are the mental model — read in order if you're starting; read by topic if you're deep on one.
If you only have time for two, start with The five invariants and The pruning cascade. Everything else is an instance of those two ideas.
What kyma is
The ten-minute version of the value prop. What the engine ingests, how it stores, how it answers, who it's for, and — explicitly — what it isn't.
The five invariants
Five non-negotiable architectural properties — object storage as the only source of truth, stateless compute, externalized catalog, pluggable format, pluggable parser. Encoded as architectural tests; regressions block merge.
The pruning cascade
Three levels of elimination — catalog, extent footer, block index — that skip 99% of bytes on 99% of queries. Why a query without a time bound plans like a full scan, and what to do about it.
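To make the three levels concrete, here is a minimal sketch of a cascade in Python. It is not kyma's planner: the `Extent` and `Block` classes, the `plan` function, and the choice of metadata at each level (a time range in the catalog, a single-column min/max in the footer, a per-block time range in the block index) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Hypothetical block-index entry: the time range covered by one block.
    min_ts: int
    max_ts: int
    size_bytes: int


@dataclass
class Extent:
    # Hypothetical catalog entry: coarse time range for the whole extent.
    min_ts: int
    max_ts: int
    # Hypothetical footer stats: min/max of one column ("status").
    footer_min_status: int
    footer_max_status: int
    blocks: list[Block] = field(default_factory=list)


def overlaps(lo, hi, q_lo, q_hi):
    """True if the closed ranges [lo, hi] and [q_lo, q_hi] intersect."""
    return lo <= q_hi and hi >= q_lo


def plan(extents, ts_range, status):
    """Return (bytes_to_read, bytes_total) for a query.

    ts_range is a (lo, hi) tuple or None; status is an int or None.
    A filter set to None cannot prune anything.
    """
    total = sum(b.size_bytes for e in extents for b in e.blocks)
    to_read = 0
    for e in extents:
        # Level 1, catalog: drop extents outside the query's time range.
        if ts_range is not None and not overlaps(e.min_ts, e.max_ts, *ts_range):
            continue
        # Level 2, extent footer: drop extents whose stats exclude the value.
        if status is not None and not (
            e.footer_min_status <= status <= e.footer_max_status
        ):
            continue
        # Level 3, block index: read only blocks overlapping the time range.
        for b in e.blocks:
            if ts_range is None or overlaps(b.min_ts, b.max_ts, *ts_range):
                to_read += b.size_bytes
    return to_read, total


extents = [
    Extent(0, 99, 200, 200, [Block(0, 49, 100), Block(50, 99, 100)]),
    Extent(100, 199, 200, 500, [Block(100, 149, 100), Block(150, 199, 100)]),
    Extent(200, 299, 200, 200, [Block(200, 249, 100), Block(250, 299, 100)]),
]

bounded = plan(extents, (150, 160), 500)   # (100, 600): one block survives
unbounded = plan(extents, None, None)      # (600, 600): nothing can be pruned
```

In this toy model the unbounded query reads every byte because every level needs a predicate to prune on, which is the intuition behind a query without a time bound planning like a full scan.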
Extents and snapshots
Append-only columnar files on object storage. CAS-committed snapshots. Iceberg-style manifests. The shape that makes ingest exactly-once and queries predictable.
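As a rough illustration of a CAS-committed snapshot, the sketch below reads CAS as compare-and-swap on a single snapshot pointer; that reading is an assumption. `Snapshot`, `SnapshotPointer`, and `commit_extent` are hypothetical names, and an in-memory lock stands in for whatever conditional-write primitive the real catalog uses.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    version: int
    extents: tuple[str, ...]  # paths of the extents visible in this snapshot


class SnapshotPointer:
    """Hypothetical stand-in for a catalog row holding the current snapshot."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = Snapshot(0, ())

    def read(self):
        with self._lock:
            return self._current

    def compare_and_swap(self, expected_version, new):
        # The swap succeeds only if nobody committed since we read.
        with self._lock:
            if self._current.version != expected_version:
                return False
            self._current = new
            return True


def commit_extent(pointer, extent_path, max_retries=5):
    """Publish an already-written extent by swapping in a new snapshot."""
    for _ in range(max_retries):
        base = pointer.read()
        if extent_path in base.extents:
            # Already committed: a retried ingest becomes a no-op.
            return base
        candidate = Snapshot(base.version + 1, base.extents + (extent_path,))
        if pointer.compare_and_swap(base.version, candidate):
            return candidate
        # Lost the race: re-read the pointer and try again.
    raise RuntimeError("too many concurrent commits")


pointer = SnapshotPointer()
first = commit_extent(pointer, "extent-a")
retry = commit_extent(pointer, "extent-a")   # same snapshot, no duplicate
second = commit_extent(pointer, "extent-b")
```

Because the extent file is written before the pointer moves and is never modified afterwards, a reader in this model sees either the old snapshot or the new one, never a partial commit.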
Schema model
Eight column types, the schema-only-widens rule, mid-batch evolution. When to use int vs long, string vs dynamic, vector(N) vs external embedding tables.
Dynamic and vectors
The two non-relational column types. CBOR-encoded dynamic with token and path indices for arbitrary structured data. Fixed-dimension vector(N) with cosine / L2 / inner-product UDFs.
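For reference, the three measures named above can be written out in plain Python. The function names here are illustrative only and are not kyma's UDF names; the sketch just shows what each measure computes on two vectors of the same dimension.

```python
import math


def _check_same_dim(a, b):
    # Fixed-dimension vectors must match in length to be comparable.
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")


def inner_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    _check_same_dim(a, b)
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance; smaller means more similar."""
    _check_same_dim(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Inner product of the normalized vectors, in the range [-1, 1]."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)
```

Cosine similarity ignores magnitude, L2 distance does not, and the inner product rewards both alignment and magnitude; for vectors already normalized to unit length, all three produce the same ranking.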
The agent loop
/v1/agent/ask. Natural-language question in, Server-Sent Events out. Schema RAG via pgvector keeps the agent's mental model accurate as schemas evolve. Read-only by design.
Multi-source data
How kyma joins your operational databases — Postgres, MySQL, MongoDB — with its own tables. Federation for live reads; CDC sync for fast historical queries; both at once via live(table).
Retention and compaction
Per-table retention policies. Compaction that merges small extents into fewer fat ones for better pruning. Tombstone collapse on synced tables. All of this is expressed as work-unit rows; adding capacity means starting another node.
Observability
How to tell what kyma is doing. Prometheus /metrics, the agent run trace, /v1/connectors/:id/status, the queryable kyma_connector_health table, and pushdown_summary — the trust mechanism for federation.
How to read this section
- First time? Read in order. Each page builds on the previous one. Quick path: invariants → cascade → extents → schema → agent loop.
- Debugging a slow query? Start with the pruning cascade, then pruning and performance for the practical companion.
- Choosing a column type? Schema model plus Dynamic and vectors cover it.
- Integrating with another database? Multi-source data, then the Connectors section for the operational details.
- Architecting a deployment? Start with the five invariants, then Architecture for the slice roadmap.