Records

Records are data objects stored in a stream. Whether they are mutable or immutable depends on the stream template. Records are created by ingesting data into a stream. They have a shape similar to that of instances in the Data Modeling service, and their schema is defined by the containers that are referenced as sources. Each record also links to the space it belongs to, which means space-based access control is supported for records. This page contains the specification for the records resource, which uses the following Data Modeling concepts:

Note: Every record has a required top-level externalId property, which can be used in some queries to retrieve or aggregate records (as a filtering or sorting property). For mutable records, this property (together with the space property) uniquely identifies records for write operations (create/upsert/delete). For immutable records, the uniqueness of the space + externalId pair is not enforced by the records API, allowing multiple records with the same identifier to exist.

Rate and concurrency limits

Both the rate of requests (denoted as requests per second, or ‘rps’) and the number of concurrent (parallel) requests are governed by limits for all CDF API endpoints. A request that exceeds one of the limits is throttled with a 429: Too Many Requests response. More on limit types and how to avoid being throttled is described here.
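One common way for a client to cope with 429 responses is to retry with exponential backoff and jitter. The following is a minimal sketch of that idea, not an official client: `call_with_backoff`, the `send` callable (assumed to perform one request and return a status code and body), and the delay parameters are all hypothetical names and values chosen for illustration.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, object]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, object]:
    """Call `send` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable that performs one API request and
    returns (status_code, body). The delay values are illustrative
    assumptions, not values taken from the service documentation.
    """
    status, body = 0, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the 429 back to the caller
        # Double the delay ceiling on each attempt, capped at max_delay.
        ceiling = min(max_delay, base_delay * (2 ** attempt))
        # "Full jitter": sleep a random time up to the ceiling so that
        # parallel clients do not all retry at the same moment.
        sleep(random.uniform(0, ceiling))
    return status, body
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting; in normal use the default `time.sleep` applies.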

Limits are defined on the API endpoints belonging to the service. Some types of requests consume more resources (compute, storage IO) than others. For example, ingest requests are less resource intensive than ‘Analytical’ type requests (Aggregate, Retrieve).

Limits for query endpoints (Sync, Retrieve, Aggregate) have a hierarchical structure:

  • All query endpoints share a common Query request budget (shown in the tables below)
  • The Retrieve endpoint has an additional dedicated budget that is checked first
  • The Aggregate endpoint has an additional dedicated budget that is checked first

The Sync endpoint only checks the Query request budget. The Retrieve and Aggregate endpoints must pass both their dedicated budget check AND the Query request budget check.

For example, with mutable streams you can make up to 40 rps in total across all query endpoints (Query budget limit), but at most 20 of those can be Retrieve requests (Retrieve budget limit) and at most 15 can be Aggregate requests (Aggregate budget limit). One combination that uses the full budget is 20 Retrieve + 15 Aggregate + 5 Sync = 40 rps in total.
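The arithmetic in this example can be expressed as a small check. The sketch below only models the overall per-second limits for mutable streams quoted above; it is not how the service implements throttling, it ignores the per-identity and concurrency limits, and the names `MUTABLE_RPS_LIMITS` and `fits_budgets` are hypothetical.

```python
# Overall requests-per-second limits for mutable streams, as quoted in the text.
MUTABLE_RPS_LIMITS = {"query": 40, "retrieve": 20, "aggregate": 15}


def fits_budgets(retrieve: int, aggregate: int, sync: int,
                 limits: dict = MUTABLE_RPS_LIMITS) -> bool:
    """Return True if a per-second request mix stays within all budgets."""
    # Dedicated budgets apply to Retrieve and Aggregate on their own.
    if retrieve > limits["retrieve"]:
        return False
    if aggregate > limits["aggregate"]:
        return False
    # Every query request (Sync included) also counts against the shared budget.
    return retrieve + aggregate + sync <= limits["query"]


print(fits_budgets(retrieve=20, aggregate=15, sync=5))   # mix from the text
print(fits_budgets(retrieve=21, aggregate=0, sync=0))    # exceeds Retrieve budget
print(fits_budgets(retrieve=20, aggregate=15, sync=6))   # exceeds shared Query budget
```

The first call describes the mix from the example and fits; the second fails on the dedicated Retrieve budget even though the shared Query budget has room; the third passes both dedicated checks but exceeds the shared budget.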

Query performance and rate limits vary between mutable and immutable streams due to their different storage characteristics.

Mutable streams provide consistent high-performance queries and higher rate limits (see "mutable streams" limits in the tables below).

Immutable streams are optimized for ingestion of very large amounts of data, which results in lower query performance and stricter rate limits compared to mutable streams (see "immutable streams" limits in the tables below).

When designing your data access patterns, consider using mutable streams if you need high-performance queries with higher rate limits, or immutable streams if your priority is high-volume data ingestion and long-term storage.

The amount of data that the service can return in responses from the query endpoints (Sync, Retrieve, Aggregate) is also limited. We therefore recommend reading only the data that will actually be used, by providing an appropriate filter and/or limiting the sources to be retrieved, rather than retrieving large amounts of data most of which will go unused.

The limits for the records endpoints are listed in the tables below. These limits are subject to change as consumption patterns and resource availability are reviewed over time:

Ingest request budget

                                            Overall   Per ID
Requests per second                         40        30
Concurrent requests                         20        15

Query request budget

                                            Overall   Per ID
Requests per second: mutable streams        40        30
Concurrent requests: mutable streams        30        22
Requests per second: immutable streams      10        7
Concurrent requests: immutable streams      10        7
Response MB per second                      4         3

Additional dedicated budgets for Retrieve and Aggregate endpoints:

The Retrieve and Aggregate endpoints have dedicated budgets that are checked in addition to the Query request budget shown above. A request to these endpoints must pass both budget checks.

Retrieve request budget

                                            Overall   Per ID
Requests per second: mutable streams        20        15
Concurrent requests: mutable streams        20        15
Requests per second: immutable streams      10        7
Concurrent requests: immutable streams      10        7

Note: Retrieve endpoint requests are checked against both this budget and the Query request budget above.

Aggregate request budget

                                            Overall   Per ID
Requests per second: mutable streams        15        12
Concurrent requests: mutable streams        10        7
Requests per second: immutable streams      5         4
Concurrent requests: immutable streams      5         4

Note: Aggregate endpoint requests are checked against both this budget and the Query request budget above.

Summary:

  • Sync endpoint: Only limited by Query request budget
  • Retrieve endpoint: Limited by Retrieve request budget AND Query request budget
  • Aggregate endpoint: Limited by Aggregate request budget AND Query request budget