Records are data objects stored in a stream. Whether they are immutable or mutable depends on the stream template. Records are created by ingesting data into a stream.
Records have a shape similar to that of instances in the Data Modeling service,
and their schema is defined by the containers that are referenced as sources.
Each record also links to the space that it belongs to, which means that space-based access control is supported for records as well.
This page contains the specification for the records resource, which uses the following Data Modeling concepts:
Getting Access: The Records service is generally available (GA). To enable Records on your CDF project, contact your Cognite representative or Cognite Support.
Note: Every record has a required top-level externalId property, which can be used
in some queries to retrieve or aggregate records (as a filtering or sorting property).
For mutable records, this property (together with the space property) uniquely identifies records for write operations (create/upsert/delete).
For immutable records, the uniqueness of the space + externalId pair is not enforced by the records API, allowing multiple records with the same identifier to exist.
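The difference in identifier semantics can be illustrated with a small in-memory model. This is only a sketch: `StreamModel`, `write`, and `find` are hypothetical names invented for illustration and are not part of the Records API. Treating every write as an upsert is also a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class StreamModel:
    """Toy in-memory stand-in for a stream (hypothetical, not the Records API)."""

    mutable: bool
    records: list = field(default_factory=list)

    def write(self, space: str, external_id: str, payload: dict) -> None:
        key = (space, external_id)
        if self.mutable:
            # Mutable stream: (space, externalId) identifies exactly one record,
            # so a second write with the same pair replaces the first.
            for i, (existing_key, _) in enumerate(self.records):
                if existing_key == key:
                    self.records[i] = (key, payload)
                    return
        # Immutable stream (or first write to a mutable one): append the record.
        # Uniqueness of (space, externalId) is not enforced here.
        self.records.append((key, payload))

    def find(self, space: str, external_id: str) -> list:
        """Return the payloads of all records stored under the identifier."""
        return [p for k, p in self.records if k == (space, external_id)]
```

With this model, two writes of the same `space` + `externalId` pair leave one record in a mutable stream and two records in an immutable one, so consumers of immutable streams should not assume an identifier resolves to a single record.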
Both the rate of requests (denoted as requests per second, or ‘rps’) and the number of concurrent (parallel)
requests are governed by limits, for all CDF API endpoints. If a request exceeds one of the limits,
it will be throttled with a 429: Too Many Requests response. More on limit types
and how to avoid being throttled is described
here.
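One common way for a client to react to a 429 response is to retry with exponential backoff and jitter. The sketch below is a generic pattern, not a documented Cognite client feature: `call_with_backoff`, the `send` callable, and the delay parameters are all assumptions chosen for illustration. `send` is expected to perform one request and return a `(status_code, body)` pair.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, object]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, object]:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last 429 to the caller
        # Exponential backoff capped at max_delay, with full jitter so that
        # parallel clients do not all retry at the same moment.
        cap = min(max_delay, base_delay * 2 ** attempt)
        sleep(random.uniform(0, cap))
    return status, body
```

The `sleep` parameter is injectable so that the function can be exercised in tests without actually waiting.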
Limits are defined on the API endpoints belonging to the service. Some types of requests consume more resources (compute, storage IO) than others. For example, ingest requests are less resource intensive than ‘Analytical’ type requests (Aggregate, Retrieve).
Limits for query endpoints (Sync, Retrieve, Aggregate) have a hierarchical structure. Top-level limits apply to requests to all of these endpoints combined. In addition, there are dedicated limits for the Retrieve and Aggregate endpoints. For example, it is possible to send up to 40 rps to the Sync, Retrieve and Aggregate endpoints together, but only up to 20 of them can go to the Retrieve endpoint and up to 15 to the Aggregate endpoint.
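The hierarchical structure can be expressed as a simple check. The sketch below uses the figures from the example above (40, 20, 15), which are illustrative and not necessarily the limits in force. `EXAMPLE_LIMITS` and `within_limits` are hypothetical names.

```python
# Example figures taken from the text above; actual limits may differ.
EXAMPLE_LIMITS = {"total": 40, "retrieve": 20, "aggregate": 15}


def within_limits(sync_rps, retrieve_rps, aggregate_rps, limits=EXAMPLE_LIMITS):
    """Return True if a planned request mix respects every level of the hierarchy."""
    total = sync_rps + retrieve_rps + aggregate_rps
    return (
        total <= limits["total"]  # top-level limit shared by all three endpoints
        and retrieve_rps <= limits["retrieve"]  # dedicated Retrieve limit
        and aggregate_rps <= limits["aggregate"]  # dedicated Aggregate limit
    )
```

For instance, 5 rps of Sync plus 20 rps of Retrieve plus 15 rps of Aggregate fits exactly, while raising Sync to 10 rps breaks the top-level limit even though each dedicated limit is still respected.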
Since requests that touch cold data can be considerably slower and consume more resources,
query endpoints (Sync, Retrieve, Aggregate) have different limits for hot and cold data.
It is therefore important to build use cases so that frequent operations are performed only on hot data
and cold data is touched less often. Multiple stream creation templates with different
hot storage durations are available to choose from.
For the Retrieve and Aggregate endpoints, a request is considered to touch cold data
if the interval defined by the lastUpdatedTime filter extends beyond the stream's hot storage duration.
For the Sync endpoint, a request is considered to touch cold data if its cursor attribute
points to data that is in cold storage. If the cursor attribute is not provided,
the initializeCursor attribute is considered instead.
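A client can estimate in advance whether a Retrieve or Aggregate request will be counted against the cold-data limits. The sketch below assumes that the hot storage window is measured backwards from the current time and that only the lower bound of the lastUpdatedTime interval matters. Both are interpretations of the rule above, not confirmed service behaviour, and `touches_cold_data` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def touches_cold_data(
    interval_start: datetime,
    hot_duration: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the filter interval starts before the hot window begins.

    Assumption: the hot window covers [now - hot_duration, now].
    """
    if now is None:
        now = datetime.now(timezone.utc)
    hot_window_start = now - hot_duration
    return interval_start < hot_window_start
```

Under these assumptions, a stream with a 7-day hot storage duration treats a query starting 3 days ago as hot and a query starting 10 days ago as cold.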
The amount of data that the service can return in responses from the query endpoints (Sync, Retrieve, Aggregate)
is also limited. It is therefore recommended to read only the data that will actually be used
(by providing an appropriate filter and/or limiting the sources to be retrieved) rather than
retrieving excessive amounts of data, most of which will not be used.
The version limits for the records endpoints are illustrated in the diagram below. These limits are subject to change, pending review of changing consumption patterns and resource availability over time:
