Use the streams and records APIs to build high-volume extensions to industrial knowledge graphs built with Data Modeling. The streams API lets you manage the streams that are used for storing records. With the API you can define and retrieve streams and, eventually, set and get the settings (and statistics) associated with each stream.
A stream defines the data lifecycle, not the schema, type, or source. Unrelated sets of data can be put into the same stream, provided that the settings of this stream fit the lifecycle and usage patterns (volume, rate, etc.) of the data involved.
Streams can be either mutable or immutable, which affects how records can be modified, storage capacity limits, and query performance characteristics.
Mutable streams allow records to be updated and deleted by the user. They provide consistent low-latency access and optimal query performance. However, the total number of records that can be stored is limited. Mutable streams are well-suited for use cases requiring frequent data access and updates.
Immutable streams do not allow records to be updated or deleted by the user, but support ingestion of very large amounts of data. Query performance may vary depending on the age of the data being accessed, as the system optimizes storage over time.
To delete all data in a stream, the stream itself should be deleted. To protect against irretrievable erroneous deletion, streams are 'soft deleted' allowing them to be recovered for up to 6 weeks after the time of deletion. The template used to create the stream determines the actual recovery time.
A single project can have only a limited number of soft-deleted streams at any given time. To avoid hitting this limit, avoid any pattern that creates and deletes streams at high frequency.
Please note that we expect streams to be long-lived. The exception is streams created with one of the test templates. Deleting a stream can take a long time. How long depends on the stream settings and the volume of data stored.
While a stream is soft deleted, it is not possible to recreate a stream with the same identifier as the deleted stream.
Once a stream is deleted, it does not count as one of the active streams, and more streams can be created to serve as active streams. If a stream is accidentally deleted, it is possible to recover the data by contacting Cognite Support. You must contact Cognite no less than 1 week prior to the expiration of the stream retention period to ensure we can recover the data.
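As a rough illustration of the timing described above, the following Python sketch computes the latest moment at which Cognite Support can still be contacted about a soft-deleted stream. The function name and the `recovery_period` parameter are hypothetical and not part of any Cognite API. The 6-week value is only the stated upper bound; the real recovery period depends on the template used to create the stream.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 6 weeks is the maximum recovery window stated in the text;
# the template used to create the stream determines the actual value.
MAX_RECOVERY_PERIOD = timedelta(weeks=6)
# Support must be contacted at least 1 week before the retention period expires.
SUPPORT_LEAD_TIME = timedelta(weeks=1)


def latest_support_contact(
    deleted_at: datetime,
    recovery_period: timedelta = MAX_RECOVERY_PERIOD,
) -> datetime:
    """Return the last moment at which a recovery request can still be made."""
    if recovery_period < SUPPORT_LEAD_TIME:
        raise ValueError("recovery period is shorter than the required lead time")
    return deleted_at + recovery_period - SUPPORT_LEAD_TIME


# Hypothetical deletion time, used only for illustration.
deleted_at = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(latest_support_contact(deleted_at))
```

With a shorter recovery period, such as the one a test template may use, pass the corresponding `timedelta` as `recovery_period`.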
Note: Stream Templates are in Beta
The current Stream Templates are in beta. This means:
Choose your template carefully for production use, as templates cannot be changed after stream creation.
This section lists all currently available templates that can be used for creating streams.
This template should be used exclusively for experimentation. It is configured for high throughput and total data volume, but has short data retention. Low retention in a soft-deleted state means that such streams can be quickly discarded when no longer needed, or recreated to get rid of the experimental data.
Note: This template should never be used for production purposes. Because this template allows significant load on the system, we may, as a last resort, change the settings of streams created from it if we detect improper usage patterns.
lastUpdatedTime property is 7 days.
This template is intended for perpetual storage of data. However, overall data volume is limited, which needs to be taken into account when planning usage.
lastUpdatedTime property is 365 days.
This template is intended for production usage and offers significant data volume and throughput.
For all CDF API endpoints, both the rate of requests (denoted as requests per second, or 'rps') and the number of concurrent (parallel) requests are governed by limits. If a request exceeds one of the limits, it is throttled with a 429: Too Many Requests response. See the CDF API documentation for more on limit types and how to avoid being throttled.
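One common way for a client to react to a 429 response is to retry with exponential backoff and jitter. The sketch below illustrates that general technique; it is not taken from the CDF documentation. `call_with_backoff` and its parameters are hypothetical, and `send` stands for any callable that performs one request and returns a `(status_code, body)` pair.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, object]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, object]:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter: wait a random time
            # between 0 and min(max_delay, base_delay * 2**attempt).
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
    # Every attempt was throttled; hand the last 429 back to the caller.
    return status, body


# Usage with a stand-in for a real request: throttled twice, then succeeds.
responses = iter([(429, None), (429, None), (200, "ok")])
result = call_with_backoff(lambda: next(responses), sleep=lambda s: None)
print(result)
```

The `sleep` parameter is injectable so the retry logic can be exercised without real delays.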
As streams are intended to be long-lived, users are not expected to interact with these endpoints frequently.
The request limits for the streams endpoints are shown in the tables below. These limits are subject to change, pending review of changing consumption patterns and resource availability over time:
| Create and Delete request budget | Overall | Per ID |
|---|---|---|
| Requests per second | 2 | 1 |
| Concurrent requests | 1 | 1 |

| Retrieve and List request budget | Overall | Per ID |
|---|---|---|
| Requests per second | 5 | 3 |
| Concurrent requests | 3 | 2 |
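To stay inside a budget like the ones above, a client can pace its own requests. The following sketch enforces a minimum interval between calls, for example 0.5 s for a budget of 2 requests per second. `RequestPacer` is a hypothetical helper, not part of any Cognite SDK. It is not thread-safe; calling it sequentially from a single thread also keeps the number of concurrent requests at 1.

```python
import time
from typing import Callable, Optional


class RequestPacer:
    """Space out calls so that at most `max_rps` requests start per second."""

    def __init__(
        self,
        max_rps: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_rps <= 0:
            raise ValueError("max_rps must be positive")
        self._interval = 1.0 / max_rps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request may start; return the seconds waited."""
        now = self._clock()
        waited = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            waited = self._next_allowed - now
            self._sleep(waited)
            now = self._next_allowed
        # The following request may start one interval after this one.
        self._next_allowed = now + self._interval
        return waited


# Assumption: 2 rps is the overall Create and Delete budget from the table above.
pacer = RequestPacer(max_rps=2)
pacer.wait()  # the first call never waits; send the request after it returns
```

Call `pacer.wait()` immediately before each create or delete request. A separate pacer with a higher rate could be used for retrieve and list requests.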