Use the streams and records APIs to build high-volume extensions to industrial knowledge graphs built with Data Modeling. The streams API lets you manage the streams that store records. With the API you can define and retrieve streams, and eventually set and get the settings (and statistics) associated with each stream.
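As a rough illustration of defining a stream, the sketch below only builds the request URL and JSON body and sends nothing. The endpoint path, the `externalId` and `settings.template.name` fields, the base URL, the project name, and the template name are all assumptions made for this example; check them against the API reference before use.

```python
import json

# Assumed values for illustration only; none of these come from the text above.
BASE_URL = "https://api.cognitedata.com"  # assumed cluster URL
PROJECT = "my-project"                     # hypothetical CDF project name


def streams_url(base_url: str, project: str) -> str:
    """Return the assumed collection URL for the streams of a project."""
    return f"{base_url}/api/v1/projects/{project}/streams"


def create_stream_body(external_id: str, template_name: str) -> dict:
    """Build an assumed request body that creates one stream from a template."""
    return {
        "items": [
            {
                "externalId": external_id,
                # The template fixes the stream settings at creation time.
                "settings": {"template": {"name": template_name}},
            }
        ]
    }


if __name__ == "__main__":
    # "SomeTemplateName" is a placeholder, not a real template name.
    body = create_stream_body("my-stream", "SomeTemplateName")
    print("POST", streams_url(BASE_URL, PROJECT))
    print(json.dumps(body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to inspect the request before sending it with whichever client you use.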
Getting Access: The Records service is generally available (GA). To enable Records on your CDF project, contact your Cognite representative or Cognite Support.
A stream defines the data lifecycle, not the schema, type, or source of the data. Unrelated data sets can be put into the same stream, provided that the settings of the stream fit the lifecycle and usage patterns (volume, rate, etc.) of all the data involved.
Data in a stream passes through two phases that define its access conditions (limits, latency, etc.): the hot phase and the cold phase.
When ingested, data is in the hot phase and has the lowest access latency. As time passes,
data transitions to the cold phase. Access to cold data can be slower, and stricter limits may be applied.
Hot data is expected to be accessed more often than cold data.
The duration of each phase depends on the stream settings
(the stream creation template defines the settings at the moment of stream creation).
Customization of stream settings is possible, but there is no API for it yet.
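The hot and cold phases can be pictured with a small classification function. This is a minimal sketch, assuming the phase is determined by comparing a record's age since ingestion with the hot phase duration from the stream settings. The function name and the exact boundary behavior are illustrative, not part of the API.

```python
from datetime import datetime, timedelta, timezone


def phase_of(ingested_at: datetime, now: datetime, hot_duration: timedelta) -> str:
    """Classify data as 'hot' or 'cold' by its age relative to the hot phase duration."""
    age = now - ingested_at
    # Assumption: data is hot strictly before the hot phase duration has elapsed.
    return "hot" if age < hot_duration else "cold"


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    one_day = timedelta(days=1)
    print(phase_of(now - timedelta(hours=2), now, one_day))
    print(phase_of(now - timedelta(days=3), now, one_day))
```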
To delete all data in a stream, delete the stream itself. To protect against irretrievable erroneous deletion, streams are 'soft deleted', which allows them to be recovered for up to 6 weeks after the time of deletion. The template used to create the stream determines the actual recovery time.
A single project can have only a limited number of soft-deleted streams at any given time. To avoid hitting this limit,
avoid creating and deleting streams at high frequency.
Note that streams are expected to be long-lived. The exceptions are streams created
with one of the test templates. Deleting a stream can take a long time.
How long depends on the stream settings and the volume of data stored.
While a stream is soft deleted, it is not possible to recreate a stream with the same identifier as the deleted stream.
Once a stream is deleted, it does not count as one of the active streams, and more streams can be created to serve as active streams. If a stream is accidentally deleted, it is possible to recover the data by contacting Cognite Support. You must contact Cognite no less than 1 week prior to the expiration of the stream retention period to ensure we can recover the data.
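The recovery deadline is simple date arithmetic. The sketch below assumes that the "stream retention period" refers to the soft-delete recovery window set by the template. The function name and the example dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# From the text: contact Cognite no less than 1 week before the period expires.
SUPPORT_LEAD_TIME = timedelta(weeks=1)


def latest_support_contact(deleted_at: datetime, recovery_window: timedelta) -> datetime:
    """Return the latest time to contact Cognite Support about a soft-deleted stream."""
    expires_at = deleted_at + recovery_window
    return expires_at - SUPPORT_LEAD_TIME


if __name__ == "__main__":
    deleted = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # Assumption: a template that gives the maximum 6-week recovery window.
    print(latest_support_contact(deleted, timedelta(weeks=6)).date())  # 2024-02-05
```

With a 6-week window, a stream deleted on 1 January expires on 12 February, so Support must be contacted by 5 February. Templates with shorter recovery times leave correspondingly less room.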
Note: Stream Templates are in Beta
The current Stream Templates are in beta. This means:
Choose your template carefully for production use, as templates cannot be changed after stream creation.
This section lists all currently available templates that can be used for creating streams.
This template should be used exclusively for experimentation. It is configured for high throughput and total data volume, but has short data retention. Low retention in a soft-deleted state means that such streams can be quickly discarded when no longer needed, or recreated to get rid of the experimental data.
Note: This template should never be used for production purposes. Because this template allows significant load on the system, we may, as a last resort, change the settings of streams created from this template if we detect improper usage patterns.
[...] lastUpdatedTime property is 7 days.
hot phase duration is 1 day.
[...] hot phase is [...] cold phase.
This template is intended for perpetual storage of data. However, overall data volume is limited, which needs to be taken into account when planning usage.
[...] lastUpdatedTime property is 365 days.
hot phase duration is 1 day.
[...] hot phase is [...] cold phase.
This template is intended for production usage and offers significant data volume and throughput.
For all CDF API endpoints, limits govern both the rate of requests (denoted as requests per second, or 'rps') and the number of concurrent (parallel)
requests. If a request exceeds one of the limits,
it is throttled with a 429: Too Many Requests response. The limit types
and ways to avoid being throttled are described
in the CDF documentation on request throttling.
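One common way to handle a 429 response is to retry with exponential backoff and jitter. The sketch below is a generic client-side pattern, not a documented CDF recommendation. The `send` callable, the attempt count, and the delays are assumptions made for this example.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry with exponential backoff while it returns status 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    # Still throttled after the last attempt: hand the 429 back to the caller.
    return status, body


if __name__ == "__main__":
    # Hypothetical stand-in for a real HTTP call: throttled twice, then succeeds.
    responses = iter([(429, "slow down"), (429, "slow down"), (200, "ok")])
    print(call_with_backoff(lambda: next(responses), sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the function testable without real waiting. In production code, the default `time.sleep` applies.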
As streams are intended to be long-lived, users are not expected to interact with these endpoints frequently.
The version limits for the streams endpoints are illustrated in the diagram below. These limits are subject to change, pending review of changing consumption patterns and resource availability over time:
