Streams

Use the streams and records API to build high-volume extensions to industrial knowledge graphs that are built with Data Modeling. The streams API lets you manage the streams that are used for storing records. With the API you can create, list, retrieve, and delete streams, and view stream settings and statistics.

A stream defines the lifecycle of its data, not the schema, type, or source. Unrelated sets of data can share a stream, provided that the stream's settings fit the lifecycle and usage patterns (volume, rate, and so on) of all the data involved.

Streams can be either mutable or immutable, which affects how records can be modified, storage capacity limits, and query performance characteristics.

Mutable streams allow records to be updated and deleted by the user. They provide consistent low-latency access and optimal query performance. However, the total number of records that can be stored is limited. Mutable streams are well-suited for use cases requiring frequent data access and updates.

Immutable streams do not allow records to be updated or deleted by the user, but support ingestion of very large amounts of data. Query performance may vary depending on the age of the data being accessed, as the system optimizes storage over time.

To delete all data in a stream, delete the stream itself. To protect against accidental, irreversible deletion, streams are soft deleted, which allows them to be recovered for up to 6 weeks after deletion. The template used to create the stream determines the actual recovery time.

A project can hold only a limited number of soft-deleted streams at any given time. To avoid hitting this limit, do not create and delete streams at high frequency. Streams are expected to be long-lived; the exception is streams created with one of the test templates. Deleting a stream can take a long time, depending on the stream settings and the volume of data stored. While a stream is soft deleted, you cannot create a new stream with the same identifier as the deleted stream.

Once a stream is deleted, it no longer counts as one of the active streams, so more streams can be created to serve as active streams. If a stream is deleted by accident, you can recover the data by contacting Cognite Support. You must contact Cognite at least 1 week before the stream retention period expires to ensure that the data can be recovered.
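The recovery window described above can be worked out with simple date arithmetic. The following is a minimal sketch, not part of the API: the function name and the lookup table are hypothetical, the retention values are copied from the template descriptions later in this document, and it assumes that the "stream retention period" in the support rule refers to the soft-delete retention.

```python
from datetime import datetime, timedelta, timezone

# Soft-delete retention per template, copied from the template descriptions
# in this document.
SOFT_DELETE_RETENTION = {
    "ImmutableTestStream": timedelta(days=1),
    "BasicArchive": timedelta(weeks=6),
    "BasicLiveData": timedelta(weeks=6),
}

# Support must be contacted at least this long before the retention expires.
SUPPORT_LEAD_TIME = timedelta(weeks=1)


def recovery_deadlines(template: str, deleted_at: datetime):
    """Return (hard_delete_time, latest_support_contact_time).

    The second value is None when the soft-delete retention is shorter than
    the required lead time, so the one-week notice cannot be given.
    (Hypothetical helper for illustration only.)
    """
    retention = SOFT_DELETE_RETENTION[template]
    hard_delete = deleted_at + retention
    if retention < SUPPORT_LEAD_TIME:
        return hard_delete, None
    return hard_delete, hard_delete - SUPPORT_LEAD_TIME


if __name__ == "__main__":
    deleted = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(recovery_deadlines("BasicArchive", deleted))
```

Under these assumptions, a stream with a 6-week soft-delete retention leaves 5 weeks in which to contact support, while a stream with a 1-day retention leaves no room for the one-week notice.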

Available stream templates

Note: Stream Templates are in Beta

The current Stream Templates are in beta. This means:

  • New templates may be added based on customer needs
  • Existing templates may be modified or removed if necessary
  • Such modifications will not affect existing streams created from these templates

Choose your template carefully for production use, as templates cannot be changed after stream creation.

This section lists all currently available templates that can be used for creating streams.

Immutable streams

ImmutableTestStream

This template should be used exclusively for experimentation. It is configured for high throughput and high total data volume, but has a short data retention period. The short soft-delete retention means that such streams can be discarded quickly when no longer needed, or recreated to remove the experimental data.

Note: This template should never be used for production purposes. Because this template allows significant load on the system, we can, as a last resort, change the settings of streams created from this template if we detect improper usage patterns.

  • Maximum total number of records - 50 M (50,000,000)
  • Maximum total data volume - 50 GB
  • Data retention - 7 days
  • Maximum ingestion throughput (per 10 minutes) - 1.5 GB
  • Maximum reading throughput (per 10 minutes) - 1.5 GB
  • Maximum records ingested (per 10 minutes) - 800,000 items
  • Maximum unique properties with data (across all records) - 1000
  • Maximum range filter interval for the lastUpdatedTime property - 7 days
  • Stream soft-delete retention (before hard delete) - 1 day
  • Maximum active streams per project - 3

BasicArchive

This template is intended for perpetual storage of data. However, overall data volume is limited, which needs to be taken into account when planning usage.

  • Maximum total number of records - 50 M (50,000,000)
  • Maximum total data volume - 50 GB
  • Data retention - Unlimited (data never gets deleted)
  • Maximum ingestion throughput (per 10 minutes) - 170 MB
  • Maximum reading throughput (per 10 minutes) - 1.7 GB
  • Maximum records ingested (per 10 minutes) - 170,000 items
  • Maximum unique properties with data (across all records) - 1000
  • Maximum range filter interval for the lastUpdatedTime property - 365 days
  • Stream soft-delete retention (before hard delete) - 6 weeks
  • Maximum active streams per project - 2

Mutable streams

BasicLiveData

This template is intended for production usage and offers significant data volume and throughput.

  • Maximum total number of records - 5 M (5,000,000)
  • Maximum total data volume - 15 GB
  • Maximum ingestion throughput (per 10 minutes) - 170 MB
  • Maximum reading throughput (per 10 minutes) - 500 MB
  • Maximum records ingested (per 10 minutes) - 170,000 items
  • Maximum records updated or deleted (per 10 minutes) - 85,000 items
  • Maximum unique properties with data (across all records) - 1000
  • Stream soft-delete retention (before hard delete) - 6 weeks
  • Maximum active streams per project - 1
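When planning which template to use, it can help to compare an expected workload against the limits listed above. The following is a minimal sketch under stated assumptions: the class, the function, and all parameter names are hypothetical, only a subset of the listed limits is checked, and MB and GB are assumed to be decimal units, which this document does not specify.

```python
from dataclasses import dataclass

# Assumption: decimal units. The document does not say whether MB/GB are
# decimal or binary.
MB = 1000**2
GB = 1000**3


@dataclass(frozen=True)
class TemplateLimits:
    name: str
    mutable: bool                  # records can be updated and deleted
    test_only: bool                # must not be used for production
    max_records: int
    max_bytes: int
    ingest_records_per_10min: int
    ingest_bytes_per_10min: int


# Values copied from the template lists above.
TEMPLATES = [
    TemplateLimits("ImmutableTestStream", False, True, 50_000_000, 50 * GB, 800_000, 1_500 * MB),
    TemplateLimits("BasicArchive", False, False, 50_000_000, 50 * GB, 170_000, 170 * MB),
    TemplateLimits("BasicLiveData", True, False, 5_000_000, 15 * GB, 170_000, 170 * MB),
]


def fitting_templates(*, needs_updates, production, total_records,
                      total_bytes, records_per_10min, bytes_per_10min):
    """Return the names of templates whose limits cover the workload."""
    result = []
    for t in TEMPLATES:
        if needs_updates and not t.mutable:
            continue  # immutable streams cannot update or delete records
        if production and t.test_only:
            continue  # test templates are for experimentation only
        if total_records > t.max_records or total_bytes > t.max_bytes:
            continue
        if (records_per_10min > t.ingest_records_per_10min
                or bytes_per_10min > t.ingest_bytes_per_10min):
            continue
        result.append(t.name)
    return result
```

An empty result suggests that the workload does not fit a single stream from any current template. The sketch ignores the remaining limits, such as reading throughput, unique properties, and active streams per project.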

Rate and concurrency limits

Both the request rate (in requests per second, 'rps') and the number of concurrent (parallel) requests are limited for all CDF API endpoints. A request that exceeds one of the limits is throttled with a 429: Too Many Requests response. See Resource throttling for limit types and how to avoid throttling.
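A common way for a client to handle a 429 response is to retry with exponential backoff and jitter. The following is a minimal sketch of that general technique, not a description of any Cognite SDK behavior: the `send` callable, the function name, and the delay parameters are all hypothetical, and `send` stands in for whatever code performs the actual request.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a status other than 429.

    send is a hypothetical callable returning (status_code, body).
    Between attempts, wait a random time between 0 and an exponentially
    growing cap ("full jitter"). After max_attempts throttled responses,
    the last (429, body) result is returned to the caller.
    """
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        cap = min(max_delay, base_delay * 2**attempt)
        sleep(random.uniform(0, cap))
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. The default values are illustrative and should be tuned to the limits that apply to the endpoint.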

As streams are intended to be long-lived, users are not expected to interact with these endpoints frequently.

The limits for the streams endpoints are shown in the tables below. These limits are subject to change, pending review of consumption patterns and resource availability over time:

Create and Delete request budget

                      Overall   Per ID
Requests per second   2         1
Concurrent requests   1         1

Retrieve and List request budget

                      Overall   Per ID
Requests per second   5         3
Concurrent requests   3         2
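A client that must issue several stream requests in a row can pace itself to stay within the request budgets above. The following is a minimal single-threaded sketch: the `Pacer` class and the dictionary keys are hypothetical names, the numbers are the Overall values from the tables, and spacing calls evenly is only one possible way to respect a requests-per-second limit.

```python
import time

# Overall budgets from the tables above:
# (requests per second, concurrent requests).
OVERALL_BUDGET = {
    "create_delete": (2, 1),
    "retrieve_list": (5, 3),
}


class Pacer:
    """Space calls at least 1/rps seconds apart (single-threaded use)."""

    def __init__(self, rps, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval


if __name__ == "__main__":
    pacer = Pacer(OVERALL_BUDGET["create_delete"][0])
    for _ in range(3):
        pacer.wait()
        # A create or delete request would be sent here.
```

Because the calls run one after another, this sketch also stays within a concurrency limit of 1. A multi-threaded client would need additional coordination, for example a semaphore sized to the concurrent request budget.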