Streaming data allows API servers to send and receive data in real-time or in chunks, rather than waiting for the entire response to be ready. This is already how browsers handle HTML, images, and other media, and now it can be done for APIs working with JSON.
This can speed up responses that carry a lot of data, or push events from server to client in real time without polling or the added complexity of Webhooks or WebSockets. Streaming works by sending “chunks”, which clients can work with individually instead of waiting for the entire response to be ready.
Streaming JSON in particular is increasingly useful as expectations around big data, data science, and AI continue to grow. JSON on its own does not stream very well, but a few standards and conventions have popped up to expand JSON into a streamable format, and OpenAPI v3.2 introduces keywords to describe data in these stream formats.
JSON Streaming
Streaming JSON is a bit tricky because JSON is not designed to be streamed. A naive approach might look like this:
[
  {"timestamp": "1985-04-12T23:20:50.52Z", "level": 1, "message": "Hi!"},
This would trip up most tooling (because the closing bracket never arrives until the stream ends), but we can use something like JSON Lines (a.k.a. JSONL) to send one JSON instance per line.
{"timestamp": "1985-04-12T23:20:50.52Z", "level": 1, "message": "Hi!"}
{"timestamp": "1985-04-12T23:20:51.37Z", "level": 1, "message": "Hows it hangin?"}
{"timestamp": "1985-04-12T23:20:53.29Z", "level": 1, "message": "Bye!"}
This format allows each line to be a valid JSON object, making it easy to parse with standard native tooling and a for loop (see the sketch below). There are a bunch of other streaming formats you might want to work with in your API, like Newline Delimited JSON (NDJSON), JSON Text Sequence, and GeoJSON Text Sequence. Thankfully they are all quite similar, and working with them in OpenAPI is almost identical.
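To show what “standard native tooling and a for loop” means in practice, here is a minimal client-side sketch, assuming Node.js 18+ (where fetch is built in and the response body is async-iterable); the URL is hypothetical:

const response = await fetch("https://api.example.com/logs");
const decoder = new TextDecoder();
let buffer = "";

for await (const chunk of response.body) {
  buffer += decoder.decode(chunk, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep any partial trailing line for the next chunk
  for (const line of lines) {
    if (line.trim()) console.log(JSON.parse(line));
  }
}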
Streaming with OpenAPI
OpenAPI v3.0 & v3.1 were able to stream binary data, but struggled to support JSON streaming formats as there was no standard way to define the schema of individual events in a stream. People would try to describe things as an array:
content:
  application/jsonl:
    schema:
      type: array
      items:
        type: object
        properties:
          timestamp:
            type: string
            format: date-time
          level:
            type: integer
          message:
            type: string
You might see this sort of thing around, but it’s not valid and will confuse tooling. A stream cannot be described as a single array: it is a sequence of multiple objects on new lines, which is rather different. Some tools could spot the application/jsonl content type and figure that out, but we don’t need awkward hacks anymore because the OpenAPI team have solved the problem.
OpenAPI v3.2 introduces two new keywords to describe streamed data and events:

- itemSchema - defines the structure of each item in a stream.
- itemEncoding - defines how those items are encoded (or serialized), as text, JSON, binary, etc.
itemSchema
Describing a stream with itemSchema works just like schema, with one difference: it is applied to each item in the stream, instead of the entire response.
Consider an example like the Train Travel API returning a stream of tickets:
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: application/jsonl; charset=utf-8
Transfer-Encoding: chunked
Date: Tue, 19 Aug 2025 18:36:10 GMT
Connection: keep-alive
Keep-Alive: timeout=5
{"train":"ICE 123","from":"Berlin","to":"Munich","price":79.9}
{"train":"TGV 456","from":"Paris","to":"Lyon","price":49.5}
{"train":"EC 789","from":"Zurich","to":"Milan","price":59}
To describe this stream of items, we can use the itemSchema keyword:
content:
  application/jsonl:
    itemSchema:
      type: object
      properties:
        train:
          type: string
        from:
          type: string
        to:
          type: string
        price:
          type: number
          format: float
Tooling now has two important switches it can use to figure out how to handle the response. The itemSchema makes it clear the response is a stream, and the application/jsonl content type lets tooling decide how to present that.
For streaming formats that just handle streams of JSON, the itemSchema is often sufficient to describe the structure of each item in the stream. For more complicated formats, additional encoding information may be needed.
itemEncoding
The itemEncoding keyword allows you to specify how each item in the stream should be encoded, using the same encoding object as the encoding keyword.

Using itemEncoding is only possible for multipart/* responses, so it is not very useful for an API that’s streaming JSON, unless you were streaming a mixture of JSON and assets/images in a single response.
content:
  multipart/mixed:
    itemSchema:
      $comment: A single data image from the device
    itemEncoding:
      contentType: image/jpeg
Let’s ignore itemEncoding for now and focus on the major use case of streams for APIs: streaming data and events.
Popular Streaming Formats
They all work a little differently, but they share the common goal of allowing data to be sent as a continuous stream rather than as a single, complete response.
JSON Lines & NDJSON
Working with JSON Lines or NDJSON is basically identical in OpenAPI, and feels very much like working with plain JSON responses, just with a different header and a bit of itemSchema usage.

If using JSONL use the content type application/jsonl, and if using NDJSON use the content type application/x-ndjson.
paths:
  /logs:
    get:
      summary: Stream of logs as JSON Lines
      responses:
        '200':
          description: |
            A stream of JSON-format log messages that can be read
            for as long as the application is running, and is available
            in any of the sequential JSON media types.
          content:
            application/jsonl:
              itemSchema:
                type: object
                properties:
                  timestamp:
                    type: string
                    format: date-time
                  level:
                    type: integer
                    minimum: 0
                  message:
                    type: string
              examples:
                JSONL:
                  summary: Log entries
                  description: JSONL examples are just a string where each line is a valid JSON object.
                  value: |
                    {"timestamp": "1985-04-12T23:20:50.52Z", "level": 1, "message": "Hi!"}
                    {"timestamp": "1985-04-12T23:20:51.37Z", "level": 1, "message": "Hows it hangin?"}
                    {"timestamp": "1985-04-12T23:20:53.29Z", "level": 1, "message": "Bye!"}
The example once again shows JSONL as a series of JSON objects with a newline character \n (0x0A) between them. This can only be described as a YAML multiline string, because JSONL/NDJSON cannot be described as plain JSON/YAML due to the newline characters.
Remember to use value: | to write multi-line strings in YAML, because the pipe will allow newlines to be passed through. Using value: > would remove the newlines and fold each JSON instance onto the same line.
The sample code for either of these formats could look a bit like this:
app.get("/tickets", async (_, res) => {
  res.setHeader("Content-Type", "application/jsonl; charset=utf-8");
  res.setHeader("Transfer-Encoding", "chunked");

  // Write each ticket as a single line of JSON followed by a newline
  for (const ticket of tickets) {
    res.write(JSON.stringify(ticket) + "\n");
  }
  res.end();
});
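To watch the chunks arrive one at a time instead of buffered into a single response, curl’s -N (--no-buffer) flag is handy; the localhost URL here assumes wherever this Express app happens to be running:

curl -N http://localhost:3000/tickets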
JSON Text Sequence
A third JSON streaming format is identical apart from one weird little complication. The other two formats just put a newline character \n (0x0A) at the end of each line, but RFC 7464: JSON Text Sequence also requires a control character at the start of each record: the ASCII Record Separator (0x1E). This is not a visible character in most contexts, but it will be in there like this:
0x1E{"timestamp": "1985-04-12T23:20:50.52Z", "level": 1, "message": "Hi!"}
0x1E{"timestamp": "1985-04-12T23:20:51.37Z", "level": 1, "message": "Hows it hangin?"}
0x1E{"timestamp": "1985-04-12T23:20:53.29Z", "level": 1, "message": "Bye!"}
The 0x1E (ASCII Record Separator) indicates the start of a new JSON object in the stream. Control characters are a bit magical and invisible to most text editors, so this can be a little confusing. Working with JSON Text Sequence tooling for both producing and reading the stream solves this problem, letting the tooling insert and read out the control characters without you needing to worry.
import { Generator } from "json-text-sequence";

// ... snip express setup ...

app.get("/tickets", async (_, res) => {
  res.setHeader("Content-Type", "application/json-seq");

  // The Generator inserts the 0x1E record separator before each item
  const g = new Generator();
  g.pipe(res);

  for (const ticket of tickets) {
    g.write(ticket);
  }
  g.end(); // ending the generator ends the piped response
});
The json-text-sequence package makes this easier, providing a simple way to generate and consume JSON Text Sequences.
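If you are curious what that tooling is doing under the hood, a minimal hand-rolled sketch of the consuming side just splits on the 0x1E record separator and parses each record (real parsers also handle truncated records, which this sketch ignores):

const RS = "\x1e"; // ASCII Record Separator (0x1E)

function parseJsonSeq(text) {
  return text
    .split(RS)
    .filter((record) => record.trim() !== "")
    .map((record) => JSON.parse(record));
}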
Server-Sent Events (SSE)
Streaming JSON as chunks of data is only one way that JSON gets streamed. What about sending events, with some JSON being passed along as attributes?
Server-Sent Events (SSE) can handle this; it is a standard for sending real-time updates from a server to a client over HTTP. In OpenAPI, you can define SSE streams using the text/event-stream content type and the itemSchema keyword to describe the structure of the events being sent.
requestBody:
  description: A request body to add a stream of typed data.
  required: true
  content:
    text/event-stream:
      itemSchema:
        type: object
        properties:
          event:
            type: string
          data:
            type: string
          retry:
            type: integer
        required: [event]
        # Define event types and specific schemas for the corresponding data
        oneOf:
          - properties:
              event:
                const: addString
          - properties:
              event:
                const: addInt64
              data:
                format: int64
          - properties:
              event:
                const: addJson
              data:
                contentMediaType: application/json
                contentSchema:
                  type: object
                  required: [foo]
                  properties:
                    foo:
                      type: integer
The oneOf is optional, but it's a handy use of polymorphism to describe different schemas for each event, which can really help with documentation and validation.
Valid events to come through this stream might look like:
event: addString
data: This data is formatted
data: across two lines
retry: 5

event: addInt64
data: 12345678
unknownField: this is ignored

event: addJson
data: {"foo": 42}
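Producing such a stream from Express looks much like the earlier JSONL example. Here is a rough sketch (the /events route and payload are illustrative, not part of any spec); each event is a block of field: value lines terminated by a blank line:

app.get("/events", (_, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");

  // One event: "field: value" lines, ended by a blank line
  res.write("event: addJson\n");
  res.write('data: {"foo": 42}\n\n');

  // Keep the connection open and write more events as they happen
});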
Sentinel Events
Some streaming systems do not always send all data or events in the exact same way. The items in a stream could be polymorphic objects, or there could be some special events that come through to say the stream is closed (also known as sentinel events).
Instead of trying to handle all of these edge cases with special new keywords, OpenAPI allows you to use the standard JSON Schema keywords to model these variations.
text/event-stream:
  itemSchema:
    oneOf:
      - <your normal data/event schema>
      - const: { data: "[DONE]" }
Whatever the schema is, it can be defined using the standard JSON Schema keywords like oneOf, anyOf, or allOf to handle variations in the event structure. This allows you to define a flexible schema that can accommodate different types of events in the stream.
Conclusion
We’re excited for OpenAPI v3.2 to launch (hopefully any time in the next few weeks?) and we’re working hard to get Bump.sh ready to support as much of the new functionality as possible. Let us know in the comments what features you’re looking forward to, and share any ideas you have for how we can improve our support for streaming JSON in APIs.