
JSONL (JSON Lines) Format Explained: What It Is and When to Use It

Apr 1, 2026 · 6 min read

JSONL — JSON Lines, also known as NDJSON (Newline-Delimited JSON) — is a text format where each line is a complete, valid JSON value (usually an object). It looks simple, but it solves real problems that a regular JSON array cannot. If you work with log files, streaming data pipelines, or machine learning datasets, you have almost certainly encountered JSONL without realizing it.

What JSONL Looks Like

example.jsonl
{"id": 1, "name": "Alice", "action": "login", "ts": "2026-04-01T09:00:00Z"}
{"id": 2, "name": "Bob", "action": "purchase", "amount": 49.99, "ts": "2026-04-01T09:01:22Z"}
{"id": 3, "name": "Alice", "action": "logout", "ts": "2026-04-01T09:45:00Z"}

Each line is a self-contained JSON object. Lines are separated by a newline character (\n). There are no commas between lines, no surrounding array brackets, and no schema enforcement — each line can have different keys.
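The independence of lines is easy to demonstrate: parsing the file as one JSON document fails, while parsing line by line works. A minimal Python sketch:

```python
import json

jsonl_text = (
    '{"id": 1, "name": "Alice"}\n'
    '{"id": 2, "name": "Bob"}\n'
)

# The file as a whole is not valid JSON: json.loads stops after the first value.
try:
    json.loads(jsonl_text)
except json.JSONDecodeError:
    pass  # raises "Extra data", as expected

# But each non-blank line parses on its own.
records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
```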

JSONL vs a JSON Array: Key Differences

Feature           | JSON Array                     | JSONL
Structure         | One big [...] array            | One JSON object per line
Streaming         | Must load entire file to parse | Process line-by-line, constant memory
Appending         | Must rewrite entire file       | Append a new line (O(1))
Schema            | All items must be valid JSON   | Each line is independent
File size         | Slightly smaller (no newlines) | Slightly larger
Tooling           | Universal JSON tools           | Needs JSONL-aware tools
Human readability | Good with formatting           | Good: one record per line
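The appending difference is worth seeing in code. With JSONL, adding a record is a single write in append mode; a JSON array would have to be parsed, modified, and rewritten in full. A small Python sketch (the file name is illustrative):

```python
import json

# Appending to JSONL is a single write at the end of the file.
# A JSON array would require reading and rewriting the whole file.
new_event = {"id": 4, "name": "Carol", "action": "login"}

with open("events.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(new_event) + "\n")
```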

When to Use JSONL

1. Log Files and Event Streams

JSONL is the de facto format for structured logs. Tools like Fluentd, Logstash, Vector, and AWS CloudWatch Logs Insights all produce or consume JSONL. Each log entry is appended as a new line — no need to open and rewrite a JSON array, which would be catastrophically slow at scale.

Structured log in JSONL
{"level":"info","msg":"Server started","port":3000,"ts":"2026-04-01T08:00:00Z"}
{"level":"warn","msg":"Slow query","query_ms":523,"table":"users","ts":"2026-04-01T08:00:05Z"}
{"level":"error","msg":"DB connection lost","ts":"2026-04-01T08:01:00Z"}
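If you want your own Python services to emit logs like this, a custom logging.Formatter that serializes each record as one JSON line is enough. A minimal sketch (the field names mirror the example above; this is not a production-grade formatter):

```python
import json
import logging
import time

class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""
    converter = time.gmtime  # UTC timestamps, to match the trailing "Z"

    def format(self, record):
        return json.dumps({
            "level": record.levelname.lower(),
            "msg": record.getMessage(),
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%SZ"),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonLineFormatter())
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Server started")
```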

2. Machine Learning and AI Datasets

The ML ecosystem has standardized on JSONL. OpenAI's fine-tuning API requires training data in JSONL format. Hugging Face datasets are often distributed as .jsonl files. The reason is the same: datasets can be millions of rows — you need to stream them without loading everything into memory.

OpenAI fine-tuning format (JSONL)
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is JSON?"}, {"role": "assistant", "content": "JSON is a lightweight data interchange format..."}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is JSONL?"}, {"role": "assistant", "content": "JSONL is JSON Lines — one JSON object per line..."}]}
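Before uploading a training file, it pays to validate it line by line. The sketch below checks for the shape shown above; the function name and the specific checks are illustrative, not OpenAI's official validator:

```python
import json

def validate_finetune_file(path):
    """Check that each line parses and has a plausible 'messages' shape.
    These checks are illustrative, not an official validator."""
    errors = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                errors.append(f"line {lineno}: invalid JSON ({e.msg})")
                continue
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                errors.append(f"line {lineno}: missing non-empty 'messages' list")
            elif any("role" not in m or "content" not in m for m in messages):
                errors.append(f"line {lineno}: message missing 'role' or 'content'")
    return errors
```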

3. Data Export and Import Pipelines

When exporting large tables from a database, JSONL lets you stream rows directly to disk without buffering the entire result set. Tools like mongodump, BigQuery exports, and Elasticsearch bulk APIs all use JSONL for exactly this reason.
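The pattern looks like this in Python with sqlite3. The table and column names are made up for the demo, and the in-memory database stands in for a real source; the key point is that rows go to disk one at a time:

```python
import json
import sqlite3

# Demo setup: a tiny SQLite database standing in for the real source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    (1, "Alice", "alice@example.com"),
    (2, "Bob", "bob@example.com"),
])
conn.row_factory = sqlite3.Row  # rows come back dict-like

# Stream each row straight to disk as one JSON line: constant memory,
# no buffering of the full result set.
with open("users_export.jsonl", "w", encoding="utf-8") as out:
    for row in conn.execute("SELECT id, name, email FROM users"):
        out.write(json.dumps(dict(row)) + "\n")
```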

Reading and Writing JSONL

Read JSONL in Python
import json

# Read line by line — constant memory regardless of file size
with open('data.jsonl', 'r') as f:
    for line in f:
        if line.strip():  # skip blank lines
            record = json.loads(line)
            process(record)

# Write JSONL
with open('output.jsonl', 'w') as f:
    for record in records:
        f.write(json.dumps(record) + '\n')
Read JSONL in Node.js
import { createReadStream } from 'fs';
import { createInterface } from 'readline';

const rl = createInterface({ input: createReadStream('data.jsonl') });

rl.on('line', (line) => {
  if (line.trim()) {
    const record = JSON.parse(line);
    handleRecord(record); // your per-record logic: `process` is Node's global object
  }
});
Query JSONL with jq
# Filter events where action is "purchase"
jq 'select(.action == "purchase")' events.jsonl

# Extract just names and amounts
jq '{name, amount}' events.jsonl

# Count records
jq -s 'length' events.jsonl

JSONL Gotchas

  • Blank lines break naive parsers: Skip empty lines when reading — if (line.trim()) before parsing.
  • Encoding issues: Always use UTF-8 without BOM. A BOM on line 1 will cause the first record to fail to parse.
  • Trailing newline: Most JSONL files end with a trailing newline — this is correct and expected. Don't treat the empty last line as a record.
  • Not valid JSON itself: A .jsonl file as a whole is not valid JSON (no surrounding brackets). Don't try to parse the whole file as JSON.
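A reader that sidesteps all four gotchas fits in a few lines. A Python sketch (the helper name read_jsonl is our own):

```python
import json

def read_jsonl(path):
    """Yield one record per non-blank line, tolerating a UTF-8 BOM
    and the conventional trailing newline."""
    # 'utf-8-sig' strips a leading BOM if present; it is a no-op otherwise.
    with open(path, encoding="utf-8-sig") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines, including the empty "last line"
                yield json.loads(line)
```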

Format and validate JSONL files

Use our JSONL Formatter to pretty-print, validate, and inspect JSONL files. It parses each line independently and flags any malformed records.
