JSONL (JSON Lines) Format Explained: What It Is and When to Use It
JSONL — JSON Lines, also known as NDJSON (Newline-Delimited JSON) — is a text format where each line is a complete, valid JSON value (usually an object). It looks simple, but it solves real problems that a regular JSON array cannot. If you work with log files, streaming data pipelines, or machine learning datasets, you have almost certainly encountered JSONL without realizing it.
What JSONL Looks Like
{"id": 1, "name": "Alice", "action": "login", "ts": "2026-04-01T09:00:00Z"}
{"id": 2, "name": "Bob", "action": "purchase", "amount": 49.99, "ts": "2026-04-01T09:01:22Z"}
{"id": 3, "name": "Alice", "action": "logout", "ts": "2026-04-01T09:45:00Z"}Each line is a self-contained JSON object. Lines are separated by a newline character (\n). There are no commas between lines, no surrounding array brackets, and no schema enforcement — each line can have different keys.
JSONL vs a JSON Array: Key Differences
| Feature | JSON Array | JSONL |
|---|---|---|
| Structure | One big [...] array | One JSON object per line |
| Streaming | Must load entire file to parse | Process line-by-line, constant memory |
| Appending | Must rewrite entire file | Append a new line — O(1) |
| Schema | Items usually share one structure | Each line is independent |
| File size | Nearly identical (brackets and commas) | Nearly identical (newlines) |
| Tooling | Universal JSON tools | Needs JSONL-aware tools |
| Human readability | Good with formatting | Good — one record per line |
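The appending advantage in the table is worth seeing concretely. A minimal sketch (the file path and `append_event` helper are hypothetical):

```python
import json
import os
import tempfile

# Appending a record to JSONL is a single O(1) write in append mode --
# no need to read back or rewrite the existing file, unlike a JSON array.
path = os.path.join(tempfile.mkdtemp(), "events.jsonl")

def append_event(path, event):
    with open(path, "a") as f:   # "a" = append mode
        f.write(json.dumps(event) + "\n")

append_event(path, {"id": 1, "action": "login"})
append_event(path, {"id": 2, "action": "purchase"})
```

With a JSON array, the same operation would require parsing the whole file, inserting the item, and serializing everything back out.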
When to Use JSONL
1. Log Files and Event Streams
JSONL is the de facto format for structured logs. Tools like Fluentd, Logstash, Vector, and AWS CloudWatch Logs Insights all produce or consume JSONL. Each log entry is appended as a new line — no need to open and rewrite a JSON array, which would be catastrophically slow at scale.
{"level":"info","msg":"Server started","port":3000,"ts":"2026-04-01T08:00:00Z"}
{"level":"warn","msg":"Slow query","query_ms":523,"table":"users","ts":"2026-04-01T08:00:05Z"}
{"level":"error","msg":"DB connection lost","ts":"2026-04-01T08:01:00Z"}2. Machine Learning and AI Datasets
The ML ecosystem has standardized on JSONL. OpenAI's fine-tuning API requires training data in JSONL format. Hugging Face datasets are often distributed as .jsonl files. The reason is the same: datasets can be millions of rows — you need to stream them without loading everything into memory.
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is JSON?"}, {"role": "assistant", "content": "JSON is a lightweight data interchange format..."}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is JSONL?"}, {"role": "assistant", "content": "JSONL is JSON Lines — one JSON object per line..."}]}3. Data Export and Import Pipelines
When exporting large tables from a database, JSONL lets you stream rows directly to disk without buffering the entire result set. Tools like mongodump, BigQuery exports, and Elasticsearch bulk APIs all use JSONL for exactly this reason.
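The streaming-export pattern can be sketched with Python's built-in sqlite3 module standing in for a real database (the table and column names are made up for illustration):

```python
import json
import sqlite3

# Stream rows from a database cursor to JSONL one at a time,
# never buffering the whole result set. An in-memory SQLite
# table stands in for a real database here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob"), (3, "Carol")])

out_lines = []
for row in conn.execute("SELECT id, name FROM users"):  # row-by-row iteration
    record = {"id": row[0], "name": row[1]}
    out_lines.append(json.dumps(record))

jsonl_export = "\n".join(out_lines) + "\n"
```

In a real pipeline you would write each line to a file or socket as it is produced; the list here just keeps the sketch self-contained.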
Reading and Writing JSONL
# Read line by line — constant memory regardless of file size
import json

with open('data.jsonl', 'r') as f:
    for line in f:
        if line.strip():  # skip blank lines
            record = json.loads(line)
            process(record)

# Write JSONL
with open('output.jsonl', 'w') as f:
    for record in records:
        f.write(json.dumps(record) + '\n')

import { createReadStream } from 'fs';
import { createInterface } from 'readline';
const rl = createInterface({ input: createReadStream('data.jsonl') });
rl.on('line', (line) => {
if (line.trim()) {
const record = JSON.parse(line);
process(record);
}
});

# Filter events where action is "purchase"
cat events.jsonl | jq 'select(.action == "purchase")'
# Extract just names and amounts
cat events.jsonl | jq '{name, amount}'
# Count records
cat events.jsonl | jq -s 'length'

JSONL Gotchas
- Blank lines break naive parsers: skip empty lines when reading — check line.trim() (or line.strip() in Python) before parsing.
- Encoding issues: always use UTF-8 without a BOM. A BOM at the start of the file will cause the first record to fail to parse.
- Trailing newline: most JSONL files end with a trailing newline — this is correct and expected. Don't treat the empty last line as a record.
- Not valid JSON itself: a .jsonl file as a whole is not valid JSON (no surrounding brackets). Parse it line by line, never as a single document.
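The gotchas above can be handled in one tolerant reader. A minimal sketch (the `read_jsonl` helper is hypothetical): it strips a leading BOM, skips blank lines, and collects malformed lines with their line numbers instead of crashing.

```python
import json

def read_jsonl(text):
    """Parse JSONL text leniently, returning (records, errors)."""
    records, errors = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.lstrip("\ufeff").strip()  # drop BOM and whitespace
        if not line:
            continue  # skip blank lines (including a trailing newline's)
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as e:
            errors.append((lineno, str(e)))  # report, don't crash
    return records, errors

sample = '\ufeff{"id": 1}\n\n{"id": 2}\nnot json\n'
records, errors = read_jsonl(sample)
```

Reporting line numbers alongside errors makes malformed records easy to locate in large files.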
Format and validate JSONL files
Use our JSONL Formatter to pretty-print, validate, and inspect JSONL files. It parses each line independently and flags any malformed records.