JSONL Graph Storage Architecture Analysis
Evaluating whether JSONL is appropriate for gvFlow graph storage and when to consider alternatives
Output
JSONL Graph Storage: Is It Right?
Current Architecture
/konnectvol/konui/data/
├── gvflows.jsonl      # Flows (updateable)
├── gvedges.jsonl      # Graph edges (append-only)
├── gvdecisions.jsonl  # Decisions (append-only)
└── kontasks.jsonl     # Kontasks (append-only)

+ In-Memory Indexes (rebuilt on startup)
  ├── byId, byStatus, byType
  ├── kontaskToFlows, flowToKontasks
  └── turnToKontasks, sessionToKontasks
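The startup rebuild described above can be sketched roughly as follows. This is a minimal illustration, not the actual konui loader: the function names `load_jsonl` and `build_indexes` are hypothetical, and the "last line wins" rule for updateable records is an assumption about how updates are stored.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_jsonl(path):
    """Read one JSON object per non-empty line; a missing file yields no records."""
    p = Path(path)
    if not p.exists():
        return []
    with p.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def build_indexes(records):
    """Build byId / byStatus / byType lookups from a list of records.

    Assumption: when an id appears more than once, the last line wins.
    """
    latest = {}
    for rec in records:
        latest[rec["id"]] = rec

    by_status = defaultdict(set)
    by_type = defaultdict(set)
    for rec_id, rec in latest.items():
        by_status[rec["status"]].add(rec_id)
        by_type[rec["type"]].add(rec_id)
    return {"byId": latest, "byStatus": by_status, "byType": by_type}
```

After the rebuild, a lookup such as `indexes["byStatus"]["completed"]` is a dictionary access rather than a file scan.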
✅ Why JSONL Is Actually Good Here
- Simple ops model - git-friendly, easy backup
- Claude-readable - raw files AI can inspect
- Append-only - natural audit trail
- Fast startup - ~100ms to load 10K edges
- No dependencies - no DB server to maintain
- In-memory queries - O(1) lookups via indexes
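The append-only property is what makes the files double as an audit trail. A minimal sketch of an append, assuming a single writer process (the helper name `append_record` is hypothetical and not taken from the konui codebase):

```python
import json
import os


def append_record(path, record):
    """Append one record as a single JSON line and flush it to disk.

    json.dumps escapes embedded newlines, so each record stays on one line.
    """
    line = json.dumps(record, separators=(",", ":")) + "\n"
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line)
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to persist the write before returning
```

Existing lines are never rewritten, so earlier states remain in the file and in its git history.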
⚠️ When JSONL Breaks Down
- >100K edges - startup slows, memory grows
- Complex traversals - 3+ hop queries expensive
- Multi-writer - file locking becomes an issue
- Full-text search - requires scanning every record
- Real graph algorithms - PageRank, clustering
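To see why multi-hop traversals get expensive, here is a rough breadth-first sketch over an adjacency index built from edge records. The edge field names `from` and `to` are assumptions for illustration; the real gvedges schema is not shown in this analysis.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Index outgoing edges by source node (field names are hypothetical)."""
    adj = defaultdict(list)
    for edge in edges:
        adj[edge["from"]].append(edge["to"])
    return adj


def within_hops(adj, start, max_hops):
    """Return {node: hop_count} for every node reachable in at most max_hops hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen[nxt] = depth + 1
                queue.append(nxt)
    return seen
```

Each extra hop expands the whole frontier, so the work grows with the number of edges touched rather than staying at a single O(1) lookup.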
What "Real" Graph DBs Offer
| Feature | JSONL+Index | SQLite | Neo4j |
|---|---|---|---|
| 1-hop queries | ✓ | ✓ | ✓ |
| N-hop traversal | slow | recursive CTEs | ✓ |
| Graph algorithms | ✗ | manual | ✓ |
| Full-text search | ✗ | FTS5 | ✓ |
| Ops complexity | zero | low | high |
| Claude readable | ✓ | tools | ✗ |
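For the SQLite column, N-hop traversal would typically be written as a recursive common table expression. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `edges(src, dst)` table; the table and column names are assumptions, not an existing schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")],
)


def reachable(conn, start, max_hops):
    """Return (node, min_depth) rows for nodes within max_hops of start."""
    sql = """
    WITH RECURSIVE walk(node, depth) AS (
        SELECT ?, 0
        UNION
        SELECT e.dst, w.depth + 1
        FROM walk AS w
        JOIN edges AS e ON e.src = w.node
        WHERE w.depth < ?          -- the depth bound also guards against cycles
    )
    SELECT node, MIN(depth) FROM walk GROUP BY node ORDER BY node
    """
    return conn.execute(sql, (start, max_hops)).fetchall()
```

For example, `reachable(conn, "a", 2)` walks two hops from `a` in this toy graph. The query works, but it is more verbose than the equivalent path pattern in a dedicated graph query language, which is what the "recursive CTEs" cell in the table is hinting at.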
📋 Recommendation
Keep JSONL for now. Your scale doesn't justify migration:
| Metric | Value |
|---|---|
| Current flows | 3 |
| Current edges | 0 |
| Projected 1 year | ~500 flows, ~5K edges |
| JSONL limit | ~100K edges |
The pattern of JSONL + in-memory indexes is battle-tested. It resembles, in spirit, how Datomic and many event-sourced systems treat data: an immutable log of facts with indexes derived from it.
🔮 Migration Trigger Points
Consider migrating to SQLite when:
- Startup time exceeds 2 seconds
- Memory usage exceeds 200MB for indexes
- You need full-text search on flow titles/descriptions
- You need complex queries the curator can't express
SQLite with FTS5 would be a pragmatic middle ground: single file, no server, easy to back up (though its binary format diffs less cleanly in git than JSONL). Save Neo4j for when you're doing real graph analytics.
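The first two triggers are measurable, so they can be checked at startup. A minimal sketch using only the standard library follows; the helper names are hypothetical, "200MB" is interpreted here as 200 MiB, and `tracemalloc` adds overhead, so the measured time will read somewhat higher than an untraced run.

```python
import time
import tracemalloc

STARTUP_LIMIT_S = 2.0                          # "startup time exceeds 2 seconds"
INDEX_MEMORY_LIMIT_BYTES = 200 * 1024 * 1024   # "200MB", assumed to mean MiB


def measure_startup(build):
    """Run build() and return (result, elapsed_seconds, retained_bytes)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = build()
    elapsed = time.perf_counter() - t0
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, current


def migration_triggers(elapsed_s, memory_bytes):
    """List which of the measurable thresholds have been crossed."""
    hits = []
    if elapsed_s > STARTUP_LIMIT_S:
        hits.append("startup")
    if memory_bytes > INDEX_MEMORY_LIMIT_BYTES:
        hits.append("memory")
    return hits
```

Passing the index-building function as `build` and logging the result on each start would give a running record of how close the store is to either limit.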
Details
Type
General
Status
Completed
Scope
vibetools
Tags
gvflows, graph, architecture, database
Created
9 Jan 2026, 5:21 pm
Updated
9 Jan 2026, 5:21 pm
Timing
⏱️ 67.6s
0 tools
● TTFE 0%
● Thinking 100%
● Tools 0%
Created By
claude
Raw Data
{
"id": "346c8dc5-37b5-4bbd-8cd6-80c6fa372f1e",
"type": "general",
"status": "completed",
"title": "JSONL Graph Storage Architecture Analysis",
"description": "Evaluating whether JSONL is appropriate for gvFlow graph storage and when to consider alternatives",
"context": {
"output": "<div style=\"background:#1e1e1e;color:#e0e0e0;padding:20px;border-radius:8px;font-family:'Consolas','Monaco',monospace;\"><h2 style=\"font-family:sans-serif;color:#fff;margin:0 0 16px 0;\">JSONL Graph Storage: Is It Right?</h2><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:20px;\"><h3 style=\"color:#61afef;margin:0 0 12px 0;font-size:14px;\">Current Architecture</h3><pre style=\"background:#1e1e1e;padding:12px;border-radius:4px;margin:0;line-height:1.6;overflow-x:auto;\"><span style=\"color:#56b6c2;\">/konnectvol/konui/data/</span>\n├── <span style=\"color:#e5c07b;\">gvflows.jsonl</span> <span style=\"color:#7f848e;\"># Flows (updateable)</span>\n├── <span style=\"color:#e5c07b;\">gvedges.jsonl</span> <span style=\"color:#7f848e;\"># Graph edges (append-only)</span>\n├── <span style=\"color:#e5c07b;\">gvdecisions.jsonl</span> <span style=\"color:#7f848e;\"># Decisions (append-only)</span>\n└── <span style=\"color:#e5c07b;\">kontasks.jsonl</span> <span style=\"color:#7f848e;\"># Kontasks (append-only)</span>\n\n<span style=\"color:#c678dd;\">+ In-Memory Indexes</span> <span style=\"color:#7f848e;\">(rebuilt on startup)</span>\n ├── byId, byStatus, byType\n ├── kontaskToFlows, flowToKontasks\n └── turnToKontasks, sessionToKontasks</pre></div><div style=\"display:grid;grid-template-columns:1fr 1fr;gap:16px;margin-bottom:20px;\"><div style=\"background:#1a3a2a;border:1px solid #22c55e;padding:16px;border-radius:6px;\"><h3 style=\"color:#98c379;margin:0 0 12px 0;font-size:14px;\">✅ Why JSONL Is Actually Good Here</h3><ul style=\"margin:0;padding-left:18px;line-height:1.7;font-size:13px;\"><li><strong>Simple ops model</strong> - git-friendly, easy backup</li><li><strong>Claude-readable</strong> - raw files AI can inspect</li><li><strong>Append-only</strong> - natural audit trail</li><li><strong>Fast startup</strong> - ~100ms to load 10K edges</li><li><strong>No dependencies</strong> - no DB server to 
maintain</li><li><strong>In-memory queries</strong> - O(1) lookups via indexes</li></ul></div><div style=\"background:#3a2a1a;border:1px solid #f97316;padding:16px;border-radius:6px;\"><h3 style=\"color:#f97316;margin:0 0 12px 0;font-size:14px;\">⚠️ When JSONL Breaks Down</h3><ul style=\"margin:0;padding-left:18px;line-height:1.7;font-size:13px;\"><li><strong>>100K edges</strong> - startup slows, memory grows</li><li><strong>Complex traversals</strong> - 3+ hop queries expensive</li><li><strong>Multi-writer</strong> - file locking becomes issue</li><li><strong>Full-text search</strong> - requires scanning all</li><li><strong>Real graph algorithms</strong> - pagerank, clustering</li></ul></div></div><h3 style=\"font-family:sans-serif;color:#fff;margin:0 0 12px 0;\">What \"Real\" Graph DBs Offer</h3><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><table style=\"width:100%;border-collapse:collapse;font-size:13px;\"><tr style=\"border-bottom:1px solid #444;\"><th style=\"text-align:left;padding:8px 0;color:#7f848e;\">Feature</th><th style=\"text-align:center;padding:8px;color:#7f848e;\">JSONL+Index</th><th style=\"text-align:center;padding:8px;color:#7f848e;\">SQLite</th><th style=\"text-align:center;padding:8px;color:#7f848e;\">Neo4j</th></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">1-hop queries</td><td style=\"text-align:center;color:#98c379;\">✓</td><td style=\"text-align:center;color:#98c379;\">✓</td><td style=\"text-align:center;color:#98c379;\">✓</td></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">N-hop traversal</td><td style=\"text-align:center;color:#f97316;\">slow</td><td style=\"text-align:center;color:#e5c07b;\">CTEs</td><td style=\"text-align:center;color:#98c379;\">✓</td></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">Graph algorithms</td><td style=\"text-align:center;color:#f43f5e;\">✗</td><td 
style=\"text-align:center;color:#f97316;\">manual</td><td style=\"text-align:center;color:#98c379;\">✓</td></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">Full-text search</td><td style=\"text-align:center;color:#f43f5e;\">✗</td><td style=\"text-align:center;color:#98c379;\">FTS5</td><td style=\"text-align:center;color:#98c379;\">✓</td></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">Ops complexity</td><td style=\"text-align:center;color:#98c379;\">zero</td><td style=\"text-align:center;color:#98c379;\">low</td><td style=\"text-align:center;color:#f43f5e;\">high</td></tr><tr><td style=\"padding:8px 0;\">Claude readable</td><td style=\"text-align:center;color:#98c379;\">✓</td><td style=\"text-align:center;color:#f97316;\">tools</td><td style=\"text-align:center;color:#f43f5e;\">✗</td></tr></table></div><h3 style=\"font-family:sans-serif;color:#fff;margin:0 0 12px 0;\">📋 Recommendation</h3><div style=\"background:#1a2a3a;border:1px solid #3b82f6;padding:16px;border-radius:6px;\"><p style=\"margin:0 0 12px 0;color:#e0e0e0;\"><strong>Keep JSONL for now.</strong> Your scale doesn't justify migration:</p><table style=\"width:100%;font-size:13px;margin:12px 0;\"><tr><td style=\"padding:4px 0;\">Current flows:</td><td style=\"color:#61afef;\">3</td></tr><tr><td style=\"padding:4px 0;\">Current edges:</td><td style=\"color:#61afef;\">0</td></tr><tr><td style=\"padding:4px 0;\">Projected 1 year:</td><td style=\"color:#61afef;\">~500 flows, ~5K edges</td></tr><tr><td style=\"padding:4px 0;\">JSONL limit:</td><td style=\"color:#98c379;\">~100K edges</td></tr></table><p style=\"margin:0;color:#b0b0b0;font-size:12px;\">The pattern of <code>JSONL + in-memory indexes</code> is battle-tested. 
It's how Datomic and many event-sourced systems work at small scale.</p></div><h3 style=\"font-family:sans-serif;color:#fff;margin:16px 0 12px 0;\">🔮 Migration Trigger Points</h3><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;\"><p style=\"margin:0 0 8px 0;color:#7f848e;font-size:12px;\">Consider migrating to SQLite when:</p><ul style=\"margin:0;padding-left:18px;line-height:1.7;font-size:13px;\"><li>Startup time exceeds <strong>2 seconds</strong></li><li>Memory usage exceeds <strong>200MB</strong> for indexes</li><li>You need <strong>full-text search</strong> on flow titles/descriptions</li><li>You need <strong>complex queries</strong> the curator can't express</li></ul><p style=\"margin:12px 0 0 0;color:#7f848e;font-size:12px;\">SQLite with FTS5 would be a pragmatic middle ground - single file, no server, still git-friendly. Save Neo4j for when you're doing real graph analytics.</p></div></div>",
"requestedAt": "2026-01-09T07:22:00.000Z",
"requestId": "28fe047f-0a6d-4cd0-b49f-ab5fef2b1ad6",
"choices": [
{
"label": "Keep JSONL",
"value": "Agreed, keep JSONL for now and revisit when we hit scale limits",
"primary": true
},
{
"label": "Plan SQLite migration",
"value": "Create a design doc for migrating to SQLite with FTS5 for future reference"
},
{
"label": "Add monitoring",
"value": "Add startup time and memory usage metrics to track when migration is needed"
}
],
"turnTiming": {
"totalMs": 67564,
"ttfeMs": 158,
"thinkingMs": 67406,
"toolExecutionMs": 0,
"toolCallCount": 0,
"thinkingPct": 100,
"toolsPct": 0,
"ttfePct": 0
}
},
"createdBy": "claude",
"createdAt": "2026-01-09T07:21:40.310Z",
"updatedAt": "2026-01-09T07:21:46.386Z",
"requestId": "28fe047f-0a6d-4cd0-b49f-ab5fef2b1ad6",
"scope": "vibetools",
"tags": [
"gvflows",
"graph",
"architecture",
"database"
],
"targetUser": "claude"
}