Completed on 9 Jan 2026, 5:21 pm

Output

JSONL Graph Storage: Is It Right?

Current Architecture

/konnectvol/konui/data/
├── gvflows.jsonl     # Flows (updateable)
├── gvedges.jsonl     # Graph edges (append-only)
├── gvdecisions.jsonl # Decisions (append-only)
└── kontasks.jsonl    # Kontasks (append-only)

+ In-Memory Indexes (rebuilt on startup)
  ├── byId, byStatus, byType
  ├── kontaskToFlows, flowToKontasks
  └── turnToKontasks, sessionToKontasks
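
The layout above can be sketched as a loader that reads one JSON object per line and derives the lookup maps in memory. This is a minimal illustration, not the konui implementation: the function names are hypothetical, and the `id`, `status`, and `type` fields are assumed from the kontask record shown under Raw Data.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_jsonl(path: Path) -> list[dict]:
    """Read one JSON object per non-empty line; a missing file yields []."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def build_indexes(records: list[dict]) -> dict:
    """Derive byId / byStatus / byType lookups from the loaded records."""
    by_id: dict[str, dict] = {}
    for rec in records:
        by_id[rec["id"]] = rec  # assumption: a later line for the same id replaces it

    by_status: dict[str, set[str]] = defaultdict(set)
    by_type: dict[str, set[str]] = defaultdict(set)
    for rec in by_id.values():
        by_status[rec.get("status", "unknown")].add(rec["id"])
        by_type[rec.get("type", "unknown")].add(rec["id"])

    return {"byId": by_id, "byStatus": dict(by_status), "byType": dict(by_type)}
```

At startup this would be called once per file, for example `build_indexes(load_jsonl(Path("/konnectvol/konui/data/kontasks.jsonl")))`, and the result kept for the life of the process.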

✅ Why JSONL Is Actually Good Here

  • Simple ops model - git-friendly, easy backup
  • Claude-readable - raw files AI can inspect
  • Append-only - natural audit trail
  • Fast startup - ~100ms to load 10K edges
  • No dependencies - no DB server to maintain
  • In-memory queries - O(1) lookups via indexes
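
To illustrate the append-only point, here is a rough sketch of writing records as single JSON lines and folding the log back into current state. The source does not say how the "updateable" gvflows.jsonl is actually updated; the last-line-wins replay below is one common approach and is an assumption, as are the function names and the `id` field.

```python
import io
import json
from typing import Iterable, TextIO


def append_record(log: TextIO, record: dict) -> None:
    """Append one record as a single JSON line; earlier lines are never rewritten."""
    log.write(json.dumps(record, separators=(",", ":")) + "\n")


def replay(lines: Iterable[str]) -> dict[str, dict]:
    """Fold the log into current state: the last line seen for an id wins."""
    state: dict[str, dict] = {}
    for line in lines:
        if line.strip():
            rec = json.loads(line)
            state[rec["id"]] = rec
    return state


# io.StringIO stands in for a file opened with mode "a".
log = io.StringIO()
append_record(log, {"id": "flow-1", "status": "active"})
append_record(log, {"id": "flow-1", "status": "completed"})

history = log.getvalue().splitlines()  # both versions remain: the audit trail
current = replay(history)              # only the latest version per id
```

Every earlier version stays in the file, which is what gives the audit trail; the cost is that the file grows with every update and may eventually need compaction.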

⚠️ When JSONL Breaks Down

  • >100K edges - startup slows, memory grows
  • Complex traversals - queries of 3+ hops get expensive
  • Multi-writer - file locking becomes an issue
  • Full-text search - requires scanning every record
  • Real graph algorithms - PageRank, clustering
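
The traversal cost is easier to see in code. With only an in-memory adjacency index, an N-hop query is a hand-written breadth-first search whose work grows with the number of edges reachable within the hop limit. The sketch below assumes hypothetical `from` and `to` fields on each edge record; the real gvedges.jsonl schema is not shown in this report.

```python
from collections import defaultdict, deque


def build_adjacency(edges: list[dict]) -> dict[str, set[str]]:
    """Index outgoing edges by source node (field names are assumed)."""
    adj: dict[str, set[str]] = defaultdict(set)
    for edge in edges:
        adj[edge["from"]].add(edge["to"])
    return adj


def within_hops(adj: dict[str, set[str]], start: str, max_hops: int) -> dict[str, int]:
    """Return {node: hop distance} for every node reachable in <= max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[start]
    return dist
```

A 1-hop query is a single dictionary lookup; each extra hop multiplies the work by roughly the average fan-out, which is why deep traversals are where this model starts to hurt.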

What "Real" Graph DBs Offer

Feature            JSONL+Index   SQLite   Neo4j
1-hop queries      ✓             ✓        ✓
N-hop traversal    slow          CTEs     ✓
Graph algorithms   ✗             manual   ✓
Full-text search   ✗             FTS5     ✓
Ops complexity     zero          low      high
Claude readable    ✓             tools    ✗
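
For the "CTEs" entry in the SQLite column, a recursive common table expression can express a bounded N-hop traversal in one query. The following is a small sketch using Python's built-in `sqlite3` module; the `edges` table and its `src`/`dst` columns are hypothetical and not taken from any existing schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")],
)

# Walk outgoing edges from :start, stopping after :max_hops hops.
REACHABLE = """
WITH RECURSIVE reach(node, hops) AS (
    SELECT dst, 1 FROM edges WHERE src = :start
    UNION
    SELECT e.dst, r.hops + 1
    FROM edges AS e JOIN reach AS r ON e.src = r.node
    WHERE r.hops < :max_hops
)
SELECT node, MIN(hops) AS min_hops
FROM reach
GROUP BY node
ORDER BY min_hops, node
"""

rows = conn.execute(REACHABLE, {"start": "a", "max_hops": 3}).fetchall()
```

The hop limit in the recursive step also guarantees termination on cyclic graphs, which an unbounded recursive query would not.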

📋 Recommendation

Keep JSONL for now. Your scale doesn't justify migration:

Current flows: 3
Current edges: 0
Projected 1 year: ~500 flows, ~5K edges
JSONL limit: ~100K edges

The pattern of JSONL + in-memory indexes is battle-tested at small scale. It follows the same basic idea as Datomic and many event-sourced systems: an append-only log of facts, with query indexes derived from it.

🔮 Migration Trigger Points

Consider migrating to SQLite when:

  • Startup time exceeds 2 seconds
  • Memory usage exceeds 200MB for indexes
  • You need full-text search on flow titles/descriptions
  • You need complex queries the curator can't express

SQLite with FTS5 would be a pragmatic middle ground - single file, no server, easy to back up, though a binary database file does not diff in git the way JSONL does. Save Neo4j for when you're doing real graph analytics.
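
The first two triggers can be checked mechanically at startup. The sketch below times an index build and measures the memory it retains using the standard-library `tracemalloc` module; the 2-second and 200MB limits come from the list above, while the function name, the report keys, and the reading of 200MB as 200 * 1024 * 1024 bytes are assumptions for illustration.

```python
import time
import tracemalloc
from typing import Callable

STARTUP_LIMIT_S = 2.0                          # "Startup time exceeds 2 seconds"
INDEX_MEMORY_LIMIT_BYTES = 200 * 1024 * 1024   # "Memory usage exceeds 200MB"


def measure_startup(build: Callable[[], object]) -> dict:
    """Run an index build and report elapsed time and retained memory."""
    tracemalloc.start()
    started = time.perf_counter()
    indexes = build()  # keep a reference so the memory is still live when measured
    elapsed = time.perf_counter() - started
    retained, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del indexes
    return {
        "startup_s": elapsed,
        "index_bytes": retained,
        "should_migrate": elapsed > STARTUP_LIMIT_S
        or retained > INDEX_MEMORY_LIMIT_BYTES,
    }


report = measure_startup(lambda: {i: str(i) for i in range(1000)})
```

Note that `tracemalloc` itself slows allocation noticeably, so in practice the timing and memory measurements are better taken in separate runs.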

Details

Type General
Status Completed
Scope vibetools
Tags gvflows, graph, architecture, database
Created 9 Jan 2026, 5:21 pm
Updated 9 Jan 2026, 5:21 pm
Timing ⏱️ 67.6s, 0 tools (TTFE 0%, Thinking 100%, Tools 0%)
Created By claude

Raw Data

{
  "id": "346c8dc5-37b5-4bbd-8cd6-80c6fa372f1e",
  "type": "general",
  "status": "completed",
  "title": "JSONL Graph Storage Architecture Analysis",
  "description": "Evaluating whether JSONL is appropriate for gvFlow graph storage and when to consider alternatives",
  "context": {
    "output": "<div style=\"background:#1e1e1e;color:#e0e0e0;padding:20px;border-radius:8px;font-family:'Consolas','Monaco',monospace;\"><h2 style=\"font-family:sans-serif;color:#fff;margin:0 0 16px 0;\">JSONL Graph Storage: Is It Right?</h2><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:20px;\"><h3 style=\"color:#61afef;margin:0 0 12px 0;font-size:14px;\">Current Architecture</h3><pre style=\"background:#1e1e1e;padding:12px;border-radius:4px;margin:0;line-height:1.6;overflow-x:auto;\"><span style=\"color:#56b6c2;\">/konnectvol/konui/data/</span>\n├── <span style=\"color:#e5c07b;\">gvflows.jsonl</span>     <span style=\"color:#7f848e;\"># Flows (updateable)</span>\n├── <span style=\"color:#e5c07b;\">gvedges.jsonl</span>     <span style=\"color:#7f848e;\"># Graph edges (append-only)</span>\n├── <span style=\"color:#e5c07b;\">gvdecisions.jsonl</span> <span style=\"color:#7f848e;\"># Decisions (append-only)</span>\n└── <span style=\"color:#e5c07b;\">kontasks.jsonl</span>    <span style=\"color:#7f848e;\"># Kontasks (append-only)</span>\n\n<span style=\"color:#c678dd;\">+ In-Memory Indexes</span> <span style=\"color:#7f848e;\">(rebuilt on startup)</span>\n  ├── byId, byStatus, byType\n  ├── kontaskToFlows, flowToKontasks\n  └── turnToKontasks, sessionToKontasks</pre></div><div style=\"display:grid;grid-template-columns:1fr 1fr;gap:16px;margin-bottom:20px;\"><div style=\"background:#1a3a2a;border:1px solid #22c55e;padding:16px;border-radius:6px;\"><h3 style=\"color:#98c379;margin:0 0 12px 0;font-size:14px;\">✅ Why JSONL Is Actually Good Here</h3><ul style=\"margin:0;padding-left:18px;line-height:1.7;font-size:13px;\"><li><strong>Simple ops model</strong> - git-friendly, easy backup</li><li><strong>Claude-readable</strong> - raw files AI can inspect</li><li><strong>Append-only</strong> - natural audit trail</li><li><strong>Fast startup</strong> - ~100ms to load 10K edges</li><li><strong>No dependencies</strong> - no DB server to 
maintain</li><li><strong>In-memory queries</strong> - O(1) lookups via indexes</li></ul></div><div style=\"background:#3a2a1a;border:1px solid #f97316;padding:16px;border-radius:6px;\"><h3 style=\"color:#f97316;margin:0 0 12px 0;font-size:14px;\">⚠️ When JSONL Breaks Down</h3><ul style=\"margin:0;padding-left:18px;line-height:1.7;font-size:13px;\"><li><strong>&gt;100K edges</strong> - startup slows, memory grows</li><li><strong>Complex traversals</strong> - 3+ hop queries expensive</li><li><strong>Multi-writer</strong> - file locking becomes issue</li><li><strong>Full-text search</strong> - requires scanning all</li><li><strong>Real graph algorithms</strong> - pagerank, clustering</li></ul></div></div><h3 style=\"font-family:sans-serif;color:#fff;margin:0 0 12px 0;\">What \"Real\" Graph DBs Offer</h3><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;margin-bottom:16px;\"><table style=\"width:100%;border-collapse:collapse;font-size:13px;\"><tr style=\"border-bottom:1px solid #444;\"><th style=\"text-align:left;padding:8px 0;color:#7f848e;\">Feature</th><th style=\"text-align:center;padding:8px;color:#7f848e;\">JSONL+Index</th><th style=\"text-align:center;padding:8px;color:#7f848e;\">SQLite</th><th style=\"text-align:center;padding:8px;color:#7f848e;\">Neo4j</th></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">1-hop queries</td><td style=\"text-align:center;color:#98c379;\">✓</td><td style=\"text-align:center;color:#98c379;\">✓</td><td style=\"text-align:center;color:#98c379;\">✓</td></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">N-hop traversal</td><td style=\"text-align:center;color:#f97316;\">slow</td><td style=\"text-align:center;color:#e5c07b;\">CTEs</td><td style=\"text-align:center;color:#98c379;\">✓</td></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">Graph algorithms</td><td style=\"text-align:center;color:#f43f5e;\">✗</td><td 
style=\"text-align:center;color:#f97316;\">manual</td><td style=\"text-align:center;color:#98c379;\">✓</td></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">Full-text search</td><td style=\"text-align:center;color:#f43f5e;\">✗</td><td style=\"text-align:center;color:#98c379;\">FTS5</td><td style=\"text-align:center;color:#98c379;\">✓</td></tr><tr style=\"border-bottom:1px solid #444;\"><td style=\"padding:8px 0;\">Ops complexity</td><td style=\"text-align:center;color:#98c379;\">zero</td><td style=\"text-align:center;color:#98c379;\">low</td><td style=\"text-align:center;color:#f43f5e;\">high</td></tr><tr><td style=\"padding:8px 0;\">Claude readable</td><td style=\"text-align:center;color:#98c379;\">✓</td><td style=\"text-align:center;color:#f97316;\">tools</td><td style=\"text-align:center;color:#f43f5e;\">✗</td></tr></table></div><h3 style=\"font-family:sans-serif;color:#fff;margin:0 0 12px 0;\">📋 Recommendation</h3><div style=\"background:#1a2a3a;border:1px solid #3b82f6;padding:16px;border-radius:6px;\"><p style=\"margin:0 0 12px 0;color:#e0e0e0;\"><strong>Keep JSONL for now.</strong> Your scale doesn't justify migration:</p><table style=\"width:100%;font-size:13px;margin:12px 0;\"><tr><td style=\"padding:4px 0;\">Current flows:</td><td style=\"color:#61afef;\">3</td></tr><tr><td style=\"padding:4px 0;\">Current edges:</td><td style=\"color:#61afef;\">0</td></tr><tr><td style=\"padding:4px 0;\">Projected 1 year:</td><td style=\"color:#61afef;\">~500 flows, ~5K edges</td></tr><tr><td style=\"padding:4px 0;\">JSONL limit:</td><td style=\"color:#98c379;\">~100K edges</td></tr></table><p style=\"margin:0;color:#b0b0b0;font-size:12px;\">The pattern of <code>JSONL + in-memory indexes</code> is battle-tested. 
It's how Datomic and many event-sourced systems work at small scale.</p></div><h3 style=\"font-family:sans-serif;color:#fff;margin:16px 0 12px 0;\">🔮 Migration Trigger Points</h3><div style=\"background:#2d2d2d;padding:16px;border-radius:6px;\"><p style=\"margin:0 0 8px 0;color:#7f848e;font-size:12px;\">Consider migrating to SQLite when:</p><ul style=\"margin:0;padding-left:18px;line-height:1.7;font-size:13px;\"><li>Startup time exceeds <strong>2 seconds</strong></li><li>Memory usage exceeds <strong>200MB</strong> for indexes</li><li>You need <strong>full-text search</strong> on flow titles/descriptions</li><li>You need <strong>complex queries</strong> the curator can't express</li></ul><p style=\"margin:12px 0 0 0;color:#7f848e;font-size:12px;\">SQLite with FTS5 would be a pragmatic middle ground - single file, no server, still git-friendly. Save Neo4j for when you're doing real graph analytics.</p></div></div>",
    "requestedAt": "2026-01-09T07:22:00.000Z",
    "requestId": "28fe047f-0a6d-4cd0-b49f-ab5fef2b1ad6",
    "choices": [
      {
        "label": "Keep JSONL",
        "value": "Agreed, keep JSONL for now and revisit when we hit scale limits",
        "primary": true
      },
      {
        "label": "Plan SQLite migration",
        "value": "Create a design doc for migrating to SQLite with FTS5 for future reference"
      },
      {
        "label": "Add monitoring",
        "value": "Add startup time and memory usage metrics to track when migration is needed"
      }
    ],
    "turnTiming": {
      "totalMs": 67564,
      "ttfeMs": 158,
      "thinkingMs": 67406,
      "toolExecutionMs": 0,
      "toolCallCount": 0,
      "thinkingPct": 100,
      "toolsPct": 0,
      "ttfePct": 0
    }
  },
  "createdBy": "claude",
  "createdAt": "2026-01-09T07:21:40.310Z",
  "updatedAt": "2026-01-09T07:21:46.386Z",
  "requestId": "28fe047f-0a6d-4cd0-b49f-ab5fef2b1ad6",
  "scope": "vibetools",
  "tags": [
    "gvflows",
    "graph",
    "architecture",
    "database"
  ],
  "targetUser": "claude"
}