Completed
on 9 Jan 2026, 6:02 pm

Output

Kongate Crash Investigation

Root Cause: EADDRINUSE (Port 8010 Already in Use)

The crashes were NOT caused by bad XML. The service was crash-looping because a stale process from an earlier run (the "zombie") was still holding port 8010, so every new instance failed to bind.

Timeline (07:51-07:52 UTC)

07:51:53 EADDRINUSE  Port 8010 in use by old process
07:51:58 EADDRINUSE  Systemd tries restart #1
07:52:04 EADDRINUSE  Systemd tries restart #2
07:52:10 EADDRINUSE  Systemd tries restart #3
07:52:16 EADDRINUSE  Systemd tries restart #4
07:52:22 EADDRINUSE  Systemd tries restart #5
07:52:28 EADDRINUSE  Systemd tries restart #6
07:52:33 EADDRINUSE  Systemd tries restart #7
07:52:57 SUCCESS    Zombie killed, port free, service starts!
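
The restart cadence can be checked directly from the timestamps above. This is a minimal TypeScript sketch with the entries copied from the timeline; the helper names toSeconds and gaps are illustrative and not part of kongate.

```typescript
// Timeline entries copied from the incident report (UTC, same day).
const timeline: string[] = [
  "07:51:53 EADDRINUSE",
  "07:51:58 EADDRINUSE",
  "07:52:04 EADDRINUSE",
  "07:52:10 EADDRINUSE",
  "07:52:16 EADDRINUSE",
  "07:52:22 EADDRINUSE",
  "07:52:28 EADDRINUSE",
  "07:52:33 EADDRINUSE",
  "07:52:57 SUCCESS",
];

// Convert "HH:MM:SS" to seconds since midnight.
function toSeconds(hms: string): number {
  const [h, m, s] = hms.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Seconds elapsed between each entry and the one before it.
function gaps(entries: string[]): number[] {
  const times = entries.map((entry) => toSeconds(entry.split(" ")[0]));
  return times.slice(1).map((t, i) => t - times[i]);
}

console.log(gaps(timeline));
```

For these entries the gaps are 5, 6, 6, 6, 6, 6, 5 and 24 seconds: the failed starts were 5-6 seconds apart, and the whole incident lasted 64 seconds from the first failure to recovery.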

What Happened

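  1. A previous kongate process hung: it kept port 8010 bound and did not exit on SIGTERM. A truly defunct process holds no sockets, so "zombie" here means a live but unresponsive process
  2. Systemd tried to restart the service every 5-6 seconds (seven attempts in total)
  3. Each new process exited immediately with EADDRINUSE because it could not bind the port
  4. At about 07:52:57 UTC the stale process was killed (likely manually), the port was released, and the service started successfully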

XML Error Handling is Solid

Bad XML should not be able to crash kongate. The parser call is wrapped in a try/catch that returns an error result instead of letting the exception propagate:

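// xml-parser.ts:193-201
try {
  parse(xmlContent);  // @libs/xml parser
} catch (error) {
  // Assumed: errorMessage is derived from the caught error (this line is not shown in the original excerpt)
  const errorMessage = error instanceof Error ? error.message : String(error);
  return {
    success: false,
    error: `XML parsing failed: ${errorMessage}`
  };  // Returns an error result instead of crashing
}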
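
To show the pattern end to end, here is a self-contained sketch of the same try/catch wrapper. It is not the kongate source: safeParse, ParseResult and the stand-in strictParse are hypothetical names, and the real @libs/xml parser is replaced by a toy parser so the example runs on its own.

```typescript
// Result shape modelled on the excerpt above (success flag plus error string).
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Wrap any synchronous parser so that a thrown error becomes an error result.
function safeParse<T>(xmlContent: string, parse: (xml: string) => T): ParseResult<T> {
  try {
    return { success: true, data: parse(xmlContent) };
  } catch (error) {
    const errorMessage = error instanceof Error ? error.message : String(error);
    return { success: false, error: `XML parsing failed: ${errorMessage}` };
  }
}

// Toy stand-in for a real XML parser: accepts only <tag>...</tag> and throws otherwise.
function strictParse(xml: string): { root: string } {
  const match = /^<(\w+)>[\s\S]*<\/\1>$/.exec(xml.trim());
  if (!match) throw new Error("malformed document");
  return { root: match[1] };
}

console.log(safeParse("<a>hi</a>", strictParse)); // success result with root "a"
console.log(safeParse("<a>hi", strictParse));     // error result, no exception escapes
```

The wrapper only covers errors thrown synchronously by the parser. Failures elsewhere, such as in asynchronous callbacks or the code that reads the file, would need their own handling.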

Current Status

● kongate.service - Kongate FTP Gateway Service
  Loaded: loaded (/etc/systemd/system/kongate.service; enabled)
  Active: active (running) since Fri 2026-01-09 17:52:57 AEST
  Memory: 60.9M
  Uptime: ~8 minutes at time of investigation

Prevention Options

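  1. Set KillMode=mixed in the systemd unit - SIGTERM goes to the main process; once the stop timeout expires, SIGKILL goes to every process left in the unit's control group
  2. Set TimeoutStopSec=10 - Cap the graceful-shutdown wait at 10 seconds before systemd escalates to SIGKILL (the stock default is 90 seconds)
  3. Add a pre-start port check - Kill any process still holding port 8010 before kongate starts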

The previous kontask "Kongate Crash Loop Fixed - Zombie Process Killed" already resolved this incident. No code changes needed for XML handling.

Details

Type General
Status Completed
Scope product
Tags kongate, pipeline, debugging, crash
Created 9 Jan 2026, 6:02 pm
Updated 9 Jan 2026, 6:02 pm
Timing 87.6s, 0 tools (TTFE 0%, Thinking 100%, Tools 0%)
Created By claude

Raw Data

{
  "id": "88fc7959-d17a-4fa0-a70f-53f8204b082e",
  "type": "general",
  "status": "completed",
  "title": "Kongate Crash Investigation - EADDRINUSE Zombie Loop",
  "description": "Root cause analysis of kongate service crash loop on Jan 9th 2026",
  "context": {
    "requestedAt": "2026-01-09T08:01:00.000Z",
    "requestId": "ef33f45a-9d74-4681-8425-dfe53c8934e5",
    "choices": [
      {
        "label": "Harden systemd",
        "value": "Improve the systemd service file with KillMode=mixed and TimeoutStopSec=10 to prevent zombie processes from blocking restarts",
        "primary": true
      },
      {
        "label": "Add port check",
        "value": "Add a pre-start script that kills any process holding port 8010 before starting kongate"
      },
      {
        "label": "View XML tests",
        "value": "Show me the XML parsing test coverage to verify bad XML really can't crash the service"
      }
    ],
    "turnTiming": {
      "totalMs": 87618,
      "ttfeMs": 67,
      "thinkingMs": 87549,
      "toolExecutionMs": 0,
      "toolCallCount": 0,
      "thinkingPct": 100,
      "toolsPct": 0,
      "ttfePct": 0
    }
  },
  "createdBy": "claude",
  "createdAt": "2026-01-09T08:02:48.110Z",
  "updatedAt": "2026-01-09T08:02:52.697Z",
  "requestId": "ef33f45a-9d74-4681-8425-dfe53c8934e5",
  "scope": "product",
  "tags": [
    "kongate",
    "pipeline",
    "debugging",
    "crash"
  ],
  "targetUser": "claude"
}