Kongate Crash Investigation - EADDRINUSE Zombie Loop
Root cause analysis of kongate service crash loop on Jan 9th 2026
Output
Kongate Crash Investigation
Root Cause: EADDRINUSE (Port 8010 Already in Use)
The crashes were NOT caused by bad XML. The service was crash-looping because a zombie process was holding port 8010.
Timeline (07:51-07:52 UTC)
07:51:53  EADDRINUSE  Port 8010 in use by old process
07:51:58  EADDRINUSE  Systemd tries restart #1
07:52:04  EADDRINUSE  Systemd tries restart #2
07:52:10  EADDRINUSE  Systemd tries restart #3
07:52:16  EADDRINUSE  Systemd tries restart #4
07:52:22  EADDRINUSE  Systemd tries restart #5
07:52:28  EADDRINUSE  Systemd tries restart #6
07:52:33  EADDRINUSE  Systemd tries restart #7
07:52:57  SUCCESS     Zombie killed, port free, service starts!
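The restart cadence can be read straight off the timestamps in the timeline. A minimal TypeScript sketch of that arithmetic follows; the names timelineStamps, toSeconds and gaps are illustrative only and are not part of kongate.

```typescript
// Timestamps copied from the timeline (UTC, all on the same day).
const timelineStamps: string[] = [
  "07:51:53", "07:51:58", "07:52:04", "07:52:10", "07:52:16",
  "07:52:22", "07:52:28", "07:52:33", "07:52:57",
];

// Convert "HH:MM:SS" to seconds since midnight.
function toSeconds(hms: string): number {
  const [h, m, s] = hms.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Gap in seconds between each pair of consecutive entries.
function gaps(stamps: string[]): number[] {
  return stamps.slice(1).map((t, i) => toSeconds(t) - toSeconds(stamps[i]));
}

console.log(gaps(timelineStamps).join(", ")); // 5, 6, 6, 6, 6, 6, 5, 24
```

The failed attempts are 5 to 6 seconds apart; the final 24 second gap precedes the successful start at 07:52:57.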
What Happened
- A previous kongate process became a "zombie" in the loose sense: a hung process that still held port 8010 but did not respond to SIGTERM (a true defunct zombie holds no sockets)
- Systemd tried to restart the service roughly every 5 to 6 seconds
- Each new process immediately crashed with EADDRINUSE
- Around 07:52:57 UTC, the zombie was killed (likely manually) and the service recovered
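The failure mode in the steps above can be reproduced in isolation. The sketch below assumes a Node.js-compatible runtime and its built-in net module; the report does not show how kongate opens its listener, and tryListen and demo are hypothetical names. It binds an OS-assigned port (not 8010), then tries to bind the same port again while the first listener is still alive.

```typescript
import * as net from "node:net";

type ListenResult =
  | { ok: true; server: net.Server }
  | { ok: false; code: string };

// Try to listen on host:port; report the error code instead of throwing.
function tryListen(port: number, host = "127.0.0.1"): Promise<ListenResult> {
  return new Promise((resolve) => {
    const server = net.createServer();
    server.once("error", (err: NodeJS.ErrnoException) =>
      resolve({ ok: false, code: err.code ?? "UNKNOWN" }),
    );
    server.listen(port, host, () => resolve({ ok: true, server }));
  });
}

async function demo(): Promise<string> {
  // Port 0 asks the OS for any free port, so the demo never touches 8010.
  const first = await tryListen(0);
  if (!first.ok) throw new Error(`could not get a free port: ${first.code}`);
  const { port } = first.server.address() as net.AddressInfo;

  // The old listener is still alive, so a second bind on the same port fails,
  // which is the situation each restarted kongate process ran into.
  const second = await tryListen(port);
  if (second.ok) second.server.close();
  first.server.close();
  return second.ok ? "OK" : second.code;
}

demo().then((code) => console.log(code)); // expected: EADDRINUSE
```

Restarting cannot help in this state: the new process fails at bind time for as long as the old one keeps the socket open.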
XML Error Handling is Solid
Bad XML cannot crash kongate. The code has proper error handling:
// xml-parser.ts:193-201
try {
  parse(xmlContent); // @libs/xml parser
} catch (error) {
  return {
    success: false,
    error: `XML parsing failed: ${errorMessage}`
  }; // Gracefully returns error, doesn't crash
}
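The excerpt above does not show where errorMessage is defined or what the success path returns. A self-contained sketch of the same pattern is given below under stated assumptions: parse is a toy stand-in for the @libs/xml parser (it only counts angle brackets), and safeParse, ParseResult and the way errorMessage is derived are hypothetical, not taken from xml-parser.ts.

```typescript
// Result shape modelled on the { success, error } object in the excerpt.
type ParseResult = { success: true } | { success: false; error: string };

// Toy stand-in for the real parser: throws on obviously malformed input.
function parse(xml: string): void {
  const opens = xml.split("<").length - 1;
  const closes = xml.split(">").length - 1;
  if (opens === 0 || opens !== closes) {
    throw new Error("unbalanced angle brackets");
  }
}

function safeParse(xmlContent: string): ParseResult {
  try {
    parse(xmlContent);
    return { success: true };
  } catch (error) {
    // Non-Error throws (strings, objects) are stringified so the
    // message is never undefined.
    const errorMessage = error instanceof Error ? error.message : String(error);
    return { success: false, error: `XML parsing failed: ${errorMessage}` };
  }
}

console.log(safeParse("<order><id>1</id></order>")); // success result
console.log(safeParse("<order><id>1</id"));          // failure result, no throw
```

Because the exception is converted into a return value, a malformed document surfaces as a failed result to the caller rather than as an uncaught exception that would take the process down.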
Current Status
● kongate.service - Kongate FTP Gateway Service
  Loaded: loaded (/etc/systemd/system/kongate.service; enabled)
  Active: active (running) since Fri 2026-01-09 17:52:57 AEST
  Memory: 60.9M
  Uptime: ~8 minutes at time of investigation
Prevention Options
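- Systemd KillMode=mixed - On stop, send SIGTERM to the main process, then SIGKILL any processes still left in the service's control group
- Add TimeoutStopSec=10 - Wait at most 10 seconds for graceful shutdown before systemd escalates to SIGKILL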
- Pre-start port check - Kill any process on 8010 before starting
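The pre-start port check could be sketched as follows. This is a minimal illustration, assuming a Node.js-compatible runtime and a listener reachable on 127.0.0.1; isPortFree and waitForPortFree are hypothetical names. It only detects and waits for a busy port. It does not kill the process holding it, which would need an external tool and is left out here.

```typescript
import * as net from "node:net";

// Resolve true if a probe listener can bind host:port,
// false if the bind fails with EADDRINUSE.
function isPortFree(port: number, host = "127.0.0.1"): Promise<boolean> {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once("error", (err: NodeJS.ErrnoException) => {
      if (err.code === "EADDRINUSE") resolve(false);
      else reject(err); // e.g. EACCES: surface anything unexpected
    });
    // Release the probe immediately so the real service can bind.
    probe.listen(port, host, () => probe.close(() => resolve(true)));
  });
}

// Poll until the port is free or the deadline passes.
async function waitForPortFree(
  port: number,
  timeoutMs = 10000,
  intervalMs = 250,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await isPortFree(port)) return true;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  return false;
}
```

A start script might call waitForPortFree(8010) and exit with a clear message when it returns false, so the log names the blocked port instead of repeating EADDRINUSE on every restart.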
The previous kontask "Kongate Crash Loop Fixed - Zombie Process Killed" already resolved this incident. No code changes needed for XML handling.
Details
Type
General
Status
Completed
Scope
product
Tags
kongate, pipeline, debugging, crash
Created
9 Jan 2026, 6:02 pm
Updated
9 Jan 2026, 6:02 pm
Timing
⏱️ 87.6s
0 tools
● TTFE 0%
● Thinking 100%
● Tools 0%
Created By
claude
Raw Data
{
"id": "88fc7959-d17a-4fa0-a70f-53f8204b082e",
"type": "general",
"status": "completed",
"title": "Kongate Crash Investigation - EADDRINUSE Zombie Loop",
"description": "Root cause analysis of kongate service crash loop on Jan 9th 2026",
"context": {
"output": "<div style=\"background:#1e1e1e;color:#e0e0e0;padding:20px;border-radius:8px;font-family:'Consolas','Monaco',monospace;\">\n<h2 style=\"font-family:sans-serif;color:#fff;margin:0 0 16px 0;\">Kongate Crash Investigation</h2>\n\n<h3 style=\"color:#98c379;margin:16px 0 8px 0;\">Root Cause: EADDRINUSE (Port 8010 Already in Use)</h3>\n<p style=\"color:#e0e0e0;line-height:1.6;\">The crashes were <strong style=\"color:#f97316;\">NOT caused by bad XML</strong>. The service was crash-looping because a <span style=\"color:#e5c07b;\">zombie process</span> was holding port 8010.</p>\n\n<h3 style=\"color:#61afef;margin:16px 0 8px 0;\">Timeline (07:51-07:52 UTC)</h3>\n<pre style=\"background:#2d2d2d;padding:16px;border-radius:6px;overflow-x:auto;line-height:1.8;color:#e0e0e0;\">\n<span style=\"color:#7f848e;\">07:51:53</span> <span style=\"color:#e06c75;\">EADDRINUSE</span> Port 8010 in use by old process\n<span style=\"color:#7f848e;\">07:51:58</span> <span style=\"color:#e06c75;\">EADDRINUSE</span> Systemd tries restart #1\n<span style=\"color:#7f848e;\">07:52:04</span> <span style=\"color:#e06c75;\">EADDRINUSE</span> Systemd tries restart #2\n<span style=\"color:#7f848e;\">07:52:10</span> <span style=\"color:#e06c75;\">EADDRINUSE</span> Systemd tries restart #3\n<span style=\"color:#7f848e;\">07:52:16</span> <span style=\"color:#e06c75;\">EADDRINUSE</span> Systemd tries restart #4\n<span style=\"color:#7f848e;\">07:52:22</span> <span style=\"color:#e06c75;\">EADDRINUSE</span> Systemd tries restart #5\n<span style=\"color:#7f848e;\">07:52:28</span> <span style=\"color:#e06c75;\">EADDRINUSE</span> Systemd tries restart #6\n<span style=\"color:#7f848e;\">07:52:33</span> <span style=\"color:#e06c75;\">EADDRINUSE</span> Systemd tries restart #7\n<span style=\"color:#7f848e;\">07:52:57</span> <span style=\"color:#98c379;\">SUCCESS</span> Zombie killed, port free, service starts!\n</pre>\n\n<h3 style=\"color:#c678dd;margin:16px 0 8px 0;\">What Happened</h3>\n<ol 
style=\"color:#e0e0e0;line-height:1.8;\">\n<li>A previous kongate process became a <span style=\"color:#e5c07b;\">zombie</span> - still holding port 8010 but not responding to SIGTERM</li>\n<li>Systemd tried to restart the service every ~5 seconds</li>\n<li>Each new process immediately crashed with <code style=\"background:#2d2d2d;padding:2px 6px;border-radius:3px;\">EADDRINUSE</code></li>\n<li>Around 07:52:57 UTC, the zombie was killed (likely manually) and the service recovered</li>\n</ol>\n\n<h3 style=\"color:#56b6c2;margin:16px 0 8px 0;\">XML Error Handling is Solid</h3>\n<p style=\"color:#e0e0e0;line-height:1.6;\">Bad XML <strong style=\"color:#98c379;\">cannot</strong> crash kongate. The code has proper error handling:</p>\n<pre style=\"background:#2d2d2d;padding:16px;border-radius:6px;overflow-x:auto;line-height:1.8;color:#e0e0e0;\">\n<span style=\"color:#7f848e;\">// xml-parser.ts:193-201</span>\n<span style=\"color:#c678dd;\">try</span> {\n <span style=\"color:#61afef;\">parse</span>(xmlContent); <span style=\"color:#7f848e;\">// @libs/xml parser</span>\n} <span style=\"color:#c678dd;\">catch</span> (error) {\n <span style=\"color:#c678dd;\">return</span> {\n success: <span style=\"color:#e5c07b;\">false</span>,\n error: <span style=\"color:#98c379;\">`XML parsing failed: ${errorMessage}`</span>\n }; <span style=\"color:#7f848e;\">// Gracefully returns error, doesn't crash</span>\n}\n</pre>\n\n<h3 style=\"color:#e5c07b;margin:16px 0 8px 0;\">Current Status</h3>\n<pre style=\"background:#2d2d2d;padding:16px;border-radius:6px;overflow-x:auto;line-height:1.8;color:#e0e0e0;\">\n<span style=\"color:#98c379;\">● kongate.service - Kongate FTP Gateway Service</span>\n Loaded: <span style=\"color:#61afef;\">loaded</span> (/etc/systemd/system/kongate.service; enabled)\n Active: <span style=\"color:#98c379;\">active (running)</span> since Fri 2026-01-09 17:52:57 AEST\n Memory: 60.9M\n Uptime: <span style=\"color:#98c379;\">~8 minutes</span> at time of 
investigation\n</pre>\n\n<h3 style=\"color:#f97316;margin:16px 0 8px 0;\">Prevention Options</h3>\n<ol style=\"color:#e0e0e0;line-height:1.8;\">\n<li><strong>Systemd KillMode=mixed</strong> - Force kill zombie processes on restart</li>\n<li><strong>Add TimeoutStopSec=10</strong> - Don't wait forever for graceful shutdown</li>\n<li><strong>Pre-start port check</strong> - Kill any process on 8010 before starting</li>\n</ol>\n\n<p style=\"color:#7f848e;margin-top:20px;font-style:italic;\">The previous kontask \"Kongate Crash Loop Fixed - Zombie Process Killed\" already resolved this incident. No code changes needed for XML handling.</p>\n</div>",
"requestedAt": "2026-01-09T08:01:00.000Z",
"requestId": "ef33f45a-9d74-4681-8425-dfe53c8934e5",
"choices": [
{
"label": "Harden systemd",
"value": "Improve the systemd service file with KillMode=mixed and TimeoutStopSec=10 to prevent zombie processes from blocking restarts",
"primary": true
},
{
"label": "Add port check",
"value": "Add a pre-start script that kills any process holding port 8010 before starting kongate"
},
{
"label": "View XML tests",
"value": "Show me the XML parsing test coverage to verify bad XML really can't crash the service"
}
],
"turnTiming": {
"totalMs": 87618,
"ttfeMs": 67,
"thinkingMs": 87549,
"toolExecutionMs": 0,
"toolCallCount": 0,
"thinkingPct": 100,
"toolsPct": 0,
"ttfePct": 0
}
},
"createdBy": "claude",
"createdAt": "2026-01-09T08:02:48.110Z",
"updatedAt": "2026-01-09T08:02:52.697Z",
"requestId": "ef33f45a-9d74-4681-8425-dfe53c8934e5",
"scope": "product",
"tags": [
"kongate",
"pipeline",
"debugging",
"crash"
],
"targetUser": "claude"
}