A socket hands you bytes, not messages — so you keep a buffer and only deliver a message once every byte of it has actually shown up.
TCP is a byte stream with no message boundaries. A single recv() can hand you half of one WebSocket frame or two whole frames glued together, and one frame can arrive split across three reads. The kernel does not care where your frames begin and end.
On top of that, WebSocket lets one application message be split into several frames (fragmentation): a first frame (text or binary) with FIN=0, then continuation frames, the last of which has FIN=1. So you must buffer incoming bytes, parse out whole frames only when enough bytes are present, and stitch fragments together until you see the final frame.
Treat one read as one message and you get corruption — you hand the app a half-parsed header as if it were text.
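To make the failure concrete, here is a minimal sketch that runs the same seven bytes through a naive receiver and a buffering one. The split point (after 3 bytes) is an arbitrary assumption, since any split is possible, and the frame is a tiny unmasked one so the header is just 2 bytes.

```javascript
// One unmasked text frame carrying "Hello": FIN=1, opcode 0x1, length 5.
const frame = Buffer.from([0x81, 0x05, 0x48, 0x65, 0x6c, 0x6c, 0x6f]);

// Assumed chunking: the transport happens to split the frame after 3 bytes.
const reads = [frame.subarray(0, 3), frame.subarray(3)];

// Naive receiver: treats every read as one complete text message.
const naive = reads.map((chunk) => chunk.toString('latin1'));

// Buffering receiver: joins the reads first, then parses the 2-byte header.
const joined = Buffer.concat(reads);
const payloadLen = joined[1] & 0x7f; // 5, small enough to sit inline
const buffered = joined.subarray(2, 2 + payloadLen).toString('utf8');

console.log(naive.length); // 2 bogus "messages": header bytes + "H", then "ello"
console.log(buffered);     // Hello
```

The naive receiver's first "message" starts with the two header bytes, which is exactly the half-parsed header leaking into the application.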
One buffer per connection. Append every read to it, then drain whole frames in a loop — returning early the moment the buffer is too short to finish parsing.

    // deliver(), handleControl() and unmask() are assumed to be defined elsewhere.
    let buf = Buffer.alloc(0); // per-connection accumulator
    let msg = [];              // fragments of the current message
    let msgOpcode = null;      // 0x1 text or 0x2 binary, set by the first fragment

    function onData(chunk) {
      buf = Buffer.concat([buf, chunk]); // bytes, not messages
      drainFrames();
    }

    function drainFrames() {
      while (true) {
        if (buf.length < 2) return; // need the 2-byte minimal header
        const fin = (buf[0] & 0x80) !== 0;
        const opcode = buf[0] & 0x0f;
        const masked = (buf[1] & 0x80) !== 0;
        let len = buf[1] & 0x7f;
        let off = 2;
        if (len === 126) { // 16-bit extended length
          if (buf.length < off + 2) return;
          len = buf.readUInt16BE(off); off += 2;
        } else if (len === 127) { // 64-bit extended length
          if (buf.length < off + 8) return;
          len = Number(buf.readBigUInt64BE(off)); off += 8;
        }
        if (masked) { // client->server frames carry a 4-byte key
          if (buf.length < off + 4) return;
          off += 4;
        }
        if (buf.length < off + len) return; // header parsed, payload not all here yet
        let payload = buf.subarray(off, off + len);
        if (masked) payload = unmask(payload, buf.subarray(off - 4, off));
        buf = buf.subarray(off + len); // consume exactly one frame

        if (opcode >= 0x8) { // control frame: ping / pong / close
          if (!fin || len > 125) throw new Error('control frames must be unfragmented and at most 125 bytes');
          handleControl(opcode, payload); // may arrive between fragments; handle now
          continue;
        }
        if (opcode === 0x0) { // continuation of the current message
          if (msgOpcode === null) throw new Error('continuation frame with no message in progress');
          msg.push(payload);
        } else { // 0x1 / 0x2: start of a new message
          if (msgOpcode !== null) throw new Error('new message started before the previous one finished');
          msgOpcode = opcode; msg = [payload];
        }
        if (fin) { // last fragment: deliver and reset
          deliver(msgOpcode, Buffer.concat(msg));
          msg = []; msgOpcode = null;
        }
      }
    }

The early returns are the whole trick: a short buffer is not an error, it just means “wait for the next read.” The bytes already parsed stay in buf until the rest arrives.
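The loop above calls unmask(payload, key) without showing it. One plausible shape for that helper is sketched below. The in-place XOR and the return value are assumptions about what the caller wants, not something the code above spells out. The sample bytes are the masked "Hello" frame from the RFC 6455 examples.

```javascript
// Hypothetical helper matching the unmask(payload, key) call in the loop.
// XORs payload byte i with key byte (i mod 4), in place, and returns the
// same buffer so the caller can reassign it.
function unmask(payload, key) {
  for (let i = 0; i < payload.length; i++) {
    payload[i] ^= key[i & 3]; // i & 3 is i mod 4
  }
  return payload;
}

// Masked "Hello": key 37 fa 21 3d, masked payload 7f 9f 4d 51 58.
const maskKey = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const maskedPayload = Buffer.from([0x7f, 0x9f, 0x4d, 0x51, 0x58]);
console.log(unmask(maskedPayload, maskKey).toString('utf8')); // Hello
```

Masking is its own inverse, so the same function would also mask an outgoing payload. Working in place avoids one extra copy per frame, at the cost of overwriting the bytes in the receive buffer.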
| Quantity | Cost | Note |
|---|---|---|
| Frame header | 2–14 bytes | 2 base + 0/2/8 length + 0/4 mask key |
| Buffer memory | O(message size) | fragments held until FIN=1 |
| Reassembly time | O(total bytes) | each byte copied once into the message |
| Latency to deliver | until FIN=1 | no partial message is handed up |
| Length encoding | 7 / 16 / 64 bit | ≤125 inline, 126→u16, 127→u64 |
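
The "Buffer memory" row has a practical consequence: the peer decides how much you hold, so the accumulator needs a ceiling. Below is a minimal sketch of a capped fragment accumulator. MessageAssembler, MAX_MESSAGE_BYTES and the 1 MiB figure are all hypothetical names and values chosen for illustration.

```javascript
const MAX_MESSAGE_BYTES = 1 << 20; // assumed limit (1 MiB); tune per application

// Hypothetical accumulator for the fragments of one message.
class MessageAssembler {
  constructor(limit = MAX_MESSAGE_BYTES) {
    this.limit = limit;
    this.parts = [];
    this.size = 0;
  }

  // Add one fragment. Returns the whole message when fin is set, else null.
  // Throws as soon as the running total would pass the limit, so the caller
  // can fail the connection instead of buffering further.
  push(payload, fin) {
    if (this.size + payload.length > this.limit) {
      this.parts = [];
      this.size = 0;
      throw new RangeError('message exceeds ' + this.limit + ' bytes');
    }
    this.parts.push(payload);
    this.size += payload.length;
    if (!fin) return null;
    const message = Buffer.concat(this.parts, this.size);
    this.parts = [];
    this.size = 0;
    return message;
  }
}

const asm = new MessageAssembler(8);
console.log(asm.push(Buffer.from('abc'), false));           // null
console.log(asm.push(Buffer.from('def'), true).toString()); // abcdef
```

A fuller version would probably also reject a frame whose declared length already exceeds the limit, before waiting for its payload to arrive. WebSocket reserves close code 1009 (message too big) for this situation.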
- Control frames (opcodes 0x8–0xA) can interleave between fragments of a message and must be processed immediately, not appended to the message.
- A message whose fragments never reach FIN can exhaust memory. Cap the assembled-message size and fail the connection past it.
- A set MASK bit adds a 4-byte key to the header: step past it, then XOR the payload against it.

A 300-byte text message is sent as two frames: a text frame (opcode 0x1, FIN=0) carrying the first 150 bytes, then a continuation frame (opcode 0x0, FIN=1) carrying the last 150 bytes. Because 150 > 125, each frame uses the 16-bit extended length, so each header is 4 bytes (unmasked, for brevity).
Now suppose the reads arrive misaligned with the frames:
1. Read 1 (2 bytes): [0x01, 0x7E], the base of frame 1’s header. Buffer = 2 B. We see len=126, which means “read 2 more length bytes”, but the buffer has nothing left. Return and wait.
2. Read 2 (100 bytes): [0x00, 0x96] (=150) plus the first 98 payload bytes. Now the header parses: FIN=0, text, len=150. But only 98 of 150 payload bytes are present. Return and wait.
3. Read 3 (156 bytes): the last 52 payload bytes of frame 1, then frame 2’s 4-byte header and the first 100 bytes of its payload. Frame 1 is complete: FIN=0 text → start the message, hold 150 bytes, await continuation. Frame 2’s header parses but only 100 of 150 payload bytes are here. Return and wait.
4. Read 4 (50 bytes): the rest of frame 2’s payload. Frame 2 is complete: opcode 0x0 continuation, FIN=1 → append 150 bytes, total 300, deliver the message and reset.

At no point did a partial header or partial payload escape to the application. The naive “one read = one message” receiver would have handed read 1’s two header bytes up as if they were text, which is garbage.
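The walkthrough can be replayed in code. This is a stripped-down sketch, not the full parser: it assumes unmasked data frames only, handles just the inline and 16-bit length forms, and uses read sizes of 2, 100, 156 and 50 bytes, which is what the steps above imply. The helper names frame and feed are made up for the example.

```javascript
// Build one unmasked frame with the 16-bit length form (payload of 126..65535 bytes).
function frame(fin, opcode, payload) {
  const header = Buffer.from([(fin ? 0x80 : 0) | opcode, 126, payload.length >> 8, payload.length & 0xff]);
  return Buffer.concat([header, payload]);
}

const text = Buffer.alloc(300, 'a'); // the 300-byte message
const wire = Buffer.concat([
  frame(false, 0x1, text.subarray(0, 150)), // text frame, FIN=0
  frame(true, 0x0, text.subarray(150)),     // continuation frame, FIN=1
]); // 308 bytes on the wire

let buf = Buffer.alloc(0);
let parts = [];
const delivered = [];

// Append one read, drain every whole frame, and report what is still held.
function feed(chunk) {
  buf = Buffer.concat([buf, chunk]);
  while (buf.length >= 2) {
    const fin = (buf[0] & 0x80) !== 0;
    let len = buf[1] & 0x7f;
    let off = 2;
    if (len === 126) {
      if (buf.length < 4) break; // extended length not here yet
      len = buf.readUInt16BE(2);
      off = 4;
    }
    if (buf.length < off + len) break; // payload not complete yet
    parts.push(buf.subarray(off, off + len));
    buf = buf.subarray(off + len);
    if (fin) { delivered.push(Buffer.concat(parts)); parts = []; }
  }
  return { buffered: buf.length, fragments: parts.length, delivered: delivered.length };
}

let pos = 0;
const states = [2, 100, 156, 50].map((n) => {
  const state = feed(wire.subarray(pos, pos + n));
  pos += n;
  return state;
});
console.log(states);
// read 1: { buffered: 2,   fragments: 0, delivered: 0 }
// read 2: { buffered: 102, fragments: 0, delivered: 0 }
// read 3: { buffered: 104, fragments: 1, delivered: 0 }
// read 4: { buffered: 0,   fragments: 0, delivered: 1 }
console.log(delivered[0].equals(text)); // true
```

The message count stays at zero until the fourth read, even though bytes have been piling up since the first one.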
You’ve appended a read and the buffer holds exactly 1 byte: 0x82. What should the parser do?
Mid-way through reassembling a fragmented text message (FIN=0 seen, awaiting continuation), a complete ping frame (opcode 0x9, FIN=1) arrives. What happens to it?