The router never mails you the packets — it mails you a tally. You learn who talked to whom and how much, by best-effort post that sometimes goes missing.
A router or switch watches packets fly by and folds them into flows — groups keyed by the 5-tuple (src IP, dst IP, src port, dst port, protocol). For each flow it keeps a row in a flow cache, ticking up packet and byte counters as matching packets arrive.
When a flow expires (idle too long, open too long, a TCP FIN/RST, or the cache fills), the device exports one compact record over UDP to a collector. So what you receive is a stream of summaries, not a full packet capture. That makes it cheap and scalable, but it also means that sampled counts are estimates, export datagrams can be lost, and (in IPFIX) you can't decode a record until its template has arrived.
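Loss on the export path is silent at the UDP layer, but a collector can often notice it after the fact. The sketch below assumes the 16-byte IPFIX message header layout from RFC 7011 and assumes its sequence number counts the data records sent before the message. The function name `records_lost` and its arguments are hypothetical, and out-of-order delivery is ignored for brevity.

```python
import struct

# IPFIX message header: version, length, export time,
# sequence number, observation domain ID (16 bytes, network byte order).
HEADER = struct.Struct("!HHIII")

def records_lost(prev_seq, prev_count, header_bytes):
    """Return how many data records went missing before this message.

    prev_seq   -- sequence number of the previous message from this domain
    prev_count -- number of data records that previous message carried
    """
    version, _length, _export_time, seq, _domain = HEADER.unpack(header_bytes[:16])
    if version != 10:
        raise ValueError("not an IPFIX message")
    expected = (prev_seq + prev_count) % 2**32   # counter wraps at 2^32
    return (seq - expected) % 2**32
```

A non-zero result says that records are gone, not which flows they described; the gap can be flagged but not repaired.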
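The observation point hashes each sampled packet's 5-tuple to find or create a cache row, then adds 1 to packets and the packet length to bytes. A flow leaves the cache, and is exported, under any of four conditions: the inactive timeout expires (no packets for too long), the active timeout expires (the flow has been open too long), a TCP FIN or RST is seen, or the cache is full and a row must be evicted to make room.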
Export goes out over UDP — fire-and-forget, no acks, no retransmit. In IPFIX the record is just field values in a packed binary layout; the template that names and types those fields is sent separately and periodically. A collector that hasn't yet seen the matching template literally cannot decode the data record.
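One way a collector might cope with the template dependency is to hold undecodable records until the template shows up. The sketch below is a simplified illustration, not real IPFIX parsing: it assumes a template is just a list of (field name, byte length) pairs, that every field is an unsigned big-endian integer, and that `on_template`, `on_data` and `pending` are hypothetical names.

```python
from collections import defaultdict

templates = {}                # (exporter, template_id) -> [(name, length), ...]
pending = defaultdict(list)   # same key -> raw payloads waiting for a template

def decode(template, payload):
    """Split a packed payload into named integer fields using the template."""
    record, offset = {}, 0
    for name, length in template:
        record[name] = int.from_bytes(payload[offset:offset + length], "big")
        offset += length
    return record

def on_template(exporter, template_id, template):
    """Store the template and decode anything that was waiting for it."""
    key = (exporter, template_id)
    templates[key] = template
    return [decode(template, raw) for raw in pending.pop(key, [])]

def on_data(exporter, template_id, payload):
    """Decode now if possible; otherwise park the payload and return None."""
    key = (exporter, template_id)
    template = templates.get(key)
    if template is None:
        pending[key].append(payload)
        return None
    return decode(template, payload)
```

A real collector would likely cap the size of `pending` and drop the oldest payloads, since a template that never arrives would otherwise grow the buffer without bound.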
```python
# now(), sampled_out(), encode(), udp_send(), collector, INACTIVE, ACTIVE
# and sampling_rate are assumed to be defined elsewhere.
cache = {}  # 5-tuple -> flow row

def on_packet(pkt):
    if sampled_out(pkt):
        return  # 1:N sampling drops the rest
    key = (pkt.src_ip, pkt.dst_ip,
           pkt.src_port, pkt.dst_port, pkt.proto)
    f = cache.get(key)
    if f is None:
        f = cache[key] = {"packets": 0, "bytes": 0,
                          "start": now(), "last": now()}
    f["packets"] += 1
    f["bytes"] += pkt.length
    f["last"] = now()
    if pkt.tcp_fin or pkt.tcp_rst:
        export(key, cache.pop(key))  # FIN/RST -> flush now

def sweep():  # runs on a timer
    for key, f in list(cache.items()):
        if now() - f["last"] > INACTIVE:
            export(key, cache.pop(key))
        elif now() - f["start"] > ACTIVE:
            export(key, cache.pop(key))

def export(key, f):
    f["packets"] *= sampling_rate  # scale back up to an estimate
    f["bytes"] *= sampling_rate
    udp_send(collector, encode(key, f))  # best-effort, may be lost
```
Note the *= sampling_rate: if the device only inspected 1 in N packets, it multiplies the counts by N to estimate the true total. The estimate is unbiased on average but noisy for small flows.
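To see why the scaled count is unbiased yet noisy, the estimate can be treated as a scaled binomial. The sketch below assumes each packet is sampled independently with probability 1/N; deterministic every-Nth-packet sampling behaves differently. The function name `scaled_estimate_stats` is hypothetical.

```python
import math

def scaled_estimate_stats(true_packets, n):
    """Mean and relative standard deviation of the 1:n scaled-up packet count.

    Assumes independent sampling, so the number of sampled packets
    is Binomial(true_packets, 1/n).
    """
    p = 1.0 / n
    mean = n * (true_packets * p)                     # equals true_packets: unbiased
    std = n * math.sqrt(true_packets * p * (1 - p))   # spread of the estimate
    return mean, std / true_packets
```

Under this model the relative error works out to sqrt((N - 1) / true_packets), so at a fixed sampling rate it shrinks with the square root of the flow's size. That is why large flows are estimated well and small ones are not.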
| Lever | Effect | Watch out for |
|---|---|---|
| Sampling 1:1 | Exact counts, every flow seen | Heavy CPU / cache load on the router |
| Sampling 1:1000 | Cheap, scales to backbone links | Counts are ×1000 estimates; small flows missed |
| Active timeout short | Fresher data, faster visibility | More export traffic, splits long flows |
| Active timeout long | Fewer, fatter records | Stale view of in-progress conversations |
| UDP export | Cheap, no per-record router state | Silent loss — no retransmit, gaps appear |
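Because a short active timeout splits one long conversation into several records, a collector that wants per-flow totals has to add the pieces back together. A minimal sketch, assuming each record arrives as a hypothetical (5-tuple key, packets, bytes) tuple:

```python
from collections import defaultdict

def merge_records(records):
    """Sum the partial records that an active timeout produced for one flow."""
    totals = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for key, packets, byte_count in records:
        totals[key]["packets"] += packets
        totals[key]["bytes"] += byte_count
    return dict(totals)
```

Keying on the 5-tuple alone is a simplification: if a port pair is reused later, two distinct connections would be merged, so a real collector would probably also compare the records' start and end times.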
A web server sends a 1.5 MB file to a client over one TCP connection, say 10.0.0.7:443 → 10.0.0.40:51020. At the edge router the first data packet creates a flow row; each subsequent packet increments packets and adds its length to bytes. At 1:1 sampling the row reaches roughly packets ≈ 1100, bytes ≈ 1,500,000. When the server's FIN passes through, the row is flushed immediately and one IPFIX record leaves over UDP. (The client's FIN belongs to the reverse flow, 10.0.0.40:51020 → 10.0.0.7:443, which has its own row.) If that single datagram is lost in transit, the collector never learns about a 1.5 MB transfer, and with no retransmit it never will. Bump sampling to 1:1000 and the same flow is built from roughly one observed packet, then multiplied by 1000. With one full-size packet sampled the record reads bytes ≈ 1,500,000; with two it reads about double that; with none, the flow never appears at all.
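To put numbers on "roughly one observed packet", the chance of sampling exactly k packets from this flow can be computed directly. The sketch assumes independent 1:1000 sampling of the example's roughly 1100 packets; `prob_sampled` is a hypothetical helper name.

```python
import math

def prob_sampled(k, packets=1100, n=1000):
    """Probability that exactly k of `packets` are sampled at 1:n."""
    p = 1.0 / n
    return math.comb(packets, k) * p**k * (1 - p)**(packets - k)

for k in range(4):
    print(k, round(prob_sampled(k), 3))
```

Under these assumptions the flow is missed entirely about a third of the time, and seeing exactly one packet is only slightly more likely than that.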
Your collector starts receiving IPFIX data records right after a router reboot, but it can't decode them — every field reads as garbage. What is the most likely cause?