Sharing a file peer to peer

Nobody downloads the whole file from one place. Each peer grabs pieces from the others and hands out the pieces it already has.

The idea

A central server is a single tap: the more people pull from it, the slower it gets for everyone. A peer-to-peer protocol turns that around. The file is split into small numbered pieces, and a shared manifest lists the hash of each one.
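As a rough illustration of the split-and-hash step, here is a minimal sketch using Python's standard hashlib. The names split_pieces, build_manifest and PIECE_SIZE are made up for this example, and the 4-byte piece size is deliberately tiny; real protocols use much larger pieces and a richer manifest format.

```python
import hashlib

PIECE_SIZE = 4  # bytes; tiny on purpose, an assumption for this demo only


def split_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Cut data into consecutive pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def build_manifest(data: bytes, piece_size: int = PIECE_SIZE) -> list[str]:
    """Return the hex SHA-256 digest of every piece, in piece order."""
    return [hashlib.sha256(p).hexdigest() for p in split_pieces(data, piece_size)]


file_bytes = b"hello, swarm!"           # 13 bytes: pieces of 4, 4, 4 and 1
pieces = split_pieces(file_bytes)
manifest = build_manifest(file_bytes)   # one digest per piece, index = piece number
print(len(pieces), [len(p) for p in pieces])
```

Because the manifest is indexed by piece number, any peer can later check piece i on its own, without seeing the rest of the file.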

Peers find each other through a tracker or DHT, then download missing pieces in parallel directly from whoever already has them. Each piece is checked against its hash on arrival, and as soon as a peer holds a piece it starts uploading it to others. Every downloader (a leecher) is therefore also an uploader, and once it holds the complete file it becomes a seeder, so the more people join, the more total upload bandwidth the swarm has.
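The check-on-arrival step could look something like the sketch below. It assumes the manifest is a list of hex SHA-256 digests; verify_piece and receive are hypothetical helper names, and the in-memory store stands in for writing to disk.

```python
import hashlib


def verify_piece(index: int, data: bytes, manifest: list[str]) -> bool:
    """True only if data hashes to the digest the manifest lists for this index."""
    return hashlib.sha256(data).hexdigest() == manifest[index]


def receive(index: int, data: bytes, manifest: list[str],
            store: dict[int, bytes], have: set[int]) -> bool:
    """Keep a piece only if it verifies; report whether it was kept."""
    if not verify_piece(index, data, manifest):
        return False        # discard; the caller should ask a different peer
    store[index] = data
    have.add(index)         # only now is the piece safe to advertise and upload
    return True


manifest = [hashlib.sha256(p).hexdigest() for p in (b"aaaa", b"bbbb")]
store, have = {}, set()
ok = receive(0, b"aaaa", manifest, store, have)    # matches the manifest
bad = receive(1, b"bbbX", manifest, store, have)   # corrupted in transit
```

A piece is added to have only after it verifies, so a peer never re-uploads data it has not checked.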

How it works

A client reads the manifest to learn how many pieces there are and the expected hash of each. It asks the tracker (or DHT) for a peer list, then keeps requesting the rarest missing piece it can find, verifies every arrival, and immediately advertises what it now has.
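One way to sketch the rarest-first choice, assuming each peer has advertised the set of piece indices it holds. The peer_bitfields mapping and the tie-breaking rule (lowest index, then lowest peer name) are assumptions of this example; real clients typically break ties randomly.

```python
from collections import Counter


def pick_rarest_missing(have: set[int], peer_bitfields: dict[str, set[int]]):
    """Return (piece, peer) for the missing piece with the fewest holders, or None."""
    # How many peers advertise each piece.
    counts = Counter(i for pieces in peer_bitfields.values() for i in pieces)
    candidates = [i for i in counts if i not in have]
    if not candidates:
        return None                      # nothing left that anyone can give us
    piece = min(candidates, key=lambda i: (counts[i], i))
    holder = min(p for p, pieces in peer_bitfields.items() if piece in pieces)
    return piece, holder


peers = {"S": {0, 1, 2, 3}, "A": {0}, "B": {0, 1}}
choice = pick_rarest_missing({0}, peers)   # pieces 2 and 3 have one holder each
print(choice)
```

Here pieces 2 and 3 are each held only by S, so the lowest-index rule selects piece 2 from S.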

# 1. publish: split the file, hash every piece
pieces    = split(file, PIECE_SIZE)
manifest  = [sha256(p) for p in pieces]     # piece hashes, stored in the .torrent
info_hash = sha256(encode(manifest))        # swarm ID (simplified); a magnet link
                                            # carries this, not the piece hashes

# 2. download: join the swarm and fill in the gaps
peers = tracker.get_peers(info_hash)        # or DHT lookup
have  = [False] * len(manifest)

while not all(have):
    # rarest-first: fewest copies in the swarm = grab it now
    i    = pick_rarest_missing(have, peers)
    src  = some_peer_that_has(i, peers)
    data = src.request(i)                   # parallel: many i at once

    if sha256(data) == manifest[i]:         # verify: peers are untrusted
        write_piece(i, data)
        have[i] = True
        announce_have(i, peers)             # now you upload it too
    else:
        blacklist(src)                      # bad piece, try another peer

Trade-offs

| Concern | Central server | Peer to peer |
| --- | --- | --- |
| Bandwidth as demand grows | Splits one pipe thinner per user | Scales up: each new peer adds upload capacity |
| Single point of failure | Server down = nobody downloads | No single source; any seeder can serve |
| Startup latency | Instant: one connection, first byte fast | Slower to start while finding peers and pieces |
| Privacy | One operator sees every request | Your IP is visible to every peer in the swarm |
| Control over the file | Owner can update or pull it anytime | Once seeded, copies live on independently |
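
The bandwidth row can be made concrete with the idealized lower bounds on distribution time often used in networking textbooks. The sketch below assumes a fluid model with identical peers and ignores discovery delay and protocol overhead. The function names and every number in the example are invented for illustration.

```python
def client_server_time(file_bits, n_peers, server_up, peer_down):
    """Lower bound (seconds) when one server uploads a full copy to every peer."""
    return max(n_peers * file_bits / server_up,   # server must push N copies
               file_bits / peer_down)             # each peer must receive one copy


def p2p_time(file_bits, n_peers, server_up, peer_down, peer_up):
    """Lower bound (seconds) when peers also upload what they have received."""
    total_up = server_up + n_peers * peer_up      # whole swarm's upload capacity
    return max(file_bits / server_up,             # server must push each bit once
               file_bits / peer_down,             # each peer must receive one copy
               n_peers * file_bits / total_up)    # N copies over the total capacity


F = 8e9          # 1 GB file, in bits (assumed)
SERVER_UP = 100e6
PEER_DOWN = 50e6
PEER_UP = 10e6

for n in (10, 1000):
    cs = client_server_time(F, n, SERVER_UP, PEER_DOWN)
    p2p = p2p_time(F, n, SERVER_UP, PEER_DOWN, PEER_UP)
    print(f"N={n}: client-server >= {cs:.0f} s, peer-to-peer >= {p2p:.0f} s")
```

With these made-up rates, the client-server bound grows linearly with the number of peers, while the peer-to-peer bound levels off because total upload capacity grows along with demand.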

Worked example

A 4-piece file. Seeder S holds all four; peers A, B, C start empty.

Round 1. A asks for the rarest piece in the swarm — right now every piece exists only on S, so A just takes piece 0. Meanwhile B pulls piece 1 and C pulls piece 2 from S in parallel. The seeder hands out three different pieces at once instead of three copies of the same one.

Round 2. Now A, B, C each hold one distinct piece, so they trade among themselves — A sends piece 0 to B while B sends piece 1 to A — and S only needs to feed the last piece 3 into the swarm once. Every transferred piece is hash-checked on arrival.

Round 3. A few more swaps and A, B, C all reach 4/4. The single seeder has become four seeders, and S could leave entirely without breaking anyone's download.
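
The walk-through above can be replayed with a toy simulation. This is only a sketch under simplifying assumptions that are not part of the original description: each leecher fetches exactly one piece per round, uploads are unlimited, ties go to the lowest piece index, and a leecher is preferred over S as the source whenever one holds the piece. Because of the one-piece-per-round rule, the round count will not line up exactly with the narrative.

```python
def simulate(num_pieces=4, leechers=("A", "B", "C")):
    """Round-based toy swarm; returns (rounds, uploads per peer)."""
    everything = set(range(num_pieces))
    have = {"S": set(everything)}
    have.update({name: set() for name in leechers})
    uploads = {name: 0 for name in have}
    rounds = 0

    while any(have[name] != everything for name in leechers):
        rounds += 1
        # Peers can only serve what they held when the round began.
        start = {name: set(p) for name, p in have.items()}
        # Copies of each piece, also counting requests made earlier this round.
        copies = {i: sum(i in p for p in start.values()) for i in everything}
        for name in leechers:
            missing = everything - start[name]
            if not missing:
                continue
            piece = min(missing, key=lambda i: (copies[i], i))   # rarest first
            holders = [h for h in start if piece in start[h]]
            source = min(holders, key=lambda h: (h == "S", h))   # S as last resort
            have[name].add(piece)
            uploads[source] += 1
            copies[piece] += 1
    return rounds, uploads


rounds, uploads = simulate()
print(rounds, uploads)
```

Under these rules the swarm finishes in 4 rounds and S uploads each piece exactly once (4 uploads), while the leechers carry the other 8 of the 12 transfers among themselves.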

Check yourself

A peer sends you piece 7, but sha256(piece7) doesn't match manifest[7]. What should the client do?

Why does the swarm often download the rarest missing piece first?