Umarbek

TCP

I thought I understood TCP. Three-way handshake, sequence numbers, acknowledgments. I could draw the diagram on a whiteboard. Then I tried to implement it and the first thing I discovered was that I understood almost nothing.

Here's the thing about TCP that you don't get from reading about it: everything is hard for interesting reasons. It's not just "add reliability to IP." It's a collection of deeply clever solutions to problems that aren't obvious until you try to solve them yourself.

Reliability is a distributed systems problem

The internet is unreliable by design. Packets get lost, arrive out of order, arrive twice, take different routes with different delays. TCP has to make this chaos look like a smooth, ordered stream of bytes.

The mechanism sounds simple. Number every byte. Acknowledge receipt. Retransmit what wasn't acknowledged. But the simplicity falls apart when you implement it.
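To make "number every byte, deliver in order" concrete, here's a minimal receiver-side sketch. The class name and structure are my own, not from the repo, and it ignores details a real TCP must handle (overlapping segments, sequence-number wraparound), but it shows why out-of-order and duplicate arrivals are naturally absorbed by sequence numbering:

```python
class ReassemblyBuffer:
    """Delivers payload bytes in order, whatever order segments arrive in.

    Simplified: assumes segments never overlap partially and sequence
    numbers never wrap. Real TCP handles both.
    """

    def __init__(self):
        self.next_seq = 0        # next byte offset we expect to deliver
        self.pending = {}        # seq -> payload, held until contiguous
        self.delivered = bytearray()

    def receive(self, seq, payload):
        # A segment entirely below next_seq is an old duplicate: drop it.
        if seq + len(payload) <= self.next_seq:
            return
        self.pending[seq] = payload
        # Deliver every contiguous run now available.
        while self.next_seq in self.pending:
            data = self.pending.pop(self.next_seq)
            self.delivered.extend(data)
            self.next_seq += len(data)
```

Feed it segments out of order and with duplicates, and `delivered` still comes out as a clean byte stream.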

When should you retransmit? Too early and you flood the network with unnecessary copies. Too late and the connection stalls. The answer is to estimate the round-trip time and set your timeout based on that. But network conditions change constantly, so you need a moving estimate. TCP uses an exponentially weighted moving average with variance tracking:

SRTT    = (1 - a) * SRTT + a * sample
RTTVAR  = (1 - b) * RTTVAR + b * |sample - SRTT|
RTO     = SRTT + 4 * RTTVAR

The 4 * RTTVAR is generous on purpose. A stable connection gets a tight timeout. A flaky one gets slack. I didn't appreciate this until I ran my implementation over a simulated lossy network and watched the RTO adapt in real time. It's an elegant piece of engineering hidden inside two lines of math.
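Those two lines of math translate almost directly into code. Here's a sketch using the standard constants a = 1/8 and b = 1/4 from RFC 6298; note that the RFC updates RTTVAR with the *old* SRTT before updating SRTT itself, and seeds both estimators from the first sample. The class name is mine, not the repo's:

```python
class RttEstimator:
    """RTO estimation in the style of RFC 6298 (Jacobson/Karels).

    Simplified: omits the RFC's clock-granularity term and minimum RTO.
    """
    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the variance estimate

    def __init__(self):
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0  # initial RTO before any sample, in seconds

    def update(self, sample):
        if self.srtt is None:
            # First measurement seeds both estimators.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # Variance first, computed against the old SRTT, then SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(sample - self.srtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        self.rto = self.srtt + 4 * self.rttvar
        return self.rto
```

With steady samples, RTTVAR decays toward zero and the RTO tightens around SRTT; a jittery link keeps RTTVAR (and so the RTO) high, which is exactly the adaptation described above.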

Closing is harder than opening

The three-way handshake gets all the attention, but the real complexity is in connection teardown. TCP has eleven states. Most of them exist because of closing.

Why? In a distributed system, you can't agree on anything simultaneously. When I send a FIN to close my side, the other side might still be sending data. So there's a half-closed state. When both sides have finished, I still can't delete the connection state — old packets might be floating around the network. So there's TIME_WAIT, where you hold the state for twice the maximum segment lifetime, just in case.

I implemented all eleven states. The state diagram started to make sense only after I'd debugged transitions between them for a week. Every state exists because someone found a bug. That's true of most protocol complexity — it looks like overengineering until you hit the exact edge case it was designed to handle.
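A transition table makes the teardown half of the state machine easy to see. This sketch covers only the closing states (the names are the standard TCP ones; the table layout is my own, not the repo's). The active closer walks through FIN_WAIT_1, FIN_WAIT_2, and TIME_WAIT; the passive closer through CLOSE_WAIT and LAST_ACK:

```python
from enum import Enum, auto

class State(Enum):
    ESTABLISHED = auto()
    FIN_WAIT_1 = auto()   # we sent FIN, waiting for its ACK
    FIN_WAIT_2 = auto()   # our FIN acked, waiting for peer's FIN
    TIME_WAIT = auto()    # hold state for 2*MSL to absorb stray packets
    CLOSE_WAIT = auto()   # peer sent FIN, application hasn't closed yet
    LAST_ACK = auto()     # we sent our FIN, waiting for final ACK
    CLOSED = auto()

# (state, event) -> next state; just the teardown paths.
TRANSITIONS = {
    # Active close: we call close() first.
    (State.ESTABLISHED, "close"):    State.FIN_WAIT_1,
    (State.FIN_WAIT_1, "recv_ack"):  State.FIN_WAIT_2,
    (State.FIN_WAIT_2, "recv_fin"):  State.TIME_WAIT,
    (State.TIME_WAIT, "2msl_timer"): State.CLOSED,
    # Passive close: the peer's FIN arrives first (the half-closed case).
    (State.ESTABLISHED, "recv_fin"): State.CLOSE_WAIT,
    (State.CLOSE_WAIT, "close"):     State.LAST_ACK,
    (State.LAST_ACK, "recv_ack"):    State.CLOSED,
}

def step(state, event):
    """Advance the connection; a missing entry means an invalid transition."""
    return TRANSITIONS[(state, event)]
```

Note that only the side that closes first pays the TIME_WAIT tax: it's the one that must outlive any stray packets from the connection it just ended.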

Congestion control is game theory

This surprised me most. TCP doesn't just need to be reliable — it needs to be fair. If your connection blasts data as fast as possible, it drowns out everyone else. But if it's too conservative, you waste available bandwidth.

The solution is a feedback loop that feels like it was designed by an economist: probe for bandwidth by increasing your send window additively while acknowledgments keep arriving, then cut it multiplicatively the moment you detect loss. Every flow playing by the same rule converges toward a fair share of the bottleneck.

Implementing this is where I actually learned. The algorithms are simple on paper. Getting the edge cases right — when to increase the window, when to decrease it, how to handle multiple losses in the same window — that's where the understanding lives. I also implemented CUBIC and BBR for comparison. CUBIC is what Linux uses. BBR is Google's approach. They solve the same problem with very different philosophies. CUBIC reacts to loss. BBR tries to estimate bandwidth and RTT directly. Seeing both work (and fail) gave me intuition I couldn't have gotten from reading papers.
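For the Reno-style baseline, the core window logic fits in a few lines. This is a simplified sketch (my own naming, window measured in segments, and it collapses fast retransmit/fast recovery into a single loss reaction, where real Reno also distinguishes timeouts):

```python
class RenoWindow:
    """Reno-flavored congestion window: slow start, additive increase,
    multiplicative decrease. Units are segments, not bytes.

    Simplified: no distinction between triple-dup-ack loss (cwnd ->
    ssthresh) and timeout (cwnd -> 1), which real Reno makes.
    """

    def __init__(self):
        self.cwnd = 1.0       # congestion window
        self.ssthresh = 64.0  # slow-start threshold

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0            # slow start: window doubles per RTT
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance: +1 per RTT

    def on_loss(self):
        # Multiplicative decrease: remember half the window as the new
        # threshold and restart congestion avoidance from there.
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = self.ssthresh
```

The sawtooth you see in every congestion-control plot falls straight out of these two methods: linear climb, sharp halving, repeat. CUBIC replaces the linear climb with a cubic curve around the last loss point; BBR drops the loss signal entirely and paces against measured bandwidth and RTT.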

What I learned

Every complex system is an accumulation of solutions to specific problems people actually hit. TCP's state machine isn't complex for fun. The congestion control algorithms aren't clever for show. Each piece exists because someone's network broke and they had to fix it.

The gap between "I know how TCP works" and "I implemented TCP" turned out to be the gap between knowing vocabulary and understanding a system. They feel similar from the outside. They're not.


Code: github.com/UmarbekFU/tcp-from-scratch

Full implementation: 11 states, three congestion control algorithms (Reno, CUBIC, BBR), RTT estimation, sliding window, SACK support. Python, because the goal was understanding, not performance.

← Back to 101 Projects