Designing a Reliable Transmission Protocol

The modulation layer (Part 2) converts bits to audio frequencies. But raw bits aren’t enough — the receiver needs to know where a transmission starts, how long the payload is, what modulation scheme is in use, and whether the data arrived intact. That’s the protocol layer.

This post covers the frame structure, synchronization, forward error correction, and integrity checking that make NearWave work over a noisy, unidirectional channel.

Frame Structure

Every NearWave transmission is wrapped in a frame. The frame is the unit of data the receiver processes:

graph LR
    A[Preamble<br/>32 bits] --> B[Header<br/>9 bytes · FEC'd]
    B --> C[Payload<br/>N bytes · FEC'd]
    C --> D[Footer<br/>16 bits]

Each section has a specific role:

┌──────────┬──────────┬─────────┬──────────┐
│ Preamble │  Header  │ Payload │  Footer  │
│ 32 bits  │ 9 bytes  │ N bytes │ 16 bits  │
│          │ (FEC'd)  │ (FEC'd) │          │
└──────────┴──────────┴─────────┴──────────┘

Preamble (32 bits)

The preamble is a known bit pattern that the receiver uses to detect the start of a transmission. It consists of an alternating tone sequence followed by a sync word.

The alternating pattern (1010...) trains the receiver’s gain and timing. The sync word (a fixed pattern like 0xB5) marks the exact start of the header. The receiver correlates incoming bits against the sync word and triggers frame processing when it matches.

Why not just start transmitting data? Without a preamble, the receiver has no way to distinguish the beginning of a frame from ambient noise. The preamble provides a deterministic entry point.

Header (9 bytes, FEC-encoded)

The header carries metadata about the payload:

Field	Size	Purpose
Payload length	2 bytes	Number of payload bytes (before FEC)
Modulation type	1 byte	BFSK, 4-FSK, etc.
CRC32	4 bytes	Checksum of original data
Original bit length	2 bytes	Bit count before Hamming expansion

The header is Hamming(7,4)-encoded (see below), so its 72 bits expand to 126 bits on the wire, just under 16 bytes. Protecting the header with FEC is critical: a corrupted length field would cause the receiver to read the wrong number of payload bytes, making the entire frame unrecoverable.
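As a rough illustration of the 9-byte layout, here is a minimal sketch of packing and unpacking the header with Python's `struct` module. The field order follows the table above, but the big-endian byte order, the `MOD_BFSK` value, and the function names are assumptions made for this example, not details confirmed by NearWave.

```python
import struct

# Assumed layout: big-endian, fields in the order the table lists them.
# H = payload length (2), B = modulation type (1),
# I = CRC32 (4), H = original bit length (2)
HEADER_FORMAT = ">HBIH"
MOD_BFSK = 0  # hypothetical numeric code for BFSK


def pack_header(payload_len, modulation, crc32, bit_len):
    """Serialize the four header fields into 9 bytes."""
    return struct.pack(HEADER_FORMAT, payload_len, modulation, crc32, bit_len)


def unpack_header(raw):
    """Return (payload_len, modulation, crc32, bit_len) from 9 header bytes."""
    return struct.unpack(HEADER_FORMAT, raw)


header = pack_header(100, MOD_BFSK, 0xDEADBEEF, 800)
assert len(header) == 9

# 9 bytes -> 18 four-bit groups -> 18 seven-bit codewords on the wire
wire_bits = len(header) * 2 * 7
```

With these assumptions `wire_bits` comes out to 126, which is where the "just under 16 bytes" figure comes from.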

Payload (N bytes, FEC-encoded)

The actual data, Hamming(7,4)-encoded. The length is specified in the header. After FEC decoding and CRC verification on the receive side, this yields the original input bytes.

Footer (16 bits)

A fixed end-of-frame marker. The receiver uses this to confirm that the frame terminated cleanly. If the footer doesn’t match, the frame is discarded.

Preamble Detection

The receiver runs continuously, processing audio samples through the Goertzel detector (covered in Part 4). It maintains a sliding window of recent bits and checks for the preamble pattern at every step.

The detection process:

Demodulate each symbol to a bit (or group of bits for MFSK)
Shift into a 32-bit buffer
Compare against the known preamble pattern
On match, switch from “scanning” to “receiving” mode
Read the header, then the payload, then the footer

A common failure mode is false triggering: ambient noise that coincidentally matches the preamble. The sync word reduces this risk. On its own, an 8-bit sync word has a 1/256 chance of matching random bits at any given bit position. Requiring the full 32-bit preamble to match lowers that to 1 in 2^32 for random input, so false positives are rare in practice.
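The scanning loop described above can be sketched as a 32-bit shift register. The preamble value `0xAAAAAAB5` (24 alternating bits followed by the example sync word 0xB5) is an assumption for illustration, as are the function names. The sketch also requires an exact match, whereas a real receiver might tolerate a few bit errors in the training pattern.

```python
PREAMBLE = 0xAAAAAAB5  # assumed: 24 alternating bits (1010...) then sync word 0xB5
MASK = 0xFFFFFFFF      # keep only the most recent 32 bits


def find_preamble(bits):
    """Return the index of the first bit after the preamble, or None."""
    window = 0
    for i, bit in enumerate(bits):
        # Shift the new bit in on the right, drop the oldest bit on the left.
        window = ((window << 1) | bit) & MASK
        if i >= 31 and window == PREAMBLE:
            return i + 1  # the header starts here
    return None


def int_to_bits(value, width):
    """Expand an integer into a list of bits, most significant bit first."""
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]


noise = [0, 1, 1, 0, 0]
stream = noise + int_to_bits(PREAMBLE, 32) + [1, 1, 0]
header_start = find_preamble(stream)
```

Here `header_start` is 37: five noise bits plus the 32 preamble bits. From that index on, the receiver would switch to “receiving” mode.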

Forward Error Correction: Hamming(7,4)

The audio channel introduces bit errors. Speaker distortion, ambient noise, and microphone sensitivity can all flip bits. NearWave uses Hamming(7,4) to correct one flipped bit per 7-bit codeword without retransmission.

How It Works

Hamming(7,4) encodes every 4 data bits into a 7-bit codeword by adding 3 parity bits:

Data:     [d₁ d₂ d₃ d₄]
Codeword: [p₁ p₂ d₁ p₃ d₂ d₃ d₄]

Each parity bit covers the codeword positions whose binary index has the corresponding bit set: p₁ covers positions 1, 3, 5, 7; p₂ covers 2, 3, 6, 7; p₃ covers 4, 5, 6, 7. On the receive side, the decoder recomputes the three parity checks to form a 3-bit syndrome. If the syndrome is zero, no error was detected. If it is non-zero and only one bit was flipped, the syndrome read as a binary number is the position of the bad bit, which is then flipped back.
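A minimal sketch of the encoder and syndrome decoder, using the codeword layout shown above (`[p₁ p₂ d₁ p₃ d₂ d₃ d₄]`). The function names are made up for this example, and bits are kept as Python lists for readability rather than speed.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into the 7-bit codeword [p1 p2 d1 p3 d2 d3 d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the error; 0 means no error seen.
    syndrome = s1 | (s2 << 1) | (s3 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1  # flip position 5 (d2) in transit
assert hamming74_decode(corrupted) == [1, 0, 1, 1]
```

Note that two flips inside the same codeword also produce a non-zero syndrome, but it points at the wrong position, so the decoder returns wrong data without noticing.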

Overhead

Hamming(7,4) adds 75% overhead — every 4 bits become 7 bits on the wire. For a 100-byte payload:

Original: 800 bits
After Hamming: 1400 bits
Overhead: 600 bits (75%)

This is significant given the already-low bandwidth. But the alternative is no error correction at all on a unidirectional channel. There’s no ACK, no retransmission. If a bit flips and there’s no FEC, the data is corrupted silently.
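The arithmetic above generalizes to the whole frame. The sketch below assumes the sections are concatenated with no padding between them, which the post does not state explicitly, and the function names are illustrative.

```python
def hamming74_wire_bits(n_bytes):
    """Bits on the wire for n_bytes of data after Hamming(7,4) encoding."""
    nibbles = n_bytes * 2  # each byte splits into two 4-bit groups
    return nibbles * 7     # each group becomes a 7-bit codeword


def frame_wire_bits(payload_bytes, header_bytes=9, preamble_bits=32, footer_bits=16):
    """Total frame size in bits, assuming no padding between sections."""
    return (preamble_bits
            + hamming74_wire_bits(header_bytes)
            + hamming74_wire_bits(payload_bytes)
            + footer_bits)


payload_bits = hamming74_wire_bits(100)  # 1400
total_bits = frame_wire_bits(100)        # 32 + 126 + 1400 + 16 = 1574
```

Under these assumptions, a 100-byte payload costs 1574 bits on the wire, so the fixed framing cost (preamble, header, footer) is small next to the FEC expansion of the payload itself.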

Why Hamming(7,4) Specifically

Stronger codes exist (Reed-Solomon, convolutional codes, LDPC). Hamming(7,4) was chosen because:

It’s trivial to implement — a few XOR operations
The decoding is constant-time with no iterative computation
Single-bit correction is sufficient for the error rates observed in typical indoor environments
The overhead is acceptable given the small payload sizes NearWave targets

For a system transmitting kilobytes, not megabytes, simplicity wins over theoretical efficiency.

CRC32 Integrity Check

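Hamming(7,4) corrects one flipped bit per codeword, but it cannot handle two or more flips in the same codeword: the decoder either "corrects" the wrong bit or sees nothing wrong, and returns bad data without any warning. CRC32 provides an integrity check over the original (pre-FEC) data to catch those cases.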

The flow:

Sender: compute CRC32 over the raw payload → store in header → FEC-encode header and payload → transmit
Receiver: receive frame → FEC-decode header and payload → compute CRC32 over decoded payload → compare with header CRC32

If the CRC doesn’t match, the frame is discarded. This catches:

Multi-bit errors that Hamming can’t correct
Systematic corruption from hardware issues
Frames where the preamble was falsely detected
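The sender and receiver sides of that check can be sketched with Python's `zlib.crc32`, which implements the common CRC-32 used by zlib and gzip. Whether NearWave uses this exact CRC-32 variant is an assumption, and the helper names are hypothetical.

```python
import zlib


def crc32_of(data: bytes) -> int:
    """CRC-32 of the raw (pre-FEC) payload as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def verify(decoded_payload: bytes, header_crc: int) -> bool:
    """Receiver side: recompute the CRC and compare with the header value."""
    return crc32_of(decoded_payload) == header_crc


# Sender side: compute the CRC before FEC encoding and store it in the header.
payload = b"hello nearwave"
header_crc = crc32_of(payload)

# Receiver side: a payload that FEC decoded incorrectly fails the check.
corrupted = bytes([payload[0] ^ 0x03]) + payload[1:]
assert verify(payload, header_crc)
assert not verify(corrupted, header_crc)
```

The corrupted example flips two bits in one byte, the kind of damage a miscorrected Hamming codeword leaves behind.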

Why No Retransmission

NearWave is strictly unidirectional. The sender plays audio. The receiver captures audio. There’s no back-channel — the receiver can’t send an ACK or NACK.

This means the protocol must be correct by construction:

FEC corrects what it can
CRC detects what FEC misses
The frame structure is self-describing (header contains length and type)
No state is shared between sender and receiver

This is a deliberate constraint, not a limitation. A bidirectional protocol would require two-way audio, echo cancellation, and substantially more complexity. For the intended use cases — air-gapped transfer, quick device-to-device exchange — unidirectional is the right tradeoff.

Putting It Together

A complete frame transmission looks like this:

Compute CRC32 of the raw input data
Build the 9-byte header (length, modulation type, CRC32, original bit length)
Hamming(7,4)-encode the header
Hamming(7,4)-encode the payload
Prepend the 32-bit preamble
Append the 16-bit footer
Pass the full bitstream to the modulator

On receive, every step reverses:

Detect preamble in the audio stream
Demodulate the header bits → Hamming-decode → extract metadata
Demodulate payload bits (count from header) → Hamming-decode
Check the 16-bit footer; discard the frame if it doesn’t match
Verify CRC32
If valid, output the recovered data
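The two step lists above can be sketched end to end at the bit level, leaving out modulation. Everything not stated in the post is an assumption chosen for illustration: the preamble value `0xAAAAAAB5`, the footer value `0x7E7E`, big-endian header fields, high-nibble-first encoding, and treating "original bit length" as the payload length in bits.

```python
import struct
import zlib

PREAMBLE = 0xAAAAAAB5         # assumed: 24 alternating bits + sync word 0xB5
FOOTER = 0x7E7E               # hypothetical 16-bit end-of-frame marker
HEADER_FORMAT = ">HBIH"       # assumed byte order and field order
HEADER_WIRE_BITS = 9 * 2 * 7  # 9 bytes -> 18 codewords -> 126 bits


def to_bits(value, width):
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]


def from_bits(bits):
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def fec_encode(data):
    """Hamming(7,4)-encode bytes, high nibble first."""
    out = []
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            d1, d2, d3, d4 = to_bits(nibble, 4)
            out += [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d1, d2 ^ d3 ^ d4, d2, d3, d4]
    return out


def fec_decode(bits):
    """Decode 7-bit codewords to bytes, fixing up to one error per codeword."""
    nibbles = []
    for i in range(0, len(bits), 7):
        c = list(bits[i:i + 7])
        syndrome = ((c[0] ^ c[2] ^ c[4] ^ c[6])
                    | (c[1] ^ c[2] ^ c[5] ^ c[6]) << 1
                    | (c[3] ^ c[4] ^ c[5] ^ c[6]) << 2)
        if syndrome:
            c[syndrome - 1] ^= 1
        nibbles.append(from_bits([c[2], c[4], c[5], c[6]]))
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def build_frame(data, modulation=0):
    """Sender: CRC, header, FEC, then preamble and footer (data <= 8191 bytes)."""
    header = struct.pack(HEADER_FORMAT, len(data), modulation,
                         zlib.crc32(data), len(data) * 8)
    return (to_bits(PREAMBLE, 32) + fec_encode(header)
            + fec_encode(data) + to_bits(FOOTER, 16))


def parse_frame(bits):
    """Receiver: bits must start at the preamble. Returns payload or None."""
    if len(bits) < 32 + HEADER_WIRE_BITS or from_bits(bits[:32]) != PREAMBLE:
        return None
    pos = 32
    header = fec_decode(bits[pos:pos + HEADER_WIRE_BITS])
    length, _modulation, crc, _bit_len = struct.unpack(HEADER_FORMAT, header)
    pos += HEADER_WIRE_BITS
    payload_bits = length * 2 * 7
    if len(bits) < pos + payload_bits + 16:
        return None  # truncated frame
    payload = fec_decode(bits[pos:pos + payload_bits])
    pos += payload_bits
    if from_bits(bits[pos:pos + 16]) != FOOTER:
        return None  # frame did not terminate cleanly
    if zlib.crc32(payload) != crc:
        return None  # corruption that FEC could not repair
    return payload


frame = build_frame(b"hi")
frame[40] ^= 1             # one flipped bit in a header codeword
frame[32 + 126 + 3] ^= 1   # one flipped bit in a payload codeword
assert parse_frame(frame) == b"hi"
```

The usage lines flip one bit in the header and one in the payload. Both land in different codewords, so Hamming repairs them and the CRC check passes.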

The next post covers the signal processing side — specifically, how the Goertzel algorithm detects which frequency is playing at any given moment with minimal computation.