Designing a Reliable Transmission Protocol
The modulation layer (Part 2) converts bits to audio frequencies. But raw bits aren’t enough — the receiver needs to know where a transmission starts, how long the payload is, what modulation scheme is in use, and whether the data arrived intact. That’s the protocol layer.
This post covers the frame structure, synchronization, forward error correction, and integrity checking that make NearWave work over a noisy, unidirectional channel.
Frame Structure
Every NearWave transmission is wrapped in a frame. The frame is the unit of data the receiver processes:
graph LR
A[Preamble<br/>32 bits] --> B[Header<br/>9 bytes · FEC'd]
B --> C[Payload<br/>N bytes · FEC'd]
C --> D[Footer<br/>16 bits]
Each section has a specific role:
┌──────────┬──────────┬─────────┬──────────┐
│ Preamble │ Header   │ Payload │ Footer   │
│ 32 bits  │ 9 bytes  │ N bytes │ 16 bits  │
│          │ (FEC'd)  │ (FEC'd) │          │
└──────────┴──────────┴─────────┴──────────┘
Preamble (32 bits)
The preamble is a known bit pattern that the receiver uses to detect the start of a transmission. It consists of an alternating tone sequence followed by a sync word.
The alternating pattern (1010...) trains the receiver’s gain and timing. The sync word (a fixed pattern like 0xB5) marks the exact start of the header. The receiver correlates incoming bits against the sync word and triggers frame processing when it matches.
Why not just start transmitting data? Without a preamble, the receiver has no way to distinguish the beginning of a frame from ambient noise. The preamble provides a deterministic entry point.
Header (9 bytes, FEC-encoded)
The header carries metadata about the payload:
| Field | Size | Purpose |
|---|---|---|
| Payload length | 2 bytes | Number of payload bytes (before FEC) |
| Modulation type | 1 byte | BFSK, 4-FSK, etc. |
| CRC32 | 4 bytes | Checksum of original data |
| Original bit length | 2 bytes | Bit count before Hamming expansion |
The header is Hamming(7,4)-encoded (see below), so its 72 bits expand to 126 bits on the wire, or roughly 16 bytes. Protecting the header with FEC is critical: a corrupted length field would cause the receiver to read the wrong number of payload bytes, making the entire frame unrecoverable.
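To make the 9-byte layout concrete, here is a minimal sketch of packing and unpacking the header with Python's `struct` module. The field order follows the table above, but the big-endian byte order and the numeric modulation codes (`MOD_BFSK`, `MOD_4FSK`) are assumptions made for illustration; the actual NearWave values may differ.

```python
import struct

# Assumed layout: big-endian, fields in table order. Byte order and the
# numeric modulation codes are illustrative guesses, not confirmed values.
HEADER_FORMAT = ">HBIH"  # length (2) | modulation (1) | CRC32 (4) | bit length (2)
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 9 bytes, no padding with ">"

MOD_BFSK = 0  # hypothetical code
MOD_4FSK = 1  # hypothetical code


def pack_header(payload_len, modulation, crc32, bit_len):
    """Serialize the four header fields into 9 bytes."""
    return struct.pack(HEADER_FORMAT, payload_len, modulation, crc32, bit_len)


def unpack_header(raw):
    """Return (payload_len, modulation, crc32, bit_len) from 9 header bytes."""
    return struct.unpack(HEADER_FORMAT, raw)


header = pack_header(100, MOD_BFSK, 0xDEADBEEF, 800)
```

Note that a 2-byte bit-length field tops out at 65,535 bits, so under this layout a single frame could describe at most 8,191 payload bytes.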
Payload (N bytes, FEC-encoded)
The actual data, Hamming(7,4)-encoded. The length is specified in the header. After FEC decoding and CRC verification on the receive side, this yields the original input bytes.
Footer (16 bits)
A fixed end-of-frame marker. The receiver uses this to confirm that the frame terminated cleanly. If the footer doesn’t match, the frame is discarded.
Preamble Detection
The receiver runs continuously, processing audio samples through the Goertzel detector (covered in Part 4). It maintains a sliding window of recent bits and checks for the preamble pattern at every step.
The detection process:
- Demodulate each symbol to a bit (or group of bits for MFSK)
- Shift into a 32-bit buffer
- Compare against the known preamble pattern
- On match, switch from “scanning” to “receiving” mode
- Read the header, then the payload, then the footer
A common failure mode is false triggering: ambient noise that coincidentally matches the preamble. The sync word reduces this risk. Assuming uniformly random bits, an 8-bit sync word alone matches with probability 1/256 at each bit position, while the full 32-bit preamble matches with probability 2^-32 (about 1 in 4.3 billion), which makes false positives rare in practice.
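The scanning loop described above can be sketched as a 32-bit shift register. The concrete pattern `0xAAAAAAB5` (24 alternating bits followed by the sync word `0xB5`) is an assumption based on the description in this post, and the optional `max_errors` tolerance is a hypothetical extra, not something NearWave is known to do.

```python
# Assumed preamble: 24 alternating bits (0xAAAAAA) followed by sync word 0xB5.
PREAMBLE = 0xAAAAAAB5
MASK = 0xFFFFFFFF


def find_preamble(bits, max_errors=0):
    """Return the index of the first bit after the preamble, or None.

    max_errors is a hypothetical tolerance: the number of mismatched bits
    still accepted as a match. The default of 0 requires an exact match.
    """
    window = 0
    for i, bit in enumerate(bits):
        # Shift the newest bit into the 32-bit window.
        window = ((window << 1) | (bit & 1)) & MASK
        # Only compare once 32 bits have been collected.
        if i >= 31 and bin(window ^ PREAMBLE).count("1") <= max_errors:
            return i + 1
    return None


preamble_bits = [int(b) for b in f"{PREAMBLE:032b}"]
stream = [0, 0, 1, 1, 1] + preamble_bits + [1, 1, 0]
start = find_preamble(stream)  # index where the header would begin
```

The XOR-and-popcount comparison costs the same as an equality check when `max_errors` is 0, and it leaves room to experiment with a looser match on noisy channels.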
Forward Error Correction: Hamming(7,4)
The audio channel introduces bit errors. Speaker distortion, ambient noise, and microphone sensitivity can all flip bits. NearWave uses Hamming(7,4) to correct one flipped bit per 7-bit codeword without retransmission.
How It Works
Hamming(7,4) encodes every 4 data bits into a 7-bit codeword by adding 3 parity bits:
Data: [d₁ d₂ d₃ d₄]
Codeword: [p₁ p₂ d₁ p₃ d₂ d₃ d₄]
Each parity bit covers a specific subset of data bits (determined by bit position in binary). On the receive side, the decoder computes a 3-bit syndrome from the parity checks. If the syndrome is zero, no error occurred. If non-zero, the syndrome identifies the exact bit position of the error, which is then flipped.
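Here is a small sketch of the encoder and syndrome decoder for the codeword layout shown above. It uses the standard Hamming(7,4) parity assignments (p₁ covers positions 1, 3, 5, 7; p₂ covers 2, 3, 6, 7; p₃ covers 4, 5, 6, 7); the function names are illustrative, not NearWave's.

```python
def hamming74_encode(d1, d2, d3, d4):
    """Encode 4 data bits into the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based error position.
    syndrome = s1 | (s2 << 1) | (s3 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


codeword = hamming74_encode(1, 0, 1, 1)  # [0, 1, 1, 0, 0, 1, 1]
received = list(codeword)
received[4] ^= 1                         # flip one bit in transit
recovered = hamming74_decode(received)   # [1, 0, 1, 1]
```

Because the parity bits sit at positions 1, 2, and 4, the syndrome needs no lookup table: its value is the position to flip.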
Overhead
Hamming(7,4) adds 75% overhead — every 4 bits become 7 bits on the wire. For a 100-byte payload:
- Original: 800 bits
- After Hamming: 1400 bits
- Overhead: 600 bits (75%)
This is significant given the already-low bandwidth. But the alternative is no error correction at all on a unidirectional channel. There’s no ACK, no retransmission. If a bit flips and there’s no FEC, the data is corrupted silently.
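The arithmetic is easy to check in code. This sketch computes the on-wire size of a Hamming(7,4)-encoded block and of a whole frame, using only the field sizes given in this post; the function names are made up for illustration.

```python
PREAMBLE_BITS = 32
HEADER_BYTES = 9
FOOTER_BITS = 16


def hamming74_wire_bits(num_bytes):
    """Bits on the wire after Hamming(7,4)-encoding num_bytes of data."""
    data_bits = num_bytes * 8
    codewords = data_bits // 4  # a byte is always two whole 4-bit blocks
    return codewords * 7


def frame_wire_bits(payload_bytes):
    """Total frame size in bits: preamble + header + payload + footer."""
    return (PREAMBLE_BITS
            + hamming74_wire_bits(HEADER_BYTES)
            + hamming74_wire_bits(payload_bytes)
            + FOOTER_BITS)


payload_bits = hamming74_wire_bits(100)  # 1400
header_bits = hamming74_wire_bits(9)     # 126
total_bits = frame_wire_bits(100)        # 1574
```

For the 100-byte example, the fixed framing cost (preamble, encoded header, footer) adds 174 bits on top of the 1400-bit encoded payload.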
Why Hamming(7,4) Specifically
Stronger codes exist (Reed-Solomon, convolutional codes, LDPC). Hamming(7,4) was chosen because:
- It’s trivial to implement — a few XOR operations
- The decoding is constant-time with no iterative computation
- Single-bit correction is sufficient for the error rates observed in typical indoor environments
- The overhead is acceptable given the small payload sizes NearWave targets
For a system transmitting kilobytes, not megabytes, simplicity wins over theoretical efficiency.
CRC32 Integrity Check
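Hamming(7,4) corrects one flipped bit per codeword, but it cannot reliably detect multi-bit corruption: two flipped bits in the same codeword produce a non-zero syndrome that points at the wrong position, so the decoder silently miscorrects. CRC32 provides an integrity check over the original (pre-FEC) data.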
The flow:
- Sender: compute CRC32 over the raw payload → store in header → FEC-encode header and payload → transmit
- Receiver: receive frame → FEC-decode header and payload → compute CRC32 over decoded payload → compare with header CRC32
If the CRC doesn’t match, the frame is discarded. This catches:
- Multi-bit errors that Hamming can’t correct
- Systematic corruption from hardware issues
- Frames where the preamble was falsely detected
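As a rough illustration of the check, here is a sketch using `zlib.crc32` from Python's standard library. This post does not say which CRC32 variant NearWave uses, so treating it as the common zlib/IEEE 802.3 one is an assumption.

```python
import zlib


def crc32_of(data: bytes) -> int:
    """CRC32 as an unsigned 32-bit integer (zlib / IEEE 802.3 variant)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def verify(decoded_payload: bytes, header_crc: int) -> bool:
    """Receiver-side check: recompute and compare against the header value."""
    return crc32_of(decoded_payload) == header_crc


payload = b"hello"
sent_crc = crc32_of(payload)  # sender stores this in the header

# Simulate a two-bit error that slipped past FEC.
corrupted = bytes([payload[0] ^ 0x03]) + payload[1:]

ok = verify(payload, sent_crc)        # True
caught = verify(corrupted, sent_crc)  # False
```

The CRC travels inside the FEC-protected header, so it gets the same single-bit correction as the length and modulation fields.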
Why No Retransmission
NearWave is strictly unidirectional. The sender plays audio. The receiver captures audio. There’s no back-channel — the receiver can’t send an ACK or NACK.
This means the protocol must be correct by construction:
- FEC corrects what it can
- CRC detects what FEC misses
- The frame structure is self-describing (header contains length and type)
- No state is shared between sender and receiver
This is a deliberate constraint, not a limitation. A bidirectional protocol would require two-way audio, echo cancellation, and substantially more complexity. For the intended use cases — air-gapped transfer, quick device-to-device exchange — unidirectional is the right tradeoff.
Putting It Together
A complete frame transmission looks like this:
- Compute CRC32 of the raw input data
- Build the 9-byte header (length, modulation type, CRC32, original bit length)
- Hamming(7,4)-encode the header
- Hamming(7,4)-encode the payload
- Prepend the 32-bit preamble
- Append the 16-bit footer
- Pass the full bitstream to the modulator
On receive, every step reverses:
- Detect preamble in the audio stream
- Demodulate the header bits → Hamming-decode → extract metadata
- Demodulate payload bits (count from header) → Hamming-decode
- Check the 16-bit footer; discard the frame on mismatch
- Verify CRC32
- If valid, output the recovered data
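The send and receive steps above can be sketched end to end on a plain list of bits, with the modulator and demodulator left out. Several details here are assumptions not given in this post: the preamble value `0xAAAAAAB5`, the footer value `0x7E7E`, big-endian header fields, MSB-first bit order, and the zlib CRC32 variant. The function names are illustrative too.

```python
import struct
import zlib

PREAMBLE_BITS = [int(b) for b in f"{0xAAAAAAB5:032b}"]  # assumed pattern
FOOTER_BITS = [int(b) for b in f"{0x7E7E:016b}"]        # assumed marker
HEADER_FORMAT = ">HBIH"                                  # assumed byte order
HEADER_WIRE_BITS = 9 * 8 // 4 * 7                        # 126 bits after FEC


def to_bits(data):
    """Bytes -> list of bits, most significant bit first (assumed order)."""
    return [(byte >> s) & 1 for byte in data for s in range(7, -1, -1)]


def to_bytes(bits):
    """List of bits (length a multiple of 8) -> bytes."""
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))


def fec_encode(bits):
    """Hamming(7,4)-encode a bit list whose length is a multiple of 4."""
    out = []
    for i in range(0, len(bits), 4):
        d1, d2, d3, d4 = bits[i:i + 4]
        out += [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d1, d2 ^ d3 ^ d4, d2, d3, d4]
    return out


def fec_decode(bits):
    """Hamming(7,4)-decode, correcting up to one error per codeword."""
    out = []
    for i in range(0, len(bits), 7):
        c = bits[i:i + 7]
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
        syndrome = s1 | (s2 << 1) | (s3 << 2)
        if syndrome:
            c[syndrome - 1] ^= 1
        out += [c[2], c[4], c[5], c[6]]
    return out


def build_frame(payload, modulation=0):
    """Sender side: CRC -> header -> FEC -> preamble and footer."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    header = struct.pack(HEADER_FORMAT, len(payload), modulation,
                         crc, len(payload) * 8)
    return (PREAMBLE_BITS + fec_encode(to_bits(header))
            + fec_encode(to_bits(payload)) + FOOTER_BITS)


def parse_frame(bits):
    """Receiver side: return the payload, or None if any check fails."""
    if len(bits) < 32 + HEADER_WIRE_BITS + 16 or bits[:32] != PREAMBLE_BITS:
        return None
    pos = 32
    header = to_bytes(fec_decode(bits[pos:pos + HEADER_WIRE_BITS]))
    length, _modulation, crc, _bit_len = struct.unpack(HEADER_FORMAT, header)
    pos += HEADER_WIRE_BITS
    wire_bits = length * 8 // 4 * 7
    payload_bits = bits[pos:pos + wire_bits]
    pos += wire_bits
    if len(payload_bits) != wire_bits or bits[pos:pos + 16] != FOOTER_BITS:
        return None
    payload = to_bytes(fec_decode(payload_bits))
    if (zlib.crc32(payload) & 0xFFFFFFFF) != crc:
        return None
    return payload


frame = build_frame(b"NearWave")
frame[200] ^= 1                 # one flipped bit in the payload region
recovered = parse_frame(frame)  # FEC repairs it; CRC and footer pass
```

Flipping a second bit in the same codeword makes the decoder miscorrect, at which point the CRC mismatch causes `parse_frame` to return `None` rather than hand back corrupted data.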
The next post covers the signal processing side — specifically, how the Goertzel algorithm detects which frequency is playing at any given moment with minimal computation.